I am running bacula-2.2.0 trying to get it working on a FreeBSD 6.2-RELEASE.
I have an autochanger with 3 drives. I have run the complete btape tests
with success. I am now trying to get multiple jobs running on multiple
tapes simultaneously. Below are scenarios I have documented. It would
appear that:
         1. Drive status does not always reflect reality
         2. Volume status rarely reflects reality
         3. Drive and Volume status do not match up
         4. Bacula's need to ask for a tape on a new drive that he
            is already using.

That said, I hope there is some glaring wrong I am committing, or at the
least you have heard of these problems before and could offer me some
help...

*********************************************************************

SCENARIO 1

Two jobs are running:

140 | nfs 2007-08-21 07:25:49 Using Drive 1
141 | nfs 2007-08-21 07:25:59 Using Drive 2

However the Device Status DOES NOT MATCH
         the Volume status.

Device status:
Autochanger "Autochanger" with devices:
    "Drive-0" (/dev/nsa0)
    "Drive-1" (/dev/nsa1)
    "Drive-2" (/dev/nsa2)
Device "FileStorage" (/database/backup) is not open.
Device "Drive-0" (/dev/nsa0) is not open.
     Drive 0 status unknown.
Device "Drive-1" (/dev/nsa1) is mounted with:
     Volume:      000006L3
     Pool:        Tape1
     Media type:  LTO3
     Slot 258 is loaded in drive 1.
     Total Bytes=354,428,928 Blocks=5,493 Bytes/block=64,523
     Positioned at File=2 Block=0
Device "Drive-2" (/dev/nsa2) is mounted with:
     Volume:      000005L3
     Pool:        Tape1
     Media type:  LTO3
     Slot 259 is loaded in drive 2.
     Total Bytes=64,512 Blocks=0 Bytes/block=64,512
     Positioned at File=0 Block=1
====

In Use Volume status:
000005L3 on device "Drive-0" (/dev/nsa0)
     Reader=0 writers=0 reserved=0
000006L3 on device "Drive-1" (/dev/nsa1)
     Reader=0 writers=1 reserved=0
====

Now lets start the third job.....

Bacula knows there is nothing in Drive 0  but he insists on
selecting a tape for that drive that is in Drive 2 even though
the Drive status says the tape is in use and the volume status
(all though WRONG!!) says the tape is IN USE....

21-Aug 07:43 mailbackup-dir: Start Backup JobId 143, Job=nfs.2007-08-21_07.43.21
21-Aug 07:43 mailbackup-dir: Using Device "Drive-0"
21-Aug 07:43 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" command.
21-Aug 07:43 mailbackup-sd: 3302 Autochanger "loaded? drive 0", result: nothing 
loaded.
21-Aug 07:43 mailbackup-sd: 3304 Issuing autochanger "load slot 12, drive 0" 
command.
21-Aug 07:44 mailbackup-sd: nfs.2007-08-21_07.43.21 Fatal error: 3992 Bad 
autochanger "load slot 12, drive 0": ERR=Child exite
d with code 1.
Results=chio: /dev/ch0: CHIOMOVE: Input/output error

21-Aug 07:44 nfs: nfs.2007-08-21_07.43.21 Fatal error: job.c:1758 Bad response 
to Append Data command. Wanted 3000 OK data
, got 3903 Error append data

21-Aug 07:44 mailbackup-dir: nfs.2007-08-21_07.43.21 Error: Bacula 
mailbackup-dir 2.2.0 (08Aug07): 21-Aug-2007 07:44:16
   Build OS:               i386-unknown-freebsd6.2 freebsd 6.2-RELEASE
   JobId:                  143
   Job:                    nfs.2007-08-21_07.43.21
   Backup Level:           Full
   Client:                 "nfs" 2.0.3 (06Mar07) 
i386-unknown-openbsd4.1,openbsd,4.1
   FileSet:                "nfs" 2007-08-10 10:34:22
   Pool:                   "Tape1" (From Job resource)
   Storage:                "Autochanger" (From Job resource)
   Scheduled time:         21-Aug-2007 07:43:16
   Start time:             21-Aug-2007 07:43:25
   End time:               21-Aug-2007 07:44:16
   Elapsed time:           51 secs
   Priority:               10
   FD Files Written:       0
   SD Files Written:       0
   FD Bytes Written:       0 (0 B)
   SD Bytes Written:       0 (0 B)
   Rate:                   0.0 KB/s
   Software Compression:   None
   VSS:                    no
   Encryption:             no
   Volume name(s):
   Volume Session Id:      11
   Volume Session Time:    1187700921
   Last Volume Bytes:      64,512 (64.51 KB)
   Non-fatal FD errors:    0
   SD Errors:              0
   FD termination status:  Error
   SD termination status:  Error
   Termination:            *** Backup Error ***

***************************************************************************

SCENARIO 2

It would appear that Bacula gets confused as to the tapes that are 
already mounted if Bacula has been shutdown and restarted and there where
tapes left mounted in the drives. Why?

        -Should I be shuting down bacula differently?
        -Should I be shutting down postgres first?


Job finishes OK! But when doing status 2 2 there is no Volume name????

        Device "Drive-2" (/dev/nsa2) is mounted with:
     Volume:
     Pool:        Tape1
     Media type:  LTO3
     Slot 259 is loaded in drive 2.
     Total Bytes=23,804,928 Blocks=368 Bytes/block=64,687
     Positioned at File=1 Block=0


2 tapes are loaded in drives. I shutdown sd and restart:

Device status:
Autochanger "Autochanger" with devices:
    "Drive-0" (/dev/nsa0)
    "Drive-1" (/dev/nsa1)
    "Drive-2" (/dev/nsa2)
Device "FileStorage" (/database/backup) is not open.
Device "Drive-0" (/dev/nsa0) is mounted with:
     Volume:      000007L3
     Pool:        *unknown*
     Media type:  LTO3
     Slot 257 is loaded in drive 0.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0
Device "Drive-1" (/dev/nsa1) is not open.
     Drive 1 status unknown.
Device "Drive-2" (/dev/nsa2) is mounted with:
     Volume:      000015L3
     Pool:        *unknown*
     Media type:  LTO3
     Slot 259 is loaded in drive 2.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0
====

In Use Volume status:
000007L3 on device "Drive-0" (/dev/nsa0)
     Reader=0 writers=0 reserved=0
000015L3 on device "Drive-2" (/dev/nsa2)
     Reader=0 writers=0 reserved=0


Start a job just after restart of sd.



20-Aug 08:53 mailbackup-dir: Start Backup JobId 116, 
Job=maildrop0.2007-08-20_08.53.47
20-Aug 08:53 mailbackup-dir: Using Device "Drive-0"
20-Aug 08:53 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" command.
20-Aug 08:54 mailbackup-sd: 3302 Autochanger "loaded? drive 0", result: nothing 
loaded.
20-Aug 08:54 mailbackup-sd: 3304 Issuing autochanger "load slot 4, drive 0" 
command.
20-Aug 08:54 mailbackup-sd: 3305 Autochanger "load slot 4, drive 0", status is 
OK.
20-Aug 08:54 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" command.
20-Aug 08:55 mailbackup-sd: 3302 Autochanger "loaded? drive 0", result is Slot 
257.
20-Aug 08:55 mailbackup-sd: Volume "000004L3" previously written, moving to end 
of data.
20-Aug 08:56 mailbackup-sd: maildrop0.2007-08-20_08.53.47 Error: Bacula cannot 
write on tape Volume "000004L3" because:
The number of files mismatch! Volume=13 Catalog=14
20-Aug 08:56 mailbackup-sd: Marking Volume "000004L3" in Error in Catalog.
20-Aug 08:58 mailbackup-sd: 3307 Issuing autochanger "unload slot 257, drive 0" 
command.
20-Aug 08:58 mailbackup-sd: 3304 Issuing autochanger "load slot 2, drive 0" 
command.
20-Aug 08:59 mailbackup-sd: 3305 Autochanger "load slot 2, drive 0", status is 
OK.
20-Aug 08:59 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" command.
20-Aug 09:00 mailbackup-sd: 3302 Autochanger "loaded? drive 0", result is Slot 
257.
20-Aug 09:00 mailbackup-sd: Volume "000007L3" previously written, moving to end 
of data.
20-Aug 09:13 mailbackup-sd: Ready to append to end of Volume "000007L3" at 
file=133.
20-Aug 09:13 mailbackup-sd: Spooling data ...


SD wants to unload the first drive. Then he wants to load slot 2 
into drive 0. Status seems okay.

This "000007L3" tape was running a job that I CANCELLED prior to restarting sd.
Volume "000007L3" previously written, moving to end of data....

Device "Drive-0" (/dev/nsa0) is mounted with:
     Volume:      000007L3
     Pool:        Tape1
     Media type:  LTO3
     Device is being initialized.
     Slot 257 is loaded in drive 0.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=13 Block=0

Job starts running okay1

Start a second job....
20-Aug 09:57 mailbackup-dir: Start Backup JobId 120, 
Job=maildrop1.2007-08-20_09.57.06
20-Aug 09:57 mailbackup-dir: Using Device "Drive-1"
20-Aug 09:57 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 1" command.
20-Aug 09:57 mailbackup-sd: 3302 Autochanger "loaded? drive 1", result: nothing 
loaded.
20-Aug 09:57 mailbackup-sd: 3304 Issuing autochanger "load slot 5, drive 1" 
command.
20-Aug 09:58 mailbackup-sd: maildrop1.2007-08-20_09.57.06 Fatal error: 3992 Bad 
autochanger "load slot 5, drive 1":
ERR=Child exited with code 1.
Results=chio: /dev/ch0: CHIOMOVE: Input/output error

20-Aug 09:58 maildrop1: maildrop1.2007-08-20_09.57.06 Fatal error: job.c:1758 
Bad response to Append Data command. Wanted
3000 OK data
, got 3903 Error append data
Fatal error: job.c:1758 Bad response to Append Data command.
Wanted 3000 OK data , got 3903 Error append data.

Finds that nothing is loaded in drive 1. It then
wishes to load slot 5 into drive 1. Of course slot 15
is tape number 000015L3, which is currently loaded in
drive 2!  This will cause a continuous problem until
tape number 000015L3 is put back into slot 5!!!!!



Device "Drive-0" (/dev/nsa0) is not open.
     Drive 0 status unknown.
Device "Drive-1" (/dev/nsa1) is not open.
     Drive 1 status unknown.
Device "Drive-2" (/dev/nsa2) is mounted with:
     Volume:
     Pool:        *unknown*
     Media type:  LTO3
     Slot 259 is loaded in drive 2.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0
====

In Use Volume status:
====

Device "Drive-0" (/dev/nsa0) is not open.
     Drive 0 status unknown.
Device "Drive-1" (/dev/nsa1) is not open.
     Drive 1 status unknown.
Device "Drive-2" (/dev/nsa2) is mounted with:
     Volume:
     Pool:        *unknown*
     Media type:  LTO3
     Slot 259 is loaded in drive 2.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0
====

In Use Volume status:
====

*********************************************************************

SCENARIO 3

Device status:
Autochanger "Autochanger" with devices:
    "Drive-0" (/dev/nsa0)
    "Drive-1" (/dev/nsa1)
    "Drive-2" (/dev/nsa2)
Device "FileStorage" (/database/backup) is not open.
Device "Drive-0" (/dev/nsa0) is mounted with:
     Volume:      000004L3
     Pool:        *unknown*
     Media type:  LTO3
     Slot 257 is loaded in drive 0.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0
Device "Drive-1" (/dev/nsa1) is not open.
     Drive 1 status unknown.
Device "Drive-2" (/dev/nsa2) is mounted with:
     Volume:      000015L3
     Pool:        *unknown*
     Media type:  LTO3
     Slot 259 is loaded in drive 2.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0
====

In Use Volume status:
000004L3 on device "Drive-0" (/dev/nsa0)
     Reader=0 writers=0 reserved=0
000015L3 on device "Drive-2" (/dev/nsa2)
     Reader=0 writers=0 reserved=0
====

20-Aug 10:53 mailbackup-dir: Start Backup JobId 121, 
Job=maildrop0.2007-08-20_10.53.55
20-Aug 10:53 mailbackup-dir: Using Device "Drive-0"
20-Aug 10:53 mailbackup-sd: 3307 Issuing autochanger "unload slot 257, drive 0" 
command.
20-Aug 10:54 mailbackup-sd: 3304 Issuing autochanger "load slot 5, drive 0" 
command.
20-Aug 10:55 mailbackup-sd: maildrop0.2007-08-20_10.53.55 Fatal error: 3992 Bad 
autochanger "load slot 5, drive 0": ERR=Child
exited with code 1.
Results=chio: /dev/ch0: CHIOMOVE: Input/output error

20-Aug 10:55 maildrop0: maildrop0.2007-08-20_10.53.55 Fatal error: job.c:1758 
Bad response to Append Data command. Wanted 3000
  OK data
, got 3903 Error append data

====

Device status:
Autochanger "Autochanger" with devices:
    "Drive-0" (/dev/nsa0)
    "Drive-1" (/dev/nsa1)
    "Drive-2" (/dev/nsa2)
Device "FileStorage" (/database/backup) is not open.
Device "Drive-0" (/dev/nsa0) is not open.
     Drive 0 status unknown.
Device "Drive-1" (/dev/nsa1) is not open.
     Drive 1 status unknown.
Device "Drive-2" (/dev/nsa2) is mounted with:
     Volume:
     Pool:        *unknown*
     Media type:  LTO3
     Slot 259 is loaded in drive 2.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0
====
In Use Volume status:
====

====

Bacula starts it's first job after a Bacula restart and having tapes still in 
some drives. 
He knows that tape number 000004L3 is loaded in drive 0 and tape number 
000015L3 is 
loaded in drive 2.

Device "Drive-0" (/dev/nsa0) is mounted with:
     Volume:      000004L3
     Pool:        *unknown*
     Media type:  LTO3
     Slot 257 is loaded in drive 0.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0
Device "Drive-1" (/dev/nsa1) is not open.
     Drive 1 status unknown.
Device "Drive-2" (/dev/nsa2) is mounted with:
     Volume:      000015L3
     Pool:        *unknown*
     Media type:  LTO3
     Slot 259 is loaded in drive 2.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0
====

In Use Volume status:
000004L3 on device "Drive-0" (/dev/nsa0)
     Reader=0 writers=0 reserved=0
000015L3 on device "Drive-2" (/dev/nsa2)
     Reader=0 writers=0 reserved=0
====

Bacula first says he wants to use drive 0. Because he KNOWS the status, he 
immedeately issues an unload of 
drive 0. Bacula wishes to use tape number 000015L3 in drive 0 but it is already 
loaded on drive 2. Start a new job and

error: job.c:1758. Bad response to Append Data command. Wanted 3000 OK data , 
got 3903 Error append data

Start another job and

error: job.c:1758. Bad response to Append Data command. Wanted 3000 OK data , 
got 3903 Error append data

Output from the second "status 2 2" command shows that Bacula nolonger knows 
what tape is in drive 2.

This problem will continue to happen until I unload tape number 000015L3.

Keeping in mind that at this point in the discussion, Bacula has a "hold" on 
drive 0 trying to load tape 000015L3, as discussed above. We will compund the 
problem
by starting a few more jobs....

Bacula will get one job to use drive 1 with a new tape 000011L3. This job goes 
on it's merry way.

The next job starts....

Bacula decides to use drive 0, (bacula is finished with the "hold"
he has on drive 0) and the next available tape 000011L3 thinking 
tape 000011L3 is still in a slot when it is really in drive 1.

error: job.c:1758. Bad response to Append Data command. Wanted 3000 OK data ,
got 3903 Error append data

This problem will continue to happen until I can free up tape 000011L3 for 
drive 0.

Do a chio status -a

drive 0: <ACCESS> sense: <0x00/0x00> voltag: <:0> avoltag: <:0> source: <> 
intaddr: <256> scsi: <3:?>
drive 1: <FULL> sense: <0x00/0x00> voltag: <000011L3:0> avoltag: <:0> source: 
<slot 5> intaddr: <257> scsi: <1:?>
drive 2: <FULL> sense: <0x00/0x00> voltag: <000015L3:0> avoltag: <:0> source: 
<slot 4> intaddr: <258> scsi: <2:?>

Tape number 000011L3 in drive 1
Tape number 000015L3 in drive 2


Do another status 2 2 and WHAT????
Bacula now thinks that the Device status is:

Device "Drive-0" (/dev/nsa0) is not open.
     Drive 0 status unknown.
Device "Drive-1" (/dev/nsa1) is mounted with:
     Volume:
     Pool:        Tape1
     Media type:  LTO3
     Slot 258 is loaded in drive 1.
     Total Bytes=1,864,654,848 Blocks=28,903 Bytes/block=64,514
     Positioned at File=2 Block=0
Device "Drive-2" (/dev/nsa2) is mounted with:
     Volume:
     Pool:        *unknown*
     Media type:  LTO3
     Slot 259 is loaded in drive 2.
     Total Bytes Read=0 Blocks Read=0 Bytes/block=0
     Positioned at File=0 Block=0

Bacula now thinks the In Use Volume status is:

000011L3 on device "Drive-0" (/dev/nsa0)
     Reader=0 writers=0 reserved=0


It would appear that Bacula is doing 2 things at job start.

        1. Decide what drive he wishes to use.
        2. Decide what tape he wishes to use.

The problem is that Bacula is not thinking about what is in use and where it is.
Note the last status 2 2 after the job fails with a fatal error.



-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Bacula-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/bacula-devel

Reply via email to