Hello,

I'd recommend that you send this email to the bacula-users list (in fact, I am 
copying them).  They are very good at resolving these kinds of problems.  
Please see www.bacula.org -> Support for more details.

Regards,

Kern

On Tuesday 21 August 2007 16:26, Steve Holstead wrote:
> I am running bacula-2.2.0 trying to get it working on a FreeBSD
> 6.2-RELEASE. I have an autochanger with 3 drives. I have run the complete
> btape tests with success. I am now trying to get multiple jobs running on
> multiple tapes simultaneously. Below are scenarios I have documented. It
> would appear that:
>          1. Drive status does not always reflect reality
>          2. Volume status rarely reflects reality
>          3. Drive and Volume status do not match up
>          4. Bacula's need to ask for a tape on a new drive that he
>             is already using.
>
> That said, I hope there is some glaring wrong I am committing, or at the
> least you have heard of these problems before and could offer me some
> help...
>
> *********************************************************************
>
> SCENARIO 1
>
> Two jobs are running:
>
> 140 | nfs 2007-08-21 07:25:49 Using Drive 1
> 141 | nfs 2007-08-21 07:25:59 Using Drive 2
>
> However the Device Status DOES NOT MATCH
>          the Volume status.
>
> Device status:
> Autochanger "Autochanger" with devices:
>     "Drive-0" (/dev/nsa0)
>     "Drive-1" (/dev/nsa1)
>     "Drive-2" (/dev/nsa2)
> Device "FileStorage" (/database/backup) is not open.
> Device "Drive-0" (/dev/nsa0) is not open.
>      Drive 0 status unknown.
> Device "Drive-1" (/dev/nsa1) is mounted with:
>      Volume:      000006L3
>      Pool:        Tape1
>      Media type:  LTO3
>      Slot 258 is loaded in drive 1.
>      Total Bytes=354,428,928 Blocks=5,493 Bytes/block=64,523
>      Positioned at File=2 Block=0
> Device "Drive-2" (/dev/nsa2) is mounted with:
>      Volume:      000005L3
>      Pool:        Tape1
>      Media type:  LTO3
>      Slot 259 is loaded in drive 2.
>      Total Bytes=64,512 Blocks=0 Bytes/block=64,512
>      Positioned at File=0 Block=1
> ====
>
> In Use Volume status:
> 000005L3 on device "Drive-0" (/dev/nsa0)
>      Reader=0 writers=0 reserved=0
> 000006L3 on device "Drive-1" (/dev/nsa1)
>      Reader=0 writers=1 reserved=0
> ====
>
> Now lets start the third job.....
>
> Bacula knows there is nothing in Drive 0  but he insists on
> selecting a tape for that drive that is in Drive 2 even though
> the Drive status says the tape is in use and the volume status
> (all though WRONG!!) says the tape is IN USE....
>
> 21-Aug 07:43 mailbackup-dir: Start Backup JobId 143,
> Job=nfs.2007-08-21_07.43.21 21-Aug 07:43 mailbackup-dir: Using Device
> "Drive-0"
> 21-Aug 07:43 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0"
> command. 21-Aug 07:43 mailbackup-sd: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded. 21-Aug 07:43 mailbackup-sd: 3304 Issuing
> autochanger "load slot 12, drive 0" command. 21-Aug 07:44 mailbackup-sd:
> nfs.2007-08-21_07.43.21 Fatal error: 3992 Bad autochanger "load slot 12,
> drive 0": ERR=Child exite d with code 1.
> Results=chio: /dev/ch0: CHIOMOVE: Input/output error
>
> 21-Aug 07:44 nfs: nfs.2007-08-21_07.43.21 Fatal error: job.c:1758 Bad
> response to Append Data command. Wanted 3000 OK data , got 3903 Error
> append data
>
> 21-Aug 07:44 mailbackup-dir: nfs.2007-08-21_07.43.21 Error: Bacula
> mailbackup-dir 2.2.0 (08Aug07): 21-Aug-2007 07:44:16 Build OS:             
>  i386-unknown-freebsd6.2 freebsd 6.2-RELEASE JobId:                  143
>    Job:                    nfs.2007-08-21_07.43.21
>    Backup Level:           Full
>    Client:                 "nfs" 2.0.3 (06Mar07)
> i386-unknown-openbsd4.1,openbsd,4.1 FileSet:                "nfs"
> 2007-08-10 10:34:22
>    Pool:                   "Tape1" (From Job resource)
>    Storage:                "Autochanger" (From Job resource)
>    Scheduled time:         21-Aug-2007 07:43:16
>    Start time:             21-Aug-2007 07:43:25
>    End time:               21-Aug-2007 07:44:16
>    Elapsed time:           51 secs
>    Priority:               10
>    FD Files Written:       0
>    SD Files Written:       0
>    FD Bytes Written:       0 (0 B)
>    SD Bytes Written:       0 (0 B)
>    Rate:                   0.0 KB/s
>    Software Compression:   None
>    VSS:                    no
>    Encryption:             no
>    Volume name(s):
>    Volume Session Id:      11
>    Volume Session Time:    1187700921
>    Last Volume Bytes:      64,512 (64.51 KB)
>    Non-fatal FD errors:    0
>    SD Errors:              0
>    FD termination status:  Error
>    SD termination status:  Error
>    Termination:            *** Backup Error ***
>
> ***************************************************************************
>
> SCENARIO 2
>
> It would appear that Bacula gets confused as to the tapes that are
> already mounted if Bacula has been shutdown and restarted and there where
> tapes left mounted in the drives. Why?
>
>       -Should I be shuting down bacula differently?
>       -Should I be shutting down postgres first?
>
>
> Job finishes OK! But when doing status 2 2 there is no Volume name????
>
>       Device "Drive-2" (/dev/nsa2) is mounted with:
>      Volume:
>      Pool:        Tape1
>      Media type:  LTO3
>      Slot 259 is loaded in drive 2.
>      Total Bytes=23,804,928 Blocks=368 Bytes/block=64,687
>      Positioned at File=1 Block=0
>
>
> 2 tapes are loaded in drives. I shutdown sd and restart:
>
> Device status:
> Autochanger "Autochanger" with devices:
>     "Drive-0" (/dev/nsa0)
>     "Drive-1" (/dev/nsa1)
>     "Drive-2" (/dev/nsa2)
> Device "FileStorage" (/database/backup) is not open.
> Device "Drive-0" (/dev/nsa0) is mounted with:
>      Volume:      000007L3
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 257 is loaded in drive 0.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
> Device "Drive-1" (/dev/nsa1) is not open.
>      Drive 1 status unknown.
> Device "Drive-2" (/dev/nsa2) is mounted with:
>      Volume:      000015L3
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 259 is loaded in drive 2.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
> ====
>
> In Use Volume status:
> 000007L3 on device "Drive-0" (/dev/nsa0)
>      Reader=0 writers=0 reserved=0
> 000015L3 on device "Drive-2" (/dev/nsa2)
>      Reader=0 writers=0 reserved=0
>
>
> Start a job just after restart of sd.
>
>
>
> 20-Aug 08:53 mailbackup-dir: Start Backup JobId 116,
> Job=maildrop0.2007-08-20_08.53.47 20-Aug 08:53 mailbackup-dir: Using Device
> "Drive-0"
> 20-Aug 08:53 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0"
> command. 20-Aug 08:54 mailbackup-sd: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded. 20-Aug 08:54 mailbackup-sd: 3304 Issuing
> autochanger "load slot 4, drive 0" command. 20-Aug 08:54 mailbackup-sd:
> 3305 Autochanger "load slot 4, drive 0", status is OK. 20-Aug 08:54
> mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" command. 20-Aug
> 08:55 mailbackup-sd: 3302 Autochanger "loaded? drive 0", result is Slot
> 257. 20-Aug 08:55 mailbackup-sd: Volume "000004L3" previously written,
> moving to end of data. 20-Aug 08:56 mailbackup-sd:
> maildrop0.2007-08-20_08.53.47 Error: Bacula cannot write on tape Volume
> "000004L3" because: The number of files mismatch! Volume=13 Catalog=14
> 20-Aug 08:56 mailbackup-sd: Marking Volume "000004L3" in Error in Catalog.
> 20-Aug 08:58 mailbackup-sd: 3307 Issuing autochanger "unload slot 257,
> drive 0" command. 20-Aug 08:58 mailbackup-sd: 3304 Issuing autochanger
> "load slot 2, drive 0" command. 20-Aug 08:59 mailbackup-sd: 3305
> Autochanger "load slot 2, drive 0", status is OK. 20-Aug 08:59
> mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" command. 20-Aug
> 09:00 mailbackup-sd: 3302 Autochanger "loaded? drive 0", result is Slot
> 257. 20-Aug 09:00 mailbackup-sd: Volume "000007L3" previously written,
> moving to end of data. 20-Aug 09:13 mailbackup-sd: Ready to append to end
> of Volume "000007L3" at file=133. 20-Aug 09:13 mailbackup-sd: Spooling data
> ...
>
>
> SD wants to unload the first drive. Then he wants to load slot 2
> into drive 0. Status seems okay.
>
> This "000007L3" tape was running a job that I CANCELLED prior to restarting
> sd. Volume "000007L3" previously written, moving to end of data....
>
> Device "Drive-0" (/dev/nsa0) is mounted with:
>      Volume:      000007L3
>      Pool:        Tape1
>      Media type:  LTO3
>      Device is being initialized.
>      Slot 257 is loaded in drive 0.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=13 Block=0
>
> Job starts running okay1
>
> Start a second job....
> 20-Aug 09:57 mailbackup-dir: Start Backup JobId 120,
> Job=maildrop1.2007-08-20_09.57.06 20-Aug 09:57 mailbackup-dir: Using Device
> "Drive-1"
> 20-Aug 09:57 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 1"
> command. 20-Aug 09:57 mailbackup-sd: 3302 Autochanger "loaded? drive 1",
> result: nothing loaded. 20-Aug 09:57 mailbackup-sd: 3304 Issuing
> autochanger "load slot 5, drive 1" command. 20-Aug 09:58 mailbackup-sd:
> maildrop1.2007-08-20_09.57.06 Fatal error: 3992 Bad autochanger "load slot
> 5, drive 1": ERR=Child exited with code 1.
> Results=chio: /dev/ch0: CHIOMOVE: Input/output error
>
> 20-Aug 09:58 maildrop1: maildrop1.2007-08-20_09.57.06 Fatal error:
> job.c:1758 Bad response to Append Data command. Wanted 3000 OK data
> , got 3903 Error append data
> Fatal error: job.c:1758 Bad response to Append Data command.
> Wanted 3000 OK data , got 3903 Error append data.
>
> Finds that nothing is loaded in drive 1. It then
> wishes to load slot 5 into drive 1. Of course slot 15
> is tape number 000015L3, which is currently loaded in
> drive 2!  This will cause a continuous problem until
> tape number 000015L3 is put back into slot 5!!!!!
>
>
>
> Device "Drive-0" (/dev/nsa0) is not open.
>      Drive 0 status unknown.
> Device "Drive-1" (/dev/nsa1) is not open.
>      Drive 1 status unknown.
> Device "Drive-2" (/dev/nsa2) is mounted with:
>      Volume:
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 259 is loaded in drive 2.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
> ====
>
> In Use Volume status:
> ====
>
> Device "Drive-0" (/dev/nsa0) is not open.
>      Drive 0 status unknown.
> Device "Drive-1" (/dev/nsa1) is not open.
>      Drive 1 status unknown.
> Device "Drive-2" (/dev/nsa2) is mounted with:
>      Volume:
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 259 is loaded in drive 2.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
> ====
>
> In Use Volume status:
> ====
>
> *********************************************************************
>
> SCENARIO 3
>
> Device status:
> Autochanger "Autochanger" with devices:
>     "Drive-0" (/dev/nsa0)
>     "Drive-1" (/dev/nsa1)
>     "Drive-2" (/dev/nsa2)
> Device "FileStorage" (/database/backup) is not open.
> Device "Drive-0" (/dev/nsa0) is mounted with:
>      Volume:      000004L3
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 257 is loaded in drive 0.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
> Device "Drive-1" (/dev/nsa1) is not open.
>      Drive 1 status unknown.
> Device "Drive-2" (/dev/nsa2) is mounted with:
>      Volume:      000015L3
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 259 is loaded in drive 2.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
> ====
>
> In Use Volume status:
> 000004L3 on device "Drive-0" (/dev/nsa0)
>      Reader=0 writers=0 reserved=0
> 000015L3 on device "Drive-2" (/dev/nsa2)
>      Reader=0 writers=0 reserved=0
> ====
>
> 20-Aug 10:53 mailbackup-dir: Start Backup JobId 121,
> Job=maildrop0.2007-08-20_10.53.55 20-Aug 10:53 mailbackup-dir: Using Device
> "Drive-0"
> 20-Aug 10:53 mailbackup-sd: 3307 Issuing autochanger "unload slot 257,
> drive 0" command. 20-Aug 10:54 mailbackup-sd: 3304 Issuing autochanger
> "load slot 5, drive 0" command. 20-Aug 10:55 mailbackup-sd:
> maildrop0.2007-08-20_10.53.55 Fatal error: 3992 Bad autochanger "load slot
> 5, drive 0": ERR=Child exited with code 1.
> Results=chio: /dev/ch0: CHIOMOVE: Input/output error
>
> 20-Aug 10:55 maildrop0: maildrop0.2007-08-20_10.53.55 Fatal error:
> job.c:1758 Bad response to Append Data command. Wanted 3000 OK data
> , got 3903 Error append data
>
> ====
>
> Device status:
> Autochanger "Autochanger" with devices:
>     "Drive-0" (/dev/nsa0)
>     "Drive-1" (/dev/nsa1)
>     "Drive-2" (/dev/nsa2)
> Device "FileStorage" (/database/backup) is not open.
> Device "Drive-0" (/dev/nsa0) is not open.
>      Drive 0 status unknown.
> Device "Drive-1" (/dev/nsa1) is not open.
>      Drive 1 status unknown.
> Device "Drive-2" (/dev/nsa2) is mounted with:
>      Volume:
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 259 is loaded in drive 2.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
> ====
> In Use Volume status:
> ====
>
> ====
>
> Bacula starts it's first job after a Bacula restart and having tapes still
> in some drives. He knows that tape number 000004L3 is loaded in drive 0 and
> tape number 000015L3 is loaded in drive 2.
>
> Device "Drive-0" (/dev/nsa0) is mounted with:
>      Volume:      000004L3
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 257 is loaded in drive 0.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
> Device "Drive-1" (/dev/nsa1) is not open.
>      Drive 1 status unknown.
> Device "Drive-2" (/dev/nsa2) is mounted with:
>      Volume:      000015L3
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 259 is loaded in drive 2.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
> ====
>
> In Use Volume status:
> 000004L3 on device "Drive-0" (/dev/nsa0)
>      Reader=0 writers=0 reserved=0
> 000015L3 on device "Drive-2" (/dev/nsa2)
>      Reader=0 writers=0 reserved=0
> ====
>
> Bacula first says he wants to use drive 0. Because he KNOWS the status, he
> immedeately issues an unload of
> drive 0. Bacula wishes to use tape number 000015L3 in drive 0 but it is
> already loaded on drive 2. Start a new job and
>
> error: job.c:1758. Bad response to Append Data command. Wanted 3000 OK data
> , got 3903 Error append data
>
> Start another job and
>
> error: job.c:1758. Bad response to Append Data command. Wanted 3000 OK data
> , got 3903 Error append data
>
> Output from the second "status 2 2" command shows that Bacula nolonger
> knows what tape is in drive 2.
>
> This problem will continue to happen until I unload tape number 000015L3.
>
> Keeping in mind that at this point in the discussion, Bacula has a "hold"
> on drive 0 trying to load tape 000015L3, as discussed above. We will
> compund the problem by starting a few more jobs....
>
> Bacula will get one job to use drive 1 with a new tape 000011L3. This job
> goes on it's merry way.
>
> The next job starts....
>
> Bacula decides to use drive 0, (bacula is finished with the "hold"
> he has on drive 0) and the next available tape 000011L3 thinking
> tape 000011L3 is still in a slot when it is really in drive 1.
>
> error: job.c:1758. Bad response to Append Data command. Wanted 3000 OK data
> , got 3903 Error append data
>
> This problem will continue to happen until I can free up tape 000011L3 for
> drive 0.
>
> Do a chio status -a
>
> drive 0: <ACCESS> sense: <0x00/0x00> voltag: <:0> avoltag: <:0> source: <>
> intaddr: <256> scsi: <3:?> drive 1: <FULL> sense: <0x00/0x00> voltag:
> <000011L3:0> avoltag: <:0> source: <slot 5> intaddr: <257> scsi: <1:?>
> drive 2: <FULL> sense: <0x00/0x00> voltag: <000015L3:0> avoltag: <:0>
> source: <slot 4> intaddr: <258> scsi: <2:?>
>
> Tape number 000011L3 in drive 1
> Tape number 000015L3 in drive 2
>
>
> Do another status 2 2 and WHAT????
> Bacula now thinks that the Device status is:
>
> Device "Drive-0" (/dev/nsa0) is not open.
>      Drive 0 status unknown.
> Device "Drive-1" (/dev/nsa1) is mounted with:
>      Volume:
>      Pool:        Tape1
>      Media type:  LTO3
>      Slot 258 is loaded in drive 1.
>      Total Bytes=1,864,654,848 Blocks=28,903 Bytes/block=64,514
>      Positioned at File=2 Block=0
> Device "Drive-2" (/dev/nsa2) is mounted with:
>      Volume:
>      Pool:        *unknown*
>      Media type:  LTO3
>      Slot 259 is loaded in drive 2.
>      Total Bytes Read=0 Blocks Read=0 Bytes/block=0
>      Positioned at File=0 Block=0
>
> Bacula now thinks the In Use Volume status is:
>
> 000011L3 on device "Drive-0" (/dev/nsa0)
>      Reader=0 writers=0 reserved=0
>
>
> It would appear that Bacula is doing 2 things at job start.
>
>       1. Decide what drive he wishes to use.
>       2. Decide what tape he wishes to use.
>
> The problem is that Bacula is not thinking about what is in use and where
> it is. Note the last status 2 2 after the job fails with a fatal error.
>
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by: Microsoft
> Defy all challenges. Microsoft(R) Visual Studio 2005.
> http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
> _______________________________________________
> Bacula-devel mailing list
> [EMAIL PROTECTED]
> https://lists.sourceforge.net/lists/listinfo/bacula-devel

-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft
Defy all challenges. Microsoft(R) Visual Studio 2005.
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users

Reply via email to