Hello, I'd recommend that you send this email to the bacula-users list (in fact, I am copying them). They are very good at resolving these kinds of problems. Please see www.bacula.org -> Support for more details.
Regards, Kern On Tuesday 21 August 2007 16:26, Steve Holstead wrote: > I am running bacula-2.2.0 trying to get it working on a FreeBSD > 6.2-RELEASE. I have an autochanger with 3 drives. I have run the complete > btape tests with success. I am now trying to get multiple jobs running on > multiple tapes simultaneously. Below are scenarios I have documented. It > would appear that: > 1. Drive status does not always reflect reality > 2. Volume status rarely reflects reality > 3. Drive and Volume status do not match up > 4. Bacula's need to ask for a tape on a new drive that he > is already using. > > That said, I hope there is some glaring wrong I am committing, or at the > least you have heard of these problems before and could offer me some > help... > > ********************************************************************* > > SCENARIO 1 > > Two jobs are running: > > 140 | nfs 2007-08-21 07:25:49 Using Drive 1 > 141 | nfs 2007-08-21 07:25:59 Using Drive 2 > > However the Device Status DOES NOT MATCH > the Volume status. > > Device status: > Autochanger "Autochanger" with devices: > "Drive-0" (/dev/nsa0) > "Drive-1" (/dev/nsa1) > "Drive-2" (/dev/nsa2) > Device "FileStorage" (/database/backup) is not open. > Device "Drive-0" (/dev/nsa0) is not open. > Drive 0 status unknown. > Device "Drive-1" (/dev/nsa1) is mounted with: > Volume: 000006L3 > Pool: Tape1 > Media type: LTO3 > Slot 258 is loaded in drive 1. > Total Bytes=354,428,928 Blocks=5,493 Bytes/block=64,523 > Positioned at File=2 Block=0 > Device "Drive-2" (/dev/nsa2) is mounted with: > Volume: 000005L3 > Pool: Tape1 > Media type: LTO3 > Slot 259 is loaded in drive 2. > Total Bytes=64,512 Blocks=0 Bytes/block=64,512 > Positioned at File=0 Block=1 > ==== > > In Use Volume status: > 000005L3 on device "Drive-0" (/dev/nsa0) > Reader=0 writers=0 reserved=0 > 000006L3 on device "Drive-1" (/dev/nsa1) > Reader=0 writers=1 reserved=0 > ==== > > Now lets start the third job..... > > Bacula knows there is nothing in Drive 0 but he insists on > selecting a tape for that drive that is in Drive 2 even though > the Drive status says the tape is in use and the volume status > (all though WRONG!!) says the tape is IN USE.... > > 21-Aug 07:43 mailbackup-dir: Start Backup JobId 143, > Job=nfs.2007-08-21_07.43.21 21-Aug 07:43 mailbackup-dir: Using Device > "Drive-0" > 21-Aug 07:43 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" > command. 21-Aug 07:43 mailbackup-sd: 3302 Autochanger "loaded? drive 0", > result: nothing loaded. 21-Aug 07:43 mailbackup-sd: 3304 Issuing > autochanger "load slot 12, drive 0" command. 21-Aug 07:44 mailbackup-sd: > nfs.2007-08-21_07.43.21 Fatal error: 3992 Bad autochanger "load slot 12, > drive 0": ERR=Child exite d with code 1. > Results=chio: /dev/ch0: CHIOMOVE: Input/output error > > 21-Aug 07:44 nfs: nfs.2007-08-21_07.43.21 Fatal error: job.c:1758 Bad > response to Append Data command. Wanted 3000 OK data , got 3903 Error > append data > > 21-Aug 07:44 mailbackup-dir: nfs.2007-08-21_07.43.21 Error: Bacula > mailbackup-dir 2.2.0 (08Aug07): 21-Aug-2007 07:44:16 Build OS: > i386-unknown-freebsd6.2 freebsd 6.2-RELEASE JobId: 143 > Job: nfs.2007-08-21_07.43.21 > Backup Level: Full > Client: "nfs" 2.0.3 (06Mar07) > i386-unknown-openbsd4.1,openbsd,4.1 FileSet: "nfs" > 2007-08-10 10:34:22 > Pool: "Tape1" (From Job resource) > Storage: "Autochanger" (From Job resource) > Scheduled time: 21-Aug-2007 07:43:16 > Start time: 21-Aug-2007 07:43:25 > End time: 21-Aug-2007 07:44:16 > Elapsed time: 51 secs > Priority: 10 > FD Files Written: 0 > SD Files Written: 0 > FD Bytes Written: 0 (0 B) > SD Bytes Written: 0 (0 B) > Rate: 0.0 KB/s > Software Compression: None > VSS: no > Encryption: no > Volume name(s): > Volume Session Id: 11 > Volume Session Time: 1187700921 > Last Volume Bytes: 64,512 (64.51 KB) > Non-fatal FD errors: 0 > SD Errors: 0 > FD termination status: Error > SD termination status: Error > Termination: *** Backup Error *** > > *************************************************************************** > > SCENARIO 2 > > It would appear that Bacula gets confused as to the tapes that are > already mounted if Bacula has been shutdown and restarted and there where > tapes left mounted in the drives. Why? > > -Should I be shuting down bacula differently? > -Should I be shutting down postgres first? > > > Job finishes OK! But when doing status 2 2 there is no Volume name???? > > Device "Drive-2" (/dev/nsa2) is mounted with: > Volume: > Pool: Tape1 > Media type: LTO3 > Slot 259 is loaded in drive 2. > Total Bytes=23,804,928 Blocks=368 Bytes/block=64,687 > Positioned at File=1 Block=0 > > > 2 tapes are loaded in drives. I shutdown sd and restart: > > Device status: > Autochanger "Autochanger" with devices: > "Drive-0" (/dev/nsa0) > "Drive-1" (/dev/nsa1) > "Drive-2" (/dev/nsa2) > Device "FileStorage" (/database/backup) is not open. > Device "Drive-0" (/dev/nsa0) is mounted with: > Volume: 000007L3 > Pool: *unknown* > Media type: LTO3 > Slot 257 is loaded in drive 0. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > Device "Drive-1" (/dev/nsa1) is not open. > Drive 1 status unknown. > Device "Drive-2" (/dev/nsa2) is mounted with: > Volume: 000015L3 > Pool: *unknown* > Media type: LTO3 > Slot 259 is loaded in drive 2. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > ==== > > In Use Volume status: > 000007L3 on device "Drive-0" (/dev/nsa0) > Reader=0 writers=0 reserved=0 > 000015L3 on device "Drive-2" (/dev/nsa2) > Reader=0 writers=0 reserved=0 > > > Start a job just after restart of sd. > > > > 20-Aug 08:53 mailbackup-dir: Start Backup JobId 116, > Job=maildrop0.2007-08-20_08.53.47 20-Aug 08:53 mailbackup-dir: Using Device > "Drive-0" > 20-Aug 08:53 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" > command. 20-Aug 08:54 mailbackup-sd: 3302 Autochanger "loaded? drive 0", > result: nothing loaded. 20-Aug 08:54 mailbackup-sd: 3304 Issuing > autochanger "load slot 4, drive 0" command. 20-Aug 08:54 mailbackup-sd: > 3305 Autochanger "load slot 4, drive 0", status is OK. 20-Aug 08:54 > mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" command. 20-Aug > 08:55 mailbackup-sd: 3302 Autochanger "loaded? drive 0", result is Slot > 257. 20-Aug 08:55 mailbackup-sd: Volume "000004L3" previously written, > moving to end of data. 20-Aug 08:56 mailbackup-sd: > maildrop0.2007-08-20_08.53.47 Error: Bacula cannot write on tape Volume > "000004L3" because: The number of files mismatch! Volume=13 Catalog=14 > 20-Aug 08:56 mailbackup-sd: Marking Volume "000004L3" in Error in Catalog. > 20-Aug 08:58 mailbackup-sd: 3307 Issuing autochanger "unload slot 257, > drive 0" command. 20-Aug 08:58 mailbackup-sd: 3304 Issuing autochanger > "load slot 2, drive 0" command. 20-Aug 08:59 mailbackup-sd: 3305 > Autochanger "load slot 2, drive 0", status is OK. 20-Aug 08:59 > mailbackup-sd: 3301 Issuing autochanger "loaded? drive 0" command. 20-Aug > 09:00 mailbackup-sd: 3302 Autochanger "loaded? drive 0", result is Slot > 257. 20-Aug 09:00 mailbackup-sd: Volume "000007L3" previously written, > moving to end of data. 20-Aug 09:13 mailbackup-sd: Ready to append to end > of Volume "000007L3" at file=133. 20-Aug 09:13 mailbackup-sd: Spooling data > ... > > > SD wants to unload the first drive. Then he wants to load slot 2 > into drive 0. Status seems okay. > > This "000007L3" tape was running a job that I CANCELLED prior to restarting > sd. Volume "000007L3" previously written, moving to end of data.... > > Device "Drive-0" (/dev/nsa0) is mounted with: > Volume: 000007L3 > Pool: Tape1 > Media type: LTO3 > Device is being initialized. > Slot 257 is loaded in drive 0. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=13 Block=0 > > Job starts running okay1 > > Start a second job.... > 20-Aug 09:57 mailbackup-dir: Start Backup JobId 120, > Job=maildrop1.2007-08-20_09.57.06 20-Aug 09:57 mailbackup-dir: Using Device > "Drive-1" > 20-Aug 09:57 mailbackup-sd: 3301 Issuing autochanger "loaded? drive 1" > command. 20-Aug 09:57 mailbackup-sd: 3302 Autochanger "loaded? drive 1", > result: nothing loaded. 20-Aug 09:57 mailbackup-sd: 3304 Issuing > autochanger "load slot 5, drive 1" command. 20-Aug 09:58 mailbackup-sd: > maildrop1.2007-08-20_09.57.06 Fatal error: 3992 Bad autochanger "load slot > 5, drive 1": ERR=Child exited with code 1. > Results=chio: /dev/ch0: CHIOMOVE: Input/output error > > 20-Aug 09:58 maildrop1: maildrop1.2007-08-20_09.57.06 Fatal error: > job.c:1758 Bad response to Append Data command. Wanted 3000 OK data > , got 3903 Error append data > Fatal error: job.c:1758 Bad response to Append Data command. > Wanted 3000 OK data , got 3903 Error append data. > > Finds that nothing is loaded in drive 1. It then > wishes to load slot 5 into drive 1. Of course slot 15 > is tape number 000015L3, which is currently loaded in > drive 2! This will cause a continuous problem until > tape number 000015L3 is put back into slot 5!!!!! > > > > Device "Drive-0" (/dev/nsa0) is not open. > Drive 0 status unknown. > Device "Drive-1" (/dev/nsa1) is not open. > Drive 1 status unknown. > Device "Drive-2" (/dev/nsa2) is mounted with: > Volume: > Pool: *unknown* > Media type: LTO3 > Slot 259 is loaded in drive 2. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > ==== > > In Use Volume status: > ==== > > Device "Drive-0" (/dev/nsa0) is not open. > Drive 0 status unknown. > Device "Drive-1" (/dev/nsa1) is not open. > Drive 1 status unknown. > Device "Drive-2" (/dev/nsa2) is mounted with: > Volume: > Pool: *unknown* > Media type: LTO3 > Slot 259 is loaded in drive 2. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > ==== > > In Use Volume status: > ==== > > ********************************************************************* > > SCENARIO 3 > > Device status: > Autochanger "Autochanger" with devices: > "Drive-0" (/dev/nsa0) > "Drive-1" (/dev/nsa1) > "Drive-2" (/dev/nsa2) > Device "FileStorage" (/database/backup) is not open. > Device "Drive-0" (/dev/nsa0) is mounted with: > Volume: 000004L3 > Pool: *unknown* > Media type: LTO3 > Slot 257 is loaded in drive 0. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > Device "Drive-1" (/dev/nsa1) is not open. > Drive 1 status unknown. > Device "Drive-2" (/dev/nsa2) is mounted with: > Volume: 000015L3 > Pool: *unknown* > Media type: LTO3 > Slot 259 is loaded in drive 2. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > ==== > > In Use Volume status: > 000004L3 on device "Drive-0" (/dev/nsa0) > Reader=0 writers=0 reserved=0 > 000015L3 on device "Drive-2" (/dev/nsa2) > Reader=0 writers=0 reserved=0 > ==== > > 20-Aug 10:53 mailbackup-dir: Start Backup JobId 121, > Job=maildrop0.2007-08-20_10.53.55 20-Aug 10:53 mailbackup-dir: Using Device > "Drive-0" > 20-Aug 10:53 mailbackup-sd: 3307 Issuing autochanger "unload slot 257, > drive 0" command. 20-Aug 10:54 mailbackup-sd: 3304 Issuing autochanger > "load slot 5, drive 0" command. 20-Aug 10:55 mailbackup-sd: > maildrop0.2007-08-20_10.53.55 Fatal error: 3992 Bad autochanger "load slot > 5, drive 0": ERR=Child exited with code 1. > Results=chio: /dev/ch0: CHIOMOVE: Input/output error > > 20-Aug 10:55 maildrop0: maildrop0.2007-08-20_10.53.55 Fatal error: > job.c:1758 Bad response to Append Data command. Wanted 3000 OK data > , got 3903 Error append data > > ==== > > Device status: > Autochanger "Autochanger" with devices: > "Drive-0" (/dev/nsa0) > "Drive-1" (/dev/nsa1) > "Drive-2" (/dev/nsa2) > Device "FileStorage" (/database/backup) is not open. > Device "Drive-0" (/dev/nsa0) is not open. > Drive 0 status unknown. > Device "Drive-1" (/dev/nsa1) is not open. > Drive 1 status unknown. > Device "Drive-2" (/dev/nsa2) is mounted with: > Volume: > Pool: *unknown* > Media type: LTO3 > Slot 259 is loaded in drive 2. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > ==== > In Use Volume status: > ==== > > ==== > > Bacula starts it's first job after a Bacula restart and having tapes still > in some drives. He knows that tape number 000004L3 is loaded in drive 0 and > tape number 000015L3 is loaded in drive 2. > > Device "Drive-0" (/dev/nsa0) is mounted with: > Volume: 000004L3 > Pool: *unknown* > Media type: LTO3 > Slot 257 is loaded in drive 0. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > Device "Drive-1" (/dev/nsa1) is not open. > Drive 1 status unknown. > Device "Drive-2" (/dev/nsa2) is mounted with: > Volume: 000015L3 > Pool: *unknown* > Media type: LTO3 > Slot 259 is loaded in drive 2. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > ==== > > In Use Volume status: > 000004L3 on device "Drive-0" (/dev/nsa0) > Reader=0 writers=0 reserved=0 > 000015L3 on device "Drive-2" (/dev/nsa2) > Reader=0 writers=0 reserved=0 > ==== > > Bacula first says he wants to use drive 0. Because he KNOWS the status, he > immedeately issues an unload of > drive 0. Bacula wishes to use tape number 000015L3 in drive 0 but it is > already loaded on drive 2. Start a new job and > > error: job.c:1758. Bad response to Append Data command. Wanted 3000 OK data > , got 3903 Error append data > > Start another job and > > error: job.c:1758. Bad response to Append Data command. Wanted 3000 OK data > , got 3903 Error append data > > Output from the second "status 2 2" command shows that Bacula nolonger > knows what tape is in drive 2. > > This problem will continue to happen until I unload tape number 000015L3. > > Keeping in mind that at this point in the discussion, Bacula has a "hold" > on drive 0 trying to load tape 000015L3, as discussed above. We will > compund the problem by starting a few more jobs.... > > Bacula will get one job to use drive 1 with a new tape 000011L3. This job > goes on it's merry way. > > The next job starts.... > > Bacula decides to use drive 0, (bacula is finished with the "hold" > he has on drive 0) and the next available tape 000011L3 thinking > tape 000011L3 is still in a slot when it is really in drive 1. > > error: job.c:1758. Bad response to Append Data command. Wanted 3000 OK data > , got 3903 Error append data > > This problem will continue to happen until I can free up tape 000011L3 for > drive 0. > > Do a chio status -a > > drive 0: <ACCESS> sense: <0x00/0x00> voltag: <:0> avoltag: <:0> source: <> > intaddr: <256> scsi: <3:?> drive 1: <FULL> sense: <0x00/0x00> voltag: > <000011L3:0> avoltag: <:0> source: <slot 5> intaddr: <257> scsi: <1:?> > drive 2: <FULL> sense: <0x00/0x00> voltag: <000015L3:0> avoltag: <:0> > source: <slot 4> intaddr: <258> scsi: <2:?> > > Tape number 000011L3 in drive 1 > Tape number 000015L3 in drive 2 > > > Do another status 2 2 and WHAT???? > Bacula now thinks that the Device status is: > > Device "Drive-0" (/dev/nsa0) is not open. > Drive 0 status unknown. > Device "Drive-1" (/dev/nsa1) is mounted with: > Volume: > Pool: Tape1 > Media type: LTO3 > Slot 258 is loaded in drive 1. > Total Bytes=1,864,654,848 Blocks=28,903 Bytes/block=64,514 > Positioned at File=2 Block=0 > Device "Drive-2" (/dev/nsa2) is mounted with: > Volume: > Pool: *unknown* > Media type: LTO3 > Slot 259 is loaded in drive 2. > Total Bytes Read=0 Blocks Read=0 Bytes/block=0 > Positioned at File=0 Block=0 > > Bacula now thinks the In Use Volume status is: > > 000011L3 on device "Drive-0" (/dev/nsa0) > Reader=0 writers=0 reserved=0 > > > It would appear that Bacula is doing 2 things at job start. > > 1. Decide what drive he wishes to use. > 2. Decide what tape he wishes to use. > > The problem is that Bacula is not thinking about what is in use and where > it is. Note the last status 2 2 after the job fails with a fatal error. > > > > ------------------------------------------------------------------------- > This SF.net email is sponsored by: Microsoft > Defy all challenges. Microsoft(R) Visual Studio 2005. > http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ > _______________________________________________ > Bacula-devel mailing list > [EMAIL PROTECTED] > https://lists.sourceforge.net/lists/listinfo/bacula-devel ------------------------------------------------------------------------- This SF.net email is sponsored by: Microsoft Defy all challenges. Microsoft(R) Visual Studio 2005. http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/ _______________________________________________ Bacula-users mailing list Bacula-users@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/bacula-users