Re: [Bacula-users] Autochanger: Unloading/Reloading tape for follow-up job - Bad autochanger loaded drive 0 command: ???
Hi, On 7/7/2006 6:25 PM, Christoff Buch wrote: Hi Arno This has taken a while but here come mtx and bacula-sd.conf: As to the btape test results: I can't find my files with the results at the moment (oops...), but I have the print-outs here: I reviewed them again (as I already did when I started with the autochanger) and there is really not a single error whatsoever. I did the normal test for the drive as such and the test for the autochanger. Both succeded fine. A good start, at least... If you can find anything, answers would be very appreciated! Well, I looked them over, and - see below. Thanks again + Best Regards, Christoff Buch ... #!/bin/sh # # Bacula interface to mtx autoloader Adapted by Christoff Buch, OneVision Software AG, 2006-05-31 # # $Id: mtx-changer.in,v 1.21.2.7 2006/05/02 14:48:12 kerns Exp $ # # If you set in your Device resource # # Changer Command = path-to-this-script/mtx-changer %c %o %S %a %d #you will have the following input to this script: # # So Bacula will always call with all the following arguments, even though #in come cases, not all are used. # # mtx-changer changer-device command slot archive-device drive-index #$1 $2 $3$4 $5 # # for example: # # mtx-changer /dev/sg0 load 1 /dev/nst0 0 (on a Linux system) # # will request to load the first cartidge into drive 0, where # the SCSI control channel is /dev/sg0, and the read/write device # is /dev/nst0. # # If you need to an offline, refer to the drive as $4 #e.g. mt -f $4 offline # # Many changers need an offline after (Sorry, not before?) the unload. Also many # changers need a sleep 60 after the mtx load. # # N.B. If you change the script, take care to return either # the mtx exit code or a 0. If the script exits with a non-zero # exit code, Bacula will assume the request failed. # MTX=mtx # # log whats done # # to turn on logging, uncomment the following line You might consider turning on logging. Personally, I also like some more extensive output in the debug log, like mtx output. #touch /usr/local/bacula/working/mtx.log # dbgfile=/usr/local/bacula/working/mtx.log debug() { if test -f $dbgfile; then echo `date +\%Y%m%d-%H:%M:%S\` $* $dbgfile fi } # # Create a temporary file # make_temp_file() { ... } # # The purpose of this function to wait a maximum # time for the drive. It will # return as soon as the drive is ready, or after # waiting a maximum of 300 seconds. # Note, this is very system dependent, so if you are # not running on Linux, you will probably need to # re-write it, or at least change the grep target. # wait_for_drive() { i=0 while [ $i -le 300 ]; do # Wait max 300 seconds if mt -f $1 status | grep ONLINE /dev/null 21; then break fi debug Device $1 - not ready, retrying... sleep 1 i=`expr $i + 1` done } Ok, here it might happen that waiting for the drive times out without notification. I use a longer wait time per loop, and you should log the result of the last check, for example. Also, there is a last-resort timeout in the SD configuration: Maximum Changer Wait = time in seconds should be set to a time longer than any mtx-changer operation should ever take. I have this set to a very long time. # check parameter count on commandline # check_parm_count() { ... # Check for special cases where only 2 arguments are needed, # all others are a minimum of 5 # ... case $cmd in unload) debug Doing mtx -f $ctl unload $slot $drive # # enable the following line if you need to eject the cartridge # the following two(!)lines enabled by Christoff Buch, OneVision Software AG, 2006-05-31 mt -f $device offline sleep 10 should not be a problem, but when reading this I thought about adding a negative wait_for_drive, some function that makes sure a tape is not loaded (but perhaps in the drive). ${MTX} -f $ctl unload $slot $drive ;; load) debug Doing mtx -f $ctl load $slot $drive ${MTX} -f $ctl load $slot $drive rtn=$? # # Increase the sleep time if you have a slow device # or remove the sleep and add the following: # the following line enabled by Christoff Buch, OneVision Software AG, 2006-05-31 wait_for_drive $device # the following line disabled by Christoff Buch, OneVision Software AG, 2006-05-31 # sleep 15 exit $rtn looks good, although, even after the wait_for_drive succeeded the $rtn variable might still contain a value != 0. Worth checking or logging, I think. ;; list) debug Doing mtx -f $ctl -- to list volumes make_temp_file # Enable the following if you are using barcodes and need an inventory # OneVision
Re: [Bacula-users] Autochanger: Unloading/Reloading tape for follow-up job - Bad autochanger loaded drive 0 command: ???
Hi Arno This has taken a while but here come mtx and bacula-sd.conf: As to the btape test results: I can't find my files with the results at the moment (oops...), but I have the print-outs here: I reviewed them again (as I already did when I started with the autochanger) and there is really not a single error whatsoever. I did the normal test for the drive as such and the test for the autochanger. Both succeded fine. If you can find anything, answers would be very appreciated! Thanks again + Best Regards, Christoff Buch [EMAIL PROTECTED] wrote on 29.06.2006 23:06:27: Hello, On 6/28/2006 11:23 AM, Christoff Buch wrote: Hello Users! I hope someone with autochanger expierience can help me with my concerns First of all my environment: - bacula 1.38.9 (from source) - mtx 1.2.18rel (from rpm) - SuSE 9.0 Kernel 2.4.21-99 - Exabyte 221L TapeLibrary (21 slots in 3 Magazines, Barcodereader, 1 IBM Ultrium LTO-2 drive) First question: btape tests? I don't think problem is in configure-options nor in bacula-xxx.confs since backups succeed. So none of them here. I just get those strange statements in log over and over again, and I'm not sure wether someday this may lead to a crash. Concerning the Bad autochanger loaded drive 0 command I guess it might be a timing problem, but can't decide wether it's the mtx-changer or bacula itself. Do you know anything? (I'll add my mtx-changer) I can't read it as it claims to be a .doc file and I'm too lazy to read list mail in other formats than ASCII ;-) I think two things are most likely the reason for your problems: Configuration (use btape and post the SD config) and mtx-changer, especially concerning timing of autoloader operations. Arno Concerning the Unloading / Reloading tape for follow-up job: Just why??? It's just sometimes happening, not each time! Today both happened in once, so here is the part of my log: JobId: 155 Job: Job1_.2006-06-27_22.12.15 Backup Level: Full Client: Client1 Windows Server 2003,MVS,NT 5.2.3790 FileSet: FileSet1 2006-06-13 20:16:29 Pool: Daily Storage: TapeLibrary Scheduled time: 27-Jun-2006 22:12:10 Start time: 27-Jun-2006 22:12:20 End time: 28-Jun-2006 00:55:04 Elapsed time: 2 hours 42 mins 44 secs Priority: 10 FD Files Written:3,903 SD Files Written:3,903 FD Bytes Written:122,954,473,051 (122.9 GB) SD Bytes Written:122,955,147,172 (122.9 GB) Rate: 12592.6 KB/s Software Compression: None Volume name(s): TUE001L2 Volume Session Id: 1 Volume Session Time: 1151424310 Last Volume Bytes: 123,046,455,465 (123.0 GB) Non-fatal FD errors: 0 SD Errors: 0 FD termination status: OK SD termination status: OK Termination: Backup OK 28-Jun 00:55 heem-dir: Begin pruning Jobs. 28-Jun 00:55 heem-dir: No Jobs found to prune. 28-Jun 00:55 heem-dir: Begin pruning Files. 28-Jun 00:55 heem-dir: No Files found to prune. 28-Jun 00:55 heem-dir: End auto prune. 28-Jun 00:55 heem-dir: Start Backup JobId 156, Job2_.2006-06-27_23.00.00 *28-Jun 00:55 heem-sd: 3301 Issuing autochanger loaded drive 0 command.* *28-Jun 00:55 heem-sd: 3991 Bad autochanger loaded drive 0 command: ERR=Operation not permitted.* 28-Jun 00:55 heem-sd: 3301 Issuing autochanger loaded drive 0 command. 28-Jun 00:55 heem-sd: 3302 Autochanger loaded drive 0, result is Slot 3. *28-Jun 00:55 heem-sd: 3307 Issuing autochanger unload slot 3, drive 0 command.* *28-Jun 00:55 heem-sd: 3304 Issuing autochanger load slot 3, drive 0 command.* 28-Jun 00:56 heem-sd: 3305 Autochanger load slot 3, drive 0, status is OK. 28-Jun 00:56 heem-sd: 3301 Issuing autochanger loaded drive 0 command. 28-Jun 00:56 heem-sd: 3302 Autochanger loaded drive 0, result is Slot 3. 28-Jun 00:56 heem-sd: Volume TUE001L2 previously written, moving to end of data. 28-Jun 00:56 heem-sd: Ready to append to end of Volume TUE001L2 at file=124. 28-Jun 01:52 heem-dir: Bacula 1.38.9 (02May06): 28-Jun-2006 01:52:29 JobId: 156 Job: Job2_.2006-06-27_23.00.00 Backup Level: Full Client: Client2 i686-pc-linux-gnu,suse,9.0 FileSet: FileSet2 2006-06-15 00:45:31 Pool: Daily Storage: TapeLibrary Scheduled time: 27-Jun-2006 23:00:00 Start time: 28-Jun-2006 00:55:08 End time: 28-Jun-2006 01:52:29 Elapsed time: 57 mins 21 secs Priority: 10 FD Files Written:110,844 SD Files Written:110,844 FD Bytes Written:28,747,087,799 (28.74 GB) SD Bytes Written:28,768,112,206 (28.76 GB) Rate: 8354.3 KB/s Software Compression: None Volume name(s): TUE001L2 Volume Session Id: 2 Volume Session Time:
Re: [Bacula-users] Autochanger: Unloading/Reloading tape for follow-up job - Bad autochanger loaded drive 0 command: ???
Hello, On 6/28/2006 11:23 AM, Christoff Buch wrote: Hello Users! I hope someone with autochanger expierience can help me with my concerns First of all my environment: - bacula 1.38.9 (from source) - mtx 1.2.18rel (from rpm) - SuSE 9.0 Kernel 2.4.21-99 - Exabyte 221L TapeLibrary (21 slots in 3 Magazines, Barcodereader, 1 IBM Ultrium LTO-2 drive) First question: btape tests? I don't think problem is in configure-options nor in bacula-xxx.confs since backups succeed. So none of them here. I just get those strange statements in log over and over again, and I'm not sure wether someday this may lead to a crash. Concerning the Bad autochanger loaded drive 0 command I guess it might be a timing problem, but can't decide wether it's the mtx-changer or bacula itself. Do you know anything? (I'll add my mtx-changer) I can't read it as it claims to be a .doc file and I'm too lazy to read list mail in other formats than ASCII ;-) I think two things are most likely the reason for your problems: Configuration (use btape and post the SD config) and mtx-changer, especially concerning timing of autoloader operations. Arno Concerning the Unloading / Reloading tape for follow-up job: Just why??? It's just sometimes happening, not each time! Today both happened in once, so here is the part of my log: JobId: 155 Job:Job1_.2006-06-27_22.12.15 Backup Level: Full Client: Client1 Windows Server 2003,MVS,NT 5.2.3790 FileSet:FileSet1 2006-06-13 20:16:29 Pool: Daily Storage:TapeLibrary Scheduled time: 27-Jun-2006 22:12:10 Start time: 27-Jun-2006 22:12:20 End time: 28-Jun-2006 00:55:04 Elapsed time: 2 hours 42 mins 44 secs Priority: 10 FD Files Written: 3,903 SD Files Written: 3,903 FD Bytes Written: 122,954,473,051 (122.9 GB) SD Bytes Written: 122,955,147,172 (122.9 GB) Rate: 12592.6 KB/s Software Compression: None Volume name(s): TUE001L2 Volume Session Id: 1 Volume Session Time:1151424310 Last Volume Bytes: 123,046,455,465 (123.0 GB) Non-fatal FD errors:0 SD Errors: 0 FD termination status: OK SD termination status: OK Termination:Backup OK 28-Jun 00:55 heem-dir: Begin pruning Jobs. 28-Jun 00:55 heem-dir: No Jobs found to prune. 28-Jun 00:55 heem-dir: Begin pruning Files. 28-Jun 00:55 heem-dir: No Files found to prune. 28-Jun 00:55 heem-dir: End auto prune. 28-Jun 00:55 heem-dir: Start Backup JobId 156, Job2_.2006-06-27_23.00.00 *28-Jun 00:55 heem-sd: 3301 Issuing autochanger loaded drive 0 command.* *28-Jun 00:55 heem-sd: 3991 Bad autochanger loaded drive 0 command: ERR=Operation not permitted.* 28-Jun 00:55 heem-sd: 3301 Issuing autochanger loaded drive 0 command. 28-Jun 00:55 heem-sd: 3302 Autochanger loaded drive 0, result is Slot 3. *28-Jun 00:55 heem-sd: 3307 Issuing autochanger unload slot 3, drive 0 command.* *28-Jun 00:55 heem-sd: 3304 Issuing autochanger load slot 3, drive 0 command.* 28-Jun 00:56 heem-sd: 3305 Autochanger load slot 3, drive 0, status is OK. 28-Jun 00:56 heem-sd: 3301 Issuing autochanger loaded drive 0 command. 28-Jun 00:56 heem-sd: 3302 Autochanger loaded drive 0, result is Slot 3. 28-Jun 00:56 heem-sd: Volume TUE001L2 previously written, moving to end of data. 28-Jun 00:56 heem-sd: Ready to append to end of Volume TUE001L2 at file=124. 28-Jun 01:52 heem-dir: Bacula 1.38.9 (02May06): 28-Jun-2006 01:52:29 JobId: 156 Job:Job2_.2006-06-27_23.00.00 Backup Level: Full Client: Client2 i686-pc-linux-gnu,suse,9.0 FileSet:FileSet2 2006-06-15 00:45:31 Pool: Daily Storage:TapeLibrary Scheduled time: 27-Jun-2006 23:00:00 Start time: 28-Jun-2006 00:55:08 End time: 28-Jun-2006 01:52:29 Elapsed time: 57 mins 21 secs Priority: 10 FD Files Written: 110,844 SD Files Written: 110,844 FD Bytes Written: 28,747,087,799 (28.74 GB) SD Bytes Written: 28,768,112,206 (28.76 GB) Rate: 8354.3 KB/s Software Compression: None Volume name(s): TUE001L2 Volume Session Id: 2 Volume Session Time:1151424310 Last Volume Bytes: 151,839,027,155 (151.8 GB) Non-fatal FD errors:0 SD Errors: 0 FD termination status: OK SD termination status: OK Termination:Backup OK In the 3rd job, bacula doesen't repeat the loaded drive 0 command even though it gets an ERR But this time at least it doesn't unload/load the tape but goes on immediately.
[Bacula-users] Autochanger: Unloading/Reloading tape for follow-up job - Bad autochanger loaded drive 0 command: ???
Hello Users! I hope someone with autochanger expierience can help me with my concerns First of all my environment: - bacula 1.38.9 (from source) - mtx 1.2.18rel (from rpm) - SuSE 9.0 Kernel 2.4.21-99 - Exabyte 221L TapeLibrary (21 slots in 3 Magazines, Barcodereader, 1 IBM Ultrium LTO-2 drive) I don't think problem is in configure-options nor in bacula-xxx.confs since backups succeed. So none of them here. I just get those strange statements in log over and over again, and I'm not sure wether someday this may lead to a crash. Concerning the Bad autochanger loaded drive 0 command I guess it might be a timing problem, but can't decide wether it's the mtx-changer or bacula itself. Do you know anything? (I'll add my mtx-changer) Concerning the Unloading / Reloading tape for follow-up job: Just why??? It's just sometimes happening, not each time! Today both happened in once, so here is the part of my log: JobId: 155 Job: Job1_.2006-06-27_22.12.15 Backup Level: Full Client: Client1 Windows Server 2003,MVS,NT 5.2.3790 FileSet: FileSet1 2006-06-13 20:16:29 Pool: Daily Storage: TapeLibrary Scheduled time: 27-Jun-2006 22:12:10 Start time: 27-Jun-2006 22:12:20 End time: 28-Jun-2006 00:55:04 Elapsed time: 2 hours 42 mins 44 secs Priority: 10 FD Files Written: 3,903 SD Files Written: 3,903 FD Bytes Written: 122,954,473,051 (122.9 GB) SD Bytes Written: 122,955,147,172 (122.9 GB) Rate: 12592.6 KB/s Software Compression: None Volume name(s): TUE001L2 Volume Session Id: 1 Volume Session Time: 1151424310 Last Volume Bytes: 123,046,455,465 (123.0 GB) Non-fatal FD errors: 0 SD Errors: 0 FD termination status: OK SD termination status: OK Termination: Backup OK 28-Jun 00:55 heem-dir: Begin pruning Jobs. 28-Jun 00:55 heem-dir: No Jobs found to prune. 28-Jun 00:55 heem-dir: Begin pruning Files. 28-Jun 00:55 heem-dir: No Files found to prune. 28-Jun 00:55 heem-dir: End auto prune. 28-Jun 00:55 heem-dir: Start Backup JobId 156, Job2_.2006-06-27_23.00.00 28-Jun 00:55 heem-sd: 3301 Issuing autochanger loaded drive 0 command. 28-Jun 00:55 heem-sd: 3991 Bad autochanger loaded drive 0 command: ERR=Operation not permitted. 28-Jun 00:55 heem-sd: 3301 Issuing autochanger loaded drive 0 command. 28-Jun 00:55 heem-sd: 3302 Autochanger loaded drive 0, result is Slot 3. 28-Jun 00:55 heem-sd: 3307 Issuing autochanger unload slot 3, drive 0 command. 28-Jun 00:55 heem-sd: 3304 Issuing autochanger load slot 3, drive 0 command. 28-Jun 00:56 heem-sd: 3305 Autochanger load slot 3, drive 0, status is OK. 28-Jun 00:56 heem-sd: 3301 Issuing autochanger loaded drive 0 command. 28-Jun 00:56 heem-sd: 3302 Autochanger loaded drive 0, result is Slot 3. 28-Jun 00:56 heem-sd: Volume TUE001L2 previously written, moving to end of data. 28-Jun 00:56 heem-sd: Ready to append to end of Volume TUE001L2 at file=124. 28-Jun 01:52 heem-dir: Bacula 1.38.9 (02May06): 28-Jun-2006 01:52:29 JobId: 156 Job: Job2_.2006-06-27_23.00.00 Backup Level: Full Client: Client2 i686-pc-linux-gnu,suse,9.0 FileSet: FileSet2 2006-06-15 00:45:31 Pool: Daily Storage: TapeLibrary Scheduled time: 27-Jun-2006 23:00:00 Start time: 28-Jun-2006 00:55:08 End time: 28-Jun-2006 01:52:29 Elapsed time: 57 mins 21 secs Priority: 10 FD Files Written: 110,844 SD Files Written: 110,844 FD Bytes Written: 28,747,087,799 (28.74 GB) SD Bytes Written: 28,768,112,206 (28.76 GB) Rate: 8354.3 KB/s Software Compression: None Volume name(s): TUE001L2 Volume Session Id: 2 Volume Session Time: 1151424310 Last Volume Bytes: 151,839,027,155 (151.8 GB) Non-fatal FD errors: 0 SD Errors: 0 FD termination status: OK SD termination status: OK Termination: Backup OK In the 3rd job, bacula doesen't repeat the loaded drive 0 command even though it gets an ERR But this time at least it doesn't unload/load the tape but goes on immediately. (see below) I don't understand this behaviour. 28-Jun 01:52 heem-dir: Start Backup JobId 157, Job3.2006-06-28_00.10.01 28-Jun 01:52 heem-sd: 3301 Issuing autochanger loaded drive 0 command. 28-Jun 01:52 heem-sd: 3302 Autochanger loaded drive 0, result is Slot 3. 28-Jun 01:52 heem-sd: 3301 Issuing autochanger loaded drive 0 command. 28-Jun 01:52 heem-sd: 3991 Bad autochanger loaded drive 0 command: ERR=Operation not permitted. 28-Jun 01:52 heem-sd: Volume TUE001L2 previously written, moving to end of data. 28-Jun 01:53 heem-sd: Ready to append to end of Volume TUE001L2 at file=153. 28-Jun 02:32 heem-dir: Bacula 1.38.9 (02May06): 28-Jun-2006 02:32:50 . Thanks in advance! Best Regards, i. A. Christoff Buch = [EMAIL PROTECTED] +49.941.78004.607 [EMAIL PROTECTED] OneVision Software AG