On 02/04/14 18:07, Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC] wrote:
> Alan,
>
> In my case it looks like the autochanger is unable to get the tape out of its
> slot. I have already opened a case with IBM and they suggested a Firmware
> upgrade which I did last week. This problem has not occurred since the
> firmware upgrade but I have to wait longer for this conclusion. I was looking
> into adding "Maximum Changer Wait directive 3d" so that if the tape gets
> stuck in the slot during the weekend the job will not fail and I will have
> some time to manually mount the needed tape from the command line. I would
> also be interested in the script that you have mentioned to unlock the drives.
>
What does "mt -f /dev/{drive} unlock" return?

What about "mt -f /dev/{drive} eject"?

Can you look inside the changer to see if the tape ejects?

If these don't work then the problem is most likely to be a locking issue -
SCSI locks set via one interface have to be unset on the same interface.

The quick'n'dirty solution is to power cycle the autochanger, but it's far
better to identify if the same tape drive is being seen multiple times.

The attached (ugly) script will do the trick, but first:

mkdir -p /etc/bacula/DEVICES/
ln -s /dev/tape/by-id/*-nst /etc/bacula/DEVICES/

If you can ID which drive is in which position on which changer, then

ln -s /dev/tape/by-id/{wwid}-nst /etc/bacula/DEVICES/CHANGER_{foo}-DRIVE-{bar}

this will give stable anchor points for bacula to use instead of having to
reconfigure bacula-sd and bacula-dir every time a drive is changed out or
the fabric changes.

Note that it is _extremely dangerous_ to refer to /dev/nst* (or /dev/st*) in
a fabric environment as they tend to jump around at the slightest
provocation, resulting in attempts to load/unload the wrong drive (or
worse, write to the wrong one).

=====
$ cat /usr/local/bin/unlocktapedrive.sh
#!/bin/bash
# On a Fibre Channel host with 2 connections to the fabric,
# a tape drive will show up twice.
#
# The same thing will occur if the drive has 2 fabric connections
# and if both ends are multiply connected then the numbers end up
# multiplied (dual + dual = 4 instances seen)
#
# This script assumes only 2 instances and will need modifying for more.
#
# It's normal for udev to swap /dev/tape/by-id around from time to time,
# which is fine for standard operations, BUT:
#
# Tape drive door locks are set per-initiator and ORed.
# Therefore a drive might be locked by one initiator
# and unlocked from the other. That doesn't work as locks
# set by one initiator have to be released by the same initiator.
#
# This script gets BOTH /dev/nst devices for each
# /etc/bacula/DEVICES/{drive} (symlinks to /dev/tape/by-id/)
# and sends unlocks to them, to be on the safe side.
#
# input is assumed to be /etc/bacula/DEVICES/drive

#echo 0 $0 1 $1 2 $2 3 $3

if test -z "$1"
then
  echo Argument: /etc/bacula/DEVICES/drive
  exit 1
fi

if ! test -L "$1"
then
  echo Argument: /etc/bacula/DEVICES/drive
  exit 1
fi

# GET THE /dev/tape/by-id for this device
# (cut leaves a leading space in the value, so $INDIRECT is deliberately
# used unquoted below and word splitting strips it)
export INDIRECT=`/bin/ls -l "$1" | /bin/cut -f2 -d\>`

if ! test -L $INDIRECT
then
  echo I have lost that drive! Consider running udevtrigger to recover it.
  exit 1
fi
#echo $INDIRECT

# get the /dev/nst being used.
export DEVICE=`/bin/ls -l $INDIRECT | /bin/cut -f2 -d\> | /bin/cut -f3 -d/`
#echo $DEVICE

# get the device's /dev/tape/by-path entry
# (held in FCPATH, not PATH, so the shell's command search path is not
# overwritten)
export FCPATH=`/bin/ls -l /dev/tape/by-path | /bin/grep $DEVICE$ | /bin/cut -f4 -d-`
#echo $FCPATH

# get the OTHER device
export DEVICEGHOST=`/bin/ls -l /dev/tape/by-path/*$FCPATH*-nst | /bin/grep -v $DEVICE$ | /bin/cut -f2 -d\> | /bin/cut -f3 -d/`
# echo $DEVICEGHOST

echo $1 $INDIRECT $DEVICE $FCPATH $DEVICEGHOST
#echo $DEVICE $DEVICEGHOST

#
export STATUS=`/bin/mt -f /dev/$DEVICE status | /bin/grep OPEN`
if test -n "$STATUS"
then
  echo $DEVICE already unlocked
else
  /bin/mt -f /dev/$DEVICE unlock
  /bin/mt -f /dev/$DEVICEGHOST unlock
fi
#
exit 0
=================

This script is also useful. I use it as one of my RunAfterJobs.

============================
$ cat /usr/local/bin/gettapeinfo.sh
#!/bin/bash
# input is assumed to be /etc/bacula/DEVICES/{drive}

if test -z "$1"
then
  echo Argument: /etc/bacula/DEVICES/drive
  exit 1
fi

# GET which tape is in the drive
export DRIVE=`echo $1 | cut -f2 -d"-"`
export CHANGER=`echo $1 | cut -f1 -d"-"`
export CONTENT=`mtx -f $CHANGER-changer status | grep "Data Transfer Element "$DRIVE`

# GET THE /dev/tape/by-id for this device
export INDIRECT=`ls -l $1 | cut -f2 -d\>`

# get the /dev/nst being used.
export DEVICE=`ls -l $INDIRECT | cut -f2 -d\> | cut -f3 -d/`

# get the /dev/sg
export GENERIC=`ls /sys/class/scsi_tape/$DEVICE/device/scsi_generic | rev | cut -f1 -d" " | rev`

echo $1 $INDIRECT $DEVICE $GENERIC $CHANGER > /tmp/tapeinfo.log.$$
echo $1 $CONTENT >> /tmp/tapeinfo.log.$$
smartctl -T permissive -H -d scsi -a -l error /dev/$GENERIC >> /tmp/tapeinfo.log.$$
tapeinfo -f /dev/$GENERIC >> /tmp/tapeinfo.log.$$
smartctl -T permissive -H -d scsi -a -l error /dev/$GENERIC >> /tmp/tapeinfo.log.$$
tapeinfo -f /dev/$GENERIC >> /tmp/tapeinfo.log.$$

export TAPEALERT=`grep "TapeAlert Error" /tmp/tapeinfo.log.$$ | cut -c1-5`

# quoted, because "test -n" with an empty unquoted variable is always true
if test -n "$TAPEALERT"
then
  # this section is shamelessly ripped from http://wiki.bacula.org/doku.php?id=tapealert
  export BACULA_ETC="/opt/bacula/etc/bacula-dir.conf.d"
  export BACULA_DIR_CONF="messages"
  export MAIL_BIN="/usr/bin/mail"
  export SUBJECT="Bacula tapedrive TapeInfo alert"
  export TAPEINFO_LOG="/tmp/tapeinfo.log.$$"

  # --- get email-address of Bacula's Tape-Operator & System Administrator ---
  ## AJB2 ours is in /opt/bacula/etc/bacula-dir.conf.d/messages thanks to
  ## include statements - hence the "odd" locations.
  export BACDIRCONF=$BACULA_ETC"/"$BACULA_DIR_CONF

  ## get the first email-address of the Bacula Tape Operator(s)
  export bo=`cat $BACDIRCONF |sed -e 's/^[ \t]*//' | grep -w ^operator |cut -d"=" -f2 | head -1`
  export bo=${bo//[[:space:]]}

  ## get the first mail-address of the Bacula System Administrator(s)
  export bs=`cat $BACDIRCONF |sed -e 's/^[ \t]*//' | grep -w ^mail |cut -d"=" -f2 | head -1`
  export bs=${bs//[[:space:]]}

  #echo "Email-address of Bacula's Tape-operator : $bo"
  #echo "Email-address of Bacula's System Admin : $bs"

  # sanity check for email-addresses:
  if [ ! -n "$bo" ]; then
    echo "Could not retrieve an email-address for the Bacula Tape-Operator."
    exit 1;
  fi
  if [ ! -n "$bs" ]; then
    echo "Could not retrieve an email-address for the Bacula System Administrator."
    exit 1;
  fi

  # sanity check for existence of the logfile
  if [ ! -f $TAPEINFO_LOG ]; then
    echo "Could not find the logfile containing TapeInfo results."
    exit 1;
  fi

  # now actually do what is intended to be done!
  export TA_OK=`cat $TAPEINFO_LOG | grep "TapeAlert: OK" -c`
  if [ $TA_OK != "1" ]; then
    # hmm. TapeInfo is not OK. Send an email!
    $MAIL_BIN -s "$SUBJECT" -c "$bo" "$bs" < $TAPEINFO_LOG
  fi
fi

cat /tmp/tapeinfo.log.$$
rm /tmp/tapeinfo.log.$$
echo $1 $INDIRECT $DEVICE $GENERIC $CHANGER $CONTENT
=======================

> Thank you.
> Uthra
>
> -----Original Message-----
> From: Alan Brown [mailto:a...@mssl.ucl.ac.uk]
> Sent: Wednesday, April 02, 2014 12:47 PM
> To: "Ana Emília M. Arruda"; Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC]
> Cc: Alan Brown; bacula-users@lists.sourceforge.net
> Subject: Re: [Bacula-users] Maximum Changer Wait directive
>
> On 02/04/14 17:18, Ana Emília M. Arruda wrote:
>> Hi Uthra,
>>
>> I had this problem with my tape library. But it was just with one
>> specific slot. It had physical problems and frequently stuck the tape.
>
> I've run into jamming problems too (but that was on a Neo8000)
>
> Does the tape library have a front panel or webmin option to run a robot
> recalibration routine?
>
> The issue is trying to find whether the drive is still locked (if the server
> has multiple fibre connectors it's possible to lock the drive twice), or if
> the tape is mechanically jamming or if there's another factor at work.
>
> If it's a locking issue I have a script which can be tweaked to unlock all
> iterations of the same drive.
>
> If mechanical then a support call is probably required.
>
>
>> Regards,
>> Ana
>>
>>
>> On Tue, Apr 1, 2014 at 3:39 PM, Rao, Uthra R. (GSFC-672.0)[ADNET
>> SYSTEMS INC] <uthra.r....@nasa.gov> wrote:
>>
>> How is the library connected to the bacula server?
>> -- The Tape Library is connected to the server through fiber.
>>
>> What's the exact error message you get?
>> -- I get Media Error when this happens.
>>
>> What shows on the library front panel?
>> -- "Media attention!"
>>
>> Thanks,
>> Uthra
>>
>>
>> -----Original Message-----
>> From: Alan Brown [mailto:a...@mssl.ucl.ac.uk]
>> Sent: Tuesday, April 01, 2014 2:17 PM
>> To: Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC];
>> bacula-users@lists.sourceforge.net
>> Subject: Re: [Bacula-users] Maximum Changer Wait directive
>>
>> On 01/04/14 18:23, Rao, Uthra R. (GSFC-672.0)[ADNET SYSTEMS INC] wrote:
>> > I am running bacula 5.2.12 on RHEL6 O.S. with IBM TS3200 Tape Library
>> > with two LTO5 drives. Sometimes the autochanger is unable to mount a
>> > required tape as the tape gets stuck in its slot and needs manual
>> > intervention.
>>
>> "stuck" can mean many things.
>>
>> How is the library connected to the bacula server?
>>
>> What's the exact error message you get?
>> What shows on the library front panel?
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Bacula-users mailing list
>> Bacula-users@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
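
The unlock script in this thread states that it assumes only 2 instances of a drive and needs modifying for more. A rough sketch of one way that might be generalised follows. It is not part of the original scripts: the function names (fabric_id, drive_instances, unlock_all) and the BY_PATH_DIR variable are made up for illustration, and it assumes, as the original script does, that the 4th dash-separated field of a /dev/tape/by-path name identifies the physical drive. Check that against the by-path names on your own host before relying on it.

```bash
#!/bin/bash
# Sketch only: find every /dev/nst* node that is the same physical drive,
# however many fabric paths there are, and send an unlock to each one.
# BY_PATH_DIR and all function names here are hypothetical.

BY_PATH_DIR="${BY_PATH_DIR:-/dev/tape/by-path}"

# Print the 4th dash-separated field of a by-path link name.
# ASSUMPTION (taken from the original script): this field identifies
# the drive on the fabric.
fabric_id() {
    basename "$1" | cut -f4 -d-
}

# $1 is a kernel node name such as nst0.
# Prints the node name of every *-nst by-path link that shares the
# fabric id of the link pointing at $1 (including $1 itself).
drive_instances() {
    local node="$1" link id=""
    # first pass: find the fabric id of the link that points at $node
    for link in "$BY_PATH_DIR"/*-nst; do
        [ -L "$link" ] || continue
        if [ "$(basename "$(readlink "$link")")" = "$node" ]; then
            id=$(fabric_id "$link")
            break
        fi
    done
    [ -n "$id" ] || return 1
    # second pass: print every node whose link carries the same id
    for link in "$BY_PATH_DIR"/*-nst; do
        [ -L "$link" ] || continue
        if [ "$(fabric_id "$link")" = "$id" ]; then
            basename "$(readlink "$link")"
        fi
    done
}

# $1 is a symlink such as /etc/bacula/DEVICES/drive.
# Resolves it to the kernel node and unlocks every instance found.
unlock_all() {
    local node n
    node=$(basename "$(readlink -f "$1")")
    for n in $(drive_instances "$node"); do
        mt -f "/dev/$n" unlock
    done
}
```

After sourcing the file, it would be called the same way as the original script, e.g. "unlock_all /etc/bacula/DEVICES/drive". Unlike the original it does not check the drive status first, so it always sends the unlock to every instance it finds.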