I'm not entirely sure if its a Bacula specific issue, but I could use some
advice (as usual)
I'm running Bacula 2.4.2 on Solaris 10_x68 with and Exabyte LTO-3 Tape
libary. What is happening is that at some point an error occurs to the
device and then Bacula can no longer load or even determine the correct tape
to load for subsequent jobs. My only solution has been to stop Bacula and
remove the bacula-dir.state file. If I don't do this, no matter how many
times I update slots I can't get Bacula to locate the right slots.
This is the log output
2008-10-16 22:00:00mt-back4.director JobId 14221: Start Backup JobId 14221,
Job=NAS_Weekly_Tape.2008-10-16_22.
00.40
2008-10-16 22:00:00mt-back4.storage JobId 14221: 3307 Issuing autochanger
"unload slot 1, drive 0" command.
2008-10-16 22:01:45mt-back4.director JobId 14221: Using Device "Ultrium-TD3"
2008-10-16 22:01:45mt-back4.storage JobId 14221: 3301 Issuing autochanger
"loaded? drive 0" command.
2008-10-16 22:01:47mt-back4.storage JobId 14221: 3302 Autochanger "loaded?
drive 0", result: nothing loaded.
2008-10-16 22:01:47mt-back4.storage JobId 14221: 3304 Issuing autochanger
"load slot 7, drive 0" command.
2008-10-16 22:04:06mt-back4.storage JobId 14221: 3305 Autochanger "load slot
7, drive 0", status is OK.
2008-10-16 22:04:06mt-back4.storage JobId 14221: Volume "A00043" previously
written, moving to end of data.
2008-10-16 22:05:32mt-back4.storage JobId 14221: Ready to append to end of
Volume "A00043" at file=663.
2008-10-17 00:16:36mt-back4.storage JobId 14221: End of Volume "A00043" at
1048:9866 on device "Ultrium-TD3" (/dev/rmt
/0cbn). Write of 64512 bytes got 0.
2008-10-17 00:16:39mt-back4.storage JobId 14221: Re-read of last block
succeeded.
2008-10-17 00:16:39mt-back4.storage JobId 14221: End of medium on Volume
"A00043" Bytes=1,047,677,137,920 Blocks=16,24
0,034 at 17-Oct-2008 00:16.
2008-10-17 00:16:39mt-back4.storage JobId 14221: 3307 Issuing autochanger
"unload slot 7, drive 0" command.
2008-10-17 00:18:17mt-back4.director JobId 14221: There are no more Jobs
associated with Volume "A00046". Marking it p
urged.
2008-10-17 00:18:17mt-back4.director JobId 14221: All records pruned from
Volume "A00046"; marking it "Purged"
2008-10-17 00:18:17mt-back4.director JobId 14221: Recycled volume "A00046"
2008-10-17 00:18:17mt-back4.storage JobId 14221: 3301 Issuing autochanger
"loaded? drive 0" command.
2008-10-17 00:18:20mt-back4.storage JobId 14221: 3302 Autochanger "loaded?
drive 0", result: nothing loaded.
2008-10-17 00:18:20mt-back4.storage JobId 14221: 3304 Issuing autochanger
"load slot 10, drive 0" command.
2008-10-17 00:20:39mt-back4.storage JobId 14221: 3305 Autochanger "load slot
10, drive 0", status is OK.
2008-10-17 00:20:40mt-back4.storage JobId 14221: Recycled volume "A00046" on
device "Ultrium-TD3" (/dev/rmt/0cbn), all
previous data lost.
2008-10-17 00:20:40mt-back4.storage JobId 14221: New volume "A00046" mounted
on device "Ultrium-TD3" (/dev/rmt/0cbn) a
t 17-Oct-2008 00:20.
2008-10-17 05:42:21mt-back4.storage JobId 14221: End of Volume "A00046" at
727:0 on device "Ultrium-TD3" (/dev/rmt/0cb
n). Write of 64512 bytes got 0.
2008-10-17 05:42:25mt-back4.storage JobId 14221: Error: Backspace record at
EOT failed. ERR=I/O error
2008-10-17 05:42:25mt-back4.storage JobId 14221: End of medium on Volume
"A00046" Bytes=726,953,472,000 Blocks=11,268,
499 at 17-Oct-2008 05:42.
2008-10-17 05:42:25mt-back4.storage JobId 14221: 3307 Issuing autochanger
"unload slot 10, drive 0" command.
2008-10-17 05:44:08mt-back4.director JobId 14221: There are no more Jobs
associated with Volume "B00043". Marking it p
urged.
2008-10-17 05:44:09mt-back4.director JobId 14221: All records pruned from
Volume "B00043"; marking it "Purged"
2008-10-17 05:44:09mt-back4.director JobId 14221: Recycled volume "B00043"
2008-10-17 05:44:09mt-back4.storage JobId 14221: 3301 Issuing autochanger
"loaded? drive 0" command.
2008-10-17 05:44:11mt-back4.storage JobId 14221: 3302 Autochanger "loaded?
drive 0", result: nothing loaded.
2008-10-17 05:44:11mt-back4.storage JobId 14221: 3304 Issuing autochanger
"load slot 19, drive 0" command.
2008-10-17 05:46:35mt-back4.storage JobId 14221: 3305 Autochanger "load slot
19, drive 0", status is OK.
2008-10-17 05:46:35mt-back4.storage JobId 14221: Recycled volume "B00043" on
device "Ultrium-TD3" (/dev/rmt/0cbn), all
previous data lost.
2008-10-17 05:46:35mt-back4.storage JobId 14221: New volume "B00043" mounted
on device "Ultrium-TD3" (/dev/rmt/0cbn) a
t 17-Oct-2008 05:46.
2008-10-17 10:06:31mt-back4.storage JobId 14221: End of Volume "B00043" at
928:8964 on device "Ultrium-TD3" (/dev/rmt/
0cbn). Write of 64512 bytes got 0.
2008-10-17 10:06:34mt-back4.storage JobId 14221: Re-read of last block
succeeded.
2008-10-17 10:06:34mt-back4.storage JobId 14221: End of medium on Volume
"B00043" Bytes=928,518,893,568 Blocks=14,392,
963 at 17-Oct-2008 10:06.
2008-10-17 10:06:34mt-back4.storage JobId 14221: 3307 Issuing autochanger
"unload slot 19, drive 0" command.
2008-10-17 10:12:06mt-back4.storage JobId 14221: 3995 Bad autochanger
"unload slot 19, drive 0": ERR=Child exited with
code 1
Results=Unloading Data Transfer Element into Storage Element 19...mtx:
Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=0 (Unknown?!)
mtx: Request Sense: Sense Key=No Sense
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 00
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
MOVE MEDIUM from Element Address 81 to 19 Failed
2008-10-17 10:12:06mt-back4.director JobId 14221: There are no more Jobs
associated with Volume "B00044". Marking it p
urged.
2008-10-17 10:12:06mt-back4.director JobId 14221: All records pruned from
Volume "B00044"; marking it "Purged"
2008-10-17 10:12:06mt-back4.director JobId 14221: Recycled volume "B00044"
2008-10-17 10:12:06mt-back4.storage JobId 14221: 3301 Issuing autochanger
"loaded? drive 0" command.
2008-10-17 10:12:12mt-back4.storage JobId 14221: 3991 Bad autochanger
"loaded? drive 0" command: ERR=Child exited with
code 1.
Results=mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=70 (Current)
mtx: Request Sense: Sense Key=Aborted Command
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 29
mtx: Request Sense: Additional Sense Qualifier = 07
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
INQUIRY command Failed
2008-10-17 10:12:12mt-back4.storage JobId 14221: 3301 Issuing autochanger
"loaded? drive 0" command.
2008-10-17 10:12:13mt-back4.storage JobId 14221: 3991 Bad autochanger
"loaded? drive 0" command: ERR=Child exited with
code 1.
Results=mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=70 (Current)
mtx: Request Sense: Sense Key=Not Ready
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 3E
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
INQUIRY command Failed
2008-10-17 10:12:13mt-back4.storage JobId 14221: 3304 Issuing autochanger
"load slot 18, drive 0" command.
2008-10-17 10:16:20mt-back4.storage JobId 14221: Fatal error: 3992 Bad
autochanger "load slot 18, drive 0": ERR=Child
exited with code 1.
Results=mtx: Request Sense: Long Report=yes
mtx: Request Sense: Valid Residual=no
mtx: Request Sense: Error Code=70 (Current)
mtx: Request Sense: Sense Key=Not Ready
mtx: Request Sense: FileMark=no
mtx: Request Sense: EOM=no
mtx: Request Sense: ILI=no
mtx: Request Sense: Additional Sense Code = 3E
mtx: Request Sense: Additional Sense Qualifier = 00
mtx: Request Sense: BPV=no
mtx: Request Sense: Error in CDB=no
mtx: Request Sense: SKSV=no
INQUIRY command Failed
2008-10-17 10:16:20mt-back4.storage JobId 14221: Job write elapsed time =
12:02:37, Transfer rate = 47.04 M bytes/seco
nd
2008-10-17 10:16:20mt-back4.mentora JobId 14221: Fatal error: backup.c:892
Network send error to SD. ERR=Broken pipe
2008-10-17 10:16:20mt-back4.mentora JobId 14221: Error: bsock.c:306 Write
error sending 65536 bytes to Storage daemon:
mt-back4.mentora.:9103: ERR=Broken pipe
2008-10-17 10:16:32mt-back4.director JobId 14221: Error: Bacula
mt-back4.director 2.4.2 (26Jul08): 17-Oct-2008 10:16:3
2
Build OS: i386-pc-solaris2.10 solaris 5.10
JobId: 14221
Job: NAS_Weekly_Tape.2008-10-16_22.00.40
Backup Level: Full
Client: "mt-back4" 2.4.2 (26Jul08)
i386-pc-solaris2.10,solaris,5.10
FileSet: "NAS_Files" 2008-06-12 16:00:56
Pool: "Tapes" (From Job resource)
Storage: "Exabyte_224" (From Pool resource)
Scheduled time: 16-Oct-2008 22:00:00
Start time: 16-Oct-2008 22:01:45
End time: 17-Oct-2008 10:16:32
Elapsed time: 12 hours 14 mins 47 secs
Priority: 10
FD Files Written: 851,657
SD Files Written: 851,657
FD Bytes Written: 2,039,356,838,887 (2.039 TB)
SD Bytes Written: 2,039,544,213,715 (2.039 TB)
Rate: 46257.6 KB/s
Software Compression: None
VSS: no
Storage Encryption: no
Volume name(s): A00043|A00046|B00043
Volume Session Id: 203
Volume Session Time: 1223923211
Last Volume Bytes: 1 (1 B)
Non-fatal FD errors: 1
SD Errors: 0
FD termination status: Error
SD termination status: Error
Termination: *** Backup Error ***
Once this happens, other jobs begin having tape failures. This one in
particular is strange to me because the tape that Bacula marked as "Recycle"
should have then been made appendable.
2008-10-17 10:16:34mt-back4.director JobId 14261: Start Backup JobId 14261,
Job=OracleRMAN_Weekly_Tape.2008-10-17_
10.00.20
2008-10-17 10:16:35mt-back4.director JobId 14261: There are no more Jobs
associated with Volume "B00039". Marking it p
urged.
2008-10-17 10:16:35mt-back4.director JobId 14261: All records pruned from
Volume "B00039"; marking it "Purged"
2008-10-17 10:16:35mt-back4.director JobId 14261: Recycled volume "B00039"
2008-10-17 10:16:35mt-back4.director JobId 14261: Using Device "Ultrium-TD3"
2008-10-17 10:16:38mt-back4.storage JobId 14261: Job
OracleRMAN_Weekly_Tape.2008-10-17_10.00.20 waiting. Cannot fi
nd any appendable volumes.
Please use the "label" command to create a new Volume for:
Storage: "Ultrium-TD3" (/dev/rmt/0cbn)
Pool: TSN_Tapes
Media type: LTO-3
2008-10-17 10:32:51mt-back4.storage JobId 14261: Job
OracleRMAN_Weekly_Tape.2008-10-17_10.00.20 marked to be cance
led.
2008-10-17 10:32:51mt-back4.storage JobId 14261: Job
OracleRMAN_Weekly_Tape.2008-10-17_10.00.20 canceled while wai
ting for mount on Storage Device ""Ultrium-TD3" (/dev/rmt/0cbn)".
2008-10-17 10:32:51mt-back4.director JobId 14261: Bacula mt-back4.director
2.4.2 (26Jul08): 17-Oct-2008 10:32:51
Build OS: i386-pc-solaris2.10 solaris 5.10
JobId: 14261
Job: OracleRMAN_Weekly_Tape.2008-10-17_10.00.20
Backup Level: Full
Client: "adm9" 2.2.8 (26Jan08)
i386-pc-solaris2.8,solaris,5.8
FileSet: "Oracle_RMAN" 2008-03-28 07:20:53
Pool: "Tapes" (From Job resource)
Storage: "Exabyte_224" (From Pool resource)
Scheduled time: 17-Oct-2008 10:00:00
Start time: 17-Oct-2008 10:16:38
End time: 17-Oct-2008 10:32:51
Elapsed time: 16 mins 13 secs
Priority: 10
FD Files Written: 0
SD Files Written: 0
FD Bytes Written: 0 (0 B)
SD Bytes Written: 0 (0 B)
Rate: 0.0 KB/s
Software Compression: None
VSS: no
Storage Encryption: no
Volume name(s):
Volume Session Id: 243
Volume Session Time: 1223923211
Last Volume Bytes: 1 (1 B)
Non-fatal FD errors: 0
SD Errors: 0
FD termination status: Canceled
SD termination status: Canceled
Termination: Backup Canceled
This is where I cancel the scheduled tape jobs. Stop bacula. Remove the
bacula-dir.state file. Start the backups again and everything runs smooth.
I'm just not sure why this happens so frequently.
Thanks,
Shon
-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users