Hello,
I'm running some backups and I'm not quite sure what to make of the
reports:
10-Jul 23:26 dio-sd JobId 438: Alert: smartctl 5.39.1 2010-01-28 r3054 [x86_64-redhat-linux-gnu] (local build)
10-Jul 23:26 dio-sd JobId 438: Alert: Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
10-Jul 23:26 dio-sd JobId 438: Alert:
10-Jul 23:26 dio-sd JobId 438: Alert: TapeAlert: OK
10-Jul 23:26 dio-sd JobId 438: Alert:
10-Jul 23:26 dio-sd JobId 438: Alert: Error Counter logging not supported
10-Jul 23:26 dio-sd JobId 438: Alert:
10-Jul 23:26 dio-sd JobId 438: Alert: Last n error events log page
10-Jul 23:26 dio-sd JobId 438: Alert: Error event 29556:
10-Jul 23:26 dio-sd JobId 438: Alert: [binary]:
10-Jul 23:26 dio-sd JobId 438: Alert: 00 63 63 75 72 72 65 6e 63 65 2f 6c 69 73 74 3f 6d
10-Jul 23:26 dio-sd JobId 438: Alert: 10 6f 64 65 3d 72 61 77 26 66 6f 72 6d 61 74 3d 64
10-Jul 23:26 dio-sd JobId 438: Alert: 20 61 72 77 69 6e 26 63 6f 6f 72 64 69 6e 61 74 65
10-Jul 23:26 dio-sd JobId 438: Alert: 30 73 74 61 74 75 73 3d 74 72 75 65 26 68 6f 73 74
10-Jul 23:26 dio-sd JobId 438: Alert: 40 69 73 6f 63 6f 75 6e 74 72 79 63 6f 64 65 3d 4d
10-Jul 23:26 dio-sd JobId 438: Alert: 50 58 26 73 63 69 65 6e 74 00 00 00 00 00 00 00 00
10-Jul 23:26 dio-sd JobId 438: Alert: 60 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
........
SD Errors: 0
FD termination status: OK
SD termination status: OK
Termination: Backup OK
I'm using smartctl as suggested (Alert Command = "sh -c 'smartctl -H -l
error %c'").
- some backups give errors like the above, some don't; unrelated to
which server is backed up
- some restore jobs give errors like the above, some don't; I could not
yet establish a correlation between restore jobs and errors (i.e. whether
restoring one specific backup job always gives the same errors, or none
at all)
- when they occur, the errors are not identical; a specific job run
repeatedly will give different errors or even none at all
- the errors occur regardless of how the jobs are run (concurrently,
independently, with or without disk spooling) and regardless of the
physical tape cartridge (tried with different tapes)
- the backup job statuses reported by Bacula are all OK in spite of the
smartctl errors
- the restore of jobs backed up with the smartctl errors (sometimes they
have their own smartctl errors) always ends with Bacula status OK
The Bacula *btape* tape and changer tests were all successful.
I've run the following test: I restored from a Bacula job that had
smartctl errors (the restore job had smartctl errors of its own), then
computed the MD5 sums of the restored files and compared them with the
sums of the files on the server that was being backed up. I've done this
for 4 different jobs from 4 separate servers (~800,000 files and 40 GB in
total), and the only MD5 differences were in a handful of files that had
very good reasons to differ (logs, bash history and such). So there
seem to be no errors in the restored files.
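
A check of this kind can be sketched in Python as follows. This is only a
minimal illustration, not necessarily the commands actually used for the
test above; the function names and the two directory arguments are
hypothetical, and symlinks and special files are simply skipped.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of one file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def md5_tree(root):
    """Map each regular file under root (by relative path) to its MD5 sum."""
    sums = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.isfile(full) and not os.path.islink(full):
                sums[os.path.relpath(full, root)] = md5_of_file(full)
    return sums


def compare_trees(original_root, restored_root):
    """Return (changed, missing, extra) lists of relative paths."""
    original = md5_tree(original_root)
    restored = md5_tree(restored_root)
    common = original.keys() & restored.keys()
    changed = sorted(p for p in common if original[p] != restored[p])
    missing = sorted(original.keys() - restored.keys())  # not restored
    extra = sorted(restored.keys() - original.keys())    # only in restore
    return changed, missing, extra
```

For example, `compare_trees("/srv/data", "/tmp/bacula-restores/srv/data")`
(both paths made up) would return the files whose sums differ, plus any
files present on only one side. Files that change between backup and
check, such as logs, are expected to show up in the first list.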
So far I have been unable to match the errors with any codes in the
tape library documentation.
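
One thing that can be checked directly is what the bytes of the
"Error event 29556" dump in the report contain. The sketch below decodes
them as ASCII; the rows are copied by hand from the report (first field =
offset, rest = data bytes), and the name decode_dump is only illustrative.

```python
# Rows of the "Error event 29556" dump, copied from the job report.
# The first field of each row is the offset; the rest are data bytes.
DUMP_ROWS = [
    "00 63 63 75 72 72 65 6e 63 65 2f 6c 69 73 74 3f 6d",
    "10 6f 64 65 3d 72 61 77 26 66 6f 72 6d 61 74 3d 64",
    "20 61 72 77 69 6e 26 63 6f 6f 72 64 69 6e 61 74 65",
    "30 73 74 61 74 75 73 3d 74 72 75 65 26 68 6f 73 74",
    "40 69 73 6f 63 6f 75 6e 74 72 79 63 6f 64 65 3d 4d",
    "50 58 26 73 63 69 65 6e 74 00 00 00 00 00 00 00 00",
]


def decode_dump(rows):
    """Drop the offset column and render the data bytes as ASCII text."""
    data = bytearray()
    for row in rows:
        fields = row.split()
        # fields[0] is the offset; the remaining fields are hex bytes.
        data.extend(int(field, 16) for field in fields[1:])
    # Show non-printable bytes (such as the NUL padding) as '.'.
    return "".join(chr(b) if 32 <= b < 127 else "." for b in data)


print(decode_dump(DUMP_ROWS))
```

Run as written, this prints
`ccurrence/list?mode=raw&format=darwin&coordinatestatus=true&hostisocountrycode=MX&scient........`,
which looks like a fragment of a URL query string rather than a
documented error code. That may be a hint that the log page is returning
leftover buffer contents, though that is only a guess.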
The TL2000 tape library is brand new.
I'm wondering if this might be a false positive?!
I'm using:
Fedora 13 (x86_64)
Bacula 5.0.2-5 (RPM packages coming with FC13 updates)
PostgreSQL 8.4.4-1 (RPMs same as above)
Dell PowerVault TL2000 (LTO 4)
(the drive is IBM ULT3580-TD4 with the A232 firmware)
IBM LTO 4 tapes
Thank you!
Best regards,
Andrei Cenja