On Tue, Jan 20, 2009 at 11:03:51PM +0200, Pasi Kärkkäinen wrote:
> > > > > > > Anyway, I'll upgrade Bacula now and we'll see if these Volume data
> > > > > > > errors disappear for copy jobs.
> > > > > >
> > > > > > OK
> > > > >
> > > > > Now running SVN revision 8381 (2.5.29).
> > > > >
> > > > > Unfortunately I still see these errors..
> > > > >
> > > > > First it was all OK without errors for some hours, but then:
> > > > >
> > > > > Ready to read from volume "Pool2-Vol-0104" on device "FSDevice2"
> > > > > (/mnt/backup1/pool02). Forward spacing Volume "Pool2-Vol-0104" to
> > > > > file:block 0:218.
> > > > > Error: block.c:1098 Volume data error at 0:3599769803! Short block of
> > > > > 7988 bytes on device "FSDevice2" (/mnt/backup1/pool02) discarded.
> > > > > Error: read_record.c:148 block.c:1098 Volume data error at
> > > > > 0:3599769803! Short block of 7988 bytes on device "FSDevice2"
> > > > > (/mnt/backup1/pool02) discarded. End of file 0 on device "FSDevice2"
> > > > > (/mnt/backup1/pool02), Volume "Pool2-Vol-0104"
> > > > >
> > > > > .. And then it continues OK with the next file volume, and then again
> > > > > similar errors for the next file volume:
> > > > >
> > > > > Ready to read from volume "Pool2-Vol-0117" on device "FSDevice2"
> > > > > (/mnt/backup1/pool02). Forward spacing Volume "Pool2-Vol-0117" to
> > > > > file:block 0:218.
> > > > > Error: block.c:1098 Volume data error at 1:2863735978! Short block of
> > > > > 27477 bytes on device "FSDevice2" (/mnt/backup1/pool02) discarded.
> > > > > Error: read_record.c:148 block.c:1098 Volume data error at
> > > > > 1:2863735978! Short block of 27477 bytes on device "FSDevice2"
> > > > > (/mnt/backup1/pool02) discarded. End of file 1 on device "FSDevice2"
> > > > > (/mnt/backup1/pool02), Volume "Pool2-Vol-0117"
> > > > >
> > > > > I don't see any errors in kernel dmesg and/or syslog.
> > > > >
> > > > > Any suggestions?
> > > >
> > > > Were the files that are being read written with Bacula version 2.5.29?
> > >
> > > There's a big chance those files were created using the older Bacula 2.5
> > > version. I'll have to check that..
> >
> > Yes, that would be the first thing to check. I don't expect any problems in
> > the database or with older backups, but it is worth confirming -- it was
> > just the copying process (actually the seeking) where we had a problem.
> >
>
> I verified this, and yes, the job being copied and giving those errors was
> made with earlier Bacula version.
>
> I'll monitor this and let's see how it goes..
>

Update on this..

I've been monitoring the situation, and I've still gotten some errors..

Checking the logs, I've noticed _all_ of those errors have been with disk
volumes that were created with Bacula 2.5.20 and are now being copied with
Bacula 2.5.29.

When I'm copying disk volumes with Bacula 2.5.29 that were also created with
Bacula 2.5.29, I don't see any errors..

> > >
> > > > What kind of device is /mnt/backup1/poolnn?
> > > >
> > > > If it is some sort of network mount, then you probably have a bad
> > > > driver, bad network, or something wrong on the other end, and you
> > > > should try running using local disk.
> > >
> > > Hmm.. it's iSCSI volume. I assume I'd had SCSI errors in the logs, or
> > > filesystem errors if that was the reason.. everything is possible, of
> > > course..
> >
> > If it is a driver bug on either side, it might not necessarily show up in
> > the logs. With iSCSI, there is a lot of software, and thus the possibility
> > for lots of problems, especially since it is not 20 year old technology.
> > Depending on what OS you are using, there may be some problem with the way
> > Bacula does seeking on hard disk.
> >
> > I would still recommend try writing to a locally mounted disk. If the
> > problem still occurs with locally mounted disk, then it will point strongly
> > toward Bacula.
>
> Yep. I'll see if I can hook up local storage to the server.
>
> I'll try some restores from tapes (from copied jobs) to see how it goes.. is
> there some tool in Bacula to verify the job is consistent/ok on tape?
>
> Or compare original and copied job?
>

So it looks like the storage is not the problem here..

-- Pasi
_______________________________________________
Bacula-devel mailing list
https://lists.sourceforge.net/lists/listinfo/bacula-devel
