Re: [Bacula-users] Block checksum mismatch on file storage

2014-07-04 Thread advantex
On Sat, 28 Jun 2014 09:30:12 +0200
Kern Sibbald k...@sibbald.com wrote:

 It is unlikely that this is a Bacula problem, especially considering
 your remark that you have
 used it for years and never had any problems.

Hi List,

first of all I have to say thanks for all the helpful replies.
I checked every disk twice, USB and SATA, but I couldn't find anything.
Reinstalled, ran I/O tests... nothing.
In the end I ran a memory test overnight and guess what: the §/$($§
machine has corrupted memory (a handful of bits out of 6 GB).
I don't know when I last had problems with memory, but I
guess it must be more than 10 years ago (or maybe I simply didn't
notice :-) ).

So I assume it will work again this weekend. And I guess I will have to
set up automated restore tests.
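
An automated restore test could look roughly like the sketch below. It
assumes the job has already been restored under a prefix directory (the
name /tmp/restore is only an example), so that an original directory
/data ends up as /tmp/restore/data. The function name check_restore is
made up for this illustration, and the comparison is only meaningful for
data that has not changed since the backup ran.

```shell
#!/bin/sh
# Sketch: compare an original directory with its restored copy.

# check_restore ORIG_DIR RESTORE_PREFIX
# ORIG_DIR must be an absolute path. The restored copy is expected at
# RESTORE_PREFIX followed by ORIG_DIR.
# Returns 0 if both trees are identical, 1 if diff finds differences,
# 2 if the restored directory does not exist.
check_restore() {
    orig_dir=$1
    restore_prefix=$2
    restored_dir=$restore_prefix$orig_dir
    if [ ! -d "$restored_dir" ]; then
        echo "missing restored directory: $restored_dir" >&2
        return 2
    fi
    # -r: recurse into subdirectories, -q: only report which files differ
    diff -rq "$orig_dir" "$restored_dir"
}

# Example call (paths are placeholders):
#   check_restore /data/tmp /tmp/restore || echo "restore test FAILED"
```

Running this from cron after a scheduled test restore would at least flag
a truncated or corrupted file soon after the backup, rather than at the
moment the data is really needed.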

Thanks a lot again
G.

--
Open source business process management suite built on Java and Eclipse
Turn processes into business applications with Bonita BPM Community Edition
Quickly connect people, data, and systems into organized workflows
Winner of BOSSIE, CODIE, OW2 and Gartner awards
http://p.sf.net/sfu/Bonitasoft
___
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users


Re: [Bacula-users] Block checksum mismatch on file storage

2014-07-01 Thread Kern Sibbald
On 06/30/2014 10:35 PM, John Stoffel wrote:
 Kern Yes, it is clear that one can do read-only tests that do not destroy
 Kern data.  However, in this case, it seems to me more useful to do
 Kern read/write (it is actually write/read) tests as it appears that the
 Kern problem is more likely in the write ... 

 Absolutely.  And hopefully, this way you don't corrupt the existing
 data on the disk, but you do force the disk to do a low level
 re-allocation of bad blocks and sectors.  But if you are seeing bad
 blocks on the disk, then it's time to start thinking about retiring
 it.  
Hmm. I learn something new every day; re-allocation of bad blocks.  That
sounds very interesting.  Thanks for the information, it could be very
useful for situations like this.

Best regards,
Kern


 Kern I have never heard of a non-destructive read/write test, which I assume
 Kern reads then rewrites the disk.  Although that is clever and could be
 Kern useful, in this case it sounds to me risky on a disk that seems to be
 Kern failing.

 Kern Best regards,
 Kern Kern

 Kern On 06/29/2014 09:04 PM, John Stoffel wrote:
 Kern 3. Run read/write disk tests on your USB disk (note: this will
 Kern destroy any existing data). 
 This isn't quite right.  You can run read-write tests on a quiescent
 filesystem (ie unmounted) without problems:

 badblocks -svn /dev/sd?

 will scan the entire disk using non-destructive read-write mode.  But
 as Kern said, check your logs as well.

 John





Re: [Bacula-users] Block checksum mismatch on file storage

2014-06-30 Thread Kern Sibbald
Hello,

Yes, it is clear that one can do read-only tests that do not destroy
data.  However, in this case, it seems to me more useful to do
read/write (it is actually write/read) tests as it appears that the
problem is more likely in the write ... 

I have never heard of a non-destructive read/write test, which I assume
reads then rewrites the disk.  Although that is clever and could be
useful, in this case it sounds to me risky on a disk that seems to be
failing.

Best regards,
Kern

On 06/29/2014 09:04 PM, John Stoffel wrote:
 Kern 3. Run read/write disk tests on your USB disk (note: this will
 Kern destroy any existing data). 

 This isn't quite right.  You can run read-write tests on a quiescent
 filesystem (ie unmounted) without problems:

   badblocks -svn /dev/sd?

 will scan the entire disk using non-destructive read-write mode.  But
 as Kern said, check your logs as well.

 John





Re: [Bacula-users] Block checksum mismatch on file storage

2014-06-30 Thread Josh Fisher
I have seen this before with both disk and tape media, where a backup 
job with no errors cannot later be restored due to i/o errors. The 
simple answer is that media can fail, even when offline, which is one of 
the reasons we make more than one backup.

It is possible, if cumbersome and expensive, to write to RAID-1 storage, 
which would practically eliminate this issue. If restore from a 
secondary backup is not acceptable for whatever reason, then more fault 
tolerant hardware is the only answer.

The alternative I would recommend, where restore from secondary backup 
is acceptable, is to set a volume size limit for disk volumes. Disk 
media usually fails in a small area of the media, meaning that if there 
are multiple volumes on the disk then only one (or a few) are likely to 
be affected. Huge volumes are at greater risk. Smaller volume size does 
not eliminate the problem, but mitigates the risk at the expense of a 
somewhat larger database size.
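
As a rough worked example of that trade-off, the sketch below computes
what percentage of the stored backup data sits in a single volume. It
assumes, purely for illustration, that the storage is full, that all
volumes have the same size, and that a localized media defect damages
exactly one volume. The function name pct_at_risk and the sizes are made
up. In Bacula the limit itself is normally set with the Pool directive
"Maximum Volume Bytes".

```shell
#!/bin/sh
# Sketch: share of backup data lost if exactly one volume is damaged.

# pct_at_risk TOTAL_GB VOLUME_GB
# Prints the percentage (integer, rounded down) of TOTAL_GB that is held
# in one volume of VOLUME_GB.
pct_at_risk() {
    total_gb=$1
    volume_gb=$2
    echo $(( 100 * volume_gb / total_gb ))
}

# One huge volume filling a 1000 GB disk:
#   pct_at_risk 1000 1000    # prints 100
# The same disk split into 50 GB volumes:
#   pct_at_risk 1000 50      # prints 5
```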

On 6/28/2014 3:30 AM, Kern Sibbald wrote:
 It is unlikely that this is a Bacula problem, especially considering
 your remark that you have
 used it for years and never had any problems.

 My best guess is that you have bad media or a bad medium or a bad
 connector.  When writing, unless the OS reports an error, Bacula assumes
 the write is good.  That is, it does not re-read the data.  If you want
 to verify then you must run a Bacula verify job after the backup job.

 I suspect that there is no difference between Bacula and rsync except
 that rsync is writing on a part of the media that is good and Bacula is
 writing elsewhere.

 There are several solutions (this is not exhaustive):  1. Get new
 media.  2. Use a more reliable form of backup device (USB is relatively
 unreliable compared to SATA, ...).  3. Run read/write disk tests on your
 USB disk (note: this will destroy any existing data). 4. Check your OS
 logs.  They may show low level errors that are not reported to Bacula.
 If you have such errors, you must eliminate them to have reliable
 backups (or said the other way around: reliable backups *never* generate
 any OS device errors).

 Best regards,
 Kern

 On 06/27/2014 04:36 PM, advan...@posteo.de wrote:
 Hi list,

 I have been using Bacula for years now and had no trouble so far.
 But now it has really hit me.
 It all worked smoothly... until the restore (on Ubuntu 12 LTS and Ubuntu
 14, Bacula version 5.2.6).
 The files were on a USB disk. To be on the safe side I recreated
 everything on a local SATA disk. Same result.
 I do tons of rsync on that disk with no problems, checked it with SMART,
 upgraded the system, and no change. If I run bacula-sd with -p the
 restore goes through, but the files are really corrupted.

 Luckily I have another backup. But this is really bad.
 How can I rely on Bacula's backups now? (Rsync, for example, tells me at
 once if a file is corrupt.) Do I really have to do a test restore
 for every job now?

 Could you give me a hint what might be the problem?
 Thanks
 G.





Re: [Bacula-users] Block checksum mismatch on file storage

2014-06-30 Thread John Stoffel

Kern Yes, it is clear that one can do read-only tests that do not destroy
Kern data.  However, in this case, it seems to me more useful to do
Kern read/write (it is actually write/read) tests as it appears that the
Kern problem is more likely in the write ... 

Absolutely.  And hopefully, this way you don't corrupt the existing
data on the disk, but you do force the disk to do a low level
re-allocation of bad blocks and sectors.  But if you are seeing bad
blocks on the disk, then it's time to start thinking about retiring
it.  

Kern I have never heard of a non-destructive read/write test, which I assume
Kern reads then rewrites the disk.  Although that is clever and could be
Kern useful, in this case it sounds to me risky on a disk that seems to be
Kern failing.

Kern Best regards,
Kern Kern

Kern On 06/29/2014 09:04 PM, John Stoffel wrote:
Kern 3. Run read/write disk tests on your USB disk (note: this will
Kern destroy any existing data). 
 
 This isn't quite right.  You can run read-write tests on a quiescent
 filesystem (ie unmounted) without problems:
 
 badblocks -svn /dev/sd?
 
 will scan the entire disk using non-destructive read-write mode.  But
 as Kern said, check your logs as well.
 
 John
 



Re: [Bacula-users] Block checksum mismatch on file storage

2014-06-29 Thread John Stoffel

Kern 3. Run read/write disk tests on your USB disk (note: this will
Kern destroy any existing data). 

This isn't quite right.  You can run read-write tests on a quiescent
filesystem (ie unmounted) without problems:

  badblocks -svn /dev/sd?

will scan the entire disk using non-destructive read-write mode.  But
as Kern said, check your logs as well.
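
A small guard around that command might look like the sketch below. It
assumes a Linux host where mounted filesystems are listed in
/proc/mounts. The function names is_mounted and safe_badblocks are made
up for this illustration, and the check is a simple prefix match, so it
will not notice devices that are in use through LVM or device-mapper.

```shell
#!/bin/sh
# Sketch: refuse to start a non-destructive badblocks scan on a device
# that still has mounted filesystems.

# is_mounted DEVICE [MOUNTS_FILE]
# Succeeds if DEVICE, or a partition whose name starts with DEVICE,
# appears as a mount source. MOUNTS_FILE defaults to /proc/mounts; the
# second parameter exists only so the check can be tried on a sample file.
is_mounted() {
    dev=$1
    mounts=${2:-/proc/mounts}
    awk -v dev="$dev" '
        index($1, dev) == 1 { found = 1 }
        END { exit !found }
    ' "$mounts"
}

# safe_badblocks DEVICE
safe_badblocks() {
    if is_mounted "$1"; then
        echo "refusing: $1 (or a partition on it) is mounted" >&2
        return 1
    fi
    # -s: show progress, -v: verbose, -n: non-destructive read-write mode
    badblocks -svn "$1"
}

# Example call (device name is a placeholder, needs root):
#   safe_badblocks /dev/sdX
```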

John



Re: [Bacula-users] Block checksum mismatch on file storage

2014-06-28 Thread Kern Sibbald
It is unlikely that this is a Bacula problem, especially considering
your remark that you have used it for years and never had any problems.

My best guess is that you have bad media or a bad medium or a bad
connector.  When writing, unless the OS reports an error, Bacula assumes
the write is good.  That is, it does not re-read the data.  If you want
to verify then you must run a Bacula verify job after the backup job. 

I suspect that there is no difference between Bacula and rsync except
that rsync is writing on a part of the media that is good and Bacula is
writing elsewhere. 

There are several solutions (this is not exhaustive):

1. Get new media.
2. Use a more reliable form of backup device (USB is relatively
   unreliable compared to SATA, ...).
3. Run read/write disk tests on your USB disk (note: this will destroy
   any existing data).
4. Check your OS logs.  They may show low-level errors that are not
   reported to Bacula.  If you have such errors, you must eliminate them
   to have reliable backups (or said the other way around: reliable
   backups *never* generate any OS device errors).
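
Checking the OS logs could be sketched as below. The function name
filter_io_errors is made up, and the pattern list is only an assumption
about what low-level storage errors commonly look like in Linux kernel
messages. It is not exhaustive, and an empty result does not prove the
hardware is healthy.

```shell
#!/bin/sh
# Sketch: pull likely storage-related error lines out of kernel log text.

# filter_io_errors
# Reads log lines on stdin and prints those that look like low-level
# device errors. The exit status follows grep: 0 if at least one line
# matched, 1 if none did.
filter_io_errors() {
    grep -iE 'i/o error|medium error|unrecovered read error|critical (medium|target) error'
}

# Typical use on a Linux host (may require root):
#   dmesg | filter_io_errors
#   journalctl -k | filter_io_errors
```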

Best regards,
Kern

On 06/27/2014 04:36 PM, advan...@posteo.de wrote:
 Hi list,

 I have been using Bacula for years now and had no trouble so far.
 But now it has really hit me.
 It all worked smoothly... until the restore (on Ubuntu 12 LTS and Ubuntu
 14, Bacula version 5.2.6).
 The files were on a USB disk. To be on the safe side I recreated
 everything on a local SATA disk. Same result.
 I do tons of rsync on that disk with no problems, checked it with SMART,
 upgraded the system, and no change. If I run bacula-sd with -p the
 restore goes through, but the files are really corrupted.

 Luckily I have another backup. But this is really bad.
 How can I rely on Bacula's backups now? (Rsync, for example, tells me at
 once if a file is corrupt.) Do I really have to do a test restore
 for every job now?

 Could you give me a hint what might be the problem?
 Thanks
 G.


 
 27-Jun 15:03 backup01-sd JobId 252: Ready to read from volume
 File010018 on device FileStorage (/data/bacula/FileStorage). 
 27-Jun 15:03 backup01-sd JobId 252: Forward spacing Volume File010018
 to file:block 1:699862044. 
 27-Jun 15:03 backup01-sd JobId 252: Error: block.c:318 Volume data
 error at 1:930427898! Block checksum mismatch in block=3574 len=64512:
 calc=ea539ac7 blk=d1a3deba 
 27-Jun 15:03 backup01-fd JobId 252: Error: attribs.c:485 File size of
 restored file /tmp/restore/data/tmp/xx.zip not correct. Original
 45958435, restored 2949120.
 






[Bacula-users] Block checksum mismatch on file storage

2014-06-27 Thread advantex

Hi list,

I have been using Bacula for years now and had no trouble so far.
But now it has really hit me.
It all worked smoothly... until the restore (on Ubuntu 12 LTS and Ubuntu
14, Bacula version 5.2.6).
The files were on a USB disk. To be on the safe side I recreated
everything on a local SATA disk. Same result.
I do tons of rsync on that disk with no problems, checked it with SMART,
upgraded the system, and no change. If I run bacula-sd with -p the
restore goes through, but the files are really corrupted.

Luckily I have another backup. But this is really bad.
How can I rely on Bacula's backups now? (Rsync, for example, tells me at
once if a file is corrupt.) Do I really have to do a test restore
for every job now?

Could you give me a hint what might be the problem?
Thanks
G.



27-Jun 15:03 backup01-sd JobId 252: Ready to read from volume
File010018 on device FileStorage (/data/bacula/FileStorage). 
27-Jun 15:03 backup01-sd JobId 252: Forward spacing Volume File010018
to file:block 1:699862044. 
27-Jun 15:03 backup01-sd JobId 252: Error: block.c:318 Volume data
error at 1:930427898! Block checksum mismatch in block=3574 len=64512:
calc=ea539ac7 blk=d1a3deba 
27-Jun 15:03 backup01-fd JobId 252: Error: attribs.c:485 File size of
restored file /tmp/restore/data/tmp/xx.zip not correct. Original
45958435, restored 2949120.

