Re: [Bacula-users] CRC ERROR on volume what possible to do

Bill Arlofski via Bacula-users Mon, 04 Mar 2024 08:43:16 -0800

On 3/4/24 09:00, Lionel PLASSE wrote:

Hello,


I get this error while reading a volume for a VirtualFull consolidation job:

Error: block_util.c:521 Volume data error at 0:0!
Block checksum mismatch in block=11755301 len=64512: calc=dfb5486f blk=9eafdf1a

It seems the volume file is definitely corrupted, and I think nothing can be
done for this block. (The error occurred twice for the same job at the same
block.)

But is it possible to continue reading the volume by bypassing or "marking" the
error, and proceed with consolidating the remaining data?
Can we blacklist the block in error (along with the corresponding files) to
complete the consolidation job, even if the result will be an incomplete
fileset?
Or does Bacula always kill the job on error? I don't recall seeing an option
to bypass I/O errors.

What can be done about this kind of hardware I/O problem?


Hello Lionel,

I have no idea if this would work, but it may be possible to start the SD with the `-p` option (proceed despite I/O errors), then try the restore. I have never tried this, and would typically fall back to using the low-level `bextract` tool, which also has the `-p` command line option.


If starting the SD with `-p`:

# sudo -u bacula /path/to/bacula-sd -p -f    (just start the SD in the foreground, proceeding despite I/O errors)

... And performing a VFull does not get you a good* virtual full, you may have to use `bextract` against the volumes used in the last Full/VirtualFull to restore the data from the volumes.

*In this sentence, the word "good" is relative. I mean, with hardware I/O errors, the data recovered during the VFull will surely be missing data... The same thing goes for using bextract set to ignore I/O errors.
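
For reference, a minimal sketch of what a `bextract` run with `-p` might look like. Everything specific here is an assumption: the config path, the volume names, the Device name, and the target directory are placeholders that would have to be replaced with values from the actual SD configuration and catalog. The script only prints the command so it can be reviewed before being run by hand as the bacula user.

```shell
#!/bin/sh
# Sketch only: every path, volume name, and device name below is a placeholder.
SD_CONF="${SD_CONF:-/opt/bacula/etc/bacula-sd.conf}"  # assumed SD config location
VOLUMES="${VOLUMES:-Vol-0001|Vol-0002}"                # hypothetical volume names, '|'-separated
DEVICE="${DEVICE:-FileStorage}"                        # hypothetical Device name from bacula-sd.conf
TARGET="${TARGET:-/tmp/bextract-restore}"              # scratch directory for recovered files

bextract_cmd() {
    # -p: proceed despite I/O errors; -c: SD config file; -V: volume list.
    # The volume list is single-quoted so the shell does not treat '|' as a pipe.
    printf 'bextract -p -c %s -V %s %s %s\n' \
        "$SD_CONF" "'$VOLUMES'" "$DEVICE" "$TARGET"
}

# Dry run: print the command for review instead of executing it.
bextract_cmd
```

Files that lived in the damaged block will still be missing or truncated, so the extracted tree should be treated as a best-effort recovery, not a complete fileset.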

Personally, unless this were a critical restore situation, and I were just trying to "get what I can" back, I would abandon the last Full/VirtualFull and perform a new, good real Full immediately since you have surely lost some data in your backup chain due to this hardware issue.

I'd also recommend implementing Verify jobs in some manner (e.g. automatically restore and/or Verify (level=data) critical jobs when they finish, using a script with `RunsWhen = After`, or implement an Admin job that pseudo-randomly picks a recent, good backup job and performs a restore and/or Verify job against it).
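
A minimal sketch of the "Verify when the job finishes" idea, under stated assumptions: `VerifyData` is a hypothetical job name for a Verify job (Type = Verify, Level = Data) that would already have to be defined in the Director configuration, and the JobId would be passed in by whatever calls the script (for example a RunScript that hands over the JobId). The script only prints the console command; piping that output to `bconsole` would submit it.

```shell
#!/bin/sh
# Sketch only: "VerifyData" is a hypothetical Verify job (Type = Verify, Level = Data)
# that must already exist in the Director configuration.
VERIFY_JOB="${VERIFY_JOB:-VerifyData}"
JOBID="${JOBID:-1234}"   # placeholder JobId of the backup job to verify

verify_cmd() {
    # $1 = JobId of the finished backup job
    printf 'run job=%s jobid=%s yes\n' "$VERIFY_JOB" "$1"
}

# Dry run: print the console command.
# To submit it for real:  verify_cmd "$JOBID" | bconsole
verify_cmd "$JOBID"
```

Keeping the command construction separate from the `bconsole` call makes it easy to check what would be submitted before wiring the script into a job.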

Here's an all-in-one™, overkill™ example script that I wrote as a proof-of-concept a while ago which performs a restore, then all three Verify levels against a backup job when it completes. You can pick and choose parts of this that you need and abandon the rest. :)


https://github.com/waa/AutoRestoreAndVerify


Hope this helps,
Bill

--
Bill Arlofski
w...@protonmail.com


