On 3/4/24 09:00, Lionel PLASSE wrote:
Hello,

I get this error while reading a volume during a VirtualFull consolidation job:

Error: block_util.c:521 Volume data error at 0:0! Block checksum mismatch in block=11755301 len=64512: calc=dfb5486f blk=9eafdf1a

It seems the volume file is permanently corrupted, and I think nothing can be done for this block. (The error occurs twice for the same job at the same block.) But is it possible to continue reading the volume by bypassing or "marking" the error and proceed with consolidating the remaining data? Can we blacklist the bad block (along with the corresponding files) to complete the consolidation job, even if the result will be an incomplete fileset? Or does Bacula always kill the failed job? I don't recall seeing an option to bypass I/O errors. What should one do with this kind of hardware I/O problem?
Hello Lionel,

I have no idea if this would work, but it may be possible to start the SD with the `-p` option (proceed despite I/O errors) and then try the restore. I have never tried this; I would typically resort to the low-level `bextract` tool, which also has the `-p` command line option.
To start the SD with `-p`, just run it in the foreground with I/O errors ignored:

# sudo -u bacula /path/to/bacula-sd -p -f

If performing a VFull with the SD started this way does not get you a good* virtual full, you may have to use `bextract` against the volumes used in the last Full/VirtualFull to restore the data from the volumes.
*In this sentence, the word "good" is relative. I mean, with hardware I/O errors, the VFull will surely be missing data... The same thing goes for using `bextract` set to ignore I/O errors.
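As a rough sketch of what the `bextract` fallback could look like: the script below only prints the command so it can be reviewed before being run as the bacula user. The config path, device name, volume names, and destination directory are all placeholders I made up for illustration; substitute the values from your own `bacula-sd.conf` and from the volumes used by the last Full/VirtualFull.

```shell
#!/bin/sh
# Dry-run sketch: print a bextract command line for review instead of running it.
# Every value below is a hypothetical placeholder, not taken from the original job.
SD_CONF="/path/to/bacula-sd.conf"   # assumed location of the SD config
DEVICE="FileStorage"                # assumed Device name defined in bacula-sd.conf
VOLUMES="Vol-0001|Vol-0002"         # assumed volume names, separated by '|'
DEST="/tmp/bextract-restore"        # assumed directory to extract files into

build_bextract_cmd() {
    # -p  proceed despite I/O errors
    # -c  Storage daemon configuration file
    # -V  volume names to read
    # The volume list is single-quoted in the output so that '|' is not
    # interpreted as a pipe when the line is pasted into a shell.
    printf "bextract -p -c %s -V '%s' %s %s\n" \
        "$SD_CONF" "$VOLUMES" "$DEVICE" "$DEST"
}

build_bextract_cmd
```

Whatever comes out of a run like this will still be missing the files that lived in the damaged block, so treat it as "get what I can" only.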
Personally, unless this were a critical restore situation, and I were just trying to "get what I can" back, I would abandon the last Full/VirtualFull and perform a new, good real Full immediately since you have surely lost some data in your backup chain due to this hardware issue.
I'd also recommend implementing Verify jobs in some manner (e.g. automatically restore and/or Verify (level=data) critical jobs when they finish, using a RunScript with `RunsWhen = After`, or implement an Admin job that pseudo-randomly picks a recent, good backup job and performs a restore and/or Verify job against it).
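For the "pseudo-randomly pick a recent, good job" idea, a minimal sketch of the selection step might look like the following. It assumes you already have a list of candidate JobIds, one per line (how you get them from the catalog is up to you and is not shown), and that GNU `shuf` is available. The function name is my own invention, and the `run` line in the comment is only an illustration with a hypothetical Verify job name.

```shell
#!/bin/sh
# Sketch: choose one JobId at random from a list supplied on stdin.
# pick_random_jobid is a hypothetical helper name; shuf (GNU coreutils) is assumed.
pick_random_jobid() {
    # Keep only lines that are purely numeric, then pick one at random.
    # Prints nothing if no numeric line was supplied.
    grep -E '^[0-9]+$' | shuf -n 1
}

# Example with made-up JobIds; the header line "JobId" is filtered out.
jobid=$(printf 'JobId\n101\n102\n103\n' | pick_random_jobid)
echo "Selected JobId: $jobid"

# The selected JobId could then be handed to a Verify job from bconsole,
# for instance (job name is a placeholder):
#   run job=MyVerifyJob jobid=$jobid level=Data yes
```

Run from an Admin job on a schedule, something along these lines would spot-check a different backup each time without having to verify every job.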
Here's an all-in-one™, overkill™ example script that I wrote as a proof-of-concept a while ago which performs a restore, then all three Verify levels against a backup job when it completes. You can pick and choose parts of this that you need and abandon the rest. :)
https://github.com/waa/AutoRestoreAndVerify

Hope this helps,
Bill

--
Bill Arlofski
w...@protonmail.com
_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users