Bacula DOES NOT LIKE and does not handle network interruptions _at all_
if backups are in progress. This _will_ cause backups to abort - and
these aborted backups are _not_ resumable

Hi,

My feeble two cents is that this has been a bit of an Achilles heel for us even though we are a LAN backup environment (i.e. backups don't leave our local network). We are still running an older, slightly customized version of community Bacula, so I have not explored the option to restart stopped jobs that has come with newer versions.

Given that, I can recall that when we initially deployed our "backups to disk" setup, I would see backups of large file systems (e.g. 1TB) write three quarters of their data to volumes and then error out due to some random network interruption. I didn't like that this meant roughly 750GB of our volume space was taken up by an errored/incomplete job that would never be used. Because of this, I had to implement spooling, which people would typically only do if their backups were being written to sequential media (tape). So we now spool all jobs to dedicated spool disks, and then Bacula writes that data to the disk data volumes. It fixed the "cruft" issue and, along with other options, made large backups more stable.

But I can imagine a scenario where we would not have had to do this if Bacula could more easily recover from network glitches and automatically restart jobs where it last left off (thinking along the lines of checkpointing in an RDBMS).
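For anyone who wants to try the spooling approach, here is a minimal sketch of the kind of configuration involved. It is not our actual config: the resource names, the spool path and the size limit are made up for illustration, and each resource needs its other usual directives, which I have left out. Spool Data goes in the Job resource on the Director, and Spool Directory and Maximum Spool Size go in the Device resource on the Storage daemon.

```
# bacula-dir.conf (fragment; other Job directives omitted)
Job {
  Name = "BigFileserverBackup"      # hypothetical job name
  Spool Data = yes                  # stage data on the spool disk before writing to the volume
}

# bacula-sd.conf (fragment; other Device directives omitted)
Device {
  Name = "DiskStorage"              # hypothetical device name
  Spool Directory = /spool/bacula   # hypothetical path on a dedicated spool disk
  Maximum Spool Size = 200gb        # assumed limit; size it to fit your spool disk
}
```

With something like this in place, a job that dies partway through the first spool fill leaves its partial data on the spool disk instead of in the data volumes. Note that if a job is larger than the spool limit, Bacula despools to the volume in chunks as the spool fills, so the limit is worth sizing with your largest jobs in mind.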

As someone else said, this would require non-trivial changes to Bacula (i.e. I won't be making those changes to our version - :) ), and in practice the devil would be in the details. Still, if it were put to a vote, I'd probably vote for this as "a nice feature to have."

cheers,


--tom



_______________________________________________
Bacula-users mailing list
Bacula-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bacula-users
