Matthew Mondor wrote:
> Greg Troxel <[email protected]> wrote:
> > So, I'm inclined to patch rdiff-backup not to fsync, since it
> > seems excessive, and the backup is toast if the machine crashes
> > before it is finished -- in that case rdiff-backup just rolls
> > back. Opinions?
>
> I also wonder why fsync would be used for every file, especially
> if you consider a whole run a single "transaction", even more so
> if using snapshots (although you don't mention using them).

If rdiff-backup were easily able to roll back after a crash, then
I'd probably agree with the above. But it's expensive to roll
back (you have to compare the actual data in the files, without
assuming that {same size, same mtime} implies same data).
The current state of ffs+wapbl is that, if the system crashes and
the log is replayed, then files that had been written shortly
before the crash end up with whatever old data happened to be
in the underlying disk blocks, but new metadata indicating that
the size and timestamps are all up to date. I think that this
violates traditional unix file system semantics, but the people
who worked on wapbl don't seem to think it's a problem.
Anyway, the new metadata with old data tends to make rsync (and
probably rdiff-backup) think that the file is up to date, and
so not copy it again next time (unless you perform an expensive
comparison of all the data, not just the metadata).
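
To illustrate the difference between the two kinds of comparison, here
is a minimal Python sketch. It is not code from rsync or rdiff-backup;
the function names (quick_check_same, full_check_same, demo) are made
up for the example, and the "stale" file is simulated by writing
different bytes of the same length and then copying the timestamps.

```python
import filecmp
import os
import tempfile


def quick_check_same(src, dst):
    """Metadata-only check: same size and same mtime (hypothetical helper)."""
    a, b = os.stat(src), os.stat(dst)
    return a.st_size == b.st_size and a.st_mtime_ns == b.st_mtime_ns


def full_check_same(src, dst):
    """Expensive check: reads and compares the contents of both files."""
    return filecmp.cmp(src, dst, shallow=False)


def demo():
    """Simulate a destination file with up-to-date metadata but stale data."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "src")
        dst = os.path.join(d, "dst")
        with open(src, "wb") as f:
            f.write(b"new data")
        with open(dst, "wb") as f:
            f.write(b"old junk")  # same length, different content
        st = os.stat(src)
        # Make dst's timestamps match src's, as the replayed log would.
        os.utime(dst, ns=(st.st_atime_ns, st.st_mtime_ns))
        return quick_check_same(src, dst), full_check_same(src, dst)
```

In this simulated case the metadata-only check reports the files as
identical while the full comparison does not, which is the situation
that causes the file to be skipped on the next run.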
I have patched rsync to issue fdatasync(2) calls frequently,
to mitigate this problem in my own usage. It does slow it
down, but nowhere near as dramatically as you report. (I use
NetBSD-current.)
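
For anyone who wants to experiment with the idea, the following is a
rough Python sketch of copying a file while flushing its data to disk
at intervals. It is not the rsync patch itself (rsync is written in C);
the function name copy_with_sync and the chunk_size and sync_every
parameters are assumptions chosen for illustration, and the code falls
back to os.fsync where os.fdatasync is not available.

```python
import os

# os.fdatasync does not exist on every platform; fall back to os.fsync.
_sync = getattr(os, "fdatasync", os.fsync)


def copy_with_sync(src, dst, chunk_size=1 << 20, sync_every=8):
    """Copy src to dst, forcing data to disk every `sync_every` chunks.

    Returns the number of bytes copied.
    """
    copied = 0
    chunks = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk_size)
            if not buf:
                break
            fout.write(buf)
            copied += len(buf)
            chunks += 1
            if chunks % sync_every == 0:
                fout.flush()           # push Python's buffer to the kernel
                _sync(fout.fileno())   # ask the kernel to write data to disk
        # Final sync so the tail of the file is also on disk.
        fout.flush()
        _sync(fout.fileno())
    return copied
```

A smaller sync_every narrows the window in which a crash can leave new
metadata with old data, at the cost of more time spent waiting for the
disk.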
--apb (Alan Barrett)