https://bugzilla.samba.org/show_bug.cgi?id=14529

            Bug ID: 14529
           Summary: Please add option to save metadata to single file to
                    speed up backups
           Product: rsync
           Version: 3.2.0
          Hardware: All
                OS: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: core
          Assignee: wa...@opencoder.net
          Reporter: korn-bugzilla.samba....@elan.rulez.org
        QA Contact: rsync...@samba.org
  Target Milestone: ---

There are compelling reasons to use rsync as a backup tool in the following
way: back up to a destination fs, snapshot that fs to preserve the current
backup, and then save the next backup to the same destination, again using
rsync.

In this scenario, the data in the backup filesystem is only ever changed by
rsync.

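If there are many files, a backup run takes a very long time, and most of the
I/O is spent reading file metadata to check whether the source differs from
the destination. In the strace summary below, lstat and lgetxattr together
account for about 76% of the time spent in system calls: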

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 45.85    0.627125          31     20222           lstat
 30.61    0.418682          20     20222           lgetxattr
 13.54    0.185181          79      2338           getdents64
  3.23    0.044241          22      1982      1982 getxattr
  2.34    0.032001          16      1982           stat
  1.78    0.024293          20      1169           openat
  1.25    0.017112          14      1169           close
  1.05    0.014389          12      1169           fstat
  0.27    0.003737          19       187           brk
  0.04    0.000503          45        11           write
  0.02    0.000306          27        11           read
  0.01    0.000159          14        11           select
------ ----------- ----------- --------- --------- ----------------
100.00    1.367729                 50473      1982 total

If rsync could be told to save all metadata to some "database" in addition to
the filesystem, the load on the backup server during subsequent backups of the
same source data to the same destination could be much lower. The "database"
could be read into RAM, perhaps in chunks if it is very large, and checking
metadata for changes would be almost free.
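
To illustrate the idea only, here is a minimal Python sketch of such a
"database". It is not rsync code and does not reflect how rsync works
internally. The function names (scan_tree, save_db, load_db, changed_paths),
the JSON file format, and the choice of size, mtime and mode as the compared
fields are all assumptions made for this example.

```python
import json
import os


def scan_tree(root):
    """Record [size, mtime_ns, mode] for every regular entry under root."""
    meta = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # one lstat per file, as in the strace above
            rel = os.path.relpath(full, root)
            meta[rel] = [st.st_size, st.st_mtime_ns, st.st_mode]
    return meta


def save_db(meta, db_path):
    """Write the whole metadata map to a single file, replacing it atomically."""
    tmp = db_path + ".tmp"
    with open(tmp, "w") as f:
        json.dump(meta, f)
    os.replace(tmp, db_path)


def load_db(db_path):
    """Read the metadata map into RAM; an absent file means an empty map."""
    try:
        with open(db_path) as f:
            return json.load(f)
    except FileNotFoundError:
        return {}


def changed_paths(source_meta, db):
    """Return paths whose source metadata differs from the stored copy."""
    return sorted(p for p, m in source_meta.items() if db.get(p) != m)
```

With something like this, the destination side would compare the incoming
file list against the in-memory map instead of issuing lstat and lgetxattr
for every file, and would rewrite the map once the transfer finishes.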

Of course, if data in the actual filesystem is changed by a tool other than
rsync (only rsync would keep the "database" updated), the "database" gets out
of sync, but that can't be helped.

This could also be an enhancement of "fake super" -- instead of saving metadata
in an xattr for each file separately, all metadata could be saved in a single
file, in a location outside the root of the rsync module (or, to support
chroot, inside it, but hidden from rsync transfers).
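
As a rough sketch of what a single-file store for this metadata might look
like, the Python example below keeps one row per path in an SQLite file. The
table name, column set and helper functions (open_store, put, get) are
hypothetical and chosen for illustration; this is not rsync's on-disk format
or API.

```python
import sqlite3


def open_store(db_path):
    """Open (or create) the single-file metadata store."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS fake_super ("
        "path TEXT PRIMARY KEY, mode INTEGER, uid INTEGER, "
        "gid INTEGER, rdev INTEGER)"
    )
    return conn


def put(conn, path, mode, uid, gid, rdev=0):
    """Store or update the ownership and mode that would otherwise go in an xattr."""
    conn.execute(
        "INSERT OR REPLACE INTO fake_super VALUES (?, ?, ?, ?, ?)",
        (path, mode, uid, gid, rdev),
    )
    conn.commit()


def get(conn, path):
    """Return (mode, uid, gid, rdev) for path, or None if nothing is stored."""
    return conn.execute(
        "SELECT mode, uid, gid, rdev FROM fake_super WHERE path = ?",
        (path,),
    ).fetchone()
```

A lookup in such a store replaces a per-file lgetxattr call, and the file
itself can live wherever the module configuration points, including outside
the module root.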
