Re: [Bug 11521] rsync does not use high-resolution timestamps to determine file differences
> Date: Sun, 24 Jan 2016 15:43:20 -0800
> From: Wayne Davison

A couple questions below; please bear with me.

> No, if you do a ext4 -> ext4 copy, rsync has set the matching ns info for
> transferred files since 3.1.0. There was a case prior to rsync 3.1.2 where
> a brand-new file transferred in the same second it was created wouldn't get
> the right ns value because rsync was optimizing away the time-set if the
> file's mod-time matched in the integer part (3.1.2 fixed that).

Oh, I see what happened. My problem is that no Ubuntu LTS before 14.04
had rsync 3.1.0 or newer, and the original capability took more than
four years to make it into a released rsync version, if I'm reading the
release notes correctly.* Unfortunately, that means the vast majority
of my machine base predates the fix, including the machine hosting the
backups.

I can obviously install newer rsyncs, but that gives me a big installed
base of pre-fix data that I'm going to have to fix, and no more rsync
security updates unless I track them manually. Yet I'd rather do this
now, so I'm future-proofed, than be badly surprised some years down the
road when this rsync behavior becomes the default, and to keep the
problem from continuing to get worse.

* I may have tried 3.1.0 at some point and then realized the problems
it'd give me for backups and didn't install it everywhere, pending a
better fix; this is starting to ring bells. I'm really surprised that
the initial patch of Sep 7, 2009 never made it into a released rsync
until 3.1.0 of Sep 28 2013; that's a four-year delay, and explains why
I obviously never tried this out until perhaps an experiment with
3.1.0, and no doubt I didn't want to run a private version that
wouldn't get security updates. (Ubuntu rsync versions: 10.04 has 3.0.7
proto 30; 12.04 has 3.0.9 p 30; 14.04 has 3.1.0 p 31.)
So what it looks like is that the capability to transfer ns times at
all has existed in CVS (but unreleased) since 2009, in a released rsync
since 2013, and in an Ubuntu LTS since 2014. And the current patch
-seems- to be an optimization that avoids -comparison- if the ns times
match, but that only affects speed---it doesn't change what gets
written in any event, just how fast. Right? But actually I think it
changes behavior besides that---see my test case below.

> Beginning with this patch you can run rsync -aiv --checksum -@-1 and have
> it fix the full mod-time on any matching files it finds. But for most newer
> backups, the ns time will already be set correctly (as long as it was
> created using a new enough rsync and protocol 31). If someone has a large
> link-dest hierarchy that predates 3.1.0, then you could be sharing
> hard-linked matching files from back before the ns info was included (the
> older files would all have 0 for the ns value).

Wow, that -c really hurts. If one wanted to live dangerously, assuming
that two files are identical when they match in all other metadata
(including, obviously, length :) and their timestamps differ only in
that one has integer seconds and the other has the same integer seconds
plus nanoseconds, can rsync readjust the dates without doing a full
checksum? If not, I may write such a tool, or do it the (very) slow way
and have rsync re-checksum a few terabytes of my backups... :) [Might
find some bitrot that way, of course.]

Also, I actually -can't- use that command to fix my snapshots, because
(if I understand correctly), it will -alter- my existing snapshots to
match the -current- contents of files, destroying them---I'll no longer
be able to go back in time to a previous version. I only want to update
ns times on files in the older snapshots if and only if changing
integer times to ns times would be the only modification. I think
rsync -ac -@-1 will do far more, yes?
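[Editorial illustration] The "live dangerously" test described above can be written down as a small predicate. This is only a sketch under stated assumptions, not something rsync itself does: the function name differs_only_by_ns is hypothetical, and the choice of which metadata fields count as "all metadata" (size, mode, uid, gid) is an assumption made for the example.

```python
import os
import stat

NS_PER_SEC = 1_000_000_000


def differs_only_by_ns(src: os.stat_result, dst: os.stat_result) -> bool:
    """Return True when dst looks like a whole-second copy of src.

    Hypothetical helper: "matching metadata" is assumed to mean equal
    size, mode, uid and gid. File contents are never read, so this is
    the no-checksum shortcut, with the risk that implies.
    """
    same_metadata = (
        stat.S_ISREG(src.st_mode)          # regular files only
        and src.st_size == dst.st_size
        and src.st_mode == dst.st_mode
        and src.st_uid == dst.st_uid
        and src.st_gid == dst.st_gid
    )
    # Both mtimes fall in the same integer second...
    same_second = src.st_mtime_ns // NS_PER_SEC == dst.st_mtime_ns // NS_PER_SEC
    # ...the destination has 0 in the ns part...
    dst_is_whole_second = dst.st_mtime_ns % NS_PER_SEC == 0
    # ...and the source carries a nonzero ns part.
    src_has_ns = src.st_mtime_ns % NS_PER_SEC != 0
    return same_metadata and same_second and dst_is_whole_second and src_has_ns
```

The arguments would normally come from os.lstat() on the source file and on its counterpart in an older snapshot. A file for which this returns True is a candidate for having only its mtime updated; anything else is left alone, so older snapshots are never changed to match current contents.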
As for -@-1, that introduces a surprising change in behavior when I try
it. I'm unsure if it's intended, though I think it is. But it will
-definitely- break my hardlinks and bloat the backups if I try it
without readjusting the dates in --link-dest directories (e.g.,
previous snapshots).

I find that specifying -@-1 copies the ns timestamp from the source to
the destination even if the --link-dest directory has an integer
timestamp, and so I assume this is part of the purpose of the patch?
Not just an optimization, but a change in the way --link-dest might
work. Observe:

22:57:22 ~$ mkdir T
22:57:25 ~$ cd T
22:57:26 ~/T$ mkdir 1 2 3 4 5 6
22:57:30 ~/T$ lat() { ls -alF -i --full-time "$@"; }
22:57:49 ~/T$ touch 1/foo
22:57:53 ~/T$ ln 1/foo 1/bar
22:57:56 ~/T$ lat */*
1321175 -rw-r--r-- 2 user user 0 2016-01-24 22:57:53.013689572 -0500 1/bar
1321175 -rw-r--r-- 2 user user 0 2016-01-24 22:57:53.013689572 -0500 1/foo
22:57:59 ~/T$ rsync -aviH 1/ 2/
sending incremental file list
.d..t.. ./
>f+ foo
hf+ bar => foo

sent 139 bytes received 53 bytes 384.00 bytes/sec
total size is 0 speedup is 0.00
2
Re: [Bug 11521] rsync does not use high-resolution timestamps to determine file differences
[Included text copied from the commit.]

> Date: Sun, 24 Jan 2016 19:52:49 +
> From: samba-b...@samba.org
> Auto-Submitted: auto-generated
>
> https://bugzilla.samba.org/show_bug.cgi?id=11521
>
> Wayne Davison changed:
>
>            What|Removed |Added
>
>          Status|NEW     |RESOLVED
>      Resolution|---     |FIXED
>
> --- Comment #3 from Wayne Davison ---
> The latest git version has an option that lets you choose to include
> nanoseconds in comparisons if you want them. Having it on by default would
> likely cause far too many headaches for various backup solutions that use an
> older filesystem (e.g. ext3) that doesn't support nanoseconds.

Thanks for the patch!

Just FYI, this comment is true but incomplete---the scenario I was
describing was straight ext4-to-ext4 copies and/or backups. The
timestamp problem I currently see with those is because rsync was
throwing away the nanosecond information until this patch, even though
both ends supported it.

Anyone who's used dirvish, or presumably similar tools such as
rsnapshot, from and to ext4 or other ns-supporting filesystems, will be
bitten by the problem of non-ns vs ns timestamps bloating backups and
breaking hardlinks, either when they manually use --modify-window=-1,
or when this becomes the default.

I'd still love to see either some in-rsync workaround that can be left
in place*, or (second best) some clever by-hand one-time workaround
that uses rsync just once to update all those timestamps while not
breaking the hardlinks (presumably -not- using --link-dest for those),
or (third best) some not-rsync-at-all solution that basically does a
giant ls at the source and a giant touch at the destination. Yes, I
know that none of those can fix up backed-up files that are no longer
in the same place in the source; those at least won't bloat/unhardlink
later backups.
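[Editorial illustration] The "third best" option, a giant ls at the source and a giant touch at the destination, might look roughly like the sketch below. It is an assumption-laden illustration rather than a tested backup tool: the name retouch_tree and its dry_run parameter are hypothetical, only regular files are considered, and size plus whole-second mtime stand in for "everything else matches". Setting the mtime with os.utime changes the inode in place, so every hard link to that inode picks up the new time and no links are broken.

```python
import os
import stat

NS_PER_SEC = 1_000_000_000


def retouch_tree(src_root, dst_root, dry_run=True):
    """Copy ns mtimes from src_root onto whole-second copies under dst_root.

    Returns the relative paths that were (or, with dry_run, would be)
    retouched. Each destination inode is examined at most once.
    """
    seen = set()      # (st_dev, st_ino) of destination inodes already examined
    touched = []
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        for name in filenames:
            src_path = os.path.join(dirpath, name)
            dst_path = os.path.join(dst_root, rel_dir, name)
            try:
                s = os.lstat(src_path)
                d = os.lstat(dst_path)
            except FileNotFoundError:
                continue  # file is no longer in the same place; skip it
            if not (stat.S_ISREG(s.st_mode) and stat.S_ISREG(d.st_mode)):
                continue
            key = (d.st_dev, d.st_ino)
            if key in seen:
                continue  # another hard link to an inode already handled
            seen.add(key)
            if s.st_size != d.st_size:
                continue
            if s.st_mtime_ns // NS_PER_SEC != d.st_mtime_ns // NS_PER_SEC:
                continue  # different second: a real change, leave it alone
            if d.st_mtime_ns % NS_PER_SEC != 0 or s.st_mtime_ns == d.st_mtime_ns:
                continue  # destination already has ns info, or nothing to do
            if not dry_run:
                # Keep the destination's atime; replace only the mtime.
                os.utime(dst_path, ns=(d.st_atime_ns, s.st_mtime_ns))
            touched.append(os.path.normpath(os.path.join(rel_dir, name)))
    return touched
```

Running it first with dry_run=True against one old snapshot gives the list of files that would change, which can be checked by hand before anything is modified. As noted above, files that have moved or vanished at the source are simply skipped.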
This solution is likely going to have to be reinvented repeatedly by
people running backups, so working out the right way to do it and then
putting it somewhere other rsync users will find it is likely a useful
exercise. Before I solve it by hand for myself, it'd be useful to know
if it's likely rsync will come up with some way to solve it for me.

* E.g., (a) "If modify-window is -1, and --link-dest is in use, and
everything else about the file or directory matches -except- the
timestamp, then update the timestamp," and (b) "Provide a switch to
turn off this heuristic after I'm sure my backups are okay." (The
intent of (b) is to catch later slight changes of timestamp but
actually record them as -separate files-, which is important so older
snapshots don't magically change out from under you if something
updates a timestamp by a fraction of a second without otherwise
changing anything, which does happen and can occasionally be -very-
important to know about when tracking down issues.)

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
[Bug 11521] rsync does not use high-resolution timestamps to determine file differences
https://bugzilla.samba.org/show_bug.cgi?id=11521

Wayne Davison changed:

           What|Removed |Added

         Status|NEW     |RESOLVED
     Resolution|---     |FIXED

--- Comment #3 from Wayne Davison ---
The latest git version has an option that lets you choose to include
nanoseconds in comparisons if you want them. Having it on by default
would likely cause far too many headaches for various backup solutions
that use an older filesystem (e.g. ext3) that doesn't support
nanoseconds.

--
You are receiving this mail because:
You are the QA Contact for the bug.
[Bug 11521] rsync does not use high-resolution timestamps to determine file differences
https://bugzilla.samba.org/show_bug.cgi?id=11521

--- Comment #2 from Andrey Gursky ---
(In reply to Michael McCracken from comment #1)

I believe the rsync maintainer could at least have commented on this
with a reference to the mailing list [1], where this was already
proposed but ignored (as this bug report has been, too). Things are not
so simple, of course; see [2] and the discussion that follows.

[1] [PATCH] Consider nanoseconds when quick-checking for unchanged files
https://lists.samba.org/archive/rsync/2014-December/029853.html

[2] [PATCH] Consider nanoseconds when quick-checking for unchanged files
https://lists.samba.org/archive/rsync/2016-January/030511.html
[Bug 11521] rsync does not use high-resolution timestamps to determine file differences
https://bugzilla.samba.org/show_bug.cgi?id=11521

--- Comment #1 from Michael McCracken ---
Created attachment 11440
  --> https://bugzilla.samba.org/attachment.cgi?id=11440&action=edit
patch to check hi-res timestamp in unchanged_file