Re: rsync local performance
Has anyone compiled rsync with other, newer compilers such as Intel 11.1? Does this break anything? My quick test shows rsync-3.1.0 performance jumps to ~120 MB/sec.

Greg

On Nov 13, 2009, at 10:44 AM, Greg Siekas wrote:
> Wayne,
>
> Transferring an 8 GB file using rsync between a network (10GbE) mounted
> filesystem and local disk.
>
> rsync-2.6.9 - 88-95 MB/sec
> rsync-3.0.6 - 62-72 MB/sec
> rsync-3.1.0 - 86-90 MB/sec
>
> Doing a cp of the file yields 140-160 MB/sec.
>
> It appears the IO code improvements in 3.1 have definitely made a difference
> over the 3.0 code base.
>
> Greg

--
Please use reply-all for most replies to avoid omitting the mailing list.
To unsubscribe or change options: https://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html
Re: max file size
On Fri, 13 Nov 2009 13:33:08 -0500 Matt McCutchen wrote:
> On Fri, 2009-11-13 at 12:36 +0100, Heinz-Josef Claes wrote:
> > On Fri, 13 Nov 2009 01:38:48 -0500 Matt McCutchen wrote:
> > > On Mon, 2009-11-09 at 18:20 +0100, Heinz-Josef Claes wrote:
> > > > I want to check if the following is possible:
> > > >
> > > > 1. transport a big block of data (several terabytes) physically from
> > > > location A to location B (very long distance) via tapes (or disks).
> > > > (Location A and B use different storage technologies.)
> > > >
> > > > When the tapes arrive in location B, the block of data has changed in
> > > > location A (a program / OS is running and storing data in it).
> > > >
> > > > 2. shutdown application / OS in location A, rsync the delta between
> > > > Location A and B online, then restart the system in location B.
> > > >
> > > > (Perhaps step 2 has to be done multiple times.)
> > >
> > > Since the source and destination versions are practically certain to
> > > differ, --checksum would serve no purpose. See the man page description
> > > of --checksum.
> >
> > Don't understand what you mean. From 1. and 2., only a few percent of
> > the data will change, so the idea is to transfer the differences only.
> > Transferring the whole file online takes too long.
> > How to do this without check sums (either --checksum or --inplace)?
>
> Did you read the description of --checksum as I suggested? It is an
> alternative "quick check" for deciding whether a file needs to be
> transferred, which is not what you want. You're talking about the
> delta-transfer algorithm, which is on by default for remote runs and is
> controlled by a separate option, --(no-)whole-file.

You're right - sorry, that was a misunderstanding on my side.

--no-whole-file --out-format='%n%L (%b of %l)'

does the job.

Thanks,
HJC
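The option combination HJC settled on can be sketched as a full command line. This is only an illustration: the source path, the host name and the destination path are hypothetical placeholders, and the command is assembled and printed here rather than executed.

```python
import shlex

def build_rsync_command(source, destination):
    """Assemble an rsync argv list that requests the delta-transfer algorithm."""
    return [
        "rsync",
        "--no-whole-file",               # ask for delta transfer instead of whole-file copies
        "--out-format=%n%L (%b of %l)",  # per file: name, bytes transferred, file length
        source,
        destination,
    ]

# Hypothetical source and destination, for illustration only.
command = build_rsync_command("/data/image.bin", "siteb.example.org:/data/image.bin")
print(shlex.join(command))
```

With this output format, a small `%b` next to a large `%l` suggests that only the differences were sent for that file.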
Tutorial: How to install and configure cwRsync on a Windows Platform
Hi,

At http://www.alanbonnici.com/videos/cwrsync.asp I have created a video tutorial on how to install and configure rsync running on a Windows platform. It is a work in progress. If you have any corrections or comments, please email me.

Regards,
Alan Bonnici
rsync local performance
Wayne,

Transferring an 8 GB file using rsync between a network (10GbE) mounted filesystem and local disk.

rsync-2.6.9 - 88-95 MB/sec
rsync-3.0.6 - 62-72 MB/sec
rsync-3.1.0 - 86-90 MB/sec

Doing a cp of the file yields 140-160 MB/sec.

It appears the IO code improvements in 3.1 have definitely made a difference over the 3.0 code base.

Greg
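To put the reported ranges side by side, the following rough sketch converts them into approximate transfer times and a percentage of the cp rate. It assumes that the file is 8000 MB (decimal units) and that the midpoint of each range is representative; neither assumption comes from the post itself.

```python
# Throughput ranges in MB/sec, as reported in the post above.
REPORTED = {
    "rsync-2.6.9": (88, 95),
    "rsync-3.0.6": (62, 72),
    "rsync-3.1.0": (86, 90),
    "cp": (140, 160),
}

FILE_SIZE_MB = 8 * 1000  # assumption: the 8 GB file is 8000 MB

def midpoint(low, high):
    """Middle of a reported range, used as a single representative rate."""
    return (low + high) / 2

def transfer_seconds(size_mb, rate_mb_per_s):
    """Time needed to move size_mb at a constant rate."""
    return size_mb / rate_mb_per_s

cp_rate = midpoint(*REPORTED["cp"])
for tool, (low, high) in REPORTED.items():
    rate = midpoint(low, high)
    seconds = transfer_seconds(FILE_SIZE_MB, rate)
    print(f"{tool}: ~{seconds:.0f} s, {100 * rate / cp_rate:.0f}% of cp")
```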
Re: max file size
On Fri, 2009-11-13 at 12:36 +0100, Heinz-Josef Claes wrote:
> On Fri, 13 Nov 2009 01:38:48 -0500 Matt McCutchen wrote:
> > On Mon, 2009-11-09 at 18:20 +0100, Heinz-Josef Claes wrote:
> > > I want to check if the following is possible:
> > >
> > > 1. transport a big block of data (several terabytes) physically from
> > > location A to location B (very long distance) via tapes (or disks).
> > > (Location A and B use different storage technologies.)
> > >
> > > When the tapes arrive in location B, the block of data has changed in
> > > location A (a program / OS is running and storing data in it).
> > >
> > > 2. shutdown application / OS in location A, rsync the delta between
> > > Location A and B online, then restart the system in location B.
> > >
> > > (Perhaps step 2 has to be done multiple times.)
> >
> > Since the source and destination versions are practically certain to
> > differ, --checksum would serve no purpose. See the man page description
> > of --checksum.
>
> Don't understand what you mean. From 1. and 2., only a few percent of
> the data will change, so the idea is to transfer the differences only.
> Transferring the whole file online takes too long.
> How to do this without check sums (either --checksum or --inplace)?

Did you read the description of --checksum as I suggested? It is an alternative "quick check" for deciding whether a file needs to be transferred, which is not what you want. You're talking about the delta-transfer algorithm, which is on by default for remote runs and is controlled by a separate option, --(no-)whole-file.

--
Matt
Stability of rsync 3.1.0dev (Re: Does files-from work with --delete?)
On Fri, 2009-11-13 at 08:38 -0800, Wayne Davison wrote:
> On Thu, Nov 12, 2009 at 3:56 PM, Philip Pokorny wrote:
> > How confident are you in the current state of 3.1.0.pre and
> > the nightly snapshots? Should I be concerned about running
> > this on production data?
>
> Personally, I'm almost ready to start using it in production. The
> 3.1.0dev code prior to the I/O changes was in production-ready shape,
> and the I/O overhaul has been testing well so far (I use it for all
> the personal rsyncing needs). Any failure cases from the I/O code
> should result in the stopping of the transfer, not some kind of
> corruption, and I haven't found anything in quite a while.

3.1.0dev needs to win back my trust after a bad experience trying to do a backup a few weeks ago. First I got an out-of-space error that rsync didn't report:

http://lists.samba.org/archive/rsync/2009-November/024135.html

And once I freed up disk space, I got an "unexpected tag 50" (I think), which I held off on reporting pending a reproducible case. I haven't had a chance to try another backup with 3.1.0dev yet.

--
Matt
Re: Does files-from work with --delete?
On Fri, 2009-11-13 at 08:43 -0800, Wayne Davison wrote:
> On Thu, Nov 12, 2009 at 6:07 PM, Matt McCutchen wrote:
> > The best approach for now is probably to backport the
> > --delete-missing-args changes to 3.0.6.
>
> In the future, I'd suggest starting with the head of the b3.0.x
> branch. That currently gets you one extra commit, an xattr-related
> memory fix (2daed024b17a2cafb956e12581c25119d07a5950).

Sure, provided that you keep b3.0.x stable all the time rather than just at releases. I've redone the missing-args backport:

https://mattmccutchen.net/rsync/rsync.git/?a=shortlog;h=hacks/missing-args-b3.0.x

--
Matt
Re: --fuzzy search over to-be-deleted files to catch moved files and directories
Hi Matt,

Thank you very much for answering those questions and helping me to understand rsync better!

On Thu, Nov 12, 2009 at 11:20:19PM -0500, Matt McCutchen wrote:
> Attempting to address each of your questions, here and then in your
> other message...
>
> On Wed, 2009-11-11 at 12:17 +0100, H. Langos wrote:
> > > It will find moved files that match exactly
> > > according to the "quick check" in effect (size + mtime or checksum).
> >
> > That is basename+size+mtime or basename+checksum, right?
>
> No, a basename match is not a requirement (hence the ability to detect
> renames), but it is a tie-breaker.

Ahh, OK, so here size+mtime or checksum selects the base file. And if that selection fails, then the "--fuzzy" search is applied, but it looks only in the /dst directory for a suitable candidate. (Or is the temporal order reversed?)

> > How does "--detect-renamed" interact with "--fuzzy" and "--delete-after"?
>
> --detect-renamed and --fuzzy are two different means of finding basis
> files that overlap in some cases but do not really interact.
> --detect-renamed considers the whole destination using the quick check,
> while --fuzzy considers only the same destination subdir using
> size+mtime or otherwise name similarity.
>
> --delete-before and --delete-during may reduce the effectiveness of
> --fuzzy, as stated in the man page description of --fuzzy, but they do
> not affect --detect-renamed since --detect-renamed actually works during
> the delete pass.

...

> > > It doesn't calculate name similarity like --fuzzy because that would
> > > be prohibitively expensive in the current implementation.
> >
> > Only files of the same size should be
> > candidates to start with, right?
>
> No, the name similarity calculation I'm talking about is the fallback to
> select a similar basis file when no available destination file passes
> the quick check, so it does not require a size match.

Hmm, OK, so --fuzzy also finds files that are slightly different and have had their names slightly changed.

This sounds like it would be a good idea to (have the option to) include the delete-candidates directory .~tmp~ (or whatever else "--detect-renamed" uses) in the --fuzzy search.

The real-world applications are obvious. Apart from software packages as described in https://bugzilla.samba.org/show_bug.cgi?id=3392#c7 (thanks for the link!), which is a special case using rsync-friendly gzip/zlib compression, there is the large area of media files.

Example: For my photo collections it would speed things up in the case where I move pictures to a different directory, rename them from DSC_01234.JPG to 20091113-174354_dsc01234.jpg (timestamp extracted from the EXIF data) and add author, license and some keywords to the EXIF tags.

This is not theory. In fact I do just those things with a script when importing pictures from any of my cameras into the photo archive. I rename them as shown above and then I move them to a directory structure made of /// . I don't change the EXIF tags yet, but I want to add that in the future. That, however, would make the size+mtime/checksum test fail. Using "--fuzzy" would help, but only if I did an rsync between the moving operation and the tag-changing operation. It doesn't matter which operation I do first, but doing both together would mean a completely new transfer to my backup location. :-/

The same thing goes for mp3 collections when you finally find the time to tag your new music and move it to the right directory in your collection.

> > Why would it be so expensive?
>
> Wayne said so here:
>
> https://bugzilla.samba.org/show_bug.cgi?id=3392#c11

Well, I think I'll have to wait then ... or refrain from doing move and change operations at the same time. :-)

Thank you very much for your help!

cheers
-henrik
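The rename step described above can be sketched as follows. This is not the author's script: the function name is hypothetical, and the capture time is passed in directly instead of being read from the EXIF data. Only the naming pattern is taken from the example in the message.

```python
from datetime import datetime
from pathlib import PurePath

def archive_name(camera_name, taken):
    """Build the archive file name from a camera file name and a capture time."""
    path = PurePath(camera_name)
    stem = path.stem.lower().replace("_", "")  # DSC_01234 -> dsc01234
    suffix = path.suffix.lower()               # .JPG -> .jpg
    return f"{taken:%Y%m%d-%H%M%S}_{stem}{suffix}"

print(archive_name("DSC_01234.JPG", datetime(2009, 11, 13, 17, 43, 54)))
```

A rename of this kind changes the base name as well as the directory, which is why the message asks for the --fuzzy search to also look at the files that are about to be deleted.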
Re: Does files-from work with --delete?
On Thu, Nov 12, 2009 at 6:07 PM, Matt McCutchen wrote:
> The best approach for now is probably to backport the --delete-missing-args
> changes to 3.0.6.

In the future, I'd suggest starting with the head of the b3.0.x branch. That currently gets you one extra commit, an xattr-related memory fix (2daed024b17a2cafb956e12581c25119d07a5950).

..wayne..
Re: Does files-from work with --delete?
On Thu, Nov 12, 2009 at 3:56 PM, Philip Pokorny <ppoko...@penguincomputing.com> wrote:
> How confident are you in the current state of 3.1.0.pre and the nightly
> snapshots? Should I be concerned about running this on production data?

Personally, I'm almost ready to start using it in production. The 3.1.0dev code prior to the I/O changes was in production-ready shape, and the I/O overhaul has been testing well so far (I use it for all the personal rsyncing needs). Any failure cases from the I/O code should result in the stopping of the transfer, not some kind of corruption, and I haven't found anything in quite a while.

..wayne..
Re: max file size
On Fri, 13 Nov 2009 01:38:48 -0500 Matt McCutchen wrote:
> On Mon, 2009-11-09 at 18:20 +0100, Heinz-Josef Claes wrote:
> > On Monday, 9 November 2009 17:48:35, Matt McCutchen wrote:
> > > On Mon, 2009-11-09 at 11:43 +0100, Heinz-Josef Claes wrote:
> > > > does anybody know what's the maximum file size (terabytes?) when using
> > > > rsync with options --checksum and / or --inplace?
> > > >
> > > > What file sizes have been tested in reality? Are there any experiences
> > > > using rsync (with --checksum and / or --inplace) for big files with
> > > > several / dozens of terabytes?
> > >
> > > I don't believe rsync has a fixed maximum size other than "what can fit
> > > in 64 bits", but I can't speak to any reliability issues that might come
> > > up with extremely large files.
> >
> > I've read about a fix for overrun checksum buffers with more than some
> > hundred terabytes but that was just something undefined . . .
>
> Indeed, I forgot about that. The delta-transfer algorithm doesn't work
> for files longer than 2^31 blocks. With the default maximum block size
> of 2^17, the limit is 2^48 bytes or 256 TB. You could stretch the limit
> by fixing a larger block size with --block-size. See:
>
> https://bugzilla.samba.org/show_bug.cgi?id=5459#c2

Thanks for that information! Have you (or anybody) ever done a test with big file sizes?

> > > For what purpose are you considering --checksum? In the case where the
> > > file's size hasn't changed (probably true for large image files), it
> > > will add an extra full read of the file on both sides before the
> > > transfer begins, which would be very expensive for multi-terabyte files.
> >
> > I want to check if the following is possible:
> >
> > 1. transport a big block of data (several terabytes) physically from
> > location A to location B (very long distance) via tapes (or disks).
> > (Location A and B use different storage technologies.)
> >
> > When the tapes arrive in location B, the block of data has changed in
> > location A (a program / OS is running and storing data in it).
> >
> > 2. shutdown application / OS in location A, rsync the delta between
> > Location A and B online, then restart the system in location B.
> >
> > (Perhaps step 2 has to be done multiple times.)
>
> Since the source and destination versions are practically certain to
> differ, --checksum would serve no purpose. See the man page description
> of --checksum.

Don't understand what you mean. From 1. and 2., only a few percent of the data will change, so the idea is to transfer the differences only. Transferring the whole file online takes too long. How to do this without check sums (either --checksum or --inplace)?

I'll probably be able to make a test with a file size of some terabytes in the next weeks, but that's not guaranteed.

Regards,
HJC
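The size limit quoted above follows from simple arithmetic, which the sketch below spells out. The two constants are the figures given in the thread, not values read from the rsync source, and the 1 MiB block size in the second call is a hypothetical value chosen only to show how the limit scales.

```python
MAX_BLOCKS = 2**31              # block-count limit quoted in the thread
DEFAULT_MAX_BLOCK_SIZE = 2**17  # default maximum block size in bytes, as quoted
TIB = 2**40                     # bytes per TiB (the thread writes this unit as "TB")

def max_delta_file_size(block_size=DEFAULT_MAX_BLOCK_SIZE):
    """Largest file size in bytes that stays within the block-count limit."""
    return MAX_BLOCKS * block_size

print(max_delta_file_size() == 2**48)     # matches the 2^48 bytes in the thread
print(max_delta_file_size() // TIB)       # the same limit expressed in TiB
print(max_delta_file_size(2**20) // TIB)  # hypothetical 1 MiB block size
```

Every doubling of the block size doubles the limit, which is the sense in which a larger --block-size stretches it.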
DO NOT REPLY [Bug 5583] Don't write out an unchanged file if all the checksums matched
https://bugzilla.samba.org/show_bug.cgi?id=5583

henrik-rs...@prak.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |henrik-rs...@prak.org

--- Comment #4 from henrik-rs...@prak.org 2009-11-13 04:49 CST ---
Here's my "me too" comment on the issue (feel free to move it to a separate bug depending on this one):

I have stumbled upon the same issue in connection with rsnapshot and rsync with the "--detect-renamed" patch.

Basically rsnapshot works like this: On the first run it creates a full copy of a directory tree /src to /dst/0. The next time, it rotates /dst/(x) to /dst/(x+1), creates a copy with just hard links from /dst/1 to /dst/0, and then calls rsync to transfer the changes between /src and /dst/0, effectively creating a differential backup at the granularity of files.

I applied the detect-renamed patch to avoid multiple copies of big files when they are moved around in the directory tree. The patch works insofar as it finds the correct base files in /dst. Then it uses the delta algorithm to make sure that no coincidental match of filename, size and mtime results in a false positive.

Unfortunately, using the delta algorithm creates a new copy of the file at /dst even if the content is the same as the base file (instead of using a hard link to the base file).
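The rotation described in the comment can be sketched roughly as below. This is not rsnapshot's code: the function name and the `keep` parameter are hypothetical, and the rsync run that would follow against /dst/0 is left out.

```python
import os
import shutil

def rotate_snapshots(dst_root, keep):
    """Shift snapshot x to x+1, then rebuild snapshot 0 as hard links to snapshot 1."""
    oldest = os.path.join(dst_root, str(keep - 1))
    if os.path.isdir(oldest):
        shutil.rmtree(oldest)              # drop the snapshot that falls off the end
    for index in range(keep - 2, -1, -1):  # rename the highest number first
        current = os.path.join(dst_root, str(index))
        if os.path.isdir(current):
            os.rename(current, os.path.join(dst_root, str(index + 1)))
    previous = os.path.join(dst_root, "1")
    newest = os.path.join(dst_root, "0")
    if os.path.isdir(previous):
        # Every file in the new snapshot 0 is a hard link; no file data is copied.
        shutil.copytree(previous, newest, copy_function=os.link)
    return newest
```

After this step each file in snapshot 0 shares its data with snapshot 1. If rsync then writes out a fresh copy of a file whose content has not changed, that sharing is lost for the file, which is the behaviour the comment reports.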
DO NOT REPLY [Bug 6881] --bwlimit option uses KiB/s, but is documented as (what amounts to) kB/s
https://bugzilla.samba.org/show_bug.cgi?id=6881

way...@samba.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #2 from way...@samba.org 2009-11-13 02:10 CST ---
I've both improved the docs and improved the option to be able to accept the same unit suffixes that are accepted by --max-size and --min-size. This makes it clearer what --bwlimit=1000 is doing, and allows someone to specify --bwlimit=1000kb (aka --bwlimit=1mb) for a slightly lower transfer limit than --bwlimit=1MiB.
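The difference between the suffixes mentioned in the comment can be illustrated with a small conversion helper. This is a sketch of the unit arithmetic only, not rsync's option parser: the function name is hypothetical, only the K and M suffixes are handled, and a bare number is treated as KiB per second in line with the bug title.

```python
# "K"/"KiB" and "M"/"MiB" are powers of 1024; "KB" and "MB" are powers of 1000.
MULTIPLIERS = {
    "k": 1024, "kib": 1024, "kb": 1000,
    "m": 1024**2, "mib": 1024**2, "mb": 1000**2,
}

def limit_in_bytes(value):
    """Convert a --bwlimit style value to bytes per second."""
    text = value.strip().lower()
    digits = text.rstrip("abcdefghijklmnopqrstuvwxyz")
    suffix = text[len(digits):] or "k"  # bare number: KiB per second
    return int(float(digits) * MULTIPLIERS[suffix])

for setting in ("1000", "1000kb", "1mb", "1MiB"):
    print(setting, limit_in_bytes(setting))
```

Under these multipliers 1000kb and 1mb are the same limit, and both are slightly lower than 1MiB, as the comment says.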
Re: Daemon tests broken when running as root
On Thu, Nov 12, 2009 at 9:11 PM, Matt McCutchen wrote:
> I'm guessing the "uid = 0" and "gid = 0" in the test daemon
> configuration were disabled so that the daemon tests could run for
> unprivileged users.

Right. I've checked in a fix that ensures that uid & gid are specified when the test is run as root.

..wayne..