Craig Barratt wrote:
They are between 2 and 8 GB each. It's not doing two passes for sure. In my testing I have been using -vv. I will double check that, though.
I am rsyncing 1 TB of data each day. I am finding in my testing that actually removing the target files each day and then rsyncing is faster than doing a compare of the source->target files and then rsyncing over the delta blocks. This is because we have a fast link between the two boxes, and our disk is fairly slow. I am finding that the creation of the temp file (the 'dot file') is actually the slowest part of the operation. This has to be done for each file because the timestamp and at least a couple of blocks are guaranteed to have changed (Oracle files).
How big are the individual files? If they are bigger than 1-2GB then it is possible rsync is failing on the first pass and repeating the file. You should be able to see from the output of -vv (you will see a message like "redoing fileName (nnn)").
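For anyone wanting to check a saved -vv log for this, here is a minimal sketch. It assumes the message looks exactly like the example above ("redoing", the file name, then a number in parentheses); the function name and the sample file name are made up for illustration.

```python
import re

# Pattern assumed from the example message in this thread:
#   redoing fileName (nnn)
# The exact format may differ between rsync versions.
REDO_LINE = re.compile(r"^redoing (?P<name>.+) \((?P<index>\d+)\)\s*$")


def redone_files(lines):
    """Return the file names that appear in 'redoing ...' lines."""
    names = []
    for line in lines:
        match = REDO_LINE.match(line)
        if match:
            names.append(match.group("name"))
    return names


if __name__ == "__main__":
    import sys

    # Usage: rsync -vv ... 2>&1 | python this_script.py
    for name in redone_files(sys.stdin):
        print(name)
```

An empty result would suggest that no file was sent twice, at least as far as this message format goes.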
I think we are going to re-stripe the disks to make them faster. What you note above is exactly the issue. Watching the rsync process take place using vxstat (the Veritas iostat tool), I can observe that the first read is done from disk for the compare, and the subsequent read for creating the temp file is from memory (Sun boxes with tons of RAM). It's really only the write phase of the temp (or dot) file that ruins the advantage of the delta compare vs a wholesale copy.
The reason for this is that the first-pass block checksum (32 bits Adler + 16 bits of MD4) is too small for large files. There was a long thread about this a few months ago. The first message was from Terry Reed around mid Oct 2002 ("Problem with checksum failing on large files"). In any case, as you already note, if the network is fast and the disk is slow then copying the files will be faster. Rsync on the receiving side reads each file 1-2 times and writes each file once, while copying just requires a write on the receiving side.
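A rough way to see why a 48-bit first-pass checksum (32 bits of Adler plus 16 bits of MD4) runs out of room on multi-gigabyte files is to estimate the expected number of false block matches. The sketch below assumes the checksums behave like uniform random values and that every byte offset gets compared against every block checksum. That is a worst-case simplification, not a description of rsync's actual code, and the function name is made up for illustration.

```python
def expected_false_matches(file_size, block_size, checksum_bits=48):
    """Estimate false block matches for one file (illustrative model only).

    Assumes uniformly distributed checksums and that each byte offset
    is tested against each block checksum.
    """
    # Number of block checksums for the file (ceiling division).
    num_blocks = -(-file_size // block_size)
    # Number of byte offsets at which a full block can start.
    offsets = max(file_size - block_size + 1, 0)
    # Each (offset, block) pair collides with probability 2**-checksum_bits.
    return offsets * num_blocks / 2 ** checksum_bits


if __name__ == "__main__":
    for size, block in [(1 << 20, 700), (2 << 30, 16 * 1024), (8 << 30, 16 * 1024)]:
        print(size, block, expected_false_matches(size, block))
```

Under these assumptions an 8 GiB file with 16 KiB blocks gives roughly 16 expected false matches, while a 1 MiB file with 700-byte blocks gives far less than one. That fits with the failures only showing up on large files.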
Hmm, that's very interesting. We have a huge Veritas cache for filesystem buffers. I was under the impression that was buffering the block writes. We are going to tune its writeback caching some to try to help.
Another comment: rsync doesn't buffer its writes, so each write is a block (as little as 700 bytes, or up to 16K for big files). Buffering the writes might help. There is an optional buffering patch (patches/craigb-perf.diff) included with rsync 2.5.6 that improves the write buffering, plus other I/O buffering. That might improve the write performance, although so far significant improvements have only been seen on Cygwin.
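To illustrate what buffering the writes means here, the following is a minimal sketch of coalescing many small block writes into fewer large ones. It is not the code from patches/craigb-perf.diff. The class name, the 256 KiB buffer size, and the syscall counter are all assumptions made for illustration.

```python
import os


class CoalescingWriter:
    """Collect small writes and hand them to the OS in larger chunks."""

    def __init__(self, fd, capacity=256 * 1024):
        self.fd = fd              # an already-open file descriptor
        self.capacity = capacity  # flush once this many bytes are pending
        self.buf = bytearray()
        self.syscalls = 0         # number of os.write calls issued

    def write(self, data):
        self.buf += data
        if len(self.buf) >= self.capacity:
            self.flush()

    def flush(self):
        data = bytes(self.buf)
        self.buf.clear()
        offset = 0
        # os.write may write fewer bytes than requested, so loop until done.
        while offset < len(data):
            offset += os.write(self.fd, data[offset:])
            self.syscalls += 1
```

Writing 1000 blocks of 700 bytes through this wrapper should issue only a handful of write calls instead of 1000. Whether that helps on a given box depends on how much the filesystem cache is already absorbing.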
Thanks everyone for all the replies! Good info.
-kg