> 
> From: "James Harper" <[email protected]>
> > Rsync integration
> >
> > Not claimed - no patches yet - Not in kernel yet
> >
> > Now that we have code to efficiently find newly updated files, we need to
> > tie it into tools such as rsync and dirvish. (For bonus points, we can
> > even allow rsync to use btrfs's builtin checksums and, when a file has
> > changed, tell rsync _which blocks_ inside that file have changed. Would
> > need to work with the rsync developers on that one.)
> >
> > Update rsync to preserve NOCOW file status.
> 
> Means: Make rsync work like btrfs send/receive;-) and put filesystem
> specific code in it.

Not really... it's just putting code in rsync to get existing metadata from the 
filesystem rather than calculating the metadata itself.

I'm just testing out some of the deduplication stuff in btrfs, and was actually 
a little shocked to find it calculating the hashes itself. btrfs already has 
checksums, and if nothing else it could have used them to trivially reject 
blocks that are different before calculating a stronger hash. There is talk 
about exposing the btrfs checksums to userspace, but of course that puts 
constraints on further development as they now have to consider userspace 
compatibility.

It would be a huge speedup for dedup though.
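To illustrate the idea, here's a minimal sketch of the cheap-reject trick, with zlib's CRC32 standing in for the checksum the filesystem already has on disk (the function name and the choice of SHA-256 as the strong hash are my own, not anything btrfs actually exposes):

```python
import hashlib
import zlib

def blocks_match(block_a: bytes, block_b: bytes) -> bool:
    """Decide whether two blocks are dedup candidates.

    Compare a cheap checksum first (standing in for one the
    filesystem already stores); only compute the expensive strong
    hash when the cheap checksums agree.
    """
    if zlib.crc32(block_a) != zlib.crc32(block_b):
        return False  # trivially rejected, no strong hash needed
    # Cheap checksums collided; confirm with a strong hash.
    return hashlib.sha256(block_a).digest() == hashlib.sha256(block_b).digest()
```

If the stored checksums were readable from userspace, the CRC step would cost nothing at all, which is where the speedup would come from.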
   
> 
> I am not sure whether this is a great idea.
> 
> Most of the time you will have the same filesystem on both ends. Then you
> can use zfs/btrfs etc. tools. Or rsync if it's not a COW system.
> 
> It is more "code polluting" than it's worth I think.

Maybe. Depends on the speedup. In a lot of cases, the above optimisations would 
speed up the processing that rsync has to do, but if 90% of the time taken in 
your rsync was actually moving data then you're never going to get any more 
than 10% faster. For LAN links though, I normally just use -W for rsync because 
computing changes just adds overhead (I mean you have to read the file at both 
ends anyway, and unless your disk can pull data faster than 1 GByte/second 
you're not going to saturate your 10 GBit/second link, so don't bother 
computing changes). If you got the change computation "for free", then it's a 
big win.
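The back-of-envelope arithmetic behind that, using the rough rates from the paragraph above (the numbers are illustrative, not measurements):

```python
# If the disk tops out below the link rate, reading the files is the
# bottleneck, and delta computation can't be hiding any network time.
disk_rate = 1 * 10**9           # ~1 GByte/s sequential read
link_rate = 10 * 10**9 / 8      # 10 GBit/s == 1.25 GByte/s

# rsync has to read the whole file at both ends either way, so the
# floor on total time is set by the slower of the two rates.
bottleneck = min(disk_rate, link_rate)
```

Here the disk is the limit, so skipping the change computation with -W costs nothing in transfer time and saves the CPU overhead.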

> 
> And if btrfs send/receive isn't stable there is a good chance to implement
> an unstable rsync as well.
> 

(I think) Russell was supposing that there weren't many bugs reported for 
send/receive because not many people were using it. I'm not sure how we got 
from there to "send/receive isn't stable".

But yes, new code has bugs :)

James

_______________________________________________
luv-main mailing list
[email protected]
http://lists.luv.asn.au/listinfo/luv-main
