On Sun, Nov 2, 2008 at 9:28 AM, Yves Dorfsman <[EMAIL PROTECTED]> wrote:
> Edward Ned Harvey wrote:
>>>
>>> http://code.google.com/p/lsyncd/
>>
>> Yup, that one fits the description. It looks really cool! :-)
>
> Hmmm... rsync is so efficient that I have to wonder what kind of extreme
> case would make this attractive. I'd be so afraid that one transaction get
> missed, and then because "notification" has been done, it would never get
> sync'ed again... That here and there, and over a long enough time period,
> you have two different file system.

I have the same fear as you, which may be mitigated by using a more robust
notification and transmission system on top of inotify (e.g. AMQP), though
lsyncd looks like a winner too :)

However, I can also see the utility of not using rsync in the case where
you're rsync'ing a fs with several million files: rsync keeps that file list
in memory while building it, and you can hit some ungood edge cases.
Additionally, rsync isn't exactly the right tool to use for something closer
to synchronous replication because of the thrash it can cause on a high-use
fs.

Other options that I can think of for a random "would like some kind of
replication thingy without thrashing my filesystem regularly" thingy:

- Your SAN probably has a replication engine, use that (for vast quantities
  of random unstructured end-user data, this is probably the best/easiest
  method)
- Chop up the rsync into smaller parts that can run in parallel/at different
  times/based on some other notifier
- Replicated backups (e.g. stick to your normal backup routine,
  clone/dupe/copy from the backup system)
- Append/update to a tar file, then sync that tarfile
- OS native-ish replication:
  - csync2 -> http://oss.linbit.com/csync2/
  - drbd -> http://www.drbd.org/
  - On Windows 2003 R2 (and up), DFSr replicates based on actions in the
    NTFS journal, and it is much easier to use than FRS
  - FreeBSD has something called ggated (?)
- AFAIK, most of the varied "cluster filesystems" aimed at the HPC crowd
  also offer replication/duplication of data for additional
  throughput/redundancy
- Don't do that: put your data in a more structured container that has
  replication (RDBMS, Hadoop, Hypertable)

I suspect that there's probably a different way to efficiently do
replication for every combination of data/OS/need out there, and a lot of
what one would need to do is look at the specifics of the situation to
figure out the best method. Though, I guess that's true for everything,
isn't it?

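To make the "chop up the rsync into smaller parts" idea a bit more concrete,
here is a rough Python sketch, not something I've run in production. It
assumes the tree splits reasonably well along its top-level directories and
that a plain "rsync -a" per directory is what you want. The function names,
the paths, and the "backup" host are all made up for illustration.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def top_level_dirs(src_root):
    """Names of the directories directly under src_root (hypothetical helper)."""
    return [p.name for p in Path(src_root).iterdir() if p.is_dir()]


def build_rsync_jobs(src_root, dest, subdirs):
    """Return one rsync argv list per subdirectory.

    Each job only has to build the file list for its own subtree.
    Assumption: loose files directly under src_root are not covered here
    and would need a separate job.
    """
    jobs = []
    for name in sorted(subdirs):
        src = f"{Path(src_root) / name}/"      # trailing slash: copy the contents
        dst = f"{dest.rstrip('/')}/{name}/"
        jobs.append(["rsync", "-a", src, dst])
    return jobs


def run_jobs(jobs, workers=4):
    """Run the jobs a few at a time; return their exit codes in job order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda argv: subprocess.run(argv).returncode, jobs))


if __name__ == "__main__":
    # Hypothetical paths and host; this only prints the commands.
    for argv in build_rsync_jobs("/srv/data", "backup:/srv/data", ["home", "projects"]):
        print(" ".join(argv))
```

In real use you would feed top_level_dirs() into build_rsync_jobs() and hand
the result to run_jobs(), or trigger individual jobs from whatever notifier
you trust. Keep the worker count low, since too many parallel rsyncs bring
back the very thrash you were trying to avoid.
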
> I have been using those scripts on my laptop, but I think eventually (once
> I've got the deletes worked out) I'll put that on all my machine, because
> then, it means that everybody can work while the server is down, it also
> means that I can suspend/hybernate desktop while not in use (hybernate and
> automount are NOT friends :-), etc...
>
> I've looked at CODA, but it was designed a long time ago, and does not work
> for today's sizes + their authentication mechanism is a headache.
>
> Anybody's been giving thought to this ?

Heh. CODA.

Have you looked at unison? http://www.cis.upenn.edu/~bcpierce/unison/
I suspect that its multi-way merge is probably closer to what you may want,
if I understand your use case correctly.

-n

--
-------------------------------------------
nathan hruby <[EMAIL PROTECTED]>
metaphysically wrinkle-free
-------------------------------------------
_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
