Christophe-Marie Duquesne wrote:
> I am currently writing a FUSE file system based on git-annex for
> replicating binary files on several machines. I thought I could share
> it here in order to get some ideas and contributors.

Wow, you have completely anticipated a blog post I was gonna make in a few days that a) announces git-annex's support for using Amazon S3 as a git "remote", and b) suggests that a free, distributed dropbox-type thing could be built on this foundation. My day, no, my week, is officially made. This is close enough to my birthday that you are in the running for best birthday present. :)

> What are your goals?
> Seamless synchronization "à la dropbox".
> Ability to use with big binary files such as mp3/movies.
> Entirely decentralized.
> Don't use unnecessary space
> Keep it simple: avoid special VCS commands and keep a filesystem
> interface as much as possible.

100% agree with this list, although I think that explicitly not mentioning what kind of large binary files a tool might be used to store is a wise thing. ;)

> Why?
> Because sparkleshare and dvcs-autosync are bad at versioning binary files

I have not looked at sparkleshare, but have been wondering if it could be adapted to be used as a GUI frontend for git-annex.

> What do you have?
> A python implementation. It is about 600 sloc, and you'll find it on
> https://github.com/chmduquesne/sharebox
> Be careful, it is very alpha and it still does not have a proper
> conflict handler.
>
> Hey, but copying is slow!
> On my machine, copying files to a sharebox fs is about 10 times slower
> than copying it on a normal fs. All the time is spent in python's
> os.write(): I guess the only way to work around this problem is to
> rewrite the whole thing in C, but I am keeping this for later.

I do wonder if a FUSE filesystem is really the best approach. Even a tight C implementation will need to read/write entire file contents to put them into the filesystem. Notice that git-annex avoids doing any copying of large file content when adding a file (it even defaults to using a backend that doesn't checksum, in order to preserve maximum speed).

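To make the no-copy point concrete, here is a minimal Python sketch of the general idea: rename the file into an object store on the same filesystem and leave a symlink in its place, so the content is never read or rewritten. This is not git-annex's actual code. The function name `annex_style_add`, the `.objects` directory, and the `SKETCH-` key format are all made up for illustration; the only thing carried over from the discussion is that the key is built from metadata rather than a checksum.

```python
import os
import tempfile


def annex_style_add(repo, name):
    """Move repo/name into a (hypothetical) object store, leaving a symlink."""
    path = os.path.join(repo, name)
    st = os.stat(path)

    # Key built from size, mtime and name only. No checksum is computed,
    # so the file content is never read. The format is invented here.
    key = "SKETCH-s%d-m%d--%s" % (
        st.st_size, int(st.st_mtime), name.replace("/", "_"))

    objects = os.path.join(repo, ".objects")  # hypothetical store location
    os.makedirs(objects, exist_ok=True)
    target = os.path.join(objects, key)

    # On the same filesystem, rename only updates directory entries;
    # the data blocks stay where they are.
    os.rename(path, target)

    # Relative symlink, so the tree still works if the repo is moved.
    os.symlink(os.path.relpath(target, os.path.dirname(path)), path)
    return key


if __name__ == "__main__":
    repo = tempfile.mkdtemp()
    with open(os.path.join(repo, "song.mp3"), "wb") as f:
        f.write(b"\0" * 1024)
    print(annex_style_add(repo, "song.mp3"))
    print(os.path.islink(os.path.join(repo, "song.mp3")))
```

The cost of this kind of add should not depend on file size, whereas anything routed through a FUSE write path has to push every byte through the filesystem process at least once.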
I had been thinking more along the lines of an inotify daemon that watches a directory (like dvcs-autosync), and drives git-annex. One real benefit of a filesystem is that you can support modifying the files, and proxy that through to git-annex as a delete of the old object and an add of the new object. That certainly has value -- do you do it?

-- 
see shy jo
_______________________________________________
vcs-home mailing list
email@example.com
http://lists.madduck.net/listinfo/vcs-home