On 09/11/2007, Andy Lo A Foe <[EMAIL PROTECTED]> wrote:
> You can md5 hash the content and use that as the key?

Yes, we detect candidates for merging via duplicate md5s.

> Most of what you want can be done with just a few lines of code in
> application space.. the I/O win would be pretty minimal, unless the majority
> of your operations is copying files, in which case MogileFS might not even
> be the right solution anyway ;)

Yes, it does save some I/O, but I anticipate the bigger gain is reduced
latency when breaking a merge. Instead of having to copy the file
(potentially large, although edited files should in general be small),
mogile can split a replica and schedule the replication. (You can't do
this split+replicate in application space, can you? You have to do the
read/write loop?)

OK, I'm suggesting an optimisation without numbers to back me up, which
is a de facto losing position :-) I'll come back if I can see there are
some large files having their replicas broken.

regards,
jb
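
P.S. For anyone following along, here is a rough sketch of the two application-space pieces discussed above: grouping keys by md5 to find merge candidates, and the read/write loop needed to break a merge by copying. It is plain Python and does not use any MogileFS client API; the names merge_candidates and copy_stream, the in-memory dict of contents, and the 64 KB chunk size are assumptions for illustration only.

```python
import hashlib
from collections import defaultdict
from typing import BinaryIO, Dict, List


def merge_candidates(files: Dict[str, bytes]) -> List[List[str]]:
    """Group keys whose content has the same md5 digest.

    `files` maps a hypothetical key to its content. Only groups with
    more than one key are returned, since those are the merge candidates.
    """
    by_digest = defaultdict(list)
    for key, content in files.items():
        by_digest[hashlib.md5(content).hexdigest()].append(key)
    return [sorted(keys) for keys in by_digest.values() if len(keys) > 1]


def copy_stream(src: BinaryIO, dst: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Application-space read/write loop: copy src to dst in chunks.

    Returns the number of bytes copied. Every byte passes through the
    application, which is the cost a server-side split would avoid.
    """
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        dst.write(chunk)
        total += len(chunk)
    return total
```

A matching md5 only marks a candidate; comparing the bytes before actually merging would guard against collisions.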
