On 9/10/10 4:30 PM, Robert Kisteleki wrote:
>> I think this sounds good. I tried a little experiment on a Linux machine
>> using simply renaming of a directory while another process was in that
>> directory. The second process was not disturbed. I think, therefore, one
>
> Welcome to the wonderful world of inodes.
>
>> could do the following at the rsync target machine where the rsync target
>> directory is named, say, xxxxx:
>>
>> 1. Create a new directory named, say, newxxxxx as a sibling to xxxxx.
>> 2. Fill newxxxxx with the new material for rsync.
>> 3. Rename xxxxx to oldxxxxx.
>> 4. Rename newxxxxx to xxxxx.
>>

Oops... I double-checked our code and it seems I lied to the list. I said that we were using a symlink, but we are in fact doing the renames as described here. Using a symlink would be a bit better, so we may update our implementation. So something like this:

1) create new dir 'new-YYMMDD-millisecond..something'
2) fill it with the new content
3) ln -sf ./new-YYMMDD-millisecond..something ./current-repos

but... see below.

>> Anyone connected to xxxxx before step 3 will remain connected to oldxxxxx
>> and get old data. Anyone starting rsync between steps 3 and 4 will get
>> nothing -- or some signal that xxxxx is unavailable. Anyone starting after
>> step 4 will get the new data. (I'm not sure that newxxxxx has to be a
>> sibling of xxxxx, but I haven't tried anything different.)
>
> And anyone entering between 1 and 3 will have a problem. You can create
> newxxx outside of this space and then swap in quickly, but even that's
> not atomic, and not always possible.
>
> The thing is, it's very difficult to implement anything on the server
> side that is guaranteed to be atomic from the RP's point of view.
> There's simply too much interaction between various abstraction levels
> (fs caches, filesystems, inodes, directory scans, symlinks, rsync, ...)
>
> It's far easier to check consistency on the receiving side.
>

Even with the symlinking this problem still stands. It just makes it a lot less likely that someone reaches the repo between the renames and sees nothing. Content may still change during your transfer (I am not sure that rsync has a session; it may just fetch the new content when it gets to it), and relying parties may actually come to you multiple times.

As mentioned by Steve, my proposed approach for relying parties only works in a specific case (where one walks down the SIAs), so his proposal may be more generic. I think choosing the right strategy depends a lot on the implementation details of the RP software.
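
For illustration only, here is a rough Python sketch of the symlink variant in steps 1) to 3) above. It is not our actual code. The function name `publish`, the `fill` callback, and the `.tmp` link suffix are all made up for the example, and it assumes a POSIX filesystem where `current-repos` is either absent or already a symlink. Instead of calling `ln -sf` directly (which, depending on the implementation, may remove the old link before creating the new one), it creates the new link under a temporary name and renames it over the live one, so the name `current-repos` always resolves to something.

```python
import os
import tempfile
import time


def publish(base_dir, fill, link_name="current-repos"):
    """Build a fresh directory, then repoint the symlink with a single rename."""
    # Step 1: create a uniquely named directory next to the live link.
    # The "new-YYMMDD-" prefix mirrors the naming in the mail; the random
    # suffix added by mkdtemp stands in for "millisecond..something".
    new_dir = tempfile.mkdtemp(prefix=time.strftime("new-%y%m%d-"), dir=base_dir)
    os.chmod(new_dir, 0o755)  # mkdtemp creates mode 0700; readers need access

    # Step 2: fill it with the new content (caller-supplied, hypothetical).
    fill(new_dir)

    # Step 3: create the link under a temporary name, then rename it over the
    # live link. rename() replaces an existing symlink in one step on POSIX.
    tmp_link = os.path.join(base_dir, link_name + ".tmp")
    if os.path.lexists(tmp_link):
        os.unlink(tmp_link)  # leftover from an interrupted earlier run
    os.symlink(os.path.basename(new_dir), tmp_link)
    os.replace(tmp_link, os.path.join(base_dir, link_name))
    return new_dir
```

This only removes the window in which the name does not exist. A client that resolved the old target before the swap keeps reading the old tree, and a client that comes back several times may still see a mix of old and new, so the consistency check on the receiving side is still needed. Cleaning up old directories is deliberately left out.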
So I am not 100% sure if the WG can provide an algorithm for RPs that is guaranteed to work. Listing some recommendations may still be helpful though.

Tim
_______________________________________________
sidr mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/sidr
