> On Nov 14, 2016, at 9:25 PM, Richard L. Hamilton <rlha...@smart.net> wrote:
> 
> 
>> On Nov 14, 2016, at 18:48, Ryan Schmidt <ryandes...@macports.org> wrote:
>> 
>> No. There is a bug in mprsyncup in which updates of our private master rsync 
>> server are not atomic. I think that occasionally the public master rsync 
>> server must be connecting to the private server at exactly the wrong moment 
>> and ends up copying a broken configuration.
>> 
>> I'm currently rewriting mprsyncup to hopefully lessen the likelihood of this 
>> problem. Eliminating it completely is quite difficult because there isn't an 
>> easy way to atomically replace more than a single file on a server, and even 
>> if that were possible, rsync doesn't ensure that a consistent set of files 
>> is transferred.
>> 
> 
> 
> I think the following would provide the appearance of an atomically updated 
> directory, more or less:
> 
> If one had the directory "version_0001" and the symlink  "directory" -> 
> "version_0001",  and then created the directory "version_0002" and the 
> symlink "directory_new" -> "version_0002" one should be able to atomically 
> rename "directory_new" to "directory" (replacing the old symlink with the new 
> one).

Yes. The problem is that the directory you're talking about would have to be 
the one that contains the macports rsync module on the private rsync server. 
This directory is hundreds of gigabytes in size because it contains the 
distfiles and packages, so keeping more than one copy of it is undesirable. 
One could think about recreating the directory hierarchy and using 
hard links to make "copies" of the files that don't take up additional disk 
space -- rsync includes a feature to help with that -- but now we're getting 
into things that are not "easy" when you also consider that at any time we may 
want to add to the distfiles or packages.
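
As an illustration of the hard-link idea, here is a minimal Python sketch that 
recreates a directory hierarchy and hard-links every file into it, so the 
"copy" takes no additional space for file data. The function name and the 
example paths are hypothetical, and it assumes both trees are on the same 
filesystem. The rsync feature alluded to above is presumably --link-dest, 
which does something similar during a transfer.

```python
import os

def hardlink_tree(src, dst):
    """Recreate src's directory hierarchy under dst, hard-linking each file.

    Assumes src and dst are on the same filesystem (hard links cannot
    cross filesystems). Symlinked directories are not descended into.
    """
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target, exist_ok=True)
        for name in files:
            # A hard link adds a second name for the same inode.
            os.link(os.path.join(root, name), os.path.join(target, name))

# Hypothetical usage:
# hardlink_tree("/srv/rsync/version_0001", "/srv/rsync/version_0002")
```

This only covers making the linked copy; it does not address the harder part 
mentioned above, which is that files may be added to the distfiles or 
packages at any time.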


> Some say that it's not assured that "mv" would do a rename; the code gets a 
> little tricky to deal with case-insensitive directories, and I'm not going to 
> spend the time to figure that out; one could always do a "real" rename in 
> perl or something.

I have read that macOS (BSD) mv does not do a rename, but GNU mv does, so I 
will be using GNU mv. One could also use perl or another solution.
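
As a sketch of the "another solution" route, the symlink swap can be done 
without mv at all by calling rename directly. In Python, os.replace() calls 
rename(2) on POSIX systems, which replaces an existing symlink rather than 
following it. The function name and the "_new" suffix are only illustrative.

```python
import os

def swap_symlink(link_path, new_target):
    """Point link_path at new_target by renaming a temporary symlink over it."""
    tmp = link_path + "_new"
    # Clear a leftover temporary link from an earlier, interrupted run.
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(new_target, tmp)
    # rename(2) replaces the old symlink in a single step, so readers see
    # either the old target or the new one, never a missing link.
    os.replace(tmp, link_path)

# Hypothetical usage:
# swap_symlink("/srv/rsync/directory", "version_0002")
```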


> A process handing out the contents would still have to chdir() via the 
> symlink and hand out paths relative to the current directory, to work with a 
> consistent set of files.  That might take an interposing script or the like 
> to pull that off.  One would not want to blow away the old directory tree 
> for at least twice a generous time-to-copy.

Because of the slow network connection, if someone commits updates to a large 
port, we have already at times observed a single sync taking over 24 hours. 
Meanwhile, hundreds of commits could occur each day, so we want to be able to 
update the server many times a day. I don't especially want to keep dozens of 
copies of the data around.


Even if we can do a truly atomic replacement of the directory on the private 
rsync server via the above methods, there is no guarantee that the public rsync 
server will obtain a consistent set of files. For example, suppose the public 
rsync server begins transferring files from the private rsync server, and while 
that's happening, we update the private rsync server. The public rsync server 
will then have some old files (the ones it transferred before we updated the 
private server), and some new files (the ones it transferred after).
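
The scenario can be shown with a small Python simulation. This is only a toy 
model, not how rsync works internally: files are copied one at a time in 
sorted order, and the source switches from the old set to the new set partway 
through. The file names and version labels are made up.

```python
def transfer_with_midway_update(old, new, update_after):
    """Copy files one by one; the source is updated after `update_after` copies.

    `old` and `new` map the same file names to their contents.
    """
    source = dict(old)
    copied = {}
    for i, name in enumerate(sorted(old)):
        if i == update_after:
            # The private server is updated while the transfer is running.
            source = dict(new)
        copied[name] = source[name]
    return copied

old = {"a.conf": "v1", "b.conf": "v1", "c.conf": "v1"}
new = {"a.conf": "v2", "b.conf": "v2", "c.conf": "v2"}
mirror = transfer_with_midway_update(old, new, update_after=1)
# mirror == {"a.conf": "v1", "b.conf": "v2", "c.conf": "v2"}
```

The mirror ends up matching neither the old set nor the new set, even though 
the update on the source side happened in one step.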


In light of these complications, I'm implementing the best solution I can, 
which is to ensure that the amount of time that the files are inconsistent on 
the private rsync server is as small as possible.
