Re: [PATCH] refs: do not use cached refs in repack_without_ref
[Sorry about the previous HTML reposts]

Jeff King <p...@peff.net> wrote:

> On Mon, Dec 31, 2012 at 03:30:53AM -0700, Martin Fick wrote:
>
>> The general approach is to set up a transaction and either commit or
>> abort it. A transaction can be set up by renaming an appropriately
>> prepared directory to the ref.lock name. If the rename succeeds, the
>> transaction is begun. Any actor can abort the transaction (up until
>> it is committed) by simply deleting the ref.lock directory, so it is
>> not at risk of going stale.
>
> Deleting a directory is not atomic, as you first have to remove the
> contents, putting it into a potentially inconsistent state. I'll
> assume you deal with that later...

Right. These simple single-file transactions have at most one important file/directory in them; once that is deleted, the transaction is aborted (it can no longer complete). However, to support multi-file transactions, a better approach is likely to rename the uuid directory to have a .delete extension before deleting anything in it.
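To make the setup/abort mechanics concrete, here is a minimal POSIX C sketch of the scheme as I read it. The "<ref>.lock/<uuid>/" layout anticipates the uuid mechanism described below, and all paths, buffer sizes, and helper names are illustrative assumptions, not code from Gerrit or git:

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Begin a transaction on "ref".  Returns 0 if we now own it.
 * The "<ref>.lock/<uuid>/" layout follows the description above.
 */
static int begin_transaction(const char *ref, const char *uuid)
{
	char tmp[4096], sub[4096], lock[4096];

	/* Prepare the directory under a unique temporary name, with
	 * our uuid subdirectory already inside it. */
	snprintf(tmp, sizeof(tmp), "%s.tmp-%s", ref, uuid);
	snprintf(sub, sizeof(sub), "%s/%s", tmp, uuid);
	if (mkdir(tmp, 0777) || mkdir(sub, 0777))
		return -1;

	/* The atomic begin: a single rename.  An existing "<ref>.lock"
	 * is non-empty (it holds the current owner's uuid directory),
	 * so rename(2) fails with ENOTEMPTY and exactly one actor
	 * wins. */
	snprintf(lock, sizeof(lock), "%s.lock", ref);
	if (rename(tmp, lock)) {
		rmdir(sub);	/* lost the race: discard our attempt */
		rmdir(tmp);
		return -1;
	}
	return 0;
}

/*
 * Abort: any actor may do this at any time.  Removing the uuid
 * directory first ensures the owner's commit rename can no longer
 * succeed, even though the cleanup as a whole is non-atomic.
 */
static void abort_transaction(const char *ref, const char *uuid)
{
	char path[4096];

	snprintf(path, sizeof(path), "%s.lock/%s", ref, uuid);
	rmdir(path);
	snprintf(path, sizeof(path), "%s.lock", ref);
	rmdir(path);
}

int main(void)
{
	if (begin_transaction("foo", "123e4567") == 0) {
		puts("transaction begun");
		abort_transaction("foo", "123e4567");
	}
	return 0;
}
```

The property doing the work is that rename(2) refuses to replace a non-empty directory, so at most one prepared directory can ever become "<ref>.lock".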
>> One important piece of the transaction is the use of uuids. The
>> uuids provide a mechanism to tie the atomic commit pieces to the
>> transactions and thus to prevent long-sleeping processes from
>> inadvertently performing actions which could be out of date when
>> they finally wake up.
>
> Has this been a problem for you in practice?

No, but as you say, we don't currently hold locks for very long. I anticipate it being a problem in a clustered environment once transactions start spanning repos from java processes with insane amounts of RAM, which can sometimes have unpredictable, indeterminately long java GC cycles at inopportune times. It would seem short-sighted if Gerrit at least did not assume this will be a problem. But deletes today in git are not so short either, and Michael's fixes may make things worse? As you point out, though, that should perhaps be solved a different way.

> Avoiding this is one of the reasons that git does not take out long
> locks; instead, it takes the lock only at the moment it is ready to
> write, and aborts if the ref has been updated since the longer-term
> operation began. This has its own problems (you might do a lot of
> work only to have your operation aborted), but I am not sure that
> your proposal improves on that.

It does not; it might even make it worse.

> Git typically holds ref locks for a few syscalls. If you are
> conservative about leaving potentially stale locks in place (e.g.,
> give them a few minutes to complete before assuming they are now
> bogus), you will not run into that problem.

In a distributed environment, even a few minutes might not be enough: a process could be on a remote server behind a temporarily split network, which could cause delays well beyond typical local expectations. But there is also the other piece of this problem: how do you detect stale locks in the first place? How long will a lock sit stale before a user figures it out and reports it? And how many other users will simply have failed pushes and wonder why, without ever reporting them?

>> In each case, the atomic commit piece is the renaming of a file.
>> For the create and update pieces, a file is renamed from the
>> ref.lock dir to the ref file, resulting in an update to the sha for
>> the ref.
>
> I think we've had problems with cross-directory renames on some
> filesystems, but I don't recall the details. I know that Coda does
> not like cross-directory links, but cross-directory renames are OK
> (and in fact we fall back to the latter when the former does not
> work). Ah, here we go: 5723fe7 (Avoid cross-directory renames and
> linking on object creation, 2008-06-14). Looks like NFS is the
> culprit.

If the renames fail, we can fall back to regular file locking; the hard part to detect and deal with would be renames that don't fail but silently degrade into copies/mkdirs.

>> In the case of a delete, the actor may verify that the ref
>> currently contains the sha to prune if it needs to, and then
>> renames the ref file to ref.lock/uuid/delete. On success, the ref
>> was deleted.
>>
>> Whether successful or not, the actor may now simply delete the
>> ref.lock directory, clearing the way for a new transaction. Any
>> other actor may also delete this directory at any time, likely
>> either on conflict (if they are attempting to initiate a
>> transaction), or after a grace period just to clean up the FS. Any
>> actor may also safely clean up the tmp directories, preferably also
>> after a grace period.
>
> Hmm. So what happens to the delete file when the ref.lock directory
> is being deleted? Presumably deleting the ref.lock directory means
> doing it recursively (which is non-atomic). But then why are we
> keeping the delete file at all, if we're just about to remove it?

We are not trying to keep it, but we do need to ensure that our transaction has not yet been aborted, and the rename does exactly that. If we just deleted the ref file, we might sleep, and another transaction might abort ours and complete before we wake up and actually delete it. But by using a rename as the commit, the delete can only succeed while our transaction is still intact.
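Continuing the sketch from earlier, the delete piece might look as follows; again the helper name, the buffer handling, and the cleanup order are my assumptions, not code from the thread:

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Commit a deletion of "ref" under a transaction we began earlier.
 * Returns 0 if the ref was deleted, -1 if the transaction was
 * aborted out from under us (or the ref is already gone).
 */
static int commit_delete(const char *ref, const char *uuid)
{
	char path[4096];

	/* (A real actor would first verify that "ref" still contains
	 * the sha it intends to prune.) */

	/* The atomic commit: move the ref into our uuid directory.
	 * If another actor aborted us by removing "<ref>.lock/<uuid>",
	 * this rename fails -- and so does the delete -- no matter how
	 * long we slept beforehand. */
	snprintf(path, sizeof(path), "%s.lock/%s/delete", ref, uuid);
	if (rename(ref, path))
		return -1;

	/* The ref is now gone.  Tearing down "<ref>.lock" is mere
	 * cleanup that any actor may perform at any time. */
	unlink(path);
	snprintf(path, sizeof(path), "%s.lock/%s", ref, uuid);
	rmdir(path);
	snprintf(path, sizeof(path), "%s.lock", ref);
	rmdir(path);
	return 0;
}

int main(void)
{
	/* Assumes begin_transaction("foo", "123e4567") from the
	 * earlier sketch has already succeeded. */
	return commit_delete("foo", "123e4567") ? 1 : 0;
}
```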
Re: [PATCH] refs: do not use cached refs in repack_without_ref
On Wed, Dec 26, 2012 at 09:24:39AM +0100, Michael Haggerty wrote:

> I'm sorry to take so long to respond to this patch. Thank you for
> tracking down this bug and for your careful analysis. I think your
> patch is correct and should fix the first race condition that you
> described.

Thanks for checking it over. I almost cc'd you, as I know you have been working on ref caching. But I think the problem is much older, as we have always cached the packed-refs list in memory.

> But I think the continued existence of the other race conditions is
> an indication of a fundamental problem with the reference locking
> policy--independent of the in-RAM reference cache.
>
> The tacit assumption of the current locking policy is that changes
> to the packed-refs file are not critical for correctness, because
> loose references take precedence over it anyway. This is true for
> adding and modifying references. But it is not true for deleting
> references, because there is no way for a deletion to be represented
> as a loose reference in a way that takes precedence over the
> packed-refs file (i.e., there is no way for a loose reference to say
> "I am deleted, regardless of what packed-refs says"). Thus the race
> conditions for deleting references, whether via delete_ref() or via
> pack_refs() with --prune.

Yeah. It would be much nicer if we could just write the null sha1 or a similar sentinel into the loose ref to mean "I am deleted". The problem (besides backwards compatibility) is the usual D/F conflict: I cannot delete refs/heads/foo and then create refs/heads/foo/bar if the old ref file is in the way.

> There is a problem if two processes try to delete a reference at the
> same time, or if one process tries to delete a reference at the same
> time as another process is trying to pack the references. The reason
> is that there is no transaction that spans both the rewriting of the
> packed-refs file and also the deletion of the loose-ref files, and
> therefore it is possible for conflicting changes to be made in the
> two locations.

Just to be clear, are you talking about the races I wrote about in my other email, or are there others? I didn't (and still don't) see any actual on-disk data-loss races, just ones where a reader may get an old, packed value (which is still bad, but less bad than losing a write).

> I think that all of the problems would be fixed if a lock would be
> held on the packed-refs file during the whole process of deleting
> any reference; i.e., change the algorithms to:

Yeah, I looked at that, too. In fact, before I had correctly analyzed the problem, I thought that doing so would solve the problem I was seeing (which I noticed was wrong when it did not pass my tests :) ).

From your description, I think you realize this, but I want to point out for other readers: just updating the pack_refs side to call prune_refs under the lock would create a deadlock with a simultaneous delete (which takes the individual ref lock first, then the packed-refs lock). Of course, I do not think git is actually capable of deadlock, as we typically just die() instead of blocking on a lock. So maybe it does not matter so much. :)

> * Delete reference foo:
>
>   1. Acquire the lock $GIT_DIR/packed-refs.lock (regardless of
>      whether foo is a packed ref)

This suffers from the same problem I mentioned in my earlier email: we create lock contention on packed-refs.lock when there are two simultaneous deletes, even when the refs aren't packed. That might be an acceptable tradeoff, though. It's only for deletion, not for update, which is presumably rare.
And it has to happen anyway when both refs are packed.

>   2. Write to $GIT_DIR/packed-refs.new a version of the packed-refs
>      file that omits foo
> [...]

All of the further steps make sense. By deleting from packed-refs first, we eliminate the read race condition I mentioned in my earlier email. The only downside is the possible increase in lock contention on packed-refs.lock. I'm very tempted to go this route: it's better to be correct and sometimes die on lock contention than to sometimes give the wrong answer.

> I would appreciate a critique of my analysis. Even if you agree, I
> think it would be OK to apply Peff's patch to fix up the most
> pressing problem, then implement the more complete solution later.

I think your analysis is correct, modulo the issue I mentioned. And I agree that my patch is a good interim fix, as it addresses the worst case (losing writes).

-Peff
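As a rough illustration only, a schematic C sketch of the proposed deletion ordering might look like this. It assumes, per the discussion above, that the elided steps commit packed-refs.new into place before the loose ref is unlinked, and that pack-refs would likewise prune under packed-refs.lock; the lock helper is a simplified stand-in for git's lockfile API, and the paths are placeholders:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Take a lock file git-style: O_CREAT|O_EXCL is atomic, and a real
 * caller would die() on contention rather than block. */
static int take_lock(const char *path)
{
	return open(path, O_CREAT | O_EXCL | O_WRONLY, 0666);
}

static int delete_ref_proposed(const char *ref)
{
	int fd = take_lock(".git/packed-refs.lock");
	if (fd < 0)
		return -1;	/* lock contention: give up, as git would */

	/* Write a packed-refs that omits "ref" (rewrite elided here),
	 * then commit it atomically over the old file. */
	if (rename(".git/packed-refs.new", ".git/packed-refs"))
		goto fail;

	/* Only now unlink the loose ref.  A reader can never see the
	 * ref "rewound" to a stale packed value, and a concurrent
	 * pack-refs cannot prune or resurrect behind our back, because
	 * it needs the packed-refs lock we are still holding. */
	unlink(ref);

	close(fd);
	unlink(".git/packed-refs.lock");
	return 0;

fail:
	close(fd);
	unlink(".git/packed-refs.lock");
	return -1;
}
```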
Re: [PATCH] refs: do not use cached refs in repack_without_ref
On 12/21/2012 09:04 AM, Jeff King wrote:

> When we delete a ref that is packed, we rewrite the whole packed-refs
> file and simply omit the ref that no longer exists. However, we base
> the rewrite on whatever happens to be in our refs cache, not what is
> necessarily on disk. That opens us up to a race condition if another
> process is simultaneously packing the refs, as we will overwrite
> their newly-made packed-refs file with our potentially stale data,
> losing commits.
> [...]
> There are a few other interesting races in this code that this does
> not fix:
>
>   1. We check to see whether the ref is packed based on the cached
>      data. That means that in the following sequence:
>
>        a. receive-pack starts, caches packed refs; master is not
>           packed
>        b. meanwhile, pack-refs runs and packs master
>        c. receive-pack deletes the loose ref for master (which might
>           be a no-op if the pack-refs from (b) got there first)
>
>      it checks its cached packed-refs, sees that there is nothing to
>      delete, and we end up leaving the entry in packed-refs. In other
>      words, the deletion does not actually delete anything, but it
>      still returns success.
>
>      We could fix this by scanning the list of packed refs only after
>      we've acquired the lock. The downside is that this would
>      increase lock contention on packed-refs.lock. Right now, two
>      deletions may conflict only if they are deletions of packed
>      refs. With this change, any two deletions might conflict, packed
>      or not. If we work under the assumption that deletions are
>      relatively rare, this is probably OK. And if you tend to keep
>      your refs packed, it would not make any difference. It would
>      have an impact on repos which do not pack refs and which have
>      frequent simultaneous deletions.
>
>   2. The delete_ref function first deletes the loose ref, then
>      rewrites the packed-refs file. This means that for a moment, the
>      ref may appear to have rewound to whatever was in the
>      packed-refs file, and the reader has no way of knowing. This is
>      not a huge deal, but I think it could be fixed by swapping the
>      ordering. However, I think that would open us up to the reverse
>      race from above: we delete from packed-refs, then before we
>      delete the loose ref, a pack-refs process repacks it. Our
>      deletion looks successful, but the ref remains afterwards.

I'm sorry to take so long to respond to this patch. Thank you for tracking down this bug and for your careful analysis. I think your patch is correct and should fix the first race condition that you described.

But I think the continued existence of the other race conditions is an indication of a fundamental problem with the reference locking policy--independent of the in-RAM reference cache.

The tacit assumption of the current locking policy is that changes to the packed-refs file are not critical for correctness, because loose references take precedence over it anyway. This is true for adding and modifying references. But it is not true for deleting references, because there is no way for a deletion to be represented as a loose reference in a way that takes precedence over the packed-refs file (i.e., there is no way for a loose reference to say "I am deleted, regardless of what packed-refs says"). Thus the race conditions for deleting references, whether via delete_ref() or via pack_refs() with --prune.

The current algorithms for deleting references are:

* Delete reference foo:

  1. Acquire the lock $GIT_DIR/refs/heads/foo.lock
  2. Unlink $GIT_DIR/refs/heads/foo
  3. repack_without_ref():
     a. Acquire the lock $GIT_DIR/packed-refs.lock
     b. Write the packed-refs file, without the foo reference, to
        $GIT_DIR/packed-refs.lock
     c. Rename $GIT_DIR/packed-refs.lock to $GIT_DIR/packed-refs
  4. Release lock $GIT_DIR/refs/heads/foo.lock

* Pack references:

  1. Acquire lock $GIT_DIR/packed-refs.lock
  2. for_each_ref() calls handle_one_ref():
     a. Write the ref and its SHA1 to $GIT_DIR/packed-refs.lock
     b. Possibly record the ref and its SHA1 in the refs_to_prune list
  3. Commit $GIT_DIR/packed-refs
  4. prune_refs(): for each ref in the refs_to_prune list, call
     prune_ref():
     a. Lock the reference using lock_ref_sha1(), verifying that the
        recorded SHA1 is still valid. If it is, unlink the loose
        reference file, then free the lock; otherwise leave the loose
        reference file untouched.

There is a problem if two processes try to delete a reference at the same time, or if one process tries to delete a reference at the same time as another process is trying to pack the references. The reason is that there is no transaction that spans both the rewriting of the packed-refs file and also the deletion of the loose-ref files, and therefore it is possible for conflicting changes to be made in the two locations.

I think that all of the problems would be fixed if a lock were held on the packed-refs file during the whole process of deleting any reference.
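To make the missing transaction concrete, here is a schematic C rendering of the current deletion steps, with the race windows marked in comments; the helper logic and paths are simplified placeholders rather than git's actual code:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Current (racy) delete, following the steps listed above. */
static int delete_ref_current(const char *ref)
{
	char lock[4096];
	int fd;

	/* 1. Lock the individual ref, e.g. refs/heads/foo.lock. */
	snprintf(lock, sizeof(lock), "%s.lock", ref);
	fd = open(lock, O_CREAT | O_EXCL | O_WRONLY, 0666);
	if (fd < 0)
		return -1;

	/* 2. Unlink the loose ref.  RACE: until step 3 completes,
	 *    readers fall through to packed-refs and may see the ref
	 *    "rewound" to a stale packed value. */
	unlink(ref);

	/* 3. repack_without_ref(): take packed-refs.lock, write a
	 *    packed-refs omitting "ref", rename it into place (steps
	 *    3a-3c, elided here).  RACE: no lock spans steps 2 and 3,
	 *    so a pack-refs running in between can pack the ref and
	 *    leave a stale packed entry behind, or have its freshly
	 *    written packed-refs clobbered by a rewrite based on our
	 *    cached data. */

	/* 4. Release the individual ref lock. */
	close(fd);
	unlink(lock);
	return 0;
}
```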