Re: [PATCH] refs: do not use cached refs in repack_without_ref

2013-01-07 Thread Martin Fick
...[Sorry about the previous HTML reposts]

Jeff King p...@peff.net wrote:
On Mon, Dec 31, 2012 at 03:30:53AM -0700, Martin Fick wrote:

 The general approach is to set up a transaction and
 either commit or abort it. A transaction can be set up
 by renaming an appropriately prepared directory to the
 ref.lock name. If the rename succeeds, the
 transaction has begun. Any actor can abort the
 transaction (up until it is committed) by simply
 deleting the ref.lock directory, so it is not at
 risk of going stale.

 Deleting a directory is not atomic, as you first have
 to remove the contents, putting it into a potentially
 inconsistent state. I'll assume you deal with that
 later...

Right, these simple single-file transactions have at
best one important file/directory in them; once that is
deleted, the transaction is aborted (it can no longer
complete). However, to support multi-file transactions, a
better approach is likely to rename the uuid directory to
have a .delete extension before deleting its contents, as
sketched below.
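To make the rename-a-directory transaction concrete, here is a rough
sketch in C. It is illustrative only: the path layout and helper names
are made up, not git code.

#include <stdio.h>

/*
 * Begin: atomically claim the ref by renaming a fully prepared uuid
 * directory onto the ref.lock name. rename() is atomic; if a
 * (non-empty) ref.lock already exists, the rename fails and another
 * transaction holds the ref.
 */
static int begin_transaction(const char *prepared_uuid_dir,
                             const char *ref_lock_path)
{
        return rename(prepared_uuid_dir, ref_lock_path);
}

/*
 * Abort: rename the lock directory aside before removing its
 * contents, so no actor ever observes a half-deleted ref.lock.
 */
static int abort_transaction(const char *ref_lock_path,
                             const char *delete_path)
{
        if (rename(ref_lock_path, delete_path) < 0)
                return -1; /* already committed, or aborted by another actor */
        /* ... recursively remove delete_path at leisure ... */
        return 0;
}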


  One important piece of the transaction is the use
  of uuids. The uuids provide a mechanism to tie the
  atomic commit pieces to the transactions and thus to
  prevent long-sleeping processes from inadvertently
  performing actions which could be out of date when
  they finally wake up.
 
Has this been a problem for you in practice?

No, but as you say, we don't currently hold locks for
very long. I anticipate it being a problem in a
clustered environment when transactions start spanning
repos from Java processes with insane amounts of RAM,
which can sometimes have unpredictable, indeterminately
long GC cycles at inopportune times. It would seem
short-sighted if Gerrit, at least, did not assume this
will be a problem.

But deletes today in git are not so short, and Michael's
fixes may make things worse? Though, as you point out, that
should perhaps be solved a different way.


 Avoiding this is one of the reasons that git does not
 take out long locks; instead, it takes the lock only
 at the moment it is ready to write, and aborts if it
 has been updated since the longer-term operation
 began. This has its own problems (you might do a lot
 of work only to have your operation aborted), but I
 am not sure that your proposal improves on that.

It does not; it might even increase that risk.


 Git typically holds ref locks for a few syscalls. If
 you are conservative about leaving potentially stale
 locks in place (e.g., give them a few minutes to
 complete before assuming they are now bogus), you will
 not run into that problem.

In a distributed environment even a few minutes might
not be enough; processes could be on a remote server
behind a temporarily split network, which could cause
delays far beyond typical local expectations.

But there is also the other piece of this problem: how
do you detect stale locks? How long will a lock stay
stale until a user figures it out and reports it? How many
other users will simply have failed pushes and wonder
why, without reporting them?
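For what it's worth, the grace-period heuristic Peff suggests could
look something like this sketch (the constant and the path handling
are hypothetical; and as argued above, no fixed threshold is really
safe in a distributed setup):

#include <sys/stat.h>
#include <time.h>

#define LOCK_GRACE_SECONDS (5 * 60)

/* Consider a lock potentially stale only after a grace period has
 * elapsed since its last modification. */
static int lock_looks_stale(const char *lock_path)
{
        struct stat st;
        if (stat(lock_path, &st) < 0)
                return 0; /* already gone; nothing to clean up */
        return (time(NULL) - st.st_mtime) > LOCK_GRACE_SECONDS;
}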


  In each case, the atomic commit piece is the
  renaming of a file. For the create and update
  pieces, a file is renamed from the ref.lock 
 dir to the ref file, resulting in an update to
  the sha for the ref.

 I think we've had problems with cross-directory
 renames on some filesystems, but I don't recall the
 details. I know that Coda does not like 
 cross-directory links, but cross-directory renames 
 are OK (and in fact we fall back to the latter when
 the former does not work).

 Ah, here we go: 5723fe7 (Avoid cross-directory renames
 and linking on object creation, 2008-06-14). Looks
 like NFS is the culprit.

If the renames fail, we can fall back to regular file
locking; the hard part would be detecting and dealing with
renames that don't fail but silently become copies/mkdirs.
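One way to detect the copy case, at least for plain files, would be to
compare device and inode numbers across the rename; a true rename
preserves them, a copy does not. Again just a sketch, not anything git
does today:

#include <stdio.h>
#include <sys/stat.h>

static int rename_preserving_inode(const char *src, const char *dst)
{
        struct stat before, after;
        if (stat(src, &before) < 0 || rename(src, dst) < 0)
                return -1;
        if (stat(dst, &after) < 0)
                return -1;
        if (before.st_dev != after.st_dev || before.st_ino != after.st_ino)
                return -2; /* the filesystem faked the rename with a copy */
        return 0;
}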


 In the case of a delete, the actor may verify that
 ref currently contains the sha to prune if it
 needs to, and then renames the ref file to
 ref.lock/uuid/delete. On success, the ref was
 deleted.

 Whether successful or not, the actor may now simply
 delete the ref.lock directory, clearing the way for
 a new transaction. Any other actor may delete this
 directory at any time also, likely either on conflict
 (if they are attempting to initiate a transaction),
 or after a grace period just to cleanup the FS. Any
 actor may also safely cleanup the tmp directories,
 preferably also after a grace period.

 Hmm. So what happens to the delete file when the
 ref.lock directory is being deleted? Presumably
 deleting the ref.lock directory means doing it
 recursively (which is non-atomic). But then why are 
 we keeping the delete file at all, if we're just about
 to remove it?

We are not trying to keep it, but we need to ensure that
our transaction has not yet been aborted: the rename
does this. If we just deleted the file, we might sleep,
and another transaction could abort our transaction and
complete before we wake up and actually delete the file.
But by using a rename, the commit fails if our
ref.lock/uuid directory has already been removed, so an
aborted transaction can no longer complete.
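In sketch form (uuid path invented for illustration): the commit
rename can only succeed while our ref.lock/uuid directory still
exists, so a transaction that has been aborted cannot complete, no
matter how long we slept.

#include <stdio.h>

/* Commit a deletion by renaming the ref file INTO our transaction's
 * uuid directory. If another actor has aborted us by removing
 * ref.lock/<uuid>, the rename fails with ENOENT and the stale
 * process does nothing. */
static int commit_delete(const char *ref_path, const char *uuid)
{
        char dst[4096];
        snprintf(dst, sizeof(dst), "%s.lock/%s/delete", ref_path, uuid);
        return rename(ref_path, dst);
}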

Re: [PATCH] refs: do not use cached refs in repack_without_ref

2012-12-28 Thread Jeff King
On Wed, Dec 26, 2012 at 09:24:39AM +0100, Michael Haggerty wrote:

 I'm sorry to take so long to respond to this patch.  Thank you for
 tracking down this bug and for your careful analysis.
 
 I think your patch is correct and should fix the first race condition
 that you described.

Thanks for checking it over. I almost cc'd you, as I know you have been
working on ref caching. But I think that the problem is much older, as
we always cached the packed-refs list in memory.

 But I think the continued existence of the other race conditions is an
 indication of a fundamental problem with the reference locking
 policy--independent of the in-RAM reference cache.
 
 The tacit assumption of the current locking policy is that changes to
 the packed-refs file are not critical for correctness, because loose
 references take precedence over it anyway.  This is true for adding and
 modifying references.  But it is not true for deleting references,
 because there is no way for a deletion to be represented as a loose
 reference in a way that takes precedence over the packed-refs file
 (i.e., there is no way for a loose reference to say "I am deleted,
 regardless of what packed-refs says"). Thus the race conditions for
 deleting references, whether via delete_ref() or via pack_refs() with
 --prune.

Yeah. It would be much nicer if we could just write the null sha1 or a
similar sentinel into the loose ref for "I am deleted". The problem
(besides backwards compatibility) is the usual D/F conflict. I cannot
delete refs/heads/foo and then create refs/heads/foo/bar if the old
ref file is in the way.
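For readers unfamiliar with the D/F (directory/file) conflict: a
leftover tombstone file at refs/heads/foo would block any loose ref
under it, because foo would have to be a directory. A tiny standalone
illustration (paths hypothetical, relative to $GIT_DIR):

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        /* Suppose a "deleted" sentinel file still sits at
         * refs/heads/foo. Creating refs/heads/foo/bar must then
         * fail: foo is a file, not a directory. */
        int fd = open("refs/heads/foo/bar", O_CREAT | O_WRONLY, 0666);
        if (fd < 0)
                printf("cannot create ref: %s\n", strerror(errno)); /* ENOTDIR */
        else
                close(fd);
        return 0;
}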

 There is a problem if two processes try to delete a reference at the
 same time, or if one process tries to delete a reference at the same
 time as another process is trying to pack the references.  The reason is
 that there is no transaction that spans both the rewriting of the
 packed-refs file and also the deletion of the loose-ref files, and
 therefore it is possible for conflicting changes to be made in the two
 locations.

Just to be clear, are you talking about the races I wrote about in my
other email? Or are there other races? I didn't (and still don't) see
any actual on-disk data loss races. Just ones where a reader may get an
old, packed value (which is still bad, but is less bad than one where a
write is lost).

 I think that all of the problems would be fixed if a lock would be held
 on the packed-refs file during the whole process of deleting any
 reference; i.e., change the algorithms to:

Yeah, I looked at that, too. In fact, before I had correctly analyzed
the problem, I thought that doing so would solve the problem I was
seeing (which I noticed was wrong when it did not pass my tests :) ).

From your description, I think you realize this, but I want to point out
for other readers: just updating the pack_refs side to call prune_refs
under the lock would create a deadlock with a simultaneous delete (which
takes the individual ref lock first, then the packed-refs lock). Of
course, I do not think git is capable of deadlock, as we typically just
die() instead of blocking on a lock. So maybe it does not matter so
much. :)
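The inversion in sketch form (take_lock and the paths are stand-ins,
not git's real lockfile API):

#include <stdio.h>

static void take_lock(const char *path) { printf("lock %s\n", path); }

/* delete_ref: per-ref lock first, then packed-refs.lock. */
static void delete_ref_order(void)
{
        take_lock("refs/heads/foo.lock");
        take_lock("packed-refs.lock");
}

/* pack-refs pruning *under* the packed-refs lock reverses that order:
 * a classic deadlock recipe if locks blocked. Since git die()s on
 * contention instead, one command simply fails. */
static void pack_refs_prune_order(void)
{
        take_lock("packed-refs.lock");
        take_lock("refs/heads/foo.lock");
}

int main(void)
{
        delete_ref_order();
        pack_refs_prune_order();
        return 0;
}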

 * Delete reference foo:
 
   1. Acquire the lock $GIT_DIR/packed-refs.lock (regardless of whether
  foo is a packed ref)

This suffers from the same problem I mentioned in my earlier email: we
create lock contention on packed-refs.lock when there are two
simultaneous deletes, even when the refs aren't packed. That might be an
acceptable tradeoff, though. It's only for deletion, not for update,
which is presumably rare. And it has to happen anyway when both refs are
packed.

   2. Write to $GIT_DIR/packed-refs.new a version of the packed-refs
  file that omits foo
 [...]

All of the further steps make sense. By deleting from packed-refs first,
we eliminate the read race-condition I mentioned in my earlier email.
The only downside is the possible increased lock contention on
packed-refs.lock.  I'm very tempted to go this route. It's better to be
correct and sometimes die on lock contention than to sometimes give the
wrong answer.
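The reordered deletion, roughly (the helpers are stubs standing in for
git's lockfile machinery, not its actual API):

#include <errno.h>
#include <unistd.h>

static int lock_packed_refs(void) { return 0; }
static int rewrite_packed_refs_omitting(const char *refname) { return 0; }
static int commit_packed_refs(void) { return 0; } /* rename .lock into place */

/* Rewrite packed-refs FIRST, under its lock for the whole deletion,
 * so a reader can never fall back to a stale packed value. */
static int delete_ref_packed_first(const char *refname, const char *loose_path)
{
        if (lock_packed_refs() < 0)
                return -1;
        if (rewrite_packed_refs_omitting(refname) < 0)
                return -1;
        if (commit_packed_refs() < 0)
                return -1;
        if (unlink(loose_path) < 0 && errno != ENOENT)
                return -1; /* a missing loose file just means already gone */
        return 0;
}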

 I would appreciate a critique of my analysis.  Even if you agree, I
 think it would be OK to apply Peff's patch to fix up the most pressing
 problem, then implement the more complete solution later.

I think your analysis is correct, modulo the issue I mentioned. And I
agree that my patch is a good interim, as it fixes the worst case
(losing writes).

-Peff


Re: [PATCH] refs: do not use cached refs in repack_without_ref

2012-12-26 Thread Michael Haggerty
On 12/21/2012 09:04 AM, Jeff King wrote:
 When we delete a ref that is packed, we rewrite the whole
 packed-refs file and simply omit the ref that no longer
 exists. However, we base the rewrite on whatever happens to
 be in our refs cache, not what is necessarily on disk. That
 opens us up to a race condition if another process is
 simultaneously packing the refs, as we will overwrite their
 newly-made packed-refs file with our potentially stale data,
 losing commits.
 [...]
 
 There are a few other interesting races in this code that this does not
 fix:
 
   1. We check to see whether the ref is packed based on the cached data.
  That means that in the following sequence:
 
a. receive-pack starts, caches packed refs; master is not packed
 
b. meanwhile, pack-refs runs and packs master
 
c. receive-pack deletes the loose ref for master (which might be
   a no-op if the pack-refs from (b) got there first). It checks
   its cached packed-refs and sees that there is nothing to
   delete.
 
  We end up leaving the entry in packed-refs. In other words, the
  deletion does not actually delete anything, but it still returns
  success.
 
  We could fix this by scanning the list of packed refs only after
  we've acquired the lock. The downside is that this would increase
  lock contention on packed-refs.lock. Right now, two deletions may
  conflict if they are deletions of packed refs. With this change,
  any two deletions might conflict, packed or not.
 
  If we work under the assumption that deletions are relatively rare,
  this is probably OK. And if you tend to keep your refs packed, it
  would not make any difference. It would have an impact on repos
  which do not pack refs, and which have frequent simultaneous
  deletions.
 
   2. The delete_ref function first deletes the loose ref, then rewrites
  the packed-refs file. This means that for a moment, the ref may
  appear to have rewound to whatever was in the packed-refs file, and
  the reader has no way of knowing.
 
  This is not a huge deal, but I think it could be fixed by swapping
  the ordering. However, I think that would open us up to the reverse
  race from above: we delete from packed-refs, then before we delete
  the loose ref, a pack-refs process repacks it. Our deletion looks
  successful, but the ref remains afterwards.

I'm sorry to take so long to respond to this patch.  Thank you for
tracking down this bug and for your careful analysis.

I think your patch is correct and should fix the first race condition
that you described.  But I think the continued existence of the other
race conditions is an indication of a fundamental problem with the
reference locking policy--independent of the in-RAM reference cache.

The tacit assumption of the current locking policy is that changes to
the packed-refs file are not critical for correctness, because loose
references take precedence over it anyway.  This is true for adding and
modifying references.  But it is not true for deleting references,
because there is no way for a deletion to be represented as a loose
reference in a way that takes precedence over the packed-refs file
(i.e., there is no way for a loose reference to say "I am deleted,
regardless of what packed-refs says").  Thus the race conditions for
deleting references, whether via delete_ref() or via pack_refs() with
--prune.

The current algorithms for deleting references are:

* Delete reference foo:

  1. Acquire the lock $GIT_DIR/refs/heads/foo.lock

  2. Unlink $GIT_DIR/refs/heads/foo

  3. repack_without_ref():

 a. Acquire the lock $GIT_DIR/packed-refs.lock

 b. Write the packed-refs without the foo reference to
$GIT_DIR/packed-refs.lock

 c. Rename $GIT_DIR/packed-refs.lock to $GIT_DIR/packed-refs

  4. Release lock $GIT_DIR/refs/heads/foo.lock

* Pack references:

  1. Acquire lock $GIT_DIR/packed-refs.lock

  2. for_each_ref() calls handle_one_ref():

 a. Write ref and its SHA1 to $GIT_DIR/packed-refs.lock

 b. Possibly record ref and its SHA1 in the refs_to_prune list.

  3. Commit $GIT_DIR/packed-refs

  4. prune_refs(): for each ref in the refs_to_prune list, call
 prune_ref():

 a. Lock the reference using lock_ref_sha1(), verifying that the
recorded SHA1 is still valid.  If it is, unlink the loose
reference file then free the lock; otherwise leave the loose
reference file untouched (see the sketch just below).
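Step 4a in miniature (simplified; lock_ref_sha1() does this under the
ref's lock, and real git parses the ref file more carefully):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Unlink the loose ref only if it still holds the recorded SHA1 (as
 * 40 hex digits). A ref updated after being packed is left alone. */
static int prune_if_unchanged(const char *loose_path, const char *hex_sha1)
{
        char buf[41];
        ssize_t n;
        int fd = open(loose_path, O_RDONLY);
        if (fd < 0)
                return -1;
        n = read(fd, buf, 40);
        close(fd);
        if (n != 40)
                return -1;
        buf[40] = '\0';
        if (strcmp(buf, hex_sha1))
                return -1; /* ref moved since it was packed: leave it */
        return unlink(loose_path);
}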

There is a problem if two processes try to delete a reference at the
same time, or if one process tries to delete a reference at the same
time as another process is trying to pack the references.  The reason is
that there is no transaction that spans both the rewriting of the
packed-refs file and also the deletion of the loose-ref files, and
therefore it is possible for conflicting changes to be made in the two
locations.

I think that all of the problems would be fixed if a lock would be held
on the packed-refs file during the whole process of deleting any
reference; i.e., change the algorithms to: