Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Michael Haggerty
On 02/12/2015 07:04 PM, Stefan Beller wrote:
 On Thu, Feb 12, 2015 at 8:52 AM, Michael Haggerty mhag...@alum.mit.edu 
 wrote:
 However, another important question is whether other Git implementations
 have copied this unusual locking policy. If so, that would be a reason
 not to change it. I just pinged the libgit2 maintainer to find out their
 policy. Maybe you could find out about JGit?
 
 I could not really find reflog expiration for jgit for a while, but
 then I stumbled upon this:
 
 // TODO: implement reflog_expire(pm, repo);
 
 so I think it's safe to change it in git.git for now.

Stefan: This locking policy doesn't only affect reflog expire; we also
need to lock the reflog when we add a new entry to it, for example when
updating refs/heads/master through HEAD. Do you know what JGit locks in
that scenario?


I had a discussion in IRC [1] with Carlos Martín Nieto about how libgit2
handles this:

* When libgit2 updates a reference through a symref, it locks the
pointed-to reference while updating the reflog for the symbolic ref. So,
for example, git commit would hold refs/heads/master.lock while
updating both logs/refs/heads/master and logs/HEAD. (This matches the
current Git behavior.)
* libgit2 doesn't support reflog expire, though it does have an API
that allows users to overwrite the reflog with a specified list of
entries. That API locks the reference itself; i.e., HEAD.lock. This
disagrees with Git's current behavior.

I also realized that Git's current policy is probably not tenable if one
process is re-seating a symref at the same time as another process is
expiring its reflog. The git reflog expire HEAD might grab
refs/heads/master.lock then start rewriting logs/HEAD. Meanwhile,
git checkout foo would grab HEAD.lock (not being blocked by the
expire process), then rewrite it to ref: refs/heads/foo, then grab
refs/heads/foo.lock before updating logs/HEAD. So both processes
could be writing logs/HEAD at the same time.

In fact, it's even worse. git checkout foo and git symbolic-ref HEAD
refs/heads/foo release HEAD.lock *before* updating logs/HEAD--in
other words, they append to logs/HEAD without holding *any* lock.

What is the best way forward?

I think the current locking policy is untenable, even aside from the bug
in symbolic-ref.

Switching to holding only HEAD.lock while updating logs/HEAD is the
right solution, but it would introduce an incompatibility with old
versions of Git and libgit2 (and maybe JGit?) Probably nobody would
notice, but who knows?

Therefore, I propose that we hold *both* HEAD.lock *and*
refs/heads/master.lock when modifying logs/HEAD (even when running
reflog expire or reflog delete). I think this will be compatible
with old versions of Git and libgit2, and it will also fix some design
problems mentioned above. Plus, it will prevent updates to the two
reflogs from appearing out of order relative to each other.

Someday, when old clients have been retired, we can optionally switch to
locking *only* HEAD.lock when modifying logs/HEAD.

What do people think?
Michael

[1] https://github.com/libgit2/libgit2/issues/2902

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Junio C Hamano
Michael Haggerty mhag...@alum.mit.edu writes:

 I also realized that Git's current policy is probably not tenable if one
 process is re-seating a symref at the same time as another process is
 expiring its reflog. The git reflog expire HEAD might grab
 refs/heads/master.lock then start rewriting logs/HEAD. Meanwhile,
 git checkout foo would grab HEAD.lock (not being blocked by the
 expire process), then rewrite it to ref: refs/heads/foo, then grab
 refs/heads/foo.lock before updating logs/HEAD. So both processes
 could be writing logs/HEAD at the same time.
 ...
 Switching to holding only HEAD.lock while updating logs/HEAD is the
 right solution,...

We convinced ourselves that not locking the symref is wrong, but
have we actually convinced us that not locking the underlying ref,
as long as we have a lock on the symref, is safe?

To protect you, the holder of a lock on refs/remotes/origin/HEAD
that happens to point at refs/remotes/origin/next, from somebody who
is updating the underlying refs/remotes/origin/next directly without
going through the symbolic ref (e.g. receive-pack), wouldn't the
other party need to find any and all symbolic refs that point at the
underlying ref and take locks on them?

As dereferencing a symbolic ref in the forward direction is much
cheaper than in the reverse, and because you need to dereference it
anyway, I wonder if we want the endgame to be hold locks on both,
not just hold a lock on the symlink.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Stefan Beller
On Fri, Feb 13, 2015 at 10:26 AM, Junio C Hamano gits...@pobox.com wrote:
 Stefan Beller sbel...@google.com writes:

 On Fri, Feb 13, 2015 at 10:05 AM, Junio C Hamano gits...@pobox.com wrote:

 We convinced ourselves that not locking the symref is wrong, but
 have we actually convinced us that not locking the underlying ref,
 as long as we have a lock on the symref, is safe?

 To protect you, the holder of a lock on refs/remotes/origin/HEAD
 that happens to point at refs/remotes/origin/next, from somebody who
 is updating the underlying refs/remotes/origin/next directly without
 going through the symbolic ref (e.g. receive-pack), wouldn't the
 other party need to find any and all symbolic refs that point at the
 underlying ref and take locks on them?

 As we're just modifying the ref log of HEAD in this case, we don't bother
 with where the HEAD points to. The other party may change
 refs/remotes/origin/next without us noticing, but we don't care here as
 all we do is rewriting the ref log for HEAD.

 If the other party were to modify HEAD (point it to some other branch, or
 forward the branch pointed to), they'd be expected to lock HEAD and
 would fail as we have the lock?

 I was not talking about HEAD in what you are responding to, though.
 Don't we maintain a reflog on refs/remotes/origin/HEAD?  Is it OK to
 allow its entries to become inconsistent with the underlying ref?


Yes we do have a HEAD ref log, and that ref log would differ from
the underlying ref log. I don't know if that is desired, but I would tend to
not like it.

So the other party updating the underlying ref would also need to lock
HEAD then,
which ... they don't. But they try writing to it nevertheless, in
refs.c line 3121-3150
see the /* special hack */ comment.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Junio C Hamano
Stefan Beller sbel...@google.com writes:

 On Fri, Feb 13, 2015 at 10:05 AM, Junio C Hamano gits...@pobox.com wrote:

 We convinced ourselves that not locking the symref is wrong, but
 have we actually convinced us that not locking the underlying ref,
 as long as we have a lock on the symref, is safe?

 To protect you, the holder of a lock on refs/remotes/origin/HEAD
 that happens to point at refs/remotes/origin/next, from somebody who
 is updating the underlying refs/remotes/origin/next directly without
 going through the symbolic ref (e.g. receive-pack), wouldn't the
 other party need to find any and all symbolic refs that point at the
 underlying ref and take locks on them?

 As we're just modifying the ref log of HEAD in this case, we don't bother
 with where the HEAD points to. The other party may change
 refs/remotes/origin/next without us noticing, but we don't care here as
 all we do is rewriting the ref log for HEAD.

 If the other party were to modify HEAD (point it to some other branch, or
 forward the branch pointed to), they'd be expected to lock HEAD and
 would fail as we have the lock?

I was not talking about HEAD in what you are responding to, though.
Don't we maintain a reflog on refs/remotes/origin/HEAD?  Is it OK to
allow its entries to become inconsistent with the underlying ref?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Stefan Beller
On Fri, Feb 13, 2015 at 8:26 AM, Michael Haggerty mhag...@alum.mit.edu wrote:

 What is the best way forward?

 Switching to holding only HEAD.lock while updating logs/HEAD is the
 right solution, but it would introduce an incompatibility with old
 versions of Git and libgit2 (and maybe JGit?) Probably nobody would
 notice, but who knows?

 Therefore, I propose that we hold *both* HEAD.lock *and*
 refs/heads/master.lock when modifying logs/HEAD (even when running
 reflog expire or reflog delete). I think this will be compatible
 with old versions of Git and libgit2, and it will also fix some design
 problems mentioned above. Plus, it will prevent updates to the two
 reflogs from appearing out of order relative to each other.

Yeah, I agree on that. It's ok to lock both for now (it doesn't
really harm the user). But we really need to document it in the
source code why we lock both and that the 2nd lock can be removed
after time has passed, something like:

/*
 * We need to lock both the symbolic ref as well as
 * the dereferenced ref for now because of inconsistencies with
 * older and other implementations of git. Originally only the
 * dereferenced ref lock was held, but discussion($gmane/263555)
 * turned out, we actually want to hold the lock on the symbolic ref.
 * For now hold both locks until all implementation hold the lock on
 * the symbolic ref. (Feb/2015)
 */


 Someday, when old clients have been retired, we can optionally switch to
 locking *only* HEAD.lock when modifying logs/HEAD.

 What do people think?
 Michael

 [1] https://github.com/libgit2/libgit2/issues/2902

 --
 Michael Haggerty
 mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Stefan Beller
On Fri, Feb 13, 2015 at 10:05 AM, Junio C Hamano gits...@pobox.com wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:

 I also realized that Git's current policy is probably not tenable if one
 process is re-seating a symref at the same time as another process is
 expiring its reflog. The git reflog expire HEAD might grab
 refs/heads/master.lock then start rewriting logs/HEAD. Meanwhile,
 git checkout foo would grab HEAD.lock (not being blocked by the
 expire process), then rewrite it to ref: refs/heads/foo, then grab
 refs/heads/foo.lock before updating logs/HEAD. So both processes
 could be writing logs/HEAD at the same time.
 ...
 Switching to holding only HEAD.lock while updating logs/HEAD is the
 right solution,...

 We convinced ourselves that not locking the symref is wrong, but
 have we actually convinced us that not locking the underlying ref,
 as long as we have a lock on the symref, is safe?

 To protect you, the holder of a lock on refs/remotes/origin/HEAD
 that happens to point at refs/remotes/origin/next, from somebody who
 is updating the underlying refs/remotes/origin/next directly without
 going through the symbolic ref (e.g. receive-pack), wouldn't the
 other party need to find any and all symbolic refs that point at the
 underlying ref and take locks on them?

As we're just modifying the ref log of HEAD in this case, we don't bother
with where the HEAD points to. The other party may change
refs/remotes/origin/next without us noticing, but we don't care here as
all we do is rewriting the ref log for HEAD.

If the other party were to modify HEAD (point it to some other branch, or
forward the branch pointed to), they'd be expected to lock HEAD and
would fail as we have the lock?



 As dereferencing a symbolic ref in the forward direction is much
 cheaper than in the reverse, and because you need to dereference it
 anyway, I wonder if we want the endgame to be hold locks on both,
 not just hold a lock on the symlink.

That would be the safest option indeed.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Michael Haggerty
On 02/13/2015 08:12 PM, Junio C Hamano wrote:
 [...]
 As we are trying to see a way to move forward to do the right thing
 around reflog, I was wondering if locking only the symbolic ref is a
 sensible endgame.  The right thing being:
 
When a symbolic ref S points at underlying ref R, and if R's tip
changes from X to Y, we would want to know from the reflog of S
that S used to point at X and then changed to point at Y.

Let's first talk about an ideal world if we had complete support for
symbolic references.

Yes, I agree with your principle. Moreover, suppose that S and S2 *both*
point at R, and there is a third symref S3 that points at symref S
(symbolic refs can point at other symbolic refs!):

X - R - S - S3
\
 S2

Now, if R updated from X to Y (regardless of whether the update is via R
directly or via one of the symrefs), then each of the four reflogs (R,
S, S2, and S3) should gain a new entry reflecting the update.

If S is reseated to point at R2 instead of R, then the reflogs for S and
for S3 should each gain new entries

What locks should we hold? In my opinion, we should hold the locks on
exactly those references (symbolic or regular) whose reflogs we want to
change. So in the first example, we should hold

$GIT_DIR/$R.lock
$GIT_DIR/$S.lock
$GIT_DIR/$S2.lock, and
$GIT_DIR/$S3.lock

Ideally, we should acquire all of the locks before making any modifications.


Now back to the real world. Currently, if R is changed *through* a
symbolic reference S, then the reflogs for both R and S are updated, but
not the reflogs for any other symbolic references that might point at R.
If R is changed directly, then no symref's reflogs are affected, except
for the special case that HEAD's reflog is changed if it points directly
at R. This limitation is a hack to avoid having to walk symrefs
backwards to find any symrefs that might be pointing at R.

If a symref is reseated, then its reflog is changed but not that of any
symrefs that might be pointed at it.

It might actually not be extremely expensive to follow symrefs
backwards. Symbolic references cannot be packed, so we would only have
to scan the loose references; we could ignore packed refs. But it would
still be a lot more expensive than just updating one file. I don't know
that it's worth it, given that symbolic references are used so sparingly.

I think that the rule about locks as expressed above can be carried over
the the real world:

We should hold the locks on exactly those references (symbolic
or regular) whose reflogs we plan to change. We should acquire all
of the locks before making any changes.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Junio C Hamano
Stefan Beller sbel...@google.com writes:

 On Fri, Feb 13, 2015 at 10:26 AM, Junio C Hamano gits...@pobox.com wrote:
 Stefan Beller sbel...@google.com writes:

 On Fri, Feb 13, 2015 at 10:05 AM, Junio C Hamano gits...@pobox.com wrote:

 We convinced ourselves that not locking the symref is wrong, but
 have we actually convinced us that not locking the underlying ref,
 as long as we have a lock on the symref, is safe?

 To protect you, the holder of a lock on refs/remotes/origin/HEAD
 that happens to point at refs/remotes/origin/next, from somebody who
 is updating the underlying refs/remotes/origin/next directly without
 going through the symbolic ref (e.g. receive-pack), wouldn't the
 other party need to find any and all symbolic refs that point at the
 underlying ref and take locks on them?

 As we're just modifying the ref log of HEAD in this case, we don't bother
 with where the HEAD points to. The other party may change
 refs/remotes/origin/next without us noticing, but we don't care here as
 all we do is rewriting the ref log for HEAD.

 If the other party were to modify HEAD (point it to some other branch, or
 forward the branch pointed to), they'd be expected to lock HEAD and
 would fail as we have the lock?

 I was not talking about HEAD in what you are responding to, though.
 Don't we maintain a reflog on refs/remotes/origin/HEAD?  Is it OK to
 allow its entries to become inconsistent with the underlying ref?

 Yes we do have a HEAD ref log, and that ref log would differ from
 the underlying ref log. I don't know if that is desired, but I
 would tend to not like it.

HEAD (or refs/remotes/origin/HEAD) reflog and reflog for
refs/heads/master (or refs/remotes/origin/next) would have to be
different as long as we allow symbolic refs to be repointed to
different refs.  If HEAD refers to 'next' today, and at the tip of
next sits commit X, the reflog for both of them would record the
fact that they were pointing at X.  If you repoint HEAD to point at
'master' (e.g. git checkout master) whose tip is at commit Y, then
reflog for HEAD would record the fact that now HEAD points at Y, and
reflogs for 'master' or 'next' would not change merely because you
switched where HEAD points at.

And there is anything to like or not to like about it.

As we are trying to see a way to move forward to do the right thing
around reflog, I was wondering if locking only the symbolic ref is a
sensible endgame.  The right thing being:

   When a symbolic ref S points at underlying ref R, and if R's tip
   changes from X to Y, we would want to know from the reflog of S
   that S used to point at X and then changed to point at Y.

Replace S and R with either (HEAD, refs/heads/master) or
(refs/remotes/origin/HEAD, refs/remotes/origin/next) in the above
and we want both to be true.

How best to achieve that, and what is the set of right ref(s) to
take lock(s) on?  I am not very much interested in how incorrect
today's code might behave.  That is not the central point when
discussing what is the best way forward.

 So the other party updating the underlying ref would also need to lock
 HEAD then,

Yes, that is what I meant.  Your also can be read in two different
ways (other party, too or HEAD, too), though, and I think we
want both ;-).  That is why I hinted that it was iffy to state that
we only have to take the lock only on S and not on R, but only as a
workaround to keep older implementation out we take both---once they
get extinct we can get away with by taking a lock only on S.

When pushing to update 'master' and 'next' into a repository whose
'HEAD' points at 'master', we would want to take locks on 'next' (no
question), but is it sensible to take the lock on 'HEAD' and
deliberately leave 'master' unlocked?  Or is it more sensible to
take all locks on the underlying refs involved (i.e. 'next' and
'master') and in addition any symbolic refs that are pointing at
these refs involved (i.e. 'HEAD')?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Junio C Hamano
Michael Haggerty mhag...@alum.mit.edu writes:

 Now back to the real world. Currently, if R is changed *through* a
 symbolic reference S, then the reflogs for both R and S are updated, but
 not the reflogs for any other symbolic references that might point at R.
 If R is changed directly, then no symref's reflogs are affected, except
 for the special case that HEAD's reflog is changed if it points directly
 at R. This limitation is a hack to avoid having to walk symrefs
 backwards to find any symrefs that might be pointing at R.

Yup.

 It might actually not be extremely expensive to follow symrefs
 backwards. Symbolic references cannot be packed, so we would only have
 to scan the loose references; we could ignore packed refs. But it would
 still be a lot more expensive than just updating one file. I don't know
 that it's worth it, given that symbolic references are used so sparingly.

I personally do not think it is worth it.  I further think that it
would be perfectly OK to do one of the following:

- We only maintain reflogs for $GIT_DIR/HEAD; no other symrefs
  get their own reflog, and we only check $GIT_DIR/HEAD when
  updating refs/heads/* and no other refs for direct reference
  (i.e. HEAD - refs/something/else - refs/heads/master symref
  chain is ignored).

- In addition to the above, we also maintain reflogs for
  $GIT_DIR/refs/remotes/*/HEAD but support only when they
  directly point into a remote tracking branch in the same
  hierarchy.  $GIT_DIR/refs/remotes/foo/HEAD that points at
  $GIT_DIR/refs/remotes/bar/master is ignored and will get an
  undefined behaviour.

 I think that the rule about locks as expressed above can be carried over
 the the real world:

 We should hold the locks on exactly those references (symbolic
 or regular) whose reflogs we plan to change. We should acquire all
 of the locks before making any changes.

Sure.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-13 Thread Michael Haggerty
On 02/13/2015 10:53 PM, Junio C Hamano wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:
 
 Now back to the real world. Currently, if R is changed *through* a
 symbolic reference S, then the reflogs for both R and S are updated, but
 not the reflogs for any other symbolic references that might point at R.
 If R is changed directly, then no symref's reflogs are affected, except
 for the special case that HEAD's reflog is changed if it points directly
 at R. This limitation is a hack to avoid having to walk symrefs
 backwards to find any symrefs that might be pointing at R.
 
 Yup.
 
 It might actually not be extremely expensive to follow symrefs
 backwards. Symbolic references cannot be packed, so we would only have
 to scan the loose references; we could ignore packed refs. But it would
 still be a lot more expensive than just updating one file. I don't know
 that it's worth it, given that symbolic references are used so sparingly.
 
 I personally do not think it is worth it.  I further think that it
 would be perfectly OK to do one of the following:
 
 - We only maintain reflogs for $GIT_DIR/HEAD; no other symrefs
   get their own reflog, and we only check $GIT_DIR/HEAD when
   updating refs/heads/* and no other refs for direct reference
   (i.e. HEAD - refs/something/else - refs/heads/master symref
   chain is ignored).
 
 - In addition to the above, we also maintain reflogs for
   $GIT_DIR/refs/remotes/*/HEAD but support only when they
   directly point into a remote tracking branch in the same
   hierarchy.  $GIT_DIR/refs/remotes/foo/HEAD that points at
   $GIT_DIR/refs/remotes/bar/master is ignored and will get an
   undefined behaviour.

Yes. The first is approximately the status quo, except that you would
like explicitly to *suppress* creating reflogs for symbolic refs other
than HEAD even if a reference is altered via the symbolic ref.

The second makes sense, though I think reflogs for remote HEADs are far
less useful than those for HEAD. So I think this is a low-priority project.

 I think that the rule about locks as expressed above can be carried over
 the the real world:

 We should hold the locks on exactly those references (symbolic
 or regular) whose reflogs we plan to change. We should acquire all
 of the locks before making any changes.
 
 Sure.

I forgot to mention that if we want to retain lock-compatibility with
older clients, then we *also* need to lock the reference pointed at by a
symbolic ref when modifying the symbolic ref's reflog. This is often
implied by the previous rule, but not when we reseat a symbolic
reference or when we expire a symbolic reference's reflog.

I will look at how hard this is to implement. If it is at all involved,
then I might drop this patch from the current patch series and defer it
to another one.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-12 Thread Stefan Beller
On Thu, Feb 12, 2015 at 8:52 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 On 02/12/2015 12:25 AM, Stefan Beller wrote:
 On Wed, Feb 11, 2015 at 2:49 PM, Junio C Hamano gits...@pobox.com wrote:
 Stefan Beller sbel...@google.com writes:

 On Mon, Feb 9, 2015 at 1:12 AM, Michael Haggerty mhag...@alum.mit.edu 
 wrote:
 When processing the reflog of a symbolic ref, hold the lock on the
 symbolic reference itself, not on the reference that it points to.

 I am not sure if that makes sense.
 So when expiring HEAD, you want to have a .git/HEAD.lock file
 instead of a .git/refs/heads/master.lock file?

 What would happen if there is a concurrent modification
 to the master branch?

 The HEAD may be pointing at 'master' and the other party that is
 trying to modify it would fail when it tries to update the reflog
 for HEAD thanks to HEAD.lock being held by us.  The HEAD may be
 pointing at 'next' and the other part that updates 'master' would
 not try to modify HEAD reflog and we do not conflict.

 At least, I think that is the rationale behind this change.

 That makes sense! Do we have documentation on symrefs?

 Originally I was wondering if this would make things
 complicated for  symbolic branches which are not HEAD.
 Then you could update the branch pointed to, because it
 has no lock as the lock is on the symref. On the other hand
 this seems to be an improvement, as you cannot move the
 symref itself, as it has the lock and we don't really have other
 symrefs?

 The convention is that holding lock $GIT_DIR/$refname.lock (where
 $refname might be, for example, HEAD or refs/heads/master) protects
 two things:

 * The loose-reference file $GIT_DIR/$refname
 * The reflog file $GIT_DIR/logs/$refname

 And this makes sense:

 Suppose that HEAD is refs/heads/master. These two thing have independent
 reflogs, so there is no reason that one process can't be expiring the
 reflog of HEAD while another expires the reflog of refs/heads/master.

 The behavior before this patch was that the reflog for HEAD was
 modified while holding the reflog for refs/heads/master. This is too
 strict and would make those two processes contend unnecessarily.

 I can't think of a reason that the current behavior is unsafe. But it's
 more restrictive than necessary, and more confusing than necessary. My
 guess is that it was unintended (i.e., a bug). It dates from

 68db31cc28 (2007-05-09) git-update-ref: add --no-deref option for
 overwriting/detaching ref

 which initially added the REF_NODEREF constant and probably forgot that
 the new flag should be used in this invocation.

 However, another important question is whether other Git implementations
 have copied this unusual locking policy. If so, that would be a reason
 not to change it. I just pinged the libgit2 maintainer to find out their
 policy. Maybe you could find out about JGit?

I could not really find reflog expiration for jgit for a while, but
then I stumbled upon this:

// TODO: implement reflog_expire(pm, repo);

so I think it's safe to change it in git.git for now.

Reviewed-by: Stefan Beller sbel...@google.com


 Michael

 --
 Michael Haggerty
 mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-12 Thread Michael Haggerty
On 02/12/2015 12:25 AM, Stefan Beller wrote:
 On Wed, Feb 11, 2015 at 2:49 PM, Junio C Hamano gits...@pobox.com wrote:
 Stefan Beller sbel...@google.com writes:

 On Mon, Feb 9, 2015 at 1:12 AM, Michael Haggerty mhag...@alum.mit.edu 
 wrote:
 When processing the reflog of a symbolic ref, hold the lock on the
 symbolic reference itself, not on the reference that it points to.

 I am not sure if that makes sense.
 So when expiring HEAD, you want to have a .git/HEAD.lock file
 instead of a .git/refs/heads/master.lock file?

 What would happen if there is a concurrent modification
 to the master branch?

 The HEAD may be pointing at 'master' and the other party that is
 trying to modify it would fail when it tries to update the reflog
 for HEAD thanks to HEAD.lock being held by us.  The HEAD may be
 pointing at 'next' and the other part that updates 'master' would
 not try to modify HEAD reflog and we do not conflict.

 At least, I think that is the rationale behind this change.
 
 That makes sense! Do we have documentation on symrefs?
 
 Originally I was wondering if this would make things
 complicated for  symbolic branches which are not HEAD.
 Then you could update the branch pointed to, because it
 has no lock as the lock is on the symref. On the other hand
 this seems to be an improvement, as you cannot move the
 symref itself, as it has the lock and we don't really have other
 symrefs?

The convention is that holding lock $GIT_DIR/$refname.lock (where
$refname might be, for example, HEAD or refs/heads/master) protects
two things:

* The loose-reference file $GIT_DIR/$refname
* The reflog file $GIT_DIR/logs/$refname

And this makes sense:

Suppose that HEAD is refs/heads/master. These two thing have independent
reflogs, so there is no reason that one process can't be expiring the
reflog of HEAD while another expires the reflog of refs/heads/master.

The behavior before this patch was that the reflog for HEAD was
modified while holding the reflog for refs/heads/master. This is too
strict and would make those two processes contend unnecessarily.

I can't think of a reason that the current behavior is unsafe. But it's
more restrictive than necessary, and more confusing than necessary. My
guess is that it was unintended (i.e., a bug). It dates from

68db31cc28 (2007-05-09) git-update-ref: add --no-deref option for
overwriting/detaching ref

which initially added the REF_NODEREF constant and probably forgot that
the new flag should be used in this invocation.

However, another important question is whether other Git implementations
have copied this unusual locking policy. If so, that would be a reason
not to change it. I just pinged the libgit2 maintainer to find out their
policy. Maybe you could find out about JGit?

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-11 Thread Stefan Beller
On Wed, Feb 11, 2015 at 2:49 PM, Junio C Hamano gits...@pobox.com wrote:
 Stefan Beller sbel...@google.com writes:

 On Mon, Feb 9, 2015 at 1:12 AM, Michael Haggerty mhag...@alum.mit.edu 
 wrote:
 When processing the reflog of a symbolic ref, hold the lock on the
 symbolic reference itself, not on the reference that it points to.

 I am not sure if that makes sense.
 So when expiring HEAD, you want to have a .git/HEAD.lock file
 instead of a .git/refs/heads/master.lock file?

 What would happen if there is a concurrent modification
 to the master branch?

 The HEAD may be pointing at 'master' and the other party that is
 trying to modify it would fail when it tries to update the reflog
 for HEAD thanks to HEAD.lock being held by us.  The HEAD may be
 pointing at 'next' and the other part that updates 'master' would
 not try to modify HEAD reflog and we do not conflict.

 At least, I think that is the rationale behind this change.

That makes sense! Do we have documentation on symrefs?

Originally I was wondering if this would make things
complicated for  symbolic branches which are not HEAD.
Then you could update the branch pointed to, because it
has no lock as the lock is on the symref. On the other hand
this seems to be an improvement, as you cannot move the
symref itself, as it has the lock and we don't really have other
symrefs?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-11 Thread Junio C Hamano
Stefan Beller sbel...@google.com writes:

 On Mon, Feb 9, 2015 at 1:12 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 When processing the reflog of a symbolic ref, hold the lock on the
 symbolic reference itself, not on the reference that it points to.

 I am not sure if that makes sense.
 So when expiring HEAD, you want to have a .git/HEAD.lock file
 instead of a .git/refs/heads/master.lock file?

 What would happen if there is a concurrent modification
 to the master branch?

The HEAD may be pointing at 'master' and the other party that is
trying to modify it would fail when it tries to update the reflog
for HEAD thanks to HEAD.lock being held by us.  The HEAD may be
pointing at 'next' and the other part that updates 'master' would
not try to modify HEAD reflog and we do not conflict.

At least, I think that is the rationale behind this change.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-10 Thread Stefan Beller
On Mon, Feb 9, 2015 at 1:12 AM, Michael Haggerty mhag...@alum.mit.edu wrote:
 When processing the reflog of a symbolic ref, hold the lock on the
 symbolic reference itself, not on the reference that it points to.

I am not sure if that makes sense.
So when expiring HEAD, you want to have a .git/HEAD.lock file
instead of a .git/refs/heads/master.lock file?

What would happen if there is a concurrent modification
to the master branch?


 Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
 ---
  refs.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/refs.c b/refs.c
 index 1b2a497..3fcf342 100644
 --- a/refs.c
 +++ b/refs.c
 @@ -4037,7 +4037,7 @@ int reflog_expire(const char *refname, const unsigned 
 char *sha1,
  * reference itself, plus we might need to update the
  * reference if --updateref was specified:
  */
 -   lock = lock_ref_sha1_basic(refname, sha1, NULL, 0, type);
 +   lock = lock_ref_sha1_basic(refname, sha1, NULL, REF_NODEREF, type);
 if (!lock)
 return error(cannot lock ref '%s', refname);
 if (!reflog_exists(refname)) {
 --
 2.1.4

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 8/8] reflog_expire(): lock symbolic refs themselves, not their referent

2015-02-09 Thread Michael Haggerty
When processing the reflog of a symbolic ref, hold the lock on the
symbolic reference itself, not on the reference that it points to.

Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
---
 refs.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/refs.c b/refs.c
index 1b2a497..3fcf342 100644
--- a/refs.c
+++ b/refs.c
@@ -4037,7 +4037,7 @@ int reflog_expire(const char *refname, const unsigned 
char *sha1,
 * reference itself, plus we might need to update the
 * reference if --updateref was specified:
 */
-   lock = lock_ref_sha1_basic(refname, sha1, NULL, 0, type);
+   lock = lock_ref_sha1_basic(refname, sha1, NULL, REF_NODEREF, type);
if (!lock)
return error(cannot lock ref '%s', refname);
if (!reflog_exists(refname)) {
-- 
2.1.4

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html