Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-09 Thread Jeff King
On Tue, Dec 09, 2014 at 10:47:24AM -0800, Junio C Hamano wrote:

> Michael Haggerty  writes:
> 
> > This isn't documented very well. I thought I saw a comment somewhere in
> > the code that stated it explicitly, but I can't find it now. In any
> > case, my understanding of the locking protocol for reflogs is:
> >
> > The reflog for "$refname", which is stored at
> > "$GIT_DIR/logs/$refname", is locked by holding
> > "$GIT_DIR/refs/$refname.lock", *even if the corresponding
> > reference is packed*.
> >
> > This implies that readers, who don't pay attention to locks, have to be
> > prepared for the possibility that the reflog is in the middle of an
> > update and that the last line is incomplete. This is handled by
> > show_one_reflog_ent(), which discards incomplete lines.
> 
> Interesting, as I think I saw Peff did something around that area
> recently.

Yeah, and I had no idea about the truncated-line race which Michael
described at the time. It makes me glad that I took the time to do the
more careful thing[1] (fixing the reverse-reflog parser to properly
include the newline when present) and not the quick-and-easy thing[2]
(teaching show_one_reflog_ent to accept entries without newlines).

What you have queued in jk/for-each-reflog-ent-reverse should be fine,
even in light of what Michael said.

-Peff

[1] http://article.gmane.org/gmane.comp.version-control.git/260849

[2] http://article.gmane.org/gmane.comp.version-control.git/260807
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-09 Thread Junio C Hamano
Michael Haggerty  writes:

> This isn't documented very well. I thought I saw a comment somewhere in
> the code that stated it explicitly, but I can't find it now. In any
> case, my understanding of the locking protocol for reflogs is:
>
> The reflog for "$refname", which is stored at
> "$GIT_DIR/logs/$refname", is locked by holding
> "$GIT_DIR/refs/$refname.lock", *even if the corresponding
> reference is packed*.
>
> This implies that readers, who don't pay attention to locks, have to be
> prepared for the possibility that the reflog is in the middle of an
> update and that the last line is incomplete. This is handled by
> show_one_reflog_ent(), which discards incomplete lines.

Interesting, as I think I saw Peff did something around that area
recently.

I have some more thought around the "transaction" in general, but
it will be in a separate message.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-08 Thread Michael Haggerty
On 12/05/2014 01:23 AM, Jonathan Nieder wrote:
> Michael Haggerty wrote:
> 
>> We don't actually need the locking functionality, because we already
>> hold the lock on the reference itself, which is how the reflog file is
>> locked. But the lock_file code still does some of the bookkeeping for
>> us and is more careful than the old code here was.
> 
> As you say, the ref lock takes care of mutual exclusion, so we do not
> have to be too careful about compatibility with other tools that might
> not know to lock the reflog.  And this is not tying our hands for a
> future when I might want to lock logs/refs/heads/topic/1 while
> logs/refs/heads/topic still exists as part of the implementation of
> "git mv topic/1 topic".
> 
> Stefan and I had forgotten about that guarantee when looking at that
> kind of operation --- thanks for the reminder.

This reminder is important (and forgettable) enough that I will add a
comment within the function explaining it.

> Should updates to the HEAD reflog acquire HEAD.lock?  (They don't
> currently.)

Yes, they should; good catch. I assume that you are referring to the
code at the bottom of write_ref_sha1()? Or did you find a problem in
this patch series?

If the former, then I propose that we address this bug in a separate
patch series.

> [...]
>> --- a/builtin/reflog.c
>> +++ b/builtin/reflog.c
>> @@ -349,12 +349,14 @@ static int push_tip_to_list(const char *refname, const 
>> unsigned char *sha1, int
>>  return 0;
>>  }
>>  
>> +static struct lock_file reflog_lock;
> 
> If this lockfile is only used in that one function, it can be declared
> inside the function.
> 
> If it is meant to be used throughout the 'git reflog' command, then it
> can go near the top of the file.

For now it is only used within this function, so I will move it into the
function as you suggest. (As you know, it does need to remain static,
because of the way the lock_file module takes over ownership of these
objects.)

>> +
>>  static int expire_reflog(const char *refname, const unsigned char *sha1, 
>> void *cb_data)
>>  {
>>  struct cmd_reflog_expire_cb *cmd = cb_data;
>>  struct expire_reflog_cb cb;
>>  struct ref_lock *lock;
>> -char *log_file, *newlog_path = NULL;
>> +char *log_file;
>>  struct commit *tip_commit;
>>  struct commit_list *tips;
>>  int status = 0;
>> @@ -372,10 +374,14 @@ static int expire_reflog(const char *refname, const 
>> unsigned char *sha1, void *c
>>  unlock_ref(lock);
>>  return 0;
>>  }
>> +
>>  log_file = git_pathdup("logs/%s", refname);
>>  if (!cmd->dry_run) {
>> -newlog_path = git_pathdup("logs/%s.lock", refname);
>> -cb.newlog = fopen(newlog_path, "w");
>> +if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0)
>> +goto failure;
> 
> hold_lock_file_for_update doesn't print a message.  Code to print one
> looks like
> 
>   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0) {
>   unable_to_lock_message(log_file, errno, &err);
>   error("%s", err.buf);
>   goto failure;
>   }

Thanks; will add.

> (A patch in flight changes that to
> 
>   if (hold_lock_file_for_update(&reflog_lock, log_file, 0, &err) < 0) {
>   error("%s", err.buf);
>   goto failure;
>   }
> 
> )

Thanks for the heads-up. The compiler will complain when the branches
are merged, and hopefully the fix will be obvious.

>> +cb.newlog = fdopen_lock_file(&reflog_lock, "w");
>> +if (!cb.newlog)
>> +goto failure;
> 
> Hm.  lockfile.c::fdopen_lock_file ought to use xfdopen to make this
> case impossible.  And xfdopen should use try_to_free_routine() and
> try again on failure.

That sounds reasonable, but it is not manifestly obvious given that at
least one caller of fdopen_lock_file() (in fast-import.c) tries to
recover if fdopen_lock_file() fails. Let's address this in a separate
patch series if that is OK with you. For now I will add explicit
error-reporting code here before "goto failure".

> [...]
>> @@ -423,10 +429,9 @@ static int expire_reflog(const char *refname, const 
>> unsigned char *sha1, void *c
>>  }
>>  
>>  if (cb.newlog) {
>> -if (fclose(cb.newlog)) {
>> -status |= error("%s: %s", strerror(errno),
>> -newlog_path);
>> -unlink(newlog_path);
>> +if (close_lock_file(&reflog_lock)) {
>> +status |= error("Couldn't write %s: %s", log_file,
>> +strerror(errno));
> 
> Style nit: error messages usually start with a lowercase letter
> (though I realize nearby examples are already inconsistent).

Thanks; will fix.

> commit_lock_file() can take care of the close_lock_file automatically.

The existing code is a tiny bit safer: first make sure both files can be
written, *then* ren

Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-08 Thread Michael Haggerty
On 12/05/2014 03:59 AM, ronnie sahlberg wrote:
> On Thu, Dec 4, 2014 at 3:08 PM, Michael Haggerty  wrote:
>> We don't actually need the locking functionality, because we already
>> hold the lock on the reference itself,
> 
> No. You do need the lock.
> The ref is locked only during transaction_commit()
> 
> If you don't want to lock the reflog file and instead rely on the lock
> on the ref itself you will need to
> rework your patches so that the lock on the ref is taken already
> during, for example, transaction_update_ref() instead.
> 
> But without doing those changes and moving the ref locking from
> _commit() to _update_ref() you will risk reflog corruption/surprises
> if two operations collide and both rewrite the reflog without any lock held.

Ronnie, I don't understand your comments.

It is a statement of fact (to the best of my knowledge) that reflogs are
supposed to be modified only under a lock on the corresponding
reference, namely "$GIT_DIR/refs/$refname.lock". We do not require
reflog writers to hold "$GIT_DIR/logs/$refname.lock".

In this function, "$GIT_DIR/logs/$refname.lock" happens to be the name
of the temporary file being used to stage the new contents of the
reflog. But that is more or less a coincidence; we could call the
temporary file whatever we want because it has no locking implications.
However, what we want to do with the file in this code path (write a new
version then rename the new version on top of the old version, deleting
the temporary file if the program is interrupted) is the same as what we
do with lockfiles, so we use the lockfile code because it is convenient.

This patch series has nothing to do with ref_transaction_commit() or any
of the transaction machinery. It has to do with expire_reflog(), which
is invoked outside of any transaction.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-08 Thread Michael Haggerty
On 12/05/2014 03:19 AM, Stefan Beller wrote:
> On Thu, Dec 04, 2014 at 04:23:31PM -0800, Jonathan Nieder wrote:
>> Michael Haggerty wrote:
>>
>>> We don't actually need the locking functionality, because we already
>>> hold the lock on the reference itself, which is how the reflog file is
>>> locked. But the lock_file code still does some of the bookkeeping for
>>> us and is more careful than the old code here was.
>>
>> As you say, the ref lock takes care of mutual exclusion, so we do not
>> have to be too careful about compatibility with other tools that might
>> not know to lock the reflog.  And this is not tying our hands for a
>> future when I might want to lock logs/refs/heads/topic/1 while
>> logs/refs/heads/topic still exists as part of the implementation of
>> "git mv topic/1 topic".
>>
>> Stefan and I had forgotten about that guarantee when looking at that
>> kind of operation --- thanks for the reminder.
> 
> I did not forget about it, I did not know about that in the first hand.
> We don't seem to have documentation on it?
> 
> So sorry for heading in a direction, which would have been avoidable.

This isn't documented very well. I thought I saw a comment somewhere in
the code that stated it explicitly, but I can't find it now. In any
case, my understanding of the locking protocol for reflogs is:

The reflog for "$refname", which is stored at
"$GIT_DIR/logs/$refname", is locked by holding
"$GIT_DIR/refs/$refname.lock", *even if the corresponding
reference is packed*.

This implies that readers, who don't pay attention to locks, have to be
prepared for the possibility that the reflog is in the middle of an
update and that the last line is incomplete. This is handled by
show_one_reflog_ent(), which discards incomplete lines.

This protocol avoids the need to rewrite the reflog from scratch for
each reference update.

Given how poorly-documented this point is, I wonder whether other
implementations of Git (e.g., libgit2, JGit, Dulwich, ...) got it right.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-05 Thread Junio C Hamano
Stefan Beller  writes:

>>> After the series completes, this lock is only used in reflog_expire.
>>> So I'd rather move it inside the function? Then we could run the 
>>> reflog_expire
>>> function in parallel for different locks in theory?
>>
>> I am not sure about the "parallel" part, but I would imagine that it
>> is an essential prerequisite to move this outside the "client" code
>> if we want to later replace the backing storage of refs and reflogs
>> outside the filesystem, so from that point of view,  I think the
>> suggestion makes sense.
>>
>> Thanks.
>>
>
> Sorry for the confusion. With parallel I mean,...

There is no confusion.  I understand exactly what you meant.

What I said was not sure was if "parallel" is a practical enough
possiblity to include into the set of value propositions the
suggested change to move the lock out of the "client" may give us.

In other words, "With this change, doing a parallel will become a
lot easier"---"Really?  It probably is not one of the harder part of
the problem if you really want to go parallel" was the discourse I
had in my mind.

;-)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-05 Thread Stefan Beller
On Fri, Dec 5, 2014 at 11:32 AM, Junio C Hamano  wrote:
> Stefan Beller  writes:
>
>>> > +static struct lock_file reflog_lock;
>>>
>>> If this lockfile is only used in that one function, it can be declared
>>> inside the function.
>>>
>>> If it is meant to be used throughout the 'git reflog' command, then it
>>> can go near the top of the file.
>>>
>>
>> After the series completes, this lock is only used in reflog_expire.
>> So I'd rather move it inside the function? Then we could run the 
>> reflog_expire
>> function in parallel for different locks in theory?
>
> I am not sure about the "parallel" part, but I would imagine that it
> is an essential prerequisite to move this outside the "client" code
> if we want to later replace the backing storage of refs and reflogs
> outside the filesystem, so from that point of view,  I think the
> suggestion makes sense.
>
> Thanks.
>

Sorry for the confusion. With parallel I mean, that in theory we could have
multiple threads running reflog expire at the same time in the same program.
Having the "static struct lock_file reflog_lock;" around, we can only
process one
reflog at a time, as that is holding the lock_file struct.

I am not saying we want to go multi-threading any time soon if at all.
Just that it would
be easier to do, if not having the lock file as a file-global variable
instead of a
variable inside a function.

Thanks,
Stefan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-05 Thread Junio C Hamano
Stefan Beller  writes:

>> > +static struct lock_file reflog_lock;
>> 
>> If this lockfile is only used in that one function, it can be declared
>> inside the function.
>> 
>> If it is meant to be used throughout the 'git reflog' command, then it
>> can go near the top of the file.
>> 
>
> After the series completes, this lock is only used in reflog_expire.
> So I'd rather move it inside the function? Then we could run the reflog_expire
> function in parallel for different locks in theory?

I am not sure about the "parallel" part, but I would imagine that it
is an essential prerequisite to move this outside the "client" code
if we want to later replace the backing storage of refs and reflogs
outside the filesystem, so from that point of view,  I think the
suggestion makes sense.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-05 Thread Stefan Beller
On Thu, Dec 04, 2014 at 04:23:31PM -0800, Jonathan Nieder wrote:
> Michael Haggerty wrote:
> 
> > We don't actually need the locking functionality, because we already
> > hold the lock on the reference itself, which is how the reflog file is
> > locked. But the lock_file code still does some of the bookkeeping for
> > us and is more careful than the old code here was.
> 
> As you say, the ref lock takes care of mutual exclusion, so we do not
> have to be too careful about compatibility with other tools that might
> not know to lock the reflog.  And this is not tying our hands for a
> future when I might want to lock logs/refs/heads/topic/1 while
> logs/refs/heads/topic still exists as part of the implementation of
> "git mv topic/1 topic".
> 
> Stefan and I had forgotten about that guarantee when looking at that
> kind of operation --- thanks for the reminder.
> 
> Should updates to the HEAD reflog acquire HEAD.lock?  (They don't
> currently.)
> 
> [...]
> > --- a/builtin/reflog.c
> > +++ b/builtin/reflog.c
> > @@ -349,12 +349,14 @@ static int push_tip_to_list(const char *refname, 
> > const unsigned char *sha1, int
> > return 0;
> >  }
> >  
> > +static struct lock_file reflog_lock;
> 
> If this lockfile is only used in that one function, it can be declared
> inside the function.
> 
> If it is meant to be used throughout the 'git reflog' command, then it
> can go near the top of the file.
> 

After the series completes, this lock is only used in reflog_expire.
So I'd rather move it inside the function? Then we could run the reflog_expire
function in parallel for different locks in theory?

> > +
> >  static int expire_reflog(const char *refname, const unsigned char *sha1, 
> > void *cb_data)
> >  {
> > struct cmd_reflog_expire_cb *cmd = cb_data;
> > struct expire_reflog_cb cb;
> > struct ref_lock *lock;
> > -   char *log_file, *newlog_path = NULL;
> > +   char *log_file;
> > struct commit *tip_commit;
> > struct commit_list *tips;
> > int status = 0;
> > @@ -372,10 +374,14 @@ static int expire_reflog(const char *refname, const 
> > unsigned char *sha1, void *c
> > unlock_ref(lock);
> > return 0;
> > }
> > +
> > log_file = git_pathdup("logs/%s", refname);
> > if (!cmd->dry_run) {
> > -   newlog_path = git_pathdup("logs/%s.lock", refname);
> > -   cb.newlog = fopen(newlog_path, "w");
> > +   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0)
> > +   goto failure;
> 
> hold_lock_file_for_update doesn't print a message.  Code to print one
> looks like
> 
>   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0) {
>   unable_to_lock_message(log_file, errno, &err);
>   error("%s", err.buf);
>   goto failure;
>   }
> 
> (A patch in flight changes that to
> 
>   if (hold_lock_file_for_update(&reflog_lock, log_file, 0, &err) < 0) {
>   error("%s", err.buf);
>   goto failure;
>   }
> 
> )
> 
> > +   cb.newlog = fdopen_lock_file(&reflog_lock, "w");
> > +   if (!cb.newlog)
> > +   goto failure;
> 
> Hm.  lockfile.c::fdopen_lock_file ought to use xfdopen to make this
> case impossible.  And xfdopen should use try_to_free_routine() and
> try again on failure.
> 
> [...]
> > @@ -423,10 +429,9 @@ static int expire_reflog(const char *refname, const 
> > unsigned char *sha1, void *c
> > }
> >  
> > if (cb.newlog) {
> > -   if (fclose(cb.newlog)) {
> > -   status |= error("%s: %s", strerror(errno),
> > -   newlog_path);
> > -   unlink(newlog_path);
> > +   if (close_lock_file(&reflog_lock)) {
> > +   status |= error("Couldn't write %s: %s", log_file,
> > +   strerror(errno));
> 
> Style nit: error messages usually start with a lowercase letter
> (though I realize nearby examples are already inconsistent).
> 
> commit_lock_file() can take care of the close_lock_file automatically.
> 
> [...]
> > @@ -434,21 +439,23 @@ static int expire_reflog(const char *refname, const 
> > unsigned char *sha1, void *c
> >  close_ref(lock) < 0)) {
> > status |= error("Couldn't write %s",
> > lock->lk->filename.buf);
> > -   unlink(newlog_path);
> > -   } else if (rename(newlog_path, log_file)) {
> > -   status |= error("cannot rename %s to %s",
> > -   newlog_path, log_file);
> > -   unlink(newlog_path);
> > +   rollback_lock_file(&reflog_lock);
> > +   } else if (commit_lock_file(&reflog_lock)) {
> > +   status |= error("cannot rename %s.lock to %s",
> > +   log_file, log_file);
> 
> Most callers say "unable to commit reflog '%s'", log_file to hedge 

Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-05 Thread Stefan Beller
On Thu, Dec 04, 2014 at 06:51:36PM -0800, ronnie sahlberg wrote:
> On Thu, Dec 4, 2014 at 3:08 PM, Michael Haggerty 
> wrote:
> 
> > We don't actually need the locking functionality, because we already
> > hold the lock on the reference itself,
> 
> 
> No. You do need the lock.
> The ref is locked only during transaction_commit()
> 

Michael is saying, that you never want to modify the reflog, if you're not 
holding
the lock on the corresponding ref. Or in other words, you may modify the 
reflog, if
the corresponding ref is held. This used to be this way back then. 

> If you don't want to lock the reflog file and instead rely on the lock on
> the ref itself you will need to
> rework your patches so that the lock on the ref is taken already during,
> for example, transaction_update_ref() instead.
> 
> But without doing those changes and moving the ref locking from _commit()
> to _update_ref() you will risk reflog corruption/surprises
> if two operations collide and both rewrite the reflog without any lock held.
> 
> 
> 
> 
> 
> > which is how the reflog file is
> > locked. But the lock_file code still does some of the bookkeeping for
> > us and is more careful than the old code here was.
> >
> > For example:
> >
> > * Correctly handle the case that the reflog lock file already exists
> >   for some reason or cannot be opened.
> >
> > * Correctly clean up the lockfile if the program dies.
> >
> > Signed-off-by: Michael Haggerty 
> > ---
> >  builtin/reflog.c | 37 ++---
> >  1 file changed, 22 insertions(+), 15 deletions(-)
> >
> > diff --git a/builtin/reflog.c b/builtin/reflog.c
> > index a282e60..d344d45 100644
> > --- a/builtin/reflog.c
> > +++ b/builtin/reflog.c
> > @@ -349,12 +349,14 @@ static int push_tip_to_list(const char *refname,
> > const unsigned char *sha1, int
> > return 0;
> >  }
> >
> > +static struct lock_file reflog_lock;
> > +
> >  static int expire_reflog(const char *refname, const unsigned char *sha1,
> > void *cb_data)
> >  {
> > struct cmd_reflog_expire_cb *cmd = cb_data;
> > struct expire_reflog_cb cb;
> > struct ref_lock *lock;
> > -   char *log_file, *newlog_path = NULL;
> > +   char *log_file;
> > struct commit *tip_commit;
> > struct commit_list *tips;
> > int status = 0;
> > @@ -372,10 +374,14 @@ static int expire_reflog(const char *refname, const
> > unsigned char *sha1, void *c
> > unlock_ref(lock);
> > return 0;
> > }
> > +
> > log_file = git_pathdup("logs/%s", refname);
> > if (!cmd->dry_run) {
> > -   newlog_path = git_pathdup("logs/%s.lock", refname);
> > -   cb.newlog = fopen(newlog_path, "w");
> > +   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) <
> > 0)
> > +   goto failure;
> > +   cb.newlog = fdopen_lock_file(&reflog_lock, "w");
> > +   if (!cb.newlog)
> > +   goto failure;
> > }
> >
> > cb.cmd = cmd;
> > @@ -423,10 +429,9 @@ static int expire_reflog(const char *refname, const
> > unsigned char *sha1, void *c
> > }
> >
> > if (cb.newlog) {
> > -   if (fclose(cb.newlog)) {
> > -   status |= error("%s: %s", strerror(errno),
> > -   newlog_path);
> > -   unlink(newlog_path);
> > +   if (close_lock_file(&reflog_lock)) {
> > +   status |= error("Couldn't write %s: %s", log_file,
> > +   strerror(errno));
> > } else if (cmd->updateref &&
> > (write_in_full(lock->lock_fd,
> > sha1_to_hex(cb.last_kept_sha1), 40) != 40
> > ||
> > @@ -434,21 +439,23 @@ static int expire_reflog(const char *refname, const
> > unsigned char *sha1, void *c
> >  close_ref(lock) < 0)) {
> > status |= error("Couldn't write %s",
> > lock->lk->filename.buf);
> > -   unlink(newlog_path);
> > -   } else if (rename(newlog_path, log_file)) {
> > -   status |= error("cannot rename %s to %s",
> > -   newlog_path, log_file);
> > -   unlink(newlog_path);
> > +   rollback_lock_file(&reflog_lock);
> > +   } else if (commit_lock_file(&reflog_lock)) {
> > +   status |= error("cannot rename %s.lock to %s",
> > +   log_file, log_file);
> > } else if (cmd->updateref && commit_ref(lock)) {
> > status |= error("Couldn't set %s", lock->ref_name);
> > -   } else {
> > -   adjust_shared_perm(log_file);
> > }
> > }
> > -   free(newlog

Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-04 Thread ronnie sahlberg
On Thu, Dec 4, 2014 at 3:08 PM, Michael Haggerty  wrote:
> We don't actually need the locking functionality, because we already
> hold the lock on the reference itself,

No. You do need the lock.
The ref is locked only during transaction_commit()

If you don't want to lock the reflog file and instead rely on the lock
on the ref itself you will need to
rework your patches so that the lock on the ref is taken already
during, for example, transaction_update_ref() instead.

But without doing those changes and moving the ref locking from
_commit() to _update_ref() you will risk reflog corruption/surprises
if two operations collide and both rewrite the reflog without any lock held.


> which is how the reflog file is
> locked. But the lock_file code still does some of the bookkeeping for
> us and is more careful than the old code here was.
>
> For example:
>
> * Correctly handle the case that the reflog lock file already exists
>   for some reason or cannot be opened.
>
> * Correctly clean up the lockfile if the program dies.
>
> Signed-off-by: Michael Haggerty 
> ---
>  builtin/reflog.c | 37 ++---
>  1 file changed, 22 insertions(+), 15 deletions(-)
>
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index a282e60..d344d45 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -349,12 +349,14 @@ static int push_tip_to_list(const char *refname, const 
> unsigned char *sha1, int
> return 0;
>  }
>
> +static struct lock_file reflog_lock;
> +
>  static int expire_reflog(const char *refname, const unsigned char *sha1, 
> void *cb_data)
>  {
> struct cmd_reflog_expire_cb *cmd = cb_data;
> struct expire_reflog_cb cb;
> struct ref_lock *lock;
> -   char *log_file, *newlog_path = NULL;
> +   char *log_file;
> struct commit *tip_commit;
> struct commit_list *tips;
> int status = 0;
> @@ -372,10 +374,14 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1, void *c
> unlock_ref(lock);
> return 0;
> }
> +
> log_file = git_pathdup("logs/%s", refname);
> if (!cmd->dry_run) {
> -   newlog_path = git_pathdup("logs/%s.lock", refname);
> -   cb.newlog = fopen(newlog_path, "w");
> +   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0)
> +   goto failure;
> +   cb.newlog = fdopen_lock_file(&reflog_lock, "w");
> +   if (!cb.newlog)
> +   goto failure;
> }
>
> cb.cmd = cmd;
> @@ -423,10 +429,9 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1, void *c
> }
>
> if (cb.newlog) {
> -   if (fclose(cb.newlog)) {
> -   status |= error("%s: %s", strerror(errno),
> -   newlog_path);
> -   unlink(newlog_path);
> +   if (close_lock_file(&reflog_lock)) {
> +   status |= error("Couldn't write %s: %s", log_file,
> +   strerror(errno));
> } else if (cmd->updateref &&
> (write_in_full(lock->lock_fd,
> sha1_to_hex(cb.last_kept_sha1), 40) != 40 ||
> @@ -434,21 +439,23 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1, void *c
>  close_ref(lock) < 0)) {
> status |= error("Couldn't write %s",
> lock->lk->filename.buf);
> -   unlink(newlog_path);
> -   } else if (rename(newlog_path, log_file)) {
> -   status |= error("cannot rename %s to %s",
> -   newlog_path, log_file);
> -   unlink(newlog_path);
> +   rollback_lock_file(&reflog_lock);
> +   } else if (commit_lock_file(&reflog_lock)) {
> +   status |= error("cannot rename %s.lock to %s",
> +   log_file, log_file);
> } else if (cmd->updateref && commit_ref(lock)) {
> status |= error("Couldn't set %s", lock->ref_name);
> -   } else {
> -   adjust_shared_perm(log_file);
> }
> }
> -   free(newlog_path);
> free(log_file);
> unlock_ref(lock);
> return status;
> +
> + failure:
> +   rollback_lock_file(&reflog_lock);
> +   free(log_file);
> +   unlock_ref(lock);
> +   return -1;
>  }
>
>  static int collect_reflog(const char *ref, const unsigned char *sha1, int 
> unused, void *cb_data)
> --
> 2.1.3
>
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-04 Thread Stefan Beller
On Thu, Dec 04, 2014 at 04:23:31PM -0800, Jonathan Nieder wrote:
> Michael Haggerty wrote:
> 
> > We don't actually need the locking functionality, because we already
> > hold the lock on the reference itself, which is how the reflog file is
> > locked. But the lock_file code still does some of the bookkeeping for
> > us and is more careful than the old code here was.
> 
> As you say, the ref lock takes care of mutual exclusion, so we do not
> have to be too careful about compatibility with other tools that might
> not know to lock the reflog.  And this is not tying our hands for a
> future when I might want to lock logs/refs/heads/topic/1 while
> logs/refs/heads/topic still exists as part of the implementation of
> "git mv topic/1 topic".
> 
> Stefan and I had forgotten about that guarantee when looking at that
> kind of operation --- thanks for the reminder.

I did not forget about it, I did not know about that in the first hand.
We don't seem to have documentation on it?

So sorry for heading in a direction, which would have been avoidable.

Thanks,
Stefan

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-04 Thread Jonathan Nieder
Michael Haggerty wrote:

> We don't actually need the locking functionality, because we already
> hold the lock on the reference itself, which is how the reflog file is
> locked. But the lock_file code still does some of the bookkeeping for
> us and is more careful than the old code here was.

As you say, the ref lock takes care of mutual exclusion, so we do not
have to be too careful about compatibility with other tools that might
not know to lock the reflog.  And this is not tying our hands for a
future when I might want to lock logs/refs/heads/topic/1 while
logs/refs/heads/topic still exists as part of the implementation of
"git mv topic/1 topic".

Stefan and I had forgotten about that guarantee when looking at that
kind of operation --- thanks for the reminder.

Should updates to the HEAD reflog acquire HEAD.lock?  (They don't
currently.)

[...]
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -349,12 +349,14 @@ static int push_tip_to_list(const char *refname, const 
> unsigned char *sha1, int
>   return 0;
>  }
>  
> +static struct lock_file reflog_lock;

If this lockfile is only used in that one function, it can be declared
inside the function.

If it is meant to be used throughout the 'git reflog' command, then it
can go near the top of the file.

> +
>  static int expire_reflog(const char *refname, const unsigned char *sha1, 
> void *cb_data)
>  {
>   struct cmd_reflog_expire_cb *cmd = cb_data;
>   struct expire_reflog_cb cb;
>   struct ref_lock *lock;
> - char *log_file, *newlog_path = NULL;
> + char *log_file;
>   struct commit *tip_commit;
>   struct commit_list *tips;
>   int status = 0;
> @@ -372,10 +374,14 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1, void *c
>   unlock_ref(lock);
>   return 0;
>   }
> +
>   log_file = git_pathdup("logs/%s", refname);
>   if (!cmd->dry_run) {
> - newlog_path = git_pathdup("logs/%s.lock", refname);
> - cb.newlog = fopen(newlog_path, "w");
> + if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0)
> + goto failure;

hold_lock_file_for_update doesn't print a message.  Code to print one
looks like

if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0) {
unable_to_lock_message(log_file, errno, &err);
error("%s", err.buf);
goto failure;
}

(A patch in flight changes that to

if (hold_lock_file_for_update(&reflog_lock, log_file, 0, &err) < 0) {
error("%s", err.buf);
goto failure;
}

)

> + cb.newlog = fdopen_lock_file(&reflog_lock, "w");
> + if (!cb.newlog)
> + goto failure;

Hm.  lockfile.c::fdopen_lock_file ought to use xfdopen to make this
case impossible.  And xfdopen should use try_to_free_routine() and
try again on failure.

[...]
> @@ -423,10 +429,9 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1, void *c
>   }
>  
>   if (cb.newlog) {
> - if (fclose(cb.newlog)) {
> - status |= error("%s: %s", strerror(errno),
> - newlog_path);
> - unlink(newlog_path);
> + if (close_lock_file(&reflog_lock)) {
> + status |= error("Couldn't write %s: %s", log_file,
> + strerror(errno));

Style nit: error messages usually start with a lowercase letter
(though I realize nearby examples are already inconsistent).

commit_lock_file() can take care of the close_lock_file automatically.

[...]
> @@ -434,21 +439,23 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1, void *c
>close_ref(lock) < 0)) {
>   status |= error("Couldn't write %s",
>   lock->lk->filename.buf);
> - unlink(newlog_path);
> - } else if (rename(newlog_path, log_file)) {
> - status |= error("cannot rename %s to %s",
> - newlog_path, log_file);
> - unlink(newlog_path);
> + rollback_lock_file(&reflog_lock);
> + } else if (commit_lock_file(&reflog_lock)) {
> + status |= error("cannot rename %s.lock to %s",
> + log_file, log_file);

Most callers say "unable to commit reflog '%s'", log_file to hedge their
bets in case the close failed (which may be what you were avoiding
above.

errno is meaningful when commit_lock_file fails, making a more
detailed diagnosis from strerror(errno) possible.

Thanks,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-04 Thread Michael Haggerty
We don't actually need the locking functionality, because we already
hold the lock on the reference itself, which is how the reflog file is
locked. But the lock_file code still does some of the bookkeeping for
us and is more careful than the old code here was.

For example:

* Correctly handle the case that the reflog lock file already exists
  for some reason or cannot be opened.

* Correctly clean up the lockfile if the program dies.

Signed-off-by: Michael Haggerty 
---
 builtin/reflog.c | 37 ++---
 1 file changed, 22 insertions(+), 15 deletions(-)

diff --git a/builtin/reflog.c b/builtin/reflog.c
index a282e60..d344d45 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -349,12 +349,14 @@ static int push_tip_to_list(const char *refname, const 
unsigned char *sha1, int
return 0;
 }
 
+static struct lock_file reflog_lock;
+
 static int expire_reflog(const char *refname, const unsigned char *sha1, void 
*cb_data)
 {
struct cmd_reflog_expire_cb *cmd = cb_data;
struct expire_reflog_cb cb;
struct ref_lock *lock;
-   char *log_file, *newlog_path = NULL;
+   char *log_file;
struct commit *tip_commit;
struct commit_list *tips;
int status = 0;
@@ -372,10 +374,14 @@ static int expire_reflog(const char *refname, const 
unsigned char *sha1, void *c
unlock_ref(lock);
return 0;
}
+
log_file = git_pathdup("logs/%s", refname);
if (!cmd->dry_run) {
-   newlog_path = git_pathdup("logs/%s.lock", refname);
-   cb.newlog = fopen(newlog_path, "w");
+   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0)
+   goto failure;
+   cb.newlog = fdopen_lock_file(&reflog_lock, "w");
+   if (!cb.newlog)
+   goto failure;
}
 
cb.cmd = cmd;
@@ -423,10 +429,9 @@ static int expire_reflog(const char *refname, const 
unsigned char *sha1, void *c
}
 
if (cb.newlog) {
-   if (fclose(cb.newlog)) {
-   status |= error("%s: %s", strerror(errno),
-   newlog_path);
-   unlink(newlog_path);
+   if (close_lock_file(&reflog_lock)) {
+   status |= error("Couldn't write %s: %s", log_file,
+   strerror(errno));
} else if (cmd->updateref &&
(write_in_full(lock->lock_fd,
sha1_to_hex(cb.last_kept_sha1), 40) != 40 ||
@@ -434,21 +439,23 @@ static int expire_reflog(const char *refname, const 
unsigned char *sha1, void *c
 close_ref(lock) < 0)) {
status |= error("Couldn't write %s",
lock->lk->filename.buf);
-   unlink(newlog_path);
-   } else if (rename(newlog_path, log_file)) {
-   status |= error("cannot rename %s to %s",
-   newlog_path, log_file);
-   unlink(newlog_path);
+   rollback_lock_file(&reflog_lock);
+   } else if (commit_lock_file(&reflog_lock)) {
+   status |= error("cannot rename %s.lock to %s",
+   log_file, log_file);
} else if (cmd->updateref && commit_ref(lock)) {
status |= error("Couldn't set %s", lock->ref_name);
-   } else {
-   adjust_shared_perm(log_file);
}
}
-   free(newlog_path);
free(log_file);
unlock_ref(lock);
return status;
+
+ failure:
+   rollback_lock_file(&reflog_lock);
+   free(log_file);
+   unlock_ref(lock);
+   return -1;
 }
 
 static int collect_reflog(const char *ref, const unsigned char *sha1, int 
unused, void *cb_data)
-- 
2.1.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html