Re: [PATCH 02/15] refs.c: return error instead of dying when locking fails during transaction

2014-11-11 Thread Jeff King
On Tue, Oct 21, 2014 at 01:36:47PM -0700, Ronnie Sahlberg wrote:

 commit e193c10fc4f9274d1e751cfcdcc4507818e8d498 upstream.
 
 Change lock_ref_sha1_basic to return an error instead of dying when
 we fail to lock a file during a transaction.
 This function is only called from transaction_commit() and it knows how
 to handle these failures.
 [...]
 - else
 - unable_to_lock_die(ref_file, errno);
 + else {
 + struct strbuf err = STRBUF_INIT;
 + unable_to_lock_message(ref_file, errno, err);
 + error(%s, err.buf);
 + strbuf_reset(err);
 + goto error_return;
 + }

I coincidentally just wrote almost the identical patch, because this
isn't just a cleanup; it fixes a real bug. During pack_refs, we call
prune_ref to lock and delete the loose ref. If the lock fails, that's
OK; that just means somebody else is updating it at the same time, and
we can skip our pruning step. But due to the unable_to_lock_die call
here in lock_ref_sha1_basic, the pack-refs process may die prematurely.

I wonder if it is worth pulling this one out from the rest of the
series, as it has value (and can be applied) on its own. I did some
digging on the history of this, too. Here's the rationale I wrote:


lock_ref_sha1_basic: do not die on locking errors

lock_ref_sha1_basic is inconsistent about when it calls
die() and when it returns NULL to signal an error. This is
annoying to any callers that want to recover from a locking
error.

This seems to be mostly historical accident. It was added in
4bd18c4 (Improve abstraction of ref lock/write.,
2006-05-17), which returned an error in all cases except
calling safe_create_leading_directories, in which case it
died.  Later, 40aaae8 (Better error message when we are
unable to lock the index file, 2006-08-12) asked
hold_lock_file_for_update to die for us, leaving the
resolve_ref code-path the only one which returned NULL.

We tried to correct that in 5cc3cef (lock_ref_sha1(): do not
sometimes error() and sometimes die()., 2006-09-30),
by converting all of the die() calls into returns. But we
missed the die flag passed to the lock code, leaving us
inconsistent. This state persisted until e5c223e
(lock_ref_sha1_basic(): if locking fails with ENOENT, retry,
2014-01-18). Because of its retry scheme, it does not ask
the lock code to die, but instead manually dies with
unable_to_lock_die().

We can make this consistent with the other return paths by
converting this to use unable_to_lock_message(), and
returning NULL. This is safe to do because all callers
already needed to check the return value of the function,
since it could fail (and return NULL) for other reasons.

I also have some other cleanups to lock_ref_sha1_basic's error handling.
I'd be happy to take over this patch and send it along with those
cleanups as a separate series.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 02/15] refs.c: return error instead of dying when locking fails during transaction

2014-11-11 Thread Ronnie Sahlberg
On Tue, Nov 11, 2014 at 2:34 AM, Jeff King p...@peff.net wrote:
 On Tue, Oct 21, 2014 at 01:36:47PM -0700, Ronnie Sahlberg wrote:

 commit e193c10fc4f9274d1e751cfcdcc4507818e8d498 upstream.

 Change lock_ref_sha1_basic to return an error instead of dying when
 we fail to lock a file during a transaction.
 This function is only called from transaction_commit() and it knows how
 to handle these failures.
 [...]
 - else
 - unable_to_lock_die(ref_file, errno);
 + else {
 + struct strbuf err = STRBUF_INIT;
 + unable_to_lock_message(ref_file, errno, err);
 + error(%s, err.buf);
 + strbuf_reset(err);
 + goto error_return;
 + }

 I coincidentally just wrote almost the identical patch, because this
 isn't just a cleanup; it fixes a real bug. During pack_refs, we call
 prune_ref to lock and delete the loose ref. If the lock fails, that's
 OK; that just means somebody else is updating it at the same time, and
 we can skip our pruning step. But due to the unable_to_lock_die call
 here in lock_ref_sha1_basic, the pack-refs process may die prematurely.

 I wonder if it is worth pulling this one out from the rest of the
 series, as it has value (and can be applied) on its own. I did some
 digging on the history of this, too. Here's the rationale I wrote:


 lock_ref_sha1_basic: do not die on locking errors

 lock_ref_sha1_basic is inconsistent about when it calls
 die() and when it returns NULL to signal an error. This is
 annoying to any callers that want to recover from a locking
 error.

 This seems to be mostly historical accident. It was added in
 4bd18c4 (Improve abstraction of ref lock/write.,
 2006-05-17), which returned an error in all cases except
 calling safe_create_leading_directories, in which case it
 died.  Later, 40aaae8 (Better error message when we are
 unable to lock the index file, 2006-08-12) asked
 hold_lock_file_for_update to die for us, leaving the
 resolve_ref code-path the only one which returned NULL.

 We tried to correct that in 5cc3cef (lock_ref_sha1(): do not
 sometimes error() and sometimes die()., 2006-09-30),
 by converting all of the die() calls into returns. But we
 missed the die flag passed to the lock code, leaving us
 inconsistent. This state persisted until e5c223e
 (lock_ref_sha1_basic(): if locking fails with ENOENT, retry,
 2014-01-18). Because of its retry scheme, it does not ask
 the lock code to die, but instead manually dies with
 unable_to_lock_die().

 We can make this consistent with the other return paths by
 converting this to use unable_to_lock_message(), and
 returning NULL. This is safe to do because all callers
 already needed to check the return value of the function,
 since it could fail (and return NULL) for other reasons.

 I also have some other cleanups to lock_ref_sha1_basic's error handling.
 I'd be happy to take over this patch and send it along with those
 cleanups as a separate series.


 Sounds Good To Me.


Thanks,
Ronnie Sahlberg
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 02/15] refs.c: return error instead of dying when locking fails during transaction

2014-10-21 Thread Ronnie Sahlberg
commit e193c10fc4f9274d1e751cfcdcc4507818e8d498 upstream.

Change lock_ref_sha1_basic to return an error instead of dying when
we fail to lock a file during a transaction.
This function is only called from transaction_commit() and it knows how
to handle these failures.

Change-Id: Ie830b7970412d9299e76a86f08633ce721130aed
Signed-off-by: Ronnie Sahlberg sahlb...@google.com
Signed-off-by: Jonathan Nieder jrnie...@gmail.com
---
 refs.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/refs.c b/refs.c
index e49fbe9..c088d36 100644
--- a/refs.c
+++ b/refs.c
@@ -2340,6 +2340,7 @@ static struct ref_lock *lock_ref_sha1_basic(const char 
*refname,
 
lock-lock_fd = hold_lock_file_for_update(lock-lk, ref_file, lflags);
if (lock-lock_fd  0) {
+   last_errno = errno;
if (errno == ENOENT  --attempts_remaining  0)
/*
 * Maybe somebody just deleted one of the
@@ -2347,8 +2348,13 @@ static struct ref_lock *lock_ref_sha1_basic(const char 
*refname,
 * again:
 */
goto retry;
-   else
-   unable_to_lock_die(ref_file, errno);
+   else {
+   struct strbuf err = STRBUF_INIT;
+   unable_to_lock_message(ref_file, errno, err);
+   error(%s, err.buf);
+   strbuf_reset(err);
+   goto error_return;
+   }
}
return old_sha1 ? verify_lock(lock, old_sha1, mustexist) : lock;
 
-- 
2.1.0.rc2.206.gedb03e5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html