Re: [PATCH v4 0/3] Make update refs more atomic

2014-04-16 Thread Ronnie Sahlberg
On Tue, Apr 15, 2014 at 1:32 PM, Michael Haggerty mhag...@alum.mit.edu wrote:
 On 04/15/2014 06:33 PM, Ronnie Sahlberg wrote:
 On Mon, Apr 14, 2014 at 11:36 PM, Michael Haggerty mhag...@alum.mit.edu 
 wrote:
 [...]
 I wonder, however, whether your approach of changing callers from

 lock = lock_ref_sha1_basic() (or varient of)
 write_ref_sha1(lock)

 to

 lock = lock_ref_sha1_basic() (or varient of)
 write_ref_sha1(lock)
 unlock_ref(lock) | commit_ref_lock(lock)

 is not doing work that we will soon need to rework.  Would it be jumping
 the gun to change the callers to

 transaction = ref_transaction_begin();
 ref_transaction_{update,delete,etc}(transaction, ...);
 ref_transaction_{commit,rollback}(transaction, ...);

 instead?  Then we could bury the details of calling write_ref_sha1() and
 commit_lock_ref() inside ref_transaction_commit() rather than having to
 expose them in the public API.

 I think you are right.

 Lets put this patch series on the backburner for now and start by
 making all callers use transactions
 and remove write_ref_sha1() from the public API thar refs.c exports.

 Once everything is switched over to transactions I can rework this
 patchseries for ref_transaction_commit()
 and resubmit to the mailing list.

 Sounds good.  Rewriting callers to use transactions would be a great
 next step.  Please especially keep track of what new features the
 transactions API still needs.  More flexible error handling?  The
 ability to have steps in the transaction that are best-effort (i.e.,
 don't abort the transaction if they fail)?  Different reflog messages
 for different updates within the same transaction rather than one reflog
 message for all updates?  Etc.

 And some callers who currently change multiple references one at a time
 might be able to be rewritten to update the references in a single
 transaction.

As an experiment I rewrite most of the callers to use transactions yesterday.
Most callers would translate just fine, but some callers, such as walker_fetch()
does not yet fit well with the current transaction code.

For example that code does want to first take out locks on all refs
before it does a
lot of processing, with the locks held, before it writes and updates the refs.


Some of my thoughts after going over the callers :

Currently any locking of refs in a transaction only happens during the commit
phase. I think it would be useful to have a mechanism where you could
optionally take out locks for the involved refs early during the transaction.
So that simple callers could continue using
ref_transaction_begin()
ref_transaction_create|update|delete()*
ref_transaction_commit()

but, if a caller such as walker_fetch() could opt to do
ref_transaction_begin()
ref_transaction_lock_ref()*
...do stuff...
ref_transaction_create|update|delete()*
ref_transaction_commit()

In this second case ref_transaction_commit() would only take out any locks that
are missing during the 'lock the refs loop.

Suggestion 1: Add a ref_transaction_lock_ref() to allow locking a ref
early during
a transaction.


A second idea is to change the signatures for
ref_transaction_create|update|delete()
slightly and allow them to return errors early.
We can check for things like add_update() failing, check that the
ref-name looks sane,
check some of the flags, like if has_old==true then old sha1 should
not be NULL or 0{40}, etc.

Additionally for robustness, if any of these functions detect an error
we can flag this in the
transaction structure and take action during ref_transaction_commit().
I.e. if a ref_transaction_update had a hard failure, do not allow
ref_transaction_commit()
to succeed.

Suggestion 2: Change ref_transaction_create|delete|update() to return an int.
All callers that use these functions should check the function for error.


Suggestion 3: remove the qsort and check for duplicates in
ref_transaction_commit()
Since we are already taking out a lock for each ref we are updating
during the transaction
any duplicate refs will fail the second attempt to lock the same ref which will
implicitly make sure that a transaction will not change the same ref twice.

There are only two caveats I see with this third suggestion.
1, The second lock attempt will always cause a die() since we
eventually would end up
in lock_ref_sha1_basic() and this function will always unconditionally
die() if the lock failed.
But your locking changes are addressing this, right?
(all callers to lock_ref_sha1() or lock_any_ref_for_update() do check
for and handle if the lock
 failed, so that change to not die() should be safe)

2, We would need to take care when a lock fails here to print the
proper error message
so that we still show Multiple updates for ref '%s' not allowed. if
the lock failed because
the transaction had already locked this file.





 Lets start preparing patches to change all external callers to use
 transactions instead.
 I am happy to help preparing patches for this. How do 

Re: [PATCH v4 0/3] Make update refs more atomic

2014-04-16 Thread Junio C Hamano
Ronnie Sahlberg sahlb...@google.com writes:

 Currently any locking of refs in a transaction only happens during the commit
 phase. I think it would be useful to have a mechanism where you could
 optionally take out locks for the involved refs early during the transaction.
 So that simple callers could continue using
 ref_transaction_begin()
 ref_transaction_create|update|delete()*
 ref_transaction_commit()

 but, if a caller such as walker_fetch() could opt to do
 ref_transaction_begin()
 ref_transaction_lock_ref()*
 ...do stuff...
 ref_transaction_create|update|delete()*
 ref_transaction_commit()

 In this second case ref_transaction_commit() would only take out any locks 
 that
 are missing during the 'lock the refs loop.

 Suggestion 1: Add a ref_transaction_lock_ref() to allow locking a ref
 early during
 a transaction.

Hmph.

I am not sure if that is the right way to go, or instead change all
create/update/delete to take locks without adding a new primitive.

 A second idea is to change the signatures for
 ref_transaction_create|update|delete()
 slightly and allow them to return errors early.
 We can check for things like add_update() failing, check that the
 ref-name looks sane,
 check some of the flags, like if has_old==true then old sha1 should
 not be NULL or 0{40}, etc.

 Additionally for robustness, if any of these functions detect an error
 we can flag this in the
 transaction structure and take action during ref_transaction_commit().
 I.e. if a ref_transaction_update had a hard failure, do not allow
 ref_transaction_commit()
 to succeed.

 Suggestion 2: Change ref_transaction_create|delete|update() to return an int.
 All callers that use these functions should check the function for error.

I think that is a very sensible thing to do.

The details of determining this cannot possibly succeed may change
(for example, if we have them take locks at the point of
create/delete/update, a failure to lock may count as an early
error).

Is there any reason why this should be conditional (i.e. you said
allow them to, implying that the early failure is optional)?

 Suggestion 3: remove the qsort and check for duplicates in
 ref_transaction_commit()
 Since we are already taking out a lock for each ref we are updating
 during the transaction
 any duplicate refs will fail the second attempt to lock the same ref which 
 will
 implicitly make sure that a transaction will not change the same ref twice.

I do not know if I care about the implementation detail of do we
have a unique set of update requests?.  While I do not see a strong
need for one transaction to touch the same ref twice (e.g. create to
point at commit A and update it to point at commit B), I do not see
why we should forbid such a use in the future.

Just some of my knee-jerk reactions.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/3] Make update refs more atomic

2014-04-16 Thread Ronnie Sahlberg
On Wed, Apr 16, 2014 at 12:31 PM, Junio C Hamano gits...@pobox.com wrote:
 Ronnie Sahlberg sahlb...@google.com writes:

 Currently any locking of refs in a transaction only happens during the commit
 phase. I think it would be useful to have a mechanism where you could
 optionally take out locks for the involved refs early during the transaction.
 So that simple callers could continue using
 ref_transaction_begin()
 ref_transaction_create|update|delete()*
 ref_transaction_commit()

 but, if a caller such as walker_fetch() could opt to do
 ref_transaction_begin()
 ref_transaction_lock_ref()*
 ...do stuff...
 ref_transaction_create|update|delete()*
 ref_transaction_commit()

 In this second case ref_transaction_commit() would only take out any locks 
 that
 are missing during the 'lock the refs loop.

 Suggestion 1: Add a ref_transaction_lock_ref() to allow locking a ref
 early during
 a transaction.

 Hmph.

 I am not sure if that is the right way to go, or instead change all
 create/update/delete to take locks without adding a new primitive.

ack.


 A second idea is to change the signatures for
 ref_transaction_create|update|delete()
 slightly and allow them to return errors early.
 We can check for things like add_update() failing, check that the
 ref-name looks sane,
 check some of the flags, like if has_old==true then old sha1 should
 not be NULL or 0{40}, etc.

 Additionally for robustness, if any of these functions detect an error
 we can flag this in the
 transaction structure and take action during ref_transaction_commit().
 I.e. if a ref_transaction_update had a hard failure, do not allow
 ref_transaction_commit()
 to succeed.

 Suggestion 2: Change ref_transaction_create|delete|update() to return an int.
 All callers that use these functions should check the function for error.

 I think that is a very sensible thing to do.

 The details of determining this cannot possibly succeed may change
 (for example, if we have them take locks at the point of
 create/delete/update, a failure to lock may count as an early
 error).

 Is there any reason why this should be conditional (i.e. you said
 allow them to, implying that the early failure is optional)?

It was poor wording on my side. Checking for the ref_transaction_*()
return for error should be mandatory (modulo bugs).

But a caller could be buggy and fail to check properly.
It would be very cheap to detect this condition in
ref_transaction_commit() which could then do

  die(transaction commit called for errored transaction);

which would make it easy to spot this kind of bugs.



 Suggestion 3: remove the qsort and check for duplicates in
 ref_transaction_commit()
 Since we are already taking out a lock for each ref we are updating
 during the transaction
 any duplicate refs will fail the second attempt to lock the same ref which 
 will
 implicitly make sure that a transaction will not change the same ref twice.

 I do not know if I care about the implementation detail of do we
 have a unique set of update requests?.  While I do not see a strong
 need for one transaction to touch the same ref twice (e.g. create to
 point at commit A and update it to point at commit B), I do not see
 why we should forbid such a use in the future.


ack.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/3] Make update refs more atomic

2014-04-16 Thread Junio C Hamano
Ronnie Sahlberg sahlb...@google.com writes:

 I am not sure if that is the right way to go, or instead change all
 create/update/delete to take locks without adding a new primitive.

 ack.

Hmph.  When I say I am not sure, I dunno, etc., I do mean it.

Did you mean by Ack I do not know, either, or I think it is
better to take locks early everywhere?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/3] Make update refs more atomic

2014-04-16 Thread Michael Haggerty
On 04/16/2014 09:31 PM, Junio C Hamano wrote:
 Ronnie Sahlberg sahlb...@google.com writes:
 
 Currently any locking of refs in a transaction only happens during the commit
 phase. I think it would be useful to have a mechanism where you could
 optionally take out locks for the involved refs early during the transaction.
 So that simple callers could continue using
 ref_transaction_begin()
 ref_transaction_create|update|delete()*
 ref_transaction_commit()

 but, if a caller such as walker_fetch() could opt to do
 ref_transaction_begin()
 ref_transaction_lock_ref()*
 ...do stuff...
 ref_transaction_create|update|delete()*
 ref_transaction_commit()

 In this second case ref_transaction_commit() would only take out any locks 
 that
 are missing during the 'lock the refs loop.

 Suggestion 1: Add a ref_transaction_lock_ref() to allow locking a ref
 early during
 a transaction.
 
 Hmph.
 
 I am not sure if that is the right way to go, or instead change all
 create/update/delete to take locks without adding a new primitive.

Junio's suggestion seems like a good idea to me.  Obviously, as soon as
we take out a lock we could also do any applicable old_sha1 check and
possibly fail fast.

Does a verify operation require holding a lock?  If not, when is the
verification done--early, or during the commit, or both?  (I realize
that we don't yet *have* a verify operation at the API level, but we
should!)

We also need to think about for what period of time we have to hold the
packed-refs lock.

Finally, we shouldn't forget that currently the reflog files are locked
by holding the lock on the corresponding loose reference file.  Do we
need to integrate these files into our system any more than they
currently are?

[By the way, I noticed the other day that the command

git reflog expire --stale-fix --expire-unreachable=now --all

can hold loose reference locks for a *long* time (like 10s of minutes),
especially in weird cases like the repository having 9000 packfiles for
some reason or another :-)  The command execution time grows strongly
with the length of the reference's log, or maybe even the square of the
length assuming the history is roughly linear and reachability is
computed separately for each SHA-1.  This is just an empirical
observation so far; I haven't looked into the code yet.]

 A second idea is to change the signatures for
 ref_transaction_create|update|delete()
 slightly and allow them to return errors early.
 We can check for things like add_update() failing, check that the
 ref-name looks sane,
 check some of the flags, like if has_old==true then old sha1 should
 not be NULL or 0{40}, etc.

 Additionally for robustness, if any of these functions detect an error
 we can flag this in the
 transaction structure and take action during ref_transaction_commit().
 I.e. if a ref_transaction_update had a hard failure, do not allow
 ref_transaction_commit()
 to succeed.

 Suggestion 2: Change ref_transaction_create|delete|update() to return an int.
 All callers that use these functions should check the function for error.
 
 I think that is a very sensible thing to do.
 
 The details of determining this cannot possibly succeed may change
 (for example, if we have them take locks at the point of
 create/delete/update, a failure to lock may count as an early
 error).
 
 Is there any reason why this should be conditional (i.e. you said
 allow them to, implying that the early failure is optional)?

Also a good suggestion.  We should make it clear in the documentation
that the create/delete/update functions are not *obliged* to return an
error (for example that the current value of the reference does not
agree with old_sha1) because future alternate ref backends might
possibly not be able to make such checks until the commit phase.  For
example, checking old_sha1 might involve a round-trip to a database or
remote repository, in which case it might be preferable to check all
values in a single round-trip upon commit.

So, callers might be informed early of problems, or they might only
learn about problems when they try to commit the transaction.  They have
to be able to handle either type of error reporting.

So then the question arises (and maybe this is what Ronnie was getting
at by suggesting that the checks might be conditional): are callers
*obliged* to check the return values from create/delete/update, or are
they allowed to just keep throwing everything into the transaction,
ignoring errors, and only check the result of ref_transaction_commit()?

I don't feel strongly one way or the other about this question.  It
might be nice to be able to write callers sloppily, but it would cost a
bit more code in the reference backends.  Though maybe it wouldn't even
be much extra code, given that we would probably want to put consistency
checks in there anyway.

 Suggestion 3: remove the qsort and check for duplicates in
 ref_transaction_commit()
 Since we are already taking out a lock for each ref we are updating
 

Re: [PATCH v4 0/3] Make update refs more atomic

2014-04-15 Thread Michael Haggerty
On 04/14/2014 08:29 PM, Ronnie Sahlberg wrote:
 refs.c:ref_transaction_commit() intermingles doing updates and checks with
 actually applying changes to the refs in loops that abort on error.
 This is done one ref at a time and means that if an error is detected that
 will fail the operation partway through the list of refs to update we
 will end up with some changes applied to disk and others not.
 
 Without having transaction support from the filesystem, it is hard to
 make an update that involves multiple refs to guarantee atomicity, but we
 can do a somewhat better than we currently do.

It took me a moment to understand what you were talking about here,
because the code for ref_transaction_commit() already seems
superficially to do reference modifications in phases.  The problem is
that write_ref_sha1() internally contains additional checks that can
fail in normal circumstances.  So the most important part of this
patch series is allowing those checks to be done before committing anything.

 These patches change the update and delete functions to use a three
 call pattern of
 
 1, lock
 2, update, or flag for deletion
 3, apply on disk  (rename() or unlink())
 
 When a transaction is commited we first do all the locking, preparations
 and most of the error checking before we actually start applying any changes
 to the filesystem store.
 
 This means that more of the error cases that will fail the commit
 will trigger before we start doing any changes to the actual files.
 
 
 This should make the changes of refs in refs_transaction_commit slightly
 more atomic.
 [...]

Yes, this is a good and important goal.

I wonder, however, whether your approach of changing callers from

lock = lock_ref_sha1_basic() (or varient of)
write_ref_sha1(lock)

to

lock = lock_ref_sha1_basic() (or varient of)
write_ref_sha1(lock)
unlock_ref(lock) | commit_ref_lock(lock)

is not doing work that we will soon need to rework.  Would it be jumping
the gun to change the callers to

transaction = ref_transaction_begin();
ref_transaction_{update,delete,etc}(transaction, ...);
ref_transaction_{commit,rollback}(transaction, ...);

instead?  Then we could bury the details of calling write_ref_sha1() and
commit_lock_ref() inside ref_transaction_commit() rather than having to
expose them in the public API.

I suspect that the answer is no, ref transactions are not yet powerful
enough to do everything that the callers need.  But then I would
suggest that we *make* them powerful enough and *then* make the change
at the callers.

I'm not saying that we shouldn't accept your change as a first step [1]
and do the next step later, but wanted to get your reaction about making
the first step a bit more ambitious.

Michael

[1] Though I still need to review your patch series in detail.

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/3] Make update refs more atomic

2014-04-15 Thread Ronnie Sahlberg
On Mon, Apr 14, 2014 at 1:24 PM, Junio C Hamano gits...@pobox.com wrote:
 Thanks; will queue.

Junio,
Please defer queuing for now.

I think we should convert more of the external callers to use
transactions first.
Once that is done and everything uses transactions I will re-send an
updated version of this patch series.


regards
ronnie sahlberg
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/3] Make update refs more atomic

2014-04-15 Thread Michael Haggerty
On 04/15/2014 06:33 PM, Ronnie Sahlberg wrote:
 On Mon, Apr 14, 2014 at 11:36 PM, Michael Haggerty mhag...@alum.mit.edu 
 wrote:
 [...]
 I wonder, however, whether your approach of changing callers from

 lock = lock_ref_sha1_basic() (or varient of)
 write_ref_sha1(lock)

 to

 lock = lock_ref_sha1_basic() (or varient of)
 write_ref_sha1(lock)
 unlock_ref(lock) | commit_ref_lock(lock)

 is not doing work that we will soon need to rework.  Would it be jumping
 the gun to change the callers to

 transaction = ref_transaction_begin();
 ref_transaction_{update,delete,etc}(transaction, ...);
 ref_transaction_{commit,rollback}(transaction, ...);

 instead?  Then we could bury the details of calling write_ref_sha1() and
 commit_lock_ref() inside ref_transaction_commit() rather than having to
 expose them in the public API.
 
 I think you are right.
 
 Lets put this patch series on the backburner for now and start by
 making all callers use transactions
 and remove write_ref_sha1() from the public API thar refs.c exports.
 
 Once everything is switched over to transactions I can rework this
 patchseries for ref_transaction_commit()
 and resubmit to the mailing list.

Sounds good.  Rewriting callers to use transactions would be a great
next step.  Please especially keep track of what new features the
transactions API still needs.  More flexible error handling?  The
ability to have steps in the transaction that are best-effort (i.e.,
don't abort the transaction if they fail)?  Different reflog messages
for different updates within the same transaction rather than one reflog
message for all updates?  Etc.

And some callers who currently change multiple references one at a time
might be able to be rewritten to update the references in a single
transaction.

 Lets start preparing patches to change all external callers to use
 transactions instead.
 I am happy to help preparing patches for this. How do we ensure that
 we do not create duplicate work
 and work on the same functions?

I have a few loose ends to take care of on my lockfile patch series, and
there are a few things I would like to tidy up internal to the
transactions implementation, so I think if you are working on the caller
side then we won't step on each other's toes too much in the near future.

I suggest we use IRC (mhagger@freenode) or XMPP (mhag...@jabber.org) for
small-scale coordination.  I also have a GitHub repo
(http://github.com/mhagger/git) to which I often push intermediate
results; I will try to push to that more regularly (warning: I often
rebase feature branches even after they are pushed to GitHub).  I think
you are in Pacific Time whereas I am in Berlin, so we will tend to work
in serial rather than in parallel; that should help.  It would be a good
habit to shoot each short status emails at the end of each working day.

Of course we should only use one-on-one communication for early work; as
soon as something is getting ripe we should make sure our technical
discussions take place here on the mailing list.

Sound OK?
Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 0/3] Make update refs more atomic

2014-04-14 Thread Junio C Hamano
Thanks; will queue.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html