Jeff King <p...@peff.net> wrote:
>On Fri, Dec 28, 2012 at 07:50:14AM -0700, Martin Fick wrote:
>> Hmm, actually I believe that with a small modification to the
>> semantics described here it would be possible to make multi
>> repo/branch commits work. Simply allow the ref filename to
>> be locked by a transaction by appending the transaction ID to
>> the filename. So if transaction 123 wants to lock master
>> which points currently to abcde, then it will move
>> master/abcde to master/abcde_123. If transaction 123 is
>> designed so that any process can commit/complete/abort it
>> without requiring any locks which can go stale, then this ref
>> lock will never go stale either (easy as long as it writes
>> all its proposed updates somewhere upfront and has atomic
>> semantics for starting, committing and aborting). On commit,
>> the ref lock gets updated to its new value: master/newsha and
>> on abort it gets unlocked: master/abcde.
>Hmm. I thought our goal was to avoid locks? Isn't this just locking by
It is a lock, but it is a lock with an owner: the transaction. If the
transaction has reliable recovery semantics, then the lock will be recoverable
also. This is possible if we have lock ownership (the transaction) which does
not exist today for the ref locks. With good lock ownership we gain the
ability to reliably delete locks for a specific owner without the risk of
deleting the lock when held by another owner (putting the owner in the filename
is "good", while putting the owner in the file contents is not). Lastly, for
reliable recovery of stale locks we need the ability to determine when an owner
has abandoned a lock. I believe that the transaction semantics laid out below
>I guess your point is to have no locks in the "normal" case, and have
>locked transactions as an optional add-on?
Basically. If we design the transaction into the git semantics we could ensure
that it is recoverable and we should not need to expose these ref locks outside
of the transaction APIs.
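A minimal sketch of the owner-in-the-filename ref lock discussed above, where
transaction 123 locks master at abcde by renaming master/abcde to
master/abcde_123. It assumes each ref is a directory holding one file named
after its current value and that rename is atomic on that filesystem. The
function names lock_ref and release_ref are hypothetical, not existing git APIs.

```python
import os


def lock_ref(ref_dir: str, old: str, txn_id: str) -> bool:
    """Lock a ref for txn_id by renaming <ref_dir>/<old> to <ref_dir>/<old>_<txn_id>.

    Returns False if the ref is no longer at `old` or is already locked by
    another transaction (in both cases the source file does not exist).
    """
    src = os.path.join(ref_dir, old)
    dst = os.path.join(ref_dir, f"{old}_{txn_id}")
    try:
        os.rename(src, dst)
        return True
    except FileNotFoundError:
        return False


def release_ref(ref_dir: str, old: str, txn_id: str, value: str) -> bool:
    """Release the lock held by txn_id, leaving the ref at `value`.

    Pass value=old to unlock on abort, or the new value to update on commit.
    Only the lock file carrying this txn_id is touched, so a lock held by
    another owner is never deleted by mistake.
    """
    src = os.path.join(ref_dir, f"{old}_{txn_id}")
    dst = os.path.join(ref_dir, value)
    try:
        os.rename(src, dst)
        return True
    except FileNotFoundError:
        return False
```

Because the owner is part of the filename, a release attempted on behalf of
the wrong transaction simply fails to find its source file.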
To illustrate a simple transaction approach (borrowing some of Shawn's ideas),
we could designate a directory to hold transaction files *1. To prepare a
transaction: write a list of repo:ref:oldvalue:newvalue to a file named id.new
(in a stable sorted order based on repo:ref to prevent deadlocks). This is not
a state change and thus this file could be deleted by any process at any time
(preferably after a long grace period).
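The prepare step could look roughly like the sketch below. It assumes one
repo:ref:oldvalue:newvalue entry per line and that none of the fields contain
a colon. RefUpdate and prepare_transaction are hypothetical names used only
for illustration.

```python
import os
from typing import List, NamedTuple


class RefUpdate(NamedTuple):
    repo: str
    ref: str
    old: str
    new: str


def prepare_transaction(txn_dir: str, txn_id: str, updates: List[RefUpdate]) -> str:
    """Write the proposed updates to <txn_dir>/<txn_id>.new and return its path."""
    # Stable order on (repo, ref) so every transaction acquires locks in the
    # same sequence, which is what prevents deadlocks later on.
    ordered = sorted(updates, key=lambda u: (u.repo, u.ref))
    path = os.path.join(txn_dir, f"{txn_id}.new")
    # Mode "x" fails if the file already exists, so an id is not silently reused.
    with open(path, "x", encoding="utf-8") as f:
        for u in ordered:
            f.write(f"{u.repo}:{u.ref}:{u.old}:{u.new}\n")
        f.flush()
        os.fsync(f.fileno())
    return path
```

Nothing here is a state change yet: the id.new file can still be removed by
anyone without affecting any ref.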
If file renames are atomic on the filesystem holding the transaction files then
1, 2, 3 below will be atomic state changes. It does not matter who performs
state transitions 2 or 3. It does not matter who implements the work following
any of the 3 transitions; many processes could attempt the work in parallel (so
could a human).
1) To start the transaction, rename the id.new file to id. If the rename
fails, start over if desired/still possible. On success, ref locks for each
entry should be acquired in listed order (to prevent deadlocks), using
transaction id and oldvalue. It is never legal to unlock a ref in this state
(because a blocked process could delay the unlock until the commit phase).
However, it is legal for any process to transition to abort at any time from
this state, perhaps because of a failure to acquire a lock (held by another
transaction), and definitely if a ref has changed (is no longer oldvalue).
2) To abort the transaction, rename the id file to id.abort. This should only
ever fail if commit was achieved first. Once in this state, any process
may/should unlock any ref locks belonging to this transaction id. Once all
refs are unlocked, id.abort may be deleted (it could be deleted earlier, but
then cleanup will take longer).
3) To commit the transaction, rename the file to id.commit. This should only
ever fail if abort was achieved first. This transition should never be done
until every listed ref is locked by the current transaction id. Once in this
phase, all refs may/should be moved to their new values and unlocked by any
process. Once all refs are unlocked, id.commit may be deleted.
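The three transitions above can be sketched as plain renames, assuming the
transaction directory sits on a filesystem with atomic rename. The function
names are hypothetical, and the sketch deliberately leaves out the ref locking
and unlocking work that follows each transition. In particular, commit() does
not check that every listed ref is locked by this transaction; the caller has
to ensure that.

```python
import os


def _rename(txn_dir: str, src: str, dst: str) -> bool:
    """Attempt one state transition; False means another process got there first."""
    try:
        os.rename(os.path.join(txn_dir, src), os.path.join(txn_dir, dst))
        return True
    except FileNotFoundError:
        return False


def start(txn_dir: str, txn_id: str) -> bool:
    # 1) id.new -> id
    return _rename(txn_dir, f"{txn_id}.new", txn_id)


def abort(txn_dir: str, txn_id: str) -> bool:
    # 2) id -> id.abort
    return _rename(txn_dir, txn_id, f"{txn_id}.abort")


def commit(txn_dir: str, txn_id: str) -> bool:
    # 3) id -> id.commit
    return _rename(txn_dir, txn_id, f"{txn_id}.commit")


def state(txn_dir: str, txn_id: str) -> str:
    """Report the current state so that any process can pick up pending work."""
    for suffix, name in ((".new", "prepared"), ("", "started"),
                         (".abort", "aborted"), (".commit", "committed")):
        if os.path.exists(os.path.join(txn_dir, txn_id + suffix)):
            return name
    return "gone"
```

Since abort and commit both rename the same source file, at most one of them
can succeed for a given transaction, no matter which process attempts it.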
Since any process attempting any of the work in these transactions could block
at any time for an indefinite amount of time, these processes may wake after
the transaction is aborted or committed and the transaction files are cleaned
up. I believe that in these cases the only action by these waking processes
which could succeed is the ref locking action. All such abandoned ref locks
may/should be unlocked by any process. This last rule means that no
transaction ids should ever be reused,
*1 We may want to adapt the simple model illustrated above to use git
mechanisms such as refs to hold transaction info instead of files in a
directory, and git submodule files to hold the list of refs to update.
Employee of Qualcomm Innovation Center, Inc. which is a member of Code Aurora