Re: [PATCH 00/17] Remove assumptions about refname lifetimes

Junio C Hamano Mon, 20 May 2013 09:38:09 -0700

Johan Herland <[email protected]> writes:

> For server-class installations we need ref storage that can be read
> (and updated?) atomically, and the current system of loose + packed
> files won't work since reading (and updating) more than a single file
> is not an atomic operation. Trivially, one could resolve this by
> dropping loose refs, and always using a single packed-refs file, but
> that would make it prohibitively expensive to update refs (the entire
> packed-refs file must be rewritten for every update).
>
> Now, observe that we don't have these race conditions in the object
> database, because it is an add-only immutable data store.
>
> What if we stored the refs as a tree object in the object database,
> referenced by a single (loose) ref?


What is the cost of updating a single branch with that scheme?

Doesn't it end up recording roughly the same amount of information
as updating a single packed-refs file that is flat, but with the
need to open a few tree objects (top-level, refs/, and refs/heads/),
write out a blob that stores the object name at the tip, compute
the updated trees (refs/heads/, refs/ and top-level), and then
finally do the compare-and-swap of that single loose ref?
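
To make the steps above concrete, here is a rough sketch in Python of
what one update would have to do.  Everything in it is an assumption
made for illustration: ToyStore, put(), cas_root() and update_ref()
are hypothetical names, and the "objects" are JSON payloads hashed
with SHA-1, not git's real object format.

```python
import hashlib
import json


class ToyStore:
    """Toy add-only object store (hypothetical; not git's object format)."""

    def __init__(self):
        self.objects = {}   # object name -> payload (str blob or dict tree)
        self.root = None    # stands in for the single loose ref
        self.writes = 0     # number of new objects written so far

    def put(self, payload):
        data = json.dumps(payload, sort_keys=True).encode()
        oid = hashlib.sha1(data).hexdigest()
        if oid not in self.objects:
            self.objects[oid] = payload
            self.writes += 1
        return oid

    def cas_root(self, expected, new):
        """Compare-and-swap of the single ref that names the top-level tree."""
        if self.root != expected:
            return False
        self.root = new
        return True


def update_ref(store, refname, value):
    parts = refname.split("/")      # e.g. ["refs", "heads", "master"]
    expected = store.root

    # Read phase: open the top-level tree, refs/, refs/heads/, ...
    trees = []
    oid = expected
    for name in parts:
        tree = dict(store.objects[oid]) if oid is not None else {}
        trees.append(tree)
        oid = tree.get(name)

    # Write phase: one blob for the tip, then every tree on the path,
    # bottom-up (refs/heads/, refs/, top-level).
    child = store.put(value)
    for name, tree in zip(reversed(parts), reversed(trees)):
        tree[name] = child
        child = store.put(tree)

    # Finally race for the single ref; False means "back out".
    return store.cas_root(expected, child)
```

In this toy, updating refs/heads/master writes four new objects (one
blob and three trees) before it even gets to the compare-and-swap.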

You may guarantee atomicity but it is the same granularity of
atomicity as a single packed-refs file.  When you are updating a
branch while somebody else is updating a tag, of course you do not
have to look at refs/tags/ in your operation and you can write out
your final packed-refs equivalent tree to the object database
without racing with the other process, but the top-level you come up
with and the top-level the other process comes up with (which has
an up-to-date refs/tags/ part, but has a stale refs/heads/ part from
your point of view) have to race to update that single loose ref,
and one of you has to back out.

That "backing out" can be done more intelligently than just dying
with a "compare and swap failed--please retry" message, e.g. at that
point you notice what you started with and what the other party did
while you were working (i.e. updating refs/tags/), and three-way
merge the refs tree.  In cases where the "all refs recorded as loose
refs" scheme wouldn't have resulted in a problematic conflict, such a
three-way merge would resolve trivially (you updated refs/heads/ and
the update by the other process to refs/tags/ would not conflict
with what you did).  But the same three-way merge scheme can be
employed with the current flat single packed-refs scheme, can't it?
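
As a sketch of that last point, a three-way merge over flat snapshots
could look something like the following.  The function name
merge_refs() and the representation (a dict from refname to object
name) are assumptions for illustration, as is the policy of keeping
the other party's value when both sides changed the same ref.

```python
def merge_refs(base, ours, theirs):
    """Three-way merge of flat {refname: object name} snapshots.

    base   -- what we started with
    ours   -- base plus our update
    theirs -- what the other process published while we were working
    Returns (merged, conflicts).
    """
    merged = {}
    conflicts = []
    for name in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:
            result = o              # both sides agree, or neither touched it
        elif o == b:
            result = t              # only the other process changed it
        elif t == b:
            result = o              # only we changed it
        else:
            conflicts.append(name)  # both changed it, differently
            result = t              # assumption: the published value stays
        if result is not None:      # None means the ref was deleted
            merged[name] = result
    return merged, conflicts
```

When we touched only refs/heads/master and the other process touched
only refs/tags/v1, the conflict list comes back empty and the merged
snapshot carries both updates; nothing here depends on the refs being
stored as trees.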

Even worse, what is the cost of looking up the value of a single
branch?  You would need to open a few tree objects and the leaf blob
that records the object name the ref points at, wouldn't you?

Right now, such a look-up is either opening a single small file and
reading the first 41 bytes off of it, or falling back (when the ref
in question is packed) to reading a single packed-refs file and
finding the ref you want in it.
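
Roughly, in Python (a simplified sketch, not git's actual code path:
read_ref() is a hypothetical helper, and it assumes the loose file
holds a plain object name rather than a symbolic ref):

```python
import os


def read_ref(git_dir, refname):
    """Return the 40-hex object name for refname, or None if not found."""
    # Loose ref: a small file holding 40 hex digits plus a newline.
    try:
        with open(os.path.join(git_dir, refname), "rb") as f:
            data = f.read(41)
        return data[:40].decode("ascii")
    except (FileNotFoundError, NotADirectoryError, IsADirectoryError):
        pass

    # Fall back to the single packed-refs file and scan it for the ref.
    try:
        with open(os.path.join(git_dir, "packed-refs"), "r") as f:
            for line in f:
                # Skip the header comment and "^" peeled-tag lines.
                if line.startswith(("#", "^")):
                    continue
                hex_name, _, name = line.rstrip("\n").partition(" ")
                if name == refname:
                    return hex_name
    except FileNotFoundError:
        pass
    return None
```

Either way it is at most two files opened, with no tree or blob
objects to inflate.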

So...