On Sat, Aug 10, 2013 at 5:16 AM, Jeff King <p...@peff.net> wrote:
> Another solution could involve not writing the duplicate of Y in the
> first place. The reason we do not store thin-packs on disk is that you
> run into problems with cycles in the delta graph (e.g., A deltas against
> B, which deltas against C, which deltas against A; at one point you had
> a full copy of one object which let you create the cycle, but you later
> deleted it as redundant with the delta, and now you cannot reconstruct
> any of the objects).
> You could possibly solve this with cycle detection, though it would be
> complicated (you need to do it not just when getting rid of objects, but
> when sending a pack, to make sure you don't send a cycle of deltas that
> the other end cannot use). You _might_ be able to get by with a kind of
> "two-level" hack: consider your main pack as "group A" and newly pushed
> packs as "group B". Allow storing thin deltas on disk from group B
> against group A, but never the reverse (nor within group B). That makes
> sure you don't have cycles, and it eliminates even more I/O than any
> repacking solution (because you never write the extra copy of Y to disk
> in the first place). But I can think of two problems:
> 1. You still want to repack more often than every 300 packs, because
> having many packs costs both space and object lookup time (we can
> do a log(N) search through each pack index, but have to search
> linearly through the set of indices).
> 2. As you accumulate group B packs with new objects, the deltas that
> people send will tend to be against objects in group B. They are
> closer to the tip of history, and therefore make better deltas for
> history built on top.
> That's all just off the top of my head. There are probably other flaws,
> too, as I haven't considered it too hard.
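
To make the cycle problem and the two-level rule concrete, here is a
minimal Python sketch. It is only an illustration, not git's code: the
names Group, thin_delta_allowed and has_delta_cycle are made up, and it
assumes every object has at most one delta base.

```python
from enum import Enum


class Group(Enum):
    A = "main"     # the big, fully repacked pack
    B = "pushed"   # newly pushed packs


def thin_delta_allowed(delta_group, base_group):
    """Two-level rule for deltas whose base lives in another pack.

    Only B -> A is permitted: never A -> B, and never B -> B.
    Ordinary deltas inside a single pack are not covered here.
    """
    return delta_group is Group.B and base_group is Group.A


def has_delta_cycle(base_of):
    """Return True if following delta bases ever loops back.

    base_of maps an object id to the id of its delta base, or to None
    when the object is stored whole (hypothetical representation).
    """
    done = set()  # objects already known to end at a full copy
    for start in base_of:
        path = set()
        obj = start
        while obj is not None and obj not in done:
            if obj in path:
                return True  # e.g. A -> B -> C -> A
            path.add(obj)
            obj = base_of.get(obj)
        done |= path
    return False
```

With the rule enforced, every cross-pack chain ends in group A, so the
cycle check should never fire for stored packs; it would still be
needed when deciding what to send to the other end.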
Some refinements on this idea:
- We could keep the packs in group B ordered by arrival. A new pack
can then delta against any earlier pack but never a later one, which
still rules out cycles.
- A single group-wide index, in addition to the separate index for
each pack, would solve the linear-search object lookup problem.
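
A rough sketch of both refinements, in Python for brevity. Everything
here is hypothetical (may_delta_against, GroupIndex, and the dict-based
pack indexes are stand-ins, not git data structures); it assumes packs
get a sequence number on arrival, with the main pack as 0.

```python
import bisect


def may_delta_against(delta_seq, base_seq):
    """Cross-pack deltas may only point at a strictly older pack.

    Because sequence numbers only decrease along a chain of bases,
    no chain can return to where it started.
    """
    return base_seq < delta_seq


class GroupIndex:
    """One sorted index over every pack in the group."""

    def __init__(self, pack_indexes):
        # pack_indexes: {pack_name: {object_id: offset}}
        entries = sorted(
            (oid, pack, off)
            for pack, idx in pack_indexes.items()
            for oid, off in idx.items()
        )
        self._ids = [e[0] for e in entries]
        self._locs = [(e[1], e[2]) for e in entries]

    def lookup(self, oid):
        """Binary search once, instead of once per pack index."""
        i = bisect.bisect_left(self._ids, oid)
        if i < len(self._ids) and self._ids[i] == oid:
            return self._locs[i]
        return None
```

The group index would have to be rewritten or extended whenever a pack
arrives, so the lookup win is traded against some extra write cost per
push.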