Shawn Pearce <spea...@spearce.org> writes:
> On Wed, Aug 15, 2012 at 10:42 PM, Junio C Hamano <gits...@pobox.com> wrote:
>> An obvious way to record the "delta chain" is to simply keep the
>> name_hash of each object in the pack, which would need 2 bytes per
>> object in the pack and bloat the pack_idx_entry size from 32
>> bytes to 34 bytes per entry. That way, after your bitmap discovers
>> an object that cannot reuse existing deltas, you can throw it, other
>> such objects with the same name-hash, and then objects that you know
>> will be available to the recipient (you mark the last category of
>> objects as "preferred base"), into the delta_list so that they are
>> close together in the delta window.
>
> Yes, this is one thought I had. Inside of JGit I think the name hash
> is 32 bits, not 16 bits. Storing the name hash into the *.idx file
> means we need to codify what the name hash algorithm is for a given
> *.idx file version, and compatible implementations of Git must use the
> same hash function. Thus far the name hash has been an in-memory
> transient concept that doesn't need to be persisted across runs of the
> packer. Storing it means we have to standardize it.

Let's not go there. We cannot resurrect the name hash from the *.pack
stream, which means index-pack cannot recreate it after receiving
objects over the network. We would instead need to teach index-pack
to observe the delta chains, come up with some "delta chain
identifier" (a unique name to identify what you called a "delta
cluster" in your response) on its own, and give it to each object
when it writes the *.idx file out.
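
To make the idea concrete, here is a rough sketch of how such an
identifier could be derived from nothing but the delta relationships
seen while indexing. It is written in Python for brevity and is not
index-pack code. The function name assign_chain_ids, the delta_base
mapping, and the small-integer identifiers are all hypothetical.
Objects connected through delta bases are merged with a union-find,
and every object in the same cluster gets the same identifier.

```python
def assign_chain_ids(delta_base):
    """Give every object an identifier shared by its delta cluster.

    delta_base is a hypothetical mapping from object name to the name
    of its delta base, or None for an object stored without a delta.
    """
    parent = {}

    def find(x):
        # Locate the representative of x's cluster, compressing the path.
        parent.setdefault(x, x)
        root = x
        while parent[root] != root:
            root = parent[root]
        while parent[x] != root:
            parent[x], x = root, parent[x]
        return root

    # Merge each delta with its base, so a whole chain (or tree) of
    # deltas ends up in one cluster.
    for obj, base in delta_base.items():
        find(obj)
        if base is not None:
            parent[find(obj)] = find(base)

    # Number the clusters in order of first appearance.
    chain_id = {}
    result = {}
    for obj in delta_base:
        root = find(obj)
        if root not in chain_id:
            chain_id[root] = len(chain_id)
        result[obj] = chain_id[root]
    return result
```

With a mapping such as {"a": None, "b": "a", "c": "b", "d": None},
the objects a, b and c would share one identifier and d would get
another. How such an identifier would be encoded in the *.idx file
is left open here.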