On Wed, 14 Aug 2013, Jeff King wrote:
> 1. Is sha1_entry_pos wrong to barf on duplicate items in the index? If
> so, do we want to fix it, or simply retire GIT_USE_LOOKUP?
I'd think that sha1_entry_pos should be more lenient here, especially if
this doesn't compromise overall git behavior.
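To illustrate what "more lenient" could look like, here is a rough sketch.
It is not the actual sha1_entry_pos code: the name lookup_hash and the flat
table of 20-byte entries are assumptions made for the example. A plain
binary search has no need to reject equal neighbours and can return
whichever matching entry it lands on:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20 /* size of a SHA-1 in bytes */

/*
 * Hypothetical helper, not git's sha1_entry_pos(): binary search over
 * `nr` sorted HASH_LEN-byte entries stored back to back in `table`.
 * Returns the index of a matching entry, or -1 if there is none.
 * Duplicate entries are tolerated; any one of them may be returned.
 */
static long lookup_hash(const unsigned char *table, size_t nr,
			const unsigned char *key)
{
	size_t lo = 0, hi = nr;

	assert(key != NULL);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = memcmp(table + mid * HASH_LEN, key, HASH_LEN);

		if (!cmp)
			return (long)mid; /* any match will do */
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}
```

With duplicates in the table the caller still gets a valid position, which
is all an object lookup needs.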
> Related, should we consider duplicate items in a packfile to be a
> bogus packfile (and consequently notice and complain during
> indexing)? I don't think it _hurts_ anything (aside from the assert
> above), though it is of course wasteful.
This should indeed be considered a bogus pack file. But we have a lot
of code to be able to cope with bogus/corrupted pack files already.
Handling this case as well would not hurt.
More importantly we should make sure the code we have doesn't generate
such packs.
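As for noticing duplicates at indexing time, the check itself could be as
simple as the sketch below. This is only an illustration, not code from
index-pack: the name count_duplicate_hashes and the flat table layout are
made up for the example, and it assumes the object names have already been
sorted, as they are in a pack index, so that equal names end up adjacent:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define HASH_LEN 20 /* size of a SHA-1 in bytes */

/*
 * Hypothetical check, not taken from git: walk `nr` sorted
 * HASH_LEN-byte entries in `table` and count how many of them are
 * identical to the entry just before them.
 */
static size_t count_duplicate_hashes(const unsigned char *table, size_t nr)
{
	size_t i, dups = 0;

	for (i = 1; i < nr; i++) {
		int cmp = memcmp(table + (i - 1) * HASH_LEN,
				 table + i * HASH_LEN, HASH_LEN);

		assert(cmp <= 0); /* the caller must pass a sorted table */
		if (!cmp)
			dups++;
	}
	return dups;
}
```

A non-zero result would be the point at which to warn or refuse the pack,
depending on how strict we want to be.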
> 2. How can duplicate entries get into a packfile?
> Git itself should not generate duplicate entries (pack-objects is
> careful to remove duplicates). Since these packs almost certainly
> were pushed by a client, I wondered if "index-pack --fix-thin"
> might accidentally add multiple copies of an object when it is the
> preferred base for multiple objects, but it specifically avoids
> doing so.
It is probably simpler than that. An alternative pack-objects
implementation could stream multiple copies of an object upon a push,
and index-pack on the receiving end would simply store what's been
received to disk as is without a fuss.
> Given the dates on the packs and how rare this is, I'm pretty much
> willing to chalk it up to a random bug (in git or otherwise) that does
> not any longer exist.
Possibly. Given that this does not compromise the validity of the pack,
and a simple repack "fixes" it, I would not worry too much about it.