On Wed, 14 Aug 2013, Jeff King wrote:

>   1. Is sha1_entry_pos wrong to barf on duplicate items in the index? If
>      so, do we want to fix it, or simply retire GIT_USE_LOOKUP?

I'd think that sha1_entry_pos should be more lenient here, especially if 
this doesn't compromise the overall git behavior.
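
To illustrate what "lenient" could mean, here is a minimal sketch in Python. It is not git's sha1_entry_pos, and the name find_entry is made up for this example. It assumes the index is a sorted list of hex object names and uses a plain binary search, which does not care whether neighbouring entries are identical:

```python
from bisect import bisect_left

def find_entry(sorted_names, wanted):
    """Return the index of the first entry equal to `wanted`, or -1.

    Hypothetical helper: `sorted_names` stands in for the sorted object
    names of a pack index. Duplicates are tolerated, not rejected.
    """
    pos = bisect_left(sorted_names, wanted)
    if pos < len(sorted_names) and sorted_names[pos] == wanted:
        return pos
    return -1

# A sorted table in which one object name appears twice.
names = sorted(["1f" * 20, "a0" * 20, "a0" * 20, "ff" * 20])
```

With a duplicated name in the table the lookup still succeeds and simply reports the first of the matching entries.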

>      Related, should we consider duplicate items in a packfile to be a
>      bogus packfile (and consequently notice and complain during
>      indexing)? I don't think it _hurts_ anything (aside from the assert
>      above), though it is of course wasteful.

This should indeed be considered a bogus pack file.  But we have a lot 
of code to be able to cope with bogus/corrupted pack files already.  
Handling this case as well would not hurt.
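
Noticing the condition at indexing time could be as simple as comparing adjacent entries once the object names are sorted. A rough sketch in Python, with a hypothetical find_duplicates helper that is not taken from index-pack:

```python
def find_duplicates(sorted_names):
    """Return each object name that occurs more than once, in sorted order.

    Hypothetical helper: `sorted_names` must already be sorted, as the
    entries of a pack index are, so duplicates are always adjacent.
    """
    dups = []
    for prev, cur in zip(sorted_names, sorted_names[1:]):
        # Report each duplicated name only once, however often it repeats.
        if prev == cur and (not dups or dups[-1] != cur):
            dups.append(cur)
    return dups
```

Whether the caller then warns, dies, or drops the extra copies is a separate policy decision.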

More importantly we should make sure the code we have doesn't generate 
such packs.

>   2. How can duplicate entries get into a packfile?
>      Git itself should not generate duplicate entries (pack-objects is
>      careful to remove duplicates). Since these packs almost certainly
>      were pushed by a client, I wondered if "index-pack --fix-thin"
>      might accidentally add multiple copies of an object when it is the
>      preferred base for multiple objects, but it specifically avoids
>      doing so.

It is probably simpler than that.  An alternative pack-objects 
implementation could stream multiple copies of an object upon a push, 
and index-pack on the receiving end would simply store what's been 
received to disk as is without a fuss.

> Given the dates on the packs and how rare this is, I'm pretty much
> willing to chalk it up to a random bug (in git or otherwise) that does
> not any longer exist.

Possibly.  Given that this does not compromise the validity of the pack, 
and a simple repack "fixes" it, I would not worry too much about it.
