Vegard Nossum <vegard.nos...@oracle.com> writes:
> A closer inspection reveals the problem to really be that this is an
> extremely hot path with more than -- holy cow -- 4,106,756,451
> iterations on the 'packed_git' list for a single 'git fetch' on my
> repository. I'm guessing the patch above just made the inner loop
> ever so slightly slower.
Very plausible, and this ...
> My .git/objects/pack/ has ~2088 files (1042 idx files, 1042 pack files,
> and 4 tmp_pack_* files).
... may explain why nobody else has seen a difference.
Is there a reason why your repository has that many pack files? Is
automatic GC not working for some reason?
"gc" tries to keep the number of packs reasonably low, because
having too many packs hurts performance for several reasons,
including:
* Every object that a pack stores in delta format (i.e. only the
  difference from another object is stored) must eventually be
  based, through its chain of deltas, on an object recorded in
  full format in the same packfile.
* A single packfile records any given object only once, but it is
  normal (and often required because of the point above) for the
  same object to appear in multiple packfiles.
* Locating an object in a single packfile uses a binary search over
  the sorted list of object names in its .idx file, which is
  efficient, but this cost grows linearly with the number of packs
  in your repository.
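
To illustrate the last point, here is a rough Python model of the lookup
cost. It is only a sketch and not Git's actual code: each "pack" is
assumed to be just a sorted list of object names standing in for its
.idx file, and the function names lookup_in_index() and
lookup_in_packs() are made up for this example. It also ignores any
shortcuts a real implementation may take.

```python
def lookup_in_index(sorted_names, name):
    """Binary search one sorted name list; return (found, probes)."""
    lo, hi = 0, len(sorted_names)
    probes = 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1  # one comparison against the index per iteration
        if sorted_names[mid] < name:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_names) and sorted_names[lo] == name
    return found, probes


def lookup_in_packs(packs, name):
    """Try each pack in turn; return (pack number or None, total probes)."""
    total = 0
    for pack_no, sorted_names in enumerate(packs):
        found, probes = lookup_in_index(sorted_names, name)
        total += probes
        if found:
            return pack_no, total
    # An object that is in no pack pays for a search in every pack.
    return None, total


if __name__ == "__main__":
    # Hypothetical data: 4 packs of 8 made-up object names each.
    packs = [[f"{p:02d}{i:02d}" for i in range(8)] for p in range(1, 5)]
    print(lookup_in_packs(packs[:1], "0000"))  # miss with one pack
    print(lookup_in_packs(packs, "0000"))      # miss with four packs
```

Each individual search stays cheap, but a lookup that misses has to
repeat it once per pack, so with a thousand packs the per-object cost
is roughly a thousand binary searches instead of one.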