On Mon, Sep 26, 2016 at 08:33:52AM +0200, Matthieu Moy wrote:
> Junio C Hamano <gits...@pobox.com> writes:
> > I am not opposed to bumping the default to 12 or whatever, but I
> > suspect any lengthening today may need to be accompanied by tool
> > support that finds the set of objects that are reachable from a
> > commit whose names begin with non-unique abbreviations that appear
> > in the commit log message.
> Something much simpler would be to set core.abbrev at clone time,
> depending on the size of the project just cloned. So, when cloning a
> hello-world, we'd keep the 7 but when cloning a big project we'd get a
> larger value.
> This doesn't cover the case of someone growing his own project without
> cloning, and isn't as clever as actually looking for collisions, but it
> would probably provide a sane default in 99% of cases, and wouldn't be
> worse than hardcoding 7 in the remaining 1% of cases.
I think we could easily make this even more dynamic, and just base the
minimum for DEFAULT_ABBREV on the number of objects _currently_ in the
repository, plus some safety factor. We could do this cheaply by just
counting the number of objects in the packs (which we get for free when
we open their pack index). That misses loose objects, but if you have 4
million loose objects you have bigger problems than abbreviation
lengths, I think.
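
To illustrate the counting step, here is a rough sketch in Python rather
than git's C. It relies on the pack index layout: the last entry of the
256-entry fanout table holds the number of objects in that pack, and
version 2 index files put an 8-byte header in front of the table. The
function names are made up for this example, and an object stored in more
than one pack is counted more than once, which seems acceptable for an
estimate.

```python
import glob
import os
import struct

IDX_V2_MAGIC = b"\377tOc"  # marks a version 2 (or later) pack index


def objects_in_idx(path):
    """Return the object count recorded in one pack index file."""
    with open(path, "rb") as f:
        header = f.read(8)
        # Version 1 index files have no header; the fanout table starts at 0.
        fanout_start = 8 if header[:4] == IDX_V2_MAGIC else 0
        # fanout[255] is the total number of objects in the pack.
        f.seek(fanout_start + 255 * 4)
        data = f.read(4)
        if len(data) != 4:
            raise ValueError("truncated pack index: %s" % path)
        (count,) = struct.unpack(">I", data)
    return count


def count_packed_objects(git_dir):
    """Sum the object counts of every pack index under git_dir (hypothetical helper)."""
    pattern = os.path.join(git_dir, "objects", "pack", "*.idx")
    return sum(objects_in_idx(p) for p in glob.glob(pattern))
```

Calling count_packed_objects(".git") in a work tree would give the packed
total without reading any object data; loose objects are ignored, as
described above.
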
OTOH, any scheme that looks at the current repository size will
eventually grow outdated. The safety factor depends on how fast your
repository grows, and how big you expect it to eventually get. Such a
scheme might well have picked 7-character abbreviations for linux.git
back in 2006, and we'd be stuck with those today.
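
To make the "grows outdated" problem concrete, here is a small Python
sketch of one possible rule for turning an object count into a length.
The rule itself is my assumption for illustration, not something git
does: pick the shortest length L with 16**L >= (count * safety_factor)**2,
a birthday-style bound, and never go below the current 7. The name
abbrev_len and the object counts are made up.

```python
FLOOR = 7       # today's hardcoded default
FULL_LEN = 40   # length of a full SHA-1 hex name


def abbrev_len(num_objects, safety_factor=1):
    """Shortest hex length L with 16**L >= (num_objects * safety_factor)**2.

    Assumed rule for illustration only; the result is clamped to
    the range FLOOR..FULL_LEN.
    """
    n = max(1, num_objects * safety_factor)
    length = FLOOR
    while length < FULL_LEN and 16 ** length < n * n:
        length += 1
    return length


# Hypothetical object counts for a project at three points in its life.
for count in (1_000, 100_000, 4_000_000):
    print(count, abbrev_len(count), abbrev_len(count, safety_factor=4))
```

Under this rule a project sized at 100,000 objects gets 9 characters, but
the same project at 4 million objects needs 11 (or 12 with a safety
factor of 4), and the abbreviations already written down at the shorter
length do not get any longer.
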
The idea of a 12-character default is basically that we'd expect decades
or more for even the largest projects to get there, so you err on the
side of future-proofing.