On Thu, May 30, 2013 at 10:00:23PM +0200, Thomas Rast wrote:

> lookup_commit_reference_gently unconditionally parses the object given
> to it.  This slows down git-describe a lot if you have a repository
> with large tagged blobs in it: parse_object() will read the entire
> blob and verify that its sha1 matches, only to then throw it away.
> Speed it up by checking the type with sha1_object_info() prior to
> unpacking.

This would speed up the case where we do not end up looking at the
object at all, but it will slow down the (presumably common) case where
we will in fact find a commit and end up parsing the object anyway.

Have you measured the impact of this on normal operations? During a
traversal, we spend a measurable amount of time looking up commits in
packfiles, and this would presumably double it.

This is not the first time I have seen this tradeoff in git.  It would
be nice if our object access were structured to examine objects
incrementally (i.e., keep the result of the packfile index lookup or
the partially unpacked loose object header, and then reuse it for the
next step of actually getting the contents).
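
As a toy sketch of that idea (this is not git's actual API; every name
below, such as toy_open() and toy_read(), is hypothetical, and the
"store" is just an in-memory array), the lookup result can be kept in a
handle so that the type check and the content read share one search:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy object store; it only illustrates the call shape. */
enum toy_type { TOY_NONE, TOY_COMMIT, TOY_BLOB };

struct toy_entry {
	const char *id;
	enum toy_type type;
	const char *data;
};

static const struct toy_entry toy_store[] = {
	{ "c0ffee", TOY_COMMIT, "commit body" },
	{ "b10b", TOY_BLOB, "large blob body" },
};

/* Counts index searches, standing in for the expensive lookup step. */
static int toy_lookups;

/* Remembers where the object was found so later steps can reuse it. */
struct toy_handle {
	const struct toy_entry *entry;
};

/* Step 1: locate the object once. Returns 0 on success, -1 if missing. */
static int toy_open(const char *id, struct toy_handle *h)
{
	size_t i;

	toy_lookups++;
	for (i = 0; i < sizeof(toy_store) / sizeof(toy_store[0]); i++) {
		if (!strcmp(toy_store[i].id, id)) {
			h->entry = &toy_store[i];
			return 0;
		}
	}
	h->entry = NULL;
	return -1;
}

/* Step 2: cheap type check from the handle; no second search. */
static enum toy_type toy_type_of(const struct toy_handle *h)
{
	return h->entry ? h->entry->type : TOY_NONE;
}

/* Step 3: fetch the contents, again reusing the handle. */
static const char *toy_read(const struct toy_handle *h)
{
	assert(h->entry);
	return h->entry->data;
}

/* Returns the commit contents, or NULL if missing or not a commit. */
static const char *toy_commit_gently(const char *id)
{
	struct toy_handle h;

	if (toy_open(id, &h) < 0)
		return NULL;
	if (toy_type_of(&h) != TOY_COMMIT)
		return NULL; /* the blob contents are never read */
	return toy_read(&h);
}
```

In this sketch both the commit case and the tagged-blob case cost
exactly one search: the blob is rejected before its contents are
touched, and the commit is read without looking it up a second time.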

