Junio C Hamano <gits...@pobox.com> writes:

> Thomas Rast <tr...@inf.ethz.ch> writes:
>> So we take a slightly different approach, and trade some memory for
>> better cache locality.
> Interesting.  It feels somewhat bait-and-switch to reveal that the
> above "some" turns out to be "double" later, but the resulting code
> does not look too bad, and the numbers do not look insignificant.

Oh, that wasn't the intent.  I was too lazy to gather some memory
numbers, so here's an estimate on the local effect and some measurements
on the global one.

struct object is at least 24 bytes (flags etc. and sha1).  We grow the
hash by 2x whenever it reaches 50% load, so it is always at least 25%
loaded.

A 25% loaded hash table used to consist of 75% empty slots (an 8-byte
pointer each) and 25% slots pointing to a struct object (32 bytes: the
pointer plus the object), for 14 bytes per average slot.  Now it's 22
bytes (one more unsigned long) per slot, i.e., an increase of just under
60% (8/14) for the data managed by the hash table.
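
To spell out the arithmetic behind that per-slot estimate, here is a
small Python sketch.  It assumes a 64-bit platform (8-byte pointers and
8-byte unsigned long); the constant and function names are made up for
illustration and do not appear in the patch.

```python
POINTER = 8          # bytes per slot pointer (assumes 64-bit)
STRUCT_OBJECT = 24   # lower bound for struct object, as estimated above
EXTRA = 8            # the one additional unsigned long per slot (assumes 64-bit)
LOAD = 0.25          # worst-case load, right after the table has doubled


def avg_bytes_per_slot(slot_bytes, object_bytes, load=LOAD):
    """Every slot costs slot_bytes; a fraction `load` of the slots
    additionally refers to an object of object_bytes."""
    return slot_bytes + load * object_bytes


old = avg_bytes_per_slot(POINTER, STRUCT_OBJECT)          # 8 + 0.25 * 24
new = avg_bytes_per_slot(POINTER + EXTRA, STRUCT_OBJECT)  # 16 + 0.25 * 24
increase = (new - old) / old

print(f"old={old:.0f} new={new:.0f} increase={increase:.0%}")
```

This gives 14 and 22 bytes per average slot, and an increase of about
57%, in line with the roughly 60% quoted above.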

But that's using the crudest estimates I could think of.  If we assume
that an average blob and tree is at least as big as the smallest
possible commit, we'd guess that objects are at least ~240 bytes (this
is still somewhat of an estimate and assumes that you don't go and
handcraft commits with single-digit timestamps).  So the numbers above
go up by 25% * 240 = 60 bytes per average slot (74 vs. 82 bytes), and
work out to an overall increase of about 11%.
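
The same back-of-the-envelope calculation with the ~240 bytes of object
data included, again as a Python sketch.  The 8-byte sizes assume a
64-bit platform, and all names are hypothetical helpers for this
estimate only.

```python
POINTER = 8          # bytes per slot pointer (assumes 64-bit)
EXTRA = 8            # the one additional unsigned long per slot (assumes 64-bit)
STRUCT_OBJECT = 24   # lower bound for struct object
OBJECT_DATA = 240    # guessed lower bound for the object contents
LOAD = 0.25          # worst-case load, right after the table has doubled


def avg_bytes_per_slot(slot_bytes, object_bytes, load=LOAD):
    """Every slot costs slot_bytes; a fraction `load` of the slots
    additionally refers to object_bytes of object memory."""
    return slot_bytes + load * object_bytes


per_object = STRUCT_OBJECT + OBJECT_DATA
old_total = avg_bytes_per_slot(POINTER, per_object)          # 8 + 0.25 * 264
new_total = avg_bytes_per_slot(POINTER + EXTRA, per_object)  # 16 + 0.25 * 264
overall = (new_total - old_total) / old_total

print(f"old={old_total:.0f} new={new_total:.0f} increase={overall:.0%}")
```

That is 74 vs. 82 bytes per average slot, i.e. about 11%.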

Here are some real numbers from /usr/bin/time git rev-list --all --objects:


  2.30user 0.02system 0:02.33elapsed 99%CPU (0avgtext+0avgdata 
  0inputs+0outputs (0major+17844minor)pagefaults 0swaps


  2.18user 0.02system 0:02.21elapsed 99%CPU (0avgtext+0avgdata 
  0inputs+0outputs (0major+18202minor)pagefaults 0swaps

So that would be about 14MB or 5.7% of extra memory.

Thomas Rast