On Fri, Aug 16, 2013 at 11:26:28AM +0200, Thomas Rast wrote:

> > I trust the laptop numbers less because it has far more thermal (and
> > thus throttling) issues, but the runs do show a significant difference,
> > though less than you claimed.
> Well, as I feared... another run on the same laptop:
> Test                               HEAD                next                   
> ------------------------------------------------------------------------------
> 0001.1: rev-list --all             6.41(6.14+0.24)     6.36(6.10+0.23) -0.9%* 
> 0001.2: rev-list --all --objects   54.60(53.84+0.55)   54.23(53.50+0.53) -0.7%
> ------------------------------------------------------------------------------
> Significance hints:  '.' 0.1  '*' 0.05  '**' 0.01  '***' 0.001

Yeah, I get similar results on my i7-840QM. The new code is sometimes
faster and sometimes slower, never wins by more than 1%, and is well
within the run-to-run noise (which seems to vary by up to 5%).

I think the reason is that having a "2-deep" LRU cache helps less than
one might hope. There are a lot of objects in the hash table, but only a
small fraction are "hot" at any time; specifically, those objects that
are in the currently-examined tree. When we hit the "X" object from
Stefan's diagrams, we know that it is hot. But the "B" object may or may
not be hot. If it is, Stefan's optimization helps; if not, it does

But statistically it probably isn't hot. There are ~3 million objects in
the kernel repo, but only ~40,000 tree entries. So we would naively
expect it to have an effect in only ~1% of cases (I am not sure if that
is accurate, though, as "hot" items within the same collision sequence
would tend to float to the front of the chain).  I suspect any savings
in those 1% are eaten up by the extra swapping, as well as the fact that
the "hot" B actually moves to the middle.

Another way to think about it is that if B is hot, it will then get
looked up again and get shifted to the front of the chain, pushing X
back. So this is really only a win when two hot entries, X and B,
collide and trade the front position back and forth.

In that case, it seems like we would want to move B to the second
position. This lets the 2-hot case just keep swapping the hot items back
and forth as quickly as possible. To the detriment of C, D, etc, which
never get promoted. But the likelihood of having _3_ hot items in a
collision chain is even less than 2.

That's all vague and hand-wavy intuition of course, and might be
completely wrong. But it's at least a working theory that explains the
experimental results.

To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Reply via email to