Jeff King <p...@peff.net> writes:
> On Tue, Mar 18, 2014 at 12:00:48PM +0700, Duy Nguyen wrote:
>> On Tue, Mar 18, 2014 at 11:50 AM, Jeff King <p...@peff.net> wrote:
>> > On Sun, Mar 16, 2014 at 08:35:04PM +0700, Nguyễn Thái Ngọc Duy wrote:
>> >> As explained in the previous commit, the current aggressive settings
>> >> --depth=250 --window=250 can slow down repository access
>> >> significantly. Since people usually work on recent history only, we
>> >> could keep recent history more loosely packed, so that repo access
>> >> is fast most of the time while the pack file remains small.
>> > One thing I have not seen is real-world timings showing the slowdown
>> > based on --depth. Did I miss them, or are we just making assumptions
>> > based on one old case from 2009 (that, AFAIK, does not have real numbers,
>> > just speculation)? Has anyone measured the effect of bumping the delta
>> > cache size (and its hash implementation)?
>> David tested it with git-blame. I should probably run some tests
>> too (I don't remember if I tested some operations last time).
> Ah, thanks. I do remember that thread now.
> It looks like David's last word is that he gets a significant
> performance improvement from bumping the delta base cache size (and the
> number of buckets).

Increasing the number of buckets had comparatively minor effects
(that was the suggestion I started with) and actually _degraded_
performance rather soon. The delta base cache size was much more
noticeable. I had prepared a patch seriously increasing it. The reason
I have not submitted it yet is that I have not found a compelling
real-world test case _apart_ from the fast git-blame, which is still
missing an implementation of the -M and -C options.
There should be other commands digging through large amounts of old
history, but I did not really find anything that benchmarks
convincingly. Either most stuff is inefficient anyway, or the access
order is better behaved, causing fewer unwanted cache flushes.
Access order in the optimized git-blame case is driven by a priority
queue keyed on reverse commit time, leading to a breadth-first
strategy. It still beats unsorted access solidly in its timing. I
don't think I compared depth-first results (inverting the priority
queue's sorting condition) with regard to cache behavior, but
depth-first is bad for interactive use as it tends to leave some
recent history unblamed for a long time while digging up stuff in the
remote past.
Moderate cache size increases seem like a better strategy, and the
default size of 16M does not make a lot of sense on modern computers,
particularly since history digging is rarely competing with other
memory-intensive operations at the same time.
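For reference, the knob in question is core.deltaBaseCacheLimit, so
trying a different value is a one-line config change. A minimal sketch
(the throwaway repo is just so the command has somewhere to write the
setting; the 64m value is the one floated below, not a recommendation):

```shell
# Assumed demo setup: a throwaway repository, only so that
# "git config" has a local config file to write to.
cd "$(mktemp -d)" && git init -q .

# core.deltaBaseCacheLimit caps the memory used to cache delta base
# objects during pack access; 16m is the default under discussion.
git config core.deltaBaseCacheLimit 64m

# Reading it back confirms the stored value.
git config core.deltaBaseCacheLimit
```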
> And that matches the timings I just did. I suspect there are still
> pathological cases that could behave worse, but it really sounds like
> we should be looking into improving that cache as a first step.
I can put up a patch. My git-blame experiments used 128M, and the patch
proposes a more conservative 64M. I have not actually run experiments
with the 64M setting, though. The current default is 16M.
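For anyone who wants to reproduce this kind of comparison, here is a
sketch of a self-contained benchmark (an assumed workflow, not the exact
setup used in my experiments): it manufactures a small throwaway repo
with some history, then times git-blame under each candidate limit using
the per-invocation -c override. Against a real repository with deep
history the differences would of course be far more pronounced.

```shell
# Build a throwaway repository with a bit of linear history.
cd "$(mktemp -d)" && git init -q .
git config user.name test
git config user.email test@example.com

# 50 commits, each appending one line to the same file.
for i in $(seq 1 50); do
    echo "line $i" >> file.txt
    git add file.txt
    git commit -qm "commit $i"
done

# Time git-blame under each delta base cache limit; -c overrides the
# config for a single invocation without touching any config file.
for limit in 16m 64m 128m; do
    echo "core.deltaBaseCacheLimit=$limit"
    time git -c core.deltaBaseCacheLimit="$limit" blame -- file.txt >/dev/null
done
```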