Re: [PATCH 5/7] delta_base_cache: drop special treatment of blobs

2016-08-23 Thread Junio C Hamano
Jeff King  writes:

> Let's run the same numbers without caring about object type
> at all (i.e., one LRU list, and always evicting whatever is
> at the head, regardless of type).
> ...
> So it seems like a clear winner, and that's what this patch
> implements.
>
> Signed-off-by: Jeff King 
> ---

Nice work.  You make your readers expect some clever data structures
that may perform better than the obvious two-separate-list approach
and end up using the simplest way.  Quite nice.



>  sha1_file.c | 8 
>  1 file changed, 8 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index c02e785..33564d6 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -2175,14 +2175,6 @@ static void add_delta_base_cache(struct packed_git *p, 
> off_t base_offset,
>   list_entry(lru, struct delta_base_cache_entry, lru);
>   if (delta_base_cached <= delta_base_cache_limit)
>   break;
> - if (f->type == OBJ_BLOB)
> - release_delta_base_cache(f);
> - }
> - list_for_each(lru, &delta_base_cache_lru) {
> - struct delta_base_cache_entry *f =
> - list_entry(lru, struct delta_base_cache_entry, lru);
> - if (delta_base_cached <= delta_base_cache_limit)
> - break;
>   release_delta_base_cache(f);
>   }
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/7] delta_base_cache: drop special treatment of blobs

2016-08-22 Thread Jeff King
When the delta base cache runs out of allowed memory, it has
to drop entries. It does so by walking an LRU list, dropping
objects until we are under the memory limit. But we actually
walk the list twice: once to drop blobs, and then again to
drop other objects (which are generally trees). This comes
from 18bdec1 (Limit the size of the new delta_base_cache,
2007-03-19).

This performs poorly as the number of entries grows, because
any time dropping blobs does not satisfy the limit, we have
to walk the _entire_ list, trees included, looking for blobs
to drop, before starting to drop any trees.

It's not generally a problem now, as the cache is limited to
only 256 entries. But as we could benefit from increasing
that in a future patch, it's worth looking at how it
performs as the cache size grows. And the answer is "not
well".

The table below shows times for various operations with
different values of MAX_DELTA_CACHE (which is not a run-time
knob; I recompiled with -DMAX_DELTA_CACHE=$n for each).

I chose "git log --raw" ("log-raw" in the table) because it
will access all of the trees, but no blobs at all (so in a
sense it is a worst case for this problem, because we will
always walk over the entire list of trees once before
realizing there are no blobs to drop). This is also
representative of other tree-only operations like "rev-list
--objects" and "git log -- ".

I also timed "git log -Sfoo --raw" ("log-S" in the table).
It similarly accesses all of the trees, but also the blobs
for each commit. It's representative of "git log -p", though
it emphasizes the cost of blob access more, as "-S" is
cheaper than computing an actual blob diff.

All timings are best-of-3 wall-clock times (though they all
were CPU bound, so the user CPU times are similar). The
repositories were fully packed with --depth=50, and the
default core.deltaBaseCacheLimit of 96M was in effect.  The
current value of MAX_DELTA_CACHE is 256, so I started there
and worked up by factors of 2.

First, here are values for git.git (the asterisk signals the
fastest run for each operation):

MAX_DELTA_CACHElog-raw   log-S
---   --
256   0m02.212s0m12.634s
512   0m02.136s*   0m10.614s
   1024   0m02.156s0m08.614s
   2048   0m02.208s0m07.062s
   4096   0m02.190s0m06.484s*
   8192   0m02.176s0m07.635s
  16384   0m02.913s0m19.845s
  32768   0m03.617s1m05.507s
  65536   0m04.031s1m18.488s

You can see that for the tree-only log-raw case, we don't
actually benefit that much as the cache grows (all the
differences up through 8192 are basically just noise; this
is probably because we don't actually have that many
distinct trees in git.git). But for log-S, we get a definite
speed improvement as the cache grows, but the improvements
are lost as cache size grows and the linear LRU management
starts to dominate.

Here's the same thing run against linux.git:

MAX_DELTA_CACHElog-raw   log-S
---   ---
256   0m40.987s 5m13.216s
512   0m37.949s 5m03.243s
   1024   0m35.977s 4m50.580s
   2048   0m33.855s 4m39.818s
   4096   0m32.913s 4m47.299s*
   8192   0m32.176s*5m14.650s
  16384   0m32.185s 6m31.625s
  32768   0m38.056s 9m31.136s
  65536   1m30.518s17m38.549s

The pattern is similar, though the effect in log-raw is more
pronounced here. The times dip down in the middle, and then
go back up as we keep growing.

So we know there's a problem. What's the solution?

The obvious one is to improve the data structure to avoid
walking over tree entries during the looking-for-blobs
traversal. We can do this by keeping _two_ LRU lists: one
for blobs, and one for other objects. We drop items from the
blob LRU first, and then from the tree LRU (if necessary).

Here's git.git using that strategy:

MAX_DELTA_CACHElog-raw  log-S
---   -   --
256   0m02.264s   0m12.830s
512   0m02.201s   0m10.771s
   1024   0m02.181s   0m08.593s
   2048   0m02.205s   0m07.116s
   4096   0m02.158s   0m06.537s*
   8192   0m02.213s   0m07.246s
  16384   0m02.155s*  0m10.975s
  32768   0m02.159s   0m16.047s
  65536   0m02.181s   0m16.992s

The upswing on log-raw is gone completely. But log-S still
has it (albeit much better than without this strategy).
Let's see what linux.git shows:

MAX_DELTA_CACHElog-raw   log-S
---   --
256   0m42.519s5m14.654s
512   0m39.106s5m04.708s
   1024   0m36.802s4m51.454s
   2048   0m34.685s4m39.378s*