Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps

2014-03-27 Thread Junio C Hamano
Jeff King  writes:

> But for a small fetch...
>
>   5311.3: server   (1 days)0.20(0.17+0.03)   4.39(4.03+6.59) +2095.0%
>   5311.4: size (1 days)  57.2K 59.5K +4.1%
>   5311.5: client   (1 days)0.08(0.08+0.00)   0.08(0.08+0.00) +0.0%

Nice ;-)

> So this is a dead end, but I think it was good to double-check that.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps

2014-03-27 Thread Siddharth Agarwal

On 03/26/2014 03:40 PM, Siddharth Agarwal wrote:

On 03/26/2014 12:22 AM, Jeff King wrote:

[tl;dr the patch is the same as before, but there is a script to measure
its effects; please try it out on your repos]


Here are the numbers from another, much larger repo:

Test origin HEAD
-----------------------------------------------------------------------
5311.3: server   (1 days)11.72(14.02+1.44) 6.33(5.87+0.75) -46.0%
5311.4: size (1 days) 19.4M 15.3M -21.3%
5311.5: client   (1 days)6.99(8.06+1.50) 10.60(10.34+1.83) +51.6%
5311.7: server   (2 days)39.82(40.56+3.05) 33.94(31.21+3.10) -14.8%
5311.8: size (2 days) 26.5M 22.8M -14.1%
5311.9: client   (2 days)15.81(16.48+4.29) 19.20(18.20+4.19) +21.4%
5311.11: server   (4 days)   37.21(39.75+3.73) 28.01(26.97+1.75) -24.7%
5311.12: size (4 days) 37.5M 34.1M -9.0%
5311.13: client   (4 days)   24.24(26.43+6.76) 31.14(30.75+5.96) +28.5%
5311.15: server   (8 days)   33.74(40.47+3.39) 22.42(22.25+1.51) -33.6%
5311.16: size (8 days) 81.9M 78.4M -4.2%
5311.17: client   (8 days)   81.96(91.07+22.35) 88.03(89.45+17.11) +7.4%
5311.19: server  (16 days)   30.87(34.57+2.78) 27.03(25.93+2.73) -12.4%
5311.20: size(16 days) 153.2M 150.9M -1.5%
5311.21: client  (16 days)   169.99(183.57+46.96) 177.12(177.17+37.93) +4.2%
5311.23: server  (32 days)   51.00(75.49+4.62) 19.52(19.28+1.93) -61.7%
5311.24: size(32 days) 279.4M 283.0M +1.3%
5311.25: client  (32 days)   272.43(296.17+76.48) 284.75(285.98+63.19) +4.5%
5311.27: server  (64 days)   51.73(97.88+6.40) 37.32(32.63+5.05) -27.9%
5311.28: size(64 days) 1.7G  1.8G +5.0%
5311.29: client  (64 days)   2600.42(2751.56+718.10) 2429.06(2501.29+651.56) -6.6%
5311.31: server (128 days)   51.33(95.33+6.91) 37.73(32.98+5.09) -26.5%
5311.32: size   (128 days) 1.7G  1.8G +5.0%
5311.33: client (128 days)   2595.68(2739.45+729.07) 2386.99(2524.54+583.07) -8.0%




Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps

2014-03-26 Thread Jeff King
On Wed, Mar 26, 2014 at 02:13:00PM -0400, Jeff King wrote:

> So I think the next steps are probably:
> 
>   1. Measure the "all objects are preferred bases" approach and confirm
>  that it is bad.

Below is a very rough patch to accomplish this. It just traverses the
"have" bitmap and adds every object with the "exclude" flag. The result
is as comically bad as I expected.

For a big fetch, it seems like it's working (numbers against v1.9.0):

  5311.31: server (128 days)   4.49(7.35+0.23)   4.98(6.82+3.31) +10.9%
  5311.32: size   (128 days) 25.8M 32.0M +24.2%
  5311.33: client (128 days)   7.17(7.38+0.20)   7.33(7.90+0.20) +2.2%

A modest increase in CPU time, and we get back most of our size
(remember that our "bad" case here is ~80M).

But for a small fetch...

  5311.3: server   (1 days)0.20(0.17+0.03)   4.39(4.03+6.59) +2095.0%
  5311.4: size (1 days)  57.2K 59.5K +4.1%
  5311.5: client   (1 days)0.08(0.08+0.00)   0.08(0.08+0.00) +0.0%

Yikes. Besides spending lots of CPU on handling the enlarged object
list, notice that we still only dropped the size in the 128-day case to
32M. Which is almost exactly what the earlier "reuse on-disk deltas"
patch achieved.

What I think is happening is that we manage to reuse those on-disk
deltas (since they are now preferred bases, and we know we can). But we
never actually come up with any _new_ deltas, because the search window
is overwhelmed with candidates. So not only do we waste a huge amount of
CPU, but we just end up at the same not-quite-optimal result as before.

So this is a dead end, but I think it was good to double-check that.

The patch below is messy and would probably be better split into a few
patches, but I don't expect anyone to apply it (or even read it,
really). It's just for reference.

---
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0ee5f1f..1a5d401 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1026,7 +1026,7 @@ static int add_object_entry_from_bitmap(const unsigned char *sha1,
if (have_duplicate_entry(sha1, 0, &index_pos))
return 0;
 
-   create_object_entry(sha1, type, name_hash, 0, 0, index_pos, pack, offset);
+   create_object_entry(sha1, type, name_hash, flags, 0, index_pos, pack, offset);
 
display_progress(progress_state, to_pack.nr_objects);
return 1;
@@ -2436,6 +2436,7 @@ static int get_object_list_from_bitmap(struct rev_info *revs)
}
 
traverse_bitmap_commit_list(&add_object_entry_from_bitmap);
+   bitmap_have_foreach(&add_object_entry_from_bitmap);
return 0;
 }
 
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 1bae7e8..f4e30f5 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -605,6 +605,7 @@ static void show_objects_for_type(
struct bitmap *objects,
struct ewah_bitmap *type_filter,
enum object_type object_type,
+   int flags,
show_reachable_fn show_reach)
 {
size_t pos = 0, i = 0;
@@ -613,9 +614,6 @@ static void show_objects_for_type(
struct ewah_iterator it;
eword_t filter;
 
-   if (bitmap_git.reuse_objects == bitmap_git.pack->num_objects)
-   return;
-
ewah_iterator_init(&it, type_filter);
 
while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) {
@@ -640,7 +638,7 @@ static void show_objects_for_type(
if (bitmap_git.hashes)
hash = ntohl(bitmap_git.hashes[entry->nr]);
 
-   show_reach(sha1, object_type, 0, hash, bitmap_git.pack, entry->offset);
+   show_reach(sha1, object_type, flags, hash, bitmap_git.pack, entry->offset);
}
 
pos += BITS_IN_WORD;
@@ -816,14 +814,17 @@ void traverse_bitmap_commit_list(show_reachable_fn show_reachable)
 {
assert(bitmap_git.result);
 
+   if (bitmap_git.reuse_objects == bitmap_git.pack->num_objects)
+   return;
+
show_objects_for_type(bitmap_git.result, bitmap_git.commits,
-   OBJ_COMMIT, show_reachable);
+   OBJ_COMMIT, 0, show_reachable);
show_objects_for_type(bitmap_git.result, bitmap_git.trees,
-   OBJ_TREE, show_reachable);
+   OBJ_TREE, 0, show_reachable);
show_objects_for_type(bitmap_git.result, bitmap_git.blobs,
-   OBJ_BLOB, show_reachable);
+   OBJ_BLOB, 0, show_reachable);
show_objects_for_type(bitmap_git.result, bitmap_git.tags,
-   OBJ_TAG, show_reachable);
+   OBJ_TAG, 0, show_reachable);
 
show_extended_objects(bitmap_git.result, show_reachable);
 
@@ -1090,3 +1091,18 @@ int bitmap_have(const unsigned char *sha1)
 
return bitmap_get(bitmap_git.haves, pos);
 }
+
+void bitmap_have_foreach(show_reachable_fn show_reachable)
+{
+   if (!bitmap_git.haves)
+   return;
+
+   show_object

Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps

2014-03-26 Thread Siddharth Agarwal

On 03/26/2014 12:22 AM, Jeff King wrote:

[tl;dr the patch is the same as before, but there is a script to measure
its effects; please try it out on your repos]


Here are results from one of our repos:

Test origin HEAD
-
5311.3: server   (1 days)1.77(2.49+0.15) 0.76(0.65+0.12) -57.1%
5311.4: size (1 days)   7.8M3.1M -60.2%
5311.5: client   (1 days)0.56(0.48+0.03) 1.18(0.97+0.05) +110.7%
5311.7: server   (2 days)2.79(3.68+0.25) 0.77(0.65+0.14) -72.4%
5311.8: size (2 days)  24.2M6.8M -71.8%
5311.9: client   (2 days)1.14(0.95+0.10) 2.72(2.33+0.13) +138.6%
5311.11: server   (4 days)   3.70(4.76+0.28) 0.84(0.72+0.14) -77.3%
5311.12: size (4 days) 51.2M   18.9M -63.0%
5311.13: client   (4 days)   2.29(2.02+0.29) 4.58(3.91+0.30) +100.0%
5311.15: server   (8 days)   5.16(7.48+0.38) 1.20(1.18+0.21) -76.7%
5311.16: size (8 days) 78.3M   42.6M -45.5%
5311.17: client   (8 days)   3.73(3.67+0.40) 6.59(5.95+0.51) +76.7%
5311.19: server  (16 days)   6.48(10.27+0.52)1.60(1.85+0.27) -75.3%
5311.20: size(16 days) 97.5M   64.0M -34.4%
5311.21: client  (16 days)   5.31(5.76+0.57) 8.99(8.61+0.68) +69.3%
5311.23: server  (32 days)   8.56(14.00+0.67)2.31(2.91+0.37) -73.0%
5311.24: size(32 days)124.7M   92.0M -26.2%
5311.25: client  (32 days)   7.69(9.20+0.76) 12.80(13.07+0.91) +66.4%
5311.27: server  (64 days)   37.47(45.48+1.55)   29.75(29.58+0.96) -20.6%
5311.28: size(64 days)245.5M  207.1M -15.6%
5311.29: client  (64 days)   15.11(18.26+1.58)   25.02(25.90+2.03) +65.6%
5311.31: server (128 days)   43.85(57.02+2.57)   29.88(29.54+1.35) -31.9%
5311.32: size   (128 days)507.3M  461.6M -9.0%
5311.33: client (128 days)   27.13(31.54+3.32)   37.76(39.49+3.16) +39.2%

I'm currently running tests on a much larger repo than that. The full 
perf run will take several hours on that, so the soonest I can likely 
publish results for that is tomorrow.



Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps

2014-03-26 Thread Jeff King
On Wed, Mar 26, 2014 at 03:31:41PM -0700, Junio C Hamano wrote:

> > I think we could still add the objects from the tip of the client's HAVE
> > list.
> 
> That should make the result at least on par with the non-bitmap case,
> right?

That's my expectation.
> 
> >   2. Measure the "reused deltas become preferred bases" approach. I
> >  expect the resulting size to be slightly better than what I have
> >  now, but not as good as v1.9.0's size (but taking less CPU time).
> 
> Do you mean "the bases for reused deltas become preferred bases, so
> that we can deltify more objects off of them"?

Exactly.

I'll update more when I have results for some of these. :)

-Peff


Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps

2014-03-26 Thread Junio C Hamano
Jeff King  writes:

> Just looking at the 128-day case again, using bitmaps increased our
> server CPU time _and_ made a much bigger pack. This series not only
> fixes the CPU time regression, but it also drops the server CPU time to
> almost nothing. That's a nice improvement, and it makes perfect sense:
> we are reusing on-disk deltas instead of on-the-fly computing deltas
> against the preferred bases.

True.

> I think we could still add the objects from the tip of the client's HAVE
> list.

That should make the result at least on par with the non-bitmap case,
right?

> This patch would still be a CPU win on top of that, because it
> would reduce the number of objects which need a delta search in the
> first place.

Yes.

> So I think the next steps are probably:
>
>   1. Measure the "all objects are preferred bases" approach and confirm
>  that it is bad.

;-)

>   2. Measure the "reused deltas become preferred bases" approach. I
>  expect the resulting size to be slightly better than what I have
>  now, but not as good as v1.9.0's size (but taking less CPU time).

Do you mean "the bases for reused deltas become preferred bases, so
that we can deltify more objects off of them"?

>   3. Measure the "figure out boundaries and add them as preferred bases,
>  like we do without bitmaps" approach. I expect this to behave
>  similarly to v1.9.0.

Yes.

>   4. Combine (2) and (3) and measure them. I'm _hoping_ this will give
>  us the best of both worlds, but I still want to do the individual
>  measurements so we can see where any improvement is coming from.

Sensible.  Thanks.


Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps

2014-03-26 Thread Jeff King
On Wed, Mar 26, 2014 at 10:33:41AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >   2. When considering whether a delta can be reused, check the bitmaps
> >  to see if the client has the base. If so, allow reuse.
> > ...
> > The implementation I'm including here is the one I've shown before,
> > which does (2). Part of the reason that I'm reposting it before looking
> > further into these options is that I've written a t/perf script that can
> > help with the analysis.
> 
> Conceptually, that feels like a natural extension for the "thin
> pack" transfer.  I wouldn't foresee a correctness issue, as long as
> the fetcher runs "index-pack --fix-thin" at their end.

Right, this is only enabled when --thin is negotiated during the
protocol transfer. I don't think there's any correctness problem here.
But I want to make sure we are doing the fastest thing.

Here's another set of results comparing three runs: v1.9.0 (with no
bitmaps), origin/master (bitmaps), and this series. Apologies for the
long lines.

  Test v1.9.0            origin                    HEAD
  ---------------------------------------------------------------------------
  5311.3: server   (1 days)    0.20(0.17+0.02)   0.29(0.34+0.04) +45.0%    0.14(0.12+0.02) -30.0%
  5311.4: size     (1 days)    56.2K             1.0M +1694.4%             59.5K +5.9%
  5311.5: client   (1 days)    0.08(0.07+0.00)   0.03(0.04+0.00) -62.5%    0.08(0.07+0.00) +0.0%
  5311.7: server   (2 days)    0.06(0.06+0.00)   0.28(0.33+0.04) +366.7%   0.14(0.11+0.02) +133.3%
  5311.8: size     (2 days)    56.2K             1.0M +1694.4%             59.5K +5.9%
  5311.9: client   (2 days)    0.09(0.08+0.00)   0.03(0.04+0.00) -66.7%    0.09(0.08+0.00) +0.0%
  5311.11: server  (4 days)    0.21(0.18+0.02)   0.29(0.33+0.03) +38.1%    0.14(0.11+0.02) -33.3%
  5311.12: size    (4 days)    64.2K             1.1M +1536.6%             67.9K +5.8%
  5311.13: client  (4 days)    0.08(0.08+0.00)   0.04(0.03+0.01) -50.0%    0.09(0.07+0.01) +12.5%
  5311.15: server  (8 days)    0.22(0.21+0.02)   0.36(0.47+0.03) +63.6%    0.14(0.11+0.02) -36.4%
  5311.16: size    (8 days)    125.7K            1.5M +1100.8%             130.1K +3.5%
  5311.17: client  (8 days)    0.11(0.10+0.00)   0.05(0.05+0.00) -54.5%    0.13(0.11+0.01) +18.2%
  5311.19: server  (16 days)   0.26(0.26+0.03)   0.76(1.20+0.04) +192.3%   0.25(0.21+0.04) -3.8%
  5311.20: size    (16 days)   358.6K            3.7M +931.7%              359.8K +0.3%
  5311.21: client  (16 days)   0.28(0.27+0.01)   0.13(0.15+0.00) -53.6%    0.30(0.28+0.02) +7.1%
  5311.23: server  (32 days)   0.44(0.54+0.07)   1.39(2.54+0.04) +215.9%   0.28(0.23+0.04) -36.4%
  5311.24: size    (32 days)   1.1M              8.6M +652.4%              2.1M +83.9%
  5311.25: client  (32 days)   0.66(0.64+0.02)   0.32(0.38+0.01) -51.5%    0.69(0.67+0.04) +4.5%
  5311.27: server  (64 days)   2.78(4.02+0.17)   7.02(15.59+0.16) +152.5%  0.43(0.37+0.07) -84.5%
  5311.28: size    (64 days)   18.7M             68.3M +265.5%             24.5M +31.3%
  5311.29: client  (64 days)   6.25(6.29+0.16)   3.21(4.76+0.14) -48.6%    6.27(6.46+0.18) +0.3%
  5311.31: server  (128 days)  4.48(7.32+0.21)   7.56(16.60+0.16) +68.8%   0.51(0.45+0.10) -88.6%
  5311.32: size    (128 days)  25.8M             83.1M +222.3%             35.9M +39.2%
  5311.33: client  (128 days)  7.32(7.58+0.17)   3.94(5.87+0.18) -46.2%    7.15(7.80+0.20) -2.3%

My previous results showed that this series was an improvement over
straight bitmaps. But what it didn't show is that bitmaps actually
provide a regression for some fetches (I think because we do not find
the boundary commits at all, and do not have any preferred bases).

Just looking at the 128-day case again, using bitmaps increased our
server CPU time _and_ made a much bigger pack. This series not only
fixes the CPU time regression, but it also drops the server CPU time to
almost nothing. That's a nice improvement, and it makes perfect sense:
we are reusing on-disk deltas instead of on-the-fly computing deltas
against the preferred bases.

And it does help with the size regression, though the end result is
still a bit worse than v1.9.0. I'm not exactly sure what's going on
here. My guess is that we have objects to send that are not deltas on
disk (probably because they are at the tip of history and other things
are deltas against them). We would still want to do delta compression of
them against preferred bases, but we don't have any preferred bases in
our list.

So I think we still want to add back in some list of preferred base
objects when bitmaps are in use. The question is, which objects?
With bitmaps, it's easy to add in all of the objects the client has, but
that is probably going to be counterproductive (I sti

Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps

2014-03-26 Thread Junio C Hamano
Jeff King  writes:

>   2. When considering whether a delta can be reused, check the bitmaps
>  to see if the client has the base. If so, allow reuse.
> ...
> The implementation I'm including here is the one I've shown before,
> which does (2). Part of the reason that I'm reposting it before looking
> further into these options is that I've written a t/perf script that can
> help with the analysis.

Conceptually, that feels like a natural extension for the "thin
pack" transfer.  I wouldn't foresee a correctness issue, as long as
the fetcher runs "index-pack --fix-thin" at their end.


[PATCH/RFC 0/6] reuse deltas found by bitmaps

2014-03-26 Thread Jeff King
[tl;dr the patch is the same as before, but there is a script to measure
   its effects; please try it out on your repos]

This is a continuation of the discussion here:

  http://thread.gmane.org/gmane.comp.version-control.git/239647

I'll summarize the story so far.

Basically, the problem is that while pack bitmaps make the "Counting
objects" phase of a fetch fast (which in turn makes clones very fast),
they do not help as much for an incremental fetch. One, because we have
a lot fewer objects to count, so there is less for us to speed up there.
And two, we spend a lot more effort on delta compression for a fetch,
because we are often throwing out on-disk deltas that we are not sure
that the other side has.

One common case is that the server side is mostly packed and has
bitmaps, and the fetching client has some subset of its objects, but
wants to bring itself up to the tip. Any delta that the server has on
disk is generally eligible for reuse, because the base is either:

  1. An object that the client already has.

  2. An object that we are going to send as part of the new pack.

However, we quite often do not notice case (1), because we only consider
a subset of the client's objects as "preferred bases" for thin-packing.
The reason for this is that traditionally it was expensive to enumerate
all of the objects we know the client has. But with bitmaps, this is
very cheap. So the basic idea is to better notice which objects the
client has and change our delta reuse and preferred-base rules.

There are three basic strategies I can think of for doing this:

  1. Add every object the client has to the packing list as a preferred
 base.

  2. When considering whether a delta can be reused, check the bitmaps
 to see if the client has the base. If so, allow reuse.

  3. Do (2), but also add the object as a preferred base.

I think that option (1) is probably a bad idea; we'll spend a lot of
time generating a large packing list, and the huge number of objects
will probably clog the delta window.

Option (3) might or might not be a good idea. It would hopefully give us
some relevant subset of the objects to use as preferred bases, in case
other objects also need deltas.

The implementation I'm including here is the one I've shown before,
which does (2). Part of the reason that I'm reposting it before looking
further into these options is that I've written a t/perf script that can
help with the analysis.

I've run it against git.git and linux.git. I'd be curious to see the
results on other repositories that might have different delta patterns.

The script simulates a fetch from a fully-packed server by a client who
has not fetched in N days, for several values of N. It measures the time
taken on the server (to run pack-objects), the time taken on the client
(to run index-pack), and the size of the resulting pack. Here are the
results for running it on linux.git (note that the script output has the
tests interleaved, but I've rearranged it here for clarity):

Test origin HEAD   
---
5311.4: size (1 days)   1.0M  59.5K -94.1% 
5311.8: size (2 days)   1.0M  59.5K -94.1% 
5311.12: size (4 days)  1.1M  67.9K -93.5% 
5311.16: size (8 days)  1.5M 130.1K -91.4% 
5311.20: size(16 days)  3.7M 359.8K -90.3% 
5311.24: size(32 days)  8.6M   1.4M -84.3% 
5311.28: size(64 days) 68.3M  23.0M -66.3% 
5311.32: size   (128 days) 83.1M  35.1M -57.7% 

You can see that it produces much smaller packs, because we're getting
better delta reuse (and bitmaps don't generally produce preferred bases
at all, so we were probably failing to make thin packs at all). The
percentage benefit lessens as we go further back, of course, because the
thin objects will be at the "edge" of the pack (and the bigger the pack,
the less of it is edge).

So far so good...here are the server timings:

Test origin HEAD   
---
5311.3: server   (1 days)0.29(0.33+0.03)0.14(0.11+0.02) -51.7% 
5311.7: server   (2 days)0.29(0.33+0.04)0.14(0.11+0.02) -51.7% 
5311.11: server   (4 days)   0.29(0.32+0.04)0.14(0.11+0.02) -51.7% 
5311.15: server   (8 days)   0.36(0.45+0.04)0.14(0.11+0.02) -61.1% 
5311.19: server  (16 days)   0.74(1.17+0.05)0.26(0.23+0.02) -64.9% 
5311.23: server  (32 days)   1.38(2.53+0.06)0.29(0.25+0.03) -79.0% 
5311.27: server  (64 days)   7.12(15.70+0.18)   0.43(0.38+0.07) -94.0% 
5311.31: server (128 days)   7.60(16.72+0.19)   0.52(0.48+0.07) -93.2% 

Again, looking good. We'r