Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
Jeff King writes:

> But for a small fetch...
>
>   5311.3: server (1 days)    0.20(0.17+0.03)   4.39(4.03+6.59)   +2095.0%
>   5311.4: size   (1 days)    57.2K             59.5K             +4.1%
>   5311.5: client (1 days)    0.08(0.08+0.00)   0.08(0.08+0.00)   +0.0%

Nice ;-)

> So this is a dead end, but I think it was good to double-check that.

Thanks.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
On 03/26/2014 03:40 PM, Siddharth Agarwal wrote:
> On 03/26/2014 12:22 AM, Jeff King wrote:
>> [tl;dr the patch is the same as before, but there is a script to
>> measure its effects; please try it out on your repos]

Here are the numbers from another, much larger repo:

Test                           origin                    HEAD
-----------------------------------------------------------------------------------
5311.3:  server   (1 days)     11.72(14.02+1.44)         6.33(5.87+0.75)           -46.0%
5311.4:  size     (1 days)     19.4M                     15.3M                     -21.3%
5311.5:  client   (1 days)     6.99(8.06+1.50)           10.60(10.34+1.83)         +51.6%
5311.7:  server   (2 days)     39.82(40.56+3.05)         33.94(31.21+3.10)         -14.8%
5311.8:  size     (2 days)     26.5M                     22.8M                     -14.1%
5311.9:  client   (2 days)     15.81(16.48+4.29)         19.20(18.20+4.19)         +21.4%
5311.11: server   (4 days)     37.21(39.75+3.73)         28.01(26.97+1.75)         -24.7%
5311.12: size     (4 days)     37.5M                     34.1M                     -9.0%
5311.13: client   (4 days)     24.24(26.43+6.76)         31.14(30.75+5.96)         +28.5%
5311.15: server   (8 days)     33.74(40.47+3.39)         22.42(22.25+1.51)         -33.6%
5311.16: size     (8 days)     81.9M                     78.4M                     -4.2%
5311.17: client   (8 days)     81.96(91.07+22.35)        88.03(89.45+17.11)        +7.4%
5311.19: server  (16 days)     30.87(34.57+2.78)         27.03(25.93+2.73)         -12.4%
5311.20: size    (16 days)     153.2M                    150.9M                    -1.5%
5311.21: client  (16 days)     169.99(183.57+46.96)      177.12(177.17+37.93)      +4.2%
5311.23: server  (32 days)     51.00(75.49+4.62)         19.52(19.28+1.93)         -61.7%
5311.24: size    (32 days)     279.4M                    283.0M                    +1.3%
5311.25: client  (32 days)     272.43(296.17+76.48)      284.75(285.98+63.19)      +4.5%
5311.27: server  (64 days)     51.73(97.88+6.40)         37.32(32.63+5.05)         -27.9%
5311.28: size    (64 days)     1.7G                      1.8G                      +5.0%
5311.29: client  (64 days)     2600.42(2751.56+718.10)   2429.06(2501.29+651.56)   -6.6%
5311.31: server (128 days)     51.33(95.33+6.91)         37.73(32.98+5.09)         -26.5%
5311.32: size   (128 days)     1.7G                      1.8G                      +5.0%
5311.33: client (128 days)     2595.68(2739.45+729.07)   2386.99(2524.54+583.07)   -8.0%
Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
On Wed, Mar 26, 2014 at 02:13:00PM -0400, Jeff King wrote:

> So I think the next steps are probably:
>
>   1. Measure the "all objects are preferred bases" approach and confirm
>      that it is bad.

Below is a very rough patch to accomplish this. It just traverses the
"have" bitmap and adds every object with the "exclude" flag.

The result is as comically bad as I expected. For a big fetch, it seems
like it's working (numbers against v1.9.0):

  5311.31: server (128 days)   4.49(7.35+0.23)   4.98(6.82+3.31)   +10.9%
  5311.32: size   (128 days)   25.8M             32.0M             +24.2%
  5311.33: client (128 days)   7.17(7.38+0.20)   7.33(7.90+0.20)   +2.2%

A modest increase in CPU time, and we get back most of our size
(remember that our "bad" case here is ~80M).

But for a small fetch...

  5311.3: server (1 days)      0.20(0.17+0.03)   4.39(4.03+6.59)   +2095.0%
  5311.4: size   (1 days)      57.2K             59.5K             +4.1%
  5311.5: client (1 days)      0.08(0.08+0.00)   0.08(0.08+0.00)   +0.0%

Yikes. Besides spending lots of CPU on handling the enlarged object
list, notice that we still only dropped the size in the 128-day case to
32M, which is almost exactly what the earlier "reuse on-disk deltas"
patch achieved.

What I think is happening is that we manage to reuse those on-disk
deltas (since they are now preferred bases, and we know we can). But we
never actually come up with any _new_ deltas, because the search window
is overwhelmed with candidates. So not only do we waste a huge amount of
CPU, but we end up at the same not-quite-optimal result as before.

So this is a dead end, but I think it was good to double-check that.

The patch below is messy and would probably be better split into a few
patches, but I don't expect anyone to apply it (or even read it,
really). It's just for reference.
---
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 0ee5f1f..1a5d401 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1026,7 +1026,7 @@ static int add_object_entry_from_bitmap(const unsigned char *sha1,
 	if (have_duplicate_entry(sha1, 0, &index_pos))
 		return 0;
 
-	create_object_entry(sha1, type, name_hash, 0, 0, index_pos, pack, offset);
+	create_object_entry(sha1, type, name_hash, flags, 0, index_pos, pack, offset);
 
 	display_progress(progress_state, to_pack.nr_objects);
 	return 1;
@@ -2436,6 +2436,7 @@ static int get_object_list_from_bitmap(struct rev_info *revs)
 	}
 
 	traverse_bitmap_commit_list(&add_object_entry_from_bitmap);
+	bitmap_have_foreach(&add_object_entry_from_bitmap);
 	return 0;
 }
 
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 1bae7e8..f4e30f5 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -605,6 +605,7 @@ static void show_objects_for_type(
 	struct bitmap *objects,
 	struct ewah_bitmap *type_filter,
 	enum object_type object_type,
+	int flags,
 	show_reachable_fn show_reach)
 {
 	size_t pos = 0, i = 0;
@@ -613,9 +614,6 @@ static void show_objects_for_type(
 	struct ewah_iterator it;
 	eword_t filter;
 
-	if (bitmap_git.reuse_objects == bitmap_git.pack->num_objects)
-		return;
-
 	ewah_iterator_init(&it, type_filter);
 
 	while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) {
@@ -640,7 +638,7 @@ static void show_objects_for_type(
 			if (bitmap_git.hashes)
 				hash = ntohl(bitmap_git.hashes[entry->nr]);
 
-			show_reach(sha1, object_type, 0, hash, bitmap_git.pack, entry->offset);
+			show_reach(sha1, object_type, flags, hash, bitmap_git.pack, entry->offset);
 		}
 
 		pos += BITS_IN_WORD;
@@ -816,14 +814,17 @@ void traverse_bitmap_commit_list(show_reachable_fn show_reachable)
 {
 	assert(bitmap_git.result);
 
+	if (bitmap_git.reuse_objects == bitmap_git.pack->num_objects)
+		return;
+
 	show_objects_for_type(bitmap_git.result, bitmap_git.commits,
-			      OBJ_COMMIT, show_reachable);
+			      OBJ_COMMIT, 0, show_reachable);
 	show_objects_for_type(bitmap_git.result, bitmap_git.trees,
-			      OBJ_TREE, show_reachable);
+			      OBJ_TREE, 0, show_reachable);
 	show_objects_for_type(bitmap_git.result, bitmap_git.blobs,
-			      OBJ_BLOB, show_reachable);
+			      OBJ_BLOB, 0, show_reachable);
 	show_objects_for_type(bitmap_git.result, bitmap_git.tags,
-			      OBJ_TAG, show_reachable);
+			      OBJ_TAG, 0, show_reachable);
 
 	show_extended_objects(bitmap_git.result, show_reachable);
 
@@ -1090,3 +1091,18 @@ int bitmap_have(const unsigned char *sha1)
 
 	return bitmap_get(bitmap_git.haves, pos);
 }
+
+void bitmap_have_foreach(show_reachable_fn show_reachable)
+{
+	if (!bitmap_git.haves)
+		return;
+
+	show_object
Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
On 03/26/2014 12:22 AM, Jeff King wrote:
> [tl;dr the patch is the same as before, but there is a script to
> measure its effects; please try it out on your repos]

Here are results from one of our repos:

Test                           origin               HEAD
----------------------------------------------------------------------
5311.3:  server   (1 days)     1.77(2.49+0.15)      0.76(0.65+0.12)     -57.1%
5311.4:  size     (1 days)     7.8M                 3.1M                -60.2%
5311.5:  client   (1 days)     0.56(0.48+0.03)      1.18(0.97+0.05)     +110.7%
5311.7:  server   (2 days)     2.79(3.68+0.25)      0.77(0.65+0.14)     -72.4%
5311.8:  size     (2 days)     24.2M                6.8M                -71.8%
5311.9:  client   (2 days)     1.14(0.95+0.10)      2.72(2.33+0.13)     +138.6%
5311.11: server   (4 days)     3.70(4.76+0.28)      0.84(0.72+0.14)     -77.3%
5311.12: size     (4 days)     51.2M                18.9M               -63.0%
5311.13: client   (4 days)     2.29(2.02+0.29)      4.58(3.91+0.30)     +100.0%
5311.15: server   (8 days)     5.16(7.48+0.38)      1.20(1.18+0.21)     -76.7%
5311.16: size     (8 days)     78.3M                42.6M               -45.5%
5311.17: client   (8 days)     3.73(3.67+0.40)      6.59(5.95+0.51)     +76.7%
5311.19: server  (16 days)     6.48(10.27+0.52)     1.60(1.85+0.27)     -75.3%
5311.20: size    (16 days)     97.5M                64.0M               -34.4%
5311.21: client  (16 days)     5.31(5.76+0.57)      8.99(8.61+0.68)     +69.3%
5311.23: server  (32 days)     8.56(14.00+0.67)     2.31(2.91+0.37)     -73.0%
5311.24: size    (32 days)     124.7M               92.0M               -26.2%
5311.25: client  (32 days)     7.69(9.20+0.76)      12.80(13.07+0.91)   +66.4%
5311.27: server  (64 days)     37.47(45.48+1.55)    29.75(29.58+0.96)   -20.6%
5311.28: size    (64 days)     245.5M               207.1M              -15.6%
5311.29: client  (64 days)     15.11(18.26+1.58)    25.02(25.90+2.03)   +65.6%
5311.31: server (128 days)     43.85(57.02+2.57)    29.88(29.54+1.35)   -31.9%
5311.32: size   (128 days)     507.3M               461.6M              -9.0%
5311.33: client (128 days)     27.13(31.54+3.32)    37.76(39.49+3.16)   +39.2%

I'm currently running tests on a much larger repo than that. The full
perf run will take several hours on that, so the soonest I can likely
publish results for that is tomorrow.
Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
On Wed, Mar 26, 2014 at 03:31:41PM -0700, Junio C Hamano wrote:

> > I think we could still add the objects from the tip of the client's
> > HAVE list.
>
> That should make the result at least on par with the non-bitmap case,
> right?

That's my expectation.

> > 2. Measure the "reused deltas become preferred bases" approach. I
> >    expect the resulting size to be slightly better than what I have
> >    now, but not as good as v1.9.0's size (but taking less CPU time).
>
> Do you mean "the bases for reused deltas become preferred bases, so
> that we can deltify more objects off of them"?

Exactly.

I'll update more when I have results for some of these. :)

-Peff
Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
Jeff King writes:

> Just looking at the 128-day case again, using bitmaps increased our
> server CPU time _and_ made a much bigger pack. This series not only
> fixes the CPU time regression, but it also drops the server CPU time
> to almost nothing. That's a nice improvement, and it makes perfect
> sense: we are reusing on-disk deltas instead of computing deltas on
> the fly against the preferred bases.

True.

> I think we could still add the objects from the tip of the client's
> HAVE list.

That should make the result at least on par with the non-bitmap case,
right?

> This patch would still be a CPU win on top of that, because it
> would reduce the number of objects which need a delta search in the
> first place.

Yes.

> So I think the next steps are probably:
>
>   1. Measure the "all objects are preferred bases" approach and confirm
>      that it is bad.

;-)

>   2. Measure the "reused deltas become preferred bases" approach. I
>      expect the resulting size to be slightly better than what I have
>      now, but not as good as v1.9.0's size (but taking less CPU time).

Do you mean "the bases for reused deltas become preferred bases, so
that we can deltify more objects off of them"?

>   3. Measure the "figure out boundaries and add them as preferred
>      bases, like we do without bitmaps" approach. I expect this to
>      behave similarly to v1.9.0.

Yes.

>   4. Combine (2) and (3) and measure them. I'm _hoping_ this will give
>      us the best of both worlds, but I still want to do the individual
>      measurements so we can see where any improvement is coming from.

Sensible.

Thanks.
Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
On Wed, Mar 26, 2014 at 10:33:41AM -0700, Junio C Hamano wrote:

> Jeff King writes:
>
> >   2. When considering whether a delta can be reused, check the
> >      bitmaps to see if the client has the base. If so, allow reuse.
> > ...
> > The implementation I'm including here is the one I've shown before,
> > which does (2). Part of the reason that I'm reposting it before
> > looking further into these options is that I've written a t/perf
> > script that can help with the analysis.
>
> Conceptually, that feels like a natural extension for the "thin
> pack" transfer. I wouldn't foresee a correctness issue, as long as
> the fetcher runs "index-pack --fix-thin" at their end.

Right, this is only enabled when --thin is negotiated during the
protocol transfer. I don't think there's any correctness problem here.
But I want to make sure we are doing the fastest thing.

Here's another set of results comparing three runs: v1.9.0 (with no
bitmaps), origin/master (bitmaps), and this series. Apologies for the
long lines.
Test                          v1.9.0             origin                      HEAD
--------------------------------------------------------------------------------------------------------
5311.3:  server   (1 days)    0.20(0.17+0.02)    0.29(0.34+0.04)   +45.0%    0.14(0.12+0.02)   -30.0%
5311.4:  size     (1 days)    56.2K              1.0M              +1694.4%  59.5K             +5.9%
5311.5:  client   (1 days)    0.08(0.07+0.00)    0.03(0.04+0.00)   -62.5%    0.08(0.07+0.00)   +0.0%
5311.7:  server   (2 days)    0.06(0.06+0.00)    0.28(0.33+0.04)   +366.7%   0.14(0.11+0.02)   +133.3%
5311.8:  size     (2 days)    56.2K              1.0M              +1694.4%  59.5K             +5.9%
5311.9:  client   (2 days)    0.09(0.08+0.00)    0.03(0.04+0.00)   -66.7%    0.09(0.08+0.00)   +0.0%
5311.11: server   (4 days)    0.21(0.18+0.02)    0.29(0.33+0.03)   +38.1%    0.14(0.11+0.02)   -33.3%
5311.12: size     (4 days)    64.2K              1.1M              +1536.6%  67.9K             +5.8%
5311.13: client   (4 days)    0.08(0.08+0.00)    0.04(0.03+0.01)   -50.0%    0.09(0.07+0.01)   +12.5%
5311.15: server   (8 days)    0.22(0.21+0.02)    0.36(0.47+0.03)   +63.6%    0.14(0.11+0.02)   -36.4%
5311.16: size     (8 days)    125.7K             1.5M              +1100.8%  130.1K            +3.5%
5311.17: client   (8 days)    0.11(0.10+0.00)    0.05(0.05+0.00)   -54.5%    0.13(0.11+0.01)   +18.2%
5311.19: server  (16 days)    0.26(0.26+0.03)    0.76(1.20+0.04)   +192.3%   0.25(0.21+0.04)   -3.8%
5311.20: size    (16 days)    358.6K             3.7M              +931.7%   359.8K            +0.3%
5311.21: client  (16 days)    0.28(0.27+0.01)    0.13(0.15+0.00)   -53.6%    0.30(0.28+0.02)   +7.1%
5311.23: server  (32 days)    0.44(0.54+0.07)    1.39(2.54+0.04)   +215.9%   0.28(0.23+0.04)   -36.4%
5311.24: size    (32 days)    1.1M               8.6M              +652.4%   2.1M              +83.9%
5311.25: client  (32 days)    0.66(0.64+0.02)    0.32(0.38+0.01)   -51.5%    0.69(0.67+0.04)   +4.5%
5311.27: server  (64 days)    2.78(4.02+0.17)    7.02(15.59+0.16)  +152.5%   0.43(0.37+0.07)   -84.5%
5311.28: size    (64 days)    18.7M              68.3M             +265.5%   24.5M             +31.3%
5311.29: client  (64 days)    6.25(6.29+0.16)    3.21(4.76+0.14)   -48.6%    6.27(6.46+0.18)   +0.3%
5311.31: server (128 days)    4.48(7.32+0.21)    7.56(16.60+0.16)  +68.8%    0.51(0.45+0.10)   -88.6%
5311.32: size   (128 days)    25.8M              83.1M             +222.3%   35.9M             +39.2%
5311.33: client (128 days)    7.32(7.58+0.17)    3.94(5.87+0.18)   -46.2%    7.15(7.80+0.20)   -2.3%

My previous results showed that this series was an improvement over
straight bitmaps.
But what it didn't show is that bitmaps actually introduce a regression
for some fetches (I think because we do not find the boundary commits at
all, and so do not have any preferred bases).

Just looking at the 128-day case again, using bitmaps increased our
server CPU time _and_ made a much bigger pack. This series not only
fixes the CPU time regression, but it also drops the server CPU time to
almost nothing. That's a nice improvement, and it makes perfect sense:
we are reusing on-disk deltas instead of computing deltas on the fly
against the preferred bases.

And it does help with the size regression, though the end result is
still a bit worse than v1.9.0. I'm not exactly sure what's going on
here. My guess is that we have objects to send that are not deltas on
disk (probably because they are at the tip of history and other things
are deltas against them). We would still want to do delta compression of
them against preferred bases, but we don't have any preferred bases in
our list.

So I think we still want to add back in some list of preferred base
objects when bitmaps are in use. The question is, which objects? With
bitmaps, it's easy to add in all of the objects the client has, but that
is probably going to be counterproductive (I sti
Re: [PATCH/RFC 0/6] reuse deltas found by bitmaps
Jeff King writes:

>   2. When considering whether a delta can be reused, check the bitmaps
>      to see if the client has the base. If so, allow reuse.
> ...
> The implementation I'm including here is the one I've shown before,
> which does (2). Part of the reason that I'm reposting it before looking
> further into these options is that I've written a t/perf script that
> can help with the analysis.

Conceptually, that feels like a natural extension for the "thin pack"
transfer. I wouldn't foresee a correctness issue, as long as the
fetcher runs "index-pack --fix-thin" at their end.
[PATCH/RFC 0/6] reuse deltas found by bitmaps
[tl;dr the patch is the same as before, but there is a script to measure
its effects; please try it out on your repos]

This is a continuation of the discussion here:

  http://thread.gmane.org/gmane.comp.version-control.git/239647

I'll summarize the story so far. Basically, the problem is that while
pack bitmaps make the "Counting objects" phase of a fetch fast (which in
turn makes clones very fast), they do not help as much for an
incremental fetch. One, because we have a lot fewer objects to count, so
there is less for us to speed up there. And two, we spend a lot more
effort on delta compression for a fetch, because we are often throwing
out on-disk deltas that we are not sure the other side has.

One common case is that the server side is mostly packed and has
bitmaps, and the fetching client has some subset of its objects, but
wants to bring itself up to the tip. Any delta that the server has on
disk is generally eligible for reuse, because the base is either:

  1. An object that the client already has.

  2. An object that we are going to send as part of the new pack.

However, we quite often do not notice case (1), because we only consider
a subset of the client's objects as "preferred bases" for thin-packing.
The reason for this is that traditionally it was expensive to enumerate
all of the objects we know the client has. But with bitmaps, this is
very cheap.

So the basic idea is to better notice which objects the client has and
change our delta-reuse and preferred-base rules. There are three basic
strategies I can think of for doing this:

  1. Add every object the client has to the packing list as a preferred
     base.

  2. When considering whether a delta can be reused, check the bitmaps
     to see if the client has the base. If so, allow reuse.

  3. Do (2), but also add the object as a preferred base.

I think that option (1) is probably a bad idea; we'll spend a lot of
time generating a large packing list, and the huge number of objects
will probably clog the delta window.
Option (3) might or might not be a good idea. It would hopefully give us
some relevant subset of the objects to use as preferred bases, in case
other objects also need deltas.

The implementation I'm including here is the one I've shown before,
which does (2). Part of the reason that I'm reposting it before looking
further into these options is that I've written a t/perf script that can
help with the analysis. I've run it against git.git and linux.git; I'd
be curious to see the results on other repositories that might have
different delta patterns.

The script simulates a fetch from a fully-packed server by a client who
has not fetched in N days, for several values of N. It measures the time
taken on the server (to run pack-objects), the time taken on the client
(to run index-pack), and the size of the resulting pack.

Here are the results for running it on linux.git (note that the script
output has the tests interleaved, but I've rearranged it here for
clarity):

Test                        origin    HEAD
-------------------------------------------------
5311.4:  size   (1 days)    1.0M      59.5K    -94.1%
5311.8:  size   (2 days)    1.0M      59.5K    -94.1%
5311.12: size   (4 days)    1.1M      67.9K    -93.5%
5311.16: size   (8 days)    1.5M      130.1K   -91.4%
5311.20: size  (16 days)    3.7M      359.8K   -90.3%
5311.24: size  (32 days)    8.6M      1.4M     -84.3%
5311.28: size  (64 days)    68.3M     23.0M    -66.3%
5311.32: size (128 days)    83.1M     35.1M    -57.7%

You can see that it produces much smaller packs, because we're getting
better delta reuse (and bitmaps don't generally produce preferred bases
at all, so we were probably failing to make thin packs at all). The
percentage benefit lessens as we go further back, of course, because the
thin objects will be at the "edge" of the pack (and the bigger the pack,
the less of it is edge).
So far so good... here are the server timings:

Test                        origin             HEAD
--------------------------------------------------------------------
5311.3:  server   (1 days)  0.29(0.33+0.03)    0.14(0.11+0.02)   -51.7%
5311.7:  server   (2 days)  0.29(0.33+0.04)    0.14(0.11+0.02)   -51.7%
5311.11: server   (4 days)  0.29(0.32+0.04)    0.14(0.11+0.02)   -51.7%
5311.15: server   (8 days)  0.36(0.45+0.04)    0.14(0.11+0.02)   -61.1%
5311.19: server  (16 days)  0.74(1.17+0.05)    0.26(0.23+0.02)   -64.9%
5311.23: server  (32 days)  1.38(2.53+0.06)    0.29(0.25+0.03)   -79.0%
5311.27: server  (64 days)  7.12(15.70+0.18)   0.43(0.38+0.07)   -94.0%
5311.31: server (128 days)  7.60(16.72+0.19)   0.52(0.48+0.07)   -93.2%

Again, looking good. We'r