Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists
Vicent Martí tan...@gmail.com writes: I'm afraid I cannot reproduce the segfault locally (assuming you're performing the rev-list on the git/git repository). Could you please send me more information, and a core dump if possible? Sure, but isn't the core dump useless if you don't have the same executable? And since I'm building custom git, you won't have that. Here's a semi-full backtrace (I left out the spammy output in the outermost frames). Some variables in #2 and #3 seem to have gone off the rails. #0 0x772b06fb in __memset_sse2 () from /lib64/libc.so.6 No symbol table info available. #1 0x0054c31c in bitmap_set (self=0x89c360, pos=18446744072278122040) at ewah/bitmap.c:46 old_size = 7666 block = 288230376129345656 #2 0x004e6c70 in add_to_include_set (data=0x7fffcd00, sha1=0x85c014 \230\062˝M\311i\373\372\317\321\370\224\017\313\336\301\213\271\060, bitmap_pos=-1431429576) at pack-bitmap.c:428 hash_pos = 512 #3 0x004e6cd6 in should_include (commit=0x85c010, _data=0x7fffcd00) at pack-bitmap.c:443 data = 0x7fffcd00 bitmap_pos = -1431429576 #4 0x0050cf1d in add_parents_to_list (revs=0x7fffce30, commit=0x85c010, list=0x7fffce30, cache_ptr=0x0) at revision.c:784 parent = 0x88c260 left_flag = 32767 cached_base = 0x0 #5 0x00512b66 in get_revision_1 (revs=0x7fffce30) at revision.c:2857 entry = 0x8f9ce0 commit = 0x85c010 #6 0x00512dcf in get_revision_internal (revs=0x7fffce30) at revision.c:2964 c = 0x0 l = 0x1000 #7 0x00512fe1 in get_revision (revs=0x7fffce30) at revision.c:3040 c = 0xb92608 reversed = 0x89c360 #8 0x004d2a24 in traverse_commit_list (revs=0x7fffce30, show_commit=0x4e6b72 show_commit, show_object=0x4e6afa show_object, data=0x89c360) at list-objects.c:179 i = -1 commit = 0xb92608 base = { alloc = 4097, len = 0, buf = 0x87bbe0 } #9 0x004e6fa4 in find_objects (revs=0x7fffce30, roots=0x0, seen=0x85b760) at pack-bitmap.c:549 incdata = { base = 0x89c360, seen = 0x85b760 } base = 0x89c360 needs_walk = true not_mapped = 0x8f9dc0 #10 0x004e747b in prepare_bitmap_walk (revs=0x7fffce30, result_size=0x0) at pack-bitmap.c:679 i = 2 pending_nr = 2 pending_alloc = 64 pending_e = 0x853e10 wants = 0x8545b0 haves = 0x854820 wants_bitmap = 0x0 haves_bitmap = 0x85b760 #11 0x00474bb3 in cmd_rev_list (argc=2, argv=0x7fffd6e8, prefix=0x0) at builtin/rev-list.c:356 #12 0x00405820 in run_builtin (p=0x7c3ef8 commands.20770+2040, argc=4, argv=0x7fffd6e8) at git.c:291 #13 0x004059b3 in handle_internal_command (argc=4, argv=0x7fffd6e8) at git.c:454 #14 0x00405b87 in main (argc=4, av=0x7fffd6e8) at git.c:544 This is with a version of your series that you can find at https://github.com/trast/git.git vm/ewah I am'd your patches on top of Junio's master at the time, except for the parts to the Makefile that did not apply, which I fixed up manually. -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists
Vicent Marti tan...@gmail.com writes: Calling `git rev-list --use-bitmaps [committish]` is the equivalent of `git rev-list --objects`, but the rev list is performed based on a bitmap result instead of using a manual counting objects phase. Why would we ever want to not --use-bitmaps, once it actually works? I.e., shouldn't this be the default if pack.usebitmaps is set (or possibly even core.usebitmaps for these things)? These are some example timings for `torvalds/linux`: $ time ../git/git rev-list --objects master /dev/null real0m25.567s user0m25.148s sys 0m0.384s $ time ../git/git rev-list --use-bitmaps master /dev/null real0m0.393s user0m0.356s sys 0m0.036s I see your badass numbers, and raise you a critical issue: $ time git rev-list --use-bitmaps --count --left-right origin/pu...origin/next Segmentation fault real0m0.408s user0m0.383s sys 0m0.022s It actually seems to be related solely to having negated commits in the walk: thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count origin/pu 32315 real0m0.041s user0m0.034s sys 0m0.006s thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count origin/pu ^origin/next Segmentation fault real0m0.460s user0m0.214s sys 0m0.244s I also can't help noticing that the time spent generating the segfault would have sufficed to generate the answer the old way as well: $ time git rev-list --count --left-right origin/pu...origin/next 189 125 real0m0.409s user0m0.386s sys 0m0.022s Can we use the same trick to speed up merge base computation and then --left-right? The latter is a component of __git_ps1 and can get somewhat slow in some cases, so it would be nice to make it really fast, too. -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists
I'm afraid I cannot reproduce the segfault locally (assuming you're performing the rev-list on the git/git repository). Could you please send me more information, and a core dump if possible? On Tue, Jun 25, 2013 at 6:22 PM, Thomas Rast tr...@inf.ethz.ch wrote: Vicent Marti tan...@gmail.com writes: Calling `git rev-list --use-bitmaps [committish]` is the equivalent of `git rev-list --objects`, but the rev list is performed based on a bitmap result instead of using a manual counting objects phase. Why would we ever want to not --use-bitmaps, once it actually works? I.e., shouldn't this be the default if pack.usebitmaps is set (or possibly even core.usebitmaps for these things)? These are some example timings for `torvalds/linux`: $ time ../git/git rev-list --objects master /dev/null real0m25.567s user0m25.148s sys 0m0.384s $ time ../git/git rev-list --use-bitmaps master /dev/null real0m0.393s user0m0.356s sys 0m0.036s I see your badass numbers, and raise you a critical issue: $ time git rev-list --use-bitmaps --count --left-right origin/pu...origin/next Segmentation fault real0m0.408s user0m0.383s sys 0m0.022s It actually seems to be related solely to having negated commits in the walk: thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count origin/pu 32315 real0m0.041s user0m0.034s sys 0m0.006s thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count origin/pu ^origin/next Segmentation fault real0m0.460s user0m0.214s sys 0m0.244s I also can't help noticing that the time spent generating the segfault would have sufficed to generate the answer the old way as well: $ time git rev-list --count --left-right origin/pu...origin/next 189 125 real0m0.409s user0m0.386s sys 0m0.022s Can we use the same trick to speed up merge base computation and then --left-right? The latter is a component of __git_ps1 and can get somewhat slow in some cases, so it would be nice to make it really fast, too. -- Thomas Rast trast@{inf,student}.ethz.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists
On Tue, Jun 25, 2013 at 09:22:28AM -0700, Thomas Rast wrote: Vicent Marti tan...@gmail.com writes: Calling `git rev-list --use-bitmaps [committish]` is the equivalent of `git rev-list --objects`, but the rev list is performed based on a bitmap result instead of using a manual counting objects phase. Why would we ever want to not --use-bitmaps, once it actually works? I.e., shouldn't this be the default if pack.usebitmaps is set (or possibly even core.usebitmaps for these things)? If you are using bitmaps, you cannot produce the same output as --objects; the latter prints the path at which each object is found. In the JGit bitmap format, we have no information at all; in Vicent's v2, we have only a hash of that pathname. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 11/16] rev-list: add bitmap mode to speed up lists
The bitmap reachability index used to speed up the counting objects phase during `pack-objects` can also be used to optimize a normal rev-list if the only thing required are the SHA1s of the objects during the list. Calling `git rev-list --use-bitmaps [committish]` is the equivalent of `git rev-list --objects`, but the rev list is performed based on a bitmap result instead of using a manual counting objects phase. These are some example timings for `torvalds/linux`: $ time ../git/git rev-list --objects master /dev/null real0m25.567s user0m25.148s sys 0m0.384s $ time ../git/git rev-list --use-bitmaps master /dev/null real0m0.393s user0m0.356s sys 0m0.036s Additionally, a `--test-bitmap` flag has been added that will perform the same rev-list manually (i.e. using a normal revwalk) and using bitmaps, and verify that the results are the same. --- builtin/rev-list.c | 28 +++- 1 file changed, 27 insertions(+), 1 deletion(-) diff --git a/builtin/rev-list.c b/builtin/rev-list.c index 67701be..905ed08 100644 --- a/builtin/rev-list.c +++ b/builtin/rev-list.c @@ -3,6 +3,8 @@ #include diff.h #include revision.h #include list-objects.h +#include pack.h +#include pack-bitmap.h #include builtin.h #include log-tree.h #include graph.h @@ -256,6 +258,17 @@ static int show_bisect_vars(struct rev_list_info *info, int reaches, int all) return 0; } +static int show_object_fast( + const unsigned char *sha1, + enum object_type type, + uint32_t hash, int exclude, + struct packed_git *found_pack, + off_t found_offset) +{ + fprintf(stdout, %ss\n, sha1_to_hex(sha1)); + return 1; +} + int cmd_rev_list(int argc, const char **argv, const char *prefix) { struct rev_info revs; @@ -264,6 +277,7 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) int bisect_list = 0; int bisect_show_vars = 0; int bisect_find_all = 0; + int use_bitmaps = 0; git_config(git_default_config, NULL); init_revisions(revs, prefix); @@ -305,8 +319,15 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) bisect_show_vars = 1; continue; } + if (!strcmp(arg, --use-bitmaps)) { + use_bitmaps = 1; + continue; + } + if (!strcmp(arg, --test-bitmap)) { + test_bitmap_walk(revs); + return 0; + } usage(rev_list_usage); - } if (revs.commit_format != CMIT_FMT_UNSPECIFIED) { /* The command line has a --pretty */ @@ -332,6 +353,11 @@ int cmd_rev_list(int argc, const char **argv, const char *prefix) if (bisect_list) revs.limited = 1; + if (use_bitmaps !prepare_bitmap_walk(revs, NULL)) { + traverse_bitmap_commit_list(show_object_fast); + return 0; + } + if (prepare_revision_walk(revs)) die(revision walk setup failed); if (revs.tree_objects) -- 1.7.9.5 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html