Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists

2013-06-26 Thread Thomas Rast
Vicent Martí tan...@gmail.com writes:

 I'm afraid I cannot reproduce the segfault locally (assuming you're
 performing the rev-list on the git/git repository). Could you please
 send me more information, and a core dump if possible?

Sure, but isn't the core dump useless if you don't have the same
executable?  And since I'm building custom git, you won't have that.

Here's a semi-full backtrace (I left out the spammy output in the
outermost frames).  Some variables in #2 and #3 seem to have gone off
the rails.

#0  0x772b06fb in __memset_sse2 () from /lib64/libc.so.6
No symbol table info available.
#1  0x0054c31c in bitmap_set (self=0x89c360, pos=18446744072278122040) 
at ewah/bitmap.c:46
old_size = 7666
block = 288230376129345656
#2  0x004e6c70 in add_to_include_set (data=0x7fffcd00, 
sha1=0x85c014 
\230\062˝M\311i\373\372\317\321\370\224\017\313\336\301\213\271\060, 
bitmap_pos=-1431429576) at pack-bitmap.c:428
hash_pos = 512
#3  0x004e6cd6 in should_include (commit=0x85c010, 
_data=0x7fffcd00) at pack-bitmap.c:443
data = 0x7fffcd00
bitmap_pos = -1431429576
#4  0x0050cf1d in add_parents_to_list (revs=0x7fffce30, 
commit=0x85c010, list=0x7fffce30, cache_ptr=0x0) at revision.c:784
parent = 0x88c260
left_flag = 32767
cached_base = 0x0
#5  0x00512b66 in get_revision_1 (revs=0x7fffce30) at 
revision.c:2857
entry = 0x8f9ce0
commit = 0x85c010
#6  0x00512dcf in get_revision_internal (revs=0x7fffce30) at 
revision.c:2964
c = 0x0
l = 0x1000
#7  0x00512fe1 in get_revision (revs=0x7fffce30) at revision.c:3040
c = 0xb92608
reversed = 0x89c360
#8  0x004d2a24 in traverse_commit_list (revs=0x7fffce30, 
show_commit=0x4e6b72 show_commit, show_object=0x4e6afa show_object, 
data=0x89c360) at list-objects.c:179
i = -1
commit = 0xb92608
base = {
  alloc = 4097, 
  len = 0, 
  buf = 0x87bbe0 
}
#9  0x004e6fa4 in find_objects (revs=0x7fffce30, roots=0x0, 
seen=0x85b760) at pack-bitmap.c:549
incdata = {
  base = 0x89c360, 
  seen = 0x85b760
}
base = 0x89c360
needs_walk = true
not_mapped = 0x8f9dc0
#10 0x004e747b in prepare_bitmap_walk (revs=0x7fffce30, 
result_size=0x0) at pack-bitmap.c:679
i = 2
pending_nr = 2
pending_alloc = 64
pending_e = 0x853e10
wants = 0x8545b0
haves = 0x854820
wants_bitmap = 0x0
haves_bitmap = 0x85b760
#11 0x00474bb3 in cmd_rev_list (argc=2, argv=0x7fffd6e8, 
prefix=0x0) at builtin/rev-list.c:356
#12 0x00405820 in run_builtin (p=0x7c3ef8 commands.20770+2040, 
argc=4, argv=0x7fffd6e8) at git.c:291
#13 0x004059b3 in handle_internal_command (argc=4, argv=0x7fffd6e8) 
at git.c:454
#14 0x00405b87 in main (argc=4, av=0x7fffd6e8) at git.c:544


This is with a version of your series that you can find at

  https://github.com/trast/git.git vm/ewah

I am'd your patches on top of Junio's master at the time, except for the
parts to the Makefile that did not apply, which I fixed up manually.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists

2013-06-25 Thread Thomas Rast
Vicent Marti tan...@gmail.com writes:

 Calling `git rev-list --use-bitmaps [committish]` is the equivalent
 of `git rev-list --objects`, but the rev list is performed based on
 a bitmap result instead of using a manual counting objects phase.

Why would we ever want to not --use-bitmaps, once it actually works?
I.e., shouldn't this be the default if pack.usebitmaps is set (or
possibly even core.usebitmaps for these things)?

 These are some example timings for `torvalds/linux`:

   $ time ../git/git rev-list --objects master  /dev/null

   real0m25.567s
   user0m25.148s
   sys 0m0.384s

   $ time ../git/git rev-list --use-bitmaps master  /dev/null

   real0m0.393s
   user0m0.356s
   sys 0m0.036s

I see your badass numbers, and raise you a critical issue:

  $ time git rev-list --use-bitmaps --count --left-right origin/pu...origin/next
  Segmentation fault

  real0m0.408s
  user0m0.383s
  sys 0m0.022s

It actually seems to be related solely to having negated commits in the
walk:

  thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count 
origin/pu
  32315

  real0m0.041s
  user0m0.034s
  sys 0m0.006s
  thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count 
origin/pu ^origin/next
  Segmentation fault

  real0m0.460s
  user0m0.214s
  sys 0m0.244s

I also can't help noticing that the time spent generating the segfault
would have sufficed to generate the answer the old way as well:

  $ time git rev-list --count --left-right origin/pu...origin/next
  189 125

  real0m0.409s
  user0m0.386s
  sys 0m0.022s

Can we use the same trick to speed up merge base computation and then
--left-right?  The latter is a component of __git_ps1 and can get
somewhat slow in some cases, so it would be nice to make it really fast,
too.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists

2013-06-25 Thread Vicent Martí
I'm afraid I cannot reproduce the segfault locally (assuming you're
performing the rev-list on the git/git repository). Could you please
send me more information, and a core dump if possible?

On Tue, Jun 25, 2013 at 6:22 PM, Thomas Rast tr...@inf.ethz.ch wrote:
 Vicent Marti tan...@gmail.com writes:

 Calling `git rev-list --use-bitmaps [committish]` is the equivalent
 of `git rev-list --objects`, but the rev list is performed based on
 a bitmap result instead of using a manual counting objects phase.

 Why would we ever want to not --use-bitmaps, once it actually works?
 I.e., shouldn't this be the default if pack.usebitmaps is set (or
 possibly even core.usebitmaps for these things)?

 These are some example timings for `torvalds/linux`:

   $ time ../git/git rev-list --objects master  /dev/null

   real0m25.567s
   user0m25.148s
   sys 0m0.384s

   $ time ../git/git rev-list --use-bitmaps master  /dev/null

   real0m0.393s
   user0m0.356s
   sys 0m0.036s

 I see your badass numbers, and raise you a critical issue:

   $ time git rev-list --use-bitmaps --count --left-right 
 origin/pu...origin/next
   Segmentation fault

   real0m0.408s
   user0m0.383s
   sys 0m0.022s

 It actually seems to be related solely to having negated commits in the
 walk:

   thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count 
 origin/pu
   32315

   real0m0.041s
   user0m0.034s
   sys 0m0.006s
   thomas@linux-k42r:~/g(next u+65)$ time git rev-list --use-bitmaps --count 
 origin/pu ^origin/next
   Segmentation fault

   real0m0.460s
   user0m0.214s
   sys 0m0.244s

 I also can't help noticing that the time spent generating the segfault
 would have sufficed to generate the answer the old way as well:

   $ time git rev-list --count --left-right origin/pu...origin/next
   189 125

   real0m0.409s
   user0m0.386s
   sys 0m0.022s

 Can we use the same trick to speed up merge base computation and then
 --left-right?  The latter is a component of __git_ps1 and can get
 somewhat slow in some cases, so it would be nice to make it really fast,
 too.

 --
 Thomas Rast
 trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 11/16] rev-list: add bitmap mode to speed up lists

2013-06-25 Thread Jeff King
On Tue, Jun 25, 2013 at 09:22:28AM -0700, Thomas Rast wrote:

 Vicent Marti tan...@gmail.com writes:
 
  Calling `git rev-list --use-bitmaps [committish]` is the equivalent
  of `git rev-list --objects`, but the rev list is performed based on
  a bitmap result instead of using a manual counting objects phase.
 
 Why would we ever want to not --use-bitmaps, once it actually works?
 I.e., shouldn't this be the default if pack.usebitmaps is set (or
 possibly even core.usebitmaps for these things)?

If you are using bitmaps, you cannot produce the same output as
--objects; the latter prints the path at which each object is found.
In the JGit bitmap format, we have no information at all; in Vicent's
v2, we have only a hash of that pathname.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 11/16] rev-list: add bitmap mode to speed up lists

2013-06-24 Thread Vicent Marti
The bitmap reachability index used to speed up the counting objects
phase during `pack-objects` can also be used to optimize a normal
rev-list if the only thing required are the SHA1s of the objects during
the list.

Calling `git rev-list --use-bitmaps [committish]` is the equivalent
of `git rev-list --objects`, but the rev list is performed based on
a bitmap result instead of using a manual counting objects phase.

These are some example timings for `torvalds/linux`:

$ time ../git/git rev-list --objects master  /dev/null

real0m25.567s
user0m25.148s
sys 0m0.384s

$ time ../git/git rev-list --use-bitmaps master  /dev/null

real0m0.393s
user0m0.356s
sys 0m0.036s

Additionally, a `--test-bitmap` flag has been added that will perform
the same rev-list manually (i.e. using a normal revwalk) and using
bitmaps, and verify that the results are the same.
---
 builtin/rev-list.c |   28 +++-
 1 file changed, 27 insertions(+), 1 deletion(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 67701be..905ed08 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -3,6 +3,8 @@
 #include diff.h
 #include revision.h
 #include list-objects.h
+#include pack.h
+#include pack-bitmap.h
 #include builtin.h
 #include log-tree.h
 #include graph.h
@@ -256,6 +258,17 @@ static int show_bisect_vars(struct rev_list_info *info, 
int reaches, int all)
return 0;
 }
 
+static int show_object_fast(
+   const unsigned char *sha1,
+   enum object_type type,
+   uint32_t hash, int exclude,
+   struct packed_git *found_pack,
+   off_t found_offset)
+{
+   fprintf(stdout, %ss\n, sha1_to_hex(sha1));
+   return 1;
+}
+
 int cmd_rev_list(int argc, const char **argv, const char *prefix)
 {
struct rev_info revs;
@@ -264,6 +277,7 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
int bisect_list = 0;
int bisect_show_vars = 0;
int bisect_find_all = 0;
+   int use_bitmaps = 0;
 
git_config(git_default_config, NULL);
init_revisions(revs, prefix);
@@ -305,8 +319,15 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
bisect_show_vars = 1;
continue;
}
+   if (!strcmp(arg, --use-bitmaps)) {
+   use_bitmaps = 1;
+   continue;
+   }
+   if (!strcmp(arg, --test-bitmap)) {
+   test_bitmap_walk(revs);
+   return 0;
+   }
usage(rev_list_usage);
-
}
if (revs.commit_format != CMIT_FMT_UNSPECIFIED) {
/* The command line has a --pretty  */
@@ -332,6 +353,11 @@ int cmd_rev_list(int argc, const char **argv, const char 
*prefix)
if (bisect_list)
revs.limited = 1;
 
+   if (use_bitmaps  !prepare_bitmap_walk(revs, NULL)) {
+   traverse_bitmap_commit_list(show_object_fast);
+   return 0;
+   }
+
if (prepare_revision_walk(revs))
die(revision walk setup failed);
if (revs.tree_objects)
-- 
1.7.9.5

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html