Re: [PATCH 5/5] (BROKEN) get_merge_bases_many(): walk from many tips in parallel

2012-08-29 Thread Jeff King
On Tue, Aug 28, 2012 at 04:39:11PM -0700, Junio C Hamano wrote:

  git rev-list --committer=torva...@linux-foundation.org \
  --max-parents=2 --min-parents=2 --parents v3.5..v3.6-rc2 RL
 
  cmd='
  while read result parent1 parent2
  do
  $GIT merge-base $parent1 $parent2
  done RL
  '
 
 I have this suspicion that it is spending most of its time in
 insert-by-date.  Luckily, merge_bases_many() is totally outside of
 the usual revision traversal and its use of the commit list is
 pretty well isolated.
 
 Peff, can I interest you into resurrecting your $gmane/174007 and
 $gmane/174008 (we do not need pop_commit_from_queue() in the latter;
 everything can stay as static to commit.c for now) to see how well
 priority queue based approach would perform?

I did a quick port of merge_bases_many and get_merge_bases_many to use
priority queues, but I didn't see any speedup at all on the above test
case.  According to perf, we spend most of our time in zlib, inflating
commits. Only 1% is spent on commit_list_insert_by_date, which is about
the same amount spent on queue insertion after my patches.

Patches follow, just for reference.

  [1/2]: basic priority queue implementation
  [2/2]: commit: use a priority queue in merge base functions

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/5] (BROKEN) get_merge_bases_many(): walk from many tips in parallel

2012-08-28 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Junio C Hamano gits...@pobox.com writes:

  for (i = 0; i  cnt; i++) {
 -if (rslt[i])
 +/*
 + * Is rslt[i] an ancestor of any of the others?
 + * then it is not interesting to us.
 + */
 +for (j = 0; j  i; j++)
 +others[j] = rslt[j];
 +for (j = 1 + 1; j  cnt; j++)

 s/1 + 1/i + 1/;

 With that, all tests seem to pass ;-)

git merge-base itself seems to have more room for improvement.
Trying to recompute bases for recent 200 merges in the kernel
history with the attached script does not show any improvement with
or without the series on top of recent master.  Correctnesswise it
seems to be OK, though---I get byte-for-byte identical output.

-- 8 --
#!/bin/sh

git rev-list --committer=torva...@linux-foundation.org \
--max-parents=2 --min-parents=2 --parents v3.5..v3.6-rc2 RL

cmd='
while read result parent1 parent2
do
$GIT merge-base $parent1 $parent2
done RL
'

GIT=rungit master time sh -c $cmd :stock
GIT=../git.git/git time sh -c $cmd :optim
cmp :stock :optim
-- 8 --
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/5] (BROKEN) get_merge_bases_many(): walk from many tips in parallel

2012-08-28 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Junio C Hamano gits...@pobox.com writes:

 Junio C Hamano gits...@pobox.com writes:

 for (i = 0; i  cnt; i++) {
 -   if (rslt[i])
 +   /*
 +* Is rslt[i] an ancestor of any of the others?
 +* then it is not interesting to us.
 +*/
 +   for (j = 0; j  i; j++)
 +   others[j] = rslt[j];
 +   for (j = 1 + 1; j  cnt; j++)

 s/1 + 1/i + 1/;

 With that, all tests seem to pass ;-)

 git merge-base itself seems to have more room for improvement.
 Trying to recompute bases for recent 200 merges in the kernel
 history with the attached script does not show any improvement with
 or without the series on top of recent master.  Correctnesswise it
 seems to be OK, though---I get byte-for-byte identical output.

 -- 8 --
 #!/bin/sh

 git rev-list --committer=torva...@linux-foundation.org \
 --max-parents=2 --min-parents=2 --parents v3.5..v3.6-rc2 RL

 cmd='
   while read result parent1 parent2
   do
   $GIT merge-base $parent1 $parent2
   done RL
 '

 GIT=rungit master time sh -c $cmd :stock
 GIT=../git.git/git time sh -c $cmd :optim
 cmp :stock :optim
 -- 8 --

I have this suspicion that it is spending most of its time in
insert-by-date.  Luckily, merge_bases_many() is totally outside of
the usual revision traversal and its use of the commit list is
pretty well isolated.

Peff, can I interest you into resurrecting your $gmane/174007 and
$gmane/174008 (we do not need pop_commit_from_queue() in the latter;
everything can stay as static to commit.c for now) to see how well
priority queue based approach would perform?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/5] (BROKEN) get_merge_bases_many(): walk from many tips in parallel

2012-08-27 Thread Junio C Hamano
After merge_bases_many() paints the graph in two colors to find
common ancestors, get_merge_bases_many() reduces the result by
excluding commits that can be reached from other commits in the
result.  We used to do N * (N-1) traversals for this, but we can
check if one commit is reachable from any of the other (N-1) commits
by a single traversal, and repeat it N times to find the answer.

Signed-off-by: Junio C Hamano gits...@pobox.com
---
 commit.c | 42 +++---
 1 file changed, 23 insertions(+), 19 deletions(-)

diff --git a/commit.c b/commit.c
index f5211bd..ccf2c5a 100644
--- a/commit.c
+++ b/commit.c
@@ -715,7 +715,7 @@ struct commit_list *get_merge_bases_many(struct commit *one,
 int cleanup)
 {
struct commit_list *list;
-   struct commit **rslt;
+   struct commit **rslt, **others;
struct commit_list *result;
int cnt, i, j;
 
@@ -744,33 +744,37 @@ struct commit_list *get_merge_bases_many(struct commit 
*one,
for (list = result, i = 0; list; list = list-next)
rslt[i++] = list-item;
free_commit_list(result);
+   result = NULL;
 
clear_commit_marks(one, all_flags);
for (i = 0; i  n; i++)
clear_commit_marks(twos[i], all_flags);
-   for (i = 0; i  cnt - 1; i++) {
-   for (j = i+1; j  cnt; j++) {
-   if (!rslt[i] || !rslt[j])
-   continue;
-   result = merge_bases_many(rslt[i], 1, rslt[j]);
-   clear_commit_marks(rslt[i], all_flags);
-   clear_commit_marks(rslt[j], all_flags);
-   for (list = result; list; list = list-next) {
-   if (rslt[i] == list-item)
-   rslt[i] = NULL;
-   if (rslt[j] == list-item)
-   rslt[j] = NULL;
-   }
-   }
-   }
 
-   /* Surviving ones in rslt[] are the independent results */
-   result = NULL;
+   others = xcalloc(cnt - 1, sizeof(*others));
for (i = 0; i  cnt; i++) {
-   if (rslt[i])
+   /*
+* Is rslt[i] an ancestor of any of the others?
+* then it is not interesting to us.
+*/
+   for (j = 0; j  i; j++)
+   others[j] = rslt[j];
+   for (j = 1 + 1; j  cnt; j++)
+   others[j - 1] = rslt[j];
+   list = merge_bases_many(rslt[i], cnt - 1, others);
+   clear_commit_marks(rslt[i], all_flags);
+   for (j = 0; j  cnt - 1; j++)
+   clear_commit_marks(others[j], all_flags);
+   while (list) {
+   if (rslt[i] == list-item)
+   break;
+   list = list-next;
+   }
+   if (!list)
commit_list_insert_by_date(rslt[i], result);
+   free_commit_list(list);
}
free(rslt);
+   free(others);
return result;
 }
 
-- 
1.7.12.116.g31e0100

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 5/5] (BROKEN) get_merge_bases_many(): walk from many tips in parallel

2012-08-27 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

   for (i = 0; i  cnt; i++) {
 - if (rslt[i])
 + /*
 +  * Is rslt[i] an ancestor of any of the others?
 +  * then it is not interesting to us.
 +  */
 + for (j = 0; j  i; j++)
 + others[j] = rslt[j];
 + for (j = 1 + 1; j  cnt; j++)

s/1 + 1/i + 1/;

With that, all tests seem to pass ;-)

 + others[j - 1] = rslt[j];
 + list = merge_bases_many(rslt[i], cnt - 1, others);
 + clear_commit_marks(rslt[i], all_flags);
 + for (j = 0; j  cnt - 1; j++)
 + clear_commit_marks(others[j], all_flags);
 + while (list) {
 + if (rslt[i] == list-item)
 + break;
 + list = list-next;
 + }
 + if (!list)
   commit_list_insert_by_date(rslt[i], result);
 + free_commit_list(list);
   }
   free(rslt);
 + free(others);
   return result;
  }
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html