Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-24 Thread Stepan Kasal
Thanks for all suggestions and explanations.
The diff against PATCH v2 is below, PATCH v3 follows.

Have a nice day,
Stepan

Subject: [PATCH] fixup! git tag --contains : avoid stack overflow

---
 t/t7004-tag.sh | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh
index db82f6d..a911df0 100755
--- a/t/t7004-tag.sh
+++ b/t/t7004-tag.sh
@@ -1423,12 +1423,15 @@ EOF
test_cmp expect actual
 '
 
-ulimit_stack=ulimit -s 64
-test_lazy_prereq ULIMIT 'bash -c '$ulimit_stack''
+run_with_limited_stack () {
+   (ulimit -s 64  $@)
+}
+
+test_lazy_prereq ULIMIT 'run_with_limited_stack true'
 
-expect
 # we require bash and ulimit, this excludes Windows
 test_expect_success ULIMIT '--contains works in a deep repo' '
+   expect 
i=1 
while test $i -lt 4000
do
@@ -1442,7 +1445,7 @@ EOF
done | git fast-import 
git checkout master 
git tag far-far-away HEAD^ 
-   bash -c '$ulimit_stack'  git tag --contains HEAD actual 
+   run_with_limited_stack git tag --contains HEAD actual 
test_cmp expect actual
 '
 
-- 
1.9.0.msysgit.0.119.g722efef

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Stepan Kasal
From: Jean-Jacques Lafay jeanjacques.la...@gmail.com

In large repos, the recursion implementation of contains(commit,
commit_list) may result in a stack overflow. Replace the recursion with
a loop to fix it.

This problem is more apparent on Windows than on Linux, where the stack
is more limited by default.

See also this thread on the msysGit list:

https://groups.google.com/d/topic/msysgit/FqT6boJrb2g/discussion

[jes: re-written to imitate the original recursion more closely]

Thomas Braun pointed out several documentation shortcomings.

Tests are run only if ulimit -s is available.  This means they cannot
be run on Windows.

Signed-off-by: Jean-Jacques Lafay jeanjacques.la...@gmail.com
Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de
Tested-by: Stepan Kasal ka...@ucw.cz
---
Hello,
I have found out that ulimit -s does not work on Windows.
Adding this as a prerequisite, we will skip the test there.

On Thu, Apr 17, 2014 at 05:32:38PM -0400, Jeff King wrote:
 So we can bump the depth further; probably 4000 is enough for any system
 to fail with a 64k stack. The deeper we make it, the longer it takes to
 run the test, though. At 4000, my machine seems to take about 300ms to
 run it. That's may be OK.

Consequently, we can accept this proposal.

Stepan Kasal

 builtin/tag.c  | 90 --
 t/t7004-tag.sh | 23 +++
 2 files changed, 98 insertions(+), 15 deletions(-)

diff --git a/builtin/tag.c b/builtin/tag.c
index 6c7c6bd..f344002 100644
--- a/builtin/tag.c
+++ b/builtin/tag.c
@@ -80,11 +80,19 @@ static int in_commit_list(const struct commit_list *want, 
struct commit *c)
return 0;
 }
 
-static int contains_recurse(struct commit *candidate,
+enum contains_result {
+   CONTAINS_UNKNOWN = -1,
+   CONTAINS_NO = 0,
+   CONTAINS_YES = 1,
+};
+
+/*
+ * Test whether the candidate or one of its parents is contained in the list.
+ * Do not recurse to find out, though, but return -1 if inconclusive.
+ */
+static enum contains_result contains_test(struct commit *candidate,
const struct commit_list *want)
 {
-   struct commit_list *p;
-
/* was it previously marked as containing a want commit? */
if (candidate-object.flags  TMP_MARK)
return 1;
@@ -92,26 +100,78 @@ static int contains_recurse(struct commit *candidate,
if (candidate-object.flags  UNINTERESTING)
return 0;
/* or are we it? */
-   if (in_commit_list(want, candidate))
+   if (in_commit_list(want, candidate)) {
+   candidate-object.flags |= TMP_MARK;
return 1;
+   }
 
if (parse_commit(candidate)  0)
return 0;
 
-   /* Otherwise recurse and mark ourselves for future traversals. */
-   for (p = candidate-parents; p; p = p-next) {
-   if (contains_recurse(p-item, want)) {
-   candidate-object.flags |= TMP_MARK;
-   return 1;
-   }
-   }
-   candidate-object.flags |= UNINTERESTING;
-   return 0;
+   return -1;
 }
 
-static int contains(struct commit *candidate, const struct commit_list *want)
+/*
+ * Mimicking the real stack, this stack lives on the heap, avoiding stack
+ * overflows.
+ *
+ * At each recursion step, the stack items points to the commits whose
+ * ancestors are to be inspected.
+ */
+struct stack {
+   int nr, alloc;
+   struct stack_entry {
+   struct commit *commit;
+   struct commit_list *parents;
+   } *stack;
+};
+
+static void push_to_stack(struct commit *candidate, struct stack *stack)
+{
+   int index = stack-nr++;
+   ALLOC_GROW(stack-stack, stack-nr, stack-alloc);
+   stack-stack[index].commit = candidate;
+   stack-stack[index].parents = candidate-parents;
+}
+
+static enum contains_result contains(struct commit *candidate,
+   const struct commit_list *want)
 {
-   return contains_recurse(candidate, want);
+   struct stack stack = { 0, 0, NULL };
+   int result = contains_test(candidate, want);
+
+   if (result != CONTAINS_UNKNOWN)
+   return result;
+
+   push_to_stack(candidate, stack);
+   while (stack.nr) {
+   struct stack_entry *entry = stack.stack[stack.nr - 1];
+   struct commit *commit = entry-commit;
+   struct commit_list *parents = entry-parents;
+
+   if (!parents) {
+   commit-object.flags |= UNINTERESTING;
+   stack.nr--;
+   }
+   /*
+* If we just popped the stack, parents-item has been marked,
+* therefore contains_test will return a meaningful 0 or 1.
+*/
+   else switch (contains_test(parents-item, want)) {
+   case CONTAINS_YES:
+   commit-object.flags |= TMP_MARK;
+

Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Johannes Schindelin
Hi,

On Wed, 23 Apr 2014, Stepan Kasal wrote:

 I have found out that ulimit -s does not work on Windows.
 Adding this as a prerequisite, we will skip the test there.

The interdiff can be seen here:

https://github.com/msysgit/git/commit/c68e27d5

Ciao,
Johannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Stepan Kasal
Hello,

On Wed, Apr 23, 2014 at 04:28:39PM +0200, Johannes Schindelin wrote:
 The interdiff can be seen here:
   https://github.com/msysgit/git/commit/c68e27d5

not exatly, is also changes the number of commits in the deep repo
from 1000 to 4000, as peff proposed.

Stepan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Jeff King
On Wed, Apr 23, 2014 at 12:12:14PM -0700, Junio C Hamano wrote:

  +ulimit_stack=ulimit -s 64
  +test_lazy_prereq ULIMIT 'bash -c '$ulimit_stack''
 
 With this implementaion, ULIMIT implies bash, and we use bash that
 appears on user's PATH that may not be the one the user chose to run
 git with.  Can't we fix both of them by using $SHELL_PATH?

I don't think so. The point is that we _must_ use bash here, not any
POSIX shell. So my $SHELL_PATH is /bin/sh, which is dash, and would not
run the test.

We want to run some bash if we can. We may pick a bash on the user's
PATH that is not what they put into $SHELL_PATH, but that should be
relatively rare. And the consequence is that either that bash works fine
and we run the test, or it does not, and we skip the test.

 How about doing it along this line instead?
 
   run_with_limited_stack () {
   $SHELL_PATH -c ulimit -s 64  $*
   }
 
   test_lazy_prereq ULIMIT run_with_limited_stack true

That's a much more direct test. I like it (aside from the $SHELL_PATH
thing as described above).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Jeff King
On Wed, Apr 23, 2014 at 09:53:25AM +0200, Stepan Kasal wrote:

 I have found out that ulimit -s does not work on Windows.
 Adding this as a prerequisite, we will skip the test there.

I found this bit weird, as the test originated on Windows. Did it never
actually cause a failure there (i.e., the ulimit -s doesn't do
anything)? Or does ulimit fail?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Stepan Kasal
Hi,

On Wed, Apr 23, 2014 at 12:12:14PM -0700, Junio C Hamano wrote:
 [Administrivia: please refrain from using Mail-Followup-To to
 deflect an attempt to directly respond to you;

thanks a lot for telling me.
Actually, this was a mistake: I added git to the list of discussion
lists, without realizing the consequences.
I'm glad to have separate copies of my threads that do not fall
to the git-list folder.

  +expect
 Move this inside test_expect_success?

Of course.  Had this in mind, then forgot.

 So this runs a separate bash, [...]
 run git tag in the original shell?
 
 Ahh, no, I am mis-pairing the quotes.

Point taken.  I admire how nicely you explained that!

   run_with_limited_stack () {
   $SHELL_PATH -c ulimit -s 64  $*
   }

Elegant. But I agree with Peff that we shall run a bash instead.

I'll mail an updated patch tomorrow.

Stepan
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Wed, Apr 23, 2014 at 12:12:14PM -0700, Junio C Hamano wrote:

  +ulimit_stack=ulimit -s 64
  +test_lazy_prereq ULIMIT 'bash -c '$ulimit_stack''
 
 With this implementaion, ULIMIT implies bash, and we use bash that
 appears on user's PATH that may not be the one the user chose to run
 git with.  Can't we fix both of them by using $SHELL_PATH?

 I don't think so. The point is that we _must_ use bash here, not any
 POSIX shell.

Sorry, but I do not understand.  Isn't what you want any POSIX
shell with 'ulimit -s 64' supported?

$ dash -c 'ulimit -s  ulimit -s 64  ulimit -s'
8192
64

 We want to run some bash if we can. We may pick a bash on the user's
 PATH that is not what they put into $SHELL_PATH, but that should be
 relatively rare. And the consequence is that either that bash works fine
 and we run the test, or it does not, and we skip the test.

 How about doing it along this line instead?
 
  run_with_limited_stack () {
  $SHELL_PATH -c ulimit -s 64  $*
  }
 
  test_lazy_prereq ULIMIT run_with_limited_stack true

 That's a much more direct test. I like it (aside from the $SHELL_PATH
 thing as described above).

Still puzzled.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Jeff King
On Wed, Apr 23, 2014 at 01:48:05PM -0700, Junio C Hamano wrote:

  I don't think so. The point is that we _must_ use bash here, not any
  POSIX shell.
 
 Sorry, but I do not understand.  Isn't what you want any POSIX
 shell with 'ulimit -s 64' supported?

Sure, that would be fine, but the original patch which started this
thread claimed that bash was required. I had assumed that to be true,
but it seems like it's not:

 $ dash -c 'ulimit -s  ulimit -s 64  ulimit -s'
 8192
 64

If we are just using the same shell we are already running, then why
invoke it by name in the first place? IOW, why not:

  run_with_limited_stack () {
(ulimit -s 64  $@)
  }

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Wed, Apr 23, 2014 at 01:48:05PM -0700, Junio C Hamano wrote:

  I don't think so. The point is that we _must_ use bash here, not any
  POSIX shell.
 
 Sorry, but I do not understand.  Isn't what you want any POSIX
 shell with 'ulimit -s 64' supported?

 Sure, that would be fine, but the original patch which started this
 thread claimed that bash was required. I had assumed that to be true,
 but it seems like it's not:

 $ dash -c 'ulimit -s  ulimit -s 64  ulimit -s'
 8192
 64

 If we are just using the same shell we are already running, then why
 invoke it by name in the first place? IOW, why not:

   run_with_limited_stack () {
   (ulimit -s 64  $@)
   }

That is certainly more preferrable than an explicit run this piece
with $SHELL_PATH.

I think the choice between Any bash that is on user's PATH vs The
shell the user told us to use when working with Git is a trade-off
between

 - those who choose a shell that does not support ulimit -s to
   work with Git (which is fine, because our scripted Porcelains
   would not have any need for that); for these people, this test
   would be skipped unnecessarily if we insist on SHELL_PATH; and

 - those who run on a box without any bash on their PATH, chose a
   shell that is not bash but does support ulimit -s as their
   SHELL_PATH to build Git with; for these people, this test would
   be skipped unnecessarily if we insist on bash.

and I do not think of a good reason to favor one over the other.

If I have to pick, I'd take your don't name any shell, and let the
current one run it approach, solely for the simplicity of the
solution (it ends up favoring the latter class of people as a
side-effect, though).

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] git tag --contains : avoid stack overflow

2014-04-23 Thread Johannes Schindelin
Hi Peff,

On Wed, 23 Apr 2014, Jeff King wrote:

 On Wed, Apr 23, 2014 at 09:53:25AM +0200, Stepan Kasal wrote:
 
  I have found out that ulimit -s does not work on Windows.  Adding
  this as a prerequisite, we will skip the test there.
 
 I found this bit weird, as the test originated on Windows. Did it never
 actually cause a failure there (i.e., the ulimit -s doesn't do
 anything)? Or does ulimit fail?

I must have forgotten to test on Windows. For performance reasons (you
know that I only have a Git time budget of about 15min/day), I developed
the test and the patch on Linux.

Ciao,
Johannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html