Re: [PATCH v2] git tag --contains : avoid stack overflow
Thanks for all suggestions and explanations. The diff against PATCH v2 is below, PATCH v3 follows. Have a nice day, Stepan Subject: [PATCH] fixup! git tag --contains : avoid stack overflow --- t/t7004-tag.sh | 11 +++ 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/t/t7004-tag.sh b/t/t7004-tag.sh index db82f6d..a911df0 100755 --- a/t/t7004-tag.sh +++ b/t/t7004-tag.sh @@ -1423,12 +1423,15 @@ EOF test_cmp expect actual ' -ulimit_stack=ulimit -s 64 -test_lazy_prereq ULIMIT 'bash -c '$ulimit_stack'' +run_with_limited_stack () { + (ulimit -s 64 $@) +} + +test_lazy_prereq ULIMIT 'run_with_limited_stack true' -expect # we require bash and ulimit, this excludes Windows test_expect_success ULIMIT '--contains works in a deep repo' ' + expect i=1 while test $i -lt 4000 do @@ -1442,7 +1445,7 @@ EOF done | git fast-import git checkout master git tag far-far-away HEAD^ - bash -c '$ulimit_stack' git tag --contains HEAD actual + run_with_limited_stack git tag --contains HEAD actual test_cmp expect actual ' -- 1.9.0.msysgit.0.119.g722efef -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] git tag --contains : avoid stack overflow
From: Jean-Jacques Lafay jeanjacques.la...@gmail.com In large repos, the recursion implementation of contains(commit, commit_list) may result in a stack overflow. Replace the recursion with a loop to fix it. This problem is more apparent on Windows than on Linux, where the stack is more limited by default. See also this thread on the msysGit list: https://groups.google.com/d/topic/msysgit/FqT6boJrb2g/discussion [jes: re-written to imitate the original recursion more closely] Thomas Braun pointed out several documentation shortcomings. Tests are run only if ulimit -s is available. This means they cannot be run on Windows. Signed-off-by: Jean-Jacques Lafay jeanjacques.la...@gmail.com Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de Tested-by: Stepan Kasal ka...@ucw.cz --- Hello, I have found out that ulimit -s does not work on Windows. Adding this as a prerequisite, we will skip the test there. On Thu, Apr 17, 2014 at 05:32:38PM -0400, Jeff King wrote: So we can bump the depth further; probably 4000 is enough for any system to fail with a 64k stack. The deeper we make it, the longer it takes to run the test, though. At 4000, my machine seems to take about 300ms to run it. That's may be OK. Consequently, we can accept this proposal. Stepan Kasal builtin/tag.c | 90 -- t/t7004-tag.sh | 23 +++ 2 files changed, 98 insertions(+), 15 deletions(-) diff --git a/builtin/tag.c b/builtin/tag.c index 6c7c6bd..f344002 100644 --- a/builtin/tag.c +++ b/builtin/tag.c @@ -80,11 +80,19 @@ static int in_commit_list(const struct commit_list *want, struct commit *c) return 0; } -static int contains_recurse(struct commit *candidate, +enum contains_result { + CONTAINS_UNKNOWN = -1, + CONTAINS_NO = 0, + CONTAINS_YES = 1, +}; + +/* + * Test whether the candidate or one of its parents is contained in the list. + * Do not recurse to find out, though, but return -1 if inconclusive. + */ +static enum contains_result contains_test(struct commit *candidate, const struct commit_list *want) { - struct commit_list *p; - /* was it previously marked as containing a want commit? */ if (candidate-object.flags TMP_MARK) return 1; @@ -92,26 +100,78 @@ static int contains_recurse(struct commit *candidate, if (candidate-object.flags UNINTERESTING) return 0; /* or are we it? */ - if (in_commit_list(want, candidate)) + if (in_commit_list(want, candidate)) { + candidate-object.flags |= TMP_MARK; return 1; + } if (parse_commit(candidate) 0) return 0; - /* Otherwise recurse and mark ourselves for future traversals. */ - for (p = candidate-parents; p; p = p-next) { - if (contains_recurse(p-item, want)) { - candidate-object.flags |= TMP_MARK; - return 1; - } - } - candidate-object.flags |= UNINTERESTING; - return 0; + return -1; } -static int contains(struct commit *candidate, const struct commit_list *want) +/* + * Mimicking the real stack, this stack lives on the heap, avoiding stack + * overflows. + * + * At each recursion step, the stack items points to the commits whose + * ancestors are to be inspected. + */ +struct stack { + int nr, alloc; + struct stack_entry { + struct commit *commit; + struct commit_list *parents; + } *stack; +}; + +static void push_to_stack(struct commit *candidate, struct stack *stack) +{ + int index = stack-nr++; + ALLOC_GROW(stack-stack, stack-nr, stack-alloc); + stack-stack[index].commit = candidate; + stack-stack[index].parents = candidate-parents; +} + +static enum contains_result contains(struct commit *candidate, + const struct commit_list *want) { - return contains_recurse(candidate, want); + struct stack stack = { 0, 0, NULL }; + int result = contains_test(candidate, want); + + if (result != CONTAINS_UNKNOWN) + return result; + + push_to_stack(candidate, stack); + while (stack.nr) { + struct stack_entry *entry = stack.stack[stack.nr - 1]; + struct commit *commit = entry-commit; + struct commit_list *parents = entry-parents; + + if (!parents) { + commit-object.flags |= UNINTERESTING; + stack.nr--; + } + /* +* If we just popped the stack, parents-item has been marked, +* therefore contains_test will return a meaningful 0 or 1. +*/ + else switch (contains_test(parents-item, want)) { + case CONTAINS_YES: + commit-object.flags |= TMP_MARK; +
Re: [PATCH v2] git tag --contains : avoid stack overflow
Hi, On Wed, 23 Apr 2014, Stepan Kasal wrote: I have found out that ulimit -s does not work on Windows. Adding this as a prerequisite, we will skip the test there. The interdiff can be seen here: https://github.com/msysgit/git/commit/c68e27d5 Ciao, Johannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
Hello, On Wed, Apr 23, 2014 at 04:28:39PM +0200, Johannes Schindelin wrote: The interdiff can be seen here: https://github.com/msysgit/git/commit/c68e27d5 not exatly, is also changes the number of commits in the deep repo from 1000 to 4000, as peff proposed. Stepan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
On Wed, Apr 23, 2014 at 12:12:14PM -0700, Junio C Hamano wrote: +ulimit_stack=ulimit -s 64 +test_lazy_prereq ULIMIT 'bash -c '$ulimit_stack'' With this implementaion, ULIMIT implies bash, and we use bash that appears on user's PATH that may not be the one the user chose to run git with. Can't we fix both of them by using $SHELL_PATH? I don't think so. The point is that we _must_ use bash here, not any POSIX shell. So my $SHELL_PATH is /bin/sh, which is dash, and would not run the test. We want to run some bash if we can. We may pick a bash on the user's PATH that is not what they put into $SHELL_PATH, but that should be relatively rare. And the consequence is that either that bash works fine and we run the test, or it does not, and we skip the test. How about doing it along this line instead? run_with_limited_stack () { $SHELL_PATH -c ulimit -s 64 $* } test_lazy_prereq ULIMIT run_with_limited_stack true That's a much more direct test. I like it (aside from the $SHELL_PATH thing as described above). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
On Wed, Apr 23, 2014 at 09:53:25AM +0200, Stepan Kasal wrote: I have found out that ulimit -s does not work on Windows. Adding this as a prerequisite, we will skip the test there. I found this bit weird, as the test originated on Windows. Did it never actually cause a failure there (i.e., the ulimit -s doesn't do anything)? Or does ulimit fail? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
Hi, On Wed, Apr 23, 2014 at 12:12:14PM -0700, Junio C Hamano wrote: [Administrivia: please refrain from using Mail-Followup-To to deflect an attempt to directly respond to you; thanks a lot for telling me. Actually, this was a mistake: I added git to the list of discussion lists, without realizing the consequences. I'm glad to have separate copies of my threads that do not fall to the git-list folder. +expect Move this inside test_expect_success? Of course. Had this in mind, then forgot. So this runs a separate bash, [...] run git tag in the original shell? Ahh, no, I am mis-pairing the quotes. Point taken. I admire how nicely you explained that! run_with_limited_stack () { $SHELL_PATH -c ulimit -s 64 $* } Elegant. But I agree with Peff that we shall run a bash instead. I'll mail an updated patch tomorrow. Stepan -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
Jeff King p...@peff.net writes: On Wed, Apr 23, 2014 at 12:12:14PM -0700, Junio C Hamano wrote: +ulimit_stack=ulimit -s 64 +test_lazy_prereq ULIMIT 'bash -c '$ulimit_stack'' With this implementaion, ULIMIT implies bash, and we use bash that appears on user's PATH that may not be the one the user chose to run git with. Can't we fix both of them by using $SHELL_PATH? I don't think so. The point is that we _must_ use bash here, not any POSIX shell. Sorry, but I do not understand. Isn't what you want any POSIX shell with 'ulimit -s 64' supported? $ dash -c 'ulimit -s ulimit -s 64 ulimit -s' 8192 64 We want to run some bash if we can. We may pick a bash on the user's PATH that is not what they put into $SHELL_PATH, but that should be relatively rare. And the consequence is that either that bash works fine and we run the test, or it does not, and we skip the test. How about doing it along this line instead? run_with_limited_stack () { $SHELL_PATH -c ulimit -s 64 $* } test_lazy_prereq ULIMIT run_with_limited_stack true That's a much more direct test. I like it (aside from the $SHELL_PATH thing as described above). Still puzzled. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
On Wed, Apr 23, 2014 at 01:48:05PM -0700, Junio C Hamano wrote: I don't think so. The point is that we _must_ use bash here, not any POSIX shell. Sorry, but I do not understand. Isn't what you want any POSIX shell with 'ulimit -s 64' supported? Sure, that would be fine, but the original patch which started this thread claimed that bash was required. I had assumed that to be true, but it seems like it's not: $ dash -c 'ulimit -s ulimit -s 64 ulimit -s' 8192 64 If we are just using the same shell we are already running, then why invoke it by name in the first place? IOW, why not: run_with_limited_stack () { (ulimit -s 64 $@) } -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
Jeff King p...@peff.net writes: On Wed, Apr 23, 2014 at 01:48:05PM -0700, Junio C Hamano wrote: I don't think so. The point is that we _must_ use bash here, not any POSIX shell. Sorry, but I do not understand. Isn't what you want any POSIX shell with 'ulimit -s 64' supported? Sure, that would be fine, but the original patch which started this thread claimed that bash was required. I had assumed that to be true, but it seems like it's not: $ dash -c 'ulimit -s ulimit -s 64 ulimit -s' 8192 64 If we are just using the same shell we are already running, then why invoke it by name in the first place? IOW, why not: run_with_limited_stack () { (ulimit -s 64 $@) } That is certainly more preferrable than an explicit run this piece with $SHELL_PATH. I think the choice between Any bash that is on user's PATH vs The shell the user told us to use when working with Git is a trade-off between - those who choose a shell that does not support ulimit -s to work with Git (which is fine, because our scripted Porcelains would not have any need for that); for these people, this test would be skipped unnecessarily if we insist on SHELL_PATH; and - those who run on a box without any bash on their PATH, chose a shell that is not bash but does support ulimit -s as their SHELL_PATH to build Git with; for these people, this test would be skipped unnecessarily if we insist on bash. and I do not think of a good reason to favor one over the other. If I have to pick, I'd take your don't name any shell, and let the current one run it approach, solely for the simplicity of the solution (it ends up favoring the latter class of people as a side-effect, though). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] git tag --contains : avoid stack overflow
Hi Peff, On Wed, 23 Apr 2014, Jeff King wrote: On Wed, Apr 23, 2014 at 09:53:25AM +0200, Stepan Kasal wrote: I have found out that ulimit -s does not work on Windows. Adding this as a prerequisite, we will skip the test there. I found this bit weird, as the test originated on Windows. Did it never actually cause a failure there (i.e., the ulimit -s doesn't do anything)? Or does ulimit fail? I must have forgotten to test on Windows. For performance reasons (you know that I only have a Git time budget of about 15min/day), I developed the test and the patch on Linux. Ciao, Johannes -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html