Re: [PATCH 0/4] Test name-rev with small stack
On Fri, Sep 08, 2017 at 02:33:35PM +0200, Michael J Gruber wrote: > Looking at it more closely, the solution in cbc60b6720 ("git tag > --contains: avoid stack overflow", 2014-04-24) seems to be a bit "ad > hoc" to me: > > First of all, there is more than "tag --contains" that may exceed the > stack by recursion. So I would expect the solution to be more general, > and not localised and specialised to builtin/tag.c At the time, tag was the only one using this depth-first contains algorithm. It's since been adapted to ref-filter.c, but of course the stack handling went with it. Most traversals have a date-sorted queue, so are effectively doing a breadth-first iteration with no recursion. > Second, this is a walk, so I'm wondering whether our revision walk > machinery should be the place to add missing functionality (if any). > That way, everything would benefit from possible or existing > improvements there. For example, I think some of our "extra walkers" > don't heed object replacements. (I need to test more.) It's possible that name-rev could make better use of the existing traversal machinery. It's often tough to do so, though, because that machinery gives you a linearized ordering of the commits. Whereas something like name-rev really cares about the order that it visits the commits, because it's building up the names. It's the same for this "tag --contains" traversal. It _used_ to be a series of merge-base computations. But by doing a custom traversal, we can cache incremental results through the graph and avoid walking over the same bits multiple times. There actually is a way to do it with the regular breadth-first traversal, but you have to store one bit per ref you're checking for each commit. I played around with that a bit a while ago, and it did seem to work. I can dig up the patches if you're interested. But one downside is that one bit per ref per commit adds up if you have a lot of refs. A large number of those bitfields will be the same, so you could probably get by with a copy-on-write scheme, but I never implemented that. Of course somebody may have a more clever algorithm, too. I don't claim the above is a proof. ;) -Peff
Re: [PATCH 0/4] Test name-rev with small stack
Jeff King venit, vidit, dixit 07.09.2017 16:54: > On Thu, Sep 07, 2017 at 04:02:19PM +0200, Michael J Gruber wrote: > >> name-rev segfaults for me in emacs.git with the typical 8102 stack size. >> The reason is the recursive walk that name-rev uses. >> >> This series adds a test to mark this as known failure, after some >> clean-ups. > > These all look reasonable to me. The size of the test case in the final > one is presumably arbitrary and just copied from t7004. I don't know if > it's worth trying to shrink it. It could shorten a rather expensive > test. OTOH, if we shorten it too much then we might get a false pass > (e.g., if the algorithm remains recursive but has a smaller stack > footprint). > >> Michael J Gruber (4): >> t7004: move limited stack prereq to test-lib >> t6120: test name-rev --all and --stdin >> t6120: clean up state after breaking repo >> t6120: test describe and name-rev with deep repos > > Now comes the hard part: rewriting the C code. :) Looking at it more closely, the solution in cbc60b6720 ("git tag --contains: avoid stack overflow", 2014-04-24) seems to be a bit "ad hoc" to me: First of all, there is more than "tag --contains" that may exceed the stack by recursion. So I would expect the solution to be more general, and not localised and specialised to builtin/tag.c Second, this is a walk, so I'm wondering whether our revision walk machinery should be the place to add missing functionality (if any). That way, everything would benefit from possible or existing improvements there. For example, I think some of our "extra walkers" don't heed object replacements. (I need to test more.) Michael
Re: [PATCH 0/4] Test name-rev with small stack
On Thu, Sep 07, 2017 at 04:02:19PM +0200, Michael J Gruber wrote: > name-rev segfaults for me in emacs.git with the typical 8102 stack size. > The reason is the recursive walk that name-rev uses. > > This series adds a test to mark this as known failure, after some > clean-ups. These all look reasonable to me. The size of the test case in the final one is presumably arbitrary and just copied from t7004. I don't know if it's worth trying to shrink it. It could shorten a rather expensive test. OTOH, if we shorten it too much then we might get a false pass (e.g., if the algorithm remains recursive but has a smaller stack footprint). > Michael J Gruber (4): > t7004: move limited stack prereq to test-lib > t6120: test name-rev --all and --stdin > t6120: clean up state after breaking repo > t6120: test describe and name-rev with deep repos Now comes the hard part: rewriting the C code. :) -Peff
[PATCH 0/4] Test name-rev with small stack
name-rev segfaults for me in emacs.git with the typical 8102 stack size. The reason is the recursive walk that name-rev uses. This series adds a test to mark this as known failure, after some clean-ups. Michael J Gruber (4): t7004: move limited stack prereq to test-lib t6120: test name-rev --all and --stdin t6120: clean up state after breaking repo t6120: test describe and name-rev with deep repos t/t6120-describe.sh | 57 + t/t7004-tag.sh | 6 -- t/test-lib.sh | 6 ++ 3 files changed, 63 insertions(+), 6 deletions(-) -- 2.14.1.603.gf58147c36e