Re: [PATCH 0/4] Test name-rev with small stack

2017-09-11 Thread Jeff King
On Fri, Sep 08, 2017 at 02:33:35PM +0200, Michael J Gruber wrote:

> Looking at it more closely, the solution in cbc60b6720 ("git tag
> --contains: avoid stack overflow", 2014-04-24) seems to be a bit "ad
> hoc" to me:
> 
> First of all, there is more than "tag --contains" that may exceed the
> stack by recursion. So I would expect the solution to be more general,
> and not localised and specialised to builtin/tag.c

At the time, tag was the only one using this depth-first contains
algorithm. It's since been adapted to ref-filter.c, but of course the
stack handling went with it.

Most traversals have a date-sorted queue, so are effectively doing a
breadth-first iteration with no recursion.

> Second, this is a walk, so I'm wondering whether our revision walk
> machinery should be the place to add missing functionality (if any).
> That way, everything would benefit from possible or existing
> improvements there. For example, I think some of our "extra walkers"
> don't heed object replacements. (I need to test more.)

It's possible that name-rev could make better use of the existing
traversal machinery. It's often tough to do so, though, because that
machinery gives you a linearized ordering of the commits. Whereas
something like name-rev really cares about the order that it visits the
commits, because it's building up the names.

It's the same for this "tag --contains" traversal. It _used_ to be a
series of merge-base computations. But by doing a custom traversal, we
can cache incremental results through the graph and avoid walking over
the same bits multiple times. There actually is a way to do it with the
regular breadth-first traversal, but you have to store one bit per ref
you're checking for each commit.

I played around with that a bit a while ago, and it did seem to work. I
can dig up the patches if you're interested. But one downside is that
one bit per ref per commit adds up if you have a lot of refs. A large
number of those bitfields will be the same, so you could probably get by
with a copy-on-write scheme, but I never implemented that.

Of course somebody may have a more clever algorithm, too. I don't claim
the above is a proof. ;)

-Peff


Re: [PATCH 0/4] Test name-rev with small stack

2017-09-08 Thread Michael J Gruber
Jeff King venit, vidit, dixit 07.09.2017 16:54:
> On Thu, Sep 07, 2017 at 04:02:19PM +0200, Michael J Gruber wrote:
> 
>> name-rev segfaults for me in emacs.git with the typical 8102 stack size.
>> The reason is the recursive walk that name-rev uses.
>>
>> This series adds a test to mark this as known failure, after some
>> clean-ups.
> 
> These all look reasonable to me. The size of the test case in the final
> one is presumably arbitrary and just copied from t7004. I don't know if
> it's worth trying to shrink it. It could shorten a rather expensive
> test. OTOH, if we shorten it too much then we might get a false pass
> (e.g., if the algorithm remains recursive but has a smaller stack
> footprint).
> 
>> Michael J Gruber (4):
>>   t7004: move limited stack prereq to test-lib
>>   t6120: test name-rev --all and --stdin
>>   t6120: clean up state after breaking repo
>>   t6120: test describe and name-rev with deep repos
> 
> Now comes the hard part: rewriting the C code. :)

Looking at it more closely, the solution in cbc60b6720 ("git tag
--contains: avoid stack overflow", 2014-04-24) seems to be a bit "ad
hoc" to me:

First of all, there is more than "tag --contains" that may exceed the
stack by recursion. So I would expect the solution to be more general,
and not localised and specialised to builtin/tag.c

Second, this is a walk, so I'm wondering whether our revision walk
machinery should be the place to add missing functionality (if any).
That way, everything would benefit from possible or existing
improvements there. For example, I think some of our "extra walkers"
don't heed object replacements. (I need to test more.)

Michael


Re: [PATCH 0/4] Test name-rev with small stack

2017-09-07 Thread Jeff King
On Thu, Sep 07, 2017 at 04:02:19PM +0200, Michael J Gruber wrote:

> name-rev segfaults for me in emacs.git with the typical 8102 stack size.
> The reason is the recursive walk that name-rev uses.
> 
> This series adds a test to mark this as known failure, after some
> clean-ups.

These all look reasonable to me. The size of the test case in the final
one is presumably arbitrary and just copied from t7004. I don't know if
it's worth trying to shrink it. It could shorten a rather expensive
test. OTOH, if we shorten it too much then we might get a false pass
(e.g., if the algorithm remains recursive but has a smaller stack
footprint).

> Michael J Gruber (4):
>   t7004: move limited stack prereq to test-lib
>   t6120: test name-rev --all and --stdin
>   t6120: clean up state after breaking repo
>   t6120: test describe and name-rev with deep repos

Now comes the hard part: rewriting the C code. :)

-Peff


[PATCH 0/4] Test name-rev with small stack

2017-09-07 Thread Michael J Gruber
name-rev segfaults for me in emacs.git with the typical 8102 stack size.
The reason is the recursive walk that name-rev uses.

This series adds a test to mark this as known failure, after some
clean-ups.

Michael J Gruber (4):
  t7004: move limited stack prereq to test-lib
  t6120: test name-rev --all and --stdin
  t6120: clean up state after breaking repo
  t6120: test describe and name-rev with deep repos

 t/t6120-describe.sh | 57 +
 t/t7004-tag.sh  |  6 --
 t/test-lib.sh   |  6 ++
 3 files changed, 63 insertions(+), 6 deletions(-)

-- 
2.14.1.603.gf58147c36e