Re: [PATCH v6 0/6] recursively grep across submodules

2016-12-01 Thread Jeff King
On Thu, Dec 01, 2016 at 09:45:47AM -0800, Brandon Williams wrote:

> Yeah I was trying to think through these scenarios myself last night.
> And like you found it seemed alright to let the child process deal with
> the .git file/dir as long as once actually exists at that path.  If one
> didn't then there would be the possibility that we ended up back at the
> superproject, which would result in an infinite loop.  And yeah if the
> .git file doesn't resolve to anything sensible then the user probably
> mangled their repository somehow anyways.

I hadn't considered the infinite loop. I thought the worst case is that
we might just generate bogus results by going back to the superproject.
But of course there is nothing to stop it from just recursing again.

However, it looks like there is a circuit-breaker; we end up back in the
superproject, but inside a subdirectory, which causes --super-prefix to
complain.

You can test it with just:

  rm submodule/.git
  mkdir submodule/.git

which says:

  fatal: can't use --super-prefix from a subdirectory
  fatal: process for submodule 'foo' failed with exit code: 128

It might be worth including a test to make sure that behavior remains.
I think it's more of an emergent behavior than something planned. :)

-Peff


Re: [PATCH v6 0/6] recursively grep across submodules

2016-12-01 Thread Brandon Williams
On 11/30, Jeff King wrote:
> On Wed, Nov 30, 2016 at 05:28:28PM -0800, Brandon Williams wrote:
> 
> > v6 fixes a race condition which existed in the 'is_submodule_populated'
> > function.  Instead of calling 'resolve_gitdir' to check for the existance 
> > of a
> > .git file/directory, use 'stat'.  'resolve_gitdir' calls 'chdir' which can
> > affect other running threads trying to load thier files into a buffer in
> > memory.
> 
> This one passes my stress-test for t7814 (though I imagine you already
> knew that).
> 
> I tried to think of things that could go wrong by using a simple stat()
> instead of resolve_gitdir(). They should only differ when ".git" for
> some reason does not point to a git repository. My initial thought is
> that this might be more vocal about errors, because the child process
> will complain. But actually, the original would already die if the
> ".git" file is funny, so we were pretty vocal already.
> 
> I also wondered whether the sub-process might skip a bogus ".git" file
> and keep looking upward in the filesystem tree (which would confusingly
> end up back in the super-project!). But it looks like we bail hard when
> we see a ".git" file but it's bogus. Which is probably a good thing in
> general for submodules.
> 
> I'm not sure any of that is actually even worth worrying about, as such
> a setup is broken by definition. I just wanted to think it through as a
> devil's advocate, and even that seems pretty reasonable.
> 
> -Peff

Yeah I was trying to think through these scenarios myself last night.
And like you found it seemed alright to let the child process deal with
the .git file/dir as long as once actually exists at that path.  If one
didn't then there would be the possibility that we ended up back at the
superproject, which would result in an infinite loop.  And yeah if the
.git file doesn't resolve to anything sensible then the user probably
mangled their repository somehow anyways.

Thanks again for all the help!

-- 
Brandon Williams


Re: [PATCH v6 0/6] recursively grep across submodules

2016-11-30 Thread Jeff King
On Wed, Nov 30, 2016 at 05:28:28PM -0800, Brandon Williams wrote:

> v6 fixes a race condition which existed in the 'is_submodule_populated'
> function.  Instead of calling 'resolve_gitdir' to check for the existance of a
> .git file/directory, use 'stat'.  'resolve_gitdir' calls 'chdir' which can
> affect other running threads trying to load thier files into a buffer in
> memory.

This one passes my stress-test for t7814 (though I imagine you already
knew that).

I tried to think of things that could go wrong by using a simple stat()
instead of resolve_gitdir(). They should only differ when ".git" for
some reason does not point to a git repository. My initial thought is
that this might be more vocal about errors, because the child process
will complain. But actually, the original would already die if the
".git" file is funny, so we were pretty vocal already.

I also wondered whether the sub-process might skip a bogus ".git" file
and keep looking upward in the filesystem tree (which would confusingly
end up back in the super-project!). But it looks like we bail hard when
we see a ".git" file but it's bogus. Which is probably a good thing in
general for submodules.

I'm not sure any of that is actually even worth worrying about, as such
a setup is broken by definition. I just wanted to think it through as a
devil's advocate, and even that seems pretty reasonable.

-Peff