Re: Fwd: Runaway git remote if group definition contains a remote by the same name

2013-12-31 Thread Jeff King
On Mon, Dec 30, 2013 at 11:10:31AM -0800, Junio C Hamano wrote:

  So I think the sanest thing is probably:
 
1. Teach fetch to expand recursively in a single process, and then
   tell sub-processes (via a new command-line option) not to expand
   any further.
 
2. Teach fetch to detect cycles (probably just by a simple depth
   counter).
 
 I suspect that the expansion code will just accumulate remotes found
 into a string-list (as part of 4. below), so deduping would be
 fairly easily done without a depth counter.

I don't think that will work (at least not naively). The end-product of
step 1, and the string_list that is de-duped in step 4, is a list of the
concrete remotes. The cycles occur between groups, which are not
mentioned in the final list.

You can keep a separate list of the groups we visit, of course, but we
do not otherwise need it.

One thing that does make such a list easier is that we do not need to
care about order. E.g., in config like this:

  [remotes]
  a = c
  b = c
  c = d e

you can mark c as seen after visiting it via a. It is not
technically a cycle, but since we would want to suppress duplicates
anyway, we can be overly broad.

3. Teach the group-reading code to detect groups more robustly, so
   that a single-item group like remotes.foo=bar correctly recurses
   to bar.
 
 A single-item remote group is somewhat annoying, but expanding it
 only at some places while ignoring it at other places is even more
 annoying, so this sounds like a right thing to do.

The only configuration that I think would be negatively affected is
something like:

  [remote]
  foo = foo
  [remote foo]
  url = ...

that silently works now, but would become broken (because we would
complain about the cycle). I think that's OK; that config is clearly
stupid and broken. If it were remote.foo = foo bar, trying to expand
the concrete foo and bar, that might make some sense, but then it is
already broken in the current code (that is the example that started the
discussion).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Fwd: Runaway git remote if group definition contains a remote by the same name

2013-12-28 Thread Alex Riesen
FWIW, the issue is still present.

-- Forwarded message --
From: Alex Riesen raa.l...@gmail.com
Date: Wed, Nov 17, 2010 at 6:10 PM
Subject: Runaway git remote if group definition contains a remote by
the same name
To: Git Mailing List git@vger.kernel.org


Hi,

it is also a way to create a fork bomb out of the innocent tool on platforms
where pressing Ctrl-C does not terminate subprocesses of the foreground
process (like, of course, Windows).

To reproduce, run

   git -c remotes.origin='origin other' remote update origin

I just cannot look at it right now, and have to resolve to only reporting
the problem to warn people. Something seems to resolve the remotes group
definition over and over again.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Fwd: Runaway git remote if group definition contains a remote by the same name

2013-12-28 Thread Jeff King
On Sat, Dec 28, 2013 at 03:56:55PM +0100, Alex Riesen wrote:

 it is also a way to create a fork bomb out of the innocent tool on platforms
 where pressing Ctrl-C does not terminate subprocesses of the foreground
 process (like, of course, Windows).
 
 To reproduce, run
 
git -c remotes.origin='origin other' remote update origin

Hmm. This is a pretty straightforward reference cycle. We expand origin
to contain itself, so it recurses forever on expansion. As with most
such problems, the cycle path may be longer than one:

  git -c remotes.foo='bar baz' -c remotes.bar='foo baz' fetch foo

Detecting the cycle can be done by keeping track of which names we've
seen, or just by putting in a depth limit that no sane setup would hit.
In either case, it's complicated slightly by the fact that we pass the
expanded list to a sub-process (which then recurses on the expansion).
So we'd have to communicate the depth (or the list of seen remotes) via
the command line or the environment.

One alternative would be to have the parent git fetch recursively
expand the list itself down to scalar entries, and then invoke
sub-processes on the result (and tell them not to expand at all). That
would also let us cull duplicates if a remote is found via multiple
groups.

Interestingly, the problem does not happen with this:

  git -c remotes.foo=foo fetch foo

Fetch sees that foo expands only to a single item and says oh, that
must not be a group. And then treats it like a regular remote, rather
than recursing. So it's not clear to me whether groups are meant to be
recursive or not. They are in some cases:

  # fetch remotes 1-4
  git -c remotes.parent='child1 child2' \
  -c remotes.child1='remote1 remote2' \
  -c remotes.child2='remote3 remote4' \
  fetch parent

but not in others:

  # foo should be an alias for bar, but it's not
  git -c remotes.foo=bar \
  -c remotes.bar='remote1 remote2' \
  fetch foo

If they are not allowed to recurse, the problem is much easier; the
parent fetch simply tells all of the sub-invocations not to expand the
arguments further. However, whether it was planned or not, it has been
this way for a long time. I would not be surprised if somebody is
relying on the recursion to help organize their groups.

So I think the sanest thing is probably:

  1. Teach fetch to expand recursively in a single process, and then
 tell sub-processes (via a new command-line option) not to expand
 any further.

  2. Teach fetch to detect cycles (probably just by a simple depth
 counter).

  3. Teach the group-reading code to detect groups more robustly, so
 that a single-item group like remotes.foo=bar correctly recurses
 to bar.

  4. (Optional) Teach the expansion code from step 1 to cull duplicates,
 so that we do not try to fetch from the same remote twice (e.g., if
 it is mentioned as part of two groups, and both are specified on
 the command line).

I do not plan to work on this myself in the immediate future, but
perhaps it is an interesting low-hanging fruit for somebody else.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html