Re: [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand

Jeff King Sat, 29 Sep 2018 00:31:55 -0700

On Fri, Sep 28, 2018 at 03:04:10PM -0700, Taylor Blau wrote:

> > Well, you also need to pass the path so it knows which repo to look at.
> > Which I think is the primary reason we do it, but behaving differently
> > for each alternate is another option.
> 
> Yeah. I think that the clearer argument is yours, so I'll amend my copy.
> I am thinking of:
> 
>   To find the alternate, pass its absolute path as the first argument.
> 
> How does that sound?


Sounds good.

> > > +heads, configure `core.alternateRefsCommand` to the path of a script 
> > > which runs
> > > +`git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads`.
> >
> > Does that script actually work? Because of the way we invoke shell
> > commands with arguments, I think we'd end up with:
> >
> >   git --git-dir="$1" for-each-ref --format='%(objectname)' refs/heads "$@"
> [...]
> ...but this was what I was trying to get across with saying "...to the
> path of a script which runs...", such that we would get the implicit
> scoping that you make explicit in your example with "f() { ... }; f".
>
> Does that seem OK as-is after the additional context? I think that after
> reading your response, it seems to be confusing, so perhaps it should be
> changed...

Ah, OK. I totally missed that "path of a script" part. What you have is
correct, then, but I do wonder if we could make it less subtle.

Maybe something like:

  For example, if `/path/to/script` runs `git --git-dir="$1"
  for-each-ref --format='%(objectname)' refs/heads/`, then putting
  `/path/to/script` in `core.alternateRefsCommand` will show only the
  branch heads from the alternate.

I dunno. It's certainly clunkier. I wonder if we would be less awkward
to show the sample script in a fenced block, with the `#!/bin/sh` and
everything.

Or maybe just keep the text you have and add a note at the end like:

  Note that writing that `for-each-ref` command directly in the config
  option doesn't quite work, as it has to handle the path argument
  specially.

I don't think we need to hand-hold a user through the f() shell-snippet
trickery. I just don't want somebody thinking they can blindly paste
that into their config.

> > The other alternative is to pass $GIT_DIR in the environment on behalf
> > of the program. Then writing:
> >
> >   git for-each-ref --format='%(objectname)' refs/heads
> >
> > would Just Work. But it's a bit subtle, since it is not immediately
> > obvious that the command is meant to run in a different repository.
> 
> I think that we discussed this approach a bit off-list, and I had the
> idea that it was too fragile to work in practice, and that it would be
> too surprising for callers to suddenly be in a different world.
> 
> I say this not because it wouldn't make this particular scenario more
> convenient, which it uncountably would, but because it would make other
> scenarios _more_ complicated.
> 
> For example, if a caller uses an alternate reference backed, perhaps,
> MySQL (or anything that _isn't_ Git), they're not going to want to have
> these GIT_ environment variable set.

If they're not using Git under the hood, then GIT_* probably isn't
hurting anything. But it is still pretty subtle. Let's forget I
mentioned it.  Just chaining for-each-ref with a prefix is pretty
awkward, but that's why we have the next patch with
alternateRefsPrefixes.

Your response did make me think of one other thing, though. The
alternate file points to a directory with objects, and the
for_each_alternate_ref() code checks to see if that looks vaguely like
the objects/ directory of a git repo. But would anybody want to run
something like alternateRefsCommand on _just_ the object directory?
I.e., you don't have a real git repo there, but your script can
"somehow" come up with a list of valid tips.

That isn't inconceivable to me for the kind of multi-fork storage we do
at GitHub. E.g., imagine a shared object directory with no refs, and
then a script that goes out to the other related forks to look at their
ref tips. I don't think we have any immediate plans for it, though (and
there are a lot of subtle bits that I won't go into here that make it
non-trivial). So I'm OK to punt on it for now. I also think in a pinch
that you could easily fool the alternates code by just having a dummy
"refs/" directory.

> > > diff --git a/t/t5410-receive-pack.sh b/t/t5410-receive-pack.sh
> [...]
> > > +test_description='git receive-pack test'
> >
> > The name of this test file and the description are pretty vague. Can we
> > say something like "test handling of receive-pack with alternate-refs
> > config"?
> 
> I left it intentionally vague, since I'd like for it to contain more
> tests about 'git receive-pack'-specific things in the future.
> 
> I'm happy to change the name, though I wonder if we should change the
> filename accordingly, and if so, to what.

I think we'd want to have a separate script for other receive-pack tests
that aren't related to alternates. There's some startup overhead to each
script so we don't want to make them _too_ small, but there are benefits
to having small test scripts:

 - they're our unit of parallelism, so we want to be able to keep a
   reasonable number of processors full

 - each test script starts with a clean slate, so there's less chance
   for unexpected interactions between individual tests (e.g., when
   modifying or adding a test in the middle of the script)

 - it's less annoying when you're debugging a failing test near the end
   of a script ;)

I actually think we'd benefit from splitting up a few of the longer
scripts. On my quad-core laptop, running the tests in slow-to-fast order
keeps the processors pretty busy, and the slowest test takes less time
than the whole suite. But I've also tried running on a 40-core box. It
burns through the short tests quickly, but you can never get faster than
the slowest single test, which takes something like 35 seconds. So
instead of being 10 times faster, it's more like two times faster, as
most of the processors idle waiting for that one script to finish.

But that's all pretty tangential here. My point is just that this
probably ought to be remain its own script. :)

I'd probably name it "t5410-receive-pack-alternates" or similar.

> I'll wait until Monday to re-roll, just to make sure that there isn't
> any new feedback between now and then.

Sounds good. Thanks for working on this.

-Peff

Re: [PATCH v3 3/4] transport.c: introduce core.alternateRefsCommand

Reply via email to