Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-11-16 Thread SZEDER Gábor
On Fri, Nov 16, 2018 at 02:09:07PM +0900, Junio C Hamano wrote:
> Jeff King  writes:
> 
> > On Thu, Nov 15, 2018 at 04:38:44AM -0500, Jeff King wrote:
> >
> >> Is SOURCE_NONE a complete match for what we want?
> >> 
> >> I see problems in both directions:
> >> 
> >>  - sorting by "objectname" works now, but it's marked with SOURCE_OBJ,
> >>and would be forbidden with your patch.  I'm actually not sure if
> >>SOURCE_OBJ is accurate; we shouldn't need to access the object to
> >>show it (and we are probably wasting effort loading the full contents
> >>for tools like for-each-ref).
> >> 
> >>However, that's not the full story. For objectname:short, it _does_ call
> >>find_unique_abbrev(). So we expect to have an object directory.
> >
> > Oops, I'm apparently bad at reading. It is in fact SOURCE_OTHER, which
> > makes sense (outside of this whole "--sort outside a repo thing").
> >
> > But we'd ideally distinguish between "objectname" (which should be OK
> > outside a repo) and "objectname:short" (which currently segfaults).
> 
> Arguably, use of ref-filter machinery in ls-remote, whether it is
> given from inside or outside a repo, was a mistake in 1fb20dfd
> ("ls-remote: create '--sort' option", 2018-04-09), as the whole
> point of "ls-remote" is to peek the list of refs and it is perfectly
> normal that the objects listed are not available.

I hope that one day 'git ls-remote' will learn to '--format=...' its
output, and I think that (re)using the ref-filter machinery would be
the right way to go to achive that.  Sure, ref-filter supports a lot
of format specifiers that don't at all make sense in the context of
'ls-remote' (perhaps we should have a dedicated set of valid_atoms for
that), but I think it's perfectly reasonable to do something like:

  git ls-remote --format=%(refname:strip=2) remote

A concrete use case for that could be to eliminate the last remaining
shell loops from refs completion.



Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-11-16 Thread Junio C Hamano
Jeff King  writes:

> If you are arguing that even in a repo we should reject "authorname"
> early (just as we would outside of a repo), I could buy that.

Yup, that (and replace 'authorname' with anything that won't work
with missing objects) for consistency was what I meant.



Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-11-16 Thread Jeff King
On Fri, Nov 16, 2018 at 02:09:07PM +0900, Junio C Hamano wrote:

> >> I see problems in both directions:
> >> 
> >>  - sorting by "objectname" works now, but it's marked with SOURCE_OBJ,
> >>and would be forbidden with your patch.  I'm actually not sure if
> >>SOURCE_OBJ is accurate; we shouldn't need to access the object to
> >>show it (and we are probably wasting effort loading the full contents
> >>for tools like for-each-ref).
> >> 
> >>However, that's not the full story. For objectname:short, it _does_ call
> >>find_unique_abbrev(). So we expect to have an object directory.
> >
> > Oops, I'm apparently bad at reading. It is in fact SOURCE_OTHER, which
> > makes sense (outside of this whole "--sort outside a repo thing").
> >
> > But we'd ideally distinguish between "objectname" (which should be OK
> > outside a repo) and "objectname:short" (which currently segfaults).
> 
> Arguably, use of ref-filter machinery in ls-remote, whether it is
> given from inside or outside a repo, was a mistake in 1fb20dfd
> ("ls-remote: create '--sort' option", 2018-04-09), as the whole
> point of "ls-remote" is to peek the list of refs and it is perfectly
> normal that the objects listed are not available.

I think it's conceptually reasonable to use the ref-filter machinery.
It's just that it was underprepared to handle this out-of-repo case. I
think we're not too far off, though.

> "ls-remote --sort=authorname" that is run in a repository may not
> segfault on a ref that points at a yet-to-be-fetched commit, but it
> cannot be doing anything sensible.  Is it still better to silently
> produce a nonsense result than refusing to --sort no matter what the
> sort keys are, whether we are inside or outside a repository?

I don't think we produce silent nonsense in the current code (or after
any of the discussed solutions), either in a repo or out. We say "fatal:
missing object ..." inside a repo if the request cannot be fulfilled.
That's not incredibly illuminating, perhaps, but it means we fulfill
whatever we _can_ on behalf of the user's request, and bail otherwise.

If you are arguing that even in a repo we should reject "authorname"
early (just as we would outside of a repo), I could buy that.
Technically we can make it work sometimes (if we happen to have fetched
everything the other side has), but behaving consistently (and with a
decent error message) may trump that.

-Peff


Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-11-15 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Nov 15, 2018 at 04:38:44AM -0500, Jeff King wrote:
>
>> Is SOURCE_NONE a complete match for what we want?
>> 
>> I see problems in both directions:
>> 
>>  - sorting by "objectname" works now, but it's marked with SOURCE_OBJ,
>>and would be forbidden with your patch.  I'm actually not sure if
>>SOURCE_OBJ is accurate; we shouldn't need to access the object to
>>show it (and we are probably wasting effort loading the full contents
>>for tools like for-each-ref).
>> 
>>However, that's not the full story. For objectname:short, it _does_ call
>>find_unique_abbrev(). So we expect to have an object directory.
>
> Oops, I'm apparently bad at reading. It is in fact SOURCE_OTHER, which
> makes sense (outside of this whole "--sort outside a repo thing").
>
> But we'd ideally distinguish between "objectname" (which should be OK
> outside a repo) and "objectname:short" (which currently segfaults).

Arguably, use of ref-filter machinery in ls-remote, whether it is
given from inside or outside a repo, was a mistake in 1fb20dfd
("ls-remote: create '--sort' option", 2018-04-09), as the whole
point of "ls-remote" is to peek the list of refs and it is perfectly
normal that the objects listed are not available.

"ls-remote --sort=authorname" that is run in a repository may not
segfault on a ref that points at a yet-to-be-fetched commit, but it
cannot be doing anything sensible.  Is it still better to silently
produce a nonsense result than refusing to --sort no matter what the
sort keys are, whether we are inside or outside a repository?



Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-11-15 Thread Jeff King
On Thu, Nov 15, 2018 at 04:38:44AM -0500, Jeff King wrote:

> Is SOURCE_NONE a complete match for what we want?
> 
> I see problems in both directions:
> 
>  - sorting by "objectname" works now, but it's marked with SOURCE_OBJ,
>and would be forbidden with your patch.  I'm actually not sure if
>SOURCE_OBJ is accurate; we shouldn't need to access the object to
>show it (and we are probably wasting effort loading the full contents
>for tools like for-each-ref).
> 
>However, that's not the full story. For objectname:short, it _does_ call
>find_unique_abbrev(). So we expect to have an object directory.

Oops, I'm apparently bad at reading. It is in fact SOURCE_OTHER, which
makes sense (outside of this whole "--sort outside a repo thing").

But we'd ideally distinguish between "objectname" (which should be OK
outside a repo) and "objectname:short" (which currently segfaults).

-Peff


Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-11-15 Thread Jeff King
On Wed, Nov 14, 2018 at 01:27:25PM +0100, SZEDER Gábor wrote:

> The command 'git ls-remote --sort=authordate ' segfaults when
> run outside of a repository, ever since the introduction of its
> '--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
> 2018-04-09).
> 
> While in general the 'git ls-remote' command can be run outside of a
> repository just fine, its '--sort=' option with certain keys does
> require access to the referenced objects.  This sorting is implemented
> using the generic ref-filter sorting facility, which already handles
> missing objects gracefully with the appropriate 'missing object
> deadbeef for HEAD' message.  However, being generic means that it
> checks replace refs while trying to retrieve an object, and while
> doing so it accesses the 'git_replace_ref_base' variable, which has
> not been initialized and is still a NULL pointer when outside of a
> repository, thus causing the segfault.
> 
> Make ref-filter more careful upfront while parsing the format string,
> and make it error out when encountering a format atom requiring object
> access when we are not in a repository.  Also add a test to ensure
> that 'git ls-remote --sort' fails gracefully when executed outside of
> a repository.

Thanks for picking up this loose end. I like the general approach here,
but...

> diff --git a/ref-filter.c b/ref-filter.c
> index 0c45ed9d94..a1290659af 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -534,6 +534,10 @@ static int parse_ref_filter_atom(const struct ref_format 
> *format,
>   if (ARRAY_SIZE(valid_atom) <= i)
>   return strbuf_addf_ret(err, -1, _("unknown field name: %.*s"),
>  (int)(ep-atom), atom);
> + if (valid_atom[i].source != SOURCE_NONE && !have_git_dir())
> + return strbuf_addf_ret(err, -1,
> +_("not a git repository, but the field 
> '%.*s' requires access to object data"),
> +(int)(ep-atom), atom);

Is SOURCE_NONE a complete match for what we want?

I see problems in both directions:

 - sorting by "objectname" works now, but it's marked with SOURCE_OBJ,
   and would be forbidden with your patch.  I'm actually not sure if
   SOURCE_OBJ is accurate; we shouldn't need to access the object to
   show it (and we are probably wasting effort loading the full contents
   for tools like for-each-ref).

   However, that's not the full story. For objectname:short, it _does_ call
   find_unique_abbrev(). So we expect to have an object directory.

 - sorting by "HEAD" hits a BUG(), and would still be allowed with your
   patch.

So I like the idea here that the particular atoms would tell us whether
they're going to need to be in a repository or not, but I think the
annotations have to be cleaned up first.

> diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
> index 91ee6841c1..32e722db2e 100755
> --- a/t/t5512-ls-remote.sh
> +++ b/t/t5512-ls-remote.sh
> @@ -302,6 +302,12 @@ test_expect_success 'ls-remote works outside repository' 
> '
>   nongit git ls-remote dst.git
>  '
>  
> +test_expect_success 'ls-remote --sort fails gracefully outside repository' '
> + # Use a sort key that requires access to the referenced objects.
> + nongit test_must_fail git ls-remote --sort=authordate 
> "$TRASH_DIRECTORY" 2>err &&
> + test_i18ngrep "^fatal: not a git repository, but the field 
> '\''authordate'\'' requires access to object data" err
> +'

Regardless of our solution, we probably want to add an extra test making
sure that something vanilla like:

  nongit git ls-remote --sort=v:refname "$TRASH_DIRECTORY"

continues to work (we do test ls-remote outside a repo already, but not
with a sort specifier).

-Peff


[PATCH] ref-filter: don't look for objects when outside of a repository

2018-11-14 Thread SZEDER Gábor
The command 'git ls-remote --sort=authordate ' segfaults when
run outside of a repository, ever since the introduction of its
'--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
2018-04-09).

While in general the 'git ls-remote' command can be run outside of a
repository just fine, its '--sort=' option with certain keys does
require access to the referenced objects.  This sorting is implemented
using the generic ref-filter sorting facility, which already handles
missing objects gracefully with the appropriate 'missing object
deadbeef for HEAD' message.  However, being generic means that it
checks replace refs while trying to retrieve an object, and while
doing so it accesses the 'git_replace_ref_base' variable, which has
not been initialized and is still a NULL pointer when outside of a
repository, thus causing the segfault.

Make ref-filter more careful upfront while parsing the format string,
and make it error out when encountering a format atom requiring object
access when we are not in a repository.  Also add a test to ensure
that 'git ls-remote --sort' fails gracefully when executed outside of
a repository.

Reported-by: H.Merijn Brand 
Signed-off-by: SZEDER Gábor 
---

On Tue, Sep 25, 2018 at 01:57:38PM -0700, Junio C Hamano wrote:
> SZEDER Gábor  writes:
> 
> > However, if we go for a more informative error message, then wouldn't
> > it be better to add this condition in populate_value() before it even
> > calls get_object()?  Then we could also add the problematic format
> > specifier to the error message (I think, but didn't actually check),
> > just in case someone specified multiple sort keys.
> 
> Even though I suspect that verify_ref_format() is the logically the
> right place to do this (after all, it is about seeing if the format
> makes sense, and a format that requires an object access used
> outside a repository should trigger an verification error), doing
> that in populate_value() probably strikes the best balance, I would
> think.

We are dealing with format specifiers used for sorting here, and those
don't go through verify_ref_format().

So how about this patch instead?

I think it will catch all cases where a user would try to use a format
specifier, for any purpose, requiring object access outside of a
repository (though I don't know whether there are any other cases
besides 'git ls-remote --sort=...'; but perhaps in the future
'ls-remote' will get a '--format' option as well), and it does so
before performing a potentially expensive query to the remote.  OTOH,
it won't change the documented "missing object" error message when run
inside a repo but the necessary object is indeed missing.


 ref-filter.c | 4 
 t/t5512-ls-remote.sh | 6 ++
 2 files changed, 10 insertions(+)

diff --git a/ref-filter.c b/ref-filter.c
index 0c45ed9d94..a1290659af 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -534,6 +534,10 @@ static int parse_ref_filter_atom(const struct ref_format 
*format,
if (ARRAY_SIZE(valid_atom) <= i)
return strbuf_addf_ret(err, -1, _("unknown field name: %.*s"),
   (int)(ep-atom), atom);
+   if (valid_atom[i].source != SOURCE_NONE && !have_git_dir())
+   return strbuf_addf_ret(err, -1,
+  _("not a git repository, but the field 
'%.*s' requires access to object data"),
+  (int)(ep-atom), atom);
 
/* Add it in, including the deref prefix */
at = used_atom_cnt;
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index 91ee6841c1..32e722db2e 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -302,6 +302,12 @@ test_expect_success 'ls-remote works outside repository' '
nongit git ls-remote dst.git
 '
 
+test_expect_success 'ls-remote --sort fails gracefully outside repository' '
+   # Use a sort key that requires access to the referenced objects.
+   nongit test_must_fail git ls-remote --sort=authordate 
"$TRASH_DIRECTORY" 2>err &&
+   test_i18ngrep "^fatal: not a git repository, but the field 
'\''authordate'\'' requires access to object data" err
+'
+
 test_expect_success 'ls-remote patterns work with all protocol versions' '
git for-each-ref --format="%(objectname)%(refname)" \
refs/heads/master refs/remotes/origin/master >expect &&
-- 
2.19.1.1182.gbfcc7ed3e6



Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-25 Thread Junio C Hamano
SZEDER Gábor  writes:

> However, if we go for a more informative error message, then wouldn't
> it be better to add this condition in populate_value() before it even
> calls get_object()?  Then we could also add the problematic format
> specifier to the error message (I think, but didn't actually check),
> just in case someone specified multiple sort keys.

Even though I suspect that verify_ref_format() is the logically the
right place to do this (after all, it is about seeing if the format
makes sense, and a format that requires an object access used
outside a repository should trigger an verification error), doing
that in populate_value() probably strikes the best balance, I would
think.



Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-24 Thread Jeff King
On Mon, Sep 24, 2018 at 11:20:34PM +0200, SZEDER Gábor wrote:

> > Would we perhaps want to give the user a hint that the object is not
> > really missing, but rather that we're not in a repository? E.g.,
> > something like:
> > 
> >   if (!have_git_dir())
> > return strbuf_addf_ret(err, -1, "format specifier requires a 
> > repository");
> >   if (oid_object_info_extended(...))
> > return ...;
> > 
> > ?
> 
> I think it makes sense.
> 
> I wanted to preserve the error message, because the description of
> '--sort=' in 'Documentation/git-ls-remote.txt' explicitly
> mentions it, and I added the condition at this place because I didn't
> want to duplicate the construction of the error message.

Ah, I didn't realize we actually documented that. And perhaps it is more
consistent, too: you'd get different results from running "ls-remote"
outside a repository versus one that just doesn't have the objects from
the other side.

> However, if we go for a more informative error message, then wouldn't
> it be better to add this condition in populate_value() before it even
> calls get_object()?  Then we could also add the problematic format
> specifier to the error message (I think, but didn't actually check),
> just in case someone specified multiple sort keys.

Yeah, that probably would be a better place. Though your response also
has made me think that maybe just sticking with the "missing object"
response is reasonable. I don't have a strong opinion between the two.

-Peff


Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-24 Thread SZEDER Gábor
On Mon, Sep 24, 2018 at 02:17:22PM -0400, Jeff King wrote:
> On Sat, Sep 22, 2018 at 04:11:45PM +0200, SZEDER Gábor wrote:
 
> > diff --git a/ref-filter.c b/ref-filter.c
> > index e1bcb4ca8a..3555bc29e7 100644
> > --- a/ref-filter.c
> > +++ b/ref-filter.c
> > @@ -1473,7 +1473,8 @@ static int get_object(struct ref_array_item *ref, int 
> > deref, struct object **obj
> > oi->info.sizep = >size;
> > oi->info.typep = >type;
> > }
> > -   if (oid_object_info_extended(the_repository, >oid, >info,
> > +   if (!have_git_dir() ||
> > +   oid_object_info_extended(the_repository, >oid, >info,
> >  OBJECT_INFO_LOOKUP_REPLACE))
> > return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
> >oid_to_hex(>oid), ref->refname);
> 
> Would we perhaps want to give the user a hint that the object is not
> really missing, but rather that we're not in a repository? E.g.,
> something like:
> 
>   if (!have_git_dir())
>   return strbuf_addf_ret(err, -1, "format specifier requires a 
> repository");
>   if (oid_object_info_extended(...))
>   return ...;
> 
> ?

I think it makes sense.

I wanted to preserve the error message, because the description of
'--sort=' in 'Documentation/git-ls-remote.txt' explicitly
mentions it, and I added the condition at this place because I didn't
want to duplicate the construction of the error message.

However, if we go for a more informative error message, then wouldn't
it be better to add this condition in populate_value() before it even
calls get_object()?  Then we could also add the problematic format
specifier to the error message (I think, but didn't actually check),
just in case someone specified multiple sort keys.




Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-24 Thread Jeff King
On Sat, Sep 22, 2018 at 04:11:45PM +0200, SZEDER Gábor wrote:

> The command 'git ls-remote --sort=authordate ' segfaults when
> run outside of a repository, ever since the introduction of its
> '--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
> 2018-04-09).
> 
> While in general the 'git ls-remote' command can be run outside of a
> repository just fine, its '--sort=' option with certain keys does
> require access to the referenced objects.  This sorting is implemented
> using the generic ref-filter sorting facility, which already handles
> missing objects gracefully with the appropriate 'missing object
> deadbeef for HEAD' message.  However, being generic means that it
> checks replace refs while trying to retrieve an object, and while
> doing so it accesses the 'git_replace_ref_base' variable, which has
> not been initialized and is still a NULL pointer when outside of a
> repository, thus causing the segfault.
> 
> Make ref-filter more careful and only attempt to retrieve an object
> when we are in a repository.  Also add a test to ensure that 'git
> ls-remote --sort' fails gracefully when executed outside of a
> repository.

This all makes sense, and I think your fix is going in the right
direction.

But...

> I'm not quite sure that this is the best place to add this check...
> but hey, it's a Saturday afternoon after all ;)

I also wonder about this. For refs, we already catch these cases at a
low-level and BUG(). That's better than a segfault, and I suspect we
should be doing the same here in oid_object_info_extended(). But that
just shifts the segfault to a BUG().

For the refs code, we've generally tried to catch things at a high-level
and report a more human-friendly error explaining the situation. So
doing the same thing here would mean adding code to ls-remote. But I
think the plumbing gets pretty tricky, since it has no way to ask
ref-filter "hey, are we doing to need to look at objects?".

That's a thing that I think ref-filter _should_ support (it knows it
after having parsed the format string). But it probably ought to come
along with other refactoring, and shouldn't hold up this fix.

So this probably _is_ a reasonable place to check it. However...

> diff --git a/ref-filter.c b/ref-filter.c
> index e1bcb4ca8a..3555bc29e7 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1473,7 +1473,8 @@ static int get_object(struct ref_array_item *ref, int 
> deref, struct object **obj
>   oi->info.sizep = >size;
>   oi->info.typep = >type;
>   }
> - if (oid_object_info_extended(the_repository, >oid, >info,
> + if (!have_git_dir() ||
> + oid_object_info_extended(the_repository, >oid, >info,
>OBJECT_INFO_LOOKUP_REPLACE))
>   return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
>  oid_to_hex(>oid), ref->refname);

Would we perhaps want to give the user a hint that the object is not
really missing, but rather that we're not in a repository? E.g.,
something like:

  if (!have_git_dir())
return strbuf_addf_ret(err, -1, "format specifier requires a 
repository");
  if (oid_object_info_extended(...))
return ...;

?

-Peff


Re: [PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-24 Thread Junio C Hamano
SZEDER Gábor  writes:

> The command 'git ls-remote --sort=authordate ' segfaults when
> run outside of a repository, ever since the introduction of its
> '--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
> 2018-04-09).
>
> While in general the 'git ls-remote' command can be run outside of a
> repository just fine, its '--sort=' option with certain keys does
> require access to the referenced objects.  This sorting is implemented
> using the generic ref-filter sorting facility, which already handles
> missing objects gracefully with the appropriate 'missing object
> deadbeef for HEAD' message.  However, being generic means that it
> checks replace refs while trying to retrieve an object, and while
> doing so it accesses the 'git_replace_ref_base' variable, which has
> not been initialized and is still a NULL pointer when outside of a
> repository, thus causing the segfault.
>
> Make ref-filter more careful and only attempt to retrieve an object
> when we are in a repository.  Also add a test to ensure that 'git
> ls-remote --sort' fails gracefully when executed outside of a
> repository.

OK.  So by forcing get_object() return an error, we do the same to
populate_value() which in turn will make get_ref_atgom_value return
an error and cmp_ref_sorting() will notice and die.

I think that is the best we could do.

>
> Reported-by: H.Merijn Brand 
> Signed-off-by: SZEDER Gábor 
> ---
>
> I'm not quite sure that this is the best place to add this check...
> but hey, it's a Saturday afternoon after all ;)
>
>  ref-filter.c | 3 ++-
>  t/t5512-ls-remote.sh | 6 ++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/ref-filter.c b/ref-filter.c
> index e1bcb4ca8a..3555bc29e7 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1473,7 +1473,8 @@ static int get_object(struct ref_array_item *ref, int 
> deref, struct object **obj
>   oi->info.sizep = >size;
>   oi->info.typep = >type;
>   }
> - if (oid_object_info_extended(the_repository, >oid, >info,
> + if (!have_git_dir() ||
> + oid_object_info_extended(the_repository, >oid, >info,
>OBJECT_INFO_LOOKUP_REPLACE))
>   return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
>  oid_to_hex(>oid), ref->refname);
> diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
> index bc5703ff9b..7dd081da01 100755
> --- a/t/t5512-ls-remote.sh
> +++ b/t/t5512-ls-remote.sh
> @@ -302,4 +302,10 @@ test_expect_success 'ls-remote works outside repository' 
> '
>   nongit git ls-remote dst.git
>  '
>  
> +test_expect_success 'ls-remote --sort fails gracefully outside repository' '
> + # Use a sort key that requires access to the referenced objects.
> + nongit test_must_fail git ls-remote --sort=authordate 
> "$TRASH_DIRECTORY" 2>err &&
> + test_i18ngrep "^fatal: missing object" err
> +'
> +
>  test_done


[PATCH] ref-filter: don't look for objects when outside of a repository

2018-09-22 Thread SZEDER Gábor
The command 'git ls-remote --sort=authordate ' segfaults when
run outside of a repository, ever since the introduction of its
'--sort' option in 1fb20dfd8e (ls-remote: create '--sort' option,
2018-04-09).

While in general the 'git ls-remote' command can be run outside of a
repository just fine, its '--sort=' option with certain keys does
require access to the referenced objects.  This sorting is implemented
using the generic ref-filter sorting facility, which already handles
missing objects gracefully with the appropriate 'missing object
deadbeef for HEAD' message.  However, being generic means that it
checks replace refs while trying to retrieve an object, and while
doing so it accesses the 'git_replace_ref_base' variable, which has
not been initialized and is still a NULL pointer when outside of a
repository, thus causing the segfault.

Make ref-filter more careful and only attempt to retrieve an object
when we are in a repository.  Also add a test to ensure that 'git
ls-remote --sort' fails gracefully when executed outside of a
repository.

Reported-by: H.Merijn Brand 
Signed-off-by: SZEDER Gábor 
---

I'm not quite sure that this is the best place to add this check...
but hey, it's a Saturday afternoon after all ;)

 ref-filter.c | 3 ++-
 t/t5512-ls-remote.sh | 6 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/ref-filter.c b/ref-filter.c
index e1bcb4ca8a..3555bc29e7 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -1473,7 +1473,8 @@ static int get_object(struct ref_array_item *ref, int 
deref, struct object **obj
oi->info.sizep = >size;
oi->info.typep = >type;
}
-   if (oid_object_info_extended(the_repository, >oid, >info,
+   if (!have_git_dir() ||
+   oid_object_info_extended(the_repository, >oid, >info,
 OBJECT_INFO_LOOKUP_REPLACE))
return strbuf_addf_ret(err, -1, _("missing object %s for %s"),
   oid_to_hex(>oid), ref->refname);
diff --git a/t/t5512-ls-remote.sh b/t/t5512-ls-remote.sh
index bc5703ff9b..7dd081da01 100755
--- a/t/t5512-ls-remote.sh
+++ b/t/t5512-ls-remote.sh
@@ -302,4 +302,10 @@ test_expect_success 'ls-remote works outside repository' '
nongit git ls-remote dst.git
 '
 
+test_expect_success 'ls-remote --sort fails gracefully outside repository' '
+   # Use a sort key that requires access to the referenced objects.
+   nongit test_must_fail git ls-remote --sort=authordate 
"$TRASH_DIRECTORY" 2>err &&
+   test_i18ngrep "^fatal: missing object" err
+'
+
 test_done
-- 
2.19.0.355.geb876cd9d6