Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-13 Thread Brandon Williams
On 03/02, Junio C Hamano wrote:
> Brandon Williams  writes:
> 
> > + ls-refs
> > +-
> > +
> > +`ls-refs` is the command used to request a reference advertisement in v2.
> > +Unlike the current reference advertisement, ls-refs takes in arguments
> > +which can be used to limit the refs sent from the server.
> 
> OK.
> 
> > +Additional features not supported in the base command will be advertised
> > +as the value of the command in the capability advertisement in the form
> > +of a space separated list of features, e.g.  "=
> > +".
> 
> Doesn't this explain the general convention that applies to any
> command, not just ls-refs command?  As a part of ls-refs section,
>  in the above explanation is always a constant "ls-refs",
> right?
> 
> It is a bit unclear how  in the above description are
> related to "arguments" in the following paragraph.  Do the server
> that can show symref and peeled tags and that can limit the output
> with ref-pattern advertise these three as supported features, i.e.
> 
>   ls-refs=symrefs peel ref-pattern
> 
> or something?  Would there a case where a "feature" does not
> correspond 1:1 to an argument to the command, and if so how would
> the server and the client negotiate use of such a feature?

I mention earlier in the document that the values of each capability are
to be defined by the capability itself, so I'm just documenting what the
value advertised means.  And a feature could mean a couple things and
doesn't necessarily mean it affects the arguments which can be provided,
and it definitely doesn't mean that each argument that can be provided
must be advertised as a feature.  If you look at the patch that
introduces shallow, the shallow feature adds lots of arguments that a
client can that use in its request.

> 
> > +ref-pattern 
> > +   When specified, only references matching one of the provided
> > +   patterns are displayed.  A pattern is either a valid refname
> > +   (e.g.  refs/heads/master), in which a ref must match the pattern
> > +   exactly, or a prefix of a ref followed by a single '*' wildcard
> > +   character (e.g. refs/heads/*), in which a ref must have a prefix
> > +   equal to the pattern up to the wildcard character.
> 
> I thought the recent concensus was left-anchored prefix match that
> honors /-directory boundary, i.e. no explicit asterisk and just
> saying "refs/heads" is enough to match "refs/heads" itself and
> "refs/heads/master" but not "refs/headscarf"?

I don't think there was a consensus at the time, but in the next
revision I'll have them be prefixes instead of containing wildcards.

-- 
Brandon Williams


Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-13 Thread Brandon Williams
On 03/05, Jeff King wrote:
> On Mon, Mar 05, 2018 at 10:21:55AM -0800, Brandon Williams wrote:
> 
> > > Hmm, so this would accept stuff like "refs/heads/*/foo" but quietly
> > > ignore the "/foo" part.
> > 
> > Yeah that's true...this should probably not do that.  Since
> > "refs/heads/*/foo" violates what the spec is, really this should error
> > out as an invalid pattern.
> 
> Yeah, that would be better, I think.
> 
> > > It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
> > > it's worth going for the most-restrictive thing to start with, since
> > > that enables a lot more server operations without worrying about
> > > breaking compatibility.
> > 
> > And just to clarify what do you see as being the most-restrictive case
> > of patterns that would should use?
> 
> I mean only accepting "*" at a "/" boundary (or just allowing a trailing
> slash to imply recursion, like "refs/heads/", or even just always
> assuming recursion to allow "refs/heads").

For simplicity I'll change ref-patterns to be ref-prefixes where
a ref must start_with() one of the provided ref-prefixes.  Clients won't
send '*'s either but can send everything upto but not including the '*'
as a prefix.

-- 
Brandon Williams


Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-05 Thread Jeff King
On Mon, Mar 05, 2018 at 10:29:14AM -0800, Jonathan Nieder wrote:

> >> It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
> >> it's worth going for the most-restrictive thing to start with, since
> >> that enables a lot more server operations without worrying about
> >> breaking compatibility.
> >
> > And just to clarify what do you see as being the most-restrictive case
> > of patterns that would should use?
> 
> Peff, can you say a little more about the downsides of accepting
> refs/h*?
> 
> IIRC the "git push" command already accepts such refspecs, so there's a
> benefit to accepting them.  Reftable and packed-refs support such
> queries about as efficiently as refs/heads/*.  For loose refs, readdir
> doesn't provide a way to restrict which files you look at, but loose
> refs are always slow anyway. :)
> 
> In other words, I see real benefits and I don't see much in the way of
> costs, so I'm not seeing why not to support this.

"git for-each-ref" only handles "/" boundaries. I think we used to have
similar problems with the internal for_each_ref(), but I just checked
and I think it's more flexible these days.  One could imagine a more
trie-like storage, though I agree that is stretching it with a
hypothetical.

Mostly my point was that I don't see any big upside, and the choice
seemed rather arbitrary. And as it is generally easier to loosen the
patterns later than tighten them, it makes sense to go with the tightest
option at first unless there is a compelling reason not to.

-Peff


Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-05 Thread Jeff King
On Mon, Mar 05, 2018 at 10:21:55AM -0800, Brandon Williams wrote:

> > Hmm, so this would accept stuff like "refs/heads/*/foo" but quietly
> > ignore the "/foo" part.
> 
> Yeah that's true...this should probably not do that.  Since
> "refs/heads/*/foo" violates what the spec is, really this should error
> out as an invalid pattern.

Yeah, that would be better, I think.

> > It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
> > it's worth going for the most-restrictive thing to start with, since
> > that enables a lot more server operations without worrying about
> > breaking compatibility.
> 
> And just to clarify what do you see as being the most-restrictive case
> of patterns that would should use?

I mean only accepting "*" at a "/" boundary (or just allowing a trailing
slash to imply recursion, like "refs/heads/", or even just always
assuming recursion to allow "refs/heads").

-Peff


Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-05 Thread Jonathan Nieder
Hi,

On Mon, Mar 05, 2018 at 10:21:55AM -0800, Brandon Williams wrote:
> On 03/02, Jeff King wrote:

>> It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
>> it's worth going for the most-restrictive thing to start with, since
>> that enables a lot more server operations without worrying about
>> breaking compatibility.
>
> And just to clarify what do you see as being the most-restrictive case
> of patterns that would should use?

Peff, can you say a little more about the downsides of accepting
refs/h*?

IIRC the "git push" command already accepts such refspecs, so there's a
benefit to accepting them.  Reftable and packed-refs support such
queries about as efficiently as refs/heads/*.  For loose refs, readdir
doesn't provide a way to restrict which files you look at, but loose
refs are always slow anyway. :)

In other words, I see real benefits and I don't see much in the way of
costs, so I'm not seeing why not to support this.

Thanks,
Jonathan


Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-05 Thread Brandon Williams
On 03/02, Jeff King wrote:
> On Wed, Feb 28, 2018 at 03:22:30PM -0800, Brandon Williams wrote:
> 
> > +static void add_pattern(struct pattern_list *patterns, const char *pattern)
> > +{
> > +   struct ref_pattern p;
> > +   const char *wildcard;
> > +
> > +   p.pattern = strdup(pattern);
> 
> xstrdup?
> 
> > +   wildcard = strchr(pattern, '*');
> > +   if (wildcard) {
> > +   p.wildcard_pos = wildcard - pattern;
> > +   } else {
> > +   p.wildcard_pos = -1;
> > +   }
> 
> Hmm, so this would accept stuff like "refs/heads/*/foo" but quietly
> ignore the "/foo" part.

Yeah that's true...this should probably not do that.  Since
"refs/heads/*/foo" violates what the spec is, really this should error
out as an invalid pattern.

> 
> It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
> it's worth going for the most-restrictive thing to start with, since
> that enables a lot more server operations without worrying about
> breaking compatibility.

And just to clarify what do you see as being the most-restrictive case
of patterns that would should use?

-- 
Brandon Williams


Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-02 Thread Jeff King
On Wed, Feb 28, 2018 at 03:22:30PM -0800, Brandon Williams wrote:

> +static void add_pattern(struct pattern_list *patterns, const char *pattern)
> +{
> + struct ref_pattern p;
> + const char *wildcard;
> +
> + p.pattern = strdup(pattern);

xstrdup?

> + wildcard = strchr(pattern, '*');
> + if (wildcard) {
> + p.wildcard_pos = wildcard - pattern;
> + } else {
> + p.wildcard_pos = -1;
> + }

Hmm, so this would accept stuff like "refs/heads/*/foo" but quietly
ignore the "/foo" part.

It also accepts "refs/h*" to get "refs/heads" and "refs/hello".  I think
it's worth going for the most-restrictive thing to start with, since
that enables a lot more server operations without worrying about
breaking compatibility.

-Peff


Re: [PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-03-02 Thread Junio C Hamano
Brandon Williams  writes:

> + ls-refs
> +-
> +
> +`ls-refs` is the command used to request a reference advertisement in v2.
> +Unlike the current reference advertisement, ls-refs takes in arguments
> +which can be used to limit the refs sent from the server.

OK.

> +Additional features not supported in the base command will be advertised
> +as the value of the command in the capability advertisement in the form
> +of a space separated list of features, e.g.  "=
> +".

Doesn't this explain the general convention that applies to any
command, not just ls-refs command?  As a part of ls-refs section,
 in the above explanation is always a constant "ls-refs",
right?

It is a bit unclear how  in the above description are
related to "arguments" in the following paragraph.  Do the server
that can show symref and peeled tags and that can limit the output
with ref-pattern advertise these three as supported features, i.e.

ls-refs=symrefs peel ref-pattern

or something?  Would there a case where a "feature" does not
correspond 1:1 to an argument to the command, and if so how would
the server and the client negotiate use of such a feature?

> +ref-pattern 
> + When specified, only references matching one of the provided
> + patterns are displayed.  A pattern is either a valid refname
> + (e.g.  refs/heads/master), in which a ref must match the pattern
> + exactly, or a prefix of a ref followed by a single '*' wildcard
> + character (e.g. refs/heads/*), in which a ref must have a prefix
> + equal to the pattern up to the wildcard character.

I thought the recent concensus was left-anchored prefix match that
honors /-directory boundary, i.e. no explicit asterisk and just
saying "refs/heads" is enough to match "refs/heads" itself and
"refs/heads/master" but not "refs/headscarf"?


[PATCH v4 13/35] ls-refs: introduce ls-refs server command

2018-02-28 Thread Brandon Williams
Introduce the ls-refs server command.  In protocol v2, the ls-refs
command is used to request the ref advertisement from the server.  Since
it is a command which can be requested (as opposed to mandatory in v1),
a client can sent a number of parameters in its request to limit the ref
advertisement based on provided ref-patterns.

Signed-off-by: Brandon Williams 
---
 Documentation/technical/protocol-v2.txt |  36 ++
 Makefile|   1 +
 ls-refs.c   | 144 
 ls-refs.h   |   9 ++
 serve.c |   8 ++
 t/t5701-git-serve.sh| 115 +++
 6 files changed, 313 insertions(+)
 create mode 100644 ls-refs.c
 create mode 100644 ls-refs.h

diff --git a/Documentation/technical/protocol-v2.txt 
b/Documentation/technical/protocol-v2.txt
index b02eefc21..7f50e6462 100644
--- a/Documentation/technical/protocol-v2.txt
+++ b/Documentation/technical/protocol-v2.txt
@@ -169,3 +169,39 @@ printable ASCII characters except space (i.e., the byte 
range 32 < x <
 "git/1.8.3.1"). The agent strings are purely informative for statistics
 and debugging purposes, and MUST NOT be used to programmatically assume
 the presence or absence of particular features.
+
+ ls-refs
+-
+
+`ls-refs` is the command used to request a reference advertisement in v2.
+Unlike the current reference advertisement, ls-refs takes in arguments
+which can be used to limit the refs sent from the server.
+
+Additional features not supported in the base command will be advertised
+as the value of the command in the capability advertisement in the form
+of a space separated list of features, e.g.  "=
+".
+
+ls-refs takes in the following arguments:
+
+symrefs
+   In addition to the object pointed by it, show the underlying ref
+   pointed by it when showing a symbolic ref.
+peel
+   Show peeled tags.
+ref-pattern 
+   When specified, only references matching one of the provided
+   patterns are displayed.  A pattern is either a valid refname
+   (e.g.  refs/heads/master), in which a ref must match the pattern
+   exactly, or a prefix of a ref followed by a single '*' wildcard
+   character (e.g. refs/heads/*), in which a ref must have a prefix
+   equal to the pattern up to the wildcard character.
+
+The output of ls-refs is as follows:
+
+output = *ref
+flush-pkt
+ref = PKT-LINE(obj-id SP refname *(SP ref-attribute) LF)
+ref-attribute = (symref | peeled)
+symref = "symref-target:" symref-target
+peeled = "peeled:" obj-id
diff --git a/Makefile b/Makefile
index 18c255428..e50927cfb 100644
--- a/Makefile
+++ b/Makefile
@@ -825,6 +825,7 @@ LIB_OBJS += list-objects-filter-options.o
 LIB_OBJS += ll-merge.o
 LIB_OBJS += lockfile.o
 LIB_OBJS += log-tree.o
+LIB_OBJS += ls-refs.o
 LIB_OBJS += mailinfo.o
 LIB_OBJS += mailmap.o
 LIB_OBJS += match-trees.o
diff --git a/ls-refs.c b/ls-refs.c
new file mode 100644
index 0..91d7deb34
--- /dev/null
+++ b/ls-refs.c
@@ -0,0 +1,144 @@
+#include "cache.h"
+#include "repository.h"
+#include "refs.h"
+#include "remote.h"
+#include "argv-array.h"
+#include "ls-refs.h"
+#include "pkt-line.h"
+
+struct ref_pattern {
+   char *pattern;
+   int wildcard_pos; /* If > 0, indicates the position of the wildcard */
+};
+
+struct pattern_list {
+   struct ref_pattern *patterns;
+   int nr;
+   int alloc;
+};
+
+static void add_pattern(struct pattern_list *patterns, const char *pattern)
+{
+   struct ref_pattern p;
+   const char *wildcard;
+
+   p.pattern = strdup(pattern);
+
+   wildcard = strchr(pattern, '*');
+   if (wildcard) {
+   p.wildcard_pos = wildcard - pattern;
+   } else {
+   p.wildcard_pos = -1;
+   }
+
+   ALLOC_GROW(patterns->patterns,
+  patterns->nr + 1,
+  patterns->alloc);
+   patterns->patterns[patterns->nr++] = p;
+}
+
+static void clear_patterns(struct pattern_list *patterns)
+{
+   int i;
+   for (i = 0; i < patterns->nr; i++)
+   free(patterns->patterns[i].pattern);
+   FREE_AND_NULL(patterns->patterns);
+   patterns->nr = 0;
+   patterns->alloc = 0;
+}
+
+/*
+ * Check if one of the patterns matches the tail part of the ref.
+ * If no patterns were provided, all refs match.
+ */
+static int ref_match(const struct pattern_list *patterns, const char *refname)
+{
+   int i;
+
+   if (!patterns->nr)
+   return 1; /* no restriction */
+
+   for (i = 0; i < patterns->nr; i++) {
+   const struct ref_pattern *p = &patterns->patterns[i];
+
+   /* No wildcard, exact match expected */
+   if (p->wildcard_pos < 0) {
+   if (!strcmp(refname, p->pattern))
+   return 1;
+   } else {
+   /* Wildcard,