Re: [RFC 0/7] transitioning to protocol v2

2017-09-01 Thread Bryan Turner
On Wed, Aug 30, 2017 at 2:12 PM, Brandon Williams  wrote:
> On 08/30, Bryan Turner wrote:
>> On Fri, Aug 25, 2017 at 10:29 AM, Jeff King  wrote:
>> > On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
>> >
>> >> The biggest question I'm trying to answer is if these are reasonable ways 
>> >> with
>> >> which to communicate a request to a server to use a newer protocol, 
>> >> without
>> >> breaking current servers/clients.  As far as I've tested, with patches 1-5
>> >> applied I can still communicate with current servers without causing any
>> >> problems.
>> >
>> > Current git.git servers, I assume?. How much do we want to care about
>> > alternate implementations? I would not be surprised if other git://
>> > implementations are more picky about cruft after the virtual-host field
>> > (though I double-checked GitHub's implementation at least, and it is
>> > fine).
>> >
>> > I don't think libgit2 implements the server side. That leaves probably
>> > JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
>> > and GitLab use.
>>
>> Before I manually apply the patches to test how they work with
>> Bitbucket Server, are they applied on a branch somewhere where I can
>> just fetch them? If not, I'll apply them manually and verify.
>
> I just pushed this set of patches up to: 
> https://github.com/bmwill/git/tree/protocol-v2
> so you should be able to fetch them from there (saves you from having to
> manually applying the patches).

Thanks for that! It made testing a lot simpler.

>
>> Just based on the description, though, I expect no issues. We don't
>> currently support the git:// protocol. Our HTTP handling passes
>> headers through to the receive-pack and upload-pack processes as
>> environment variables (with a little massaging), but doesn't consider
>> them itself; it only considers the URL and "service" query parameter
>> to decide what command to run and to detect "dumb" requests. Our SSH
>> handling ignores any environment variables provided and does not
>> forward them to the git process, similar to VSTS.
>>
>> I'll confirm explicitly, to be certain, but just based on reading the
>> overview and knowing our code I think the described approaches should
>> work fine.
>
> Perfect!  Thanks for taking the time to verify that this will work.

With 2 small tweaks on top of "protocol-v2", I was able to run our
full command and hosting (HTTP and SSH) test suites without issues.

diff --git a/transport.c b/transport.c
index c05e167..37b5d83 100644
--- a/transport.c
+++ b/transport.c
@@ -222,7 +222,8 @@ static struct ref *get_refs_via_connect(struct
transport *transport, int for_pus
switch(determine_version(data->fd[0], )) {
case 2:
/* The server understands Protocol v2 */
-   fprintf(stderr, "Server understands Protocol v2!\n");
+   if (transport->verbose >= 0)
+   fprintf(stderr, "Server understands Protocol v2!\n");
break;
case 1:
/* Server is speaking Protocol v1 and sent a ref so
process it */
diff --git a/upload-pack.c b/upload-pack.c
index 0f85315..7c59495 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -1075,7 +1075,7 @@ int cmd_main(int argc, const char **argv)
git_config(upload_pack_config, NULL);

version = getenv("GIT_PROTOCOL");
-   if (!strcmp(version, "2"))
+   if (version && !strcmp(version, "2"))
upload_pack_v2();

upload_pack();

I'd imagine the "Server understands Protocol v2!" message won't
actually *ship* in Git, so it's likely making that honor "--quiet"
doesn't actually matter; I only adjusted it because we have a test
that verifies "git clone --quiet" is quiet. The "if (version" change I
commented on separately in the 7/7 patch that introduced the check.

Thanks again for publishing this, and for letting me test it!

>
> --
> Brandon Williams


Re: [RFC 0/7] transitioning to protocol v2

2017-08-30 Thread Brandon Williams
On 08/30, Bryan Turner wrote:
> On Fri, Aug 25, 2017 at 10:29 AM, Jeff King  wrote:
> > On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
> >
> >> The biggest question I'm trying to answer is if these are reasonable ways 
> >> with
> >> which to communicate a request to a server to use a newer protocol, without
> >> breaking current servers/clients.  As far as I've tested, with patches 1-5
> >> applied I can still communicate with current servers without causing any
> >> problems.
> >
> > Current git.git servers, I assume?. How much do we want to care about
> > alternate implementations? I would not be surprised if other git://
> > implementations are more picky about cruft after the virtual-host field
> > (though I double-checked GitHub's implementation at least, and it is
> > fine).
> >
> > I don't think libgit2 implements the server side. That leaves probably
> > JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> > and GitLab use.
> 
> Before I manually apply the patches to test how they work with
> Bitbucket Server, are they applied on a branch somewhere where I can
> just fetch them? If not, I'll apply them manually and verify.

I just pushed this set of patches up to: 
https://github.com/bmwill/git/tree/protocol-v2
so you should be able to fetch them from there (saves you from having to
manually applying the patches).

> Just based on the description, though, I expect no issues. We don't
> currently support the git:// protocol. Our HTTP handling passes
> headers through to the receive-pack and upload-pack processes as
> environment variables (with a little massaging), but doesn't consider
> them itself; it only considers the URL and "service" query parameter
> to decide what command to run and to detect "dumb" requests. Our SSH
> handling ignores any environment variables provided and does not
> forward them to the git process, similar to VSTS.
> 
> I'll confirm explicitly, to be certain, but just based on reading the
> overview and knowing our code I think the described approaches should
> work fine.

Perfect!  Thanks for taking the time to verify that this will work.

-- 
Brandon Williams


Re: [RFC 0/7] transitioning to protocol v2

2017-08-30 Thread Bryan Turner
On Fri, Aug 25, 2017 at 10:29 AM, Jeff King  wrote:
> On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
>
>> The biggest question I'm trying to answer is if these are reasonable ways 
>> with
>> which to communicate a request to a server to use a newer protocol, without
>> breaking current servers/clients.  As far as I've tested, with patches 1-5
>> applied I can still communicate with current servers without causing any
>> problems.
>
> Current git.git servers, I assume?. How much do we want to care about
> alternate implementations? I would not be surprised if other git://
> implementations are more picky about cruft after the virtual-host field
> (though I double-checked GitHub's implementation at least, and it is
> fine).
>
> I don't think libgit2 implements the server side. That leaves probably
> JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> and GitLab use.

Before I manually apply the patches to test how they work with
Bitbucket Server, are they applied on a branch somewhere where I can
just fetch them? If not, I'll apply them manually and verify.

Just based on the description, though, I expect no issues. We don't
currently support the git:// protocol. Our HTTP handling passes
headers through to the receive-pack and upload-pack processes as
environment variables (with a little massaging), but doesn't consider
them itself; it only considers the URL and "service" query parameter
to decide what command to run and to detect "dumb" requests. Our SSH
handling ignores any environment variables provided and does not
forward them to the git process, similar to VSTS.

I'll confirm explicitly, to be certain, but just based on reading the
overview and knowing our code I think the described approaches should
work fine.

Best regards,
Bryan Turner


Re: [RFC 0/7] transitioning to protocol v2

2017-08-30 Thread Brandon Williams
On 08/30, Jeff Hostetler wrote:
> 
> 
> On 8/29/2017 11:06 PM, Jeff King wrote:
> >On Tue, Aug 29, 2017 at 04:08:25PM -0400, Jeff Hostetler wrote:
> >
> >>I just wanted to jump in here and say I've done some initial
> >>testing of this against VSTS and so far it seems fine.  And yes,
> >>we have a custom git server.
> >
> >Great, thank you for checking.
> >
> >>VSTS doesn't support the "git://" protocol, so the double-null trick
> >>isn't an issue for us.  But "https://; worked just fine.  I'm still
> >>asking around internally whether we support passing SSH environment
> >>variables.
> >
> >The key thing for ssh is not whether you support passing environment
> >variables. It's whether you quietly ignore unknown variables rather than
> >cutting off the connection.
> >
> >To support the v2 protocol you'd need to pass the new variables, but
> >you'd also need to modify your server to actually do something useful
> >with them anyway. At this point we're mostly concerned with whether we
> >can safely pass the variables to current implementations unconditionally
> >and get a reasonable outcome.
> 
> Right.  I just spoke with our server folks and, currently, our SSH
> support quietly eats ALL variables.   So we're safe :-)
> 
> I'm starting a conversation with them to pass them thru so we can
> be ready for this.  (Assuming we choose to go this way.)

Perfect! Thanks again.

-- 
Brandon Williams


Re: [RFC 0/7] transitioning to protocol v2

2017-08-30 Thread Jeff Hostetler



On 8/29/2017 11:06 PM, Jeff King wrote:

On Tue, Aug 29, 2017 at 04:08:25PM -0400, Jeff Hostetler wrote:


I just wanted to jump in here and say I've done some initial
testing of this against VSTS and so far it seems fine.  And yes,
we have a custom git server.


Great, thank you for checking.


VSTS doesn't support the "git://" protocol, so the double-null trick
isn't an issue for us.  But "https://; worked just fine.  I'm still
asking around internally whether we support passing SSH environment
variables.


The key thing for ssh is not whether you support passing environment
variables. It's whether you quietly ignore unknown variables rather than
cutting off the connection.

>

To support the v2 protocol you'd need to pass the new variables, but
you'd also need to modify your server to actually do something useful
with them anyway. At this point we're mostly concerned with whether we
can safely pass the variables to current implementations unconditionally
and get a reasonable outcome.


Right.  I just spoke with our server folks and, currently, our SSH
support quietly eats ALL variables.   So we're safe :-)

I'm starting a conversation with them to pass them thru so we can
be ready for this.  (Assuming we choose to go this way.)

Thanks,
Jeff




Re: [RFC 0/7] transitioning to protocol v2

2017-08-29 Thread Jeff King
On Tue, Aug 29, 2017 at 04:08:25PM -0400, Jeff Hostetler wrote:

> I just wanted to jump in here and say I've done some initial
> testing of this against VSTS and so far it seems fine.  And yes,
> we have a custom git server.

Great, thank you for checking.

> VSTS doesn't support the "git://" protocol, so the double-null trick
> isn't an issue for us.  But "https://; worked just fine.  I'm still
> asking around internally whether we support passing SSH environment
> variables.

The key thing for ssh is not whether you support passing environment
variables. It's whether you quietly ignore unknown variables rather than
cutting off the connection.

To support the v2 protocol you'd need to pass the new variables, but
you'd also need to modify your server to actually do something useful
with them anyway. At this point we're mostly concerned with whether we
can safely pass the variables to current implementations unconditionally
and get a reasonable outcome.

To be honest, I'm not all that worried about VSTS either way. It's a
centralized service which will likely get v2 extensions implemented in a
timely manner (once they exist). I'm much more concerned about
shrink-wrap implementations deployed in the wild, which may hang around
without being upgraded for years. If v2-aware clients send requests that
cause those old implementations to choke, users won't be happy.

-Peff


Re: [RFC 0/7] transitioning to protocol v2

2017-08-29 Thread Brandon Williams
On 08/29, Jeff Hostetler wrote:
> 
> 
> On 8/25/2017 1:35 PM, Jonathan Nieder wrote:
> >Hi,
> >
> >Jeff King wrote:
> >>On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:
> >
> >>>Another version of Git's wire protocol is a topic that has been discussed 
> >>>and
> >>>attempted by many in the community over the years.  The biggest challenge, 
> >>>as
> >>>far as I understand, has been coming up with a transition plan to using 
> >>>the new
> >>>server without breaking existing clients and servers.  As such this RFC is
> >>>really only concerned with solidifying a transition plan.  Once it has been
> >>>decided how we can transition to a new protocol we can get into decided 
> >>>what
> >>>this new protocol would look like (though it would obviously eliminate the 
> >>>ref
> >>>advertisement ;).
> >>
> >
> >>I don't think libgit2 implements the server side. That leaves probably
> >>JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> >>and GitLab use.
> >
> >I'd be happy if someone tests the patches against those. :)
> 
> I just wanted to jump in here and say I've done some initial
> testing of this against VSTS and so far it seems fine.  And yes,
> we have a custom git server.
> 
> VSTS doesn't support the "git://" protocol, so the double-null trick
> isn't an issue for us.  But "https://; worked just fine.  I'm still
> asking around internally whether we support passing SSH environment
> variables.
> 
> Jeff
> 

Thanks for checking on this, I really appreciate it.  Please let me know
if anything I haven't thought of becomes an issue.

I'm currently working on getting these patches into a more polished
state to be used (as discussed elsewhere on this thread) as a precursor
to an actual v2.

-- 
Brandon Williams


Re: [RFC 0/7] transitioning to protocol v2

2017-08-29 Thread Jeff Hostetler



On 8/25/2017 1:35 PM, Jonathan Nieder wrote:

Hi,

Jeff King wrote:

On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:



Another version of Git's wire protocol is a topic that has been discussed and
attempted by many in the community over the years.  The biggest challenge, as
far as I understand, has been coming up with a transition plan to using the new
server without breaking existing clients and servers.  As such this RFC is
really only concerned with solidifying a transition plan.  Once it has been
decided how we can transition to a new protocol we can get into decided what
this new protocol would look like (though it would obviously eliminate the ref
advertisement ;).





I don't think libgit2 implements the server side. That leaves probably
JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
and GitLab use.


I'd be happy if someone tests the patches against those. :)


I just wanted to jump in here and say I've done some initial
testing of this against VSTS and so far it seems fine.  And yes,
we have a custom git server.

VSTS doesn't support the "git://" protocol, so the double-null trick
isn't an issue for us.  But "https://; worked just fine.  I'm still
asking around internally whether we support passing SSH environment
variables.

Jeff



Re: [RFC 0/7] transitioning to protocol v2

2017-08-25 Thread Brandon Williams
On 08/25, Jeff King wrote:
> On Fri, Aug 25, 2017 at 10:35:50AM -0700, Jonathan Nieder wrote:
> 
> > > Sadly, while splitting these things apart makes the protocol
> > > conceptually cleaner, I'm not sure if we can consider them separately
> > > and avoid adding an extra round-trip to the protocol.
> > 
> > How about the idea of using this mechanism to implement a protocol
> > "v1"?
> > 
> > The reply would be the same as today, except that it has a "protocol
> > v1" pkt-line at the beginning.  So this doesn't change the number of
> > round-trips --- it just validates the protocol migration approach.
> 
> I'm not that worried about validating the ideas here to shoe-horn a
> version field. I'm worried about what "step 2" is going to look like,
> and whether "we shoe-horned a version field" can be extended to "we
> shoe-horned arbitrary data".

I know that this initial RFC didn't mention adding arbitrary data in the
initial request but I fully intend that the final version will allow
such a thing.  One such idea (via the envvar, though the same can be
done with git-daemon) would be to have GIT_PROTOCOL hold colon
delimited values, versions could be indicated by "v1", "v2", etc while
arbitrary key/values as "key=value" such that the envvar would look
like: 'v1:v2:key1=value1:key2=value2'.  This would mean that the client
understands protocol version 1 and 2 (so either are acceptable for the
server to select since there is a change the server doesn't understand
v2 or some other higher version number) and that if supported it wants
key1 to have value 'value1' and key2 to have value 'value2'.

As you said the initial request to git-daemon is limited in length (I
think envvars have a length limit too?) so we would still be limited in
how much data we can send in that initial request and so we should
design a new protocol in such a way that doesn't hinge on adding extra
data in that first request (aside from a version number) but that can
use it to speed things up if at all possible.

> > Can we do that by making it a patch / letting it cook for a while in
> > 'next'? :)
> 
> If people actually ran 'next', that would help. I was hoping that we
> could get it out to the masses behind a feature flag, and dangle it in
> front of them with "this will improve fetch performance if you turn it
> on". But that carrot implies going all the way through the follow-on
> steps of designing the performance-improving v2 extensions and getting
> them implemented on the server side.

We run 'next' here so we will be able to get at least a little bit of
feedback from a small subset of users.

-- 
Brandon Williams


Re: [RFC 0/7] transitioning to protocol v2

2017-08-25 Thread Junio C Hamano
Jeff King  writes:

> But what if we instead think of it not as "protocol v2" but as "can I
> give the server some hints that it may end up ignoring", then we end up
> with something more like:
>
>   C: please run upload-pack (btw, I'm only interested in refs/heads/foo)
>   S: advertisement + caps (hopefully limited to foo, but client is prepared 
> to receive all)
>   ... etc, as before ...

Nice.  The caps that come back can tell us between the cases where
they only had refs/heads/foo and nothing else, or if they limited
their output to it among many others we told them to ignore, so
there is no ambiguity.

> Or alternatively, I guess make this optional to start with, and
> let early adopters turn it on and complain to their server vendors
> for a while before flipping the default to on.

That sounds like a safe transition plan.


Re: [RFC 0/7] transitioning to protocol v2

2017-08-25 Thread Jeff King
On Fri, Aug 25, 2017 at 10:35:50AM -0700, Jonathan Nieder wrote:

> > Sadly, while splitting these things apart makes the protocol
> > conceptually cleaner, I'm not sure if we can consider them separately
> > and avoid adding an extra round-trip to the protocol.
> 
> How about the idea of using this mechanism to implement a protocol
> "v1"?
> 
> The reply would be the same as today, except that it has a "protocol
> v1" pkt-line at the beginning.  So this doesn't change the number of
> round-trips --- it just validates the protocol migration approach.

I'm not that worried about validating the ideas here to shoe-horn a
version field. I'm worried about what "step 2" is going to look like,
and whether "we shoe-horned a version field" can be extended to "we
shoe-horned arbitrary data".

Maybe those are the same thing, and validating the first validates the
second. But I don't think it's a foregone conclusion.

> > Current git.git servers, I assume?. How much do we want to care about
> > alternate implementations? I would not be surprised if other git://
> > implementations are more picky about cruft after the virtual-host field
> > (though I double-checked GitHub's implementation at least, and it is
> > fine).
> 
> FWIW JGit copes fine with this.

Thanks for checking.

> > I don't think libgit2 implements the server side. That leaves probably
> > JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> > and GitLab use.
> 
> I'd be happy if someone tests the patches against those. :)

Me too. I don't have an easy setup for the last 3.

> >Or alternatively, I guess make this optional to
> > start with, and let early adopters turn it on and complain to their server
> > vendors for a while before flipping the default to on.
> 
> Can we do that by making it a patch / letting it cook for a while in
> 'next'? :)

If people actually ran 'next', that would help. I was hoping that we
could get it out to the masses behind a feature flag, and dangle it in
front of them with "this will improve fetch performance if you turn it
on". But that carrot implies going all the way through the follow-on
steps of designing the performance-improving v2 extensions and getting
them implemented on the server side.

-Peff


Re: [RFC 0/7] transitioning to protocol v2

2017-08-25 Thread Jeff King
On Fri, Aug 25, 2017 at 10:14:13AM -0700, Junio C Hamano wrote:

> Stefan Beller  writes:
> 
> > For now I would suggest we put a protocol v2 in place that is
> > the current protocol + a version number coming through the
> > poked hole at the beginning; the goal and review of this series
> > ought to focus on getting the version handshake right...
> 
> Oh, we are in absolute agreement on that.  It would be nice if we
> can have new tests to demonostrate three combinations working well
> (i.e. use 'installed git' whose path is given externally on one end
> of the connection, while the just-built binary sits on the other
> end, in addition to making sure just-built binary successfully talks
> with itself).

The harness in t/interop can perhaps help with that (at least between
existing git versions; testing across other implementations makes the
setup a lot tricker).

-Peff


Re: [RFC 0/7] transitioning to protocol v2

2017-08-25 Thread Jonathan Nieder
Hi,

Jeff King wrote:
> On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:

>> Another version of Git's wire protocol is a topic that has been discussed and
>> attempted by many in the community over the years.  The biggest challenge, as
>> far as I understand, has been coming up with a transition plan to using the 
>> new
>> server without breaking existing clients and servers.  As such this RFC is
>> really only concerned with solidifying a transition plan.  Once it has been
>> decided how we can transition to a new protocol we can get into decided what
>> this new protocol would look like (though it would obviously eliminate the 
>> ref
>> advertisement ;).
>
> Sadly, while splitting these things apart makes the protocol
> conceptually cleaner, I'm not sure if we can consider them separately
> and avoid adding an extra round-trip to the protocol.

How about the idea of using this mechanism to implement a protocol
"v1"?

The reply would be the same as today, except that it has a "protocol
v1" pkt-line at the beginning.  So this doesn't change the number of
round-trips --- it just validates the protocol migration approach.

I agree with you that an actual protocol v2 needs to change how the
capability exchange etc work.  bmwill@ has mentioned some thoughts about
this privately.  Probably he can say more here too.

[...]
> Given the techniques you've used here, I suspect the answer may be
> "yes". We could stick arbitrary data in each of those methods (though I
> suspect our length may be limited to about 1024 bytes if we want
> compatibility with very old git servers).

Yes, including arbitrary data to be able to include some kinds of
requests inline in the initial request is one of the design goals.

>> The biggest question I'm trying to answer is if these are reasonable ways 
>> with
>> which to communicate a request to a server to use a newer protocol, without
>> breaking current servers/clients.  As far as I've tested, with patches 1-5
>> applied I can still communicate with current servers without causing any
>> problems.
>
> Current git.git servers, I assume?. How much do we want to care about
> alternate implementations? I would not be surprised if other git://
> implementations are more picky about cruft after the virtual-host field
> (though I double-checked GitHub's implementation at least, and it is
> fine).

FWIW JGit copes fine with this.

> I don't think libgit2 implements the server side. That leaves probably
> JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
> and GitLab use.

I'd be happy if someone tests the patches against those. :)

> There's not really a spec here.

Technically pack-protocol is a spec, just not a very clear one.

It does say this kind of client request is invalid.  The idea of this
series is to change the spec. :)

[...]
> I dunno. Maybe it would be enough to have a config to switch off this
> feature, which would give people using those systems an escape hatch
> (until they upgrade).

I'd rather not.  That means there's less motivation for people to
report compatibility problems so we can fix them.

It might be necessary as a temporary escape hatch, though.

>Or alternatively, I guess make this optional to
> start with, and let early adopters turn it on and complain to their server
> vendors for a while before flipping the default to on.

Can we do that by making it a patch / letting it cook for a while in
'next'? :)

Thanks,
Jonathan


Re: [RFC 0/7] transitioning to protocol v2

2017-08-25 Thread Jeff King
On Thu, Aug 24, 2017 at 03:53:21PM -0700, Brandon Williams wrote:

> Another version of Git's wire protocol is a topic that has been discussed and
> attempted by many in the community over the years.  The biggest challenge, as
> far as I understand, has been coming up with a transition plan to using the 
> new
> server without breaking existing clients and servers.  As such this RFC is
> really only concerned with solidifying a transition plan.  Once it has been
> decided how we can transition to a new protocol we can get into decided what
> this new protocol would look like (though it would obviously eliminate the ref
> advertisement ;).

Sadly, while splitting these things apart makes the protocol
conceptually cleaner, I'm not sure if we can consider them separately
and avoid adding an extra round-trip to the protocol.

For instance, let's say as a client that I've communicated "I would like
to speak v2" to the server. I don't immediately know if it was received
and respected, so I have to wait for the server to say "OK, I know v2"
before sending any more data (like my list of capabilities that I'd like
the server to know before doing the ref advertisement).

So what was perhaps:

  C: please run upload-pack
  S: advertisement + caps
  C: caps + wants
  C+S: async have negotiation
  S: packfile

becomes:

  C: please run upload-pack (v2 if you support it)
  S: yes, I speak v2
  C: caps (including that I'm interested only in refs/heads/foo)
  S: advertise refs/heads/foo + caps
  C+S async have negotiation
  S: packfile

That extra round-trip is probably tolerable for stateful connections
like git:// or ssh. But what about http? We have to add a whole
request/response pair just to find out if v2 is supported.

But what if we instead think of it not as "protocol v2" but as "can I
give the server some hints that it may end up ignoring", then we end up
with something more like:

  C: please run upload-pack (btw, I'm only interested in refs/heads/foo)
  S: advertisement + caps (hopefully limited to foo, but client is prepared to 
receive all)
  ... etc, as before ...

It's a subtle distinction, but the question becomes not "can we sneak in
an extra bit of information" but "can we sneak in a reasonable number of
arbitrary key/value pairs".

Given the techniques you've used here, I suspect the answer may be
"yes". We could stick arbitrary data in each of those methods (though I
suspect our length may be limited to about 1024 bytes if we want
compatibility with very old git servers).

> The biggest question I'm trying to answer is if these are reasonable ways with
> which to communicate a request to a server to use a newer protocol, without
> breaking current servers/clients.  As far as I've tested, with patches 1-5
> applied I can still communicate with current servers without causing any
> problems.

Current git.git servers, I assume?. How much do we want to care about
alternate implementations? I would not be surprised if other git://
implementations are more picky about cruft after the virtual-host field
(though I double-checked GitHub's implementation at least, and it is
fine).

I don't think libgit2 implements the server side. That leaves probably
JGit, Microsoft's VSTS (which I think is custom), and whatever Atlassian
and GitLab use.

There's not really a spec here.  I'm not entirely opposed to saying "if
your server does not behave like the git.git one it is wrong". But this
is awfully quirky behavior to be relying on, and if it behaves badly
with those implementations it will create headaches for users. The
centralized services I'm not too worried about; they'll upgrade
promptly. But any deployments of those systems may hang around for
years.

I dunno. Maybe it would be enough to have a config to switch off this
feature, which would give people using those systems an escape hatch
(until they upgrade). Or alternatively, I guess make this optional to
start with, and let early adopters turn it on and complain to their server
vendors for a while before flipping the default to on.

-Peff


Re: [RFC 0/7] transitioning to protocol v2

2017-08-25 Thread Junio C Hamano
Stefan Beller  writes:

> For now I would suggest we put a protocol v2 in place that is
> the current protocol + a version number coming through the
> poked hole at the beginning; the goal and review of this series
> ought to focus on getting the version handshake right...

Oh, we are in absolute agreement on that.  It would be nice if we
can have new tests to demonostrate three combinations working well
(i.e. use 'installed git' whose path is given externally on one end
of the connection, while the just-built binary sits on the other
end, in addition to making sure just-built binary successfully talks
with itself).



Re: [RFC 0/7] transitioning to protocol v2

2017-08-25 Thread Stefan Beller
On Thu, Aug 24, 2017 at 6:19 PM, Junio C Hamano  wrote:
> Brandon Williams  writes:
>
>> The best way to preserve functionality with old servers and clients would be 
>> to
>> communicate using the same end point but have the client send a bit of extra
>> information with its initial request.  This extra information would need to 
>> be
>> sent in such a way that old servers ignore it and operate normally (using
>> protocol v1).  The client would then need to be able to look at a server's
>> response to determine whether the server understands and is speaking v2 or 
>> has
>> ignored the clients request to use a newer protocol and is speaking v1.
>
> Good. I think the idle talk last round was for the server to tell
> the v1 client "we are still doing the slow ls-remote comes first
> protocol with this exchange, but just for future reference, you can
> use the v2 endpoint with me", which was way less desirable (even
> though it may be safer).

The idea that this RFC series tries to show case is safer IMO.

>> Patches 1-5 enable a client to unconditionally send this back-channel
>> information to a server.  This is done by sending a version number after a
>> second NUL byte in git://, in the envvar GIT_PROTOCOL in file:// and ssh://,
>> and in an http header in http://, https://.  Patches 6-7 teach a client and
>> upload-pack to send and recognize a request to use protocol v2.
>
> All sounds sensible.
>
>  - for git://, if you say you found a hole in the protocol to stuff
>this information, I simply believe you ;-)  Good job.

The hole is incredible funny and sad IMHO, so I would expect that
this series (specifically the review once it leaves RFC state) focusses
on how to allow future protocols with no such hacks. So AFAICT
the core idea of this series is that we can have an exchange

client through poked hole>  "We can speak version 4,3, and 1 as fallback"
server > "Ok, 3 it is"
[ protocol v3 is executed, I don't know what it looks like. ]

Alternatively when the server is old:

client >  "We can speak version 4,3, and 1 as fallback"
server > lists refs
client continues v1 as usual, because the version announcements
are so different from the first ref in the refs advertisement of v1,
such that the client knows for sure which version is talked. (despite
never getting an explicit "it is v1")

Or if the client is old:

client > (nothing)
server > lists refs, current v1 style.


>  - http:// and https:// should be a no-brainer as the HTTP headers
>give ample room to send information from the initiator side.
>
>  - For ssh://, I do not think it is sane to assume that we can
>convince server operators to allow passing any random environment
>variable.  If the use of this specific variable turns out to be
>popular enough, however, and its benefit outweighs administrative
>burden, perhaps it is not impossible to convince them to allow
>AcceptEnv on this single variable.

Once the next generation protocol demonstrates it is far superior
than the current protocol admins will switch happily. (Some ideas:
(a) refstable instead packed-refs format, yielding better compression for
ref advertisement, (b) refs-in-want to cut down ref advertisement to
just the needs, (c) negotiation to draw from the remote reflog)

For now I would suggest we put a protocol v2 in place that is
the current protocol + a version number coming through the
poked hole at the beginning; the goal and review of this series
ought to focus on getting the version handshake right (and future
proof for an eventual v3 if needed in another 10 years)

Regarding the patches itself:

 patches 1,2 seem ok, no further comment
 patches 3-5 are the client stating that it can understand v2.
 which means that patch 6 ("actually understand a v2, that
 looks surprisingly similar to v1") should come before 3-5.

 patch 7 can be either before or after 6, the server side
 seems independent of the client side for the sake of
 this patch series.

Thanks,
Stefan

>
> Thanks.


Re: [RFC 0/7] transitioning to protocol v2

2017-08-24 Thread Junio C Hamano
Brandon Williams  writes:

> The best way to preserve functionality with old servers and clients would be 
> to
> communicate using the same end point but have the client send a bit of extra
> information with its initial request.  This extra information would need to be
> sent in such a way that old servers ignore it and operate normally (using
> protocol v1).  The client would then need to be able to look at a server's
> response to determine whether the server understands and is speaking v2 or has
> ignored the clients request to use a newer protocol and is speaking v1.

Good. I think the idle talk last round was for the server to tell
the v1 client "we are still doing the slow ls-remote comes first
protocol with this exchange, but just for future reference, you can
use the v2 endpoint with me", which was way less desirable (even
though it may be safer).

> Patches 1-5 enable a client to unconditionally send this back-channel
> information to a server.  This is done by sending a version number after a
> second NUL byte in git://, in the envvar GIT_PROTOCOL in file:// and ssh://,
> and in an http header in http://, https://.  Patches 6-7 teach a client and
> upload-pack to send and recognize a request to use protocol v2.

All sounds sensible.

 - for git://, if you say you found a hole in the protocol to stuff
   this information, I simply believe you ;-)  Good job.

 - http:// and https:// should be a no-brainer as the HTTP headers
   give ample room to send information from the initiator side.

 - For ssh://, I do not think it is sane to assume that we can
   convince server operators to allow passing any random environment
   variable.  If the use of this specific variable turns out to be
   popular enough, however, and its benefit outweighs administrative
   burden, perhaps it is not impossible to convince them to allow
   AcceptEnv on this single variable.

Thanks.


[RFC 0/7] transitioning to protocol v2

2017-08-24 Thread Brandon Williams
Another version of Git's wire protocol is a topic that has been discussed and
attempted by many in the community over the years.  The biggest challenge, as
far as I understand, has been coming up with a transition plan to using the new
server without breaking existing clients and servers.  As such this RFC is
really only concerned with solidifying a transition plan.  Once it has been
decided how we can transition to a new protocol we can get into decided what
this new protocol would look like (though it would obviously eliminate the ref
advertisement ;).

The best way to preserve functionality with old servers and clients would be to
communicate using the same end point but have the client send a bit of extra
information with its initial request.  This extra information would need to be
sent in such a way that old servers ignore it and operate normally (using
protocol v1).  The client would then need to be able to look at a server's
response to determine whether the server understands and is speaking v2 or has
ignored the clients request to use a newer protocol and is speaking v1.

Patches 1-5 enable a client to unconditionally send this back-channel
information to a server.  This is done by sending a version number after a
second NUL byte in git://, in the envvar GIT_PROTOCOL in file:// and ssh://,
and in an http header in http://, https://.  Patches 6-7 teach a client and
upload-pack to send and recognize a request to use protocol v2.

The biggest question I'm trying to answer is if these are reasonable ways with
which to communicate a request to a server to use a newer protocol, without
breaking current servers/clients.  As far as I've tested, with patches 1-5
applied I can still communicate with current servers without causing any
problems.

Any comments/discussion is welcome!

Brandon Williams (7):
  pkt-line: add packet_write function
  pkt-line: add strbuf_packet_read
  protocol: tell server that the client understands v2
  t: fix ssh tests to cope with using '-o SendEnv=GIT_PROTOCOL'
  http: send Git-Protocol-Version header
  transport: teach client to recognize v2 server response
  upload-pack: ack version 2

 builtin/fetch-pack.c |   4 +-
 builtin/send-pack.c  |   5 +-
 connect.c| 196 +++
 daemon.c |  28 ++-
 http.c   |   7 ++
 pkt-line.c   |  27 ++
 pkt-line.h   |   2 +
 remote-curl.c|   7 +-
 remote.h |  22 -
 t/lib-proto-disable.sh   |   1 +
 t/t5551-http-fetch-smart.sh  |   2 +
 t/t5601-clone.sh |  10 +--
 t/t5602-clone-remote-exec.sh |   4 +-
 transport.c  |  60 +++--
 upload-pack.c|  11 +++
 15 files changed, 286 insertions(+), 100 deletions(-)

-- 
2.14.1.342.g6490525c54-goog