Re: [PATCH 1/2] http: add option to enable 100 Continue responses

2013-10-11 Thread brian m. carlson
On Thu, Oct 10, 2013 at 01:14:28AM -0700, Shawn Pearce wrote:
> If a large enough percentage of users are stuck behind a proxy that
> doesn't support 100-continue, it is hard to rely on that part of HTTP
> 1.1. You need to build the work-around for them anyway, so you might
> as well just make everyone use the work-around and assume 100-continue
> does not exist.

Well, the issue is that 100-continue is needed for functionality in some
cases, unless we want to restart the git-upload-pack command again or
force people to use outrageous sizes for http.postBuffer.  My preference
is generally to optimize for sane, standards-compliant behavior first,
and let the people with broken infrastructure turn on options to work
around that breakage.  I realize that git as a project is a little more
tolerant of people's myriad forms of breakage than I am personally.

Regardless, I have a reroll that leaves it disabled by default that I'll
send in a few minutes.

> 100-continue is frequently used when there is a large POST body, but
> those suck for users on slow or unstable connections. Typically the
> POST cannot be resumed where the connection was broken. To be friendly
> to users on less reliable connections than your gigabit office
> ethernet, you need to design the client side with some sort of
> chunking and gracefully retrying. So Git is really doing it all wrong.
> :-)

Yeah, there have been requests for resumable pull/push before.  The
proper way to do it would probably be to send lots of little mini-packs
that each depend on the previous pack sent; if the connection gets
reset, then at least some of the data has been transferred, and
negotiation would restart the next time.  The number of SHA-1s sent
during negotiation would have to increase, though, because you couldn't
be guaranteed that an entire ref would be able to be transferred each
time.  Large blobs would still be a problem, though, and efficiency
would plummet.

> Even if you want to live in the fairy land where all servers support
> 100-continue, I'm not sure clients should pay that 100-160ms latency
> penalty during ancestor negotiation. Do 5 rounds of negotiation and
> it's suddenly an extra half second for `git fetch`, and that is a
> fairly well connected client. Let me know how it works from India to a
> server on the west coast of the US, latency might be more like 200ms,
> and 5 rounds is now 1 full second of additional lag.

There shouldn't be that many rounds of negotiation.  HTTP retrieves the
list of refs over one connection, and then performs the POST over
another two.  Regardless, you should be using SSL over that connection,
and the number of round trips required for SSL negotiation in that case
completely dwarfs the overhead for the 100-continue, especially since
you'll do it thrice (even though the session is usually reused).  The
efficient way to do push is SSH, where you can avoid making multiple
connections and reuse the same encrypted connection at every stage.

-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187




Re: [PATCH 1/2] http: add option to enable 100 Continue responses

2013-10-11 Thread Shawn Pearce
On Fri, Oct 11, 2013 at 3:31 PM, brian m. carlson
sand...@crustytoothpaste.net wrote:
> On Thu, Oct 10, 2013 at 01:14:28AM -0700, Shawn Pearce wrote:
> > Even if you want to live in the fairy land where all servers support
> > 100-continue, I'm not sure clients should pay that 100-160ms latency
> > penalty during ancestor negotiation. Do 5 rounds of negotiation and
> > it's suddenly an extra half second for `git fetch`, and that is a
> > fairly well connected client. Let me know how it works from India to a
> > server on the west coast of the US, latency might be more like 200ms,
> > and 5 rounds is now 1 full second of additional lag.

> There shouldn't be that many rounds of negotiation.  HTTP retrieves the
> list of refs over one connection, and then performs the POST over
> another two.

Why two connections? This should be a single HTTP connection with HTTP
Keep-Alive semantics allowing the same TCP stream and the same SSL
stream to be used for all requests. Which is nearly equivalent to SSH.
Where SSH wins is the multi_ack protocol allowing the server to talk
while the client is talking.

> Regardless, you should be using SSL over that connection,
> and the number of round trips required for SSL negotiation in that case
> completely dwarfs the overhead for the 100-continue, especially since
> you'll do it thrice (even though the session is usually reused).  The
> efficient way to do push is SSH, where you can avoid making multiple
> connections and reuse the same encrypted connection at every stage.

SSH setup is also not free. Like SSL, it's going to require a round trip
or two on top of what Git needs.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] http: add option to enable 100 Continue responses

2013-10-10 Thread Shawn Pearce
On Wed, Oct 9, 2013 at 6:35 PM, brian m. carlson
sand...@crustytoothpaste.net wrote:
> On Wed, Oct 09, 2013 at 05:37:42PM -0400, Jeff King wrote:
> > On Wed, Oct 09, 2013 at 02:19:36PM -0700, Shawn O. Pearce wrote:
> > > 206b099 was written because the Google web servers for
> > > android.googlesource.com and code.google.com do not support
> > > 100-continue semantics. This caused the client to stall a full 1
> > > second before each POST exchange. If ancestor negotiation required
> > > O(128) have lines to be advertised I think this was 2 or 4 POSTs,
> > > resulting in 2-4 second stalls above the other latency of the network
> > > and the server.
> >
> > Yuck.
>
> Shame on Google.  Of all people, they should be able to implement HTTP
> 1.1 properly.

Heh. =)

If a large enough percentage of users are stuck behind a proxy that
doesn't support 100-continue, it is hard to rely on that part of HTTP
1.1. You need to build the work-around for them anyway, so you might
as well just make everyone use the work-around and assume 100-continue
does not exist.

100-continue is frequently used when there is a large POST body, but
those suck for users on slow or unstable connections. Typically the
POST cannot be resumed where the connection was broken. To be friendly
to users on less reliable connections than your gigabit office
ethernet, you need to design the client side with some sort of
chunking and gracefully retrying. So Git is really doing it all wrong.
:-)

Properly using 100-continue adds a full RTT to any request using it.
If the RTT time for an end-user to server is already 100-160ms on the
public Internet, using 100-continue just added an extra 160ms of
latency to whatever the operation was. That is hardly useful to
anyone. During that RTT the server has resources tied up associated
with that client connection. For your 10-person workgroup server this
is probably no big deal; at scale it can be a daunting additional
resource load.

Etc.


Even if you want to live in the fairy land where all servers support
100-continue, I'm not sure clients should pay that 100-160ms latency
penalty during ancestor negotiation. Do 5 rounds of negotiation and
it's suddenly an extra half second for `git fetch`, and that is a
fairly well connected client. Let me know how it works from India to a
server on the west coast of the US, latency might be more like 200ms,
and 5 rounds is now 1 full second of additional lag.


Re: [PATCH 1/2] http: add option to enable 100 Continue responses

2013-10-09 Thread Shawn Pearce
On Wed, Oct 9, 2013 at 12:30 PM, Jeff King p...@peff.net wrote:
On Tue, Oct 08, 2013 at 08:48:06PM +0000, brian m. carlson wrote:

> When using GSS-Negotiate authentication with libcurl, the authentication
> provided will change every time, and so the probe that git uses to
> determine if authentication is needed is not sufficient to guarantee
> that data can be sent.  If the data fits entirely in http.postBuffer
> bytes, the data can be rewound and resent if authentication fails;
> otherwise, a 100 Continue must be requested in this case.
>
> By default, curl will send an Expect: 100-continue if a certain amount
> of data is to be uploaded, but when using chunked data this is never
> triggered.  Add an option http.continue, which defaults to enabled, to
> control whether this header is sent.  The addition of an option is
> necessary because some proxies break badly if sent this header.

> [+cc spearce]
>
> I'd be more comfortable defaulting this to on if I understood more
> about the original problem that led to 959dfcf and 206b099. It sounds
> like enabling this all the time will cause annoying stalls in the
> protocol, unless the number of non-crappy proxy implementations has
> gotten smaller over the past few years.

It actually hasn't, not significantly.

206b099 was written because the Google web servers for
android.googlesource.com and code.google.com do not support
100-continue semantics. This caused the client to stall a full 1
second before each POST exchange. If ancestor negotiation required
O(128) have lines to be advertised I think this was 2 or 4 POSTs,
resulting in 2-4 second stalls above the other latency of the network
and the server.

The advice I received from the Google web server developers was to not
rely on 100-continue, because a large number of corporate proxy
servers do not support it correctly. HTTP 100-continue is just pretty
broken in the wild.

If Expect: 100-continue is required for GSS-Negotiate to work then
Git should only enable the option if the server is demanding
GSS-Negotiate for authentication. Per 206b099 the default should still
be off for anonymous and HTTP basic, digest, and SSL certificate
authentication.

> > diff --git a/remote-curl.c b/remote-curl.c
> > index b5ebe01..3b5e160 100644
> > --- a/remote-curl.c
> > +++ b/remote-curl.c
> > @@ -470,7 +470,12 @@ static int post_rpc(struct rpc_state *rpc)
> >
> >  	headers = curl_slist_append(headers, rpc->hdr_content_type);
> >  	headers = curl_slist_append(headers, rpc->hdr_accept);
> > -	headers = curl_slist_append(headers, "Expect:");
> > +
> > +	/* Force it either on or off, since curl will try to decide based on how
> > +	 * much data is to be uploaded and we want consistency.
> > +	 */
> > +	headers = curl_slist_append(headers, http_use_100_continue ?
> > +		"Expect: 100-continue" : "Expect:");
>
> Is there any point in sending the Expect: header in cases where curl
> would not send it, though? It seems like we should assume curl does the
> right thing most of the time, and have our option only be to override
> curl in the negative direction.

Adding a header of "Expect:" causes curl to disable the header and
never use it. Always supplying the header with no value prevents
libcurl from using 100-continue on its own, which is what I fixed in
959dfcf.


Re: [PATCH 1/2] http: add option to enable 100 Continue responses

2013-10-09 Thread Jeff King
On Wed, Oct 09, 2013 at 02:19:36PM -0700, Shawn O. Pearce wrote:

> > I'd be more comfortable defaulting this to on if I understood more
> > about the original problem that led to 959dfcf and 206b099. It sounds
> > like enabling this all the time will cause annoying stalls in the
> > protocol, unless the number of non-crappy proxy implementations has
> > gotten smaller over the past few years.
>
> It actually hasn't, not significantly.

Thanks for the update; my pessimistic side assumed this was the case,
but it is always good to check. :)

> 206b099 was written because the Google web servers for
> android.googlesource.com and code.google.com do not support
> 100-continue semantics. This caused the client to stall a full 1
> second before each POST exchange. If ancestor negotiation required
> O(128) have lines to be advertised I think this was 2 or 4 POSTs,
> resulting in 2-4 second stalls above the other latency of the network
> and the server.

Yuck.

> If Expect: 100-continue is required for GSS-Negotiate to work then
> Git should only enable the option if the server is demanding
> GSS-Negotiate for authentication. Per 206b099 the default should still
> be off for anonymous and HTTP basic, digest, and SSL certificate
> authentication.

Part of the problem is that curl is the one handling the negotiation.
When we get a 401, I think we can ask curl_easy_getinfo to tell us which
auth methods are available (via CURLINFO_HTTPAUTH_AVAIL). But I don't
know how we decide that GSS is what's going to be used. I guess if it is
the only option, and there is no basic-auth offered?

And then in that case turn on Expect (or more accurately, stop
disabling it).

I don't have a GSS-enabled server to test on. Brian, can you try the
patch at the end of this message on your non-working server and see what
it outputs?

> > > +	headers = curl_slist_append(headers, http_use_100_continue ?
> > > +		"Expect: 100-continue" : "Expect:");
> >
> > Is there any point in sending the Expect: header in cases where curl
> > would not send it, though? It seems like we should assume curl does the
> > right thing most of the time, and have our option only be to override
> > curl in the negative direction.
>
> Adding a header of "Expect:" causes curl to disable the header and
> never use it. Always supplying the header with no value prevents
> libcurl from using 100-continue on its own, which is what I fixed in
> 959dfcf.

Right. What I meant is that we do not want to unconditionally send the
Expect: 100-continue in the other case. IOW, we would just want:

  if (!http_use_100_continue)
          headers = curl_slist_append(headers, "Expect:");

-Peff

-- >8 --
diff --git a/http.c b/http.c
index f3e1439..e7257d7 100644
--- a/http.c
+++ b/http.c
@@ -889,6 +889,12 @@ static int http_request(const char *url, struct strbuf *type,
 	if (start_active_slot(slot)) {
 		run_active_slot(slot);
 		ret = handle_curl_result(results);
+		if (ret == HTTP_REAUTH) {
+			long auth_avail;
+			curl_easy_getinfo(slot->curl, CURLINFO_HTTPAUTH_AVAIL,
+					  &auth_avail);
+			fprintf(stderr, "offered auth: %ld\n", auth_avail);
+		}
 	} else {
 		snprintf(curl_errorstr, sizeof(curl_errorstr),
 			 "failed to start HTTP request");



Re: [PATCH 1/2] http: add option to enable 100 Continue responses

2013-10-09 Thread brian m. carlson
On Wed, Oct 09, 2013 at 05:37:42PM -0400, Jeff King wrote:
> On Wed, Oct 09, 2013 at 02:19:36PM -0700, Shawn O. Pearce wrote:
> > 206b099 was written because the Google web servers for
> > android.googlesource.com and code.google.com do not support
> > 100-continue semantics. This caused the client to stall a full 1
> > second before each POST exchange. If ancestor negotiation required
> > O(128) have lines to be advertised I think this was 2 or 4 POSTs,
> > resulting in 2-4 second stalls above the other latency of the network
> > and the server.
>
> Yuck.

Shame on Google.  Of all people, they should be able to implement HTTP
1.1 properly.

> Part of the problem is that curl is the one handling the negotiation.
> When we get a 401, I think we can ask curl_easy_getinfo to tell us which
> auth methods are available (via CURLINFO_HTTPAUTH_AVAIL). But I don't
> know how we decide that GSS is what's going to be used. I guess if it is
> the only option, and there is no basic-auth offered?
>
> And then in that case turn on Expect (or more accurately, stop
> disabling it).
>
> I don't have a GSS-enabled server to test on. Brian, can you try the
> patch at the end of this message on your non-working server and see what
> it outputs?

It doesn't trigger.  My server only requires authentication for the
actual push operation, and it looks like your code doesn't trigger in
that case.

> > > > +	headers = curl_slist_append(headers, http_use_100_continue ?
> > > > +		"Expect: 100-continue" : "Expect:");
> > >
> > > Is there any point in sending the Expect: header in cases where curl
> > > would not send it, though? It seems like we should assume curl does the
> > > right thing most of the time, and have our option only be to override
> > > curl in the negative direction.

I think curl expects that data in the POSTFIELDS option in order to
trigger.  I wasn't able to get it to trigger on its own.

> > Adding a header of "Expect:" causes curl to disable the header and
> > never use it. Always supplying the header with no value prevents
> > libcurl from using 100-continue on its own, which is what I fixed in
> > 959dfcf.
>
> Right. What I meant is that we do not want to unconditionally send the
> Expect: 100-continue in the other case. IOW, we would just want:
>
>   if (!http_use_100_continue)
>           headers = curl_slist_append(headers, "Expect:");

I tried that.  It doesn't work, since it never sends the Expect:
100-continue.  I made libcurl dump all the headers it sends, and it
doesn't send them in that case.



Re: [PATCH 1/2] http: add option to enable 100 Continue responses

2013-10-09 Thread Jeff King
On Thu, Oct 10, 2013 at 01:35:47AM +0000, brian m. carlson wrote:

> > I don't have a GSS-enabled server to test on. Brian, can you try the
> > patch at the end of this message on your non-working server and see what
> > it outputs?
>
> It doesn't trigger.  My server only requires authentication for the
> actual push operation, and it looks like your code doesn't trigger in
> that case.

Can you describe the sequence of requests, then?

Do you not have to auth for the info/refs request, and then have to auth
later for the git-receive-pack request? Does the auth trigger for the
probe_rpc call?

Can you show us GIT_CURL_VERBOSE=1 output for a session that fails (do
note that it will dump your auth headers, so you may want to redact them
before sharing)?

> > Is there any point in sending the Expect: header in cases where curl
> > would not send it, though? It seems like we should assume curl does the
> > right thing most of the time, and have our option only be to override
> > curl in the negative direction.
>
> I think curl expects that data in the POSTFIELDS option in order to
> trigger.  I wasn't able to get it to trigger on its own.

Was your POST perhaps not big enough? My impression is that it only
sends it if the POST is too large to rewind (which seems like a good
thing in general), and I thought earlier in the thread that command-line
curl was sending one. I'm not sure what would be different for us here,
once our manual Expect suppression is turned off.

-Peff


[PATCH 1/2] http: add option to enable 100 Continue responses

2013-10-08 Thread brian m. carlson
When using GSS-Negotiate authentication with libcurl, the authentication
provided will change every time, and so the probe that git uses to determine if
authentication is needed is not sufficient to guarantee that data can be sent.
If the data fits entirely in http.postBuffer bytes, the data can be rewound and
resent if authentication fails; otherwise, a 100 Continue must be requested in
this case.

By default, curl will send an Expect: 100-continue if a certain amount of data
is to be uploaded, but when using chunked data this is never triggered.  Add an
option http.continue, which defaults to enabled, to control whether this header
is sent.  The addition of an option is necessary because some proxies break
badly if sent this header.

Signed-off-by: brian m. carlson sand...@crustytoothpaste.net
---
 http.c| 6 ++
 http.h| 1 +
 remote-curl.c | 7 ++-
 3 files changed, 13 insertions(+), 1 deletion(-)

diff --git a/http.c b/http.c
index f3e1439..941c941 100644
--- a/http.c
+++ b/http.c
@@ -11,6 +11,7 @@
 int active_requests;
 int http_is_verbose;
 size_t http_post_buffer = 16 * LARGE_PACKET_MAX;
+int http_use_100_continue = 1;
 
 #if LIBCURL_VERSION_NUM >= 0x070a06
 #define LIBCURL_CAN_HANDLE_AUTH_ANY
@@ -213,6 +214,11 @@ static int http_options(const char *var, const char *value, void *cb)
return 0;
}
 
+	if (!strcmp("http.continue", var)) {
+		http_use_100_continue = git_config_bool(var, value);
+		return 0;
+	}
+
 	if (!strcmp("http.useragent", var))
 		return git_config_string(&user_agent, var, value);
 
diff --git a/http.h b/http.h
index d77c1b5..e72786e 100644
--- a/http.h
+++ b/http.h
@@ -102,6 +102,7 @@ extern void http_cleanup(void);
 extern int active_requests;
 extern int http_is_verbose;
 extern size_t http_post_buffer;
+extern int http_use_100_continue;
 
 extern char curl_errorstr[CURL_ERROR_SIZE];
 
diff --git a/remote-curl.c b/remote-curl.c
index b5ebe01..3b5e160 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -470,7 +470,12 @@ static int post_rpc(struct rpc_state *rpc)
 
 	headers = curl_slist_append(headers, rpc->hdr_content_type);
 	headers = curl_slist_append(headers, rpc->hdr_accept);
-	headers = curl_slist_append(headers, "Expect:");
+
+	/* Force it either on or off, since curl will try to decide based on how
+	 * much data is to be uploaded and we want consistency.
+	 */
+	headers = curl_slist_append(headers, http_use_100_continue ?
+		"Expect: 100-continue" : "Expect:");
 
 retry:
slot = get_active_slot();
-- 
1.8.4.rc3
