Re: [PATCH v2 1/2] remote-curl: accept all encodings supported by curl

2018-05-22 Thread Daniel Stenberg

On Wed, 23 May 2018, Junio C Hamano wrote:


-> Accept-Encoding: gzip
+> Accept-Encoding: ENCODINGS


Is the ordering of these headers determined by the user of cURL library 
(i.e. Git), or whatever the version of cURL we happened to link with happens 
to produce?


The point is whether the order is expected to be stable, or we are better 
off sorting the actual log before comparing.


The order is not guaranteed by libcurl to be fixed, but it is likely to remain 
stable since we too have test cases and compare outputs with expected outputs! 
=)


Going forward, brotli (br) is going to become more commonly present in that 
header.


--

 / daniel.haxx.se


Re: [PATCH v2 1/2] remote-curl: accept all encodings supported by curl

2018-05-22 Thread brian m. carlson
On Wed, May 23, 2018 at 10:23:26AM +0900, Junio C Hamano wrote:
> Similarly, how much control do we have to ensure that the test HTTPD
> server (1) supports gzip and (2) does not support encoding algos
> with confusing names e.g. "funnygzipalgo" that may accidentally
> match that pattern?

I feel it's quite likely indeed that pretty much any Apache instance is
going to have the gzip encoding.  Every distributor I know supports it.

As for whether there are confusing alternate algorithms, I think it's
best to just look at the IANA registration[0] to see what people are
using.  Potential matches include gzip, x-gzip (a deprecated alias that
versions of Apache we can use are not likely to support), and
pack200-gzip (a format for Java archives, which we hope the remote side
will not be sending).

Overall, I think this is not likely to be a problem, but if necessary in
the future, we can add a prerequisite that looks in the module directory
for the appropriate module.  We haven't seen an issue with it yet,
though, TTBOMK.

[0] 
https://www.iana.org/assignments/http-parameters/http-parameters.xml#content-coding
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH v2 1/2] remote-curl: accept all encodings supported by curl

2018-05-22 Thread Junio C Hamano
Brandon Williams  writes:

> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index f5721b4a5..913089b14 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -26,14 +26,14 @@ setup_askpass_helper
>  cat >exp <  > GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
>  > Accept: */*
> -> Accept-Encoding: gzip
> +> Accept-Encoding: ENCODINGS
>  > Pragma: no-cache

Is the ordering of these headers determined by the user of cURL
library (i.e. Git), or whatever the version of cURL we happened to
link with happens to produce?

The point is whether the order is expected to be stable, or we are
better off sorting the actual log before comparing.

>  < HTTP/1.1 200 OK
>  < Pragma: no-cache
>  < Cache-Control: no-cache, max-age=0, must-revalidate
>  < Content-Type: application/x-git-upload-pack-advertisement

A similar question for this response.

>  > POST /smart/repo.git/git-upload-pack HTTP/1.1
> -> Accept-Encoding: gzip
> +> Accept-Encoding: ENCODINGS
>  > Content-Type: application/x-git-upload-pack-request
>  > Accept: application/x-git-upload-pack-result
>  > Content-Length: xxx

Ditto for this request.

> @@ -79,8 +79,13 @@ test_expect_success 'clone http repository' '
>   /^< Date: /d
>   /^< Content-Length: /d
>   /^< Transfer-Encoding: /d
> - " >act &&
> - test_cmp exp act
> + " >actual &&
> + sed -e "s/^> Accept-Encoding: .*/> Accept-Encoding: ENCODINGS/" \
> + actual >actual.smudged &&
> + test_cmp exp actual.smudged &&
> +
> + grep "Accept-Encoding:.*gzip" actual >actual.gzip &&
> + test_line_count = 2 actual.gzip
>  '

Similarly, how much control do we have to ensure that the test HTTPD
server (1) supports gzip and (2) does not support encoding algos
with confusing names e.g. "funnygzipalgo" that may accidentally
match that pattern?

Thanks.  Not a new issue with this patch, but just being curious if
you or anybody thought about it as a possible issue.




Re: [PATCH v2 1/2] remote-curl: accept all encodings supported by curl

2018-05-22 Thread Jonathan Nieder
Brandon Williams wrote:

> Configure curl to accept all encodings which curl supports instead of
> only accepting gzip responses.
>
> This fixes an issue when using an installation of curl which is built
> without the "zlib" feature. Since aa90b9697 (Enable info/refs gzip
> decompression in HTTP client, 2012-09-19) we end up requesting "gzip"
> encoding anyway despite libcurl not being able to decode it.  Worse,
> instead of getting a clear error message indicating so, we end up
> falling back to "dumb" http, producing a confusing and difficult to
> debug result.
>
> Since curl doesn't do any checking to verify that it supports the a
> requested encoding, instead set the curl option `CURLOPT_ENCODING` with
> an empty string indicating that curl should send an "Accept-Encoding"
> header containing only the encodings supported by curl.

Even better, this means we get the benefit of future of even better
compression algorithms once libcurl learns them.

> Reported-by: Anton Golubev 
> Signed-off-by: Brandon Williams 
> ---
> Version 2 of this series just tweaks the commit message and the test per
> Jonathan's suggestion.
>
>  http.c  |  2 +-
>  remote-curl.c   |  2 +-
>  t/t5551-http-fetch-smart.sh | 13 +
>  3 files changed, 11 insertions(+), 6 deletions(-)

Reviewed-by: Jonathan Nieder 

Thanks for fixing it.

Patch left unsnipped for reference.

> --- a/http.c
> +++ b/http.c
> @@ -1788,7 +1788,7 @@ static int http_request(const char *url,
>  
>   curl_easy_setopt(slot->curl, CURLOPT_URL, url);
>   curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
> - curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
> + curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
>  
>   ret = run_one_slot(slot, );
>  
> diff --git a/remote-curl.c b/remote-curl.c
> index ceb05347b..565bba104 100644
> --- a/remote-curl.c
> +++ b/remote-curl.c
> @@ -684,7 +684,7 @@ static int post_rpc(struct rpc_state *rpc)
>   curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
>   curl_easy_setopt(slot->curl, CURLOPT_POST, 1);
>   curl_easy_setopt(slot->curl, CURLOPT_URL, rpc->service_url);
> - curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "gzip");
> + curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
>  
>   if (large_request) {
>   /* The request body is large and the size cannot be predicted.
> diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
> index f5721b4a5..913089b14 100755
> --- a/t/t5551-http-fetch-smart.sh
> +++ b/t/t5551-http-fetch-smart.sh
> @@ -26,14 +26,14 @@ setup_askpass_helper
>  cat >exp <  > GET /smart/repo.git/info/refs?service=git-upload-pack HTTP/1.1
>  > Accept: */*
> -> Accept-Encoding: gzip
> +> Accept-Encoding: ENCODINGS
>  > Pragma: no-cache
>  < HTTP/1.1 200 OK
>  < Pragma: no-cache
>  < Cache-Control: no-cache, max-age=0, must-revalidate
>  < Content-Type: application/x-git-upload-pack-advertisement
>  > POST /smart/repo.git/git-upload-pack HTTP/1.1
> -> Accept-Encoding: gzip
> +> Accept-Encoding: ENCODINGS
>  > Content-Type: application/x-git-upload-pack-request
>  > Accept: application/x-git-upload-pack-result
>  > Content-Length: xxx
> @@ -79,8 +79,13 @@ test_expect_success 'clone http repository' '
>   /^< Date: /d
>   /^< Content-Length: /d
>   /^< Transfer-Encoding: /d
> - " >act &&
> - test_cmp exp act
> + " >actual &&
> + sed -e "s/^> Accept-Encoding: .*/> Accept-Encoding: ENCODINGS/" \
> + actual >actual.smudged &&
> + test_cmp exp actual.smudged &&
> +
> + grep "Accept-Encoding:.*gzip" actual >actual.gzip &&
> + test_line_count = 2 actual.gzip
>  '
>  
>  test_expect_success 'fetch changes via http' '