As discussed in v2.19.0-rc0~45^2~2 (http-backend: respect
CONTENT_LENGTH as specified by rfc3875, 2018-06-10), HTTP servers such
as IIS do not close a CGI script's standard input at the end of a
request, instead expecting CGI scripts to stop reading after
CONTENT_LENGTH bytes. That commit taught http-backend to respect this
convention except when CONTENT_LENGTH is unset, in which case it
preserved the previous behavior of reading until EOF.

RFC 3875 (the CGI specification) explains:

    The CONTENT_LENGTH variable contains the size of the message-body
    attached to the request, if any, in decimal number of octets. If no
    data is attached, then NULL (or unset).

        CONTENT_LENGTH = "" | 1*digit

And:

    This specification does not distinguish between zero-length (NULL)
    values and missing values.

But that specification was written before HTTP/1.1 and chunked
encoding. With chunked encoding, the length of a request is not known
early and it is useful to start a CGI script to process it anyway, so
Apache and many other servers violate the spec: they leave
CONTENT_LENGTH unset and rely on EOF to indicate the end of request.
This is reproducible using t5510-fetch.sh, which hangs if http-backend
is patched to treat a missing CONTENT_LENGTH as zero.

So we are in a bind: to support HTTP servers that don't produce EOF,
http-backend should treat an unset or empty CONTENT_LENGTH as zero,
and to support chunked encoding, http-backend should treat an unset
CONTENT_LENGTH as "read until EOF".

Fortunately, there's a way out. Use the HTTP_TRANSFER_ENCODING
environment variable to distinguish the two cases: when it is
"chunked", read until EOF; otherwise, treat a missing or empty
CONTENT_LENGTH as an empty request body.

Reported-by: Jeff King <[email protected]>
Helped-by: Max Kirillov <[email protected]>
Signed-off-by: Jonathan Nieder <[email protected]>
---
How about this?
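
For anyone who wants to poke at the decision logic outside of git, here
is a minimal standalone sketch of the same rule. It is not the patch
itself: request_body_length(), BODY_UNTIL_EOF and BODY_INVALID are
hypothetical names invented for this illustration, strtol() stands in
for git_parse_ssize_t() (so it is slightly more lenient about leading
whitespace and a "+" sign), and a parse failure is reported through a
return value instead of die().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical sentinels for this sketch; not names from http-backend.c. */
#define BODY_UNTIL_EOF (-1L)
#define BODY_INVALID   (-2L)

/*
 * Decide how many request-body bytes to read, given the values of the
 * CONTENT_LENGTH and HTTP_TRANSFER_ENCODING environment variables
 * (either may be NULL when unset).
 */
static long request_body_length(const char *content_length,
				const char *transfer_encoding)
{
	char *end;
	long val;

	if (!content_length || !*content_length) {
		/* Chunked request: the length is unknown, so rely on EOF. */
		if (transfer_encoding && !strcmp(transfer_encoding, "chunked"))
			return BODY_UNTIL_EOF;
		/* Otherwise follow RFC 3875: missing or empty means no body. */
		return 0;
	}

	/* Stand-in for git_parse_ssize_t(); rejects junk and negatives. */
	errno = 0;
	val = strtol(content_length, &end, 10);
	if (errno || *end || val < 0)
		return BODY_INVALID;
	return val;
}
```

A caller would pass getenv("CONTENT_LENGTH") and
getenv("HTTP_TRANSFER_ENCODING") straight through. Note that, like the
patch, the sketch only recognizes the exact string "chunked".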
http-backend.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)
diff --git a/http-backend.c b/http-backend.c
index 458642ef72..7902eeb0b3 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -350,10 +350,25 @@ static ssize_t read_request_fixed_len(int fd, ssize_t req_len, unsigned char **o
 static ssize_t get_content_length(void)
 {
-	ssize_t val = -1;
+	ssize_t val;
 	const char *str = getenv("CONTENT_LENGTH");
-	if (str && *str && !git_parse_ssize_t(str, &val))
+	if (!str || !*str) {
+		/*
+		 * According to RFC 3875, an empty or missing
+		 * CONTENT_LENGTH means "no body", but RFC 3875
+		 * precedes HTTP/1.1 and chunked encoding. Apache and
+		 * its imitators leave CONTENT_LENGTH unset for
+		 * chunked requests, for which we should use EOF to
+		 * detect the end of the request.
+		 */
+		str = getenv("HTTP_TRANSFER_ENCODING");
+		if (str && !strcmp(str, "chunked"))
+			return -1;
+
+		return 0;
+	}
+	if (!git_parse_ssize_t(str, &val))
 		die("failed to parse CONTENT_LENGTH: %s", str);
 	return val;
 }
--
2.19.0.397.gdd90340f6a