Re: [PATCH v2] http.c: use CURLOPT_RANGE for range requests
On Sun, 1 Nov 2015, Jeff King wrote: While I have your attention, Daniel, am I correct in assuming that performing a second unrelated request with the same CURL object will need an explicit: curl_easy_setopt(curl, CURLOPT_RANGE, NULL); to avoid using the range twice? Correct. As the options stick until modified. -- / daniel.haxx.se -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] http.c: use CURLOPT_RANGE for range requests
On Mon, Nov 02, 2015 at 12:00:42AM +0100, Daniel Stenberg wrote: > On Fri, 30 Oct 2015, Jeff King wrote: > > >The goal makes sense. Why weren't we using CURLOPT_RANGE before? Did it > >not exist (or otherwise have limitations) in 2005, and if so, when did it > >become usable? Do we need to protect this with an #ifdef for the curl > >version? > > CURLOPT_RANGE existed already in the first libcurl release: version 7.1, > relased in August 2000. Ah, thanks. I guess we don't have to worry about that, then. While I have your attention, Daniel, am I correct in assuming that performing a second unrelated request with the same CURL object will need an explicit: curl_easy_setopt(curl, CURLOPT_RANGE, NULL); to avoid using the range twice? -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] http.c: use CURLOPT_RANGE for range requests
On Fri, 30 Oct 2015, Jeff King wrote: The goal makes sense. Why weren't we using CURLOPT_RANGE before? Did it not exist (or otherwise have limitations) in 2005, and if so, when did it become usable? Do we need to protect this with an #ifdef for the curl version? CURLOPT_RANGE existed already in the first libcurl release: version 7.1, relased in August 2000. -- / daniel.haxx.se -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] http.c: use CURLOPT_RANGE for range requests
On Sat, Oct 31, 2015 at 10:40:13AM -0700, Junio C Hamano wrote: > Jeff King writes: > > > We could even hide the whole thing away with something like: > > > > void http_set_range(CURL *curl, long lo, long hi) > > { > > char buf[128]; > > int len = 0; > > > > if (lo >= 0) > > len += xsnprintf(buf + len, "%ld", lo); > > len += xsnprintf(buf + len, "-"); > > if (hi >= 0) > > len += xsnprintf(buf + len, "%ld", hi); > > > > curl_easy_setopt(curl, CURLOPT_RANGE, buf); > > } > > > > That would also make it easier to replace if we do need to keep an > > #ifdef for older versions of curl. But maybe it is just > > over-engineering. > > I personally do not think this is an over-engineered version. This > is exactly what I had in mind when I alluded to a small helper ;-) Yeah, I somehow missed your suggestion before writing that. I think the important thing your example noted is that we would always pass -1 for "hi" in all the current callers, so we can simplify this quite a bit, and use only one snprintf. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] http.c: use CURLOPT_RANGE for range requests
Jeff King writes: > We could even hide the whole thing away with something like: > > void http_set_range(CURL *curl, long lo, long hi) > { > char buf[128]; > int len = 0; > > if (lo >= 0) > len += xsnprintf(buf + len, "%ld", lo); > len += xsnprintf(buf + len, "-"); > if (hi >= 0) > len += xsnprintf(buf + len, "%ld", hi); > > curl_easy_setopt(curl, CURLOPT_RANGE, buf); > } > > That would also make it easier to replace if we do need to keep an > #ifdef for older versions of curl. But maybe it is just > over-engineering. I personally do not think this is an over-engineered version. This is exactly what I had in mind when I alluded to a small helper ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] http.c: use CURLOPT_RANGE for range requests
On Fri, Oct 30, 2015 at 06:54:42PM -0400, David Turner wrote: > A HTTP server is permitted to return a non-range response to a HTTP > range request (and Apache httpd in fact does this in some cases). > While libcurl knows how to correctly handle this (by skipping bytes > before and after the requested range), it only turns on this handling > if it is aware that a range request is being made. By manually > setting the range header instead of using CURLOPT_RANGE, we were > hiding the fact that this was a range request from libcurl. This > could cause corruption. The goal makes sense. Why weren't we using CURLOPT_RANGE before? Did it not exist (or otherwise have limitations) in 2005, and if so, when did it become usable? Do we need to protect this with an #ifdef for the curl version? > @@ -1213,8 +1213,9 @@ static int http_request(const char *url, > curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, >fwrite); > if (posn > 0) { > - strbuf_addf(&buf, "Range: bytes=%ld-", posn); > - headers = curl_slist_append(headers, buf.buf); > + strbuf_addf(&buf, "%ld-", posn); > + curl_easy_setopt(slot->curl, CURLOPT_RANGE, > + &buf.buf); > strbuf_reset(&buf); > } > } else We reuse curl slots from request to request, and curl options may persist. The old code appended to the header list, which then gets added to the curl object with CURLOPT_HTTPHEADER. Subsequent requests, even if they are not using ranges, would also set CURLOPT_HTTPHEADER, possibly to NULL, so the range header would not accidentally creep into the next request. But with CURLOPT_RANGE, I think the value would persist to the next request (unless curl automagically forgets it after making the request, but I don't think it generally does so). I think the cleanest thing would be to reset it back to NULL in get_active_slot (there are several other fields that we reset there, too). > @@ -1685,7 +1680,6 @@ struct http_object_request > *new_http_object_request(const char *base_url, > ssize_t prev_read = 0; > long prev_posn = 0; > char range[RANGE_HEADER_SIZE]; > - struct curl_slist *range_header = NULL; > struct http_object_request *freq; Junio asked elsewhere if we really need to tweak RANGE_HEADER_SIZE. But I think we should go even further, and bump the definition much closer to the point of use[1], and not worry about an arbitrary constant. After all, we already use sizeof(), so we do not even end up repeating the constant. We could even hide the whole thing away with something like: void http_set_range(CURL *curl, long lo, long hi) { char buf[128]; int len = 0; if (lo >= 0) len += xsnprintf(buf + len, "%ld", lo); len += xsnprintf(buf + len, "-"); if (hi >= 0) len += xsnprintf(buf + len, "%ld", hi); curl_easy_setopt(curl, CURLOPT_RANGE, buf); } That would also make it easier to replace if we do need to keep an #ifdef for older versions of curl. But maybe it is just over-engineering. -Peff [1] Antique versions of curl needed the buffer to remain valid through the whole request, but these days it makes its own copy (and even under the old regime, I am not sure if the existing code would work anyway, as we return the request from this function, and the buffer is function-local). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2] http.c: use CURLOPT_RANGE for range requests
David Turner writes: > A HTTP server is permitted to return a non-range response to a HTTP > range request (and Apache httpd in fact does this in some cases). > While libcurl knows how to correctly handle this (by skipping bytes > before and after the requested range), it only turns on this handling > if it is aware that a range request is being made. By manually > setting the range header instead of using CURLOPT_RANGE, we were > hiding the fact that this was a range request from libcurl. This > could cause corruption. Wow, that looks really bad. > Signed-off-by: David Turner > --- > > This version applies on top of pu. It also catches all of the range > requests instead of just the one that I happened to notice. > > --- > http.c | 24 > http.h | 1 - > 2 files changed, 8 insertions(+), 17 deletions(-) > > diff --git a/http.c b/http.c > index 6b89dea..16610b9 100644 > --- a/http.c > +++ b/http.c > @@ -30,7 +30,7 @@ static CURL *curl_default; > #endif > > #define PREV_BUF_SIZE 4096 > -#define RANGE_HEADER_SIZE 30 > +#define RANGE_HEADER_SIZE 17 Was this change necessary and is 17-byte string sufficient for 64-bit longs? > @@ -1213,8 +1213,9 @@ static int http_request(const char *url, > curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, >fwrite); > if (posn > 0) { > - strbuf_addf(&buf, "Range: bytes=%ld-", posn); > - headers = curl_slist_append(headers, buf.buf); > + strbuf_addf(&buf, "%ld-", posn); > + curl_easy_setopt(slot->curl, CURLOPT_RANGE, > + &buf.buf); > strbuf_reset(&buf); Makes me wonder if all the callers have a CURL* and a long so that a new helper range_opt_ask_remainder(CURL *, long) can make the resulting code even simpler; we only say "we know what comes before this position, give us everything from there" in all callers, right? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] http.c: use CURLOPT_RANGE for range requests
A HTTP server is permitted to return a non-range response to a HTTP range request (and Apache httpd in fact does this in some cases). While libcurl knows how to correctly handle this (by skipping bytes before and after the requested range), it only turns on this handling if it is aware that a range request is being made. By manually setting the range header instead of using CURLOPT_RANGE, we were hiding the fact that this was a range request from libcurl. This could cause corruption. Signed-off-by: David Turner --- This version applies on top of pu. It also catches all of the range requests instead of just the one that I happened to notice. --- http.c | 24 http.h | 1 - 2 files changed, 8 insertions(+), 17 deletions(-) diff --git a/http.c b/http.c index 6b89dea..16610b9 100644 --- a/http.c +++ b/http.c @@ -30,7 +30,7 @@ static CURL *curl_default; #endif #define PREV_BUF_SIZE 4096 -#define RANGE_HEADER_SIZE 30 +#define RANGE_HEADER_SIZE 17 char curl_errorstr[CURL_ERROR_SIZE]; @@ -1213,8 +1213,9 @@ static int http_request(const char *url, curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite); if (posn > 0) { - strbuf_addf(&buf, "Range: bytes=%ld-", posn); - headers = curl_slist_append(headers, buf.buf); + strbuf_addf(&buf, "%ld-", posn); + curl_easy_setopt(slot->curl, CURLOPT_RANGE, +&buf.buf); strbuf_reset(&buf); } } else @@ -1526,10 +1527,6 @@ void release_http_pack_request(struct http_pack_request *preq) fclose(preq->packfile); preq->packfile = NULL; } - if (preq->range_header != NULL) { - curl_slist_free_all(preq->range_header); - preq->range_header = NULL; - } preq->slot = NULL; free(preq->url); free(preq); @@ -1631,10 +1628,8 @@ struct http_pack_request *new_http_pack_request( fprintf(stderr, "Resuming fetch of pack %s at byte %ld\n", sha1_to_hex(target->sha1), prev_posn); - xsnprintf(range, sizeof(range), "Range: bytes=%ld-", prev_posn); - preq->range_header = curl_slist_append(NULL, range); - curl_easy_setopt(preq->slot->curl, CURLOPT_HTTPHEADER, - preq->range_header); + xsnprintf(range, sizeof(range), "%ld-", prev_posn); + curl_easy_setopt(preq->slot->curl, CURLOPT_RANGE, range); } return preq; @@ -1685,7 +1680,6 @@ struct http_object_request *new_http_object_request(const char *base_url, ssize_t prev_read = 0; long prev_posn = 0; char range[RANGE_HEADER_SIZE]; - struct curl_slist *range_header = NULL; struct http_object_request *freq; freq = xcalloc(1, sizeof(*freq)); @@ -1791,10 +1785,8 @@ struct http_object_request *new_http_object_request(const char *base_url, fprintf(stderr, "Resuming fetch of object %s at byte %ld\n", hex, prev_posn); - xsnprintf(range, sizeof(range), "Range: bytes=%ld-", prev_posn); - range_header = curl_slist_append(range_header, range); - curl_easy_setopt(freq->slot->curl, -CURLOPT_HTTPHEADER, range_header); + xsnprintf(range, sizeof(range), "%ld-", prev_posn); + curl_easy_setopt(freq->slot->curl, CURLOPT_RANGE, range); } return freq; diff --git a/http.h b/http.h index 49afe39..4f97b60 100644 --- a/http.h +++ b/http.h @@ -190,7 +190,6 @@ struct http_pack_request { struct packed_git **lst; FILE *packfile; char tmpfile[PATH_MAX]; - struct curl_slist *range_header; struct active_request_slot *slot; }; -- 2.4.2.691.g714732c-twtrsrc -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html