On Wed, Feb 20, 2019 at 04:14:47PM -0800, Jonathan Tan wrote:
> This is part of the work of CDN offloading of fetch responses.
>
> I have plans to use the http_pack_request suite of functions to
> implement the part where we download from CDN over HTTP(S), but need
> this change to be able to do so. I think it's better from the code
> quality perspective to reuse these functions, but this necessitates a
> behavior change in that we no longer use the filename as declared by the
> server, so I'm sending this as RFC to see what the community thinks.
I think it makes sense. We don't use the server names for any of the
other protocols, and I'm happy to see one less place where we may
inherit a stupid or malicious item of data from the server.
> diff --git a/http-push.c b/http-push.c
> index b22c7caea0..409b266b0c 100644
> --- a/http-push.c
> +++ b/http-push.c
> @@ -586,11 +586,16 @@ static void finish_request(struct transfer_request
> *request)
> fprintf(stderr, "Unable to get pack file %s\n%s",
> request->url, curl_errorstr);
> } else {
> + char *lockfile;
> +
> preq = (struct http_pack_request *)request->userData;
>
> if (preq) {
> - if (finish_http_pack_request(preq) == 0)
> + if (finish_http_pack_request(preq,
> + &lockfile) == 0) {
> + unlink(lockfile);
> fail = 0;
> + }
> release_http_pack_request(preq);
> }
> }
I was puzzled that you had to touch http-push.c. But indeed, it seems to
have some fetching code in it, too? I'm willing to throw up my hands in
disgust at the http-push code without looking further at this point. :)
> argv_array_push(&ip.args, "index-pack");
> - argv_array_pushl(&ip.args, "-o", tmp_idx, NULL);
> - argv_array_push(&ip.args, preq->tmpfile.buf);
> + argv_array_push(&ip.args, "--stdin");
> + argv_array_pushf(&ip.args, "--keep=git %"PRIuMAX, (uintmax_t)getpid());
> ip.git_cmd = 1;
> - ip.no_stdin = 1;
> - ip.no_stdout = 1;
> + ip.in = tmpfile_fd;
> + ip.out = -1;
>
> - if (run_command(&ip)) {
> - unlink(preq->tmpfile.buf);
> - unlink(tmp_idx);
> - free(tmp_idx);
> - return -1;
> + if (start_command(&ip)) {
> + ret = -1;
> + goto cleanup;
> }
>
> - unlink(sha1_pack_index_name(p->sha1));
> + *lockfile = index_pack_lockfile(ip.out);
> + close(ip.out);
We're now doing bi-directional I/O with index-pack. But it should be
deadlock-free, because we know the output is small and will only come at
the end after we've closed its stdin.
> - if (finalize_object_file(preq->tmpfile.buf, sha1_pack_name(p->sha1))
> - || finalize_object_file(tmp_idx, sha1_pack_index_name(p->sha1))) {
> - free(tmp_idx);
> - return -1;
> + if (finish_command(&ip)) {
> + ret = -1;
> + goto cleanup;
> }
If the command fails but we got something in *lockfile, should we clean
it up? Likewise, do we need to be installing a signal handler to clean
it up in case we die in other code paths (or by a signal)?
-Peff