Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server
Jeff King wrote: > On Sun, Feb 17, 2013 at 03:05:34AM -0800, Jonathan Nieder wrote: >> Jeff King wrote: >>> + if (verify_ref_advertisement(last->buf, last->len) < 0) >>> + die("ref advertisement is invalid at %s", refs_url); >> >> Won't this error out with >> >> protocol error: bad line length character: ERR >> >> instead of the current more helpful behavior for ERR lines? > > I don't think so. Don't ERR lines appear inside their own packets? Yes, I misread get_remote_heads for some reason. Thanks for checking. [...] > The one thing we do also check, though, is that we end with a flush > packet. So depending on what servers produce, it may mean we trigger > this complaint instead of passing the ERR along to fetch-pack. > > Rather than doing this fake syntactic verification, I wonder if we > should simply call get_remote_heads, which does a more thorough check I'm not sure whether servers are expected to send a flush after an ERR packet. The only codepath I know of in git itself that sends such packets is git-daemon, which does not flush after the error (but is not used in the stateless-rpc case). http-backend uses HTTP error codes for its errors. If I am reading get_remote_heads correctly, calling it with the following tweak should work ok. The extra thread is just to feed a string into a fd-based interface and could be avoided for "list", too, if it costs too much. diff --git i/connect.c w/connect.c index 49e56ba3..55ee7de7 100644 --- i/connect.c +++ w/connect.c @@ -68,7 +68,8 @@ struct ref **get_remote_heads(int in, struct ref **list, { int got_at_least_one_head = 0; - *list = NULL; + if (list) + *list = NULL; for (;;) { struct ref *ref; unsigned char old_sha1[20]; @@ -92,6 +93,9 @@ struct ref **get_remote_heads(int in, struct ref **list, die("protocol error: expected sha/ref, got '%s'", buffer); name = buffer + 41; + if (!list) + continue; + name_len = strlen(name); if (len != name_len + 41) { free(server_capabilities); -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server
On Sun, Feb 17, 2013 at 03:05:34AM -0800, Jonathan Nieder wrote: > Jeff King wrote: > > > If the smart HTTP response from the server is truncated for > > any reason, we will get an incomplete ref advertisement. If > > we then feed this incomplete list to "fetch-pack", one of a > > few things may happen: > > > > 1. If the truncation is in a packet header, fetch-pack > > will notice the bogus line and complain. > > > > 2. If the truncation is inside a packet, fetch-pack will > > keep waiting for us to send the rest of the packet, > > which we never will. > > Mostly harmless since the operator could hit ^C, but still unpleasant. Fetching is not always interactive. The deadlock I ran into (and again, I am not sure if this fixes it or not, but it is _a_ deadlock) was on a server farm doing a large number of "fetch && checkout && deploy" operations. Only some of them hung, but it took a while to figure out what was going on. > [...] > > This fortunately doesn't happen in the normal fetching > > workflow, because git-fetch first uses the "list" command, > > which feeds the refs to get_remote_heads, which does notice > > the error. However, you can trigger it by sending a direct > > "fetch" to the remote-curl helper. > > Ah. Would a test for this make sense? A test would be great, if you can devise a way to reliably produce truncated git output (but still valid http output). In the real-world problem I had, I believe the truncation was caused by an intermediate reverse proxy that hit a timeout. I simulated truncation by using netcat to replay munged http headers and git output. I suspect the simplest portable thing would be a static file of truncated git output, served by apache, which would need custom configuration to serve it with the correct content-type header. It seemed like a lot of test infrastructure to check for a very specific thing, so I abandoned trying to make a test. > > + if (verify_ref_advertisement(last->buf, last->len) < 0) > > + die("ref advertisement is invalid at %s", refs_url); > > Won't this error out with > > protocol error: bad line length character: ERR > > instead of the current more helpful behavior for ERR lines? I don't think so. Don't ERR lines appear inside their own packets? We are just verifying that our packets are syntactically correct here, and my reading of get_remote_heads is that the ERR appears inside the packetized data. The one thing we do also check, though, is that we end with a flush packet. So depending on what servers produce, it may mean we trigger this complaint instead of passing the ERR along to fetch-pack. Rather than doing this fake syntactic verification, I wonder if we should simply call get_remote_heads, which does a more thorough check (and is what we _would_ call in the list case, and what fetch-pack will call once we pass data to it). It's slightly less efficient, in that it starts a new thread and actually builds the linked list of refs. But it probably isn't that big a deal (and normal operation does a "list" first which does that _anyway_). > Same stylistic comment about "what would it mean for the return value > to be positive?" as in patch 2/3. Same response. :) -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server
Jeff King wrote: > If the smart HTTP response from the server is truncated for > any reason, we will get an incomplete ref advertisement. If > we then feed this incomplete list to "fetch-pack", one of a > few things may happen: > > 1. If the truncation is in a packet header, fetch-pack > will notice the bogus line and complain. > > 2. If the truncation is inside a packet, fetch-pack will > keep waiting for us to send the rest of the packet, > which we never will. Mostly harmless since the operator could hit ^C, but still unpleasant. [...] > This fortunately doesn't happen in the normal fetching > workflow, because git-fetch first uses the "list" command, > which feeds the refs to get_remote_heads, which does notice > the error. However, you can trigger it by sending a direct > "fetch" to the remote-curl helper. Ah. Would a test for this make sense? [...] > --- a/remote-curl.c > +++ b/remote-curl.c [...] > @@ -174,6 +183,9 @@ static struct discovery* discover_refs(const char > *service) > die("smart-http metadata lines are invalid at %s", > refs_url); > > + if (verify_ref_advertisement(last->buf, last->len) < 0) > + die("ref advertisement is invalid at %s", refs_url); Won't this error out with protocol error: bad line length character: ERR instead of the current more helpful behavior for ERR lines? Same stylistic comment about "what would it mean for the return value to be positive?" as in patch 2/3. Aside from those two details, the idea looks sane, though. Good catch, and thanks for a pleasant read. Good night, Jonathan -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html