Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server

2013-02-17 Thread Jonathan Nieder
Jeff King wrote:
> On Sun, Feb 17, 2013 at 03:05:34AM -0800, Jonathan Nieder wrote:
>> Jeff King wrote:

>>> +   if (verify_ref_advertisement(last->buf, last->len) < 0)
>>> +   die("ref advertisement is invalid at %s", refs_url);
>>
>> Won't this error out with
>>
>>  protocol error: bad line length character: ERR
>>
>> instead of the current more helpful behavior for ERR lines?
>
> I don't think so. Don't ERR lines appear inside their own packets?

Yes, I misread get_remote_heads for some reason.  Thanks for checking.

[...]
> The one thing we do also check, though, is that we end with a flush
> packet. So depending on what servers produce, it may mean we trigger
> this complaint instead of passing the ERR along to fetch-pack.
>
> Rather than doing this fake syntactic verification, I wonder if we
> should simply call get_remote_heads, which does a more thorough check

I'm not sure whether servers are expected to send a flush after an
ERR packet.  The only codepath I know of in git itself that sends
such packets is git-daemon, which does not flush after the error (but
is not used in the stateless-rpc case).  http-backend uses HTTP error
codes for its errors.

If I am reading get_remote_heads correctly, calling it with the
following tweak should work ok.  The extra thread is just to feed a
string into a fd-based interface and could be avoided for "list", too,
if it costs too much.

diff --git i/connect.c w/connect.c
index 49e56ba3..55ee7de7 100644
--- i/connect.c
+++ w/connect.c
@@ -68,7 +68,8 @@ struct ref **get_remote_heads(int in, struct ref **list,
 {
int got_at_least_one_head = 0;
 
-   *list = NULL;
+   if (list)
+   *list = NULL;
for (;;) {
struct ref *ref;
unsigned char old_sha1[20];
@@ -92,6 +93,9 @@ struct ref **get_remote_heads(int in, struct ref **list,
die("protocol error: expected sha/ref, got '%s'", 
buffer);
name = buffer + 41;
 
+   if (!list)
+   continue;
+
name_len = strlen(name);
if (len != name_len + 41) {
free(server_capabilities);
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server

2013-02-17 Thread Jeff King
On Sun, Feb 17, 2013 at 03:05:34AM -0800, Jonathan Nieder wrote:

> Jeff King wrote:
> 
> > If the smart HTTP response from the server is truncated for
> > any reason, we will get an incomplete ref advertisement. If
> > we then feed this incomplete list to "fetch-pack", one of a
> > few things may happen:
> >
> >   1. If the truncation is in a packet header, fetch-pack
> >  will notice the bogus line and complain.
> >
> >   2. If the truncation is inside a packet, fetch-pack will
> >  keep waiting for us to send the rest of the packet,
> >  which we never will.
> 
> Mostly harmless since the operator could hit ^C, but still unpleasant.

Fetching is not always interactive.  The deadlock I ran into (and again,
I am not sure if this fixes it or not, but it is _a_ deadlock) was on a
server farm doing a large number of "fetch && checkout && deploy"
operations. Only some of them hung, but it took a while to figure out
what was going on.

> [...]
> > This fortunately doesn't happen in the normal fetching
> > workflow, because git-fetch first uses the "list" command,
> > which feeds the refs to get_remote_heads, which does notice
> > the error. However, you can trigger it by sending a direct
> > "fetch" to the remote-curl helper.
> 
> Ah.  Would a test for this make sense?

A test would be great, if you can devise a way to reliably produce
truncated git output (but still valid http output). In the real-world
problem I had, I believe the truncation was caused by an intermediate
reverse proxy that hit a timeout. I simulated truncation by using netcat
to replay munged http headers and git output.

I suspect the simplest portable thing would be a static file of
truncated git output, served by apache, which would need custom
configuration to serve it with the correct content-type header. It
seemed like a lot of test infrastructure to check for a very specific
thing, so I abandoned trying to make a test.

> > +   if (verify_ref_advertisement(last->buf, last->len) < 0)
> > +   die("ref advertisement is invalid at %s", refs_url);
> 
> Won't this error out with
> 
>   protocol error: bad line length character: ERR
> 
> instead of the current more helpful behavior for ERR lines?

I don't think so. Don't ERR lines appear inside their own packets? We
are just verifying that our packets are syntactically correct here, and
my reading of get_remote_heads is that the ERR appears inside the
packetized data.

The one thing we do also check, though, is that we end with a flush
packet. So depending on what servers produce, it may mean we trigger
this complaint instead of passing the ERR along to fetch-pack.

Rather than doing this fake syntactic verification, I wonder if we
should simply call get_remote_heads, which does a more thorough check
(and is what we _would_ call in the list case, and what fetch-pack will
call once we pass data to it). It's slightly less efficient, in that it
starts a new thread and actually builds the linked list of refs. But it
probably isn't that big a deal (and normal operation does a "list" first
which does that _anyway_).

> Same stylistic comment about "what would it mean for the return value
> to be positive?" as in patch 2/3.

Same response. :)

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 3/3] remote-curl: sanity check ref advertisement from server

2013-02-17 Thread Jonathan Nieder
Jeff King wrote:

> If the smart HTTP response from the server is truncated for
> any reason, we will get an incomplete ref advertisement. If
> we then feed this incomplete list to "fetch-pack", one of a
> few things may happen:
>
>   1. If the truncation is in a packet header, fetch-pack
>  will notice the bogus line and complain.
>
>   2. If the truncation is inside a packet, fetch-pack will
>  keep waiting for us to send the rest of the packet,
>  which we never will.

Mostly harmless since the operator could hit ^C, but still unpleasant.

[...]
> This fortunately doesn't happen in the normal fetching
> workflow, because git-fetch first uses the "list" command,
> which feeds the refs to get_remote_heads, which does notice
> the error. However, you can trigger it by sending a direct
> "fetch" to the remote-curl helper.

Ah.  Would a test for this make sense?

[...]
> --- a/remote-curl.c
> +++ b/remote-curl.c
[...]
> @@ -174,6 +183,9 @@ static struct discovery* discover_refs(const char 
> *service)
>   die("smart-http metadata lines are invalid at %s",
>   refs_url);
>  
> + if (verify_ref_advertisement(last->buf, last->len) < 0)
> + die("ref advertisement is invalid at %s", refs_url);

Won't this error out with

protocol error: bad line length character: ERR

instead of the current more helpful behavior for ERR lines?

Same stylistic comment about "what would it mean for the return value
to be positive?" as in patch 2/3.

Aside from those two details, the idea looks sane, though.  Good
catch, and thanks for a pleasant read.

Good night,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html