Re: [RFC PATCH] BUG/MEDIUM: compression: Rewrite strong ETags

2019-01-29 Thread Willy Tarreau
On Tue, Jan 29, 2019 at 12:56:14PM +0100, Tim Düsterhus wrote:
> I just notice the `http_select_comp_reshdr` function. I guess I can put
> the ETag validation there and only check for strong / weak in
> `http_set_comp_reshdr`.

Yes, good idea!

Willy



Re: [RFC PATCH] BUG/MEDIUM: compression: Rewrite strong ETags

2019-01-29 Thread Tim Düsterhus
Willy,

Am 29.01.19 um 12:47 schrieb Tim Düsterhus:
> I initially implemented it as a `goto error`. That disables the actual
> compression of the body. Unfortunately the `Content-Encoding` header is
> already modified, thus the client expects gzip, but receives plain data.
> I could mitigate that by modifying the ETag header first which is the
> most likely one to fail.
> 
> Ideally the http_set_comp_reshdr / htx_set_comp_reshdr functions run
> atomically, if one header fails to be modified all of the headers revert
> to their original values. But this is currently not the case.
> 
> What do you think?
> 

I just notice the `http_select_comp_reshdr` function. I guess I can put
the ETag validation there and only check for strong / weak in
`http_set_comp_reshdr`.

Best regards
Tim Düsterhus



Re: [RFC PATCH] BUG/MEDIUM: compression: Rewrite strong ETags

2019-01-29 Thread Tim Düsterhus
Willy,

Am 29.01.19 um 04:05 schrieb Willy Tarreau:
>> FWIW: I have no idea what that "Warning header" in configuration.txt is. Do 
>> you
>> have any idea?
> 
> I have a vague memory about an old suggestion to update the Warning header
> when applying content transformations. Just found it, it's described here :
> 
>  https://tools.ietf.org/html/rfc7234#section-5.5
> 
> In practice it should only be affected for lossy compression, which is not
> our case.

Today I learned. Never heard about or seen that one.

>> If you feel that my code looks good you can merge despite the RFC tag. I 
>> don't
>> plan doing any more changes unless you complain.
> 
> I'm seeing a minor issue regarding the matched conditions :
> 
> [...]

You are correct. Fixed.

>> +/* This is a valid, strong, ETag. Convert it to a weak 
>> one. */
>> +trash.data = 8;
>> +if (trash.data + ctx.vlen + ctx.tws > trash.size)
>> +goto error;
>> +memcpy(trash.area, "ETag: W/", trash.data);
>> +memcpy(trash.area + trash.data, ctx.line + ctx.val, 
>> ctx.vlen + ctx.tws);
>> +trash.data += ctx.vlen;
>> +trash.area[trash.data] = '\0';
>> +ctx.idx = 0;
>> +while (http_find_full_header2("ETag", 4, 
>> ci_head(&s->res), &txn->hdr_idx, &ctx))
>> +http_remove_header2(msg, &txn->hdr_idx, &ctx);
>> +if (http_header_add_tail2(msg, &txn->hdr_idx, 
>> trash.area, trash.data) < 0)
>> +goto error;
>> +}
>> +else {
>> +/* This is an invalid ETag. */
>> +http_remove_header2(msg, &txn->hdr_idx, &ctx);
> 
> I'm personally worried about this specific change, because this one could
> have a strongly nasty effect in field, especially if backported to stable
> versions. I'd rather suggest that the compression turns a strong etag to
> a weak etag but doesn't modify other ones. There *are* definitely some
> implementations which don't send the double quotes. Just one example
> found here : https://bugs.launchpad.net/swift/+bug/1424614

Yes, I was worried about that as well, that's why I wasn't really sure
about which branches to backport to. In fact I made that mistake,
leaving out the quotes, myself in the past as well. RFC 7232 could need
some big fat "THE QUOTES ARE REQUIRED".

> However we could decide that we refuse to compress in case we're meeting
> such a situation, since the conditions are not met to reliably compress
> while perfectly respecting the standard. This way we'd just pass the
> response as-is without affecting it. This would also place some incentive
> on the server's owner to get it fixed to comply with the standard.

I initially implemented it as a `goto error`. That disables the actual
compression of the body. Unfortunately the `Content-Encoding` header is
already modified, thus the client expects gzip, but receives plain data.
I could mitigate that by modifying the ETag header first which is the
most likely one to fail.

Ideally the http_set_comp_reshdr / htx_set_comp_reshdr functions run
atomically, if one header fails to be modified all of the headers revert
to their original values. But this is currently not the case.

What do you think?

Best regards
Tim Düsterhus



Re: [RFC PATCH] BUG/MEDIUM: compression: Rewrite strong ETags

2019-01-28 Thread Willy Tarreau
Hi Tim,

On Mon, Jan 28, 2019 at 11:21:13PM +0100, Tim Duesterhus wrote:
> Please decide on which branches you feel this should be backported without
> breaking expected behavior. I guess 1.9 is safe, others might not for a patch
> release.

I'll take a deeper look after emitting 1.9.3.

> FWIW: I have no idea what that "Warning header" in configuration.txt is. Do 
> you
> have any idea?

I have a vague memory about an old suggestion to update the Warning header
when applying content transformations. Just found it, it's described here :

 https://tools.ietf.org/html/rfc7234#section-5.5

In practice it should only be affected for lossy compression, which is not
our case.

> If you feel that my code looks good you can merge despite the RFC tag. I don't
> plan doing any more changes unless you complain.

I'm seeing a minor issue regarding the matched conditions :

> + if (ctx.vlen > 4 && memcmp(ctx.line + ctx.val, "W/\"", 3) == 0 
> && ctx.line[ctx.val + ctx.vlen - 1] == '"') {

It should be "ctx.vlen >= 4" since it's enough to parse a valid, though empty 
string.

> + /* This is a valid, weak, ETag. Nothing to do here. */
> + }
> + else if (ctx.vlen > 2 && ctx.line[ctx.val] == '"' && 
> ctx.line[ctx.val + ctx.vlen - 1] == '"') {

Same here "ctx.vlen >= 2".

> + /* This is a valid, strong, ETag. Convert it to a weak 
> one. */
> + trash.data = 8;
> + if (trash.data + ctx.vlen + ctx.tws > trash.size)
> + goto error;
> + memcpy(trash.area, "ETag: W/", trash.data);
> + memcpy(trash.area + trash.data, ctx.line + ctx.val, 
> ctx.vlen + ctx.tws);
> + trash.data += ctx.vlen;
> + trash.area[trash.data] = '\0';
> + ctx.idx = 0;
> + while (http_find_full_header2("ETag", 4, 
> ci_head(&s->res), &txn->hdr_idx, &ctx))
> + http_remove_header2(msg, &txn->hdr_idx, &ctx);
> + if (http_header_add_tail2(msg, &txn->hdr_idx, 
> trash.area, trash.data) < 0)
> + goto error;
> + }
> + else {
> + /* This is an invalid ETag. */
> + http_remove_header2(msg, &txn->hdr_idx, &ctx);

I'm personally worried about this specific change, because this one could
have a strongly nasty effect in field, especially if backported to stable
versions. I'd rather suggest that the compression turns a strong etag to
a weak etag but doesn't modify other ones. There *are* definitely some
implementations which don't send the double quotes. Just one example
found here : https://bugs.launchpad.net/swift/+bug/1424614

However we could decide that we refuse to compress in case we're meeting
such a situation, since the conditions are not met to reliably compress
while perfectly respecting the standard. This way we'd just pass the
response as-is without affecting it. This would also place some incentive
on the server's owner to get it fixed to comply with the standard.

Thanks,
Willy