Hi Richard,
On Wed, Jul 31, 2019 at 02:03:26PM -0700, Richard Russo wrote:
> I've attached a patch to fix a bug in http sampling on raw connections. This
> fixes crashes experienced with a frontend configuration similar to:
>
> frontend haproxy_v4_http_0
> mode tcp
> bind ipv4@:80
> default_backend chat
> tcp-request inspect-delay 60s
> tcp-request content reject unless { req_len gt 0 }
> acl example_com req.hdr(host) -m str example.com
> acl example_org req.hdr(host) -m str example.org
> acl example_net req.hdr(host) -m str example.net
> acl chat path -m str /chat
> acl post method -m str POST
> use_backend chat if chat post # explicitly send chunked transfer encoded
> chat to chatd
> use_backend cdn if example_com
> use_backend cdn if example_org
> use_backend cdn if example_net
>
> If the host header in inbound requests is not example.com, the check for
> example.org will use the wrong trash buffer, and sometimes results in
> crashes. I wasn't able to get a clear crash reproduction, but the call to
> htx_get_next_blk() in src/http_htx.c:http_find_header would be called with
> blk way out of range, and the loop will walk memory until it happens to find
> an exit condition, or it reads unmapped memory and segfaults. Most often, it
> seems the wrong trash buffer still has the old htx buffer, so it usually
> doesn't crash.
>
> I found the blk pointer was out of range in my core dumps, and added logging
> in htx_get_next when pos > htx->tail; from there it was pretty easy to figure
> out what happened.
>
> I solved this by using a thread local static buffer for the raw buffer to htx
> conversion, but there's probably other ways to fix it.
I think your patch is right. I'll let Christopher check it and merge it,
but indeed there's a problem here since buffers allocated using
get_trash_chunk() are just for local use and should never be passed
across functions since they will be reassigned.
Good catch, thank you!
Willy