Hi Tim,

On Wed, Feb 27, 2019 at 01:23:28PM +0100, Tim Düsterhus wrote:
> As mentioned in my reply to Aleks I don't have any numbers, because I
> don't know how to get them. My knowledge of both HAProxy's internals and C
> is not strong enough to get those.
> 
> The manpage documents this:
> 
> >        BROTLI_PARAM_LGBLOCK
> >               Recommended input block size. Encoder may reduce this value,
> >               e.g. if input is much smaller than input block size.
> > 
> >        Range is from BROTLI_MIN_INPUT_BLOCK_BITS to
> >        BROTLI_MAX_INPUT_BLOCK_BITS.
> > 
> >        Note:
> >            Bigger input block size allows better compression, but consumes
> >            more memory.
> >            The rough formula of memory used for temporary input storage is
> >            3 << lgBlock.
> 
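
To put numbers on the formula quoted above, here is a minimal sketch that
only evaluates 3 << lgBlock. The two bounds are assumptions on my side:
they are meant to mirror BROTLI_MIN_INPUT_BLOCK_BITS and
BROTLI_MAX_INPUT_BLOCK_BITS from <brotli/encode.h> (16 and 24 as far as I
know), so please check them against your header. The function name is
made up for illustration.

```c
#include <stddef.h>

/* Assumed values, meant to mirror BROTLI_MIN_INPUT_BLOCK_BITS and
 * BROTLI_MAX_INPUT_BLOCK_BITS from <brotli/encode.h>; verify locally.
 */
#define ASSUMED_MIN_INPUT_BLOCK_BITS 16
#define ASSUMED_MAX_INPUT_BLOCK_BITS 24

/* Hypothetical helper: rough size in bytes of the temporary input
 * storage for one encoder, per the manpage formula "3 << lgBlock".
 * This covers the input storage only, not the encoder's other tables.
 */
static size_t input_storage_bytes(unsigned int lgblock)
{
	return (size_t)3 << lgblock;
}
```

With these bounds the input storage alone ranges from 192 kB
(lgblock = 16) to 48 MB (lgblock = 24) per stream, before counting
anything else the encoder allocates.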
> The default of this value depends on other configuration settings:
> https://github.com/google/brotli/blob/9cd01c0437e8b6010434d3491a348a5645de624b/c/enc/quality.h#L75-L92
> 
> It is the only place that talks about memory. There's also this (still
> open) issue: https://github.com/google/brotli/issues/389 "Functions to
> calculate approximate memory usage needed for compression and
> decompression".

This is quite scary, they're talking about 2.6 MB for the Huffman
tables. That's manageable in a browser, and sometimes on a server with
low traffic, but on a shared load balancer it's an immediate DoS.
They're saying that it's the worst case and that the majority of cases
are below 0.5 MB, which is still twice as much as zlib, which itself is
insane.

> > can document such limits and let users decide on their own. We'll
> > need the equivalent of maxzlibmem though (or better, we can reuse it
> > to keep a single tunable and indicate it serves for any compression
> > algo so that there isn't the issue of "what if two frontends use a
> > different compression algo").
> 
> I guess one has to plug in a custom allocator to do this.

Quite likely, which is another level of pain :-/
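
To give an idea of what this would involve, here is a rough sketch of a
pair of callbacks enforcing a global budget, in the spirit of maxzlibmem.
It is only an illustration under assumptions, not something that exists
in haproxy: struct mem_budget, budget_alloc() and budget_free() are
made-up names. The callbacks have the same shape as brotli_alloc_func and
brotli_free_func, so they could be passed to
BrotliEncoderCreateInstance() with the budget as the opaque pointer.
Since the free callback does not receive the size, each block carries a
small header to remember it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical accounting state, shared by all compression streams. */
struct mem_budget {
	size_t limit;	/* max bytes that may be handed out, 0 = no limit */
	size_t used;	/* bytes currently handed out */
};

/* Header placed in front of each block so that the free callback,
 * which is not told the size, can give the bytes back to the budget.
 */
union blk_hdr {
	size_t size;
	max_align_t align;	/* keeps the returned pointer aligned */
};

/* Same shape as brotli_alloc_func: returns NULL once over budget. */
static void *budget_alloc(void *opaque, size_t size)
{
	struct mem_budget *b = opaque;
	union blk_hdr *h;

	if (size > SIZE_MAX - sizeof(*h))
		return NULL;
	if (b->limit && (b->used > b->limit || size > b->limit - b->used))
		return NULL;	/* refuse: the caller sees an OOM */
	h = malloc(sizeof(*h) + size);
	if (!h)
		return NULL;
	h->size = size;
	b->used += size;
	return h + 1;
}

/* Same shape as brotli_free_func. */
static void budget_free(void *opaque, void *address)
{
	struct mem_budget *b = opaque;
	union blk_hdr *h;

	if (!address)
		return;
	h = (union blk_hdr *)address - 1;
	b->used -= h->size;
	free(h);
}
```

Note that this sketch is not thread-safe (the counter would need atomic
updates), and it only turns "too much memory" into "allocation failure",
so whatever sits above still has to cope with the failure cleanly.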

> The library
> appears to handle the OOM case (but I did not check what happens if the
> OOM is encountered halfway through compression).

The problem is not so much how the lib handles the OOM situation as
how 100% of the remaining code (haproxy, openssl, pcre, ...) handles it
once brotli makes this possibility a reality. We're always extremely
careful to make sure it still works in this situation by serializing
what can be, but we've already been hit by bugs in openssl and haproxy
at least.

For now, until we figure out a way to properly control the resource
usage of this lib, I'm not a big fan of merging it, as it's clear that
it *will* cause lots of trouble. Seeing users complain here on the list
is one thing, but thinking about their crashed or frozen LB in prod
is another one, and I'd rather not cross this boundary, especially
given the small gains we've seen, which very few people would consider
a valid justification for killing their production :-/

Cheers,
Willy
