Hi Tim. Am 13.02.2019 um 17:57 schrieb Tim Duesterhus: > Willy, > Aleks, > List, > > this (absolutely non-ready-to-merge) patch adds support for brotli > compression as suggested in issue #21: > https://github.com/haproxy/haproxy/issues/21
Cool ;-) > It is tested on Ubuntu Xenial with libbrotli 1.0.3: > > [timwolla@~]apt-cache policy libbrotli-dev > libbrotli-dev: > Installed: 1.0.3-1ubuntu1~16.04.1 > Candidate: 1.0.3-1ubuntu1~16.04.1 > Version table: > *** 1.0.3-1ubuntu1~16.04.1 500 > 500 http://de.archive.ubuntu.com/ubuntu xenial-updates/main > amd64 Packages > 100 /var/lib/dpkg/status > [timwolla@~]apt-cache policy libbrotli1 > libbrotli1: > Installed: 1.0.3-1ubuntu1~16.04.1 > Candidate: 1.0.3-1ubuntu1~16.04.1 > Version table: > *** 1.0.3-1ubuntu1~16.04.1 500 > 500 http://de.archive.ubuntu.com/ubuntu xenial-updates/main > amd64 Packages > 100 /var/lib/dpkg/status > > I am successfully able access brotli compressed URLs with Google Chrome, > this requires me to disable `gzip` though (because haproxy prefers to > select gzip, I suspect because `br` is last in Chrome's `Accept-Encoding` > header). Does it change it when you use `br` as frist entry in `compression algo ... ` https://cbonte.github.io/haproxy-dconv/1.9/configuration.html#4.2-compression%20algo > I also am able to sucessfully download and decompress URLs with `curl` > and the `brotli` CLI utility. The server I use as the backend for these > tests has about 45ms RTT to my machine. The HTML page I use is some random > HTML page on the server, the noise file is 1 MiB of finest /dev/urandom. > > You'll notice that brotli compressed requests are both faster as well as > smaller compared to gzip with the hardcoded brotli compression quality > of 3. The default is 11, which is *way* slower than gzip. How much more/less/equal CPU usage have brotli compared to gzip? I'm a little bit disappointed from the size point of view, it is only ~6K less then gzip, is it worth the amount of work for such a small gain of data reduction. Regards Aleks > + curl localhost:8080/*snip*.html -H 'Accept-Encoding: gzip' > % Total % Received % Xferd Average Speed Time Time Time > Current > Dload Upload Total Spent Left > Speed > 100 49280 0 49280 0 0 279k 0 --:--:-- --:--:-- > --:--:-- 279k > + curl localhost:8080/*snip*.html -H 'Accept-Encoding: br' > % Total % Received % Xferd Average Speed Time Time Time > Current > Dload Upload Total Spent Left > Speed > 100 43401 0 43401 0 0 332k 0 --:--:-- --:--:-- > --:--:-- 333k > + curl localhost:8080/*snip*.html -H 'Accept-Encoding: identity' > % Total % Received % Xferd Average Speed Time Time Time > Current > Dload Upload Total Spent Left > Speed > 100 127k 100 127k 0 0 441k 0 --:--:-- --:--:-- > --:--:-- 441k > + curl localhost:8080/noise -H 'Accept-Encoding: gzip' > % Total % Received % Xferd Average Speed Time Time Time > Current > Dload Upload Total Spent Left > Speed > 100 1025k 0 1025k 0 0 3330k 0 --:--:-- --:--:-- > --:--:-- 3338k > + curl localhost:8080/noise -H 'Accept-Encoding: br' > % Total % Received % Xferd Average Speed Time Time Time > Current > Dload Upload Total Spent Left > Speed > 100 1024k 0 1024k 0 0 3029k 0 --:--:-- --:--:-- > --:--:-- 3030k > + curl localhost:8080/noise -H 'Accept-Encoding: identity' > % Total % Received % Xferd Average Speed Time Time Time > Current > Dload Upload Total Spent Left > Speed > 100 1024k 100 1024k 0 0 3003k 0 --:--:-- --:--:-- > --:--:-- 3002k > + ls -al > total 3384 > drwxrwxr-x 2 timwolla timwolla 4096 Feb 13 17:30 . > drwxrwxrwt 28 root root 69632 Feb 13 17:25 .. > -rw-rw-r-- 1 timwolla timwolla 598 Feb 13 17:30 download > -rw-rw-r-- 1 timwolla timwolla 43401 Feb 13 17:30 html-br > -rw-rw-r-- 1 timwolla timwolla 49280 Feb 13 17:30 html-gz > -rw-rw-r-- 1 timwolla timwolla 130334 Feb 13 17:30 html-id > -rw-rw-r-- 1 timwolla timwolla 1048949 Feb 13 17:30 noise-br > -rw-rw-r-- 1 timwolla timwolla 1049666 Feb 13 17:30 noise-gz > -rw-rw-r-- 1 timwolla timwolla 1048576 Feb 13 17:30 noise-id > ++ zcat html-gz > + sha256sum html-id /dev/fd/63 /dev/fd/62 > ++ brotli --decompress --stdout html-br > 56f1664241b3dbb750f93b69570be76c6baccb8de4f3a62fb4fec0ce1bf440b5 > html-id > 56f1664241b3dbb750f93b69570be76c6baccb8de4f3a62fb4fec0ce1bf440b5 > /dev/fd/63 > 56f1664241b3dbb750f93b69570be76c6baccb8de4f3a62fb4fec0ce1bf440b5 > /dev/fd/62 > ++ zcat noise-gz > + sha256sum noise-id /dev/fd/63 /dev/fd/62 > ++ brotli --decompress --stdout noise-br > ab23236d9d4acecec239c3f0f9b59e59dd043267eeed9ed723da8b15f46bbf33 > noise-id > ab23236d9d4acecec239c3f0f9b59e59dd043267eeed9ed723da8b15f46bbf33 > /dev/fd/63 > ab23236d9d4acecec239c3f0f9b59e59dd043267eeed9ed723da8b15f46bbf33 > /dev/fd/62 > > This patch still spits out a bunch of debug output to `stdout`, because I'm > not sure about the do-while loops in `_flush` and `_finish`. I suppose they > could lead to infinite loops if the encoded data exceeds the out buffer. It > might be necessary to add a new return code that indicates that the `_flush` > / `_finish` must be retried after making space in the output buffer, because > the libbrotli API states: > >> Under some circumstances (e.g. lack of output stream capacity) this >> operation would require several calls to BrotliEncoderCompressStream. >> The method must be called again until both input stream is depleted >> and encoder has no more output (see BrotliEncoderHasMoreOutput) after >> the method is called. > > and > >> Warning: >> When flushing and finishing, op should not change until operation is >> complete; input stream should not be swapped, reduced or extended as well. > > Thus `_add_data` *must not* be called until the flush succeeded completely. > > One more thing: brotli theoretically supports passing a custom allocator. > I attempted to use a pool for that, but `BrotliEncoderState` is an opaque > struct. > > Best regards > Tim Duesterhus > > Apply with `git am --scissors` to automatically cut the commit message. > -- >8 -- > see issue #21 > --- > Makefile | 9 +++ > include/types/compression.h | 37 ++++++++---- > src/compression.c | 113 +++++++++++++++++++++++++++++++++++- > src/flt_http_comp.c | 2 +- > 4 files changed, 149 insertions(+), 12 deletions(-) > > diff --git a/Makefile b/Makefile > index e2c4d17a..f1818d01 100644 > --- a/Makefile > +++ b/Makefile > @@ -39,6 +39,7 @@ > # USE_MY_ACCEPT4 : use own implemention of accept4() if glibc < 2.10. > # USE_ZLIB : enable zlib library support. > # USE_SLZ : enable slz library instead of zlib (pick at most > one). > +# USE_BROTLI : enable brotli library support. > # USE_CPU_AFFINITY : enable pinning processes to CPU on Linux. > Automatic. > # USE_TFO : enable TCP fast open. Supported on Linux >= 3.7. > # USE_NS : enable network namespace support. Supported on > Linux >= 2.6.24. > @@ -555,6 +556,14 @@ BUILD_OPTIONS += $(call ignore_implicit,USE_ZLIB) > OPTIONS_LDFLAGS += $(if $(ZLIB_LIB),-L$(ZLIB_LIB)) -lz > endif > > +ifneq ($(USE_BROTLI),) > +BROTLI_INC = > +BROTLI_LIB = > +OPTIONS_CFLAGS += -DUSE_BROTLI $(if $(BROTLI_INC),-I$(BROTLI_INC)) > +BUILD_OPTIONS += $(call ignore_implicit,USE_BROTLI) > +OPTIONS_LDFLAGS += $(if $(BROTLI_LIB),-L$(BROTLI_LIB)) -lbrotlienc > +endif > + > ifneq ($(USE_POLL),) > OPTIONS_CFLAGS += -DENABLE_POLL > OPTIONS_OBJS += src/ev_poll.o > diff --git a/include/types/compression.h b/include/types/compression.h > index e515aadf..3c953e66 100644 > --- a/include/types/compression.h > +++ b/include/types/compression.h > @@ -32,6 +32,10 @@ > #include <zlib.h> > #endif > > +#if defined(USE_BROTLI) > +#include <brotli/encode.h> > +#endif > + > #include <common/buffer.h> > > struct comp { > @@ -40,22 +44,35 @@ struct comp { > unsigned int offload; > }; > > +#if defined(USE_SLZ) || defined(USE_ZLIB) || defined(USE_BROTLI) > struct comp_ctx { > + union { > +#if defined(USE_SLZ) || defined(USE_ZLIB) > + struct { > #if defined(USE_SLZ) > - struct slz_stream strm; > - const void *direct_ptr; /* NULL or pointer to beginning of data */ > - int direct_len; /* length of direct_ptr if not NULL */ > - struct buffer queued; /* if not NULL, data already queued */ > + struct slz_stream strm; > + const void *direct_ptr; /* NULL or pointer to beginning > of data */ > + int direct_len; /* length of direct_ptr if not > NULL */ > + struct buffer queued; /* if not NULL, data already > queued */ > #elif defined(USE_ZLIB) > - z_stream strm; /* zlib stream */ > - void *zlib_deflate_state; > - void *zlib_window; > - void *zlib_prev; > - void *zlib_pending_buf; > - void *zlib_head; > + z_stream strm; /* zlib stream */ > + void *zlib_deflate_state; > + void *zlib_window; > + void *zlib_prev; > + void *zlib_pending_buf; > + void *zlib_head; > +#endif > + }; > #endif > +#if defined(USE_BROTLI) > + struct { > + BrotliEncoderState *brotli_state; > + }; > +#endif > + }; > int cur_lvl; > }; > +#endif > > /* Thanks to MSIE/IIS, the "deflate" name is ambigous, as according to the > RFC > * it's a zlib-wrapped deflate stream, but MSIE only understands a raw > deflate > diff --git a/src/compression.c b/src/compression.c > index b0307b23..01d46945 100644 > --- a/src/compression.c > +++ b/src/compression.c > @@ -26,6 +26,10 @@ > #undef free_func > #endif /* USE_ZLIB */ > > +#if defined(USE_BROTLI) > +#include <brotli/encode.h> > +#endif > + > #include <common/cfgparse.h> > #include <common/compat.h> > #include <common/hathreads.h> > @@ -94,6 +98,14 @@ static int deflate_end(struct comp_ctx **comp_ctx); > > #endif /* USE_ZLIB */ > > +#if defined(USE_BROTLI) > +static int brotli_init(struct comp_ctx **comp_ctx, int level); > +static int brotli_add_data(struct comp_ctx *comp_ctx, const char *in_data, > int in_len, struct buffer *out); > +static int brotli_flush(struct comp_ctx *comp_ctx, struct buffer *out); > +static int brotli_finish(struct comp_ctx *comp_ctx, struct buffer *out); > +static int brotli_end(struct comp_ctx **comp_ctx); > +#endif > + > > const struct comp_algo comp_algos[] = > { > @@ -107,6 +119,9 @@ const struct comp_algo comp_algos[] = > { "raw-deflate", 11, "deflate", 7, raw_def_init, deflate_add_data, > deflate_flush, deflate_finish, deflate_end }, > { "gzip", 4, "gzip", 4, gzip_init, deflate_add_data, > deflate_flush, deflate_finish, deflate_end }, > #endif /* USE_ZLIB */ > +#if defined(USE_BROTLI) > + { "brotli", 6, "br", 2, brotli_init, brotli_add_data, > brotli_flush, brotli_finish, brotli_end }, > +#endif > { NULL, 0, NULL, 0, NULL , NULL, > NULL, NULL, NULL } > }; > > @@ -246,7 +261,6 @@ static int identity_end(struct comp_ctx **comp_ctx) > return 0; > } > > - > #ifdef USE_SLZ > > /* SLZ's gzip format (RFC1952). Returns < 0 on error. */ > @@ -691,6 +705,103 @@ static int zlib_parse_global_windowsize(char **args, > int section_type, struct pr > > #endif /* USE_ZLIB */ > > +#ifdef USE_BROTLI > + > +static int brotli_init(struct comp_ctx **comp_ctx, int level) > +{ > + printf("brotli_init\n"); > + BrotliEncoderState *brotli_state; > + *comp_ctx = pool_alloc(pool_comp_ctx); > + if (*comp_ctx == NULL) > + return -1; > + brotli_state = BrotliEncoderCreateInstance(NULL, NULL, NULL); > + BrotliEncoderSetParameter(brotli_state, BROTLI_PARAM_QUALITY, 3); > + if (brotli_state == NULL) { > + pool_free(pool_comp_ctx, *comp_ctx); > + *comp_ctx = NULL; > + return -1; > + } > + (*comp_ctx)->brotli_state = brotli_state; > + > + return 0; > +} > + > +static int brotli_add_data(struct comp_ctx *comp_ctx, const char *in_data, > int in_len, struct buffer *out) > +{ > + printf("brotli_add_data\n"); > + uint8_t *out_data = (uint8_t*)b_tail(out); > + size_t out_len = b_room(out); > + size_t out_len2 = out_len; > + const uint8_t *in_data2 = (const uint8_t*)in_data; > + size_t in_len2 = in_len; > + > + printf("BrotliEncoderCompressStream\n"); > + printf("+in_len: %zu\n", in_len2); > + printf("+out_len: %zu\n", out_len2); > + if (BrotliEncoderCompressStream(comp_ctx->brotli_state, > BROTLI_OPERATION_PROCESS, &in_len2, &in_data2, &out_len2, &out_data, NULL) == > BROTLI_FALSE) { > + return -1; > + } > + printf("-in_len: %zu\n", in_len2); > + printf("-out_len: %zu = %zu\n", out_len2, out_len - out_len2); > + b_add(out, out_len - out_len2); > + > + return in_len - in_len2; > +} > + > +static int brotli_flush(struct comp_ctx *comp_ctx, struct buffer *out) > +{ > + printf("brotli_flush\n"); > + uint8_t *out_data = (uint8_t*)b_tail(out); > + size_t out_len = b_room(out); > + size_t out_len2 = out_len; > + size_t in_len2 = 0; > + > + do > + { > + printf("BrotliEncoderCompressStream\n"); > + printf("+out_len: %zu\n", out_len2); > + if (BrotliEncoderCompressStream(comp_ctx->brotli_state, > BROTLI_OPERATION_FLUSH, &in_len2, NULL, &out_len2, &out_data, NULL) == > BROTLI_FALSE) { > + return -1; > + } > + printf("-out_len: %zu = %zu\n", out_len2, out_len - out_len2); > + } > + while (BrotliEncoderHasMoreOutput(comp_ctx->brotli_state) == > BROTLI_TRUE); > + b_add(out, out_len - out_len2); > + printf("Done\n"); > + return out_len - out_len2; > +} > +static int brotli_finish(struct comp_ctx *comp_ctx, struct buffer *out) > +{ > + printf("brotli_finish\n"); > + uint8_t *out_data = (uint8_t*)b_tail(out); > + size_t out_len = b_room(out); > + size_t out_len2 = out_len; > + size_t in_len = 0; > + > + do > + { > + printf("BrotliEncoderCompressStream\n"); > + printf("+out_len: %zu\n", out_len2); > + if (BrotliEncoderCompressStream(comp_ctx->brotli_state, > BROTLI_OPERATION_FINISH, &in_len, NULL, &out_len2, &out_data, NULL) == > BROTLI_FALSE) { > + return -1; > + } > + printf("-out_len: %zu = %zu\n", out_len2, out_len - out_len2); > + } > + while (BrotliEncoderHasMoreOutput(comp_ctx->brotli_state) == > BROTLI_TRUE); > + b_add(out, out_len - out_len2); > + printf("Done\n"); > + return out_len - out_len2; > +} > +static int brotli_end(struct comp_ctx **comp_ctx) > +{ > + printf("brotli_end\n"); > + BrotliEncoderDestroyInstance((*comp_ctx)->brotli_state); > + pool_free(pool_comp_ctx, *comp_ctx); > + *comp_ctx = NULL; > + return 0; > +} > + > +#endif > > /* config keyword parsers */ > static struct cfg_kw_list cfg_kws = {ILH, { > diff --git a/src/flt_http_comp.c b/src/flt_http_comp.c > index 3c2ce365..bedc9e6f 100644 > --- a/src/flt_http_comp.c > +++ b/src/flt_http_comp.c > @@ -1178,7 +1178,7 @@ http_compression_buffer_end(struct comp_state *st, > struct stream *s, > char *tail; > int to_forward, left; > > -#if defined(USE_SLZ) || defined(USE_ZLIB) > +#if defined(USE_SLZ) || defined(USE_ZLIB) || defined(USE_BROTLI) > int ret; > > /* flush data here */ >