Hi Olivier, On Tue, Apr 04, 2023 at 12:29:15AM +0200, Olivier Houchard wrote: > Hi, > > The attached patchset is a first attempt at adding the possibility to > compress requests, as well as responses.
This is pretty cool, I definitely see how this can be super useful :-) > It adds a new keyword for compression, "tocompress" (any better > alternative name would be appreciated). I would call it "direction" which I find more natural and understandable. > The valid values are "request", > if you want to compress requests, "response" if you want to compress > responses or "both", if you want to compress both. > The default is to compress responses only. I have a few below (besides renaming "tocompress"). > From 77a5657592338db2ed53189c50828b3c0ed52716 Mon Sep 17 00:00:00 2001 > From: Olivier Houchard <[email protected]> > Date: Tue, 4 Apr 2023 00:14:36 +0200 > Subject: [PATCH 2/2] MEDIUM: compression: allow to compress requests too. > > Make it so we can compress requests, as well as responses. > A new compress keyword is added, "tocompress". It takes exactly one > argument. The valid values for that arguments are "response", if we want > to compress only responses, "request", if we want to compress only > requests, or "both", if we want to compress both. > The dafault value is "response". (...) > diff --git a/include/haproxy/compression-t.h b/include/haproxy/compression-t.h > index cdbf0d14e..d15d15ee3 100644 > --- a/include/haproxy/compression-t.h > +++ b/include/haproxy/compression-t.h > @@ -37,6 +37,9 @@ > /* Compression flags */ > > #define COMP_FL_OFFLOAD 0x00000001 /* Compression offload */ > +#define COMP_FL_COMPRESS_REQ 0x00000002 /* Compress requests */ > +#define COMP_FL_COMPRESS_RES 0x00000004 /* Compress responses */ So these ones would be COMP_FL_DIR_{REQ,RES}. > struct comp { > struct comp_algo *algos; > struct comp_type *types; > diff --git a/src/flt_http_comp.c b/src/flt_http_comp.c > index 507009692..7680e241f 100644 > --- a/src/flt_http_comp.c > +++ b/src/flt_http_comp.c > @@ -33,7 +33,9 @@ struct flt_ops comp_ops; > > struct comp_state { > struct comp_ctx *comp_ctx; /* compression context */ > + struct comp_ctx *comp_ctx_req; /* compression context for request */ > struct comp_algo *comp_algo; /* compression algorithm if not NULL */ > + struct comp_algo *comp_algo_req; /* Algo to be used for request */ > unsigned int flags; /* COMP_STATE_* */ I find it confusing to have comp_ctx and comp_ctx_req (and same for algo). I'd rather see a preliminary patch renaming comp_ctx to comp_ctx_res and comp_algo to comp_algo_res (just perform a mechanical rename). Or you could also have an array [2] for each, which will even remove some ifs in the code. Also I don't understand how you can have different algos for each direction since the config language allows you to define "compression algo" and "compression tocompress" so you cannot (apparently) have a situation where the two algos differ. I think it can make sense to use different algos for each direction (at least compression settings, e.g. expensive compression for uploads, relayed over expensive links, little compression for downloads that possibly comes from a cache). Thus maybe an approach could be to use the algo to indicate the direction at the same time: compression algo foobar => sets response only, for compatibility compression algo-res foobar => sets response only, explicit syntax compression algo-req foobar => sets request only, explicit syntax Then there's probably no need to have a keyword to set it at once in both directions since both actions are totally independent and it's probably not desired to encourage users to conflate the two directions in a single keyword. You might need the same for "compression types" I think, so that you don't try to compress regular forms or login:password that will confuse applications, but only large data. > -static int set_compression_response_header(struct comp_state *st, > - struct stream *s, > - struct http_msg *msg); > +static int set_compression_header(struct comp_state *st, > + struct stream *s, > + struct http_msg *msg, > + int for_response); ^^^^^^^^^^^^^^^^ For this one as well I would call it "direction" (or "dir"), as we already do in samples and a few other parts that work in both directions. > +static void > +comp_prepare_compress_request(struct comp_state *st, struct stream *s, > struct http_msg *msg) > +{ > + struct htx *htx = htxbuf(&msg->chn->buf); > + struct http_txn *txn = s->txn; > + struct http_hdr_ctx ctx; > + > + ctx.blk = NULL; > + /* Already compressed, don't bother */ > + if (http_find_header(htx, ist("Content-Encoding"), &ctx, 1)) > + return; > + /* HTTP < 1.1 should not be compressed */ > + if (!(msg->flags & HTTP_MSGF_VER_11) || !(txn->req.flags & > HTTP_MSGF_VER_11)) > + return; > + > + if (txn->meth == HTTP_METH_HEAD) > + return; > + if (s->be->comp->algos != NULL) > + st->comp_algo_req = s->be->comp->algos; > + else if (strm_fe(s)->comp->algos != NULL) > + st->comp_algo_req = strm_fe(s)->comp->algos; > +/* limit compression rate */ > + if (global.comp_rate_lim > 0) > + if (read_freq_ctr(&global.comp_bps_in) > global.comp_rate_lim) > + goto fail; I had a doubt regarding this one but I think it's indeed reasonable to use the same global value for all directions as the goal was precisely to be able to enforce a total limit on all compressed streams to limit the resource impacts. > + /* limit cpu usage */ > + if (th_ctx->idle_pct < compress_min_idle) > + goto fail; > + > + /* initialize compression */ > + if (st->comp_algo_req->init(&st->comp_ctx_req, > global.tune.comp_maxlevel) < 0) > + goto fail; > + > + return; > +fail: > + st->comp_algo_req = NULL; > +} > + > static int > comp_http_headers(struct stream *s, struct filter *filter, struct http_msg > *msg) > { > struct comp_state *st = filter->ctx; > + int comp_flags = 0; > > if (!strm_fe(s)->comp && !s->be->comp) > goto end; > - > - if (!(msg->chn->flags & CF_ISRESP)) > - select_compression_request_header(st, s, msg); > - else { > + if (strm_fe(s)->comp) > + comp_flags |= strm_fe(s)->comp->flags; > + if (s->be->comp) > + comp_flags |= s->be->comp->flags; > + > + if (!(msg->chn->flags & CF_ISRESP)) { > + if (comp_flags & COMP_FL_COMPRESS_REQ) { > + comp_prepare_compress_request(st, s, msg); > + if (st->comp_algo_req) { > + if (!set_compression_header(st, s, msg, 0)) ^^^ This zero is all but explicit about the fact that it's the direction. Maybe you should create an enum for this, or you can decide to pass the compression flags (which could then be reused to pass other info in the future) and you'd have COMP_FL_DIR_REQ here, which is way more explicit. > + goto end; > + register_data_filter(s, msg->chn, filter); > + st->flags |= COMP_STATE_PROCESSING; > + } > + } > + if (comp_flags & COMP_FL_COMPRESS_RES) > + select_compression_request_header(st, s, msg); > + } else if (comp_flags & COMP_FL_COMPRESS_RES) { > /* Response headers have already been checked in > * comp_http_post_analyze callback. */ > if (st->comp_algo) { > - if (!set_compression_response_header(st, s, msg)) > + if (!set_compression_header(st, s, msg, 1)) Same here. > goto end; > register_data_filter(s, msg->chn, filter); > st->flags |= COMP_STATE_PROCESSING; > @@ -183,7 +244,9 @@ comp_http_payload(struct stream *s, struct filter > *filter, struct http_msg *msg, > enum htx_blk_type type = htx_get_blk_type(blk); > uint32_t sz = htx_get_blksz(blk); > struct ist v; > + int is_response; > > + is_response = msg->chn->flags & CF_ISRESP; Nitpicking but I'd do !!(msg->chn->flags & CF_ISRESP) here so that the day we start to extend flags to 64 bits we don't end up with a zero if CF_ISRESP is above 1<<31. > next = htx_get_next_blk(htx, blk); > while (next && htx_get_blk_type(next) == HTX_BLK_UNUSED) > next = htx_get_next_blk(htx, next); > @@ -207,8 +270,8 @@ comp_http_payload(struct stream *s, struct filter > *filter, struct http_msg *msg, > v.len = len; > } > > - ret = htx_compression_buffer_add_data(st, > v.ptr, v.len, &trash); > - if (ret < 0 || htx_compression_buffer_end(st, > &trash, last) < 0) > + ret = htx_compression_buffer_add_data(st, > v.ptr, v.len, &trash, is_response); > + if (ret < 0 || htx_compression_buffer_end(st, > &trash, last, is_response) < 0) > goto error; > BUG_ON(v.len != ret); > > @@ -228,7 +291,7 @@ comp_http_payload(struct stream *s, struct filter > *filter, struct http_msg *msg, > > case HTX_BLK_TLR: > case HTX_BLK_EOT: > - if (htx_compression_buffer_end(st, &trash, 1) < > 0) > + if (htx_compression_buffer_end(st, &trash, 1, > is_response) < 0) > goto error; > if (b_data(&trash)) { > struct htx_blk *last = > htx_add_last_data(htx, ist2(b_head(&trash), b_data(&trash))); > @@ -298,16 +361,22 @@ comp_http_end(struct stream *s, struct filter *filter, > > /***********************************************************************/ > static int > -set_compression_response_header(struct comp_state *st, struct stream *s, > struct http_msg *msg) > +set_compression_header(struct comp_state *st, struct stream *s, struct > http_msg *msg, int for_response) > { > struct htx *htx = htxbuf(&msg->chn->buf); > struct htx_sl *sl; > struct http_hdr_ctx ctx; > + struct comp_algo *comp_algo; > > sl = http_get_stline(htx); > if (!sl) > goto error; > > + if (for_response) > + comp_algo = st->comp_algo; > + else > + comp_algo = st->comp_algo_req; > + So this is one example where having an array for st->comp_algo could be nice. > /* add "Transfer-Encoding: chunked" header */ > if (!(msg->flags & HTTP_MSGF_TE_CHNK)) { > if (!http_add_header(htx, ist("Transfer-Encoding"), > ist("chunked"))) > @@ -348,8 +417,8 @@ set_compression_response_header(struct comp_state *st, > struct stream *s, struct > * Accept-Encoding header, and SHOULD NOT be used in the > Content-Encoding > * header. > */ > - if (st->comp_algo->cfg_name_len != 8 || memcmp(st->comp_algo->cfg_name, > "identity", 8) != 0) { > - struct ist v = ist2(st->comp_algo->ua_name, > st->comp_algo->ua_name_len); > + if (comp_algo->cfg_name_len != 8 || memcmp(comp_algo->cfg_name, > "identity", 8) != 0) { > + struct ist v = ist2(comp_algo->ua_name, comp_algo->ua_name_len); > > if (!http_add_header(htx, ist("Content-Encoding"), v)) > goto error; > @@ -358,8 +427,13 @@ set_compression_response_header(struct comp_state *st, > struct stream *s, struct > return 1; > > error: > - st->comp_algo->end(&st->comp_ctx); > - st->comp_algo = NULL; > + if (for_response) { > + st->comp_algo->end(&st->comp_ctx); > + st->comp_algo = NULL; > + } else { > + st->comp_algo_req->end(&st->comp_ctx_req); > + st->comp_algo_req = NULL; > + } And same here and a few other ones. I have not seen anything updating the stats, so I suspect that request and response compression stats are accounted as response stats, which can be a problem showing compression rates larger than 100%. If you hover over your frontend's "Bytes / Out" fields, you'll see a yellow box with numbers such as: Response bytes in: 468197882863 Compression in: 117157038990 Compression out: 93277104103 (79%) Compression bypass: 0 Total bytes saved: 23879934887 (5%) Here I'm figuring that we'll need to add the same for the upload, so these are extra stats. I can possibly help locate where to place them, but I'll need that you figure where they are updated. Thank you! Willy

