On Thu, Apr 06, 2023 at 01:12:09AM +0200, Olivier Houchard wrote:
> > Also I don't understand how you can have different algos for each direction
> > since the config language allows you to define "compression algo" and
> > "compression tocompress" so you cannot (apparently) have a situation where
> > the two algos differ. I think it can make sense to use different algos for
> > each direction (at least compression settings, e.g. expensive compression
> > for uploads, relayed over expensive links, little compression for downloads
> > that possibly comes from a cache).
>
> The idea was that while only one algo could be used for requests, the
> response algorithm would depend on what was supported by the client.
Ah yes indeed.
> > > -static int set_compression_response_header(struct comp_state *st,
> > > - struct stream *s,
> > > - struct http_msg *msg);
> > > +static int set_compression_header(struct comp_state *st,
> > > + struct stream *s,
> > > + struct http_msg *msg,
> > > + int for_response);
> > ^^^^^^^^^^^^^^^^
> > For this one as well I would call it "direction" (or "dir"), as we
> > already do in samples and a few other parts that work in both
> > directions.
>
> Ok I was about to do that, but somebody from the haproxy team whom I
> won't name told me it was even better to just deduce the direction from
> the http_msg, so I did just that.
:-) I thought the same but suspected that you needed it for some other
reason.
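Just to make it explicit for the archives, the deduction boils down to
testing the channel's CF_ISRESP flag, roughly something like this (sketch
only, with a made-up helper name, based on the hunk quoted further below):

    static inline int comp_dir_from_msg(const struct http_msg *msg)
    {
        /* the response channel carries CF_ISRESP, so requests map to 0
         * and responses to 1, matching the per-direction array indexes
         */
        return (msg->chn->flags & CF_ISRESP) ? 1 : 0;
    }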
> > > +/* limit compression rate */
> > > + if (global.comp_rate_lim > 0)
> > > + if (read_freq_ctr(&global.comp_bps_in) > global.comp_rate_lim)
> > > + goto fail;
> >
> > I had a doubt regarding this one but I think it's indeed reasonable to
> > use the same global value for all directions as the goal was precisely
> > to be able to enforce a total limit on all compressed streams to limit
> > the resource impacts.
>
> I think it makes sense, unless someone wants to give more priority to
> request compression vs response compression, or vice versa, but I'll
> rethink it if somebody gives me a valid use-case.
We'll see, but I strongly doubt this would ever happen because those who
want to enforce speed limits are often hosting providers who don't want
a single site to eat all their CPU, so they configure maximum compression
bandwidths. It dates back to the era when it was still common to start
several daemons, which is no longer the case these days. So if anything
were to change in the short term it would rather be to have per-frontend
and/or per-backend total limits than to split req/res, which still
concern the same application.
> > > + if (!(msg->chn->flags & CF_ISRESP)) {
> > > + if (comp_flags & COMP_FL_COMPRESS_REQ) {
> > > + comp_prepare_compress_request(st, s, msg);
> > > + if (st->comp_algo_req) {
> > > + if (!set_compression_header(st, s, msg, 0))
> >
> > ^^^
> > This zero is all but explicit about the fact that it's the direction.
> > Maybe you should create an enum for this, or you can decide to pass
> > the compression flags (which could then be reused to pass other info
> > in the future) and you'd have COMP_FL_DIR_REQ here, which is way more
> > explicit.
>
> Yes this is fixed, I don't use the flag but introduced
> COMP_DIR_REQ/COMP_DIR_RES, which nicely matches the index of the
> corresponding arrays.
Cool, thanks!
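Just for the archive, I imagine it boils down to something as simple as
this (sketch only), with the values doubling as the array indexes:

    enum {
        COMP_DIR_REQ = 0,   /* request direction,  index 0 */
        COMP_DIR_RES = 1,   /* response direction, index 1 */
    };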
> > I have not seen anything updating the stats, so I suspect that request
> > and response compression stats are accounted as response stats, which
> > can be a problem showing compression rates larger than 100%. If you hover
> > over your frontend's "Bytes / Out" fields, you'll see a yellow box with
> > numbers such as:
> >
> > Response bytes in: 468197882863
> > Compression in: 117157038990
> > Compression out: 93277104103 (79%)
> > Compression bypass: 0
> > Total bytes saved: 23879934887 (5%)
> >
> > Here I'm figuring that we'll need to add the same for the upload, so
> > these are extra stats. I can possibly help locate where to place them,
> > but I'll need that you figure where they are updated.
>
> Ok so I've had mixed feelings about this, because I thought a global
> counter could be useful too. But ultimately I think it's better to keep
> them separated, if only so that the percentages make sense.
Yes, and another point is that when you start with a global counter and
later decide to introduce a separate one for a single direction, so that
subtracting it from the total gives the other direction, it's not
uncommon to see negative numbers due to rounding errors or timing, and
that's extremely confusing for those who have to deal with it. Thus I
prefer to let users sum the data on their side rather than provide
totals now and separate counters later.
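For the record, if I read the numbers above right, the 79% is comp_out
relative to comp_in (93277104103 / 117157038990), and the 5% is the total
saved (117157038990 - 93277104103 = 23879934887) relative to the response
bytes in (468197882863), so the request-side tooltip will just follow the
same logic against the frontend's bytes in.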
> We could
> have had Request + Response bytes there, but I think it's less useful.
> So I added the counters, but do not expose them for requests yet, as I'm
> unsure where to do that exactly, but that can easily be addressed with a
> separate commit.
Yes, we can have a look later. I think we'll add new metrics to the stats
that will appear in a new tooltip on the bytes/in column of the stats
page, and that will be OK.
Just a few tiny cosmetic comments below:
> From: Olivier Houchard <[email protected]>
> Date: Wed, 5 Apr 2023 17:32:36 +0200
> Subject: [PATCH 3/5] MINOR: compression: Store algo and type for both request
> and response
(...)
> diff --git a/include/haproxy/compression-t.h b/include/haproxy/compression-t.h
> index 685110331..af51654d9 100644
> --- a/include/haproxy/compression-t.h
> +++ b/include/haproxy/compression-t.h
> @@ -43,8 +43,10 @@
>
> #define COMP_FL_OFFLOAD 0x00000001 /* Compression offload */
> struct comp {
> - struct comp_algo *algos;
> - struct comp_type *types;
> + struct comp_algo *algos_res; /* Algos available for response */
> + struct comp_algo *algo_req; /* Algo to use for request */
> + struct comp_type *types_req; /* Types to be compressed for requests */
> + struct comp_type *types_res; /* Types to be compressed for responses */
^^^
It would be nice to align the comments better for easier reading.
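E.g. something like this (whitespace only):

    struct comp_algo *algos_res;  /* Algos available for response */
    struct comp_algo *algo_req;   /* Algo to use for request */
    struct comp_type *types_req;  /* Types to be compressed for requests */
    struct comp_type *types_res;  /* Types to be compressed for responses */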
> From: Olivier Houchard <[email protected]>
> Date: Thu, 6 Apr 2023 00:33:01 +0200
> Subject: [PATCH 4/5] MINOR: compression: Count separately request and response
> compression
(...)
> diff --git a/include/haproxy/counters-t.h b/include/haproxy/counters-t.h
> index 849f09683..2fd531125 100644
> --- a/include/haproxy/counters-t.h
> +++ b/include/haproxy/counters-t.h
> @@ -36,9 +36,10 @@ struct fe_counters {
> long long bytes_in; /* number of bytes transferred
> from the client to the server */
> long long bytes_out; /* number of bytes transferred
> from the server to the client */
>
> - long long comp_in; /* input bytes fed to the
> compressor */
> - long long comp_out; /* output bytes emitted by the
> compressor */
> - long long comp_byp; /* input bytes that bypassed
> the compressor (cpu/ram/bw limitation) */
> + /* compression counters, index 0 for requests, 1 for responses */
> + long long comp_in[2]; /* input bytes fed to the
> compressor */
> + long long comp_out[2]; /* output bytes emitted by
> the compressor */
> + long long comp_byp[2]; /* input bytes that bypassed
> the compressor (cpu/ram/bw limitation) */
Here as well, I sense that adding [2] pushed the comments to the right.
> long long denied_req; /* blocked requests because of
> security concerns */
> long long denied_resp; /* blocked responses because of
> security concerns */
> @@ -80,9 +81,10 @@ struct be_counters {
> long long bytes_in; /* number of bytes transferred
> from the client to the server */
> long long bytes_out; /* number of bytes transferred
> from the server to the client */
>
> - long long comp_in; /* input bytes fed to the
> compressor */
> - long long comp_out; /* output bytes emitted by the
> compressor */
> - long long comp_byp; /* input bytes that bypassed
> the compressor (cpu/ram/bw limitation) */
> + /* compression counters, index 0 for requests, 1 for responses */
> + long long comp_in[2]; /* input bytes fed to the
> compressor */
> + long long comp_out[2]; /* output bytes emitted by
> the compressor */
> + long long comp_byp[2]; /* input bytes that bypassed
> the compressor (cpu/ram/bw limitation) */
And here.
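For example:

    /* compression counters, index 0 for requests, 1 for responses */
    long long comp_in[2];   /* input bytes fed to the compressor */
    long long comp_out[2];  /* output bytes emitted by the compressor */
    long long comp_byp[2];  /* input bytes that bypassed the compressor
                             * (cpu/ram/bw limitation) */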
> From: Olivier Houchard <[email protected]>
> Date: Thu, 6 Apr 2023 00:33:48 +0200
> Subject: [PATCH 5/5] MEDIUM: compression: Make it so we can compress requests
> as well.
(...)
> diff --git a/doc/configuration.txt b/doc/configuration.txt
> index 3468c78f3..63e8d3440 100644
> --- a/doc/configuration.txt
> +++ b/doc/configuration.txt
> @@ -5099,7 +5111,15 @@ compression offload
> If this setting is used in a defaults section, a warning is emitted and the
> option is ignored.
>
> - See also : "compression type", "compression algo"
> + See also : "compression type", "compression algo", "compression "
^^^
A word is missing here; I suspect it was "direction", since that one
references "offload", which is where we are.
OK, otherwise it looks good to me. I suggest you adjust these cosmetic
details and push it directly.
I have one question by the way: how did you manage to test this?
Did you find a server that supports decompressing requests, or do you
only have some in-house stuff for this? I'm wondering if we could manage
to make a regtest for this, probably at least by putting some expect
rules on the content-encoding header.
Thanks!
Willy