Re: Paid feature development: TCP stream compression

2022-05-23 Thread Mark Zealey

On 20/05/2022 22:15, Willy Tarreau wrote:
> On Fri, May 20, 2022 at 12:16:07PM +0100, Mark Zealey wrote:
>> Thanks, we may use this for a very rough proof-of-concept. However we are
>> dealing with millions of concurrent connections, 10-100 million connections
>> per day, so we'd prefer to pay someone to develop (+ test!) something for
>> haproxy which will work at this scale

> That's a big problem with gzip. While compression can be done stateless
> (which is what we're doing with SLZ)

This is very useful, thank you; I had not come across SLZ before.
Unfortunately, from reading the documentation it appears that our packet
sizes would not be big enough to benefit from it (I don't have exact
stats, but I believe the median packet size on a long-lived connection
would be around 200 bytes uncompressed), although something with a
predefined dictionary (e.g. Brotli, or zstd with a custom dictionary) may
work well with this method. It may also be possible to use something like
Brotli, where you can define a window size to reduce the amount of
decompression memory required.
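
As an illustration of the preset-dictionary and window-size ideas above, here is a minimal Python sketch. It uses zlib rather than Brotli or zstd, purely because zlib ships with Python; the dictionary contents, the 1 KiB window (wbits=10) and the helper names are assumptions made up for this example, not measurements and not anything that exists in haproxy.

```python
import zlib

# Hypothetical preset dictionary: byte strings we assume recur in the traffic.
ZDICT = (
    b"<presence/><iq type='result' id=''/>"
    b"<message from='' to='' type='chat'><body></body></message>"
)

# Assumed window: 2**10 = 1 KiB of history instead of the default 32 KiB.
WBITS = 10


def make_codec():
    """Create one compressor/decompressor pair for a single connection."""
    comp = zlib.compressobj(wbits=WBITS, zdict=ZDICT)
    decomp = zlib.decompressobj(wbits=WBITS, zdict=ZDICT)
    return comp, decomp


def pack(comp, message):
    """Compress one message and flush so the peer can decode it at once."""
    return comp.compress(message) + comp.flush(zlib.Z_SYNC_FLUSH)


def unpack(decomp, wire):
    """Decompress whatever bytes arrived for this connection."""
    return decomp.decompress(wire)


if __name__ == "__main__":
    comp, decomp = make_codec()
    msg = (b"<message from='alice@example.org' to='bob@example.org' "
           b"type='chat'><body>hello</body></message>")
    wire = pack(comp, msg)
    print(len(msg), "bytes ->", len(wire), "bytes on the wire")
    assert unpack(decomp, wire) == msg
```

Both ends must be configured with the same dictionary and a compatible window size, so this only works where the client is under the operator's control, as described later in this message.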



> decompression uses roughly 256 kB
> of permanent RAM per *stream*. That's 256 GB of RAM for just one million
> connections, 1 TB of RAM for just 4 million connections.

To be honest with you, whilst it would be nice to avoid this cost, it's
not the end of the world. Even on AWS such servers are relatively
affordable.




> Sadly, that's a
> perfect example of use case that requires extreme horizontal scalability
> and that's better kept away from the LB and performed on servers.

We are currently absorbing this compression/memory cost on the backend
servers. However, as this is a stateful cluster which cannot scale
horizontally so easily (unlike independent HTTP servers), it would be
better to keep the backend smaller and let the frontend load balancers
(which are independent and infinitely horizontally scalable in our
infrastructure) handle this offload.




> Isn't there any way to advertise support or lack of for compression ?
> Because if your real need is to extract contents to perform any form of
> processing, we could imagine that instead of having to decompress the
> stream, it would be better if we could interfere with the connection
> setup to disable compression.

The real need for us is to conserve bandwidth, and we are happy to spend
money on hardware/development to do this. XMPP already has in-protocol
compression support, which we are using and which works nicely, but I
would prefer to set it up on the frontend as a new TCP port (actually,
soon I would like to experiment with using QUIC for this -
https://github.com/haproxytech/quic-dev/issues/1). As we fully control
the end-user client of the service, we can simply switch to a port which
is compression-only (handled by haproxy) rather than having to do this
at the protocol level. This would also save us a reasonably significant
amount of bandwidth from the plaintext preambles currently used to
establish compression.




> I'm just trying to figure a reasonable alternative, because like this in
> addition to being extremely unscalable, it makes your LBs a trivial DoS
> target.

I agree about the high memory usage; DoS is the main concern here.
However, we can mitigate this in a few ways which I'd rather not go into
in public.




> Compression is not done on TCP but since it's done using a filter that
> deals with HTTP compression, I imagine that it wouldn't be too hard to
> modify the filter not to emit HTTP chunks and to work on top of plain
> TCP. My real concern is for the decompression. At 256 kB per stream,
> that's quite a no-go

So I spent a few hours looking into this on Friday (prior to your
suggestion), writing a basic filter to handle TCP-level stream
compression/decompression. I have this partially working, in that it will
do TCP-level compression thanks to the great http/compression
abstraction already available. However, whereas with HTTP it appears you
can modify the contents quite easily using the HTX abstractions, I
cannot see from the documentation (and I tried a number of things in the
code) how to modify the response from a non-HTX buffer such as would be
used in this case. I also did not look into (and don't know if it's
possible with the current filter library hooks) how to do decompression
on the stream going f->b. So I'm still looking for someone with a lot of
haproxy experience whom I can pay for this dev work, as I think there's a
chance it may also involve some core changes to the filter mechanisms.


Mark




Re: Paid feature development: TCP stream compression

2022-05-20 Thread Willy Tarreau
On Fri, May 20, 2022 at 04:20:45PM +0500, Илья Шипицин wrote:
> yes, that is what I meant, actually. haproxy is currently not suitable for
> compressing tcp streams. even if such a feature is considered useful,
> it will take time.

Compression is not done on TCP but since it's done using a filter that
deals with HTTP compression, I imagine that it wouldn't be too hard to
modify the filter not to emit HTTP chunks and to work on top of plain
TCP. My real concern is for the decompression. At 256 kB per stream,
that's quite a no-go :-(

Willy



Re: Paid feature development: TCP stream compression

2022-05-20 Thread Willy Tarreau
On Fri, May 20, 2022 at 12:16:07PM +0100, Mark Zealey wrote:
> Thanks, we may use this for a very rough proof-of-concept. However we are
> dealing with millions of concurrent connections, 10-100 million connections
> per day, so we'd prefer to pay someone to develop (+ test!) something for
> haproxy which will work at this scale

That's a big problem with gzip. While compression can be done stateless
(which is what we're doing with SLZ), decompression uses roughly 256 kB
of permanent RAM per *stream*. That's 256 GB of RAM for just one million
connections, 1 TB of RAM for just 4 million connections. Sadly, that's a
perfect example of use case that requires extreme horizontal scalability
and that's better kept away from the LB and performed on servers.
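
The arithmetic behind those figures can be checked with a few lines of Python. This is only a sketch: the 256 kB per-stream value is the estimate quoted in this message, taken as 256,000 bytes, and the function name is made up for the example.

```python
# Per-stream decompression state, using the estimate quoted above (256 kB).
PER_STREAM_BYTES = 256 * 1000


def decompression_ram_bytes(streams, per_stream=PER_STREAM_BYTES):
    """Total permanent RAM needed if every stream keeps its own state."""
    return streams * per_stream


if __name__ == "__main__":
    for streams in (1_000_000, 4_000_000):
        total_gb = decompression_ram_bytes(streams) / 10**9
        print(f"{streams:>9,} streams -> {total_gb:,.0f} GB")
```

The cost grows linearly with the number of concurrent connections, which is why a smaller per-stream state changes the picture far more than adding a few machines.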

Isn't there any way to advertise support or lack of for compression ?
Because if your real need is to extract contents to perform any form of
processing, we could imagine that instead of having to decompress the
stream, it would be better if we could interfere with the connection
setup to disable compression.

I'm just trying to figure a reasonable alternative, because like this in
addition to being extremely unscalable, it makes your LBs a trivial DoS
target.
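
One related angle of the DoS concern, separate from the per-connection memory cost, is that a small compressed input can expand enormously. A minimal Python sketch of capping the output produced per read follows; the 64 KiB cap, the exception and the helper name are assumptions for illustration only, and this is not a description of any mitigation used or planned by anyone in this thread.

```python
import zlib

# Hypothetical cap on how much output one incoming chunk may produce.
MAX_OUTPUT_PER_READ = 64 * 1024


class OutputLimitExceeded(Exception):
    """Raised when a chunk would inflate beyond the configured cap."""


def inflate_bounded(decomp, chunk, limit=MAX_OUTPUT_PER_READ):
    """Decompress one chunk, refusing to emit more than `limit` bytes."""
    out = decomp.decompress(chunk, limit)
    # Input left unprocessed means the cap was hit before the chunk ended.
    if decomp.unconsumed_tail:
        raise OutputLimitExceeded(f"chunk inflates beyond {limit} bytes")
    return out


if __name__ == "__main__":
    bomb = zlib.compress(b"\0" * (1 << 20))  # 1 MiB of zeros, tiny on the wire
    try:
        inflate_bounded(zlib.decompressobj(), bomb)
    except OutputLimitExceeded as exc:
        print("rejected:", exc)
```

A real proxy would more likely pause the stream and apply back-pressure than drop the connection; raising an exception just keeps the sketch short.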

Regards,
Willy



Re: Paid feature development: TCP stream compression

2022-05-20 Thread Aleksandar Lazic
On Fri, 20 May 2022 12:16:07 +0100
Mark Zealey wrote:

> Thanks, we may use this for a very rough proof-of-concept. However we 
> are dealing with millions of concurrent connections, 10-100 million 
> connections per day, so we'd prefer to pay someone to develop (+ test!) 
> something for haproxy which will work at this scale

Well, at this scale you will for sure have more than one HAProxy instance. :-)

Do you want all the HAProxy instances to share the same "knowledge" about the
connections?
What I mean is: should the implementation consider using the peers protocol?
Do you expect some XMPP protocol knowledge in the implementation?

> Mark
> 
> On 20/05/2022 10:12, Илья Шипицин wrote:
> > in theory, you can try OpenVPN with compression enabled.
> > or maybe stunnel with compression (stunnel TLS Proxy).
> >
> >
> > Fri, 20 May 2022 at 13:59, Mark Zealey wrote:
> >
> > Good point, I forgot to mention that bit. We will be
> > TLS-terminating the connection on haproxy itself so
> > compress/decompress would happen after the plain stream has been
> > received, prior to being forwarded (in plain, or re-encrypted with
> > TLS) to the backends.
> >
> > So:
> >
> > app generates gzip+tls TCP stream -> haproxy: strip TLS, gunzip ->
> > forward TCP to backend servers
> >
> > We don't have any other implementation of this, at the moment it
> > is just an idea we would like to implement.
> >
> > Mark
> >
> >
> > On 20/05/2022 09:54, Илья Шипицин wrote:
> >> isn't it SSL encapsulated? how is compression supposed to
> >> work in detail?
> >> any other implementation to look at?
> >>
> >> Thu, 19 May 2022 at 21:32, Mark Zealey wrote:
> >>
> >> Hi there,
> >>
> >> We are using HAProxy to terminate and balance TCP streams
> >> (XMPP) between
> >> our apps and our service infrastructure. We are currently running
> >> XMPP-level gzip compression but I'm interested in potentially
> >> shifting
> >> this to the haproxy layer - basically everything on the
> >> connection would
> >> be compressed with gzip, brotli or similar.
> >>
> >> If you would be interested in doing paid development on
> >> haproxy for
> >> this, please
> >> drop me a line with some details about roughly how much it
> >> would cost
> >> and how
> >> long it would take. Any development work done for this would be
> >> contributed back to the open source haproxy edition.
> >>
> >> Thanks,
> >>
> >> Mark
> >>
> >>




Re: Paid feature development: TCP stream compression

2022-05-20 Thread Илья Шипицин
yes, that is what I meant, actually. haproxy is currently not suitable for
compressing tcp streams. even if such a feature is considered useful,
it will take time.
while this is not agreed yet, it seems good to start with at least
something which is suitable for a PoC.

Fri, 20 May 2022 at 16:16, Mark Zealey wrote:

> Thanks, we may use this for a very rough proof-of-concept. However we are
> dealing with millions of concurrent connections, 10-100 million connections
> per day, so we'd prefer to pay someone to develop (+ test!) something for
> haproxy which will work at this scale
>
> Mark
> On 20/05/2022 10:12, Илья Шипицин wrote:
>
> in theory, you can try OpenVPN with compression enabled.
> or maybe stunnel with compression (stunnel TLS Proxy).
>
>
> Fri, 20 May 2022 at 13:59, Mark Zealey wrote:
>
>> Good point, I forgot to mention that bit. We will be TLS-terminating the
>> connection on haproxy itself so compress/decompress would happen after the
>> plain stream has been received, prior to being forwarded (in plain, or
>> re-encrypted with TLS) to the backends.
>>
>> So:
>>
>> app generates gzip+tls TCP stream -> haproxy: strip TLS, gunzip ->
>> forward TCP to backend servers
>>
>> We don't have any other implementation of this, at the moment it is just
>> an idea we would like to implement.
>>
>> Mark
>>
>>
>> On 20/05/2022 09:54, Илья Шипицин wrote:
>>
>> isn't it SSL encapsulated? how is compression supposed to work in
>> detail?
>> any other implementation to look at?
>>
>> Thu, 19 May 2022 at 21:32, Mark Zealey wrote:
>>
>>> Hi there,
>>>
>>> We are using HAProxy to terminate and balance TCP streams (XMPP) between
>>> our apps and our service infrastructure. We are currently running
>>> XMPP-level gzip compression but I'm interested in potentially shifting
>>> this to the haproxy layer - basically everything on the connection would
>>> be compressed with gzip, brotli or similar.
>>>
>>> If you would be interested in doing paid development on haproxy for
>>> this, please
>>> drop me a line with some details about roughly how much it would cost
>>> and how
>>> long it would take. Any development work done for this would be
>>> contributed back to the open source haproxy edition.
>>>
>>> Thanks,
>>>
>>> Mark
>>>
>>>
>>>


Re: Paid feature development: TCP stream compression

2022-05-20 Thread Mark Zealey
Thanks, we may use this for a very rough proof-of-concept. However we 
are dealing with millions of concurrent connections, 10-100 million 
connections per day, so we'd prefer to pay someone to develop (+ test!) 
something for haproxy which will work at this scale


Mark

On 20/05/2022 10:12, Илья Шипицин wrote:

in theory, you can try OpenVPN with compression enabled.
or maybe stunnel with compression (stunnel TLS Proxy).

Fri, 20 May 2022 at 13:59, Mark Zealey wrote:

Good point, I forgot to mention that bit. We will be
TLS-terminating the connection on haproxy itself so
compress/decompress would happen after the plain stream has been
received, prior to being forwarded (in plain, or re-encrypted with
TLS) to the backends.

So:

app generates gzip+tls TCP stream -> haproxy: strip TLS, gunzip ->
forward TCP to backend servers

We don't have any other implementation of this, at the moment it
is just an idea we would like to implement.

Mark


On 20/05/2022 09:54, Илья Шипицин wrote:

isn't it SSL encapsulated? how is compression supposed to
work in detail?
any other implementation to look at?

Thu, 19 May 2022 at 21:32, Mark Zealey wrote:

Hi there,

We are using HAProxy to terminate and balance TCP streams
(XMPP) between
our apps and our service infrastructure. We are currently running
XMPP-level gzip compression but I'm interested in potentially
shifting
this to the haproxy layer - basically everything on the
connection would
be compressed with gzip, brotli or similar.

If you would be interested in doing paid development on
haproxy for
this, please
drop me a line with some details about roughly how much it
would cost
and how
long it would take. Any development work done for this would be
contributed back to the open source haproxy edition.

Thanks,

Mark



Re: Paid feature development: TCP stream compression

2022-05-20 Thread Илья Шипицин
in theory, you can try OpenVPN with compression enabled.
or maybe stunnel with compression (stunnel TLS Proxy).

Fri, 20 May 2022 at 13:59, Mark Zealey wrote:

> Good point, I forgot to mention that bit. We will be TLS-terminating the
> connection on haproxy itself so compress/decompress would happen after the
> plain stream has been received, prior to being forwarded (in plain, or
> re-encrypted with TLS) to the backends.
>
> So:
>
> app generates gzip+tls TCP stream -> haproxy: strip TLS, gunzip -> forward
> TCP to backend servers
>
> We don't have any other implementation of this, at the moment it is just
> an idea we would like to implement.
>
> Mark
>
>
> On 20/05/2022 09:54, Илья Шипицин wrote:
>
> isn't it SSL encapsulated? how is compression supposed to work in
> detail?
> any other implementation to look at?
>
> Thu, 19 May 2022 at 21:32, Mark Zealey wrote:
>
>> Hi there,
>>
>> We are using HAProxy to terminate and balance TCP streams (XMPP) between
>> our apps and our service infrastructure. We are currently running
>> XMPP-level gzip compression but I'm interested in potentially shifting
>> this to the haproxy layer - basically everything on the connection would
>> be compressed with gzip, brotli or similar.
>>
>> If you would be interested in doing paid development on haproxy for
>> this, please
>> drop me a line with some details about roughly how much it would cost
>> and how
>> long it would take. Any development work done for this would be
>> contributed back to the open source haproxy edition.
>>
>> Thanks,
>>
>> Mark
>>
>>
>>


Re: Paid feature development: TCP stream compression

2022-05-20 Thread Mark Zealey
Good point, I forgot to mention that bit. We will be TLS-terminating the 
connection on haproxy itself so compress/decompress would happen after 
the plain stream has been received, prior to being forwarded (in plain, 
or re-encrypted with TLS) to the backends.


So:

app generates gzip+tls TCP stream -> haproxy: strip TLS, gunzip -> 
forward TCP to backend servers
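
The gunzip step in that flow has to be stateful per connection, since TCP delivers the compressed bytes in arbitrary chunks. A minimal Python sketch of that idea follows; it illustrates the concept only and is not haproxy code, and the class and method names are hypothetical. TLS termination and the forwarding itself are left out.

```python
import zlib


class GunzipStream:
    """Holds the decompression state for one client connection."""

    def __init__(self):
        # MAX_WBITS | 16 tells zlib to expect a gzip header and trailer.
        self._decomp = zlib.decompressobj(wbits=zlib.MAX_WBITS | 16)

    def feed(self, chunk):
        """Take bytes as they arrive, return plain bytes for the backend."""
        return self._decomp.decompress(chunk)


if __name__ == "__main__":
    # Simulate the app side: one gzip stream, flushed after each message.
    comp = zlib.compressobj(wbits=zlib.MAX_WBITS | 16)
    wire = comp.compress(b"<presence/>") + comp.flush(zlib.Z_SYNC_FLUSH)

    # Simulate the proxy side receiving the stream in 5-byte pieces.
    stream = GunzipStream()
    plain = b"".join(stream.feed(wire[i:i + 5]) for i in range(0, len(wire), 5))
    print(plain)
```

One such object would live for as long as the connection does, which is where the per-stream memory cost discussed elsewhere in the thread comes from.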


We don't have any other implementation of this, at the moment it is just 
an idea we would like to implement.


Mark


On 20/05/2022 09:54, Илья Шипицин wrote:
isn't it SSL encapsulated? how is compression supposed to work in
detail?

any other implementation to look at?

Thu, 19 May 2022 at 21:32, Mark Zealey wrote:

Hi there,

We are using HAProxy to terminate and balance TCP streams (XMPP)
between
our apps and our service infrastructure. We are currently running
XMPP-level gzip compression but I'm interested in potentially shifting
this to the haproxy layer - basically everything on the connection
would
be compressed with gzip, brotli or similar.

If you would be interested in doing paid development on haproxy for
this, please
drop me a line with some details about roughly how much it would cost
and how
long it would take. Any development work done for this would be
contributed back to the open source haproxy edition.

Thanks,

Mark



Re: Paid feature development: TCP stream compression

2022-05-20 Thread Илья Шипицин
isn't it SSL encapsulated? how is compression supposed to work in
detail?
any other implementation to look at?

Thu, 19 May 2022 at 21:32, Mark Zealey wrote:

> Hi there,
>
> We are using HAProxy to terminate and balance TCP streams (XMPP) between
> our apps and our service infrastructure. We are currently running
> XMPP-level gzip compression but I'm interested in potentially shifting
> this to the haproxy layer - basically everything on the connection would
> be compressed with gzip, brotli or similar.
>
> If you would be interested in doing paid development on haproxy for
> this, please
> drop me a line with some details about roughly how much it would cost
> and how
> long it would take. Any development work done for this would be
> contributed back to the open source haproxy edition.
>
> Thanks,
>
> Mark
>
>
>


Re: Paid feature development: TCP stream compression

2022-05-19 Thread Aleksandar Lazic
Hi Mark.

On Thu, 19 May 2022 17:29:37 +0100
Mark Zealey wrote:

> Hi there,
> 
> We are using HAProxy to terminate and balance TCP streams (XMPP) between
> our apps and our service infrastructure. We are currently running
> XMPP-level gzip compression but I'm interested in potentially shifting
> this to the haproxy layer - basically everything on the connection would
> be compressed with gzip, brotli or similar.
> 
> If you would be interested in doing paid development on haproxy for 
> this, please
> drop me a line with some details about roughly how much it would cost 
> and how
> long it would take. Any development work done for this would be
> contributed back to the open source haproxy edition.

That sounds really great, thank you for this offer :-)

I suggest getting in touch with cont...@haproxy.com, as that's the company
behind HAProxy.

> Thanks,
> 
> Mark

Regards
Alex



Paid feature development: TCP stream compression

2022-05-19 Thread Mark Zealey

Hi there,

We are using HAProxy to terminate and balance TCP streams (XMPP) between
our apps and our service infrastructure. We are currently running
XMPP-level gzip compression but I'm interested in potentially shifting
this to the haproxy layer - basically everything on the connection would
be compressed with gzip, brotli or similar.

If you would be interested in doing paid development on haproxy for
this, please drop me a line with some details about roughly how much it
would cost and how long it would take. Any development work done for
this would be contributed back to the open source haproxy edition.

Thanks,

Mark