On Thu, May 23, 2024, at 10:08 AM, Amaury Denoyelle wrote:
> On Wed, May 22, 2024 at 04:58:44PM +0100, William Manley wrote:
> > On Wed, May 22, 2024, at 1:06 PM, Amaury Denoyelle wrote:
> > > FYI, I just merged a series of fix to improve reverse HTTP. It is now
> > > possible to use PROXY protocol on preconnect stage. Also, you have the
> > > availability to use PROXY v2 TLV to differentiate connections. Note
> > > however that PROXY unique-id is not supported for now due to internal
> > > API limitations.
> > >
> > > If you can do not hesitate to test this and report us if it's sufficient
> > > for you.
> > I've just tried this out and there's something about these changes that are 
> > causing my tests to fail.
> > It seems to be triggered by "MEDIUM: rhttp: create session for active 
> > preconnect"
> > Tested versions:
> > eb89a7da33ae30da3ed61570aa1597987b59dff3 OK
> > ceebb09744df367ad84586a341d9336f84f72bce OK
> > 45b80aed70a597614e31b748328570785099dfec OK
> > 12c40c25a9520fe3365950184fe724a1f4e91d03 BAD
> > 60496e884e5220b9330a1d8b3a1218f7988c879a BAD
> > 4a968d9d274a24e5d00bd3c03dd22fe2563b13af BAD
> > I'll investigate further tomorrow.
> 
> Hum can you describe what sort of tests you are running ?

Ok. There's a fair amount of background to explain, apologies in advance for 
length.

Background
==========

My company sells a product with a hardware and a cloud component.  The cloud 
component is called "portal" and the hardware component is called "node".  The 
nodes run on our customers' premises, on their networks, while the portal runs on 
AWS.

The portal sends commands to the node and the node sends results and status to 
the portal. Because we don't control the node's network environment, we want 
its networking requirements to be as simple as possible, so all communication 
from the node to the portal is done via HTTP requests multiplexed over a single 
HTTP/2 TCP connection made from the node to the portal. This keeps the network 
setup simple and compatible.

The downside of this setup is that it makes communication awkward: we need to 
use long-polling from the node, which is complicated and can be slow, and 
there's a degree of friction whenever we add new features.  We'd much prefer to just 
be able to use HTTP requests from the portal to the node.  So I came up with 
this idea of using HTTP CONNECT to establish TCP connections from the node to 
the portal multiplexed over HTTP/2, but then reversing the flow and allowing 
requests to be sent from the portal to the node.  In the process of testing 
this I noticed that HAProxy had recently added RHTTP support, with benefits 
beyond what I'd implemented, so I thought I'd use that instead.

Roughly speaking the setup looks like this (hopefully it won't be mangled):

│               │                    │          │                    │             │
│    python     │      HAProxy       │ internet │       HAProxy      │    python   │
│               │                    │          │                    │             │

 ┌─────────────┐                                                      ┌───────────┐
 │   portal    │------------------------------------------------------►   node    │
 │  requests   │                        HTTP/1                        │   flask   │
 └──────╥──────┘                                                      └─────▲─────┘
        ║         ┌───────────────┐                ┌───────────────┐        ║
        ╚═════════►     RHTTP     │----------------► flask reverse ╞════════╝
         HTTP/1   │    frontend   │     HTTP/2     │     proxy     │ HTTP/1
                  └───────▲───────┘                └───────▲───────┘
                          |                                |
                       ┌──────┐                        ┌──────┐
                       │ pool │                        │ pool │
                       └──▲───┘                        └───▲──┘
                          |                                |
                  ┌───────────────┐                ┌───────────────┐
                  │    add HTTP   ◄----------------│     RHTTP     │
                  │      pool     │     RHTTP      │    backend    │
                  └───────▲───────┘                └───────╥───────┘
              PROXY+RHTTP ║                                ║ RHTTP
                  ┌───────╨───────┐                ┌───────▼───────┐
                  │    CONNECT    ◄----------------│    CONNECT    │
                  │   terminator  │ HTTP/1 CONNECT │   payloader   │
                  └───────▲───────┘                └───────╥───────┘
           HTTP/1 CONNECT ║                                ║ HTTP/1 CONNECT
                  ┌───────╨───────┐                ┌───────▼───────┐
                  │     HTTP/2    ◄════════════════╡    HTTP/2     │
        ╔═════════╡   terminator  │     HTTP/2     │   payloader   ◄════════╗
        ║         └───────────────┘                └───────────────┘        ║
        ║                                                                   ║
 ┌──────▼──────┐                                                      ┌─────╨─────┐
 │    portal   ◄------------------------------------------------------│   node    │
 │    flask    │                        HTTP/1                        │ requests  │
 └─────────────┘                                                      └───────────┘

 ═══ Actual TCP connections
 --- Logical connections

In case you're unaware: requests is a Python HTTP client library and flask is a 
Python HTTP server library.

Under this setup we're using HAProxy as both a forward and a reverse proxy on 
both the node and the portal, but with all the traffic multiplexed over a 
single HTTP/2 connection (secured with client certs).

I'm using lua for HTTP CONNECT payloading/depayloading as described here: 
https://discourse.haproxy.org/t/haproxy-http-connect-method/9709/2
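
For anyone unfamiliar with what the "payloader" and "terminator" boxes exchange, 
here is a rough Python sketch of the HTTP/1 CONNECT handshake.  This is not the 
lua from the link above, just an illustration; the function names are made up 
for this example, and I'm assuming the usual rule that any 2xx reply means the 
tunnel is open and the rest of the stream is raw bytes.

```python
def build_connect_request(host: str, port: int) -> bytes:
    """Return the bytes of an HTTP/1.1 CONNECT request for host:port."""
    authority = f"{host}:{port}"
    return (
        f"CONNECT {authority} HTTP/1.1\r\n"
        f"Host: {authority}\r\n"
        "\r\n"
    ).encode("ascii")


def tunnel_established(response_head: bytes) -> bool:
    """Return True if the CONNECT response's status line carries a 2xx code."""
    status_line = response_head.split(b"\r\n", 1)[0]
    parts = status_line.split(None, 2)
    if len(parts) < 2 or not parts[0].startswith(b"HTTP/"):
        return False
    code = parts[1]
    return len(code) == 3 and code.isdigit() and code[:1] == b"2"
```

The payloading side would send the request bytes, wait for the response head, 
and only then start relaying the wrapped connection's data.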

There are a few additional complexities, like the portal serving connections 
from the nodes and browsers on the same IP address - thus necessitating 
separating them by SNI before requiring client certificates.

Test setup
==========

The above is quite complicated, and involves a lot of alternating mode tcp and 
mode http blocks looping out of HAProxy and back again.  So to test it I have a 
Python test that starts up two HAProxy instances with the real config files, 
but pointing at each other, and then sends a bunch of requests back and forth 
in parallel to make sure it works.
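
The general shape of such a test is roughly as follows.  This is a simplified 
sketch rather than my actual test code: haproxy_command and run_in_parallel are 
names invented for this example, and the fetch function is passed in so the 
sketch doesn't depend on any particular HTTP client.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def haproxy_command(config_path: str) -> list[str]:
    """argv for one HAProxy instance; -db keeps it in the foreground."""
    return ["haproxy", "-db", "-f", config_path]


def run_in_parallel(
    fetch: Callable[[str], int],
    urls: Iterable[str],
    workers: int = 8,
) -> list[tuple[str, object]]:
    """Call fetch(url) for every URL concurrently.

    Returns (url, outcome) pairs in input order, where outcome is either
    the value fetch returned or the exception it raised.
    """
    url_list = list(urls)
    outcomes: list[tuple[str, object]] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [(url, pool.submit(fetch, url)) for url in url_list]
        for url, future in futures:
            try:
                outcomes.append((url, future.result(timeout=30)))
            except Exception as exc:  # keep going so every failure is reported
                outcomes.append((url, exc))
    return outcomes
```

In a real run each instance could be started with something like 
subprocess.Popen(haproxy_command("node.cfg")), and fetch could be a small 
wrapper around requests.get(url, timeout=10).status_code.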

I've attached the configs used during the test.  You'll notice that each block 
uses a different IP address from 127.0.0.*.  This makes the TCP flow graph in 
wireshark meaningful.  I've attached a packet capture of the test failing.  I'd 
also attach a packet capture from it succeeding, but it's a bit big.  I'll look 
into them and try to figure out where it's going wrong.
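
On the one-address-per-block trick: the idea can be sketched in Python like 
this.  The block names in the usage note are only illustrative labels taken 
from the diagram, not the names used in the attached configs, and 
assign_loopback_addresses is a helper made up for this example.

```python
import ipaddress


def assign_loopback_addresses(block_names, network="127.0.0.0/24"):
    """Map each block name to its own address from the given network."""
    names = list(block_names)
    # hosts() skips the network and broadcast addresses, so a /24 yields
    # 127.0.0.1 through 127.0.0.254.
    hosts = [str(h) for h in ipaddress.ip_network(network).hosts()]
    if len(names) > len(hosts):
        raise ValueError("more blocks than usable addresses")
    return dict(zip(names, hosts))
```

For example, assign_loopback_addresses(["rhttp_frontend", "connect_terminator"]) 
gives each of the two blocks a distinct address, so every hop shows up as its 
own column in the wireshark flow graph.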

Thanks

Will
---
William Manley
Stb-tester.com

Stb-tester.com Ltd is a company registered in England and Wales.
Registered number: 08800454. Registered office: 13B The Vale,
London, W3 7SH, United Kingdom (This is not a remittance address.)

Attachment: node.cfg
Description: Binary data

Attachment: portal.cfg
Description: Binary data

Attachment: test_haproxy-bad.pcapng
Description: Binary data
