Here are my current patches for comments.

-- 
  Richard Russo
  to...@enslaves.us

On Fri, Jul 5, 2019, at 12:23 PM, Richard Russo wrote:
> Hi,
> 
> I've been experimenting with Receive Side Scaling (RSS) for a TCP proxy
> application. The basic idea with RSS is that by configuring the NICs,
> kernel, and application to use the same CPU for a given socket, cross-CPU
> locking and communication are eliminated or at least significantly
> reduced. On my system, configuring RSS allowed me to handle about three
> times as many sessions before reaching CPU saturation, with the
> remaining bottleneck seeming to be kernel processing around socket
> creation and closing, which requires cross-CPU coordination.
> 
> Aligning the incoming sockets is very simple: setting a socket option
> (IP_RSS_LISTEN_BUCKET) on the listen socket restricts the accepted
> sockets to that bucket, and that's straightforward to add to the TCP
> listener code and configuration.
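
A minimal sketch of the listener side, assuming FreeBSD's
IP_RSS_LISTEN_BUCKET takes an int bucket index at the IPPROTO_IP level.
The helper name listen_bucket_set() is hypothetical and not taken from
the patches; where the option is not defined, the helper just fails
with ENOPROTOOPT.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>

/* Hypothetical helper: restrict a listening socket to one RSS bucket.
 * Returns 0 on success, -1 with errno set on failure. */
static int listen_bucket_set(int fd, int bucket)
{
    assert(bucket >= 0);
#ifdef IP_RSS_LISTEN_BUCKET
    /* Assumption: the option value is a plain int bucket index. */
    return setsockopt(fd, IPPROTO_IP, IP_RSS_LISTEN_BUCKET,
                      &bucket, sizeof(bucket));
#else
    /* Option not available on this platform. */
    (void)fd;
    errno = ENOPROTOOPT;
    return -1;
#endif
}
```

The call would go between socket() and listen(), once per listener and
per bucket.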
> 
> Aligning outgoing sockets is trickier -- there's no kernel help, with a
> socket option or otherwise; an application has to run the hash
> (Toeplitz) on the 4-tuple of {local ip, local port, remote ip, remote
> port} and only use an outgoing port if the hash matches.  I've had
> trouble finding a good approach to handle this.
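
For reference, a sketch of the Toeplitz hash over a TCP/IPv4 4-tuple.
Assumptions: the key is the widely published 40-byte sample key from
Microsoft's RSS documentation (the key really in use has to match what
the kernel and NIC are configured with), the input order is source
address, destination address, source port, destination port as seen on
the received packet (so for an outgoing connection "source" is the
remote side), and the bucket is taken from the low bits of the hash
with a power-of-two bucket count. The names rss_key, toeplitz_hash(),
rss_hash_tcp4() and rss_bucket() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sample key (assumption); replace with the key the system really uses. */
static const uint8_t rss_key[40] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
};

/* For every set input bit (MSB first), XOR in the 32-bit window of the
 * key that starts at that bit position. */
static uint32_t toeplitz_hash(const uint8_t *key, size_t keylen,
                              const uint8_t *data, size_t datalen)
{
    uint32_t hash = 0;
    uint32_t window;
    size_t i;
    int b;

    assert(keylen >= 4);
    window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
             ((uint32_t)key[2] << 8) | (uint32_t)key[3];

    for (i = 0; i < datalen; i++) {
        for (b = 7; b >= 0; b--) {
            if (data[i] & (1u << b))
                hash ^= window;
            /* Slide the window one bit further along the key. */
            window <<= 1;
            if (i + 4 < keylen && (key[i + 4] & (1u << b)))
                window |= 1;
        }
    }
    return hash;
}

/* Hash a TCP/IPv4 4-tuple given in host byte order. */
static uint32_t rss_hash_tcp4(uint32_t src_ip, uint32_t dst_ip,
                              uint16_t src_port, uint16_t dst_port)
{
    uint8_t in[12] = {
        (uint8_t)(src_ip >> 24), (uint8_t)(src_ip >> 16),
        (uint8_t)(src_ip >> 8), (uint8_t)src_ip,
        (uint8_t)(dst_ip >> 24), (uint8_t)(dst_ip >> 16),
        (uint8_t)(dst_ip >> 8), (uint8_t)dst_ip,
        (uint8_t)(src_port >> 8), (uint8_t)src_port,
        (uint8_t)(dst_port >> 8), (uint8_t)dst_port
    };
    return toeplitz_hash(rss_key, sizeof(rss_key), in, sizeof(in));
}

/* Assumption: bucket = low bits of the hash, nbuckets a power of two. */
static unsigned rss_bucket(uint32_t hash, unsigned nbuckets)
{
    assert(nbuckets != 0 && (nbuckets & (nbuckets - 1)) == 0);
    return hash & (nbuckets - 1);
}
```

A candidate local port would then be accepted only if
rss_bucket(rss_hash_tcp4(remote_ip, local_ip, remote_port, port), n)
equals the bucket of the thread or process making the connection.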
> 
> The simplest thing would be to run the hash when a port is assigned by
> port_range and return the port to the range if it hashes to the wrong
> bucket; but if you've already used all the acceptable ports for that
> port range, you spend a lot of time hashing the ports that are still in
> the range without making any progress.
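
To make the cost concrete, here is a simplified sketch of that naive
search. It walks a plain [lo, hi] interval instead of HAProxy's
port_range ring, and demo_bucket() is a stand-in for the real Toeplitz
computation; both are assumptions for illustration only.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned (*bucket_fn)(uint16_t port);

/* Stand-in for "hash the 4-tuple and map it to a bucket". */
static unsigned demo_bucket(uint16_t port)
{
    return port % 4;
}

/* Returns the first port in [lo, hi] that maps to the wanted bucket, or
 * 0 if there is none. *hashes counts how many ports were hashed. */
static uint16_t find_port_for_bucket(uint16_t lo, uint16_t hi,
                                     unsigned want, bucket_fn bucket_of,
                                     unsigned *hashes)
{
    uint32_t p; /* wider than uint16_t so that hi == 65535 terminates */

    assert(lo != 0 && lo <= hi);
    for (p = lo; p <= hi; p++) {
        (*hashes)++;
        if (bucket_of((uint16_t)p) == want)
            return (uint16_t)p;
    }
    return 0;
}
```

When no acceptable port is left, every call hashes the whole remaining
range and still comes back empty, which is the "no progress" case
described above.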
> 
> If you have a port range per RSS bucket, you could hash on port
> assignment and not return ports that hash to the wrong bucket; but if
> the remote IP changes, because you've configured it to use DNS or
> because you change the IP via "set server addr", the previously
> computed hashes are no longer valid -- you would really want to try
> all the ports again.
> 
> What I ended up with was a lock on port ranges (instead of atomics as
> used in 07425de71777b688e77a9c70a7088c13e66e41e9 BUG/MEDIUM:
> port_range: Make the ring buffer lock-free), adding a revision counter
> to the port range, and resetting the port range whenever the server IP
> changes. To avoid running the hash during steady state, and because
> checking all the ports when the range needs to be filled is expensive,
> I also made port range filling incremental.
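
Roughly, the idea looks like the sketch below. This is not the code
from the attached patches: the struct, the pr_*() names, and
demo_bucket() (a stand-in for the Toeplitz hash on the 4-tuple) are
invented for illustration, and a pthread mutex stands in for whatever
lock the real code uses.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define PR_CAPACITY 1024

typedef unsigned (*bucket_fn)(uint32_t remote_ip, uint16_t port);

/* Stand-in for "Toeplitz hash of the 4-tuple, mapped to a bucket". */
static unsigned demo_bucket(uint32_t remote_ip, uint16_t port)
{
    return (remote_ip + port) % 4;
}

struct rss_port_range {
    pthread_mutex_t lock;
    uint16_t lo, hi;      /* inclusive bounds of the range */
    uint32_t cursor;      /* next port that has not been hashed yet */
    unsigned revision;    /* bumped whenever the server address changes */
    unsigned bucket;      /* RSS bucket this range serves */
    uint32_t remote_ip;
    bucket_fn bucket_of;
    uint16_t avail[PR_CAPACITY]; /* released ports known to match */
    unsigned navail;
};

static void pr_init(struct rss_port_range *pr, uint16_t lo, uint16_t hi,
                    unsigned bucket, uint32_t remote_ip, bucket_fn fn)
{
    assert(lo != 0 && lo <= hi);
    pthread_mutex_init(&pr->lock, NULL);
    pr->lo = lo;
    pr->hi = hi;
    pr->cursor = lo;
    pr->revision = 0;
    pr->bucket = bucket;
    pr->remote_ip = remote_ip;
    pr->bucket_of = fn;
    pr->navail = 0;
}

/* Server address changed: old hashes are stale, so start over. */
static void pr_reset(struct rss_port_range *pr, uint32_t remote_ip)
{
    pthread_mutex_lock(&pr->lock);
    pr->remote_ip = remote_ip;
    pr->revision++;
    pr->cursor = pr->lo;
    pr->navail = 0;
    pthread_mutex_unlock(&pr->lock);
}

/* Returns a matching port or 0. Each port is hashed at most once per
 * revision because the cursor only moves forward. */
static uint16_t pr_get(struct rss_port_range *pr, unsigned *rev)
{
    uint16_t port = 0;

    pthread_mutex_lock(&pr->lock);
    while (pr->navail == 0 && pr->cursor <= pr->hi) {
        uint16_t p = (uint16_t)pr->cursor++;
        if (pr->bucket_of(pr->remote_ip, p) == pr->bucket)
            pr->avail[pr->navail++] = p;
    }
    if (pr->navail > 0)
        port = pr->avail[--pr->navail];
    *rev = pr->revision;
    pthread_mutex_unlock(&pr->lock);
    return port;
}

/* Ports handed out under an older revision are dropped, not reused.
 * Note: the rescan may rediscover a port that is still in use from an
 * older revision; a real implementation has to cope with that. */
static void pr_release(struct rss_port_range *pr, uint16_t port,
                       unsigned rev)
{
    pthread_mutex_lock(&pr->lock);
    if (rev == pr->revision && pr->navail < PR_CAPACITY)
        pr->avail[pr->navail++] = port;
    pthread_mutex_unlock(&pr->lock);
}
```

In steady state pr_get() is served from ports that were released under
the current revision and no hashing happens; hashing only resumes when
that list is empty and the cursor has not reached the end of the range.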
> 
> This approach works, but it feels complicated, and it made my config
> much more verbose -- I had to duplicate my frontend sections, one for
> each RSS bucket, each of which sends to a corresponding duplicated
> backend for that bucket; the backends had additional configuration to
> indicate the RSS bucket (and the number of buckets). Incidentally,
> because each RSS bucket has a distinct set of ports, and because my use
> case doesn't use any features which benefit from coordination within
> HAProxy (such as stick tables, etc.), this makes it possible to run in
> process mode rather than threaded mode without running into the many
> "port already in use" warnings/errors that would otherwise happen when
> sharing a port range.
> 
> If it's helpful for the discussion, I can share my patches as-is, but 
> if there are better ideas on how to structure this, I'd rather try to 
> get the changes done in a nice way before sharing.
> 
> Thanks!
> 
> -- 
>   Richard Russo
>   to...@enslaves.us
> 
>

Attachment: 0001-Allow-for-binding-listen-sockets-to-a-provided-RSS-b.patch
Description: Binary data

Attachment: 0002-Revert-BUG-MEDIUM-port_range-Make-the-ring-buffer-lo.patch
Description: Binary data

Attachment: 0003-add-port_range-locking-to-protect-against-concurrent.patch
Description: Binary data

Attachment: 0004-refill-port-ranges-when-addresses-change.patch
Description: Binary data

Attachment: 0005-Allow-for-RSS-aligned-port-selection-for-outgoing-co.patch
Description: Binary data
