Hello Baptiste,

On Wed, Jul 4, 2018 at 1:07 PM, Baptiste <[email protected]> wrote:
> Hi Aurélien,
>
> My 2 cents.
>
>> I'm trying to add a feature which allows HAProxy to use more than one
>> source when connecting to a server of a backend. The main reason is to
>> avoid duplicating the 'server' lines to reach more than 64k connections
>> from HAProxy to one server.
>
>
> Cool!
>
>>
>> So far I thought of two ways:
>> - each time the 'source' keyword is encountered on a 'server' line,
>>   duplicate the original 'struct server' and fill 'conn_src' with
>>   the correct source informations. It's easy to implement but does
>>   not scale at all. In fact it mimics the multiple 'server' lines.
>>   The big advantage is that it can use all existing features that
>>   deal with 'struct server' (balance keyword, for example).
>> - use a list of 'struct conn_src' in 'struct server' and 'struct
>>   proxy' and choose the best source (using round-robbin, leastconn,
>>   etc...) when a connection is about to get established.
>
>
> I also prefer the second option.
> So we would have 2 LBing algorithm? One to choose the server and one to
> choose the source IP to use?

It depends. Since this feature is probably (only?) useful for getting past
the 64k connection limit, hardcoding a leastconn algorithm may be enough.

>>
>> The config. syntax would look like this:
>>
>> server srv 127.0.0.1:9000 source 127.0.0.2 source 127.0.0.3 source
>> 127.0.0.4 source 127.0.0.5 source 127.0.0.6 source 127.0.1.0/24
>>
>> Not using ip1,ip2,ip/cidr,... avoids confusion when using keywords like
>> usesrc, interface, etc...
>
>
> Sure, but at least, I don't want to set 255 source for a "source
> 10.0.0.0/24", so please confirm you'll still allow CIDR notation.

Yes, see the last 'source' in my config line example.
What I find tedious is having to write something like this:

server srv 127.0.0.1:9000 source 127.0.0.2,127.0.0.3,127.0.0.4,127.0.1.0/24 usesrc clientip,client [...]

>>
>> Checks to the server would be done from each source but it can be very
>> slow to cover the whole range.
>
>
> I would make this optional. From a pure LBing safety point of view, I
> understand the requirement.
> That said, in some cases, we may not want to run tens or hundreds of health
> checks per second.
> I see different options:
> - check from all source IP
> - check from the host IP address (as of no source is configured)
> - check from one source IP per source subnet
>
>>
>> The main problem I see is how to efficiently store all sources for each
>> server. Using the CIDR syntax can quickly allow millions of sources to
>> be used and if we want to use algorithms like 'leastconn', we need to
>> remember how many connections are still active on a particular source
>> (using round-robbin + an index into the range would otherwise have been
>> one solution)
>> I have some ideas but I would like to know the preferred way.
>
>
> Well, storing a 32 bit hash of <source IP><dest IP> and counting on this
> pattern (and automatically eject server source+dest IP which have reached
> 64K concurrent connections).
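
To make the counting idea concrete, here is a rough sketch of a leastconn
pick over a source range with per-source counters. It is not HAProxy code:
'struct src_entry', 'struct src_range', pick_source() and release_source()
are hypothetical names, the fixed-size table stands in for whatever
list/tree would really be used, and the 65535 limit is only an assumed
upper bound (the real one depends on the ephemeral port range).

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_CONNS_PER_SRC 65535u /* assumed per-source limit, illustrative */
#define MAX_TRACKED       1024u  /* sketch-only cap on tracked sources */

/* Hypothetical types, unrelated to HAProxy's 'struct conn_src'. */
struct src_entry {
    uint32_t addr;  /* IPv4 source address, host byte order */
    uint32_t conns; /* connections currently active from this source */
};

struct src_range {
    uint32_t base;  /* first address of the range */
    uint32_t size;  /* number of addresses in the range */
    uint32_t used;  /* entries created so far */
    struct src_entry tracked[MAX_TRACKED];
};

/* Pick the least-loaded source and account for one more connection.
 * Addresses that were never used have an implicit counter of 0, so a new
 * entry is created whenever every tracked source is busy and the range
 * still has untouched addresses. Returns NULL when nothing is available.
 */
static struct src_entry *pick_source(struct src_range *r)
{
    struct src_entry *best = NULL;
    uint32_t i;

    for (i = 0; i < r->used; i++) {
        struct src_entry *e = &r->tracked[i];

        if (e->conns >= MAX_CONNS_PER_SRC)
            continue; /* "ejected": this source is full */
        if (!best || e->conns < best->conns)
            best = e;
    }

    if ((!best || best->conns > 0) &&
        r->used < r->size && r->used < MAX_TRACKED) {
        best = &r->tracked[r->used];
        best->addr = r->base + r->used;
        best->conns = 0;
        r->used++;
    }

    if (best)
        best->conns++;
    return best;
}

/* Account for a closed connection on the given source. */
static void release_source(struct src_entry *e)
{
    if (e->conns > 0)
        e->conns--;
}
```

In this sketch an address only gets an entry the first time it is picked,
so memory follows the number of sources actually in use rather than the
size of the CIDR range.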

With very long-lived connections, a leastconn algorithm will quickly fill
the list/tree with entries whose counter is 1.

>
> I have a question: what would be the impact on "retries" ? At first, we
> could use it as of today. But later, we may want to retry from a different
> source IP.

-- 
Aurélien Nephtali

