Re: Using different sources when connecting to a server

2018-07-04 Thread Aurélien Nephtali
Hello Baptiste,

On Wed, Jul 4, 2018 at 1:07 PM, Baptiste wrote:
> Hi Aurélien,
>
> My 2 cents.
>
>> I'm trying to add a feature which allows HAProxy to use more than one
>> source when connecting to a server of a backend. The main reason is to
>> avoid duplicating the 'server' lines to reach more than 64k connections
>> from HAProxy to one server.
>
>
> Cool!
>
>>
>> So far I thought of two ways:
>> - each time the 'source' keyword is encountered on a 'server' line,
>>   duplicate the original 'struct server' and fill 'conn_src' with
>>   the correct source information. It's easy to implement but does
>>   not scale at all. In fact it mimics the multiple 'server' lines.
>>   The big advantage is that it can use all existing features that
>>   deal with 'struct server' (balance keyword, for example).
>> - use a list of 'struct conn_src' in 'struct server' and 'struct
>>   proxy' and choose the best source (using round-robin, leastconn,
>>   etc...) when a connection is about to get established.
>
>
> I also prefer the second option.
> So we would have 2 LBing algorithms? One to choose the server and one to
> choose the source IP to use?

It depends. Considering this feature could be (only?) useful to address the 64k
connection limit, maybe hardcoding a leastconn algorithm is enough.
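To make the idea concrete, here is a minimal sketch of what a hardcoded leastconn choice among sources could look like. The `Source` class, the `pick_leastconn` function and the 65535 ceiling are assumptions made for illustration, not HAProxy code.

```python
from dataclasses import dataclass

# Assumption: ceiling on concurrent connections from one source address to
# one server address:port (the "64k" limit discussed in this thread).
MAX_CONNS_PER_SOURCE = 65535


@dataclass
class Source:
    """Hypothetical per-source state: the address and its active connections."""
    addr: str
    active: int = 0


def pick_leastconn(sources, limit=MAX_CONNS_PER_SOURCE):
    """Return the least-loaded source, or None when every source is full."""
    candidates = [s for s in sources if s.active < limit]
    if not candidates:
        return None
    # min() keeps the first entry on ties, so the result is deterministic.
    return min(candidates, key=lambda s: s.active)


sources = [Source("127.0.0.2", 3), Source("127.0.0.3", 1), Source("127.0.0.4", 1)]
chosen = pick_leastconn(sources)
chosen.active += 1  # the caller accounts for the new connection
```

A linear scan is fine for a handful of sources; it would not be for a large CIDR range, which is the storage question raised in the original message.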

>>
>> The config. syntax would look like this:
>>
>> server srv 127.0.0.1:9000 source 127.0.0.2 source 127.0.0.3 source
>> 127.0.0.4 source 127.0.0.5 source 127.0.0.6 source 127.0.1.0/24
>>
>> Not using ip1,ip2,ip/cidr,... avoids confusion when using keywords like
>> usesrc, interface, etc...
>
>
> Sure, but at least, I don't want to set 255 sources for a "source
> 10.0.0.0/24", so please confirm you'll still allow CIDR notation.

Yes, look at the last 'source' from my config. line example. What I found
tedious is to use something like this:

server srv 127.0.0.1:9000 source
127.0.0.2,127.0.0.3,127.0.0.4,127.0.1.0/24 usesrc clientip,client
[...]

>>
>> Checks to the server would be done from each source but it can be very
>> slow to cover the whole range.
>
>
> I would make this optional. From a pure LBing safety point of view, I
> understand the requirement.
> That said, in some cases, we may not want to run tens or hundreds of health
> checks per second.
> I see different options:
> - check from all source IPs
> - check from the host IP address (as if no source is configured)
> - check from one source IP per source subnet
>
>>
>> The main problem I see is how to efficiently store all sources for each
>> server. Using the CIDR syntax can quickly allow millions of sources to
>> be used and if we want to use algorithms like 'leastconn', we need to
>> remember how many connections are still active on a particular source
>> (using round-robin + an index into the range would otherwise have been
>> one solution)
>> I have some ideas but I would like to know the preferred way.
>
>
> Well, storing a 32 bit hash of  and counting on this
> pattern (and automatically eject server source+dest IP which have reached
> 64K concurrent connections).

Using a leastconn algorithm with very long connections will quickly fill the
list/tree with entries with a counter of 1.
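A small simulation of that effect, as a sketch only. It assumes counters are stored sparsely (one entry per address that has at least one connection) and that every address of the CIDR block is a usable source; `simulate_leastconn` is a made-up name.

```python
import ipaddress


def simulate_leastconn(network: str, connections: int) -> dict:
    """Open `connections` long-lived connections and never close any.

    Returns the sparse counter table: address -> active connections.
    """
    # Assumption: every address in the block may be used as a source.
    unused = iter(ipaddress.ip_network(network, strict=False))
    counters = {}
    for _ in range(connections):
        # While unused addresses remain, one of them (count 0) is always
        # the least-loaded source, so leastconn picks it.
        addr = next(unused, None)
        if addr is None:
            # Range exhausted: fall back to the smallest existing counter.
            addr = min(counters, key=counters.get)
        counters[addr] = counters.get(addr, 0) + 1
    return counters


table = simulate_leastconn("10.0.0.0/24", 200)
print(len(table), set(table.values()))
```

With 200 connections over a /24, the table ends up with 200 entries that all hold a count of 1, so the sparse storage saves nothing until the range is exhausted.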

>
> I have a question: what would be the impact on "retries"? At first, we
> could use it as of today. But later, we may want to retry from a different
> source IP.

-- 
Aurélien Nephtali



Re: Using different sources when connecting to a server

2018-07-04 Thread Baptiste
Hi Aurélien,

My 2 cents.

> I'm trying to add a feature which allows HAProxy to use more than one
> source when connecting to a server of a backend. The main reason is to
> avoid duplicating the 'server' lines to reach more than 64k connections
> from HAProxy to one server.
>

Cool!


> So far I thought of two ways:
> - each time the 'source' keyword is encountered on a 'server' line,
>   duplicate the original 'struct server' and fill 'conn_src' with
>   the correct source information. It's easy to implement but does
>   not scale at all. In fact it mimics the multiple 'server' lines.
>   The big advantage is that it can use all existing features that
>   deal with 'struct server' (balance keyword, for example).
> - use a list of 'struct conn_src' in 'struct server' and 'struct
>   proxy' and choose the best source (using round-robin, leastconn,
>   etc...) when a connection is about to get established.
>

I also prefer the second option.
So we would have 2 LBing algorithms? One to choose the server and one to
choose the source IP to use?



> The config. syntax would look like this:
>
> server srv 127.0.0.1:9000 source 127.0.0.2 source 127.0.0.3 source
> 127.0.0.4 source 127.0.0.5 source 127.0.0.6 source 127.0.1.0/24
>
> Not using ip1,ip2,ip/cidr,... avoids confusion when using keywords like
> usesrc, interface, etc...
>

Sure, but at least, I don't want to set 255 sources for a "source 10.0.0.0/24",
so please confirm you'll still allow CIDR notation.


> Checks to the server would be done from each source but it can be very
> slow to cover the whole range.
>

I would make this optional. From a pure LBing safety point of view, I
understand the requirement.
That said, in some cases, we may not want to run tens or hundreds of health
checks per second.
I see different options:
- check from all source IPs
- check from the host IP address (as if no source is configured)
- check from one source IP per source subnet
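The three options could be sketched like this. The mode names ("all", "host", "per-subnet") and the `check_sources` function are invented for the example and are not existing HAProxy keywords; `None` stands for "do not bind, let the kernel choose".

```python
import ipaddress


def usable_addresses(net):
    """Host addresses of a network; a single-address network yields itself."""
    hosts = list(net.hosts())
    return hosts or [net.network_address]


def check_sources(source_specs, mode):
    """Return the source addresses health checks would be sent from."""
    nets = [ipaddress.ip_network(s, strict=False) for s in source_specs]
    if mode == "all":
        # One check per usable address of every configured source.
        return [str(a) for n in nets for a in usable_addresses(n)]
    if mode == "host":
        # Behave as if no source were configured.
        return [None]
    if mode == "per-subnet":
        # One representative address per configured source entry.
        return [str(usable_addresses(n)[0]) for n in nets]
    raise ValueError(f"unknown check mode: {mode}")


specs = ["127.0.0.2", "127.0.1.0/24"]
print(len(check_sources(specs, "all")))
print(check_sources(specs, "per-subnet"))
```

For these two entries, "all" means 255 checks per interval while "per-subnet" means 2, which is the trade-off between safety and check load.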


> The main problem I see is how to efficiently store all sources for each
> server. Using the CIDR syntax can quickly allow millions of sources to
> be used and if we want to use algorithms like 'leastconn', we need to
> remember how many connections are still active on a particular source
> (using round-robin + an index into the range would otherwise have been
> one solution)
> I have some ideas but I would like to know the preferred way.
>

Well, storing a 32 bit hash of  and counting on this
pattern (and automatically eject server source+dest IP which have reached
64K concurrent connections).
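A rough sketch of such a counter table. It assumes the hashed key is the source/destination address pair, which the sentence above suggests but does not spell out, and it uses CRC32 purely as a convenient 32-bit hash; `PairCounters` and its methods are hypothetical names.

```python
import zlib

# Assumption: the "64K" ceiling per source/destination pair.
LIMIT = 65535


class PairCounters:
    """Active-connection counters keyed by a 32-bit hash of (source, dest)."""

    def __init__(self, limit=LIMIT):
        self.limit = limit
        self.counts = {}  # 32-bit hash -> active connections

    @staticmethod
    def key(src: str, dst: str) -> int:
        # zlib.crc32 returns an unsigned 32-bit integer in Python 3.
        return zlib.crc32(f"{src}>{dst}".encode())

    def acquire(self, src: str, dst: str) -> bool:
        """Count a new connection; False means the pair is ejected (full)."""
        k = self.key(src, dst)
        n = self.counts.get(k, 0)
        if n >= self.limit:
            return False
        self.counts[k] = n + 1
        return True

    def release(self, src: str, dst: str) -> None:
        """Forget one connection and drop the entry once it reaches zero."""
        k = self.key(src, dst)
        n = self.counts.get(k, 0)
        if n <= 1:
            self.counts.pop(k, None)
        else:
            self.counts[k] = n - 1
```

Two different pairs that hash to the same value would share a counter, so a real implementation would have to decide whether that inaccuracy is acceptable.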

I have a question: what would be the impact on "retries"? At first, we
could use it as of today. But later, we may want to retry from a different
source IP.
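To illustrate the "later" case, a minimal sketch where each retry moves on to the next source. `try_connect` is a hypothetical callback returning True on success, and the sketch assumes one initial attempt followed by up to `retries` retries.

```python
def connect_with_retries(sources, retries, try_connect):
    """Try to connect, switching to the next source on every retry.

    Returns (source_that_worked_or_None, list_of_sources_tried).
    """
    attempts = []
    # Assumption: one initial attempt, then up to `retries` further attempts.
    for attempt in range(retries + 1):
        src = sources[attempt % len(sources)]
        attempts.append(src)
        if try_connect(src):
            return src, attempts
    return None, attempts


def fake_connect(src):
    # Stand-in for a real connect(): pretend 127.0.0.2 has no free ports.
    return src != "127.0.0.2"


print(connect_with_retries(["127.0.0.2", "127.0.0.3"], 3, fake_connect))
```

Keeping today's behaviour would simply mean passing a one-element source list.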

Baptiste


Using different sources when connecting to a server

2018-07-04 Thread Aurélien Nephtali
Hello,

I'm trying to add a feature which allows HAProxy to use more than one
source when connecting to a server of a backend. The main reason is to
avoid duplicating the 'server' lines to reach more than 64k connections
from HAProxy to one server.
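As a back-of-the-envelope illustration of that limit: the TCP source port is a 16-bit field, so one source address can hold at most 65535 connections to a given server address and port. The sketch below uses that theoretical ceiling as an assumption; the usable figure depends on the configured ephemeral port range.

```python
import math

# Assumption: usable source ports per (source IP, server IP:port) pair.
# 65535 is the ceiling for a 16-bit port number with port 0 excluded.
PORTS_PER_SOURCE = 65535


def sources_needed(target_connections: int,
                   ports_per_source: int = PORTS_PER_SOURCE) -> int:
    """Number of distinct source addresses needed to reach the target."""
    if target_connections <= 0:
        return 0
    return math.ceil(target_connections / ports_per_source)


print(sources_needed(1_000_000))  # prints 16
```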

So far I thought of two ways:
- each time the 'source' keyword is encountered on a 'server' line,
  duplicate the original 'struct server' and fill 'conn_src' with
  the correct source information. It's easy to implement but does
  not scale at all. In fact it mimics the multiple 'server' lines.
  The big advantage is that it can use all existing features that
  deal with 'struct server' (balance keyword, for example).
- use a list of 'struct conn_src' in 'struct server' and 'struct
  proxy' and choose the best source (using round-robin, leastconn,
  etc...) when a connection is about to get established.

The config. syntax would look like this:

server srv 127.0.0.1:9000 source 127.0.0.2 source 127.0.0.3 source 127.0.0.4 
source 127.0.0.5 source 127.0.0.6 source 127.0.1.0/24

Not using ip1,ip2,ip/cidr,... avoids confusion when using keywords like
usesrc, interface, etc...
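For illustration, a sketch of how repeated 'source' keywords on one 'server' line could be collected into a list. This is not the HAProxy parser: `parse_sources` is a made-up helper, it ignores the optional port part of 'source', and it treats a bare address as a single-address network.

```python
import ipaddress


def parse_sources(server_line: str):
    """Collect the argument of every 'source' keyword as an IP network."""
    tokens = server_line.split()
    nets = []
    # Stop one token early so tokens[i + 1] always exists.
    for i, tok in enumerate(tokens[:-1]):
        if tok == "source":
            nets.append(ipaddress.ip_network(tokens[i + 1], strict=False))
    return nets


line = ("server srv 127.0.0.1:9000 source 127.0.0.2 source 127.0.0.3 "
        "source 127.0.0.4 source 127.0.0.5 source 127.0.0.6 "
        "source 127.0.1.0/24")
nets = parse_sources(line)
print(len(nets), sum(n.num_addresses for n in nets))
```

Each keyword yields one list entry, so per-source options such as usesrc or interface would have an unambiguous owner.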

Checks to the server would be done from each source but it can be very
slow to cover the whole range.

The main problem I see is how to efficiently store all sources for each
server. Using the CIDR syntax can quickly allow millions of sources to
be used and if we want to use algorithms like 'leastconn', we need to
remember how many connections are still active on a particular source
(using round-robin + an index into the range would otherwise have been
one solution)
I have some ideas but I would like to know the preferred way.
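For comparison, the round-robin + index variant mentioned in the parenthesis needs only one integer per range, whatever its size. A minimal sketch, assuming every address of the block is usable; `RangeRoundRobin` is a name made up for the example.

```python
import ipaddress


class RangeRoundRobin:
    """Round-robin over a CIDR range using a single index as state."""

    def __init__(self, cidr: str):
        self.net = ipaddress.ip_network(cidr, strict=False)
        self.index = 0

    def next_source(self):
        # Networks support integer indexing, so no address list is stored.
        addr = self.net[self.index]
        self.index = (self.index + 1) % self.net.num_addresses
        return addr


rr = RangeRoundRobin("127.0.1.0/24")
first, second = rr.next_source(), rr.next_source()
```

The cost is that it knows nothing about how many connections are still active on each address, which is exactly what leastconn would need.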

Thanks.

-- 
Aurélien Nephtali