Hi Willy,

On 13 Dec 2013, at 02:13, Willy Tarreau <[email protected]> wrote:

> On Mon, Dec 09, 2013 at 03:43:09PM +0000, Annika Wickert wrote:
>> - Two Intel(R) Xeon(R) CPU X6550 @ 2.00GHz in each cluster node
>> - 2x Emulex Corporation OneConnect 10Gb NIC (rev 02) in each cluster node
>> - 32 GB RAM in each cluster node
>> - Two nodes per cluster (active-active in the new one)
> 
> I never had the opportunity to test Emulex NICs yet. It could be possible
> that they disable some TCP optimizations by default resulting in worse
> performance with splice().

I just read the Emulex documentation and it says TSO, LRO, and so on are
enabled by default.
http://www-dl.emulex.com/support/linux/83525/linux_11sp.pdf


> 
>> - Debian Squeeze / 3.1.0-1-amd64 / Tickrate 250
>> - CentOS release 6.4 (Final) / 3.11.5-1.el6 / Tickrate 1000
>> 
>> The higher the tickrate, the higher the CPU load. You quadrupled
>> the tickrate, and did your load quadruple as well? I suggest you
>> try a lower tickrate in the very same configuration.
> 
> 250 is the best tick rate for network related traffic, it allows a
> number of timing conversions to milliseconds to be done with a simple
> shift instead of a divide, while not hammering the system too fast.
> 
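For illustration (a sketch, not haproxy's actual code): with HZ=250 one tick is 4 ms, so tick/millisecond conversions reduce to a shift by 2 instead of a multiply or divide:

```c
/* Illustrative only: at HZ=250, 1000/HZ = 4, a power of two,
 * so conversions become shifts instead of divides. */
#define HZ 250

static inline unsigned int ticks_to_ms(unsigned int ticks)
{
    return ticks << 2;   /* ticks * (1000 / 250) = ticks * 4 */
}

static inline unsigned int ms_to_ticks(unsigned int ms)
{
    return ms >> 2;      /* ms / 4, rounding down */
}
```

With HZ=1000 the divide disappears entirely, but the timer interrupt fires four times as often, which is the CPU-load trade-off mentioned above.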
>> - We are forcing it via splice-request / splice-response
> 
> OK so I suspect this is purely TCP.
No, it’s mostly HTTP and HTTPS, but we also had splice-request /
splice-response enabled in the previous Haproxy version and it worked without
any impact.

> 
>> I believe splice is not always more efficient than recv/send;
> 
> Confirmed, especially with small transfers (less than a page = 4 kB).
Ok, we have many small transfers.
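For reference, a minimal sketch of what a splice()-based forwarding path looks like (an illustration assuming Linux and already-connected descriptors, not haproxy's actual code). Each chunk costs two splice() calls through an intermediate pipe, which is part of why recv/send can win on sub-page transfers:

```c
/* Sketch: zero-copy forwarding of len bytes from src_fd to dst_fd
 * through a pipe. Two syscalls per chunk (plus pipe setup), so for
 * transfers under a page (4 kB) a plain recv/send pair can be cheaper. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

ssize_t splice_forward(int src_fd, int dst_fd, size_t len)
{
    int pipefd[2];
    ssize_t in, out, total = 0;

    if (pipe(pipefd) < 0)
        return -1;

    while ((size_t)total < len) {
        /* Move data from the source into the pipe without copying
         * it through userspace. */
        in = splice(src_fd, NULL, pipefd[1], NULL, len - total,
                    SPLICE_F_MOVE | SPLICE_F_MORE);
        if (in <= 0)
            break;
        /* Drain the pipe into the destination. */
        out = splice(pipefd[0], NULL, dst_fd, NULL, in,
                     SPLICE_F_MOVE | SPLICE_F_MORE);
        if (out < 0)
            break;
        total += out;
    }
    close(pipefd[0]);
    close(pipefd[1]);
    return total;
}
```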

> 
>> use splice-auto to use it less aggressively (doc: splice-auto):
>> 
>> For testing we disabled splicing on one of the cluster members on the new
>> cluster (after successful tests). Now load drops below 8 from 16. So maybe I
>> will try it with splice-auto and, if that does not help, with a new haproxy
>> build containing the following git commits:
>> http://haproxy.1wt.eu/git?p=haproxy.git;a=commit;h=61d39a0e2a047df78f7f3bfcf5584090913cdc65
> 
> Oh good point, I completely forgot about this one. Yes it could be a culprit!
I tried it in testing environment and it looks like this makes the difference.


> 
>> http://haproxy.1wt.eu/git?p=haproxy.git;a=commit;h=fa8e2bc68c583a227ebc78bab5779b84065b28da
>> 
>> Haproxy uses heuristics to estimate if kernel splicing might improve
>> performance or not. Both directions are handled independently. Note
>> that the heuristics used are not very aggressive in order to limit
>> excessive use of splicing.
> 
> Yes, the heuristics consist of detecting whether haproxy manages to read a full
> buffer at once and to purge it at once. If that works, then it's considered
> that the traffic is high enough to make good use of splice(). Otherwise,
> with non-complete buffers, it sticks to recv/send. It tends to work really
> well in web environments where you don't want favicon.ico to be spliced but
> you do want your photos to be.
Ok, so I will try this also in testing environment. 
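If it helps anyone following along, the heuristic as described could be sketched roughly like this (purely illustrative; the names and threshold are made up and this is not haproxy's actual code):

```c
/* Illustrative sketch of the idea: streams whose reads keep filling the
 * whole buffer are "fast" and worth splicing; streams with short reads
 * (favicon-sized objects) stay on recv/send. */
struct stream_stats {
    unsigned int full_reads;   /* reads that filled the buffer */
    unsigned int short_reads;  /* reads that did not */
};

int should_use_splice(const struct stream_stats *s)
{
    /* Threshold chosen arbitrarily for illustration. */
    return s->full_reads >= 2 && s->full_reads > s->short_reads;
}
```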


> 
> Regards,
> Willy

To say something positive: SSL offloading works like a charm :).

Thank you for your explanations in the other mail :).

Regards,
Annika
