On Wed, Dec 20, 2017 at 2:10 PM, Willy Tarreau <[email protected]> wrote:

> On Wed, Dec 20, 2017 at 11:48:27AM +0100, Elias Abacioglu wrote:
> > Yes, I have one node running with Solarflare SFN8522 2p 10Gbit/s currently
> > without Onload enabled.
> > it has 17.5K http_request_rate and ~26% server interrupts on core 0 and 1
> > where the NIC IRQ is bound to.
> >
> > And I have a similar node with Intel X710 2p 10Gbit/s.
> > It has 26.1K http_request_rate and ~26% server interrupts on core 0 and 1
> > where the NIC IRQ is bound to.
> >
> > both nodes have 1 socket, Intel Xeon CPU E3-1280 v6, 32 GB RAM.
>
> In both cases this is very low performance. We're getting 245k req/s and 90k
> connections/s on a somewhat comparable Core i7-4790K on small objects and
> are easily saturating 2 10G NICs with medium sized objects. The problem I'm
> seeing is that if your cable is not saturated, you're supposed to be running
> at a higher request rate, and if it's saturated you should not observe the
> slightest difference between the two tests. In fact what I'm suspecting is
> that you're running with ~45kB objects and that your intel NIC managed to
> reach the line rate, and that in the same test the SFN8522 cannot even reach
> it. Am I wrong ? If so, from what I remember from the 40G tests 2 years ago,
> you should be able to get close to 35-40G with such object sizes.
>
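For reference, the arithmetic behind the "~45kB objects at line rate" guess quoted above can be sketched as follows. This is only a back-of-the-envelope check: it assumes a single 10 Gbit/s port is fully saturated and that every request returns one object, and the function name `implied_object_size_bytes` is made up for illustration. Header and framing overhead are ignored, so the real payload would be somewhat smaller than the result.

```python
# Rough check: what object size would fill one 10G port at a given request rate?
# Assumptions (not from the thread): one saturated 10 Gbit/s port, one object
# per request, no protocol overhead.

LINK_BITS_PER_SECOND = 10e9  # one 10 Gbit/s port


def implied_object_size_bytes(req_per_second, link_bps=LINK_BITS_PER_SECOND):
    """Bytes per response needed to fill the link at the given request rate."""
    return link_bps / 8 / req_per_second


# Intel X710 node: 26.1k req/s
intel_object_size = implied_object_size_bytes(26_100)

# Solarflare node at 17.5k req/s, if it served objects of the same size
solarflare_bps = 17_500 * intel_object_size * 8

print(f"object size at line rate: {intel_object_size / 1000:.1f} kB")
print(f"Solarflare throughput at that size: {solarflare_bps / 1e9:.2f} Gbit/s")
```

Under these assumptions the Intel figure works out to a bit under 48 kB per response on the wire, which is in the same range as the ~45kB guess, and the Solarflare node would be well below line rate for the same object size.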
I forgot to mention that this was not a benchmark test. I tested with live
traffic (/me hides in shame). That's the reason we aren't saturated, so it's
not that we've hit the limit now. And why one node gets more traffic has to do
with the different VIPs assigned. I'm not really sure why, because we split
the VIPs evenly, but I suspect one of the VIPs gets more traffic.

And I can tell you that we can't reach 245k req/s.
At this very moment, if I look at the Intel node, we've got ~70% CPU idle on
core 0+1 where the NIC IRQ is bound, and ~58% on core 2+3 where haproxy is
running. And this node is currently at around 27k req/s.
With this math we would hit 100% CPU on core 2+3 at around 47k req/s.
So we spike in CPU usage before 50k req/s, and we're not even close to 245k
req/s. I guess I need to learn more tuning.
That was my goal with Solarflare: offload the CPU more so I can give more
cores to haproxy.

Is there a metric that shows avg object size?
Apparently I'm not graphing conn_rate (I need to add it, but I have no values
now), because we're also sending all SSL traffic to other nodes using TCP load
balancing.

> Oh just one thing : verify that you're not running with jumbo frames on the
> solarflare case. Jumbo frames used to help *a lot* 10 years ago when they
> were saving interrupt processing time. Nowadays they instead hurt a lot
> because allocating 9kB of contiguous memory at once for a packet is much
> more difficult than allocating only 1.5kB. Honestly I don't remember having
> seen a single case over the last 5+ years where running with jumbo frames
> would permit to reach the same performance as no jumbo. GSO+GRO have helped
> a lot there as well!
>

Jumbo frames are not enabled, these nodes are connected directly to the
Internet :)
GSO+GRO is enabled for both Intel and Solarflare.
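The "100% CPU at around 47k req/s" estimate above is a simple linear extrapolation, and can be sketched like this. It assumes CPU usage on the haproxy cores grows linearly with request rate, which is only a rough approximation, and it reads the ~58% figure as utilization, since that is the reading that reproduces ~47k. The helper name `saturation_rate` is made up for illustration.

```python
# Linear extrapolation of the request rate at which the haproxy cores saturate.
# Assumption: CPU utilization scales linearly with request rate.


def saturation_rate(current_rate, utilization):
    """Estimated request rate at 100% CPU, given the current rate and utilization (0..1]."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return current_rate / utilization


current_rate = 27_000  # req/s on the Intel node

# Reading ~58% as utilization on core 2+3 (matches the ~47k estimate)
rate_if_58_busy = saturation_rate(current_rate, 0.58)

# Alternative reading: ~58% idle, i.e. ~42% utilization
rate_if_58_idle = saturation_rate(current_rate, 1 - 0.58)

print(f"58% busy -> saturates near {rate_if_58_busy / 1000:.1f}k req/s")
print(f"58% idle -> saturates near {rate_if_58_idle / 1000:.1f}k req/s")
```

Either reading puts the estimated ceiling far below 245k req/s, so the conclusion about needing more tuning does not depend on which one is right.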

