Re: haproxy and solarflare onload

2018-06-13 Thread Elias Abacioglu
Hi Emeric,

On Tue, Jun 12, 2018 at 2:14 PM, Emeric Brun  wrote:

>
> Yes, we are always interested in testing hardware with our software to
> advise the end users on whether they could see some benefit.
> But currently we are very busy and I think we can't test your NIC before
> the last quarter of the year.
>
> R,
> Emeric
>
>
Okay, that should be fine, I guess, unless I find a use for the NIC before
then. I have a support ticket open with the Solarflare team, trying to iron
out some issues I'm hitting using Onload. And of course I will share my
findings if I get it working to my satisfaction. Drop me a mail with
details when it gets closer to Q4.

/Elias


Re: haproxy and solarflare onload

2018-06-12 Thread Emeric Brun
Hi Elias,

On 05/28/2018 04:08 PM, Elias Abacioglu wrote:
> Hi Willy and HAproxy folks!
> 
> Sorry for bumping this old thread. But Solarflare recently released a new 
> Onload version.
> http://www.openonload.org/download/openonload-201805-ReleaseNotes.txt 
> 
> 
> Here is a small excerpt from the Release Notes:
> "
> 
>  A major overhaul to clustering and scalable filters enables new data
>  center use cases.
> 
>  Scalable filters direct all traffic for a given VI to Onload so that
>  the filter table size is not a constraint on the number of concurrent
>  connections supported efficiently. This release allows scalable filters
>  to be combined for both passive and active open connections and with RSS,
>  enabling very high transaction rates for use cases such as a reverse web
>  proxy.
> 
> "
> 
> So this NIC with Onload features requires tuning, and perhaps a better
> understanding of the Linux network stack than I have, to get it working in
> a high-volume/high-frequency setup.
> I am willing to gift one SFN8522 (2x10Gbit/s SFP+) to the HAproxy team
> (within the EU there should be no customs charges) if you want to test the
> card and its capabilities.
> I don't have any hard requirements for gifting this card, just that you're
> willing to give it a fair shot. The only thing I want in return is that you
> share your insights, good or bad. Perhaps we can get a working Onload
> profile for HAproxy; they ship an example profile for nginx in
> onload-201805. I am still very curious whether Onload can actually offload
> the CPU more than a regular NIC.
> 
> I don't have an EnterpriseOnload license, but this card should get a free
> ScaleOut Onload license (it's basically included with the card, but Dell
> forgot, so I had to reach out to Solarflare support to get one). And
> ScaleOut Onload is essentially their OpenOnload.
> I could help out with that if needed.
> 
> So HAproxy team, you want this card to play with?
> 
> /Elias

Yes, we are always interested in testing hardware with our software to
advise the end users on whether they could see some benefit.
But currently we are very busy and I think we can't test your NIC before the
last quarter of the year.

R,
Emeric



Re: haproxy and solarflare onload

2018-05-28 Thread Elias Abacioglu
Hi Willy and HAproxy folks!

Sorry for bumping this old thread. But Solarflare recently released a new
Onload version.
http://www.openonload.org/download/openonload-201805-ReleaseNotes.txt

Here is a small excerpt from the Release Notes:
"

 A major overhaul to clustering and scalable filters enables new data
 center use cases.

 Scalable filters direct all traffic for a given VI to Onload so that
 the filter table size is not a constraint on the number of concurrent
 connections supported efficiently. This release allows scalable filters
 to be combined for both passive and active open connections and with RSS,
 enabling very high transaction rates for use cases such as a reverse web
 proxy.

"

So this NIC with Onload features requires tuning, and perhaps a better
understanding of the Linux network stack than I have, to get it working in
a high-volume/high-frequency setup.
I am willing to gift one SFN8522 (2x10Gbit/s SFP+) to the HAproxy team
(within the EU there should be no customs charges) if you want to test the
card and its capabilities.
I don't have any hard requirements for gifting this card, just that you're
willing to give it a fair shot. The only thing I want in return is that
you share your insights, good or bad. Perhaps we can get a working Onload
profile for HAproxy; they ship an example profile for nginx in
onload-201805. I am still very curious whether Onload can actually offload
the CPU more than a regular NIC.

I don't have an EnterpriseOnload license, but this card should get a free
ScaleOut Onload license (it's basically included with the card, but Dell
forgot, so I had to reach out to Solarflare support to get one). And
ScaleOut Onload is essentially their OpenOnload.
I could help out with that if needed.

So HAproxy team, you want this card to play with?

/Elias
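For anyone curious what such a profile looks like: Onload profiles are files of `onload_set` directives selected by name with `onload --profile`. A hypothetical sketch of a haproxy profile, modeled on the nginx one shipped with onload-201805; the `EF_*` names are real Onload tunables, but the values and the profile name are illustrative, not tested:

```shell
# Hypothetical haproxy.opf; installed profiles normally live in the
# Onload install's profiles directory, this just shows the format.
cat > haproxy.opf <<'EOF'
onload_set EF_POLL_USEC 100000
onload_set EF_CLUSTER_SIZE 4
EOF

# Start haproxy accelerated via the preload library, using the profile:
onload --profile=haproxy haproxy -f /etc/haproxy/haproxy.cfg
```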


Re: haproxy and solarflare onload

2017-12-20 Thread Elias Abacioglu
>
> Apparently I'm not graphing conn_rate (i need to add it, but I have no
> values now), cause we're also sending all SSL traffic to other nodes using
> TCP load balancing.
>

Update: I'm at around 7.7k connection rate.


Re: haproxy and solarflare onload

2017-12-20 Thread Elias Abacioglu
On Wed, Dec 20, 2017 at 2:10 PM, Willy Tarreau  wrote:

> On Wed, Dec 20, 2017 at 11:48:27AM +0100, Elias Abacioglu wrote:
> > Yes, I have one node running with Solarflare SFN8522 2p 10Gbit/s
> currently
> > without Onload enabled.
> > it has 17.5K http_request_rate and ~26% server interrupts on core 0 and 1
> > where the NIC IRQ is bound to.
> >
> > And I have a similar node with Intel X710 2p 10Gbit/s.
> > It has 26.1K http_request_rate and ~26% server interrupts on core 0 and 1
> > where the NIC IRQ is bound to.
> >
> > both nodes have 1 socket, Intel Xeon CPU E3-1280 v6, 32 GB RAM.
>
> In both cases this is very low performance. We're getting 245k req/s and
> 90k connections/s on a somewhat comparable Core i7-4790K on small objects
> and are easily saturating 2 10G NICs with medium-sized objects. The problem
> I'm seeing is that if your cable is not saturated, you're supposed to be
> running at a higher request rate, and if it's saturated you should not
> observe the slightest difference between the two tests. In fact what I'm
> suspecting is that you're running with ~45kB objects and that your Intel
> NIC managed to reach the line rate, while in the same test the SFN8522
> cannot even reach it. Am I wrong? If so, from what I remember from the 40G
> tests 2 years ago, you should be able to get close to 35-40G with such
> object sizes.
>

I forgot to mention that this was not a benchmark test. I tested with live
traffic (/me hides in shame).
That's the reason we aren't saturated; it's not that we've hit the limit.
And why one node gets more traffic has to do with the different VIPs
assigned. I'm not sure why exactly, since we split the VIPs evenly, but I
suspect one of the VIPs gets more traffic.
And I can tell you that we can't reach 245k req/s. At this very moment, if I
look at the Intel node, we've got ~70% CPU idle on cores 0+1 where the NIC
IRQs are set, and ~58% on cores 2+3 where haproxy is running. And this node
is currently at around 27k req/s. With this math we would hit 100% CPU on
cores 2+3 at around 47k req/s.
So we'd peak in CPU usage before 50k req/s; we're not even close to 245k
req/s. I guess I need to learn more tuning.
That was my goal/vision with Solarflare: offload the CPU more so I can give
more cores to haproxy.
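For the record, the extrapolation is just linear scaling, reading the ~58% figure as busy time on cores 2+3 (that's the reading that makes the ~47k estimate come out):

```shell
# If haproxy's CPU use scales linearly with request rate, 27k req/s at
# ~58% busy on cores 2+3 tops out near:
rps=27000
busy_pct=58
ceiling=$(( rps * 100 / busy_pct ))
echo "${ceiling}"   # 46551, i.e. "around 47k req/s"
```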

Is there a metric that shows avg object size?

Apparently I'm not graphing conn_rate (I need to add it, so I have no
values right now); it matters because we're also sending all SSL traffic to
other nodes using TCP load balancing.



> Oh, just one thing: verify that you're not running with jumbo frames in
> the Solarflare case. Jumbo frames used to help *a lot* 10 years ago when
> they were saving interrupt processing time. Nowadays they instead hurt a
> lot, because allocating 9kB of contiguous memory at once for a packet is
> much more difficult than allocating only 1.5kB. Honestly, I don't remember
> having seen a single case over the last 5+ years where running with jumbo
> frames would permit reaching the same performance as no jumbo. GSO+GRO
> have helped a lot there as well!
>

Jumbo frames are not enabled; these nodes are connected directly to the
Internet :)
GSO+GRO are enabled on both the Intel and Solarflare cards.
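For anyone reproducing the comparison, both points can be verified per interface; a quick sketch (eth0 is a placeholder for the actual interface name):

```shell
# Confirm standard frames (mtu 1500, i.e. no jumbo) and GRO/GSO enabled.
ip link show dev eth0 | grep -o 'mtu [0-9]*'
ethtool -k eth0 | grep -E '(generic-receive-offload|generic-segmentation-offload)'
```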


Re: haproxy and solarflare onload

2017-12-20 Thread Elias Abacioglu
On Wed, Dec 20, 2017 at 3:27 PM, Christian Ruppert  wrote:

> Oh, btw, I'm just reading that onload documentation.
>
> 
> Filters
> Filters are used to deliver packets received from the wire to the
> appropriate
> application. When filters are exhausted it is not possible to create new
> accelerated
> sockets. The general recommendation is that applications do not allocate
> more than
> 4096 filters ‐ or applications should not create more than 4096 outgoing
> connections.
> The limit does not apply to inbound connections to a listening socket.
> 


This feels severely limiting.
Support was talking about scalable filters, but those don't seem applicable
when using virtual IPs.
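A quick way to sanity-check a box against that 4096 guideline is to count currently established outbound connections toward the backends, since those are the ones that would each consume a filter; a sketch (the backend subnet is a placeholder):

```shell
# Count established haproxy->backend connections; if this regularly
# exceeds ~4096, accelerated active-open sockets would exhaust filters.
# 10.3.20.0/24 is a placeholder for your backend subnet.
ss -tn state established dst 10.3.20.0/24 | wc -l
```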


Re: haproxy and solarflare onload

2017-12-20 Thread Elias Abacioglu
On Wed, Dec 20, 2017 at 1:11 PM, Christian Ruppert  wrote:

> Hi Elias,
>
> I'm currently preparing a test setup including an SFN8522 + onload.
> How did you measure it? When did those errors (drops/discards?) appear,
> during a test or some real traffic?
> The first thing I did was updating the driver + firmware. Are both up to
> date in your case?
>
> I haven't measured / compared the SFN8522 against an X520 or X710 yet, but
> do you have RSS / affinity or something related enabled/set? Intel has
> some features and Solarflare may have its own stuff.


Those errors appear during real traffic.
My workflow:
* kill keepalived, public VIPs get removed.
* restart HAproxy with onload (no errors here)
* start keepalived, public VIPs get added.
* traffic starts to flow in and errors start appearing almost immediately.

And the driver and firmware are the latest.
Currently I have not set RSS, which means it creates one queue per core, in
my case 4 per port. And I assign IRQ affinity to cores 0+1; haproxy gets
cores 2+3.
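For reference, that pinning (NIC IRQs on cores 0+1, haproxy on 2+3) is usually done along these lines; a sketch with placeholder IRQ numbers:

```shell
# Placeholder IRQ numbers; find the real ones with: grep -i sfc /proc/interrupts
for irq in 120 121 122 123; do
    echo 3 > "/proc/irq/${irq}/smp_affinity"   # bitmask 0x3 = cores 0+1
done

# Keep haproxy on cores 2+3 (cpu-map in haproxy.cfg works too)
taskset -c 2,3 haproxy -f /etc/haproxy/haproxy.cfg
```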


Re: haproxy and solarflare onload

2017-12-20 Thread Christian Ruppert

Oh, btw, I'm just reading that onload documentation.


Filters
Filters are used to deliver packets received from the wire to the
appropriate application. When filters are exhausted it is not possible to
create new accelerated sockets. The general recommendation is that
applications do not allocate more than 4096 filters - or applications
should not create more than 4096 outgoing connections.
The limit does not apply to inbound connections to a listening socket.


On 2017-12-20 13:11, Christian Ruppert wrote:
> [...]

Re: haproxy and solarflare onload

2017-12-20 Thread Willy Tarreau
On Wed, Dec 20, 2017 at 11:48:27AM +0100, Elias Abacioglu wrote:
> Yes, I have one node running with Solarflare SFN8522 2p 10Gbit/s currently
> without Onload enabled.
> it has 17.5K http_request_rate and ~26% server interrupts on core 0 and 1
> where the NIC IRQ is bound to.
> 
> And I have a similar node with Intel X710 2p 10Gbit/s.
> It has 26.1K http_request_rate and ~26% server interrupts on core 0 and 1
> where the NIC IRQ is bound to.
> 
> both nodes have 1 socket, Intel Xeon CPU E3-1280 v6, 32 GB RAM.

In both cases this is very low performance. We're getting 245k req/s and 90k
connections/s on a somewhat comparable Core i7-4790K on small objects and
are easily saturating 2 10G NICs with medium sized objects. The problem I'm
seeing is that if your cable is not saturated, you're supposed to be running
at a higher request rate, and if it's saturated you should not observe the
slightest difference between the two tests. In fact what I'm suspecting is
that you're running with ~45kB objects and that your intel NIC managed to
reach the line rate, and that in the same test the SFN8522 cannot even reach
it. Am I wrong ? If so, from what I remember from the 40G tests 2 years ago,
you should be able to get close to 35-40G with such object sizes.

Oh just one thing : verify that you're not running with jumbo frames on the
solarflare case. Jumbo frames used to help *a lot* 10 years ago when they
were saving interrupt processing time. Nowadays they instead hurt a lot
because allocating 9kB of contiguous memory at once for a packet is much
more difficult than allocating only 1.5kB. Honestly I don't remember having
seen a single case over the last 5+ years where running with jumbo frames
would permit reaching the same performance as no jumbo. GSO+GRO have helped
a lot there as well!

Cheers,
Willy



Re: haproxy and solarflare onload

2017-12-20 Thread Christian Ruppert

Hi Elias,

I'm currently preparing a test setup including an SFN8522 + onload.
How did you measure it? When did those errors (drops/discards?) appear,
during a test or some real traffic?
The first thing I did was updating the driver + firmware. Are both up to
date in your case?


I haven't measured / compared the SFN8522 against an X520 or X710 yet,
but do you have RSS / affinity or something related enabled/set? Intel
has some features and Solarflare may have its own stuff.


On 2017-12-20 11:48, Elias Abacioglu wrote:

Hi,

Yes on the LD_PRELOAD.

Yes, I have one node running with Solarflare SFN8522 2p 10Gbit/s
currently without Onload enabled.
it has 17.5K http_request_rate and ~26% server interrupts on core 0
and 1 where the NIC IRQ is bound to.

And I have a similar node with Intel X710 2p 10Gbit/s.
It has 26.1K http_request_rate and ~26% server interrupts on core 0
and 1 where the NIC IRQ is bound to.

both nodes have 1 socket, Intel Xeon CPU E3-1280 v6, 32 GB RAM.

So without Onload Solarflare performs worse than the X710 since it has
the same amount of SI load with less traffic. And a side note is that
I haven't compared the ethtool settings between Intel and Solarflare,
just running with the defaults of both cards.

I currently have a support ticket open with the Solarflare team to
about the issues I mentioned in my previous mail, if they sort that
out I can perhaps setup a test server if I can manage to free up one
server.
Then we can do some synthetic benchmarks with a set of parameters of
your choosing.

Regards,

/Elias

On Wed, Dec 20, 2017 at 9:48 AM, Willy Tarreau wrote:
> [...]


--
Regards,
Christian Ruppert



Re: haproxy and solarflare onload

2017-12-20 Thread Elias Abacioglu
Hi,

Yes on the LD_PRELOAD.

Yes, I have one node running with a Solarflare SFN8522 2p 10Gbit/s,
currently without Onload enabled.
It has 17.5K http_request_rate and ~26% server interrupts on cores 0 and 1,
where the NIC IRQs are bound.

And I have a similar node with Intel X710 2p 10Gbit/s.
It has 26.1K http_request_rate and ~26% server interrupts on core 0 and 1
where the NIC IRQ is bound to.

Both nodes have 1 socket, Intel Xeon CPU E3-1280 v6, 32 GB RAM.

So without Onload, Solarflare performs worse than the X710, since it has the
same amount of SI load with less traffic. A side note: I haven't compared
the ethtool settings between Intel and Solarflare; I'm just running with
the defaults of both cards.
I currently have a support ticket open with the Solarflare team about the
issues I mentioned in my previous mail; if they sort that out I can perhaps
set up a test server, if I manage to free up one.
Then we can do some synthetic benchmarks with a set of parameters of your
choosing.

Regards,
/Elias



On Wed, Dec 20, 2017 at 9:48 AM, Willy Tarreau  wrote:

> Hi Elias,
>
> On Tue, Dec 19, 2017 at 02:23:21PM +0100, Elias Abacioglu wrote:
> > Hi,
> >
> > I recently bought a solarflare NIC with (ScaleOut) Onload / OpenOnload to
> > test it with HAproxy.
> >
> > Has anyone tried running haproxy with solarflare onload functions?
> >
> > After I started haproxy with onload, this started spamming on the kernel
> > log:
> > Dec 12 14:11:54 dflb06 kernel: [357643.035355] [onload]
> > oof_socket_add_full_hw: 6:3083 ERROR: FILTER TCP 10.3.54.43:4147
> > 10.3.20.116:80 failed (-16)
> > Dec 12 14:11:54 dflb06 kernel: [357643.064395] [onload]
> > oof_socket_add_full_hw: 6:3491 ERROR: FILTER TCP 10.3.54.43:39321
> > 10.3.20.113:80 failed (-16)
> > Dec 12 14:11:54 dflb06 kernel: [357643.081069] [onload]
> > oof_socket_add_full_hw: 3:2124 ERROR: FILTER TCP 10.3.54.43:62403
> > 10.3.20.30:445 failed (-16)
> > Dec 12 14:11:54 dflb06 kernel: [357643.082625] [onload]
> > oof_socket_add_full_hw: 3:2124 ERROR: FILTER TCP 10.3.54.43:62403
> > 10.3.20.30:445 failed (-16)
> >
> > And this in haproxy log:
> > Dec 12 14:12:07 dflb06 haproxy[21145]: Proxy ssl-relay reached system
> > memory limit at 9931 sockets. Please check system tunables.
> > Dec 12 14:12:07 dflb06 haproxy[21146]: Proxy ssl-relay reached system
> > memory limit at 9184 sockets. Please check system tunables.
> > Dec 12 14:12:07 dflb06 haproxy[21145]: Proxy HTTP reached system memory
> > limit at 9931 sockets. Please check system tunables.
> > Dec 12 14:12:07 dflb06 haproxy[21145]: Proxy HTTP reached system memory
> > limit at 9931 sockets. Please check system tunables.
> >
> >
> > Apparently I've hit the max hardware filter limit on the card.
> > Does anyone here have experience in running haproxy with onload features?
>
> I've never got any report of any such test, though in the past I thought
> it would be nice to run such a test, at least to validate the perimeter
> covered by the library (you're using it as LD_PRELOAD, that's it ?).
>
> > Mind sharing insights and advice on how to get a functional setup?
>
> I really don't know what can reasonably be expected from code trying to
> partially bypass a part of the TCP stack, to be honest. From what I've
> read a long time ago, onload might be doing its work in a not very
> intrusive way but judging by your messages above I'm having some doubts
> now.
>
> Have you tried without this software, using the card normally ? I mean,
> 2 years ago I had the opportunity to test haproxy on a dual-40G setup
> and we reached 60 Gbps of forwarded traffic with all machines in the
> test bench reaching their limits (and haproxy reaching 100% as well),
> so for me that proves that the TCP stack still scales extremely well
> and that while such acceleration software might make sense for a next
> generation NIC running on old hardware (eg: when 400 Gbps NICs start
> to appear), I'm really not convinced that it makes any sense to use
> them on well supported setups like 2-4 10Gbps links which are very
> common nowadays. I mean, I managed to run haproxy at 10Gbps 10 years
> ago on a core2-duo! Hardware has evolved quite a bit since :-)
>
> Regards,
> Willy
>


Re: haproxy and solarflare onload

2017-12-20 Thread Willy Tarreau
Hi Elias,

On Tue, Dec 19, 2017 at 02:23:21PM +0100, Elias Abacioglu wrote:
> Hi,
> 
> I recently bought a solarflare NIC with (ScaleOut) Onload / OpenOnload to
> test it with HAproxy.
> 
> Has anyone tried running haproxy with solarflare onload functions?
> 
> After I started haproxy with onload, this started spamming on the kernel
> log:
> Dec 12 14:11:54 dflb06 kernel: [357643.035355] [onload]
> oof_socket_add_full_hw: 6:3083 ERROR: FILTER TCP 10.3.54.43:4147
> 10.3.20.116:80 failed (-16)
> Dec 12 14:11:54 dflb06 kernel: [357643.064395] [onload]
> oof_socket_add_full_hw: 6:3491 ERROR: FILTER TCP 10.3.54.43:39321
> 10.3.20.113:80 failed (-16)
> Dec 12 14:11:54 dflb06 kernel: [357643.081069] [onload]
> oof_socket_add_full_hw: 3:2124 ERROR: FILTER TCP 10.3.54.43:62403
> 10.3.20.30:445 failed (-16)
> Dec 12 14:11:54 dflb06 kernel: [357643.082625] [onload]
> oof_socket_add_full_hw: 3:2124 ERROR: FILTER TCP 10.3.54.43:62403
> 10.3.20.30:445 failed (-16)
> 
> And this in haproxy log:
> Dec 12 14:12:07 dflb06 haproxy[21145]: Proxy ssl-relay reached system
> memory limit at 9931 sockets. Please check system tunables.
> Dec 12 14:12:07 dflb06 haproxy[21146]: Proxy ssl-relay reached system
> memory limit at 9184 sockets. Please check system tunables.
> Dec 12 14:12:07 dflb06 haproxy[21145]: Proxy HTTP reached system memory
> limit at 9931 sockets. Please check system tunables.
> Dec 12 14:12:07 dflb06 haproxy[21145]: Proxy HTTP reached system memory
> limit at 9931 sockets. Please check system tunables.
> 
> 
> Apparently I've hit the max hardware filter limit on the card.
> Does anyone here have experience in running haproxy with onload features?

I've never got any report of any such test, though in the past I thought
it would be nice to run such a test, at least to validate the perimeter
covered by the library (you're using it as LD_PRELOAD, that's it ?).

> Mind sharing insights and advice on how to get a functional setup?

I really don't know what can reasonably be expected from code trying to
partially bypass a part of the TCP stack, to be honest. From what I've
read a long time ago, onload might be doing its work in a not very
intrusive way but judging by your messages above I'm having some doubts
now.

Have you tried without this software, using the card normally ? I mean,
2 years ago I had the opportunity to test haproxy on a dual-40G setup
and we reached 60 Gbps of forwarded traffic with all machines in the
test bench reaching their limits (and haproxy reaching 100% as well),
so for me that proves that the TCP stack still scales extremely well
and that while such acceleration software might make sense for a next
generation NIC running on old hardware (eg: when 400 Gbps NICs start
to appear), I'm really not convinced that it makes any sense to use
them on well supported setups like 2-4 10Gbps links which are very
common nowadays. I mean, I managed to run haproxy at 10Gbps 10 years
ago on a core2-duo! Hardware has evolved quite a bit since :-)

Regards,
Willy



haproxy and solarflare onload

2017-12-19 Thread Elias Abacioglu
Hi,

I recently bought a solarflare NIC with (ScaleOut) Onload / OpenOnload to
test it with HAproxy.

Has anyone tried running haproxy with solarflare onload functions?

After I started haproxy with onload, this started spamming the kernel
log:
Dec 12 14:11:54 dflb06 kernel: [357643.035355] [onload]
oof_socket_add_full_hw: 6:3083 ERROR: FILTER TCP 10.3.54.43:4147
10.3.20.116:80 failed (-16)
Dec 12 14:11:54 dflb06 kernel: [357643.064395] [onload]
oof_socket_add_full_hw: 6:3491 ERROR: FILTER TCP 10.3.54.43:39321
10.3.20.113:80 failed (-16)
Dec 12 14:11:54 dflb06 kernel: [357643.081069] [onload]
oof_socket_add_full_hw: 3:2124 ERROR: FILTER TCP 10.3.54.43:62403
10.3.20.30:445 failed (-16)
Dec 12 14:11:54 dflb06 kernel: [357643.082625] [onload]
oof_socket_add_full_hw: 3:2124 ERROR: FILTER TCP 10.3.54.43:62403
10.3.20.30:445 failed (-16)

And this in haproxy log:
Dec 12 14:12:07 dflb06 haproxy[21145]: Proxy ssl-relay reached system
memory limit at 9931 sockets. Please check system tunables.
Dec 12 14:12:07 dflb06 haproxy[21146]: Proxy ssl-relay reached system
memory limit at 9184 sockets. Please check system tunables.
Dec 12 14:12:07 dflb06 haproxy[21145]: Proxy HTTP reached system memory
limit at 9931 sockets. Please check system tunables.
Dec 12 14:12:07 dflb06 haproxy[21145]: Proxy HTTP reached system memory
limit at 9931 sockets. Please check system tunables.


Apparently I've hit the max hardware filter limit on the card.
Does anyone here have experience in running haproxy with onload features?
Mind sharing insights and advice on how to get a functional setup?


Thanks,
Elias
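In case it helps anyone searching the archives later: the -16 in those filter errors is -EBUSY, i.e. the NIC refused to install another hardware filter, and the "reached system memory limit at N sockets" lines point at per-stack limits as well. A first-pass sketch of what to check (EF_MAX_ENDPOINTS is a real Onload tunable; the value here is illustrative, not a tested recommendation):

```shell
# Raise Onload's per-stack endpoint limit (the default can be far below
# haproxy's maxconn on a busy LB) and review kernel socket memory limits.
export EF_MAX_ENDPOINTS=65536
sysctl net.ipv4.tcp_mem net.core.somaxconn
onload haproxy -f /etc/haproxy/haproxy.cfg
```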