Hi Willy,

Thanks for your response; I know this is not at all related to haproxy, so
your help is really appreciated! Also, irqbalance is not running on these
systems.

> That's normal if you pinned the IRQs to cpus 0-47, you'd like to pin IRQs
> only to the CPUs of the first socket (ie: 0, 2, 4, 6, ..., 46 if I
> understand it right).

I think that will still not solve the issue: a packet can arrive on, say,
irq1, which is wired directly to cpu0, but the rx code path on cpu0 then
decides which cpu the flow is really meant for, takes that cpu's backlog
lock, and still sends an IPI. I expected packets for a flow to be steered
to the correct cpu in hardware, and thought this could be related to the
size of the Intel Flow Director table?
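
To make the RPS side concrete, this is the knob I am referring to
(untested sketch; assumes the interface is em1, and 0x555555 is the mask
with exactly bits 0,2,...,22 set, i.e. the cpus haproxy runs on):

# Restrict RPS for every rx queue to cpus 0,2,...,22 (mask 0x555555),
# so the kernel never steers a flow to the second socket:
for q in /sys/class/net/em1/queues/rx-*; do
    echo 555555 > "$q/rps_cpus"
done

This would at least confine the steering targets, though it does not
remove the backlog lock + IPI hop itself.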

root@1098366dea41:~# ethtool -S em1 | grep fdir_
     fdir_match:    1596533665    (29%)
     fdir_miss:     3908617542    (71%)
     fdir_overflow:      25408
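
(The percentages above are just each counter over the total lookups; a
quick check, with the numbers hard-coded from the output above:)

awk 'BEGIN { m=1596533665; f=3908617542;
             printf "match %.0f%%, miss %.0f%%\n", 100*m/(m+f), 100*f/(m+f) }'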

The Intel document says: "And the Intel® Ethernet Flow Director
Perfect-Match Filter Table has to be large enough to capture the
unique flows a given controller would typically see. For that reason,
Intel Ethernet Flow Director’s Perfect-Match Filter Table has 8k entries."

http://www.intel.in/content/dam/www/public/us/en/documents/white-papers/intel-ethernet-flow-director.pdf

So roughly 71% of lookups miss the table here, and 8K entries is very
small for haproxy serving far more concurrent connections than that.
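
If the table overflowing is the problem, one workaround I am considering
(untested; if I read the ixgbe notes right, enabling ntuple filters
switches the NIC from automatic ATR sampling to explicit perfect filters,
so with no rules installed steering falls back to plain RSS):

# Disable Flow Director ATR on em1 by switching to ntuple mode:
ethtool -K em1 ntuple on
# Confirm the feature flipped:
ethtool -k em1 | grep ntuple

With plain RSS, a flow should at least stay on one queue for its lifetime
instead of being resampled once the 8K table overflows.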

While trying to simulate this in a small lab setup, I ran 4 wrk instances
against haproxy; top showed 3 haproxy processes running on cpus 2, 4 and
8, and only the tx/rx counters of queues 2, 4 and 8 incremented during the
run, consistently every time. So it worked as expected, but the load on
the system was very low.
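
(Each of the 4 load generators was along these lines; the thread and
connection counts here are illustrative, and <haproxy-vip> stands for our
test VIP:)

wrk -t 2 -c 100 -d 60s http://<haproxy-vip>/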

In our production testing, a thousand client VMs are making 50K
connections each to haproxy (running on 60 servers), and here I noticed
that although haproxy on every server runs only on cpus 0,2,4,...,22, the
rx/tx counters of all queues increment.

Assuming this is the issue (which I am not sure of), do you have any ideas
on how to get around it, or any other suggestions?
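
To make the question concrete: is this roughly the first-socket pinning
you meant? (Untested sketch; assumes the em1 IRQ naming in
/proc/interrupts and round-robins the queues over cpus 0,2,...,22:)

# Pin each em1 IRQ to a single even cpu, round-robin over 0,2,...,22:
i=0
for irq in $(grep em1 /proc/interrupts | cut -d: -f1 | tr -d ' '); do
    cpu=$(( (i % 12) * 2 ))
    printf '%x' $((1 << cpu)) > /proc/irq/$irq/smp_affinity
    i=$((i + 1))
done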

Thanks,
- Krishna Kumar


On Tue, Jul 7, 2015 at 1:38 PM, Willy Tarreau <w...@1wt.eu> wrote:

> Hi,
>
> On Tue, Jul 07, 2015 at 01:24:28PM +0530, Krishna Kumar (Engineering)
> wrote:
> > Hi all,
> >
> > This is not related to haproxy, but I am having a performance issue
> > with the number of packets processed. I am running haproxy on a 48-core
> > system (we have 64 such servers at present, which is going to increase
> > for production testing), where cpus 0,2,4,6,...,46 are part of NUMA
> > node 1, and cpus 1,3,5,7,...,47 are part of NUMA node 2. The systems
> > are running Debian 7 with kernel 3.16.0-23 (both CONFIG_XPS and
> > CONFIG_RPS enabled). nbproc is set to 12, and each haproxy is bound to
> > cpus 0,2,4,...,22, so that they are all on the same socket, as seen
> > here:
> >
> > # ps -efF | egrep "hap|PID" | cut -c1-80
> > UID         PID   PPID  C    SZ   RSS PSR STIME TTY          TIME CMD
> > haproxy    3099      1 17 89697 324024  0 18:37 ?        00:11:19 haproxy -f hap
> > haproxy    3100      1 18 87171 314324  2 18:37 ?        00:12:00 haproxy -f hap
> > haproxy    3101      1 18 87214 305328  4 18:37 ?        00:12:00 haproxy -f hap
> > haproxy    3102      1 19 89215 322676  6 18:37 ?        00:12:02 haproxy -f hap
> > haproxy    3103      1 18 86788 310976  8 18:37 ?        00:11:59 haproxy -f hap
> > haproxy    3104      1 18 87197 314888 10 18:37 ?        00:12:00 haproxy -f hap
> > haproxy    3105      1 18 91311 319784 12 18:37 ?        00:11:59 haproxy -f hap
> > haproxy    3106      1 18 88785 305576 14 18:37 ?        00:12:00 haproxy -f hap
> > haproxy    3107      1 19 90366 326428 16 18:37 ?        00:12:09 haproxy -f hap
> > haproxy    3108      1 19 89758 320780 18 18:37 ?        00:12:09 haproxy -f hap
> > haproxy    3109      1 19 87670 314752 20 18:37 ?        00:12:07 haproxy -f hap
> > haproxy    3110      1 19 87763 316672 22 18:37 ?        00:12:10 haproxy -f hap
> >
> > set_irq_affinity.sh was run on the ixgbe card, and
> > /proc/irq/*/smp_affinity shows that each irq is bound to cpus 0-47
> > correctly. However, I see that packets are being processed on cpus of
> > the 2nd socket too, though user/system usage is zero on those, as
> > haproxy does not run on those cores.
>
> That's normal if you pinned the IRQs to cpus 0-47, you'd like to pin IRQs
> only to the CPUs of the first socket (ie: 0, 2, 4, 6, ..., 46 if I
> understand it right).
>
> Also, double-check that you don't have irqbalance running. It can change
> the settings behind your back, which is really unpleasant.
>
> Willy
>
>
