Hi,

This is my first post here, as I usually like to find solutions by myself. 
But this time I'm running out of ideas. (Well, the reality is that we are 
running out of time, as at some point our boss will run out of patience if we 
don't manage to deliver some results :P )

Our problem is that we cannot replicate the performance we got with a specific 
kernel one of my colleagues built once. He built that kernel without paying 
much attention to the options he was using (it was a "quick and dirty" build, 
and now we are paying the consequences!) and it seems he was extremely lucky 
(or inspired) that day, as we cannot reproduce the performance that specific 
kernel delivers. 

So after 3 weeks of running tests we decided that maybe it was about time to 
ask, so here we are :) 

Ok, so let's begin with the test lab we have set up (I will give you some 
hardware details, but keep in mind that one kernel delivers about 3x the 
performance of the others with the exact same HW configuration):

Router: 
MB: SuperMicro X8DTN+F
CPUs: 2 x Xeon 5620
LANs: Integrated Intel 82576 Dual-Port Gigabit Ethernet Controller
HyperThreading Disabled
IGB Driver load parameters: IntMode=2 InterruptThrottleRate=0,0 QueuePairs=0,0 
RSS=4,4
IRQ Balance Disabled (SMP affinity changed for RSS queues)
All queues bound to the second CPU (one RSS queue of each adapter bound to 
each core)
Rp_filter Disabled
Ip_forwarding Enabled
Iptables modules are NOT loaded
Machine is just doing IP forwarding across two interfaces

And basically all the rest is almost default, as we wanted to remove as many 
variables as possible.
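
To make the affinity setup above concrete, here is a minimal sketch of how the 
masks can be computed. The IRQ numbers (70-73) and the assumption that the 
cores of the second CPU are numbered 4-7 are hypothetical; the real values 
come from /proc/interrupts and from the machine's topology. The sketch only 
computes the hex masks that /proc/irq/<irq>/smp_affinity accepts on a machine 
with up to 32 CPUs; it does not write them.

```python
def affinity_mask(cpu: int) -> str:
    """Return the hex bitmask selecting a single CPU (no "0x" prefix)."""
    return format(1 << cpu, "x")


def plan_affinity(irqs, cpus):
    """Pair each IRQ with one CPU, round-robin, and return {irq: mask}."""
    return {irq: affinity_mask(cpus[i % len(cpus)]) for i, irq in enumerate(irqs)}


if __name__ == "__main__":
    # Hypothetical values: four RSS queue IRQs of one adapter, cores 4-7.
    for irq, mask in plan_affinity([70, 71, 72, 73], [4, 5, 6, 7]).items():
        print(f"echo {mask} > /proc/irq/{irq}/smp_affinity")
```

Running the same plan for the second adapter with the same core list gives the 
"one RSS queue of each adapter per core" layout described above.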

Receivers/Generators: 
Xeon 5620 machines using Bonesi as Packet generator (UDP 64 with 50k source 
addresses) 

And here comes the interesting part: in this scenario, using kernel 2.6.32.27 
with igb driver 4.1.2, we manage to get around 1.5 Mpps, but with all the other 
kernels we tried the maximum we get is less than 750 Kpps. So far we have tried 
kernels 3.0.73, 3.2.43 and 3.4.40, with no success. (We still need to try 
2.6.34.14, and we are working on a problem with 2.6.32.60, which doesn't boot; 
it is probably related to our LSI RAID controller.)
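
For reference, a back-of-the-envelope check of what a single gigabit port can 
carry. This assumes that "UDP 64" means 64-byte Ethernet frames (including the 
FCS) and counts the standard 8 bytes of preamble/SFD plus the 12-byte 
inter-frame gap; the function name is only for illustration.

```python
PREAMBLE_SFD_BYTES = 8     # preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12  # minimum idle time between frames, in byte times


def line_rate_pps(link_bps: float, frame_bytes: int) -> float:
    """Theoretical maximum packets per second on an Ethernet link."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


if __name__ == "__main__":
    pps = line_rate_pps(1_000_000_000, 64)  # assumed: 1 Gbit/s, 64-byte frames
    print(f"{pps / 1e6:.3f} Mpps")
```

Under that assumption one port tops out near 1.49 Mpps in each direction, so 
if the 1.5 Mpps figure is measured on one port in one direction it would be 
essentially wire speed, and the other kernels would be reaching about half of 
it.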

While investigating this issue (keep in mind that we are more 
networking/sysadmin guys; we may have quite a good knowledge of Linux, but we 
are really far away from you guys when it comes to the kernel and networking 
drivers), the only way we managed to find a difference was by running 
"perf top" on the machine under 2.6.32.27 and under the 3.x kernels. The main 
difference we found is:

Kernel 2.6.32.27: 

The top consuming function is "igb_poll". As I understand it, when the network 
is under heavy load the kernel starts operating the interface in NAPI polling 
mode, so everything seems to be normal and performance is really good.

Kernel 3.4.40 (we have seen similar behaviour on the other kernels):

Here things look completely different, and _raw_spin_lock_irqsave is consuming 
58% of the resources. (Quite big, isn't it?)

With my limited understanding, I guess this is a function that spins while 
waiting for a lock, and that might be the reason for the performance difference 
among kernels. But as _raw_spin_lock_irqsave is a commonly called function, we 
are not close at all to identifying the real reason for the performance 
degradation or how to avoid it.

So, does anybody have any idea why we see this massive difference in 
performance? (Or at least an idea that could lead us to the answer...)

And a few more questions (just in case nobody knows the answer to the previous 
question): 

- Do you think it is kernel or driver related? (We realized the igb driver 
configures itself depending on the kernel version, so we are not sure.)
- Are there any extremely important parameters we might be forgetting when 
compiling the kernel? 
- Is there any documentation you think we should read? (BTW, we have seen the 
Intel results for forwarding packets with Nehalem CPUs... But some information 
about how you achieve those astonishing results would be really appreciated :))

Thanks for your time!

Cheers!

Kind regards,
Xavier Trilla P.
Silicon Hosting

_______________________________________________
E1000-devel mailing list
https://lists.sourceforge.net/lists/listinfo/e1000-devel