Hi,

This is the first time I have posted here, because I like to find solutions by myself. But this time I'm running out of ideas. (Well, the reality is that we are running out of time, as at some point our boss will run out of patience if we don't manage to deliver some results :P )
Our problem is that we are not able at all to replicate the performance we got with a specific kernel one of my colleagues built once. He built that kernel without paying much attention to the options he was using (it was a "fast and dirty" build, and now we are paying the consequences!) and it seems he was extremely lucky (or inspired) that day, as we cannot reproduce the performance that specific kernel delivers. So after 3 weeks of running tests we decided that maybe it was about time to ask, so here we are :)

OK, so let's begin with the test lab we have set up. (I will give you some hardware details, but keep in mind that one kernel delivers about 3x the performance of the others with the exact same HW configuration.)

Router:

  MB: SuperMicro X8DTN+F
  CPUs: 2 x Xeon 5620
  LANs: Integrated Intel 82576 Dual-Port Gigabit Ethernet Controller
  HyperThreading: Disabled
  IGB driver load parameters: IntMode=2 InterruptThrottleRate=0,0 QueuePairs=0,0 RSS=4,4
  IRQ Balance: Disabled (SMP affinity changed for RSS queues)
  All queues bound to the second CPU (one RSS queue of each adapter bound to each core)
  Rp_filter: Disabled
  Ip_forwarding: Enabled
  Iptables modules are NOT loaded
  Machine is just doing IP forwarding across two interfaces

Basically all the rest is almost default, as we wanted to remove as many variables as possible.

Receivers/Generators:

  Xeon 5620 machines using Bonesi as packet generator (UDP 64 with 50k source addresses)

And here comes the interesting part: in this scenario, using kernel 2.6.32.27 with igb driver 4.1.2, we manage to get around 1.5 Mpps, but with all the other kernels we tried the maximum we get is less than 750 Kpps.

So far we have tried kernels 3.0.73, 3.2.43 and 3.4.40, with no success. (We still need to try 2.6.34.14, and we are working on a problem with 2.6.32.60, which doesn't boot. It is probably related to our LSI RAID controller.)

Keep in mind that we are more networking/sysadmin guys. We may have quite good knowledge of Linux, but we are really far away from you guys when it comes to the kernel and networking drivers. While investigating this issue, the only way we managed to find a difference was by running "perf top" on the machine under 2.6.32.27 and under the 3.x kernels. The main difference we found is:

Kernel 2.6.32.27:

  The top consuming function is "igb_poll". As I understand it, when the network is under heavy load the kernel starts operating the interface in NAPI polling mode, so everything seems to be normal and performance is really good.

Kernel 3.4.40 (we have seen similar behaviour on other kernels):

  Here things look completely different, and _raw_spin_lock_irqsave is consuming 58% of the resources. (Quite big, isn't it?) With my really limited understanding, I guess this is code that spins waiting for a lock, and that might be the reason for the performance difference among kernels. But as _raw_spin_lock_irqsave is a commonly called function, we are not close at all to identifying the real reason for the performance degradation or how to avoid it.

So, does anybody have any idea why we see this massive difference in performance? (Or at least an idea that could lead us to the answer...)

And a few more questions (just in case nobody knows the answer to the previous one):

- Do you think it is kernel or driver related? (We realized the igb driver configures itself depending on the kernel version, so we are not sure.)
- Are there any extremely important parameters we might be forgetting when compiling the kernel?
- Is there any documentation you consider we should read? (BTW, we have seen Intel's results when forwarding packets with Nehalem CPUs... But some information about how you achieve those astonishing results would be really appreciated :))

Thanks for your time!

Cheers!

Kind regards,
Xavier Trilla P.
Silicon Hosting

Haven't tried Bare Metal Cloud yet? The evolution of VPS servers is here!
More information at: siliconhosting.com/cloud

_______________________________________________
E1000-devel mailing list
https://lists.sourceforge.net/lists/listinfo/e1000-devel
To learn more about Intel® Ethernet, visit http://communities.intel.com/community/wired
