On Wed, 4 May 2011, Ed Ravin wrote:
> I have good news about the E31230 platform - I found a problem in my
> test environment that was constraining the packet generator traffic
> in the switch, before it reached the box under test. That's why I wasn't
> seeing any packet drops - the packets weren't getting there in the first
> place.
>
> After I fixed the switch problem, I was able to route 1.45 million PPS
> of short "DoS" UDP packets from one igb port to the other using the Vyatta
> platform, with zero packet drops. mpstat showed CPU usage running around
> 85-90% on all the CPUs. When I increased the packet rate to 1.50 million
> PPS, CPU usage rose to 95-98% on all CPUs and the receiving NIC began
> to log RX FIFO errors (around 150/second). This is over 40% more packets
> than the X3450 platform was able to route in the same test, which totally
> meets our expectations for the E31230 hardware.

Really excellent, and consistent with our results. The 82580 should give
equivalent performance with twice the ports.

> The only outstanding issue is how to tune the ring buffers on the
> different CPU architectures - as I described below, the "ethtool -G"
> command that improved performance on one platform impaired it on the
> other, and vice versa.

So when you go to 4096 buffers for rx, you are using 8 contiguous pages
of receive descriptors, which in my experience is not necessary and can
even be hurtful: each descriptor is more likely to get flushed from
cache, and you are using many more memory locations (from slab) for the
data buffers behind all those descriptors. If you're dropping packets
with 512 descriptors, you can try 768 or 1024 descriptors instead, or
slightly increase the interrupt rate to cause faster flushing/cleaning
of descriptors (interrupt rate is less relevant once you start polling
with NAPI). How many ints/second per queue do you get when you're loaded
at 95-98%?

So, first go back to 512 descriptors, then increase the interrupt rate
(ethtool -C eth2 rx-usecs 20). If that doesn't prevent drops (maybe
because you're already polling), try decreasing the interrupt rate
instead (ethtool -C eth2 rx-usecs 62).

The general rule: if you see rx_missed errors ONLY, your bus/memory/NIC
is not fast enough to keep up. If you see rx_missed and some
receive_no_buffer_count (RNBC), then you need to get data buffers back
to the adapter faster, and likely need to increase the interrupt rate or
increase rx buffers with ethtool -G. If your CPU is completely maxed out
(polling), you may be better off with a lower interrupt rate in order to
give all available CPU to polling (and you increase bus efficiency); you
just don't want the rate so low that you start getting RNBC whenever
interrupts do get enabled.

As for taskset changing things, that seems a bit random, because you
still only have one memory node on this machine (only one socket) and
therefore a) would be fastest with the SLAB allocator, and b) can't have
any NUMA locality issues. My best guess is that your prefetcher and/or
DCA settings are different in the BIOS on each machine, and you're
seeing somewhat different results because of that.

A tool you should be able to run is available from the whatif.intel.com
web site: if you download the Intel Performance Tuning Utility (don't
worry about the vtune license), inside that package is a file vbtwrun,
which will show you the memory bandwidth used by your various workloads.
It should run on both platforms, though I'm not sure it will work on the
E3 since it's so new - give it a try.
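Putting those steps together as literal commands - "eth2" here just
stands in for whichever receiving port is dropping, so substitute your
own interface name:

    # go back to 512 rx descriptors
    ethtool -G eth2 rx 512

    # raise the interrupt rate - a lower rx-usecs value means more
    # interrupts per second
    ethtool -C eth2 rx-usecs 20

    # if drops continue (you're probably already polling), try the
    # opposite and lower the interrupt rate
    ethtool -C eth2 rx-usecs 62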
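To see which counters are climbing and roughly how many ints/second each
queue is taking while the test runs, something like the following should
work; the counter names are what recent igb drivers report via
"ethtool -S" and may differ slightly on your driver version:

    # watch the drop-related counters while the traffic is flowing
    watch -n1 "ethtool -S eth2 | egrep 'rx_missed|rx_no_buffer|rx_fifo'"

    # rough ints/second per queue: sample the per-vector counts in
    # /proc/interrupts one second apart and subtract
    grep eth2 /proc/interrupts; sleep 1; grep eth2 /proc/interrupts

Each eth2-TxRx-N line in /proc/interrupts is typically one queue's MSI-X
vector, so the difference between the two samples is that queue's
interrupt rate.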
Jesse

> Thanks,
>
>   -- Ed
>
> On Wed, May 04, 2011 at 09:29:35AM -0700, Wyborny, Carolyn wrote:
> > >-----Original Message-----
> > >From: Ed Ravin [mailto:era...@panix.com]
> > >Sent: Tuesday, May 03, 2011 6:55 PM
> > >To: e1000-de...@lists.sf.net
> > >Subject: Re: [E1000-devel] igb driver throughput on Intel E31320 vs
> > >Intel X3450
> > >
> > >I'm beginning to wonder whether this is cache line bouncing.
> > >
> > >I went to the X3450 box and issued these commands:
> > >
> > >   taskset ff ethtool -G eth2 rx 4096 tx 512
> > >   taskset ff ethtool -G eth3 rx 4096 tx 512
> > >
> > >And performance on that box dramatically declined when processing
> > >the test with many short packets. Rebooting and issuing the same
> > >commands without "taskset ff" restored the performance of my previous
> > >test.
> > >
> > >As described below, using "taskset ff" on the ring buffer setting
> > >commands on the E31320 box improved performance.
> > >
> > >I can try switching to the Sourceforge igb driver so I can hand-tune the
> > >queues and interrupt affinities, but would appreciate some guidance.
> > [...]
> >
> > Hello Ed,
> >
> > Sorry for the delay in responding. This is some good information and I'm
> > not sure, off the top of my head, what the issue is. I will need to
> > research a bit and get back to you.
> >
> > Thanks,
> >
> > Carolyn
> >
> > Carolyn Wyborny
> > Linux Development
> > LAN Access Division
> > Intel Corporation
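For what it's worth, if hand-tuning the queue and interrupt affinities
(as mentioned in the quoted message above) does become necessary, the
usual approach is to stop irqbalance and pin each queue's MSI-X vector
to a CPU through /proc/irq. A rough sketch - the IRQ numbers and CPU
assignments below are purely illustrative, check /proc/interrupts for
the real vector numbers on your box:

    # stop irqbalance so it doesn't undo the manual pinning
    service irqbalance stop

    # find the IRQ number for each queue vector
    grep eth2-TxRx /proc/interrupts

    # pin queue 0's vector (say IRQ 45) to CPU 0, queue 1's (IRQ 46)
    # to CPU 1, and so on; smp_affinity takes a hex CPU bitmask
    echo 1 > /proc/irq/45/smp_affinity
    echo 2 > /proc/irq/46/smp_affinity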