Andrew, just to be precise (I don't want to tease you, of course): on an X3440 we can send 14.88 Mpps (~26 Mpps on two ports), so we're quite close now. As for the 710 problem I have reported, I will ask the 710 user who reported the issue.
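
For reference, 14.88 Mpps is the theoretical 10 GbE line rate for minimum-size frames. A quick sketch of the arithmetic (assuming the standard 64-byte minimum frame plus the 8-byte preamble/SFD and the 12-byte inter-frame gap):

#include <stdio.h>

int main(void)
{
    const double link_bps   = 10e9;         /* 10 Gbit/s */
    const double wire_bytes = 64 + 8 + 12;  /* min frame + preamble/SFD + inter-frame gap */

    /* 10e9 / (84 * 8) = ~14.88 Mpps */
    printf("theoretical max: %.2f Mpps\n", link_bps / (wire_bytes * 8.0) / 1e6);
    return 0;
}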
Now the question is: where are all these issues coming from? Why does an 810 (more powerful than a 710) report much poorer performance? Do you have a chance to read the BIOS revision of your 710, so I can compare it with the one of the other user who has issues?

This said: great news.

Cheers
Luca

On Sep 15, 2011, at 4:45 PM, <andrew_leh...@agilent.com> wrote:

> Hi Donald and Luca,
>
> I have managed to obtain the loan of an R710 and, using the Silicom card and
> Luca's code, I can send in excess of 14 million packets per second, so
> whatever the problem with the R710 Luca has reported, it is not the same as
> my issues with the R810! Of course, unless my R810 has suffered the same
> fault as the R710 listed below and both are now broken in the same way.
> Does a reboot clear your other user's problem, Luca, or is it permanent?
>
> Luca, here are the details...
>
> ./pfsend -i dna:eth4 -g 1 -l 60 -n 0 -r 10
>
> TX rate: [current 14'238'148.23 pps/9.57 Gbps][average 14'223'555.75 pps/9.56 Gbps][total 2'147'799'248.00 pkts]
> TX rate: [current 14'240'502.43 pps/9.57 Gbps][average 14'223'667.24 pps/9.56 Gbps][total 2'162'040'021.00 pkts]
> TX rate: [current 14'239'155.21 pps/9.57 Gbps][average 14'223'768.47 pps/9.56 Gbps][total 2'176'279'461.00 pkts]
> TX rate: [current 14'238'531.22 pps/9.57 Gbps][average 14'223'864.33 pps/9.56 Gbps][total 2'190'518'277.00 pkts]
>
> Thanks
>
> Andrew
>
> -----Original Message-----
> From: Luca Deri [mailto:d...@ntop.org]
> Sent: Thursday, September 15, 2011 3:05 PM
> To: Skidmore, Donald C
> Cc: LEHANE,ANDREW (A-Scotland,ex1); e1000-devel@lists.sourceforge.net
> Subject: Re: Problems with Dell R810.
>
> Donald,
> another PF_RING user has reported the following problem to me (Dell 710 and
> Intel 82576):
>
> Wed Sep 14 2011 06:00:11  An OEM diagnostic event has occurred.
> Critical  0.000009  Wed Sep 14 2011 06:00:11  A bus fatal error was detected on a component at bus 0 device 6 function 0.
> Critical  0.000008  Wed Sep 14 2011 06:00:11  A bus fatal error was detected on a component at slot 1.
> Normal    0.000007  Wed Sep 14 2011 06:00:11  An OEM diagnostic event has occurred.
> Critical  0.000006  Wed Sep 14 2011 06:00:11  A bus fatal error was detected on a component at bus 0 device 5 function 0.
> Critical  0.000005  Wed Sep 14 2011 06:00:10  A bus fatal error was detected on a component at slot 2.
> Normal    0.000004  Wed Sep 14 2011 06:00:08  An OEM diagnostic event has occurred.
> Critical  0.000003  Wed Sep 14 2011 06:00:08  A bus fatal error was detected on a component at bus 0 device 6 function 0.
> Critical  0.000002  Wed Sep 14 2011 06:00:08  A bus fatal error was detected on a component at slot 1.
> Normal    0.000001  Wed Sep 14 2011 06:00:08  An OEM diagnostic event has occurred.
>
> Additionally, we captured the following logs as well:
>
> alloc kstat_irqs on node -1
> pcieport 0000:00:09.0: irq 62 for MSI/MSI-X
> pcieport 0000:00:09.0: setting latency timer to 64
> aer 0000:00:01.0:pcie02: PCIe errors handled by platform firmware.
> aer 0000:00:03.0:pcie02: PCIe errors handled by platform firmware.
> aer 0000:00:04.0:pcie02: PCIe errors handled by platform firmware.
> aer 0000:00:05.0:pcie02: PCIe errors handled by platform firmware.
> aer 0000:00:06.0:pcie02: PCIe errors handled by platform firmware.
> aer 0000:00:07.0:pcie02: PCIe errors handled by platform firmware.
> aer 0000:00:09.0:pcie02: PCIe errors handled by platform firmware.
>
> I believe there's a BIOS issue on these Dells. What do you think?
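
On the BIOS revision question: something like the sketch below should be enough to read it from a running system (this assumes the kernel exposes the DMI data under /sys/class/dmi/id; "dmidecode -s bios-version" reports the same string):

#include <stdio.h>

int main(void)
{
    /* dmi-id sysfs attribute with the firmware-reported BIOS version */
    FILE *f = fopen("/sys/class/dmi/id/bios_version", "r");
    char buf[128];

    if (!f) {
        perror("fopen /sys/class/dmi/id/bios_version");
        return 1;
    }
    if (fgets(buf, sizeof(buf), f))
        printf("BIOS version: %s", buf);
    fclose(f);
    return 0;
}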
>
> Regards Luca
>
>
> On Sep 4, 2011, at 1:25 PM, Luca Deri wrote:
>
>> Donald,
>> thanks for the reply. I don't think this is a PF_RING issue (even using the
>> vanilla ixgbe driver we observe the same behavior) but rather a Dell/Intel
>> issue. From what I see in dmesg, it seems that DCA is disabled and we have
>> no way to enable it. I'm not sure if this is due to BIOS limitations. What I
>> can tell you is that a low-end Core 2 Duo is much faster than this
>> multiprocessor machine, and this is an indication that there's something
>> wrong with this setup.
>>
>> Regards Luca
>>
>> On Sep 3, 2011, at 2:33 AM, Skidmore, Donald C wrote:
>>
>>>> -----Original Message-----
>>>> From: andrew_leh...@agilent.com [mailto:andrew_leh...@agilent.com]
>>>> Sent: Thursday, September 01, 2011 2:17 AM
>>>> To: e1000-devel@lists.sourceforge.net
>>>> Cc: d...@ntop.org
>>>> Subject: [E1000-devel] Problems with Dell R810.
>>>>
>>>> Hi,
>>>>
>>>> I recently purchased a Dell R810 for use with Luca Deri's PF_RING
>>>> networking driver for the 10 Gigabit PCI Express network driver and
>>>> the Silicom 10 Gig card that uses the 82599EB chipset; the machine
>>>> is running Fedora Core 14.
>>>>
>>>> Luca's driver is described here:
>>>> http://www.ntop.org/blog/pf_ring/introducing-the-10-gbit-pf_ring-dna-driver/
>>>>
>>>> Only the machine doesn't seem to want to play ball. We have tried a
>>>> number of things, and so eventually Luca suggested this mailing list;
>>>> I do hope someone can help.
>>>>
>>>> The machine spec is as follows:
>>>>
>>>> 2x Intel Xeon L7555 Processor (1.86GHz, 8C, 24M Cache, 5.86 GT/s QPI, 95W TDP, Turbo, HT), DDR3-980MHz
>>>> 128GB Memory for 2/4CPU (16x8GB Quad Rank LV RDIMMs) 1066MHz
>>>> Additional 2x Intel Xeon L7555 Processor (1.86GHz, 8C, 24M Cache, 5.86 GT/s QPI, 95W TDP, Turbo, HT), Upgrade to 4CPU
>>>> 2x 600GB SAS 6Gbps 10k 2.5" HD
>>>> Silicom 82599EB 10 Gigabit Ethernet NIC.
>>>>
>>>> According to Luca's experiments on his test machine (not an R810;
>>>> actually quite a low-spec machine by comparison), we should be
>>>> getting the following results. Unfortunately, the R810's performance
>>>> is very poor; it struggles at less than 8% of the capacity of a
>>>> 10 Gig link on one core, whereas Luca's test application (byte and
>>>> packet counts only) on his machine can process 100% of a 10 Gig link
>>>> on one core.
>>>>
>>>> http://www.ntop.org/blog/pf_ring/how-to-sendreceive-26mpps-using-pf_ring-on-commodity-hardware/
>>>>
>>>> Importantly, Luca also seems to be getting excellent CPU usage
>>>> figures (see the bottom of the page), indicating that both DCA and
>>>> IOATDMA are operating correctly. My problem is that even on light
>>>> network loads my CPU hits 100% and packets are dropped, indicating,
>>>> to me, that DCA/IOATDMA isn't working.
>>>>
>>>> I have switched on IOATDMA in the Dell's BIOS (it's off by default)
>>>> and discovered the following site, which talks about configuring a
>>>> machine to use DCA and IOATDMA etc. I even found a chap who reported
>>>> similar performance problems, but with a Dell R710, and how he fixed
>>>> them. I tried all this but still no improvement!
>>>>
>>>> http://www.mail-archive.com/ntop-misc@listgateway.unipi.it/msg01185.html
>>>>
>>>> The R810 seems to use a 7500 chipset.
>>>>
>>>> http://www.dell.com/downloads/global/products/pedge/pedge_r810_specsheet_en.pdf
>>>>
>>>> So, I think this is the R810 chipset reference (see page 453):
>>>> http://www-techdoc.intel.com/content/dam/doc/datasheet/7500-chipset-datasheet.pdf
>>>>
>>>> The program sets the bit (0x8C @ bit 0), but it doesn't seem to stay
>>>> set, so consecutive calls to "dca_probe" seem to always say "DCA
>>>> disabled, enabling now."
>>>>
>>>> I commented out some of the defines in the original code, as they are
>>>> already set in the Linux kernel, and, of course, changed the
>>>> registers to point to the ones on page 453 - I hope they are correct.
>>>>
>>>> Still no luck; the CPU usage is way too high.
>>>>
>>>> #define _XOPEN_SOURCE 500
>>>>
>>>> #include <stdio.h>
>>>> #include <stdlib.h>
>>>> #include <pci/pci.h>
>>>> #include <sys/io.h>
>>>> #include <fcntl.h>
>>>> #include <sys/stat.h>
>>>> #include <sys/types.h>
>>>> #include <unistd.h>
>>>>
>>>> #define INTEL_BRIDGE_DCAEN_OFFSET 0x8c
>>>> #define INTEL_BRIDGE_DCAEN_BIT 0
>>>> /*#define PCI_HEADER_TYPE_BRIDGE 1 */
>>>> /*#define PCI_VENDOR_ID_INTEL 0x8086 *//* lol @ intel */
>>>> /*#define PCI_HEADER_TYPE 0x0e */
>>>> #define MSR_P6_DCA_CAP 0x000001f8
>>>> #define NUM_CPUS 64
>>>>
>>>> void check_dca(struct pci_dev *dev)
>>>> {
>>>>     u32 dca = pci_read_long(dev, INTEL_BRIDGE_DCAEN_OFFSET);
>>>>     printf("DCA old value %d.\n", dca);
>>>>     if (!(dca & (1 << INTEL_BRIDGE_DCAEN_BIT))) {
>>>>         printf("DCA disabled, enabling now.\n");
>>>>         dca |= 1 << INTEL_BRIDGE_DCAEN_BIT;
>>>>         printf("DCA new value %d.\n", dca);
>>>>         pci_write_long(dev, INTEL_BRIDGE_DCAEN_OFFSET, dca);
>>>>     } else {
>>>>         printf("DCA already enabled!\n");
>>>>     }
>>>> }
>>>>
>>>> void msr_dca_enable(void)
>>>> {
>>>>     char msr_file_name[64];
>>>>     int fd = 0, i = 0;
>>>>     u64 data;
>>>>
>>>>     for (; i < NUM_CPUS; i++) {
>>>>         sprintf(msr_file_name, "/dev/cpu/%d/msr", i);
>>>>         fd = open(msr_file_name, O_RDWR);
>>>>         if (fd < 0) {
>>>>             perror("open failed!");
>>>>             exit(1);
>>>>         }
>>>>         if (pread(fd, &data, sizeof(data), MSR_P6_DCA_CAP) != sizeof(data)) {
>>>>             perror("reading msr failed!");
>>>>             exit(1);
>>>>         }
>>>>
>>>>         printf("got msr value: %*llx\n", 1, (unsigned long long)data);
>>>>         if (!(data & 1)) {
>>>>             data |= 1;
>>>>             if (pwrite(fd, &data, sizeof(data), MSR_P6_DCA_CAP) != sizeof(data)) {
>>>>                 perror("writing msr failed!");
>>>>                 exit(1);
>>>>             }
>>>>         } else {
>>>>             printf("msr already enabled for CPU %d\n", i);
>>>>         }
>>>>     }
>>>> }
>>>>
>>>> int main(void)
>>>> {
>>>>     struct pci_access *pacc;
>>>>     struct pci_dev *dev;
>>>>     u8 type;
>>>>
>>>>     pacc = pci_alloc();
>>>>     pci_init(pacc);
>>>>
>>>>     pci_scan_bus(pacc);
>>>>     for (dev = pacc->devices; dev; dev = dev->next) {
>>>>         pci_fill_info(dev, PCI_FILL_IDENT | PCI_FILL_BASES);
>>>>         if (dev->vendor_id == PCI_VENDOR_ID_INTEL) {
>>>>             type = pci_read_byte(dev, PCI_HEADER_TYPE);
>>>>             if (type == PCI_HEADER_TYPE_BRIDGE) {
>>>>                 check_dca(dev);
>>>>             }
>>>>         }
>>>>     }
>>>>
>>>>     msr_dca_enable();
>>>>     return 0;
>>>> }
>>>>
>>>> As you can see, the ixgbe, dca and ioatdma modules are loaded.
>>>>
>>>> # lsmod
>>>>
>>>> Module                  Size  Used by
>>>> ixgbe                 200547  0
>>>> pf_ring               327754  4
>>>> tcp_lp                  2111  0
>>>> fuse                   61934  3
>>>> sunrpc                201569  1
>>>> ip6t_REJECT             4263  2
>>>> nf_conntrack_ipv6      18078  4
>>>> ip6table_filter         1687  1
>>>> ip6_tables             17497  1 ip6table_filter
>>>> ipv6                  286505  184 ip6t_REJECT,nf_conntrack_ipv6
>>>> uinput                  7368  0
>>>> ioatdma                51376  72
>>>> i7core_edac            16210  0
>>>> dca                     5590  2 ixgbe,ioatdma
>>>> bnx2                   65569  0
>>>> mdio                    3934  0
>>>> ses                     6319  0
>>>> dcdbas                  8540  0
>>>> edac_core              41336  1 i7core_edac
>>>> iTCO_wdt               11256  0
>>>> iTCO_vendor_support     2610  1 iTCO_wdt
>>>> power_meter             9545  0
>>>> hed                     2206  0
>>>> serio_raw               4640  0
>>>> microcode              18662  0
>>>> enclosure               7518  1 ses
>>>> megaraid_sas           37653  2
>>>>
>>>> # uname -a
>>>> Linux test 2.6.35.14-95.fc14.x86_64 #1 SMP Tue Aug 16 21:01:58 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
>>>>
>>>> Thanks,
>>>>
>>>> Andrew
>>>
>>> Hey Andrew,
>>>
>>> Sorry you're having issues with the 82599 and ixgbe. I haven't done much
>>> with the PF_RING networking driver, but maybe we can see what is going on
>>> with the ixgbe driver. It would help to have a little bit more information,
>>> like:
>>>
>>> - Were there any interesting system log messages of note?
>>>
>>> - How are your interrupts being divided among your queues (cat
>>> /proc/interrupts)? I know you're testing with just one CPU; are you also
>>> using just one queue, or affinitizing one to that CPU?
>>>
>>> - Could you provide the lspci -vvv output, to verify your NIC is getting a
>>> PCIe x8 connection?
>>>
>>> - What kind of CPU usage are you seeing if you use just the base driver
>>> running at line rate with something like netperf/iperf?
>>>
>>> - Have you attempted this without DCA? Like I said above, I don't have much
>>> experience with PF_RING, so I may be missing some fundamental advantage it
>>> is supposed to gain from operating with DCA in this mode.
>>>
>>> These are just off the top of my head; if I think of anything else I'll let
>>> you know.
>>>
>>> Thanks,
>>> -Don Skidmore <donald.c.skidm...@intel.com>
>>
>> ---
>>
>> "Debugging is twice as hard as writing the code in the first place.
>> Therefore, if you write the code as cleverly as possible, you are, by
>> definition, not smart enough to debug it." - Brian W. Kernighan
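
Regarding Don's /proc/interrupts question above, a minimal sketch for checking how the card's MSI-X vectors are spread across CPUs (the interface name eth4 is assumed from the pfsend example; ixgbe typically names its vectors eth4-TxRx-N):

#include <stdio.h>
#include <string.h>

int main(void)
{
    FILE *f = fopen("/proc/interrupts", "r");
    char line[4096];

    if (!f) {
        perror("fopen /proc/interrupts");
        return 1;
    }
    while (fgets(line, sizeof(line), f)) {
        /* print only the lines for the NIC's queues; the per-CPU columns
           show how the interrupts are being distributed */
        if (strstr(line, "eth4"))
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}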