On 11/20/2012 05:05 PM, Ben Greear wrote:
> On 11/20/2012 04:34 PM, Ben Greear wrote:
>> On 11/20/2012 04:18 PM, Ben Greear wrote:
>
>>>> Also, have you checked to make sure the feature set is comparable?
>>>> For instance, the E5 can support VT-d. If that is enabled it can
>>>> have a negative impact on I/O performance due to extra locking
>>>> overhead for map/unmap calls on the host.
>>>
>>> I'll go poke around the BIOS and disable VT-d if I can find it.
>>
>> Wow, disabling VT-d gives a big improvement!
>>
>> It now runs around 9.3Gbps bi-directional. Still not as good as
>> our E3 or i7 systems, but it's at least closer.
>>
>> Here's the new perf top:
>>
>> Samples: 24K of event 'cycles', Event count (approx.): 15591201274
>>   10.61%  [ixgbe]         [k] ixgbe_poll
>>    6.76%  [pktgen]        [k] pktgen_thread_worker
>>    6.40%  [kernel]        [k] timekeeping_get_ns
>>    5.46%  [ixgbe]         [k] ixgbe_xmit_frame_ring
>>    4.20%  libc-2.15.so    [.] __memcpy_ssse3_back
>>    3.98%  [kernel]        [k] do_raw_spin_unlock
>>    3.02%  [kernel]        [k] skb_copy_bits
>>    2.99%  [kernel]        [k] build_skb
>>    2.61%  perf-2510.map   [.] 0x00007f73b2a28476
>>    2.56%  [kernel]        [k] __netif_receive_skb
>>
>> What CPU(s) do you suggest for high network bandwidth... hopefully
>> PCIe gen3 systems that can push beyond two 10G NICs at full speed?
>
> Well... I stepped away for a bit, and when I came back, it is now
> working very nicely. It can do full 10G tx+tx on two ports, even with
> pktgen using multi-skb at 0 (i.e., no skb cloning).
>
> And a bridging delay-emulator app of ours, which previously peaked at
> 6Gbps or so bi-directional on i7/E3 systems, runs 9Gbps+ even with a
> full 1 second of one-way delay (i.e., cold cache on the skbs).
>
> So, aside from the VT-d, I don't know why it was acting poorly
> earlier, but all seems good now. Maybe 'updatedb' or something
> similar was running and I didn't notice...
>
> I'll keep poking at this... and should have a 4-port 10G NIC with
> gen-3 PCIe coming soon to play with.
>
> Thanks,
> Ben
Based on the trace you provided earlier, it was all VT-d. When the
Intel IOMMU is enabled, every DMA map/unmap call has to walk a
spinlock-protected red-black tree to allocate or free an IOVA range.
The problem is that the lock doesn't scale with multiple queues: every
TX and RX queue on every core serializes on it, and I have seen it
cause significant contention on systems with many cores. (There's a
simplified sketch of the bottleneck below.)

Any of the newer Xeon E5 systems will have an advantage for network
workloads due to DDIO, which lets the NIC DMA packet data directly
into the last-level cache. Specifically, you should find that the I/O
scales quite well, since memory bandwidth stops being much of a
bottleneck, as long as the physical device, its interrupts, and the
traffic generator all stay on the same CPU socket (the second sketch
below shows one way to keep them together).
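To make the first point concrete, here's a rough userspace model of
the serialization. This is not the actual intel-iommu code; the bump
allocator standing in for the red-black tree, the names, and the
queue/iteration counts are all illustrative:

/* iova_contention.c -- build with: gcc -O2 -pthread iova_contention.c
 *
 * Simplified model of the IOMMU map path: every "queue" must take one
 * domain-wide spinlock to allocate an IOVA for each buffer it maps.
 */
#include <inttypes.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define NQUEUES 8
#define NMAPS   1000000

static pthread_spinlock_t iova_lock;   /* one lock per IOMMU domain  */
static uint64_t next_iova = 0x1000;    /* stand-in for rb-tree state */

/* Stand-in for alloc_iova(): grab the domain lock, carve out a range. */
static uint64_t alloc_iova(uint64_t size)
{
        pthread_spin_lock(&iova_lock); /* all queues contend here    */
        uint64_t iova = next_iova;
        next_iova += size;
        pthread_spin_unlock(&iova_lock);
        return iova;
}

/* One thread per NIC queue, one "DMA map" per packet buffer. */
static void *queue_worker(void *arg)
{
        (void)arg;
        for (int i = 0; i < NMAPS; i++)
                (void)alloc_iova(2048);
        return NULL;
}

int main(void)
{
        pthread_t queues[NQUEUES];

        pthread_spin_init(&iova_lock, PTHREAD_PROCESS_PRIVATE);
        for (int i = 0; i < NQUEUES; i++)
                pthread_create(&queues[i], NULL, queue_worker, NULL);
        for (int i = 0; i < NQUEUES; i++)
                pthread_join(queues[i], NULL);
        printf("done, next_iova = 0x%" PRIx64 "\n", next_iova);
        return 0;
}

In the real driver the critical section is a red-black tree search
and insert rather than a pointer bump, so the lock is held longer and
the contention is correspondingly worse.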
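And for the affinity point, here is a minimal sketch of one way to
keep the generator process on the NIC-local socket. It assumes
libnuma is installed; the sysfs numa_node attribute is standard, but
the program itself and the device name "eth2" are just illustrative:

/* pin_to_nic_node.c -- build with: gcc -O2 pin_to_nic_node.c -lnuma
 *
 * Read which NUMA node the NIC sits on from sysfs, then restrict this
 * process (the traffic generator here) to that node's CPUs and memory
 * so DDIO keeps packet buffers in the local LLC.
 */
#include <numa.h>
#include <stdio.h>

int main(int argc, char **argv)
{
        const char *dev = argc > 1 ? argv[1] : "eth2";
        char path[128];
        int node = -1;

        snprintf(path, sizeof(path),
                 "/sys/class/net/%s/device/numa_node", dev);
        FILE *f = fopen(path, "r");
        if (!f || fscanf(f, "%d", &node) != 1 || node < 0) {
                /* sysfs reports -1 when the node is unknown */
                fprintf(stderr, "no NUMA info for %s; not pinning\n", dev);
                if (f)
                        fclose(f);
                return 1;
        }
        fclose(f);

        if (numa_available() < 0) {
                fprintf(stderr, "libnuma: NUMA unavailable\n");
                return 1;
        }
        numa_run_on_node(node);   /* schedule only on that socket     */
        numa_set_preferred(node); /* and prefer its memory for allocs */

        printf("%s is on node %d; generator pinned there\n", dev, node);
        /* ... launch the traffic-generator threads from here ... */
        return 0;
}

The interrupt side is the same idea: point each queue vector's
/proc/irq/<n>/smp_affinity at CPUs on that node (the set_irq_affinity
script included with the ixgbe source package can do that for you).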
Thanks,

Alex