Hi,

I have a machine with 6 DPDK ports (4 igb, 2 ixgbe), with 1.23 Mpps of 
traffic offered to only one of the 10G ports (the other 5 are unused).  I 
also have a program with a pretty standard-looking DPDK receive loop, which 
calls rte_eth_rx_burst() for each configured port.  If I configure the loop 
to read from all 6 ports, it keeps up with the 1.23 Mpps rate with no drops.  
If I configure the loop to poll only 1 port (the ixgbe receiving the 
traffic), I lose about a third of the packets (i.e., the NIC drops ~400 Kpps).
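
For reference, the loop is essentially the following (a simplified sketch; 
NB_PORTS and handle_packets() stand in for our real code):

    #include <rte_ethdev.h>
    #include <rte_mbuf.h>

    /* Poll each configured port in turn, burst size 32. */
    struct rte_mbuf *bufs[32];

    for (;;) {
        for (uint8_t port = 0; port < NB_PORTS; port++) {
            uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs, 32);
            if (nb_rx > 0)
                handle_packets(port, bufs, nb_rx);  /* our processing */
        }
    }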

Another data point: if I configure the loop to read from 3 of the 6 ports, 
the drop rate is reduced to less than half (i.e., the NIC is only dropping 
~190 Kpps now).  So in this test, throughput improves by polling more ports, 
not fewer, which is counter-intuitive.  Again, I get no drops when polling 
all 6 ports.  Note that the burst size is 32.

I did find a reference to a similar issue in a recent paper 
(http://www.net.in.tum.de/fileadmin/bibtex/publications/papers/ICN2015.pdf), 
Section III, which states:

"The DPDK L2FWD application initially only managed to forward 13.8 Mpps in the 
single direction test at the maximum CPU frequency, a similar result can be 
found in [11]. Reducing the CPU frequency increased the throughput to the 
expected value of 14.88 Mpps. Our investigation of this anomaly revealed that 
the lack of any processing combined with the fast CPU caused DPDK to poll the 
NIC too often. DPDK does not use interrupts, it utilizes a busy wait loop that 
polls the NIC until at least one packet is returned. This resulted in a high 
poll rate which affected the throughput. We limited the poll rate to 500,000 
poll operations per second (i.e., a batch size of about 30 packets) and 
achieved line rate in the unidirectional test with all frequencies. This effect 
was only observed with the X520 NIC, tests with X540 NICs did not show this 
anomaly."
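
If I read that right, the workaround is to pace the poll loop.  Here's a 
rough, untested sketch of what I think they mean (the 500K target comes from 
the paper; the TSC-based pacing and names are mine), reusing port/bufs from 
the loop above:

    #include <rte_cycles.h>

    #define POLLS_PER_SEC 500000

    const uint64_t gap = rte_get_tsc_hz() / POLLS_PER_SEC;
    uint64_t next_poll = rte_get_tsc_cycles();

    for (;;) {
        if (rte_get_tsc_cycles() < next_poll)
            continue;                /* spin until the interval elapses */
        next_poll = rte_get_tsc_cycles() + gap;
        uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs, 32);
        /* ... process nb_rx packets ... */
    }

The idea being that each call then has time to accumulate a batch of ~30 
packets instead of returning a packet or two at a time.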

Another reference, from this mailing list last year 
(http://wiki.dpdk.org/ml/archives/dev/2014-January/001169.html):

"I suggest you to check average burst sizes on receive queues. Looks like I 
stumbled upon a similar issue several times. If you are calling 
rte_eth_rx_burst too frequently, NIC begins losing packets no matter how many 
CPU horse power you have (more you have, more it loses, actually). In my case 
this situation occurred when average burst size is less than 20 packets or so. 
I'm not sure what's the reason for this behavior, but I observed it on several 
applications on Intel 82599 10Gb cards."
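
Per that suggestion, I'll instrument the loop to report the average burst 
size, along these lines (the counter names are mine):

    #include <stdio.h>

    /* Inside the receive loop above: track packets per non-empty poll
     * and report the running average roughly once per second. */
    static uint64_t total_pkts, nonempty_polls, last_report;

    uint16_t nb_rx = rte_eth_rx_burst(port, 0, bufs, 32);
    if (nb_rx > 0) {
        total_pkts += nb_rx;
        nonempty_polls++;
    }
    if (rte_get_tsc_cycles() - last_report > rte_get_tsc_hz()) {
        printf("avg burst = %.1f pkts/poll\n", nonempty_polls ?
               (double)total_pkts / nonempty_polls : 0.0);
        last_report = rte_get_tsc_cycles();
    }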

So I'm wondering if anyone can explain at a lower level what happens when you 
poll "too often", and if there are any practical workarounds.  We're using this 
same program and DPDK version to process 10G line-rate in other scenarios, so 
I'm confident that the overall packet capture architecture is sound.

-Aaron
