I am writing a bridge using DPDK that reads traffic from one port and
transmits it on the other. Here is the core of the program, based on basicfwd.c:
    while (!force_quit) {
        nb_rx = rte_eth_rx_burst(rx_port, rx_queue, bufs, BURST_SIZE);
        for (i = 0; i < nb_rx; i++) {
            /* inspect packet */
        }
        nb_tx = rte_eth_tx_burst(tx_port, tx_queue, bufs, nb_rx);
        /* free any packets the TX queue did not accept */
        for (i = nb_tx; i < nb_rx; i++) {
            rte_pktmbuf_free(bufs[i]);
        }
    }
(A bunch of error checking and such left out for brevity.)
This worked great: I got bandwidth equivalent to a Linux bridge.
I then tried using tx buffers instead. (Initialization code left out for
brevity.) Here is the new loop.
    while (!force_quit) {
        nb_rx = rte_eth_rx_burst(rx_port, rx_queue, bufs, BURST_SIZE);
        for (i = 0; i < nb_rx; i++) {
            /* inspect packet */
            rte_eth_tx_buffer(tx_port, tx_queue, tx_buffer, bufs[i]);
        }
        rte_eth_tx_buffer_flush(tx_port, tx_queue, tx_buffer);
    }
(Once again, error checking left out for brevity.)
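
To frame the question, here is a toy model of what the buffered calls do, with
no DPDK dependency. It is only a sketch: every toy_* name is hypothetical, the
"NIC" is a pair of counters that accepts every packet, and the buffer size is
a parameter because the real initialization is not shown above. The one
behavior it copies from DPDK is that adding a packet flushes the buffer once
the configured size is reached, and that an explicit flush of an empty buffer
does nothing.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_PKTS 64

/* Counters standing in for the NIC (hypothetical, not a DPDK type). */
struct toy_stats {
    unsigned burst_calls;   /* how many times the "tx burst" ran */
    unsigned pkts_sent;     /* total packets handed to it */
};

/* Hypothetical stand-in for struct rte_eth_dev_tx_buffer. */
struct toy_tx_buffer {
    uint16_t size;          /* flush threshold chosen at init time */
    uint16_t length;        /* packets currently held */
    int pkts[TOY_MAX_PKTS]; /* packet ids instead of mbuf pointers */
};

static void toy_tx_buffer_init(struct toy_tx_buffer *b, uint16_t size)
{
    assert(size > 0 && size <= TOY_MAX_PKTS);
    b->size = size;
    b->length = 0;
}

/* Stand-in for rte_eth_tx_burst(); assumes every packet is accepted. */
static uint16_t toy_tx_burst(struct toy_stats *s, const int *pkts, uint16_t n)
{
    (void)pkts;
    s->burst_calls++;
    s->pkts_sent += n;
    return n;
}

/* Send whatever is buffered; a no-op when the buffer is empty. */
static uint16_t toy_tx_buffer_flush(struct toy_stats *s, struct toy_tx_buffer *b)
{
    uint16_t n = b->length;

    if (n == 0)
        return 0;
    b->length = 0;
    return toy_tx_burst(s, b->pkts, n);
}

/* Queue one packet; flush automatically once the buffer is full. */
static uint16_t toy_tx_buffer_add(struct toy_stats *s, struct toy_tx_buffer *b,
                                  int pkt)
{
    assert(b->length < b->size);
    b->pkts[b->length++] = pkt;
    if (b->length < b->size)
        return 0;
    return toy_tx_buffer_flush(s, b);
}

/* One pass of the buffered loop for a received burst of nb_rx packets. */
static void toy_loop_iteration(struct toy_stats *s, struct toy_tx_buffer *b,
                               uint16_t nb_rx)
{
    for (uint16_t i = 0; i < nb_rx; i++)
        toy_tx_buffer_add(s, b, i);
    toy_tx_buffer_flush(s, b);
}
```

In this model the buffer size decides how a received burst is split into
transmit calls: a 32-packet burst through a 32-entry buffer is one call, while
10 packets through a 4-entry buffer become three calls (4, 4, then 2 on the
explicit flush). Whether that has anything to do with the slowdown I am seeing
is exactly what I do not know.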
I am running this on 8 cores. Each core runs its own loop and has its own
tx_buffer.
If traffic is well balanced across the cores, my performance goes down by
about 5%. If traffic is unbalanced, such as all traffic coming from a single
flow, my performance goes down 80%, from about 10 Gb/s to 2 Gb/s.
I want to stress that the ONLY thing that changed in this code is how I
transmit packets. Everything else is the same.
Any idea why this would cause such a degradation in bit rate?
-Bev