> On Jan 28, 2017, at 1:57 PM, Peter Keereweer <[email protected]> 
> wrote:
> 
> Hi!
> 
> Currently I'm running some tests with the Load Balancer Sample Application. 
> I'm testing it by sending packets with pktgen.
> I have a setup of 2 servers, each containing an Intel 10GbE 82599 NIC 
> (connected to each other). I have configured the Load Balancer application 
> to use 1 RX core, 1 worker core and 1 TX core. The TX core sends all 
> packets back to the pktgen application.
> 
> With pktgen I send 1024 UDP packets to the Load Balancer. Every packet 
> processed by the worker core is printed to the screen (I added this code 
> myself). If I send 1024 UDP packets, 1008 (= 7 x 144) packets are printed 
> to the screen. This is correct, because the RX core reads packets with a 
> burst size of 144. So if I send 1024 packets, I expect 1008 packets back in 
> the pktgen application. But surprisingly I only receive 224 packets instead 
> of 1008. After some research I found that 224 is not just a random number: 
> it is 7 x 32. So if the RX core reads 7 x 144 packets, I get back 7 x 32 
> packets. After digging into the code of the Load Balancer application I 
> found this code in the 'app_lcore_io_tx' function in 'runtime.c':
> 
> n_pkts = rte_eth_tx_burst(
>                 port,
>                 0,
>                 lp->tx.mbuf_out[port].array,
>                 (uint16_t) n_mbufs);
> 
> ...
> 
> if (unlikely(n_pkts < n_mbufs)) {
>         uint32_t k;
>         for (k = n_pkts; k < n_mbufs; k++) {
>                 struct rte_mbuf *pkt_to_free = lp->tx.mbuf_out[port].array[k];
>                 rte_pktmbuf_free(pkt_to_free);
>         }
> }
> 
> What I understand from this code is that n_mbufs packets are handed to the 
> 'rte_eth_tx_burst' function, which returns n_pkts, the number of packets 
> that were actually sent. If the number of packets actually sent is smaller 
> than n_mbufs (the packets given to rte_eth_tx_burst to send), then all 
> remaining packets, which were not sent, are freed. In the Load Balancer 
> application, n_mbufs is equal to 144. But in my case 'rte_eth_tx_burst' 
> returns the value 32, not 144. So 32 packets are actually sent and the 
> remaining packets (144 - 32 = 112) are freed. This is the reason why I get 
> 224 (7 x 32) packets back instead of 1008 (= 7 x 144).
> 
> But the question is: why are the remaining packets freed instead of being 
> retried? If I look into 'pktgen.c', there is a function '_send_burst_fast' 
> where all remaining packets are retried (in a while loop until they have 
> all been sent) instead of being freed (see the code below):
> 
> static __inline__ void
> _send_burst_fast(port_info_t *info, uint16_t qid)
> {
>         struct mbuf_table   *mtab = &info->q[qid].tx_mbufs;
>         struct rte_mbuf **pkts;
>         uint32_t ret, cnt;
> 
>         cnt = mtab->len;
>         mtab->len = 0;
> 
>         pkts    = mtab->m_table;
> 
>         if (rte_atomic32_read(&info->port_flags) & PROCESS_TX_TAP_PKTS) {
>                 while (cnt > 0) {
>                         ret = rte_eth_tx_burst(info->pid, qid, pkts, cnt);
> 
>                         pktgen_do_tx_tap(info, pkts, ret);
> 
>                         pkts += ret;
>                         cnt -= ret;
>                 }
>         } else {
>                 while(cnt > 0) {
>                         ret = rte_eth_tx_burst(info->pid, qid, pkts, cnt);
> 
>                         pkts += ret;
>                         cnt -= ret;
>                 }
>         }
> } 
> 
> Why is this while loop (sending packets until they have all been sent) not 
> implemented in the 'app_lcore_io_tx' function in the Load Balancer 
> application? That would make sense, right? It looks like the Load Balancer 
> application assumes that if not all packets have been sent, the remaining 
> packets failed during the sending process and should be freed.

The TX ring on the hardware is limited in size, but you can adjust that size. 
In pktgen I attempt to send all packets requested to be sent, but in the load 
balancer the developer decided to just drop the packets that are not sent when 
the TX hardware ring (or even a SW ring) is full. This normally means the 
core is sending packets faster than the HW ring on the NIC can transmit them.
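
If you do want the retry behavior, a minimal sketch of what it could look like 
inside app_lcore_io_tx is below, reusing the names from the snippet you quoted 
(this is an illustration of the pktgen-style loop, not code from the example 
itself):

        /* Sketch only: keep calling rte_eth_tx_burst() until the whole
         * burst has been accepted, instead of freeing the leftovers.
         * 'port', 'lp' and 'n_mbufs' are the variables from the quoted
         * app_lcore_io_tx code. */
        uint16_t sent = 0;

        while (sent < n_mbufs) {
                uint16_t n = rte_eth_tx_burst(
                                port,
                                0,
                                lp->tx.mbuf_out[port].array + sent,
                                (uint16_t)(n_mbufs - sent));
                sent += n;
        }

Keep in mind that this loop spins until the NIC accepts everything, so if the 
TX queue never drains, the core sits here instead of servicing its rings.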

It was just a choice of the developer to drop the packets instead of trying 
again until the packet array is empty. One possible way to fix this is to make 
the TX ring 2-4 times larger than the RX ring. This still does not truly solve 
the problem; it just moves it to the RX ring. If the NIC does not have a valid 
RX descriptor and a place to DMA the packet into memory, the packet gets 
dropped at the wire. BTW, increasing the TX ring size also means these packets 
will not be returned to the free pool as quickly, and you can exhaust the 
packet pool. The packets are stuck on the TX ring as done because the 
threshold to reclaim the done packets is too high.
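
For reference, both knobs live in the queue setup calls. A rough sketch (the 
descriptor counts and threshold are made-up illustrative values, and 'port' 
and 'mbuf_pool' are assumed to be set up elsewhere):

        #include <rte_ethdev.h>
        #include <rte_mempool.h>

        /* Sketch: one RX and one TX queue, with the TX ring 4x the RX ring
         * and a low tx_free_thresh so done mbufs are reclaimed back to the
         * pool sooner. */
        static int
        setup_queues(uint16_t port, struct rte_mempool *mbuf_pool)
        {
                struct rte_eth_dev_info dev_info;
                struct rte_eth_txconf txconf;
                const uint16_t nb_rxd = 512;        /* illustrative RX ring size */
                const uint16_t nb_txd = 4 * nb_rxd; /* TX ring 4x the RX ring */

                rte_eth_dev_info_get(port, &dev_info);
                txconf = dev_info.default_txconf;
                txconf.tx_free_thresh = 32;         /* reclaim done mbufs sooner */

                if (rte_eth_rx_queue_setup(port, 0, nb_rxd,
                                rte_eth_dev_socket_id(port), NULL, mbuf_pool) < 0)
                        return -1;
                return rte_eth_tx_queue_setup(port, 0, nb_txd,
                                rte_eth_dev_socket_id(port), &txconf);
        }

If I recall correctly, the load balancer example also lets you set the ring 
sizes from its command line, so you may not need to touch the code for that 
part.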

Say you have a 1024-entry TX ring and the high watermark for flushing done 
packets off the ring is 900. If the packet pool is only 512 packets, then once 
you have sent 512 packets they will all be sitting on the TX done queue, and 
now you are in a deadlock, unable to send another packet because all of the 
mbufs are on the TX done ring. This normally does not happen, as the ring 
sizes are normally much smaller than the number of TX or even RX packets in 
the pool.
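
A quick way to sanity check this is to size the mbuf pool against everything 
that can hold an mbuf at the same time. The figures below are only 
illustrative (the 144 is the RX burst you mentioned), not values taken from 
the example:

        /* Back-of-the-envelope pool sizing sketch (illustrative numbers):
         * the pool should exceed the sum of every place an mbuf can be
         * parked, otherwise unreclaimed TX-done mbufs can starve RX. */
        #define NB_RXD      512     /* RX descriptors per port */
        #define NB_TXD      2048    /* TX descriptors per port */
        #define BURST_SIZE  144     /* packets in flight per burst */
        #define MBUF_CACHE  256     /* per-lcore mempool cache */

        #define NB_MBUF  (NB_RXD + NB_TXD + 2 * BURST_SIZE + MBUF_CACHE + 1024)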

In pktgen I attempt to send all of the packets requested, as it does not make 
any sense for the user to ask to send 10000 packets and have pktgen send only 
some smaller number just because the core sending the packets can overrun the 
TX queue at some point.

I hope that helps.

> 
> I hope someone can help me with these questions. Thank you in advance!
> 
> Peter

Regards,
Keith
