Hi Keith,

Thanks a lot for your response! Based on your information I have tested
different burst sizes in the Load Balancer application (and left the TX ring
size unchanged). The read/write burst sizes of the NIC and the software queues
can be configured as a command line option; the default value of every burst
size is 144. If I configure all read/write burst sizes as 32, every packet is
transmitted by the TX core and no packets are dropped. But is this a valid
solution? It seems to work, but it feels a little strange to have to decrease
the burst size from 144 to 32.
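
For reference, this is roughly how I invoke the application (the --bsz tuples
set the read/write burst sizes of the I/O RX lcores, the worker lcores and the
I/O TX lcores; the --rx/--tx/--w/--lpm values are just placeholders for my
setup, so please correct me if I have the option syntax wrong):

./load_balancer -c 0xf -n 4 -- \
        --rx "(0,0,1)" --tx "(0,3)" --w "2" \
        --lpm "1.0.0.0/24=>0;" \
        --bsz "(32, 32), (32, 32), (32, 32)"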

Another solution is to implement a while loop (like _send_burst_fast in
pktgen), so that every packet is eventually transmitted. This seems to work as
well, but the same question applies: is it a valid solution? What feels odd
about it is that essentially the same loop already exists in the ixgbe driver
code (ixgbe_rxtx.c):

uint16_t
ixgbe_xmit_pkts_simple(void *tx_queue, struct rte_mbuf **tx_pkts,
                       uint16_t nb_pkts)
{
        uint16_t nb_tx;

        /* Try to transmit at least chunks of TX_MAX_BURST pkts */
        if (likely(nb_pkts <= RTE_PMD_IXGBE_TX_MAX_BURST))
                return tx_xmit_pkts(tx_queue, tx_pkts, nb_pkts);

        /* transmit more than the max burst, in chunks of TX_MAX_BURST */
        nb_tx = 0;
        while (nb_pkts) {
                uint16_t ret, n;

                n = (uint16_t)RTE_MIN(nb_pkts, RTE_PMD_IXGBE_TX_MAX_BURST);
                ret = tx_xmit_pkts(tx_queue, &(tx_pkts[nb_tx]), n);
                nb_tx = (uint16_t)(nb_tx + ret);
                nb_pkts = (uint16_t)(nb_pkts - ret);
                if (ret < n)
                        break;
        }

        return nb_tx;
}

To be honest, I don't know whether this particular function is called when I
use rte_eth_tx_burst, but I expect something similar happens when
rte_eth_tx_burst resolves to another transmit function in the ixgbe driver.
This while loop in the driver does exactly the same thing as wrapping
rte_eth_tx_burst in a while loop in the application. Yet without such a loop
around rte_eth_tx_burst (and with a burst size of 144) many packets are
dropped, while with the loop everything seems to work...
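
To make it concrete, this is roughly the change I have in mind for
app_lcore_io_tx() in runtime.c, replacing the block that frees the unsent
mbufs (untested sketch; port, lp and n_mbufs come from the surrounding
function, and the remark about a retry limit is just my own idea):

uint16_t n_sent = 0;

/* Keep calling rte_eth_tx_burst() until the whole burst has been handed
 * to the NIC, like _send_burst_fast() in pktgen does. */
while (n_sent < n_mbufs) {
        uint16_t n_pkts = rte_eth_tx_burst(
                        port,
                        0,
                        &lp->tx.mbuf_out[port].array[n_sent],
                        (uint16_t)(n_mbufs - n_sent));

        n_sent = (uint16_t)(n_sent + n_pkts);

        /* A real version should probably give up after some number of
         * empty iterations and free the remainder, so that a dead link
         * does not stall the TX core forever. */
}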

I hope you can help me again with finding the best way to solve this problem!

Peter


From: Wiles, Keith <[email protected]>
Sent: Saturday, January 28, 2017 23:43
To: Peter Keereweer
CC: [email protected]
Subject: Re: [dpdk-users] What to do after rte_eth_tx_burst: free or send 
again remaining packets?
    

> On Jan 28, 2017, at 1:57 PM, Peter Keereweer <[email protected]> 
> wrote:
> 
> Hi!
> 
> Currently I'm running some tests with the Load Balancer Sample Application.
> I'm testing the Load Balancer Sample Application by sending packets with
> pktgen.
> I have a setup of 2 servers, each containing an Intel 10GbE 82599 NIC
> (connected to each other). I have configured the Load Balancer application
> to use 1 core for RX, 1 worker core and 1 TX core. The TX core sends all
> packets back to the pktgen application.
> 
> With pktgen I send 1024 UDP packets to the Load Balancer. Every packet
> processed by the worker core is printed to the screen (I added this code
> myself). If I send 1024 UDP packets, 1008 (= 7 x 144) packets are printed
> to the screen. This is correct, because the RX core reads packets with a
> burst size of 144. So if I send 1024 packets, I expect 1008 packets back in
> the pktgen application. But surprisingly I only receive 224 packets instead
> of 1008 packets. After some research I found that 224 is not just a random
> number, it is 7 x 32 (= 224). So if the RX core reads 7 x 144 packets, I
> get back 7 x 32 packets. After digging into the code of the Load Balancer
> application I found this code in 'runtime.c' in the 'app_lcore_io_tx'
> function:
> 
> n_pkts = rte_eth_tx_burst(
>                                 port,
>                                 0,
>                                 lp->tx.mbuf_out[port].array,
>                                 (uint16_t) n_mbufs);
> 
> ...
> 
> if (unlikely(n_pkts < n_mbufs)) {
>         uint32_t k;
>         for (k = n_pkts; k < n_mbufs; k ++) {
>                 struct rte_mbuf *pkt_to_free = lp->tx.mbuf_out[port].array[k];
>                 rte_pktmbuf_free(pkt_to_free);
>         }
> }
> 
> What I understand from this code is that n_mbufs packets are sent with the
> 'rte_eth_tx_burst' function. This function returns n_pkts, the number of
> packets that were actually sent. If the actual number of packets sent is
> smaller than n_mbufs (the number of packets handed to rte_eth_tx_burst),
> then all remaining, unsent packets are freed. In the Load Balancer
> application, n_mbufs is equal to 144. But in my case 'rte_eth_tx_burst'
> returns the value 32, not 144. So 32 packets are actually sent and the
> remaining packets (144 - 32 = 112) are freed. This is the reason why I get
> 224 (7 x 32) packets back instead of 1008 (= 7 x 144).
> 
> But the question is: why are the remaining packets freed instead of being
> retried? If I look into 'pktgen.c', there is a function '_send_burst_fast'
> where all remaining packets are sent again (in a while loop until they have
> all been sent) instead of being freed (see the code below):
> 
> static __inline__ void
> _send_burst_fast(port_info_t *info, uint16_t qid)
> {
>         struct mbuf_table   *mtab = &info->q[qid].tx_mbufs;
>         struct rte_mbuf **pkts;
>         uint32_t ret, cnt;
> 
>         cnt = mtab->len;
>         mtab->len = 0;
> 
>         pkts    = mtab->m_table;
> 
>         if (rte_atomic32_read(&info->port_flags) & PROCESS_TX_TAP_PKTS) {
>                 while (cnt > 0) {
>                         ret = rte_eth_tx_burst(info->pid, qid, pkts, cnt);
> 
>                         pktgen_do_tx_tap(info, pkts, ret);
> 
>                         pkts += ret;
>                         cnt -= ret;
>                 }
>         } else {
>                 while(cnt > 0) {
>                         ret = rte_eth_tx_burst(info->pid, qid, pkts, cnt);
> 
>                         pkts += ret;
>                         cnt -= ret;
>                 }
>         }
> } 
> 
> Why is this while loop (sending packets until they have all been sent) not
> implemented in the 'app_lcore_io_tx' function in the Load Balancer
> application? That would make sense, right? It looks like the Load Balancer
> application assumes that if not all packets have been sent, the remaining
> packets failed during the sending process and should be freed.

The TX ring on the hardware is limited in size, but you can adjust that size.
In pktgen I attempt to send all packets requested to be sent, but in the load
balancer the developer decided to just drop the packets that are not sent when
the TX hardware ring (or even a SW ring) is full. This normally means the core
is handing off packets faster than the HW ring on the NIC can send them.

It was just a choice of the developer to drop the packets instead of trying
again until the packet array is empty. One possible way to fix this is to make
the TX ring 2-4 times larger than the RX ring. This still does not truly solve
the problem, it just moves it to the RX ring: if the NIC does not have a valid
RX descriptor and a place to DMA the packet into memory, the packet gets
dropped at the wire. BTW, increasing the TX ring size also means these packets
are not returned to the free pool right away and you can exhaust the packet
pool. The packets are stuck on the TX ring as done because the threshold to
reclaim the done packets is too high.
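
Something along these lines when setting up the TX queue, just as a sketch
(the 4096 descriptor count and the tx_free_thresh value are only illustrative,
and each PMD has its own constraints on these thresholds):

#include <rte_ethdev.h>

static int
setup_big_tx_queue(uint16_t port_id)
{
        struct rte_eth_dev_info dev_info;
        struct rte_eth_txconf txconf;

        rte_eth_dev_info_get(port_id, &dev_info);
        txconf = dev_info.default_txconf;
        txconf.tx_free_thresh = 64;     /* reclaim done mbufs sooner */

        /* e.g. a 4096-entry TX ring against a 1024-entry RX ring */
        return rte_eth_tx_queue_setup(port_id, 0, 4096,
                        rte_eth_dev_socket_id(port_id), &txconf);
}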

Say you have a 1024-entry ring and the high watermark for flushing done
packets off the ring is 900. If the packet pool only holds 512 packets, then
after you send 512 packets they will all be sitting on the TX done ring and
you are in a deadlock, unable to send another packet because every mbuf is on
the TX done ring. This normally does not happen because the ring sizes are
usually much smaller than the number of TX packets or even RX packets.
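
In other words, size the mbuf pool well above everything that can be sitting
in descriptors at one time. A back-of-the-envelope sketch, with made-up
numbers:

#include <rte_mbuf.h>
#include <rte_lcore.h>

#define NB_RXD     1024         /* RX ring size (placeholder)  */
#define NB_TXD     4096         /* enlarged TX ring size       */
#define BURST_SZ     32         /* per-call TX/RX burst        */
#define POOL_CACHE  256         /* per-lcore mempool cache     */

static struct rte_mempool *
create_pkt_pool(void)
{
        /* Cover RX + TX descriptors plus in-flight bursts and the
         * mempool cache, with some headroom. */
        unsigned int nb_mbufs = NB_RXD + NB_TXD + 2 * BURST_SZ + POOL_CACHE;

        return rte_pktmbuf_pool_create("pkt_pool", nb_mbufs, POOL_CACHE,
                        0, RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());
}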

In pktgen I attempt to send all of the packets requested, as it does not make
sense for the user to ask for 10000 packets and have pktgen send fewer just
because the core sending the packets overruns the TX queue at some point.

I hope that helps.

> 
> I hope someone can help me with these questions. Thank you in advance!
> 
> Peter

Regards,
Keith