On 6/11/26 09:56, [email protected] wrote:
From: Menglong Dong <[email protected]>

During packet receiving in virtio-net, the rq can be empty, which means
"rq->vq->num_free == virtqueue_get_vring_size(rq->vq)", in
virtnet_add_recvbuf_xsk(), if we are using xsk. Meanwhile, the fill ring
can be empty too, which means we can't allocate anything from
xsk_buff_alloc_batch(). Then, we will set the XDP_RING_NEED_WAKEUP flag.

However, if the user cleans all the data in the rx ring, fills the
"fill ring", and checks the XDP_RING_NEED_WAKEUP flag after
xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(), then the rx
napi will never be scheduled: the rx ring is empty, which means we will
never receive a packet to trigger a further recv fill. The rx ring is
empty now, so the user will not check the flag either.

Fix this by setting the XDP_RING_NEED_WAKEUP flag before
xsk_buff_alloc_batch() if both rq->vq and fill ring are empty.

Meanwhile, set the XDP_RING_NEED_WAKEUP flag if we have any free entry in
rq->vq.

Fixes: e3f8800aa243 ("virtio-net: xsk: Support wakeup on RX side")
Signed-off-by: Menglong Dong <[email protected]>
---
  drivers/net/virtio_net.c | 25 ++++++++++++++++++++++---
  1 file changed, 22 insertions(+), 3 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index f4adcfee7a80..4b5b3fa62008 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1323,16 +1323,27 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
                                   struct xsk_buff_pool *pool, gfp_t gfp)
  {
        struct xdp_buff **xsk_buffs;
+       bool need_wakeup;
        dma_addr_t addr;
        int err = 0;
        u32 len, i;
        int num;
+       need_wakeup = xsk_uses_need_wakeup(pool);
        xsk_buffs = rq->xsk_buffs;
+       /* If both rq->vq and fill ring are empty, and then the user submit
+        * all the chunks to the fill ring and check the wake up flag
+        * after xsk_buff_alloc_batch() and before xsk_set_rx_need_wakeup(),
+        * we will lose the chance to wake up the rx napi, so we have to
+        * set the need_wakeup flag here.
+        */
+       if (need_wakeup && virtqueue_get_vring_size(rq->vq) == rq->vq->num_free)
+               xsk_set_rx_need_wakeup(pool);

I think that when polling the receive queue, the userspace program needs to check the XDP_RING_NEED_WAKEUP flag whenever it does not see any packets. The flag check is quite lightweight in my opinion. Here are some examples I found:

- https://github.com/xdp-project/xdp-tools/blob/e9469501622aa22a7e452a671000bec8685edcde/lib/util/xdpsock.c#L1206
- https://github.com/xdp-project/bpf-examples/blob/43e565901c4287efa863edca7f0e6cd6e35ed896/AF_XDP-forwarding/xsk_fwd.c#L540
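
To make the suggestion concrete, here is a minimal, self-contained sketch of that userspace pattern. It is only a model: struct model_fill_ring, MODEL_NEED_WAKEUP, model_needs_wakeup(), rx_poll_once() and the kick callback are hypothetical names invented for illustration. A real application would use the ring helpers and a syscall on the socket fd, as the linked examples do.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the fill ring's flags word; not a kernel or
 * libxdp definition.
 */
#define MODEL_NEED_WAKEUP 1u

struct model_fill_ring {
	_Atomic uint32_t flags;
};

/* Hypothetical stand-in for the syscall that kicks the rx napi. */
typedef void (*kick_fn)(void *ctx);

static bool model_needs_wakeup(struct model_fill_ring *fq)
{
	return atomic_load_explicit(&fq->flags, memory_order_relaxed) &
	       MODEL_NEED_WAKEUP;
}

/* One iteration of an rx poll loop. @rcvd is the number of descriptors
 * the caller just found in the rx ring. Returns true if a wakeup was
 * issued.
 */
static bool rx_poll_once(struct model_fill_ring *fq, unsigned int rcvd,
			 kick_fn kick, void *ctx)
{
	if (rcvd)
		return false;	/* packets are flowing, nothing to do */

	/* An empty rx ring is exactly the case where the flag matters. */
	if (!model_needs_wakeup(fq))
		return false;

	kick(ctx);
	return true;
}

/* Test helper: counts how many times the wakeup was issued. */
static void count_kick(void *ctx)
{
	++*(unsigned int *)ctx;
}
```

The point of the sketch is that the check only runs on the empty-ring path, so it costs one load per idle poll and nothing while packets are being received.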

Furthermore, the XDP_RING_NEED_WAKEUP flag helpers do not provide any memory ordering. So even with your patch, I'm worried that the following case is still possible:

kernel                                  userspace

xsk_buff_alloc_batch -> failed
                                        submit fill ring
                                        flag != XDP_RING_NEED_WAKEUP
// reordering due to lack of memory orderings
xsk_set_rx_need_wakeup

I'm not an expert here, so correct me if I'm wrong. I think the wakeup flag is designed without any ordering guarantees, so we cannot rely on it to reason about the ring state and skip further checks.
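
For illustration only, the C11 sketch below models what it would take to rely on the flag: each side writes its own variable, issues a full fence, and then reads the other side's variable. With both fences in place, at least one side must observe the other's write. This is not what the xsk helpers do today as far as I can tell; struct model_xsk, MODEL_NEED_WAKEUP, model_kernel_refill() and model_user_submit() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NEED_WAKEUP 1u	/* hypothetical model flag */

/* Hypothetical model of the state shared by kernel and userspace. */
struct model_xsk {
	_Atomic uint32_t fill_entries;	/* entries published by userspace */
	_Atomic uint32_t flags;		/* need-wakeup flag set by the kernel */
};

/* Kernel-like side: returns how many fill entries are available. If none
 * are seen, set the flag, fence, and look again so that a concurrent
 * submit is not missed.
 */
static uint32_t model_kernel_refill(struct model_xsk *x)
{
	uint32_t n;

	n = atomic_load_explicit(&x->fill_entries, memory_order_acquire);
	if (n)
		return n;

	atomic_store_explicit(&x->flags, MODEL_NEED_WAKEUP,
			      memory_order_relaxed);
	/* Order the flag store before the re-read of the fill ring. */
	atomic_thread_fence(memory_order_seq_cst);

	return atomic_load_explicit(&x->fill_entries, memory_order_acquire);
}

/* Userspace-like side: publish @n entries, fence, then read the flag.
 * Returns true if a wakeup syscall would be needed.
 */
static bool model_user_submit(struct model_xsk *x, uint32_t n)
{
	atomic_fetch_add_explicit(&x->fill_entries, n, memory_order_release);
	/* Order the publish before the flag load. */
	atomic_thread_fence(memory_order_seq_cst);

	return atomic_load_explicit(&x->flags, memory_order_relaxed) &
	       MODEL_NEED_WAKEUP;
}
```

If either fence is dropped, the store and the following load on that side may be reordered, and both sides can miss each other, which is the scenario in the diagram above.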

+
        num = xsk_buff_alloc_batch(pool, xsk_buffs, rq->vq->num_free);
        if (!num) {
-               if (xsk_uses_need_wakeup(pool)) {
+               if (need_wakeup) {
                        xsk_set_rx_need_wakeup(pool);
                        /* Return 0 instead of -ENOMEM so that NAPI is
                         * descheduled.
@@ -1341,8 +1352,6 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
                }
                return -ENOMEM;
-       } else {
-               xsk_clear_rx_need_wakeup(pool);
        }
        len = xsk_pool_get_rx_frame_size(pool) + vi->hdr_len;
@@ -1363,6 +1372,16 @@ static int virtnet_add_recvbuf_xsk(struct virtnet_info *vi, struct receive_queue
                        goto err;
        }
+       if (need_wakeup) {
+               if (rq->vq->num_free)
+                       /* We have free buffers, so we'd better wake up the
+                        * rx napi as soon as possible.
+                        */
+                       xsk_set_rx_need_wakeup(pool);
+               else
+                       xsk_clear_rx_need_wakeup(pool);
+       }
+

Why do we need to set XDP_RING_NEED_WAKEUP even when xsk_buff_alloc_batch succeeds?

        return num;
err:

Thanks,
Quang Minh.


