vhost-net initializes one ubuf_info per outstanding zerocopy TX
descriptor and hands it to the backend socket.  The networking stack may
then clone a zerocopy skb before all skb references are released.  For
example, batman-adv fragmentation reaches skb_split(), which calls
skb_zerocopy_clone() and increments the same ubuf_info refcount.

vhost_zerocopy_complete() currently treats every ubuf callback as a
completed vhost descriptor.  It dereferences ubuf->ctx, writes the
descriptor completion state, and drops the vhost_net_ubuf_ref even when
the callback only releases a cloned skb reference.  A backend reset can
therefore wait for and free the vhost_net_ubuf_ref while another cloned
skb still carries the same ubuf_info.  A later completion then
dereferences the freed ubufs pointer.

KASAN reports the stale completion as:

  BUG: KASAN: slab-use-after-free in vhost_zerocopy_complete+0x1d7/0x1f0
  BUG: KASAN: slab-use-after-free in vhost_zerocopy_complete+0x101/0x1f0
  vhost_zerocopy_complete
  skb_copy_ubufs
  __dev_forward_skb2
  veth_xmit

The freed object was allocated from vhost_net_ioctl() while setting the
backend and freed through kfree_rcu()/kvfree_rcu_bulk after backend
removal, while delayed skb completion still reached
vhost_zerocopy_complete().

Honor the generic ubuf_info refcount before touching vhost state, and run
the vhost descriptor completion only for the final ubuf reference.  This
matches the msg_zerocopy_complete() ownership rule for cloned zerocopy
skbs.

Fixes: bab632d69ee4 ("vhost: vhost TX zero-copy support")
Signed-off-by: Qing Ming <[email protected]>
---
 drivers/vhost/net.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index c6536cad9c4f..b9af63fb6306 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -390,13 +390,20 @@ static void vhost_zerocopy_signal_used(struct vhost_net 
*net,
 static void vhost_zerocopy_complete(struct sk_buff *skb,
                                    struct ubuf_info *ubuf_base, bool success)
 {
-       struct ubuf_info_msgzc *ubuf = uarg_to_msgzc(ubuf_base);
-       struct vhost_net_ubuf_ref *ubufs = ubuf->ctx;
-       struct vhost_virtqueue *vq = ubufs->vq;
+       struct ubuf_info_msgzc *ubuf;
+       struct vhost_net_ubuf_ref *ubufs;
+       struct vhost_virtqueue *vq;
        int cnt;
 
-       rcu_read_lock_bh();
+       /* Only the final cloned skb reference completes the vhost descriptor. 
*/
+       if (!refcount_dec_and_test(&ubuf_base->refcnt))
+               return;
+
+       ubuf = uarg_to_msgzc(ubuf_base);
+       ubufs = ubuf->ctx;
+       vq = ubufs->vq;
 
+       rcu_read_lock_bh();
        /* set len to mark this desc buffers done DMA */
        vq->heads[ubuf->desc].len = success ?
                VHOST_DMA_DONE_LEN : VHOST_DMA_FAILED_LEN;
-- 
2.53.0


Reply via email to