On Mon, Mar 02, 2026 at 03:30:53PM +0100, Stefano Garzarella wrote:
> On Mon, Mar 02, 2026 at 03:51:49AM -0500, Michael S. Tsirkin wrote:
> > vhost_get_avail_idx is supposed to report whether it has updated
> > vq->avail_idx. Instead, it returns whether all entries have been
> > consumed, which is usually the same. But not always - in
> > drivers/vhost/net.c and when mergeable buffers have been enabled, the
> > driver checks whether the combined entries are big enough to store an
> > incoming packet. If not, the driver re-enables notifications with
> > available entries still in the ring. The incorrect return value from
> > vhost_get_avail_idx propagates through vhost_enable_notify and causes
> > the host to livelock if the guest is not making progress, as vhost will
> > immediately disable notifications and retry using the available entries.
> 
> Here I'd add something like this to make the full picture clear,
> because I spent quite some time working out how this relates to the
> Fixes tag (which I agree is the right one to use).
> 
>   This goes back to commit d3bb267bbdcb ("vhost: cache avail index in
>   vhost_enable_notify()") which changed vhost_enable_notify() to compare
>   the freshly read avail index against vq->last_avail_idx instead of the
>   previously cached vq->avail_idx. Commit 7ad472397667 ("vhost: move
>   smp_rmb() into vhost_get_avail_idx()") then carried over the same
>   comparison when refactoring vhost_enable_notify() to call the unified
>   vhost_get_avail_idx().

Indeed.
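
For anyone following along, here is a rough userspace model of the two
return-value semantics. It is only a sketch, not the kernel code:
struct vq_model, guest_idx, get_avail_idx_old(), get_avail_idx_new() and
retries() are made-up names for illustration, and error handling,
endianness conversion and the memory barrier are left out.

```c
#include <stdint.h>

/* Toy model of the fields involved; not the kernel's struct vhost_virtqueue. */
struct vq_model {
	uint16_t guest_idx;      /* index the guest last published in avail->idx */
	uint16_t avail_idx;      /* host's cached copy of that index */
	uint16_t last_avail_idx; /* next entry the host would consume */
};

/* Pre-patch behaviour: returns 1 while any unconsumed entry remains. */
static int get_avail_idx_old(struct vq_model *vq)
{
	vq->avail_idx = vq->guest_idx;
	return vq->avail_idx != vq->last_avail_idx;
}

/* Post-patch behaviour: returns 1 only if the guest published new entries. */
static int get_avail_idx_new(struct vq_model *vq)
{
	uint16_t idx = vq->guest_idx;

	if (idx == vq->avail_idx)
		return 0;

	vq->avail_idx = idx;
	return 1;
}

/*
 * Hypothetical retry loop: two entries are available but too small for the
 * incoming packet, and the guest adds nothing. Counts how many times the
 * handler would be re-run, capped at limit.
 */
static int retries(int (*get_idx)(struct vq_model *), int limit)
{
	struct vq_model vq = { .guest_idx = 2, .avail_idx = 2, .last_avail_idx = 0 };
	int n = 0;

	while (n < limit && get_idx(&vq))
		n++;

	return n;
}
```

In this model retries(get_avail_idx_old, limit) always runs up to the
cap, which corresponds to the livelock described above, while
retries(get_avail_idx_new, limit) stops immediately because nothing new
was published.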

> > 
> > The obvious fix is to make vhost_get_avail_idx do what the comment
> > says it does and report whether new entries have been added.
> > 
> > Reported-by: ShuangYu <[email protected]>
> > Fixes: d3bb267bbdcb ("vhost: cache avail index in vhost_enable_notify()")
> > Cc: Stefano Garzarella <[email protected]>
> > Cc: Stefan Hajnoczi <[email protected]>
> > Signed-off-by: Michael S. Tsirkin <[email protected]>
> > ---
> > 
> > Lightly tested, posting early to simplify testing for the reporter.
> 
> Tested with vhost-vsock and I didn't see any issue.
> 
> Thanks!
> 
> Reviewed-by: Stefano Garzarella <[email protected]>
> 
> > 
> > drivers/vhost/vhost.c | 11 +++++++----
> > 1 file changed, 7 insertions(+), 4 deletions(-)
> > 
> > diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
> > index 2f2c45d20883..db329a6f6145 100644
> > --- a/drivers/vhost/vhost.c
> > +++ b/drivers/vhost/vhost.c
> > @@ -1522,6 +1522,7 @@ static void vhost_dev_unlock_vqs(struct vhost_dev *d)
> > static inline int vhost_get_avail_idx(struct vhost_virtqueue *vq)
> > {
> >     __virtio16 idx;
> > +   u16 avail_idx;
> >     int r;
> > 
> >     r = vhost_get_avail(vq, idx, &vq->avail->idx);
> > @@ -1532,17 +1533,19 @@ static inline int vhost_get_avail_idx(struct vhost_virtqueue *vq)
> >     }
> > 
> >     /* Check it isn't doing very strange thing with available indexes */
> > -   vq->avail_idx = vhost16_to_cpu(vq, idx);
> > -   if (unlikely((u16)(vq->avail_idx - vq->last_avail_idx) > vq->num)) {
> > +   avail_idx = vhost16_to_cpu(vq, idx);
> > +   if (unlikely((u16)(avail_idx - vq->last_avail_idx) > vq->num)) {
> >             vq_err(vq, "Invalid available index change from %u to %u",
> > -                  vq->last_avail_idx, vq->avail_idx);
> > +                  vq->last_avail_idx, avail_idx);
> >             return -EINVAL;
> >     }
> > 
> >     /* We're done if there is nothing new */
> > -   if (vq->avail_idx == vq->last_avail_idx)
> > +   if (avail_idx == vq->avail_idx)
> >             return 0;
> > 
> > +   vq->avail_idx = avail_idx;
> > +
> >     /*
> >      * We updated vq->avail_idx so we need a memory barrier between
> >      * the index read above and the caller reading avail ring entries.
> > -- 
> > MST
> > 

