On Sat, Jan 31, 2026 at 8:15 AM Zhang Tianci
<[email protected]> wrote:
>
> Hi Eugenio,
>
> On Fri, Jan 30, 2026 at 7:52 PM Eugenio Perez Martin
> <[email protected]> wrote:
> >
> > On Fri, Jan 30, 2026 at 9:15 AM Zhang Tianci
> > <[email protected]> wrote:
> > >
> > > Move the message to recv_list before dropping msg_lock and copying the
> > > request to userspace, avoiding a transient unlinked state that can race
> > > with the msg_sync timeout path. Roll back to send_list on copy failures.
> > >
> >
> > Missed Fixes: tag and Cc: [email protected]. Or maybe we can
> > consider this a change in the behavior? I don't think any VDUSE
> > instance should trust it will never receive that message that it
> > received partially once but still...
>
> I'll add the Fixes tag and CC stable list in v2 patch.
>
> >
> >
> > > Signed-off-by: Zhang Tianci <[email protected]>
> > > Reviewed-by: Xie Yongji <[email protected]>
> > > ---
> > >  drivers/vdpa/vdpa_user/vduse_dev.c | 30 ++++++++++++++++++++++--------
> > >  1 file changed, 22 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/vdpa/vdpa_user/vduse_dev.c 
> > > b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > index ae357d014564c..b6a558341c06c 100644
> > > --- a/drivers/vdpa/vdpa_user/vduse_dev.c
> > > +++ b/drivers/vdpa/vdpa_user/vduse_dev.c
> > > @@ -325,6 +325,7 @@ static ssize_t vduse_dev_read_iter(struct kiocb 
> > > *iocb, struct iov_iter *to)
> > >         struct file *file = iocb->ki_filp;
> > >         struct vduse_dev *dev = file->private_data;
> > >         struct vduse_dev_msg *msg;
> > > +       struct vduse_dev_request req;
> > >         int size = sizeof(struct vduse_dev_request);
> > >         ssize_t ret;
> > >
> > > @@ -339,7 +340,7 @@ static ssize_t vduse_dev_read_iter(struct kiocb 
> > > *iocb, struct iov_iter *to)
> > >
> > >                 ret = -EAGAIN;
> > >                 if (file->f_flags & O_NONBLOCK)
> > > -                       goto unlock;
> > > +                       break;
> > >
> > >                 spin_unlock(&dev->msg_lock);
> > >                 ret = wait_event_interruptible_exclusive(dev->waitq,
> > > @@ -349,17 +350,30 @@ static ssize_t vduse_dev_read_iter(struct kiocb 
> > > *iocb, struct iov_iter *to)
> > >
> > >                 spin_lock(&dev->msg_lock);
> > >         }
> > > +       if (!msg) {
> > > +               spin_unlock(&dev->msg_lock);
> > > +               return ret;
> > > +       }
> > > +
> > > +       memcpy(&req, &msg->req, sizeof(req));
> > > +       /*
> > > +        * Move @msg to recv_list before dropping msg_lock.
> > > +        * This avoids a window where @msg is detached from any list and
> > > +        * vduse_dev_msg_sync() timeout path may operate on an unlinked 
> > > node.
> >
> > But in the timeout case, msg->completed is false so list_del is never
> > called, isn't it?
>
> Did you misread it?

Yes, I probably misread the code :). I should not post on Friday evenings!

> Here's the code:
>          spin_lock(&dev->msg_lock);
>          if (!msg->completed) {       <- here msg->completed is false.
>                  list_del(&msg->list);   <- boom
>                  msg->resp.result = VDUSE_REQ_RESULT_FAILED;
>                  /* Mark the device as malfunction when there is a timeout */
>                  if (!ret)
>                          vduse_dev_broken(dev);
>          }
>          ret = (msg->resp.result == VDUSE_REQ_RESULT_OK) ? 0 : -EIO;
>          spin_unlock(&dev->msg_lock);
>
> >
> > Is there even any event that may cause more than one packet in either
> > queue? Maybe we can simplify a lot of this if we don't have that
> > assumption.
>
> That makes sense. However, I believe that even though such a scenario doesn't
> exist at present, we cannot depend on the queue containing just a single 
> entry.
> This is a very fragile assumption and can easily be invalidated by 
> modifications
> to upper-layer operations.
>

I didn't mean to change the external character device, that would be a
userland visible change. Just to simplify the part of the code
internal to the module.

For example, instead of having a list, have a permanent msg with an
enum state: free (as "no message in flight)", sent_to_userland,
completed. So we don't need to worry about lists etc.

If vduse ever implements more than one message in flight it is very
easy to come back to the list. Note that I'm also interested in
speeding up the vdpa initialization and to have many msgs in flight is
one of the most straightforward ways to do it, I'm not against that in
the long term :).

> >
> > > +        */
> > > +       vduse_enqueue_msg(&dev->recv_list, msg);

If we re-enqueue it, another read from another userland thread could
read the same message again concurrently with this vduse_dev_read_iter
call, isn't it?

> > >         spin_unlock(&dev->msg_lock);
> > > -       ret = copy_to_iter(&msg->req, size, to);
> > > -       spin_lock(&dev->msg_lock);
> > > +
> > > +       ret = copy_to_iter(&req, size, to);
> > >         if (ret != size) {
> > > +               spin_lock(&dev->msg_lock);
> > > +               /* Roll back: move msg back to send_list if still 
> > > pending. */
> > > +               msg = vduse_find_msg(&dev->recv_list, req.request_id);
> > > +               if (msg)
> > > +                       vduse_enqueue_msg(&dev->send_list, msg)
> > > +               spin_unlock(&dev->msg_lock);
> > >                 ret = -EFAULT;
> > > -               vduse_enqueue_msg(&dev->send_list, msg);
> > > -               goto unlock;
> > >         }
> > > -       vduse_enqueue_msg(&dev->recv_list, msg);
> > > -unlock:
> > > -       spin_unlock(&dev->msg_lock);
> > >
> > >         return ret;
> > >  }
> > > --
> > > 2.39.5
> > >
> >
>
> I'll send out the v2 patch soon.
>
> Thanks,
> Tianci
>


Reply via email to