On Thu, Aug 4, 2022 at 4:19 PM Eugenio Perez Martin
<epere...@redhat.com> wrote:
>
> On Thu, Aug 4, 2022 at 9:51 AM Jason Wang <jasow...@redhat.com> wrote:
> >
> > On Thu, Aug 4, 2022 at 3:39 PM Eugenio Perez Martin
> > <epere...@redhat.com> wrote:
> > >
> > > On Thu, Aug 4, 2022 at 6:17 AM Jason Wang <jasow...@redhat.com> wrote:
> > > >
> > > >
> > > > On 2022/8/3 01:57, Eugenio Pérez wrote:
> > > > > So it's generic enough to accept any out sg buffer, and we can inject
> > > > > NIC state messages.
> > > > >
> > > > > Signed-off-by: Eugenio Pérez <epere...@redhat.com>
> > > > > ---
> > > > > v5: Accept out sg instead of dev_buffers[]
> > > > > ---
> > > > >   net/vhost-vdpa.c | 13 +++++++------
> > > > >   1 file changed, 7 insertions(+), 6 deletions(-)
> > > > >
> > > > > diff --git a/net/vhost-vdpa.c b/net/vhost-vdpa.c
> > > > > index 33bf3d6409..2421bca347 100644
> > > > > --- a/net/vhost-vdpa.c
> > > > > +++ b/net/vhost-vdpa.c
> > > > > @@ -302,16 +302,16 @@ dma_map_err:
> > > > >   }
> > > > >
> > > > >   /**
> > > > > - * Copy the guest element into a dedicated buffer suitable to be sent to NIC
> > > > > + * Maps out sg and in buffer into dedicated buffers suitable to be sent to NIC
> > > > >   */
> > > > > -static bool vhost_vdpa_net_cvq_map_elem(VhostVDPAState *s,
> > > > > -                                        VirtQueueElement *elem,
> > > > > -                                        size_t *out_len)
> > > > > +static bool vhost_vdpa_net_cvq_map_sg(VhostVDPAState *s,
> > > > > +                                      const struct iovec *out, size_t out_num,
> > > > > +                                      size_t *out_len)
> > > >
> > > >
> > > > This still looks not general, as there's no guarantee that we won't
> > > > have command-in-specific-data. One example is that Ali is working on
> > > > the virtio-net statistics fetching from the control virtqueue.
> > > >
> > > > So it looks to me we'd better have a general bounce_map here that
> > > > accepts:
> > > >
> > > > 1) out_sg and out_num
> > > > 2) in_sg and in_num
> > > >
> > >
> > > We don't need to pass in_sg for that: the only useful information is
> > > its size.
> >
> > What if we support stats in the future, where it extends the ctrl command:
> >
> > u8 command;
> > u8 command-specific-data[];
> > u8 ack;
> > + u8 command-specific-data-reply[];
> >
> > in
> >
> > https://lists.oasis-open.org/archives/virtio-dev/202203/msg00000.html
> >
>
> The guest will expose an in descriptor in whatever layout it wants.
> QEMU reads its layout into a VirtQueueElement's in_num and in_sg
> members, in QEMU's (and SVQ's) address space.
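As an aside, the extended layout quoted above would look roughly like
this as a C sketch (illustrative only: the struct name is made up, the
reply field exists only in the linked proposal and not in the current
spec, and since C cannot place fields after a flexible array member
the device-writable half is shown as a comment):

#include <stdint.h>

struct ctrl_cmd_sketch {
    /* driver-writable part (out descriptors) */
    uint8_t command;
    uint8_t command_specific_data[];
    /*
     * device-writable part (in descriptors) would follow on the wire:
     *   uint8_t ack;                       VIRTIO_NET_OK / VIRTIO_NET_ERR
     *   uint8_t command_specific_reply[];  the proposed stats extension,
     *                                      e.g. counters from the device
     */
};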
Yes, but the current code does:

1) the map seems to be sg-based, but the unmap is not, resulting in an
   asymmetric API
2) NULL is passed to vhost_vdpa_cvq_map_buf()

    ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, NULL, 0,
                                sizeof(virtio_net_ctrl_ack),
                                s->cvq_cmd_in_buffer, &in_copied, true);
    if (unlikely(!ok)) {
        vhost_vdpa_cvq_unmap_buf(&s->vhost_vdpa, s->cvq_cmd_out_buffer);
        return false;
    }

So NULL ends up being used for iov_to_buf(), which is tricky (I'm not
even sure it can work). And this won't work for commands that have
in-data in the future.

>
> Since we don't want the guest to be able to modify the in buffer
> maliciously, we offer the device a bounce in buffer. Since the in
> buffer still contains no information, we only need its size, so we can
> "allocate and map" an equivalent one for the device and memset it to 0.
> For simplicity, we allocate one page, so there's no need for iovec
> complexities.
>
> After the device has written to it, we get the written length and
> verify the information. If VIRTIO_NET_OK, we copy that to the guest's
> in buffer, using iov_from_buf() right after the out: tag in
> vhost_vdpa_net_handle_ctrl_avail. Instead of copying from the on-stack
> "(status, sizeof(status))" variable, we copy from
> (s->cvq_cmd_in_buffer, written_len).

Another asymmetry: the iov_to_buf() is hidden in map(), but the
iov_from_buf() is exposed in vhost_vdpa_net_handle_ctrl_avail. I'd
suggest tweaking the code, otherwise it's not easy to read and
maintain:

map_sg()
validate_cmd()
add_to_svq()
handle_ctrl()
unmap_sg()

Or, since we know the location of the bounce buffer, we can simply do
the bouncing/iov_from_buf() explicitly before map_sg().

>
> Note that this is still not enough for stats. We also need a way to:
> * Update the virtio-net device model stats. virtio_net_handle_ctrl_iov
> would try to write the virtio-net stats to the in buffer, not update
> the device model stats.
> * Update the stats on the destination. Another ctrl command? Intercept
> them via svq and simply sum source stats + current device stats? I'd
> say the second is better as it saves effort for the device, but maybe
> it's not.

It should be sufficient to maintain a delta for each counter?

Thanks

>
> That's why I think this command should be left out at the moment;
> doing the modifications is not hard, but we should agree on how to do
> them first.
>
> > > Since the exposed buffer is an in one, it's enough to expose
> > > the s->cvq_cmd_in_buffer buffer in full. The caller already knows the
> > > device will write to it, so the only missing piece is to return the
> > > written length at vhost_vdpa_net_cvq_add.
> > >
> > > Is one page the right buffer size for the in buffer?
> >
> > We can start from this.
> >
> > > Is it worth
> > > worrying about it before implementing the stat control command in qemu
> > > virtio-net?
> >
> > If it's not complex, it's better to do that from the beginning,
> > otherwise the user may be surprised and we need extra work. Anyhow, we
> > should support at least the ack, which is an in_sg.
> >
> > >
> > > > At this level, we'd better not have any special handling for the in
> > > > buffer, such as the ack.
> > >
> > > We need to care about it. If a property has not been updated in the
> > > vdpa device (it returned VIRTIO_NET_ERR), we must not update the
> > > device model.
> >
> > Yes, but what I meant is at the level of the bouncing itself. If we met
> > VIRTIO_NET_ERR, we should propagate it to the guest as well?
> >
>
> Yes we do; if not, the guest thinks the command succeeded, doesn't it?
>
> Thanks!
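To be concrete about the map_sg()/validate_cmd()/add_to_svq()/
handle_ctrl()/unmap_sg() suggestion above, a pseudocode-level sketch
(every helper named here is hypothetical and only labels one of the
five steps; this is not actual QEMU code):

/* All five helpers are hypothetical; they exist only to name the steps. */
static bool map_sg(VhostVDPAState *s, const VirtQueueElement *elem,
                   size_t *out_len);
static bool validate_cmd(const void *cmd, size_t len);
static int add_to_svq(VhostShadowVirtqueue *svq, VhostVDPAState *s,
                      size_t out_len, size_t *written_len);
static int handle_ctrl(VhostVDPAState *s, size_t written_len);
static void unmap_sg(VhostVDPAState *s, VirtQueueElement *elem,
                     size_t written_len);

static int ctrl_avail_sketch(VhostShadowVirtqueue *svq,
                             VirtQueueElement *elem, VhostVDPAState *s)
{
    size_t out_len;
    size_t written_len = 0;
    int r = -EINVAL;

    /* 1) Bounce the out sg and map both bounce buffers, symmetrically. */
    if (!map_sg(s, elem, &out_len)) {
        return -ENOMEM;
    }

    /* 2) Reject malformed commands before the device ever sees them. */
    if (validate_cmd(s->cvq_cmd_out_buffer, out_len)) {
        /* 3) Forward the command through the shadow virtqueue... */
        r = add_to_svq(svq, s, out_len, &written_len);
        if (r == 0) {
            /* 4) ...and on success mirror the effect in the device model. */
            r = handle_ctrl(s, written_len);
        }
    }

    /* 5) Bounce the in buffer back to the guest and unmap, symmetrically. */
    unmap_sg(s, elem, written_len);
    return r;
}

The point is only that the two bouncing directions end up inside
map_sg() and unmap_sg(), so the handler body stays symmetric.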
> > > Thanks
> > >
> >
> > > We can move the processing from vhost_vdpa_net_cvq_add to
> > > vhost_vdpa_net_load and vhost_vdpa_net_handle_ctrl_avail, but the code
> > > gets duplicated then.
> > >
> > > > And we need to do the bouncing:
> > > >
> > > > 1) for the out buffer, during map
> > > > 2) for the in buffer, during unmap
> > > >
> > >
> > > We can move the copy of the in_buffer to the unmap for sure.
> > >
> > > Thanks!
> > > > Thanks
> > > >
> > > >
> > > > >   {
> > > > >       size_t in_copied;
> > > > >       bool ok;
> > > > >
> > > > > -    ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, elem->out_sg, elem->out_num,
> > > > > +    ok = vhost_vdpa_cvq_map_buf(&s->vhost_vdpa, out, out_num,
> > > > >                                  vhost_vdpa_net_cvq_cmd_len(),
> > > > >                                  s->cvq_cmd_out_buffer, out_len, false);
> > > > >       if (unlikely(!ok)) {
> > > > > @@ -435,7 +435,8 @@ static int vhost_vdpa_net_handle_ctrl_avail(VhostShadowVirtqueue *svq,
> > > > >       };
> > > > >       bool ok;
> > > > >
> > > > > -    ok = vhost_vdpa_net_cvq_map_elem(s, elem, &dev_buffers[0].iov_len);
> > > > > +    ok = vhost_vdpa_net_cvq_map_sg(s, elem->out_sg, elem->out_num,
> > > > > +                                   &dev_buffers[0].iov_len);
> > > > >       if (unlikely(!ok)) {
> > > > >           goto out;
> > > > >       }
> > > > >
> > > > >
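To make that bouncing direction concrete, a standalone sketch (plain C,
not QEMU code; QEMU would use iov_to_buf()/iov_from_buf() for these
copies, and the function names here are made up): the out sg is
flattened into its bounce buffer at map time, and the device-written
bytes are scattered back to the guest sg at unmap time.

#include <string.h>
#include <sys/uio.h>

/* Gather out_sg into the out bounce buffer: done during map. */
static size_t bounce_out_on_map(const struct iovec *out_sg, size_t out_num,
                                void *bounce, size_t bounce_size)
{
    size_t copied = 0;

    for (size_t i = 0; i < out_num && copied < bounce_size; i++) {
        size_t n = out_sg[i].iov_len;

        if (n > bounce_size - copied) {
            n = bounce_size - copied;
        }
        memcpy((char *)bounce + copied, out_sg[i].iov_base, n);
        copied += n;
    }
    return copied;
}

/* Scatter the device-written bytes back to the guest in_sg: done during
 * unmap, and only up to written_len, the length the device reported. */
static size_t bounce_in_on_unmap(const struct iovec *in_sg, size_t in_num,
                                 const void *bounce, size_t written_len)
{
    size_t copied = 0;

    for (size_t i = 0; i < in_num && copied < written_len; i++) {
        size_t n = in_sg[i].iov_len;

        if (n > written_len - copied) {
            n = written_len - copied;
        }
        memcpy(in_sg[i].iov_base, (const char *)bounce + copied, n);
        copied += n;
    }
    return copied;
}

Each helper caps the copy at the smaller of the two sizes, which is the
same behavior iov_to_buf()/iov_from_buf() give you in QEMU.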