On Wed, Jan 15, 2014 at 10:07 AM, Michael S. Tsirkin <m...@redhat.com> wrote:
> On Tue, Jan 14, 2014 at 07:13:43PM +0100, Antonios Motakis wrote:
>>
>> On Tue, Jan 14, 2014 at 12:33 PM, Michael S. Tsirkin <m...@redhat.com> wrote:
>>
>> On Mon, Jan 13, 2014 at 03:25:11PM +0100, Antonios Motakis wrote:
>> > In this patch series we would like to introduce our approach for putting a
>> > virtio-net backend in an external userspace process. Our eventual target is
>> > to run the network backend in the Snabbswitch ethernet switch, while
>> > receiving traffic from a guest inside QEMU/KVM which runs an unmodified
>> > virtio-net implementation.
>> >
>> > For this, we are working on extending vhost to allow equivalent
>> > functionality for userspace. Vhost already passes control of the data plane
>> > of virtio-net to the host kernel; we want to realize a similar model, but
>> > for userspace.
>> >
>> > In this patch series the concept of a vhost-backend is introduced.
>> >
>> > We define two vhost backend types - vhost-kernel and vhost-user. The former
>> > is the interface to the current kernel module implementation. Its control
>> > plane is ioctl based. The data plane is the kernel directly accessing the
>> > QEMU-allocated guest memory.
>> >
>> > In the new vhost-user backend, the control plane is based on communication
>> > between QEMU and another userspace process using a unix domain socket. This
>> > allows implementing a virtio backend for a guest running in QEMU, inside
>> > the other userspace process.
>> >
>> > We change -mem-path to QemuOpts and add prealloc, share and unlink as
>> > properties to it. HugeTLBFS requirements of -mem-path are relaxed, so any
>> > valid path can be used now. The new properties allow more fine-grained
>> > control over the guest RAM backing store.
>> >
>> > The data path is realized by directly accessing the vrings and the buffer
>> > data off the guest's memory.
>> >
>> > The current user of vhost-user is only vhost-net. We add a new netdev
>> > backend that is intended to initialize vhost-net with the vhost-user
>> > backend.
>>
>> Some meta comments.
>>
>> Something that makes this patch harder to review is how it's
>> split up. Generally IMHO it's not a good idea to repeatedly
>> edit the same part of a file, adding stuff in patch after patch;
>> it only makes things harder to read if you add stubs, then fill them up.
>> (we do this sometimes when we are changing existing code, but
>> it is generally not needed when adding new code)
>>
>> Instead, split it like this:
>>
>> 1. general refactoring, split out linux specific and generic parts
>>    and add the ops indirection
>> 2. add new files for vhost-user with complete implementation.
>>    without command line to support it, there will be no way to use it,
>>    but should build fine.
>> 3. tie it all up with option parsing
>>
>>
>> Generic vhost and vhost net files should be kept separate.
>> Don't let vhost net stuff seep back into generic files,
>> we have vhost-scsi too.
>> I would also prefer that userspace vhost has its own files.
>>
>>
>> Ok, we'll take this into account.
>>
>>
>> We need a small test server qemu can talk to, to verify things
>> actually work.
>>
>>
>> We have implemented such a test app:
>> https://github.com/virtualopensystems/vapp
>>
>> We use it for testing, and also as a reference implementation. A client is
>> also included.
>>
>
> Sounds good. Can we include this in qemu and tie
> it into the qtest framework?
> From a brief look, it merely needs to be tweaked for portability,
> unless
Could you provide some hints/examples about what it would look like to use
qtest and pxe ROM? We have looked but haven't found anything obvious.

Thanks
Antonios

>
>>
>> Already commented on: reuse the chardev syntax and preferably code.
>> We already support a bunch of options there for
>> domain sockets that will be useful here, they should
>> work here as well.
>>
>>
>> We adapted the syntax for this to be consistent with chardev. For the options
>> we didn't use, it is not obvious at all to us how they should be used; a lot
>> of the chardev options just don't apply to us.
>>
>
> Well server option should work at least.
> nowait can work too?
>
> Also, if reconnect is useful it should be for chardevs too, so if we don't
> share code, need to code it in two places to stay consistent.
>
> Overall sharing some code might be better ...
>
>> In particular you shouldn't require filesystem access by qemu,
>> passing fd for domain socket should work.
>>
>>
>> We can add an option to pass an fd for the domain socket if needed. However
>> as far as we understand, chardev doesn't do that either (at least from
>> looking at the man page). Maybe we misunderstand what you mean.
>
> Sorry. I got confused with e.g. tap which has this. This might be
> useful but does not have to block this patch.
>
>>
>>
>> > Example usage:
>> >
>> > qemu -m 1024 -mem-path /hugetlbfs,prealloc=on,share=on \
>> >   -netdev type=vhost-user,id=net0,path=/path/to/sock,poll_time=2500 \
>> >   -device virtio-net-pci,netdev=net0
>>
>> It's not clear which parts of -mem-path are required for vhost-user.
>> It should be documented somewhere, made clear in -help
>> and should fail gracefully if misconfigured.
>>
>>
>> Ok.
>>
>> >
>> > Changes from v5:
>> > - Split -mem-path unlink option to a separate patch
>> > - Fds are passed only in the ancillary data
>> > - Stricter message size checks on receive/send
>> > - Netdev vhost-user now includes path and poll_time options
>> > - The connection probing interval is configurable
>> >
>> > Changes from v4:
>> > - Use error_report for errors
>> > - VhostUserMsg has new field `size` indicating the following payload length.
>> >   Field `flags` now has version and reply bits. The structure is packed.
>> > - Send data is of variable length (`size` field in message)
>> > - Receive in 2 steps, header and payload
>> > - Add new message type VHOST_USER_ECHO, to check connection status
>> >
>> > Changes from v3:
>> > - Convert -mem-path to QemuOpts with prealloc, share and unlink properties
>> > - Set 1 sec timeout when read/write to the unix domain socket
>> > - Fix file descriptor leak
>> >
>> > Changes from v2:
>> > - Reconnect when the backend disappears
>> >
>> > Changes from v1:
>> > - Implementation of vhost-user netdev backend
>> > - Code improvements
>> >
>> > Antonios Motakis (8):
>> >   Convert -mem-path to QemuOpts and add prealloc and share properties
>> >   New -mem-path option - unlink.
>> >   Decouple vhost from kernel interface
>> >   Add vhost-user skeleton
>> >   Add domain socket communication for vhost-user backend
>> >   Add vhost-user calls implementation
>> >   Add new vhost-user netdev backend
>> >   Add vhost-user reconnection
>> >
>> >  exec.c                            |  57 +++-
>> >  hmp-commands.hx                   |   4 +-
>> >  hw/net/vhost_net.c                | 144 +++++++---
>> >  hw/net/virtio-net.c               |  42 ++-
>> >  hw/scsi/vhost-scsi.c              |  13 +-
>> >  hw/virtio/Makefile.objs           |   2 +-
>> >  hw/virtio/vhost-backend.c         | 556 ++++++++++++++++++++++++++++++++++++++
>> >  hw/virtio/vhost.c                 |  46 ++--
>> >  include/exec/cpu-all.h            |   3 -
>> >  include/hw/virtio/vhost-backend.h |  40 +++
>> >  include/hw/virtio/vhost.h         |   4 +-
>> >  include/net/vhost-user.h          |  17 ++
>> >  include/net/vhost_net.h           |  15 +-
>> >  net/Makefile.objs                 |   2 +-
>> >  net/clients.h                     |   3 +
>> >  net/hub.c                         |   1 +
>> >  net/net.c                         |   2 +
>> >  net/tap.c                         |  16 +-
>> >  net/vhost-user.c                  | 177 ++++++++++++
>> >  qapi-schema.json                  |  21 +-
>> >  qemu-options.hx                   |  24 +-
>> >  vl.c                              |  41 ++-
>> >  22 files changed, 1106 insertions(+), 124 deletions(-)
>> >  create mode 100644 hw/virtio/vhost-backend.c
>> >  create mode 100644 include/hw/virtio/vhost-backend.h
>> >  create mode 100644 include/net/vhost-user.h
>> >  create mode 100644 net/vhost-user.c
>> >
>> > --
>> > 1.8.3.2
>> >
>>
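
For readers following the changelog quoted above (a packed message with `size` and `flags` fields, received in two steps, with strict size checks), here is a rough Python sketch of that kind of framing over a unix domain socket pair. It is only an illustration, not the code from the series: the thread does not spell out the VhostUserMsg layout, so the three 32-bit little-endian header fields, the bit positions in `flags`, the `MAX_PAYLOAD` limit and all function names below are assumptions.

```python
import socket
import struct

# Hypothetical packed header: request, flags, size (three little-endian u32).
HEADER = struct.Struct("<III")
VERSION_MASK = 0x3       # assumed: low bits of flags carry the version
REPLY_BIT = 0x1 << 2     # assumed: one bit marks a reply
MAX_PAYLOAD = 4096       # assumed upper bound used for the size checks


def recv_exact(sock, n):
    """Read exactly n bytes, or raise if the peer closes early."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            raise ConnectionError("peer closed the socket mid-message")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def send_msg(sock, request, flags, payload=b""):
    """Send a header followed by a variable-length payload."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large")
    sock.sendall(HEADER.pack(request, flags, len(payload)) + payload)


def recv_msg(sock):
    """Receive in two steps: fixed-size header first, then the payload."""
    request, flags, size = HEADER.unpack(recv_exact(sock, HEADER.size))
    # Check the declared size before trying to read the payload.
    if size > MAX_PAYLOAD:
        raise ValueError("declared payload size exceeds limit")
    payload = recv_exact(sock, size)
    return request, flags, payload


if __name__ == "__main__":
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    send_msg(a, request=1, flags=0x1, payload=b"hello")
    print(recv_msg(b))
    a.close()
    b.close()
```

Reading the header first lets the receiver validate `size` before committing to a read of that length, which is one plausible reason for the two-step receive; file descriptor passing via ancillary data is left out of this sketch.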