Re: [libvirt PATCH] RFC: Add support for vDPA network devices
I got some feedback from John Ferlan on a different forum about missing handling of migration and qemu capabilities. I'm adding this to my patch, but I'd appreciate any additional feedback, particularly on the xml format and the managed=yes/node-device questions below. On Mon, 2020-08-24 at 16:33 -0500, Jonathon Jongsma wrote: > On Thu, 2020-08-20 at 18:56 -0400, Laine Stump wrote: > > On 8/18/20 2:37 PM, Jonathon Jongsma wrote: > > > vDPA network devices allow high-performance networking in a > > > virtual > > > machine by providing a wire-speed data path. These devices > > > require > > > a > > > vendor-specific host driver but the data path follows the virtio > > > specification. > > > > > > The support for vDPA devices was recently added to qemu. This > > > allows > > > libvirt to support these devices. It requires that the device is > > > configured on the host with the appropriate vendor-specific > > > driver. > > > This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. > > > That > > > chardev path can then be used to define a new interface with > > > type='vdpa'. > > > --- > > [snip] > > > > > > > + > > > +:: > > > + > > > + ... > > > + > > > + > > > + > > > > (The above device is created (I just learned this from you in IRC!) > > by > > unbinding a VF from its NIC driver on the host, and re-binding it > > to > > a > > special VDPA-VF driver.) > > > > > > As we were just discussing online, on one hand it could be nice if > > libvirt could automatically handle rebinding the VF to the vdpa > > host > > driver (given the PCI address of the VF), to make it easier to use > > (because no advance setup would be needed), similar to what's > > already > > done with hostdev devices (and ) when > > managed='yes' (which is the default setting). > > > > > > On the other hand, it is exactly that managed='yes' functionality > > that > > has created more "libvirt-but-not-really-libvirt" bug reports than > > any > > other aspect of vfio device assignment, because the process of > > unbinding > > and rebinding drivers is timing-sensitive and causes code that's > > usually > > run only once at host boot-time to be run hundreds of times thus > > making > > it more likely to expose infrequently-hit bugs. > > > > > > I just bring this up in advance of someone suggesting the addition > > of > > managed='yes' here to put in my vote for *not* doing it, and > > instead > > using that same effort to provide some sort of API in the node- > > device > > driver for easily creating one or more VDPA devices from VFs, > > which > > could be done once at host boot time, and thus avoid the level of > > "libvirt-not-libvirt" bug reports for VDPA. (and after that maybe > > even > > an API to allocate a device from that pool to be used by a guest). > > But > > that's for later. > > I'd really love to hear some additional opinions on this topic from > some more of the libvirt "old-timers". I intentionally kept the > initial > vdpa support simple by requiring that the device is already setup and > bound to the appropriate driver. But I want to make sure that we can > add additional capabilities later (as Laine suggested) with this same > API/schema. > > Anybody else have thoughts on this topic? > > > > > > > + > > > + > > > + ... > > > + > > > :anchor:`` > > > > > > Teaming a virtio/hostdev NIC pair > > > diff --git a/docs/schemas/domaincommon.rng > > > b/docs/schemas/domaincommon.rng > > > index 0d0dcbc5ce..17f74490f4 100644 > > > --- a/docs/schemas/domaincommon.rng > > > +++ b/docs/schemas/domaincommon.rng > > > @@ -3108,6 +3108,21 @@ > > > > > > > > > > > > + > > > + > > > + > > > +vdpa > > > + > > > + > > > + > > > + > > > + > > > + > > > + > > > + > > > + > > > + > > > + > > > > > > > > > > > > > > [snip] > > > > diff --git a/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args > > > b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args > > > new file mode 100644 > > > index 00..8e76ac7794 > > > --- /dev/null > > > +++ b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args > > > @@ -0,0 +1,37 @@ > > > +LC_ALL=C \ > > > +PATH=/bin \ > > > +HOME=/tmp/lib/domain--1-QEMUGuest1 \ > > > +USER=test \ > > > +LOGNAME=test \ > > > +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \ > > > +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \ > > > +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \ > > > +QEMU_AUDIO_DRV=none \ > > > +/usr/bin/qemu-system-i386 \ > > > +-name guest=QEMUGuest1,debug-threads=on \ > > > +-S \ > > > +-object secret,id=masterKey0,format=raw,\ > > > +file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \ > > > +-machine pc,accel=tcg,usb=off,dump-guest-core=off \ > > > +-cpu qemu64 \ > > > +-m 214 \ > > >
Re: [libvirt PATCH] RFC: Add support for vDPA network devices
On Thu, 2020-08-20 at 18:56 -0400, Laine Stump wrote: > On 8/18/20 2:37 PM, Jonathon Jongsma wrote: > > vDPA network devices allow high-performance networking in a virtual > > machine by providing a wire-speed data path. These devices require > > a > > vendor-specific host driver but the data path follows the virtio > > specification. > > > > The support for vDPA devices was recently added to qemu. This > > allows > > libvirt to support these devices. It requires that the device is > > configured on the host with the appropriate vendor-specific driver. > > This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. > > That > > chardev path can then be used to define a new interface with > > type='vdpa'. > > --- [snip] > > > > + > > +:: > > + > > + ... > > + > > + > > + > > (The above device is created (I just learned this from you in IRC!) > by > unbinding a VF from its NIC driver on the host, and re-binding it to > a > special VDPA-VF driver.) > > > As we were just discussing online, on one hand it could be nice if > libvirt could automatically handle rebinding the VF to the vdpa host > driver (given the PCI address of the VF), to make it easier to use > (because no advance setup would be needed), similar to what's > already > done with hostdev devices (and ) when > managed='yes' (which is the default setting). > > > On the other hand, it is exactly that managed='yes' functionality > that > has created more "libvirt-but-not-really-libvirt" bug reports than > any > other aspect of vfio device assignment, because the process of > unbinding > and rebinding drivers is timing-sensitive and causes code that's > usually > run only once at host boot-time to be run hundreds of times thus > making > it more likely to expose infrequently-hit bugs. > > > I just bring this up in advance of someone suggesting the addition > of > managed='yes' here to put in my vote for *not* doing it, and instead > using that same effort to provide some sort of API in the node- > device > driver for easily creating one or more VDPA devices from VFs, which > could be done once at host boot time, and thus avoid the level of > "libvirt-not-libvirt" bug reports for VDPA. (and after that maybe > even > an API to allocate a device from that pool to be used by a guest). > But > that's for later. I'd really love to hear some additional opinions on this topic from some more of the libvirt "old-timers". I intentionally kept the initial vdpa support simple by requiring that the device is already setup and bound to the appropriate driver. But I want to make sure that we can add additional capabilities later (as Laine suggested) with this same API/schema. Anybody else have thoughts on this topic? > > > > + > > + > > + ... > > + > > :anchor:`` > > > > Teaming a virtio/hostdev NIC pair > > diff --git a/docs/schemas/domaincommon.rng > > b/docs/schemas/domaincommon.rng > > index 0d0dcbc5ce..17f74490f4 100644 > > --- a/docs/schemas/domaincommon.rng > > +++ b/docs/schemas/domaincommon.rng > > @@ -3108,6 +3108,21 @@ > > > > > > > > + > > + > > + > > +vdpa > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > + > > > > > > > > [snip] > > diff --git a/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args > > b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args > > new file mode 100644 > > index 00..8e76ac7794 > > --- /dev/null > > +++ b/tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args > > @@ -0,0 +1,37 @@ > > +LC_ALL=C \ > > +PATH=/bin \ > > +HOME=/tmp/lib/domain--1-QEMUGuest1 \ > > +USER=test \ > > +LOGNAME=test \ > > +XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \ > > +XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \ > > +XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \ > > +QEMU_AUDIO_DRV=none \ > > +/usr/bin/qemu-system-i386 \ > > +-name guest=QEMUGuest1,debug-threads=on \ > > +-S \ > > +-object secret,id=masterKey0,format=raw,\ > > +file=/tmp/lib/domain--1-QEMUGuest1/master-key.aes \ > > +-machine pc,accel=tcg,usb=off,dump-guest-core=off \ > > +-cpu qemu64 \ > > +-m 214 \ > > +-overcommit mem-lock=off \ > > +-smp 1,sockets=1,cores=1,threads=1 \ > > +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ > > +-display none \ > > +-no-user-config \ > > +-nodefaults \ > > +-chardev socket,id=charmonitor,fd=1729,server,nowait \ > > +-mon chardev=charmonitor,id=monitor,mode=control \ > > +-rtc base=utc \ > > +-no-shutdown \ > > +-no-acpi \ > > +-boot strict=on \ > > +-device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \ > > +-add-fd set=0,fd=1732 \ > > +-netdev vhost-vdpa,vhostdev=/dev/fdset/0,id=hostnet0 \ > > Okay, I'm feeling too lazy to parse through the code above an see > how > you arrived at "vhostdev='/dev/fdset/0'", but
Re: [libvirt PATCH] RFC: Add support for vDPA network devices
On 8/18/20 2:37 PM, Jonathon Jongsma wrote: vDPA network devices allow high-performance networking in a virtual machine by providing a wire-speed data path. These devices require a vendor-specific host driver but the data path follows the virtio specification. The support for vDPA devices was recently added to qemu. This allows libvirt to support these devices. It requires that the device is configured on the host with the appropriate vendor-specific driver. This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That chardev path can then be used to define a new interface with type='vdpa'. --- docs/formatdomain.rst | 20 + docs/schemas/domaincommon.rng | 15 +++ src/conf/domain_conf.c| 41 +++ src/conf/domain_conf.h| 4 ++ src/conf/netdev_bandwidth_conf.c | 1 + src/libxl/libxl_conf.c| 1 + src/libxl/xen_common.c| 1 + src/lxc/lxc_controller.c | 1 + src/lxc/lxc_driver.c | 3 ++ src/lxc/lxc_process.c | 1 + src/qemu/qemu_command.c | 29 - src/qemu/qemu_command.h | 3 +- src/qemu/qemu_domain.c| 6 ++- src/qemu/qemu_hotplug.c | 15 --- src/qemu/qemu_interface.c | 25 +++ src/qemu/qemu_interface.h | 2 + src/qemu/qemu_process.c | 1 + src/qemu/qemu_validate.c | 1 + src/vmx/vmx.c | 1 + .../net-vdpa.x86_64-latest.args | 37 + tests/qemuxml2argvdata/net-vdpa.xml | 28 + tests/qemuxml2argvmock.c | 11 - tests/qemuxml2argvtest.c | 1 + tests/qemuxml2xmloutdata/net-vdpa.xml | 34 +++ tests/qemuxml2xmltest.c | 1 + tools/virsh-domain.c | 1 + 26 files changed, 274 insertions(+), 10 deletions(-) create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml I would have had fewer excuses to procrastinate in looking at this if it was broken up into smaller patches. At least one patch for the change to the XML schema/parser/formatter, and xml2xml test case, and a bit of docs in formatdomain, then another putting the support into qemu for that bit of config. diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 8365fc8bbb..1356485504 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -4632,6 +4632,26 @@ or stopping the guest. ... +:anchor:`` + +vDPA devices + + +A vDPA device can be used to provide wire speed network performance within a +domain. The host device must already be configured with the appropriate +device-specific vDPA driver. This creates a vDPA char device (e.g. +/dev/vhost-vdpa-0) that can be used to assign the device to a libvirt domain. Maybe at least mention here that this only works with certain models of SR-IOV NICs, and that each guest vdpa uses up one SR-IOV VF on the host. Otherwise we'll get people seeing the "wirespeed performance" part, then trying to figure out how to set it up using their ISA bus NE2000 NIC or something :-) + +:: + + ... + + + (The above device is created (I just learned this from you in IRC!) by unbinding a VF from its NIC driver on the host, and re-binding it to a special VDPA-VF driver.) As we were just discussing online, on one hand it could be nice if libvirt could automatically handle rebinding the VF to the vdpa host driver (given the PCI address of the VF), to make it easier to use (because no advance setup would be needed), similar to what's already done with hostdev devices (and ) when managed='yes' (which is the default setting). On the other hand, it is exactly that managed='yes' functionality that has created more "libvirt-but-not-really-libvirt" bug reports than any other aspect of vfio device assignment, because the process of unbinding and rebinding drivers is timing-sensitive and causes code that's usually run only once at host boot-time to be run hundreds of times thus making it more likely to expose infrequently-hit bugs. I just bring this up in advance of someone suggesting the addition of managed='yes' here to put in my vote for *not* doing it, and instead using that same effort to provide some sort of API in the node-device driver for easily creating one or more VDPA devices from VFs, which could be done once at host boot time, and thus avoid the level of "libvirt-not-libvirt" bug reports for VDPA. (and after that maybe even
[libvirt PATCH] RFC: Add support for vDPA network devices
vDPA network devices allow high-performance networking in a virtual machine by providing a wire-speed data path. These devices require a vendor-specific host driver but the data path follows the virtio specification. The support for vDPA devices was recently added to qemu. This allows libvirt to support these devices. It requires that the device is configured on the host with the appropriate vendor-specific driver. This will create a chardev on the host at e.g. /dev/vhost-vdpa-0. That chardev path can then be used to define a new interface with type='vdpa'. --- docs/formatdomain.rst | 20 + docs/schemas/domaincommon.rng | 15 +++ src/conf/domain_conf.c| 41 +++ src/conf/domain_conf.h| 4 ++ src/conf/netdev_bandwidth_conf.c | 1 + src/libxl/libxl_conf.c| 1 + src/libxl/xen_common.c| 1 + src/lxc/lxc_controller.c | 1 + src/lxc/lxc_driver.c | 3 ++ src/lxc/lxc_process.c | 1 + src/qemu/qemu_command.c | 29 - src/qemu/qemu_command.h | 3 +- src/qemu/qemu_domain.c| 6 ++- src/qemu/qemu_hotplug.c | 15 --- src/qemu/qemu_interface.c | 25 +++ src/qemu/qemu_interface.h | 2 + src/qemu/qemu_process.c | 1 + src/qemu/qemu_validate.c | 1 + src/vmx/vmx.c | 1 + .../net-vdpa.x86_64-latest.args | 37 + tests/qemuxml2argvdata/net-vdpa.xml | 28 + tests/qemuxml2argvmock.c | 11 - tests/qemuxml2argvtest.c | 1 + tests/qemuxml2xmloutdata/net-vdpa.xml | 34 +++ tests/qemuxml2xmltest.c | 1 + tools/virsh-domain.c | 1 + 26 files changed, 274 insertions(+), 10 deletions(-) create mode 100644 tests/qemuxml2argvdata/net-vdpa.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/net-vdpa.xml create mode 100644 tests/qemuxml2xmloutdata/net-vdpa.xml diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 8365fc8bbb..1356485504 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -4632,6 +4632,26 @@ or stopping the guest. ... +:anchor:`` + +vDPA devices + + +A vDPA device can be used to provide wire speed network performance within a +domain. The host device must already be configured with the appropriate +device-specific vDPA driver. This creates a vDPA char device (e.g. +/dev/vhost-vdpa-0) that can be used to assign the device to a libvirt domain. + +:: + + ... + + + + + + ... + :anchor:`` Teaming a virtio/hostdev NIC pair diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 0d0dcbc5ce..17f74490f4 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -3108,6 +3108,21 @@ + + + +vdpa + + + + + + + + + + + diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 8e7981bf25..74f2c2f3e3 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -549,6 +549,7 @@ VIR_ENUM_IMPL(virDomainNet, "direct", "hostdev", "udp", + "vdpa", ); VIR_ENUM_IMPL(virDomainNetModel, @@ -2495,6 +2496,10 @@ virDomainNetDefClear(virDomainNetDefPtr def) def->data.vhostuser = NULL; break; +case VIR_DOMAIN_NET_TYPE_VDPA: +VIR_FREE(def->data.vdpa.devicepath); +break; + case VIR_DOMAIN_NET_TYPE_SERVER: case VIR_DOMAIN_NET_TYPE_CLIENT: case VIR_DOMAIN_NET_TYPE_MCAST: @@ -6489,6 +6494,15 @@ virDomainNetDefValidate(const virDomainNetDef *net) return -1; } +if (net->type == VIR_DOMAIN_NET_TYPE_VDPA && +net->model != VIR_DOMAIN_NET_MODEL_VIRTIO) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("invalid model for interface of type '%s': '%s'"), + virDomainNetTypeToString(net->type), + virDomainNetModelTypeToString(net->model)); +return -1; +} + return 0; } @@ -11982,6 +11996,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt, g_autofree char *vhost_path = NULL; g_autofree char *teamingType = NULL; g_autofree char *teamingPersistent = NULL; +g_autofree char *vdpa_dev = NULL; const char *prefix = xmlopt ? xmlopt->config.netPrefix : NULL; if (!(def =