[libvirt] [PATCH 0/5] Support Ephemeral passthrough hostdevs
The ephemeral flag helps support migration with PCI-passthrough. An
ephemeral hostdev is automatically unplugged before migration and
replugged (if one is available on the destination) after migration.

Shradha Shah (5):
  Added ephemeral flag for hostdev in domain conf.
  Adding ephemeral flag for hostdev in network conf.
  Ephemeral flag modification within the network driver.
  Ephemeral flag modification within the qemu driver.
  Migration support for ephemeral hostdevs.

 docs/schemas/domaincommon.rng                      |   16
 docs/schemas/network.rng                           |    8 ++
 src/conf/domain_conf.c                             |   23 +-
 src/conf/domain_conf.h                             |    1 +
 src/conf/network_conf.c                            |   11 +++
 src/conf/network_conf.h                            |    1 +
 src/network/bridge_driver.c                        |    1 +
 src/qemu/qemu_command.c                            |   63 +-
 src/qemu/qemu_migration.c                          |   94 +++-
 tests/networkxml2xmlin/hostdev-pf.xml              |    2 +-
 tests/networkxml2xmlin/hostdev.xml                 |    2 +-
 tests/networkxml2xmlout/hostdev-pf.xml             |    2 +-
 tests/networkxml2xmlout/hostdev.xml                |    2 +-
 .../qemuxml2argv-hostdev-pci-address.xml           |    2 +-
 .../qemuxml2argv-hostdev-usb-address.xml           |    2 +-
 .../qemuxml2argvdata/qemuxml2argv-net-hostdev.xml  |    2 +-
 tests/qemuxml2argvdata/qemuxml2argv-pci-rom.xml    |    4 +-
 17 files changed, 200 insertions(+), 36 deletions(-)

-- 
1.7.4.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 2/5] Adding ephemeral flag for hostdev in network conf.
The ephemeral flag helps support migration with PCI-passthrough. An
ephemeral hostdev is automatically unplugged before migration and
replugged (if one is available on the destination) after migration.
---
 docs/schemas/network.rng               |    8
 src/conf/network_conf.c                |   11 +++
 src/conf/network_conf.h                |    1 +
 tests/networkxml2xmlin/hostdev-pf.xml  |    2 +-
 tests/networkxml2xmlin/hostdev.xml     |    2 +-
 tests/networkxml2xmlout/hostdev-pf.xml |    2 +-
 tests/networkxml2xmlout/hostdev.xml    |    2 +-
 7 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng
index 4abfd91..c86ade8 100644
--- a/docs/schemas/network.rng
+++ b/docs/schemas/network.rng
@@ -100,6 +100,14 @@
               </choice>
             </attribute>
           </optional>
+          <optional>
+            <attribute name="ephemeral">
+              <choice>
+                <value>yes</value>
+                <value>no</value>
+              </choice>
+            </attribute>
+          </optional>
           <interleave>
             <choice>
               <group>
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c
index 228951d..4f48f64 100644
--- a/src/conf/network_conf.c
+++ b/src/conf/network_conf.c
@@ -1222,6 +1222,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt)
     int nIps, nPortGroups, nForwardIfs, nForwardPfs, nForwardAddrs;
     char *forwardDev = NULL;
     char *forwardManaged = NULL;
+    char *forwardEphemeral = NULL;
     char *type = NULL;
     xmlNodePtr save = ctxt->node;
     xmlNodePtr bandwidthNode = NULL;
@@ -1380,6 +1381,11 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt)
             if (STRCASEEQ(forwardManaged, "yes"))
                 def->managed = 1;
         }
+        forwardEphemeral = virXPathString("string(./@ephemeral)", ctxt);
+        if (forwardEphemeral != NULL) {
+            if (STRCASEEQ(forwardEphemeral, "yes"))
+                def->ephemeral = 1;
+        }
 
         /* all of these modes can use a pool of physical interfaces */
         nForwardIfs = virXPathNodeSet("./interface", ctxt, &forwardIfNodes);
@@ -1527,6 +1533,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt)
         VIR_FREE(type);
         VIR_FREE(forwardDev);
         VIR_FREE(forwardManaged);
+        VIR_FREE(forwardEphemeral);
         VIR_FREE(forwardPfNodes);
         VIR_FREE(forwardIfNodes);
         VIR_FREE(forwardAddrNodes);
@@ -1861,6 +1868,10 @@ char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags)
                 virBufferAddLit(&buf, " managed='yes'");
             else
                 virBufferAddLit(&buf, " managed='no'");
+            if (def->ephemeral == 1)
+                virBufferAddLit(&buf, " ephemeral='yes'");
+            else
+                virBufferAddLit(&buf, " ephemeral='no'");
         }
         virBufferAsprintf(&buf, "%s>\n",
                           (def->nForwardIfs || def->nForwardPfs) ? "" : "/");
diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h
index 3e46304..37a5d42 100644
--- a/src/conf/network_conf.h
+++ b/src/conf/network_conf.h
@@ -186,6 +186,7 @@ struct _virNetworkDef {
 
     int forwardType;    /* One of virNetworkForwardType constants */
     int managed;        /* managed attribute for hostdev mode */
+    int ephemeral;      /* ephemeral attribute for hostdev mode */
 
     /* If there are multiple forward devices (i.e. a pool of
      * interfaces), they will be listed here.
diff --git a/tests/networkxml2xmlin/hostdev-pf.xml b/tests/networkxml2xmlin/hostdev-pf.xml
index 7bf857d..6b928a6 100644
--- a/tests/networkxml2xmlin/hostdev-pf.xml
+++ b/tests/networkxml2xmlin/hostdev-pf.xml
@@ -1,7 +1,7 @@
 <network>
   <name>hostdev</name>
   <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid>
-  <forward mode='hostdev' managed='yes'>
+  <forward mode='hostdev' managed='yes' ephemeral='yes'>
     <pf dev='eth2'/>
   </forward>
 </network>
diff --git a/tests/networkxml2xmlin/hostdev.xml b/tests/networkxml2xmlin/hostdev.xml
index 03f1411..406c2df 100644
--- a/tests/networkxml2xmlin/hostdev.xml
+++ b/tests/networkxml2xmlin/hostdev.xml
@@ -1,7 +1,7 @@
 <network>
   <name>hostdev</name>
   <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid>
-  <forward mode='hostdev' managed='yes'>
+  <forward mode='hostdev' managed='yes' ephemeral='yes'>
     <address type='pci' domain='0x' bus='0x03' slot='0x00' function='0x1'/>
     <address type='pci' domain='0x' bus='0x03' slot='0x00' function='0x2'/>
     <address type='pci' domain='0x' bus='0x03' slot='0x00' function='0x3'/>
diff --git a/tests/networkxml2xmlout/hostdev-pf.xml b/tests/networkxml2xmlout/hostdev-pf.xml
index 7bf857d..6b928a6 100644
--- a/tests/networkxml2xmlout/hostdev-pf.xml
+++ b/tests/networkxml2xmlout/hostdev-pf.xml
@@ -1,7 +1,7 @@
 <network>
   <name>hostdev</name>
   <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid>
-  <forward mode='hostdev' managed='yes'>
+  <forward mode='hostdev' managed='yes' ephemeral='yes'>
     <pf
[libvirt] [PATCH 3/5] Ephemeral flag modification within the network driver.
---
 src/network/bridge_driver.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index fb167dc..a72f3b4 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -3553,6 +3553,7 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
             iface->data.network.actual->data.hostdev.def.info = &iface->info;
             iface->data.network.actual->data.hostdev.def.mode = VIR_DOMAIN_HOSTDEV_MODE_SUBSYS;
             iface->data.network.actual->data.hostdev.def.managed = netdef->managed;
+            iface->data.network.actual->data.hostdev.def.ephemeral = netdef->ephemeral;
             iface->data.network.actual->data.hostdev.def.source.subsys.type = dev->type;
             iface->data.network.actual->data.hostdev.def.source.subsys.u.pci = dev->device.pci;
 
-- 
1.7.4.4
[libvirt] [PATCH 4/5] Ephemeral flag modification within the qemu driver.
When a guest with an ephemeral device is migrated, the PCI-passthrough
of the ephemeral device should take place after migration, hence we
check the vmop in qemuBuildCommandLine. We also discard the PCI slot
assigned by qemuCollectPCIAddress, as a PCI address will be assigned
later during the hotplug.
---
 src/qemu/qemu_command.c |   62 ++
 1 files changed, 40 insertions(+), 22 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 1b20d23..6e1851c 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -4954,6 +4954,9 @@ qemuBuildCommandLine(virConnectPtr conn,
         VIR_DOMAIN_CONTROLLER_TYPE_CCID,
     };
 
+    virDomainObjPtr vm = NULL;
+    virDomainObjListPtr doms = &driver->domains;
+
     VIR_DEBUG("conn=%p driver=%p def=%p mon=%p json=%d "
               "caps=%p migrateFrom=%s migrateFD=%d "
               "snapshot=%p vmop=%d",
@@ -4962,6 +4965,8 @@ qemuBuildCommandLine(virConnectPtr conn,
 
     virUUIDFormat(def->uuid, uuid);
 
+    vm = virHashLookup(doms->objs, uuid);
+
     emulator = def->emulator;
 
     /*
@@ -5931,36 +5936,49 @@ qemuBuildCommandLine(virConnectPtr conn,
                 if (net->type == VIR_DOMAIN_NET_TYPE_NETWORK) {
                     virDomainHostdevDefPtr hostdev = virDomainNetGetActualHostdev(net);
                     virDomainHostdevDefPtr found;
+                    qemuDomainObjPrivatePtr priv = vm->privateData;
                     /* For a network with forward mode='hostdev', there is a need to
                      * add the newly minted hostdev to the hostdevs array.
                      */
-                    if (qemuAssignDeviceHostdevAlias(def, hostdev,
-                                                     (def->nhostdevs-1)) < 0) {
-                        goto error;
-                    }
-
-                    if (virDomainHostdevFind(def, hostdev, &found) < 0) {
-                        if (virDomainHostdevInsert(def, hostdev) < 0) {
-                            virReportOOMError();
+                    if (vmop == VIR_NETDEV_VPORT_PROFILE_OP_CREATE) {
+                        if (qemuAssignDeviceHostdevAlias(def, hostdev,
+                                                         (def->nhostdevs-1)) < 0) {
                             goto error;
                         }
-                        if (qemuPrepareHostdevPCIDevices(driver, def->name, def->uuid,
-                                                         &hostdev, 1) < 0) {
+
+                        if (virDomainHostdevFind(def, hostdev, &found) < 0) {
+                            if (virDomainHostdevInsert(def, hostdev) < 0) {
+                                virReportOOMError();
+                                goto error;
+                            }
+                            if (qemuPrepareHostdevPCIDevices(driver, def->name, def->uuid,
+                                                             &hostdev, 1) < 0) {
+                                goto error;
+                            }
+                        }
+                        else {
+                            virReportError(VIR_ERR_INTERNAL_ERROR,
+                                           _("PCI device %04x:%02x:%02x.%x "
+                                             "allocated from network %s is already "
+                                             "in use by domain %s"),
+                                           hostdev->source.subsys.u.pci.domain,
+                                           hostdev->source.subsys.u.pci.bus,
+                                           hostdev->source.subsys.u.pci.slot,
+                                           hostdev->source.subsys.u.pci.function,
+                                           net->data.network.name,
+                                           def->name);
                             goto error;
                         }
                     }
-                    else {
-                        virReportError(VIR_ERR_INTERNAL_ERROR,
-                                       _("PCI device %04x:%02x:%02x.%x "
-                                         "allocated from network %s is already "
-                                         "in use by domain %s"),
-                                       hostdev->source.subsys.u.pci.domain,
-                                       hostdev->source.subsys.u.pci.bus,
-                                       hostdev->source.subsys.u.pci.slot,
-                                       hostdev->source.subsys.u.pci.function,
-                                       net->data.network.name,
-                                       def->name);
-                        goto error;
+                    else if (vmop == VIR_NETDEV_VPORT_PROFILE_OP_MIGRATE_IN_START) {
+                        /* During migration the hostdev device is hotplugged at
+                         * a later stage hence remove the PCI address collected by
+                         * qemuCollectPCIAddress */
[libvirt] [PATCH 1/5] Added ephemeral flag for hostdev in domain conf.
The ephemeral flag helps support migration with PCI-passthrough. An
ephemeral hostdev is automatically unplugged before migration and
replugged (if one is available on the destination) after migration.
---
 docs/schemas/domaincommon.rng                      |   16 +
 src/conf/domain_conf.c                             |   23 ++-
 src/conf/domain_conf.h                             |    1 +
 src/qemu/qemu_command.c                            |    1 +
 .../qemuxml2argv-hostdev-pci-address.xml           |    2 +-
 .../qemuxml2argv-hostdev-usb-address.xml           |    2 +-
 .../qemuxml2argvdata/qemuxml2argv-net-hostdev.xml  |    2 +-
 tests/qemuxml2argvdata/qemuxml2argv-pci-rom.xml    |    4 +-
 8 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 0e85739..29fc382 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -1694,6 +1694,14 @@
           </choice>
         </attribute>
       </optional>
+      <optional>
+        <attribute name="ephemeral">
+          <choice>
+            <value>yes</value>
+            <value>no</value>
+          </choice>
+        </attribute>
+      </optional>
       <interleave>
         <element name="source">
           <optional>
@@ -2836,6 +2844,14 @@
           </choice>
         </attribute>
       </optional>
+      <optional>
+        <attribute name="ephemeral">
+          <choice>
+            <value>yes</value>
+            <value>no</value>
+          </choice>
+        </attribute>
+      </optional>
       <group>
         <element name="source">
           <optional>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 2ca608f..d0142f7 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -2958,6 +2958,7 @@ virDomainHostdevPartsParse(xmlNodePtr node,
 {
     xmlNodePtr sourcenode;
     char *managed = NULL;
+    char *ephemeral = NULL;
     int ret = -1;
 
     /* @mode is passed in separately from the caller, since an
@@ -2984,6 +2985,16 @@ virDomainHostdevPartsParse(xmlNodePtr node,
             def->managed = 1;
     }
 
+    /* @ephemeral can be read from the xml document - it is always an
+     * attribute of the toplevel element, no matter what type of
+     * element that might be (pure hostdev, or higher level device
+     * (e.g. interface) with type='hostdev')
+     */
+    if ((ephemeral = virXMLPropString(node, "ephemeral"))!= NULL) {
+        if (STREQ(ephemeral,"yes"))
+            def->ephemeral = 1;
+    }
+
     /* @type is passed in from the caller rather than read from the
      * xml document, because it is specified in different places for
      * different kinds of defs - it is an attribute of
@@ -12428,6 +12439,10 @@ virDomainActualNetDefFormat(virBufferPtr buf,
         def->data.hostdev.def.managed) {
         virBufferAddLit(buf, " managed='yes'");
     }
+    if (def->type == VIR_DOMAIN_NET_TYPE_HOSTDEV &&
+        def->data.hostdev.def.ephemeral) {
+        virBufferAddLit(buf, " ephemeral='yes'");
+    }
     virBufferAddLit(buf, ">\n");
 
     virBufferAdjustIndent(buf, 2);
@@ -12498,6 +12513,10 @@ virDomainNetDefFormat(virBufferPtr buf,
         def->data.hostdev.def.managed) {
         virBufferAddLit(buf, " managed='yes'");
     }
+    if (def->type == VIR_DOMAIN_NET_TYPE_HOSTDEV &&
+        def->data.hostdev.def.ephemeral) {
+        virBufferAddLit(buf, " ephemeral='yes'");
+    }
     virBufferAddLit(buf, ">\n");
 
     virBufferAdjustIndent(buf, 6);
@@ -13473,8 +13492,8 @@ virDomainHostdevDefFormat(virBufferPtr buf,
         return -1;
     }
 
-    virBufferAsprintf(buf, "    <hostdev mode='%s' type='%s' managed='%s'>\n",
-                      mode, type, def->managed ? "yes" : "no");
+    virBufferAsprintf(buf, "    <hostdev mode='%s' type='%s' managed='%s' ephemeral='%s'>\n",
+                      mode, type, def->managed ? "yes" : "no", def->ephemeral ? "yes" : "no");
 
     virBufferAdjustIndent(buf, 6);
     if (virDomainHostdevSourceFormat(buf, def, flags, false) < 0)
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 4ab15e9..597480f 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -390,6 +390,7 @@ struct _virDomainHostdevDef {
     int startupPolicy; /* enum virDomainStartupPolicy */
     unsigned int managed : 1;
     unsigned int missing : 1;
+    unsigned int ephemeral : 1;
     union {
         virDomainHostdevSubsys subsys;
         struct {
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 7736575..1b20d23 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -7729,6 +7729,7 @@ qemuParseCommandLinePCI(const char *val)
 
     def->mode = VIR_DOMAIN_HOSTDEV_MODE_SUBSYS;
     def->managed = 1;
+    def->ephemeral = 1;
     def->source.subsys.type = VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI;
     def->source.subsys.u.pci.bus = bus;
     def->source.subsys.u.pci.slot = slot;
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-pci-address.xml
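The parse and format halves of this patch can be read together as a round trip of a yes/no attribute through a one-bit flag. The following is only a minimal sketch of that idea in plain C, not libvirt code: `struct hostdev_flags`, `parse_yes_no` and `format_hostdev_open` are hypothetical names, and `strcmp`/`snprintf` stand in for the `STREQ` and `virBufferAsprintf` helpers used in the diff.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the two bitfields of virDomainHostdevDef. */
struct hostdev_flags {
    unsigned int managed : 1;
    unsigned int ephemeral : 1;
};

/* Returns 1 only for the exact string "yes". A missing attribute (NULL)
 * or any other value leaves the flag off, like the STREQ check above. */
static unsigned int
parse_yes_no(const char *value)
{
    return (value != NULL && strcmp(value, "yes") == 0) ? 1 : 0;
}

/* Writes the opening <hostdev> tag into out and returns snprintf's result. */
static int
format_hostdev_open(char *out, size_t outlen, const char *mode,
                    const char *type, const struct hostdev_flags *flags)
{
    assert(out != NULL && flags != NULL);
    return snprintf(out, outlen,
                    "<hostdev mode='%s' type='%s' managed='%s' ephemeral='%s'>",
                    mode, type,
                    flags->managed ? "yes" : "no",
                    flags->ephemeral ? "yes" : "no");
}
```

Because the comparison is case-sensitive, a value such as "YES" is treated as "no" in this sketch, which matches the STREQ usage in the domain conf hunk (the network conf patch uses STRCASEEQ instead).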
[libvirt] [PATCH 5/5] Migration support for ephemeral hostdevs.
---
 src/qemu/qemu_migration.c |   94 +++-
 1 files changed, 91 insertions(+), 3 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index d52ec59..dd1a2a7 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -32,6 +32,7 @@
 #include "qemu_monitor.h"
 #include "qemu_domain.h"
 #include "qemu_process.h"
+#include "qemu_hotplug.h"
 #include "qemu_capabilities.h"
 #include "qemu_cgroup.h"
 
@@ -50,6 +51,7 @@
 #include "storage_file.h"
 #include "viruri.h"
 #include "hooks.h"
+#include "network/bridge_driver.h"
 
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
@@ -149,6 +151,79 @@ struct _qemuMigrationCookie {
     qemuMigrationCookieNetworkPtr network;
 };
 
+static void
+qemuMigrationRemoveEphemeralDevices(struct qemud_driver *driver,
+                                    virDomainObjPtr vm)
+{
+    virDomainHostdevDefPtr dev;
+    virDomainDeviceDef def;
+    unsigned int i;
+
+    for (i = 0; i < vm->def->nhostdevs; i++) {
+        dev = vm->def->hostdevs[i];
+        if (dev->ephemeral == 1) {
+            def.type = VIR_DOMAIN_DEVICE_HOSTDEV;
+            def.data.hostdev = dev;
+
+            if (qemuDomainDetachHostDevice(driver, vm, &def) >= 0) {
+                continue; /* nhostdevs reduced */
+            }
+        }
+    }
+}
+
+static void
+qemuMigrationRestoreEphemeralDevices(struct qemud_driver *driver,
+                                     virDomainObjPtr vm)
+{
+    virDomainNetDefPtr net;
+    unsigned int i;
+
+    /* Do nothing if ephemeral devices are present in which case this
+       function was called before qemuMigrationRemoveEphemeralDevices */
+
+    for (i = 0; i < vm->def->nhostdevs; i++) {
+        if (vm->def->hostdevs[i]->ephemeral == 1)
+            return;
+    }
+
+    for (i = 0; i < vm->def->nnets; i++) {
+        net = vm->def->nets[i];
+
+        if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
+            if (qemuDomainAttachHostDevice(driver, vm,
+                                           virDomainNetGetActualHostdev(net)) < 0) {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("Hostdev cannot be restored"));
+                networkReleaseActualDevice(net);
+            }
+        }
+        return;
+    }
+}
+
+static void
+qemuMigrationAttachEphemeralDevices(struct qemud_driver *driver,
+                                    virDomainObjPtr vm)
+{
+    virDomainNetDefPtr net;
+    unsigned int i;
+
+    for (i = 0; i < vm->def->nnets; i++) {
+        net = vm->def->nets[i];
+
+        if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
+            if (qemuDomainAttachHostDevice(driver, vm,
+                                           virDomainNetGetActualHostdev(net)) < 0) {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("Hostdev cannot be attached after migration"));
+                networkReleaseActualDevice(net);
+            }
+        }
+    }
+    return;
+}
+
 static void qemuMigrationCookieGraphicsFree(qemuMigrationCookieGraphicsPtr grap)
 {
     if (!grap)
@@ -1041,21 +1116,22 @@ qemuMigrationIsAllowed(struct qemud_driver *driver, virDomainObjPtr vm,
         def = vm->def;
     }
 
-    /* Migration with USB host devices is allowed, all other devices are
+    /* Migration with USB and ephemeral PCI host devices is allowed, all other devices are
      * forbidden.
      */
     forbid = false;
     for (i = 0; i < def->nhostdevs; i++) {
         virDomainHostdevDefPtr hostdev = def->hostdevs[i];
         if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS ||
-            hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB) {
+            ((hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB) &&
+             (hostdev->ephemeral == 0))) {
             forbid = true;
             break;
         }
     }
     if (forbid) {
         virReportError(VIR_ERR_OPERATION_INVALID, "%s",
-                       _("Domain with assigned non-USB host devices "
+                       _("Domain with assigned non-USB and non-ephemeral host devices "
                          "cannot be migrated"));
         return false;
     }
@@ -2347,6 +2423,7 @@ static int doNativeMigrate(struct qemud_driver *driver,
               "cookieout=%p, cookieoutlen=%p, flags=%lx, resource=%lu",
               driver, vm, uri, NULLSTR(cookiein), cookieinlen,
               cookieout, cookieoutlen, flags, resource);
+    qemuMigrationRemoveEphemeralDevices(driver, vm);
 
     if (STRPREFIX(uri, "tcp:") && !STRPREFIX(uri, "tcp://")) {
         char *tmp;
@@ -2374,6 +2451,9 @@ static int doNativeMigrate(struct qemud_driver *driver,
 
     ret = qemuMigrationRun(driver, vm, cookiein, cookieinlen, cookieout,
                            cookieoutlen, flags, resource, &spec, dconn);
 
+    if (ret != 0)
+        qemuMigrationRestoreEphemeralDevices(driver, vm);
+
     if (spec.destType == MIGRATION_DEST_FD)
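One point worth checking in qemuMigrationRemoveEphemeralDevices is the loop index. The "nhostdevs reduced" comment suggests that a successful detach removes the entry from the hostdevs array; if later entries are then shifted down (an assumption here, not something this patch shows), incrementing `i` after a removal would skip the device that moved into slot `i`. The sketch below illustrates an index-safe variant on a toy array. All `toy_` names are hypothetical and none of this is libvirt API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified device record; not virDomainHostdevDef. */
struct toy_hostdev {
    int id;
    unsigned int ephemeral : 1;
};

struct toy_domain {
    struct toy_hostdev devs[8];
    size_t ndevs;
};

/* Toy detach: removes entry idx and shifts later entries down by one. */
static void
toy_detach(struct toy_domain *dom, size_t idx)
{
    assert(dom != NULL && idx < dom->ndevs);
    memmove(&dom->devs[idx], &dom->devs[idx + 1],
            (dom->ndevs - idx - 1) * sizeof(dom->devs[0]));
    dom->ndevs--;
}

/* Detaches every ephemeral device and returns how many were removed. */
static size_t
toy_remove_ephemeral(struct toy_domain *dom)
{
    size_t removed = 0;
    size_t i = 0;

    assert(dom != NULL);
    while (i < dom->ndevs) {
        if (dom->devs[i].ephemeral) {
            toy_detach(dom, i);   /* slot i now holds the next device */
            removed++;
        } else {
            i++;                  /* advance only when nothing was removed */
        }
    }
    return removed;
}
```

With two adjacent ephemeral devices, a plain `for (...; i++)` loop over the same toy array would leave the second one attached, whereas this variant removes both.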
[libvirt] [PATCH 3/5] Ephemeral flag modification within the network driver.
---
 src/network/bridge_driver.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index 0e38016..61d5b13 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -3456,6 +3456,7 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
             iface->data.network.actual->data.hostdev.def.info = &iface->info;
             iface->data.network.actual->data.hostdev.def.mode = VIR_DOMAIN_HOSTDEV_MODE_SUBSYS;
             iface->data.network.actual->data.hostdev.def.managed = netdef->managed;
+            iface->data.network.actual->data.hostdev.def.ephemeral = netdef->ephemeral;
             iface->data.network.actual->data.hostdev.def.source.subsys.type = dev->type;
             iface->data.network.actual->data.hostdev.def.source.subsys.u.pci = dev->device.pci;
 
-- 
1.7.4.4
[libvirt] [PATCH 1/5] Added ephemeral flag for hostdev in domain conf.
The ephemeral flag helps support migration with PCI-passthrough. An
ephemeral hostdev is automatically unplugged before migration and
replugged (if one is available on the destination) after migration.
---
 docs/schemas/domaincommon.rng                      |   16 +
 src/conf/domain_conf.c                             |   23 ++-
 src/conf/domain_conf.h                             |    1 +
 src/qemu/qemu_command.c                            |    1 +
 .../qemuxml2argv-hostdev-pci-address.xml           |    2 +-
 .../qemuxml2argv-hostdev-usb-address.xml           |    2 +-
 .../qemuxml2argvdata/qemuxml2argv-net-hostdev.xml  |    2 +-
 tests/qemuxml2argvdata/qemuxml2argv-pci-rom.xml    |    4 +-
 8 files changed, 44 insertions(+), 7 deletions(-)

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index aafb10c..349f22b 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -1644,6 +1644,14 @@
           </choice>
         </attribute>
       </optional>
+      <optional>
+        <attribute name="ephemeral">
+          <choice>
+            <value>yes</value>
+            <value>no</value>
+          </choice>
+        </attribute>
+      </optional>
       <interleave>
         <element name="source">
           <choice>
@@ -2750,6 +2758,14 @@
           </choice>
         </attribute>
       </optional>
+      <optional>
+        <attribute name="ephemeral">
+          <choice>
+            <value>yes</value>
+            <value>no</value>
+          </choice>
+        </attribute>
+      </optional>
       <group>
         <element name="source">
           <choice>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index b8ba0e2..0b6332a 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -2888,6 +2888,7 @@ virDomainHostdevPartsParse(xmlNodePtr node,
 {
     xmlNodePtr sourcenode;
     char *managed = NULL;
+    char *ephemeral = NULL;
     int ret = -1;
 
     /* @mode is passed in separately from the caller, since an
@@ -2914,6 +2915,16 @@ virDomainHostdevPartsParse(xmlNodePtr node,
             def->managed = 1;
     }
 
+    /* @ephemeral can be read from the xml document - it is always an
+     * attribute of the toplevel element, no matter what type of
+     * element that might be (pure hostdev, or higher level device
+     * (e.g. interface) with type='hostdev')
+     */
+    if ((ephemeral = virXMLPropString(node, "ephemeral"))!= NULL) {
+        if (STREQ(ephemeral,"yes"))
+            def->ephemeral = 1;
+    }
+
     /* @type is passed in from the caller rather than read from the
      * xml document, because it is specified in different places for
      * different kinds of defs - it is an attribute of
@@ -12027,6 +12038,10 @@ virDomainActualNetDefFormat(virBufferPtr buf,
         def->data.hostdev.def.managed) {
         virBufferAddLit(buf, " managed='yes'");
     }
+    if (def->type == VIR_DOMAIN_NET_TYPE_HOSTDEV &&
+        def->data.hostdev.def.ephemeral) {
+        virBufferAddLit(buf, " ephemeral='yes'");
+    }
     virBufferAddLit(buf, ">\n");
 
     virBufferAdjustIndent(buf, 2);
@@ -12097,6 +12112,10 @@ virDomainNetDefFormat(virBufferPtr buf,
         def->data.hostdev.def.managed) {
         virBufferAddLit(buf, " managed='yes'");
     }
+    if (def->type == VIR_DOMAIN_NET_TYPE_HOSTDEV &&
+        def->data.hostdev.def.ephemeral) {
+        virBufferAddLit(buf, " ephemeral='yes'");
+    }
     virBufferAddLit(buf, ">\n");
 
     virBufferAdjustIndent(buf, 6);
@@ -13063,8 +13082,8 @@ virDomainHostdevDefFormat(virBufferPtr buf,
         return -1;
     }
 
-    virBufferAsprintf(buf, "    <hostdev mode='%s' type='%s' managed='%s'>\n",
-                      mode, type, def->managed ? "yes" : "no");
+    virBufferAsprintf(buf, "    <hostdev mode='%s' type='%s' managed='%s' ephemeral='%s'>\n",
+                      mode, type, def->managed ? "yes" : "no", def->ephemeral ? "yes" : "no");
 
     virBufferAdjustIndent(buf, 6);
     if (virDomainHostdevSourceFormat(buf, def, flags, false) < 0)
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index f0dea48..8263711 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -385,6 +385,7 @@ struct _virDomainHostdevDef {
     virDomainDeviceDef parent; /* higher level Def containing this */
     int mode; /* enum virDomainHostdevMode */
     unsigned int managed : 1;
+    unsigned int ephemeral : 1;
     union {
         virDomainHostdevSubsys subsys;
         struct {
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index cbf4aee..b7fbe75 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -7410,6 +7410,7 @@ qemuParseCommandLinePCI(const char *val)
 
     def->mode = VIR_DOMAIN_HOSTDEV_MODE_SUBSYS;
     def->managed = 1;
+    def->ephemeral = 1;
     def->source.subsys.type = VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI;
     def->source.subsys.u.pci.bus = bus;
     def->source.subsys.u.pci.slot = slot;
diff --git
[libvirt] [PATCH 2/5] Adding ephemeral flag for hostdev in network conf.
The ephemeral flag helps support migration with PCI-passthrough. An
ephemeral hostdev is automatically unplugged before migration and
replugged (if one is available on the destination) after migration.
---
 docs/schemas/network.rng               |    8
 src/conf/network_conf.c                |   11 +++
 src/conf/network_conf.h                |    1 +
 tests/networkxml2xmlin/hostdev-pf.xml  |    2 +-
 tests/networkxml2xmlin/hostdev.xml     |    2 +-
 tests/networkxml2xmlout/hostdev-pf.xml |    2 +-
 tests/networkxml2xmlout/hostdev.xml    |    2 +-
 7 files changed, 24 insertions(+), 4 deletions(-)

diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng
index 4abfd91..c86ade8 100644
--- a/docs/schemas/network.rng
+++ b/docs/schemas/network.rng
@@ -100,6 +100,14 @@
               </choice>
             </attribute>
           </optional>
+          <optional>
+            <attribute name="ephemeral">
+              <choice>
+                <value>yes</value>
+                <value>no</value>
+              </choice>
+            </attribute>
+          </optional>
           <interleave>
             <choice>
               <group>
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c
index db398ae..7168c66 100644
--- a/src/conf/network_conf.c
+++ b/src/conf/network_conf.c
@@ -1206,6 +1206,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt)
     int nIps, nPortGroups, nForwardIfs, nForwardPfs, nForwardAddrs;
     char *forwardDev = NULL;
     char *forwardManaged = NULL;
+    char *forwardEphemeral = NULL;
     char *type = NULL;
     xmlNodePtr save = ctxt->node;
     xmlNodePtr bandwidthNode = NULL;
@@ -1364,6 +1365,11 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt)
             if (STRCASEEQ(forwardManaged, "yes"))
                 def->managed = 1;
         }
+        forwardEphemeral = virXPathString("string(./@ephemeral)", ctxt);
+        if(forwardEphemeral != NULL) {
+            if (STRCASEEQ(forwardEphemeral, "yes"))
+                def->ephemeral = 1;
+        }
 
         /* all of these modes can use a pool of physical interfaces */
         nForwardIfs = virXPathNodeSet("./interface", ctxt, &forwardIfNodes);
@@ -1511,6 +1517,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt)
         VIR_FREE(type);
         VIR_FREE(forwardDev);
         VIR_FREE(forwardManaged);
+        VIR_FREE(forwardEphemeral);
         VIR_FREE(forwardPfNodes);
         VIR_FREE(forwardIfNodes);
         VIR_FREE(forwardAddrNodes);
@@ -1845,6 +1852,10 @@ char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags)
                 virBufferAddLit(&buf, " managed='yes'");
             else
                 virBufferAddLit(&buf, " managed='no'");
+            if (def->ephemeral == 1)
+                virBufferAddLit(&buf, " ephemeral='yes'");
+            else
+                virBufferAddLit(&buf, " ephemeral='no'");
         }
         virBufferAsprintf(&buf, "%s>\n",
                           (def->nForwardIfs || def->nForwardPfs) ? "" : "/");
diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h
index c8ed2ea..3a65772 100644
--- a/src/conf/network_conf.h
+++ b/src/conf/network_conf.h
@@ -186,6 +186,7 @@ struct _virNetworkDef {
 
     int forwardType;    /* One of virNetworkForwardType constants */
     int managed;        /* managed attribute for hostdev mode */
+    int ephemeral;      /* ephemeral attribute for hostdev mode */
 
     /* If there are multiple forward devices (i.e. a pool of
      * interfaces), they will be listed here.
diff --git a/tests/networkxml2xmlin/hostdev-pf.xml b/tests/networkxml2xmlin/hostdev-pf.xml
index 7bf857d..6b928a6 100644
--- a/tests/networkxml2xmlin/hostdev-pf.xml
+++ b/tests/networkxml2xmlin/hostdev-pf.xml
@@ -1,7 +1,7 @@
 <network>
   <name>hostdev</name>
   <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid>
-  <forward mode='hostdev' managed='yes'>
+  <forward mode='hostdev' managed='yes' ephemeral='yes'>
     <pf dev='eth2'/>
   </forward>
 </network>
diff --git a/tests/networkxml2xmlin/hostdev.xml b/tests/networkxml2xmlin/hostdev.xml
index 03f1411..406c2df 100644
--- a/tests/networkxml2xmlin/hostdev.xml
+++ b/tests/networkxml2xmlin/hostdev.xml
@@ -1,7 +1,7 @@
 <network>
   <name>hostdev</name>
   <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid>
-  <forward mode='hostdev' managed='yes'>
+  <forward mode='hostdev' managed='yes' ephemeral='yes'>
     <address type='pci' domain='0x' bus='0x03' slot='0x00' function='0x1'/>
     <address type='pci' domain='0x' bus='0x03' slot='0x00' function='0x2'/>
     <address type='pci' domain='0x' bus='0x03' slot='0x00' function='0x3'/>
diff --git a/tests/networkxml2xmlout/hostdev-pf.xml b/tests/networkxml2xmlout/hostdev-pf.xml
index 7bf857d..6b928a6 100644
--- a/tests/networkxml2xmlout/hostdev-pf.xml
+++ b/tests/networkxml2xmlout/hostdev-pf.xml
@@ -1,7 +1,7 @@
 <network>
   <name>hostdev</name>
   <uuid>81ff0d90-c91e-6742-64da-4a736edb9a9b</uuid>
-  <forward mode='hostdev' managed='yes'>
+  <forward mode='hostdev' managed='yes' ephemeral='yes'>
     <pf
[libvirt] [PATCH 5/5] Migration support for ephemeral hostdevs.
---
 src/qemu/qemu_migration.c |   98 +++--
 1 files changed, 94 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 99fc8ce..f3414b0 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -31,6 +31,7 @@
 #include "qemu_monitor.h"
 #include "qemu_domain.h"
 #include "qemu_process.h"
+#include "qemu_hotplug.h"
 #include "qemu_capabilities.h"
 #include "qemu_cgroup.h"
 
@@ -49,6 +50,7 @@
 #include "storage_file.h"
 #include "viruri.h"
 #include "hooks.h"
+#include "network/bridge_driver.h"
 
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
@@ -122,6 +124,79 @@ struct _qemuMigrationCookie {
     virDomainDefPtr persistent;
 };
 
+static void
+qemuMigrationRemoveEphemeralDevices(struct qemud_driver *driver,
+                                    virDomainObjPtr vm)
+{
+    virDomainHostdevDefPtr dev;
+    virDomainDeviceDef def;
+    unsigned int i;
+
+    for (i = 0; i < vm->def->nhostdevs; i++) {
+        dev = vm->def->hostdevs[i];
+        if (dev->ephemeral == 1) {
+            def.type = VIR_DOMAIN_DEVICE_HOSTDEV;
+            def.data.hostdev = dev;
+
+            if (qemuDomainDetachHostDevice(driver, vm, &def) >= 0) {
+                continue; /* nhostdevs reduced */
+            }
+        }
+    }
+}
+
+static void
+qemuMigrationRestoreEphemeralDevices(struct qemud_driver *driver,
+                                     virDomainObjPtr vm)
+{
+    virDomainNetDefPtr net;
+    unsigned int i;
+
+    /* Do nothing if ephemeral devices are present in which case this
+       function was called before qemuMigrationRemoveEphemeralDevices */
+
+    for (i = 0; i < vm->def->nhostdevs; i++) {
+        if (vm->def->hostdevs[i]->ephemeral == 1)
+            return;
+    }
+
+    for (i = 0; i < vm->def->nnets; i++) {
+        net = vm->def->nets[i];
+
+        if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
+            if (qemuDomainAttachHostDevice(driver, vm,
+                                           virDomainNetGetActualHostdev(net)) < 0) {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("Hostdev cannot be restored"));
+                networkReleaseActualDevice(net);
+            }
+        }
+        return;
+    }
+}
+
+static void
+qemuMigrationAttachEphemeralDevices(struct qemud_driver *driver,
+                                    virDomainObjPtr vm)
+{
+    virDomainNetDefPtr net;
+    unsigned int i;
+
+    for (i = 0; i < vm->def->nnets; i++) {
+        net = vm->def->nets[i];
+
+        if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
+            if (qemuDomainAttachHostDevice(driver, vm,
+                                           virDomainNetGetActualHostdev(net)) < 0) {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("Hostdev cannot be attached after migration"));
+                networkReleaseActualDevice(net);
+            }
+        }
+    }
+    return;
+}
+
 static void qemuMigrationCookieGraphicsFree(qemuMigrationCookieGraphicsPtr grap)
 {
     if (!grap)
@@ -800,6 +875,7 @@ qemuMigrationIsAllowed(struct qemud_driver *driver, virDomainObjPtr vm,
                        virDomainDefPtr def)
 {
     int nsnapshots;
+    unsigned int i;
 
     if (vm) {
         if (qemuProcessAutoDestroyActive(driver, vm)) {
@@ -817,10 +893,12 @@ qemuMigrationIsAllowed(struct qemud_driver *driver, virDomainObjPtr vm,
         def = vm->def;
     }
 
-    if (def->nhostdevs > 0) {
-        virReportError(VIR_ERR_OPERATION_INVALID,
-                       "%s", _("Domain with assigned host devices cannot be migrated"));
-        return false;
+    for (i = 0; i < def->nhostdevs; i++) {
+        if (def->hostdevs[i]->ephemeral == 0) {
+            virReportError(VIR_ERR_OPERATION_INVALID,
+                           "%s", _("Domain with assigned non-ephemeral host devices cannot be migrated"));
+            return false;
+        }
     }
 
     return true;
@@ -2042,6 +2120,7 @@ static int doNativeMigrate(struct qemud_driver *driver,
               "cookieout=%p, cookieoutlen=%p, flags=%lx, resource=%lu",
               driver, vm, uri, NULLSTR(cookiein), cookieinlen,
               cookieout, cookieoutlen, flags, resource);
+    qemuMigrationRemoveEphemeralDevices(driver, vm);
 
     if (STRPREFIX(uri, "tcp:") && !STRPREFIX(uri, "tcp://")) {
         char *tmp;
@@ -2069,6 +2148,9 @@ static int doNativeMigrate(struct qemud_driver *driver,
 
     ret = qemuMigrationRun(driver, vm, cookiein, cookieinlen, cookieout,
                            cookieoutlen, flags, resource, &spec, dconn);
 
+    if (ret != 0)
+        qemuMigrationRestoreEphemeralDevices(driver, vm);
+
     if (spec.destType == MIGRATION_DEST_FD)
         VIR_FORCE_CLOSE(spec.dest.fd.qemu);
@@ -2107,6 +2189,8 @@ static int doTunnelMigrate(struct qemud_driver *driver,
         return -1;
     }
 
+    qemuMigrationRemoveEphemeralDevices(driver, vm);
+
     spec.fwdType =
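The reworked check in qemuMigrationIsAllowed reduces to a simple rule: migration is refused as soon as one assigned host device is not marked ephemeral, and a domain with no host devices at all is still allowed. A minimal stand-alone sketch of that predicate follows; `struct toy_hostdev_flag` and `toy_migration_allowed` are hypothetical names used only for illustration and do not exist in libvirt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-field record holding only the ephemeral bit. */
struct toy_hostdev_flag {
    unsigned int ephemeral : 1;
};

/* Returns true when every device is ephemeral (vacuously true for none). */
static bool
toy_migration_allowed(const struct toy_hostdev_flag *devs, size_t ndevs)
{
    size_t i;

    assert(devs != NULL || ndevs == 0);
    for (i = 0; i < ndevs; i++) {
        if (!devs[i].ephemeral)
            return false;   /* one non-ephemeral device blocks migration */
    }
    return true;
}
```

Note that, unlike the earlier revision of this patch, this form of the rule makes no exception for USB host devices; every hostdev has to carry the flag.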
[libvirt] [PATCH 4/5] Ephemeral flag modification within the qemu driver.
When a guest with ephemeral device is migrated the PCI- passthrough of the ephemeral device should take place after migration and hence we check for the vmop in qemuBuildCommandLine. --- src/qemu/qemu_command.c | 48 -- 1 files changed, 25 insertions(+), 23 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index b7fbe75..e585ab1 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -5343,34 +5343,36 @@ qemuBuildCommandLine(virConnectPtr conn, /* For a network with forward mode='hostdev', there is a need to * add the newly minted hostdev to the hostdevs array. */ -if (qemuAssignDeviceHostdevAlias(def, hostdev, - (def-nhostdevs-1)) 0) { -goto error; -} - -if (virDomainHostdevFind(def, hostdev, found) 0) { -if (virDomainHostdevInsert(def, hostdev) 0) { -virReportOOMError(); +if (vmop == VIR_NETDEV_VPORT_PROFILE_OP_CREATE) { +if (qemuAssignDeviceHostdevAlias(def, hostdev, + (def-nhostdevs-1)) 0) { goto error; } -if (qemuPrepareHostdevPCIDevices(driver, def-name, def-uuid, - hostdev, 1) 0) { + +if (virDomainHostdevFind(def, hostdev, found) 0) { +if (virDomainHostdevInsert(def, hostdev) 0) { +virReportOOMError(); +goto error; +} +if (qemuPrepareHostdevPCIDevices(driver, def-name, def-uuid, + hostdev, 1) 0) { +goto error; +} +} +else { +virReportError(VIR_ERR_INTERNAL_ERROR, + _(PCI device %04x:%02x:%02x.%x + allocated from network %s is already + in use by domain %s), + hostdev-source.subsys.u.pci.domain, + hostdev-source.subsys.u.pci.bus, + hostdev-source.subsys.u.pci.slot, + hostdev-source.subsys.u.pci.function, + net-data.network.name, + def-name); goto error; } } -else { -virReportError(VIR_ERR_INTERNAL_ERROR, - _(PCI device %04x:%02x:%02x.%x - allocated from network %s is already - in use by domain %s), - hostdev-source.subsys.u.pci.domain, - hostdev-source.subsys.u.pci.bus, - hostdev-source.subsys.u.pci.slot, - hostdev-source.subsys.u.pci.function, - net-data.network.name, - def-name); -goto error; -} } continue; } -- 
1.7.4.4
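
For readers following the thread, the control flow that patch 4/5 describes (only add a network-allocated hostdev to the domain on a fresh start, and reject a device that is already present) can be sketched outside libvirt roughly as follows. This is a minimal model, not the patch itself: `vm_op`, `struct domain`, `add_network_hostdev` and the result codes are hypothetical stand-ins invented for illustration, and the real code operates on libvirt's own types.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; not libvirt API. */
enum vm_op { OP_CREATE, OP_MIGRATE_IN_START };

struct pci_addr { unsigned domain, bus, slot, function; };

#define MAX_HOSTDEVS 8

struct domain {
    struct pci_addr hostdevs[MAX_HOSTDEVS];
    size_t nhostdevs;
};

enum add_result {
    ADD_OK = 0,        /* device appended to the domain's hostdev list */
    ADD_DEFERRED = 1,  /* not a fresh start: attach later, after migration */
    ADD_IN_USE = -1,   /* the same PCI address is already in the list */
    ADD_NO_SPACE = -2  /* the fixed-size list in this sketch is full */
};

bool pci_addr_equal(const struct pci_addr *a, const struct pci_addr *b)
{
    return a->domain == b->domain && a->bus == b->bus &&
           a->slot == b->slot && a->function == b->function;
}

/* Add a network-allocated hostdev, but only when the guest is being
 * created; for any other operation the caller is expected to hotplug
 * the device once migration has finished. */
enum add_result add_network_hostdev(struct domain *dom,
                                    const struct pci_addr *dev,
                                    enum vm_op op)
{
    size_t i;

    if (op != OP_CREATE)
        return ADD_DEFERRED;

    for (i = 0; i < dom->nhostdevs; i++) {
        if (pci_addr_equal(&dom->hostdevs[i], dev))
            return ADD_IN_USE;
    }

    if (dom->nhostdevs == MAX_HOSTDEVS)
        return ADD_NO_SPACE;

    dom->hostdevs[dom->nhostdevs++] = *dev;
    return ADD_OK;
}
```

In this model an incoming migration leaves the hostdev list untouched, which mirrors the intent stated in the commit message that passthrough of the ephemeral device happens after migration.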
[libvirt] [PATCH 0/5] Support ephemeral passthrough hostdevs
The ephemeral flag helps support migration with PCI-passthrough. An
ephemeral hostdev is automatically unplugged before migration and
replugged (if one is available on the destination) after migration.

Shradha Shah (5):
  Added ephemeral flag for hostdev in domain conf.
  Adding ephemeral flag for hostdev in network conf.
  Ephemeral flag modification within the network driver.
  Ephemeral flag modification within the qemu driver.
  Migration support for ephemeral hostdevs.

 docs/schemas/domaincommon.rng                     | 16
 docs/schemas/network.rng                          |  8
 src/conf/domain_conf.c                            | 23
 src/conf/domain_conf.h                            |  1 +
 src/conf/network_conf.c                           | 11
 src/conf/network_conf.h                           |  1 +
 src/network/bridge_driver.c                       |  1 +
 src/qemu/qemu_command.c                           | 49
 src/qemu/qemu_migration.c                         | 98
 tests/networkxml2xmlin/hostdev-pf.xml             |  2 +-
 tests/networkxml2xmlin/hostdev.xml                |  2 +-
 tests/networkxml2xmlout/hostdev-pf.xml            |  2 +-
 tests/networkxml2xmlout/hostdev.xml               |  2 +-
 .../qemuxml2argv-hostdev-pci-address.xml          |  2 +-
 .../qemuxml2argv-hostdev-usb-address.xml          |  2 +-
 .../qemuxml2argvdata/qemuxml2argv-net-hostdev.xml |  2 +-
 tests/qemuxml2argvdata/qemuxml2argv-pci-rom.xml   |  4 +-
 17 files changed, 188 insertions(+), 38 deletions(-)

-- 
1.7.4.4
Re: [libvirt] [PATCH 0/8] Hostdev-hybrid patches
On 09/14/2012 12:05 PM, Laine Stump wrote:
> On 09/13/2012 06:16 AM, Shradha Shah wrote:
>> On 09/11/2012 08:07 PM, Laine Stump wrote:
>>> If so, one issue I have is that one of the devices (the
>>> pci-passthrough?) doesn't have its guest-side PCI address visible
>>> anywhere in the guest's XML, does it? This is problematic, because
>>> management applications (and libvirt itself) expect to be able to
>>> scan the list of devices to learn what PCI slots are occupied on the
>>> guest, and where they can add new devices.
>>
>> Actually the guest PCI address of the pci-passthrough device, i.e. the
>> VF, is visible in the guest's XML when the guest is running. The VF
>> will be plugged into the guest only when the guest is running or when
>> the guest is not being migrated, hence will be visible in the guest
>> XML.
>
> But there's only a place for one guest-side PCI address in each device
> element. Where is it showing up?

The guest PCI address of the pci-passthrough device is visible in the
<hostdev> device element and that of the virtio-net is visible in the
<interface> device element.

> Also, we really need for the same PCI address to be used each time the
> device is attached; although there may not appear to be a need for
> that now, past experience has shown that changing the PCI slot of a
> device over time inevitably leads to a problem somewhere with
> something :-/
>
> Due to this, the general complexity of what's being done vs. what's
> being added, and also time/review bandwidth constraints I think that
> at least for this release we can't take the full hostdev-hybrid device
> patchset (really I think it will need to be re-thought and probably a
> different approach taken for specifying the two devices).
>
> However, the ephemeral attribute for <hostdev>, <interface
> type='hostdev'> and <forward mode='hostdev'> is fairly straightforward
> and provides generally useful new functionality (especially if it is
> expanded as mentioned to work for save/restore as well as migration) -
> with just this part of your patch, we can still get all of the desired
> functionality at the level of libvirt XML (with the two limitations of
> 1) networks limited to a single PF, and 2) duplicated mac addresses
> must be manually specified).
>
> How difficult would it be to create a patch with just that part of the
> functionality (plus the additional save/restore tweak)? If you could
> do that before the freeze on Tuesday AM we might be able to get it
> into 0.10.2 (which will be what Fedora 18 is based on).

I can provide the required patches over the weekend.
Re: [libvirt] [PATCH 0/8] Hostdev-hybrid patches
Laine,

Please find my comments inline.

Regards,
Shradha Shah

On 09/11/2012 08:07 PM, Laine Stump wrote:
> On 09/07/2012 12:12 PM, Shradha Shah wrote:
>> This patch series adds the support for interface-type=hostdev-hybrid
>> and forward mode=hostdev-hybrid.
>>
>> The hostdev-hybrid mode makes migration possible along with
>> PCI-passthrough. I had posted a RFC on the hostdev-hybrid methodology
>> earlier on the libvirt mailing list.
>>
>> The RFC can be found here:
>> https://www.redhat.com/archives/libvir-list/2012-February/msg00309.html
>
> Before anything else, let me outline what I *think* happens with a
> hostdev-hybrid device entry, and you can tell me how far off I am :-):
>
> * Any hostdev-hybrid interface definition results in 2 PCI devices
>   being added to the guest:
>
>   a) a PCI passthrough of an SR-IOV VF (done essentially the same as
>      <interface type='hostdev'>)
>   b) a virtio-net device which is connected via macvtap bridge mode
>      (? is that always the case) to the PF of the VF in (a)

Yes, the virtio-net device is always connected via macvtap bridge mode.

> * Both of these devices are assigned the same MAC address.

Correct.

> * Each of these occupies one PCI address on the guest, so a total of 2
>   PCI addresses is needed for each hostdev-hybrid device. (The
>   redundancy in this statement is to be sure that I'm right, as that's
>   an important point :-)

Yes, a total of 2 PCI addresses are needed for each hostdev-hybrid
device.

> * On the guest, these two network devices with matching MAC addresses
>   are put together into a bond interface, with an extra driver that
>   causes the bond to prefer the pci-passthrough device when it is
>   present. So, under normal circumstances *all* traffic goes through
>   the pci-passthrough device.

Correct.

> * At migration time, since guests with attached pci-passthrough
>   devices can't be migrated, the pci-passthrough device (which is
>   found by searching the hostdev array for items with the ephemeral
>   flag set) is detached. This reduces the bond interface on the guest
>   to only having the virtio-net device, so traffic now passes through
>   that device - it's slower, but connectivity is maintained.

Correct.

> * on the destination, a new VF is found, setup with proper MAC
>   address, VLAN, and 802.1QbX port info. A virtio-net device attached
>   to the PF associated with this VF (via macvtap bridge mode) is also
>   setup. The qemu commandline includes an entry for both of these
>   devices. (Question: Is it the virtio-net device that uses the guest
>   PCI address given in the interface device info?)

Yes, the virtio-net device gets the guest PCI address given in the
interface device info. The VF gets a separate guest PCI address.

>   (Question: actually, I guess the pci-passthrough device won't be
>   attached until after the guest actually starts running on the
>   destination host, correct?)

The VF is not attached until the migration completes because during
migration qemu checks for PCI-passthrough devices on the source and on
the destination.

> * When migration is finished, the guest is shut down on the source and
>   started up on the destination, leaving the new instance of the guest
>   temporarily with just a single (virtio-net) device in the bond.

Yes, the VF is hotplugged into the guest in the qemuMigrationFinish
stage.

> * Finally, the pci-passthrough of the VF is attached to the guest, and
>   the guest's bond interface resumes preferring this device, thus
>   restoring full speed networking.
>
> Is that all correct?
>
> If so, one issue I have is that one of the devices (the
> pci-passthrough?) doesn't have its guest-side PCI address visible
> anywhere in the guest's XML, does it? This is problematic, because
> management applications (and libvirt itself) expect to be able to scan
> the list of devices to learn what PCI slots are occupied on the guest,
> and where they can add new devices.

Actually the guest PCI address of the pci-passthrough device, i.e. the
VF, is visible in the guest's XML when the guest is running. The VF will
be plugged into the guest only when the guest is running or when the
guest is not being migrated, hence will be visible in the guest XML.

i.e. at any moment, if a VF is plugged into the guest, its guest address
will be visible in the guest XML and the PCI slot will be marked as
being used. When the VF is hot-unplugged the PCI slot is free and hence
can be used for some other device by management applications (libvirt).

> I have other questions beyond that, but either don't understand the
> code enough yet to verbalize them, or will ask them next to the
> associated code.
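
The detach/reattach sequence confirmed in the exchange above can be sketched as a small state model. This is only an illustration under stated assumptions: `struct hostdev`, `prepare_migration` and `finish_migration` are hypothetical names, the `free_vfs` count stands in for "if one is available on the destination", and none of it is the code from the patch series.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a guest's passthrough devices; not libvirt types. */
struct hostdev {
    bool ephemeral;  /* may be unplugged automatically for migration */
    bool attached;   /* currently plugged into the guest */
};

/* Source side, before migration starts: unplug every ephemeral device.
 * Returns -1 without changing anything if a non-ephemeral device is
 * still attached, since such a guest cannot be migrated. */
int prepare_migration(struct hostdev *devs, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (devs[i].attached && !devs[i].ephemeral)
            return -1;
    }
    for (i = 0; i < n; i++) {
        if (devs[i].ephemeral)
            devs[i].attached = false;
    }
    return 0;
}

/* Destination side, once migration has finished: replug ephemeral
 * devices while free VFs remain. Returns how many were reattached;
 * any left detached keep running on the virtio-net half of the bond. */
size_t finish_migration(struct hostdev *devs, size_t n, size_t free_vfs)
{
    size_t i, replugged = 0;

    for (i = 0; i < n && replugged < free_vfs; i++) {
        if (devs[i].ephemeral && !devs[i].attached) {
            devs[i].attached = true;
            replugged++;
        }
    }
    return replugged;
}
```

The split into two functions reflects the point made in the replies: the VF is absent for the whole transfer and only reappears in the finish stage, so the guest's bond falls back to virtio-net in between.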
Re: [libvirt] [PATCH 0/8] Hostdev-hybrid patches
* At migration time, since guests with attached pci-passthrough devices can't be migrated, the pci-passthrough device (which is found by searching the hostdev array for items with the ephemeral flag set) is detached. This reduces the bond interface on the guest to only having the virtio-net device, so traffic now passes through that device - it's slower, but connectivity is maintained.

And if this is the case, it means that 1) the guest must be aware that it is virtualized, and 2) can detect when it is being migrated.

Unless I'm misunderstanding, the guest doesn't explicitly know that it's virtualized or that it's being migrated - the guest OS just knows that one of its PCI devices has been unplugged, and later that it's being re-plugged (of course since there's a special driver on the guest, at least that bit has a pretty good idea that it's in a virtual machine; but that's no different from the virtio-net guest driver (not to mention the guest agent)). I'm guessing that the migration will (should, anyway) fail if the guest OS fails to detach the device in a timely fashion.

The migration will fail if the guest OS fails to detach the device in a timely fashion.

The ideal virtualization is one where the guest doesn't have to be aware of anything, but the goal of this patch is not ideal guest behavior, so much as faster performance by explicitly making virtualization a leaky interface where the guest has to cooperate. Assuming I'm correct, does that have any security implications on the host? Or are we okay even if the guest is malicious, because the worst the guest can do is use the slower interface rather than the faster pci-passthrough device?

I think the only completely new functionality provided by these patches is the ephemeral flag, which makes it possible for the pci-passthrough device to be auto-detached to allow migration. So any *new* security concern would be related to that capability. Otherwise, I don't see this as any different than defining the two devices separately, which is already possible with existing libvirt.

The hostdev-hybrid model can be configured using manual steps and 2 devices separately with existing libvirt device types.

The single hostdev-hybrid two-in-one device does provide a convenience factor beyond just shortening the config - the PF to use for the virtio-net device is automatically derived from the VF that's allocated for the pci-passthrough device (in my mind that's the one thing that makes it desirable to have this special device type rather than just adding the ephemeral flag to hostdev and requiring guest configurations to have two separate device entries. Am I missing something else?) but it would still be possible to use existing device types to provide the same virtual hardware to the guest, and the guest could use that hardware in the same manner.

I have other questions beyond that, but either don't understand the code enough yet to verbalize them, or will ask them next to the associated code.

Seconded :)
Re: [libvirt] [PATCH 0/8] Hostdev-hybrid patches
Please find my comments inline. Many Thanks, Regards, Shradha Shah On 09/12/2012 08:01 PM, Laine Stump wrote: On 09/12/2012 05:59 AM, Daniel P. Berrange wrote: On Tue, Sep 11, 2012 at 03:07:25PM -0400, Laine Stump wrote: On 09/07/2012 12:12 PM, Shradha Shah wrote: This patch series adds the support for interface-type=hostdev-hybrid and forward mode=hostdev-hybrid. The hostdev-hybrid mode makes migration possible along with PCI-passthrough. I had posted a RFC on the hostdev-hybrid methodology earlier on the libvirt mailing list. The RFC can be found here: https://www.redhat.com/archives/libvir-list/2012-February/msg00309.html Before anything else, let me outline what I *think* happens with a hostdev-hybrid device entry, and you can tell me how far off I am :-): * Any hostdev-hybrid interface definition results in 2 PCI devices being added to the guest: a) a PCI passthrough of an SR-IOV VF (done essentially the same as interface type='hostdev' b) a virtio-net device which is connected via macvtap bridge mode (? is that always the case) to the PF of the VF in (a) * Both of these devices are assigned the same MAC address. * Each of these occupies one PCI address on the guest, so a total of 2 PCI addresses is needed for each hostdev-hybrid device. (The redundancy in this statement is to be sure that I'm right, as that's an important point :-) * On the guest, these two network devices with matching MAC addresses are put together into a bond interface, with an extra driver that causes the bond to prefer the pci-passthrough device when it is present. So, under normal circumstances *all* traffic goes through the pci-passthrough device. * At migration time, since guests with attached pci-passthrough devices can't be migrated, the pci-passthrough device (which is found by searching the hostdev array for items with the ephemeral flag set) is detached. 
This reduces the bond interface on the guest to only having the virtio-net device, so traffic now passes through that device - it's slower, but connectivity is maintained. * on the destination, a new VF is found, setup with proper MAC address, VLAN, and 802.1QbX port info. A virtio-net device attached to the PF associated with this VF (via macvtap bridge mode) is also setup. The qemu commandline includes an entry for both of these devices. (Question: Is it the virtio-net device that uses the guest PCI address given in the interface device info?) (Question: actually, I guess the pci-passthrough device won't be attached until after the guest actually starts running on the destination host, correct?) * When migration is finished, the guest is shut down on the source and started up on the destination, leaving the new instance of the guest temporarily with just a single (virtio-net) device in the bond. * Finally, the pci-passthrough of the VF is attached to the guest, and the guest's bond interface resumes preferring this device, thus restoring full speed networking. Is that all correct? If so, one issue I have is that one of the devices (the pci-passthrough?) doesn't have its guest-side PCI address visible anywhere in the guest's XML, does it? This is problematic, because management applications (and libvirt itself) expect to be able to scan the list of devices to learn what PCI slots are occupied on the guest, and where they can add new devices. If that description is correct, That's a big if - keep in mind the author of the description :-) (seriously, it's very possible I'm missing some important point) then I have to wonder why we need to add all this code for a new hybrid device type. It seems to me like we can do all this already simply by listing one virtio device and one hostdev device in the guest XML. 
Aside from detaching/re-attaching the hostdev, the other thing that these patches bring is automatic derivation of the source of the virtio-net device from the hostdev. The hostdev device will be grabbed from a pool of VFs in a network, then a reverse lookup is done in PCI space to determine the PF for that VF - that's where the virtio-net device is connected.

I suppose this could be handled by 1) putting only the VFs of a single PF in any network definition's device pool, and 2) always having two parallel network definitions like this:

  <network>
    <name>net-x-vfs-hostdev</name>
    <forward mode='hostdev' ephemeral='yes'>
      <pf dev='eth3'/> <!-- makes a list of all VFs for PF 'eth3' -->
    </forward>
  </network>

  <network>
    <name>net-x-pf-macvtap</name>
    <forward mode='bridge'>
      <interface dev='eth3'/>
    </forward>
  </network>

Then each guest would have:

  <interface type='network'>
    <mac address='x:x:x:x:x:x'/>
    <source network='net-x-vfs-hostdev'/>
  </interface>
  <interface type='network'>
    <mac address='x:x:x:x:x:x'/>
    <source network='net-x-pf-macvtap'/>
    <model type='virtio'/>
  </interface>
Re: [libvirt] [PATCH 3/8] Hostdev-hybrid mode requires a direct linkdev and direct mode.
On 09/11/2012 08:15 PM, Laine Stump wrote:
> (not a full review yet, but a couple of things I noticed in passing)
>
> On 09/07/2012 12:15 PM, Shradha Shah wrote:
>> In this mode the guest contains a Virtual network device along with a
>> SRIOV VF passed through to the guest as a pci device.
>> ---
>>  src/conf/domain_conf.c   | 38
>>  src/conf/domain_conf.h   |  5
>>  src/libvirt_private.syms |  1 +
>>  src/util/pci.c           |  2 +-
>>  src/util/pci.h           |  2 ++
>>  src/util/virnetdev.c     | 40
>>  src/util/virnetdev.h     |  6
>>  7 files changed, 91 insertions(+), 3 deletions(-)
>>
>> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
>> index d8ab40c..c59ea00 100644
>> --- a/src/conf/domain_conf.c
>> +++ b/src/conf/domain_conf.c
>> @@ -1022,6 +1022,7 @@ virDomainActualNetDefFree(virDomainActualNetDefPtr def)
>>          virDomainHostdevDefClear(&def->data.hostdev.def);
>>          break;
>>      case VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID:
>> +        VIR_FREE(def->data.hostdev.linkdev);
>>          virDomainHostdevDefClear(&def->data.hostdev.def);
>>          break;
>>      default:
>> @@ -1077,6 +1078,7 @@ void virDomainNetDefFree(virDomainNetDefPtr def)
>>          break;
>>
>>      case VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID:
>> +        VIR_FREE(def->data.hostdev.linkdev);
>>          virDomainHostdevDefClear(&def->data.hostdev.def);
>>          break;
>>
>> @@ -4687,6 +4689,7 @@ virDomainNetDefParseXML(virCapsPtr caps,
>>      char *mode = NULL;
>>      char *linkstate = NULL;
>>      char *addrtype = NULL;
>> +    char *pfname = NULL;
>>      virNWFilterHashTablePtr filterparams = NULL;
>>      virDomainActualNetDefPtr actual = NULL;
>>      xmlNodePtr oldnode = ctxt->node;
>> @@ -5024,6 +5027,27 @@ virDomainNetDefParseXML(virCapsPtr caps,
>>                                                hostdev, flags) < 0) {
>>              goto error;
>>          }
>> +
>> +        if (hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) {
>> +            if (virNetDevGetPhysicalFunctionFromVfPciAddr(hostdev->source.subsys.u.pci.domain,
>> +                                                          hostdev->source.subsys.u.pci.bus,
>> +                                                          hostdev->source.subsys.u.pci.slot,
>> +                                                          hostdev->source.subsys.u.pci.function,
>> +                                                          &pfname) < 0) {
>
> It's problematic to have this call in a parsing function - it requires
> the actual hardware to be present on the machine doing the parse. For
> example, doing this in the parse causes the qemuxml2xml and
> qemuxml2argv tests to fail on my system, because my hardware doesn't
> match yours:
>
>   One last thing - after applying the entire series, I'm getting the
>   following failures in make check:
>
>   VIR_TEST_DEBUG=2 ./qemuxml2xmltest
>   TEST: qemuxml2xmltest
>   [...]
>   7) QEMU XML-2-XML net-hostdevhybrid
>   ... libvir: Domain Config error : internal error Could not get
>   Physical Function of the hostdev
>   ...
>   VIR_TEST_DEBUG=2 ./qemuxml2argvtest
>   TEST: qemuxml2argvtest
>   [...]
>   113) QEMU XML-2-ARGV net-hostdevhybrid
>   ... libvir: Domain Config error : internal error Could not get
>   Physical Function of the hostdev
>
> I couldn't find any other uses of network device ioctl type calls in
> the conf directory. I think we at least frown on doing that in
> parsing/formatting, and may even forbid it. At any rate, You need to
> not do this here. Instead, perhaps you can leave it empty if it's not
> given explicitly in the input XML, and fill it in in the caller when
> appropriate?

Thanks for the review Laine. I had actually encountered this issue when
I was working on the patches but at that moment I could not think of a
workaround. Thanks for the suggestion. I will do the needful.

>> +                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>> +                               _("Could not get Physical Function of the hostdev"));
>> +                goto error;
>> +            }
>> +        }
>> +        if (pfname != NULL)
>> +            def->data.hostdev.linkdev = strdup(pfname);
>> +        else {
>> +            virReportError(VIR_ERR_INTERNAL_ERROR,
>> +                           _("Linkdev is required in %s mode"),
>> +                           virDomainNetTypeToString(def->type));
>> +            goto error;
>> +        }
>> +        def->data.hostdev.mode = VIR_NETDEV_MACVLAN_MODE_BRIDGE;
>>          break;
>>
>>      case VIR_DOMAIN_NET_TYPE_USER:
>> @@ -14664,11 +14688,16 @@ virDomainNetGetActualDirectDev(virDomainNetDefPtr iface)
>>  {
>>      if (iface->type == VIR_DOMAIN_NET_TYPE_DIRECT)
>>          return iface->data.direct.linkdev;
>> +    if (iface->type == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID)
>> +        return iface->data.hostdev.linkdev;
>>      if (iface->type != VIR_DOMAIN_NET_TYPE_NETWORK)
>>          return NULL;
>>      if (!iface->data.network.actual
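
The review suggestion in this message (leave the PF name empty in the parser unless the XML gives it, and fill it in from the caller on the host that will actually run the guest) might look roughly like the sketch below. Everything here is an assumption made for illustration: `struct hybrid_net`, `hybrid_net_parse`, `hybrid_net_resolve_linkdev` and the `pf_lookup_fn` callback are hypothetical names, and `fake_pf_lookup` is a stub standing in for a real VF-to-PF lookup such as the one the patch calls.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types for illustration only; not libvirt's structures. */
struct pci_addr { unsigned domain, bus, slot, function; };

struct hybrid_net {
    struct pci_addr vf;
    char *linkdev;  /* PF interface name; NULL until resolved */
};

/* Callback that maps a VF address to its PF name (hardware access). */
typedef int (*pf_lookup_fn)(const struct pci_addr *vf, char **pfname);

static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Parse step: record only what the input gives. linkdev_attr may be
 * NULL, in which case linkdev is left empty. No hardware is touched,
 * so this works on any machine, including a test host. */
int hybrid_net_parse(struct hybrid_net *net, const struct pci_addr *vf,
                     const char *linkdev_attr)
{
    net->vf = *vf;
    net->linkdev = NULL;
    if (linkdev_attr) {
        net->linkdev = dup_string(linkdev_attr);
        if (!net->linkdev)
            return -1;
    }
    return 0;
}

/* Caller step, run where the guest is about to start: fill in the PF
 * name only if the parser left it empty. */
int hybrid_net_resolve_linkdev(struct hybrid_net *net, pf_lookup_fn lookup)
{
    if (net->linkdev)
        return 0;
    return lookup(&net->vf, &net->linkdev);
}

/* Stub lookup for the example: pretends every VF on bus 4 belongs to
 * a PF called "eth3" and that nothing else is known. */
int fake_pf_lookup(const struct pci_addr *vf, char **pfname)
{
    if (vf->bus != 4)
        return -1;
    *pfname = dup_string("eth3");
    return *pfname ? 0 : -1;
}
```

Keeping the lookup behind a callback is one way to let the XML round-trip tests run without the hardware, which is the failure described in the review; whether the real fix takes this shape is not stated in the thread.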
[libvirt] [PATCH 0/8] Hostdev-hybrid patches
This patch series adds the support for interface-type=hostdev-hybrid and forward mode=hostdev-hybrid. The hostdev-hybrid mode makes migration possible along with PCI-passthrough. I had posted a RFC on the hostdev-hybrid methodology earlier on the libvirt mailing list. The RFC can be found here: https://www.redhat.com/archives/libvir-list/2012-February/msg00309.html Shradha Shah (8): RNG updates, new xml parser/formatter code for interface type=hostdev-hybrid RNG updates, new xml parser/formatter code for forward mode=hostdev-hybrid Hostdev-hybrid mode requires a direct linkdev and direct mode. ActualParent is used to store the information about the NETDEV. network: support hostdev-hybrid in network driver qemu: support netdevs from hostdev-hybrid networks Using the Ephemeral Flag to prepare for Migration Support. Migration support for hostdev-hybrid. docs/formatdomain.html.in | 29 docs/formatnetwork.html.in | 46 ++ docs/schemas/domaincommon.rng | 50 ++ docs/schemas/network.rng |1 + include/libvirt/libvirt.h.in |1 + src/conf/domain_conf.c | 167 +--- src/conf/domain_conf.h |8 + src/conf/network_conf.c|9 +- src/conf/network_conf.h|1 + src/libvirt_private.syms |1 + src/network/bridge_driver.c| 123 +-- src/qemu/qemu_command.c| 61 +++ src/qemu/qemu_domain.c |6 +- src/qemu/qemu_domain.h |3 +- src/qemu/qemu_driver.c |6 +- src/qemu/qemu_hostdev.c| 158 +-- src/qemu/qemu_hotplug.c| 26 +++- src/qemu/qemu_migration.c | 106 - src/qemu/qemu_process.c|3 +- src/uml/uml_conf.c |5 + src/util/pci.c |2 +- src/util/pci.h |2 + src/util/virnetdev.c | 40 + src/util/virnetdev.h |6 + src/xenxs/xen_sxpr.c |1 + tests/networkxml2xmlin/hostdev-hybrid-pf.xml |7 + tests/networkxml2xmlin/hostdev-hybrid.xml | 10 ++ tests/networkxml2xmlout/hostdev-hybrid-pf.xml |7 + tests/networkxml2xmlout/hostdev-hybrid.xml | 10 ++ tests/networkxml2xmltest.c |2 + .../qemuxml2argv-net-hostdevhybrid.args|8 + .../qemuxml2argv-net-hostdevhybrid.xml | 35 tests/qemuxml2argvtest.c |2 + 
.../qemuxml2xmlout-net-hostdevhybrid.xml | 40 + tests/qemuxml2xmltest.c|1 + 35 files changed, 884 insertions(+), 99 deletions(-) create mode 100644 tests/networkxml2xmlin/hostdev-hybrid-pf.xml create mode 100644 tests/networkxml2xmlin/hostdev-hybrid.xml create mode 100644 tests/networkxml2xmlout/hostdev-hybrid-pf.xml create mode 100644 tests/networkxml2xmlout/hostdev-hybrid.xml create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-net-hostdevhybrid.args create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-net-hostdevhybrid.xml create mode 100644 tests/qemuxml2xmloutdata/qemuxml2xmlout-net-hostdevhybrid.xml -- 1.7.4.4 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 1/8] RNG updates, new xml parser/formatter code for interface type=hostdev-hybrid
This patch introduces the new interface type='hostdev-hybrid' along with attribute managed Includes updates to the domain RNG and new xml parser/formatter code. Also introduces a ephemeral tag for hybrid hostdevs. The ephemeral tag for hybrid hostdevs will be useful for live migration support at a later stage. --- docs/formatdomain.html.in | 29 ++ docs/schemas/domaincommon.rng | 50 ++ src/conf/domain_conf.c | 96 +--- src/conf/domain_conf.h |2 + src/uml/uml_conf.c |5 + src/xenxs/xen_sxpr.c |1 + .../qemuxml2argv-net-hostdevhybrid.args|8 ++ .../qemuxml2argv-net-hostdevhybrid.xml | 35 +++ tests/qemuxml2argvtest.c |2 + .../qemuxml2xmlout-net-hostdevhybrid.xml | 40 tests/qemuxml2xmltest.c|1 + 11 files changed, 256 insertions(+), 13 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 503685f..70cf362 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2659,6 +2659,20 @@ guest instead of lt;interface type='hostdev'/gt;. /p +p + Libvirt later than 0.10.0 also supports intelligent passthrough + of VF in the hybrid mode. This is done by using the lt;interface + type='hostdev-hybrid'/gt; functionality. Similar to lt;interface + type='hostdev'/gt; the device's MAC address is first optionally + configured and the device is optionally associated with an 802.1Qbh + capable switch using an optionally specified lt;virtualportgt; + element (see the examples of virtualport given above for + type='direct' network devices). The Vf is passed into the guest as + a PCI device and at the same time a virtual interface with + type='direct' mode='bridge' is created in the guest. This hybrid mode + of intelligent passthrough makes Live migration possible. +/p + pre ... lt;devicesgt; @@ -2674,6 +2688,21 @@ lt;/devicesgt; .../pre +pre + ... 
+ lt;devicesgt; +lt;interface type='hostdev-hybrid'gt; + lt;sourcegt; +lt;address type='pci' domain='0x' bus='0x00' slot='0x07' function='0x0'/gt; + lt;/sourcegt; + lt;mac address='52:54:00:6d:90:02'gt; + lt;virtualport type='802.1Qbh'gt; +lt;parameters profileid='finance'/gt; + lt;/virtualportgt; +lt;/interfacegt; + lt;/devicesgt; + .../pre + h5a name=elementsNICSMulticastMulticast tunnel/a/h5 diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index c2c6184..eedc255 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1677,6 +1677,56 @@ ref name=interface-options/ /interleave /group +group + attribute name=type +valuehostdev-hybrid/value + /attribute + optional +attribute name=managed + choice +valueyes/value +valueno/value + /choice +/attribute + /optional + interleave +element name=source + choice +group + ref name=usbproduct/ + optional +ref name=usbaddress/ + /optional +/group +element name=address + choice +group + attribute name=type +valuepci/value + /attribute + ref name=pciaddress/ +/group +group + attribute name=type +valueusb/value + /attribute + attribute name=bus +ref name=usbAddr/ + /attribute + attribute name=device +ref name=usbPort/ + /attribute +/group + /choice +/element + /choice +/element +optional + ref name=virtualPortProfile/ +/optional +ref name=interface-options/ + /interleave +/group /choice /element /define diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 8952b69..d8ab40c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -293,7 +293,8 @@ VIR_ENUM_IMPL(virDomainNet, VIR_DOMAIN_NET_TYPE_LAST, bridge, internal, direct, - hostdev) + hostdev, + hostdev-hybrid) VIR_ENUM_IMPL(virDomainNetBackend, VIR_DOMAIN_NET_BACKEND_TYPE_LAST,
[libvirt] [PATCH 2/8] RNG updates, new xml parser/formatter code for forward mode=hostdev-hybrid
This patch introduces the new forward mode='hostdev-hybrid' along with attribute managed Includes updates to the network RNG and new xml parser/formatter code. --- docs/formatnetwork.html.in| 46 + docs/schemas/network.rng |1 + src/conf/network_conf.c |9 +++-- src/conf/network_conf.h |1 + tests/networkxml2xmlin/hostdev-hybrid-pf.xml |7 tests/networkxml2xmlin/hostdev-hybrid.xml | 10 + tests/networkxml2xmlout/hostdev-hybrid-pf.xml |7 tests/networkxml2xmlout/hostdev-hybrid.xml| 10 + tests/networkxml2xmltest.c|2 + 9 files changed, 90 insertions(+), 3 deletions(-) diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in index 49206dd..5a5392c 100644 --- a/docs/formatnetwork.html.in +++ b/docs/formatnetwork.html.in @@ -259,6 +259,22 @@ with codelt;forward mode='hostdev'/gt;/code. /p /dd + dtcodehostdev-hybrid/code/dt + dd +Libvirt later than 0.10.0 also supports intelligent +passthrough of VF in the hybrid mode. This is done by +using the lt;interface type='hostdev-hybrid'/gt; +functionality. Similar to lt;interface type='hostdev'/gt; +the device's MAC address is first optionally configured and +the device is optionally associated with an 802.1Qbh capable +switch using an optionally specified lt;virtualportgt; +element (see the examples of virtualport given above for +type='direct' network devices). The Vf is passed into the +guest as a PCI device and at the same time a virtual interface +with type='direct' mode='bridge' is created in the guest. This +hybrid mode of intelligent passthrough makes Live migration +possible. + /dd /dl As mentioned above, a codelt;forwardgt;/code element can have multiple codelt;interfacegt;/code subelements, each @@ -341,6 +357,36 @@ ... /pre +Similarly hostdev-hybrid mode can be used to support PCI- +passthrough of VF in hybrid mode as described above. 
The +interface pool can be specified with a list of +codelt;addressgt;/code elements, each of which has +codelt; typegt;/code (must always be code'pci'/code, +codelt;domaingt;/code, codelt;busgt;/code, +codelt;slotgt;/code, and codelt;functiongt;/code +attributes or the interface pool can also be defined using a +single physical function codelt;pfgt;/code subelement to +call out the corresponding physical interface associated with +multiple virtual interfaces (similar to passthrough mode). + +pre +... + lt;forward mode='hostdev-hybrid' managed='yes'gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='1'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='2'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='3'/gt; + lt;/forwardgt; +... +/pre + +pre +... + lt;forward mode='hostdev-hybrid' managed='yes'gt; +lt;pf dev='eth0'/gt; + lt;/forwardgt; +... +/pre + /dd /dl h5a name=elementQoSQuality of service/a/h5 diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 4abfd91..1f2136a 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -88,6 +88,7 @@ valueprivate/value valuevepa/value valuehostdev/value + valuehostdev-hybrid/value /choice /attribute /optional diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 9d53d8e..443b060 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -50,7 +50,7 @@ VIR_ENUM_IMPL(virNetworkForward, VIR_NETWORK_FORWARD_LAST, - none, nat, route, bridge, private, vepa, passthrough, hostdev) + none, nat, route, bridge, private, vepa, passthrough, hostdev, hostdev-hybrid) VIR_ENUM_DECL(virNetworkForwardHostdevDevice) VIR_ENUM_IMPL(virNetworkForwardHostdevDevice, @@ -1289,6 +1289,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) case VIR_NETWORK_FORWARD_VEPA: case VIR_NETWORK_FORWARD_PASSTHROUGH: case VIR_NETWORK_FORWARD_HOSTDEV: +case VIR_NETWORK_FORWARD_HOSTDEV_HYBRID: if (def-bridge) { virReportError(VIR_ERR_XML_ERROR, _(bridge name not 
allowed in %s mode (network '%s')), @@ -1590,7 +1591,8 @@ char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags) virBufferAddLit(buf, forward); virBufferEscapeString(buf, dev='%s', dev); virBufferAsprintf(buf, mode='%s', mode); -if
[libvirt] [PATCH 5/8] network: support hostdev-hybrid in network driver
This patch updates the network driver to properly utilize the new attributes/elements that are now in virNetworkDef --- src/network/bridge_driver.c | 123 ++- 1 files changed, 110 insertions(+), 13 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 808c843..f94f81a 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -1993,9 +1993,9 @@ networkStartNetworkExternal(struct network_driver *driver ATTRIBUTE_UNUSED, virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs to be done each time a network of - * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is started. On - * failure, undo anything you've done, and return -1. On success - * return 0. + * type BRIDGE, PRIVATE, VEPA, HOSTDEV, HOSTDEV-HYBRID or PASSTHROUGH + * is started. On failure, undo anything you've done, and return -1. + * On success return 0. */ return 0; } @@ -2004,9 +2004,9 @@ static int networkShutdownNetworkExternal(struct network_driver *driver ATTRIBUT virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs to be done each time a network of - * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is shutdown. On - * failure, undo anything you've done, and return -1. On success - * return 0. + * type BRIDGE, PRIVATE, VEPA, HOSTDEV, HOSTDEV-HYBRID or PASSTHROUGH + * is shutdown. On failure, undo anything you've done, and return -1. + * On success return 0. 
*/ return 0; } @@ -2036,6 +2036,7 @@ networkStartNetwork(struct network_driver *driver, case VIR_NETWORK_FORWARD_VEPA: case VIR_NETWORK_FORWARD_PASSTHROUGH: case VIR_NETWORK_FORWARD_HOSTDEV: +case VIR_NETWORK_FORWARD_HOSTDEV_HYBRID: ret = networkStartNetworkExternal(driver, network); break; } @@ -2096,6 +2097,7 @@ static int networkShutdownNetwork(struct network_driver *driver, case VIR_NETWORK_FORWARD_VEPA: case VIR_NETWORK_FORWARD_PASSTHROUGH: case VIR_NETWORK_FORWARD_HOSTDEV: +case VIR_NETWORK_FORWARD_HOSTDEV_HYBRID: ret = networkShutdownNetworkExternal(driver, network); break; } @@ -2885,7 +2887,8 @@ networkCreateInterfacePool(virNetworkDefPtr netdef) { goto finish; } } -else if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV) { +else if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV || + netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV_HYBRID) { /* VF's are always PCI devices */ netdef-forwardIfs[ii].type = VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI; netdef-forwardIfs[ii].device.pci.domain = virt_fns[ii]-domain; @@ -3056,6 +3059,8 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) } iface-data.network.actual-data.hostdev.def.parent.type = VIR_DOMAIN_DEVICE_NET; iface-data.network.actual-data.hostdev.def.parent.data.net = iface; +iface-data.network.actual-data.hostdev.def.actualParent.type = VIR_DOMAIN_DEVICE_NET; +iface-data.network.actual-data.hostdev.def.actualParent.data.net = iface; iface-data.network.actual-data.hostdev.def.info = iface-info; iface-data.network.actual-data.hostdev.def.mode = VIR_DOMAIN_HOSTDEV_MODE_SUBSYS; iface-data.network.actual-data.hostdev.def.managed = netdef-managed; @@ -3087,6 +3092,92 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) } } +} else if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV_HYBRID) { +char *pfname = NULL; +if (!iface-data.network.actual + (VIR_ALLOC(iface-data.network.actual) 0)) { +virReportOOMError(); +goto error; +} +iface-data.network.actual-type = actualType = 
VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID; + +if (netdef-nForwardPfs 0 netdef-nForwardIfs = 0 +networkCreateInterfacePool(netdef) 0) { +goto error; +} +/* pick first dev with 0 connections*/ + +for (ii = 0; ii netdef-nForwardIfs; ii++) { +if (netdef-forwardIfs[ii].connections == 0) { +dev = netdef-forwardIfs[ii]; +break; +} +} +if (!dev) { +virReportError(VIR_ERR_INTERNAL_ERROR, + _(network '%s' requires exclusive access to interfaces, but none are available), + netdef-name); +goto error; +} +iface-data.network.actual-data.hostdev.def.parent.type = VIR_DOMAIN_DEVICE_NONE; +iface-data.network.actual-data.hostdev.def.actualParent.type = VIR_DOMAIN_DEVICE_NET; +iface-data.network.actual-data.hostdev.def.actualParent.data.net = iface; + +if
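The allocation step in this patch walks the interface pool and takes the first entry whose connection count is zero, reporting an error if none is free. A minimal sketch of that selection, assuming a simplified `forward_if` structure (a hypothetical stand-in for the network driver's forward interface definition):

```c
#include <stddef.h>

/* Hypothetical, simplified pool entry: a device name plus the number of
 * guests currently using it. */
struct forward_if {
    const char *dev;
    int connections;
};

/* Returns the first pool entry with no connections, or NULL when every
 * entry is in use (the caller would then report an error). */
struct forward_if *pick_unused_if(struct forward_if *ifs, size_t nifs)
{
    size_t i;

    for (i = 0; i < nifs; i++) {
        if (ifs[i].connections == 0)
            return &ifs[i];
    }
    return NULL;
}
```

Returning a pointer into the pool, rather than a copy, lets the caller increment `connections` on the chosen entry once the device has been handed to a guest.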
[libvirt] [PATCH 7/8] Using the Ephemeral Flag to prepare for Migration Support.
--- include/libvirt/libvirt.h.in |1 + src/conf/domain_conf.c | 24 +++- src/qemu/qemu_domain.c |6 +- src/qemu/qemu_domain.h |3 ++- src/qemu/qemu_driver.c |6 +++--- src/qemu/qemu_hostdev.c |6 ++ src/qemu/qemu_migration.c|4 ++-- 7 files changed, 38 insertions(+), 12 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index deb35ec..8e7a85d 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -1651,6 +1651,7 @@ typedef enum { VIR_DOMAIN_XML_SECURE = (1 0), /* dump security sensitive information too */ VIR_DOMAIN_XML_INACTIVE = (1 1), /* dump inactive domain information */ VIR_DOMAIN_XML_UPDATE_CPU = (1 2), /* update guest CPU requirements according to host CPU */ +VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES = (1 24), /* Do not include ephemeral devices */ } virDomainXMLFlags; char * virDomainGetXMLDesc (virDomainPtr domain, diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 52c00db..5e2b224 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -13174,7 +13174,8 @@ virDomainDefFormatInternal(virDomainDefPtr def, virCheckFlags(DUMPXML_FLAGS | VIR_DOMAIN_XML_INTERNAL_STATUS | VIR_DOMAIN_XML_INTERNAL_ACTUAL_NET | - VIR_DOMAIN_XML_INTERNAL_PCI_ORIG_STATES, + VIR_DOMAIN_XML_INTERNAL_PCI_ORIG_STATES | + VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES, -1); if (!(type = virDomainVirtTypeToString(def-virtType))) { @@ -13675,10 +13676,22 @@ virDomainDefFormatInternal(virDomainDefPtr def, /* If parent.type != NONE, this is just a pointer to the * hostdev in a higher-level device (e.g. virDomainNetDef), * and will have already been formatted there. + * Hostdevs marked as ephemeral are hybrid hostdevs and + * should not be formatted. 
*/ -if (def-hostdevs[n]-parent.type == VIR_DOMAIN_DEVICE_NONE -virDomainHostdevDefFormat(buf, def-hostdevs[n], flags) 0) { -goto cleanup; +if (def-hostdevs[n]-parent.type == VIR_DOMAIN_DEVICE_NONE) { +if ((flags VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES) == 0) { +if (virDomainHostdevDefFormat(buf, def-hostdevs[n], flags) 0) { +goto cleanup; +} +} +else { +if (def-hostdevs[n]-ephemeral == 0) { +if (virDomainHostdevDefFormat(buf, def-hostdevs[n], flags) 0) { +goto cleanup; +} +} +} } } @@ -13727,7 +13740,8 @@ virDomainDefFormat(virDomainDefPtr def, unsigned int flags) { virBuffer buf = VIR_BUFFER_INITIALIZER; -virCheckFlags(DUMPXML_FLAGS, NULL); +virCheckFlags(DUMPXML_FLAGS | VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES, + NULL); if (virDomainDefFormatInternal(def, flags, buf) 0) return NULL; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 0ae30b7..a05e0dd 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1335,13 +1335,17 @@ char * qemuDomainDefFormatLive(struct qemud_driver *driver, virDomainDefPtr def, bool inactive, -bool compatible) +bool compatible, +bool ephemeral) { unsigned int flags = QEMU_DOMAIN_FORMAT_LIVE_FLAGS; if (inactive) flags |= VIR_DOMAIN_XML_INACTIVE; +if (ephemeral) +flags |= VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES; + return qemuDomainDefFormatXML(driver, def, flags, compatible); } diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index dff53cf..82c5f8d 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -269,7 +269,8 @@ char *qemuDomainFormatXML(struct qemud_driver *driver, char *qemuDomainDefFormatLive(struct qemud_driver *driver, virDomainDefPtr def, bool inactive, - bool compatible); + bool compatible, + bool ephemeral); void qemuDomainObjTaint(struct qemud_driver *driver, virDomainObjPtr obj, diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8e8e00c..3b18e80 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2748,9 +2748,9 @@ qemuDomainSaveInternal(struct 
qemud_driver *driver, virDomainPtr dom, virDomainDefFree(def); goto endjob; } -xml = qemuDomainDefFormatLive(driver, def, true, true); +xml = qemuDomainDefFormatLive(driver, def, true, true, false); } else { -xml =
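The nested conditions added to `virDomainDefFormatInternal` in this patch reduce to one rule: a top-level hostdev is formatted unless the caller asked for no ephemeral devices and the hostdev is ephemeral. The sketch below restates that rule as a single predicate. The types and the names `hostdev_should_format` and `XML_NO_EPHEMERAL_DEVICES` are illustrative assumptions; only the bit position (24) is taken from the patch.

```c
#include <stdbool.h>

/* Same bit position as VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES in the patch;
 * the macro name here is a hypothetical stand-in. */
#define XML_NO_EPHEMERAL_DEVICES (1u << 24)

enum parent_type { PARENT_NONE = 0, PARENT_NET };

/* Hypothetical, simplified view of a host device definition. */
struct hostdev {
    enum parent_type parent_type;
    bool ephemeral;
};

/* Decides whether a hostdev should be written out as a top-level device. */
bool hostdev_should_format(const struct hostdev *dev, unsigned int flags)
{
    /* Hostdevs owned by a higher-level device are formatted there. */
    if (dev->parent_type != PARENT_NONE)
        return false;
    /* Ephemeral hostdevs are skipped only when the caller asks for it. */
    if ((flags & XML_NO_EPHEMERAL_DEVICES) && dev->ephemeral)
        return false;
    return true;
}
```

Keeping the decision in one predicate avoids calling the format function from two separate branches.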
[libvirt] [PATCH 8/8] Migration support for hostdev-hybrid.
This patch uses the ephemeral flag to prevent the hybrid hostdev from being formatted into the XML. Before migration the hybrid hostdev is hot-unplugged; after migration it is hotplugged again if the specific hostdev is available on the destination host. --- src/qemu/qemu_migration.c | 102 ++-- 1 files changed, 97 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 4a51e11..5398049 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -31,6 +31,7 @@ #include qemu_monitor.h #include qemu_domain.h #include qemu_process.h +#include qemu_hotplug.h #include qemu_capabilities.h #include qemu_cgroup.h @@ -49,6 +50,7 @@ #include storage_file.h #include viruri.h #include hooks.h +#include network/bridge_driver.h #define VIR_FROM_THIS VIR_FROM_QEMU @@ -122,6 +124,79 @@ struct _qemuMigrationCookie { virDomainDefPtr persistent; }; +static void +qemuMigrationRemoveEphemeralDevices(struct qemud_driver *driver, +virDomainObjPtr vm) +{ +virDomainHostdevDefPtr dev; +virDomainDeviceDef def; +unsigned int i; + +for (i = 0; i vm-def-nhostdevs; i++) { +dev = vm-def-hostdevs[i]; +if (dev-ephemeral == 1) { +def.type = VIR_DOMAIN_DEVICE_HOSTDEV; +def.data.hostdev = dev; + +if (qemuDomainDetachHostDevice(driver, vm, def) = 0) { +continue; /* nhostdevs reduced */ +} +} +} +} + +static void +qemuMigrationRestoreEphemeralDevices(struct qemud_driver *driver, + virDomainObjPtr vm) +{ +virDomainNetDefPtr net; +unsigned int i; + +/* Do nothing if ephemeral devices are present in which case this + function was called before qemuMigrationRemoveEphemeralDevices */ + +for (i = 0; i vm-def-nhostdevs; i++) { +if (vm-def-hostdevs[i]-ephemeral == 1) +return; +} + +for (i = 0; i vm-def-nnets; i++) { +net = vm-def-nets[i]; + +if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) { +if (qemuDomainAttachHostDevice(driver, vm, + virDomainNetGetActualHostdev(net)) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Hybrid
Hostdev cannot be attached after migration)); +networkReleaseActualDevice(net); +} +} +return; +} +} + +static void +qemuMigrationAttachEphemeralDevices(struct qemud_driver *driver, +virDomainObjPtr vm) +{ +virDomainNetDefPtr net; +unsigned int i; + +for (i = 0; i vm-def-nnets; i++) { +net = vm-def-nets[i]; + +if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) { +if (qemuDomainAttachHostDevice(driver, vm, + virDomainNetGetActualHostdev(net)) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Hybrid Hostdev cannot be attached after migration)); +networkReleaseActualDevice(net); +} +} +} +return; +} + static void qemuMigrationCookieGraphicsFree(qemuMigrationCookieGraphicsPtr grap) { if (!grap) @@ -800,6 +875,7 @@ qemuMigrationIsAllowed(struct qemud_driver *driver, virDomainObjPtr vm, virDomainDefPtr def) { int nsnapshots; +unsigned int i; if (vm) { if (qemuProcessAutoDestroyActive(driver, vm)) { @@ -817,10 +893,12 @@ qemuMigrationIsAllowed(struct qemud_driver *driver, virDomainObjPtr vm, def = vm-def; } -if (def-nhostdevs 0) { -virReportError(VIR_ERR_OPERATION_INVALID, - %s, _(Domain with assigned host devices cannot be migrated)); -return false; +for (i = 0; i def-nhostdevs; i++) { +if (def-hostdevs[i]-ephemeral == 0) { +virReportError(VIR_ERR_OPERATION_INVALID, + %s, _(Domain with assigned host devices cannot be migrated)); +return false; +} } return true; @@ -2043,6 +2121,8 @@ static int doNativeMigrate(struct qemud_driver *driver, driver, vm, uri, NULLSTR(cookiein), cookieinlen, cookieout, cookieoutlen, flags, resource); +qemuMigrationRemoveEphemeralDevices(driver, vm); + if (STRPREFIX(uri, tcp:) !STRPREFIX(uri, tcp://)) { char *tmp; /* HACK: source host generates bogus URIs, so fix them up */ @@ -2069,6 +2149,9 @@ static int doNativeMigrate(struct qemud_driver *driver, ret = qemuMigrationRun(driver, vm, cookiein, cookieinlen, cookieout, cookieoutlen, flags, resource, spec, dconn); +if (ret != 0 ) 
+qemuMigrationRestoreEphemeralDevices(driver,
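The removal loop in this patch detaches each ephemeral hostdev and then continues, with a comment noting that `nhostdevs` has been reduced. If the detach compacts the array, as that comment suggests, a `for` loop that still increments its index after a removal would step over the entry that slid into the freed slot. The sketch below shows one way to iterate safely under that assumption, together with the relaxed migration check (only non-ephemeral hostdevs block migration). All types and function names here are hypothetical simplifications, not the qemu driver's API.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_HOSTDEVS 8

/* Hypothetical, simplified stand-ins for a hostdev and a domain. */
struct hostdev {
    int id;
    bool ephemeral;
};

struct domain {
    struct hostdev *hostdevs[MAX_HOSTDEVS];
    size_t nhostdevs;
};

/* Stand-in for a detach call: drops entry idx and compacts the array. */
static void domain_detach_hostdev(struct domain *dom, size_t idx)
{
    size_t i;

    for (i = idx; i + 1 < dom->nhostdevs; i++)
        dom->hostdevs[i] = dom->hostdevs[i + 1];
    dom->nhostdevs--;
}

/* Detaches every ephemeral hostdev and returns how many were removed. */
size_t remove_ephemeral_hostdevs(struct domain *dom)
{
    size_t i = 0;
    size_t removed = 0;

    while (i < dom->nhostdevs) {
        if (dom->hostdevs[i]->ephemeral) {
            domain_detach_hostdev(dom, i);
            removed++;
            /* Do not advance: a new entry now occupies slot i. */
        } else {
            i++;
        }
    }
    return removed;
}

/* Migration is refused only if a non-ephemeral hostdev remains. */
bool migration_is_allowed(const struct domain *dom)
{
    size_t i;

    for (i = 0; i < dom->nhostdevs; i++) {
        if (!dom->hostdevs[i]->ephemeral)
            return false;
    }
    return true;
}
```

Whether the real detach path compacts `hostdevs` in place is not shown in the excerpt, so this is a sketch of the iteration pattern rather than a statement about the patch's behaviour.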
[libvirt] [PATCH 6/8] qemu: support netdevs from hostdev-hybrid networks
For network devices allocated from a network with forward mode='hostdev-hybrid', there is a need to add the newly minted hostdev to the hostdevs array. In this case we also need to call qemuPrepareHostDevices just for this one device, as the standard call to initialize all the hostdevs that were defined directly in the domain's configuration has already been made by the time we allocate a device from a libvirt network, and thus have something that needs initializing. --- src/qemu/qemu_command.c | 61 +++ src/qemu/qemu_hostdev.c |2 +- src/qemu/qemu_hotplug.c | 24 -- src/qemu/qemu_process.c |3 +- 4 files changed, 85 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index a83d6de..e5388b3 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -27,6 +27,7 @@ #include qemu_hostdev.h #include qemu_capabilities.h #include qemu_bridge_filter.h +#include qemu_hostdev.h #include cpu/cpu.h #include memory.h #include logging.h @@ -4357,10 +4358,16 @@ qemuBuildCommandLine(virConnectPtr conn, bool emitBootindex = false; int usbcontroller = 0; bool usblegacy = false; + +virDomainObjPtr vm = NULL; +virDomainObjListPtr doms = driver-domains; + uname_normalize(ut); virUUIDFormat(def-uuid, uuid); +vm = virHashLookup(doms-objs, uuid); + emulator = def-emulator; /* @@ -5309,6 +5316,60 @@ qemuBuildCommandLine(virConnectPtr conn, continue; } + if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) { + if (net-type == VIR_DOMAIN_NET_TYPE_NETWORK) { + virDomainHostdevDefPtr hostdev = virDomainNetGetActualHostdev(net); + virDomainHostdevDefPtr found; + if (vmop == VIR_NETDEV_VPORT_PROFILE_OP_CREATE) { + if (qemuAssignDeviceHostdevAlias(def, hostdev, +(def-nhostdevs-1)) 0) { + goto error; + } + + if (virDomainHostdevFind(def, hostdev, found) 0) { + qemuDomainObjPrivatePtr priv = vm-privateData; + if (qemuDomainPCIAddressEnsureAddr(priv-pciaddrs, + hostdev-info) 0) { + goto error; + } + if (virDomainHostdevInsert(def, hostdev) 0) { + 
virReportOOMError(); + goto error; + } + if (qemuPrepareHostdevPCIDevices(driver, def-name, +def-uuid, hostdev, 1) 0) { + goto error; + } + } + else { + virReportError(VIR_ERR_INTERNAL_ERROR, + _(PCI device %04x:%02x:%02x.%x +allocated from network %s is already +in use by domain %s), + hostdev-source.subsys.u.pci.domain, + hostdev-source.subsys.u.pci.bus, + hostdev-source.subsys.u.pci.slot, + hostdev-source.subsys.u.pci.function, + net-data.network.name, + def-name); + goto error; + } + } + } + + int tapfd = qemuPhysIfaceConnect(def, driver, net, + qemuCaps, vmop); +if (tapfd 0) +goto error; + +last_good_net = i; +virCommandTransferFD(cmd, tapfd); + +if (snprintf(tapfd_name, sizeof(tapfd_name), %d, + tapfd) = sizeof(tapfd_name)) +goto no_memory; +} + if (actualType == VIR_DOMAIN_NET_TYPE_NETWORK || actualType == VIR_DOMAIN_NET_TYPE_BRIDGE) { /* diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index a060a7e..4851e11 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -362,7 +362,7 @@ qemuDomainHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, } } else if (hostdev-actualParent.data.net) { - vlan = virDomainNetGetActualVlan(hostdev-actualParent.data.net); +vlan = virDomainNetGetActualVlan(hostdev-actualParent.data.net); virtPort =
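The error path in this patch reports the conflicting device using the conventional `domain:bus:slot.function` notation, with the format string `%04x:%02x:%02x.%x`. A small sketch of that formatting, assuming a simplified `pci_address` structure and a hypothetical helper name:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified PCI address. */
struct pci_address {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int function;
};

/* Writes "dddd:bb:ss.f" (hexadecimal) into buf.
 * Returns 0 on success, -1 if the buffer is too small. */
int pci_address_format(const struct pci_address *addr, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%04x:%02x:%02x.%x",
                     addr->domain, addr->bus, addr->slot, addr->function);

    if (n < 0 || (size_t)n >= buflen)
        return -1;
    return 0;
}
```

Checking the return value of `snprintf` against the buffer size follows the same pattern the patch uses for `tapfd_name`.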
[libvirt] [PATCH 3/8] Hostdev-hybrid mode requires a direct linkdev and direct mode.
In this mode the guest contains a Virtual network device along with a SRIOV VF passed through to the guest as a pci device. --- src/conf/domain_conf.c | 38 -- src/conf/domain_conf.h |5 + src/libvirt_private.syms |1 + src/util/pci.c |2 +- src/util/pci.h |2 ++ src/util/virnetdev.c | 40 src/util/virnetdev.h |6 ++ 7 files changed, 91 insertions(+), 3 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d8ab40c..c59ea00 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1022,6 +1022,7 @@ virDomainActualNetDefFree(virDomainActualNetDefPtr def) virDomainHostdevDefClear(def-data.hostdev.def); break; case VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID: +VIR_FREE(def-data.hostdev.linkdev); virDomainHostdevDefClear(def-data.hostdev.def); break; default: @@ -1077,6 +1078,7 @@ void virDomainNetDefFree(virDomainNetDefPtr def) break; case VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID: +VIR_FREE(def-data.hostdev.linkdev); virDomainHostdevDefClear(def-data.hostdev.def); break; @@ -4687,6 +4689,7 @@ virDomainNetDefParseXML(virCapsPtr caps, char *mode = NULL; char *linkstate = NULL; char *addrtype = NULL; +char *pfname = NULL; virNWFilterHashTablePtr filterparams = NULL; virDomainActualNetDefPtr actual = NULL; xmlNodePtr oldnode = ctxt-node; @@ -5024,6 +5027,27 @@ virDomainNetDefParseXML(virCapsPtr caps, hostdev, flags) 0) { goto error; } + +if (hostdev-source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) { +if (virNetDevGetPhysicalFunctionFromVfPciAddr(hostdev-source.subsys.u.pci.domain, + hostdev-source.subsys.u.pci.bus, + hostdev-source.subsys.u.pci.slot, + hostdev-source.subsys.u.pci.function, + pfname) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Could not get Physical Function of the hostdev)); +goto error; +} +} +if (pfname != NULL) +def-data.hostdev.linkdev = strdup(pfname); +else { +virReportError(VIR_ERR_INTERNAL_ERROR, + _(Linkdev is required in %s mode), + virDomainNetTypeToString(def-type)); +goto error; +} +def-data.hostdev.mode = 
VIR_NETDEV_MACVLAN_MODE_BRIDGE; break; case VIR_DOMAIN_NET_TYPE_USER: @@ -14664,11 +14688,16 @@ virDomainNetGetActualDirectDev(virDomainNetDefPtr iface) { if (iface-type == VIR_DOMAIN_NET_TYPE_DIRECT) return iface-data.direct.linkdev; +if (iface-type == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) +return iface-data.hostdev.linkdev; if (iface-type != VIR_DOMAIN_NET_TYPE_NETWORK) return NULL; if (!iface-data.network.actual) return NULL; -return iface-data.network.actual-data.direct.linkdev; +if (iface-data.network.actual-type == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) +return iface-data.network.actual-data.hostdev.linkdev; +else +return iface-data.network.actual-data.direct.linkdev; } int @@ -14676,11 +14705,16 @@ virDomainNetGetActualDirectMode(virDomainNetDefPtr iface) { if (iface-type == VIR_DOMAIN_NET_TYPE_DIRECT) return iface-data.direct.mode; +if (iface-type == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) +return iface-data.hostdev.mode; if (iface-type != VIR_DOMAIN_NET_TYPE_NETWORK) return 0; if (!iface-data.network.actual) return 0; -return iface-data.network.actual-data.direct.mode; +if (iface-data.network.actual-type == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) +return iface-data.network.actual-data.hostdev.mode; +else +return iface-data.network.actual-data.direct.mode; } virDomainHostdevDefPtr diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 156eb32..171dd70 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -44,6 +44,7 @@ # include virnetdevopenvswitch.h # include virnetdevbandwidth.h # include virnetdevvlan.h +# include virnetdev.h # include virobject.h # include device_conf.h @@ -778,6 +779,8 @@ struct _virDomainActualNetDef { int mode; /* enum virMacvtapMode from util/macvtap.h */ } direct; struct { +char *linkdev; +int mode; virDomainHostdevDef def; } hostdev; } data; @@ -833,6 +836,8 @@ struct _virDomainNetDef { int mode; /* enum virMacvtapMode from
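The accessor changes in this patch make `virDomainNetGetActualDirectDev` consult the `hostdev` member of the union when the interface, or the actual device allocated from a network, is of the hybrid type. The lookup order can be sketched as follows. The structures are hypothetical, heavily reduced versions of the domain configuration types, and `net_get_actual_direct_dev` is an illustrative name.

```c
#include <stddef.h>

enum net_type {
    NET_TYPE_NETWORK,
    NET_TYPE_DIRECT,
    NET_TYPE_HOSTDEV_HYBRID,
    NET_TYPE_USER
};

/* Hypothetical, reduced "actual" device chosen from a network. */
struct actual_net_def {
    enum net_type type;
    union {
        struct { const char *linkdev; int mode; } direct;
        struct { const char *linkdev; int mode; } hostdev;
    } data;
};

/* Hypothetical, reduced interface definition. */
struct net_def {
    enum net_type type;
    union {
        struct { const char *linkdev; int mode; } direct;
        struct { const char *linkdev; int mode; } hostdev;
        struct { struct actual_net_def *actual; } network;
    } data;
};

/* Returns the link device for direct or hybrid interfaces, else NULL. */
const char *net_get_actual_direct_dev(const struct net_def *iface)
{
    if (iface->type == NET_TYPE_DIRECT)
        return iface->data.direct.linkdev;
    if (iface->type == NET_TYPE_HOSTDEV_HYBRID)
        return iface->data.hostdev.linkdev;
    if (iface->type != NET_TYPE_NETWORK)
        return NULL;
    if (!iface->data.network.actual)
        return NULL;
    if (iface->data.network.actual->type == NET_TYPE_HOSTDEV_HYBRID)
        return iface->data.network.actual->data.hostdev.linkdev;
    return iface->data.network.actual->data.direct.linkdev;
}
```

Reading the union member that matches the type tag matters here because the hybrid variant carries additional fields alongside `linkdev` and `mode`.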
[libvirt] [PATCH 4/8] ActualParent is used to store the information about the NETDEV.
The parent type for hostdev hybrid needs to be VIR_DOMAIN_DEVICE_NONE as the device is passed into the guest as a PCI Device. In order to store the information of the NETDEV that is the parent of the HOSTDEV in question we use a new variable actualParent. This variable also helps during VF MAC address, vlan and virtportprofile configuration. ActualParent = Parent in case of forward mode=hostdev --- src/conf/domain_conf.c |9 +++ src/conf/domain_conf.h |1 + src/qemu/qemu_hostdev.c | 152 +-- src/qemu/qemu_hotplug.c |2 +- 4 files changed, 118 insertions(+), 46 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c59ea00..52c00db 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4587,6 +4587,8 @@ virDomainActualNetDefParseXML(xmlNodePtr node, hostdev-parent.type = VIR_DOMAIN_DEVICE_NET; hostdev-parent.data.net = parent; +hostdev-actualParent.type = VIR_DOMAIN_DEVICE_NET; +hostdev-actualParent.data.net = parent; hostdev-info = parent-info; /* The helper function expects type to already be found and * passed in as a string, since it is in a different place in @@ -4607,6 +4609,9 @@ virDomainActualNetDefParseXML(xmlNodePtr node, virDomainHostdevDefPtr hostdev = actual-data.hostdev.def; hostdev-parent.type = VIR_DOMAIN_DEVICE_NONE; +hostdev-actualParent.type = VIR_DOMAIN_DEVICE_NET; +hostdev-actualParent.data.net = parent; + if (VIR_ALLOC(hostdev-info) 0) { virReportOOMError(); goto error; @@ -4990,6 +4995,8 @@ virDomainNetDefParseXML(virCapsPtr caps, hostdev = def-data.hostdev.def; hostdev-parent.type = VIR_DOMAIN_DEVICE_NET; hostdev-parent.data.net = def; +hostdev-actualParent.type = VIR_DOMAIN_DEVICE_NET; +hostdev-actualParent.data.net = def; hostdev-info = def-info; /* The helper function expects type to already be found and * passed in as a string, since it is in a different place in @@ -5011,6 +5018,8 @@ virDomainNetDefParseXML(virCapsPtr caps, case VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID: hostdev = def-data.hostdev.def; 
hostdev-parent.type = VIR_DOMAIN_DEVICE_NONE; +hostdev-actualParent.type = VIR_DOMAIN_DEVICE_NET; +hostdev-actualParent.data.net = def; if (VIR_ALLOC(hostdev-info) 0) { virReportOOMError(); goto error; diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 171dd70..adbb777 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -377,6 +377,7 @@ struct _virDomainHostdevSubsys { /* basic device for direct passthrough */ struct _virDomainHostdevDef { virDomainDeviceDef parent; /* higher level Def containing this */ +virDomainDeviceDef actualParent; /*used only in the case of hybrid hostdev*/ int mode; /* enum virDomainHostdevMode */ unsigned int managed : 1; unsigned int ephemeral : 1; diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 46c84b5..a060a7e 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -320,43 +320,87 @@ qemuDomainHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, if (qemuDomainHostdevNetDevice(hostdev, linkdev, vf) 0) return ret; -vlan = virDomainNetGetActualVlan(hostdev-parent.data.net); -virtPort = virDomainNetGetActualVirtPortProfile( - hostdev-parent.data.net); -if (virtPort) { -if (vlan) { -virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _(direct setting of the vlan tag is not allowed - for hostdev devices using %s mode), - virNetDevVPortTypeToString(virtPort-virtPortType)); -goto cleanup; -} -ret = qemuDomainHostdevNetConfigVirtPortProfile(linkdev, vf, -virtPort, hostdev-parent.data.net-mac, uuid, -port_profile_associate); -} else { -/* Set only mac and vlan */ -if (vlan) { -if (vlan-nTags != 1 || vlan-trunk) { -virReportError(VIR_ERR_CONFIG_UNSUPPORTED, %s, - _(vlan trunking is not supported - by SR-IOV network devices)); +if (hostdev-parent.data.net) { +vlan = virDomainNetGetActualVlan(hostdev-parent.data.net); +virtPort = virDomainNetGetActualVirtPortProfile(hostdev-parent.data.net); +if (virtPort) { +if (vlan) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _(direct setting of 
the vlan tag is not allowed + for hostdev devices using %s mode), + virNetDevVPortTypeToString(virtPort-virtPortType));
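With this patch a hybrid hostdev has `parent.type` set to NONE while `actualParent` still points at the owning network device, and the configuration code checks `parent` first and `actualParent` second. A minimal sketch of that fallback, using hypothetical reduced types and a hypothetical helper `hostdev_get_net`:

```c
#include <stddef.h>

enum device_type { DEVICE_NONE = 0, DEVICE_NET };

/* Hypothetical, reduced network device definition. */
struct net_def {
    const char *mac;
};

/* Hypothetical reference to an owning device. */
struct device_ref {
    enum device_type type;
    struct net_def *net;
};

/* Hypothetical, reduced hostdev: parent is NONE for hybrid devices, while
 * actual_parent always records the owning network device. */
struct hostdev_def {
    struct device_ref parent;
    struct device_ref actual_parent;
};

/* Returns the network device that supplies MAC, vlan and virtualport
 * settings for this hostdev, or NULL if it has none. */
struct net_def *hostdev_get_net(const struct hostdev_def *hostdev)
{
    if (hostdev->parent.type == DEVICE_NET && hostdev->parent.net)
        return hostdev->parent.net;
    if (hostdev->actual_parent.type == DEVICE_NET && hostdev->actual_parent.net)
        return hostdev->actual_parent.net;
    return NULL;
}
```

A single lookup helper of this kind is one way to avoid repeating the same configuration logic once per branch.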
[libvirt] make check failure
Hello All, I wanted to ask a question regarding the tests that are run during make check. If a particular test fails when running make check, how do we know which test failed and why? Is there a log that helps when debugging such errors? -- Many Thanks, Regards, Shradha Shah -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 07/15] RNG updates, new xml parser/formatter code to support interface type=hostdev-hybrid
I will be rebasing this remaining patch series this week.

Many Thanks,
Regards,
Shradha Shah

On 08/21/2012 11:34 PM, Eric Blake wrote:
> On 08/10/2012 10:24 AM, Shradha Shah wrote:
>> This patch introduces the new interface type='hostdev-hybrid' along
>> with attribute managed
>> Includes updates to the domain RNG and new xml parser/formatter code.
>> ---
>
> I'm assuming Laine will resume reviewing this series. But in the
> meantime, it may help if you rebase the remaining patches, since a lot
> has gone in since you first posted.
>
>> +++ b/docs/formatdomain.html.in
>> @@ -2504,6 +2504,20 @@
>>  guest instead of &lt;interface type='hostdev'/&gt;.
>>  </p>
>> +<p>
>> + Libvirt later than 0.9.13 also supports intelligent passthrough
>
> Rather than using 'later than 0.9.13', I'd use
> <span class="since">since 0.10.0</span>.
Re: [libvirt] [PATCH 6/7] Forward Mode Hostdev network driver Implementation
> I'll be posting a proposed replacement patch for this one in a few
> minutes. Please try it out as soon as possible. I haven't gone through
> 7/7 yet, but with the small changes I've squashed in elsewhere, plus the
> 3.5/7 I posted and the 6/7 refactor I'm about to post, definitely the
> first 6 are ready to push.

Laine,

Sorry for the errors that have been caused in this patch and thank you for refactoring the code. I have tested the patches 3.5 and 6 that you have sent and everything works fine. Do you want me to post another iteration of the patch series?

> (By the way, a general note - I've been running make check and make
> syntax-check after applying each of your patches in succession, and came
> up with a *lot* of errors (see my add-on patch that I want to squash
> into 3/7). In particular, you seem to end up with a lot of lines that
> have trailing blanks - if you run make syntax-check on each patch, it
> will catch those. There are occasional slipups, but in general every
> patch is supposed to pass both make check and make syntax-check (and for
> a patch series, both of those should complete after applying each patch,
> not just after the end of the series).

Sorry I did not realize I had to perform these checks as well. I will make sure I perform these checks henceforth.
[libvirt] [PATCH 2/7] network: helper function to create interface pool from PF
Existing code that creates a list of forwardIfs from a single PF
was moved to the new utility function networkCreateInterfacePool.
No functional change.

Signed-off-by: Shradha Shah <ss...@solarflare.com>
---
 src/network/bridge_driver.c | 82 ++-
 1 files changed, 50 insertions(+), 32 deletions(-)

diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index 474bbfa..ff17c7f 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -2771,6 +2771,55 @@ int networkRegister(void) {
  * backend function table.
  */
 
+/* networkCreateInterfacePool:
+ * @netdef: the original NetDef from the network
+ *
+ * Creates an implicit interface pool of VF's when a PF dev is given
+ */
+static int
+networkCreateInterfacePool(virNetworkDefPtr netdef) {
+    unsigned int num_virt_fns = 0;
+    char **vfname = NULL;
+    int ret = -1, ii = 0;
+
+    if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
+                                      &vfname, &num_virt_fns)) < 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Could not get Virtual functions on %s"),
+                       netdef->forwardPfs->dev);
+        goto finish;
+    }
+
+    if (num_virt_fns == 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("No Vf's present on SRIOV PF %s"),
+                       netdef->forwardPfs->dev);
+        goto finish;
+    }
+
+    if ((VIR_ALLOC_N(netdef->forwardIfs, num_virt_fns)) < 0) {
+        virReportOOMError();
+        goto finish;
+    }
+
+    netdef->nForwardIfs = num_virt_fns;
+
+    for (ii = 0; ii < netdef->nForwardIfs; ii++) {
+        netdef->forwardIfs[ii].dev = strdup(vfname[ii]);
+        if (!netdef->forwardIfs[ii].dev) {
+            virReportOOMError();
+            goto finish;
+        }
+    }
+
+    ret = 0;
+finish:
+    for (ii = 0; ii < num_virt_fns; ii++)
+        VIR_FREE(vfname[ii]);
+    VIR_FREE(vfname);
+    return ret;
+}
+
 /* networkAllocateActualDevice:
  * @iface: the original NetDef from the domain
  *
@@ -2793,8 +2842,6 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
     virNetDevVPortProfilePtr virtport = iface->virtPortProfile;
     virNetDevVlanPtr vlan = NULL;
     virNetworkForwardIfDefPtr dev = NULL;
-    unsigned int num_virt_fns = 0;
-    char **vfname = NULL;
     int ii;
     int ret = -1;
 
@@ -2969,35 +3016,9 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
              */
             if (netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) {
                 if ((netdef->nForwardPfs > 0) && (netdef->nForwardIfs <= 0)) {
-                    if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
-                                                      &vfname, &num_virt_fns)) < 0) {
-                        virReportError(VIR_ERR_INTERNAL_ERROR,
-                                       _("Could not get Virtual functions on %s"),
-                                       netdef->forwardPfs->dev);
+                    if ((networkCreateInterfacePool(netdef)) < 0) {
                         goto error;
                     }
-
-                    if (num_virt_fns == 0) {
-                        virReportError(VIR_ERR_INTERNAL_ERROR,
-                                       _("No Vf's present on SRIOV PF %s"),
-                                       netdef->forwardPfs->dev);
-                        goto error;
-                    }
-
-                    if ((VIR_ALLOC_N(netdef->forwardIfs, num_virt_fns)) < 0) {
-                        virReportOOMError();
-                        goto error;
-                    }
-
-                    netdef->nForwardIfs = num_virt_fns;
-
-                    for (ii = 0; ii < netdef->nForwardIfs; ii++) {
-                        netdef->forwardIfs[ii].dev = strdup(vfname[ii]);
-                        if (!netdef->forwardIfs[ii].dev) {
-                            virReportOOMError();
-                            goto error;
-                        }
-                    }
                 }
 
                 /* pick first dev with 0 connections */
@@ -3105,9 +3126,6 @@ validate:
 
     ret = 0;
 cleanup:
-    for (ii = 0; ii < num_virt_fns; ii++)
-        VIR_FREE(vfname[ii]);
-    VIR_FREE(vfname);
     if (network)
         virNetworkObjUnlock(network);
     return ret;
-- 
1.7.4.4
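The helper introduced by this patch turns a list of VF names into an interface pool, duplicating each name and releasing the temporary list on every exit path. The sketch below mirrors that shape with plain C allocation in place of libvirt's `VIR_ALLOC_N` and `VIR_FREE`. The types and the names `pool_create_from_vf_names` and `pool_free` are hypothetical. Unlike the patch, this version frees a partially built pool on failure, which is a design choice of the sketch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified pool entry and pool. */
struct forward_if {
    char *dev;
};

struct if_pool {
    struct forward_if *ifs;
    size_t nifs;
};

/* Releases every entry and resets the pool to empty. */
void pool_free(struct if_pool *pool)
{
    size_t i;

    for (i = 0; i < pool->nifs; i++)
        free(pool->ifs[i].dev);
    free(pool->ifs);
    pool->ifs = NULL;
    pool->nifs = 0;
}

/* Builds a pool with one entry per VF name.
 * Returns 0 on success, -1 if there are no VFs or allocation fails. */
int pool_create_from_vf_names(struct if_pool *pool,
                              const char *const *vf_names, size_t n)
{
    size_t i;

    if (n == 0)
        return -1;

    /* calloc zeroes the array, so unfilled dev pointers are NULL and
     * pool_free can run safely on a partially built pool. */
    pool->ifs = calloc(n, sizeof(*pool->ifs));
    if (pool->ifs == NULL)
        return -1;
    pool->nifs = n;

    for (i = 0; i < n; i++) {
        size_t len = strlen(vf_names[i]) + 1;

        pool->ifs[i].dev = malloc(len);
        if (pool->ifs[i].dev == NULL) {
            pool_free(pool);
            return -1;
        }
        memcpy(pool->ifs[i].dev, vf_names[i], len);
    }
    return 0;
}
```

Zero-initialising the array before filling it is what makes the single cleanup routine safe to call from the middle of the loop.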
[libvirt] [PATCH 1/7] conf: move DevicePCIAddress functions to separate file
Move the functions that parse, format, and validate PCI addresses to their own file so they can be conveniently used in places other than domain_conf.c. This refactors existing code, without functional changes, to prepare for new code. This patch makes the code reusable. Signed-off-by: Shradha Shah ss...@solarflare.com --- include/libvirt/virterror.h |1 + src/Makefile.am |7 ++- src/conf/device_conf.c | 131 ++ src/conf/device_conf.h | 65 + src/conf/domain_conf.c | 114 + src/conf/domain_conf.h | 25 +--- src/libvirt_private.syms | 10 ++- src/qemu/qemu_command.c | 13 ++-- src/qemu/qemu_hotplug.c |7 +- src/qemu/qemu_monitor.c | 14 ++-- src/qemu/qemu_monitor.h | 17 +++--- src/qemu/qemu_monitor_json.c | 14 ++-- src/qemu/qemu_monitor_json.h | 14 ++-- src/qemu/qemu_monitor_text.c | 16 +++--- src/qemu/qemu_monitor_text.h | 14 ++-- src/util/virterror.c |3 +- src/xen/xend_internal.c |3 +- 17 files changed, 287 insertions(+), 181 deletions(-) diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h index 913fc5d..d0af43d 100644 --- a/include/libvirt/virterror.h +++ b/include/libvirt/virterror.h @@ -110,6 +110,7 @@ typedef enum { VIR_FROM_AUTH = 46, /* Error from auth handling */ VIR_FROM_DBUS = 47, /* Error from DBus */ VIR_FROM_PARALLELS = 48,/* Error from Parallels */ +VIR_FROM_DEVICE = 49, /* Error from Device */ # ifdef VIR_ENUM_SENTINELS VIR_ERR_DOMAIN_LAST diff --git a/src/Makefile.am b/src/Makefile.am index b5f8056..ad24534 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -199,6 +199,10 @@ CPU_CONF_SOURCES = \ CONSOLE_CONF_SOURCES = \ conf/virconsole.c conf/virconsole.h +# Device Helper APIs +DEVICE_CONF_SOURCES = \ + conf/device_conf.c conf/device_conf.h + CONF_SOURCES = \ $(NETDEV_CONF_SOURCES) \ $(DOMAIN_CONF_SOURCES) \ @@ -211,7 +215,8 @@ CONF_SOURCES = \ $(INTERFACE_CONF_SOURCES) \ $(SECRET_CONF_SOURCES) \ $(CPU_CONF_SOURCES) \ - $(CONSOLE_CONF_SOURCES) + $(CONSOLE_CONF_SOURCES) \ + $(DEVICE_CONF_SOURCES) # The remote RPC driver, covering 
domains, storage, networks, etc REMOTE_DRIVER_GENERATED = \ diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c new file mode 100644 index 000..ca600c5 --- /dev/null +++ b/src/conf/device_conf.c @@ -0,0 +1,131 @@ +/* + * device_conf.c: device XML handling + * + * Copyright (C) 2006-2012 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Author: Daniel P. 
Berrange berra...@redhat.com + */ + +#include config.h +#include virterror_internal.h +#include datatypes.h +#include memory.h +#include xml.h +#include uuid.h +#include util.h +#include buf.h +#include device_conf.h + +#define VIR_FROM_THIS VIR_FROM_DEVICE + +VIR_ENUM_IMPL(virDeviceAddressPciMulti, + VIR_DEVICE_ADDRESS_PCI_MULTI_LAST, + default, + on, + off) + +int virDevicePCIAddressIsValid(virDevicePCIAddressPtr addr) +{ +/* PCI bus has 32 slots and 8 functions per slot */ +if (addr-slot = 32 || addr-function = 8) +return 0; +return addr-domain || addr-bus || addr-slot; +} + + +int +virDevicePCIAddressParseXML(xmlNodePtr node, +virDevicePCIAddressPtr addr) +{ +char *domain, *slot, *bus, *function, *multi; +int ret = -1; + +memset(addr, 0, sizeof(*addr)); + +domain = virXMLPropString(node, domain); +bus = virXMLPropString(node, bus); +slot = virXMLPropString(node, slot); +function = virXMLPropString(node, function); +multi
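The validity check moved into device_conf.c rejects out-of-range slot and function numbers and treats an address as unset when domain, bus and slot are all zero. A minimal sketch of that rule, assuming the stripped comparison operators in the excerpt were `>=`, and using a hypothetical reduced `pci_address` type:

```c
/* Hypothetical, simplified PCI address. */
struct pci_address {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int function;
};

/* Returns nonzero if the address is in range and not all-zero.
 * A PCI bus has 32 slots and 8 functions per slot. */
int pci_address_is_valid(const struct pci_address *addr)
{
    if (addr->slot >= 32 || addr->function >= 8)
        return 0;
    /* As in the excerpt, the function number alone does not make an
     * address count as set. */
    return addr->domain || addr->bus || addr->slot;
}
```

Note that an address with only the function field set is reported as invalid by this rule, which follows the excerpt as read here.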
[libvirt] [PATCH 6/7] Forward Mode Hostdev network driver Implementation
This patch updates the network driver to properly utilize the new attributes/elements that are now in virNetworkDef Signed-off-by: Shradha Shah ss...@solarflare.com --- docs/formatnetwork.html.in | 62 ++ src/network/bridge_driver.c | 282 ++- 2 files changed, 284 insertions(+), 60 deletions(-) diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in index ed9f7a9..5ab5f3c 100644 --- a/docs/formatnetwork.html.in +++ b/docs/formatnetwork.html.in @@ -223,6 +223,37 @@ (usually either a domain start, or a hotplug interface attach to a domain).span class=sinceSince 0.9.4/span /dd + dtcodehostdev/code/dt + dd +This network facilitates PCI Passthrough of a network device. +A network device is chosen from the interface pool and +directly assigned to the guest using generic device +passthrough, after first optionally setting the device's MAC +address to the configured value, and optionally associating +the device with an 802.1Qbh capable switch using an optionally +specified codelt;virtualportgt;/code element. +Note that - due to limitations in standard single-port PCI +ethernet card driver design - only SR-IOV (Single Root I/O +Virtualization) virtual function (VF) devices can be assigned +in this manner; to assign a standard single-port PCI or PCIe +ethernet card to a guest, use the traditional codelt; +hostdevgt;/code device definition. span class=since +Since 0.10.0/span + +pNote that this intelligent passthrough of network devices is +very similar to the functionality of a standard codelt; +hostdevgt;/code device, the difference being that this +method allows specifying a MAC address and codelt;virtualport +gt;/code for the passed-through device. 
If these capabilities +are not required, if you have a standard single-port PCI, PCIe, +or USB network card that doesn't support SR-IOV (and hence would +anyway lose the configured MAC address during reset after being +assigned to the guest domain), or if you are using a version of +libvirt older than 0.10.0, you should use a standard +codelt;hostdevgt;/code definition to assign the device to +the guest instead of codelt;forward mode='hostdev'/gt;/code. +/p + /dd /dl As mentioned above, a codelt;forwardgt;/code element can have multiple codelt;interfacegt;/code subelements, each @@ -272,6 +303,37 @@ particular, 'passthrough' mode, and 'private' mode when using 802.1Qbh), libvirt will choose an unused physical interface or, if it can't find an unused interface, fail the operation./p + +span class=sincesince 0.9.12/span and when using forward mode +'hostdev' we specify the interface pool by using the +codelt;addressgt;/code element and codelt; +typegt;/code codelt;domaingt;/code codelt;busgt;/code +codelt;slotgt;/code and codelt;functiongt;/code +sub-elements. + +pre +... + lt;forward mode='hostdev' managed='yes'gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='1'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='2'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='3'/gt; + lt;/forwardgt; +... +/pre + +Alternatively the interface pool can also be defined using a +single physical function codelt;pfgt;/code subelement to +call out the corresponding physical interface associated with +multiple virtual interfaces (similar to passthrough mode): + +pre +... + lt;forward mode='hostdev' managed='yes'gt; +lt;pf dev='eth0'/gt; + lt;/forwardgt; +... 
+/pre + /dd /dl h5a name=elementQoSQuality of service/a/h5 diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 38f6d12..065af3e 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -47,6 +47,7 @@ #include datatypes.h #include bridge_driver.h #include network_conf.h +#include device_conf.h #include driver.h #include buf.h #include virpidfile.h @@ -1935,7 +1936,7 @@ networkStartNetworkExternal(struct network_driver *driver ATTRIBUTE_UNUSED, virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs to be done each time a network of - * type BRIDGE, PRIVATE, VEPA, or PASSTHROUGH is started. On + * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is started. On * failure, undo anything you've done, and return -1. On success * return 0. */ @@ -1946,7 +1947,7 @@ static int
[libvirt] [PATCH 5/7] Add function virDevicePCIAddressEqual
This function is needed by the network driver in a later commit.
It is useful in functions such as networkNotifyActualDevice and
networkReleaseActualDevice. It returns 0 when the two addresses
match and -1 otherwise.
---
 src/conf/device_conf.c   |   16 ++++++++++++++++
 src/conf/device_conf.h   |    3 +++
 src/libvirt_private.syms |    1 +
 3 files changed, 20 insertions(+), 0 deletions(-)

diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c
index ca600c5..8edcc0a 100644
--- a/src/conf/device_conf.c
+++ b/src/conf/device_conf.c
@@ -129,3 +129,19 @@ virDevicePCIAddressFormat(virBufferPtr buf,
                       addr.function);
     return 0;
 }
+
+int
+virDevicePCIAddressEqual(virDevicePCIAddress addr1,
+                         virDevicePCIAddress addr2)
+{
+    int ret = -1;
+
+    if (addr1.domain == addr2.domain &&
+        addr1.bus == addr2.bus &&
+        addr1.slot == addr2.slot &&
+        addr1.function == addr2.function) {
+        ret = 0;
+    }
+
+    return ret;
+}
diff --git a/src/conf/device_conf.h b/src/conf/device_conf.h
index c679bce..7c4d356 100644
--- a/src/conf/device_conf.h
+++ b/src/conf/device_conf.h
@@ -59,6 +59,9 @@ int virDevicePCIAddressFormat(virBufferPtr buf,
                               virDevicePCIAddress addr,
                               bool includeTypeInAddr);

+int virDevicePCIAddressEqual(virDevicePCIAddress addr1,
+                             virDevicePCIAddress addr2);
+

 VIR_ENUM_DECL(virDeviceAddressPciMulti)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 1f32f8e..063d0bc 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -224,6 +224,7 @@ virDeviceAddressPciMultiTypeToString;
 virDevicePCIAddressIsValid;
 virDevicePCIAddressParseXML;
 virDevicePCIAddressFormat;
+virDevicePCIAddressEqual;

 # dnsmasq.h
 dnsmasqAddDhcpHost;
-- 
1.7.4.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
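For readers who want to experiment with the comparison outside libvirt, here is a minimal sketch. `struct pci_addr` and `pci_addr_equal` are hypothetical names, not libvirt API, and the sketch returns a bool, whereas the patch itself returns 0 on a match and -1 otherwise (so a caller of the patch's function has to test `== 0`, not truthiness).

```c
#include <stdbool.h>

/* Hypothetical stand-in for virDevicePCIAddress. */
struct pci_addr {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int function;
};

/* True when all four address fields match. Unlike the patch, which
 * returns 0 for "equal" and -1 for "different", this returns a bool. */
static bool pci_addr_equal(const struct pci_addr *a, const struct pci_addr *b)
{
    return a->domain == b->domain &&
           a->bus == b->bus &&
           a->slot == b->slot &&
           a->function == b->function;
}
```

Comparing field by field, as the patch does, avoids depending on padding bytes or on unrelated members of the real structure, which a memcmp over the whole struct would also compare.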
[libvirt] [PATCH 3/7] RNG updates, new xml parser/formatter code to support forward mode=hostdev
This patch introduces the new forward mode='hostdev' along with attribute managed Includes updates to the network RNG and new xml parser/formatter code. Signed-off-by: Shradha Shah ss...@solarflare.com --- docs/schemas/basictypes.rng| 46 +++ docs/schemas/domaincommon.rng | 44 --- docs/schemas/network.rng | 53 ++--- src/conf/network_conf.c| 130 +++- src/conf/network_conf.h| 26 ++- src/network/bridge_driver.c| 18 ++-- tests/networkxml2xmlin/hostdev-pf.xml |7 ++ tests/networkxml2xmlin/hostdev.xml | 10 +++ tests/networkxml2xmlout/hostdev-pf.xml |7 ++ tests/networkxml2xmlout/hostdev.xml| 10 +++ tests/networkxml2xmltest.c |2 + 11 files changed, 266 insertions(+), 87 deletions(-) diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng index 9dbda4a..766f9a0 100644 --- a/docs/schemas/basictypes.rng +++ b/docs/schemas/basictypes.rng @@ -54,6 +54,31 @@ /choice /define + define name=pciaddress +optional + attribute name=domain +ref name=pciDomain/ + /attribute +/optional +attribute name=bus + ref name=pciBus/ +/attribute +attribute name=slot + ref name=pciSlot/ +/attribute +attribute name=function + ref name=pciFunc/ +/attribute +optional + attribute name=multifunction +choice + valueon/value + valueoff/value +/choice + /attribute +/optional + /define + !-- a 6 byte MAC address in ASCII-hex format, eg 12:34:56:78:9A:BC -- !-- The lowest bit of the 1st byte is the multicast bit. 
a -- !-- uniMacAddr requires that bit to be 0, and a multiMacAddr -- @@ -167,4 +192,25 @@ ref name='unsignedLong'/ /define + define name=pciDomain +data type=string + param name=pattern(0x)?[0-9a-fA-F]{1,4}/param +/data + /define + define name=pciBus +data type=string + param name=pattern(0x)?[0-9a-fA-F]{1,2}/param +/data + /define + define name=pciSlot +data type=string + param name=pattern(0x)?[0-1]?[0-9a-fA-F]/param +/data + /define + define name=pciFunc +data type=string + param name=pattern(0x)?[0-7]/param +/data + /define + /grammar diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 4903ca6..35e9f82 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -2652,30 +2652,6 @@ /attribute /optional /define - define name=pciaddress -optional - attribute name=domain -ref name=pciDomain/ - /attribute -/optional -attribute name=bus - ref name=pciBus/ -/attribute -attribute name=slot - ref name=pciSlot/ -/attribute -attribute name=function - ref name=pciFunc/ -/attribute -optional - attribute name=multifunction -choice - valueon/value - valueoff/value -/choice - /attribute -/optional - /define define name=driveaddress optional attribute name=controller @@ -3376,26 +3352,6 @@ param name=pattern((0x)?[0-9a-fA-F]{1,3}\.){0,3}(0x)?[0-9a-fA-F]{1,3}/param /data /define - define name=pciDomain -data type=string - param name=pattern(0x)?[0-9a-fA-F]{1,4}/param -/data - /define - define name=pciBus -data type=string - param name=pattern(0x)?[0-9a-fA-F]{1,2}/param -/data - /define - define name=pciSlot -data type=string - param name=pattern(0x)?[0-1]?[0-9a-fA-F]/param -/data - /define - define name=pciFunc -data type=string - param name=pattern(0x)?[0-7]/param -/data - /define define name=driveController data type=string param name=pattern[0-9]{1,2}/param diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index e55105a..4abfd91 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -87,22 
+87,51 @@ valuepassthrough/value valueprivate/value valuevepa/value + valuehostdev/value +/choice + /attribute +/optional + +optional + attribute name=managed +choice + valueyes/value + valueno/value /choice /attribute /optional interleave - zeroOrMore -element name='interface' - attribute name='dev' -ref name='deviceName'/ - /attribute - optional -attribute name=connections - data type=unsignedInt/ -/attribute - /optional -/element - /zeroOrMore + choice
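The pciDomain, pciBus, pciSlot and pciFunc patterns moved into basictypes.rng above can be checked by hand. What follows is a rough sketch using POSIX regular expressions, not how libvirt validates its XML; the macro names and `matches_whole` are hypothetical. RELAX NG `pattern` facets match the whole value implicitly, so the sketch adds explicit `^(...)$` anchors.

```c
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Patterns copied from the RNG definitions in the patch. */
#define PCI_DOMAIN_PATTERN "(0x)?[0-9a-fA-F]{1,4}"
#define PCI_BUS_PATTERN    "(0x)?[0-9a-fA-F]{1,2}"
#define PCI_SLOT_PATTERN   "(0x)?[0-1]?[0-9a-fA-F]"
#define PCI_FUNC_PATTERN   "(0x)?[0-7]"

/* Returns true when value matches pattern in its entirety. */
static bool matches_whole(const char *pattern, const char *value)
{
    char anchored[128];
    regex_t re;
    bool ok;

    /* RNG patterns are implicitly anchored; POSIX ERE needs ^ and $. */
    snprintf(anchored, sizeof(anchored), "^(%s)$", pattern);
    if (regcomp(&re, anchored, REG_EXTENDED | REG_NOSUB) != 0)
        return false;
    ok = regexec(&re, value, 0, NULL, 0) == 0;
    regfree(&re);
    return ok;
}
```

The slot pattern admits 0x00 through 0x1f and the function pattern 0 through 7, which lines up with the "32 slots and 8 functions per slot" limit enforced in device_conf.c.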
[libvirt] [PATCH 4/7] Code to return interface name or pci_addr of the VF in actualDevice
The network pool should be able to keep track of both network device
names and PCI addresses, and return the appropriate one in the
actualDevice when networkAllocateActualDevice is called.

Signed-off-by: Shradha Shah <ss...@solarflare.com>
---
 src/network/bridge_driver.c |   60 +--
 src/util/virnetdev.c        |   25 -
 src/util/virnetdev.h        |    4 ++-
 3 files changed, 50 insertions(+), 39 deletions(-)

diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index ddd66e5..38f6d12 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -59,6 +59,7 @@
 #include "dnsmasq.h"
 #include "configmake.h"
 #include "virnetdev.h"
+#include "pci.h"
 #include "virnetdevbridge.h"
 #include "virnetdevtap.h"
 #include "virnetdevvportprofile.h"
@@ -2780,10 +2781,11 @@ static int
 networkCreateInterfacePool(virNetworkDefPtr netdef) {
     unsigned int num_virt_fns = 0;
     char **vfname = NULL;
+    struct pci_config_address **virt_fns;
     int ret = -1, ii = 0;

     if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
-                                      &vfname, &num_virt_fns)) < 0) {
+                                      &vfname, &virt_fns, &num_virt_fns)) < 0) {
         virReportError(VIR_ERR_INTERNAL_ERROR,
                        _("Could not get Virtual functions on %s"),
                        netdef->forwardPfs->dev);
@@ -2805,18 +2807,34 @@ networkCreateInterfacePool(virNetworkDefPtr netdef) {
     netdef->nForwardIfs = num_virt_fns;

     for (ii = 0; ii < netdef->nForwardIfs; ii++) {
-        netdef->forwardIfs[ii].device.dev = strdup(vfname[ii]);
-        if (!netdef->forwardIfs[ii].device.dev) {
-            virReportOOMError();
-            goto finish;
+        if ((netdef->forwardType == VIR_NETWORK_FORWARD_BRIDGE) ||
+            (netdef->forwardType == VIR_NETWORK_FORWARD_PRIVATE) ||
+            (netdef->forwardType == VIR_NETWORK_FORWARD_VEPA) ||
+            (netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH)) {
+            netdef->forwardIfs[ii].type = VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_NETDEV;
+            if(vfname[ii]) {
+                netdef->forwardIfs[ii].device.dev = strdup(vfname[ii]);
+                if (!netdef->forwardIfs[ii].device.dev) {
+                    virReportOOMError();
+                    goto finish;
+                }
+            }
+            else {
+                virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                               _("Direct mode types requires interface names"));
+                goto finish;
+            }
         }
     }

     ret = 0;
 finish:
-    for (ii = 0; ii < num_virt_fns; ii++)
+    for (ii = 0; ii < num_virt_fns; ii++) {
         VIR_FREE(vfname[ii]);
+        VIR_FREE(virt_fns[ii]);
+    }
     VIR_FREE(vfname);
+    VIR_FREE(virt_fns);
     return ret;
 }

@@ -3008,31 +3026,23 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
         } else {
             /* pick an interface from the pool */

+            if ((netdef->nForwardPfs > 0) && (netdef->nForwardIfs <= 0)) {
+                if ((networkCreateInterfacePool(netdef)) < 0) {
+                    goto error;
+                }
+            }
+
             /* PASSTHROUGH mode, and PRIVATE Mode + 802.1Qbh both
              * require exclusive access to a device, so current
              * connections count must be 0.  Other modes can share, so
              * just search for the one with the lowest number of
              * connections.
              */
-            if (netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) {
-                if ((netdef->nForwardPfs > 0) && (netdef->nForwardIfs <= 0)) {
-                    if ((networkCreateInterfacePool(netdef)) < 0) {
-                        goto error;
-                    }
-                }
-
-                /* pick first dev with 0 connections */
-
-                for (ii = 0; ii < netdef->nForwardIfs; ii++) {
-                    if (netdef->forwardIfs[ii].connections == 0) {
-                        dev = &netdef->forwardIfs[ii];
-                        break;
-                    }
-                }
-            } else if ((netdef->forwardType == VIR_NETWORK_FORWARD_PRIVATE) &&
-                       iface->data.network.actual->virtPortProfile &&
-                       (iface->data.network.actual->virtPortProfile->virtPortType
-                        == VIR_NETDEV_VPORT_PROFILE_8021QBH)) {
+            if ((netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) ||
+                ((netdef->forwardType == VIR_NETWORK_FORWARD_PRIVATE) &&
+                 iface->data.network.actual->virtPortProfile &&
+                 (iface->data.network.actual->virtPortProfile->virtPortType
+                  == VIR_NETDEV_VPORT_PROFILE_8021QBH))) {
                 /* pick first dev with 0 connections */

                 for (ii = 0; ii < netdef->nForwardIfs; ii++) {
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
index 25bdf01..f9eba1a
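The comment in networkAllocateActualDevice describes two selection policies: exclusive modes (PASSTHROUGH, and PRIVATE with 802.1Qbh) need a device with zero connections, while the other modes share and take the least-used device. A minimal sketch of the two policies follows; `struct pool_dev`, `pick_exclusive` and `pick_shared` are hypothetical names, and the real virNetworkForwardIfDef carries more state than shown here.

```c
#include <stddef.h>

/* Hypothetical pool entry standing in for one forwardIfs[] element. */
struct pool_dev {
    const char *name;
    int connections;
};

/* Exclusive modes: first device with no connections, or NULL if all are busy. */
static struct pool_dev *pick_exclusive(struct pool_dev *pool, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (pool[i].connections == 0)
            return &pool[i];
    }
    return NULL;
}

/* Shared modes: device with the fewest connections, or NULL for an empty pool. */
static struct pool_dev *pick_shared(struct pool_dev *pool, size_t n)
{
    struct pool_dev *best = NULL;

    for (size_t i = 0; i < n; i++) {
        if (!best || pool[i].connections < best->connections)
            best = &pool[i];
    }
    return best;
}
```

The patch folds the two exclusive cases into a single condition, so both go through the same "first device with 0 connections" loop instead of duplicating it.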
[libvirt] [PATCH 7/7] Forward Mode 'Hostdev' qemu driver implementation
For a network with forward mode='hostdev', the newly minted hostdev
must be added to the hostdevs array. In this case we also call
qemuPrepareHostdevPCIDevices, because qemuPrepareHostDevices has already
been called by the time we get here and are building the qemu
command line.

Signed-off-by: Shradha Shah <ss...@solarflare.com>
---
 src/qemu/qemu_command.c |   39 +-
 1 files changed, 33 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 6f6c6cd..1470edd 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -24,6 +24,7 @@
 #include <config.h>

 #include "qemu_command.h"
+#include "qemu_hostdev.h"
 #include "qemu_capabilities.h"
 #include "qemu_bridge_filter.h"
 #include "cpu/cpu.h"
@@ -5221,12 +5222,38 @@ qemuBuildCommandLine(virConnectPtr conn,

             actualType = virDomainNetGetActualType(net);
             if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
-                /* type='hostdev' interfaces are handled in codepath
-                 * for standard hostdev (NB: when there is a network
-                 * with forward mode='hostdev', there will need to be
-                 * code here that adds the newly minted hostdev to the
-                 * hostdevs array).
-                 */
+                if (net->type == VIR_DOMAIN_NET_TYPE_NETWORK) {
+                    virDomainHostdevDefPtr hostdev = virDomainNetGetActualHostdev(net);
+                    virDomainHostdevDefPtr found;
+                    /* For a network with forward mode='hostdev', there is a need to
+                     * add the newly minted hostdev to the hostdevs array.
+                     */
+                    if (qemuAssignDeviceHostdevAlias(def, hostdev,
+                                                     (def->nhostdevs-1)) < 0) {
+                        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                       _("Could not assign alias to Net Hostdev"));
+                        goto error;
+                    }
+
+                    if (virDomainHostdevFind(def, hostdev, &found) < 0) {
+                        if (virDomainHostdevInsert(def, hostdev) < 0) {
+                            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                           _("Hostdev not inserted into the array"));
+                            goto error;
+                        }
+                        if (qemuPrepareHostdevPCIDevices(driver, def->name, def->uuid,
+                                                         &hostdev, 1) < 0) {
+                            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                           _("Prepare Hostdev PCI Devices failed"));
+                            goto error;
+                        }
+                    }
+                    else {
+                        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                       _("The Hostdev is being used by some other device"));
+                        goto error;
+                    }
+                }
                 continue;
             }
-- 
1.7.4.4
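The control flow in this hunk is "look the device up first, insert only when the lookup fails, and treat a hit as a conflict". A simplified sketch of that pattern follows; `struct hostdev_list`, `hostdev_find` and `hostdev_add_unique` are hypothetical, and plain integer ids stand in for the hostdev definitions that virDomainHostdevFind and virDomainHostdevInsert work on.

```c
#include <stddef.h>

#define MAX_HOSTDEVS 16

/* Hypothetical, simplified device list; ids stand in for hostdev definitions. */
struct hostdev_list {
    unsigned int ids[MAX_HOSTDEVS];
    size_t count;
};

/* Returns the index of id in the list, or -1 when it is absent. */
static int hostdev_find(const struct hostdev_list *list, unsigned int id)
{
    for (size_t i = 0; i < list->count; i++) {
        if (list->ids[i] == id)
            return (int)i;
    }
    return -1;
}

/* Adds id only when it is not already listed.
 * Returns 0 on success, -1 when id is already present or the list is full. */
static int hostdev_add_unique(struct hostdev_list *list, unsigned int id)
{
    if (hostdev_find(list, id) >= 0)
        return -1;              /* already in use by another device */
    if (list->count == MAX_HOSTDEVS)
        return -1;              /* no room left */
    list->ids[list->count++] = id;
    return 0;
}
```

As in the patch, the caller sees an error for a duplicate, which is what lets a device that is already assigned elsewhere be reported instead of silently reused.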
[libvirt] [PATCH 0/7] Forward mode='hostdev' patch series
This patch series supports the forward mode='hostdev'. The functionality
of this mode is the same as interface type='hostdev' but with the added
benefit of using interface pools.

The patch series also contains a patch to support use of interface names
and PCI device addresses interchangeably in a network xml, and return the
appropriate one in actualDevice when networkAllocateActualDevice is called.

At the top level, the managed attribute can be specified, with the same
results as when it is specified for a hostdev.

Currently forward mode='hostdev' does not support USB devices.

Shradha Shah (7):
  conf: move DevicePCIAddress functions to separate file
  network: helper function to create interface pool from PF
  RNG updates, new xml parser/formatter code to support forward
    mode=hostdev
  Code to return interface name or pci_addr of the VF in actualDevice
  Add function virDevicePCIAddressEqual
  Forward Mode Hostdev network driver Implementation
  Forward Mode 'Hostdev' qemu driver implementation

 docs/formatnetwork.html.in             |   62 +
 docs/schemas/basictypes.rng            |   46
 docs/schemas/domaincommon.rng          |   44
 docs/schemas/network.rng               |   53 -
 include/libvirt/virterror.h            |    1 +
 src/Makefile.am                        |    7 +-
 src/conf/device_conf.c                 |  147 +++
 src/conf/device_conf.h                 |   68 ++
 src/conf/domain_conf.c                 |  114 +
 src/conf/domain_conf.h                 |   25 +--
 src/conf/network_conf.c                |  130 +--
 src/conf/network_conf.h                |   26 ++-
 src/libvirt_private.syms               |   11 +-
 src/network/bridge_driver.c            |  414 +++-
 src/qemu/qemu_command.c                |   52 +++-
 src/qemu/qemu_hotplug.c                |    7 +-
 src/qemu/qemu_monitor.c                |   14 +-
 src/qemu/qemu_monitor.h                |   17 +-
 src/qemu/qemu_monitor_json.c           |   14 +-
 src/qemu/qemu_monitor_json.h           |   14 +-
 src/qemu/qemu_monitor_text.c           |   16 +-
 src/qemu/qemu_monitor_text.h           |   14 +-
 src/util/virnetdev.c                   |   25 +-
 src/util/virnetdev.h                   |    4 +-
 src/util/virterror.c                   |    3 +-
 src/xen/xend_internal.c                |    3 +-
 tests/networkxml2xmlin/hostdev-pf.xml  |    7 +
 tests/networkxml2xmlin/hostdev.xml     |   10 +
 tests/networkxml2xmlout/hostdev-pf.xml |    7 +
 tests/networkxml2xmlout/hostdev.xml    |   10 +
 tests/networkxml2xmltest.c             |    2 +
 31 files changed, 976 insertions(+), 391 deletions(-)
 create mode 100644 src/conf/device_conf.c
 create mode 100644 src/conf/device_conf.h
 create mode 100644 tests/networkxml2xmlin/hostdev-pf.xml
 create mode 100644 tests/networkxml2xmlin/hostdev.xml
 create mode 100644 tests/networkxml2xmlout/hostdev-pf.xml
 create mode 100644 tests/networkxml2xmlout/hostdev.xml

-- 
1.7.4.4
Re: [libvirt] [PATCH 06/15] Forward Mode 'Hostdev' qemu driver implementation
On 08/14/2012 07:44 AM, Laine Stump wrote:
> On 08/10/2012 12:24 PM, Shradha Shah wrote:
>
> Some explanation is needed in the commit log of what is being done
> here. A cut-paste of the comment in the code would be a good start (and
> anyway, that comment can be changed since it's talking about "when there
> is a network with forward mode='hostdev'", but that "when" is now :-)
>
>> Signed-off-by: Shradha Shah <ss...@solarflare.com>
>> ---
>>  src/qemu/qemu_command.c |   27 +++
>>  1 files changed, 27 insertions(+), 0 deletions(-)
>>
>> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
>> index 6f6c6cd..bb66364 100644
>> --- a/src/qemu/qemu_command.c
>> +++ b/src/qemu/qemu_command.c
>> @@ -24,6 +24,7 @@
>>  #include <config.h>
>>
>>  #include "qemu_command.h"
>> +#include "qemu_hostdev.h"
>>  #include "qemu_capabilities.h"
>>  #include "qemu_bridge_filter.h"
>>  #include "cpu/cpu.h"
>> @@ -5221,12 +5222,38 @@ qemuBuildCommandLine(virConnectPtr conn,
>>
>>              actualType = virDomainNetGetActualType(net);
>>              if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
>> +                virDomainHostdevDefPtr hostdev = virDomainNetGetActualHostdev(net);
>> +                virDomainHostdevDefPtr found;
>>                  /* type='hostdev' interfaces are handled in codepath
>>                   * for standard hostdev (NB: when there is a network
>>                   * with forward mode='hostdev', there will need to be
>>                   * code here that adds the newly minted hostdev to the
>>                   * hostdevs array).
>>                   */
>> +                if (qemuAssignDeviceHostdevAlias(def,
>> +                                                 hostdev,
>
> Combine the above two lines.
>
>> +                                                 (def->nhostdevs-1)) < 0) {
>> +                    virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>> +                                   _("Could not assign alias to Net Hostdev"));
>> +                    goto error;
>> +                }
>> +
>> +                if (virDomainHostdevFind(def,
>> +                                         hostdev,
>> +                                         &found) < 0) {
>
> If the device is found already on the list, you should log an error and
> fail.

The device will be found on the list when using interface type=hostdev.
If I log an error and fail, wouldn't that mean that interface type=hostdev
will always fail at this point?

>> +                    if (virDomainHostdevInsert(def,
>> +                                               hostdev) < 0) {
>> +                        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>> +                                       _("Hostdev not inserted into the array"));
>> +                        goto error;
>> +                    }
>> +                    if (qemuPrepareHostdevPCIDevices(driver, def->name, def->uuid,
>> +                                                     &hostdev, 1) < 0) {
>> +                        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>> +                                       _("Prepare Hostdev PCI Devices failed"));
>> +                        goto error;
>
> It took me awhile to follow that trail, but I do finally understand that
> this is necessary (because qemuPrepareHostDevices has already been
> called by the time we get to here and are building the qemu commandline).
>
>> +                    }
>> +                }
>>                  continue;
>>              }
>
> ACK with a better commit log message, fixing the comment in the code,
> and logging an error if the device is found already on the hostdev list.
Re: [libvirt] [PATCH 04/15] Code to return interface name or pci_addr of the VF in actualDevice
On 08/14/2012 06:36 AM, Laine Stump wrote: On 08/10/2012 12:23 PM, Shradha Shah wrote: The network pool should be able to keep track of both, network device names nad PCI addresses, and return the appropriate one in the actualDevice when networkAllocateActualDevice is called. Signed-off-by: Shradha Shah ss...@solarflare.com --- src/network/bridge_driver.c | 33 +++-- src/util/virnetdev.c| 25 - src/util/virnetdev.h|4 +++- 3 files changed, 42 insertions(+), 20 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index df3cc25..602e17d 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -59,6 +59,7 @@ #include dnsmasq.h #include configmake.h #include virnetdev.h +#include pci.h #include virnetdevbridge.h #include virnetdevtap.h @@ -2737,10 +2738,11 @@ static int networkCreateInterfacePool(virNetworkDefPtr netdef) { unsigned int num_virt_fns = 0; char **vfname = NULL; +struct pci_config_address **virt_fns; int ret = -1, ii = 0; if ((virNetDevGetVirtualFunctions(netdef-forwardPfs-dev, - vfname, num_virt_fns)) 0) { + vfname, virt_fns, num_virt_fns)) 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _(Could not get Virtual functions on %s), netdef-forwardPfs-dev); @@ -2762,19 +2764,38 @@ networkCreateInterfacePool(virNetworkDefPtr netdef) { netdef-nForwardIfs = num_virt_fns; for (ii = 0; ii netdef-nForwardIfs; ii++) { -netdef-forwardIfs[ii].device.dev = strdup(vfname[ii]); -if (!netdef-forwardIfs[ii].device.dev) { -virReportOOMError(); -goto finish; To be pure in the separation of patches, the following if else should be removed from this patch, with just the contents of the if clause here. Then the if else + body of the else should be added in the next patch. (And at any rate, the if() condition is incorrect here - really that part should happen for all forwardTypes except HOSTDEV (BRIDGE, PRIVATE, and VEPA also require netdev names.) 
I did not include the BRIDGE, PRIVATE and VEPA cases here because the networkCreateInterfacePool function is not called in those cases. Should I still include the conditions for BRIDGE, PRIVATE and VEPA? Aside from that movement of code to the next patch, ACK. +if (netdef-forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) { +if(vfname[ii]) { +netdef-forwardIfs[ii].device.dev = strdup(vfname[ii]); +if (!netdef-forwardIfs[ii].device.dev) { +virReportOOMError(); +goto finish; +} +} +else { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Passthrough mode requires interface names)); +goto finish; +} +} +else if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV) { +netdef-forwardIfs[ii].type = VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI; /*Assuming PCI as VF's are PCI devices */ +netdef-forwardIfs[ii].device.pci.domain = virt_fns[ii]-domain; +netdef-forwardIfs[ii].device.pci.bus = virt_fns[ii]-bus; +netdef-forwardIfs[ii].device.pci.slot = virt_fns[ii]-slot; +netdef-forwardIfs[ii].device.pci.function = virt_fns[ii]-function; } netdef-forwardIfs[ii].usageCount = 0; } ret = 0; finish: -for (ii = 0; ii num_virt_fns; ii++) +for (ii = 0; ii num_virt_fns; ii++) { VIR_FREE(vfname[ii]); +VIR_FREE(virt_fns[ii]); +} VIR_FREE(vfname); +VIR_FREE(virt_fns); return ret; } diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index f1ee0a4..8103aff 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -29,6 +29,7 @@ #include command.h #include memory.h #include pci.h +#include logging.h #include sys/ioctl.h #ifdef HAVE_NET_IF_H @@ -981,18 +982,18 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname, int virNetDevGetVirtualFunctions(const char *pfname, char ***vfname, + struct pci_config_address ***virt_fns, unsigned int *n_vfname) { int ret = -1, i; char *pf_sysfs_device_link = NULL; char *pci_sysfs_device_link = NULL; -struct pci_config_address **virt_fns; char *pciConfigAddr; if (virNetDevSysfsFile(pf_sysfs_device_link, pfname, device) 0) return ret; -if 
(pciGetVirtualFunctions(pf_sysfs_device_link, virt_fns, +if (pciGetVirtualFunctions(pf_sysfs_device_link, virt_fns
Re: [libvirt] [PATCH 03/15] RNG updates, new xml parser/formatter code to support forward mode=hostdev
On 08/14/2012 06:27 AM, Laine Stump wrote: On 08/10/2012 12:23 PM, Shradha Shah wrote: This patch introduces the new forward mode='hostdev' along with attribute managed Includes updates to the network RNG and new xml parser/formatter code. This one still needs some tweaks. See below... Signed-off-by: Shradha Shah ss...@solarflare.com --- docs/schemas/network.rng | 82 +++-- src/conf/network_conf.c| 127 src/conf/network_conf.h| 29 +++- src/network/bridge_driver.c| 18 ++-- tests/networkxml2xmlin/hostdev-pf.xml | 11 +++ tests/networkxml2xmlin/hostdev.xml | 10 +++ tests/networkxml2xmlout/hostdev-pf.xml |7 ++ tests/networkxml2xmlout/hostdev.xml| 10 +++ tests/networkxml2xmltest.c |2 + 9 files changed, 263 insertions(+), 33 deletions(-) diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 2ae879e..d1297cd 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -82,17 +82,41 @@ valuepassthrough/value valueprivate/value valuevepa/value + valuehostdev/value +/choice + /attribute +/optional + +optional + attribute name=managed +choice + valueyes/value + valueno/value /choice /attribute /optional interleave - zeroOrMore -element name='interface' - attribute name='dev' -ref name='deviceName'/ - /attribute -/element - /zeroOrMore + choice +group + zeroOrMore +element name='interface' + attribute name='dev' +ref name='deviceName'/ + /attribute +/element + /zeroOrMore +/group +group + zeroOrMore +element name='address' + attribute name='type' +valuepci/value + /attribute + ref name=pciaddress/ +/element + /zeroOrMore +/group + /choice optional element name='pf' attribute name='dev' @@ -238,4 +262,48 @@ /interleave /element /define + define name=pciaddress I'm kind of surprised there isn't already a pci address type in the rng... Oh, *now* I see - there is already a pciaddress type, but it's defined in domaincommon.rng, and you need to use it in network.rng. 
Rather than defining the whole thing twice, you should either move the definition to basictypes.rng or (if anyone objects to something that complex going into a file that has basic in the name) to a new file - devicetypes.rng or something like that. (This is analogous to the split of the pci device data and functions into device_conf.[ch]). +optional + attribute name=domain +ref name=pciDomain/ + /attribute +/optional +attribute name=bus + ref name=pciBus/ +/attribute +attribute name=slot + ref name=pciSlot/ +/attribute +attribute name=function + ref name=pciFunc/ +/attribute +optional + attribute name=multifunction +choice + valueon/value + valueoff/value +/choice + /attribute +/optional + /define + define name=pciDomain +data type=string + param name=pattern(0x)?[0-9a-fA-F]{1,4}/param +/data + /define + define name=pciBus +data type=string + param name=pattern(0x)?[0-9a-fA-F]{1,2}/param +/data + /define + define name=pciSlot +data type=string + param name=pattern(0x)?[0-1]?[0-9a-fA-F]/param +/data + /define + define name=pciFunc +data type=string + param name=pattern(0x)?[0-7]/param +/data + /define Yeah, basically everything from here up to my previous comment is the same here and in domaincommon.rng - definitely move it. (put it it basictypes.rng until/unless someone objects :-) /grammar diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index a3714d9..294939d 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -49,7 +49,12 @@ VIR_ENUM_IMPL(virNetworkForward, VIR_NETWORK_FORWARD_LAST, - none, nat, route, bridge, private, vepa, passthrough ) + none, nat, route, bridge, private, vepa, passthrough, hostdev) + +VIR_ENUM_DECL(virNetworkForwardHostdevDevice
[libvirt] [PATCH 00/15] Hostdev and Hostdev-hybrid patches
This patch series supports the forward mode='hostdev'. The functionality
of this mode is the same as interface type='hostdev' but with the added
benefit of using interface pools.

The patch series also contains a patch to support use of interface names
and PCI device addresses interchangeably in a network xml, and return the
appropriate one in actualDevice when networkAllocateActualDevice is called.

At the top level, the managed attribute can be specified, with the same
results as when it is specified for a hostdev.

Currently forward mode='hostdev' does not support USB devices.

Since the hostdev-hybrid patches are dependent on the hostdev patches, I
have also included the support for interface-type=hostdev-hybrid and
forward mode=hostdev-hybrid in this patch series.

The hostdev-hybrid mode makes migration possible along with
PCI-passthrough. I had posted an RFC on the hostdev-hybrid methodology
earlier on the libvirt mailing list.

The RFC can be found here:
https://www.redhat.com/archives/libvir-list/2012-February/msg00309.html

Shradha Shah (15):
  Prerequisite Patch. virDomainDevicePCIAddress and respective
    functions moved to a new file called conf/device_conf.ch
  Moved the code to create implicit interface pool from PF to a new
    function
  RNG updates, new xml parser/formatter code to support forward
    mode=hostdev
  Code to return interface name or pci_addr of the VF in actualDevice
  Forward Mode Hostdev network driver Implementation
  Forward Mode 'Hostdev' qemu driver implementation
  RNG updates, new xml parser/formatter code to support interface
    type=hostdev-hybrid
  RNG updates, new xml parser/formatter code to support forward
    mode=hostdev-hybrid
  Hostdev-hybrid mode requires a direct linkdev and direct mode.
  ActualParent is used to store the information about the NETDEV that
    contains HOSTDEV in hybrid case.
  Hybrid Hostdevs should be marked as ephemeral.
  Hostdev-hybrid network driver Implementation
  Hostdev-hybrid qemu driver implementation
  Using the Ephemeral Flag to prepare for Migration Support.
  Migration support for hostdev-hybrid.

 docs/formatdomain.html.in                     |   29 ++
 docs/formatnetwork.html.in                    |   62 +++
 docs/schemas/domaincommon.rng                 |   50 +++
 docs/schemas/network.rng                      |   83 -
 include/libvirt/libvirt.h.in                  |    1 +
 include/libvirt/virterror.h                   |    1 +
 src/Makefile.am                               |    6 +-
 src/conf/device_conf.c                        |  135 ++
 src/conf/device_conf.h                        |   65 +++
 src/conf/domain_conf.c                        |  286 +++-
 src/conf/domain_conf.h                        |   33 +-
 src/conf/network_conf.c                       |  131 +-
 src/conf/network_conf.h                       |   30 ++-
 src/libvirt_private.syms                      |   11 +-
 src/network/bridge_driver.c                   |  463
 src/qemu/qemu_command.c                       |   99 -
 src/qemu/qemu_domain.c                        |    6 +-
 src/qemu/qemu_domain.h                        |    3 +-
 src/qemu/qemu_driver.c                        |    6 +-
 src/qemu/qemu_hostdev.c                       |   97 +++--
 src/qemu/qemu_hotplug.c                       |   35 ++-
 src/qemu/qemu_migration.c                     |  106 +-
 src/qemu/qemu_monitor.c                       |   14 +-
 src/qemu/qemu_monitor.h                       |   17 +-
 src/qemu/qemu_monitor_json.c                  |   14 +-
 src/qemu/qemu_monitor_json.h                  |   14 +-
 src/qemu/qemu_monitor_text.c                  |   16 +-
 src/qemu/qemu_monitor_text.h                  |   14 +-
 src/qemu/qemu_process.c                       |    3 +-
 src/uml/uml_conf.c                            |    5 +
 src/util/pci.c                                |    2 +-
 src/util/pci.h                                |    2 +
 src/util/virnetdev.c                          |   65 +++-
 src/util/virnetdev.h                          |   10 +-
 src/util/virterror.c                          |    3 +-
 src/xen/xend_internal.c                       |    3 +-
 src/xenxs/xen_sxpr.c                          |    1 +
 tests/networkxml2xmlin/hostdev-hybrid-pf.xml  |   11 +
 tests/networkxml2xmlin/hostdev-hybrid.xml     |   10 +
 tests/networkxml2xmlin/hostdev-pf.xml         |   11 +
 tests/networkxml2xmlin/hostdev.xml            |   10 +
 tests/networkxml2xmlout/hostdev-hybrid-pf.xml |    7 +
 tests/networkxml2xmlout/hostdev-hybrid.xml    |   10 +
 tests/networkxml2xmlout/hostdev-pf.xml        |    7 +
 tests/networkxml2xmlout/hostdev.xml           |   10 +
 tests/networkxml2xmltest.c
[libvirt] [PATCH 02/15] Moved the code to create implicit interface pool from PF to a new function
Just code movement, no functional changes here. This makes the code reusable.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 src/network/bridge_driver.c |   86 ++
 1 files changed, 53 insertions(+), 33 deletions(-)

diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index a5046f1..f128bd0 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -2728,6 +2728,56 @@ int networkRegister(void) {
  * backend function table.
  */
 
+/* networkCreateInterfacePool:
+ * @netdef: the original NetDef from the network
+ *
+ * Creates an implicit interface pool of VF's when a PF dev is given
+ */
+static int
+networkCreateInterfacePool(virNetworkDefPtr netdef) {
+    unsigned int num_virt_fns = 0;
+    char **vfname = NULL;
+    int ret = -1, ii = 0;
+
+    if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
+                                      &vfname, &num_virt_fns)) < 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("Could not get Virtual functions on %s"),
+                       netdef->forwardPfs->dev);
+        goto finish;
+    }
+
+    if (num_virt_fns == 0) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("No Vf's present on SRIOV PF %s"),
+                       netdef->forwardPfs->dev);
+        goto finish;
+    }
+
+    if ((VIR_ALLOC_N(netdef->forwardIfs, num_virt_fns)) < 0) {
+        virReportOOMError();
+        goto finish;
+    }
+
+    netdef->nForwardIfs = num_virt_fns;
+
+    for (ii = 0; ii < netdef->nForwardIfs; ii++) {
+        netdef->forwardIfs[ii].dev = strdup(vfname[ii]);
+        if (!netdef->forwardIfs[ii].dev) {
+            virReportOOMError();
+            goto finish;
+        }
+        netdef->forwardIfs[ii].usageCount = 0;
+    }
+
+    ret = 0;
+finish:
+    for (ii = 0; ii < num_virt_fns; ii++)
+        VIR_FREE(vfname[ii]);
+    VIR_FREE(vfname);
+    return ret;
+}
+
 /* networkAllocateActualDevice:
  * @iface: the original NetDef from the domain
  *
@@ -2746,8 +2796,6 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
     virNetworkObjPtr network;
     virNetworkDefPtr netdef;
     virPortGroupDefPtr portgroup;
-    unsigned int num_virt_fns = 0;
-    char **vfname = NULL;
     int ii;
     int ret = -1;
 
@@ -2894,36 +2942,11 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
              */
             if (netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) {
                 if ((netdef->nForwardPfs > 0) && (netdef->nForwardIfs <= 0)) {
-                    if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
-                                                      &vfname, &num_virt_fns)) < 0) {
-                        virReportError(VIR_ERR_INTERNAL_ERROR,
-                                       _("Could not get Virtual functions on %s"),
-                                       netdef->forwardPfs->dev);
+                    if ((networkCreateInterfacePool(netdef)) < 0) {
+                        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                       _("Could not Interface Pool"));
                         goto cleanup;
                     }
-
-                    if (num_virt_fns == 0) {
-                        virReportError(VIR_ERR_INTERNAL_ERROR,
-                                       _("No Vf's present on SRIOV PF %s"),
-                                       netdef->forwardPfs->dev);
-                        goto cleanup;
-                    }
-
-                    if ((VIR_ALLOC_N(netdef->forwardIfs, num_virt_fns)) < 0) {
-                        virReportOOMError();
-                        goto cleanup;
-                    }
-
-                    netdef->nForwardIfs = num_virt_fns;
-
-                    for (ii = 0; ii < netdef->nForwardIfs; ii++) {
-                        netdef->forwardIfs[ii].dev = strdup(vfname[ii]);
-                        if (!netdef->forwardIfs[ii].dev) {
-                            virReportOOMError();
-                            goto cleanup;
-                        }
-                        netdef->forwardIfs[ii].usageCount = 0;
-                    }
                 }
 
                 /* pick first dev with 0 usageCount */
@@ -2976,9 +2999,6 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
 
     ret = 0;
 cleanup:
-    for (ii = 0; ii < num_virt_fns; ii++)
-        VIR_FREE(vfname[ii]);
-    VIR_FREE(vfname);
     if (network)
         virNetworkObjUnlock(network);
     if (ret < 0) {
-- 
1.7.4.4
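
For readers who want to try the pool-building pattern outside libvirt, here is a minimal standalone sketch. It is not the code from the patch: struct pool, struct pool_entry, pool_create, pool_free and dup_string are hypothetical names, plain calloc/free stand in for VIR_ALLOC_N/VIR_FREE, and the list of VF names is passed in by the caller instead of being queried with virNetDevGetVirtualFunctions. Unlike the patch, the sketch releases a partially built pool itself on failure rather than leaving that to a later free of the network definition.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for the forwardIfs entries of a network def. */
struct pool_entry {
    char *dev;        /* interface name of one VF */
    int usage_count;  /* number of guests currently using it */
};

struct pool {
    struct pool_entry *ifs;
    size_t n_ifs;
};

/* Portable replacement for strdup(), which is POSIX rather than ISO C. */
static char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy)
        memcpy(copy, s, len);
    return copy;
}

/* Build a pool with one entry per VF name.
 * Returns 0 on success, -1 on failure; on failure the pool is untouched. */
int pool_create(struct pool *pool, const char *const *vf_names, size_t n_vfs)
{
    struct pool_entry *ifs = NULL;
    size_t i, filled = 0;
    int ret = -1;

    assert(pool != NULL);

    if (n_vfs == 0)
        goto finish;            /* mirrors the "No Vf's present" check */

    ifs = calloc(n_vfs, sizeof(*ifs));   /* usage_count starts at 0 */
    if (!ifs)
        goto finish;

    for (i = 0; i < n_vfs; i++) {
        ifs[i].dev = dup_string(vf_names[i]);
        if (!ifs[i].dev)
            goto finish;
        filled++;
    }

    pool->ifs = ifs;
    pool->n_ifs = n_vfs;
    ifs = NULL;                 /* ownership moved to the pool */
    ret = 0;

finish:
    /* Single exit path: free whatever was not handed over. */
    for (i = 0; ifs && i < filled; i++)
        free(ifs[i].dev);
    free(ifs);
    return ret;
}

void pool_free(struct pool *pool)
{
    size_t i;
    for (i = 0; i < pool->n_ifs; i++)
        free(pool->ifs[i].dev);
    free(pool->ifs);
    pool->ifs = NULL;
    pool->n_ifs = 0;
}
```

A caller would pass an array of names such as "vf0" and "vf1" to pool_create and later release the entries with pool_free. The single finish label keeps every error path going through the same cleanup, which is the same shape the patch uses.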
[libvirt] [PATCH 01/15] Prerequisite Patch. virDomainDevicePCIAddress and respective functions moved to a new file called conf/device_conf.ch
Refactoring existing code without causing any functional changes to prepare for new code. This patch makes the code reusable. Signed-off-by: Shradha Shah ss...@solarflare.com --- include/libvirt/virterror.h |1 + src/Makefile.am |6 ++- src/conf/device_conf.c | 135 ++ src/conf/device_conf.h | 65 src/conf/domain_conf.c | 114 --- src/conf/domain_conf.h | 25 +--- src/libvirt_private.syms | 10 ++- src/qemu/qemu_command.c | 13 ++-- src/qemu/qemu_hotplug.c |7 +- src/qemu/qemu_monitor.c | 14 ++-- src/qemu/qemu_monitor.h | 17 +++--- src/qemu/qemu_monitor_json.c | 14 ++-- src/qemu/qemu_monitor_json.h | 14 ++-- src/qemu/qemu_monitor_text.c | 16 +++--- src/qemu/qemu_monitor_text.h | 14 ++-- src/util/virterror.c |3 +- src/xen/xend_internal.c |3 +- 17 files changed, 290 insertions(+), 181 deletions(-) diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h index ad8e101..a3516f6 100644 --- a/include/libvirt/virterror.h +++ b/include/libvirt/virterror.h @@ -110,6 +110,7 @@ typedef enum { VIR_FROM_AUTH = 46, /* Error from auth handling */ VIR_FROM_DBUS = 47, /* Error from DBus */ VIR_FROM_PARALLELS = 48,/* Error from Parallels */ +VIR_FROM_DEVICE = 49, /* Error from Device */ # ifdef VIR_ENUM_SENTINELS VIR_ERR_DOMAIN_LAST diff --git a/src/Makefile.am b/src/Makefile.am index 6ed4a41..d6ebfdf 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -200,6 +200,9 @@ CONSOLE_CONF_SOURCES = \ DOMAIN_LIST_SOURCES = \ conf/virdomainlist.c conf/virdomainlist.h +DEVICE_CONF_SOURCES = \ + conf/device_conf.c conf/device_conf.h + CONF_SOURCES = \ $(NETDEV_CONF_SOURCES) \ $(DOMAIN_CONF_SOURCES) \ @@ -213,7 +216,8 @@ CONF_SOURCES = \ $(SECRET_CONF_SOURCES) \ $(CPU_CONF_SOURCES) \ $(CONSOLE_CONF_SOURCES) \ - $(DOMAIN_LIST_SOURCES) + $(DOMAIN_LIST_SOURCES) \ + $(DEVICE_CONF_SOURCES) # The remote RPC driver, covering domains, storage, networks, etc REMOTE_DRIVER_GENERATED = \ diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c new file mode 100644 index 000..d4eb764 --- 
/dev/null +++ b/src/conf/device_conf.c @@ -0,0 +1,135 @@ +/* + * device_conf.h: device XML handling + * + * Copyright (C) 2006-2012 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Author: Shradha Shah ss...@solarflare.com + */ + +#include config.h +#include virterror_internal.h +#include datatypes.h +#include memory.h +#include xml.h +#include uuid.h +#include util.h +#include buf.h +#include conf/device_conf.h + +#define VIR_FROM_THIS VIR_FROM_DEVICE + +#define virDeviceReportError(code, ...) \ +virReportErrorHelper(VIR_FROM_DEVICE, code, __FILE__,\ + __FUNCTION__, __LINE__, __VA_ARGS__) + +VIR_ENUM_IMPL(virDeviceAddressPciMulti, + VIR_DEVICE_ADDRESS_PCI_MULTI_LAST, + default, + on, + off) + +int virDevicePCIAddressIsValid(virDevicePCIAddressPtr addr) +{ +/* PCI bus has 32 slots and 8 functions per slot */ +if (addr-slot = 32 || addr-function = 8) +return 0; +return addr-domain || addr-bus || addr-slot; +} + + +int +virDevicePCIAddressParseXML(xmlNodePtr node, +virDevicePCIAddressPtr addr) +{ +char *domain, *slot, *bus, *function, *multi; +int ret = -1; + +memset(addr, 0, sizeof(*addr)); + +domain = virXMLPropString(node, domain); +bus = virXMLPropString(node, bus); +slot = virXMLPropString(node, slot); +function
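
The comparison operators in virDevicePCIAddressIsValid were lost in the archived copy above. As a rough illustration of the check it appears to perform (slot below 32, function below 8, and not an all-zero address), here is a standalone sketch. struct pci_addr and pci_addr_is_valid are hypothetical names chosen for this example, not libvirt's types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PCI address (domain:bus:slot.function). */
struct pci_addr {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int function;
};

/* Returns true when the address is in range and is not "unset". */
bool pci_addr_is_valid(const struct pci_addr *addr)
{
    assert(addr != NULL);

    /* A PCI bus has 32 slots and each slot has 8 functions. */
    if (addr->slot >= 32 || addr->function >= 8)
        return false;

    /* As in the patch, the function number is not part of this test,
     * so domain 0, bus 0, slot 0 is treated as unset for any function. */
    return addr->domain || addr->bus || addr->slot;
}
```

With this sketch, 0000:04:00.1 is accepted, while slot 32, function 8, and the all-zero address are rejected.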
[libvirt] [PATCH 03/15] RNG updates, new xml parser/formatter code to support forward mode=hostdev
This patch introduces the new forward mode='hostdev' along with attribute managed Includes updates to the network RNG and new xml parser/formatter code. Signed-off-by: Shradha Shah ss...@solarflare.com --- docs/schemas/network.rng | 82 +++-- src/conf/network_conf.c| 127 src/conf/network_conf.h| 29 +++- src/network/bridge_driver.c| 18 ++-- tests/networkxml2xmlin/hostdev-pf.xml | 11 +++ tests/networkxml2xmlin/hostdev.xml | 10 +++ tests/networkxml2xmlout/hostdev-pf.xml |7 ++ tests/networkxml2xmlout/hostdev.xml| 10 +++ tests/networkxml2xmltest.c |2 + 9 files changed, 263 insertions(+), 33 deletions(-) diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 2ae879e..d1297cd 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -82,17 +82,41 @@ valuepassthrough/value valueprivate/value valuevepa/value + valuehostdev/value +/choice + /attribute +/optional + +optional + attribute name=managed +choice + valueyes/value + valueno/value /choice /attribute /optional interleave - zeroOrMore -element name='interface' - attribute name='dev' -ref name='deviceName'/ - /attribute -/element - /zeroOrMore + choice +group + zeroOrMore +element name='interface' + attribute name='dev' +ref name='deviceName'/ + /attribute +/element + /zeroOrMore +/group +group + zeroOrMore +element name='address' + attribute name='type' +valuepci/value + /attribute + ref name=pciaddress/ +/element + /zeroOrMore +/group + /choice optional element name='pf' attribute name='dev' @@ -238,4 +262,48 @@ /interleave /element /define + define name=pciaddress +optional + attribute name=domain +ref name=pciDomain/ + /attribute +/optional +attribute name=bus + ref name=pciBus/ +/attribute +attribute name=slot + ref name=pciSlot/ +/attribute +attribute name=function + ref name=pciFunc/ +/attribute +optional + attribute name=multifunction +choice + valueon/value + valueoff/value +/choice + /attribute +/optional + /define + define name=pciDomain +data type=string + param 
name=pattern(0x)?[0-9a-fA-F]{1,4}/param +/data + /define + define name=pciBus +data type=string + param name=pattern(0x)?[0-9a-fA-F]{1,2}/param +/data + /define + define name=pciSlot +data type=string + param name=pattern(0x)?[0-1]?[0-9a-fA-F]/param +/data + /define + define name=pciFunc +data type=string + param name=pattern(0x)?[0-7]/param +/data + /define /grammar diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index a3714d9..294939d 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -49,7 +49,12 @@ VIR_ENUM_IMPL(virNetworkForward, VIR_NETWORK_FORWARD_LAST, - none, nat, route, bridge, private, vepa, passthrough ) + none, nat, route, bridge, private, vepa, passthrough, hostdev) + +VIR_ENUM_DECL(virNetworkForwardHostdevDevice) +VIR_ENUM_IMPL(virNetworkForwardHostdevDevice, + VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_LAST, + none, pci) virNetworkObjPtr virNetworkFindByUUID(const virNetworkObjListPtr nets, const unsigned char *uuid) @@ -94,6 +99,12 @@ virPortGroupDefClear(virPortGroupDefPtr def) static void virNetworkForwardIfDefClear(virNetworkForwardIfDefPtr def) { +VIR_FREE(def-device.dev); +} + +static void +virNetworkForwardPfDefClear(virNetworkForwardPfDefPtr def) +{ VIR_FREE(def-dev); } @@ -157,12 +168,13 @@ void virNetworkDefFree(virNetworkDefPtr def) VIR_FREE(def-domain); for (ii = 0 ; ii def-nForwardPfs def-forwardPfs ; ii++) { -virNetworkForwardIfDefClear(def-forwardPfs[ii]); +virNetworkForwardPfDefClear(def-forwardPfs[ii]); } VIR_FREE(def-forwardPfs); for (ii = 0 ; ii def-nForwardIfs def-forwardIfs ; ii++) { -virNetworkForwardIfDefClear(def-forwardIfs[ii]); +if (def-forwardType
[libvirt] [PATCH 10/15] ActualParent is used to store the information about the NETDEV that contains HOSTDEV in hybrid case.
The parent type for hostdev hybrid needs to be VIR_DOMAIN_DEVICE_NONE as the device is passed into the guest as a PCI Device. In order to store the information of the NETDEV that is the parent of the HOSTDEV in question we use a new variable actualParent. This variable also helps during VF MAC address, vlan and virtportprofile configuration. ActualParent = Parent in case of forward mode=hostdev --- src/conf/domain_conf.c |8 src/conf/domain_conf.h |1 + src/qemu/qemu_hostdev.c | 91 +-- src/qemu/qemu_hotplug.c |2 +- 4 files changed, 74 insertions(+), 28 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index e73c07d..361850a 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4366,6 +4366,8 @@ virDomainActualNetDefParseXML(xmlNodePtr node, hostdev-parent.type = VIR_DOMAIN_DEVICE_NET; hostdev-parent.data.net = parent; +hostdev-actualParent.type = VIR_DOMAIN_DEVICE_NET; +hostdev-actualParent.data.net = parent; hostdev-info = parent-info; /* The helper function expects type to already be found and * passed in as a string, since it is in a different place in @@ -4393,6 +4395,8 @@ virDomainActualNetDefParseXML(xmlNodePtr node, virDomainHostdevDefPtr hostdev = actual-data.hostdev.def; hostdev-parent.type = VIR_DOMAIN_DEVICE_NONE; +hostdev-actualParent.type = VIR_DOMAIN_DEVICE_NET; +hostdev-actualParent.data.net = parent; if (VIR_ALLOC(hostdev-info) 0) { virReportOOMError(); @@ -4760,6 +4764,8 @@ virDomainNetDefParseXML(virCapsPtr caps, hostdev = def-data.hostdev.def; hostdev-parent.type = VIR_DOMAIN_DEVICE_NET; hostdev-parent.data.net = def; +hostdev-actualParent.type = VIR_DOMAIN_DEVICE_NET; +hostdev-actualParent.data.net = def; hostdev-info = def-info; /* The helper function expects type to already be found and * passed in as a string, since it is in a different place in @@ -4783,6 +4789,8 @@ virDomainNetDefParseXML(virCapsPtr caps, case VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID: hostdev = def-data.hostdev.def; hostdev-parent.type = 
VIR_DOMAIN_DEVICE_NONE; +hostdev-actualParent.type = VIR_DOMAIN_DEVICE_NET; +hostdev-actualParent.data.net = def; if (VIR_ALLOC(hostdev-info) 0) { virReportOOMError(); goto error; diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 053c71c..4584671 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -368,6 +368,7 @@ struct _virDomainHostdevSubsys { /* basic device for direct passthrough */ struct _virDomainHostdevDef { virDomainDeviceDef parent; /* higher level Def containing this */ +virDomainDeviceDef actualParent; /*used only in the case of hybrid hostdev*/ int mode; /* enum virDomainHostdevMode */ unsigned int managed : 1; union { diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c index 7619fd0..d2712f4 100644 --- a/src/qemu/qemu_hostdev.c +++ b/src/qemu/qemu_hostdev.c @@ -319,17 +319,36 @@ qemuDomainHostdevNetConfigReplace(virDomainHostdevDefPtr hostdev, if (qemuDomainHostdevNetDevice(hostdev, linkdev, vf) 0) return ret; -virtPort = virDomainNetGetActualVirtPortProfile( - hostdev-parent.data.net); -if (virtPort) -ret = qemuDomainHostdevNetConfigVirtPortProfile(linkdev, vf, -virtPort, hostdev-parent.data.net-mac, uuid, -port_profile_associate); -else -/* Set only mac */ -ret = virNetDevReplaceNetConfig(linkdev, vf, -hostdev-parent.data.net-mac, vlanid, -stateDir); +if (hostdev-parent.data.net) { +virtPort = virDomainNetGetActualVirtPortProfile(hostdev-parent.data.net); +if (virtPort) +ret = qemuDomainHostdevNetConfigVirtPortProfile(linkdev, vf, +virtPort, + hostdev-parent.data.net-mac, +uuid, + port_profile_associate); +else +/* Set only mac */ +ret = virNetDevReplaceNetConfig(linkdev, vf, +hostdev-parent.data.net-mac, +vlanid, +stateDir); +} +else if (hostdev-actualParent.data.net) { +virtPort = virDomainNetGetActualVirtPortProfile(hostdev-actualParent.data.net); +if (virtPort) +ret = qemuDomainHostdevNetConfigVirtPortProfile(linkdev, vf, +
[libvirt] [PATCH 05/15] Forward Mode Hostdev network driver Implementation
This patch updates the network driver to properly utilize the new attributes/elements that are now in virNetworkDef Signed-off-by: Shradha Shah ss...@solarflare.com --- docs/formatnetwork.html.in | 62 +++ src/network/bridge_driver.c | 237 ++ 2 files changed, 254 insertions(+), 45 deletions(-) diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in index 7e8e991..96b9eb2 100644 --- a/docs/formatnetwork.html.in +++ b/docs/formatnetwork.html.in @@ -210,6 +210,37 @@ (usually either a domain start, or a hotplug interface attach to a domain).span class=sinceSince 0.9.4/span /dd + dtcodehostdev/code/dt + dd +This network facilitates PCI Passthrough of a network device. +A network device is chosen from the interface pool and +directly assigned to the guest using generic device +passthrough, after first optionally setting the device's MAC +address to the configured value, and associating the device with +an 802.1Qbh capable switch using an optionally specified +codelt;virtualportgt;/code element. +Note that - due to limitations in standard single-port PCI +ethernet card driver design - only SR-IOV (Single Root I/O +Virtualization) virtual function (VF) devices can be assigned +in this manner; to assign a standard single-port PCI or PCIe +ethernet card to a guest, use the traditional codelt; +hostdevgt;/code device definition and span class=since +Since 0.9.12/span + +pNote that this intelligent passthrough of network devices is +very similar to the functionality of a standard codelt; +hostdevgt;/code device, the difference being that this +method allows specifying a MAC address and codelt;virtualport +gt;/code for the passed-through device. 
If these capabilities +are not required, if you have a standard single-port PCI, PCIe, +or USB network card that doesn't support SR-IOV (and hence would +anyway lose the configured MAC address during reset after being +assigned to the guest domain), or if you are using a version of +libvirt older than 0.9.12, you should use standard +codelt;hostdevgt;/code to assign the device to the +guest instead of codelt;forward mode='hostdev'/gt;/code. +/p + /dd /dl As mentioned above, a codelt;forwardgt;/code element can have multiple codelt;interfacegt;/code subelements, each @@ -249,6 +280,37 @@ particular, 'passthrough' mode, and 'private' mode when using 802.1Qbh), libvirt will choose an unused physical interface or, if it can't find an unused interface, fail the operation./p + +span class=sincesince 0.9.12/span and when using forward mode +'hostdev' we specify the interface pool by using the +codelt;addressgt;/code element and codelt; +typegt;/code codelt;domaingt;/code codelt;busgt;/code +codelt;slotgt;/code and codelt;functiongt;/code +sub-elements. + +pre +... + lt;forward mode='hostdev' managed='yes'gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='1'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='2'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='3'/gt; + lt;/forwardgt; +... +/pre + +Alternatively the interface pool can also be mentioned using a +single physical function codelt;pfgt;/code subelement to +call out the corresponding physical interface associated with +multiple virtual interfaces (similar to the passthrough mode): + +pre +... + lt;forward mode='hostdev' managed='yes'gt; +lt;pf dev='eth0'/gt; + lt;/forwardgt; +... 
+/pre + /dd /dl h5a name=elementQoSQuality of service/a/h5 diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 602e17d..33bc09e 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -1934,7 +1934,7 @@ networkStartNetworkExternal(struct network_driver *driver ATTRIBUTE_UNUSED, virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs to be done each time a network of - * type BRIDGE, PRIVATE, VEPA, or PASSTHROUGH is started. On + * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is started. On * failure, undo anything you've done, and return -1. On success * return 0. */ @@ -1945,7 +1945,7 @@ static int networkShutdownNetworkExternal(struct network_driver *driver ATTRIBUT virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs
[libvirt] [PATCH 06/15] Forward Mode 'Hostdev' qemu driver implementation
Signed-off-by: Shradha Shah ss...@solarflare.com
---
 src/qemu/qemu_command.c |   27 +++
 1 files changed, 27 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 6f6c6cd..bb66364 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -24,6 +24,7 @@
 #include <config.h>
 
 #include "qemu_command.h"
+#include "qemu_hostdev.h"
 #include "qemu_capabilities.h"
 #include "qemu_bridge_filter.h"
 #include "cpu/cpu.h"
@@ -5221,12 +5222,38 @@ qemuBuildCommandLine(virConnectPtr conn,
 
             actualType = virDomainNetGetActualType(net);
             if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV) {
+                virDomainHostdevDefPtr hostdev = virDomainNetGetActualHostdev(net);
+                virDomainHostdevDefPtr found;
                 /* type='hostdev' interfaces are handled in codepath
                  * for standard hostdev (NB: when there is a network
                  * with <forward mode='hostdev'>, there will need to be
                  * code here that adds the newly minted hostdev to the
                  * hostdevs array).
                  */
+                if (qemuAssignDeviceHostdevAlias(def,
+                                                 hostdev,
+                                                 (def->nhostdevs-1)) < 0) {
+                    virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                   _("Could not assign alias to Net Hostdev"));
+                    goto error;
+                }
+
+                if (virDomainHostdevFind(def,
+                                         hostdev,
+                                         &found) < 0) {
+                    if (virDomainHostdevInsert(def,
+                                               hostdev) < 0) {
+                        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                       _("Hostdev not inserted into the array"));
+                        goto error;
+                    }
+                    if (qemuPrepareHostdevPCIDevices(driver, def->name, def->uuid,
+                                                     &hostdev, 1) < 0) {
+                        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                       _("Prepare Hostdev PCI Devices failed"));
+                        goto error;
+                    }
+                }
                 continue;
             }
 
-- 
1.7.4.4
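
The find-then-insert step in the hunk above (add the interface's hostdev to the domain's hostdevs array only if it is not already there) can be sketched in isolation as follows. This is an illustration under assumptions, not libvirt code: struct hostdev, struct domain_def, hostdev_find, hostdev_insert and hostdev_ensure are hypothetical names, devices are matched on PCI address only, and the alias assignment and PCI device preparation from the patch are left out.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a passed-through PCI device. */
struct hostdev {
    unsigned int domain, bus, slot, function;
};

/* Hypothetical stand-in for the part of a domain def that lists hostdevs. */
struct domain_def {
    struct hostdev **hostdevs;
    size_t nhostdevs;
};

static int same_address(const struct hostdev *a, const struct hostdev *b)
{
    return a->domain == b->domain && a->bus == b->bus &&
           a->slot == b->slot && a->function == b->function;
}

/* Returns the index of a matching device, or -1 if none is listed. */
int hostdev_find(const struct domain_def *def, const struct hostdev *match)
{
    size_t i;

    for (i = 0; i < def->nhostdevs; i++) {
        if (same_address(def->hostdevs[i], match))
            return (int)i;
    }
    return -1;
}

/* Appends dev to the array. Returns 0 on success, -1 on allocation failure. */
int hostdev_insert(struct domain_def *def, struct hostdev *dev)
{
    struct hostdev **grown;

    grown = realloc(def->hostdevs, (def->nhostdevs + 1) * sizeof(*grown));
    if (!grown)
        return -1;
    grown[def->nhostdevs++] = dev;
    def->hostdevs = grown;
    return 0;
}

/* Inserts dev only if no device with the same address is listed yet.
 * Returns 1 if inserted, 0 if already present, -1 on failure. */
int hostdev_ensure(struct domain_def *def, struct hostdev *dev)
{
    assert(def != NULL && dev != NULL);

    if (hostdev_find(def, dev) >= 0)
        return 0;
    if (hostdev_insert(def, dev) < 0)
        return -1;
    return 1;
}
```

Calling hostdev_ensure twice with the same address leaves a single entry in the array, which is the property the patch relies on when the command line is built more than once for the same interface.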
[libvirt] [PATCH 07/15] RNG updates, new xml parser/formatter code to support interface type=hostdev-hybrid
This patch introduces the new interface type='hostdev-hybrid' along with attribute managed Includes updates to the domain RNG and new xml parser/formatter code. --- docs/formatdomain.html.in | 29 ++ docs/schemas/domaincommon.rng | 50 ++ src/conf/domain_conf.c | 97 ++-- src/conf/domain_conf.h |1 + src/uml/uml_conf.c |5 + src/xenxs/xen_sxpr.c |1 + .../qemuxml2argv-net-hostdevhybrid.args|6 + .../qemuxml2argv-net-hostdevhybrid.xml | 35 +++ tests/qemuxml2xmltest.c|1 + 9 files changed, 215 insertions(+), 10 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index f97c630..045e655 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -2504,6 +2504,20 @@ guest instead of lt;interface type='hostdev'/gt;. /p +p + Libvirt later than 0.9.13 also supports intelligent passthrough + of VF in the hybrid mode. This is done by using the lt;interface + type='hostdev-hybrid'/gt; functionality. Similar to lt;interface + type='hostdev'/gt; the device's MAC address is first optionally + configured and the device is associated with an 802.1Qbh capable + switch using an optionally specified lt;virtualportgt; element + (see the examples of virtualport given above for type='direct' + network devices). The Vf is passed into the guest as a PCI device + and at the same time a virtual interface with type='direct' mode= + 'bridge' is created in the guest. This hybrid mode of intelligent + passthrough makes Live migration possible. +/p + pre ... lt;devicesgt; @@ -2519,6 +2533,21 @@ lt;/devicesgt; .../pre +pre + ... 
+ lt;devicesgt; +lt;interface type='hostdev-hybrid'gt; + lt;sourcegt; +lt;address type='pci' domain='0x' bus='0x00' slot='0x07' function='0x0'/gt; + lt;/sourcegt; + lt;mac address='52:54:00:6d:90:02'gt; + lt;virtualport type='802.1Qbh'gt; +lt;parameters profileid='finance'/gt; + lt;/virtualportgt; +lt;/interfacegt; + lt;/devicesgt; + .../pre + h5a name=elementsNICSMulticastMulticast tunnel/a/h5 diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index c85d763..2f95e91 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -1597,6 +1597,56 @@ ref name=interface-options/ /interleave /group +group + attribute name=type +valuehostdev-hybrid/value + /attribute + optional +attribute name=managed + choice +valueyes/value +valueno/value + /choice +/attribute + /optional + interleave +element name=source + choice +group + ref name=usbproduct/ + optional +ref name=usbaddress/ + /optional +/group +element name=address + choice +group + attribute name=type +valuepci/value + /attribute + ref name=pciaddress/ +/group +group + attribute name=type +valueusb/value + /attribute + attribute name=bus +ref name=usbAddr/ + /attribute + attribute name=device +ref name=usbPort/ + /attribute +/group + /choice +/element + /choice +/element +optional + ref name=virtualPortProfile/ +/optional +ref name=interface-options/ + /interleave +/group /choice /element /define diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index ecad6cc..39b5cdb 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -289,7 +289,8 @@ VIR_ENUM_IMPL(virDomainNet, VIR_DOMAIN_NET_TYPE_LAST, bridge, internal, direct, - hostdev) + hostdev, + hostdev-hybrid) VIR_ENUM_IMPL(virDomainNetBackend, VIR_DOMAIN_NET_BACKEND_TYPE_LAST, default, @@ -1023,6 +1024,10 @@ virDomainActualNetDefFree(virDomainActualNetDefPtr def) virDomainHostdevDefClear(def-data.hostdev.def); VIR_FREE(def-data.hostdev.virtPortProfile); break; +case VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID: +
[libvirt] [PATCH 08/15] RNG updates, new xml parser/formatter code to support forward mode=hostdev-hybrid
This patch introduces the new forward mode='hostdev-hybrid' along with attribute managed Includes updates to the network RNG and new xml parser/formatter code. --- docs/schemas/network.rng |1 + src/conf/network_conf.c | 12 src/conf/network_conf.h |1 + tests/networkxml2xmlin/hostdev-hybrid-pf.xml | 11 +++ tests/networkxml2xmlin/hostdev-hybrid.xml | 10 ++ tests/networkxml2xmlout/hostdev-hybrid-pf.xml |7 +++ tests/networkxml2xmlout/hostdev-hybrid.xml| 10 ++ tests/networkxml2xmltest.c|2 ++ 8 files changed, 50 insertions(+), 4 deletions(-) diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index d1297cd..0118b96 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -83,6 +83,7 @@ valueprivate/value valuevepa/value valuehostdev/value + valuehostdev-hybrid/value /choice /attribute /optional diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 294939d..f299a3d 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -49,7 +49,7 @@ VIR_ENUM_IMPL(virNetworkForward, VIR_NETWORK_FORWARD_LAST, - none, nat, route, bridge, private, vepa, passthrough, hostdev) + none, nat, route, bridge, private, vepa, passthrough, hostdev, hostdev-hybrid) VIR_ENUM_DECL(virNetworkForwardHostdevDevice) VIR_ENUM_IMPL(virNetworkForwardHostdevDevice, @@ -173,7 +173,8 @@ void virNetworkDefFree(virNetworkDefPtr def) VIR_FREE(def-forwardPfs); for (ii = 0 ; ii def-nForwardIfs def-forwardIfs ; ii++) { -if (def-forwardType != VIR_NETWORK_FORWARD_HOSTDEV) +if ((def-forwardType != VIR_NETWORK_FORWARD_HOSTDEV) +(def-forwardType != VIR_NETWORK_FORWARD_HOSTDEV_HYBRID)) virNetworkForwardIfDefClear(def-forwardIfs[ii]); } VIR_FREE(def-forwardIfs); @@ -1274,6 +1275,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) case VIR_NETWORK_FORWARD_VEPA: case VIR_NETWORK_FORWARD_PASSTHROUGH: case VIR_NETWORK_FORWARD_HOSTDEV: +case VIR_NETWORK_FORWARD_HOSTDEV_HYBRID: if (def-bridge) { virReportError(VIR_ERR_XML_ERROR, _(bridge name not allowed in %s mode (network 
'%s')), @@ -1559,7 +1561,8 @@ char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags) } virBufferAddLit(buf, forward); virBufferEscapeString(buf, dev='%s', dev); -if (def-forwardType == VIR_NETWORK_FORWARD_HOSTDEV) { +if (def-forwardType == VIR_NETWORK_FORWARD_HOSTDEV || +def-forwardType == VIR_NETWORK_FORWARD_HOSTDEV_HYBRID) { if (def-managed == 1) virBufferAddLit(buf, managed='yes'); else @@ -1576,7 +1579,8 @@ char *virNetworkDefFormat(const virNetworkDefPtr def, unsigned int flags) if (def-nForwardIfs (!def-nForwardPfs || !(flags VIR_NETWORK_XML_INACTIVE))) { for (ii = 0; ii def-nForwardIfs; ii++) { -if (def-forwardType != VIR_NETWORK_FORWARD_HOSTDEV) +if (def-forwardType != VIR_NETWORK_FORWARD_HOSTDEV +def-forwardType != VIR_NETWORK_FORWARD_HOSTDEV_HYBRID) virBufferEscapeString(buf, interface dev='%s'/\n, def-forwardIfs[ii].device.dev); else { diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index a57db36..3348877 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -47,6 +47,7 @@ enum virNetworkForwardType { VIR_NETWORK_FORWARD_VEPA, VIR_NETWORK_FORWARD_PASSTHROUGH, VIR_NETWORK_FORWARD_HOSTDEV, +VIR_NETWORK_FORWARD_HOSTDEV_HYBRID, VIR_NETWORK_FORWARD_LAST, }; diff --git a/tests/networkxml2xmlin/hostdev-hybrid-pf.xml b/tests/networkxml2xmlin/hostdev-hybrid-pf.xml new file mode 100644 index 000..c4d2f93 --- /dev/null +++ b/tests/networkxml2xmlin/hostdev-hybrid-pf.xml @@ -0,0 +1,11 @@ +network + namehostdev-hybrid/name + uuid81ff0d90-c91e-6742-64da-4a736edb9a9b/uuid + forward mode=hostdev-hybrid managed=yes +pf dev='eth2'/ +address type='pci' domain='0' bus='3' slot='0' function='1'/ +address type='pci' domain='0' bus='3' slot='0' function='2'/ +address type='pci' domain='0' bus='3' slot='0' function='3'/ +address type='pci' domain='0' bus='3' slot='0' function='4'/ + /forward +/network diff --git a/tests/networkxml2xmlin/hostdev-hybrid.xml b/tests/networkxml2xmlin/hostdev-hybrid.xml new file mode 100644 index 
000..29960aa --- /dev/null +++ b/tests/networkxml2xmlin/hostdev-hybrid.xml @@ -0,0 +1,10 @@ +network + namehostdev-hybrid/name +
[libvirt] [PATCH 09/15] Hostdev-hybrid mode requires a direct linkdev and direct mode.
In this mode the guest contains a Virtual network device along with a SRIOV VF passed through to the guest as a pci device. --- src/conf/domain_conf.c | 37 +++-- src/conf/domain_conf.h |5 + src/libvirt_private.syms |1 + src/util/pci.c |2 +- src/util/pci.h |2 ++ src/util/virnetdev.c | 40 src/util/virnetdev.h |6 ++ 7 files changed, 90 insertions(+), 3 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 39b5cdb..e73c07d 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1025,6 +1025,7 @@ virDomainActualNetDefFree(virDomainActualNetDefPtr def) VIR_FREE(def-data.hostdev.virtPortProfile); break; case VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID: +VIR_FREE(def-data.hostdev.linkdev); virDomainHostdevDefClear(def-data.hostdev.def); VIR_FREE(def-data.hostdev.virtPortProfile); break; @@ -1084,6 +1085,7 @@ void virDomainNetDefFree(virDomainNetDefPtr def) break; case VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID: +VIR_FREE(def-data.hostdev.linkdev); virDomainHostdevDefClear(def-data.hostdev.def); VIR_FREE(def-data.hostdev.virtPortProfile); break; @@ -4475,6 +4477,7 @@ virDomainNetDefParseXML(virCapsPtr caps, char *mode = NULL; char *linkstate = NULL; char *addrtype = NULL; +char *pfname = NULL; virNWFilterHashTablePtr filterparams = NULL; virNetDevVPortProfilePtr virtPort = NULL; virDomainActualNetDefPtr actual = NULL; @@ -4795,6 +4798,26 @@ virDomainNetDefParseXML(virCapsPtr caps, hostdev, flags) 0) { goto error; } +if (hostdev-source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) { +if (virNetDevGetPhysicalFunctionFromVfPciAddr(hostdev-source.subsys.u.pci.domain, + hostdev-source.subsys.u.pci.bus, + hostdev-source.subsys.u.pci.slot, + hostdev-source.subsys.u.pci.function, + pfname) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Could not get Physical Function of the hostdev)); +goto error; +} +} +if (pfname != NULL) +def-data.hostdev.linkdev = strdup(pfname); +else { +virReportError(VIR_ERR_INTERNAL_ERROR, + _(Linkdev is required in %s 
mode), + virDomainNetTypeToString(def-type)); +goto error; +} +def-data.hostdev.mode = VIR_NETDEV_MACVLAN_MODE_BRIDGE; def-data.hostdev.virtPortProfile = virtPort; virtPort = NULL; break; @@ -15033,11 +15056,16 @@ virDomainNetGetActualDirectDev(virDomainNetDefPtr iface) { if (iface-type == VIR_DOMAIN_NET_TYPE_DIRECT) return iface-data.direct.linkdev; +if (iface-type == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) +return iface-data.hostdev.linkdev; if (iface-type != VIR_DOMAIN_NET_TYPE_NETWORK) return NULL; if (!iface-data.network.actual) return NULL; -return iface-data.network.actual-data.direct.linkdev; +if (iface-data.network.actual-type == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) +return iface-data.network.actual-data.hostdev.linkdev; +else +return iface-data.network.actual-data.direct.linkdev; } int @@ -15045,11 +15073,16 @@ virDomainNetGetActualDirectMode(virDomainNetDefPtr iface) { if (iface-type == VIR_DOMAIN_NET_TYPE_DIRECT) return iface-data.direct.mode; +if (iface-type == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) +return iface-data.hostdev.mode; if (iface-type != VIR_DOMAIN_NET_TYPE_NETWORK) return 0; if (!iface-data.network.actual) return 0; -return iface-data.network.actual-data.direct.mode; +if (iface-data.network.actual-type == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) +return iface-data.network.actual-data.hostdev.mode; +else +return iface-data.network.actual-data.direct.mode; } virDomainHostdevDefPtr diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 7bcaee4..053c71c 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -43,6 +43,7 @@ # include virnetdevvportprofile.h # include virnetdevopenvswitch.h # include virnetdevbandwidth.h +# include virnetdev.h # include virobject.h # include device_conf.h @@ -762,6 +763,8 @@ struct _virDomainActualNetDef { virNetDevVPortProfilePtr virtPortProfile; } direct; struct { +char *linkdev; +int mode; virDomainHostdevDef
[libvirt] [PATCH 04/15] Code to return interface name or pci_addr of the VF in actualDevice
The network pool should be able to keep track of both, network device names nad PCI addresses, and return the appropriate one in the actualDevice when networkAllocateActualDevice is called. Signed-off-by: Shradha Shah ss...@solarflare.com --- src/network/bridge_driver.c | 33 +++-- src/util/virnetdev.c| 25 - src/util/virnetdev.h|4 +++- 3 files changed, 42 insertions(+), 20 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index df3cc25..602e17d 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -59,6 +59,7 @@ #include dnsmasq.h #include configmake.h #include virnetdev.h +#include pci.h #include virnetdevbridge.h #include virnetdevtap.h @@ -2737,10 +2738,11 @@ static int networkCreateInterfacePool(virNetworkDefPtr netdef) { unsigned int num_virt_fns = 0; char **vfname = NULL; +struct pci_config_address **virt_fns; int ret = -1, ii = 0; if ((virNetDevGetVirtualFunctions(netdef-forwardPfs-dev, - vfname, num_virt_fns)) 0) { + vfname, virt_fns, num_virt_fns)) 0) { virReportError(VIR_ERR_INTERNAL_ERROR, _(Could not get Virtual functions on %s), netdef-forwardPfs-dev); @@ -2762,19 +2764,38 @@ networkCreateInterfacePool(virNetworkDefPtr netdef) { netdef-nForwardIfs = num_virt_fns; for (ii = 0; ii netdef-nForwardIfs; ii++) { -netdef-forwardIfs[ii].device.dev = strdup(vfname[ii]); -if (!netdef-forwardIfs[ii].device.dev) { -virReportOOMError(); -goto finish; +if (netdef-forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) { +if(vfname[ii]) { +netdef-forwardIfs[ii].device.dev = strdup(vfname[ii]); +if (!netdef-forwardIfs[ii].device.dev) { +virReportOOMError(); +goto finish; +} +} +else { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Passthrough mode requires interface names)); +goto finish; +} +} +else if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV) { +netdef-forwardIfs[ii].type = VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI; /*Assuming PCI as VF's are PCI devices */ +netdef-forwardIfs[ii].device.pci.domain = 
virt_fns[ii]-domain; +netdef-forwardIfs[ii].device.pci.bus = virt_fns[ii]-bus; +netdef-forwardIfs[ii].device.pci.slot = virt_fns[ii]-slot; +netdef-forwardIfs[ii].device.pci.function = virt_fns[ii]-function; } netdef-forwardIfs[ii].usageCount = 0; } ret = 0; finish: -for (ii = 0; ii num_virt_fns; ii++) +for (ii = 0; ii num_virt_fns; ii++) { VIR_FREE(vfname[ii]); +VIR_FREE(virt_fns[ii]); +} VIR_FREE(vfname); +VIR_FREE(virt_fns); return ret; } diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index f1ee0a4..8103aff 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -29,6 +29,7 @@ #include command.h #include memory.h #include pci.h +#include logging.h #include sys/ioctl.h #ifdef HAVE_NET_IF_H @@ -981,18 +982,18 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname, int virNetDevGetVirtualFunctions(const char *pfname, char ***vfname, + struct pci_config_address ***virt_fns, unsigned int *n_vfname) { int ret = -1, i; char *pf_sysfs_device_link = NULL; char *pci_sysfs_device_link = NULL; -struct pci_config_address **virt_fns; char *pciConfigAddr; if (virNetDevSysfsFile(pf_sysfs_device_link, pfname, device) 0) return ret; -if (pciGetVirtualFunctions(pf_sysfs_device_link, virt_fns, +if (pciGetVirtualFunctions(pf_sysfs_device_link, virt_fns, n_vfname) 0) goto cleanup; @@ -1003,10 +1004,10 @@ virNetDevGetVirtualFunctions(const char *pfname, for (i = 0; i *n_vfname; i++) { -if (pciGetDeviceAddrString(virt_fns[i]-domain, - virt_fns[i]-bus, - virt_fns[i]-slot, - virt_fns[i]-function, +if (pciGetDeviceAddrString((*virt_fns)[i]-domain, + (*virt_fns)[i]-bus, + (*virt_fns)[i]-slot, + (*virt_fns)[i]-function, pciConfigAddr) 0) { virReportSystemError(ENOSYS, %s, _(Failed to get PCI Config Address String)); @@ -1019,20 +1020,17 @@ virNetDevGetVirtualFunctions(const
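The idea in this patch, a pool entry that carries either an interface name or a PCI address depending on the forward mode, can be sketched as a small tagged union. This is only an illustrative sketch: `fwd_if`, `fwd_if_init` and `fwd_if_describe` are hypothetical names, not the libvirt structures or functions touched by the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the pool structures. */
enum fwd_dev_type { FWD_DEV_NETDEV, FWD_DEV_PCI };

struct pci_addr { unsigned domain, bus, slot, function; };

struct fwd_if {
    enum fwd_dev_type type;
    union {
        char dev[16];         /* interface name, e.g. "eth4" */
        struct pci_addr pci;  /* PCI address of the VF */
    } device;
    int usage_count;
};

/* Fill a pool entry: keep the PCI address for hostdev-style modes and
 * the interface name otherwise. Returns 0 on success, -1 if the data
 * the chosen mode needs is missing. */
int fwd_if_init(struct fwd_if *entry, int want_pci,
                const char *vfname, const struct pci_addr *addr)
{
    assert(entry != NULL);
    memset(entry, 0, sizeof(*entry));
    if (want_pci) {
        if (!addr)
            return -1;
        entry->type = FWD_DEV_PCI;
        entry->device.pci = *addr;
        return 0;
    }
    if (!vfname || strlen(vfname) >= sizeof(entry->device.dev))
        return -1;
    entry->type = FWD_DEV_NETDEV;
    strcpy(entry->device.dev, vfname);
    return 0;
}

/* Render whichever identifier the entry holds. */
int fwd_if_describe(const struct fwd_if *entry, char *buf, size_t len)
{
    if (entry->type == FWD_DEV_PCI)
        return snprintf(buf, len, "%04x:%02x:%02x.%x",
                        entry->device.pci.domain, entry->device.pci.bus,
                        entry->device.pci.slot, entry->device.pci.function);
    return snprintf(buf, len, "%s", entry->device.dev);
}
```

Keeping the type tag next to the union lets a caller decide which identifier to hand back without knowing how the pool was populated.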
[libvirt] [PATCH 14/15] Using the Ephemeral Flag to prepare for Migration Support.
--- src/conf/domain_conf.c| 24 +++- src/qemu/qemu_domain.c|6 +- src/qemu/qemu_domain.h|3 ++- src/qemu/qemu_driver.c|6 +++--- src/qemu/qemu_hostdev.c |6 ++ src/qemu/qemu_migration.c |4 ++-- 6 files changed, 37 insertions(+), 12 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 00624ee..1005265 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -12759,7 +12759,8 @@ virDomainDefFormatInternal(virDomainDefPtr def, virCheckFlags(DUMPXML_FLAGS | VIR_DOMAIN_XML_INTERNAL_STATUS | VIR_DOMAIN_XML_INTERNAL_ACTUAL_NET | - VIR_DOMAIN_XML_INTERNAL_PCI_ORIG_STATES, + VIR_DOMAIN_XML_INTERNAL_PCI_ORIG_STATES | + VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES, -1); if (!(type = virDomainVirtTypeToString(def-virtType))) { @@ -13216,10 +13217,22 @@ virDomainDefFormatInternal(virDomainDefPtr def, /* If parent.type != NONE, this is just a pointer to the * hostdev in a higher-level device (e.g. virDomainNetDef), * and will have already been formatted there. + * Hostdevs marked as ephemeral are hybrid hostdevs and + * should not be formatted. 
*/ -if (def-hostdevs[n]-parent.type == VIR_DOMAIN_DEVICE_NONE -virDomainHostdevDefFormat(buf, def-hostdevs[n], flags) 0) { -goto cleanup; +if (def-hostdevs[n]-parent.type == VIR_DOMAIN_DEVICE_NONE) { +if ((flags VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES) == 0) { +if (virDomainHostdevDefFormat(buf, def-hostdevs[n], flags) 0) { +goto cleanup; +} +} +else { +if (def-hostdevs[n]-ephemeral == 0) { +if (virDomainHostdevDefFormat(buf, def-hostdevs[n], flags) 0) { +goto cleanup; +} +} +} } } @@ -13267,7 +13280,8 @@ virDomainDefFormat(virDomainDefPtr def, unsigned int flags) { virBuffer buf = VIR_BUFFER_INITIALIZER; -virCheckFlags(DUMPXML_FLAGS, NULL); +virCheckFlags(DUMPXML_FLAGS | VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES, + NULL); if (virDomainDefFormatInternal(def, flags, buf) 0) return NULL; diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index c47890b..447ec24 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1335,12 +1335,16 @@ char * qemuDomainDefFormatLive(struct qemud_driver *driver, virDomainDefPtr def, bool inactive, -bool compatible) +bool compatible, +bool ephemeral) { unsigned int flags = QEMU_DOMAIN_FORMAT_LIVE_FLAGS; if (inactive) flags |= VIR_DOMAIN_XML_INACTIVE; + +if (ephemeral) +flags |= VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES; return qemuDomainDefFormatXML(driver, def, flags, compatible); } diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index b96087e..8e707a5 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -268,7 +268,8 @@ char *qemuDomainFormatXML(struct qemud_driver *driver, char *qemuDomainDefFormatLive(struct qemud_driver *driver, virDomainDefPtr def, bool inactive, - bool compatible); + bool compatible, + bool ephemeral); void qemuDomainObjTaint(struct qemud_driver *driver, virDomainObjPtr obj, diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 9bf89bb..e879d6e 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2683,9 +2683,9 @@ qemuDomainSaveInternal(struct 
qemud_driver *driver, virDomainPtr dom, virDomainDefFree(def); goto endjob; } -xml = qemuDomainDefFormatLive(driver, def, true, true); +xml = qemuDomainDefFormatLive(driver, def, true, true, false); } else { -xml = qemuDomainDefFormatLive(driver, vm-def, true, true); +xml = qemuDomainDefFormatLive(driver, vm-def, true, true, false); } if (!xml) { virReportError(VIR_ERR_OPERATION_FAILED, @@ -10607,7 +10607,7 @@ qemuDomainSnapshotCreateXML(virDomainPtr domain, } else { /* Easiest way to clone inactive portion of vm-def is via * conversion in and back out of xml. */ -if (!(xml = qemuDomainDefFormatLive(driver, vm-def, true, false)) || +if (!(xml = qemuDomainDefFormatLive(driver, vm-def, true, false, false)) || !(def-dom = virDomainDefParseString(driver-caps, xml, QEMU_EXPECTED_VIRT_TYPES,
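The nested conditions added to the formatter in this patch reduce to a single predicate: a standalone hostdev is formatted unless the caller asked for ephemeral devices to be left out and the device is ephemeral. A minimal sketch of that predicate follows; `hostdev_should_format` and the simplified `struct hostdev` are hypothetical, and only the flag value (1 << 24) is taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same bit as the flag added in patch 11/15; the name is shortened here. */
#define XML_NO_EPHEMERAL_DEVICES (1u << 24)

/* Hypothetical, simplified hostdev: only the fields the decision needs. */
struct hostdev {
    bool has_parent;  /* true when a higher-level device formats it */
    bool ephemeral;
};

/* Decide whether a hostdev appears in the generated XML. */
bool hostdev_should_format(const struct hostdev *dev, unsigned int flags)
{
    assert(dev != NULL);
    if (dev->has_parent)
        return false;  /* already formatted inside its parent */
    if ((flags & XML_NO_EPHEMERAL_DEVICES) && dev->ephemeral)
        return false;  /* caller asked to hide ephemeral devices */
    return true;
}

/* Count how many devices would be formatted for a given flag set. */
size_t count_formatted(const struct hostdev *devs, size_t n,
                       unsigned int flags)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (hostdev_should_format(&devs[i], flags))
            count++;
    return count;
}
```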
[libvirt] [PATCH 11/15] Hybrid Hostdevs should be marked as ephemeral.
The ephemeral flag is checked along with the hostdev parent type before
freeing a hostdev. Additionally, Hostdev-Hybrid mode supports live migration
with PCI passthrough. The ephemeral flag plays a very important role in the
upcoming migration support patch.
---
 include/libvirt/libvirt.h.in |    1 +
 src/conf/domain_conf.c       |    6 +-
 src/conf/domain_conf.h       |    1 +
 src/qemu/qemu_hotplug.c      |    3 ++-
 4 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
index d21d029..6c68fdd 100644
--- a/include/libvirt/libvirt.h.in
+++ b/include/libvirt/libvirt.h.in
@@ -1631,6 +1631,7 @@ typedef enum {
     VIR_DOMAIN_XML_SECURE       = (1 << 0), /* dump security sensitive information too */
     VIR_DOMAIN_XML_INACTIVE     = (1 << 1), /* dump inactive domain information */
     VIR_DOMAIN_XML_UPDATE_CPU   = (1 << 2), /* update guest CPU requirements according to host CPU */
+    VIR_DOMAIN_XML_NO_EPHEMERAL_DEVICES = (1 << 24), /* Do not include ephemeral devices */
 } virDomainXMLFlags;
 
 char *                  virDomainGetXMLDesc     (virDomainPtr domain,
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 361850a..00624ee 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -1544,7 +1544,9 @@ void virDomainDefFree(virDomainDefPtr def)
      * to virDomainHostdevDefFree().
      */
     for (i = 0 ; i < def->nhostdevs ; i++)
-        virDomainHostdevDefFree(def->hostdevs[i]);
+        if (def->hostdevs[i]->ephemeral == 0) {
+            virDomainHostdevDefFree(def->hostdevs[i]);
+        }
     VIR_FREE(def->hostdevs);
 
     for (i = 0 ; i < def->nleases ; i++)
@@ -4402,6 +4404,7 @@ virDomainActualNetDefParseXML(xmlNodePtr node,
             virReportOOMError();
             goto error;
         }
+        hostdev->ephemeral = 1;
         /* The helper function expects type to already be found and
          * passed in as a string, since it is in a different place in
          * NetDef vs HostdevDef.
@@ -4795,6 +4798,7 @@ virDomainNetDefParseXML(virCapsPtr caps,
             virReportOOMError();
             goto error;
         }
+        hostdev->ephemeral = 1;
         addrtype = virXPathString("string(./source/address/@type)", ctxt);
         /* if not explicitly stated, source/vendor implies usb device */
         if (!addrtype && virXPathNode("./source/vendor", ctxt) &&
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 4584671..f88363a 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -371,6 +371,7 @@ struct _virDomainHostdevDef {
     virDomainDeviceDef actualParent; /*used only in the case of hybrid hostdev*/
     int mode; /* enum virDomainHostdevMode */
     unsigned int managed : 1;
+    unsigned int ephemeral: 1;
     union {
         virDomainHostdevSubsys subsys;
         struct {
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 1822289..0fd506e 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -2107,7 +2107,8 @@ int qemuDomainDetachThisHostDevice(struct qemud_driver *driver,
             VIR_WARN("Failed to restore host device labelling");
         }
         virDomainHostdevRemove(vm->def, idx);
-        virDomainHostdevDefFree(detach);
+        if (detach->ephemeral == 0)
+            virDomainHostdevDefFree(detach);
     }
     return ret;
 }
-- 
1.7.4.4
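The ownership rule this patch relies on can be illustrated in isolation: a hostdev embedded in a net definition is released together with that net definition, so a walk over the domain's hostdev list must only free the standalone entries. The sketch below assumes exactly that layout; `struct netdef` and `hostdev_list_free` are hypothetical names, not libvirt code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified types. */
struct hostdev { int ephemeral; };

/* An ephemeral hostdev lives inside its net definition. */
struct netdef { struct hostdev hostdev; };

/* Free only the hostdevs this list owns; ephemeral entries point into a
 * netdef and are released with it. Returns the number of entries freed. */
size_t hostdev_list_free(struct hostdev **list, size_t n)
{
    size_t freed = 0;

    assert(list != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (list[i] && !list[i]->ephemeral) {
            free(list[i]);
            list[i] = NULL;
            freed++;
        }
    }
    return freed;
}
```

Freeing the embedded entry as well would hand `free()` a pointer into the middle of another object, which is the failure the ephemeral check guards against under this assumption.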
[libvirt] [PATCH 13/15] Hostdev-hybrid qemu driver implementation
--- src/qemu/qemu_command.c | 59 +++ src/qemu/qemu_hotplug.c | 23 -- src/qemu/qemu_process.c |3 +- 3 files changed, 81 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index bb66364..d67394e 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -27,6 +27,7 @@ #include qemu_hostdev.h #include qemu_capabilities.h #include qemu_bridge_filter.h +#include qemu_hostdev.h #include cpu/cpu.h #include memory.h #include logging.h @@ -4330,10 +4331,16 @@ qemuBuildCommandLine(virConnectPtr conn, bool emitBootindex = false; int usbcontroller = 0; bool usblegacy = false; + +virDomainObjPtr vm = NULL; +virDomainObjListPtr doms = driver-domains; + uname_normalize(ut); virUUIDFormat(def-uuid, uuid); +vm = virHashLookup(doms-objs, uuid); + emulator = def-emulator; /* @@ -5257,6 +5264,58 @@ qemuBuildCommandLine(virConnectPtr conn, continue; } + if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) { +virDomainHostdevDefPtr hostdev = virDomainNetGetActualHostdev(net); +virDomainHostdevDefPtr found; +if (vmop == VIR_NETDEV_VPORT_PROFILE_OP_CREATE) { +if (qemuAssignDeviceHostdevAlias(def, + hostdev, + (def-nhostdevs-1)) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Could not assign alias to Net Hostdev)); +goto error; +} + +if (virDomainHostdevFind(def, + hostdev, + found) 0) { +qemuDomainObjPrivatePtr priv = vm-privateData; +if (qemuDomainPCIAddressEnsureAddr(priv-pciaddrs, + hostdev-info) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Could not assign PCI addr to Hostdev hybrid)); +goto error; +} + +if (virDomainHostdevInsert(def, + hostdev) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Hostdev not inserted into the array)); +goto error; +} + +if (qemuPrepareHostdevPCIDevices(driver, def-name, + def-uuid, hostdev, 1) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Prepare Hostdev PCI Devices failed)); +goto error; +} +} +} + +int tapfd = qemuPhysIfaceConnect(def, driver, net, + qemuCaps, vmop); +if 
(tapfd 0) +goto error; + +last_good_net = i; +virCommandTransferFD(cmd, tapfd); + +if (snprintf(tapfd_name, sizeof(tapfd_name), %d, + tapfd) = sizeof(tapfd_name)) +goto no_memory; +} + if (actualType == VIR_DOMAIN_NET_TYPE_NETWORK || actualType == VIR_DOMAIN_NET_TYPE_BRIDGE) { /* diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c index 0fd506e..3586d3e 100644 --- a/src/qemu/qemu_hotplug.c +++ b/src/qemu/qemu_hotplug.c @@ -694,6 +694,21 @@ int qemuDomainAttachNetDevice(virConnectPtr conn, goto cleanup; } +if (actualType == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) { +ret = qemuDomainAttachHostDevice(driver, vm, + virDomainNetGetActualHostdev(net)); +if (ret 0) +goto cleanup; + +if ((tapfd = qemuPhysIfaceConnect(vm-def, driver, net, + priv-qemuCaps, + VIR_NETDEV_VPORT_PROFILE_OP_CREATE)) 0) +goto cleanup; +iface_connected = true; +if (qemuOpenVhostNet(vm-def, net, priv-qemuCaps, vhostfd) 0) +goto cleanup; +} + if (actualType == VIR_DOMAIN_NET_TYPE_BRIDGE || actualType == VIR_DOMAIN_NET_TYPE_NETWORK) { /* @@ -2198,7 +2213,8 @@ qemuDomainDetachNetDevice(struct qemud_driver *driver, goto cleanup; } -if
[libvirt] [PATCH 15/15] Migration support for hostdev-hybrid.
This patch uses the ephemeral flag to prevent the hybrid hostdev from being formatted into the xml. Before migration the hybrid hostdev is hot unplugged and hotplugged again after migration is the specific hostdev is available on the destination host. --- src/qemu/qemu_migration.c | 102 ++-- 1 files changed, 97 insertions(+), 5 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index d8aefa0..21894e8 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -31,6 +31,7 @@ #include qemu_monitor.h #include qemu_domain.h #include qemu_process.h +#include qemu_hotplug.h #include qemu_capabilities.h #include qemu_cgroup.h @@ -49,6 +50,7 @@ #include storage_file.h #include viruri.h #include hooks.h +#include network/bridge_driver.h #define VIR_FROM_THIS VIR_FROM_QEMU @@ -122,6 +124,79 @@ struct _qemuMigrationCookie { virDomainDefPtr persistent; }; +static void +qemuMigrationRemoveEphemeralDevices(struct qemud_driver *driver, +virDomainObjPtr vm) +{ +virDomainHostdevDefPtr dev; +virDomainDeviceDef def; +unsigned int i; + +for (i = 0; i vm-def-nhostdevs; i++) { +dev = vm-def-hostdevs[i]; +if (dev-ephemeral == 1) { +def.type = VIR_DOMAIN_DEVICE_HOSTDEV; +def.data.hostdev = dev; + +if (qemuDomainDetachHostDevice(driver, vm, def) = 0) { +continue; /* nhostdevs reduced */ +} +} +} +} + +static void +qemuMigrationRestoreEphemeralDevices(struct qemud_driver *driver, + virDomainObjPtr vm) +{ +virDomainNetDefPtr net; + unsigned int i; + +/* Do nothing if ephemeral devices are present in which case this + function was called before qemuMigrationRemoveEphemeralDevices */ + +for (i = 0; i vm-def-nhostdevs; i++) { +if (vm-def-hostdevs[i]-ephemeral == 1) +return; +} + +for (i = 0; i vm-def-nnets; i++) { +net = vm-def-nets[i]; + +if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) { +if (qemuDomainAttachHostDevice(driver, vm, + virDomainNetGetActualHostdev(net)) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Hybrid 
Hostdev cannot be attached after migration)); +networkReleaseActualDevice(net); +} +} +return; +} +} + +static void +qemuMigrationAttachEphemeralDevices(struct qemud_driver *driver, +virDomainObjPtr vm) +{ +virDomainNetDefPtr net; +unsigned int i; + +for (i = 0; i vm-def-nnets; i++) { +net = vm-def-nets[i]; + +if (virDomainNetGetActualType(net) == VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID) { +if (qemuDomainAttachHostDevice(driver, vm, + virDomainNetGetActualHostdev(net)) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Hybrid Hostdev cannot be attached after migration)); +networkReleaseActualDevice(net); +} +} +} +return; +} + static void qemuMigrationCookieGraphicsFree(qemuMigrationCookieGraphicsPtr grap) { if (!grap) @@ -800,6 +875,7 @@ qemuMigrationIsAllowed(struct qemud_driver *driver, virDomainObjPtr vm, virDomainDefPtr def) { int nsnapshots; +unsigned int i; if (vm) { if (qemuProcessAutoDestroyActive(driver, vm)) { @@ -817,10 +893,12 @@ qemuMigrationIsAllowed(struct qemud_driver *driver, virDomainObjPtr vm, def = vm-def; } -if (def-nhostdevs 0) { -virReportError(VIR_ERR_OPERATION_INVALID, - %s, _(Domain with assigned host devices cannot be migrated)); -return false; +for (i = 0; i def-nhostdevs; i++) { +if (def-hostdevs[i]-ephemeral == 0) { +virReportError(VIR_ERR_OPERATION_INVALID, + %s, _(Domain with assigned host devices cannot be migrated)); +return false; +} } return true; @@ -2042,6 +2120,8 @@ static int doNativeMigrate(struct qemud_driver *driver, cookieout=%p, cookieoutlen=%p, flags=%lx, resource=%lu, driver, vm, uri, NULLSTR(cookiein), cookieinlen, cookieout, cookieoutlen, flags, resource); + +qemuMigrationRemoveEphemeralDevices(driver, vm); if (STRPREFIX(uri, tcp:) !STRPREFIX(uri, tcp://)) { char *tmp; @@ -2069,6 +2149,9 @@ static int doNativeMigrate(struct qemud_driver *driver, ret = qemuMigrationRun(driver, vm, cookiein, cookieinlen, cookieout, cookieoutlen, flags, resource, spec, dconn); +if (ret != 0 ) +
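One detail worth checking in a removal loop like qemuMigrationRemoveEphemeralDevices: if a successful detach compacts the array, a `continue` inside a `for` loop still advances the index, so the element that slid into the current slot is skipped. The sketch below shows one way to avoid that, assuming removal shifts later entries down; `dev_list` and `remove_ephemeral` are hypothetical and stand in for the domain's hostdev array.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed-capacity device list. */
struct dev { int id; int ephemeral; };
struct dev_list { struct dev items[8]; size_t n; };

/* Remove one entry and shift the tail down, as a compacting array would. */
static void dev_list_remove(struct dev_list *l, size_t idx)
{
    assert(idx < l->n);
    memmove(&l->items[idx], &l->items[idx + 1],
            (l->n - idx - 1) * sizeof(l->items[0]));
    l->n--;
}

/* Remove every ephemeral entry. The index only advances when nothing
 * was removed, so consecutive ephemeral entries are all visited. */
size_t remove_ephemeral(struct dev_list *l)
{
    size_t i = 0, removed = 0;

    while (i < l->n) {
        if (l->items[i].ephemeral) {
            dev_list_remove(l, i);
            removed++;
        } else {
            i++;
        }
    }
    return removed;
}
```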
[libvirt] [PATCH 12/15] Hostdev-hybrid network driver Implementation
This patch updates the network driver to properly utilize the new attributes/elements that are now in virNetworkDef --- src/network/bridge_driver.c | 139 +- 1 files changed, 122 insertions(+), 17 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 33bc09e..457716c 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -1934,9 +1934,9 @@ networkStartNetworkExternal(struct network_driver *driver ATTRIBUTE_UNUSED, virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs to be done each time a network of - * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is started. On - * failure, undo anything you've done, and return -1. On success - * return 0. + * type BRIDGE, PRIVATE, VEPA, HOSTDEV, HOSTDEV-HYBRID or PASSTHROUGH + * is started. On failure, undo anything you've done, and return -1. + * On success return 0. */ return 0; } @@ -1945,9 +1945,9 @@ static int networkShutdownNetworkExternal(struct network_driver *driver ATTRIBUT virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs to be done each time a network of - * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is shutdown. On - * failure, undo anything you've done, and return -1. On success - * return 0. + * type BRIDGE, PRIVATE, VEPA, HOSTDEV, HOSTDEV-HYBRID or PASSTHROUGH + * is shutdown. On failure, undo anything you've done, and return -1. + * On success return 0. 
*/ return 0; } @@ -1977,6 +1977,7 @@ networkStartNetwork(struct network_driver *driver, case VIR_NETWORK_FORWARD_VEPA: case VIR_NETWORK_FORWARD_PASSTHROUGH: case VIR_NETWORK_FORWARD_HOSTDEV: +case VIR_NETWORK_FORWARD_HOSTDEV_HYBRID: ret = networkStartNetworkExternal(driver, network); break; } @@ -2037,6 +2038,7 @@ static int networkShutdownNetwork(struct network_driver *driver, case VIR_NETWORK_FORWARD_VEPA: case VIR_NETWORK_FORWARD_PASSTHROUGH: case VIR_NETWORK_FORWARD_HOSTDEV: +case VIR_NETWORK_FORWARD_HOSTDEV_HYBRID: ret = networkShutdownNetworkExternal(driver, network); break; } @@ -2780,7 +2782,8 @@ networkCreateInterfacePool(virNetworkDefPtr netdef) { goto finish; } } -if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV) { +if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV || +netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV_HYBRID) { netdef-forwardIfs[ii].type = VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI; /*Assuming PCI as VF's are PCI devices */ netdef-forwardIfs[ii].device.pci.domain = virt_fns[ii]-domain; netdef-forwardIfs[ii].device.pci.bus = virt_fns[ii]-bus; @@ -2925,6 +2928,8 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) } iface-data.network.actual-data.hostdev.def.parent.type = VIR_DOMAIN_DEVICE_NET; iface-data.network.actual-data.hostdev.def.parent.data.net = iface; +iface-data.network.actual-data.hostdev.def.actualParent.type = VIR_DOMAIN_DEVICE_NET; +iface-data.network.actual-data.hostdev.def.actualParent.data.net = iface; iface-data.network.actual-data.hostdev.def.info = iface-info; iface-data.network.actual-data.hostdev.def.mode = VIR_DOMAIN_HOSTDEV_MODE_SUBSYS; iface-data.network.actual-data.hostdev.def.managed = netdef-managed; @@ -2958,6 +2963,97 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) dev-device.pci.function, dev-usageCount); +} else if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV_HYBRID) { +char *pfname = NULL; +if (!iface-data.network.actual + (VIR_ALLOC(iface-data.network.actual) 0)) { 
+virReportOOMError(); +goto cleanup; +} +iface-data.network.actual-type = VIR_DOMAIN_NET_TYPE_HOSTDEV_HYBRID; + +if ((netdef-nForwardPfs 0) (netdef-nForwardIfs = 0)) { +if(networkCreateInterfacePool(netdef) 0) { +virReportError(VIR_ERR_INTERNAL_ERROR, %s, + _(Could not create Interface Pool from PF)); +goto cleanup; +} +} +/* pick first dev with 0 usageCount */ + +for (ii = 0; ii netdef-nForwardIfs; ii++) { +if (netdef-forwardIfs[ii].usageCount == 0) { +dev = netdef-forwardIfs[ii]; +break; +} +} +if (!dev) { +virReportError(VIR_ERR_INTERNAL_ERROR, + _(network '%s' requires exclusive access to interfaces, but none are available), + netdef-name); +goto cleanup; +} +
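The allocation step visible in this hunk ("pick first dev with 0 usageCount") amounts to a first-fit search over the pool. A minimal sketch under the assumption that the chosen device's count is incremented on allocation and decremented on release; `pool_dev`, `pool_acquire` and `pool_release` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool entry: a device plus how many guests use it. */
struct pool_dev {
    const char *name;
    int usage_count;
};

/* Return the first unused device and mark it in use, or NULL when every
 * device in the pool is already taken. */
struct pool_dev *pool_acquire(struct pool_dev *devs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (devs[i].usage_count == 0) {
            devs[i].usage_count++;
            return &devs[i];
        }
    }
    return NULL;
}

/* Give a device back to the pool. */
void pool_release(struct pool_dev *dev)
{
    if (dev && dev->usage_count > 0)
        dev->usage_count--;
}
```

A NULL result corresponds to the "requires exclusive access to interfaces, but none are available" error path in the patch.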
[libvirt] Question about networkNotifyActualDevice
Hello All,

I have recently found a bug in my patches to support forward mode=hostdev.
The bug is that a libvirtd restart forgets about netdef->forwardIfs.

Steps to reproduce:

1) virsh net-define hostdev.xml

   <network>
     <name>hostdev</name>
     <uuid>81ff0d90-c91e-6742-64da-4a736edb9a8f</uuid>
     <forward mode='hostdev' managed='yes'>
       <pf dev='eth4'/>
     </forward>
   </network>

2) virsh define guest.xml

3) virsh start guest

   [root@c6100m libvirt]# virsh net-dumpxml hostdev
   <network>
     <name>hostdev</name>
     <uuid>81ff0d90-c91e-6742-64da-4a736edb9a8f</uuid>
     <forward managed='yes' mode='hostdev'>
       <pf dev='eth4'/>
       <address type='pci' domain='0x' bus='0x04' slot='0x00' function='0x2'/>
       <address type='pci' domain='0x' bus='0x04' slot='0x00' function='0x4'/>
       <address type='pci' domain='0x' bus='0x04' slot='0x00' function='0x6'/>
       <address type='pci' domain='0x' bus='0x04' slot='0x01' function='0x0'/>
       <address type='pci' domain='0x' bus='0x04' slot='0x01' function='0x2'/>
       <address type='pci' domain='0x' bus='0x04' slot='0x01' function='0x4'/>
       <address type='pci' domain='0x' bus='0x04' slot='0x01' function='0x6'/>
       <address type='pci' domain='0x' bus='0x04' slot='0x02' function='0x0'/>
     </forward>
   </network>

4) service libvirtd restart

After libvirtd is restarted I observe that the guest is shut down.
/var/log/libvirt/libvirtd.log complains with the following error message:

   networkNotifyActualDevice:3299 : internal error network 'hostdev' doesn't have dev in use by domain.

Using gdb, when I set a breakpoint on networkNotifyActualDevice at libvirtd
start, I observe that netdef->nForwardIfs has been reset to 0, but
netdef->nForwardPfs is correctly set to 1.

So my question is: does a libvirtd restart cause the network to remember only
the inactive XML and not the active XML?

Does anyone have any ideas on how I should proceed with this bug?

-- 
Many Thanks,
Regards,
Shradha Shah
Re: [libvirt] Question about pci-passthrough
Hello Laine,

Thanks for your help. Please find my comments inline.

On 07/27/2012 08:41 PM, Laine Stump wrote:
> On 07/26/2012 11:53 AM, Shradha Shah wrote:
>> Hello All,
>>
>> I had a question about pci-passthrough using hostdev mode in Libvirt.
>>
>> When I assign hostdev->parent.type = VIR_DOMAIN_DEVICE_NET, the hostdev
>> is passed into the guest and acts as a network device.
>
> Actually, aside from libvirt setting the MAC address prior to assigning
> the device to the guest, all the rest of the handling is identical to
> that of a PCI device.
>
>> When I assign hostdev->parent.type = VIR_DOMAIN_DEVICE_NONE, the hostdev
>> is passed into the guest and acts as a PCI Host device.
>>
>> Please can someone point me to the part of code in libvirt where this
>> decision is made based on hostdev->parent.type?
>
> I'm curious why you're setting those yourself. that's really something
> internal that should only be set when parsing a device defined as
> <interface type='hostdev'> - this ends up creating entries for both types
> of devices, a virDomainNetDef and a virDomainHostdevDef, that are tightly
> intertwined.

I am working on HOSTDEV_HYBRID patches. I am following the exact same steps
as for HOSTDEV mode. So in the Hybrid mode I need a net device in the guest
along with the VF; the VF needs to be pushed into the guest as a PCI device.

> If you attempt to just allocate your own virDomainHostdevDef and set
> hostdev->parent.type = VIR_DOMAIN_DEVICE_NET, you will have a very bad
> time (tm). The code assumes that any HostdevDef that has a
> parent-type != NONE is not a standalone object, but is one that resides
> within a higher level object (see the definition of _virDomainNetDef, in
> particular the virDomainHostdevDef def that is defined within it). The
> *only way* to setup one of these objects that will work properly is to
> create a virDomainNetDef object (e.g. call it net), then set
> net->type = VIR_DOMAIN_NET_TYPE_HOSTDEV, and initialize
> net->data.hostdev.dev with all the hostdev info, including pointing its
> parent back to the original net. Likewise, the virDomainNetDef that you
> create *must* be placed on the domain's list of network devices and
> net->data.hostdev.def *must* be placed on the domain's list of hostdevs.
> All of this is handled for you by virDomainNetDefParse (look for the
> VIR_DOMAIN_NET_TYPE_HOSTDEV case).

Agreed

> As to where the decision is made about how to treat the device - for
> starters, <interface type='hostdev'> devices are never attached by calling
> the function to do a hostdev attach (qemuDomainAttachHostDevice)
> directly - they are attached by calling the function that does an attach
> of a net device (qemuDomainAttachNetDevice) - that function calls
> networkAllocateActualDevice if necessary (in case it's a type='hostdev'
> hiding behind a type='network') and then if the actualType is hostdev,
> calls qemuDomainAttachHostDevice to do the rest of the work.
> qemuDomainAttachHostDevice will call down a couple of levels to
> qemuPrepareHostdevPCIDevices, which will check for
> hostdev->parent.type == VIR_DOMAIN_DEVICE_NET, and if so will call
> qemuDomainHostdevNetConfigReplace, which calls the functions that do the
> netdev-specific setup (i.e. setting the MAC address, although I plan to
> also add support for setting the VLAN tag).

Agreed

> So, a more organized version of this:
>
> 1) if you want a hybrid netdev/hostdev, start by creating a
> virDomainNetDef and filling in the hostdev part, not vice versa, and make
> sure the corresponding parts are on their respective lists for the
> domain. (hostdevs and nets)

I have followed this step: created the virDomainNetDef and filled in the
hostdev part, but I had to assign hostdev->parent.type =
VIR_DOMAIN_DEVICE_NONE as I wanted to push the VF as a PCI device into the
guest.

> 2) to attach/detach any of these devices, call the *net* version of the
> attach/detach functions, not the hostdev version.

I have correctly followed this step as well.

> 3) The place that checks if a particular device is a network-type of
> hostdev, is qemuPrepareHostdevPCIDevices - look for
> hostdev->parent.type == VIR_DOMAIN_DEVICE_NET in that function and follow
> the call chain down from there to where the action is.

The only problem now is that, since hostdev->parent.type =
VIR_DOMAIN_DEVICE_NONE, I cannot assign the VF a MAC address. If I change
hostdev->parent.type to VIR_DOMAIN_DEVICE_NET, I am able to assign the VF a
MAC address, but the VF is pushed into the guest as a net device, not a PCI
device.

So I am able to push the VF into the guest as a PCI device along with a
virtio net device, but I cannot resolve the problem of assigning a MAC
address to the VF. This is the only issue remaining in the HOSTDEV_HYBRID
patches.

Many Thanks,
Regards,
Shradha Shah
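The dispatch Laine describes, where the prepare step applies network-specific setup only when the hostdev's parent is a net device, can be sketched as follows. This is an illustration of the control flow only: `prepare_hostdev` and the simplified `struct hostdev` are hypothetical and do not reproduce qemuPrepareHostdevPCIDevices or qemuDomainHostdevNetConfigReplace.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for VIR_DOMAIN_DEVICE_NONE / VIR_DOMAIN_DEVICE_NET. */
enum parent_type { PARENT_NONE, PARENT_NET };

struct hostdev {
    enum parent_type parent_type;
    unsigned char mac[6];
    int mac_applied;  /* set when netdev-specific setup ran */
};

/* Prepare a hostdev for assignment. The MAC address is applied only when
 * the parent is a net device; everything after that branch would be the
 * common PCI path shared by both kinds of hostdev. */
int prepare_hostdev(struct hostdev *dev, const unsigned char mac[6])
{
    assert(dev != NULL);
    dev->mac_applied = 0;
    if (dev->parent_type == PARENT_NET) {
        if (!mac)
            return -1;
        memcpy(dev->mac, mac, 6);
        dev->mac_applied = 1;
    }
    return 0;
}
```

Seen this way, the problem raised in the reply is that a hybrid VF needs the MAC branch while still being presented to the guest as a plain PCI device, which a single parent type cannot express.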
Re: [libvirt] [PATCH 0/6 v3] Support forward mode='hostdev' and interface pools
Hello All,

May I request a review for this patch series that supports forward
mode='hostdev' with interface pools?

Many Thanks,
Regards,
Shradha Shah

On 06/29/2012 01:19 PM, Shradha Shah wrote:
> This patch series supports the forward mode='hostdev'. The functionality
> of this mode is the same as interface type='hostdev' but with the added
> benefit of using interface pools.
>
> The patch series also contains a patch to support use of interface names
> and PCI device addresses interchangeably in a network xml, and return the
> appropriate one in actualDevice when networkAllocateActualDevice is
> called.
>
> At the top level managed attribute can be specified with identical
> results as when it's specified for a hostdev.
>
> Currently forward mode='hostdev' does not support USB devices.
>
> Shradha Shah (6):
>   Prerequisite Patch. virDomainDevicePCIAddress and respective functions
>     moved to a new file called conf/device_conf.ch
>   Moved the code to create implicit interface pool from PF to a new
>     function
>   RNG updates, new xml parser/formatter code to support forward
>     mode=hostdev
>   Code to return interface name or pci_addr of the VF in actualDevice
>   Forward Mode Hostdev network driver Implementation
>   Forward Mode 'Hostdev' qemu driver implementation
>
>  docs/formatnetwork.html.in             |  62 ++
>  docs/schemas/network.rng               |  82 -
>  include/libvirt/virterror.h            |   1 +
>  src/Makefile.am                        |   7 +-
>  src/conf/device_conf.c                 | 135 +
>  src/conf/device_conf.h                 |  65 +++
>  src/conf/domain_conf.c                 | 114 ++--
>  src/conf/domain_conf.h                 |  25 +---
>  src/conf/network_conf.c                | 126 +++--
>  src/conf/network_conf.h                |  29 +++-
>  src/libvirt_private.syms               |  10 +-
>  src/network/bridge_driver.c            | 322 +---
>  src/qemu/qemu_command.c                |  27 ++-
>  src/qemu/qemu_hotplug.c                |   7 +-
>  src/qemu/qemu_monitor.c                |  14 +-
>  src/qemu/qemu_monitor.h                |  17 +-
>  src/qemu/qemu_monitor_json.c           |  14 +-
>  src/qemu/qemu_monitor_json.h           |  14 +-
>  src/qemu/qemu_monitor_text.c           |  16 +-
>  src/qemu/qemu_monitor_text.h           |  14 +-
>  src/util/virnetdev.c                   |  29 ++--
>  src/util/virnetdev.h                   |   4 +-
>  src/xen/xend_internal.c                |   3 +-
>  tests/networkxml2xmlin/hostdev-pf.xml  |  11 +
>  tests/networkxml2xmlin/hostdev.xml     |  10 +
>  tests/networkxml2xmlout/hostdev-pf.xml |   7 +
>  tests/networkxml2xmlout/hostdev.xml    |  10 +
>  tests/networkxml2xmltest.c             |   2 +
>  28 files changed, 890 insertions(+), 287 deletions(-)
>  create mode 100644 src/conf/device_conf.c
>  create mode 100644 src/conf/device_conf.h
>  create mode 100644 tests/networkxml2xmlin/hostdev-pf.xml
>  create mode 100644 tests/networkxml2xmlin/hostdev.xml
>  create mode 100644 tests/networkxml2xmlout/hostdev-pf.xml
>  create mode 100644 tests/networkxml2xmlout/hostdev.xml
Re: [libvirt] [PATCH 0/6 v3] Support forward mode='hostdev' and interface pools
That's fine. Thanks. Just wanted to ping the list once and keep these
patches in the loop.

Many Thanks,
Regards,
Shradha Shah

On 07/26/2012 02:59 PM, Laine Stump wrote:
> On 07/26/2012 06:21 AM, Shradha Shah wrote:
>> Hello All,
>>
>> May I request a review for this patch series that supports forward
>> mode='hostdev' with interface pools?
>
> Sorry for keeping you up in the air for so long. This has been on my list
> since I got back from vacation, but something else always seems to come
> up. I'll try to go through it today.
[libvirt] Question about pci-passthrough
Hello All,

I had a question about pci-passthrough using hostdev mode in Libvirt.

When I assign hostdev->parent.type = VIR_DOMAIN_DEVICE_NET, the hostdev is
passed into the guest and acts as a network device.

When I assign hostdev->parent.type = VIR_DOMAIN_DEVICE_NONE, the hostdev is
passed into the guest and acts as a PCI host device.

Could someone please point me to the part of the code in libvirt where this
decision is made based on hostdev->parent.type?

-- 
Many Thanks,
Regards,
Shradha Shah
[libvirt] [PATCH 0/6 v3] Support forward mode='hostdev' and interface pools
This patch series supports the forward mode='hostdev'. The functionality
of this mode is the same as interface type='hostdev' but with the added
benefit of using interface pools.

The patch series also contains a patch to support use of interface names
and PCI device addresses interchangeably in a network xml, and return the
appropriate one in actualDevice when networkAllocateActualDevice is called.

At the top level managed attribute can be specified with identical results
as when it's specified for a hostdev.

Currently forward mode='hostdev' does not support USB devices.

Shradha Shah (6):
  Prerequisite Patch. virDomainDevicePCIAddress and respective
    functions moved to a new file called conf/device_conf.ch
  Moved the code to create implicit interface pool from PF to a new
    function
  RNG updates, new xml parser/formatter code to support forward
    mode=hostdev
  Code to return interface name or pci_addr of the VF in actualDevice
  Forward Mode Hostdev network driver Implementation
  Forward Mode 'Hostdev' qemu driver implementation

 docs/formatnetwork.html.in             |   62 ++
 docs/schemas/network.rng               |   82 -
 include/libvirt/virterror.h            |    1 +
 src/Makefile.am                        |    7 +-
 src/conf/device_conf.c                 |  135 +
 src/conf/device_conf.h                 |   65 +++
 src/conf/domain_conf.c                 |  114 ++--
 src/conf/domain_conf.h                 |   25 +---
 src/conf/network_conf.c                |  126 +++--
 src/conf/network_conf.h                |   29 +++-
 src/libvirt_private.syms               |   10 +-
 src/network/bridge_driver.c            |  322 +---
 src/qemu/qemu_command.c                |   27 ++-
 src/qemu/qemu_hotplug.c                |    7 +-
 src/qemu/qemu_monitor.c                |   14 +-
 src/qemu/qemu_monitor.h                |   17 +-
 src/qemu/qemu_monitor_json.c           |   14 +-
 src/qemu/qemu_monitor_json.h           |   14 +-
 src/qemu/qemu_monitor_text.c           |   16 +-
 src/qemu/qemu_monitor_text.h           |   14 +-
 src/util/virnetdev.c                   |   29 ++--
 src/util/virnetdev.h                   |    4 +-
 src/xen/xend_internal.c                |    3 +-
 tests/networkxml2xmlin/hostdev-pf.xml  |   11 +
 tests/networkxml2xmlin/hostdev.xml     |   10 +
 tests/networkxml2xmlout/hostdev-pf.xml |    7 +
 tests/networkxml2xmlout/hostdev.xml    |   10 +
 tests/networkxml2xmltest.c             |    2 +
 28 files changed, 890 insertions(+), 287 deletions(-)
 create mode 100644 src/conf/device_conf.c
 create mode 100644 src/conf/device_conf.h
 create mode 100644 tests/networkxml2xmlin/hostdev-pf.xml
 create mode 100644 tests/networkxml2xmlin/hostdev.xml
 create mode 100644 tests/networkxml2xmlout/hostdev-pf.xml
 create mode 100644 tests/networkxml2xmlout/hostdev.xml

-- 
1.7.4.4
[libvirt] [PATCH 1/6 v3] Prerequisite Patch. virDomainDevicePCIAddress and respective functions moved to a new file called conf/device_conf.ch
Refactoring existing code without causing any functional changes to prepare for new code. This patch makes the code reusable. Signed-off-by: Shradha Shah ss...@solarflare.com --- include/libvirt/virterror.h |1 + src/Makefile.am |7 ++- src/conf/device_conf.c | 135 ++ src/conf/device_conf.h | 65 src/conf/domain_conf.c | 114 --- src/conf/domain_conf.h | 25 +--- src/libvirt_private.syms | 10 ++- src/qemu/qemu_command.c | 13 ++-- src/qemu/qemu_hotplug.c |7 +- src/qemu/qemu_monitor.c | 14 ++-- src/qemu/qemu_monitor.h | 17 +++--- src/qemu/qemu_monitor_json.c | 14 ++-- src/qemu/qemu_monitor_json.h | 14 ++-- src/qemu/qemu_monitor_text.c | 16 +++--- src/qemu/qemu_monitor_text.h | 14 ++-- src/xen/xend_internal.c |3 +- 16 files changed, 289 insertions(+), 180 deletions(-) diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h index 0e0bc9c..7ad1201 100644 --- a/include/libvirt/virterror.h +++ b/include/libvirt/virterror.h @@ -97,6 +97,7 @@ typedef enum { VIR_FROM_URI = 45, /* Error from URI handling */ VIR_FROM_AUTH = 46, /* Error from auth handling */ VIR_FROM_DBUS = 47, /* Error from DBus */ +VIR_FROM_DEVICE = 48, /* Error from Device */ # ifdef VIR_ENUM_SENTINELS VIR_ERR_DOMAIN_LAST diff --git a/src/Makefile.am b/src/Makefile.am index 2309984..7ffb3c2 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -199,6 +199,9 @@ CONSOLE_CONF_SOURCES = \ DOMAIN_LIST_SOURCES = \ conf/virdomainlist.c conf/virdomainlist.h +DEVICE_CONF_SOURCES = \ + conf/device_conf.c conf/device_conf.h + CONF_SOURCES = \ $(NETDEV_CONF_SOURCES) \ $(DOMAIN_CONF_SOURCES) \ @@ -212,7 +215,8 @@ CONF_SOURCES = \ $(SECRET_CONF_SOURCES) \ $(CPU_CONF_SOURCES) \ $(CONSOLE_CONF_SOURCES) \ - $(DOMAIN_LIST_SOURCES) + $(DOMAIN_LIST_SOURCES) \ + $(DEVICE_CONF_SOURCES) # The remote RPC driver, covering domains, storage, networks, etc REMOTE_DRIVER_GENERATED = \ @@ -1526,6 +1530,7 @@ libvirt_lxc_SOURCES = \ $(ENCRYPTION_CONF_SOURCES) \ $(NETDEV_CONF_SOURCES) \ $(DOMAIN_CONF_SOURCES) \ + 
$(DEVICE_CONF_SOURCES) \ $(SECRET_CONF_SOURCES) \ $(CPU_CONF_SOURCES) \ $(SECURITY_DRIVER_SOURCES) \ diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c new file mode 100644 index 000..d4eb764 --- /dev/null +++ b/src/conf/device_conf.c @@ -0,0 +1,135 @@ +/* + * device_conf.h: device XML handling + * + * Copyright (C) 2006-2012 Red Hat, Inc. + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Author: Shradha Shah ss...@solarflare.com + */ + +#include config.h +#include virterror_internal.h +#include datatypes.h +#include memory.h +#include xml.h +#include uuid.h +#include util.h +#include buf.h +#include conf/device_conf.h + +#define VIR_FROM_THIS VIR_FROM_DEVICE + +#define virDeviceReportError(code, ...) \ +virReportErrorHelper(VIR_FROM_DEVICE, code, __FILE__,\ + __FUNCTION__, __LINE__, __VA_ARGS__) + +VIR_ENUM_IMPL(virDeviceAddressPciMulti, + VIR_DEVICE_ADDRESS_PCI_MULTI_LAST, + default, + on, + off) + +int virDevicePCIAddressIsValid(virDevicePCIAddressPtr addr) +{ +/* PCI bus has 32 slots and 8 functions
[libvirt] [PATCH 2/6 v3] Moved the code to create implicit interface pool from PF to a new function
Just code movement no functional changes here. This makes the code reusable

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 src/network/bridge_driver.c |   84 ++
 1 files changed, 52 insertions(+), 32 deletions(-)

diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index 7e8de19..36afa1b 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -2730,6 +2730,56 @@ int networkRegister(void) {
  * backend function table.
  */
 
+/* networkCreateInterfacePool:
+ * @netdef: the original NetDef from the network
+ *
+ * Creates an implicit interface pool of VF's when a PF dev is given
+ */
+static int
+networkCreateInterfacePool(virNetworkDefPtr netdef) {
+    unsigned int num_virt_fns = 0;
+    char **vfname = NULL;
+    int ret = -1, ii = 0;
+
+    if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
+                                      &vfname, &num_virt_fns)) < 0) {
+        networkReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("Could not get Virtual functions on %s"),
+                           netdef->forwardPfs->dev);
+        goto finish;
+    }
+
+    if (num_virt_fns == 0) {
+        networkReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("No Vf's present on SRIOV PF %s"),
+                           netdef->forwardPfs->dev);
+        goto finish;
+    }
+
+    if ((VIR_ALLOC_N(netdef->forwardIfs, num_virt_fns)) < 0) {
+        virReportOOMError();
+        goto finish;
+    }
+
+    netdef->nForwardIfs = num_virt_fns;
+
+    for (ii = 0; ii < netdef->nForwardIfs; ii++) {
+        netdef->forwardIfs[ii].dev = strdup(vfname[ii]);
+        if (!netdef->forwardIfs[ii].dev) {
+            virReportOOMError();
+            goto finish;
+        }
+        netdef->forwardIfs[ii].usageCount = 0;
+    }
+
+    ret = 0;
+finish:
+    for (ii = 0; ii < num_virt_fns; ii++)
+        VIR_FREE(vfname[ii]);
+    VIR_FREE(vfname);
+    return ret;
+}
+
 /* networkAllocateActualDevice:
  * @iface: the original NetDef from the domain
  *
@@ -2748,8 +2798,6 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
     virNetworkObjPtr network;
     virNetworkDefPtr netdef;
     virPortGroupDefPtr portgroup;
-    unsigned int num_virt_fns = 0;
-    char **vfname = NULL;
     int ii;
     int ret = -1;
 
@@ -2895,36 +2943,11 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
              */
             if (netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) {
                 if ((netdef->nForwardPfs > 0) && (netdef->nForwardIfs <= 0)) {
-                    if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
-                                                      &vfname, &num_virt_fns)) < 0) {
+                    if ((networkCreateInterfacePool(netdef)) < 0) {
                         networkReportError(VIR_ERR_INTERNAL_ERROR,
-                                           _("Could not get Virtual functions on %s"),
-                                           netdef->forwardPfs->dev);
+                                           _("Could not Interface Pool"));
                         goto cleanup;
                     }
-
-                    if (num_virt_fns == 0) {
-                        networkReportError(VIR_ERR_INTERNAL_ERROR,
-                                           _("No Vf's present on SRIOV PF %s"),
-                                           netdef->forwardPfs->dev);
-                        goto cleanup;
-                    }
-
-                    if ((VIR_ALLOC_N(netdef->forwardIfs, num_virt_fns)) < 0) {
-                        virReportOOMError();
-                        goto cleanup;
-                    }
-
-                    netdef->nForwardIfs = num_virt_fns;
-
-                    for (ii = 0; ii < netdef->nForwardIfs; ii++) {
-                        netdef->forwardIfs[ii].dev = strdup(vfname[ii]);
-                        if (!netdef->forwardIfs[ii].dev) {
-                            virReportOOMError();
-                            goto cleanup;
-                        }
-                        netdef->forwardIfs[ii].usageCount = 0;
-                    }
                 }
 
                 /* pick first dev with 0 usageCount */
@@ -2976,9 +2999,6 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
 
     ret = 0;
 cleanup:
-    for (ii = 0; ii < num_virt_fns; ii++)
-        VIR_FREE(vfname[ii]);
-    VIR_FREE(vfname);
     if (network)
         virNetworkObjUnlock(network);
     if (ret < 0) {
-- 
1.7.4.4
[libvirt] [PATCH 3/6 v3] RNG updates, new xml parser/formatter code to support forward mode=hostdev
This patch introduces the new forward mode='hostdev' along with attribute managed Includes updates to the network RNG and new xml parser/formatter code. Signed-off-by: Shradha Shah ss...@solarflare.com --- docs/schemas/network.rng | 82 +++-- src/conf/network_conf.c| 126 src/conf/network_conf.h| 29 +++- src/network/bridge_driver.c| 18 ++-- tests/networkxml2xmlin/hostdev-pf.xml | 11 +++ tests/networkxml2xmlin/hostdev.xml | 10 +++ tests/networkxml2xmlout/hostdev-pf.xml |7 ++ tests/networkxml2xmlout/hostdev.xml| 10 +++ tests/networkxml2xmltest.c |2 + 9 files changed, 262 insertions(+), 33 deletions(-) diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 2ae879e..d1297cd 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -82,17 +82,41 @@ valuepassthrough/value valueprivate/value valuevepa/value + valuehostdev/value +/choice + /attribute +/optional + +optional + attribute name=managed +choice + valueyes/value + valueno/value /choice /attribute /optional interleave - zeroOrMore -element name='interface' - attribute name='dev' -ref name='deviceName'/ - /attribute -/element - /zeroOrMore + choice +group + zeroOrMore +element name='interface' + attribute name='dev' +ref name='deviceName'/ + /attribute +/element + /zeroOrMore +/group +group + zeroOrMore +element name='address' + attribute name='type' +valuepci/value + /attribute + ref name=pciaddress/ +/element + /zeroOrMore +/group + /choice optional element name='pf' attribute name='dev' @@ -238,4 +262,48 @@ /interleave /element /define + define name=pciaddress +optional + attribute name=domain +ref name=pciDomain/ + /attribute +/optional +attribute name=bus + ref name=pciBus/ +/attribute +attribute name=slot + ref name=pciSlot/ +/attribute +attribute name=function + ref name=pciFunc/ +/attribute +optional + attribute name=multifunction +choice + valueon/value + valueoff/value +/choice + /attribute +/optional + /define + define name=pciDomain +data type=string + param 
name=pattern(0x)?[0-9a-fA-F]{1,4}/param +/data + /define + define name=pciBus +data type=string + param name=pattern(0x)?[0-9a-fA-F]{1,2}/param +/data + /define + define name=pciSlot +data type=string + param name=pattern(0x)?[0-1]?[0-9a-fA-F]/param +/data + /define + define name=pciFunc +data type=string + param name=pattern(0x)?[0-7]/param +/data + /define /grammar diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 515bc36..be37856 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -48,10 +48,14 @@ #define VIR_FROM_THIS VIR_FROM_NETWORK VIR_ENUM_DECL(virNetworkForward) - VIR_ENUM_IMPL(virNetworkForward, VIR_NETWORK_FORWARD_LAST, - none, nat, route, bridge, private, vepa, passthrough ) + none, nat, route, bridge, private, vepa, passthrough, hostdev) + +VIR_ENUM_DECL(virNetworkForwardHostdevDevice) +VIR_ENUM_IMPL(virNetworkForwardHostdevDevice, + VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_LAST, + none, pci) #define virNetworkReportError(code, ...)\ virReportErrorHelper(VIR_FROM_NETWORK, code, __FILE__, \ @@ -100,6 +104,12 @@ virPortGroupDefClear(virPortGroupDefPtr def) static void virNetworkForwardIfDefClear(virNetworkForwardIfDefPtr def) { +VIR_FREE(def-device.dev); +} + +static void +virNetworkForwardPfDefClear(virNetworkForwardPfDefPtr def) +{ VIR_FREE(def-dev); } @@ -163,12 +173,13 @@ void virNetworkDefFree(virNetworkDefPtr def) VIR_FREE(def-domain); for (ii = 0 ; ii def-nForwardPfs def-forwardPfs ; ii++) { -virNetworkForwardIfDefClear(def-forwardPfs[ii]); +virNetworkForwardPfDefClear(def-forwardPfs[ii]); } VIR_FREE(def-forwardPfs); for (ii = 0 ; ii def-nForwardIfs def-forwardIfs ; ii
[libvirt] [PATCH 4/6 v3] Code to return interface name or pci_addr of the VF in actualDevice
The network pool should be able to keep track of both network device
names and PCI addresses, and return the appropriate one in the actualDevice
when networkAllocateActualDevice is called.

Signed-off-by: Shradha Shah ss...@solarflare.com
---
 src/network/bridge_driver.c |   33 +++--
 src/util/virnetdev.c        |   29 -
 src/util/virnetdev.h        |    4 +++-
 3 files changed, 46 insertions(+), 20 deletions(-)

diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index 230012c..2f8a937 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -59,6 +59,7 @@
 #include "dnsmasq.h"
 #include "configmake.h"
 #include "virnetdev.h"
+#include "pci.h"
 #include "virnetdevbridge.h"
 #include "virnetdevtap.h"
 
@@ -2739,10 +2740,11 @@ static int
 networkCreateInterfacePool(virNetworkDefPtr netdef) {
     unsigned int num_virt_fns = 0;
     char **vfname = NULL;
+    struct pci_config_address **virt_fns;
     int ret = -1, ii = 0;
 
     if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
-                                      &vfname, &num_virt_fns)) < 0) {
+                                      &vfname, &virt_fns, &num_virt_fns)) < 0) {
         networkReportError(VIR_ERR_INTERNAL_ERROR,
                            _("Could not get Virtual functions on %s"),
                            netdef->forwardPfs->dev);
@@ -2764,19 +2766,38 @@ networkCreateInterfacePool(virNetworkDefPtr netdef) {
     netdef->nForwardIfs = num_virt_fns;
 
     for (ii = 0; ii < netdef->nForwardIfs; ii++) {
-        netdef->forwardIfs[ii].device.dev = strdup(vfname[ii]);
-        if (!netdef->forwardIfs[ii].device.dev) {
-            virReportOOMError();
-            goto finish;
+        if (netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) {
+            if(vfname[ii]) {
+                netdef->forwardIfs[ii].device.dev = strdup(vfname[ii]);
+                if (!netdef->forwardIfs[ii].device.dev) {
+                    virReportOOMError();
+                    goto finish;
+                }
+            }
+            else {
+                networkReportError(VIR_ERR_INTERNAL_ERROR,
+                                   _("Passthrough mode requires interface names"));
+                goto finish;
+            }
+        }
+        else if (netdef->forwardType == VIR_NETWORK_FORWARD_HOSTDEV) {
+            netdef->forwardIfs[ii].type = VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI; /*Assuming PCI as VF's are PCI devices */
+            netdef->forwardIfs[ii].device.pci.domain = virt_fns[ii]->domain;
+            netdef->forwardIfs[ii].device.pci.bus = virt_fns[ii]->bus;
+            netdef->forwardIfs[ii].device.pci.slot = virt_fns[ii]->slot;
+            netdef->forwardIfs[ii].device.pci.function = virt_fns[ii]->function;
         }
         netdef->forwardIfs[ii].usageCount = 0;
     }
 
     ret = 0;
 finish:
-    for (ii = 0; ii < num_virt_fns; ii++)
+    for (ii = 0; ii < num_virt_fns; ii++) {
         VIR_FREE(vfname[ii]);
+        VIR_FREE(virt_fns[ii]);
+    }
     VIR_FREE(vfname);
+    VIR_FREE(virt_fns);
     return ret;
 }
 
diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c
index d53352f..a59012f 100644
--- a/src/util/virnetdev.c
+++ b/src/util/virnetdev.c
@@ -983,18 +983,19 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname,
 int
 virNetDevGetVirtualFunctions(const char *pfname,
                              char ***vfname,
+                             struct pci_config_address ***virt_fns,
                              unsigned int *n_vfname)
 {
     int ret = -1, i;
     char *pf_sysfs_device_link = NULL;
     char *pci_sysfs_device_link = NULL;
-    struct pci_config_address **virt_fns;
+    //struct pci_config_address **virt_fns;
     char *pciConfigAddr;
 
     if (virNetDevSysfsFile(&pf_sysfs_device_link, pfname, "device") < 0)
         return ret;
 
-    if (pciGetVirtualFunctions(pf_sysfs_device_link, &virt_fns,
+    if (pciGetVirtualFunctions(pf_sysfs_device_link, virt_fns,
                                n_vfname) < 0)
         goto cleanup;
 
@@ -1005,10 +1006,10 @@ virNetDevGetVirtualFunctions(const char *pfname,
 
     for (i = 0; i < *n_vfname; i++)
     {
-        if (pciGetDeviceAddrString(virt_fns[i]->domain,
-                                   virt_fns[i]->bus,
-                                   virt_fns[i]->slot,
-                                   virt_fns[i]->function,
+        if (pciGetDeviceAddrString((*virt_fns)[i]->domain,
+                                   (*virt_fns)[i]->bus,
+                                   (*virt_fns)[i]->slot,
+                                   (*virt_fns)[i]->function,
                                    &pciConfigAddr) < 0) {
             virReportSystemError(ENOSYS, "%s",
                                  _("Failed to get PCI Config Address String"));
@@ -1021,20 +1022,21 @@ virNetDevGetVirtualFunctions(const char *pfname,
         }
         if (pciDeviceNetName
[libvirt] [PATCH 6/6 v3] Forward Mode 'Hostdev' qemu driver implementation
Signed-off-by: Shradha Shah ss...@solarflare.com
---
 src/qemu/qemu_command.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 93c018d..0f6b714 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -5030,6 +5030,20 @@ qemuBuildCommandLine(virConnectPtr conn,
                  * code here that adds the newly minted hostdev to the
                  * hostdevs array).
                  */
+                if (qemuAssignDeviceHostdevAlias(def,
+                                                 virDomainNetGetActualHostdev(net),
+                                                 (def->nhostdevs-1)) < 0) {
+                    qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                    _("Could not assign alias to Net Hostdev"));
+                    goto error;
+                }
+
+                if (virDomainHostdevInsert(def,
+                                           virDomainNetGetActualHostdev(net)) < 0) {
+                    qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                    _("Hostdev not inserted into the array"));
+                    goto error;
+                }
                 continue;
             }
-- 
1.7.4.4
[libvirt] [PATCH 5/6 v3] Forward Mode Hostdev network driver Implementation
This patch updates the network driver to properly utilize the new attributes/elements that are now in virNetworkDef Signed-off-by: Shradha Shah ss...@solarflare.com --- docs/formatnetwork.html.in | 62 + src/network/bridge_driver.c | 213 --- 2 files changed, 240 insertions(+), 35 deletions(-) diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in index 7e8e991..96b9eb2 100644 --- a/docs/formatnetwork.html.in +++ b/docs/formatnetwork.html.in @@ -210,6 +210,37 @@ (usually either a domain start, or a hotplug interface attach to a domain).span class=sinceSince 0.9.4/span /dd + dtcodehostdev/code/dt + dd +This network facilitates PCI Passthrough of a network device. +A network device is chosen from the interface pool and +directly assigned to the guest using generic device +passthrough, after first optionally setting the device's MAC +address to the configured value, and associating the device with +an 802.1Qbh capable switch using an optionally specified +codelt;virtualportgt;/code element. +Note that - due to limitations in standard single-port PCI +ethernet card driver design - only SR-IOV (Single Root I/O +Virtualization) virtual function (VF) devices can be assigned +in this manner; to assign a standard single-port PCI or PCIe +ethernet card to a guest, use the traditional codelt; +hostdevgt;/code device definition and span class=since +Since 0.9.12/span + +pNote that this intelligent passthrough of network devices is +very similar to the functionality of a standard codelt; +hostdevgt;/code device, the difference being that this +method allows specifying a MAC address and codelt;virtualport +gt;/code for the passed-through device. 
If these capabilities +are not required, if you have a standard single-port PCI, PCIe, +or USB network card that doesn't support SR-IOV (and hence would +anyway lose the configured MAC address during reset after being +assigned to the guest domain), or if you are using a version of +libvirt older than 0.9.12, you should use standard +codelt;hostdevgt;/code to assign the device to the +guest instead of codelt;forward mode='hostdev'/gt;/code. +/p + /dd /dl As mentioned above, a codelt;forwardgt;/code element can have multiple codelt;interfacegt;/code subelements, each @@ -249,6 +280,37 @@ particular, 'passthrough' mode, and 'private' mode when using 802.1Qbh), libvirt will choose an unused physical interface or, if it can't find an unused interface, fail the operation./p + +span class=sincesince 0.9.12/span and when using forward mode +'hostdev' we specify the interface pool by using the +codelt;addressgt;/code element and codelt; +typegt;/code codelt;domaingt;/code codelt;busgt;/code +codelt;slotgt;/code and codelt;functiongt;/code +sub-elements. + +pre +... + lt;forward mode='hostdev' managed='yes'gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='1'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='2'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='3'/gt; + lt;/forwardgt; +... +/pre + +Alternatively the interface pool can also be mentioned using a +single physical function codelt;pfgt;/code subelement to +call out the corresponding physical interface associated with +multiple virtual interfaces (similar to the passthrough mode): + +pre +... + lt;forward mode='hostdev' managed='yes'gt; +lt;pf dev='eth0'/gt; + lt;/forwardgt; +... 
+/pre + /dd /dl h5a name=elementQoSQuality of service/a/h5 diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 2f8a937..c2c763a 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -1938,7 +1938,7 @@ networkStartNetworkExternal(struct network_driver *driver ATTRIBUTE_UNUSED, virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs to be done each time a network of - * type BRIDGE, PRIVATE, VEPA, or PASSTHROUGH is started. On + * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is started. On * failure, undo anything you've done, and return -1. On success * return 0. */ @@ -1949,7 +1949,7 @@ static int networkShutdownNetworkExternal(struct network_driver *driver ATTRIBUT virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs
[libvirt] Fwd: In Use tracker for network and pci-passthrough devices
This is a conversation that I started with Laine Stump for the
implementation of the in-use tracker for network and pci devices.

I want to make this conversation more public in order to receive everyone's
view on the topic. I will also post the responses I got from Laine and
Osier Yang.

Many Thanks,
Regards,
Shradha Shah

-------- Original Message --------
Subject: In Use tracker for network and pci-passthrough devices
Date: Tue, 26 Jun 2012 12:23:52 +0100
From: Shradha Shah ss...@solarflare.com
To: Laine Stump la...@laine.org

Laine,

I have submitted my v2 patches for forward mode='hostdev' and am planning
to work on the in-use tracker for network and pci-passthrough devices.

I am unable to wrap my head around how I should be implementing this
functionality. I am unable to decide at what level I should be implementing
this (network, domain or qemu).

May I ask for your guidance in order to implement this functionality?

-- 
Many Thanks,
Regards,
Shradha Shah
Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response
This is a reply I got from Laine Stump:

(NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about the
PCI passthrough device allocation tracking code. You should probably move
this discussion to the mailing list sooner rather than later though, as a
public discussion of the design will give you a better chance of your first
revision getting successfully past review :-))

On 06/26/2012 07:23 AM, Shradha Shah wrote:
> Laine,
>
> I have submitted my v2 patches for forward mode='hostdev' and am planning
> to work on the in-use tracker for network and pci-passthrough devices.
>
> I am unable to wrap my head around how I should be implementing this
> functionality. I am unable to decide at what level I should be
> implementing this (network, domain or qemu).
>
> May I ask for your guidance in order to implement this functionality?

Yes, but I'm currently on vacation (in Turkey) so I won't have much time to
respond until July 9 when I return.

In the meantime, I think the right way to do this is by integrating with
the code in the qemu driver that keeps track of which PCI devices are in
use. This already happens at the very basic level of "if the device
allocated by the network driver is in use, the attempt to assign the device
will fail"; instead, the network driver should be able to ask qemu if the
device it wants to allocate to the guest is already in use (and reserve it,
in one atomic operation).

Of course, once the network driver has reserved the device from qemu's PCI
passthrough code, it would return that device to the qemu driver code that
wants to attach the interface, and it would fail because it would be told
the device is already in use (well, yeah! *We* just marked it as in-use!).
To make that work, I guess some sort of cookie/handle/pointer would need to
be passed from qemu's pci passthrough code back to the network driver, and
the network driver would return it back to qemu's network interface attach
code, which would then use that special cookie/handle/pointer to attach the
device (saying "yeah, I know it's already in use, and here's my
pass-card").

(Talking about this makes me think that the code that keeps track of PCI
device allocation shouldn't really be a part of qemu, but should be a
separate module, so that the network driver can still function properly
even if the qemu driver isn't loaded.)

Another twist to this that should be considered - if any particular device
is in use by at least one guest for one of the macvtap modes, that device
also needs to be marked as in-use in libvirt's pci device table - it would
be disastrous if another guest decided to use that device for standard PCI
Passthrough.

(Keep in mind that I wrote everything above without even once looking at
the code or any other reference, so you should take it with a grain of
salt!)

Many Thanks,
Regards,
Shradha Shah

On 06/28/2012 11:19 AM, Shradha Shah wrote:
> This is a conversation that I started with Laine Stump for the
> implementation of the in-use tracker for network and pci devices.
>
> I want to make this conversation more public in order to receive
> everyone's view on the topic. I will also post the responses I got from
> Laine and Osier Yang.
>
> Many Thanks,
> Regards,
> Shradha Shah
>
> -------- Original Message --------
> Subject: In Use tracker for network and pci-passthrough devices
> Date: Tue, 26 Jun 2012 12:23:52 +0100
> From: Shradha Shah ss...@solarflare.com
> To: Laine Stump la...@laine.org
>
> Laine,
>
> I have submitted my v2 patches for forward mode='hostdev' and am planning
> to work on the in-use tracker for network and pci-passthrough devices.
>
> I am unable to wrap my head around how I should be implementing this
> functionality. I am unable to decide at what level I should be
> implementing this (network, domain or qemu).
>
> May I ask for your guidance in order to implement this functionality?
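For readers following the thread, Laine's "reserve it in one atomic operation, then present a pass-card" idea might look roughly like the sketch below. This is only an illustration under stated assumptions: every name in it (demoPCIAddr, demoReserve, demoAttach, demoRelease, the fixed two-entry table) is hypothetical and none of it is libvirt API. It uses a C11 compare-and-swap as a stand-in for whatever locking the real tracking code would use.

```c
/* Hypothetical sketch: none of these names exist in libvirt. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    unsigned int domain, bus, slot, function;
} demoPCIAddr;

typedef struct {
    demoPCIAddr addr;
    atomic_uint owner;   /* 0 = free, otherwise the reservation token */
} demoPCIEntry;

/* A tiny fixed device table standing in for the driver's PCI device list. */
#define DEMO_NDEVS 2
static demoPCIEntry demoTable[DEMO_NDEVS] = {
    { { 0, 4, 0, 1 }, 0 },
    { { 0, 4, 0, 2 }, 0 },
};

/* Token source; a real implementation would have to handle wraparound. */
static atomic_uint demoNextToken = 1;

static demoPCIEntry *demoFind(const demoPCIAddr *addr)
{
    assert(addr != NULL);
    for (size_t i = 0; i < DEMO_NDEVS; i++) {
        const demoPCIAddr *a = &demoTable[i].addr;
        if (a->domain == addr->domain && a->bus == addr->bus &&
            a->slot == addr->slot && a->function == addr->function)
            return &demoTable[i];
    }
    return NULL;
}

/* Check and reserve in one step: the compare-and-swap only succeeds if the
 * device is still free. Returns the token (the "pass-card"), or 0 if the
 * device is unknown or already in use. */
unsigned int demoReserve(const demoPCIAddr *addr)
{
    demoPCIEntry *e = demoFind(addr);
    unsigned int expected = 0;
    unsigned int token;

    if (!e)
        return 0;
    token = atomic_fetch_add(&demoNextToken, 1);
    if (!atomic_compare_exchange_strong(&e->owner, &expected, token))
        return 0;
    return token;
}

/* The attach path accepts an in-use device only from the token holder. */
int demoAttach(const demoPCIAddr *addr, unsigned int token)
{
    demoPCIEntry *e = demoFind(addr);

    if (!e || token == 0 || atomic_load(&e->owner) != token)
        return -1;
    return 0;
}

/* Release succeeds only for the current token holder. */
int demoRelease(const demoPCIAddr *addr, unsigned int token)
{
    demoPCIEntry *e = demoFind(addr);
    unsigned int expected = token;

    if (!e || token == 0)
        return -1;
    return atomic_compare_exchange_strong(&e->owner, &expected, 0) ? 0 : -1;
}
```

In this sketch the network driver would call demoReserve() when picking a device from its pool and hand the returned token to the attach code, which passes it to demoAttach(); a second caller asking for the same address gets 0 back and has to pick another device.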
Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response
This is a reply from Osier Yang:

On 2012-06-27 04:02, Laine Stump wrote:
> (NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about
> the PCI passthrough device allocation tracking code. You should probably
> move this discussion to the mailing list sooner rather than later though,
> as a public discussion of the design will give you a better chance of
> your first revision getting successfully past review :-))
>
> On 06/26/2012 07:23 AM, Shradha Shah wrote:
>> Laine,
>>
>> I have submitted my v2 patches for forward mode='hostdev' and am
>> planning to work on the in-use tracker for network and pci-passthrough
>> devices.
>>
>> I am unable to wrap my head around how I should be implementing this
>> functionality. I am unable to decide at what level I should be
>> implementing this (network, domain or qemu).
>>
>> May I ask for your guidance in order to implement this functionality?
>
> Yes, but I'm currently on vacation (in Turkey) so I won't have much time
> to respond until July 9 when I return.
>
> In the meantime, I think the right way to do this is by integrating with
> the code in the qemu driver that keeps track of which PCI devices are in
> use. This already happens at the very basic level of "if the device
> allocated by the network driver is in use, the attempt to assign the
> device will fail"; instead, the network driver should be able to ask qemu
> if the device it wants to allocate to the guest is already in use (and
> reserve it, in one atomic operation).

Hi, Shradha, Laine,

I have not read your patches for forward=hostdev carefully, so I'm not sure
if I can give the right direction, but let me try:

It looks like what you will do is just reserve the VF or PF from the host,
and when the VF/PF is attached to a domain or used in other ways, you want
it to be marked as in-use, am I correct?

If so, it should not be hard to do. For each PCI device, we have a field
named used_by, which stores the name of the domain that uses it, and in the
qemu driver we have two lists, activePciHostdevs and inactivePciHostdevs,
of pciDeviceList type. activePciHostdevs holds the PCI devices which are in
use by all the qemu domains, and inactivePciHostdevs holds the PCI devices
detached from the host and not used by any domain. Basically the purpose of
inactivePciHostdevs is to resolve the problem of PCI device resetting when
two PCI devices share the same bus. See commit 6be610bf for more details.

So that means updating the used_by field of the PCI device,
activePciHostdevs, and inactivePciHostdevs all happens while attaching the
interface to a domain, detaching it from the domain, when the domain is
starting, or when the domain is shut down. E.g., attaching the interface to
a domain (assuming the attachment succeeded) needs to:

1) Set used_by to the domain name.
2) Insert the device into the activePciHostdevs list.
3) Remove the device from the inactivePciHostdevs list if it was there.

The process of detaching is just the opposite of the above. However, the
whole process is much more complicated than the 3 listed steps.

I found you introduce new members for virNetworkForwardIfDef:

 struct _virNetworkForwardIfDef {
-    char *dev;      /* name of device */
+    int type;
+    union {
+        virDevicePCIAddress pci; /*PCI Address of device */
+        /* when USB devices are supported a new variable to be added here */
+        char *dev;      /* name of device */
+    }device;
+    int usageCount; /* how many guest interfaces are bound to this device? */
+};

So why not use pciDevice? E.g.

struct _virNetworkForwardIfDef {
    char *dev;      /* name of device */
    int type;
    union {
        pciDevice pci; /*PCI Address of device */
        /* when USB devices are supported a new variable to be added here */
        char *dev;      /* name of device */
    } device;
    int usageCount; /* how many guest interfaces are bound to this device? */
};

You can add usbDevice there once it's supported. That means you can reuse
the existing code for PCI and device management in the qemu driver.
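The three attach steps Osier lists (and their reverse on detach) can be sketched as follows. This is a simplified illustration, not the qemu driver's code: the demoPCIDevice and demoPCIDeviceList types and all function names are hypothetical stand-ins for pciDevice, pciDeviceList and the real list helpers, the lists are fixed-size arrays, and, as Osier notes, the real process involves considerably more than these steps.

```c
/* Hypothetical sketch: simplified stand-ins, not libvirt's pciDevice API. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX 16

typedef struct {
    unsigned int domain, bus, slot, function;
    const char *used_by;            /* name of the owning domain, or NULL */
} demoPCIDevice;

typedef struct {
    demoPCIDevice *devs[DEMO_MAX];
    size_t count;
} demoPCIDeviceList;

/* Append a device; fails when the fixed-size list is full. */
int demoListAdd(demoPCIDeviceList *list, demoPCIDevice *dev)
{
    if (list->count == DEMO_MAX)
        return -1;
    list->devs[list->count++] = dev;
    return 0;
}

/* Return the index of dev in the list, or -1 if it is not there. */
int demoListFind(const demoPCIDeviceList *list, const demoPCIDevice *dev)
{
    for (size_t i = 0; i < list->count; i++) {
        if (list->devs[i] == dev)
            return (int)i;
    }
    return -1;
}

/* Remove dev from the list if present; a no-op otherwise. */
void demoListSteal(demoPCIDeviceList *list, const demoPCIDevice *dev)
{
    int idx = demoListFind(list, dev);

    if (idx < 0)
        return;
    memmove(&list->devs[idx], &list->devs[idx + 1],
            (list->count - (size_t)idx - 1) * sizeof(list->devs[0]));
    list->count--;
}

/* Attach: refuse a device that is already in use, then
 * 1) set used_by, 2) insert into active, 3) remove from inactive. */
int demoAttachDevice(demoPCIDeviceList *active, demoPCIDeviceList *inactive,
                     demoPCIDevice *dev, const char *domname)
{
    if (dev->used_by != NULL || demoListFind(active, dev) >= 0)
        return -1;
    if (demoListAdd(active, dev) < 0)
        return -1;
    dev->used_by = domname;
    demoListSteal(inactive, dev);
    return 0;
}

/* Detach: the reverse of the three steps above. */
int demoDetachDevice(demoPCIDeviceList *active, demoPCIDeviceList *inactive,
                     demoPCIDevice *dev)
{
    if (demoListFind(active, dev) < 0)
        return -1;
    if (demoListAdd(inactive, dev) < 0)
        return -1;
    demoListSteal(active, dev);
    dev->used_by = NULL;
    return 0;
}
```

With bookkeeping of this shape, a network driver that consulted the same lists before handing out a VF would see used_by set and could skip that device instead of returning it blindly.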
Of course, once the network driver has reserved the device from qemu's PCI passthrough code, it would return that device to the qemu driver code that wants to attach the interface, and it would fail because it would be told the device is already in use (well, yeah! *We* just marked it as in-use!). To make that work, I guess some sort of cookie/handle/pointer would need to be passed from qemu's pci passthrough code back to the network driver, and the network driver would return it back to qemu's network interface attach code, which would then use that special cookie/handle/pointer to attach the device (saying yeah, I know it's already in use, and here's my pass-card). (Talking about this makes me think that the code that keeps track of PCI device allocation shouldn't really be a part of qemu, but should be a separate module, so that the network driver can still
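The attach bookkeeping listed in this message (set used_by, add the device to the active list, drop it from the inactive list) could be sketched roughly as below. This is only an illustration under stated assumptions: `Dev`, `DevList`, `mark_attached` and `mark_detached` are hypothetical names, not libvirt's pciDeviceList API, and the fixed-size arrays stand in for the real list type. It also ignores locking and the reset handling that makes the real process more complicated.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DEVS 16

/* Hypothetical stand-in for a tracked PCI device. */
typedef struct {
    char addr[16];        /* e.g. "0000:04:00.1" */
    const char *used_by;  /* domain name, or NULL when unused */
} Dev;

/* Hypothetical stand-in for the active/inactive device lists. */
typedef struct {
    Dev *items[MAX_DEVS];
    size_t count;
} DevList;

/* Returns the index of dev in list, or -1 if it is not present. */
static int dev_list_find(const DevList *list, const Dev *dev)
{
    for (size_t i = 0; i < list->count; i++)
        if (list->items[i] == dev)
            return (int)i;
    return -1;
}

/* Adds dev unless the list is full or already holds it. 0 on success. */
static int dev_list_add(DevList *list, Dev *dev)
{
    if (list->count == MAX_DEVS || dev_list_find(list, dev) >= 0)
        return -1;
    list->items[list->count++] = dev;
    return 0;
}

/* Removes dev if present; absence is not an error. */
static void dev_list_remove(DevList *list, const Dev *dev)
{
    int idx = dev_list_find(list, dev);
    if (idx < 0)
        return;
    list->items[idx] = list->items[--list->count];
}

/* Attach: set used_by, add to active, remove from inactive. */
static int mark_attached(Dev *dev, const char *domain,
                         DevList *active, DevList *inactive)
{
    assert(dev != NULL && domain != NULL);
    if (dev->used_by != NULL)          /* already owned by some domain */
        return -1;
    if (dev_list_add(active, dev) < 0)
        return -1;
    dev->used_by = domain;
    dev_list_remove(inactive, dev);
    return 0;
}

/* Detach is the reverse: clear used_by and move back to inactive. */
static int mark_detached(Dev *dev, DevList *active, DevList *inactive)
{
    if (dev->used_by == NULL)
        return -1;
    dev_list_remove(active, dev);
    dev->used_by = NULL;
    return dev_list_add(inactive, dev);
}
```

A second `mark_attached` call on the same device fails because `used_by` is already set, which is the in-use check the thread is asking for.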
Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response
On 06/27/2012 09:03 AM, Osier Yang wrote: On 2012年06月27日 04:02, Laine Stump wrote: (NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about the PCI passthrough device allocation tracking code. You should probably move this discussion to the mailing list sooner rather than later though, as a public discussion of the design will give you a better chance of your first revision getting successfully past review :-)) On 06/26/2012 07:23 AM, Shradha Shah wrote: Laine, I have submitted my v2 patches for forward mode='hostdev' and am planning to work on the in-use tracker for network and pci-passthrough devices. I am unable to wrap my head around how I should be implementing this functionality. I am unable to decide at what level I should be implementing this (network, domain or qemu). May I ask for your guidance in order to implement this functionality? Yes, but I'm currently on vacation (in Turkey) so I won't have much time to respond until July 9 when I return. In the meantime, I think the right way to do this is by integrating with the code in the qemu driver that keeps track of which PCI devices are in use. This already happens at the very basic level of if the device allocated by the network driver is in use, the attempt to assign the device will fail; instead, the network driver should be able to ask qemu if the device it wants to allocate to the guest is already in use (and reserve it, in one atomic operation). Hi, Shradha, Laine, I have not read your patches for forward=hostdev carefully, so not sure if I can give right direction, but let me try: It looks like what you will do is just reserve the vf or pf from host, and when the vf/pf is attached to domain or used in other ways, you want it to be marked as in-use, am I correct? Correct. Currently the network driver picks a device from its pool and returns it to qemu having no idea if maybe that device is already used in some other way. 
By the time we get back to qemu and learn that the device is already used, the best we can do is fail, which is less than ideal :-) If so, it should be not hard to do, for each PCI device, we have a field named used_by, to stores the domain name which uses it, and in qemu driver, we have two list activePciHostdevs, inactivePciHostdevs of pciDeviceList type. activePciHostdevs holds the PCI devices which are in used by all the qemu domains, and inactivePciHostdevs holds the PCI devices detached from the host, and not used by any domain. Basicly the purpose of inactivePciHostdevs is to resolve the problem of pci device resetting on two PCI devices share the same bus. See commit 6be610bf for more details. So that means, updating the used_by field of the pci device, activePciHostdevs, and inactivePciHostdevs all happens while attaching the interface to domain, or detaching it from the domain, or when domain starting, or when the domain is shutdown. E.g, attaching the interface to domain (assuming the attachment succeeded), it needs to do: 1) Set used_by as the domain name 2) Insert the device to activePciHostdevs list. 3) Remove the device from inactivePciHostdevs list if it was there. The trick is to do enough of that in networkAllocateActualDevice to assure that 1) the device won't be used by someone else, 2) the guest that's grabbing the device *can* use it, and 3) the right thing will happen if libvirtd is restarted sometime after the device is reserved but before the guest is started. Porcess of detaching is just opposite with above. However, the whole process is much more complicated than the 3 listed steps. 
I found you introduce new members for virNetworkForwardIfDef: struct _virNetworkForwardIfDef { -char *dev; /* name of device */ +int type; +union { +virDevicePCIAddress pci; /*PCI Address of device */ +/* when USB devices are supported a new variable to be added here */ +char *dev; /* name of device */ +}device; +int usageCount; /* how many guest interfaces are bound to this device? */ +}; So why don't use pciDevice. e.g. In general I think it would be a good idea to unify pciDevice, virDevicePCIAddress, and pci_config_address as much as possible, but pciDevice itself has a lot of fields that don't make sense in a configuration object, and anyway currently all the other conf code (including hostdev definitions) uses virDevicePCIAddress, and there is already code to parse/format to/from a virDevicePCIAddress. As a matter of fact, pciDevice is defined in pci.c, so it can't be used anywhere else, and the API presented by pci.h uses individual components (domain, bus, slot, function) when it needs to describe a PCI device. So for now at least, I think virNetworkForwardIfDef should use virDevicePCIAddress, like the other *_conf code; when the network driver needs to call the APIs
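The "ask whether the device is in use and reserve it, in one atomic operation" requirement discussed in this message can be illustrated with a compare-and-swap on an owner field. This is a sketch, not how libvirt does it: a real implementation would more likely hold the driver lock around the check and the update, and C11 atomics are swapped in here only to keep the example self-contained. `TrackedDev`, `dev_try_reserve`, `dev_release` and `pool_reserve_any` are hypothetical names, and the sketch does not address what happens when libvirtd restarts between reservation and guest start.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical tracker entry: owner is 0 when free, otherwise a guest id. */
typedef struct {
    atomic_int owner;
} TrackedDev;

/* Check-and-reserve in one step: succeeds only if the device was free. */
static bool dev_try_reserve(TrackedDev *dev, int guest_id)
{
    int expected = 0;  /* 0 means "not in use" */
    assert(guest_id != 0);
    return atomic_compare_exchange_strong(&dev->owner, &expected, guest_id);
}

/* Release only if guest_id is the current owner. */
static bool dev_release(TrackedDev *dev, int guest_id)
{
    int expected = guest_id;
    return atomic_compare_exchange_strong(&dev->owner, &expected, 0);
}

/* Walk a pool and reserve the first free device; returns its index or -1. */
static int pool_reserve_any(TrackedDev *pool, int n, int guest_id)
{
    for (int i = 0; i < n; i++)
        if (dev_try_reserve(&pool[i], guest_id))
            return i;
    return -1;
}
```

Because the check and the update are a single operation, the pool walk can skip a device that is already taken and move on to the next one, instead of handing a busy device back to the caller and failing later.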
Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response
On 06/28/2012 11:33 AM, Shradha Shah wrote: This is a reply I got from Laine Stump = (NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about the PCI passthrough device allocation tracking code. You should probably move this discussion to the mailing list sooner rather than later though, as a public discussion of the design will give you a better chance of your first revision getting successfully past review :-)) On 06/26/2012 07:23 AM, Shradha Shah wrote: Laine, I have submitted my v2 patches for forward mode='hostdev' and am planning to work on the in-use tracker for network and pci-passthrough devices. I am unable to wrap my head around how I should be implementing this functionality. I am unable to decide at what level I should be implementing this (network, domain or qemu). May I ask for your guidance in order to implement this functionality? Yes, but I'm currently on vacation (in Turkey) so I won't have much time to respond until July 9 when I return. In the meantime, I think the right way to do this is by integrating with the code in the qemu driver that keeps track of which PCI devices are in use. This already happens at the very basic level of if the device allocated by the network driver is in use, the attempt to assign the device will fail; instead, the network driver should be able to ask qemu if the device it wants to allocate to the guest is already in use (and reserve it, in one atomic operation). Of course, once the network driver has reserved the device from qemu's PCI passthrough code, it would return that device to the qemu driver code that wants to attach the interface, and it would fail because it would be told the device is already in use (well, yeah! *We* just marked it as in-use!). 
To make that work, I guess some sort of cookie/handle/pointer would need to be passed from qemu's pci passthrough code back to the network driver, and the network driver would return it back to qemu's network interface attach code, which would then use that special cookie/handle/pointer to attach the device (saying yeah, I know it's already in use, and here's my pass-card). Wouldn't this approach require network driver to call functions from the qemu driver? I think this is not good for the hierarchical structure we are trying to maintain. (Talking about this makes me think that the code that keeps track of PCI device allocation shouldn't really be a part of qemu, but should be a separate module, so that the network driver can still function properly even if the qemu driver isn't loaded.) Would this mean moving code to a new driver called device_driver.c or devicetracker_driver.c (which consumes device_conf.ch) and is called by network, domain and qemu drivers? If so, I like this approach. Another twist to this that should be considered - if any particular device is in use by at least one guest for one of the macvtap modes, that device also needs to be marked as in-use in libvirt's pci device table - it would be disastrous if another guest decided to use that device for standard PCI Passthrough. Agreed. (Keep in mind that I wrote everything above without even once looking at the code or any other reference, so you should take it with a grain of salt!) Many Thanks, Regards, Shradha Shah On 06/28/2012 11:19 AM, Shradha Shah wrote: This is a conversation that I started with Laine Stump for the implementation of the in-use tracker for network and pci devices. I want to make this conversation more public in order to receive everyone's view on the topic. I will also post the responses I got from Laine and Osier Yang. 
Many Thanks, Regards, Shradha Shah Original Message Subject: In Use tracker for network and pci-passthrough devices Date: Tue, 26 Jun 2012 12:23:52 +0100 From: Shradha Shah ss...@solarflare.com To: Laine Stump la...@laine.org Laine, I have submitted my v2 patches for forward mode='hostdev' and am planning to work on the in-use tracker for network and pci-passthrough devices. I am unable to wrap my head around how I should be implementing this functionality. I am unable to decide at what level I should be implementing this (network, domain or qemu). May I ask for your guidance in order to implement this functionality?
Re: [libvirt] Fwd: In Use tracker for network and pci-passthrough devices: Laine response
Osier, Many thanks for your input. Comments inline. On 06/28/2012 11:48 AM, Shradha Shah wrote: This is a reply from Osier Yang On 2012年06月27日 04:02, Laine Stump wrote: (NB: I'm Cc'ing Osier on this email, as he's quite knowledgeable about the PCI passthrough device allocation tracking code. You should probably move this discussion to the mailing list sooner rather than later though, as a public discussion of the design will give you a better chance of your first revision getting successfully past review :-)) On 06/26/2012 07:23 AM, Shradha Shah wrote: Laine, I have submitted my v2 patches for forward mode='hostdev' and am planning to work on the in-use tracker for network and pci-passthrough devices. I am unable to wrap my head around how I should be implementing this functionality. I am unable to decide at what level I should be implementing this (network, domain or qemu). May I ask for your guidance in order to implement this functionality? Yes, but I'm currently on vacation (in Turkey) so I won't have much time to respond until July 9 when I return. In the meantime, I think the right way to do this is by integrating with the code in the qemu driver that keeps track of which PCI devices are in use. This already happens at the very basic level of if the device allocated by the network driver is in use, the attempt to assign the device will fail; instead, the network driver should be able to ask qemu if the device it wants to allocate to the guest is already in use (and reserve it, in one atomic operation). Hi, Shradha, Laine, I have not read your patches for forward=hostdev carefully, so not sure if I can give right direction, but let me try: It looks like what you will do is just reserve the vf or pf from host, and when the vf/pf is attached to domain or used in other ways, you want it to be marked as in-use, am I correct? 
If so, it should be not hard to do, for each PCI device, we have a field named used_by, to stores the domain name which uses it, and in qemu driver, we have two list activePciHostdevs, inactivePciHostdevs of pciDeviceList type. activePciHostdevs holds the PCI devices which are in used by all the qemu domains, and inactivePciHostdevs holds the PCI devices detached from the host, and not used by any domain. Basicly the purpose of inactivePciHostdevs is to resolve the problem of pci device resetting on two PCI devices share the same bus. See commit 6be610bf for more details. So that means, updating the used_by field of the pci device, activePciHostdevs, and inactivePciHostdevs all happens while attaching the interface to domain, or detaching it from the domain, or when domain starting, or when the domain is shutdown. E.g, attaching the interface to domain (assuming the attachment succeeded), it needs to do: 1) Set used_by as the domain name 2) Insert the device to activePciHostdevs list. 3) Remove the device from inactivePciHostdevs list if it was there. Porcess of detaching is just opposite with above. However, the whole process is much more complicated than the 3 listed steps. This approach is easier to implement but this would mean that we have to access the qemu driver from the network driver since we need to make a decision about device usage in networkAllocateActualDevice. This messes with the hierarchy I think. I found you introduce new members for virNetworkForwardIfDef: struct _virNetworkForwardIfDef { -char *dev; /* name of device */ +int type; +union { +virDevicePCIAddress pci; /*PCI Address of device */ +/* when USB devices are supported a new variable to be added here */ +char *dev; /* name of device */ +}device; +int usageCount; /* how many guest interfaces are bound to this device? */ +}; So why don't use pciDevice. e.g. 
struct _virNetworkForwardIfDef { char *dev; /* name of device */ int type; union { pciDevice pci; /*PCI Address of device */ /* when USB devices are supported a new variable to be added here */ char *dev; /* name of device */ } device; int usageCount; /* how many guest interfaces are bound to this device? */ }; You can add usbDevice there once it's supported. That means you can reuse the existed codes for pci and devices management of qemu driver. I was thinking of having a new driver called devicetracker_driver.c that consumes device_conf.ch and is used in domain, network and qemu drivers. Of course, once the network driver has reserved the device from qemu's PCI passthrough code, it would return that device to the qemu driver code that wants to attach the interface, and it would fail because it would be told the device is already in use (well, yeah! *We* just marked it as in-use!). To make that work, I
[libvirt] [PATCH 0/6 v2] Support forward mode='hostdev' and interface pools
This patch series supports the forward mode='hostdev'. The functionality of this mode is the same as interface type='hostdev' but with the added benefit of using interface pools. The patch series also contains a patch to support use of interface names and PCI device addresses interchangeably in a network xml, and return the appropriate one in actualDevice when networkAllocateActualDevice is called. At the top level managed attribute can be specified with identical results as when it's specified for a hostdev. Currently forward mode='hostdev' does not support USB devices. Shradha Shah (6): Prerequisite Patch. virDomainDevicePCIAddress and respective functions moved to a new file called conf/device_conf.ch Moved the code to create implicit interface pool from PF to a new function RNG updates, new xml parser/formatter code to support forward mode=hostdev Code to return interface name or pci_addr of the VF in actualDevice Forward Mode Hostdev network driver Implementation Forward Mode 'Hostdev' qemu driver implementation docs/formatnetwork.html.in | 62 ++ docs/schemas/network.rng | 82 - src/Makefile.am|7 +- src/conf/device_conf.c | 135 + src/conf/device_conf.h | 65 +++ src/conf/domain_conf.c | 114 ++-- src/conf/domain_conf.h | 25 +--- src/conf/network_conf.c| 126 +++-- src/conf/network_conf.h| 29 +++- src/libvirt_private.syms | 10 +- src/network/bridge_driver.c| 322 +--- src/qemu/qemu_command.c| 27 ++- src/qemu/qemu_hotplug.c|7 +- src/qemu/qemu_monitor.c| 14 +- src/qemu/qemu_monitor.h| 17 +- src/qemu/qemu_monitor_json.c | 14 +- src/qemu/qemu_monitor_json.h | 14 +- src/qemu/qemu_monitor_text.c | 16 +- src/qemu/qemu_monitor_text.h | 14 +- src/util/virnetdev.c | 29 ++-- src/util/virnetdev.h |4 +- src/xen/xend_internal.c|3 +- tests/networkxml2xmlin/hostdev-pf.xml | 11 + tests/networkxml2xmlin/hostdev.xml | 10 + tests/networkxml2xmlout/hostdev-pf.xml |7 + tests/networkxml2xmlout/hostdev.xml| 10 + tests/networkxml2xmltest.c |2 + 27 files changed, 889 insertions(+), 287 
deletions(-) create mode 100644 src/conf/device_conf.c create mode 100644 src/conf/device_conf.h create mode 100644 tests/networkxml2xmlin/hostdev-pf.xml create mode 100644 tests/networkxml2xmlin/hostdev.xml create mode 100644 tests/networkxml2xmlout/hostdev-pf.xml create mode 100644 tests/networkxml2xmlout/hostdev.xml -- 1.7.4.4
[libvirt] [PATCH 1/6 v2] Prerequisite Patch. virDomainDevicePCIAddress and respective functions moved to a new file called conf/device_conf.ch
Refactoring existing code without causing any functional changes to prepare for new code. This patch makes the code reusable. Signed-off-by: Shradha Shah ss...@solarflare.com --- src/Makefile.am |7 ++- src/conf/device_conf.c | 135 ++ src/conf/device_conf.h | 65 src/conf/domain_conf.c | 114 --- src/conf/domain_conf.h | 25 +--- src/libvirt_private.syms | 10 ++- src/qemu/qemu_command.c | 13 ++-- src/qemu/qemu_hotplug.c |7 +- src/qemu/qemu_monitor.c | 14 ++-- src/qemu/qemu_monitor.h | 17 +++--- src/qemu/qemu_monitor_json.c | 14 ++-- src/qemu/qemu_monitor_json.h | 14 ++-- src/qemu/qemu_monitor_text.c | 16 +++--- src/qemu/qemu_monitor_text.h | 14 ++-- src/xen/xend_internal.c |3 +- 15 files changed, 288 insertions(+), 180 deletions(-) diff --git a/src/Makefile.am b/src/Makefile.am index e40909b..009c4e5 100644 --- a/src/Makefile.am +++ b/src/Makefile.am @@ -199,6 +199,9 @@ CONSOLE_CONF_SOURCES = \ DOMAIN_LIST_SOURCES = \ conf/virdomainlist.c conf/virdomainlist.h +DEVICE_CONF_SOURCES = \ + conf/device_conf.c conf/device_conf.h + CONF_SOURCES = \ $(NETDEV_CONF_SOURCES) \ $(DOMAIN_CONF_SOURCES) \ @@ -212,7 +215,8 @@ CONF_SOURCES = \ $(SECRET_CONF_SOURCES) \ $(CPU_CONF_SOURCES) \ $(CONSOLE_CONF_SOURCES) \ - $(DOMAIN_LIST_SOURCES) + $(DOMAIN_LIST_SOURCES) \ + $(DEVICE_CONF_SOURCES) # The remote RPC driver, covering domains, storage, networks, etc REMOTE_DRIVER_GENERATED = \ @@ -1525,6 +1529,7 @@ libvirt_lxc_SOURCES = \ $(ENCRYPTION_CONF_SOURCES) \ $(NETDEV_CONF_SOURCES) \ $(DOMAIN_CONF_SOURCES) \ + $(DEVICE_CONF_SOURCES) \ $(SECRET_CONF_SOURCES) \ $(CPU_CONF_SOURCES) \ $(SECURITY_DRIVER_SOURCES) \ diff --git a/src/conf/device_conf.c b/src/conf/device_conf.c new file mode 100644 index 000..af21aad --- /dev/null +++ b/src/conf/device_conf.c @@ -0,0 +1,135 @@ +/* + * device_conf.h: device XML handling + * + * Copyright (C) 2006-2012 Red Hat, Inc. 
+ * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library; if not, write to the Free Software + * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA + * + * Author: Shradha Shah ss...@solarflare.com + */ + +#include config.h +#include virterror_internal.h +#include datatypes.h +#include memory.h +#include xml.h +#include uuid.h +#include util.h +#include buf.h +#include conf/device_conf.h + +#define VIR_FROM_THIS VIR_FROM_DEVICE + +#define virDeviceReportError(code, ...) \ +virReportErrorHelper(VIR_FROM_DOMAIN, code, __FILE__,\ + __FUNCTION__, __LINE__, __VA_ARGS__) + +VIR_ENUM_IMPL(virDeviceAddressPciMulti, + VIR_DEVICE_ADDRESS_PCI_MULTI_LAST, + default, + on, + off) + +int virDevicePCIAddressIsValid(virDevicePCIAddressPtr addr) +{ +/* PCI bus has 32 slots and 8 functions per slot */ +if (addr-slot = 32 || addr-function = 8) +return 0; +return addr-domain || addr-bus || addr-slot; +} + + +int +virDevicePCIAddressParseXML(xmlNodePtr node, +virDevicePCIAddressPtr addr) +{ +char *domain, *slot, *bus, *function, *multi; +int ret = -1; + +memset(addr, 0, sizeof(*addr)); + +domain = virXMLPropString(node, domain); +bus = virXMLPropString(node, bus); +slot = virXMLPropString(node, slot); +function = virXMLPropString(node
[libvirt] [PATCH 3/6 v2] RNG updates, new xml parser/formatter code to support forward mode=hostdev
This patch introduces the new forward mode='hostdev' along with attribute managed Includes updates to the network RNG and new xml parser/formatter code. Signed-off-by: Shradha Shah ss...@solarflare.com --- docs/schemas/network.rng | 82 +++-- src/conf/network_conf.c| 126 src/conf/network_conf.h| 29 +++- src/network/bridge_driver.c| 18 ++-- tests/networkxml2xmlin/hostdev-pf.xml | 11 +++ tests/networkxml2xmlin/hostdev.xml | 10 +++ tests/networkxml2xmlout/hostdev-pf.xml |7 ++ tests/networkxml2xmlout/hostdev.xml| 10 +++ tests/networkxml2xmltest.c |2 + 9 files changed, 262 insertions(+), 33 deletions(-) diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng index 2ae879e..d1297cd 100644 --- a/docs/schemas/network.rng +++ b/docs/schemas/network.rng @@ -82,17 +82,41 @@ valuepassthrough/value valueprivate/value valuevepa/value + valuehostdev/value +/choice + /attribute +/optional + +optional + attribute name=managed +choice + valueyes/value + valueno/value /choice /attribute /optional interleave - zeroOrMore -element name='interface' - attribute name='dev' -ref name='deviceName'/ - /attribute -/element - /zeroOrMore + choice +group + zeroOrMore +element name='interface' + attribute name='dev' +ref name='deviceName'/ + /attribute +/element + /zeroOrMore +/group +group + zeroOrMore +element name='address' + attribute name='type' +valuepci/value + /attribute + ref name=pciaddress/ +/element + /zeroOrMore +/group + /choice optional element name='pf' attribute name='dev' @@ -238,4 +262,48 @@ /interleave /element /define + define name=pciaddress +optional + attribute name=domain +ref name=pciDomain/ + /attribute +/optional +attribute name=bus + ref name=pciBus/ +/attribute +attribute name=slot + ref name=pciSlot/ +/attribute +attribute name=function + ref name=pciFunc/ +/attribute +optional + attribute name=multifunction +choice + valueon/value + valueoff/value +/choice + /attribute +/optional + /define + define name=pciDomain +data type=string + param 
name=pattern(0x)?[0-9a-fA-F]{1,4}/param +/data + /define + define name=pciBus +data type=string + param name=pattern(0x)?[0-9a-fA-F]{1,2}/param +/data + /define + define name=pciSlot +data type=string + param name=pattern(0x)?[0-1]?[0-9a-fA-F]/param +/data + /define + define name=pciFunc +data type=string + param name=pattern(0x)?[0-7]/param +/data + /define /grammar diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 60cd888..a9aa330 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -48,10 +48,14 @@ #define VIR_FROM_THIS VIR_FROM_NETWORK VIR_ENUM_DECL(virNetworkForward) - VIR_ENUM_IMPL(virNetworkForward, VIR_NETWORK_FORWARD_LAST, - none, nat, route, bridge, private, vepa, passthrough ) + none, nat, route, bridge, private, vepa, passthrough, hostdev) + +VIR_ENUM_DECL(virNetworkForwardHostdevDevice) +VIR_ENUM_IMPL(virNetworkForwardHostdevDevice, + VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_LAST, + none, pci) #define virNetworkReportError(code, ...)\ virReportErrorHelper(VIR_FROM_NETWORK, code, __FILE__, \ @@ -100,6 +104,12 @@ virPortGroupDefClear(virPortGroupDefPtr def) static void virNetworkForwardIfDefClear(virNetworkForwardIfDefPtr def) { +VIR_FREE(def-device.dev); +} + +static void +virNetworkForwardPfDefClear(virNetworkForwardPfDefPtr def) +{ VIR_FREE(def-dev); } @@ -163,12 +173,13 @@ void virNetworkDefFree(virNetworkDefPtr def) VIR_FREE(def-domain); for (ii = 0 ; ii def-nForwardPfs def-forwardPfs ; ii++) { -virNetworkForwardIfDefClear(def-forwardPfs[ii]); +virNetworkForwardPfDefClear(def-forwardPfs[ii]); } VIR_FREE(def-forwardPfs); for (ii = 0 ; ii def-nForwardIfs def-forwardIfs ; ii
[libvirt] [PATCH 5/6 v2] Forward Mode Hostdev network driver Implementation
This patch updates the network driver to properly utilize the new attributes/elements that are now in virNetworkDef Signed-off-by: Shradha Shah ss...@solarflare.com --- docs/formatnetwork.html.in | 62 + src/network/bridge_driver.c | 213 --- 2 files changed, 240 insertions(+), 35 deletions(-) diff --git a/docs/formatnetwork.html.in b/docs/formatnetwork.html.in index 7e8e991..96b9eb2 100644 --- a/docs/formatnetwork.html.in +++ b/docs/formatnetwork.html.in @@ -210,6 +210,37 @@ (usually either a domain start, or a hotplug interface attach to a domain).span class=sinceSince 0.9.4/span /dd + dtcodehostdev/code/dt + dd +This network facilitates PCI Passthrough of a network device. +A network device is chosen from the interface pool and +directly assigned to the guest using generic device +passthrough, after first optionally setting the device's MAC +address to the configured value, and associating the device with +an 802.1Qbh capable switch using an optionally specified +codelt;virtualportgt;/code element. +Note that - due to limitations in standard single-port PCI +ethernet card driver design - only SR-IOV (Single Root I/O +Virtualization) virtual function (VF) devices can be assigned +in this manner; to assign a standard single-port PCI or PCIe +ethernet card to a guest, use the traditional codelt; +hostdevgt;/code device definition and span class=since +Since 0.9.12/span + +pNote that this intelligent passthrough of network devices is +very similar to the functionality of a standard codelt; +hostdevgt;/code device, the difference being that this +method allows specifying a MAC address and codelt;virtualport +gt;/code for the passed-through device. 
If these capabilities +are not required, if you have a standard single-port PCI, PCIe, +or USB network card that doesn't support SR-IOV (and hence would +anyway lose the configured MAC address during reset after being +assigned to the guest domain), or if you are using a version of +libvirt older than 0.9.12, you should use standard +codelt;hostdevgt;/code to assign the device to the +guest instead of codelt;forward mode='hostdev'/gt;/code. +/p + /dd /dl As mentioned above, a codelt;forwardgt;/code element can have multiple codelt;interfacegt;/code subelements, each @@ -249,6 +280,37 @@ particular, 'passthrough' mode, and 'private' mode when using 802.1Qbh), libvirt will choose an unused physical interface or, if it can't find an unused interface, fail the operation./p + +span class=sincesince 0.9.12/span and when using forward mode +'hostdev' we specify the interface pool by using the +codelt;addressgt;/code element and codelt; +typegt;/code codelt;domaingt;/code codelt;busgt;/code +codelt;slotgt;/code and codelt;functiongt;/code +sub-elements. + +pre +... + lt;forward mode='hostdev' managed='yes'gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='1'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='2'/gt; +lt;address type='pci' domain='0' bus='4' slot='0' function='3'/gt; + lt;/forwardgt; +... +/pre + +Alternatively the interface pool can also be mentioned using a +single physical function codelt;pfgt;/code subelement to +call out the corresponding physical interface associated with +multiple virtual interfaces (similar to the passthrough mode): + +pre +... + lt;forward mode='hostdev' managed='yes'gt; +lt;pf dev='eth0'/gt; + lt;/forwardgt; +... 
+/pre + /dd /dl h5a name=elementQoSQuality of service/a/h5 diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 6ce41b5..35636a8 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -1939,7 +1939,7 @@ networkStartNetworkExternal(struct network_driver *driver ATTRIBUTE_UNUSED, virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs to be done each time a network of - * type BRIDGE, PRIVATE, VEPA, or PASSTHROUGH is started. On + * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is started. On * failure, undo anything you've done, and return -1. On success * return 0. */ @@ -1950,7 +1950,7 @@ static int networkShutdownNetworkExternal(struct network_driver *driver ATTRIBUT virNetworkObjPtr network ATTRIBUTE_UNUSED) { /* put anything here that needs
[libvirt] [PATCH 4/6 v2] Code to return interface name or pci_addr of the VF in actualDevice
The network pool should be able to keep track of both, network device names nad PCI addresses, and return the appropriate one in the actualDevice when networkAllocateActualDevice is called. Signed-off-by: Shradha Shah ss...@solarflare.com --- src/network/bridge_driver.c | 33 +++-- src/util/virnetdev.c| 29 - src/util/virnetdev.h|4 +++- 3 files changed, 46 insertions(+), 20 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 630a655..6ce41b5 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -60,6 +60,7 @@ #include configmake.h #include ignore-value.h #include virnetdev.h +#include pci.h #include virnetdevbridge.h #include virnetdevtap.h @@ -2740,10 +2741,11 @@ static int networkCreateInterfacePool(virNetworkDefPtr netdef) { unsigned int num_virt_fns = 0; char **vfname = NULL; +struct pci_config_address **virt_fns; int ret = -1, ii = 0; if ((virNetDevGetVirtualFunctions(netdef-forwardPfs-dev, - vfname, num_virt_fns)) 0) { + vfname, virt_fns, num_virt_fns)) 0) { networkReportError(VIR_ERR_INTERNAL_ERROR, _(Could not get Virtual functions on %s), netdef-forwardPfs-dev); @@ -2765,19 +2767,38 @@ networkCreateInterfacePool(virNetworkDefPtr netdef) { netdef-nForwardIfs = num_virt_fns; for (ii = 0; ii netdef-nForwardIfs; ii++) { -netdef-forwardIfs[ii].device.dev = strdup(vfname[ii]); -if (!netdef-forwardIfs[ii].device.dev) { -virReportOOMError(); -goto finish; +if (netdef-forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) { +if(vfname[ii]) { +netdef-forwardIfs[ii].device.dev = strdup(vfname[ii]); +if (!netdef-forwardIfs[ii].device.dev) { +virReportOOMError(); +goto finish; +} +} +else { +networkReportError(VIR_ERR_INTERNAL_ERROR, + _(Passthrough mode requires interface names)); +goto finish; +} +} +else if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV) { +netdef-forwardIfs[ii].type = VIR_NETWORK_FORWARD_HOSTDEV_DEVICE_PCI; /*Assuming PCI as VF's are PCI devices */ +netdef-forwardIfs[ii].device.pci.domain = 
virt_fns[ii]-domain; +netdef-forwardIfs[ii].device.pci.bus = virt_fns[ii]-bus; +netdef-forwardIfs[ii].device.pci.slot = virt_fns[ii]-slot; +netdef-forwardIfs[ii].device.pci.function = virt_fns[ii]-function; } netdef-forwardIfs[ii].usageCount = 0; } ret = 0; finish: -for (ii = 0; ii num_virt_fns; ii++) +for (ii = 0; ii num_virt_fns; ii++) { VIR_FREE(vfname[ii]); +VIR_FREE(virt_fns[ii]); +} VIR_FREE(vfname); +VIR_FREE(virt_fns); return ret; } diff --git a/src/util/virnetdev.c b/src/util/virnetdev.c index d53352f..a59012f 100644 --- a/src/util/virnetdev.c +++ b/src/util/virnetdev.c @@ -983,18 +983,19 @@ virNetDevSysfsDeviceFile(char **pf_sysfs_device_link, const char *ifname, int virNetDevGetVirtualFunctions(const char *pfname, char ***vfname, + struct pci_config_address ***virt_fns, unsigned int *n_vfname) { int ret = -1, i; char *pf_sysfs_device_link = NULL; char *pci_sysfs_device_link = NULL; -struct pci_config_address **virt_fns; +//struct pci_config_address **virt_fns; char *pciConfigAddr; if (virNetDevSysfsFile(pf_sysfs_device_link, pfname, device) 0) return ret; -if (pciGetVirtualFunctions(pf_sysfs_device_link, virt_fns, +if (pciGetVirtualFunctions(pf_sysfs_device_link, virt_fns, n_vfname) 0) goto cleanup; @@ -1005,10 +1006,10 @@ virNetDevGetVirtualFunctions(const char *pfname, for (i = 0; i *n_vfname; i++) { -if (pciGetDeviceAddrString(virt_fns[i]-domain, - virt_fns[i]-bus, - virt_fns[i]-slot, - virt_fns[i]-function, +if (pciGetDeviceAddrString((*virt_fns)[i]-domain, + (*virt_fns)[i]-bus, + (*virt_fns)[i]-slot, + (*virt_fns)[i]-function, pciConfigAddr) 0) { virReportSystemError(ENOSYS, %s, _(Failed to get PCI Config Address String)); @@ -1021,20 +1022,21 @@ virNetDevGetVirtualFunctions(const char *pfname, } if (pciDeviceNetName
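The change to virNetDevGetVirtualFunctions turns the VF address list from a local variable into an out-parameter, which is why the callee now has a `***` argument and the caller becomes responsible for freeing each entry and then the array. A minimal sketch of that ownership pattern, assuming made-up data instead of a sysfs lookup: `VfAddress`, `get_virtual_functions` and `free_virtual_functions` are hypothetical names, and the addresses are placeholders on bus 4.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct pci_config_address. */
typedef struct {
    unsigned int domain, bus, slot, function;
} VfAddress;

/* Frees a list produced by get_virtual_functions(): entries, then array. */
static void free_virtual_functions(VfAddress **vfs, unsigned int n)
{
    if (vfs == NULL)
        return;
    for (unsigned int i = 0; i < n; i++)
        free(vfs[i]);
    free(vfs);
}

/* Hypothetical lookup: fills in n placeholder addresses instead of reading
 * sysfs. On success *out owns a heap array of n heap-allocated entries and
 * *n_out is n; on failure *out is NULL and *n_out is 0. */
static int get_virtual_functions(unsigned int n, VfAddress ***out,
                                 unsigned int *n_out)
{
    assert(out != NULL && n_out != NULL);
    *out = NULL;
    *n_out = 0;

    VfAddress **vfs = calloc(n ? n : 1, sizeof(*vfs));
    if (vfs == NULL)
        return -1;

    for (unsigned int i = 0; i < n; i++) {
        vfs[i] = malloc(sizeof(*vfs[i]));
        if (vfs[i] == NULL) {
            free_virtual_functions(vfs, i);  /* free what was built so far */
            return -1;
        }
        vfs[i]->domain = 0;
        vfs[i]->bus = 4;
        vfs[i]->slot = i / 8;       /* 8 functions per slot */
        vfs[i]->function = i % 8;
    }

    *out = vfs;
    *n_out = n;
    return 0;
}
```

The caller passes the address of its own `VfAddress **` variable, reads entries as `vfs[i]->bus`, and calls `free_virtual_functions` in its cleanup path, which mirrors the extra VIR_FREE calls the patch adds under `finish:`.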
[libvirt] [PATCH 2/6 v2] Moved the code to create implicit interface pool from PF to a new function
Just code movement, no functional changes here. This makes the code reusable.

Signed-off-by: Shradha Shah <ss...@solarflare.com>
---
 src/network/bridge_driver.c |   84 ++
 1 files changed, 52 insertions(+), 32 deletions(-)

diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index 79d3010..7d853c6 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -2731,6 +2731,56 @@ int networkRegister(void) {
  * backend function table.
  */
 
+/* networkCreateInterfacePool:
+ * @netdef: the original NetDef from the network
+ *
+ * Creates an implicit interface pool of VF's when a PF dev is given
+ */
+static int
+networkCreateInterfacePool(virNetworkDefPtr netdef) {
+    unsigned int num_virt_fns = 0;
+    char **vfname = NULL;
+    int ret = -1, ii = 0;
+
+    if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
+                                      &vfname, &num_virt_fns)) < 0) {
+        networkReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("Could not get Virtual functions on %s"),
+                           netdef->forwardPfs->dev);
+        goto finish;
+    }
+
+    if (num_virt_fns == 0) {
+        networkReportError(VIR_ERR_INTERNAL_ERROR,
+                           _("No Vf's present on SRIOV PF %s"),
+                           netdef->forwardPfs->dev);
+        goto finish;
+    }
+
+    if ((VIR_ALLOC_N(netdef->forwardIfs, num_virt_fns)) < 0) {
+        virReportOOMError();
+        goto finish;
+    }
+
+    netdef->nForwardIfs = num_virt_fns;
+
+    for (ii = 0; ii < netdef->nForwardIfs; ii++) {
+        netdef->forwardIfs[ii].dev = strdup(vfname[ii]);
+        if (!netdef->forwardIfs[ii].dev) {
+            virReportOOMError();
+            goto finish;
+        }
+        netdef->forwardIfs[ii].usageCount = 0;
+    }
+
+    ret = 0;
+finish:
+    for (ii = 0; ii < num_virt_fns; ii++)
+        VIR_FREE(vfname[ii]);
+    VIR_FREE(vfname);
+    return ret;
+}
+
 /* networkAllocateActualDevice:
  * @iface: the original NetDef from the domain
  *
@@ -2749,8 +2799,6 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
     virNetworkObjPtr network;
     virNetworkDefPtr netdef;
     virPortGroupDefPtr portgroup;
-    unsigned int num_virt_fns = 0;
-    char **vfname = NULL;
     int ii;
     int ret = -1;
 
@@ -2896,36 +2944,11 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
              */
             if (netdef->forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) {
                 if ((netdef->nForwardPfs > 0) && (netdef->nForwardIfs <= 0)) {
-                    if ((virNetDevGetVirtualFunctions(netdef->forwardPfs->dev,
-                                                      &vfname, &num_virt_fns)) < 0) {
+                    if ((networkCreateInterfacePool(netdef)) < 0) {
                         networkReportError(VIR_ERR_INTERNAL_ERROR,
-                                           _("Could not get Virtual functions on %s"),
-                                           netdef->forwardPfs->dev);
+                                           _("Could not Interface Pool"));
                         goto cleanup;
                     }
-
-                    if (num_virt_fns == 0) {
-                        networkReportError(VIR_ERR_INTERNAL_ERROR,
-                                           _("No Vf's present on SRIOV PF %s"),
-                                           netdef->forwardPfs->dev);
-                        goto cleanup;
-                    }
-
-                    if ((VIR_ALLOC_N(netdef->forwardIfs, num_virt_fns)) < 0) {
-                        virReportOOMError();
-                        goto cleanup;
-                    }
-
-                    netdef->nForwardIfs = num_virt_fns;
-
-                    for (ii = 0; ii < netdef->nForwardIfs; ii++) {
-                        netdef->forwardIfs[ii].dev = strdup(vfname[ii]);
-                        if (!netdef->forwardIfs[ii].dev) {
-                            virReportOOMError();
-                            goto cleanup;
-                        }
-                        netdef->forwardIfs[ii].usageCount = 0;
-                    }
                 }
 
                 /* pick first dev with 0 usageCount */
@@ -2977,9 +3000,6 @@ networkAllocateActualDevice(virDomainNetDefPtr iface)
 
     ret = 0;
 cleanup:
-    for (ii = 0; ii < num_virt_fns; ii++)
-        VIR_FREE(vfname[ii]);
-    VIR_FREE(vfname);
     if (network)
         virNetworkObjUnlock(network);
     if (ret < 0) {
-- 
1.7.4.4
[libvirt] [PATCH 6/6 v2] Forward Mode 'Hostdev' qemu driver implementation
Signed-off-by: Shradha Shah <ss...@solarflare.com>
---
 src/qemu/qemu_command.c |   14 ++
 1 files changed, 14 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 93c018d..0f6b714 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -5030,6 +5030,20 @@ qemuBuildCommandLine(virConnectPtr conn,
                  * code here that adds the newly minted hostdev to the
                  * hostdevs array).
                  */
+                if (qemuAssignDeviceHostdevAlias(def,
+                                                 virDomainNetGetActualHostdev(net),
+                                                 (def->nhostdevs-1)) < 0) {
+                    qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                    _("Could not assign alias to Net Hostdev"));
+                    goto error;
+                }
+
+                if (virDomainHostdevInsert(def,
+                                           virDomainNetGetActualHostdev(net)) < 0) {
+                    qemuReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                                    _("Hostdev not inserted into the array"));
+                    goto error;
+                }
                 continue;
             }
 
-- 
1.7.4.4
[libvirt] [PATCH 0/5] Support forward mode='hostdev' and interface pools
This patch series adds support for forward mode='hostdev'. This mode
behaves the same as interface type='hostdev', with the added benefit
of interface pools.

The series also contains a patch that allows interface names and PCI
device addresses to be used interchangeably in a network XML, and
returns the appropriate one in actualDevice when
networkAllocateActualDevice is called. This functionality is supported
only for forward mode hostdev.

The series currently does not support USB hostdev passthrough.

At the top level, the managed attribute can be specified for a pf dev
or an interface dev (with identical results as when it's specified for
a hostdev).

Shradha Shah (5):
  Code to return interface name or pci_addr of the VF in actualDevice
  Moved the code to create implicit interface pool from PF to a new
    function
  Introduce forward mode='hostdev' for network XML in order to use the
    functionality of interface pools.
  Forward Mode Hostdev Implementation
  Supporting managed option for forward devs when using HOSTDEV mode

 docs/schemas/network.rng    |    1 +
 src/conf/network_conf.c     |  119 +++-
 src/conf/network_conf.h     |   13 ++-
 src/libvirt_private.syms    |    3 +
 src/network/bridge_driver.c |  325 +++
 src/qemu/qemu_command.c     |   14 ++
 src/util/pci.c              |    7 +-
 src/util/pci.h              |    3 +
 src/util/virnetdev.c        |  134 +-
 src/util/virnetdev.h        |   19 +++
 10 files changed, 568 insertions(+), 70 deletions(-)

-- 
1.7.4.4
[libvirt] [PATCH 3/5] Introduce forward mode='hostdev' for network XML in order to use the functionality of interface pools.
This new forward mode sets up a PCI network device to be assigned to
the guest with PCI passthrough. The PCI network device is chosen from
an interface pool. Currently there is no support for USB devices when
using this forward mode.

Example XML:

Network XML:
  <network>
    <name>direct-network</name>
    <uuid>81ff0d90-c91e-6742-64da-4a736edb9a8f</uuid>
    <forward mode="hostdev">
      <pf dev="eth2"/>
    </forward>
  </network>

The above xml leads to the creation of an implicit interface pool and
a free network device is chosen from the pool.

This patch would also work if an interface pool is given explicitly.
The interface dev value can be an interface name or a PCI device
address.

Example XML:

Network XML:
  <network>
    <name>direct-network</name>
    <uuid>81ff0d90-c91e-6742-64da-4a736edb9a8f</uuid>
    <forward mode="hostdev">
      <interface dev=":04:00.1"/>
      <interface dev=":04:00.2"/>
      <interface dev=":04:00.3"/>
    </forward>
  </network>

The MAC address would be provided in the domain xml.

Signed-off-by: Shradha Shah <ss...@solarflare.com>
---
 docs/schemas/network.rng    |    1 +
 src/conf/network_conf.c     |    3 ++-
 src/conf/network_conf.h     |    1 +
 src/network/bridge_driver.c |    6 --
 4 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/docs/schemas/network.rng b/docs/schemas/network.rng
index 2ae879e..a0046f1 100644
--- a/docs/schemas/network.rng
+++ b/docs/schemas/network.rng
@@ -82,6 +82,7 @@
                 <value>passthrough</value>
                 <value>private</value>
                 <value>vepa</value>
+                <value>hostdev</value>
               </choice>
             </attribute>
           </optional>
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c
index 8fcba16..6b346c3 100644
--- a/src/conf/network_conf.c
+++ b/src/conf/network_conf.c
@@ -51,7 +51,7 @@ VIR_ENUM_DECL(virNetworkForward)
 
 VIR_ENUM_IMPL(virNetworkForward,
               VIR_NETWORK_FORWARD_LAST,
-              "none", "nat", "route", "bridge", "private", "vepa", "passthrough" )
+              "none", "nat", "route", "bridge", "private", "vepa", "passthrough", "hostdev")
 
 #define virNetworkReportError(code, ...)                                \
     virReportErrorHelper(VIR_FROM_NETWORK, code, __FILE__,              \
@@ -1251,6 +1251,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt)
         case VIR_NETWORK_FORWARD_PRIVATE:
         case VIR_NETWORK_FORWARD_VEPA:
         case VIR_NETWORK_FORWARD_PASSTHROUGH:
+        case VIR_NETWORK_FORWARD_HOSTDEV:
             if (def->bridge) {
                 virNetworkReportError(VIR_ERR_XML_ERROR,
                                       _("bridge name not allowed in %s mode (network '%s'"),
diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h
index b205cb0..d473c71 100644
--- a/src/conf/network_conf.h
+++ b/src/conf/network_conf.h
@@ -45,6 +45,7 @@ enum virNetworkForwardType {
     VIR_NETWORK_FORWARD_PRIVATE,
     VIR_NETWORK_FORWARD_VEPA,
     VIR_NETWORK_FORWARD_PASSTHROUGH,
+    VIR_NETWORK_FORWARD_HOSTDEV,
 
     VIR_NETWORK_FORWARD_LAST,
 };
diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c
index 8540003..cc53551 100644
--- a/src/network/bridge_driver.c
+++ b/src/network/bridge_driver.c
@@ -1974,7 +1974,7 @@ networkStartNetworkExternal(struct network_driver *driver ATTRIBUTE_UNUSED,
                             virNetworkObjPtr network ATTRIBUTE_UNUSED)
 {
     /* put anything here that needs to be done each time a network of
-     * type BRIDGE, PRIVATE, VEPA, or PASSTHROUGH is started. On
+     * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is started. On
      * failure, undo anything you've done, and return -1. On success
      * return 0.
      */
@@ -1985,7 +1985,7 @@ static int networkShutdownNetworkExternal(struct network_driver *driver ATTRIBUT
                                           virNetworkObjPtr network ATTRIBUTE_UNUSED)
 {
     /* put anything here that needs to be done each time a network of
-     * type BRIDGE, PRIVATE, VEPA, or PASSTHROUGH is shutdown. On
+     * type BRIDGE, PRIVATE, VEPA, HOSTDEV or PASSTHROUGH is shutdown. On
      * failure, undo anything you've done, and return -1. On success
      * return 0.
      */
@@ -2016,6 +2016,7 @@ networkStartNetwork(struct network_driver *driver,
     case VIR_NETWORK_FORWARD_PRIVATE:
     case VIR_NETWORK_FORWARD_VEPA:
     case VIR_NETWORK_FORWARD_PASSTHROUGH:
+    case VIR_NETWORK_FORWARD_HOSTDEV:
         ret = networkStartNetworkExternal(driver, network);
         break;
     }
@@ -2075,6 +2076,7 @@ static int networkShutdownNetwork(struct network_driver *driver,
     case VIR_NETWORK_FORWARD_PRIVATE:
     case VIR_NETWORK_FORWARD_VEPA:
     case VIR_NETWORK_FORWARD_PASSTHROUGH:
+    case VIR_NETWORK_FORWARD_HOSTDEV:
         ret = networkShutdownNetworkExternal(driver, network);
         break;
     }
-- 
1.7.4.4
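For readers unfamiliar with how a new forward mode string ends up as an enum value, the following standalone sketch may help. It does not use libvirt's VIR_ENUM_IMPL macro; the names forward_type, forward_type_names, forward_type_from_string and forward_type_to_string are hypothetical and exist only for illustration. The point it shows is that the enum and the string table have to be extended in the same position, which is what the two network_conf hunks in this patch do.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for virNetworkForwardType. The order here must
 * match the order of the string table below. */
enum forward_type {
    FORWARD_NONE = 0,
    FORWARD_NAT,
    FORWARD_ROUTE,
    FORWARD_BRIDGE,
    FORWARD_PRIVATE,
    FORWARD_VEPA,
    FORWARD_PASSTHROUGH,
    FORWARD_HOSTDEV,    /* the newly added mode */
    FORWARD_LAST
};

/* One string per enum value, in the same order. */
static const char *const forward_type_names[FORWARD_LAST] = {
    "none", "nat", "route", "bridge", "private", "vepa",
    "passthrough", "hostdev",
};

/* Return the enum value for a mode string, or -1 if it is unknown. */
static int forward_type_from_string(const char *mode)
{
    int i;

    if (mode == NULL)
        return -1;
    for (i = 0; i < FORWARD_LAST; i++) {
        if (strcmp(forward_type_names[i], mode) == 0)
            return i;
    }
    return -1;
}

/* Return the mode string for an enum value, or NULL if out of range. */
static const char *forward_type_to_string(int type)
{
    if (type < 0 || type >= FORWARD_LAST)
        return NULL;
    return forward_type_names[type];
}
```

With a table like this, a forward element with mode="hostdev" would resolve to FORWARD_HOSTDEV, and any switch over the enum (such as the start and shutdown paths touched above) needs a matching case.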
[libvirt] [PATCH 1/5] Code to return interface name or pci_addr of the VF in actualDevice
The network pool should be able to keep track of both, network device names nad PCI addresses, and return the appropriate one in the actualDevice when networkAllocateActualDevice is called. Signed-off-by: Shradha Shah ss...@solarflare.com --- src/conf/network_conf.c | 55 ++- src/conf/network_conf.h | 10 +++- src/network/bridge_driver.c | 47 src/util/pci.c |5 +++- src/util/virnetdev.c|6 +--- 5 files changed, 116 insertions(+), 7 deletions(-) diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 6515efe..8fcba16 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -98,6 +98,12 @@ virPortGroupDefClear(virPortGroupDefPtr def) } static void +virNetworkForwardPfDefClear(virNetworkForwardPfDefPtr def) +{ +VIR_FREE(def-dev); +} + +static void virNetworkForwardIfDefClear(virNetworkForwardIfDefPtr def) { VIR_FREE(def-dev); @@ -163,7 +169,7 @@ void virNetworkDefFree(virNetworkDefPtr def) VIR_FREE(def-domain); for (ii = 0 ; ii def-nForwardPfs def-forwardPfs ; ii++) { -virNetworkForwardIfDefClear(def-forwardPfs[ii]); +virNetworkForwardPfDefClear(def-forwardPfs[ii]); } VIR_FREE(def-forwardPfs); @@ -925,6 +931,44 @@ error: return result; } +/* Function to compare strings with wildcard strings*/ +/* When editing this function also edit the one in src/network/bridge_driver.c*/ +static int +wildcmp(const char *wild, const char *string) +{ +/* Written by Jack Handy - A href=mailto:jakkha...@hotmail.com;jakkha...@hotmail.com/A*/ +const char *cp = NULL, *mp = NULL; + +while ((*string) (*wild != '*')) { +if ((*wild != *string) (*wild != '?')) { +return 0; +} +wild++; +string++; +} + +while (*string) { +if (*wild == '*') { +if (!*++wild) { +return 1; +} +mp = wild; +cp = string+1; +} else if ((*wild == *string) || (*wild == '?')) { +wild++; +string++; +} else { +wild = mp; +string = cp++; +} +} + +while (*wild == '*') { +wild++; +} +return !*wild; +} + static virNetworkDefPtr virNetworkDefParseXML(xmlXPathContextPtr ctxt) { @@ -1170,6 +1214,15 @@ 
virNetworkDefParseXML(xmlXPathContextPtr ctxt) def-nForwardIfs++; } } +int i; +for (i = 0; i nForwardIfs; i++) { +if (wildcmp(:??:??.?, def-forwardIfs[i].dev)) { +def-forwardIfs[i].isPciAddr = true; +} else { +def-forwardIfs[i].isPciAddr = false; +} +} + VIR_FREE(forwardDev); VIR_FREE(forwardPfNodes); VIR_FREE(forwardIfNodes); diff --git a/src/conf/network_conf.h b/src/conf/network_conf.h index 4339a69..b205cb0 100644 --- a/src/conf/network_conf.h +++ b/src/conf/network_conf.h @@ -133,6 +133,14 @@ typedef virNetworkForwardIfDef *virNetworkForwardIfDefPtr; struct _virNetworkForwardIfDef { char *dev; /* name of device */ int usageCount; /* how many guest interfaces are bound to this device? */ +bool isPciAddr; /* Differentiate a VF based on interface name or pci addr*/ +}; + +typedef struct _virNetworkForwardPfDef virNetworkForwardPfDef; +typedef virNetworkForwardPfDef *virNetworkForwardPfDefPtr; +struct _virNetworkForwardPfDef { +char *dev; /* name of device */ +int usageCount; /* how many guest interfaces are bound to this device? */ }; typedef struct _virPortGroupDef virPortGroupDef; @@ -163,7 +171,7 @@ struct _virNetworkDef { * interfaces), they will be listed here. 
*/ size_t nForwardPfs; -virNetworkForwardIfDefPtr forwardPfs; +virNetworkForwardPfDefPtr forwardPfs; size_t nForwardIfs; virNetworkForwardIfDefPtr forwardIfs; diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index d82212f..41e3a49 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -87,6 +87,43 @@ struct network_driver { char *logDir; }; +/* Function to compare strings with wildcard strings*/ +/* When editing this function also edit the one in src/conf/network_conf.c*/ +static int +wildcmp(const char *wild, const char *string) +{ +// Written by Jack Handy - A href=mailto:jakkha...@hotmail.com;jakkha...@otmail.com/A +const char *cp = NULL, *mp = NULL; + +while ((*string) (*wild != '*')) { +if ((*wild != *string) (*wild != '?')) { +return 0; +} +wild++; +string++; +} + +while (*string) { +if (*wild == '*') { +if (!*++wild) { +return 1; +} +mp = wild; +cp = string+1; +} else if ((*wild == *string
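The isPciAddr classification in this patch can be illustrated with a much smaller matcher than wildcmp. The sketch below is not the patch's code: it only supports '?' (any single character) and literal characters, and the names match_fixed and dev_is_pci_addr are hypothetical. It also assumes the intended pattern is a four-character domain followed by bus, slot and function, as in "0000:04:00.1"; the pattern as it appears in this message has no leading domain field, so that part is a guess.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified matcher: '?' matches any single character, every other
 * pattern character must match literally. Unlike wildcmp, '*' gets no
 * special treatment, and both strings must end at the same time. */
static bool match_fixed(const char *pattern, const char *s)
{
    if (pattern == NULL || s == NULL)
        return false;
    while (*pattern && *s) {
        if (*pattern != '?' && *pattern != *s)
            return false;
        pattern++;
        s++;
    }
    return *pattern == '\0' && *s == '\0';
}

/* Assumed shape: domain:bus:slot.function, e.g. "0000:04:00.1". */
static bool dev_is_pci_addr(const char *dev)
{
    return match_fixed("????:??:??.?", dev);
}
```

Note that this checks only the shape of the string, not that the fields are hexadecimal, so an interface name that happened to be twelve characters long with separators in the right places would be misclassified. Parsing the fields as numbers would be stricter.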
[libvirt] [PATCH 2/5] Moved the code to create implicit interface pool from PF to a new function
This makes the code reusable Signed-off-by: Shradha Shah ss...@solarflare.com --- src/network/bridge_driver.c | 109 ++- 1 files changed, 66 insertions(+), 43 deletions(-) diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index 41e3a49..8540003 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -2761,6 +2761,60 @@ int networkRegister(void) { * backend function table. */ +/* networkCreateInterfacePool: + * @netdef: the original NetDef from the network + * + * Creates an implicit interface pool of VF's when a PF dev is given + */ +static int +networkCreateInterfacePool(virNetworkDefPtr netdef) { +unsigned int num_virt_fns = 0; +char **vfname = NULL; +int ret = -1, ii = 0; + +if ((virNetDevGetVirtualFunctions(netdef-forwardPfs-dev, + vfname, num_virt_fns)) 0) { +networkReportError(VIR_ERR_INTERNAL_ERROR, + _(Could not get Virtual functions on %s), + netdef-forwardPfs-dev); +goto finish; +} + +if (num_virt_fns == 0) { +networkReportError(VIR_ERR_INTERNAL_ERROR, + _(No Vf's present on SRIOV PF %s), + netdef-forwardPfs-dev); +goto finish; +} + +if ((VIR_ALLOC_N(netdef-forwardIfs, num_virt_fns)) 0) { +virReportOOMError(); +goto finish; +} + +netdef-nForwardIfs = num_virt_fns; + +for (ii = 0; ii netdef-nForwardIfs; ii++) { +netdef-forwardIfs[ii].dev = strdup(vfname[ii]); +if (!netdef-forwardIfs[ii].dev) { +virReportOOMError(); +goto finish; +} +netdef-forwardIfs[ii].usageCount = 0; +if (wildcmp(:??:??.?, netdef-forwardIfs[ii].dev)) +netdef-forwardIfs[ii].isPciAddr = true; +else +netdef-forwardIfs[ii].isPciAddr = false; +} + +ret = 0; +finish: +for (ii = 0; ii num_virt_fns; ii++) +VIR_FREE(vfname[ii]); +VIR_FREE(vfname); +return ret; +} + /* networkAllocateActualDevice: * @iface: the original NetDef from the domain * @@ -2779,8 +2833,6 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) virNetworkObjPtr network; virNetworkDefPtr netdef; virPortGroupDefPtr portgroup; -unsigned int num_virt_fns = 0; -char **vfname = NULL; 
int ii; int ret = -1; @@ -2858,7 +2910,7 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) (netdef-forwardType == VIR_NETWORK_FORWARD_VEPA) || (netdef-forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH)) { virNetDevVPortProfilePtr virtport = NULL; - +int rc = -1; /* forward type='bridge|private|vepa|passthrough' are all * VIR_DOMAIN_NET_TYPE_DIRECT. */ @@ -2926,57 +2978,31 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) */ if (netdef-forwardType == VIR_NETWORK_FORWARD_PASSTHROUGH) { if ((netdef-nForwardPfs 0) (netdef-nForwardIfs = 0)) { -if ((virNetDevGetVirtualFunctions(netdef-forwardPfs-dev, - vfname, num_virt_fns)) 0) { +if((rc = networkCreateInterfacePool(netdef)) 0) { networkReportError(VIR_ERR_INTERNAL_ERROR, - _(Could not get Virtual functions on %s), - netdef-forwardPfs-dev); + _(Could not create Interface Pool from PF)); goto cleanup; } - -if (num_virt_fns == 0) { +} + +/*if dev names are pci addrs in passthrough mode: error*/ +for (ii = 0; ii netdef-nForwardIfs; ii++) { +if (netdef-forwardIfs[ii].isPciAddr == true) { networkReportError(VIR_ERR_INTERNAL_ERROR, - _(No Vf's present on SRIOV PF %s), - netdef-forwardPfs-dev); + _(Passthrough mode does not work with PCI addresses and needs Interface names)); goto cleanup; } - -if ((VIR_ALLOC_N(netdef-forwardIfs, num_virt_fns)) 0) { -virReportOOMError(); -goto cleanup; -} - -netdef-nForwardIfs = num_virt_fns; - -for (ii = 0; ii netdef-nForwardIfs; ii++) { -netdef-forwardIfs[ii].dev = strdup(vfname[ii]); -if (!netdef-forwardIfs[ii].dev) { -virReportOOMError(); -goto cleanup
[libvirt] [PATCH 4/5] Forward Mode Hostdev Implementation
This patch chooses a free network device from the interface pool and creates a PCI HostDef to be passed to the guest, when forward mode is hostdev. networkNotifyActualDevice and networkReleaseActualDevice are modified accordingly. Signed-off-by: Shradha Shah ss...@solarflare.com --- src/libvirt_private.syms|3 + src/network/bridge_driver.c | 180 +-- src/qemu/qemu_command.c | 14 src/util/pci.c |2 +- src/util/pci.h |3 + src/util/virnetdev.c| 128 ++ src/util/virnetdev.h| 19 + 7 files changed, 325 insertions(+), 24 deletions(-) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index afb308d..5b8ab0b 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1261,6 +1261,9 @@ virNetDevGetVLanID; virNetDevGetVirtualFunctionIndex; virNetDevGetVirtualFunctionInfo; virNetDevGetVirtualFunctions; +virNetDevParsePciConfigAddress; +virNetDevGetDeviceAddrString; +virNetDevGetPciAddrFromName; virNetDevIsOnline; virNetDevIsVirtualFunction; virNetDevLinkDump; diff --git a/src/network/bridge_driver.c b/src/network/bridge_driver.c index cc53551..691ab07 100644 --- a/src/network/bridge_driver.c +++ b/src/network/bridge_driver.c @@ -2835,6 +2835,8 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) virNetworkObjPtr network; virNetworkDefPtr netdef; virPortGroupDefPtr portgroup; +virNetDevVPortProfilePtr virtport = NULL; +virNetworkForwardIfDefPtr dev = NULL; int ii; int ret = -1; @@ -2906,12 +2908,95 @@ networkAllocateActualDevice(virDomainNetDefPtr iface) virReportOOMError(); goto cleanup; } - +} else if (netdef-forwardType == VIR_NETWORK_FORWARD_HOSTDEV) { +int rc = -1; +if (!iface-data.network.actual + (VIR_ALLOC(iface-data.network.actual) 0)) { +virReportOOMError(); +goto cleanup; +} +iface-data.network.actual-type = VIR_DOMAIN_NET_TYPE_HOSTDEV; +if ((netdef-nForwardPfs 0) (netdef-nForwardIfs = 0)) { +if((rc = networkCreateInterfacePool(netdef)) 0) { +networkReportError(VIR_ERR_INTERNAL_ERROR, + _(Could not create Interface Pool from PF)); +goto 
cleanup; +} +} +/* pick first dev with 0 usageCount */ + +for (ii = 0; ii netdef-nForwardIfs; ii++) { +if (netdef-forwardIfs[ii].usageCount == 0) { +dev = netdef-forwardIfs[ii]; +break; +} +} +if (!dev) { +networkReportError(VIR_ERR_INTERNAL_ERROR, + _(network '%s' requires exclusive access to interfaces, but none are available), + netdef-name); +goto cleanup; +} + +iface-data.network.actual-data.hostdev.def.parent.type = VIR_DOMAIN_DEVICE_NET; +iface-data.network.actual-data.hostdev.def.parent.data.net = iface; +iface-data.network.actual-data.hostdev.def.info = iface-info; +iface-data.network.actual-data.hostdev.def.mode = VIR_DOMAIN_HOSTDEV_MODE_SUBSYS; +iface-data.network.actual-data.hostdev.def.managed = 1; +iface-data.network.actual-data.hostdev.def.source.subsys.type = VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI; + +if (dev-isPciAddr == true) { +virDomainDevicePCIAddressPtr addr = iface-data.network.actual-data.hostdev.def.source.subsys.u.pci; +if (virNetDevParsePciConfigAddress(dev-dev, + addr-domain, + addr-bus, + addr-slot, + addr-function) 0) { +goto cleanup; +} +} +else if (dev-isPciAddr == false) { +virDomainDevicePCIAddressPtr addr = iface-data.network.actual-data.hostdev.def.source.subsys.u.pci; +char *device_pci_addr = NULL; +if (virNetDevGetPciAddrFromName(dev-dev, +device_pci_addr) 0) { +goto cleanup; +} +if (virNetDevParsePciConfigAddress(device_pci_addr, + addr-domain, + addr-bus, + addr-slot, + addr-function) 0) { +goto cleanup; +} +VIR_FREE(device_pci_addr); +} +dev-usageCount++; +VIR_DEBUG(Using physical device %s, usageCount %d, + dev-dev, dev-usageCount); + +if (iface-data.network.virtPortProfile
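The "pick first dev with 0 usageCount" step can be sketched on its own, outside libvirt. The struct pool_dev type and the pool_acquire and pool_release functions below are hypothetical names, and the structure is a trimmed-down stand-in for virNetworkForwardIfDef; the sketch only illustrates the exclusive-access rule that hostdev mode relies on, not the patch's actual data flow.

```c
#include <stddef.h>

/* Hypothetical, trimmed-down pool entry. */
struct pool_dev {
    const char *dev;    /* interface name or PCI address */
    int usage_count;    /* guests currently bound to this device */
};

/* Return the first unused device and mark it as in use, or NULL if
 * every device in the pool is already taken. */
static struct pool_dev *pool_acquire(struct pool_dev *pool, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (pool[i].usage_count == 0) {
            pool[i].usage_count++;
            return &pool[i];
        }
    }
    return NULL;
}

/* Give a device back to the pool. */
static void pool_release(struct pool_dev *dev)
{
    if (dev != NULL && dev->usage_count > 0)
        dev->usage_count--;
}
```

A NULL return corresponds to the "requires exclusive access to interfaces, but none are available" error in the patch. Releasing a device makes it eligible for the next allocation, which is presumably the role of the networkReleaseActualDevice changes mentioned in the commit message.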
[libvirt] [PATCH 5/5] Supporting managed option for forward devs when using HOSTDEV mode
This patch supports the managed option in a network xml for network devices. This option is used for pci network devices when forward mode=hostdev. Hence the example network xml would be: network namedirect-network/name uuid81ff0d90-c91e-6742-64da-4a736edb9a8f/uuid forward mode=hostdev pf dev=eth2 managed='yes'/ /forward /network OR network namedirect-network/name uuid81ff0d90-c91e-6742-64da-4a736edb9a8f/uuid forward mode=hostdev interface dev=:04:00.1 managed='yes'/ interface dev=:04:00.2 managed='yes'/ interface dev=:04:00.3 managed='yes'/ /forward /network Signed-off-by: Shradha Shah ss...@solarflare.com --- src/conf/network_conf.c | 61 -- src/conf/network_conf.h |2 + src/network/bridge_driver.c |5 +++- 3 files changed, 64 insertions(+), 4 deletions(-) diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 6b346c3..18e4ee3 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -986,6 +986,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) char *forwardDev = NULL; xmlNodePtr save = ctxt-node; xmlNodePtr bandwidthNode = NULL; +char *managed = NULL; if (VIR_ALLOC(def) 0) { virReportOOMError(); @@ -1128,6 +1129,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) } forwardDev = virXPathString(string(./@dev), ctxt); +managed = virXPathString(string(./managed), ctxt); /* all of these modes can use a pool of physical interfaces */ nForwardIfs = virXPathNodeSet(./interface, ctxt, forwardIfNodes); @@ -1151,6 +1153,12 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) goto error; } +if (managed) { +virNetworkReportError(VIR_ERR_XML_ERROR, + _(A managed field should not be used in this location when using a SRIOV PF)); +goto error; +} + forwardDev = virXMLPropString(*forwardPfNodes, dev); if (!forwardDev) { virNetworkReportError(VIR_ERR_XML_ERROR, @@ -1159,9 +1167,16 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) goto error; } +managed = virXMLPropString(*forwardPfNodes, managed); +if(managed != NULL) { +if (STREQ(managed, yes)) 
+def-forwardPfs-managed = 1; +} + def-forwardPfs-usageCount = 0; def-forwardPfs-dev = forwardDev; forwardDev = NULL; +managed = NULL; def-nForwardPfs++; } else if (nForwardPfs 1) { virNetworkReportError(VIR_ERR_XML_ERROR, @@ -1170,6 +1185,7 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) } if (nForwardIfs 0 || forwardDev) { int ii; +unsigned int managedvalue = 0; /* allocate array to hold all the portgroups */ if (VIR_ALLOC_N(def-forwardIfs, MAX(nForwardIfs, 1)) 0) { @@ -1181,12 +1197,24 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) def-forwardIfs[0].usageCount = 0; def-forwardIfs[0].dev = forwardDev; forwardDev = NULL; +if (managed != NULL) { +if (STREQ(managed, yes)) +def-forwardIfs[0].managed = 1; +} +managed = NULL; def-nForwardIfs++; } /* parse each forwardIf */ for (ii = 0; ii nForwardIfs; ii++) { forwardDev = virXMLPropString(forwardIfNodes[ii], dev); +managed = virXMLPropString(forwardIfNodes[ii], managed); +if (managed != NULL) { +if (STREQ(managed, yes)) +managedvalue = 1; +else +managedvalue = 0; +} if (!forwardDev) { virNetworkReportError(VIR_ERR_XML_ERROR, _(Missing required dev attribute in network '%s' forward interface element), @@ -1204,12 +1232,22 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt) forwardDev, def-name); goto error; } + +if (managedvalue != def-forwardIfs[0].managed) { +virNetworkReportError(VIR_ERR_XML_ERROR, + _(managed field of forward dev must match that of the first interface element dev in network '%s'), + def-name); +goto error; +} +VIR_FREE(managed); VIR_FREE(forwardDev); continue
Re: [libvirt] [PATCH 0/5] Support forward mode='hostdev' and interface pools
Hello all,

I have actually based these patches off v0.9.12. Top-of-tree libvirt
had some errors that prevented me from using virsh to test these
patches, so I based them on the last release. The patches should work
equally well with top of tree.

Many Thanks,
Regards,
Shradha Shah

On 06/08/2012 04:26 PM, Shradha Shah wrote:
> This patch series supports the forward mode='hostdev'. The
> functionality of this mode is the same as interface type='hostdev'
> but with the added benefit of using interface pools.
> [...]
Re: [libvirt] Error when running virsh version and virt-manager
The output of virsh capabilities is:

[root@c6100o bin]# virsh capabilities
<capabilities>

  <host>
    <uuid>44454c4c-3500-1058-8030-b4c04f33354a</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Westmere</model>
      <vendor>Intel</vendor>
      <topology sockets='1' cores='4' threads='2'/>
      <feature name='rdtscp'/>
      <feature name='pdpe1gb'/>
      <feature name='dca'/>
      <feature name='pdcm'/>
      <feature name='xtpr'/>
      <feature name='tm2'/>
      <feature name='est'/>
      <feature name='smx'/>
      <feature name='vmx'/>
      <feature name='ds_cpl'/>
      <feature name='monitor'/>
      <feature name='dtes64'/>
      <feature name='pclmuldq'/>
      <feature name='pbe'/>
      <feature name='tm'/>
      <feature name='ht'/>
      <feature name='ss'/>
      <feature name='acpi'/>
      <feature name='ds'/>
      <feature name='vme'/>
    </cpu>
    <power_management>
      <suspend_disk/>
    </power_management>
    <migration_features>
      <live/>
      <uri_transports>
        <uri_transport>tcp</uri_transport>
      </uri_transports>
    </migration_features>
  </host>

</capabilities>

Also, the log in /var/log/libvirt/libvirtd.log says:

2012-05-29 09:55:38.801+: 24292: error : qemuCapsComputeCmdFlags:1215 : unsupported configuration: this qemu binary requires libvirt to be compiled with yajl
2012-05-29 09:55:38.806+: 24292: error : qemuCapsComputeCmdFlags:1215 : unsupported configuration: this qemu binary requires libvirt to be compiled with yajl
2012-05-29 09:57:57.438+: 24287: error : qemuCapsExtractVersion:1566 : internal error Cannot find suitable emulator for x86_64

I created a symlink /usr/bin/qemu-system-x86_64 pointing to
/usr/libexec/qemu-kvm and I still see the issue.

Many Thanks,
Regards,
Shradha Shah

On 05/29/2012 05:13 AM, Osier Yang wrote:
> On 2012-05-28 20:27, Shradha Shah wrote:
>> At this point when I run the command virsh version I see the
>> following output:
>>
>> Compiled against library: libvir 0.9.12
>> Using library: libvir 0.9.12
>> Using API: QEMU 0.9.12
>> error: failed to get the hypervisor version
>> error: internal error Cannot find suitable emulator for x86_64
>
> What's the output of virsh capabilities?
>
> Osier
[libvirt] Error when running virsh version and virt-manager
Hello all,

I pulled the latest changes from the upstream libvirt repo this
morning, then built and installed libvirt myself using the following
commands (machine OS: RHEL 6.2):

1) ./autogen.sh --system --enable-compile-warnings=error
2) make
3) make install
4) service libvirtd restart

At this point, when I run the command virsh version, I see the
following output:

Compiled against library: libvir 0.9.12
Using library: libvir 0.9.12
Using API: QEMU 0.9.12
error: failed to get the hypervisor version
error: internal error Cannot find suitable emulator for x86_64

Some debug commands and their output:

[root@c6100o libvirt]# rpm -qa | grep qemu-kvm
qemu-kvm-0.12.1.2-2.209.el6.x86_64
[root@c6100o libvirt]# whereis qemu-kvm
qemu-kvm: /usr/libexec/qemu-kvm /usr/share/qemu-kvm /usr/share/man/man1/qemu-kvm.1.gz
[root@c6100o libvirt]# which virsh
/usr/bin/virsh
[root@c6100o libvirt]# which libvirtd
/usr/sbin/libvirtd

Could anyone help me resolve this issue?

-- 
Many Thanks,
Regards,
Shradha Shah
[libvirt] Errors during compilation
Hello All,

I have just pulled all the latest patches into my libvirt git
repository. When I try to compile the source I see the following
errors:

rpc/virnetclientprogram.c: In function 'virNetClientProgramCall':
rpc/virnetclientprogram.c:295: error: 'VIR_NET_CALL_WITH_FDS' undeclared (first use in this function)
rpc/virnetclientprogram.c:295: error: (Each undeclared identifier is reported only once
rpc/virnetclientprogram.c:295: error: for each function it appears in.)
rpc/virnetclientprogram.c:338: error: 'VIR_NET_REPLY_WITH_FDS' undeclared (first use in this function)

Has anyone seen these errors before? What can I do to fix them?

Thanks.

-- 
Many Thanks,
Regards,
Shradha Shah
[libvirt] PXE boot issue with macvtap
Hello All,

I currently have an issue with using macvtap interfaces on KVM. Is
this a known issue? I would also appreciate any ideas on how to take
this further.

Issue: When you try to PXE boot one KVM guest off another KVM guest
over a macvtap interface, the client does not receive the TFTP packets
sent by the server and the connection times out.

Setup: Both the PXE client and the PXE server are VMs running on the
same host, using macvtap on top of a LOM interface for connectivity.

Always reproducible.

Findings of the testing:

1) The issue is seen between guests on the same host, not between
   guests on different hosts.
2) PXE booting works fine if we use a Linux bridge for connectivity
   rather than macvtap.
3) The initial DHCP stage works fine and the client acquires an IP
   address, but then the TFTP attempt stalls.
4) The client sends an RRQ, to which the server replies with an option
   acknowledgement, which is exactly what should happen.
5) However, the client appears not to receive the acknowledgement.
6) Both client and server keep retransmitting their packets for a
   while, until the client times out and gives up.
7) The ifconfig packet counters for each guest's macvtap interface on
   the host suggest that the packets are being received, so they are
   dropped somewhere between macvtap on the host and the virtio device
   in the guest.
8) Tcpdump does not seem to work with macvtap, so I could not inspect
   further.

-- 
Many Thanks,
Regards,
Shradha Shah
[libvirt] Migration failure related to EOF from Monitor
Hello All,

I am trying to migrate a KVM guest while implementing the patches for PCI-Passthrough of SRIOV VFs, and I repeatedly see the following error:

  [root@c6100k cwd]# virsh migrate --live dibenchvm1 qemu+ssh://c6100l.uk.level5networks.com/system
  error: operation failed: migration job: unexpectedly failed

I have turned on debugging in /etc/libvirt/libvirtd.conf and I see the following debug output in /var/log/libvirt/libvirtd.log:

  10:37:26.632: 2449: error : qemuMonitorIO:584 : internal error End of file from monitor
  10:37:26.632: 2449: debug : qemuMonitorIO:617 : Error on monitor internal error End of file from monitor
  10:37:26.632: 2449: debug : virEventPollUpdateHandle:145 : Update handle w=21 e=12
  10:37:26.632: 2449: debug : virEventPollInterruptLocked:676 : Skip interrupt, 1 -1497479264
  10:37:26.632: 2449: debug : qemuMonitorIO:640 : Triggering EOF callback
  10:37:26.632: 2449: debug : qemuProcessHandleMonitorEOF:125 : Received EOF on 0x7f588c003540 'dibenchvm1'
  10:37:26.632: 2449: debug : qemuProcessStop:3271 : Shutting down VM 'dibenchvm1' pid=18739 migrated=0
  10:37:26.632: 2449: debug : qemuMonitorClose:765 : mon=0x7f588c00c7f0
  10:37:26.632: 2449: debug : virEventPollRemoveHandle:172 : Remove handle w=21
  10:37:26.633: 2449: debug : virEventPollRemoveHandle:185 : mark delete 9 20
  10:37:26.633: 2449: debug : virEventPollInterruptLocked:676 : Skip interrupt, 1 -1497479264
  10:37:26.633: 2449: debug : qemuProcessKill:3217 : vm=dibenchvm1 pid=18739 gracefully=0
  10:37:26.633: 2449: debug : qemuProcessAutoDestroyRemove:3736 : vm=dibenchvm1 uuid=a4452511-0f97-734a-dbcc-f4c66f821956
  10:37:26.633: 2449: debug : virSecurityDACRestoreSecurityAllLabel:504 : Restoring security label on dibenchvm1 migrated=0

May I ask for suggestions on how I can debug the monitor and find out why the qemu monitor dies?

Many Thanks,
Regards,
Shradha Shah
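When hunting for what preceded the monitor EOF, it can help to pull out the log entries leading up to the first error. The following is a rough sketch that assumes the line layout seen in the excerpt above (time, thread id, level, function:line, message); the names `LogEntry`, `parse_line` and `context_before_first_error` are made up for this illustration and are not part of libvirt.

```python
import re
from typing import Iterable, List, NamedTuple, Optional

# Matches lines such as:
#   10:37:26.632: 2449: error : qemuMonitorIO:584 : internal error End of file from monitor
LOG_LINE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}): (?P<tid>\d+): "
    r"(?P<level>\w+) : (?P<func>\w+):(?P<line>\d+) : (?P<msg>.*)$"
)


class LogEntry(NamedTuple):
    time: str
    tid: int
    level: str
    func: str
    line: int
    msg: str


def parse_line(text: str) -> Optional[LogEntry]:
    """Parse one log line; return None if it does not match the assumed layout."""
    m = LOG_LINE.match(text.strip())
    if m is None:
        return None
    return LogEntry(m["time"], int(m["tid"]), m["level"],
                    m["func"], int(m["line"]), m["msg"])


def context_before_first_error(lines: Iterable[str], window: int = 20) -> List[LogEntry]:
    """Return up to `window` entries preceding the first error, plus the error itself."""
    entries = [e for e in map(parse_line, lines) if e is not None]
    for i, entry in enumerate(entries):
        if entry.level == "error":
            return entries[max(0, i - window): i + 1]
    return []
```

Running `context_before_first_error(open("/var/log/libvirt/libvirtd.log"))` on the source host would show what libvirtd was doing just before `qemuMonitorIO` reported the EOF. Note that the excerpt above starts at the error itself, so the interesting context is likely in the earlier lines and in the per-domain qemu log, which this sketch does not read.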
Re: [libvirt] Network not Persistent
On 02/20/2012 06:43 PM, Laine Stump wrote:
> On 02/20/2012 12:37 PM, Shradha Shah wrote:
>> Hello All,
>>
>> I am currently working on patches for PCI-passthrough of SRIOV VFs and I am facing an issue with the network definition not being persistent.
>>
>> I am using a new forward mode = hostdev and the network xml is as follows:
>>
>>   <network>
>>     <name>pci-passthrough-network</name>
>>     <uuid>81ff0d90-c91e-6742-64da-4a736edb9a8f</uuid>
>>     <forward mode="hostdev">
>>       <pf dev="eth2"/>
>>     </forward>
>>   </network>
>>
>> Command line used is:
>>   # virsh net-define pci_passthrough_network.xml
>>
>> The network is defined successfully and I can use it at this point. But if I restart libvirt after defining the above network, I lose the network definition. After restart libvirt does not possess any knowledge of the above network.
>>
>> This does not happen when I use forward mode=bridge. I have cross referenced the function call path of networkDefine for both the forwarding modes and I can't seem to find the problem. Is there some function I am missing?
>
> You need to add some logic to networkFindActiveConfigs() to determine if the network is active.

I am not sure if I am missing something, but networkFindActiveConfigs() is called by networkStartup() only.

Currently I am not starting any networks but just defining them and then restarting libvirtd, at which point the pci-passthrough-network disappears.

The commands I am using are as follows:

  # virsh net-define pci_passthrough_network.xml
  Network pci-passthrough-network defined from pci_passthrough_network.xml

  # virsh net-define macvtap_bridge_network.xml
  Network macvtap-bridge-network defined from macvtap_bridge_network.xml

  # virsh net-list --all
  Name                      State      Autostart
  ----------------------------------------------
  macvtap-bridge-network    inactive   no
  pci-passthrough-network   inactive   no

  # service libvirtd restart
  Stopping libvirtd daemon: [ OK ]
  Starting libvirtd daemon: [ OK ]

  # virsh net-list --all
  Name                      State      Autostart
  ----------------------------------------------
  macvtap-bridge-network    inactive   no

I think it has something to do with the persistent flag of the network object, but I am not sure.

Thanks,
Shradha

> (actually this points out that networkFindActiveConfigs() doesn't have any code to determine if the macvtap network types are active. It turns out that there's really no effect to starting one of those types of network, but I should probably do *something* to allow a restarted libvirtd to determine if that type of network is started...)
Re: [libvirt] Network not Persistent
On 02/21/2012 03:59 PM, Laine Stump wrote:
> On 02/21/2012 07:33 AM, Shradha Shah wrote:
>> On 02/20/2012 06:43 PM, Laine Stump wrote:
>>> On 02/20/2012 12:37 PM, Shradha Shah wrote:
>>>> Hello All,
>>>>
>>>> I am currently working on patches for PCI-passthrough of SRIOV VFs and I am facing an issue with the network definition not being persistent.
>>>>
>>>> I am using a new forward mode = hostdev and the network xml is as follows:
>>>>
>>>>   <network>
>>>>     <name>pci-passthrough-network</name>
>>>>     <uuid>81ff0d90-c91e-6742-64da-4a736edb9a8f</uuid>
>>>>     <forward mode="hostdev">
>>>>       <pf dev="eth2"/>
>>>>     </forward>
>>>>   </network>
>>>>
>>>> Command line used is:
>>>>   # virsh net-define pci_passthrough_network.xml
>>>>
>>>> The network is defined successfully and I can use it at this point. But if I restart libvirt after defining the above network, I lose the network definition. After restart libvirt does not possess any knowledge of the above network.
>>>>
>>>> This does not happen when I use forward mode=bridge. I have cross referenced the function call path of networkDefine for both the forwarding modes and I can't seem to find the problem. Is there some function I am missing?
>>>
>>> You need to add some logic to networkFindActiveConfigs() to determine if the network is active.
>>
>> I am not sure if I am missing something, but networkFindActiveConfigs() is called by networkStartup() only.
>
> Right. And that is called every time libvirtd is restarted. I had read your post too quickly and assumed the problem was that the new network was no longer marked active after restarting libvirtd, in which case this is where you would want to look. But I see from your virsh net-list output that the real problem is that the network is no longer *defined* after a restart.
>
>> Currently I am not starting any networks but just defining them and then restarting libvirtd, at which point the pci-passthrough-network disappears.
>>
>> The commands I am using are as follows:
>>
>>   # virsh net-define pci_passthrough_network.xml
>>   Network pci-passthrough-network defined from pci_passthrough_network.xml
>>
>>   # virsh net-define macvtap_bridge_network.xml
>>   Network macvtap-bridge-network defined from macvtap_bridge_network.xml
>>
>>   # virsh net-list --all
>>   Name                      State      Autostart
>>   ----------------------------------------------
>>   macvtap-bridge-network    inactive   no
>>   pci-passthrough-network   inactive   no
>>
>>   # service libvirtd restart
>>   Stopping libvirtd daemon: [ OK ]
>>   Starting libvirtd daemon: [ OK ]
>>
>>   # virsh net-list --all
>>   Name                      State      Autostart
>>   ----------------------------------------------
>>   macvtap-bridge-network    inactive   no
>
> Before you restart libvirtd, is the xml file in place in /etc/libvirt/qemu/networks/pci-passthrough-network.xml? And is that file still there after libvirtd restarts? If the file is still there but the definition doesn't show up in net-list --all, perhaps there is something in the xml file that is failing the parse - any log messages in libvirtd?

This was indeed the case. Libvirt was failing to parse the forward dev on restart. Bit of a mistake in my code. Now resolved.

Many Thanks,
Regards,
Shradha

> Also, does it behave differently if you start the network before restarting libvirtd?
>
> Beyond this, it's really not possible to help much more without seeing the code you're working with.
>
>> I think it has something to do with the persistent flag of the network object, but I am not sure.
>
> Well, the persistent flag is set unconditionally by networkDefine for all types of networks.
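Laine's checklist (is the file there, does it still parse) can be partly automated. Below is a minimal sketch, assuming the persistent network definitions live in /etc/libvirt/qemu/networks as mentioned in the thread. It only checks that each file exists, is well-formed XML and has a `network` root element; it does not run libvirt's own parser, so a file that passes here can still be rejected by libvirtd, which is what happened in this case. The function name `check_network_configs` is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def check_network_configs(config_dir="/etc/libvirt/qemu/networks"):
    """Map each *.xml file name in config_dir to None (looks fine) or an error string."""
    results = {}
    for path in sorted(Path(config_dir).glob("*.xml")):
        try:
            root = ET.parse(str(path)).getroot()
        except ET.ParseError as exc:
            results[path.name] = f"not well-formed: {exc}"
            continue
        if root.tag != "network":
            results[path.name] = f"unexpected root element <{root.tag}>"
        else:
            results[path.name] = None
    return results


if __name__ == "__main__":
    for name, problem in check_network_configs().items():
        print(f"{name}: {problem or 'ok'}")
```

A network that is missing from `virsh net-list --all` but reported as `ok` here points at a semantic parse failure, in which case the libvirtd log remains the authoritative place to look.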
[libvirt] Network not Persistent
Hello All,

I am currently working on patches for PCI-passthrough of SRIOV VFs and I am facing an issue with the network definition not being persistent.

I am using a new forward mode = hostdev and the network xml is as follows:

  <network>
    <name>pci-passthrough-network</name>
    <uuid>81ff0d90-c91e-6742-64da-4a736edb9a8f</uuid>
    <forward mode="hostdev">
      <pf dev="eth2"/>
    </forward>
  </network>

Command line used is:
  # virsh net-define pci_passthrough_network.xml

The network is defined successfully and I can use it at this point. But if I restart libvirt after defining the above network, I lose the network definition. After restart libvirt does not possess any knowledge of the above network.

This does not happen when I use forward mode=bridge. I have cross referenced the function call path of networkDefine for both the forwarding modes and I can't seem to find the problem. Is there some function I am missing?

May I ask for your help to solve this issue?

Many Thanks,
Regards,
Shradha Shah
Re: [libvirt] RFC: PCI-Passthrough of SRIOV VF's new forward mode
Hello Laine,

Many Thanks for reviewing the RFC. Please find my reply inline.

On 02/07/2012 02:36 AM, Laine Stump wrote:
> On 02/06/2012 12:58 PM, Shradha Shah wrote:
>> RFC: New network forward type pci-passthrough-hybrid
>>
>> I saw a couple of posts regarding PCI-Passthrough usage of SRIOV VFs a couple of weeks ago (20th Jan 2012). Initially I was going to post this RFC along with a set of patches. I would require a few more days to clean my patches for submission and hence I would start with an RFC on a new method to manage PCI-Passthrough of SRIOV VFs.
>
> I'm working on something similar, but purely in the domain's device list first. My plan is that PCI passthrough interface devices will be defined as <interface type='hostdev'> (rather than in a <hostdev>), thus allowing config of all the network interface-related things that may be needed without polluting <hostdev> (and yet giving us an anchor where the guest-side PCI address can be fixed so that it remains the same across restarts of the guest).
>
> I discussed this in a later email last month:
>
>   https://www.redhat.com/archives/libvir-list/2012-January/msg00840.html
>
> Note that the first message is a proposal I made to use <hostdev> that was discarded, and we later arrived at:
>
>   <devices>
>     <interface type='hostdev'>

I was thinking more like <interface type='network'> and in the network xml <forward mode='pci-passthrough'/'hostdev'>, since I was thinking of adding a new mode to the existing enum virNetworkForwardType.

So currently the virNetworkForwardType has vepa, private, bridge, passthrough. I was thinking of adding:
1) pci-passthrough or hostdev (VF passthrough to the guest, no virtio interface in the guest, as suggested in your previous proposals)
2) pci-passthrough-hybrid or hostdev-hybrid (VF passthrough to the guest + virtio interface in the guest to support migration with maximum performance results)

>       <source dev='eth22'/>

I was thinking in terms of having the source dev mentioned in the network XML, which will avoid any problems we might face during migration. Having a <source dev='eth22'/> in the domain XML will mean that a similar device needs to be present on the destination host after migration, else migration would fail.

>       <mac address='00:16:3e:5d:c7:9e'/>
>       ...
>     </interface>
>   </devices>
>
> (see the first response from Paolo in the thread), in many ways returning to the proposal of last August.
>
> The above XML will set the MAC address of eth22, potentially associate a 802.1QbX port profile (if there is a <virtualport> element), decode eth22 into a PCI device, then attach that device to the guest. It will also be acceptable to specify the source (host side) address as a pci address rather than a net device name (for those cases when the VF isn't bound to a driver and thus has no net device name).

This sounds like a good idea when using a Solarflare network adapter, as Solarflare VFs do not have a net device name and operate with PCI addresses.

> My plan has been to first implement this in <interface>, and then add the <forward mode='hostdev'> support in <network> (which would make things especially nice with your new patches to auto-generate the list of devices in the pool by specifying just a PF).
>
> Since you have some code already done, maybe we should compare notes - so far I've been working more on rearranging the data structures to accommodate the dual identity of a device that needs to be <interface> for config purposes, but <hostdev> (plus extra functionality) for device attachment purposes.

The work that we are trying to achieve definitely follows the same path and it would indeed be a great idea to share notes and parts of the code between ourselves before we submit patches upstream.

>> Solarflare Ethernet card supports 127 VFs on each port. The MAC address of each unused VF is 00:00:00:00:00:00 by default. Hence the MAC address of the VF does not change on every reboot. There is no VF driver on the host. Each VF does not correspond to an Ethernet device. Instead, VFs are managed using the PCI sysfs files.
>
> It's interesting that you say each VF doesn't correspond to an ethernet device. Is it that it doesn't, or just doesn't have to (but might)? My limited experience with sriov hardware has been with an Intel 82576 card, which can operate in either fashion (if the igbvf driver is loaded and bound to the VFs, they have a network device name, otherwise they are visible only via the PF).

Solarflare do not provide a separate VF driver (like the ixgbevf); we provide only a PF driver (sfc), hence the VF doesn't correspond to an ethernet device.

>> With the pci-passthrough-hybrid model when the VF is passed into the guest, it appears in the guest as a PCI device and not as a network device. A virtual network device in the form of a virtio interface is also present in the guest. The virtio device in the guest comes from either bridging the physical
[libvirt] RFC: PCI-Passthrough of SRIOV VF's new forward mode
RFC: New network forward type pci-passthrough-hybrid

I saw a couple of posts regarding PCI-Passthrough usage of SRIOV VFs a couple of weeks ago (20th Jan 2012). Initially I was going to post this RFC along with a set of patches. I would require a few more days to clean my patches for submission and hence I would start with an RFC on a new method to manage PCI-Passthrough of SRIOV VFs.

I work for Solarflare Communications, who make 10G network adapters. We currently have SRIOV capable adapters available and in production and we would like to work with upstream libvirt to develop the required support for our hardware.

This RFC introduces a new network forward mode to libvirt called pci-passthrough-hybrid and provides a solution for migration with PCI-Passthrough as well as providing a significant increase in networking performance.

The Solarflare SRIOV driver architecture for KVM is explained in the Release notes which can be found here:

  https://support.solarflare.com/index.php?view=categories&id=1813&option=com_cognidox&Itemid=2

This is a working model and currently available to Solarflare Customers for evaluation. The hybrid model of the SRIOV driver provided by Solarflare currently achieves the highest SPECvirt performance in the market.

Solarflare Ethernet card supports 127 VFs on each port. The MAC address of each unused VF is 00:00:00:00:00:00 by default. Hence the MAC address of the VF does not change on every reboot. There is no VF driver on the host. Each VF does not correspond to an Ethernet device. Instead, VFs are managed using the PCI sysfs files.

With the pci-passthrough-hybrid model, when the VF is passed into the guest it appears in the guest as a PCI device and not as a network device. A virtual network device in the form of a virtio interface is also present in the guest. The virtio device in the guest comes from either bridging the physical network device or by creating a macvtap interface of type (vepa, private, bridge) on the physical network device. The virtio device and the VF bind together in the guest to create an accelerated and a non-accelerated path.

The new method I wish to propose uses implicit pci-passthrough, and there is no need to provide an explicit <hostdev> element in the domain xml. The hostdev would be added to the live xml as non-persistent, as suggested by Laine Stump in a previous post, link to which can be found at:

  https://www.redhat.com/archives/libvir-list/2011-August/msg00937.html

1) In order to support the above mentioned hybrid model, the requirement is that the VF needs to be assigned the same MAC address as the virtio device in the guest. This enables the VF and the virtio device to bind successfully using the Solarflare driver called XNAP. Effectively we do not need to extend the hostdev schema. This can be taken care of by the <interface> element. Along with the MAC address, the VLAN tags can also be taken care of by the <interface>/<network> elements.

2) The VF appears in the guest as a PCI device, hence the MAC address of the VF is stored in the sysfs files. Assigning the MAC address to the VF before or after pci passthrough is not an issue.

Proposed steps to support the hybrid model of pci-passthrough in libvirt:

1) <network> will have a new forward type='pci-passthrough-hybrid'. When forward type='pci-passthrough-hybrid', instead of a pool of Ethernet interfaces a <pf> element will need to be specified for implicit VF allocation, as shown in the example below:

  <network>
    <name>direct-network</name>
    <forward mode="pci-passthrough-hybrid">
      <pf dev="eth2"/>
    </forward>
  </network>

2) In the domain's <interface> definition, when type='network' and the network has forward type='pci-passthrough-hybrid', the domain code will request an unused VF from the physical device. Example:

  <interface type='network'>
    <source network='direct-network'/>
    <mac address='00:50:56:0f:86:3b'/>
    <model type='virtio'/>
    <actual type='direct'>
      <source mode='pci-passthrough-hybrid'/>
    </actual>
  </interface>

3) The code will then use the NodeDevice API to learn all the necessary PCI domain/slot/bus/function information.

4) Before starting the guest, the VF's PCI device name (:04:00.2) will be saved in <interface>/<actual> so that it can be easily retrieved if libvirtd is restarted.

5) While building the qemu command line, if a network device has forward mode='pci-passthrough-hybrid', the code will add a (non-persisting) <hostdev> element to the qemu command line. This hostdev will be marked as ephemeral before passing it to the guest. Ephemeral=transient.

6) During the process of network connection, the MAC address of the VF will be set according to the domain <interface> config. This step can also involve setting the VLAN tag, port profiles, etc.

7) Following the above steps, the guest will then start with implicit PCI-Passthrough of a SRIOV VF.

8) When the guest is eventually destroyed, the Ethernet device will be freed back to the network pool for use by another guest.
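The allocation lifecycle in the proposed steps (request an unused VF in step 2, hand it back in step 8) can be sketched as a small pool. This is only an illustration of the bookkeeping, not libvirt code: the class `VfPool`, its methods, the guest names and the PCI addresses used in the usage note are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VfPool:
    """Hypothetical pool of VF PCI addresses belonging to one physical function."""
    pf_dev: str                                            # e.g. "eth2" from <pf dev="eth2"/>
    free: List[str]                                        # VF PCI addresses not assigned to a guest
    in_use: Dict[str, str] = field(default_factory=dict)   # VF PCI address -> guest name

    def allocate(self, guest: str) -> Optional[str]:
        """Hand out an unused VF (step 2), or None if the pool is exhausted."""
        if not self.free:
            return None
        addr = self.free.pop(0)
        self.in_use[addr] = guest
        return addr

    def release(self, addr: str) -> None:
        """Return a VF to the pool when its guest is destroyed (step 8)."""
        if addr not in self.in_use:
            raise KeyError(f"{addr} is not allocated from {self.pf_dev}")
        del self.in_use[addr]
        self.free.append(addr)
```

For example, `pool = VfPool("eth2", ["0000:04:00.2", "0000:04:00.3"])` followed by `pool.allocate("guest1")` would hand out the first address, and `pool.release(...)` on guest shutdown makes it available again. A real implementation would also have to persist the chosen address (step 4) so that the mapping survives a libvirtd restart, which this sketch leaves out.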
[libvirt] Qemu text monitor
Hello All,

I am currently working with libvirt on RHEL6.2. There are a few points I have noticed regarding the QEMU monitor.

When I hotplug a device into the guest, the Qemu text monitor receives a device_add command and adds the device to its current devices list (I found this list via the qemu-monitor-command info pci).

When I hot-unplug the device from the guest, the Qemu text monitor receives the device_del command but does not remove the device from its current devices list (qemu-monitor-command info pci still shows the device).

The result is that the next hotplug of the device fails with the error "Duplicate id".

I was wondering if any of you have hit this issue before or know about any qemu-monitor bugs I may not be aware of?

Many Thanks in advance,
Regards,
Shradha Shah
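The reported sequence can be reproduced with a toy model of a device list keyed by id. This is purely a model of the symptom as described above, under the assumption that the entry is never dropped after device_del; it says nothing about how QEMU implements device removal. The class `ToyMonitor`, the id `hostdev0` and the driver string are hypothetical.

```python
class ToyMonitor:
    """Toy model of a device list keyed by id; not QEMU's implementation."""

    def __init__(self, del_removes_entry):
        self.devices = {}                      # device id -> driver name
        self.del_removes_entry = del_removes_entry

    def device_add(self, dev_id, driver):
        """Add a device, refusing an id that is still listed."""
        if dev_id in self.devices:
            raise ValueError(f"Duplicate id '{dev_id}'")
        self.devices[dev_id] = driver

    def device_del(self, dev_id):
        """Accept a removal request; only drop the entry if the model says so."""
        if self.del_removes_entry:
            self.devices.pop(dev_id, None)
        # Otherwise the request is accepted but the entry stays listed,
        # which mirrors what 'info pci' showed in the report.


stale = ToyMonitor(del_removes_entry=False)
stale.device_add("hostdev0", "example-driver")
stale.device_del("hostdev0")
try:
    stale.device_add("hostdev0", "example-driver")
except ValueError as exc:
    print(exc)  # the second hotplug is rejected because the id is still present
```

With `del_removes_entry=True` the same sequence succeeds, so in this model the question reduces to why the entry is still listed after device_del has been issued.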
[libvirt] Hidden symbols
Hello All,

When using systemtap to get the call chain of functions executed during the migration of a VM from one host to another, I realised that many of the symbols in libvirt are hidden and not visible to systemtap.

Is there a way I can make all the symbols visible just for debugging purposes?

Many Thanks,
Regards,
Shradha Shah