Re: [libvirt PATCH 5/5] qemu: Implement support for vDPA block devices
On 7/24/23 8:05 AM, Peter Krempa wrote: As I've promised a long time ago I gave your patches some testing in regards of cooperation with blockjobs and snapshots. Since the new version of the patches was not yet posted on the list I'm replying here including my observations from testing patches from your gitlab branch: On Tue, Jun 06, 2023 at 16:15:30 -0500, Jonathon Jongsma wrote: Requires recent qemu with support for the virtio-blk-vhost-vdpa device and the ability to pass a /dev/fdset/N path for the vdpa path (8.1.0) Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1900770 Since this is a feature addition the 'Fixes' keyword doesn't make sense. Use e.g. 'Resolves' instead. Additionally you're missing the DCO certification here. --- src/qemu/qemu_block.c | 20 -- src/qemu/qemu_domain.c | 25 src/qemu/qemu_validate.c | 44 +++--- tests/qemuxml2argvdata/disk-vhostvdpa.args | 35 + tests/qemuxml2argvdata/disk-vhostvdpa.xml | 21 +++ tests/qemuxml2argvtest.c | 2 + 6 files changed, 139 insertions(+), 8 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-vhostvdpa.args create mode 100644 tests/qemuxml2argvdata/disk-vhostvdpa.xml [...] diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 2f6b32e394..119e52a7d7 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -11157,6 +11157,28 @@ qemuDomainPrepareStorageSourceFDs(virStorageSource *src, } +static int +qemuDomainPrepareStorageSourceVDPA(virStorageSource *src, + qemuDomainObjPrivate *priv) +{ +qemuDomainStorageSourcePrivate *srcpriv = NULL; +virStorageType actualType = virStorageSourceGetActualType(src); +int vdpafd = -1; + +if (actualType != VIR_STORAGE_TYPE_VHOST_VDPA) +return 0; + +if ((vdpafd = qemuVDPAConnect(src->path)) < 0) +return -1; This function call directly touches the host filesystem, which is not supposed to be in the *DomainPrepareStorageSource* functions but we rather have a completely separate machinery in qemuProcessPrepareHostStorage. Unfortunately that one doesn't yet need to handle individual backing chain members though. This ensures that the code doesn't get accidentally called from tests even without mocking the code as the tests reimplement the functions differently for testing purposes. + +srcpriv = qemuDomainStorageSourcePrivateFetch(src); + +srcpriv->fdpass = qemuFDPassNew(src->nodestorage, priv); +qemuFDPassAddFD(srcpriv->fdpass, , "-vdpa"); +return 0; +} [...] diff --git a/src/qemu/qemu_validate.c b/src/qemu/qemu_validate.c index 9dce908cfe..67b0944162 100644 --- a/src/qemu/qemu_validate.c +++ b/src/qemu/qemu_validate.c @@ -3220,6 +3220,28 @@ qemuValidateDomainDeviceDefDiskTransient(const virDomainDiskDef *disk, } +static int +qemuValidateDomainDeviceDefDiskVhost(const virDomainDef *def, + virStorageType storagetype, + virQEMUCapsFlags flag, + virQEMUCaps *qemuCaps) +{ +const char *vhosttype = virStorageTypeToString(storagetype); + +if (!virQEMUCapsGet(qemuCaps, flag)) { +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("%1$s disk is not supported with this QEMU binary"), + vhosttype); +return -1; I'd prefer if both things this function does are duplicated inline below rather than passing it via arguments here. It makes it harder to follow. +} + +if (qemuValidateDomainDefVhostUserRequireSharedMemory(def, vhosttype) < 0) +return -1; + +return 0; +} In my testing of the code from the new branch I've observed that blockjobs and snapshot creation work well thanks to libblkio, so we don't have to add any additional checks or limitations. I'll still need to go ahead and finish the series removing the 'raw' driver when it's not necessary so that the fast-path, once implemented will be possible. Waiting for that is not necessary for this series as it works properly even with the 'raw' driver in place. With your new version of the patches I've noticed the following problems: - After converting to store the vdpa device path in src->vdpadev: - rejecting of the empty disk source doesn't work for vdpa. If you use in stead of the proper path, the XML will be defined but broken. - virStorageSourceCopy doesn't copy the new member (as instructed in the comment above the struct), thus block job code which uses this extensively to work on the inactive definition creates broken configurations. I've also noticed that using 'qcow2' format for the device doesn't work: error: internal error: process exited while connecting to monitor: 2023-07-24T12:54:15.818631Z qemu-system-x86_64: -blockdev
Re: [PATCH 0/2] libvirt-guests: small improvments
On 8/1/23 08:11, Martin Kletzander wrote: On Mon, Jul 31, 2023 at 05:06:44PM -0600, Jim Fehlig wrote: The first patch is trivial. I suppose the second is debatable. If I build libvirt with -Dremote_default_mode=legacy but deploy modular daemons, /run/libvirt/libvirt-sock is provided by virtproxyd, which may or may not be running when libvirt-guests starts/stops. I added an 'After=virtproxyd.socket' ordering dependency to libvirt-guests, but it hasn't fixed an issue I'm seeing when using libvirt-guests+virtproxyd libvirt-guests.sh[2607]: Can't connect to default. Skipping In this case the connection should use the legacy mode according to the code. Nod. I think you are running into the same thing as we were in https://bugzilla.redhat.com/show_bug.cgi?id=1964855 and you want to have virtproxyd.service instead of virtproxyd.socket there. It's complicated, but look at my commit 59d30adacd1d and comment #7 in the above mentioned bugzilla. Yes, I see the same problem. Thanks for the pointer! But unless I have something misconfigured, the suggestions you made to the daemon docs don't work for me. My setup is a recent openSUSE Tumbleweed with systemd 253.7 and libvirt 9.6.0 built with -Dremote_default_mode=direct. The unit section of /etc/systemd/system/libvirt-guests.service contains [Unit] Description=Suspend/Resume Running libvirt Guests Requires=virt-guest-shutdown.target After=network.target After=time-sync.target After=virtqemud.service After=virt-guest-shutdown.target 'systemctl daemon-reload' executed, virtqemud.service and socket are enabled, and the socket is active. With the service also active, 'systemctl stop libvirt-guests.service' works fine. If virtqemud.service is not running, 'systemctl stop libvirt-guests.service' hangs as described in the bug. Killing the hung libvirt-guests.sh allows the stop job to complete and start job for virtqemud.service to begin. So for me it seems the bug still exists when using the default URI. In any case I think in such a scenario you want the libvirt-guests to connect to the particular daemon. That's the reason I did not modify the virtproxyd in the commit and the reason the socket is not in the service file. Indeed, if I set 'uri_default = "qemu:///system"' in /etc/libvirt/libvirt.conf, then 'systemctl stop libvirt-guests.service' succeeds even when virtqemud.service is not active. Regards, Jim
Re: [PATCH] Fix some typos in documentation and comments
30.07.2023 21:03, Stefan Weil via wrote: Signed-off-by: Stefan Weil --- This patch was triggered by a spelling check for the generated QEMU documentation using codespell. It does not try to fix all typos which still exist in the QEMU code, but has a focus on those required to fix the documentation. Nevertheless some code comments with the same typos were fixed, too. I think the patch is trivial, so maybe it can still be included in the upcoming release, but that's not strictly necessary. Stefan docs/about/deprecated.rst| 2 +- docs/devel/qom.rst | 2 +- docs/system/devices/nvme.rst | 2 +- hw/core/loader.c | 4 ++-- include/exec/memory.h| 2 +- ui/vnc-enc-tight.c | 2 +- 6 files changed, 7 insertions(+), 7 deletions(-) diff --git a/docs/about/deprecated.rst b/docs/about/deprecated.rst index 1c35f55666..92a2bafd2b 100644 --- a/docs/about/deprecated.rst +++ b/docs/about/deprecated.rst @@ -369,7 +369,7 @@ mapping permissions et al by using its 'mapped' security model option. Nowadays it would make sense to reimplement the ``proxy`` backend by using QEMU's ``vhost`` feature, which would eliminate the high latency costs under which the 9p ``proxy`` backend currently suffers. However as of to date nobody -has indicated plans for such kind of reimplemention unfortunately. +has indicated plans for such kind of reimplementation unfortunately. FWIW, all these changes has been included in my "tree-wide spelling" series. This particular change: https://lists.nongnu.org/archive/html/qemu-devel/2023-07/msg03011.html /mjt
Re: [libvirt PATCH] src: fix max file limits in systemd services
On Wed, Aug 02, 2023 at 10:05:38AM +0100, Daniel P. Berrangé wrote: This fixes commit 38abf9c34dc481b0dc923bdab446ee623bdc5ab6 Author: Daniel P. Berrangé Date: Wed Jun 21 13:22:40 2023 +0100 src: set max open file limit to match systemd >= 240 defaults Pity we can't drop it because of systemd 239 still being in distros supported by us. The bug referenced in that commit had suggested to set LimitNOFile=512000:1024 on the basis that matches current systemd default behaviour and is compatible with old systemd. That was good except * The setting is LimitNOFILE and these are case sensitive * The hard and soft limits were inverted - soft must come first and so it would have been ignored even if the setting name was correct. * The default hard limit is 524288 not 512000 Reported-by: Olaf Hering Signed-off-by: Daniel P. Berrangé Reviewed-by: Martin Kletzander signature.asc Description: PGP signature
Re: [libvirt PATCH v2 19/24] qemu_snapshot: remove revertdisks when creating new snapshot
On Tue, Jun 27, 2023 at 17:07:22 +0200, Pavel Hrdina wrote: > When user creates a new snapshot after reverting to non-leaf snapshot we > no longer need to store the temporary overlays as they will be part of > the VM XMLs stored in the newly created snapshot. > > Signed-off-by: Pavel Hrdina > --- > src/qemu/qemu_snapshot.c | 23 ++- > 1 file changed, 22 insertions(+), 1 deletion(-) > > diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c > index a206f015c4..2950ad7d77 100644 > --- a/src/qemu/qemu_snapshot.c > +++ b/src/qemu/qemu_snapshot.c > @@ -1652,6 +1652,23 @@ qemuSnapshotCreateWriteMetadata(virDomainObj *vm, > } > > > +static void > +qemuSnapshotClearRevertdisks(virDomainMomentObj *current) > +{ > +virDomainSnapshotDef *curdef = NULL; > + > +if (!current) > +return; > + > +curdef = virDomainSnapshotObjGetDef(current); > + > +if (curdef->revertdisks) { > +g_clear_pointer(>revertdisks, g_free); At least in one instance the structure's 'name' and 'src' fields are filled with allocated pointers, so this looks like it will leak memory. I also didn't find code which frees the 'reverdisks' field when the snapshot definition is being freed.
Re: [PATCH] daemon: Treat logging of VIR_ERR_MULTIPLE_INTERFACES same as VIR_ERR_NO_INTERFACE
On Wed, Aug 02, 2023 at 10:11:24AM +0200, Peter Krempa wrote: When a query for an interface via virInterfaceLookupByMACString finds multiple interfaces an error is returned. Treat such error with the same 'debug' priority as we treat when the interface was not found to avoid spamming logs with such configurations. Closes: https://gitlab.com/libvirt/libvirt/-/issues/514 Signed-off-by: Peter Krempa Reviewed-by: Martin Kletzander --- src/remote/remote_daemon.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/remote/remote_daemon.c b/src/remote/remote_daemon.c index d880711c91..d4d999e53a 100644 --- a/src/remote/remote_daemon.c +++ b/src/remote/remote_daemon.c @@ -97,6 +97,7 @@ static int daemonErrorLogFilter(virErrorPtr err, int priority) case VIR_ERR_NO_STORAGE_VOL: case VIR_ERR_NO_NODE_DEVICE: case VIR_ERR_NO_INTERFACE: +case VIR_ERR_MULTIPLE_INTERFACES: case VIR_ERR_NO_NWFILTER: case VIR_ERR_NO_NWFILTER_BINDING: case VIR_ERR_NO_SECRET: -- 2.41.0 signature.asc Description: PGP signature
Re: [libvirt PATCH v2 17/24] qemu_snapshot: add support to delete external snapshot without block commit
On Tue, Jun 27, 2023 at 17:07:20 +0200, Pavel Hrdina wrote: > When block commit is not needed we can just simply unlink the > disk files. > > Signed-off-by: Pavel Hrdina > --- > src/qemu/qemu_snapshot.c | 56 ++-- > 1 file changed, 36 insertions(+), 20 deletions(-) Reviewed-by: Peter Krempa
Re: [PATCH 0/3] lxc: Fix reporting of startup errors with debug logging enabled
On Wed, Aug 02, 2023 at 09:29:39AM +0200, Peter Krempa wrote: Few issues with propagating error to the user when debug logging is enabled, Peter Krempa (3): virLXCControllerSetupUsernsMap: Modify debug logging for clean startup errors virLXCProcessReadLogOutputData: Refill buffer after filtering out noise virLXCProcessReportStartupLogError: Strip trailing newline from error src/lxc/lxc_controller.c | 8 +--- src/lxc/lxc_process.c| 12 2 files changed, 17 insertions(+), 3 deletions(-) Reviewed-by: Martin Kletzander -- 2.41.0 signature.asc Description: PGP signature
Re: [PATCH 3/5] conf: add support for max_unmap_size
On Wed, Aug 02, 2023 at 13:47:16 +0200, Kristina Hanicova wrote: > Signed-off-by: Kristina Hanicova > --- > docs/formatdomain.rst | 10 ++ > src/conf/domain_conf.c| 12 +++- > src/conf/domain_conf.h| 1 + > src/conf/domain_validate.c| 3 ++- > src/conf/schemas/domaincommon.rng | 5 + > 5 files changed, 29 insertions(+), 2 deletions(-) > > diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst > index 0d0812f08c..1fe93066bd 100644 > --- a/docs/formatdomain.rst > +++ b/docs/formatdomain.rst > @@ -2591,6 +2591,13 @@ paravirtualized driver is specified via the ``disk`` > element. > discard_granularity='4096'/> > > > + > + > + > + > + > + > + > > > > @@ -3439,6 +3446,9 @@ paravirtualized driver is specified via the ``disk`` > element. >The smallest amount of data that can be discarded in a single > operation. >It impacts the unmap operations and it must be a multiple of a >``logical_block_size``. > + ``max_unmap_size`` > + The maximum size of an unmap operation that can be performed for scsi > + disks. Does this imply that it works only for SCSI disks? The code isn't enforcing that anywhere
Re: [PATCH 4/5] qemu: Introduce QEMU_CAPS_SCSI_DISK_MAX_UNMAP_SIZE capability
On Wed, Aug 02, 2023 at 13:47:17 +0200, Kristina Hanicova wrote: > Signed-off-by: Kristina Hanicova > --- This capability is not actually used after this series. Additionally all qemu versions seem to support it.
Re: [PATCH 2/5] qemu: add support for discard_granularity
On Wed, Aug 02, 2023 at 13:47:15 +0200, Kristina Hanicova wrote: > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1849570 > > Signed-off-by: Kristina Hanicova > --- > src/qemu/qemu_command.c| 2 ++ > src/vz/vz_utils.c | 3 ++- > tests/qemuxml2argvdata/disk-blockio.x86_64-latest.args | 2 +- > tests/qemuxml2argvdata/disk-blockio.xml| 2 +- > 4 files changed, 6 insertions(+), 3 deletions(-) > > diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c > index 64af0b5ea9..23810bc067 100644 > --- a/src/qemu/qemu_command.c > +++ b/src/qemu/qemu_command.c > @@ -1760,6 +1760,7 @@ qemuBuildDiskDeviceProps(const virDomainDef *def, > unsigned int bootindex = 0; > unsigned int logical_block_size = disk->blockio.logical_block_size; > unsigned int physical_block_size = disk->blockio.physical_block_size; > +unsigned int discard_granularity = disk->blockio.discard_granularity; > g_autoptr(virJSONValue) wwn = NULL; > g_autofree char *serial = NULL; > virTristateSwitch removable = VIR_TRISTATE_SWITCH_ABSENT; > @@ -1939,6 +1940,7 @@ qemuBuildDiskDeviceProps(const virDomainDef *def, >"p:bootindex", bootindex, >"p:logical_block_size", logical_block_size, >"p:physical_block_size", physical_block_size, > + "p:discard_granularity", discard_granularity, This is a device frontend property, so you'll also need to add it to the ABI stability check and make sure it doesn't differ between cases when same ABI is required. See virDomainDiskDefCheckABIStability Note that logical_block_size and physical_block_size ought to have the same treatment.
[PATCH 4/5] qemu: Introduce QEMU_CAPS_SCSI_DISK_MAX_UNMAP_SIZE capability
Signed-off-by: Kristina Hanicova --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_4.2.0_aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0_ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0_s390x.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0_aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0_ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0_riscv64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.1.0_sparc.xml | 1 + tests/qemucapabilitiesdata/caps_5.1.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0_aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0_ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0_riscv64.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0_s390x.xml | 1 + tests/qemucapabilitiesdata/caps_5.2.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_6.0.0_aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_6.0.0_s390x.xml | 1 + tests/qemucapabilitiesdata/caps_6.0.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_6.1.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_6.2.0_aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_6.2.0_ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_6.2.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0_aarch64+hvf.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0_aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0_ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_7.0.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_7.1.0_ppc64.xml | 1 + tests/qemucapabilitiesdata/caps_7.1.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_7.2.0_ppc.xml | 1 + tests/qemucapabilitiesdata/caps_7.2.0_x86_64+hvf.xml | 1 + tests/qemucapabilitiesdata/caps_7.2.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_8.0.0_riscv64.xml | 1 + tests/qemucapabilitiesdata/caps_8.0.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_8.1.0_s390x.xml | 1 + tests/qemucapabilitiesdata/caps_8.1.0_x86_64.xml | 1 + 37 files changed, 38 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index f80bdb579d..5f57a5c028 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -697,6 +697,7 @@ VIR_ENUM_IMPL(virQEMUCaps, /* 450 */ "run-with.async-teardown", /* QEMU_CAPS_RUN_WITH_ASYNC_TEARDOWN */ + "scsi-disk.max_unmap_size", /* QEMU_CAPS_SCSI_DISK_MAX_UNMAP_SIZE */ ); @@ -1453,6 +1454,7 @@ static struct virQEMUCapsDevicePropsFlags virQEMUCapsDevicePropsVfioPCI[] = { static struct virQEMUCapsDevicePropsFlags virQEMUCapsDevicePropsSCSIDisk[] = { { "channel", QEMU_CAPS_SCSI_DISK_CHANNEL, NULL }, { "rotation_rate", QEMU_CAPS_ROTATION_RATE, NULL }, +{ "max_unmap_size", QEMU_CAPS_SCSI_DISK_MAX_UNMAP_SIZE, NULL }, }; static struct virQEMUCapsDevicePropsFlags virQEMUCapsDevicePropsIDEDrive[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index c72f73a161..c7f2ea7a45 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -676,6 +676,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */ /* 450 */ QEMU_CAPS_RUN_WITH_ASYNC_TEARDOWN, /* asynchronous teardown -run-with async-teardown=on|off */ +QEMU_CAPS_SCSI_DISK_MAX_UNMAP_SIZE, /* scsi-hd.max_unmap_size */ QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_4.2.0_aarch64.xml b/tests/qemucapabilitiesdata/caps_4.2.0_aarch64.xml index 0763809bd1..6f39c276cc 100644 --- a/tests/qemucapabilitiesdata/caps_4.2.0_aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_4.2.0_aarch64.xml @@ -108,6 +108,7 @@ + 4002000 61700242 v4.1.0-2221-g36609b4fa3 diff --git a/tests/qemucapabilitiesdata/caps_4.2.0_ppc64.xml b/tests/qemucapabilitiesdata/caps_4.2.0_ppc64.xml index d596bae8d1..f52a075a1e 100644 --- a/tests/qemucapabilitiesdata/caps_4.2.0_ppc64.xml +++ b/tests/qemucapabilitiesdata/caps_4.2.0_ppc64.xml @@ -107,6 +107,7 @@ + 4002000 42900242 v4.1.0-2198-g9e583f2 diff --git a/tests/qemucapabilitiesdata/caps_4.2.0_s390x.xml b/tests/qemucapabilitiesdata/caps_4.2.0_s390x.xml index 8ee177f860..25d42d3268 100644 --- a/tests/qemucapabilitiesdata/caps_4.2.0_s390x.xml +++ b/tests/qemucapabilitiesdata/caps_4.2.0_s390x.xml @@ -74,6 +74,7 @@ + 4002000 39100242 qemu-4.2.0-20200115.0.1e4aa2da.fc31 diff --git a/tests/qemucapabilitiesdata/caps_4.2.0_x86_64.xml b/tests/qemucapabilitiesdata/caps_4.2.0_x86_64.xml index 6a78466b0f..fab5ea0d55 100644 ---
[PATCH 1/5] conf: add support for discard_granularity
Signed-off-by: Kristina Hanicova --- docs/formatdomain.rst | 6 +- src/conf/domain_conf.c| 12 +++- src/conf/domain_conf.h| 1 + src/conf/domain_validate.c| 3 ++- src/conf/schemas/domaincommon.rng | 5 + src/qemu/qemu_domain.c| 2 ++ 6 files changed, 26 insertions(+), 3 deletions(-) diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index cd9cb02bf8..0d0812f08c 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -2588,7 +2588,7 @@ paravirtualized driver is specified via the ``disk`` element. - + @@ -3435,6 +3435,10 @@ paravirtualized driver is specified via the ``disk`` element. this would be the value returned by the BLKPBSZGET ioctl and describes the disk's hardware sector size which can be relevant for the alignment of disk data. + ``discard_granularity`` + The smallest amount of data that can be discarded in a single operation. + It impacts the unmap operations and it must be a multiple of a + ``logical_block_size``. Filesystems ~~~ diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 5ac5c0b771..950c9049ba 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -8055,6 +8055,10 @@ virDomainDiskDefParseXML(virDomainXMLOption *xmlopt, if (virXMLPropUInt(blockioNode, "physical_block_size", 10, VIR_XML_PROP_NONE, >blockio.physical_block_size) < 0) return NULL; + +if (virXMLPropUInt(blockioNode, "discard_granularity", 10, VIR_XML_PROP_NONE, + >blockio.discard_granularity) < 0) +return NULL; } if ((driverNode = virXPathNode("./driver", ctxt))) { @@ -22035,7 +22039,8 @@ virDomainDiskBlockIoDefFormat(virBuffer *buf, virDomainDiskDef *def) { if (def->blockio.logical_block_size > 0 || -def->blockio.physical_block_size > 0) { +def->blockio.physical_block_size > 0 || +def->blockio.discard_granularity > 0) { virBufferAddLit(buf, "blockio.logical_block_size > 0) { virBufferAsprintf(buf, @@ -22047,6 +22052,11 @@ virDomainDiskBlockIoDefFormat(virBuffer *buf, " physical_block_size='%u'", def->blockio.physical_block_size); } +if (def->blockio.discard_granularity > 0) { +virBufferAsprintf(buf, + " discard_granularity='%u'", + def->blockio.discard_granularity); +} virBufferAddLit(buf, "/>\n"); } } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index c857ba556f..1621876a21 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -579,6 +579,7 @@ struct _virDomainDiskDef { struct { unsigned int logical_block_size; unsigned int physical_block_size; +unsigned int discard_granularity; } blockio; virDomainBlockIoTuneInfo blkdeviotune; diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index ad383b604e..7e00b6451a 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -466,7 +466,8 @@ virDomainDiskVhostUserValidate(const virDomainDiskDef *disk) } if (disk->blockio.logical_block_size > 0 || -disk->blockio.physical_block_size > 0) { +disk->blockio.physical_block_size > 0 || +disk->blockio.discard_granularity > 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("blockio is not supported with vhostuser disk")); return -1; diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index c2f56b0490..ee9c408a21 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -2476,6 +2476,11 @@ + + + + +
[PATCH 2/5] qemu: add support for discard_granularity
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1849570 Signed-off-by: Kristina Hanicova --- src/qemu/qemu_command.c| 2 ++ src/vz/vz_utils.c | 3 ++- tests/qemuxml2argvdata/disk-blockio.x86_64-latest.args | 2 +- tests/qemuxml2argvdata/disk-blockio.xml| 2 +- 4 files changed, 6 insertions(+), 3 deletions(-) diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 64af0b5ea9..23810bc067 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1760,6 +1760,7 @@ qemuBuildDiskDeviceProps(const virDomainDef *def, unsigned int bootindex = 0; unsigned int logical_block_size = disk->blockio.logical_block_size; unsigned int physical_block_size = disk->blockio.physical_block_size; +unsigned int discard_granularity = disk->blockio.discard_granularity; g_autoptr(virJSONValue) wwn = NULL; g_autofree char *serial = NULL; virTristateSwitch removable = VIR_TRISTATE_SWITCH_ABSENT; @@ -1939,6 +1940,7 @@ qemuBuildDiskDeviceProps(const virDomainDef *def, "p:bootindex", bootindex, "p:logical_block_size", logical_block_size, "p:physical_block_size", physical_block_size, + "p:discard_granularity", discard_granularity, "A:wwn", , "p:rotation_rate", disk->rotation_rate, "S:vendor", disk->vendor, diff --git a/src/vz/vz_utils.c b/src/vz/vz_utils.c index 7db7dbd419..de707df883 100644 --- a/src/vz/vz_utils.c +++ b/src/vz/vz_utils.c @@ -279,7 +279,8 @@ vzCheckDiskUnsupportedParams(virDomainDiskDef *disk) } if (disk->blockio.logical_block_size || -disk->blockio.physical_block_size) { +disk->blockio.physical_block_size || +disk->blockio.discard_granularity) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Setting disk block sizes is not " "supported by vz driver.")); diff --git a/tests/qemuxml2argvdata/disk-blockio.x86_64-latest.args b/tests/qemuxml2argvdata/disk-blockio.x86_64-latest.args index 7270613573..15f31ae60d 100644 --- a/tests/qemuxml2argvdata/disk-blockio.x86_64-latest.args +++ b/tests/qemuxml2argvdata/disk-blockio.x86_64-latest.args @@ -32,7 +32,7 @@ XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1/.config \ -device '{"driver":"ide-cd","bus":"ide.0","unit":1,"drive":"libvirt-2-format","id":"ide0-0-1"}' \ -blockdev '{"driver":"file","filename":"/tmp/idedisk.img","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ -blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"raw","file":"libvirt-1-storage"}' \ --device '{"driver":"ide-hd","bus":"ide.0","unit":2,"drive":"libvirt-1-format","id":"ide0-0-2","bootindex":1,"logical_block_size":512,"physical_block_size":512}' \ +-device '{"driver":"ide-hd","bus":"ide.0","unit":2,"drive":"libvirt-1-format","id":"ide0-0-2","bootindex":1,"logical_block_size":512,"physical_block_size":512,"discard_granularity":4096}' \ -audiodev '{"id":"audio1","driver":"none"}' \ -device '{"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.0","addr":"0x2"}' \ -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \ diff --git a/tests/qemuxml2argvdata/disk-blockio.xml b/tests/qemuxml2argvdata/disk-blockio.xml index 170728371f..84943719d4 100644 --- a/tests/qemuxml2argvdata/disk-blockio.xml +++ b/tests/qemuxml2argvdata/disk-blockio.xml @@ -23,7 +23,7 @@ - + -- 2.41.0
[PATCH 3/5] conf: add support for max_unmap_size
Signed-off-by: Kristina Hanicova --- docs/formatdomain.rst | 10 ++ src/conf/domain_conf.c| 12 +++- src/conf/domain_conf.h| 1 + src/conf/domain_validate.c| 3 ++- src/conf/schemas/domaincommon.rng | 5 + 5 files changed, 29 insertions(+), 2 deletions(-) diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 0d0812f08c..1fe93066bd 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -2591,6 +2591,13 @@ paravirtualized driver is specified via the ``disk`` element. + + + + + + + @@ -3439,6 +3446,9 @@ paravirtualized driver is specified via the ``disk`` element. The smallest amount of data that can be discarded in a single operation. It impacts the unmap operations and it must be a multiple of a ``logical_block_size``. + ``max_unmap_size`` + The maximum size of an unmap operation that can be performed for scsi + disks. Filesystems ~~~ diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 950c9049ba..59eb9fb0c0 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -8059,6 +8059,10 @@ virDomainDiskDefParseXML(virDomainXMLOption *xmlopt, if (virXMLPropUInt(blockioNode, "discard_granularity", 10, VIR_XML_PROP_NONE, >blockio.discard_granularity) < 0) return NULL; + +if (virXMLPropUInt(blockioNode, "max_unmap_size", 10, VIR_XML_PROP_NONE, + >blockio.max_unmap_size) < 0) +return NULL; } if ((driverNode = virXPathNode("./driver", ctxt))) { @@ -22040,7 +22044,8 @@ virDomainDiskBlockIoDefFormat(virBuffer *buf, { if (def->blockio.logical_block_size > 0 || def->blockio.physical_block_size > 0 || -def->blockio.discard_granularity > 0) { +def->blockio.discard_granularity > 0 || +def->blockio.max_unmap_size > 0) { virBufferAddLit(buf, "blockio.logical_block_size > 0) { virBufferAsprintf(buf, @@ -22057,6 +22062,11 @@ virDomainDiskBlockIoDefFormat(virBuffer *buf, " discard_granularity='%u'", def->blockio.discard_granularity); } +if (def->blockio.max_unmap_size > 0) { +virBufferAsprintf(buf, + " max_unmap_size='%u'", + def->blockio.max_unmap_size); +} virBufferAddLit(buf, "/>\n"); } } diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 1621876a21..26f5fdbd92 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -580,6 +580,7 @@ struct _virDomainDiskDef { unsigned int logical_block_size; unsigned int physical_block_size; unsigned int discard_granularity; +unsigned int max_unmap_size; } blockio; virDomainBlockIoTuneInfo blkdeviotune; diff --git a/src/conf/domain_validate.c b/src/conf/domain_validate.c index 7e00b6451a..db2dd01985 100644 --- a/src/conf/domain_validate.c +++ b/src/conf/domain_validate.c @@ -467,7 +467,8 @@ virDomainDiskVhostUserValidate(const virDomainDiskDef *disk) if (disk->blockio.logical_block_size > 0 || disk->blockio.physical_block_size > 0 || -disk->blockio.discard_granularity > 0) { +disk->blockio.discard_granularity > 0 || +disk->blockio.max_unmap_size > 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("blockio is not supported with vhostuser disk")); return -1; diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index ee9c408a21..31897342f4 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -2481,6 +2481,11 @@ + + + + +
[PATCH 5/5] qemu: add support for max_unmap_size
Signed-off-by: Kristina Hanicova --- src/qemu/qemu_command.c | 2 + src/qemu/qemu_domain.c| 2 + src/vz/vz_utils.c | 3 +- ...csi-disk-max_unmap_size.x86_64-latest.args | 37 +++ .../disk-scsi-disk-max_unmap_size.xml | 28 ++ tests/qemuxml2argvtest.c | 1 + 6 files changed, 72 insertions(+), 1 deletion(-) create mode 100644 tests/qemuxml2argvdata/disk-scsi-disk-max_unmap_size.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/disk-scsi-disk-max_unmap_size.xml diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 23810bc067..266a12f1b8 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -1761,6 +1761,7 @@ qemuBuildDiskDeviceProps(const virDomainDef *def, unsigned int logical_block_size = disk->blockio.logical_block_size; unsigned int physical_block_size = disk->blockio.physical_block_size; unsigned int discard_granularity = disk->blockio.discard_granularity; +unsigned int max_unmap_size = disk->blockio.max_unmap_size; g_autoptr(virJSONValue) wwn = NULL; g_autofree char *serial = NULL; virTristateSwitch removable = VIR_TRISTATE_SWITCH_ABSENT; @@ -1941,6 +1942,7 @@ qemuBuildDiskDeviceProps(const virDomainDef *def, "p:logical_block_size", logical_block_size, "p:physical_block_size", physical_block_size, "p:discard_granularity", discard_granularity, + "p:max_unmap_size", max_unmap_size, "A:wwn", , "p:rotation_rate", disk->rotation_rate, "S:vendor", disk->vendor, diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 3230bc281d..447b77dc56 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8389,6 +8389,8 @@ qemuDomainDiskChangeSupported(virDomainDiskDef *disk, "blockio physical_block_size", false); CHECK_EQ(blockio.discard_granularity, "blockio discard_granularity", false); +CHECK_EQ(blockio.max_unmap_size, + "blockio max_unmap_size", false); CHECK_EQ(blkdeviotune.total_bytes_sec, "blkdeviotune total_bytes_sec", diff --git a/src/vz/vz_utils.c b/src/vz/vz_utils.c index de707df883..54d67e2ae5 100644 --- a/src/vz/vz_utils.c +++ b/src/vz/vz_utils.c @@ -280,7 +280,8 @@ vzCheckDiskUnsupportedParams(virDomainDiskDef *disk) if (disk->blockio.logical_block_size || disk->blockio.physical_block_size || -disk->blockio.discard_granularity) { +disk->blockio.discard_granularity || +disk->blockio.max_unmap_size) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("Setting disk block sizes is not " "supported by vz driver.")); diff --git a/tests/qemuxml2argvdata/disk-scsi-disk-max_unmap_size.x86_64-latest.args b/tests/qemuxml2argvdata/disk-scsi-disk-max_unmap_size.x86_64-latest.args new file mode 100644 index 00..4616b3c26e --- /dev/null +++ b/tests/qemuxml2argvdata/disk-scsi-disk-max_unmap_size.x86_64-latest.args @@ -0,0 +1,37 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1 \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1/.local/share \ +XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1/.cache \ +XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain--1-QEMUGuest1/.config \ +/usr/bin/qemu-system-x86_64 \ +-name guest=QEMUGuest1,debug-threads=on \ +-S \ +-object '{"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain--1-QEMUGuest1/master-key.aes"}' \ +-machine pc,usb=off,dump-guest-core=off,memory-backend=pc.ram,acpi=off \ +-accel tcg \ +-cpu qemu64 \ +-m size=219136k \ +-object '{"qom-type":"memory-backend-ram","id":"pc.ram","size":224395264}' \ +-overcommit mem-lock=off \ +-smp 1,sockets=1,cores=1,threads=1 \ +-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=1729,server=on,wait=off \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-boot strict=on \ +-device '{"driver":"piix3-usb-uhci","id":"usb","bus":"pci.0","addr":"0x1.0x2"}' \ +-device '{"driver":"virtio-scsi-pci","id":"scsi0","bus":"pci.0","addr":"0x2"}' \ +-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/rhelik.raw","node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"raw","file":"libvirt-1-storage"}' \ +-device '{"driver":"scsi-hd","bus":"scsi0.0","channel":0,"scsi-id":4,"lun":0,"device_id":"drive-scsi0-0-4-0","drive":"libvirt-1-format","id":"scsi0-0-4-0","bootindex":1,"max_unmap_size":1073741824}' \
[PATCH 0/5] introduce support for block device properties
I noticed this on the list after the 1849570 bug on discard granularity: https://listman.redhat.com/archives/libvir-list/2020-June/203862.html so I decided to add maximum unmap size as well. The series is heavily based on Lin's patches and I would be glad if there is any way to add Lin into commit messages as well. Kristina Hanicova (5): conf: add support for discard_granularity qemu: add support for discard_granularity conf: add support for max_unmap_size qemu: Introduce QEMU_CAPS_SCSI_DISK_MAX_UNMAP_SIZE capability qemu: add support for max_unmap_size docs/formatdomain.rst | 16 +++- src/conf/domain_conf.c| 22 ++- src/conf/domain_conf.h| 2 + src/conf/domain_validate.c| 4 +- src/conf/schemas/domaincommon.rng | 10 + src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 4 ++ src/qemu/qemu_domain.c| 4 ++ src/vz/vz_utils.c | 4 +- .../caps_4.2.0_aarch64.xml| 1 + .../qemucapabilitiesdata/caps_4.2.0_ppc64.xml | 1 + .../qemucapabilitiesdata/caps_4.2.0_s390x.xml | 1 + .../caps_4.2.0_x86_64.xml | 1 + .../caps_5.0.0_aarch64.xml| 1 + .../qemucapabilitiesdata/caps_5.0.0_ppc64.xml | 1 + .../caps_5.0.0_riscv64.xml| 1 + .../caps_5.0.0_x86_64.xml | 1 + .../qemucapabilitiesdata/caps_5.1.0_sparc.xml | 1 + .../caps_5.1.0_x86_64.xml | 1 + .../caps_5.2.0_aarch64.xml| 1 + .../qemucapabilitiesdata/caps_5.2.0_ppc64.xml | 1 + .../caps_5.2.0_riscv64.xml| 1 + .../qemucapabilitiesdata/caps_5.2.0_s390x.xml | 1 + .../caps_5.2.0_x86_64.xml | 1 + .../caps_6.0.0_aarch64.xml| 1 + .../qemucapabilitiesdata/caps_6.0.0_s390x.xml | 1 + .../caps_6.0.0_x86_64.xml | 1 + .../caps_6.1.0_x86_64.xml | 1 + .../caps_6.2.0_aarch64.xml| 1 + .../qemucapabilitiesdata/caps_6.2.0_ppc64.xml | 1 + .../caps_6.2.0_x86_64.xml | 1 + .../caps_7.0.0_aarch64+hvf.xml| 1 + .../caps_7.0.0_aarch64.xml| 1 + .../qemucapabilitiesdata/caps_7.0.0_ppc64.xml | 1 + .../caps_7.0.0_x86_64.xml | 1 + .../qemucapabilitiesdata/caps_7.1.0_ppc64.xml | 1 + .../caps_7.1.0_x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_7.2.0_ppc.xml | 1 + .../caps_7.2.0_x86_64+hvf.xml | 1 + .../caps_7.2.0_x86_64.xml | 1 + .../caps_8.0.0_riscv64.xml| 1 + .../caps_8.0.0_x86_64.xml | 1 + .../qemucapabilitiesdata/caps_8.1.0_s390x.xml | 1 + .../caps_8.1.0_x86_64.xml | 1 + .../disk-blockio.x86_64-latest.args | 2 +- tests/qemuxml2argvdata/disk-blockio.xml | 2 +- ...csi-disk-max_unmap_size.x86_64-latest.args | 37 +++ .../disk-scsi-disk-max_unmap_size.xml | 28 ++ tests/qemuxml2argvtest.c | 1 + 50 files changed, 168 insertions(+), 6 deletions(-) create mode 100644 tests/qemuxml2argvdata/disk-scsi-disk-max_unmap_size.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/disk-scsi-disk-max_unmap_size.xml -- 2.41.0
Re: [PATCH 0/2] qemu_passt: Stop pre-creating passt logfile
On Tue, Aug 01, 2023 at 04:33:41PM +0200, Michal Privoznik wrote: See reasoning in 2/2. Michal Prívozník (2): Revert "qemu_passt: Actually use @logfd" Revert "qemu_passt: Precreate passt logfile" Reviewed-by: Martin Kletzander src/qemu/qemu_passt.c | 40 +--- 1 file changed, 5 insertions(+), 35 deletions(-) -- 2.41.0 signature.asc Description: PGP signature
[libvirt PATCH] src: fix max file limits in systemd services
This fixes commit 38abf9c34dc481b0dc923bdab446ee623bdc5ab6 Author: Daniel P. Berrangé Date: Wed Jun 21 13:22:40 2023 +0100 src: set max open file limit to match systemd >= 240 defaults The bug referenced in that commit had suggested to set LimitNOFile=512000:1024 on the basis that matches current systemd default behaviour and is compatible with old systemd. That was good except * The setting is LimitNOFILE and these are case sensitive * The hard and soft limits were inverted - soft must come first and so it would have been ignored even if the setting name was correct. * The default hard limit is 524288 not 512000 Reported-by: Olaf Hering Signed-off-by: Daniel P. Berrangé --- src/ch/virtchd.service.in| 2 +- src/locking/virtlockd.service.in | 2 +- src/logging/virtlogd.service.in | 2 +- src/lxc/virtlxcd.service.in | 2 +- src/qemu/virtqemud.service.in| 2 +- src/remote/libvirtd.service.in | 2 +- 6 files changed, 6 insertions(+), 6 deletions(-) diff --git a/src/ch/virtchd.service.in b/src/ch/virtchd.service.in index be242fea78..351eee312b 100644 --- a/src/ch/virtchd.service.in +++ b/src/ch/virtchd.service.in @@ -24,7 +24,7 @@ Restart=on-failure # Raise hard limits to match behaviour of systemd >= 240. # During startup, daemon will set soft limit to match hard limit # per systemd recommendations -LimitNOFile=512000:1024 +LimitNOFILE=1024:524288 # The cgroups pids controller can limit the number of tasks started by # the daemon, which can limit the number of domains for some hypervisors. # A conservative default of 8 tasks per guest results in a TasksMax of diff --git a/src/locking/virtlockd.service.in b/src/locking/virtlockd.service.in index f1792dcb43..dd0bbab083 100644 --- a/src/locking/virtlockd.service.in +++ b/src/locking/virtlockd.service.in @@ -18,7 +18,7 @@ OOMScoreAdjust=-900 # Raise hard limits to match behaviour of systemd >= 240. # During startup, daemon will set soft limit to match hard limit # per systemd recommendations -LimitNOFile=512000:1024 +LimitNOFILE=1024:524288 [Install] Also=virtlockd.socket diff --git a/src/logging/virtlogd.service.in b/src/logging/virtlogd.service.in index cef4053f59..8e245ddb43 100644 --- a/src/logging/virtlogd.service.in +++ b/src/logging/virtlogd.service.in @@ -18,7 +18,7 @@ OOMScoreAdjust=-900 # Raise hard limits to match behaviour of systemd >= 240. # During startup, daemon will set soft limit to match hard limit # per systemd recommendations -LimitNOFile=512000:1024 +LimitNOFILE=1024:524288 [Install] Also=virtlogd.socket diff --git a/src/lxc/virtlxcd.service.in b/src/lxc/virtlxcd.service.in index b615a3f32d..ee3a7f1083 100644 --- a/src/lxc/virtlxcd.service.in +++ b/src/lxc/virtlxcd.service.in @@ -24,7 +24,7 @@ Restart=on-failure # Raise hard limits to match behaviour of systemd >= 240. # During startup, daemon will set soft limit to match hard limit # per systemd recommendations -LimitNOFile=512000:1024 +LimitNOFILE=1024:524288 # The cgroups pids controller can limit the number of tasks started by # the daemon, which can limit the number of domains for some hypervisors. # A conservative default of 8 tasks per guest results in a TasksMax of diff --git a/src/qemu/virtqemud.service.in b/src/qemu/virtqemud.service.in index e3dc738b8f..e79670ca95 100644 --- a/src/qemu/virtqemud.service.in +++ b/src/qemu/virtqemud.service.in @@ -26,7 +26,7 @@ Restart=on-failure # Raise hard limits to match behaviour of systemd >= 240. # During startup, daemon will set soft limit to match hard limit # per systemd recommendations -LimitNOFile=512000:1024 +LimitNOFILE=1024:524288 # The cgroups pids controller can limit the number of tasks started by # the daemon, which can limit the number of domains for some hypervisors. # A conservative default of 8 tasks per guest results in a TasksMax of diff --git a/src/remote/libvirtd.service.in b/src/remote/libvirtd.service.in index abac58cb2c..84f1613adc 100644 --- a/src/remote/libvirtd.service.in +++ b/src/remote/libvirtd.service.in @@ -31,7 +31,7 @@ Restart=on-failure # Raise hard limits to match behaviour of systemd >= 240. # During startup, daemon will set soft limit to match hard limit # per systemd recommendations -LimitNOFile=512000:1024 +LimitNOFILE=1024:524288 # The cgroups pids controller can limit the number of tasks started by # the daemon, which can limit the number of domains for some hypervisors. # A conservative default of 8 tasks per guest results in a TasksMax of -- 2.41.0
Re: [PATCH 1/3] qemu: Reflect MAC address change in live domain XML
On Wed, Aug 02, 2023 at 09:15:43AM +0200, Michal Prívozník wrote: On 7/26/23 16:45, Martin Kletzander wrote: On Wed, Jun 28, 2023 at 12:53:35PM +0200, Michal Privoznik wrote: If a guest changes MAC address on its vNIC, then QEMU emits NIC_RX_FILTER_CHANGED event (the event is emitted in other cases too, but that's not important right now). Now, domain XML allows users to chose whether to trust these events or not: For the 'no' case no action is performed and the event is ignored. But for the 'yes' case, some host side features of corresponding vNIC (well tap/macvtap device) are tweaked to reflect changed MAC address. But what is missing is reflecting this new MAC address in domain XML. Basically, what happens is: the host sees traffic with new MAC address, all tools inside the guest see the new MAC address (including 'virsh domifaddr --source agent') which makes it harder to match device in the guest with the one in the domain XML. NB, we should relay this event to clients, but that is covered in next commits. Signed-off-by: Michal Privoznik --- src/qemu/qemu_domain.c | 18 ++ src/qemu/qemu_driver.c | 2 +- 2 files changed, 19 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 94587638c3..5e5789a28c 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -12482,6 +12482,19 @@ syncNicRxFilterMulticast(char *ifname, } +/** + * qemuDomainSyncRxFilter: + * @vm: domain object + * @def: domain interface definition + * @asyncJob: async job type + * + * Fetch new state of RX Filter and set host side of the interface + * accordingly (e.g. reflect MAC address change on macvtap). + * + * Reflect changed MAC address in the domain definition. + * + * Returns: 0 on success, -1 on error. + */ int qemuDomainSyncRxFilter(virDomainObj *vm, virDomainNetDef *def, @@ -12535,6 +12548,11 @@ qemuDomainSyncRxFilter(virDomainObj *vm, return -1; } + /* Reflect changed MAC address in the domain XML. */ + if (virMacAddrCmp(>mac, >mac)) { + virMacAddrSet(>mac, >mac); + } + If we go with the idea I suggested this needs to be done even when we're not updating the filters. You mean, even when this qemuDomainSyncRxFilter() function is not called, i.e. update MAC addr from processNicRxFilterChangedEvent()? We could do that, but this RX_FILTER_CHANGED event is a bit special. To avoid flooding libvirt with this event, there's a antispam mechanism implemented - after QEMU emits the event it awaits 'query-rx-filter' monitor cmd. No other RX_FILTER_CHANGED event is emitted until the monitor cmd is issued. IOW, if we want to refresh MAC address on each event, we must (unconditionally) issue the command and then (based on trustGuestRxFilters) sync RX filters. What I can't decide is whether it's better to reflect MAC change in domain XML iff trustGuestRxFilters is set, or regardless. I meant this if we go with the idea of reporting both the HW MAC address and the guest's set MAC address in the live XML. That way we do not break existing use cases where they rely on the MAC address not changing and on top of that we allow for checking what the guest's MAC really is. I think this is a win-win, but feel free to disagree. Michal signature.asc Description: PGP signature
Re: [libvirt PATCH 6/9] src: set max open file limit to match systemd >= 240 defaults
On Wed, Aug 02, 2023 at 10:49:55AM +0200, Olaf Hering wrote: > Wed, 21 Jun 2023 14:32:29 +0100 Daniel P. Berrangé : > > > -LimitNOFILE=8192 > > +LimitNOFile=512000:1024 > > Did someone really rename that knob in upstream systemd? > My copy of systemd.directives(7) still lists the uppercase variant. > As a result this change means the knob is now not recognized anymore. No, its a mistake. With regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [libvirt PATCH 6/9] src: set max open file limit to match systemd >= 240 defaults
Wed, 21 Jun 2023 14:32:29 +0100 Daniel P. Berrangé : > -LimitNOFILE=8192 > +LimitNOFile=512000:1024 Did someone really rename that knob in upstream systemd? My copy of systemd.directives(7) still lists the uppercase variant. As a result this change means the knob is now not recognized anymore. Olaf pgpnBOFE0vRul.pgp Description: Digitale Signatur von OpenPGP
[PATCH] daemon: Treat logging of VIR_ERR_MULTIPLE_INTERFACES same as VIR_ERR_NO_INTERFACE
When a query for an interface via virInterfaceLookupByMACString finds multiple interfaces an error is returned. Treat such error with the same 'debug' priority as we treat when the interface was not found to avoid spamming logs with such configurations. Closes: https://gitlab.com/libvirt/libvirt/-/issues/514 Signed-off-by: Peter Krempa --- src/remote/remote_daemon.c | 1 + 1 file changed, 1 insertion(+) diff --git a/src/remote/remote_daemon.c b/src/remote/remote_daemon.c index d880711c91..d4d999e53a 100644 --- a/src/remote/remote_daemon.c +++ b/src/remote/remote_daemon.c @@ -97,6 +97,7 @@ static int daemonErrorLogFilter(virErrorPtr err, int priority) case VIR_ERR_NO_STORAGE_VOL: case VIR_ERR_NO_NODE_DEVICE: case VIR_ERR_NO_INTERFACE: +case VIR_ERR_MULTIPLE_INTERFACES: case VIR_ERR_NO_NWFILTER: case VIR_ERR_NO_NWFILTER_BINDING: case VIR_ERR_NO_SECRET: -- 2.41.0
[PATCH 1/3] virLXCControllerSetupUsernsMap: Modify debug logging for clean startup errors
Avoid logging multiline debug logs so that the function which attempts to extract a non-debug log error message can work properly. Signed-off-by: Peter Krempa --- src/lxc/lxc_controller.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/src/lxc/lxc_controller.c b/src/lxc/lxc_controller.c index 86dcd880e8..ba7f15ad24 100644 --- a/src/lxc/lxc_controller.c +++ b/src/lxc/lxc_controller.c @@ -1359,11 +1359,13 @@ virLXCControllerSetupUsernsMap(virDomainIdMapEntry *map, return -1; } -for (i = 0; i < num; i++) +VIR_DEBUG("Set '%s' mappings to:", path); + +for (i = 0; i < num; i++) { +VIR_DEBUG("%u %u %u", map[i].start, map[i].target, map[i].count); virBufferAsprintf(_value, "%u %u %u\n", map[i].start, map[i].target, map[i].count); - -VIR_DEBUG("Set '%s' to '%s'", path, virBufferCurrentContent(_value)); +} if (virFileWriteStr(path, virBufferCurrentContent(_value), 0) < 0) { virReportSystemError(errno, _("unable write to %1$s"), path); -- 2.41.0
[PATCH 2/3] virLXCProcessReadLogOutputData: Refill buffer after filtering out noise
The caller passes in a 1k buffer, which when debug logging is in use is easily filled with debug messages only. Thus after the first pass which is common if the controller process already terminated the buffer will not contain the real error, but rather a truncated debug message, which will result in an error such as: error: internal error: guest failed to start: 2023-08-01 12:58:31.948+: 798195: i instead of the proper error: error: internal error: guest failed to start: Failure in libvirt_lxc startup: Failed to create /home/rootfs/.oldroot: Permission denied To fix the above retry the reading loop if the filtering function made space in the buffer. Signed-off-by: Peter Krempa --- src/lxc/lxc_process.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index d003742fa1..6b79bd737b 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -1011,6 +1011,7 @@ virLXCProcessReadLogOutputData(virDomainObj *vm, int retries = 10; int got = 0; char *filter_next = buf; +bool filtered; buf[0] = '\0'; @@ -1036,11 +1037,13 @@ virLXCProcessReadLogOutputData(virDomainObj *vm, buf[got] = '\0'; /* Filter out debug messages from intermediate libvirt process */ +filtered = false; while ((eol = strchr(filter_next, '\n'))) { *eol = '\0'; if (virLXCProcessIgnorableLogLine(filter_next)) { memmove(filter_next, eol + 1, got - (eol - buf)); got -= eol + 1 - filter_next; +filtered = true; } else { filter_next = eol + 1; *eol = '\n'; @@ -1054,6 +1057,9 @@ virLXCProcessReadLogOutputData(virDomainObj *vm, return -1; } +if (filtered) +continue; + if (isdead) return got; -- 2.41.0
[PATCH 3/3] virLXCProcessReportStartupLogError: Strip trailing newline from error
Since the error message originates from a log file it contains a trailing newline. Strip it as all error handling adds it's own newline. Signed-off-by: Peter Krempa --- src/lxc/lxc_process.c | 6 ++ 1 file changed, 6 insertions(+) diff --git a/src/lxc/lxc_process.c b/src/lxc/lxc_process.c index 6b79bd737b..04642b56dd 100644 --- a/src/lxc/lxc_process.c +++ b/src/lxc/lxc_process.c @@ -1124,6 +1124,7 @@ virLXCProcessReportStartupLogError(virDomainObj *vm, { size_t buflen = 1024; g_autofree char *errbuf = g_new0(char, buflen); +char *p; int rc; if ((rc = virLXCProcessReadLogOutput(vm, logfile, pos, errbuf, buflen)) < 0) @@ -1132,6 +1133,11 @@ virLXCProcessReportStartupLogError(virDomainObj *vm, if (rc == 0) return 0; +/* strip last newline */ +if ((p = strrchr(errbuf, '\n')) && +p[1] == '\0') +*p = '\0'; + virReportError(VIR_ERR_INTERNAL_ERROR, _("guest failed to start: %1$s"), errbuf); -- 2.41.0
[PATCH 0/3] lxc: Fix reporting of startup errors with debug logging enabled
Few issues with propagating error to the user when debug logging is enabled, Peter Krempa (3): virLXCControllerSetupUsernsMap: Modify debug logging for clean startup errors virLXCProcessReadLogOutputData: Refill buffer after filtering out noise virLXCProcessReportStartupLogError: Strip trailing newline from error src/lxc/lxc_controller.c | 8 +--- src/lxc/lxc_process.c| 12 2 files changed, 17 insertions(+), 3 deletions(-) -- 2.41.0
Re: [PATCH 1/3] qemu: Reflect MAC address change in live domain XML
On 7/26/23 16:45, Martin Kletzander wrote: > On Wed, Jun 28, 2023 at 12:53:35PM +0200, Michal Privoznik wrote: >> If a guest changes MAC address on its vNIC, then QEMU emits >> NIC_RX_FILTER_CHANGED event (the event is emitted in other cases >> too, but that's not important right now). Now, domain XML allows >> users to chose whether to trust these events or not: >> >> >> >> For the 'no' case no action is performed and the event is >> ignored. But for the 'yes' case, some host side features of >> corresponding vNIC (well tap/macvtap device) are tweaked to >> reflect changed MAC address. But what is missing is reflecting >> this new MAC address in domain XML. >> >> Basically, what happens is: the host sees traffic with new MAC >> address, all tools inside the guest see the new MAC address >> (including 'virsh domifaddr --source agent') which makes it >> harder to match device in the guest with the one in the domain >> XML. >> >> NB, we should relay this event to clients, but that is covered in >> next commits. >> >> Signed-off-by: Michal Privoznik >> --- >> src/qemu/qemu_domain.c | 18 ++ >> src/qemu/qemu_driver.c | 2 +- >> 2 files changed, 19 insertions(+), 1 deletion(-) >> >> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c >> index 94587638c3..5e5789a28c 100644 >> --- a/src/qemu/qemu_domain.c >> +++ b/src/qemu/qemu_domain.c >> @@ -12482,6 +12482,19 @@ syncNicRxFilterMulticast(char *ifname, >> } >> >> >> +/** >> + * qemuDomainSyncRxFilter: >> + * @vm: domain object >> + * @def: domain interface definition >> + * @asyncJob: async job type >> + * >> + * Fetch new state of RX Filter and set host side of the interface >> + * accordingly (e.g. reflect MAC address change on macvtap). >> + * >> + * Reflect changed MAC address in the domain definition. >> + * >> + * Returns: 0 on success, -1 on error. >> + */ >> int >> qemuDomainSyncRxFilter(virDomainObj *vm, >> virDomainNetDef *def, >> @@ -12535,6 +12548,11 @@ qemuDomainSyncRxFilter(virDomainObj *vm, >> return -1; >> } >> >> + /* Reflect changed MAC address in the domain XML. */ >> + if (virMacAddrCmp(>mac, >mac)) { >> + virMacAddrSet(>mac, >mac); >> + } >> + > > If we go with the idea I suggested this needs to be done even when we're > not updating the filters. You mean, even when this qemuDomainSyncRxFilter() function is not called, i.e. update MAC addr from processNicRxFilterChangedEvent()? We could do that, but this RX_FILTER_CHANGED event is a bit special. To avoid flooding libvirt with this event, there's a antispam mechanism implemented - after QEMU emits the event it awaits 'query-rx-filter' monitor cmd. No other RX_FILTER_CHANGED event is emitted until the monitor cmd is issued. IOW, if we want to refresh MAC address on each event, we must (unconditionally) issue the command and then (based on trustGuestRxFilters) sync RX filters. What I can't decide is whether it's better to reflect MAC change in domain XML iff trustGuestRxFilters is set, or regardless. Michal
Re: [PATCH 2/3] Introduce NIC_MAC_CHANGE event
On 7/26/23 16:47, Martin Kletzander wrote: > On Wed, Jun 28, 2023 at 12:53:36PM +0200, Michal Privoznik wrote: >> The aim off this event is to notify management application that >> guest changed MAC address on one of its vNICs so the app can >> update its internal records, e.g. for finding match between >> guest/host view of vNICs. >> >> Signed-off-by: Michal Privoznik >> --- >> examples/c/misc/event-test.c | 14 + >> include/libvirt/libvirt-domain.h | 28 + >> src/conf/domain_event.c | 93 + >> src/conf/domain_event.h | 12 >> src/libvirt_private.syms | 2 + >> src/remote/remote_daemon_dispatch.c | 32 ++ >> src/remote/remote_driver.c | 34 +++ >> src/remote/remote_protocol.x | 17 +- >> tools/virsh-domain-event.c | 20 +++ >> 9 files changed, 251 insertions(+), 1 deletion(-) >> > > [...] > >> diff --git a/src/remote/remote_daemon_dispatch.c >> b/src/remote/remote_daemon_dispatch.c >> index 7144e9e7ca..f347e7bcce 100644 >> --- a/src/remote/remote_daemon_dispatch.c >> +++ b/src/remote/remote_daemon_dispatch.c >> @@ -1357,6 +1357,37 @@ >> remoteRelayDomainEventMemoryDeviceSizeChange(virConnectPtr conn, >> } >> >> >> +static int >> +remoteRelayDomainEventNICMACChange(virConnectPtr conn, >> + virDomainPtr dom, >> + const char *alias, >> + const char *oldMAC, >> + const char *newMAC, >> + void *opaque) >> +{ >> + daemonClientEventCallback *callback = opaque; >> + remote_domain_event_nic_mac_change_msg data; >> + >> + if (callback->callbackID < 0 || >> + !remoteRelayDomainEventCheckACL(callback->client, conn, dom)) >> + return -1; >> + >> + /* build return data */ >> + memset(, 0, sizeof(data)); > > Just a nit pick, but instead of this you should be able to do: > > remote_domain_event_nic_mac_change_msg data = {0}; Yep, you're right. This is what you get when you copy code from an event introduced earlier :-) Fixed locally. Michal