Re: [PATCH 0/3] Implement support for QCOW2 data files
On 8/16/24 17:00, Peter Krempa wrote: On Wed, Aug 14, 2024 at 15:28:57 +0200, Denis V. Lunev wrote: On 8/12/24 16:46, Peter Krempa wrote: On Mon, Aug 12, 2024 at 12:04:01 +0200, Denis V. Lunev wrote: On 8/12/24 10:36, Peter Krempa wrote: On Mon, Aug 12, 2024 at 09:26:08 +0200, Peter Krempa wrote: On Sun, Aug 11, 2024 at 17:34:45 +0300, Nikolai Barybin via Devel wrote:

There are use cases when existing disks (i.e. LVM) are wanted to be used with advanced features. For this purpose QEMU provides the data-file feature for qcow2 files: metadata is kept in the qcow2 file as usual, but guest data is written to an external file. These patches enable support for this feature in libvirt.

So this feature was once attempted to be added, but it was never finished and the committed bits were eventually reverted. (I've purged my local archive so I don't have the link handy, but I can look if you want the links to the old posting.) It was deemed that this doesn't really add any performance benefit over storing the actual qcow2 with the data inside. The qcow2 with data can be stored inside the LV or any other block device for that matter, and thus can provide all the features that are necessary. The data file feature also makes the management of the metadata and data much more complex, for a very bad trade-off. At this point, with 'qemu-img measure' it's easy to query the size necessary to have a fully allocated qcow2 inside the block device.

Based on the history of this, I'd like to ask you to summarize the justifications and reasons for adding this feature before continuing. Based on the current state of the series and what would be required to make it viable to be accepted, I very strongly suggest re-thinking whether you really need this feature, especially given the caveats above.

Let me clarify a bit. QCOW2 with a data file uses QCOW2 as metadata storage for an ordinary block device.
This is a feature of QEMU, and it would be quite natural to have its representation in libvirt, since without libvirt's help QEMU could not even start with such a configuration due to namespaces. LVM or not LVM.

How is it better in comparison with plain QCOW2? Historically, when this was considered for the incremental backup feature in oVirt, a similar set of advantages was cited as the justification. Later on, after discussing this for a bit, it became obvious that the advantages are not great enough to justify the extra effort:
- extra management complexity (need to carry over the qcow2 as well as the data file)
- possibility to desynchronize the state (a write into the 'data_file' invalidates the qcow2 "overlay") without being able to see that it was desync'd (writing into the data file doesn't touch the overlay, so the qcow2 driver doesn't see it)
- basically no performance benefits on top of qcow2
- (perhaps others on the oVirt side ... it was a long time ago, so I don't remember any more)

Yes. And we will definitely have this extra complexity on top of the extra functionality. Right now we have to support, for our product, backups of VM data residing on LVM volumes. This is shipped in production, and I have the option of keeping this downstream only or submitting it upstream.

Fair enough. As long as you are willing to properly implement data file support, I'm fine with that. I just wanted to note that one prior attempt was deemed not worth it and abandoned, so that you can understand what you are taking on.

The problem is that if we were to say that libvirt is not going this way, we should clearly indicate in
* the QCOW2 documentation
* the qemu-img man page
that the option of using a data file for VM metadata is deprecated and will not get further development. That would at least be fair.

I have no idea how the QEMU project approaches this and what they think about the status of data files.
For libvirt, if QEMU supports it, it's fair game to add the feature if somebody is willing to spend the effort.

We have taken the decision that this scenario should be supported, based on the availability of this option and its presence in the public docs. First of all, there are historical setups where the customer uses LVM for virtual machines and does not want to reformat his hard disks to a file system. This makes sense as we

Yes, such setups would not be possible. Users would need to convert to qcow2, but that can be done transparently to the VM (though it briefly requires twice the storage).

That is a BIG problem. Customers do not want to change their disk layout. Each time we try to force them, we get problems. Real big problems.

are avoiding two levels of metadata, i.e. QCOW2 metadata over local FS metadata. This makes the setup a little bit more

As stated above, you don't really need to use a filesystem. You can make a block device (whether real/partition/LV) into a qcow2 image simply by formatting it with qemu-img. We use LVM for big
Re: [PATCH 0/3] Implement support for QCOW2 data files
On 8/12/24 16:46, Peter Krempa wrote: On Mon, Aug 12, 2024 at 12:04:01 +0200, Denis V. Lunev wrote: On 8/12/24 10:36, Peter Krempa wrote: On Mon, Aug 12, 2024 at 09:26:08 +0200, Peter Krempa wrote: On Sun, Aug 11, 2024 at 17:34:45 +0300, Nikolai Barybin via Devel wrote:

There are use cases when existing disks (i.e. LVM) are wanted to be used with advanced features. For this purpose QEMU provides the data-file feature for qcow2 files: metadata is kept in the qcow2 file as usual, but guest data is written to an external file. These patches enable support for this feature in libvirt.

So this feature was once attempted to be added, but it was never finished and the committed bits were eventually reverted. (I've purged my local archive so I don't have the link handy, but I can look if you want the links to the old posting.) It was deemed that this doesn't really add any performance benefit over storing the actual qcow2 with the data inside. The qcow2 with data can be stored inside the LV or any other block device for that matter, and thus can provide all the features that are necessary. The data file feature also makes the management of the metadata and data much more complex, for a very bad trade-off. At this point, with 'qemu-img measure' it's easy to query the size necessary to have a fully allocated qcow2 inside the block device.

Based on the history of this, I'd like to ask you to summarize the justifications and reasons for adding this feature before continuing. Based on the current state of the series and what would be required to make it viable to be accepted, I very strongly suggest re-thinking whether you really need this feature, especially given the caveats above.

Let me clarify a bit. QCOW2 with a data file uses QCOW2 as metadata storage for an ordinary block device. This is a feature of QEMU, and it would be quite natural to have its representation in libvirt, since without libvirt's help QEMU could not even start with such a configuration due to namespaces. LVM or not LVM.
How is it better in comparison with plain QCOW2? Historically, when this was considered for the incremental backup feature in oVirt, a similar set of advantages was cited as the justification. Later on, after discussing this for a bit, it became obvious that the advantages are not great enough to justify the extra effort:
- extra management complexity (need to carry over the qcow2 as well as the data file)
- possibility to desynchronize the state (a write into the 'data_file' invalidates the qcow2 "overlay") without being able to see that it was desync'd (writing into the data file doesn't touch the overlay, so the qcow2 driver doesn't see it)
- basically no performance benefits on top of qcow2
- (perhaps others on the oVirt side ... it was a long time ago, so I don't remember any more)

Yes. And we will definitely have this extra complexity on top of the extra functionality. Right now we have to support, for our product, backups of VM data residing on LVM volumes. This is shipped in production, and I have the option of keeping this downstream only or submitting it upstream.

The problem is that if we were to say that libvirt is not going this way, we should clearly indicate in
* the QCOW2 documentation
* the qemu-img man page
that the option of using a data file for VM metadata is deprecated and will not get further development. That would at least be fair.

We have taken the decision that this scenario should be supported, based on the availability of this option and its presence in the public docs. First of all, there are historical setups where the customer uses LVM for virtual machines and does not want to reformat his hard disks to a file system. This makes sense as we

Yes, such setups would not be possible. Users would need to convert to qcow2, but that can be done transparently to the VM (though it briefly requires twice the storage).

That is a BIG problem. Customers do not want to change their disk layout. Each time we try to force them, we get problems. Real big problems.
are avoiding two levels of metadata, i.e. QCOW2 metadata over local FS metadata. This makes the setup a little bit more

As stated above, you don't really need to use a filesystem. You can make a block device (whether real/partition/LV) into a qcow2 image simply by formatting it with qemu-img.

We use LVM for big data (TBs in size) and QCOW2 for metadata, namely CBT. This is how libvirt now distinguishes EFI QCOW2 and disk QCOW2.

reliable. It should also be noted that QCOW2 in very specific setups is at least 2 times slower (when the L2 cache does not fit into the dedicated amount of RAM while the disk itself is quite big; I would admit that this problem can be seen even for not-so-big 1T disks).

Doesn't a QCOW2 with a 'data file' behave the same as a fully-allocated qcow2? I don't really see how this is more reliable/performant than a plain fully-allocated qcow2.
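The "fully allocated qcow2 inside the block device" sizing that 'qemu-img measure' performs can be approximated by hand. The sketch below is a rough back-of-the-envelope model (header plus L1/L2 tables plus refcount blocks, assuming the default 64 KiB clusters and 16-bit refcounts), not QEMU's exact algorithm:

```python
# Rough estimate of qcow2 metadata overhead for a fully allocated image,
# in the spirit of what 'qemu-img measure' reports. This is an
# approximation, not QEMU's exact accounting.

def qcow2_metadata_estimate(virtual_size, cluster_size=64 * 1024):
    entries_per_cluster = cluster_size // 8          # 8-byte L1/L2 entries
    clusters = -(-virtual_size // cluster_size)      # ceil division
    l2_clusters = -(-clusters // entries_per_cluster)
    l1_clusters = -(-l2_clusters // entries_per_cluster)
    # refcount entries are 2 bytes each by default (refcount_bits=16)
    refblock_clusters = -(-clusters // (cluster_size // 2))
    header_clusters = 1
    meta = header_clusters + l1_clusters + l2_clusters + refblock_clusters
    return meta * cluster_size

# For a 1 TiB disk the L2 tables alone already account for ~128 MiB:
print(qcow2_metadata_estimate(1 << 40) // (1 << 20), "MiB")  # 160 MiB
```

The estimate is dominated by the L2 tables (8 bytes per data cluster), which is also why the L2-cache sizing discussed later in the thread scales the same way.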
Re: [PATCH 0/3] Implement support for QCOW2 data files
On 8/12/24 10:36, Peter Krempa wrote: On Mon, Aug 12, 2024 at 09:26:08 +0200, Peter Krempa wrote: On Sun, Aug 11, 2024 at 17:34:45 +0300, Nikolai Barybin via Devel wrote:

There are use cases when existing disks (i.e. LVM) are wanted to be used with advanced features. For this purpose QEMU provides the data-file feature for qcow2 files: metadata is kept in the qcow2 file as usual, but guest data is written to an external file. These patches enable support for this feature in libvirt.

So this feature was once attempted to be added, but it was never finished and the committed bits were eventually reverted. (I've purged my local archive so I don't have the link handy, but I can look if you want the links to the old posting.) It was deemed that this doesn't really add any performance benefit over storing the actual qcow2 with the data inside. The qcow2 with data can be stored inside the LV or any other block device for that matter, and thus can provide all the features that are necessary. The data file feature also makes the management of the metadata and data much more complex, for a very bad trade-off. At this point, with 'qemu-img measure' it's easy to query the size necessary to have a fully allocated qcow2 inside the block device.

Based on the history of this, I'd like to ask you to summarize the justifications and reasons for adding this feature before continuing. Based on the current state of the series and what would be required to make it viable to be accepted, I very strongly suggest re-thinking whether you really need this feature, especially given the caveats above.

Let me clarify a bit. QCOW2 with a data file uses QCOW2 as metadata storage for an ordinary block device. This is a feature of QEMU and it would be quite natural to have its representation in libvirt, since without libvirt's help QEMU could not even start with such a configuration due to namespaces. LVM or not LVM. How is it better in comparison with plain QCOW2?
First of all, there are historical setups where the customer uses LVM for virtual machines and does not want to reformat his hard disks to a file system. This makes sense as we are avoiding two levels of metadata, i.e. QCOW2 metadata over local FS metadata. This makes the setup a little bit more reliable. It should also be noted that QCOW2 in very specific setups is at least 2 times slower (when the L2 cache does not fit into the dedicated amount of RAM while the disk itself is quite big; I would admit that this problem can be seen even for not-so-big 1T disks). On top of that, LVM is quite useful once we start talking about clustered LVM setups. Clustered LVM is a good alternative to CEPH, at least for some users. Thus this data file setup is a way to provide backups, VM state snapshots and other features.

Den
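The 2x slowdown mentioned above comes from qcow2 L2-cache misses: to keep every guest-cluster mapping in RAM, QEMU needs roughly disk_size / cluster_size * 8 bytes of L2 cache, and the historical 1 MiB default only covers 8 GiB of a 64 KiB-cluster image. A quick sketch of the arithmetic (illustrative Python, not QEMU code):

```python
# L2-cache sizing rule for qcow2: each data cluster needs an 8-byte L2
# entry resident in the cache for its mapping to be served from RAM.

def l2_cache_needed(disk_size, cluster_size=64 * 1024):
    # Bytes of L2 cache required to map the whole disk without misses.
    return disk_size // cluster_size * 8

# How much virtual disk a 1 MiB cache covers with 64 KiB clusters:
covered_by_default = (1 * 1024 * 1024) // 8 * (64 * 1024)

print(l2_cache_needed(1 << 40) // (1 << 20), "MiB needed for a 1 TiB disk")  # 128 MiB
print(covered_by_default // (1 << 30), "GiB covered by a 1 MiB cache")       # 8 GiB
```

Beyond the covered size, each cache miss costs an extra metadata read before the data I/O, which is the source of the slowdown described for ~1T disks.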
Re: [PATCH 0/4] Rework qemu internal active snapshots to use QMP
On 7/26/24 12:35, Nikolai Barybin wrote: Hello everyone! This is a ping email.

On 7/16/24 00:42, Nikolai Barybin wrote: These patches make use of the recently added QMP snapshot-save/delete commands and reap the HMP savevm/deletevm ones. The usage of HMP commands is highly discouraged by QEMU.

Nikolai Barybin (4):
  qemu monitor: add snaphot-save/delete QMP commands
  qemu blockjob: add snapshot-save/delete job types
  qemu snapshot: use QMP snapshot-save/delete for internal snapshots
  qemu monitor: reap qemu_monitor_text

 po/POTFILES                  |   1 -
 po/libvirt.pot               |  18
 src/qemu/meson.build         |   1 -
 src/qemu/qemu_block.c        |   2 +
 src/qemu/qemu_blockjob.c     |   6 +-
 src/qemu/qemu_blockjob.h     |   2 +
 src/qemu/qemu_domain.c       |   4 +
 src/qemu/qemu_monitor.c      |  23 +++-
 src/qemu/qemu_monitor.h      |  16 +++-
 src/qemu/qemu_monitor_json.c |  66 +++
 src/qemu/qemu_monitor_json.h |  13 +++
 src/qemu/qemu_monitor_text.c |  88 ---
 src/qemu/qemu_monitor_text.h |  29 ---
 src/qemu/qemu_snapshot.c     | 158 ---
 14 files changed, 267 insertions(+), 160 deletions(-)
 delete mode 100644 src/qemu/qemu_monitor_text.c
 delete mode 100644 src/qemu/qemu_monitor_text.h

1. Please do not top-post.
2. I believe you are pinging the wrong thread; you should do that in v2.
Re: [PATCH 3/4] qemu snapshot: use QMP snapshot-save/delete for internal snapshots
On 7/16/24 00:42, Nikolai Barybin wrote: The usage of HMP commands is highly discouraged by QEMU. Moreover, the current snapshot creation routine does not provide flexibility in choosing the target device for the VM state snapshot. This patch makes use of the QMP commands snapshot-save/delete and by default chooses the first writable disk (if present) as the target for the VM state, or NVRAM otherwise.

Signed-off-by: Nikolai Barybin
---
 src/qemu/qemu_snapshot.c | 158 ---
 1 file changed, 148 insertions(+), 10 deletions(-)

diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
index f5260c4a22..83949a9a27 100644
--- a/src/qemu/qemu_snapshot.c
+++ b/src/qemu/qemu_snapshot.c
@@ -308,6 +308,96 @@ qemuSnapshotCreateInactiveExternal(virQEMUDriver *driver,
     return ret;
 }
 
+static int
+qemuSnapshotActiveInternalGetWrdevListHelper(virDomainObj *vm,
+                                             char **wrdevs)
+{
+    size_t wrdevCount = 0;
+    size_t i = 0;
+
+    for (i = 0; i < vm->def->ndisks; i++) {
+        virDomainDiskDef *disk = vm->def->disks[i];
+        if (!disk->src->readonly) {
+            wrdevs[wrdevCount] = g_strdup(disk->src->nodenameformat);
+            wrdevCount++;
+        }
+    }
+
+    if (wrdevCount == 0) {
+        if (vm->def->os.loader->nvram) {
+            wrdevs[0] = g_strdup(vm->def->os.loader->nvram->nodenameformat);
+        } else {
+            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                           _("no writable device for internal snapshot creation/deletion"));
+            return -1;
+        }
+    }
+
+    return 0;
+}
+
+
+static int
+qemuSnapshotCreateActiveInternalDone(virDomainObj *vm)
+{
+    qemuBlockJobData *job = NULL;
+    qemuDomainObjPrivate *priv = vm->privateData;
+
+    if (!(job = virHashLookup(priv->blockjobs, g_strdup_printf("snapsave%d", vm->def->id)))) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("failed to lookup blockjob 'snapsave%1$d'"), vm->def->id);
+        return -1;
+    }
+
+    qemuBlockJobUpdate(vm, job, VIR_ASYNC_JOB_NONE);
+    if (job->state == VIR_DOMAIN_BLOCK_JOB_FAILED) {
+        virReportError(VIR_ERR_INTERNAL_ERROR,
+                       _("snapshot-save job failed: %1$s"), NULLSTR(job->errmsg));
+        return -1;
+    }
+
+    return job->state == VIR_DOMAIN_BLOCK_JOB_COMPLETED ? 1 : 0;
+}
+
+
+static int
+qemuSnapshotCreateActiveInternalStart(virDomainObj *vm,
+                                      const char *name)
+{
+    qemuBlockJobData *job = NULL;
+    g_autofree char **wrdevs = NULL;
+    int ret = -1;
+    int rc = 0;
+
+    wrdevs = g_new0(char *, vm->def->ndisks + 1);
+    if (qemuSnapshotActiveInternalGetWrdevListHelper(vm, wrdevs) < 0)
+        return -1;
+
+    if (!(job = qemuBlockJobDiskNew(vm, NULL, QEMU_BLOCKJOB_TYPE_SNAPSHOT_SAVE,
+                                    g_strdup_printf("snapsave%d", vm->def->id)))) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("failed to create new blockjob"));
+        return -1;
+    }
+
+    qemuBlockJobSyncBegin(job);
+    if (qemuDomainObjEnterMonitorAsync(vm, VIR_ASYNC_JOB_SNAPSHOT) < 0) {
+        ret = -1;
+        goto cleanup;
+    }
+
+    rc = qemuMonitorSnapshotSave(qemuDomainGetMonitor(vm), job->name,
+                                 name, wrdevs[0], wrdevs);
+    qemuDomainObjExitMonitor(vm);
+    if (rc == 0) {
+        qemuBlockJobStarted(job, vm);
+        ret = 0;
+    }
+
+ cleanup:
+    qemuBlockJobStartupFinalize(vm, job);
+    return ret;
+}
+
 
 /* The domain is expected to be locked and active. */
 static int
@@ -316,11 +406,11 @@ qemuSnapshotCreateActiveInternal(virQEMUDriver *driver,
                                  virDomainMomentObj *snap,
                                  unsigned int flags)
 {
-    qemuDomainObjPrivate *priv = vm->privateData;
     virObjectEvent *event = NULL;
     bool resume = false;
     virDomainSnapshotDef *snapdef = virDomainSnapshotObjGetDef(snap);
     int ret = -1;
+    int rv = 0;
 
     if (!qemuMigrationSrcIsAllowed(vm, false, VIR_ASYNC_JOB_SNAPSHOT, 0))
         goto cleanup;
@@ -342,15 +432,17 @@ qemuSnapshotCreateActiveInternal(virQEMUDriver *driver,
         }
     }
 
-    if (qemuDomainObjEnterMonitorAsync(vm, VIR_ASYNC_JOB_SNAPSHOT) < 0) {
+    if ((ret = qemuSnapshotCreateActiveInternalStart(vm, snap->def->name)) < 0) {
         resume = false;
         goto cleanup;
     }
 
-    ret = qemuMonitorCreateSnapshot(priv->mon, snap->def->name);
-    qemuDomainObjExitMonitor(vm);
-    if (ret < 0)
-        goto cleanup;

'snapshot-save' was added in QEMU 6.0, and right now we are at 8.2. Should we still support pre-6.0 versions, i.e. check the availability of the command via a proper capability?
Den
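For reference, the shape of the QMP command the patch drives (available since QEMU 6.0). The node names below are hypothetical; the argument mapping mirrors the qemuMonitorSnapshotSave(mon, job->name, name, wrdevs[0], wrdevs) call in the patch:

```python
import json

# Sketch of the 'snapshot-save' QMP payload: the job id, the snapshot
# tag, the node that receives the VM state (the patch passes the first
# writable device), and the full list of nodes included in the snapshot.

def snapshot_save_cmd(job_id, tag, wrdevs):
    return {
        "execute": "snapshot-save",
        "arguments": {
            "job-id": job_id,       # e.g. "snapsave1" as generated by the patch
            "tag": tag,             # the snapshot name
            "vmstate": wrdevs[0],   # node storing the VM state
            "devices": wrdevs,      # all writable nodes to snapshot
        },
    }

# Hypothetical node names for illustration only:
cmd = snapshot_save_cmd("snapsave1", "snap1", ["libvirt-1-format", "libvirt-2-format"])
print(json.dumps(cmd, indent=2))
```

Completion is then reported asynchronously through the jobs infrastructure, which is why the patch tracks it as a block job instead of a synchronous monitor call.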
Re: [PATCH 3/4] qemu snapshot: use QMP snapshot-save/delete for internal snapshots
On 7/16/24 00:42, Nikolai Barybin wrote: The usage of HMP commands is highly discouraged by QEMU. Moreover, the current snapshot creation routine does not provide flexibility in choosing the target device for the VM state snapshot. This patch makes use of the QMP commands snapshot-save/delete and by default chooses the first writable disk (if present) as the target for the VM state, or NVRAM otherwise.

Signed-off-by: Nikolai Barybin
---
 src/qemu/qemu_snapshot.c | 158 ---
 1 file changed, 148 insertions(+), 10 deletions(-)

diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
index f5260c4a22..83949a9a27 100644
--- a/src/qemu/qemu_snapshot.c
+++ b/src/qemu/qemu_snapshot.c
@@ -308,6 +308,96 @@ qemuSnapshotCreateInactiveExternal(virQEMUDriver *driver,
     return ret;
 }
 
+static int
+qemuSnapshotActiveInternalGetWrdevListHelper(virDomainObj *vm,
+                                             char **wrdevs)
+{
+    size_t wrdevCount = 0;
+    size_t i = 0;
+
+    for (i = 0; i < vm->def->ndisks; i++) {
+        virDomainDiskDef *disk = vm->def->disks[i];
+        if (!disk->src->readonly) {
+            wrdevs[wrdevCount] = g_strdup(disk->src->nodenameformat);
+            wrdevCount++;
+        }
+    }
+
+    if (wrdevCount == 0) {
+        if (vm->def->os.loader->nvram) {
+            wrdevs[0] = g_strdup(vm->def->os.loader->nvram->nodenameformat);
+        } else {
+            virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                           _("no writable device for internal snapshot creation/deletion"));
+            return -1;
+        }
+    }
+

Should we select NVRAM at all? IMHO that would be very problematic. I think that we should generate an error in this case.

Den
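The selection policy under debate can be sketched as follows (hypothetical data model, not libvirt code). The patch's behavior corresponds to fallback_to_nvram=True; the review's suggestion corresponds to the error path:

```python
# Sketch of the writable-device selection for an internal snapshot:
# collect all writable disks; when there are none, either fall back to
# NVRAM (the patch) or refuse outright (the review's suggestion).

def writable_devices(disks, nvram=None, fallback_to_nvram=True):
    wrdevs = [d["node"] for d in disks if not d["readonly"]]
    if wrdevs:
        return wrdevs
    if fallback_to_nvram and nvram is not None:
        return [nvram]  # questionable: VM state would land in NVRAM
    raise RuntimeError("no writable device for internal snapshot creation/deletion")

disks = [{"node": "disk0", "readonly": True},
         {"node": "disk1", "readonly": False}]
print(writable_devices(disks))  # ['disk1']
```

Writing megabytes of VM state into a small NVRAM image is what makes the fallback problematic, hence the suggestion to fail instead.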
Re: [PATCH] Сheck snapshot disk is not NULL when searching it in the VM config
On 4/29/24 14:43, Fima Shevrin wrote: When creating a snapshot of a VM with multiple hard disks, the snapshot takes into account the presence of all disks in the system. If, over time, one of the disks is deleted, the snapshot will continue to store knowledge of the deleted disk. The result is that at the moment of deleting the snapshot, at the validation stage, the code searches for a disk from the snapshot which is no longer in the VM configuration, so the vmdisk variable ends up NULL. Dereferencing a NULL pointer when calling virStorageSourceIsSameLocation(vmdisk->src, disk->src) results in SIGSEGV.

Signed-off-by: Fima Shevrin
---
 src/qemu/qemu_snapshot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/qemu/qemu_snapshot.c b/src/qemu/qemu_snapshot.c
index 09ec959f10..bf93cd485e 100644
--- a/src/qemu/qemu_snapshot.c
+++ b/src/qemu/qemu_snapshot.c
@@ -3806,7 +3806,7 @@ qemuSnapshotDeleteValidate(virDomainObj *vm,
             vmdisk = qemuDomainDiskByName(vm->def, snapDisk->name);
             disk = qemuDomainDiskByName(snapdef->parent.dom, snapDisk->name);
 
-            if (!virStorageSourceIsSameLocation(vmdisk->src, disk->src)) {
+            if (vmdisk != NULL && !virStorageSourceIsSameLocation(vmdisk->src, disk->src)) {
                 virReportError(VIR_ERR_OPERATION_UNSUPPORTED,
                                _("disk image '%1$s' for internal snapshot '%2$s' is not the same as disk image currently used by VM"),
                                snapDisk->name, snap->def->name);

ping
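The failure mode and the guard can be illustrated with a minimal sketch (hypothetical lookup helpers, not the libvirt code): the snapshot still names a disk that the VM definition no longer contains, so the lookup finds nothing and an unguarded comparison would dereference the missing object.

```python
# Minimal model of the validation step: look up the snapshot's disk in
# the current VM definition; if the disk was removed since the snapshot
# was taken, the lookup yields None and must be guarded before use.

def same_location(a, b):
    return a["path"] == b["path"]

def validate_snapshot_disk(vm_disks, snap_disk):
    vmdisk = vm_disks.get(snap_disk["name"])  # may be None: disk removed from VM
    # The guard mirrors the patch: only compare locations when the disk
    # is still present in the VM configuration.
    if vmdisk is not None and not same_location(vmdisk, snap_disk):
        raise ValueError("disk image is not the same as disk image currently used by VM")
    return True

vm_disks = {"vda": {"name": "vda", "path": "/img/a.qcow2"}}
snap_gone = {"name": "vdb", "path": "/img/b.qcow2"}  # deleted from the VM after the snapshot
print(validate_snapshot_disk(vm_disks, snap_gone))  # True, no crash
```

Without the `vmdisk is not None` guard, the removed-disk case is exactly the NULL dereference the patch fixes.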
Re: [PATCH v2 1/1] remote: properly initialize objects in ACL helpers
On 4/2/24 09:31, Peter Krempa wrote: On Tue, Mar 19, 2024 at 15:07:04 +0100, Denis V. Lunev wrote: Commit 2ecdf259299813c2c674377e22a0acbce5ccbbb2 was intended to implement two things: reduce stack usage inside ACL helpers and minimally initialize virDomainDef object to avoid passing garbage inside validation framework. Though original commit has not touched other ACL helpers. This patch adds proper clauses to remoteRelayNetworkEventCheckACL remoteRelayStoragePoolEventCheckACL remoteRelayNodeDeviceEventCheckACL remoteRelaySecretEventCheckACL Signed-off-by: Denis V. Lunev CC: Peter Krempa CC: Roman Grigoriev I'll drop the CC lines if you are okay with that as it doesn't seem to be needed to record that info in git. --- src/remote/remote_daemon_dispatch.c | 32 ++--- 1 file changed, 16 insertions(+), 16 deletions(-) Changes from v1: * g_autoptr is replaced with g_autofree upon reached consensus * patch 1 in series has been dropped Reviewed-by: Peter Krempa and I'll push it after the release is done later today. No prob at all, thanks! ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
Re: [PATCH v2 1/1] remote: properly initialize objects in ACL helpers
On 3/19/24 15:07, Denis V. Lunev wrote: Commit 2ecdf259299813c2c674377e22a0acbce5ccbbb2 was intended to implement two things: reduce stack usage inside ACL helpers and minimally initialize virDomainDef object to avoid passing garbage inside validation framework. Though original commit has not touched other ACL helpers. This patch adds proper clauses to remoteRelayNetworkEventCheckACL remoteRelayStoragePoolEventCheckACL remoteRelayNodeDeviceEventCheckACL remoteRelaySecretEventCheckACL Signed-off-by: Denis V. Lunev CC: Peter Krempa CC: Roman Grigoriev --- src/remote/remote_daemon_dispatch.c | 32 ++--- 1 file changed, 16 insertions(+), 16 deletions(-) Changes from v1: * g_autoptr is replaced with g_autofree upon reached consensus * patch 1 in series has been dropped diff --git a/src/remote/remote_daemon_dispatch.c b/src/remote/remote_daemon_dispatch.c index aaabd1e56c..b566a510b8 100644 --- a/src/remote/remote_daemon_dispatch.c +++ b/src/remote/remote_daemon_dispatch.c @@ -180,21 +180,21 @@ static bool remoteRelayNetworkEventCheckACL(virNetServerClient *client, virConnectPtr conn, virNetworkPtr net) { -virNetworkDef def; +g_autofree virNetworkDef *def = g_new0(virNetworkDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virNetworkDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. 
*/ -def.name = net->name; -memcpy(def.uuid, net->uuid, VIR_UUID_BUFLEN); +def->name = net->name; +memcpy(def->uuid, net->uuid, VIR_UUID_BUFLEN); if (!(identity = virNetServerClientGetIdentity(client))) goto cleanup; if (virIdentitySetCurrent(identity) < 0) goto cleanup; -ret = virConnectNetworkEventRegisterAnyCheckACL(conn, &def); +ret = virConnectNetworkEventRegisterAnyCheckACL(conn, def); cleanup: ignore_value(virIdentitySetCurrent(NULL)); @@ -206,21 +206,21 @@ remoteRelayStoragePoolEventCheckACL(virNetServerClient *client, virConnectPtr conn, virStoragePoolPtr pool) { -virStoragePoolDef def; +g_autofree virStoragePoolDef *def = g_new0(virStoragePoolDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virStoragePoolDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. */ -def.name = pool->name; -memcpy(def.uuid, pool->uuid, VIR_UUID_BUFLEN); +def->name = pool->name; +memcpy(def->uuid, pool->uuid, VIR_UUID_BUFLEN); if (!(identity = virNetServerClientGetIdentity(client))) goto cleanup; if (virIdentitySetCurrent(identity) < 0) goto cleanup; -ret = virConnectStoragePoolEventRegisterAnyCheckACL(conn, &def); +ret = virConnectStoragePoolEventRegisterAnyCheckACL(conn, def); cleanup: ignore_value(virIdentitySetCurrent(NULL)); @@ -232,20 +232,20 @@ remoteRelayNodeDeviceEventCheckACL(virNetServerClient *client, virConnectPtr conn, virNodeDevicePtr dev) { -virNodeDeviceDef def; +g_autofree virNodeDeviceDef *def = g_new0(virNodeDeviceDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virNodeDeviceDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. 
*/ -def.name = dev->name; if (!(identity = virNetServerClientGetIdentity(client))) goto cleanup; if (virIdentitySetCurrent(identity) < 0) goto cleanup; -ret = virConnectNodeDeviceEventRegisterAnyCheckACL(conn, &def); +ret = virConnectNodeDeviceEventRegisterAnyCheckACL(conn, def); cleanup: ignore_value(virIdentitySetCurrent(NULL)); @@ -257,22 +257,22 @@ remoteRelaySecretEventCheckACL(virNetServerClient *client, virConnectPtr conn, virSecretPtr secret) { -virSecretDef def; +g_autofree virSecretDef *def = g_new0(virSecretDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virSecretDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. */ -memcpy(def.uuid, secret->uuid, VIR_UUID_BUFLEN); -def.usage_type = secret->usageType; -def.usage_id = secret->usageID;
[PATCH v2 1/1] remote: properly initialize objects in ACL helpers
Commit 2ecdf259299813c2c674377e22a0acbce5ccbbb2 was intended to implement two things: reduce stack usage inside ACL helpers and minimally initialize the virDomainDef object to avoid passing garbage into the validation framework. However, the original commit did not touch the other ACL helpers. This patch adds the proper clauses to
  remoteRelayNetworkEventCheckACL
  remoteRelayStoragePoolEventCheckACL
  remoteRelayNodeDeviceEventCheckACL
  remoteRelaySecretEventCheckACL

Signed-off-by: Denis V. Lunev
CC: Peter Krempa
CC: Roman Grigoriev
---
 src/remote/remote_daemon_dispatch.c | 32 ++---
 1 file changed, 16 insertions(+), 16 deletions(-)

Changes from v1:
* g_autoptr is replaced with g_autofree upon reached consensus
* patch 1 in the series has been dropped

diff --git a/src/remote/remote_daemon_dispatch.c b/src/remote/remote_daemon_dispatch.c
index aaabd1e56c..b566a510b8 100644
--- a/src/remote/remote_daemon_dispatch.c
+++ b/src/remote/remote_daemon_dispatch.c
@@ -180,21 +180,21 @@ static bool
 remoteRelayNetworkEventCheckACL(virNetServerClient *client,
                                 virConnectPtr conn,
                                 virNetworkPtr net)
 {
-    virNetworkDef def;
+    g_autofree virNetworkDef *def = g_new0(virNetworkDef, 1);
     g_autoptr(virIdentity) identity = NULL;
     bool ret = false;
 
     /* For now, we just create a virNetworkDef with enough contents to
      * satisfy what viraccessdriverpolkit.c references. This is a bit
      * fragile, but I don't know of anything better.
      */
-    def.name = net->name;
-    memcpy(def.uuid, net->uuid, VIR_UUID_BUFLEN);
+    def->name = net->name;
+    memcpy(def->uuid, net->uuid, VIR_UUID_BUFLEN);
 
     if (!(identity = virNetServerClientGetIdentity(client)))
         goto cleanup;
     if (virIdentitySetCurrent(identity) < 0)
         goto cleanup;
 
-    ret = virConnectNetworkEventRegisterAnyCheckACL(conn, &def);
+    ret = virConnectNetworkEventRegisterAnyCheckACL(conn, def);
 
  cleanup:
     ignore_value(virIdentitySetCurrent(NULL));
@@ -206,21 +206,21 @@
 remoteRelayStoragePoolEventCheckACL(virNetServerClient *client,
                                     virConnectPtr conn,
                                     virStoragePoolPtr pool)
 {
-    virStoragePoolDef def;
+    g_autofree virStoragePoolDef *def = g_new0(virStoragePoolDef, 1);
     g_autoptr(virIdentity) identity = NULL;
     bool ret = false;
 
     /* For now, we just create a virStoragePoolDef with enough contents to
      * satisfy what viraccessdriverpolkit.c references. This is a bit
      * fragile, but I don't know of anything better.
      */
-    def.name = pool->name;
-    memcpy(def.uuid, pool->uuid, VIR_UUID_BUFLEN);
+    def->name = pool->name;
+    memcpy(def->uuid, pool->uuid, VIR_UUID_BUFLEN);
 
     if (!(identity = virNetServerClientGetIdentity(client)))
         goto cleanup;
     if (virIdentitySetCurrent(identity) < 0)
         goto cleanup;
 
-    ret = virConnectStoragePoolEventRegisterAnyCheckACL(conn, &def);
+    ret = virConnectStoragePoolEventRegisterAnyCheckACL(conn, def);
 
  cleanup:
     ignore_value(virIdentitySetCurrent(NULL));
@@ -232,20 +232,20 @@
 remoteRelayNodeDeviceEventCheckACL(virNetServerClient *client,
                                    virConnectPtr conn,
                                    virNodeDevicePtr dev)
 {
-    virNodeDeviceDef def;
+    g_autofree virNodeDeviceDef *def = g_new0(virNodeDeviceDef, 1);
     g_autoptr(virIdentity) identity = NULL;
     bool ret = false;
 
     /* For now, we just create a virNodeDeviceDef with enough contents to
      * satisfy what viraccessdriverpolkit.c references. This is a bit
      * fragile, but I don't know of anything better.
      */
-    def.name = dev->name;
+    def->name = dev->name;
 
     if (!(identity = virNetServerClientGetIdentity(client)))
         goto cleanup;
     if (virIdentitySetCurrent(identity) < 0)
         goto cleanup;
 
-    ret = virConnectNodeDeviceEventRegisterAnyCheckACL(conn, &def);
+    ret = virConnectNodeDeviceEventRegisterAnyCheckACL(conn, def);
 
  cleanup:
     ignore_value(virIdentitySetCurrent(NULL));
@@ -257,22 +257,22 @@
 remoteRelaySecretEventCheckACL(virNetServerClient *client,
                                virConnectPtr conn,
                                virSecretPtr secret)
 {
-    virSecretDef def;
+    g_autofree virSecretDef *def = g_new0(virSecretDef, 1);
     g_autoptr(virIdentity) identity = NULL;
     bool ret = false;
 
     /* For now, we just create a virSecretDef with enough contents to
      * satisfy what viraccessdriverpolkit.c references. This is a bit
      * fragile, but I don't know of anything better.
      */
-    memcpy(def.uuid, secret->uuid, VIR_UUID_BUFLEN);
-    def.usage_type = secret->usageType;
-    def.usage_id = secret->usageID;
+    memcpy(def->uuid, secret->uuid, VIR_UUID_BUFLEN);
+    def->usage_type = secret->usageType;
+    def->usage_id = secret->usageID;
*/ -def.name = dev->name; +def->name = dev->name; if (!(identity = virNetServerClientGetIdentity(client))) goto cleanup; if (virIdentitySetCurrent(identity) < 0) goto cleanup; -ret = virConnectNodeDeviceEventRegisterAnyCheckACL(conn, &def); +ret = virConnectNodeDeviceEventRegisterAnyCheckACL(conn, def); cleanup: ignore_value(virIdentitySetCurrent(NULL)); @@ -257,22 +257,22 @@ remoteRelaySecretEventCheckACL(virNetServerClient *client, virConnectPtr conn, virSecretPtr secret) { -virSecretDef def; +g_autofree virSecretDef *def = g_new0(virSecretDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virSecretDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. */ -memcpy(def.uuid, secret->uuid, VIR_UUID_BUFLEN); -def.usage_type = secret->usageType; -def.usage_id = secret->usageID; +memcpy(def->uuid, secret->uuid, VIR_UUID_BUFLEN); +def->usage_type = secret->usageType; +def
Re: [PATCH 1/1] gitignore: add cscope files to git ignore
On 3/19/24 11:02, Martin Kletzander wrote: On Mon, Mar 18, 2024 at 09:35:32PM +0100, Denis V. Lunev wrote: On 3/18/24 15:25, Martin Kletzander wrote: On Mon, Mar 18, 2024 at 02:25:53PM +0100, Pavel Hrdina wrote: On Mon, Mar 18, 2024 at 01:31:27PM +0100, Peter Krempa wrote: On Mon, Mar 18, 2024 at 13:14:54 +0100, Michal Prívozník wrote: > On 3/18/24 12:45, Denis V. Lunev wrote: > > On 3/18/24 11:42, Michal Prívozník wrote: > >> On 3/17/24 16:00, Denis V. Lunev wrote: > >>> Signed-off-by: Denis V. Lunev > >>> --- > >>> .gitignore | 1 + > >>> 1 file changed, 1 insertion(+) > >>> > >>> diff --git a/.gitignore b/.gitignore > >>> index 4695391342..44a9b446bd 100644 > >>> --- a/.gitignore > >>> +++ b/.gitignore > >>> @@ -20,6 +20,7 @@ __pycache__/ > >>> /build/ > >>> /ci/scratch/ > >>> tags > >>> +cscope.* > >>> # clangd related ignores > >>> .clangd > >> Apparently, at some point in time this was here, but it was removed: > >> > >> https://gitlab.com/libvirt/libvirt/-/commit/8a63add283c310952b7df161b4b413e817468d01 > >> > >> Michal > >> > > This is quite strange for me: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/.gitignore > > https://gitlab.com/qemu-project/qemu/-/blob/master/.gitignore?ref_type=heads > > > > I do not see any obvious reason how big and extensive > > list of files in .gitignore could hurt us while it > > is obviously convenient. > It might be that once there is too much of it there might be some false positives and it is harder to maintain. > Yeah, it feels a bit selective. I mean, we allow vim swap files to be in > .gitignore, we allow tags file to be there too (which strictly speaking > is a tooling helper file), but not cscope? The reasoning to keep 'tags' ignored is because we ship .ctags, but generally tooling files should be ignored by "personal" .gitignores as we can't cover all of it especially if we don't configure it. 
Agreed, I see we have ignores for vim and emacs, but honestly these could be removed as well because everyone can just edit `.config/git/ignore` and put the files created by their favorite editor into that file and it will work for every project you participate in. Pavel > Martin, do you remember the reasoning there? No, not really, it was most probably what others said above. I don't remember having issues with those ignores, it might've been an end to some discussion we had, not sure. Even though I don't have issues with tool-specific gitignores being there I copied all the non-libvirt related ignores from our .gitignore to my existing ~/.config/git/ignore and I know I'll be fine in other projects as well. Maybe we were trying to help others learn this "trick" to help them in other repositories. As I said I don't have problems with that ^^, I thought the line could be based on what we have configs or targets for, for example. But just out of curiosity I have a few questions below. For me this is not looking like a reasonable option. Rationale: * I am working on tens of different hosts in the corporate environment and building libvirt not only locally but on those hosts too Are you building it with some script or a long command line that these files get generated for you? The default build for me does not generate those. cscope files appear once I start working, in the same way as tags. I need them to navigate the sources. They do not have any connection to the build procedure. That is why they are special and fall into the same category as tags. Or do you need those tags to browse the code? That would suggest you want some configuration setup on those machines as well. As above. They are heavily used to navigate the sources and unfortunately, I do not have the influence to bring a default .gitignore into the user defaults on those systems. 
Den * This is unavoidable, as there are always several different versions involved. With the proposed way, I would have to copy my personal settings from the root node to all working nodes EACH time. This looks like a personal pain to me. I would say, once again, that other projects are going a different way and the chosen approach does not fit my scenario. Den ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
Re: [PATCH 1/1] gitignore: add cscope files to git ignore
On 3/18/24 15:25, Martin Kletzander wrote: On Mon, Mar 18, 2024 at 02:25:53PM +0100, Pavel Hrdina wrote: On Mon, Mar 18, 2024 at 01:31:27PM +0100, Peter Krempa wrote: On Mon, Mar 18, 2024 at 13:14:54 +0100, Michal Prívozník wrote: > On 3/18/24 12:45, Denis V. Lunev wrote: > > On 3/18/24 11:42, Michal Prívozník wrote: > >> On 3/17/24 16:00, Denis V. Lunev wrote: > >>> Signed-off-by: Denis V. Lunev > >>> --- > >>> .gitignore | 1 + > >>> 1 file changed, 1 insertion(+) > >>> > >>> diff --git a/.gitignore b/.gitignore > >>> index 4695391342..44a9b446bd 100644 > >>> --- a/.gitignore > >>> +++ b/.gitignore > >>> @@ -20,6 +20,7 @@ __pycache__/ > >>> /build/ > >>> /ci/scratch/ > >>> tags > >>> +cscope.* > >>> # clangd related ignores > >>> .clangd > >> Apparently, at some point in time this was here, but it was removed: > >> > >> https://gitlab.com/libvirt/libvirt/-/commit/8a63add283c310952b7df161b4b413e817468d01 > >> > >> Michal > >> > > This is quite strange for me: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/.gitignore > > https://gitlab.com/qemu-project/qemu/-/blob/master/.gitignore?ref_type=heads > > > > I do not see any obvious reason how big and extensive > > list of files in .gitignore could hurt us while it > > is obviously convenient. > It might be that once there is too much of it there might be some false positives and it is harder to maintain. > Yeah, it feels a bit selective. I mean, we allow vim swap files to be in > .gitignore, we allow tags file to be there too (which strictly speaking > is a tooling helper file), but not cscope? The reasoning to keep 'tags' ignored is because we ship .ctags, but generally tooling files should be ignored by "personal" .gitignores as we can't cover all of it especially if we don't configure it. 
Agreed, I see we have ignores for vim and emacs, but honestly these could be removed as well because everyone can just edit `.config/git/ignore` and put the files created by their favorite editor into that file and it will work for every project you participate in. Pavel > Martin, do you remember the reasoning there? No, not really, it was most probably what others said above. I don't remember having issues with those ignores, it might've been an end to some discussion we had, not sure. Even though I don't have issues with tool-specific gitignores being there I copied all the non-libvirt related ignores from our .gitignore to my existing ~/.config/git/ignore and I know I'll be fine in other projects as well. Maybe we were trying to help others learn this "trick" to help them in other repositories. For me this is not looking like a reasonable option. Rationale: * I am working on tens of different hosts in the corporate environment and building libvirt not only locally but on those hosts too * This is unavoidable, as there are always several different versions involved. With the proposed way, I would have to copy my personal settings from the root node to all working nodes EACH time. This looks like a personal pain to me. I would say, once again, that other projects are going a different way and the chosen approach does not fit my scenario. Den ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
Re: [PATCH 1/1] gitignore: add cscope files to git ignore
On 3/18/24 11:42, Michal Prívozník wrote: On 3/17/24 16:00, Denis V. Lunev wrote: Signed-off-by: Denis V. Lunev --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 4695391342..44a9b446bd 100644 --- a/.gitignore +++ b/.gitignore @@ -20,6 +20,7 @@ __pycache__/ /build/ /ci/scratch/ tags +cscope.* # clangd related ignores .clangd Apparently, at some point in time this was here, but it was removed: https://gitlab.com/libvirt/libvirt/-/commit/8a63add283c310952b7df161b4b413e817468d01 Michal This is quite strange to me: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/.gitignore https://gitlab.com/qemu-project/qemu/-/blob/master/.gitignore?ref_type=heads I do not see any obvious reason how a big and extensive list of files in .gitignore could hurt us, while it is obviously convenient. Den ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
[PATCH 1/2] remote: cleanup properly virDomainDef in ACL helpers
Technically commit 2ecdf259299813c2c674377e22a0acbce5ccbbb2 does not really introduce a leak, but it is incorrect ideologically. No function that accepts a non-const pointer to virDomainDef provides any guarantee that the object will not be modified inside. Thus, keeping the object model in mind, we must ensure that virDomainDefFree is called over the virDomainDef object as a destructor. In order to achieve this we should change the pointer declaration inside remoteRelayDomainEventCheckACL remoteRelayDomainQemuMonitorEventCheckACL and assign def->name via strdup. Fixes: 2ecdf259299813c2c674377e22a0acbce5ccbbb2 Signed-off-by: Denis V. Lunev CC: Peter Krempa CC: Roman Grigoriev --- src/remote/remote_daemon_dispatch.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/remote/remote_daemon_dispatch.c b/src/remote/remote_daemon_dispatch.c index aaabd1e56c..3172a632df 100644 --- a/src/remote/remote_daemon_dispatch.c +++ b/src/remote/remote_daemon_dispatch.c @@ -154,14 +154,14 @@ static bool remoteRelayDomainEventCheckACL(virNetServerClient *client, virConnectPtr conn, virDomainPtr dom) { -g_autofree virDomainDef *def = g_new0(virDomainDef, 1); +g_autoptr(virDomainDef) def = g_new0(virDomainDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virDomainDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. 
*/ -def->name = dom->name; +def->name = g_strdup(dom->name); memcpy(def->uuid, dom->uuid, VIR_UUID_BUFLEN); if (!(identity = virNetServerClientGetIdentity(client))) @@ -283,14 +283,14 @@ static bool remoteRelayDomainQemuMonitorEventCheckACL(virNetServerClient *client, virConnectPtr conn, virDomainPtr dom) { -g_autofree virDomainDef *def = g_new0(virDomainDef, 1); +g_autoptr(virDomainDef) def = g_new0(virDomainDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virDomainDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. */ -def->name = dom->name; +def->name = g_strdup(dom->name); memcpy(def->uuid, dom->uuid, VIR_UUID_BUFLEN); if (!(identity = virNetServerClientGetIdentity(client))) -- 2.40.1 ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
[PATCH 2/2] remote: properly initialize objects in ACL helpers
Commit 2ecdf259299813c2c674377e22a0acbce5ccbbb2 is ideologically correct, but unfortunately incomplete. There are other similar objects in the module which are also used without proper initialization. This patch adds proper clauses to remoteRelayNetworkEventCheckACL remoteRelayStoragePoolEventCheckACL remoteRelayNodeDeviceEventCheckACL remoteRelaySecretEventCheckACL Signed-off-by: Denis V. Lunev CC: Peter Krempa CC: Roman Grigoriev --- src/remote/remote_daemon_dispatch.c | 32 ++--- 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/src/remote/remote_daemon_dispatch.c b/src/remote/remote_daemon_dispatch.c index 3172a632df..01f97bb345 100644 --- a/src/remote/remote_daemon_dispatch.c +++ b/src/remote/remote_daemon_dispatch.c @@ -180,21 +180,21 @@ static bool remoteRelayNetworkEventCheckACL(virNetServerClient *client, virConnectPtr conn, virNetworkPtr net) { -virNetworkDef def; +g_autoptr(virNetworkDef) def = g_new0(virNetworkDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virNetworkDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. 
*/ -def.name = net->name; -memcpy(def.uuid, net->uuid, VIR_UUID_BUFLEN); +def->name = g_strdup(net->name); +memcpy(def->uuid, net->uuid, VIR_UUID_BUFLEN); if (!(identity = virNetServerClientGetIdentity(client))) goto cleanup; if (virIdentitySetCurrent(identity) < 0) goto cleanup; -ret = virConnectNetworkEventRegisterAnyCheckACL(conn, &def); +ret = virConnectNetworkEventRegisterAnyCheckACL(conn, def); cleanup: ignore_value(virIdentitySetCurrent(NULL)); @@ -206,21 +206,21 @@ remoteRelayStoragePoolEventCheckACL(virNetServerClient *client, virConnectPtr conn, virStoragePoolPtr pool) { -virStoragePoolDef def; +g_autoptr(virStoragePoolDef) def = g_new0(virStoragePoolDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virStoragePoolDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. */ -def.name = pool->name; -memcpy(def.uuid, pool->uuid, VIR_UUID_BUFLEN); +def->name = g_strdup(pool->name); +memcpy(def->uuid, pool->uuid, VIR_UUID_BUFLEN); if (!(identity = virNetServerClientGetIdentity(client))) goto cleanup; if (virIdentitySetCurrent(identity) < 0) goto cleanup; -ret = virConnectStoragePoolEventRegisterAnyCheckACL(conn, &def); +ret = virConnectStoragePoolEventRegisterAnyCheckACL(conn, def); cleanup: ignore_value(virIdentitySetCurrent(NULL)); @@ -232,20 +232,20 @@ remoteRelayNodeDeviceEventCheckACL(virNetServerClient *client, virConnectPtr conn, virNodeDevicePtr dev) { -virNodeDeviceDef def; +g_autoptr(virNodeDeviceDef) def = g_new0(virNodeDeviceDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virNodeDeviceDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. 
*/ -def.name = dev->name; +def->name = g_strdup(dev->name); if (!(identity = virNetServerClientGetIdentity(client))) goto cleanup; if (virIdentitySetCurrent(identity) < 0) goto cleanup; -ret = virConnectNodeDeviceEventRegisterAnyCheckACL(conn, &def); +ret = virConnectNodeDeviceEventRegisterAnyCheckACL(conn, def); cleanup: ignore_value(virIdentitySetCurrent(NULL)); @@ -257,22 +257,22 @@ remoteRelaySecretEventCheckACL(virNetServerClient *client, virConnectPtr conn, virSecretPtr secret) { -virSecretDef def; +g_autoptr(virSecretDef) def = g_new0(virSecretDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virSecretDef with enough contents to * satisfy what viraccessdriverpolkit.c references. This is a bit * fragile, but I don't know of anything better. */ -memcpy(def.uuid, secret->uuid, VIR_UUID_BUFLEN); -def.usage_type = secret->usageType; -def.usage_id = secret->usageID; +memcpy(def->uuid, secret->uuid, VIR_UUID_BUFLEN); +def->usage_type = secret->usageType; +def->usage_id = secret->usageID; if (!(identity = virNetServerClientGetIdentity(client))) goto cleanup; if (virIdentitySetCurren
[PATCH 0/2] remote: fix object initialization in ACL helpers
Signed-off-by: Denis V. Lunev CC: Peter Krempa CC: Roman Grigoriev ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
[PATCH 1/1] gitignore: add cscope files to git ignore
Signed-off-by: Denis V. Lunev --- .gitignore | 1 + 1 file changed, 1 insertion(+) diff --git a/.gitignore b/.gitignore index 4695391342..44a9b446bd 100644 --- a/.gitignore +++ b/.gitignore @@ -20,6 +20,7 @@ __pycache__/ /build/ /ci/scratch/ tags +cscope.* # clangd related ignores .clangd -- 2.40.1 ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
Re: [PATCH 2/2] remote: properly initialize objects in ACL helpers
On 3/18/24 09:56, Peter Krempa wrote: FYI: Gmail decided that your whole series is spam. I'm not sure whether it's just gmail's spam filter being silly or your mail infra has something wrong, but just so you know. On Sun, Mar 17, 2024 at 18:08:50 +0100, Denis V. Lunev wrote: Commit 2ecdf259299813c2c674377e22a0acbce5ccbbb2 is ideologically correct, but unfortunately incomplete. There are other similar objects in the module which are also used without proper initialization. The commit you mention was justifying the change from stack-allocated to heap-allocated in order to decrease the stack frame size, which was too big due to the fact that 'virDomainDef' is too big. With the other structs that isn't that much of a problem, so I don't think the justification of "Commit 2ecdf259299813 being incomplete" is relevant and thus please add your own justification or adapt what I wrote there as a separate justification, but IMO commit 2ecdf259299813 is not incomplete based on what it tried to do. The original commit has 2 motivations: * reduce stack size * properly initialize the object Right now we pass a half-initialized object into the engine. We should either memset() or allocate/free the object. I thought that allocation is better as we could have the object filled with more data inside validators. Den ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
Re: [PATCH 1/2] remote: cleanup properly virDomainDef in ACL helpers
On 3/18/24 09:37, Peter Krempa wrote: On Sun, Mar 17, 2024 at 18:08:49 +0100, Denis V. Lunev wrote: Technically commit 2ecdf259299813c2c674377e22a0acbce5ccbbb2 does not really introduces a leak, but it is incorrect ideologically. Neither function accepting non-const pointer to virDomainDef does not provide any warrantee that the object will not be improved inside. I don't quite understand what the second sentence is supposed to mean, can you please elaborate the justification? Thus, keeping object model in mind, we must ensure that virDomainDefFree is called over virDomainDef object as a destructor. Using the object model as argument would require that you also use 'virDomainDefNew' as "constructor", which IMO in this case would be overkill same as using virDomainDefFree. In order to achieve this we should change pointer declaration inside remoteRelayDomainEventCheckACL remoteRelayDomainQemuMonitorEventCheckACL and assign def->name via strdup. Fixes: 2ecdf259299813c2c674377e22a0acbce5ccbbb2 The commit message doesn't really clarify what the actual problem is that this patch is fixing, which is needed in case you are mentioning that some commit is broken/being fixed. Signed-off-by: Denis V. Lunev CC: Peter Krempa CC: Roman Grigoriev --- src/remote/remote_daemon_dispatch.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/remote/remote_daemon_dispatch.c b/src/remote/remote_daemon_dispatch.c index aaabd1e56c..3172a632df 100644 --- a/src/remote/remote_daemon_dispatch.c +++ b/src/remote/remote_daemon_dispatch.c @@ -154,14 +154,14 @@ static bool remoteRelayDomainEventCheckACL(virNetServerClient *client, virConnectPtr conn, virDomainPtr dom) { -g_autofree virDomainDef *def = g_new0(virDomainDef, 1); +g_autoptr(virDomainDef) def = g_new0(virDomainDef, 1); g_autoptr(virIdentity) identity = NULL; bool ret = false; /* For now, we just create a virDomainDef with enough contents to * satisfy what viraccessdriverpolkit.c references. 
This is a bit * fragile, but I don't know of anything better. */ -def->name = dom->name; +def->name = g_strdup(dom->name); memcpy(def->uuid, dom->uuid, VIR_UUID_BUFLEN); Based on the CC-list I assume that there's a false positive from the static analysis tool which we've got multiple fixes/patches for recently. Nope. The code is written based on manual code analysis. Sorry that the commit message is not good enough. In such case I think it should be enough to explicitly clear def->name after use, before freeing, to show that we intend to just borrow the value, without going overkill on fully allocating and freeing the domain object. Alternatively a warning comment can be added too. Either way, please describe the reason for the patch in the commit message as requested above. Please take a look at the attached disassembly of remoteRelayDomainEventCheckACL. The key moment is that in the cleanup section we call g_autoptr_cleanup_generic_gfree to clean up the def pointer. That is not bad, but it is fragile. Right now this does not induce any leak, but once we change something here (adding more data), or the plugins validating ACLs do something with the object (potentially they can), we will have leaks of the dependent pointers. As far as the object model goes, if we call a constructor, we should call the destructor, namely glib_autoptr_cleanup_virDomainDef. That is the whole motivation behind this patch. 
Den

[Attachment: objdump disassembly of remoteRelayDomainEventCheckACL, truncated in the archive. The visible prologue shows the def object being allocated via a call to g_malloc0_n (the g_new0() expansion) at offset 0x2674e.]
Re: [PATCH] rpc: fix race in waking up client event loop
On 12/18/23 13:23, Daniel P. Berrangé wrote: The first thread to issue a client RPC request will own the event loop execution, sitting in the virNetClientIOEventLoop function. It releases the client lock while running: virNetClientUnlock() g_main_loop_run() virNetClientLock() If a second thread arrives with an RPC request, it will queue it for the first thread to process. To inform the first thread that there's a new request it calls g_main_loop_quit() to break it out of the main loop. This works if the first thread is in g_main_loop_run() at that time. There is a small window of opportunity, however, where the first thread has released the client lock, but not yet got into g_main_loop_run(). If that happens, the wakeup from the second thread is lost. This patch deals with that by changing the way the wakeup is performed. Instead of directly calling g_main_loop_quit(), the second thread creates an idle source to run the quit function from within the first thread. This guarantees that the first thread will see the wakeup. Signed-off-by: Daniel P. 
Berrangé --- src/rpc/virnetclient.c | 13 - 1 file changed, 12 insertions(+), 1 deletion(-) diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index 4ab8af68c5..68098b1c8d 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -1848,6 +1848,15 @@ static void virNetClientIOUpdateCallback(virNetClient *client, } +static gboolean virNetClientIOWakeup(gpointer opaque) +{ +GMainLoop *loop = opaque; + +g_main_loop_quit(loop); + +return G_SOURCE_REMOVE; +} + /* * This function sends a message to remote server and awaits a reply * @@ -1925,7 +1934,9 @@ static int virNetClientIO(virNetClient *client, /* Check to see if another thread is dispatching */ if (client->haveTheBuck) { /* Force other thread to wakeup from poll */ -g_main_loop_quit(client->eventLoop); +GSource *wakeup = g_idle_source_new(); +g_source_set_callback(wakeup, virNetClientIOWakeup, client->eventLoop, NULL); +g_source_attach(wakeup, client->eventCtx); /* If we are non-blocking, detach the thread and keep the call in the * queue. */ Reviewed-by: Denis V. Lunev ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
[PATCH 1/3] rpc: create virNetClientIOWakeup helper
This functionality is to be extended; a simple call to g_main_loop_quit() will not be enough. In order to make the coming changes convenient, a helper is required. Signed-off-by: Denis V. Lunev --- src/rpc/virnetclient.c | 18 -- 1 file changed, 12 insertions(+), 6 deletions(-) diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c index d29917df27..d9a508246f 100644 --- a/src/rpc/virnetclient.c +++ b/src/rpc/virnetclient.c @@ -150,6 +150,13 @@ void virNetClientSetCloseCallback(virNetClient *client, } +static void +virNetClientIOWakeup(virNetClient *client) +{ +g_main_loop_quit(client->eventLoop); +} + + static void virNetClientIncomingEvent(virNetSocket *sock, int events, void *opaque); @@ -851,7 +858,7 @@ static void virNetClientCloseInternal(virNetClient *client, * queue and close the client because we set client->wantClose. */ if (client->haveTheBuck) { -g_main_loop_quit(client->eventLoop); +virNetClientIOWakeup(client); } else { virNetClientIOEventLoopPassTheBuck(client, NULL); } @@ -918,7 +925,7 @@ virNetClientIOEventTLS(int fd G_GNUC_UNUSED, virNetClient *client = opaque; if (!virNetClientTLSHandshake(client)) -g_main_loop_quit(client->eventLoop); +virNetClientIOWakeup(client); return G_SOURCE_REMOVE; } @@ -931,7 +938,7 @@ virNetClientIOEventTLSConfirm(int fd G_GNUC_UNUSED, { virNetClient *client = opaque; -g_main_loop_quit(client->eventLoop); +virNetClientIOWakeup(client); return G_SOURCE_REMOVE; } @@ -1675,7 +1682,7 @@ virNetClientIOEventFD(int fd G_GNUC_UNUSED, { struct virNetClientIOEventData *data = opaque; data->rev = ev; -g_main_loop_quit(data->client->eventLoop); +virNetClientIOWakeup(data->client); return G_SOURCE_REMOVE; } @@ -2006,8 +2013,7 @@ static int virNetClientIO(virNetClient *client, /* Check to see if another thread is dispatching */ if (client->haveTheBuck) { -/* Force other thread to wakeup from poll */ -g_main_loop_quit(client->eventLoop); +virNetClientIOWakeup(client); /* If we are non-blocking, detach the thread and keep the call in the * queue. 
*/ -- 2.34.1 ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
[PATCH 2/3] rpc: mark GSourceFuncs functions in vireventglibwatch.c as static
They are not exported from the module and thus should be static. Signed-off-by: Denis V. Lunev --- src/util/vireventglibwatch.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/src/util/vireventglibwatch.c b/src/util/vireventglibwatch.c index b7f3a8786a..b21e505731 100644 --- a/src/util/vireventglibwatch.c +++ b/src/util/vireventglibwatch.c @@ -71,7 +71,7 @@ virEventGLibFDSourceFinalize(GSource *source G_GNUC_UNUSED) } -GSourceFuncs virEventGLibFDSourceFuncs = { +static GSourceFuncs virEventGLibFDSourceFuncs = { .prepare = virEventGLibFDSourcePrepare, .check = virEventGLibFDSourceCheck, .dispatch = virEventGLibFDSourceDispatch, @@ -194,7 +194,7 @@ virEventGLibSocketSourceFinalize(GSource *source) } -GSourceFuncs virEventGLibSocketSourceFuncs = { +static GSourceFuncs virEventGLibSocketSourceFuncs = { .prepare = virEventGLibSocketSourcePrepare, .check = virEventGLibSocketSourceCheck, .dispatch = virEventGLibSocketSourceDispatch, -- 2.34.1 ___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
[PATCH 3/3] rpc: handle notifications properly
The RPC client implementation uses the following paradigm. The critical section is organized via virObjectLock(client)/virObjectUnlock(client) braces. This is potentially problematic:

    main thread:                          side thread:
    virObjectUnlock(client);
                                          virObjectLock(client);
                                          g_main_loop_quit(client->eventLoop);
                                          virObjectUnlock(client);
    g_main_loop_run(client->eventLoop);

This means in particular that if the main thread is executing a very long request, like a VM migration, the wakeup from the side thread can be stuck until the main request is fully completed. For us this means that Nova was reported as inaccessible. Anyway, this case is easily reproducible with simple Python scripts doing slow and fast requests in parallel from two different threads. This problem was introduced with the rework for dropping gnulib in the following commit: commit 7d4350bcac251bab2ecf85bd19eb1181db87fd07 Author: Daniel P. Berrangé Date: Thu Jan 16 11:21:44 2020 + rpc: convert RPC client to use GMainLoop instead of poll The cure is to revert back to the old roots and use a file descriptor for wakeup notifications. The code as written is well suited for Linux, while it is completely unclear how it will behave on Windows. Signed-off-by: Denis V. 
Lunev
---
 src/rpc/virnetclient.c | 39 ++-
 1 file changed, 38 insertions(+), 1 deletion(-)

diff --git a/src/rpc/virnetclient.c b/src/rpc/virnetclient.c
index d9a508246f..7fff5a7017 100644
--- a/src/rpc/virnetclient.c
+++ b/src/rpc/virnetclient.c
@@ -90,6 +90,7 @@ struct _virNetClient {
     GMainLoop *eventLoop;
     GMainContext *eventCtx;
+    int pipeLoopNotify[2];
 
     /*
      * List of calls currently waiting for dispatch
@@ -150,10 +151,25 @@ void virNetClientSetCloseCallback(virNetClient *client,
 }
 
 
+static gboolean
+virNetClientIOEventNotify(int fd G_GNUC_UNUSED,
+                          GIOCondition ev G_GNUC_UNUSED,
+                          gpointer opaque)
+{
+    virNetClient *client = opaque;
+    char buf[1024];
+
+    while (saferead(client->pipeLoopNotify[0], buf, sizeof(buf)) > 0);
+    g_main_loop_quit(client->eventLoop);
+
+    return G_SOURCE_CONTINUE;
+}
+
 static void virNetClientIOWakeup(virNetClient *client)
 {
-    g_main_loop_quit(client->eventLoop);
+    char c = 0;
+    ignore_value(safewrite(client->pipeLoopNotify[1], &c, sizeof(c)));
 }
 
 
@@ -305,10 +321,15 @@ static virNetClient *virNetClientNew(virNetSocket *sock,
                                      const char *hostname)
 {
     virNetClient *client = NULL;
+    int pipenotify[2] = {-1, -1};
+    g_autoptr(GSource) G_GNUC_UNUSED source = NULL;
 
     if (virNetClientInitialize() < 0)
         goto error;
 
+    if (virPipeNonBlock(pipenotify) < 0)
+        goto error;
+
     if (!(client = virObjectLockableNew(virNetClientClass)))
         goto error;
 
@@ -320,12 +341,25 @@ static virNetClient *virNetClientNew(virNetSocket *sock,
 
     client->hostname = g_strdup(hostname);
 
+    client->pipeLoopNotify[0] = pipenotify[0];
+    client->pipeLoopNotify[1] = pipenotify[1];
+
+    /* FIXME: good for Unix, buggy for Windows */
+    source = virEventGLibAddSocketWatch(pipenotify[0],
+                                        G_IO_IN | G_IO_ERR | G_IO_HUP,
+                                        client->eventCtx,
+                                        virNetClientIOEventNotify,
+                                        client, NULL);
+
     PROBE(RPC_CLIENT_NEW,
           "client=%p sock=%p",
           client, client->sock);
     return client;
 
 error:
+    VIR_FORCE_CLOSE(pipenotify[0]);
+    VIR_FORCE_CLOSE(pipenotify[1]);
+
     virObjectUnref(client);
     virObjectUnref(sock);
     return NULL;
@@ -759,6 +793,9 @@ void
 virNetClientDispose(void *obj)
 
     g_main_loop_unref(client->eventLoop);
     g_main_context_unref(client->eventCtx);
 
+    VIR_FORCE_CLOSE(client->pipeLoopNotify[0]);
+    VIR_FORCE_CLOSE(client->pipeLoopNotify[1]);
+
     g_free(client->hostname);
 
     if (client->sock)
-- 
2.34.1

___ Devel mailing list -- devel@lists.libvirt.org To unsubscribe send an email to devel-le...@lists.libvirt.org
[PATCH v-pipe 0/3] alternative approach to the fix
This is rough but working code using pipes, sent for discussion of the approach. It was made against our downstream, which is based on libvirt 8.5. Signed-off-by: Denis V. Lunev
Re: [PATCH 3/3] rpc: Rework rpc notifications in main and side thread
On 12/18/23 12:03, Daniel P. Berrangé wrote:

On Mon, Dec 18, 2023 at 10:32:07AM +, Daniel P. Berrangé wrote:

On Fri, Dec 15, 2023 at 02:32:19AM +0800, Fima Shevrin wrote:

The RPC client implementation uses the following paradigm. The critical section is organized via virObjectLock(client)/virObjectUnlock(client) braces. Though this is potentially problematic:

    main thread:                         side thread:
    virObjectUnlock(client);
                                         virObjectLock(client);
                                         g_main_loop_quit(client->eventLoop);
                                         virObjectUnlock(client);
    g_main_loop_run(client->eventLoop);

This means in particular that if the main thread is executing a very long request, like a VM migration, the wakeup from the side thread can be stuck until the main request is fully completed.

Ah, I understand this now. The flag set by g_main_loop_quit() is ignored and overwritten by g_main_loop_run(), so the interruption from the side thread never happens.

Your approach to solving this is to delay the virObjectUnlock() call until g_main_loop_run() is running, which is clever, but I don't much like playing games with the mutex locking. We need a way to interrupt the main loop even if it hasn't started running yet. The previous implementation in libvirt used a pipe for this trick, effectively as a dummy event source that would be watched. I think we can do the same here, though without needing an actual pipe, which causes Windows portability problems. GLib already has an internal mechanism for breaking out of poll(), via its (private) GWakeup object, which encapsulates a pipe. We just need a way of triggering this mechanism. I believe this can be achieved using just an idle source, i.e.

    static gboolean dummy_cb(void *opaque) {

Oops, we would still need a g_main_loop_quit() call here.

I have sent a dirty (but working) series with a pipe approach so that we can come to a decision on which approach would be better - with a prepare callback or via pipes. I am not against either of them :) We just need to decide which one would be better.
Thank you in advance, Den
Re: [PATCH 3/3] rpc: Rework rpc notifications in main and side thread
On 12/15/23 16:07, Daniel P. Berrangé wrote:

On Fri, Dec 15, 2023 at 01:48:09PM +, Efim Shevrin wrote:

Hello, here are call traces with two threads generated by the python script:

// ==
[root@dandreev-vz-6-0-0-243-master libvirt]# gdb -p 288985
(gdb) t a a bt

Thread 2 (Thread 0x7f2112862640 (LWP 288986) "python3"):
#0  0x7f2121d4296f in poll () at /lib64/libc.so.6
#1  0x7f211444650c in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
#2  0x7f21143f1483 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#3  0x7f211406800b in virNetClientIOEventLoop () at /lib64/libvirt.so.0
#4  0x7f2114068a0a in virNetClientIO () at /lib64/libvirt.so.0
#5  0x7f21140692c1 in virNetClientSendInternal () at /lib64/libvirt.so.0
#6  0x7f211406936d in virNetClientSendWithReply () at /lib64/libvirt.so.0
#7  0x7f2114061a1d in virNetClientProgramCall () at /lib64/libvirt.so.0
#8  0x7f21140f79ac in callFull () at /lib64/libvirt.so.0
#9  0x7f21140f7a2f in call () at /lib64/libvirt.so.0
#10 0x7f21140d8435 in remoteDomainCreate () at /lib64/libvirt.so.0
#11 0x7f21141dd60c in virDomainCreate () at /lib64/libvirt.so.0
#12 0x7f21145c8114 in libvirt_virDomainCreate () at /usr/lib64/python3.9/site-packages/libvirtmod.cpython-39-x86_64-linux-gnu.so

Ok, this thread is the primary one responsible for I/O. It is waiting, while not holding any mutex.

Correct.

Thread 1 (Thread 0x7f21223cf740 (LWP 288985) "python3"):
#0  0x7f2121c9c39a in __futex_abstimed_wait_common () at /lib64/libc.so.6
#1  0x7f2121c9eba0 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libc.so.6
#2  0x7f2113f4982a in virCondWait () at /lib64/libvirt.so.0
#3  0x7f2113f1fee3 in virObjectWait () at /lib64/libvirt.so.0
#4  0x7f211406882a in virNetClientIO () at /lib64/libvirt.so.0
#5  0x7f21140692c1 in virNetClientSendInternal () at /lib64/libvirt.so.0
#6  0x7f211406936d in virNetClientSendWithReply () at /lib64/libvirt.so.0
#7  0x7f2114061a1d in virNetClientProgramCall () at /lib64/libvirt.so.0
#8  0x7f21140f79ac in callFull () at /lib64/libvirt.so.0
#9  0x7f21140f7a2f in call () at /lib64/libvirt.so.0
#10 0x7f21140f24eb in remoteNodeDeviceNumOfCaps () at /lib64/libvirt.so.0
#11 0x7f2114207a00 in virNodeDeviceNumOfCaps () at /lib64/libvirt.so.0

This is a second API call that has arrived and has queued its outgoing message, and is waiting either for the other thread to pass ownership of the socket to it, or for the other thread to provide its reply (whichever comes first). This is all totally normal, and working as expected, so I still don't see what the actual problem is?

No. This is not normal. The test is written in the following way:

1. One thread (2) has started a very long operation, e.g. VM migration.
2. Another thread (1) endlessly executes simple fast requests. Normally these requests are executed fast and we see progress in the log. Though there is a probability that thread (1) gets stuck until the request processed by thread (2) completes.
3. It should be noted that this is not a final "deadlock": once thread (2) completes its request, processing in thread (1) resumes normally.

Such behavior is a real pain. We detected it when OpenStack Nova (which works in this way) was reported as unavailable. We traced the problem down and came to the conclusion that in this case the problem is a missed wakeup.
In the debugger, if I call g_main_loop_quit(), thread (1) is unfrozen and requests start being processed normally.

We have observed this problem on libvirt 8.5 and compared the behavior against the ancient libvirt 5.6. The ancient one does *NOT* have this problem, so this looks like a regression. The difference between them is the wakeup method. In libvirt 5.6, thread (2) sleeps in poll() and is woken up by writing to a pipe. Thus a sequence like the following (called from the same thread, for simplicity):

    wakeup()
    waiting()

behaves differently in libvirt 5.6 and libvirt 8.5, and this causes the trouble.

// ===== just in case, here is the python script

[root@dandreev-vz-6-0-0-243-master ~]# cat a.py
import libvirt
import time
from threading import Thread

def startVM(connection, vm_name):
    try:
        # Find the virtual machine by name
        print('starting VM')
        connection.lookupByName(vm_name).create()
        print('done starting VM')
    except libvirt.libvirtError as e:
        print(f'Libvirt Error: {e}')

# Replace 'qemu+tcp://10.34.66.13/system' with your actual connection URI
connection_uri = 'qemu+tcp://localhost/system'
connection = libvirt.open(connection_uri)
if connection is None:
    print(f'Failed to open connection to {connection_uri}')
    exit(1)

try:
    # Replace 'your_vm_name' with the actual name of your virtual machine
    #
Re: [PATCH 3/3] rpc: Rework rpc notifications in main and side thread
On 12/15/23 16:47, Daniel P. Berrangé wrote:

On Fri, Dec 15, 2023 at 03:51:19PM +0100, Denis V. Lunev wrote:

On 12/15/23 14:48, Efim Shevrin wrote:

*From:* Daniel P. Berrangé
*Sent:* Friday, December 15, 2023 19:09
*To:* Efim Shevrin
*Cc:* devel@lists.libvirt.org ; d...@openvz.org
*Subject:* Re: [PATCH 3/3] rpc: Rework rpc notifications in main and side thread

On Fri, Dec 15, 2023 at 02:32:19AM +0800, Fima Shevrin wrote:

The RPC client implementation uses the following paradigm. The critical section is organized via virObjectLock(client)/virObjectUnlock(client) braces. Though this is potentially problematic:

    main thread:                         side thread:
    virObjectUnlock(client);
                                         virObjectLock(client);
                                         g_main_loop_quit(client->eventLoop);
                                         virObjectUnlock(client);
    g_main_loop_run(client->eventLoop);

This means in particular that if the main thread is executing a very long request, like a VM migration, the wakeup from the side thread can be stuck until the main request is fully completed.

Can you explain this in more detail, with call trace illustrations for the two threads? You're not saying where the main thread is doing work with the 'client' lock held for a long time. Generally the goal should be for the main thread to only hold the lock for a short time. Also, if the side thread is already holding a reference on 'client', then potentially we should consider whether it is possible to terminate the event loop without acquiring the mutex, as GMainLoop protects itself wrt concurrent usage already, provided all threads hold a reference directly or indirectly.

In our opinion, the problem here is a missed wakeup from the side thread to the main thread.

Hmmm, what platform are you seeing problems on? Are you still targeting a very old RHEL-7 platform? I vaguely recall there are/were some old glib bugs in the main loop with threads that could be applicable.
With regards, Daniel

Nope. The original problem is observed on RHEL 9.1, i.e. libvirt 8.5. But the problem here comes from the "comparison" with the very old, ancient libvirt 5.6, which behaves MUCH better here.

Den
Re: [PATCH 3/3] rpc: Rework rpc notifications in main and side thread
On 12/15/23 14:48, Efim Shevrin wrote:

Hello, here are call traces with two threads generated by the python script:

// ==
[root@dandreev-vz-6-0-0-243-master libvirt]# gdb -p 288985
(gdb) t a a bt

Thread 2 (Thread 0x7f2112862640 (LWP 288986) "python3"):
#0  0x7f2121d4296f in poll () at /lib64/libc.so.6
#1  0x7f211444650c in g_main_context_iterate.constprop () at /lib64/libglib-2.0.so.0
#2  0x7f21143f1483 in g_main_loop_run () at /lib64/libglib-2.0.so.0
#3  0x7f211406800b in virNetClientIOEventLoop () at /lib64/libvirt.so.0
#4  0x7f2114068a0a in virNetClientIO () at /lib64/libvirt.so.0
#5  0x7f21140692c1 in virNetClientSendInternal () at /lib64/libvirt.so.0
#6  0x7f211406936d in virNetClientSendWithReply () at /lib64/libvirt.so.0
#7  0x7f2114061a1d in virNetClientProgramCall () at /lib64/libvirt.so.0
#8  0x7f21140f79ac in callFull () at /lib64/libvirt.so.0
#9  0x7f21140f7a2f in call () at /lib64/libvirt.so.0
#10 0x7f21140d8435 in remoteDomainCreate () at /lib64/libvirt.so.0
#11 0x7f21141dd60c in virDomainCreate () at /lib64/libvirt.so.0
#12 0x7f21145c8114 in libvirt_virDomainCreate () at /usr/lib64/python3.9/site-packages/libvirtmod.cpython-39-x86_64-linux-gnu.so
#13 0x7f21221268a8 in cfunction_call () at /lib64/libpython3.9.so.1.0
#14 0x7f2122118814 in _PyObject_MakeTpCall () at /lib64/libpython3.9.so.1.0
#15 0x7f212211566e in _PyEval_EvalFrameDefault () at /lib64/libpython3.9.so.1.0
#16 0x7f212211d013 in function_code_fastcall () at /lib64/libpython3.9.so.1.0
#17 0x7f2122110650 in _PyEval_EvalFrameDefault () at /lib64/libpython3.9.so.1.0
#18 0x7f212211d013 in function_code_fastcall () at /lib64/libpython3.9.so.1.0
#19 0x7f21221103e8 in _PyEval_EvalFrameDefault () at /lib64/libpython3.9.so.1.0
#20 0x7f212211d013 in function_code_fastcall () at /lib64/libpython3.9.so.1.0
#21 0x7f21221133d2 in _PyEval_EvalFrameDefault () at /lib64/libpython3.9.so.1.0
#22 0x7f212211d013 in function_code_fastcall () at /lib64/libpython3.9.so.1.0
#23 0x7f2122110650 in _PyEval_EvalFrameDefault () at /lib64/libpython3.9.so.1.0
#24 0x7f212211d013 in function_code_fastcall () at /lib64/libpython3.9.so.1.0
#25 0x7f2122110650 in _PyEval_EvalFrameDefault () at /lib64/libpython3.9.so.1.0
#26 0x7f212211d013 in function_code_fastcall () at /lib64/libpython3.9.so.1.0
#27 0x7f2122125382 in method_vectorcall () at /lib64/libpython3.9.so.1.0
#28 0x7f21221d8c4a in t_bootstrap () at /lib64/libpython3.9.so.1.0
#29 0x7f21221d8bf8 in pythread_wrapper () at /lib64/libpython3.9.so.1.0
#30 0x7f2121c9f802 in start_thread () at /lib64/libc.so.6
#31 0x7f2121c3f450 in clone3 () at /lib64/libc.so.6

Thread 1 (Thread 0x7f21223cf740 (LWP 288985) "python3"):
#0  0x7f2121c9c39a in __futex_abstimed_wait_common () at /lib64/libc.so.6
#1  0x7f2121c9eba0 in pthread_cond_wait@@GLIBC_2.3.2 () at /lib64/libc.so.6
#2  0x7f2113f4982a in virCondWait () at /lib64/libvirt.so.0
#3  0x7f2113f1fee3 in virObjectWait () at /lib64/libvirt.so.0
#4  0x7f211406882a in virNetClientIO () at /lib64/libvirt.so.0
#5  0x7f21140692c1 in virNetClientSendInternal () at /lib64/libvirt.so.0
#6  0x7f211406936d in virNetClientSendWithReply () at /lib64/libvirt.so.0
#7  0x7f2114061a1d in virNetClientProgramCall () at /lib64/libvirt.so.0
#8  0x7f21140f79ac in callFull () at /lib64/libvirt.so.0
#9  0x7f21140f7a2f in call () at /lib64/libvirt.so.0
#10 0x7f21140f24eb in remoteNodeDeviceNumOfCaps () at /lib64/libvirt.so.0
#11 0x7f2114207a00 in virNodeDeviceNumOfCaps () at /lib64/libvirt.so.0
#12 0x7f21145d8edf in libvirt_virNodeDeviceListCaps.lto_priv.0 () at /usr/lib64/python3.9/site-packages/libvirtmod.cpython-39-x86_64-linux-gnu.so
#13 0x7f21221268a8 in cfunction_call () at /lib64/libpython3.9.so.1.0
#14 0x7f2122118814 in _PyObject_MakeTpCall () at /lib64/libpython3.9.so.1.0
#15 0x7f212211566e in _PyEval_EvalFrameDefault () at /lib64/libpython3.9.so.1.0
#16 0x7f212211d013 in function_code_fastcall () at /lib64/libpython3.9.so.1.0
#17 0x7f2122110650 in _PyEval_EvalFrameDefault () at /lib64/libpython3.9.so.1.0
#18 0x7f212210f06d in _PyEval_EvalCode () at /lib64/libpython3.9.so.1.0
#19 0x7f212218c495 in _PyEval_EvalCodeWithName () at /lib64/libpython3.9.so.1.0
#20 0x7f212218c42d in PyEval_EvalCodeEx () at /lib64/libpython3.9.so.1.0
#21 0x7f212218c3df in PyEval_EvalCode () at /lib64/libpython3.9.so.1.0
#22 0x7f21221b7524 in run_eval_code_obj () at /lib64/libpython3.9.so.1.0
#23 0x7f21221b5da6 in run_mod () at /lib64/libpython3.9.so.1.0
#24 0x7f212208f0cb in pyrun_file.cold () at /lib64/libpython3.9.so.1.0
#25 0x7f21221bb253 in PyRun_SimpleFileExFlags () at /lib64/libpython3.9.so.1.0
#26 0x7f21221b7ee8 in Py_RunMain () at /lib64/lib
Re: [PATCH 1/3] rpc: mark GSourceFuncs functions in vireventglibwatch.c as static
On 12/14/23 19:32, Fima Shevrin wrote:

From: "Denis V. Lunev"

They are not exported from the module and thus should be static.

Signed-off-by: Denis V. Lunev
Signed-off-by: Fima Shevrin
---
 src/util/vireventglibwatch.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/util/vireventglibwatch.c b/src/util/vireventglibwatch.c
index b7f3a8786a..b21e505731 100644
--- a/src/util/vireventglibwatch.c
+++ b/src/util/vireventglibwatch.c
@@ -71,7 +71,7 @@ virEventGLibFDSourceFinalize(GSource *source G_GNUC_UNUSED)
 }
 
 
-GSourceFuncs virEventGLibFDSourceFuncs = {
+static GSourceFuncs virEventGLibFDSourceFuncs = {
     .prepare = virEventGLibFDSourcePrepare,
     .check = virEventGLibFDSourceCheck,
     .dispatch = virEventGLibFDSourceDispatch,
@@ -194,7 +194,7 @@ virEventGLibSocketSourceFinalize(GSource *source)
 }
 
 
-GSourceFuncs virEventGLibSocketSourceFuncs = {
+static GSourceFuncs virEventGLibSocketSourceFuncs = {
     .prepare = virEventGLibSocketSourcePrepare,
     .check = virEventGLibSocketSourceCheck,
     .dispatch = virEventGLibSocketSourceDispatch,

Can you please do the submission work properly and carefully:

1. While you are sending more than one patch in a row, please write a cover letter, aka "PATCH 0/3", with a description of the whole series and its motivation.
2. Please also do not forget to increment the version, at least inside the cover letter. If you are sending patches a second time, you should use "PATCH v2 0/3".
3. Please track changes inside the cover letter, writing what has been changed in this particular submission.

Thank you in advance,
Den