Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
On Fri, Oct 26, 2018 at 06:57:54AM +0200, Bjoern Walk wrote: > Daniel P. Berrangé [2018-10-25, 06:32PM +0100]: > > On Thu, Oct 25, 2018 at 01:47:26PM +0200, Bjoern Walk wrote: > > > Daniel P. Berrangé [2018-10-24, 10:43PM +0100]: > > > > We could optimize this by jcalling virFileAccessibleAs > > > > once and storing the result in a global. Then just do a > > > > plain stat() call in process to check the st_ctime field > > > > for changes. We only need re-run the heavy virFileAccessibleAs > > > > check if st_ctime has changed (indicating a owner/group/acl > > > > change). > > > > > > But can't access permission change outside of changing the actual device > > > file (e.g. cgroups or other OS capabilities)? Isn't that the whole > > > purpose of the virFileAccessibleAs gymnastics? > > > > Yes, cgroups could restrict access to /dev/kvm, but virFileAccessibleAs > > does not use cgroups, it only cares about using the correct user + group > > membership. > > Sorry, but then I don't understand the purpose of this function at all. > Why would you EVER check permissions like that? A simple stat(2) call > should give you the exact same information, no? The capabilities probing runs as the 'qemu' user and 'qemu' group. If libvirtd stat()s the /dev/kvm device, it will merely find out if the *root* user can access it. We need to chcek if the qemu/qemu user/group can access it, and if you try todo via a stat() then you have to reinvent the kernel logic for checking supplementary group memberships and file ACL membership. The only sensible way is for the process checking to actually be running as the qemu/qemu user and group, hence we fork and change to this user/group pair. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
On Wed, Oct 24, 2018 at 11:43 PM +0200, "Daniel P. Berrangé" wrote: > On Thu, Sep 20, 2018 at 07:44:46PM +0200, Marc Hartmayer wrote: >> For a domain definition there are numerous calls of >> virQEMUCapsCacheLookup (the same applies to the domain start). This >> slows down the process since virQEMUCapsCacheLookup validates that the >> QEMU capabilitites are still valid (among other things, a fork is done >> for this if the user for the QEMU processes is 'qemu'). Therefore >> let's reduce the number of virQEMUCapsCacheLookup calls whenever >> possible and reasonable. >> >> In addition to the speed up, there is the risk that >> virQEMUCapsCacheLookup returns different QEMU capabilities during a >> task if, for example, the QEMU binary has changed during the task. >> >> The correct way would be: >> >> - get the QEMU capabilities only once per task via virQEMUCapsCacheLookup >> - do the task with these QEMU capabilities >> >> or >> >> - abort the task after a cache invalidation >> >> Note: With this patch series the behavior is still not (completely) >> fixed, but the performance has been significantly improved. In a quick >> test this gave a speed up of factor 4 for a simple define/undefine >> loop. >> >> In general, the more devices a domain has, the more drastic the >> overhead becomes (because a cache validation is performed for each >> device). > > IIUC from your KVM Forum presentation, the overhead of the > cache lookup is almost entirely coming from the fork() call > triggered by the single call > > kvmUsable = virFileAccessibleAs("/dev/kvm", R_OK | W_OK, > priv->runUid, priv->runGid) == 0; > > Rather than the major refactor of the way the parse callbacks > work, we could instead just optimize this call to avoid the > repeated forks. > > Even if we reduced the number of calls to the cache, it is > still somewhat overkill to be checking /dev/kvm via fork() > every time. eg even if you reduced it to just a single > cache lookup, if you run virDomainDefine for 100 domains > in parallel it is still going to do 100 forks. > > We could optimize this by jcalling virFileAccessibleAs > once and storing the result in a global. Then just do a > plain stat() call in process to check the st_ctime field > for changes. We only need re-run the heavy virFileAccessibleAs > check if st_ctime has changed (indicating a owner/group/acl > change). Okay, if that’s fine I’ll add this as an additional patch to this series. Because the various virQEMUCapsCacheLookup calls still seem to be wrong to me. Marc > > Regards, > Daniel > -- > |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o-https://fstop138.berrange.com :| > |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| > -- Kind regards / Beste Grüße Marc Hartmayer IBM Deutschland Research & Development GmbH Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
Daniel P. Berrangé [2018-10-25, 06:32PM +0100]: > On Thu, Oct 25, 2018 at 01:47:26PM +0200, Bjoern Walk wrote: > > Daniel P. Berrangé [2018-10-24, 10:43PM +0100]: > > > We could optimize this by jcalling virFileAccessibleAs > > > once and storing the result in a global. Then just do a > > > plain stat() call in process to check the st_ctime field > > > for changes. We only need re-run the heavy virFileAccessibleAs > > > check if st_ctime has changed (indicating a owner/group/acl > > > change). > > > > But can't access permission change outside of changing the actual device > > file (e.g. cgroups or other OS capabilities)? Isn't that the whole > > purpose of the virFileAccessibleAs gymnastics? > > Yes, cgroups could restrict access to /dev/kvm, but virFileAccessibleAs > does not use cgroups, it only cares about using the correct user + group > membership. Sorry, but then I don't understand the purpose of this function at all. Why would you EVER check permissions like that? A simple stat(2) call should give you the exact same information, no? -- IBM Systems Linux on Z & Virtualization Development -- IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 -- Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 signature.asc Description: PGP signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
On Thu, Oct 25, 2018 at 01:47:26PM +0200, Bjoern Walk wrote: > Daniel P. Berrangé [2018-10-24, 10:43PM +0100]: > > We could optimize this by jcalling virFileAccessibleAs > > once and storing the result in a global. Then just do a > > plain stat() call in process to check the st_ctime field > > for changes. We only need re-run the heavy virFileAccessibleAs > > check if st_ctime has changed (indicating a owner/group/acl > > change). > > But can't access permission change outside of changing the actual device > file (e.g. cgroups or other OS capabilities)? Isn't that the whole > purpose of the virFileAccessibleAs gymnastics? Yes, cgroups could restrict access to /dev/kvm, but virFileAccessibleAs does not use cgroups, it only cares about using the correct user + group membership. Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
Daniel P. Berrangé [2018-10-24, 10:43PM +0100]: > We could optimize this by jcalling virFileAccessibleAs > once and storing the result in a global. Then just do a > plain stat() call in process to check the st_ctime field > for changes. We only need re-run the heavy virFileAccessibleAs > check if st_ctime has changed (indicating a owner/group/acl > change). But can't access permission change outside of changing the actual device file (e.g. cgroups or other OS capabilities)? Isn't that the whole purpose of the virFileAccessibleAs gymnastics? -- IBM Systems Linux on Z & Virtualization Development -- IBM Deutschland Research & Development GmbH Schönaicher Str. 220, 71032 Böblingen Phone: +49 7031 16 1819 -- Vorsitzende des Aufsichtsrats: Martina Koederitz Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294 signature.asc Description: PGP signature -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
On Thu, Sep 20, 2018 at 07:44:46PM +0200, Marc Hartmayer wrote: > For a domain definition there are numerous calls of > virQEMUCapsCacheLookup (the same applies to the domain start). This > slows down the process since virQEMUCapsCacheLookup validates that the > QEMU capabilitites are still valid (among other things, a fork is done > for this if the user for the QEMU processes is 'qemu'). Therefore > let's reduce the number of virQEMUCapsCacheLookup calls whenever > possible and reasonable. > > In addition to the speed up, there is the risk that > virQEMUCapsCacheLookup returns different QEMU capabilities during a > task if, for example, the QEMU binary has changed during the task. > > The correct way would be: > > - get the QEMU capabilities only once per task via virQEMUCapsCacheLookup > - do the task with these QEMU capabilities > > or > > - abort the task after a cache invalidation > > Note: With this patch series the behavior is still not (completely) > fixed, but the performance has been significantly improved. In a quick > test this gave a speed up of factor 4 for a simple define/undefine > loop. > > In general, the more devices a domain has, the more drastic the > overhead becomes (because a cache validation is performed for each > device). IIUC from your KVM Forum presentation, the overhead of the cache lookup is almost entirely coming from the fork() call triggered by the single call kvmUsable = virFileAccessibleAs("/dev/kvm", R_OK | W_OK, priv->runUid, priv->runGid) == 0; Rather than the major refactor of the way the parse callbacks work, we could instead just optimize this call to avoid the repeated forks. Even if we reduced the number of calls to the cache, it is still somewhat overkill to be checking /dev/kvm via fork() every time. eg even if you reduced it to just a single cache lookup, if you run virDomainDefine for 100 domains in parallel it is still going to do 100 forks. We could optimize this by jcalling virFileAccessibleAs once and storing the result in a global. Then just do a plain stat() call in process to check the st_ctime field for changes. We only need re-run the heavy virFileAccessibleAs check if st_ctime has changed (indicating a owner/group/acl change). Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :| -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 00/11] Avoid numerous calls of virQEMUCapsCacheLookup
For a domain definition there are numerous calls of virQEMUCapsCacheLookup (the same applies to the domain start). This slows down the process since virQEMUCapsCacheLookup validates that the QEMU capabilitites are still valid (among other things, a fork is done for this if the user for the QEMU processes is 'qemu'). Therefore let's reduce the number of virQEMUCapsCacheLookup calls whenever possible and reasonable. In addition to the speed up, there is the risk that virQEMUCapsCacheLookup returns different QEMU capabilities during a task if, for example, the QEMU binary has changed during the task. The correct way would be: - get the QEMU capabilities only once per task via virQEMUCapsCacheLookup - do the task with these QEMU capabilities or - abort the task after a cache invalidation Note: With this patch series the behavior is still not (completely) fixed, but the performance has been significantly improved. In a quick test this gave a speed up of factor 4 for a simple define/undefine loop. In general, the more devices a domain has, the more drastic the overhead becomes (because a cache validation is performed for each device). Marc Hartmayer (11): qemu: Use VIR_STEAL_PTR macro qemu: Introduce qemuDomainUpdateQEMUCaps() qemu: Pass QEMUCaps to virDomainDefPostParse conf: Use getParseOpaque() in virDomainObjSetDefTransient conf: Add function description for virDomainDefPostParse conf: Get rid of virDomainDeviceDefPostParseOne conf: Extend virDomainDefValidate(Callback) for parseOpaque conf: Use domainPostParseData(Alloc|Free) in virDomainDefValidate qemu: Use @parseOpaque in qemuDomainDefValidate conf: Extend virDomainDeviceDefValidate(Callback) for parseOpaque qemu: Use @parseOpaque in qemuDomainDeviceDefValidate src/conf/domain_conf.c | 98 ++--- src/conf/domain_conf.h | 9 ++-- src/qemu/qemu_domain.c | 82 +++--- src/qemu/qemu_domain.h | 4 ++ src/qemu/qemu_process.c | 18 +++- src/vz/vz_driver.c | 6 ++- 6 files changed, 128 insertions(+), 89 deletions(-) -- 2.17.0 -- libvir-list mailing list libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list