Re: REST service for libvirt to simplify SEV(ES) launch measurement

2022-03-14 Thread James Bottomley
On Wed, 2022-03-09 at 16:42 +, Dr. David Alan Gilbert wrote:
> * Tobin Feldman-Fitzthum (to...@linux.ibm.com) wrote:
> > 
> > On 3/3/22 12:20 PM, Daniel P. Berrangé wrote:
> > > On Fri, Feb 25, 2022 at 03:10:35PM -0500, Tobin Feldman-Fitzthum
> > > wrote:
> > > > 
> > > > On 2/24/22 7:26 AM, Daniel P. Berrangé wrote:
[...]
> > > > > I can understand if it is harder to achieve commonality with
> > > > > tech like libkrun though, since that's consuming virt in
> > > > > quite a different way at the userspace level.
> > > > Yeah, extending the focus beyond SEV(-ES) with QEMU might make
> > > > things more difficult. There is some discussion right now about
> > > > trying to find common ground between SEV-SNP and TDX
> > > > attestation, but I assume that is all out of scope since
> > > > libvirt isn't really involved.
> > > 
> > > I admit I don't know much about TDX, but from what I've
> > > understood talking to other people, SEV-SNP might not end up
> > > looking all that different. IIUC the attestation has to be
> > > initiated from inside the SNP guest after CPUs are running. It is
> > > going need to be run as early as possible and while you might be
> > > able todo it in the initrd, it feels likely that it could be put
> > > into the firmware (OVMF) instead, such that it does the
> > > validation before even loading the kernel. This would facilitate
> > > supporting it with arbitrary guest OS, as the firmware is common
> > > to all. We can't assume the firmware will have direct network
> > > connectivity to any attestation service needed to verify the
> > > boot. This implies the firmware might need to talk to the host
> > > via something like virtio-serial / virtio-vsock, from where
> > > libvirt or QEMU can proxy the traffic onto the real attestation
> > > service.  Such an architecture might end up aligning quite well
> > > with SEV/SEV-ES, possibly allowing the same protocol to be used
> > > in both cases, just with different ultimate end points (libvirt
> > > for SEV(-ES) vs guest firmware for SEV-SNP).
> > 
> > Yeah that is an interesting point. Most SNP approaches that I have
> > seen so far use the kernel/initrd to handle decryption. There is
> > potentially a gap if the kernel/initrd are not themselves part of
> > the measurement that is provided in the attestation report. We have
> > been using this measured direct boot thing for SEV(-ES) and I think
> > it can be extended to SEV-SNP as well. This would close that gap
> > and make it feasible to do the decryption in the kernel.
> 
> With the direct boot setup, it feels like using 'clevis' in the
> initrd would be the right way to wire things to disk decryption.
> [ https://github.com/latchset/clevis ] It would need a 'pin' writing
> for SNP that then did whatever communication mechanism we settled on.
> 
> (A clevis pin might also be the way to wire the simple disk key from
> your EFI/SEV mechanism up to LUKS? )

We did a write up about this a while ago on the virt list:

https://listman.redhat.com/mailman/private/ibm-virt-security/2021-December/000498.html

Dimitri Pal, in the replies, suggested effectively the above, and we
had quite a discussion about it; the upshot was that we might get it
to work for -SNP and TDX, but it couldn't work for plain SEV and -ES.
What we were looking at above is a mechanism for unifying all the
flavours of boot.

James




Re: REST service for libvirt to simplify SEV(ES) launch measurement

2022-03-14 Thread Dr. David Alan Gilbert
* James Bottomley (j...@linux.ibm.com) wrote:
> On Wed, 2022-03-09 at 16:42 +, Dr. David Alan Gilbert wrote:
> > * Tobin Feldman-Fitzthum (to...@linux.ibm.com) wrote:
> > > 
> > > On 3/3/22 12:20 PM, Daniel P. Berrangé wrote:
> > > > On Fri, Feb 25, 2022 at 03:10:35PM -0500, Tobin Feldman-Fitzthum
> > > > wrote:
> > > > > 
> > > > > On 2/24/22 7:26 AM, Daniel P. Berrangé wrote:
> [...]
> > > > > > I can understand if it is harder to achieve commonality with
> > > > > > tech like libkrun though, since that's consuming virt in
> > > > > > quite a different way at the userspace level.
> > > > > Yeah, extending the focus beyond SEV(-ES) with QEMU might make
> > > > > things more difficult. There is some discussion right now about
> > > > > trying to find common ground between SEV-SNP and TDX
> > > > > attestation, but I assume that is all out of scope since
> > > > > libvirt isn't really involved.
> > > > 
> > > > I admit I don't know much about TDX, but from what I've
> > > > understood talking to other people, SEV-SNP might not end up
> > > > looking all that different. IIUC the attestation has to be
> > > > initiated from inside the SNP guest after CPUs are running. It is
> > > > going to need to be run as early as possible and while you might be
> > > > able to do it in the initrd, it feels likely that it could be put
> > > > into the firmware (OVMF) instead, such that it does the
> > > > validation before even loading the kernel. This would facilitate
> > > > supporting it with arbitrary guest OS, as the firmware is common
> > > > to all. We can't assume the firmware will have direct network
> > > > connectivity to any attestation service needed to verify the
> > > > boot. This implies the firmware might need to talk to the host
> > > > via something like virtio-serial / virtio-vsock, from where
> > > > libvirt or QEMU can proxy the traffic onto the real attestation
> > > > service.  Such an architecture might end up aligning quite well
> > > > with SEV/SEV-ES, possibly allowing the same protocol to be used
> > > > in both cases, just with different ultimate end points (libvirt
> > > > for SEV(-ES) vs guest firmware for SEV-SNP).
> > > 
> > > Yeah that is an interesting point. Most SNP approaches that I have
> > > seen so far use the kernel/initrd to handle decryption. There is
> > > potentially a gap if the kernel/initrd are not themselves part of
> > > the measurement that is provided in the attestation report. We have
> > > been using this measured direct boot thing for SEV(-ES) and I think
> > > it can be extended to SEV-SNP as well. This would close that gap
> > > and make it feasible to do the decryption in the kernel.
> > 
> > With the direct boot setup, it feels like using 'clevis' in the
> > initrd would be the right way to wire things to disk decryption.
> > [ https://github.com/latchset/clevis ] It would need a 'pin' writing
> > for SNP that then did whatever communication mechanism we settled on.
> > 
> > (A clevis pin might also be the way to wire the simple disk key from
> > your EFI/SEV mechanism up to LUKS? )
> 
> We did a write up about this a while ago on the virt list:
> 
> https://listman.redhat.com/mailman/private/ibm-virt-security/2021-December/000498.html

(Note that's a private list, while libvir-list cc'd above is public -
hello all!)

> Dimitri Pal, in the replies, suggested effectively the above, and we
> had quite a discussion about it; the upshot was that we might get it
> to work for -SNP and TDX, but it couldn't work for plain SEV and -ES.
> What we were looking at above is a mechanism for unifying all the
> flavours of boot.

Hmm yes for SNP; for the simple non-SNP one, it actually becomes
easier with Clevis: you ignore Tang altogether and just add a Clevis
pin that wires the secret through - it looks like a few lines of shell
but fits into Clevis, which we already have, and Clevis has the smarts
to fall back to letting you put in a password, from what I can tell.

Although Christophe did just point me to:
https://github.com/confidential-containers/attestation-agent

which seems to have some wiring for basic SEV and Alibaba's
online protocol which I've yet to look at.

Dave
> James
> 
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [PATCH 1/2] qemu_namespace: Don't unlink paths from cgroupDeviceACL

2022-03-14 Thread Ján Tomko

On a Monday in 2022, Michal Privoznik wrote:

When building the namespace for a domain there are a couple of
devices that are created independently of the domain config (see
qemuDomainPopulateDevices()). The idea behind this is that these
devices are crucial for QEMU or one of its libraries, or the user
is passing through a device and wants us to create it in the
namespace too.  That's also the reason these devices are allowed
in the devices CGroup controller.

However, during unplug it may happen that a device was configured
to use one of these implicit devices, and since we remove /dev
nodes on hotunplug we would remove the implicit device too. For
example, /dev/urandom is on the list of implicit devices and users
can hotplug and hotunplug an RNG device with /dev/urandom as its
backend.

The fix is fortunately simple - just consult the list of implicit
devices before removing a device from the namespace.

Signed-off-by: Michal Privoznik 
---
src/qemu/qemu_namespace.c | 21 +++--
1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_namespace.c b/src/qemu/qemu_namespace.c
index 3b41d72630..1132fd04e5 100644
--- a/src/qemu/qemu_namespace.c
+++ b/src/qemu/qemu_namespace.c
@@ -1364,6 +1364,8 @@ qemuNamespaceUnlinkPaths(virDomainObj *vm,
if (STRPREFIX(path, QEMU_DEVPREFIX)) {
GStrv mount;
bool inSubmount = false;
+const char *const *devices = (const char *const *)cfg->cgroupDeviceACL;
+bool inDevices = false;

for (mount = devMountsPath; *mount; mount++) {
if (STREQ(*mount, "/dev"))
@@ -1375,8 +1377,23 @@ qemuNamespaceUnlinkPaths(virDomainObj *vm,
}
}

-if (!inSubmount)
-unlinkPaths = g_slist_prepend(unlinkPaths, g_strdup(path));
+if (inSubmount)
+continue;
+
+if (!devices)
+devices = defaultDeviceACL;
+
+for (; *devices; devices++) {
+if (STREQ(path, *devices)) {
+inDevices = true;
+break;
+}
+}
+
+if (inDevices)
+continue;
+


something like:

if (g_strv_contains(devices, path))
continue;

should do the same without the need for the bool variable.

(Not sure how to nicely eliminate the other one)

Reviewed-by: Ján Tomko 

Jano


+unlinkPaths = g_slist_prepend(unlinkPaths, g_strdup(path));
}
}

--
2.34.1



signature.asc
Description: PGP signature


Re: [libvirt RFC] virFile: new VIR_FILE_WRAPPER_BIG_PIPE to improve performance

2022-03-14 Thread Daniel P . Berrangé
On Mon, Mar 14, 2022 at 06:38:31PM +0100, Claudio Fontana wrote:
> On 3/14/22 6:17 PM, Daniel P. Berrangé wrote:
> > On Sat, Mar 12, 2022 at 05:30:01PM +0100, Claudio Fontana wrote:
> >> the first user is the qemu driver,
> >>
> >> virsh save/resume would slow to a crawl with a default pipe size (64k).
> >>
> >> This improves the situation by 400%.
> >>
> >> Going through io_helper still seems to incur some penalty (~15%-ish)
> >> compared with direct qemu migration to an nc socket writing to a file.
> >>
> >> Signed-off-by: Claudio Fontana 
> >> ---
> >>  src/qemu/qemu_driver.c|  6 +++---
> >>  src/qemu/qemu_saveimage.c | 11 ++-
> >>  src/util/virfile.c| 12 
> >>  src/util/virfile.h|  1 +
> >>  4 files changed, 22 insertions(+), 8 deletions(-)
> >>
> >> Hello, I initially thought this to be a qemu performance issue,
> >> so you can find the discussion about this in qemu-devel:
> >>
> >> "Re: bad virsh save /dev/null performance (600 MiB/s max)"
> >>
> >> https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg03142.html
> >>
> >> RFC since I still need to validate the idea, and it is only lightly tested:
> >>
> >> save - about 400% benefit in throughput, getting around 20 Gbps to 
> >> /dev/null,
> >>and around 13 Gbps to a ramdisk.
> >>   By comparison, direct qemu migration to a nc socket is around 24Gbps.
> >>
> >> restore  - not tested, _should_ also benefit in the "bypass_cache" case
> >> coredump - not tested, _should_ also benefit like for save
> >>
> >> Thanks for your comments and review,
> >>
> >> Claudio
> >>
> >>
> >> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> >> index c1b3bd8536..be248c1e92 100644
> >> --- a/src/qemu/qemu_driver.c
> >> +++ b/src/qemu/qemu_driver.c
> >> @@ -3044,7 +3044,7 @@ doCoreDump(virQEMUDriver *driver,
> >>  virFileWrapperFd *wrapperFd = NULL;
> >>  int directFlag = 0;
> >>  bool needUnlink = false;
> >> -unsigned int flags = VIR_FILE_WRAPPER_NON_BLOCKING;
> >> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
> >> VIR_FILE_WRAPPER_BIG_PIPE;
> >>  const char *memory_dump_format = NULL;
> >>  g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
> >>  g_autoptr(virCommand) compressor = NULL;
> >> @@ -3059,7 +3059,7 @@ doCoreDump(virQEMUDriver *driver,
> >>  
> >>  /* Create an empty file with appropriate ownership.  */
> >>  if (dump_flags & VIR_DUMP_BYPASS_CACHE) {
> >> -flags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
> >> +wrapperFlags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
> >>  directFlag = virFileDirectFdFlag();
> >>  if (directFlag < 0) {
> >>  virReportError(VIR_ERR_OPERATION_FAILED, "%s",
> >> @@ -3072,7 +3072,7 @@ doCoreDump(virQEMUDriver *driver,
> >>   )) < 0)
> >>  goto cleanup;
> >>  
> >> -if (!(wrapperFd = virFileWrapperFdNew(&fd, path, flags)))
> >> +if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> >>  goto cleanup;
> >>  
> >>  if (dump_flags & VIR_DUMP_MEMORY_ONLY) {
> >> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> >> index c0139041eb..1b522a1542 100644
> >> --- a/src/qemu/qemu_saveimage.c
> >> +++ b/src/qemu/qemu_saveimage.c
> >> @@ -267,7 +267,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
> >>  int fd = -1;
> >>  int directFlag = 0;
> >>  virFileWrapperFd *wrapperFd = NULL;
> >> -unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING;
> >> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
> >> VIR_FILE_WRAPPER_BIG_PIPE;
> >>  
> >>  /* Obtain the file handle.  */
> >>  if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> >> @@ -463,10 +463,11 @@ qemuSaveImageOpen(virQEMUDriver *driver,
> >>  if ((fd = qemuDomainOpenFile(cfg, NULL, path, oflags, NULL)) < 0)
> >>  return -1;
> >>  
> >> -if (bypass_cache &&
> >> -!(*wrapperFd = virFileWrapperFdNew(&fd, path,
> >> -   
> >> VIR_FILE_WRAPPER_BYPASS_CACHE)))
> >> -return -1;
> >> +if (bypass_cache) {
> >> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_BYPASS_CACHE | 
> >> VIR_FILE_WRAPPER_BIG_PIPE;
> >> +if (!(*wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> >> +return -1;
> >> +}
> >>  
> >>  data = g_new0(virQEMUSaveData, 1);
> >>  
> >> diff --git a/src/util/virfile.c b/src/util/virfile.c
> >> index a04f888e06..fdacd17890 100644
> >> --- a/src/util/virfile.c
> >> +++ b/src/util/virfile.c
> >> @@ -282,6 +282,18 @@ virFileWrapperFdNew(int *fd, const char *name, 
> >> unsigned int flags)
> >>  
> >>  ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
> >>  
> >> +if (flags & VIR_FILE_WRAPPER_BIG_PIPE) {
> >> +/*
> >> + * virsh save/resume would slow to a crawl with a default pipe 
> >> size (usually 64k).
> >> + * This improves the situation by 400%, although 

Re: [libvirt RFC] virFile: new VIR_FILE_WRAPPER_BIG_PIPE to improve performance

2022-03-14 Thread Claudio Fontana
On 3/14/22 6:17 PM, Daniel P. Berrangé wrote:
> On Sat, Mar 12, 2022 at 05:30:01PM +0100, Claudio Fontana wrote:
>> the first user is the qemu driver,
>>
>> virsh save/resume would slow to a crawl with a default pipe size (64k).
>>
>> This improves the situation by 400%.
>>
>> Going through io_helper still seems to incur some penalty (~15%-ish)
>> compared with direct qemu migration to an nc socket writing to a file.
>>
>> Signed-off-by: Claudio Fontana 
>> ---
>>  src/qemu/qemu_driver.c|  6 +++---
>>  src/qemu/qemu_saveimage.c | 11 ++-
>>  src/util/virfile.c| 12 
>>  src/util/virfile.h|  1 +
>>  4 files changed, 22 insertions(+), 8 deletions(-)
>>
>> Hello, I initially thought this to be a qemu performance issue,
>> so you can find the discussion about this in qemu-devel:
>>
>> "Re: bad virsh save /dev/null performance (600 MiB/s max)"
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg03142.html
>>
>> RFC since I still need to validate the idea, and it is only lightly tested:
>>
>> save - about 400% benefit in throughput, getting around 20 Gbps to 
>> /dev/null,
>>and around 13 Gbps to a ramdisk.
>> By comparison, direct qemu migration to a nc socket is around 24Gbps.
>>
>> restore  - not tested, _should_ also benefit in the "bypass_cache" case
>> coredump - not tested, _should_ also benefit like for save
>>
>> Thanks for your comments and review,
>>
>> Claudio
>>
>>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index c1b3bd8536..be248c1e92 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -3044,7 +3044,7 @@ doCoreDump(virQEMUDriver *driver,
>>  virFileWrapperFd *wrapperFd = NULL;
>>  int directFlag = 0;
>>  bool needUnlink = false;
>> -unsigned int flags = VIR_FILE_WRAPPER_NON_BLOCKING;
>> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
>> VIR_FILE_WRAPPER_BIG_PIPE;
>>  const char *memory_dump_format = NULL;
>>  g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
>>  g_autoptr(virCommand) compressor = NULL;
>> @@ -3059,7 +3059,7 @@ doCoreDump(virQEMUDriver *driver,
>>  
>>  /* Create an empty file with appropriate ownership.  */
>>  if (dump_flags & VIR_DUMP_BYPASS_CACHE) {
>> -flags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
>> +wrapperFlags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
>>  directFlag = virFileDirectFdFlag();
>>  if (directFlag < 0) {
>>  virReportError(VIR_ERR_OPERATION_FAILED, "%s",
>> @@ -3072,7 +3072,7 @@ doCoreDump(virQEMUDriver *driver,
>>   )) < 0)
>>  goto cleanup;
>>  
>> -if (!(wrapperFd = virFileWrapperFdNew(&fd, path, flags)))
>> +if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>>  goto cleanup;
>>  
>>  if (dump_flags & VIR_DUMP_MEMORY_ONLY) {
>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
>> index c0139041eb..1b522a1542 100644
>> --- a/src/qemu/qemu_saveimage.c
>> +++ b/src/qemu/qemu_saveimage.c
>> @@ -267,7 +267,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>>  int fd = -1;
>>  int directFlag = 0;
>>  virFileWrapperFd *wrapperFd = NULL;
>> -unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING;
>> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
>> VIR_FILE_WRAPPER_BIG_PIPE;
>>  
>>  /* Obtain the file handle.  */
>>  if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
>> @@ -463,10 +463,11 @@ qemuSaveImageOpen(virQEMUDriver *driver,
>>  if ((fd = qemuDomainOpenFile(cfg, NULL, path, oflags, NULL)) < 0)
>>  return -1;
>>  
>> -if (bypass_cache &&
>> -!(*wrapperFd = virFileWrapperFdNew(&fd, path,
>> -   VIR_FILE_WRAPPER_BYPASS_CACHE)))
>> -return -1;
>> +if (bypass_cache) {
>> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_BYPASS_CACHE | 
>> VIR_FILE_WRAPPER_BIG_PIPE;
>> +if (!(*wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>> +return -1;
>> +}
>>  
>>  data = g_new0(virQEMUSaveData, 1);
>>  
>> diff --git a/src/util/virfile.c b/src/util/virfile.c
>> index a04f888e06..fdacd17890 100644
>> --- a/src/util/virfile.c
>> +++ b/src/util/virfile.c
>> @@ -282,6 +282,18 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned 
>> int flags)
>>  
>>  ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>>  
>> +if (flags & VIR_FILE_WRAPPER_BIG_PIPE) {
>> +/*
>> + * virsh save/resume would slow to a crawl with a default pipe size 
>> (usually 64k).
>> + * This improves the situation by 400%, although going through 
>> io_helper still incurs
>> + * in a performance penalty compared with a direct qemu migration 
>> to a socket.
>> + */
>> +int pipe_sz, rv = virFileReadValueInt(&pipe_sz, 
>> "/proc/sys/fs/pipe-max-size");
> 
> This is fine as an experiment but I don't think it is that 

Re: [PATCH v3 00/29] ppc64 PowerNV machines support

2022-03-14 Thread Daniel Henrique Barboza

Hi,

Just to warn any passer-by who might want to review the remaining patches:
there are changes on the QEMU side that we would like to make that would
simplify the support proposed here. The planned changes are:

- add virtual pnv-phb and pnv-phb-root-port devices. These virtual devices
will not be versioned and will be used in all PowerNV machines
(powernv8/9/10), meaning that we wouldn't need to add a
pnv-phbN/pnv-phbN-root-port pair for each machine;

- this new virtual device will have its own capability
(QEMU_CAPS_DEVICE_PNV_PHB), which will also simplify what we're doing
here - we won't need to snapshot a specific QEMU version that happened
to have user-creatable PHBs.

Most of the already reviewed code will be used in the next version. As
soon as the 7.1 support is upstream I'll reroll this series with these
changes.


Thanks,


Daniel


On 2/23/22 10:19, Daniel Henrique Barboza wrote:


Hi,

This new version contains changes proposed by Jano. The most notable
change is on patch 9, where pnv_phb3/pnv_phb4 capabilities are now being
probed and, if the QEMU version isn't high enough, they are cleared from
qemuCaps.

For convenience, the patches that are pending review/acks are patches 14,
17, 19, 20, 22, 23 and 24.


v2 link: 
https://listman.redhat.com/archives/libvir-list/2022-January/msg01149.html

Daniel Henrique Barboza (29):
   qemu_domain.c: add PowerNV machine helpers
   qemu_capabilities.c: use 'MachineIsPowerPC' in DeviceDiskCaps
   qemu_domain: turn qemuDomainMachineIsPSeries() static
   qemu_validate.c: use qemuDomainIsPowerPC() in
 qemuValidateDomainChrDef()
   qemu_domain.c: define ISA as default PowerNV serial
   qemu_validate.c: enhance 'machine type not supported' message
   qemu_domain.c: disable default devices for PowerNV machines
   tests: add basic PowerNV8 test
   qemu: introduce QEMU_CAPS_DEVICE_PNV_PHB3
   conf, qemu: add 'pnv-phb3-root-port' PCI controller model name
   conf, qemu: add 'pnv-phb3' PCI controller model name
   domain_conf.c: fix identation in virDomainControllerDefParseXML()
   conf: parse and format 
   formatdomain.rst: add 'index' semantics for PowerNV domains
   conf: introduce virDomainControllerIsPowerNVPHB
   conf, qemu: add default 'chip-id' value for pnv-phb3 controllers
   conf, qemu: add default 'targetIndex' value for pnv-phb3 devs
   qemu_command.c: add command line for the pnv-phb3 device
   qemu_domain_address.c: change pnv-phb3 minimal downstream slot
   domain_conf: always format pnv-phb3-root-port address
   tests: add pnv-phb3-root-port test
   domain_validate.c: allow targetIndex 0 out of idx 0 for PowerNV PHBs
   domain_conf.c: reject duplicated pnv-phb3 devices
   qemu: introduce QEMU_CAPS_DEVICE_PNV_PHB4
   conf, qemu: add 'pnv-phb4-root-port' PCI controller model name
   domain_conf.c: add phb4-root-port to IsPowerNVRootPort()
   conf, qemu: add 'pnv-phb4' controller model name
   domain_conf.c: add pnv-phb4 to ControllerIsPowerNVPHB()
   tests: add PowerNV9 tests

  docs/formatdomain.rst |  12 +-
  docs/schemas/domaincommon.rng |  10 ++
  src/conf/domain_conf.c| 156 ++
  src/conf/domain_conf.h|   8 +
  src/conf/domain_validate.c|   5 +-
  src/libvirt_private.syms  |   1 +
  src/qemu/qemu_capabilities.c  |  19 ++-
  src/qemu/qemu_capabilities.h  |   4 +
  src/qemu/qemu_command.c   |  21 ++-
  src/qemu/qemu_domain.c|  51 +-
  src/qemu/qemu_domain.h|   4 +-
  src/qemu/qemu_domain_address.c|  64 ++-
  src/qemu/qemu_validate.c  |  62 ++-
  .../qemucapabilitiesdata/caps_5.0.0.ppc64.xml |   2 +
  .../qemucapabilitiesdata/caps_5.2.0.ppc64.xml |   2 +
  .../qemucapabilitiesdata/caps_6.2.0.ppc64.xml |   2 +
  .../qemucapabilitiesdata/caps_7.0.0.ppc64.xml |   2 +
  .../powernv8-basic.ppc64-latest.args  |  34 
  tests/qemuxml2argvdata/powernv8-basic.xml |  16 ++
  tests/qemuxml2argvdata/powernv8-dupPHBs.err   |   1 +
  .../powernv8-dupPHBs.ppc64-latest.err |   1 +
  tests/qemuxml2argvdata/powernv8-dupPHBs.xml   |  27 +++
  .../powernv8-root-port.ppc64-latest.args  |  35 
  tests/qemuxml2argvdata/powernv8-root-port.xml |  17 ++
  .../powernv8-two-sockets.ppc64-latest.args|  35 
  .../qemuxml2argvdata/powernv8-two-sockets.xml |  26 +++
  .../powernv9-dupPHBs.ppc64-latest.err |   1 +
  tests/qemuxml2argvdata/powernv9-dupPHBs.xml   |  27 +++
  .../powernv9-root-port.ppc64-latest.args  |  35 
  tests/qemuxml2argvdata/powernv9-root-port.xml |  17 ++
  tests/qemuxml2argvtest.c  |   7 +
  .../powernv8-basic.ppc64-latest.xml   |  34 
  .../powernv8-root-port.ppc64-latest.xml   |  39 +
  .../powernv8-two-sockets.ppc64-latest.xml |  39 +
  

Re: [libvirt RFC] virFile: new VIR_FILE_WRAPPER_BIG_PIPE to improve performance

2022-03-14 Thread Daniel P . Berrangé
On Sat, Mar 12, 2022 at 05:30:01PM +0100, Claudio Fontana wrote:
> the first user is the qemu driver,
> 
> virsh save/resume would slow to a crawl with a default pipe size (64k).
> 
> This improves the situation by 400%.
> 
> Going through io_helper still seems to incur some penalty (~15%-ish)
> compared with direct qemu migration to an nc socket writing to a file.
> 
> Signed-off-by: Claudio Fontana 
> ---
>  src/qemu/qemu_driver.c|  6 +++---
>  src/qemu/qemu_saveimage.c | 11 ++-
>  src/util/virfile.c| 12 
>  src/util/virfile.h|  1 +
>  4 files changed, 22 insertions(+), 8 deletions(-)
> 
> Hello, I initially thought this to be a qemu performance issue,
> so you can find the discussion about this in qemu-devel:
> 
> "Re: bad virsh save /dev/null performance (600 MiB/s max)"
> 
> https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg03142.html
> 
> RFC since I still need to validate the idea, and it is only lightly tested:
> 
> save - about 400% benefit in throughput, getting around 20 Gbps to 
> /dev/null,
>and around 13 Gbps to a ramdisk.
>  By comparison, direct qemu migration to a nc socket is around 24Gbps.
> 
> restore  - not tested, _should_ also benefit in the "bypass_cache" case
> coredump - not tested, _should_ also benefit like for save
> 
> Thanks for your comments and review,
> 
> Claudio
> 
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index c1b3bd8536..be248c1e92 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -3044,7 +3044,7 @@ doCoreDump(virQEMUDriver *driver,
>  virFileWrapperFd *wrapperFd = NULL;
>  int directFlag = 0;
>  bool needUnlink = false;
> -unsigned int flags = VIR_FILE_WRAPPER_NON_BLOCKING;
> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
> VIR_FILE_WRAPPER_BIG_PIPE;
>  const char *memory_dump_format = NULL;
>  g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
>  g_autoptr(virCommand) compressor = NULL;
> @@ -3059,7 +3059,7 @@ doCoreDump(virQEMUDriver *driver,
>  
>  /* Create an empty file with appropriate ownership.  */
>  if (dump_flags & VIR_DUMP_BYPASS_CACHE) {
> -flags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
> +wrapperFlags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
>  directFlag = virFileDirectFdFlag();
>  if (directFlag < 0) {
>  virReportError(VIR_ERR_OPERATION_FAILED, "%s",
> @@ -3072,7 +3072,7 @@ doCoreDump(virQEMUDriver *driver,
>   )) < 0)
>  goto cleanup;
>  
> -if (!(wrapperFd = virFileWrapperFdNew(&fd, path, flags)))
> +if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
>  goto cleanup;
>  
>  if (dump_flags & VIR_DUMP_MEMORY_ONLY) {
> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> index c0139041eb..1b522a1542 100644
> --- a/src/qemu/qemu_saveimage.c
> +++ b/src/qemu/qemu_saveimage.c
> @@ -267,7 +267,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>  int fd = -1;
>  int directFlag = 0;
>  virFileWrapperFd *wrapperFd = NULL;
> -unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING;
> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
> VIR_FILE_WRAPPER_BIG_PIPE;
>  
>  /* Obtain the file handle.  */
>  if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> @@ -463,10 +463,11 @@ qemuSaveImageOpen(virQEMUDriver *driver,
>  if ((fd = qemuDomainOpenFile(cfg, NULL, path, oflags, NULL)) < 0)
>  return -1;
>  
> -if (bypass_cache &&
> -!(*wrapperFd = virFileWrapperFdNew(&fd, path,
> -   VIR_FILE_WRAPPER_BYPASS_CACHE)))
> -return -1;
> +if (bypass_cache) {
> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_BYPASS_CACHE | 
> VIR_FILE_WRAPPER_BIG_PIPE;
> +if (!(*wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> +return -1;
> +}
>  
>  data = g_new0(virQEMUSaveData, 1);
>  
> diff --git a/src/util/virfile.c b/src/util/virfile.c
> index a04f888e06..fdacd17890 100644
> --- a/src/util/virfile.c
> +++ b/src/util/virfile.c
> @@ -282,6 +282,18 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned 
> int flags)
>  
>  ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>  
> +if (flags & VIR_FILE_WRAPPER_BIG_PIPE) {
> +/*
> + * virsh save/resume would slow to a crawl with a default pipe size 
> (usually 64k).
> + * This improves the situation by 400%, although going through 
> io_helper still incurs
> + * in a performance penalty compared with a direct qemu migration to 
> a socket.
> + */
> +int pipe_sz, rv = virFileReadValueInt(&pipe_sz, 
> "/proc/sys/fs/pipe-max-size");

This is fine as an experiment but I don't think it is that safe
to use in the real world. There could be a variety of reasons why
an admin can enlarge this value, and we shouldn't assume the max
size is sensible for 

Re: [libvirt RFC] virFile: new VIR_FILE_WRAPPER_BIG_PIPE to improve performance

2022-03-14 Thread Ani Sinha
On Mon, Mar 14, 2022 at 5:06 PM Michal Prívozník  wrote:
>
> On 3/12/22 17:30, Claudio Fontana wrote:
> > the first user is the qemu driver,
> >
> > virsh save/resume would slow to a crawl with a default pipe size (64k).
> >
> > This improves the situation by 400%.
> >
> > Going through io_helper still seems to incur some penalty (~15%-ish)
> > compared with direct qemu migration to an nc socket writing to a file.
> >
> > Signed-off-by: Claudio Fontana 
> > ---
> >  src/qemu/qemu_driver.c|  6 +++---
> >  src/qemu/qemu_saveimage.c | 11 ++-
> >  src/util/virfile.c| 12 
> >  src/util/virfile.h|  1 +
> >  4 files changed, 22 insertions(+), 8 deletions(-)
> >
> > Hello, I initially thought this to be a qemu performance issue,
> > so you can find the discussion about this in qemu-devel:
> >
> > "Re: bad virsh save /dev/null performance (600 MiB/s max)"
> >
> > https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg03142.html
> >
> > RFC since I still need to validate the idea, and it is only lightly tested:
> >
> > save - about 400% benefit in throughput, getting around 20 Gbps to 
> > /dev/null,
> >and around 13 Gbps to a ramdisk.
> >  By comparison, direct qemu migration to a nc socket is around 
> > 24Gbps.
> >
> > restore  - not tested, _should_ also benefit in the "bypass_cache" case
> > coredump - not tested, _should_ also benefit like for save
> >
> > Thanks for your comments and review,
> >
> > Claudio
>
> Hey, I like this idea, but couple of points below.
>
> >
> >
> > diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> > index c1b3bd8536..be248c1e92 100644
> > --- a/src/qemu/qemu_driver.c
> > +++ b/src/qemu/qemu_driver.c
> > @@ -3044,7 +3044,7 @@ doCoreDump(virQEMUDriver *driver,
> >  virFileWrapperFd *wrapperFd = NULL;
> >  int directFlag = 0;
> >  bool needUnlink = false;
> > -unsigned int flags = VIR_FILE_WRAPPER_NON_BLOCKING;
> > +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
> > VIR_FILE_WRAPPER_BIG_PIPE;
> >  const char *memory_dump_format = NULL;
> >  g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
> >  g_autoptr(virCommand) compressor = NULL;
> > @@ -3059,7 +3059,7 @@ doCoreDump(virQEMUDriver *driver,
> >
> >  /* Create an empty file with appropriate ownership.  */
> >  if (dump_flags & VIR_DUMP_BYPASS_CACHE) {
> > -flags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
> > +wrapperFlags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
> >  directFlag = virFileDirectFdFlag();
> >  if (directFlag < 0) {
> >  virReportError(VIR_ERR_OPERATION_FAILED, "%s",
> > @@ -3072,7 +3072,7 @@ doCoreDump(virQEMUDriver *driver,
> >   )) < 0)
> >  goto cleanup;
> >
> > -if (!(wrapperFd = virFileWrapperFdNew(&fd, path, flags)))
> > +if (!(wrapperFd = virFileWrapperFdNew(&fd, path, wrapperFlags)))
> >  goto cleanup;
> >
> >  if (dump_flags & VIR_DUMP_MEMORY_ONLY) {
> > diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> > index c0139041eb..1b522a1542 100644
> > --- a/src/qemu/qemu_saveimage.c
> > +++ b/src/qemu/qemu_saveimage.c
> > @@ -267,7 +267,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
> >  int fd = -1;
> >  int directFlag = 0;
> >  virFileWrapperFd *wrapperFd = NULL;
> > -unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING;
> > +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
> > VIR_FILE_WRAPPER_BIG_PIPE;
> >
> >  /* Obtain the file handle.  */
> >  if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> > @@ -463,10 +463,11 @@ qemuSaveImageOpen(virQEMUDriver *driver,
> >  if ((fd = qemuDomainOpenFile(cfg, NULL, path, oflags, NULL)) < 0)
> >  return -1;
> >
> > -if (bypass_cache &&
> > -!(*wrapperFd = virFileWrapperFdNew(, path,
> > -   VIR_FILE_WRAPPER_BYPASS_CACHE)))
> > -return -1;
> > +if (bypass_cache) {
> > +unsigned int wrapperFlags = VIR_FILE_WRAPPER_BYPASS_CACHE | 
> > VIR_FILE_WRAPPER_BIG_PIPE;
> > +if (!(*wrapperFd = virFileWrapperFdNew(, path, wrapperFlags)))
> > +return -1;
> > +}
> >
> >  data = g_new0(virQEMUSaveData, 1);
> >
> > diff --git a/src/util/virfile.c b/src/util/virfile.c
> > index a04f888e06..fdacd17890 100644
> > --- a/src/util/virfile.c
> > +++ b/src/util/virfile.c
> > @@ -282,6 +282,18 @@ virFileWrapperFdNew(int *fd, const char *name, 
> > unsigned int flags)
> >
> >  ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
> >
> > +if (flags & VIR_FILE_WRAPPER_BIG_PIPE) {
>
>
> I believe we don't need this flag. I mean, the plain fact that
> virFileWrapper is used means that caller wants to avoid VFS because it's
> interested in speed. Therefore, this code could be done unconditionally.
>
> > +/*
> > + * virsh save/resume would slow to a crawl with a default pipe 
> > size (usually 

Re: [libvirt RFC] virFile: new VIR_FILE_WRAPPER_BIG_PIPE to improve performance

2022-03-14 Thread Ani Sinha
>
> diff --git a/src/util/virfile.c b/src/util/virfile.c
> index a04f888e06..fdacd17890 100644
> --- a/src/util/virfile.c
> +++ b/src/util/virfile.c
> @@ -282,6 +282,18 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned 
> int flags)
>
>  ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>
> +if (flags & VIR_FILE_WRAPPER_BIG_PIPE) {
> +/*
> + * virsh save/resume would slow to a crawl with a default pipe size 
> (usually 64k).
> + * This improves the situation by 400%, although going through 
> io_helper still incurs
> + * in a performance penalty compared with a direct qemu migration to 
> a socket.
> + */
> +int pipe_sz, rv = virFileReadValueInt(_sz, 
> "/proc/sys/fs/pipe-max-size");
> +if (rv != 0) {
> +pipe_sz = 1024 * 1024; /* common default for pipe-max-size */
> +}
> +fcntl(pipefd[output ? 0 : 1], F_SETPIPE_SZ, pipe_sz);
> +}

I believe this entire hunk of code should be guarded by #ifdef
__linux__; non-Windows does not necessarily mean Linux.



[PATCH 0/2] qemu_namespace: Two simple improvements

2022-03-14 Thread Michal Privoznik
These are inspired by Nikola's question here:

https://listman.redhat.com/archives/libvir-list/2022-February/228812.html

See my reply here:

https://listman.redhat.com/archives/libvir-list/2022-March/229263.html

Michal Prívozník (2):
  qemu_namespace: Don't unlink paths from cgroupDeviceACL
  qemu_namespace: Be less aggressive in removing /dev nodes from
namespace

 src/qemu/qemu_namespace.c | 92 +++
 1 file changed, 63 insertions(+), 29 deletions(-)

-- 
2.34.1



RE: [PATCH] add build dependency on lxc_protocol.h to remote_daemon

2022-03-14 Thread Slater, Joseph
I really only wanted to make sure that both remote_driver and 
remote_daemon have access to lxc_protocol.h.  I am not an expert in meson 
builds but I think the only way to do that is have a loop outside of 
remote_daemon_generated and remote_driver_generated.  You could start with an 
empty remote_daemon_generated and append remote_xxx_generated after the foreach 
loop instead of initializing the way I did.

So, your 1) is just a side-effect, not something I was explicitly seeking.

Joe

> -Original Message-
> From: Michal Prívozník 
> Sent: Monday, March 14, 2022 3:13 AM
> To: Slater, Joseph ; libvir-list@redhat.com
> Cc: MacLeod, Randy 
> Subject: Re: [PATCH] add build dependency on lxc_protocol.h to
> remote_daemon
> 
> On 3/10/22 21:53, Joe Slater wrote:
> > remote_daemon.c and others need the generated header lxc_protocol.h,
> > but do not have it as a dependency in meson.build.  This means that
> > builds will randomly (ok, very occasionally) fail.  Restructure how
> > the header is built so that remote_daemon can have it as a dependency.
> >
> > Signed-off-by: Joe Slater 
> >
> > ---
> >  src/remote/meson.build | 48
> > --
> >  1 file changed, 28 insertions(+), 20 deletions(-)
> >
> > diff --git a/src/remote/meson.build b/src/remote/meson.build index
> > 0a18826..31a30ee 100644
> > --- a/src/remote/meson.build
> > +++ b/src/remote/meson.build
> > @@ -1,27 +1,11 @@
> > -remote_driver_sources = [
> > -  'remote_driver.c',
> > -  'remote_sockets.c',
> > -]
> > -
> > -remote_driver_generated = []
> > +remote_xxx_generated = []
> >
> >  foreach name : [ 'remote', 'qemu', 'lxc' ]
> > -  client_bodies_h = '@0@_client_bodies.h'.format(name)
> >protocol_c = '@0@_protocol.c'.format(name)
> >protocol_h = '@0@_protocol.h'.format(name)
> >protocol_x = '@0@_protocol.x'.format(name)
> >
> > -  remote_driver_generated += custom_target(
> > -client_bodies_h,
> > -input: protocol_x,
> > -output: client_bodies_h,
> > -command: [
> > -  gendispatch_prog, '--mode=client', name, name.to_upper(), '@INPUT@',
> > -],
> > -capture: true,
> > -  )
> > -
> > -  remote_driver_generated += custom_target(
> > +  remote_xxx_generated += custom_target(
> >  protocol_h,
> >  input: protocol_x,
> >  output: protocol_h,
> > @@ -30,7 +14,7 @@ foreach name : [ 'remote', 'qemu', 'lxc' ]
> >  ],
> >)
> >
> > -  remote_driver_generated += custom_target(
> > +  remote_xxx_generated += custom_target(
> >  protocol_c,
> >  input: protocol_x,
> >  output: protocol_c,
> > @@ -42,6 +26,30 @@ foreach name : [ 'remote', 'qemu', 'lxc' ]
> >rpc_probe_files += files(protocol_x)  endforeach
> >
> > +
> > +remote_driver_sources = [
> > +  'remote_driver.c',
> > +  'remote_sockets.c',
> > +]
> > +
> > +remote_driver_generated =remote_xxx_generated
> > +
> > +foreach name : [ 'remote', 'qemu', 'lxc' ]
> > +  client_bodies_h = '@0@_client_bodies.h'.format(name)
> > +  protocol_x = '@0@_protocol.x'.format(name)
> > +
> > +  remote_driver_generated += custom_target(
> > +client_bodies_h,
> > +input: protocol_x,
> > +output: client_bodies_h,
> > +command: [
> > +  gendispatch_prog, '--mode=client', name, name.to_upper(), '@INPUT@',
> > +],
> > +capture: true,
> > +  )
> > +
> > +endforeach
> > +
> >  remote_daemon_sources = files(
> >'remote_daemon.c',
> >'remote_daemon_config.c',
> > @@ -49,7 +57,7 @@ remote_daemon_sources = files(
> >'remote_daemon_stream.c',
> >  )
> >
> > -remote_daemon_generated = []
> > +remote_daemon_generated = remote_xxx_generated
> 
> So IIUC, this fix consists of two parts:
> 1) generating client_bodies_h only AFTER protocol.x for all three drivers was
> processed,
> 2) Initializing remote_daemon_generated to remote_xxx_generated (which
> contains protocol.x processing targets).
> 
> IMO, it's only the second step that's really needed, isn't it? I'm not 
> against this
> patch, I'm just trying to confirm I understood the problem and the fix.
> 
> Michal



[PATCH 1/2] qemu_namespace: Don't unlink paths from cgroupDeviceACL

2022-03-14 Thread Michal Privoznik
When building the namespace for a domain there are a couple of devices
that are created independently of the domain config (see
qemuDomainPopulateDevices()). The idea behind this is that these
devices are crucial for QEMU or one of its libraries, or the user is
passing through a device and wants us to create it in the
namespace too.  That's the reason these devices are allowed
in the devices CGroup controller as well.

However, during unplug it may happen that a device was configured
to use one of these nodes, and since we remove /dev nodes on
hotunplug we would remove such a node too. For example,
/dev/urandom belongs on the list of implicit devices, and users
can hotplug and hotunplug an RNG device with /dev/urandom as its
backend.

The fix is fortunately simple - just consult the list of implicit
devices before removing the device from the namespace.

Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_namespace.c | 21 +++--
 1 file changed, 19 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_namespace.c b/src/qemu/qemu_namespace.c
index 3b41d72630..1132fd04e5 100644
--- a/src/qemu/qemu_namespace.c
+++ b/src/qemu/qemu_namespace.c
@@ -1364,6 +1364,8 @@ qemuNamespaceUnlinkPaths(virDomainObj *vm,
 if (STRPREFIX(path, QEMU_DEVPREFIX)) {
 GStrv mount;
 bool inSubmount = false;
+const char *const *devices = (const char *const 
*)cfg->cgroupDeviceACL;
+bool inDevices = false;
 
 for (mount = devMountsPath; *mount; mount++) {
 if (STREQ(*mount, "/dev"))
@@ -1375,8 +1377,23 @@ qemuNamespaceUnlinkPaths(virDomainObj *vm,
 }
 }
 
-if (!inSubmount)
-unlinkPaths = g_slist_prepend(unlinkPaths, g_strdup(path));
+if (inSubmount)
+continue;
+
+if (!devices)
+devices = defaultDeviceACL;
+
+for (; *devices; devices++) {
+if (STREQ(path, *devices)) {
+inDevices = true;
+break;
+}
+}
+
+if (inDevices)
+continue;
+
+unlinkPaths = g_slist_prepend(unlinkPaths, g_strdup(path));
 }
 }
 
-- 
2.34.1



[PATCH 2/2] qemu_namespace: Be less aggressive in removing /dev nodes from namespace

2022-03-14 Thread Michal Privoznik
When creating /dev nodes in a QEMU domain's namespace the first
thing we simply do is unlink() the path and create it again. This
aims to solve the case when a file changed type/major/minor in
the host and thus we need to reflect this in the guest's
namespace. Fair enough, except we can be a bit more clever about
it: firstly check whether the path doesn't already exist or isn't
already of the correct type/major/minor and do the
unlink+creation only if needed.

Currently, this is implemented only for symlinks and
block/character devices. For regular files/directories (which are
less common) this might be implemented one day, but not today.

Signed-off-by: Michal Privoznik 
---

Use --ignore-space-change for the best experience.

 src/qemu/qemu_namespace.c | 71 ---
 1 file changed, 44 insertions(+), 27 deletions(-)

diff --git a/src/qemu/qemu_namespace.c b/src/qemu/qemu_namespace.c
index 1132fd04e5..0714a2d0de 100644
--- a/src/qemu/qemu_namespace.c
+++ b/src/qemu/qemu_namespace.c
@@ -948,38 +948,55 @@ qemuNamespaceMknodOne(qemuNamespaceMknodItem *data)
 }
 
 if (isLink) {
-VIR_DEBUG("Creating symlink %s -> %s", data->file, data->target);
+g_autoptr(GError) gerr = NULL;
+g_autofree char *target = NULL;
 
-/* First, unlink the symlink target. Symlinks change and
- * therefore we have no guarantees that pre-existing
- * symlink is still valid. */
-if (unlink(data->file) < 0 &&
-errno != ENOENT) {
-virReportSystemError(errno,
- _("Unable to remove symlink %s"),
- data->file);
-goto cleanup;
-}
-
-if (symlink(data->target, data->file) < 0) {
-virReportSystemError(errno,
- _("Unable to create symlink %s (pointing to 
%s)"),
- data->file, data->target);
-goto cleanup;
+if ((target = g_file_read_link(data->file, )) &&
+STREQ(target, data->target)) {
+VIR_DEBUG("Skipping symlink %s -> %s which exists and points to 
correct target",
+  data->file, data->target);
 } else {
-delDevice = true;
+VIR_DEBUG("Creating symlink %s -> %s", data->file, data->target);
+
+/* First, unlink the symlink target. Symlinks change and
+ * therefore we have no guarantees that pre-existing
+ * symlink is still valid. */
+if (unlink(data->file) < 0 &&
+errno != ENOENT) {
+virReportSystemError(errno,
+ _("Unable to remove symlink %s"),
+ data->file);
+goto cleanup;
+}
+
+if (symlink(data->target, data->file) < 0) {
+virReportSystemError(errno,
+ _("Unable to create symlink %s (pointing 
to %s)"),
+ data->file, data->target);
+goto cleanup;
+} else {
+delDevice = true;
+}
 }
 } else if (isDev) {
-VIR_DEBUG("Creating dev %s (%d,%d)",
-  data->file, major(data->sb.st_rdev), 
minor(data->sb.st_rdev));
-unlink(data->file);
-if (mknod(data->file, data->sb.st_mode, data->sb.st_rdev) < 0) {
-virReportSystemError(errno,
- _("Unable to create device %s"),
- data->file);
-goto cleanup;
+GStatBuf sb;
+
+if (g_lstat(data->file, ) >= 0 &&
+sb.st_rdev == data->sb.st_rdev) {
+VIR_DEBUG("Skipping dev %s (%d,%d) which exists and has correct 
MAJ:MIN",
+   data->file, major(data->sb.st_rdev), 
minor(data->sb.st_rdev));
 } else {
-delDevice = true;
+VIR_DEBUG("Creating dev %s (%d,%d)",
+  data->file, major(data->sb.st_rdev), 
minor(data->sb.st_rdev));
+unlink(data->file);
+if (mknod(data->file, data->sb.st_mode, data->sb.st_rdev) < 0) {
+virReportSystemError(errno,
+ _("Unable to create device %s"),
+ data->file);
+goto cleanup;
+} else {
+delDevice = true;
+}
 }
 } else if (isReg || isDir) {
 /* We are not cleaning up disks on virDomainDetachDevice
-- 
2.34.1



[PATCH] scripts: Fix the parameter of warning function

2022-03-14 Thread luzhipeng
The parameter of self.warning is inconsistent with its definition, so
fix it.

Signed-off-by: luzhipeng 
---
 scripts/apibuild.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/apibuild.py b/scripts/apibuild.py
index bdd3077c48..99b16f47fa 100755
--- a/scripts/apibuild.py
+++ b/scripts/apibuild.py
@@ -317,7 +317,7 @@ class index:
 if type in type_map:
 type_map[type][name] = d
 else:
-self.warning("Unable to register type ", type)
+self.warning("Unable to register type %s" % type)
 
 if name == debugsym and not quiet:
 print("New symbol: %s" % (d))
-- 
2.34.0.windows.1
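For context, the old call passed the type as a second positional argument, but `warning()` in apibuild.py accepts only a single message string, so the call fails at runtime. A toy reproduction (hypothetical, simplified class mirroring the script's one-parameter signature):

```python
class Index:
    """Toy stand-in for apibuild.py's index class (hypothetical, simplified)."""

    def warning(self, msg):
        # Like the script's warning(): accepts exactly one message string.
        print("Warning:", msg)

    def register_type(self, type_name, known):
        if known:
            return "registered"
        # The old call, self.warning("Unable to register type ", type_name),
        # would raise TypeError because warning() takes only one argument.
        self.warning("Unable to register type %s" % type_name)
        return None


idx = Index()
idx.register_type("macro", False)   # prints: Warning: Unable to register type macro
```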





Re: [libvirt RFC] virFile: new VIR_FILE_WRAPPER_BIG_PIPE to improve performance

2022-03-14 Thread Claudio Fontana
Hello Michal,

thanks for your answer,

On 3/14/22 12:36 PM, Michal Prívozník wrote:
> On 3/12/22 17:30, Claudio Fontana wrote:
>> the first user is the qemu driver,
>>
>> virsh save/resume would slow to a crawl with a default pipe size (64k).
>>
>> This improves the situation by 400%.
>>
>> Going through io_helper still seems to incur in some penalty (~15%-ish)
>> compared with direct qemu migration to a nc socket to a file.
>>
>> Signed-off-by: Claudio Fontana 
>> ---
>>  src/qemu/qemu_driver.c|  6 +++---
>>  src/qemu/qemu_saveimage.c | 11 ++-
>>  src/util/virfile.c| 12 
>>  src/util/virfile.h|  1 +
>>  4 files changed, 22 insertions(+), 8 deletions(-)
>>
>> Hello, I initially thought this to be a qemu performance issue,
>> so you can find the discussion about this in qemu-devel:
>>
>> "Re: bad virsh save /dev/null performance (600 MiB/s max)"
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg03142.html
>>
>> RFC since need to validate idea, and it is only lightly tested:
>>
>> save - about 400% benefit in throughput, getting around 20 Gbps to 
>> /dev/null,
>>and around 13 Gbps to a ramdisk.
>> By comparison, direct qemu migration to a nc socket is around 24Gbps.
>>
>> restore  - not tested, _should_ also benefit in the "bypass_cache" case
>> coredump - not tested, _should_ also benefit like for save
>>
>> Thanks for your comments and review,
>>
>> Claudio
> 
> Hey, I like this idea, but couple of points below.
> 
>>
>>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index c1b3bd8536..be248c1e92 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -3044,7 +3044,7 @@ doCoreDump(virQEMUDriver *driver,
>>  virFileWrapperFd *wrapperFd = NULL;
>>  int directFlag = 0;
>>  bool needUnlink = false;
>> -unsigned int flags = VIR_FILE_WRAPPER_NON_BLOCKING;
>> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
>> VIR_FILE_WRAPPER_BIG_PIPE;
>>  const char *memory_dump_format = NULL;
>>  g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
>>  g_autoptr(virCommand) compressor = NULL;
>> @@ -3059,7 +3059,7 @@ doCoreDump(virQEMUDriver *driver,
>>  
>>  /* Create an empty file with appropriate ownership.  */
>>  if (dump_flags & VIR_DUMP_BYPASS_CACHE) {
>> -flags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
>> +wrapperFlags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
>>  directFlag = virFileDirectFdFlag();
>>  if (directFlag < 0) {
>>  virReportError(VIR_ERR_OPERATION_FAILED, "%s",
>> @@ -3072,7 +3072,7 @@ doCoreDump(virQEMUDriver *driver,
>>   )) < 0)
>>  goto cleanup;
>>  
>> -if (!(wrapperFd = virFileWrapperFdNew(, path, flags)))
>> +if (!(wrapperFd = virFileWrapperFdNew(, path, wrapperFlags)))
>>  goto cleanup;
>>  
>>  if (dump_flags & VIR_DUMP_MEMORY_ONLY) {
>> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
>> index c0139041eb..1b522a1542 100644
>> --- a/src/qemu/qemu_saveimage.c
>> +++ b/src/qemu/qemu_saveimage.c
>> @@ -267,7 +267,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>>  int fd = -1;
>>  int directFlag = 0;
>>  virFileWrapperFd *wrapperFd = NULL;
>> -unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING;
>> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
>> VIR_FILE_WRAPPER_BIG_PIPE;
>>  
>>  /* Obtain the file handle.  */
>>  if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
>> @@ -463,10 +463,11 @@ qemuSaveImageOpen(virQEMUDriver *driver,
>>  if ((fd = qemuDomainOpenFile(cfg, NULL, path, oflags, NULL)) < 0)
>>  return -1;
>>  
>> -if (bypass_cache &&
>> -!(*wrapperFd = virFileWrapperFdNew(, path,
>> -   VIR_FILE_WRAPPER_BYPASS_CACHE)))
>> -return -1;
>> +if (bypass_cache) {
>> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_BYPASS_CACHE | 
>> VIR_FILE_WRAPPER_BIG_PIPE;
>> +if (!(*wrapperFd = virFileWrapperFdNew(, path, wrapperFlags)))
>> +return -1;
>> +}
>>  
>>  data = g_new0(virQEMUSaveData, 1);
>>  
>> diff --git a/src/util/virfile.c b/src/util/virfile.c
>> index a04f888e06..fdacd17890 100644
>> --- a/src/util/virfile.c
>> +++ b/src/util/virfile.c
>> @@ -282,6 +282,18 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned 
>> int flags)
>>  
>>  ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>>  
>> +if (flags & VIR_FILE_WRAPPER_BIG_PIPE) {
> 
> 
> I believe we don't need this flag. I mean, the plain fact that
> virFileWrapper is used means that the caller wants to avoid VFS because
> it's interested in speed. Therefore, this code could be done unconditionally.


right, I see now this is called only by the qemu driver for those specific 
operations.


> 
>> +/*
>> + * virsh save/resume would slow to a crawl with a default pipe 

Re: Virtqemud wants to unlink /dev/urandom

2022-03-14 Thread Michal Prívozník
On 3/14/22 12:45, Martin Kletzander wrote:
> [adding back libvir-list to the Cc]
> 
> On Fri, Mar 11, 2022 at 03:55:03PM +0100, Nikola Knazekova wrote:
>> Hey Martin,
>>
>> thanks for your response.
>>
>> I don't know if it is happening in the mount namespace. Can you look
>> at the
>> logs in attachment?
>>
>> It was happening on a clean install on F35, F36 and probably on older
>> versions too.
>> But it is only an issue in the new SELinux policy for libvirt. The old
>> SELinux policy allows virtd to unlink /dev/urandom char files.
>> I just wanted to be sure if it is ok to allow it for virtqemud.
>>
> 
> That actually might be the case, that it actually does set the context
> on /dev/urandom correctly and then the unlink fails for virtqemud since
> the selinux policy only accounts for libvirtd even though we switched to
> modular daemons making virtqemud the one to do the work.
> 
> @Michal can you confirm what I'm guessing here since you did a lot of
> the mount namespace work which I presume is what contributes to the
> issue here.
> 
> In the meantime, would you mind trying this with the mount namespace
> feature turned off in /etc/libvirt/qemu.conf like this:
> 
> namespaces = []
> 

Yeah, this will definitely help. So, a short introduction to how
libvirt starts a QEMU guest. It creates a mount namespace so that QEMU
doesn't have access to all the files in the system. In this namespace
(which is per QEMU process) first a few paths are populated
independently of the guest configuration (like /dev/null, /dev/random,
/dev/urandom, etc.) - the full list is accessible here:

https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu.conf#L565

(yes, it's the cgroup_device_acl list - because what you want to enable
in CGroups you want to expose in the namespace)
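For reference, the commented-out default of that list in qemu.conf looks roughly like this (copied from a libvirt 8.0-era qemu.conf; check the file shipped with your build, as older releases also listed /dev/rtc and /dev/hpet):

```
cgroup_device_acl = [
    "/dev/null", "/dev/full", "/dev/zero",
    "/dev/random", "/dev/urandom",
    "/dev/ptmx", "/dev/kvm"
]
```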

Then, the paths from domain XML are created using the following function:

https://gitlab.com/libvirt/libvirt/-/blob/master/src/qemu/qemu_namespace.c#L931

This function is written in a fashion that allows files to exist: if
needed [1] it simply unlink()-s the existing file and creates it from
scratch again. Now, since you configured a TPM for your guest with
/dev/urandom as a backend, this node is created twice. The first time
along with the other cgroup_device_acl files, the second because of the
TPM in your domain config.

1: needed is probably a bad word, and in fact we can be more clever
about it. We might check whether a given device already exists with the
same MAJ:MIN and act accordingly. The same applies to symlinks.

Let me see if I can cook up a patch that implements this idea.

Michal



Re: Virtqemud wants to unlink /dev/urandom

2022-03-14 Thread Martin Kletzander

[adding back libvir-list to the Cc]

On Fri, Mar 11, 2022 at 03:55:03PM +0100, Nikola Knazekova wrote:

Hey Martin,

thanks for your response.

I don't know if it is happening in the mount namespace. Can you look at the
logs in attachment?

It was happening on a clean install on F35, F36 and probably on older
versions too.
But it is only an issue in the new SELinux policy for libvirt. The old
SELinux policy allows virtd to unlink /dev/urandom char files.
I just wanted to be sure if it is ok to allow it for virtqemud.



That might indeed be the case: it does set the context
on /dev/urandom correctly, and then the unlink fails for virtqemud since
the SELinux policy only accounts for libvirtd even though we switched to
modular daemons, making virtqemud the one doing the work.

@Michal can you confirm what I'm guessing here since you did a lot of
the mount namespace work which I presume is what contributes to the
issue here.

In the meantime, would you mind trying this with the mount namespace
feature turned off in /etc/libvirt/qemu.conf like this:

namespaces = []

Thanks.


Regards,
Nikola

On Thu, Feb 24, 2022 at 3:00 PM Martin Kletzander 
wrote:


On Thu, Feb 24, 2022 at 01:41:50PM +0100, Nikola Knazekova wrote:
>Hi,
>
>when I am creating virtual machine on system with new SELinux policy for
>Libvirt, I am getting this error message:
>
>Unable to complete install: 'Unable to create device /dev/urandom: File
>exists'
>Traceback (most recent call last):
>  File "/usr/share/virt-manager/virtManager/asyncjob.py", line 65, in
>cb_wrapper
>callback(asyncjob, *args, **kwargs)
>  File "/usr/share/virt-manager/virtManager/createvm.py", line 2001, in
>_do_async_install
>installer.start_install(guest, meter=meter)
>  File "/usr/share/virt-manager/virtinst/install/installer.py", line 701,
>in start_install
>domain = self._create_guest(
>  File "/usr/share/virt-manager/virtinst/install/installer.py", line 649,
>in _create_guest
>domain = self.conn.createXML(install_xml or final_xml, 0)
>  File "/usr/lib64/python3.10/site-packages/libvirt.py", line 4393, in
>createXML
>raise libvirtError('virDomainCreateXML() failed')
>libvirt.libvirtError: Unable to create device /dev/urandom: File exists
>
>And SELinux denial, where SELinux prevents virtqemud to unlink character
>device /dev/urandom:
>
>time->Wed Feb 23 19:30:33 2022
>type=PROCTITLE msg=audit(1645662633.819:930):

>proctitle=2F7573722F7362696E2F7669727471656D7564002D2D74696D656F757400313230
>type=PATH msg=audit(1645662633.819:930): item=1 name="/dev/urandom"
inode=6
>dev=00:44 mode=020666 ouid=0 ogid=0 rdev=01:09
>obj=system_u:object_r:urandom_device_t:s0 nametype=DELETE cap_fp=0
cap_fi=0
>cap_fe=0 cap_fver=0 cap_frootid=0
>type=PATH msg=audit(1645662633.819:930): item=0 name="/dev/" inode=1
>dev=00:44 mode=040755 ouid=0 ogid=0 rdev=00:00
>obj=system_u:object_r:tmpfs_t:s0 nametype=PARENT cap_fp=0 cap_fi=0
cap_fe=0
>cap_fver=0 cap_frootid=0
>type=CWD msg=audit(1645662633.819:930): cwd="/"
>type=SYSCALL msg=audit(1645662633.819:930): arch=c03e syscall=87
>success=no exit=-13 a0=7f9418064f50 a1=7f943909c930 a2=7f941d0ef6d4 a3=0
>items=2 ppid=6722 pid=7196 auid=4294967295 uid=0 gid=0 euid=0 suid=0
>fsuid=0 egid=0 sgid=0 fsgid=0 tty=(none) ses=4294967295 comm="rpc-worker"
>exe="/usr/sbin/virtqemud" subj=system_u:system_r:virtqemud_t:s0 key=(null)
>type=AVC msg=audit(1645662633.819:930): avc:  denied  { unlink } for
> pid=7196 comm="rpc-worker" name="urandom" dev="tmpfs" ino=6
>scontext=system_u:system_r:virtqemud_t:s0
>tcontext=system_u:object_r:urandom_device_t:s0 tclass=chr_file
permissive=0
>
>Is this expected behavior?
>

The error is not, but creating and removing /dev/urandom is fine, as long
as it happens in the mount namespace of the domain, which we create and
in which we also need to create some basic /dev structure.

Unfortunately this error does not show whether it is happening in the
mount namespace, although it should definitely _not_ happen outside of it.

Does this happen on clean install?  What is the version of libvirt and
the selinux policy?  What's the distro+version of the system?  Would you
mind capturing the debug logs and attaching them?

How to capture debug logs: https://libvirt.org/kbase/debuglogs.html

>Thanks,
>Nikola




2022-03-04 03:08:28.053+: starting up libvirt version: 8.0.0, package: 
2.fc36 (Fedora Project, 2022-01-20-17:44:09, ), qemu version: 
6.2.0qemu-6.2.0-5.fc36, kernel: 5.17.0-0.rc5.102.fc36.x86_64, hostname: fedora
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin \
HOME=/var/lib/libvirt/qemu/domain-4-fedora35-3 \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-4-fedora35-3/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-4-fedora35-3/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-4-fedora35-3/.config \
/usr/bin/qemu-system-x86_64 \
-name guest=fedora35-3,debug-threads=on \
-S \
-object 

Re: [libvirt RFC] virFile: new VIR_FILE_WRAPPER_BIG_PIPE to improve performance

2022-03-14 Thread Michal Prívozník
On 3/12/22 17:30, Claudio Fontana wrote:
> the first user is the qemu driver,
> 
> virsh save/resume would slow to a crawl with a default pipe size (64k).
> 
> This improves the situation by 400%.
> 
> Going through io_helper still seems to incur in some penalty (~15%-ish)
> compared with direct qemu migration to a nc socket to a file.
> 
> Signed-off-by: Claudio Fontana 
> ---
>  src/qemu/qemu_driver.c|  6 +++---
>  src/qemu/qemu_saveimage.c | 11 ++-
>  src/util/virfile.c| 12 
>  src/util/virfile.h|  1 +
>  4 files changed, 22 insertions(+), 8 deletions(-)
> 
> Hello, I initially thought this to be a qemu performance issue,
> so you can find the discussion about this in qemu-devel:
> 
> "Re: bad virsh save /dev/null performance (600 MiB/s max)"
> 
> https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg03142.html
> 
> RFC since need to validate idea, and it is only lightly tested:
> 
> save - about 400% benefit in throughput, getting around 20 Gbps to 
> /dev/null,
>and around 13 Gbps to a ramdisk.
>  By comparison, direct qemu migration to a nc socket is around 24Gbps.
> 
> restore  - not tested, _should_ also benefit in the "bypass_cache" case
> coredump - not tested, _should_ also benefit like for save
> 
> Thanks for your comments and review,
> 
> Claudio

Hey, I like this idea, but couple of points below.

> 
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index c1b3bd8536..be248c1e92 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -3044,7 +3044,7 @@ doCoreDump(virQEMUDriver *driver,
>  virFileWrapperFd *wrapperFd = NULL;
>  int directFlag = 0;
>  bool needUnlink = false;
> -unsigned int flags = VIR_FILE_WRAPPER_NON_BLOCKING;
> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
> VIR_FILE_WRAPPER_BIG_PIPE;
>  const char *memory_dump_format = NULL;
>  g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver);
>  g_autoptr(virCommand) compressor = NULL;
> @@ -3059,7 +3059,7 @@ doCoreDump(virQEMUDriver *driver,
>  
>  /* Create an empty file with appropriate ownership.  */
>  if (dump_flags & VIR_DUMP_BYPASS_CACHE) {
> -flags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
> +wrapperFlags |= VIR_FILE_WRAPPER_BYPASS_CACHE;
>  directFlag = virFileDirectFdFlag();
>  if (directFlag < 0) {
>  virReportError(VIR_ERR_OPERATION_FAILED, "%s",
> @@ -3072,7 +3072,7 @@ doCoreDump(virQEMUDriver *driver,
>   )) < 0)
>  goto cleanup;
>  
> -if (!(wrapperFd = virFileWrapperFdNew(, path, flags)))
> +if (!(wrapperFd = virFileWrapperFdNew(, path, wrapperFlags)))
>  goto cleanup;
>  
>  if (dump_flags & VIR_DUMP_MEMORY_ONLY) {
> diff --git a/src/qemu/qemu_saveimage.c b/src/qemu/qemu_saveimage.c
> index c0139041eb..1b522a1542 100644
> --- a/src/qemu/qemu_saveimage.c
> +++ b/src/qemu/qemu_saveimage.c
> @@ -267,7 +267,7 @@ qemuSaveImageCreate(virQEMUDriver *driver,
>  int fd = -1;
>  int directFlag = 0;
>  virFileWrapperFd *wrapperFd = NULL;
> -unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING;
> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_NON_BLOCKING | 
> VIR_FILE_WRAPPER_BIG_PIPE;
>  
>  /* Obtain the file handle.  */
>  if ((flags & VIR_DOMAIN_SAVE_BYPASS_CACHE)) {
> @@ -463,10 +463,11 @@ qemuSaveImageOpen(virQEMUDriver *driver,
>  if ((fd = qemuDomainOpenFile(cfg, NULL, path, oflags, NULL)) < 0)
>  return -1;
>  
> -if (bypass_cache &&
> -!(*wrapperFd = virFileWrapperFdNew(, path,
> -   VIR_FILE_WRAPPER_BYPASS_CACHE)))
> -return -1;
> +if (bypass_cache) {
> +unsigned int wrapperFlags = VIR_FILE_WRAPPER_BYPASS_CACHE | 
> VIR_FILE_WRAPPER_BIG_PIPE;
> +if (!(*wrapperFd = virFileWrapperFdNew(, path, wrapperFlags)))
> +return -1;
> +}
>  
>  data = g_new0(virQEMUSaveData, 1);
>  
> diff --git a/src/util/virfile.c b/src/util/virfile.c
> index a04f888e06..fdacd17890 100644
> --- a/src/util/virfile.c
> +++ b/src/util/virfile.c
> @@ -282,6 +282,18 @@ virFileWrapperFdNew(int *fd, const char *name, unsigned 
> int flags)
>  
>  ret->cmd = virCommandNewArgList(iohelper_path, name, NULL);
>  
> +if (flags & VIR_FILE_WRAPPER_BIG_PIPE) {


I believe we don't need this flag. I mean, the plain fact that
virFileWrapper is used means that the caller wants to avoid VFS because
it's interested in speed. Therefore, this code could be done unconditionally.

> +/*
> + * virsh save/resume would slow to a crawl with a default pipe size 
> (usually 64k).
> + * This improves the situation by 400%, although going through 
> io_helper still incurs
> + * in a performance penalty compared with a direct qemu migration to 
> a socket.
> + */

This belongs in the commit message. This code has no knowledge 

Re: [PATCH 1/1] scripts: Fix the parameter of warning function

2022-03-14 Thread Michal Prívozník
On 3/12/22 08:42, luzhipeng wrote:
> ---
>  scripts/apibuild.py | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/scripts/apibuild.py b/scripts/apibuild.py
> index bdd3077c48..99b16f47fa 100755
> --- a/scripts/apibuild.py
> +++ b/scripts/apibuild.py
> @@ -317,7 +317,7 @@ class index:
>  if type in type_map:
>  type_map[type][name] = d
>  else:
> -self.warning("Unable to register type ", type)
> +self.warning("Unable to register type %s" % type)
>  
>  if name == debugsym and not quiet:
>  print("New symbol: %s" % (d))

Hey, the patch is correct. However, in order to be merged, a couple of
requirements have to be met:

1) the commit message must contain your Signed-off-by line:
   https://libvirt.org/hacking.html#developer-certificate-of-origin

2) speaking of the commit message - you should introduce one. If I looked at
this commit in git log (long after it's been pushed), I would have no
idea what it does and what problem it's fixing. It doesn't have to be
long; in fact, the shorter the better. But too short is not good either.

I'm open to making an exception: if you reply to this e-mail with the
commit message and your Signed-off-by line, I can amend those and merge.

Michal
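For context on why the one-line change is correct: a minimal reproduction, assuming apibuild.py's warning() takes a single message string (a simplified stand-in for the real method, inferred from the patch):

```python
# Simplified stand-in for apibuild.py's warning(), which accepts one
# message argument.  This is an illustration, not the real class.
def warning(msg):
    return "Warning: %s" % msg

# The old call passed two positional arguments, which is rejected.
try:
    warning("Unable to register type ", "macro")
    ok = False
except TypeError:
    ok = True  # extra positional argument raises TypeError

# The fixed call formats the type into the message string first.
fixed = warning("Unable to register type %s" % "macro")
print(ok, fixed)
```

In other words, the old code only worked by accident with print-style functions that accept multiple arguments; a single-parameter method needs the %-formatted string.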



Re: [PATCH] add build dependency on lxc_protocol.h to remote_daemon

2022-03-14 Thread Michal Prívozník
On 3/10/22 21:53, Joe Slater wrote:
> remote_daemon.c and others need the generated header lxc_protocol.h,
> but do not have it as a dependency in meson.build.  This means that
> builds will randomly (ok, very occasionally) fail.  Restructure how the
> header is built so that remote_daemon can have it as a dependency.
> 
> Signed-off-by: Joe Slater 
> 
> ---
>  src/remote/meson.build | 48 --
>  1 file changed, 28 insertions(+), 20 deletions(-)
> 
> diff --git a/src/remote/meson.build b/src/remote/meson.build
> index 0a18826..31a30ee 100644
> --- a/src/remote/meson.build
> +++ b/src/remote/meson.build
> @@ -1,27 +1,11 @@
> -remote_driver_sources = [
> -  'remote_driver.c',
> -  'remote_sockets.c',
> -]
> -
> -remote_driver_generated = []
> +remote_xxx_generated = []
>  
>  foreach name : [ 'remote', 'qemu', 'lxc' ]
> -  client_bodies_h = '@0@_client_bodies.h'.format(name)
>protocol_c = '@0@_protocol.c'.format(name)
>protocol_h = '@0@_protocol.h'.format(name)
>protocol_x = '@0@_protocol.x'.format(name)
>  
> -  remote_driver_generated += custom_target(
> -client_bodies_h,
> -input: protocol_x,
> -output: client_bodies_h,
> -command: [
> -  gendispatch_prog, '--mode=client', name, name.to_upper(), '@INPUT@',
> -],
> -capture: true,
> -  )
> -
> -  remote_driver_generated += custom_target(
> +  remote_xxx_generated += custom_target(
>  protocol_h,
>  input: protocol_x,
>  output: protocol_h,
> @@ -30,7 +14,7 @@ foreach name : [ 'remote', 'qemu', 'lxc' ]
>  ],
>)
>  
> -  remote_driver_generated += custom_target(
> +  remote_xxx_generated += custom_target(
>  protocol_c,
>  input: protocol_x,
>  output: protocol_c,
> @@ -42,6 +26,30 @@ foreach name : [ 'remote', 'qemu', 'lxc' ]
>rpc_probe_files += files(protocol_x)
>  endforeach
>  
> +
> +remote_driver_sources = [
> +  'remote_driver.c',
> +  'remote_sockets.c',
> +]
> +
> +remote_driver_generated =remote_xxx_generated
> +
> +foreach name : [ 'remote', 'qemu', 'lxc' ]
> +  client_bodies_h = '@0@_client_bodies.h'.format(name)
> +  protocol_x = '@0@_protocol.x'.format(name)
> +
> +  remote_driver_generated += custom_target(
> +client_bodies_h,
> +input: protocol_x,
> +output: client_bodies_h,
> +command: [
> +  gendispatch_prog, '--mode=client', name, name.to_upper(), '@INPUT@',
> +],
> +capture: true,
> +  )
> +
> +endforeach
> +
>  remote_daemon_sources = files(
>'remote_daemon.c',
>'remote_daemon_config.c',
> @@ -49,7 +57,7 @@ remote_daemon_sources = files(
>'remote_daemon_stream.c',
>  )
>  
> -remote_daemon_generated = []
> +remote_daemon_generated = remote_xxx_generated

So IIUC, this fix consists of two parts:
1) generating client_bodies_h only AFTER protocol.x for all three
drivers has been processed,
2) initializing remote_daemon_generated to remote_xxx_generated (which
contains the protocol.x processing targets).

IMO, it's only the second step that's really needed, isn't it? I'm not
against this patch, I'm just trying to confirm I understood the problem
and the fix.

Michal



[PATCH] scripts: Fix the parameter of warning function

2022-03-14 Thread luzhipeng
---
 scripts/apibuild.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/apibuild.py b/scripts/apibuild.py
index bdd3077c48..99b16f47fa 100755
--- a/scripts/apibuild.py
+++ b/scripts/apibuild.py
@@ -317,7 +317,7 @@ class index:
 if type in type_map:
 type_map[type][name] = d
 else:
-self.warning("Unable to register type ", type)
+self.warning("Unable to register type %s" % type)
 
 if name == debugsym and not quiet:
 print("New symbol: %s" % (d))
-- 
2.34.0.windows.1





[PATCH 1/1] scripts: Fix the parameter of warning function

2022-03-14 Thread luzhipeng
---
 scripts/apibuild.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/scripts/apibuild.py b/scripts/apibuild.py
index bdd3077c48..99b16f47fa 100755
--- a/scripts/apibuild.py
+++ b/scripts/apibuild.py
@@ -317,7 +317,7 @@ class index:
 if type in type_map:
 type_map[type][name] = d
 else:
-self.warning("Unable to register type ", type)
+self.warning("Unable to register type %s" % type)
 
 if name == debugsym and not quiet:
 print("New symbol: %s" % (d))
-- 
2.34.0.windows.1





[PATCH 0/1]scripts: Fix the parameter of warning function

2022-03-14 Thread luzhipeng


luzhipeng (1):
  scripts: Fix the parameter of warning function

 scripts/apibuild.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

-- 
2.34.0.windows.1





Re: [libvirt PATCH v2 00/10] Automatic mutex management - part 3

2022-03-14 Thread Tim Wiederhake
ping

On Fri, 2022-03-04 at 18:28 +0100, Tim Wiederhake wrote:
> Use the recently implemented VIR_LOCK_GUARD and
> VIR_WITH_MUTEX_LOCK_GUARD
> to simplify mutex management.
> 
> v1:
> https://listman.redhat.com/archives/libvir-list/2022-February/msg00674.html
> 
> Changed since v1:
> * Removed locking / unlocking in storage driver initialization and
> cleanup
>   instead of working around the issue of the lifetime of the mutex.
> 
> Tim Wiederhake (10):
>   test: Use automatic mutex management
>   openvz: Use automatic mutex management
>   remote_daemon_dispatch: Use automatic mutex management
>   netdev: Use automatic mutex management
>   nodesuspend: Use automatic mutex management
>   admin: Use automatic mutex management
>   esx_stream: Use automatic mutex management
>   esx_vi: Use automatic mutex management
>   storage: Removing mutex locking in initialization and cleanup
>   storage: Use automatic mutex management
> 
>  src/admin/admin_server_dispatch.c   |   3 +-
>  src/esx/esx_stream.c    |  65 --
>  src/esx/esx_vi.c    | 109 +++-
>  src/openvz/openvz_driver.c  |  91 +-
>  src/remote/remote_daemon_dispatch.c | 187 +---
>  src/storage/storage_driver.c    |  32 ++---
>  src/test/test_driver.c  |  15 +--
>  src/util/virnetdev.c    |  20 ++-
>  src/util/virnodesuspend.c   |  54 +++-
>  9 files changed, 193 insertions(+), 383 deletions(-)
> 
> -- 
> 2.31.1
> 
> 
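For readers who don't know the macros this series builds on: VIR_LOCK_GUARD and VIR_WITH_MUTEX_LOCK_GUARD give C code scoped lock/unlock behaviour, so the unlock can no longer be forgotten on an early-return path. A rough analogy in Python (illustrative only, not libvirt code) is the `with lock:` statement:

```python
# Analogy for automatic mutex management: the lock is released when the
# block exits, even on an early return or an exception, which is exactly
# the manual-unlock bug class the guard macros eliminate.
import threading

counter = 0
lock = threading.Lock()

def increment():
    global counter
    with lock:          # acquired here, released automatically at block exit
        counter += 1

threads = [threading.Thread(target=increment) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 8
```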




Re: [PATCH 0/3] qemu: Introduce 'manual' snapshot mode for storage providers not managed by libvirt

2022-03-14 Thread Ján Tomko

On a Friday in 2022, Peter Krempa wrote:

Peter Krempa (3):
 conf: snapshot: Introduce 'manual' mode for snapshot of a disk
 qemuSnapshotCreateActiveExternal: Implement manual snapshot mode
 kbase: Introduce 'snapshots' page and describe the new 'manual'
   snapshot

docs/formatdomain.rst   | 15 +-
docs/formatsnapshot.rst |  9 ++
docs/kbase/index.rst|  3 ++
docs/kbase/meson.build  |  1 +
docs/kbase/snapshots.rst| 53 +
docs/schemas/domainsnapshot.rng |  3 ++
src/conf/domain_conf.c  |  1 +
src/conf/domain_conf.h  |  1 +
src/conf/snapshot_conf.c|  6 
src/qemu/qemu_snapshot.c| 24 +++
src/test/test_driver.c  | 17 +++
11 files changed, 121 insertions(+), 12 deletions(-)
create mode 100644 docs/kbase/snapshots.rst



Reviewed-by: Ján Tomko 

Jano


signature.asc
Description: PGP signature


Re: [PATCH 1/3] conf: snapshot: Introduce 'manual' mode for snapshot of a disk

2022-03-14 Thread Ján Tomko

On a Friday in 2022, Peter Krempa wrote:

The idea of the manual mode is to allow a synchronized snapshot in cases
when the storage is outsourced to an unmanaged storage provider which
requires cooperation with snapshotting.

The mode will instruct the hypervisor to pause along when the other
components are snapshotted and the 'manual' disk can be snapshotted
along. This increases latency of the snapshot but allows them in
otherwise impossible situations.

Signed-off-by: Peter Krempa 
---
docs/formatdomain.rst   | 15 ---
docs/formatsnapshot.rst |  9 +
docs/schemas/domainsnapshot.rng |  3 +++
src/conf/domain_conf.c  |  1 +
src/conf/domain_conf.h  |  1 +
src/conf/snapshot_conf.c|  6 ++
src/qemu/qemu_snapshot.c|  5 +
src/test/test_driver.c  | 17 +
8 files changed, 50 insertions(+), 7 deletions(-)

diff --git a/docs/formatsnapshot.rst b/docs/formatsnapshot.rst
index 0fee35d89c..4635df89cb 100644
--- a/docs/formatsnapshot.rst
+++ b/docs/formatsnapshot.rst
@@ -124,6 +124,15 @@ The top-level ``domainsnapshot`` element may contain the 
following elements:
  corresponding domain disk, while others like qemu allow this field to
  override the domain default.

+  :since:`Since 8.2.0` the ``snapshot`` attribute supports the ``manual``
+  value which instructs the hypervisor to create the snapshot and keep a
+  synchronized state by pausing the VM which allows to snapshot disk
+  storage from outside of the hypervisor if the storage provider supports
+  it.  The caller is responsible for resuming a VM paused by requesting a
+  ``manual`` snapshot When reverting such snapshot, the expectation is that


Missing period.


+  the storage is configured in a way where the hypervisor will see the
+  correct image state.
+
  :since:`Since 1.2.2` the ``disk`` element supports an optional attribute
  ``type`` if the ``snapshot`` attribute is set to ``external``. This
  attribute specifies the snapshot target storage type and allows to


Reviewed-by: Ján Tomko 

Jano
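As a rough illustration of how the new mode appears to a caller, based on the ``snapshot`` attribute described in the quoted documentation (the disk name ``vda`` is hypothetical):

```
<domainsnapshot>
  <disks>
    <disk name='vda' snapshot='manual'/>
  </disks>
</domainsnapshot>
```

With such input the hypervisor pauses the VM around the snapshot so the external storage provider can snapshot ``vda`` in a synchronized state; the caller is then responsible for resuming the VM.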

