Re: [libvirt] [RFC] virfile: fix cast-align error

2018-10-09 Thread Bjoern Walk
Michal Privoznik  [2018-10-09, 04:52PM +0200]:
> What header file are you looking at? From /usr/include/bits/statfs.h:

It's s390x-specific hard-coded as an unsigned int.

> > Also, I am against exposing any
> > internal data types.
> 
> It's not internal if it is exposed in a public header file.
> 
> > 
> > virFileIsSharedFixFUSE could just return the value, the return value
> > right now is not checked anyways.
> 
> Sure, but since you claim the f_type member is an unsigned type, then we
> would hit the sign warning anyway.
> Also, not sure. If virFileIsSharedFixFUSE() is unable to tell the
> filesystem it should not touch the f_type at all.

Ok, then I guess Daniel's solution is the most simple one. I'll have
Marc send a v3 :)

-- 
IBM Systems
Linux on Z & Virtualization Development
--
IBM Deutschland Research & Development GmbH
Schönaicher Str. 220, 71032 Böblingen
Phone: +49 7031 16 1819
--
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294 


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] The results of lspci are inconsistent between vfio reset pci devices and reset devices by sysfs interafce

2018-10-09 Thread Alex Williamson
On Wed, 10 Oct 2018 01:47:10 +
"Wuzongyong (Euler Dept)"  wrote:
> 
> You're right. The initial states are not identical.
> I found the function vfio_pci_pre_reset in qemu.
> /*
>  * Stop any ongoing DMA by disconecting I/O, MMIO, and bus master.
>  * Also put INTx Disable in known state.
>  */
> cmd = vfio_pci_read_config(pdev, PCI_COMMAND, 2);
> cmd &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER |
>  PCI_COMMAND_INTX_DISABLE);
> vfio_pci_write_config(pdev, PCI_COMMAND, cmd, 2);
> 
> So the behaviors between the two reset are inconsistent.
> 
> Then I wonder whether the operation is necessary here?
> Could I enable the Memory bit in the Command register in vfio_pci_post_reset,
> because I want to write regions of PCI devices after reset.

One reset is done by the kernel to try to put the device into a known
clean state before allowing the user access to it, the other is done by
QEMU as part of the initial machine reset.  I suppose we could special
case the initial machine reset, but it seems perhaps risky and
unnecessary.

QEMU is the driver here, it can certainly enable MMIO on the device and
there are some examples in the QEMU code where MMIO is enabled to
interact with the device, see vfio_radeon_reset() for example.  The
device driver in the guest or the VM firmware should be the one to
enable the device for VM usage though, QEMU should provide the device
to the VM in a power-on default state, or as close as we can reasonably
get to that.  Thanks,

Alex

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] The results of lspci are inconsistent between vfio reset pci devices and reset devices by sysfs interafce

2018-10-09 Thread Wuzongyong (Euler Dept)
> > > Hi,
> > >
> > > I start a virtual machine with commandline:
> > > /usr/libexec/qemu-kvm --enable-kvm -smp 8 -m 8192 -device
> > > vfio-pci,host=:81:00.0
> > >
> > > Then I pause the qemu process before executing the main_loop
> > > function by
> > gdb.
> > > At this moment, lspci shows the regions are disabled like below:
> > > 81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100
> > > PCIe
> > 16GB] (rev a1)
> > > Subsystem: NVIDIA Corporation Device 118f
> > > Physical Slot: 0-6
> > > Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
> > ParErr- Stepping- SERR- FastB2B- DisINTx+
> > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast
> > > >TAbort-
> > SERR-  > > Interrupt: pin A routed to IRQ 35
> > > NUMA node: 1
> > > Region 0: Memory at c800 (32-bit, non-prefetchable)
> > [disabled] [size=16M]
> > > Region 1: Memory at 278 (64-bit, prefetchable)
> > > [disabled]
> > [size=16G]
> > > Region 3: Memory at 27c (64-bit, prefetchable)
> > > [disabled] [size=32M]
> > >
> > > But after the command:
> > > echo 1 > /sys/bus/pci/devices/:81:00.0/reset
> > > lspci shows the regions are *not* disabled:
> > > 81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100
> > > PCIe
> > 16GB] (rev a1)
> > > Subsystem: Huawei Technologies Co., Ltd. Device 2061
> > > Physical Slot: 0-6
> > > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> > ParErr+ Stepping- SERR+ FastB2B- DisINTx-
> > > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast
> > > >TAbort-
> > SERR-  > > Latency: 0, Cache Line Size: 32 bytes
> > > Interrupt: pin A routed to IRQ 7
> > > NUMA node: 1
> > > Region 0: Memory at c800 (32-bit, non-prefetchable)
> > [size=16M]
> > > Region 1: Memory at 278 (64-bit, prefetchable)
> [size=16G]
> > > Region 3: Memory at 27c (64-bit, prefetchable)
> > > [size=32M]
> > >
> > > AFAIK, qemu performs vfio_pci_reset like the below callstack:
> > > Qemu:
> > > vfio_pci_reset
> > > ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)
> > > Kernel:
> > > vfio_pci_ioctl
> > > pci_try_reset_function
> > > __pci_reset_function_locked
> > > pci_parent_bus_reset
> > > pci_reset_bridge_secondary_bus
> > >
> > > and write 1 to the reset interface of sysfs go through the path:
> > > Kernel:
> > > reset_store
> > > pci_reset_function
> > > __pci_reset_function_locked
> > > pci_parent_bus_reset
> > > pci_reset_bridge_secondary_bus
> > >
> > > So seem that these two methods are same actually, I am confused why
> > > the
> > results are inconsistent.
> >
> > Maybe there's a misunderstanding here, the kernel PCI reset functions
> > save and restore config space around the reset.  The intention of the
> > reset is to re-init the internal state of the device while preserving
> > (via
> > save+restore) the config space.  The BARs being disabled is simply a
> > matter of the Memory bit in the Command register being unset (note Mem-).
> > Whether this is indicative of some issue depends on whether the state
> > before reset matches the state after reset, not that the states after
> > two different paths of triggering a reset are identical.
> >
> > vfio-pci will hand off the device to the user (QEMU) disabled, so the
> > states in the first example make sense to me.  In the second case,
> > it's not clear what the starting state is for the device.  Was this
> > reset performed from the starting point of the first case or is the
> > device in some arbitrary, unknown state prior to reset?  Thanks,
> >
> > Alex
> In the second case, the reset was performed from the starting point of the
> first case.
> IOW, the states before the two cases are identical, I think. The only
> difference I can think of is the qemu process will perform twice reset,
> one occurs when vfio open the device' fd and the other one occurs as I
> mentioned above.
> 
> Thanks,
> Wu Zongyong

You're right. The initial states are not identical.
I found the function vfio_pci_pre_reset in qemu.
/*
 * Stop any ongoing DMA by disconecting I/O, MMIO, and bus master.
 * Also put INTx Disable in known state.
 */
cmd = vfio_pci_read_config(pdev, PCI_COMMAND, 2);
cmd &= ~(PCI_COMMAND_IO | PCI_COMMAND_MEMORY | PCI_COMMAND_MASTER |
 PCI_COMMAND_INTX_DISABLE);
vfio_pci_write_config(pdev, PCI_COMMAND, cmd, 2);

So the behaviors between the two reset are inconsistent.

Then I wonder whether the operation is necessary here?
Could I enable the Memory bit in the Command register in vfio_pci_post_reset,
because I want to write regions of PCI devices after reset.

Thanks,
Wu Zongyong





--
libvir-list mailing list
libvir-list@redhat.com

Re: [libvirt] The results of lspci are inconsistent between vfio reset pci devices and reset devices by sysfs interafce

2018-10-09 Thread Wuzongyong (Euler Dept)
> > Hi,
> >
> > I start a virtual machine with commandline:
> > /usr/libexec/qemu-kvm --enable-kvm -smp 8 -m 8192 -device
> > vfio-pci,host=:81:00.0
> >
> > Then I pause the qemu process before executing the main_loop function by
> gdb.
> > At this moment, lspci shows the regions are disabled like below:
> > 81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe
> 16GB] (rev a1)
> > Subsystem: NVIDIA Corporation Device 118f
> > Physical Slot: 0-6
> > Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop-
> ParErr- Stepping- SERR- FastB2B- DisINTx+
> > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> SERR-  > Interrupt: pin A routed to IRQ 35
> > NUMA node: 1
> > Region 0: Memory at c800 (32-bit, non-prefetchable)
> [disabled] [size=16M]
> > Region 1: Memory at 278 (64-bit, prefetchable) [disabled]
> [size=16G]
> > Region 3: Memory at 27c (64-bit, prefetchable)
> > [disabled] [size=32M]
> >
> > But after the command:
> > echo 1 > /sys/bus/pci/devices/:81:00.0/reset
> > lspci shows the regions are *not* disabled:
> > 81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe
> 16GB] (rev a1)
> > Subsystem: Huawei Technologies Co., Ltd. Device 2061
> > Physical Slot: 0-6
> > Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
> ParErr+ Stepping- SERR+ FastB2B- DisINTx-
> > Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
> SERR-  > Latency: 0, Cache Line Size: 32 bytes
> > Interrupt: pin A routed to IRQ 7
> > NUMA node: 1
> > Region 0: Memory at c800 (32-bit, non-prefetchable)
> [size=16M]
> > Region 1: Memory at 278 (64-bit, prefetchable) [size=16G]
> > Region 3: Memory at 27c (64-bit, prefetchable)
> > [size=32M]
> >
> > AFAIK, qemu performs vfio_pci_reset like the below callstack:
> > Qemu:
> > vfio_pci_reset
> > ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)
> > Kernel:
> > vfio_pci_ioctl
> > pci_try_reset_function
> > __pci_reset_function_locked
> > pci_parent_bus_reset
> > pci_reset_bridge_secondary_bus
> >
> > and write 1 to the reset interface of sysfs go through the path:
> > Kernel:
> > reset_store
> > pci_reset_function
> > __pci_reset_function_locked
> > pci_parent_bus_reset
> > pci_reset_bridge_secondary_bus
> >
> > So seem that these two methods are same actually, I am confused why the
> results are inconsistent.
> 
> Maybe there's a misunderstanding here, the kernel PCI reset functions save
> and restore config space around the reset.  The intention of the reset is
> to re-init the internal state of the device while preserving (via
> save+restore) the config space.  The BARs being disabled is simply a
> matter of the Memory bit in the Command register being unset (note Mem-).
> Whether this is indicative of some issue depends on whether the state
> before reset matches the state after reset, not that the states after two
> different paths of triggering a reset are identical.
> 
> vfio-pci will hand off the device to the user (QEMU) disabled, so the
> states in the first example make sense to me.  In the second case, it's
> not clear what the starting state is for the device.  Was this reset
> performed from the starting point of the first case or is the device in
> some arbitrary, unknown state prior to reset?  Thanks,
> 
> Alex
In the second case, the reset was performed from the starting point of the 
first case.
IOW, the states before the two cases are identical, I think. The only 
difference I can think of
is the qemu process will perform twice reset, one occurs when vfio open the 
device' fd and the 
other one occurs as I mentioned above.

Thanks,
Wu Zongyong

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCHv5 07/19] util: Refactor code for creating resctrl group

2018-10-09 Thread John Ferlan



On 10/9/18 6:30 AM, Wang Huaqiang wrote:
> The code for creating resctrl allocation group could be reused
> for monitoring group, refactor it for reusing in the later patch.
> 
> Signed-off-by: Wang Huaqiang 
> ---
>  src/util/virresctrl.c | 37 +++--
>  1 file changed, 23 insertions(+), 14 deletions(-)
> 

Reviewed-by: John Ferlan 

John

(NB: Done for the day too).

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCHv5 02/19] util: Introduce resctrl monitor for CMT

2018-10-09 Thread John Ferlan



On 10/9/18 6:30 AM, Wang Huaqiang wrote:
> Cache Monitoring Technology (aka CMT) provides the capability
> to report cache utilization information of system task.
> 
> This patch introduces the concept of resctrl monitor through
> data structure virResctrlMonitor.
> 
> Signed-off-by: Wang Huaqiang 
> ---
>  src/libvirt_private.syms |  1 +
>  src/util/virresctrl.c| 56 
> 
>  src/util/virresctrl.h|  7 ++
>  3 files changed, 64 insertions(+)
> 
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index 335210c..d2573c5 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -2680,6 +2680,7 @@ virResctrlInfoGetCache;
>  virResctrlInfoGetMonitorPrefix;
>  virResctrlInfoMonFree;
>  virResctrlInfoNew;
> +virResctrlMonitorNew;
>  
>  
>  # util/virrotatingfile.h
> diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
> index 697424c..18ee560 100644
> --- a/src/util/virresctrl.c
> +++ b/src/util/virresctrl.c
> @@ -105,6 +105,7 @@ typedef virResctrlAllocMemBW *virResctrlAllocMemBWPtr;
>  /* Class definitions and initializations */
>  static virClassPtr virResctrlInfoClass;
>  static virClassPtr virResctrlAllocClass;
> +static virClassPtr virResctrlMonitorClass;
>  
>  
>  /* virResctrlInfo */
> @@ -319,6 +320,35 @@ struct _virResctrlAlloc {
>  char *path;
>  };
>  
> +/* virResctrlMonitor */
> +
> +/*
> + * virResctrlMonitor is the data structure for resctrl monitor. Resctrl
> + * monitor represents a resctrl monitoring group, which can be used to
> + * monitor the resource utilization information for either cache or
> + * memory bandwidth.
> + *
> + * From hardware perspective, cache monitoring technology (CMT), memory
> + * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal
> + * features. The monitor will be created under the scope of default 
> allocation
> + * if no CAT or MBA supported in the system.

"if no specific CAT or MBA entries are provided for the guest"

The rest seems reasonable at least for now, so

Reviewed-by: John Ferlan 

John

[...]

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCHv5 06/19] util: Add monitor interface to determine path

2018-10-09 Thread John Ferlan



On 10/9/18 6:30 AM, Wang Huaqiang wrote:
> Add interface for resctrl monitor to determine the path.
> 
> Signed-off-by: Wang Huaqiang 
> ---
>  src/libvirt_private.syms |  1 +
>  src/util/virresctrl.c| 32 
>  src/util/virresctrl.h|  3 +++
>  3 files changed, 36 insertions(+)
> 
> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> index 4a52a86..e175c8b 100644
> --- a/src/libvirt_private.syms
> +++ b/src/libvirt_private.syms
> @@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix;
>  virResctrlInfoMonFree;
>  virResctrlInfoNew;
>  virResctrlMonitorAddPID;
> +virResctrlMonitorDeterminePath;
>  virResctrlMonitorNew;
>  
>  
> diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
> index 03001cc..1a5578e 100644
> --- a/src/util/virresctrl.c
> +++ b/src/util/virresctrl.c
> @@ -2465,3 +2465,35 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
>  {
>  return virResctrlAddPID(monitor->path, pid);
>  }
> +
> +int
> +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
> +   const char *machinename)
> +{
> +char *alloc_path = NULL;

const char

> +char *parentpath = NULL;

VIR_AUTOFREE(char *) parentpath = NULL;

(thus the VIR_FREE later isn't necessary)

> +
> +if (!monitor) {
> +virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +   _("Invalid resctrl monitor"));
> +return -1;
> +}

Shouldn't there be a monitor->path check here like there is in
virResctrlAllocDeterminePath?

> +
> +if (monitor->alloc)
> +alloc_path = monitor->alloc->path;
> +else
> +alloc_path = (char *)SYSFS_RESCTRL_PATH;

s/(char *)//

> +
> +if (virAsprintf(, "%s/mon_groups", alloc_path) < 0)
> +return -1;
> +
> +monitor->path = virResctrlDeterminePath(parentpath, machinename,
> +monitor->id);
> +
> +VIR_FREE(parentpath);

see above...

John

> +
> +if (!monitor->path)
> +return -1;
> +
> +return 0;
> +}
> diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
> index cb9bfae..69b6b1d 100644
> --- a/src/util/virresctrl.h
> +++ b/src/util/virresctrl.h
> @@ -196,4 +196,7 @@ virResctrlMonitorNew(void);
>  int
>  virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
>  pid_t pid);
> +int
> +virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
> +   const char *machinename);
>  #endif /*  __VIR_RESCTRL_H__ */
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCHv5 05/19] util: Refactor code for determining allocation path

2018-10-09 Thread John Ferlan



On 10/9/18 6:30 AM, Wang Huaqiang wrote:
> The code for determining resctrl allocation path could be reused
> for monitor. Refactor it for reusing.
> 
> Signed-off-by: Wang Huaqiang 
> ---
>  src/util/virresctrl.c | 33 +++--
>  1 file changed, 27 insertions(+), 6 deletions(-)
> 
> diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
> index c9a79f7..03001cc 100644
> --- a/src/util/virresctrl.c
> +++ b/src/util/virresctrl.c
> @@ -2267,6 +2267,26 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl,
>  }
>  
>  
> +static char *
> +virResctrlDeterminePath(const char *pathparent,

s/pathparent/parentpath

> +const char *prefix,
> +const char *id)
> +{
> +char *path = NULL;
> +
> +if (!id) {
> +virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> +   _("Resctrl resource ID must be set before creation"));

s/Resctrl resource ID/'%s' resctrl ID/

where %s is @parentpath

Working in the @parentpath, then we'd know which it was Alloc or Monitor

> +return NULL;
> +}
> +
> +if (virAsprintf(, "%s/%s-%s", pathparent, prefix, id) < 0)

parentpath

> +return NULL;
> +
> +return path;
> +}
> +
> +
>  int
>  virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
>   const char *machinename)
> @@ -2274,15 +2294,16 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
>  if (!alloc)
>  return 0;
>  
> -if (!alloc->id) {
> -virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> -   _("Resctrl Allocation ID must be set before 
> creation"));
> +if (alloc->path) {
> +virReportError(VIR_ERR_INVALID_ARG, "%s",
> +   _("Resctrl group path is expected to be NULL"));

Resctrl alloc group path is already set '%s'

I think this is an internal error not an invalid arg, too

John

>  return -1;
>  }
>  
> -if (!alloc->path &&
> -virAsprintf(>path, "%s/%s-%s",
> -SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0)
> +alloc->path = virResctrlDeterminePath(SYSFS_RESCTRL_PATH,
> +  machinename,
> +  alloc->id);
> +if (!alloc->path)
>  return -1;
>  
>  return 0;
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCHv5 04/19] util: Add interface for adding PID to monitor

2018-10-09 Thread John Ferlan



On 10/9/18 6:30 AM, Wang Huaqiang wrote:
> Add interface for adding task PID to monitor.

to the monitor

> 
> Signed-off-by: Wang Huaqiang 
> ---
>  src/libvirt_private.syms | 1 +
>  src/util/virresctrl.c| 8 
>  src/util/virresctrl.h| 4 
>  3 files changed, 13 insertions(+)

Kind of an odd order considering @path gets introduced later, but it's
not that big a deal.

Reviewed-by: John Ferlan 

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCHv5 03/19] util: Refactor code for adding PID to the resource group

2018-10-09 Thread John Ferlan



On 10/9/18 6:30 AM, Wang Huaqiang wrote:
> The code of adding PID to the allocation could be reused, refactor it
> for later reusing.

reuse.

> 
> Signed-off-by: Wang Huaqiang 
> ---
>  src/util/virresctrl.c | 26 +-
>  1 file changed, 17 insertions(+), 9 deletions(-)
> 

Reviewed-by: John Ferlan 

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v4 2/5] libxl: add support for PVH

2018-10-09 Thread Marek Marczykowski-Górecki
On Tue, Oct 09, 2018 at 04:45:01PM -0600, Jim Fehlig wrote:
> On 10/7/18 9:39 AM, Marek Marczykowski-Górecki wrote:
> > @@ -647,6 +669,22 @@ libxlMakeDomBuildInfo(virDomainDefPtr def,
> >   return -1;
> >   }
> >   #endif
> > +} else if (pvh) {
> > +if (VIR_STRDUP(b_info->cmdline, def->os.cmdline) < 0)
> > +return -1;
> > +if (VIR_STRDUP(b_info->kernel, def->os.kernel) < 0)
> > +return -1;
> > +if (VIR_STRDUP(b_info->ramdisk, def->os.initrd) < 0)
> > +return -1;
> > +#ifdef LIBXL_HAVE_BUILDINFO_BOOTLOADER
> > +if (VIR_STRDUP(b_info->bootloader, def->os.bootloader) < 0)
> > +return -1;
> > +if (def->os.bootloaderArgs) {
> > +if (!(b_info->bootloader_args =
> > +  virStringSplit(def->os.bootloaderArgs, " \t\n", 0)))
> > +return -1;
> > +}
> > +#endif
> 
> This is probably fine, but AFAIK no bootloaders currently support pvh.
> Juergen is working on support in grub2 but that seems to be a little ways
> off.

This is independent of grub2 (which could be set as "kernel"), the
"bootloader" option here is about a program run _in dom0_ to find the
kernel to load, instead of providing a path directly (see xl.cfg(5)).
Like pygrub. And while I haven't tested it with PVH, I see no reason why
it shouldn't work.

-- 
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH v4 2/5] libxl: add support for PVH

2018-10-09 Thread Jim Fehlig

On 10/7/18 9:39 AM, Marek Marczykowski-Górecki wrote:

Since this is something between PV and HVM, it makes sense to put the
setting in place where domain type is specified.
To enable it, use  It is
also included in capabilities.xml, for every supported HVM guest type - it
doesn't seems to be any other requirement (besides new enough Xen).

Signed-off-by: Marek Marczykowski-Górecki 
---
Changes in v2 proposed by Jim:
  - use new_arch_added var instead of i == nr_guest_archs for clarity
  - improve comment
  - adjust for now required Xen >= 4.6 (remove part for Xen < 4.5)

Changes in v3:
  - limit VIR_DOMAIN_OSTYPE_XEN -> VIR_DOMAIN_OSTYPE_LINUX conversion to
  Xen PV only
  - do not accept VIR_DOMAIN_OSTYPE_LINUX for PVH
  - fix reported capabilities for PVH - remove hostdev passthrough and
  video/graphics
  - use #ifdef LIBXL_DOMAIN_TYPE_PVH instead of hypervisor version to
  check for PVH support
  - compile fix for Xen < 4.9

Changes in v4:
  - revert to "i == nr_guest_archs-1" check
  - let configure check for LIBXL_DOMAIN_TYPE_PVH and use #ifdef
  HAVE_XEN_PVH then
  - fix comment about Xen version
---
  docs/formatcaps.html.in|  4 +--
  docs/schemas/domaincommon.rng  |  1 +-
  m4/virt-driver-libxl.m4|  3 ++-
  src/conf/domain_conf.c |  6 ++--
  src/libxl/libxl_capabilities.c | 38 ++-
  src/libxl/libxl_conf.c | 50 ++-
  src/libxl/libxl_driver.c   |  6 ++--
  7 files changed, 90 insertions(+), 18 deletions(-)

diff --git a/docs/formatcaps.html.in b/docs/formatcaps.html.in
index 0d9c53d..fe113bc 100644
--- a/docs/formatcaps.html.in
+++ b/docs/formatcaps.html.in
@@ -104,8 +104,8 @@
  machineMachine type, for use in
machine
attribute of os/type element in domain XML. For example Xen
-  supports xenfv for HVM or xenpv for
-  PV.
+  supports xenfv for HVM, xenpv for
+  PV, or xenpvh for PVH.
  domainThe type attribute of
this element specifies the type of hypervisor required to run 
the
domain. Use in type
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 099a949..820a7c6 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -341,6 +341,7 @@

  xenpv
  xenfv
+xenpvh

  

diff --git a/m4/virt-driver-libxl.m4 b/m4/virt-driver-libxl.m4
index 479d911..2cd97cc 100644
--- a/m4/virt-driver-libxl.m4
+++ b/m4/virt-driver-libxl.m4
@@ -75,6 +75,9 @@ AC_DEFUN([LIBVIRT_DRIVER_CHECK_LIBXL], [
  ])
fi
  
+  dnl Check if Xen has support for PVH

+  AC_CHECK_DECL(LIBXL_DOMAIN_TYPE_PVH, [AC_DEFINE([HAVE_XEN_PVH], [1], [Define to 1 
if Xen has PVH support.])], [], [#include ])
+
AC_SUBST([LIBXL_CFLAGS])
AC_SUBST([LIBXL_LIBS])
  ])
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..a945ad4 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -19094,7 +19094,8 @@ virDomainDefParseCaps(virDomainDefPtr def,
   * be 'xen'. So we accept the former and convert
   */
  if (def->os.type == VIR_DOMAIN_OSTYPE_LINUX &&
-def->virtType == VIR_DOMAIN_VIRT_XEN) {
+def->virtType == VIR_DOMAIN_VIRT_XEN &&
+(!def->os.machine || STREQ(def->os.machine, "xenpv"))) {
  def->os.type = VIR_DOMAIN_OSTYPE_XEN;
  }
  
@@ -27705,7 +27706,8 @@ virDomainDefFormatInternal(virDomainDefPtr def,

   * be 'xen'. So we convert to the former for backcompat
   */
  if (def->virtType == VIR_DOMAIN_VIRT_XEN &&
-def->os.type == VIR_DOMAIN_OSTYPE_XEN)
+def->os.type == VIR_DOMAIN_OSTYPE_XEN &&
+(!def->os.machine || STREQ(def->os.machine, "xenpv")))
  virBufferAsprintf(buf, ">%s\n",
virDomainOSTypeToString(VIR_DOMAIN_OSTYPE_LINUX));
  else


I'd still like to hear what others think of extending the hack. mprivozn? You 
often have an interesting opinion :-).



diff --git a/src/libxl/libxl_capabilities.c b/src/libxl/libxl_capabilities.c
index 18596c7..eecd5e9 100644
--- a/src/libxl/libxl_capabilities.c
+++ b/src/libxl/libxl_capabilities.c
@@ -57,6 +57,7 @@ struct guest_arch {
  virArch arch;
  int bits;
  int hvm;
+int pvh;
  int pae;
  int nonpae;
  int ia64_be;
@@ -491,13 +492,33 @@ libxlCapsInitGuests(libxl_ctx *ctx, virCapsPtr caps)
  guest_archs[i].nonpae = nonpae;
  if (ia64_be)
  guest_archs[i].ia64_be = ia64_be;
+
+/*
+ * Xen 4.10 introduced support for the PVH guest type, which
+ * requires hardware virtualization support similar to the
+ * HVM guest type. Add a PVH guest type for each new HVM
+ * guest type.
+ */
+#ifdef HAVE_XEN_PVH
+if (hvm && i == nr_guest_archs-1) 

Re: [libvirt] [PATCHv5 01/19] docs: Refactor schemas to support default allocation

2018-10-09 Thread John Ferlan


s/docs/conf,util/

It's more than just docs...

On 10/9/18 6:30 AM, Wang Huaqiang wrote:
> The resctrl default allocation is introduced in this patch,
> which refers to the root directory (/sys/fs/resctrl) and
> immediately be created after mounting, owns all the tasks
> and cpus in the system and can make full use of all resources.
> 
> It does not intentionally allocate any dedicated amount
> of resource, either cache or memory bandwidth, for default
> allocation.
> 
> If a system task has no resource control applied but you
> want to know task's cache or memroy bandwidth utilization
> information, the default allocation is meaningful. We create
> resctrl monitor under the default allocation for such kind of
> task.
> 
> Refactoring schemas docs and APIs to create a default
> cache allocation by allowing the appearance of an
>  with no  element.
> 
> Signed-off-by: Wang Huaqiang 
> ---
>  docs/formatdomain.html.in |  4 ++--
>  docs/schemas/domaincommon.rng |  4 ++--
>  src/conf/domain_conf.c| 32 +++-
>  src/util/virresctrl.c | 27 +++
>  4 files changed, 50 insertions(+), 17 deletions(-)

How would this look in XML output - supply a  w/o the 
element? Probably also need the  there as well in at least one
entry just prove it works.

It seems  entries have their own unique "back door" of sorts
calling virResctrlAllocNew when there are no  entries.
Similar to what happens if there were entries cachetune for vcpus of
"0-1" and "2-5", but nothing for "6-7". The assumption always being
 read after  and as long as there's no overlap,
just create the  entry, by supplying a  entry
without  entries.

It's a little awkward to read, but now makes me think about all the
existing/strange linkages. In a way I suppose having no 
entries is proven to work by tests/genericxml2xmlindata/memorytune.xml.
The reality is they get created, but without visibility.

> 
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 8189959..b1651e3 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -943,8 +943,8 @@
>  
>cache
>
> -This element controls the allocation of CPU cache and has the
> -following attributes:
> +This optional element controls the allocation of CPU cache and 
> has
> +the following attributes:
>  
>level
>
> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
> index 099a949..5c533d6 100644
> --- a/docs/schemas/domaincommon.rng
> +++ b/docs/schemas/domaincommon.rng
> @@ -956,7 +956,7 @@
>  
>
>  
> -
> +
>
>  
>
> @@ -980,7 +980,7 @@
>
>  
>
> -
> +
>
>  
>  
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 9911d56..b77680e 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -19002,22 +19002,27 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
>  goto cleanup;
>  }
>  
> -if (virDomainResctrlVcpuMatch(def, vcpus, ) < 0)
> -goto cleanup;
> -
> -if (!alloc) {
> -alloc = virResctrlAllocNew();
> -if (!alloc)
> +/* If 'n' equals 0, then no  element found in ,
> + * this means it is a default alloction. For default allocation,

s/alloction/allocation

> + * @SetvirDomainResctrlDefPtr->alloc is set to NULL */

Not sure what ^^ this was...

> +if (n != 0) {
> +if (virDomainResctrlVcpuMatch(def, vcpus, ) < 0)
>  goto cleanup;
> -} else {
> -virReportError(VIR_ERR_XML_ERROR, "%s",
> -   _("Identical vcpus in cachetunes found"));
> -goto cleanup;
> -}

diff is perhaps easier to read if you:

if (n == 0) {
ret = 0;
goto cleanup;
}

then none of the indent is needed for n != 0

>  
> -for (i = 0; i < n; i++) {
> -if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
> +if (!alloc) {
> +alloc = virResctrlAllocNew();
> +if (!alloc)
> +goto cleanup;
> +} else {
> +virReportError(VIR_ERR_XML_ERROR, "%s",
> +   _("Identical vcpus in cachetunes found"));
>  goto cleanup;
> +}
> +
> +for (i = 0; i < n; i++) {
> +if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
> +goto cleanup;
> +}
>  }
>  
>  if (virResctrlAllocIsEmpty(alloc)) {
> @@ -19027,6 +19032,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
>  
>  if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
>  goto cleanup;
> +

Superfluous

>  vcpus = NULL;
>  alloc = NULL;
>  
> 

Re: [libvirt] [Qemu-devel] [PATCH v2 0/3] HMP/snapshot changes - do not use ID anymore

2018-10-09 Thread Markus Armbruster
Cc: libvir-list for review of the compatibility argument below.

Daniel Henrique Barboza  writes:

> Hey David,
>
> On 9/21/18 9:29 AM, Dr. David Alan Gilbert wrote:
>> * Daniel Henrique Barboza (danielhb...@gmail.com) wrote:
>>> changes in v2:
>>> - removed the "RFC" marker;
>>> - added a new patch (patch 2) that removes
>>> bdrv_snapshot_delete_by_id_or_name from the code;
>>> - made changes in patch 1 as suggested by Murilo;
>>> - previous patch set link:
>>> https://lists.gnu.org/archive/html/qemu-devel/2018-08/msg04658.html
>>>
>>>
>>> It is not uncommon to see bugs being opened by testers that attempt to
>>> create VM snapshots using HMP. It turns out that "0" and "1" are quite
>>> common snapshot names and they trigger a lot of bugs. I gave an example
>>> in the commit message of patch 1, but to sum up here: QEMU treats the
>>> input of savevm/loadvm/delvm sometimes as 'ID', sometimes as 'name'. It
>>> is documented as such, but this can lead to strange situations.
>>>
>>> Given that it is strange for an API to consider a parameter to be 2 fields
>>> at the same time, and inadvently treating them as one or the other, and
>>> that removing the ID field is too drastic, my idea here is to keep the
>>> ID field for internal control, but do not let the user set it.
>>>
>>> I guess there's room for discussion about considering this change an API
>>> change or not. It doesn't affect users of HMP and it doesn't affect Libvirt,
>>> but simplifying the meaning of the parameters of savevm/loadvm/delvm.
>> Can you clarify a couple of things:
>>a) What is it about libvirt's use that means it's OK ? Does it never
>> use numeric tags?
>
> I am glad you asked because my understanding in how Libvirt was dealing
> with snapshots was wrong, and I just looked into it further to answer your
> question. Luckily, this series fixes the situation for Libvirt as well.
>
> I was misled by the fact that Libvirt does not show the same symptoms
> we see in
> QEMU of this problem, but the bug is there. Here's a quick test with
> Libvirt with
> "0" and "1" as snapshot names, considering a VM named with no snapshots,
> using QEMU 2.12 and Libvirt 4.0.0:
>
> - create the "0" snapshot:
>
> $ sudo virsh snapshot-create-as --name 0 dhb
> Domain snapshot 0 created
>
> $ sudo virsh snapshot-list dhb
> Name Creation Time State
> 
> 0 2018-09-24 15:47:56 -0400 running
>
> $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> List of snapshots present on all disks:
> ID TAG VM SIZE DATE VM CLOCK
> --    0  405M 2018-09-24 15:47:56 00:04:20.867
>
>
> - created the "1" snapshot. Here, Libvirt shows both snapshots with
> snapshot-list,
> but the snapshot was erased inside QEMU:
>
> $ sudo virsh snapshot-create-as --name 1 dhb
> Domain snapshot 1 created
> $
> $ sudo virsh snapshot-list dhb
> Name Creation Time State
> 
> 0 2018-09-24 15:47:56 -0400 running
> 1 2018-09-24 15:50:09 -0400 running
>
> $ sudo virsh qemu-monitor-command dhb --hmp info snapshots
> List of snapshots present on all disks:
> ID TAG VM SIZE DATE VM CLOCK
> --    1  404M 2018-09-24 15:50:10 00:05:36.226
>
>
> This is where I stopped checking out Libvirt at first, concluding
> wrongly that it
> was immune to the bug.
>
> Libvirt does not throw an error when trying to apply snapshot 0. In
> fact, it acts
> like everything went fine:
>
> $ sudo virsh snapshot-revert dhb 0
>
> $ echo $?
> 0

Is that a libvirt bug?

> Reverting back to snapshot "1" works as intended, restoring the VM
> state when it
> was created.
>
>
> (perhaps this is something we should let Libvirt be aware of ...)
>
>
>
> This series fixes this behavior because the snapshot 0 isn't erased
> from QEMU, allowing
> Libvirt to successfully restore it.
>
>
>>b) After this series are you always guaranteed to be able to fix
>> any existing oddly named snapshots?
>
> The oddly named snapshots that already exists can be affected by the
> semantic
> change in loadvm and delvm, in a way that the user can't load/del
> using the snap
> ID, just the tag. Aside from that, I don't see any side effects with
> existing
> snapshots and this patch series.

Do all snapshots have a tag that is unique within their image?  Even
snapshots created by old versions of QEMU?

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCHv5 00/19] Introduce x86 Cache Monitoring Technology (CMT)

2018-10-09 Thread John Ferlan



On 10/9/18 6:30 AM, Wang Huaqiang wrote:
> This series of patches and the series already been merged introduce
> the x86 Cache Monitoring Technology (CMT) to libvirt by interacting
> with kernel resource control (resctrl) interface. CMT is one of the
> Intel(R) x86 CPU feature which belongs to the Resource Director
> Technology (RDT). CMT reports the occupancy of the last level cache,
> which is shared by all CPU cores.
> 
> In the v1 series, an original and complete feature for CMT was introduced
> The v2 and v3 patches address the feature for the host capability of CMT.
> v4 is addressing the feature for monitoring VM vcpu thread set cache
> occupancy and reporting it through a virsh command.
> 
> We have serval discussion about the enabling of CMT, please refer to
> following links for the RFCs.
> RFCv3
> https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html
> RFCv2
> https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html
> https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html
> RFCv1
> https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html
> 
> And the merged commits are list as below, for host capability of CMT.
> 6af8417415508c31f8ce71234b573b4999f35980
> 8f6887998bf63594ae26e3db18d4d5896c5f2cb4
> 58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8
> 12093f1feaf8f5023dcd9d65dff111022842183d
> a5d293c18831dcf69ec6195798387fbb70c9f461
> 
> 
> 1. About reason why CMT is necessary in libvirt?
> The perf events of 'CMT, MBML, MBMT' have been phased out since Linux
> kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, in libvirt
> the perf based cmt,mbm will not work with the latest linux kernel. These
> patches add CMT feature to libvirt through kernel resctrlfs interface.
> 
> 2 Create cache monitoring group (cache monitor).
> 
> The main interface for creating monitoring group is through XML file. The
> proposed configuration is like:
> 
>   
> 
>   
>   
> + 

Duplication of vcpus is odd for a child entry isn't it?  It's not in the
 entry...

> 
> 
> + 

... but perhaps that means using 4-6 is OK because it's a subset of the
parent cachetune 4-7?

I'm not sure I can keep track of all the discussions we've had about
this, so this could be something we've already covered, but has moved
out of my short term memory.

> 
>   
> 
> In above XML, created 2 cache resctrl allocation groups and 2 resctrl
> monitoring groups.
> The changes of cache monitor will be effective in next booting of VM.
> 
> 2 Show CMT result through command 'domstats'
> 
> Adding the interface in qemu to report this information for resource
> monitor group through command 'virsh domstats --cpu-total'.
> Below is a typical output:
> 
>  # virsh domstats 1 --cpu-total
>  Domain: 'ubuntu16.04-base'
>  ...
>cpu.cache.monitor.count=2
>cpu.cache.0.name=vcpus_1
>cpu.cache.0.vcpus=1
>cpu.cache.0.bank.count=2
>cpu.cache.0.bank.0.id=0
>cpu.cache.0.bank.0.bytes=4505600
>cpu.cache.0.bank.1.id=1
>cpu.cache.0.bank.1.bytes=5586944
>cpu.cache.1.name=vcpus_4-6

So perhaps "this" is more correct 4-6 (I assume this comes from the
 entryu...

>cpu.cache.1.vcpus=4,5,6

Interesting that a name can be 4-6, but these are each called out. Can
someone have "5,7,9"?  How does that look on the name line and then on
the vcpus line.

>cpu.cache.1.bank.count=2
>cpu.cache.1.bank.0.id=0
>cpu.cache.1.bank.0.bytes=17571840
>cpu.cache.1.bank.1.id=1
>cpu.cache.1.bank.1.bytes=29106176

Obviously a different example than above with only 1  entry...
and the .bytes values for everything doesn't match up with the kb values
above.

> 
> 
> Changes in v5:
> - qemu: Setting up vcpu and adding pids to resctrl monitor groups during
> re-connection.
> - Add the document for domain configuration related to resctrl monitor.
> 

Probably should have posted a reply to your v4 series to indicate you
were working on a v5 due to whatever reason so that no one started
reviewing it...

It takes a "long time" to set aside the time to review large series...

Also, while it may pass your compiler, the patch18 needed:

-unsigned int nmonitors = NULL;
+unsigned int nmonitors = 0;

Something I thought I had pointed out in much earlier reviews...

I'll work through the series over the next day or so with any luck...
It is on my short term radar at least.

John

> Changes in v4:
> v4 is addressing the feature for monitoring VM vcpu
> thread set cache occupancy and reporting it through a
> virsh command.
> - Introduced resctrl default allocation
> - Introduced resctrl monitor and default monitor
> 
> Changes in v3:
> - Addressed John Ferlan's review.
> - Typo fixed.
> - Removed VIR_ENUM_DECL(virMonitor);
> 
> Changes in v2:
> - Introduced MBM capability.
> - Capability layout changed
> * Moved  from cahe  to 
> * Renamed  to 
> 

Re: [libvirt] [ocaml] reset and resync the libvirt-ocaml repository

2018-10-09 Thread Daniel P . Berrangé
On Thu, Sep 06, 2018 at 05:13:23PM +0200, Pino Toscano wrote:
> Hi,
> 
> for reasons mostly lost in the history, after the libvirt-ocaml
> repository was converted to git, it was not used by its main author
> (Rich Jones); the development continued on Rich's git, at
>   http://git.annexia.org/?p=ocaml-libvirt.git;a=summary
> 
> After a talk with Rich, we agreed that it was better to move the
> development back to libvirt.org, just like all the other bindings.
> There are two problems however:
> 
> 1) the first 38 commits have an bad author/committer date, and this is
>also the reason why the existing libvirt-ocaml is not mirrored on
>github
> 
> 2) the top 3 commits on libvirt-ocaml were not integrated back to
>Rich's ocaml-libvirt, and maybe their content might not be totally
>OK (I will let Rich comment more on this)
> 
> While rewriting history is bad,
> - most probably there are not many users of libvirt-ocaml around,
> - the repository itself is very small (< 500k),
> - in general it will better to have a working repository

I agree with all those points

> So what I'm proposing is to replace the libvirt-ocaml repository with
> a fixed version of Rich's ocaml-libvirt, and directly on the git hosting
> side (i.e. not using force-push on the current one).  Rich has already
> commit access for libvirt, so there are no problems to keep his
> maintainer role on it.  Once done, we can notify users in this list
> about it.
> 
> What do you think? Is it an acceptable path forward?

This sounds fine to me, and will let us fix the mirroring too.


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC v2] virfile: fix cast-align error

2018-10-09 Thread Daniel P . Berrangé
On Tue, Oct 09, 2018 at 05:55:29PM +0200, Marc Hartmayer wrote:
> On s390x the struct member f_type of statsfs is hard coded to
> 'unsigned int'. So let's try to determine the type by means of the GNU
> extension typeof.
> 
> This fixes the following error:
> ../../src/util/virfile.c:3578:38: error: cast increases required alignment of 
> target type [-Werror=cast-align]
>  virFileIsSharedFixFUSE(path, (long *) _type);
> 
> Signed-off-by: Marc Hartmayer 
> ---
>  src/util/virfile.c | 10 --
>  1 file changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/src/util/virfile.c b/src/util/virfile.c
> index 2a7e87102a25..6cde4ab6c23b 100644
> --- a/src/util/virfile.c
> +++ b/src/util/virfile.c
> @@ -3464,9 +3464,15 @@ int virFilePrintf(FILE *fp, const char *msg, ...)
>  
>  # define PROC_MOUNTS "/proc/mounts"
>  
> +# ifdef __GNUC__
> +typedef typeof(((struct statfs*)0)->f_type) f_type_type;
> +# else
> +typedef long f_type_type;
> +# endif

These games are rather overkill IMHO.

> +
>  static int
>  virFileIsSharedFixFUSE(const char *path,
> -   long *f_type)
> +   f_type_type *f_type)

Make this  'long long'

>  {
>  char *dirpath = NULL;
>  const char **mounts = NULL;
> @@ -3575,7 +3581,7 @@ virFileIsSharedFSType(const char *path,

Add a local variable

   long long f_type;

and then after the stat() call

   f_type = sb.f_type;

and change everything that follows to use the local variable

>  if (sb.f_type == FUSE_SUPER_MAGIC) {
>  VIR_DEBUG("Found FUSE mount for path=%s. Trying to fix it", path);
> -virFileIsSharedFixFUSE(path, (long *) _type);
> +virFileIsSharedFixFUSE(path, _type);
>  }
>  
>  VIR_DEBUG("Check if path %s with FS magic %lld is shared",

This also removes the need to cast to long long in this VIR_DEBUG
call


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [RFC v2] virfile: fix cast-align error

2018-10-09 Thread Marc Hartmayer
On s390x the struct member f_type of statsfs is hard coded to
'unsigned int'. So let's try to determine the type by means of the GNU
extension typeof.

This fixes the following error:
../../src/util/virfile.c:3578:38: error: cast increases required alignment of 
target type [-Werror=cast-align]
 virFileIsSharedFixFUSE(path, (long *) _type);

Signed-off-by: Marc Hartmayer 
---
 src/util/virfile.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index 2a7e87102a25..6cde4ab6c23b 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -3464,9 +3464,15 @@ int virFilePrintf(FILE *fp, const char *msg, ...)
 
 # define PROC_MOUNTS "/proc/mounts"
 
+# ifdef __GNUC__
+typedef typeof(((struct statfs*)0)->f_type) f_type_type;
+# else
+typedef long f_type_type;
+# endif
+
 static int
 virFileIsSharedFixFUSE(const char *path,
-   long *f_type)
+   f_type_type *f_type)
 {
 char *dirpath = NULL;
 const char **mounts = NULL;
@@ -3575,7 +3581,7 @@ virFileIsSharedFSType(const char *path,
 
 if (sb.f_type == FUSE_SUPER_MAGIC) {
 VIR_DEBUG("Found FUSE mount for path=%s. Trying to fix it", path);
-virFileIsSharedFixFUSE(path, (long *) _type);
+virFileIsSharedFixFUSE(path, _type);
 }
 
 VIR_DEBUG("Check if path %s with FS magic %lld is shared",
-- 
2.17.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] The results of lspci are inconsistent between vfio reset pci devices and reset devices by sysfs interafce

2018-10-09 Thread Alex Williamson
On Tue, 9 Oct 2018 12:11:29 +
"Wuzongyong (Euler Dept)"  wrote:

> Hi,
> 
> I start a virtual machine with commandline:
> /usr/libexec/qemu-kvm --enable-kvm -smp 8 -m 8192 -device 
> vfio-pci,host=:81:00.0
> 
> Then I pause the qemu process before executing the main_loop function by gdb.
> At this moment, lspci shows the regions are disabled like below:
> 81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] 
> (rev a1)
> Subsystem: NVIDIA Corporation Device 118f
> Physical Slot: 0-6
> Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
> Stepping- SERR- FastB2B- DisINTx+
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> SERR-  Interrupt: pin A routed to IRQ 35
> NUMA node: 1
> Region 0: Memory at c800 (32-bit, non-prefetchable) [disabled] 
> [size=16M]
> Region 1: Memory at 278 (64-bit, prefetchable) [disabled] 
> [size=16G]
> Region 3: Memory at 27c (64-bit, prefetchable) [disabled] 
> [size=32M]
> 
> But after the command:
> echo 1 > /sys/bus/pci/devices/:81:00.0/reset
> lspci shows the regions are *not* disabled:
> 81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] 
> (rev a1)
> Subsystem: Huawei Technologies Co., Ltd. Device 2061
> Physical Slot: 0-6
> Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ 
> Stepping- SERR+ FastB2B- DisINTx-
> Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
> SERR-  Latency: 0, Cache Line Size: 32 bytes
> Interrupt: pin A routed to IRQ 7
> NUMA node: 1
> Region 0: Memory at c800 (32-bit, non-prefetchable) [size=16M]
> Region 1: Memory at 278 (64-bit, prefetchable) [size=16G]
> Region 3: Memory at 27c (64-bit, prefetchable) [size=32M]
> 
> AFAIK, qemu performs vfio_pci_reset like the below callstack:
> Qemu:
> vfio_pci_reset
> ioctl(vdev->vbasedev.fd, VFIO_DEVICE_RESET)
> Kernel:
> vfio_pci_ioctl
> pci_try_reset_function
> __pci_reset_function_locked
> pci_parent_bus_reset
> pci_reset_bridge_secondary_bus
> 
> and write 1 to the reset interface of sysfs go through the path:
> Kernel:
> reset_store
> pci_reset_function
> __pci_reset_function_locked
> pci_parent_bus_reset
> pci_reset_bridge_secondary_bus
> 
> So seem that these two methods are same actually, I am confused why the 
> results are inconsistent.

Maybe there's a misunderstanding here, the kernel PCI reset functions
save and restore config space around the reset.  The intention of the
reset is to re-init the internal state of the device while preserving
(via save+restore) the config space.  The BARs being disabled is simply
a matter of the Memory bit in the Command register being unset (note
Mem-).  Whether this is indicative of some issue depends on whether the
state before reset matches the state after reset, not that the states
after two different paths of triggering a reset are identical.

vfio-pci will hand off the device to the user (QEMU) disabled, so the
states in the first example make sense to me.  In the second case, it's
not clear what the starting state is for the device.  Was this reset
performed from the starting point of the first case or is the device in
some arbitrary, unknown state prior to reset?  Thanks,

Alex

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 1/2] qemu: Allow coldplugging of hub device

2018-10-09 Thread Han Han
https://bugzilla.redhat.com/show_bug.cgi?id=1375423

Signed-off-by: Han Han 
---
 src/qemu/qemu_driver.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index a52e2495d5..130ce4cbd6 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -8133,6 +8133,11 @@ qemuDomainAttachDeviceConfig(virDomainDefPtr vmdef,
 return -1;
 break;
 
+case VIR_DOMAIN_DEVICE_HUB:
+if (VIR_APPEND_ELEMENT(vmdef->hubs, vmdef->nhubs, dev->data.hub) < 0)
+return -1;
+break;
+
 case VIR_DOMAIN_DEVICE_VSOCK:
 if (vmdef->vsock) {
 virReportError(VIR_ERR_OPERATION_INVALID, "%s",
@@ -8145,7 +8150,6 @@ qemuDomainAttachDeviceConfig(virDomainDefPtr vmdef,
 case VIR_DOMAIN_DEVICE_SOUND:
 case VIR_DOMAIN_DEVICE_VIDEO:
 case VIR_DOMAIN_DEVICE_GRAPHICS:
-case VIR_DOMAIN_DEVICE_HUB:
 case VIR_DOMAIN_DEVICE_SMARTCARD:
 case VIR_DOMAIN_DEVICE_MEMBALLOON:
 case VIR_DOMAIN_DEVICE_NVRAM:
-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 2/2] qemu: Allow coldunplugging of hub device

2018-10-09 Thread Han Han
https://bugzilla.redhat.com/show_bug.cgi?id=1375423

Signed-off-by: Han Han 
---
 src/conf/domain_conf.c   | 30 ++
 src/conf/domain_conf.h   |  3 +++
 src/libvirt_private.syms |  1 +
 src/qemu/qemu_driver.c   | 10 +-
 4 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56130..cfca5daa02 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -17624,6 +17624,36 @@ virDomainVsockDefEquals(const virDomainVsockDef *a,
 }
 
 
+static bool
+virDomainHubDefEquals(const virDomainHubDef *a,
+  const virDomainHubDef *b)
+{
+if (a->type != b->type)
+return false;
+
+if (a->info.type != VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE &&
+!virDomainDeviceInfoAddressIsEqual(>info, >info))
+return false;
+
+return true;
+}
+
+
+ssize_t
+virDomainHubDefFind(const virDomainDef *def,
+const virDomainHubDef *hub)
+{
+size_t i;
+
+for (i = 0; i < def->nhubs; i++) {
+if (virDomainHubDefEquals(hub, def->hubs[i]))
+return i;
+}
+
+return -1;
+}
+
+
 char *
 virDomainDefGetDefaultEmulator(virDomainDefPtr def,
virCapsPtr caps)
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index e30a4b2fe7..c2d0877170 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -3365,6 +3365,9 @@ virDomainShmemDefPtr 
virDomainShmemDefRemove(virDomainDefPtr def, size_t idx)
 ssize_t virDomainInputDefFind(const virDomainDef *def,
   const virDomainInputDef *input)
 ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK;
+ssize_t virDomainHubDefFind(const virDomainDef *def,
+  const virDomainHubDef *hub)
+ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK;
 bool virDomainVsockDefEquals(const virDomainVsockDef *a,
  const virDomainVsockDef *b)
 ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_RETURN_CHECK;
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 335210c31d..6245927673 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -399,6 +399,7 @@ virDomainHostdevRemove;
 virDomainHostdevSubsysPCIBackendTypeToString;
 virDomainHostdevSubsysTypeToString;
 virDomainHPTResizingTypeToString;
+virDomainHubDefFind;
 virDomainHubTypeFromString;
 virDomainHubTypeToString;
 virDomainHypervTypeFromString;
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 130ce4cbd6..92a81ff1f5 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -8338,10 +8338,18 @@ qemuDomainDetachDeviceConfig(virDomainDefPtr vmdef,
 vmdef->vsock = NULL;
 break;
 
+case VIR_DOMAIN_DEVICE_HUB:
+if ((idx = virDomainHubDefFind(vmdef, dev->data.hub)) < 0) {
+virReportError(VIR_ERR_DEVICE_MISSING, "%s",
+   _("matching hub device not found"));
+return -1;
+}
+VIR_DELETE_ELEMENT(vmdef->hubs, idx, vmdef->nhubs);
+break;
+
 case VIR_DOMAIN_DEVICE_SOUND:
 case VIR_DOMAIN_DEVICE_VIDEO:
 case VIR_DOMAIN_DEVICE_GRAPHICS:
-case VIR_DOMAIN_DEVICE_HUB:
 case VIR_DOMAIN_DEVICE_SMARTCARD:
 case VIR_DOMAIN_DEVICE_MEMBALLOON:
 case VIR_DOMAIN_DEVICE_NVRAM:
-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 0/2] Allow coldplug and coldunplug of hub device

2018-10-09 Thread Han Han


Han Han (2):
  qemu: Allow coldplugging of hub device
  qemu: Allow coldunplugging of hub device

 src/conf/domain_conf.c   | 30 ++
 src/conf/domain_conf.h   |  3 +++
 src/libvirt_private.syms |  1 +
 src/qemu/qemu_driver.c   | 16 ++--
 4 files changed, 48 insertions(+), 2 deletions(-)

-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Overview of libvirt incremental backup API, part 2 (incremental/differential pull mode)

2018-10-09 Thread Eric Blake

On 10/9/18 8:29 AM, Nir Soffer wrote:

On Fri, Oct 5, 2018 at 7:58 AM Eric Blake  wrote:


On 10/4/18 12:05 AM, Eric Blake wrote:

The following (long) email describes a portion of the work-flow of how
my proposed incremental backup APIs will work, along with the backend
QMP commands that each one executes.  I will reply to this thread with
further examples (the first example is long enough to be its own email).
This is an update to a thread last posted here:
https://www.redhat.com/archives/libvir-list/2018-June/msg01066.html




More to come in part 2.



- Second example: a sequence of incremental backups via pull model

In the first example, we did not create a checkpoint at the time of the
full pull. That means we have no way to track a delta of changes since
that point in time.



Why do we want to support backup without creating a checkpoint?


Fleecing. If you want to examine a portion of the disk at a given point 
in time, then kicking off a pull model backup gives you access to the 
state of the disk at that time, and your actions are transient.  Ending 
the job when you are done with the fleece cleans up everything needed to 
perform the fleece operation, and since you did not intend to capture a 
full (well, a complete) incremental backup, but were rather grabbing 
just a subset of the disk, you really don't want that point in time to 
be recorded as a new checkpoint.


Also, incremental backups (which are what require checkpoints) are 
limited to qcow2 disks, but full backups can be performed on any format 
(including raw disks).  If you have a guest that does not use qcow2 
disks, you can perform a full backup, but cannot create a checkpoint.




If we don't have any real use case, I suggest to always require a
checkpoint.


But we do have real cases for backup without checkpoint.





Let's repeat the full backup (reusing the same
backup.xml from before), but this time, we'll add a new parameter, a
second XML file for describing the checkpoint we want to create.

Actually, it was easy enough to get virsh to write the XML for me
(because it was very similar to existing code in virsh that creates XML
for snapshot creation):

$ $virsh checkpoint-create-as --print-xml $dom check1 testing \
 --diskspec sdc --diskspec sdd | tee check1.xml

check1



We should use an id, not a name, even of name is name is also unique like
in most libvirt apis.

In RHV we will use always use a UUID for this.


Nothing prevents you from using a UUID as your name. But this particular 
choice of XML () matches what already exists in the snapshot XML.






testing

  
  



I had to supply two --diskspec arguments to virsh to select just the two
qcow2 disks that I am using in my example (rather than every disk in the
domain, which is the default when  is not present).



So  is valid configuration, selecting all disks, or not having
"disks" element
selects all disks?


It's about a one-line change to get whichever behavior you find more 
useful. Right now, I'm leaning towards:  omitted == backup all 
disks,  present: you MUST have at least one  subelement 
that explicitly requests a checkpoint (because any omitted  when 
 is present is skipped). A checkpoint only makes sense as long as 
there is at least one disk to create a checkpoint with.


But I could also go with:  omitted == backup all disks,  
present but  subelements missing: the missing elements default to 
being backed up, and you have to explicitly provide checkpoint='no'> to skip a particular disk.


Or even:  omitted, or  present but  subelements 
missing: the missing elements defer to the hypervisor for their default 
state, and the qemu hypervisor defaults to qcow2 disks being backed 
up/checkpointed and to non-qcow2 disks being omitted.  But this latter 
one feels like more magic, which is harder to document and liable to go 
wrong.


A stricter version would be  is mandatory, and no  
subelement can be missing (or else the API fails because you weren't 
explicit in your choice). But that's rather strict, especially since 
existing snapshots XML handling is not that strict.






I also picked
a name (mandatory) and description (optional) to be associated with the
checkpoint.

The backup.xml file that we plan to reuse still mentions scratch1.img
and scratch2.img as files needed for staging the pull request. However,
any contents in those files could interfere with our second backup
(after all, every cluster written into that file from the first backup
represents a point in time that was frozen at the first backup; but our
second backup will want to read the data as the guest sees it now rather
than what it was at the first backup), so we MUST regenerate the scratch
files. (Perhaps I should have just deleted them at the end of example 1
in my previous email, had I remembered when typing that mail).

$ $qemu_img create -f qcow2 -b $orig1 -F qcow2 scratch1.img
$ $qemu_img create -f qcow2 -b $orig2 -F qcow2 scratch2.img

Now, 

Re: [libvirt] [RFC] virfile: fix cast-align error

2018-10-09 Thread Michal Privoznik
On 10/09/2018 04:39 PM, Bjoern Walk wrote:
> Michal Privoznik  [2018-10-09, 03:45PM +0200]:
>> On 10/08/2018 08:41 PM, Marc Hartmayer wrote:
>>> Use the correct type in order to fix the following error on s390x:
>>>
>>> In function 'virFileIsSharedFSType':
>>> ../../src/util/virfile.c:3578:38: error: cast increases required alignment 
>>> of target type [-Werror=cast-align]
>>>  virFileIsSharedFixFUSE(path, (long *) _type);
>>>
>>> Signed-off-by: Marc Hartmayer 
>>> ---
>>>  src/util/virfile.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/src/util/virfile.c b/src/util/virfile.c
>>> index 2a7e87102a25..832d832696d5 100644
>>> --- a/src/util/virfile.c
>>> +++ b/src/util/virfile.c
>>> @@ -3466,7 +3466,7 @@ int virFilePrintf(FILE *fp, const char *msg, ...)
>>>  
>>>  static int
>>>  virFileIsSharedFixFUSE(const char *path,
>>> -   long *f_type)
>>> +   unsigned int *f_type)
>>>  {
>>>  char *dirpath = NULL;
>>>  const char **mounts = NULL;
>>> @@ -3575,7 +3575,7 @@ virFileIsSharedFSType(const char *path,
>>>  
>>>  if (sb.f_type == FUSE_SUPER_MAGIC) {
>>>  VIR_DEBUG("Found FUSE mount for path=%s. Trying to fix it", path);
>>> -virFileIsSharedFixFUSE(path, (long *) _type);
>>> +virFileIsSharedFixFUSE(path, _type);
>>
>> This won't fly on x86_64 where f_type is long. I think we can use
>> __fsword_t directly.
>>
>>>  }
>>>  
>>>  VIR_DEBUG("Check if path %s with FS magic %lld is shared",
>>>
>>
>> Does that work for you?
> 
> Hmm, __fsword_t is still a long int but the f_type member is defined as
> unsigned int in the userspace header.

What header file are you looking at? From /usr/include/bits/statfs.h:

struct statfs
  {
__fsword_t f_type;
__fsword_t f_bsize;

/* . */

__fsid_t f_fsid;
__fsword_t f_namelen;
__fsword_t f_frsize;
__fsword_t f_flags;
__fsword_t f_spare[4];
  };


> Also, I am against exposing any
> internal data types.

It's not internal if it is exposed in a public header file.

> 
> virFileIsSharedFixFUSE could just return the value, the return value
> right now is not checked anyways.

Sure, but since you claim the f_type member is an unsigned type, then we
would hit the sign warning anyway.
Also, not sure. If virFileIsSharedFixFUSE() is unable to tell the
filesystem it should not touch the f_type at all.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [dbus PATCH] maint: fix compiler errors for FreeBSD

2018-10-09 Thread Katerina Koukiou
We were facing such errors:
* error: unused function 'glib_slistautoptr_cleanup_virtDBusConnect

Moving the G_DEFINE_AUTOPTR_CLEANUP_FUNC macros in the header files
fixes the issue.

Signed-off-by: Katerina Koukiou 
---
 src/connect.c | 3 +--
 src/connect.h | 5 +
 src/network.c | 4 +---
 src/network.h | 6 ++
 4 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/src/connect.c b/src/connect.c
index f0da1dd..f8f76a2 100644
--- a/src/connect.c
+++ b/src/connect.c
@@ -1995,7 +1995,7 @@ static virtDBusGDBusMethodTable 
virtDBusConnectMethodTable[] = {
 
 static GDBusInterfaceInfo *interfaceInfo = NULL;
 
-static void
+void
 virtDBusConnectFree(virtDBusConnect *connect)
 {
 if (connect->connection)
@@ -2011,7 +2011,6 @@ virtDBusConnectFree(virtDBusConnect *connect)
 g_free(connect->storageVolPath);
 g_free(connect);
 }
-G_DEFINE_AUTOPTR_CLEANUP_FUNC(virtDBusConnect, virtDBusConnectFree);
 
 void
 virtDBusConnectNew(virtDBusConnect **connectp,
diff --git a/src/connect.h b/src/connect.h
index 961b115..829bd57 100644
--- a/src/connect.h
+++ b/src/connect.h
@@ -44,3 +44,8 @@ virtDBusConnectOpen(virtDBusConnect *connect,
 
 void
 virtDBusConnectListFree(virtDBusConnect **connectList);
+
+void
+virtDBusConnectFree(virtDBusConnect *connect);
+
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(virtDBusConnect, virtDBusConnectFree);
diff --git a/src/network.c b/src/network.c
index f3d5472..95ecdc7 100644
--- a/src/network.c
+++ b/src/network.c
@@ -3,7 +3,7 @@
 
 #include 
 
-static void
+void
 virtDBusNetworkDHCPLeaseListFree(virNetworkDHCPLeasePtr *leases)
 {
 for (gint i = 0; leases[i] != NULL; i++)
@@ -12,8 +12,6 @@ virtDBusNetworkDHCPLeaseListFree(virNetworkDHCPLeasePtr 
*leases)
 g_free(leases);
 }
 
-G_DEFINE_AUTOPTR_CLEANUP_FUNC(virNetworkDHCPLeasePtr, 
virtDBusNetworkDHCPLeaseListFree);
-
 static virNetworkPtr
 virtDBusNetworkGetVirNetwork(virtDBusConnect *connect,
  const gchar *objectPath,
diff --git a/src/network.h b/src/network.h
index fc53b28..0012585 100644
--- a/src/network.h
+++ b/src/network.h
@@ -1,9 +1,15 @@
 #pragma once
 
 #include "connect.h"
+#include 
 
 #define VIRT_DBUS_NETWORK_INTERFACE "org.libvirt.Network"
 
 void
 virtDBusNetworkRegister(virtDBusConnect *connect,
 GError **error);
+
+void
+virtDBusNetworkDHCPLeaseListFree(virNetworkDHCPLeasePtr *leases);
+
+G_DEFINE_AUTOPTR_CLEANUP_FUNC(virNetworkDHCPLeasePtr, 
virtDBusNetworkDHCPLeaseListFree);
-- 
2.17.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC] virfile: fix cast-align error

2018-10-09 Thread Bjoern Walk
Michal Privoznik  [2018-10-09, 03:45PM +0200]:
> On 10/08/2018 08:41 PM, Marc Hartmayer wrote:
> > Use the correct type in order to fix the following error on s390x:
> > 
> > In function 'virFileIsSharedFSType':
> > ../../src/util/virfile.c:3578:38: error: cast increases required alignment 
> > of target type [-Werror=cast-align]
> >  virFileIsSharedFixFUSE(path, (long *) _type);
> > 
> > Signed-off-by: Marc Hartmayer 
> > ---
> >  src/util/virfile.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/src/util/virfile.c b/src/util/virfile.c
> > index 2a7e87102a25..832d832696d5 100644
> > --- a/src/util/virfile.c
> > +++ b/src/util/virfile.c
> > @@ -3466,7 +3466,7 @@ int virFilePrintf(FILE *fp, const char *msg, ...)
> >  
> >  static int
> >  virFileIsSharedFixFUSE(const char *path,
> > -   long *f_type)
> > +   unsigned int *f_type)
> >  {
> >  char *dirpath = NULL;
> >  const char **mounts = NULL;
> > @@ -3575,7 +3575,7 @@ virFileIsSharedFSType(const char *path,
> >  
> >  if (sb.f_type == FUSE_SUPER_MAGIC) {
> >  VIR_DEBUG("Found FUSE mount for path=%s. Trying to fix it", path);
> > -virFileIsSharedFixFUSE(path, (long *) _type);
> > +virFileIsSharedFixFUSE(path, _type);
> 
> This won't fly on x86_64 where f_type is long. I think we can use
> __fsword_t directly.
> 
> >  }
> >  
> >  VIR_DEBUG("Check if path %s with FS magic %lld is shared",
> > 
> 
> Does that work for you?

Hmm, __fsword_t is still a long int but the f_type member is defined as
unsigned int in the userspace header. Also, I am against exposing any
internal data types.

virFileIsSharedFixFUSE could just return the value, the return value
right now is not checked anyways.

> 
> Michal
> 
> --
> libvir-list mailing list
> libvir-list@redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
> 

-- 
IBM Systems
Linux on Z & Virtualization Development
--
IBM Deutschland Research & Development GmbH
Schönaicher Str. 220, 71032 Böblingen
Phone: +49 7031 16 1819
--
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294 


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH 0/4] Couple of virFileIsShared* fixes

2018-10-09 Thread Michal Privoznik
This patch [1] got me looking into the code. I've found couple of issues
which I'm fixing and also I'm introducing couple of new test cases.

1: https://www.redhat.com/archives/libvir-list/2018-October/msg00513.html

Michal Prívozník (4):
  virfiletest: Fix test name prefix for virFileInData test
  virfiletst: Test virFileIsSharedFS
  virFileIsSharedFSType: Detect direct mount points
  virfile: Rework virFileIsSharedFixFUSE

 src/util/virfile.c|  66 ++-
 tests/Makefile.am |   7 +-
 tests/virfiledata/mounts3.txt |  35 
 tests/virfilemock.c   | 154 ++
 tests/virfiletest.c   |  64 +-
 5 files changed, 284 insertions(+), 42 deletions(-)
 create mode 100644 tests/virfiledata/mounts3.txt
 create mode 100644 tests/virfilemock.c

-- 
2.18.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH 1/4] virfiletest: Fix test name prefix for virFileInData test

2018-10-09 Thread Michal Privoznik
Because of lacking virTestCounterReset() call, the old test cases
name was preserved.

Signed-off-by: Michal Privoznik 
---
 tests/virfiletest.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/virfiletest.c b/tests/virfiletest.c
index efe15485eb..19dd06e49e 100644
--- a/tests/virfiletest.c
+++ b/tests/virfiletest.c
@@ -388,6 +388,7 @@ mymain(void)
 } while (0)
 
 if (holesSupported()) {
+virTestCounterReset("testFileInData ");
 DO_TEST_IN_DATA(true, 4, 4, 4);
 DO_TEST_IN_DATA(false, 4, 4, 4);
 DO_TEST_IN_DATA(true, 8, 8, 8);
-- 
2.18.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 3/4] virFileIsSharedFSType: Detect direct mount points

2018-10-09 Thread Michal Privoznik
If the given path is already a mount point (e.g. a bind mount of
a file, or simply a direct mount point of a FS), then our code
fails to detect that because the first thing it does is cutting
off part after last slash '/'.

Signed-off-by: Michal Privoznik 
---
 src/util/virfile.c  | 8 
 tests/virfiletest.c | 3 +--
 2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index 2a7e87102a..666d703f99 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -3534,15 +3534,16 @@ virFileIsSharedFSType(const char *path,
   int fstypes)
 {
 VIR_AUTOFREE(char *) dirpath = NULL;
-char *p;
+char *p = NULL;
 struct statfs sb;
 int statfs_ret;
 
 if (VIR_STRDUP(dirpath, path) < 0)
 return -1;
 
-do {
+statfs_ret = statfs(dirpath, );
 
+while ((statfs_ret < 0) && (p != dirpath)) {
 /* Try less and less of the path until we get to a
  * directory we can stat. Even if we don't have 'x'
  * permission on any directory in the path on the NFS
@@ -3563,8 +3564,7 @@ virFileIsSharedFSType(const char *path,
 *p = '\0';
 
 statfs_ret = statfs(dirpath, );
-
-} while ((statfs_ret < 0) && (p != dirpath));
+}
 
 if (statfs_ret < 0) {
 virReportSystemError(errno,
diff --git a/tests/virfiletest.c b/tests/virfiletest.c
index 22c163f0a0..42d918aecc 100644
--- a/tests/virfiletest.c
+++ b/tests/virfiletest.c
@@ -453,8 +453,7 @@ mymain(void)
 DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts1.txt", "/boot/vmlinuz", false);
 DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts2.txt", 
"/run/user/501/gvfs/some/file", false);
 DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts3.txt", "/nfs/file", true);
-/* TODO Detect bind mounts */
-DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts3.txt", "/nfs/blah", true);
+DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts3.txt", "/nfs/blah", false);
 
 return ret != 0 ? EXIT_FAILURE : EXIT_SUCCESS;
 }
-- 
2.18.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 2/4] virfiletst: Test virFileIsSharedFS

2018-10-09 Thread Michal Privoznik
Introduce some basic test cases for virFileIsSharedFS(). More
will be added later. In order to achieve desired result, mocks
for setmntent() and statfs() need to be invented because the
first thing that virFileIsSharedFS() does is calling the latter.
If it finds a FUSE mount it'll call the former.

The mock might look a bit complicated, but in fact it's quite
simple. The test sets LIBVIRT_MTAB env variable to hold the
absolute path to a file containing mount table. Then, statfs()
returns matching FS it finds, and setmntent() is there just to
replace /proc/mounts with the file the test wants to load.

Adding this test also exposed a bug we have - because we assume
the given path points to a file we cut off what we assume is a
file name to obtain directory path and only then we call
statfs(). This is buggy because the passed path could be already
a mount point.

Signed-off-by: Michal Privoznik 
---
 tests/Makefile.am |   7 +-
 tests/virfiledata/mounts3.txt |  33 
 tests/virfilemock.c   | 154 ++
 tests/virfiletest.c   |  62 +-
 4 files changed, 254 insertions(+), 2 deletions(-)
 create mode 100644 tests/virfiledata/mounts3.txt
 create mode 100644 tests/virfilemock.c

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 51be1509e1..d7ec7e3a6f 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -242,6 +242,7 @@ test_libraries += virusbmock.la \
virnetdevbandwidthmock.la \
virnumamock.la \
virtestmock.la \
+   virfilemock.la \
$(NULL)
 endif WITH_LINUX
 
@@ -1170,9 +1171,13 @@ virresctrltest_SOURCES = \
virresctrltest.c testutils.h testutils.c virfilewrapper.h 
virfilewrapper.c
 virresctrltest_LDADD = $(LDADDS)
 
+virfilemock_la_SOURCES = \
+   virfilemock.c
+virfilemock_la_LDFLAGS = $(MOCKLIBS_LDFLAGS)
+virfilemock_la_LIBADD = $(MOCKLIBS_LIBS)
 else ! WITH_LINUX
 EXTRA_DIST += vircaps2xmltest.c virnumamock.c virfilewrapper.c \
- virfilewrapper.h virresctrltest.c
+ virfilewrapper.h virresctrltest.c virfilemock.c
 endif ! WITH_LINUX
 
 if WITH_NSS
diff --git a/tests/virfiledata/mounts3.txt b/tests/virfiledata/mounts3.txt
new file mode 100644
index 00..226f67dc00
--- /dev/null
+++ b/tests/virfiledata/mounts3.txt
@@ -0,0 +1,33 @@
+/dev/root / xfs rw,noatime,attr2,inode64,noquota 0 0
+proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
+tmpfs /run tmpfs rw,nodev,relatime,size=3281436k,mode=755 0 0
+sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
+dev /dev devtmpfs rw,nosuid,relatime,size=10240k,nr_inodes=4093060,mode=755 0 0
+securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
+debugfs /sys/kernel/debug debugfs rw,nosuid,nodev,noexec,relatime 0 0
+mqueue /dev/mqueue mqueue rw,nosuid,nodev,noexec,relatime 0 0
+configfs /sys/kernel/config configfs rw,nosuid,nodev,noexec,relatime 0 0
+devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
+fusectl /sys/fs/fuse/connections fusectl rw,nosuid,nodev,noexec,relatime 0 0
+shm /dev/shm tmpfs rw,nosuid,nodev,noexec,relatime 0 0
+cgroup_root /sys/fs/cgroup tmpfs 
rw,nosuid,nodev,noexec,relatime,size=10240k,mode=755 0 0
+openrc /sys/fs/cgroup/openrc cgroup 
rw,nosuid,nodev,noexec,relatime,release_agent=/lib/rc/sh/cgroup-release-agent.sh,name=openrc
 0 0
+none /sys/fs/cgroup/unified cgroup2 rw,nosuid,nodev,noexec,relatime,nsdelegate 
0 0
+cpuset /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
+cpu /sys/fs/cgroup/cpu cgroup rw,nosuid,nodev,noexec,relatime,cpu 0 0
+cpuacct /sys/fs/cgroup/cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpuacct 
0 0
+blkio /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
+memory /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
+devices /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 
0 0
+freezer /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 
0 0
+net_cls /sys/fs/cgroup/net_cls cgroup rw,nosuid,nodev,noexec,relatime,net_cls 
0 0
+perf_event /sys/fs/cgroup/perf_event cgroup 
rw,nosuid,nodev,noexec,relatime,perf_event 0 0
+net_prio /sys/fs/cgroup/net_prio cgroup 
rw,nosuid,nodev,noexec,relatime,net_prio 0 0
+hugetlb /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 
0 0
+pids /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
+rdma /sys/fs/cgroup/rdma cgroup rw,nosuid,nodev,noexec,relatime,rdma 0 0
+binfmt_misc /proc/sys/fs/binfmt_misc binfmt_misc 
rw,nosuid,nodev,noexec,relatime 0 0
+hugetlbfs /hugepages2M hugetlbfs rw,relatime,mode=1777,pagesize=2M 0 0
+none /run/user/1000 tmpfs rw,relatime,mode=700,uid=1000 0 0
+host:/nfs /nfs nfs4 
rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp6,timeo=600,retrans=2,sec=sys,clientaddr=::,local_lock=none,addr=::
 0 0
+dev /nfs/blah devtmpfs 

[libvirt] [PATCH 4/4] virfile: Rework virFileIsSharedFixFUSE

2018-10-09 Thread Michal Privoznik
There are couple of things wrong with the current implementation.
The first one is that in the first loop the code tries to build a
list of fuse.glusterfs mount points. Well, since the strings are
allocated in a temporary buffer and are not duplicated this
results in wrong decision made later in the code.

The second problem is that the code does not take into account
subtree mounts. For instance, if there's a fuse.gluster mounted
at /some/path and another FS mounted at /some/path/subdir the
code would not recognize this subdir mount.

Reported-by: Han Han 
Signed-off-by: Michal Privoznik 
---
 src/util/virfile.c| 58 +--
 tests/virfiledata/mounts3.txt |  2 ++
 tests/virfiletest.c   |  2 ++
 3 files changed, 26 insertions(+), 36 deletions(-)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index 666d703f99..19f504cc6d 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -3468,18 +3468,14 @@ static int
 virFileIsSharedFixFUSE(const char *path,
long *f_type)
 {
-char *dirpath = NULL;
-const char **mounts = NULL;
-size_t nmounts = 0;
-char *p;
 FILE *f = NULL;
 struct mntent mb;
 char mntbuf[1024];
+char *mntDir = NULL;
+char *mntType = NULL;
+size_t maxMatching = 0;
 int ret = -1;
 
-if (VIR_STRDUP(dirpath, path) < 0)
-return -1;
-
 if (!(f = setmntent(PROC_MOUNTS, "r"))) {
 virReportSystemError(errno,
  _("Unable to open %s"),
@@ -3488,43 +3484,33 @@ virFileIsSharedFixFUSE(const char *path,
 }
 
 while (getmntent_r(f, , mntbuf, sizeof(mntbuf))) {
-if (STRNEQ("fuse.glusterfs", mb.mnt_type))
+const char *p;
+size_t len = strlen(mb.mnt_dir);
+
+if (!(p = STRSKIP(path, mb.mnt_dir)))
 continue;
 
-if (VIR_APPEND_ELEMENT_COPY(mounts, nmounts, mb.mnt_dir) < 0)
-goto cleanup;
-}
-
-/* Add NULL sentinel so that this is a virStringList */
-if (VIR_REALLOC_N(mounts, nmounts + 1) < 0)
-goto cleanup;
-mounts[nmounts] = NULL;
-
-do {
-if ((p = strrchr(dirpath, '/')) == NULL) {
-virReportSystemError(EINVAL,
- _("Invalid relative path '%s'"), path);
-goto cleanup;
+if (len > maxMatching) {
+len = maxMatching;
+VIR_FREE(mntType);
+VIR_FREE(mntDir);
+if (VIR_STRDUP(mntDir, mb.mnt_dir) < 0 ||
+VIR_STRDUP(mntType, mb.mnt_type) < 0)
+goto cleanup;
 }
+}
 
-if (p == dirpath)
-*(p+1) = '\0';
-else
-*p = '\0';
-
-if (virStringListHasString(mounts, dirpath)) {
-VIR_DEBUG("Found gluster FUSE mountpoint=%s for path=%s. "
-  "Fixing shared FS type", dirpath, path);
-*f_type = GFS2_MAGIC;
-break;
-}
-} while (p != dirpath);
+if (STREQ_NULLABLE(mntType, "fuse.glusterfs")) {
+VIR_DEBUG("Found gluster FUSE mountpoint=%s for path=%s. "
+  "Fixing shared FS type", mntDir, path);
+*f_type = GFS2_MAGIC;
+}
 
 ret = 0;
  cleanup:
+VIR_FREE(mntType);
+VIR_FREE(mntDir);
 endmntent(f);
-VIR_FREE(mounts);
-VIR_FREE(dirpath);
 return ret;
 }
 
diff --git a/tests/virfiledata/mounts3.txt b/tests/virfiledata/mounts3.txt
index 226f67dc00..134c6e8f81 100644
--- a/tests/virfiledata/mounts3.txt
+++ b/tests/virfiledata/mounts3.txt
@@ -31,3 +31,5 @@ hugetlbfs /hugepages2M hugetlbfs 
rw,relatime,mode=1777,pagesize=2M 0 0
 none /run/user/1000 tmpfs rw,relatime,mode=700,uid=1000 0 0
 host:/nfs /nfs nfs4 
rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp6,timeo=600,retrans=2,sec=sys,clientaddr=::,local_lock=none,addr=::
 0 0
 dev /nfs/blah devtmpfs 
rw,nosuid,relatime,size=10240k,nr_inodes=4093060,mode=755 0 0
+host:/gv0 /gluster fuse.glusterfs rw 0 0
+root@host:/tmp/mkdir /gluster/sshfs fuse.sshfs rw 0 0
diff --git a/tests/virfiletest.c b/tests/virfiletest.c
index 42d918aecc..d5102b1cc4 100644
--- a/tests/virfiletest.c
+++ b/tests/virfiletest.c
@@ -454,6 +454,8 @@ mymain(void)
 DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts2.txt", 
"/run/user/501/gvfs/some/file", false);
 DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts3.txt", "/nfs/file", true);
 DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts3.txt", "/nfs/blah", false);
+DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts3.txt", "/gluster/file", true);
+DO_TEST_FILE_IS_SHARED_FS_TYPE("mounts3.txt", "/gluster/sshfs/file", 
false);
 
 return ret != 0 ? EXIT_FAILURE : EXIT_SUCCESS;
 }
-- 
2.18.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2] virFileIsSharedFixFUSE: Copy mnt_dir when browsing mount table

2018-10-09 Thread Michal Privoznik
On 10/09/2018 09:10 AM, Han Han wrote:
> Fix typos of function name in commit msg.
> v1 version: 
> https://www.redhat.com/archives/libvir-list/2018-October/msg00511.html
> 
> virFileIsSharedFixFUSE doesn't fix f_type when "fuse.glusterfs"
> is not the last row of mount table. For example, it doesn't works on
> the mount table like following:
> 10.XX.XX.XX:/gv0 /mnt fuse.glusterfs rw 0 0
> r...@xx.xx.xx:/tmp/mkdir /tmp/br0 fuse.sshfs rw 0 0
> 
> Copy mnt_dir of struct mntent in case its mnt_dir is changed by
> getmntent_r in the loop later.
> 
> Signed-off-by: Han Han 
> ---
>  src/util/virfile.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)


Ah, thanks for catching this. In fact, we can do way better. I will post
my patches shortly.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC] virfile: fix cast-align error

2018-10-09 Thread Michal Privoznik
On 10/08/2018 08:41 PM, Marc Hartmayer wrote:
> Use the correct type in order to fix the following error on s390x:
> 
> In function 'virFileIsSharedFSType':
> ../../src/util/virfile.c:3578:38: error: cast increases required alignment of 
> target type [-Werror=cast-align]
>  virFileIsSharedFixFUSE(path, (long *) _type);
> 
> Signed-off-by: Marc Hartmayer 
> ---
>  src/util/virfile.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/util/virfile.c b/src/util/virfile.c
> index 2a7e87102a25..832d832696d5 100644
> --- a/src/util/virfile.c
> +++ b/src/util/virfile.c
> @@ -3466,7 +3466,7 @@ int virFilePrintf(FILE *fp, const char *msg, ...)
>  
>  static int
>  virFileIsSharedFixFUSE(const char *path,
> -   long *f_type)
> +   unsigned int *f_type)
>  {
>  char *dirpath = NULL;
>  const char **mounts = NULL;
> @@ -3575,7 +3575,7 @@ virFileIsSharedFSType(const char *path,
>  
>  if (sb.f_type == FUSE_SUPER_MAGIC) {
>  VIR_DEBUG("Found FUSE mount for path=%s. Trying to fix it", path);
> -virFileIsSharedFixFUSE(path, (long *) _type);
> +virFileIsSharedFixFUSE(path, _type);

This won't fly on x86_64 where f_type is long. I think we can use
__fsword_t directly.

>  }
>  
>  VIR_DEBUG("Check if path %s with FS magic %lld is shared",
> 

Does that work for you?

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Regarding the installtion of libvirt on ubuntu

2018-10-09 Thread Andrea Bolognani
On Tue, 2018-10-09 at 13:34 +0100, Daniel P. Berrangé wrote:
> On Wed, Oct 03, 2018 at 12:32:04PM +, Mousumi Paul wrote:
> > I have installed libvirt-bin on ubuntu using the following command:sudo 
> > apt-get install libvirt-bin
> > I checked the virsh version. It says 4.6.0.Now i want to recompile libvirt 
> > with esx, xen and hyperv driver.Therefore i went to /usr/src/libvirt-4.6.0 
> > and trying to run the following command:./configure --prefix=/usr 
> > --localstatedir=/var --sysconfdir=/etc --with-esx=yes --with-xen=yes
> > 
> > but i am getting the following error:error: you must install the 
> > gnutls>=3.2.0 pkg-config module to compile libvirt.
> > 
> > So i have installed gnutls-binand it is already the latest version.
> > Why libvirt is not able to identify gnutls... what is the solution for 
> > this???
> 
> Again, you'll need the -dev package, not the -bin package.

Also, if you have further questions please direct them to the
libvirt-users mailing list: libvir-list is for discussing the
development of libvirt itself.

-- 
Andrea Bolognani / Red Hat / Virtualization

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] Overview of libvirt incremental backup API, part 2 (incremental/differential pull mode)

2018-10-09 Thread Nir Soffer
On Fri, Oct 5, 2018 at 7:58 AM Eric Blake  wrote:

> On 10/4/18 12:05 AM, Eric Blake wrote:
> > The following (long) email describes a portion of the work-flow of how
> > my proposed incremental backup APIs will work, along with the backend
> > QMP commands that each one executes.  I will reply to this thread with
> > further examples (the first example is long enough to be its own email).
> > This is an update to a thread last posted here:
> > https://www.redhat.com/archives/libvir-list/2018-June/msg01066.html
> >
>
> > More to come in part 2.
> >
>
> - Second example: a sequence of incremental backups via pull model
>
> In the first example, we did not create a checkpoint at the time of the
> full pull. That means we have no way to track a delta of changes since
> that point in time.


Why do we want to support backup without creating a checkpoint?

If we don't have any real use case, I suggest to always require a
checkpoint.


> Let's repeat the full backup (reusing the same
> backup.xml from before), but this time, we'll add a new parameter, a
> second XML file for describing the checkpoint we want to create.
>
> Actually, it was easy enough to get virsh to write the XML for me
> (because it was very similar to existing code in virsh that creates XML
> for snapshot creation):
>
> $ $virsh checkpoint-create-as --print-xml $dom check1 testing \
> --diskspec sdc --diskspec sdd | tee check1.xml
> 
>check1
>

We should use an id, not a name, even of name is name is also unique like
in most libvirt apis.

In RHV we will use always use a UUID for this.


>testing
>
>  
>  
>
> 
>
> I had to supply two --diskspec arguments to virsh to select just the two
> qcow2 disks that I am using in my example (rather than every disk in the
> domain, which is the default when  is not present).


So  is valid configuration, selecting all disks, or not having
"disks" element
selects all disks?


> I also picked
> a name (mandatory) and description (optional) to be associated with the
> checkpoint.
>
> The backup.xml file that we plan to reuse still mentions scratch1.img
> and scratch2.img as files needed for staging the pull request. However,
> any contents in those files could interfere with our second backup
> (after all, every cluster written into that file from the first backup
> represents a point in time that was frozen at the first backup; but our
> second backup will want to read the data as the guest sees it now rather
> than what it was at the first backup), so we MUST regenerate the scratch
> files. (Perhaps I should have just deleted them at the end of example 1
> in my previous email, had I remembered when typing that mail).
>
> $ $qemu_img create -f qcow2 -b $orig1 -F qcow2 scratch1.img
> $ $qemu_img create -f qcow2 -b $orig2 -F qcow2 scratch2.img
>
> Now, to begin the full backup and create a checkpoint at the same time.
> Also, this time around, it would be nice if the guest had a chance to
> freeze I/O to the disks prior to the point chosen as the checkpoint.
> Assuming the guest is trusted, and running the qemu guest agent (qga),
> we can do that with:
>
> $ $virsh fsfreeze $dom
> $ $virsh backup-begin $dom backup.xml check1.xml
> Backup id 1 started
> backup used description from 'backup.xml'
> checkpoint used description from 'check1.xml'
> $ $virsh fsthaw $dom
>

Great, this answer my (unsent) question about freeze/thaw from part 1 :-)

>
> and eventually, we may decide to add a VIR_DOMAIN_BACKUP_BEGIN_QUIESCE
> flag to combine those three steps into a single API (matching what we've
> done on some other existing API).  In other words, the sequence of QMP
> operations performed during virDomainBackupBegin are quick enough that
> they won't stall a freeze operation (at least Windows is picky if you
> stall a freeze operation longer than 10 seconds).
>

We use fsFreeze/fsThaw directly in RHV since we need to support external
snapshots (e.g. ceph), so we don't need this functionality, but it sounds
good
idea to make it work like snapshot.


>
> The tweaked $virsh backup-begin now results in a call to:
>   virDomainBackupBegin(dom, "",
> " and in turn libvirt makes a similar sequence of QMP calls as before,
> with a slight modification in the middle:
> {"execute":"nbd-server-start",...
> {"execute":"blockdev-add",...
>

This does not work yet for network disks like "rbd" and "glusterfs"
does it mean that they will not be supported for backup?


> {"execute":"transaction",
>   "arguments":{"actions":[
>{"type":"blockdev-backup", "data":{
> "device":"$node1", "target":"backup-sdc", "sync":"none",
> "job-id":"backup-sdc" }},
>{"type":"blockdev-backup", "data":{
> "device":"$node2", "target":"backup-sdd", "sync":"none",
> "job-id":"backup-sdd" }}
>{"type":"block-dirty-bitmap-add", "data":{
> "node":"$node1", "name":"check1", "persistent":true}},
>{"type":"block-dirty-bitmap-add", "data":{
> "node":"$node2", "name":"check1", 

[libvirt] [PATCH v3 6/6] monitor: deprecate acl_show, acl_reset, acl_policy, acl_add, acl_remove

2018-10-09 Thread Daniel P . Berrangé
The various ACL related commands are obsolete now that the QAuthZ
framework for authorization is fully integrated throughout QEMU network
services. Mark it as deprecated with no replacement to be provided.

Authorization is now provided by using 'object_add' together with
the 'tls-authz' or 'sasl-authz' parameters to the VNC server, and
equivalent for other network services.

Signed-off-by: Daniel P. Berrangé 
---
 monitor.c| 23 +++
 qemu-deprecated.texi |  6 ++
 2 files changed, 29 insertions(+)

diff --git a/monitor.c b/monitor.c
index abedc6263f..8144f304c4 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2056,6 +2056,19 @@ static QAuthZList *find_auth(Monitor *mon, const char 
*name)
 return QAUTHZ_LIST(obj);
 }
 
+static bool warn_acl;
+static void hmp_warn_acl(void)
+{
+if (warn_acl) {
+return;
+}
+error_report("The acl_show, acl_reset, acl_policy, acl_add, acl_remove "
+ "commands are deprecated with no replacement. Authorization "
+ "for VNC should be performed using the pluggable QAuthZ "
+ "objects");
+warn_acl = true;
+}
+
 static void hmp_acl_show(Monitor *mon, const QDict *qdict)
 {
 const char *aclname = qdict_get_str(qdict, "aclname");
@@ -2063,6 +2076,8 @@ static void hmp_acl_show(Monitor *mon, const QDict *qdict)
 QAuthZListRuleList *rules;
 size_t i = 0;
 
+hmp_warn_acl();
+
 if (!auth) {
 return;
 }
@@ -2086,6 +2101,8 @@ static void hmp_acl_reset(Monitor *mon, const QDict 
*qdict)
 const char *aclname = qdict_get_str(qdict, "aclname");
 QAuthZList *auth = find_auth(mon, aclname);
 
+hmp_warn_acl();
+
 if (!auth) {
 return;
 }
@@ -2104,6 +2121,8 @@ static void hmp_acl_policy(Monitor *mon, const QDict 
*qdict)
 int val;
 Error *err = NULL;
 
+hmp_warn_acl();
+
 if (!auth) {
 return;
 }
@@ -2139,6 +2158,8 @@ static void hmp_acl_add(Monitor *mon, const QDict *qdict)
 QAuthZListFormat format;
 size_t i = 0;
 
+hmp_warn_acl();
+
 if (!auth) {
 return;
 }
@@ -2194,6 +2215,8 @@ static void hmp_acl_remove(Monitor *mon, const QDict 
*qdict)
 QAuthZList *auth = find_auth(mon, aclname);
 ssize_t i = 0;
 
+hmp_warn_acl();
+
 if (!auth) {
 return;
 }
diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
index d74d9ea02c..dc6b1909e0 100644
--- a/qemu-deprecated.texi
+++ b/qemu-deprecated.texi
@@ -121,6 +121,12 @@ replaced by the ``target'' output member.
 The @option{[hub_id name]} parameter tuple of the 'hostfwd_add' and
 'hostfwd_remove' HMP commands has been replaced by @option{netdev_id}.
 
+@subsection acl_show, acl_reset, acl_policy, acl_add, acl_remove (since 3.1)
+
+The ``acl_show'', ``acl_reset'', ``acl_policy'', ``acl_add'', and
+``acl_remove'' commands are deprecated with no replacement. Authorization
+for VNC should be performed using the pluggable QAuthZ objects.
+
 @section System emulator devices
 
 @subsection ivshmem (since 2.6.0)
-- 
2.17.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH v3 5/6] vnc: allow specifying a custom authorization object name

2018-10-09 Thread Daniel P . Berrangé
From: "Daniel P. Berrange" 

The VNC server has historically had support for ACLs to check both the
SASL username and the TLS x509 distinguished name. The VNC server was
responsible for creating the initial ACL, and the client app was then
responsible for populating it with rules using the HMP 'acl_add' command.

This is not satisfactory for a variety of reasons. There is no way to
populate the ACLs from the command line, users are forced to use the
HMP. With multiple network services all supporting TLS and ACLs now, it
is desirable to be able to define a single ACL that is referenced by all
services.

To address these limitations, two new options are added to the VNC
server CLI. The 'tls-authz' option takes the ID of a QAuthZ object to
use for checking TLS x509 distinguished names, and the 'sasl-authz'
option takes the ID of another object to use for checking SASL usernames.

In this example, we setup two authorization rules. The first allows any
client with a certificate issued by the 'RedHat' organization in the
'London' locality. The second ACL allows clients with either the
'j...@redhat.com' or  'f...@redhat.com' kerberos usernames. Both checks
must pass for the user to be allowed.

$QEMU -object tls-creds-x509,id=tls0,dir=/home/berrange/qemutls,\
  endpoint=server,verify-peer=yes \
  -object authz-simple,id=authz0,policy=deny,\
  rules.0.match=O=RedHat,,L=London,rules.0.policy=allow \
  -object authz-simple,id=authz1,policy=deny,\
  rules.0.match=f...@redhat.com,rules.0.policy=allow \
  rules.0.match=j...@redhat.com,rules.0.policy=allow \
  -vnc 0.0.0.0:1,tls-creds=tls0,tls-authz=authz0,
   sasl,sasl-authz=authz1 \
  ...other QEMU args...

Signed-off-by: Daniel P. Berrange 
---
 qemu-deprecated.texi |  5 
 qemu-options.hx  | 35 ++
 ui/vnc.c | 58 +---
 3 files changed, 79 insertions(+), 19 deletions(-)

diff --git a/qemu-deprecated.texi b/qemu-deprecated.texi
index 16ff946b55..d74d9ea02c 100644
--- a/qemu-deprecated.texi
+++ b/qemu-deprecated.texi
@@ -93,6 +93,11 @@ for these file types is 'host_cdrom' or 'host_device' as 
appropriate.
 The @option{name} parameter of the @option{-net} option is a synonym
 for the @option{id} parameter, which should now be used instead.
 
+@subsection -vnc acl (since 3.1.0)
+
+The @code{acl} option to the @code{-vnc} argument has been replaced
+by the @code{tls-authz} and @code{sasl-authz} options.
+
 @section QEMU Machine Protocol (QMP) commands
 
 @subsection block-dirty-bitmap-add "autoload" parameter (since 2.12.0)
diff --git a/qemu-options.hx b/qemu-options.hx
index 5079074088..6c3eef8b32 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -1626,6 +1626,14 @@ will cause the VNC server socket to enable the VeNCrypt 
auth
 mechanism.  The credentials should have been previously created
 using the @option{-object tls-creds} argument.
 
+@item tls-authz=@var{ID}
+
+Provides the ID of the QAuthZ authorization object against which
+the client's x509 distinguished name will validated. This object is
+only resolved at time of use, so can be deleted and recreated on the
+fly while the VNC server is active. If missing, it will default
+to denying access.
+
 @item sasl
 
 Require that the client use SASL to authenticate with the VNC server.
@@ -1641,18 +1649,25 @@ ensures a data encryption preventing compromise of 
authentication
 credentials. See the @ref{vnc_security} section for details on using
 SASL authentication.
 
+@item sasl-authz=@var{ID}
+
+Provides the ID of the QAuthZ authorization object against which
+the client's SASL username will validated. This object is
+only resolved at time of use, so can be deleted and recreated on the
+fly while the VNC server is active. If missing, it will default
+to denying access.
+
 @item acl
 
-Turn on access control lists for checking of the x509 client certificate
-and SASL party. For x509 certs, the ACL check is made against the
-certificate's distinguished name. This is something that looks like
-@code{C=GB,O=ACME,L=Boston,CN=bob}. For SASL party, the ACL check is
-made against the username, which depending on the SASL plugin, may
-include a realm component, eg @code{bob} or @code{bob@@EXAMPLE.COM}.
-When the @option{acl} flag is set, the initial access list will be
-empty, with a @code{deny} policy. Thus no one will be allowed to
-use the VNC server until the ACLs have been loaded. This can be
-achieved using the @code{acl} monitor command.
+Legacy method for enabling authorization of clients against the
+x509 distinguished name and SASL username. It results in the creation
+of two @code{authz-list} objects with IDs of @code{vnc.username} and
+@code{vnc.x509dname}. The rules for these objects must be configured
+with the HMP ACL commands.
+
+This option is deprecated and should no longer be used. The new
+@option{sasl-authz} and 

[libvirt] [PATCH v3 2/6] nbd: allow authorization with nbd-server-start QMP command

2018-10-09 Thread Daniel P . Berrangé
From: "Daniel P. Berrange" 

As with the previous patch to qemu-nbd, the nbd-server-start QMP command
also needs to be able to specify authorization when enabling TLS encryption.

First the client must create a QAuthZ object instance using the
'object-add' command:

   {
 'execute': 'object-add',
 'arguments': {
   'qom-type': 'authz-list',
   'id': 'authz0',
   'parameters': {
 'policy': 'deny',
 'rules': [
   {
 'match': '*CN=fred',
 'policy': 'allow'
   }
 ]
   }
 }
   }

They can then reference this in the new 'tls-authz' parameter when
executing the 'nbd-server-start' command:

   {
 'execute': 'nbd-server-start',
 'arguments': {
   'addr': {
   'type': 'inet',
   'host': '127.0.0.1',
   'port': '9000'
   },
   'tls-creds': 'tls0',
   'tls-authz': 'authz0'
 }
   }

Signed-off-by: Daniel P. Berrange 
---
 blockdev-nbd.c  | 11 ---
 hmp.c   |  2 +-
 include/block/nbd.h |  2 +-
 qapi/block.json |  8 +++-
 4 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 1d170c80b8..c10d6c0c81 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -23,6 +23,7 @@
 typedef struct NBDServerData {
 QIONetListener *listener;
 QCryptoTLSCreds *tlscreds;
+char *tlsauthz;
 } NBDServerData;
 
 static NBDServerData *nbd_server;
@@ -36,7 +37,7 @@ static void nbd_accept(QIONetListener *listener, 
QIOChannelSocket *cioc,
gpointer opaque)
 {
 qio_channel_set_name(QIO_CHANNEL(cioc), "nbd-server");
-nbd_client_new(cioc, nbd_server->tlscreds, NULL,
+nbd_client_new(cioc, nbd_server->tlscreds, nbd_server->tlsauthz,
nbd_blockdev_client_closed);
 }
 
@@ -52,6 +53,7 @@ static void nbd_server_free(NBDServerData *server)
 if (server->tlscreds) {
 object_unref(OBJECT(server->tlscreds));
 }
+g_free(server->tlsauthz);
 
 g_free(server);
 }
@@ -87,7 +89,7 @@ static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, 
Error **errp)
 
 
 void nbd_server_start(SocketAddress *addr, const char *tls_creds,
-  Error **errp)
+  const char *tls_authz, Error **errp)
 {
 if (nbd_server) {
 error_setg(errp, "NBD server already running");
@@ -117,6 +119,8 @@ void nbd_server_start(SocketAddress *addr, const char 
*tls_creds,
 }
 }
 
+nbd_server->tlsauthz = g_strdup(tls_authz);
+
 qio_net_listener_set_client_func(nbd_server->listener,
  nbd_accept,
  NULL,
@@ -131,11 +135,12 @@ void nbd_server_start(SocketAddress *addr, const char 
*tls_creds,
 
 void qmp_nbd_server_start(SocketAddressLegacy *addr,
   bool has_tls_creds, const char *tls_creds,
+  bool has_tls_authz, const char *tls_authz,
   Error **errp)
 {
 SocketAddress *addr_flat = socket_address_flatten(addr);
 
-nbd_server_start(addr_flat, tls_creds, errp);
+nbd_server_start(addr_flat, tls_creds, tls_authz, errp);
 qapi_free_SocketAddress(addr_flat);
 }
 
diff --git a/hmp.c b/hmp.c
index 61ef120423..7fc5a6502f 100644
--- a/hmp.c
+++ b/hmp.c
@@ -2300,7 +2300,7 @@ void hmp_nbd_server_start(Monitor *mon, const QDict 
*qdict)
 goto exit;
 }
 
-nbd_server_start(addr, NULL, _err);
+nbd_server_start(addr, NULL, NULL, _err);
 qapi_free_SocketAddress(addr);
 if (local_err != NULL) {
 goto exit;
diff --git a/include/block/nbd.h b/include/block/nbd.h
index d7aa6b24d0..6cb336b45a 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -318,7 +318,7 @@ void nbd_client_get(NBDClient *client);
 void nbd_client_put(NBDClient *client);
 
 void nbd_server_start(SocketAddress *addr, const char *tls_creds,
-  Error **errp);
+  const char *tls_authz, Error **errp);
 
 void nbd_export_bitmap(NBDExport *exp, const char *bitmap,
const char *bitmap_export_name, Error **errp);
diff --git a/qapi/block.json b/qapi/block.json
index 11f01f28ef..812a9b1917 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -225,6 +225,11 @@
 #
 # @addr: Address on which to listen.
 # @tls-creds: (optional) ID of the TLS credentials object. Since 2.6
+# @tls-authz: ID of the QAuthZ authorization object used to validate
+# the client's x509 distinguished name. This object is
+# is only resolved at time of use, so can be deleted and
+# recreated on the fly while the NBD server is active.
+# If missing, it will default to denying access. Since 3.1
 #
 # Returns: error if the server is already running.
 #
@@ -232,7 +237,8 @@
 ##
 { 'command': 'nbd-server-start',
   'data': { 'addr': 'SocketAddressLegacy',
-'*tls-creds': 'str'} }
+'*tls-creds': 

[libvirt] [PATCH v3 4/6] chardev: add support for authorization for TLS clients

2018-10-09 Thread Daniel P . Berrangé
From: "Daniel P. Berrange" 

Currently any client which can complete the TLS handshake is able to use
a chardev server. The server admin can turn on the 'verify-peer' option
for the x509 creds to require the client to provide a x509
certificate. This means the client will have to acquire a certificate
from the CA before they are permitted to use the chardev server. This is
still a fairly low bar.

This adds a 'tls-authz=OBJECT-ID' option to the socket chardev backend
which takes the ID of a previously added 'QAuthZ' object instance. This
will be used to validate the client's x509 distinguished name. Clients
failing the check will not be permitted to use the chardev server.

For example to setup authorization that only allows connection from a
client whose x509 certificate distinguished name contains 'CN=fred', you
would use:

  $QEMU -object tls-creds-x509,id=tls0,dir=/home/berrange/qemutls,\
endpoint=server,verify-peer=yes \
-object authz-simple,id=authz0,identity=CN=laptop.example.com,,\
O=Example Org,,L=London,,ST=London,,C=GB \
-chardev socket,host=127.0.0.1,port=9000,server,\
 tls-creds=tls0,tls-authz=authz0 \
...other qemu args...

Signed-off-by: Daniel P. Berrange 
---
 chardev/char-socket.c | 11 ++-
 chardev/char.c|  3 +++
 qapi/char.json|  6 ++
 qemu-options.hx   |  9 +++--
 4 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index a75b46d9fe..ab121e1e30 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -52,6 +52,7 @@ typedef struct {
 QIONetListener *listener;
 GSource *hup_source;
 QCryptoTLSCreds *tls_creds;
+char *tls_authz;
 int connected;
 int max_size;
 int do_telnetopt;
@@ -737,7 +738,7 @@ static void tcp_chr_tls_init(Chardev *chr)
 if (s->is_listen) {
 tioc = qio_channel_tls_new_server(
 s->ioc, s->tls_creds,
-NULL, /* XXX Use an ACL */
+s->tls_authz,
 );
 } else {
 tioc = qio_channel_tls_new_client(
@@ -889,6 +890,7 @@ static void char_socket_finalize(Object *obj)
 if (s->tls_creds) {
 object_unref(OBJECT(s->tls_creds));
 }
+g_free(s->tls_authz);
 
 qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
 }
@@ -994,6 +996,7 @@ static void qmp_chardev_open_socket(Chardev *chr,
 }
 }
 }
+s->tls_authz = g_strdup(sock->tls_authz);
 
 s->addr = addr = socket_address_flatten(sock->addr);
 
@@ -1075,6 +1078,7 @@ static void qemu_chr_parse_socket(QemuOpts *opts, 
ChardevBackend *backend,
 const char *fd = qemu_opt_get(opts, "fd");
 const char *tls_creds = qemu_opt_get(opts, "tls-creds");
 SocketAddressLegacy *addr;
+const char *tls_authz = qemu_opt_get(opts, "tls-authz");
 ChardevSocket *sock;
 
 if ((!!path + !!fd + !!host) != 1) {
@@ -1103,6 +1107,10 @@ static void qemu_chr_parse_socket(QemuOpts *opts, 
ChardevBackend *backend,
 } else {
 g_assert_not_reached();
 }
+if (tls_authz && !tls_creds) {
+error_setg(errp, "Authorization can only be used when TLS is enabled");
+return;
+}
 
 sock = backend->u.socket.data = g_new0(ChardevSocket, 1);
 qemu_chr_parse_common(opts, qapi_ChardevSocket_base(sock));
@@ -1120,6 +1128,7 @@ static void qemu_chr_parse_socket(QemuOpts *opts, 
ChardevBackend *backend,
 sock->has_reconnect = true;
 sock->reconnect = reconnect;
 sock->tls_creds = g_strdup(tls_creds);
+sock->tls_authz = g_strdup(tls_authz);
 
 addr = g_new0(SocketAddressLegacy, 1);
 if (path) {
diff --git a/chardev/char.c b/chardev/char.c
index e115166995..987474a950 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -860,6 +860,9 @@ QemuOptsList qemu_chardev_opts = {
 },{
 .name = "tls-creds",
 .type = QEMU_OPT_STRING,
+},{
+.name = "tls-authz",
+.type = QEMU_OPT_STRING,
 },{
 .name = "width",
 .type = QEMU_OPT_NUMBER,
diff --git a/qapi/char.json b/qapi/char.json
index b7b2a05766..244213c075 100644
--- a/qapi/char.json
+++ b/qapi/char.json
@@ -243,6 +243,11 @@
 # @addr: socket address to listen on (server=true)
 #or connect to (server=false)
 # @tls-creds: the ID of the TLS credentials object (since 2.6)
+# @tls-authz: the ID of the QAuthZ authorization object against which
+# the client's x509 distinguished name will validated. This
+# object is only resolved at time of use, so can be deleted
+# and recreated on the fly while the chardev server is active.
+# If missing, it will default to denying access (since 3.1)
 # @server: create server socket (default: true)
 # @wait: wait for incoming connection on server
 #sockets (default: false).
@@ -260,6 +265,7 @@
 ##
 { 'struct': 'ChardevSocket', 'data': { 'addr'   : 

[libvirt] [PATCH v3 3/6] migration: add support for a "tls-authz" migration parameter

2018-10-09 Thread Daniel P . Berrangé
From: "Daniel P. Berrange" 

The QEMU instance that runs as the server for the migration data
transport (ie the target QEMU) needs to be able to configure access
control so it can prevent unauthorized clients initiating an incoming
migration. This adds a new 'tls-authz' migration parameter that is used
to provide the QOM ID of a QAuthZ subclass instance that provides the
access control check. This is checked against the x509 certificate
obtained during the TLS handshake.

For example, when starting a QEMU for incoming migration, it is
possible to give an example identity of the source QEMU that is
intended to be connecting later:

  $QEMU \
 -monitor stdio \
 -incoming defer \
 ...other args...

  (qemu) object_add tls-creds-x509,id=tls0,dir=/home/berrange/qemutls,\
 endpoint=server,verify-peer=yes \
  (qemu) object_add authz-simple,id=auth0,identity=CN=laptop.example.com,,\
 O=Example Org,,L=London,,ST=London,,C=GB \
  (qemu) migrate_incoming tcp:localhost:9000

Signed-off-by: Daniel P. Berrange 
---
 hmp.c |  9 +
 migration/migration.c |  8 
 migration/tls.c   |  2 +-
 qapi/migration.json   | 14 +-
 4 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/hmp.c b/hmp.c
index 7fc5a6502f..52ee5e5444 100644
--- a/hmp.c
+++ b/hmp.c
@@ -396,6 +396,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict 
*qdict)
 monitor_printf(mon, "%s: %" PRIu64 "\n",
 MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH),
 params->max_postcopy_bandwidth);
+monitor_printf(mon, " %s: '%s'\n",
+MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
+params->has_tls_authz ? params->tls_authz : "");
 }
 
 qapi_free_MigrationParameters(params);
@@ -1702,6 +1705,12 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict 
*qdict)
 p->tls_hostname->type = QTYPE_QSTRING;
 visit_type_str(v, param, >tls_hostname->u.s, );
 break;
+case MIGRATION_PARAMETER_TLS_AUTHZ:
+p->has_tls_authz = true;
+p->tls_authz = g_new0(StrOrNull, 1);
+p->tls_authz->type = QTYPE_QSTRING;
+visit_type_str(v, param, >tls_authz->u.s, );
+break;
 case MIGRATION_PARAMETER_MAX_BANDWIDTH:
 p->has_max_bandwidth = true;
 /*
diff --git a/migration/migration.c b/migration/migration.c
index d6ae879dc8..f8d0251885 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -685,6 +685,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error 
**errp)
 params->tls_creds = g_strdup(s->parameters.tls_creds);
 params->has_tls_hostname = true;
 params->tls_hostname = g_strdup(s->parameters.tls_hostname);
+params->has_tls_authz = true;
+params->tls_authz = g_strdup(s->parameters.tls_authz);
 params->has_max_bandwidth = true;
 params->max_bandwidth = s->parameters.max_bandwidth;
 params->has_downtime_limit = true;
@@ -1188,6 +1190,12 @@ static void migrate_params_apply(MigrateSetParameters 
*params, Error **errp)
 s->parameters.tls_hostname = g_strdup(params->tls_hostname->u.s);
 }
 
+if (params->has_tls_authz) {
+g_free(s->parameters.tls_authz);
+assert(params->tls_authz->type == QTYPE_QSTRING);
+s->parameters.tls_authz = g_strdup(params->tls_authz->u.s);
+}
+
 if (params->has_max_bandwidth) {
 s->parameters.max_bandwidth = params->max_bandwidth;
 if (s->to_dst_file) {
diff --git a/migration/tls.c b/migration/tls.c
index 3b9e8c9263..5171afc6c4 100644
--- a/migration/tls.c
+++ b/migration/tls.c
@@ -94,7 +94,7 @@ void migration_tls_channel_process_incoming(MigrationState *s,
 
 tioc = qio_channel_tls_new_server(
 ioc, creds,
-NULL, /* XXX pass ACL name */
+s->parameters.tls_authz,
 errp);
 if (!tioc) {
 return;
diff --git a/qapi/migration.json b/qapi/migration.json
index 6e8c21258a..9c656effb0 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -519,6 +519,12 @@
 #hostname must be provided so that the server's x509
 #certificate identity can be validated. (Since 2.7)
 #
+# @tls-authz: ID of the 'authz' object subclass that provides access control
+# checking of the TLS x509 certificate distinguished name.
+# This object is only resolved at time of use, so can be deleted
+# and recreated on the fly while the migration server is active.
+# If missing, it will default to denying access (Since 3.1)
+#
 # @max-bandwidth: to set maximum speed for migration. maximum speed in
 # bytes per second. (Since 2.8)
 #
@@ -560,7 +566,7 @@
   'data': ['compress-level', 'compress-threads', 'decompress-threads',
'compress-wait-thread',
'cpu-throttle-initial', 'cpu-throttle-increment',
-   'tls-creds', 'tls-hostname', 'max-bandwidth',
+

[libvirt] [PATCH v3 1/6] qemu-nbd: add support for authorization of TLS clients

2018-10-09 Thread Daniel P . Berrangé
From: "Daniel P. Berrange" 

Currently any client which can complete the TLS handshake is able to use
the NBD server. The server admin can turn on the 'verify-peer' option
for the x509 creds to require the client to provide a x509 certificate.
This means the client will have to acquire a certificate from the CA
before they are permitted to use the NBD server. This is still a fairly
low bar to cross.

This adds a '--tls-authz OBJECT-ID' option to the qemu-nbd command which
takes the ID of a previously added 'QAuthZ' object instance. This will
be used to validate the client's x509 distinguished name. Clients
failing the authorization check will not be permitted to use the NBD
server.

For example to setup authorization that only allows connection from a client
whose x509 certificate distinguished name is

   CN=laptop.example.com,O=Example Org,L=London,ST=London,C=GB

use:

  qemu-nbd --object tls-creds-x509,id=tls0,dir=/home/berrange/qemutls,\
endpoint=server,verify-peer=yes \
   --object authz-simple,id=auth0,identity=CN=laptop.example.com,,\
O=Example Org,,L=London,,ST=London,,C=GB \
   --tls-creds tls0 \
   --tls-authz authz0
   other qemu-nbd args...

Signed-off-by: Daniel P. Berrange 
---
 include/block/nbd.h |  2 +-
 nbd/server.c| 10 +-
 qemu-nbd.c  | 13 -
 qemu-nbd.texi   |  4 
 4 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 6a5bfe5d55..d7aa6b24d0 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -312,7 +312,7 @@ void nbd_export_close_all(void);
 
 void nbd_client_new(QIOChannelSocket *sioc,
 QCryptoTLSCreds *tlscreds,
-const char *tlsaclname,
+const char *tlsauthz,
 void (*close_fn)(NBDClient *, bool));
 void nbd_client_get(NBDClient *client);
 void nbd_client_put(NBDClient *client);
diff --git a/nbd/server.c b/nbd/server.c
index a1eda0114f..138a35f838 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -111,7 +111,7 @@ struct NBDClient {
 
 NBDExport *exp;
 QCryptoTLSCreds *tlscreds;
-char *tlsaclname;
+char *tlsauthz;
 QIOChannelSocket *sioc; /* The underlying data channel */
 QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
 
@@ -687,7 +687,7 @@ static QIOChannel *nbd_negotiate_handle_starttls(NBDClient 
*client,
 
 tioc = qio_channel_tls_new_server(ioc,
   client->tlscreds,
-  client->tlsaclname,
+  client->tlsauthz,
   errp);
 if (!tioc) {
 return NULL;
@@ -1353,7 +1353,7 @@ void nbd_client_put(NBDClient *client)
 if (client->tlscreds) {
 object_unref(OBJECT(client->tlscreds));
 }
-g_free(client->tlsaclname);
+g_free(client->tlsauthz);
 if (client->exp) {
 QTAILQ_REMOVE(>exp->clients, client, next);
 nbd_export_put(client->exp);
@@ -2403,7 +2403,7 @@ static coroutine_fn void nbd_co_client_start(void *opaque)
  */
 void nbd_client_new(QIOChannelSocket *sioc,
 QCryptoTLSCreds *tlscreds,
-const char *tlsaclname,
+const char *tlsauthz,
 void (*close_fn)(NBDClient *, bool))
 {
 NBDClient *client;
@@ -2415,7 +2415,7 @@ void nbd_client_new(QIOChannelSocket *sioc,
 if (tlscreds) {
 object_ref(OBJECT(client->tlscreds));
 }
-client->tlsaclname = g_strdup(tlsaclname);
+client->tlsauthz = g_strdup(tlsauthz);
 client->sioc = sioc;
 object_ref(OBJECT(client->sioc));
 client->ioc = QIO_CHANNEL(sioc);
diff --git a/qemu-nbd.c b/qemu-nbd.c
index e76fe3082a..61fe0fb5b9 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -52,6 +52,7 @@
 #define QEMU_NBD_OPT_TLSCREDS  261
 #define QEMU_NBD_OPT_IMAGE_OPTS262
 #define QEMU_NBD_OPT_FORK  263
+#define QEMU_NBD_OPT_TLSAUTHZ  264
 
 #define MBR_SIZE 512
 
@@ -65,6 +66,7 @@ static int shared = 1;
 static int nb_fds;
 static QIONetListener *server;
 static QCryptoTLSCreds *tlscreds;
+static const char *tlsauthz;
 
 static void usage(const char *name)
 {
@@ -354,7 +356,7 @@ static void nbd_accept(QIONetListener *listener, 
QIOChannelSocket *cioc,
 
 nb_fds++;
 nbd_update_server_watch();
-nbd_client_new(cioc, tlscreds, NULL, nbd_client_closed);
+nbd_client_new(cioc, tlscreds, tlsauthz, nbd_client_closed);
 }
 
 static void nbd_update_server_watch(void)
@@ -532,6 +534,7 @@ int main(int argc, char **argv)
 { "image-opts", no_argument, NULL, QEMU_NBD_OPT_IMAGE_OPTS },
 { "trace", required_argument, NULL, 'T' },
 { "fork", no_argument, NULL, QEMU_NBD_OPT_FORK },
+{ "tls-authz", no_argument, NULL, QEMU_NBD_OPT_TLSAUTHZ },
 { NULL, 0, NULL, 0 }
 

[libvirt] [PATCH v3 0/6] Add authorization support to all network services

2018-10-09 Thread Daniel P . Berrangé
  v1: https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg04482.html
  v2: https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg05727.html

This series builds on the core authorization framework:

  v5: https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg01613.html

enabling its use with the VNC, chardev, NBD and migration network servers.

In combination with TLS x509 client certificates, this allows these
services to whitelist specific clients, which avoids the need to setup
restricted child certificate authorities.

In VNC it also allows whitelisting based on SASL user names.

Changed in v3:

  - Rebased to latest git master

Changed in v2:

 - Document that authz objects are resolved at time of use, not
   time of network service activation
 - Improve docs for tls-authz parameters on services
 - Fix 2.13 -> 3.0 version tags
 - Remove redundant conditionals around g_strdup
 - Fix arg syntax for qemu-nbd  s/-/--/
 - Remove QAPI (optional) annotation
 - Fix some outdated usage example

Based-on: <20181009130442.26296-1-berra...@redhat.com>

Daniel P. Berrangé (6):
  qemu-nbd: add support for authorization of TLS clients
  nbd: allow authorization with nbd-server-start QMP command
  migration: add support for a "tls-authz" migration parameter
  chardev: add support for authorization for TLS clients
  vnc: allow specifying a custom authorization object name
  monitor: deprecate acl_show, acl_reset, acl_policy, acl_add,
acl_remove

 blockdev-nbd.c| 11 +---
 chardev/char-socket.c | 11 +++-
 chardev/char.c|  3 +++
 hmp.c | 11 +++-
 include/block/nbd.h   |  4 +--
 migration/migration.c |  8 ++
 migration/tls.c   |  2 +-
 monitor.c | 23 +
 nbd/server.c  | 10 
 qapi/block.json   |  8 +-
 qapi/char.json|  6 +
 qapi/migration.json   | 14 ++-
 qemu-deprecated.texi  | 11 
 qemu-nbd.c| 13 +-
 qemu-nbd.texi |  4 +++
 qemu-options.hx   | 44 +++-
 ui/vnc.c  | 58 ---
 17 files changed, 204 insertions(+), 37 deletions(-)

-- 
2.17.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] Regarding the installtion of libvirt on ubuntu

2018-10-09 Thread Daniel P . Berrangé
On Wed, Oct 03, 2018 at 12:32:04PM +, Mousumi Paul wrote:
> I have installed libvirt-bin on ubuntu using the following command:sudo 
> apt-get install libvirt-bin
> I checked the virsh version. It says 4.6.0.Now i want to recompile libvirt 
> with esx, xen and hyperv driver.Therefore i went to /usr/src/libvirt-4.6.0 
> and trying to run the following command:./configure --prefix=/usr 
> --localstatedir=/var --sysconfdir=/etc --with-esx=yes --with-xen=yes
> 
> but i am getting the following error:error: you must install the 
> gnutls>=3.2.0 pkg-config module to compile libvirt.
> 
> So i have installed gnutls-binand it is already the latest version.
> Why libvirt is not able to identify gnutls... what is the solution for this???

Again, you'll need the -dev package, not the -bin package.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] Regarding openwsman requirement in libvirt configuration

2018-10-09 Thread Daniel P . Berrangé
On Wed, Oct 03, 2018 at 12:23:53PM +, Mousumi Paul wrote:
> Trying to install libvirt 1.2.14 with the following command :./configure 
> --prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-esx=yes 
> --with-xen=yes --with-hyperv=yes
> 
> The above generates the following error:
> configure: error: openwsman is required for the Hyper-V driver
> 
> I have installed openwsman on ubuntu using the following command:sudo apt-get 
> install openwsman
> But still getting the same error. Please help me to compile libvirt with 
> hyperv driver.

You need the openwsman *development* package, which usually has a '-dev'
suffix on Debian derived distros.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] Regarding openwsman requirement in libvirt configuration

2018-10-09 Thread Mousumi Paul
Trying to install libvirt 1.2.14 with the following command :./configure 
--prefix=/usr --localstatedir=/var --sysconfdir=/etc --with-esx=yes 
--with-xen=yes --with-hyperv=yes

The above generates the following error:
configure: error: openwsman is required for the Hyper-V driver

I have installed openwsman on ubuntu using the following command:sudo apt-get 
install openwsman
But still getting the same error. Please help me to compile libvirt with hyperv 
driver.
 Thanks & Regards,Mousumi Paul---Senior R & D Engineer
Vigyanlabs Innovations Private Limited--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] Regarding the installtion of libvirt on ubuntu

2018-10-09 Thread Mousumi Paul
I have installed libvirt-bin on ubuntu using the following command:sudo apt-get 
install libvirt-bin
I checked the virsh version. It says 4.6.0.Now i want to recompile libvirt with 
esx, xen and hyperv driver.Therefore i went to /usr/src/libvirt-4.6.0 and 
trying to run the following command:./configure --prefix=/usr 
--localstatedir=/var --sysconfdir=/etc --with-esx=yes --with-xen=yes

but i am getting the following error:error: you must install the gnutls>=3.2.0 
pkg-config module to compile libvirt.

So i have installed gnutls-binand it is already the latest version.
Why libvirt is not able to identify gnutls... what is the solution for this???
 Thanks & Regards,Mousumi Paul---Senior R & D Engineer
Vigyanlabs Innovations Private Limited--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] The results of lspci are inconsistent between vfio reset pci devices and reset devices by sysfs interafce

2018-10-09 Thread Wuzongyong (Euler Dept)
Hi,

I start a virtual machine with commandline:
/usr/libexec/qemu-kvm --enable-kvm -smp 8 -m 8192 -device 
vfio-pci,host=:81:00.0

Then I pause the qemu process before executing the main_loop function by gdb.
At this moment, lspci shows the regions are disabled like below:
81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] 
(rev a1)
Subsystem: NVIDIA Corporation Device 118f
Physical Slot: 0-6
Control: I/O- Mem- BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- 
Stepping- SERR- FastB2B- DisINTx+
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-  /sys/bus/pci/devices/:81:00.0/reset
lspci shows the regions are *not* disabled:
81:00.0 3D controller: NVIDIA Corporation GP100GL [Tesla P100 PCIe 16GB] 
(rev a1)
Subsystem: Huawei Technologies Co., Ltd. Device 2061
Physical Slot: 0-6
Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ 
Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR- vbasedev.fd, VFIO_DEVICE_RESET)
Kernel:
vfio_pci_ioctl
pci_try_reset_function
__pci_reset_function_locked
pci_parent_bus_reset
pci_reset_bridge_secondary_bus

and write 1 to the reset interface of sysfs go through the path:
Kernel:
reset_store
pci_reset_function
__pci_reset_function_locked
pci_parent_bus_reset
pci_reset_bridge_secondary_bus

So seem that these two methods are same actually, I am confused why the results 
are inconsistent.

Thanks,
Zongyong Wu

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCHv5 19/19] qemu: Setting up vcpu and adding pids to resctrl monitor groups during reconnection

2018-10-09 Thread Wang Huaqiang
Invoking qemuProcessSetupVcpus in process of VM reconnection.

Signed-off-by: Wang Huaqiang 
---
 src/qemu/qemu_process.c | 3 +++
 src/util/virresctrl.c   | 7 +++
 2 files changed, 10 insertions(+)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index a4bbef6..f85aef0 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -7987,6 +7987,9 @@ qemuProcessReconnect(void *opaque)
 }
 }
 
+if (qemuProcessSetupVcpus(obj) < 0)
+goto error;
+
 /* update domain state XML with possibly updated state in virDomainObj */
 if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, obj, driver->caps) 
< 0)
 goto error;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 67dfbb8..b5717b2 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2495,9 +2495,16 @@ int
 virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
 pid_t pid)
 {
+size_t i = 0;
+
 if (virResctrlAddPID(monitor->path, pid) < 0)
 return -1;
 
+for (i = 0; i < monitor->npids; i++) {
+if (pid == monitor->pids[i])
+return 0;
+}
+
 if (VIR_APPEND_ELEMENT(monitor->pids, monitor->npids, pid) < 0)
 return -1;
 
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 07/19] util: Refactor code for creating resctrl group

2018-10-09 Thread Wang Huaqiang
The code for creating resctrl allocation group could be reused
for monitoring group, refactor it for reusing in the later patch.

Signed-off-by: Wang Huaqiang 
---
 src/util/virresctrl.c | 37 +++--
 1 file changed, 23 insertions(+), 14 deletions(-)

diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 1a5578e..8b617a6 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2310,6 +2310,26 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
 }
 
 
+/* This function creates a resctrl directory in resource control file system,
+ * and the directory path is specified by @path. */
+static int
+virResctrlCreateGroupPath(const char *path)
+{
+/* Directory exists, return */
+if (virFileExists(path))
+return 0;
+
+if (virFileMakePath(path) < 0) {
+virReportSystemError(errno,
+ _("Cannot create resctrl directory '%s'"),
+ path);
+return -1;
+}
+
+return 0;
+}
+
+
 /* This checks if the directory for the alloc exists.  If not it tries to 
create
  * it and apply appropriate alloc settings. */
 int
@@ -2334,13 +2354,6 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl,
 if (virResctrlAllocDeterminePath(alloc, machinename) < 0)
 return -1;
 
-if (virFileExists(alloc->path)) {
-virReportError(VIR_ERR_INTERNAL_ERROR,
-   _("Path '%s' for resctrl allocation exists"),
-   alloc->path);
-goto cleanup;
-}
-
 lockfd = virResctrlLockWrite();
 if (lockfd < 0)
 goto cleanup;
@@ -2348,6 +2361,9 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl,
 if (virResctrlAllocAssign(resctrl, alloc) < 0)
 goto cleanup;
 
+if (virResctrlCreateGroupPath(alloc->path) < 0)
+goto cleanup;
+
 alloc_str = virResctrlAllocFormat(alloc);
 if (!alloc_str)
 goto cleanup;
@@ -2355,13 +2371,6 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl,
 if (virAsprintf(_path, "%s/schemata", alloc->path) < 0)
 goto cleanup;
 
-if (virFileMakePath(alloc->path) < 0) {
-virReportSystemError(errno,
- _("Cannot create resctrl directory '%s'"),
- alloc->path);
-goto cleanup;
-}
-
 VIR_DEBUG("Writing resctrl schemata '%s' into '%s'", alloc_str, 
schemata_path);
 if (virFileWriteStr(schemata_path, alloc_str, 0) < 0) {
 rmdir(alloc->path);
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 17/19] qemu: refactor qemuDomainGetStatsCpu

2018-10-09 Thread Wang Huaqiang
Refactoring qemuDomainGetStatsCpu, make it possible to add
more CPU statistics.

Signed-off-by: Wang Huaqiang 
---
 src/qemu/qemu_driver.c | 45 ++---
 1 file changed, 22 insertions(+), 23 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index a52e249..12a5f8f 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -19711,30 +19711,29 @@ qemuDomainGetStatsCpu(virQEMUDriverPtr driver 
ATTRIBUTE_UNUSED,
 unsigned long long sys_time = 0;
 int err = 0;
 
-if (!priv->cgroup)
-return 0;
-
-err = virCgroupGetCpuacctUsage(priv->cgroup, _time);
-if (!err && virTypedParamsAddULLong(>params,
->nparams,
-maxparams,
-"cpu.time",
-cpu_time) < 0)
-return -1;
+if (priv->cgroup) {
+err = virCgroupGetCpuacctUsage(priv->cgroup, _time);
+if (!err && virTypedParamsAddULLong(>params,
+>nparams,
+maxparams,
+"cpu.time",
+cpu_time) < 0)
+return -1;
 
-err = virCgroupGetCpuacctStat(priv->cgroup, _time, _time);
-if (!err && virTypedParamsAddULLong(>params,
->nparams,
-maxparams,
-"cpu.user",
-user_time) < 0)
-return -1;
-if (!err && virTypedParamsAddULLong(>params,
->nparams,
-maxparams,
-"cpu.system",
-sys_time) < 0)
-return -1;
+err = virCgroupGetCpuacctStat(priv->cgroup, _time, _time);
+if (!err && virTypedParamsAddULLong(>params,
+>nparams,
+maxparams,
+"cpu.user",
+user_time) < 0)
+return -1;
+if (!err && virTypedParamsAddULLong(>params,
+>nparams,
+maxparams,
+"cpu.system",
+sys_time) < 0)
+return -1;
+}
 
 return 0;
 }
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 18/19] qemu: Report cache occupancy (CMT) with domstats

2018-10-09 Thread Wang Huaqiang
Adding the interface in qemu to report CMT statistic information
through command 'virsh domstats --cpu-total'.

Below is a typical output:

 # virsh domstats 1 --cpu-total
 Domain: 'ubuntu16.04-base'
   ...
   cpu.cache.monitor.count=2
   cpu.cache.0.name=vcpus_1
   cpu.cache.0.vcpus=1
   cpu.cache.0.bank.count=2
   cpu.cache.0.bank.0.id=0
   cpu.cache.0.bank.0.bytes=4505600
   cpu.cache.0.bank.1.id=1
   cpu.cache.0.bank.1.bytes=5586944
   cpu.cache.1.name=vcpus_4-6
   cpu.cache.1.vcpus=4,5,6
   cpu.cache.1.bank.count=2
   cpu.cache.1.bank.0.id=0
   cpu.cache.1.bank.0.bytes=17571840
   cpu.cache.1.bank.1.id=1
   cpu.cache.1.bank.1.bytes=29106176

Signed-off-by: Wang Huaqiang 
---
 src/libvirt-domain.c   |   9 ++
 src/qemu/qemu_driver.c | 230 +
 2 files changed, 239 insertions(+)

diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
index 7690339..218c93a 100644
--- a/src/libvirt-domain.c
+++ b/src/libvirt-domain.c
@@ -11345,6 +11345,15 @@ virConnectGetDomainCapabilities(virConnectPtr conn,
  * "cpu.user" - user cpu time spent in nanoseconds as unsigned long long.
  * "cpu.system" - system cpu time spent in nanoseconds as unsigned long
  *long.
+ * "cpu.cache.monitor.count" - tocal cache monitoring groups
+ * "cpu.cache.M.name" - name of cache monitoring group 'M'
+ * "cpu.cache.M.vcpus" - vcpus for cache monitoring group 'M'
+ * "cpu.cache.M.bank.count" - total bank number of cache monitoring
+ *group 'M'
+ * "cpu.cache.M.bank.N.id" - OS assigned cache bank id for cache 'N' in
+ *   cache monitoring group 'M'
+ * "cpu.cache.M.bank.N.bytes" - monitor's cache occupancy of cache bank
+ *  'N' in cache monitoring group 'M'
  *
  * VIR_DOMAIN_STATS_BALLOON:
  * Return memory balloon device information.
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 12a5f8f..7510a62 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -102,6 +102,7 @@
 #include "virnuma.h"
 #include "dirname.h"
 #include "netdev_bandwidth_conf.h"
+#include "c-ctype.h"
 
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
@@ -19698,6 +19699,231 @@ typedef enum {
 #define HAVE_JOB(flags) ((flags) & QEMU_DOMAIN_STATS_HAVE_JOB)
 
 
+/* In terms of the output of virBitmapFormat, both '1-3' and '1,3' are valid
+ * outputs and represent different vcpu set.
+ *
+ * It is not easy to differentiate these two vcpu set formats at first glance.
+ * This function could be used to clear this ambiguity, it substitutes all '-'
+ * with ',' while generates semantically correct vcpu set.
+ * e.g. vcpu set string '1-3' will be replaced by string '1,2,3'. */
+static char *
+qemuDomainVcpuFormatHelper(const char *vcpus)
+{
+size_t i = 0;
+int last = 0;
+int start = 0;
+char * tmp = NULL;
+bool firstnum = true;
+const char *cur = vcpus;
+virBuffer buf = VIR_BUFFER_INITIALIZER;
+char *ret = NULL;
+
+if (virStringIsEmpty(cur))
+return NULL;
+
+while (*cur != '\0') {
+if (!c_isdigit(*cur))
+goto cleanup;
+
+if (virStrToLong_i(cur, , 10, ) < 0)
+goto cleanup;
+if (start < 0)
+goto cleanup;
+
+cur = tmp;
+
+virSkipSpaces();
+
+if (*cur == ',' || *cur == 0) {
+if (!firstnum)
+virBufferAddChar(, ',');
+virBufferAsprintf(, "%d", start);
+firstnum = false;
+} else if (*cur == '-') {
+cur++;
+virSkipSpaces();
+
+if (virStrToLong_i(cur, , 10, ) < 0)
+goto cleanup;
+
+if (last < start)
+goto cleanup;
+cur = tmp;
+
+for (i = start; i <= last; i++) {
+if (!firstnum)
+
+virBufferAddChar(, ',');
+virBufferAsprintf(, "%ld", i);
+firstnum = 0;
+}
+
+virSkipSpaces();
+}
+
+if (*cur == ',') {
+cur++;
+virSkipSpaces();
+} else if (*cur == 0) {
+break;
+} else {
+goto cleanup;
+}
+}
+
+ret = virBufferContentAndReset();
+ cleanup:
+virBufferFreeAndReset();
+return ret;
+}
+
+
+static int
+qemuDomainGetStatsCpuResource(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
+  virDomainObjPtr dom,
+  virDomainStatsRecordPtr record,
+  int *maxparams,
+  unsigned int privflags ATTRIBUTE_UNUSED,
+  virResctrlMonitorType restag)
+{
+char param_name[VIR_TYPED_PARAM_FIELD_LENGTH];
+virDomainResctrlMonDefPtr domresmon = 

[libvirt] [PATCHv5 16/19] conf: Add a 'id' to virDomainResctrlDef

2018-10-09 Thread Wang Huaqiang
Adding element 'id' to virDomainResctrlDef. This 'id' reflects
the attribute 'id' of of element 'cachetune in XML.

virResctrlAlloc.id is a copy of virDomanResctrlDef.id.

Signed-off-by: Wang Huaqiang 
---
 src/conf/domain_conf.c | 20 
 src/conf/domain_conf.h |  1 +
 2 files changed, 9 insertions(+), 12 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 4f4604f..6da9dd4 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -2979,6 +2979,7 @@ virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl)
 virObjectUnref(resctrl->alloc);
 virBitmapFree(resctrl->vcpus);
 VIR_FREE(resctrl->monitors);
+VIR_FREE(resctrl->id);
 VIR_FREE(resctrl);
 }
 
@@ -19138,6 +19139,9 @@ virDomainResctrlAppend(virDomainDefPtr def,
 goto cleanup;
 }
 
+if (VIR_STRDUP(resctrl->id, alloc_id) < 0)
+goto cleanup;
+
 if (virResctrlAllocSetID(resctrl->alloc, alloc_id) < 0)
 goto cleanup;
 
@@ -27320,13 +27324,9 @@ virDomainCachetuneDefFormat(virBufferPtr buf,
 
 virBufferAsprintf(buf, "alloc);
-if (!alloc_id)
-goto cleanup;
+if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE))
+virBufferAsprintf(buf, " id='%s'", resctrl->id);
 
-virBufferAsprintf(buf, " id='%s'", alloc_id);
-}
 virBufferAddLit(buf, ">\n");
 
 virBufferAddBuffer(buf, );
@@ -27383,13 +27383,9 @@ virDomainMemorytuneDefFormat(virBufferPtr buf,
 
 virBufferAsprintf(buf, "alloc);
-if (!alloc_id)
-goto cleanup;
+if (!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE))
+virBufferAsprintf(buf, " id='%s'", resctrl->id);
 
-virBufferAsprintf(buf, " id='%s'", alloc_id);
-}
 virBufferAddLit(buf, ">\n");
 
 virBufferAddBuffer(buf, );
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 60f6464..e190aa2 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -2248,6 +2248,7 @@ typedef struct _virDomainResctrlDef virDomainResctrlDef;
 typedef virDomainResctrlDef *virDomainResctrlDefPtr;
 
 struct _virDomainResctrlDef {
+char *id;
 virBitmapPtr vcpus;
 virResctrlAllocPtr alloc;
 
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 09/19] util: Add more interfaces for resctrl monitor

2018-10-09 Thread Wang Huaqiang
Add interfaces monitor group to support operations such
as add PID, set ID, remove group ... etc.

The interface for getting cache occupancy information
from the monitor is also added.

Signed-off-by: Wang Huaqiang 
---
 src/libvirt_private.syms |   6 ++
 src/util/virresctrl.c| 209 ++-
 src/util/virresctrl.h|  23 ++
 3 files changed, 236 insertions(+), 2 deletions(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index a878083..c93d19f 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2683,7 +2683,13 @@ virResctrlInfoNew;
 virResctrlMonitorAddPID;
 virResctrlMonitorCreate;
 virResctrlMonitorDeterminePath;
+virResctrlMonitorGetCacheLevel;
+virResctrlMonitorGetCacheOccupancy;
+virResctrlMonitorGetID;
 virResctrlMonitorNew;
+virResctrlMonitorRemove;
+virResctrlMonitorSetCacheLevel;
+virResctrlMonitorSetID;
 
 
 # util/virrotatingfile.h
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index b3d20cc..fca1f6f 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -225,11 +225,19 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon)
 }
 
 
+
+/*
+ * virResctrlAlloc and virResctrlMonitor are representing a resource control
+ * group (in XML under cputune/cachetune and consequently a directory under
+ * /sys/fs/resctrl). virResctrlAlloc is the data structure for resource
+ * allocation, while the virResctrlMonitor represents the resource monitoring
+ * part.
+ */
+
 /* virResctrlAlloc */
 
 /*
- * virResctrlAlloc represents one allocation (in XML under cputune/cachetune 
and
- * consequently a directory under /sys/fs/resctrl).  Since it can have multiple
+ * virResctrlAlloc represents one allocation. Since it can have multiple
  * parts of multiple caches allocated it is represented as bunch of nested
  * sparse arrays (by sparse I mean array of pointers so that each might be NULL
  * in case there is no allocation for that particular cache allocation (level,
@@ -347,6 +355,8 @@ struct _virResctrlMonitor {
 /* libvirt-generated path in /sys/fs/resctrl for this particular
  * monitor */
 char *path;
+/* The cache 'level', special for cache monitor */
+unsigned int cache_level;
 };
 
 
@@ -2510,6 +2520,27 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr 
monitor,
 
 
 int
+virResctrlMonitorSetID(virResctrlMonitorPtr monitor,
+   const char *id)
+{
+if (!id) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("Resctrl monitor 'id' cannot be NULL"));
+return -1;
+}
+
+return VIR_STRDUP(monitor->id, id);
+}
+
+
+const char *
+virResctrlMonitorGetID(virResctrlMonitorPtr monitor)
+{
+return monitor->id;
+}
+
+
+int
 virResctrlMonitorCreate(virResctrlAllocPtr alloc,
 virResctrlMonitorPtr monitor,
 const char *machinename)
@@ -2534,3 +2565,177 @@ virResctrlMonitorCreate(virResctrlAllocPtr alloc,
 virResctrlUnlock(lockfd);
 return ret;
 }
+
+
+int
+virResctrlMonitorRemove(virResctrlMonitorPtr monitor)
+{
+int ret = 0;
+
+if (!monitor->path)
+return 0;
+
+VIR_DEBUG("Removing resctrl monitor%s", monitor->path);
+if (rmdir(monitor->path) != 0 && errno != ENOENT) {
+virReportSystemError(errno,
+ _("Unable to remove %s (%d)"),
+ monitor->path, errno);
+ret = -errno;
+VIR_ERROR(_("Unable to remove %s (%d)"), monitor->path, errno);
+}
+
+return ret;
+}
+
+
+int
+virResctrlMonitorSetCacheLevel(virResctrlMonitorPtr monitor,
+   unsigned int level)
+{
+/* Only supports cache level 3 CMT */
+if (level != 3) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("Invalid resctrl monitor cache level"));
+return -1;
+}
+
+monitor->cache_level = level;
+
+return 0;
+}
+
+unsigned int
+virResctrlMonitorGetCacheLevel(virResctrlMonitorPtr monitor)
+{
+return monitor->cache_level;
+}
+
+
+/*
+ * virResctrlMonitorGetStatistic
+ *
+ * @monitor: The monitor that the statistic data will be retrieved from.
+ * @resource: The name for resource name. 'llc_occpancy' for cache resource.
+ * "mbm_totol_bytes" and "mbm_local_bytes" for memory bandwidth resource.
+ * @len: The array length for @ids, and @vals
+ * @ids: The id array for resource statistic information, ids[0]
+ * stores the first node id value, ids[1] stores the second node id value,
+ * ... and so on.
+ * @vals: The resource resource utilization information array. vals[0]
+ * stores the cache or memory bandwidth utilization value for first node,
+ * vals[1] stores the second value ... and so on.
+ *
+ * Get cache or memory bandwidth utilization information from monitor that
+ * specified by @id.
+ *
+ * Returns 0 for success, -1 for error.
+ */
+static int
+virResctrlMonitorGetStatistic(virResctrlMonitorPtr monitor,
+  

[libvirt] [PATCHv5 15/19] qemu: enable resctrl monitor in qemu

2018-10-09 Thread Wang Huaqiang
Add functions for creating, destroying, reconnecting resctrl
monitor in qemu according to the configuration in domain XML.

Signed-off-by: Wang Huaqiang 
---
 src/qemu/qemu_process.c | 49 ++---
 1 file changed, 46 insertions(+), 3 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index e9c7618..a4bbef6 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -2611,10 +2611,22 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver,
 return -1;
 
 for (i = 0; i < vm->def->nresctrls; i++) {
+size_t j = 0;
 if (virResctrlAllocCreate(caps->host.resctrl,
   vm->def->resctrls[i]->alloc,
   priv->machineName) < 0)
 goto cleanup;
+
+for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
+virDomainResctrlMonDefPtr mon = NULL;
+
+mon = vm->def->resctrls[i]->monitors[j];
+if (virResctrlMonitorCreate(vm->def->resctrls[i]->alloc,
+mon->instance,
+priv->machineName) < 0)
+goto cleanup;
+
+}
 }
 
 ret = 0;
@@ -5440,11 +5452,22 @@ qemuProcessSetupVcpu(virDomainObjPtr vm,
 return -1;
 
 for (i = 0; i < vm->def->nresctrls; i++) {
+size_t j = 0;
 virDomainResctrlDefPtr ct = vm->def->resctrls[i];
 
 if (virBitmapIsBitSet(ct->vcpus, vcpuid)) {
 if (virResctrlAllocAddPID(ct->alloc, vcpupid) < 0)
 return -1;
+
+for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
+virDomainResctrlMonDefPtr mon = NULL;
+
+mon = vm->def->resctrls[i]->monitors[j];
+if (virBitmapIsBitSet(mon->vcpus, vcpuid)) {
+if (virResctrlMonitorAddPID(mon->instance, vcpupid) < 0)
+return -1;
+}
+}
 break;
 }
 }
@@ -7207,8 +7230,18 @@ void qemuProcessStop(virQEMUDriverPtr driver,
 /* Remove resctrl allocation after cgroups are cleaned up which makes it
  * kind of safer (although removing the allocation should work even with
  * pids in tasks file */
-for (i = 0; i < vm->def->nresctrls; i++)
+for (i = 0; i < vm->def->nresctrls; i++) {
+size_t j = 0;
+
+for (j = 0; j < vm->def->resctrls[i]->nmonitors; j++) {
+virDomainResctrlMonDefPtr mon = NULL;
+
+mon = vm->def->resctrls[i]->monitors[j];
+virResctrlMonitorRemove(mon->instance);
+}
+
 virResctrlAllocRemove(vm->def->resctrls[i]->alloc);
+}
 
 qemuProcessRemoveDomainStatus(driver, vm);
 
@@ -7222,8 +7255,7 @@ void qemuProcessStop(virQEMUDriverPtr driver,
 virPortAllocatorRelease(graphics->data.vnc.port);
 } else if (graphics->data.vnc.portReserved) {
 virPortAllocatorRelease(graphics->data.vnc.port);
-graphics->data.vnc.portReserved = false;
-}
+graphics->data.vnc.portReserved = false; }
 if (graphics->data.vnc.websocketGenerated) {
 virPortAllocatorRelease(graphics->data.vnc.websocket);
 graphics->data.vnc.websocketGenerated = false;
@@ -7939,9 +7971,20 @@ qemuProcessReconnect(void *opaque)
 goto error;
 
 for (i = 0; i < obj->def->nresctrls; i++) {
+size_t j = 0;
+
 if (virResctrlAllocDeterminePath(obj->def->resctrls[i]->alloc,
  priv->machineName) < 0)
 goto error;
+
+for (j = 0; j < obj->def->resctrls[i]->nmonitors; j++) {
+virDomainResctrlMonDefPtr mon = NULL;
+
+mon = obj->def->resctrls[i]->monitors[j];
+if (virResctrlMonitorDeterminePath(mon->instance,
+   priv->machineName) < 0)
+goto error;
+}
 }
 
 /* update domain state XML with possibly updated state in virDomainObj */
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 14/19] Util: Add function for checking if monitor is running

2018-10-09 Thread Wang Huaqiang
Check monitor status by checking the PIDs are in file 'task'
or not.

Signed-off-by: Wang Huaqiang 
---
 src/libvirt_private.syms |  1 +
 src/util/virresctrl.c| 84 +++-
 src/util/virresctrl.h|  4 +++
 3 files changed, 88 insertions(+), 1 deletion(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 4b22ed4..c90e48a 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2686,6 +2686,7 @@ virResctrlMonitorDeterminePath;
 virResctrlMonitorGetCacheLevel;
 virResctrlMonitorGetCacheOccupancy;
 virResctrlMonitorGetID;
+virResctrlMonitorIsRunning;
 virResctrlMonitorNew;
 virResctrlMonitorRemove;
 virResctrlMonitorSetCacheLevel;
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 41e8d48..67dfbb8 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -364,6 +364,9 @@ struct _virResctrlMonitor {
 char *path;
 /* Boolean flag for default monitor */
 bool default_monitor;
+/* Tracking the tasks' PID associated with this monitor */
+pid_t *pids;
+size_t npids;
 /* The cache 'level', special for cache monitor */
 unsigned int cache_level;
 };
@@ -425,6 +428,7 @@ virResctrlMonitorDispose(void *obj)
 virObjectUnref(monitor->alloc);
 VIR_FREE(monitor->id);
 VIR_FREE(monitor->path);
+VIR_FREE(monitor->pids);
 }
 
 
@@ -2491,7 +2495,13 @@ int
 virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
 pid_t pid)
 {
-return virResctrlAddPID(monitor->path, pid);
+if (virResctrlAddPID(monitor->path, pid) < 0)
+return -1;
+
+if (VIR_APPEND_ELEMENT(monitor->pids, monitor->npids, pid) < 0)
+return -1;
+
+return 0;
 }
 
 
@@ -2762,3 +2772,75 @@ virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor)
 {
 monitor->default_monitor = true;
 }
+
+
+static int
+virResctrlPIDCompare(const void *pida, const void *pidb)
+{
+return *(pid_t*)pida - *(pid_t*)pidb;
+}
+
+
+bool
+virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor)
+{
+char *pidstr = NULL;
+char **spids = NULL;
+size_t nspids = 0;
+pid_t *pids = NULL;
+size_t npids = 0;
+size_t i = 0;
+int rv = -1;
+bool ret = false;
+
+if (!monitor->path)
+return false;
+
+if (monitor->npids == 0)
+return false;
+
+rv = virFileReadValueString(, "%s/tasks", monitor->path);
+if (rv == -2)
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("Task file '%s/tasks' does not exist"),
+   monitor->path);
+if (rv < 0)
+goto cleanup;
+
+/* no PID in task file */
+if (!*pidstr)
+goto cleanup;
+
+spids = virStringSplitCount(pidstr, "\n", 0, );
+if (nspids != monitor->npids)
+return false;
+
+for (i = 0; i < nspids; i++) {
+unsigned int val = 0;
+pid_t pid = 0;
+
+if (virStrToLong_uip(spids[i], NULL, 0, ) < 0)
+goto cleanup;
+
+pid = (pid_t)val;
+
+if (VIR_APPEND_ELEMENT(pids, npids, pid) < 0)
+goto cleanup;
+}
+
+qsort(pids, npids, sizeof(pid_t), virResctrlPIDCompare);
+qsort(monitor->pids, monitor->npids, sizeof(pid_t), virResctrlPIDCompare);
+
+for (i = 0; i < monitor->npids; i++) {
+if (monitor->pids[i] != pids[i])
+goto cleanup;
+}
+
+ret = true;
+ cleanup:
+virStringListFree(spids);
+VIR_FREE(pids);
+VIR_FREE(pidstr);
+
+return ret;
+}
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index 371df8a..c5794cb 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -230,4 +230,8 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr 
monitor,
unsigned int **bankcaches);
 void
 virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor);
+
+bool
+virResctrlMonitorIsRunning(virResctrlMonitorPtr monitor);
+
 #endif /*  __VIR_RESCTRL_H__ */
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 10/19] util: Introduce default monitor

2018-10-09 Thread Wang Huaqiang
In resctrl file system, more than one monitoring groups
could be created within one allocation group, along with
the creation of allocation group, a monitoring group is
created at the same, which monitors the resource
utilization information of whole allocation group.

This patch is introducing the concept of default monitor,
which represents the particular monitoring group that
created along with the creation of allocation group.

Default monitor shares the common 'vcpu' list with the
allocation.

Signed-off-by: Wang Huaqiang 
---
 src/libvirt_private.syms |  1 +
 src/util/virresctrl.c| 23 +++
 src/util/virresctrl.h|  2 ++
 3 files changed, 26 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index c93d19f..4b22ed4 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2689,6 +2689,7 @@ virResctrlMonitorGetID;
 virResctrlMonitorNew;
 virResctrlMonitorRemove;
 virResctrlMonitorSetCacheLevel;
+virResctrlMonitorSetDefault;
 virResctrlMonitorSetID;
 
 
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index fca1f6f..41e8d48 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -340,6 +340,13 @@ struct _virResctrlAlloc {
  * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal
  * features. The monitor will be created under the scope of default allocation
  * if no CAT or MBA supported in the system.
+ *
+ * In resctrl file sytem, more than one monitoring groups could be created
+ * within one allocation group, along with the creation of allocation group,
+ * a monitoring group is created at the same, which monitors the resource
+ * utilization information of whole allocation group.
+ * A virResctrlMonitor with @default_monitor marked as 'true' is representing
+ * the monitoring group created along with the creation of allocation group.
  */
 struct _virResctrlMonitor {
 virObject parent;
@@ -355,6 +362,8 @@ struct _virResctrlMonitor {
 /* libvirt-generated path in /sys/fs/resctrl for this particular
  * monitor */
 char *path;
+/* Boolean flag for default monitor */
+bool default_monitor;
 /* The cache 'level', special for cache monitor */
 unsigned int cache_level;
 };
@@ -2499,6 +2508,13 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr 
monitor,
 return -1;
 }
 
+if (monitor->default_monitor) {
+if (VIR_STRDUP(monitor->path, monitor->alloc->path) < 0)
+return -1;
+
+return 0;
+}
+
 if (monitor->alloc)
 alloc_path = monitor->alloc->path;
 else
@@ -2739,3 +2755,10 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr 
monitor,
 return virResctrlMonitorGetStatistic(monitor, "llc_occupancy",
  nbank, bankids, bankcaches);
 }
+
+
+void
+virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor)
+{
+monitor->default_monitor = true;
+}
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index 6137fee..371df8a 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -228,4 +228,6 @@ virResctrlMonitorGetCacheOccupancy(virResctrlMonitorPtr 
monitor,
size_t *nbank,
unsigned int **bankids,
unsigned int **bankcaches);
+void
+virResctrlMonitorSetDefault(virResctrlMonitorPtr monitor);
 #endif /*  __VIR_RESCTRL_H__ */
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 08/19] util: Add interface for creating monitor group

2018-10-09 Thread Wang Huaqiang
Add interface for creating the resource monitoring group according
to '@virResctrlMonitor->path'.

Signed-off-by: Wang Huaqiang 
---
 src/libvirt_private.syms |  1 +
 src/util/virresctrl.c| 28 
 src/util/virresctrl.h|  6 ++
 3 files changed, 35 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index e175c8b..a878083 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix;
 virResctrlInfoMonFree;
 virResctrlInfoNew;
 virResctrlMonitorAddPID;
+virResctrlMonitorCreate;
 virResctrlMonitorDeterminePath;
 virResctrlMonitorNew;
 
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 8b617a6..b3d20cc 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2475,6 +2475,7 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
 return virResctrlAddPID(monitor->path, pid);
 }
 
+
 int
 virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
const char *machinename)
@@ -2506,3 +2507,30 @@ virResctrlMonitorDeterminePath(virResctrlMonitorPtr 
monitor,
 
 return 0;
 }
+
+
+int
+virResctrlMonitorCreate(virResctrlAllocPtr alloc,
+virResctrlMonitorPtr monitor,
+const char *machinename)
+{
+int lockfd = -1;
+int ret = -1;
+
+if (!monitor)
+return 0;
+
+monitor->alloc = virObjectRef(alloc);
+
+if (virResctrlMonitorDeterminePath(monitor, machinename) < 0)
+return -1;
+
+lockfd = virResctrlLockWrite();
+if (lockfd < 0)
+return -1;
+
+ret = virResctrlCreateGroupPath(monitor->path);
+
+virResctrlUnlock(lockfd);
+return ret;
+}
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index 69b6b1d..1efe394 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -196,7 +196,13 @@ virResctrlMonitorNew(void);
 int
 virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
 pid_t pid);
+
 int
 virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
const char *machinename);
+
+int
+virResctrlMonitorCreate(virResctrlAllocPtr alloc,
+virResctrlMonitorPtr monitor,
+const char *machinename);
 #endif /*  __VIR_RESCTRL_H__ */
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 12/19] conf: Refactor virDomainResctrlAppend

2018-10-09 Thread Wang Huaqiang
Refactor virDomainResctrlAppend to facilitate virDomainResctrlDef
with the capability to hold more element.

Signed-off-by: Wang Huaqiang 
---
 src/conf/domain_conf.c | 64 +++---
 1 file changed, 45 insertions(+), 19 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index e2b4701..9a514a6 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -18920,24 +18920,43 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr 
ctxt,
 }
 
 
+static virDomainResctrlDefPtr
+virDomainResctrlNew(virResctrlAllocPtr alloc,
+virBitmapPtr vcpus)
+{
+virDomainResctrlDefPtr resctrl = NULL;
+
+if (VIR_ALLOC(resctrl) < 0)
+return NULL;
+
+if ((resctrl->vcpus = virBitmapNewCopy(vcpus)) == NULL) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("failed to copy 'vcpus'"));
+goto error;
+}
+
+resctrl->alloc = virObjectRef(alloc);
+
+return resctrl;
+ error:
+virDomainResctrlDefFree(resctrl);
+return NULL;
+}
+
+
 static int
 virDomainResctrlAppend(virDomainDefPtr def,
xmlNodePtr node,
-   virResctrlAllocPtr alloc,
-   virBitmapPtr vcpus,
+   virDomainResctrlDefPtr resctrl,
unsigned int flags)
 {
 char *vcpus_str = NULL;
 char *alloc_id = NULL;
-virDomainResctrlDefPtr tmp_resctrl = NULL;
 int ret = -1;
 
-if (VIR_ALLOC(tmp_resctrl) < 0)
-goto cleanup;
-
 /* We need to format it back because we need to be consistent in the naming
  * even when users specify some "sub-optimal" string there. */
-vcpus_str = virBitmapFormat(vcpus);
+vcpus_str = virBitmapFormat(resctrl->vcpus);
 if (!vcpus_str)
 goto cleanup;
 
@@ -18954,18 +18973,14 @@ virDomainResctrlAppend(virDomainDefPtr def,
 goto cleanup;
 }
 
-if (virResctrlAllocSetID(alloc, alloc_id) < 0)
+if (virResctrlAllocSetID(resctrl->alloc, alloc_id) < 0)
 goto cleanup;
 
-tmp_resctrl->vcpus = vcpus;
-tmp_resctrl->alloc = alloc;
-
-if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, tmp_resctrl) < 0)
+if (VIR_APPEND_ELEMENT(def->resctrls, def->nresctrls, resctrl) < 0)
 goto cleanup;
 
 ret = 0;
  cleanup:
-virDomainResctrlDefFree(tmp_resctrl);
 VIR_FREE(alloc_id);
 VIR_FREE(vcpus_str);
 return ret;
@@ -18982,6 +18997,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
 xmlNodePtr *nodes = NULL;
 virBitmapPtr vcpus = NULL;
 virResctrlAllocPtr alloc = NULL;
+virDomainResctrlDefPtr resctrl = NULL;
 ssize_t i = 0;
 int n;
 int ret = -1;
@@ -19030,15 +19046,18 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
 goto cleanup;
 }
 
-if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
+resctrl = virDomainResctrlNew(alloc, vcpus);
+if (!resctrl)
 goto cleanup;
 
-vcpus = NULL;
-alloc = NULL;
+if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
+goto cleanup;
 
+resctrl = NULL;
 ret = 0;
  cleanup:
 ctxt->node = oldnode;
+virDomainResctrlDefFree(resctrl);
 virObjectUnref(alloc);
 virBitmapFree(vcpus);
 VIR_FREE(nodes);
@@ -19196,6 +19215,8 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
 xmlNodePtr *nodes = NULL;
 virBitmapPtr vcpus = NULL;
 virResctrlAllocPtr alloc = NULL;
+virDomainResctrlDefPtr resctrl = NULL;
+
 ssize_t i = 0;
 int n;
 int ret = -1;
@@ -19240,15 +19261,20 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
  * just update the existing alloc information, which is done in above
  * virDomainMemorytuneDefParseMemory */
 if (new_alloc) {
-if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
+resctrl = virDomainResctrlNew(alloc, vcpus);
+if (!resctrl)
 goto cleanup;
-vcpus = NULL;
-alloc = NULL;
+
+if (virDomainResctrlAppend(def, node, resctrl, flags) < 0)
+goto cleanup;
+
+resctrl = NULL;
 }
 
 ret = 0;
  cleanup:
 ctxt->node = oldnode;
+virDomainResctrlDefFree(resctrl);
 virObjectUnref(alloc);
 virBitmapFree(vcpus);
 VIR_FREE(nodes);
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 03/19] util: Refactor code for adding PID to the resource group

2018-10-09 Thread Wang Huaqiang
The code of adding PID to the allocation could be reused, refactor it
for later reusing.

Signed-off-by: Wang Huaqiang 
---
 src/util/virresctrl.c | 26 +-
 1 file changed, 17 insertions(+), 9 deletions(-)

diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 18ee560..41c7e51 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2359,24 +2359,21 @@ virResctrlAllocCreate(virResctrlInfoPtr resctrl,
 }
 
 
-int
-virResctrlAllocAddPID(virResctrlAllocPtr alloc,
-  pid_t pid)
+static int
+virResctrlAddPID(const char *path,
+ pid_t pid)
 {
 char *tasks = NULL;
 char *pidstr = NULL;
 int ret = 0;
 
-if (!alloc)
-return 0;
-
-if (!alloc->path) {
+if (!path) {
 virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-   _("Cannot add pid to non-existing resctrl allocation"));
+   _("Cannot add pid to non-existing resctrl group"));
 return -1;
 }
 
-if (virAsprintf(, "%s/tasks", alloc->path) < 0)
+if (virAsprintf(, "%s/tasks", path) < 0)
 return -1;
 
 if (virAsprintf(, "%lld", (long long int) pid) < 0)
@@ -2398,6 +2395,17 @@ virResctrlAllocAddPID(virResctrlAllocPtr alloc,
 
 
 int
+virResctrlAllocAddPID(virResctrlAllocPtr alloc,
+  pid_t pid)
+{
+if (!alloc)
+return 0;
+
+return virResctrlAddPID(alloc->path, pid);
+}
+
+
+int
 virResctrlAllocRemove(virResctrlAllocPtr alloc)
 {
 int ret = 0;
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 13/19] conf: Add resctrl monitor configuration

2018-10-09 Thread Wang Huaqiang
Introducing  element under  to represent
a cache monitor.

Supports two kind of monitors, which are, monitor under default
allocation or monitor under particular allocation.

Monitor supervises the cache or memory bandwidth usage for
interested vcpu thread set, if the vcpu thread set is belong to some
resctrl allocation, then the monitor will be created under this
allocation, that is, creating a resctrl monitoring group directory under
the directory of '@alloc->path/mon_group'. Otherwise, the monitor
will be created under default allocation.

For default allocation monitor, it will have such kind of XML layout:


  


For other type monitor, the XML layout will be something like:


  
  
  


Signed-off-by: Wang Huaqiang 
---
 docs/formatdomain.html.in  |  26 +++
 docs/schemas/domaincommon.rng  |  10 +
 src/conf/domain_conf.c | 217 -
 src/conf/domain_conf.h |  11 ++
 tests/genericxml2xmlindata/cachetune-cdp.xml   |   3 +
 .../cachetune-colliding-monitor.xml|  30 +++
 tests/genericxml2xmlindata/cachetune-small.xml |   7 +
 tests/genericxml2xmltest.c |   2 +
 8 files changed, 301 insertions(+), 5 deletions(-)
 create mode 100644 tests/genericxml2xmlindata/cachetune-colliding-monitor.xml

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index b1651e3..2fd665c 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -759,6 +759,12 @@
 cachetune vcpus='0-3'
   cache id='0' level='3' type='both' size='3' unit='MiB'/
   cache id='1' level='3' type='both' size='3' unit='MiB'/
+  monitor level='3' vcpus='1'/
+  monitor level='3' vcpus='0-3'/
+/cachetune
+cachetune vcpus='4-5'
+  monitor level='3' vcpus='4'/
+  monitor level='3' vcpus='5'/
 /cachetune
 memorytune vcpus='0-3'
   node id='0' bandwidth='60'/
@@ -978,6 +984,26 @@
   
 
   
+  monitor
+  
+The optional element monitor creates the cache
+monitor(s) for current cache allocation and has the following
+required attributes:
+
+  level
+  
+Host cache level the monitor belongs to.
+  
+  vcpus
+  
+vCPU list the monitor applies to. A monitor's vCPU list
+can only be the member(s) of the vCPU list of associating
+allocation. The default monitor has the same vCPU list as the
+associating allocation. For non-default monitors, there
+are no vCPU overlap permitted.
+  
+
+  
 
   
 
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 5c533d6..7ce49d3 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -981,6 +981,16 @@
 
   
 
+
+  
+
+  
+
+
+  
+
+  
+
   
 
 
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9a514a6..4f4604f 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -2955,13 +2955,30 @@ virDomainLoaderDefFree(virDomainLoaderDefPtr loader)
 
 
 static void
+virDomainResctrlMonDefFree(virDomainResctrlMonDefPtr domresmon)
+{
+if (!domresmon)
+return;
+
+virBitmapFree(domresmon->vcpus);
+virObjectUnref(domresmon->instance);
+}
+
+
+static void
 virDomainResctrlDefFree(virDomainResctrlDefPtr resctrl)
 {
+size_t i = 0;
+
 if (!resctrl)
 return;
 
+for (i = 0; i < resctrl->nmonitors; i++)
+virDomainResctrlMonDefFree(resctrl->monitors[i]);
+
 virObjectUnref(resctrl->alloc);
 virBitmapFree(resctrl->vcpus);
+VIR_FREE(resctrl->monitors);
 VIR_FREE(resctrl);
 }
 
@@ -18919,6 +18936,154 @@ virDomainCachetuneDefParseCache(xmlXPathContextPtr 
ctxt,
 return ret;
 }
 
+/* Checking if the monitor's vcpus is conflicted with existing allocation
+ * and monitors.
+ *
+ * Returns 1 if @vcpus equals to @resctrl->vcpus, means it is a default
+ * monitor. Returns - 1 if a conflict found. Returns 0 if no conflict and
+ * @vcpus is not equal to @resctrl->vcpus.
+ * */
+static int
+virDomainResctrlMonValidateVcpu(virDomainResctrlDefPtr resctrl,
+virBitmapPtr vcpus)
+{
+size_t i = 0;
+int vcpu = -1;
+
+if (virBitmapIsAllClear(vcpus)) {
+virReportError(VIR_ERR_INVALID_ARG, "%s",
+   _("vcpus is empty"));
+return -1;
+}
+
+while ((vcpu = virBitmapNextSetBit(vcpus, vcpu)) >= 0) {
+if (!virBitmapIsBitSet(resctrl->vcpus, vcpu)) {
+

[libvirt] [PATCHv5 06/19] util: Add monitor interface to determine path

2018-10-09 Thread Wang Huaqiang
Add interface for resctrl monitor to determine the path.

Signed-off-by: Wang Huaqiang 
---
 src/libvirt_private.syms |  1 +
 src/util/virresctrl.c| 32 
 src/util/virresctrl.h|  3 +++
 3 files changed, 36 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 4a52a86..e175c8b 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2681,6 +2681,7 @@ virResctrlInfoGetMonitorPrefix;
 virResctrlInfoMonFree;
 virResctrlInfoNew;
 virResctrlMonitorAddPID;
+virResctrlMonitorDeterminePath;
 virResctrlMonitorNew;
 
 
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 03001cc..1a5578e 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2465,3 +2465,35 @@ virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
 {
 return virResctrlAddPID(monitor->path, pid);
 }
+
+int
+virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
+   const char *machinename)
+{
+char *alloc_path = NULL;
+char *parentpath = NULL;
+
+if (!monitor) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("Invalid resctrl monitor"));
+return -1;
+}
+
+if (monitor->alloc)
+alloc_path = monitor->alloc->path;
+else
+alloc_path = (char *)SYSFS_RESCTRL_PATH;
+
+if (virAsprintf(, "%s/mon_groups", alloc_path) < 0)
+return -1;
+
+monitor->path = virResctrlDeterminePath(parentpath, machinename,
+monitor->id);
+
+VIR_FREE(parentpath);
+
+if (!monitor->path)
+return -1;
+
+return 0;
+}
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index cb9bfae..69b6b1d 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -196,4 +196,7 @@ virResctrlMonitorNew(void);
 int
 virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
 pid_t pid);
+int
+virResctrlMonitorDeterminePath(virResctrlMonitorPtr monitor,
+   const char *machinename);
 #endif /*  __VIR_RESCTRL_H__ */
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 02/19] util: Introduce resctrl monitor for CMT

2018-10-09 Thread Wang Huaqiang
Cache Monitoring Technology (aka CMT) provides the capability
to report cache utilization information of system task.

This patch introduces the concept of resctrl monitor through
data structure virResctrlMonitor.

Signed-off-by: Wang Huaqiang 
---
 src/libvirt_private.syms |  1 +
 src/util/virresctrl.c| 56 
 src/util/virresctrl.h|  7 ++
 3 files changed, 64 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 335210c..d2573c5 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2680,6 +2680,7 @@ virResctrlInfoGetCache;
 virResctrlInfoGetMonitorPrefix;
 virResctrlInfoMonFree;
 virResctrlInfoNew;
+virResctrlMonitorNew;
 
 
 # util/virrotatingfile.h
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 697424c..18ee560 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -105,6 +105,7 @@ typedef virResctrlAllocMemBW *virResctrlAllocMemBWPtr;
 /* Class definitions and initializations */
 static virClassPtr virResctrlInfoClass;
 static virClassPtr virResctrlAllocClass;
+static virClassPtr virResctrlMonitorClass;
 
 
 /* virResctrlInfo */
@@ -319,6 +320,35 @@ struct _virResctrlAlloc {
 char *path;
 };
 
+/* virResctrlMonitor */
+
+/*
+ * virResctrlMonitor is the data structure for resctrl monitor. Resctrl
+ * monitor represents a resctrl monitoring group, which can be used to
+ * monitor the resource utilization information for either cache or
+ * memory bandwidth.
+ *
+ * From hardware perspective, cache monitoring technology (CMT), memory
+ * bandwidth technology (MBM), as well as the CAT and MBA, are all orthogonal
+ * features. The monitor will be created under the scope of default allocation
+ * if no CAT or MBA supported in the system.
+ */
+struct _virResctrlMonitor {
+virObject parent;
+
+/* In resctrl, each monitor is associated with one specific allocation,
+ * either the allocation under /sys/fs/resctrl or the default allocation.
+ * If this pointer is NULL, then the monitor will be associated with
+ * default allocation, otherwise, this pointer points to the allocation
+ * this monitor associated with. */
+virResctrlAllocPtr alloc;
+/* The monitor identifier */
+char *id;
+/* libvirt-generated path in /sys/fs/resctrl for this particular
+ * monitor */
+char *path;
+};
+
 
 static void
 virResctrlAllocDispose(void *obj)
@@ -368,6 +398,17 @@ virResctrlAllocDispose(void *obj)
 }
 
 
+static void
+virResctrlMonitorDispose(void *obj)
+{
+virResctrlMonitorPtr monitor = obj;
+
+virObjectUnref(monitor->alloc);
+VIR_FREE(monitor->id);
+VIR_FREE(monitor->path);
+}
+
+
 /* Global initialization for classes */
 static int
 virResctrlOnceInit(void)
@@ -378,6 +419,9 @@ virResctrlOnceInit(void)
 if (!VIR_CLASS_NEW(virResctrlAlloc, virClassForObject()))
 return -1;
 
+if (!VIR_CLASS_NEW(virResctrlMonitor, virClassForObject()))
+return -1;
+
 return 0;
 }
 
@@ -2372,3 +2416,15 @@ virResctrlAllocRemove(virResctrlAllocPtr alloc)
 
 return ret;
 }
+
+
+/* virResctrlMonitor-related definitions */
+
+virResctrlMonitorPtr
+virResctrlMonitorNew(void)
+{
+if (virResctrlInitialize() < 0)
+return NULL;
+
+return virObjectNew(virResctrlMonitorClass);
+}
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index 10505e9..f59a9aa 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -185,4 +185,11 @@ int
 virResctrlInfoGetMonitorPrefix(virResctrlInfoPtr resctrl,
const char *prefix,
virResctrlInfoMonPtr *monitor);
+
+/* Monitor-related things */
+typedef struct _virResctrlMonitor virResctrlMonitor;
+typedef virResctrlMonitor *virResctrlMonitorPtr;
+
+virResctrlMonitorPtr
+virResctrlMonitorNew(void);
 #endif /*  __VIR_RESCTRL_H__ */
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 05/19] util: Refactor code for determining allocation path

2018-10-09 Thread Wang Huaqiang
The code for determining resctrl allocation path could be reused
for monitor. Refactor it for reusing.

Signed-off-by: Wang Huaqiang 
---
 src/util/virresctrl.c | 33 +++--
 1 file changed, 27 insertions(+), 6 deletions(-)

diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index c9a79f7..03001cc 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2267,6 +2267,26 @@ virResctrlAllocAssign(virResctrlInfoPtr resctrl,
 }
 
 
+static char *
+virResctrlDeterminePath(const char *pathparent,
+const char *prefix,
+const char *id)
+{
+char *path = NULL;
+
+if (!id) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("Resctrl resource ID must be set before creation"));
+return NULL;
+}
+
+if (virAsprintf(, "%s/%s-%s", pathparent, prefix, id) < 0)
+return NULL;
+
+return path;
+}
+
+
 int
 virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
  const char *machinename)
@@ -2274,15 +2294,16 @@ virResctrlAllocDeterminePath(virResctrlAllocPtr alloc,
 if (!alloc)
 return 0;
 
-if (!alloc->id) {
-virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-   _("Resctrl Allocation ID must be set before creation"));
+if (alloc->path) {
+virReportError(VIR_ERR_INVALID_ARG, "%s",
+   _("Resctrl group path is expected to be NULL"));
 return -1;
 }
 
-if (!alloc->path &&
-virAsprintf(>path, "%s/%s-%s",
-SYSFS_RESCTRL_PATH, machinename, alloc->id) < 0)
+alloc->path = virResctrlDeterminePath(SYSFS_RESCTRL_PATH,
+  machinename,
+  alloc->id);
+if (!alloc->path)
 return -1;
 
 return 0;
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 04/19] util: Add interface for adding PID to monitor

2018-10-09 Thread Wang Huaqiang
Add interface for adding task PID to monitor.

Signed-off-by: Wang Huaqiang 
---
 src/libvirt_private.syms | 1 +
 src/util/virresctrl.c| 8 
 src/util/virresctrl.h| 4 
 3 files changed, 13 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index d2573c5..4a52a86 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -2680,6 +2680,7 @@ virResctrlInfoGetCache;
 virResctrlInfoGetMonitorPrefix;
 virResctrlInfoMonFree;
 virResctrlInfoNew;
+virResctrlMonitorAddPID;
 virResctrlMonitorNew;
 
 
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index 41c7e51..c9a79f7 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -2436,3 +2436,11 @@ virResctrlMonitorNew(void)
 
 return virObjectNew(virResctrlMonitorClass);
 }
+
+
+int
+virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
+pid_t pid)
+{
+return virResctrlAddPID(monitor->path, pid);
+}
diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
index f59a9aa..cb9bfae 100644
--- a/src/util/virresctrl.h
+++ b/src/util/virresctrl.h
@@ -192,4 +192,8 @@ typedef virResctrlMonitor *virResctrlMonitorPtr;
 
 virResctrlMonitorPtr
 virResctrlMonitorNew(void);
+
+int
+virResctrlMonitorAddPID(virResctrlMonitorPtr monitor,
+pid_t pid);
 #endif /*  __VIR_RESCTRL_H__ */
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 11/19] conf: Refactor code for matching existing resctrls

2018-10-09 Thread Wang Huaqiang
Refactoring the code of matching the new resctrl with
existing resctrl groups. Add the virObjectRef action
into function virDomainResctrlVcpuMatch.

Signed-off-by: Wang Huaqiang 
---
 src/conf/domain_conf.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index b77680e..e2b4701 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -18833,7 +18833,7 @@ virDomainResctrlVcpuMatch(virDomainDefPtr def,
  * Just updating memory allocation information of that group
  */
 if (virBitmapEqual(def->resctrls[i]->vcpus, vcpus)) {
-*alloc = def->resctrls[i]->alloc;
+*alloc = virObjectRef(def->resctrls[i]->alloc);
 break;
 }
 if (virBitmapOverlaps(def->resctrls[i]->vcpus, vcpus)) {
@@ -19225,8 +19225,6 @@ virDomainMemorytuneDefParse(virDomainDefPtr def,
 if (!alloc)
 goto cleanup;
 new_alloc = true;
-} else {
-alloc = virObjectRef(alloc);
 }
 
 for (i = 0; i < n; i++) {
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCHv5 00/19] Introduce x86 Cache Monitoring Technology (CMT)

2018-10-09 Thread Wang Huaqiang
This series of patches and the series already been merged introduce
the x86 Cache Monitoring Technology (CMT) to libvirt by interacting
with kernel resource control (resctrl) interface. CMT is one of the
Intel(R) x86 CPU feature which belongs to the Resource Director
Technology (RDT). CMT reports the occupancy of the last level cache,
which is shared by all CPU cores.

In the v1 series, an original and complete feature for CMT was introduced
The v2 and v3 patches address the feature for the host capability of CMT.
v4 is addressing the feature for monitoring VM vcpu thread set cache
occupancy and reporting it through a virsh command.

We have serval discussion about the enabling of CMT, please refer to
following links for the RFCs.
RFCv3
https://www.redhat.com/archives/libvir-list/2018-August/msg01213.html
RFCv2
https://www.redhat.com/archives/libvir-list/2018-July/msg00409.html
https://www.redhat.com/archives/libvir-list/2018-July/msg01241.html
RFCv1
https://www.redhat.com/archives/libvir-list/2018-June/msg00674.html

And the merged commits are list as below, for host capability of CMT.
6af8417415508c31f8ce71234b573b4999f35980
8f6887998bf63594ae26e3db18d4d5896c5f2cb4
58fcee6f3a2b7e89c21c1fb4ec21429c31a0c5b8
12093f1feaf8f5023dcd9d65dff111022842183d
a5d293c18831dcf69ec6195798387fbb70c9f461


1. About reason why CMT is necessary in libvirt?
The perf events of 'CMT, MBML, MBMT' have been phased out since Linux
kernel commit c39a0e2c8850f08249383f2425dbd8dbe4baad69, in libvirt
the perf based cmt,mbm will not work with the latest linux kernel. These
patches add CMT feature to libvirt through kernel resctrlfs interface.

2 Create cache monitoring group (cache monitor).

The main interface for creating monitoring group is through XML file. The
proposed configuration is like:

  

  
  
+ 


+ 

  

In above XML, created 2 cache resctrl allocation groups and 2 resctrl
monitoring groups.
The changes of cache monitor will be effective in next booting of VM.

2 Show CMT result through command 'domstats'

Adding the interface in qemu to report this information for resource
monitor group through command 'virsh domstats --cpu-total'.
Below is a typical output:

 # virsh domstats 1 --cpu-total
 Domain: 'ubuntu16.04-base'
 ...
   cpu.cache.monitor.count=2
   cpu.cache.0.name=vcpus_1
   cpu.cache.0.vcpus=1
   cpu.cache.0.bank.count=2
   cpu.cache.0.bank.0.id=0
   cpu.cache.0.bank.0.bytes=4505600
   cpu.cache.0.bank.1.id=1
   cpu.cache.0.bank.1.bytes=5586944
   cpu.cache.1.name=vcpus_4-6
   cpu.cache.1.vcpus=4,5,6
   cpu.cache.1.bank.count=2
   cpu.cache.1.bank.0.id=0
   cpu.cache.1.bank.0.bytes=17571840
   cpu.cache.1.bank.1.id=1
   cpu.cache.1.bank.1.bytes=29106176


Changes in v5:
- qemu: Setting up vcpu and adding pids to resctrl monitor groups during
re-connection.
- Add the document for domain configuration related to resctrl monitor.

Changes in v4:
v4 is addressing the feature for monitoring VM vcpu
thread set cache occupancy and reporting it through a
virsh command.
- Introduced resctrl default allocation
- Introduced resctrl monitor and default monitor

Changes in v3:
- Addressed John Ferlan's review.
- Typo fixed.
- Removed VIR_ENUM_DECL(virMonitor);

Changes in v2:
- Introduced MBM capability.
- Capability layout changed
* Moved  from cahe  to 
* Renamed  to 
- Document for 'reuseThreshold' changed.
- Introduced API virResctrlInfoGetMonitorPrefix
- Added more tests, covering standalone CMT, fake new
  feature.
- Creating CMT resource control group will be
  subsequent job.


Wang Huaqiang (19):
  docs: Refactor schemas to support default allocation
  util: Introduce resctrl monitor for CMT
  util: Refactor code for adding PID to the resource group
  util: Add interface for adding PID to monitor
  util: Refactor code for determining allocation path
  util: Add monitor interface to determine path
  util: Refactor code for creating resctrl group
  util: Add interface for creating monitor group
  util: Add more interfaces for resctrl monitor
  util: Introduce default monitor
  conf: Refactor code for matching existing resctrls
  conf: Refactor virDomainResctrlAppend
  conf: Add resctrl monitor configuration
  Util: Add function for checking if monitor is running
  qemu: enable resctrl monitor in qemu
  conf: Add a 'id' to virDomainResctrlDef
  qemu: refactor qemuDomainGetStatsCpu
  qemu: Report cache occupancy (CMT) with domstats
  qemu: Setting up vcpu and adding pids to resctrl monitor groups during
reconnection

 docs/formatdomain.html.in  |  30 +-
 docs/schemas/domaincommon.rng  |  14 +-
 src/conf/domain_conf.c | 327 ++--
 src/conf/domain_conf.h |  12 +
 src/libvirt-domain.c   |   9 +
 src/libvirt_private.syms 

[libvirt] [PATCHv5 01/19] docs: Refactor schemas to support default allocation

2018-10-09 Thread Wang Huaqiang
The resctrl default allocation is introduced in this patch,
which refers to the root directory (/sys/fs/resctrl) and
immediately be created after mounting, owns all the tasks
and cpus in the system and can make full use of all resources.

It does not intentionally allocate any dedicated amount
of resource, either cache or memory bandwidth, for default
allocation.

If a system task has no resource control applied but you
want to know task's cache or memroy bandwidth utilization
information, the default allocation is meaningful. We create
resctrl monitor under the default allocation for such kind of
task.

Refactoring schemas docs and APIs to create a default
cache allocation by allowing the appearance of an
 with no  element.

Signed-off-by: Wang Huaqiang 
---
 docs/formatdomain.html.in |  4 ++--
 docs/schemas/domaincommon.rng |  4 ++--
 src/conf/domain_conf.c| 32 +++-
 src/util/virresctrl.c | 27 +++
 4 files changed, 50 insertions(+), 17 deletions(-)

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8189959..b1651e3 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -943,8 +943,8 @@
 
   cache
   
-This element controls the allocation of CPU cache and has the
-following attributes:
+This optional element controls the allocation of CPU cache and has
+the following attributes:
 
   level
   
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 099a949..5c533d6 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -956,7 +956,7 @@
 
   
 
-
+
   
 
   
@@ -980,7 +980,7 @@
   
 
   
-
+
   
 
 
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9911d56..b77680e 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -19002,22 +19002,27 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
 goto cleanup;
 }
 
-if (virDomainResctrlVcpuMatch(def, vcpus, ) < 0)
-goto cleanup;
-
-if (!alloc) {
-alloc = virResctrlAllocNew();
-if (!alloc)
+/* If 'n' equals 0, then no  element found in ,
+ * this means it is a default alloction. For default allocation,
+ * @SetvirDomainResctrlDefPtr->alloc is set to NULL */
+if (n != 0) {
+if (virDomainResctrlVcpuMatch(def, vcpus, ) < 0)
 goto cleanup;
-} else {
-virReportError(VIR_ERR_XML_ERROR, "%s",
-   _("Identical vcpus in cachetunes found"));
-goto cleanup;
-}
 
-for (i = 0; i < n; i++) {
-if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
+if (!alloc) {
+alloc = virResctrlAllocNew();
+if (!alloc)
+goto cleanup;
+} else {
+virReportError(VIR_ERR_XML_ERROR, "%s",
+   _("Identical vcpus in cachetunes found"));
 goto cleanup;
+}
+
+for (i = 0; i < n; i++) {
+if (virDomainCachetuneDefParseCache(ctxt, nodes[i], alloc) < 0)
+goto cleanup;
+}
 }
 
 if (virResctrlAllocIsEmpty(alloc)) {
@@ -19027,6 +19032,7 @@ virDomainCachetuneDefParse(virDomainDefPtr def,
 
 if (virDomainResctrlAppend(def, node, alloc, vcpus, flags) < 0)
 goto cleanup;
+
 vcpus = NULL;
 alloc = NULL;
 
diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
index df5b512..697424c 100644
--- a/src/util/virresctrl.c
+++ b/src/util/virresctrl.c
@@ -234,6 +234,10 @@ virResctrlInfoMonFree(virResctrlInfoMonPtr mon)
  * in case there is no allocation for that particular cache allocation (level,
  * cache, ...) or memory allocation for particular node).
  *
+ * Resctrl file system root directory, /sys/fs/sysctrl/, is called the default
+ * allocation, which is created, immediately after mounting, owns all the
+ * tasks and cpus in the system and can make full use of all resources.
+ *
  * =Cache allocation technology (CAT)=
  *
  * Since one allocation can be made for caches on different levels, the first
@@ -1165,6 +1169,9 @@ virResctrlAllocSetCacheSize(virResctrlAllocPtr alloc,
 unsigned int cache,
 unsigned long long size)
 {
+if (!alloc)
+return 0;
+
 if (virResctrlAllocCheckCollision(alloc, level, type, cache)) {
 virReportError(VIR_ERR_XML_ERROR,
_("Colliding cache allocations for cache "
@@ -1235,6 +1242,9 @@ virResctrlAllocSetMemoryBandwidth(virResctrlAllocPtr 
alloc,
 {
 virResctrlAllocMemBWPtr mem_bw = alloc->mem_bw;
 
+if (!alloc)
+return 0;
+
 if 

Re: [libvirt] [PATCHv2 1/4] util: Introduce monitor capability interface

2018-10-09 Thread Wang, Huaqiang
Hi Jano,

Recognized there was already a commit for a  fixing: 
7bff646d71aa90ed8727ef99be29d6d2ab5d8f06.
And now I got your idea.

Thanks
Huaqiang

> -Original Message-
> From: Wang, Huaqiang
> Sent: Tuesday, October 9, 2018 5:55 PM
> To: 'Ján Tomko' ; John Ferlan 
> Cc: libvir-list@redhat.com; Feng, Shaohe ; Niu, Bing
> ; Ding, Jian-feng ; Zang, Rui
> 
> Subject: RE: [libvirt] [PATCHv2 1/4] util: Introduce monitor capability 
> interface
> 
> 
> 
> > -Original Message-
> > From: Ján Tomko [mailto:jto...@redhat.com]
> > Sent: Friday, October 5, 2018 10:42 PM
> > To: John Ferlan 
> > Cc: Wang, Huaqiang ; libvir-list@redhat.com;
> > Feng, Shaohe ; Niu, Bing ;
> > Ding, Jian- feng ; Zang, Rui
> > 
> > Subject: Re: [libvirt] [PATCHv2 1/4] util: Introduce monitor
> > capability interface
> >
> > On Tue, Sep 18, 2018 at 03:38:44PM -0400, John Ferlan wrote:
> > >On 09/14/2018 09:30 PM, Wang Huaqiang wrote:
> > >> This patch introduces the resource monitor and creates the
> > >> interface for getting host capability of resource monitor from the
> > >> system resource control file system.
> > >>
> > >> The resource monitor take the role of RDT monitoring group, could
> > >> be
> > >
> > >*takes...
> > >
> > >s/, could/ and could/
> > >
> > >> used to monitor the resource consumption information, such as the
> > >> last level cache occupancy and the utilization of memory bandwidth.
> > >>
> > >> Signed-off-by: Wang Huaqiang 
> > >> ---
> > >>  src/util/virresctrl.c | 124
> > >> ++
> > >>  1 file changed, 124 insertions(+)
> > >>
> >
> > [...]
> >
> > >> +
> > >> +rv = virFileReadValueUint(_monitor->max_monitor,
> > >> +  SYSFS_RESCTRL_PATH 
> > >> "/info/L3_MON/num_rmids");
> > >> +if (rv == -2) {
> > >> +/* The file doesn't exist, so it's unusable for us, probably 
> > >> resource
> > >> + * monitor unsupported */
> > >> +VIR_INFO("The path '" SYSFS_RESCTRL_PATH
> > "/info/L3_MON/num_rmids' "
> > >> + "does not exist");
> > >
> > >Add virResetLastError()
> > >
> > >[avoids having this error in Last and something else failing and
> > >spewing the error]
> > >
> >
> > The return value of -2 means no error was set, so there is nothing to do 
> > here.
> 
> A return value of -2 means no error, rather than executing remaining part of
> function, it requires to return to the caller without reporting any error.
> 
> Here, a return value of -2 means "/info/L3_MON/num_rmids" is not exists, this
> could happen if CMT is not supported by host. This is a valid scenario and 
> does
> not mean an error, and this function should not report any error to its 
> caller, and
> the caller, which is virResctrlGetInfo, will continue to run its remaining
> statements normally.
> 
> >
> > Also, virResetLastError is meant to be used before starting an API.
> > It only resets the thread-local error object (which can only contain
> > one error), it cannot possibly unlog an error that was logged earlier.
> > In that case, creating a Quiet version of the function is the proper 
> > solution.
> 
> Do you mean the message reported by 'VIR_INFO' should be removed and also
> not adding the 'virResetLastError' line?
> 
> It might be more consistent if we keep the 'VIR_INFO' lines, because the 
> similar
> message has been emitted in checking the memory bandwidth information and
> cache allocation information. But if you insist that this message should not 
> be
> shown to user or developer, I could accept that and make change to it.
> 
> @@ -704,12 +704,7 @@ virResctrlGetMonitorInfo(virResctrlInfoPtr resctrl)
>  rv = virFileReadValueUint(_monitor->max_monitor,
>SYSFS_RESCTRL_PATH "/info/L3_MON/num_rmids");
>  if (rv == -2) {
> /* The file doesn't exist, so it's unusable for us, probably resource
>  * monitor unsupported */
> VIR_INFO("The file '" SYSFS_RESCTRL_PATH "/info/L3_MON/num_rmids' "
>  "does not exist");
>  ret = 0;
> -virResetLastError();
>  goto cleanup;
>  } else if (rv < 0) {
>  /* Other failures are fatal, so just quit */
> 
> >
> > Jano
> >
> > >> +ret = 0;
> > >> +goto cleanup;

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCHv2 1/4] util: Introduce monitor capability interface

2018-10-09 Thread Wang, Huaqiang



> -Original Message-
> From: Ján Tomko [mailto:jto...@redhat.com]
> Sent: Friday, October 5, 2018 10:42 PM
> To: John Ferlan 
> Cc: Wang, Huaqiang ; libvir-list@redhat.com; Feng,
> Shaohe ; Niu, Bing ; Ding, Jian-
> feng ; Zang, Rui 
> Subject: Re: [libvirt] [PATCHv2 1/4] util: Introduce monitor capability 
> interface
> 
> On Tue, Sep 18, 2018 at 03:38:44PM -0400, John Ferlan wrote:
> >On 09/14/2018 09:30 PM, Wang Huaqiang wrote:
> >> This patch introduces the resource monitor and creates the interface
> >> for getting host capability of resource monitor from the system
> >> resource control file system.
> >>
> >> The resource monitor take the role of RDT monitoring group, could be
> >
> >*takes...
> >
> >s/, could/ and could/
> >
> >> used to monitor the resource consumption information, such as the
> >> last level cache occupancy and the utilization of memory bandwidth.
> >>
> >> Signed-off-by: Wang Huaqiang 
> >> ---
> >>  src/util/virresctrl.c | 124
> >> ++
> >>  1 file changed, 124 insertions(+)
> >>
> 
> [...]
> 
> >> +
> >> +rv = virFileReadValueUint(_monitor->max_monitor,
> >> +  SYSFS_RESCTRL_PATH 
> >> "/info/L3_MON/num_rmids");
> >> +if (rv == -2) {
> >> +/* The file doesn't exist, so it's unusable for us, probably 
> >> resource
> >> + * monitor unsupported */
> >> +VIR_INFO("The path '" SYSFS_RESCTRL_PATH
> "/info/L3_MON/num_rmids' "
> >> + "does not exist");
> >
> >Add virResetLastError()
> >
> >[avoids having this error in Last and something else failing and
> >spewing the error]
> >
> 
> The return value of -2 means no error was set, so there is nothing to do here.

A return value of -2 means no error, rather than executing remaining part of 
function,
it requires to return to the caller without reporting any error.

Here, a return value of -2 means "/info/L3_MON/num_rmids" is not exists, this
could happen if CMT is not supported by host. This is a valid scenario and does
not mean an error, and this function should not report any error to its caller, 
and
the caller, which is virResctrlGetInfo, will continue to run its remaining 
statements
normally.

> 
> Also, virResetLastError is meant to be used before starting an API.
> It only resets the thread-local error object (which can only contain one 
> error), it
> cannot possibly unlog an error that was logged earlier.
> In that case, creating a Quiet version of the function is the proper solution.

Do you mean the message reported by 'VIR_INFO' should be removed and also not
adding the 'virResetLastError' line?

It might be more consistent if we keep the 'VIR_INFO' lines, because the similar
message has been emitted in checking the memory bandwidth information and
cache allocation information. But if you insist that this message should not be
shown to user or developer, I could accept that and make change to it.

@@ -704,12 +704,7 @@ virResctrlGetMonitorInfo(virResctrlInfoPtr resctrl)
 rv = virFileReadValueUint(_monitor->max_monitor,
   SYSFS_RESCTRL_PATH "/info/L3_MON/num_rmids");
 if (rv == -2) {
/* The file doesn't exist, so it's unusable for us, probably resource
 * monitor unsupported */
VIR_INFO("The file '" SYSFS_RESCTRL_PATH "/info/L3_MON/num_rmids' "
 "does not exist");
 ret = 0;
-virResetLastError();
 goto cleanup;
 } else if (rv < 0) {
 /* Other failures are fatal, so just quit */

> 
> Jano
> 
> >> +ret = 0;
> >> +goto cleanup;

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC] virfile: fix cast-align error

2018-10-09 Thread Bjoern Walk
Marc Hartmayer  [2018-10-08, 08:41PM +0200]:
> Use the correct type in order to fix the following error on s390x:
> 
> In function 'virFileIsSharedFSType':
> ../../src/util/virfile.c:3578:38: error: cast increases required alignment of 
> target type [-Werror=cast-align]
>  virFileIsSharedFixFUSE(path, (long *) _type);
> 
> Signed-off-by: Marc Hartmayer 
> ---
>  src/util/virfile.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/util/virfile.c b/src/util/virfile.c
> index 2a7e87102a25..832d832696d5 100644
> --- a/src/util/virfile.c
> +++ b/src/util/virfile.c
> @@ -3466,7 +3466,7 @@ int virFilePrintf(FILE *fp, const char *msg, ...)
>  
>  static int
>  virFileIsSharedFixFUSE(const char *path,
> -   long *f_type)
> +   unsigned int *f_type)
>  {
>  char *dirpath = NULL;
>  const char **mounts = NULL;
> @@ -3575,7 +3575,7 @@ virFileIsSharedFSType(const char *path,
>  
>  if (sb.f_type == FUSE_SUPER_MAGIC) {
>  VIR_DEBUG("Found FUSE mount for path=%s. Trying to fix it", path);
> -virFileIsSharedFixFUSE(path, (long *) _type);
> +virFileIsSharedFixFUSE(path, _type);
>  }

Using an unsigned int is fine as per statfs(2):

NOTES
   The __fsword_t type used for various fields in the statfs structure 
defini‐
   tion is a glibc internal type, not intended for public  use.   This  
leaves
   the programmer in a bit of a conundrum when trying to copy or compare 
these
   fields to local variables in a program.  Using unsigned int for such  
vari‐
   ables suffices on most systems.

I would prefer an explicit cast though.

>  
>  VIR_DEBUG("Check if path %s with FS magic %lld is shared",
>path, (long long int)sb.f_type);

Should we also fix this cast?

Reviewed-by: Bjoern Walk 

-- 
IBM Systems
Linux on Z & Virtualization Development
--
IBM Deutschland Research & Development GmbH
Schönaicher Str. 220, 71032 Böblingen
Phone: +49 7031 16 1819
--
Vorsitzende des Aufsichtsrats: Martina Koederitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294 


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: use "id" instead of deprecated "name" for -net

2018-10-09 Thread Thomas Huth
On 2018-10-08 16:54, Ján Tomko wrote:
> -net name= will be deprecated in QEMU 3.1:
> commit 101625a4d4ac7e96227a156bc5f6d21a9cc383cd
> net: Deprecate the "name" parameter of -net
> git describe: v3.0.0-791-g101625a4d4
> 
> Use the id option instead, supported since QEMU 1.2:
> commit 6687b79d636cd60ed9adb1177d0d946b58fa7717
> convert net_client_init() to OptsVisitor
> git describe: v1.0-3564-g6687b79d63 contains: v1.2.0-rc0~142^2~8
> 
> Thankfully, libvirt only uses -net for non-PCI, non-virtio NICs
> on ARM.
> 
> Signed-off-by: Ján Tomko 
> ---
>  src/qemu/qemu_command.c  | 2 +-
>  tests/qemuxml2argvdata/arm-vexpressa9-basic.args | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index d77cf8c2d6..269276f2f9 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3516,7 +3516,7 @@ qemuBuildLegacyNicStr(virDomainNetDefPtr net)
>   net->info.alias,
>   (net->model ? ",model=" : ""),
>   (net->model ? net->model : ""),
> - (net->info.alias ? ",name=" : ""),
> + (net->info.alias ? ",id=" : ""),
>   (net->info.alias ? net->info.alias : "")));
>  return str;
>  }
> diff --git a/tests/qemuxml2argvdata/arm-vexpressa9-basic.args 
> b/tests/qemuxml2argvdata/arm-vexpressa9-basic.args
> index 90661d8b55..b925baa0e0 100644
> --- a/tests/qemuxml2argvdata/arm-vexpressa9-basic.args
> +++ b/tests/qemuxml2argvdata/arm-vexpressa9-basic.args
> @@ -27,6 +27,6 @@ server,nowait \
>  -usb \
>  -drive file=/arm.raw,format=raw,if=sd,index=0 \
>  -netdev user,id=hostnet0 \
> --net nic,macaddr=52:54:00:09:a4:37,netdev=hostnet0,model=lan9118,name=net0 \
> +-net nic,macaddr=52:54:00:09:a4:37,netdev=hostnet0,model=lan9118,id=net0 \
>  -chardev pty,id=charserial0 \
>  -serial chardev:charserial0
> 

Looks good.

Reviewed-by: Thomas Huth 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH v2] virFileIsSharedFixFUSE: Copy mnt_dir when browsing mount table

2018-10-09 Thread Han Han
Fix typos of function name in commit msg.
v1 version: 
https://www.redhat.com/archives/libvir-list/2018-October/msg00511.html

virFileIsSharedFixFUSE doesn't fix f_type when "fuse.glusterfs"
is not the last row of mount table. For example, it doesn't works on
the mount table like following:
10.XX.XX.XX:/gv0 /mnt fuse.glusterfs rw 0 0
r...@xx.xx.xx:/tmp/mkdir /tmp/br0 fuse.sshfs rw 0 0

Copy mnt_dir of struct mntent in case its mnt_dir is changed by
getmntent_r in the loop later.

Signed-off-by: Han Han 
---
 src/util/virfile.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index 2a7e87102a..c503462633 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -3469,7 +3469,7 @@ virFileIsSharedFixFUSE(const char *path,
long *f_type)
 {
 char *dirpath = NULL;
-const char **mounts = NULL;
+char **mounts = NULL;
 size_t nmounts = 0;
 char *p;
 FILE *f = NULL;
@@ -3491,8 +3491,12 @@ virFileIsSharedFixFUSE(const char *path,
 if (STRNEQ("fuse.glusterfs", mb.mnt_type))
 continue;
 
-if (VIR_APPEND_ELEMENT_COPY(mounts, nmounts, mb.mnt_dir) < 0)
+char *mnt_dir;
+if (VIR_STRDUP(mnt_dir, mb.mnt_dir) < 0 ||
+VIR_APPEND_ELEMENT_COPY(mounts, nmounts, mnt_dir) < 0) {
+VIR_FREE(mnt_dir);
 goto cleanup;
+}
 }
 
 /* Add NULL sentinel so that this is a virStringList */
@@ -3512,7 +3516,7 @@ virFileIsSharedFixFUSE(const char *path,
 else
 *p = '\0';
 
-if (virStringListHasString(mounts, dirpath)) {
+if (virStringListHasString((const char **)mounts, dirpath)) {
 VIR_DEBUG("Found gluster FUSE mountpoint=%s for path=%s. "
   "Fixing shared FS type", dirpath, path);
 *f_type = GFS2_MAGIC;
@@ -3523,7 +3527,7 @@ virFileIsSharedFixFUSE(const char *path,
 ret = 0;
  cleanup:
 endmntent(f);
-VIR_FREE(mounts);
+virStringListFree(mounts);
 VIR_FREE(dirpath);
 return ret;
 }
-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] virFileIsSharedFSType: Copy mnt_dir when browsing mount table

2018-10-09 Thread Han Han
virFileIsSharedFSType doesn't fix f_type when "fuse.glusterfs"
is not the last row of mount table. For example, it doesn't works on
the mount table like following:
10.XX.XX.XX:/gv0 /mnt fuse.glusterfs rw 0 0
r...@xx.xx.xx:/tmp/mkdir /tmp/br0 fuse.sshfs rw 0 0

Copy mnt_dir of struct mntent in case its mnt_dir is changed by
getmntent_r in the loop later.

Signed-off-by: Han Han 
---
 src/util/virfile.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/util/virfile.c b/src/util/virfile.c
index 2a7e87102a..c503462633 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -3469,7 +3469,7 @@ virFileIsSharedFixFUSE(const char *path,
long *f_type)
 {
 char *dirpath = NULL;
-const char **mounts = NULL;
+char **mounts = NULL;
 size_t nmounts = 0;
 char *p;
 FILE *f = NULL;
@@ -3491,8 +3491,12 @@ virFileIsSharedFixFUSE(const char *path,
 if (STRNEQ("fuse.glusterfs", mb.mnt_type))
 continue;
 
-if (VIR_APPEND_ELEMENT_COPY(mounts, nmounts, mb.mnt_dir) < 0)
+char *mnt_dir;
+if (VIR_STRDUP(mnt_dir, mb.mnt_dir) < 0 ||
+VIR_APPEND_ELEMENT_COPY(mounts, nmounts, mnt_dir) < 0) {
+VIR_FREE(mnt_dir);
 goto cleanup;
+}
 }
 
 /* Add NULL sentinel so that this is a virStringList */
@@ -3512,7 +3516,7 @@ virFileIsSharedFixFUSE(const char *path,
 else
 *p = '\0';
 
-if (virStringListHasString(mounts, dirpath)) {
+if (virStringListHasString((const char **)mounts, dirpath)) {
 VIR_DEBUG("Found gluster FUSE mountpoint=%s for path=%s. "
   "Fixing shared FS type", dirpath, path);
 *f_type = GFS2_MAGIC;
@@ -3523,7 +3527,7 @@ virFileIsSharedFixFUSE(const char *path,
 ret = 0;
  cleanup:
 endmntent(f);
-VIR_FREE(mounts);
+virStringListFree(mounts);
 VIR_FREE(dirpath);
 return ret;
 }
-- 
2.19.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC PATCH 1/3] qemu: add vfio-ap capability

2018-10-09 Thread Boris Fiuczynski

On 10/9/18 7:03 AM, Bjoern Walk wrote:

Boris Fiuczynski  [2018-10-08, 06:25PM +0200]:

Introduce vfio-ap capability.

Signed-off-by: Boris Fiuczynski 
---
  src/qemu/qemu_capabilities.c | 2 ++
  src/qemu/qemu_capabilities.h | 1 +
  2 files changed, 3 insertions(+)

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index e228f52ec0..2ca5af3297 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -508,6 +508,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
/* 315 */
"vfio-pci.display",
"blockdev",
+  "vfio-ap",
  );
  
  
@@ -1092,6 +1093,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = {

  { "vhost-vsock-device", QEMU_CAPS_DEVICE_VHOST_VSOCK },
  { "mch", QEMU_CAPS_DEVICE_MCH },
  { "sev-guest", QEMU_CAPS_SEV_GUEST },
+{ "vfio-ap", QEMU_CAPS_DEVICE_VFIO_AP },
  };
  
  static struct virQEMUCapsStringFlags virQEMUCapsDevicePropsVirtioBalloon[] = {

diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index 934620ed31..6bb9a2c8f0 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -492,6 +492,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for 
syntax-check */
  /* 315 */
  QEMU_CAPS_VFIO_PCI_DISPLAY, /* -device vfio-pci.display */
  QEMU_CAPS_BLOCKDEV, /* -blockdev and blockdev-add are supported */
+QEMU_CAPS_DEVICE_VFIO_AP, /* -device vfio-ap */
  
  QEMU_CAPS_LAST /* this must always be the last item */

  } virQEMUCapsFlags;
--
2.17.0



Reviewed-by: Bjoern Walk 


Thanks for your rb.

--
Mit freundlichen Grüßen/Kind regards
   Boris Fiuczynski

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martina Köderitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC PATCH 2/3] qemu: vfio-ap device support

2018-10-09 Thread Boris Fiuczynski

On 10/9/18 7:08 AM, Bjoern Walk wrote:

Boris Fiuczynski  [2018-10-08, 06:25PM +0200]:

Adjusting domain format documentation, adding device address
support and adding command line generation for vfio-ap.

Signed-off-by: Boris Fiuczynski 
---
  docs/formatdomain.html.in  | 3 ++-
  docs/schemas/domaincommon.rng  | 1 +
  src/qemu/qemu_command.c| 8 
  src/qemu/qemu_domain_address.c | 4 
  src/util/virmdev.c | 3 ++-
  src/util/virmdev.h | 1 +
  6 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8189959773..269741a690 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -4616,8 +4616,9 @@
For mediated devices (Since 3.2.0)
the model attribute specifies the device API which
determines how the host's vfio driver will expose the device to the
-  guest. Currently, model='vfio-pci' and
+  guest. Currently, model='vfio-pci',
model='vfio-ccw' (Since 
4.4.0)
+  and model='vfio-ap' (Since 
4.9.0)
is supported. MDEV section
provides more information about mediated devices as well as how to
create mediated devices on the host.


Maybe it is time to explain what the models actually are? Or is this not
in the scope of libvirt's documentation?
I am not sure what the libvirt policy regarding the documentation is. 
Therefore it would be good if some familiar with it could answer this 
question.

I just stuck to the style of the already supported vfio models.




diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 099a949cf8..b9ac5df479 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -4618,6 +4618,7 @@

  vfio-pci
  vfio-ccw
+vfio-ap

  
  
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index d77cf8c2d6..83569d70ab 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -5476,6 +5476,14 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd,
  return -1;
  }
  break;
+case VIR_MDEV_MODEL_TYPE_VFIO_AP:
+if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VFIO_AP)) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+   _("VFIO AP device assignment is not "
+ "supported by this version of QEMU"));
+return -1;
+}
+break;
  case VIR_MDEV_MODEL_TYPE_LAST:
  default:
  virReportEnumRangeError(virMediatedDeviceModelType,
diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c
index 8a8764cff5..24dd7c1a58 100644
--- a/src/qemu/qemu_domain_address.c
+++ b/src/qemu/qemu_domain_address.c
@@ -294,6 +294,10 @@ qemuDomainPrimeVfioDeviceAddresses(virDomainDefPtr def,
  subsys->u.mdev.model == VIR_MDEV_MODEL_TYPE_VFIO_CCW &&
  def->hostdevs[i]->info->type == 
VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE)
  def->hostdevs[i]->info->type = type;
+
+if (virHostdevIsMdevDevice(def->hostdevs[i]) &&
+subsys->u.mdev.model == VIR_MDEV_MODEL_TYPE_VFIO_AP)
+def->hostdevs[i]->info->type = VIR_DOMAIN_DEVICE_ADDRESS_TYPE_NONE;
  }
  }
  
diff --git a/src/util/virmdev.c b/src/util/virmdev.c

index 10a2b08337..3e11e38802 100644
--- a/src/util/virmdev.c
+++ b/src/util/virmdev.c
@@ -48,7 +48,8 @@ struct _virMediatedDeviceList {
  
  VIR_ENUM_IMPL(virMediatedDeviceModel, VIR_MDEV_MODEL_TYPE_LAST,

"vfio-pci",
-  "vfio-ccw")
+  "vfio-ccw",
+  "vfio-ap")
  
  static virClassPtr virMediatedDeviceListClass;
  
diff --git a/src/util/virmdev.h b/src/util/virmdev.h

index 7c93c4d390..c856ff5bdb 100644
--- a/src/util/virmdev.h
+++ b/src/util/virmdev.h
@@ -27,6 +27,7 @@
  typedef enum {
  VIR_MDEV_MODEL_TYPE_VFIO_PCI = 0,
  VIR_MDEV_MODEL_TYPE_VFIO_CCW = 1,
+VIR_MDEV_MODEL_TYPE_VFIO_AP  = 2,
  
  VIR_MDEV_MODEL_TYPE_LAST

  } virMediatedDeviceModelType;
--
2.17.0



Looks good.

Reviewed-by: Bjoern Walk 


Thanks for the review.


--
Mit freundlichen Grüßen/Kind regards
   Boris Fiuczynski

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martina Köderitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list