Re: [libvirt] [PATCH] qemu: Forbid without

2017-02-06 Thread Jiri Denemark
On Mon, Feb 06, 2017 at 18:29:59 +0100, Andrea Bolognani wrote:
> In order for memory locking to work, the hard limit on memory
> locking (and usage) has to be set appropriately by the user.
> 
> The documentation mentions the requirement already: with this
> patch, it's going to be enforced by runtime checks as well.
> 
> Note that this will make existing guests that don't satisfy
> the requirement disappear; that said, such guests have never
> been able to start in the first place.

Yes, could not start or if they started they would just crash
afterwards. But with your patch they would just disappear and the user
would not even be able to fix the XML. I think the check should be done
prior to starting a domain to avoid this unfortunate behaviour.

Jirka

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [resend v2 1/7] Resctrl: Add some utils functions

2017-02-06 Thread Eli Qiao


-- 
Eli Qiao
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Tuesday, 7 February 2017 at 12:11 AM, Daniel P. Berrange wrote:

> On Mon, Feb 06, 2017 at 10:23:36AM +0800, Eli Qiao wrote:
> > This patch adds some utils struct and functions to expose resctrl
> > information.
> > 
> > virResCtrlAvailable: If resctrl interface exist on host
> > virResCtrlGet: get specify type resource contral information
> > virResCtrlInit: initialize resctrl struct from the host's sys fs.
> > ResCtrlAll[]: an array to maintain resource control information.
> > 
> > Signed-off-by: Eli Qiao  > (mailto:liyong.q...@intel.com)>
> > ---
> > include/libvirt/virterror.h | 1 +
> > src/Makefile.am (http://Makefile.am) | 1 +
> > src/libvirt_private.syms | 4 +
> > src/util/virerror.c | 1 +
> > src/util/virresctrl.c | 330 
> > src/util/virresctrl.h | 89 
> > 6 files changed, 426 insertions(+)
> > create mode 100644 src/util/virresctrl.c
> > create mode 100644 src/util/virresctrl.h
> > 
> 
> 
> > diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
> > new file mode 100644
> > index 000..63bc808
> > --- /dev/null
> > +++ b/src/util/virresctrl.c
> > @@ -0,0 +1,330 @@
> > +/*
> > + * virresctrl.c: methods for managing resource contral
> > + *
> > + * This library is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * This library is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with this library. If not, see
> > + * .
> > + *
> > + * Authors:
> > + * Eli Qiao 
> > + */
> > +#include 
> > +
> > +#include 
> > +#if defined HAVE_SYS_SYSCALL_H
> > +# include 
> > +#endif
> > +#include 
> > +#include 
> > +#include 
> > +
> > +#include "virresctrl.h"
> > +#include "viralloc.h"
> > +#include "virerror.h"
> > +#include "virfile.h"
> > +#include "virhostcpu.h"
> > +#include "virlog.h"
> > +#include "virstring.h"
> > +#include "nodeinfo.h"
> > +
> > +VIR_LOG_INIT("util.resctrl");
> > +
> > +#define VIR_FROM_THIS VIR_FROM_RESCTRL
> > +
> > +static unsigned int host_id = 0;
> > +
> > +static virResCtrl ResCtrlAll[] = {
> > 
> 
> 
> Lowercase for global variable names please.
> 
 
> > + {
> > + .name = "L3",
> > + .cache_level = "l3",
> > + },
> > + {
> > + .name = "L3DATA",
> > + .cache_level = "l3",
> > + },
> > + {
> > + .name = "L3CODE",
> > + .cache_level = "l3",
> > + },
> > + {
> > + .name = "L2",
> > + .cache_level = "l2",
> > + },
> > +};
> > +
> > +static int virResCtrlGetInfoStr(const int type, const char *item, char 
> > **str)
> > +{
> > + int ret = 0;
> > + char *tmp;
> > + char *path;
> > +
> > + if (asprintf(, "%s/%s/%s", RESCTRL_INFO_DIR, ResCtrlAll[type].name, 
> > item) < 0)
> > + return -1;
> > 
> 
> 
> Use of asprintf is forbidden in libvirt - use virAsprintf.
> 
> Please make sure to run 'make check' and 'make syntax-check' on each
> patch to catch this kind of policy error. There's quite a few other
> style rules missed in these patches - i won't list them all since
> 'make syntax-check' can tell you.
> 
> 
> 


okay, thanks Daniel. 
> 
> > + if (virFileReadAll(path, 10, str) < 0) {
> > + ret = -1;
> > + goto cleanup;
> > + }
> > +
> > + if ((tmp = strchr(*str, '\n'))) {
> > + *tmp = '\0';
> > + }
> > +
> > +cleanup:
> > + VIR_FREE(path);
> > + return ret;
> > +}
> > +
> > +
> > +static int virResCtrlGetCPUValue(const char* path, char** value)
> > +{
> > + int ret = -1;
> > + char* tmp;
> > 
> 
> 
> The '*' should be next to the variable name, not the type name.
> Multiple other cases follow
> 
okay 
> > +
> > + if(virFileReadAll(path, 10, value) < 0) {
> > + goto cleanup;
> > + }
> > + if ((tmp = strchr(*value, '\n'))) {
> > + *tmp = '\0';
> > + }
> > + ret = 0;
> > +cleanup:
> > + return ret;
> > +}
> > +
> > 
> 
> 
> 
> > +int virResCtrlInit(void) {
> > + int i = 0;
> > + char *tmp;
> > + int rc = 0;
> > +
> > + for(i = 0; i < RDT_NUM_RESOURCES; i++) {
> > + if ((rc = asprintf(, "%s/%s", RESCTRL_INFO_DIR, ResCtrlAll[i].name)) 
> > < 0) {
> > + continue;
> > 
> 
> 
> Silently ignoring OOM is not good - please return a proper error - using
> virAsprintf would do so.
> 
okay. 
> > + }
> > + if (virFileExists(tmp)) {
> > + if ((rc = virResCtrlGetConfig(i)) < 0 )
> > + VIR_WARN("Ignor error while get config for %d", i);
> > 
> 
> 
> Again don't ignore errors like this - this should be fatal.
> 
sure 
> > + }
> > +
> > + 

Re: [libvirt] [resend v2 0/7] Support cache tune in libvirt

2017-02-06 Thread Eli Qiao


--  
Eli Qiao
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Tuesday, 7 February 2017 at 3:03 AM, Marcelo Tosatti wrote:

> On Mon, Feb 06, 2017 at 01:33:09PM -0200, Marcelo Tosatti wrote:
> > On Mon, Feb 06, 2017 at 10:23:35AM +0800, Eli Qiao wrote:
> > > This series patches are for supportting CAT featues, which also
> > > called cache tune in libvirt.
> > >  
> > > First to expose cache information which could be tuned in capabilites XML.
> > > Then add new domain xml element support to add cacahe bank which will 
> > > apply
> > > on this libvirt domain.
> > >  
> > > This series patches add a util file `resctrl.c/h`, an interface to talk 
> > > with
> > > linux kernel's sys fs.
> > >  
> > > There are still some TODOs such as expose new public interface to get free
> > > cache information.
> > >  
> > > Some discussion about this feature support can be found from:
> > > https://www.redhat.com/archives/libvir-list/2017-January/msg00644.html
> > >  
> >  
> >  
> > Two comments:
> >  
> > 1) Please perform appropriate filesystem locking when accessing
> > resctrlfs, as described at:
> >  
> > http://www.spinics.net/lists/linux-tip-commits/msg36754.html

Sure.  
> >  
> > 2)  
> >  
> > 
> >  
> > [b4c270b5-e0f9-4106-a446-69032872ed7d]# cat tasks  
> > 8654
> > [b4c270b5-e0f9-4106-a446-69032872ed7d]# pstree -p | grep qemu
> > |-qemu-kvm(8654)-+-{qemu-kvm}(8688)
> > | |-{qemu-kvm}(8692)
> > | `-{qemu-kvm}(8693)
> >  
> > Should add individual vcpus to the "tasks" file, not the main QEMU
> > process.
> >  
> > The NFV usecase requires exclusive CAT allocation for the vcpu which
> > runs the sensitive workload.
> >  
> > Perhaps:
> >  
> >   
> >  
> > Adds all vcpus that are pinned to the socket which cachebank with
> > host_id=1.
> >  
> >  
> >  
> >  
> > Adds PID of vcpus 2 and 3 to resctrl directory created for this
> > allocation.
> >  
>  
>  
Hmm.. in this case, we need to figure out what’s the pid of vcpus=2 and vcpu=3 
and added them to the resctrl directory.
currently, I create a resctrl directory(called resctrl domain) for a VM so just 
put all task ids to it.

this is my though:

let say the vm has vcpus=0 1 2 3, and you want to let 0, 1 benefit cache on 
host_id=0, and 2, 3 on host_id=1

you will do:

1)
pin vcpus 0, 1 on the cpus of socket 0  
pin vcpus 2, 3 on the cpus of socket 1
this can be done in vcputune

2) define cache tune like this:



in libvirt:
we create a resctrl directory naming with the VM’s uuid
and set schemata for each socket 0, and socket 1, put all qemu tasks ids into 
tasks file, this will work fine.  
please be note that in a resctrl directory, we can define schemata for each 
socket id separately.
  
  
> 3) CDP / non-CDP convertion.
>  
> In case the size determination has been performed with non-CDP,
> to emulate such allocation on a CDP host,
> it would be good to allow both code and data allocations to share
> the CBM space:  
>  
IOM, I don’t think it’s good to have this.
in libvirt capabilities xml, the application will get to know if the host 
support cdp or not.
  
>  
> 
> 
>  
> Perhaps if using the same ID?
I am open to hear about what other’s say,  
  
>  
>  
> Other than that, testing looks good.  
Thanks for the testing.  


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [resend v2 2/7] Resctrl: expose cache information to capabilities

2017-02-06 Thread Eli Qiao


-- 
Eli Qiao
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Tuesday, 7 February 2017 at 12:02 AM, Daniel P. Berrange wrote:

> On Mon, Feb 06, 2017 at 10:23:37AM +0800, Eli Qiao wrote:
> > This patch expose cache information to host's capabilites xml.
> > 
> > For l3 cache allocation
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > For l3 cache allocation supported cdp(seperate data/code):
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > RFC on mailing list.
> > https://www.redhat.com/archives/libvir-list/2017-January/msg00644.html
> > 
> > Signed-off-by: Eli Qiao  > (mailto:liyong.q...@intel.com)>
> > ---
> > src/conf/capabilities.c | 56 
> > src/conf/capabilities.h | 23 +++
> > src/libvirt_private.syms | 1 +
> > src/qemu/qemu_capabilities.c | 68 
> > 
> > src/qemu/qemu_driver.c | 4 +++
> > 5 files changed, 152 insertions(+)
> > 
> 
> 
> 
> > diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
> > index 3247d25..23f416d 100644
> > --- a/src/qemu/qemu_capabilities.c
> > +++ b/src/qemu/qemu_capabilities.c
> > @@ -45,6 +45,7 @@
> > #include "qemu_domain.h"
> > #define __QEMU_CAPSRIV_H_ALLOW__
> > #include "qemu_capspriv.h"
> > +#include "virresctrl.h"
> > 
> > #include 
> > #include 
> > @@ -1098,7 +1099,71 @@ virQEMUCapsInitCPU(virCapsPtr caps,
> > goto cleanup;
> > }
> > 
> > +static int
> > +virQEMUCapsInitCache(virCapsPtr caps)
> > +{
> > + int i, j;
> > + virResCtrlPtr resctrl;
> > + virCapsHostCacheBankPtr bank;
> > +
> > + for (i = 0; i < RDT_NUM_RESOURCES; i ++)
> > + {
> > + /* L3DATA and L3CODE share L3 resources */
> > + if ( i == RDT_RESOURCE_L3CODE )
> > + continue;
> > 
> > + resctrl = virResCtrlGet(i);
> > +
> > + if(resctrl->enabled) {
> > + for( j = 0; j < resctrl->num_banks; j++)
> > + {
> > + if(VIR_RESIZE_N(caps->host.cachebank, caps->host.ncachebank_max,
> > + caps->host.ncachebank, 1) < 0)
> > + return -1;
> > +
> > + if(VIR_ALLOC(bank) < 0)
> > + return -1;
> > +
> > + bank->id = resctrl->cache_banks[j].host_id;
> > + if(VIR_STRDUP(bank->type, resctrl->cache_level) < 0)
> > + goto err;
> > + if(VIR_STRDUP(bank->cpus, 
> > virBitmapFormat(resctrl->cache_banks[j].cpu_mask)) < 0)
> > + goto err;
> > + bank->size = resctrl->cache_banks[j].cache_size;
> > + /*L3DATA and L3CODE shares L3 cache resources, so fill them to the 
> > control element*/
> > + if ( i == RDT_RESOURCE_L3DATA ) {
> > + if(VIR_EXPAND_N(bank->control, bank->ncontrol, 2) < 0)
> > + goto err;
> > +
> > + bank->control[0].min = 
> > virResCtrlGet(RDT_RESOURCE_L3DATA)->cache_banks[j].cache_min;
> > + bank->control[0].reserved = bank->control[0].min * 
> > (virResCtrlGet(RDT_RESOURCE_L3DATA)->min_cbm_bits);
> > + if(VIR_STRDUP(bank->control[0].scope,
> > + virResCtrlGet(RDT_RESOURCE_L3DATA)->name) < 0)
> > + goto err;
> > +
> > + bank->control[1].min = 
> > virResCtrlGet(RDT_RESOURCE_L3CODE)->cache_banks[j].cache_min;
> > + bank->control[1].reserved = bank->control[1].min * 
> > (virResCtrlGet(RDT_RESOURCE_L3CODE)->min_cbm_bits);
> > + if(VIR_STRDUP(bank->control[1].scope,
> > + virResCtrlGet(RDT_RESOURCE_L3CODE)->name) < 0)
> > + goto err;
> > + }
> > + else {
> > + if(VIR_EXPAND_N(bank->control, bank->ncontrol, 1) < 0)
> > + goto err;
> > + bank->control[0].min = resctrl->cache_banks[j].cache_min;
> > + bank->control[0].reserved = bank->control[0].min * resctrl->min_cbm_bits;
> > + if(VIR_STRDUP(bank->control[0].scope, resctrl->name) < 0)
> > + goto err;
> > + }
> > + caps->host.cachebank[caps->host.ncachebank++] = bank;
> > + }
> > + }
> > + }
> > + return 0;
> > +err:
> > + VIR_FREE(bank);
> > + return -1;
> > +}
> > 
> 
> 
> I don't think this code should be in the QEMU driver - better to have
> it in nodeinfo.c so it is common to all drivers.
> 
Sure, I will move it to nodeinfo.c 
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org -o- http://virt-manager.org :|
> |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|
> 
> 
> 


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [resend v2 4/7] Resctrl: Add private interface to set cachebanks

2017-02-06 Thread Eli Qiao


-- 
Eli Qiao
Sent with Sparrow (http://www.sparrowmailapp.com/?sig)


On Tuesday, 7 February 2017 at 12:17 AM, Daniel P. Berrange wrote:

> On Mon, Feb 06, 2017 at 10:23:39AM +0800, Eli Qiao wrote:
> > virResCtrlSetCacheBanks: Set cache banks of a libvirt domain. It will
> > create new resource domain under `/sys/fs/resctrl` and fill the
> > schemata according the cache banks configration.
> > 
> > virResCtrlUpdate: Update the schemata after libvirt domain destroy.
> > 
> > Signed-off-by: Eli Qiao  > (mailto:liyong.q...@intel.com)>
> > ---
> > src/libvirt_private.syms | 2 +
> > src/util/virresctrl.c | 644 ++-
> > src/util/virresctrl.h | 47 +++-
> > 3 files changed, 691 insertions(+), 2 deletions(-)
> > 
> 
> 
> > diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
> > index 3cc41da..11f43d8 100644
> > --- a/src/util/virresctrl.h
> > +++ b/src/util/virresctrl.h
> > @@ -26,6 +26,7 @@
> > 
> > # include "virutil.h"
> > # include "virbitmap.h"
> > +# include "domain_conf.h"
> > 
> > #define RESCTRL_DIR "/sys/fs/resctrl"
> > #define RESCTRL_INFO_DIR "/sys/fs/resctrl/info"
> > @@ -82,9 +83,53 @@ struct _virResCtrl {
> > virResCacheBankPtr cache_banks;
> > };
> > 
> > +/**
> > + * a virResSchemata represents a schemata object under a resource control
> > + * domain.
> > + */
> > +typedef struct _virResSchemataItem virResSchemataItem;
> > +typedef virResSchemataItem *virResSchemataItemPtr;
> > +struct _virResSchemataItem {
> > + unsigned int socket_no;
> > + unsigned schemata;
> > +};
> > +
> > +typedef struct _virResSchemata virResSchemata;
> > +typedef virResSchemata *virResSchemataPtr;
> > +struct _virResSchemata {
> > + unsigned int n_schemata_items;
> > + virResSchemataItemPtr schemata_items;
> > +};
> > +
> > +/**
> > + * a virResDomain represents a resource control domain. It's a double 
> > linked
> > + * list.
> > + */
> > +
> > +typedef struct _virResDomain virResDomain;
> > +typedef virResDomain *virResDomainPtr;
> > +
> > +struct _virResDomain {
> > + char* name;
> > + virResSchemataPtr schematas[RDT_NUM_RESOURCES];
> > + char* tasks;
> > + int n_sockets;
> > + virResDomainPtr pre;
> > + virResDomainPtr next;
> > +};
> > +
> > +/* All resource control domains on this host*/
> > +
> > +typedef struct _virResCtrlDomain virResCtrlDomain;
> > +typedef virResCtrlDomain *virResCtrlDomainPtr;
> > +struct _virResCtrlDomain {
> > + unsigned int num_domains;
> > + virResDomainPtr domains;
> > +};
> > 
> 
> 
> I don't think any of these need to be in the header file - they're
> all private structs used only by the .c file.
> 
Yep. 
> The biggest issue I see here is a complete lack of any kind of
> mutex locking. Libvirt is multi-threaded, so you must expect to
> have virResCtrlSetCacheBanks() and virResCtrlUpdate called
> concurrently and thus use mutexes to ensure safety.
> 
okay. 
> With the data structure design of using linked list of virResDomain
> though, doing good locking is going to be hard. It'll force you to
> have a single global mutex across the whole set of data structures
> which will really harm concurrency for VM startup.
> 
> Really each virResDomain needs to be completely standalone, with
> its own dedicate mutex. A global mutex for virResCtrlDomain can
> be acquired whle you lookup the virResDomain object to use. Thereafter
> the global mutex should be released and only the mutex for the specific
> domain used.
> 
> 
> 

oh, yes, but lock is really a hard thing, I need to be careful to avoid dead 
lock. 
> 
> > bool virResCtrlAvailable(void);
> > int virResCtrlInit(void);
> > virResCtrlPtr virResCtrlGet(int);
> > -
> > +int virResCtrlSetCacheBanks(virDomainCachetunePtr, unsigned char*, pid_t);
> > +int virResCtrlUpdate(void);
> > 
> 
> 
> This API design doesn't really make locking very easy - since you are not
> passing any info into the virResCtrlUpdate() method you've created the
> need to do global rescans when updating.
> 
> 
> 


yes, what if I use a global mutex variable in virresctrl.c?
> 
> IMHO virResCtrlSetCacheBanks needs to return an object that represents the
> config for that particular VM. This can be passed back in, when the guest
> is shutdown to release resources once again, avoiding any global scans.
> 
> 
> 


Hmm.. what object should virResCtrlSetCacheBanks return? schemata setting?
I expect that when the VM reboot, we recalculate from cachetune(in xml setting) 
again, that because we are not sure if the host can offer the VM for enough 
cache when it restart again.

 
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org -o- http://virt-manager.org :|
> |: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|
> 
> 
> 


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH] libxl: fix disk detach when not specified

2017-02-06 Thread Jim Fehlig
When a user does not explicitly set a  in the disk config,
libvirt defers selection of a default to libxl. This approach works
fine when starting a domain with such configuration or attaching a
disk to a running domain. But when detaching such a disk, libxl
will fail with "unrecognized disk backend type: 0". libxl makes no
attempt to recalculate a default backend (driver) on detach and
simply fails when uninitialized.

This patch updates the libvirt disk config with the backend selected
by libxl when starting a domain or attaching a disk to a running
domain. Another benefit of this approach is that the live XML is
also updated with the backend driver selected by libxl.

Signed-off-by: Jim Fehlig 
---
 src/libxl/libxl_conf.c   | 33 +
 src/libxl/libxl_conf.h   |  4 
 src/libxl/libxl_domain.c | 25 +
 src/libxl/libxl_driver.c |  2 +-
 4 files changed, 63 insertions(+), 1 deletion(-)

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index b5186f2..6ef7570 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -851,6 +851,7 @@ libxlMakeDisk(virDomainDiskDefPtr l_disk, libxl_device_disk 
*x_disk)
  * xl-disk-configuration.txt in the xen documentation and let
  * libxl pick a suitable backend.
  */
+virDomainDiskSetFormat(l_disk, VIR_STORAGE_FILE_RAW);
 x_disk->format = LIBXL_DISK_FORMAT_RAW;
 x_disk->backend = LIBXL_DISK_BACKEND_UNKNOWN;
 }
@@ -913,6 +914,38 @@ libxlMakeDiskList(virDomainDefPtr def, libxl_domain_config 
*d_config)
 return -1;
 }
 
+/*
+ * Update libvirt disk config with libxl disk config.
+ *
+ * This function can be used to update the libvirt disk config with default
+ * values selected by libxl. Currently only the backend type is selected by
+ * libxl when not explicitly specified by the user.
+ */
+void
+libxlUpdateDisk(virDomainDiskDefPtr l_disk, libxl_device_disk *x_disk)
+{
+const char *driver = NULL;
+
+if (virDomainDiskGetDriver(l_disk))
+return;
+
+switch (x_disk->backend) {
+case LIBXL_DISK_BACKEND_QDISK:
+driver = "qemu";
+break;
+case LIBXL_DISK_BACKEND_TAP:
+driver = "tap";
+break;
+case LIBXL_DISK_BACKEND_PHY:
+driver = "phy";
+break;
+case LIBXL_DISK_BACKEND_UNKNOWN:
+break;
+}
+if (driver)
+ignore_value(virDomainDiskSetDriver(l_disk, driver));
+}
+
 int
 libxlMakeNic(virDomainDefPtr def,
  virDomainNetDefPtr l_nic,
diff --git a/src/libxl/libxl_conf.h b/src/libxl/libxl_conf.h
index 69d7885..732a1d2 100644
--- a/src/libxl/libxl_conf.h
+++ b/src/libxl/libxl_conf.h
@@ -175,6 +175,10 @@ int libxlDriverConfigLoadFile(libxlDriverConfigPtr cfg,
 
 int
 libxlMakeDisk(virDomainDiskDefPtr l_dev, libxl_device_disk *x_dev);
+
+void
+libxlUpdateDisk(virDomainDiskDefPtr l_dev, libxl_device_disk *x_dev);
+
 int
 libxlMakeNic(virDomainDefPtr def,
  virDomainNetDefPtr l_nic,
diff --git a/src/libxl/libxl_domain.c b/src/libxl/libxl_domain.c
index ed73cd2..0168777 100644
--- a/src/libxl/libxl_domain.c
+++ b/src/libxl/libxl_domain.c
@@ -1067,6 +1067,30 @@ libxlDomainCreateIfaceNames(virDomainDefPtr def, 
libxl_domain_config *d_config)
 }
 }
 
+static void
+libxlDomainUpdateDiskParams(virDomainDefPtr def, libxl_ctx *ctx)
+{
+libxl_device_disk *disks;
+int num_disks = 0;
+size_t i;
+int idx;
+
+disks = libxl_device_disk_list(ctx, def->id, _disks);
+if (!disks)
+return;
+
+for (i = 0; i < num_disks; i++) {
+if ((idx = virDomainDiskIndexByName(def, disks[i].vdev, false)) < 0)
+continue;
+
+libxlUpdateDisk(def->disks[idx], [i]);
+}
+
+for (i = 0; i < num_disks; i++)
+libxl_device_disk_dispose([i]);
+VIR_FREE(disks);
+}
+
 #ifdef LIBXL_HAVE_DEVICE_CHANNEL
 static void
 libxlDomainCreateChannelPTY(virDomainDefPtr def, libxl_ctx *ctx)
@@ -1310,6 +1334,7 @@ libxlDomainStart(libxlDriverPrivatePtr driver,
 goto destroy_dom;
 
 libxlDomainCreateIfaceNames(vm->def, _config);
+libxlDomainUpdateDiskParams(vm->def, cfg->ctx);
 
 #ifdef LIBXL_HAVE_DEVICE_CHANNEL
 if (vm->def->nchannels > 0)
diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c
index 3a69720..6dddf9f 100644
--- a/src/libxl/libxl_driver.c
+++ b/src/libxl/libxl_driver.c
@@ -3027,7 +3027,7 @@ libxlDomainAttachDeviceDiskLive(virDomainObjPtr vm, 
virDomainDeviceDefPtr dev)
 }
 goto cleanup;
 }
-
+libxlUpdateDisk(l_disk, _disk);
 virDomainDiskInsertPreAlloced(vm->def, l_disk);
 
 } else {
-- 
2.9.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 5/5] xenFoxenFormatXLDisk: Don't leak @target

2017-02-06 Thread Jim Fehlig
Michal Privoznik wrote:
> ==11260== 1,006 bytes in 1 blocks are definitely lost in loss record 106 of 
> 111
> ==11260==at 0x4C2AE5F: malloc (vg_replace_malloc.c:297)
> ==11260==by 0x4C2BDFF: realloc (vg_replace_malloc.c:693)
> ==11260==by 0x4EA430B: virReallocN (viralloc.c:245)
> ==11260==by 0x4EA7C52: virBufferGrow (virbuffer.c:130)
> ==11260==by 0x4EA7D28: virBufferAdd (virbuffer.c:165)
> ==11260==by 0x4EA8E10: virBufferStrcat (virbuffer.c:718)
> ==11260==by 0x42D263: xenFormatXLDiskSrcNet (xen_xl.c:960)
> ==11260==by 0x42D4EB: xenFormatXLDiskSrc (xen_xl.c:1015)
> ==11260==by 0x42D870: xenFormatXLDisk (xen_xl.c:1101)
> ==11260==by 0x42DA89: xenFormatXLDomainDisks (xen_xl.c:1148)
> ==11260==by 0x42EAF8: xenFormatXL (xen_xl.c:1558)
> ==11260==by 0x40E85F: testCompareParseXML (xlconfigtest.c:105)
> 
> Signed-off-by: Michal Privoznik 

Strange function name in $subject. s/xenFoxenFormatXLDisk/xenFormatXLDisk/

ACK.

Regards,
Jim

> ---
>  src/xenconfig/xen_xl.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c
> index 18d9fe369..2c9174e53 100644
> --- a/src/xenconfig/xen_xl.c
> +++ b/src/xenconfig/xen_xl.c
> @@ -1034,6 +1034,7 @@ xenFormatXLDisk(virConfValuePtr list, 
> virDomainDiskDefPtr disk)
>  int format = virDomainDiskGetFormat(disk);
>  const char *driver = virDomainDiskGetDriver(disk);
>  char *target = NULL;
> +int ret = -1;
>  
>  /* format */
>  virBufferAddLit(, "format=");
> @@ -1119,12 +1120,12 @@ xenFormatXLDisk(virConfValuePtr list, 
> virDomainDiskDefPtr disk)
>  tmp->next = val;
>  else
>  list->list = val;
> -return 0;
> +ret = 0;
>  
>   cleanup:
>  VIR_FREE(target);
>  virBufferFreeAndReset();
> -return -1;
> +return ret;
>  }
>  
>  

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [resend v2 0/7] Support cache tune in libvirt

2017-02-06 Thread Marcelo Tosatti
On Mon, Feb 06, 2017 at 01:33:09PM -0200, Marcelo Tosatti wrote:
> On Mon, Feb 06, 2017 at 10:23:35AM +0800, Eli Qiao wrote:
> > This series patches are for supportting CAT featues, which also
> > called cache tune in libvirt.
> > 
> > First to expose cache information which could be tuned in capabilites XML.
> > Then add new domain xml element support to add cacahe bank which will apply
> > on this libvirt domain.
> > 
> > This series patches add a util file `resctrl.c/h`, an interface to talk with
> > linux kernel's sys fs.
> > 
> > There are still some TODOs such as expose new public interface to get free
> > cache information.
> > 
> > Some discussion about this feature support can be found from:
> > https://www.redhat.com/archives/libvir-list/2017-January/msg00644.html
> 
> Two comments:
> 
> 1) Please perform appropriate filesystem locking when accessing
> resctrlfs, as described at:
> 
> http://www.spinics.net/lists/linux-tip-commits/msg36754.html
> 
> 2) 
> 
> 
> 
> [b4c270b5-e0f9-4106-a446-69032872ed7d]# cat tasks  
> 8654
> [b4c270b5-e0f9-4106-a446-69032872ed7d]# pstree -p | grep qemu
>|-qemu-kvm(8654)-+-{qemu-kvm}(8688)
>||-{qemu-kvm}(8692)
>|`-{qemu-kvm}(8693)
> 
> Should add individual vcpus to the "tasks" file, not the main QEMU
> process.
> 
> The NFV usecase requires exclusive CAT allocation for the vcpu which
> runs the sensitive workload.
> 
> Perhaps:
> 
>  
> 
> Adds all vcpus that are pinned to the socket which cachebank with
> host_id=1.
> 
>  vcpus=2,3/> 
> 
> Adds PID of vcpus 2 and 3 to resctrl directory created for this
> allocation.

3) CDP / non-CDP convertion.

In case the size determination has been performed with non-CDP,
to emulate such allocation on a CDP host,
it would be good to allow both code and data allocations to share
the CBM space: 


 
 

Perhaps if using the same ID? 

Other than that, testing looks good.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] docs: mention bhyve SATA address changes in news.xml

2017-02-06 Thread Roman Bogorodskiy
  Andrea Bolognani wrote:

> On Sun, 2017-02-05 at 16:52 +0400, Roman Bogorodskiy wrote:
> [...]
> > +  
> > +   
> > +  bhyve: change address allocation schema for SATA disks
> > +   
> 
> The indentation is all wrong, here and below. Please indent
> using only spaces and make sure the result matches existing
> entries; be also mindful of line length.

Oops, need to configure vim for proper indentation for these files.
Fixed.

> > +   
> > +     Previously, the bhyve driver assigned PCI addresses to SATA disks 
> > directly
> > +     rather than assigning that to a controller and using SATA addresses 
> > for disks.
> > +  It was implemented this way because bhyve has no notion of an 
> > explicit SATA
> > +     controller.
> 
> Aside: does this mean there is an implicit, default SATA
> controller? How would that work otherwise?

Sort of. I mean that there's no such thing as
'slot:func:controller:controller_id' + 'controller_id:disk' or
something like that, it's just disk
'slot:func:ahci-(hd|cd):image_path'.

> > However, this doesn't go inline with the internal libvirt model,
> 
> "However, as this doesn't match libvirt's understanding of
> disk addresses, [...]" or something along those lines.

Done.

> > +  it was changed for the bhyve driver to follow the common schema 
> > and
> > +     have PCI addresses for SATA controllers and SATA addresses for disks. 
> > If you're having
> > +     issues because of this, it's recommended to edit the domain's XML and 
> > remove
> > +     address type='xml' from the disk elements with
> 
> s/xml/pci/ here, I assume.

Right, fixed.

> > +     target bus="sata"/ and let libvirt regenerate it properly.
> 
> s/"/'/g to match the above and what libvirt actually uses ;)

Done.

> -- 
> Andrea Bolognani / Red Hat / Virtualization

Roman Bogorodskiy


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH v2] docs: mention bhyve SATA address changes in news.xml

2017-02-06 Thread Roman Bogorodskiy
---
 docs/news.xml | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/docs/news.xml b/docs/news.xml
index f408293a1..7990cc6d4 100644
--- a/docs/news.xml
+++ b/docs/news.xml
@@ -56,6 +56,25 @@
   a fabric name has been removed by making it optional.
 
   
+  
+
+  bhyve: change address allocation schema for SATA disks
+
+
+  Previously, the bhyve driver assigned PCI addresses to SATA disks
+  directly rather than assigning that to a controller and
+  using SATA addresses for disks. It was implemented this way
+  because bhyve has no notion of an explicit SATA controller.
+  However, as this doesn't match libvirt's understanding of
+  disk addresses, it was changed for the bhyve driver
+  to follow the common schema and have PCI addresses
+  for SATA controllers and SATA addresses for disks. If you're
+  having issues because of this, it's recommended to edit
+  the domain's XML and remove address type='pci'
+  from the disk elements with target bus='sata'/
+  and let libvirt regenerate it properly.
+
+  
 
   
   
-- 
2.11.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] qemu: Forbid without

2017-02-06 Thread Andrea Bolognani
In order for memory locking to work, the hard limit on memory
locking (and usage) has to be set appropriately by the user.

The documentation mentions the requirement already: with this
patch, it's going to be enforced by runtime checks as well.

Note that this will make existing guests that don't satisfy
the requirement disappear; that said, such guests have never
been able to start in the first place.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1316774
---
 src/qemu/qemu_domain.c   | 20 
 tests/qemuxml2argvdata/qemuxml2argv-mlock-on.xml |  3 +++
 xml => qemuxml2argv-mlock-without-hardlimit.xml} |  0
 tests/qemuxml2argvtest.c |  1 +
 4 files changed, 24 insertions(+)
 copy tests/qemuxml2argvdata/{qemuxml2argv-mlock-on.xml => 
qemuxml2argv-mlock-without-hardlimit.xml} (100%)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index c6ce090..bb29cfe 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -2629,6 +2629,23 @@ qemuDomainDefVcpusPostParse(virDomainDefPtr def)
 
 
 static int
+qemuDomainDefMemoryLockingPostParse(virDomainDefPtr def)
+{
+/* Memory locking can only work properly if the memory locking limit
+ * for the QEMU process has been raised appropriately: the default one
+ * is extrememly low, so there's no way the guest will fit in there */
+if (def->mem.locked && !virMemoryLimitIsSet(def->mem.hard_limit)) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+   _("Setting  requires "
+ " to be set as well"));
+return -1;
+}
+
+return 0;
+}
+
+
+static int
 qemuDomainDefPostParse(virDomainDefPtr def,
virCapsPtr caps,
unsigned int parseFlags,
@@ -2692,6 +2709,9 @@ qemuDomainDefPostParse(virDomainDefPtr def,
 if (qemuDomainDefVcpusPostParse(def) < 0)
 goto cleanup;
 
+if (qemuDomainDefMemoryLockingPostParse(def) < 0)
+goto cleanup;
+
 ret = 0;
  cleanup:
 virObjectUnref(qemuCaps);
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-mlock-on.xml 
b/tests/qemuxml2argvdata/qemuxml2argv-mlock-on.xml
index 20a5eaa..2046663 100644
--- a/tests/qemuxml2argvdata/qemuxml2argv-mlock-on.xml
+++ b/tests/qemuxml2argvdata/qemuxml2argv-mlock-on.xml
@@ -3,6 +3,9 @@
   c7a5fdbd-edaf-9455-926a-d65c16db1809
   219136
   219136
+  
+256000
+  
   
 
   
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-mlock-on.xml 
b/tests/qemuxml2argvdata/qemuxml2argv-mlock-without-hardlimit.xml
similarity index 100%
copy from tests/qemuxml2argvdata/qemuxml2argv-mlock-on.xml
copy to tests/qemuxml2argvdata/qemuxml2argv-mlock-without-hardlimit.xml
diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
index 3532cb5..9b2fec5 100644
--- a/tests/qemuxml2argvtest.c
+++ b/tests/qemuxml2argvtest.c
@@ -2129,6 +2129,7 @@ mymain(void)
 DO_TEST_FAILURE("mlock-on", NONE);
 DO_TEST("mlock-off", QEMU_CAPS_REALTIME_MLOCK);
 DO_TEST("mlock-unsupported", NONE);
+DO_TEST_PARSE_ERROR("mlock-without-hardlimit", NONE);
 
 DO_TEST_PARSE_ERROR("pci-bridge-negative-index-invalid",
 QEMU_CAPS_DEVICE_PCI_BRIDGE);
-- 
2.7.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC PATCH 00/16] Introduce vGPU mdev framework to libvirt

2017-02-06 Thread Alex Williamson
On Mon, 6 Feb 2017 16:44:37 +
"Daniel P. Berrange"  wrote:

> On Mon, Feb 06, 2017 at 01:19:42PM +0100, Erik Skultety wrote:
> > Finally. It's here. This is the initial suggestion on how libvirt might
> > interract with the mdev framework, currently only focussing on the 
> > non-managed
> > devices, i.e. those pre-created by the user, since that will be revisited 
> > once
> > we all settled on how the XML should look like, given we might not want to 
> > use
> > the sysfs path directly as an attribute in the domain XML. My proposal on 
> > the
> > XML is the following:
> > 
> >   
> > 
> > 
> > 
> > vGPU_UUID
> > 
> > 
> > 
> > 
> > So the mediated device is identified by the physical parent device visible 
> > on
> > the host and a UUID which allows us to construct the sysfs path by 
> > ourselves,
> > which we then put on the QEMU's command line.
> > 
> > A few remarks if you actually happen to have a machine to test this on:
> > - right now the mediated devices are one-time use only, i.e. they have to be
> > recreated before every machine boot
> > - I wouldn't recommend assigning multiple vGPUs to a single domain
> > 
> > Once this series is sorted out, we can then continue with 'managed=yes' 
> > where
> > as Laine pointed out [1], we need to figure out how exactly should the
> > management layer hint libvirt which vGPU type should be used for device
> > instantiation.  
> 
> You seem to be suggesting that managed=yes with mdev devices would
> cause create / delete of a mdev device from a specified parent.
> 
> This is rather different semantics from what managed=yes does with
> PCI device assignment today.  There the managed=yes flag is just
> about controlling host device driver attachment. ie whether libvirt
> will manually bind to vfio.ko, or expect the admin to have bound
> it to vfio.ko before hand. I think it is important to keep that
> concept as is for mdev too.
> 
> While we're thinking of mdev purely in terms of KVM + vfio usage,
> it wouldn't suprise me if there ended up being non-KVM based
> use cases for mdev.
> 
> It isn't clear to me that auto-creation of mdev devices as a concept
> even belongs in the domain XML neccessarily.
> 
> Looking at two similar areas. For SRIOV NICs, in the domain XML
> you either specify an explicit VF to use, or you reference a
> libvirt virtual network. The latter takes care of dynamically
> providing VFs to VMs.  For NPIV, IIRC, the domain XML works
> similarly either taking an explicit vHBA, or referencing a
> storage pool to get one more dynamically.

Nit, there are other constraints of SR-IOV which I think are over
simplifying this analogy.  With SR-IOV, we can't dynamically
instantiate new VFs individually.  The process there requires that we
set the number of VFs we need and enable them.  Changing that number
of VFs requires that all existing VFs on that PF are removed and
recreated.  So, does libvirt work the way it does with SR-IOV devices
because that's the optimal way for users to make use of those VFs, or
does it behave that way because it must to follow the constraints of
the device?  I think libvirt handles VFs much like it does PFs because
it has no other choice.  Here we do have a choice.  Individual mdev
devices can be created and destroyed.  The only dependency between mdev
devices is how creating one affects the availability of mdev types
remaining on the parent device.  It would really be a shame to not take
advantage of the fact that the underlying device creation has advanced
so far from SR-IOV and lump it into the same sort of management.  My
impression is that user management of creating SR-IOV VFs via module
options or self defined scripts is a stumbling point that libvirt could
help to address here.
 
> Before we even consider auto-creation though, I think we need
> to have manual creation designed & integrated in the node device
> APIs.
> 
> So in terms of the domain XML, I think the only think we need
> to provide is the address of the pre-existing mdev device
> to be used. In this case "address" means the UUID. We should
> not need anything about the parent device AFAICT.

Yep, agree.  Thanks,

Alex

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] docs: mention bhyve SATA address changes in news.xml

2017-02-06 Thread Andrea Bolognani
On Sun, 2017-02-05 at 16:52 +0400, Roman Bogorodskiy wrote:
[...]
> +  
> + 
> +  bhyve: change address allocation schema for SATA disks
> + 

The indentation is all wrong, here and below. Please indent
using only spaces and make sure the result matches existing
entries; be also mindful of line length.

> + 
> +   Previously, the bhyve driver assigned PCI addresses to SATA disks 
> directly
> +   rather than assigning that to a controller and using SATA addresses 
> for disks.
> +  It was implemented this way because bhyve has no notion of an 
> explicit SATA
> +   controller.

Aside: does this mean there is an implicit, default SATA
controller? How would that work otherwise?

> However, this doesn't go inline with the internal libvirt model,

"However, as this doesn't match libvirt's understanding of
disk addresses, [...]" or something along those lines.

> +  it was changed for the bhyve driver to follow the common schema and
> +   have PCI addresses for SATA controllers and SATA addresses for disks. 
> If you're having
> +   issues because of this, it's recommended to edit the domain's XML and 
> remove
> +   address type='xml' from the disk elements with

s/xml/pci/ here, I assume.

> +   target bus="sata"/ and let libvirt regenerate it properly.

s/"/'/g to match the above and what libvirt actually uses ;)

-- 
Andrea Bolognani / Red Hat / Virtualization

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [RFC PATCH 00/16] Introduce vGPU mdev framework to libvirt

2017-02-06 Thread Daniel P. Berrange
On Mon, Feb 06, 2017 at 01:19:42PM +0100, Erik Skultety wrote:
> Finally. It's here. This is the initial suggestion on how libvirt might
> interract with the mdev framework, currently only focussing on the non-managed
> devices, i.e. those pre-created by the user, since that will be revisited once
> we all settled on how the XML should look like, given we might not want to use
> the sysfs path directly as an attribute in the domain XML. My proposal on the
> XML is the following:
> 
>   
> 
> 
> 
> vGPU_UUID
> 
> 
> 
> 
> So the mediated device is identified by the physical parent device visible on
> the host and a UUID which allows us to construct the sysfs path by ourselves,
> which we then put on the QEMU's command line.
> 
> A few remarks if you actually happen to have a machine to test this on:
> - right now the mediated devices are one-time use only, i.e. they have to be
> recreated before every machine boot
> - I wouldn't recommend assigning multiple vGPUs to a single domain
> 
> Once this series is sorted out, we can then continue with 'managed=yes' where
> as Laine pointed out [1], we need to figure out how exactly should the
> management layer hint libvirt which vGPU type should be used for device
> instantiation.

You seem to be suggesting that managed=yes with mdev devices would
cause create / delete of a mdev device from a specified parent.

This is rather different semantics from what managed=yes does with
PCI device assignment today.  There the managed=yes flag is just
about controlling host device driver attachment. ie whether libvirt
will manually bind to vfio.ko, or expect the admin to have bound
it to vfio.ko before hand. I think it is important to keep that
concept as is for mdev too.

While we're thinking of mdev purely in terms of KVM + vfio usage,
it wouldn't suprise me if there ended up being non-KVM based
use cases for mdev.

It isn't clear to me that auto-creation of mdev devices as a concept
even belongs in the domain XML neccessarily.

Looking at two similar areas. For SRIOV NICs, in the domain XML
you either specify an explicit VF to use, or you reference a
libvirt virtual network. The latter takes care of dynamically
providing VFs to VMs.  For NPIV, IIRC, the domain XML works
similarly either taking an explicit vHBA, or referencing a
storage pool to get one more dynamically.

Before we even consider auto-creation though, I think we need
to have manual creation designed & integrated in the node device
APIs.

So in terms of the domain XML, I think the only think we need
to provide is the address of the pre-existing mdev device
to be used. In this case "address" means the UUID. We should
not need anything about the parent device AFAICT.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC PATCH 00/16] Introduce vGPU mdev framework to libvirt

2017-02-06 Thread Alex Williamson
On Mon,  6 Feb 2017 13:19:42 +0100
Erik Skultety  wrote:

> Finally. It's here. This is the initial suggestion on how libvirt might
> interract with the mdev framework, currently only focussing on the non-managed
> devices, i.e. those pre-created by the user, since that will be revisited once
> we all settled on how the XML should look like, given we might not want to use
> the sysfs path directly as an attribute in the domain XML. My proposal on the
> XML is the following:
> 
>   
> 
> 
> 
> vGPU_UUID
> 
> 
> 
> 
> So the mediated device is identified by the physical parent device visible on
> the host and a UUID which allows us to construct the sysfs path by ourselves,
> which we then put on the QEMU's command line.

Based on your test code, I think you're creating something like this:

-device 
vfio-pci,sysfsdev=/sys/class/mdev_bus/:00:03.0/53764d0e-85a0-42b4-af5c-2046b460b1dc

That would explain the need for the parent device address, but that's
an entirely self inflicted requirement.  For a managed="no" scenarios,
we shouldn't need the parent, we can get to the mdev device
via /sys/bus/mdev/devices/53764d0e-85a0-42b4-af5c-2046b460b1dc.  So it
seems that the UUID should be the only required source element for
managed="no".

For managed="yes", it seems like the parent device is still an optional
field.  The most important thing that libvirt needs to know when
creating a mdev device for a VM is the mdev type name.  The parent
device should be an optional field to help higher level management
tools deal with placement of the device for locality or load balancing.
Also, we can't assume that the parent device is a PCI device, the
sample mtty driver already breaks this assumption.

Also, grep'ing through the patches, I don't see that the "device_api"
file is being used to test that the mdev device actually exports the
vfio-pci API before making use of it with the QEMU vfio-pci driver.  We
don't yet have any examples to the contrary, but non vfio-pci mdev
devices are in development.  Just like we can't assume the parent
device type, we can't assume the API of an mdev device to the user.
Thanks,

Alex

> A few remarks if you actually happen to have a machine to test this on:
> - right now the mediated devices are one-time use only, i.e. they have to be
> recreated before every machine boot
> - I wouldn't recommend assigning multiple vGPUs to a single domain
> 
> Once this series is sorted out, we can then continue with 'managed=yes' where
> as Laine pointed out [1], we need to figure out how exactly should the
> management layer hint libvirt which vGPU type should be used for device
> instantiation.
> 
> [1] https://www.redhat.com/archives/libvir-list/2017-January/msg00287.html  
> 
> #pleaseshareyourfeedback
> 
> Thanks,
> Erik
> 
> Erik Skultety (16):
>   util: Introduce new module virmdev
>   conf: Introduce new hostdev device type mdev
>   docs: Update RNG schema to reflect the new hostdev type mdev
>   conf: Adjust the domain parser to work with mdevs
>   Adjust the formatter to reflect the new hostdev type mdev
>   security: dac: Enable labeling of vfio mediated devices
>   security: selinux: Enable labeling of vfio mediated devices
>   conf: Enable cold-plug of a mediated device
>   qemu: Assign PCI addresses for mediated devices as well
>   hostdev: Maintain a driver list of active mediated devices
>   hostdev: Introduce a reattach method for mediated devices
>   qemu: cgroup: Adjust cgroups' logic to allow mediated devices
>   qemu: namespace: Hook up the discovery of mdevs into the namespace
> code
>   qemu: Format mdevs on the qemu command line
>   test: Add some test cases for our test suite regarding the mdevs
>   docs: Document the new hostdev device type 'mdev'
> 
>  docs/formatdomain.html.in  |  40 ++-
>  docs/schemas/domaincommon.rng  |  17 +
>  po/POTFILES.in |   1 +
>  src/Makefile.am|   1 +
>  src/conf/domain_conf.c |  81 -
>  src/conf/domain_conf.h |  10 +
>  src/libvirt_private.syms   |  19 ++
>  src/qemu/qemu_cgroup.c |  35 ++
>  src/qemu/qemu_command.c|  49 +++
>  src/qemu/qemu_command.h|   5 +
>  src/qemu/qemu_domain.c |  13 +
>  src/qemu/qemu_domain_address.c |  12 +-
>  src/qemu/qemu_hostdev.c|  37 ++
>  src/qemu/qemu_hostdev.h|   8 +
>  src/qemu/qemu_hotplug.c|   2 +
>  src/security/security_apparmor.c   |   3 +
>  src/security/security_dac.c|  56 +++
>  src/security/security_selinux.c|  55 +++
>  src/util/virhostdev.c  | 

Re: [libvirt] [PATCH 01/16] util: Introduce new module virmdev

2017-02-06 Thread Michal Privoznik
On 06.02.2017 13:19, Erik Skultety wrote:
> Beside creation, disposal, getter, and setter methods the module exports
> methods to work with lists of mediated devices.
> 
> Signed-off-by: Erik Skultety 
> ---
>  po/POTFILES.in   |   1 +
>  src/Makefile.am  |   1 +
>  src/libvirt_private.syms |  17 +++
>  src/util/virmdev.c   | 375 
> +++
>  src/util/virmdev.h   |  85 +++
>  5 files changed, 479 insertions(+)
>  create mode 100644 src/util/virmdev.c
>  create mode 100644 src/util/virmdev.h
> 


> diff --git a/src/util/virmdev.c b/src/util/virmdev.c
> new file mode 100644
> index 000..0b70798
> --- /dev/null
> +++ b/src/util/virmdev.c
> @@ -0,0 +1,375 @@
> +/*
> + * virmdev.c: helper APIs for managing host MDEV devices
> + *
> + * Copyright (C) 2017-2018 Red Hat, Inc.
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library.  If not, see
> + * .
> + *
> + * Authors:
> + * Erik Skultety 
> + */
> +
> +#include 
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#include "virmdev.h"
> +#include "dirname.h"
> +#include "virlog.h"
> +#include "viralloc.h"
> +#include "vircommand.h"
> +#include "virerror.h"
> +#include "virfile.h"
> +#include "virkmod.h"
> +#include "virstring.h"
> +#include "virutil.h"
> +#include "viruuid.h"
> +#include "virhostdev.h"
> +
> +VIR_LOG_INIT("util.mdev");
> +
> +struct _virMediatedDevice {
> +char *path; /* sysfs path */
> +bool managed;
> +
> +char *used_by_drvname;
> +char *used_by_domname;
> +};
> +
> +struct _virMediatedDeviceList {
> +virObjectLockable parent;
> +
> +size_t count;
> +virMediatedDevicePtr *devs;
> +};
> +
> +
> +/* For virReportOOMError()  and virReportSystemError() */
> +#define VIR_FROM_THIS VIR_FROM_NONE
> +
> +static virClassPtr virMediatedDeviceListClass;
> +
> +static void virMediatedDeviceListDispose(void *obj);
> +
> +static int virMediatedOnceInit(void)
> +{
> +if (!(virMediatedDeviceListClass = 
> virClassNew(virClassForObjectLockable(),
> +   "virMediatedDeviceList",
> +   
> sizeof(virMediatedDeviceList),
> +   
> virMediatedDeviceListDispose)))
> +return -1;
> +
> +return 0;
> +}
> +
> +VIR_ONCE_GLOBAL_INIT(virMediated)
> +
> +#ifdef __linux__
> +# define MDEV_SYSFS "/sys/class/mdev_bus/"

Some functions in this block look not-linux specific at all. For
instance there's nothing linux specific about virMediatedDeviceFree(),
GetPath() and so on. Usually, if running on unsupported environment,
erroring out in New() or Create() methods is good. The rest of the code
is not platform/environment/stellar constellation specific.

> +
> +virMediatedDevicePtr
> +virMediatedDeviceNew(virPCIDeviceAddressPtr pciaddr, const char *uuidstr)
> +{
> +virMediatedDevicePtr dev = NULL, ret = NULL;
> +char *pcistr = NULL;
> +
> +if (virAsprintf(, "%.4x:%.2x:%.2x.%.1x", pciaddr->domain,
> +pciaddr->bus, pciaddr->slot, pciaddr->function) < 0)
> +return NULL;
> +
> +if (VIR_ALLOC(dev) < 0)
> +goto cleanup;
> +
> +if (virAsprintf(>path, MDEV_SYSFS "%s/%s", pcistr, uuidstr) < 0)
> +   goto cleanup;
> +
> +ret = dev;
> +dev = NULL;
> +
> + cleanup:
> +VIR_FREE(pcistr);
> +virMediatedDeviceFree(dev);
> +return ret;
> +}
> +
> +
> +void
> +virMediatedDeviceFree(virMediatedDevicePtr dev)
> +{
> +if (!dev)
> +return;
> +VIR_FREE(dev->path);
> +VIR_FREE(dev->used_by_drvname);
> +VIR_FREE(dev->used_by_domname);
> +VIR_FREE(dev);
> +}
> +
> +
> +const char *
> +virMediatedDeviceGetPath(virMediatedDevicePtr dev)
> +{
> +return dev->path;
> +}
> +
> +
> +/* Returns an absolute canonicalized path to the device used to control the
> + * mediated device's IOMMU group (e.g. "/dev/vfio/15")
> + */
> +char *
> +virMediatedDeviceGetIOMMUGroupDev(virMediatedDevicePtr dev)
> +{
> +char *resultpath = NULL;
> +char *iommu_linkpath = NULL;
> +char *vfio_path = NULL;
> +
> +if (virAsprintf(_linkpath, "%s/iommu_group", dev->path) < 0)
> +return 

Re: [libvirt] [PATCH 02/16] conf: Introduce new hostdev device type mdev

2017-02-06 Thread Michal Privoznik
On 06.02.2017 13:19, Erik Skultety wrote:
> A mediated device will be identified by the host PCI address of the
> parent physical device which is backing the given mediated device, and a
> UUID of the user pre-created mediated device. The data necessary to
> identify a mediated device can be easily extended in the future, once we
> need to enable managed='yes' in which case a hint from the upper
> management layer about which mediated device type (e.g. vGPU type)
> should an instance be created on.
> 
> Signed-off-by: Erik Skultety 
> ---
>  src/conf/domain_conf.c  |  7 ++-
>  src/conf/domain_conf.h  | 10 ++
>  src/qemu/qemu_cgroup.c  |  5 +
>  src/qemu/qemu_domain.c  |  1 +
>  src/qemu/qemu_hotplug.c |  2 ++
>  src/security/security_apparmor.c|  3 +++
>  src/security/security_dac.c |  2 ++
>  src/security/security_selinux.c |  2 ++
>  tests/domaincapsschemadata/full.xml |  1 +
>  9 files changed, 32 insertions(+), 1 deletion(-)
> 
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index c06b128..38ffc95 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -649,7 +649,8 @@ VIR_ENUM_IMPL(virDomainHostdevSubsys, 
> VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST,
>"usb",
>"pci",
>"scsi",
> -  "scsi_host")
> +  "scsi_host",
> +  "mdev")
>  
>  VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend,
>VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST,
> @@ -6453,6 +6454,8 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node,
>  if (virDomainHostdevSubsysSCSIVHostDefParseXML(sourcenode, def) < 0)
>  goto error;
>  break;
> +case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
> +break;
>  
>  default:
>  virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> @@ -13281,6 +13284,7 @@ virDomainHostdevDefParseXML(virDomainXMLOptionPtr 
> xmlopt,
>  }
>  break;
>  case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
> +case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
>  case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
>  break;
>  }
> @@ -14172,6 +14176,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a,
>  return 1;
>  else
>  return 0;
> +case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
>  case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
>  return 0;
>  }
> diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
> index 507ace8..3a1009a 100644
> --- a/src/conf/domain_conf.h
> +++ b/src/conf/domain_conf.h
> @@ -54,6 +54,7 @@
>  # include "virgic.h"
>  # include "virperf.h"
>  # include "virtypedparam.h"
> +# include "virpci.h"
>  
>  /* forward declarations of all device types, required by
>   * virDomainDeviceDef
> @@ -295,6 +296,7 @@ typedef enum {
>  VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI,
>  VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI,
>  VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST,
> +VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV,
>  
>  VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST
>  } virDomainHostdevSubsysType;
> @@ -369,6 +371,13 @@ struct _virDomainHostdevSubsysSCSI {
>  } u;
>  };
>  
> +typedef struct _virDomainHostdevSubsysMediatedDev 
> virDomainHostdevSubsysMediatedDev;
> +typedef virDomainHostdevSubsysMediatedDev 
> *virDomainHostdevSubsysMediatedDevPtr;
> +struct _virDomainHostdevSubsysMediatedDev {
> +virPCIDeviceAddress addr;   /* parent device's host address 
> */
> +char uuidstr[VIR_UUID_STRING_BUFLEN];   /* mediated device's uuid string 
> */

Either this or VIR_UUID_BUFLEN for storing UUID in raw format, which is
what we typically do. But I don't care that much.

> +};
> +

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 03/16] docs: Update RNG schema to reflect the new hostdev type mdev

2017-02-06 Thread Michal Privoznik
On 06.02.2017 13:19, Erik Skultety wrote:
> To keep the domain XML as much platform agnostic as possible, do not
> expose an element/attribute which would contain path directly to the
> syfs filesystem which the mediated devices are build upon. Instead,
> identify each mediated device by the parent physical device and a UUID
> which is an optional element, but only if managed='yes' which is not
> implemented yet.
> 
> Signed-off-by: Erik Skultety 
> ---
>  docs/schemas/domaincommon.rng | 17 +
>  1 file changed, 17 insertions(+)
> 

This one, 04/16, 05/16 and 16/16 should be merged together.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 08/16] conf: Enable cold-plug of a mediated device

2017-02-06 Thread Michal Privoznik
On 06.02.2017 13:19, Erik Skultety wrote:
> This merely introduces virDomainHostdevMatchSubsysMediatedDev method that
> is supposed to check whether device being cold-plugged does not already
> exist in the domain configuration.
> 
> Signed-off-by: Erik Skultety 
> ---
>  src/conf/domain_conf.c | 19 +++
>  1 file changed, 19 insertions(+)
> 
> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> index 9663350..aa1bd68 100644
> --- a/src/conf/domain_conf.c
> +++ b/src/conf/domain_conf.c
> @@ -14192,6 +14192,24 @@ 
> virDomainHostdevMatchSubsysSCSIiSCSI(virDomainHostdevDefPtr first,
>  }
>  
>  static int
> +virDomainHostdevMatchSubsysMediatedDev(virDomainHostdevDefPtr first,
> +   virDomainHostdevDefPtr second)
> +{
> +virDomainHostdevSubsysMediatedDevPtr first_mdevsrc =
> +>source.subsys.u.mdev;
> +virDomainHostdevSubsysMediatedDevPtr second_mdevsrc =
> +>source.subsys.u.mdev;

Had you gone with a/b, a_src/b_src this would nicely fit onto two lines.

> +
> +if (STREQ(first_mdevsrc->uuidstr, second_mdevsrc->uuidstr) &&
> +first_mdevsrc->addr.domain == second_mdevsrc->addr.domain &&
> +first_mdevsrc->addr.bus == second_mdevsrc->addr.bus &&
> +first_mdevsrc->addr.slot == second_mdevsrc->addr.slot &&
> +first_mdevsrc->addr.function == second_mdevsrc->addr.function)

Huh, we don't have virPCIAddrMatch or virPCIAddrCompare. :(

> +return 1;
> +return 0;
> +}
> +
> +static int
>  virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a,
>  virDomainHostdevDefPtr b)
>  {
> @@ -14222,6 +14240,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a,
>  else
>  return 0;
>  case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
> +return virDomainHostdevMatchSubsysMediatedDev(a, b);
>  case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
>  return 0;
>  }
> 

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [RFC PATCH 00/16] Introduce vGPU mdev framework to libvirt

2017-02-06 Thread Michal Privoznik
On 06.02.2017 13:19, Erik Skultety wrote:
> Finally. It's here. This is the initial suggestion on how libvirt might
> interract with the mdev framework, currently only focussing on the non-managed
> devices, i.e. those pre-created by the user, since that will be revisited once
> we all settled on how the XML should look like, given we might not want to use
> the sysfs path directly as an attribute in the domain XML. My proposal on the
> XML is the following:
> 
>   
> 
> 
> 
> vGPU_UUID
> 
> 
> 
> 
> So the mediated device is identified by the physical parent device visible on
> the host and a UUID which allows us to construct the sysfs path by ourselves,
> which we then put on the QEMU's command line.
> 
> A few remarks if you actually happen to have a machine to test this on:
> - right now the mediated devices are one-time use only, i.e. they have to be
> recreated before every machine boot
> - I wouldn't recommend assigning multiple vGPUs to a single domain
> 
> Once this series is sorted out, we can then continue with 'managed=yes' where
> as Laine pointed out [1], we need to figure out how exactly should the
> management layer hint libvirt which vGPU type should be used for device
> instantiation.
> 
> [1] https://www.redhat.com/archives/libvir-list/2017-January/msg00287.html  
> 
> #pleaseshareyourfeedback
> 
> Thanks,
> Erik
> 
> Erik Skultety (16):
>   util: Introduce new module virmdev
>   conf: Introduce new hostdev device type mdev
>   docs: Update RNG schema to reflect the new hostdev type mdev
>   conf: Adjust the domain parser to work with mdevs
>   Adjust the formatter to reflect the new hostdev type mdev
>   security: dac: Enable labeling of vfio mediated devices
>   security: selinux: Enable labeling of vfio mediated devices
>   conf: Enable cold-plug of a mediated device
>   qemu: Assign PCI addresses for mediated devices as well
>   hostdev: Maintain a driver list of active mediated devices
>   hostdev: Introduce a reattach method for mediated devices
>   qemu: cgroup: Adjust cgroups' logic to allow mediated devices
>   qemu: namespace: Hook up the discovery of mdevs into the namespace
> code
>   qemu: Format mdevs on the qemu command line
>   test: Add some test cases for our test suite regarding the mdevs
>   docs: Document the new hostdev device type 'mdev'
> 
>  docs/formatdomain.html.in  |  40 ++-
>  docs/schemas/domaincommon.rng  |  17 +
>  po/POTFILES.in |   1 +
>  src/Makefile.am|   1 +
>  src/conf/domain_conf.c |  81 -
>  src/conf/domain_conf.h |  10 +
>  src/libvirt_private.syms   |  19 ++
>  src/qemu/qemu_cgroup.c |  35 ++
>  src/qemu/qemu_command.c|  49 +++
>  src/qemu/qemu_command.h|   5 +
>  src/qemu/qemu_domain.c |  13 +
>  src/qemu/qemu_domain_address.c |  12 +-
>  src/qemu/qemu_hostdev.c|  37 ++
>  src/qemu/qemu_hostdev.h|   8 +
>  src/qemu/qemu_hotplug.c|   2 +
>  src/security/security_apparmor.c   |   3 +
>  src/security/security_dac.c|  56 +++
>  src/security/security_selinux.c|  55 +++
>  src/util/virhostdev.c  | 179 +-
>  src/util/virhostdev.h  |  16 +
>  src/util/virmdev.c | 375 
> +
>  src/util/virmdev.h |  85 +
>  tests/domaincapsschemadata/full.xml|   1 +
>  ...qemuxml2argv-hostdev-mdev-unmanaged-no-uuid.xml |  37 ++
>  .../qemuxml2argv-hostdev-mdev-unmanaged.args   |  25 ++
>  .../qemuxml2argv-hostdev-mdev-unmanaged.xml|  38 +++
>  tests/qemuxml2argvtest.c   |   6 +
>  .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml  |  41 +++
>  tests/qemuxml2xmltest.c|   1 +
>  29 files changed, 1239 insertions(+), 9 deletions(-)
>  create mode 100644 src/util/virmdev.c
>  create mode 100644 src/util/virmdev.h
>  create mode 100644 
> tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged-no-uuid.xml
>  create mode 100644 
> tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args
>  create mode 100644 
> tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml
>  create mode 100644 
> tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml
> 

I'm no expert in mdevs, but from code POV these look solid.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] spice-gl bug: Failed to initialize EGL render node for SPICE GL

2017-02-06 Thread Paul Kek
Hello,

I am having issues with getting spice-gl to work properly with libvirt.
There is a bugzilla report which describes the issue but until now there is 
still no solution (The patch didn't work on my system.)
As I also mentioned there, I tried adding the `renderD128` device to 
`cgroup_device_acl` in the qemu.conf file but this produced different errors:

qemu-system-x86_64: egl: eglGetDisplay failed
qemu-system-x86_64: egl: EGL_KHR_surfaceless_context not supported
qemu-system-x86_64: Failed to initialize EGL render node for SPICE GL

Paul--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [resend v2 4/7] Resctrl: Add private interface to set cachebanks

2017-02-06 Thread Daniel P. Berrange
On Mon, Feb 06, 2017 at 10:23:39AM +0800, Eli Qiao wrote:
> virResCtrlSetCacheBanks: Set cache banks of a libvirt domain. It will
> create new resource domain under `/sys/fs/resctrl` and fill the
> schemata according the cache banks configration.
> 
> virResCtrlUpdate: Update the schemata after libvirt domain destroy.
> 
> Signed-off-by: Eli Qiao 
> ---
>  src/libvirt_private.syms |   2 +
>  src/util/virresctrl.c| 644 
> ++-
>  src/util/virresctrl.h|  47 +++-
>  3 files changed, 691 insertions(+), 2 deletions(-)

> diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
> index 3cc41da..11f43d8 100644
> --- a/src/util/virresctrl.h
> +++ b/src/util/virresctrl.h
> @@ -26,6 +26,7 @@
>  
>  # include "virutil.h"
>  # include "virbitmap.h"
> +# include "domain_conf.h"
>  
>  #define RESCTRL_DIR "/sys/fs/resctrl"
>  #define RESCTRL_INFO_DIR "/sys/fs/resctrl/info"
> @@ -82,9 +83,53 @@ struct _virResCtrl {
>  virResCacheBankPtr  cache_banks;
>  };
>  
> +/**
> + * a virResSchemata represents a schemata object under a resource control
> + * domain.
> + */
> +typedef struct _virResSchemataItem virResSchemataItem;
> +typedef virResSchemataItem *virResSchemataItemPtr;
> +struct _virResSchemataItem {
> +unsigned int socket_no;
> +unsigned schemata;
> +};
> +
> +typedef struct _virResSchemata virResSchemata;
> +typedef virResSchemata *virResSchemataPtr;
> +struct _virResSchemata {
> +unsigned int n_schemata_items;
> +virResSchemataItemPtr schemata_items;
> +};
> +
> +/**
> + * a virResDomain represents a resource control domain. It's a double linked
> + * list.
> + */
> +
> +typedef struct _virResDomain virResDomain;
> +typedef virResDomain *virResDomainPtr;
> +
> +struct _virResDomain {
> +char* name;
> +virResSchemataPtr schematas[RDT_NUM_RESOURCES];
> +char* tasks;
> +int n_sockets;
> +virResDomainPtr pre;
> +virResDomainPtr next;
> +};
> +
> +/* All resource control domains on this host*/
> +
> +typedef struct _virResCtrlDomain virResCtrlDomain;
> +typedef virResCtrlDomain *virResCtrlDomainPtr;
> +struct _virResCtrlDomain {
> +unsigned int num_domains;
> +virResDomainPtr domains;
> +};

I don't think any of these need to be in the header file - they're
all private structs used only by the .c file.

The biggest issue I see here is a complete lack of any kind of
mutex locking. Libvirt is multi-threaded, so you must expect to
have virResCtrlSetCacheBanks() and virResCtrlUpdate called
concurrently and thus use mutexes to ensure safety.

With the data structure design of using linked list of virResDomain
though, doing good locking is going to be hard. It'll force you to
have a single global mutex across the whole set of data structures
which will really harm concurrency for VM startup.

Really each virResDomain needs to be completely standalone, with
its own dedicate mutex. A global mutex for virResCtrlDomain can
be acquired whle you lookup the virResDomain object to use. Thereafter
the global mutex should be released and only the mutex for the specific
domain used.

>  bool virResCtrlAvailable(void);
>  int virResCtrlInit(void);
>  virResCtrlPtr virResCtrlGet(int);
> -
> +int virResCtrlSetCacheBanks(virDomainCachetunePtr, unsigned char*, pid_t);
> +int virResCtrlUpdate(void);

This API design doesn't really make locking very easy - since you are not
passing any info into the virResCtrlUpdate() method you've created the
need to do global rescans when updating.

IMHO virResCtrlSetCacheBanks needs to return an object that represents the
config for that particular VM. This can be passed back in, when the guest
is shutdown to release resources once again, avoiding any global scans.

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [resend v2 1/7] Resctrl: Add some utils functions

2017-02-06 Thread Daniel P. Berrange
On Mon, Feb 06, 2017 at 10:23:36AM +0800, Eli Qiao wrote:
> This patch adds some utils struct and functions to expose resctrl
> information.
> 
> virResCtrlAvailable: If resctrl interface exist on host
> virResCtrlGet: get specify type resource contral information
> virResCtrlInit: initialize resctrl struct from the host's sys fs.
> ResCtrlAll[]: an array to maintain resource control information.
> 
> Signed-off-by: Eli Qiao 
> ---
>  include/libvirt/virterror.h |   1 +
>  src/Makefile.am |   1 +
>  src/libvirt_private.syms|   4 +
>  src/util/virerror.c |   1 +
>  src/util/virresctrl.c   | 330 
> 
>  src/util/virresctrl.h   |  89 
>  6 files changed, 426 insertions(+)
>  create mode 100644 src/util/virresctrl.c
>  create mode 100644 src/util/virresctrl.h

> diff --git a/src/util/virresctrl.c b/src/util/virresctrl.c
> new file mode 100644
> index 000..63bc808
> --- /dev/null
> +++ b/src/util/virresctrl.c
> @@ -0,0 +1,330 @@
> +/*
> + * virresctrl.c: methods for managing resource contral
> + *
> + * This library is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * This library is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with this library.  If not, see
> + * .
> + *
> + * Authors:
> + *  Eli Qiao 
> + */
> +#include 
> +
> +#include 
> +#if defined HAVE_SYS_SYSCALL_H
> +# include 
> +#endif
> +#include 
> +#include 
> +#include 
> +
> +#include "virresctrl.h"
> +#include "viralloc.h"
> +#include "virerror.h"
> +#include "virfile.h"
> +#include "virhostcpu.h"
> +#include "virlog.h"
> +#include "virstring.h"
> +#include "nodeinfo.h"
> +
> +VIR_LOG_INIT("util.resctrl");
> +
> +#define VIR_FROM_THIS VIR_FROM_RESCTRL
> +
> +static unsigned int host_id = 0;
> +
> +static virResCtrl ResCtrlAll[] = {

Lowercase for global variable names please.

> +{
> +.name = "L3",
> +.cache_level = "l3",
> +},
> +{
> +.name = "L3DATA",
> +.cache_level = "l3",
> +},
> +{
> +.name = "L3CODE",
> +.cache_level = "l3",
> +},
> +{
> +.name = "L2",
> +.cache_level = "l2",
> +},
> +};
> +
> +static int virResCtrlGetInfoStr(const int type, const char *item, char **str)
> +{
> +int ret = 0;
> +char *tmp;
> +char *path;
> +
> +if (asprintf(, "%s/%s/%s", RESCTRL_INFO_DIR, ResCtrlAll[type].name, 
> item) < 0)
> +return -1;

Use of asprintf is forbidden in libvirt - use virAsprintf.

Please make sure to run 'make check' and 'make syntax-check' on each
patch to catch this kind of policy error. There's quite a few other
style rules missed in these patches - i won't list them all since
'make syntax-check' can tell you.

> +if (virFileReadAll(path, 10, str) < 0) {
> +ret = -1;
> +goto cleanup;
> +}
> +
> +if ((tmp = strchr(*str, '\n'))) {
> +*tmp = '\0';
> +}
> +
> +cleanup:
> +VIR_FREE(path);
> +return ret;
> +}
> +
> +
> +static int virResCtrlGetCPUValue(const char* path, char** value)
> +{
> +int ret = -1;
> +char* tmp;

The '*' should be next to the variable name, not the type name.
Multiple other cases follow

> +
> +if(virFileReadAll(path, 10, value) < 0) {
> +goto cleanup;
> +}
> +if ((tmp = strchr(*value, '\n'))) {
> +*tmp = '\0';
> +}
> +ret = 0;
> +cleanup:
> +return ret;
> +}
> +


> +int virResCtrlInit(void) {
> +int i = 0;
> +char *tmp;
> +int rc = 0;
> +
> +for(i = 0; i < RDT_NUM_RESOURCES; i++) {
> +if ((rc = asprintf(, "%s/%s", RESCTRL_INFO_DIR, 
> ResCtrlAll[i].name)) < 0) {
> +continue;

Silently ignoring OOM is not good - please return a proper error - using
virAsprintf would do so.

> +}
> +if (virFileExists(tmp)) {
> +if ((rc = virResCtrlGetConfig(i)) < 0 )
> +VIR_WARN("Ignor error while get config for %d", i);

Again don't ignore errors like this - this should be fatal.

> +}
> +
> +VIR_FREE(tmp);
> +}
> +return rc;
> +}
> +
> +bool virResCtrlAvailable(void) {
> +if (!virFileExists(RESCTRL_INFO_DIR))
> +return false;
> +return true;
> +}
> +
> +virResCtrlPtr virResCtrlGet(int type) {
> +return [type];
> +}


> diff --git a/src/util/virresctrl.h b/src/util/virresctrl.h
> new file mode 100644
> index 000..f713e66
> --- 

Re: [libvirt] [resend v2 2/7] Resctrl: expose cache information to capabilities

2017-02-06 Thread Daniel P. Berrange
On Mon, Feb 06, 2017 at 10:23:37AM +0800, Eli Qiao wrote:
> This patch expose cache information to host's capabilites xml.
> 
> For l3 cache allocation
> 
>   
> 
>   
>   
> 
>   
> 
> 
> For l3 cache allocation supported cdp(seperate data/code):
> 
>   
> 
> 
>   
>   
> 
> 
>   
> 
> 
> RFC on mailing list.
> https://www.redhat.com/archives/libvir-list/2017-January/msg00644.html
> 
> Signed-off-by: Eli Qiao 
> ---
>  src/conf/capabilities.c  | 56 
>  src/conf/capabilities.h  | 23 +++
>  src/libvirt_private.syms |  1 +
>  src/qemu/qemu_capabilities.c | 68 
> 
>  src/qemu/qemu_driver.c   |  4 +++
>  5 files changed, 152 insertions(+)
> 


> diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
> index 3247d25..23f416d 100644
> --- a/src/qemu/qemu_capabilities.c
> +++ b/src/qemu/qemu_capabilities.c
> @@ -45,6 +45,7 @@
>  #include "qemu_domain.h"
>  #define __QEMU_CAPSRIV_H_ALLOW__
>  #include "qemu_capspriv.h"
> +#include "virresctrl.h"
>  
>  #include 
>  #include 
> @@ -1098,7 +1099,71 @@ virQEMUCapsInitCPU(virCapsPtr caps,
>  goto cleanup;
>  }
>  
> +static int
> +virQEMUCapsInitCache(virCapsPtr caps)
> +{
> +int i, j;
> +virResCtrlPtr resctrl;
> +virCapsHostCacheBankPtr bank;
> +
> +for (i = 0; i < RDT_NUM_RESOURCES; i ++)
> +{
> +/* L3DATA and L3CODE share L3 resources */
> +if ( i == RDT_RESOURCE_L3CODE )
> +continue;
>  
> +resctrl = virResCtrlGet(i);
> +
> +if(resctrl->enabled) {
> +for( j = 0; j < resctrl->num_banks; j++)
> +{
> +if(VIR_RESIZE_N(caps->host.cachebank, 
> caps->host.ncachebank_max,
> +caps->host.ncachebank, 1) < 0)
> +return -1;
> +
> +if(VIR_ALLOC(bank) < 0)
> +return -1;
> +
> +bank->id = resctrl->cache_banks[j].host_id;
> +if(VIR_STRDUP(bank->type, resctrl->cache_level) < 0)
> +goto err;
> +if(VIR_STRDUP(bank->cpus, 
> virBitmapFormat(resctrl->cache_banks[j].cpu_mask)) < 0)
> +goto err;
> +bank->size = resctrl->cache_banks[j].cache_size;
> +/*L3DATA and L3CODE shares L3 cache resources, so fill them 
> to the control element*/
> +if ( i == RDT_RESOURCE_L3DATA ) {
> +if(VIR_EXPAND_N(bank->control, bank->ncontrol, 2) < 0)
> +goto err;
> +
> +bank->control[0].min = 
> virResCtrlGet(RDT_RESOURCE_L3DATA)->cache_banks[j].cache_min;
> +bank->control[0].reserved = bank->control[0].min * 
> (virResCtrlGet(RDT_RESOURCE_L3DATA)->min_cbm_bits);
> +if(VIR_STRDUP(bank->control[0].scope,
> +  virResCtrlGet(RDT_RESOURCE_L3DATA)->name) 
> < 0)
> +goto err;
> +
> +bank->control[1].min = 
> virResCtrlGet(RDT_RESOURCE_L3CODE)->cache_banks[j].cache_min;
> +bank->control[1].reserved = bank->control[1].min * 
> (virResCtrlGet(RDT_RESOURCE_L3CODE)->min_cbm_bits);
> +if(VIR_STRDUP(bank->control[1].scope,
> +  virResCtrlGet(RDT_RESOURCE_L3CODE)->name) 
> < 0)
> +goto err;
> +}
> +else {
> +if(VIR_EXPAND_N(bank->control, bank->ncontrol, 1) < 0)
> +goto err;
> +bank->control[0].min = resctrl->cache_banks[j].cache_min;
> +bank->control[0].reserved = bank->control[0].min * 
> resctrl->min_cbm_bits;
> +if(VIR_STRDUP(bank->control[0].scope, resctrl->name) < 0)
> +goto err;
> +}
> +caps->host.cachebank[caps->host.ncachebank++] = bank;
> +}
> +}
> +}
> +return 0;
> +err:
> +VIR_FREE(bank);
> +return -1;
> +}

I don't think this code should be in the QEMU driver - better to have
it in nodeinfo.c so it is common to all drivers.


Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [resend v2 0/7] Support cache tune in libvirt

2017-02-06 Thread Marcelo Tosatti
On Mon, Feb 06, 2017 at 10:23:35AM +0800, Eli Qiao wrote:
> This series patches are for supportting CAT featues, which also
> called cache tune in libvirt.
> 
> First to expose cache information which could be tuned in capabilites XML.
> Then add new domain xml element support to add cacahe bank which will apply
> on this libvirt domain.
> 
> This series patches add a util file `resctrl.c/h`, an interface to talk with
> linux kernel's sys fs.
> 
> There are still some TODOs such as expose new public interface to get free
> cache information.
> 
> Some discussion about this feature support can be found from:
> https://www.redhat.com/archives/libvir-list/2017-January/msg00644.html

Two comments:

1) Please perform appropriate filesystem locking when accessing
resctrlfs, as described at:

http://www.spinics.net/lists/linux-tip-commits/msg36754.html

2) 



[b4c270b5-e0f9-4106-a446-69032872ed7d]# cat tasks  
8654
[b4c270b5-e0f9-4106-a446-69032872ed7d]# pstree -p | grep qemu
   |-qemu-kvm(8654)-+-{qemu-kvm}(8688)
   ||-{qemu-kvm}(8692)
   |`-{qemu-kvm}(8693)

Should add individual vcpus to the "tasks" file, not the main QEMU
process.

The NFV usecase requires exclusive CAT allocation for the vcpu which
runs the sensitive workload.

Perhaps:

 

Adds all vcpus that are pinned to the socket which cachebank with
host_id=1.

 

Adds PID of vcpus 2 and 3 to resctrl directory created for this
allocation.





--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Issue with libvirtd

2017-02-06 Thread Michal Privoznik
On 06.02.2017 14:39, Boris Fiuczynski wrote:
> On 01/10/2017 10:19 AM, Michal Privoznik wrote:
>> On 01/10/2017 07:48 AM, Faysal Ali wrote:
>>> Hi Michal,
>>
>> [It is usually good idea to keep the list CCed - it may help others
>> finding a solution to their problems]
>>
>>>
>>> Well I have created my little python/libivrt app to manage my virtual
>>> machines. I am using python socket to check libivrt port
>>> availability, and
>>> the error End of file while reading data: Input/output error* only
>>> happens
>>> whenever that python socket script is trigger.
>>>
>>> Here is the script of python socket
>>>
>>> import libvirt, socket, sys
>>>
>>> hostname='kvm09'
>>> port = 16509
>>>
>>> try:
>>>   socket_host = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
>>>   socket_host.settimeout(1)
>>>   socket_host.connect((hostname, port))
>>>   socket_host.close()
>>>   print 'true'
>>> except Exception as err:
>>>   print err
>>
>> Ah, I haven't expected this. Of course your are seeing the error
>> message. Libvirt has its own protocol on the top of TCP and since you
>> are connecting and dying immediately - without sending any valid libvirt
>> packet, the daemon logs an error. This is perfectly expected.
>>
>> Also, this is *not* how you connect to libvirt. You want to open a
>> libvirt connection:
>>
>> conn = libvirt.open(uri)
>> dom = conn.lookupByName(name)
>> ...
>>
>> Michal
> 
> Michal,
> my expectation is that virsh is using a libvirt connection.
> Anyway I am getting the error whenever I use virsh start, virsh destroy
> or virsh shutdown on a domain, e.g.
> 
> Feb 06 14:01:30 s38lp24 virtlogd[42845]: 2017-02-06 13:01:30.675+:
> 42845: error : virNetSocketReadWire:1800 : End of file while reading
> data: Input/output error
> Feb 06 14:04:32 s38lp24 virtlogd[42845]: 2017-02-06 13:04:32.991+:
> 42845: error : virNetSocketReadWire:1800 : End of file while reading
> data: Input/output error
> 
> Would that also be covered under "This is perfectly expected."?
> 
> 

No. this shouldn't happen. Although, I'm not that familiar with the code
to tell why & what should be fixed.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] Issue with libvirtd

2017-02-06 Thread Boris Fiuczynski

On 01/10/2017 10:19 AM, Michal Privoznik wrote:

On 01/10/2017 07:48 AM, Faysal Ali wrote:

Hi Michal,


[It is usually good idea to keep the list CCed - it may help others
finding a solution to their problems]



Well I have created my little python/libivrt app to manage my virtual
machines. I am using python socket to check libivrt port availability, and
the error End of file while reading data: Input/output error* only happens
whenever that python socket script is trigger.

Here is the script of python socket

import libvirt, socket, sys

hostname='kvm09'
port = 16509

try:
  socket_host = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
  socket_host.settimeout(1)
  socket_host.connect((hostname, port))
  socket_host.close()
  print 'true'
except Exception as err:
  print err


Ah, I haven't expected this. Of course your are seeing the error
message. Libvirt has its own protocol on the top of TCP and since you
are connecting and dying immediately - without sending any valid libvirt
packet, the daemon logs an error. This is perfectly expected.

Also, this is *not* how you connect to libvirt. You want to open a
libvirt connection:

conn = libvirt.open(uri)
dom = conn.lookupByName(name)
...

Michal


Michal,
my expectation is that virsh is using a libvirt connection.
Anyway I am getting the error whenever I use virsh start, virsh destroy 
or virsh shutdown on a domain, e.g.


Feb 06 14:01:30 s38lp24 virtlogd[42845]: 2017-02-06 13:01:30.675+: 
42845: error : virNetSocketReadWire:1800 : End of file while reading 
data: Input/output error
Feb 06 14:04:32 s38lp24 virtlogd[42845]: 2017-02-06 13:04:32.991+: 
42845: error : virNetSocketReadWire:1800 : End of file while reading 
data: Input/output error


Would that also be covered under "This is perfectly expected."?


--
Mit freundlichen Grüßen/Kind regards
   Boris Fiuczynski

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martina Köderitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2 0/4] Support setting MTU in libvirt networks

2017-02-06 Thread Michal Privoznik
On 03.02.2017 18:35, Laine Stump wrote:
> Similar to V1, but changing the XML to match Michal's patches, and
> adding a patch that plumbs network MTU back through to the qemu
> host_mtu option.
> 
> Laine Stump (4):
>   util: add MTU arg to virNetDevTapCreateInBridgePort()
>   conf: support configuring mtu size in a virtual network
>   network: honor mtu setting when creating network
>   qemu: propagate bridge MTU into qemu "host_mtu" option
> 
>  docs/formatnetwork.html.in  | 19 ++-
>  docs/news.xml   |  5 +++--
>  docs/schemas/network.rng|  5 +
>  src/bhyve/bhyve_command.c   |  1 +
>  src/conf/network_conf.c | 31 ++-
>  src/conf/network_conf.h |  1 +
>  src/network/bridge_driver.c |  1 +
>  src/qemu/qemu_command.c | 32 ++--
>  src/qemu/qemu_command.h |  3 ++-
>  src/qemu/qemu_hotplug.c |  5 +++--
>  src/qemu/qemu_interface.c   |  4 +++-
>  src/qemu/qemu_interface.h   |  3 ++-
>  src/uml/uml_conf.c  |  1 +
>  src/util/virnetdevtap.c | 32 +++-
>  src/util/virnetdevtap.h |  2 ++
>  tests/bhyvexml2argvmock.c   |  2 ++
>  tests/networkxml2xmlin/set-mtu.xml  | 12 
>  tests/networkxml2xmlout/set-mtu.xml | 12 
>  tests/networkxml2xmltest.c  |  1 +
>  19 files changed, 148 insertions(+), 24 deletions(-)
>  create mode 100644 tests/networkxml2xmlin/set-mtu.xml
>  create mode 100644 tests/networkxml2xmlout/set-mtu.xml
> 

ACK series. Regarding the migration issue - that is pre-existing and
thus does not hold back pushing of these patches.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2 4/4] qemu: propagate bridge MTU into qemu "host_mtu" option

2017-02-06 Thread Michal Privoznik
On 03.02.2017 18:35, Laine Stump wrote:
> libvirt was able to set the host_mtu option when an MTU was explicitly
> given in the interface config (with ), set the MTU of a
> libvirt network in the network config (with the same named
> subelement), and would automatically set the MTU of any tap device to
> the MTU of the network.
> 
> This patch ties that all together (for networks based on tap devices
> and either Linux host bridges or OVS bridges) by learning the MTU of
> the network (i.e. the bridge) during qemuInterfaceBridgeConnect(), and
> returning that value so that it can be passed to qemuBuildNicDevStr();
> qemuBuildNicDevStr() then sets host_mtu in the interface's commandline
> options.
> 
> The result is that a higher MTU for all guests connecting to a
> particular network will be plumbed top to bottom by simply changing
> the MTU of the network (in libvirt's config for libvirt-managed
> networks, or directly on the bridge device for simple host bridges or
> OVS bridges managed outside of libvirt).
> 
> One question I have about this - it occurred to me that in the case of
> migrating a guest from a host with an older libvirt to one with a
> newer libvirt, the guest may have *not* had the host_mtu option on the
> older machine, but *will* have it on the newer machine. I'm curious if
> this could lead to incompatibilities between source and destination (I
> guess it all depends on whether or not the setting of host_mtu has a
> practical effect on a guest that is already running - Maxime?) (I hope
> we don't have to add a "" and only set host_mtu when
> that is present :-/)

This is a question for qemu folks. I know nothing about qemu internals.

> 
> Likewise, we could run into problems when migrating from a newer
> libvirt to older libvirt - The guest would have been told of the
> higher MTU on the newer libvirt, then migrated to a host that didn't
> understand . (If this really is a problem, it would
> be a problem with or without the current patch).


Well, if it turns out to be a problem there's yet another variation of it: 
users can supply new domain XML upon migration and thus change MTU. But that 
should be easy to check (not that we are doing that now).

> ---
> 
> New in V2.
> 
>  src/qemu/qemu_command.c   | 32 ++--
>  src/qemu/qemu_command.h   |  3 ++-
>  src/qemu/qemu_hotplug.c   |  5 +++--
>  src/qemu/qemu_interface.c |  5 +++--
>  src/qemu/qemu_interface.h |  3 ++-
>  5 files changed, 32 insertions(+), 16 deletions(-)
> 
> diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
> index 6d65872..522152d 100644
> --- a/src/qemu/qemu_command.c
> +++ b/src/qemu/qemu_command.c
> @@ -3555,7 +3555,8 @@ qemuBuildNicDevStr(virDomainDefPtr def,
> int vlan,
> unsigned int bootindex,
> size_t vhostfdSize,
> -   virQEMUCapsPtr qemuCaps)
> +   virQEMUCapsPtr qemuCaps,
> +   unsigned int mtu)
>  {
>  virBuffer buf = VIR_BUFFER_INITIALIZER;
>  const char *nic = net->model;
> @@ -3679,13 +3680,23 @@ qemuBuildNicDevStr(virDomainDefPtr def,
>  virBufferAsprintf(, ",rx_queue_size=%u", 
> net->driver.virtio.rx_queue_size);
>  }
>  
> -if (usingVirtio && net->mtu) {
> -if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_NET_HOST_MTU)) {
> -virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> -   _("setting MTU is not supported with this QEMU 
> binary"));
> -goto error;
> +if (usingVirtio && mtu) {
> +if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_VIRTIO_NET_HOST_MTU)) {
> +
> +virBufferAsprintf(, ",host_mtu=%u", mtu);
> +
> +} else {
> +/* log an error if mtu was requested specifically for this
> + * interface, otherwise, if it's just what was reported by
> + * the attached network, ignore it.
> + */
> +if (net->mtu) {
> +virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> +   _("setting MTU is not supported with "
> + "this QEMU binary"));
> +goto error;
> +}
>  }

This requires users to pass net->mtu even though net is already passed into 
this function. I wonder whether we should alter meaning of @mtu argument 
slightly. Something like you're going with in 1/4:
- a nonzero value means caller is requesting that particular MTU size
- a zero value means stick with net->mtu value.

Although, now that I've tried and changed code accordingly the difference is 
just one line changed (apart from these line above):

index 0767c6649..a556dc60a 100644
--- i/src/qemu/qemu_command.c
+++ w/src/qemu/qemu_command.c
@@ -3680,10 +3680,10 @@ qemuBuildNicDevStr(virDomainDefPtr def,
 virBufferAsprintf(, ",rx_queue_size=%u", 
net->driver.virtio.rx_queue_size);
 }
 
-if (usingVirtio && mtu) {
+if 

Re: [libvirt] [Block Replication] Question about supporting COLO in libvirt

2017-02-06 Thread Daniel P. Berrange
On Mon, Feb 06, 2017 at 08:34:28PM +0800, Hailiang Zhang wrote:
> Hi,
> I'm trying to implement supporting COLO in libvirt,
> But i found an annoying problem that libvirt does not
> support the command line option argument syntax we used
> for block replication in QEMU.
> 
> That is libvirt does not support the bellow syntax for block:
> -drive driver=qcow2,file.filename=test:a.qcow2
> -drive file=test.qcow2,\
> backing.file.filename=/dev/fdset/1
> 
> It seems to be a new syntax that libvirt does not support
> thought it has been exist in QEMU for a time.
> I found some introductions from
> http://www.linux-kvm.org/images/3/34/Kvm-forum-2013-block-dev-configuration.pdf
> 
> The command line we use for COLO is just like the above syntax,
> For example, for the shared disk in COLO, it is:
>  -drive 
> if=none,driver=qcow2,file.filename=/mnt/ramfs/hidden_disk.img,id=hidden_disk0,\
> backing.driver=raw,backing.file.filename=1.raw \
>  -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\
> file.driver=qcow2,top-id=active-disk0,\
> file.file.filename=/mnt/ramfs/active_disk.img,\
> file.backing=hidden_disk0,shared-disk=on
> 
> For the none-shared disk in COLO, it is quite same with the shared-disk:
>   -drive if=none,driver=raw,file.filename=1.raw,id=colo1 \
>   -drive if=xxx,id=topxxx,driver=replication,mode=secondary,top-id=topxxx\
>  file.file.filename=active_disk.qcow2,\
>  file.driver=qcow2,\
>  file.backing.file.filename=hidden_disk.qcow2,\
>  file.backing.driver=qcow2,\
>  file.backing.backing=colo1
> 
> So there seems to be two ways to solve this problem.
> 
> One is to support this new option argument syntax in libvirt,
> but I'm not sure if it is difficult or not to implement it,
> and i don't know where to start either.

Libvirt has to start supporting this new syntax. It is required for
many different use cases beyond just colo. For example, to be able
to explicitly given qemu details about a backing file format to
remove probing, or to be able to set LUKS passwords on backing files,
and more beside

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://entangle-photo.org   -o-http://search.cpan.org/~danberr/ :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [Block Replication] Question about supporting COLO in libvirt

2017-02-06 Thread Hailiang Zhang

Hi,
I'm trying to implement supporting COLO in libvirt,
But i found an annoying problem that libvirt does not
support the command line option argument syntax we used
for block replication in QEMU.

That is libvirt does not support the bellow syntax for block:
-drive driver=qcow2,file.filename=test:a.qcow2
-drive file=test.qcow2,\
backing.file.filename=/dev/fdset/1

It seems to be a new syntax that libvirt does not support
thought it has been exist in QEMU for a time.
I found some introductions from
http://www.linux-kvm.org/images/3/34/Kvm-forum-2013-block-dev-configuration.pdf

The command line we use for COLO is just like the above syntax,
For example, for the shared disk in COLO, it is:
 -drive 
if=none,driver=qcow2,file.filename=/mnt/ramfs/hidden_disk.img,id=hidden_disk0,\
backing.driver=raw,backing.file.filename=1.raw \
 -drive if=virtio,id=active-disk0,driver=replication,mode=secondary,\
file.driver=qcow2,top-id=active-disk0,\
file.file.filename=/mnt/ramfs/active_disk.img,\
file.backing=hidden_disk0,shared-disk=on

For the none-shared disk in COLO, it is quite same with the shared-disk:
  -drive if=none,driver=raw,file.filename=1.raw,id=colo1 \
  -drive if=xxx,id=topxxx,driver=replication,mode=secondary,top-id=topxxx\
 file.file.filename=active_disk.qcow2,\
 file.driver=qcow2,\
 file.backing.file.filename=hidden_disk.qcow2,\
 file.backing.driver=qcow2,\
 file.backing.backing=colo1

So there seems to be two ways to solve this problem.

One is to support this new option argument syntax in libvirt,
but I'm not sure if it is difficult or not to implement it,
and i don't know where to start either.

Another way is to convert these command line options in QEMU totally,
I mean hidden the descriptions of 'active_disk' and 'hidden_disk' disks.
Create/add them dynamicly by qmp commands while users want to make VM goes
into COLO state. That's just like to support live image clone in QEMU.

Any ideas ?


Thanks,
Hailiang

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 05/16] Adjust the formatter to reflect the new hostdev type mdev

2017-02-06 Thread Erik Skultety
Signed-off-by: Erik Skultety 
---
 src/conf/domain_conf.c | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index f7bdd7a..9663350 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -21140,6 +21140,7 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
 virDomainHostdevSubsysSCSIHostPtr scsihostsrc = >u.host;
 virDomainHostdevSubsysSCSIiSCSIPtr iscsisrc = >u.iscsi;
 
@@ -21244,6 +21245,15 @@ virDomainHostdevDefFormatSubsys(virBufferPtr buf,
 break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST:
 break;
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+if (virPCIDeviceAddressFormat(buf, mdevsrc->addr,
+  includeTypeInAddr) != 0) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("PCI address Formatting failed"));
+return -1;
+}
+virBufferAsprintf(buf, "%s\n", mdevsrc->uuidstr);
+break;
 default:
 virReportError(VIR_ERR_INTERNAL_ERROR,
_("unexpected hostdev type %d"),
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 04/16] conf: Adjust the domain parser to work with mdevs

2017-02-06 Thread Erik Skultety
As for the parser, the uuid element is optional and a UUID will be
generated automatically if missing unless the device is unmanaged
(default) in which case the element is mandatory, otherwise libvirt
wouldn't have means to identify the device uniquely.

Signed-off-by: Erik Skultety 
---
 src/conf/domain_conf.c | 45 +
 1 file changed, 45 insertions(+)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 38ffc95..f7bdd7a 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -6326,6 +6326,49 @@ virDomainHostdevSubsysSCSIVHostDefParseXML(xmlNodePtr 
sourcenode,
 return ret;
 }
 
+static int
+virDomainHostdevSubsysMediatedDevDefParseXML(virDomainHostdevDefPtr def,
+ xmlXPathContextPtr ctxt)
+{
+int ret = -1;
+unsigned char uuid[VIR_UUID_BUFLEN] = {0};
+char *uuidxml = NULL;
+xmlNodePtr node = NULL;
+
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
+virPCIDeviceAddressPtr addr = >addr;
+
+node = virXPathNode("./source/address", ctxt);
+if (virPCIDeviceAddressParseXML(node, addr) < 0)
+return -1;
+
+uuidxml = virXPathString("string(./source/uuid)", ctxt);
+if (!uuidxml && !def->managed) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+   _("uuid element is mandatory for unmanaged devices"));
+goto cleanup;
+}
+
+if (uuidxml) {
+if (virUUIDParse(uuidxml, uuid) < 0) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   "%s", _("malformed uuid element"));
+goto cleanup;
+}
+} else {
+if (virUUIDGenerate(uuid)) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   "%s", _("Failed to generate UUID"));
+goto cleanup;
+}
+}
+
+virUUIDFormat(uuid, mdevsrc->uuidstr);
+ret = 0;
+ cleanup:
+VIR_FREE(uuidxml);
+return ret;
+}
 
 static int
 virDomainHostdevDefParseXMLSubsys(xmlNodePtr node,
@@ -6455,6 +6498,8 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node,
 goto error;
 break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+if (virDomainHostdevSubsysMediatedDevDefParseXML(def, ctxt) < 0)
+goto error;
 break;
 
 default:
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 02/16] conf: Introduce new hostdev device type mdev

2017-02-06 Thread Erik Skultety
A mediated device will be identified by the host PCI address of the
parent physical device which is backing the given mediated device, and a
UUID of the user pre-created mediated device. The data necessary to
identify a mediated device can be easily extended in the future, once we
need to enable managed='yes' in which case a hint from the upper
management layer about which mediated device type (e.g. vGPU type)
should an instance be created on.

Signed-off-by: Erik Skultety 
---
 src/conf/domain_conf.c  |  7 ++-
 src/conf/domain_conf.h  | 10 ++
 src/qemu/qemu_cgroup.c  |  5 +
 src/qemu/qemu_domain.c  |  1 +
 src/qemu/qemu_hotplug.c |  2 ++
 src/security/security_apparmor.c|  3 +++
 src/security/security_dac.c |  2 ++
 src/security/security_selinux.c |  2 ++
 tests/domaincapsschemadata/full.xml |  1 +
 9 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index c06b128..38ffc95 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -649,7 +649,8 @@ VIR_ENUM_IMPL(virDomainHostdevSubsys, 
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST,
   "usb",
   "pci",
   "scsi",
-  "scsi_host")
+  "scsi_host",
+  "mdev")
 
 VIR_ENUM_IMPL(virDomainHostdevSubsysPCIBackend,
   VIR_DOMAIN_HOSTDEV_PCI_BACKEND_TYPE_LAST,
@@ -6453,6 +6454,8 @@ virDomainHostdevDefParseXMLSubsys(xmlNodePtr node,
 if (virDomainHostdevSubsysSCSIVHostDefParseXML(sourcenode, def) < 0)
 goto error;
 break;
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+break;
 
 default:
 virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
@@ -13281,6 +13284,7 @@ virDomainHostdevDefParseXML(virDomainXMLOptionPtr 
xmlopt,
 }
 break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_USB:
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 break;
 }
@@ -14172,6 +14176,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a,
 return 1;
 else
 return 0;
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 return 0;
 }
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 507ace8..3a1009a 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -54,6 +54,7 @@
 # include "virgic.h"
 # include "virperf.h"
 # include "virtypedparam.h"
+# include "virpci.h"
 
 /* forward declarations of all device types, required by
  * virDomainDeviceDef
@@ -295,6 +296,7 @@ typedef enum {
 VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI,
 VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI,
 VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST,
+VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV,
 
 VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST
 } virDomainHostdevSubsysType;
@@ -369,6 +371,13 @@ struct _virDomainHostdevSubsysSCSI {
 } u;
 };
 
+typedef struct _virDomainHostdevSubsysMediatedDev 
virDomainHostdevSubsysMediatedDev;
+typedef virDomainHostdevSubsysMediatedDev 
*virDomainHostdevSubsysMediatedDevPtr;
+struct _virDomainHostdevSubsysMediatedDev {
+virPCIDeviceAddress addr;   /* parent device's host address */
+char uuidstr[VIR_UUID_STRING_BUFLEN];   /* mediated device's uuid string */
+};
+
 typedef enum {
 VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_NONE,
 VIR_DOMAIN_HOSTDEV_SUBSYS_SCSI_HOST_PROTOCOL_TYPE_VHOST,
@@ -394,6 +403,7 @@ struct _virDomainHostdevSubsys {
 virDomainHostdevSubsysPCI pci;
 virDomainHostdevSubsysSCSI scsi;
 virDomainHostdevSubsysSCSIVHost scsi_host;
+virDomainHostdevSubsysMediatedDev mdev;
 } u;
 };
 
diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 6c90d46..0902624 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -433,6 +433,9 @@ qemuSetupHostdevCgroup(virDomainObjPtr vm,
 break;
 }
 
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+break;
+
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 break;
 }
@@ -501,6 +504,8 @@ qemuTeardownHostdevCgroup(virDomainObjPtr vm,
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST:
 /* nothing to tear down for scsi_host */
 break;
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 break;
 }
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index c6ce090..3006d78 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -6901,6 +6901,7 @@ qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev,
 break;
 }
 
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 break;
 }
diff --git 

[libvirt] [PATCH 16/16] docs: Document the new hostdev device type 'mdev'

2017-02-06 Thread Erik Skultety
Signed-off-by: Erik Skultety 
---
 docs/formatdomain.html.in | 40 
 1 file changed, 36 insertions(+), 4 deletions(-)

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 77d8e71..10cef31 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -3765,6 +3765,20 @@
   /devices
   ...
 
+or:
+
+
+  ...
+  devices
+hostdev mode='subsystem' type='mdev'
+source
+  address domain='0x' bus='0x06' slot='0x02' function='0x0'/
+  uuidc2177883-f1bb-47f0-914d-32a22e3a8804/uuid
+/source
+/hostdev
+  /devices
+  ...
+
 
   hostdev
   The hostdev element is the main container for describing
@@ -3809,12 +3823,20 @@
 type passes all LUNs presented by a single HBA to
 the guest.
   
+  mdev
+  For mediated devices (Since 3.1.0)
+  the managed attribute controls whether the mediated
+  device to be assigned is pre-created by the user or shall be created
+  automatically by libvirt using the information provided by the
+  source element. Currently, only the value "no" is
+  supported which is the same as omitting the attribute completely.
+  
 
 
-  Note: The managed attribute is only used with PCI 
devices
-  and is ignored by all the other device types, thus setting
-  managed explicitly with other than PCI device has the 
same
-  effect as omitting it.
+  Note: The managed attribute is only used with PCI and
+  mediated devices and is ignored by all the other device types, thus
+  setting managed explicitly with other than a PCI or a
+  mediated device has the same effect as omitting it.
 
   
   source
@@ -3879,6 +3901,16 @@
 is the vhost_scsi wwpn (16 hexadecimal digits with a prefix of
 "naa.") established in the host configfs.
   
+  mdev
+  Mediated devices (Since 3.1.0) are
+  described by the PCI address of the backing physical
+  parent device. Additionally, the mediated device is further 
identified
+  by the uuid element. The element is optional if
+  managed="yes", in which case the UUID shall be
+  generated automatically along with creation with the mediated
+  device. However, when managed="no" then 
uuid
+  element is mandatory.
+  
 
   
   vendor, product
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 14/16] qemu: Format mdevs on the qemu command line

2017-02-06 Thread Erik Skultety
Format the mediated devices on the qemu command line as
-device vfio-pci,sysfsdev='/path/to/device/in/syfs'.

Signed-off-by: Erik Skultety 
---
 src/qemu/qemu_command.c | 49 +
 src/qemu/qemu_command.h |  5 +
 2 files changed, 54 insertions(+)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 1396661..51a9298 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -57,6 +57,7 @@
 #include "virscsi.h"
 #include "virnuma.h"
 #include "virgic.h"
+#include "virmdev.h"
 #if defined(__linux__)
 # include 
 #endif
@@ -5135,6 +5136,35 @@ qemuBuildChrChardevStr(virLogManagerPtr logManager,
 return ret;
 }
 
+char *
+qemuBuildHostdevMediatedDevStr(const virDomainDef *def,
+   virDomainHostdevDefPtr dev,
+   virQEMUCapsPtr qemuCaps)
+{
+virBuffer buf = VIR_BUFFER_INITIALIZER;
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
+virMediatedDevicePtr mdev = NULL;
+char *ret = NULL;
+
+if (!(mdev = virMediatedDeviceNew(>addr, mdevsrc->uuidstr)))
+goto cleanup;
+
+virBufferAddLit(, "vfio-pci");
+virBufferAsprintf(, ",sysfsdev=%s", virMediatedDeviceGetPath(mdev));
+
+if (qemuBuildDeviceAddressStr(, def, dev->info, qemuCaps) < 0)
+goto cleanup;
+
+if (virBufferCheckError() < 0)
+goto cleanup;
+
+ret = virBufferContentAndReset();
+
+ cleanup:
+virBufferFreeAndReset();
+virMediatedDeviceFree(mdev);
+return ret;
+}
 
 static int
 qemuBuildHostdevCommandLine(virCommandPtr cmd,
@@ -5323,6 +5353,25 @@ qemuBuildHostdevCommandLine(virCommandPtr cmd,
 VIR_FREE(devstr);
 }
 }
+
+/* MDEV */
+if (hostdev->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS &&
+subsys->type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV) {
+
+if (!virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_VFIO_PCI)) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+   _("VFIO PCI device assignment is not "
+ "supported by this version of qemu"));
+return -1;
+}
+
+virCommandAddArg(cmd, "-device");
+if (!(devstr =
+  qemuBuildHostdevMediatedDevStr(def, hostdev, qemuCaps)))
+return -1;
+virCommandAddArg(cmd, devstr);
+VIR_FREE(devstr);
+}
 }
 
 return 0;
diff --git a/src/qemu/qemu_command.h b/src/qemu/qemu_command.h
index 3bcfdc6..e50b6f4 100644
--- a/src/qemu/qemu_command.h
+++ b/src/qemu/qemu_command.h
@@ -170,6 +170,11 @@ qemuBuildSCSIVHostHostdevDevStr(const virDomainDef *def,
 virQEMUCapsPtr qemuCaps,
 char *vhostfdName);
 
+char *
+qemuBuildHostdevMediatedDevStr(const virDomainDef *def,
+   virDomainHostdevDefPtr dev,
+   virQEMUCapsPtr qemuCaps);
+
 char *qemuBuildRedirdevDevStr(const virDomainDef *def,
   virDomainRedirdevDefPtr dev,
   virQEMUCapsPtr qemuCaps);
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 08/16] conf: Enable cold-plug of a mediated device

2017-02-06 Thread Erik Skultety
This merely introduces virDomainHostdevMatchSubsysMediatedDev method that
is supposed to check whether device being cold-plugged does not already
exist in the domain configuration.

Signed-off-by: Erik Skultety 
---
 src/conf/domain_conf.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9663350..aa1bd68 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -14192,6 +14192,24 @@ 
virDomainHostdevMatchSubsysSCSIiSCSI(virDomainHostdevDefPtr first,
 }
 
 static int
+virDomainHostdevMatchSubsysMediatedDev(virDomainHostdevDefPtr first,
+   virDomainHostdevDefPtr second)
+{
+virDomainHostdevSubsysMediatedDevPtr first_mdevsrc =
+>source.subsys.u.mdev;
+virDomainHostdevSubsysMediatedDevPtr second_mdevsrc =
+>source.subsys.u.mdev;
+
+if (STREQ(first_mdevsrc->uuidstr, second_mdevsrc->uuidstr) &&
+first_mdevsrc->addr.domain == second_mdevsrc->addr.domain &&
+first_mdevsrc->addr.bus == second_mdevsrc->addr.bus &&
+first_mdevsrc->addr.slot == second_mdevsrc->addr.slot &&
+first_mdevsrc->addr.function == second_mdevsrc->addr.function)
+return 1;
+return 0;
+}
+
+static int
 virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a,
 virDomainHostdevDefPtr b)
 {
@@ -14222,6 +14240,7 @@ virDomainHostdevMatchSubsys(virDomainHostdevDefPtr a,
 else
 return 0;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+return virDomainHostdevMatchSubsysMediatedDev(a, b);
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 return 0;
 }
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [RFC PATCH 00/16] Introduce vGPU mdev framework to libvirt

2017-02-06 Thread Erik Skultety
Finally. It's here. This is the initial suggestion on how libvirt might
interract with the mdev framework, currently only focussing on the non-managed
devices, i.e. those pre-created by the user, since that will be revisited once
we all settled on how the XML should look like, given we might not want to use
the sysfs path directly as an attribute in the domain XML. My proposal on the
XML is the following:

  



vGPU_UUID




So the mediated device is identified by the physical parent device visible on
the host and a UUID which allows us to construct the sysfs path by ourselves,
which we then put on the QEMU's command line.

A few remarks if you actually happen to have a machine to test this on:
- right now the mediated devices are one-time use only, i.e. they have to be
recreated before every machine boot
- I wouldn't recommend assigning multiple vGPUs to a single domain

Once this series is sorted out, we can then continue with 'managed=yes' where
as Laine pointed out [1], we need to figure out how exactly should the
management layer hint libvirt which vGPU type should be used for device
instantiation.

[1] https://www.redhat.com/archives/libvir-list/2017-January/msg00287.html  

#pleaseshareyourfeedback

Thanks,
Erik

Erik Skultety (16):
  util: Introduce new module virmdev
  conf: Introduce new hostdev device type mdev
  docs: Update RNG schema to reflect the new hostdev type mdev
  conf: Adjust the domain parser to work with mdevs
  Adjust the formatter to reflect the new hostdev type mdev
  security: dac: Enable labeling of vfio mediated devices
  security: selinux: Enable labeling of vfio mediated devices
  conf: Enable cold-plug of a mediated device
  qemu: Assign PCI addresses for mediated devices as well
  hostdev: Maintain a driver list of active mediated devices
  hostdev: Introduce a reattach method for mediated devices
  qemu: cgroup: Adjust cgroups' logic to allow mediated devices
  qemu: namespace: Hook up the discovery of mdevs into the namespace
code
  qemu: Format mdevs on the qemu command line
  test: Add some test cases for our test suite regarding the mdevs
  docs: Document the new hostdev device type 'mdev'

 docs/formatdomain.html.in  |  40 ++-
 docs/schemas/domaincommon.rng  |  17 +
 po/POTFILES.in |   1 +
 src/Makefile.am|   1 +
 src/conf/domain_conf.c |  81 -
 src/conf/domain_conf.h |  10 +
 src/libvirt_private.syms   |  19 ++
 src/qemu/qemu_cgroup.c |  35 ++
 src/qemu/qemu_command.c|  49 +++
 src/qemu/qemu_command.h|   5 +
 src/qemu/qemu_domain.c |  13 +
 src/qemu/qemu_domain_address.c |  12 +-
 src/qemu/qemu_hostdev.c|  37 ++
 src/qemu/qemu_hostdev.h|   8 +
 src/qemu/qemu_hotplug.c|   2 +
 src/security/security_apparmor.c   |   3 +
 src/security/security_dac.c|  56 +++
 src/security/security_selinux.c|  55 +++
 src/util/virhostdev.c  | 179 +-
 src/util/virhostdev.h  |  16 +
 src/util/virmdev.c | 375 +
 src/util/virmdev.h |  85 +
 tests/domaincapsschemadata/full.xml|   1 +
 ...qemuxml2argv-hostdev-mdev-unmanaged-no-uuid.xml |  37 ++
 .../qemuxml2argv-hostdev-mdev-unmanaged.args   |  25 ++
 .../qemuxml2argv-hostdev-mdev-unmanaged.xml|  38 +++
 tests/qemuxml2argvtest.c   |   6 +
 .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml  |  41 +++
 tests/qemuxml2xmltest.c|   1 +
 29 files changed, 1239 insertions(+), 9 deletions(-)
 create mode 100644 src/util/virmdev.c
 create mode 100644 src/util/virmdev.h
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged-no-uuid.xml
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml

-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 10/16] hostdev: Maintain a driver list of active mediated devices

2017-02-06 Thread Erik Skultety
Keep track of the assigned mediated devices the same way we do it for
the rest of hostdevs.

Signed-off-by: Erik Skultety 
---
 src/libvirt_private.syms |   1 +
 src/qemu/qemu_hostdev.c  |  22 +
 src/qemu/qemu_hostdev.h  |   4 ++
 src/util/virhostdev.c| 126 ++-
 src/util/virhostdev.h|   9 
 5 files changed, 161 insertions(+), 1 deletion(-)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 4047945..8921a53 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1707,6 +1707,7 @@ virHostdevPCINodeDeviceDetach;
 virHostdevPCINodeDeviceReAttach;
 virHostdevPCINodeDeviceReset;
 virHostdevPrepareDomainDevices;
+virHostdevPrepareMediatedDevices;
 virHostdevPreparePCIDevices;
 virHostdevPrepareSCSIDevices;
 virHostdevPrepareSCSIVHostDevices;
diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c
index 7cd49e4..45b731c 100644
--- a/src/qemu/qemu_hostdev.c
+++ b/src/qemu/qemu_hostdev.c
@@ -305,6 +305,24 @@ qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr driver,
 }
 
 int
+qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver,
+  const char *name,
+  virDomainHostdevDefPtr *hostdevs,
+  int nhostdevs)
+{
+virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr;
+
+if (!qemuHostdevHostSupportsPassthroughVFIO()) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+   _("host doesn't support VFIO PCI interface"));
+return -1;
+}
+
+return virHostdevPrepareMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME,
+name, hostdevs, nhostdevs);
+}
+
+int
 qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver,
 virDomainDefPtr def,
 virQEMUCapsPtr qemuCaps,
@@ -330,6 +348,10 @@ qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver,
def->hostdevs, def->nhostdevs) < 0)
 return -1;
 
+if (qemuHostdevPrepareMediatedDevices(driver, def->name,
+  def->hostdevs, def->nhostdevs) < 0)
+return -1;
+
 return 0;
 }
 
diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h
index 74a7d4f..9399241 100644
--- a/src/qemu/qemu_hostdev.h
+++ b/src/qemu/qemu_hostdev.h
@@ -59,6 +59,10 @@ int qemuHostdevPrepareSCSIVHostDevices(virQEMUDriverPtr 
driver,
const char *name,
virDomainHostdevDefPtr *hostdevs,
int nhostdevs);
+int qemuHostdevPrepareMediatedDevices(virQEMUDriverPtr driver,
+  const char *name,
+  virDomainHostdevDefPtr *hostdevs,
+  int nhostdevs);
 int qemuHostdevPrepareDomainDevices(virQEMUDriverPtr driver,
 virDomainDefPtr def,
 virQEMUCapsPtr qemuCaps,
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c
index 86ca8e0..7691eaa 100644
--- a/src/util/virhostdev.c
+++ b/src/util/virhostdev.c
@@ -1,6 +1,6 @@
 /* virhostdev.c: hostdev management
  *
- * Copyright (C) 2006-2007, 2009-2016 Red Hat, Inc.
+ * Copyright (C) 2006-2007, 2009-2017 Red Hat, Inc.
  * Copyright (C) 2006 Daniel P. Berrange
  * Copyright (C) 2014 SUSE LINUX Products GmbH, Nuernberg, Germany.
  *
@@ -147,6 +147,7 @@ virHostdevManagerDispose(void *obj)
 virObjectUnref(hostdevMgr->activeUSBHostdevs);
 virObjectUnref(hostdevMgr->activeSCSIHostdevs);
 virObjectUnref(hostdevMgr->activeSCSIVHostHostdevs);
+virObjectUnref(hostdevMgr->activeMediatedHostdevs);
 VIR_FREE(hostdevMgr->stateDir);
 }
 
@@ -174,6 +175,9 @@ virHostdevManagerNew(void)
 if (!(hostdevMgr->activeSCSIVHostHostdevs = virSCSIVHostDeviceListNew()))
 goto error;
 
+if (!(hostdevMgr->activeMediatedHostdevs = virMediatedDeviceListNew()))
+goto error;
+
 if (privileged) {
 if (VIR_STRDUP(hostdevMgr->stateDir, HOSTDEV_STATE_DIR) < 0)
 goto error;
@@ -1595,6 +1599,126 @@ virHostdevPrepareSCSIVHostDevices(virHostdevManagerPtr 
mgr,
 return -1;
 }
 
+static int
+virHostdevMarkMediatedDevices(virHostdevManagerPtr mgr,
+  const char *drvname,
+  const char *domname,
+  virMediatedDeviceListPtr list)
+{
+size_t i, j;
+unsigned int count;
+virMediatedDevicePtr tmp;
+
+virObjectLock(mgr->activeMediatedHostdevs);
+count = virMediatedDeviceListCount(list);
+
+for (i = 0; i < count; i++) {
+virMediatedDevicePtr mdev = virMediatedDeviceListGet(list, i);
+if ((tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs,
+ 

[libvirt] [PATCH 06/16] security: dac: Enable labeling of vfio mediated devices

2017-02-06 Thread Erik Skultety
Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks
in the sysfs (e.g. /sys/class/mdev_bus//iommu_group) which what
qemu actually gets formatted on the command line.

Signed-off-by: Erik Skultety 
---
 src/security/security_dac.c | 58 +++--
 1 file changed, 56 insertions(+), 2 deletions(-)

diff --git a/src/security/security_dac.c b/src/security/security_dac.c
index ecce1d3..93cb66f 100644
--- a/src/security/security_dac.c
+++ b/src/security/security_dac.c
@@ -33,6 +33,7 @@
 #include "virfile.h"
 #include "viralloc.h"
 #include "virlog.h"
+#include "virmdev.h"
 #include "virpci.h"
 #include "virusb.h"
 #include "virscsi.h"
@@ -856,6 +857,15 @@ virSecurityDACSetHostLabel(virSCSIVHostDevicePtr dev 
ATTRIBUTE_UNUSED,
 
 
 static int
+virSecurityDACSetMediatedDevLabel(virMediatedDevicePtr dev ATTRIBUTE_UNUSED,
+  const char *file,
+  void *opaque)
+{
+return virSecurityDACSetHostdevLabelHelper(file, opaque);
+}
+
+
+static int
 virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr,
   virDomainDefPtr def,
   virDomainHostdevDefPtr dev,
@@ -867,7 +877,9 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr,
 virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
 int ret = -1;
+virMediatedDevicePtr mdev = NULL;
 
 if (!priv->dynamicOwnership)
 return 0;
@@ -964,13 +976,26 @@ virSecurityDACSetHostdevLabel(virSecurityManagerPtr mgr,
 break;
 }
 
-case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: {
+char *vfio_dev = NULL;
+if (!(mdev = virMediatedDeviceNew(>addr, mdevsrc->uuidstr)))
+goto done;
+
+if (!(vfio_dev = virMediatedDeviceGetIOMMUGroupDev(mdev)))
+goto done;
+
+ret = virSecurityDACSetMediatedDevLabel(mdev, vfio_dev, );
+VIR_FREE(vfio_dev);
+break;
+}
+
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 ret = 0;
 break;
 }
 
  done:
+virMediatedDeviceFree(mdev);
 return ret;
 }
 
@@ -1018,6 +1043,15 @@ virSecurityDACRestoreHostLabel(virSCSIVHostDevicePtr dev 
ATTRIBUTE_UNUSED,
 return virSecurityDACRestoreFileLabel(priv, file);
 }
 
+static int
+virSecurityDACRestoreMediatedDevLabel(virMediatedDevicePtr dev 
ATTRIBUTE_UNUSED,
+  const char *file,
+  void *opaque)
+{
+virSecurityManagerPtr mgr = opaque;
+virSecurityDACDataPtr priv = virSecurityManagerGetPrivateData(mgr);
+return virSecurityDACRestoreFileLabel(priv, file);
+}
 
 static int
 virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr mgr,
@@ -1032,6 +1066,7 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr 
mgr,
 virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
 int ret = -1;
 
 secdef = virDomainDefGetSecurityLabelDef(def, SECURITY_DAC_NAME);
@@ -1120,7 +1155,26 @@ virSecurityDACRestoreHostdevLabel(virSecurityManagerPtr 
mgr,
 break;
 }
 
-case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: {
+char *vfiodev = NULL;
+virMediatedDevicePtr mdev = virMediatedDeviceNew(>addr,
+ mdevsrc->uuidstr);
+
+if (!mdev)
+goto done;
+
+if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) {
+virMediatedDeviceFree(mdev);
+goto done;
+}
+
+ret = virSecurityDACRestoreMediatedDevLabel(mdev, vfiodev, mgr);
+
+VIR_FREE(vfiodev);
+virMediatedDeviceFree(mdev);
+break;
+}
+
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 ret = 0;
 break;
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 15/16] test: Add some test cases for our test suite regarding the mdevs

2017-02-06 Thread Erik Skultety
For now, focus only on unmanaged devices, thus also testing whether the
uuid element is present or not, since in case of unmanaged mediated
devices it must be present.

Signed-off-by: Erik Skultety 
---
 ...qemuxml2argv-hostdev-mdev-unmanaged-no-uuid.xml | 37 +++
 .../qemuxml2argv-hostdev-mdev-unmanaged.args   | 25 +
 .../qemuxml2argv-hostdev-mdev-unmanaged.xml| 38 
 tests/qemuxml2argvtest.c   |  6 
 .../qemuxml2xmlout-hostdev-mdev-unmanaged.xml  | 41 ++
 tests/qemuxml2xmltest.c|  1 +
 6 files changed, 148 insertions(+)
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged-no-uuid.xml
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml

diff --git 
a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged-no-uuid.xml 
b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged-no-uuid.xml
new file mode 100644
index 000..0fda187
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged-no-uuid.xml
@@ -0,0 +1,37 @@
+
+  QEMUGuest2
+  c7a5fdbd-edaf-9455-926a-d65c16db1809
+  219136
+  219136
+  1
+  
+hvm
+
+  
+  
+  destroy
+  restart
+  destroy
+  
+/usr/bin/qemu
+
+  
+  
+  
+  
+
+
+
+
+
+
+
+
+
+  
+
+  
+
+
+  
+
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args 
b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args
new file mode 100644
index 000..be0017e
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.args
@@ -0,0 +1,25 @@
+LC_ALL=C \
+PATH=/bin \
+HOME=/home/test \
+USER=test \
+LOGNAME=test \
+QEMU_AUDIO_DRV=none \
+/usr/bin/qemu \
+-name QEMUGuest2 \
+-S \
+-M pc \
+-m 214 \
+-smp 1,sockets=1,cores=1,threads=1 \
+-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
+-nographic \
+-nodefconfig \
+-nodefaults \
+-monitor unix:/tmp/lib/domain--1-QEMUGuest2/monitor.sock,server,nowait \
+-no-acpi \
+-boot c \
+-usb \
+-drive file=/dev/HostVG/QEMUGuest2,format=raw,if=none,id=drive-ide0-0-0 \
+-device ide-drive,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 \
+-device vfio-pci,\
+sysfsdev=/sys/class/mdev_bus/:00:03.0/53764d0e-85a0-42b4-af5c-2046b460b1dc,\
+bus=pci.0,addr=0x3
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml 
b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml
new file mode 100644
index 000..025429d
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-hostdev-mdev-unmanaged.xml
@@ -0,0 +1,38 @@
+
+  QEMUGuest2
+  c7a5fdbd-edaf-9455-926a-d65c16db1809
+  219136
+  219136
+  1
+  
+hvm
+
+  
+  
+  destroy
+  restart
+  destroy
+  
+/usr/bin/qemu
+
+  
+  
+  
+  
+
+
+
+
+
+
+
+
+
+  
+
+53764d0e-85a0-42b4-af5c-2046b460b1dc
+  
+
+
+  
+
diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
index 3532cb5..12ed800 100644
--- a/tests/qemuxml2argvtest.c
+++ b/tests/qemuxml2argvtest.c
@@ -1469,6 +1469,12 @@ mymain(void)
 DO_TEST("hostdev-vfio-multidomain",
 QEMU_CAPS_NODEFCONFIG,
 QEMU_CAPS_DEVICE_VFIO_PCI, QEMU_CAPS_HOST_PCI_MULTIDOMAIN);
+DO_TEST("hostdev-mdev-unmanaged",
+QEMU_CAPS_NODEFCONFIG,
+QEMU_CAPS_DEVICE_VFIO_PCI);
+DO_TEST_PARSE_ERROR("hostdev-mdev-unmanaged-no-uuid",
+QEMU_CAPS_NODEFCONFIG,
+QEMU_CAPS_DEVICE_VFIO_PCI);
 DO_TEST_FAILURE("hostdev-vfio-multidomain",
 QEMU_CAPS_NODEFCONFIG, QEMU_CAPS_DEVICE_VFIO_PCI);
 DO_TEST("pci-rom",
diff --git a/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml 
b/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml
new file mode 100644
index 000..baa8b78
--- /dev/null
+++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-hostdev-mdev-unmanaged.xml
@@ -0,0 +1,41 @@
+
+  QEMUGuest2
+  c7a5fdbd-edaf-9455-926a-d65c16db1809
+  219136
+  219136
+  1
+  
+hvm
+
+  
+  
+  destroy
+  restart
+  destroy
+  
+/usr/bin/qemu
+
+  
+  
+  
+  
+
+
+  
+
+
+
+  
+
+
+
+
+  
+
+53764d0e-85a0-42b4-af5c-2046b460b1dc
+  
+  
+
+
+  
+
diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c
index 4f3b09a..740025d 100644
--- a/tests/qemuxml2xmltest.c
+++ b/tests/qemuxml2xmltest.c
@@ -550,6 +550,7 @@ mymain(void)
 DO_TEST("hostdev-usb-address", NONE);
 DO_TEST("hostdev-pci-address", NONE);
 DO_TEST("hostdev-vfio", NONE);
+

[libvirt] [PATCH 13/16] qemu: namespace: Hook up the discovery of mdevs into the namespace code

2017-02-06 Thread Erik Skultety
Again, as for all the other hostdev device types, make sure that the
/dev/vfio/ device will be added to the qemu namespace.

Signed-off-by: Erik Skultety 
---
 src/qemu/qemu_domain.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 3006d78..afc7d07 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -6824,10 +6824,12 @@ qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev,
 virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
 virPCIDevicePtr pci = NULL;
 virUSBDevicePtr usb = NULL;
 virSCSIDevicePtr scsi = NULL;
 virSCSIVHostDevicePtr host = NULL;
+virMediatedDevicePtr mdev = NULL;
 char *tmpPath = NULL;
 bool freeTmpPath = false;
 
@@ -6902,6 +6904,15 @@ qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev,
 }
 
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+if (!(mdev = virMediatedDeviceNew(>addr,
+  mdevsrc->uuidstr)))
+goto cleanup;
+
+if (!(tmpPath = virMediatedDeviceGetIOMMUGroupDev(mdev)))
+goto cleanup;
+
+freeTmpPath = true;
+break;
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 break;
 }
@@ -6922,6 +6933,7 @@ qemuDomainGetHostdevPath(virDomainHostdevDefPtr dev,
 virUSBDeviceFree(usb);
 virSCSIDeviceFree(scsi);
 virSCSIVHostDeviceFree(host);
+virMediatedDeviceFree(mdev);
 if (freeTmpPath)
 VIR_FREE(tmpPath);
 return ret;
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 09/16] qemu: Assign PCI addresses for mediated devices as well

2017-02-06 Thread Erik Skultety
So far, the official support is for x86_64 arch guests so unless a
different device API than vfio-pci is available let's assume PCI address
assignment. Once a different device API is introduced, we'll actually
have to start checking for the device API in the sysfs in order to
assign a correct type of address.

Signed-off-by: Erik Skultety 
---
 src/qemu/qemu_domain_address.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c
index 70482f2..71a2e5c 100644
--- a/src/qemu/qemu_domain_address.c
+++ b/src/qemu/qemu_domain_address.c
@@ -618,9 +618,11 @@ 
qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev,
 virPCIDeviceAddressPtr hostAddr = >source.subsys.u.pci.addr;
 
 if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS ||
-hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI) 
{
+(hostdev->source.subsys.type !=
+ VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI &&
+ hostdev->source.subsys.type !=
+ VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV))
 return 0;
-}
 
 if (pciFlags == pcieFlags) {
 /* This arch/qemu only supports legacy PCI, so there
@@ -642,6 +644,9 @@ 
qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev,
 return pcieFlags;
 }
 
+if (hostdev->source.subsys.type == VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV)
+return pcieFlags;
+
 if (!(pciDev = virPCIDeviceNew(hostAddr->domain,
hostAddr->bus,
hostAddr->slot,
@@ -1730,7 +1735,8 @@ qemuDomainAssignDevicePCISlots(virDomainDefPtr def,
 if (def->hostdevs[i]->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS)
 continue;
 if (def->hostdevs[i]->source.subsys.type != 
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI &&
-def->hostdevs[i]->source.subsys.type != 
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST)
+def->hostdevs[i]->source.subsys.type != 
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_SCSI_HOST &&
+def->hostdevs[i]->source.subsys.type != 
VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV)
 continue;
 
 if (qemuDomainPCIAddressReserveNextAddr(addrs,
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 12/16] qemu: cgroup: Adjust cgroups' logic to allow mediated devices

2017-02-06 Thread Erik Skultety
As goes for all the other hostdev device types, grant the qemu process
access to /dev/vfio/.

Signed-off-by: Erik Skultety 
---
 src/qemu/qemu_cgroup.c | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
index 0902624..405d6c1 100644
--- a/src/qemu/qemu_cgroup.c
+++ b/src/qemu/qemu_cgroup.c
@@ -318,6 +318,23 @@ qemuSetupHostSCSIVHostDeviceCgroup(virSCSIVHostDevicePtr 
dev ATTRIBUTE_UNUSED,
 return ret;
 }
 
+static int
+qemuSetupHostMediatedDeviceCgroup(virMediatedDevicePtr dev ATTRIBUTE_UNUSED,
+  const char *path,
+  void *opaque)
+{
+virDomainObjPtr vm = opaque;
+qemuDomainObjPrivatePtr priv = vm->privateData;
+int ret = -1;
+
+VIR_DEBUG("Process path '%s' for mediated device", path);
+ret = virCgroupAllowDevicePath(priv->cgroup, path,
+   VIR_CGROUP_DEVICE_RW, false);
+virDomainAuditCgroupPath(vm, priv->cgroup, "allow", path, "rw", ret == 0);
+
+return ret;
+}
+
 int
 qemuSetupHostdevCgroup(virDomainObjPtr vm,
virDomainHostdevDefPtr dev)
@@ -328,10 +345,12 @@ qemuSetupHostdevCgroup(virDomainObjPtr vm,
 virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
 virPCIDevicePtr pci = NULL;
 virUSBDevicePtr usb = NULL;
 virSCSIDevicePtr scsi = NULL;
 virSCSIVHostDevicePtr host = NULL;
+virMediatedDevicePtr mdev = NULL;
 char *path = NULL;
 
 /* currently this only does something for PCI devices using vfio
@@ -434,6 +453,16 @@ qemuSetupHostdevCgroup(virDomainObjPtr vm,
 }
 
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+if (!(mdev = virMediatedDeviceNew(>addr,
+  mdevsrc->uuidstr)))
+goto cleanup;
+
+if (!(path = virMediatedDeviceGetIOMMUGroupDev(mdev)))
+goto cleanup;
+
+if (qemuSetupHostMediatedDeviceCgroup(mdev, path, vm) < 0)
+goto cleanup;
+
 break;
 
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
@@ -447,6 +476,7 @@ qemuSetupHostdevCgroup(virDomainObjPtr vm,
 virUSBDeviceFree(usb);
 virSCSIDeviceFree(scsi);
 virSCSIVHostDeviceFree(host);
+virMediatedDeviceFree(mdev);
 VIR_FREE(path);
 return ret;
 }
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 07/16] security: selinux: Enable labeling of vfio mediated devices

2017-02-06 Thread Erik Skultety
Label the VFIO IOMMU devices under /dev/vfio/ referenced by the symlinks
in the sysfs (e.g. /sys/class/mdev_bus//iommu_group) which what
qemu actually gets formatted on the command line.

Signed-off-by: Erik Skultety 
---
 src/security/security_selinux.c | 57 +++--
 1 file changed, 55 insertions(+), 2 deletions(-)

diff --git a/src/security/security_selinux.c b/src/security/security_selinux.c
index e152c72..4f6b098 100644
--- a/src/security/security_selinux.c
+++ b/src/security/security_selinux.c
@@ -36,6 +36,7 @@
 #include "virerror.h"
 #include "viralloc.h"
 #include "virlog.h"
+#include "virmdev.h"
 #include "virpci.h"
 #include "virusb.h"
 #include "virscsi.h"
@@ -1686,6 +1687,13 @@ virSecuritySELinuxSetHostLabel(virSCSIVHostDevicePtr dev 
ATTRIBUTE_UNUSED,
 }
 
 static int
+virSecuritySELinuxSetMediatedDevLabel(virMediatedDevicePtr dev 
ATTRIBUTE_UNUSED,
+  const char *file, void *opaque)
+{
+return virSecuritySELinuxSetHostdevLabelHelper(file, opaque);
+}
+
+static int
 virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr,
 virDomainDefPtr def,
 virDomainHostdevDefPtr dev,
@@ -1696,7 +1704,9 @@ 
virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr,
 virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
 virSecuritySELinuxCallbackData data = {.mgr = mgr, .def = def};
+virMediatedDevicePtr mdev = NULL;
 
 int ret = -1;
 
@@ -1782,13 +1792,26 @@ 
virSecuritySELinuxSetHostdevSubsysLabel(virSecurityManagerPtr mgr,
 break;
 }
 
-case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: {
+char *vfio_dev = NULL;
+if (!(mdev = virMediatedDeviceNew(>addr, mdevsrc->uuidstr)))
+goto done;
+
+if (!(vfio_dev = virMediatedDeviceGetIOMMUGroupDev(mdev)))
+goto done;
+
+ret = virSecuritySELinuxSetMediatedDevLabel(mdev, vfio_dev, );
+VIR_FREE(vfio_dev);
+break;
+}
+
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 ret = 0;
 break;
 }
 
  done:
+virMediatedDeviceFree(mdev);
 return ret;
 }
 
@@ -1918,6 +1941,16 @@ virSecuritySELinuxRestoreHostLabel(virSCSIVHostDevicePtr 
dev ATTRIBUTE_UNUSED,
 }
 
 static int
+virSecuritySELinuxRestoreMediatedDevLabel(virMediatedDevicePtr dev 
ATTRIBUTE_UNUSED,
+   const char *file,
+   void *opaque)
+{
+virSecurityManagerPtr mgr = opaque;
+
+return virSecuritySELinuxRestoreFileLabel(mgr, file);
+}
+
+static int
 virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr,
 virDomainHostdevDefPtr dev,
 const char *vroot)
@@ -1927,6 +1960,7 @@ 
virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr,
 virDomainHostdevSubsysPCIPtr pcisrc = >source.subsys.u.pci;
 virDomainHostdevSubsysSCSIPtr scsisrc = >source.subsys.u.scsi;
 virDomainHostdevSubsysSCSIVHostPtr hostsrc = 
>source.subsys.u.scsi_host;
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = >source.subsys.u.mdev;
 int ret = -1;
 
 /* Like virSecuritySELinuxRestoreImageLabelInt() for a networked
@@ -2010,7 +2044,26 @@ 
virSecuritySELinuxRestoreHostdevSubsysLabel(virSecurityManagerPtr mgr,
 break;
 }
 
-case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV:
+case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV: {
+char *vfiodev = NULL;
+virMediatedDevicePtr mdev = virMediatedDeviceNew(>addr,
+ mdevsrc->uuidstr);
+
+if (!mdev)
+goto done;
+
+if (!(vfiodev = virMediatedDeviceGetIOMMUGroupDev(mdev))) {
+virMediatedDeviceFree(mdev);
+goto done;
+}
+
+ret = virSecuritySELinuxRestoreMediatedDevLabel(mdev, vfiodev, mgr);
+
+VIR_FREE(vfiodev);
+virMediatedDeviceFree(mdev);
+break;
+}
+
 case VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_LAST:
 ret = 0;
 break;
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 01/16] util: Introduce new module virmdev

2017-02-06 Thread Erik Skultety
Beside creation, disposal, getter, and setter methods the module exports
methods to work with lists of mediated devices.

Signed-off-by: Erik Skultety 
---
 po/POTFILES.in   |   1 +
 src/Makefile.am  |   1 +
 src/libvirt_private.syms |  17 +++
 src/util/virmdev.c   | 375 +++
 src/util/virmdev.h   |  85 +++
 5 files changed, 479 insertions(+)
 create mode 100644 src/util/virmdev.c
 create mode 100644 src/util/virmdev.h

diff --git a/po/POTFILES.in b/po/POTFILES.in
index 365ea66..53c674e 100644
--- a/po/POTFILES.in
+++ b/po/POTFILES.in
@@ -218,6 +218,7 @@ src/util/virlease.c
 src/util/virlockspace.c
 src/util/virlog.c
 src/util/virmacmap.c
+src/util/virmdev.c
 src/util/virnetdev.c
 src/util/virnetdevbandwidth.c
 src/util/virnetdevbridge.c
diff --git a/src/Makefile.am b/src/Makefile.am
index 2f32d41..6510e1d 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -186,6 +186,7 @@ UTIL_SOURCES =  
\
util/viruuid.c util/viruuid.h   \
util/virxdrdefs.h   \
util/virxml.c util/virxml.h \
+   util/virmdev.c util/virmdev.h   \
$(NULL)
 
 EXTRA_DIST += $(srcdir)/util/keymaps.csv $(srcdir)/util/virkeycode-mapgen.py
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 8e994c7..4047945 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1960,6 +1960,23 @@ virMacMapNew;
 virMacMapRemove;
 virMacMapWriteFile;
 
+# util/virmdev.h
+virMediatedDeviceFree;
+virMediatedDeviceGetIOMMUGroupDev;
+virMediatedDeviceGetPath;
+virMediatedDeviceGetUsedBy;
+virMediatedDeviceListAdd;
+virMediatedDeviceListCount;
+virMediatedDeviceListDel;
+virMediatedDeviceListFind;
+virMediatedDeviceListGet;
+virMediatedDeviceListNew;
+virMediatedDeviceListSteal;
+virMediatedDeviceListStealIndex;
+virMediatedDeviceNew;
+virMediatedDeviceSetUsedBy;
+
+
 
 # util/virnetdev.h
 virNetDevAddMulti;
diff --git a/src/util/virmdev.c b/src/util/virmdev.c
new file mode 100644
index 000..0b70798
--- /dev/null
+++ b/src/util/virmdev.c
@@ -0,0 +1,375 @@
+/*
+ * virmdev.c: helper APIs for managing host MDEV devices
+ *
+ * Copyright (C) 2017-2018 Red Hat, Inc.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library.  If not, see
+ * .
+ *
+ * Authors:
+ * Erik Skultety 
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "virmdev.h"
+#include "dirname.h"
+#include "virlog.h"
+#include "viralloc.h"
+#include "vircommand.h"
+#include "virerror.h"
+#include "virfile.h"
+#include "virkmod.h"
+#include "virstring.h"
+#include "virutil.h"
+#include "viruuid.h"
+#include "virhostdev.h"
+
+VIR_LOG_INIT("util.mdev");
+
+struct _virMediatedDevice {
+char *path; /* sysfs path */
+bool managed;
+
+char *used_by_drvname;
+char *used_by_domname;
+};
+
+struct _virMediatedDeviceList {
+virObjectLockable parent;
+
+size_t count;
+virMediatedDevicePtr *devs;
+};
+
+
+/* For virReportOOMError()  and virReportSystemError() */
+#define VIR_FROM_THIS VIR_FROM_NONE
+
+static virClassPtr virMediatedDeviceListClass;
+
+static void virMediatedDeviceListDispose(void *obj);
+
+static int virMediatedOnceInit(void)
+{
+if (!(virMediatedDeviceListClass = virClassNew(virClassForObjectLockable(),
+   "virMediatedDeviceList",
+   
sizeof(virMediatedDeviceList),
+   
virMediatedDeviceListDispose)))
+return -1;
+
+return 0;
+}
+
+VIR_ONCE_GLOBAL_INIT(virMediated)
+
+#ifdef __linux__
+# define MDEV_SYSFS "/sys/class/mdev_bus/"
+
+virMediatedDevicePtr
+virMediatedDeviceNew(virPCIDeviceAddressPtr pciaddr, const char *uuidstr)
+{
+virMediatedDevicePtr dev = NULL, ret = NULL;
+char *pcistr = NULL;
+
+if (virAsprintf(, "%.4x:%.2x:%.2x.%.1x", pciaddr->domain,
+pciaddr->bus, pciaddr->slot, pciaddr->function) < 0)
+return NULL;
+
+if (VIR_ALLOC(dev) < 0)
+goto cleanup;
+
+if (virAsprintf(>path, MDEV_SYSFS "%s/%s", 

[libvirt] [PATCH 03/16] docs: Update RNG schema to reflect the new hostdev type mdev

2017-02-06 Thread Erik Skultety
To keep the domain XML as much platform agnostic as possible, do not
expose an element/attribute which would contain path directly to the
syfs filesystem which the mediated devices are build upon. Instead,
identify each mediated device by the parent physical device and a UUID
which is an optional element, but only if managed='yes' which is not
implemented yet.

Signed-off-by: Erik Skultety 
---
 docs/schemas/domaincommon.rng | 17 +
 1 file changed, 17 insertions(+)

diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index cc6e0d0..087ca82 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -3984,6 +3984,7 @@
   
   
   
+  
 
   
 
@@ -4134,6 +4135,22 @@
 
   
 
+  
+
+  mdev
+
+
+  
+
+  
+  
+
+  
+
+  
+
+  
+
   
 
   storage
-- 
2.10.2

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 11/16] hostdev: Introduce a reattach method for mediated devices

2017-02-06 Thread Erik Skultety
The name "reattach" does not really reflect the truth behind mediated
devices, since these are purely software devices that do not need to be
plugged back into the host in any way, however this patch pushes for
naming consistency for methods where the logic behind the underlying
operation stays the same except that in case of mdevs the operation
itself is effectively a NO-OP.

Signed-off-by: Erik Skultety 
---
 src/libvirt_private.syms |  1 +
 src/qemu/qemu_hostdev.c  | 15 ++
 src/qemu/qemu_hostdev.h  |  4 
 src/util/virhostdev.c| 53 
 src/util/virhostdev.h|  7 +++
 5 files changed, 80 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 8921a53..00fb2c2 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1713,6 +1713,7 @@ virHostdevPrepareSCSIDevices;
 virHostdevPrepareSCSIVHostDevices;
 virHostdevPrepareUSBDevices;
 virHostdevReAttachDomainDevices;
+virHostdevReAttachMediatedDevices;
 virHostdevReAttachPCIDevices;
 virHostdevReAttachSCSIDevices;
 virHostdevReAttachSCSIVHostDevices;
diff --git a/src/qemu/qemu_hostdev.c b/src/qemu/qemu_hostdev.c
index 45b731c..6a7232f 100644
--- a/src/qemu/qemu_hostdev.c
+++ b/src/qemu/qemu_hostdev.c
@@ -419,6 +419,18 @@ qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr 
driver,
 }
 
 void
+qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver,
+   const char *name,
+   virDomainHostdevDefPtr *hostdevs,
+   int nhostdevs)
+{
+virHostdevManagerPtr hostdev_mgr = driver->hostdevMgr;
+
+virHostdevReAttachMediatedDevices(hostdev_mgr, QEMU_DRIVER_NAME,
+  name, hostdevs, nhostdevs);
+}
+
+void
 qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver,
  virDomainDefPtr def)
 {
@@ -436,4 +448,7 @@ qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver,
 
 qemuHostdevReAttachSCSIVHostDevices(driver, def->name, def->hostdevs,
 def->nhostdevs);
+
+qemuHostdevReAttachMediatedDevices(driver, def->name, def->hostdevs,
+   def->nhostdevs);
 }
diff --git a/src/qemu/qemu_hostdev.h b/src/qemu/qemu_hostdev.h
index 9399241..0096497 100644
--- a/src/qemu/qemu_hostdev.h
+++ b/src/qemu/qemu_hostdev.h
@@ -84,6 +84,10 @@ void qemuHostdevReAttachSCSIVHostDevices(virQEMUDriverPtr 
driver,
  const char *name,
  virDomainHostdevDefPtr *hostdevs,
  int nhostdevs);
+void qemuHostdevReAttachMediatedDevices(virQEMUDriverPtr driver,
+const char *name,
+virDomainHostdevDefPtr *hostdevs,
+int nhostdevs);
 void qemuHostdevReAttachDomainDevices(virQEMUDriverPtr driver,
   virDomainDefPtr def);
 
diff --git a/src/util/virhostdev.c b/src/util/virhostdev.c
index 7691eaa..520b711 100644
--- a/src/util/virhostdev.c
+++ b/src/util/virhostdev.c
@@ -1913,6 +1913,59 @@ virHostdevReAttachSCSIVHostDevices(virHostdevManagerPtr 
mgr,
 virObjectUnlock(mgr->activeSCSIVHostHostdevs);
 }
 
+void
+virHostdevReAttachMediatedDevices(virHostdevManagerPtr mgr,
+  const char *drv_name,
+  const char *dom_name,
+  virDomainHostdevDefPtr *hostdevs,
+  int nhostdevs)
+{
+char *used_by_drvname = NULL;
+char *used_by_domname = NULL;
+virDomainHostdevDefPtr hostdev = NULL;
+virDomainHostdevSubsysMediatedDevPtr mdevsrc = NULL;
+size_t i;
+
+if (nhostdevs == 0)
+return;
+
+virObjectLock(mgr->activeMediatedHostdevs);
+for (i = 0; i < nhostdevs; i++) {
+virMediatedDevicePtr mdev, tmp;
+
+hostdev = hostdevs[i];
+mdevsrc = >source.subsys.u.mdev;
+
+if (hostdev->mode != VIR_DOMAIN_HOSTDEV_MODE_SUBSYS ||
+hostdev->source.subsys.type != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV)
+continue;
+
+if (!(mdev = virMediatedDeviceNew(>addr, mdevsrc->uuidstr))) {
+VIR_WARN("Failed to reattach mediated device %s attached to "
+ "domain %s", mdevsrc->uuidstr, dom_name);
+continue;
+}
+
+/* Remove from the list only mdevs assigned to @drv_name/@dom_name */
+
+tmp = virMediatedDeviceListFind(mgr->activeMediatedHostdevs, mdev);
+virMediatedDeviceFree(mdev);
+
+/* skip inactive devices */
+if (!tmp)
+continue;
+
+virMediatedDeviceGetUsedBy(tmp, _by_drvname, _by_domname);
+if (STREQ_NULLABLE(drv_name, used_by_drvname) &&
+

Re: [libvirt] [PATCH] char: drop data written to a disconnected pty

2017-02-06 Thread Paolo Bonzini


On 31/01/2017 14:45, Ed Swierk wrote:
> When a serial port writes data to a pty that's disconnected, drop the
> data and return the length dropped. This avoids triggering pointless
> retries in callers like the 16550A serial_xmit(), and causes
> qemu_chr_fe_write() to write all data to the log file, rather than
> logging only while a pty client like virsh console happens to be
> connected.
> 
> Signed-off-by: Ed Swierk 
> ---
>  qemu-char.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/qemu-char.c b/qemu-char.c
> index 676944a..ccb6923 100644
> --- a/qemu-char.c
> +++ b/qemu-char.c
> @@ -1528,7 +1528,7 @@ static int pty_chr_write(CharDriverState *chr, const 
> uint8_t *buf, int len)
>  /* guest sends data, check for (re-)connect */
>  pty_chr_update_read_handler_locked(chr);
>  if (!s->connected) {
> -return 0;
> +return len;
>  }
>  }
>  return io_channel_send(s->ioc, buf, len);
> 

Rebased and queued, thanks.

Paolo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2 3/3] add per cpu stats to all domain stats

2017-02-06 Thread Nikolay Shirokovskiy


On 03.02.2017 19:09, Peter Krempa wrote:
> On Thu, Feb 02, 2017 at 14:09:33 +0300, Nikolay Shirokovskiy wrote:
>> Add per host cpu info provided in virDomainGetCPUStats to the
>> stats provided in virConnectGetAllDomainStats. Namely:
>>
>> "cpu.count" - number of host cpus
>> "cpu..time" - total cpu time spent for this domain in nanoseconds
>> "cpu..vtime" - time spent in virtual cpu threads for this domain
>> in nanoseconds
>> ---
>>  docs/news.xml|  9 ++
>>  include/libvirt/libvirt-domain.h |  1 +
>>  src/libvirt-domain.c |  7 +
>>  src/qemu/qemu_driver.c   | 64 
>> 
>>  tools/virsh-domain-monitor.c |  7 +
>>  tools/virsh.pod  | 11 +--
>>  6 files changed, 97 insertions(+), 2 deletions(-)
>>

[snip]

>>  
>>  static int
>> +qemuDomainGetStatsPerCpu(virQEMUDriverPtr driver ATTRIBUTE_UNUSED,
>> + virDomainObjPtr dom,
>> + virDomainStatsRecordPtr record,
>> + int *maxparams,
>> + unsigned int privflags ATTRIBUTE_UNUSED)
>> +{
>> +qemuDomainObjPrivatePtr priv = dom->privateData;
>> +virCgroupCpuStats stats = {0};
>> +virBitmapPtr guestvcpus = NULL;
>> +char param_name[VIR_TYPED_PARAM_FIELD_LENGTH];
>> +int ncpus;
>> +size_t i;
>> +int ret = -1;
>> +
>> +if (qemuDomainHasVcpuPids(dom))
>> +guestvcpus = virDomainDefGetOnlineVcpumap(dom->def);
> 
> Does it make sense to do this if we don't have the PIDs? Without that
> it's usually one thread anyways.
> 

It is on par with qemuDomainGetCPUStats.

I don't quite understand you. You mean even if not qemuDomainHasVcpuPids
we can gather cpu..vtime?

Nikolay

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 2/5] virusbmock: Link with libvirt_utils

2017-02-06 Thread Michal Privoznik
We are using couple of functions from there (e.g. virStrdup) and
rely that the binary linking us has the libvirt_utils linked
already. Well, this makes valgrind sad.

Signed-off-by: Michal Privoznik 
---
 tests/Makefile.am | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/Makefile.am b/tests/Makefile.am
index 521728a3d..7149a865e 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -1203,7 +1203,8 @@ virnetdevbandwidthtest_LDADD = $(LDADDS) $(LIBXML_LIBS)
 virusbmock_la_SOURCES = virusbmock.c
 virusbmock_la_CFLAGS = $(AM_CFLAGS)
 virusbmock_la_LDFLAGS = $(MOCKLIBS_LDFLAGS)
-virusbmock_la_LIBADD = $(MOCKLIBS_LIBS)
+virusbmock_la_LIBADD = $(MOCKLIBS_LIBS) \
+   ../src/libvirt_util.la
 
 virnetdevbandwidthmock_la_SOURCES = \
virnetdevbandwidthmock.c
-- 
2.11.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 3/5] testUSBList: don't leak @dev

2017-02-06 Thread Michal Privoznik
==22187== 77 (56 direct, 21 indirect) bytes in 1 blocks are definitely lost in 
loss record 23 of 37
==22187==at 0x4C2BC75: calloc (vg_replace_malloc.c:624)
==22187==by 0x4E75685: virAlloc (viralloc.c:144)
==22187==by 0x4F0613A: virUSBDeviceNew (virusb.c:332)
==22187==by 0x4F05BA2: virUSBDeviceSearch (virusb.c:183)
==22187==by 0x4F05F95: virUSBDeviceFind (virusb.c:296)
==22187==by 0x403514: testUSBList (virusbtest.c:209)
==22187==by 0x403BD8: virTestRun (testutils.c:180)
==22187==by 0x4039E5: mymain (virusbtest.c:285)
==22187==by 0x4056BC: virTestMain (testutils.c:992)
==22187==by 0x403A4A: main (virusbtest.c:293)

Signed-off-by: Michal Privoznik 
---
 tests/virusbtest.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tests/virusbtest.c b/tests/virusbtest.c
index 4bbfe4af9..7cddb89af 100644
--- a/tests/virusbtest.c
+++ b/tests/virusbtest.c
@@ -217,6 +217,7 @@ testUSBList(const void *opaque ATTRIBUTE_UNUSED)
 }
 
 virUSBDeviceListDel(list, dev);
+virUSBDeviceFree(dev);
 dev = NULL;
 
 if (testCheckNdevs("After deleting one",
-- 
2.11.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 4/5] qemuMonitorCPUModelInfoFree: Don't leak model_info->props

2017-02-06 Thread Michal Privoznik
==11846== 240 bytes in 1 blocks are definitely lost in loss record 81 of 107
==11846==at 0x4C2BC75: calloc (vg_replace_malloc.c:624)
==11846==by 0x18C74242: virAllocN (viralloc.c:191)
==11846==by 0x4A05E8: qemuMonitorCPUModelInfoCopy (qemu_monitor.c:3677)
==11846==by 0x446E3C: virQEMUCapsNewCopy (qemu_capabilities.c:2171)
==11846==by 0x437335: testQemuCapsCopy (qemucapabilitiestest.c:108)
==11846==by 0x437CD2: virTestRun (testutils.c:180)
==11846==by 0x437AD8: mymain (qemucapabilitiestest.c:176)
==11846==by 0x4397B6: virTestMain (testutils.c:992)
==11846==by 0x437B44: main (qemucapabilitiestest.c:188)

Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_monitor.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index b7be5e7f4..565339921 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -3660,6 +3660,7 @@ qemuMonitorCPUModelInfoFree(qemuMonitorCPUModelInfoPtr 
model_info)
 for (i = 0; i < model_info->nprops; i++)
 VIR_FREE(model_info->props[i].name);
 
+VIR_FREE(model_info->props);
 VIR_FREE(model_info->name);
 VIR_FREE(model_info);
 }
-- 
2.11.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 0/5] Fix couple of memleaks

2017-02-06 Thread Michal Privoznik
*** BLURB HERE ***

Michal Privoznik (5):
  tests: Trace re-executing processes with valgrind
  virusbmock: Link with libvirt_utils
  testUSBList: don't leak @dev
  qemuMonitorCPUModelInfoFree: Don't leak model_info->props
  xenFoxenFormatXLDisk: Don't leak @target

 src/qemu/qemu_monitor.c | 1 +
 src/xenconfig/xen_xl.c  | 5 +++--
 tests/Makefile.am   | 6 --
 tests/virusbtest.c  | 1 +
 4 files changed, 9 insertions(+), 4 deletions(-)

-- 
2.11.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 5/5] xenFoxenFormatXLDisk: Don't leak @target

2017-02-06 Thread Michal Privoznik
==11260== 1,006 bytes in 1 blocks are definitely lost in loss record 106 of 111
==11260==at 0x4C2AE5F: malloc (vg_replace_malloc.c:297)
==11260==by 0x4C2BDFF: realloc (vg_replace_malloc.c:693)
==11260==by 0x4EA430B: virReallocN (viralloc.c:245)
==11260==by 0x4EA7C52: virBufferGrow (virbuffer.c:130)
==11260==by 0x4EA7D28: virBufferAdd (virbuffer.c:165)
==11260==by 0x4EA8E10: virBufferStrcat (virbuffer.c:718)
==11260==by 0x42D263: xenFormatXLDiskSrcNet (xen_xl.c:960)
==11260==by 0x42D4EB: xenFormatXLDiskSrc (xen_xl.c:1015)
==11260==by 0x42D870: xenFormatXLDisk (xen_xl.c:1101)
==11260==by 0x42DA89: xenFormatXLDomainDisks (xen_xl.c:1148)
==11260==by 0x42EAF8: xenFormatXL (xen_xl.c:1558)
==11260==by 0x40E85F: testCompareParseXML (xlconfigtest.c:105)

Signed-off-by: Michal Privoznik 
---
 src/xenconfig/xen_xl.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/src/xenconfig/xen_xl.c b/src/xenconfig/xen_xl.c
index 18d9fe369..2c9174e53 100644
--- a/src/xenconfig/xen_xl.c
+++ b/src/xenconfig/xen_xl.c
@@ -1034,6 +1034,7 @@ xenFormatXLDisk(virConfValuePtr list, virDomainDiskDefPtr 
disk)
 int format = virDomainDiskGetFormat(disk);
 const char *driver = virDomainDiskGetDriver(disk);
 char *target = NULL;
+int ret = -1;
 
 /* format */
 virBufferAddLit(, "format=");
@@ -1119,12 +1120,12 @@ xenFormatXLDisk(virConfValuePtr list, 
virDomainDiskDefPtr disk)
 tmp->next = val;
 else
 list->list = val;
-return 0;
+ret = 0;
 
  cleanup:
 VIR_FREE(target);
 virBufferFreeAndReset();
-return -1;
+return ret;
 }
 
 
-- 
2.11.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 1/5] tests: Trace re-executing processes with valgrind

2017-02-06 Thread Michal Privoznik
A lot of our tests re-execute themeselves after loading their
mock library. This, however, makes valgrind sad because currently
we do not tell it to trace the process after exec().

Signed-off-by: Michal Privoznik 
---
 tests/Makefile.am | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/Makefile.am b/tests/Makefile.am
index e923178f2..521728a3d 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -470,7 +470,8 @@ TESTS_ENVIRONMENT = \
   $(VG)
 
 
-VALGRIND = valgrind --quiet --leak-check=full \
+VALGRIND = valgrind --quiet --leak-check=full --trace-children=yes \
+   --trace-children-skip="*/tools/virsh","*/tests/commandhelper" \
--suppressions=$(srcdir)/.valgrind.supp
 valgrind:
$(MAKE) check VG="libtool --mode=execute $(VALGRIND)"
-- 
2.11.0

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list