Re: [libvirt] [PATCH v2 11/12] storage: Use virStoragePoolObj{Get|Incr}Decr}Asyncjobs

2017-09-18 Thread Michal Privoznik
On 08/24/2017 03:09 PM, John Ferlan wrote:
> Use the new accessor APIs for storage_driver.
> 
> Signed-off-by: John Ferlan 
> ---
>  src/storage/storage_driver.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)

s/Incr}Decr/Incr|Decr/ in $SUBJ.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2 07/12] storage: Use virStoragePoolObjGetAutostartLink

2017-09-18 Thread Michal Privoznik
On 08/24/2017 03:09 PM, John Ferlan wrote:
> Use the new accessor API for storage_driver.
> 
> Signed-off-by: John Ferlan 
> ---
>  src/storage/storage_driver.c | 21 -
>  1 file changed, 12 insertions(+), 9 deletions(-)
> 
> diff --git a/src/storage/storage_driver.c b/src/storage/storage_driver.c
> index 7b1396f..a7a77ba 100644
> --- a/src/storage/storage_driver.c
> +++ b/src/storage/storage_driver.c
> @@ -814,6 +814,7 @@ static int
>  storagePoolUndefine(virStoragePoolPtr pool)
>  {
>  virStoragePoolObjPtr obj;
> +const char *autostartLink;
>  virObjectEventPtr event = NULL;
>  int ret = -1;
>  
> @@ -838,18 +839,17 @@ storagePoolUndefine(virStoragePoolPtr pool)
>  goto cleanup;
>  }
>  
> +autostartLink = virStoragePoolObjGetAutostartLink(obj);

virStoragePoolObjGetAutostartLink() should look a bit different. Now
it's returning char * which suggests that caller is supposed to free the
retval. But in fact they shouldn't. const char * would be better. But
lets save that for a follow up patch.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2 10/12] storage: Internally represent @autostart to bool

2017-09-18 Thread Michal Privoznik
On 08/24/2017 03:09 PM, John Ferlan wrote:
> Since it's been used that way anyway, let's just convert it to a bool
> and only make the external representation be an int.
> 
> Signed-off-by: John Ferlan 
> ---
>  src/conf/virstorageobj.c | 4 ++--
>  src/conf/virstorageobj.h | 4 ++--
>  src/storage/storage_driver.c | 2 +-
>  src/test/test_driver.c   | 4 ++--
>  4 files changed, 7 insertions(+), 7 deletions(-)

I'm no native speaker but probably s/to bool/as bool/ in $SUBJ?


Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] qemu: Introduce a wrapper over virFileWrapperFdClose

2017-09-18 Thread Michal Privoznik
On 09/18/2017 10:49 PM, John Ferlan wrote:
> 
> 
> On 09/18/2017 05:08 AM, Michal Privoznik wrote:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1448268
>>
>> When migrating to a file (e.g. when doing 'virsh save file'),
>> couple of things are happening in the thread that is executing
>> the API:
>>
>> 1) the domain obj is locked
>> 2) iohelper is spawned as a separate process to handle all I/O
>> 3) the thread waits for iohelper to finish
>> 4) the domain obj is unlocked
>>
>> Now, the problem is that while the thread waits in step 3 for
>> iohelper to finish this may take ages because iohelper calls
>> fdatasync(). And unfortunately, we are waiting the whole time
>> with the domain locked. So if another thread wants to jump in and
>> say copy the domain name ('virsh list' for instance), they are
>> stuck.
>>
>> The solution is to unlock the domain whenever waiting for I/O and
>> lock it back again when it finished.
>>
>> Signed-off-by: Michal Privoznik 
>> ---
>>  src/qemu/qemu_driver.c | 27 +++
>>  1 file changed, 23 insertions(+), 4 deletions(-)
>>
> 
> I would think there'd be more concern about dropping the lock in the
> middle and something changing that could more negatively affect things.
> There's such a delicate balance between Save/Restore processing now - if
> you adjust the locks that could mean simultaneous usage, right?

Yes & no. Yes in sense that if another thread (in this case the one that
is listing the domains) wants to access the domain it can. But no other
API can be started really (like migrate, or hotplug) because the thread
executing save/dump sets a job. A job is like a flag so that we can
unlock the domain object for others to use (in a limited manner) and yet
still exclude another state changing API. We lock and unlock domains all
the time btw - qemuDomaninObjEnterMonitor() and
qemuDomainObjExitMonitor() do just that. And exactly for the same reason
as here. It may take ages for qemu to reply to our command and for that
the domain obj lock should not be held. IOW, once a job is acquired we
can lock & unlock domain as we please because we can be sure nobody will
change the libvirt internal state of the domain. All that other threads
can do is read.

> 
> Perhaps thinking differently... Does the virdomainobjlist code for
> Collect, Filter, Export really need to obtain the domain obj write lock?

Unfortunately yes. vm->def is not immutable, nor vm->def->name.

> 
> If Export is but a "snapshot in time", then using virObject{Lock|Unlock}
> perhaps isn't necessary in any of the @vms list processing. Since
> virDomainObjListCollectIterator sets a refcnt on the object it cannot be
> freed. Beyond that any state checking done in virDomainObjListFilter is
> negated because the lock isn't held until the virGetDomain call is made
> during virDomainObjListExport to fill in the returned @doms list. It's
> quite possible after *Filter to have the 'removing' boolean get set
> before the virGetDomain call is made. So what's the need or purpose of
> the write locks then?
> 
> If the object locks are removed during Export, then virsh list won't
> freeze while other operations that take time occur. Since the RWLock is
> taken on the list while @vms is created, we know nothing else can
> Add/Remove to/from the list - thus *that* is our snapshot in time.
> Anything done after than is just reducing what's in that snapshot.
> 
> It's "interesting" that the domain export list uses a very different
> model than other drivers... That loop to take a refcnt on each object w/
> the Read lock taken is something I could see duplicated for other
> drivers conceptually speaking of course if someone were to write that
> kind of common processing model ;-)

Right. It's not domain obj lock that is RW, it's the domain obj *list*
lock that is. The domain obj lock is still a mutex.

BTW: it's not only 'virsh list' that would hang. Other APIs that just
read without grabbing a job would hang too:

qemuDomainIsActive
qemuDomainIsPersistent
..
And also qemuDomainLookupByName.

We really must unlock the domain instead of trying to work around the
problem.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v6 2/2] qemu: add capability checking for qcow2 cache configuration

2017-09-18 Thread Liu Qing
Add qemu capabilities QEMU_CAPS_L2_CACHE_SIZE,
QEMU_CAPS_REFCOUNT_CACHE_SIZE, QEMU_CAPS_CACHE_CLEAN_INTERVAL.
Add testing for the above qemu capabilities.

Signed-off-by: Liu Qing 
---
 src/qemu/qemu_capabilities.c   | 11 
 src/qemu/qemu_capabilities.h   |  5 
 src/qemu/qemu_command.c| 33 ++
 tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml   |  3 ++
 .../caps_2.6.0-gicv2.aarch64.xml   |  3 ++
 .../caps_2.6.0-gicv3.aarch64.xml   |  3 ++
 tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml  |  3 ++
 tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml   |  3 ++
 tests/qemucapabilitiesdata/caps_2.7.0.s390x.xml|  3 ++
 tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml   |  3 ++
 tests/qemucapabilitiesdata/caps_2.8.0.s390x.xml|  3 ++
 tests/qemucapabilitiesdata/caps_2.8.0.x86_64.xml   |  3 ++
 tests/qemucapabilitiesdata/caps_2.9.0.ppc64le.xml  |  3 ++
 tests/qemucapabilitiesdata/caps_2.9.0.s390x.xml|  3 ++
 tests/qemucapabilitiesdata/caps_2.9.0.x86_64.xml   |  3 ++
 .../qemuxml2argv-disk-drive-qcow2-cache.args   | 28 ++
 tests/qemuxml2argvtest.c   |  4 +++
 17 files changed, 117 insertions(+)
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.args

diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index e7ea6f4..2db6af1 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -439,6 +439,11 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
   "virtio-net.tx_queue_size",
   "chardev-reconnect",
   "virtio-gpu.max_outputs",
+
+  /* 270 */
+  "drive.qcow2.l2-cache-size",
+  "drive.qcow2.refcount-cache-size",
+  "drive.qcow2.cache-clean-interval",
 );
 
 
@@ -1811,6 +1816,12 @@ static struct virQEMUCapsStringFlags 
virQEMUCapsObjectPropsIntelIOMMU[] = {
 static struct virQEMUCapsStringFlags virQEMUCapsQMPSchemaQueries[] = {
 { "blockdev-add/arg-type/options/+gluster/debug-level", 
QEMU_CAPS_GLUSTER_DEBUG_LEVEL},
 { "blockdev-add/arg-type/+gluster/debug", QEMU_CAPS_GLUSTER_DEBUG_LEVEL},
+{ "blockdev-add/arg-type/options/+qcow2/l2-cache-size", 
QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE},
+{ "blockdev-add/arg-type/options/+qcow2/refcount-cache-size", 
QEMU_CAPS_DRIVE_QCOW2_REFCOUNT_CACHE_SIZE},
+{ "blockdev-add/arg-type/options/+qcow2/cache-clean-interval", 
QEMU_CAPS_DRIVE_QCOW2_CACHE_CLEAN_INTERVAL},
+{ "blockdev-add/arg-type/+qcow2/l2-cache-size", 
QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE},
+{ "blockdev-add/arg-type/+qcow2/refcount-cache-size", 
QEMU_CAPS_DRIVE_QCOW2_REFCOUNT_CACHE_SIZE},
+{ "blockdev-add/arg-type/+qcow2/cache-clean-interval", 
QEMU_CAPS_DRIVE_QCOW2_CACHE_CLEAN_INTERVAL},
 };
 
 struct virQEMUCapsObjectTypeProps {
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index f32687d..021297e 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -426,6 +426,11 @@ typedef enum {
 QEMU_CAPS_CHARDEV_RECONNECT, /* -chardev reconnect */
 QEMU_CAPS_VIRTIO_GPU_MAX_OUTPUTS, /* -device 
virtio-(vga|gpu-*),max-outputs= */
 
+/* 270 */
+QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE, /* -drive support qcow2 l2 cache size 
*/
+QEMU_CAPS_DRIVE_QCOW2_REFCOUNT_CACHE_SIZE, /* -drive support qcow2 
refcount cache size */
+QEMU_CAPS_DRIVE_QCOW2_CACHE_CLEAN_INTERVAL, /* -drive qcow2 cache clean 
interval */
+
 QEMU_CAPS_LAST /* this must always be the last item */
 } virQEMUCapsFlags;
 
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index d553df5..da91059 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -1434,6 +1434,39 @@ qemuBuildDriveSourceStr(virDomainDiskDefPtr disk,
 virBufferAsprintf(buf, "format=%s,", qemuformat);
 }
 
+if (disk->src->l2_cache_size > 0) {
+if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DRIVE_QCOW2_L2_CACHE_SIZE)) {
+virBufferAsprintf(buf, "l2-cache-size=%llu,",
+  disk->src->l2_cache_size);
+} else {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+   _("l2-cache-size is not supported by this QEMU"));
+goto cleanup;
+}
+}
+
+if (disk->src->refcount_cache_size > 0) {
+if (virQEMUCapsGet(qemuCaps, 
QEMU_CAPS_DRIVE_QCOW2_REFCOUNT_CACHE_SIZE)) {
+virBufferAsprintf(buf, "refcount-cache-size=%llu,",
+  disk->src->refcount_cache_size);
+} else {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+   _("refcount-cache-size is not supported by this 
QEMU"));
+goto cleanup;
+}
+}
+
+if (disk->src->cache_clean_interval > 0) {
+if (virQEMUCapsGet(qemuCaps, 
QEMU_CAPS_DRIVE_QCOW2_CACHE_CLEAN_INTERVAL)) {

[libvirt] [PATCH v6 0/2] Add support for qcow2 cache

2017-09-18 Thread Liu Qing
Qcow2 small IO random write performance will drop dramatically if the l2
cache table could not cover the whole disk. This will be a lot of l2
cache table RW operations if cache miss happens frequently.
 
This patch exports the qcow2 driver parameter
l2-cache-size/refcount-cache-size, first added in Qemu 2.2, and
cache-clean-interval, first added in Qemu 2.5, in libvirt.

change since v4: removed unnecessary cache error check

Liu Qing (2):
  conf, docs: Add qcow2 cache configuration support
  qemu: add capability checking for qcow2 cache configuration

 docs/formatdomain.html.in  | 41 +
 docs/schemas/domaincommon.rng  | 35 
 src/conf/domain_conf.c | 97 --
 src/qemu/qemu_capabilities.c   | 11 +++
 src/qemu/qemu_capabilities.h   |  5 ++
 src/qemu/qemu_command.c| 33 
 src/qemu/qemu_driver.c |  5 ++
 src/util/virstoragefile.c  |  3 +
 src/util/virstoragefile.h  |  6 ++
 tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml   |  3 +
 .../caps_2.6.0-gicv2.aarch64.xml   |  3 +
 .../caps_2.6.0-gicv3.aarch64.xml   |  3 +
 tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml  |  3 +
 tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml   |  3 +
 tests/qemucapabilitiesdata/caps_2.7.0.s390x.xml|  3 +
 tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml   |  3 +
 tests/qemucapabilitiesdata/caps_2.8.0.s390x.xml|  3 +
 tests/qemucapabilitiesdata/caps_2.8.0.x86_64.xml   |  3 +
 tests/qemucapabilitiesdata/caps_2.9.0.ppc64le.xml  |  3 +
 tests/qemucapabilitiesdata/caps_2.9.0.s390x.xml|  3 +
 tests/qemucapabilitiesdata/caps_2.9.0.x86_64.xml   |  3 +
 .../qemuxml2argv-disk-drive-qcow2-cache.args   | 28 +++
 .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++
 tests/qemuxml2argvtest.c   |  4 +
 .../qemuxml2xmlout-disk-drive-qcow2-cache.xml  | 43 ++
 tests/qemuxml2xmltest.c|  1 +
 26 files changed, 384 insertions(+), 7 deletions(-)
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.args
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml

-- 
1.8.3.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v6 1/2] conf, docs: Add qcow2 cache configuration support

2017-09-18 Thread Liu Qing
Random write IOPS will drop dramatically if qcow2 l2 cache could not
cover the whole disk. This patch gives libvirt user a chance to adjust
the qcow2 cache configuration.

Three new qcow2 driver parameters (l2-cache-size, refcount-cache-size
and cache-clean-interval) are added as attributes to a new 
subelement for a  of a 
element. The QEMU source docs/qcow2-cache.txt provides the guidelines
for defining/configuring values for each.

Signed-off-by: Liu Qing 
---

change since v4: removed unnecessary cache error check

 docs/formatdomain.html.in  | 41 +
 docs/schemas/domaincommon.rng  | 35 
 src/conf/domain_conf.c | 97 --
 src/qemu/qemu_driver.c |  5 ++
 src/util/virstoragefile.c  |  3 +
 src/util/virstoragefile.h  |  6 ++
 .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++
 .../qemuxml2xmlout-disk-drive-qcow2-cache.xml  | 43 ++
 tests/qemuxml2xmltest.c|  1 +
 9 files changed, 267 insertions(+), 7 deletions(-)
 create mode 100644 
tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8ca7637..4574e3a 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -3036,6 +3036,47 @@
   set. (Since 3.5.0)
   
 
+The driver element may contain a qcow2 sub
+element, which to specifying further details related to a qcow2 disk.
+For recommended setting guidelines refer to the QEMU source file
+docs/qcow2-cache.txt.
+Since 3.8.0
+
+  
+The optional l2_cache_size attribute controls how much
+memory will be consumed by qcow2 l2 table cache in bytes. This
+option is only valid when the driver type is qcow2. The default
+size is 2097152.
+Since 3.8.0
+
+In general you should leave this option alone, unless you
+are very certain you know what you are doing.
+  
+  
+The optional refcount_cache_size attribute controls
+how much memory will be consumed by qcow2 reference count table
+cache in bytes. This option is only valid when the driver type is
+qcow2. The default size is 262144.
+Since 3.8.0
+
+In general you should leave this option alone, unless you
+are very certain you know what you are doing.
+  
+  
+The optional cache_clean_interval attribute defines
+an interval (in seconds). All cache entries that haven't been
+accessed during that interval are removed from memory. This option
+is only valid when the driver type is qcow2. The default
+value is 0, which disables this feature. If the interval is too
+short, it will cause frequent cache write back and load, which
+impact performance. If the interval is too long, unused cache
+will still consume memory until it's been written back to disk.
+Since 3.8.0
+
+In general you should leave this option alone, unless you
+are very certain you know what you are doing.
+  
+
   
   backenddomain
   The optional backenddomain element allows specifying a
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index c9a4f7a..0e25f44 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -1756,6 +1756,23 @@
 
   
   
+  
+
+  
+
+  
+  
+
+  
+  
+
+  
+
+  
+
+  
   
@@ -1794,6 +1811,9 @@
 
   
   
+  
+
+  
   
 
   
@@ -1889,6 +1909,21 @@
   
 
   
+  
+
+  
+
+  
+  
+
+  
+
+  
+  
+
+  
+
+  
   
 
   
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index a43b25c..f225a7f 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -5734,6 +5734,30 @@ virDomainDeviceLoadparmIsValid(const char *loadparm)
 
 
 static void
+virDomainQcow2CacheOptionsFormat(virBufferPtr buf,
+ virDomainDiskDefPtr def)
+{
+if (def->src->l2_cache_size == 0 &&
+def->src->refcount_cache_size == 0 &&
+def->src->cache_clean_interval == 0)
+return;
+
+virBufferAddLit(buf, "src->l2_cache_size > 0)
+virBufferAsprintf(buf, " l2_cache_size='%llu'",
+  def->src->l2_cache_size);
+if (def->src->refcount_cache_size > 0)
+virBufferAsprintf(buf, " refcount_cache_size='%llu'",
+  def->src->refcount_cache_size);
+if

Re: [libvirt] [PATCH v5 1/2] conf, docs: Add qcow2 cache configuration support

2017-09-18 Thread Liu Qing
On Mon, Sep 18, 2017 at 09:49:36AM -0400, John Ferlan wrote:
> 
> 
> On 09/14/2017 01:08 AM, Liu Qing wrote:
> > Random write IOPS will drop dramatically if qcow2 l2 cache could not
> > cover the whole disk. This patch gives libvirt user a chance to adjust
> > the qcow2 cache configuration.
> > 
> > Three new qcow2 driver parameters (l2-cache-size, refcount-cache-size
> > and cache-clean-interval) are added as attributes to a new 
> > subelement for a  of a 
> > element. The QEMU source docs/qcow2-cache.txt provides the guidelines
> > for defining/configuring values for each.
> > 
> > Signed-off-by: Liu Qing 
> > ---
> >  docs/formatdomain.html.in  | 41 +
> >  docs/schemas/domaincommon.rng  | 35 
> >  src/conf/domain_conf.c | 96 
> > --
> >  src/qemu/qemu_driver.c |  5 ++
> >  src/util/virstoragefile.c  |  3 +
> >  src/util/virstoragefile.h  |  6 ++
> >  .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++
> >  .../qemuxml2xmlout-disk-drive-qcow2-cache.xml  | 43 ++
> >  tests/qemuxml2xmltest.c|  1 +
> >  9 files changed, 268 insertions(+), 5 deletions(-)
> >  create mode 100644 
> > tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml
> >  create mode 100644 
> > tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml
> > 
> 
> OK - so I had my head in other code during the v4 review comments from
> Peter. So I'll just "restart" that a bit here. If I can summarize the
> comments... There's a concern that we're creating a very specific set of
> parameters for which there isn't great guidance available other than
> going to a source file in the qemu tree and attempting to decipher it.
> Something I did note at one time as a possible issue and why I also
> asked for the "qcow2" subelement. It makes it painfully obvious that
> there was something very specific. I figure seeing a "qcow2" element
> name would fairly easily point out to anyone that specificity.
> 
> So, based on what I read in the responses it seems that the purpose of
> the values is to provide a mechanism to "allow" qemu to use a couple of
> different algorithms to perform operations, but qemu wants the user to
> decide whether they are willing to sacrifice one thing over another. If
> you want high IOPS, then you have to sacrifice something else; whereas,
> if you don't care about performance, then you sacrifice something else.
> 
> In any case, I think what could be useful is to determine if there is a
> way to "automate" something such that the "dials" would be automatically
> set based on the algorithms. That is - rather than supplying the
> specific values, supply a named attribute that would then perform the
> calculations and set the values, such as "high_performance="yes" or
> "large_capacity=yes". Where you cannot set both, but can set one or the
> other.  Or can you set both? Is there value in partially turning a knob?
>  If so maybe the attribute is "performance={default, ..., best}" and
> "capacity={default, ..., large}". Then based on the value of the
> attribute, code you add to libvirt would perform the calculations.
> 
> Or do you really think that's something some supplier (such as what
> you're trying to do) would want to have - the ability to control the
> algorithm for the various knobs.
I googled the qcow2 cache and there are already some discussion about
how to give a user friendly interface, even in the qemu community.
https://bugzilla.redhat.com/show_bug.cgi?id=1377735
I don't what to involve this, as this is hard to lead to an all win
result. A high level policy configuration would be enough for us.
If a policy configuration on libvirt level is necessary, I think a new patch
could be made. For example adding new attribute in the  element
which controls how the cache is automatically configured. But this may lead to
another discussion about what's a good policy. A policy way and the knob
way could both be available on libvirt level.

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] Questions about function virPCIDeviceIsBehindSwitchLackingACS in virpci.c

2017-09-18 Thread Wuzongyong (Euler Dept)
Hi,

In function virPCIDeviceIsBehindSwitchLackingACS, I noticed that(line 8):

1if (virPCIDeviceGetParent(dev, &parent) < 0)
2return -1;
3if (!parent) {
4/* if we have no parent, and this is the root bus, ACS doesn't come
5 * into play since devices on the root bus can't P2P without going
6 * through the root IOMMU.
7 */
8if (dev->address.bus == 0) {
9return 0;
10} else {
11virReportError(VIR_ERR_INTERNAL_ERROR,
12   _("Failed to find parent device for %s"),
13   dev->name);
14return -1;
15}
16}

Why we just return 0 only if device's bus is 0?
In my server, I can see a root bus which bus number is greater than 0, see the
results(just a part) after I run lspci -t:

+-[:80]-+-02.0-[81-83]--+-00.0
|   |   \-00.1
|   +-05.0
|   +-05.1
|   +-05.2
|   \-05.4
+-[:7f]-+-08.0
|   +-08.2
|   +-08.3
|   + . . .
|   \-1f.2
\-[:00]-+-00.0
 +-01.0-[01]00.0
 +-02.0-[02]--+-00.0
 |+-00.1
 |+-00.2
 |\-00.3
 +-02.2-[03]--
 +-03.0-[04-0b]00.0-[05-0b]--+-08.0-[06-08]00.0
 |   \-10.0-[09-0b]00.0
 +-05.0
 +-05.1
 +-05.2
 +-05.4
 +-11.0
 +-11.4
 +-16.0
 +-16.1
 +-1a.0

If I assign the device :81:00.0 to a VM, I get "Failed to find parent 
device".
I think I should get no error with return value 0 just like bus number is 0, 
because
bus 80 is the root bus as well in my case.

In the <>
Datasheet, I found that(Chapter 9.1):
For some server platforms, it may be desirable to have multiple PCHs in the 
system
Which means some PCH's may reside on a bus greater than 0.

So, is this a bug?

Thanks,
Zongyong Wu


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 1/2] logical: implement volume resize

2017-09-18 Thread John Ferlan


On 09/07/2017 08:56 AM, apolya...@beget.ru wrote:
> From: Alexander Polyakov 
> 
> Add new virStorageBackendLogicalResizeVol function to
> implement resize of logical volumes
> 
> Sparse volumes are not supported
> 
> Signed-off-by: Alexander Polyakov 
> ---
>  m4/virt-storage-lvm.m4|  4 
>  src/storage/storage_backend_logical.c | 37 
> +++
>  2 files changed, 41 insertions(+)
> 

Curious - is there a way to create/modify an LV to use LUKS encryption
via some lvm* commands?  I have another bz and the only way that comes
to me right now is to create the LV using qemu-img with and then somehow
have it recognized/added so that LVM recognizes it. I haven't put much
thought into it yet though...

Anyway...

> diff --git a/m4/virt-storage-lvm.m4 b/m4/virt-storage-lvm.m4
> index a0ccca7a0..0932995b4 100644
> --- a/m4/virt-storage-lvm.m4
> +++ b/m4/virt-storage-lvm.m4
> @@ -30,6 +30,7 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_LVM], [
>  AC_PATH_PROG([VGREMOVE], [vgremove], [], [$LIBVIRT_SBIN_PATH])
>  AC_PATH_PROG([LVREMOVE], [lvremove], [], [$LIBVIRT_SBIN_PATH])
>  AC_PATH_PROG([LVCHANGE], [lvchange], [], [$LIBVIRT_SBIN_PATH])
> +AC_PATH_PROG([LVRESIZE], [lvresize], [], [$LIBVIRT_SBIN_PATH])
>  AC_PATH_PROG([VGCHANGE], [vgchange], [], [$LIBVIRT_SBIN_PATH])
>  AC_PATH_PROG([VGSCAN], [vgscan], [], [$LIBVIRT_SBIN_PATH])
>  AC_PATH_PROG([PVS], [pvs], [], [$LIBVIRT_SBIN_PATH])
> @@ -44,6 +45,7 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_LVM], [
>if test -z "$VGREMOVE" ; then AC_MSG_ERROR([We need vgremove for LVM 
> storage driver]) ; fi
>if test -z "$LVREMOVE" ; then AC_MSG_ERROR([We need lvremove for LVM 
> storage driver]) ; fi
>if test -z "$LVCHANGE" ; then AC_MSG_ERROR([We need lvchange for LVM 
> storage driver]) ; fi
> +  if test -z "$LVRESIZE" ; then AC_MSG_ERROR([We need lvresize for LVM 
> storage driver]) ; fi
>if test -z "$VGCHANGE" ; then AC_MSG_ERROR([We need vgchange for LVM 
> storage driver]) ; fi
>if test -z "$VGSCAN" ; then AC_MSG_ERROR([We need vgscan for LVM 
> storage driver]) ; fi
>if test -z "$PVS" ; then AC_MSG_ERROR([We need pvs for LVM storage 
> driver]) ; fi
> @@ -57,6 +59,7 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_LVM], [
>if test -z "$VGREMOVE" ; then with_storage_lvm=no ; fi
>if test -z "$LVREMOVE" ; then with_storage_lvm=no ; fi
>if test -z "$LVCHANGE" ; then with_storage_lvm=no ; fi
> +  if test -z "$LVRESIZE" ; then with_storage_lvm=no ; fi
>if test -z "$VGCHANGE" ; then with_storage_lvm=no ; fi
>if test -z "$VGSCAN" ; then with_storage_lvm=no ; fi
>if test -z "$PVS" ; then with_storage_lvm=no ; fi
> @@ -75,6 +78,7 @@ AC_DEFUN([LIBVIRT_STORAGE_CHECK_LVM], [
>AC_DEFINE_UNQUOTED([VGREMOVE],["$VGREMOVE"],[Location of vgremove 
> program])
>AC_DEFINE_UNQUOTED([LVREMOVE],["$LVREMOVE"],[Location of lvremove 
> program])
>AC_DEFINE_UNQUOTED([LVCHANGE],["$LVCHANGE"],[Location of lvchange 
> program])
> +  AC_DEFINE_UNQUOTED([LVRESIZE],["$LVRESIZE"],[Location of lvresize 
> program])
>AC_DEFINE_UNQUOTED([VGCHANGE],["$VGCHANGE"],[Location of vgchange 
> program])
>AC_DEFINE_UNQUOTED([VGSCAN],["$VGSCAN"],[Location of vgscan program])
>AC_DEFINE_UNQUOTED([PVS],["$PVS"],[Location of pvs program])
> diff --git a/src/storage/storage_backend_logical.c 
> b/src/storage/storage_backend_logical.c
> index 67f70e551..e5cb09922 100644
> --- a/src/storage/storage_backend_logical.c
> +++ b/src/storage/storage_backend_logical.c
> @@ -1072,6 +1072,42 @@ virStorageBackendLogicalVolWipe(virConnectPtr conn,
>  return -1;
>  }
>  

I see the lvmresize command allows you to supply which physical device
the extents would come from. Would that be something we might want to
try here as well?  Thankfully I don't think snapshot or mirror'd LV's
come into play (yet) nor thin pool or stripes.

In any case, I don't know whether someone would "want" to choose a
specific PV to draw from or just let LVM make the decision and let it be
for now. If we did have another parameter we'd have to check if the PV
existed and was part of the VG... lot's more work, but didn't want to
miss noting it either. Just typing some thoughts...

> +static int
> +virStorageBackendLogicalResizeVol(virConnectPtr conn ATTRIBUTE_UNUSED,
> +   virStoragePoolObjPtr pool 
> ATTRIBUTE_UNUSED,
> +   virStorageVolDefPtr vol,
> +   unsigned long long capacity,
> +   unsigned int flags)
 ^
The spacing for each of the argument rows is off by a single space.

> +{
> +virCheckFlags(VIR_STORAGE_VOL_RESIZE_SHRINK |
> +  VIR_STORAGE_VOL_RESIZE_DELTA, -1);
> +
> +bool delta = flags & VIR_STORAGE_VOL_RESIZE_DELTA;
> +bool shrink = flags & VIR_STORAGE_VOL_RES

Re: [libvirt] [PATCH 2/2] news: Document lvm volume resize

2017-09-18 Thread John Ferlan


On 09/07/2017 08:56 AM, apolya...@beget.ru wrote:
> From: Alexander Polyakov 
> 
> Signed-off-by: Alexander Polyakov 
> ---
>  docs/news.xml | 8 
>  1 file changed, 8 insertions(+)
> 
> diff --git a/docs/news.xml b/docs/news.xml
> index 483f9d6d1..44e99205f 100644
> --- a/docs/news.xml
> +++ b/docs/news.xml
> @@ -37,6 +37,14 @@
>  
>  
>  
> +  
> +
> +  logical: Add volume resize
> +
> +
> +  It's now possible to resize LVM volumes.

I would probably add "non sparse" LVM volumes... Just to be clear.

Reviewed-by: John Ferlan 

John
> +
> +  
>  
>  
>  
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] qemu: Introduce a wrapper over virFileWrapperFdClose

2017-09-18 Thread John Ferlan


On 09/18/2017 05:08 AM, Michal Privoznik wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1448268
> 
> When migrating to a file (e.g. when doing 'virsh save file'),
> couple of things are happening in the thread that is executing
> the API:
> 
> 1) the domain obj is locked
> 2) iohelper is spawned as a separate process to handle all I/O
> 3) the thread waits for iohelper to finish
> 4) the domain obj is unlocked
> 
> Now, the problem is that while the thread waits in step 3 for
> iohelper to finish this may take ages because iohelper calls
> fdatasync(). And unfortunately, we are waiting the whole time
> with the domain locked. So if another thread wants to jump in and
> say copy the domain name ('virsh list' for instance), they are
> stuck.
> 
> The solution is to unlock the domain whenever waiting for I/O and
> lock it back again when it finished.
> 
> Signed-off-by: Michal Privoznik 
> ---
>  src/qemu/qemu_driver.c | 27 +++
>  1 file changed, 23 insertions(+), 4 deletions(-)
> 

I would think there'd be more concern about dropping the lock in the
middle and something changing that could more negatively affect things.
There's such a delicate balance between Save/Restore processing now - if
you adjust the locks that could mean simultaneous usage, right?

Perhaps thinking differently... Does the virdomainobjlist code for
Collect, Filter, Export really need to obtain the domain obj write lock?

If Export is but a "snapshot in time", then using virObject{Lock|Unlock}
perhaps isn't necessary in any of the @vms list processing. Since
virDomainObjListCollectIterator sets a refcnt on the object it cannot be
freed. Beyond that any state checking done in virDomainObjListFilter is
negated because the lock isn't held until the virGetDomain call is made
during virDomainObjListExport to fill in the returned @doms list. It's
quite possible after *Filter to have the 'removing' boolean get set
before the virGetDomain call is made. So what's the need or purpose of
the write locks then?

If the object locks are removed during Export, then virsh list won't
freeze while other operations that take time occur. Since the RWLock is
taken on the list while @vms is created, we know nothing else can
Add/Remove to/from the list - thus *that* is our snapshot in time.
Anything done after than is just reducing what's in that snapshot.

It's "interesting" that the domain export list uses a very different
model than other drivers... That loop to take a refcnt on each object w/
the Read lock taken is something I could see duplicated for other
drivers conceptually speaking of course if someone were to write that
kind of common processing model ;-)

John

FWIW: The "older" processing model to get the Num of domains followed by
getting a list of domains could continue to do the locking... although
that code could use a cleanup too, but that's a different issue.

> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index b334cf20b..f81d3def4 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -3216,6 +3216,25 @@ qemuOpenFileAs(uid_t fallback_uid, gid_t fallback_gid,
>  goto cleanup;
>  }
>  
> +
> +static int
> +qemuFileWrapperFDClose(virDomainObjPtr vm,
> +   virFileWrapperFdPtr fd)
> +{
> +int ret;
> +
> +/* virFileWrapperFd uses iohelper to write data onto disk.
> + * However, iohelper calls fdatasync() which may take ages to
> + * finish. Therefore, we shouldn't be waiting with the domain
> + * object locked. */
> +
> +virObjectUnlock(vm);
> +ret = virFileWrapperFdClose(fd);
> +virObjectLock(vm);
> +return ret;
> +}
> +
> +
>  /* Helper function to execute a migration to file with a correct save header
>   * the caller needs to make sure that the processors are stopped and do all 
> other
>   * actions besides saving memory */
> @@ -3276,7 +3295,7 @@ qemuDomainSaveMemory(virQEMUDriverPtr driver,
>  goto cleanup;
>  }
>  
> -if (virFileWrapperFdClose(wrapperFd) < 0)
> +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0)
>  goto cleanup;
>  
>  if ((fd = qemuOpenFile(driver, vm, path, O_WRONLY, NULL, NULL)) < 0 ||
> @@ -3827,7 +3846,7 @@ doCoreDump(virQEMUDriverPtr driver,
>   path);
>  goto cleanup;
>  }
> -if (virFileWrapperFdClose(wrapperFd) < 0)
> +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0)
>  goto cleanup;
>  
>  ret = 0;
> @@ -6687,7 +6706,7 @@ qemuDomainRestoreFlags(virConnectPtr conn,
>  
>  ret = qemuDomainSaveImageStartVM(conn, driver, vm, &fd, data, path,
>   false, QEMU_ASYNC_JOB_START);
> -if (virFileWrapperFdClose(wrapperFd) < 0)
> +if (qemuFileWrapperFDClose(vm, wrapperFd) < 0)
>  VIR_WARN("Failed to close %s", path);
>  
>  qemuProcessEndJob(driver, vm);
> @@ -6962,7 +6981,7 @@ qemuDomainObjRestore(virConnectPtr conn,
>  
>  ret = qemuDomainSaveImageS

Re: [libvirt] [PATCH v7 11/11] qemu: Add TLS support for Veritas HyperScale (VxHS)

2017-09-18 Thread ashish mittal
Hi,

I've done some TLS testing with this patch and results look good. The
following test statically adds a VxHS disk to a guest in the TLS mode.
Boots up the guest and makes sure that we can do read/writes to the VxHS
disk from within the guest with TLS enabled.

(1) Create a backing store file /tmp/test_vxhs_disk_1 and start the VxHS
test server "qnio_server" with TLS enabled.

(2) Client side TLS env was setup as follows -

[root@audi ~] 2017-09-18 13:56:13# grep -i vxhs /etc/libvirt/qemu.conf |
grep -v "^#"
vxhs_tls = 1
vxhs_tls_x509_cert_dir = "/etc/pki/libvirt-vxhs"

[root@audi ~] 2017-09-18 13:56:22# ll /etc/pki/libvirt-vxhs
total 20
-r--r--r--. 1 root root 4062 Sep 17 23:15 ca-cert.pem
-rw-r--r--. 1 root root 1866 Sep 17 22:52 client-cert.pem
-r. 1 root root 1679 Sep 17 22:52 client-key.pem
[root@audi ~] 2017-09-18 13:56:35#

(3) virsh edit and add a new VxHS device with tls='yes'

The XML added to existing domain -


  
  

  
  
  
  eb90327c-8302-4725-9e1b-4e85ed4dc251
  
  


(4) Start the domain and check if qemu command is correct

[root@audi ~] 2017-09-18 13:29:01# virsh start myfc24
Domain myfc24 started

[root@audi ~] 2017-09-18 13:29:20# ps -ef | grep qemu

root  9578 1 99 13:29 ?00:00:20 /usr/bin/qemu-system-x86_64
-machine accel=kvm -name guest=myfc24,debug-threads=on -S -object
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-myfc24/master-key.aes
-machine pc-i440fx-2.6,accel=kvm,usb=off,vmport=off,dump-guest-core=off
-cpu Opteron_G3 -m 1024 -realtime mlock=off -smp
2,sockets=2,cores=1,threads=1 -uuid 70454565-8185-4506-b50f-d2cf55d83796
-no-user-config -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-myfc24/monitor.sock,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc
base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet
-no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1
-boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x6.0x7
-device
ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x6
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x6.0x1
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x6.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 -drive
file=/var/lib/libvirt/images/myfc24_rootdisk.qcow2,format=qcow2,if=none,id=drive-ide0-0-0
-device
ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -drive
if=none,id=drive-ide0-0-1,readonly=on -device
ide-cd,bus=ide.0,unit=1,drive=drive-ide0-0-1,id=ide0-0-1 -object
tls-creds-x509,id=objvirtio-disk2_tls0,dir=/etc/pki/libvirt-vxhs,endpoint=client,verify-peer=yes
-drive
file.driver=vxhs,file.tls-creds=objvirtio-disk2_tls0,file.vdisk-id=/tmp/test_vxhs_disk_1,file.server.type=tcp,file.server.host=127.0.0.1,file.server.port=,format=raw,if=none,id=drive-virtio-disk2,serial=eb90327c-8302-4725-9e1b-4e85ed4dc251,cache=none
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0xa,drive=drive-virtio-disk2,id=virtio-disk2
-netdev tap,fd=27,id=hostnet0 -device
rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:e4:9e:30,bus=pci.0,addr=0x3
-netdev tap,fd=29,id=hostnet1,vhost=on,vhostfd=30 -device
virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:b1:43:c4,bus=pci.0,addr=0x8
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
spicevmc,id=charchannel0,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=com.redhat.spice.0
-spice
port=5900,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on
-device
qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2
-device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device
hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev
spicevmc,id=charredir0,name=usbredir -device
usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=1 -chardev
spicevmc,id=charredir1,name=usbredir -device
usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=2 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg timestamp=on

(5) Log in to the guest domain and make sure we see this VxHS disk

[root@camshaft ~] 2017-09-18 13:32:22# fdisk -l
...
Disk /dev/vda: 1 MiB, 1048576 bytes, 2048 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes

Disk /dev/mapper/fedora-root: 45.4 GiB, 48704258048 bytes, 95125504 sectors
...

(6) Create a disk label and partition. Do mkfs and mount the FS. Copy some
files to the disk and check general read/write operations.

[root@camshaft ~] 2017-09-18 13:32:35# fdisk /dev/vda

Created a new partition 1 of type 'Linux' and of size 1023.5 KiB.

Command (m for help): p
Disk /dev/vda: 1 MiB, 1048576 bytes, 2048 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical

[libvirt] [PATCH] apparmor: support finer-grained ptrace checks

2017-09-18 Thread Jim Fehlig
Kernel 4.13 introduced finer-grained ptrace checks

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?h=v4.13.2&id=290f458a4f16f9cf6cb6562b249e69fe1c3c3a07

When Apparmor is enabled and libvirtd is confined, attempting to start
a domain fails

virsh start test
error: Failed to start domain test
error: internal error: child reported: Kernel does not provide mount
   namespace: Permission denied

The audit log contains

type=AVC msg=audit(1505466699.828:534): apparmor="DENIED"
operation="ptrace" profile="/usr/sbin/libvirtd" pid=6621
comm="libvirtd" requested_mask="trace" denied_mask="trace"
peer="/usr/sbin/libvirtd"

It was also noticed that simply connecting to libvirtd (e.g. virsh list)
resulted in the following entries in the audit log

type=AVC msg=audit(1505755799.975:65): apparmor="DENIED"
operation="ptrace" profile="/usr/sbin/libvirtd" pid=1418
comm="libvirtd" requested_mask="trace" denied_mask="trace"
peer="unconfined"
type=AVC msg=audit(1505755799.976:66): apparmor="DENIED"
operation="ptrace" profile="/usr/sbin/libvirtd" pid=1418
comm="libvirtd" requested_mask="trace" denied_mask="trace"
peer="unconfined"

Both Apparmor denials can be fixed by adding ptrace rules to the
libvirtd profile. The new rules only grant trace permission.

Resolves: https://bugzilla.suse.com/show_bug.cgi?id=1058847
Signed-off-by: Jim Fehlig 
---

Even with debug enabled in libvirtd, I've had a hard time correlating a
libvirtd action that results in the denied ptrace check seen in the audit
log. I suspect it is related to accessing files in /proc as mentioned in
the apparmor wiki

http://wiki.apparmor.net/index.php/TechnicalDo_Proc_and_ptrace

cc'ing some of the usual apparmor suspects for any words of wisdom.

 examples/apparmor/usr.sbin.libvirtd | 4 
 1 file changed, 4 insertions(+)

diff --git a/examples/apparmor/usr.sbin.libvirtd 
b/examples/apparmor/usr.sbin.libvirtd
index acb59e071..ff84aa149 100644
--- a/examples/apparmor/usr.sbin.libvirtd
+++ b/examples/apparmor/usr.sbin.libvirtd
@@ -37,6 +37,10 @@
   network packet dgram,
   network packet raw,
 
+  # Support finer-grained ptrace checks, which were enabled in kernel 4.13
+  ptrace trace peer=/usr/sbin/libvirtd,
+  ptrace trace peer=unconfined,
+
   # Very lenient profile for libvirtd since we want to first focus on confining
   # the guests. Guests will have a very restricted profile.
   / r,
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] apparmor: delete profile on VM shutdown

2017-09-18 Thread Guido Günther
instead of only unloading it. This makes sure old profiles don't pile up
in /etc/apparmor.d/libvirt and we get updates to modified templates on
VM restart.
---
 src/security/security_apparmor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/security/security_apparmor.c b/src/security/security_apparmor.c
index 5afe0c5c85..1db94c632f 100644
--- a/src/security/security_apparmor.c
+++ b/src/security/security_apparmor.c
@@ -220,7 +220,7 @@ remove_profile(const char *profile)
 {
 int rc = -1;
 const char * const argv[] = {
-VIRT_AA_HELPER, "-R", "-u", profile, NULL
+VIRT_AA_HELPER, "-D", "-u", profile, NULL
 };
 
 if (virRun(argv, NULL) == 0)
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 1/2] qemu: blockPeek: Fix filling of the return buffer

2017-09-18 Thread Eric Blake
On 09/18/2017 09:11 AM, Peter Krempa wrote:
> Commit 3956af495e broke the blockPeek API since virStorageFileRead
> allocates a return buffer and fills it with the data, while the API
> fills a user-provided buffer. This did not get caught by the compiler
> since the API prototype uses a 'void *'.
> 
> Fix it by transferring the data from the allocated buffer to the user
> provided buffer.
> 
> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1491217
> ---
>  src/qemu/qemu_driver.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index e1a0dd553..93a1c6061 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -11415,6 +11415,7 @@ qemuDomainBlockPeek(virDomainPtr dom,
>  virQEMUDriverPtr driver = dom->conn->privateData;
>  virDomainDiskDefPtr disk = NULL;
>  virDomainObjPtr vm;
> +char *tmpbuf = NULL;
>  int ret = -1;
> 
>  virCheckFlags(0, -1);
> @@ -11444,12 +11445,15 @@ qemuDomainBlockPeek(virDomainPtr dom,
>  if (virStorageFileRead(disk->src, offset, size, buffer) < 0)
>  goto cleanup;
> 
> +memcpy(buffer, tmpbuf, size);

Umm, where is tmpbuf actually set to a non-null pointer? Shouldn't the
virStorageFileRead() call also be updated?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] apparmor: cater for new AAVMF image location

2017-09-18 Thread Guido Günther
Hi,
On Mon, Sep 18, 2017 at 02:05:41PM +0200, Michal Privoznik wrote:
> On 09/15/2017 06:10 PM, Guido Günther wrote:
> > Things moved again, sigh.
> > ---
> >  src/security/virt-aa-helper.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/src/security/virt-aa-helper.c b/src/security/virt-aa-helper.c
> > index 55a686a59c..0b43c8e391 100644
> > --- a/src/security/virt-aa-helper.c
> > +++ b/src/security/virt-aa-helper.c
> > @@ -516,7 +516,8 @@ valid_path(const char *path, const bool readonly)
> >  "/usr/share/OVMF/",  /* for OVMF images */
> >  "/usr/share/ovmf/",  /* for OVMF images */
> >  "/usr/share/AAVMF/", /* for AAVMF images */
> > -"/usr/share/qemu-efi/"   /* for AAVMF images */
> > +"/usr/share/qemu-efi/",  /* for AAVMF images */
> > +"/usr/share/qemu-efi-aarch64/"   /* for AAVMF images */
> >  };
> >  /* override the above with these */
> >  const char * const override[] = {
> > 
> 
> ACK

Pushed. Thanks
 -- Guido

> 
> Michal
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] apparmor: add attach_disconnected

2017-09-18 Thread Guido Günther
Hi,
On Mon, Sep 18, 2017 at 02:05:30PM +0200, Michal Privoznik wrote:
> On 09/15/2017 05:17 PM, Guido Günther wrote:
> > Otherwise we fail to reconnect to /dev/net/tun opened by libvirtd
> > like
> > 
> > [ 8144.507756] audit: type=1400 audit(1505488162.386:38069121): 
> > apparmor="DENIED" operation="file_perm" info="Failed name lookup - 
> > disconnected path" error=-13 
> > profile="libvirt-5dfcc8a7-b79a-4fa9-a41f-f6271651934c" name="dev/net/tun" 
> > pid=9607 comm="qemu-system-x86" requested_mask="r" denied_mask="r" 
> > fsuid=117 ouid=0
> > 
> > ---
> > I do wonder why we didn't see this earlier though.
> > 
> >  examples/apparmor/TEMPLATE.lxc  | 2 +-
> >  examples/apparmor/TEMPLATE.qemu | 2 +-
> >  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> +1/ACK/or whatever.

Pushed. Thanks.
 -- Guido

> 
> Michal
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] util: Fix configure.ac check for DEVLINK_CMD_ESWITCH_GET

2017-09-18 Thread John Ferlan
Rather than checking for whether the devlink.h on the system has
multiple symbols, let's only check for whether the command we want
is defined.

Turns out the mechanism of providing multiple definitions to check via
AC_CHECK_DECLS in order to determine whether HAVE_DECL_DEVLINK should
be set resulted in a false positive since as long as one of the defs
was true, then the HAVE_DECL_DEVLINK got defined.

This is further complicated by a change between kernel 4.8 and 4.11
where the originally defined name DEVLINK_CMD_ESWITCH_MODE_GET was
changed to DEVLINK_CMD_ESWITCH_GET with the comment/caveat that
the old format is obsolete should never be used. Therefore, even
though some kernels will have a couple of the same symbols that
were added at the same time (DEVLINK_ATTR_ESWITCH_MODE and
DEVLINK_ESWITCH_MODE_SWITCHDEV), they won't have the newer name
and thus cause a build failure.

So even though multiple DEVLINK_CMD_SWITCH* symbols are used to
determine whether the set HAVE_DECL_DEVLINK, this should cover
those kernels before 4.11 with the old definition.

Signed-off-by: John Ferlan 
---

 I'd push this as a build breaker, but I don't want to need to go through
 the trouble again if i got this wrong, so hopefully someone who's seeing
 this can try out the patch...  It's also present on a couple of the CI
 infrastructure environments (f23, centos-7):

  https://ci.centos.org/view/libvirt/job/libvirt-master-build/

 configure.ac | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/configure.ac b/configure.ac
index c9509c7f9..73ae7fdd5 100644
--- a/configure.ac
+++ b/configure.ac
@@ -630,12 +630,16 @@ fi
 dnl
 dnl check for kernel headers required by devlink
 dnl
+dnl kernel 4.8 introduced DEVLINK_CMD_ESWITCH_MODE_GET, but that was
+dnl later changed in kernel 4.11 to be just DEVLINK_CMD_ESWITCH_GET, so
+dnl for "completeness" let's allow HAVE_DECL_DEVLINK to be define if
+dnl either is defined.
 if test "$with_linux" = "yes"; then
 AC_CHECK_HEADERS([linux/devlink.h])
-AC_CHECK_DECLS([DEVLINK_GENL_VERSION, DEVLINK_GENL_NAME, DEVLINK_ATTR_MAX, 
DEVLINK_CMD_ESWITCH_GET, DEVLINK_ATTR_BUS_NAME, DEVLINK_ATTR_DEV_NAME, 
DEVLINK_ATTR_ESWITCH_MODE, DEVLINK_ESWITCH_MODE_SWITCHDEV],
+AC_CHECK_DECLS([DEVLINK_CMD_ESWITCH_GET, DEVLINK_CMD_ESWITCH_MODE_GET],
[AC_DEFINE([HAVE_DECL_DEVLINK],
   [1],
-  [whether devlink declarations are available])],
+  [whether devlink CMD_ESWITCH_GET is available])],
[],
[[#include ]])
 fi
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] cpu_ppc64: Error out when model tag missing in virsh cpu-compare xml

2017-09-18 Thread Nitesh Konkar
libvirtd throws unhandled signal 11 on ppc while running
virsh cpu-compare with missing model tag in the xml. This
patch errors out in such situation.

Signed-off-by: Nitesh Konkar 
---
 src/cpu/cpu_ppc64.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/src/cpu/cpu_ppc64.c b/src/cpu/cpu_ppc64.c
index b58e80a..c11ac9f 100644
--- a/src/cpu/cpu_ppc64.c
+++ b/src/cpu/cpu_ppc64.c
@@ -247,6 +247,12 @@ ppc64ModelFromCPU(const virCPUDef *cpu,
 {
 struct ppc64_model *model;
 
+if (!cpu->model) {
+virReportError(VIR_ERR_INVALID_ARG, "%s",
+   _("no guest CPU model specified"));
+return NULL;
+}
+
 if (!(model = ppc64ModelFind(map, cpu->model))) {
 virReportError(VIR_ERR_INTERNAL_ERROR,
_("Unknown CPU model %s"), cpu->model);
-- 
2.9.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 2/7] nodedev: udev: Lock the driver in udevEventCheckMonitorFd

2017-09-18 Thread Erik Skultety
The udev monitor is not an immutable resource, so better protect every
access to it.

Signed-off-by: Erik Skultety 
---
 src/node_device/node_device_udev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/node_device/node_device_udev.c 
b/src/node_device/node_device_udev.c
index fe21ad7df..6c7f887ed 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1615,6 +1615,7 @@ udevHandleOneDevice(struct udev_device *device)
 }
 
 
+/* the caller must be holding the driver lock prior to calling this function */
 static bool
 udevEventCheckMonitorFD(struct udev_monitor *udev_monitor,
 int fd)
@@ -1653,6 +1654,7 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
 struct udev_device *device = NULL;
 struct udev_monitor *udev_monitor = NULL;
 
+nodeDeviceLock();
 udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
 
 if (!udevEventCheckMonitorFD(udev_monitor, fd)) {
@@ -1661,6 +1663,7 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
 }
 
 device = udev_monitor_receive_device(udev_monitor);
+nodeDeviceUnlock();
 
 if (!device) {
 virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 7/7] nodedev: Work around the uevent race by hooking up virFileWaitForAccess

2017-09-18 Thread Erik Skultety
If we find ourselves in the situation that the 'add' uevent has been
fired earlier than the sysfs tree for a device was created, we should
use the best-effort approach and give kernel some predetermined amount
of time, thus waiting for the attributes to be ready rather than
discarding the device from our device list forever. If those don't appear
in the given time frame, we need to move on, since libvirt can't wait
indefinitely.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285

Signed-off-by: Erik Skultety 
---
 src/node_device/node_device_udev.c | 16 +++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/src/node_device/node_device_udev.c 
b/src/node_device/node_device_udev.c
index 70e15ffb8..2f63256a3 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1166,9 +1166,23 @@ udevProcessMediatedDevice(struct udev_device *dev,
 char *canonicalpath = NULL;
 virNodeDevCapMdevPtr data = &def->caps->data.mdev;
 
-if (virAsprintf(&linkpath, "%s/mdev_type", udev_device_get_syspath(dev)) < 
0)
+/* Because of a kernel uevent race, we might get the 'add' event prior to
+ * the sysfs tree being ready, so any attempt to access any sysfs attribute
+ * would result in ENOENT and us dropping the device, so let's work around
+ * it by waiting for the attributes to become available.
+ */
+
+if (virAsprintf(&linkpath, "%s/mdev_type",
+udev_device_get_syspath(dev)) < 0)
 goto cleanup;
 
+if (virFileWaitForExists(linkpath, 1, 100) < 0) {
+virReportSystemError(errno,
+ _("failed to wait for file '%s' to appear"),
+ linkpath);
+goto cleanup;
+}
+
 if (virFileResolveLink(linkpath, &canonicalpath) < 0) {
 virReportSystemError(errno, _("failed to resolve '%s'"), linkpath);
 goto cleanup;
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 5/7] nodedev: Disable/re-enable polling on the udev fd

2017-09-18 Thread Erik Skultety
The event loop may get scheduled earlier than the udev event handler
thread which means that it would keep invoking the handler callback with
"new" events, while in fact it's most likely still the same event which
the handler thread hasn't managed to remove from the socket queue yet.
This is due to unwanted increments of the number of events queuing on the
socket and causes the handler thread to spam the logs with an error about
libudev returning NULL (errno == EAGAIN).

Signed-off-by: Erik Skultety 
---
 src/node_device/node_device_udev.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/src/node_device/node_device_udev.c 
b/src/node_device/node_device_udev.c
index 96760d1e4..70e15ffb8 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1696,6 +1696,7 @@ udevEventCheckMonitorFD(struct udev_monitor *udev_monitor,
 static void
 udevEventHandleThread(void *opaque)
 {
+udevPrivate *priv = NULL;
 udevEventThreadDataPtr privateData = opaque;
 struct udev_device *device = NULL;
 struct udev_monitor *udev_monitor = NULL;
@@ -1715,6 +1716,7 @@ udevEventHandleThread(void *opaque)
 virMutexUnlock(&privateData->lock);
 
 nodeDeviceLock();
+priv = driver->privateData;
 udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
 
 if (!udevEventCheckMonitorFD(udev_monitor, privateData->monitor_fd)) {
@@ -1725,6 +1727,9 @@ udevEventHandleThread(void *opaque)
 device = udev_monitor_receive_device(udev_monitor);
 nodeDeviceUnlock();
 
+/* Re-enable polling for new events on the @udev_monitor */
+virEventUpdateHandle(priv->watch, VIR_EVENT_HANDLE_READABLE);
+
 if (!device) {
 virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
_("udev_monitor_receive_device returned NULL"));
@@ -1750,10 +1755,12 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
 int events ATTRIBUTE_UNUSED,
 void *opaque)
 {
+udevPrivate *priv = NULL;
 struct udev_monitor *udev_monitor = NULL;
 udevEventThreadDataPtr threadData = opaque;
 
 nodeDeviceLock();
+priv = driver->privateData;
 udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
 
 if (!udevEventCheckMonitorFD(udev_monitor, fd)) {
@@ -1771,6 +1778,16 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
 threadData->nevents++;
 virCondSignal(&threadData->threadCond);
 virMutexUnlock(&threadData->lock);
+
+/* Due to scheduling, the eventloop might poll the udev monitor before the
+ * handler thread actually removes the data from the socket, thus causing 
it
+ * to spam errors about data not being ready yet, because
+ * udevEventHandleCallback would be invoked multiple times incrementing the
+ * counter of events queuing for a single event.
+ * Therefore we need to disable polling on the fd until the thread actually
+ * removes the data from the socket.
+ */
+virEventUpdateHandle(priv->watch, 0);
 }
 
 
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 3/7] udev: Split udevEventHandleCallback in two functions

2017-09-18 Thread Erik Skultety
This patch splits udevEventHandleCallback in two (introduces
udevEventHandleThread) in order to be later able to refactor the latter
to actually become a detached thread which will wait some time for the
kernel to create the whole sysfs tree for a device as we cannot do that
in the event loop directly.

Signed-off-by: Erik Skultety 
---
 src/node_device/node_device_udev.c | 27 +++
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/node_device/node_device_udev.c 
b/src/node_device/node_device_udev.c
index 6c7f887ed..e144472f1 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1646,13 +1646,11 @@ udevEventCheckMonitorFD(struct udev_monitor 
*udev_monitor,
 
 
 static void
-udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
-int fd,
-int events ATTRIBUTE_UNUSED,
-void *data ATTRIBUTE_UNUSED)
+udevEventHandleThread(void *opaque)
 {
 struct udev_device *device = NULL;
 struct udev_monitor *udev_monitor = NULL;
+int fd = (intptr_t) opaque;
 
 nodeDeviceLock();
 udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
@@ -1676,6 +1674,27 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
 }
 
 
+static void
+udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
+int fd,
+int events ATTRIBUTE_UNUSED,
+void *data ATTRIBUTE_UNUSED)
+{
+struct udev_monitor *udev_monitor = NULL;
+
+nodeDeviceLock();
+udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
+
+if (!udevEventCheckMonitorFD(udev_monitor, fd)) {
+nodeDeviceUnlock();
+return;
+}
+nodeDeviceUnlock();
+
+udevEventHandleThread((void *)(intptr_t) fd);
+}
+
+
 /* DMI is intel-compatible specific */
 #if defined(__x86_64__) || defined(__i386__) || defined(__amd64__)
 static void
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 1/7] nodedev: Introduce udevEventCheckMonitorFD helper function

2017-09-18 Thread Erik Skultety
We need to perform some sanity checks on the udev monitor before every
use so that we know nothing changed in the meantime. The reason for
moving the code to a separate function is to be able to perform the same
check from a worker thread that will replace the udevEventHandleCallback
in terms of poking udev for device events.

Signed-off-by: Erik Skultety 
---
 src/node_device/node_device_udev.c | 40 +-
 1 file changed, 27 insertions(+), 13 deletions(-)

diff --git a/src/node_device/node_device_udev.c 
b/src/node_device/node_device_udev.c
index f4177455c..fe21ad7df 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -1615,14 +1615,10 @@ udevHandleOneDevice(struct udev_device *device)
 }
 
 
-static void
-udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
-int fd,
-int events ATTRIBUTE_UNUSED,
-void *data ATTRIBUTE_UNUSED)
+static bool
+udevEventCheckMonitorFD(struct udev_monitor *udev_monitor,
+int fd)
 {
-struct udev_device *device = NULL;
-struct udev_monitor *udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
 int udev_fd = -1;
 
 udev_fd = udev_monitor_get_fd(udev_monitor);
@@ -1641,21 +1637,39 @@ udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
 virEventRemoveHandle(priv->watch);
 priv->watch = -1;
 
-goto cleanup;
+return false;
+}
+
+return true;
+}
+
+
+static void
+udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
+int fd,
+int events ATTRIBUTE_UNUSED,
+void *data ATTRIBUTE_UNUSED)
+{
+struct udev_device *device = NULL;
+struct udev_monitor *udev_monitor = NULL;
+
+udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
+
+if (!udevEventCheckMonitorFD(udev_monitor, fd)) {
+nodeDeviceUnlock();
+return;
 }
 
 device = udev_monitor_receive_device(udev_monitor);
-if (device == NULL) {
+
+if (!device) {
 virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
_("udev_monitor_receive_device returned NULL"));
-goto cleanup;
+return;
 }
 
 udevHandleOneDevice(device);
-
- cleanup:
 udev_device_unref(device);
-return;
 }
 
 
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 6/7] util: Introduce virFileWaitForExists

2017-09-18 Thread Erik Skultety
Since we have a number of places where we workaround timing issues with
devices, attributes (files in general) not being available at the time
of processing them by calling usleep in a loop for a fixed number of
tries, we could as well have a utility function that would do that.
Therefore we won't have to duplicate this ugly workaround even more.

Signed-off-by: Erik Skultety 
---
 src/libvirt_private.syms |  1 +
 src/util/virfile.c   | 31 +++
 src/util/virfile.h   |  2 ++
 3 files changed, 34 insertions(+)

diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index d12db9b3f..57a5ea300 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -1730,6 +1730,7 @@ virFileStripSuffix;
 virFileTouch;
 virFileUnlock;
 virFileUpdatePerm;
+virFileWaitForExists;
 virFileWrapperFdClose;
 virFileWrapperFdFree;
 virFileWrapperFdNew;
diff --git a/src/util/virfile.c b/src/util/virfile.c
index 2f28e83f4..01cc61e98 100644
--- a/src/util/virfile.c
+++ b/src/util/virfile.c
@@ -4166,3 +4166,34 @@ virFileReadValueString(char **value, const char *format, 
...)
 VIR_FREE(str);
 return ret;
 }
+
+
+/**
+ * virFileWaitForExists:
+ * @path: absolute path to a sysfs attribute (can be a symlink)
+ * @ms: how long to wait (in milliseconds)
+ * @tries: how many times should we try to wait for @path to become accessible
+ *
+ * Checks the existence of @path. In case the file defined by @path
+ * doesn't exist, we wait for it to appear in @ms milliseconds (for up to
+ * @tries attempts).
+ *
+ * Returns 0 on success, -1 on error, setting errno appropriately.
+ */
+int
+virFileWaitForExists(const char *path,
+ size_t ms,
+ size_t tries)
+{
+errno = 0;
+
+/* wait for @path to be accessible in @ms milliseconds, up to @tries */
+while (tries-- > 0 && !virFileExists(path)) {
+if (tries == 0 || errno != ENOENT)
+return -1;
+
+usleep(ms * 1000);
+}
+
+return 0;
+}
diff --git a/src/util/virfile.h b/src/util/virfile.h
index 57ceb8072..310a8e7f6 100644
--- a/src/util/virfile.h
+++ b/src/util/virfile.h
@@ -347,6 +347,8 @@ int virFileReadValueScaledInt(unsigned long long *value, 
const char *format, ...
 int virFileReadValueString(char **value, const char *format, ...)
  ATTRIBUTE_FMT_PRINTF(2, 3);
 
+int virFileWaitForExists(const char *path, size_t ms, size_t tries);
+
 
 int virFileInData(int fd,
   int *inData,
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 4/7] udev: Convert udevEventHandleThread to an actual thread routine

2017-09-18 Thread Erik Skultety
Adjust udevEventHandleThread to be a proper thread routine running in an
infinite loop handling devices. Also introduce udevEventThreadData
private structure.
Every time there's and incoming event from udev, udevEventHandleCallback
only increments the number of events queuing on the monitor and signals
the worker thread to handle them.

Signed-off-by: Erik Skultety 
---
 src/node_device/node_device_udev.c | 125 +++--
 1 file changed, 107 insertions(+), 18 deletions(-)

diff --git a/src/node_device/node_device_udev.c 
b/src/node_device/node_device_udev.c
index e144472f1..96760d1e4 100644
--- a/src/node_device/node_device_udev.c
+++ b/src/node_device/node_device_udev.c
@@ -59,6 +59,54 @@ struct _udevPrivate {
 bool privileged;
 };
 
+typedef struct _udevEventThreadData udevEventThreadData;
+typedef udevEventThreadData *udevEventThreadDataPtr;
+
+struct _udevEventThreadData {
+virMutex lock;
+virCond threadCond;
+
+bool threadQuit;
+int monitor_fd;
+unsigned long long nevents; /* number of udev events queuing on monitor */
+};
+
+
+static udevEventThreadDataPtr
+udevEventThreadDataNew(int fd)
+{
+udevEventThreadDataPtr ret = NULL;
+
+if (VIR_ALLOC_QUIET(ret) < 0)
+return NULL;
+
+if (virMutexInit(&ret->lock) < 0)
+goto cleanup;
+
+if (virCondInit(&ret->threadCond) < 0)
+goto cleanup_mutex;
+
+ret->monitor_fd = fd;
+
+return ret;
+
+ cleanup_mutex:
+virMutexDestroy(&ret->lock);
+
+ cleanup:
+VIR_FREE(ret);
+return NULL;
+}
+
+
+static void
+udevEventThreadDataFree(udevEventThreadDataPtr data)
+{
+virMutexDestroy(&data->lock);
+virCondDestroy(&data->threadCond);
+VIR_FREE(data);
+}
+
 
 static bool
 udevHasDeviceProperty(struct udev_device *dev,
@@ -1648,29 +1696,51 @@ udevEventCheckMonitorFD(struct udev_monitor 
*udev_monitor,
 static void
 udevEventHandleThread(void *opaque)
 {
+udevEventThreadDataPtr privateData = opaque;
 struct udev_device *device = NULL;
 struct udev_monitor *udev_monitor = NULL;
-int fd = (intptr_t) opaque;
 
-nodeDeviceLock();
-udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
+/* continue rather than break from the loop on non-fatal errors */
+while (1) {
+virMutexLock(&privateData->lock);
+while (privateData->nevents == 0 && !privateData->threadQuit) {
+if (virCondWait(&privateData->threadCond, &privateData->lock)) {
+virReportSystemError(errno, "%s",
+ _("handler failed to wait on condition"));
+goto cleanup;
+}
+}
 
-if (!udevEventCheckMonitorFD(udev_monitor, fd)) {
+privateData->nevents--;
+virMutexUnlock(&privateData->lock);
+
+nodeDeviceLock();
+udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
+
+if (!udevEventCheckMonitorFD(udev_monitor, privateData->monitor_fd)) {
+nodeDeviceUnlock();
+goto cleanup;
+}
+
+device = udev_monitor_receive_device(udev_monitor);
 nodeDeviceUnlock();
-return;
-}
 
-device = udev_monitor_receive_device(udev_monitor);
-nodeDeviceUnlock();
+if (!device) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("udev_monitor_receive_device returned NULL"));
+goto next;
+}
+
+udevHandleOneDevice(device);
 
-if (!device) {
-virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
-   _("udev_monitor_receive_device returned NULL"));
-return;
+next:
+udev_device_unref(device);
 }
 
-udevHandleOneDevice(device);
+ cleanup:
 udev_device_unref(device);
+udevEventThreadDataFree(privateData);
+return;
 }
 
 
@@ -1678,20 +1748,29 @@ static void
 udevEventHandleCallback(int watch ATTRIBUTE_UNUSED,
 int fd,
 int events ATTRIBUTE_UNUSED,
-void *data ATTRIBUTE_UNUSED)
+void *opaque)
 {
 struct udev_monitor *udev_monitor = NULL;
+udevEventThreadDataPtr threadData = opaque;
 
 nodeDeviceLock();
 udev_monitor = DRV_STATE_UDEV_MONITOR(driver);
 
 if (!udevEventCheckMonitorFD(udev_monitor, fd)) {
+virMutexLock(&threadData->lock);
+threadData->threadQuit = true;
+virCondSignal(&threadData->threadCond);
+virMutexUnlock(&threadData->lock);
+
 nodeDeviceUnlock();
 return;
 }
 nodeDeviceUnlock();
 
-udevEventHandleThread((void *)(intptr_t) fd);
+virMutexLock(&threadData->lock);
+threadData->nevents++;
+virCondSignal(&threadData->threadCond);
+virMutexUnlock(&threadData->lock);
 }
 
 
@@ -1818,6 +1897,9 @@ nodeStateInitialize(bool privileged,
 {
 udevPrivate *priv = NULL;
 struct udev *udev = NULL;
+int monitor_fd = -1;
+virThread th;
+udevEventThreadDataPtr threadData =

[libvirt] [PATCH v4 0/7] Work around the kernel mdev uevent race in nodedev

2017-09-18 Thread Erik Skultety
v3 here: https://www.redhat.com/archives/libvir-list/2017-August/msg00703.html

since v3:
- some minor cosmetic changes related to comments from the original patches 1-2
- everything else (provided it didn't cause merge conflicts) is unchanged

Erik Skultety (7):
  nodedev: Introduce udevEventCheckMonitorFD helper function
  nodedev: udev: Lock the driver in udevEventCheckMonitorFd
  udev: Split udevEventHandleCallback in two functions
  udev: Convert udevEventHandleThread to an actual thread routine
  nodedev: Disable/re-enable polling on the udev fd
  util: Introduce virFileWaitForExists
  nodedev: Work around the uevent race by hooking up
virFileWaitForAccess

 src/libvirt_private.syms   |   1 +
 src/node_device/node_device_udev.c | 194 +
 src/util/virfile.c |  31 ++
 src/util/virfile.h |   2 +
 4 files changed, 209 insertions(+), 19 deletions(-)

--
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCHv2] nodedev: add switchdev to NIC capabilities

2017-09-18 Thread John Ferlan


On 09/18/2017 10:57 AM, Ján Tomko wrote:
> On Thu, Sep 14, 2017 at 11:23:02AM -0400, Laine Stump wrote:
>> (Almost all of my comments result in "ok, no action needed". There are
>> just three items I would like to see changed (2 trivial, 1 also small
>> but Edan or John may think it prudent to re-test with the change before
>> pushing) - I marked those comments with [**].
>>
>> On 08/21/2017 05:19 AM, Edan David wrote:
>>> Adding functionality to libvirt that will allow querying the interface
>>> for the availability of switchdev Offloading NIC capabilities.
>>> The switchdev mode was introduced in kernel 4.8, the iproute2-devlink
>>> command to retrieve the swtichdev NIC feature,
>>> Command example:  devlink dev eswitch show pci/:03:00.0
>>> This feature is needed for Openstack so we can do a scheduling decision
>>> if the NIC is in Hardware Offload (switchdev) or regular SR-IOV
>>> (legacy) mode.
>>> And select the appropriate hypervisors with the requested capability
>>> see [1].
>>>
>>> [1] -
>>> https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/enable-sriov-nic-features.html
>>>
>>> ---
>>>  configure.ac  |  13 ++
>>>  docs/formatnode.html.in   |   1 +
>>>  src/util/virnetdev.c  | 187
>>> +-
>>>  src/util/virnetdev.h  |   1 +
>>>  tests/nodedevschemadata/net_00_13_02_b9_f9_d3.xml |   1 +
>>>  tests/nodedevschemadata/net_00_15_58_2f_e9_55.xml |   1 +
>>>  6 files changed, 203 insertions(+), 1 deletion(-)
>>>
> 
> This patch fails to compile on a Gentoo machine with
> sys-kernel/linux-headers-4.8:
> 
> util/virnetdev.c:3248:18: error: use of undeclared identifier
> 'DEVLINK_CMD_ESWITCH_GET'; did you mean 'DEVLINK_CMD_ESWITCH_MODE_GET'?
>    gmsgh->cmd = DEVLINK_CMD_ESWITCH_GET;
>     ^~~
>     DEVLINK_CMD_ESWITCH_MODE_GET
> /usr/include/linux/devlink.h:60:2: note: 'DEVLINK_CMD_ESWITCH_MODE_GET'
> declared here
>    DEVLINK_CMD_ESWITCH_MODE_GET,
>    ^
> 1 error generated.
> 
>>> diff --git a/configure.ac b/configure.ac
>>> index b12b7fa..c089798 100644
>>> --- a/configure.ac
>>> +++ b/configure.ac
>>> @@ -627,6 +627,19 @@ if test "$with_linux" = "yes"; then
>>>  AC_CHECK_HEADERS([linux/btrfs.h])
>>>  fi
>>>
>>> +dnl
>>> +dnl check for kernel headers required by devlink
>>> +dnl
>>> +if test "$with_linux" = "yes"; then
>>> +    AC_CHECK_HEADERS([linux/devlink.h])
>>> +    AC_CHECK_DECLS([DEVLINK_GENL_VERSION, DEVLINK_GENL_NAME,
>>> DEVLINK_ATTR_MAX, DEVLINK_CMD_ESWITCH_GET, DEVLINK_ATTR_BUS_NAME,
>>> DEVLINK_ATTR_DEV_NAME, DEVLINK_ATTR_ESWITCH_MODE,
>>> DEVLINK_ESWITCH_MODE_SWITCHDEV],
>>
>> That's very . thorough, and potentially misleading if someone ever
>> wanted to use devlink to check for something other than switchdev (e.g.
>> [mythical feature X]) on some system that didn't have the proper stuff
>> defined for switchdev, but did have it for [other mythical feature X].
>> But that's all just hypothetical, so this is fine with me.
>>
>>
>>> +   [AC_DEFINE([HAVE_DECL_DEVLINK],
>>> +  [1],
>>> +  [whether devlink declarations is
>>> available])],
>>
>> [**]
>> s/is/are/
>>
> 
> It seems the [action-if-found] is executed for every found symbol:
> 
> #define HAVE_DECL_DEVLINK_GENL_VERSION 1
> #define HAVE_DECL_DEVLINK 1
> #define HAVE_DECL_DEVLINK_GENL_NAME 1
> #define HAVE_DECL_DEVLINK 1
> #define HAVE_DECL_DEVLINK_ATTR_MAX 1
> #define HAVE_DECL_DEVLINK 1
> #define HAVE_DECL_DEVLINK_CMD_ESWITCH_GET 0
> #define HAVE_DECL_DEVLINK_ATTR_BUS_NAME 1
> #define HAVE_DECL_DEVLINK 1
> #define HAVE_DECL_DEVLINK_ATTR_DEV_NAME 1
> #define HAVE_DECL_DEVLINK 1
> #define HAVE_DECL_DEVLINK_ATTR_ESWITCH_MODE 1
> #define HAVE_DECL_DEVLINK 1
> #define HAVE_DECL_DEVLINK_ESWITCH_MODE_SWITCHDEV 1
> #define HAVE_DECL_DEVLINK 1
> 
> so this usage of AC_CHECK_DECLS is bogus.
> 
> Jan
> 

So then the question becomes is it "safe enough" to only look for
HAVE_DECL_DEVLINK_CMD_ESWITCH_GET and assume the others are present or
does there have to be some sort of AC_CHECK_DECLS for every one of the
symbols that are going to be used and then some way to only allow that
code to be compiled if all are true?

I'd say the former - just check for CMD_ESWITCH_GET and assume the
others, but you know what can be said about assumptions... And this is
why I didn't want to push this on Saturday!  bad enough that I've been
in and out of the office today ;-)

John

>>> +   [],
>>> +   [[#include ]])
>>> +fi
>>> +
>>>  dnl Allow perl/python overrides
>>>  AC_PATH_PROGS([PYTHON], [python2 python])
>>>  if test -z "$PYTHON"; then
> 
> 
> --
> libvir-list mailing list
> libvir-list@redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/l

Re: [libvirt] [PATCH] check-spacing: Don't hardcode Perl interpreter path

2017-09-18 Thread Ján Tomko

On Mon, Sep 18, 2017 at 02:39:39PM +0200, Andrea Bolognani wrote:

This is particularly useful on operating systems that don't ship
Perl as part of the base system (eg. FreeBSD) while still working
just as well as it did before on Linux.

Signed-off-by: Andrea Bolognani 
---
build-aux/check-spacing.pl | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)



For the usual use case of 'make syntax-check' this should be working
already, since we run it as $(PERL) with the binary path detected
at configure time.

But this change is an improvement. ACK if you extend it to include all
perl scripts.

Jan


diff --git a/build-aux/check-spacing.pl b/build-aux/check-spacing.pl
index 448acf234..ca8b43491 100755
--- a/build-aux/check-spacing.pl
+++ b/build-aux/check-spacing.pl
@@ -1,4 +1,4 @@
-#!/usr/bin/perl
+#!/usr/bin/env perl
#
# check-spacing.pl: Report any usage of 'function (..args..)'
# Also check for other syntax issues, such as correct use of ';'
--
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH v5 1/2] conf, docs: Add qcow2 cache configuration support

2017-09-18 Thread Peter Krempa
On Mon, Sep 18, 2017 at 09:49:36 -0400, John Ferlan wrote:
> 
> 
> On 09/14/2017 01:08 AM, Liu Qing wrote:
> > Random write IOPS will drop dramatically if qcow2 l2 cache could not
> > cover the whole disk. This patch gives libvirt user a chance to adjust
> > the qcow2 cache configuration.
> > 
> > Three new qcow2 driver parameters (l2-cache-size, refcount-cache-size
> > and cache-clean-interval) are added as attributes to a new 
> > subelement for a  of a 
> > element. The QEMU source docs/qcow2-cache.txt provides the guidelines
> > for defining/configuring values for each.
> > 
> > Signed-off-by: Liu Qing 
> > ---
> >  docs/formatdomain.html.in  | 41 +
> >  docs/schemas/domaincommon.rng  | 35 
> >  src/conf/domain_conf.c | 96 
> > --
> >  src/qemu/qemu_driver.c |  5 ++
> >  src/util/virstoragefile.c  |  3 +
> >  src/util/virstoragefile.h  |  6 ++
> >  .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++
> >  .../qemuxml2xmlout-disk-drive-qcow2-cache.xml  | 43 ++
> >  tests/qemuxml2xmltest.c|  1 +
> >  9 files changed, 268 insertions(+), 5 deletions(-)
> >  create mode 100644 
> > tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml
> >  create mode 100644 
> > tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml
> > 
> 
> OK - so I had my head in other code during the v4 review comments from
> Peter. So I'll just "restart" that a bit here. If I can summarize the
> comments... There's a concern that we're creating a very specific set of
> parameters for which there isn't great guidance available other than
> going to a source file in the qemu tree and attempting to decipher it.
> Something I did note at one time as a possible issue and why I also
> asked for the "qcow2" subelement. It makes it painfully obvious that
> there was something very specific. I figure seeing a "qcow2" element
> name would fairly easily point out to anyone that specificity.
> 
> So, based on what I read in the responses it seems that the purpose of
> the values is to provide a mechanism to "allow" qemu to use a couple of
> different algorithms to perform operations, but qemu wants the user to
> decide whether they are willing to sacrifice one thing over another. If
> you want high IOPS, then you have to sacrifice something else; whereas,
> if you don't care about performance, then you sacrifice something else.
> 
> In any case, I think what could be useful is to determine if there is a
> way to "automate" something such that the "dials" would be automatically
> set based on the algorithms. That is - rather than supplying the
> specific values, supply a named attribute that would then perform the
> calculations and set the values, such as "high_performance="yes" or
> "large_capacity=yes". Where you cannot set both, but can set one or the
> other.  Or can you set both? Is there value in partially turning a knob?
>  If so maybe the attribute is "performance={default, ..., best}" and
> "capacity={default, ..., large}". Then based on the value of the
> attribute, code you add to libvirt would perform the calculations.

While I think that having an user friendly way (not having to inspect
the image to set similar parameters across different VMs) would be
great, boolean parameters will basically translate into some kind of
policy decision we are trying to avoid.

I thought that we could set the cache e.g. in percent of the image it
can cover, but that lacks transparency in regards to consumed memory,
but that's similar to the boolean flags.

While I really don't like meaningless knobs that can change in the
future I currently can't think of anything that would be user friendly
and not step into the "policy" region.


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCHv2] nodedev: add switchdev to NIC capabilities

2017-09-18 Thread Ján Tomko

On Thu, Sep 14, 2017 at 11:23:02AM -0400, Laine Stump wrote:

(Almost all of my comments result in "ok, no action needed". There are
just three items I would like to see changed (2 trivial, 1 also small
but Edan or John may think it prudent to re-test with the change before
pushing) - I marked those comments with [**].

On 08/21/2017 05:19 AM, Edan David wrote:

Adding functionality to libvirt that will allow querying the interface
for the availability of switchdev Offloading NIC capabilities.
The switchdev mode was introduced in kernel 4.8, the iproute2-devlink
command to retrieve the swtichdev NIC feature,
Command example:  devlink dev eswitch show pci/:03:00.0
This feature is needed for Openstack so we can do a scheduling decision
if the NIC is in Hardware Offload (switchdev) or regular SR-IOV (legacy) mode.
And select the appropriate hypervisors with the requested capability see [1].

[1] - 
https://specs.openstack.org/openstack/nova-specs/specs/pike/approved/enable-sriov-nic-features.html
---
 configure.ac  |  13 ++
 docs/formatnode.html.in   |   1 +
 src/util/virnetdev.c  | 187 +-
 src/util/virnetdev.h  |   1 +
 tests/nodedevschemadata/net_00_13_02_b9_f9_d3.xml |   1 +
 tests/nodedevschemadata/net_00_15_58_2f_e9_55.xml |   1 +
 6 files changed, 203 insertions(+), 1 deletion(-)



This patch fails to compile on a Gentoo machine with
sys-kernel/linux-headers-4.8:

util/virnetdev.c:3248:18: error: use of undeclared identifier 
'DEVLINK_CMD_ESWITCH_GET'; did you mean 'DEVLINK_CMD_ESWITCH_MODE_GET'?
   gmsgh->cmd = DEVLINK_CMD_ESWITCH_GET;
^~~
DEVLINK_CMD_ESWITCH_MODE_GET
/usr/include/linux/devlink.h:60:2: note: 'DEVLINK_CMD_ESWITCH_MODE_GET' 
declared here
   DEVLINK_CMD_ESWITCH_MODE_GET,
   ^
1 error generated.


diff --git a/configure.ac b/configure.ac
index b12b7fa..c089798 100644
--- a/configure.ac
+++ b/configure.ac
@@ -627,6 +627,19 @@ if test "$with_linux" = "yes"; then
 AC_CHECK_HEADERS([linux/btrfs.h])
 fi

+dnl
+dnl check for kernel headers required by devlink
+dnl
+if test "$with_linux" = "yes"; then
+AC_CHECK_HEADERS([linux/devlink.h])
+AC_CHECK_DECLS([DEVLINK_GENL_VERSION, DEVLINK_GENL_NAME, DEVLINK_ATTR_MAX, 
DEVLINK_CMD_ESWITCH_GET, DEVLINK_ATTR_BUS_NAME, DEVLINK_ATTR_DEV_NAME, 
DEVLINK_ATTR_ESWITCH_MODE, DEVLINK_ESWITCH_MODE_SWITCHDEV],


That's very . thorough, and potentially misleading if someone ever
wanted to use devlink to check for something other than switchdev (e.g.
[mythical feature X]) on some system that didn't have the proper stuff
defined for switchdev, but did have it for [other mythical feature X].
But that's all just hypothetical, so this is fine with me.



+   [AC_DEFINE([HAVE_DECL_DEVLINK],
+  [1],
+  [whether devlink declarations is available])],


[**]
s/is/are/



It seems the [action-if-found] is executed for every found symbol:

#define HAVE_DECL_DEVLINK_GENL_VERSION 1
#define HAVE_DECL_DEVLINK 1
#define HAVE_DECL_DEVLINK_GENL_NAME 1
#define HAVE_DECL_DEVLINK 1
#define HAVE_DECL_DEVLINK_ATTR_MAX 1
#define HAVE_DECL_DEVLINK 1
#define HAVE_DECL_DEVLINK_CMD_ESWITCH_GET 0
#define HAVE_DECL_DEVLINK_ATTR_BUS_NAME 1
#define HAVE_DECL_DEVLINK 1
#define HAVE_DECL_DEVLINK_ATTR_DEV_NAME 1
#define HAVE_DECL_DEVLINK 1
#define HAVE_DECL_DEVLINK_ATTR_ESWITCH_MODE 1
#define HAVE_DECL_DEVLINK 1
#define HAVE_DECL_DEVLINK_ESWITCH_MODE_SWITCHDEV 1
#define HAVE_DECL_DEVLINK 1

so this usage of AC_CHECK_DECLS is bogus.

Jan


+   [],
+   [[#include ]])
+fi
+
 dnl Allow perl/python overrides
 AC_PATH_PROGS([PYTHON], [python2 python])
 if test -z "$PYTHON"; then


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] qemu: Mark graphics ports used on reconnect

2017-09-18 Thread Michal Privoznik
On 09/18/2017 04:10 PM, Cole Robinson wrote:
> On 09/18/2017 09:47 AM, Michal Privoznik wrote:
>> This is an issue that's bugging me for a long time. I don't know
>> exactly when and who is to blame but on daemon reconnect we lose
>> qemu's port allocator internal state.
> 
> It's the kernel's fault, broken in 4.11+, patches recently posted
> upstream but not in any release yet:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1432684   
> 

Cool thanks. I'll keep an eye on it. Nevertheless, we should merge this
patch as from control POV it makes no sense to try to bind to ports we
know are taken.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 0/2] qemu: blockPeek: Fix invalid buffer usage and

2017-09-18 Thread Peter Krempa
See 1/2 for the bugfix.

Peter Krempa (2):
  qemu: blockPeek: Fix filling of the return buffer
  qemu: blockPeek: Enforce buffer filling

 src/qemu/qemu_driver.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v5 1/2] conf, docs: Add qcow2 cache configuration support

2017-09-18 Thread John Ferlan


On 09/14/2017 01:08 AM, Liu Qing wrote:
> Random write IOPS will drop dramatically if qcow2 l2 cache could not
> cover the whole disk. This patch gives libvirt user a chance to adjust
> the qcow2 cache configuration.
> 
> Three new qcow2 driver parameters (l2-cache-size, refcount-cache-size
> and cache-clean-interval) are added as attributes to a new 
> subelement for a  of a 
> element. The QEMU source docs/qcow2-cache.txt provides the guidelines
> for defining/configuring values for each.
> 
> Signed-off-by: Liu Qing 
> ---
>  docs/formatdomain.html.in  | 41 +
>  docs/schemas/domaincommon.rng  | 35 
>  src/conf/domain_conf.c | 96 
> --
>  src/qemu/qemu_driver.c |  5 ++
>  src/util/virstoragefile.c  |  3 +
>  src/util/virstoragefile.h  |  6 ++
>  .../qemuxml2argv-disk-drive-qcow2-cache.xml| 43 ++
>  .../qemuxml2xmlout-disk-drive-qcow2-cache.xml  | 43 ++
>  tests/qemuxml2xmltest.c|  1 +
>  9 files changed, 268 insertions(+), 5 deletions(-)
>  create mode 100644 
> tests/qemuxml2argvdata/qemuxml2argv-disk-drive-qcow2-cache.xml
>  create mode 100644 
> tests/qemuxml2xmloutdata/qemuxml2xmlout-disk-drive-qcow2-cache.xml
> 

OK - so I had my head in other code during the v4 review comments from
Peter. So I'll just "restart" that a bit here. If I can summarize the
comments... There's a concern that we're creating a very specific set of
parameters for which there isn't great guidance available other than
going to a source file in the qemu tree and attempting to decipher it.
Something I did note at one time as a possible issue and why I also
asked for the "qcow2" subelement. It makes it painfully obvious that
there was something very specific. I figure seeing a "qcow2" element
name would fairly easily point out to anyone that specificity.

So, based on what I read in the responses it seems that the purpose of
the values is to provide a mechanism to "allow" qemu to use a couple of
different algorithms to perform operations, but qemu wants the user to
decide whether they are willing to sacrifice one thing over another. If
you want high IOPS, then you have to sacrifice something else; whereas,
if you don't care about performance, then you sacrifice something else.

In any case, I think what could be useful is to determine if there is a
way to "automate" something such that the "dials" would be automatically
set based on the algorithms. That is - rather than supplying the
specific values, supply a named attribute that would then perform the
calculations and set the values, such as "high_performance="yes" or
"large_capacity=yes". Where you cannot set both, but can set one or the
other.  Or can you set both? Is there value in partially turning a knob?
 If so maybe the attribute is "performance={default, ..., best}" and
"capacity={default, ..., large}". Then based on the value of the
attribute, code you add to libvirt would perform the calculations.

Or do you really think that's something some supplier (such as what
you're trying to do) would want to have - the ability to control the
algorithm for the various knobs.

Hopefully this makes sense...

One minor nit below which is easily fixable...


> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index 8ca7637..4574e3a 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -3036,6 +3036,47 @@
>set. (Since 3.5.0)
>
>  
> +The driver element may contain a qcow2 sub
> +element, which to specifying further details related to a qcow2 disk.
> +For recommended setting guidelines refer to the QEMU source file
> +docs/qcow2-cache.txt.
> +Since 3.8.0
> +
> +  
> +The optional l2_cache_size attribute controls how 
> much
> +memory will be consumed by qcow2 l2 table cache in bytes. This
> +option is only valid when the driver type is qcow2. The default
> +size is 2097152.
> +Since 3.8.0
> +
> +In general you should leave this option alone, unless you
> +are very certain you know what you are doing.
> +  
> +  
> +The optional refcount_cache_size attribute controls
> +how much memory will be consumed by qcow2 reference count table
> +cache in bytes. This option is only valid when the driver type is
> +qcow2. The default size is 262144.
> +Since 3.8.0
> +
> +In general you should leave this option alone, unless you
> +are very certain you know what you are doing.
> +  
> +  
> +The optional cache_clean_interval attribute defines
> +an interval (in seconds). All cache entries that haven't been
> + 

Re: [libvirt] [PATCH v2 0/3] Be more selective when determining cdrom for taint messaging

2017-09-18 Thread John Ferlan


On 09/18/2017 09:12 AM, Michal Privoznik wrote:
> On 09/11/2017 04:32 PM, John Ferlan wrote:
>> v1: https://www.redhat.com/archives/libvir-list/2017-September/msg00103.html
>>
>> Changes since v1:
>>
>>  Split into 3 parts... The first patch would be the bare minimum using
>>  STRPREFIX instead of STREQ type comparisons for the incoming path to
>>  be "/dev/cdrom[N]" or "/dev/srN" (or resolved to that).
>>
>>  This would "work" for the most part, but then since it's possible to 
>>  make even more checks let's check against the collected node device
>>  data. Patch 2 therefore will "tag" the already collected cdrom data
>>  with a capability. This allows patch3 to find any/all CDROM's on the
>>  host and compare the resolved path to that list of devices returning
>>  "true" if something matches a node device declared physical CDROM.
>>
>> I split things up mainly to make it easier to decide whether patch 1
>> is sufficient or not. If patch2 and patch3 are OK, I would also add
>> a release note indicating the improvement to find CDROM by node device
>> capability.  It's a separate "improvement" on it's own as well. Whether
>> it's truly useful or not, is a different question...
> 
> [1]
> 
>>
>> John Ferlan (3):
>>   qemu: Be more selective when determining cdrom for taint messaging
> 
> ACK to this one ^^
> 
>>   nodedev: Add capability bit to detect 'cdrom' devices
>>   qemu: Add inquiry to nodedev for cdrom taint checking
> 
> However, these two ^^ look like an overkill to me. It's still just a
> taint message that nobody cares about. Or?
> 1: Yeah, I don't think we really need such a big hammer for tiny nail.
> But I might be missing something.
> 
> Michal
> 

I agree with you, but just in case someone wanted to use that sledge
hammer in order to catch some really obscure corner condition, I figured
I'd show it was possible...

Still I can give it a few more days to see if someone indicates they
would also like to see usage of the sledge hammer.

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 1/2] qemu: blockPeek: Fix filling of the return buffer

2017-09-18 Thread Peter Krempa
Commit 3956af495e broke the blockPeek API since virStorageFileRead
allocates a return buffer and fills it with the data, while the API
fills a user-provided buffer. This did not get caught by the compiler
since the API prototype uses a 'void *'.

Fix it by transferring the data from the allocated buffer to the user
provided buffer.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1491217
---
 src/qemu/qemu_driver.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index e1a0dd553..93a1c6061 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11415,6 +11415,7 @@ qemuDomainBlockPeek(virDomainPtr dom,
 virQEMUDriverPtr driver = dom->conn->privateData;
 virDomainDiskDefPtr disk = NULL;
 virDomainObjPtr vm;
+char *tmpbuf = NULL;
 int ret = -1;

 virCheckFlags(0, -1);
@@ -11444,12 +11445,15 @@ qemuDomainBlockPeek(virDomainPtr dom,
 if (virStorageFileRead(disk->src, offset, size, buffer) < 0)
 goto cleanup;

+memcpy(buffer, tmpbuf, size);
+
 ret = 0;

  cleanup:
 if (disk)
 virStorageFileDeinit(disk->src);
 virDomainObjEndAPI(&vm);
+VIR_FREE(tmpbuf);
 return ret;
 }

-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 2/2] qemu: blockPeek: Enforce buffer filling

2017-09-18 Thread Peter Krempa
Documentation states:

"'offset' and 'size' represent an area which must lie entirely within
the device or file." Enforce the that the buffer lies within fully.
---
 src/qemu/qemu_driver.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 93a1c6061..bddba6b71 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -11416,6 +11416,7 @@ qemuDomainBlockPeek(virDomainPtr dom,
 virDomainDiskDefPtr disk = NULL;
 virDomainObjPtr vm;
 char *tmpbuf = NULL;
+ssize_t nread;
 int ret = -1;

 virCheckFlags(0, -1);
@@ -11442,9 +11443,16 @@ qemuDomainBlockPeek(virDomainPtr dom,
 if (qemuDomainStorageFileInit(driver, vm, disk->src) < 0)
 goto cleanup;

-if (virStorageFileRead(disk->src, offset, size, buffer) < 0)
+if ((nread = virStorageFileRead(disk->src, offset, size, &tmpbuf)) < 0)
 goto cleanup;

+if (nread < size) {
+virReportError(VIR_ERR_INVALID_ARG,
+   _("'%s' starting from %llu has only %zd bytes 
available"),
+   path, offset, nread);
+goto cleanup;
+}
+
 memcpy(buffer, tmpbuf, size);

 ret = 0;
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] qemu: Mark graphics ports used on reconnect

2017-09-18 Thread Cole Robinson
On 09/18/2017 09:47 AM, Michal Privoznik wrote:
> This is an issue that's bugging me for a long time. I don't know
> exactly when and who is to blame but on daemon reconnect we lose
> qemu's port allocator internal state.

It's the kernel's fault, broken in 4.11+, patches recently posted
upstream but not in any release yet:

https://bugzilla.redhat.com/show_bug.cgi?id=1432684 

- Cole

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-18 Thread Erik Skultety
On Mon, Sep 18, 2017 at 07:41:09AM -0500, Richard Relph wrote:
>
>
> On 9/18/17 4:47 AM, Daniel P. Berrange wrote:
> > On Mon, Sep 18, 2017 at 11:43:57AM +0200, Erik Skultety wrote:
> > > [...]
> > >
> > > > >  > c) what existing communicate interface can be used between 
> > > > > libvirt and qemu
> > > > >  > to get the measurement ? can we add a new qemu monitor command
> > > > >  > 'get_sev_measurement' to get the measurement ? (step 10)
> > > > >
> > > > >  Yes, QMP commands seeem most likely.
> > > > >
> > > > >  > d) how to pass the secret blob from libvirt to qemu ? should 
> > > > > we consider
> > > > >  > adding a new object (sev-guest-secret) -- libvirt can add the 
> > > > > object through
> > > > >  > qemu monitor.
> > > > >
> > > > >  Yeah, that looks like a viable option too.
> > > > So I could see a flow like the following:
> > > >
> > > >
> > > >1. mgmt tool calls  virConnectGetCapabilities. This returns an XML
> > > >   document that includes the following
> > > >
> > > >
> > > >   ...other bits...
> > > >  
> > > >   ...hex encoded PDH key...
> > > > 
> > > >
> > > >
> > > >2. mgmt tool requests to start a guest calling virCreateXML(),
> > > >   passing VIR_DOMAIN_START_PAUSED. The XML would include
> > > >
> > > >
> > > >  ...hex encode DH key...
> > > > ..hex encode info...
> > > > ...int32 value..
> > > >
> > > >
> > > >
> > > >   if  is provided and VIR_DOMAIN_START_PAUSED is missing,
> > > >   libvirt would report an error and refuse to start the guest
> For ease of use, I would not add this conditional to libvirt. If  is
> provided and VIR_DOMAIN_START_PAUSED is missing, I’d just send the "GO"

I also feel that the presence of the  element might determine the usage of
the VIR_DOMAIN_START_PAUSED flag implicitly.

> command as it would naturally occur.
> Unless that would confuse things inside libvirt or QEMU in relation to the
> measurement and secret…
> Many of our existing tests focus on other aspects of SEV functionality and
> so they skip the MEASURE/SECRET phase of launch and just go immediately from
> LAUNCH_UPDATE_DATA (or VMSA) to LAUNCH_FINISH.
> I guess the key question will be how will QEMU know when to get the
> MEASUREMENT and wait for a LAUNCH_SECRET before doing a LAUNCH_FINISH when
> connected to libvirt.
> Brijesh, this is your area. It feels to me like QEMU will have to wait to do
> the LAUNCH_FINISH until it gets the first “go” from libvirt. If that’s
> right, and assuming the same “go” comes from libvirt with or without
> VIR_DOMAIN_START_PAUSED, then I’d simply exclude the conditional check. QEMU
> would get the measurement when it is done sending the data.
> Though in “real world” uses, I think the conditional is perfectly OK.
> > > >
> > > >3. Libvirt generates the QEMU cli arg to enable SEV using
> > > >   the XML data and starts QEMU, leaving CPUs paused
> > > >
> > > >4. QEMU emits a SEV_MEASURE event containing the measurement
> > > >   blob
> > > Speaking of which, I expect QEMU to have a QMP command to retrieve the
> > > measurement, in which case I think libvirt has to provide an API for the 
> > > user
> > > to retrieve the measurement in case libvirtd crashes somewhere between 
> > > setting
> > > up QEMU and waiting for the measurement event from QEMU, or simply 
> > > because the
> > > GO missed the event for some unspecified reason.
> > Yeah, that's a good point - we also ought to have a pause-reason that
> > reflects that it is paused due to waiting for SEV secrets.
> >
> > > >5. Libvirt catches the QEMU event and emits its own
> > > >   VIR_CONNECT_DOMAIN_EVENT_SEV_MEASURE event containing
> > > >   the measurement blob
> > > >
> > > >6. GO does its validation of the measurement
> > > >
> > > >7a  If validation failed, then virDomainDestroy() to stop QEMU
> > > >
> > > >7b  If validation succeeed
> > > >
> > > >   Optionally call
> > > >
> > > >   virDomainSetSEVSecret()
> > > Given the fact that we're likely introducing a new  element to the 
> > > XML
> > > config, I'm more inclined to utilizing the existing virSecret interfaces 
> > > (as
> > > was originally suggested) instead of creating a vendor-specific API. You 
> > > could
> > > have an optional secret sub-element within the  element and libvirt 
> > > would
> > > simply check if that secret has a value set, once the GO issues
> > > virDomainResume(). Any particular reason for having a specific API for 
> > > this that
> > > I'm missing?
> > Initially I was intending to suggest extensive use of virSecret, but it
> > turns out that despite being called a "secret", none of the SEV data we are
> > passing around needs protection. Either it is safe to be public, or it is
> > already encrypted.  So essentially we just have some data blobs we need to
> > pass into QEMU. I didn't feel we ought to be 

[libvirt] [PATCH 3/6] cpu_conf: Simplify formatting of guest CPU attributes

2017-09-18 Thread Jiri Denemark
Signed-off-by: Jiri Denemark 
---
 src/conf/cpu_conf.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c
index 6058d26fa5..02506c020b 100644
--- a/src/conf/cpu_conf.c
+++ b/src/conf/cpu_conf.c
@@ -604,20 +604,20 @@ virCPUDefFormatBufFull(virBufferPtr buf,
 if (!def)
 return 0;
 
-/* Format attributes */
-if (def->type == VIR_CPU_TYPE_GUEST) {
+/* Format attributes for guest CPUs unless they only specify
+ * topology or cache. */
+if (def->type == VIR_CPU_TYPE_GUEST &&
+(def->mode != VIR_CPU_MODE_CUSTOM || def->model)) {
 const char *tmp;
 
-if (def->mode != VIR_CPU_MODE_CUSTOM || def->model) {
-if (!(tmp = virCPUModeTypeToString(def->mode))) {
-virReportError(VIR_ERR_INTERNAL_ERROR,
-   _("Unexpected CPU mode %d"), def->mode);
-goto cleanup;
-}
-virBufferAsprintf(&attributeBuf, " mode='%s'", tmp);
+if (!(tmp = virCPUModeTypeToString(def->mode))) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("Unexpected CPU mode %d"), def->mode);
+goto cleanup;
 }
+virBufferAsprintf(&attributeBuf, " mode='%s'", tmp);
 
-if (def->model && def->mode == VIR_CPU_MODE_CUSTOM) {
+if (def->mode == VIR_CPU_MODE_CUSTOM) {
 if (!(tmp = virCPUMatchTypeToString(def->match))) {
 virReportError(VIR_ERR_INTERNAL_ERROR,
_("Unexpected CPU match policy %d"),
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 4/6] conf: Drop unused VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU

2017-09-18 Thread Jiri Denemark
The only real usage of this flag was removed by "cpu_conf: Drop
updateCPU from virCPUDefFormat".

Signed-off-by: Jiri Denemark 
---
 src/conf/domain_conf.c   | 3 ---
 src/conf/domain_conf.h   | 1 -
 src/conf/snapshot_conf.c | 3 +--
 3 files changed, 1 insertion(+), 6 deletions(-)

diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 35adce95b5..984db98107 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -87,7 +87,6 @@ struct _virDomainXMLOption {
 #define VIR_DOMAIN_DEF_FORMAT_COMMON_FLAGS \
 (VIR_DOMAIN_DEF_FORMAT_SECURE |\
  VIR_DOMAIN_DEF_FORMAT_INACTIVE |  \
- VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU |\
  VIR_DOMAIN_DEF_FORMAT_MIGRATABLE)
 
 VIR_ENUM_IMPL(virDomainTaint, VIR_DOMAIN_TAINT_LAST,
@@ -25974,8 +25973,6 @@ unsigned int virDomainDefFormatConvertXMLFlags(unsigned 
int flags)
 formatFlags |= VIR_DOMAIN_DEF_FORMAT_SECURE;
 if (flags & VIR_DOMAIN_XML_INACTIVE)
 formatFlags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE;
-if (flags & VIR_DOMAIN_XML_UPDATE_CPU)
-formatFlags |= VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU;
 if (flags & VIR_DOMAIN_XML_MIGRATABLE)
 formatFlags |= VIR_DOMAIN_DEF_FORMAT_MIGRATABLE;
 
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index bb3b6f0c3c..4603e4f97e 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -2857,7 +2857,6 @@ typedef enum {
 typedef enum {
 VIR_DOMAIN_DEF_FORMAT_SECURE  = 1 << 0,
 VIR_DOMAIN_DEF_FORMAT_INACTIVE= 1 << 1,
-VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU  = 1 << 2,
 VIR_DOMAIN_DEF_FORMAT_MIGRATABLE  = 1 << 3,
 /* format internal domain status information */
 VIR_DOMAIN_DEF_FORMAT_STATUS  = 1 << 4,
diff --git a/src/conf/snapshot_conf.c b/src/conf/snapshot_conf.c
index 5a07d3734a..07706e0b26 100644
--- a/src/conf/snapshot_conf.c
+++ b/src/conf/snapshot_conf.c
@@ -703,8 +703,7 @@ virDomainSnapshotDefFormat(const char *domain_uuid,
 virBuffer buf = VIR_BUFFER_INITIALIZER;
 size_t i;
 
-virCheckFlags(VIR_DOMAIN_DEF_FORMAT_SECURE |
-  VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU, NULL);
+virCheckFlags(VIR_DOMAIN_DEF_FORMAT_SECURE, NULL);
 
 flags |= VIR_DOMAIN_DEF_FORMAT_INACTIVE;
 
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 0/6] Fix guest CPU updates for inactive domains

2017-09-18 Thread Jiri Denemark
This series originally started as a cleanup work, but it turned out two
real bugs got fixed with this cleanup :-)

Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU
in domain's live definition and there's no need to update it every time
we want to format the definition. Thus libvirt should never internally
request a guest CPU update when formatting domain XML. But if it is
asked by a user to do so, it should use the same host CPU model which is
used when starting a domain to provide meaningful results.

https://bugzilla.redhat.com/show_bug.cgi?id=1481309
https://bugzilla.redhat.com/show_bug.cgi?id=1485022

Jiri Denemark (6):
  qemuxml2xmltest: Add tests for Power CPUs
  cpu_conf: Drop updateCPU from virCPUDefFormat
  cpu_conf: Simplify formatting of guest CPU attributes
  conf: Drop unused VIR_DOMAIN_DEF_FORMAT_UPDATE_CPU
  qemu: Use correct host model for updating guest cpu
  qemu: Don't update CPU when formatting live def

 src/bhyve/bhyve_driver.c   |  2 +-
 src/conf/capabilities.c|  2 +-
 src/conf/cpu_conf.c| 38 +
 src/conf/cpu_conf.h|  9 ++---
 src/conf/domain_capabilities.c |  2 +-
 src/conf/domain_conf.c |  6 +---
 src/conf/domain_conf.h |  1 -
 src/conf/snapshot_conf.c   |  3 +-
 src/libxl/libxl_driver.c   |  2 +-
 src/qemu/qemu_domain.c | 17 +++---
 src/qemu/qemu_domain.h |  3 +-
 src/qemu/qemu_driver.c |  9 -
 src/qemu/qemu_migration_cookie.c   |  2 +-
 src/test/test_driver.c |  2 +-
 src/vz/vz_driver.c |  2 +-
 tests/cputest.c| 15 -
 .../ppc64-host+guest-compat-incompatible.xml   |  2 +-
 .../ppc64-host+guest-compat-invalid.xml|  2 +-
 .../cputestdata/ppc64-host+guest-compat-valid.xml  |  2 +-
 .../qemuxml2xmlout-pseries-cpu-compat-power9.xml   | 38 +
 .../qemuxml2xmlout-pseries-cpu-compat.xml  | 38 +
 .../qemuxml2xmlout-pseries-cpu-exact.xml   | 39 ++
 tests/qemuxml2xmltest.c|  4 +++
 23 files changed, 178 insertions(+), 62 deletions(-)
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml

-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 2/6] cpu_conf: Drop updateCPU from virCPUDefFormat

2017-09-18 Thread Jiri Denemark
In the past we updated host-model CPUs with host CPU data by adding a
model and features, but keeping the host-model mode. And since the CPU
model is not normally formatted for host-model CPU defs, we had to pass
the updateCPU flag to the formatting code to be able to properly output
updated host-model CPUs. Libvirt doesn't do this anymore, host-model
CPUs are turned into custom mode CPUs once updated with host CPU data
and thus there's no reason for keeping the hacks inside CPU XML
formatters.

Signed-off-by: Jiri Denemark 
---
 src/bhyve/bhyve_driver.c |  2 +-
 src/conf/capabilities.c  |  2 +-
 src/conf/cpu_conf.c  | 20 +++-
 src/conf/cpu_conf.h  |  9 +++--
 src/conf/domain_capabilities.c   |  2 +-
 src/conf/domain_conf.c   |  3 +--
 src/libxl/libxl_driver.c |  2 +-
 src/qemu/qemu_domain.c   |  4 ++--
 src/qemu/qemu_driver.c   |  2 +-
 src/qemu/qemu_migration_cookie.c |  2 +-
 src/test/test_driver.c   |  2 +-
 src/vz/vz_driver.c   |  2 +-
 tests/cputest.c  | 15 +++
 .../ppc64-host+guest-compat-incompatible.xml |  2 +-
 .../cputestdata/ppc64-host+guest-compat-invalid.xml  |  2 +-
 tests/cputestdata/ppc64-host+guest-compat-valid.xml  |  2 +-
 16 files changed, 31 insertions(+), 42 deletions(-)

diff --git a/src/bhyve/bhyve_driver.c b/src/bhyve/bhyve_driver.c
index e8241f39ff..c096b5562f 100644
--- a/src/bhyve/bhyve_driver.c
+++ b/src/bhyve/bhyve_driver.c
@@ -1441,7 +1441,7 @@ bhyveConnectBaselineCPU(virConnectPtr conn,
 virCPUExpandFeatures(cpus[0]->arch, cpu) < 0)
 goto cleanup;
 
-cpustr = virCPUDefFormat(cpu, NULL, false);
+cpustr = virCPUDefFormat(cpu, NULL);
 
  cleanup:
 virCPUDefListFree(cpus);
diff --git a/src/conf/capabilities.c b/src/conf/capabilities.c
index ba554535ae..9920a675ac 100644
--- a/src/conf/capabilities.c
+++ b/src/conf/capabilities.c
@@ -984,7 +984,7 @@ virCapabilitiesFormatXML(virCapsPtr caps)
 virBufferAdjustIndent(&buf, -2);
 virBufferAddLit(&buf, "\n");
 }
-virCPUDefFormatBuf(&buf, caps->host.cpu, false);
+virCPUDefFormatBuf(&buf, caps->host.cpu);
 
 for (i = 0; i < caps->host.nPagesSize; i++) {
 virBufferAsprintf(&buf, "\n",
diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c
index 0bd56c9d28..6058d26fa5 100644
--- a/src/conf/cpu_conf.c
+++ b/src/conf/cpu_conf.c
@@ -574,12 +574,11 @@ virCPUDefParseXML(xmlXPathContextPtr ctxt,
 
 char *
 virCPUDefFormat(virCPUDefPtr def,
-virDomainNumaPtr numa,
-bool updateCPU)
+virDomainNumaPtr numa)
 {
 virBuffer buf = VIR_BUFFER_INITIALIZER;
 
-if (virCPUDefFormatBufFull(&buf, def, numa, updateCPU) < 0)
+if (virCPUDefFormatBufFull(&buf, def, numa) < 0)
 goto cleanup;
 
 if (virBufferCheckError(&buf) < 0)
@@ -596,8 +595,7 @@ virCPUDefFormat(virCPUDefPtr def,
 int
 virCPUDefFormatBufFull(virBufferPtr buf,
virCPUDefPtr def,
-   virDomainNumaPtr numa,
-   bool updateCPU)
+   virDomainNumaPtr numa)
 {
 int ret = -1;
 virBuffer attributeBuf = VIR_BUFFER_INITIALIZER;
@@ -619,9 +617,7 @@ virCPUDefFormatBufFull(virBufferPtr buf,
 virBufferAsprintf(&attributeBuf, " mode='%s'", tmp);
 }
 
-if (def->model &&
-(def->mode == VIR_CPU_MODE_CUSTOM ||
- updateCPU)) {
+if (def->model && def->mode == VIR_CPU_MODE_CUSTOM) {
 if (!(tmp = virCPUMatchTypeToString(def->match))) {
 virReportError(VIR_ERR_INTERNAL_ERROR,
_("Unexpected CPU match policy %d"),
@@ -642,7 +638,7 @@ virCPUDefFormatBufFull(virBufferPtr buf,
 if (def->type == VIR_CPU_TYPE_HOST && def->arch)
 virBufferAsprintf(&childrenBuf, "%s\n",
   virArchToString(def->arch));
-if (virCPUDefFormatBuf(&childrenBuf, def, updateCPU) < 0)
+if (virCPUDefFormatBuf(&childrenBuf, def) < 0)
 goto cleanup;
 
 if (virDomainNumaDefCPUFormat(&childrenBuf, numa) < 0)
@@ -677,8 +673,7 @@ virCPUDefFormatBufFull(virBufferPtr buf,
 
 int
 virCPUDefFormatBuf(virBufferPtr buf,
-   virCPUDefPtr def,
-   bool updateCPU)
+   virCPUDefPtr def)
 {
 size_t i;
 bool formatModel;
@@ -688,8 +683,7 @@ virCPUDefFormatBuf(virBufferPtr buf,
 return 0;
 
 formatModel = (def->mode == VIR_CPU_MODE_CUSTOM ||
-   def->mode == VIR_CPU_MODE_HOST_MODEL ||
-   updateCPU);
+   def->mode == VIR_CPU_MODE_HOST_MODEL);
 formatFallbac

Re: [libvirt] [PATCH v3 0/6] Work around the kernel mdev uevent race in nodedev

2017-09-18 Thread Erik Skultety
On Mon, Sep 18, 2017 at 08:47:52AM -0400, John Ferlan wrote:
>
>
> On 09/18/2017 01:53 AM, Erik Skultety wrote:
> > On Thu, Aug 24, 2017 at 01:23:26PM +0200, Erik Skultety wrote:
> >> v2 here: 
> >> https://www.redhat.com/archives/libvir-list/2017-July/msg01268.html
> >>
> >> Since v2:
> >> - added patch 4/6 that fixes the issue with the handler thread spamming 
> >> logs
> >> with "udev_monitor_receive_device returned NULL"
> >>   -> the event loop callback now disables polling on the udev monitor's fd
> >>  every time there's a new event, leaving the responsibility for 
> >> re-enabling
> >>  it back to the handler thread once it had removed the corresponding 
> >> data
> >>  from the socket.
> >>
> >> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285
> >
> > ping
> >
>
> I think I had assumed there was going to be a v4...  Was I incorrect and
> you would prefer to keep this series as is presented here?

Oh, I managed to somehow forget about the comments in patches 1-2 as I fixed
them locally and then solely focused on our technical discussion in patches
3-6. No problem, I'll resend a v4 adjusting the first two patches, since there
weren't any semantic-related points, rather than points regarding the whole
idea in general.

Erik

>
> John
>
> > Erik
> >
> > --
> > libvir-list mailing list
> > libvir-list@redhat.com
> > https://www.redhat.com/mailman/listinfo/libvir-list
> >

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 5/6] qemu: Use correct host model for updating guest cpu

2017-09-18 Thread Jiri Denemark
when a user requested a domain xml description with
vir_domain_xml_update_cpu flag, libvirt would use the host cpu
definition from host capabilities rather than the one which will
actually be used once the domain is started.

https://bugzilla.redhat.com/show_bug.cgi?id=1481309

Signed-off-by: Jiri Denemark 
---
 src/qemu/qemu_domain.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index aba4111ba2..74284f40c1 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4553,6 +4553,7 @@ qemuDomainDefFormatBufInternal(virQEMUDriverPtr driver,
 int ret = -1;
 virDomainDefPtr copy = NULL;
 virCapsPtr caps = NULL;
+virQEMUCapsPtr qemuCaps = NULL;
 
 if (!(caps = virQEMUDriverGetCapabilities(driver, false)))
 goto cleanup;
@@ -4571,7 +4572,14 @@ qemuDomainDefFormatBufInternal(virQEMUDriverPtr driver,
 def->cpu &&
 (def->cpu->mode != VIR_CPU_MODE_CUSTOM ||
  def->cpu->model)) {
-if (virCPUUpdate(def->os.arch, def->cpu, caps->host.cpu) < 0)
+if (!(qemuCaps = virQEMUCapsCacheLookupCopy(driver->qemuCapsCache,
+def->emulator,
+def->os.machine)))
+goto cleanup;
+
+if (virCPUUpdate(def->os.arch, def->cpu,
+ virQEMUCapsGetHostModel(qemuCaps, def->virtType,
+ 
VIR_QEMU_CAPS_HOST_CPU_MIGRATABLE)) < 0)
 goto cleanup;
 }
 
@@ -4689,6 +4697,7 @@ qemuDomainDefFormatBufInternal(virQEMUDriverPtr driver,
  cleanup:
 virDomainDefFree(copy);
 virObjectUnref(caps);
+virObjectUnref(qemuCaps);
 return ret;
 }
 
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] qemu: Mark graphics ports used on reconnect

2017-09-18 Thread Michal Privoznik
This is an issue that's bugging me for a long time. I don't know
exactly when and who is to blame but on daemon reconnect we lose
qemu's port allocator internal state. That's okay as we should be
able to rebuild it later. However, now I'm seeing port allocator
biding successfully to ports that are already taken by qemu
(either VNC or Spice). Thus any attempt to start another domain
after daemon is restarted fails because libvirt instructs qemu to
take port 5900 which is already taken.

Now, I don't want to mask the real problem, but one can advocate
that we should be marking graphics ports as already in use on
qemuProcessReconnect anyway, because we already know that they
are taken.

Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_process.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index d3155e4e7..053aba1a6 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -4035,7 +4035,8 @@ qemuProcessStartHook(virQEMUDriverPtr driver,
 
 static int
 qemuProcessGraphicsReservePorts(virQEMUDriverPtr driver,
-virDomainGraphicsDefPtr graphics)
+virDomainGraphicsDefPtr graphics,
+bool reconnect)
 {
 virDomainGraphicsListenDefPtr glisten;
 
@@ -4050,7 +4051,8 @@ qemuProcessGraphicsReservePorts(virQEMUDriverPtr driver,
 
 switch (graphics->type) {
 case VIR_DOMAIN_GRAPHICS_TYPE_VNC:
-if (!graphics->data.vnc.autoport) {
+if (!graphics->data.vnc.autoport ||
+reconnect) {
 if (virPortAllocatorSetUsed(driver->remotePorts,
 graphics->data.vnc.port,
 true) < 0)
@@ -4065,7 +4067,7 @@ qemuProcessGraphicsReservePorts(virQEMUDriverPtr driver,
 break;
 
 case VIR_DOMAIN_GRAPHICS_TYPE_SPICE:
-if (graphics->data.spice.autoport)
+if (graphics->data.spice.autoport && !reconnect)
 return 0;
 
 if (graphics->data.spice.port > 0) {
@@ -4269,7 +4271,7 @@ qemuProcessSetupGraphics(virQEMUDriverPtr driver,
 for (i = 0; i < vm->def->ngraphics; i++) {
 graphics = vm->def->graphics[i];
 
-if (qemuProcessGraphicsReservePorts(driver, graphics) < 0)
+if (qemuProcessGraphicsReservePorts(driver, graphics, false) < 0)
 goto cleanup;
 }
 }
@@ -6881,6 +6883,13 @@ qemuProcessReconnect(void *opaque)
 goto error;
 }
 
+for (i = 0; i < obj->def->ngraphics; i++) {
+if (qemuProcessGraphicsReservePorts(driver,
+obj->def->graphics[i],
+true) < 0)
+goto error;
+}
+
 if (qemuProcessUpdateState(driver, obj) < 0)
 goto error;
 
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 6/6] qemu: Don't update CPU when formatting live def

2017-09-18 Thread Jiri Denemark
Since commit v2.2.0-199-g7ce711a30e libvirt stores an updated guest CPU
in domain's live definition and there's no need to update it every time
we want to format the definition. The commit itself tried to address
this in qemuDomainFormatXML, but forgot to fix qemuDomainDefFormatLive.
Not to mention that masking a previously set flag is only acceptable if
the flag was set by a public API user. Internally, libvirt should have
never set the flag in the first place.

https://bugzilla.redhat.com/show_bug.cgi?id=1485022

Signed-off-by: Jiri Denemark 
---
 src/qemu/qemu_domain.c | 2 --
 src/qemu/qemu_domain.h | 3 +--
 src/qemu/qemu_driver.c | 7 +++
 3 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 74284f40c1..5ef98911dc 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4757,8 +4757,6 @@ char *qemuDomainFormatXML(virQEMUDriverPtr driver,
 } else {
 def = vm->def;
 origCPU = priv->origCPU;
-if (virDomainObjIsActive(vm))
-flags &= ~VIR_DOMAIN_XML_UPDATE_CPU;
 }
 
 return qemuDomainDefFormatXMLInternal(driver, def, origCPU, flags);
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index b291dc3082..09201b1a40 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -40,8 +40,7 @@
 # include "logging/log_manager.h"
 
 # define QEMU_DOMAIN_FORMAT_LIVE_FLAGS  \
-(VIR_DOMAIN_XML_SECURE |\
- VIR_DOMAIN_XML_UPDATE_CPU)
+(VIR_DOMAIN_XML_SECURE)
 
 # if ULONG_MAX == 4294967295
 /* QEMU has a 64-bit limit, but we are limited by our historical choice of
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 77308d547e..a29bbea55d 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -6997,6 +6997,13 @@ static char
 if ((flags & VIR_DOMAIN_XML_MIGRATABLE))
 flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS;
 
+/* The CPU is already updated in the domain's live definition, we need to
+ * ignore the VIR_DOMAIN_XML_UPDATE_CPU flag.
+ */
+if (virDomainObjIsActive(vm) &&
+!(flags & VIR_DOMAIN_DEF_FORMAT_INACTIVE))
+flags &= ~VIR_DOMAIN_XML_UPDATE_CPU;
+
 ret = qemuDomainFormatXML(driver, vm, flags);
 
  cleanup:
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH 1/6] qemuxml2xmltest: Add tests for Power CPUs

2017-09-18 Thread Jiri Denemark
Signed-off-by: Jiri Denemark 
---
 .../qemuxml2xmlout-pseries-cpu-compat-power9.xml   | 38 +
 .../qemuxml2xmlout-pseries-cpu-compat.xml  | 38 +
 .../qemuxml2xmlout-pseries-cpu-exact.xml   | 39 ++
 tests/qemuxml2xmltest.c|  4 +++
 4 files changed, 119 insertions(+)
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml
 create mode 100644 
tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml

diff --git 
a/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml 
b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml
new file mode 100644
index 00..f020056219
--- /dev/null
+++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat-power9.xml
@@ -0,0 +1,38 @@
+
+  QEMUGuest1
+  c7a5fdbd-edaf-9455-926a-d65c16db1809
+  219100
+  219100
+  4
+  
+hvm
+
+  
+  
+power9
+  
+  
+  destroy
+  restart
+  destroy
+  
+/usr/bin/qemu-system-ppc64
+
+  
+
+
+  
+  
+
+
+  
+  
+
+
+  
+  
+
+
+
+  
+
diff --git a/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml 
b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml
new file mode 100644
index 00..3cbce9fe6a
--- /dev/null
+++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-compat.xml
@@ -0,0 +1,38 @@
+
+  QEMUGuest1
+  c7a5fdbd-edaf-9455-926a-d65c16db1809
+  219100
+  219100
+  4
+  
+hvm
+
+  
+  
+power7
+  
+  
+  destroy
+  restart
+  destroy
+  
+/usr/bin/qemu-system-ppc64
+
+  
+
+
+  
+  
+
+
+  
+  
+
+
+  
+  
+
+
+
+  
+
diff --git a/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml 
b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml
new file mode 100644
index 00..d69b387686
--- /dev/null
+++ b/tests/qemuxml2xmloutdata/qemuxml2xmlout-pseries-cpu-exact.xml
@@ -0,0 +1,39 @@
+
+  QEMUGuest1
+  1ccfd97d-5eb4-478a-bbe6-88d254c16db7
+  524288
+  524288
+  1
+  
+hvm
+
+  
+  
+POWER7
+IBM
+  
+  
+  destroy
+  restart
+  destroy
+  
+/usr/bin/qemu-system-ppc64
+
+  
+
+
+  
+  
+
+
+  
+  
+
+
+  
+  
+
+
+
+  
+
diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c
index efac9e8286..d15ea93b19 100644
--- a/tests/qemuxml2xmltest.c
+++ b/tests/qemuxml2xmltest.c
@@ -1216,6 +1216,10 @@ mymain(void)
 DO_TEST("smartcard-passthrough-spicevmc", NONE);
 DO_TEST("smartcard-controller", NONE);
 
+DO_TEST("pseries-cpu-compat-power9", NONE);
+DO_TEST("pseries-cpu-compat", NONE);
+DO_TEST("pseries-cpu-exact", NONE);
+
 if (getenv("LIBVIRT_SKIP_CLEANUP") == NULL)
 virFileDeleteTree(fakerootdir);
 
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 0/5] Reject parallel ports on s390, and pseries

2017-09-18 Thread Michal Privoznik
On 09/07/2017 02:19 PM, Pino Toscano wrote:
> Hi,
> 
> this is a simple series to reject parallel ports on s390 architectures,
> and pseries machines -- both simply do not support them.  This fixes
> https://bugzilla.redhat.com/show_bug.cgi?id=1487499
> 
> Pino Toscano (5):
>   tests: qemuxml2argv: fix expected type for usb-bus-missing
>   tests: qemuxml2argv: fail also on unexpected pass
>   qemu: pass the virDomainDef to qemuDomainChrDefValidate
>   qemu: reject parallel ports for s390 archs
>   qemu: reject parallel ports for pseries machines
> 
>  src/qemu/qemu_domain.c | 16 
>  .../qemuxml2argv-pseries-no-parallel.xml   | 18 ++
>  .../qemuxml2argv-s390-no-parallel.xml  | 22 
> ++
>  tests/qemuxml2argvtest.c   | 20 +---
>  4 files changed, 69 insertions(+), 7 deletions(-)
>  create mode 100644 
> tests/qemuxml2argvdata/qemuxml2argv-pseries-no-parallel.xml
>  create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-s390-no-parallel.xml
> 

ACK series.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCHv2] nodedev: add switchdev to NIC capabilities

2017-09-18 Thread John Ferlan


On 09/17/2017 05:32 AM, Edan David wrote:
> I removed the memset from virNetDevPutExtraHeader and tested on a card 
> supporting switchdev.
> It looks fine to me :)
> Great review guys, thanks!
> 

Thanks for checking... This is now pushed.

Congrats on your first libvirt patch,

John

> 
> -Original Message-
> From: John Ferlan [mailto:jfer...@redhat.com] 
> Sent: Saturday, September 16, 2017 2:04 PM
> To: Laine Stump ; libvir-list@redhat.com
> Cc: Edan David 
> Subject: Re: [libvirt] [PATCHv2] nodedev: add switchdev to NIC capabilities
> 
> 
> 
> On 09/14/2017 11:23 AM, Laine Stump wrote:
>> (Almost all of my comments result in "ok, no action needed". There are 
>> just three items I would like to see changed (2 trivial, 1 also small 
>> but Edan or John may think it prudent to re-test with the change 
>> before
>> pushing) - I marked those comments with [**].
>>
>> On 08/21/2017 05:19 AM, Edan David wrote:
>>> Adding functionality to libvirt that will allow querying the 
>>> interface for the availability of switchdev Offloading NIC capabilities.
>>> The switchdev mode was introduced in kernel 4.8, the iproute2-devlink 
>>> command to retrieve the swtichdev NIC feature, Command example:  
>>> devlink dev eswitch show pci/:03:00.0 This feature is needed for 
>>> Openstack so we can do a scheduling decision if the NIC is in 
>>> Hardware Offload (switchdev) or regular SR-IOV (legacy) mode.
>>> And select the appropriate hypervisors with the requested capability see 
>>> [1].
>>>
>>> [1] - 
>>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fsp
>>> ecs.openstack.org%2Fopenstack%2Fnova-specs%2Fspecs%2Fpike%2Fapproved%
>>> 2Fenable-sriov-nic-features.html&data=02%7C01%7Cedand%40mellanox.com%
>>> 7C1e30890a5c7a4f9f045f08d4fcf2b479%7Ca652971c7d2e4d9ba6a4d149256f461b
>>> %7C0%7C0%7C636411566714987559&sdata=Ri340H6MLap3HpOMDrb%2FFk6D9agQjEh
>>> C9cvR7%2FbV1Ls%3D&reserved=0
>>> ---
>>>  configure.ac  |  13 ++
>>>  docs/formatnode.html.in   |   1 +
>>>  src/util/virnetdev.c  | 187 
>>> +-
>>>  src/util/virnetdev.h  |   1 +
>>>  tests/nodedevschemadata/net_00_13_02_b9_f9_d3.xml |   1 +
>>>  tests/nodedevschemadata/net_00_15_58_2f_e9_55.xml |   1 +
>>>  6 files changed, 203 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/configure.ac b/configure.ac index b12b7fa..c089798 
>>> 100644
>>> --- a/configure.ac
>>> +++ b/configure.ac
>>> @@ -627,6 +627,19 @@ if test "$with_linux" = "yes"; then
>>>  AC_CHECK_HEADERS([linux/btrfs.h])  fi
>>>  
>>> +dnl
>>> +dnl check for kernel headers required by devlink dnl if test 
>>> +"$with_linux" = "yes"; then
>>> +AC_CHECK_HEADERS([linux/devlink.h])
>>> +AC_CHECK_DECLS([DEVLINK_GENL_VERSION, DEVLINK_GENL_NAME, 
>>> +DEVLINK_ATTR_MAX, DEVLINK_CMD_ESWITCH_GET, DEVLINK_ATTR_BUS_NAME, 
>>> +DEVLINK_ATTR_DEV_NAME, DEVLINK_ATTR_ESWITCH_MODE, 
>>> +DEVLINK_ESWITCH_MODE_SWITCHDEV],
>>
>> That's very . thorough, and potentially misleading if someone ever 
>> wanted to use devlink to check for something other than switchdev (e.g.
>> [mythical feature X]) on some system that didn't have the proper stuff 
>> defined for switchdev, but did have it for [other mythical feature X].
>> But that's all just hypothetical, so this is fine with me.
>>
>>
>>> +   [AC_DEFINE([HAVE_DECL_DEVLINK],
>>> +  [1],
>>> +  [whether devlink declarations is 
>>> + available])],
>>
>> [**]
>> s/is/are/
>>
> 
> Yep - altered...
> 
>>> +   [],
>>> +   [[#include ]]) fi
>>> +
>>>  dnl Allow perl/python overrides
>>>  AC_PATH_PROGS([PYTHON], [python2 python])  if test -z "$PYTHON"; 
>>> then diff --git a/docs/formatnode.html.in b/docs/formatnode.html.in 
>>> index 4d935b5..29244a8 100644
>>> --- a/docs/formatnode.html.in
>>> +++ b/docs/formatnode.html.in
>>> @@ -227,6 +227,7 @@
>>>  rxhashreceive-hashing
>>>  
>>> rdmaremote-direct-memory-access
>>>  
>>> txudptnltx-udp-tunnel-segmentation
>>> +
>>> + switchdevkernel-forward-plane-offload>> + >
>>
>> pretty odd abbreviation. But it is what it is :-)
  
>>>
>>>capability diff --git 
>>> a/src/util/virnetdev.c b/src/util/virnetdev.c index 51a6e42..fc7c961 
>>> 100644
>>> --- a/src/util/virnetdev.c
>>> +++ b/src/util/virnetdev.c
>>> @@ -59,6 +59,10 @@
>>>  # include 
>>>  #endif
>>>  
>>> +#if HAVE_DECL_DEVLINK
>>> +# include 
>>> +#endif
>>> +
>>>  #ifndef IFNAMSIZ
>>>  # define IFNAMSIZ 16
>>>  #endif
>>> @@ -2481,7 +2485,8 @@ VIR_ENUM_IMPL(virNetDevFeature,
>>>"ntuple",
>>>"rxhash",
>>>"rdma",
>>> -  "txudptnl")
>>> +  "txudptnl",
>>> +  "switchdev")
>>>  
>>>  #ifdef __linu

Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-18 Thread Richard Relph



On 9/18/17 4:47 AM, Daniel P. Berrange wrote:

On Mon, Sep 18, 2017 at 11:43:57AM +0200, Erik Skultety wrote:

[...]


 > c) what existing communicate interface can be used between libvirt and 
qemu
 > to get the measurement ? can we add a new qemu monitor command
 > 'get_sev_measurement' to get the measurement ? (step 10)

 Yes, QMP commands seeem most likely.

 > d) how to pass the secret blob from libvirt to qemu ? should we consider
 > adding a new object (sev-guest-secret) -- libvirt can add the object 
through
 > qemu monitor.

 Yeah, that looks like a viable option too.

So I could see a flow like the following:


   1. mgmt tool calls  virConnectGetCapabilities. This returns an XML
  document that includes the following

   
  ...other bits...
 
  ...hex encoded PDH key...

   

   2. mgmt tool requests to start a guest calling virCreateXML(),
  passing VIR_DOMAIN_START_PAUSED. The XML would include

   
 ...hex encode DH key...
..hex encode info...
...int32 value..
   


  if  is provided and VIR_DOMAIN_START_PAUSED is missing,
  libvirt would report an error and refuse to start the guest
For ease of use, I would not add this conditional to libvirt. If  
is provided and VIR_DOMAIN_START_PAUSED is missing, I’d just send the 
"GO" command as it would naturally occur.
Unless that would confuse things inside libvirt or QEMU in relation to 
the measurement and secret…
Many of our existing tests focus on other aspects of SEV functionality 
and so they skip the MEASURE/SECRET phase of launch and just go 
immediately from LAUNCH_UPDATE_DATA (or VMSA) to LAUNCH_FINISH.
I guess the key question will be how will QEMU know when to get the 
MEASUREMENT and wait for a LAUNCH_SECRET before doing a LAUNCH_FINISH 
when connected to libvirt.
Brijesh, this is your area. It feels to me like QEMU will have to wait 
to do the LAUNCH_FINISH until it gets the first “go” from libvirt. If 
that’s right, and assuming the same “go” comes from libvirt with or 
without VIR_DOMAIN_START_PAUSED, then I’d simply exclude the conditional 
check. QEMU would get the measurement when it is done sending the data.

Though in “real world” uses, I think the conditional is perfectly OK.


   3. Libvirt generates the QEMU cli arg to enable SEV using
  the XML data and starts QEMU, leaving CPUs paused

   4. QEMU emits a SEV_MEASURE event containing the measurement
  blob

Speaking of which, I expect QEMU to have a QMP command to retrieve the
measurement, in which case I think libvirt has to provide an API for the user
to retrieve the measurement in case libvirtd crashes somewhere between setting
up QEMU and waiting for the measurement event from QEMU, or simply because the
GO missed the event for some unspecified reason.

Yeah, that's a good point - we also ought to have a pause-reason that
reflects that it is paused due to waiting for SEV secrets.


   5. Libvirt catches the QEMU event and emits its own
  VIR_CONNECT_DOMAIN_EVENT_SEV_MEASURE event containing
  the measurement blob

   6. GO does its validation of the measurement

   7a  If validation failed, then virDomainDestroy() to stop QEMU

   7b  If validation succeeed

  Optionally call

  virDomainSetSEVSecret()

Given the fact that we're likely introducing a new  element to the XML
config, I'm more inclined to utilizing the existing virSecret interfaces (as
was originally suggested) instead of creating a vendor-specific API. You could
have an optional secret sub-element within the  element and libvirt would
simply check if that secret has a value set, once the GO issues
virDomainResume(). Any particular reason for having a specific API for this that
I'm missing?

Initially I was intending to suggest extensive use of virSecret, but it
turns out that despite being called a "secret", none of the SEV data we are
passing around needs protection. Either it is safe to be public, or it is
already encrypted.  So essentially we just have some data blobs we need to
pass into QEMU. I didn't feel we ought to be abusing virSecret as a
general purpose mechanism for passing in opaque data blobs which do not
need any kind of protection.

All of the above looks really good to me.
While I agree with Daniel’s analysis of the need for “secret”, I do like 
using virSecret to convey the notion of secrecy. But it isn’t necessary. 
The end points are the SEV FW and the guest owner and all secrets they 
share are already encrypted. Embedding it in the “GO” command feels 
equally OK to me.
Note that sending a secret with a “GO” other than the first one is an 
error… I don’t think libvirt needs to catch that, though. The SEV FW will.



Regards,
Daniel

Thanks,
Richard
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH] check-spacing: Don't hardcode Perl interpreter path

2017-09-18 Thread Andrea Bolognani
This is particularly useful on operating systems that don't ship
Perl as part of the base system (eg. FreeBSD) while still working
just as well as it did before on Linux.

Signed-off-by: Andrea Bolognani 
---
 build-aux/check-spacing.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/build-aux/check-spacing.pl b/build-aux/check-spacing.pl
index 448acf234..ca8b43491 100755
--- a/build-aux/check-spacing.pl
+++ b/build-aux/check-spacing.pl
@@ -1,4 +1,4 @@
-#!/usr/bin/perl
+#!/usr/bin/env perl
 #
 # check-spacing.pl: Report any usage of 'function (..args..)'
 # Also check for other syntax issues, such as correct use of ';'
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v2 0/3] Be more selective when determining cdrom for taint messaging

2017-09-18 Thread Michal Privoznik
On 09/11/2017 04:32 PM, John Ferlan wrote:
> v1: https://www.redhat.com/archives/libvir-list/2017-September/msg00103.html
> 
> Changes since v1:
> 
>  Split into 3 parts... The first patch would be the bare minimum using
>  STRPREFIX instead of STREQ type comparisons for the incoming path to
>  be "/dev/cdrom[N]" or "/dev/srN" (or resolved to that).
> 
>  This would "work" for the most part, but then since it's possible to 
>  make even more checks let's check against the collected node device
>  data. Patch 2 therefore will "tag" the already collected cdrom data
>  with a capability. This allows patch3 to find any/all CDROM's on the
>  host and compare the resolved path to that list of devices returning
>  "true" if something matches a node device declared physical CDROM.
> 
> I split things up mainly to make it easier to decide whether patch 1
> is sufficient or not. If patch2 and patch3 are OK, I would also add
> a release note indicating the improvement to find CDROM by node device
> capability.  It's a separate "improvement" on it's own as well. Whether
> it's truly useful or not, is a different question...

[1]

> 
> John Ferlan (3):
>   qemu: Be more selective when determining cdrom for taint messaging

ACK to this one ^^

>   nodedev: Add capability bit to detect 'cdrom' devices
>   qemu: Add inquiry to nodedev for cdrom taint checking

However, these two ^^ look like an overkill to me. It's still just a
taint message that nobody cares about. Or?
1: Yeah, I don't think we really need such a big hammer for tiny nail.
But I might be missing something.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v3 0/6] Work around the kernel mdev uevent race in nodedev

2017-09-18 Thread John Ferlan


On 09/18/2017 01:53 AM, Erik Skultety wrote:
> On Thu, Aug 24, 2017 at 01:23:26PM +0200, Erik Skultety wrote:
>> v2 here: https://www.redhat.com/archives/libvir-list/2017-July/msg01268.html
>>
>> Since v2:
>> - added patch 4/6 that fixes the issue with the handler thread spamming logs
>> with "udev_monitor_receive_device returned NULL"
>>   -> the event loop callback now disables polling on the udev monitor's fd
>>  every time there's a new event, leaving the responsibility for 
>> re-enabling
>>  it back to the handler thread once it had removed the corresponding data
>>  from the socket.
>>
>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1463285
> 
> ping
> 

I think I had assumed there was going to be a v4...  Was I incorrect and
you would prefer to keep this series as is presented here?

John

> Erik
> 
> --
> libvir-list mailing list
> libvir-list@redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] virsh: man: Describe the 'create' command a bit more

2017-09-18 Thread Michal Privoznik
On 09/08/2017 02:52 PM, Erik Skultety wrote:
> So we refer to the terms 'persistent' and 'transient' across the whole
> man page, without describing it further, but more importantly, how the
> create command affects it, i.e. explicitly stating that domain created
> via the 'create' command are going to be transient or persistent,
> depending on whether there is an existing persistent domain with a
> matching  and , in which case it will remain persistent, but
> will run using a one-time configuration, otherwise it's going to be
> transient and will vanish once destroyed.
> 
> Signed-off-by: Erik Skultety 
> ---
>  tools/virsh.pod | 32 ++--
>  1 file changed, 26 insertions(+), 6 deletions(-)

ACK

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] Questions about function virPCIDeviceIsBehindSwitchLackingACS in virpci.c

2017-09-18 Thread Wuzongyong (Euler Dept)
In function virPCIDeviceIsBehindSwitchLackingACS, I noticed that(line 8):

1if (virPCIDeviceGetParent(dev, &parent) < 0)
2return -1;
3if (!parent) {
4/* if we have no parent, and this is the root bus, ACS doesn't come
5 * into play since devices on the root bus can't P2P without going
6 * through the root IOMMU.
7 */
8if (dev->address.bus == 0) {
9return 0;
10} else {
11virReportError(VIR_ERR_INTERNAL_ERROR,
12   _("Failed to find parent device for %s"),
13   dev->name);
14return -1;
15}
16}

Why we just return 0 only if device's bus is 0?
In my server, I can see a root bus which bus number is greater than 0, see the
results(just a part) after I run lspci -t:

+-[:80]-+-02.0-[81-83]--+-00.0
|   |   \-00.1
|   +-05.0
|   +-05.1
|   +-05.2
|   \-05.4
+-[:7f]-+-08.0
|   +-08.2
|   +-08.3
|   + . . .
|   \-1f.2
\-[:00]-+-00.0
 +-01.0-[01]00.0
 +-02.0-[02]--+-00.0
 |+-00.1
 |+-00.2
 |\-00.3
 +-02.2-[03]--
 +-03.0-[04-0b]00.0-[05-0b]--+-08.0-[06-08]00.0
 |   \-10.0-[09-0b]00.0
 +-05.0
 +-05.1
 +-05.2
 +-05.4
 +-11.0
 +-11.4
 +-16.0
 +-16.1
 +-1a.0

If I assign the device :81:00.0 to a VM, I get "Failed to find parent 
device".
I think I should get no error with return value 0 just like bus number is 0, 
because
bus 80 is the root bus as well in my case.

In the <>
Datasheet, I found that(Chapter 9.1):
For some server platforms, it may be desirable to have multiple PCHs in the 
system
Which means some PCH's may reside on a bus greater than 0.

So, is this a bug?

Thanks,
Zongyong Wu

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH go-xml] Add support for domain hostdev and test code

2017-09-18 Thread Daniel P. Berrange
On Fri, Sep 15, 2017 at 02:12:44PM +0800, zhenwei.pi wrote:
> Signed-off-by: zhenwei.pi 
> ---
>  domain.go  | 36 
>  domain_test.go | 44 
>  2 files changed, 80 insertions(+)
> 
> diff --git a/domain.go b/domain.go
> index bead49a..1bcc9cc 100644
> --- a/domain.go
> +++ b/domain.go
> @@ -407,6 +407,29 @@ type DomainRNG struct {
>   Backend *DomainRNGBackend `xml:"backend"`
>  }
>  
> +type DomainHostdevAdapter struct {
> + Name string `xml:"name,attr,omitempty"`
> +}
> +
> +type DomainHostdevSource struct {
> + Protocol string`xml:"protocol,attr,omitempty"`
> + Name string`xml:"name,attr,omitempty"`
> + Wwpn string`xml:"wwpn,attr,omitempty"`

This should be WWPN

> + Adapter  *DomainHostdevAdapter `xml:"adapter"`
> + Address  *DomainAddress`xml:"address"`
> +}
> +
> +type DomainHostdev struct {
> + XMLName xml.Name `xml:"hostdev"`
> + Modestring   `xml:"mode,attr"`
> + Typestring   `xml:"type,attr"`
> + Sgiostring   `xml:"sgio,attr,omitempty"`

SGIO

> + Rawio   string   `xml:"rawio,attr,omitempty"`

And RawIO


> + Managed string   `xml:"managed,attr,omitempty"`
> + Source  *DomainHostdevSource `xml:"source"`
> + Address *DomainAddress   `xml:"address"`
> +}

> diff --git a/domain_test.go b/domain_test.go
> index d1b107d..73dd47b 100644
> --- a/domain_test.go
> +++ b/domain_test.go
> @@ -37,6 +37,13 @@ type Address struct {
>   Function HexUint
>  }
>  
> +type ScsiAddress struct {

Should be SCSI

ACK, and I'll fix the capitalization when pushing


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] apparmor: add attach_disconnected

2017-09-18 Thread Michal Privoznik
On 09/15/2017 05:17 PM, Guido Günther wrote:
> Otherwise we fail to reconnect to /dev/net/tun opened by libvirtd
> like
> 
> [ 8144.507756] audit: type=1400 audit(1505488162.386:38069121): 
> apparmor="DENIED" operation="file_perm" info="Failed name lookup - 
> disconnected path" error=-13 
> profile="libvirt-5dfcc8a7-b79a-4fa9-a41f-f6271651934c" name="dev/net/tun" 
> pid=9607 comm="qemu-system-x86" requested_mask="r" denied_mask="r" fsuid=117 
> ouid=0
> 
> ---
> I do wonder why we didn't see this earlier though.
> 
>  examples/apparmor/TEMPLATE.lxc  | 2 +-
>  examples/apparmor/TEMPLATE.qemu | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

+1/ACK/or whatever.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] apparmor: cater for new AAVMF image location

2017-09-18 Thread Michal Privoznik
On 09/15/2017 06:10 PM, Guido Günther wrote:
> Things moved again, sigh.
> ---
>  src/security/virt-aa-helper.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/src/security/virt-aa-helper.c b/src/security/virt-aa-helper.c
> index 55a686a59c..0b43c8e391 100644
> --- a/src/security/virt-aa-helper.c
> +++ b/src/security/virt-aa-helper.c
> @@ -516,7 +516,8 @@ valid_path(const char *path, const bool readonly)
>  "/usr/share/OVMF/",  /* for OVMF images */
>  "/usr/share/ovmf/",  /* for OVMF images */
>  "/usr/share/AAVMF/", /* for AAVMF images */
> -"/usr/share/qemu-efi/"   /* for AAVMF images */
> +"/usr/share/qemu-efi/",  /* for AAVMF images */
> +"/usr/share/qemu-efi-aarch64/"   /* for AAVMF images */
>  };
>  /* override the above with these */
>  const char * const override[] = {
> 

ACK

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH python] Skip sparseRecvAll / sparseSendAll in sanity test

2017-09-18 Thread Michal Privoznik
On 09/18/2017 11:49 AM, Daniel P. Berrange wrote:
> On Mon, Sep 18, 2017 at 11:47:24AM +0200, Michal Privoznik wrote:
>> On 09/14/2017 02:50 PM, Daniel P. Berrange wrote:
>>> The sanity test check aims to ensure that every function listed in
>>> the Python code maps to a corresponding C function. The Sparse
>>> send/recv methods are special though - we're never calling the
>>> corresponding C APIs, instead we have a pure python impl.
>>>
>>> Signed-off-by: Daniel P. Berrange 
>>> ---
>>>  sanitytest.py | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/sanitytest.py b/sanitytest.py
>>> index deec200..a5cb01b 100644
>>> --- a/sanitytest.py
>>> +++ b/sanitytest.py
>>> @@ -351,7 +351,8 @@ for klass in gotfunctions:
>>>  for func in sorted(gotfunctions[klass]):
>>>  # These are pure python methods with no C APi
>>>  if func in ["connect", "getConnect", "domain", "getDomain",
>>> -"virEventInvokeFreeCallback"]:
>>> +"virEventInvokeFreeCallback",
>>> +"sparseRecvAll", "sparseSendAll"]:
>>>  continue
>>>  
>>>  key = "%s.%s" % (klass, func)
>>>
>>
>> Well, what if somebody builds -python against older libvirt that lacks
>> virStreamSparseRecvAll()? Are they facing an exception because cmod is
>> unable to find that func?
> 
> Yes, the same is true of any of the functions where we manually written
> the Python API. To be fully correct we ought to annotate the manually
> written python code and then process the .py file to strip out the APIs
> we can't support. Patches welcome for doing that

Okay. Since we're doing that for others you have my ACK. This makes
things better and no worse.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH 0/3] cpu: Add new Skylake-Server CPU model

2017-09-18 Thread Pavel Hrdina
On Wed, Sep 13, 2017 at 02:00:43PM +0200, Jiri Denemark wrote:
> Jiri Denemark (3):
>   tests: Add CPUID data for Intel(R) Xeon(R) Gold 6148 CPU
>   cpu: Add clwb/pcommit CPU features
>   cpu: Add new Skylake-Server CPU model

Reviewed-by: Pavel Hrdina 


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH python] Skip sparseRecvAll / sparseSendAll in sanity test

2017-09-18 Thread Daniel P. Berrange
On Mon, Sep 18, 2017 at 11:47:24AM +0200, Michal Privoznik wrote:
> On 09/14/2017 02:50 PM, Daniel P. Berrange wrote:
> > The sanity test check aims to ensure that every function listed in
> > the Python code maps to a corresponding C function. The Sparse
> > send/recv methods are special though - we're never calling the
> > corresponding C APIs, instead we have a pure python impl.
> > 
> > Signed-off-by: Daniel P. Berrange 
> > ---
> >  sanitytest.py | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/sanitytest.py b/sanitytest.py
> > index deec200..a5cb01b 100644
> > --- a/sanitytest.py
> > +++ b/sanitytest.py
> > @@ -351,7 +351,8 @@ for klass in gotfunctions:
> >  for func in sorted(gotfunctions[klass]):
> >  # These are pure python methods with no C APi
> >  if func in ["connect", "getConnect", "domain", "getDomain",
> > -"virEventInvokeFreeCallback"]:
> > +"virEventInvokeFreeCallback",
> > +"sparseRecvAll", "sparseSendAll"]:
> >  continue
> >  
> >  key = "%s.%s" % (klass, func)
> > 
> 
> Well, what if somebody builds -python against older libvirt that lacks
> virStreamSparseRecvAll()? Are they facing an exception because cmod is
> unable to find that func?

Yes, the same is true of any of the functions where we manually written
the Python API. To be fully correct we ought to annotate the manually
written python code and then process the .py file to strip out the APIs
we can't support. Patches welcome for doing that

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] RFC: libvirt support for QEMU live patching

2017-09-18 Thread Martin Kletzander

On Mon, Sep 18, 2017 at 10:03:34AM +0100, Daniel P. Berrange wrote:

On Mon, Sep 18, 2017 at 09:37:14AM +0200, Martin Kletzander wrote:

On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote:
> On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote:
> > Hi,
> > QEMU live patching should be just a matter of updating the QEMU RPM package
> > and then live migrating the VMs to another QEMU instance on the same host
> > (which would point to the just installed new QEMU executable).
> > I think it will be useful to support it from libvirt side. After some
> > searching I found a
> > RFC patch posted in Nov 2013. Here is the link to it
> > https://www.redhat.com/archives/libvir-list/2013-November/msg00372.html
> > Approach followed in above mentioned link is as follows:
> > 1. newDef = deep copy oldVm definition
> > 2. newVm = create VM using newDef, start QEMU process with all vCPUs paused
> > 3. oldVm migrate to newVm using unix socket
> > 4. shutdown oldVm
> > 5. newPid = newVm->pid
> > 6. finalDef = live deep copy of newVm definition
> > 7. Drop the newVm from qemu domain table without shutting down QEMU process
> > 8. Assign finalDef to oldVm
> > 9. oldVm attaches to QEMU process newPid using finalDef
> > 10.resume all vCPUs in oldVm
> >
> > I can see it didn't get communities approval for having problems in handling
> > UUID
> > of the vm's. To fix the problem we need to teach libvirt to manage two qemu
> > processes
> > at once both tied to same UUID. I would like to know if there is any
> > interested approach
> > to get this done. I would like to send patches on this.
> >
> > Is there any specific reason why it is not been pursued for the last 4 year?
>
> It isn't possible to make it work correctly in the general case, because
> both QEMU processes want to own the same files on disk. eg both might want
> to listen on a UNIX socket /foo/bar, but only one can do this. If you let
> the new QEMU delete the original QEMU's sockets, then you either break or
> delay incoming connections during the migration time, and you make it
> impossible to roll back on failure, or both. This kind of thing is not
> something that's acceptable for the usage scenerio described, which would
> need to be bulletproof to be usable in production.
>

Can't we utilize namespaces for this?  Lot of the things could be
separated, so we could fire up a new VM that's "containerized" like
this, migrate to it and then fire up a new one and migrate back.  If the
first migration fails then we can still fallback.  If it does not, then
the second one "should" not either.


As well as increasing complexity and thus chances of failure, this also
doubles the performance impact on the guest.


More generally I think the high level idea of in-place live-upgrades for
guests on a host is flawed.  Even if we had the ability to in-place upgrade
QEMU for running guest, there are always going to be cases where a security
upgrade requires you to reboot the host system - kernel/glibc flaw, or fixes
to other servicves that can't be restarted eg dbus. Any person hosting VMs
will always always need to be able to handle migrating VMs to a *separate*
host to deal with such security updates. For added complexity, determinining
which security upgrades could be done by restarting QEMU vs which need to have
a full host restart is not trivial, unless intimately familiar with the software
stack.

So from an administrative / operational POV, defining two separate procedures
for dealing with upgrades is unwise. You have the chance of picking the wrong
one and leaving a security fix accidentally unpatched, or if you take one path
95% of the time and the other path 5% of the time, chances are you're going
to screw up the less used path due to lack of practice. It makes more sense
to simplify the operational protocols by standardizing of cross-host migration,
followed by host reboot, for applying patches. It simplifies the decision
matrix removing complexity, and ensures you are well praticed following the
same approach every time. The only cost is having 1 spare server against
which to perform the migrations, but if your VMs are at all important, you
will have spare hardware already to deal with unforseen hardware problems.
If you absolutely can't afford spare hardware, you could just run your
entire "host" OS in a container, and then spin up a second "host" OS on
the same machine and migrate into that. From libvirt's POV this is still
cross-host migration, so needs no special case.



Oh sure, I did not consider the in-place upgrade to be dealing with any
security issues.  And even if people would want just update qemu for
some other reason they would be misusing the update for security issues
and that's a problem on its own.  We probably don't want that at all.

And there are other ways around it.  Case closed then, thanks for the
insights.


Regards,
Daniel
--
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https:/

Re: [libvirt] [PATCH python] Skip sparseRecvAll / sparseSendAll in sanity test

2017-09-18 Thread Michal Privoznik
On 09/14/2017 02:50 PM, Daniel P. Berrange wrote:
> The sanity test check aims to ensure that every function listed in
> the Python code maps to a corresponding C function. The Sparse
> send/recv methods are special though - we're never calling the
> corresponding C APIs, instead we have a pure python impl.
> 
> Signed-off-by: Daniel P. Berrange 
> ---
>  sanitytest.py | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/sanitytest.py b/sanitytest.py
> index deec200..a5cb01b 100644
> --- a/sanitytest.py
> +++ b/sanitytest.py
> @@ -351,7 +351,8 @@ for klass in gotfunctions:
>  for func in sorted(gotfunctions[klass]):
>  # These are pure python methods with no C APi
>  if func in ["connect", "getConnect", "domain", "getDomain",
> -"virEventInvokeFreeCallback"]:
> +"virEventInvokeFreeCallback",
> +"sparseRecvAll", "sparseSendAll"]:
>  continue
>  
>  key = "%s.%s" % (klass, func)
> 

Well, what if somebody builds -python against older libvirt that lacks
virStreamSparseRecvAll()? Are they facing an exception because cmod is
unable to find that func?

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-18 Thread Daniel P. Berrange
On Mon, Sep 18, 2017 at 11:43:57AM +0200, Erik Skultety wrote:
> [...]
> 
> > >
> > > > c) what existing communicate interface can be used between libvirt 
> > > and qemu
> > > > to get the measurement ? can we add a new qemu monitor command
> > > > 'get_sev_measurement' to get the measurement ? (step 10)
> > >
> > > Yes, QMP commands seeem most likely.
> > >
> > > > d) how to pass the secret blob from libvirt to qemu ? should we 
> > > consider
> > > > adding a new object (sev-guest-secret) -- libvirt can add the 
> > > object through
> > > > qemu monitor.
> > >
> > > Yeah, that looks like a viable option too.
> >
> > So I could see a flow like the following:
> >
> >
> >   1. mgmt tool calls  virConnectGetCapabilities. This returns an XML
> >  document that includes the following
> >
> >   
> >  ...other bits...
> > 
> >   ...hex encoded PDH key...
> > 
> >   
> >
> >   2. mgmt tool requests to start a guest calling virCreateXML(),
> >  passing VIR_DOMAIN_START_PAUSED. The XML would include
> >
> >   
> > ...hex encode DH key...
> > ..hex encode info...
> > ...int32 value..
> >   
> >
> >
> >  if  is provided and VIR_DOMAIN_START_PAUSED is missing,
> >  libvirt would report an error and refuse to start the guest
> >
> >   3. Libvirt generates the QEMU cli arg to enable SEV using
> >  the XML data and starts QEMU, leaving CPUs paused
> >
> >   4. QEMU emits a SEV_MEASURE event containing the measurement
> >  blob
> 
> Speaking of which, I expect QEMU to have a QMP command to retrieve the
> measurement, in which case I think libvirt has to provide an API for the user
> to retrieve the measurement in case libvirtd crashes somewhere between setting
> up QEMU and waiting for the measurement event from QEMU, or simply because the
> GO missed the event for some unspecified reason.

Yeah, that's a good point - we also ought to have a pause-reason that
reflects that it is paused due to waiting for SEV secrets.

> 
> >
> >   5. Libvirt catches the QEMU event and emits its own
> >  VIR_CONNECT_DOMAIN_EVENT_SEV_MEASURE event containing
> >  the measurement blob
> >
> >   6. GO does its validation of the measurement
> >
> >   7a  If validation failed, then virDomainDestroy() to stop QEMU
> >
> >   7b  If validation succeeed
> >
> >  Optionally call
> >
> >  virDomainSetSEVSecret()
> 
> Given the fact that we're likely introducing a new  element to the XML
> config, I'm more inclined to utilizing the existing virSecret interfaces (as
> was originally suggested) instead of creating a vendor-specific API. You could
> have an optional secret sub-element within the  element and libvirt would
> simply check if that secret has a value set, once the GO issues
> virDomainResume(). Any particular reason for having a specific API for this 
> that
> I'm missing?

Initially I was intending to suggest extensive use of virSecret, but it
turns out that despite being called a "secret", none of the SEV data we are
passing around needs protection. Either it is safe to be public, or it is
already encrypted.  So essentially we just have some data blobs we need to
pass into QEMU. I didn't feel we ought to be abusing virSecret as a
general purpose mechanism for passing in opaque data blobs which do not
need any kind of protection.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-18 Thread Erik Skultety
[...]

> >
> > > c) what existing communicate interface can be used between libvirt 
> > and qemu
> > > to get the measurement ? can we add a new qemu monitor command
> > > 'get_sev_measurement' to get the measurement ? (step 10)
> >
> > Yes, QMP commands seeem most likely.
> >
> > > d) how to pass the secret blob from libvirt to qemu ? should we 
> > consider
> > > adding a new object (sev-guest-secret) -- libvirt can add the object 
> > through
> > > qemu monitor.
> >
> > Yeah, that looks like a viable option too.
>
> So I could see a flow like the following:
>
>
>   1. mgmt tool calls  virConnectGetCapabilities. This returns an XML
>  document that includes the following
>
>   
>  ...other bits...
> 
> ...hex encoded PDH key...
>   
>   
>
>   2. mgmt tool requests to start a guest calling virCreateXML(),
>  passing VIR_DOMAIN_START_PAUSED. The XML would include
>
>   
> ...hex encode DH key...
>   ..hex encode info...
>   ...int32 value..
>   
>
>
>  if  is provided and VIR_DOMAIN_START_PAUSED is missing,
>  libvirt would report an error and refuse to start the guest
>
>   3. Libvirt generates the QEMU cli arg to enable SEV using
>  the XML data and starts QEMU, leaving CPUs paused
>
>   4. QEMU emits a SEV_MEASURE event containing the measurement
>  blob

Speaking of which, I expect QEMU to have a QMP command to retrieve the
measurement, in which case I think libvirt has to provide an API for the user
to retrieve the measurement in case libvirtd crashes somewhere between setting
up QEMU and waiting for the measurement event from QEMU, or simply because the
GO missed the event for some unspecified reason.

>
>   5. Libvirt catches the QEMU event and emits its own
>  VIR_CONNECT_DOMAIN_EVENT_SEV_MEASURE event containing
>  the measurement blob
>
>   6. GO does its validation of the measurement
>
>   7a  If validation failed, then virDomainDestroy() to stop QEMU
>
>   7b  If validation succeeed
>
>  Optionally call
>
>  virDomainSetSEVSecret()

Given the fact that we're likely introducing a new  element to the XML
config, I'm more inclined to utilizing the existing virSecret interfaces (as
was originally suggested) instead of creating a vendor-specific API. You could
have an optional secret sub-element within the  element and libvirt would
simply check if that secret has a value set, once the GO issues
virDomainResume(). Any particular reason for having a specific API for this that
I'm missing?

Other than that I like the initial proposal for the design.

Erik

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] apparmor: add attach_disconnected

2017-09-18 Thread intrigeri
Hi,

Jamie Strandboge:
> On Fri, 2017-09-15 at 17:17 +0200, Guido Günther wrote:
>> Otherwise we fail to reconnect to /dev/net/tun opened by libvirtd
>> like

I confirm I see the bug on current Debian sid and Guido's patch
fixes it. Please commit :)

Cheers,
-- 
intrigeri

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] apparmor: cater for new AAVMF image location

2017-09-18 Thread intrigeri
Jamie Strandboge:
> On Fri, 2017-09-15 at 18:10 +0200, Guido Günther wrote:
>> Things moved again, sigh.
>> ---
>>  src/security/virt-aa-helper.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/src/security/virt-aa-helper.c b/src/security/virt-aa-helper.c
>> index 55a686a59c..0b43c8e391 100644
>> --- a/src/security/virt-aa-helper.c
>> +++ b/src/security/virt-aa-helper.c
>> @@ -516,7 +516,8 @@ valid_path(const char *path, const bool readonly)
>>  "/usr/share/OVMF/",  /* for OVMF images */
>>  "/usr/share/ovmf/",  /* for OVMF images */
>>  "/usr/share/AAVMF/", /* for AAVMF images */
>> -"/usr/share/qemu-efi/"   /* for AAVMF images */
>> +"/usr/share/qemu-efi/",  /* for AAVMF images */
>> +"/usr/share/qemu-efi-aarch64/"   /* for AAVMF images */
>>  };
>>  /* override the above with these */
>>  const char * const override[] = {

> +1. LGTM

+1 too after verifying that qemu-efi-aarch64 on current Debian sid
indeed ships QEMU_EFI.fd in there..

Cheers,
-- 
intrigeri

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] RFC: libvirt support for QEMU live patching

2017-09-18 Thread Daniel P. Berrange
On Mon, Sep 18, 2017 at 09:37:14AM +0200, Martin Kletzander wrote:
> On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote:
> > On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote:
> > > Hi,
> > > QEMU live patching should be just a matter of updating the QEMU RPM 
> > > package
> > > and then live migrating the VMs to another QEMU instance on the same host
> > > (which would point to the just installed new QEMU executable).
> > > I think it will be useful to support it from libvirt side. After some
> > > searching I found a
> > > RFC patch posted in Nov 2013. Here is the link to it
> > > https://www.redhat.com/archives/libvir-list/2013-November/msg00372.html
> > > Approach followed in above mentioned link is as follows:
> > > 1. newDef = deep copy oldVm definition
> > > 2. newVm = create VM using newDef, start QEMU process with all vCPUs 
> > > paused
> > > 3. oldVm migrate to newVm using unix socket
> > > 4. shutdown oldVm
> > > 5. newPid = newVm->pid
> > > 6. finalDef = live deep copy of newVm definition
> > > 7. Drop the newVm from qemu domain table without shutting down QEMU 
> > > process
> > > 8. Assign finalDef to oldVm
> > > 9. oldVm attaches to QEMU process newPid using finalDef
> > > 10.resume all vCPUs in oldVm
> > > 
> > > I can see it didn't get communities approval for having problems in 
> > > handling
> > > UUID
> > > of the vm's. To fix the problem we need to teach libvirt to manage two 
> > > qemu
> > > processes
> > > at once both tied to same UUID. I would like to know if there is any
> > > interested approach
> > > to get this done. I would like to send patches on this.
> > > 
> > > Is there any specific reason why it is not been pursued for the last 4 
> > > year?
> > 
> > It isn't possible to make it work correctly in the general case, because
> > both QEMU processes want to own the same files on disk. eg both might want
> > to listen on a UNIX socket /foo/bar, but only one can do this. If you let
> > the new QEMU delete the original QEMU's sockets, then you either break or
> > delay incoming connections during the migration time, and you make it
> > impossible to roll back on failure, or both. This kind of thing is not
> > something that's acceptable for the usage scenerio described, which would
> > need to be bulletproof to be usable in production.
> > 
> 
> Can't we utilize namespaces for this?  Lot of the things could be
> separated, so we could fire up a new VM that's "containerized" like
> this, migrate to it and then fire up a new one and migrate back.  If the
> first migration fails then we can still fallback.  If it does not, then
> the second one "should" not either.

As well as increasing complexity and thus chances of failure, this also
doubles the performance impact on the guest.


More generally I think the high level idea of in-place live-upgrades for
guests on a host is flawed.  Even if we had the ability to in-place upgrade
QEMU for running guest, there are always going to be cases where a security
upgrade requires you to reboot the host system - kernel/glibc flaw, or fixes
to other servicves that can't be restarted eg dbus. Any person hosting VMs
will always always need to be able to handle migrating VMs to a *separate*
host to deal with such security updates. For added complexity, determinining
which security upgrades could be done by restarting QEMU vs which need to have
a full host restart is not trivial, unless intimately familiar with the software
stack.

So from an administrative / operational POV, defining two separate procedures
for dealing with upgrades is unwise. You have the chance of picking the wrong
one and leaving a security fix accidentally unpatched, or if you take one path
95% of the time and the other path 5% of the time, chances are you're going
to screw up the less used path due to lack of practice. It makes more sense
to simplify the operational protocols by standardizing of cross-host migration,
followed by host reboot, for applying patches. It simplifies the decision
matrix removing complexity, and ensures you are well praticed following the
same approach every time. The only cost is having 1 spare server against
which to perform the migrations, but if your VMs are at all important, you
will have spare hardware already to deal with unforseen hardware problems.
If you absolutely can't afford spare hardware, you could just run your
entire "host" OS in a container, and then spin up a second "host" OS on
the same machine and migrate into that. From libvirt's POV this is still
cross-host migration, so needs no special case.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] qemu: Introduce a wrapper over virFileWrapperFdClose

2017-09-18 Thread Michal Privoznik
https://bugzilla.redhat.com/show_bug.cgi?id=1448268

When migrating to a file (e.g. when doing 'virsh save file'),
couple of things are happening in the thread that is executing
the API:

1) the domain obj is locked
2) iohelper is spawned as a separate process to handle all I/O
3) the thread waits for iohelper to finish
4) the domain obj is unlocked

Now, the problem is that while the thread waits in step 3 for
iohelper to finish this may take ages because iohelper calls
fdatasync(). And unfortunately, we are waiting the whole time
with the domain locked. So if another thread wants to jump in and
say copy the domain name ('virsh list' for instance), they are
stuck.

The solution is to unlock the domain whenever waiting for I/O and
lock it back again when it finished.

Signed-off-by: Michal Privoznik 
---
 src/qemu/qemu_driver.c | 27 +++
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b334cf20b..f81d3def4 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3216,6 +3216,25 @@ qemuOpenFileAs(uid_t fallback_uid, gid_t fallback_gid,
 goto cleanup;
 }
 
+
+static int
+qemuFileWrapperFDClose(virDomainObjPtr vm,
+   virFileWrapperFdPtr fd)
+{
+int ret;
+
+/* virFileWrapperFd uses iohelper to write data onto disk.
+ * However, iohelper calls fdatasync() which may take ages to
+ * finish. Therefore, we shouldn't be waiting with the domain
+ * object locked. */
+
+virObjectUnlock(vm);
+ret = virFileWrapperFdClose(fd);
+virObjectLock(vm);
+return ret;
+}
+
+
 /* Helper function to execute a migration to file with a correct save header
  * the caller needs to make sure that the processors are stopped and do all 
other
  * actions besides saving memory */
@@ -3276,7 +3295,7 @@ qemuDomainSaveMemory(virQEMUDriverPtr driver,
 goto cleanup;
 }
 
-if (virFileWrapperFdClose(wrapperFd) < 0)
+if (qemuFileWrapperFDClose(vm, wrapperFd) < 0)
 goto cleanup;
 
 if ((fd = qemuOpenFile(driver, vm, path, O_WRONLY, NULL, NULL)) < 0 ||
@@ -3827,7 +3846,7 @@ doCoreDump(virQEMUDriverPtr driver,
  path);
 goto cleanup;
 }
-if (virFileWrapperFdClose(wrapperFd) < 0)
+if (qemuFileWrapperFDClose(vm, wrapperFd) < 0)
 goto cleanup;
 
 ret = 0;
@@ -6687,7 +6706,7 @@ qemuDomainRestoreFlags(virConnectPtr conn,
 
 ret = qemuDomainSaveImageStartVM(conn, driver, vm, &fd, data, path,
  false, QEMU_ASYNC_JOB_START);
-if (virFileWrapperFdClose(wrapperFd) < 0)
+if (qemuFileWrapperFDClose(vm, wrapperFd) < 0)
 VIR_WARN("Failed to close %s", path);
 
 qemuProcessEndJob(driver, vm);
@@ -6962,7 +6981,7 @@ qemuDomainObjRestore(virConnectPtr conn,
 
 ret = qemuDomainSaveImageStartVM(conn, driver, vm, &fd, data, path,
  start_paused, asyncJob);
-if (virFileWrapperFdClose(wrapperFd) < 0)
+if (qemuFileWrapperFDClose(vm, wrapperFd) < 0)
 VIR_WARN("Failed to close %s", path);
 
  cleanup:
-- 
2.13.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] RFC: libvirt support for QEMU live patching

2017-09-18 Thread Martin Kletzander

On Mon, Sep 18, 2017 at 10:26:40AM +0200, Peter Krempa wrote:

On Mon, Sep 18, 2017 at 09:37:14 +0200, Martin Kletzander wrote:

On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote:
> On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote:


[...]


> It isn't possible to make it work correctly in the general case, because
> both QEMU processes want to own the same files on disk. eg both might want
> to listen on a UNIX socket /foo/bar, but only one can do this. If you let
> the new QEMU delete the original QEMU's sockets, then you either break or
> delay incoming connections during the migration time, and you make it
> impossible to roll back on failure, or both. This kind of thing is not
> something that's acceptable for the usage scenerio described, which would
> need to be bulletproof to be usable in production.
>

Can't we utilize namespaces for this?  Lot of the things could be
separated, so we could fire up a new VM that's "containerized" like
this, migrate to it and then fire up a new one and migrate back.  If the
first migration fails then we can still fallback.  If it does not, then
the second one "should" not either.


If you are willing to accept this kind of slow-down/overhead you can
as well as save the VM to file and restore it. This works as the upgrade
path even now.


That way you have no fallback option, though.  The above solution could
still support live upgrade.


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 1/7] cpu_conf: Introduce virCPUDefList{Parse, Free}

2017-09-18 Thread Jiri Denemark
On Fri, Sep 15, 2017 at 14:04:42 +0200, Ján Tomko wrote:
> On Thu, Sep 14, 2017 at 12:57:14PM +0200, Jiri Denemark wrote:
> >For parsing a list of CPU XMLs into a NULL-terminated list of CPU defs.
> >
> >Signed-off-by: Jiri Denemark 
> >---
> > src/conf/cpu_conf.c  | 78 
> > 
> > src/conf/cpu_conf.h  |  7 +
> > src/libvirt_private.syms |  2 ++
> > 3 files changed, 87 insertions(+)
> >
> 
> [...]
> 
> >+/*
> >+ * Parses a list of CPU XMLs into a NULL-terminated list of CPU defs.
> >+ */
> >+virCPUDefPtr *
> >+virCPUDefListParse(const char **xmlCPUs,
> >+   unsigned int ncpus,
> >+   virCPUType cpuType)
> >+{
> >+
> 
> [...]
> 
> >+if (ncpus < 1) {
> 
> Interesting way of spelling '== 0'

Yeah, caused by copy&paste engineering... see the next patch where the
same condition gets removed from cpuBaselineXML. Fixed.

Jirka

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] RFC: libvirt support for QEMU live patching

2017-09-18 Thread Peter Krempa
On Mon, Sep 18, 2017 at 09:37:14 +0200, Martin Kletzander wrote:
> On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote:
> > On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote:

[...]

> > It isn't possible to make it work correctly in the general case, because
> > both QEMU processes want to own the same files on disk. eg both might want
> > to listen on a UNIX socket /foo/bar, but only one can do this. If you let
> > the new QEMU delete the original QEMU's sockets, then you either break or
> > delay incoming connections during the migration time, and you make it
> > impossible to roll back on failure, or both. This kind of thing is not
> > something that's acceptable for the usage scenerio described, which would
> > need to be bulletproof to be usable in production.
> > 
> 
> Can't we utilize namespaces for this?  Lot of the things could be
> separated, so we could fire up a new VM that's "containerized" like
> this, migrate to it and then fire up a new one and migrate back.  If the
> first migration fails then we can still fallback.  If it does not, then
> the second one "should" not either.

If you are willing to accept this kind of slow-down/overhead you can
as well as save the VM to file and restore it. This works as the upgrade
path even now.


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] RFC: libvirt support for QEMU live patching

2017-09-18 Thread Martin Kletzander

On Fri, Sep 15, 2017 at 09:18:18AM +0100, Daniel P. Berrange wrote:

On Fri, Sep 15, 2017 at 01:27:31PM +0530, Madhu Pavan wrote:

Hi,
QEMU live patching should be just a matter of updating the QEMU RPM package
and then live migrating the VMs to another QEMU instance on the same host
(which would point to the just installed new QEMU executable).
I think it will be useful to support it from libvirt side. After some
searching I found a
RFC patch posted in Nov 2013. Here is the link to it
https://www.redhat.com/archives/libvir-list/2013-November/msg00372.html
Approach followed in above mentioned link is as follows:
1. newDef = deep copy oldVm definition
2. newVm = create VM using newDef, start QEMU process with all vCPUs paused
3. oldVm migrate to newVm using unix socket
4. shutdown oldVm
5. newPid = newVm->pid
6. finalDef = live deep copy of newVm definition
7. Drop the newVm from qemu domain table without shutting down QEMU process
8. Assign finalDef to oldVm
9. oldVm attaches to QEMU process newPid using finalDef
10.resume all vCPUs in oldVm

I can see it didn't get communities approval for having problems in handling
UUID
of the vm's. To fix the problem we need to teach libvirt to manage two qemu
processes
at once both tied to same UUID. I would like to know if there is any
interested approach
to get this done. I would like to send patches on this.

Is there any specific reason why it is not been pursued for the last 4 year?


It isn't possible to make it work correctly in the general case, because
both QEMU processes want to own the same files on disk. eg both might want
to listen on a UNIX socket /foo/bar, but only one can do this. If you let
the new QEMU delete the original QEMU's sockets, then you either break or
delay incoming connections during the migration time, and you make it
impossible to roll back on failure, or both. This kind of thing is not
something that's acceptable for the usage scenerio described, which would
need to be bulletproof to be usable in production.



Can't we utilize namespaces for this?  Lot of the things could be
separated, so we could fire up a new VM that's "containerized" like
this, migrate to it and then fire up a new one and migrate back.  If the
first migration fails then we can still fallback.  If it does not, then
the second one "should" not either.


Regards,
Daniel
--
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] virsh: Enhance documentation of --rdma-pin-all option

2017-09-18 Thread Pavel Hrdina
On Fri, Sep 08, 2017 at 09:33:13PM +0200, Jiri Denemark wrote:
> https://bugzilla.redhat.com/show_bug.cgi?id=1373783
> 
> Signed-off-by: Jiri Denemark 
> ---
>  tools/virsh.pod | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)

Reviewed-by: Pavel Hrdina 


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list