Re: [libvirt] [PATCH] libvirt: qemu: Fix domain termination caused by query-hotpluggable-cpus not enabled

2016-11-25 Thread Peter Krempa
On Fri, Nov 25, 2016 at 14:25:33 +0100, Boris Fiuczynski wrote:
> On 11/25/2016 10:07 AM, Peter Krempa wrote:
> > On Fri, Nov 25, 2016 at 10:03:38 +0100, Peter Krempa wrote:
> > > On Fri, Nov 25, 2016 at 09:19:18 +0100, Boris Fiuczynski wrote:
> > 
> > [...]
> 
> Peter,
> looking at your commit message of 920bbe5c it reads as follows:
> qemu: capabilities: Extract availability of new cpu hotplug for machine
> types
> 
> QEMU reports whether 'query-hotpluggable-cpus' is supported for a given
> machine type. Extract and cache the information using the capability
> cache.
> 
> When copying the capabilities for a new start of qemu, mask out the
> presence of QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS if the machine type
> doesn't support hotpluggable cpus.
> 
> The loaded XML has the 'query-hotpluggable-cpus' capability set since the
> qmp command exists in the list returned by the qmp command query-commands by
> the qemu binary.

In the global capabilities cache, the QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS
is set if the command is supported by qemu.

Once a VM is started we create a copy of the given capability set and
possibly filter out stuff that won't work and then store the filtered
capability set in the domain object.

> The machine type dependent masking of QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS you
> are doing for a new start of qemu seems therefore also required to be done
> for reconnecting to this running qemu domain. Are you saying that this is

No. You can't do that at reconnect time. Once you start the VM you can
completely replace the original binary and thus you don't have data that
would allow to do such operation at reconnect time.

That is the main reason why we store the capabilities in the status XML
rather than reloading it from the cache.

> wrong and the machine type dependent masking result of the
> 'query-hotpluggable-cpus' capability should be stored in the XML?

Yes, exactly. As said, the capability set should not be re-detected for
the reasons stated above.

Said the above, there's something fishy going on. I compiled a qemu
binary that specifically doesn't support cpu hotplug and started a VM.
The status XML file does not show the flag:

# grep query-hotpluggable-cpus /var/run/libvirt/qemu/upstream.xml
#

But an attempt to restart libvirtd indeed ends up in the VM being
killed:

016-11-25 14:09:12.909+: 2610599: error :
qemuMonitorJSONCheckError:387 : internal error: unable to execute QEMU
command 'query-hotpluggable-cpus': The feature 'query-hotpluggable-cpus'
is not enabled

This means that the capabilities are not properly restored.

Peter



signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] libvirt: qemu: Fix domain termination caused by query-hotpluggable-cpus not enabled

2016-11-25 Thread Boris Fiuczynski

On 11/25/2016 10:07 AM, Peter Krempa wrote:

On Fri, Nov 25, 2016 at 10:03:38 +0100, Peter Krempa wrote:

On Fri, Nov 25, 2016 at 09:19:18 +0100, Boris Fiuczynski wrote:


[...]


 src/qemu/qemu_process.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index f8f379a..675f5b5 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3349,8 +3349,7 @@ qemuProcessReconnect(void *opaque)
 /* If upgrading from old libvirtd we won't have found any
  * caps in the domain status, so re-query them


At reconnect the capabilities are taken from the status XML file, where
they are saved for every instance specifically. This code is supposed to
run


only when a very old version of libvirt did not save the capability
flags into the status XML. It's even explained in the comment above.




  */
-if (!priv->qemuCaps &&
-!(priv->qemuCaps = virQEMUCapsCacheLookupCopy(caps,
+if (!(priv->qemuCaps = virQEMUCapsCacheLookupCopy(caps,
   driver->qemuCapsCache,
   obj->def->emulator,
   obj->def->os.machine)))


NACK, this certainly is not the right fix. Does the status XML have the
'query-hotpluggable-cpus' capability set? If it's so then it was
probably mis-detected at start of the VM and should be fixed there.

If there is no such capability, then the reconnect code is broken
somehow.

At any rate we should not re-detect the capabilities if they were
reloaded properly from the XML.

Peter


Peter,
looking at your commit message of 920bbe5c it reads as follows:
qemu: capabilities: Extract availability of new cpu hotplug for 
machine types


QEMU reports whether 'query-hotpluggable-cpus' is supported for a given
machine type. Extract and cache the information using the capability
cache.

When copying the capabilities for a new start of qemu, mask out the
presence of QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS if the machine type
doesn't support hotpluggable cpus.

The loaded XML has the 'query-hotpluggable-cpus' capability set since 
the qmp command exists in the list returned by the qmp command 
query-commands by the qemu binary.
The machine type dependent masking of QEMU_CAPS_QUERY_HOTPLUGGABLE_CPUS 
you are doing for a new start of qemu seems therefore also required to 
be done for reconnecting to this running qemu domain. Are you saying 
that this is wrong and the machine type dependent masking result of the 
'query-hotpluggable-cpus' capability should be stored in the XML?



--
Mit freundlichen Grüßen/Kind regards
   Boris Fiuczynski

IBM Deutschland Research & Development GmbH
Vorsitzender des Aufsichtsrats: Martina Köderitz
Geschäftsführung: Dirk Wittkopp
Sitz der Gesellschaft: Böblingen
Registergericht: Amtsgericht Stuttgart, HRB 243294

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] libvirt: qemu: Fix domain termination caused by query-hotpluggable-cpus not enabled

2016-11-25 Thread Peter Krempa
On Fri, Nov 25, 2016 at 10:03:38 +0100, Peter Krempa wrote:
> On Fri, Nov 25, 2016 at 09:19:18 +0100, Boris Fiuczynski wrote:

[...]

> >  src/qemu/qemu_process.c | 3 +--
> >  1 file changed, 1 insertion(+), 2 deletions(-)
> > 
> > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> > index f8f379a..675f5b5 100644
> > --- a/src/qemu/qemu_process.c
> > +++ b/src/qemu/qemu_process.c
> > @@ -3349,8 +3349,7 @@ qemuProcessReconnect(void *opaque)
> >  /* If upgrading from old libvirtd we won't have found any
> >   * caps in the domain status, so re-query them
> 
> At reconnect the capabilities are taken from the status XML file, where
> they are saved for every instance specifically. This code is supposed to
> run

only when a very old version of libvirt did not save the capability
flags into the status XML. It's even explained in the comment above.

> 
> >   */
> > -if (!priv->qemuCaps &&
> > -!(priv->qemuCaps = virQEMUCapsCacheLookupCopy(caps,
> > +if (!(priv->qemuCaps = virQEMUCapsCacheLookupCopy(caps,
> >
> > driver->qemuCapsCache,
> >obj->def->emulator,
> >
> > obj->def->os.machine)))
> 
> NACK, this certainly is not the right fix. Does the status XML have the
> 'query-hotpluggable-cpus' capability set? If it's so then it was
> probably mis-detected at start of the VM and should be fixed there.
> 
> If there is no such capability, then the reconnect code is broken
> somehow.
> 
> At any rate we should not re-detect the capabilities if they were
> reloaded properly from the XML.
> 
> Peter



> --
> libvir-list mailing list
> libvir-list@redhat.com
> https://www.redhat.com/mailman/listinfo/libvir-list



signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] libvirt: qemu: Fix domain termination caused by query-hotpluggable-cpus not enabled

2016-11-25 Thread Peter Krempa
On Fri, Nov 25, 2016 at 09:19:18 +0100, Boris Fiuczynski wrote:
> (Re-)Starting libvirt on a system with running qemu domains which earlier
> had been successfully started by libvirt results in the error
> 
> internal error: unable to execute QEMU command 'query-hotpluggable-cpus':
>   The feature 'query-hotpluggable-cpus' is not enabled
> 
> if the qemu binary does not support the qmp command 'query-hotpluggable-cpus'.
> 
> As libvirt tries to reconnect to the running qemu domains it reads in the
> capabilities but in qemuProcessReconnect misses to run
> virQEMUCapsCacheLookupCopy and not clearing the query-hotpluggable-cpus
> capability in virQEMUCapsFilterByMachineType which was introduced with
> commit 920bbe5c.
> Libvirt therefore issues the qmp command and qemu responds with the error
> 'The feature 'query-hotpluggable-cpus' is not enabled'.
> As a consequence libvirt terminates the running qemu process since it
> determines that it cannot reconnect to the domain.
> 
> Signed-off-by: Boris Fiuczynski 
> Reviewed-by: Marc Hartmayer 
> ---
> Due to the severity of this issue I recommend to backport this fix
> into all maintenance releases up to v2.2.0.
> 
>  src/qemu/qemu_process.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index f8f379a..675f5b5 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -3349,8 +3349,7 @@ qemuProcessReconnect(void *opaque)
>  /* If upgrading from old libvirtd we won't have found any
>   * caps in the domain status, so re-query them

At reconnect the capabilities are taken from the status XML file, where
they are saved for every instance specifically. This code is supposed to
run

>   */
> -if (!priv->qemuCaps &&
> -!(priv->qemuCaps = virQEMUCapsCacheLookupCopy(caps,
> +if (!(priv->qemuCaps = virQEMUCapsCacheLookupCopy(caps,
>driver->qemuCapsCache,
>obj->def->emulator,
>obj->def->os.machine)))

NACK, this certainly is not the right fix. Does the status XML have the
'query-hotpluggable-cpus' capability set? If it's so then it was
probably mis-detected at start of the VM and should be fixed there.

If there is no such capability, then the reconnect code is broken
somehow.

At any rate we should not re-detect the capabilities if they were
reloaded properly from the XML.

Peter


signature.asc
Description: PGP signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list