[libvirt] [PATCH] virsh: Enhance documentation of --rdma-pin-all option

2017-09-08 Thread Jiri Denemark
https://bugzilla.redhat.com/show_bug.cgi?id=1373783

Signed-off-by: Jiri Denemark 
---
 tools/virsh.pod | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/virsh.pod b/tools/virsh.pod
index 01453be600..a03f64a262 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -1767,7 +1767,13 @@ periodically increased by I.
 
 I<--rdma-pin-all> can be used with RDMA migration (i.e., when I
 starts with rdma://) to tell the hypervisor to pin all domain's memory at once
-before migration starts rather than letting it pin memory pages as needed.
+before migration starts rather than letting it pin memory pages as needed. For
+QEMU/KVM this requires hard_limit memory tuning element (in the domain XML) to
+be used and set to the maximum memory configured for the domain plus any memory
+consumed by the QEMU process itself. Beware of setting the memory limit too
+high (and thus allowing the domain to lock most of the host's memory). Doing so
+may be dangerous to both the domain and the host itself since the host's kernel
+may run out of memory.
 
 B: Individual hypervisors usually do not support all possible types of
 migration. For example, QEMU does not support direct migration.
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] qemu: Don't report failure to destroy a destroyed domain

2017-09-08 Thread Jiri Denemark
When destroying a domain libvirt marks it internally with a
beingDestroyed flag to make sure the qemuDomainDestroyFlags API itself
cleans up after the domain rather than letting an uninformed EOF handler
do it. However, when the domain is being started at the moment libvirt
was asked to destroy it, only the starting thread can properly clean up
after the domain and thus it ignores the beingDestroyed flag. Once
qemuDomainDestroyFlags finally gets a job, the domain may not be running
anymore, which should not be reported as an error if the domain has been
starting up.

https://bugzilla.redhat.com/show_bug.cgi?id=1445600

Signed-off-by: Jiri Denemark 
---
 src/qemu/qemu_driver.c | 23 +--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6255d89310..a25daae866 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2224,6 +2224,9 @@ qemuDomainDestroyFlags(virDomainPtr dom,
 virObjectEventPtr event = NULL;
 qemuDomainObjPrivatePtr priv;
 unsigned int stopFlags = 0;
+int state;
+int reason;
+bool starting;
 
 virCheckFlags(VIR_DOMAIN_DESTROY_GRACEFUL, -1);
 
@@ -2235,13 +2238,29 @@ qemuDomainDestroyFlags(virDomainPtr dom,
 if (virDomainDestroyFlagsEnsureACL(dom->conn, vm->def) < 0)
 goto cleanup;
 
+if (!virDomainObjIsActive(vm)) {
+virReportError(VIR_ERR_OPERATION_INVALID,
+   "%s", _("domain is not running"));
+goto cleanup;
+}
+
+state = virDomainObjGetState(vm, );
+starting = (state == VIR_DOMAIN_PAUSED &&
+reason == VIR_DOMAIN_PAUSED_STARTING_UP &&
+!priv->beingDestroyed);
+
 if (qemuProcessBeginStopJob(driver, vm, QEMU_JOB_DESTROY,
 !(flags & VIR_DOMAIN_DESTROY_GRACEFUL)) < 0)
 goto cleanup;
 
 if (!virDomainObjIsActive(vm)) {
-virReportError(VIR_ERR_OPERATION_INVALID,
-   "%s", _("domain is not running"));
+if (starting) {
+VIR_DEBUG("Domain %s is not running anymore", vm->def->name);
+ret = 0;
+} else {
+virReportError(VIR_ERR_OPERATION_INVALID,
+   "%s", _("domain is not running"));
+}
 goto endjob;
 }
 
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] virsh: migrate --timeout-postcopy requires --postcopy

2017-09-08 Thread Jiri Denemark
Requesting an automated switch to a post-copy migration (using
--timeout-postcopy) without actually enabling post-copy migration (using
--postcopy) doesn't really do anything. Let's make this dependency
explicit to avoid unexpected behavior.

https://bugzilla.redhat.com/show_bug.cgi?id=1455023

Signed-off-by: Jiri Denemark 
---
 tools/virsh-domain.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index f235c66b07..a3f3b7c7bd 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -10768,6 +10768,7 @@ cmdMigrate(vshControl *ctl, const vshCmd *cmd)
 VSH_EXCLUSIVE_OPTIONS("live", "offline");
 VSH_EXCLUSIVE_OPTIONS("timeout-suspend", "timeout-postcopy");
 VSH_REQUIRE_OPTION("postcopy-after-precopy", "postcopy");
+VSH_REQUIRE_OPTION("timeout-postcopy", "postcopy");
 VSH_REQUIRE_OPTION("persistent-xml", "persistent");
 
 if (!(dom = virshCommandOptDomain(ctl, cmd, NULL)))
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-08 Thread Brijesh Singh



On 09/08/2017 10:51 AM, Daniel P. Berrange wrote:

On Fri, Sep 08, 2017 at 10:48:10AM -0500, Brijesh Singh wrote:

So I could see a flow like the following:



The flow looks good




1. mgmt tool calls  virConnectGetCapabilities. This returns an XML
   document that includes the following


   ...other bits...
  
  ...hex encoded PDH key...



2. mgmt tool requests to start a guest calling virCreateXML(),
   passing VIR_DOMAIN_START_PAUSED. The XML would include


  ...hex encode DH key...
..hex encode info...
...int32 value..



   if  is provided and VIR_DOMAIN_START_PAUSED is missing,
   libvirt would report an error and refuse to start the guest




One thing which is not clear to me is, how do we know that we are asked
to launch SEV guest? Are you thinking that  tag in the XML will
hint libvirt that GO has asked to launch a SEV guest?


Yes, the existance of the  tag is the indicator that informs
libvirt that SEV *must* be used for the guest.



Thanks for confirming.





3. Libvirt generates the QEMU cli arg to enable SEV using
   the XML data and starts QEMU, leaving CPUs paused




I am looking at [1] to get the feel for how do we model it in the XML.
As you can see I am using ad-hoc  to create the sev-guest
object. Currently, sev-guest object accepts the following properties:

dh-cert-file: 
session-info-file: 
policy: 

I believe the new XML model will influence the property input type,
Any recommendation on how do model this part ? thank you so much.


That looks ok to me - even if QEMU wants the data provided in
files on disk, libvirt can just create the files on the fly
from the data it has in the  element in the XML file.
Since they're only needed during startup, libvirt can then
easily delete the files the moment QEMU has completed its
startup.



Perfect! works well with me.


--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-08 Thread Laszlo Ersek
On 09/08/17 17:51, Daniel P. Berrange wrote:
> On Fri, Sep 08, 2017 at 10:48:10AM -0500, Brijesh Singh wrote:

>> I am looking at [1] to get the feel for how do we model it in the XML.
>> As you can see I am using ad-hoc  to create the sev-guest
>> object. Currently, sev-guest object accepts the following properties:
>>
>> dh-cert-file: 
>> session-info-file: 
>> policy: 
>>
>> I believe the new XML model will influence the property input type,
>> Any recommendation on how do model this part ? thank you so much.
> 
> That looks ok to me - even if QEMU wants the data provided in
> files on disk, libvirt can just create the files on the fly
> from the data it has in the  element in the XML file.
> Since they're only needed during startup, libvirt can then
> easily delete the files the moment QEMU has completed its
> startup.

/dev/fd/N filenames could be used for poor man's fd passing, I think.

(/dev/fd is a symlink to the /proc/self/fd directory)

proc(5) has documentation on this.

Thanks,
Laszlo

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-08 Thread Brijesh Singh

Hi Daniel,


On 09/08/2017 09:52 AM, Daniel P. Berrange wrote:

On Fri, Sep 08, 2017 at 01:45:06PM +, Relph, Richard wrote:

A few answers in line…

On 9/8/17, 8:16 AM, "Daniel P. Berrange"  wrote:

 On Fri, Sep 08, 2017 at 06:57:30AM -0500, Brijesh Singh wrote:
 > Hi All,
 >
 > (sorry for the long message)
 >
 > CPUs from AMD EPYC family supports Secure Encrypted Virtualization (SEV)
 > feature - the feature allows running encrypted VMs. To enable the 
feature,
 > I have been submitting patches to Linux kernel [1], Qemu [2] and OVMF 
[3].
 > We have been making some good progress in getting patches accepted 
upstream
 > in Linux and OVMF trees. SEV builds upon SME (Secure Memory Encryption)
 > feature -- SME support just got pulled into 4.14 merge window. The base
 > SEV patches are accepted in OVMF tree -- now we have SEV aware guest 
BIOS.
 > I am getting ready to take off "RFC" tag from remaining patches to get 
them
 > reviewed and accepted.
 >
 > The boot flow for launching an SEV guest is a bit different from a 
typical
 > guest launch. In order to launch SEV guest from virt-manager or other
 > high-level VM management tools, we need to design and implement new
 > interface between libvirt and qemu, and probably add new APIs in libvirt
 > to be used by VM management tools. I am new to the libvirt and need some
 > expert advice while designing this interface. A pictorial representation
 > for a SEV guest launch flow is available in SEV Spec Appendix A [4].
 >
 > A typical flow looks like this:
 >
 > 1. Guest owner (GO) asks the cloud provider to launch SEV guest.
 > 2. VM tool asks libvirt to provide its Platform Diffie-Hellman (PDH) key.
 > 3. libvirt opens /dev/sev device to get its PDH and return the blob to 
the
 >caller.
 
 What sort of size are we talking about for the PDH ?


The PDH blob is described in reference 4. It’s 0x824 bytes long… a bit over 2K 
bytes.
PDH is “Platform Diffie-Hellman” key, public portion.
 
 There's a few ways libvirt could report it
 
  1. As an XML element in the host capabilities XML

  2. As an XML element in the emulator capabilities XML
  3. Via a newly added host API
 
 > 4. VM tool gives its PDH to GO.

 > 5. GO provides its DH key, session-info and guest policy.
 
 Are steps 4 & 5 strictly required to be in this order, or is it

 possible

Steps 4 and 5 must occur in that order. The data sent by the GO in the
“session info” is encrypted and integrity protected with keys that the
GO derives from the GO private Diffie-Hellman key and the PDH public
Diffie-Hellman key.
 
 What are the security requirements around the DH key, session info

 and guest policy ? Are any of them sensitive data which needs to be
 kept private from untrustworthy users. Also are all three of these
 items different for every guest launched, or are some of them
 likely to be the same for every guest ?

The PDH is not sensitive. It is relatively static for the platform.
(It only changes when the platform owner chooses to change the apparent
identity of the platform that will actually run the virtual machine.)
The 128-byte session info data is encrypted and integrity protected by
the GO. It need not be additionally encrypted or integrity protected.
It should vary for EVERY guest launch.
The 4-byte guest policy must be sent in the clear as some components
may want to observe the policy bits. It may change from guest to guest,
but there will likely only be a few common values.


Given this, and the pretty small data sizes, I think this info could
in fact all be provided inline in the XML config - either hex or base64
encoded for the binary blobs.




 > 8. libvirt launches the guest with "-S"
 
 All libvirt guests get launched with -S, to give libvirt chance to do some

 setup before starting vCPUs. Normally vCPUs are started by default, but
 the VIR_DOMAIN_START_PAUSED flag allows the mgmt app to tell libvirt to
 leave vCPUS paused.
 
 Alternatively, if libvirt sees presencese of 'sev' config for the guest,

 it could automatically leave it in PAUSED state regardless of the
 VIR_DOMAIN_START_PAUSED flag.

While using the LAUNCH_MEASURE and LAUNCH_SECRET operations is expected
to be the common case, they are optional. I would recommend requiring
the upstream software to explicitly set the VIR_DOMAIN_START_PAUSED flag,
if only to minimize the dependencies…


Ok.


 > 9. While creating the SEV guest qemu does the following
 >  i) create encryption context using GO's DH, session-info and guest 
policy
 > (LAUNCH_START)
 >  ii) encrypts the guest bios (LAUNCH_UPDATE_DATA)
 >  iii) calls LAUNCH_MEASUREMENT to get the encrypted bios measurement
 > 10. By some interface we must propagate the measurement all 

Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-08 Thread Daniel P. Berrange
On Fri, Sep 08, 2017 at 10:48:10AM -0500, Brijesh Singh wrote:
> > So I could see a flow like the following:
> 
> 
> The flow looks good
> 
> > 
> > 
> >1. mgmt tool calls  virConnectGetCapabilities. This returns an XML
> >   document that includes the following
> > 
> >
> >   ...other bits...
> >  
> >   ...hex encoded PDH key...
> > 
> >
> > 
> >2. mgmt tool requests to start a guest calling virCreateXML(),
> >   passing VIR_DOMAIN_START_PAUSED. The XML would include
> > 
> >
> >  ...hex encode DH key...
> > ..hex encode info...
> > ...int32 value..
> >
> > 
> > 
> >   if  is provided and VIR_DOMAIN_START_PAUSED is missing,
> >   libvirt would report an error and refuse to start the guest
> > 
> 
> 
> One thing which is not clear to me is, how do we know that we are asked
> to launch SEV guest? Are you thinking that  tag in the XML will
> hint libvirt that GO has asked to launch a SEV guest?

Yes, the existance of the  tag is the indicator that informs
libvirt that SEV *must* be used for the guest.

> >3. Libvirt generates the QEMU cli arg to enable SEV using
> >   the XML data and starts QEMU, leaving CPUs paused
> > 
> 
> 
> I am looking at [1] to get the feel for how do we model it in the XML.
> As you can see I am using ad-hoc  to create the sev-guest
> object. Currently, sev-guest object accepts the following properties:
> 
> dh-cert-file: 
> session-info-file: 
> policy: 
> 
> I believe the new XML model will influence the property input type,
> Any recommendation on how do model this part ? thank you so much.

That looks ok to me - even if QEMU wants the data provided in
files on disk, libvirt can just create the files on the fly
from the data it has in the  element in the XML file.
Since they're only needed during startup, libvirt can then
easily delete the files the moment QEMU has completed its
startup.

> 
> [1] https://libvirt.org/formatdomain.html#elementsCPU
> 
> 
> >4. QEMU emits a SEV_MEASURE event containing the measurement
> >   blob
> > 
> >5. Libvirt catches the QEMU event and emits its own
> >   VIR_CONNECT_DOMAIN_EVENT_SEV_MEASURE event containing
> >   the measurement blob
> > 
> >6. GO does its validation of the measurement
> > 
> >7a  If validation failed, then virDomainDestroy() to stop QEMU
> > 
> >7b  If validation succeeed
> > 
> >   Optionally call
> > 
> >   virDomainSetSEVSecret()
> > 
> >   providing the optional secret, then
> > 
> >   virDomainResume()
> > 
> >   to let QEMU continue
> > 
> > 
> > 
> > 
> > Regards,
> > Daniel
> > 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 3/5] xenconfig: add domxml conversions for xen-xl

2017-09-08 Thread Wim Ten Have
From: Wim ten Have 

This patch converts NUMA configurations between the Xen libxl
configuration file format and libvirt's XML format.

XML HVM domain configuration:

  

  

  
  
  
  

  
  

  
  
  
  

  
  

  
  
  
  

  
  

  
  
  
  

  

  

Xen xl.cfg domain configuration:

  vnuma = [["pnode=0","size=2048","vcpus=0-1","vdistances=10,21,31,41"],
   ["pnode=1","size=2048","vcpus=2-3","vdistances=21,10,21,31"],
   ["pnode=2","size=2048","vcpus=4-5","vdistances=31,21,10,21"],
   ["pnode=3","size=2048","vcpus=6-7","vdistances=41,31,21,10"]]

If there is no XML  description amongst the  data the
conversion schema from xml to native will generate 10 for local and 20
for all remote instances.

Signed-off-by: Wim ten Have 
---
Changes on v2:
- Reduce the indentation level under xenParseXLVnuma().
Changes on v3:
- Add the Numa core split functions required to interface.
---
 src/conf/numa_conf.c | 138 
 src/conf/numa_conf.h |  20 +++
 src/libvirt_private.syms |   5 +
 src/xenconfig/xen_xl.c   | 333 +++
 4 files changed, 496 insertions(+)

diff --git a/src/conf/numa_conf.c b/src/conf/numa_conf.c
index 5db4311..d30b8cc 100644
--- a/src/conf/numa_conf.c
+++ b/src/conf/numa_conf.c
@@ -1123,6 +1123,133 @@ virDomainNumaGetNodeCount(virDomainNumaPtr numa)
 }
 
 
+size_t
+virDomainNumaSetNodeCount(virDomainNumaPtr numa, size_t nmem_nodes)
+{
+if (!nmem_nodes) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("Cannot set an empty mem_nodes set"));
+return 0;
+}
+
+if (numa->mem_nodes) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("Cannot alter an existing mem_nodes set"));
+return 0;
+}
+
+if (VIR_ALLOC_N(numa->mem_nodes, nmem_nodes) < 0)
+return 0;
+
+numa->nmem_nodes = nmem_nodes;
+
+return numa->nmem_nodes;
+}
+
+size_t
+virDomainNumaGetNodeDistance(virDomainNumaPtr numa,
+ size_t node,
+ size_t cellid)
+{
+virDomainNumaDistancePtr distances = NULL;
+
+if (node < numa->nmem_nodes)
+distances = numa->mem_nodes[node].distances;
+
+/*
+ * Present the configured distance value. If
+ * out of range or not available set the platform
+ * defined default for local and remote nodes.
+ */
+if (!distances ||
+!distances[cellid].value ||
+!numa->mem_nodes[node].ndistances)
+return (node == cellid) ? \
+  LOCAL_DISTANCE : REMOTE_DISTANCE;
+
+return distances[cellid].value;
+}
+
+
+int
+virDomainNumaSetNodeDistance(virDomainNumaPtr numa,
+ size_t node,
+ size_t cellid,
+ unsigned int value)
+{
+virDomainNumaDistancePtr distances;
+
+if (node >= numa->nmem_nodes) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("Argument 'node' %zu outranges "
+ "defined number of NUMA nodes"),
+   node);
+return -1;
+}
+
+distances = numa->mem_nodes[node].distances;
+if (!distances ||
+cellid >= numa->mem_nodes[node].ndistances) {
+virReportError(VIR_ERR_XML_ERROR, "%s",
+   _("Arguments under memnode element do not "
+ "correspond with existing guest's NUMA cell"));
+return -1;
+}
+
+/*
+ * Advanced Configuration and Power Interface
+ * Specification version 6.1. Chapter 5.2.17
+ * System Locality Distance Information Table
+ * ... Distance values of 0-9 are reserved.
+ */
+if (value < LOCAL_DISTANCE ||
+value > UNREACHABLE) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("Distance value of %d is in 0-9 reserved range"),
+   value);
+return -1;
+}
+
+if (value == LOCAL_DISTANCE && node != cellid) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("Distance value %d under node %zu seems "
+ "LOCAL_DISTANCE and should be set to 10"),
+   value, node);
+return -1;
+}
+
+distances[cellid].cellid = cellid;
+distances[cellid].value = value;
+
+return distances[cellid].value;
+}
+
+
+size_t
+virDomainNumaSetNodeDistanceCount(virDomainNumaPtr numa,
+  size_t node,
+  size_t ndistances)
+{
+virDomainNumaDistancePtr distances;
+
+distances = numa->mem_nodes[node].distances;
+if (distances) {
+ 

[libvirt] [PATCH v4 0/5] numa: describe sibling nodes distances

2017-09-08 Thread Wim Ten Have
From: Wim ten Have 

This patch extends guest domain administration adding support to advertise
node sibling distances when configuring HVM numa guests.

NUMA (non-uniform memory access), a method of configuring a cluster of nodes
within a single multiprocessing system such that it shares processor
local memory amongst others improving performance and the ability of the
system to be expanded.

A NUMA system could be illustrated as shown below. Within this 4-node
system, every socket is equipped with its own distinct memory. The whole
typically resembles a SMP (symmetric multiprocessing) system being a
"tightly-coupled," "share everything" system in which multiple processors
are working under a single operating system and can access each others'
memory over multiple "Bus Interconnect" paths.

+-+-+-+ +-+-+-+
|  M  | CPU | CPU | | CPU | CPU |  M  |
|  E  | | | | | |  E  |
|  M  +- Socket0 -+ +- Socket3 -+  M  |
|  O  | | | | | |  O  |
|  R  | CPU | CPU <-> CPU | CPU |  R  |
|  Y  | | | | | |  Y  |
+-+--^--+-+ +-+--^--+-+
 |   |
 |  Bus Interconnect |
 |   |
+-+--v--+-+ +-+--v--+-+
|  M  | | | | | |  M  |
|  E  | CPU | CPU <-> CPU | CPU |  E  |
|  M  | | | | | |  M  |
|  O  +- Socket1 -+ +- Socket2 -+  O  |
|  R  | | | | | |  R  |
|  Y  | CPU | CPU | | CPU | CPU |  Y  |
+-+-+-+ +-+-+-+

In contrast there is the limitation of a flat SMP system (not illustrated)
under which the bus (data and address path) can easily become a performance
bottleneck under high activity as sockets are added.
NUMA adds an intermediate level of memory shared amongst a few cores per
socket as illustrated above, so that data accesses do not have to travel
over a single bus.

Unfortunately the way NUMA does this adds its own limitations. This,
as visualized in the illustration above, happens when data is stored in
memory associated with Socket2 and is accessed by a CPU (core) in Socket0.
The processors use the "Bus Interconnect" to create gateways between the
sockets (nodes) enabling inter-socket access to memory. These "Bus
Interconnect" hops add data access delays when a CPU (core) accesses
memory associated with a remote socket (node).

For terminology we refer to sockets as "nodes" where access to each
others' distinct resources such as memory make them "siblings" with a
designated "distance" between them.  A specific design is described under
the ACPI (Advanced Configuration and Power Interface Specification)
within the chapter explaining the system's SLIT (System Locality Distance
Information Table).

These patches extend core libvirt's XML description of a virtual machine's
hardware to include NUMA distance information for sibling nodes, which
is then passed to Xen guests via libxl. Recently qemu landed support for
constructing the SLIT since commit 0f203430dd ("numa: Allow setting NUMA
distance for different NUMA nodes"), hence these core libvirt extensions
can also help other drivers in supporting this feature.

The XML changes made allow to describe the  node/sockets 
amongst  node identifiers and propagate these towards the numa
domain functionality finally adding support to libxl.

[below is an example illustrating a 4 node/socket  setup]


  

  




  


  




  


  




  


  




  

  


By default on libxl, if no  are given to describe the distance data
between different s, this patch will default to a scheme using 10
for local and 20 for any remote node/socket, which is the assumption of
guest OS when no SLIT is specified. While SLIT is optional, libxl requires
that distances are set nonetheless.

On Linux systems the SLIT detail can be listed with help of the 'numactl -H'
command. The above HVM guest would show the following output.

[root@f25 ~]# numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 4 5 6 7
node 0 size: 1988 MB
node 0 free: 1743 MB
node 1 cpus: 1 8 9 10 12 13 14 15
node 1 size: 1946 MB
node 1 free: 1885 MB
node 2 cpus: 2 11
node 2 size: 2011 MB
node 2 free: 1912 MB
node 3 cpus: 3
node 3 size: 2010 MB
node 3 free: 1980 MB
node distances:
node   0   1   2   

[libvirt] [PATCH v4 1/5] numa: rename function virDomainNumaDefCPUFormat

2017-09-08 Thread Wim Ten Have
From: Wim ten Have 

This patch renames virDomainNumaDefCPUFormat(), by adding XML, into
virDomainNumaDefCPUFormatXML().  So that it meets its peer Parse
sibling virDomainNumaDefCPUParseXML() and matches vir*XML() function
naming conventions.

Signed-off-by: Wim ten Have 
---
Changes on v3:
- Cleanup by putting this change under a separate patch in series.
---
 src/conf/cpu_conf.c  | 2 +-
 src/conf/numa_conf.c | 4 ++--
 src/conf/numa_conf.h | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/src/conf/cpu_conf.c b/src/conf/cpu_conf.c
index c21d11d..8d804a1 100644
--- a/src/conf/cpu_conf.c
+++ b/src/conf/cpu_conf.c
@@ -642,7 +642,7 @@ virCPUDefFormatBufFull(virBufferPtr buf,
 if (virCPUDefFormatBuf(, def, updateCPU) < 0)
 goto cleanup;
 
-if (virDomainNumaDefCPUFormat(, numa) < 0)
+if (virDomainNumaDefCPUFormatXML(, numa) < 0)
 goto cleanup;
 
 if (virBufferCheckError() < 0 ||
diff --git a/src/conf/numa_conf.c b/src/conf/numa_conf.c
index bfd3703..b71dc01 100644
--- a/src/conf/numa_conf.c
+++ b/src/conf/numa_conf.c
@@ -801,8 +801,8 @@ virDomainNumaDefCPUParseXML(virDomainNumaPtr def,
 
 
 int
-virDomainNumaDefCPUFormat(virBufferPtr buf,
-  virDomainNumaPtr def)
+virDomainNumaDefCPUFormatXML(virBufferPtr buf,
+ virDomainNumaPtr def)
 {
 virDomainMemoryAccess memAccess;
 char *cpustr;
diff --git a/src/conf/numa_conf.h b/src/conf/numa_conf.h
index b6a5354..378b772 100644
--- a/src/conf/numa_conf.h
+++ b/src/conf/numa_conf.h
@@ -151,7 +151,7 @@ bool virDomainNumatuneNodeSpecified(virDomainNumaPtr 
numatune,
 int cellid);
 
 int virDomainNumaDefCPUParseXML(virDomainNumaPtr def, xmlXPathContextPtr ctxt);
-int virDomainNumaDefCPUFormat(virBufferPtr buf, virDomainNumaPtr def);
+int virDomainNumaDefCPUFormatXML(virBufferPtr buf, virDomainNumaPtr def);
 
 unsigned int virDomainNumaGetCPUCountTotal(virDomainNumaPtr numa);
 
-- 
2.9.5

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 2/5] numa: describe siblings distances within cells

2017-09-08 Thread Wim Ten Have
From: Wim ten Have 

Add libvirtd NUMA cell domain administration functionality to
describe underlying cell id sibling distances in full fashion
when configuring HVM guests.

Schema updates are made to docs/schemas/cputypes.rng enforcing domain
administration to follow the syntax below the numa cell id and
docs/schemas/basictypes.rng to add "numaDistanceValue".

A minimum value of 10 representing the LOCAL_DISTANCE as 0-9 are
reserved values and can not be used as System Locality Distance Information.
A value of 20 represents the default setting of REMOTE_DISTANCE
where a maximum value of 255 represents UNREACHABLE.

Effectively any cell sibling can be assigned a distance value where
practically 'LOCAL_DISTANCE <= value <= UNREACHABLE'.

[below is an example of a 4 node setup]

  

  

  
  
  
  

  
  

  
  
  
  

  
  

  
  
  
  

  

  
  
  
  

  

  

Whenever a sibling id the cell LOCAL_DISTANCE does apply and for any
sibling id not being covered a default of REMOTE_DISTANCE is used
for internal computations.

Signed-off-by: Wim ten Have 
---
Changes on v1:
- Add changes to docs/formatdomain.html.in describing schema update.
Changes on v2:
- Automatically apply distance symmetry maintaining cell <-> sibling.
- Check for maximum '255' on numaDistanceValue.
- Automatically complete empty distance ranges.
- Check that sibling_id's are in range with cell identifiers.
- Allow non-contiguous ranges, starting from any node id.
- Respect parameters as ATTRIBUTE_NONNULL fix functions and callers.
- Add and apply topology for LOCAL_DISTANCE=10 and REMOTE_DISTANCE=20.
Changes on v3
- Add UNREACHABLE if one locality is unreachable from another.
- Add code cleanup aligning function naming in a separated patch.
- Add numa related driver code in a separated patch.
- Remove  from numaDistanceValue schema/basictypes.rng
- Correct doc changes.
---
 docs/formatdomain.html.in   |  63 +-
 docs/schemas/basictypes.rng |   7 ++
 docs/schemas/cputypes.rng   |  18 
 src/conf/numa_conf.c| 200 +++-
 4 files changed, 284 insertions(+), 4 deletions(-)

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8ca7637..c453d44 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -1529,7 +1529,68 @@
 
 
 
-  This guest NUMA specification is currently available only for QEMU/KVM.
+  This guest NUMA specification is currently available only for
+  QEMU/KVM and Xen.  Whereas Xen driver also allows for a distinct
+  description of NUMA arranged sibling cell
+  distances Since 3.6.0.
+
+
+
+  Under NUMA h/w architecture, distinct resources such as memory
+  create a designated distance between cell and
+  siblings that now can be described with the help of
+  distances. A detailed description can be found within
+  the ACPI (Advanced Configuration and Power Interface Specification)
+  within the chapter explaining the system's SLIT (System Locality
+  Distance Information Table).
+
+
+
+...
+cpu
+  ...
+  numa
+cell id='0' cpus='0,4-7' memory='512000' unit='KiB'
+  distances
+sibling id='0' value='10'/
+sibling id='1' value='21'/
+sibling id='2' value='31'/
+sibling id='3' value='41'/
+  /distances
+/cell
+cell id='1' cpus='1,8-10,12-15' memory='512000' unit='KiB' 
memAccess='shared'
+  distances
+sibling id='0' value='21'/
+sibling id='1' value='10'/
+sibling id='2' value='21'/
+sibling id='3' value='31'/
+  /distances
+/cell
+cell id='2' cpus='2,11' memory='512000' unit='KiB' 
memAccess='shared'
+  distances
+sibling id='0' value='31'/
+sibling id='1' value='21'/
+sibling id='2' value='10'/
+sibling id='3' value='21'/
+  /distances
+/cell
+cell id='3' cpus='3' memory='512000' unit='KiB'
+  distances
+sibling id='0' value='41'/
+sibling id='1' value='31'/
+sibling id='2' value='21'/
+sibling id='3' value='10'/
+  /distances
+/cell
+  /numa
+  ...
+/cpu
+...
+
+
+  Under Xen driver, if no distances are given to describe
+  the SLIT data between different cells, it will default to a scheme
+  using 10 for local and 20 for remote distances.
 
 
 Events configuration
diff --git a/docs/schemas/basictypes.rng b/docs/schemas/basictypes.rng
index 1ea667c..1a18cd3 100644
--- a/docs/schemas/basictypes.rng
+++ b/docs/schemas/basictypes.rng
@@ -77,6 +77,13 @@
 
   
 
+  
+
+  10
+  255
+
+  
+
   
 
   
diff --git a/docs/schemas/cputypes.rng 

[libvirt] [PATCH v4 4/5] libxl: vnuma support

2017-09-08 Thread Wim Ten Have
From: Wim ten Have 

This patch generates a NUMA distance-aware libxl description from the
information extracted from a NUMA distance-aware libvirt XML file.

By default, if no NUMA node distance information is supplied in the
libvirt XML file, this patch uses the distances 10 for local and 20 for
remote nodes/sockets."

Signed-off-by: Wim ten Have 
---
Changes on v1:
- Fix function calling (coding) standards.
- Use virReportError(...) under all failing circumstances.
- Reduce redundant (format->parse) code sorting bitmaps.
- Avoid non GNU shorthand notations, difficult code practice.
Changes on v2:
- Have autonomous defaults applied from virDomainNumaGetNodeDistance.
- Automatically make Xen driver simulate vnuma pnodes on non NUMA h/w.
- Compute 'memory unit=' making it the sum of all node memory.
---
 src/libxl/libxl_conf.c   | 120 +++
 src/libxl/libxl_driver.c |   3 +-
 2 files changed, 122 insertions(+), 1 deletion(-)

diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c
index 4416a09..5fb3561 100644
--- a/src/libxl/libxl_conf.c
+++ b/src/libxl/libxl_conf.c
@@ -618,6 +618,121 @@ libxlMakeDomBuildInfo(virDomainDefPtr def,
 return 0;
 }
 
+#ifdef LIBXL_HAVE_VNUMA
+static int
+libxlMakeVnumaList(virDomainDefPtr def,
+   libxl_ctx *ctx,
+   libxl_domain_config *d_config)
+{
+int ret = -1;
+size_t i, j;
+size_t nr_nodes;
+size_t num_vnuma;
+bool simulate = false;
+virBitmapPtr bitmap = NULL;
+virDomainNumaPtr numa = def->numa;
+libxl_domain_build_info *b_info = _config->b_info;
+libxl_physinfo physinfo;
+libxl_vnode_info *vnuma_nodes = NULL;
+
+if (!numa)
+return 0;
+
+num_vnuma = virDomainNumaGetNodeCount(numa);
+if (!num_vnuma)
+return 0;
+
+libxl_physinfo_init();
+if (libxl_get_physinfo(ctx, ) < 0) {
+libxl_physinfo_dispose();
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+   _("libxl_get_physinfo_info failed"));
+return -1;
+}
+nr_nodes = physinfo.nr_nodes;
+libxl_physinfo_dispose();
+
+if (num_vnuma > nr_nodes) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+   _("Number of configured numa cells %zu exceeds the 
physical available nodes %zu, guest simulates numa"),
+   num_vnuma, nr_nodes);
+simulate = true;
+}
+
+/*
+ * allocate the vnuma_nodes for assignment under b_info.
+ */
+if (VIR_ALLOC_N(vnuma_nodes, num_vnuma) < 0)
+return -1;
+
+/*
+ * parse the vnuma vnodes data.
+ */
+for (i = 0; i < num_vnuma; i++) {
+int cpu;
+libxl_bitmap vcpu_bitmap;
+libxl_vnode_info *p = _nodes[i];
+
+libxl_vnode_info_init(p);
+
+/* pnode */
+p->pnode = simulate ? 0 : i;
+
+/* memory size */
+p->memkb = virDomainNumaGetNodeMemorySize(numa, i);
+
+/* vcpus */
+bitmap = virDomainNumaGetNodeCpumask(numa, i);
+if (bitmap == NULL) {
+virReportError(VIR_ERR_INTERNAL_ERROR,
+   _("vnuma sibling %zu missing vcpus set"), i);
+goto cleanup;
+}
+
+if ((cpu = virBitmapNextSetBit(bitmap, -1)) < 0)
+goto cleanup;
+
+libxl_bitmap_init(_bitmap);
+if (libxl_cpu_bitmap_alloc(ctx, _bitmap, b_info->max_vcpus)) {
+virReportOOMError();
+goto cleanup;
+}
+
+do {
+libxl_bitmap_set(_bitmap, cpu);
+} while ((cpu = virBitmapNextSetBit(bitmap, cpu)) >= 0);
+
+libxl_bitmap_copy_alloc(ctx, >vcpus, _bitmap);
+libxl_bitmap_dispose(_bitmap);
+
+/* vdistances */
+if (VIR_ALLOC_N(p->distances, num_vnuma) < 0)
+goto cleanup;
+p->num_distances = num_vnuma;
+
+for (j = 0; j < num_vnuma; j++)
+p->distances[j] = virDomainNumaGetNodeDistance(numa, i, j);
+}
+
+b_info->vnuma_nodes = vnuma_nodes;
+b_info->num_vnuma_nodes = num_vnuma;
+
+ret = 0;
+
+ cleanup:
+if (ret) {
+for (i = 0; i < num_vnuma; i++) {
+libxl_vnode_info *p = _nodes[i];
+
+VIR_FREE(p->distances);
+}
+VIR_FREE(vnuma_nodes);
+}
+
+return ret;
+}
+#endif
+
 static int
 libxlDiskSetDiscard(libxl_device_disk *x_disk, int discard)
 {
@@ -2208,6 +2323,11 @@ libxlBuildDomainConfig(virPortAllocatorPtr graphicsports,
 if (libxlMakeDomBuildInfo(def, ctx, caps, d_config) < 0)
 return -1;
 
+#ifdef LIBXL_HAVE_VNUMA
+if (libxlMakeVnumaList(def, ctx, d_config) < 0)
+return -1;
+#endif
+
 if (libxlMakeDiskList(def, d_config) < 0)
 return -1;
 
diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c
index 8fefce6..a7d8bfe 100644
--- a/src/libxl/libxl_driver.c
+++ b/src/libxl/libxl_driver.c
@@ -2788,7 +2788,8 @@ 

Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-08 Thread Daniel P. Berrange
On Fri, Sep 08, 2017 at 01:45:06PM +, Relph, Richard wrote:
> A few answers in line…
> 
> On 9/8/17, 8:16 AM, "Daniel P. Berrange"  wrote:
> 
> On Fri, Sep 08, 2017 at 06:57:30AM -0500, Brijesh Singh wrote:
> > Hi All,
> > 
> > (sorry for the long message)
> > 
> > CPUs from AMD EPYC family supports Secure Encrypted Virtualization (SEV)
> > feature - the feature allows running encrypted VMs. To enable the 
> feature,
> > I have been submitting patches to Linux kernel [1], Qemu [2] and OVMF 
> [3].
> > We have been making some good progress in getting patches accepted 
> upstream
> > in Linux and OVMF trees. SEV builds upon SME (Secure Memory Encryption)
> > feature -- SME support just got pulled into 4.14 merge window. The base
> > SEV patches are accepted in OVMF tree -- now we have SEV aware guest 
> BIOS.
> > I am getting ready to take off "RFC" tag from remaining patches to get 
> them
> > reviewed and accepted.
> > 
> > The boot flow for launching an SEV guest is a bit different from a 
> typical
> > guest launch. In order to launch SEV guest from virt-manager or other
> > high-level VM management tools, we need to design and implement new
> > interface between libvirt and qemu, and probably add new APIs in libvirt
> > to be used by VM management tools. I am new to the libvirt and need some
> > expert advice while designing this interface. A pictorial representation
> > for a SEV guest launch flow is available in SEV Spec Appendix A [4].
> > 
> > A typical flow looks like this:
> > 
> > 1. Guest owner (GO) asks the cloud provider to launch SEV guest.
> > 2. VM tool asks libvirt to provide its Platform Diffie-Hellman (PDH) 
> key.
> > 3. libvirt opens /dev/sev device to get its PDH and return the blob to 
> the
> >caller.
> 
> What sort of size are we talking about for the PDH ?
> 
> The PDH blob is described in reference 4. It’s 0x824 bytes long… a bit over 
> 2K bytes.
> PDH is “Platform Diffie-Hellman” key, public portion.
> 
> There's a few ways libvirt could report it
> 
>  1. As an XML element in the host capabilities XML
>  2. As an XML element in the emulator capabilities XML
>  3. Via a newly added host API
> 
> > 4. VM tool gives its PDH to GO.
> > 5. GO provides its DH key, session-info and guest policy.
> 
> Are steps 4 & 5 strictly required to be in this order, or is it
> possible
> 
> Steps 4 and 5 must occur in that order. The data sent by the GO in the
> “session info” is encrypted and integrity protected with keys that the
> GO derives from the GO private Diffie-Hellman key and the PDH public
> Diffie-Hellman key.
> 
> What are the security requirements around the DH key, session info
> and guest policy ? Are any of them sensitive data which needs to be
> kept private from untrustworthy users. Also are all three of these
> items different for every guest launched, or are some of them
> likely to be the same for every guest ?
> 
> The PDH is not sensitive. It is relatively static for the platform.
> (It only changes when the platform owner chooses to change the apparent
> identity of the platform that will actually run the virtual machine.)
> The 128-byte session info data is encrypted and integrity protected by
> the GO. It need not be additionally encrypted or integrity protected.
> It should vary for EVERY guest launch.
> The 4-byte guest policy must be sent in the clear as some components
> may want to observe the policy bits. It may change from guest to guest,
> but there will likely only be a few common values.

Given this, and the pretty small data sizes, I think this info could
in fact all be provided inline in the XML config - either hex or base64
encoded for the binary blobs.



> > 8. libvirt launches the guest with "-S"
> 
> All libvirt guests get launched with -S, to give libvirt chance to do some
> setup before starting vCPUs. Normally vCPUs are started by default, but
> the VIR_DOMAIN_START_PAUSED flag allows the mgmt app to tell libvirt to
> leave vCPUS paused.
> 
> Alternatively, if libvirt sees presencese of 'sev' config for the guest,
> it could automatically leave it in PAUSED state regardless of the
> VIR_DOMAIN_START_PAUSED flag.
> 
> While using the LAUNCH_MEASURE and LAUNCH_SECRET operations is expected
> to be the common case, they are optional. I would recommend requiring
> the upstream software to explicitly set the VIR_DOMAIN_START_PAUSED flag,
> if only to minimize the dependencies…

Ok.

> > 9. While creating the SEV guest qemu does the following
> >  i) create encryption context using GO's DH, session-info and guest 
> policy
> > (LAUNCH_START)
> >  ii) encrypts the guest bios (LAUNCH_UPDATE_DATA)
> >  iii) calls LAUNCH_MEASUREMENT to get the encrypted bios measurement
>  

[libvirt] [PATCH 2/3] qemu_driver: fix existance vs existence typo

2017-09-08 Thread Guido Günther
---
 src/qemu/qemu_driver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6255d89310..c742e505c4 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16816,7 +16816,7 @@ qemuDomainBlockCopyValidateMirror(virStorageSourcePtr 
mirror,
 if (virStorageFileAccess(mirror, F_OK) < 0) {
 if (errno != ENOENT) {
 virReportSystemError(errno, "%s",
- _("unable to verify existance of "
+ _("unable to verify existence of "
"block copy target"));
 return -1;
 }
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH v4 5/5] xlconfigtest: add tests for numa cell sibling distances

2017-09-08 Thread Wim Ten Have
From: Wim ten Have 

Test a bidirectional xen-xl domxml to and from native for numa
support administration as brought under this patch series.

Signed-off-by: Wim ten Have 
---
 .../test-fullvirt-vnuma-autocomplete.cfg   | 26 +++
 .../test-fullvirt-vnuma-autocomplete.xml   | 85 ++
 .../test-fullvirt-vnuma-nodistances.cfg| 26 +++
 .../test-fullvirt-vnuma-nodistances.xml| 53 ++
 .../test-fullvirt-vnuma-partialdist.cfg| 26 +++
 .../test-fullvirt-vnuma-partialdist.xml| 60 +++
 tests/xlconfigdata/test-fullvirt-vnuma.cfg | 26 +++
 tests/xlconfigdata/test-fullvirt-vnuma.xml | 81 +
 tests/xlconfigtest.c   |  6 ++
 9 files changed, 389 insertions(+)
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.xml
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-nodistances.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-nodistances.xml
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-partialdist.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma-partialdist.xml
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma.cfg
 create mode 100644 tests/xlconfigdata/test-fullvirt-vnuma.xml

diff --git a/tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.cfg 
b/tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.cfg
new file mode 100644
index 000..edba69a
--- /dev/null
+++ b/tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.cfg
@@ -0,0 +1,26 @@
+name = "XenGuest2"
+uuid = "c7a5fdb2-cdaf-9455-926a-d65c16db1809"
+maxmem = 8192
+memory = 8192
+vcpus = 12
+pae = 1
+acpi = 1
+apic = 1
+viridian = 0
+rtc_timeoffset = 0
+localtime = 0
+on_poweroff = "destroy"
+on_reboot = "restart"
+on_crash = "restart"
+device_model = "/usr/lib/xen/bin/qemu-system-i386"
+sdl = 0
+vnc = 1
+vncunused = 1
+vnclisten = "127.0.0.1"
+vif = [ "mac=00:16:3e:66:92:9c,bridge=xenbr1,script=vif-bridge,model=e1000" ]
+parallel = "none"
+serial = "none"
+builder = "hvm"
+boot = "d"
+vnuma = [ [ "pnode=0", "size=2048", "vcpus=0,11", 
"vdistances=10,21,31,41,51,61" ], [ "pnode=1", "size=2048", "vcpus=1,10", 
"vdistances=21,10,21,31,41,51" ], [ "pnode=2", "size=2048", "vcpus=2,9", 
"vdistances=31,21,10,21,31,41" ], [ "pnode=3", "size=2048", "vcpus=3,8", 
"vdistances=41,31,21,10,21,31" ], [ "pnode=4", "size=2048", "vcpus=4,7", 
"vdistances=51,41,31,21,10,21" ], [ "pnode=5", "size=2048", "vcpus=5-6", 
"vdistances=61,51,41,31,21,10" ] ]
+disk = [ 
"format=raw,vdev=hda,access=rw,backendtype=phy,target=/dev/HostVG/XenGuest2" ]
diff --git a/tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.xml 
b/tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.xml
new file mode 100644
index 000..e3639eb
--- /dev/null
+++ b/tests/xlconfigdata/test-fullvirt-vnuma-autocomplete.xml
@@ -0,0 +1,85 @@
+
+  XenGuest2
+  c7a5fdb2-cdaf-9455-926a-d65c16db1809
+  8388608
+  8388608
+  12
+  
+hvm
+/usr/lib/xen/boot/hvmloader
+
+  
+  
+
+
+
+  
+  
+
+  
+
+  
+  
+  
+  
+  
+
+  
+  
+
+  
+  
+  
+  
+
+  
+  
+
+  
+  
+  
+
+  
+  
+
+  
+  
+
+  
+  
+
+  
+
+  
+  
+
+  
+  
+  destroy
+  restart
+  restart
+  
+/usr/lib/xen/bin/qemu-system-i386
+
+  
+  
+  
+  
+
+
+
+  
+  
+  
+  
+
+
+
+
+  
+
+
+  
+
+  
+
diff --git a/tests/xlconfigdata/test-fullvirt-vnuma-nodistances.cfg 
b/tests/xlconfigdata/test-fullvirt-vnuma-nodistances.cfg
new file mode 100644
index 000..8186edf
--- /dev/null
+++ b/tests/xlconfigdata/test-fullvirt-vnuma-nodistances.cfg
@@ -0,0 +1,26 @@
+name = "XenGuest2"
+uuid = "c7a5fdb2-cdaf-9455-926a-d65c16db1809"
+maxmem = 8192
+memory = 8192
+vcpus = 8
+pae = 1
+acpi = 1
+apic = 1
+viridian = 0
+rtc_timeoffset = 0
+localtime = 0
+on_poweroff = "destroy"
+on_reboot = "restart"
+on_crash = "restart"
+device_model = "/usr/lib/xen/bin/qemu-system-i386"
+sdl = 0
+vnc = 1
+vncunused = 1
+vnclisten = "127.0.0.1"
+vif = [ "mac=00:16:3e:66:92:9c,bridge=xenbr1,script=vif-bridge,model=e1000" ]
+parallel = "none"
+serial = "none"
+builder = "hvm"
+boot = "d"
+vnuma = [ [ "pnode=0", "size=2048", "vcpus=0-1", "vdistances=10,20,20,20" ], [ 
"pnode=1", "size=2048", "vcpus=2-3", "vdistances=20,10,20,20" ], [ "pnode=2", 
"size=2048", "vcpus=4-5", "vdistances=20,20,10,20" ], [ "pnode=3", "size=2048", 
"vcpus=6-7", "vdistances=20,20,20,10" ] ]
+disk = [ 
"format=raw,vdev=hda,access=rw,backendtype=phy,target=/dev/HostVG/XenGuest2" ]
diff --git 

[libvirt] [PATCH 0/3] typo fixes

2017-09-08 Thread Guido Günther
Probably could have gone by the trivial rule.

Guido Günther (3):
  storagefile: fix defintion vs definition typo
  qemu_driver: fix existance vs existence typo
  virnetserver: fix mesage vs message typo

 src/qemu/qemu_driver.c| 2 +-
 src/rpc/virnetserver.c| 2 +-
 src/util/virstoragefile.c | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH 1/3] storagefile: fix defintion vs definition typo

2017-09-08 Thread Guido Günther
---
 src/util/virstoragefile.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c
index fbc8245f35..e94ad32f09 100644
--- a/src/util/virstoragefile.c
+++ b/src/util/virstoragefile.c
@@ -3253,7 +3253,7 @@ 
virStorageSourceParseBackingJSONInternal(virStorageSourcePtr src,
 if (!(file = virJSONValueObjectGetObject(deflattened, "file"))) {
 str = virJSONValueToString(json, false);
 virReportError(VIR_ERR_INVALID_ARG,
-   _("JSON backing volume defintion '%s' lacks 'file' 
object"),
+   _("JSON backing volume definition '%s' lacks 'file' 
object"),
NULLSTR(str));
 goto cleanup;
 }
@@ -3261,7 +3261,7 @@ 
virStorageSourceParseBackingJSONInternal(virStorageSourcePtr src,
 if (!(drvname = virJSONValueObjectGetString(file, "driver"))) {
 str = virJSONValueToString(json, false);
 virReportError(VIR_ERR_INVALID_ARG,
-   _("JSON backing volume defintion '%s' lacks driver 
name"),
+   _("JSON backing volume definition '%s' lacks driver 
name"),
NULLSTR(str));
 goto cleanup;
 }
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH v3 1/4] numa: describe siblings distances within cells

2017-09-08 Thread Wim ten Have
On Mon, 4 Sep 2017 08:49:33 +0200
Martin Kletzander  wrote:

> On Fri, Sep 01, 2017 at 04:31:50PM +0200, Wim ten Have wrote:
> >On Thu, 31 Aug 2017 16:36:58 +0200
> >Martin Kletzander  wrote:
> >  
> >> On Thu, Aug 31, 2017 at 04:02:38PM +0200, Wim Ten Have wrote:  
> >> >From: Wim ten Have 
> >> >  
> >  
> >> I haven't seen the previous versions, so sorry if I point out something
> >> that got changed already.  
> >
> >There was a v1 and v2 cycle by Jim and Daniel 2 month back.
> >Due to personal issues whole got delayed at my end where i am
> >catching up.
> >  
> >> >Add libvirtd NUMA cell domain administration functionality to
> >> >describe underlying cell id sibling distances in full fashion
> >> >when configuring HVM guests.
> >> >
> >> >[below is an example of a 4 node setup]
> >> >
> >> >  
> >> >
> >> >  
> >> >
> >> >  
> >> >  
> >> >  
> >> >  
> >> >
> >> >  
> >> >  
> >> >
> >> >  
> >> >  
> >> >  
> >> >  
> >> >
> >> >  
> >> >  
> >> >
> >> >  
> >> >  
> >> >  
> >> >  
> >> >
> >> >  
> >> >
> >> >  
> >> >  
> >> >  
> >> >  
> >> >
> >> >  
> >> >
> >> >  
> >> >  
> >>
> >> Are all those required?  I don't see the point in requiring duplicate
> >> values.  
> >
> >Sorry I am not sure I understand what you mean by "duplicate values"
> >here. Can you elaborate?
> >  
> 
> Sure.  For simplicity let's assume 2 cells:
> 
> 
>   
> 
> 
>   
> 
> 
>   
>  
> 
> 
> 

The fields are not a necessity.  And to reduce even more we could also
remove LOCAL_DISTANCES as they make a constant factor where;
  (cell_id == sibling_id)

So, although the  presentation of a guest domain may heavily,
giving this a look it seems far from attractive to me.  I am not sure
if this is what we want and should do.

Let me go forward sending out "PATCH v4" addressing other comments and
reserve this idea for "PATCH v5".

Regards,
- Wim.

> >> >Changes under this commit concern all those that require adding
> >> >the valid data structures, virDomainNuma* functional routines and the
> >> >XML doc schema enhancements to enforce appropriate administration.  
> >>
> >> I don't understand the point of this paragraph.  
> >
> >Let me rephrase this paragraph in future version.
> >  
> >> >These changes are accompanied with topic related documentation under
> >> >docs/formatdomain.html within the "CPU model and topology" extending the
> >> >"Guest NUMA topology" paragraph.  
> >>
> >> This can be seen from the patch.  
> >
> >Agree, so I'll address this in future version too.
> >  
> >> >For terminology we refer to sockets as "nodes" where access to each
> >> >others' distinct resources such as memory make them "siblings" with a
> >> >designated "distance" between them.  A specific design is described under
> >> >the ACPI (Advanced Configuration and Power Interface Specification)
> >> >within the chapter explaining the system's SLIT (System Locality Distance
> >> >Information Table).  
> >>
> >> The above paragraph is also being added in the docs.  
> >
> >True so I'll scratch this part of the commit message.
> >  
> >> >These patches extend core libvirt's XML description of a virtual machine's
> >> >hardware to include NUMA distance information for sibling nodes, which
> >> >is then passed to Xen guests via libxl. Recently qemu landed support for
> >> >constructing the SLIT since commit 0f203430dd ("numa: Allow setting NUMA
> >> >distance for different NUMA nodes"), hence these core libvirt extensions
> >> >can also help other drivers in supporting this feature.  
> >>
> >> This is not true, libxl changes are in separate patch.  
> >
> >Like the previous comments and replies, I'll rearrange the commit
> >messages and rephrase them to be more specific to the patch.
> >  
> >> There's a lot of copy-paste from the cover-letter...  
> >
> >There is indeed room for improvement.
> >  
> >> >These changes alter the docs/schemas/cputypes.rng enforcing
> >> >domain administration to follow the syntax below per numa cell id.
> >> >
> >> >These changes also alter docs/schemas/basictypes.rng to add
> >> >"numaDistanceValue" which is an "unsignedInt" with a minimum value
> >> >of 10 as 0-9 are reserved values and can not be used as System
> >> >Locality Distance Information Table data.
> >> >
> >> >Signed-off-by: Wim ten Have 
> >> >---
> >> >Changes on v1:
> >> >- Add changes to docs/formatdomain.html.in describing schema update.
> >> >Changes on v2:
> >> >- Automatically apply distance symmetry maintaining cell <-> sibling.
> >> >- Check for maximum '255' on numaDistanceValue.
> >> >- Automatically complete empty distance ranges.
> >> >- Check that sibling_id's are in range with cell identifiers.
> >> 

[libvirt] [PATCH 3/3] virnetserver: fix mesage vs message typo

2017-09-08 Thread Guido Günther
---
 src/rpc/virnetserver.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/rpc/virnetserver.c b/src/rpc/virnetserver.c
index c02db74c46..2b76daab55 100644
--- a/src/rpc/virnetserver.c
+++ b/src/rpc/virnetserver.c
@@ -133,7 +133,7 @@ static int virNetServerProcessMsg(virNetServerPtr srv,
 >header) < 0)
 goto cleanup;
 } else {
-VIR_INFO("Dropping client mesage, unknown program %d version %d 
type %d proc %d",
+VIR_INFO("Dropping client message, unknown program %d version %d 
type %d proc %d",
  msg->header.prog, msg->header.vers,
  msg->header.type, msg->header.proc);
 /* Send a dummy reply to free up 'msg' & unblock client rx */
-- 
2.14.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-08 Thread Relph, Richard
A few answers in line…

On 9/8/17, 8:16 AM, "Daniel P. Berrange"  wrote:

On Fri, Sep 08, 2017 at 06:57:30AM -0500, Brijesh Singh wrote:
> Hi All,
> 
> (sorry for the long message)
> 
> CPUs from AMD EPYC family supports Secure Encrypted Virtualization (SEV)
> feature - the feature allows running encrypted VMs. To enable the feature,
> I have been submitting patches to Linux kernel [1], Qemu [2] and OVMF [3].
> We have been making some good progress in getting patches accepted 
upstream
> in Linux and OVMF trees. SEV builds upon SME (Secure Memory Encryption)
> feature -- SME support just got pulled into 4.14 merge window. The base
> SEV patches are accepted in OVMF tree -- now we have SEV aware guest BIOS.
> I am getting ready to take off "RFC" tag from remaining patches to get 
them
> reviewed and accepted.
> 
> The boot flow for launching an SEV guest is a bit different from a typical
> guest launch. In order to launch SEV guest from virt-manager or other
> high-level VM management tools, we need to design and implement new
> interface between libvirt and qemu, and probably add new APIs in libvirt
> to be used by VM management tools. I am new to the libvirt and need some
> expert advice while designing this interface. A pictorial representation
> for a SEV guest launch flow is available in SEV Spec Appendix A [4].
> 
> A typical flow looks like this:
> 
> 1. Guest owner (GO) asks the cloud provider to launch SEV guest.
> 2. VM tool asks libvirt to provide its Platform Diffie-Hellman (PDH) key.
> 3. libvirt opens /dev/sev device to get its PDH and return the blob to the
>caller.

What sort of size are we talking about for the PDH ?

The PDH blob is described in reference 4. It’s 0x824 bytes long… a bit over 2K 
bytes.
PDH is “Platform Diffie-Hellman” key, public portion.

There's a few ways libvirt could report it

 1. As an XML element in the host capabilities XML
 2. As an XML element in the emulator capabilities XML
 3. Via a newly added host API

> 4. VM tool gives its PDH to GO.
> 5. GO provides its DH key, session-info and guest policy.

Are steps 4 & 5 strictly required to be in this order, or is it
possible

Steps 4 and 5 must occur in that order. The data sent by the GO in the “session 
info” is encrypted and integrity protected with keys that the GO derives from 
the GO private Diffie-Hellman key and the PDH public Diffie-Hellman key.

What are the security requirements around the DH key, session info
and guest policy ? Are any of them sensitive data which needs to be
kept private from untrustworthy users. Also are all three of these
items different for every guest launched, or are some of them
likely to be the same for every guest ?

The PDH is not sensitive. It is relatively static for the platform. (It only 
changes when the platform owner chooses to change the apparent identity of the 
platform that will actually run the virtual machine.)
The 128-byte session info data is encrypted and integrity protected by the GO. 
It need not be additionally encrypted or integrity protected. It should vary 
for EVERY guest launch.
The 4-byte guest policy must be sent in the clear as some components may want 
to observe the policy bits. It may change from guest to guest, but there will 
likely only be a few common values.

eg, would the same guest policy blob be used for every guest, with
only session info changing ?

Also what sort of size are we talking about for each of these
data items, KBs, 10's of KB, 100's of KBs or larger ?

The security and data size can influence our design approach from
the libvirt POV.

> 6. VM tool somehow communicates the GO provided information to libvirt.

Essentially we have two choices

 1. inline in the guest XML config description passed to libvirt
when defining the guest XML

 2. out of band, ahead of time, via some other API prior to defining
the guest XML, which is then referenced in guest XML. THis could
be done using the virSecret APIs, where we create 3 secrets, one
each for DH key, session-info and guets policy. The UUID of the
secret could be specified in the guest XML. This has flexibility
of allowing the same secrets ot be used for many guests (if this
is valid for SEV)


> 7. libvirt adds "sev-guest" object in its xml file with all the 
information
>obtained from #5
> 
>(currently my xml file looks like this)
> 
>
> 
value='sev-guest,id=sev0,policy=,dh-key-file=,session-file=/>
>
>
> 
> 8. libvirt launches the guest with "-S"

All libvirt guests get launched with -S, to give libvirt chance to do some
setup before starting vCPUs. Normally 

Re: [libvirt] [Qemu-devel] libvirt/QEMU/SEV interaction

2017-09-08 Thread Daniel P. Berrange
On Fri, Sep 08, 2017 at 06:57:30AM -0500, Brijesh Singh wrote:
> Hi All,
> 
> (sorry for the long message)
> 
> CPUs from AMD EPYC family supports Secure Encrypted Virtualization (SEV)
> feature - the feature allows running encrypted VMs. To enable the feature,
> I have been submitting patches to Linux kernel [1], Qemu [2] and OVMF [3].
> We have been making some good progress in getting patches accepted upstream
> in Linux and OVMF trees. SEV builds upon SME (Secure Memory Encryption)
> feature -- SME support just got pulled into 4.14 merge window. The base
> SEV patches are accepted in OVMF tree -- now we have SEV aware guest BIOS.
> I am getting ready to take off "RFC" tag from remaining patches to get them
> reviewed and accepted.
> 
> The boot flow for launching an SEV guest is a bit different from a typical
> guest launch. In order to launch SEV guest from virt-manager or other
> high-level VM management tools, we need to design and implement new
> interface between libvirt and qemu, and probably add new APIs in libvirt
> to be used by VM management tools. I am new to the libvirt and need some
> expert advice while designing this interface. A pictorial representation
> for a SEV guest launch flow is available in SEV Spec Appendix A [4].
> 
> A typical flow looks like this:
> 
> 1. Guest owner (GO) asks the cloud provider to launch SEV guest.
> 2. VM tool asks libvirt to provide its Platform Diffie-Hellman (PDH) key.
> 3. libvirt opens /dev/sev device to get its PDH and return the blob to the
>    caller.

What sort of size are we talking about for the PDH ?

There's a few ways libvirt could report it

 1. As an XML element in the host capabilities XML
 2. As an XML element in the emulator capabilities XML
 3. Via a newly added host API

> 4. VM tool gives its PDH to GO.
> 5. GO provides its DH key, session-info and guest policy.

Are steps 4 & 5 strictly required to be in this order, or is it
possible

What are the security requirements around the DH key, session info
and guest policy ? Are any of them sensitive data which needs to be
kept private from untrustworthy users. Also are all three of these
items different for every guest launched, or are some of them
likely to be the same for every guest ?

eg, would the same guest policy blob be used for every guest, with
only session info changing ?

Also what sort of size are we talking about for each of these
data items, KBs, 10's of KB, 100's of KBs or larger ?

The security and data size can influence our design approach from
the libvirt POV.

> 6. VM tool somehow communicates the GO provided information to libvirt.

Essentially we have two choices

 1. inline in the guest XML config description passed to libvirt
when defining the guest XML

 2. out of band, ahead of time, via some other API prior to defining
the guest XML, which is then referenced in guest XML. THis could
be done using the virSecret APIs, where we create 3 secrets, one
each for DH key, session-info and guets policy. The UUID of the
secret could be specified in the guest XML. This has flexibility
of allowing the same secrets ot be used for many guests (if this
is valid for SEV)


> 7. libvirt adds "sev-guest" object in its xml file with all the information
>    obtained from #5
> 
>    (currently my xml file looks like this)
> 
>    
>     value='sev-guest,id=sev0,policy=,dh-key-file=,session-file=/>
>    
>    
> 
> 8. libvirt launches the guest with "-S"

All libvirt guests get launched with -S, to give libvirt chance to do some
setup before starting vCPUs. Normally vCPUs are started by default, but
the VIR_DOMAIN_START_PAUSED flag allows the mgmt app to tell libvirt to
leave vCPUS paused.

Alternatively, if libvirt sees presencese of 'sev' config for the guest,
it could automatically leave it in PAUSED state regardless of the
VIR_DOMAIN_START_PAUSED flag.

> 9. While creating the SEV guest qemu does the following
>  i) create encryption context using GO's DH, session-info and guest policy
>     (LAUNCH_START)
>  ii) encrypts the guest bios (LAUNCH_UPDATE_DATA)
>  iii) calls LAUNCH_MEASUREMENT to get the encrypted bios measurement
> 10. By some interface we must propagate the measurement all the way to GO
>   before libvirt starts the guest.

Again, what kind of size data are we talking about for athe "measurement"
blob ? a KB, 10's of KB, or more ?

My first gut instinct would be for QEMU to emit a QMP event when it has
the measurement available. The event could include the actual data blob,
or we can could add an explicit QMP command to fetch the data blob.

Libvirt could listen for this QEMU event, and in turn emit an event from
libvirt with the same data, which the mgmt tool can finally give to the
GO.

> 11. GO verifies the measurement and if measurement matches then it may
>  give a secret blob -- which must be injected into the guest before
>  libvirt starts the VM. If verification failed, GO will request cloud
>  provider to destroy the VM.

So 

Re: [libvirt] [PATCH 1/2] virsh: man: Be more explicit about 'create' creating transient domain

2017-09-08 Thread Erik Skultety
On Fri, Sep 08, 2017 at 11:06:08AM +0200, Martin Kletzander wrote:
> On Thu, Sep 07, 2017 at 03:48:13PM +0200, Erik Skultety wrote:
> > So we refer to terms 'persistent' and 'transient' across the whole man
> > page, without explicitly stating that domain created via the 'create'
> > command is in fact transient and will vanish once destroyed.
> >
> > Signed-off-by: Erik Skultety 
> > ---
> > tools/virsh.pod | 5 +
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/tools/virsh.pod b/tools/virsh.pod
> > index c13f96f22..14a01e9ba 100644
> > --- a/tools/virsh.pod
> > +++ b/tools/virsh.pod
> > @@ -652,6 +652,11 @@ of open file descriptors which should be pass on into 
> > the guest. The
> > file descriptors will be re-numbered in the guest, starting from 3. This
> > is only supported with container based virtualization.
> >
> > +Note that a domain created using this command is transient (temporary), 
> > meaning
> > +that such a domain simply disappears once shut off or destroyed and any
> > +subsequent attempts to B it will fail. See B command on how 
> > to
> > +make a domain persistent.
> > +
>
> Not really.  With this command you can also start existing domain, just
> with a different configuration.  Given that the example below shows
> exactly that usage, I think someone might get a bit confused.  Would you
> mind rewriting it to accommodate all variants?

I pushed 2/2 and sent a v2 of this one, rewording things a bit.

Thanks,
Erik

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] virsh: man: Describe the 'create' command a bit more

2017-09-08 Thread Erik Skultety
So we refer to the terms 'persistent' and 'transient' across the whole
man page, without describing it further, but more importantly, how the
create command affects it, i.e. explicitly stating that domain created
via the 'create' command are going to be transient or persistent,
depending on whether there is an existing persistent domain with a
matching  and , in which case it will remain persistent, but
will run using a one-time configuration, otherwise it's going to be
transient and will vanish once destroyed.

Signed-off-by: Erik Skultety 
---
 tools/virsh.pod | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/tools/virsh.pod b/tools/virsh.pod
index 01453be60..c4c76fcb1 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -640,9 +640,14 @@ sessions, such as in a case of a broken connection.
 
 Create a domain from an XML . Optionally, I<--validate> option can be
 passed to validate the format of the input XML file against an internal RNG
-schema (identical to using L tool). An easy way to create
-the XML  is to use the B command to obtain the definition of a
-pre-existing guest.  The domain will be paused if the I<--paused> option is 
used
+schema (identical to using L tool). Domains created using
+this command are going to be either transient (temporary ones that will vanish
+once destroyed) or existing persistent domains that will run with one-time use
+configuration, leaving the persistent XML untouched (this can come handy during
+an automated testing of various configurations all based on the original XML).
+See the B section for usage demonstration.
+
+The domain will be paused if the I<--paused> option is used
 and supported by the driver; otherwise it will be running. If I<--console> is
 requested, attach to the console after creation.
 If I<--autodestroy> is requested, then the guest will be automatically
@@ -656,9 +661,24 @@ is only supported with container based virtualization.
 
 B
 
- virsh dumpxml  > domain.xml
- vi domain.xml (or make changes with your other text editor)
- virsh create domain.xml
+ 1) prepare a template from an existing domain (skip directly to 3a if writing
+one from scratch)
+
+ # virsh dumpxml  > domain.xml
+
+ 2) edit the template using an editor of your choice and:
+a) DO CHANGE!  and  ( can also be removed), or
+b) DON'T CHANGE! either  or 
+
+ # $EDITOR domain.xml
+
+ 3) create a domain from domain.xml, depending on whether following 2a or 2b
+respectively:
+a) the domain is going to be transient
+b) an existing persistent domain will run with a modified one-time
+   configuration
+
+ # virsh create domain.xml
 
 =item B I [I<--validate>]
 
-- 
2.13.3

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] libvirt/QEMU/SEV interaction

2017-09-08 Thread Brijesh Singh
Hi All,

(sorry for the long message)

CPUs from AMD EPYC family supports Secure Encrypted Virtualization (SEV)
feature - the feature allows running encrypted VMs. To enable the feature,
I have been submitting patches to Linux kernel [1], Qemu [2] and OVMF [3].
We have been making some good progress in getting patches accepted upstream
in Linux and OVMF trees. SEV builds upon SME (Secure Memory Encryption)
feature -- SME support just got pulled into 4.14 merge window. The base
SEV patches are accepted in OVMF tree -- now we have SEV aware guest BIOS.
I am getting ready to take off "RFC" tag from remaining patches to get them
reviewed and accepted.

The boot flow for launching an SEV guest is a bit different from a typical
guest launch. In order to launch SEV guest from virt-manager or other
high-level VM management tools, we need to design and implement new
interface between libvirt and qemu, and probably add new APIs in libvirt
to be used by VM management tools. I am new to the libvirt and need some
expert advice while designing this interface. A pictorial representation
for a SEV guest launch flow is available in SEV Spec Appendix A [4].

A typical flow looks like this:

1. Guest owner (GO) asks the cloud provider to launch SEV guest.
2. VM tool asks libvirt to provide its Platform Diffie-Hellman (PDH) key.
3. libvirt opens /dev/sev device to get its PDH and return the blob to the
   caller.
4. VM tool gives its PDH to GO.
5. GO provides its DH key, session-info and guest policy.
6. VM tool somehow communicates the GO provided information to libvirt.
7. libvirt adds "sev-guest" object in its xml file with all the information
   obtained from #5

   (currently my xml file looks like this)

   
   
   

8. libvirt launches the guest with "-S"
9. While creating the SEV guest qemu does the following
 i) create encryption context using GO's DH, session-info and guest policy
    (LAUNCH_START)
 ii) encrypts the guest bios (LAUNCH_UPDATE_DATA)
 iii) calls LAUNCH_MEASUREMENT to get the encrypted bios measurement
10. By some interface we must propagate the measurement all the way to GO
  before libvirt starts the guest.
11. GO verifies the measurement and if measurement matches then it may
 give a secret blob -- which must be injected into the guest before
 libvirt starts the VM. If verification failed, GO will request cloud
 provider to destroy the VM.
12. After secret blob is injected into guest, we call LAUNCH_FINISH
  to destory the encryption context.
13. libvirt issues "continue" command to resume the guest boot.

Please note that the measurement value is protected with transport
encryption key (TIK) and it changes on each run. Similarly the secret blob
provided by GO does not need to be protected using libvirt/qemu APIs. The
secret is protected by TIK. From qemu and libvirt point of view these are
blobs and must be passed as-is to the SEV FW.

Questions:
a) Do we need to add a new set of APIs in libvirt to return the PDH from
libvirt and VM tool ? Or can we use some pre-existing APIs to pass the
opaque blobs ? (this is mainly for step 3 and 6)
b) do we need to define a new xml tag to for memory-encryption ? or just
use the qemu:args tag ? (step 6)
c) what existing communicate interface can be used between libvirt and qemu
to get the measurement ? can we add a new qemu monitor command
'get_sev_measurement' to get the measurement ? (step 10)
d) how to pass the secret blob from libvirt to qemu ? should we consider
adding a new object (sev-guest-secret) -- libvirt can add the object through
qemu monitor.


[1] https://marc.info/?l=kvm=150092661105069=2
[2] https://marc.info/?l=qemu-devel=148901186615642=2
[3] https://lists.01.org/pipermail/edk2-devel/2017-July/012220.html
[4] http://support.amd.com/TechDocs/55766_SEV-KM%20API_Specification.pdf

Thanks

Brijesh

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH 2/2] virsh: man: Document the --validate option for create and define cmds

2017-09-08 Thread Martin Kletzander

On Thu, Sep 07, 2017 at 03:48:14PM +0200, Erik Skultety wrote:

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1368753

Signed-off-by: Erik Skultety 
---
tools/virsh.pod | 22 +-
1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/tools/virsh.pod b/tools/virsh.pod
index 14a01e9ba..8ab93faf2 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -636,13 +636,15 @@ the I<--force> flag may be specified, requesting to 
disconnect any existing
sessions, such as in a case of a broken connection.

=item B I [I<--console>] [I<--paused>] [I<--autodestroy>]
-[I<--pass-fds N,M,...>]
+[I<--pass-fds N,M,...>] [I<--validate>]

-Create a domain from an XML . An easy way to create the XML
- is to use the B command to obtain the definition of a
-pre-existing guest.  The domain will be paused if the I<--paused> option
-is used and supported by the driver; otherwise it will be running.
-If I<--console> is requested, attach to the console after creation.
+Create a domain from an XML . Optionally, I<--validate> option can be
+passed to validate the format of the input XML file against an internal RNG
+schema (identical to using L tool). An easy way to create
+the XML  is to use the B command to obtain the definition of a
+pre-existing guest.  The domain will be paused if the I<--paused> option is 
used
+and supported by the driver; otherwise it will be running. If I<--console> is
+requested, attach to the console after creation.
If I<--autodestroy> is requested, then the guest will be automatically
destroyed when virsh closes its connection to libvirt, or otherwise
exits.
@@ -663,10 +665,12 @@ B
 vi domain.xml (or make changes with your other text editor)
 virsh create domain.xml

-=item B I
+=item B I [I<--validate>]

-Define a domain from an XML . The domain definition is registered
-but not started.  If domain is already running, the changes will take
+Define a domain from an XML . Optionally, the format of the input XML
+file can be validated against an internal RNG schema with I<--validate>
+(identical to using L tool). The domain definition is
+registered but not started.  If domain is already running, the changes will 
take
effect on the next boot.



I though we made this the default.  It looks like we did that only for
cmdEdit.  In that case,

Reviewed-by: Martin Kletzander 



=item B I [[I<--live>] [I<--config>] |
--
2.13.3

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH] vhost-user: add support reconnect for vhost-user ports

2017-09-08 Thread ZhiPeng Lu
For vhost-user ports, Open vSwitch acts as the server and QEMU the client.
When OVS crashed or restart, QEMU shoule be reconnect to OVS.

Signed-off-by: ZhiPeng Lu 
---
 docs/formatdomain.html.in  |  6 +++--
 docs/schemas/domaincommon.rng  |  5 
 src/conf/domain_conf.c | 28 --
 .../qemuxml2argv-net-vhostuser-multiq.args |  2 +-
 .../qemuxml2argv-net-vhostuser-multiq.xml  |  2 +-
 5 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index 8ca7637..ffe45c2 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -5660,7 +5660,7 @@ qemu-kvm -net nic,model=? /dev/null
   /interface
   interface type='vhostuser'
 mac address='52:54:00:3b:83:1b'/
-source type='unix' path='/tmp/vhost2.sock' mode='client'/
+source type='unix' path='/tmp/vhost2.sock' mode='client' 
reconnect='10'/
 model type='virtio'/
 driver queues='5'/
   /interface
@@ -5675,7 +5675,9 @@ qemu-kvm -net nic,model=? /dev/null
   Both mode='server' and mode='client'
   are supported.
   vhost-user requires the virtio model type, thus the
-  model element is mandatory.
+  model element is mandatory. reconnect
+  is an optional element,which configures reconnect timeout if the
+  connection is lost.
 
 
 Traffic filtering with NWFilter
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index 06c5a91..82f30ae 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -2383,6 +2383,11 @@
 client
   
 
+
+  
+
+  
+
 
   
 
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 2fc1fc3..f9c3b35 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -10176,6 +10176,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
 char *vhostuser_mode = NULL;
 char *vhostuser_path = NULL;
 char *vhostuser_type = NULL;
+char *vhostuser_reconnect = NULL;
 char *trustGuestRxFilters = NULL;
 char *vhost_path = NULL;
 virNWFilterHashTablePtr filterparams = NULL;
@@ -10262,11 +10263,12 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
 goto error;
 }
 } else if (!vhostuser_path && !vhostuser_mode && !vhostuser_type
-   && def->type == VIR_DOMAIN_NET_TYPE_VHOSTUSER &&
-   virXMLNodeNameEqual(cur, "source")) {
+   && !vhostuser_reconnect && def->type == 
VIR_DOMAIN_NET_TYPE_VHOSTUSER
+   && virXMLNodeNameEqual(cur, "source")) {
 vhostuser_type = virXMLPropString(cur, "type");
 vhostuser_path = virXMLPropString(cur, "path");
 vhostuser_mode = virXMLPropString(cur, "mode");
+vhostuser_reconnect = virXMLPropString(cur, "reconnect");
 } else if (!def->virtPortProfile
&& virXMLNodeNameEqual(cur, "virtualport")) {
 if (def->type == VIR_DOMAIN_NET_TYPE_NETWORK) {
@@ -10478,6 +10480,11 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
  "type='vhostuser'/>"));
 goto error;
 }
+if (vhostuser_reconnect != NULL && STREQ(vhostuser_mode, "server")) {
+virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+   _("'reconnect' attribute  unsupported "
+ "'server' mode for "));
+}
 
 if (VIR_ALLOC(def->data.vhostuser) < 0)
 goto error;
@@ -10490,6 +10497,17 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
 def->data.vhostuser->data.nix.listen = true;
 } else if (STREQ(vhostuser_mode, "client")) {
 def->data.vhostuser->data.nix.listen = false;
+if (vhostuser_reconnect != NULL) {
+def->data.vhostuser->data.nix.reconnect.enabled = true;
+if (virStrToLong_ui(vhostuser_reconnect, NULL, 0,
+   
>data.vhostuser->data.nix.reconnect.timeout) < 0) {
+virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("vhost-user reconnect attribute is 
invalid"));
+vhostuser_reconnect = NULL;
+def->data.vhostuser->data.nix.reconnect.enabled = false;
+goto error;
+}
+}
 } else {
 virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
_("Wrong  'mode' attribute "
@@ -10937,6 +10955,7 @@ virDomainNetDefParseXML(virDomainXMLOptionPtr xmlopt,
 VIR_FREE(vhostuser_type);
 

Re: [libvirt] [PATCH 1/2] virsh: man: Be more explicit about 'create' creating transient domain

2017-09-08 Thread Martin Kletzander

On Thu, Sep 07, 2017 at 03:48:13PM +0200, Erik Skultety wrote:

So we refer to terms 'persistent' and 'transient' across the whole man
page, without explicitly stating that domain created via the 'create'
command is in fact transient and will vanish once destroyed.

Signed-off-by: Erik Skultety 
---
tools/virsh.pod | 5 +
1 file changed, 5 insertions(+)

diff --git a/tools/virsh.pod b/tools/virsh.pod
index c13f96f22..14a01e9ba 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -652,6 +652,11 @@ of open file descriptors which should be pass on into the 
guest. The
file descriptors will be re-numbered in the guest, starting from 3. This
is only supported with container based virtualization.

+Note that a domain created using this command is transient (temporary), meaning
+that such a domain simply disappears once shut off or destroyed and any
+subsequent attempts to B it will fail. See B command on how to
+make a domain persistent.
+


Not really.  With this command you can also start existing domain, just
with a different configuration.  Given that the example below shows
exactly that usage, I think someone might get a bit confused.  Would you
mind rewriting it to accommodate all variants?


B

 virsh dumpxml  > domain.xml
--
2.13.3

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

Re: [libvirt] [PATCH] docs: Update --timeout description in libvirtd's man page

2017-09-08 Thread Martin Kletzander

On Thu, Sep 07, 2017 at 03:47:51PM +0200, Erik Skultety wrote:

Since commit @ae2163f8, only active client connections or running
domains are allowed to inhibit daemon shutdown. The man page however
wasn't updated appropriately.

Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1325066

Signed-off-by: Erik Skultety 


Reviewed-by: Martin Kletzander 


---
daemon/libvirtd.pod | 6 ++
1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/daemon/libvirtd.pod b/daemon/libvirtd.pod
index 9f5a17d9c..8d3fbd13d 100644
--- a/daemon/libvirtd.pod
+++ b/daemon/libvirtd.pod
@@ -56,10 +56,8 @@ Use this name for the PID file, overriding the default value.

=item B<-t, --timeout> I

-Exit after timeout period (in seconds) elapse with no client connections
-or registered resources.  Be aware that resources such as autostart
-networks will result in never reaching the timeout, even when there are
-no client connections.
+Exit after timeout period (in seconds), provided there are neither any client
+connections nor any running domains.

=item B<-v, --verbose>

--
2.13.3

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


signature.asc
Description: Digital signature
--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

[libvirt] [PATCH v3 REBASE 1/2] qemu: prepare blockjob complete event error usage

2017-09-08 Thread Nikolay Shirokovskiy
This patch pass event error up to the place where we can
use it. Error is passed only for sync blockjob event mode
as we can't use the error in async mode. In async mode we
just pass the event details to the client thru event API
but current blockjob event API can not carry extra parameter.
---
 src/qemu/qemu_blockjob.c | 2 ++
 src/qemu/qemu_domain.c   | 1 +
 src/qemu/qemu_domain.h   | 1 +
 src/qemu/qemu_monitor.c  | 5 +++--
 src/qemu/qemu_monitor.h  | 4 +++-
 src/qemu/qemu_monitor_json.c | 4 +++-
 src/qemu/qemu_process.c  | 4 
 7 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/src/qemu/qemu_blockjob.c b/src/qemu/qemu_blockjob.c
index 415768d..cb00736 100644
--- a/src/qemu/qemu_blockjob.c
+++ b/src/qemu/qemu_blockjob.c
@@ -35,6 +35,7 @@
 #include "virthread.h"
 #include "virtime.h"
 #include "locking/domain_lock.h"
+#include "viralloc.h"
 
 #define VIR_FROM_THIS VIR_FROM_QEMU
 
@@ -66,6 +67,7 @@ qemuBlockJobUpdate(virQEMUDriverPtr driver,
  diskPriv->blockJobType,
  diskPriv->blockJobStatus);
 diskPriv->blockJobStatus = -1;
+VIR_FREE(diskPriv->blockJobError);
 }
 
 return status;
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 7203189..92bf747 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -923,6 +923,7 @@ qemuDomainDiskPrivateDispose(void *obj)
 
 qemuDomainSecretInfoFree(>secinfo);
 qemuDomainSecretInfoFree(>encinfo);
+VIR_FREE(priv->blockJobError);
 }
 
 
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 5f6e361..5b00aef 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -342,6 +342,7 @@ struct _qemuDomainDiskPrivate {
 /* for some synchronous block jobs, we need to notify the owner */
 int blockJobType;   /* type of the block job from the event */
 int blockJobStatus; /* status of the finished block job */
+char *blockJobError; /* block job completed event error */
 bool blockJobSync; /* the block job needs synchronized termination */
 
 bool migrating; /* the disk is being migrated */
diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 19082d8..fa093df 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -1514,13 +1514,14 @@ int
 qemuMonitorEmitBlockJob(qemuMonitorPtr mon,
 const char *diskAlias,
 int type,
-int status)
+int status,
+const char *error)
 {
 int ret = -1;
 VIR_DEBUG("mon=%p", mon);
 
 QEMU_MONITOR_CALLBACK(mon, ret, domainBlockJob, mon->vm,
-  diskAlias, type, status);
+  diskAlias, type, status, error);
 return ret;
 }
 
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index 9805a33..2014705 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -176,6 +176,7 @@ typedef int 
(*qemuMonitorDomainBlockJobCallback)(qemuMonitorPtr mon,
  const char *diskAlias,
  int type,
  int status,
+ const char *error,
  void *opaque);
 typedef int (*qemuMonitorDomainTrayChangeCallback)(qemuMonitorPtr mon,
virDomainObjPtr vm,
@@ -375,7 +376,8 @@ int qemuMonitorEmitPMSuspend(qemuMonitorPtr mon);
 int qemuMonitorEmitBlockJob(qemuMonitorPtr mon,
 const char *diskAlias,
 int type,
-int status);
+int status,
+const char *error);
 int qemuMonitorEmitBalloonChange(qemuMonitorPtr mon,
  unsigned long long actual);
 int qemuMonitorEmitPMSuspendDisk(qemuMonitorPtr mon);
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index df5fb7c..56df903 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -862,6 +862,7 @@ qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon,
 {
 const char *device;
 const char *type_str;
+const char *error = NULL;
 int type = VIR_DOMAIN_BLOCK_JOB_TYPE_UNKNOWN;
 unsigned long long offset, len;
 
@@ -894,6 +895,7 @@ qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon,
 
 switch ((virConnectDomainEventBlockJobStatus) event) {
 case VIR_DOMAIN_BLOCK_JOB_COMPLETED:
+error = virJSONValueObjectGetString(data, "error");
 /* Make sure the whole device has been processed */
 if (offset != len)
 event = VIR_DOMAIN_BLOCK_JOB_FAILED;
@@ -908,7 +910,7 @@ qemuMonitorJSONHandleBlockJobImpl(qemuMonitorPtr mon,
 }
 
  out:
-qemuMonitorEmitBlockJob(mon, 

[libvirt] [PATCH v3 REBASE 2/2] qemu: report drive mirror errors on migration

2017-09-08 Thread Nikolay Shirokovskiy
---
 src/qemu/qemu_blockjob.c  | 14 +---
 src/qemu/qemu_blockjob.h  |  3 ++-
 src/qemu/qemu_driver.c|  4 ++--
 src/qemu/qemu_migration.c | 55 +--
 4 files changed, 54 insertions(+), 22 deletions(-)

diff --git a/src/qemu/qemu_blockjob.c b/src/qemu/qemu_blockjob.c
index cb00736..5e4fe6d 100644
--- a/src/qemu/qemu_blockjob.c
+++ b/src/qemu/qemu_blockjob.c
@@ -47,6 +47,7 @@ VIR_LOG_INIT("qemu.qemu_blockjob");
  * @driver: qemu driver
  * @vm: domain
  * @disk: domain disk
+ * @error: error (output parameter)
  *
  * Update disk's mirror state in response to a block job event stored in
  * blockJobStatus by qemuProcessHandleBlockJob event handler.
@@ -57,17 +58,24 @@ int
 qemuBlockJobUpdate(virQEMUDriverPtr driver,
virDomainObjPtr vm,
qemuDomainAsyncJob asyncJob,
-   virDomainDiskDefPtr disk)
+   virDomainDiskDefPtr disk,
+   char **error)
 {
 qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
 int status = diskPriv->blockJobStatus;
 
+if (error)
+*error = NULL;
+
 if (status != -1) {
 qemuBlockJobEventProcess(driver, vm, disk, asyncJob,
  diskPriv->blockJobType,
  diskPriv->blockJobStatus);
 diskPriv->blockJobStatus = -1;
-VIR_FREE(diskPriv->blockJobError);
+if (error)
+VIR_STEAL_PTR(*error, diskPriv->blockJobError);
+else
+VIR_FREE(diskPriv->blockJobError);
 }
 
 return status;
@@ -255,6 +263,6 @@ qemuBlockJobSyncEnd(virQEMUDriverPtr driver,
 virDomainDiskDefPtr disk)
 {
 VIR_DEBUG("disk=%s", disk->dst);
-qemuBlockJobUpdate(driver, vm, asyncJob, disk);
+qemuBlockJobUpdate(driver, vm, asyncJob, disk, NULL);
 QEMU_DOMAIN_DISK_PRIVATE(disk)->blockJobSync = false;
 }
diff --git a/src/qemu/qemu_blockjob.h b/src/qemu/qemu_blockjob.h
index 47aa4c1..e71d691 100644
--- a/src/qemu/qemu_blockjob.h
+++ b/src/qemu/qemu_blockjob.h
@@ -29,7 +29,8 @@
 int qemuBlockJobUpdate(virQEMUDriverPtr driver,
virDomainObjPtr vm,
qemuDomainAsyncJob asyncJob,
-   virDomainDiskDefPtr disk);
+   virDomainDiskDefPtr disk,
+   char **error);
 void qemuBlockJobEventProcess(virQEMUDriverPtr driver,
   virDomainObjPtr vm,
   virDomainDiskDefPtr disk,
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 6255d89..f6fd6ad 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -16596,13 +16596,13 @@ qemuDomainBlockJobAbort(virDomainPtr dom,
  VIR_DOMAIN_BLOCK_JOB_CANCELED);
 } else {
 qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
-qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk);
+qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk, NULL);
 while (diskPriv->blockjob) {
 if (virDomainObjWait(vm) < 0) {
 ret = -1;
 goto endjob;
 }
-qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk);
+qemuBlockJobUpdate(driver, vm, QEMU_ASYNC_JOB_NONE, disk, 
NULL);
 }
 }
 }
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c
index 272d525..b7ba4c3 100644
--- a/src/qemu/qemu_migration.c
+++ b/src/qemu/qemu_migration.c
@@ -611,17 +611,25 @@ qemuMigrationDriveMirrorReady(virQEMUDriverPtr driver,
 for (i = 0; i < vm->def->ndisks; i++) {
 virDomainDiskDefPtr disk = vm->def->disks[i];
 qemuDomainDiskPrivatePtr diskPriv = QEMU_DOMAIN_DISK_PRIVATE(disk);
+char *error = NULL;
 
 if (!diskPriv->migrating)
 continue;
 
-status = qemuBlockJobUpdate(driver, vm, asyncJob, disk);
+status = qemuBlockJobUpdate(driver, vm, asyncJob, disk, );
 if (status == VIR_DOMAIN_BLOCK_JOB_FAILED) {
-virReportError(VIR_ERR_OPERATION_FAILED,
-   _("migration of disk %s failed"),
-   disk->dst);
+if (error) {
+virReportError(VIR_ERR_OPERATION_FAILED,
+   _("migration of disk %s failed: %s"),
+   disk->dst, error);
+VIR_FREE(error);
+} else {
+virReportError(VIR_ERR_OPERATION_FAILED,
+   _("migration of disk %s failed"), disk->dst);
+}
 return -1;
 }
+VIR_FREE(error);
 
 if (disk->mirrorState != VIR_DOMAIN_DISK_MIRROR_STATE_READY)
 notReady++;
@@ -663,17 +671,23 @@ qemuMigrationDriveMirrorCancelled(virQEMUDriverPtr driver,
 for (i = 0; i < 

[libvirt] [PATCH v3 REBASE 0/2] qemu: report block job errors from qemu to the user

2017-09-08 Thread Nikolay Shirokovskiy
So that you can see nice report on migration:

"error: operation failed: migration of disk sda failed: No space left on device"

diff from v2:

1. split into 2 patches
2. change formal documentation where it is present accordingly
3. add variable initialization for safety

Nikolay Shirokovskiy (2):
  qemu: prepare blockjob complete event error usage
  qemu: report drive mirror errors on migration

 src/qemu/qemu_blockjob.c | 14 +--
 src/qemu/qemu_blockjob.h |  3 ++-
 src/qemu/qemu_domain.c   |  1 +
 src/qemu/qemu_domain.h   |  1 +
 src/qemu/qemu_driver.c   |  4 ++--
 src/qemu/qemu_migration.c| 55 +++-
 src/qemu/qemu_monitor.c  |  5 ++--
 src/qemu/qemu_monitor.h  |  4 +++-
 src/qemu/qemu_monitor_json.c |  4 +++-
 src/qemu/qemu_process.c  |  4 
 10 files changed, 70 insertions(+), 25 deletions(-)

-- 
1.8.3.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


Re: [libvirt] [PATCH] qemu: monitor: fix graceful shutdown corner cases

2017-09-08 Thread Nikolay Shirokovskiy
Ouch, I don't escape test patch [1] in message body so
is is applied together with the main patch. Be careful.

On 08.09.2017 10:16, Nikolay Shirokovskiy wrote:
> Patch aeda1b8c needs some enhancement.
> 
> 1. Shutdown event is delivired on reboot too and we don't want
> to set willhangup flag is this case.
> 
> 2. There is a next race condition.
> 
>  - EOF is delivered in event loop thread
>  - qemuMonitorSend is called on client behalf and waits for notification
>on message response or monitor close
>  - EOF handler tries to accquire job condition and fail on timeout as
>it is grabbed by the request above
> 
>   Now qemuMonitorSend hangs.
> 
> Previously if qemuMonitorSend is called after EOF then it failed
> immediately due to lastError is set. Now we need to check willhangup too.
> 
> ---
> 
> Race is easy to trigger with patch [1]. One need to query domain
> frequently enough in a loop and do a shutdown.
> 
> [1] Patch to trigger race condition.
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index b782451..4445f88 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -4742,6 +4742,9 @@ static void qemuProcessEventHandler(void *data, void 
> *opaque)
> 
>  VIR_DEBUG("vm=%p, event=%d", vm, processEvent->eventType);
> 
> +if (processEvent->eventType == QEMU_PROCESS_EVENT_MONITOR_EOF)
> +sleep(3);
> +
>  virObjectLock(vm);
> 
>  switch (processEvent->eventType) {
> 
> 
> 
> 
>  src/qemu/qemu_monitor.c | 16 +++-
>  src/qemu/qemu_monitor.h |  2 ++
>  src/qemu/qemu_process.c |  1 +
>  3 files changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
> index 19082d8..6270191 100644
> --- a/src/qemu/qemu_monitor.c
> +++ b/src/qemu/qemu_monitor.c
> @@ -315,6 +315,15 @@ qemuMonitorEscapeNonPrintable(const char *text)
>  #endif
>  
>  
> +void
> +qemuMonitorShutdown(qemuMonitorPtr mon)
> +{
> +virObjectLock(mon);
> +mon->willhangup = 1;
> +virObjectUnlock(mon);
> +}
> +
> +
>  static void
>  qemuMonitorDispose(void *obj)
>  {
> @@ -1055,6 +1064,12 @@ qemuMonitorSend(qemuMonitorPtr mon,
>  return -1;
>  }
>  
> +if (mon->willhangup) {
> +virReportError(VIR_ERR_OPERATION_INVALID, "%s",
> +   _("Domain is shutting down."));
> +return -1;
> +}
> +
>  mon->msg = msg;
>  qemuMonitorUpdateWatch(mon);
>  
> @@ -1336,7 +1351,6 @@ qemuMonitorEmitShutdown(qemuMonitorPtr mon, 
> virTristateBool guest)
>  {
>  int ret = -1;
>  VIR_DEBUG("mon=%p guest=%u", mon, guest);
> -mon->willhangup = 1;
>  
>  QEMU_MONITOR_CALLBACK(mon, ret, domainShutdown, mon->vm, guest);
>  return ret;
> diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
> index 9805a33..30533ef 100644
> --- a/src/qemu/qemu_monitor.h
> +++ b/src/qemu/qemu_monitor.h
> @@ -303,6 +303,8 @@ void qemuMonitorUnregister(qemuMonitorPtr mon)
>  ATTRIBUTE_NONNULL(1);
>  void qemuMonitorClose(qemuMonitorPtr mon);
>  
> +void qemuMonitorShutdown(qemuMonitorPtr mon);
> +
>  virErrorPtr qemuMonitorLastError(qemuMonitorPtr mon);
>  
>  int qemuMonitorSetCapabilities(qemuMonitorPtr mon);
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index ab81d65..824be86 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -618,6 +618,7 @@ qemuProcessShutdownOrReboot(virQEMUDriverPtr driver,
>  virObjectUnref(vm);
>  }
>  } else {
> +qemuMonitorShutdown(priv->mon);
>  ignore_value(qemuProcessKill(vm, VIR_QEMU_PROCESS_KILL_NOWAIT));
>  }
>  }
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list


[libvirt] [PATCH] qemu: monitor: fix graceful shutdown corner cases

2017-09-08 Thread Nikolay Shirokovskiy
Patch aeda1b8c needs some enhancement.

1. Shutdown event is delivired on reboot too and we don't want
to set willhangup flag is this case.

2. There is a next race condition.

 - EOF is delivered in event loop thread
 - qemuMonitorSend is called on client behalf and waits for notification
   on message response or monitor close
 - EOF handler tries to accquire job condition and fail on timeout as
   it is grabbed by the request above

  Now qemuMonitorSend hangs.

Previously if qemuMonitorSend is called after EOF then it failed
immediately due to lastError is set. Now we need to check willhangup too.

---

Race is easy to trigger with patch [1]. One need to query domain
frequently enough in a loop and do a shutdown.

[1] Patch to trigger race condition.

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index b782451..4445f88 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -4742,6 +4742,9 @@ static void qemuProcessEventHandler(void *data, void 
*opaque)

 VIR_DEBUG("vm=%p, event=%d", vm, processEvent->eventType);

+if (processEvent->eventType == QEMU_PROCESS_EVENT_MONITOR_EOF)
+sleep(3);
+
 virObjectLock(vm);

 switch (processEvent->eventType) {




 src/qemu/qemu_monitor.c | 16 +++-
 src/qemu/qemu_monitor.h |  2 ++
 src/qemu/qemu_process.c |  1 +
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index 19082d8..6270191 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -315,6 +315,15 @@ qemuMonitorEscapeNonPrintable(const char *text)
 #endif
 
 
+void
+qemuMonitorShutdown(qemuMonitorPtr mon)
+{
+virObjectLock(mon);
+mon->willhangup = 1;
+virObjectUnlock(mon);
+}
+
+
 static void
 qemuMonitorDispose(void *obj)
 {
@@ -1055,6 +1064,12 @@ qemuMonitorSend(qemuMonitorPtr mon,
 return -1;
 }
 
+if (mon->willhangup) {
+virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+   _("Domain is shutting down."));
+return -1;
+}
+
 mon->msg = msg;
 qemuMonitorUpdateWatch(mon);
 
@@ -1336,7 +1351,6 @@ qemuMonitorEmitShutdown(qemuMonitorPtr mon, 
virTristateBool guest)
 {
 int ret = -1;
 VIR_DEBUG("mon=%p guest=%u", mon, guest);
-mon->willhangup = 1;
 
 QEMU_MONITOR_CALLBACK(mon, ret, domainShutdown, mon->vm, guest);
 return ret;
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index 9805a33..30533ef 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -303,6 +303,8 @@ void qemuMonitorUnregister(qemuMonitorPtr mon)
 ATTRIBUTE_NONNULL(1);
 void qemuMonitorClose(qemuMonitorPtr mon);
 
+void qemuMonitorShutdown(qemuMonitorPtr mon);
+
 virErrorPtr qemuMonitorLastError(qemuMonitorPtr mon);
 
 int qemuMonitorSetCapabilities(qemuMonitorPtr mon);
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index ab81d65..824be86 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -618,6 +618,7 @@ qemuProcessShutdownOrReboot(virQEMUDriverPtr driver,
 virObjectUnref(vm);
 }
 } else {
+qemuMonitorShutdown(priv->mon);
 ignore_value(qemuProcessKill(vm, VIR_QEMU_PROCESS_KILL_NOWAIT));
 }
 }
-- 
1.8.3.1

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list