date:20150702

Re: [Qemu-devel] [PATCH v4 4/9] Add virt-v3 machine that uses GIC-500

2015-07-02 Thread Pavel Fedin

 Hi!

> I think you should leave out the software emulation part for this
> series and let's concentrate on getting this in first.

 Ok.

> Also, you should make sure you have an agreement with Shlomo about
> taking over his patches before doing so (maybe you are already?).

 Shlomo has posted his work here, doesn't this automatically mean that he is 
sharing it with the community? Additionally, i do not drop his authorship.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

Re: [Qemu-devel] [PATCH v3 00/16] Add an IPMI device to QEMU

2015-07-02 Thread Paolo Bonzini



On 09/06/2015 03:11, miny...@acm.org wrote:
> I have reworked the QEMU IPMI patches, and I believe this design
> improves greatly on the previous one.  The suggestions made all
> resulted in better design.  To start an IPMI interface
> with this, you would use:
> 
>   -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0
> 
> Note that I did not make the BMCs objects.  I started this way, but
> adding properties was ending up duplicating a lot of code already
> in the device core, especially for the chardev link.
> 
> I'm not sure of the importance of having a BMC be an object instead
> of a device, but it seems to me that if we do that, it would be
> best to add helper functions to the object core for the chardev
> and integer properties.

Hi, sorry for the delay.  I missed this because I left for vacation
exactly on the day it was posted.  I think it's okay to have it as a device.

Michael, are you going to merge it?  Otherwise, just ack the patches and
I will take care of it.  Time is ticking fast...

Paolo

Re: [Qemu-devel] [PATCH 02/18] virtio: run drivers in 32bit mode

2015-07-02 Thread Paolo Bonzini



On 29/06/2015 10:53, Gerd Hoffmann wrote:
> virtio version 1.0 registers can (and actually do in the qemu
> implementation) live in mmio space.  So we must run the blk and
> scsi virtio drivers in 32bit mode, otherwise we can't access them.
> 
> This also allows to drop a bunch of GET_LOWFLAT calls from the virtio
> code in the following patches.

This isn't really necessary though.  The config access capability exists
for this reason.  At least as a follow up it would be nice to add a switch.

Paolo

[Qemu-devel] quorum: validate vote threshold against num_children even if read-pattern is fifo

2015-07-02 Thread Wen Congyang

We need to use threshold to check if too many write operation fails.
If threshold is larger than num children, we always get write error
event even if all write operations success.

Signed-off-by: Wen Congyang 
---
 block/quorum.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/block/quorum.c b/block/quorum.c
index a7df17c..b0eead0 100644
--- a/block/quorum.c
+++ b/block/quorum.c
@@ -894,6 +894,12 @@ static int quorum_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 
 s->threshold = qemu_opt_get_number(opts, QUORUM_OPT_VOTE_THRESHOLD, 0);
+/* and validate it against s->num_children */
+ret = quorum_valid_threshold(s->threshold, s->num_children, &local_err);
+if (ret < 0) {
+goto exit;
+}
+
 ret = parse_read_pattern(qemu_opt_get(opts, QUORUM_OPT_READ_PATTERN));
 if (ret < 0) {
 error_setg(&local_err, "Please set read-pattern as fifo or quorum");
@@ -902,12 +908,6 @@ static int quorum_open(BlockDriverState *bs, QDict 
*options, int flags,
 s->read_pattern = ret;
 
 if (s->read_pattern == QUORUM_READ_PATTERN_QUORUM) {
-/* and validate it against s->num_children */
-ret = quorum_valid_threshold(s->threshold, s->num_children, 
&local_err);
-if (ret < 0) {
-goto exit;
-}
-
 /* is the driver in blkverify mode */
 if (qemu_opt_get_bool(opts, QUORUM_OPT_BLKVERIFY, false) &&
 s->num_children == 2 && s->threshold == 2) {
-- 
2.4.3

Re: [Qemu-devel] [PATCH 4/4] spapr: SPLPAR Characteristics

2015-07-02 Thread Bharata B Rao

On Fri, Jul 3, 2015 at 9:12 AM, Sam Bobroff  wrote:
> Improve the SPLPAR Characteristics information:
>
> Add MaxPlatProcs: set to max_cpus, the maximum CPUs that could be
> addded to the system.
> Add DesMem: set to the initial memory of the system.
> Add DesProcs: set to smp_cpus, the inital number of CPUs in the
> system.
>
> These tokens and values are specified by PAPR.
>
> Signed-off-by: Sam Bobroff 
> ---
>  hw/ppc/spapr_rtas.c | 11 +--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
> index 0f1ae55..a08254b 100644
> --- a/hw/ppc/spapr_rtas.c
> +++ b/hw/ppc/spapr_rtas.c
> @@ -33,6 +33,7 @@
>  #include "hw/ppc/spapr.h"
>  #include "hw/ppc/spapr_vio.h"
>  #include "qapi-event.h"
> +#include "hw/boards.h"
>
>  #include 
>
> @@ -190,8 +191,14 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU 
> *cpu,
>
>  switch (parameter) {
>  case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: {
> -char *param_val = g_strdup_printf("MaxEntCap=%d,MaxPlatProcs=%d",
> -  max_cpus, smp_cpus);
> +char *param_val = g_strdup_printf("MaxEntCap=%d,"
> +  "DesMem=%lu,"
> +  "DesProcs=%d,"
> +  "MaxPlatProcs=%d",
> +  max_cpus,
> +  current_machine->ram_size / 1024 / 
> 1024,

May be you could use M_BYTE definition from qemu-common.h ?

Regards,
Bharata.

Re: [Qemu-devel] [Qemu-ppc] [PATCH v3 0/4] target-ppc: Add FWNMI support in qemu for powerKVM guests

2015-07-02 Thread David Gibson

On Thu, Jul 02, 2015 at 07:11:52PM +1000, Alexey Kardashevskiy wrote:
> On 04/02/2015 03:46 PM, David Gibson wrote:
> >On Thu, Apr 02, 2015 at 03:28:11PM +1100, Alexey Kardashevskiy wrote:
> >>On 11/19/2014 04:48 PM, Aravinda Prasad wrote:
> >>>
> >>>
> >>>On Tuesday 11 November 2014 08:54 AM, David Gibson wrote:
> >>>
> >>>[..]
> >>>
> 
> So, this may not still be possible depending on whether the KVM side
> of this is already merged, but it occurs to me that there's a simpler
> way.
> 
> Rather than mucking about with having to update the hypervisor on the
> RTAS location, they have qemu copy the code out of RTAS, patch it and
> copy it back into the vector, you could instead do this:
> 
>    1. Make KVM instead of immediately delivering a 0x200 for a guest
> machine check, cause a special exit to qemu.
> 
>    2. Have the register-nmi RTAS call store the guest side MC handler
> address in the spapr structure, but perform no actual guest code
> patching.
> 
>    3. Allocate the error log buffer independently from the RTAS blob,
> so qemu always knows where it is.
> 
>    4. When qemu gets the MC exit condition, instead of going via a
> patched 0x200 vector, just directly set the guest register state and
> jump straight into the guest side MC handler.
> 
> >>>
> >>>Before I proceed further I would like to know what others think about
> >>>the approach proposed above (except for step 3 - as per PAPR the error
> >>>log buffer should be part of RTAS blob and hence we cannot have error
> >>>log buffer independent of RTAS blob).
> >>>
> >>>Alex, Alexey, Ben: Any thoughts?
> >>
> >>
> >>Any updates about FWNMI? Thanks
> >
> >Huh.. I'd completely forgotten about this.  Aravinda, can you repost
> >your latest work on this?
> 
> 
> Aravinda disappeared...

Ok, well someone who cares about FWNMI is going to have to start
sending something, or it won't happen.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


pgphT11sxHWRZ.pgp
Description: PGP signature

[Qemu-devel] [PATCH] axienet: Flush queued packets when rx is done

2015-07-02 Thread Fam Zheng

eth_can_rx checks s->rxsize and returns false if it is non-zero. Because
of the .can_receive semantics change, this will make the incoming queue
disabled by peer, until it is explicitly flushed. So we should flush it
when s->rxsize is becoming zero.

Do it by adding a BH that calls qemu_flush_queued_packets after
decrementing s->rxsize in axienet_eth_rx_notify. BH is necessary to
avoid too deep recursive call stack.

The other conditions, "!axienet_rx_resetting(s) &&
axienet_rx_enabled(s)" are OK because enet_write already calls
qemu_flush_queued_packets when the register bits are changed.

Signed-off-by: Fam Zheng 

---

Only tested building because I don't have a xilinx image to start a
guest.
---
 hw/net/xilinx_axienet.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/hw/net/xilinx_axienet.c b/hw/net/xilinx_axienet.c
index 9205770..bbc0ea8 100644
--- a/hw/net/xilinx_axienet.c
+++ b/hw/net/xilinx_axienet.c
@@ -28,6 +28,7 @@
 #include "net/checksum.h"
 
 #include "hw/stream.h"
+#include "qemu/main-loop.h"
 
 #define DPHY(x)
 
@@ -401,6 +402,8 @@ struct XilinxAXIEnet {
 
 uint8_t rxapp[CONTROL_PAYLOAD_SIZE];
 uint32_t rxappsize;
+
+QEMUBH *flush_bh;
 };
 
 static void axienet_rx_reset(XilinxAXIEnet *s)
@@ -681,8 +684,15 @@ static int enet_match_addr(const uint8_t *buf, uint32_t 
f0, uint32_t f1)
 return match;
 }
 
+static void xilinx_flush_cb(void *opaque)
+{
+XilinxAXIEnet *s = opaque;
+qemu_flush_queued_packets(qemu_get_queue(s->nic));
+}
+
 static void axienet_eth_rx_notify(void *opaque)
 {
+bool flush = false;
 XilinxAXIEnet *s = XILINX_AXI_ENET(opaque);
 
 while (s->rxappsize && stream_can_push(s->tx_control_dev,
@@ -701,9 +711,13 @@ static void axienet_eth_rx_notify(void *opaque)
 s->rxpos += ret;
 if (!s->rxsize) {
 s->regs[R_IS] |= IS_RX_COMPLETE;
+flush = true;
 }
 }
 enet_update_irq(s);
+if (flush) {
+qemu_bh_schedule(s->flush_bh);
+}
 }
 
 static ssize_t eth_rx(NetClientState *nc, const uint8_t *buf, size_t size)
@@ -967,6 +981,7 @@ static void xilinx_enet_realize(DeviceState *dev, Error 
**errp)
 s->TEMAC.parent = s;
 
 s->rxmem = g_malloc(s->c_rxmem);
+s->flush_bh = qemu_bh_new(xilinx_flush_cb, s);
 return;
 
 xilinx_enet_realize_fail:
@@ -975,6 +990,12 @@ xilinx_enet_realize_fail:
 }
 }
 
+static void xilinx_enet_unrealize(DeviceState *dev, Error **errp)
+{
+XilinxAXIEnet *s = XILINX_AXI_ENET(dev);
+qemu_bh_delete(s->flush_bh);
+}
+
 static void xilinx_enet_init(Object *obj)
 {
 XilinxAXIEnet *s = XILINX_AXI_ENET(obj);
@@ -1020,6 +1041,7 @@ static void xilinx_enet_class_init(ObjectClass *klass, 
void *data)
 DeviceClass *dc = DEVICE_CLASS(klass);
 
 dc->realize = xilinx_enet_realize;
+dc->unrealize = xilinx_enet_unrealize;
 dc->props = xilinx_enet_properties;
 dc->reset = xilinx_axienet_reset;
 }
-- 
2.4.3

Re: [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive

2015-07-02 Thread Fam Zheng

On Fri, 07/03 09:12, Fam Zheng wrote:
> On Thu, 07/02 18:46, Michael S. Tsirkin wrote:
> > On Thu, Jul 02, 2015 at 01:46:26PM +0100, Stefan Hajnoczi wrote:
> > > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote:
> > > > On 06/30/2015 11:06 AM, Fam Zheng wrote:
> > > > > virtio_net_receive still does the check by calling
> > > > > virtio_net_can_receive, if the device or driver is not ready, the 
> > > > > packet
> > > > > is dropped.
> > > > >
> > > > > This is necessary because returning false from can_receive complicates
> > > > > things: the peer would disable sending until we explicitly flush the
> > > > > queue.
> > > > >
> > > > > Signed-off-by: Fam Zheng 
> > > > > ---
> > > > >  hw/net/virtio-net.c | 1 -
> > > > >  1 file changed, 1 deletion(-)
> > > > >
> > > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > > index d728233..dbef0d0 100644
> > > > > --- a/hw/net/virtio-net.c
> > > > > +++ b/hw/net/virtio-net.c
> > > > > @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice 
> > > > > *vdev, QEMUFile *f,
> > > > >  static NetClientInfo net_virtio_info = {
> > > > >  .type = NET_CLIENT_OPTIONS_KIND_NIC,
> > > > >  .size = sizeof(NICState),
> > > > > -.can_receive = virtio_net_can_receive,
> > > > >  .receive = virtio_net_receive,
> > > > >  .link_status_changed = virtio_net_set_link_status,
> > > > >  .query_rx_filter = virtio_net_query_rxfilter,
> > > > 
> > > > A side effect of this patch is it will read and then drop packet is
> > > > guest driver is no ok.
> > > 
> > > I think that the semantics of .can_receive() and .receive() return
> > > values are currently incorrect in many NICs.  They have .can_receive()
> > > functions that return false for conditions where .receive() would
> > > discard the packet.  So what happens is that packets get queued when
> > > they should actually be discarded.
> > > 
> > > The purpose of the flow control (queuing) mechanism is to tell the
> > > sender to hold off until the receiver has more rx buffers available.
> > > It's a short-term thing that doesn't included link down, rx disable, or
> > > NIC reset states.
> > > 
> > > Therefore, I think this patch will not introduce a regression.  It is
> > > adjusting the code to stop queuing packets when they should actually be
> > > dropped.
> > > 
> > > Thoughts?
> > > 
> > > Reviewed-by: Stefan Hajnoczi 
> > 
> > OK, but we do have transient out of buffers states too, and
> > virtio_net_can_receive checks these as well.

I'm a bit confused, do you mean virtio_queue_ready()? That is buffer presence
check, and the free buffer check virtio_net_has_buffers() is only in
virtio_net_receive(). Am I missing things?

Fam

> > 
> > To me, it looks like we should change virtio_net_can_receive to
> > avoid checking link down state, etc.
> > But I don't see why we should remove it completely.
> 
> OK, it makse sense to check buffer state.
> 
> Fam
>

[Qemu-devel] [PATCH 2/4] spapr: Add /rtas/ibm,change-msix-capable

2015-07-02 Thread Sam Bobroff

QEMU is MSI-X capable and makes it available via ibm,change-msi, so
we should indicate this by adding /rtas/ibm,change-msix-capable to the
device tree.

This is specificed by PAPR.

Signed-off-by: Sam Bobroff 
---
 hw/ppc/spapr.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 603f517..6801724 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -543,6 +543,10 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 
 _FDT((fdt_property_cell(fdt, "rtas-error-log-max", RTAS_ERROR_LOG_MAX)));
 
+if (msi_supported) {
+_FDT((fdt_property(fdt, "ibm,change-msix-capable", NULL, 0)));
+}
+
 /*
  * According to PAPR, rtas ibm,os-term does not guarantee a return
  * back to the guest cpu.
-- 
2.1.4

[Qemu-devel] [PATCH 4/4] spapr: SPLPAR Characteristics

2015-07-02 Thread Sam Bobroff

Improve the SPLPAR Characteristics information:

Add MaxPlatProcs: set to max_cpus, the maximum CPUs that could be
addded to the system.
Add DesMem: set to the initial memory of the system.
Add DesProcs: set to smp_cpus, the inital number of CPUs in the
system.

These tokens and values are specified by PAPR.

Signed-off-by: Sam Bobroff 
---
 hw/ppc/spapr_rtas.c | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr_rtas.c b/hw/ppc/spapr_rtas.c
index 0f1ae55..a08254b 100644
--- a/hw/ppc/spapr_rtas.c
+++ b/hw/ppc/spapr_rtas.c
@@ -33,6 +33,7 @@
 #include "hw/ppc/spapr.h"
 #include "hw/ppc/spapr_vio.h"
 #include "qapi-event.h"
+#include "hw/boards.h"
 
 #include 
 
@@ -190,8 +191,14 @@ static void rtas_ibm_get_system_parameter(PowerPCCPU *cpu,
 
 switch (parameter) {
 case RTAS_SYSPARM_SPLPAR_CHARACTERISTICS: {
-char *param_val = g_strdup_printf("MaxEntCap=%d,MaxPlatProcs=%d",
-  max_cpus, smp_cpus);
+char *param_val = g_strdup_printf("MaxEntCap=%d,"
+  "DesMem=%lu,"
+  "DesProcs=%d,"
+  "MaxPlatProcs=%d",
+  max_cpus,
+  current_machine->ram_size / 1024 / 
1024,
+  smp_cpus,
+  max_cpus);
 rtas_st_buffer(buffer, length, (uint8_t *)param_val, 
strlen(param_val));
 g_free(param_val);
 break;
-- 
2.1.4

[Qemu-devel] [PATCH 0/4] spapr: Small PAPR compliance fixes

2015-07-02 Thread Sam Bobroff


This patch set contains several small fixes to make QEMU more PAPR
compliant.


Sam Bobroff (4):
  spapr: Add /ibm,partition-name
  spapr: Add /rtas/ibm,change-msix-capable
  spapr: Make ibm,change-msi respect 3 return values
  spapr: SPLPAR Characteristics

 hw/ppc/spapr.c  |  9 +
 hw/ppc/spapr_pci.c  |  4 +++-
 hw/ppc/spapr_rtas.c | 11 +--
 3 files changed, 21 insertions(+), 3 deletions(-)

-- 
2.1.4

[Qemu-devel] [PATCH 1/4] spapr: Add /ibm,partition-name

2015-07-02 Thread Sam Bobroff

QEMU has a notion of the guest name, so if it's present we might as
well put that into the device tree as /ibm,partition-name.

This is specificed by PAPR.

Signed-off-by: Sam Bobroff 
---
 hw/ppc/spapr.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 0487f52..603f517 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -384,6 +384,11 @@ static void *spapr_create_fdt_skel(hwaddr initrd_base,
 _FDT((fdt_property_string(fdt, "vm,uuid", buf)));
 g_free(buf);
 
+if (qemu_get_vm_name()) {
+_FDT((fdt_property_string(fdt, "ibm,partition-name",
+  qemu_get_vm_name(;
+}
+
 _FDT((fdt_property_cell(fdt, "#address-cells", 0x2)));
 _FDT((fdt_property_cell(fdt, "#size-cells", 0x2)));
 
-- 
2.1.4

[Qemu-devel] [PATCH 3/4] spapr: Make ibm, change-msi respect 3 return values

2015-07-02 Thread Sam Bobroff

Currently, rtas_ibm_change_msi() always returns four values even if
less are specified.

Correct this by only returning the fourth parameter if it was
requested.

This is specified by PAPR.

Signed-off-by: Sam Bobroff 
---
 hw/ppc/spapr_pci.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index 05f4fac..f422e3b 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -360,7 +360,9 @@ out:
 rtas_st(rets, 0, RTAS_OUT_SUCCESS);
 rtas_st(rets, 1, req_num);
 rtas_st(rets, 2, ++seq_num);
-rtas_st(rets, 3, ret_intr_type);
+if (nret > 3) {
+rtas_st(rets, 3, ret_intr_type);
+}
 
 trace_spapr_pci_rtas_ibm_change_msi(config_addr, func, req_num, irq);
 }
-- 
2.1.4

Re: [Qemu-devel] [v8][RESEND][PATCH 00/10] xen: add Intel IGD passthrough

2015-07-02 Thread Chen, Tiejun


On 2015/7/2 0:03, Stefano Stabellini wrote:

Aside from a couple of really minor stylistic issues, I think the patch


Thanks for your review.


series can go in from my point of view. However it looks like you are
still missing a few acked-by/reviewed-by on the non-xen patches.


Just yesterday Michael Acked them :)

Note once you're fine to my replies to your previous comments, I'd like 
to rebase them on the latest and then send out the last revision adding 
Acked-by with you and Michael.


Thanks
Tiejun



On Wed, 1 Jul 2015, Chen, Tiejun wrote:

Ping...

Thanks
Tiejun

On 2015/6/5 16:44, Tiejun Chen wrote:

v8:

* Rebase on the latest qemu tree
* Cleanup one xen leftover in patch #3

v7:

* Instead of "-gfx_passthru" we'd like to make that a machine
option, "-machine xxx,igd-passthru=on""
* try to make something as common shared by others like KvmGT in
the future
* Just read those real value from host bridge pci
configuration space when create host bridge then put in dev->config.

v6:

* Drop introducing a new machine specific to IGD passthrough
* Try to share some codes from KVM stuff in qemu to retrive VGA BIOS
* Currently IGD drivers always need to access PCH by 1f.0. But we
don't want to poke that directly to get ID, and although in real
world different GPU should have different PCH. But actually the
different PCH DIDs likely map to different PCH SKUs. We do the
same thing for the GPU. For PCH, the different SKUs are going to
be all the same silicon design and implementation, just different
features turn on and off with fuses. The SW interfaces should be
consistent across all SKUs in a given family (eg LPT). But just
same features may not be supported.

Most of these different PCH features probably don't matter to the
Gfx driver, but obviously any difference in display port connections
will so it should be fine with any PCH in case of passthrough.

So currently use one PCH version, 0x8c4e, to cover all HSW(Haswell)
scenarios, 0x9cc3 for BDW(Broadwell).
* Drop igd write ops since its fine to emulate that, and we also shrink
those igd read ops as necessary.
* Rebase and cleanup all patches.

v5:

* Simplify to make sure its really inherited from the standard one in patch
#3
* Then drop the original patch #3

v4:

* Rebase on latest tree
* Drop patch #2
* Regenerate patches after Michael introduce patch #1
* We need to use this pci_type as a index to reuse I440FX_PCI_DEVICE()
* Test: boot with a preinstalled winxp
./i386-softmmu/qemu-system-i386 -hda winxp-32.img -m 2560 -boot c
-machine pc

v3:

* Drop patch #4
* Add one patch #1 from Michael
* Rebase
* In./i386-softmmu/qemu-system-i386 -hda test.img -m 2560 -boot c -machine
pc

v2:

* Fix some coding style
* New patch to separate i440fx_init
* Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough
* Based on patch #2 to regenerate
* Unify prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough like patch #3
* Test: boot with a preinstalled ubuntu 14.04
./i386-softmmu/qemu-system-i386 -hda test.img -m 2560 -boot c -machine pc

As we discussed we need to create a separate machine to support current
IGD passthrough.


Michael S. Tsirkin (1):
i440fx: make types configurable at run-time

Tiejun Chen (9):
pc_init1: pass parameters just with types
piix: create host bridge to passthrough
hw/pci-assign: split pci-assign.c
xen, gfx passthrough: basic graphics passthrough support
xen, gfx passthrough: retrieve VGA BIOS to work
igd gfx passthrough: create a isa bridge
xen, gfx passthrough: register a isa bridge
xen, gfx passthrough: register host bridge specific to passthrough
xen, gfx passthrough: add opregion mapping

   hw/core/machine.c |  20 +++
   hw/i386/Makefile.objs |   1 +
   hw/i386/kvm/pci-assign.c  |  82 +-
   hw/i386/pc_piix.c | 151 ++-
   hw/i386/pci-assign-load-rom.c |  93 
   hw/pci-host/piix.c|  91 +++-
   hw/xen/Makefile.objs  |   1 +
   hw/xen/xen-host-pci-device.c  |   5 +
   hw/xen/xen-host-pci-device.h  |   1 +
   hw/xen/xen_pt.c   |  32 
   hw/xen/xen_pt.h   |  21 ++-
   hw/xen/xen_pt_config_init.c   |  51 ++-
   hw/xen/xen_pt_graphics.c  | 272 ++
   include/hw/boards.h   |   1 +
   include/hw/i386/pc.h  |   8 +-
   include/hw/pci/pci-assign.h   |  27 
   include/hw/xen/xen.h  |   1 +
   qemu-options.hx   |   3 +
   vl.c  |  10 ++
   19 files changed, 780 insertions(+), 91 deletions(-)
   create mode 100644 hw/i386/pci-assign-load-rom.c
   create mode 100644 hw/xen/xen_pt_graphics.c
   create mode 100644 include/hw/pci/pci-assign.h

Thanks
Tiejun

[Qemu-devel] [Bug 1469946] Re: guest can't get IP when create guest with bridge.

2015-07-02 Thread chao zhou

Since commit 6e99c63 "net/socket: Drop net_socket_can_send" and friends,
net queues need to be explicitly flushed after qemu_can_send_packet()
returns false, because the netdev side will disable the polling of fd.

This fixes the case of "cont" after "stop" (or migration), i.e.
vm_running changes to true, by listening to vm state changes.

Signed-off-by: Fam Zheng 
---
 include/net/net.h |  2 ++
 net/net.c | 14 +-
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/include/net/net.h b/include/net/net.h index 6a6cbef..619a6e1 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -8,6 +8,7 @@
 #include "net/queue.h"
 #include "migration/vmstate.h"
 #include "qapi-types.h"
+#include "sysemu/sysemu.h"
 
 #define MAX_QUEUE_NUM 1024
 
@@ -92,6 +93,7 @@ struct NetClientState {
 NetClientDestructor *destructor;
 unsigned int queue_index;
 unsigned rxfilter_notify_enabled:1;
+VMChangeStateEntry *vmcse;
 };
 
 typedef struct NICState {
diff --git a/net/net.c b/net/net.c
index 6ff7fec..edfa6a0 100644
--- a/net/net.c
+++ b/net/net.c
@@ -43,7 +43,6 @@
 #include "qapi-visit.h"
 #include "qapi/opts-visitor.h"
 #include "qapi/dealloc-visitor.h"
-#include "sysemu/sysemu.h"
 
 /* Net bridge is currently not supported for W32. */  #if !defined(_WIN32) @@ 
-263,6 +262,16 @@ static void qemu_net_client_destructor(NetClientState *nc)
 g_free(nc);
 }
 
+static void qemu_net_client_handle_vmstate(void *opaque,
+   int running,
+   RunState state) {
+NetClientState *nc = opaque;
+if (running && qemu_can_send_packet(nc) && nc->peer) {
+qemu_flush_queued_packets(nc->peer);
+}
+}
+
 static void qemu_net_client_setup(NetClientState *nc,
   NetClientInfo *info,
   NetClientState *peer, @@ -287,6 +296,8 @@ 
static void qemu_net_client_setup(NetClientState *nc,
 
 nc->incoming_queue = qemu_new_net_queue(nc);
 nc->destructor = destructor;
+nc->vmcse = 
qemu_add_vm_change_state_handler(qemu_net_client_handle_vmstate,
+ nc);
 }
 
 NetClientState *qemu_new_net_client(NetClientInfo *info, @@ -395,6 +406,7 @@ 
void qemu_del_net_client(NetClientState *nc)
   MAX_QUEUE_NUM);
 assert(queues != 0);
 
+qemu_del_vm_change_state_handler(nc->vmcse);
 /* If there is a peer NIC, delete and cleanup client, but do not free. */
 if (nc->peer && nc->peer->info->type == NET_CLIENT_OPTIONS_KIND_NIC) {
 NICState *nic = qemu_get_nic(nc->peer);
--
2.4.3


after try this patch and 
http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg07377.html  with 
qemu.git d2966f804d70a244f5dde395fc5d22a50ed3e74e
 after save/retore or live migration, the guest is alive,  ping or ssh guest's 
IP , it is fine

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1469946

Title:
  guest can't get IP when create guest with bridge.

Status in QEMU:
  New

Bug description:
  Environment:
  
  Host OS (ia32/ia32e/IA64):ia32e
  Guest OS (ia32/ia32e/IA64):ia32e
  Guest OS Type (Linux/Windows):linux
  kvm.git Commit:aefbef10e3ae6e2c6e3c54f906f10b34c73a2c66
  qemu.git Commit:dc1e1350f8061021df765b396295329797d66933
  Host Kernel Version:4.1.0
  Hardware:Ivytown_EP, Haswell_EP

  
  Bug detailed description:
  --
  when create guest with bridge, the guest can not get ip.

  note:
  1. fail rate: 3/5
  2. this is a qemu bug:
  kvm  + qemu   =  result
  aefbef10 + dc1e1350=  bad 
  aefbef10 + a4ef02fd   =  good

  Reproduce steps:
  
  1. create guest:
  qemu-system-x86_64 -enable-kvm -m 2G -smp 4 -device 
virtio-net-pci,netdev=net0,mac=$random_mac -netdev 
tap,id=net0,script=/etc/kvm/qemu-ifup rhel6u5.qcow

  Current result:
  
  guest can't get IP

  Expected result:
  
  guest can get ip

  Basic root-causing log:
  --

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1469946/+subscriptions

Re: [Qemu-devel] [v8][RESEND][PATCH 09/10] xen, gfx passthrough: register host bridge specific to passthrough

2015-07-02 Thread Chen, Tiejun


  >>   #ifdef CONFIG_XEN

+static void igd_passthrough_pc_init_pci(MachineState *machine)
+{
+pc_init1(machine,
+ TYPE_I440FX_PCI_HOST_BRIDGE,
+ TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE);
+}
+
+static void pc_init_pci(MachineState *machine)
+{
+pc_init1(machine,
+ TYPE_I440FX_PCI_HOST_BRIDGE,
+ TYPE_I440FX_PCI_DEVICE);
+}
+
+static void pc_xen_hvm_init_pci(MachineState *machine)
+{
+if (has_igd_gfx_passthru)
+igd_passthrough_pc_init_pci(machine);
+else
+pc_init_pci(machine);
+}


I don't see any value in introducing pc_init_pci and
igd_passthrough_pc_init_pci.  I would expand both of them here.



Agree, and what about this?

static void pc_xen_hvm_init_pci(MachineState *machine)
{
const char *pci_type = has_igd_gfx_passthru ?
TYPE_IGD_PASSTHROUGH_I440FX_PCI_DEVICE : 
TYPE_I440FX_PCI_DEVICE;


pc_init1(machine,
 TYPE_I440FX_PCI_HOST_BRIDGE,
 pci_type);
}

Thanks
Tiejun

Re: [Qemu-devel] [PULL 23/27] migration: create migration event

2015-07-02 Thread Wen Congyang

On 07/03/2015 12:10 AM, Juan Quintela wrote:
> We have one argument that tells us what event has happened.

This patch brokes 'make check'

Thanks
Wen Congyang

> 
> Signed-off-by: Juan Quintela 
> Reviewed-by: Eric Blake 
> ---
>  docs/qmp/qmp-events.txt | 14 ++
>  migration/migration.c   |  2 ++
>  qapi/event.json | 12 
>  3 files changed, 28 insertions(+)
> 
> diff --git a/docs/qmp/qmp-events.txt b/docs/qmp/qmp-events.txt
> index 4c13d48..d92cc48 100644
> --- a/docs/qmp/qmp-events.txt
> +++ b/docs/qmp/qmp-events.txt
> @@ -473,6 +473,20 @@ Example:
>  { "timestamp": {"seconds": 1290688046, "microseconds": 417172},
>"event": "SPICE_MIGRATE_COMPLETED" }
> 
> +MIGRATION
> +-
> +
> +Emitted when a migration event happens
> +
> +Data: None.
> +
> + - "status": migration status
> + See MigrationStatus in ~/qapi-schema.json for possible values
> +
> +Example:
> +
> +{"timestamp": {"seconds": 1432121972, "microseconds": 744001},
> + "event": "MIGRATION", "data": {"status": "completed"}}
> 
>  STOP
>  
> diff --git a/migration/migration.c b/migration/migration.c
> index ffaa5c8..d8415c4 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -27,6 +27,7 @@
>  #include "qmp-commands.h"
>  #include "trace.h"
>  #include "qapi/util.h"
> +#include "qapi-event.h"
> 
>  #define MAX_THROTTLE  (32 << 20)  /* Migration speed throttling */
> 
> @@ -510,6 +511,7 @@ void qmp_migrate_set_parameters(bool has_compress_level,
>  static void migrate_set_state(MigrationState *s, int old_state, int 
> new_state)
>  {
>  if (atomic_cmpxchg(&s->state, old_state, new_state) == old_state) {
> +qapi_event_send_migration(new_state, &error_abort);
>  trace_migrate_set_state(new_state);
>  }
>  }
> diff --git a/qapi/event.json b/qapi/event.json
> index 378dda5..f0cef01 100644
> --- a/qapi/event.json
> +++ b/qapi/event.json
> @@ -243,6 +243,18 @@
>  { 'event': 'SPICE_MIGRATE_COMPLETED' }
> 
>  ##
> +# @MIGRATION
> +#
> +# Emitted when a migration event happens
> +#
> +# @status: @MigrationStatus describing the current migration status.
> +#
> +# Since: 2.4
> +##
> +{ 'event': 'MIGRATION',
> +  'data': {'status': 'MigrationStatus'}}
> +
> +##
>  # @ACPI_DEVICE_OST
>  #
>  # Emitted when guest executes ACPI _OST method.
>

Re: [Qemu-devel] [v8][RESEND][PATCH 08/10] xen, gfx passthrough: register a isa bridge

2015-07-02 Thread Chen, Tiejun


+static void
+xen_igd_passthrough_isa_bridge_create(XenPCIPassthroughState *s,
+  XenHostPCIDevice *dev)
+{
+uint16_t gpu_dev_id;
+PCIDevice *d = &s->dev;
+
+if (!is_igd_vga_passthrough(dev)) {
+return;
+}


I would rather move the if into xen_pt_initfn, otherwise reading
xen_pt_initfn it looks like we are going to create an isa bridge
regardless.


This makes sense.





+gpu_dev_id = dev->device_id;
+igd_passthrough_isa_bridge_create(d->bus, gpu_dev_id);
+}
+
  /* init */

  static int xen_pt_initfn(PCIDevice *d)
@@ -725,6 +740,9 @@ static int xen_pt_initfn(PCIDevice *d)


I'd like to add something like,

if (!is_igd_vga_passthrough(&s->real_device)) {
XEN_PT_ERR(d, "Need to enable igd-passthru if you're trying"
   " to passthrough IGD GFX.\n");
xen_host_pci_device_put(&s->real_device); 

return -1; 


}


  xen_host_pci_device_put(&s->real_device);
  return -1;
  }
+
+/* Register ISA bridge for passthrough GFX. */
+xen_igd_passthrough_isa_bridge_create(s, &s->real_device);
  }

  /* Handle real device's MMIO/PIO BARs */
diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index 4356af4..703148e 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -51,4 +51,5 @@ void xen_register_framebuffer(struct MemoryRegion *mr);
  #  define HVM_MAX_VCPUS 32
  #endif

+extern void igd_passthrough_isa_bridge_create(PCIBus *bus, uint16_t 
gpu_dev_id);
  #endif /* QEMU_HW_XEN_H */


Either static or extern. You probably want to drop this declaration.



I just guess you're confused between igd_passthrough_isa_bridge_create() 
and xen_igd_passthrough_isa_bridge_create(), or am I wrong?


Note igd_passthrough_isa_bridge_create() is defined in the pc_piix.c file.

Thanks
Tiejun

Re: [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive

2015-07-02 Thread Fam Zheng

On Thu, 07/02 18:46, Michael S. Tsirkin wrote:
> On Thu, Jul 02, 2015 at 01:46:26PM +0100, Stefan Hajnoczi wrote:
> > On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote:
> > > On 06/30/2015 11:06 AM, Fam Zheng wrote:
> > > > virtio_net_receive still does the check by calling
> > > > virtio_net_can_receive, if the device or driver is not ready, the packet
> > > > is dropped.
> > > >
> > > > This is necessary because returning false from can_receive complicates
> > > > things: the peer would disable sending until we explicitly flush the
> > > > queue.
> > > >
> > > > Signed-off-by: Fam Zheng 
> > > > ---
> > > >  hw/net/virtio-net.c | 1 -
> > > >  1 file changed, 1 deletion(-)
> > > >
> > > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > > index d728233..dbef0d0 100644
> > > > --- a/hw/net/virtio-net.c
> > > > +++ b/hw/net/virtio-net.c
> > > > @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice 
> > > > *vdev, QEMUFile *f,
> > > >  static NetClientInfo net_virtio_info = {
> > > >  .type = NET_CLIENT_OPTIONS_KIND_NIC,
> > > >  .size = sizeof(NICState),
> > > > -.can_receive = virtio_net_can_receive,
> > > >  .receive = virtio_net_receive,
> > > >  .link_status_changed = virtio_net_set_link_status,
> > > >  .query_rx_filter = virtio_net_query_rxfilter,
> > > 
> > > A side effect of this patch is it will read and then drop packet is
> > > guest driver is no ok.
> > 
> > I think that the semantics of .can_receive() and .receive() return
> > values are currently incorrect in many NICs.  They have .can_receive()
> > functions that return false for conditions where .receive() would
> > discard the packet.  So what happens is that packets get queued when
> > they should actually be discarded.
> > 
> > The purpose of the flow control (queuing) mechanism is to tell the
> > sender to hold off until the receiver has more rx buffers available.
> > It's a short-term thing that doesn't included link down, rx disable, or
> > NIC reset states.
> > 
> > Therefore, I think this patch will not introduce a regression.  It is
> > adjusting the code to stop queuing packets when they should actually be
> > dropped.
> > 
> > Thoughts?
> > 
> > Reviewed-by: Stefan Hajnoczi 
> 
> OK, but we do have transient out of buffers states too, and
> virtio_net_can_receive checks these as well.
> 
> To me, it looks like we should change virtio_net_can_receive to
> avoid checking link down state, etc.
> But I don't see why we should remove it completely.

OK, it makse sense to check buffer state.

Fam

Re: [Qemu-devel] [PATCH COLO-BLOCK v7 00/17] Block replication for continuous checkpoints

2015-07-02 Thread Wen Congyang

On 07/02/2015 10:47 PM, Michael R. Hines wrote:
> Is this up to date:
> 
> On 06/29/2015 10:34 PM, Wen Congyang wrote:
>> Block replication is a very important feature which is used for
>> continuous checkpoints(for example: COLO).
>>
>> Usage:
>> Please refer to docs/block-replication.txt
>>
>> You can get the patch here:
>> https://github.com/wencongyang/qemu-colo/commits/block-replication-v7
>>
>> You can get ths patch with framework here:
>> https://github.com/wencongyang/qemu-colo/commits/colo_framework_v7.2
>>
>> TODO:
>> 1. Continuous block replication. It will be started after basic functions
>> are accepted.
>>
>> Changs Log:
>> V7:
>> 1. Implement adding/removing quorum child. Remove the option non-connect.
>> 2. Simplify the backing refrence option according to Stefan Hajnoczi's 
>> suggestion
>> V6:
>> 1. Rebase to the newest qemu.
>> V5:
>> 1. Address the comments from Gong Lei
>> 2. Speed the failover up. The secondary vm can take over very quickly even
>> if there are too many I/O requests.
>> V4:
>> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
>> V3:
>> 1: use error_setg() instead of error_set()
>> 2. Add a new block job API
>> 3. Active disk, hidden disk and nbd target uses the same AioContext
>> 4. Add a testcase to test new hbitmap API
>> V2:
>> 1. Redesign the secondary qemu(use image-fleecing)
>> 2. Use Error objects to return error message
>> 3. Address the comments from Max Reitz and Eric Blake
>>
>>
>> Wen Congyang (17):
>>Add new block driver interface to add/delete a BDS's child
>>quorum: implement block driver interfaces add/delete a BDS's child
>>hmp: add monitor command to add/remove a child
>>introduce a new API qemu_opts_absorb_qdict_by_index()
>>quorum: allow ignoring child errors
>>introduce a new API to enable/disable attach device model
>>introduce a new API to check if blk is attached
>>block: make bdrv_put_ref_bh_schedule() as a public API
>>Backup: clear all bitmap when doing block checkpoint
>>allow writing to the backing file
>>Allow creating backup jobs when opening BDS
>>block: Allow references for backing files
>>docs: block replication's description
>>Add new block driver interfaces to control block replication
>>skip nbd_target when starting block replication
>>quorum: implement block driver interfaces for block replication
>>Implement new driver for block replication
>>
>>   block.c| 198 +-
>>   block/Makefile.objs|   3 +-
>>   block/backup.c |  13 ++
>>   block/block-backend.c  |  33 +++
>>   block/quorum.c | 244 ++-
>>   block/replication.c| 443 
>> +
>>   blockdev.c |  90 ++---
>>   blockjob.c |  10 +
>>   docs/block-replication.txt | 179 +
>>   hmp-commands.hx|  28 +++
>>   include/block/block.h  |  11 +
>>   include/block/block_int.h  |  19 ++
>>   include/block/blockjob.h   |  12 ++
>>   include/qemu/option.h  |   2 +
>>   include/sysemu/block-backend.h |   3 +
>>   include/sysemu/blockdev.h  |   2 +
>>   qapi/block.json|  16 ++
>>   util/qemu-option.c |  44 
>>   18 files changed, 1303 insertions(+), 47 deletions(-)
>>   create mode 100644 block/replication.c
>>   create mode 100644 docs/block-replication.txt
>>
> 
> Is this update to date with the wiki the documentation?
> 
> https://github.com/wencongyang/qemu-colo/commits/block-replication-v7
> 
> I just want to test this patchset, rather than the whole colo patcheset.

I forgot to update the patch 13. So please setup the envrionment according to 
the wiki.

Thanks
Wen Congyang

> 
> - Michael
> 
> .
>

Re: [Qemu-devel] [PATCH COLO-BLOCK v7 00/17] Block replication for continuous checkpoints

2015-07-02 Thread Wen Congyang

On 07/02/2015 10:59 PM, Michael R. Hines wrote:
> On 06/29/2015 10:34 PM, Wen Congyang wrote:
>> Block replication is a very important feature which is used for
>> continuous checkpoints(for example: COLO).
>>
>> Usage:
>> Please refer to docs/block-replication.txt
>>
>> You can get the patch here:
>> https://github.com/wencongyang/qemu-colo/commits/block-replication-v7
>>
>> You can get ths patch with framework here:
>> https://github.com/wencongyang/qemu-colo/commits/colo_framework_v7.2
>>
>> TODO:
>> 1. Continuous block replication. It will be started after basic functions
>> are accepted.
>>
>> Changs Log:
>> V7:
>> 1. Implement adding/removing quorum child. Remove the option non-connect.
>> 2. Simplify the backing refrence option according to Stefan Hajnoczi's 
>> suggestion
>> V6:
>> 1. Rebase to the newest qemu.
>> V5:
>> 1. Address the comments from Gong Lei
>> 2. Speed the failover up. The secondary vm can take over very quickly even
>> if there are too many I/O requests.
>> V4:
>> 1. Introduce a new driver replication to avoid touch nbd and qcow2.
>> V3:
>> 1: use error_setg() instead of error_set()
>> 2. Add a new block job API
>> 3. Active disk, hidden disk and nbd target uses the same AioContext
>> 4. Add a testcase to test new hbitmap API
>> V2:
>> 1. Redesign the secondary qemu(use image-fleecing)
>> 2. Use Error objects to return error message
>> 3. Address the comments from Max Reitz and Eric Blake
>>
>>
>> Wen Congyang (17):
>>Add new block driver interface to add/delete a BDS's child
>>quorum: implement block driver interfaces add/delete a BDS's child
>>hmp: add monitor command to add/remove a child
>>introduce a new API qemu_opts_absorb_qdict_by_index()
>>quorum: allow ignoring child errors
>>introduce a new API to enable/disable attach device model
>>introduce a new API to check if blk is attached
>>block: make bdrv_put_ref_bh_schedule() as a public API
>>Backup: clear all bitmap when doing block checkpoint
>>allow writing to the backing file
>>Allow creating backup jobs when opening BDS
>>block: Allow references for backing files
>>docs: block replication's description
>>Add new block driver interfaces to control block replication
>>skip nbd_target when starting block replication
>>quorum: implement block driver interfaces for block replication
>>Implement new driver for block replication
>>
>>   block.c| 198 +-
>>   block/Makefile.objs|   3 +-
>>   block/backup.c |  13 ++
>>   block/block-backend.c  |  33 +++
>>   block/quorum.c | 244 ++-
>>   block/replication.c| 443 
>> +
>>   blockdev.c |  90 ++---
>>   blockjob.c |  10 +
>>   docs/block-replication.txt | 179 +
>>   hmp-commands.hx|  28 +++
>>   include/block/block.h  |  11 +
>>   include/block/block_int.h  |  19 ++
>>   include/block/blockjob.h   |  12 ++
>>   include/qemu/option.h  |   2 +
>>   include/sysemu/block-backend.h |   3 +
>>   include/sysemu/blockdev.h  |   2 +
>>   qapi/block.json|  16 ++
>>   util/qemu-option.c |  44 
>>   18 files changed, 1303 insertions(+), 47 deletions(-)
>>   create mode 100644 block/replication.c
>>   create mode 100644 docs/block-replication.txt
>>
> 
> Can you move the interfaces for 
> blk_start_replication()/blk_stop_replication/blk_do_checkpoint() to this 
> patch series
> instead of the COLO patch series?
> 
> I don't think those functions are specific to COLO --- they should be 
> available to all users.

OK, I will do it in the next version.

Thanks
Wen Congyang

> 
> - Michael
> 
> .
>

[Qemu-devel] [PATCH v2] ne2000: Drop ne2000_can_receive

2015-07-02 Thread Fam Zheng

This moves the behavior of ne2000_can_receive to ne2000_receive. The
logic is when the NIC is stopped we drop the packet, when the buffer is
full we queue it and try flush later.

ne2000_buffer_full is determined by s->curpag, s->boundary, s->start and
s->stop. Add a flush in ne2000_ioport_write as they are all updated
there, except the advancing of s->curpag in ne2000_receive where
ne2000_buffer_full is already false.

Signed-off-by: Fam Zheng 
---
 hw/net/ne2000-isa.c |  1 -
 hw/net/ne2000.c | 27 ---
 2 files changed, 16 insertions(+), 12 deletions(-)

diff --git a/hw/net/ne2000-isa.c b/hw/net/ne2000-isa.c
index 17e7199..18b0644 100644
--- a/hw/net/ne2000-isa.c
+++ b/hw/net/ne2000-isa.c
@@ -44,7 +44,6 @@ typedef struct ISANE2000State {
 static NetClientInfo net_ne2000_isa_info = {
 .type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
-.can_receive = ne2000_can_receive,
 .receive = ne2000_receive,
 };
 
diff --git a/hw/net/ne2000.c b/hw/net/ne2000.c
index 3492db3..182e7ca 100644
--- a/hw/net/ne2000.c
+++ b/hw/net/ne2000.c
@@ -165,15 +165,6 @@ static int ne2000_buffer_full(NE2000State *s)
 return 0;
 }
 
-int ne2000_can_receive(NetClientState *nc)
-{
-NE2000State *s = qemu_get_nic_opaque(nc);
-
-if (s->cmd & E8390_STOP)
-return 1;
-return !ne2000_buffer_full(s);
-}
-
 #define MIN_BUF_SIZE 60
 
 ssize_t ne2000_receive(NetClientState *nc, const uint8_t *buf, size_t size_)
@@ -190,8 +181,12 @@ ssize_t ne2000_receive(NetClientState *nc, const uint8_t 
*buf, size_t size_)
 printf("NE2000: received len=%d\n", size);
 #endif
 
-if (s->cmd & E8390_STOP || ne2000_buffer_full(s))
+if (s->cmd & E8390_STOP) {
 return -1;
+}
+if (ne2000_buffer_full(s)) {
+return 0;
+}
 
 /* XXX: check this */
 if (s->rxcr & 0x10) {
@@ -277,6 +272,7 @@ static void ne2000_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 {
 NE2000State *s = opaque;
 int offset, page, index;
+bool try_flush = false;
 
 addr &= 0xf;
 #ifdef DEBUG_NE2000
@@ -309,19 +305,25 @@ static void ne2000_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 s->cmd &= ~E8390_TRANS;
 ne2000_update_irq(s);
 }
+} else {
+try_flush = true;
 }
+
 } else {
 page = s->cmd >> 6;
 offset = addr | (page << 4);
 switch(offset) {
 case EN0_STARTPG:
 s->start = val << 8;
+try_flush = true;
 break;
 case EN0_STOPPG:
 s->stop = val << 8;
+try_flush = true;
 break;
 case EN0_BOUNDARY:
 s->boundary = val;
+try_flush = true;
 break;
 case EN0_IMR:
 s->imr = val;
@@ -363,12 +365,16 @@ static void ne2000_ioport_write(void *opaque, uint32_t 
addr, uint32_t val)
 break;
 case EN1_CURPAG:
 s->curpag = val;
+try_flush = true;
 break;
 case EN1_MULT ... EN1_MULT + 7:
 s->mult[offset - EN1_MULT] = val;
 break;
 }
 }
+if (try_flush && !ne2000_buffer_full(s)) {
+qemu_flush_queued_packets(qemu_get_queue(s->nic));
+}
 }
 
 static uint32_t ne2000_ioport_read(void *opaque, uint32_t addr)
@@ -705,7 +711,6 @@ void ne2000_setup_io(NE2000State *s, DeviceState *dev, 
unsigned size)
 static NetClientInfo net_ne2000_info = {
 .type = NET_CLIENT_OPTIONS_KIND_NIC,
 .size = sizeof(NICState),
-.can_receive = ne2000_can_receive,
 .receive = ne2000_receive,
 };
 
-- 
2.4.5

Re: [Qemu-devel] [PATCH COLO-BLOCK v7 02/17] quorum: implement block driver interfaces add/delete a BDS's child

2015-07-02 Thread Wen Congyang

On 07/02/2015 11:21 PM, Alberto Garcia wrote:
> On Thu 02 Jul 2015 04:30:50 PM CEST, Wen Congyang wrote:
> 
>>> 2) Did you think of any API to update vote_threshold? Currently it
>>> cannot be higher than num_children, so it's effectively limited by
>>> the number of children that are attached when the quorum device is
>>> created.
>>
>> The things I think about it is: if vote_threshold-- when deleting a
>> child, vote_threshold will be less than 0. If we don't do
>> vote_threshold-- when it is 1, the vote_threshold will be changed when
>> all children are added again. Which behavior is better?
> 
> Adding/removing a child and changing vote_threshold are in principle two
> unrelated operations. One user could want to modify one but not the
> other, so linking them together does not seem like a good idea. A
> specific API to change vote_threshold could be added when the use case
> appears (currently no one seems to have missed it).
> 
> So I think I would keep these things separate, I would also remove the
> "TODO" comments that mention it.
> 
> With the current patches (that do not touch vote_threshold), you can
> remove as many children as you want as long as there's at least one
> left. However if you end up with num_chilren < vote_threshold then you
> will get read errors. I see two alternatives:
> 
>   - Allow that and expect that the user will add the necessary children
> afterwards.
> 
>   - Forbid that completely and return an error.
> 
> I think I prefer the second option.

OK, I will do it.

> 
>>> 3) I don't think it's necessary to set to NULL the pointers in
>>> s->bs[i] when i >= num_children. There's no way to access those
>>> pointers anyway. Same for the ' s->bs[s->num_children] = NULL; ' bit
>>> in quorum_del_child(). I also think that using memset() for setting
>>> NULL pointers is not portable, although QEMU is already doing this in
>>> a few places.
>>
>> OK, will remove it in the next version. Just a question: why is using
>> memset() for setting NULL pointers is not prtable?
> 
> The standard allows for null pointers to be internally represented by
> nonzero bit patterns. However I'm not aware of any system that we
> support that does that.
> 
>http://c-faq.com/null/confusion4.html
>http://c-faq.com/null/machexamp.html

Thanks for your explantion.

Wen Congyang

> 
> Berto
> .
>

Re: [Qemu-devel] [PATCH v2] raw-posix.c: remove raw device access for cdrom

2015-07-02 Thread Programmingkid

Allows QEMU running on Mac OS X guest to use the real
CD-ROM drive.

This patch doesn't actually remove raw device access. I'm just
using the same name as the last patch.

Why should we test for /dev/disk1s0 when /dev/rdisk1s0 works?
By avoiding using the raw device we gain speed. When a read
takes place for the raw CD-ROM device, it has to start spinning
up if it went to sleep. This takes time to do. Plus there is 
that annoying spin up sound. With the non-raw access, QEMU
can use the operating system's buffers to do its reading. 
This means faster and quieter access. 

I have noticed that in order to use /dev/disk1s0, the CD-ROM
had to be unmounted (but not ejected) from the desktop. 
Disk Utility is what I used to accomplish this. 

Signed-off-by: John Arbuckle 

---
Tries more permutations of the CD-ROM device file.
Sets the needs_alignment variable.
Removes the unused kernResult variable.

 block/raw-posix.c |   62 
 1 files changed, 52 insertions(+), 10 deletions(-)

diff --git a/block/raw-posix.c b/block/raw-posix.c
index cbe6574..2088c1f 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -484,6 +484,19 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 
 filename = qemu_opt_get(opts, "filename");
 
+#ifdef __APPLE__
+
+/*
+ * Raw CD-ROM access on a Macintosh needs to be aligned
+ * or reads will fail.
+ */
+
+if (strncmp(filename, "/dev/r", 6) == 0) {
+s->needs_alignment = true;
+}
+
+#endif  /* __APPLE__ */
+
 ret = raw_normalize_devicepath(&filename);
 if (ret != 0) {
 error_setg_errno(errp, -ret, "Could not normalize device path");
@@ -2014,7 +2027,6 @@ kern_return_t GetBSDPath( io_iterator_t mediaIterator, 
char *bsdPath, CFIndex ma
 if ( bsdPathAsCFString ) {
 size_t devPathLength;
 strcpy( bsdPath, _PATH_DEV );
-strcat( bsdPath, "r" );
 devPathLength = strlen( bsdPath );
 if ( CFStringGetCString( bsdPathAsCFString, bsdPath + 
devPathLength, maxPathSize - devPathLength, kCFStringEncodingASCII ) ) {
 kernResult = KERN_SUCCESS;
@@ -2120,25 +2132,55 @@ static int hdev_open(BlockDriverState *bs, QDict 
*options, int flags,
 const char *filename = qdict_get_str(options, "filename");
 
 if (strstart(filename, "/dev/cdrom", NULL)) {
-kern_return_t kernResult;
 io_iterator_t mediaIterator;
 char bsdPath[ MAXPATHLEN ];
 int fd;
 
-kernResult = FindEjectableCDMedia( &mediaIterator );
-kernResult = GetBSDPath( mediaIterator, bsdPath, sizeof( bsdPath ) );
+FindEjectableCDMedia(&mediaIterator);
+GetBSDPath(mediaIterator, bsdPath, sizeof(bsdPath));
 
 if ( bsdPath[ 0 ] != '\0' ) {
-strcat(bsdPath,"s0");
-/* some CDs don't have a partition 0 */
-fd = qemu_open(bsdPath, O_RDONLY | O_BINARY | O_LARGEFILE);
-if (fd < 0) {
-bsdPath[strlen(bsdPath)-1] = '1';
-} else {
+int devStringLength = strlen("/dev/");
+char baseDeviceName[MAXPATHLEN];
+int numberOfOptions = 4;
+char deviceNameArray[numberOfOptions][MAXPATHLEN];
+int i = 0;
+
+/* remove the "/dev/" part from the device file's name */
+strcpy(baseDeviceName, bsdPath + devStringLength);
+
+/* /dev/disk*s0 */
+sprintf(deviceNameArray[i++], "/dev/%ss0", baseDeviceName);
+
+/* /dev/disk*s1 */
+sprintf(deviceNameArray[i++], "/dev/%ss1", baseDeviceName);
+
+/* /dev/rdisk*s0 */
+sprintf(deviceNameArray[i++], "/dev/r%ss0", baseDeviceName);
+
+/* /dev/rdisk*s1 */
+sprintf(deviceNameArray[i++], "/dev/r%ss1", baseDeviceName);
+
+/* Try device file permutions until one works */
+for (i = 0; i < numberOfOptions; i++) {
+fd = qemu_open(deviceNameArray[i], O_RDONLY | O_BINARY
+| O_LARGEFILE);
+if (fd < 0) {
+DPRINTF("Error opening %s: %s\n", deviceNameArray[i]
+, strerror(errno));
+} else {
+strcpy(bsdPath, (char *)deviceNameArray[i]);
+break;
+}
+}
+
+if (fd > 0) {
 qemu_close(fd);
 }
+
 filename = bsdPath;
 qdict_put(options, "filename", qstring_from_str(filename));
+DPRINTF("cdrom is using %s\n", filename);
 }
 
 if ( mediaIterator )
-- 
1.7.5.4

Re: [Qemu-devel] [PATCH for-2.4] net-hub: Drop can_receive

2015-07-02 Thread Fam Zheng

On Thu, 07/02 14:30, Stefan Hajnoczi wrote:
> On Tue, Jun 30, 2015 at 10:29:20AM +0800, Fam Zheng wrote:
> > It returns true as long as there is another attached port. This is not
> > strictly necessary because even if there is only one port (the sender),
> > net_hub_port_receive could succeed with a NOP. So always deliver the
> > packets, instead of queuing them.
> > 
> > This fixes the possible hanging issue after net layer changed how
> > can_read is handled. That is, if net_hub_port_can_receive returned
> > false, the peer would disable the queue until it's explicitly flushed
> > (for example, a call to qemu_flush_queued_packets() in net_hub_add_port,
> > where net_hub_port_can_receive() would become true). This patch avoids
> > that complication.
> > 
> > Signed-off-by: Fam Zheng 
> > ---
> >  net/hub.c | 20 
> >  1 file changed, 20 deletions(-)
> 
> Hmm...I misread the hub code:
> 
> net_hub_port_can_receive() returns true if *any* port can receive.
> 
> net_hub_receive_iov() always accepts packets.  It never discards or
> queues.
> 
> So in order to move to the semantics you want, let's drop
> net_hub_port_can_receive() *and* change net_hub_receive_iov():
> 
>   static ssize_t net_hub_receive(NetHub *hub, NetHubPort *source_port,
>  const uint8_t *buf, size_t len)
>   {
>   NetHubPort *port;
> 
>   QLIST_FOREACH(port, &hub->ports, next) {
>   if (port == source_port) {
>   continue;
>   }
> 
>   /* No need for a callback because net_hub_flush() is called
>* when the peer flushes the queue anyway.
>*
>* Note that packets are duplicated if there are multiple
>* ports and some of them accepted a packet before a later
>* port queued it.  Live with it, the network tolerates
>* duplicates.
>*/
>   if (qemu_send_packet(&port->nc, buf, len) == 0) {
>   return 0;

I'm not sure if it is a good idea to return 0 from either net_hub_port_receive
or net_hub_receive. Because that way if one out of ten ports is down, other
ones will no longer get more packets from tap.

Fam


>   }
>   }
>   return len;
>   }

Re: [Qemu-devel] [PATCH v3 00/16] Add an IPMI device to QEMU

2015-07-02 Thread Benjamin Herrenschmidt

On Mon, 2015-06-08 at 20:11 -0500, miny...@acm.org wrote:
> I have reworked the QEMU IPMI patches, and I believe this design
> improves greatly on the previous one.  The suggestions made all
> resulted in better design.  To start an IPMI interface
> with this, you would use:
> 
>   -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0
> 
> Note that I did not make the BMCs objects.  I started this way, but
> adding properties was ending up duplicating a lot of code already
> in the device core, especially for the chardev link.
> 
> I'm not sure of the importance of having a BMC be an object instead
> of a device, but it seems to me that if we do that, it would be
> best to add helper functions to the object core for the chardev
> and integer properties.

Out of curiosity, what's the outlook to have this merged ? I'm working
on a "native" OpenPower (POWER8 based) model which could definitely
make good use of that !

Cheers,
Ben.

Re: [Qemu-devel] [PATCH] raw-posix.c: remove raw device access for cdrom

2015-07-02 Thread Programmingkid


On Jul 2, 2015, at 10:33 AM, Laurent Vivier wrote:

> 
> 
> On 02/07/2015 16:20, Paolo Bonzini wrote:
>> 
>> 
>> On 02/07/2015 16:18, Laurent Vivier wrote:
> I'm okay with doing the simple thing, but it needs a comment for 
> non-BSDers.
>>> So, what we have to do, in our case, for MacOS X cdrom, is something like:
>>> 
>>> ... GetBSDPath ...
>>> ...
>>>if (flags & BDRV_O_NOCACHE) {
>>>strcat(bsdPath, "r");
>>>}
>>> ...
>>> 
>>> ?
>> 
>> Well, what to do with Mac OS X CD-ROM is another story...  Raw access
>> "seems not do work well" according to John, so we may have a comment
>> there explaining why we're not adding the "r".
> 
> I think it doesn't work well because they need to be aligned, and
> NOCACHE implies that (with BDRV_O_NOCACHE code will be self explicit :) )
> 
> raw_open_common()
> 
>if ((bs->open_flags & BDRV_O_NOCACHE) != 0) {
>s->needs_alignment = true;
>}
> 
> and needs_alignment allows to probe alignment (raw_probe_alignment())

Thank you very much for this. Raw cdrom access now works with this code. I
just had to modify it so it just sets s->needs_alignment to true without 
checking
the if condition. The if condition didn't work.

[Qemu-devel] [PATCH 2/3] arm_mptimer: Fix ONE-SHOT -> PERIODIC mode change

2015-07-02 Thread Dmitry Osipenko

Timer won't start periodic ticking if ONE-SHOT -> PERIODIC mode change happened
after one-shot tick was completed. Fix it by starting ticking only if timer was
disabled previously and isn't ticking right now.

Signed-off-by: Dmitry Osipenko 
---
 hw/timer/arm_mptimer.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c
index e230063..58e17c4 100644
--- a/hw/timer/arm_mptimer.c
+++ b/hw/timer/arm_mptimer.c
@@ -122,11 +122,16 @@ static void timerblock_write(void *opaque, hwaddr addr,
 case 8: /* Control.  */
 old = tb->control;
 tb->control = value;
-if ((old & 1) == (value & 1)) {
+/* Don't do anything if timer already disabled.  */
+if (((old & 1) == 0) && ((value & 1) == 0)) {
 break;
 }
 if (value & 1) {
-if (tb->count == 0 && (tb->control & 2)) {
+/* Don't do anything if timer already ticking.  */
+if (((old & 1) != 0) && (tb->count != 0)) {
+break;
+}
+if (tb->control & 2) {
 tb->count = tb->load;
 }
 timerblock_reload(tb, 1);
-- 
2.4.4

[Qemu-devel] [PATCH 3/3] arm_mptimer: Respect IT bit state

2015-07-02 Thread Dmitry Osipenko

Timer fires interrupt regardless of current IT(interrupt enable) bit state.
Fix it by making timer to respect IT state.

Signed-off-by: Dmitry Osipenko 
---
 hw/timer/arm_mptimer.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c
index 58e17c4..82c4462 100644
--- a/hw/timer/arm_mptimer.c
+++ b/hw/timer/arm_mptimer.c
@@ -38,7 +38,7 @@ static inline int get_current_cpu(ARMMPTimerState *s)
 
 static inline void timerblock_update_irq(TimerBlock *tb)
 {
-qemu_set_irq(tb->irq, tb->status);
+qemu_set_irq(tb->irq, tb->status && (tb->control & 4));
 }
 
 /* Return conversion factor from mpcore timer ticks to qemu timer ticks.  */
-- 
2.4.4

[Qemu-devel] arm_mptimer fixes

2015-07-02 Thread Dmitry Osipenko

Hello, this series fixes 3 arm_mptimer issues. All 3 patches were successfully
tested on ARM Cortex-A9 QEMU machine booting Linux kernel and behavior was
compared to real hw by running couple handcrafted mptimer tests.

arm_mptimer: Fix timer shutdown
arm_mptimer: Fix ONE-SHOT -> PERIODIC mode change
arm_mptimer: Respect IT bit state

[Qemu-devel] [PATCH 1/3 v3] arm_mptimer: Fix timer shutdown

2015-07-02 Thread Dmitry Osipenko

Timer, running in periodic mode, can't be stopped or coming one-shot tick
won't be canceled because timer control code just doesn't handle timer
disabling. Fix it by deleting timer if enable bit isn't set.

Signed-off-by: Dmitry Osipenko 
---

v2: Avoid calling timer_del() if the timer was already disabled as per
Peter Maydell suggestion.

v3: Do not change "reload the timer" logic as it is separate bug.

 hw/timer/arm_mptimer.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c
index 8b93b3c..e230063 100644
--- a/hw/timer/arm_mptimer.c
+++ b/hw/timer/arm_mptimer.c
@@ -122,11 +122,17 @@ static void timerblock_write(void *opaque, hwaddr addr,
 case 8: /* Control.  */
 old = tb->control;
 tb->control = value;
-if (((old & 1) == 0) && (value & 1)) {
+if ((old & 1) == (value & 1)) {
+break;
+}
+if (value & 1) {
 if (tb->count == 0 && (tb->control & 2)) {
 tb->count = tb->load;
 }
 timerblock_reload(tb, 1);
+} else {
+/* Shutdown timer.  */
+timer_del(tb->timer);
 }
 break;
 case 12: /* Interrupt status.  */
-- 
2.4.4

[Qemu-devel] [PATCH] target-ppc: fix hugepage support when using memory-backend-file

2015-07-02 Thread Michael Roth

Current PPC code relies on -mem-path being used in order for
hugepage support to be detected. With the introduction of
MemoryBackendFile we can now handle this via:
  -object memory-file-backend,mem-path=...,id=hugemem0 \
  -numa node,id=mem0,memdev=hugemem0

Management tools like libvirt treat the 2 approaches as
interchangeable in some cases, which can lead to user-visible
regressions even for previously supported guest configurations.

Fix these by also iterating through any configured memory
backends that may be backed by hugepages.

Since the old code assumed hugepages always backed the entirety
of guest memory, play it safe an pick the minimum across the
max pages sizes for all backends, even ones that aren't backed
by hugepages.

Cc: qemu-sta...@nongnu.org
Signed-off-by: Michael Roth 
---
 target-ppc/kvm.c | 57 ++--
 1 file changed, 51 insertions(+), 6 deletions(-)

diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
index afb4696..5db2045 100644
--- a/target-ppc/kvm.c
+++ b/target-ppc/kvm.c
@@ -40,6 +40,7 @@
 #include "trace.h"
 #include "exec/gdbstub.h"
 #include "exec/memattrs.h"
+#include "sysemu/hostmem.h"
 
 //#define DEBUG_KVM
 
@@ -303,16 +304,11 @@ static void kvm_get_smmu_info(PowerPCCPU *cpu, struct 
kvm_ppc_smmu_info *info)
 kvm_get_fallback_smmu_info(cpu, info);
 }
 
-static long getrampagesize(void)
+static long gethugepagesize(const char *mem_path)
 {
 struct statfs fs;
 int ret;
 
-if (!mem_path) {
-/* guest RAM is backed by normal anonymous pages */
-return getpagesize();
-}
-
 do {
 ret = statfs(mem_path, &fs);
 } while (ret != 0 && errno == EINTR);
@@ -334,6 +330,55 @@ static long getrampagesize(void)
 return fs.f_bsize;
 }
 
+static int find_max_supported_pagesize(Object *obj, void *opaque)
+{
+char *mem_path;
+long *hpsize_min = opaque;
+
+if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
+mem_path = object_property_get_str(obj, "mem-path", NULL);
+if (mem_path) {
+long hpsize = gethugepagesize(mem_path);
+if (hpsize < *hpsize_min) {
+*hpsize_min = hpsize;
+}
+} else {
+*hpsize_min = getpagesize();
+}
+}
+
+return 0;
+}
+
+static long getrampagesize(void)
+{
+long hpsize = LONG_MAX;
+Object *memdev_root;
+
+if (mem_path) {
+return gethugepagesize(mem_path);
+}
+
+/* it's possible we have memory-backend objects with
+ * hugepage-backed RAM. these may get mapped into system
+ * address space via -numa parameters or memory hotplug
+ * hooks. we want to take these into account, but we
+ * also want to make sure these supported hugepage
+ * sizes are applicable across the entire range of memory
+ * we may boot from, so we take the min across all
+ * backends, and assume normal pages in cases where a
+ * backend isn't backed by hugepages.
+ */
+memdev_root = object_resolve_path("/objects", NULL);
+if (!memdev_root) {
+return getpagesize();
+}
+
+object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
+
+return (hpsize == LONG_MAX) ? getpagesize() : hpsize;
+}
+
 static bool kvm_valid_page_size(uint32_t flags, long rampgsize, uint32_t shift)
 {
 if (!(flags & KVM_PPC_PAGE_SIZES_REAL)) {
-- 
1.9.1

[Qemu-devel] [PATCH v1 02/10] xen/pt: Sync up the dev.config and data values.

2015-07-02 Thread Konrad Rzeszutek Wilk

For a passthrough device we maintain a state of emulated
registers value contained within d->config. We also consult
the host registers (and apply ro and write masks) whenever
the guest access the registers. This is done in xen_pt_pci_write_config
and xen_pt_pci_read_config.

Also in this picture we call pci_default_write_config which
updates the d->config and if the d->config[PCI_COMMAND] register
has PCI_COMMAND_MEMORY (or PCI_COMMAND_IO) acts on those changes.

On startup the d->config[PCI_COMMAND] are the host values, not
what the guest initial values should be, which is exactly what
we do _not_ want to do for 64-bit BARs when the guest just wants
to read the size of the BAR. Huh you say?

To get the size of 64-bit memory space BARs,  the guest has
to calculate ((BAR[x] & 0xFFF0) + ((BAR[x+1] & 0x) << 32))
which means it has to do two writes of ~0 to BARx and BARx+1.

prior to this patch and with XSA120-addendum patch (Linux kernel)
the PCI_COMMAND register is copied from the host it can have
PCI_COMMAND_MEMORY bit set which means that QEMU will try to
update the hypervisor's P2M with BARx+1 value to ~0 (0x)
(to sync the guest state to host) instead of just having
xen_pt_pci_write_config and xen_pt_bar_reg_write apply the
proper masks and return the size to the guest.

To thwart this, this patch syncs up the host values with the
guest values taking into account the emu_mask (bit set means
we emulate, PCI_COMMAND_MEMORY and PCI_COMMAND_IO are set).
That is we copy the host values - masking out any bits which
we will emulate. Then merge it with the initial emulation register
values. Lastly this value is then copied both in
dev.config _and_ XenPTReg->data field.

There is also reg->size accounting taken into consideration
that ends up being used in two patches:
 xen/pt: Check if reg->init function sets the 'data' past the reg->size
 xen/pt: Use xen_host_pci_get_[byte,word,long] instead of xen_host_pci_get_long

This fixes errors such as these:

(XEN) memory_map:add: dom2 gfn=fffe0 mfn=fbce0 nr=20
(DEBUG) 189 pci dev 04:0 BAR16 wrote ~0.
(DEBUG) 200 pci dev 04:0 BAR16 read 0x0fffe0004.
(XEN) memory_map:remove: dom2 gfn=fffe0 mfn=fbce0 nr=20
(DEBUG) 204 pci dev 04:0 BAR16 wrote 0x0fffe0004.
(DEBUG) 217 pci dev 04:0 BAR16 read upper 0x0.
(XEN) memory_map:add: dom2 gfn=0 mfn=fbce0 nr=20
(XEN) p2m.c:883:d0v0 p2m_set_entry failed! mfn= rc:-22
(XEN) memory_map:fail: dom2 gfn=0 mfn=fbce0 nr=20 ret:-22
(XEN) memory_map:remove: dom2 gfn=0 mfn=fbce0 nr=20
(XEN) p2m.c:920:d0v0 gfn_to_mfn failed! gfn=0 type:4
(XEN) p2m.c:920:d0v0 gfn_to_mfn failed! gfn=1 type:4
..
(XEN) memory_map: error -22 removing dom2 access to [fbce0,fbcff]
(DEBUG) 222 pci dev 04:0 BAR16 read upper 0x0.
(XEN) memory_map:remove: dom2 gfn=0 mfn=fbce0 nr=20
(XEN) memory_map: error -22 removing dom2 access to [fbce0,fbcff]

[The DEBUG is to illustate what the hvmloader was doing]

Reported-by: Sander Eikelenboom 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt_config_init.c | 48 -
 1 file changed, 47 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index e34f9f8..3938afd 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1856,6 +1856,10 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState 
*s,
 reg_entry->reg = reg;
 
 if (reg->init) {
+uint32_t host_mask, size_mask;
+unsigned int offset;
+uint32_t val;
+
 /* initialize emulate register */
 rc = reg->init(s, reg_entry->reg,
reg_grp->base_offset + reg->offset, &data);
@@ -1868,8 +1872,50 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState 
*s,
 g_free(reg_entry);
 return 0;
 }
+/* Sync up the data to dev.config */
+offset = reg_grp->base_offset + reg->offset;
+size_mask = 0x >> ((4 - reg->size) << 3);
+
+rc = xen_host_pci_get_long(&s->real_device, offset, &val);
+if (rc) {
+/* Serious issues when we cannot read the host values! */
+g_free(reg_entry);
+return rc;
+}
+/* Set bits in emu_mask are the ones we emulate. The dev.config shall
+ * contain the emulated view of the guest - therefore we flip the mask
+ * to mask out the host values (which dev.config initially has) . */
+host_mask = size_mask & ~reg->emu_mask;
+
+if ((data & host_mask) != (val & host_mask)) {
+uint32_t new_val;
+
+/* Mask out host (including past size). */
+new_val = val & host_mask;
+/* Merge emulated ones (excluding the non-emulated ones). */
+new_val |= data & host_mask;
+/* Leave intact host and emulated values past the size - even 
though
+ * we

[Qemu-devel] [PATCH v1 07/10] xen/pt: Log xen_host_pci_get/set errors in MSI code.

2015-07-02 Thread Konrad Rzeszutek Wilk

We seem to only use these functions when de-activating the
MSI - so just log errors.

Reviewed-by: Stefano Stabellini 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt_msi.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
index 5822df5..e3d7194 100644
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -75,19 +75,29 @@ static int msi_msix_enable(XenPCIPassthroughState *s,
bool enable)
 {
 uint16_t val = 0;
+int rc;
 
 if (!address) {
 return -1;
 }
 
-xen_host_pci_get_word(&s->real_device, address, &val);
+rc = xen_host_pci_get_word(&s->real_device, address, &val);
+if (rc) {
+XEN_PT_ERR(&s->dev, "Failed to read MSI/MSI-X register (0x%x), 
rc:%d\n",
+   address, rc);
+return rc;
+}
 if (enable) {
 val |= flag;
 } else {
 val &= ~flag;
 }
-xen_host_pci_set_word(&s->real_device, address, val);
-return 0;
+rc = xen_host_pci_set_word(&s->real_device, address, val);
+if (rc) {
+XEN_PT_ERR(&s->dev, "Failed to write MSI/MSI-X register (0x%x), 
rc:%d\n",
+   address, rc);
+}
+return rc;
 }
 
 static int msi_msix_setup(XenPCIPassthroughState *s,
@@ -276,7 +286,7 @@ void xen_pt_msi_disable(XenPCIPassthroughState *s)
 return;
 }
 
-xen_pt_msi_set_enable(s, false);
+(void)xen_pt_msi_set_enable(s, false);
 
 msi_msix_disable(s, msi_addr64(msi), msi->data, msi->pirq, false,
  msi->initialized);
-- 
2.1.0

[Qemu-devel] [PATCH v1 05/10] xen/pt: Remove XenPTReg->data field.

2015-07-02 Thread Konrad Rzeszutek Wilk

We do not want to have two entries to cache the guest configuration
registers: XenPTReg->data and dev.config. Instead we want to use
only the dev.config.

To do without much complications we rip out the ->data field
and replace it with an pointer to the dev.config. This way we
have the type-checking (uint8_t, uint16_t, etc) and as well
and pre-computed location.

Alternatively we could compute the offset in dev.config by
using the XenPTRRegInfo and XenPTRegGroup every time but
this way we have the pre-computed values.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt.h |  6 +++-
 hw/xen/xen_pt_config_init.c | 73 +++--
 2 files changed, 49 insertions(+), 30 deletions(-)

diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 09358b1..586d055 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -134,7 +134,11 @@ struct XenPTRegInfo {
 struct XenPTReg {
 QLIST_ENTRY(XenPTReg) entries;
 XenPTRegInfo *reg;
-uint32_t data; /* emulated value */
+union {
+uint8_t *byte;
+uint16_t *word;
+uint32_t *dbword;
+} ptr; /* pointer to dev.config. */
 };
 
 typedef const struct XenPTRegGroupInfo XenPTRegGroupInfo;
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index e597993..0a710a2 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -128,10 +128,11 @@ static int xen_pt_byte_reg_read(XenPCIPassthroughState 
*s, XenPTReg *cfg_entry,
 {
 XenPTRegInfo *reg = cfg_entry->reg;
 uint8_t valid_emu_mask = 0;
+uint8_t *data = cfg_entry->ptr.byte;
 
 /* emulate byte register */
 valid_emu_mask = reg->emu_mask & valid_mask;
-*value = XEN_PT_MERGE_VALUE(*value, cfg_entry->data, ~valid_emu_mask);
+*value = XEN_PT_MERGE_VALUE(*value, *data, ~valid_emu_mask);
 
 return 0;
 }
@@ -140,10 +141,11 @@ static int xen_pt_word_reg_read(XenPCIPassthroughState 
*s, XenPTReg *cfg_entry,
 {
 XenPTRegInfo *reg = cfg_entry->reg;
 uint16_t valid_emu_mask = 0;
+uint16_t *data = cfg_entry->ptr.word;
 
 /* emulate word register */
 valid_emu_mask = reg->emu_mask & valid_mask;
-*value = XEN_PT_MERGE_VALUE(*value, cfg_entry->data, ~valid_emu_mask);
+*value = XEN_PT_MERGE_VALUE(*value, *data, ~valid_emu_mask);
 
 return 0;
 }
@@ -152,10 +154,11 @@ static int xen_pt_long_reg_read(XenPCIPassthroughState 
*s, XenPTReg *cfg_entry,
 {
 XenPTRegInfo *reg = cfg_entry->reg;
 uint32_t valid_emu_mask = 0;
+uint32_t *data = cfg_entry->ptr.dbword;
 
 /* emulate long register */
 valid_emu_mask = reg->emu_mask & valid_mask;
-*value = XEN_PT_MERGE_VALUE(*value, cfg_entry->data, ~valid_emu_mask);
+*value = XEN_PT_MERGE_VALUE(*value, *data, ~valid_emu_mask);
 
 return 0;
 }
@@ -169,10 +172,11 @@ static int xen_pt_byte_reg_write(XenPCIPassthroughState 
*s, XenPTReg *cfg_entry,
 XenPTRegInfo *reg = cfg_entry->reg;
 uint8_t writable_mask = 0;
 uint8_t throughable_mask = get_throughable_mask(s, reg, valid_mask);
+uint8_t *data = cfg_entry->ptr.byte;
 
 /* modify emulate register */
 writable_mask = reg->emu_mask & ~reg->ro_mask & valid_mask;
-cfg_entry->data = XEN_PT_MERGE_VALUE(*val, cfg_entry->data, writable_mask);
+*data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask);
 
 /* create value for writing to I/O device register */
 *val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
@@ -186,10 +190,11 @@ static int xen_pt_word_reg_write(XenPCIPassthroughState 
*s, XenPTReg *cfg_entry,
 XenPTRegInfo *reg = cfg_entry->reg;
 uint16_t writable_mask = 0;
 uint16_t throughable_mask = get_throughable_mask(s, reg, valid_mask);
+uint16_t *data = cfg_entry->ptr.word;
 
 /* modify emulate register */
 writable_mask = reg->emu_mask & ~reg->ro_mask & valid_mask;
-cfg_entry->data = XEN_PT_MERGE_VALUE(*val, cfg_entry->data, writable_mask);
+*data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask);
 
 /* create value for writing to I/O device register */
 *val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
@@ -203,10 +208,11 @@ static int xen_pt_long_reg_write(XenPCIPassthroughState 
*s, XenPTReg *cfg_entry,
 XenPTRegInfo *reg = cfg_entry->reg;
 uint32_t writable_mask = 0;
 uint32_t throughable_mask = get_throughable_mask(s, reg, valid_mask);
+uint32_t *data = cfg_entry->ptr.dbword;
 
 /* modify emulate register */
 writable_mask = reg->emu_mask & ~reg->ro_mask & valid_mask;
-cfg_entry->data = XEN_PT_MERGE_VALUE(*val, cfg_entry->data, writable_mask);
+*data = XEN_PT_MERGE_VALUE(*val, *data, writable_mask);
 
 /* create value for writing to I/O device register */
 *val = XEN_PT_MERGE_VALUE(*val, dev_value, throughable_mask);
@@ -255,7 +261,7 @@ static int xen_pt_status_reg_init(XenPCIPassthroughState *s,
 reg_entry = xen_pt_find_reg(reg_grp_entry, PCI_CAPABILITY_LIST);
 if (reg_entry) {
 /*

[Qemu-devel] [PATCH v1 09/10] xen/pt: Move bulk of xen_pt_unregister_device in its own routine.

2015-07-02 Thread Konrad Rzeszutek Wilk

This way we can call it if we fail during init.

This code movement introduces no changes.

Acked-by: Stefano Stabellini 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt.c | 119 +---
 1 file changed, 62 insertions(+), 57 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index a094f7c..2dc4277 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -683,6 +683,67 @@ static const MemoryListener xen_pt_io_listener = {
 .priority = 10,
 };
 
+/* destroy. */
+static void xen_pt_destroy(PCIDevice *d) {
+
+XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
+uint8_t machine_irq = s->machine_irq;
+uint8_t intx;
+int rc;
+
+ /* Note that if xen_host_pci_device_put had closed config_fd, then
+  * intx value becomes 0xff. */
+intx = xen_pt_pci_intx(s);
+if (machine_irq && !xen_host_pci_device_closed(&s->real_device)) {
+rc = xc_domain_unbind_pt_irq(xen_xc, xen_domid, machine_irq,
+ PT_IRQ_TYPE_PCI,
+ pci_bus_num(d->bus),
+ PCI_SLOT(s->dev.devfn),
+ intx,
+ 0 /* isa_irq */);
+if (rc < 0) {
+XEN_PT_ERR(d, "unbinding of interrupt INT%c failed."
+   " (machine irq: %i, err: %d)"
+   " But bravely continuing on..\n",
+   'a' + intx, machine_irq, errno);
+}
+}
+
+/* N.B. xen_pt_config_delete takes care of freeing them. */
+if (s->msi) {
+xen_pt_msi_disable(s);
+}
+if (s->msix) {
+xen_pt_msix_disable(s);
+}
+
+if (machine_irq) {
+xen_pt_mapped_machine_irq[machine_irq]--;
+
+if (xen_pt_mapped_machine_irq[machine_irq] == 0) {
+rc = xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq);
+
+if (rc < 0) {
+XEN_PT_ERR(d, "unmapping of interrupt %i failed. (err: %d)"
+   " But bravely continuing on..\n",
+   machine_irq, errno);
+}
+}
+s->machine_irq = 0;
+}
+
+/* delete all emulated config registers */
+xen_pt_config_delete(s);
+
+if (s->listener_set) {
+memory_listener_unregister(&s->memory_listener);
+memory_listener_unregister(&s->io_listener);
+s->listener_set = false;
+}
+if (!xen_host_pci_device_closed(&s->real_device)) {
+xen_host_pci_device_put(&s->real_device);
+}
+}
 /* init */
 
 static int xen_pt_initfn(PCIDevice *d)
@@ -817,63 +878,7 @@ out:
 
 static void xen_pt_unregister_device(PCIDevice *d)
 {
-XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
-uint8_t machine_irq = s->machine_irq;
-uint8_t intx;
-int rc;
-
- /* Note that if xen_host_pci_device_put had closed config_fd, then
-  * intx value becomes 0xff. */
-intx = xen_pt_pci_intx(s);
-if (machine_irq && !xen_host_pci_device_closed(&s->real_device)) {
-rc = xc_domain_unbind_pt_irq(xen_xc, xen_domid, machine_irq,
- PT_IRQ_TYPE_PCI,
- pci_bus_num(d->bus),
- PCI_SLOT(s->dev.devfn),
- intx,
- 0 /* isa_irq */);
-if (rc < 0) {
-XEN_PT_ERR(d, "unbinding of interrupt INT%c failed."
-   " (machine irq: %i, err: %d)"
-   " But bravely continuing on..\n",
-   'a' + intx, machine_irq, errno);
-}
-}
-
-/* N.B. xen_pt_config_delete takes care of freeing them. */
-if (s->msi) {
-xen_pt_msi_disable(s);
-}
-if (s->msix) {
-xen_pt_msix_disable(s);
-}
-
-if (machine_irq) {
-xen_pt_mapped_machine_irq[machine_irq]--;
-
-if (xen_pt_mapped_machine_irq[machine_irq] == 0) {
-rc = xc_physdev_unmap_pirq(xen_xc, xen_domid, machine_irq);
-
-if (rc < 0) {
-XEN_PT_ERR(d, "unmapping of interrupt %i failed. (err: %d)"
-   " But bravely continuing on..\n",
-   machine_irq, errno);
-}
-}
-s->machine_irq = 0;
-}
-
-/* delete all emulated config registers */
-xen_pt_config_delete(s);
-
-if (s->listener_set) {
-memory_listener_unregister(&s->memory_listener);
-memory_listener_unregister(&s->io_listener);
-s->listener_set = false;
-}
-if (!xen_host_pci_device_closed(&s->real_device)) {
-xen_host_pci_device_put(&s->real_device);
-}
+xen_pt_destroy(d);
 }
 
 static Property xen_pci_passthrough_properties[] = {
-- 
2.1.0

[Qemu-devel] [PATCH v1 04/10] xen/pt: Use xen_host_pci_get_[byte, word, long] instead of xen_host_pci_get_long

2015-07-02 Thread Konrad Rzeszutek Wilk

Otherwise we get:

xen_pt_config_reg_init: Offset 0x0004 mismatch! Emulated=0x, 
host=0x2300017, syncing to 0x2300014.
xen_pt_config_reg_init: Error: Offset 0x0004:0x2300014 expands past register 
size(2)!

which is not surprising. We read the value as an 32-bit (from host),
then operate it as a 16-bit - and the remainder is left unchanged.

We end up writting the value as 16-bit (so 0014) to dev.config
(as we use proper xen_set_host_[byte,word,long] so we don't spill
to other registers) but in XenPTReg->data it is as 32-bit (0x2300014)!

It is harmless as the read/write functions end up using an size mask
and never modify the bits past 16-bit (reg->size is 2).

This patch fixes the warnings by reading the value using the
proper size.

Note that the check for size is still left in-case the developer
sets bits past the reg->size in the ->init routines.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt_config_init.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 09309ba..e597993 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1876,7 +1876,12 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState 
*s,
 offset = reg_grp->base_offset + reg->offset;
 size_mask = 0x >> ((4 - reg->size) << 3);
 
-rc = xen_host_pci_get_long(&s->real_device, offset, &val);
+switch (reg->size) {
+case 1: rc = xen_host_pci_get_byte(&s->real_device, offset, (uint8_t 
*)&val);break;
+case 2: rc = xen_host_pci_get_word(&s->real_device, offset, (uint16_t 
*)&val);break;
+case 4: rc = xen_host_pci_get_long(&s->real_device, offset, 
&val);break;
+default: assert(1);
+}
 if (rc) {
 /* Serious issues when we cannot read the host values! */
 g_free(reg_entry);
-- 
2.1.0

[Qemu-devel] [PATCH v1 03/10] xen/pt: Check if reg->init function sets the 'data' past the reg->size

2015-07-02 Thread Konrad Rzeszutek Wilk

It should never happen, but in case it does (an developer adds
a new register and the 'init_val' expands past the register
size) we want to report. The code will only write up to
reg->size so there is no runtime danger of the register spilling
across other ones - however to catch this sort of thing
we still return an error.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt_config_init.c | 10 --
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 3938afd..09309ba 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1904,9 +1904,15 @@ static int xen_pt_config_reg_init(XenPCIPassthroughState 
*s,
 } else
 val = data;
 
+if (val & ~size_mask) {
+XEN_PT_ERR(&s->dev,"Offset 0x%04x:0x%04x expands past register 
size(%d)!\n",
+   offset, val, reg->size);
+g_free(reg_entry);
+return -ENXIO;
+}
 /* This could be just pci_set_long as we don't modify the bits
- * past reg->size, but in case this routine is run in parallel
- * we do not want to over-write other registers. */
+ * past reg->size, but in case this routine is run in parallel or the
+ * init value is larger, we do not want to over-write registers. */
 switch (reg->size) {
 case 1: pci_set_byte(s->dev.config + offset, (uint8_t)val); break;
 case 2: pci_set_word(s->dev.config + offset, (uint16_t)val); break;
-- 
2.1.0

[Qemu-devel] [PATCH v1 06/10] xen/pt: Log xen_host_pci_get in two init functions

2015-07-02 Thread Konrad Rzeszutek Wilk

To help with troubleshooting in the field.

Acked-by: Stefano Stabellini 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt_config_init.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 0a710a2..a2af415 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1791,6 +1791,8 @@ static int xen_pt_ptr_reg_init(XenPCIPassthroughState *s,
 rc = xen_host_pci_get_byte(&s->real_device,
reg_field + PCI_CAP_LIST_ID, &cap_id);
 if (rc) {
+XEN_PT_ERR(&s->dev, "Failed to read capability @0x%x 
(rc:%d)\n",
+   reg_field + PCI_CAP_LIST_ID, rc);
 return rc;
 }
 if (xen_pt_emu_reg_grps[i].grp_id == cap_id) {
@@ -1984,6 +1986,9 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
   reg_grp_offset,
   ®_grp_entry->size);
 if (rc < 0) {
+XEN_PT_LOG(&s->dev, "Failed to initialize %d/%ld, type=0x%x, 
rc:%d\n",
+   i, ARRAY_SIZE(xen_pt_emu_reg_grps),
+   xen_pt_emu_reg_grps[i].grp_type, rc);
 xen_pt_config_delete(s);
 return rc;
 }
@@ -1998,6 +2003,10 @@ int xen_pt_config_init(XenPCIPassthroughState *s)
 /* initialize capability register */
 rc = xen_pt_config_reg_init(s, reg_grp_entry, regs);
 if (rc < 0) {
+XEN_PT_LOG(&s->dev, "Failed to initialize %d/%ld reg 
0x%x in grp_type=0x%x (%d/%ld), rc=%d\n",
+   j, 
ARRAY_SIZE(xen_pt_emu_reg_grps[i].emu_regs),
+   regs->offset, 
xen_pt_emu_reg_grps[i].grp_type,
+   i, ARRAY_SIZE(xen_pt_emu_reg_grps), rc);
 xen_pt_config_delete(s);
 return rc;
 }
-- 
2.1.0

[Qemu-devel] [PATCH v1 10/10] xen/pt: Check for return values for xen_host_pci_[get|set] in init

2015-07-02 Thread Konrad Rzeszutek Wilk

and if we have failures we call xen_pt_destroy introduced in
'xen/pt: Move bulk of xen_pt_unregister_device in its own routine.'
and free all of the allocated structures.

Acked-by: Stefano Stabellini 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 2dc4277..62f5751 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -776,10 +776,11 @@ static int xen_pt_initfn(PCIDevice *d)
 }
 
 /* Initialize virtualized PCI configuration (Extended 256 Bytes) */
-if (xen_host_pci_get_block(&s->real_device, 0, d->config,
-   PCI_CONFIG_SPACE_SIZE) < 0) {
-xen_host_pci_device_put(&s->real_device);
-return -1;
+rc = xen_host_pci_get_block(&s->real_device, 0, d->config,
+PCI_CONFIG_SPACE_SIZE);
+if (rc < 0) {
+XEN_PT_ERR(d,"Could not read PCI_CONFIG space! (rc:%d)\n", rc);
+goto err_out;
 }
 
 s->memory_listener = xen_pt_memory_listener;
@@ -789,17 +790,17 @@ static int xen_pt_initfn(PCIDevice *d)
 xen_pt_register_regions(s, &cmd);
 
 /* reinitialize each config register to be emulated */
-if (xen_pt_config_init(s)) {
+rc = xen_pt_config_init(s);
+if (rc) {
 XEN_PT_ERR(d, "PCI Config space initialisation failed.\n");
-xen_host_pci_device_put(&s->real_device);
-return -1;
+goto err_out;
 }
 
 /* Bind interrupt */
 rc = xen_host_pci_get_byte(&s->real_device, PCI_INTERRUPT_PIN, &scratch);
 if (rc) {
 XEN_PT_ERR(d, "Failed to read PCI_INTERRUPT_PIN! (rc:%d)\n", rc);
-scratch = 0;
+goto err_out;
 }
 if (!scratch) {
 XEN_PT_LOG(d, "no pin interrupt\n");
@@ -856,12 +857,14 @@ out:
 rc = xen_host_pci_get_word(&s->real_device, PCI_COMMAND, &val);
 if (rc) {
 XEN_PT_ERR(d, "Failed to read PCI_COMMAND! (rc: %d)\n", rc);
+goto err_out;
 } else {
 val |= cmd;
 rc = xen_host_pci_set_word(&s->real_device, PCI_COMMAND, val);
 if (rc) {
 XEN_PT_ERR(d, "Failed to write PCI_COMMAND val=0x%x!(rc: 
%d)\n",
val, rc);
+goto err_out;
 }
 }
 }
@@ -874,6 +877,11 @@ out:
s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function);
 
 return 0;
+
+err_out:
+xen_pt_destroy(d);
+assert(rc);
+return rc;
 }
 
 static void xen_pt_unregister_device(PCIDevice *d)
-- 
2.1.0

[Qemu-devel] [PATCH v1 01/10] xen/pt: Use xen_host_pci_get_[byte|word] instead of dev.config

2015-07-02 Thread Konrad Rzeszutek Wilk

During init time we treat the dev.config area as a cache
of the host view. However during execution time we treat it
as guest view (by the generic PCI API). We need to sync Xen's
code to the generic PCI API view. This is the first step
by replacing all of the code that uses dev.config or
pci_get_[byte|word] to get host value to actually use the
xen_host_pci_get_[byte|word] functions.

Interestingly in 'xen_pt_ptr_reg_init' we also needed to swap
reg_field from uint32_t to uint8_t - since the access is only
for one byte not four bytes. We can split this as a seperate
patch however we would have to use a cast to thwart compiler
warnings in the meantime.

We also truncated 'flags' to 'flag' to make the code fit within
the 80 characters.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt.c | 24 +++---
 hw/xen/xen_pt_config_init.c | 77 +++--
 2 files changed, 73 insertions(+), 28 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index ea1ceda..7575311 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -689,7 +689,7 @@ static int xen_pt_initfn(PCIDevice *d)
 {
 XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
 int rc = 0;
-uint8_t machine_irq = 0;
+uint8_t machine_irq = 0, scratch;
 uint16_t cmd = 0;
 int pirq = XEN_PT_UNASSIGNED_PIRQ;
 
@@ -735,7 +735,12 @@ static int xen_pt_initfn(PCIDevice *d)
 }
 
 /* Bind interrupt */
-if (!s->dev.config[PCI_INTERRUPT_PIN]) {
+rc = xen_host_pci_get_byte(&s->real_device, PCI_INTERRUPT_PIN, &scratch);
+if (rc) {
+XEN_PT_ERR(d, "Failed to read PCI_INTERRUPT_PIN! (rc:%d)\n", rc);
+scratch = 0;
+}
+if (!scratch) {
 XEN_PT_LOG(d, "no pin interrupt\n");
 goto out;
 }
@@ -785,8 +790,19 @@ static int xen_pt_initfn(PCIDevice *d)
 
 out:
 if (cmd) {
-xen_host_pci_set_word(&s->real_device, PCI_COMMAND,
-  pci_get_word(d->config + PCI_COMMAND) | cmd);
+uint16_t val;
+
+rc = xen_host_pci_get_word(&s->real_device, PCI_COMMAND, &val);
+if (rc) {
+XEN_PT_ERR(d, "Failed to read PCI_COMMAND! (rc: %d)\n", rc);
+} else {
+val |= cmd;
+rc = xen_host_pci_set_word(&s->real_device, PCI_COMMAND, val);
+if (rc) {
+XEN_PT_ERR(d, "Failed to write PCI_COMMAND val=0x%x!(rc: 
%d)\n",
+   val, rc);
+}
+}
 }
 
 memory_listener_register(&s->memory_listener, &s->dev.bus_master_as);
diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 21d4938..e34f9f8 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -800,15 +800,21 @@ static XenPTRegInfo xen_pt_emu_reg_vendor[] = {
 static inline uint8_t get_capability_version(XenPCIPassthroughState *s,
  uint32_t offset)
 {
-uint8_t flags = pci_get_byte(s->dev.config + offset + PCI_EXP_FLAGS);
-return flags & PCI_EXP_FLAGS_VERS;
+uint8_t flag;
+if (xen_host_pci_get_byte(&s->real_device, offset + PCI_EXP_FLAGS, &flag)) 
{
+return 0;
+}
+return flag & PCI_EXP_FLAGS_VERS;
 }
 
 static inline uint8_t get_device_type(XenPCIPassthroughState *s,
   uint32_t offset)
 {
-uint8_t flags = pci_get_byte(s->dev.config + offset + PCI_EXP_FLAGS);
-return (flags & PCI_EXP_FLAGS_TYPE) >> 4;
+uint8_t flag;
+if (xen_host_pci_get_byte(&s->real_device, offset + PCI_EXP_FLAGS, &flag)) 
{
+return 0;
+}
+return (flag & PCI_EXP_FLAGS_TYPE) >> 4;
 }
 
 /* initialize Link Control register */
@@ -857,8 +863,14 @@ static int 
xen_pt_linkctrl2_reg_init(XenPCIPassthroughState *s,
 reg_field = XEN_PT_INVALID_REG;
 } else {
 /* set Supported Link Speed */
-uint8_t lnkcap = pci_get_byte(s->dev.config + real_offset - reg->offset
-  + PCI_EXP_LNKCAP);
+uint8_t lnkcap;
+int rc;
+rc = xen_host_pci_get_byte(&s->real_device,
+   real_offset - reg->offset + PCI_EXP_LNKCAP,
+   &lnkcap);
+if (rc) {
+return rc;
+}
 reg_field |= PCI_EXP_LNKCAP_SLS & lnkcap;
 }
 
@@ -1039,13 +1051,15 @@ static int 
xen_pt_msgctrl_reg_init(XenPCIPassthroughState *s,
XenPTRegInfo *reg, uint32_t real_offset,
uint32_t *data)
 {
-PCIDevice *d = &s->dev;
 XenPTMSI *msi = s->msi;
-uint16_t reg_field = 0;
+uint16_t reg_field;
+int rc;
 
 /* use I/O device register's value as initial value */
-reg_field = pci_get_word(d->config + real_offset);
-
+rc = xen_host_pci_get_word(&s->real_device, real_offset, ®_field);
+if (rc) {
+return rc;
+}
 if (reg_field & PCI_MSI_FLAGS_ENABLE) {
 XEN_PT_LO

[Qemu-devel] [PATCH v1] Remove XenPTReg->data and use dev.config for guest configuration values.

2015-07-02 Thread Konrad Rzeszutek Wilk

Since RFC [https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg07350.html]
 - Added Acks
 - Fixed bugs

This patchset is dependent on the "Cleanups + various fixes due to libxl ABI
+   more logging on errors." 
(http://lists.xen.org/archives/html/xen-devel/2015-07/msg00431.html)
just posted.


During the discussion of 
http://lists.xen.org/archives/html/xen-devel/2015-06/msg01504.html
"[PATCH QEMU-XEN] xen/pt: Start with emulated PCI_COMMAND set to zero."
we got in a discussion about the xen/pt code usage of XenPTRreg->data
and dev.config. The crux of the matter was that the PCI_COMMAND
register read from the host changed from 0 to 3 (so PCI_COMMAND_IO
and PCI_COMMAND_MEM were enabled). That messed up QEMU badly (thought
it did recover) with some nasty messages on the Xen's ring buffer.

Fast-forward and we came to the conclusion that:

1) The dev.config is (by Xen code) used as the cache of the
   host configuration devices (which is OK at init time, but not
   during run-time as the generic API uses it as the guest cache
   of configuration registers).

2). The dev.config is (by the generic code) used as a view of what
   the guest should see (cache of guest values). 

This patchset means to fix that by having the dev.config be used
by the Xen code as the guest configuration registers and nothing more.

Since the Xen code uses the XenPTRreg->data as the configuration
registers we MUST redo some ot the init code to read the host
values directly (instead of using dev.config as a cache), and save
the guest configuration values in dev.config. We also want
dev.confg and ->data be synced up so that the initial guest values
are not polluted with host register values which we are emulating.

See: "[PATCH v1 02/10] xen/pt: Sync up the dev.config and data values."
for that logic.

And then the axe-patch: "[PATCH v1 05/10] xen/pt: Remove XenPTReg->data field."
which rips out the ->data and from there on we only use dev.config for
guest configuration registers.

The rest of the patches add extra logging, error checking, sanity
checking so that if bugs show up because of this patchset it will
be easier to identify them (I hope!).

Stefano, patch #6, #7, #9 are unchanged (I've added the Acked tag).
#10 is a bit changed, but it has the same flow so I took the liberty
of putting your Ack there.

Please advise on:
 [PATCH v1 03/10] xen/pt: Check if reg->init function sets the 'data'

especially the return code - I picked ENXIO but perhaps there is a better
one?

Also the:
 [PATCH v1 08/10] xen/pt: Make xen_pt_unregister_device idempotent

is unchanged since the last posting, becasue I could not implement your
suggestion to use QTAILQ_EMPTY (and QTAILQ_INIT).


 hw/xen/xen-host-pci-device.c |   5 +
 hw/xen/xen-host-pci-device.h |   1 +
 hw/xen/xen_pt.c  | 157 +++
 hw/xen/xen_pt.h  |   8 +-
 hw/xen/xen_pt_config_init.c  | 216 ---
 hw/xen/xen_pt_msi.c  |  18 +++-
 6 files changed, 288 insertions(+), 117 deletions(-)

Konrad Rzeszutek Wilk (10):
  xen/pt: Use xen_host_pci_get_[byte|word] instead of dev.config
  xen/pt: Sync up the dev.config and data values.
  xen/pt: Check if reg->init function sets the 'data' past the reg->size
  xen/pt: Use xen_host_pci_get_[byte,word,long] instead of 
xen_host_pci_get_long
  xen/pt: Remove XenPTReg->data field.
  xen/pt: Log xen_host_pci_get in two init functions
  xen/pt: Log xen_host_pci_get/set errors in MSI code.
  xen/pt: Make xen_pt_unregister_device idempotent
  xen/pt: Move bulk of xen_pt_unregister_device in its own routine.
  xen/pt: Check for return values for xen_host_pci_[get|set] in init

[Qemu-devel] [PATCH v1 08/10] xen/pt: Make xen_pt_unregister_device idempotent

2015-07-02 Thread Konrad Rzeszutek Wilk

To deal with xen_host_pci_[set|get]_ functions returning error values
and clearing ourselves in the init function we should make the
.exit (xen_pt_unregister_device) function be idempotent in case
the generic code starts calling .exit (or for fun does it before
calling .init!).

Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen-host-pci-device.c |  5 +
 hw/xen/xen-host-pci-device.h |  1 +
 hw/xen/xen_pt.c  | 22 --
 hw/xen/xen_pt.h  |  2 ++
 4 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/hw/xen/xen-host-pci-device.c b/hw/xen/xen-host-pci-device.c
index 743b37b..5b20570 100644
--- a/hw/xen/xen-host-pci-device.c
+++ b/hw/xen/xen-host-pci-device.c
@@ -387,6 +387,11 @@ error:
 return rc;
 }
 
+bool xen_host_pci_device_closed(XenHostPCIDevice *d)
+{
+return d->config_fd == -1;
+}
+
 void xen_host_pci_device_put(XenHostPCIDevice *d)
 {
 if (d->config_fd >= 0) {
diff --git a/hw/xen/xen-host-pci-device.h b/hw/xen/xen-host-pci-device.h
index c2486f0..16f4805 100644
--- a/hw/xen/xen-host-pci-device.h
+++ b/hw/xen/xen-host-pci-device.h
@@ -38,6 +38,7 @@ typedef struct XenHostPCIDevice {
 int xen_host_pci_device_get(XenHostPCIDevice *d, uint16_t domain,
 uint8_t bus, uint8_t dev, uint8_t func);
 void xen_host_pci_device_put(XenHostPCIDevice *pci_dev);
+bool xen_host_pci_device_closed(XenHostPCIDevice *d);
 
 int xen_host_pci_get_byte(XenHostPCIDevice *d, int pos, uint8_t *p);
 int xen_host_pci_get_word(XenHostPCIDevice *d, int pos, uint16_t *p);
diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 7575311..a094f7c 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -807,6 +807,7 @@ out:
 
 memory_listener_register(&s->memory_listener, &s->dev.bus_master_as);
 memory_listener_register(&s->io_listener, &address_space_io);
+s->listener_set = true;
 XEN_PT_LOG(d,
"Real physical device %02x:%02x.%d registered successfully!\n",
s->hostaddr.bus, s->hostaddr.slot, s->hostaddr.function);
@@ -818,10 +819,13 @@ static void xen_pt_unregister_device(PCIDevice *d)
 {
 XenPCIPassthroughState *s = XEN_PT_DEVICE(d);
 uint8_t machine_irq = s->machine_irq;
-uint8_t intx = xen_pt_pci_intx(s);
+uint8_t intx;
 int rc;
 
-if (machine_irq) {
+ /* Note that if xen_host_pci_device_put had closed config_fd, then
+  * intx value becomes 0xff. */
+intx = xen_pt_pci_intx(s);
+if (machine_irq && !xen_host_pci_device_closed(&s->real_device)) {
 rc = xc_domain_unbind_pt_irq(xen_xc, xen_domid, machine_irq,
  PT_IRQ_TYPE_PCI,
  pci_bus_num(d->bus),
@@ -836,6 +840,7 @@ static void xen_pt_unregister_device(PCIDevice *d)
 }
 }
 
+/* N.B. xen_pt_config_delete takes care of freeing them. */
 if (s->msi) {
 xen_pt_msi_disable(s);
 }
@@ -855,15 +860,20 @@ static void xen_pt_unregister_device(PCIDevice *d)
machine_irq, errno);
 }
 }
+s->machine_irq = 0;
 }
 
 /* delete all emulated config registers */
 xen_pt_config_delete(s);
 
-memory_listener_unregister(&s->memory_listener);
-memory_listener_unregister(&s->io_listener);
-
-xen_host_pci_device_put(&s->real_device);
+if (s->listener_set) {
+memory_listener_unregister(&s->memory_listener);
+memory_listener_unregister(&s->io_listener);
+s->listener_set = false;
+}
+if (!xen_host_pci_device_closed(&s->real_device)) {
+xen_host_pci_device_put(&s->real_device);
+}
 }
 
 static Property xen_pci_passthrough_properties[] = {
diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 586d055..2b58521 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -221,6 +221,7 @@ struct XenPCIPassthroughState {
 
 MemoryListener memory_listener;
 MemoryListener io_listener;
+bool listener_set;
 };
 
 int xen_pt_config_init(XenPCIPassthroughState *s);
@@ -286,6 +287,7 @@ static inline uint8_t 
xen_pt_pci_intx(XenPCIPassthroughState *s)
" value=%i, acceptable range is 1 - 4\n", r_val);
 r_val = 0;
 } else {
+/* Note that if s.real_device.config_fd is closed we make 0xff. */
 r_val -= 1;
 }
 
-- 
2.1.0

[Qemu-devel] [PATCH v1] Cleanups + various fixes due to libxl ABI + more logging on errors.

2015-07-02 Thread Konrad Rzeszutek Wilk

Hey!

since RFC [https://lists.gnu.org/archive/html/qemu-devel/2015-06/msg07326.html]
 - Added Acks
 - Followed 'xen: Print and use errno where applicable.' suggestion by Stefano
   and added a wrapper.

As I am in the process of syncing the 'dev.config' and Xen's internal
cache of the PCI config space I noticed that some oddities
in the code and figured that having them in would be easier
for me (so I don't have to handle 20 odd patches).


Stefano, patches 1-3 are unchanged since the RFC posting (well, they have the
Ack tag added). Patch:

 [PATCH v1 4/6] xen: use errno instead of rc for xc_domain_add_to_physmap

uses the new wrapper you suggested.

The last two patches are new and I missed them the first time I posted:
 [PATCH v1 5/6] xen/pt/msi: Add the register value when printing
 [PATCH v1 6/6] xen/pt: Use XEN_PT_LOG properly to guard against

In short, please review #4,#5, and #6.


 hw/xen/xen_pt.c |  6 +++---
 hw/xen/xen_pt.h |  1 -
 hw/xen/xen_pt_config_init.c |  8 
 hw/xen/xen_pt_msi.c |  2 +-
 include/hw/xen/xen_common.h | 22 ++
 xen-hvm.c   |  4 ++--
 6 files changed, 32 insertions(+), 11 deletions(-)

Konrad Rzeszutek Wilk (6):
  xen/pt: Update comments with proper function name.
  xen/pt: Make xen_pt_msi_set_enable static
  xen/pt: xen_host_pci_config_read returns -errno, not -1 on failure
  xen: use errno instead of rc for xc_domain_add_to_physmap
  xen/pt/msi: Add the register value when printing logging and error 
messages
  xen/pt: Use XEN_PT_LOG properly to guard against compiler warnings.

[Qemu-devel] [PATCH v1 4/6] xen: use errno instead of rc for xc_domain_add_to_physmap

2015-07-02 Thread Konrad Rzeszutek Wilk

In Xen 4.6 commit cd2f100f0f61b3f333d52d1737dd73f02daee592
"libxc: Fix do_memory_op to return negative value on errors"
made the libxc API less odd-ball: On errors, return value is
-1 and error code is in errno. On success the return value
is either 0 or an positive value.

Since we could be running with an old toolstack in which the
Exx value is in rc or the newer, we add an wrapper around
the xc_domain_add_to_physmap (called xen_xc_domain_add_to_physmap)
which will always return the EXX.

Suggested-by: Stefano Stabellini 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 include/hw/xen/xen_common.h | 22 ++
 xen-hvm.c   |  4 ++--
 2 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/include/hw/xen/xen_common.h b/include/hw/xen/xen_common.h
index 38f29fb..fc66c3c 100644
--- a/include/hw/xen/xen_common.h
+++ b/include/hw/xen/xen_common.h
@@ -407,4 +407,26 @@ static inline int xen_set_ioreq_server_state(XenXC xc, 
domid_t dom,
 
 #endif
 
+#if CONFIG_XEN_CTRL_INTERFACE_VERSION < 460
+static inline int xen_xc_domain_add_to_physmap(XenXC xch, uint32_t domid,
+   unsigned int space,
+   unsigned long idx,
+   xen_pfn_t gpfn)
+{
+return xc_domain_add_to_physmap(xch, domid, space, idx, gpfn);
+}
+#else
+static inline int xen_xc_domain_add_to_physmap(XenXC xch, uint32_t domid,
+   unsigned int space,
+   unsigned long idx,
+   xen_pfn_t gpfn)
+{
+/* In Xen 4.6 rc is -1 and errno contains the error value. */
+int rc = xc_domain_add_to_physmap(xch, domid, space, idx, gpfn);
+if (rc == -1)
+return errno;
+return rc;
+}
+#endif
+
 #endif /* QEMU_HW_XEN_COMMON_H */
diff --git a/xen-hvm.c b/xen-hvm.c
index 0408462..fae2d22 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -345,7 +345,7 @@ go_physmap:
 unsigned long idx = pfn + i;
 xen_pfn_t gpfn = start_gpfn + i;
 
-rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, 
idx, gpfn);
+rc = xen_xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, 
idx, gpfn);
 if (rc) {
 DPRINTF("add_to_physmap MFN %"PRI_xen_pfn" to PFN %"
 PRI_xen_pfn" failed: %d\n", idx, gpfn, rc);
@@ -422,7 +422,7 @@ static int xen_remove_from_physmap(XenIOState *state,
 xen_pfn_t idx = start_addr + i;
 xen_pfn_t gpfn = phys_offset + i;
 
-rc = xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, 
idx, gpfn);
+rc = xen_xc_domain_add_to_physmap(xen_xc, xen_domid, XENMAPSPACE_gmfn, 
idx, gpfn);
 if (rc) {
 fprintf(stderr, "add_to_physmap MFN %"PRI_xen_pfn" to PFN %"
 PRI_xen_pfn" failed: %d\n", idx, gpfn, rc);
-- 
2.1.0

[Qemu-devel] [PATCH v1 6/6] xen/pt: Use XEN_PT_LOG properly to guard against compiler warnings.

2015-07-02 Thread Konrad Rzeszutek Wilk

If XEN_PT_LOGGING_ENABLED is enabled the XEN_PT_LOG macros start
using the first argument. Which means if within the function there
is only one user of the argument ('d') and XEN_PT_LOGGING_ENABLED
is not set, we get compiler warnings. This is not the case now
but with the "xen/pt: Use xen_host_pci_get_[byte|word] instead of dev.config"
we will hit - so this sync up the function to the rest of them.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt_config_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index 462e1b9..21d4938 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1418,7 +1418,7 @@ static int 
xen_pt_msixctrl_reg_init(XenPCIPassthroughState *s,
 reg_field = pci_get_word(d->config + real_offset);
 
 if (reg_field & PCI_MSIX_FLAGS_ENABLE) {
-XEN_PT_LOG(d, "MSIX already enabled, disabling it first\n");
+XEN_PT_LOG(&s->dev, "MSIX already enabled, disabling it first\n");
 xen_host_pci_set_word(&s->real_device, real_offset,
   reg_field & ~PCI_MSIX_FLAGS_ENABLE);
 }
-- 
2.1.0

[Qemu-devel] [PATCH v1 2/6] xen/pt: Make xen_pt_msi_set_enable static

2015-07-02 Thread Konrad Rzeszutek Wilk

As we do not use it outside our code.

Reviewed-by: Stefano Stabellini 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt.h | 1 -
 hw/xen/xen_pt_msi.c | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/hw/xen/xen_pt.h b/hw/xen/xen_pt.h
index 393f36c..09358b1 100644
--- a/hw/xen/xen_pt.h
+++ b/hw/xen/xen_pt.h
@@ -289,7 +289,6 @@ static inline uint8_t 
xen_pt_pci_intx(XenPCIPassthroughState *s)
 }
 
 /* MSI/MSI-X */
-int xen_pt_msi_set_enable(XenPCIPassthroughState *s, bool en);
 int xen_pt_msi_setup(XenPCIPassthroughState *s);
 int xen_pt_msi_update(XenPCIPassthroughState *d);
 void xen_pt_msi_disable(XenPCIPassthroughState *s);
diff --git a/hw/xen/xen_pt_msi.c b/hw/xen/xen_pt_msi.c
index 263e051..5822df5 100644
--- a/hw/xen/xen_pt_msi.c
+++ b/hw/xen/xen_pt_msi.c
@@ -220,7 +220,7 @@ static int msi_msix_disable(XenPCIPassthroughState *s,
  * MSI virtualization functions
  */
 
-int xen_pt_msi_set_enable(XenPCIPassthroughState *s, bool enable)
+static int xen_pt_msi_set_enable(XenPCIPassthroughState *s, bool enable)
 {
 XEN_PT_LOG(&s->dev, "%s MSI.\n", enable ? "enabling" : "disabling");
 
-- 
2.1.0

[Qemu-devel] [PATCH v1 1/6] xen/pt: Update comments with proper function name.

2015-07-02 Thread Konrad Rzeszutek Wilk

It has changed but the comments still refer to the old names.

Reviewed-by: Stefano Stabellini 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index ed5fcae..706e3d9 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -378,7 +378,7 @@ static void xen_pt_pci_write_config(PCIDevice *d, uint32_t 
addr,
 }
 }
 
-/* need to shift back before passing them to xen_host_pci_device */
+/* need to shift back before passing them to xen_host_pci_set_block. */
 val >>= (addr & 3) << 3;
 
 memory_region_transaction_commit();
@@ -406,7 +406,7 @@ out:
 (uint8_t *)&val + index, len);
 
 if (rc < 0) {
-XEN_PT_ERR(d, "pci_write_block failed. return value: %d.\n", rc);
+XEN_PT_ERR(d, "xen_host_pci_set_block failed. return value: 
%d.\n", rc);
 }
 }
 }
-- 
2.1.0

[Qemu-devel] [PATCH v1 5/6] xen/pt/msi: Add the register value when printing logging and error messages

2015-07-02 Thread Konrad Rzeszutek Wilk

We would like to know what the MSI register value is to help
in troubleshooting in the field. As such modify the logging
logic to include such details in xen_pt_msgctrl_reg_write.

Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt_config_init.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/xen/xen_pt_config_init.c b/hw/xen/xen_pt_config_init.c
index dd37be3..462e1b9 100644
--- a/hw/xen/xen_pt_config_init.c
+++ b/hw/xen/xen_pt_config_init.c
@@ -1086,7 +1086,7 @@ static int 
xen_pt_msgctrl_reg_write(XenPCIPassthroughState *s,
 /* setup MSI pirq for the first time */
 if (!msi->initialized) {
 /* Init physical one */
-XEN_PT_LOG(&s->dev, "setup MSI\n");
+XEN_PT_LOG(&s->dev, "setup MSI (register: %x).\n", *val);
 if (xen_pt_msi_setup(s)) {
 /* We do not broadcast the error to the framework code, so
  * that MSI errors are contained in MSI emulation code and
@@ -1094,12 +1094,12 @@ static int 
xen_pt_msgctrl_reg_write(XenPCIPassthroughState *s,
  * Guest MSI would be actually not working.
  */
 *val &= ~PCI_MSI_FLAGS_ENABLE;
-XEN_PT_WARN(&s->dev, "Can not map MSI.\n");
+XEN_PT_WARN(&s->dev, "Can not map MSI (register: %x)!\n", 
*val);
 return 0;
 }
 if (xen_pt_msi_update(s)) {
 *val &= ~PCI_MSI_FLAGS_ENABLE;
-XEN_PT_WARN(&s->dev, "Can not bind MSI\n");
+XEN_PT_WARN(&s->dev, "Can not bind MSI (register: %x)!\n", 
*val);
 return 0;
 }
 msi->initialized = true;
-- 
2.1.0

[Qemu-devel] [PATCH v1 3/6] xen/pt: xen_host_pci_config_read returns -errno, not -1 on failure

2015-07-02 Thread Konrad Rzeszutek Wilk

However the init routines assume that on errors the return
code is -1 (as the libxc API is) - while those xen_host_* routines follow
another paradigm - negative errno on return, 0 on success.

Reviewed-by: Stefano Stabellini 
Signed-off-by: Konrad Rzeszutek Wilk 
---
 hw/xen/xen_pt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index 706e3d9..ea1ceda 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -716,7 +716,7 @@ static int xen_pt_initfn(PCIDevice *d)
 
 /* Initialize virtualized PCI configuration (Extended 256 Bytes) */
 if (xen_host_pci_get_block(&s->real_device, 0, d->config,
-   PCI_CONFIG_SPACE_SIZE) == -1) {
+   PCI_CONFIG_SPACE_SIZE) < 0) {
 xen_host_pci_device_put(&s->real_device);
 return -1;
 }
-- 
2.1.0

Re: [Qemu-devel] [PATCH v2] thread-win32: fix GetThreadContext() permanently fails

2015-07-02 Thread Zavadovsky Yan

I tested this patch on my 4-cores cpu.
Debug and release builds both.
Win32 and Win64 binaries both. (I used old Fedora 17-18 with SJLJ mingw-w64
to crossbuild for Win64.)
With default Qemu BIOS and with myself-builded OVMF(also debug and
release) from EDK2.

Also I did some synthetic tests with sample from this article:
http://blogs.msdn.com/b/oldnewthing/archive/2015/02/05/10591215.aspx
modified as I described here:
http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg06894.html

On Thu, Jul 2, 2015 at 7:52 PM, Fabien Chouteau 
wrote:

> On 07/01/2015 08:00 PM, Stefan Weil wrote:
> > Am 01.07.2015 um 18:49 schrieb Paolo Bonzini:
> >>
> >> On 01/07/2015 17:48, Zavadovsky Yan wrote:
> >>> Ping.
> >> Stefan, are you merging this?
> >>
> >> Paolo
> >
> > I can do so, but as the current code seems to fix the problems
> > with multi-processor systems, too (even if it is unclear why),
> > it does not look urgent.
> >
> > Fabien, you suggested "extensive tests". Do you think that
> > patch v2 is fine, or are you still waiting for test results?
> >
>
> The patch looks good. I won't be able to do heavy testing anytime soon,
> if Yan tells us that the new code was tested I will take his word for it.
>
> Thanks,
>
>

Re: [Qemu-devel] [PATCH] virtio-pci: implement cfg capability

2015-07-02 Thread Paolo Bonzini



On 02/07/2015 21:00, Michael S. Tsirkin wrote:
> On Thu, Jul 02, 2015 at 08:48:14PM +0200, Paolo Bonzini wrote:
>>
>>
>> On 02/07/2015 15:00, Michael S. Tsirkin wrote:
>>> +cfg = (void *)(proxy->pci_dev.config + proxy->config_cap);
>>> +off = le32_to_cpu(cfg->cap.offset);
>>> +len = le32_to_cpu(cfg->cap.length);
>>> +
>>> +if ((len == 1 || len == 2 || len == 4)) {
>>> +address_space_write(&proxy->modern_as, off,
>>> +MEMTXATTRS_UNSPECIFIED,
>>> +cfg->pci_cfg_data, len);
>>> +}
>>
>> This parses pci_cfg_data in target endianness I think.  You just want to
>> move the little-endian value from the config cap to the little-endian
>> value in the modern_as, so you need to use ldl_le_p and
>> address_space_stl_le.
> 
> but isn't that two byteswaps? why isn't this same as memcpy?

It is a memcpy if you write to RAM, but the MMIO ops take an unsigned
integer, so you can still have one byteswap hiding; you have to be
careful when you have this kind of "forwarder", and the simplest way to
do it is to use an ldl/stl pair with the same endianness.

Paolo

>>> +}
>>> +}
>>> +
>>> +static uint32_t virtio_read_config(PCIDevice *pci_dev,
>>> +   uint32_t address, int len)
>>> +{
>>> +VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
>>> +struct virtio_pci_cfg_cap *cfg;
>>> +
>>> +if (proxy->config_cap &&
>>> +ranges_overlap(address, len, proxy->config_cap + offsetof(struct 
>>> virtio_pci_cfg_cap,
>>> +  
>>> pci_cfg_data),
>>> +   sizeof cfg->pci_cfg_data)) {
>>> +uint32_t off;
>>> +uint32_t len;
>>> +
>>> +cfg = (void *)(proxy->pci_dev.config + proxy->config_cap);
>>> +off = le32_to_cpu(cfg->cap.offset);
>>> +len = le32_to_cpu(cfg->cap.length);
>>> +
>>> +if ((len == 1 || len == 2 || len == 4)) {
>>> +address_space_read(&proxy->modern_as, off,
>>> +MEMTXATTRS_UNSPECIFIED,
>>> +cfg->pci_cfg_data, len);
>>
>> Same here, use address_space_ldl_le to read into an int, and stl_le_p to
>> write into cfg->pci_cfg_data.
>>
>> The best way to check it, of course, is to write a unit test! :)  But
>> you could also use a Linux BE guest on LE host.
>>
>> Everything else looks good.
>>
>> Paolo

Re: [Qemu-devel] [PATCH] virtio-pci: implement cfg capability

2015-07-02 Thread Michael S. Tsirkin

On Thu, Jul 02, 2015 at 08:48:14PM +0200, Paolo Bonzini wrote:
> 
> 
> On 02/07/2015 15:00, Michael S. Tsirkin wrote:
> > +cfg = (void *)(proxy->pci_dev.config + proxy->config_cap);
> > +off = le32_to_cpu(cfg->cap.offset);
> > +len = le32_to_cpu(cfg->cap.length);
> > +
> > +if ((len == 1 || len == 2 || len == 4)) {
> > +address_space_write(&proxy->modern_as, off,
> > +MEMTXATTRS_UNSPECIFIED,
> > +cfg->pci_cfg_data, len);
> > +}
> 
> This parses pci_cfg_data in target endianness I think.  You just want to
> move the little-endian value from the config cap to the little-endian
> value in the modern_as, so you need to use ldl_le_p and
> address_space_stl_le.

but isn't that two byteswaps? why isn't this same as memcpy?

> > +}
> > +}
> > +
> > +static uint32_t virtio_read_config(PCIDevice *pci_dev,
> > +   uint32_t address, int len)
> > +{
> > +VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
> > +struct virtio_pci_cfg_cap *cfg;
> > +
> > +if (proxy->config_cap &&
> > +ranges_overlap(address, len, proxy->config_cap + offsetof(struct 
> > virtio_pci_cfg_cap,
> > +  
> > pci_cfg_data),
> > +   sizeof cfg->pci_cfg_data)) {
> > +uint32_t off;
> > +uint32_t len;
> > +
> > +cfg = (void *)(proxy->pci_dev.config + proxy->config_cap);
> > +off = le32_to_cpu(cfg->cap.offset);
> > +len = le32_to_cpu(cfg->cap.length);
> > +
> > +if ((len == 1 || len == 2 || len == 4)) {
> > +address_space_read(&proxy->modern_as, off,
> > +MEMTXATTRS_UNSPECIFIED,
> > +cfg->pci_cfg_data, len);
> 
> Same here, use address_space_ldl_le to read into an int, and stl_le_p to
> write into cfg->pci_cfg_data.
> 
> The best way to check it, of course, is to write a unit test! :)  But
> you could also use a Linux BE guest on LE host.
> 
> Everything else looks good.
> 
> Paolo

Re: [Qemu-devel] [Xen-devel] [PATCH RFC 1 6/8] xen/pt: Make xen_pt_unregister_device idempotent

2015-07-02 Thread Konrad Rzeszutek Wilk

> > @@ -858,15 +863,20 @@ static void xen_pt_unregister_device(PCIDevice *d)
> > machine_irq, errno);
> >  }
> >  }
> > +s->machine_irq = 0;
> >  }
> >  
> >  /* delete all emulated config registers */
> >  xen_pt_config_delete(s);
> >  
> > -memory_listener_unregister(&s->memory_listener);
> > -memory_listener_unregister(&s->io_listener);
> > -
> > -xen_host_pci_device_put(&s->real_device);
> > +if (s->listener_set) {
> > +memory_listener_unregister(&s->memory_listener);
> > +memory_listener_unregister(&s->io_listener);
> > +s->listener_set = false;
> 
> If you call QTAILQ_INIT on memory_listener and io_listener, then you
> simply check on QTAILQ_EMPTY and remove listener_set.

No dice. For that to work they need to be QTAIL_HEAD while
they are of QTAILQ_ENTRY.

That is, the include/exec/memory.h:
struct MemoryListener {
 ...
AddressSpace *address_space_filter;
QTAILQ_ENTRY(MemoryListener) link;
}

And the QTAILQ_EMPTY checks for '>tqh_first' while the
QTAILQ_ENTRY only defines two pointers: tqe_next and tqe_prev.

Re: [Qemu-devel] [PATCH v2] arm_mptimer: Fix timer shutdown

2015-07-02 Thread Dmitry Osipenko


02.07.2015 21:43, Dmitry Osipenko пишет:

02.07.2015 21:09, Peter Maydell пишет:
TIMER_CONTROL_IT_ENABLE) to TIMER_CONTROL < it won't start, bug



s/it won't start/it won't start periodic/

--
Dmitry

Re: [Qemu-devel] [PATCH] virtio-pci: implement cfg capability

2015-07-02 Thread Paolo Bonzini

On 02/07/2015 15:00, Michael S. Tsirkin wrote:
> +cfg = (void *)(proxy->pci_dev.config + proxy->config_cap);
> +off = le32_to_cpu(cfg->cap.offset);
> +len = le32_to_cpu(cfg->cap.length);
> +
> +if ((len == 1 || len == 2 || len == 4)) {
> +address_space_write(&proxy->modern_as, off,
> +MEMTXATTRS_UNSPECIFIED,
> +cfg->pci_cfg_data, len);
> +}

This parses pci_cfg_data in target endianness I think.  You just want to
move the little-endian value from the config cap to the little-endian
value in the modern_as, so you need to use ldl_le_p and
address_space_stl_le.

> +}
> +}
> +
> +static uint32_t virtio_read_config(PCIDevice *pci_dev,
> +   uint32_t address, int len)
> +{
> +VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
> +struct virtio_pci_cfg_cap *cfg;
> +
> +if (proxy->config_cap &&
> +ranges_overlap(address, len, proxy->config_cap + offsetof(struct 
> virtio_pci_cfg_cap,
> +  
> pci_cfg_data),
> +   sizeof cfg->pci_cfg_data)) {
> +uint32_t off;
> +uint32_t len;
> +
> +cfg = (void *)(proxy->pci_dev.config + proxy->config_cap);
> +off = le32_to_cpu(cfg->cap.offset);
> +len = le32_to_cpu(cfg->cap.length);
> +
> +if ((len == 1 || len == 2 || len == 4)) {
> +address_space_read(&proxy->modern_as, off,
> +MEMTXATTRS_UNSPECIFIED,
> +cfg->pci_cfg_data, len);

Same here, use address_space_ldl_le to read into an int, and stl_le_p to
write into cfg->pci_cfg_data.

The best way to check it, of course, is to write a unit test! :)  But
you could also use a Linux BE guest on LE host.

Everything else looks good.

Paolo

Re: [Qemu-devel] [PATCH v2] arm_mptimer: Fix timer shutdown

2015-07-02 Thread Dmitry Osipenko


02.07.2015 21:09, Peter Maydell пишет:

On 2 July 2015 at 18:52, Dmitry Osipenko  wrote:

02.07.2015 20:34, Peter Maydell пишет:



This will now cause us to do the "reload the timer"
logic if you write a 1 to the control bit when it was
already 1, which we didn't do before.

The logic I suggested in my previous review
comment gets this right...

-- PMM



The problem with code you suggested is that won't start periodic count after
one-shot tick was completed.


Can you give more detail? This code is only for when
the guest writes to the control register, so it doesn't
get run when a one-shot tick completes.

In any case, the code currently in master does:
   old value   new valueaction
   0   0  nothing
   0   1  reload timer
   1   0  nothing
   1   1  nothing

Your first patch did:

   old value   new valueaction
   0   0  delete timer
   0   1  reload timer
   1   0  delete timer
   1   1  nothing

Your second patch does:

   old value   new valueaction
   0   0  nothing
   0   1  reload timer
   1   0  delete timer
   1   1  reload timer

My suggestion was:

   old value   new valueaction
   0   0  nothing
   0   1  reload timer
   1   0  delete timer
   1   1  nothing

If you think that's wrong, then surely your first
patch also has the same problem?

thanks
-- PMM



Yes, my first patch has same problem. Noticed that issue couple hours ago, it's 
separate bug as I see now... Will make patch for it too.


To clarify new issue:
1) load TIMER_LOAD with some value
	2) write (TIMER_CONTROL_ENABLE | TIMER_CONTROL_ONESHOT | 
TIMER_CONTROL_IT_ENABLE) to TIMER_CONTROL

3) wait for one-shot complete
4) re-load TIMER_LOAD
	5) write (TIMER_CONTROL_ENABLE | TIMER_CONTROL_PERIODIC | 
TIMER_CONTROL_IT_ENABLE) to TIMER_CONTROL < it won't start, bug



Oh, and just noticed that timer code doesn't handle IT(interrupt enable) bit. 
Will fix it too.


--
Dmitry

Re: [Qemu-devel] [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Paolo Bonzini



On 02/07/2015 20:01, Xiao Guangrong wrote:
> 
> Thanks for your review, Stefan and Paolo!
> 
> On 07/02/2015 05:52 PM, Paolo Bonzini wrote:
>>
>>
>> On 02/07/2015 11:20, Stefan Hajnoczi wrote:
 Currently, the NVDIMM driver has been merged into upstream Linux
 Kernel and
 this patchset tries to enable it in virtualization field
>>>
>>>  From a device model perspective, have you checked whether it makes
>>> sense
>>> to integrate nvdimms into the pc-dimm and hostmem code that is used for
>>> memory hotplug and NUMA?
>>>
>>> The NVDIMM device in your patches is a completely new TYPE_DEVICE so it
>>> doesn't share any interfaces or code with existing memory devices.
>>> Maybe that is the right solution here because NVDIMMs have different
>>> characteristics, but I'm not sure.
>>
>> The hostmem code should definitely be shared, e.g. by adding a new
>> "file" property to the memory-backend-file class.  ivshmem can also use
>> it---CCing Marc-Andr�.
> 
> However, file-based memory used by NVDIMM is special, it divides the file
> to two parts, one part is used as PMEM and another part is used to store
> NVDIMM's configure data.
> 
> Maybe we can introduce "end-reserved" property to reserve specified size
> at the end of the file. Or create a new class type based on
> memory-backend-file (named nvdimm-backend-file) class to hide this magic
> thing?

I need to read the code then. :)

Paolo

Re: [Qemu-devel] [PATCH v2] arm_mptimer: Fix timer shutdown

2015-07-02 Thread Peter Maydell

On 2 July 2015 at 18:52, Dmitry Osipenko  wrote:
> 02.07.2015 20:34, Peter Maydell пишет:
>>
>>
>> This will now cause us to do the "reload the timer"
>> logic if you write a 1 to the control bit when it was
>> already 1, which we didn't do before.
>>
>> The logic I suggested in my previous review
>> comment gets this right...
>>
>> -- PMM
>>
>
> The problem with code you suggested is that won't start periodic count after
> one-shot tick was completed.

Can you give more detail? This code is only for when
the guest writes to the control register, so it doesn't
get run when a one-shot tick completes.

In any case, the code currently in master does:
  old value   new valueaction
  0   0  nothing
  0   1  reload timer
  1   0  nothing
  1   1  nothing

Your first patch did:

  old value   new valueaction
  0   0  delete timer
  0   1  reload timer
  1   0  delete timer
  1   1  nothing

Your second patch does:

  old value   new valueaction
  0   0  nothing
  0   1  reload timer
  1   0  delete timer
  1   1  reload timer

My suggestion was:

  old value   new valueaction
  0   0  nothing
  0   1  reload timer
  1   0  delete timer
  1   1  nothing

If you think that's wrong, then surely your first
patch also has the same problem?

thanks
-- PMM

Re: [Qemu-devel] [PATCH 14/16] nvdimm: support NFIT_CMD_GET_CONFIG_SIZE function

2015-07-02 Thread Xiao Guangrong




On 07/02/2015 05:23 PM, Stefan Hajnoczi wrote:

On Wed, Jul 01, 2015 at 10:50:30PM +0800, Xiao Guangrong wrote:

+static uint32_t dsm_cmd_config_size(struct dsm_buffer *in, struct dsm_out *out)
+{
+GSList *list = get_nvdimm_built_list();
+PCNVDIMMDevice *nvdimm = get_nvdimm_device_by_handle(list, in->handle);
+uint32_t status = NFIT_STATUS_NON_EXISTING_MEM_DEV;
+
+if (!nvdimm) {
+goto exit;
+}
+
+status = NFIT_STATUS_SUCCESS;
+out->cmd_config_size.config_size = nvdimm->config_data_size;
+out->cmd_config_size.max_xfer = max_xfer_config_size();


cpu_to_*() missing?

It should be possible to emulate NVDIMMs for a x86_64 guest on a
big-endian host, for example.


Indeed, will fix it in the next version, thank you for pointing it out.

Re: [Qemu-devel] [PATCH 00/16] implement vNVDIMM

2015-07-02 Thread Xiao Guangrong



Thanks for your review, Stefan and Paolo!

On 07/02/2015 05:52 PM, Paolo Bonzini wrote:



On 02/07/2015 11:20, Stefan Hajnoczi wrote:

Currently, the NVDIMM driver has been merged into upstream Linux Kernel and
this patchset tries to enable it in virtualization field


 From a device model perspective, have you checked whether it makes sense
to integrate nvdimms into the pc-dimm and hostmem code that is used for
memory hotplug and NUMA?

The NVDIMM device in your patches is a completely new TYPE_DEVICE so it
doesn't share any interfaces or code with existing memory devices.
Maybe that is the right solution here because NVDIMMs have different
characteristics, but I'm not sure.


The hostmem code should definitely be shared, e.g. by adding a new
"file" property to the memory-backend-file class.  ivshmem can also use
it---CCing Marc-Andr�.


However, file-based memory used by NVDIMM is special, it divides the file
to two parts, one part is used as PMEM and another part is used to store
NVDIMM's configure data.

Maybe we can introduce "end-reserved" property to reserve specified size
at the end of the file. Or create a new class type based on
memory-backend-file (named nvdimm-backend-file) class to hide this magic
thing?



I don't know about the pc-dimm devices.  If the NVDIMM devices can do
_OST and can be hotplugged, then the answer is probably yes.


_OST is not needed for NVDIMM.

NVDIMM is completely different with dimm memory device in ACPI - it has
different HID, method object, memory range detection, device organization,
etc. So i prefer to introducing new device class for NVDIMM.

For hotplug, NVDIMM and DIMM can share some logic, e.g, free－address－range
management, slot management ... ( but new Object initiation in ACPI is
complete different), we can abstract these operation as common part.

NUMA detection is also different between NVDIMM, DIMM is also different,
NVDIMM need to report its NUMA affinity in SPA table. But they can share
some common function i think.

BTW, i am going to implement vNVDIMM hotplug once linux NVDIMM driver
supports it.

Re: [Qemu-devel] [PATCH 0/5] qapi flattening + some miscellaneous patches

2015-07-02 Thread Markus Armbruster

Gerd Hoffmann  writes:

> On Di, 2015-06-23 at 15:32 +0200, Kővágó, Zoltán wrote:
>> I've cherry-picked the qapi related parts from my previous -audiodev
>> patch series, we can hopefully concentrate on one thing at a time.  The
>> most important changes in this patch series are the flattening of the
>> Netdev structures.  This way we can add proper nested structure support
>> into OptsVisitor, without requiring backward-compatibility hacks.
>
> Applies and builds fine, no obvious regressions in testing.
>
> Tested-by: Gerd Hoffmann 
>
> Getting this merged before hard freeze would be great ...
>
> Any takers?  Markus?

A first round of review is my best offer, I'm afraid: I'll be away the
next two weeks.

Any particular reason why we want this in 2.4?

Re: [Qemu-devel] [PATCH 5/5] opts: produce valid command line in qemu_opts_print

2015-07-02 Thread Markus Armbruster

"Kővágó, Zoltán"  writes:

> This will let us print options in a format that the user would actually
> write it on the command line (foo=bar,baz=asd,etc=def), without
> prepending a spurious comma at the beginning of the list, or quoting
> values unnecessarily.  This patch provides the following changes:
> * write and id=, if the option has an id
> * do not print separator before the first element
> * do not quote string arguments which only contains letters or numbers
> * properly escape commas (,) for QEMU, apostrophe (') for shell
>
> Signed-off-by: Kővágó, Zoltán 
>
> ---
>
> Chages from v1 -audiodev patch:
> * print id=
> * proper value escaping (apostrophe and comma)
> * renamed d_sep -> separator
>
>
>  block.c|  2 +-
>  util/qemu-option.c | 47 ---
>  2 files changed, 45 insertions(+), 4 deletions(-)
>
> diff --git a/block.c b/block.c
> index e2e33fd..a18af00 100644
> --- a/block.c
> +++ b/block.c
> @@ -3825,7 +3825,7 @@ void bdrv_img_create(const char *filename, const char 
> *fmt,
>  }
>  
>  if (!quiet) {
> -printf("Formatting '%s', fmt=%s", filename, fmt);
> +printf("Formatting '%s', fmt=%s ", filename, fmt);
>  qemu_opts_print(opts, " ");
>  puts("");
>  }
> diff --git a/util/qemu-option.c b/util/qemu-option.c
> index efe9d27..db60ac9 100644
> --- a/util/qemu-option.c
> +++ b/util/qemu-option.c
> @@ -730,14 +730,53 @@ void qemu_opts_del(QemuOpts *opts)
>  g_free(opts);
>  }
>  
> -void qemu_opts_print(QemuOpts *opts, const char *sep)
> +/* print value properly escaping it for the shell (at least for bash) */
> +static void escaped_print(const char *value)
> +{
> +const char *ptr;
> +bool need_quote = false;
> +
> +for (ptr = value; *ptr; ++ptr) {
> +if (!qemu_isalnum(*ptr)) {
> +need_quote = true;
> +break;
> +}
> +}

Rather eager to quote.  Shell wants you to quote

|  &  ;  <  >  (  )  $ `  \  "  '  

always, and

*   ?   [   #   ˜   =   %

in some contexts.

In particular, there is no need to quote the common characters '.', '-'
and '_'.

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_02

> +
> +if (need_quote) {
> +putchar('\'');
> +for (ptr = value; *ptr; ++ptr) {
> +if (*ptr == '\'') {
> +printf("'\\''");

Won't achieve the stated purpose "for shell", because backslash escapes
do not work within single quotes.

You could try double quote, but things become even more complicated
then.

> +} else if (*ptr == ',') {
> +printf(",,");
> +} else {
> +putchar(*ptr);
> +}

If we do backslash escapes, should we escape non-printing characters?

> +}
> +putchar('\'');
> +} else {
> +printf("%s", value);
> +}
> +}
> +
> +void qemu_opts_print(QemuOpts *opts, const char *separator)
>  {
>  QemuOpt *opt;
>  QemuOptDesc *desc = opts->list->desc;
> +const char *sep = "";
> +
> +if (opts->id) {
> +printf("id=");
> +escaped_print(opts->id);

Since opts->id has passed id_wellformed(), it won't need quoting.
Your escaped_print() may quote it anyway, though.

> +sep = separator;
> +}
>  
>  if (desc[0].name == NULL) {
>  QTAILQ_FOREACH(opt, &opts->head, next) {
> -printf("%s%s=\"%s\"", sep, opt->name, opt->str);
> +printf("%s%s=", sep, opt->name);
> +escaped_print(opt->str);
> +sep = separator;
>  }
>  return;
>  }
> @@ -750,13 +789,15 @@ void qemu_opts_print(QemuOpts *opts, const char *sep)
>  continue;
>  }
>  if (desc->type == QEMU_OPT_STRING) {
> -printf("%s%s='%s'", sep, desc->name, value);
> +printf("%s%s=", sep, desc->name);
> +escaped_print(value);
>  } else if ((desc->type == QEMU_OPT_SIZE ||
>  desc->type == QEMU_OPT_NUMBER) && opt) {
>  printf("%s%s=%" PRId64, sep, desc->name, opt->value.uint);
>  } else {
>  printf("%s%s=%s", sep, desc->name, value);
>  }
> +sep = separator;
>  }
>  }

Things I like about the patch:

* Making parameter sep a true separator

* Printing opts->id

* Escaping comma

I'm not sure the "quote for shell" feature is worth the bother.  If we
want it, then

"foo=,msg=\"hello\",gnu=on"

would perhaps be nicer than

foo="",msg="\"hello\"",gnu=on

but could be even more complicated to do.

If you want to pursue quoting for shell, I recommend to split this patch
into a part I like and a part we want to discuss :)

Re: [Qemu-devel] [PATCH v2] arm_mptimer: Fix timer shutdown

2015-07-02 Thread Dmitry Osipenko


02.07.2015 20:34, Peter Maydell пишет:


This will now cause us to do the "reload the timer"
logic if you write a 1 to the control bit when it was
already 1, which we didn't do before.

The logic I suggested in my previous review
comment gets this right...

-- PMM



The problem with code you suggested is that won't start periodic count after 
one-shot tick was completed.


--
Dmitry

Re: [Qemu-devel] [PATCH v2] arm_mptimer: Fix timer shutdown

2015-07-02 Thread Dmitry Osipenko


02.07.2015 20:34, Peter Maydell пишет:

On 2 July 2015 at 18:20, Dmitry Osipenko  wrote:

Timer, running in periodic mode, can't be stopped or coming one-shot tick
won't be canceled because timer control code just doesn't handle timer
disabling. Fix it by deleting timer if enable bit isn't set.

Signed-off-by: Dmitry Osipenko 
---

v2: Avoid calling timer_del() if the timer was already disabled as per
 Peter Maydell suggestion.

  hw/timer/arm_mptimer.c | 8 +++-
  1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c
index 8b93b3c..51c18de 100644
--- a/hw/timer/arm_mptimer.c
+++ b/hw/timer/arm_mptimer.c
@@ -122,11 +122,17 @@ static void timerblock_write(void *opaque, hwaddr addr,
  case 8: /* Control.  */
  old = tb->control;
  tb->control = value;
-if (((old & 1) == 0) && (value & 1)) {
+if (((old & 1) == 0) && ((value & 1) == 0)) {
+break;
+}
+if (value & 1) {
  if (tb->count == 0 && (tb->control & 2)) {
  tb->count = tb->load;
  }
  timerblock_reload(tb, 1);
+} else {
+/* Shutdown timer.  */
+timer_del(tb->timer);



This will now cause us to do the "reload the timer"
logic if you write a 1 to the control bit when it was
already 1, which we didn't do before.

The logic I suggested in my previous review
comment gets this right...

-- PMM



Yep, you right. I overlooked it, let me try again.

--
Dmitry

Re: [Qemu-devel] [PATCH 4/5] qapi: support nested structs in OptsVisitor

2015-07-02 Thread Markus Armbruster

"Kővágó, Zoltán"  writes:

> The current OptsVisitor flattens the whole structure, if there are same
> named fields under different paths the current visitor can't cope with
> them (it'll just set the first field, leaving the others unspecified (if
> they're optional) or erroring out (if they're required).
>
> This patch add support for it, by always requiring a complete path in
> case of nested structs.  Fields in the path are separated by dots,
> similar to C structs (without pointers), like `foo.bar'.
>
> You must provide a full path even in non-ambigous cases.  The previous
> two commits hopefully ensures that this change doesn't create backward
> compatibility problems.
>
> Signed-off-by: Kővágó, Zoltán 
>
> ---
>
> Rationale for this commit: see these threads
> http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg04189.html
> http://lists.nongnu.org/archive/html/qemu-devel/2015-06/msg04186.html

So, despite your flattening work, we still need to parse option strings
into nested QAPI types?

Can you give examples of the need for such nested QAPI types in
configuration?  I may have seen them already in your audio patches, but
I think having some right here would be useful.

Re: [Qemu-devel] [PATCH v2] arm_mptimer: Fix timer shutdown

2015-07-02 Thread Peter Maydell

On 2 July 2015 at 18:20, Dmitry Osipenko  wrote:
> Timer, running in periodic mode, can't be stopped or coming one-shot tick
> won't be canceled because timer control code just doesn't handle timer
> disabling. Fix it by deleting timer if enable bit isn't set.
>
> Signed-off-by: Dmitry Osipenko 
> ---
>
> v2: Avoid calling timer_del() if the timer was already disabled as per
> Peter Maydell suggestion.
>
>  hw/timer/arm_mptimer.c | 8 +++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c
> index 8b93b3c..51c18de 100644
> --- a/hw/timer/arm_mptimer.c
> +++ b/hw/timer/arm_mptimer.c
> @@ -122,11 +122,17 @@ static void timerblock_write(void *opaque, hwaddr addr,
>  case 8: /* Control.  */
>  old = tb->control;
>  tb->control = value;
> -if (((old & 1) == 0) && (value & 1)) {
> +if (((old & 1) == 0) && ((value & 1) == 0)) {
> +break;
> +}
> +if (value & 1) {
>  if (tb->count == 0 && (tb->control & 2)) {
>  tb->count = tb->load;
>  }
>  timerblock_reload(tb, 1);
> +} else {
> +/* Shutdown timer.  */
> +timer_del(tb->timer);


This will now cause us to do the "reload the timer"
logic if you write a 1 to the control bit when it was
already 1, which we didn't do before.

The logic I suggested in my previous review
comment gets this right...

-- PMM

Re: [Qemu-devel] [PULL v2 00/27] Migration pull request

2015-07-02 Thread Peter Maydell

On 2 July 2015 at 17:09, Juan Quintela  wrote:
> Hi
>
> [v2]
> - update migration_bitmap extension to work on osX (li)
> - fix footers (dave)
>
> Please, apply.
>
> [v1]
> This series includes:
> - rdma fixes by Dave
> - rdma memory fix by gonglei
> - vmdescription for old machine types (dave)
> - fix footers for power (dave)
> - migration bitmap extensions (Li)
>   just fixed the compilation issues for linux-users
> - migration events (me)
> - optional secttions (me)
> - global  configuration (me)
>
>
> Please, Apply.
>
> The following changes since commit 5317b0f6d4bb581c5c8f88f31138ee301ad2b7e5:
>
>   Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20150702-v3' into 
> staging (2015-07-02 15:20:55 +0100)
>
> are available in the git repository at:
>
>   git://github.com/juanquintela/qemu.git tags/migration/20150702
>
> for you to fetch changes up to 4c7e7bb049cc20d53c2f99fb294626e5e5a334ec:
>
>   migration: extend migration_bitmap (2015-07-02 18:06:23 +0200)
>
> 
> migration/next for 20150702

This fails 'make check' for me (x86-64 host):

  /i386/ahci/flush/retry:  OK
  /i386/ahci/flush/migrate:**
ERROR:/home/petmay01/linaro/qemu-for-merges/tests/libqos/libqos.c:106:migrate:
assertion failed: (qdict_haskey(rsp, "return"))
FAIL
GTester: last random seed: R02S0e9ae1836b7c936e209a06fd493951b7
(pid=10860)
  /i386/ahci/migrate/sanity:   **
ERROR:/home/petmay01/linaro/qemu-for-merges/tests/libqos/libqos.c:106:migrate:
assertion failed: (qdict_haskey(rsp, "return"))
FAIL
GTester: last random seed: R02Sb0504bab18f341196cdbdc51a849b263
(pid=10875)
  /i386/ahci/migrate/dma/simple:   **
ERROR:/home/petmay01/linaro/qemu-for-merges/tests/libqos/libqos.c:106:migrate:
assertion failed: (qdict_haskey(rsp, "return"))
FAIL
GTester: last random seed: R02S35838569a3434f3215d88cbb1535b6df
(pid=10890)
  /i386/ahci/migrate/dma/halted:   **
ERROR:/home/petmay01/linaro/qemu-for-merges/tests/libqos/libqos.c:106:migrate:
assertion failed: (qdict_haskey(rsp, "return"))
FAIL
GTester: last random seed: R02S6f0580dcb31a655cf4b57ec7ea59b011
(pid=10905)
FAIL: tests/ahci-test
TEST: tests/hd-geo-test... (pid=10908)
  /i386/hd-geo/ide/none:   OK
  /i386/hd-geo/ide/drive/cd_0: OK

thanks
-- PMM

Re: [Qemu-devel] [PATCH 2/5] qapi: convert NumaOptions into a flat union

2015-07-02 Thread Markus Armbruster

"Kővágó, Zoltán"  writes:

> Signed-off-by: Kővágó, Zoltán 
> ---
>  numa.c   |  2 +-
>  qapi-schema.json | 47 ---
>  2 files changed, 37 insertions(+), 12 deletions(-)
>
> diff --git a/numa.c b/numa.c
> index 91fc6c1..8b0755d 100644
> --- a/numa.c
> +++ b/numa.c
> @@ -140,7 +140,7 @@ static int parse_numa(void *opaque, QemuOpts *opts, Error 
> **errp)
>  }
>  
>  switch (object->kind) {
> -case NUMA_OPTIONS_KIND_NODE:
> +case NUMA_DRIVER_NODE:
>  numa_node_parse(object->node, opts, &err);
>  if (err) {
>  goto error;
> diff --git a/qapi-schema.json b/qapi-schema.json
> index 106008c..7ebf99e 100644
> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -3537,17 +3537,6 @@
>'data': { '*console':'int', 'events': [ 'InputEvent' ] } }
>  
>  ##
> -# @NumaOptions
> -#
> -# A discriminated record of NUMA options. (for OptsVisitor)
> -#
> -# Since 2.1
> -##
> -{ 'union': 'NumaOptions',
> -  'data': {
> -'node': 'NumaNodeOptions' }}
> -
> -##
>  # @NumaNodeOptions
>  #
>  # Create a guest NUMA node. (for OptsVisitor)
> @@ -3574,6 +3563,42 @@
> '*memdev': 'str' }}
>  
>  ##
> +# @NumaDriver
> +#
> +# List of possible numa drivers.
> +#
> +# Since: 2.4
> +##
> +{ 'enum': 'NumaDriver',
> +  'data': [ 'node' ] }
> +
> +##
> +# @NumaCommonOptions
> +#
> +# Common set of numa options.
> +#
> +# @type: the numa driver to use
> +#
> +# Since: 2.4
> +##
> +{ 'struct': 'NumaCommonOptions',
> +  'data': {
> +'type': 'NumaDriver' } }
> +
> +##
> +# @NumaOptions
> +#
> +# A discriminated record of NUMA options. (for OptsVisitor)
> +#
> +# Since 2.1
> +##
> +{ 'union': 'NumaOptions',
> +  'base': 'NumaCommonOptions',
> +  'discriminator': 'type',
> +  'data': {
> +'node': 'NumaNodeOptions' }}
> +
> +##
>  # @HostMemPolicy
>  #
>  # Host memory policy types

Here, the notational overhead of flat unions borders on the ridiculous.

Anyway, faithful conversion.

[Qemu-devel] [PATCH v2] arm_mptimer: Fix timer shutdown

2015-07-02 Thread Dmitry Osipenko

Timer, running in periodic mode, can't be stopped or coming one-shot tick
won't be canceled because timer control code just doesn't handle timer
disabling. Fix it by deleting timer if enable bit isn't set.

Signed-off-by: Dmitry Osipenko 
---

v2: Avoid calling timer_del() if the timer was already disabled as per
Peter Maydell suggestion.

 hw/timer/arm_mptimer.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/hw/timer/arm_mptimer.c b/hw/timer/arm_mptimer.c
index 8b93b3c..51c18de 100644
--- a/hw/timer/arm_mptimer.c
+++ b/hw/timer/arm_mptimer.c
@@ -122,11 +122,17 @@ static void timerblock_write(void *opaque, hwaddr addr,
 case 8: /* Control.  */
 old = tb->control;
 tb->control = value;
-if (((old & 1) == 0) && (value & 1)) {
+if (((old & 1) == 0) && ((value & 1) == 0)) {
+break;
+}
+if (value & 1) {
 if (tb->count == 0 && (tb->control & 2)) {
 tb->count = tb->load;
 }
 timerblock_reload(tb, 1);
+} else {
+/* Shutdown timer.  */
+timer_del(tb->timer);
 }
 break;
 case 12: /* Interrupt status.  */
-- 
2.4.4

Re: [Qemu-devel] [PATCH 3/5] qapi: change Netdev and NetLegacy into a flat union

2015-07-02 Thread Markus Armbruster

"Kővágó, Zoltán"  writes:

> Signed-off-by: Kővágó, Zoltán 
[...]
> diff --git a/hw/arm/musicpal.c b/hw/arm/musicpal.c
> index a3b1314..72e2f8f 100644
> --- a/hw/arm/musicpal.c
> +++ b/hw/arm/musicpal.c
> @@ -379,7 +379,7 @@ static void eth_cleanup(NetClientState *nc)
>  }
>  
>  static NetClientInfo net_mv88w8618_info = {
> -.type = NET_CLIENT_OPTIONS_KIND_NIC,
> +.type = NET_CLIENT_DRIVER_NIC,
>  .size = sizeof(NICState),
>  .can_receive = eth_can_receive,
>  .receive = eth_receive,
[Many more renames snipped, mind-numbing, hope it's nothing but renames...]
> diff --git a/net/clients.h b/net/clients.h
> index d47530e..2aac0ee 100644
> --- a/net/clients.h
> +++ b/net/clients.h
> @@ -27,39 +27,39 @@
>  #include "net/net.h"
>  #include "qapi-types.h"
>  
> -int net_init_dump(const NetClientOptions *opts, const char *name,
> +int net_init_dump(const void *opts, const char *name,
>NetClientState *peer, Error **errp);

Why do we need to bypass the type system now?

Hmm, the reason is the conversion to flat unions loses type
NetClientOptions, member of both Netdev and NetLegacy, because the type
gets inlined into both containing types.  More below.

>  
>  #ifdef CONFIG_SLIRP
> -int net_init_slirp(const NetClientOptions *opts, const char *name,
> +int net_init_slirp(const void *opts, const char *name,
> NetClientState *peer, Error **errp);
>  #endif
>  
> -int net_init_hubport(const NetClientOptions *opts, const char *name,
> +int net_init_hubport(const void *opts, const char *name,
>   NetClientState *peer, Error **errp);
>  
> -int net_init_socket(const NetClientOptions *opts, const char *name,
> +int net_init_socket(const void *opts, const char *name,
>  NetClientState *peer, Error **errp);
>  
> -int net_init_tap(const NetClientOptions *opts, const char *name,
> +int net_init_tap(const void *opts, const char *name,
>   NetClientState *peer, Error **errp);
>  
> -int net_init_bridge(const NetClientOptions *opts, const char *name,
> +int net_init_bridge(const void *opts, const char *name,
>  NetClientState *peer, Error **errp);
>  
> -int net_init_l2tpv3(const NetClientOptions *opts, const char *name,
> +int net_init_l2tpv3(const void *opts, const char *name,
>  NetClientState *peer, Error **errp);
>  #ifdef CONFIG_VDE
> -int net_init_vde(const NetClientOptions *opts, const char *name,
> +int net_init_vde(const void *opts, const char *name,
>   NetClientState *peer, Error **errp);
>  #endif
>  
>  #ifdef CONFIG_NETMAP
> -int net_init_netmap(const NetClientOptions *opts, const char *name,
> +int net_init_netmap(const void *opts, const char *name,
>  NetClientState *peer, Error **errp);
>  #endif
>  
> -int net_init_vhost_user(const NetClientOptions *opts, const char *name,
> +int net_init_vhost_user(const void *opts, const char *name,
>  NetClientState *peer, Error **errp);
>  
>  #endif /* QEMU_NET_CLIENTS_H */
> diff --git a/net/dump.c b/net/dump.c
> index 02c8064..6aca19d 100644
> --- a/net/dump.c
> +++ b/net/dump.c
> @@ -94,7 +94,7 @@ static void dump_cleanup(NetClientState *nc)
>  }
>  
>  static NetClientInfo net_dump_info = {
> -.type = NET_CLIENT_OPTIONS_KIND_DUMP,
> +.type = NET_CLIENT_DRIVER_DUMP,
>  .size = sizeof(DumpState),
>  .receive = dump_receive,
>  .cleanup = dump_cleanup,
> @@ -146,16 +146,13 @@ static int net_dump_init(NetClientState *peer, const 
> char *device,
>  return 0;
>  }
>  
> -int net_init_dump(const NetClientOptions *opts, const char *name,
> +int net_init_dump(const void *opts, const char *name,
>NetClientState *peer, Error **errp)
>  {
>  int len;
>  const char *file;
>  char def_file[128];
> -const NetdevDumpOptions *dump;
> -
> -assert(opts->kind == NET_CLIENT_OPTIONS_KIND_DUMP);
> -dump = opts->dump;
> +const NetdevDumpOptions *dump = opts;
>  
>  assert(peer);
>  

Before the patch: we pass tagged union NetClientOptions to the tag's
callback, and fetch the tag's union member.

After: we pass the union member directly.  Because all callbacks have to
have the same type, we need one that can hold all the union members.
Since they're all pointers, void * works.

A union of all members would do as well.  The QAPI code generator
actually generates it, but gives it no name.  Fixable.

Further down, I challenge having separate types Netdev and Netlegacy.
If we used the same one for both, we could pass it to the callbacks, I
think.

[More of the same...]
> diff --git a/net/net.c b/net/net.c
> index cc36c7b..5f230a5 100644
> --- a/net/net.c
> +++ b/net/net.c
[...]
> @@ -878,32 +875,32 @@ static int net_init_nic(const NetClientOptions *opts, 
> const char *name,
>  }
>  
>  
> -static int (* const net_client_init_fun[NET_CLIENT_OPTIONS_KIND_MAX])(
> -const NetClientOptions *opts,
> +static in

[Qemu-devel] [PATCH v4 0/5] migration: Dynamic cpu throttling for auto-converge

2015-07-02 Thread Jason J. Herne

This patch set provides a new method for throttling a vcpu and makes use of said
method to dynamically increase cpu throttling during an autoconverge
migration until the migration completes.

The method used here for throttling vcpus is likely not the best. However, I
believe that it is preferable to what is used for autoconverge today.

This work is related to the following discussion:
https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg00287.html

Changelog
---
v4
- Switch to a static timer
- Use QEMU_CLOCK_VIRTUAL_RT instead of QEMU_CLOCK_REALTIME
- Get rid of throttle_timer_stop, use throttle_percentage = 0 instead
- Calculate throttle_ratio directly in cpu_throttle_thread
- Consolidate timer mod operations to single function
- Remove some unneeded cpu_throttle_active() checks
- Check for throttle_percentage == 0 in cpu_throttle_thread
- Change throttle_percentage to unsigned int
- Renamed cpu_throttle_timer_pop to cpu_throttle_timer_tick
v3
- Cpu throttling parameters moved from CPUState to global scope
- Throttling interface now uses a percentage instead of ratio
- Migration parameters added to allow tweaking of how aggressive throttling is
- Add throttle active check to the throttle stop routine.
- Remove asserts from throttle start/stop functions and instead force the input
  to fall within the acceptable range
- Rename cpu_throttle_start to cpu_throttle_set to better describe its purpose
- Remove unneeded object casts
- Fixed monitor output formatting
- Comment cleanups
v2
- Add cpu throttle ratio output to hmp/qmp info/query migrate commands
v1
- Initial

Jason J. Herne (5):
  cpu: Provide vcpu throttling interface
  migration: Parameters for auto-converge cpu throttling
  migration: Dynamic cpu throttling for auto-converge
  qmp/hmp: Add throttle ratio to query-migrate and info migrate
  migration: Disambiguate MAX_THROTTLE

 arch_init.c   | 88 ++-
 cpus.c| 66 ++
 hmp.c | 21 
 include/qom/cpu.h | 38 ++
 migration/migration.c | 57 +++--
 qapi-schema.json  | 40 ---
 6 files changed, 246 insertions(+), 64 deletions(-)

-- 
1.9.1

Re: [Qemu-devel] [PATCH v2] thread-win32: fix GetThreadContext() permanently fails

2015-07-02 Thread Fabien Chouteau

On 07/01/2015 08:00 PM, Stefan Weil wrote:
> Am 01.07.2015 um 18:49 schrieb Paolo Bonzini:
>>
>> On 01/07/2015 17:48, Zavadovsky Yan wrote:
>>> Ping.
>> Stefan, are you merging this?
>>
>> Paolo
> 
> I can do so, but as the current code seems to fix the problems
> with multi-processor systems, too (even if it is unclear why),
> it does not look urgent.
> 
> Fabien, you suggested "extensive tests". Do you think that
> patch v2 is fine, or are you still waiting for test results?
> 

The patch looks good. I won't be able to do heavy testing anytime soon,
if Yan tells us that the new code was tested I will take his word for it.

Thanks,

Re: [Qemu-devel] [PATCH v4 1/5] cpu: Provide vcpu throttling interface

2015-07-02 Thread Dr. David Alan Gilbert

* Paolo Bonzini (pbonz...@redhat.com) wrote:
> 
> 
> On 02/07/2015 18:36, Jason J. Herne wrote:
> > +static void cpu_throttle_thread(void *opaque)
> > +{
> > +double pct = (double)throttle_percentage/100;
> > +double throttle_ratio = pct / (1 - pct);
> > +long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
> > +
> > +if (!throttle_percentage) {
> > +return;
> > +}
> > +
> > +qemu_mutex_unlock_iothread();
> > +g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
> > +qemu_mutex_lock_iothread();
> > +}
> > +
> > +static void cpu_throttle_timer_tick(void *opaque)
> > +{
> > +CPUState *cpu;
> > +
> > +/* Stop the timer if needed */
> > +if (!throttle_percentage) {
> > +return;
> > +}
> > +CPU_FOREACH(cpu) {
> > +async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
> > +}
> > +
> > +timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
> > +   CPU_THROTTLE_TIMESLICE);
> > +}
> 
> This could cause callbacks to pile up I think.  David, do you have any
> idea how to fix it?

I don't know the timer code well enough.

Dave
> 
> Paolo
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] [PATCH v4 1/5] cpu: Provide vcpu throttling interface

2015-07-02 Thread Andreas Färber

Am 02.07.2015 um 18:36 schrieb Jason J. Herne:
> Provide a method to throttle guest cpu execution. CPUState is augmented with
> timeout controls and throttle start/stop functions. To throttle the guest cpu
> the caller simply has to call the throttle set function and provide a 
> percentage
> of throttle time.
> 
> Signed-off-by: Jason J. Herne 
> Reviewed-by: Matthew Rosato 
> ---
>  cpus.c| 66 
> +++
>  include/qom/cpu.h | 38 
>  2 files changed, 104 insertions(+)

No objections from my side, but the interesting code is outside my area.

I feel we (including myself) are abusing include/qom/cpu.h (here there's
not even a single CPUState argument) but I don't have a better
suggestion. At some point we'll need to revisit the cpu.h vs. cpu-all.h
etc. split or even introduce something new.

I'm preparing a qom-cpu pull and assume this will go through the
migration tree when finalized.

Regards,
Andreas

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)

Re: [Qemu-devel] [PATCH] virtio-net: Drop net_virtio_info.can_receive

2015-07-02 Thread Michael S. Tsirkin

On Thu, Jul 02, 2015 at 01:46:26PM +0100, Stefan Hajnoczi wrote:
> On Tue, Jun 30, 2015 at 04:35:24PM +0800, Jason Wang wrote:
> > On 06/30/2015 11:06 AM, Fam Zheng wrote:
> > > virtio_net_receive still does the check by calling
> > > virtio_net_can_receive, if the device or driver is not ready, the packet
> > > is dropped.
> > >
> > > This is necessary because returning false from can_receive complicates
> > > things: the peer would disable sending until we explicitly flush the
> > > queue.
> > >
> > > Signed-off-by: Fam Zheng 
> > > ---
> > >  hw/net/virtio-net.c | 1 -
> > >  1 file changed, 1 deletion(-)
> > >
> > > diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> > > index d728233..dbef0d0 100644
> > > --- a/hw/net/virtio-net.c
> > > +++ b/hw/net/virtio-net.c
> > > @@ -1503,7 +1503,6 @@ static int virtio_net_load_device(VirtIODevice 
> > > *vdev, QEMUFile *f,
> > >  static NetClientInfo net_virtio_info = {
> > >  .type = NET_CLIENT_OPTIONS_KIND_NIC,
> > >  .size = sizeof(NICState),
> > > -.can_receive = virtio_net_can_receive,
> > >  .receive = virtio_net_receive,
> > >  .link_status_changed = virtio_net_set_link_status,
> > >  .query_rx_filter = virtio_net_query_rxfilter,
> > 
> > A side effect of this patch is it will read and then drop packet is
> > guest driver is no ok.
> 
> I think that the semantics of .can_receive() and .receive() return
> values are currently incorrect in many NICs.  They have .can_receive()
> functions that return false for conditions where .receive() would
> discard the packet.  So what happens is that packets get queued when
> they should actually be discarded.
> 
> The purpose of the flow control (queuing) mechanism is to tell the
> sender to hold off until the receiver has more rx buffers available.
> It's a short-term thing that doesn't included link down, rx disable, or
> NIC reset states.
> 
> Therefore, I think this patch will not introduce a regression.  It is
> adjusting the code to stop queuing packets when they should actually be
> dropped.
> 
> Thoughts?
> 
> Reviewed-by: Stefan Hajnoczi 

OK, but we do have transient out of buffers states too, and
virtio_net_can_receive checks these as well.

To me, it looks like we should change virtio_net_can_receive to
avoid checking link down state, etc.
But I don't see why we should remove it completely.


-- 
MST

Re: [Qemu-devel] [PATCH v4 1/5] cpu: Provide vcpu throttling interface

2015-07-02 Thread Paolo Bonzini



On 02/07/2015 18:36, Jason J. Herne wrote:
> +static void cpu_throttle_thread(void *opaque)
> +{
> +double pct = (double)throttle_percentage/100;
> +double throttle_ratio = pct / (1 - pct);
> +long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
> +
> +if (!throttle_percentage) {
> +return;
> +}
> +
> +qemu_mutex_unlock_iothread();
> +g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
> +qemu_mutex_lock_iothread();
> +}
> +
> +static void cpu_throttle_timer_tick(void *opaque)
> +{
> +CPUState *cpu;
> +
> +/* Stop the timer if needed */
> +if (!throttle_percentage) {
> +return;
> +}
> +CPU_FOREACH(cpu) {
> +async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
> +}
> +
> +timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
> +   CPU_THROTTLE_TIMESLICE);
> +}

This could cause callbacks to pile up I think.  David, do you have any
idea how to fix it?

Paolo

[Qemu-devel] [PATCH v4 4/5] qmp/hmp: Add throttle ratio to query-migrate and info migrate

2015-07-02 Thread Jason J. Herne

Report throttle percentage in info migrate and query-migrate responses when
cpu throttling is active.

Signed-off-by: Jason J. Herne 
Reviewed-by: Dr. David Alan Gilbert 
---
 hmp.c | 5 +
 migration/migration.c | 5 +
 qapi-schema.json  | 7 ++-
 3 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/hmp.c b/hmp.c
index eb65998..36bc76e 100644
--- a/hmp.c
+++ b/hmp.c
@@ -229,6 +229,11 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
info->xbzrle_cache->overflow);
 }
 
+if (info->has_x_cpu_throttle_percentage) {
+monitor_printf(mon, "cpu throttle percentage: %" PRIu64 "\n",
+   info->x_cpu_throttle_percentage);
+}
+
 qapi_free_MigrationInfo(info);
 qapi_free_MigrationCapabilityStatusList(caps);
 }
diff --git a/migration/migration.c b/migration/migration.c
index 7708c54..b29450a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -274,6 +274,11 @@ MigrationInfo *qmp_query_migrate(Error **errp)
 info->disk->total = blk_mig_bytes_total();
 }
 
+if (cpu_throttle_active()) {
+info->has_x_cpu_throttle_percentage = true;
+info->x_cpu_throttle_percentage = cpu_throttle_get_percentage();
+}
+
 get_xbzrle_cache_stats(info);
 break;
 case MIGRATION_STATUS_COMPLETED:
diff --git a/qapi-schema.json b/qapi-schema.json
index 52b7e07..c2bd8ce 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -474,6 +474,10 @@
 #may be expensive, but do not actually occur during the iterative
 #migration rounds themselves. (since 1.6)
 #
+# @x-cpu-throttle-percentage: #optional percentage of time guest cpus are being
+#   throttled during auto-converge. This is only present when auto-converge
+#   has started throttling guest cpus. (Since 2.4)
+#
 # Since: 0.14.0
 ##
 { 'struct': 'MigrationInfo',
@@ -483,7 +487,8 @@
'*total-time': 'int',
'*expected-downtime': 'int',
'*downtime': 'int',
-   '*setup-time': 'int'} }
+   '*setup-time': 'int',
+   '*x-cpu-throttle-percentage': 'int'} }
 
 ##
 # @query-migrate
-- 
1.9.1

[Qemu-devel] [PATCH v4 0/5] migration: Dynamic cpu throttling for auto-converge

2015-07-02 Thread Jason J. Herne

This patch set provides a new method for throttling a vcpu and makes use of said
method to dynamically increase cpu throttling during an autoconverge
migration until the migration completes.

The method used here for throttling vcpus is likely not the best. However, I
believe that it is preferable to what is used for autoconverge today.

This work is related to the following discussion:
https://lists.gnu.org/archive/html/qemu-devel/2015-03/msg00287.html

Changelog
---
v4
- Switch to a static timer
- Use QEMU_CLOCK_VIRTUAL_RT instead of QEMU_CLOCK_REALTIME
- Get rid of throttle_timer_stop, use throttle_percentage = 0 instead
- Calculate throttle_ratio directly in cpu_throttle_thread
- Consolidate timer mod operations to single function
- Remove some unneeded cpu_throttle_active() checks
- Check for throttle_percentage == 0 in cpu_throttle_thread
- Change throttle_percentage to unsigned int
- Renamed cpu_throttle_timer_pop to cpu_throttle_timer_tick
v3
- Cpu throttling parameters moved from CPUState to global scope
- Throttling interface now uses a percentage instead of ratio
- Migration parameters added to allow tweaking of how aggressive throttling is
- Add throttle active check to the throttle stop routine.
- Remove asserts from throttle start/stop functions and instead force the input
  to fall within the acceptable range
- Rename cpu_throttle_start to cpu_throttle_set to better describe its purpose
- Remove unneeded object casts
- Fixed monitor output formatting
- Comment cleanups
v2
- Add cpu throttle ratio output to hmp/qmp info/query migrate commands
v1
- Initial

Jason J. Herne (5):
  cpu: Provide vcpu throttling interface
  migration: Parameters for auto-converge cpu throttling
  migration: Dynamic cpu throttling for auto-converge
  qmp/hmp: Add throttle ratio to query-migrate and info migrate
  migration: Disambiguate MAX_THROTTLE

 arch_init.c   | 88 ++-
 cpus.c| 66 ++
 hmp.c | 21 
 include/qom/cpu.h | 38 ++
 migration/migration.c | 57 +++--
 qapi-schema.json  | 40 ---
 6 files changed, 246 insertions(+), 64 deletions(-)

-- 
1.9.1

[Qemu-devel] [PATCH v4 2/5] migration: Parameters for auto-converge cpu throttling

2015-07-02 Thread Jason J. Herne

Add migration parameters to allow the user to adjust the parameters
that control cpu throttling when auto-converge is in effect. The added
parameters are as follows:

x-cpu-throttle-initial : Initial percantage of time guest cpus are throttled
when migration auto-converge is activated.

x-cpu-throttle-increment: throttle percantage increase each time
auto-converge detects that migration is not making progress.

Signed-off-by: Jason J. Herne 
Reviewed-by: Dr. David Alan Gilbert 
---
 hmp.c | 16 
 migration/migration.c | 46 +-
 qapi-schema.json  | 33 ++---
 3 files changed, 91 insertions(+), 4 deletions(-)

diff --git a/hmp.c b/hmp.c
index e17852d..eb65998 100644
--- a/hmp.c
+++ b/hmp.c
@@ -269,6 +269,12 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict 
*qdict)
 monitor_printf(mon, " %s: %" PRId64,
 MigrationParameter_lookup[MIGRATION_PARAMETER_DECOMPRESS_THREADS],
 params->decompress_threads);
+monitor_printf(mon, " %s: %" PRId64,
+
MigrationParameter_lookup[MIGRATION_PARAMETER_X_CPU_THROTTLE_INITIAL],
+params->x_cpu_throttle_initial);
+monitor_printf(mon, " %s: %" PRId64,
+
MigrationParameter_lookup[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT],
+params->x_cpu_throttle_increment);
 monitor_printf(mon, "\n");
 }
 
@@ -1216,6 +1222,8 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict 
*qdict)
 bool has_compress_level = false;
 bool has_compress_threads = false;
 bool has_decompress_threads = false;
+bool has_x_cpu_throttle_initial = false;
+bool has_x_cpu_throttle_increment = false;
 int i;
 
 for (i = 0; i < MIGRATION_PARAMETER_MAX; i++) {
@@ -1230,10 +1238,18 @@ void hmp_migrate_set_parameter(Monitor *mon, const 
QDict *qdict)
 case MIGRATION_PARAMETER_DECOMPRESS_THREADS:
 has_decompress_threads = true;
 break;
+case MIGRATION_PARAMETER_X_CPU_THROTTLE_INITIAL:
+has_x_cpu_throttle_initial = true;
+break;
+case MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT:
+has_x_cpu_throttle_increment = true;
+break;
 }
 qmp_migrate_set_parameters(has_compress_level, value,
has_compress_threads, value,
has_decompress_threads, value,
+   has_x_cpu_throttle_initial, value,
+   has_x_cpu_throttle_increment, value,
&err);
 break;
 }
diff --git a/migration/migration.c b/migration/migration.c
index 732d229..05790e9 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -40,6 +40,9 @@
 #define DEFAULT_MIGRATE_DECOMPRESS_THREAD_COUNT 2
 /*0: means nocompress, 1: best speed, ... 9: best compress ratio */
 #define DEFAULT_MIGRATE_COMPRESS_LEVEL 1
+/* Define default autoconverge cpu throttle migration parameters */
+#define DEFAULT_MIGRATE_X_CPU_THROTTLE_INITIAL 20
+#define DEFAULT_MIGRATE_X_CPU_THROTTLE_INCREMENT 10
 
 /* Migration XBZRLE default cache size */
 #define DEFAULT_MIGRATE_CACHE_SIZE (64 * 1024 * 1024)
@@ -66,6 +69,10 @@ MigrationState *migrate_get_current(void)
 DEFAULT_MIGRATE_COMPRESS_THREAD_COUNT,
 .parameters[MIGRATION_PARAMETER_DECOMPRESS_THREADS] =
 DEFAULT_MIGRATE_DECOMPRESS_THREAD_COUNT,
+.parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INITIAL] =
+DEFAULT_MIGRATE_X_CPU_THROTTLE_INITIAL,
+.parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT] =
+DEFAULT_MIGRATE_X_CPU_THROTTLE_INCREMENT,
 };
 
 return ¤t_migration;
@@ -199,6 +206,10 @@ MigrationParameters *qmp_query_migrate_parameters(Error 
**errp)
 s->parameters[MIGRATION_PARAMETER_COMPRESS_THREADS];
 params->decompress_threads =
 s->parameters[MIGRATION_PARAMETER_DECOMPRESS_THREADS];
+params->x_cpu_throttle_initial =
+s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INITIAL];
+params->x_cpu_throttle_increment =
+s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT];
 
 return params;
 }
@@ -321,7 +332,11 @@ void qmp_migrate_set_parameters(bool has_compress_level,
 bool has_compress_threads,
 int64_t compress_threads,
 bool has_decompress_threads,
-int64_t decompress_threads, Error **errp)
+int64_t decompress_threads,
+bool has_x_cpu_throttle_initial,
+int64_t x_cpu_throttle_initial,
+bool has_x_cpu_throttle_increm

[Qemu-devel] [PATCH v4 5/5] migration: Disambiguate MAX_THROTTLE

2015-07-02 Thread Jason J. Herne

Migration has a define for MAX_THROTTLE. Update comment to clarify that this is
used for throttling transfer speed. Hopefully this will prevent it from being
confused with a guest cpu throttling entity.

Signed-off-by: Jason J. Herne 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/migration.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index b29450a..b9faeb0 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -27,7 +27,7 @@
 #include "trace.h"
 #include "qom/cpu.h"
 
-#define MAX_THROTTLE  (32 << 20)  /* Migration speed throttling */
+#define MAX_THROTTLE  (32 << 20)  /* Migration transfer speed throttling */
 
 /* Amount of time to allocate to each "chunk" of bandwidth-throttled
  * data. */
-- 
1.9.1

Re: [Qemu-devel] [PATCH v3 1/5] cpu: Provide vcpu throttling interface

2015-07-02 Thread Paolo Bonzini



On 02/07/2015 18:33, Jason J. Herne wrote:
> I've made all of the changes you have suggested except adding atomics.  I'm 
> having
> a bit of trouble figuring out what is needed here. Perhaps I should be using
> atomic_read() to read throttle_percentage? If so, I don't undertand why. 
> Rather
> than trying to explain everything here I'm going to submit my V4 patches.
> If any atomic operations are still necessary with V4 please let me know.

Fair enough!

Paolo

[Qemu-devel] [PATCH v4 1/5] cpu: Provide vcpu throttling interface

2015-07-02 Thread Jason J. Herne

Provide a method to throttle guest cpu execution. CPUState is augmented with
timeout controls and throttle start/stop functions. To throttle the guest cpu
the caller simply has to call the throttle set function and provide a percentage
of throttle time.

Signed-off-by: Jason J. Herne 
Reviewed-by: Matthew Rosato 
---
 cpus.c| 66 +++
 include/qom/cpu.h | 38 
 2 files changed, 104 insertions(+)

diff --git a/cpus.c b/cpus.c
index de6469f..6f86da0 100644
--- a/cpus.c
+++ b/cpus.c
@@ -68,6 +68,14 @@ static CPUState *next_cpu;
 int64_t max_delay;
 int64_t max_advance;
 
+/* vcpu throttling controls */
+static QEMUTimer *throttle_timer;
+static unsigned int throttle_percentage;
+
+#define CPU_THROTTLE_PCT_MIN 1
+#define CPU_THROTTLE_PCT_MAX 99
+#define CPU_THROTTLE_TIMESLICE 10
+
 bool cpu_is_stopped(CPUState *cpu)
 {
 return cpu->stopped || !runstate_is_running();
@@ -486,10 +494,68 @@ static const VMStateDescription vmstate_timers = {
 }
 };
 
+static void cpu_throttle_thread(void *opaque)
+{
+double pct = (double)throttle_percentage/100;
+double throttle_ratio = pct / (1 - pct);
+long sleeptime_ms = (long)(throttle_ratio * CPU_THROTTLE_TIMESLICE);
+
+if (!throttle_percentage) {
+return;
+}
+
+qemu_mutex_unlock_iothread();
+g_usleep(sleeptime_ms * 1000); /* Convert ms to us for usleep call */
+qemu_mutex_lock_iothread();
+}
+
+static void cpu_throttle_timer_tick(void *opaque)
+{
+CPUState *cpu;
+
+/* Stop the timer if needed */
+if (!throttle_percentage) {
+return;
+}
+CPU_FOREACH(cpu) {
+async_run_on_cpu(cpu, cpu_throttle_thread, NULL);
+}
+
+timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
+   CPU_THROTTLE_TIMESLICE);
+}
+
+void cpu_throttle_set(int new_throttle_pct)
+{
+/* Ensure throttle percentage is within valid range */
+new_throttle_pct = MIN(new_throttle_pct, CPU_THROTTLE_PCT_MAX);
+throttle_percentage = MAX(new_throttle_pct, CPU_THROTTLE_PCT_MIN);
+
+timer_mod(throttle_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL_RT) +
+   CPU_THROTTLE_TIMESLICE);
+}
+
+void cpu_throttle_stop(void)
+{
+throttle_percentage = 0;
+}
+
+bool cpu_throttle_active(void)
+{
+return (throttle_percentage != 0);
+}
+
+int cpu_throttle_get_percentage(void)
+{
+return throttle_percentage;
+}
+
 void cpu_ticks_init(void)
 {
 seqlock_init(&timers_state.vm_clock_seqlock, NULL);
 vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
+throttle_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL_RT,
+   cpu_throttle_timer_tick, NULL);
 }
 
 void configure_icount(QemuOpts *opts, Error **errp)
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index 39f0f19..56eb964 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -553,6 +553,44 @@ CPUState *qemu_get_cpu(int index);
  */
 bool cpu_exists(int64_t id);
 
+/**
+ * cpu_throttle_set:
+ * @new_throttle_pct: Percent of sleep time to running time.
+ *Valid range is 1 to 99.
+ *
+ * Throttles all vcpus by forcing them to sleep for the given percentage of
+ * time. A throttle_percentage of 50 corresponds to a 50% duty cycle roughly.
+ * (example: 10ms sleep for every 10ms awake).
+ *
+ * cpu_throttle_set can be called as needed to adjust new_throttle_pct.
+ * Once the throttling starts, it will remain in effect until cpu_throttle_stop
+ * is called.
+ */
+void cpu_throttle_set(int new_throttle_pct);
+
+/**
+ * cpu_throttle_stop:
+ *
+ * Stops the vcpu throttling started by cpu_throttle_set.
+ */
+void cpu_throttle_stop(void);
+
+/**
+ * cpu_throttle_active:
+ *
+ * Returns %true if the vcpus are currently being throttled, %false otherwise.
+ */
+bool cpu_throttle_active(void);
+
+/**
+ * cpu_throttle_get_percentage:
+ *
+ * Returns the vcpu throttle percentage. See cpu_throttle_set for details.
+ *
+ * Returns The throttle percentage in range 1 to 99.
+ */
+int cpu_throttle_get_percentage(void);
+
 #ifndef CONFIG_USER_ONLY
 
 typedef void (*CPUInterruptHandler)(CPUState *, int);
-- 
1.9.1

[Qemu-devel] [PATCH v4 3/5] migration: Dynamic cpu throttling for auto-converge

2015-07-02 Thread Jason J. Herne

Remove traditional auto-converge static 30ms throttling code and replace it
with a dynamic throttling algorithm.

Additionally, be more aggressive when deciding when to start throttling.
Previously we waited until four unproductive memory passes. Now we begin
throttling after only two unproductive memory passes. Four seemed quite
arbitrary and only waiting for two passes allows us to complete the migration
faster.

Signed-off-by: Jason J. Herne 
Reviewed-by: Matthew Rosato 
---
 arch_init.c   | 88 ++-
 migration/migration.c |  4 +++
 2 files changed, 34 insertions(+), 58 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 23d3feb..35f8eb0 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -111,9 +111,7 @@ int graphic_depth = 32;
 #endif
 
 const uint32_t arch_type = QEMU_ARCH;
-static bool mig_throttle_on;
 static int dirty_rate_high_cnt;
-static void check_guest_throttling(void);
 
 static uint64_t bitmap_sync_count;
 
@@ -487,6 +485,29 @@ static size_t save_page_header(QEMUFile *f, RAMBlock 
*block, ram_addr_t offset)
 return size;
 }
 
+/* Reduce amount of guest cpu execution to hopefully slow down memory writes.
+ * If guest dirty memory rate is reduced below the rate at which we can
+ * transfer pages to the destination then we should be able to complete
+ * migration. Some workloads dirty memory way too fast and will not effectively
+ * converge, even with auto-converge.
+ */
+static void mig_throttle_guest_down(void)
+{
+MigrationState *s = migrate_get_current();
+uint64_t pct_initial =
+s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INITIAL];
+uint64_t pct_icrement =
+s->parameters[MIGRATION_PARAMETER_X_CPU_THROTTLE_INCREMENT];
+
+/* We have not started throttling yet. Let's start it. */
+if (!cpu_throttle_active()) {
+cpu_throttle_set(pct_initial);
+} else {
+/* Throttling already on, just increase the rate */
+cpu_throttle_set(cpu_throttle_get_percentage() + pct_icrement);
+}
+}
+
 /* Update the xbzrle cache to reflect a page that's been sent as all 0.
  * The important thing is that a stale (not-yet-0'd) page be replaced
  * by the new data.
@@ -714,21 +735,21 @@ static void migration_bitmap_sync(void)
 /* The following detection logic can be refined later. For now:
Check to see if the dirtied bytes is 50% more than the approx.
amount of bytes that just got transferred since the last time we
-   were in this routine. If that happens >N times (for now N==4)
-   we turn on the throttle down logic */
+   were in this routine. If that happens twice, start or increase
+   throttling */
 bytes_xfer_now = ram_bytes_transferred();
+
 if (s->dirty_pages_rate &&
(num_dirty_pages_period * TARGET_PAGE_SIZE >
(bytes_xfer_now - bytes_xfer_prev)/2) &&
-   (dirty_rate_high_cnt++ > 4)) {
+   (dirty_rate_high_cnt++ >= 2)) {
 trace_migration_throttle();
-mig_throttle_on = true;
 dirty_rate_high_cnt = 0;
+mig_throttle_guest_down();
  }
  bytes_xfer_prev = bytes_xfer_now;
-} else {
- mig_throttle_on = false;
 }
+
 if (migrate_use_xbzrle()) {
 if (iterations_prev != acct_info.iterations) {
 acct_info.xbzrle_cache_miss_rate =
@@ -1197,7 +1218,6 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 RAMBlock *block;
 int64_t ram_bitmap_pages; /* Size of bitmap in pages, including gaps */
 
-mig_throttle_on = false;
 dirty_rate_high_cnt = 0;
 bitmap_sync_count = 0;
 migration_bitmap_sync_init();
@@ -1301,7 +1321,7 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
 }
 pages_sent += pages;
 acct_info.iterations++;
-check_guest_throttling();
+
 /* we want to check in the 1st loop, just in case it was the 1st time
and we had to sync the dirty bitmap.
qemu_get_clock_ns() is a bit expensive, so we only check each some
@@ -1913,51 +1933,3 @@ TargetInfo *qmp_query_target(Error **errp)
 return info;
 }
 
-/* Stub function that's gets run on the vcpu when its brought out of the
-   VM to run inside qemu via async_run_on_cpu()*/
-static void mig_sleep_cpu(void *opq)
-{
-qemu_mutex_unlock_iothread();
-g_usleep(30*1000);
-qemu_mutex_lock_iothread();
-}
-
-/* To reduce the dirty rate explicitly disallow the VCPUs from spending
-   much time in the VM. The migration thread will try to catchup.
-   Workload will experience a performance drop.
-*/
-static void mig_throttle_guest_down(void)
-{
-CPUState *cpu;
-
-qemu_mutex_lock_iothread();
-CPU_FOREACH(cpu) {
-async_run_on_cpu(cpu, mig_sleep_cpu, NULL);
-}
-qemu_mutex_unloc

Re: [Qemu-devel] [PATCH v3 1/5] cpu: Provide vcpu throttling interface

2015-07-02 Thread Jason J. Herne


On 07/01/2015 10:03 AM, Paolo Bonzini wrote:



On 26/06/2015 20:07, Dr. David Alan Gilbert wrote:

* Jason J. Herne (jjhe...@linux.vnet.ibm.com) wrote:

Provide a method to throttle guest cpu execution. CPUState is augmented with
timeout controls and throttle start/stop functions. To throttle the guest cpu
the caller simply has to call the throttle set function and provide a percentage
of throttle time.


I'm worried about atomicity and threads and all those fun things.

I think all the starting/stopping/setting the throttling level is done in the
migration thread; I think the timers run in the main/io thread?
So you really need to be careful with at least:
 throttle_timer_stop - which may have a minor effect
 throttle_timer  - I worry about the way cpu_timer_active checks the pointer
   yet it's freed when the timer goes off.   It's probably
   not too bad because it never dereferences it.


Agreed.  I think the only atomic should be throttle_percentage; if zero,
throttling is inactive.

In particular, throttle_ratio can be computed in cpu_throttle_thread.

If you have exactly one variable that is shared between the threads,
everything is much simpler.

There is no need to allocate and free the timer; it's very cheap and in
fact we probably should convert to statically allocated timers sooner or
later.  So you can just create it once, for example in cpu_ticks_init.



I've made all of the changes you have suggested except adding atomics. 
I'm having

a bit of trouble figuring out what is needed here. Perhaps I should be using
atomic_read() to read throttle_percentage? If so, I don't undertand why. 
Rather

than trying to explain everything here I'm going to submit my V4 patches.
If any atomic operations are still necessary with V4 please let me know.

--
-- Jason J. Herne (jjhe...@linux.vnet.ibm.com)

[Qemu-devel] [PATCH 2/3] monitor: remove target-specific code from monitor.c

2015-07-02 Thread Denis V. Lunev

From: Pavel Butsykin 

Move target-specific code out of /monitor.c to /target-*/monitor.c,
this will avoid code cluttering and using random ifdeffery.  The solution
is quite simple, but solves the issue of the separation of target-specific
code from monitor

Signed-off-by: Pavel Butsykin 
Signed-off-by: Denis V. Lunev 
CC: Luiz Capitulino 
CC: Paolo Bonzini 
CC: Peter Maydell 
---
 include/monitor/monitor-common.h |  43 ++
 monitor.c| 854 +--
 target-i386/Makefile.objs|   2 +-
 target-i386/monitor.c| 489 ++
 target-ppc/Makefile.objs |   2 +-
 target-ppc/monitor.c | 250 
 target-sh4/Makefile.objs |   1 +
 target-sh4/monitor.c |  52 +++
 target-sparc/Makefile.objs   |   2 +-
 target-sparc/monitor.c   | 153 +++
 target-xtensa/Makefile.objs  |   1 +
 target-xtensa/monitor.c  |  34 ++
 12 files changed, 1032 insertions(+), 851 deletions(-)
 create mode 100644 include/monitor/monitor-common.h
 create mode 100644 target-i386/monitor.c
 create mode 100644 target-ppc/monitor.c
 create mode 100644 target-sh4/monitor.c
 create mode 100644 target-sparc/monitor.c
 create mode 100644 target-xtensa/monitor.c

diff --git a/include/monitor/monitor-common.h b/include/monitor/monitor-common.h
new file mode 100644
index 000..e1d79b9
--- /dev/null
+++ b/include/monitor/monitor-common.h
@@ -0,0 +1,43 @@
+/*
+ * QEMU monitor
+ *
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#ifndef MONITOR_COMMON_H
+#define MONITOR_COMMON_H
+
+#define MD_TLONG 0
+#define MD_I32   1
+
+typedef struct MonitorDef {
+const char *name;
+int offset;
+target_long (*get_value)(const struct MonitorDef *md, int val);
+int type;
+} MonitorDef;
+
+CPUArchState *mon_get_cpu_env(void);
+
+void hmp_info_mem(Monitor *mon, const QDict *qdict);
+void hmp_info_tlb(Monitor *mon, const QDict *qdict);
+void hmp_mce(Monitor *mon, const QDict *qdict);
+
+#endif /* MONITOR_COMMON */
diff --git a/monitor.c b/monitor.c
index 89c7a52..de00d84 100644
--- a/monitor.c
+++ b/monitor.c
@@ -63,6 +63,7 @@
 #include "cpu.h"
 #include "trace.h"
 #include "trace/control.h"
+#include "monitor/monitor-common.h"
 #ifdef CONFIG_TRACE_SIMPLE
 #include "trace/simple.h"
 #endif
@@ -946,7 +947,7 @@ static CPUState *mon_get_cpu(void)
 return cur_mon->mon_cpu;
 }
 
-static CPUArchState *mon_get_cpu_env(void)
+CPUArchState *mon_get_cpu_env(void)
 {
 return mon_get_cpu()->env_ptr;
 }
@@ -1420,442 +1421,6 @@ static void hmp_boot_set(Monitor *mon, const QDict 
*qdict)
 }
 }
 
-#if defined(TARGET_I386)
-static void print_pte(Monitor *mon, hwaddr addr,
-  hwaddr pte,
-  hwaddr mask)
-{
-#ifdef TARGET_X86_64
-if (addr & (1ULL << 47)) {
-addr |= -1LL << 48;
-}
-#endif
-monitor_printf(mon, TARGET_FMT_plx ": " TARGET_FMT_plx
-   " %c%c%c%c%c%c%c%c%c\n",
-   addr,
-   pte & mask,
-   pte & PG_NX_MASK ? 'X' : '-',
-   pte & PG_GLOBAL_MASK ? 'G' : '-',
-   pte & PG_PSE_MASK ? 'P' : '-',
-   pte & PG_DIRTY_MASK ? 'D' : '-',
-   pte & PG_ACCESSED_MASK ? 'A' : '-',
-   pte & PG_PCD_MASK ? 'C' : '-',
-   pte & PG_PWT_MASK ? 'T' : '-',
-   pte & PG_USER_MASK ? 'U' : '-',
-   pte & PG_RW_MASK ? 'W' : '-');
-}
-
-static void tlb_info_32(Monitor *mon, CPUArchState *env)
-{
-unsigned int l1, l2;
-uint32_t pgd, pde, pte;
-
-pgd = env->cr[3] & ~0xfff;
-for(l1 = 0; l1 < 1024; l1++) {
-cpu_physical_memory_read(pgd + l1 * 4, &pde, 4);
-pde = le32_to_cpu(pde);
-if (pde & PG_PRESENT_MASK) {
-if ((pde & PG_PSE_MASK) && (env->cr[4] & CR4_P

Re: [Qemu-devel] [PATCH 07/12] kvm/x86: added hyper-v crash data and ctl msr's get/set'ers

2015-07-02 Thread Paolo Bonzini

On 02/07/2015 18:07, Denis V. Lunev wrote:
> From: Andrey Smetanin 
> 
> Added hyper-v crash msr's(HV_X64_MSR_CRASH*) data and control
> geters and setters. Userspace should check that such msr's
> available by check of KVM_CAP_HYPERV_MSR_CRASH capability.

It should use the existing KVM_GET_SUPPORTED_MSRS infrastructure.  See
emulated_msrs where other Hyper-V MSRs are listed.

Paolo

[Qemu-devel] [PATCH 3/3] monitor: added generation of documentation for hmp-commands-info.hx

2015-07-02 Thread Denis V. Lunev

From: Pavel Butsykin 

It will be easier if you need to add info-commands to edit
only hmp-commands-info.hx, before this had to edit monitor.c and
hmp-commands.hx

Signed-off-by: Pavel Butsykin 
Signed-off-by: Denis V. Lunev 
CC: Luiz Capitulino 
CC: Paolo Bonzini 
CC: Peter Maydell 
---
 .gitignore   |   1 +
 Makefile |   9 ++--
 hmp-commands-info.hx |   4 ++
 hmp-commands.hx  | 120 ---
 qemu-doc.texi|   2 +
 5 files changed, 13 insertions(+), 123 deletions(-)

diff --git a/.gitignore b/.gitignore
index aed0e1f..ac29efe 100644
--- a/.gitignore
+++ b/.gitignore
@@ -47,6 +47,7 @@
 /qemu-ga
 /qemu-bridge-helper
 /qemu-monitor.texi
+/qemu-monitor-info.texi
 /qmp-commands.txt
 /vscclient
 /fsdev/virtfs-proxy-helper
diff --git a/Makefile b/Makefile
index c9be643..8f394f1 100644
--- a/Makefile
+++ b/Makefile
@@ -344,7 +344,7 @@ qemu-%.tar.bz2:
$(SRC_PATH)/scripts/make-release "$(SRC_PATH)" "$(patsubst 
qemu-%.tar.bz2,%,$@)"
 
 distclean: clean
-   rm -f config-host.mak config-host.h* config-host.ld $(DOCS) 
qemu-options.texi qemu-img-cmds.texi qemu-monitor.texi
+   rm -f config-host.mak config-host.h* config-host.ld $(DOCS) 
qemu-options.texi qemu-img-cmds.texi qemu-monitor.texi qemu-monitor-info.texi
rm -f config-all-devices.mak config-all-disas.mak config.status
rm -f po/*.mo tests/qemu-iotests/common.env
rm -f roms/seabios/config.mak roms/vgabios/config.mak
@@ -508,13 +508,16 @@ qemu-options.texi: $(SRC_PATH)/qemu-options.hx
 qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"  GEN  
 $@")
 
+qemu-monitor-info.texi: $(SRC_PATH)/hmp-commands-info.hx
+   $(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"  GEN  
 $@")
+
 qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -q < $< > $@,"  GEN  
 $@")
 
 qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"  GEN  
 $@")
 
-qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi
+qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi 
qemu-monitor-info.texi
$(call quiet-command, \
  perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu.pod && \
  $(POD2MAN) --section=1 --center=" " --release=" " qemu.pod > $@, \
@@ -551,7 +554,7 @@ pdf: qemu-doc.pdf qemu-tech.pdf
 
 qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
qemu-img.texi qemu-nbd.texi qemu-options.texi \
-   qemu-monitor.texi qemu-img-cmds.texi
+   qemu-monitor.texi qemu-monitor-info.texi qemu-img-cmds.texi
 
 ifdef CONFIG_WIN32
 
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 9ccb33f..81ae9d7 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -6,6 +6,9 @@ HXCOMM monitor info commands
 HXCOMM HXCOMM can be used for comments, discarded from both texi and C
 
 STEXI
+@item info @var{subcommand}
+@findex info
+Show various information about the system state.
 @table @option
 ETEXI
 
@@ -708,4 +711,5 @@ ETEXI
 
 STEXI
 @end table
+@end table
 ETEXI
diff --git a/hmp-commands.hx b/hmp-commands.hx
index d3b7932..3b36db4 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -1705,123 +1705,3 @@ ETEXI
 .mhandler.cmd = hmp_info_help,
 .sub_table = info_cmds,
 },
-
-STEXI
-@item info @var{subcommand}
-@findex info
-Show various information about the system state.
-
-@table @option
-@item info version
-show the version of QEMU
-@item info network
-show the various VLANs and the associated devices
-@item info chardev
-show the character devices
-@item info block
-show the block devices
-@item info blockstats
-show block device statistics
-@item info registers
-show the cpu registers
-@item info cpus
-show infos for each CPU
-@item info history
-show the command line history
-@item info irq
-show the interrupts statistics (if available)
-@item info pic
-show i8259 (PIC) state
-@item info pci
-show emulated PCI device info
-@item info tlb
-show virtual to physical memory mappings (i386, SH4, SPARC, PPC, and Xtensa 
only)
-@item info mem
-show the active virtual memory mappings (i386 only)
-@item info jit
-show dynamic compiler info
-@item info numa
-show NUMA information
-@item info kvm
-show KVM information
-@item info usb
-show USB devices plugged on the virtual USB hub
-@item info usbhost
-show all USB host devices
-@item info profile
-show profiling information
-@item info capture
-show information about active capturing
-@item info snapshots
-show list of VM snapshots
-@item info status
-show the current VM status (running|paused)
-@item info mice
-show which guest mouse is receiving events
-@item info vnc
-show the vnc server status
-@item info name
-show the current VM name
-@item info uuid
-show the current VM UUID
-@item info cpustats
-show CPU statistics
-@item info usernet
-show user networ

[Qemu-devel] [PULL 27/27] migration: extend migration_bitmap

2015-07-02 Thread Juan Quintela

From: Li Zhijian 

Prevously, if we hotplug a device(e.g. device_add e1000) during
migration is processing in source side, qemu will add a new ram
block but migration_bitmap is not extended.
In this case, migration_bitmap will overflow and lead qemu abort
unexpectedly.

Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
Signed-off-by: Juan Quintela 
---
 exec.c  |  5 +
 include/exec/exec-all.h |  3 +++
 migration/ram.c | 28 
 3 files changed, 36 insertions(+)

diff --git a/exec.c b/exec.c
index f7883d2..1702544 100644
--- a/exec.c
+++ b/exec.c
@@ -1401,6 +1401,11 @@ static ram_addr_t ram_block_add(RAMBlock *new_block, 
Error **errp)
 }
 }

+new_ram_size = MAX(old_ram_size,
+  (new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS);
+if (new_ram_size > old_ram_size) {
+migration_bitmap_extend(old_ram_size, new_ram_size);
+}
 /* Keep the list sorted from biggest to smallest block.  Unlike QTAILQ,
  * QLIST (which has an RCU-friendly variant) does not have insertion at
  * tail, so save the last element in last_block.
diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index 2573e8c..91ffa99 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -385,4 +385,7 @@ static inline bool cpu_can_do_io(CPUState *cpu)
 return cpu->can_do_io != 0;
 }

+#if !defined(CONFIG_USER_ONLY)
+void migration_bitmap_extend(ram_addr_t old, ram_addr_t new);
+#endif
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index 9c0bcfe..c696814 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -222,6 +222,7 @@ static RAMBlock *last_seen_block;
 static RAMBlock *last_sent_block;
 static ram_addr_t last_offset;
 static unsigned long *migration_bitmap;
+static QemuMutex migration_bitmap_mutex;
 static uint64_t migration_dirty_pages;
 static uint32_t last_version;
 static bool ram_bulk_stage;
@@ -569,11 +570,13 @@ static void migration_bitmap_sync(void)
 trace_migration_bitmap_sync_start();
 address_space_sync_dirty_bitmap(&address_space_memory);

+qemu_mutex_lock(&migration_bitmap_mutex);
 rcu_read_lock();
 QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
 migration_bitmap_sync_range(block->mr->ram_addr, block->used_length);
 }
 rcu_read_unlock();
+qemu_mutex_unlock(&migration_bitmap_mutex);

 trace_migration_bitmap_sync_end(migration_dirty_pages
 - num_dirty_pages_init);
@@ -1062,6 +1065,30 @@ static void reset_ram_globals(void)

 #define MAX_WAIT 50 /* ms, half buffered_file limit */

+void migration_bitmap_extend(ram_addr_t old, ram_addr_t new)
+{
+/* called in qemu main thread, so there is
+ * no writing race against this migration_bitmap
+ */
+if (migration_bitmap) {
+unsigned long *old_bitmap = migration_bitmap, *bitmap;
+bitmap = bitmap_new(new);
+
+/* prevent migration_bitmap content from being set bit
+ * by migration_bitmap_sync_range() at the same time.
+ * it is safe to migration if migration_bitmap is cleared bit
+ * at the same time.
+ */
+qemu_mutex_lock(&migration_bitmap_mutex);
+bitmap_copy(bitmap, old_bitmap, old);
+bitmap_set(bitmap, old, new - old);
+atomic_rcu_set(&migration_bitmap, bitmap);
+qemu_mutex_unlock(&migration_bitmap_mutex);
+migration_dirty_pages += new - old;
+synchronize_rcu();
+g_free(old_bitmap);
+}
+}

 /* Each of ram_save_setup, ram_save_iterate and ram_save_complete has
  * long-running RCU critical section.  When rcu-reclaims in the code
@@ -1078,6 +1105,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
 dirty_rate_high_cnt = 0;
 bitmap_sync_count = 0;
 migration_bitmap_sync_init();
+qemu_mutex_init(&migration_bitmap_mutex);

 if (migrate_use_xbzrle()) {
 XBZRLE_cache_lock();
-- 
2.4.3

[Qemu-devel] [PATCH] block: update bdrv_drain_all()/bdrv_drain() comments

2015-07-02 Thread Stefan Hajnoczi

The doc comments for bdrv_drain_all() and bdrv_drain() are outdated:

 * The bdrv_drain() comment is a poor man's bdrv_lock()/bdrv_unlock()
   which Fam Zheng is currently developing.  Unfortunately this warning
   was never really enough because devices keep submitting I/O and op
   blockers don't prevent that.

 * The bdrv_drain_all() comment is still partially correct but reflects
   the nature of the implementation rather than API documentation.

Do make it clear that bdrv_drain() is only appropriate within an
AioContext.  For anything spanning AioContexts you need
bdrv_drain_all().

Cc: Markus Armbruster 
Cc: Paolo Bonzini 
Signed-off-by: Stefan Hajnoczi 
---
 block/io.c | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/block/io.c b/block/io.c
index e295992..6f5704f 100644
--- a/block/io.c
+++ b/block/io.c
@@ -236,12 +236,12 @@ static bool bdrv_requests_pending(BlockDriverState *bs)
 /*
  * Wait for pending requests to complete on a single BlockDriverState subtree
  *
- * See the warning in bdrv_drain_all().  This function can only be called if
- * you are sure nothing can generate I/O because you have op blockers
- * installed.
- *
  * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState
  * AioContext.
+ *
+ * Only this BlockDriverState's AioContext is run, so in-flight requests must
+ * not depend on events in other AioContexts.  In that case, use
+ * bdrv_drain_all() instead.
  */
 void bdrv_drain(BlockDriverState *bs)
 {
@@ -260,12 +260,6 @@ void bdrv_drain(BlockDriverState *bs)
  *
  * This function does not flush data to disk, use bdrv_flush_all() for that
  * after calling this function.
- *
- * Note that completion of an asynchronous I/O operation can trigger any
- * number of other I/O operations on other devices---for example a coroutine
- * can be arbitrarily complex and a constant flow of I/O can come until the
- * coroutine is complete.  Because of this, it is not possible to have a
- * function to drain a single device's I/O queue.
  */
 void bdrv_drain_all(void)
 {
@@ -288,6 +282,12 @@ void bdrv_drain_all(void)
 }
 }
 
+/* Note that completion of an asynchronous I/O operation can trigger any
+ * number of other I/O operations on other devices---for example a
+ * coroutine can submit an I/O request to another device in response to
+ * request completion.  Therefore we must keep looping until there was no
+ * more activity rather than simply draining each device independently.
+ */
 while (busy) {
 busy = false;
 
-- 
2.4.3

Re: [Qemu-devel] [PATCH 12/12] qemu/kvm/x86: hyper-v crash msrs set/get'ers and migration

2015-07-02 Thread Paolo Bonzini

On 02/07/2015 18:07, Denis V. Lunev wrote:
> +if (cpu->hyperv_crash &&
> +kvm_check_extension(cs->kvm_state, KVM_CAP_HYPERV_MSR_CRASH) > 
> 0) {
> +c->edx |= HV_X64_GUEST_CRASH_MSR_AVAILABLE;
> +has_msr_hv_crash = true;
> +}
> +

Please patch kvm_get_supported_msrs instead of adding a capability.  The
QEMU parts are otherwise okay.

Paolo

[Qemu-devel] [PULL 24/27] migration: Add migration events on target side

2015-07-02 Thread Juan Quintela

We reuse the migration events from the source side, sending them on the
appropiate place.

Signed-off-by: Juan Quintela 
Reviewed-by: Eric Blake 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/migration.c | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index d8415c4..c41779f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -222,6 +222,7 @@ void qemu_start_incoming_migration(const char *uri, Error 
**errp)
 {
 const char *p;

+qapi_event_send_migration(MIGRATION_STATUS_SETUP, &error_abort);
 if (!strcmp(uri, "defer")) {
 deferred_incoming_migration(errp);
 } else if (strstart(uri, "tcp:", &p)) {
@@ -250,7 +251,7 @@ static void process_incoming_migration_co(void *opaque)
 int ret;

 migration_incoming_state_new(f);
-
+qapi_event_send_migration(MIGRATION_STATUS_ACTIVE, &error_abort);
 ret = qemu_loadvm_state(f);

 qemu_fclose(f);
@@ -258,10 +259,12 @@ static void process_incoming_migration_co(void *opaque)
 migration_incoming_state_destroy();

 if (ret < 0) {
+qapi_event_send_migration(MIGRATION_STATUS_FAILED, &error_abort);
 error_report("load of migration failed: %s", strerror(-ret));
 migrate_decompress_threads_join();
 exit(EXIT_FAILURE);
 }
+qapi_event_send_migration(MIGRATION_STATUS_COMPLETED, &error_abort);
 qemu_announce_self();

 /* Make sure all file formats flush their mutable metadata */
-- 
2.4.3

[Qemu-devel] [PATCH 1/3] hmp-commands-info: move info_cmds content out of monitor.c

2015-07-02 Thread Denis V. Lunev

From: Pavel Butsykin 

For moving target- and device-specific code  from monitor.c,
to beginning we move info_cmds content to hmp-commands-info.hx

Signed-off-by: Pavel Butsykin 
Signed-off-by: Denis V. Lunev 
CC: Luiz Capitulino 
CC: Paolo Bonzini 
CC: Peter Maydell 
---
 Makefile.target  |   5 +-
 hmp-commands-info.hx | 711 +++
 monitor.c| 373 +--
 3 files changed, 717 insertions(+), 372 deletions(-)
 create mode 100644 hmp-commands-info.hx

diff --git a/Makefile.target b/Makefile.target
index 3e7aafd..93ae46d 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -151,7 +151,7 @@ else
 obj-y += hw/$(TARGET_BASE_ARCH)/
 endif
 
-GENERATED_HEADERS += hmp-commands.h qmp-commands-old.h
+GENERATED_HEADERS += hmp-commands.h hmp-commands-info.h qmp-commands-old.h
 
 endif # CONFIG_SOFTMMU
 
@@ -193,6 +193,9 @@ gdbstub-xml.c: $(TARGET_XML_FILES) 
$(SRC_PATH)/scripts/feature_to_c.sh
 hmp-commands.h: $(SRC_PATH)/hmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"  GEN  
 $(TARGET_DIR)$@")
 
+hmp-commands-info.h: $(SRC_PATH)/hmp-commands-info.hx
+   $(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"  GEN  
 $(TARGET_DIR)$@")
+
 qmp-commands-old.h: $(SRC_PATH)/qmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"  GEN  
 $(TARGET_DIR)$@")
 
diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
new file mode 100644
index 000..9ccb33f
--- /dev/null
+++ b/hmp-commands-info.hx
@@ -0,0 +1,711 @@
+HXCOMM Use DEFHEADING() to define headings in both help text and texi
+HXCOMM Text between STEXI and ETEXI are copied to texi version and
+HXCOMM discarded from C version
+HXCOMM DEF(command, args, callback, arg_string, help) is used to construct
+HXCOMM monitor info commands
+HXCOMM HXCOMM can be used for comments, discarded from both texi and C
+
+STEXI
+@table @option
+ETEXI
+
+{
+.name   = "version",
+.args_type  = "",
+.params = "",
+.help   = "show the version of QEMU",
+.mhandler.cmd = hmp_info_version,
+},
+
+STEXI
+@item info version
+@findex version
+Show the version of QEMU.
+ETEXI
+
+{
+.name   = "network",
+.args_type  = "",
+.params = "",
+.help   = "show the network state",
+.mhandler.cmd = hmp_info_network,
+},
+
+STEXI
+@item info network
+@findex network
+Show the network state.
+ETEXI
+
+{
+.name   = "chardev",
+.args_type  = "",
+.params = "",
+.help   = "show the character devices",
+.mhandler.cmd = hmp_info_chardev,
+},
+
+STEXI
+@item info chardev
+@findex chardev
+Show the character devices.
+ETEXI
+
+{
+.name   = "block",
+.args_type  = "nodes:-n,verbose:-v,device:B?",
+.params = "[-n] [-v] [device]",
+.help   = "show info of one block device or all block devices "
+  "(-n: show named nodes; -v: show details)",
+.mhandler.cmd = hmp_info_block,
+},
+
+STEXI
+@item info block
+@findex block
+Show info of one block device or all block devices.
+ETEXI
+
+{
+.name   = "blockstats",
+.args_type  = "",
+.params = "",
+.help   = "show block device statistics",
+.mhandler.cmd = hmp_info_blockstats,
+},
+
+STEXI
+@item info blockstats
+@findex blockstats
+Show block device statistics.
+ETEXI
+
+{
+.name   = "block-jobs",
+.args_type  = "",
+.params = "",
+.help   = "show progress of ongoing block device operations",
+.mhandler.cmd = hmp_info_block_jobs,
+},
+
+STEXI
+@item info block-jobs
+@findex block-jobs
+Show progress of ongoing block device operations.
+ETEXI
+
+{
+.name   = "registers",
+.args_type  = "",
+.params = "",
+.help   = "show the cpu registers",
+.mhandler.cmd = hmp_info_registers,
+},
+
+STEXI
+@item info registers
+@findex registers
+Show the cpu registers.
+ETEXI
+
+{
+.name   = "cpus",
+.args_type  = "",
+.params = "",
+.help   = "show infos for each CPU",
+.mhandler.cmd = hmp_info_cpus,
+},
+
+STEXI
+@item info cpus
+@findex cpus
+Show infos for each CPU.
+ETEXI
+
+{
+.name   = "history",
+.args_type  = "",
+.params = "",
+.help   = "show the command line history",
+.mhandler.cmd = hmp_info_history,
+},
+
+STEXI
+@item info history
+@findex history
+Show the command line history.
+ETEXI
+
+#if defined(TARGET_I386) || defined(TARGET_PPC) || defined(TARGET_MIPS) || \
+defined(TARGET_LM32) || (defined(TARGET_SPARC) && !defined(TARGET_SPARC64))
+{
+.name   = "irq",
+.args_type  = "",
+.params = "",
+.help   = "

Re: [Qemu-devel] [PATCH 11/12] qemu: add crash_occurred flag into CPUState

2015-07-02 Thread Paolo Bonzini



On 02/07/2015 18:18, Andreas Färber wrote:
>> > +uint32_t crash_occurred;
>> >  volatile sig_atomic_t exit_request;
>> >  uint32_t interrupt_request;
>> >  int singlestep_enabled;
> If you add this field to CPUState, you'll also need to reset it in
> qom/cpu.c. Or is it intentionally persistent?

And since you are at it, you can make it a bool.

Paolo

[Qemu-devel] [PULL 20/27] migration: ensure we start in NONE state

2015-07-02 Thread Juan Quintela

Signed-off-by: Juan Quintela 
Reviewed-by: Dr. David Alan Gilbert 
---
 migration/migration.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index 1e34aa5..5c1233f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -694,7 +694,6 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 error_setg(errp, QERR_MIGRATION_ACTIVE);
 return;
 }
-
 if (runstate_check(RUN_STATE_INMIGRATE)) {
 error_setg(errp, "Guest is waiting for an incoming migration");
 return;
@@ -709,6 +708,12 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
 return;
 }

+/* We are starting a new migration, so we want to start in a clean
+   state.  This change is only needed if previous migration
+   failed/was cancelled.  We don't use migrate_set_state() because
+   we are setting the initial state, not changing it. */
+s->state = MIGRATION_STATUS_NONE;
+
 s = migrate_init(¶ms);

 if (strstart(uri, "tcp:", &p)) {
-- 
2.4.3

[Qemu-devel] [PATCH 07/12] kvm/x86: added hyper-v crash data and ctl msr's get/set'ers

2015-07-02 Thread Denis V. Lunev

From: Andrey Smetanin 

Added hyper-v crash msr's(HV_X64_MSR_CRASH*) data and control
geters and setters. Userspace should check that such msr's
available by check of KVM_CAP_HYPERV_MSR_CRASH capability.

User space allowed to setup Hyper-V crash ctl msr.
This msr should be setup to HV_X64_MSR_CRASH_CTL_NOTIFY
value so Hyper-V guest knows it can send crash data to host.
But Hyper-V guest notifies about crash event by writing
the same HV_X64_MSR_CRASH_CTL_NOTIFY value into crash ctl msr.
So both user space and guest writes inside ctl msr the same value
and this patch distingiush the moment of actual guest crash
by checking host initiated value from msr info. Also patch
prevents modification of crash ctl msr by guest.

Signed-off-by: Andrey Smetanin 
Signed-off-by: Denis V. Lunev 
Reviewed-by: Peter Hornyack 
CC: Paolo Bonzini 
CC: Gleb Natapov 
---
 arch/x86/kvm/hyperv.c| 74 ++--
 arch/x86/kvm/hyperv.h|  2 +-
 arch/x86/kvm/x86.c   |  8 +-
 include/uapi/linux/kvm.h |  1 +
 4 files changed, 80 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index af83c96..a8160d2 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -48,7 +48,63 @@ static bool kvm_hv_msr_partition_wide(u32 msr)
return r;
 }
 
-static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data)
+static int kvm_hv_msr_get_crash_data(struct kvm_vcpu *vcpu,
+u32 index, u64 *pdata)
+{
+   struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+
+   if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
+   return -EINVAL;
+
+   *pdata = hv->hv_crash_param[index];
+   return 0;
+}
+
+static int kvm_hv_msr_get_crash_ctl(struct kvm_vcpu *vcpu, u64 *pdata)
+{
+   struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+
+   *pdata = hv->hv_crash_ctl;
+   return 0;
+}
+
+static int kvm_hv_msr_set_crash_ctl(struct kvm_vcpu *vcpu, u64 data, bool host)
+{
+   struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+
+   if (host)
+   hv->hv_crash_ctl = data & HV_X64_MSR_CRASH_CTL_NOTIFY;
+
+   if (!host && (data & HV_X64_MSR_CRASH_CTL_NOTIFY)) {
+
+   vcpu_debug(vcpu, "hv crash (0x%llx 0x%llx 0x%llx 0x%llx 
0x%llx)\n",
+ hv->hv_crash_param[0],
+ hv->hv_crash_param[1],
+ hv->hv_crash_param[2],
+ hv->hv_crash_param[3],
+ hv->hv_crash_param[4]);
+
+   /* Send notification about crash to user space */
+   kvm_make_request(KVM_REQ_HV_CRASH, vcpu);
+   }
+
+   return 0;
+}
+
+static int kvm_hv_msr_set_crash_data(struct kvm_vcpu *vcpu,
+u32 index, u64 data)
+{
+   struct kvm_hv *hv = &vcpu->kvm->arch.hyperv;
+
+   if (WARN_ON_ONCE(index >= ARRAY_SIZE(hv->hv_crash_param)))
+   return -EINVAL;
+
+   hv->hv_crash_param[index] = data;
+   return 0;
+}
+
+static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 msr, u64 data,
+bool host)
 {
struct kvm *kvm = vcpu->kvm;
struct kvm_hv *hv = &kvm->arch.hyperv;
@@ -101,6 +157,12 @@ static int kvm_hv_set_msr_pw(struct kvm_vcpu *vcpu, u32 
msr, u64 data)
mark_page_dirty(kvm, gfn);
break;
}
+   case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
+   return kvm_hv_msr_set_crash_data(vcpu,
+msr - HV_X64_MSR_CRASH_P0,
+data);
+   case HV_X64_MSR_CRASH_CTL:
+   return kvm_hv_msr_set_crash_ctl(vcpu, data, host);
default:
vcpu_unimpl(vcpu, "Hyper-V uhandled wrmsr: 0x%x data 0x%llx\n",
msr, data);
@@ -173,6 +235,12 @@ static int kvm_hv_get_msr_pw(struct kvm_vcpu *vcpu, u32 
msr, u64 *pdata)
case HV_X64_MSR_REFERENCE_TSC:
data = hv->hv_tsc_page;
break;
+   case HV_X64_MSR_CRASH_P0 ... HV_X64_MSR_CRASH_P4:
+   return kvm_hv_msr_get_crash_data(vcpu,
+msr - HV_X64_MSR_CRASH_P0,
+pdata);
+   case HV_X64_MSR_CRASH_CTL:
+   return kvm_hv_msr_get_crash_ctl(vcpu, pdata);
default:
vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
return 1;
@@ -217,13 +285,13 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, 
u64 *pdata)
return 0;
 }
 
-int kvm_hv_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data)
+int kvm_hv_set_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
 {
if (kvm_hv_msr_partition_wide(msr)) {
int r;
 
mutex_lock(&vcpu->kvm->lock);
-   r = kvm_hv_set_msr_pw(

Re: [Qemu-devel] [PATCH 11/12] qemu: add crash_occurred flag into CPUState

2015-07-02 Thread Andreas Färber

Hi,

This patch is clearly against QEMU, please name it "cpu: Add crash_...".
(You may want to take a second look at the non-CPU patches, too.)

Am 02.07.2015 um 18:07 schrieb Denis V. Lunev:
> From: Andrey Smetanin 
> 
> CPUState->crash_occurred value inside CPUState marks

"CPUState::crash_occurred field inside ..."

> that guest crash occurred. This value added into cpu common

"value is added"

> migration subsection.
> 
> Signed-off-by: Andrey Smetanin 
> Signed-off-by: Denis V. Lunev 
> CC: Paolo Bonzini 
> CC: Andreas Färber 
> ---
>  exec.c| 19 +++
>  include/qom/cpu.h |  1 +
>  vl.c  |  3 +++
>  3 files changed, 23 insertions(+)
> 
> diff --git a/exec.c b/exec.c
> index f7883d2..adf49e8 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -465,6 +465,24 @@ static const VMStateDescription 
> vmstate_cpu_common_exception_index = {
>  }
>  };
>  
> +static bool cpu_common_crash_occurred_needed(void *opaque)
> +{
> +CPUState *cpu = opaque;
> +
> +return cpu->crash_occurred != 0;
> +}
> +
> +static const VMStateDescription vmstate_cpu_common_crash_occurred = {
> +.name = "cpu_common/crash_occurred",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.needed = cpu_common_crash_occurred_needed,
> +.fields = (VMStateField[]) {
> +VMSTATE_UINT32(crash_occurred, CPUState),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
>  const VMStateDescription vmstate_cpu_common = {
>  .name = "cpu_common",
>  .version_id = 1,
> @@ -478,6 +496,7 @@ const VMStateDescription vmstate_cpu_common = {
>  },
>  .subsections = (const VMStateDescription*[]) {
>  &vmstate_cpu_common_exception_index,
> +&vmstate_cpu_common_crash_occurred,
>  NULL
>  }
>  };
> diff --git a/include/qom/cpu.h b/include/qom/cpu.h
> index 39f0f19..f559a69 100644
> --- a/include/qom/cpu.h
> +++ b/include/qom/cpu.h
> @@ -263,6 +263,7 @@ struct CPUState {
>  bool created;
>  bool stop;
>  bool stopped;
> +uint32_t crash_occurred;
>  volatile sig_atomic_t exit_request;
>  uint32_t interrupt_request;
>  int singlestep_enabled;

If you add this field to CPUState, you'll also need to reset it in
qom/cpu.c. Or is it intentionally persistent?

Looks good otherwise.

Regards,
Andreas

> diff --git a/vl.c b/vl.c
> index 38eee1f..9e0aee5 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1723,6 +1723,9 @@ void qemu_system_reset(bool report)
>  
>  void qemu_system_guest_panicked(void)
>  {
> +if (current_cpu) {
> +current_cpu->crash_occurred = 1;
> +}
>  qapi_event_send_guest_panicked(GUEST_PANIC_ACTION_PAUSE, &error_abort);
>  vm_stop(RUN_STATE_GUEST_PANICKED);
>  }

-- 
SUSE Linux GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Felix Imendörffer, Jane Smithard, Dilip Upmanyu, Graham Norton; HRB
21284 (AG Nürnberg)

[Qemu-devel] [PULL 25/27] check_section_footers: Check the correct section_id

2015-07-02 Thread Juan Quintela

From: "Dr. David Alan Gilbert" 

The section footers check was incorrectly checking the section_id
in the SaveStateEntry not the LoadStateEntry.  These can validly be different
if the two QEMU instances have instantiated their devices in a
different order.  The test only cares that we're finishing the same
section we started, and hence it's the LoadStateEntry that we care about.

Signed-off-by: Dr. David Alan Gilbert 
Reported-by: Christian Borntraeger 
Signed-off-by: Juan Quintela 
---
 migration/savevm.c | 74 +++---
 1 file changed, 37 insertions(+), 37 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index b28a269..86735fc 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -697,41 +697,6 @@ static void save_section_footer(QEMUFile *f, 
SaveStateEntry *se)
 }
 }

-/*
- * Read a footer off the wire and check that it matches the expected section
- *
- * Returns: true if the footer was good
- *  false if there is a problem (and calls error_report to say why)
- */
-static bool check_section_footer(QEMUFile *f, SaveStateEntry *se)
-{
-uint8_t read_mark;
-uint32_t read_section_id;
-
-if (skip_section_footers) {
-/* No footer to check */
-return true;
-}
-
-read_mark = qemu_get_byte(f);
-
-if (read_mark != QEMU_VM_SECTION_FOOTER) {
-error_report("Missing section footer for %s", se->idstr);
-return false;
-}
-
-read_section_id = qemu_get_be32(f);
-if (read_section_id != se->section_id) {
-error_report("Mismatched section id in footer for %s -"
- " read 0x%x expected 0x%x",
- se->idstr, read_section_id, se->section_id);
-return false;
-}
-
-/* All good */
-return true;
-}
-
 bool qemu_savevm_state_blocked(Error **errp)
 {
 SaveStateEntry *se;
@@ -1046,6 +1011,41 @@ struct LoadStateEntry {
 int version_id;
 };

+/*
+ * Read a footer off the wire and check that it matches the expected section
+ *
+ * Returns: true if the footer was good
+ *  false if there is a problem (and calls error_report to say why)
+ */
+static bool check_section_footer(QEMUFile *f, LoadStateEntry *le)
+{
+uint8_t read_mark;
+uint32_t read_section_id;
+
+if (skip_section_footers) {
+/* No footer to check */
+return true;
+}
+
+read_mark = qemu_get_byte(f);
+
+if (read_mark != QEMU_VM_SECTION_FOOTER) {
+error_report("Missing section footer for %s", le->se->idstr);
+return false;
+}
+
+read_section_id = qemu_get_be32(f);
+if (read_section_id != le->section_id) {
+error_report("Mismatched section id in footer for %s -"
+ " read 0x%x expected 0x%x",
+ le->se->idstr, read_section_id, le->section_id);
+return false;
+}
+
+/* All good */
+return true;
+}
+
 void loadvm_free_handlers(MigrationIncomingState *mis)
 {
 LoadStateEntry *le, *new_le;
@@ -1151,7 +1151,7 @@ int qemu_loadvm_state(QEMUFile *f)
  " device '%s'", instance_id, idstr);
 goto out;
 }
-if (!check_section_footer(f, le->se)) {
+if (!check_section_footer(f, le)) {
 ret = -EINVAL;
 goto out;
 }
@@ -1178,7 +1178,7 @@ int qemu_loadvm_state(QEMUFile *f)
  section_id, le->se->idstr);
 goto out;
 }
-if (!check_section_footer(f, le->se)) {
+if (!check_section_footer(f, le)) {
 ret = -EINVAL;
 goto out;
 }
-- 
2.4.3

[Qemu-devel] [PULL 14/27] runstate: migration allows more transitions now

2015-07-02 Thread Juan Quintela

Next commit would allow to move from incoming migration to error happening on 
source.

Should we add more states to this transition?  Luiz?

Signed-off-by: Juan Quintela 
Reviewed-by: Dr. David Alan Gilbert 
---
 vl.c | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/vl.c b/vl.c
index fec7e93..19a8737 100644
--- a/vl.c
+++ b/vl.c
@@ -573,8 +573,14 @@ static const RunStateTransition runstate_transitions_def[] 
= {
 { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
 { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },

-{ RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
+{ RUN_STATE_INMIGRATE, RUN_STATE_INTERNAL_ERROR },
+{ RUN_STATE_INMIGRATE, RUN_STATE_IO_ERROR },
 { RUN_STATE_INMIGRATE, RUN_STATE_PAUSED },
+{ RUN_STATE_INMIGRATE, RUN_STATE_RUNNING },
+{ RUN_STATE_INMIGRATE, RUN_STATE_SHUTDOWN },
+{ RUN_STATE_INMIGRATE, RUN_STATE_SUSPENDED },
+{ RUN_STATE_INMIGRATE, RUN_STATE_WATCHDOG },
+{ RUN_STATE_INMIGRATE, RUN_STATE_GUEST_PANICKED },

 { RUN_STATE_INTERNAL_ERROR, RUN_STATE_PAUSED },
 { RUN_STATE_INTERNAL_ERROR, RUN_STATE_FINISH_MIGRATE },
-- 
2.4.3

Re: [Qemu-devel] [PATCH v4 4/9] Add virt-v3 machine that uses GIC-500

2015-07-02 Thread Daniel P. Berrange

On Thu, Jul 02, 2015 at 05:47:04PM +0300, Pavel Fedin wrote:
>  Hello!
> 
>  I already explained this earlier: 
> http://lists.nongnu.org/archive/html/qemu-devel/2015-05/msg04842.html, and i 
> tried to explain this in commit message. Current qemu architecture does not 
> allow doing this in a clean way.
>  I can simply change mc->max_cpus for 'virt' machine to 64 (always), and
>  then check user-supplied value against GIC limitation by myself. And
>  produce error. This will be code duplication. Do you think it is better?
>  Anybody else (Peter, Cristoffer ?), please vote.

Yes, just increase the declared max cpus to the larger of the two values,
and do a manual check at time of use

Regards,
Daniel
-- 
|: http://berrange.com  -o-http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org  -o- http://virt-manager.org :|
|: http://autobuild.org   -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org   -o-   http://live.gnome.org/gtk-vnc :|

1 2 3 4 >

1 - 100 of 327 matches

Mail list logo