Re: [Qemu-devel] [PATCH 2/4] target-ppc: implement vnegw/d instructions

2016-10-09 Thread Nikunj A Dadhania
Richard Henderson  writes:

> On 10/07/2016 01:57 PM, Nikunj A Dadhania wrote:
>> +r->element[i] = (~(b->element[i]) + 1) & mask;  \
>
> Any reason you're not writing this as a proper negate?

No particular reason, I was just trying to mimic the pseudo code in the
ISA.

r->element[i] = -b->element[i];

Should be fine as well. 
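
For reference, a minimal sketch of why the two spellings agree, assuming the
element is an unsigned 32-bit value (the helper names here are made up for
illustration and are not from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Negation spelled out the way the ISA pseudo code does it:
 * invert all bits, add one, truncate to the element width. */
static inline uint32_t neg_isa_style(uint32_t x)
{
    return (~x + 1u) & UINT32_MAX;
}

/* The same operation as a plain negate. For an unsigned type the
 * result wraps modulo 2^32, so no explicit mask is needed. */
static inline uint32_t neg_plain(uint32_t x)
{
    return -x;
}
```

For a signed element type, negating the most negative value would overflow
in standard C, so the equivalence there relies on wrapping arithmetic being
enabled in the build.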

Regards,
Nikunj




[Qemu-devel] block/nfs: Fine grained runtime options in nfs

2016-10-09 Thread Ashijeet Acharya
Hi all,

I was working on trying to add blockdev-add compatibility for the nfs
block driver but before that runtime options need to be separated into
various options rather than just a simple "filename" option.

I have added the following until now:
a) host
b) port (not sure about this one, do we just use a default port number?)
c) export
d) path (path to the file)

I have matched these against the URI, but let me know if I have
missed any :)

Now, in order to parse the uri for different runtime options, I have
made two new functions, nfs_parse_filename() and nfs_parse_uri(), which
are pretty similar to how other network block drivers do it.

Currently we parse the uri in the nfs_client_open() function, which takes
'const char *filename' as one of its parameters, but I don't think
that's the right way anymore because we pass 'qemu_opt_get(opts,
"filename")', which is invalid now that no runtime option named
"filename" is available anymore. Right?

While parsing uri we check for the query parameters inside a 'for
loop', so I have moved that too inside

nfs_parse_uri(const char *filename, QDict *options, Error **errp)

but the problem is there is no struct NFSClient parameter here, so I
cannot fill up its important fields while parsing the query
parameters. I cannot do the same inside nfs_client_open() because I no
longer parse the uri over there. Or can I? A completely different
approach will work too :)

I can attach a pastebin link containing a raw patch if you want to get
a clear view but I am afraid it doesn't compile at the moment due to
the problems mentioned above.
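
One possible shape, sketched outside of QEMU: parse the URI into a plain
options record first, and let the open routine fill its own state from that
record afterwards, so the parser never needs the client struct. Everything
below (struct nfs_opts, parse_nfs_uri(), copy_span(), and the choice of
treating the last path component as the file and the rest as the export) is
an assumption made for illustration, not the actual block/nfs.c code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record holding the separated runtime options. */
struct nfs_opts {
    char host[64];
    int  port;              /* 0 means "not given, use a default" */
    char export_dir[128];
    char path[128];
};

/* Copy [start, end) into dst; fail on empty or oversized spans. */
static int copy_span(char *dst, size_t dst_size,
                     const char *start, const char *end)
{
    size_t n = (size_t)(end - start);

    if (n == 0 || n >= dst_size) {
        return -1;
    }
    memcpy(dst, start, n);
    dst[n] = '\0';
    return 0;
}

/* Split nfs://host[:port]/export/file[?query]; returns 0 on success. */
static int parse_nfs_uri(const char *uri, struct nfs_opts *o)
{
    const char *auth, *slash, *colon, *end, *last, *q;

    memset(o, 0, sizeof(*o));
    if (strncmp(uri, "nfs://", 6) != 0) {
        return -1;
    }
    auth = uri + 6;
    slash = strchr(auth, '/');
    if (slash == NULL) {
        return -1;
    }

    /* Authority part: host, optionally followed by ":port". */
    colon = memchr(auth, ':', (size_t)(slash - auth));
    if (copy_span(o->host, sizeof(o->host), auth,
                  colon ? colon : slash) != 0) {
        return -1;
    }
    if (colon != NULL) {
        char *stop;
        long port = strtol(colon + 1, &stop, 10);

        if (stop == colon + 1 || stop != slash ||
            port < 1 || port > 65535) {
            return -1;
        }
        o->port = (int)port;
    }

    /* The path ends where the query string (if any) begins. */
    end = strchr(slash, '?');
    if (end == NULL) {
        end = slash + strlen(slash);
    }

    /* Find the last '/' before the query: export before, file after. */
    last = slash;
    for (q = slash; q < end; q++) {
        if (*q == '/') {
            last = q;
        }
    }
    if (copy_span(o->export_dir, sizeof(o->export_dir), slash, last) != 0) {
        return -1;          /* also rejects URIs with no export part */
    }
    if (last + 1 == end) {
        return -1;          /* trailing slash, no file name */
    }
    return copy_span(o->path, sizeof(o->path), last, end);
}
```

The query parameters could be collected into the same record in the same
way, which would leave the client struct to be populated in one place at
open time.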

Any help will be appreciated.

Thanks for reading
Ashijeet



Re: [Qemu-devel] [PATCH 1/4] target-ppc: implement vexts[bh]2w and vexts[bhw]2d

2016-10-09 Thread Nikunj A Dadhania
Richard Henderson  writes:

> On 10/07/2016 01:57 PM, Nikunj A Dadhania wrote:
>> +VEXT_SIGNED(vextsb2w, s32, UINT8_MAX, char, int32_t)
>> +VEXT_SIGNED(vextsb2d, s64, UINT8_MAX, char, int64_t)
>
> char has target-dependent sign.  Use int8_t.

Sure. will change.
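
A small sketch of the difference, with made-up helper names (not the
VEXT_SIGNED macro itself): casting through int8_t sign-extends the low byte
on every host, while casting through plain char only does so where char
happens to be signed.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Sign-extend the low byte of an element to 32 bits via int8_t.
 * The narrowing conversion wraps on the compilers QEMU supports. */
static inline int32_t exts_b2w(uint32_t elem)
{
    return (int32_t)(int8_t)(elem & UINT8_MAX);
}

/* Same thing via plain char: the result depends on whether the
 * host ABI makes char signed or unsigned. */
static inline int32_t exts_b2w_char(uint32_t elem)
{
    return (int32_t)(char)(elem & UINT8_MAX);
}
```

On a host where char is unsigned, exts_b2w_char(0xff) yields 255 rather
than -1, which is the behaviour the review comment is warning about.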

Regards
Nikunj




Re: [Qemu-devel] [RFC 1/4] spapr_pci: Delegate placement of PCI host bridges to machine type

2016-10-09 Thread David Gibson
On Mon, Oct 10, 2016 at 12:04:29PM +1100, Alexey Kardashevskiy wrote:
> On 07/10/16 20:17, David Gibson wrote:
> > On Fri, Oct 07, 2016 at 04:34:59PM +1100, Alexey Kardashevskiy wrote:
> >> On 07/10/16 16:10, David Gibson wrote:
> >>> On Fri, Oct 07, 2016 at 02:57:43PM +1100, Alexey Kardashevskiy wrote:
> >>>> On 06/10/16 14:03, David Gibson wrote:
> > The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB)
> > for a PAPR guest.  Unlike on x86, it's routine on Power (both bare metal
> > and PAPR guests) to have numerous independent PHBs, each controlling a
> > separate PCI domain.
> >
> > There are two ways of configuring the spapr-pci-host-bridge device: 
> > first
> > it can be done fully manually, specifying the locations and sizes of all
> > the IO windows.  This gives the most control, but is very awkward with 6
> > mandatory parameters.  Alternatively just an "index" can be specified
> > which essentially selects from an array of predefined PHB locations.
> > The PHB at index 0 is automatically created as the default PHB.
> >
> > The current set of default locations causes some problems for guests 
> > with
> > large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia
> > GPGPU cards via VFIO).  Obviously, for migration we can only change the
> > locations on a new machine type, however.
> >
> > This is awkward, because the placement is currently decided within the
> > spapr-pci-host-bridge code, so it breaks abstraction to look inside the
> > machine type version.
> >
> > So, this patch delegates the "default mode" PHB placement from the
> > spapr-pci-host-bridge device back to the machine type via a public 
> > method
> > in sPAPRMachineClass.  It's still a bit ugly, but it's about the best we
> > can do.
> >
> > For now, this just changes where the calculation is done.  It doesn't
> > change the actual location of the host bridges, or any other behaviour.
> >
> > Signed-off-by: David Gibson 
> > ---
> >  hw/ppc/spapr.c  | 34 ++
> >  hw/ppc/spapr_pci.c  | 22 --
> >  include/hw/pci-host/spapr.h | 11 +--
> >  include/hw/ppc/spapr.h  |  4 
> >  4 files changed, 47 insertions(+), 24 deletions(-)
> >
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 03e3803..f6e9c2a 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -2370,6 +2370,39 @@ static HotpluggableCPUList 
> > *spapr_query_hotpluggable_cpus(MachineState *machine)
> >  return head;
> >  }
> >  
> > +static void spapr_phb_placement(sPAPRMachineState *spapr, uint32_t 
> > index,
> > +uint64_t *buid, hwaddr *pio, hwaddr 
> > *pio_size,
> > +hwaddr *mmio, hwaddr *mmio_size,
> > +unsigned n_dma, uint32_t *liobns, 
> > Error **errp)
> > +{
> > > +const uint64_t base_buid = 0x800000020000000ULL;
> > > +const hwaddr phb0_base = 0x10000000000ULL; /* 1 TiB */
> > > +const hwaddr phb_spacing = 0x1000000000ULL; /* 64 GiB */
> > > +const hwaddr mmio_offset = 0xa0000000; /* 2 GiB + 512 MiB */
> > > +const hwaddr pio_offset = 0x80000000; /* 2 GiB */
> > +const uint32_t max_index = 255;
> > +
> > +hwaddr phb_base;
> > +int i;
> > +
> > +if (index > max_index) {
> > +error_setg(errp, "\"index\" for PAPR PHB is too large (max 
> > %u)",
> > +   max_index);
> > +return;
> > +}
> > +
> > +*buid = base_buid + index;
> > +for (i = 0; i < n_dma; ++i) {
> > +liobns[i] = SPAPR_PCI_LIOBN(index, i);
> > +}
> > +
> > +phb_base = phb0_base + index * phb_spacing;
> > +*pio = phb_base + pio_offset;
> > +*pio_size = SPAPR_PCI_IO_WIN_SIZE;
> > +*mmio = phb_base + mmio_offset;
> > +*mmio_size = SPAPR_PCI_MMIO_WIN_SIZE;
> > +}
> > +
> >  static void spapr_machine_class_init(ObjectClass *oc, void *data)
> >  {
> >  MachineClass *mc = MACHINE_CLASS(oc);
> > @@ -2406,6 +2439,7 @@ static void spapr_machine_class_init(ObjectClass 
> > *oc, void *data)
> >  mc->query_hotpluggable_cpus = spapr_query_hotpluggable_cpus;
> >  fwc->get_dev_path = spapr_get_fw_dev_path;
> >  nc->nmi_monitor_handler = spapr_nmi;
> > +smc->phb_placement = spapr_phb_placement;
> >  }
> >  
> >  static const TypeInfo spapr_machine_info = {
> > diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> > index 4f00865..c0fc964 100644
> > --- a/hw/ppc/spapr_pci.c
> > +++ b/hw/ppc/spapr_pci.c
> > @@ -1311,7 +1311,8 @@ static void 

Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Michael S. Tsirkin
On Mon, Oct 10, 2016 at 04:16:19AM +0000, Wang, Zhihong wrote:
> 
> 
> > -----Original Message-----
> > From: Yuanhan Liu [mailto:yuanhan@linux.intel.com]
> > Sent: Monday, October 10, 2016 11:59 AM
> > To: Michael S. Tsirkin 
> > Cc: Maxime Coquelin ; Stephen Hemminger
> > ; d...@dpdk.org; qemu-
> > de...@nongnu.org; Wang, Zhihong 
> > Subject: Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
> > 
> > On Mon, Oct 10, 2016 at 06:46:44AM +0300, Michael S. Tsirkin wrote:
> > > On Mon, Oct 10, 2016 at 11:37:44AM +0800, Yuanhan Liu wrote:
> > > > On Thu, Sep 29, 2016 at 11:21:48PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
> > > > > >
> > > > > >
> > > > > > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
> > > > > Yes but two points.
> > > > >
> > > > > 1. why is this memset expensive?
> > > >
> > > > I don't have the exact answer, but just some rough thoughts:
> > > >
> > > > It's an external clib function: there is a call stack and the
> > > > IP register will bounce back and forth.
> > >
> > > for memset 0?  gcc 5.3.1 on fedora happily inlines it.
> > 
> > Good to know!
> > 
> > > > overkill to use that for resetting 14 bytes structure.
> > > >
> > > > Some trick like
> > > > *(struct virtio_net_hdr *)hdr = {0, };
> > > >
> > > > Or even
> > > > hdr->xxx = 0;
> > > > hdr->yyy = 0;
> > > >
> > > > should behave better.
> > > >
> > > > There was an example: the vhost enqueue optimization patchset from
> > > > Zhihong [0] uses memset, and it introduces more than 15% drop (IIRC)
> > > > on my Ivybridge server: it has no such issue on his server though.
> > > >
> > > > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> > > >
> > > > --yliu
> > >
> > > I'd say that's weird. what's your config? any chance you
> > > are using an old compiler?
> > 
> > Not really, it's gcc 5.3.1. Maybe Zhihong could explain more. IIRC,
> > he said the memset is not well optimized for Ivybridge server.
> 
> The dst is remote in that case. It's fine on Haswell but has complication
> in Ivy Bridge which (wasn't supposed to but) causes serious frontend issue.
> 
> I don't think gcc inlined it there. I'm using fc24 gcc 6.1.1.


So try something like this then:

Signed-off-by: Michael S. Tsirkin 


diff --git a/drivers/net/virtio/virtio_pci.h b/drivers/net/virtio/virtio_pci.h
index dd7693f..7a3f88e 100644
--- a/drivers/net/virtio/virtio_pci.h
+++ b/drivers/net/virtio/virtio_pci.h
@@ -292,6 +292,16 @@ vtpci_with_feature(struct virtio_hw *hw, uint64_t bit)
return (hw->guest_features & (1ULL << bit)) != 0;
 }
 
+static inline int
+vtnet_hdr_size(struct virtio_hw *hw)
+{
+   if (vtpci_with_feature(hw, VIRTIO_NET_F_MRG_RXBUF) ||
+   vtpci_with_feature(hw, VIRTIO_F_VERSION_1))
+   return sizeof(struct virtio_net_hdr_mrg_rxbuf);
+   else
+   return sizeof(struct virtio_net_hdr);
+}
+
 /*
  * Function declaration from virtio_pci.c
  */
diff --git a/drivers/net/virtio/virtio_rxtx.c b/drivers/net/virtio/virtio_rxtx.c
index a27208e..21a45e1 100644
--- a/drivers/net/virtio/virtio_rxtx.c
+++ b/drivers/net/virtio/virtio_rxtx.c
@@ -216,7 +216,7 @@ virtqueue_enqueue_xmit(struct virtnet_tx *txvq, struct 
rte_mbuf *cookie,
struct vring_desc *start_dp;
uint16_t seg_num = cookie->nb_segs;
uint16_t head_idx, idx;
-   uint16_t head_size = vq->hw->vtnet_hdr_size;
+   uint16_t head_size = vtnet_hdr_size(vq->hw);
unsigned long offs;
 
head_idx = vq->vq_desc_head_idx;

Generally pointer chasing in vq->hw->vtnet_hdr_size can't be good
for performance. Move fields used on data path into vq
and use from there to avoid indirections?
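
To make the suggestion concrete, a minimal sketch with hypothetical struct
and function names (not the actual virtio PMD types): the header size is
copied into the queue once at setup, so the data path reads it with a
single load instead of following vq->hw.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-device state. */
struct hw_state {
    uint64_t guest_features;
    uint16_t hdr_size;
};

/* Hypothetical stand-in for the per-queue state. */
struct tx_queue {
    struct hw_state *hw;
    uint16_t hdr_size;      /* cached copy of hw->hdr_size */
};

/* Setup path: copy the fields the data path needs into the queue. */
static void tx_queue_init(struct tx_queue *q, struct hw_state *hw)
{
    q->hw = hw;
    q->hdr_size = hw->hdr_size;
}

/* Data path: one load from the queue, no second pointer dereference. */
static inline uint16_t tx_hdr_size(const struct tx_queue *q)
{
    return q->hdr_size;
}
```

The cached copy has to be refreshed whenever the underlying value can
change, for example after feature negotiation.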




Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Michael S. Tsirkin
On Mon, Oct 10, 2016 at 12:22:09PM +0800, Yuanhan Liu wrote:
> On Mon, Oct 10, 2016 at 07:17:06AM +0300, Michael S. Tsirkin wrote:
> > On Mon, Oct 10, 2016 at 12:05:31PM +0800, Yuanhan Liu wrote:
> > > On Fri, Sep 30, 2016 at 10:16:43PM +0300, Michael S. Tsirkin wrote:
> > > > > > And the same is done in DPDK:
> > > > > > 
> > > > > > static inline int __attribute__((always_inline))
> > > > > > copy_desc_to_mbuf(struct virtio_net *dev, struct vring_desc *descs,
> > > > > >   uint16_t max_desc, struct rte_mbuf *m, uint16_t desc_idx,
> > > > > >   struct rte_mempool *mbuf_pool)
> > > > > > {
> > > > > > ...
> > > > > > /*
> > > > > >  * A virtio driver normally uses at least 2 desc buffers
> > > > > >  * for Tx: the first for storing the header, and others
> > > > > >  * for storing the data.
> > > > > >  */
> > > > > > if (likely((desc->len == dev->vhost_hlen) &&
> > > > > >(desc->flags & VRING_DESC_F_NEXT) != 0)) {
> > > > > > desc = &descs[desc->next];
> > > > > > if (unlikely(desc->flags & VRING_DESC_F_INDIRECT))
> > > > > > return -1;
> > > > > > 
> > > > > > desc_addr = gpa_to_vva(dev, desc->addr);
> > > > > > if (unlikely(!desc_addr))
> > > > > > return -1;
> > > > > > 
> > > > > > rte_prefetch0((void *)(uintptr_t)desc_addr);
> > > > > > 
> > > > > > desc_offset = 0;
> > > > > > desc_avail  = desc->len;
> > > > > > nr_desc+= 1;
> > > > > > 
> > > > > > PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0);
> > > > > > } else {
> > > > > > desc_avail  = desc->len - dev->vhost_hlen;
> > > > > > desc_offset = dev->vhost_hlen;
> > > > > > }
> > > > > 
> > > > > Actually, the header is parsed in DPDK vhost implementation.
> > > > > But as Virtio PMD provides a zero'ed header, we could just parse
> > > > > the header only if VIRTIO_NET_F_NO_TX_HEADER is not negotiated.
> > > > 
> > > > host can always skip the header parse if it wants to.
> > > > It didn't seem worth it to add branches there but
> > > > if I'm wrong, by all means code it up.
> > > 
> > > It was added by the following commit, which yields about a 10% performance
> > > boost for the PVP case (with 64B packet size).
> > > 
> > > At that time, a packet always used 2 descs. Since indirect desc is
> > > enabled (by default) now, that assumption no longer holds. What's
> > > worse, it might even slow things down a bit. That should also be
> > > part of the reason why performance is slightly worse than before.
> > > 
> > >   --yliu
> > 
> > I'm not sure I get what you are saying
> > 
> > > commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98
> > > Author: Yuanhan Liu 
> > > Date:   Mon May 2 17:46:17 2016 -0700
> > > 
> > > vhost: optimize dequeue for small packets
> > > 
> > > A virtio driver normally uses at least 2 desc buffers for Tx: the
> > > first for storing the header, and the others for storing the data.
> > > 
> > > Therefore, we could fetch the first data desc buf before the main
> > > loop, and do the copy first before the check of "are we done yet?".
> > > This could save one check for small packets that just have one data
> > > desc buffer and need one mbuf to store it.
> > > 
> > > Signed-off-by: Yuanhan Liu 
> > > Acked-by: Huawei Xie 
> > > Tested-by: Rich Lane 
> > 
> > This fast-paths the 2-descriptors format but it's not active
> > for indirect descriptors. Is this what you mean?
> 
> Yes. It's also not active when ANY_LAYOUT is actually turned on.

It's not needed there though - you only use 1 desc, no point in
fetching the next one.


> > Should be a simple matter to apply this optimization for indirect.
> 
> Might be.
> 
>   --yliu



Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Michael S. Tsirkin
On Mon, Oct 10, 2016 at 04:16:19AM +0000, Wang, Zhihong wrote:
> 
> 
> > -----Original Message-----
> > From: Yuanhan Liu [mailto:yuanhan@linux.intel.com]
> > Sent: Monday, October 10, 2016 11:59 AM
> > To: Michael S. Tsirkin 
> > Cc: Maxime Coquelin ; Stephen Hemminger
> > ; d...@dpdk.org; qemu-
> > de...@nongnu.org; Wang, Zhihong 
> > Subject: Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature
> > 
> > On Mon, Oct 10, 2016 at 06:46:44AM +0300, Michael S. Tsirkin wrote:
> > > On Mon, Oct 10, 2016 at 11:37:44AM +0800, Yuanhan Liu wrote:
> > > > On Thu, Sep 29, 2016 at 11:21:48PM +0300, Michael S. Tsirkin wrote:
> > > > > On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
> > > > > >
> > > > > >
> > > > > > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
> > > > > Yes but two points.
> > > > >
> > > > > 1. why is this memset expensive?
> > > >
> > > > I don't have the exact answer, but just some rough thoughts:
> > > >
> > > > It's an external clib function: there is a call stack and the
> > > > IP register will bounce back and forth.
> > >
> > > for memset 0?  gcc 5.3.1 on fedora happily inlines it.
> > 
> > Good to know!
> > 
> > > > overkill to use that for resetting 14 bytes structure.
> > > >
> > > > Some trick like
> > > > *(struct virtio_net_hdr *)hdr = {0, };
> > > >
> > > > Or even
> > > > hdr->xxx = 0;
> > > > hdr->yyy = 0;
> > > >
> > > > should behave better.
> > > >
> > > > There was an example: the vhost enqueue optimization patchset from
> > > > Zhihong [0] uses memset, and it introduces more than 15% drop (IIRC)
> > > > on my Ivybridge server: it has no such issue on his server though.
> > > >
> > > > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> > > >
> > > > --yliu
> > >
> > > I'd say that's weird. what's your config? any chance you
> > > are using an old compiler?
> > 
> > Not really, it's gcc 5.3.1. Maybe Zhihong could explain more. IIRC,
> > he said the memset is not well optimized for Ivybridge server.
> 
> The dst is remote in that case. It's fine on Haswell but has complication
> in Ivy Bridge which (wasn't supposed to but) causes serious frontend issue.
> 
> I don't think gcc inlined it there. I'm using fc24 gcc 6.1.1.

I just wrote some test code and compiled with gcc -O2,
it did get inlined.

It's probably this:
uint16_t head_size = vq->hw->vtnet_hdr_size;
...
memset(hdr, 0, head_size);
IOW head_size is not known to compiler.

Try sticking a bool there instead of vtnet_hdr_size, and
 memset(hdr, 0, bigheader ? 12 : 10);
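
A sketch of that idea, using 10 and 12 as the two header sizes and a
made-up helper name: each memset call gets a compile-time constant length,
which gives the compiler the chance to expand it into a few stores. Whether
it actually does so depends on the compiler and flags.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define HDR_SIZE     10u    /* plain header */
#define HDR_MRG_SIZE 12u    /* header with the extra mergeable-rx field */

/* Clear a header whose size is one of two known constants. */
static inline void clear_hdr(void *hdr, bool bigheader)
{
    if (bigheader) {
        memset(hdr, 0, HDR_MRG_SIZE);
    } else {
        memset(hdr, 0, HDR_SIZE);
    }
}
```

The branch is on a flag that stays fixed for the life of the queue, so it
should predict well; the variable-length version instead leaves the length
opaque to the compiler.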


> > 
> > --yliu



Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Yuanhan Liu
On Mon, Oct 10, 2016 at 07:17:06AM +0300, Michael S. Tsirkin wrote:
> On Mon, Oct 10, 2016 at 12:05:31PM +0800, Yuanhan Liu wrote:
> > On Fri, Sep 30, 2016 at 10:16:43PM +0300, Michael S. Tsirkin wrote:
> > > > > And the same is done in DPDK:
> > > > > 
> > > > > static inline int __attribute__((always_inline))
> > > > > copy_desc_to_mbuf(struct virtio_net *dev, struct vring_desc *descs,
> > > > >   uint16_t max_desc, struct rte_mbuf *m, uint16_t desc_idx,
> > > > >   struct rte_mempool *mbuf_pool)
> > > > > {
> > > > > ...
> > > > > /*
> > > > >  * A virtio driver normally uses at least 2 desc buffers
> > > > >  * for Tx: the first for storing the header, and others
> > > > >  * for storing the data.
> > > > >  */
> > > > > if (likely((desc->len == dev->vhost_hlen) &&
> > > > >(desc->flags & VRING_DESC_F_NEXT) != 0)) {
> > > > > desc = &descs[desc->next];
> > > > > if (unlikely(desc->flags & VRING_DESC_F_INDIRECT))
> > > > > return -1;
> > > > > 
> > > > > desc_addr = gpa_to_vva(dev, desc->addr);
> > > > > if (unlikely(!desc_addr))
> > > > > return -1;
> > > > > 
> > > > > rte_prefetch0((void *)(uintptr_t)desc_addr);
> > > > > 
> > > > > desc_offset = 0;
> > > > > desc_avail  = desc->len;
> > > > > nr_desc+= 1;
> > > > > 
> > > > > PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0);
> > > > > } else {
> > > > > desc_avail  = desc->len - dev->vhost_hlen;
> > > > > desc_offset = dev->vhost_hlen;
> > > > > }
> > > > 
> > > > Actually, the header is parsed in DPDK vhost implementation.
> > > > But as Virtio PMD provides a zero'ed header, we could just parse
> > > > the header only if VIRTIO_NET_F_NO_TX_HEADER is not negotiated.
> > > 
> > > host can always skip the header parse if it wants to.
> > > It didn't seem worth it to add branches there but
> > > if I'm wrong, by all means code it up.
> > 
> > It was added by the following commit, which yields about a 10% performance
> > boost for the PVP case (with 64B packet size).
> > 
> > At that time, a packet always used 2 descs. Since indirect desc is
> > enabled (by default) now, that assumption no longer holds. What's
> > worse, it might even slow things down a bit. That should also be
> > part of the reason why performance is slightly worse than before.
> > 
> > --yliu
> 
> I'm not sure I get what you are saying
> 
> > commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98
> > Author: Yuanhan Liu 
> > Date:   Mon May 2 17:46:17 2016 -0700
> > 
> > vhost: optimize dequeue for small packets
> > 
> > A virtio driver normally uses at least 2 desc buffers for Tx: the
> > first for storing the header, and the others for storing the data.
> > 
> > Therefore, we could fetch the first data desc buf before the main
> > loop, and do the copy first before the check of "are we done yet?".
> > This could save one check for small packets that just have one data
> > desc buffer and need one mbuf to store it.
> > 
> > Signed-off-by: Yuanhan Liu 
> > Acked-by: Huawei Xie 
> > Tested-by: Rich Lane 
> 
> This fast-paths the 2-descriptors format but it's not active
> for indirect descriptors. Is this what you mean?

Yes. It's also not active when ANY_LAYOUT is actually turned on.

> Should be a simple matter to apply this optimization for indirect.

Might be.

--yliu



Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Michael S. Tsirkin
On Mon, Oct 10, 2016 at 12:05:31PM +0800, Yuanhan Liu wrote:
> On Fri, Sep 30, 2016 at 10:16:43PM +0300, Michael S. Tsirkin wrote:
> > > > And the same is done in DPDK:
> > > > 
> > > > static inline int __attribute__((always_inline))
> > > > copy_desc_to_mbuf(struct virtio_net *dev, struct vring_desc *descs,
> > > >   uint16_t max_desc, struct rte_mbuf *m, uint16_t desc_idx,
> > > >   struct rte_mempool *mbuf_pool)
> > > > {
> > > > ...
> > > > /*
> > > >  * A virtio driver normally uses at least 2 desc buffers
> > > >  * for Tx: the first for storing the header, and others
> > > >  * for storing the data.
> > > >  */
> > > > if (likely((desc->len == dev->vhost_hlen) &&
> > > >(desc->flags & VRING_DESC_F_NEXT) != 0)) {
> > > > desc = &descs[desc->next];
> > > > if (unlikely(desc->flags & VRING_DESC_F_INDIRECT))
> > > > return -1;
> > > > 
> > > > desc_addr = gpa_to_vva(dev, desc->addr);
> > > > if (unlikely(!desc_addr))
> > > > return -1;
> > > > 
> > > > rte_prefetch0((void *)(uintptr_t)desc_addr);
> > > > 
> > > > desc_offset = 0;
> > > > desc_avail  = desc->len;
> > > > nr_desc+= 1;
> > > > 
> > > > PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0);
> > > > } else {
> > > > desc_avail  = desc->len - dev->vhost_hlen;
> > > > desc_offset = dev->vhost_hlen;
> > > > }
> > > 
> > > Actually, the header is parsed in DPDK vhost implementation.
> > > But as Virtio PMD provides a zero'ed header, we could just parse
> > > the header only if VIRTIO_NET_F_NO_TX_HEADER is not negotiated.
> > 
> > host can always skip the header parse if it wants to.
> > It didn't seem worth it to add branches there but
> > if I'm wrong, by all means code it up.
> 
> It was added by the following commit, which yields about a 10% performance
> boost for the PVP case (with 64B packet size).
> 
> At that time, a packet always used 2 descs. Since indirect desc is
> enabled (by default) now, that assumption no longer holds. What's
> worse, it might even slow things down a bit. That should also be
> part of the reason why performance is slightly worse than before.
> 
>   --yliu

I'm not sure I get what you are saying

> commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98
> Author: Yuanhan Liu 
> Date:   Mon May 2 17:46:17 2016 -0700
> 
> vhost: optimize dequeue for small packets
> 
> A virtio driver normally uses at least 2 desc buffers for Tx: the
> first for storing the header, and the others for storing the data.
> 
> Therefore, we could fetch the first data desc buf before the main
> loop, and do the copy first before the check of "are we done yet?".
> This could save one check for small packets that just have one data
> desc buffer and need one mbuf to store it.
> 
> Signed-off-by: Yuanhan Liu 
> Acked-by: Huawei Xie 
> Tested-by: Rich Lane 

This fast-paths the 2-descriptors format but it's not active
for indirect descriptors. Is this what you mean?
Should be a simple matter to apply this optimization for indirect.





[Qemu-devel] [PATCH v12 0/2] virtio-crypto: virtio crypto device specification

2016-10-09 Thread Gonglei
This is the specification for a new virtio crypto device.

You can get the source code from the links below:

[PATCH v3 00/10] virtio-crypto: introduce framework and device emulation
  https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg04132.html

[PATCH v4 00/13] virtio-crypto: introduce framework and device emulation
 https://lists.gnu.org/archive/html/qemu-devel/2016-09/msg07327.html

[PATCH v5 00/14] virtio-crypto: introduce framework and device emulation
 https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg00963.html

For more information, please see:
 http://qemu-project.org/Features/VirtioCrypto

Please help to review, thanks.

CC: Michael S. Tsirkin 
CC: Cornelia Huck 
CC: Stefan Hajnoczi 
CC: Lingli Deng 
CC: Jani Kokkonen 
CC: Ola Liljedahl 
CC: Varun Sethi 
CC: Zeng Xin 
CC: Keating Brian 
CC: Ma Liang J 
CC: Griffin John 
CC: Hanweidong 
CC: Mihai Claudiu Caraman 

Changes since v11:
 - drop scatter-gather I/O definition for virtio crypto device because
   the vring already provides scatter-gather I/O.  It is usually not
   necessary to define scatter-gather I/O at the device level.  [Stefan]
 - refine the definition of the algorithm chain parameters.
 - add HASH/MAC parameter structure.

Changes since v10:
 - fix typos s/filed/field/. [Xin]
 - replace 'real cypto accelerator' with 'backend crypto accelerator'. [mst]
 - drop KDF, ASYM, PRIMITIVE services description temporarily. [mst]
 - write a testable device requirement about VIRTIO_CRYPTO_S_HW_READY. [mst]
 - add a space before * in one code comment. [mst]
 - reset the layout of all crypto operations for better asymmetric algos support. [Xin]
 - add more detailed description for initialization vector under different modes.
 - sed -i 's/VIRTIO_CRYPTO_OP_/VIRTIO_CRYPTO_/g' for general usage in asym algos. [Xin]

Changes since v9:
 - ask a native speaker to go over the text and fix the corresponding grammar issues. [mst]
 - make some descriptions more appropriate here and there. [mst]
 - rewrite some requirements for both device and driver. [mst]
 - use RFC 2119 keywords. [mst]
 - fix some complaints by Xelatex and typos. [Xin Zeng]
 - add scatter/gather chain support for possible large block data.

Thanks for your review, Michael and Xin.

Changes from v8:
 - add additional auth gpa and length to struct virtio_crypto_sym_data_req;
 - add definition of op in struct virtio_crypto_cipher_session_para,
  VIRTIO_CRYPTO_OP_ENCRYPT and VIRTIO_CRYPTO_OP_DECRYPT;
 - make all structures 64bit aligned in order to support different
  architectures more conveniently [Alex & Stefan]
 - change to \devicenormative{\subsection} and \drivernormative{\subsection} in some sections [Stefan]
 - driver does not have to initialize all data virtqueues if it wants to use fewer [Stefan]
 - drop VIRTIO_CRYPTO_NO_SERVICE definition [Stefan]
 - fix many grammatical problems and typos. [Stefan]
 - rename VIRTIO_CRYPTO_MAC_CMAC_KASUMI_F9 to VIRTIO_CRYPTO_MAC_CMAC_KASUMI_F9,
  and VIRTIO_CRYPTO_MAC_CMAC_SNOW3G_UIA2 to VIRTIO_CRYPTO_MAC_SNOW3G_UIA2. [Liang Ma]
 - drop queue_id property of struct virtio_crypto_op_data_req.
 - reconstruct some structures about session operation request.
 - introduce struct virtio_crypto_alg_chain_session_req and struct virtio_crypto_alg_chain_data_req,
  introduce chain para, output, input structures as well.
 - change some sections' layout for better compatibility, for asymmetric algos. [Xin Zeng]

Changes from v7:
 - fix some grammar or typo problems.
 - add more detailed description at steps of encryption section.

Changes from v6:
 - drop version field in struct virtio_crypto_config. [Michael & Cornelia]
 - change the incorrect description in initialization routine. [Zeng Xin]
 - redefine flag as u16 for structure alignment. [Zeng Xin]
 - move the content of virtio_crypto_hash_session_para into
   virtio_crypto_hash_session_input directly; same for MAC/SYM/AEAD session creation. [Zeng Xin]
 - adjust the sequence of idata and odata, following the virtio scsi parts,
   and add comments about device-readable/writable for them.
 - add restrictions on the guest memory in some structures, which
   MUST be guaranteed to be allocated and physically contiguous.
Changes from v5:
 - add conformance clauses for virtio crypto device. [Michael]
 - drop VIRTIO_CRYPTO_S_STARTED. [Michael]
 - fix some characters problems. [Stefan]
 - add a MAC algorithm, named VIRTIO_CRYPTO_MAC_ZUC_EIA3. [Zeng Xin]
 - add the fourth return code, named VIRTIO_CRYPTO_OP_INVSESS used
   for invalid session id when executing crypto operations.
 - drop some leftover gpu stuff that I forgot to delete. [Michael]
 - convert tab to space all over the content.


Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Yuanhan Liu
On Fri, Sep 30, 2016 at 10:16:43PM +0300, Michael S. Tsirkin wrote:
> > > And the same is done in DPDK:
> > > 
> > > static inline int __attribute__((always_inline))
> > > copy_desc_to_mbuf(struct virtio_net *dev, struct vring_desc *descs,
> > >   uint16_t max_desc, struct rte_mbuf *m, uint16_t desc_idx,
> > >   struct rte_mempool *mbuf_pool)
> > > {
> > > ...
> > > /*
> > >  * A virtio driver normally uses at least 2 desc buffers
> > >  * for Tx: the first for storing the header, and others
> > >  * for storing the data.
> > >  */
> > > if (likely((desc->len == dev->vhost_hlen) &&
> > >(desc->flags & VRING_DESC_F_NEXT) != 0)) {
> > > desc = &descs[desc->next];
> > > if (unlikely(desc->flags & VRING_DESC_F_INDIRECT))
> > > return -1;
> > > 
> > > desc_addr = gpa_to_vva(dev, desc->addr);
> > > if (unlikely(!desc_addr))
> > > return -1;
> > > 
> > > rte_prefetch0((void *)(uintptr_t)desc_addr);
> > > 
> > > desc_offset = 0;
> > > desc_avail  = desc->len;
> > > nr_desc+= 1;
> > > 
> > > PRINT_PACKET(dev, (uintptr_t)desc_addr, desc->len, 0);
> > > } else {
> > > desc_avail  = desc->len - dev->vhost_hlen;
> > > desc_offset = dev->vhost_hlen;
> > > }
> > 
> > Actually, the header is parsed in DPDK vhost implementation.
> > But as Virtio PMD provides a zero'ed header, we could just parse
> > the header only if VIRTIO_NET_F_NO_TX_HEADER is not negotiated.
> 
> host can always skip the header parse if it wants to.
> It didn't seem worth it to add branches there but
> if I'm wrong, by all means code it up.

It was added by the following commit, which yields about a 10% performance
boost for the PVP case (with 64B packet size).

At that time, a packet always used 2 descs. Since indirect desc is
enabled (by default) now, that assumption no longer holds. What's
worse, it might even slow things down a bit. That should also be
part of the reason why performance is slightly worse than before.

--yliu

commit 1d41d77cf81c448c1b09e1e859bfd300e2054a98
Author: Yuanhan Liu 
Date:   Mon May 2 17:46:17 2016 -0700

vhost: optimize dequeue for small packets

A virtio driver normally uses at least 2 desc buffers for Tx: the
first for storing the header, and the others for storing the data.

Therefore, we could fetch the first data desc buf before the main
loop, and do the copy first before the check of "are we done yet?".
This could save one check for small packets that just have one data
desc buffer and need one mbuf to store it.

Signed-off-by: Yuanhan Liu 
Acked-by: Huawei Xie 
Tested-by: Rich Lane 




Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Yuanhan Liu
On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
> >so doing this unconditionally would be a spec violation, but if you see
> >value in this, we can add a feature bit.
> Right it would be a spec violation, so it should be done conditionally.
> If a feature bit is to be added, what about VIRTIO_NET_F_NO_TX_HEADER?
> It would imply VIRTIO_NET_F_CSUM not set, and no GSO features set.
> If negotiated, we wouldn't need to prepend a header.

If we could skip the Tx header, I think we could also skip the Rx header in
the case where mrg-rx is also turned off?

> From the micro-benchmarks results, we can expect +10% compared to
> indirect descriptors, and + 5% compared to using 2 descs in the
> virtqueue.
> Also, it should have the same benefits as indirect descriptors for 0%
> pkt loss (as we can fill 2x more packets in the virtqueue).
> 
> What do you think?

I would vote for this. It should yield maximum performance for the case
that it's guaranteed that packet size will always fit in a typical MTU
(1500).

--yliu



Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Michael S. Tsirkin
On Mon, Oct 10, 2016 at 11:37:44AM +0800, Yuanhan Liu wrote:
> On Thu, Sep 29, 2016 at 11:21:48PM +0300, Michael S. Tsirkin wrote:
> > On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
> > > 
> > > 
> > > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
> > Yes but two points.
> > 
> > 1. why is this memset expensive?
> 
> I don't have the exact answer, but just some rough thoughts:
> 
> It's an external clib function: there is a call stack and the
> IP register will bounce back and forth.

for memset 0?  gcc 5.3.1 on fedora happily inlines it.

> BTW, it's kind of
> overkill to use that for resetting a 14-byte structure.
> 
> Some trick like
> *(struct virtio_net_hdr *)hdr = (struct virtio_net_hdr){0};
> 
> Or even 
> hdr->xxx = 0;
> hdr->yyy = 0;
> 
> should behave better.
> 
> There was an example: the vhost enqueue optimization patchset from
> Zhihong [0] uses memset, and it introduces more than a 15% drop (IIRC)
> on my Ivybridge server; it has no such issue on his server though.
> 
> [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> 
>   --yliu

I'd say that's weird. what's your config? any chance you
are using an old compiler?


> > Is the test completely skipping looking
> >at the packet otherwise?
> > 
> > 2. As long as we are doing this, see
> > Alignment vs. Networking
> > 
> > in Documentation/unaligned-memory-access.txt
> > 
> > 
> > > From the micro-benchmarks results, we can expect +10% compared to
> > > indirect descriptors, and + 5% compared to using 2 descs in the
> > > virtqueue.
> > > Also, it should have the same benefits as indirect descriptors for 0%
> > > pkt loss (as we can fill 2x more packets in the virtqueue).
> > > 
> > > What do you think?
> > > 
> > > Thanks,
> > > Maxime



Re: [Qemu-devel] [PATCH] colo-compare: fix find_and_check_chardev()

2016-10-09 Thread Hailiang Zhang

Hi,

On 2016/10/10 10:52, Zhang Chen wrote:



On 09/30/2016 12:06 PM, zhanghailiang wrote:

find_and_check_chardev() uses the 'opts' member of CharDriverState to
check whether the chardev is a 'socket' chardev, but opts
will be NULL if we add the chardev with the QMP 'chardev-add' command.

All the related info can be found in the 'filename' member of CharDriverState.
For a tcp socket device, it will be like 'disconnected:tcp:9.61.1.8:9004,server'
or 'tcp:9.61.1.8:9001,server <-> 9.61.1.8:50256', so we can simply check it to
identify whether it is a tcp socket char device.

Besides, fix this helper function to return -1 when an error happens.

Signed-off-by: zhanghailiang 


This patch looks fine to me.



Sorry, I found there are still some problems with this modification.
For some local connections between filter objects, I think we can use a unix socket
instead of a tcp socket. (Or even another char device, for example file or pipe, but
to keep things simple, let's limit it to sockets for now.)

So the check below is insufficient. It should be

+if (!strstr((*chr)->filename, "tcp:") && !strstr((*chr)->filename, "unix:")) {
 error_setg(errp, "chardev \"%s\" is not a tcp socket, filename '%s'",
chr_name, (*chr)->filename);

If you and Jason agree with this, I will send V2.
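
For illustration, the proposed check can be tried in isolation with a small
standalone helper. This is only a sketch: sketch_filename_is_socket() is a
hypothetical name, not a QEMU function, and it assumes the filename strings
look like the examples quoted in the commit message.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not part of QEMU: returns true if a chardev
 * 'filename' string mentions a tcp or unix socket backend.
 */
static bool sketch_filename_is_socket(const char *filename)
{
    if (filename == NULL) {
        return false;
    }
    /* strstr() matches anywhere, so "disconnected:tcp:..." is accepted too */
    return strstr(filename, "tcp:") != NULL ||
           strstr(filename, "unix:") != NULL;
}
```

Because this is a plain substring match, any filename that happens to contain
"tcp:" or "unix:" elsewhere (for example inside a file path) would also pass.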

Thanks,
Hailiang


Reviewed-by: Zhang Chen 

Thanks
Zhang Chen


---
   net/colo-compare.c | 54 
--
   1 file changed, 8 insertions(+), 46 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 22b1da1..6693258 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -92,10 +92,6 @@ typedef struct CompareClass {
   ObjectClass parent_class;
   } CompareClass;

-typedef struct CompareChardevProps {
-bool is_socket;
-} CompareChardevProps;
-
   enum {
   PRIMARY_IN = 0,
   SECONDARY_IN,
@@ -564,56 +560,22 @@ static void compare_sec_rs_finalize(SocketReadState *sec_rs)
   }
   }

-static int compare_chardev_opts(void *opaque,
-const char *name, const char *value,
-Error **errp)
-{
-CompareChardevProps *props = opaque;
-
-if (strcmp(name, "backend") == 0 &&
-strcmp(value, "socket") == 0) {
-props->is_socket = true;
-return 0;
-} else if (strcmp(name, "host") == 0 ||
-  (strcmp(name, "port") == 0) ||
-  (strcmp(name, "server") == 0) ||
-  (strcmp(name, "wait") == 0) ||
-  (strcmp(name, "path") == 0)) {
-return 0;
-} else {
-error_setg(errp,
-   "COLO-compare does not support a chardev with option %s=%s",
-   name, value);
-return -1;
-}
-}
-
-/*
- * Return 0 is success.
- * Return 1 is failed.
- */
   static int find_and_check_chardev(CharDriverState **chr,
 char *chr_name,
 Error **errp)
   {
-CompareChardevProps props;
-
   *chr = qemu_chr_find(chr_name);
   if (*chr == NULL) {
   error_setg(errp, "Device '%s' not found",
  chr_name);
-return 1;
+return -1;
   }

-memset(&props, 0, sizeof(props));
-if (qemu_opt_foreach((*chr)->opts, compare_chardev_opts, &props, errp)) {
-return 1;
-}
+if (!strstr((*chr)->filename, "tcp")) {
+error_setg(errp, "chardev \"%s\" is not a tcp socket, filename '%s'",
+   chr_name, (*chr)->filename);
+return -1;

-if (!props.is_socket) {
-error_setg(errp, "chardev \"%s\" is not a tcp socket",
-   chr_name);
-return 1;
   }
   return 0;
   }
@@ -660,15 +622,15 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
   return;
   }

-if (find_and_check_chardev(&s->chr_pri_in, s->pri_indev, errp)) {
+if (find_and_check_chardev(&s->chr_pri_in, s->pri_indev, errp) < 0) {
   return;
   }

-if (find_and_check_chardev(&s->chr_sec_in, s->sec_indev, errp)) {
+if (find_and_check_chardev(&s->chr_sec_in, s->sec_indev, errp) < 0) {
   return;
   }

-if (find_and_check_chardev(&s->chr_out, s->outdev, errp)) {
+if (find_and_check_chardev(&s->chr_out, s->outdev, errp) < 0) {
   return;
   }








Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Yuanhan Liu
On Mon, Oct 10, 2016 at 06:46:44AM +0300, Michael S. Tsirkin wrote:
> On Mon, Oct 10, 2016 at 11:37:44AM +0800, Yuanhan Liu wrote:
> > On Thu, Sep 29, 2016 at 11:21:48PM +0300, Michael S. Tsirkin wrote:
> > > On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
> > > > 
> > > > 
> > > > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
> > > Yes but two points.
> > > 
> > > 1. why is this memset expensive?
> > 
> > I don't have the exact answer, but just some rough thoughts:
> > 
> > It's an external clib function: there is a call stack and the
> > IP register will bounce back and forth.
> 
> for memset 0?  gcc 5.3.1 on fedora happily inlines it.

Good to know!

> > overkill to use that for resetting a 14-byte structure.
> > 
> > Some trick like
> > *(struct virtio_net_hdr *)hdr = (struct virtio_net_hdr){0};
> > 
> > Or even 
> > hdr->xxx = 0;
> > hdr->yyy = 0;
> > 
> > should behave better.
> > 
> > There was an example: the vhost enqueue optimization patchset from
> > Zhihong [0] uses memset, and it introduces more than a 15% drop (IIRC)
> > on my Ivybridge server; it has no such issue on his server though.
> > 
> > [0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html
> > 
> > --yliu
> 
> I'd say that's weird. what's your config? any chance you
> are using an old compiler?

Not really, it's gcc 5.3.1. Maybe Zhihong could explain more. IIRC,
he said the memset is not well optimized for Ivybridge server.

--yliu



[Qemu-devel] [PATCH v12 1/2] virtio-crypto: Add virtio crypto device specification

2016-10-09 Thread Gonglei
The virtio crypto device is a virtual crypto device (i.e. a hardware
crypto accelerator card). Currently, the virtio crypto device provides
the following crypto services: CIPHER, MAC, HASH, and AEAD.

In this patch, CIPHER, MAC, HASH, AEAD services are introduced.

VIRTIO-153

Signed-off-by: Gonglei 
CC: Michael S. Tsirkin 
CC: Cornelia Huck 
CC: Stefan Hajnoczi 
CC: Lingli Deng 
CC: Jani Kokkonen 
CC: Ola Liljedahl 
CC: Varun Sethi 
CC: Zeng Xin 
CC: Keating Brian 
CC: Ma Liang J 
CC: Griffin John 
CC: Hanweidong 
CC: Mihai Claudiu Caraman 
---
 content.tex   |   2 +
 virtio-crypto.tex | 999 ++
 2 files changed, 1001 insertions(+)
 create mode 100644 virtio-crypto.tex

diff --git a/content.tex b/content.tex
index 4b45678..ab75f78 100644
--- a/content.tex
+++ b/content.tex
@@ -5750,6 +5750,8 @@ descriptor for the \field{sense_len}, \field{residual},
 \field{status_qualifier}, \field{status}, \field{response} and
 \field{sense} fields.
 
+\input{virtio-crypto.tex}
+
 \chapter{Reserved Feature Bits}\label{sec:Reserved Feature Bits}
 
 Currently there are three device-independent feature bits defined:
diff --git a/virtio-crypto.tex b/virtio-crypto.tex
new file mode 100644
index 0000000..86e4869
--- /dev/null
+++ b/virtio-crypto.tex
@@ -0,0 +1,999 @@
+\section{Crypto Device}\label{sec:Device Types / Crypto Device}
+
+The virtio crypto device is a virtual cryptography device as well as a kind of
+virtual hardware accelerator for virtual machines. The encryption and
+decryption requests are placed in the data queue and are ultimately handled by the
+backend crypto accelerators. The second queue is the control queue used to create
+or destroy sessions for symmetric algorithms and will control some advanced
+features in the future. The virtio crypto device provides the following crypto
+services: CIPHER, MAC, HASH, and AEAD.
+
+
+\subsection{Device ID}\label{sec:Device Types / Crypto Device / Device ID}
+
+20
+
+\subsection{Virtqueues}\label{sec:Device Types / Crypto Device / Virtqueues}
+
+\begin{description}
+\item[0] dataq1
+\item[\ldots]
+\item[N-1] dataqN
+\item[N] controlq
+\end{description}
+
+N is set by \field{max_dataqueues}.
+
+\subsection{Feature bits}\label{sec:Device Types / Crypto Device / Feature bits}
+
+Undefined currently.
+
+\subsection{Device configuration layout}\label{sec:Device Types / Crypto Device / Device configuration layout}
+
+The following driver-read-only configuration fields are defined:
+
+\begin{lstlisting}
+struct virtio_crypto_config {
+le32 status;
+le32 max_dataqueues;
+le32 crypto_services;
+/* detailed algorithms mask */
+le32 cipher_algo_l;
+le32 cipher_algo_h;
+le32 hash_algo;
+le32 mac_algo_l;
+le32 mac_algo_h;
+le32 aead_algo;
+le32 reserve;
+};
+\end{lstlisting}
+
+The value of the \field{status} field is VIRTIO_CRYPTO_S_HW_READY or VIRTIO_CRYPTO_S_STARTED.
+
+\begin{lstlisting}
+#define VIRTIO_CRYPTO_S_HW_READY  (1 << 0)
+#define VIRTIO_CRYPTO_S_STARTED  (1 << 1)
+\end{lstlisting}
+
+The following driver-read-only fields include \field{max_dataqueues}, which specifies the
+maximum number of data virtqueues (dataq1\ldots dataqN), and \field{crypto_services},
+which indicates the crypto services the virtio crypto supports.
+
+The following services are defined:
+
+\begin{lstlisting}
+/* CIPHER service */
+#define VIRTIO_CRYPTO_SERVICE_CIPHER (0)
+/* HASH service */
+#define VIRTIO_CRYPTO_SERVICE_HASH   (1)
+/* MAC (Message Authentication Codes) service */
+#define VIRTIO_CRYPTO_SERVICE_MAC    (2)
+/* AEAD (Authenticated Encryption with Associated Data) service */
+#define VIRTIO_CRYPTO_SERVICE_AEAD   (3)
+\end{lstlisting}
+
+The last driver-read-only fields specify detailed algorithms masks 
+the device offers for corresponding services. The following CIPHER algorithms
+are defined:
+
+\begin{lstlisting}
+#define VIRTIO_CRYPTO_NO_CIPHER             0
+#define VIRTIO_CRYPTO_CIPHER_ARC4           1
+#define VIRTIO_CRYPTO_CIPHER_AES_ECB        2
+#define VIRTIO_CRYPTO_CIPHER_AES_CBC        3
+#define VIRTIO_CRYPTO_CIPHER_AES_CTR        4
+#define VIRTIO_CRYPTO_CIPHER_DES_ECB        5
+#define VIRTIO_CRYPTO_CIPHER_DES_CBC        6
+#define VIRTIO_CRYPTO_CIPHER_3DES_ECB       7
+#define VIRTIO_CRYPTO_CIPHER_3DES_CBC       8
+#define VIRTIO_CRYPTO_CIPHER_3DES_CTR       9
+#define VIRTIO_CRYPTO_CIPHER_KASUMI_F8      10
+#define VIRTIO_CRYPTO_CIPHER_SNOW3G_UEA2    11
+#define VIRTIO_CRYPTO_CIPHER_AES_F8         12
+#define VIRTIO_CRYPTO_CIPHER_AES_XTS        13
+#define 

Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Yuanhan Liu
On Mon, Oct 10, 2016 at 06:04:32AM +0300, Michael S. Tsirkin wrote:
> > > So I guess at this point, we can teach vhost-user in qemu
> > > that version 1 implies any_layout but only for machine types
> > > qemu 2.8 and up. It sets a bad precedent but oh well.
> > 
> > It should work.
> > 
> > --yliu
> 
> Cool. Want to post a patch?

I could give it a try this week (though it's very unlikely).
If not, it will be postponed for a while: I am traveling next week.

--yliu



[Qemu-devel] [Qemu-block][PATCH] qemu-img: fix failed qemu-img command return zero exit code defeat

2016-10-09 Thread Xu Tian
If the backing file cannot be opened during an image rebase, the 'ret' flag
is not assigned a non-zero value, so the qemu-img process exits with code zero.
Assign '-1' to 'ret' after reporting the error message to fix
this defect.
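
The error-path pattern in question can be sketched in isolation as below.
This is a simplified, hypothetical stand-in (sketch_rebase() and
sketch_exit_code() are not qemu-img functions); it only shows why 'ret' must
be set before jumping to the common exit label.

```c
#include <stdio.h>

/* Hypothetical stand-in for the rebase path; not qemu-img code. */
static int sketch_rebase(const char *backing_name)
{
    int ret = 0;
    FILE *backing = fopen(backing_name, "rb");

    if (backing == NULL) {
        fprintf(stderr, "Could not open old backing file '%s'\n",
                backing_name);
        ret = -1;           /* without this, 'ret' would still be 0 at 'out' */
        goto out;
    }

    /* ... the actual rebase work would go here ... */

out:
    if (backing != NULL) {
        fclose(backing);
    }
    return ret;
}

/* Map the internal result to a process exit code: nonzero means failure. */
static int sketch_exit_code(int ret)
{
    return ret ? 1 : 0;
}
```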

BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1383012

Signed-off-by: Xu Tian 
---
 qemu-img.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/qemu-img.c b/qemu-img.c
index 46f2a6d..37dcade 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -2918,6 +2918,7 @@ static int img_rebase(int argc, char **argv)
 error_reportf_err(local_err,
   "Could not open old backing file '%s': ",
   backing_name);
+ret = -1;
 goto out;
 }
 
@@ -2935,6 +2936,7 @@ static int img_rebase(int argc, char **argv)
 error_reportf_err(local_err,
   "Could not open new backing file '%s': ",
   out_baseimg);
+ret = -1;
 goto out;
 }
 }
-- 
2.5.5




[Qemu-devel] [PULL 12/33] virtio-blk: make some functions static

2016-10-09 Thread Michael S. Tsirkin
From: Greg Kurz 

Some functions that were called from the dataplane code are now only used
locally:

virtio_blk_init_request()
virtio_blk_handle_request()
virtio_blk_submit_multireq()

since commit "03de2f527499 virtio-blk: do not use vring in dataplane", and

virtio_blk_free_request()

since commit "6aa46d8ff1ee virtio: move VirtQueueElement at the beginning
of the structs".

This patch converts them to static.

Signed-off-by: Greg Kurz 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio-blk.h |  8 
 hw/block/virtio-blk.c  | 10 +-
 2 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/include/hw/virtio/virtio-blk.h b/include/hw/virtio/virtio-blk.h
index 180bd8d..9734b4c 100644
--- a/include/hw/virtio/virtio-blk.h
+++ b/include/hw/virtio/virtio-blk.h
@@ -80,14 +80,6 @@ typedef struct MultiReqBuffer {
 bool is_write;
 } MultiReqBuffer;
 
-void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
- VirtIOBlockReq *req);
-void virtio_blk_free_request(VirtIOBlockReq *req);
-
-void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb);
-
-void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb);
-
 void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq);
 
 #endif
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c7ca4d6..bbacd56 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -29,8 +29,8 @@
 #include "hw/virtio/virtio-bus.h"
 #include "hw/virtio/virtio-access.h"
 
-void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
- VirtIOBlockReq *req)
+static void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
+VirtIOBlockReq *req)
 {
 req->dev = s;
 req->vq = vq;
@@ -40,7 +40,7 @@ void virtio_blk_init_request(VirtIOBlock *s, VirtQueue *vq,
 req->mr_next = NULL;
 }
 
-void virtio_blk_free_request(VirtIOBlockReq *req)
+static void virtio_blk_free_request(VirtIOBlockReq *req)
 {
 if (req) {
 g_free(req);
@@ -381,7 +381,7 @@ static int multireq_compare(const void *a, const void *b)
 }
 }
 
-void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
+static void virtio_blk_submit_multireq(BlockBackend *blk, MultiReqBuffer *mrb)
 {
 int i = 0, start = 0, num_reqs = 0, niov = 0, nb_sectors = 0;
 uint32_t max_transfer;
@@ -468,7 +468,7 @@ static bool virtio_blk_sect_range_ok(VirtIOBlock *dev,
 return true;
 }
 
-void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
+static void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
 {
 uint32_t type;
 struct iovec *in_iov = req->elem.in_sg;
-- 
MST




Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Michael S. Tsirkin
On Mon, Oct 10, 2016 at 11:03:33AM +0800, Yuanhan Liu wrote:
> On Mon, Oct 10, 2016 at 02:20:22AM +0300, Michael S. Tsirkin wrote:
> > On Wed, Sep 28, 2016 at 10:28:48AM +0800, Yuanhan Liu wrote:
> > > On Tue, Sep 27, 2016 at 10:56:40PM +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Sep 27, 2016 at 11:11:58AM +0800, Yuanhan Liu wrote:
> > > > > On Mon, Sep 26, 2016 at 10:24:55PM +0300, Michael S. Tsirkin wrote:
> > > > > > On Mon, Sep 26, 2016 at 11:01:58AM -0700, Stephen Hemminger wrote:
> > > > > > > I assume that if using Version 1 that the bit will be ignored
> > > > > 
> > > > > Yes, but I will just quote what you just said: what if the guest
> > > > > virtio device is a legacy device? I also gave my reasons in another
> > > > > email why I consistently set this flag:
> > > > > 
> > > > >   - we have to return all features we support to the guest.
> > > > >   
> > > > > We don't know the guest is a modern or legacy device. That means
> > > > > we should claim we support both: VERSION_1 and ANY_LAYOUT.
> > > > >   
> > > > > Assume guest is a legacy device and we just set VERSION_1 (the current
> > > > > case), ANY_LAYOUT will never be negotiated.
> > > > >   
> > > > >   - I'm following the way Linux kernel takes: it also set both features.
> > > > >   
> > > > >   Maybe, we could unset ANY_LAYOUT when VERSION_1 is _negotiated_?
> > > > > 
> > > > > The unset after negotiation I proposed turned out it won't work: the
> > > > > feature is already negotiated; unsetting it only in vhost side doesn't
> > > > > change anything. Besides, it may break the migration as Michael stated
> > > > > below.
> > > > 
> > > > I think the reverse. Teach vhost user that for future machine types
> > > > only VERSION_1 implies ANY_LAYOUT.
> > 
> > So I guess at this point, we can teach vhost-user in qemu
> > that version 1 implies any_layout but only for machine types
> > qemu 2.8 and up. It sets a bad precedent but oh well.
> 
> It should work.
> 
>   --yliu

Cool. Want to post a patch?

-- 
MST



Re: [Qemu-devel] [PATCH] colo-compare: fix find_and_check_chardev()

2016-10-09 Thread Zhang Chen



On 10/10/2016 11:13 AM, Hailiang Zhang wrote:

Hi,

On 2016/10/10 10:52, Zhang Chen wrote:



On 09/30/2016 12:06 PM, zhanghailiang wrote:

find_and_check_chardev() uses the 'opts' member of CharDriverState to
check whether the chardev is a 'socket' chardev, but opts
will be NULL if we add the chardev with the QMP 'chardev-add' command.

All the related info can be found in the 'filename' member of CharDriverState.
For a tcp socket device, it will be like 'disconnected:tcp:9.61.1.8:9004,server'
or 'tcp:9.61.1.8:9001,server <-> 9.61.1.8:50256', so we can simply check it to
identify whether it is a tcp socket char device.

Besides, fix this helper function to return -1 when an error happens.


Signed-off-by: zhanghailiang 


This patch looks fine to me.



Sorry, I found there are still some problems with this modification.
For some local connections between filter objects, I think we can use a unix socket
instead of a tcp socket. (Or even another char device, for example file or pipe, but
to keep things simple, let's limit it to sockets for now.)

So the check below is insufficient. It should be

+if (!strstr((*chr)->filename, "tcp:") && !strstr((*chr)->filename, "unix:")) {
 error_setg(errp, "chardev \"%s\" is not a tcp socket, filename '%s'",
chr_name, (*chr)->filename);

If you and Jason agree with this, I will send V2.



I found that part of the code in this patch is the same as in another patch:

net: don't poke at chardev internal QemuOpts

I think you can fix and rebase your patch,
then we need jason's comments for this.


Thanks
Zhang Chen


Thanks,
Hailiang


Reviewed-by: Zhang Chen 

Thanks
Zhang Chen


---
   net/colo-compare.c | 54 
--

   1 file changed, 8 insertions(+), 46 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 22b1da1..6693258 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -92,10 +92,6 @@ typedef struct CompareClass {
   ObjectClass parent_class;
   } CompareClass;

-typedef struct CompareChardevProps {
-bool is_socket;
-} CompareChardevProps;
-
   enum {
   PRIMARY_IN = 0,
   SECONDARY_IN,
@@ -564,56 +560,22 @@ static void compare_sec_rs_finalize(SocketReadState *sec_rs)

   }
   }

-static int compare_chardev_opts(void *opaque,
-const char *name, const char *value,
-Error **errp)
-{
-CompareChardevProps *props = opaque;
-
-if (strcmp(name, "backend") == 0 &&
-strcmp(value, "socket") == 0) {
-props->is_socket = true;
-return 0;
-} else if (strcmp(name, "host") == 0 ||
-  (strcmp(name, "port") == 0) ||
-  (strcmp(name, "server") == 0) ||
-  (strcmp(name, "wait") == 0) ||
-  (strcmp(name, "path") == 0)) {
-return 0;
-} else {
-error_setg(errp,
-   "COLO-compare does not support a chardev with 
option %s=%s",

-   name, value);
-return -1;
-}
-}
-
-/*
- * Return 0 is success.
- * Return 1 is failed.
- */
   static int find_and_check_chardev(CharDriverState **chr,
 char *chr_name,
 Error **errp)
   {
-CompareChardevProps props;
-
   *chr = qemu_chr_find(chr_name);
   if (*chr == NULL) {
   error_setg(errp, "Device '%s' not found",
  chr_name);
-return 1;
+return -1;
   }

-memset(&props, 0, sizeof(props));
-if (qemu_opt_foreach((*chr)->opts, compare_chardev_opts, &props, errp)) {
-return 1;
-}
+if (!strstr((*chr)->filename, "tcp")) {
+error_setg(errp, "chardev \"%s\" is not a tcp socket, filename '%s'",
+   chr_name, (*chr)->filename);
+return -1;

-if (!props.is_socket) {
-error_setg(errp, "chardev \"%s\" is not a tcp socket",
-   chr_name);
-return 1;
   }
   return 0;
   }
@@ -660,15 +622,15 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
   return;
   }

-if (find_and_check_chardev(&s->chr_pri_in, s->pri_indev, errp)) {
+if (find_and_check_chardev(&s->chr_pri_in, s->pri_indev, errp) < 0) {
   return;
   }

-if (find_and_check_chardev(&s->chr_sec_in, s->sec_indev, errp)) {
+if (find_and_check_chardev(&s->chr_sec_in, s->sec_indev, errp) < 0) {
   return;
   }

-if (find_and_check_chardev(&s->chr_out, s->outdev, errp)) {
+if (find_and_check_chardev(&s->chr_out, s->outdev, errp) < 0) {
   return;
   }







.



--
Thanks
zhangchen






[Qemu-devel] [PATCH v12 2/2] virtio-crypto: Add conformance clauses

2016-10-09 Thread Gonglei
Add the conformance targets and clauses for
virtio-crypto device.

Signed-off-by: Gonglei 
---
 conformance.tex | 30 ++
 1 file changed, 30 insertions(+)

diff --git a/conformance.tex b/conformance.tex
index f59e360..3bde4b6 100644
--- a/conformance.tex
+++ b/conformance.tex
@@ -146,6 +146,21 @@ An SCSI host driver MUST conform to the following normative statements:
 \item \ref{drivernormative:Device Types / SCSI Host Device / Device Operation / Device Operation: eventq}
 \end{itemize}
 
+\subsection{Crypto Driver Conformance}\label{sec:Conformance / Driver Conformance / Crypto Driver Conformance}
+
+A Crypto driver MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{drivernormative:Device Types / Crypto Device / Device configuration layout}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Initialization}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / Control Virtqueue / Session operation / Session operation: create session}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / Control Virtqueue / Session operation / Session operation: destroy session}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / HASH Service operation}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / MAC Service operation}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / Symmetric algorithms Operation}
+\item \ref{drivernormative:Device Types / Crypto Device / Device Operation / AEAD Service operation}
+\end{itemize}
+
 \section{Device Conformance}\label{sec:Conformance / Device Conformance}
 
 A device MUST conform to the following normative statements:
@@ -267,6 +282,21 @@ An SCSI host device MUST conform to the following normative statements:
 \item \ref{devicenormative:Device Types / SCSI Host Device / Device Operation / Device Operation: eventq}
 \end{itemize}
 
+\subsection{Crypto Device Conformance}\label{sec:Conformance / Device Conformance / Crypto Device Conformance}
+
+A Crypto device MUST conform to the following normative statements:
+
+\begin{itemize}
+\item \ref{devicenormative:Device Types / Crypto Device / Device configuration layout}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Initialization}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / Control Virtqueue / Session operation / Session operation: create session}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / Control Virtqueue / Session operation / Session operation: destroy session}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / HASH Service operation}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / MAC Service operation}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / Symmetric algorithms Operation}
+\item \ref{devicenormative:Device Types / Crypto Device / Device Operation / AEAD Service operation}
+\end{itemize}
+
 \section{Legacy Interface: Transitional Device and
 Transitional Driver Conformance}\label{sec:Conformance / Legacy
 Interface: Transitional Device and 
-- 
1.7.12.4





[Qemu-devel] [PULL 21/33] virtio: prepare change VMSTATE_VIRTIO_DEVICE macro

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

In most cases the functions passed to VMSTATE_VIRTIO_DEVICE
only call the virtio_load and virtio_save wrappers. Some include some
pre- and post- massaging too. The massaging is better expressed
as such in the VMStateDescription.

Let us prepare for changing the semantic of the VMSTATE_VIRTIO_DEVICE
macro so that it is more similar to the other VMSTATE_*_DEVICE macros
in a sense that it is a field definition.

The preprocessor conditionals are going to be removed as soon as
every usage is converted to the new semantic.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio.h | 16 
 hw/virtio/virtio.c | 21 +
 2 files changed, 37 insertions(+)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index e25ec4f..929fa92 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -179,6 +179,20 @@ void virtio_notify(VirtIODevice *vdev, VirtQueue *vq);
 void virtio_save(VirtIODevice *vdev, QEMUFile *f);
 void virtio_vmstate_save(QEMUFile *f, void *opaque, size_t size);
 
+extern const VMStateInfo virtio_vmstate_info;
+
+#ifdef VMSTATE_VIRTIO_DEVICE_USE_NEW
+
+#define VMSTATE_VIRTIO_DEVICE \
+{ \
+.name = "virtio", \
+.info = &virtio_vmstate_info, \
+.flags = VMS_SINGLE,  \
+}
+
+#else
+/* TODO remove conditional as soon as all users are converted */
+
 #define VMSTATE_VIRTIO_DEVICE(devname, v, getf, putf) \
 static const VMStateDescription vmstate_virtio_ ## devname = { \
 .name = "virtio-" #devname ,  \
@@ -198,6 +212,8 @@ void virtio_vmstate_save(QEMUFile *f, void *opaque, size_t size);
 } \
 }
 
+#endif
+
 int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id);
 
 void virtio_notify_config(VirtIODevice *vdev);
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 46f79c9..62b9c00 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -1645,6 +1645,27 @@ void virtio_vmstate_save(QEMUFile *f, void *opaque, size_t size)
 virtio_save(VIRTIO_DEVICE(opaque), f);
 }
 
+/* A wrapper for use as a VMState .put function */
+static void virtio_device_put(QEMUFile *f, void *opaque, size_t size)
+{
+virtio_save(VIRTIO_DEVICE(opaque), f);
+}
+
+/* A wrapper for use as a VMState .get function */
+static int virtio_device_get(QEMUFile *f, void *opaque, size_t size)
+{
+VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
+DeviceClass *dc = DEVICE_CLASS(VIRTIO_DEVICE_GET_CLASS(vdev));
+
+return virtio_load(vdev, f, dc->vmsd->version_id);
+}
+
+const VMStateInfo  virtio_vmstate_info = {
+.name = "virtio",
+.get = virtio_device_get,
+.put = virtio_device_put,
+};
+
 static int virtio_set_features_nocheck(VirtIODevice *vdev, uint64_t val)
 {
 VirtioDeviceClass *k = VIRTIO_DEVICE_GET_CLASS(vdev);
-- 
MST




Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Yuanhan Liu
On Mon, Oct 10, 2016 at 02:20:22AM +0300, Michael S. Tsirkin wrote:
> On Wed, Sep 28, 2016 at 10:28:48AM +0800, Yuanhan Liu wrote:
> > On Tue, Sep 27, 2016 at 10:56:40PM +0300, Michael S. Tsirkin wrote:
> > > On Tue, Sep 27, 2016 at 11:11:58AM +0800, Yuanhan Liu wrote:
> > > > On Mon, Sep 26, 2016 at 10:24:55PM +0300, Michael S. Tsirkin wrote:
> > > > > On Mon, Sep 26, 2016 at 11:01:58AM -0700, Stephen Hemminger wrote:
> > > > > > I assume that if using Version 1 that the bit will be ignored
> > > > 
> > > > Yes, but I will just quote what you just said: what if the guest
> > > > virtio device is a legacy device? I also gave my reasons in another
> > > > email why I consistently set this flag:
> > > > 
> > > >   - we have to return all features we support to the guest.
> > > >   
> > > > We don't know the guest is a modern or legacy device. That means
> > > > we should claim we support both: VERSION_1 and ANY_LAYOUT.
> > > >   
> > > > Assume guest is a legacy device and we just set VERSION_1 (the current
> > > > case), ANY_LAYOUT will never be negotiated.
> > > >   
> > > >   - I'm following the way Linux kernel takes: it also set both features.
> > > >   
> > > >   Maybe, we could unset ANY_LAYOUT when VERSION_1 is _negotiated_?
> > > > 
> > > > The unset after negotiation I proposed turned out it won't work: the
> > > > feature is already negotiated; unsetting it only in vhost side doesn't
> > > > change anything. Besides, it may break the migration as Michael stated
> > > > below.
> > > 
> > > I think the reverse. Teach vhost user that for future machine types
> > > only VERSION_1 implies ANY_LAYOUT.
> 
> So I guess at this point, we can teach vhost-user in qemu
> that version 1 implies any_layout but only for machine types
> qemu 2.8 and up. It sets a bad precedent but oh well.

It should work.

--yliu
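
A rough sketch of the rule agreed on above, for illustration only. The bit
numbers are the ones defined by the virtio specification
(VIRTIO_F_ANY_LAYOUT = 27, VIRTIO_F_VERSION_1 = 32); the function names and
the new_machine_type flag are hypothetical stand-ins for "machine types qemu
2.8 and up", not actual QEMU or vhost-user code.

```c
#include <stdbool.h>
#include <stdint.h>

#define VIRTIO_F_ANY_LAYOUT 27
#define VIRTIO_F_VERSION_1  32

/* Test a single feature bit in a 64-bit feature word. */
static bool sketch_has_feature(uint64_t features, unsigned bit)
{
    return (features >> bit) & 1;
}

/*
 * Hypothetical rule: ANY_LAYOUT is in effect if it was negotiated
 * explicitly, or if VERSION_1 was negotiated on a new machine type.
 */
static bool sketch_any_layout_in_effect(uint64_t negotiated,
                                        bool new_machine_type)
{
    if (sketch_has_feature(negotiated, VIRTIO_F_ANY_LAYOUT)) {
        return true;
    }
    return new_machine_type &&
           sketch_has_feature(negotiated, VIRTIO_F_VERSION_1);
}
```

Older machine types keep the previous behaviour in this sketch, which is the
point of gating on the machine type: migration between old configurations is
not affected.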



[Qemu-devel] [PULL 29/33] virtio-balloon: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-balloon.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index eb572e6..2c68d3d 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -13,6 +13,8 @@
  *
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include "qemu/osdep.h"
 #include "qemu/iov.h"
 #include "qemu/timer.h"
@@ -402,11 +404,6 @@ static void virtio_balloon_save_device(VirtIODevice *vdev, QEMUFile *f)
 qemu_put_be32(f, s->actual);
 }
 
-static int virtio_balloon_load(QEMUFile *f, void *opaque, size_t size)
-{
-return virtio_load(VIRTIO_DEVICE(opaque), f, 1);
-}
-
 static int virtio_balloon_load_device(VirtIODevice *vdev, QEMUFile *f,
   int version_id)
 {
@@ -492,7 +489,15 @@ static void virtio_balloon_instance_init(Object *obj)
 NULL, s, NULL);
 }
 
-VMSTATE_VIRTIO_DEVICE(balloon, 1, virtio_balloon_load, virtio_vmstate_save);
+static const VMStateDescription vmstate_virtio_balloon = {
+.name = "virtio-balloon",
+.minimum_version_id = 1,
+.version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE,
+VMSTATE_END_OF_LIST()
+},
+};
 
 static Property virtio_balloon_properties[] = {
 DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features,
-- 
MST




Re: [Qemu-devel] [PATCH 1/2] vhost: enable any layout feature

2016-10-09 Thread Yuanhan Liu
On Thu, Sep 29, 2016 at 11:21:48PM +0300, Michael S. Tsirkin wrote:
> On Thu, Sep 29, 2016 at 10:05:22PM +0200, Maxime Coquelin wrote:
> > 
> > 
> > On 09/29/2016 07:57 PM, Michael S. Tsirkin wrote:
> Yes but two points.
> 
> 1. why is this memset expensive?

I don't have the exact answer, but just some rough thoughts:

It's an external C library function: there is call overhead and the
IP register will bounce back and forth. BTW, it's kind of
overkill to use that for resetting a 14-byte structure.

Some trick like
*(struct virtio_net_hdr *)hdr = (struct virtio_net_hdr){0, };

Or even 
hdr->xxx = 0;
hdr->yyy = 0;

should behave better.
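
To make the three options concrete, here is a small sketch. The `struct net_hdr` below is a simplified stand-in with illustrative field names, not the real virtio-net header, and the function names are made up for this example. Whether any variant is actually faster depends on the compiler and flags, since compilers often expand a fixed-size memset inline.

```c
#include <stdint.h>
#include <string.h>

/* Simplified stand-in for a small packet header (names are illustrative). */
struct net_hdr {
    uint8_t  flags;
    uint8_t  gso_type;
    uint16_t hdr_len;
    uint16_t gso_size;
    uint16_t csum_start;
    uint16_t csum_offset;
    uint16_t num_buffers;
};

/* Variant 1: library call. */
static void clear_memset(struct net_hdr *hdr)
{
    memset(hdr, 0, sizeof(*hdr));
}

/* Variant 2: assign a zero-initialized compound literal (C99). */
static void clear_literal(struct net_hdr *hdr)
{
    *hdr = (struct net_hdr){0};
}

/* Variant 3: reset each field explicitly. */
static void clear_fields(struct net_hdr *hdr)
{
    hdr->flags = 0;
    hdr->gso_type = 0;
    hdr->hdr_len = 0;
    hdr->gso_size = 0;
    hdr->csum_start = 0;
    hdr->csum_offset = 0;
    hdr->num_buffers = 0;
}
```

Comparing the generated assembly of the three functions at the optimization level in use is the quickest way to see whether a real call to memset survives.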

There was an example: the vhost enqueue optimization patchset from
Zhihong [0] uses memset, and it introduces a drop of more than 15% (IIRC)
on my Ivy Bridge server; it has no such issue on his server, though.

[0]: http://dpdk.org/ml/archives/dev/2016-August/045272.html

--yliu

> Is the test completely skipping looking
>at the packet otherwise?
> 
> 2. As long as we are doing this, see
>   Alignment vs. Networking
>   
> in Documentation/unaligned-memory-access.txt
> 
> 
> > From the micro-benchmarks results, we can expect +10% compared to
> > indirect descriptors, and + 5% compared to using 2 descs in the
> > virtqueue.
> > Also, it should have the same benefits as indirect descriptors for 0%
> > pkt loss (as we can fill 2x more packets in the virtqueue).
> > 
> > What do you think?
> > 
> > Thanks,
> > Maxime



[Qemu-devel] [PULL 32/33] virtio: cleanup VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Now all the usages of the old version of VMSTATE_VIRTIO_DEVICE are gone,
so we can get rid of the conditionals, and the old macro.
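
The transition pattern being removed here can be sketched in isolation. In this sketch, `WIDGET_USE_NEW`, `WIDGET_FIELD` and `struct field` are hypothetical names that only mirror the shape of the VMSTATE_VIRTIO_DEVICE conditional; they are not QEMU definitions.

```c
#include <stddef.h>
#include <string.h>

/* A converted source file defines the opt-in flag before the "header". */
#define WIDGET_USE_NEW

struct field {
    const char *name;
};

#ifdef WIDGET_USE_NEW
/* New form: expands to a single field initializer. */
#define WIDGET_FIELD { .name = "widget" }
#else
/* Old form: expands to a whole definition; dropped once all users convert. */
#define WIDGET_FIELD(devname) \
    static const struct field widget_##devname = { .name = "widget-" #devname }
#endif

/* A converted user spells out its own table and embeds the new macro. */
static const struct field fields[] = {
    WIDGET_FIELD,
    { .name = NULL }, /* end-of-list marker */
};
```

Once no file relies on the `#else` branch, both the flag and the conditional can be deleted, leaving only the new definition.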

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio.h  | 27 ---
 hw/9pfs/virtio-9p-device.c  |  2 --
 hw/block/virtio-blk.c   |  2 --
 hw/char/virtio-serial-bus.c |  2 --
 hw/display/virtio-gpu.c |  2 --
 hw/input/virtio-input.c |  2 --
 hw/net/virtio-net.c |  2 --
 hw/scsi/virtio-scsi.c   |  2 --
 hw/virtio/vhost-vsock.c |  2 --
 hw/virtio/virtio-balloon.c  |  2 --
 hw/virtio/virtio-rng.c  |  2 --
 hw/virtio/virtio.c  |  6 --
 12 files changed, 53 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 929fa92..b913aac 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -177,12 +177,9 @@ bool virtio_should_notify(VirtIODevice *vdev, VirtQueue *vq);
 void virtio_notify(VirtIODevice *vdev, VirtQueue *vq);
 
 void virtio_save(VirtIODevice *vdev, QEMUFile *f);
-void virtio_vmstate_save(QEMUFile *f, void *opaque, size_t size);
 
 extern const VMStateInfo virtio_vmstate_info;
 
-#ifdef VMSTATE_VIRTIO_DEVICE_USE_NEW
-
 #define VMSTATE_VIRTIO_DEVICE \
 { \
 .name = "virtio", \
@@ -190,30 +187,6 @@ extern const VMStateInfo virtio_vmstate_info;
 .flags = VMS_SINGLE,  \
 }
 
-#else
-/* TODO remove conditional as soon as all users are converted */
-
-#define VMSTATE_VIRTIO_DEVICE(devname, v, getf, putf) \
-static const VMStateDescription vmstate_virtio_ ## devname = { \
-.name = "virtio-" #devname ,  \
-.minimum_version_id = v,  \
-.version_id = v,  \
-.fields = (VMStateField[]) {  \
-{ \
-.name = "virtio", \
-.info = &(const VMStateInfo) {\
-.name = "virtio", \
-.get = getf,  \
-.put = putf,  \
-},\
-.flags = VMS_SINGLE,  \
-},\
-VMSTATE_END_OF_LIST() \
-} \
-}
-
-#endif
-
 int virtio_load(VirtIODevice *vdev, QEMUFile *f, int version_id);
 
 void virtio_notify_config(VirtIODevice *vdev);
diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 526ec7d..e98dd0c 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -11,8 +11,6 @@
  *
  */
 
-#define VMSTATE_VIRTIO_DEVICE_USE_NEW
-
 #include "qemu/osdep.h"
 #include "hw/virtio/virtio.h"
 #include "qemu/sockets.h"
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 10c5794..37fe72b 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -11,8 +11,6 @@
  *
  */
 
-#define VMSTATE_VIRTIO_DEVICE_USE_NEW
-
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu-common.h"
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index c9b0fc8..7975c2c 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -18,8 +18,6 @@
  * GNU GPL, version 2 or (at your option) any later version.
  */
 
-#define VMSTATE_VIRTIO_DEVICE_USE_NEW
-
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu/iov.h"
diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 4fcd63c..fa6fd0e 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -11,8 +11,6 @@
  * See the COPYING file in the top-level directory.
  */
 
-#define VMSTATE_VIRTIO_DEVICE_USE_NEW
-
 #include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "qemu/iov.h"
diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
index 5e31033..b678ee9 100644
--- a/hw/input/virtio-input.c
+++ b/hw/input/virtio-input.c
@@ -4,8 +4,6 @@
  * top-level directory.
  */
 
-#define VMSTATE_VIRTIO_DEVICE_USE_NEW
-
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu/iov.h"
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index b2198a5..06bfe4b 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -11,8 +11,6 @@
  *
  */
 
-#define VMSTATE_VIRTIO_DEVICE_USE_NEW
-
 #include "qemu/osdep.h"
 #include "qemu/iov.h"
 #include "hw/virtio/virtio.h"
diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 9473e10..4762f05 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -13,8 +13,6 @@
  *
  */
 
-#define VMSTATE_VIRTIO_DEVICE_USE_NEW
-
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "standard-headers/linux/virtio_ids.h"
diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c

[Qemu-devel] [PULL 24/33] virtio-9p: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/9pfs/virtio-9p-device.c | 17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index a338f64..526ec7d 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -11,6 +11,8 @@
  *
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include "qemu/osdep.h"
 #include "hw/virtio/virtio.h"
 #include "qemu/sockets.h"
@@ -113,11 +115,6 @@ static void virtio_9p_get_config(VirtIODevice *vdev, uint8_t *config)
 g_free(cfg);
 }
 
-static int virtio_9p_load(QEMUFile *f, void *opaque, size_t size)
-{
-return virtio_load(VIRTIO_DEVICE(opaque), f, 1);
-}
-
 static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
@@ -184,7 +181,15 @@ void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct iovec **piov,
 
 /* virtio-9p device */
 
-VMSTATE_VIRTIO_DEVICE(9p, 1, virtio_9p_load, virtio_vmstate_save);
+static const VMStateDescription vmstate_virtio_9p = {
+.name = "virtio-9p",
+.minimum_version_id = 1,
+.version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE,
+VMSTATE_END_OF_LIST()
+},
+};
 
 static Property virtio_9p_properties[] = {
 DEFINE_PROP_STRING("mount_tag", V9fsVirtioState, state.fsconf.tag),
-- 
MST




[Qemu-devel] [PULL 33/33] intel-iommu: Check IOAPIC's Trigger Mode against the one in IRTE

2016-10-09 Thread Michael S. Tsirkin
From: Feng Wu 

The Trigger Mode field of IOAPIC must match the Trigger Mode in
the IRTE according to VT-d Spec 5.1.5.1.

Signed-off-by: Feng Wu 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
Reviewed-by: Peter Xu 
---
 hw/i386/intel_iommu.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 9f4e64a..2efd69b 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -27,6 +27,7 @@
 #include "hw/pci/pci.h"
 #include "hw/pci/pci_bus.h"
 #include "hw/i386/pc.h"
+#include "hw/i386/apic-msidef.h"
 #include "hw/boards.h"
 #include "hw/i386/x86-iommu.h"
 #include "hw/pci-host/q35.h"
@@ -2209,6 +2210,8 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
 }
 } else {
 uint8_t vector = origin->data & 0xff;
+uint8_t trigger_mode = (origin->data >> MSI_DATA_TRIGGER_SHIFT) & 0x1;
+
 VTD_DPRINTF(IR, "received IOAPIC interrupt");
 /* IOAPIC entry vector should be aligned with IRTE vector
  * (see vt-d spec 5.1.5.1). */
@@ -2217,6 +2220,15 @@ static int vtd_interrupt_remap_msi(IntelIOMMUState *iommu,
 "entry: %d, IRTE: %d, index: %d",
 vector, irq.vector, index);
 }
+
+/* The Trigger Mode field must match the Trigger Mode in the IRTE.
+ * (see vt-d spec 5.1.5.1). */
+if (trigger_mode != irq.trigger_mode) {
+VTD_DPRINTF(GENERAL, "IOAPIC trigger mode inconsistent: "
+"entry: %u, IRTE: %u, index: %d",
+trigger_mode, irq.trigger_mode, index);
+}
+
 }
 
 /*
-- 
MST




[Qemu-devel] [PULL 14/33] virtio-blk: handle virtio_blk_handle_request() errors

2016-10-09 Thread Michael S. Tsirkin
From: Greg Kurz 

All these errors are caused by a buggy guest: QEMU should not exit.

With this patch, if virtio_blk_handle_request() detects a buggy request, it
marks the device as broken and returns an error to the caller so it takes
appropriate action.

In the case of virtio_blk_handle_vq(), we detach the request from the
virtqueue, free its allocated memory and stop popping new requests.
We don't need to bother about multireq since virtio_blk_handle_request()
errors out early and mrb.num_reqs == 0.

In the case of virtio_blk_dma_restart_bh(), we need to detach and free all
queued requests as well.
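
The control flow for the restart case can be modelled without any QEMU types. In this sketch `struct request`, `struct device`, `handle_request` and `restart_queued` are all hypothetical stand-ins; setting `broken` plays the role of virtio_error(), and free() stands in for detaching and freeing a request.

```c
#include <stdbool.h>
#include <stdlib.h>

struct request {
    bool malformed;
    struct request *next;
};

struct device {
    bool broken;
    int handled;
    int purged;
};

/* Returns 0 on success, -1 if the request is bogus (device marked broken). */
static int handle_request(struct device *dev, struct request *req)
{
    if (req->malformed) {
        dev->broken = true;
        return -1;
    }
    dev->handled++;
    free(req); /* a completed request is released by the handler */
    return 0;
}

/* Process a queued list; on the first error, purge everything left. */
static void restart_queued(struct device *dev, struct request *req)
{
    while (req) {
        struct request *next = req->next;
        if (handle_request(dev, req)) {
            /* The failing request was not freed: purge it and the rest. */
            while (req) {
                next = req->next;
                free(req);
                dev->purged++;
                req = next;
            }
            break;
        }
        req = next;
    }
}
```

The important detail is that `next` is read before the handler runs, because a successful handler releases the request.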

Signed-off-by: Greg Kurz 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/block/virtio-blk.c | 38 --
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index bbacd56..0ddd7fb 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -468,30 +468,32 @@ static bool virtio_blk_sect_range_ok(VirtIOBlock *dev,
 return true;
 }
 
-static void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
+static int virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
 {
 uint32_t type;
 struct iovec *in_iov = req->elem.in_sg;
 struct iovec *iov = req->elem.out_sg;
 unsigned in_num = req->elem.in_num;
 unsigned out_num = req->elem.out_num;
+VirtIOBlock *s = req->dev;
+VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
 if (req->elem.out_num < 1 || req->elem.in_num < 1) {
-error_report("virtio-blk missing headers");
-exit(1);
+virtio_error(vdev, "virtio-blk missing headers");
+return -1;
 }
 
 if (unlikely(iov_to_buf(iov, out_num, 0, &req->out,
 sizeof(req->out)) != sizeof(req->out))) {
-error_report("virtio-blk request outhdr too short");
-exit(1);
+virtio_error(vdev, "virtio-blk request outhdr too short");
+return -1;
 }
 
 iov_discard_front(&iov, &out_num, sizeof(req->out));
 
 if (in_iov[in_num - 1].iov_len < sizeof(struct virtio_blk_inhdr)) {
-error_report("virtio-blk request inhdr too short");
-exit(1);
+virtio_error(vdev, "virtio-blk request inhdr too short");
+return -1;
 }
 
 /* We always touch the last byte, so just see how big in_iov is.  */
@@ -529,7 +531,7 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
 block_acct_invalid(blk_get_stats(req->dev->blk),
is_write ? BLOCK_ACCT_WRITE : BLOCK_ACCT_READ);
 virtio_blk_free_request(req);
-return;
+return 0;
 }
 
 block_acct_start(blk_get_stats(req->dev->blk),
@@ -576,6 +578,7 @@ static void virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
 virtio_blk_req_complete(req, VIRTIO_BLK_S_UNSUPP);
 virtio_blk_free_request(req);
 }
+return 0;
 }
 
 void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
@@ -586,7 +589,11 @@ void virtio_blk_handle_vq(VirtIOBlock *s, VirtQueue *vq)
 blk_io_plug(s->blk);
 
 while ((req = virtio_blk_get_request(s, vq))) {
-virtio_blk_handle_request(req, &mrb);
+if (virtio_blk_handle_request(req, &mrb)) {
+virtqueue_detach_element(req->vq, &req->elem, 0);
+virtio_blk_free_request(req);
+break;
+}
 }
 
 if (mrb.num_reqs) {
@@ -625,7 +632,18 @@ static void virtio_blk_dma_restart_bh(void *opaque)
 
 while (req) {
 VirtIOBlockReq *next = req->next;
-virtio_blk_handle_request(req, &mrb);
+if (virtio_blk_handle_request(req, &mrb)) {
+/* Device is now broken and won't do any processing until it gets
+ * reset. Already queued requests will be lost: let's purge them.
+ */
+while (req) {
+next = req->next;
+virtqueue_detach_element(req->vq, &req->elem, 0);
+virtio_blk_free_request(req);
+req = next;
+}
+break;
+}
 req = next;
 }
 
-- 
MST




[Qemu-devel] [PULL 15/33] virtio-net: handle virtio_net_handle_ctrl() error

2016-10-09 Thread Michael S. Tsirkin
From: Greg Kurz 

This error is caused by a buggy guest: let's switch the device to the
broken state instead of terminating QEMU. Also we detach the element
from the virtqueue and free it.

Signed-off-by: Greg Kurz 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/net/virtio-net.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 6b8ae2c..a1584e1 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -880,6 +880,7 @@ static int virtio_net_handle_mq(VirtIONet *n, uint8_t cmd,
 
 return VIRTIO_NET_OK;
 }
+
 static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 {
 VirtIONet *n = VIRTIO_NET(vdev);
@@ -897,8 +898,10 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, VirtQueue *vq)
 }
 if (iov_size(elem->in_sg, elem->in_num) < sizeof(status) ||
 iov_size(elem->out_sg, elem->out_num) < sizeof(ctrl)) {
-error_report("virtio-net ctrl missing headers");
-exit(1);
+virtio_error(vdev, "virtio-net ctrl missing headers");
+virtqueue_detach_element(vq, elem, 0);
+g_free(elem);
+break;
 }
 
 iov_cnt = elem->out_num;
-- 
MST




[Qemu-devel] [PULL 22/33] virtio-blk: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/block/virtio-blk.c | 27 +++
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 0ddd7fb..10c5794 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -11,6 +11,8 @@
  *
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu-common.h"
@@ -822,13 +824,6 @@ static void virtio_blk_set_status(VirtIODevice *vdev, uint8_t status)
 }
 }
 
-static void virtio_blk_save(QEMUFile *f, void *opaque, size_t size)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
-
-virtio_save(vdev, f);
-}
-
 static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
 {
 VirtIOBlock *s = VIRTIO_BLK(vdev);
@@ -847,14 +842,6 @@ static void virtio_blk_save_device(VirtIODevice *vdev, QEMUFile *f)
 qemu_put_sbyte(f, 0);
 }
 
-static int virtio_blk_load(QEMUFile *f, void *opaque, size_t size)
-{
-VirtIOBlock *s = opaque;
-VirtIODevice *vdev = VIRTIO_DEVICE(s);
-
-return virtio_load(vdev, f, 2);
-}
-
 static int virtio_blk_load_device(VirtIODevice *vdev, QEMUFile *f,
   int version_id)
 {
@@ -975,7 +962,15 @@ static void virtio_blk_instance_init(Object *obj)
   DEVICE(obj), NULL);
 }
 
-VMSTATE_VIRTIO_DEVICE(blk, 2, virtio_blk_load, virtio_blk_save);
+static const VMStateDescription vmstate_virtio_blk = {
+.name = "virtio-blk",
+.minimum_version_id = 2,
+.version_id = 2,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE,
+VMSTATE_END_OF_LIST()
+},
+};
 
 static Property virtio_blk_properties[] = {
 DEFINE_BLOCK_PROPERTIES(VirtIOBlock, conf.conf),
-- 
MST




[Qemu-devel] [PULL 20/33] net: don't poke at chardev internal QemuOpts

2016-10-09 Thread Michael S. Tsirkin
From: "Daniel P. Berrange" 

The vhost-user & colo code is poking at the QemuOpts instance
in the CharDriverState struct, not realizing that it is valid
for this to be NULL. e.g. the following crash shows a codepath
where it will be NULL:

 Program terminated with signal SIGSEGV, Segmentation fault.
 #0  0x55baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 
, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at 
util/qemu-option.c:617
 617 QTAILQ_FOREACH(opt, &opts->head, next) {
 [Current thread is 1 (Thread 0x7f1d4970bb40 (LWP 6603))]
 (gdb) bt
 #0  0x55baf6ab4adc in qemu_opt_foreach (opts=0x0, func=0x55baf696b650 
, opaque=0x7ffc51368c00, errp=0x7ffc51368e48) at 
util/qemu-option.c:617
 #1  0x55baf696b7da in net_vhost_parse_chardev (opts=0x55baf8ff9260, 
errp=0x7ffc51368e48) at net/vhost-user.c:314
 #2  0x55baf696b985 in net_init_vhost_user (netdev=0x55baf8ff9250, 
name=0x55baf879d270 "hostnet2", peer=0x0, errp=0x7ffc51368e48) at 
net/vhost-user.c:360
 #3  0x55baf6960216 in net_client_init1 (object=0x55baf8ff9250, 
is_netdev=true, errp=0x7ffc51368e48) at net/net.c:1051
 #4  0x55baf6960518 in net_client_init (opts=0x55baf776e7e0, 
is_netdev=true, errp=0x7ffc51368f00) at net/net.c:1108
 #5  0x55baf696083f in netdev_add (opts=0x55baf776e7e0, 
errp=0x7ffc51368f00) at net/net.c:1186
 #6  0x55baf69608c7 in qmp_netdev_add (qdict=0x55baf7afaf60, 
ret=0x7ffc51368f50, errp=0x7ffc51368f48) at net/net.c:1205
 #7  0x55baf6622135 in handle_qmp_command (parser=0x55baf77fb590, 
tokens=0x7f1d24011960) at /path/to/qemu.git/monitor.c:3978
 #8  0x55baf6a9d099 in json_message_process_token (lexer=0x55baf77fb598, 
input=0x55baf75acd20, type=JSON_RCURLY, x=113, y=19) at 
qobject/json-streamer.c:105
 #9  0x55baf6abf7aa in json_lexer_feed_char (lexer=0x55baf77fb598, ch=125 
'}', flush=false) at qobject/json-lexer.c:319
 #10 0x55baf6abf8f2 in json_lexer_feed (lexer=0x55baf77fb598, 
buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-lexer.c:369
 #11 0x55baf6a9d13c in json_message_parser_feed (parser=0x55baf77fb590, 
buffer=0x7ffc51369170 "}R\204\367\272U", size=1) at qobject/json-streamer.c:124
 #12 0x55baf66221f7 in monitor_qmp_read (opaque=0x55baf77fb530, 
buf=0x7ffc51369170 "}R\204\367\272U", size=1) at 
/path/to/qemu.git/monitor.c:3994
 #13 0x55baf6757014 in qemu_chr_be_write_impl (s=0x55baf7610a40, 
buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:387
 #14 0x55baf6757076 in qemu_chr_be_write (s=0x55baf7610a40, 
buf=0x7ffc51369170 "}R\204\367\272U", len=1) at qemu-char.c:399
 #15 0x55baf675b3b0 in tcp_chr_read (chan=0x55baf90244b0, cond=G_IO_IN, 
opaque=0x55baf7610a40) at qemu-char.c:2927
 #16 0x55baf6a5d655 in qio_channel_fd_source_dispatch 
(source=0x55baf7610df0, callback=0x55baf675b25a , 
user_data=0x55baf7610a40) at io/channel-watch.c:84
 #17 0x7f1d3e80cbbd in g_main_context_dispatch () from 
/usr/lib64/libglib-2.0.so.0
 #18 0x55baf69d3720 in glib_pollfds_poll () at main-loop.c:213
 #19 0x55baf69d37fd in os_host_main_loop_wait (timeout=12600) at 
main-loop.c:258
 #20 0x55baf69d38ad in main_loop_wait (nonblocking=0) at main-loop.c:506
 #21 0x55baf676587b in main_loop () at vl.c:1908
 #22 0x55baf676d3bf in main (argc=101, argv=0x7ffc5136a6c8, 
envp=0x7ffc5136a9f8) at vl.c:4604
 (gdb) p opts
 $1 = (QemuOpts *) 0x0

The crash occurred when attaching vhost-user net via QMP:

{
"execute": "chardev-add",
"arguments": {
"id": "charnet2",
"backend": {
"type": "socket",
"data": {
"addr": {
"type": "unix",
"data": {
"path": "/var/run/openvswitch/vhost-user1"
}
},
"wait": false,
"server": false
}
}
},
"id": "libvirt-19"
}
{
"return": {

},
"id": "libvirt-19"
}
{
"execute": "netdev_add",
"arguments": {
"type": "vhost-user",
"chardev": "charnet2",
"id": "hostnet2"
},
"id": "libvirt-20"
}

Code using chardevs should not be poking at the internals of the
CharDriverState struct. What vhost-user wants is a chardev that is
operating as a reconnectable network service, along with the ability
to do FD passing over the connection. The colo code simply wants
a network service. Add a feature concept to the char drivers so
that chardev users can query the actual features they wish to have
supported. The QemuOpts member is removed to prevent future mistakes
in this area.
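
The feature concept can be sketched as a small bitmap with query helpers. Every name below (`ChrFeature`, `struct chardev`, `chr_set_feature`, `chr_has_feature`, `vhost_user_can_use`) is a hypothetical stand-in chosen for this sketch and should not be read as the API this patch adds.

```c
#include <stdbool.h>

/* Features a backend may advertise (illustrative list). */
typedef enum {
    CHR_FEATURE_RECONNECTABLE,
    CHR_FEATURE_FD_PASS,
    CHR_FEATURE_LAST
} ChrFeature;

/* Stand-in for the driver state: users see features, not option internals. */
struct chardev {
    unsigned long features;
};

/* Called by a backend when it is created. */
static void chr_set_feature(struct chardev *chr, ChrFeature f)
{
    chr->features |= 1UL << f;
}

/* Called by users instead of inspecting how the chardev was configured. */
static bool chr_has_feature(const struct chardev *chr, ChrFeature f)
{
    return (chr->features & (1UL << f)) != 0;
}

/* A vhost-user style consumer needs both capabilities. */
static bool vhost_user_can_use(const struct chardev *chr)
{
    return chr_has_feature(chr, CHR_FEATURE_RECONNECTABLE) &&
           chr_has_feature(chr, CHR_FEATURE_FD_PASS);
}
```

Because the query does not depend on how the chardev was created, it gives the same answer for a chardev added on the command line and for one added via QMP.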

Signed-off-by: Daniel P. Berrange 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/sysemu/char.h | 21 -
 hmp.c |  1 +
 net/colo-compare.c| 30 

[Qemu-devel] [PULL 31/33] vhost-vsock: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/vhost-vsock.c | 44 +++-
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
index bde2456..99cb216 100644
--- a/hw/virtio/vhost-vsock.c
+++ b/hw/virtio/vhost-vsock.c
@@ -11,6 +11,8 @@
  * top-level directory.
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include 
 #include "qemu/osdep.h"
 #include "standard-headers/linux/virtio_vsock.h"
@@ -236,17 +238,6 @@ out:
 g_free(elem);
 }
 
-static void vhost_vsock_save(QEMUFile *f, void *opaque, size_t size)
-{
-VHostVSock *vsock = opaque;
-VirtIODevice *vdev = VIRTIO_DEVICE(vsock);
-
-/* At this point, backend must be stopped, otherwise
- * it might keep writing to memory. */
-assert(!vsock->vhost_dev.started);
-virtio_save(vdev, f);
-}
-
 static void vhost_vsock_post_load_timer_cleanup(VHostVSock *vsock)
 {
 if (!vsock->post_load_timer) {
@@ -266,16 +257,19 @@ static void vhost_vsock_post_load_timer_cb(void *opaque)
 vhost_vsock_send_transport_reset(vsock);
 }
 
-static int vhost_vsock_load(QEMUFile *f, void *opaque, size_t size)
+static void vhost_vsock_pre_save(void *opaque)
 {
 VHostVSock *vsock = opaque;
-VirtIODevice *vdev = VIRTIO_DEVICE(vsock);
-int ret;
 
-ret = virtio_load(vdev, f, VHOST_VSOCK_SAVEVM_VERSION);
-if (ret) {
-return ret;
-}
+/* At this point, backend must be stopped, otherwise
+ * it might keep writing to memory. */
+assert(!vsock->vhost_dev.started);
+}
+
+static int vhost_vsock_post_load(void *opaque, int version_id)
+{
+VHostVSock *vsock = opaque;
+VirtIODevice *vdev = VIRTIO_DEVICE(vsock);
 
 if (virtio_queue_get_addr(vdev, 2)) {
 /* Defer transport reset event to a vm clock timer so that virtqueue
@@ -288,12 +282,20 @@ static int vhost_vsock_load(QEMUFile *f, void *opaque, size_t size)
  vsock);
 timer_mod(vsock->post_load_timer, 1);
 }
-
 return 0;
 }
 
-VMSTATE_VIRTIO_DEVICE(vhost_vsock, VHOST_VSOCK_SAVEVM_VERSION,
-  vhost_vsock_load, vhost_vsock_save);
+static const VMStateDescription vmstate_virtio_vhost_vsock = {
+.name = "virtio-vhost_vsock",
+.minimum_version_id = VHOST_VSOCK_SAVEVM_VERSION,
+.version_id = VHOST_VSOCK_SAVEVM_VERSION,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE,
+VMSTATE_END_OF_LIST()
+},
+.pre_save = vhost_vsock_pre_save,
+.post_load = vhost_vsock_post_load,
+};
 
 static void vhost_vsock_device_realize(DeviceState *dev, Error **errp)
 {
-- 
MST




[Qemu-devel] [PULL 17/33] virtio-net: handle virtio_net_flush_tx() errors

2016-10-09 Thread Michael S. Tsirkin
From: Greg Kurz 

All these errors are caused by a buggy guest: let's switch the device to
the broken state instead of terminating QEMU. Also we detach the element
from the virtqueue and free it.

If this happens, virtio_net_flush_tx() also returns -EINVAL, so that all
callers can stop processing the virtqueue immediately.
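
How a caller might classify the return value can be sketched as follows. The helper names `tx_must_stop` and `tx_should_reschedule` are hypothetical; only the meaning of -EBUSY, -EINVAL and a positive count follows the description above.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * True when the caller must not touch the virtqueue any further:
 * -EBUSY means completion is handled elsewhere, -EINVAL means the
 * device has been marked broken.
 */
static bool tx_must_stop(int ret)
{
    return ret == -EBUSY || ret == -EINVAL;
}

/* True when packets were flushed and another pass is worthwhile. */
static bool tx_should_reschedule(int ret)
{
    return !tx_must_stop(ret) && ret > 0;
}
```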

Signed-off-by: Greg Kurz 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/net/virtio-net.c | 26 ++
 1 file changed, 18 insertions(+), 8 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 5c0b2e0..ca1b469 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1249,15 +1249,19 @@ static int32_t virtio_net_flush_tx(VirtIONetQueue *q)
 out_num = elem->out_num;
 out_sg = elem->out_sg;
 if (out_num < 1) {
-error_report("virtio-net header not in first element");
-exit(1);
+virtio_error(vdev, "virtio-net header not in first element");
+virtqueue_detach_element(q->tx_vq, elem, 0);
+g_free(elem);
+return -EINVAL;
 }
 
 if (n->has_vnet_hdr) {
 if (iov_to_buf(out_sg, out_num, 0, , n->guest_hdr_len) <
 n->guest_hdr_len) {
-error_report("virtio-net header incorrect");
-exit(1);
+virtio_error(vdev, "virtio-net header incorrect");
+virtqueue_detach_element(q->tx_vq, elem, 0);
+g_free(elem);
+return -EINVAL;
 }
 if (n->needs_vnet_hdr_swap) {
 virtio_net_hdr_swap(vdev, (void *) );
@@ -1325,7 +1329,9 @@ static void virtio_net_handle_tx_timer(VirtIODevice *vdev, VirtQueue *vq)
 virtio_queue_set_notification(vq, 1);
 timer_del(q->tx_timer);
 q->tx_waiting = 0;
-virtio_net_flush_tx(q);
+if (virtio_net_flush_tx(q) == -EINVAL) {
+return;
+}
 } else {
 timer_mod(q->tx_timer,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + n->tx_timeout);
@@ -1396,8 +1402,9 @@ static void virtio_net_tx_bh(void *opaque)
 }
 
 ret = virtio_net_flush_tx(q);
-if (ret == -EBUSY) {
-return; /* Notification re-enable handled by tx_complete */
+if (ret == -EBUSY || ret == -EINVAL) {
+return; /* Notification re-enable handled by tx_complete or device
+ * broken */
 }
 
 /* If we flush a full burst of packets, assume there are
@@ -1412,7 +1419,10 @@ static void virtio_net_tx_bh(void *opaque)
  * anything that may have come in while we weren't looking.  If
  * we find something, assume the guest is still active and reschedule */
 virtio_queue_set_notification(q->tx_vq, 1);
-if (virtio_net_flush_tx(q) > 0) {
+ret = virtio_net_flush_tx(q);
+if (ret == -EINVAL) {
+return;
+} else if (ret > 0) {
 virtio_queue_set_notification(q->tx_vq, 0);
 qemu_bh_schedule(q->tx_bh);
 q->tx_waiting = 1;
-- 
MST




[Qemu-devel] [PULL 26/33] virtio-gpu: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro. The device virtio-gpu is
special because it actually does not adhere to the virtio migration
schema, because device state is last.
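
The ordering constraint can be illustrated with a toy field table that is walked in order, so the core section is written before the device section. `struct field`, `put_core`, `put_device`, `gpu_like_fields` and `save_all` are hypothetical names for this sketch and do not correspond to the VMState API.

```c
#include <stddef.h>
#include <string.h>

/* One entry per section of the stream; `put` appends that section. */
struct field {
    const char *name;
    void (*put)(char *out, size_t cap);
};

static void put_core(char *out, size_t cap)
{
    strncat(out, "core;", cap - strlen(out) - 1);
}

static void put_device(char *out, size_t cap)
{
    strncat(out, "device;", cap - strlen(out) - 1);
}

/* Core first, device-specific data last, then the end marker. */
static const struct field gpu_like_fields[] = {
    { "virtio", put_core },
    { "virtio-gpu", put_device },
    { NULL, NULL },
};

/* Walk the table in order, so the stream layout follows the table layout. */
static void save_all(const struct field *f, char *out, size_t cap)
{
    out[0] = '\0';
    for (; f->name; f++) {
        f->put(out, cap);
    }
}
```

Keeping the device entry as a separate field after the core one is what preserves the existing stream layout for this device.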

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/display/virtio-gpu.c | 41 +
 1 file changed, 29 insertions(+), 12 deletions(-)

diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c
index 7fe6ed8..4fcd63c 100644
--- a/hw/display/virtio-gpu.c
+++ b/hw/display/virtio-gpu.c
@@ -11,6 +11,8 @@
  * See the COPYING file in the top-level directory.
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include "qemu/osdep.h"
 #include "qemu-common.h"
 #include "qemu/iov.h"
@@ -990,12 +992,9 @@ static const VMStateDescription vmstate_virtio_gpu_scanouts = {
 static void virtio_gpu_save(QEMUFile *f, void *opaque, size_t size)
 {
 VirtIOGPU *g = opaque;
-VirtIODevice *vdev = VIRTIO_DEVICE(g);
 struct virtio_gpu_simple_resource *res;
 int i;
 
-virtio_save(vdev, f);
-
 /* in 2d mode we should never find unprocessed commands here */
 assert(QTAILQ_EMPTY(&g->cmdq));
 
@@ -1020,16 +1019,10 @@ static void virtio_gpu_save(QEMUFile *f, void *opaque, size_t size)
 static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size)
 {
 VirtIOGPU *g = opaque;
-VirtIODevice *vdev = VIRTIO_DEVICE(g);
 struct virtio_gpu_simple_resource *res;
 struct virtio_gpu_scanout *scanout;
 uint32_t resource_id, pformat;
-int i, ret;
-
-ret = virtio_load(vdev, f, VIRTIO_GPU_VM_VERSION);
-if (ret) {
-return ret;
-}
+int i;
 
 resource_id = qemu_get_be32(f);
 while (resource_id != 0) {
@@ -1219,8 +1212,32 @@ static void virtio_gpu_reset(VirtIODevice *vdev)
 #endif
 }
 
-VMSTATE_VIRTIO_DEVICE(gpu, VIRTIO_GPU_VM_VERSION, virtio_gpu_load,
-  virtio_gpu_save);
+/*
+ * For historical reasons virtio_gpu does not adhere to virtio migration
+ * scheme as described in doc/virtio-migration.txt, in a sense that no
+ * save/load callback are provided to the core. Instead the device data
+ * is saved/loaded after the core data.
+ *
+ * Because of this we need a special vmsd.
+ */
+static const VMStateDescription vmstate_virtio_gpu = {
+.name = "virtio-gpu",
+.minimum_version_id = VIRTIO_GPU_VM_VERSION,
+.version_id = VIRTIO_GPU_VM_VERSION,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE /* core */,
+{
+.name = "virtio-gpu",
+.info = &(const VMStateInfo) {
+.name = "virtio-gpu",
+.get = virtio_gpu_load,
+.put = virtio_gpu_save,
+},
+.flags = VMS_SINGLE,
+} /* device */,
+VMSTATE_END_OF_LIST()
+},
+};
 
 static Property virtio_gpu_properties[] = {
 DEFINE_PROP_UINT32("max_outputs", VirtIOGPU, conf.max_outputs, 1),
-- 
MST




[Qemu-devel] [PULL 30/33] virtio-rng: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-rng.c | 21 +
 1 file changed, 13 insertions(+), 8 deletions(-)

diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
index cd8ca10..62867d1 100644
--- a/hw/virtio/virtio-rng.c
+++ b/hw/virtio/virtio-rng.c
@@ -9,6 +9,8 @@
  * top-level directory.
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu/iov.h"
@@ -120,15 +122,9 @@ static uint64_t get_features(VirtIODevice *vdev, uint64_t f, Error **errp)
 return f;
 }
 
-static int virtio_rng_load(QEMUFile *f, void *opaque, size_t size)
+static int virtio_rng_post_load(void *opaque, int version_id)
 {
 VirtIORNG *vrng = opaque;
-int ret;
-
-ret = virtio_load(VIRTIO_DEVICE(vrng), f, 1);
-if (ret != 0) {
-return ret;
-}
 
 /* We may have an element ready but couldn't process it due to a quota
  * limit.  Make sure to try again after live migration when the quota may
@@ -216,7 +212,16 @@ static void virtio_rng_device_unrealize(DeviceState *dev, Error **errp)
 virtio_cleanup(vdev);
 }
 
-VMSTATE_VIRTIO_DEVICE(rng, 1, virtio_rng_load, virtio_vmstate_save);
+static const VMStateDescription vmstate_virtio_rng = {
+.name = "virtio-rng",
+.minimum_version_id = 1,
+.version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE,
+VMSTATE_END_OF_LIST()
+},
+.post_load =  virtio_rng_post_load,
+};
 
 static Property virtio_rng_properties[] = {
 /* Set a default rate limit of 2^47 bytes per minute or roughly 2TB/s.  If
-- 
MST




[Qemu-devel] [PULL 25/33] virtio-serial: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/char/virtio-serial-bus.c | 18 +++---
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 3955f0f..c9b0fc8 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -18,6 +18,8 @@
  * GNU GPL, version 2 or (at your option) any later version.
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu/iov.h"
@@ -778,12 +780,6 @@ static int fetch_active_ports_list(QEMUFile *f,
 return 0;
 }
 
-static int virtio_serial_load(QEMUFile *f, void *opaque, size_t size)
-{
-/* The virtio device */
-return virtio_load(VIRTIO_DEVICE(opaque), f, 3);
-}
-
 static int virtio_serial_load_device(VirtIODevice *vdev, QEMUFile *f,
  int version_id)
 {
@@ -1129,7 +1125,15 @@ static void virtio_serial_device_unrealize(DeviceState *dev, Error **errp)
 }
 
 /* Note: 'console' is used for backwards compatibility */
-VMSTATE_VIRTIO_DEVICE(console, 3, virtio_serial_load, virtio_vmstate_save);
+static const VMStateDescription vmstate_virtio_console = {
+.name = "virtio-console",
+.minimum_version_id = 3,
+.version_id = 3,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE,
+VMSTATE_END_OF_LIST()
+},
+};
 
 static Property virtio_serial_properties[] = {
 DEFINE_PROP_UINT32("max_ports", VirtIOSerial, serial.max_virtserial_ports,
-- 
MST




[Qemu-devel] [PULL 07/33] tests: acpi tables expected blobs update

2016-10-09 Thread Michael S. Tsirkin
From: Igor Mammedov 

Signed-off-by: Igor Mammedov 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/acpi-test-data/pc/DSDT.cphp  | Bin 6435 -> 6471 bytes
 tests/acpi-test-data/pc/SRAT.cphp  | Bin 0 -> 304 bytes
 tests/acpi-test-data/q35/DSDT.cphp | Bin 9197 -> 9233 bytes
 tests/acpi-test-data/q35/SRAT.cphp | Bin 0 -> 304 bytes
 4 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 tests/acpi-test-data/pc/SRAT.cphp
 create mode 100644 tests/acpi-test-data/q35/SRAT.cphp

diff --git a/tests/acpi-test-data/pc/DSDT.cphp b/tests/acpi-test-data/pc/DSDT.cphp
index e8b146208eb3877e1cde2cc361c5afd270d194c6..9f405cfd83d39a8e06bc08428e160a0192fc9704 100644
GIT binary patch
delta 122
zcmZ2%blix`CD^<3<8^QCN+{En;m-Cx^2F7EIZuXlj#sifD^AdR6*}$eSZeGq)!vg?MeIj%K

delta 85
zcmX?ZwAhHtCD!Wg5-^

diff --git a/tests/acpi-test-data/pc/SRAT.cphp b/tests/acpi-test-data/pc/SRAT.cphp
new file mode 100644
index 
..ff2137642f488ec70b85207ed6c20e7351d61e98
GIT binary patch
literal 304
zcmWFzattwGWME)4bMklg2v%^42yhMtiUEZfKx_~V!f+sf!DmF1XF}yOvY_!<(fDl0
pd`1npO;83GTmZW|po75R12aq^syaB21u74tQT>}2A

literal 0
HcmV?d1

diff --git a/tests/acpi-test-data/q35/DSDT.cphp b/tests/acpi-test-data/q35/DSDT.cphp
index 6cc28c6daec2b331030ab0600a4d79034c1dfc40..a0ce6b3264c6c6e82a8ae7bab49338e4819b 100644
GIT binary patch
delta 122

[Qemu-devel] [PULL 27/33] virtio-input: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/input/virtio-input.c | 23 +--
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/hw/input/virtio-input.c b/hw/input/virtio-input.c
index ccdf730..5e31033 100644
--- a/hw/input/virtio-input.c
+++ b/hw/input/virtio-input.c
@@ -4,6 +4,8 @@
  * top-level directory.
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "qemu/iov.h"
@@ -217,19 +219,12 @@ static void virtio_input_reset(VirtIODevice *vdev)
 }
 }
 
-static int virtio_input_load(QEMUFile *f, void *opaque, size_t size)
+static int virtio_input_post_load(void *opaque, int version_id)
 {
 VirtIOInput *vinput = opaque;
 VirtIOInputClass *vic = VIRTIO_INPUT_GET_CLASS(vinput);
 VirtIODevice *vdev = VIRTIO_DEVICE(vinput);
-int ret;
-
-ret = virtio_load(vdev, f, VIRTIO_INPUT_VM_VERSION);
-if (ret) {
-return ret;
-}
 
-/* post_load() */
 vinput->active = vdev->status & VIRTIO_CONFIG_S_DRIVER_OK;
 if (vic->change_active) {
 vic->change_active(vinput);
@@ -296,8 +291,16 @@ static void virtio_input_device_unrealize(DeviceState *dev, Error **errp)
 virtio_cleanup(vdev);
 }
 
-VMSTATE_VIRTIO_DEVICE(input, VIRTIO_INPUT_VM_VERSION, virtio_input_load,
-  virtio_vmstate_save);
+static const VMStateDescription vmstate_virtio_input = {
+.name = "virtio-input",
+.minimum_version_id = VIRTIO_INPUT_VM_VERSION,
+.version_id = VIRTIO_INPUT_VM_VERSION,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE,
+VMSTATE_END_OF_LIST()
+},
+.post_load = virtio_input_post_load,
+};
 
 static Property virtio_input_properties[] = {
 DEFINE_PROP_STRING("serial", VirtIOInput, serial),
-- 
MST




[Qemu-devel] [PULL 10/33] virtio-serial: add missing virtio_detach_element() call

2016-10-09 Thread Michael S. Tsirkin
From: Stefan Hajnoczi 

Ports enter a "throttled" state when writing to the chardev would block.
The current output VirtQueueElement is kept around until the chardev
becomes writable again.

There are several places in the virtio-serial lifecycle where the
VirtQueueElement should be thrown away.  For example, if the virtio
device is reset then virtqueue elements are no longer valid.

This patch adds the discard_throttle_data() function to unmap the
scatter-gather list and decrement vq->inuse.  This ensures that the
VirtQueueElement is freed properly.

Cc: amit.s...@redhat.com
Signed-off-by: Stefan Hajnoczi 
Tested-by: Ladi Prosek 
Reviewed-by: Ladi Prosek 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/char/virtio-serial-bus.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index db2a9f1..3955f0f 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -145,6 +145,15 @@ static void discard_vq_data(VirtQueue *vq, VirtIODevice *vdev)
 virtio_notify(vdev, vq);
 }
 
+static void discard_throttle_data(VirtIOSerialPort *port)
+{
+if (port->elem) {
+virtqueue_detach_element(port->ovq, port->elem, 0);
+g_free(port->elem);
+port->elem = NULL;
+}
+}
+
 static void do_flush_queued_data(VirtIOSerialPort *port, VirtQueue *vq,
  VirtIODevice *vdev)
 {
@@ -267,6 +276,7 @@ int virtio_serial_close(VirtIOSerialPort *port)
  * consume, reset the throttling flag and discard the data.
  */
 port->throttled = false;
+discard_throttle_data(port);
 discard_vq_data(port->ovq, VIRTIO_DEVICE(port->vser));
 
 send_control_event(port->vser, port->id, VIRTIO_CONSOLE_PORT_OPEN, 0);
@@ -591,6 +601,9 @@ static void guest_reset(VirtIOSerial *vser)
 
 QTAILQ_FOREACH(port, &vser->ports, next) {
 vsc = VIRTIO_SERIAL_PORT_GET_CLASS(port);
+
+discard_throttle_data(port);
+
 if (port->guest_connected) {
 port->guest_connected = false;
 if (vsc->set_guest_connected) {
@@ -901,6 +914,7 @@ static void remove_port(VirtIOSerial *vser, uint32_t port_id)
 assert(port);
 
 /* Flush out any unconsumed buffers first */
+discard_throttle_data(port);
 discard_vq_data(port->ovq, VIRTIO_DEVICE(port->vser));
 
 send_control_event(vser, port->id, VIRTIO_CONSOLE_PORT_REMOVE, 1);
-- 
MST




[Qemu-devel] [PULL 28/33] virtio-scsi: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/scsi/virtio-scsi.c | 28 +++-
 1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index 6eaadd8..9473e10 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -13,6 +13,8 @@
  *
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include "qemu/osdep.h"
 #include "qapi/error.h"
 #include "standard-headers/linux/virtio_ids.h"
@@ -681,22 +683,6 @@ static void virtio_scsi_reset(VirtIODevice *vdev)
 s->events_dropped = false;
 }
 
-/* The device does not have anything to save beyond the virtio data.
- * Request data is saved with callbacks from SCSI devices.
- */
-static void virtio_scsi_save(QEMUFile *f, void *opaque, size_t size)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
-virtio_save(vdev, f);
-}
-
-static int virtio_scsi_load(QEMUFile *f, void *opaque, size_t size)
-{
-VirtIODevice *vdev = VIRTIO_DEVICE(opaque);
-
-return virtio_load(vdev, f, 1);
-}
-
 void virtio_scsi_push_event(VirtIOSCSI *s, SCSIDevice *dev,
 uint32_t event, uint32_t reason)
 {
@@ -940,7 +926,15 @@ static Property virtio_scsi_properties[] = {
 DEFINE_PROP_END_OF_LIST(),
 };
 
-VMSTATE_VIRTIO_DEVICE(scsi, 1, virtio_scsi_load, virtio_scsi_save);
+static const VMStateDescription vmstate_virtio_scsi = {
+.name = "virtio-scsi",
+.minimum_version_id = 1,
+.version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE,
+VMSTATE_END_OF_LIST()
+},
+};
 
 static void virtio_scsi_common_class_init(ObjectClass *klass, void *data)
 {
-- 
MST




[Qemu-devel] [PULL 13/33] virtio-9p: handle handle_9p_output() error

2016-10-09 Thread Michael S. Tsirkin
From: Greg Kurz 

A broken guest may send a request without providing buffers for the reply
or for the request itself, and virtqueue_pop() will return an element with
either in_num == 0 or out_num == 0.

All 9P requests are expected to start with the following 7-byte header:

uint32_t size_le;
uint8_t id;
uint16_t tag_le;

If iov_to_buf() fails to return these 7 bytes, then something is wrong in
the guest.

In both cases, it is wrong to crash QEMU, since the root cause lies in the
guest.

This patch hence does the following:
- keep the check of in_num, since pdu_complete() assumes it has enough
  space to store the reply and we would otherwise send something broken
  to the guest
- let iov_to_buf() handle out_num == 0, since it will return 0 just as
  if the guest had provided a zero-sized buffer
- call virtio_error() to inform the guest that the device is now broken,
  instead of aborting
- detach the request from the virtqueue and free it
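
To make the header check above concrete, here is a minimal standalone sketch, not the QEMU code itself. The struct p9_header type and the p9_parse_header() helper are hypothetical names invented for this illustration; only the 7-byte layout (little-endian size, one-byte id, little-endian tag) comes from the commit message. A short buffer is reported as an error to the caller rather than triggering an abort.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of the 7-byte 9P request header. */
struct p9_header {
    uint32_t size;  /* total message size, decoded from size_le */
    uint8_t  id;    /* message type */
    uint16_t tag;   /* request tag, decoded from tag_le */
};

enum { P9_HEADER_LEN = 7 };

/*
 * Decode the header from the bytes the guest supplied.
 * Returns 0 on success, -1 if fewer than 7 bytes are available.
 */
static int p9_parse_header(const uint8_t *buf, size_t len,
                           struct p9_header *hdr)
{
    if (len < P9_HEADER_LEN) {
        return -1;  /* malformed request: let the caller mark the device broken */
    }
    /* Assemble the little-endian fields byte by byte, independent of host order. */
    hdr->size = (uint32_t)buf[0] | (uint32_t)buf[1] << 8 |
                (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;
    hdr->id = buf[4];
    hdr->tag = (uint16_t)(buf[5] | buf[6] << 8);
    return 0;
}
```

A caller would treat a -1 return the way the patch treats a short iov_to_buf() result: report the error, detach the element and free it.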

Signed-off-by: Greg Kurz 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/9pfs/virtio-9p-device.c | 26 +-
 1 file changed, 21 insertions(+), 5 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index e7ea0e4..a338f64 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -41,6 +41,7 @@ static void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
 V9fsState *s = &v->state;
 V9fsPDU *pdu;
 ssize_t len;
+VirtQueueElement *elem;
 
 while ((pdu = pdu_alloc(s))) {
 struct {
@@ -48,21 +49,28 @@ static void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
 uint8_t id;
 uint16_t tag_le;
 } QEMU_PACKED out;
-VirtQueueElement *elem;
 
 elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
 if (!elem) {
-pdu_free(pdu);
-break;
+goto out_free_pdu;
 }
 
-BUG_ON(elem->out_num == 0 || elem->in_num == 0);
+if (elem->in_num == 0) {
+virtio_error(vdev,
+ "The guest sent a VirtFS request without space for "
+ "the reply");
+goto out_free_req;
+}
 QEMU_BUILD_BUG_ON(sizeof(out) != 7);
 
 v->elems[pdu->idx] = elem;
 len = iov_to_buf(elem->out_sg, elem->out_num, 0,
 &out, sizeof(out));
-BUG_ON(len != sizeof(out));
+if (len != sizeof(out)) {
+virtio_error(vdev, "The guest sent a malformed VirtFS request: "
+ "header size is %zd, should be 7", len);
+goto out_free_req;
+}
 
 pdu->size = le32_to_cpu(out.size_le);
 
@@ -72,6 +80,14 @@ static void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
 qemu_co_queue_init(&pdu->complete);
 pdu_submit(pdu);
 }
+
+return;
+
+out_free_req:
+virtqueue_detach_element(vq, elem, 0);
+g_free(elem);
+out_free_pdu:
+pdu_free(pdu);
 }
 
 static uint64_t virtio_9p_get_features(VirtIODevice *vdev, uint64_t features,
-- 
MST




[Qemu-devel] [PULL 18/33] virtio-scsi: convert virtio_scsi_bad_req() to use virtio_error()

2016-10-09 Thread Michael S. Tsirkin
From: Greg Kurz 

The virtio_scsi_bad_req() function is called when a guest sends a
request with missing or ill-sized headers. This generally happens
when the virtio_scsi_parse_req() function returns an error.

With this patch, virtio_scsi_bad_req() will mark the device as broken,
detach the request from the virtqueue and free it, instead of forcing
QEMU to exit.

In nearly all locations where virtio_scsi_bad_req() is called, the only
thing to do next is to return to the caller.

The virtio_scsi_handle_cmd_req_prepare() function is an exception though.

It is called in a loop by virtio_scsi_handle_cmd_vq() and passed requests
freshly popped from a cmd virtqueue; virtio_scsi_handle_cmd_req_prepare()
does some sanity checks on the request and returns a boolean flag to
indicate whether the request should be queued or not. In the latter case,
virtio_scsi_handle_cmd_req_prepare() has detected a non-fatal error and
sent a response back to the guest.

We now have a new condition to take into account: the device is broken
and should stop all processing.

The return value of virtio_scsi_handle_cmd_req_prepare() is hence changed
to an int. A return value of zero means that the request should be queued.
Other non-fatal error cases where the request shouldn't be queued return
a negative errno (values are vaguely inspired by the error condition, but
the only goal here is to discriminate the case we're interested in).

And finally, if virtio_scsi_bad_req() was called, -EINVAL is returned. In
this case, virtio_scsi_handle_cmd_vq() detaches and frees already queued
requests, instead of submitting them.
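
The return-value convention described above can be modelled in a few lines. This is a toy sketch under stated assumptions, not the virtio-scsi code: enum req_kind, prepare_req() and handle_batch() are hypothetical names, and the "batch" is a plain array rather than a virtqueue. Only the convention itself (0 means queue, -EINVAL means the device is broken, any other negative errno means a response was already sent) is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical classification of a request after sanity checks. */
enum req_kind { REQ_OK, REQ_UNSUPPORTED, REQ_NO_TARGET, REQ_MALFORMED };

/*
 * 0         -> the request should be queued
 * -EINVAL   -> malformed headers, the device is now broken
 * other < 0 -> non-fatal error, a response already went back to the guest
 */
static int prepare_req(enum req_kind kind)
{
    switch (kind) {
    case REQ_OK:
        return 0;
    case REQ_UNSUPPORTED:
        return -ENOTSUP;
    case REQ_NO_TARGET:
        return -ENOENT;
    default:
        return -EINVAL;
    }
}

/*
 * Walk a batch of requests. Returns how many were queued for submission,
 * or -1 if a malformed request broke the device, in which case the
 * already queued requests are dropped instead of submitted.
 */
static int handle_batch(const enum req_kind *reqs, size_t n)
{
    int queued = 0;

    for (size_t i = 0; i < n; i++) {
        int ret = prepare_req(reqs[i]);

        if (ret == 0) {
            queued++;
        } else if (ret == -EINVAL) {
            return -1;  /* stop all processing */
        }
        /* any other error: nothing to queue, carry on with the next one */
    }
    return queued;
}
```

Using an int rather than a bool is what lets the loop tell "skip this request" apart from "stop everything".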

Signed-off-by: Greg Kurz 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/scsi/virtio-scsi.c | 46 --
 1 file changed, 32 insertions(+), 14 deletions(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index e596b64..b58de95 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -81,10 +81,11 @@ static void virtio_scsi_complete_req(VirtIOSCSIReq *req)
 virtio_scsi_free_req(req);
 }
 
-static void virtio_scsi_bad_req(void)
+static void virtio_scsi_bad_req(VirtIOSCSIReq *req)
 {
-error_report("wrong size for virtio-scsi headers");
-exit(1);
+virtio_error(VIRTIO_DEVICE(req->dev), "wrong size for virtio-scsi headers");
+virtqueue_detach_element(req->vq, &req->elem, 0);
+virtio_scsi_free_req(req);
 }
 
 static size_t qemu_sgl_concat(VirtIOSCSIReq *req, struct iovec *iov,
@@ -387,7 +388,7 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
 
 if (iov_to_buf(req->elem.out_sg, req->elem.out_num, 0,
 &type, sizeof(type)) < sizeof(type)) {
-virtio_scsi_bad_req();
+virtio_scsi_bad_req(req);
 return;
 }
 
@@ -395,7 +396,8 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
 if (type == VIRTIO_SCSI_T_TMF) {
 if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlTMFReq),
 sizeof(VirtIOSCSICtrlTMFResp)) < 0) {
-virtio_scsi_bad_req();
+virtio_scsi_bad_req(req);
+return;
 } else {
 r = virtio_scsi_do_tmf(s, req);
 }
@@ -404,7 +406,8 @@ static void virtio_scsi_handle_ctrl_req(VirtIOSCSI *s, VirtIOSCSIReq *req)
type == VIRTIO_SCSI_T_AN_SUBSCRIBE) {
 if (virtio_scsi_parse_req(req, sizeof(VirtIOSCSICtrlANReq),
 sizeof(VirtIOSCSICtrlANResp)) < 0) {
-virtio_scsi_bad_req();
+virtio_scsi_bad_req(req);
+return;
 } else {
 req->resp.an.event_actual = 0;
 req->resp.an.response = VIRTIO_SCSI_S_OK;
@@ -521,7 +524,7 @@ static void virtio_scsi_fail_cmd_req(VirtIOSCSIReq *req)
 virtio_scsi_complete_cmd_req(req);
 }
 
-static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
+static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req)
 {
 VirtIOSCSICommon *vs = &s->parent_obj;
 SCSIDevice *d;
@@ -532,17 +535,18 @@ static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req
 if (rc < 0) {
 if (rc == -ENOTSUP) {
 virtio_scsi_fail_cmd_req(req);
+return -ENOTSUP;
 } else {
-virtio_scsi_bad_req();
+virtio_scsi_bad_req(req);
+return -EINVAL;
 }
-return false;
 }
 
 d = virtio_scsi_device_find(s, req->req.cmd.lun);
 if (!d) {
 req->resp.cmd.response = VIRTIO_SCSI_S_BAD_TARGET;
 virtio_scsi_complete_cmd_req(req);
-return false;
+return -ENOENT;
 }
 virtio_scsi_ctx_check(s, d);
 req->sreq = scsi_req_new(d, req->req.cmd.tag,
@@ -554,11 +558,11 @@ static bool virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI 
*s, 

[Qemu-devel] [PULL 04/33] numa: reduce code duplication by adding helper numa_get_node_for_cpu()

2016-10-09 Thread Michael S. Tsirkin
From: Igor Mammedov 

Replace repeated pattern

for (i = 0; i < nb_numa_nodes; i++) {
if (test_bit(idx, numa_info[i].node_cpu)) {
   ...
   break;

with a helper function to look up the NUMA node index for a CPU.
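
As an illustration of what such a helper looks like, here is a simplified standalone sketch. It is not the numa.c implementation: nb_nodes and node_cpu_mask[] are hypothetical stand-ins for QEMU's nb_numa_nodes and numa_info[].node_cpu, with each node's CPU set reduced to a single 64-bit mask (so idx must be below 64). The "return the node count on failure" convention follows the comment added to include/sysemu/numa.h.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_NODES 4

/* Toy stand-ins for nb_numa_nodes and numa_info[].node_cpu. */
static int nb_nodes;
static uint64_t node_cpu_mask[MAX_NODES];

/*
 * Returns the index of the node that owns CPU idx,
 * or nb_nodes if no node claims it.
 */
static int get_node_for_cpu(int idx)
{
    int i;

    for (i = 0; i < nb_nodes; i++) {
        if (node_cpu_mask[i] & (UINT64_C(1) << idx)) {
            break;
        }
    }
    return i;  /* equals nb_nodes when the loop ran to completion */
}
```

Callers then replace their open-coded loop with a single call followed by a `node < nb_nodes` check, which is the shape of every converted call site in the diff.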

Suggested-by: Michael S. Tsirkin 
Signed-off-by: Igor Mammedov 
Reviewed-by: David Gibson 
Reviewed-by: Shannon Zhao 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/sysemu/numa.h|  3 +++
 hw/arm/virt-acpi-build.c |  6 ++
 hw/arm/virt.c|  7 +++
 hw/i386/acpi-build.c |  7 ++-
 hw/i386/pc.c |  8 +++-
 hw/ppc/spapr_cpu_core.c  |  6 ++
 numa.c   | 12 
 7 files changed, 27 insertions(+), 22 deletions(-)

diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index bb184c9..4da808a 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -32,4 +32,7 @@ void numa_set_mem_node_id(ram_addr_t addr, uint64_t size, uint32_t node);
 void numa_unset_mem_node_id(ram_addr_t addr, uint64_t size, uint32_t node);
 uint32_t numa_get_node(ram_addr_t addr, Error **errp);
 
+/* on success returns node index in numa_info,
+ * on failure returns nb_numa_nodes */
+int numa_get_node_for_cpu(int idx);
 #endif
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 7b39b1d..c77525d 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -427,11 +427,9 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtGuestInfo *guest_info)
 uint32_t *cpu_node = g_malloc0(guest_info->smp_cpus * sizeof(uint32_t));
 
 for (i = 0; i < guest_info->smp_cpus; i++) {
-for (j = 0; j < nb_numa_nodes; j++) {
-if (test_bit(i, numa_info[j].node_cpu)) {
+j = numa_get_node_for_cpu(i);
+if (j < nb_numa_nodes) {
 cpu_node[i] = j;
-break;
-}
 }
 }
 
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 0f6305d..795740d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -413,10 +413,9 @@ static void fdt_add_cpu_nodes(const VirtBoardInfo *vbi)
   armcpu->mp_affinity);
 }
 
-for (i = 0; i < nb_numa_nodes; i++) {
-if (test_bit(cpu, numa_info[i].node_cpu)) {
-qemu_fdt_setprop_cell(vbi->fdt, nodename, "numa-node-id", i);
-}
+i = numa_get_node_for_cpu(cpu);
+if (i < nb_numa_nodes) {
+qemu_fdt_setprop_cell(vbi->fdt, nodename, "numa-node-id", i);
 }
 
 g_free(nodename);
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index c20bc71..e999654 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2410,18 +2410,15 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 srat->reserved1 = cpu_to_le32(1);
 
 for (i = 0; i < apic_ids->len; i++) {
-int j;
+int j = numa_get_node_for_cpu(i);
 int apic_id = apic_ids->cpus[i].arch_id;
 
 core = acpi_data_push(table_data, sizeof *core);
 core->type = ACPI_SRAT_PROCESSOR_APIC;
 core->length = sizeof(*core);
 core->local_apic_id = apic_id;
-for (j = 0; j < nb_numa_nodes; j++) {
-if (test_bit(i, numa_info[j].node_cpu)) {
+if (j < nb_numa_nodes) {
 core->proximity_lo = j;
-break;
-}
 }
 memset(core->proximity_hi, 0, 3);
 core->local_sapic_eid = 0;
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2d6d792..93ff49c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -779,11 +779,9 @@ static FWCfgState *bochs_bios_init(AddressSpace *as, PCMachineState *pcms)
 for (i = 0; i < max_cpus; i++) {
 unsigned int apic_id = x86_cpu_apic_id_from_index(i);
 assert(apic_id < pcms->apic_id_limit);
-for (j = 0; j < nb_numa_nodes; j++) {
-if (test_bit(i, numa_info[j].node_cpu)) {
-numa_fw_cfg[apic_id + 1] = cpu_to_le64(j);
-break;
-}
+j = numa_get_node_for_cpu(i);
+if (j < nb_numa_nodes) {
+numa_fw_cfg[apic_id + 1] = cpu_to_le64(j);
 }
 }
 for (i = 0; i < nb_numa_nodes; i++) {
diff --git a/hw/ppc/spapr_cpu_core.c b/hw/ppc/spapr_cpu_core.c
index 35d1873..bc922bc 100644
--- a/hw/ppc/spapr_cpu_core.c
+++ b/hw/ppc/spapr_cpu_core.c
@@ -69,11 +69,9 @@ void spapr_cpu_init(sPAPRMachineState *spapr, PowerPCCPU *cpu, Error **errp)
 }
 
 /* Set NUMA node for the added CPUs  */
-for (i = 0; i < nb_numa_nodes; i++) {
-if (test_bit(cs->cpu_index, numa_info[i].node_cpu)) {
+i = numa_get_node_for_cpu(cs->cpu_index);
+if (i < nb_numa_nodes) {
 cs->numa_node = i;
-break;
-}
 }
 
 xics_cpu_setup(spapr->xics, cpu);
diff --git a/numa.c 

[Qemu-devel] [PULL 01/33] virtio-balloon: Remove needless precompiled directive

2016-10-09 Thread Michael S. Tsirkin
From: Liang Li 

Since there is a wrapper around madvise(), the virtio-balloon
code works without the precompiled directive, so the
directive can be removed.

Signed-off-by: Liang Li 
Suggested-by: Thomas Huth 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/virtio/virtio-balloon.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 49a2f4a..eb572e6 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -34,13 +34,11 @@
 
 static void balloon_page(void *addr, int deflate)
 {
-#if defined(__linux__)
 if (!qemu_balloon_is_inhibited() && (!kvm_enabled() ||
  kvm_has_sync_mmu())) {
 qemu_madvise(addr, BALLOON_PAGE_SIZE,
 deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
 }
-#endif
 }
 
 static const char *balloon_stat_names[] = {
-- 
MST




[Qemu-devel] [PULL 19/33] virtio-scsi: handle virtio_scsi_set_config() error

2016-10-09 Thread Michael S. Tsirkin
From: Greg Kurz 

This error is caused by a buggy guest: let's switch the device to the
broken state instead of terminating QEMU.
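
The change in behaviour can be sketched with a toy device model. Everything here is an assumption made for illustration: struct toy_scsi_dev and toy_set_config() are hypothetical, and setting the broken flag merely stands in for what virtio_error() does. The two limits (65536 for sense_size, 256 for cdb_size) are the ones checked in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, heavily reduced device state. */
struct toy_scsi_dev {
    bool broken;
    uint32_t sense_size;
    uint32_t cdb_size;
};

/*
 * Apply a guest-written configuration. Out-of-range values mark the
 * device as broken and leave the old settings untouched, rather than
 * terminating the whole process.
 */
static void toy_set_config(struct toy_scsi_dev *dev,
                           uint32_t sense_size, uint32_t cdb_size)
{
    if (sense_size >= 65536 || cdb_size >= 256) {
        dev->broken = true;  /* stands in for virtio_error() */
        return;
    }
    dev->sense_size = sense_size;
    dev->cdb_size = cdb_size;
}
```

The point of the pattern is containment: a guest bug costs the guest its device, while the host process and other devices keep running.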

Signed-off-by: Greg Kurz 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/scsi/virtio-scsi.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c
index b58de95..6eaadd8 100644
--- a/hw/scsi/virtio-scsi.c
+++ b/hw/scsi/virtio-scsi.c
@@ -644,8 +644,9 @@ static void virtio_scsi_set_config(VirtIODevice *vdev,
 
 if ((uint32_t) virtio_ldl_p(vdev, >sense_size) >= 65536 ||
 (uint32_t) virtio_ldl_p(vdev, >cdb_size) >= 256) {
-error_report("bad data written to virtio-scsi configuration space");
-exit(1);
+virtio_error(vdev,
+ "bad data written to virtio-scsi configuration space");
+return;
 }
 
 vs->sense_size = virtio_ldl_p(vdev, >sense_size);
-- 
MST




[Qemu-devel] [PULL 11/33] virtio-9p: add parentheses to sizeof operator

2016-10-09 Thread Michael S. Tsirkin
From: Greg Kurz 

Signed-off-by: Greg Kurz 
Reviewed-by: Cornelia Huck 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/9pfs/virtio-9p-device.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 009b43f..e7ea0e4 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -57,12 +57,12 @@ static void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
 }
 
 BUG_ON(elem->out_num == 0 || elem->in_num == 0);
-QEMU_BUILD_BUG_ON(sizeof out != 7);
+QEMU_BUILD_BUG_ON(sizeof(out) != 7);
 
 v->elems[pdu->idx] = elem;
 len = iov_to_buf(elem->out_sg, elem->out_num, 0,
- &out, sizeof out);
-BUG_ON(len != sizeof out);
+ &out, sizeof(out));
+BUG_ON(len != sizeof(out));
 
 pdu->size = le32_to_cpu(out.size_le);
 
-- 
MST




[Qemu-devel] [PULL 08/33] virtio: add virtio_detach_element()

2016-10-09 Thread Michael S. Tsirkin
From: Stefan Hajnoczi 

During device reset or similar situations a VirtQueueElement needs to be
freed without pushing it onto the used ring or rewinding the virtqueue.
Extract a new function to do this.

Later patches add virtio_detach_element() calls to existing devices so
that scatter-gather lists are unmapped and vq->inuse goes back to zero
during device reset.  Currently some devices don't bother and simply
call g_free(elem) which is not a clean way to throw away a
VirtQueueElement.
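
The difference between detaching and discarding an element can be shown with a toy model of the two counters involved. This is a sketch under assumptions, not the virtio.c code: struct toy_vq and the toy_* functions are hypothetical, the scatter-gather unmapping is left out entirely, and toy_pop() assumes that popping an element advances both last_avail_idx and inuse.

```c
#include <assert.h>

/* Toy virtqueue that tracks only the two counters of interest. */
struct toy_vq {
    unsigned int last_avail_idx;  /* next available-ring entry to fetch */
    unsigned int inuse;           /* elements popped but not yet returned */
};

/* Popping an element consumes an available entry and marks it in use. */
static void toy_pop(struct toy_vq *vq)
{
    vq->last_avail_idx++;
    vq->inuse++;
}

/* Detach: the element is thrown away; the guest will not see it come back. */
static void toy_detach(struct toy_vq *vq)
{
    vq->inuse--;
}

/* Discard: pretend the pop never happened, so the next pop refetches it. */
static void toy_discard(struct toy_vq *vq)
{
    vq->last_avail_idx--;
    toy_detach(vq);
}
```

Detach only balances inuse, which is what a device reset needs; discard additionally rewinds last_avail_idx, which is only appropriate when the same element should be popped again.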

Signed-off-by: Stefan Hajnoczi 
Acked-by: Greg Kurz 
Reviewed-by: Ladi Prosek 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 include/hw/virtio/virtio.h |  2 ++
 hw/virtio/virtio.c | 27 +--
 2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 888c8de..e25ec4f 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -155,6 +155,8 @@ void *virtqueue_alloc_element(size_t sz, unsigned out_num, unsigned in_num);
 void virtqueue_push(VirtQueue *vq, const VirtQueueElement *elem,
 unsigned int len);
 void virtqueue_flush(VirtQueue *vq, unsigned int count);
+void virtqueue_detach_element(VirtQueue *vq, const VirtQueueElement *elem,
+  unsigned int len);
 void virtqueue_discard(VirtQueue *vq, const VirtQueueElement *elem,
unsigned int len);
 bool virtqueue_rewind(VirtQueue *vq, unsigned int num);
diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 18ce333..46f79c9 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -264,12 +264,35 @@ static void virtqueue_unmap_sg(VirtQueue *vq, const VirtQueueElement *elem,
   0, elem->out_sg[i].iov_len);
 }
 
+/* virtqueue_detach_element:
+ * @vq: The #VirtQueue
+ * @elem: The #VirtQueueElement
+ * @len: number of bytes written
+ *
+ * Detach the element from the virtqueue.  This function is suitable for device
+ * reset or other situations where a #VirtQueueElement is simply freed and will
+ * not be pushed or discarded.
+ */
+void virtqueue_detach_element(VirtQueue *vq, const VirtQueueElement *elem,
+  unsigned int len)
+{
+vq->inuse--;
+virtqueue_unmap_sg(vq, elem, len);
+}
+
+/* virtqueue_discard:
+ * @vq: The #VirtQueue
+ * @elem: The #VirtQueueElement
+ * @len: number of bytes written
+ *
+ * Pretend the most recent element wasn't popped from the virtqueue.  The next
+ * call to virtqueue_pop() will refetch the element.
+ */
 void virtqueue_discard(VirtQueue *vq, const VirtQueueElement *elem,
unsigned int len)
 {
 vq->last_avail_idx--;
-vq->inuse--;
-virtqueue_unmap_sg(vq, elem, len);
+virtqueue_detach_element(vq, elem, len);
 }
 
 /* virtqueue_rewind:
-- 
MST




[Qemu-devel] [PULL 00/33] virtio, pc: fixes and features

2016-10-09 Thread Michael S. Tsirkin
The following changes since commit 48f592118ab42f83a1a7561c4bfd2b72a100f241:

  bsd-user: fix FreeBSD build after d148d90e (2016-10-07 15:17:53 +0100)

are available in the git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to dea651a95af6dad0997b840241a0bf6059d9a776:

  intel-iommu: Check IOAPIC's Trigger Mode against the one in IRTE (2016-10-10 02:38:14 +0300)


virtio, pc: fixes and features

more guest error handling for virtio devices
virtio migration rework
pc fixes

Signed-off-by: Michael S. Tsirkin 


Daniel P. Berrange (1):
  net: don't poke at chardev internal QemuOpts

Feng Wu (1):
  intel-iommu: Check IOAPIC's Trigger Mode against the one in IRTE

Greg Kurz (9):
  virtio-9p: add parentheses to sizeof operator
  virtio-blk: make some functions static
  virtio-9p: handle handle_9p_output() error
  virtio-blk: handle virtio_blk_handle_request() errors
  virtio-net: handle virtio_net_handle_ctrl() error
  virtio-net: handle virtio_net_receive() errors
  virtio-net: handle virtio_net_flush_tx() errors
  virtio-scsi: convert virtio_scsi_bad_req() to use virtio_error()
  virtio-scsi: handle virtio_scsi_set_config() error

Halil Pasic (12):
  virtio: prepare change VMSTATE_VIRTIO_DEVICE macro
  virtio-blk: convert VMSTATE_VIRTIO_DEVICE
  virtio-net: convert VMSTATE_VIRTIO_DEVICE
  virtio-9p: convert VMSTATE_VIRTIO_DEVICE
  virtio-serial: convert VMSTATE_VIRTIO_DEVICE
  virtio-gpu: convert VMSTATE_VIRTIO_DEVICE
  virtio-input: convert VMSTATE_VIRTIO_DEVICE
  virtio-scsi: convert VMSTATE_VIRTIO_DEVICE
  virtio-balloon: convert VMSTATE_VIRTIO_DEVICE
  virtio-rng: convert VMSTATE_VIRTIO_DEVICE
  vhost-vsock: convert VMSTATE_VIRTIO_DEVICE
  virtio: cleanup VMSTATE_VIRTIO_DEVICE

Igor Mammedov (4):
  numa: reduce code duplication by adding helper numa_get_node_for_cpu()
  acpi: provide _PXM method for CPU devices if QEMU is started numa enabled
  tests: acpi: extend cphp testcase with numa check
  tests: acpi tables expected blobs update

Liang Li (1):
  virtio-balloon: Remove needless precompiled directive

Sascha Silbe (2):
  virtio-serial: add plumbing for virtio console emergency write support
  virtio-serial: enable virtio console emergency write feature

Stefan Hajnoczi (3):
  virtio: add virtio_detach_element()
  virtio-blk: add missing virtio_detach_element() call
  virtio-serial: add missing virtio_detach_element() call

 include/hw/compat.h|   4 ++
 include/hw/virtio/virtio-blk.h |   8 ---
 include/hw/virtio/virtio-serial.h  |   2 +
 include/hw/virtio/virtio.h |  29 ---
 include/sysemu/char.h  |  21 +++-
 include/sysemu/numa.h  |   3 ++
 hmp.c  |   1 +
 hw/9pfs/virtio-9p-device.c |  45 -
 hw/acpi/cpu.c  |  12 +
 hw/arm/virt-acpi-build.c   |   6 +--
 hw/arm/virt.c  |   7 ++-
 hw/block/virtio-blk.c  |  72 +++---
 hw/char/virtio-serial-bus.c|  79 +
 hw/display/virtio-gpu.c|  39 ++-
 hw/i386/acpi-build.c   |   7 +--
 hw/i386/intel_iommu.c  |  12 +
 hw/i386/pc.c   |   8 ++-
 hw/input/virtio-input.c|  21 
 hw/net/virtio-net.c| 100 +
 hw/ppc/spapr_cpu_core.c|   6 +--
 hw/scsi/virtio-scsi.c  |  77 
 hw/virtio/vhost-vsock.c|  42 
 hw/virtio/virtio-balloon.c |  17 ---
 hw/virtio/virtio-rng.c |  19 ---
 hw/virtio/virtio.c |  44 ++--
 net/colo-compare.c |  30 +--
 net/vhost-user.c   |  41 +++
 numa.c |  12 +
 qemu-char.c|  22 ++--
 tests/bios-tables-test.c   |   6 ++-
 tests/acpi-test-data/pc/DSDT.cphp  | Bin 6435 -> 6471 bytes
 tests/acpi-test-data/pc/SRAT.cphp  | Bin 0 -> 304 bytes
 tests/acpi-test-data/q35/DSDT.cphp | Bin 9197 -> 9233 bytes
 tests/acpi-test-data/q35/SRAT.cphp | Bin 0 -> 304 bytes
 34 files changed, 484 insertions(+), 308 deletions(-)
 create mode 100644 tests/acpi-test-data/pc/SRAT.cphp
 create mode 100644 tests/acpi-test-data/q35/SRAT.cphp




[Qemu-devel] [PULL 05/33] acpi: provide _PXM method for CPU devices if QEMU is started numa enabled

2016-10-09 Thread Michael S. Tsirkin
From: Igor Mammedov 

Work around a long-standing issue where the Linux kernel
assigns a hotplugged CPU to the first NUMA node, because it discards
the proximity information for possible CPUs from SRAT after parsing it.

The _PXM method lets Linux query the proximity directly from the
hotplugged CPU object, so that it can assign the CPU
to the correct NUMA node.

Signed-off-by: Igor Mammedov 
Reviewed-by: Marcel Apfelbaum 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/acpi/cpu.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
index c13b65c..902f5c9 100644
--- a/hw/acpi/cpu.c
+++ b/hw/acpi/cpu.c
@@ -4,6 +4,7 @@
 #include "qapi/error.h"
 #include "qapi-event.h"
 #include "trace.h"
+#include "sysemu/numa.h"
 
 #define ACPI_CPU_HOTPLUG_REG_LEN 12
 #define ACPI_CPU_SELECTOR_OFFSET_WR 0
@@ -503,6 +504,7 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
 
 /* build Processor object for each processor */
 for (i = 0; i < arch_ids->len; i++) {
+int j;
 Aml *dev;
 Aml *uid = aml_int(i);
 GArray *madt_buf = g_array_new(0, 1, 1);
@@ -546,6 +548,16 @@ void build_cpus_aml(Aml *table, MachineState *machine, CPUHotplugFeatures opts,
   aml_arg(1), aml_arg(2))
 );
 aml_append(dev, method);
+
+/* Linux guests discard SRAT info for non-present CPUs
+ * as a result _PXM is required for all CPUs which might
+ * be hot-plugged. For simplicity, add it for all CPUs.
+ */
+j = numa_get_node_for_cpu(i);
+if (j < nb_numa_nodes) {
+aml_append(dev, aml_name_decl("_PXM", aml_int(j)));
+}
+
 aml_append(cpus_dev, dev);
 }
 }
-- 
MST




[Qemu-devel] [PULL 23/33] virtio-net: convert VMSTATE_VIRTIO_DEVICE

2016-10-09 Thread Michael S. Tsirkin
From: Halil Pasic 

Use the new VMSTATE_VIRTIO_DEVICE macro.

Signed-off-by: Halil Pasic 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 hw/net/virtio-net.c | 42 +-
 1 file changed, 21 insertions(+), 21 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index ca1b469..b2198a5 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -11,6 +11,8 @@
  *
  */
 
+#define VMSTATE_VIRTIO_DEVICE_USE_NEW
+
 #include "qemu/osdep.h"
 #include "qemu/iov.h"
 #include "hw/virtio/virtio.h"
@@ -1514,17 +1516,6 @@ static void virtio_net_set_multiqueue(VirtIONet *n, int multiqueue)
 virtio_net_set_queues(n);
 }
 
-static void virtio_net_save(QEMUFile *f, void *opaque, size_t size)
-{
-VirtIONet *n = opaque;
-VirtIODevice *vdev = VIRTIO_DEVICE(n);
-
-/* At this point, backend must be stopped, otherwise
- * it might keep writing to memory. */
-assert(!n->vhost_started);
-virtio_save(vdev, f);
-}
-
 static void virtio_net_save_device(VirtIODevice *vdev, QEMUFile *f)
 {
 VirtIONet *n = VIRTIO_NET(vdev);
@@ -1560,14 +1551,6 @@ static void virtio_net_save_device(VirtIODevice *vdev, QEMUFile *f)
 }
 }
 
-static int virtio_net_load(QEMUFile *f, void *opaque, size_t size)
-{
-VirtIONet *n = opaque;
-VirtIODevice *vdev = VIRTIO_DEVICE(n);
-
-return virtio_load(vdev, f, VIRTIO_NET_VM_VERSION);
-}
-
 static int virtio_net_load_device(VirtIODevice *vdev, QEMUFile *f,
   int version_id)
 {
@@ -1870,8 +1853,25 @@ static void virtio_net_instance_init(Object *obj)
   DEVICE(n), NULL);
 }
 
-VMSTATE_VIRTIO_DEVICE(net, VIRTIO_NET_VM_VERSION, virtio_net_load,
-  virtio_net_save);
+static void virtio_net_pre_save(void *opaque)
+{
+VirtIONet *n = opaque;
+
+/* At this point, backend must be stopped, otherwise
+ * it might keep writing to memory. */
+assert(!n->vhost_started);
+}
+
+static const VMStateDescription vmstate_virtio_net = {
+.name = "virtio-net",
+.minimum_version_id = VIRTIO_NET_VM_VERSION,
+.version_id = VIRTIO_NET_VM_VERSION,
+.fields = (VMStateField[]) {
+VMSTATE_VIRTIO_DEVICE,
+VMSTATE_END_OF_LIST()
+},
+.pre_save = virtio_net_pre_save,
+};
 
 static Property virtio_net_properties[] = {
 DEFINE_PROP_BIT("csum", VirtIONet, host_features, VIRTIO_NET_F_CSUM, true),
-- 
MST




[Qemu-devel] [PULL 16/33] virtio-net: handle virtio_net_receive() errors

2016-10-09 Thread Michael S. Tsirkin
From: Greg Kurz 

All these errors are caused by a buggy guest: let's switch the device to
the broken state instead of terminating QEMU. Also we detach the element
from the virtqueue and free it.

Signed-off-by: Greg Kurz 
Reviewed-by: Cornelia Huck 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
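
The pattern described in the commit message above can be sketched in isolation. This is a minimal mock, not QEMU code: `MockDevice`, `MockQueue`, `MockElem`, `mock_device_error()`, `mock_pop()`, `mock_detach()` and `mock_receive()` are hypothetical stand-ins for `VirtIODevice`, `VirtQueue`, `VirtQueueElement`, `virtio_error()`, `virtqueue_pop()`, `virtqueue_detach_element()` and `virtio_net_receive()`. It only illustrates marking the device broken and releasing the element instead of terminating the process.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real QEMU types. */
typedef struct { bool broken; } MockDevice;
typedef struct { int inuse; } MockQueue;
typedef struct { unsigned in_num; } MockElem;

/* Stand-in for virtio_error(): report the problem and mark the device broken. */
static void mock_device_error(MockDevice *dev, const char *msg)
{
    fprintf(stderr, "mock-virtio: %s\n", msg);
    dev->broken = true;
}

/* Stand-in for virtqueue_pop(): hand out an element and count it as in use. */
static MockElem *mock_pop(MockQueue *q, unsigned in_num)
{
    MockElem *elem = malloc(sizeof(*elem));
    if (!elem) {
        return NULL;
    }
    elem->in_num = in_num;
    q->inuse++;
    return elem;
}

/* Stand-in for virtqueue_detach_element(): the element is no longer in flight. */
static void mock_detach(MockQueue *q, MockElem *elem)
{
    (void)elem;
    assert(q->inuse > 0);
    q->inuse--;
}

/* Returns 0 on success, -1 on error; never calls exit(). */
static int mock_receive(MockDevice *dev, MockQueue *q, unsigned guest_in_num)
{
    MockElem *elem = mock_pop(q, guest_in_num);
    if (!elem) {
        return -1;
    }
    if (elem->in_num < 1) {
        /* Guest bug: break the device, then release the element. */
        mock_device_error(dev, "receive queue contains no in buffers");
        mock_detach(q, elem);
        free(elem);
        return -1;
    }
    /* Normal path, simplified: the element is consumed and released. */
    mock_detach(q, elem);
    free(elem);
    return 0;
}
```

In this sketch a bad element leaves the in-use count balanced and the device flagged as broken, so the caller can keep running and simply stop servicing that device.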
 hw/net/virtio-net.c | 27 +++
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index a1584e1..5c0b2e0 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -1130,21 +1130,24 @@ static ssize_t virtio_net_receive(NetClientState *nc, const uint8_t *buf, size_t
 
 elem = virtqueue_pop(q->rx_vq, sizeof(VirtQueueElement));
 if (!elem) {
-if (i == 0)
-return -1;
-error_report("virtio-net unexpected empty queue: "
- "i %zd mergeable %d offset %zd, size %zd, "
- "guest hdr len %zd, host hdr len %zd "
- "guest features 0x%" PRIx64,
- i, n->mergeable_rx_bufs, offset, size,
- n->guest_hdr_len, n->host_hdr_len,
- vdev->guest_features);
-exit(1);
+if (i) {
+virtio_error(vdev, "virtio-net unexpected empty queue: "
+ "i %zd mergeable %d offset %zd, size %zd, "
+ "guest hdr len %zd, host hdr len %zd "
+ "guest features 0x%" PRIx64,
+ i, n->mergeable_rx_bufs, offset, size,
+ n->guest_hdr_len, n->host_hdr_len,
+ vdev->guest_features);
+}
+return -1;
 }
 
 if (elem->in_num < 1) {
-error_report("virtio-net receive queue contains no in buffers");
-exit(1);
+virtio_error(vdev,
+ "virtio-net receive queue contains no in buffers");
+virtqueue_detach_element(q->rx_vq, elem, 0);
+g_free(elem);
+return -1;
 }
 
 sg = elem->in_sg;
-- 
MST




[Qemu-devel] [PULL 03/33] virtio-serial: enable virtio console emergency write feature

2016-10-09 Thread Michael S. Tsirkin
From: Sascha Silbe 

Add support for enabling the virtio 1.0 "emergency write"
(VIRTIO_CONSOLE_F_EMERG_WRITE) feature. The previous patch introduced
the plumbing required for this; now we expose the virtio feature to
the guest. The feature is disabled for compatibility machines to avoid
exposing a new feature to existing guests.

As required by the virtio 1.0 spec, the emergency write functionality
is available to the guest even if the guest doesn't negotiate the
feature, as well as before feature negotiation.

Reviewed-by: Cornelia Huck 
Signed-off-by: Sascha Silbe 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
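
The size selection this patch performs in virtio_serial_device_realize() can be illustrated on its own. The sketch below uses a local mirror of the console config layout (`struct console_config`) and a hypothetical helper `console_config_size()`; neither name is from QEMU. The feature bit number 2 is the value the virtio 1.0 spec assigns to VIRTIO_CONSOLE_F_EMERG_WRITE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the virtio 1.0 console configuration layout. */
struct console_config {
    uint16_t cols;
    uint16_t rows;
    uint32_t max_nr_ports;
    uint32_t emerg_wr;
};

/* Bit number of VIRTIO_CONSOLE_F_EMERG_WRITE in the virtio 1.0 spec. */
#define F_EMERG_WRITE_BIT 2

/* Hypothetical helper: how many config bytes to expose to the guest. */
static size_t console_config_size(uint64_t host_features)
{
    /* The sketch relies on the struct having no padding. */
    assert(sizeof(struct console_config) == 12);

    if (host_features & (UINT64_C(1) << F_EMERG_WRITE_BIT)) {
        /* Feature offered: expose the whole structure, emerg_wr included. */
        return sizeof(struct console_config);
    }
    /* Feature not offered: stop just before emerg_wr, as older QEMU did. */
    return offsetof(struct console_config, emerg_wr);
}
```

With the feature bit clear the helper yields 8 bytes, which matches the old hard-coded offsetof() and keeps compatibility machine types unchanged; with the bit set it yields 12.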
 include/hw/compat.h   |  4 
 include/hw/virtio/virtio-serial.h |  2 ++
 hw/char/virtio-serial-bus.c   | 12 +---
 3 files changed, 15 insertions(+), 3 deletions(-)

diff --git a/include/hw/compat.h b/include/hw/compat.h
index 46412b2..ef3fae3 100644
--- a/include/hw/compat.h
+++ b/include/hw/compat.h
@@ -7,6 +7,10 @@
 .property = "page-per-vq",\
 .value= "on",\
 },{\
+.driver   = "virtio-serial-device",\
+.property = "emergency-write",\
+.value= "off",\
+},{\
 .driver   = "ioapic",\
 .property = "version",\
 .value= "0x11",\
diff --git a/include/hw/virtio/virtio-serial.h b/include/hw/virtio/virtio-serial.h
index 730c88d..b19c447 100644
--- a/include/hw/virtio/virtio-serial.h
+++ b/include/hw/virtio/virtio-serial.h
@@ -184,6 +184,8 @@ struct VirtIOSerial {
 struct VirtIOSerialPostLoad *post_load;
 
 virtio_serial_conf serial;
+
+uint64_t host_features;
 };
 
 /* Interface to the virtio-serial bus */
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index 57419b2..db2a9f1 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -541,6 +541,7 @@ static uint64_t get_features(VirtIODevice *vdev, uint64_t features,
 
 vser = VIRTIO_SERIAL(vdev);
 
+features |= vser->host_features;
 if (vser->bus.max_nr_ports > 1) {
 virtio_add_feature(&features, VIRTIO_CONSOLE_F_MULTIPORT);
 }
@@ -1003,6 +1004,7 @@ static void virtio_serial_device_realize(DeviceState *dev, Error **errp)
 VirtIODevice *vdev = VIRTIO_DEVICE(dev);
 VirtIOSerial *vser = VIRTIO_SERIAL(dev);
 uint32_t i, max_supported_ports;
+size_t config_size = sizeof(struct virtio_console_config);
 
 if (!vser->serial.max_virtserial_ports) {
 error_setg(errp, "Maximum number of serial ports not specified");
@@ -1017,10 +1019,12 @@ static void virtio_serial_device_realize(DeviceState *dev, Error **errp)
 return;
 }
 
-/* We don't support emergency write, skip it for now. */
-/* TODO: cleaner fix, depending on host features. */
+if (!virtio_has_feature(vser->host_features,
+VIRTIO_CONSOLE_F_EMERG_WRITE)) {
+config_size = offsetof(struct virtio_console_config, emerg_wr);
+}
 virtio_init(vdev, "virtio-serial", VIRTIO_ID_CONSOLE,
-offsetof(struct virtio_console_config, emerg_wr));
+config_size);
 
 /* Spawn a new virtio-serial bus on which the ports will ride as devices */
 qbus_create_inplace(&vser->bus, sizeof(vser->bus), TYPE_VIRTIO_SERIAL_BUS,
@@ -1116,6 +1120,8 @@ VMSTATE_VIRTIO_DEVICE(console, 3, virtio_serial_load, virtio_vmstate_save);
 static Property virtio_serial_properties[] = {
 DEFINE_PROP_UINT32("max_ports", VirtIOSerial, serial.max_virtserial_ports,
   31),
+DEFINE_PROP_BIT64("emergency-write", VirtIOSerial, host_features,
+  VIRTIO_CONSOLE_F_EMERG_WRITE, true),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
MST




[Qemu-devel] [PULL for-2.0 1/3] Detect pthread_setname_np at configure time

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

Warn if no way of setting thread name is available.

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
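
The intent of the configure probe can be sketched as a small wrapper that compiles with or without the feature macro. This is an illustration under assumptions, not the patch's code: `try_set_thread_name()` is a hypothetical name, and CONFIG_PTHREAD_SETNAME_NP is assumed to arrive from the build system (here it is simply undefined unless passed with -D). The call shown is glibc's two-argument pthread_setname_np().

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* glibc declares pthread_setname_np only with this */
#endif
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical wrapper: returns true only if the name was actually set. */
static bool try_set_thread_name(pthread_t thread, const char *name)
{
    assert(name != NULL);
#ifdef CONFIG_PTHREAD_SETNAME_NP
    /* glibc form: returns 0 on success, an error number otherwise. */
    return pthread_setname_np(thread, name) == 0;
#else
    /* No supported way to name threads on this host. Naming is only a
     * debugging aid, so report failure instead of aborting. */
    (void)thread;
    return false;
#endif
}
```

Keeping the #ifdef inside one helper means callers never need their own platform checks, which is the same design choice the patch makes with qemu_thread_set_name().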
 configure| 28 
 util/qemu-thread-posix.c | 21 ++---
 util/qemu-thread-win32.c |  2 ++
 3 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index aae617e..01e637f 100755
--- a/configure
+++ b/configure
@@ -2696,6 +2696,24 @@ if test "$mingw32" != yes -a "$pthread" = no; then
   "Make sure to have the pthread libs and headers installed."
 fi
 
+# check for pthread_setname_np
+pthread_setname_np=no
+cat > $TMPC << EOF
+#include <pthread.h>
+
+static void *f(void *p) { return NULL; }
+int main(void)
+{
+pthread_t thread;
+pthread_create(&thread, 0, f, 0);
+pthread_setname_np(thread, "QEMU");
+return 0;
+}
+EOF
+if compile_prog "" "$pthread_lib" ; then
+  pthread_setname_np=yes
+fi
+
 ##
 # rbd probe
 if test "$rbd" != "no" ; then
@@ -4628,6 +4646,16 @@ if test "$rdma" = "yes" ; then
   echo "CONFIG_RDMA=y" >> $config_host_mak
 fi
 
+# Hold two types of flag:
+#   CONFIG_THREAD_SETNAME_BYTHREAD  - we've got a way of setting the name on
+# a thread we have a handle to
+#   CONFIG_PTHREAD_SETNAME_NP   - A way of doing it on a particular
+# platform
+if test "$pthread_setname_np" = "yes" ; then
+  echo "CONFIG_THREAD_SETNAME_BYTHREAD=y" >> $config_host_mak
+  echo "CONFIG_PTHREAD_SETNAME_NP=y" >> $config_host_mak
+fi
+
 if test "$tcg_interpreter" = "yes"; then
   QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
 elif test "$ARCH" = "sparc64" ; then
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 960d7f5..d05a649 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -32,6 +32,13 @@ static bool name_threads;
 void qemu_thread_naming(bool enable)
 {
 name_threads = enable;
+
+#ifndef CONFIG_THREAD_SETNAME_BYTHREAD
+/* This is a debugging option, not fatal */
+if (enable) {
+fprintf(stderr, "qemu: thread naming not supported on this host\n");
+}
+#endif
 }
 
 static void error_exit(int err, const char *msg)
@@ -394,6 +401,16 @@ void qemu_event_wait(QemuEvent *ev)
 }
 }
 
+/* Attempt to set the threads name; note that this is for debug, so
+ * we're not going to fail if we can't set it.
+ */
+static void qemu_thread_set_name(QemuThread *thread, const char *name)
+{
+#ifdef CONFIG_PTHREAD_SETNAME_NP
+pthread_setname_np(thread->thread, name);
+#endif
+}
+
 void qemu_thread_create(QemuThread *thread, const char *name,
void *(*start_routine)(void*),
void *arg, int mode)
@@ -420,11 +437,9 @@ void qemu_thread_create(QemuThread *thread, const char *name,
 if (err)
 error_exit(err, __func__);
 
-#if defined(__GLIBC__) && (__GLIBC__ > 2 || (__GLIBC__ == 2 && __GLIBC_MINOR__ >= 12))
 if (name_threads) {
-pthread_setname_np(thread->thread, name);
+qemu_thread_set_name(thread, name);
 }
-#endif
 
 pthread_sigmask(SIG_SETMASK, &oldset, NULL);
 
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index b9c957b..c405c9b 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -22,6 +22,8 @@ void qemu_thread_naming(bool enable)
 {
 /* But note we don't actually name them on Windows yet */
 name_threads = enable;
+
+fprintf(stderr, "qemu: thread naming not supported on this host\n");
 }
 
 static void error_exit(int err, const char *msg)
-- 
MST




[Qemu-devel] [PULL 09/33] virtio-blk: add missing virtio_detach_element() call

2016-10-09 Thread Michael S. Tsirkin
From: Stefan Hajnoczi 

Make sure to unmap the scatter-gather list and decrement vq->inuse
before freeing requests in virtio_blk_reset().

Signed-off-by: Stefan Hajnoczi 
Reviewed-by: Ladi Prosek 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
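
Why the missing call matters can be shown with a toy model of the in-flight accounting. Everything below is a hypothetical mock (`MockVQ`, `MockReq`, `MockBlk`, `mock_queue_request()`, `mock_detach()`, `mock_blk_reset()`); it models only the `inuse` counter, not the scatter-gather unmapping that the real virtqueue_detach_element() also performs.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for VirtQueue, VirtIOBlockReq and VirtIOBlock. */
typedef struct MockVQ { unsigned inuse; } MockVQ;
typedef struct MockReq { MockVQ *vq; struct MockReq *next; } MockReq;
typedef struct MockBlk { MockReq *rq; } MockBlk;

/* Park a request on the retry list; it still counts as in flight. */
static void mock_queue_request(MockBlk *s, MockVQ *vq)
{
    MockReq *req = calloc(1, sizeof(*req));
    assert(req != NULL);
    req->vq = vq;
    vq->inuse++;
    req->next = s->rq;
    s->rq = req;
}

/* Stand-in for virtqueue_detach_element(): drop the in-flight count. */
static void mock_detach(MockVQ *vq)
{
    assert(vq->inuse > 0);
    vq->inuse--;
}

/* Reset: detach every parked request before freeing it. */
static void mock_blk_reset(MockBlk *s)
{
    while (s->rq) {
        MockReq *req = s->rq;
        s->rq = req->next;
        mock_detach(req->vq); /* without this, inuse would never drop */
        free(req);
    }
}
```

If the detach step were omitted in this model, the counter would stay at its old value after reset even though no request remains, which is the leak the patch closes.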
 hw/block/virtio-blk.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 3a6112f..c7ca4d6 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -665,6 +665,7 @@ static void virtio_blk_reset(VirtIODevice *vdev)
 while (s->rq) {
 req = s->rq;
 s->rq = req->next;
+virtqueue_detach_element(req->vq, &req->elem, 0);
 virtio_blk_free_request(req);
 }
 
-- 
MST




[Qemu-devel] [PULL v3 07/14] Add 'debug-threads' suboption to --name

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

Add flag storage to qemu-thread-* to store the name_threads flag.

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
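
The flag plumbing added here is small enough to model directly. In the sketch below, `mock_thread_naming()` mirrors the role of qemu_thread_naming(), and `mock_parse_debug_threads()` is a hypothetical helper standing in for the option handling; in QEMU itself the on/off value is parsed by qemu_opt_get_bool(), not by code like this.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Mirrors the static flag the patch adds to qemu-thread-*.c. */
static bool name_threads;

/* Mirrors the role of qemu_thread_naming(): just record the setting. */
static void mock_thread_naming(bool enable)
{
    name_threads = enable;
}

/* Hypothetical helper: apply a debug-threads=on|off value.
 * Returns 0 on success, -1 if the value is not recognized. */
static int mock_parse_debug_threads(const char *value)
{
    assert(value != NULL);
    if (strcmp(value, "on") == 0) {
        mock_thread_naming(true);
        return 0;
    }
    if (strcmp(value, "off") == 0) {
        mock_thread_naming(false);
        return 0;
    }
    return -1; /* leave the flag unchanged on bad input */
}
```

The flag defaults to off because a static bool is zero-initialized, matching the "defaults off" wording in the option help text.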
 include/qemu/thread.h| 1 +
 util/qemu-thread-posix.c | 7 +++
 util/qemu-thread-win32.c | 8 
 vl.c | 9 +
 qemu-options.hx  | 7 +--
 5 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 3e32c65..bf1e110 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -59,5 +59,6 @@ void *qemu_thread_join(QemuThread *thread);
 void qemu_thread_get_self(QemuThread *thread);
 bool qemu_thread_is_self(QemuThread *thread);
 void qemu_thread_exit(void *retval);
+void qemu_thread_naming(bool enable);
 
 #endif
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 37dd298..0fa6c81 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -27,6 +27,13 @@
 #include "qemu/thread.h"
 #include "qemu/atomic.h"
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 fprintf(stderr, "qemu: %s: %s\n", msg, strerror(err));
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 27a5217..e42cb77 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -16,6 +16,14 @@
 #include 
 #include 
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+/* But note we don't actually name them on Windows yet */
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 char *pstr;
diff --git a/vl.c b/vl.c
index 44b5ad3..c8a5bfa 100644
--- a/vl.c
+++ b/vl.c
@@ -495,6 +495,12 @@ static QemuOptsList qemu_name_opts = {
 .name = "process",
 .type = QEMU_OPT_STRING,
 .help = "Sets the name of the QEMU process, as shown in top etc",
+}, {
+.name = "debug-threads",
+.type = QEMU_OPT_BOOL,
+.help = "When enabled, name the individual threads; defaults off.\n"
+"NOTE: The thread names are for debugging and not a\n"
+"stable API.",
 },
 { /* End of list */ }
 },
@@ -954,6 +960,9 @@ static void parse_name(QemuOpts *opts)
 {
 const char *proc_name;
 
+if (qemu_opt_get(opts, "debug-threads")) {
+qemu_thread_naming(qemu_opt_get_bool(opts, "debug-threads", false));
+}
 qemu_name = qemu_opt_get(opts, "guest");
 
 proc_name = qemu_opt_get(opts, "process");
diff --git a/qemu-options.hx b/qemu-options.hx
index 56e5fdf..068da2d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -328,9 +328,11 @@ possible drivers and properties, use @code{-device help} and
 ETEXI
 
 DEF("name", HAS_ARG, QEMU_OPTION_name,
-"-name string1[,process=string2]\n"
+"-name string1[,process=string2][,debug-threads=on|off]\n"
 "set the name of the guest\n"
-"string1 sets the window title and string2 the process name (on Linux)\n",
+"string1 sets the window title and string2 the process name (on Linux)\n"
+"When debug-threads is enabled, individual threads are given a separate name (on Linux)\n"
+"NOTE: The thread names are for debugging and not a stable API.\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -name @var{name}
@@ -339,6 +341,7 @@ Sets the @var{name} of the guest.
 This name will be displayed in the SDL window caption.
 The @var{name} will also be used for the VNC server.
 Also optionally set the top visible process name in Linux.
+Naming of individual threads can also be enabled on Linux to aid debugging.
 ETEXI
 
 DEF("uuid", HAS_ARG, QEMU_OPTION_uuid,
-- 
MST




[Qemu-devel] [PULL 08/14] Add a 'name' parameter to qemu_thread_create

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

If enabled, set the thread name at creation (on GNU systems with
  pthread_setname_np).
Fix up all the callers to pass a thread name.

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
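
The 16-byte VCPU_THREAD_NAME_SIZE buffer used in cpus.c lines up with the Linux limit on thread names (16 bytes including the terminating NUL). The sketch below shows how snprintf() behaves at that boundary; `format_vcpu_thread_name()` is a hypothetical helper, not a function from the patch, which calls snprintf() inline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Same size the patch uses for its temporary name buffers. */
#define VCPU_THREAD_NAME_SIZE 16

/* Hypothetical helper: build "CPU <index>/<accel>" in buf.
 * Returns true if the whole name fit, false if it was truncated. */
static bool format_vcpu_thread_name(char *buf, size_t size,
                                    int cpu_index, const char *accel)
{
    assert(buf != NULL && size > 0);
    int n = snprintf(buf, size, "CPU %d/%s", cpu_index, accel);
    /* snprintf returns the length it wanted to write, excluding the NUL. */
    return n >= 0 && (size_t)n < size;
}
```

For example, index 3 with "KVM" fits easily, while index 100000 with "DUMMY" needs 16 characters plus the NUL and is cut to "CPU 100000/DUMM"; snprintf() still NUL-terminates, so the buffer stays safe to pass on.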
 include/qemu/thread.h   |  2 +-
 cpus.c  | 25 -
 hw/block/dataplane/virtio-blk.c |  2 +-
 hw/usb/ccid-card-emulated.c |  8 
 libcacard/vscclient.c   |  2 +-
 migration.c |  2 +-
 thread-pool.c   |  2 +-
 ui/vnc-jobs.c   |  3 ++-
 util/compatfd.c |  3 ++-
 util/qemu-thread-posix.c|  9 +++--
 util/qemu-thread-win32.c|  2 +-
 11 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index bf1e110..f7e3b9b 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -52,7 +52,7 @@ void qemu_event_reset(QemuEvent *ev);
 void qemu_event_wait(QemuEvent *ev);
 void qemu_event_destroy(QemuEvent *ev);
 
-void qemu_thread_create(QemuThread *thread,
+void qemu_thread_create(QemuThread *thread, const char *name,
 void *(*start_routine)(void *),
 void *arg, int mode);
 void *qemu_thread_join(QemuThread *thread);
diff --git a/cpus.c b/cpus.c
index 945d85b..b6421fd 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1117,8 +1117,13 @@ void resume_all_vcpus(void)
 }
 }
 
+/* For temporary buffers for forming a name */
+#define VCPU_THREAD_NAME_SIZE 16
+
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 tcg_cpu_address_space_init(cpu, cpu->as);
 
 /* share a single thread for all cpus with TCG */
@@ -1127,8 +1132,10 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
 tcg_halt_cond = cpu->halt_cond;
-qemu_thread_create(cpu->thread, qemu_tcg_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 #ifdef _WIN32
 cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
@@ -1144,11 +1151,15 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
 
 static void qemu_kvm_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_kvm_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
 }
@@ -1156,10 +1167,14 @@ static void qemu_kvm_start_vcpu(CPUState *cpu)
 
 static void qemu_dummy_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_dummy_cpu_thread_fn, cpu,
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn, cpu,
QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 2237edb..d1c7ad4 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -358,7 +358,7 @@ static void start_data_plane_bh(void *opaque)
 
 qemu_bh_delete(s->start_bh);
 s->start_bh = NULL;
-qemu_thread_create(&s->thread, data_plane_thread,
+qemu_thread_create(&s->thread, "data_plane", data_plane_thread,
s, QEMU_THREAD_JOINABLE);
 }
 
diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
index aa913df..7213c89 100644
--- a/hw/usb/ccid-card-emulated.c
+++ b/hw/usb/ccid-card-emulated.c
@@ -546,10 +546,10 @@ static int emulated_initfn(CCIDCardState *base)
 printf("%s: failed to initialize vcard\n", EMULATED_DEV_NAME);
 return -1;
 }
-qemu_thread_create(&card->event_thread_id, event_thread, card,
-   QEMU_THREAD_JOINABLE);
-qemu_thread_create(&card->apdu_thread_id, handle_apdu_thread, card,
-   QEMU_THREAD_JOINABLE);
+

[Qemu-devel] [PULL 06/33] tests: acpi: extend cphp testcase with numa check

2016-10-09 Thread Michael S. Tsirkin
From: Igor Mammedov 

so it would be possible to verify _PXM generation in
DSDT and SRAT tables.

Signed-off-by: Igor Mammedov 
Reviewed-by: Marcel Apfelbaum 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
 tests/bios-tables-test.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tests/bios-tables-test.c b/tests/bios-tables-test.c
index 7e27ea9..6ea2b6d 100644
--- a/tests/bios-tables-test.c
+++ b/tests/bios-tables-test.c
@@ -811,7 +811,8 @@ static void test_acpi_piix4_tcg_cphp(void)
 memset(&data, 0, sizeof(data));
 data.machine = MACHINE_PC;
 data.variant = ".cphp";
-test_acpi_one("-smp 2,cores=3,sockets=2,maxcpus=6",
+test_acpi_one("-smp 2,cores=3,sockets=2,maxcpus=6"
+  " -numa node -numa node",
   &data);
 free_test_data(&data);
 }
@@ -823,7 +824,8 @@ static void test_acpi_q35_tcg_cphp(void)
 memset(&data, 0, sizeof(data));
 data.machine = MACHINE_Q35;
 data.variant = ".cphp";
-test_acpi_one(" -smp 2,cores=3,sockets=2,maxcpus=6",
+test_acpi_one(" -smp 2,cores=3,sockets=2,maxcpus=6"
+  " -numa node -numa node",
   &data);
 free_test_data(&data);
 }
-- 
MST




[Qemu-devel] [PATCH for-2.5] iotests: drop thread spun work-around

2016-10-09 Thread Michael S. Tsirkin
We've disabled the warning, so there should be no need for the test to
work around it.

Signed-off-by: Michael S. Tsirkin 
---

This is on top of
main-loop: suppress warnings under qtest

I just tested this by running make check.
Is this enough?

 tests/qemu-iotests/common.filter | 1 -
 1 file changed, 1 deletion(-)

diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index cfdb633..49217b0 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -164,7 +164,6 @@ _filter_qemu()
 {
 sed -e "s#\\(^\\|(qemu) \\)$(basename $QEMU_PROG):#\1QEMU_PROG:#" \
 -e 's#^QEMU [0-9]\+\.[0-9]\+\.[0-9]\+ monitor#QEMU X.Y.Z monitor#' \
--e '/main-loop: WARNING: I\/O thread spun for [0-9]\+ iterations/d' \
 -e $'s#\r##' # QEMU monitor uses \r\n line endings
 }
 
-- 
MST




[Qemu-devel] [PULL 07/14] Add 'debug-threads' suboption to --name

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

Add flag storage to qemu-thread-* to store the name_threads flag.

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
 include/qemu/thread.h| 1 +
 util/qemu-thread-posix.c | 7 +++
 util/qemu-thread-win32.c | 8 
 vl.c | 9 +
 qemu-options.hx  | 7 +--
 5 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 3e32c65..bf1e110 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -59,5 +59,6 @@ void *qemu_thread_join(QemuThread *thread);
 void qemu_thread_get_self(QemuThread *thread);
 bool qemu_thread_is_self(QemuThread *thread);
 void qemu_thread_exit(void *retval);
+void qemu_thread_naming(bool enable);
 
 #endif
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 37dd298..0fa6c81 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -27,6 +27,13 @@
 #include "qemu/thread.h"
 #include "qemu/atomic.h"
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 fprintf(stderr, "qemu: %s: %s\n", msg, strerror(err));
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 27a5217..e42cb77 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -16,6 +16,14 @@
 #include 
 #include 
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+/* But note we don't actually name them on Windows yet */
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 char *pstr;
diff --git a/vl.c b/vl.c
index 44b5ad3..c8a5bfa 100644
--- a/vl.c
+++ b/vl.c
@@ -495,6 +495,12 @@ static QemuOptsList qemu_name_opts = {
 .name = "process",
 .type = QEMU_OPT_STRING,
 .help = "Sets the name of the QEMU process, as shown in top etc",
+}, {
+.name = "debug-threads",
+.type = QEMU_OPT_BOOL,
+.help = "When enabled, name the individual threads; defaults off.\n"
+"NOTE: The thread names are for debugging and not a\n"
+"stable API.",
 },
 { /* End of list */ }
 },
@@ -954,6 +960,9 @@ static void parse_name(QemuOpts *opts)
 {
 const char *proc_name;
 
+if (qemu_opt_get(opts, "debug-threads")) {
+qemu_thread_naming(qemu_opt_get_bool(opts, "debug-threads", false));
+}
 qemu_name = qemu_opt_get(opts, "guest");
 
 proc_name = qemu_opt_get(opts, "process");
diff --git a/qemu-options.hx b/qemu-options.hx
index 56e5fdf..068da2d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -328,9 +328,11 @@ possible drivers and properties, use @code{-device help} and
 ETEXI
 
 DEF("name", HAS_ARG, QEMU_OPTION_name,
-"-name string1[,process=string2]\n"
+"-name string1[,process=string2][,debug-threads=on|off]\n"
 "set the name of the guest\n"
-"string1 sets the window title and string2 the process name (on Linux)\n",
+"string1 sets the window title and string2 the process name (on Linux)\n"
+"When debug-threads is enabled, individual threads are given a separate name (on Linux)\n"
+"NOTE: The thread names are for debugging and not a stable API.\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -name @var{name}
@@ -339,6 +341,7 @@ Sets the @var{name} of the guest.
 This name will be displayed in the SDL window caption.
 The @var{name} will also be used for the VNC server.
 Also optionally set the top visible process name in Linux.
+Naming of individual threads can also be enabled on Linux to aid debugging.
 ETEXI
 
 DEF("uuid", HAS_ARG, QEMU_OPTION_uuid,
-- 
MST




[Qemu-devel] [PULL 02/33] virtio-serial: add plumbing for virtio console emergency write support

2016-10-09 Thread Michael S. Tsirkin
From: Sascha Silbe 

Add the infrastructure required for the virtio 1.0 "emergency write"
(VIRTIO_CONSOLE_F_EMERG_WRITE) feature. Because we don't touch the
size of the configuration area, guests will not be able to actually
make use of this without further patches.

Reviewed-by: Cornelia Huck 
Signed-off-by: Sascha Silbe 
Reviewed-by: Michael S. Tsirkin 
Signed-off-by: Michael S. Tsirkin 
---
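
The core of the new set_config() handler is: read the little-endian emerg_wr field, forward its low byte, and clear the field so a later short config write is not mistaken for another emergency write. The sketch below isolates that logic on a raw byte buffer. `load_le32()` and `take_emerg_write()` are hypothetical helpers, and the offset 8 assumes the virtio 1.0 console config layout (cols, rows, max_nr_ports, emerg_wr).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Assumed offset of emerg_wr: cols (2) + rows (2) + max_nr_ports (4). */
#define EMERG_WR_OFFSET 8

/* Decode a little-endian 32-bit value regardless of host byte order. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: if an emergency write is pending in the config
 * space, store its low byte in *out, clear the field, and return true. */
static bool take_emerg_write(uint8_t *config, size_t len, uint8_t *out)
{
    assert(config != NULL && out != NULL);
    if (len < EMERG_WR_OFFSET + 4) {
        return false; /* config area too small to contain emerg_wr */
    }
    uint32_t emerg_wr = load_le32(config + EMERG_WR_OFFSET);
    if (!emerg_wr) {
        return false; /* nothing written */
    }
    /* Clear first so a later short config write is not misdetected. */
    memset(config + EMERG_WR_OFFSET, 0, 4);
    *out = (uint8_t)emerg_wr; /* only the low byte is forwarded */
    return true;
}
```

As in the patch, a zero value is treated as "no write", so the byte 0x00 cannot be sent through this path in the sketch.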
 hw/char/virtio-serial-bus.c | 37 +
 1 file changed, 37 insertions(+)

diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index db57a38..57419b2 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -75,6 +75,19 @@ static VirtIOSerialPort *find_port_by_name(char *name)
 return NULL;
 }
 
+static VirtIOSerialPort *find_first_connected_console(VirtIOSerial *vser)
+{
+VirtIOSerialPort *port;
+
+QTAILQ_FOREACH(port, &vser->ports, next) {
+VirtIOSerialPortClass const *vsc = VIRTIO_SERIAL_PORT_GET_CLASS(port);
+if (vsc->is_console && port->host_connected) {
+return port;
+}
+}
+return NULL;
+}
+
 static bool use_multiport(VirtIOSerial *vser)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(vser);
@@ -547,6 +560,29 @@ static void get_config(VirtIODevice *vdev, uint8_t *config_data)
   vser->serial.max_virtserial_ports);
 }
 
+/* Guest sent new config info */
+static void set_config(VirtIODevice *vdev, const uint8_t *config_data)
+{
+VirtIOSerial *vser = VIRTIO_SERIAL(vdev);
+struct virtio_console_config *config =
+(struct virtio_console_config *)config_data;
+uint8_t emerg_wr_lo = le32_to_cpu(config->emerg_wr);
+VirtIOSerialPort *port = find_first_connected_console(vser);
+VirtIOSerialPortClass *vsc;
+
+if (!config->emerg_wr) {
+return;
+}
+/* Make sure we don't misdetect an emergency write when the guest
+ * does a short config write after an emergency write. */
+config->emerg_wr = 0;
+if (!port) {
+return;
+}
+vsc = VIRTIO_SERIAL_PORT_GET_CLASS(port);
+(void)vsc->have_data(port, &emerg_wr_lo, 1);
+}
+
 static void guest_reset(VirtIOSerial *vser)
 {
 VirtIOSerialPort *port;
@@ -1098,6 +1134,7 @@ static void virtio_serial_class_init(ObjectClass *klass, void *data)
 vdc->unrealize = virtio_serial_device_unrealize;
 vdc->get_features = get_features;
 vdc->get_config = get_config;
+vdc->set_config = set_config;
 vdc->set_status = set_status;
 vdc->reset = vser_reset;
 vdc->save = virtio_serial_save_device;
-- 
MST




[Qemu-devel] [PULL v3 08/14] Add a 'name' parameter to qemu_thread_create

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

If enabled, set the thread name at creation (on GNU systems with
  pthread_setname_np).
Fix up all the callers to pass a thread name.

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
 include/qemu/thread.h   |  2 +-
 cpus.c  | 25 -
 hw/block/dataplane/virtio-blk.c |  2 +-
 hw/usb/ccid-card-emulated.c |  8 
 libcacard/vscclient.c   |  2 +-
 migration.c |  2 +-
 thread-pool.c   |  2 +-
 ui/vnc-jobs.c   |  3 ++-
 util/compatfd.c |  3 ++-
 util/qemu-thread-posix.c|  9 +++--
 util/qemu-thread-win32.c|  2 +-
 11 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index bf1e110..f7e3b9b 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -52,7 +52,7 @@ void qemu_event_reset(QemuEvent *ev);
 void qemu_event_wait(QemuEvent *ev);
 void qemu_event_destroy(QemuEvent *ev);
 
-void qemu_thread_create(QemuThread *thread,
+void qemu_thread_create(QemuThread *thread, const char *name,
 void *(*start_routine)(void *),
 void *arg, int mode);
 void *qemu_thread_join(QemuThread *thread);
diff --git a/cpus.c b/cpus.c
index 945d85b..b6421fd 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1117,8 +1117,13 @@ void resume_all_vcpus(void)
 }
 }
 
+/* For temporary buffers for forming a name */
+#define VCPU_THREAD_NAME_SIZE 16
+
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 tcg_cpu_address_space_init(cpu, cpu->as);
 
 /* share a single thread for all cpus with TCG */
@@ -1127,8 +1132,10 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
 tcg_halt_cond = cpu->halt_cond;
-qemu_thread_create(cpu->thread, qemu_tcg_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 #ifdef _WIN32
 cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
@@ -1144,11 +1151,15 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
 
 static void qemu_kvm_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_kvm_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
 }
@@ -1156,10 +1167,14 @@ static void qemu_kvm_start_vcpu(CPUState *cpu)
 
 static void qemu_dummy_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_dummy_cpu_thread_fn, cpu,
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn, cpu,
QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 2237edb..d1c7ad4 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -358,7 +358,7 @@ static void start_data_plane_bh(void *opaque)
 
 qemu_bh_delete(s->start_bh);
 s->start_bh = NULL;
-qemu_thread_create(&s->thread, data_plane_thread,
+qemu_thread_create(&s->thread, "data_plane", data_plane_thread,
s, QEMU_THREAD_JOINABLE);
 }
 
diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
index aa913df..7213c89 100644
--- a/hw/usb/ccid-card-emulated.c
+++ b/hw/usb/ccid-card-emulated.c
@@ -546,10 +546,10 @@ static int emulated_initfn(CCIDCardState *base)
 printf("%s: failed to initialize vcard\n", EMULATED_DEV_NAME);
 return -1;
 }
-qemu_thread_create(&card->event_thread_id, event_thread, card,
-   QEMU_THREAD_JOINABLE);
-qemu_thread_create(&card->apdu_thread_id, handle_apdu_thread, card,
-   QEMU_THREAD_JOINABLE);
+

[Qemu-devel] [PULL 08/16] Add 'debug-threads' suboption to --name

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

Add flag storage to qemu-thread-* to store the name_threads flag.

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
 include/qemu/thread.h| 1 +
 util/qemu-thread-posix.c | 7 +++
 util/qemu-thread-win32.c | 8 
 vl.c | 9 +
 qemu-options.hx  | 7 +--
 5 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 3e32c65..bf1e110 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -59,5 +59,6 @@ void *qemu_thread_join(QemuThread *thread);
 void qemu_thread_get_self(QemuThread *thread);
 bool qemu_thread_is_self(QemuThread *thread);
 void qemu_thread_exit(void *retval);
+void qemu_thread_naming(bool enable);
 
 #endif
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 37dd298..0fa6c81 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -27,6 +27,13 @@
 #include "qemu/thread.h"
 #include "qemu/atomic.h"
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 fprintf(stderr, "qemu: %s: %s\n", msg, strerror(err));
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 27a5217..e42cb77 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -16,6 +16,14 @@
 #include 
 #include 
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+/* But note we don't actually name them on Windows yet */
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 char *pstr;
diff --git a/vl.c b/vl.c
index 44b5ad3..c8a5bfa 100644
--- a/vl.c
+++ b/vl.c
@@ -495,6 +495,12 @@ static QemuOptsList qemu_name_opts = {
 .name = "process",
 .type = QEMU_OPT_STRING,
 .help = "Sets the name of the QEMU process, as shown in top etc",
+}, {
+.name = "debug-threads",
+.type = QEMU_OPT_BOOL,
+.help = "When enabled, name the individual threads; defaults off.\n"
+"NOTE: The thread names are for debugging and not a\n"
+"stable API.",
 },
 { /* End of list */ }
 },
@@ -954,6 +960,9 @@ static void parse_name(QemuOpts *opts)
 {
 const char *proc_name;
 
+if (qemu_opt_get(opts, "debug-threads")) {
+qemu_thread_naming(qemu_opt_get_bool(opts, "debug-threads", false));
+}
 qemu_name = qemu_opt_get(opts, "guest");
 
 proc_name = qemu_opt_get(opts, "process");
diff --git a/qemu-options.hx b/qemu-options.hx
index 56e5fdf..068da2d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -328,9 +328,11 @@ possible drivers and properties, use @code{-device help} and
 ETEXI
 
 DEF("name", HAS_ARG, QEMU_OPTION_name,
-"-name string1[,process=string2]\n"
+"-name string1[,process=string2][,debug-threads=on|off]\n"
 "set the name of the guest\n"
-"string1 sets the window title and string2 the process name (on Linux)\n",
+"string1 sets the window title and string2 the process name (on Linux)\n"
+"When debug-threads is enabled, individual threads are given a separate name (on Linux)\n"
+"NOTE: The thread names are for debugging and not a stable API.\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -name @var{name}
@@ -339,6 +341,7 @@ Sets the @var{name} of the guest.
 This name will be displayed in the SDL window caption.
 The @var{name} will also be used for the VNC server.
 Also optionally set the top visible process name in Linux.
+Naming of individual threads can also be enabled on Linux to aid debugging.
 ETEXI
 
 DEF("uuid", HAS_ARG, QEMU_OPTION_uuid,
-- 
MST
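
For readers following the series: the flag stored above only becomes useful together with the companion patch that adds a 'name' parameter to qemu_thread_create(). Below is a minimal, self-contained sketch of how the two pieces might fit together. It is not the code from either patch. format_vcpu_thread_name() and THREAD_NAME_SIZE are hypothetical names chosen for illustration; only qemu_thread_naming(), the name_threads flag and the "CPU %d/KVM" style of name come from the posted diffs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: same value as VCPU_THREAD_NAME_SIZE in the cpus.c hunk. */
#define THREAD_NAME_SIZE 16

/* Mirrors the flag storage added by the patch above. */
static bool name_threads;

void qemu_thread_naming(bool enable)
{
    name_threads = enable;
}

/*
 * Hypothetical helper, not part of the series: format a vCPU thread
 * name into buf and return it, or return NULL when naming is disabled.
 */
const char *format_vcpu_thread_name(char *buf, size_t size,
                                    int cpu_index, const char *accel)
{
    assert(buf != NULL && size > 0);
    if (!name_threads) {
        return NULL;
    }
    /* snprintf() always NUL-terminates and truncates over-long names. */
    snprintf(buf, size, "CPU %d/%s", cpu_index, accel);
    return buf;
}
```

With a 16-byte buffer, a long name such as the one for CPU index 123456 with the "DUMMY" suffix is cut to 15 characters plus the terminating NUL rather than overflowing.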




[Qemu-devel] [PULL 08/12] Add a 'name' parameter to qemu_thread_create

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

If enabled, set the thread name at creation (on GNU systems with
  pthread_setname_np)
Fix up all the callers with a thread name

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
 include/qemu/thread.h   |  2 +-
 cpus.c  | 25 -
 hw/block/dataplane/virtio-blk.c |  2 +-
 hw/usb/ccid-card-emulated.c |  8 
 libcacard/vscclient.c   |  2 +-
 migration.c |  2 +-
 thread-pool.c   |  2 +-
 ui/vnc-jobs.c   |  3 ++-
 util/compatfd.c |  3 ++-
 util/qemu-thread-posix.c|  9 +++--
 util/qemu-thread-win32.c|  2 +-
 11 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index bf1e110..f7e3b9b 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -52,7 +52,7 @@ void qemu_event_reset(QemuEvent *ev);
 void qemu_event_wait(QemuEvent *ev);
 void qemu_event_destroy(QemuEvent *ev);
 
-void qemu_thread_create(QemuThread *thread,
+void qemu_thread_create(QemuThread *thread, const char *name,
 void *(*start_routine)(void *),
 void *arg, int mode);
 void *qemu_thread_join(QemuThread *thread);
diff --git a/cpus.c b/cpus.c
index ca4c59f..9a4ce45 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1117,16 +1117,23 @@ void resume_all_vcpus(void)
 }
 }
 
+/* For temporary buffers for forming a name */
+#define VCPU_THREAD_NAME_SIZE 16
+
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 /* share a single thread for all cpus with TCG */
 if (!tcg_cpu_thread) {
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
 tcg_halt_cond = cpu->halt_cond;
-qemu_thread_create(cpu->thread, qemu_tcg_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 #ifdef _WIN32
 cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
@@ -1142,11 +1149,15 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
 
 static void qemu_kvm_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_kvm_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
 }
@@ -1154,10 +1165,14 @@ static void qemu_kvm_start_vcpu(CPUState *cpu)
 
 static void qemu_dummy_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_dummy_cpu_thread_fn, cpu,
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn, cpu,
QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 456d437..980a684 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -358,7 +358,7 @@ static void start_data_plane_bh(void *opaque)
 
 qemu_bh_delete(s->start_bh);
 s->start_bh = NULL;
-qemu_thread_create(&s->thread, data_plane_thread,
+qemu_thread_create(&s->thread, "data_plane", data_plane_thread,
s, QEMU_THREAD_JOINABLE);
 }
 
diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
index aa913df..7213c89 100644
--- a/hw/usb/ccid-card-emulated.c
+++ b/hw/usb/ccid-card-emulated.c
@@ -546,10 +546,10 @@ static int emulated_initfn(CCIDCardState *base)
 printf("%s: failed to initialize vcard\n", EMULATED_DEV_NAME);
 return -1;
 }
-qemu_thread_create(&card->event_thread_id, event_thread, card,
-   QEMU_THREAD_JOINABLE);
-qemu_thread_create(&card->apdu_thread_id, handle_apdu_thread, card,
-   QEMU_THREAD_JOINABLE);
+qemu_thread_create(&card->event_thread_id, "ccid/event", event_thread,
+ 

[Qemu-devel] [PULL 09/16] Add a 'name' parameter to qemu_thread_create

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

If enabled, set the thread name at creation (on GNU systems with
  pthread_setname_np)
Fix up all the callers with a thread name

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
 include/qemu/thread.h   |  2 +-
 cpus.c  | 25 -
 hw/block/dataplane/virtio-blk.c |  2 +-
 hw/usb/ccid-card-emulated.c |  8 
 libcacard/vscclient.c   |  2 +-
 migration.c |  2 +-
 thread-pool.c   |  2 +-
 ui/vnc-jobs.c   |  3 ++-
 util/compatfd.c |  3 ++-
 util/qemu-thread-posix.c|  9 +++--
 util/qemu-thread-win32.c|  2 +-
 11 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index bf1e110..f7e3b9b 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -52,7 +52,7 @@ void qemu_event_reset(QemuEvent *ev);
 void qemu_event_wait(QemuEvent *ev);
 void qemu_event_destroy(QemuEvent *ev);
 
-void qemu_thread_create(QemuThread *thread,
+void qemu_thread_create(QemuThread *thread, const char *name,
 void *(*start_routine)(void *),
 void *arg, int mode);
 void *qemu_thread_join(QemuThread *thread);
diff --git a/cpus.c b/cpus.c
index 945d85b..b6421fd 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1117,8 +1117,13 @@ void resume_all_vcpus(void)
 }
 }
 
+/* For temporary buffers for forming a name */
+#define VCPU_THREAD_NAME_SIZE 16
+
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 tcg_cpu_address_space_init(cpu, cpu->as);
 
 /* share a single thread for all cpus with TCG */
@@ -1127,8 +1132,10 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
 tcg_halt_cond = cpu->halt_cond;
-qemu_thread_create(cpu->thread, qemu_tcg_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 #ifdef _WIN32
 cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
@@ -1144,11 +1151,15 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
 
 static void qemu_kvm_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_kvm_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
 }
@@ -1156,10 +1167,14 @@ static void qemu_kvm_start_vcpu(CPUState *cpu)
 
 static void qemu_dummy_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_dummy_cpu_thread_fn, cpu,
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn, cpu,
QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 2237edb..d1c7ad4 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -358,7 +358,7 @@ static void start_data_plane_bh(void *opaque)
 
 qemu_bh_delete(s->start_bh);
 s->start_bh = NULL;
-qemu_thread_create(&s->thread, data_plane_thread,
+qemu_thread_create(&s->thread, "data_plane", data_plane_thread,
s, QEMU_THREAD_JOINABLE);
 }
 
diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
index aa913df..7213c89 100644
--- a/hw/usb/ccid-card-emulated.c
+++ b/hw/usb/ccid-card-emulated.c
@@ -546,10 +546,10 @@ static int emulated_initfn(CCIDCardState *base)
 printf("%s: failed to initialize vcard\n", EMULATED_DEV_NAME);
 return -1;
 }
-qemu_thread_create(&card->event_thread_id, event_thread, card,
-   QEMU_THREAD_JOINABLE);
-qemu_thread_create(&card->apdu_thread_id, handle_apdu_thread, card,
-   QEMU_THREAD_JOINABLE);
+

[Qemu-devel] [PULL 07/12] Add 'debug-threads' suboption to --name

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

Add flag storage to qemu-thread-* to store the namethreads flag

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
 include/qemu/thread.h| 1 +
 util/qemu-thread-posix.c | 7 +++
 util/qemu-thread-win32.c | 8 
 vl.c | 9 +
 qemu-options.hx  | 7 +--
 5 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 3e32c65..bf1e110 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -59,5 +59,6 @@ void *qemu_thread_join(QemuThread *thread);
 void qemu_thread_get_self(QemuThread *thread);
 bool qemu_thread_is_self(QemuThread *thread);
 void qemu_thread_exit(void *retval);
+void qemu_thread_naming(bool enable);
 
 #endif
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 37dd298..0fa6c81 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -27,6 +27,13 @@
 #include "qemu/thread.h"
 #include "qemu/atomic.h"
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 fprintf(stderr, "qemu: %s: %s\n", msg, strerror(err));
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 27a5217..e42cb77 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -16,6 +16,14 @@
 #include 
 #include 
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+/* But note we don't actually name them on Windows yet */
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 char *pstr;
diff --git a/vl.c b/vl.c
index 5026e7a..a9b05cc 100644
--- a/vl.c
+++ b/vl.c
@@ -548,6 +548,12 @@ static QemuOptsList qemu_name_opts = {
 .name = "process",
 .type = QEMU_OPT_STRING,
 .help = "Sets the name of the QEMU process, as shown in top etc",
+}, {
+.name = "debug-threads",
+.type = QEMU_OPT_BOOL,
+.help = "When enabled, name the individual threads; defaults off.\n"
+"NOTE: The thread names are for debugging and not a\n"
+"stable API.",
 },
 { /* End of list */ }
 },
@@ -1007,6 +1013,9 @@ static void parse_name(QemuOpts *opts)
 {
 const char *proc_name;
 
+if (qemu_opt_get(opts, "debug-threads")) {
+qemu_thread_naming(qemu_opt_get_bool(opts, "debug-threads", false));
+}
 qemu_name = qemu_opt_get(opts, "guest");
 
 proc_name = qemu_opt_get(opts, "process");
diff --git a/qemu-options.hx b/qemu-options.hx
index 56e5fdf..068da2d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -328,9 +328,11 @@ possible drivers and properties, use @code{-device help} and
 ETEXI
 
 DEF("name", HAS_ARG, QEMU_OPTION_name,
-"-name string1[,process=string2]\n"
+"-name string1[,process=string2][,debug-threads=on|off]\n"
 "set the name of the guest\n"
-"string1 sets the window title and string2 the process name (on Linux)\n",
+"string1 sets the window title and string2 the process name (on Linux)\n"
+"When debug-threads is enabled, individual threads are given a separate name (on Linux)\n"
+"NOTE: The thread names are for debugging and not a stable API.\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -name @var{name}
@@ -339,6 +341,7 @@ Sets the @var{name} of the guest.
 This name will be displayed in the SDL window caption.
 The @var{name} will also be used for the VNC server.
 Also optionally set the top visible process name in Linux.
+Naming of individual threads can also be enabled on Linux to aid debugging.
 ETEXI
 
 DEF("uuid", HAS_ARG, QEMU_OPTION_uuid,
-- 
MST




[Qemu-devel] [PULL 8/8] Add a 'name' parameter to qemu_thread_create

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

If enabled, set the thread name at creation (on GNU systems with
  pthread_setname_np)
Fix up all the callers with a thread name

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
 include/qemu/thread.h   |  2 +-
 cpus.c  | 25 -
 hw/block/dataplane/virtio-blk.c |  2 +-
 hw/usb/ccid-card-emulated.c |  8 
 libcacard/vscclient.c   |  2 +-
 migration.c |  2 +-
 thread-pool.c   |  2 +-
 ui/vnc-jobs.c   |  3 ++-
 util/compatfd.c |  3 ++-
 util/qemu-thread-posix.c|  9 +++--
 util/qemu-thread-win32.c|  2 +-
 11 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index bf1e110..f7e3b9b 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -52,7 +52,7 @@ void qemu_event_reset(QemuEvent *ev);
 void qemu_event_wait(QemuEvent *ev);
 void qemu_event_destroy(QemuEvent *ev);
 
-void qemu_thread_create(QemuThread *thread,
+void qemu_thread_create(QemuThread *thread, const char *name,
 void *(*start_routine)(void *),
 void *arg, int mode);
 void *qemu_thread_join(QemuThread *thread);
diff --git a/cpus.c b/cpus.c
index ca4c59f..9a4ce45 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1117,16 +1117,23 @@ void resume_all_vcpus(void)
 }
 }
 
+/* For temporary buffers for forming a name */
+#define VCPU_THREAD_NAME_SIZE 16
+
 static void qemu_tcg_init_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 /* share a single thread for all cpus with TCG */
 if (!tcg_cpu_thread) {
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
 tcg_halt_cond = cpu->halt_cond;
-qemu_thread_create(cpu->thread, qemu_tcg_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/TCG",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_tcg_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 #ifdef _WIN32
 cpu->hThread = qemu_thread_get_handle(cpu->thread);
 #endif
@@ -1142,11 +1149,15 @@ static void qemu_tcg_init_vcpu(CPUState *cpu)
 
 static void qemu_kvm_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_kvm_cpu_thread_fn, cpu,
-   QEMU_THREAD_JOINABLE);
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/KVM",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_kvm_cpu_thread_fn,
+   cpu, QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
 }
@@ -1154,10 +1165,14 @@ static void qemu_kvm_start_vcpu(CPUState *cpu)
 
 static void qemu_dummy_start_vcpu(CPUState *cpu)
 {
+char thread_name[VCPU_THREAD_NAME_SIZE];
+
 cpu->thread = g_malloc0(sizeof(QemuThread));
 cpu->halt_cond = g_malloc0(sizeof(QemuCond));
 qemu_cond_init(cpu->halt_cond);
-qemu_thread_create(cpu->thread, qemu_dummy_cpu_thread_fn, cpu,
+snprintf(thread_name, VCPU_THREAD_NAME_SIZE, "CPU %d/DUMMY",
+ cpu->cpu_index);
+qemu_thread_create(cpu->thread, thread_name, qemu_dummy_cpu_thread_fn, cpu,
QEMU_THREAD_JOINABLE);
 while (!cpu->created) {
 qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 456d437..980a684 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -358,7 +358,7 @@ static void start_data_plane_bh(void *opaque)
 
 qemu_bh_delete(s->start_bh);
 s->start_bh = NULL;
-qemu_thread_create(&s->thread, data_plane_thread,
+qemu_thread_create(&s->thread, "data_plane", data_plane_thread,
s, QEMU_THREAD_JOINABLE);
 }
 
diff --git a/hw/usb/ccid-card-emulated.c b/hw/usb/ccid-card-emulated.c
index aa913df..7213c89 100644
--- a/hw/usb/ccid-card-emulated.c
+++ b/hw/usb/ccid-card-emulated.c
@@ -546,10 +546,10 @@ static int emulated_initfn(CCIDCardState *base)
 printf("%s: failed to initialize vcard\n", EMULATED_DEV_NAME);
 return -1;
 }
-qemu_thread_create(&card->event_thread_id, event_thread, card,
-   QEMU_THREAD_JOINABLE);
-qemu_thread_create(&card->apdu_thread_id, handle_apdu_thread, card,
-   QEMU_THREAD_JOINABLE);
+qemu_thread_create(&card->event_thread_id, "ccid/event", event_thread,
+ 

[Qemu-devel] [PULL 7/8] Add 'debug-threads' suboption to --name

2016-10-09 Thread Michael S. Tsirkin
From: "Dr. David Alan Gilbert" 

Add flag storage to qemu-thread-* to store the namethreads flag

Signed-off-by: Dr. David Alan Gilbert 
Acked-by: Michael S. Tsirkin 
Reviewed-by: Laszlo Ersek 
---
 include/qemu/thread.h| 1 +
 util/qemu-thread-posix.c | 7 +++
 util/qemu-thread-win32.c | 8 
 vl.c | 9 +
 qemu-options.hx  | 7 +--
 5 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 3e32c65..bf1e110 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -59,5 +59,6 @@ void *qemu_thread_join(QemuThread *thread);
 void qemu_thread_get_self(QemuThread *thread);
 bool qemu_thread_is_self(QemuThread *thread);
 void qemu_thread_exit(void *retval);
+void qemu_thread_naming(bool enable);
 
 #endif
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 37dd298..0fa6c81 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -27,6 +27,13 @@
 #include "qemu/thread.h"
 #include "qemu/atomic.h"
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 fprintf(stderr, "qemu: %s: %s\n", msg, strerror(err));
diff --git a/util/qemu-thread-win32.c b/util/qemu-thread-win32.c
index 27a5217..e42cb77 100644
--- a/util/qemu-thread-win32.c
+++ b/util/qemu-thread-win32.c
@@ -16,6 +16,14 @@
 #include 
 #include 
 
+static bool name_threads;
+
+void qemu_thread_naming(bool enable)
+{
+/* But note we don't actually name them on Windows yet */
+name_threads = enable;
+}
+
 static void error_exit(int err, const char *msg)
 {
 char *pstr;
diff --git a/vl.c b/vl.c
index 5026e7a..a9b05cc 100644
--- a/vl.c
+++ b/vl.c
@@ -548,6 +548,12 @@ static QemuOptsList qemu_name_opts = {
 .name = "process",
 .type = QEMU_OPT_STRING,
 .help = "Sets the name of the QEMU process, as shown in top etc",
+}, {
+.name = "debug-threads",
+.type = QEMU_OPT_BOOL,
+.help = "When enabled, name the individual threads; defaults off.\n"
+"NOTE: The thread names are for debugging and not a\n"
+"stable API.",
 },
 { /* End of list */ }
 },
@@ -1007,6 +1013,9 @@ static void parse_name(QemuOpts *opts)
 {
 const char *proc_name;
 
+if (qemu_opt_get(opts, "debug-threads")) {
+qemu_thread_naming(qemu_opt_get_bool(opts, "debug-threads", false));
+}
 qemu_name = qemu_opt_get(opts, "guest");
 
 proc_name = qemu_opt_get(opts, "process");
diff --git a/qemu-options.hx b/qemu-options.hx
index 56e5fdf..068da2d 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -328,9 +328,11 @@ possible drivers and properties, use @code{-device help} and
 ETEXI
 
 DEF("name", HAS_ARG, QEMU_OPTION_name,
-"-name string1[,process=string2]\n"
+"-name string1[,process=string2][,debug-threads=on|off]\n"
 "set the name of the guest\n"
-"string1 sets the window title and string2 the process name (on Linux)\n",
+"string1 sets the window title and string2 the process name (on Linux)\n"
+"When debug-threads is enabled, individual threads are given a separate name (on Linux)\n"
+"NOTE: The thread names are for debugging and not a stable API.\n",
 QEMU_ARCH_ALL)
 STEXI
 @item -name @var{name}
@@ -339,6 +341,7 @@ Sets the @var{name} of the guest.
 This name will be displayed in the SDL window caption.
 The @var{name} will also be used for the VNC server.
 Also optionally set the top visible process name in Linux.
+Naming of individual threads can also be enabled on Linux to aid debugging.
 ETEXI
 
 DEF("uuid", HAS_ARG, QEMU_OPTION_uuid,
-- 
MST




Re: [Qemu-devel] [PATCH] colo-compare: fix find_and_check_chardev()

2016-10-09 Thread Zhang Chen



On 09/30/2016 12:06 PM, zhanghailiang wrote:

find_and_check_chardev() uses the 'opts' member of CharDriverState to
check whether the chardev is a 'socket' chardev, but opts will be NULL
if we add the chardev with the QMP 'chardev-add' command.

All the related info can be found in the 'filename' member of CharDriverState.
For a tcp socket device, it will look like 'disconnected:tcp:9.61.1.8:9004,server'
or 'tcp:9.61.1.8:9001,server <-> 9.61.1.8:50256', so we can simply check it to
identify whether it is a tcp socket char device.

Also, fix this helper function to return -1 when an error occurs.

Signed-off-by: zhanghailiang 


This patch looks fine to me.

Reviewed-by: Zhang Chen 

Thanks
Zhang Chen


---
  net/colo-compare.c | 54 --
  1 file changed, 8 insertions(+), 46 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 22b1da1..6693258 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -92,10 +92,6 @@ typedef struct CompareClass {
  ObjectClass parent_class;
  } CompareClass;
  
-typedef struct CompareChardevProps {
-bool is_socket;
-} CompareChardevProps;
-
  enum {
  PRIMARY_IN = 0,
  SECONDARY_IN,
@@ -564,56 +560,22 @@ static void compare_sec_rs_finalize(SocketReadState *sec_rs)
  }
  }
  
-static int compare_chardev_opts(void *opaque,
-const char *name, const char *value,
-Error **errp)
-{
-CompareChardevProps *props = opaque;
-
-if (strcmp(name, "backend") == 0 &&
-strcmp(value, "socket") == 0) {
-props->is_socket = true;
-return 0;
-} else if (strcmp(name, "host") == 0 ||
-  (strcmp(name, "port") == 0) ||
-  (strcmp(name, "server") == 0) ||
-  (strcmp(name, "wait") == 0) ||
-  (strcmp(name, "path") == 0)) {
-return 0;
-} else {
-error_setg(errp,
-   "COLO-compare does not support a chardev with option %s=%s",
-   name, value);
-return -1;
-}
-}
-
-/*
- * Return 0 is success.
- * Return 1 is failed.
- */
  static int find_and_check_chardev(CharDriverState **chr,
char *chr_name,
Error **errp)
  {
-CompareChardevProps props;
-
  *chr = qemu_chr_find(chr_name);
  if (*chr == NULL) {
  error_setg(errp, "Device '%s' not found",
 chr_name);
-return 1;
+return -1;
  }
  
-memset(&props, 0, sizeof(props));
-if (qemu_opt_foreach((*chr)->opts, compare_chardev_opts, &props, errp)) {
-return 1;
-}
+if (!strstr((*chr)->filename, "tcp")) {
+error_setg(errp, "chardev \"%s\" is not a tcp socket, filename '%s'",
+   chr_name, (*chr)->filename);
+return -1;
  
-if (!props.is_socket) {
-error_setg(errp, "chardev \"%s\" is not a tcp socket",
-   chr_name);
-return 1;
  }
  return 0;
  }
@@ -660,15 +622,15 @@ static void colo_compare_complete(UserCreatable *uc, Error **errp)
  return;
  }
  
-if (find_and_check_chardev(&s->chr_pri_in, s->pri_indev, errp)) {
+if (find_and_check_chardev(&s->chr_pri_in, s->pri_indev, errp) < 0) {
  return;
  }
  
-if (find_and_check_chardev(&s->chr_sec_in, s->sec_indev, errp)) {
+if (find_and_check_chardev(&s->chr_sec_in, s->sec_indev, errp) < 0) {
  return;
  }
  
-if (find_and_check_chardev(&s->chr_out, s->outdev, errp)) {
+if (find_and_check_chardev(&s->chr_out, s->outdev, errp) < 0) {
  return;
  }
  


--
Thanks
zhangchen
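
As a reading aid, the filename test described in the commit message can be sketched in isolation. This is only an illustration under assumptions: filename_mentions_tcp() is a hypothetical helper, not a function in net/colo-compare.c, and it simply wraps the same strstr() substring test that the patch applies to the chardev's 'filename' member.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not the patch's code: report whether a chardev
 * "filename" string contains "tcp" anywhere, as the strstr() test does.
 */
bool filename_mentions_tcp(const char *filename)
{
    assert(filename != NULL);
    return strstr(filename, "tcp") != NULL;
}
```

Both example strings from the commit message pass this test. Because it is a plain substring match, it would also accept any other filename that happens to contain "tcp" (for instance a made-up string such as "unix:/tmp/tcp.sock,server"), so a stricter check might be worth considering.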






Re: [Qemu-devel] [PATCH v5 08/17] vfio: Pass an Error object to vfio_connect_container

2016-10-09 Thread Alexey Kardashevskiy
On 07/10/16 18:36, Auger Eric wrote:
> Hi,
> 
> On 07/10/2016 09:01, Markus Armbruster wrote:
>> Eric Auger  writes:
>>
>>> The error is currently simply reported in vfio_get_group. Don't
>>> bother too much with the prefix which will be handled at upper level,
>>> later on.
>>>
>>> Also return an error value in case container->error is not 0 and
>>> the container is teared down.
>>
>> "torn down", I think.
> 
> Sure. I had a wrong feeling when writing this ...
>>
>> Is this a bug fix?  See also below.
>>
>>> On vfio_spapr_remove_window failure, we also report an error whereas
>>> it was silent before.
>>>
>>> Signed-off-by: Eric Auger 
>>> Reviewed-by: Markus Armbruster 
>>>
>>> ---
>>>
>>> v4 -> v5:
>>> - set ret to container->error
>>> - mention error report on vfio_spapr_remove_window failure in the commit
>>>   message
>>> ---
>>>  hw/vfio/common.c | 40 +---
>>>  1 file changed, 25 insertions(+), 15 deletions(-)
>>>
>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>> index 29188a1..85a7759 100644
>>> --- a/hw/vfio/common.c
>>> +++ b/hw/vfio/common.c
>>> @@ -34,6 +34,7 @@
>>>  #include "qemu/range.h"
>>>  #include "sysemu/kvm.h"
>>>  #include "trace.h"
>>> +#include "qapi/error.h"
>>>  
>>>  struct vfio_group_head vfio_group_list =
>>>  QLIST_HEAD_INITIALIZER(vfio_group_list);
>>> @@ -900,7 +901,8 @@ static void vfio_put_address_space(VFIOAddressSpace *space)
>>>  }
>>>  }
>>>  
>>> -static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>>> +static int vfio_connect_container(VFIOGroup *group, AddressSpace *as,
>>> +  Error **errp)
>>>  {
>>>  VFIOContainer *container;
>>>  int ret, fd;
>>> @@ -918,15 +920,15 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>>>  
>>>  fd = qemu_open("/dev/vfio/vfio", O_RDWR);
>>>  if (fd < 0) {
>>> -error_report("vfio: failed to open /dev/vfio/vfio: %m");
>>> +error_setg_errno(errp, errno, "failed to open /dev/vfio/vfio");
>>>  ret = -errno;
>>>  goto put_space_exit;
>>>  }
>>>  
>>>  ret = ioctl(fd, VFIO_GET_API_VERSION);
>>>  if (ret != VFIO_API_VERSION) {
>>> -error_report("vfio: supported vfio version: %d, "
>>> - "reported version: %d", VFIO_API_VERSION, ret);
>>> +error_setg(errp, "supported vfio version: %d, "
>>> +   "reported version: %d", VFIO_API_VERSION, ret);
>>>  ret = -EINVAL;
>>>  goto close_fd_exit;
>>>  }
>>> @@ -941,7 +943,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>>>  
>>>  ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
>>>  if (ret) {
>>> -error_report("vfio: failed to set group container: %m");
>>> +error_setg_errno(errp, errno, "failed to set group container");
>>>  ret = -errno;
>>>  goto free_container_exit;
>>>  }
>>> @@ -949,7 +951,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>>>  container->iommu_type = v2 ? VFIO_TYPE1v2_IOMMU : VFIO_TYPE1_IOMMU;
>>>  ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
>>>  if (ret) {
>>> -error_report("vfio: failed to set iommu for container: %m");
>>> +error_setg_errno(errp, errno, "failed to set iommu for container");
>>>  ret = -errno;
>>>  goto free_container_exit;
>>>  }
>>> @@ -976,7 +978,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>>>  
>>>  ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd);
>>>  if (ret) {
>>> -error_report("vfio: failed to set group container: %m");
>>> +error_setg_errno(errp, errno, "failed to set group container");
>>>  ret = -errno;
>>>  goto free_container_exit;
>>>  }
>>> @@ -984,7 +986,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>>>  v2 ? VFIO_SPAPR_TCE_v2_IOMMU : VFIO_SPAPR_TCE_IOMMU;
>>>  ret = ioctl(fd, VFIO_SET_IOMMU, container->iommu_type);
>>>  if (ret) {
>>> -error_report("vfio: failed to set iommu for container: %m");
>>> +error_setg_errno(errp, errno, "failed to set iommu for container");
>>>  ret = -errno;
>>>  goto free_container_exit;
>>>  }
>>> @@ -997,7 +999,7 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as)
>>>  if (!v2) {
>>>  ret = ioctl(fd, VFIO_IOMMU_ENABLE);
>>>  if (ret) {
>>> -error_report("vfio: failed to enable container: %m");
>>> +error_setg_errno(errp, errno, "failed to enable container");
>>>  ret = -errno;
>>>  goto 
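
For context, the conversion quoted above replaces error_report("...: %m") with error_setg_errno(errp, errno, "...") followed by ret = -errno. A rough, self-contained sketch of that shape is given below. Everything in it is a stand-in: SketchError, sketch_error_set_errno() and sketch_open_container() are hypothetical names, not QEMU APIs. The sketch copies errno into a local before doing anything else, which is a defensive choice made here for illustration and not something the quoted patch states is required.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for QEMU's Error object, for illustration only. */
typedef struct SketchError {
    char msg[128];
} SketchError;

/* Hypothetical analogue of error_setg_errno(): "<text>: <strerror>". */
static void sketch_error_set_errno(SketchError *err, int os_errno,
                                   const char *text)
{
    assert(err != NULL);
    snprintf(err->msg, sizeof(err->msg), "%s: %s", text, strerror(os_errno));
}

/* Open path read-write; return the fd, or -errno with *err filled in. */
int sketch_open_container(const char *path, SketchError *err)
{
    int fd = open(path, O_RDWR);

    if (fd < 0) {
        int saved_errno = errno;   /* capture before any other call */
        sketch_error_set_errno(err, saved_errno, "failed to open container");
        return -saved_errno;
    }
    return fd;
}
```

The caller gets both a negative errno value to branch on and a message it can prefix or propagate, which matches the commit message's intent of leaving the prefix to the upper level.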

Re: [Qemu-devel] [PATCH] qtest: ask endianness of the target in qtest_init()

2016-10-09 Thread David Gibson
On Fri, Oct 07, 2016 at 10:39:09AM +0100, Peter Maydell wrote:
> On 7 October 2016 at 00:55, David Gibson  wrote:
> > It is an improvement.  But I still think if we're relying on the
> > ill-defined "target endianness" we're already doing something wrong.
> 
> Target endianness is not ill-defined. It's a clear and constant
> property of the bus the CPU is plugged into.

It's certainly not clear to me.  How are you defining it?

Preferably in terms of visible effects, rather than something that
requires snooping into pieces of hardware that aren't actually
modelled in qemu...

> It is a bit weird
> to rely on it in the test code, which is why only the virtio
> tests currently use qtest_big_endian().
> 
> thanks
> -- PMM
> 

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson




Re: [Qemu-devel] [PATCH] qtest: ask endianness of the target in qtest_init()

2016-10-09 Thread David Gibson
On Fri, Oct 07, 2016 at 12:10:07PM +0200, Greg Kurz wrote:
> On Fri, 7 Oct 2016 10:39:09 +0100
> Peter Maydell  wrote:
> 
> > On 7 October 2016 at 00:55, David Gibson wrote:
> > > It is an improvement.  But I still think if we're relying on the
> > > ill-defined "target endianness" we're already doing something wrong.  
> > 
> > Target endianness is not ill-defined. It's a clear and constant
> > property of the bus the CPU is plugged into. It is a bit weird
> > to rely on it in the test code, which is why only the virtio
> > tests currently use qtest_big_endian().
> > 
> 
> And to discourage anyone to use it in a test program, maybe it
> could even be renamed virtio_big_endian() and put in a virtio
> specific header file ? This is how it is done in QEMU.

I think that's a good idea.

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [RFC 1/4] spapr_pci: Delegate placement of PCI host bridges to machine type

2016-10-09 Thread Alexey Kardashevskiy
On 07/10/16 20:17, David Gibson wrote:
> On Fri, Oct 07, 2016 at 04:34:59PM +1100, Alexey Kardashevskiy wrote:
>> On 07/10/16 16:10, David Gibson wrote:
>>> On Fri, Oct 07, 2016 at 02:57:43PM +1100, Alexey Kardashevskiy wrote:
 On 06/10/16 14:03, David Gibson wrote:
> The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB)
> for a PAPR guest.  Unlike on x86, it's routine on Power (both bare metal
> and PAPR guests) to have numerous independent PHBs, each controlling a
> separate PCI domain.
>
> There are two ways of configuring the spapr-pci-host-bridge device: first
> it can be done fully manually, specifying the locations and sizes of all
> the IO windows.  This gives the most control, but is very awkward with 6
> mandatory parameters.  Alternatively just an "index" can be specified
> which essentially selects from an array of predefined PHB locations.
> The PHB at index 0 is automatically created as the default PHB.
>
> The current set of default locations causes some problems for guests with
> large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia
> GPGPU cards via VFIO).  Obviously, for migration we can only change the
> locations on a new machine type, however.
>
> This is awkward, because the placement is currently decided within the
> spapr-pci-host-bridge code, so it breaks abstraction to look inside the
> machine type version.
>
> So, this patch delegates the "default mode" PHB placement from the
> spapr-pci-host-bridge device back to the machine type via a public method
> in sPAPRMachineClass.  It's still a bit ugly, but it's about the best we
> can do.
>
> For now, this just changes where the calculation is done.  It doesn't
> change the actual location of the host bridges, or any other behaviour.
>
> Signed-off-by: David Gibson 
> ---
>  hw/ppc/spapr.c  | 34 ++
>  hw/ppc/spapr_pci.c  | 22 --
>  include/hw/pci-host/spapr.h | 11 +--
>  include/hw/ppc/spapr.h  |  4 
>  4 files changed, 47 insertions(+), 24 deletions(-)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index 03e3803..f6e9c2a 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -2370,6 +2370,39 @@ static HotpluggableCPUList 
> *spapr_query_hotpluggable_cpus(MachineState *machine)
>  return head;
>  }
>  
> +static void spapr_phb_placement(sPAPRMachineState *spapr, uint32_t index,
> +uint64_t *buid, hwaddr *pio, hwaddr 
> *pio_size,
> +hwaddr *mmio, hwaddr *mmio_size,
> +unsigned n_dma, uint32_t *liobns, Error 
> **errp)
> +{
> +const uint64_t base_buid = 0x800000020000000ULL;
> +const hwaddr phb0_base = 0x10000000000ULL; /* 1 TiB */
> +const hwaddr phb_spacing = 0x1000000000ULL; /* 64 GiB */
> +const hwaddr mmio_offset = 0xa0000000; /* 2 GiB + 512 MiB */
> +const hwaddr pio_offset = 0x80000000; /* 2 GiB */
> +const uint32_t max_index = 255;
> +
> +hwaddr phb_base;
> +int i;
> +
> +if (index > max_index) {
> +error_setg(errp, "\"index\" for PAPR PHB is too large (max %u)",
> +   max_index);
> +return;
> +}
> +
> +*buid = base_buid + index;
> +for (i = 0; i < n_dma; ++i) {
> +liobns[i] = SPAPR_PCI_LIOBN(index, i);
> +}
> +
> +phb_base = phb0_base + index * phb_spacing;
> +*pio = phb_base + pio_offset;
> +*pio_size = SPAPR_PCI_IO_WIN_SIZE;
> +*mmio = phb_base + mmio_offset;
> +*mmio_size = SPAPR_PCI_MMIO_WIN_SIZE;
> +}
> +
>  static void spapr_machine_class_init(ObjectClass *oc, void *data)
>  {
>  MachineClass *mc = MACHINE_CLASS(oc);
> @@ -2406,6 +2439,7 @@ static void spapr_machine_class_init(ObjectClass 
> *oc, void *data)
>  mc->query_hotpluggable_cpus = spapr_query_hotpluggable_cpus;
>  fwc->get_dev_path = spapr_get_fw_dev_path;
>  nc->nmi_monitor_handler = spapr_nmi;
> +smc->phb_placement = spapr_phb_placement;
>  }
>  
>  static const TypeInfo spapr_machine_info = {
> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
> index 4f00865..c0fc964 100644
> --- a/hw/ppc/spapr_pci.c
> +++ b/hw/ppc/spapr_pci.c
> @@ -1311,7 +1311,8 @@ static void spapr_phb_realize(DeviceState *dev, 
> Error **errp)
>  sphb->ddw_enabled ? SPAPR_PCI_DMA_MAX_WINDOWS : 1;
>  
>  if (sphb->index != (uint32_t)-1) {
> -hwaddr windows_base;
> +sPAPRMachineClass *smc = SPAPR_MACHINE_GET_CLASS(spapr);
> +Error *local_err = 
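
The window arithmetic in the quoted spapr_phb_placement() hunk can be sketched in isolation as follows. This is only an illustration: the constants are taken from the comments in the hunk (1 TiB, 64 GiB, 2 GiB + 512 MiB, 2 GiB), while PhbWindows and phb_windows() are hypothetical names that do not appear in the patch.

```c
#include <stdint.h>

/* Values as described by the comments in the quoted hunk. */
#define PHB0_BASE    (1ULL << 40)                     /* 1 TiB */
#define PHB_SPACING  (64ULL << 30)                    /* 64 GiB */
#define MMIO_OFFSET  ((2ULL << 30) + (512ULL << 20))  /* 2 GiB + 512 MiB */
#define PIO_OFFSET   (2ULL << 30)                     /* 2 GiB */
#define MAX_INDEX    255u

/* Hypothetical container for the two window base addresses. */
typedef struct {
    uint64_t pio;
    uint64_t mmio;
} PhbWindows;

/* Returns 0 on success, -1 if index is larger than MAX_INDEX. */
static int phb_windows(uint32_t index, PhbWindows *w)
{
    uint64_t phb_base;

    if (index > MAX_INDEX) {
        return -1;
    }
    /* Each PHB gets its own 64 GiB slice starting at 1 TiB. */
    phb_base = PHB0_BASE + (uint64_t)index * PHB_SPACING;
    w->pio = phb_base + PIO_OFFSET;
    w->mmio = phb_base + MMIO_OFFSET;
    return 0;
}
```

With these assumptions, index 0 places the PIO window at 0x10080000000 and the MMIO window at 0x100a0000000, and each further index moves both by 64 GiB.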

[Qemu-devel] [RFC QEMU PATCH 7/8] xen-hvm: create hotplug memory region for HVM guest

2016-10-09 Thread Haozhong Zhang
Reserve the address space after guest physical memory for the hotplug
memory region which is used by the existing implementation to place
NVDIMM devices.

Signed-off-by: Haozhong Zhang 
---
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: xen-de...@lists.xensource.com
---
 hw/mem/pc-dimm.c |  5 -
 xen-hvm.c| 36 
 2 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c
index 9e8dab0..69c5784 100644
--- a/hw/mem/pc-dimm.c
+++ b/hw/mem/pc-dimm.c
@@ -28,6 +28,7 @@
 #include "sysemu/kvm.h"
 #include "trace.h"
 #include "hw/virtio/vhost.h"
+#include "hw/xen/xen.h"
 
 typedef struct pc_dimms_capacity {
  uint64_t size;
@@ -107,7 +108,9 @@ void pc_dimm_memory_plug(DeviceState *dev, 
MemoryHotplugState *hpms,
 }
 
 memory_region_add_subregion(&hpms->mr, addr - hpms->base, mr);
-vmstate_register_ram(vmstate_mr, dev);
+if (!xen_enabled()) {
+vmstate_register_ram(vmstate_mr, dev);
+}
 numa_set_mem_node_id(addr, memory_region_size(mr), dimm->node);
 
 out:
diff --git a/xen-hvm.c b/xen-hvm.c
index 768c4c2..68833db 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -25,6 +25,7 @@
 #include "sysemu/xen-mapcache.h"
 #include "trace.h"
 #include "exec/address-spaces.h"
+#include "exec/ram_addr.h"
 
 #include 
 #include 
@@ -201,6 +202,8 @@ static void xen_ram_init(PCMachineState *pcms,
 uint64_t user_lowmem = object_property_get_int(qdev_get_machine(),
PC_MACHINE_MAX_RAM_BELOW_4G,
 &error_abort);
+MachineState *machine = MACHINE(pcms);
+PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
 
 /* Handle the machine opt max-ram-below-4g.  It is basically doing
  * min(xen limit, user limit).
@@ -252,6 +255,39 @@ static void xen_ram_init(PCMachineState *pcms,
  pcms->above_4g_mem_size);
 memory_region_add_subregion(sysmem, 0x100000000ULL, &ram_hi);
 }
+
+/* reserve hotplug memory region for vNVDIMM */
+if (pcmc->has_reserved_memory &&
+(machine->ram_size < machine->maxram_size)) {
+ram_addr_t hotplug_mem_size = machine->maxram_size - machine->ram_size;
+
+if (QEMU_ALIGN_UP(machine->maxram_size,
+  TARGET_PAGE_SIZE) != machine->maxram_size) {
+error_report("maximum memory size must be aligned to multiple of "
+ "%d bytes", TARGET_PAGE_SIZE);
+exit(EXIT_FAILURE);
+}
+
+pcms->hotplug_memory.base =
+ROUND_UP(0x100000000ULL + pcms->above_4g_mem_size, 1ULL << 30);
+
+if (pcmc->enforce_aligned_dimm) {
+/* size hotplug region assuming 1G page max alignment per slot */
+hotplug_mem_size += (1ULL << 30) * machine->ram_slots;
+}
+
+if ((pcms->hotplug_memory.base + hotplug_mem_size) <
+hotplug_mem_size) {
+error_report("unsupported amount of maximum memory: " RAM_ADDR_FMT,
+ machine->maxram_size);
+exit(EXIT_FAILURE);
+}
+
+memory_region_init(&pcms->hotplug_memory.mr, OBJECT(pcms),
+   "hotplug-memory", hotplug_mem_size);
+memory_region_add_subregion(sysmem, pcms->hotplug_memory.base,
+&pcms->hotplug_memory.mr);
+}
 }
 
 void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
-- 
2.10.1
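
The placement arithmetic for the hotplug region in the xen_ram_init() hunk above can be tried out on its own with a sketch like the one below. It assumes the region starts at the first 1 GiB boundary above 4 GiB plus the above-4G RAM, as the hunk does; hotplug_region() and round_up_pow2() are hypothetical helper names, not QEMU functions, and the caller is assumed to have already checked ram_size < maxram_size.

```c
#include <stdbool.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Round x up to the next multiple of align (align must be a power of two). */
static uint64_t round_up_pow2(uint64_t x, uint64_t align)
{
    return (x + align - 1) & ~(align - 1);
}

/*
 * Hypothetical helper mirroring the arithmetic in the hunk.
 * Returns false if base + size would wrap around the 64-bit address space.
 */
static bool hotplug_region(uint64_t above_4g_size, uint64_t ram_size,
                           uint64_t maxram_size, unsigned slots,
                           bool align_per_slot,
                           uint64_t *base, uint64_t *size)
{
    uint64_t sz = maxram_size - ram_size;

    *base = round_up_pow2(4 * GiB + above_4g_size, GiB);
    if (align_per_slot) {
        /* leave room for up to 1 GiB of alignment padding per slot */
        sz += GiB * slots;
    }
    if (*base + sz < sz) {
        /* unsigned wrap-around: the region does not fit */
        return false;
    }
    *size = sz;
    return true;
}
```

For example, with 512 MiB of RAM above 4 GiB, ram_size = 4 GiB, maxram_size = 8 GiB and two slots with per-slot alignment, the sketch yields a base of 5 GiB and a size of 6 GiB.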




[Qemu-devel] [RFC QEMU PATCH 6/8] hostmem: add a host memory backend for Xen

2016-10-09 Thread Haozhong Zhang
Some virtual devices (e.g. NVDIMM) use the host memory backend to map
their backend resources to the guest. When those devices are used on Xen,
the mapping has to be managed outside of QEMU. In order to reuse other parts
of the implementation of those devices, we introduce a host memory
backend for Xen (memory-backend-xen) which acts as a placeholder and
does not map any backend resource. Therefore, those devices, when used on
Xen, just need to use the new backend.

Following is an example of the command options of an NVDIMM device using
a file backend on Xen:
   -object memory-backend-xen,id=mem1,mem-path=$FILENAME,size=$SIZE
   -device nvdimm,id=nvdimm1,memdev=mem1

Signed-off-by: Haozhong Zhang 
---
Cc: Eduardo Habkost 
Cc: Igor Mammedov 
---
 backends/Makefile.objs |   1 +
 backends/hostmem-xen.c | 120 +
 backends/hostmem.c |   9 
 3 files changed, 130 insertions(+)
 create mode 100644 backends/hostmem-xen.c

diff --git a/backends/Makefile.objs b/backends/Makefile.objs
index 31a3a89..a587aca 100644
--- a/backends/Makefile.objs
+++ b/backends/Makefile.objs
@@ -9,3 +9,4 @@ common-obj-$(CONFIG_TPM) += tpm.o
 
 common-obj-y += hostmem.o hostmem-ram.o
 common-obj-$(CONFIG_LINUX) += hostmem-file.o
+common-obj-${CONFIG_XEN_BACKEND} += hostmem-xen.o
diff --git a/backends/hostmem-xen.c b/backends/hostmem-xen.c
new file mode 100644
index 0000000..144feee
--- /dev/null
+++ b/backends/hostmem-xen.c
@@ -0,0 +1,120 @@
+/*
+ * QEMU Host Memory Backend for Xen
+ *
+ * Copyright(C) 2016 Intel Corporation.
+ *
+ * Author:
+ *  Haozhong Zhang 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/mmap-alloc.h"
+#include "sysemu/hostmem.h"
+#include "qapi/error.h"
+#include "qom/object_interfaces.h"
+
+#define TYPE_MEMORY_BACKEND_XEN "memory-backend-xen"
+
+#define MEMORY_BACKEND_XEN(obj) \
+OBJECT_CHECK(HostMemoryBackendXen, (obj), TYPE_MEMORY_BACKEND_XEN)
+
+typedef struct HostMemoryBackendXen HostMemoryBackendXen;
+
+struct HostMemoryBackendXen {
+HostMemoryBackend parent_obj;
+
+char *mem_path;
+};
+
+static void xen_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
+{
+HostMemoryBackendXen *db = MEMORY_BACKEND_XEN(backend);
+int fd;
+size_t page_size;
+
+if (!backend->size) {
+error_setg(errp, "can't create backend with size 0");
+return;
+}
+memory_region_init(&backend->mr,
+   OBJECT(backend), db->mem_path, backend->size);
+
+fd = open(db->mem_path, O_RDONLY);
+if (fd < 0) {
+error_setg(errp, "can't open file %s, err %d", db->mem_path, errno);
+return;
+}
+page_size = qemu_fd_getpagesize(fd);
+backend->mr.align = MAX(page_size, QEMU_VMALLOC_ALIGN);
+close(fd);
+}
+
+static void xen_backend_class_init(ObjectClass *oc, void *data)
+{
+HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
+
+bc->alloc = xen_backend_memory_alloc;
+}
+
+static char *get_mem_path(Object *o, Error **errp)
+{
+HostMemoryBackendXen *db = MEMORY_BACKEND_XEN(o);
+
+return g_strdup(db->mem_path);
+}
+
+static void set_mem_path(Object *o, const char *str, Error **errp)
+{
+HostMemoryBackend *backend = MEMORY_BACKEND(o);
+HostMemoryBackendXen *db = MEMORY_BACKEND_XEN(o);
+
+if (memory_region_size(&backend->mr)) {
+error_setg(errp, "cannot change property value");
+return;
+}
+g_free(db->mem_path);
+db->mem_path = g_strdup(str);
+}
+
+static void
+xen_backend_instance_init(Object *o)
+{
+object_property_add_str(o, "mem-path", get_mem_path,
+set_mem_path, NULL);
+}
+
+static void xen_backend_instance_finalize(Object *o)
+{
+HostMemoryBackendXen *db = MEMORY_BACKEND_XEN(o);
+
+g_free(db->mem_path);
+}
+
+static const TypeInfo xen_backend_info = {
+.name = TYPE_MEMORY_BACKEND_XEN,
+.parent = TYPE_MEMORY_BACKEND,
+.class_init = xen_backend_class_init,
+.instance_init = xen_backend_instance_init,
+.instance_finalize = xen_backend_instance_finalize,
+.instance_size = sizeof(HostMemoryBackendXen),
+};
+
+static void register_types(void)
+{
+type_register_static(&xen_backend_info);
+}
+
+type_init(register_types);
diff --git 
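
A note on the descriptor check in xen_backend_memory_alloc(): open(2) reports failure by returning -1, and 0 is a valid descriptor, so the result should be tested with fd < 0. A minimal sketch of that pattern, where probe_readable() is a hypothetical helper and not part of the patch:

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 0 if path can be opened read-only,
 * or a negative errno value otherwise.
 */
static int probe_readable(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0) {
        /* -1 signals failure; errno holds the reason */
        return -errno;
    }
    close(fd);
    return 0;
}
```

A caller can then turn the negative return value into a message, for instance with strerror(-ret), which tends to be more readable than printing the raw errno number.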

[Qemu-devel] [RFC QEMU PATCH 5/8] nvdimm acpi: build and copy NVDIMM namespace devices to guest on Xen

2016-10-09 Thread Haozhong Zhang
Build and copy NVDIMM namespace devices to guest when QEMU is used as
the device model of Xen. Only the body of each AML device is built and
copied; Xen hvmloader will build the complete namespace devices from
them and put them in SSDT tables.

Signed-off-by: Haozhong Zhang 
---
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Xiao Guangrong 
---
 hw/acpi/aml-build.c |  2 +-
 hw/acpi/nvdimm.c| 58 +++--
 include/hw/acpi/aml-build.h |  2 ++
 3 files changed, 38 insertions(+), 24 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index a749b62..eda999f 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -287,7 +287,7 @@ build_append_named_dword(GArray *array, const char 
*name_format, ...)
 
 static GPtrArray *alloc_list;
 
-static Aml *aml_alloc(void)
+Aml *aml_alloc(void)
 {
 Aml *var = g_new0(typeof(*var), 1);
 
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 6de2301..4cfb94d 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -925,17 +925,22 @@ static void nvdimm_build_ssdt(GSList *device_list, GArray 
*table_offsets,
   GArray *table_data, BIOSLinker *linker,
   GArray *dsm_dma_arrea)
 {
-Aml *ssdt, *sb_scope, *dev, *field;
+Aml *ssdt, *sb_scope = NULL, *dev, *field;
 int mem_addr_offset, nvdimm_ssdt;
 
 acpi_add_table(table_offsets, table_data);
 
 ssdt = init_aml_allocator();
-acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
 
-sb_scope = aml_scope("\\_SB");
+if (xen_enabled()) {
+dev = aml_alloc();
+} else {
+acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
+
+sb_scope = aml_scope("\\_SB");
 
-dev = aml_device("NVDR");
+dev = aml_device("NVDR");
+}
 
 /*
  * ACPI 6.0: 9.20 NVDIMM Devices:
@@ -1014,25 +1019,32 @@ static void nvdimm_build_ssdt(GSList *device_list, 
GArray *table_offsets,
 
 nvdimm_build_nvdimm_devices(device_list, dev);
 
-aml_append(sb_scope, dev);
-aml_append(ssdt, sb_scope);
-
-nvdimm_ssdt = table_data->len;
-
-/* copy AML table into ACPI tables blob and patch header there */
-g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
-mem_addr_offset = build_append_named_dword(table_data,
-   NVDIMM_ACPI_MEM_ADDR);
-
-bios_linker_loader_alloc(linker,
- NVDIMM_DSM_MEM_FILE, dsm_dma_arrea,
- sizeof(NvdimmDsmIn), false /* high memory */);
-bios_linker_loader_add_pointer(linker,
-ACPI_BUILD_TABLE_FILE, mem_addr_offset, sizeof(uint32_t),
-NVDIMM_DSM_MEM_FILE, 0);
-build_header(linker, table_data,
-(void *)(table_data->data + nvdimm_ssdt),
-"SSDT", table_data->len - nvdimm_ssdt, 1, NULL, "NVDIMM");
+if (xen_enabled()) {
+build_append_named_dword(dev->buf, NVDIMM_ACPI_MEM_ADDR);
+xen_acpi_copy_to_guest("NVDR", dev->buf->data, dev->buf->len,
+   XEN_ACPI_NSDEV);
+} else {
+aml_append(sb_scope, dev);
+aml_append(ssdt, sb_scope);
+
+nvdimm_ssdt = table_data->len;
+
+/* copy AML table into ACPI tables blob and patch header there */
+g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
+mem_addr_offset = build_append_named_dword(table_data,
+   NVDIMM_ACPI_MEM_ADDR);
+
+bios_linker_loader_alloc(linker,
+ NVDIMM_DSM_MEM_FILE, dsm_dma_arrea,
+ sizeof(NvdimmDsmIn), false /* high memory */);
+bios_linker_loader_add_pointer(linker,
+   ACPI_BUILD_TABLE_FILE, mem_addr_offset,
+   sizeof(uint32_t),
+   NVDIMM_DSM_MEM_FILE, 0);
+build_header(linker, table_data,
+ (void *)(table_data->data + nvdimm_ssdt),
+ "SSDT", table_data->len - nvdimm_ssdt, 1, NULL, "NVDIMM");
+}
 free_aml_allocator();
 }
 
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 559326c..bf02f91 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -232,6 +232,8 @@ Aml *init_aml_allocator(void);
  */
 void free_aml_allocator(void);
 
+Aml *aml_alloc(void);
+
 /**
  * aml_append:
  * @parent_ctx: context to which @child element is added
-- 
2.10.1




[Qemu-devel] [RFC QEMU PATCH 8/8] qmp: add a qmp command 'query-nvdimms' to get plugged NVDIMM devices

2016-10-09 Thread Haozhong Zhang
Xen uses this command to get the backend resource, guest SPA and size of
NVDIMM devices so as to map them into the guest.

Signed-off-by: Haozhong Zhang 
---
Cc: Markus Armbruster 
Cc: Xiao Guangrong 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Eric Blake 
---
 docs/qmp-commands.txt   | 36 
 hw/acpi/nvdimm.c|  2 +-
 hw/mem/nvdimm.c | 35 +++
 include/hw/mem/nvdimm.h | 10 ++
 qapi-schema.json| 29 +
 5 files changed, 111 insertions(+), 1 deletion(-)

diff --git a/docs/qmp-commands.txt b/docs/qmp-commands.txt
index e0adceb..90e9fb6 100644
--- a/docs/qmp-commands.txt
+++ b/docs/qmp-commands.txt
@@ -3800,3 +3800,39 @@ Example for pc machine type started with
 "props": {"core-id": 0, "socket-id": 0, "thread-id": 0}
  }
]}
+
+EQMP
+
+{
+.name   = "query-nvdimms",
+.args_type  = "",
+.mhandler.cmd_new = qmp_marshal_query_nvdimms,
+},
+
+SQMP
+Show plugged NVDIMM devices
+---
+
+Arguments: None.
+
+Example for pc machine type started with
+-object memory-backend-file,id=mem1,mem-path=/path/to/nvm1,size=4G
+-device nvdimm,id=nvdimm1,memdev=mem1
+-object memory-backend-file,id=mem2,mem-path=/path/to/nvm2,size=8G
+-device nvdimm,id=nvdimm2,memdev=mem2:
+
+-> { "execute": "query-nvdimms" }
+<- { "returns": [
+  {
+ "mem-path": "/path/to/nvm1",
+"slot": 0,
+"spa": 17179869184,
+"length": 4294967296
+  },
+  {
+ "mem-path": "/path/to/nvm2",
+"slot": 1,
+"spa": 21474836480,
+"length": 8589934592
+  }
+   ]}
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index 4cfb94d..eedc128 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -57,7 +57,7 @@ static int nvdimm_plugged_device_list(Object *obj, void 
*opaque)
  * Note: it is the caller's responsibility to free the list to avoid
  * memory leak.
  */
-static GSList *nvdimm_get_plugged_device_list(void)
+GSList *nvdimm_get_plugged_device_list(void)
 {
 GSList *list = NULL;
 
diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index d25993b..99d0cc9 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -26,6 +26,7 @@
 #include "qapi/error.h"
 #include "qapi/visitor.h"
 #include "hw/mem/nvdimm.h"
+#include "qmp-commands.h"
 
 static void nvdimm_get_label_size(Object *obj, Visitor *v, const char *name,
   void *opaque, Error **errp)
@@ -180,3 +181,37 @@ static void nvdimm_register_types(void)
 }
 
 type_init(nvdimm_register_types)
+
+NvdimmInfoList *qmp_query_nvdimms(Error **errp)
+{
+NvdimmInfoList *info_list = NULL;
+NvdimmInfoList *info;
+GSList *device_list = nvdimm_get_plugged_device_list();
+
+while (device_list) {
+DeviceState *dev = device_list->data;
+PCDIMMDevice *parent = PC_DIMM(OBJECT(dev));
+const char *mem_path;
+
+info = g_new0(NvdimmInfoList, 1);
+info->value = g_new0(NvdimmInfo, 1);
+
+mem_path = object_property_get_str(OBJECT(parent->hostmem),
+   "mem-path", NULL);
+info->value->mem_path = mem_path ? strdup(mem_path) : NULL;
+
+info->value->slot = object_property_get_int(OBJECT(dev),
+PC_DIMM_SLOT_PROP, NULL);
+info->value->spa = object_property_get_int(OBJECT(dev),
+   PC_DIMM_ADDR_PROP, NULL);
+info->value->length = object_property_get_int(OBJECT(dev),
+  PC_DIMM_SIZE_PROP, NULL);
+
+info->next = info_list;
+info_list = info;
+device_list = device_list->next;
+}
+
+g_slist_free(device_list);
+return info_list;
+}
diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h
index 1cfe9e0..6be269e 100644
--- a/include/hw/mem/nvdimm.h
+++ b/include/hw/mem/nvdimm.h
@@ -113,4 +113,14 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, 
MemoryRegion *io,
 FWCfgState *fw_cfg, Object *owner);
 void nvdimm_build_acpi(GArray *table_offsets, GArray *table_data,
BIOSLinker *linker, GArray *dsm_dma_arrea);
+
+/*
+ * Inquire plugged NVDIMM devices and link them into the list which is
+ * returned to the caller.
+ *
+ * Note: it is the caller's responsibility to free the list to avoid
+ * memory leak.
+ */
+GSList *nvdimm_get_plugged_device_list(void);
+
 #endif
diff --git a/qapi-schema.json b/qapi-schema.json
index c3dcf11..6246255 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -4646,3 +4646,32 @@
 # Since: 2.7
 ##
 { 'command': 'query-hotpluggable-cpus', 'returns': ['HotpluggableCPU'] }
+
+##
+# @NvdimmInfo
+#
+# Information about 

[Qemu-devel] [RFC QEMU PATCH 4/8] nvdimm acpi: build and copy NFIT to guest on Xen

2016-10-09 Thread Haozhong Zhang
Build and copy NFIT to guest when QEMU is used as the device model of
Xen. The checksum of NFIT is left blank and will be filled by Xen
hvmloader.

Signed-off-by: Haozhong Zhang 
---
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
Cc: Xiao Guangrong 
Cc: Paolo Bonzini 
Cc: Richard Henderson 
Cc: Eduardo Habkost 
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: xen-de...@lists.xensource.com
---
 hw/acpi/aml-build.c  |  9 ++---
 hw/acpi/nvdimm.c |  8 
 hw/i386/pc.c | 12 
 include/hw/xen/xen.h |  2 ++
 xen-hvm.c| 21 -
 5 files changed, 44 insertions(+), 8 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index b2a1e40..a749b62 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1530,9 +1530,12 @@ build_header(BIOSLinker *linker, GArray *table_data,
 h->oem_revision = cpu_to_le32(1);
 memcpy(h->asl_compiler_id, ACPI_BUILD_APPNAME4, 4);
 h->asl_compiler_revision = cpu_to_le32(1);
-/* Checksum to be filled in by Guest linker */
-bios_linker_loader_add_checksum(linker, ACPI_BUILD_TABLE_FILE,
-tbl_offset, len, checksum_offset);
+/* No linker is provided when running on Xen */
+if (linker) {
+/* Checksum to be filled in by Guest linker */
+bios_linker_loader_add_checksum(linker, ACPI_BUILD_TABLE_FILE,
+tbl_offset, len, checksum_offset);
+}
 }
 
 void *acpi_data_push(GArray *table_data, unsigned size)
diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index c9c2a84..6de2301 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -32,6 +32,7 @@
 #include "hw/acpi/bios-linker-loader.h"
 #include "hw/nvram/fw_cfg.h"
 #include "hw/mem/nvdimm.h"
+#include "hw/xen/xen.h"
 
 static int nvdimm_plugged_device_list(Object *obj, void *opaque)
 {
@@ -389,6 +390,13 @@ static void nvdimm_build_nfit(GSList *device_list, GArray 
*table_offsets,
 build_header(linker, table_data,
  (void *)(table_data->data + header), "NFIT",
  sizeof(NvdimmNfitHeader) + structures->len, 1, NULL, NULL);
+
+if (xen_enabled()) {
+xen_acpi_copy_to_guest("NFIT", table_data->data + header,
+   sizeof(NvdimmNfitHeader) + structures->len,
+   XEN_ACPI_TABLE);
+}
+
 g_array_free(structures, true);
 }
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 2d6d792..33be032 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1269,10 +1269,14 @@ void pc_machine_done(Notifier *notifier, void *data)
 }
 }
 
-acpi_setup();
-if (pcms->fw_cfg) {
-pc_build_smbios(pcms->fw_cfg);
-pc_build_feature_control_file(pcms);
+if (xen_enabled()) {
+xen_acpi_setup(pcms);
+} else {
+acpi_setup();
+if (pcms->fw_cfg) {
+pc_build_smbios(pcms->fw_cfg);
+pc_build_feature_control_file(pcms);
+}
 }
 }
 
diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index 79273da..60344f9 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -47,6 +47,8 @@ void xen_modified_memory(ram_addr_t start, ram_addr_t length);
 
 void xen_register_framebuffer(struct MemoryRegion *mr);
 
+void xen_acpi_setup(PCMachineState *pcms);
+
 #define XEN_ACPI_TABLE 0
 #define XEN_ACPI_NSDEV 1
 
diff --git a/xen-hvm.c b/xen-hvm.c
index 168a9ec..768c4c2 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -1244,7 +1244,7 @@ static ram_addr_t guest_acpi_buf_alloc(size_t length)
 
 static int xen_acpi_needed(PCMachineState *pcms)
 {
-return 0;
+return pcms->acpi_nvdimm_state.is_enabled;
 }
 
 static int xen_acpi_init(PCMachineState *pcms, XenIOState *state)
@@ -1256,6 +1256,25 @@ static int xen_acpi_init(PCMachineState *pcms, 
XenIOState *state)
 return guest_acpi_buf_init(state);
 }
 
+static void xen_acpi_nvdimm_setup(PCMachineState *pcms)
+{
+GArray *table_offsets = g_array_new(false, true /* clear */,
+sizeof(uint32_t));
+GArray *table_data = g_array_new(false, true /* clear */, 1);
+
+nvdimm_build_acpi(table_offsets, table_data,
+  NULL, pcms->acpi_nvdimm_state.dsm_mem);
+g_array_free(table_offsets, true);
+g_array_free(table_data, true);
+}
+
+void xen_acpi_setup(PCMachineState *pcms)
+{
+if (pcms->acpi_nvdimm_state.is_enabled) {
+xen_acpi_nvdimm_setup(pcms);
+}
+}
+
 void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory)
 {
 int i, rc;
-- 
2.10.1
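
Since the NFIT checksum is left for hvmloader to fill in, it may help to recall how an ACPI table checksum works: the Checksum byte sits at offset 9 of the standard table header and is chosen so that all bytes of the table sum to zero modulo 256. The sketch below is an illustration only; acpi_sum() and acpi_fill_checksum() are hypothetical names, not hvmloader or QEMU functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the Checksum byte: Signature (4) + Length (4) + Revision (1). */
#define ACPI_CHECKSUM_OFFSET 9

/* Sum of all table bytes, truncated to 8 bits. */
static uint8_t acpi_sum(const uint8_t *table, size_t len)
{
    uint8_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++) {
        sum = (uint8_t)(sum + table[i]);
    }
    return sum;
}

/* Set the Checksum byte so that the whole table sums to zero. */
static void acpi_fill_checksum(uint8_t *table, size_t len)
{
    table[ACPI_CHECKSUM_OFFSET] = 0;
    table[ACPI_CHECKSUM_OFFSET] = (uint8_t)(0 - acpi_sum(table, len));
}
```

A consumer validates a table the same way: it is accepted only if acpi_sum() over the full length returns 0.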




[Qemu-devel] [RFC QEMU PATCH 3/8] nvdimm acpi: do not use fw_cfg on Xen

2016-10-09 Thread Haozhong Zhang
No fw_cfg is created when QEMU is used as the device model of Xen.

Signed-off-by: Haozhong Zhang 
---
Cc: Xiao Guangrong 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
---
 hw/acpi/nvdimm.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c
index e486128..c9c2a84 100644
--- a/hw/acpi/nvdimm.c
+++ b/hw/acpi/nvdimm.c
@@ -770,8 +770,11 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, 
MemoryRegion *io,
 
 state->dsm_mem = g_array_new(false, true /* clear */, 1);
 acpi_data_push(state->dsm_mem, sizeof(NvdimmDsmIn));
-fw_cfg_add_file(fw_cfg, NVDIMM_DSM_MEM_FILE, state->dsm_mem->data,
-state->dsm_mem->len);
+/* No fw_cfg is used when running on Xen */
+if (fw_cfg) {
+fw_cfg_add_file(fw_cfg, NVDIMM_DSM_MEM_FILE, state->dsm_mem->data,
+state->dsm_mem->len);
+}
 }
 
 #define NVDIMM_COMMON_DSM  "NCAL"
-- 
2.10.1




[Qemu-devel] [RFC QEMU PATCH 2/8] xen-hvm: add a function to copy ACPI to guest

2016-10-09 Thread Haozhong Zhang
xen_acpi_copy_to_guest() will be used later to copy NVDIMM ACPI to
guest.

Signed-off-by: Haozhong Zhang 
---
Cc: Stefano Stabellini 
Cc: Anthony Perard 
Cc: xen-de...@lists.xensource.com
---
 include/hw/xen/xen.h |   6 ++
 xen-hvm.c| 180 +++
 2 files changed, 186 insertions(+)

diff --git a/include/hw/xen/xen.h b/include/hw/xen/xen.h
index a8f3afb..79273da 100644
--- a/include/hw/xen/xen.h
+++ b/include/hw/xen/xen.h
@@ -47,4 +47,10 @@ void xen_modified_memory(ram_addr_t start, ram_addr_t 
length);
 
 void xen_register_framebuffer(struct MemoryRegion *mr);
 
+#define XEN_ACPI_TABLE 0
+#define XEN_ACPI_NSDEV 1
+
+int xen_acpi_copy_to_guest(const char *name, const char *data, size_t length,
+   int type);
+
 #endif /* QEMU_HW_XEN_H */
diff --git a/xen-hvm.c b/xen-hvm.c
index 2f348ed..168a9ec 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -21,6 +21,7 @@
 #include "sysemu/char.h"
 #include "qemu/error-report.h"
 #include "qemu/range.h"
+#include "qemu/cutils.h"
 #include "sysemu/xen-mapcache.h"
 #include "trace.h"
 #include "exec/address-spaces.h"
@@ -39,6 +40,10 @@
 do { } while (0)
 #endif
 
+#define HVM_XS_DM_ACPI_ROOT  "/hvmloader/dm-acpi"
+#define HVM_XS_DM_ACPI_ADDRESS   HVM_XS_DM_ACPI_ROOT"/address"
+#define HVM_XS_DM_ACPI_LENGTHHVM_XS_DM_ACPI_ROOT"/length"
+
 static MemoryRegion ram_memory, ram_640k, ram_lo, ram_hi;
 static MemoryRegion *framebuffer;
 static bool xen_in_migration;
@@ -87,6 +92,14 @@ typedef struct XenPhysmap {
 QLIST_ENTRY(XenPhysmap) list;
 } XenPhysmap;
 
+typedef struct XenAcpiBuf {
+ram_addr_t base;
+ram_addr_t length;
+ram_addr_t used;
+} XenAcpiBuf;
+
+static XenAcpiBuf *guest_acpi_buf;
+
 typedef struct XenIOState {
 ioservid_t ioservid;
 shared_iopage_t *shared_page;
@@ -111,6 +124,8 @@ typedef struct XenIOState {
 hwaddr free_phys_offset;
 const XenPhysmap *log_for_dirtybit;
 
+XenAcpiBuf acpi_buf;
+
 Notifier exit;
 Notifier suspend;
 Notifier wakeup;
@@ -1181,6 +1196,66 @@ static void xen_wakeup_notifier(Notifier *notifier, void 
*data)
 xc_set_hvm_param(xen_xc, xen_domid, HVM_PARAM_ACPI_S_STATE, 0);
 }
 
+static int guest_acpi_buf_init(XenIOState *state)
+{
+char path[80], *value;
+unsigned int len;
+
+guest_acpi_buf = &state->acpi_buf;
+
+snprintf(path, sizeof(path),
+ "/local/domain/%d"HVM_XS_DM_ACPI_ADDRESS, xen_domid);
+value = xs_read(state->xenstore, 0, path, &len);
+if (!value) {
+return -EINVAL;
+}
+if (qemu_strtoull(value, NULL, 16, &guest_acpi_buf->base)) {
+return -EINVAL;
+}
+
+snprintf(path, sizeof(path),
+ "/local/domain/%d"HVM_XS_DM_ACPI_LENGTH, xen_domid);
+value = xs_read(state->xenstore, 0, path, &len);
+if (!value) {
+return -EINVAL;
+}
+if (qemu_strtoull(value, NULL, 16, &guest_acpi_buf->length)) {
+return -EINVAL;
+}
+
+guest_acpi_buf->used = 0;
+
+return 0;
+}
+
+static ram_addr_t guest_acpi_buf_alloc(size_t length)
+{
+ram_addr_t addr;
+
+if (guest_acpi_buf->length - guest_acpi_buf->used < length) {
+return 0;
+}
+
+addr = guest_acpi_buf->base + guest_acpi_buf->used;
+guest_acpi_buf->used += length;
+
+return addr;
+}
+
+static int xen_acpi_needed(PCMachineState *pcms)
+{
+return 0;
+}
+
+static int xen_acpi_init(PCMachineState *pcms, XenIOState *state)
+{
+if (!xen_acpi_needed(pcms)) {
+return 0;
+}
+
+return guest_acpi_buf_init(state);
+}
+
 void xen_hvm_init(PCMachineState *pcms, MemoryRegion **ram_memory)
 {
 int i, rc;
@@ -1316,6 +1391,13 @@ void xen_hvm_init(PCMachineState *pcms, MemoryRegion 
**ram_memory)
 }
 xen_be_register_common();
 xen_read_physmap(state);
+
+/* Initialize ACPI */
+if (xen_acpi_init(pcms, state)) {
+error_report("failed to initialize xen ACPI");
+goto err;
+}
+
 return;
 
 err:
@@ -1392,3 +1474,101 @@ void qmp_xen_set_global_dirty_log(bool enable, Error 
**errp)
 memory_global_dirty_log_stop();
 }
 }
+
+static int xs_write_guest_acpi_blob_key(const char *name,
+const char *key, const char *value)
+{
+XenIOState *state = container_of(guest_acpi_buf, XenIOState, acpi_buf);
+char path[80];
+
+snprintf(path, sizeof(path),
+ "/local/domain/%d"HVM_XS_DM_ACPI_ROOT"/%s/%s",
+ xen_domid, name, key);
+if (!xs_write(state->xenstore, 0, path, value, strlen(value))) {
+return -EIO;
+}
+
+return 0;
+}
+
+static size_t xen_memcpy_to_guest(ram_addr_t gpa,
+  const char *buf, size_t length)
+{
+size_t copied = 0, size;
+ram_addr_t s, e, offset, cur = gpa;
+xen_pfn_t cur_pfn;
+void *page;
+
+if (!buf || !length) {
+return 0;
+}
+
+

[Qemu-devel] [RFC QEMU PATCH 0/8] Implement vNVDIMM for Xen HVM guest

2016-10-09 Thread Haozhong Zhang
Overview
========
This RFC QEMU patch series along with corresponding patch series of
Xen, Linux kernel and ndctl implements vNVDIMM for Xen HVM guests. DSM
(and hence labels) and hotplug are not supported by this patch series
and will be implemented later.

Design and Implementation
=========================
The complete design can be found at
  https://lists.xenproject.org/archives/html/xen-devel/2016-07/msg01921.html.

All patch series can be found at
  Xen:  https://github.com/hzzhan9/xen.git nvdimm-rfc-v1
  QEMU: https://github.com/hzzhan9/qemu.git xen-nvdimm-rfc-v1
  Linux kernel: https://github.com/hzzhan9/nvdimm.git xen-nvdimm-rfc-v1
  ndctl:https://github.com/hzzhan9/ndctl.git pfn-xen-rfc-v1

QEMU, as the device model of Xen HVM domU, is responsible for
1) building NVDIMM ACPI tables and namespace devices, and
2) finding proper areas in the guest physical address space to place
   vNVDIMM devices. The backend resources of vNVDIMM are managed by
   Xen rather than QEMU.

Patches 02 - 05 implement 1) above. They implement a mechanism to pass
guest ACPI tables and namespace devices to Xen guests.

Patches 06 - 08 implement 2) above. Because the backend resources of
vNVDIMM devices for Xen guests are managed outside of QEMU, we introduce a
new host memory backend, memory-backend-xen, to be used with vNVDIMM
devices. It basically acts as a placeholder, which fits into the
current pc-dimm code and only gets the guest address ranges of vNVDIMM
devices. The guest address ranges as well as other information about
vNVDIMM devices are passed to Xen via a new QMP command.

Because labels are not supported for Xen guest now, Patch 01 is needed
to avoid dereferencing the NULL pointer to non-existing label data.

How to test
===========
Please refer to the cover letter of Xen patch series
"[RFC XEN PATCH 00/16] Add vNVDIMM support to HVM domains".

Haozhong Zhang (8):
  01/ nvdimm: do not initialize label_data if label_size is zero
  02/ xen-hvm: add a function to copy ACPI to guest
  03/ nvdimm acpi: do not use fw_cfg on Xen
  04/ nvdimm acpi: build and copy NFIT to guest on Xen
  05/ nvdimm acpi: build and copy NVDIMM namespace devices to guest on Xen
  06/ hostmem: add a host memory backend for Xen
  07/ xen-hvm: create hotplug memory region for HVM guest
  08/ qmp: add a qmp command 'query-nvdimms' to get plugged NVDIMM devices

 backends/Makefile.objs  |   1 +
 backends/hostmem-xen.c  | 120 ++
 backends/hostmem.c  |   9 ++
 docs/qmp-commands.txt   |  36 +++
 hw/acpi/aml-build.c |  11 ++-
 hw/acpi/nvdimm.c|  75 +-
 hw/i386/pc.c|  12 ++-
 hw/mem/nvdimm.c |  39 +++-
 hw/mem/pc-dimm.c|   5 +-
 include/hw/acpi/aml-build.h |   2 +
 include/hw/mem/nvdimm.h |  10 ++
 include/hw/xen/xen.h|   8 ++
 qapi-schema.json|  29 ++
 xen-hvm.c   | 235 
 14 files changed, 556 insertions(+), 36 deletions(-)
 create mode 100644 backends/hostmem-xen.c

-- 
2.10.1




[Qemu-devel] [RFC QEMU PATCH 1/8] nvdimm: do not initialize label_data if label_size is zero

2016-10-09 Thread Haozhong Zhang
When memory-backend-xen is used, the label_data pointer cannot be
obtained via memory_region_get_ram_ptr(). We will use other functions to
get label_data once we introduce NVDIMM label support to Xen.

Signed-off-by: Haozhong Zhang 
---
Cc: Xiao Guangrong 
Cc: "Michael S. Tsirkin" 
Cc: Igor Mammedov 
---
 hw/mem/nvdimm.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/hw/mem/nvdimm.c b/hw/mem/nvdimm.c
index 7895805..d25993b 100644
--- a/hw/mem/nvdimm.c
+++ b/hw/mem/nvdimm.c
@@ -87,7 +87,9 @@ static void nvdimm_realize(PCDIMMDevice *dimm, Error **errp)
 align = memory_region_get_alignment(mr);
 
 pmem_size = size - nvdimm->label_size;
-nvdimm->label_data = memory_region_get_ram_ptr(mr) + pmem_size;
+if (nvdimm->label_size) {
+nvdimm->label_data = memory_region_get_ram_ptr(mr) + pmem_size;
+}
 pmem_size = QEMU_ALIGN_DOWN(pmem_size, align);
 
 if (size <= nvdimm->label_size || !pmem_size) {
-- 
2.10.1




Re: [Qemu-devel] [PATCH v5 11/14] virtio-crypto: emulate virtio crypto as a legacy device by default

2016-10-09 Thread Gonglei (Arei)
Hi Michael,

Happy to listen to your voice :)


> -----Original Message-----
> From: Michael S. Tsirkin [mailto:m...@redhat.com]
> Sent: Monday, October 10, 2016 7:48 AM
> Subject: Re: [PATCH v5 11/14] virtio-crypto: emulate virtio crypto as a legacy
> device by default
> 
> On Thu, Oct 06, 2016 at 07:36:44PM +0800, Gonglei wrote:
> > The main use case for the virtio crypto device is NFV, which requires
> > that existing guests need no changes to support virtio crypto, so that
> > customers can easily migrate their existing network units to VMs.
> > This is also a basic requirement from our customers in production
> > environments.
> >
> > The virtio crypto driver can support both virtio-1.0 and earlier
> > versions, but the virtio pci driver module in most existing guests
> > cannot discover virtio-1.0 devices. Supporting them would require
> > customers to change the virtio pci module in existing guests, which
> > affects all virtio devices and is therefore impossible.
> >
> > So, let's emulate the virtio crypto device as a legacy virtio
> > device by default, using 0x1014 as the virtio crypto pci device ID
> > because the virtio crypto ID is 20 (0x14).
> >
> > Signed-off-by: Gonglei 
> 
> Sorry, I don't think this makes any sense.
> 
> You certainly can have two drivers: one for legacy and one for modern
> devices. 

Oh, this is indeed a solution.

> It only gets difficult if you try to support transitional
> devices.
> 
> Pls drop this patch.
> 
OK, then I have to temporarily drop patch 12/14 as well, because libqtest
doesn't support virtio-1.0 devices yet.


Regards,
-Gonglei

> > ---
> >  docs/specs/pci-ids.txt| 2 ++
> >  hw/virtio/virtio-crypto-pci.c | 4 +++-
> >  include/hw/pci/pci.h  | 2 ++
> >  3 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/docs/specs/pci-ids.txt b/docs/specs/pci-ids.txt
> > index fd27c67..662d4f8 100644
> > --- a/docs/specs/pci-ids.txt
> > +++ b/docs/specs/pci-ids.txt
> > @@ -22,6 +22,7 @@ maintained as part of the virtio specification.
> >  1af4:1004  SCSI host bus adapter device (legacy)
> >  1af4:1005  entropy generator device (legacy)
> >  1af4:1009  9p filesystem device (legacy)
> > +1af4:1014  crypto device (legacy)
> >
> >  1af4:1041  network device (modern)
> >  1af4:1042  block device (modern)
> > @@ -32,6 +33,7 @@ maintained as part of the virtio specification.
> >  1af4:1049  9p filesystem device (modern)
> >  1af4:1050  virtio gpu device (modern)
> >  1af4:1052  virtio input device (modern)
> > +1af4:1054  crypto device (modern)
> >
> >  1af4:10f0  Available for experimental usage without registration.  Must get
> > to  official ID when the code leaves the test lab (i.e. when seeking
> > diff --git a/hw/virtio/virtio-crypto-pci.c b/hw/virtio/virtio-crypto-pci.c
> > index 21d9984..30c10f0 100644
> > --- a/hw/virtio/virtio-crypto-pci.c
> > +++ b/hw/virtio/virtio-crypto-pci.c
> > @@ -32,7 +32,6 @@ static void virtio_crypto_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
> >  DeviceState *vdev = DEVICE(&vcrypto->vdev);
> >
> >  qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> > -virtio_pci_force_virtio_1(vpci_dev);
> >  object_property_set_bool(OBJECT(vdev), true, "realized", errp);
> >  object_property_set_link(OBJECT(vcrypto),
> >   OBJECT(vcrypto->vdev.conf.cryptodev), "cryptodev",
> > @@ -49,6 +48,9 @@ static void virtio_crypto_pci_class_init(ObjectClass *klass, void *data)
> >  set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> >  dc->props = virtio_crypto_pci_properties;
> >
> > +pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> > +pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_CRYPTO;
> > +pcidev_k->revision = 0;
> >  pcidev_k->class_id = PCI_CLASS_OTHERS;
> >  }
> >
> > diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> > index 772692f..5881101 100644
> > --- a/include/hw/pci/pci.h
> > +++ b/include/hw/pci/pci.h
> > @@ -83,6 +83,8 @@
> >  #define PCI_DEVICE_ID_VIRTIO_RNG 0x1005
> >  #define PCI_DEVICE_ID_VIRTIO_9P  0x1009
> >  #define PCI_DEVICE_ID_VIRTIO_VSOCK   0x1012
> > +#define PCI_DEVICE_ID_VIRTIO_CRYPTO  0x1014
> > +
> >
> >  #define PCI_VENDOR_ID_REDHAT 0x1b36
> >  #define PCI_DEVICE_ID_REDHAT_BRIDGE  0x0001
> > --
> > 1.7.12.4
> >



Re: [Qemu-devel] [PATCH v5 11/14] virtio-crypto: emulate virtio crypto as a legacy device by default

2016-10-09 Thread Michael S. Tsirkin
On Thu, Oct 06, 2016 at 07:36:44PM +0800, Gonglei wrote:
> The main use case for the virtio crypto device is NFV, which requires
> that existing guests need no changes to support virtio crypto, so that
> customers can easily migrate their existing network units to VMs.
> This is also a basic requirement from our customers in production
> environments.
> 
> The virtio crypto driver can support both virtio-1.0 and earlier
> versions, but the virtio pci driver module in most existing guests
> cannot discover virtio-1.0 devices. Supporting them would require
> customers to change the virtio pci module in existing guests, which
> affects all virtio devices and is therefore impossible.
>
> So, let's emulate the virtio crypto device as a legacy virtio
> device by default, using 0x1014 as the virtio crypto pci device ID
> because the virtio crypto ID is 20 (0x14).
> 
> Signed-off-by: Gonglei 

Sorry, I don't think this makes any sense.

You certainly can have two drivers: one for legacy and one for modern
devices.  It only gets difficult if you try to support transitional
devices.

Pls drop this patch.

> ---
>  docs/specs/pci-ids.txt| 2 ++
>  hw/virtio/virtio-crypto-pci.c | 4 +++-
>  include/hw/pci/pci.h  | 2 ++
>  3 files changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/docs/specs/pci-ids.txt b/docs/specs/pci-ids.txt
> index fd27c67..662d4f8 100644
> --- a/docs/specs/pci-ids.txt
> +++ b/docs/specs/pci-ids.txt
> @@ -22,6 +22,7 @@ maintained as part of the virtio specification.
>  1af4:1004  SCSI host bus adapter device (legacy)
>  1af4:1005  entropy generator device (legacy)
>  1af4:1009  9p filesystem device (legacy)
> +1af4:1014  crypto device (legacy)
>  
>  1af4:1041  network device (modern)
>  1af4:1042  block device (modern)
> @@ -32,6 +33,7 @@ maintained as part of the virtio specification.
>  1af4:1049  9p filesystem device (modern)
>  1af4:1050  virtio gpu device (modern)
>  1af4:1052  virtio input device (modern)
> +1af4:1054  crypto device (modern)
>  
>  1af4:10f0  Available for experimental usage without registration.  Must get
> to  official ID when the code leaves the test lab (i.e. when seeking
> diff --git a/hw/virtio/virtio-crypto-pci.c b/hw/virtio/virtio-crypto-pci.c
> index 21d9984..30c10f0 100644
> --- a/hw/virtio/virtio-crypto-pci.c
> +++ b/hw/virtio/virtio-crypto-pci.c
> @@ -32,7 +32,6 @@ static void virtio_crypto_pci_realize(VirtIOPCIProxy *vpci_dev, Error **errp)
>  DeviceState *vdev = DEVICE(&vcrypto->vdev);
>  
>  qdev_set_parent_bus(vdev, BUS(&vpci_dev->bus));
> -virtio_pci_force_virtio_1(vpci_dev);
>  object_property_set_bool(OBJECT(vdev), true, "realized", errp);
>  object_property_set_link(OBJECT(vcrypto),
>   OBJECT(vcrypto->vdev.conf.cryptodev), "cryptodev",
> @@ -49,6 +48,9 @@ static void virtio_crypto_pci_class_init(ObjectClass *klass, void *data)
>  set_bit(DEVICE_CATEGORY_MISC, dc->categories);
>  dc->props = virtio_crypto_pci_properties;
>  
> +pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> +pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_CRYPTO;
> +pcidev_k->revision = 0;
>  pcidev_k->class_id = PCI_CLASS_OTHERS;
>  }
>  
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index 772692f..5881101 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -83,6 +83,8 @@
>  #define PCI_DEVICE_ID_VIRTIO_RNG 0x1005
>  #define PCI_DEVICE_ID_VIRTIO_9P  0x1009
>  #define PCI_DEVICE_ID_VIRTIO_VSOCK   0x1012
> +#define PCI_DEVICE_ID_VIRTIO_CRYPTO  0x1014
> +
>  
>  #define PCI_VENDOR_ID_REDHAT 0x1b36
>  #define PCI_DEVICE_ID_REDHAT_BRIDGE  0x0001
> -- 
> 1.7.12.4
> 



[Qemu-devel] [PATCH v5 33/35] target-arm: remove EXCP_STREX + cpu_exclusive_{test, info}

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

The exception is not emitted anymore; remove it and the associated
TCG variables.

Reviewed-by: Alex Bennée 
Signed-off-by: Emilio G. Cota 
Signed-off-by: Richard Henderson 
Message-Id: <1467054136-10430-31-git-send-email-c...@braap.org>
---
 target-arm/cpu.h   | 17 ++---
 target-arm/internals.h |  4 +---
 target-arm/translate.c | 10 --
 target-arm/translate.h |  4 
 4 files changed, 7 insertions(+), 28 deletions(-)

diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index 76d824d..a38cec0 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -46,13 +46,12 @@
 #define EXCP_BKPT7
 #define EXCP_EXCEPTION_EXIT  8   /* Return from v7M exception.  */
 #define EXCP_KERNEL_TRAP 9   /* Jumped to kernel code page.  */
-#define EXCP_STREX  10
-#define EXCP_HVC11   /* HyperVisor Call */
-#define EXCP_HYP_TRAP   12
-#define EXCP_SMC13   /* Secure Monitor Call */
-#define EXCP_VIRQ   14
-#define EXCP_VFIQ   15
-#define EXCP_SEMIHOST   16   /* semihosting call (A64 only) */
+#define EXCP_HVC10   /* HyperVisor Call */
+#define EXCP_HYP_TRAP   11
+#define EXCP_SMC12   /* Secure Monitor Call */
+#define EXCP_VIRQ   13
+#define EXCP_VFIQ   14
+#define EXCP_SEMIHOST   15   /* semihosting call (A64 only) */
 
 #define ARMV7M_EXCP_RESET   1
 #define ARMV7M_EXCP_NMI 2
@@ -475,10 +474,6 @@ typedef struct CPUARMState {
 uint64_t exclusive_addr;
 uint64_t exclusive_val;
 uint64_t exclusive_high;
-#if defined(CONFIG_USER_ONLY)
-uint64_t exclusive_test;
-uint32_t exclusive_info;
-#endif
 
 /* iwMMXt coprocessor state.  */
 struct {
diff --git a/target-arm/internals.h b/target-arm/internals.h
index cd57401..3edccd2 100644
--- a/target-arm/internals.h
+++ b/target-arm/internals.h
@@ -46,8 +46,7 @@ static inline bool excp_is_internal(int excp)
 || excp == EXCP_HALTED
 || excp == EXCP_EXCEPTION_EXIT
 || excp == EXCP_KERNEL_TRAP
-|| excp == EXCP_SEMIHOST
-|| excp == EXCP_STREX;
+|| excp == EXCP_SEMIHOST;
 }
 
 /* Exception names for debug logging; note that not all of these
@@ -63,7 +62,6 @@ static const char * const excnames[] = {
 [EXCP_BKPT] = "Breakpoint",
 [EXCP_EXCEPTION_EXIT] = "QEMU v7M exception exit",
 [EXCP_KERNEL_TRAP] = "QEMU intercept of kernel commpage",
-[EXCP_STREX] = "QEMU intercept of STREX",
 [EXCP_HVC] = "Hypervisor Call",
 [EXCP_HYP_TRAP] = "Hypervisor Trap",
 [EXCP_SMC] = "Secure Monitor Call",
diff --git a/target-arm/translate.c b/target-arm/translate.c
index 7048cb3..5e21b52 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -64,10 +64,6 @@ static TCGv_i32 cpu_R[16];
 TCGv_i32 cpu_CF, cpu_NF, cpu_VF, cpu_ZF;
 TCGv_i64 cpu_exclusive_addr;
 TCGv_i64 cpu_exclusive_val;
-#ifdef CONFIG_USER_ONLY
-TCGv_i64 cpu_exclusive_test;
-TCGv_i32 cpu_exclusive_info;
-#endif
 
 /* FIXME:  These should be removed.  */
 static TCGv_i32 cpu_F0s, cpu_F1s;
@@ -101,12 +97,6 @@ void arm_translate_init(void)
 offsetof(CPUARMState, exclusive_addr), "exclusive_addr");
 cpu_exclusive_val = tcg_global_mem_new_i64(cpu_env,
 offsetof(CPUARMState, exclusive_val), "exclusive_val");
-#ifdef CONFIG_USER_ONLY
-cpu_exclusive_test = tcg_global_mem_new_i64(cpu_env,
-offsetof(CPUARMState, exclusive_test), "exclusive_test");
-cpu_exclusive_info = tcg_global_mem_new_i32(cpu_env,
-offsetof(CPUARMState, exclusive_info), "exclusive_info");
-#endif
 
 a64_translate_init();
 }
diff --git a/target-arm/translate.h b/target-arm/translate.h
index dbd7ac8..d4e205e 100644
--- a/target-arm/translate.h
+++ b/target-arm/translate.h
@@ -77,10 +77,6 @@ extern TCGv_env cpu_env;
 extern TCGv_i32 cpu_NF, cpu_ZF, cpu_CF, cpu_VF;
 extern TCGv_i64 cpu_exclusive_addr;
 extern TCGv_i64 cpu_exclusive_val;
-#ifdef CONFIG_USER_ONLY
-extern TCGv_i64 cpu_exclusive_test;
-extern TCGv_i32 cpu_exclusive_info;
-#endif
 
 static inline int arm_dc_feature(DisasContext *dc, int feature)
 {
-- 
2.7.4




[Qemu-devel] [PATCH v5 29/35] target-arm: emulate SWP with atomic_xchg helper

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

Signed-off-by: Emilio G. Cota 
Message-Id: <1467054136-10430-25-git-send-email-c...@braap.org>
Signed-off-by: Richard Henderson 
---
 target-arm/translate.c | 26 ++
 1 file changed, 14 insertions(+), 12 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index e0e29d9..7048cb3 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -8752,25 +8752,27 @@ static void disas_arm_insn(DisasContext *s, unsigned int insn)
 }
 tcg_temp_free_i32(addr);
 } else {
+TCGv taddr;
+TCGMemOp opc = s->be_data;
+
 /* SWP instruction */
 rm = (insn) & 0xf;
 
-/* ??? This is not really atomic.  However we know
-   we never have multiple CPUs running in parallel,
-   so it is good enough.  */
-addr = load_reg(s, rn);
-tmp = load_reg(s, rm);
-tmp2 = tcg_temp_new_i32();
 if (insn & (1 << 22)) {
-gen_aa32_ld8u(s, tmp2, addr, get_mem_index(s));
-gen_aa32_st8(s, tmp, addr, get_mem_index(s));
+opc |= MO_UB;
 } else {
-gen_aa32_ld32u(s, tmp2, addr, get_mem_index(s));
-gen_aa32_st32(s, tmp, addr, get_mem_index(s));
+opc |= MO_UL | MO_ALIGN;
 }
-tcg_temp_free_i32(tmp);
+
+addr = load_reg(s, rn);
+taddr = gen_aa32_addr(s, addr, opc);
 tcg_temp_free_i32(addr);
-store_reg(s, rd, tmp2);
+
+tmp = load_reg(s, rm);
+tcg_gen_atomic_xchg_i32(tmp, taddr, tmp,
+get_mem_index(s), opc);
+tcg_temp_free(taddr);
+store_reg(s, rd, tmp);
 }
 }
 } else {
-- 
2.7.4




[Qemu-devel] [PATCH v5 30/35] target-arm: emulate aarch64's LL/SC using cmpxchg helpers

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.
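
To make the ABA caveat concrete, here is a minimal single-threaded sketch in C11. It is not taken from the patch: `struct emu_monitor`, `emu_ll()`, `emu_sc()` and `aba_demo()` are hypothetical names, and the two plain stores stand in for writes by another CPU. It only illustrates that a cmpxchg-based store-exclusive compares values, so it succeeds after an A -> B -> A sequence where a real exclusive monitor would have been cleared.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-CPU state: the value seen by the last load-exclusive. */
struct emu_monitor {
    uint32_t val;
};

/* Emulated load-exclusive: load the value and remember it. */
static uint32_t emu_ll(struct emu_monitor *m, _Atomic uint32_t *p)
{
    m->val = atomic_load(p);
    return m->val;
}

/* Emulated store-exclusive: succeeds iff memory still holds the remembered value. */
static bool emu_sc(struct emu_monitor *m, _Atomic uint32_t *p, uint32_t newval)
{
    uint32_t expected = m->val;
    return atomic_compare_exchange_strong(p, &expected, newval);
}

/* Returns true if the emulated SC succeeded despite intervening writes. */
static bool aba_demo(void)
{
    _Atomic uint32_t mem = 1;       /* value A */
    struct emu_monitor m;

    emu_ll(&m, &mem);
    atomic_store(&mem, 2);          /* "another CPU" writes B ... */
    atomic_store(&mem, 1);          /* ... and then A again */
    return emu_sc(&m, &mem, 42);    /* hardware LL/SC would fail here */
}
```

If the intervening writes leave a different value behind, the emulated store-exclusive fails as expected; only the value-restoring case slips through.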

The appended patch emulates LL/SC pairs in aarch64 with cmpxchg helpers.
This works in both user and system mode. In usermode, it avoids
pausing all other CPUs to perform the LL/SC pair. The subsequent
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.

Hi-res plots: http://imgur.com/a/JVc8Y

[Figure: atomic_add-bench throughput vs. number of threads (0-60), 100 ops/thread, cmpxchg (E) vs. master (H); panels for ranges [0,1], [0,2] and [0,128].]

[Qemu-devel] [PATCH v5 28/35] target-arm: emulate LL/SC using cmpxchg helpers

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. Portable parallel code, however,
is written assuming only cmpxchg--and not LL/SC--is available.
This means that in practice emulating LL/SC with cmpxchg is
a viable alternative.

The appended patch emulates LL/SC pairs in ARM with cmpxchg helpers.
This works in both user and system mode. In usermode, it avoids
pausing all other CPUs to perform the LL/SC pair. The subsequent
performance and scalability improvement is significant, as the
plots below show. They plot the throughput of atomic_add-bench
compiled for ARM and executed on a 64-core x86 machine.

Hi-res plots: http://imgur.com/a/aNQpB

[Figure: atomic_add-bench throughput vs. number of threads (0-60), 100 ops/thread, cmpxchg (E) vs. master (H); panels for ranges [0,1], [0,2] and [0,128].]

[Qemu-devel] [PATCH v5 26/35] tests: add atomic_add-bench

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

With this microbenchmark we can measure the overhead of emulating atomic
instructions with a configurable degree of contention.

The benchmark spawns $n threads, each performing $o atomic ops (additions)
in a loop. Each atomic operation is performed on a different cache line
(assuming lines are 64 bytes long) that is randomly selected from a range [0, $r).

[ Note: each $foo corresponds to a -foo flag ]
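
As a rough illustration of the slot selection described above, the sketch below advances a per-thread xorshift64* state and masks it down to a power-of-two range. The xorshift64* step matches the one in the patch; `round_up_pow2()` and `next_index()` are hypothetical helpers written for this example, not functions from the benchmark.

```c
#include <stdint.h>

/* One xorshift64* step (see https://en.wikipedia.org/wiki/Xorshift). */
static uint64_t xorshift64star(uint64_t x)
{
    x ^= x >> 12;
    x ^= x << 25;
    x ^= x >> 27;
    return x * UINT64_C(2685821657736338717);
}

/* Hypothetical helper: round v up to a power of two, for 1 <= v <= 2^31. */
static unsigned int round_up_pow2(unsigned int v)
{
    unsigned int p = 1;

    while (p < v) {
        p <<= 1;
    }
    return p;
}

/*
 * Hypothetical helper: advance the per-thread state and pick a slot in
 * [0, range). The mask only works when range is a power of two, which is
 * why the range is rounded up first. The state must be nonzero.
 */
static unsigned int next_index(uint64_t *state, unsigned int range)
{
    *state = xorshift64star(*state);
    return (unsigned int)(*state & (range - 1));
}
```

With a range of 1 every thread hits the same counter (maximum contention); larger ranges spread the increments over more cache lines.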

Signed-off-by: Emilio G. Cota 
Signed-off-by: Richard Henderson 
Message-Id: <1467054136-10430-20-git-send-email-c...@braap.org>
---
 tests/.gitignore |   1 +
 tests/Makefile.include   |   4 +-
 tests/atomic_add-bench.c | 181 +++
 3 files changed, 185 insertions(+), 1 deletion(-)
 create mode 100644 tests/atomic_add-bench.c

diff --git a/tests/.gitignore b/tests/.gitignore
index 0f0c79b..ea379b4 100644
--- a/tests/.gitignore
+++ b/tests/.gitignore
@@ -1,3 +1,4 @@
+atomic_add-bench
 check-qdict
 check-qfloat
 check-qint
diff --git a/tests/Makefile.include b/tests/Makefile.include
index a7c..177661e 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -454,7 +454,8 @@ test-obj-y = tests/check-qint.o tests/check-qstring.o tests/check-qdict.o \
tests/test-opts-visitor.o tests/test-qmp-event.o \
tests/rcutorture.o tests/test-rcu-list.o \
tests/test-qdist.o \
-   tests/test-qht.o tests/qht-bench.o tests/test-qht-par.o
+   tests/test-qht.o tests/qht-bench.o tests/test-qht-par.o \
+   tests/atomic_add-bench.o
 
 $(test-obj-y): QEMU_INCLUDES += -Itests
 QEMU_CFLAGS += -I$(SRC_PATH)/tests
@@ -499,6 +500,7 @@ tests/test-qht$(EXESUF): tests/test-qht.o $(test-util-obj-y)
 tests/test-qht-par$(EXESUF): tests/test-qht-par.o tests/qht-bench$(EXESUF) $(test-util-obj-y)
 tests/qht-bench$(EXESUF): tests/qht-bench.o $(test-util-obj-y)
 tests/test-bufferiszero$(EXESUF): tests/test-bufferiszero.o $(test-util-obj-y)
+tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o $(test-util-obj-y)
 
 tests/test-qdev-global-props$(EXESUF): tests/test-qdev-global-props.o \
hw/core/qdev.o hw/core/qdev-properties.o hw/core/hotplug.o\
diff --git a/tests/atomic_add-bench.c b/tests/atomic_add-bench.c
new file mode 100644
index 0000000..77a9f03
--- /dev/null
+++ b/tests/atomic_add-bench.c
@@ -0,0 +1,181 @@
+#include "qemu/osdep.h"
+#include "qemu/thread.h"
+#include "qemu/host-utils.h"
+#include "qemu/processor.h"
+
+struct thread_info {
+uint64_t r;
+} QEMU_ALIGNED(64);
+
+struct count {
+unsigned long val;
+} QEMU_ALIGNED(64);
+
+static QemuThread *threads;
+static struct thread_info *th_info;
+static unsigned int n_threads = 1;
+static unsigned int n_ready_threads;
+static struct count *counts;
+static unsigned long n_ops = 1;
+static double duration;
+static unsigned int range = 1;
+static bool test_start;
+
+static const char commands_string[] =
+" -n = number of threads\n"
+" -o = number of ops per thread\n"
+" -r = range (will be rounded up to pow2)";
+
+static void usage_complete(char *argv[])
+{
+fprintf(stderr, "Usage: %s [options]\n", argv[0]);
+fprintf(stderr, "options:\n%s\n", commands_string);
+}
+
+/*
+ * From: https://en.wikipedia.org/wiki/Xorshift
+ * This is faster than rand_r(), and gives us a wider range (RAND_MAX is only
+ * guaranteed to be >= INT_MAX).
+ */
+static uint64_t xorshift64star(uint64_t x)
+{
+x ^= x >> 12; /* a */
+x ^= x << 25; /* b */
+x ^= x >> 27; /* c */
+return x * UINT64_C(2685821657736338717);
+}
+
+static void *thread_func(void *arg)
+{
+struct thread_info *info = arg;
+unsigned long i;
+
+atomic_inc(&n_ready_threads);
+while (!atomic_mb_read(&test_start)) {
+cpu_relax();
+}
+
+for (i = 0; i < n_ops; i++) {
+unsigned int index;
+
+info->r = xorshift64star(info->r);
+index = info->r & (range - 1);
+atomic_inc(&counts[index].val);
+}
+return NULL;
+}
+
+static inline
+uint64_t ts_subtract(const struct timespec *a, const struct timespec *b)
+{
+uint64_t ns;
+
+ns = (b->tv_sec - a->tv_sec) * 1000000000ULL;
+ns += (b->tv_nsec - a->tv_nsec);
+return ns;
+}
+
+static void run_test(void)
+{
+unsigned int i;
+struct timespec ts_start, ts_end;
+
+while (atomic_read(&n_ready_threads) != n_threads) {
+cpu_relax();
+}
+atomic_mb_set(&test_start, true);
+
+clock_gettime(CLOCK_MONOTONIC, &ts_start);
+for (i = 0; i < n_threads; i++) {
+qemu_thread_join(&threads[i]);
+}
+clock_gettime(CLOCK_MONOTONIC, &ts_end);
+duration = ts_subtract(&ts_start, &ts_end) / 1e9;
+}
+
+static void create_threads(void)
+{
+unsigned int i;
+
+threads = g_new(QemuThread, n_threads);
+th_info = g_new(struct thread_info, n_threads);
+counts = qemu_memalign(64, sizeof(*counts) * range);
+memset(counts, 0, sizeof(*counts) * range);
+
+for (i = 0; i < n_threads; i++) {
+struct 

[Qemu-devel] [PATCH v5 27/35] target-arm: Rearrange aa32 load and store functions

2016-10-09 Thread Richard Henderson
Stop specializing on TARGET_LONG_BITS == 32; unconditionally allocate
a temp and expand with tcg_gen_extu_i32_tl.  Split out gen_aa32_addr,
gen_aa32_frob64, gen_aa32_ld_i32 and gen_aa32_st_i32 as separate interfaces.
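
For reference, the BE32 address adjustment that gen_aa32_addr applies to sub-word accesses can be modelled in plain C as below. This is only a sketch of the arithmetic: `enum access_size` and `be32_adjust_addr()` are hypothetical stand-ins (the enum mirrors log2 of the access size in bytes), and the user-mode and SCTLR.B checks are left out.

```c
#include <stdint.h>

/* Hypothetical stand-ins: log2 of the access size in bytes. */
enum access_size { SZ_8 = 0, SZ_16 = 1, SZ_32 = 2, SZ_64 = 3 };

/*
 * Model of the BE32 adjustment: accesses narrower than 32 bits have
 * their address XORed with (4 - size_in_bytes), i.e. 3 for bytes and
 * 2 for halfwords. Word and doubleword addresses are left alone.
 */
static uint32_t be32_adjust_addr(uint32_t addr, enum access_size sz)
{
    if (sz < SZ_32) {
        addr ^= 4 - (1u << sz);
    }
    return addr;
}
```

For example, a byte access to 0x1000 is redirected to 0x1003 and a halfword access to 0x1000 goes to 0x1002, which swaps the sub-word lanes within each aligned 32-bit word.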

Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 target-arm/translate.c | 171 +++--
 1 file changed, 66 insertions(+), 105 deletions(-)

diff --git a/target-arm/translate.c b/target-arm/translate.c
index 8df24bf..f745c37 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -931,145 +931,106 @@ static inline void store_reg_from_load(DisasContext *s, int reg, TCGv_i32 var)
  * These functions work like tcg_gen_qemu_{ld,st}* except
  * that the address argument is TCGv_i32 rather than TCGv.
  */
-#if TARGET_LONG_BITS == 32
 
-#define DO_GEN_LD(SUFF, OPC, BE32_XOR)   \
-static inline void gen_aa32_ld##SUFF(DisasContext *s, TCGv_i32 val,  \
- TCGv_i32 addr, int index)   \
-{\
-TCGMemOp opc = (OPC) | s->be_data;   \
-/* Not needed for user-mode BE32, where we use MO_BE instead.  */\
-if (!IS_USER_ONLY && s->sctlr_b && BE32_XOR) {   \
-TCGv addr_be = tcg_temp_new();   \
-tcg_gen_xori_i32(addr_be, addr, BE32_XOR);   \
-tcg_gen_qemu_ld_i32(val, addr_be, index, opc);   \
-tcg_temp_free(addr_be);  \
-return;  \
-}\
-tcg_gen_qemu_ld_i32(val, addr, index, opc);  \
-}
-
-#define DO_GEN_ST(SUFF, OPC, BE32_XOR)   \
-static inline void gen_aa32_st##SUFF(DisasContext *s, TCGv_i32 val,  \
- TCGv_i32 addr, int index)   \
-{\
-TCGMemOp opc = (OPC) | s->be_data;   \
-/* Not needed for user-mode BE32, where we use MO_BE instead.  */\
-if (!IS_USER_ONLY && s->sctlr_b && BE32_XOR) {   \
-TCGv addr_be = tcg_temp_new();   \
-tcg_gen_xori_i32(addr_be, addr, BE32_XOR);   \
-tcg_gen_qemu_st_i32(val, addr_be, index, opc);   \
-tcg_temp_free(addr_be);  \
-return;  \
-}\
-tcg_gen_qemu_st_i32(val, addr, index, opc);  \
-}
-
-static inline void gen_aa32_ld64(DisasContext *s, TCGv_i64 val,
- TCGv_i32 addr, int index)
+static inline TCGv gen_aa32_addr(DisasContext *s, TCGv_i32 a32, TCGMemOp op)
 {
-TCGMemOp opc = MO_Q | s->be_data;
-tcg_gen_qemu_ld_i64(val, addr, index, opc);
+TCGv addr = tcg_temp_new();
+tcg_gen_extu_i32_tl(addr, a32);
+
 /* Not needed for user-mode BE32, where we use MO_BE instead.  */
-if (!IS_USER_ONLY && s->sctlr_b) {
-tcg_gen_rotri_i64(val, val, 32);
+if (!IS_USER_ONLY && s->sctlr_b && (op & MO_SIZE) < MO_32) {
+tcg_gen_xori_tl(addr, addr, 4 - (1 << (op & MO_SIZE)));
 }
+return addr;
 }
 
-static inline void gen_aa32_st64(DisasContext *s, TCGv_i64 val,
- TCGv_i32 addr, int index)
+static void gen_aa32_ld_i32(DisasContext *s, TCGv_i32 val, TCGv_i32 a32,
+int index, TCGMemOp opc)
 {
-TCGMemOp opc = MO_Q | s->be_data;
-/* Not needed for user-mode BE32, where we use MO_BE instead.  */
-if (!IS_USER_ONLY && s->sctlr_b) {
-TCGv_i64 tmp = tcg_temp_new_i64();
-tcg_gen_rotri_i64(tmp, val, 32);
-tcg_gen_qemu_st_i64(tmp, addr, index, opc);
-tcg_temp_free_i64(tmp);
-return;
-}
-tcg_gen_qemu_st_i64(val, addr, index, opc);
+TCGv addr = gen_aa32_addr(s, a32, opc);
+tcg_gen_qemu_ld_i32(val, addr, index, opc);
+tcg_temp_free(addr);
 }
 
-#else
+static void gen_aa32_st_i32(DisasContext *s, TCGv_i32 val, TCGv_i32 a32,
+int index, TCGMemOp opc)
+{
+TCGv addr = gen_aa32_addr(s, a32, opc);
+tcg_gen_qemu_st_i32(val, addr, index, opc);
+tcg_temp_free(addr);
+}
 
-#define DO_GEN_LD(SUFF, OPC, BE32_XOR)   \
+#define DO_GEN_LD(SUFF, OPC) \
 static inline void gen_aa32_ld##SUFF(DisasContext *s, TCGv_i32 val,  \
- TCGv_i32 

[Qemu-devel] [PATCH v5 25/35] target-i386: remove helper_lock()

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

It's been superseded by the atomic helpers.

The use of the atomic helpers provides a significant performance and scalability
improvement. Below is the result of running the atomic_add-bench microbenchmark
with:
 $ x86_64-linux-user/qemu-x86_64 tests/atomic_add-bench -o 500 -r $r -n $n
where $n is the number of threads and $r is the allowed range for the
additions.

The scenarios measured are:
- atomic: implements x86's ADDL with the atomic_add helper (i.e. this patchset)
- cmpxchg: implements x86's ADDL with a TCG loop using the cmpxchg helper
- master: before this patchset
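
The difference between the first two scenarios can be sketched on the host side with C11 atomics. This is an analogy rather than the TCG code: `add_with_fetch_add()` and `add_with_cmpxchg()` are hypothetical names, and both return the new value.

```c
#include <stdatomic.h>
#include <stdint.h>

/* "atomic" scenario: a single fetch-and-add, no retry needed. */
static uint32_t add_with_fetch_add(_Atomic uint32_t *p, uint32_t v)
{
    return atomic_fetch_add(p, v) + v;
}

/* "cmpxchg" scenario: retry until the compare-and-swap succeeds. */
static uint32_t add_with_cmpxchg(_Atomic uint32_t *p, uint32_t v)
{
    uint32_t old = atomic_load(p);

    while (!atomic_compare_exchange_weak(p, &old, old + v)) {
        /* On failure 'old' holds the current value; loop and retry. */
    }
    return old + v;
}
```

Under contention the compare-and-swap loop may retry many times, which is one plausible reason for the cmpxchg scenario scaling worse than the atomic one.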

Results are sorted by ascending range, i.e. descending degree of contention.
The Y axis is throughput in Mops/s. Tests are run on an AMD machine with 64
Opteron 6376 cores.

[Figure: atomic_add-bench throughput (Mops/s) vs. number of threads (0-60), 500 ops/thread, atomic (E) vs. cmpxchg (H) vs. master (N); panels for ranges [0,1], [0,2] and [0,8].]

[Qemu-devel] [PATCH v5 24/35] target-i386: emulate XCHG using atomic helper

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

Signed-off-by: Emilio G. Cota 
Message-Id: <1467054136-10430-19-git-send-email-c...@braap.org>
Signed-off-by: Richard Henderson 
---
 target-i386/translate.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/target-i386/translate.c b/target-i386/translate.c
index e781869..c8827f3 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -5564,12 +5564,8 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
 gen_lea_modrm(env, s, modrm);
 gen_op_mov_v_reg(ot, cpu_T0, reg);
 /* for xchg, lock is implicit */
-if (!(prefixes & PREFIX_LOCK))
-gen_helper_lock();
-gen_op_ld_v(s, ot, cpu_T1, cpu_A0);
-gen_op_st_v(s, ot, cpu_T0, cpu_A0);
-if (!(prefixes & PREFIX_LOCK))
-gen_helper_unlock();
+tcg_gen_atomic_xchg_tl(cpu_T1, cpu_A0, cpu_T0,
+   s->mem_index, ot | MO_LE);
 gen_op_mov_reg_v(ot, reg, cpu_T1);
 }
 break;
-- 
2.7.4




[Qemu-devel] [PATCH v5 21/35] target-i386: emulate LOCK'ed NEG using cmpxchg helper

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

[rth: Move redundant qemu_load out of cmpxchg loop.]
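
The shape of the loop, with the initial load hoisted out of it, can be illustrated with host-side C11 atomics. This is a sketch under that analogy only; `atomic_neg()` is a hypothetical name and the TCG code in the diff remains the reference.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically replace *p with its negation; returns the value that was negated. */
static uint32_t atomic_neg(_Atomic uint32_t *p)
{
    uint32_t old = atomic_load(p);   /* single load, outside the loop */

    while (!atomic_compare_exchange_strong(p, &old, 0u - old)) {
        /*
         * A failed compare-exchange already wrote the current memory
         * value into 'old', so no separate reload is needed here.
         */
    }
    return old;
}
```

The returned old value is what a flags computation would need, since the NEG result and its condition codes both derive from the operand that was actually negated.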

Signed-off-by: Emilio G. Cota 
Message-Id: <1467054136-10430-16-git-send-email-c...@braap.org>
Signed-off-by: Richard Henderson 
---
 target-i386/translate.c | 38 ++
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/target-i386/translate.c b/target-i386/translate.c
index 49455a3..17a37a3 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -4713,11 +4713,41 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
 }
 break;
 case 3: /* neg */
-tcg_gen_neg_tl(cpu_T0, cpu_T0);
-if (mod != 3) {
-gen_op_st_v(s, ot, cpu_T0, cpu_A0);
+if (s->prefix & PREFIX_LOCK) {
+TCGLabel *label1;
+TCGv a0, t0, t1, t2;
+
+if (mod == 3) {
+goto illegal_op;
+}
+a0 = tcg_temp_local_new();
+t0 = tcg_temp_local_new();
+label1 = gen_new_label();
+
+tcg_gen_mov_tl(a0, cpu_A0);
+tcg_gen_mov_tl(t0, cpu_T0);
+
+gen_set_label(label1);
+t1 = tcg_temp_new();
+t2 = tcg_temp_new();
+tcg_gen_mov_tl(t2, t0);
+tcg_gen_neg_tl(t1, t0);
+tcg_gen_atomic_cmpxchg_tl(t0, a0, t0, t1,
+  s->mem_index, ot | MO_LE);
+tcg_temp_free(t1);
+tcg_gen_brcond_tl(TCG_COND_NE, t0, t2, label1);
+
+tcg_temp_free(t2);
+tcg_temp_free(a0);
+tcg_gen_mov_tl(cpu_T0, t0);
+tcg_temp_free(t0);
 } else {
-gen_op_mov_reg_v(ot, rm, cpu_T0);
+tcg_gen_neg_tl(cpu_T0, cpu_T0);
+if (mod != 3) {
+gen_op_st_v(s, ot, cpu_T0, cpu_A0);
+} else {
+gen_op_mov_reg_v(ot, rm, cpu_T0);
+}
 }
 gen_op_update_neg_cc();
 set_cc_op(s, CC_OP_SUBB + ot);
-- 
2.7.4
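
Editorial sketch (not part of the patch): x86 has no atomic negate primitive, so the locked path above retries a compare-and-swap until no other writer intervenes. A minimal C11 version of that retry loop, assuming a host with <stdatomic.h>, might look like this. The name atomic_neg is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical illustration, not QEMU code: atomically replace *mem with
 * its two's-complement negation and return the value that was stored. */
static uint32_t atomic_neg(_Atomic uint32_t *mem)
{
    uint32_t old = atomic_load(mem);   /* single load, outside the loop */
    uint32_t neg;

    do {
        neg = 0u - old;
        /* On failure the call refreshes 'old' with the current memory
         * contents, so no separate reload is needed before retrying. */
    } while (!atomic_compare_exchange_weak(mem, &old, neg));

    return neg;
}
```

Keeping the initial load outside the loop corresponds to the "[rth: Move redundant qemu_load out of cmpxchg loop.]" note: the failed compare-and-swap already yields the fresh value.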




[Qemu-devel] [PATCH v5 20/35] target-i386: emulate LOCK'ed NOT using atomic helper

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

[rth: Avoid qemu_load that's redundant with the atomic op.]

Signed-off-by: Emilio G. Cota 
Message-Id: <1467054136-10430-15-git-send-email-c...@braap.org>
Signed-off-by: Richard Henderson 
---
 target-i386/translate.c | 26 --
 1 file changed, 20 insertions(+), 6 deletions(-)

diff --git a/target-i386/translate.c b/target-i386/translate.c
index a38d953..49455a3 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -4675,10 +4675,15 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
 rm = (modrm & 7) | REX_B(s);
 op = (modrm >> 3) & 7;
 if (mod != 3) {
-if (op == 0)
+if (op == 0) {
 s->rip_offset = insn_const_size(ot);
+}
 gen_lea_modrm(env, s, modrm);
-gen_op_ld_v(s, ot, cpu_T0, cpu_A0);
+/* For those below that handle locked memory, don't load here.  */
+if (!(s->prefix & PREFIX_LOCK)
+|| op != 2) {
+gen_op_ld_v(s, ot, cpu_T0, cpu_A0);
+}
 } else {
 gen_op_mov_v_reg(ot, cpu_T0, rm);
 }
@@ -4691,11 +4696,20 @@ static target_ulong disas_insn(CPUX86State *env, DisasContext *s,
 set_cc_op(s, CC_OP_LOGICB + ot);
 break;
 case 2: /* not */
-tcg_gen_not_tl(cpu_T0, cpu_T0);
-if (mod != 3) {
-gen_op_st_v(s, ot, cpu_T0, cpu_A0);
+if (s->prefix & PREFIX_LOCK) {
+if (mod == 3) {
+goto illegal_op;
+}
+tcg_gen_movi_tl(cpu_T0, ~0);
+tcg_gen_atomic_xor_fetch_tl(cpu_T0, cpu_A0, cpu_T0,
+s->mem_index, ot | MO_LE);
 } else {
-gen_op_mov_reg_v(ot, rm, cpu_T0);
+tcg_gen_not_tl(cpu_T0, cpu_T0);
+if (mod != 3) {
+gen_op_st_v(s, ot, cpu_T0, cpu_A0);
+} else {
+gen_op_mov_reg_v(ot, rm, cpu_T0);
+}
 }
 break;
 case 3: /* neg */
-- 
2.7.4
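
Editorial sketch (not part of the patch): the locked NOT above relies on the identity ~x == x ^ ~0, which lets a single atomic XOR do the whole read-modify-write. A C11 illustration for a byte operand, under the assumption of a host with <stdatomic.h>, follows. The name atomic_not8 is hypothetical.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical illustration, not QEMU code: atomically invert every bit of
 * *mem by XOR-ing with all ones, and return the value now in memory. */
static uint8_t atomic_not8(_Atomic uint8_t *mem)
{
    /* atomic_fetch_xor returns the old value; inverting it gives the
     * value that the operation stored. */
    uint8_t old = atomic_fetch_xor(mem, (uint8_t)0xff);
    return (uint8_t)~old;
}
```

Because the atomic operation performs its own read, a prior plain load of the operand would be redundant, which is the point of the "don't load here" comment in the hunk.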




[Qemu-devel] [PATCH v5 18/35] target-i386: emulate LOCK'ed OP instructions using atomic helpers

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

[rth: Eliminate some unnecessary temporaries.]

Signed-off-by: Emilio G. Cota 
Message-Id: <1467054136-10430-13-git-send-email-c...@braap.org>
Signed-off-by: Richard Henderson 
---
 target-i386/translate.c | 76 +
 1 file changed, 58 insertions(+), 18 deletions(-)

diff --git a/target-i386/translate.c b/target-i386/translate.c
index 5d9790a..b5c7791 100644
--- a/target-i386/translate.c
+++ b/target-i386/translate.c
@@ -1258,55 +1258,95 @@ static void gen_op(DisasContext *s1, int op, TCGMemOp ot, int d)
 {
 if (d != OR_TMP0) {
 gen_op_mov_v_reg(ot, cpu_T0, d);
-} else {
+} else if (!(s1->prefix & PREFIX_LOCK)) {
 gen_op_ld_v(s1, ot, cpu_T0, cpu_A0);
 }
 switch(op) {
 case OP_ADCL:
 gen_compute_eflags_c(s1, cpu_tmp4);
-tcg_gen_add_tl(cpu_T0, cpu_T0, cpu_T1);
-tcg_gen_add_tl(cpu_T0, cpu_T0, cpu_tmp4);
-gen_op_st_rm_T0_A0(s1, ot, d);
+if (s1->prefix & PREFIX_LOCK) {
+tcg_gen_add_tl(cpu_T0, cpu_tmp4, cpu_T1);
+tcg_gen_atomic_add_fetch_tl(cpu_T0, cpu_A0, cpu_T0,
+s1->mem_index, ot | MO_LE);
+} else {
+tcg_gen_add_tl(cpu_T0, cpu_T0, cpu_T1);
+tcg_gen_add_tl(cpu_T0, cpu_T0, cpu_tmp4);
+gen_op_st_rm_T0_A0(s1, ot, d);
+}
 gen_op_update3_cc(cpu_tmp4);
 set_cc_op(s1, CC_OP_ADCB + ot);
 break;
 case OP_SBBL:
 gen_compute_eflags_c(s1, cpu_tmp4);
-tcg_gen_sub_tl(cpu_T0, cpu_T0, cpu_T1);
-tcg_gen_sub_tl(cpu_T0, cpu_T0, cpu_tmp4);
-gen_op_st_rm_T0_A0(s1, ot, d);
+if (s1->prefix & PREFIX_LOCK) {
+tcg_gen_add_tl(cpu_T0, cpu_T1, cpu_tmp4);
+tcg_gen_neg_tl(cpu_T0, cpu_T0);
+tcg_gen_atomic_add_fetch_tl(cpu_T0, cpu_A0, cpu_T0,
+s1->mem_index, ot | MO_LE);
+} else {
+tcg_gen_sub_tl(cpu_T0, cpu_T0, cpu_T1);
+tcg_gen_sub_tl(cpu_T0, cpu_T0, cpu_tmp4);
+gen_op_st_rm_T0_A0(s1, ot, d);
+}
 gen_op_update3_cc(cpu_tmp4);
 set_cc_op(s1, CC_OP_SBBB + ot);
 break;
 case OP_ADDL:
-tcg_gen_add_tl(cpu_T0, cpu_T0, cpu_T1);
-gen_op_st_rm_T0_A0(s1, ot, d);
+if (s1->prefix & PREFIX_LOCK) {
+tcg_gen_atomic_add_fetch_tl(cpu_T0, cpu_A0, cpu_T1,
+s1->mem_index, ot | MO_LE);
+} else {
+tcg_gen_add_tl(cpu_T0, cpu_T0, cpu_T1);
+gen_op_st_rm_T0_A0(s1, ot, d);
+}
 gen_op_update2_cc();
 set_cc_op(s1, CC_OP_ADDB + ot);
 break;
 case OP_SUBL:
-tcg_gen_mov_tl(cpu_cc_srcT, cpu_T0);
-tcg_gen_sub_tl(cpu_T0, cpu_T0, cpu_T1);
-gen_op_st_rm_T0_A0(s1, ot, d);
+if (s1->prefix & PREFIX_LOCK) {
+tcg_gen_neg_tl(cpu_T0, cpu_T1);
+tcg_gen_atomic_fetch_add_tl(cpu_cc_srcT, cpu_A0, cpu_T0,
+s1->mem_index, ot | MO_LE);
+tcg_gen_sub_tl(cpu_T0, cpu_cc_srcT, cpu_T1);
+} else {
+tcg_gen_mov_tl(cpu_cc_srcT, cpu_T0);
+tcg_gen_sub_tl(cpu_T0, cpu_T0, cpu_T1);
+gen_op_st_rm_T0_A0(s1, ot, d);
+}
 gen_op_update2_cc();
 set_cc_op(s1, CC_OP_SUBB + ot);
 break;
 default:
 case OP_ANDL:
-tcg_gen_and_tl(cpu_T0, cpu_T0, cpu_T1);
-gen_op_st_rm_T0_A0(s1, ot, d);
+if (s1->prefix & PREFIX_LOCK) {
+tcg_gen_atomic_and_fetch_tl(cpu_T0, cpu_A0, cpu_T1,
+s1->mem_index, ot | MO_LE);
+} else {
+tcg_gen_and_tl(cpu_T0, cpu_T0, cpu_T1);
+gen_op_st_rm_T0_A0(s1, ot, d);
+}
 gen_op_update1_cc();
 set_cc_op(s1, CC_OP_LOGICB + ot);
 break;
 case OP_ORL:
-tcg_gen_or_tl(cpu_T0, cpu_T0, cpu_T1);
-gen_op_st_rm_T0_A0(s1, ot, d);
+if (s1->prefix & PREFIX_LOCK) {
+tcg_gen_atomic_or_fetch_tl(cpu_T0, cpu_A0, cpu_T1,
+   s1->mem_index, ot | MO_LE);
+} else {
+tcg_gen_or_tl(cpu_T0, cpu_T0, cpu_T1);
+gen_op_st_rm_T0_A0(s1, ot, d);
+}
 gen_op_update1_cc();
 set_cc_op(s1, CC_OP_LOGICB + ot);
 break;
 case OP_XORL:
-tcg_gen_xor_tl(cpu_T0, cpu_T0, cpu_T1);
-gen_op_st_rm_T0_A0(s1, ot, d);
+if (s1->prefix & PREFIX_LOCK) {
+tcg_gen_atomic_xor_fetch_tl(cpu_T0, cpu_A0, cpu_T1,
+s1->mem_index, ot | MO_LE);
+} else {
+tcg_gen_xor_tl(cpu_T0, cpu_T0, cpu_T1);
+gen_op_st_rm_T0_A0(s1, ot, d);
+}
 gen_op_update1_cc();
 set_cc_op(s1, CC_OP_LOGICB + ot);
 break;
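
Editorial sketch (not part of the patch): two tricks in the hunks above are worth spelling out. Locked SUB is expressed as an atomic add of the negated operand, keeping the old value for flag computation, and locked ADC folds the carry into the addend first so memory is modified by exactly one atomic operation. A C11 illustration, with hypothetical names locked_sub and locked_adc:

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical illustration, not QEMU code.
 * Locked SUB: atomically add the negated operand. The old memory value is
 * returned (the patch keeps it in cpu_cc_srcT for flags), and *result
 * receives old - operand, which is the value now in memory. */
static uint32_t locked_sub(_Atomic uint32_t *mem, uint32_t operand,
                           uint32_t *result)
{
    uint32_t old = atomic_fetch_add(mem, 0u - operand);
    *result = old - operand;
    return old;
}

/* Locked ADC: combine operand and carry-in before touching memory, then
 * perform a single atomic add. Returns the new memory value. */
static uint32_t locked_adc(_Atomic uint32_t *mem, uint32_t operand,
                           uint32_t carry_in)
{
    uint32_t addend = operand + carry_in;
    return atomic_fetch_add(mem, addend) + addend;
}
```

Unsigned wraparound makes 0u - operand the exact additive inverse modulo 2^32, so the add-of-negation form gives the same stored result as a subtraction.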

[Qemu-devel] [PATCH v5 13/35] tcg: Add atomic helpers

2016-10-09 Thread Richard Henderson
Add all of cmpxchg, op_fetch, fetch_op, and xchg.
Handle both endiannesses and operand sizes up to 8 bytes.
Handle non-atomic expansion when emulating serially.
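
Editorial sketch (not part of the patch): the difference between the fetch_op and op_fetch families, and the byte-swapping wrapper used for the opposite-endian helpers, can be illustrated with the GCC/Clang __atomic builtins. This assumes a GCC-compatible compiler; the demo_* names are hypothetical and are not the helper names generated by atomic_template.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* fetch_op family: returns the value that was in memory BEFORE the add. */
static uint32_t demo_fetch_add(uint32_t *p, uint32_t v)
{
    return __atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
}

/* op_fetch family: returns the value that is in memory AFTER the add. */
static uint32_t demo_add_fetch(uint32_t *p, uint32_t v)
{
    return __atomic_add_fetch(p, v, __ATOMIC_SEQ_CST);
}

/* cmpxchg on memory stored in the byte order opposite to the host's:
 * swap both inputs going in and swap the old value coming out.
 * Returns the previous memory contents in host byte order. */
static uint32_t demo_cmpxchg_swapped(uint32_t *p, uint32_t cmpv,
                                     uint32_t newv)
{
    uint32_t expected = __builtin_bswap32(cmpv);

    __atomic_compare_exchange_n(p, &expected, __builtin_bswap32(newv),
                                false, __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
    /* 'expected' holds the old memory value: unchanged on success,
     * overwritten with the actual contents on failure. */
    return __builtin_bswap32(expected);
}
```

Compare-and-swap and exchange only move bytes, so swapping at the boundaries is enough for them. Arithmetic such as add would need more care in the swapped case, because carries do not commute with a byte swap.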

Signed-off-by: Richard Henderson 
---
 Makefile.objs |   2 +-
 Makefile.target   |   1 +
 atomic_template.h | 173 ++
 cputlb.c  | 112 -
 include/qemu/atomic.h |  19 ++-
 tcg-runtime.c |  49 ++--
 tcg/tcg-op.c  | 328 ++
 tcg/tcg-op.h  |  44 +++
 tcg/tcg-runtime.h |  75 
 tcg/tcg.h |  53 
 10 files changed, 836 insertions(+), 20 deletions(-)
 create mode 100644 atomic_template.h

diff --git a/Makefile.objs b/Makefile.objs
index 02fb8e7..99d1f6d 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -89,7 +89,7 @@ endif
 
 ###
 # Target-independent parts used in system and user emulation
-common-obj-y += tcg-runtime.o cpus-common.o
+common-obj-y += cpus-common.o
 common-obj-y += hw/
 common-obj-y += qom/
 common-obj-y += disas/
diff --git a/Makefile.target b/Makefile.target
index 9968871..91b6fbd 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -94,6 +94,7 @@ obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
 obj-y += fpu/softfloat.o
 obj-y += target-$(TARGET_BASE_ARCH)/
 obj-y += disas.o
+obj-y += tcg-runtime.o
 obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
 obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
 
diff --git a/atomic_template.h b/atomic_template.h
new file mode 100644
index 000..d2c8a08
--- /dev/null
+++ b/atomic_template.h
@@ -0,0 +1,173 @@
+/*
+ * Atomic helper templates
+ * Included from tcg-runtime.c and cputlb.c.
+ *
+ * Copyright (c) 2016 Red Hat, Inc
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#if DATA_SIZE == 8
+# define SUFFIX q
+# define DATA_TYPE  uint64_t
+# define BSWAP  bswap64
+#elif DATA_SIZE == 4
+# define SUFFIX l
+# define DATA_TYPE  uint32_t
+# define BSWAP  bswap32
+#elif DATA_SIZE == 2
+# define SUFFIX w
+# define DATA_TYPE  uint16_t
+# define BSWAP  bswap16
+#elif DATA_SIZE == 1
+# define SUFFIX b
+# define DATA_TYPE  uint8_t
+# define BSWAP
+#else
+# error unsupported data size
+#endif
+
+#if DATA_SIZE >= 4
+# define ABI_TYPE  DATA_TYPE
+#else
+# define ABI_TYPE  uint32_t
+#endif
+
+#if DATA_SIZE == 1
+# define END
+#elif defined(HOST_WORDS_BIGENDIAN)
+# define END  _be
+#else
+# define END  _le
+#endif
+
+ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
+  ABI_TYPE cmpv, ABI_TYPE newv EXTRA_ARGS)
+{
+DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
+return atomic_cmpxchg__nocheck(haddr, cmpv, newv);
+}
+
+ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr,
+   ABI_TYPE val EXTRA_ARGS)
+{
+DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
+return atomic_xchg__nocheck(haddr, val);
+}
+
+#define GEN_ATOMIC_HELPER(X)\
+ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr,   \
+ ABI_TYPE val EXTRA_ARGS)   \
+{   \
+DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;   \
+return atomic_##X(haddr, val);  \
+}   \
+
+GEN_ATOMIC_HELPER(fetch_add)
+GEN_ATOMIC_HELPER(fetch_and)
+GEN_ATOMIC_HELPER(fetch_or)
+GEN_ATOMIC_HELPER(fetch_xor)
+GEN_ATOMIC_HELPER(add_fetch)
+GEN_ATOMIC_HELPER(and_fetch)
+GEN_ATOMIC_HELPER(or_fetch)
+GEN_ATOMIC_HELPER(xor_fetch)
+
+#undef GEN_ATOMIC_HELPER
+#undef END
+
+#if DATA_SIZE > 1
+
+#ifdef HOST_WORDS_BIGENDIAN
+# define END  _le
+#else
+# define END  _be
+#endif
+
+ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
+  ABI_TYPE cmpv, ABI_TYPE newv EXTRA_ARGS)
+{
+DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
+return BSWAP(atomic_cmpxchg__nocheck(haddr, BSWAP(cmpv), BSWAP(newv)));
+}
+
+ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr,
+   ABI_TYPE val EXTRA_ARGS)
+{
+DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
+return 

[Qemu-devel] [PATCH v5 35/35] target-alpha: Emulate LL/SC using cmpxchg helpers

2016-10-09 Thread Richard Henderson
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem.  However, portable parallel
code is written assuming only cmpxchg, which means that in
practice this is a viable alternative.
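
Editorial sketch (not part of the patch): the ABA caveat can be made concrete with a small C11 model. The struct and function names below are hypothetical and only loosely modelled on the lock_addr/lock_value fields. The store-conditional succeeds whenever memory holds the remembered value again, even if it changed in between, which real LL/SC hardware would reject.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-CPU reservation state, not the QEMU structure. */
struct llsc_state {
    _Atomic uint64_t *lock_addr;   /* NULL when no reservation is held */
    uint64_t lock_value;           /* value observed by the load-locked */
};

/* Load-locked: remember the address and the value read from it. */
static uint64_t load_locked(struct llsc_state *s, _Atomic uint64_t *addr)
{
    s->lock_addr = addr;
    s->lock_value = atomic_load(addr);
    return s->lock_value;
}

/* Store-conditional emulated with compare-and-swap: succeeds if the
 * reservation matches and memory still (or again) holds lock_value. */
static bool store_conditional(struct llsc_state *s, _Atomic uint64_t *addr,
                              uint64_t newv)
{
    bool ok = false;

    if (s->lock_addr == addr) {
        uint64_t expected = s->lock_value;
        ok = atomic_compare_exchange_strong(addr, &expected, newv);
    }
    s->lock_addr = NULL;           /* the reservation is always consumed */
    return ok;
}
```

If another thread writes 7 and then restores the original value between the two calls, store_conditional() still returns true. That is the ABA case the commit message accepts as a practical trade-off.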

Signed-off-by: Richard Henderson 
---
 linux-user/main.c|  49 --
 target-alpha/cpu.h   |   4 --
 target-alpha/helper.c|   6 ---
 target-alpha/machine.c   |   2 -
 target-alpha/translate.c | 104 ---
 5 files changed, 45 insertions(+), 120 deletions(-)

diff --git a/linux-user/main.c b/linux-user/main.c
index 7055e54..bb48260 100644
--- a/linux-user/main.c
+++ b/linux-user/main.c
@@ -2903,51 +2903,6 @@ void cpu_loop(CPUM68KState *env)
 #endif /* TARGET_M68K */
 
 #ifdef TARGET_ALPHA
-static void do_store_exclusive(CPUAlphaState *env, int reg, int quad)
-{
-target_ulong addr, val, tmp;
-target_siginfo_t info;
-int ret = 0;
-
-addr = env->lock_addr;
-tmp = env->lock_st_addr;
-env->lock_addr = -1;
-env->lock_st_addr = 0;
-
-start_exclusive();
-mmap_lock();
-
-if (addr == tmp) {
-if (quad ? get_user_s64(val, addr) : get_user_s32(val, addr)) {
-goto do_sigsegv;
-}
-
-if (val == env->lock_value) {
-tmp = env->ir[reg];
-if (quad ? put_user_u64(tmp, addr) : put_user_u32(tmp, addr)) {
-goto do_sigsegv;
-}
-ret = 1;
-}
-}
-env->ir[reg] = ret;
-env->pc += 4;
-
-mmap_unlock();
-end_exclusive();
-return;
-
- do_sigsegv:
-mmap_unlock();
-end_exclusive();
-
-info.si_signo = TARGET_SIGSEGV;
-info.si_errno = 0;
-info.si_code = TARGET_SEGV_MAPERR;
-info._sifields._sigfault._addr = addr;
-queue_signal(env, TARGET_SIGSEGV, QEMU_SI_FAULT, &info);
-}
-
 void cpu_loop(CPUAlphaState *env)
 {
 CPUState *cs = CPU(alpha_env_get_cpu(env));
@@ -3122,10 +3077,6 @@ void cpu_loop(CPUAlphaState *env)
 queue_signal(env, info.si_signo, QEMU_SI_FAULT, &info);
 }
 break;
-case EXCP_STL_C:
-case EXCP_STQ_C:
-do_store_exclusive(env, env->error_code, trapnr - EXCP_STL_C);
-break;
 case EXCP_INTERRUPT:
 /* Just indicate that signals should be handled asap.  */
 break;
diff --git a/target-alpha/cpu.h b/target-alpha/cpu.h
index 871d9ba..b08d160 100644
--- a/target-alpha/cpu.h
+++ b/target-alpha/cpu.h
@@ -230,7 +230,6 @@ struct CPUAlphaState {
 uint64_t pc;
 uint64_t unique;
 uint64_t lock_addr;
-uint64_t lock_st_addr;
 uint64_t lock_value;
 
 /* The FPCR, and disassembled portions thereof.  */
@@ -346,9 +345,6 @@ enum {
 EXCP_ARITH,
 EXCP_FEN,
 EXCP_CALL_PAL,
-/* For Usermode emulation.  */
-EXCP_STL_C,
-EXCP_STQ_C,
 };
 
 /* Alpha-specific interrupt pending bits.  */
diff --git a/target-alpha/helper.c b/target-alpha/helper.c
index 9ba3e1a..2ef6cbe 100644
--- a/target-alpha/helper.c
+++ b/target-alpha/helper.c
@@ -306,12 +306,6 @@ void alpha_cpu_do_interrupt(CPUState *cs)
 case EXCP_CALL_PAL:
 name = "call_pal";
 break;
-case EXCP_STL_C:
-name = "stl_c";
-break;
-case EXCP_STQ_C:
-name = "stq_c";
-break;
 }
 qemu_log("INT %6d: %s(%#x) pc=%016" PRIx64 " sp=%016" PRIx64 "\n",
  ++count, name, env->error_code, env->pc, env->ir[IR_SP]);
diff --git a/target-alpha/machine.c b/target-alpha/machine.c
index 710b783..b99a123 100644
--- a/target-alpha/machine.c
+++ b/target-alpha/machine.c
@@ -45,8 +45,6 @@ static VMStateField vmstate_env_fields[] = {
 VMSTATE_UINTTL(unique, CPUAlphaState),
 VMSTATE_UINTTL(lock_addr, CPUAlphaState),
 VMSTATE_UINTTL(lock_value, CPUAlphaState),
-/* Note that lock_st_addr is not saved; it is a temporary
-   used during the execution of the st[lq]_c insns.  */
 
 VMSTATE_UINT8(ps, CPUAlphaState),
 VMSTATE_UINT8(intr_flag, CPUAlphaState),
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index a2e2a62..03e4776 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -99,7 +99,6 @@ static TCGv cpu_std_ir[31];
 static TCGv cpu_fir[31];
 static TCGv cpu_pc;
 static TCGv cpu_lock_addr;
-static TCGv cpu_lock_st_addr;
 static TCGv cpu_lock_value;
 
 #ifndef CONFIG_USER_ONLY
@@ -116,7 +115,6 @@ void alpha_translate_init(void)
 static const GlobalVar vars[] = {
 DEF_VAR(pc),
 DEF_VAR(lock_addr),
-DEF_VAR(lock_st_addr),
 DEF_VAR(lock_value),
 };
 
@@ -198,6 +196,23 @@ static TCGv dest_sink(DisasContext *ctx)
 return ctx->sink;
 }
 
+static void free_context_temps(DisasContext *ctx)
+{
+if (!TCGV_IS_UNUSED_I64(ctx->sink)) {
+tcg_gen_discard_i64(ctx->sink);
+tcg_temp_free(ctx->sink);
+TCGV_UNUSED_I64(ctx->sink);
+}
+if 

[Qemu-devel] [PATCH v5 17/35] target-i386: emulate LOCK'ed cmpxchg using cmpxchg helpers

2016-10-09 Thread Richard Henderson
From: "Emilio G. Cota" 

The diff here is uglier than necessary. All this does is turn

FOO

into:

if (s->prefix & PREFIX_LOCK) {
  BAR
} else {
  FOO
}

where FOO is the original implementation of an unlocked cmpxchg.

[rth: Adjust unlocked cmpxchg to use movcond instead of branches.
Adjust helpers to use atomic helpers.]
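
Editorial sketch (not part of the patch): the locked/unlocked split for a 64-bit compare-exchange can be illustrated in C11. The unlocked variant does a plain load, a branch-free select, and an unconditional store, while the locked variant defers to an atomic compare-and-swap. All names here are hypothetical; pair64 only mimics how EDX:EAX and ECX:EBX are combined.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Combine two 32-bit register halves into one 64-bit value (hi:lo). */
static uint64_t pair64(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Unlocked form: not atomic. Memory is always written, either with newv
 * or with the value just read. Returns true on a match (ZF set); *cmpv
 * receives the old memory value either way. */
static bool cmpxchg64_unlocked(uint64_t *mem, uint64_t *cmpv, uint64_t newv)
{
    uint64_t oldv = *mem;
    bool match = (oldv == *cmpv);

    *mem = match ? newv : oldv;    /* always do the store */
    *cmpv = oldv;
    return match;
}

/* Locked form: a single atomic compare-and-swap. On failure *cmpv is
 * updated with the value found in memory. */
static bool cmpxchg64_locked(_Atomic uint64_t *mem, uint64_t *cmpv,
                             uint64_t newv)
{
    return atomic_compare_exchange_strong(mem, cmpv, newv);
}
```

The unconditional store in the unlocked form keeps the write access (and any write fault) visible even when the comparison fails, matching the "always do the store" comment in the helper.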

Signed-off-by: Emilio G. Cota 
Message-Id: <1467054136-10430-6-git-send-email-c...@braap.org>
Signed-off-by: Richard Henderson 
---
 target-i386/helper.h |   2 +
 target-i386/mem_helper.c | 134 +++
 target-i386/translate.c  |  99 ++
 3 files changed, 169 insertions(+), 66 deletions(-)

diff --git a/target-i386/helper.h b/target-i386/helper.h
index 1320edc..729d4b6 100644
--- a/target-i386/helper.h
+++ b/target-i386/helper.h
@@ -74,8 +74,10 @@ DEF_HELPER_3(boundw, void, env, tl, int)
 DEF_HELPER_3(boundl, void, env, tl, int)
 DEF_HELPER_1(rsm, void, env)
 DEF_HELPER_2(into, void, env, int)
+DEF_HELPER_2(cmpxchg8b_unlocked, void, env, tl)
 DEF_HELPER_2(cmpxchg8b, void, env, tl)
 #ifdef TARGET_X86_64
+DEF_HELPER_2(cmpxchg16b_unlocked, void, env, tl)
 DEF_HELPER_2(cmpxchg16b, void, env, tl)
 #endif
 DEF_HELPER_1(single_step, void, env)
diff --git a/target-i386/mem_helper.c b/target-i386/mem_helper.c
index 5bc0594..c4b5c5b 100644
--- a/target-i386/mem_helper.c
+++ b/target-i386/mem_helper.c
@@ -22,6 +22,8 @@
 #include "exec/helper-proto.h"
 #include "exec/exec-all.h"
 #include "exec/cpu_ldst.h"
+#include "qemu/int128.h"
+#include "tcg.h"
 
 /* broken thread support */
 
@@ -56,53 +58,143 @@ void helper_lock_init(void)
 }
 #endif
 
+void helper_cmpxchg8b_unlocked(CPUX86State *env, target_ulong a0)
+{
+uintptr_t ra = GETPC();
+uint64_t oldv, cmpv, newv;
+int eflags;
+
+eflags = cpu_cc_compute_all(env, CC_OP);
+
+cmpv = deposit64(env->regs[R_EAX], 32, 32, env->regs[R_EDX]);
+newv = deposit64(env->regs[R_EBX], 32, 32, env->regs[R_ECX]);
+
+oldv = cpu_ldq_data_ra(env, a0, ra);
+newv = (cmpv == oldv ? newv : oldv);
+/* always do the store */
+cpu_stq_data_ra(env, a0, newv, ra);
+
+if (oldv == cmpv) {
+eflags |= CC_Z;
+} else {
+env->regs[R_EAX] = (uint32_t)oldv;
+env->regs[R_EDX] = (uint32_t)(oldv >> 32);
+eflags &= ~CC_Z;
+}
+CC_SRC = eflags;
+}
+
 void helper_cmpxchg8b(CPUX86State *env, target_ulong a0)
 {
-uint64_t d;
+#ifdef CONFIG_ATOMIC64
+uint64_t oldv, cmpv, newv;
 int eflags;
 
 eflags = cpu_cc_compute_all(env, CC_OP);
-d = cpu_ldq_data_ra(env, a0, GETPC());
-if (d == (((uint64_t)env->regs[R_EDX] << 32) | (uint32_t)env->regs[R_EAX])) {
-cpu_stq_data_ra(env, a0, ((uint64_t)env->regs[R_ECX] << 32)
-  | (uint32_t)env->regs[R_EBX], GETPC());
+
+cmpv = deposit64(env->regs[R_EAX], 32, 32, env->regs[R_EDX]);
+newv = deposit64(env->regs[R_EBX], 32, 32, env->regs[R_ECX]);
+
+#ifdef CONFIG_USER_ONLY
+{
+uint64_t *haddr = g2h(a0);
+cmpv = cpu_to_le64(cmpv);
+newv = cpu_to_le64(newv);
+oldv = atomic_cmpxchg__nocheck(haddr, cmpv, newv);
+oldv = le64_to_cpu(oldv);
+}
+#else
+{
+uintptr_t ra = GETPC();
+int mem_idx = cpu_mmu_index(env, false);
+TCGMemOpIdx oi = make_memop_idx(MO_TEQ, mem_idx);
+oldv = helper_atomic_cmpxchgq_le_mmu(env, a0, cmpv, newv, oi, ra);
+}
+#endif
+
+if (oldv == cmpv) {
 eflags |= CC_Z;
 } else {
-/* always do the store */
-cpu_stq_data_ra(env, a0, d, GETPC());
-env->regs[R_EDX] = (uint32_t)(d >> 32);
-env->regs[R_EAX] = (uint32_t)d;
+env->regs[R_EAX] = (uint32_t)oldv;
+env->regs[R_EDX] = (uint32_t)(oldv >> 32);
 eflags &= ~CC_Z;
 }
 CC_SRC = eflags;
+#else
+cpu_loop_exit_atomic(ENV_GET_CPU(env), GETPC());
+#endif /* CONFIG_ATOMIC64 */
 }
 
 #ifdef TARGET_X86_64
-void helper_cmpxchg16b(CPUX86State *env, target_ulong a0)
+void helper_cmpxchg16b_unlocked(CPUX86State *env, target_ulong a0)
 {
-uint64_t d0, d1;
+uintptr_t ra = GETPC();
+Int128 oldv, cmpv, newv;
+uint64_t o0, o1;
 int eflags;
+bool success;
 
 if ((a0 & 0xf) != 0) {
 raise_exception_ra(env, EXCP0D_GPF, GETPC());
 }
 eflags = cpu_cc_compute_all(env, CC_OP);
-d0 = cpu_ldq_data_ra(env, a0, GETPC());
-d1 = cpu_ldq_data_ra(env, a0 + 8, GETPC());
-if (d0 == env->regs[R_EAX] && d1 == env->regs[R_EDX]) {
-cpu_stq_data_ra(env, a0, env->regs[R_EBX], GETPC());
-cpu_stq_data_ra(env, a0 + 8, env->regs[R_ECX], GETPC());
+
+cmpv = int128_make128(env->regs[R_EAX], env->regs[R_EDX]);
+newv = int128_make128(env->regs[R_EBX], env->regs[R_ECX]);
+
+o0 = cpu_ldq_data_ra(env, a0 + 0, ra);
+o1 = cpu_ldq_data_ra(env, a0 + 8, ra);
+
+oldv = int128_make128(o0, o1);
+success = 

[Qemu-devel] [PATCH v5 09/35] cputlb: Move probe_write out of softmmu_template.h

2016-10-09 Thread Richard Henderson
Reviewed-by: Alex Bennée 
Signed-off-by: Richard Henderson 
---
 cputlb.c   | 21 +
 softmmu_template.h | 23 ---
 2 files changed, 21 insertions(+), 23 deletions(-)

diff --git a/cputlb.c b/cputlb.c
index 5575b73..0c9b77b 100644
--- a/cputlb.c
+++ b/cputlb.c
@@ -527,6 +527,27 @@ static bool victim_tlb_hit(CPUArchState *env, size_t mmu_idx, size_t index,
   victim_tlb_hit(env, mmu_idx, index, offsetof(CPUTLBEntry, TY), \
  (ADDR) & TARGET_PAGE_MASK)
 
+/* Probe for whether the specified guest write access is permitted.
+ * If it is not permitted then an exception will be taken in the same
+ * way as if this were a real write access (and we will not return).
+ * Otherwise the function will return, and there will be a valid
+ * entry in the TLB for this access.
+ */
+void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
+ uintptr_t retaddr)
+{
+int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
+target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
+
+if ((addr & TARGET_PAGE_MASK)
+!= (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK))) {
+/* TLB entry is for a different page */
+if (!VICTIM_TLB_HIT(addr_write, addr)) {
+tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
+}
+}
+}
+
 #define MMUSUFFIX _mmu
 
 #define DATA_SIZE 1
diff --git a/softmmu_template.h b/softmmu_template.h
index f9c51fe..538cff5 100644
--- a/softmmu_template.h
+++ b/softmmu_template.h
@@ -464,29 +464,6 @@ void helper_be_st_name(CPUArchState *env, target_ulong addr, DATA_TYPE val,
 glue(glue(st, SUFFIX), _be_p)((uint8_t *)haddr, val);
 }
 #endif /* DATA_SIZE > 1 */
-
-#if DATA_SIZE == 1
-/* Probe for whether the specified guest write access is permitted.
- * If it is not permitted then an exception will be taken in the same
- * way as if this were a real write access (and we will not return).
- * Otherwise the function will return, and there will be a valid
- * entry in the TLB for this access.
- */
-void probe_write(CPUArchState *env, target_ulong addr, int mmu_idx,
- uintptr_t retaddr)
-{
-int index = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
-target_ulong tlb_addr = env->tlb_table[mmu_idx][index].addr_write;
-
-if ((addr & TARGET_PAGE_MASK)
-!= (tlb_addr & (TARGET_PAGE_MASK | TLB_INVALID_MASK))) {
-/* TLB entry is for a different page */
-if (!VICTIM_TLB_HIT(addr_write, addr)) {
-tlb_fill(ENV_GET_CPU(env), addr, MMU_DATA_STORE, mmu_idx, retaddr);
-}
-}
-}
-#endif
 #endif /* !defined(SOFTMMU_CODE_ACCESS) */
 
 #undef READ_ACCESS_TYPE
-- 
2.7.4
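
Editorial sketch (not part of the patch): the hit test at the top of probe_write() is a direct-mapped lookup followed by a page-tag comparison that also treats an invalid entry as a miss. The standalone model below uses assumed constants (4 KiB pages, 256 entries, an arbitrary invalid-flag bit) and hypothetical demo_* names. The real values come from the QEMU target configuration.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_PAGE_BITS   12                       /* assumed: 4 KiB pages */
#define DEMO_PAGE_MASK   (~(((uint64_t)1 << DEMO_PAGE_BITS) - 1))
#define DEMO_TLB_SIZE    256                      /* assumed: power of two */
#define DEMO_TLB_INVALID ((uint64_t)1 << 3)       /* assumed flag bit */

struct demo_tlb_entry {
    uint64_t addr_write;   /* page address, plus flag bits in the low part */
};

/* Direct-mapped index: low bits of the page number select the slot. */
static unsigned demo_tlb_index(uint64_t addr)
{
    return (unsigned)((addr >> DEMO_PAGE_BITS) & (DEMO_TLB_SIZE - 1));
}

/* Hit when the slot holds this page and the invalid flag is clear.
 * Keeping the flag in the mask on the entry side makes an invalid entry
 * compare unequal to any page-aligned address. */
static bool demo_tlb_write_hit(const struct demo_tlb_entry *tlb,
                               uint64_t addr)
{
    uint64_t tag = tlb[demo_tlb_index(addr)].addr_write;

    return (addr & DEMO_PAGE_MASK)
        == (tag & (DEMO_PAGE_MASK | DEMO_TLB_INVALID));
}
```

On a miss, probe_write() consults the victim TLB and then calls tlb_fill(), which either installs a valid entry or raises the guest exception. That part is not modelled here.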



