Re: [Qemu-devel] [PATCH v5 08/10] migration: create a dedicated thread to release rdma resource

2018-07-18 Thread 858585 jemmy
On Thu, Jul 5, 2018 at 10:26 PM, 858585 jemmy  wrote:
> On Thu, Jun 28, 2018 at 2:59 AM, Dr. David Alan Gilbert
>  wrote:
>> * Lidong Chen (jemmy858...@gmail.com) wrote:
>>> ibv_dereg_mr wait for a long time for big memory size virtual server.
>>>
>>> The test result is:
>>>   10GB  326ms
>>>   20GB  699ms
>>>   30GB  1021ms
>>>   40GB  1387ms
>>>   50GB  1712ms
>>>   60GB  2034ms
>>>   70GB  2457ms
>>>   80GB  2807ms
>>>   90GB  3107ms
>>>   100GB 3474ms
>>>   110GB 3735ms
>>>   120GB 4064ms
>>>   130GB 4567ms
>>>   140GB 4886ms
>>>
>>> this will cause the guest os hang for a while when migration finished.
>>> So create a dedicated thread to release rdma resource.
>>>
>>> Signed-off-by: Lidong Chen 
>>> ---
>>>  migration/rdma.c | 43 +++
>>>  1 file changed, 27 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/migration/rdma.c b/migration/rdma.c
>>> index dfa4f77..f12e8d5 100644
>>> --- a/migration/rdma.c
>>> +++ b/migration/rdma.c
>>> @@ -2979,35 +2979,46 @@ static void 
>>> qio_channel_rdma_set_aio_fd_handler(QIOChannel *ioc,
>>>  }
>>>  }
>>>
>>> -static int qio_channel_rdma_close(QIOChannel *ioc,
>>> -  Error **errp)
>>> +static void *qio_channel_rdma_close_thread(void *arg)
>>>  {
>>> -QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>> -RDMAContext *rdmain, *rdmaout;
>>> -trace_qemu_rdma_close();
>>> +RDMAContext **rdma = arg;
>>> +RDMAContext *rdmain = rdma[0];
>>> +RDMAContext *rdmaout = rdma[1];
>>>
>>> -rdmain = rioc->rdmain;
>>> -if (rdmain) {
>>> -atomic_rcu_set(>rdmain, NULL);
>>> -}
>>> -
>>> -rdmaout = rioc->rdmaout;
>>> -if (rdmaout) {
>>> -atomic_rcu_set(>rdmaout, NULL);
>>> -}
>>> +rcu_register_thread();
>>>
>>>  synchronize_rcu();
>>
>> * see below
>>
>>> -
>>>  if (rdmain) {
>>>  qemu_rdma_cleanup(rdmain);
>>>  }
>>> -
>>>  if (rdmaout) {
>>>  qemu_rdma_cleanup(rdmaout);
>>>  }
>>>
>>>  g_free(rdmain);
>>>  g_free(rdmaout);
>>> +g_free(rdma);
>>> +
>>> +rcu_unregister_thread();
>>> +return NULL;
>>> +}
>>> +
>>> +static int qio_channel_rdma_close(QIOChannel *ioc,
>>> +  Error **errp)
>>> +{
>>> +QemuThread t;
>>> +QIOChannelRDMA *rioc = QIO_CHANNEL_RDMA(ioc);
>>> +RDMAContext **rdma = g_new0(RDMAContext*, 2);
>>> +
>>> +trace_qemu_rdma_close();
>>> +if (rioc->rdmain || rioc->rdmaout) {
>>> +rdma[0] =  rioc->rdmain;
>>> +rdma[1] =  rioc->rdmaout;
>>> +qemu_thread_create(, "rdma cleanup", 
>>> qio_channel_rdma_close_thread,
>>> +   rdma, QEMU_THREAD_DETACHED);
>>> +atomic_rcu_set(>rdmain, NULL);
>>> +atomic_rcu_set(>rdmaout, NULL);
>>
>> I'm not sure this pair is ordered with the synchronise_rcu above;
>> Doesn't that mean, on a bad day, that you could get:
>>
>>
>> main-thread  rdma_cleanup another-thread
>> qmu_thread_create
>>   synchronise_rcu
>> reads rioc->rdmain
>> starts doing something with rdmain
>> atomic_rcu_set
>>   rdma_cleanup
>>
>>
>> so the another-thread is using it during the cleanup?
>> Would just moving the atomic_rcu_sets before the qemu_thread_create
>> fix that?
> yes, I will fix it.
>
>>
>> However, I've got other worries as well:
>>a) qemu_rdma_cleanup does:
>>migrate_get_current()->state == MIGRATION_STATUS_CANCELLING
>>
>>   which worries me a little if someone immediately tries to restart
>>   the migration.

Because the current qemu version don't wait for
RDMA_CM_EVENT_DISCONNECTED event after rdma_disconnect,
so I think it's not necessary to send RDMA_CONTROL_ERROR.

compare to send RDMA_CONTROL_ERROR, I think use cm event to notify
peer qemu is better.
maybe the rdma is already in error_state, and RDMA_CONTROL_ERROR
cannot send successfully.

For peer qemu, in qemu_rdma_wait_comp_channel function, it's should
not only poll rdma->comp_channel->fd,
it's should also poll  rdma->channel->fd.

for the source qemu, it's fixed by this patch.
migration: poll the cm event while wait RDMA work request completion

and for destination qemu, it's need a new patch to fix it.

so I prefer to remove MIGRATION_STATUS_CANCELLING in qemu_rdma_cleanup function.

>>
>>b) I don't understand what happens if someone does try and restart
>>   the migration after that, but in the ~5s it takes the ibv cleanup
>>   to happen.

I prefer to add a new variable in current_migration.  if the rdma
cleanup thread has not
exited. it's should not start a new migration.

>
> yes, I will try to fix it.
>
>>
>> Dave
>>
>>
>>> +}
>>>
>>>  return 0;
>>>  }
>>> --
>>> 1.8.3.1
>>>
>> --
>> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [RFC v3] qemu: Add virtio pmem device

2018-07-18 Thread Pankaj Gupta


> 
> >  This patch adds virtio-pmem Qemu device.
> > 
> >  This device presents memory address range information to guest
> >  which is backed by file backend type. It acts like persistent
> >  memory device for KVM guest. Guest can perform read and persistent
> >  write operations on this memory range with the help of DAX capable
> >  filesystem.
> > 
> >  Persistent guest writes are assured with the help of virtio based
> >  flushing interface. When guest userspace space performs fsync on
> >  file fd on pmem device, a flush command is send to Qemu over VIRTIO
> >  and host side flush/sync is done on backing image file.
> > 
> > Changes from RFC v2:
> > - Use aio_worker() to avoid Qemu from hanging with blocking fsync
> >   call - Stefan
> > - Use virtio_st*_p() for endianess - Stefan
> > - Correct indentation in qapi/misc.json - Eric
> > 
> > Signed-off-by: Pankaj Gupta 
> > ---
> >  hw/virtio/Makefile.objs |   3 +
> >  hw/virtio/virtio-pci.c  |  44 +
> >  hw/virtio/virtio-pci.h  |  14 ++
> >  hw/virtio/virtio-pmem.c | 241
> >  
> >  include/hw/pci/pci.h|   1 +
> >  include/hw/virtio/virtio-pmem.h |  42 +
> >  include/standard-headers/linux/virtio_ids.h |   1 +
> >  qapi/misc.json  |  26 ++-
> >  8 files changed, 371 insertions(+), 1 deletion(-)
> >  create mode 100644 hw/virtio/virtio-pmem.c
> >  create mode 100644 include/hw/virtio/virtio-pmem.h
> > 
> > diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> > index 1b2799cfd8..7f914d45d0 100644
> > --- a/hw/virtio/Makefile.objs
> > +++ b/hw/virtio/Makefile.objs
> > @@ -10,6 +10,9 @@ obj-$(CONFIG_VIRTIO_CRYPTO) += virtio-crypto.o
> >  obj-$(call land,$(CONFIG_VIRTIO_CRYPTO),$(CONFIG_VIRTIO_PCI)) +=
> >  virtio-crypto-pci.o
> >  
> >  obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
> > +ifeq ($(CONFIG_MEM_HOTPLUG),y)
> > +obj-$(CONFIG_LINUX) += virtio-pmem.o
> > +endif
> >  obj-$(CONFIG_VHOST_VSOCK) += vhost-vsock.o
> >  endif
> >  
> > diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> > index 3a01fe90f0..93d3fc05c7 100644
> > --- a/hw/virtio/virtio-pci.c
> > +++ b/hw/virtio/virtio-pci.c
> > @@ -2521,6 +2521,49 @@ static const TypeInfo virtio_rng_pci_info = {
> >  .class_init= virtio_rng_pci_class_init,
> >  };
> >  
> > +/* virtio-pmem-pci */
> > +
> > +static void virtio_pmem_pci_realize(VirtIOPCIProxy *vpci_dev, Error
> > **errp)
> > +{
> > +VirtIOPMEMPCI *vpmem = VIRTIO_PMEM_PCI(vpci_dev);
> > +DeviceState *vdev = DEVICE(>vdev);
> > +
> > +qdev_set_parent_bus(vdev, BUS(_dev->bus));
> > +object_property_set_bool(OBJECT(vdev), true, "realized", errp);
> > +}
> > +
> > +static void virtio_pmem_pci_class_init(ObjectClass *klass, void *data)
> > +{
> > +DeviceClass *dc = DEVICE_CLASS(klass);
> > +VirtioPCIClass *k = VIRTIO_PCI_CLASS(klass);
> > +PCIDeviceClass *pcidev_k = PCI_DEVICE_CLASS(klass);
> > +k->realize = virtio_pmem_pci_realize;
> > +set_bit(DEVICE_CATEGORY_MISC, dc->categories);
> > +pcidev_k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
> > +pcidev_k->device_id = PCI_DEVICE_ID_VIRTIO_PMEM;
> > +pcidev_k->revision = VIRTIO_PCI_ABI_VERSION;
> > +pcidev_k->class_id = PCI_CLASS_OTHERS;
> > +}
> > +
> > +static void virtio_pmem_pci_instance_init(Object *obj)
> > +{
> > +VirtIOPMEMPCI *dev = VIRTIO_PMEM_PCI(obj);
> > +
> > +virtio_instance_init_common(obj, >vdev, sizeof(dev->vdev),
> > +TYPE_VIRTIO_PMEM);
> > +object_property_add_alias(obj, "memdev", OBJECT(>vdev), "memdev",
> > +  _abort);
> > +}
> > +
> > +static const TypeInfo virtio_pmem_pci_info = {
> > +.name  = TYPE_VIRTIO_PMEM_PCI,
> > +.parent= TYPE_VIRTIO_PCI,
> > +.instance_size = sizeof(VirtIOPMEMPCI),
> > +.instance_init = virtio_pmem_pci_instance_init,
> > +.class_init= virtio_pmem_pci_class_init,
> > +};
> > +
> > +
> >  /* virtio-input-pci */
> >  
> >  static Property virtio_input_pci_properties[] = {
> > @@ -2714,6 +2757,7 @@ static void virtio_pci_register_types(void)
> >  type_register_static(_balloon_pci_info);
> >  type_register_static(_serial_pci_info);
> >  type_register_static(_net_pci_info);
> > +type_register_static(_pmem_pci_info);
> >  #ifdef CONFIG_VHOST_SCSI
> >  type_register_static(_scsi_pci_info);
> >  #endif
> > diff --git a/hw/virtio/virtio-pci.h b/hw/virtio/virtio-pci.h
> > index 813082b0d7..fe74fcad3f 100644
> > --- a/hw/virtio/virtio-pci.h
> > +++ b/hw/virtio/virtio-pci.h
> > @@ -19,6 +19,7 @@
> >  #include "hw/virtio/virtio-blk.h"
> >  #include "hw/virtio/virtio-net.h"
> >  #include "hw/virtio/virtio-rng.h"
> > +#include "hw/virtio/virtio-pmem.h"
> >  #include "hw/virtio/virtio-serial.h"
> >  #include "hw/virtio/virtio-scsi.h"
> >  #include 

Re: [Qemu-devel] [RFC PATCH 0/3] Balloon inhibit enhancements

2018-07-18 Thread Peter Xu
On Wed, Jul 18, 2018 at 10:31:33AM -0600, Alex Williamson wrote:
> On Wed, 18 Jul 2018 14:48:03 +0800
> Peter Xu  wrote:
> 
> > On Tue, Jul 17, 2018 at 04:47:31PM -0600, Alex Williamson wrote:
> > > Directly assigned vfio devices have never been compatible with
> > > ballooning.  Zapping MADV_DONTNEED pages happens completely
> > > independent of vfio page pinning and IOMMU mapping, leaving us with
> > > inconsistent GPA to HPA mapping between vCPUs and assigned devices
> > > when the balloon deflates.  Mediated devices can theoretically do
> > > better, if we make the assumption that the mdev vendor driver is fully
> > > synchronized to the actual working set of the guest driver.  In that
> > > case the guest balloon driver should never be able to allocate an mdev
> > > pinned page for balloon inflation.  Unfortunately, QEMU can't know the
> > > workings of the vendor driver pinning, and doesn't actually know the
> > > difference between mdev devices and directly assigned devices.  Until
> > > we can sort out how the vfio IOMMU backend can tell us if ballooning
> > > is safe, the best approach is to disabling ballooning any time a vfio
> > > devices is attached.
> > > 
> > > To do that, simply make the balloon inhibitor a counter rather than a
> > > boolean, fixup a case where KVM can then simply use the inhibit
> > > interface, and inhibit ballooning any time a vfio device is attached.
> > > I'm expecting we'll expose some sort of flag similar to
> > > KVM_CAP_SYNC_MMU from the vfio IOMMU for cases where we can resolve
> > > this.  An addition we could consider here would be yet another device
> > > option for vfio, such as x-disable-balloon-inhibit, in case there are
> > > mdev devices that behave in a manner compatible with ballooning.
> > > 
> > > Please let me know if this looks like a good idea.  Thanks,  
> > 
> > IMHO patches 1-2 are good cleanup as standalone patches...
> > 
> > I totally have no idea on whether people would like to use vfio-pci
> > and the balloon device at the same time.  After all vfio-pci are
> > majorly for performance players, then I would vaguely guess that they
> > don't really care thin provisioning of memory at all, hence the usage
> > scenario might not exist much.  Is that the major reason that we'd
> > just like to disable it (which makes sense to me)?
> 
> Well, the major reason for disabling it is that it currently doesn't
> work and when the balloon is deflated, the device and vCPU are talking
> to different host pages for the same GPA for previously ballooned
> pages.  Regardless of the amenability of device assignment to various
> usage scenarios, that's a bad thing.  I guess most device assignment
> users have either realized this doesn't work and avoid it, or perhaps
> they have VMs tuned more for performance than density and (again) don't
> use ballooning.

Makes sense to me.

>  
> > I'm wondering what if want to do that somehow some day... Whether
> > it'll work if we just let vfio-pci devices to register some guest
> > memory invalidation hook (just like the iommu notifiers, but for guest
> > memory address space instead), then we map/unmap the IOMMU pages there
> > for vfio-pci device to make sure the inflated balloon pages are not
> > mapped and also make sure new pages are remapped with correct HPA
> > after deflated.  This is a pure question out of my curiosity, and for
> > sure it makes little sense if the answer of the first question above
> > is positive.
> 
> This is why I mention the KVM MMU synchronization flag above.  KVM
> essentially had this same problem and fixed it with with MMU notifiers
> in the kernel.  They expose that KVM has the capability of handling
> such a scenario via a feature flag.  We can do the same with vfio.  In
> scenarios where we're able to fix this, we could expose a flag on the
> container indicating support for the same sort of thing.

Sorry I didn't really caught that point when reply.  So that's why we
have had the mmu notifiers... Hmm, glad to know that.

But I would guess that if we want that notifier for vfio it should be
in QEMU rather than the kernel one since kernel vfio driver should not
have enough information on the GPA address space, hence it might not
be able to rebuild the mapping when a new page is mapped?  While QEMU
should be able to get both GPA and HVA easily when the balloon device
wants to deflate a page. [1]

> 
> There are a few complications to this support though.  First ballooning
> works at page size granularity, but IOMMU mapping can make use of
> arbitrary superpage sizes and the IOMMU API only guarantees unmap
> granularity equal to the original mapping.  Therefore we cannot unmap
> individual pages unless we require that all mappings through the IOMMU
> API are done with page granularity, precluding the use of superpages by
> the IOMMU and thereby inflicting higher IOTLB overhead.  Unlike a CPU,
> we can't invalidate the mappings and fault them back in or halt the
> processor to make 

Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread

2018-07-18 Thread Peter Xu
On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
> Peter Xu  writes:
> 
> > After the Out-Of-Band work, the monitor iothread may be accessing the
> > cur_mon as well (via monitor_qmp_dispatch_one()).  Let's convert the
> > cur_mon variable to be a per-thread variable to make sure there won't be
> > a race between threads when accessing the variable.
> 
> Hmm... why hasn't the OOB work created such a race already?
> 
> A monitor reads, parses, dispatches and executes commands, formats and
> sends replies.
> 
> Before OOB, all of that ran in the main thread.  Any access of cur_mon
> should therefore be from the main thread.  No races.
> 
> OOB moves read, parse, format and send to an I/O thread.  Dispatch and
> execute remain in the main thread.  *Except* for commands executed OOB,
> dispatch and execute move to the I/O thread, too.
> 
> Why is this not racy?  I guess it relies on careful non-use of cur_mon
> in any part that may now execute in the I/O thread.  Scary...

I think it's because cur_mon is not really used in out-of-band command
executions - now we only have a few out-of-band enabled commands, and
IIUC none of them is using cur_mon (for example, in
qmp_migrate_recover() we don't even call error_report, and the code
path is quite straight forward to make sure of that).  So IIUC cur_mon
variable is still only touched by main thread for now hence we should
be safe.  However that condition might change in the future when we
add more out-of-band capable commands.

(not to mention that I don't even know whether there are real users of
 out-of-band if we haven't yet started to support that for libvirt...)

> 
> Should this go into 3.0 to reduce the risk of bugs?

Yes I think it would be good to have that even for 3.0, since it still
can be seen as a bug fix of existing code.

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [RFC PATCH 0/3] Balloon inhibit enhancements

2018-07-18 Thread Peter Xu
On Wed, Jul 18, 2018 at 11:36:40AM +0200, Cornelia Huck wrote:
> On Wed, 18 Jul 2018 14:48:03 +0800
> Peter Xu  wrote:
> 
> > On Tue, Jul 17, 2018 at 04:47:31PM -0600, Alex Williamson wrote:
> > > Directly assigned vfio devices have never been compatible with
> > > ballooning.  Zapping MADV_DONTNEED pages happens completely
> > > independent of vfio page pinning and IOMMU mapping, leaving us with
> > > inconsistent GPA to HPA mapping between vCPUs and assigned devices
> > > when the balloon deflates.  Mediated devices can theoretically do
> > > better, if we make the assumption that the mdev vendor driver is fully
> > > synchronized to the actual working set of the guest driver.  In that
> > > case the guest balloon driver should never be able to allocate an mdev
> > > pinned page for balloon inflation.  Unfortunately, QEMU can't know the
> > > workings of the vendor driver pinning, and doesn't actually know the
> > > difference between mdev devices and directly assigned devices.  Until
> > > we can sort out how the vfio IOMMU backend can tell us if ballooning
> > > is safe, the best approach is to disabling ballooning any time a vfio
> > > devices is attached.
> > > 
> > > To do that, simply make the balloon inhibitor a counter rather than a
> > > boolean, fixup a case where KVM can then simply use the inhibit
> > > interface, and inhibit ballooning any time a vfio device is attached.
> > > I'm expecting we'll expose some sort of flag similar to
> > > KVM_CAP_SYNC_MMU from the vfio IOMMU for cases where we can resolve
> > > this.  An addition we could consider here would be yet another device
> > > option for vfio, such as x-disable-balloon-inhibit, in case there are
> > > mdev devices that behave in a manner compatible with ballooning.
> > > 
> > > Please let me know if this looks like a good idea.  Thanks,  
> > 
> > IMHO patches 1-2 are good cleanup as standalone patches...
> > 
> > I totally have no idea on whether people would like to use vfio-pci
> > and the balloon device at the same time.  After all vfio-pci are
> > majorly for performance players, then I would vaguely guess that they
> > don't really care thin provisioning of memory at all, hence the usage
> > scenario might not exist much.  Is that the major reason that we'd
> > just like to disable it (which makes sense to me)?
> 
> Don't people use vfio-pci as well if they want some special
> capabilities from the passed-through device? (At least that's the main
> use case for vfio-ccw, not any performance considerations.)

Good to know these.

Out of topic: could I further ask what's these capabilities, and why
these capabilities can't be emulated (or hard to be emulated) if we
don't care about performance?

(Any link or keyword would be welcomed too if there is)

Regards,

-- 
Peter Xu



Re: [Qemu-devel] [RFC PATCH 1/3] balloon: Allow nested inhibits

2018-07-18 Thread Peter Xu
On Wed, Jul 18, 2018 at 10:37:36AM -0600, Alex Williamson wrote:
> On Wed, 18 Jul 2018 14:40:15 +0800
> Peter Xu  wrote:
> 
> > On Tue, Jul 17, 2018 at 04:47:37PM -0600, Alex Williamson wrote:
> > > A simple true/false internal state does not allow multiple users.  Fix
> > > this within the existing interface by converting to a counter, so long
> > > as the counter is elevated, ballooning is inhibited.
> > > 
> > > Signed-off-by: Alex Williamson 
> > > ---
> > >  balloon.c |7 ---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/balloon.c b/balloon.c
> > > index 6bf0a9681377..2a6a7e1a22a0 100644
> > > --- a/balloon.c
> > > +++ b/balloon.c
> > > @@ -37,16 +37,17 @@
> > >  static QEMUBalloonEvent *balloon_event_fn;
> > >  static QEMUBalloonStatus *balloon_stat_fn;
> > >  static void *balloon_opaque;
> > > -static bool balloon_inhibited;
> > > +static int balloon_inhibited;
> > >  
> > >  bool qemu_balloon_is_inhibited(void)
> > >  {
> > > -return balloon_inhibited;
> > > +return balloon_inhibited > 0;
> > >  }
> > >  
> > >  void qemu_balloon_inhibit(bool state)
> > >  {
> > > -balloon_inhibited = state;
> > > +balloon_inhibited += (state ? 1 : -1);
> > > +assert(balloon_inhibited >= 0);  
> > 
> > Better do it atomically?
> 
> I'd assumed we're protected by the BQL anywhere this is called.  Is
> that not the case?  Generally when I try to add any sort of locking to
> QEMU it's shot down because the code paths are already serialized.

Ah I see. :-)

But I guess current code might call these without BQL.  For example,
qemu_balloon_inhibit() has:

postcopy_ram_listen_thread
  qemu_loadvm_state_main
loadvm_process_command
  loadvm_postcopy_handle_listen
postcopy_ram_enable_notify
  qemu_balloon_inhibit

While the whole stack is inside a standalone thread to load QEMU for
postcopy incoming migration rather than the main thread.

Thanks,

-- 
Peter Xu



[Qemu-devel] [PATCH] hw/block/nvme: add optional parameter num_namespaces for nvme device

2018-07-18 Thread Weiping Zhang
Add an optional paramter num_namespaces for device, and set it
to 1 by default.

Signed-off-by: Weiping Zhang 
---
 hw/block/nvme.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 156ecf3c41..b53be4b5c0 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -19,7 +19,7 @@
  *  -drive file=,if=none,id=
  *  -device nvme,drive=,serial=,id=, \
  *  cmb_size_mb=, \
- *  num_queues=
+ *  num_queues=,num_namespaces=
  *
  * Note cmb_size_mb denotes size of CMB in MB. CMB is assumed to be at
  * offset 0 in BAR2 and supports only WDS, RDS and SQS for now.
@@ -1232,7 +1232,6 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
 pci_config_set_class(pci_dev->config, PCI_CLASS_STORAGE_EXPRESS);
 pcie_endpoint_cap_init(>parent_obj, 0x80);
 
-n->num_namespaces = 1;
 n->reg_size = pow2ceil(0x1004 + 2 * (n->num_queues + 1) * 4);
 n->ns_size = bs_size / (uint64_t)n->num_namespaces;
 
@@ -1342,6 +1341,7 @@ static Property nvme_props[] = {
 DEFINE_PROP_STRING("serial", NvmeCtrl, serial),
 DEFINE_PROP_UINT32("cmb_size_mb", NvmeCtrl, cmb_size_mb, 0),
 DEFINE_PROP_UINT32("num_queues", NvmeCtrl, num_queues, 64),
+DEFINE_PROP_UINT32("num_namespaces", NvmeCtrl, num_namespaces, 1),
 DEFINE_PROP_END_OF_LIST(),
 };
 
-- 
2.14.1




[Qemu-devel] [PATCH] block: Don't lock /dev/null and /dev/zero automatically

2018-07-18 Thread Fam Zheng
On my Fedora 28, /dev/null is locked by some other process (couldn't
inspect it due to the current lslocks limitation), so iotests 226 fails
with some unexpected image locking errors because it uses qemu-io to
open it.

Actually it's safe to not use any lock on /dev/null or /dev/zero.

Signed-off-by: Fam Zheng 
---
 block/file-posix.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/block/file-posix.c b/block/file-posix.c
index 60af4b3d51..8bf034108a 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -503,7 +503,12 @@ static int raw_open_common(BlockDriverState *bs, QDict 
*options,
 s->use_lock = false;
 break;
 case ON_OFF_AUTO_AUTO:
-s->use_lock = qemu_has_ofd_lock();
+if (!strcmp(filename, "/dev/null") ||
+   !strcmp(filename, "/dev/zero")) {
+s->use_lock = false;
+} else {
+s->use_lock = qemu_has_ofd_lock();
+}
 break;
 default:
 abort();
-- 
2.17.1




[Qemu-devel] [PATCH] hw/intc/arm_gicv3_its: downgrade error_report to warn_report in kvm_arm_its_reset

2018-07-18 Thread Jia He
In scripts/arch-run.bash of kvm-unit-tests, it will check the qemu
output log with:
if [ -z "$(echo "$errors" | grep -vi warning)" ]; then

Thus without the warning prefix, all of the test fail.

Since it is not unrecoverable error in kvm_arm_its_reset for
current implementation, downgrading the report from error to
warn makes sense.

Signed-off-by: Jia He 
---
 hw/intc/arm_gicv3_its_kvm.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/intc/arm_gicv3_its_kvm.c b/hw/intc/arm_gicv3_its_kvm.c
index 271ebe4..01573ab 100644
--- a/hw/intc/arm_gicv3_its_kvm.c
+++ b/hw/intc/arm_gicv3_its_kvm.c
@@ -211,7 +211,7 @@ static void kvm_arm_its_reset(DeviceState *dev)
 return;
 }
 
-error_report("ITS KVM: full reset is not supported by the host kernel");
+warn_report("ITS KVM: full reset is not supported by the host kernel");
 
 if (!kvm_device_check_attr(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_ITS_REGS,
GITS_CTLR)) {
-- 
1.8.3.1




Re: [Qemu-devel] [libvirt] CPU Support

2018-07-18 Thread Brijesh Singh



On 7/18/18 8:49 AM, Eduardo Habkost wrote:
> CCing the AMD people who worked on this.
>
> On Wed, Jul 18, 2018 at 12:18:45PM +0200, Pavel Hrdina wrote:
>> On Wed, Jul 18, 2018 at 10:50:34AM +0100, Daniel P. Berrangé wrote:
>>> On Wed, Jul 18, 2018 at 12:41:48PM +0300, Hetz Ben Hamo wrote:
 Hi,

 I've been looking at the CPU list and although I see lots of CPU's, I
 cannot find 2 CPU families:

 * AMD Ryzen
 * AMD Threadripper

 Although EPYC has been added recently.

 Are there any missing details which preventing adding those CPU's to the
 list?
>>> Libvirt adds CPU models based on what QEMU supports. So from libvirt side 
>>> the
>>> answer is simply that QEMU doesn't expose any models for Ryzen/Threadripper,
>>> but I'm not clear why it doesn't...

EPYC model should work just fine on Ryzen/Threadripper. Are we seeing
some issues?

>>> For a while I thought Ryzen/Threadripper would have same feature set as
>>> EPYC, but I've seen bugs recently suggesting that is not in fact the
>>> case. So it does look like having those models exposed by QEMU might
>>> be useful.
>>>
>>> Copy'ing QEMU devel & the CPU model maintainers for opinions.
>> I think that QEMU should figure out some pattern for naming CPU models
>> because it's one big mess.  EPYC and Ryzen are bad names for QEMU as
>> Core/Xeon would be for Intel CPUs.  It's the name of a model families
>> and it will probably remain the same but with different
>> microarchitecture.
>> Better name would be similarly like for the latest Inter CPUs,
>> Skylake-Client and Skylake-Server.  Currently AMD has already two
>> microarchitectures, Zen and Zen+ and there is third one Zen 2 planned.
>>
>> Zen has AMD Ryzen, AMD Ryzen Threadripper and AMD Epyc.
>> Zen+ has AMD Ryzen, AMD Ryzen Threadripper
>>
>> And I bet that Zen 2 will follow the same model families.


My guess is same as your :) I hope sales/marketing does not come up with
different names for Soc's based Zen 2 core.


>> We probably cannot rename EPYC now, but before we introduce Ryzen and
>> Threadripper let's thing about it and come up with better names, for
>> example Zen-Client/Zen-Server Zen+-Client or something like that.

Zen-Client/Zen-Server naming convention looks better.




Re: [Qemu-devel] [PATCH] block/vvfat: Fix crash when reporting error about too many files in directory

2018-07-18 Thread Philippe Mathieu-Daudé
On 07/18/2018 12:28 PM, Thomas Huth wrote:
> When using the vvfat driver with a directory that contains too many files,
> QEMU currently crashes. We are trying to print the wrong path variable here.
> 
> Signed-off-by: Thomas Huth 
> ---
>  block/vvfat.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/block/vvfat.c b/block/vvfat.c
> index fc41841..6ae7458 100644
> --- a/block/vvfat.c
> +++ b/block/vvfat.c
> @@ -975,8 +975,7 @@ static int init_directories(BDRVVVFATState* s,
>  if (mapping->mode & MODE_DIRECTORY) {
>  mapping->begin = cluster;
>  if(read_directory(s, i)) {
> -error_setg(errp, "Could not read directory %s",
> -   mapping->path);
> +error_setg(errp, "Could not read directory \"%s\"", s->path);
>  return -1;
>  }
>  mapping = array_get(&(s->mapping), i);
> 

Reviewed-by: Philippe Mathieu-Daudé 



Re: [Qemu-devel] [PATCH] linux-user/ppc: Implement swapcontext syscall

2018-07-18 Thread Philippe Mathieu-Daudé
Hi Richard,

On 07/18/2018 05:06 PM, Richard Henderson wrote:
> This allows the tests generated by debian-powerpc-user-cross
> to function properly, especially tests/test-coroutine.
> 
> Technically this syscall is available to both ppc32 and ppc64,
> but only ppc32 glibc actually uses it.  Thus the ppc64 path is
> untested.
> 
> Signed-off-by: Richard Henderson 
> ---
>  linux-user/qemu.h   |  2 ++
>  linux-user/ppc/signal.c | 56 +
>  linux-user/syscall.c|  6 +
>  3 files changed, 64 insertions(+)
> 
> diff --git a/linux-user/qemu.h b/linux-user/qemu.h
> index bb85c81aa4..e0963676c7 100644
> --- a/linux-user/qemu.h
> +++ b/linux-user/qemu.h
> @@ -395,6 +395,8 @@ long do_sigreturn(CPUArchState *env);
>  long do_rt_sigreturn(CPUArchState *env);
>  abi_long do_sigaltstack(abi_ulong uss_addr, abi_ulong uoss_addr, abi_ulong 
> sp);
>  int do_sigprocmask(int how, const sigset_t *set, sigset_t *oldset);
> +abi_long do_swapcontext(CPUArchState *env, abi_ulong uold_ctx,
> +abi_ulong unew_ctx, abi_long ctx_size);
>  /**
>   * block_signals: block all signals while handling this guest syscall
>   *
> diff --git a/linux-user/ppc/signal.c b/linux-user/ppc/signal.c
> index ef4c518f11..2ae120a2bc 100644
> --- a/linux-user/ppc/signal.c
> +++ b/linux-user/ppc/signal.c
> @@ -675,3 +675,59 @@ sigsegv:
>  force_sig(TARGET_SIGSEGV);
>  return -TARGET_QEMU_ESIGRETURN;
>  }
> +
> +/* This syscall implements {get,set,swap}context for userland.  */

This comment confuses me because do_setcontext() is available at line 625.

> +abi_long do_swapcontext(CPUArchState *env, abi_ulong uold_ctx,
> +abi_ulong unew_ctx, abi_long ctx_size)
> +{
> +struct target_ucontext *uctx;
> +struct target_mcontext *mctx;
> +
> +/* For ppc32, ctx_size is "reserved for future use".
> + * For ppc64, we do not yet support the VSX extension.
> + */
> +if (ctx_size < sizeof(struct target_ucontext)) {
> +return -TARGET_EINVAL;

Shouldn't this be -TARGET_ENOMEM?

swapcontext(3):
ERRORS
   ENOMEM Insufficient stack space left.

> +}
> +
> +if (uold_ctx) {
> +TaskState *ts = (TaskState *)thread_cpu->opaque;
> +
> +if (!lock_user_struct(VERIFY_WRITE, uctx, uold_ctx, 1)) {
> +return -TARGET_EFAULT;
> +}
> +
> +#ifdef TARGET_PPC64
> +mctx = >tuc_sigcontext.mcontext;
> +#else
> +/* ??? The kernel aligns the pointer down here into padding, but
> + * in setup_rt_frame we don't.  Be self-compatible for now.
> + */
> +mctx = >tuc_mcontext;
> +__put_user(h2g(mctx), >tuc_regs);
> +#endif
> +
> +save_user_regs(env, mctx);
> +host_to_target_sigset(>tuc_sigmask, >signal_mask);
> +
> +unlock_user_struct(uctx, uold_ctx, 1);
> +}
> +
> +if (unew_ctx) {
> +int err;
> +
> +if (!lock_user_struct(VERIFY_READ, uctx, unew_ctx, 1)) {
> +return -TARGET_EFAULT;
> +}
> +err = do_setcontext(uctx, env, 0);
> +unlock_user_struct(uctx, unew_ctx, 1);
> +
> +if (err) {
> +/* We cannot return to a partially updated context.  */
> +force_sig(TARGET_SIGSEGV);
> +}
> +return -TARGET_QEMU_ESIGRETURN;
> +}
> +
> +return 0;
> +}
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 3df3bdffb2..dfc851cc35 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -12790,6 +12790,12 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
> arg1,
>  ret = get_errno(kcmp(arg1, arg2, arg3, arg4, arg5));
>  break;
>  #endif
> +#ifdef TARGET_NR_swapcontext
> +case TARGET_NR_swapcontext:
> +/* PowerPC specific.  */
> +ret = do_swapcontext(cpu_env, arg1, arg2, arg3);
> +break;
> +#endif
>  
>  default:
>  unimplemented:
> 



[Qemu-devel] [PATCH for-3.1 v2] python: Use io.StringIO

2018-07-18 Thread Philippe Mathieu-Daudé
Both Python 2.7 and 3 support the same io.StringIO to
handle unicode strings.

Python 2.6 requires special care, but since 7f2b55443a his
support was removed. Stop caring, drop the ImportError check.

Use the common form to use indistinctly Python 2.7 or 3.

http://python-future.org/compatible_idioms.html#stringio

This fixes running tests on the Fedora Docker image,
which uses Python3 since 356dc290f:

  $ make docker-test-block@fedora
  [...]
  045 [failed, exit status 1] - output mismatch (see 045.out.bad)
  --- /tmp/qemu-test/src/tests/qemu-iotests/045.out   2018-07-17 
16:56:18.0 +
  +++ /tmp/qemu-test/build/tests/qemu-iotests/045.out.bad 2018-07-17 
17:19:22.448409007 +
  @@ -1,5 +1,6 @@
  -...
  ---
  -Ran 11 tests
  -
  -OK
  +Traceback (most recent call last):
  +  File "045", line 178, in 
  +iotests.main(supported_fmts=['raw'])
  +  File "/tmp/qemu-test/src/tests/qemu-iotests/iotests.py", line 682, in main
  +import StringIO
  +ModuleNotFoundError: No module named 'StringIO'
  132 [failed, exit status 1] - output mismatch (see 132.out.bad)
  152 [failed, exit status 1] - output mismatch (see 152.out.bad)

  Failures: 045 132 152

Suggested-by: Eduardo Habkost 
Signed-off-by: Philippe Mathieu-Daudé 
---
 tests/docker/docker.py| 5 +
 tests/image-fuzzer/runner.py  | 6 +++---
 tests/qemu-iotests/iotests.py | 5 +++--
 3 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/tests/docker/docker.py b/tests/docker/docker.py
index 69e7130db7..9d53b868db 100755
--- a/tests/docker/docker.py
+++ b/tests/docker/docker.py
@@ -26,10 +26,7 @@ import tempfile
 import re
 import signal
 from tarfile import TarFile, TarInfo
-try:
-from StringIO import StringIO
-except ImportError:
-from io import StringIO
+from io import StringIO
 from shutil import copy, rmtree
 from pwd import getpwuid
 from datetime import datetime,timedelta
diff --git a/tests/image-fuzzer/runner.py b/tests/image-fuzzer/runner.py
index 95d84f38f3..4462d84f45 100755
--- a/tests/image-fuzzer/runner.py
+++ b/tests/image-fuzzer/runner.py
@@ -28,7 +28,7 @@ import shutil
 from itertools import count
 import time
 import getopt
-import StringIO
+from io import StringIO
 import resource
 
 try:
@@ -183,7 +183,7 @@ class TestEnv(object):
MAX_BACKING_FILE_SIZE) * (1 << 20)
 cmd = self.qemu_img + ['create', '-f', backing_file_fmt,
backing_file_name, str(backing_file_size)]
-temp_log = StringIO.StringIO()
+temp_log = StringIO()
 retcode = run_app(temp_log, cmd)
 if retcode == 0:
 temp_log.close()
@@ -240,7 +240,7 @@ class TestEnv(object):
"Backing file: %s\n" \
% (self.seed, " ".join(current_cmd),
   self.current_dir, backing_file_name)
-temp_log = StringIO.StringIO()
+temp_log = StringIO()
 try:
 retcode = run_app(temp_log, current_cmd)
 except OSError as e:
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 4e67fbbe96..c95dd17190 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -679,13 +679,14 @@ def main(supported_fmts=[], supported_oses=['linux'], 
supported_cache_modes=[],
 
 # We need to filter out the time taken from the output so that qemu-iotest
 # can reliably diff the results against master output.
-import StringIO
+from io import StringIO
+
 if debug:
 output = sys.stdout
 verbosity = 2
 sys.argv.remove('-d')
 else:
-output = StringIO.StringIO()
+output = StringIO()
 
 logging.basicConfig(level=(logging.DEBUG if debug else logging.WARN))
 
-- 
2.18.0




Re: [Qemu-devel] [PULL 0/5] riscv-pull queue

2018-07-18 Thread Alistair Francis
On Wed, Jul 18, 2018 at 3:27 PM, Alistair Francis
 wrote:
> The following changes since commit ea6abffa8a08d832feb759d359d5b935e3087cf7:
>
>   Update version for v3.0.0-rc1 release (2018-07-17 18:15:19 +0100)
>
> are available in the Git repository at:
>
>   g...@github.com:alistair23/qemu.git tags/pull-riscv-pull-20180718-2
>
> for you to fetch changes up to 451ea962be0cbb6710737eeb90c2a801b979cad3:
>
>   spike: Fix crash when introspecting the device (2018-07-18 14:33:40 -0700)
>
> 
> riscv: Fix introspection problems
>
> This is based on Thomas's work fixing introspection problems [1] and
> applied to the RISC-V port.
>
> 1: https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg03261.html
>
> 
> Alistair Francis (5):
>   sifive_e: Fix crash when introspecting the device
>   sifive_u: Fix crash when introspecting the device
>   virt: Fix crash when introspecting the device
>   riscv_hart: Fix crash when introspecting the device
>   spike: Fix crash when introspecting the device

Argh! I just realised I left off Michael's review. I'll have to send a v2.

I'll tag it as v3.0 as well.

Alistair

>
>  hw/riscv/riscv_hart.c |  7 +++
>  hw/riscv/sifive_e.c   | 12 ++--
>  hw/riscv/sifive_u.c   | 15 +++
>  hw/riscv/spike.c  | 10 --
>  hw/riscv/virt.c   |  5 ++---
>  5 files changed, 22 insertions(+), 27 deletions(-)



[Qemu-devel] [PULL 4/5] riscv_hart: Fix crash when introspecting the device

2018-07-18 Thread Alistair Francis
Use the new object_initialize_child() and sysbus_init_child_obj() to
fix the issue.

Signed-off-by: Alistair Francis 
Suggested-by: Thomas Huth 
Reviewed-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 hw/riscv/riscv_hart.c | 7 +++
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/hw/riscv/riscv_hart.c b/hw/riscv/riscv_hart.c
index 75ba7ed579..e34a26a0ef 100644
--- a/hw/riscv/riscv_hart.c
+++ b/hw/riscv/riscv_hart.c
@@ -45,11 +45,10 @@ static void riscv_harts_realize(DeviceState *dev, Error 
**errp)
 s->harts = g_new0(RISCVCPU, s->num_harts);
 
 for (n = 0; n < s->num_harts; n++) {
-
-object_initialize(>harts[n], sizeof(RISCVCPU), s->cpu_type);
+object_initialize_child(OBJECT(s), "harts[*]", >harts[n],
+sizeof(RISCVCPU), s->cpu_type,
+_abort, NULL);
 s->harts[n].env.mhartid = n;
-object_property_add_child(OBJECT(s), "harts[*]", OBJECT(>harts[n]),
-  _abort);
 qemu_register_reset(riscv_harts_cpu_reset, >harts[n]);
 object_property_set_bool(OBJECT(>harts[n]), true,
  "realized", );
-- 
2.17.1




[Qemu-devel] [PULL 5/5] spike: Fix crash when introspecting the device

2018-07-18 Thread Alistair Francis
Use the new object_initialize_child() and sysbus_init_child_obj() to
fix the issue.

Signed-off-by: Alistair Francis 
Suggested-by: Thomas Huth 
Reviewed-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 hw/riscv/spike.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index f94e2b6707..c8c056c50b 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -171,9 +171,8 @@ static void spike_v1_10_0_board_init(MachineState *machine)
 int i;
 
 /* Initialize SOC */
-object_initialize(>soc, sizeof(s->soc), TYPE_RISCV_HART_ARRAY);
-object_property_add_child(OBJECT(machine), "soc", OBJECT(>soc),
-  _abort);
+object_initialize_child(OBJECT(machine), "soc", >soc, sizeof(s->soc),
+TYPE_RISCV_HART_ARRAY, _abort, NULL);
 object_property_set_str(OBJECT(>soc), SPIKE_V1_10_0_CPU, "cpu-type",
 _abort);
 object_property_set_int(OBJECT(>soc), smp_cpus, "num-harts",
@@ -254,9 +253,8 @@ static void spike_v1_09_1_board_init(MachineState *machine)
 int i;
 
 /* Initialize SOC */
-object_initialize(>soc, sizeof(s->soc), TYPE_RISCV_HART_ARRAY);
-object_property_add_child(OBJECT(machine), "soc", OBJECT(>soc),
-  _abort);
+object_initialize_child(OBJECT(machine), "soc", >soc, sizeof(s->soc),
+TYPE_RISCV_HART_ARRAY, _abort, NULL);
 object_property_set_str(OBJECT(>soc), SPIKE_V1_09_1_CPU, "cpu-type",
 _abort);
 object_property_set_int(OBJECT(>soc), smp_cpus, "num-harts",
-- 
2.17.1




[Qemu-devel] [PULL 3/5] virt: Fix crash when introspecting the device

2018-07-18 Thread Alistair Francis
Use the new object_initialize_child() and sysbus_init_child_obj() to
fix the issue.

Signed-off-by: Alistair Francis 
Suggested-by: Thomas Huth 
Reviewed-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 hw/riscv/virt.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index aeada2498d..248bbdffd3 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -274,9 +274,8 @@ static void riscv_virt_board_init(MachineState *machine)
 void *fdt;
 
 /* Initialize SOC */
-object_initialize(>soc, sizeof(s->soc), TYPE_RISCV_HART_ARRAY);
-object_property_add_child(OBJECT(machine), "soc", OBJECT(>soc),
-  _abort);
+object_initialize_child(OBJECT(machine), "soc", >soc, sizeof(s->soc),
+TYPE_RISCV_HART_ARRAY, _abort, NULL);
 object_property_set_str(OBJECT(>soc), VIRT_CPU, "cpu-type",
 _abort);
 object_property_set_int(OBJECT(>soc), smp_cpus, "num-harts",
-- 
2.17.1




[Qemu-devel] [PULL 2/5] sifive_u: Fix crash when introspecting the device

2018-07-18 Thread Alistair Francis
Use the new object_initialize_child() and sysbus_init_child_obj() to
fix the issue.

Signed-off-by: Alistair Francis 
Suggested-by: Thomas Huth 
Reviewed-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 hw/riscv/sifive_u.c | 15 +++
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index 3a6ffeb437..59ae1ce24a 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -244,9 +244,9 @@ static void riscv_sifive_u_init(MachineState *machine)
 int i;
 
 /* Initialize SoC */
-object_initialize(>soc, sizeof(s->soc), TYPE_RISCV_U_SOC);
-object_property_add_child(OBJECT(machine), "soc", OBJECT(>soc),
-  _abort);
+object_initialize_child(OBJECT(machine), "soc", >soc,
+sizeof(s->soc), TYPE_RISCV_U_SOC,
+_abort, NULL);
 object_property_set_bool(OBJECT(>soc), true, "realized",
 _abort);
 
@@ -303,16 +303,15 @@ static void riscv_sifive_u_soc_init(Object *obj)
 {
 SiFiveUSoCState *s = RISCV_U_SOC(obj);
 
-object_initialize(>cpus, sizeof(s->cpus), TYPE_RISCV_HART_ARRAY);
-object_property_add_child(obj, "cpus", OBJECT(>cpus),
-  _abort);
+object_initialize_child(obj, "cpus", >cpus, sizeof(s->cpus),
+TYPE_RISCV_HART_ARRAY, _abort, NULL);
 object_property_set_str(OBJECT(>cpus), SIFIVE_U_CPU, "cpu-type",
 _abort);
 object_property_set_int(OBJECT(>cpus), smp_cpus, "num-harts",
 _abort);
 
-object_initialize(>gem, sizeof(s->gem), TYPE_CADENCE_GEM);
-qdev_set_parent_bus(DEVICE(>gem), sysbus_get_default());
+sysbus_init_child_obj(obj, "gem", >gem, sizeof(s->gem),
+  TYPE_CADENCE_GEM);
 }
 
 static void riscv_sifive_u_soc_realize(DeviceState *dev, Error **errp)
-- 
2.17.1




[Qemu-devel] [PULL 0/5] riscv-pull queue

2018-07-18 Thread Alistair Francis
The following changes since commit ea6abffa8a08d832feb759d359d5b935e3087cf7:

  Update version for v3.0.0-rc1 release (2018-07-17 18:15:19 +0100)

are available in the Git repository at:

  g...@github.com:alistair23/qemu.git tags/pull-riscv-pull-20180718-2

for you to fetch changes up to 451ea962be0cbb6710737eeb90c2a801b979cad3:

  spike: Fix crash when introspecting the device (2018-07-18 14:33:40 -0700)


riscv: Fix introspection problems

This is based on Thomas's work fixing introspection problems [1] and
applied to the RISC-V port.

1: https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg03261.html


Alistair Francis (5):
  sifive_e: Fix crash when introspecting the device
  sifive_u: Fix crash when introspecting the device
  virt: Fix crash when introspecting the device
  riscv_hart: Fix crash when introspecting the device
  spike: Fix crash when introspecting the device

 hw/riscv/riscv_hart.c |  7 +++
 hw/riscv/sifive_e.c   | 12 ++--
 hw/riscv/sifive_u.c   | 15 +++
 hw/riscv/spike.c  | 10 --
 hw/riscv/virt.c   |  5 ++---
 5 files changed, 22 insertions(+), 27 deletions(-)



[Qemu-devel] [PULL 1/5] sifive_e: Fix crash when introspecting the device

2018-07-18 Thread Alistair Francis
Use the new object_initialize_child() and sysbus_init_child_obj() to
fix the issue.

Signed-off-by: Alistair Francis 
Suggested-by: Thomas Huth 
Reviewed-by: Thomas Huth 
Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 
---
 hw/riscv/sifive_e.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
index 8a8dbe1c00..4577d72037 100644
--- a/hw/riscv/sifive_e.c
+++ b/hw/riscv/sifive_e.c
@@ -105,9 +105,9 @@ static void riscv_sifive_e_init(MachineState *machine)
 int i;
 
 /* Initialize SoC */
-object_initialize(>soc, sizeof(s->soc), TYPE_RISCV_E_SOC);
-object_property_add_child(OBJECT(machine), "soc", OBJECT(>soc),
-  _abort);
+object_initialize_child(OBJECT(machine), "soc", >soc,
+sizeof(s->soc), TYPE_RISCV_E_SOC,
+_abort, NULL);
 object_property_set_bool(OBJECT(>soc), true, "realized",
 _abort);
 
@@ -139,9 +139,9 @@ static void riscv_sifive_e_soc_init(Object *obj)
 {
 SiFiveESoCState *s = RISCV_E_SOC(obj);
 
-object_initialize(>cpus, sizeof(s->cpus), TYPE_RISCV_HART_ARRAY);
-object_property_add_child(obj, "cpus", OBJECT(>cpus),
-  _abort);
+object_initialize_child(obj, "cpus", >cpus,
+sizeof(s->cpus), TYPE_RISCV_HART_ARRAY,
+_abort, NULL);
 object_property_set_str(OBJECT(>cpus), SIFIVE_E_CPU, "cpu-type",
 _abort);
 object_property_set_int(OBJECT(>cpus), smp_cpus, "num-harts",
-- 
2.17.1




Re: [Qemu-devel] [PATCH v1 0/5] riscv: Fix introspection problems

2018-07-18 Thread Alistair Francis
On Wed, Jul 18, 2018 at 12:22 AM, Thomas Huth  wrote:
> On 17.07.2018 22:27, Alistair Francis wrote:
>> This is based on Thomas's work fixing introspection problems [1] and
>> applied to the RISC-V port.
>>
>> 1: https://lists.gnu.org/archive/html/qemu-devel/2018-07/msg03261.html
>>
>> Alistair Francis (5):
>>   sifive_e: Fix crash when introspecting the device
>>   sifive_u: Fix crash when introspecting the device
>>   virt: Fix crash when introspecting the device
>>   riscv_hart: Fix crash when introspecting the device
>>   spike: Fix crash when introspecting the device
>>
>>  hw/riscv/riscv_hart.c |  7 +++
>>  hw/riscv/sifive_e.c   | 12 ++--
>>  hw/riscv/sifive_u.c   | 15 +++
>>  hw/riscv/spike.c  | 10 --
>>  hw/riscv/virt.c   |  5 ++---
>>  5 files changed, 22 insertions(+), 27 deletions(-)
>
> That's interesting, these issues did not appear in my tests (modified
> tests/device-introspect-test.c with hmp("info qtree")). Likely because
> device-introspect-test only checks with the "none" machine ==> one more
> reason to test here with all machines, too (I suggested such a patch a
> couple of months ago) ...

What happened to the patch? More tests would be super handy.

Alistair

>
>  Thomas
>
>



[Qemu-devel] How to generate custom fw paths for IDE devices?

2018-07-18 Thread Mark Cave-Ayland

Hi all,

Following on from a couple of patches I've previously posted to the 
mailing list at 
https://lists.gnu.org/archive/html/qemu-devel/2018-06/msg08836.html I've 
made some good progress with trying to add bootindex support to OpenBIOS 
but I'm stuck with generating the IDE device paths from QEMU.


According to OpenBIOS the device path for a cdrom on a sun4u machine 
should be:


  /pci@1fe,0/pci@1,1/ide@3/ide1@8100/cdrom@0

whereas with my working patchset I'm currently generating:

  /pci@1fe,0/pci@1,1/ide@3/drive@1

The issue is that the drive@1 part is being generated by the IDE drive 
device attached to the IDE bus in hw/ide/qdev.c, and so I think I need 
to override idebus_get_fw_dev_path() to manually generate the remainder 
of the path including both the controller and the correctly named drive 
node.


One option may be to consider subclassing IDEBus and overriding 
idebus_get_fw_dev_path() there, but the cmd646 device is a child of 
TYPE_PCI_IDE which has its own internal IDEBus and so it seems 
overriding it is impossible.


Can anyone point me in the right direction as to how to generate the 
correct fw path for IDE devices in the above format for sun4u machines?



ATB,

Mark.



[Qemu-devel] [PATCH] block/file-posix: add bdrv_attach_aio_context callback for host dev and cdrom

2018-07-18 Thread Nishanth Aravamudan via Qemu-devel
In ed6e2161 ("linux-aio: properly bubble up errors from initialzation"),
I only added a bdrv_attach_aio_context callback for the bdrv_file
driver. There are several other drivers that use the shared
aio_plug callback, though, and they will trip the assertion added to
aio_get_linux_aio because they did not call aio_setup_linux_aio first.
Add the appropriate callback definition to the affected driver
definitions.

Fixes: ed6e2161 ("linux-aio: properly bubble up errors from initialization")
Reported-by: Farhan Ali 
Signed-off-by: Nishanth Aravamudan 
Cc: Eric Blake 
Cc: Kevin Wolf 
Cc: John Snow 
Cc: Max Reitz 
Cc: Stefan Hajnoczi 
Cc: Fam Zheng 
Cc: Paolo Bonzini 
Cc: qemu-bl...@nongnu.org
Cc: qemu-devel@nongnu.org
---
 block/file-posix.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/block/file-posix.c b/block/file-posix.c
index 60af4b3d51..ad299beb38 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3158,6 +3158,7 @@ static BlockDriver bdrv_host_device = {
 .bdrv_refresh_limits = raw_refresh_limits,
 .bdrv_io_plug = raw_aio_plug,
 .bdrv_io_unplug = raw_aio_unplug,
+.bdrv_attach_aio_context = raw_aio_attach_aio_context,
 
 .bdrv_co_truncate   = raw_co_truncate,
 .bdrv_getlength= raw_getlength,
@@ -3280,6 +3281,7 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_refresh_limits = raw_refresh_limits,
 .bdrv_io_plug = raw_aio_plug,
 .bdrv_io_unplug = raw_aio_unplug,
+.bdrv_attach_aio_context = raw_aio_attach_aio_context,
 
 .bdrv_co_truncate= raw_co_truncate,
 .bdrv_getlength  = raw_getlength,
@@ -3410,6 +3412,7 @@ static BlockDriver bdrv_host_cdrom = {
 .bdrv_refresh_limits = raw_refresh_limits,
 .bdrv_io_plug = raw_aio_plug,
 .bdrv_io_unplug = raw_aio_unplug,
+.bdrv_attach_aio_context = raw_aio_attach_aio_context,
 
 .bdrv_co_truncate= raw_co_truncate,
 .bdrv_getlength  = raw_getlength,
-- 
2.17.1




Re: [Qemu-devel] [PATCH 1/7] tests: qgraph API for the qtest driver framework

2018-07-18 Thread Emanuele




On 07/18/2018 09:28 PM, Paolo Bonzini wrote:

On 18/07/2018 16:23, Stefan Hajnoczi wrote:

+struct QOSGraphObject {
+/* for produces, returns void * */
+QOSGetDriver get_driver;

Unused?


+/* for contains, returns a QOSGraphObject * */
+QOSGetDevice get_device;

Unused?

What is unused?

Neither of these fields are used in this patch.  Please introduce them
in the first patch that actually uses them.  This way code review can
proceed linearly and it also prevents deadcode when just part of a patch
series is merged or backported.

So do you suggest to squash patch 6 into this one, so that a user of
QOSGraphObject exists already here?
I think he's right, it makes no sense to have qos-graph as patch 6, it 
could go as patch 2, even though by that time no object/node exist yet.


Moreover, right now no file in patch 3-4-5 is actually compiled until 
patch 6, which is easily prone to errors in case of future edits (I too 
have this problem right now, when I need to modify sdhci.c and I'm 
forced to return to the branch head to compile), so adding it before 
would make things more clear.

Thanks,

Paolo





Re: [Qemu-devel] [Qemu-stable] [PATCH v2] tap: fix memory leak on success to create a tap device

2018-07-18 Thread Michael Roth
Quoting Jason Wang (2018-05-31 04:46:05)
> 
> 
> On 2018年05月31日 15:28, wangyunjian wrote:
> > From: Yunjian Wang 
> >
> > The memory leak on success to create a tap device. And the nfds and
> > nvhosts may not be the same and need to be processed separately.
> >
> > Fixes: 07825977 ("tap: fix memory leak on failure to create a multiqueue 
> > tap device")
> > Fixes: 264986e2 ("tap: multiqueue support")
> >
> > CC: QEMU Stable 
> > Signed-off-by: Yunjian Wang 
> > ---
> > v2:
> >   -add commit log and cc qemu-stable with fixes tags
> > ---
> 
> Applied.

It looks like this hasn't made its way to master yet. Is a pull request
still planned?

> 
> Thanks
> 
> >   net/tap.c | 16 ++--
> >   1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/net/tap.c b/net/tap.c
> > index de05f20..6d7710f 100644
> > --- a/net/tap.c
> > +++ b/net/tap.c
> > @@ -803,7 +803,8 @@ int net_init_tap(const Netdev *netdev, const char *name,
> >   } else if (tap->has_fds) {
> >   char **fds;
> >   char **vhost_fds;
> > -int nfds, nvhosts;
> > +int nfds = 0, nvhosts = 0;
> > +int ret = 0;
> >   
> >   if (tap->has_ifname || tap->has_script || tap->has_downscript ||
> >   tap->has_vnet_hdr || tap->has_helper || tap->has_queues ||
> > @@ -823,6 +824,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
> >   if (nfds != nvhosts) {
> >   error_setg(errp, "The number of fds passed does not match 
> > "
> >  "the number of vhostfds passed");
> > +ret = -1;
> >   goto free_fail;
> >   }
> >   }
> > @@ -831,6 +833,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
> >   fd = monitor_fd_param(cur_mon, fds[i], );
> >   if (fd == -1) {
> >   error_propagate(errp, err);
> > +ret = -1;
> >   goto free_fail;
> >   }
> >   
> > @@ -841,6 +844,7 @@ int net_init_tap(const Netdev *netdev, const char *name,
> >   } else if (vnet_hdr != tap_probe_vnet_hdr(fd)) {
> >   error_setg(errp,
> >  "vnet_hdr not consistent across given tap 
> > fds");
> > +ret = -1;
> >   goto free_fail;
> >   }
> >   
> > @@ -850,21 +854,21 @@ int net_init_tap(const Netdev *netdev, const char 
> > *name,
> >vnet_hdr, fd, );
> >   if (err) {
> >   error_propagate(errp, err);
> > +ret = -1;
> >   goto free_fail;
> >   }
> >   }
> > -g_free(fds);
> > -g_free(vhost_fds);
> > -return 0;
> >   
> >   free_fail:
> > +for (i = 0; i < nvhosts; i++) {
> > +g_free(vhost_fds[i]);
> > +}
> >   for (i = 0; i < nfds; i++) {
> >   g_free(fds[i]);
> > -g_free(vhost_fds[i]);
> >   }
> >   g_free(fds);
> >   g_free(vhost_fds);
> > -return -1;
> > +return ret;
> >   } else if (tap->has_helper) {
> >   if (tap->has_ifname || tap->has_script || tap->has_downscript ||
> >   tap->has_vnet_hdr || tap->has_queues || tap->has_vhostfds) {
> 
> 




Re: [Qemu-devel] [Qemu-stable] [PATCH v4] block: fix QEMU crash with scsi-hd and drive_del

2018-07-18 Thread Michael Roth
Quoting Kevin Wolf (2018-05-29 15:19:17)
> Am 28.05.2018 um 14:03 hat Greg Kurz geschrieben:
> > Removing a drive with drive_del while it is being used to run an I/O
> > intensive workload can cause QEMU to crash.
> > 
> > An AIO flush can yield at some point:
> > 
> > blk_aio_flush_entry()
> >  blk_co_flush(blk)
> >   bdrv_co_flush(blk->root->bs)
> >...
> > qemu_coroutine_yield()
> > 
> > and let the HMP command to run, free blk->root and give control
> > back to the AIO flush:
> > 
> > hmp_drive_del()
> >  blk_remove_bs()
> >   bdrv_root_unref_child(blk->root)
> >child_bs = blk->root->bs
> >bdrv_detach_child(blk->root)
> > bdrv_replace_child(blk->root, NULL)
> >  blk->root->bs = NULL
> > g_free(blk->root) <== blk->root becomes stale
> >bdrv_unref(child_bs)
> > bdrv_delete(child_bs)
> >  bdrv_close()
> >   bdrv_drained_begin()
> >bdrv_do_drained_begin()
> > bdrv_drain_recurse()
> >  aio_poll()
> >   ...
> >   qemu_coroutine_switch()
> > 
> > and the AIO flush completion ends up dereferencing blk->root:
> > 
> >   blk_aio_complete()
> >scsi_aio_complete()
> > blk_get_aio_context(blk)
> >  bs = blk_bs(blk)
> >  ie, bs = blk->root ? blk->root->bs : NULL
> > ^
> > stale
> > 
> > The problem is that we should avoid making block driver graph
> > changes while we have in-flight requests. Let's drain all I/O
> > for this BB before calling bdrv_root_unref_child().
> > 
> > Signed-off-by: Greg Kurz 
> 
> Hmm... It sounded convincing, but 'make check-tests/test-replication'
> fails now. The good news is that with the drain fixes, for which I sent
> v2 today, it passes, so instead of staging it in my block branch, I'll
> put it at the end of my branch for the drain fixes.
> 
> Might take a bit longer than planned until it's in master, sorry.

I'm getting the below test-replication failure/trace trying to backport
this patch for 2.12.1 (using this tree:
https://github.com/mdroth/qemu/commits/stable-2.12-staging-f45280cbf)

Is this the same issue you saw, and if so, are the drain fixes
appropriate for 2.12.x? Are there other prereqs/follow-ups you're
aware of that would also be needed?

mdroth@sif:~/w/qemu-build4$ MALLOC_PERTURB_=1 gdb --args tests/test-replication
(gdb) run
Starting program: /home/mdroth/dev/kvm/qemu-build4/tests/test-replication 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x750bf700 (LWP 3916)]
[New Thread 0x748be700 (LWP 3917)]
/replication/primary/read: OK
/replication/primary/write: OK
/replication/primary/start: OK
/replication/primary/stop: OK
/replication/primary/do_checkpoint: OK
/replication/primary/get_error_all: OK
/replication/secondary/read: OK
/replication/secondary/write: OK
/replication/secondary/start: [New Thread 0x7fffea9f5700 (LWP 3918)]
[New Thread 0x7fffea1f4700 (LWP 3919)]
[New Thread 0x7fffe99f3700 (LWP 3920)]
[New Thread 0x7fffe91f2700 (LWP 3921)]
[New Thread 0x7fffe89f1700 (LWP 3922)]
[New Thread 0x7fffdbfff700 (LWP 3923)]
[New Thread 0x7fffdb7fe700 (LWP 3924)]
[New Thread 0x7fffda1ef700 (LWP 3925)]
[New Thread 0x7fffd99ee700 (LWP 3926)]
[New Thread 0x7fffd91ed700 (LWP 3927)]
[New Thread 0x7fffd89ec700 (LWP 3928)]
[New Thread 0x7fffb700 (LWP 3929)]
[New Thread 0x7fffbf7fe700 (LWP 3930)]
[New Thread 0x7fffbeffd700 (LWP 3931)]
[New Thread 0x7fffbe7fc700 (LWP 3932)]
[New Thread 0x7fffbdffb700 (LWP 3933)]
[New Thread 0x7fffbd7fa700 (LWP 3934)]
[New Thread 0x7fffbcff9700 (LWP 3935)]
[New Thread 0x7fff9bfff700 (LWP 3936)]
[New Thread 0x7fff9b7fe700 (LWP 3937)]
[New Thread 0x7fff9affd700 (LWP 3938)]
[New Thread 0x7fff9a7fc700 (LWP 3939)]
[New Thread 0x7fff99ffb700 (LWP 3940)]
OK
/replication/secondary/stop: 
Thread 1 "test-replicatio" received signal SIGSEGV, Segmentation fault.
qemu_mutex_unlock_impl (mutex=mutex@entry=0x101010101010161, 
file=file@entry=0x556734b0 "/home/mdroth/w/qemu4.git/util/async.c", 
line=line@entry=507) at /home/mdroth/w/qemu4.git/util/qemu-thread-posix.c:94
94  assert(mutex->initialized);
(gdb) bt
#0  qemu_mutex_unlock_impl (mutex=mutex@entry=0x101010101010161, 
file=file@entry=0x556734b0 "/home/mdroth/w/qemu4.git/util/async.c", 
line=line@entry=507)
at /home/mdroth/w/qemu4.git/util/qemu-thread-posix.c:94
#1  0x556231f5 in aio_context_release (ctx=ctx@entry=0x101010101010101) 
at /home/mdroth/w/qemu4.git/util/async.c:507
#2  0x555c6895 in bdrv_drain_recurse (bs=bs@entry=0x55d06250) at 
/home/mdroth/w/qemu4.git/block/io.c:197
#3  0x555c6f0f in bdrv_do_drained_begin (bs=0x55d06250, 
recursive=, parent=0x0) at 
/home/mdroth/w/qemu4.git/block/io.c:290
#4  0x555c6ef0 in bdrv_parent_drained_begin (ignore=0x0, 
bs=0x55d56fb0) at /home/mdroth/w/qemu4.git/block/io.c:53
#5  bdrv_do_drained_begin 

Re: [Qemu-devel] [PATCH 2/7] tests/qgraph: pci-pc driver and interface nodes

2018-07-18 Thread Emanuele




On 07/18/2018 09:33 PM, Paolo Bonzini wrote:

On 18/07/2018 20:29, Emanuele wrote:

I had to put this patch here because it also introduces
qpci_device_init, used by sdhci (patch 3).

For the next version I plan to have a patch X where I rename all
occurrences of qpci_init_pc in qpci_pc_new, and a patch X+1 that
introduces qpci_init_pc (was qpci_set_pc) and the other changes.

Should I only introduce qpci_device_init in patch 3 and the remaining
things in patch 5?

I think the general problem here is that in some patches I create
functions that are planned to only be used only in next patches (of the
current series).

I think it's okay this way, however you should justify the changes you
make to "qgraph-ify" each component.

For patch 1, let's wait for Stefan's reply.  Because patch 1 is
introducing the infrastructure, I think it is acceptable that some
definitions are introduced early as long as they have doc comments; it
would make little sense to introduce get_device in patch 4 just because
there are no "contains" edges until then.

However, introducing the qos-test directly at the beginning is also a
possibility.

In either case, we need better doc comments for the function pointers in
QOSGraphObject.
What would you suggest as better doc comments? For version 2 I have 
written a little introduction like in qom/object.h, essentially pasting 
the cover letter and a working example.

Paolo





Re: [Qemu-devel] hw/display/sm501* status

2018-07-18 Thread Sebastian Bauer

Hi,

Am 2018-07-18 04:30, schrieb David Gibson:

On Mon, Jul 16, 2018 at 09:43:05PM +0100, Peter Maydell wrote:

On 16 July 2018 at 21:26, BALATON Zoltan  wrote:
> I could list sm501 in the sam460ex section thus formally taking
> sub-maintainership but this would only go as far that I'll get cc-d on
> changes and try to review them. I still can't (and don't want to) maintain
> my own tree and send pull requests directly so I'd need someone's help with
> that who can take my patches and merge them via their tree. If it's OK to
> continue to use the PPC tree for that I'm fine with that if David agrees
> (which I think he did above).
>
> So is it OK to add sm501 to sam460ex section so I claim responsibility for
> changes and will continue sending patches cc-ing David (and maybe Gerd to
> help spotting problems) then David could take these via the PPC tree?

I think that makes sense. sh4 is pretty dead and if sm501's major
active user is the sam460ex then listing it there seems reasonable.


Sounds good to me.

Zoltan, can you send a patch adding sm501 to the sam460ex section of
the MAINTAINERS file.  That will make you the right point of contact
for it.

On the understanding that it's just about tree logistics, rather than
review, I'll be happy to take sm501 patches from you through my tree.


Zoltan, if you like you can also add me as a possible reviewer for 
anything that is Sam460ex-related.


Bye
Sebastian



[Qemu-devel] [PATCH] linux-user/ppc: Implement swapcontext syscall

2018-07-18 Thread Richard Henderson
This allows the tests generated by debian-powerpc-user-cross
to function properly, especially tests/test-coroutine.

Technically this syscall is available to both ppc32 and ppc64,
but only ppc32 glibc actually uses it.  Thus the ppc64 path is
untested.

Signed-off-by: Richard Henderson 
---
 linux-user/qemu.h   |  2 ++
 linux-user/ppc/signal.c | 56 +
 linux-user/syscall.c|  6 +
 3 files changed, 64 insertions(+)

diff --git a/linux-user/qemu.h b/linux-user/qemu.h
index bb85c81aa4..e0963676c7 100644
--- a/linux-user/qemu.h
+++ b/linux-user/qemu.h
@@ -395,6 +395,8 @@ long do_sigreturn(CPUArchState *env);
 long do_rt_sigreturn(CPUArchState *env);
 abi_long do_sigaltstack(abi_ulong uss_addr, abi_ulong uoss_addr, abi_ulong sp);
 int do_sigprocmask(int how, const sigset_t *set, sigset_t *oldset);
+abi_long do_swapcontext(CPUArchState *env, abi_ulong uold_ctx,
+abi_ulong unew_ctx, abi_long ctx_size);
 /**
  * block_signals: block all signals while handling this guest syscall
  *
diff --git a/linux-user/ppc/signal.c b/linux-user/ppc/signal.c
index ef4c518f11..2ae120a2bc 100644
--- a/linux-user/ppc/signal.c
+++ b/linux-user/ppc/signal.c
@@ -675,3 +675,59 @@ sigsegv:
 force_sig(TARGET_SIGSEGV);
 return -TARGET_QEMU_ESIGRETURN;
 }
+
+/* This syscall implements {get,set,swap}context for userland.  */
+abi_long do_swapcontext(CPUArchState *env, abi_ulong uold_ctx,
+abi_ulong unew_ctx, abi_long ctx_size)
+{
+struct target_ucontext *uctx;
+struct target_mcontext *mctx;
+
+/* For ppc32, ctx_size is "reserved for future use".
+ * For ppc64, we do not yet support the VSX extension.
+ */
+if (ctx_size < sizeof(struct target_ucontext)) {
+return -TARGET_EINVAL;
+}
+
+if (uold_ctx) {
+TaskState *ts = (TaskState *)thread_cpu->opaque;
+
+if (!lock_user_struct(VERIFY_WRITE, uctx, uold_ctx, 1)) {
+return -TARGET_EFAULT;
+}
+
+#ifdef TARGET_PPC64
+mctx = >tuc_sigcontext.mcontext;
+#else
+/* ??? The kernel aligns the pointer down here into padding, but
+ * in setup_rt_frame we don't.  Be self-compatible for now.
+ */
+mctx = >tuc_mcontext;
+__put_user(h2g(mctx), >tuc_regs);
+#endif
+
+save_user_regs(env, mctx);
+host_to_target_sigset(>tuc_sigmask, >signal_mask);
+
+unlock_user_struct(uctx, uold_ctx, 1);
+}
+
+if (unew_ctx) {
+int err;
+
+if (!lock_user_struct(VERIFY_READ, uctx, unew_ctx, 1)) {
+return -TARGET_EFAULT;
+}
+err = do_setcontext(uctx, env, 0);
+unlock_user_struct(uctx, unew_ctx, 1);
+
+if (err) {
+/* We cannot return to a partially updated context.  */
+force_sig(TARGET_SIGSEGV);
+}
+return -TARGET_QEMU_ESIGRETURN;
+}
+
+return 0;
+}
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 3df3bdffb2..dfc851cc35 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -12790,6 +12790,12 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 ret = get_errno(kcmp(arg1, arg2, arg3, arg4, arg5));
 break;
 #endif
+#ifdef TARGET_NR_swapcontext
+case TARGET_NR_swapcontext:
+/* PowerPC specific.  */
+ret = do_swapcontext(cpu_env, arg1, arg2, arg3);
+break;
+#endif
 
 default:
 unimplemented:
-- 
2.17.1




Re: [Qemu-devel] [PATCH 2/7] tests/qgraph: pci-pc driver and interface nodes

2018-07-18 Thread Paolo Bonzini
On 11/07/2018 19:46, Emanuele wrote:
>>> +static void qpci(void)
>>> +{
>>> +    qos_node_create_interface("pci-bus");
>>> +}
>>> +
>>> +libqos_init(qpci);
>> Why does an interface need to be created?  The drivers declare which
>> interfaces they support?
>>
>> I don't think this can be used to detect typoes in the driver's
>> qos_node_produces() call since there is no explicit control over the
>> order in which libqos_init() functions are called.  So the driver may
>> call qos_node_produces() before the qos_node_create_interface() is
>> called?
> The interface is what is actually given to the test, so from there it
> can test the functions and fields regardless of which driver is actually
> implementing them.
> For example, sdhci-test uses the QSDHCI functions, and depending on the
> path the generic-sdhci or sdhci-pci functions will be used.

I think Stefan is right, if you adjust qos_node_create_interface so that
it is idempotent(*), you can make it static and call it from
qos_node_produces and qos_node_consumes.

(*) that is, fail like now if the node exists and is not an interface;
but, succeed if the node exists and is an interface.

Thanks,

Paolo



Re: [Qemu-devel] [PATCH 0/7] Qtest driver framework

2018-07-18 Thread Paolo Bonzini
On 18/07/2018 19:14, Markus Armbruster wrote:
>> The main challenge to me seems "how can we make tests simpler?".  The
>> presence of a new API and object model raises the bar for writing and
>> running tests.  I hope all qtests will use qgraph but if the complexity
>> is too high then qgraph may only be used in a few niche cases where
>> authors think it's worth abstracting machine types and others will
>> continue to write qtests without qgraph.
> The #1 reason for writing code a certain way is following existing
> examples.  As long as "qtest without qgraph" is prevalent in existing
> tests, it'll remain prevalent in new tests.  I'm afraid we'll need a
> conversion effort.
> 

Indeed, Emanuele is already working on converting virtio-serial tests
(virtio is possibly the largest part of libqos).  The various nop tests
are also more or less trivial to adjust.

Paolo



Re: [Qemu-devel] [PATCH 2/7] tests/qgraph: pci-pc driver and interface nodes

2018-07-18 Thread Paolo Bonzini
On 18/07/2018 20:29, Emanuele wrote:
> I had to put this patch here because it also introduces
> qpci_device_init, used by sdhci (patch 3).
> 
> For the next version I plan to have a patch X where I rename all
> occurrences of qpci_init_pc in qpci_pc_new, and a patch X+1 that
> introduces qpci_init_pc (was qpci_set_pc) and the other changes.
> 
> Should I only introduce qpci_device_init in patch 3 and the remaining
> things in patch 5?
> 
> I think the general problem here is that in some patches I create
> functions that are planned to only be used only in next patches (of the
> current series).

I think it's okay this way, however you should justify the changes you
make to "qgraph-ify" each component.

For patch 1, let's wait for Stefan's reply.  Because patch 1 is
introducing the infrastructure, I think it is acceptable that some
definitions are introduced early as long as they have doc comments; it
would make little sense to introduce get_device in patch 4 just because
there are no "contains" edges until then.

However, introducing the qos-test directly at the beginning is also a
possibility.

In either case, we need better doc comments for the function pointers in
QOSGraphObject.

Paolo



Re: [Qemu-devel] [PATCH 1/7] tests: qgraph API for the qtest driver framework

2018-07-18 Thread Paolo Bonzini
On 18/07/2018 16:23, Stefan Hajnoczi wrote:
 +struct QOSGraphObject {
 +/* for produces, returns void * */
 +QOSGetDriver get_driver;
>>> Unused?
>>>
 +/* for contains, returns a QOSGraphObject * */
 +QOSGetDevice get_device;
>>> Unused?
>> What is unused?
> Neither of these fields are used in this patch.  Please introduce them
> in the first patch that actually uses them.  This way code review can
> proceed linearly and it also prevents deadcode when just part of a patch
> series is merged or backported.

So do you suggest to squash patch 6 into this one, so that a user of
QOSGraphObject exists already here?

Thanks,

Paolo



Re: [Qemu-devel] [PATCH] configure: Support pkg-config for zlib

2018-07-18 Thread Peter Maydell
On 18 July 2018 at 18:04, Daniel P. Berrangé  wrote:
> On Wed, Jul 18, 2018 at 06:59:25PM +0200, Stefan Weil wrote:
>> Am 18.07.2018 um 18:21 schrieb Daniel P. Berrangé:
>> >  this fallback support for non-pkgconfig scenarios can be entirely
>> > deleted, just leaving the error_exit message.
>>
>> I have no objection. Thank you for the investigation of the zlib
>> history. Removing old unneeded code is always good, but maybe that's
>> something which could be done after release 3.0.
>>
>> Or would you suggest to do it now? Then I can either send an updated
>> patch (v2), or whoever pulls that patch can make that trivial modification.
>
> If we're ok with adding support for pkg-config in 3.0, I think it is
> reasonable to drop the fallback check at the same time.  I think it
> this stills qualify as a bug fix patch in terms of our freeze rules,
> since we're specificially aiming to fix build problems on one of our
> supported platforms.
>
> So personally I'd suggest sending a v2 with the fallback dropped.
> for 3.0

I would favour doing that part in 3.1. This feels like the
kind of change that we're all pretty certain is safe that
then turns out not to be for some case or other, and it's
easier not to have to fix it up later.

thanks
-- PMM



Re: [Qemu-devel] [BUG?] aio_get_linux_aio: Assertion `ctx->linux_aio' failed

2018-07-18 Thread Nishanth Aravamudan via Qemu-devel
On 18.07.2018 [11:10:27 -0400], Farhan Ali wrote:
> 
> 
> On 07/18/2018 09:42 AM, Farhan Ali wrote:
> > 
> > 
> > On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote:
> > > iiuc, this possibly implies AIO was not actually used previously on this
> > > guest (it might have silently been falling back to threaded IO?). I
> > > don't have access to s390x, but would it be possible to run qemu under
> > > gdb and see if aio_setup_linux_aio is being called at all (I think it
> > > might not be, but I'm not sure why), and if so, if it's for the context
> > > in question?
> > > 
> > > If it's not being called first, could you see what callpath is calling
> > > aio_get_linux_aio when this assertion trips?
> > > 
> > > Thanks!
> > > -Nish
> > 
> > 
> > Hi Nishant,
> > 
> >  From the coredump of the guest this is the call trace that calls
> > aio_get_linux_aio:
> > 
> > 
> > Stack trace of thread 145158:
> > #0  0x03ff94dbe274 raise (libc.so.6)
> > #1  0x03ff94da39a8 abort (libc.so.6)
> > #2  0x03ff94db62ce __assert_fail_base (libc.so.6)
> > #3  0x03ff94db634c __assert_fail (libc.so.6)
> > #4  0x02aa20db067a aio_get_linux_aio (qemu-system-s390x)
> > #5  0x02aa20d229a8 raw_aio_plug (qemu-system-s390x)
> > #6  0x02aa20d309ee bdrv_io_plug (qemu-system-s390x)
> > #7  0x02aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x)
> > #8  0x02aa20db2f6e aio_dispatch_handlers (qemu-system-s390x)
> > #9  0x02aa20db3c34 aio_poll (qemu-system-s390x)
> > #10 0x02aa20be32a2 iothread_run (qemu-system-s390x)
> > #11 0x03ff94f879a8 start_thread (libpthread.so.0)
> > #12 0x03ff94e797ee thread_start (libc.so.6)
> > 
> > 
> > Thanks for taking a look and responding.
> > 
> > Thanks
> > Farhan
> > 
> > 
> > 
> 
> Trying to debug a little further, the block device in this case is a "host
> device". And looking at your commit carefully you use the
> bdrv_attach_aio_context callback to setup a Linux AioContext.
> 
> For some reason the "host device" struct (BlockDriver bdrv_host_device in
> block/file-posix.c) does not have a bdrv_attach_aio_context defined.
> So a simple change of adding the callback to the struct solves the issue and
> the guest starts fine.
> 
> 
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 28824aa..b8d59fb 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = {
>  .bdrv_refresh_limits = raw_refresh_limits,
>  .bdrv_io_plug = raw_aio_plug,
>  .bdrv_io_unplug = raw_aio_unplug,
> +.bdrv_attach_aio_context = raw_aio_attach_aio_context,
> 
>  .bdrv_co_truncate   = raw_co_truncate,
>  .bdrv_getlength= raw_getlength,
> 
> 
> 
> I am not too familiar with block device code in QEMU, so not sure if
> this is the right fix or if there are some underlying problems.

Oh this is quite embarassing! I only added the bdrv_attach_aio_context
callback for the file-backed device. Your fix is definitely corect for
host device. Let me make sure there weren't any others missed and I will
send out a properly formatted patch. Thank you for the quick testing and
turnaround!

-Nish



Re: [Qemu-devel] [PATCH 2/7] tests/qgraph: pci-pc driver and interface nodes

2018-07-18 Thread Emanuele




On 07/18/2018 04:29 PM, Stefan Hajnoczi wrote:

On Wed, Jul 11, 2018 at 05:18:03PM +0200, Paolo Bonzini wrote:

On 11/07/2018 16:49, Stefan Hajnoczi wrote:

On Mon, Jul 09, 2018 at 11:11:31AM +0200, Emanuele Giuseppe Esposito wrote:

-QPCIBus *qpci_init_pc(QTestState *qts, QGuestAllocator *alloc)
+static void *qpci_get_driver(void *obj, const char *interface)
  {
-QPCIBusPC *ret = g_new0(QPCIBusPC, 1);
+QPCIBusPC *qpci = obj;
+if (!g_strcmp0(interface, "pci-bus")) {
+return >bus;
+}
+printf("%s not present in pci-bus-pc", interface);
+abort();
+}

At this point I wonder if it makes sense to use the QEMU Object Model
(QOM), which has interfaces and inheritance.  qgraph duplicates part of
the object model.

Replying for Emanuele on this point since we didn't really go through
QOM yet; I'll let him answer the comments that are more related to the code.

QOM is much heavier weight than qgraph, and adds a lot more boilerplate:

- QOM properties interact with QAPI and bring in a lot of baggage;
qgraph would only use "child" properties to implement containment.

- QOM objects are long-lived, qgraph objects only last for the duration
of a single test.  qgraph doesn't need reference counting or complex
two-phase initialization like "realize" or "user_complete"

- QOM's hierarchy is shallow, but qgraph doesn't really need _any_
hierarchy.  Because it focuses on interface rather than hierarchy,
everything can be expressed simply through structs contained into one
another.

Consider that qgraph is 1/4th the size of QOM, and a large part of it is
the graph data structure that (as you said) would not be provided by QOM.

There are two things where using QOM would save a little bit of
duplicated concepts, namely the get_driver (get interface, what you
quote above) and get_device (get contained object) callbacks.  However,
it wouldn't simplify the code at all, because it would require the
introduction of separate instance and class structs.  We didn't want to
add all too much boilerplate for people that want to write qtest, as you
pointed out in the review of patch 4.

Yes, I think these are good points.  QOM does involve a lot of
boilerplate for defining objects.


+void qpci_set_pc(QPCIBusPC *ret, QTestState *qts, QGuestAllocator *alloc)

It's not clear to me what the purpose of this function is - at least the
name is a bit cryptic since it seems more like an initialization
function than 'setting pc' on QPCIBusPC.  How about inlining this in
qpci_init_pc() instead of keeping a separate function?

I agree about the naming.  Perhaps we can rename the existing
qpci_init_pc to qpci_pc_new, and qpci_set_pc to qpci_pc_init.

Would you prefer if the split was done in the patch that introduces the
user for qpci_set_pc (patch 5)?  We did it here because this patch
prepares qpci-pc

Either way is fine.  I suggested inlining mainly because it avoids the
need to pick a good name :).
I had to put this patch here because it also introduces 
qpci_device_init, used by sdhci (patch 3).


For the next version I plan to have a patch X where I rename all 
occurrences of qpci_init_pc in qpci_pc_new, and a patch X+1 that 
introduces qpci_init_pc (was qpci_set_pc) and the other changes.


Should I only introduce qpci_device_init in patch 3 and the remaining 
things in patch 5?


I think the general problem here is that in some patches I create 
functions that are planned to only be used only in next patches (of the 
current series).

Stefan





Re: [Qemu-devel] [PATCH 1/7] tests: qgraph API for the qtest driver framework

2018-07-18 Thread Emanuele




On 07/18/2018 04:23 PM, Stefan Hajnoczi wrote:

On Wed, Jul 11, 2018 at 04:58:41PM +0200, Paolo Bonzini wrote:

On 11/07/2018 16:28, Stefan Hajnoczi wrote:

+ *
+ * QOSGraphObject also provides a destructor, used to deallocate the
+ * after the test has been executed.
+ */
+struct QOSGraphObject {
+/* for produces, returns void * */
+QOSGetDriver get_driver;

Unused?


+/* for contains, returns a QOSGraphObject * */
+QOSGetDevice get_device;

Unused?

What is unused?

Neither of these fields are used in this patch.  Please introduce them
in the first patch that actually uses them.  This way code review can
proceed linearly and it also prevents deadcode when just part of a patch
series is merged or backported.
I'm sorry but at this point also sdhci.c, raspi2-machine.c and 
x86_64_pc-machine.c (patch 3-4-5) are not actually even compiled before 
qos-test.c (patch 6). Is that wrong too?




Re: [Qemu-devel] [PATCH] nvic: Change NVIC to support ARMv6-M

2018-07-18 Thread Peter Maydell
On 18 July 2018 at 14:59, Julia Suvorova  wrote:
> On 17.07.2018 15:58, Peter Maydell wrote:
>>
>> On 10 July 2018 at 16:33, Julia Suvorova  wrote:
>>>
>>> The differences from ARMv7-M NVIC are:
>>>* ARMv6-M only supports up to 32 external interrupts
>>> (configurable feature already). The ICTR is reserved.
>>>* Active Bit Register is reserved.
>>>* ARMv6-M supports 4 priority levels against 256 in ARMv7-M.
>>>
>>> Signed-off-by: Julia Suvorova 
>>> ---
>>>   hw/intc/armv7m_nvic.c | 29 +
>>>   1 file changed, 25 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
>>> index 38aaf3dc8e..8545c87caa 100644
>>> --- a/hw/intc/armv7m_nvic.c
>>> +++ b/hw/intc/armv7m_nvic.c
>>> @@ -420,6 +420,10 @@ static void set_prio(NVICState *s, unsigned irq,
>>> bool secure, uint8_t prio)
>>>   assert(irq > ARMV7M_EXCP_NMI); /* only use for configurable prios
>>> */
>>>   assert(irq < s->num_irq);
>>>
>>> +if (!arm_feature(>cpu->env, ARM_FEATURE_V7)) {
>>> +prio &= 0xc0;
>>
>>
>> Rather than hard-coding this, I think we should have a
>> num_prio_bits field in the NVICState struct which defines
>> how many bits of priority are implemented. This would be
>> 2 for v6M and 8 otherwise, and can be set in
>> armv7m_nvic_realize. Then the mask is
>>MAKE_64BIT_MASK(8 - num_prio_bits, num_prio_bits);
>> (For v8M the number of priority bits is configurable.)
>
>
> Do I understand correctly that the check in armv7m_nvic_realize is for
> Baseline, not only v6m, because Baseline has only 4 priority levels too?
> And num_prio_bits should be just a field, not a property?

Make it for v6M -- v8M baseline doesn't specify a particular
number of priority levels, you can do anything. You can leave
it as a field and we'll bump it up to a property if we need it
(probably if/when we do a v8M baseline CPU model); though
if you want to make it a full property that's fine too.

thanks
-- PMM



Re: [Qemu-devel] Native Memory Virtualization in qemu-system-aarch64

2018-07-18 Thread Peter Maydell
On 18 July 2018 at 02:34, Kevin Loughlin  wrote:
> Under my setup, the CPU's MMU translates from VAs to IPAs, and an external
> memory controller then intercepts all memory transactions and translates
> these IPAs to true PAs. This allows the memory controller to enforce
> physical isolation of environments, and does not expose true PAs to the
> CPU/system software.

Ah, right, "external custom memory controller" makes sense.

> My question is how best to emulate the memory controller given this desired
> setup. I have three primary ideas, and I would love to get feedback on their
> feasibility.
>
> Implement the controller as an IOMMU region. I would be responsible for
> writing the controller's operations to shift and forward the target address
> to the appropriate subregion. Would it be possible to trigger the IOMMU
> region on every access to system_memory? For example, even during QEMU's
> loading process? Or would I only be able to trigger the IOMMU operations on
> access to the subregions that represent my environments? My understanding of
> the IOMMU regions is shaky. Nonetheless, this sounds like the most promising
> approach, assuming I can provide the shifting and forwarding operations and
> hide the PAs from the CPU's TLB as desired.

I would probably go with implementing it as an IOMMU region. We recently
added code to QEMU that allows you to put IOMMUs in the CPU's
memory-access path, so this works now. The example we have of
that at the moment is hw/misc/tz-mpc.c (which is a simple device
which configurably controls access to the thing "downstream" of it
based on a lookup table and whether the access is S or NS).

As you've guessed, the way the IOMMU stuff works is that it gates
access to the things sat behind it: the device has a MemoryRegion
"upstream" which it exposes to the code which creates it, and a
MemoryRegion property "downstream". The creating code (ie the board)
passes in whatever the "downstream" is (likely a container MemoryRegion
with RAM and so on), and maps the "upstream" end into the address
space that the CPU sees. (You would probably have one downstream
for each separate subregion). You could either have the IOMMU
only "in front" of one part of the overall address space, or
in front of the whole of the address space, as you liked.
(Assuming you have some kind of "control register" memory mapped
interface for programming it, it can't be behind itself; that
would be "an interesting topological exercise", to quote nethack.)

What the CPU sees is whatever is in the MemoryRegion passed to it
via
object_property_set_link(cpuobj, ..., "memory",
 _abort);

The virt board happens to currently use get_system_memory()
for that, but you can use a custom container MemoryRegion if you
like.

> Go into the target/arm code, find every instance of accesses to address
> spaces, and shift the target physical address accordingly. This seems ugly
> and unlikely to work.

That's very fragile, and I don't recommend it.

> Use overlapping subregions with differing priorities, as in done in QEMU's
> TrustZone implementation. However, these priorities would have to change on
> an environment context switch, and I don't know if that would lead to chaos.

You can model things this way too, yes (both by changing priorities, and by
simply enabling/disabling/mapping/unmapping memory regions). The main
advantage that using IOMMU regions gets you are:
 * you can do things at a finer granularity, say changing
   permissions at a page-by-page level, without creating a
   ton of MemoryRegion objects. Swapping MRs in and out would
   work if you only wanted to do it for big chunks of space at once
 * you can make the choice of what to do based on the memory
   transaction attributes (eg secure/nonsecure, user/privileged)
   rather than having to provide a single mapping only
If you really do only want to map 4G of RAM in and out at once,
this might be simpler.

Note that for both the IOMMU approach and the MemoryRegion map/unmap
approach, changing the mapping will blow away the emulated CPU's
cached TLB entirely. So if you do it very often you'll see a
performance hit. (In the IOMMU case it might in theory be possible
to get some of that performance back by being cleverer in the core
memory subsystem code so as to only drop the bits of the TLB that
are affected; but if you're remapping all-of-RAM then that probably
covers all the interesting cached TLB entries anyhow.)

thanks
-- PMM



Re: [Qemu-devel] [PATCH 0/7] Qtest driver framework

2018-07-18 Thread Markus Armbruster
Stefan Hajnoczi  writes:

> On Mon, Jul 09, 2018 at 11:11:29AM +0200, Emanuele Giuseppe Esposito wrote:
>> This work is being done as Google Summer of Code 2018 project for QEMU,
>> my mentors are Paolo Bonzini and Laurent Vivier.
>> Additional infos on the project can be found at:
>> https://wiki.qemu.org/Features/qtest_driver_framework
>
> Thanks, I've finished reviewing this version.  It looks like a good
> start!
>
> The main challenge to me seems "how can we make tests simpler?".  The
> presence of a new API and object model raises the bar for writing and
> running tests.  I hope all qtests will use qgraph but if the complexity
> is too high then qgraph may only be used in a few niche cases where
> authors think it's worth abstracting machine types and others will
> continue to write qtests without qgraph.

The #1 reason for writing code a certain way is following existing
examples.  As long as "qtest without qgraph" is prevalent in existing
tests, it'll remain prevalent in new tests.  I'm afraid we'll need a
conversion effort.

> What are the next steps and do you have plans for making qgraph more
> integral to libqos?

Good questions.



Re: [Qemu-devel] [PATCH] configure: Support pkg-config for zlib

2018-07-18 Thread Daniel P . Berrangé
On Wed, Jul 18, 2018 at 06:59:25PM +0200, Stefan Weil wrote:
> Am 18.07.2018 um 18:21 schrieb Daniel P. Berrangé:
> > On Thu, Jul 12, 2018 at 09:26:03PM +0200, Stefan Weil wrote:
> >> This is needed for builds with the mingw64-* packages from Cygwin,
> >> but also works for Linux.
> >>
> >> Move the zlib test also more to the end because users should
> >> get information on the really important missing packages
> >> (which also require zlib) first.
> > 
> > According to the zlib Changelog file pkgconfig support was added
> > in 2006 !
> > 
> > [quote]
> > Changes in 1.2.3.1 (16 August 2006)
> > 
> >   - Add pkgconfig support [Weigelt]
> > [/quote]
> > 
> > Given our target build platforms support guidelines
> > 
> >   https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms
> > 
> > we can safely say that all supported platforms will have a zlib
> > that contains pkgconfig support, so..
> > 
> >>
> >> Signed-off-by: Stefan Weil 
> >> ---
> >>  configure | 40 +++-
> >>  1 file changed, 23 insertions(+), 17 deletions(-)
> >>
> >> diff --git a/configure b/configure
> >> index 2a7796ea80..dcaab01729 100755
> >> --- a/configure
> >> +++ b/configure
> >> @@ -2140,23 +2140,6 @@ EOF
> >>fi
> >>  fi
> >>  
> >> -#
> >> -# zlib check
> >> -
> >> -if test "$zlib" != "no" ; then
> >> -cat > $TMPC << EOF
> >> -#include 
> >> -int main(void) { zlibVersion(); return 0; }
> >> -EOF
> >> -if compile_prog "" "-lz" ; then
> >> -:
> >> -else
> >> -error_exit "zlib check failed" \
> >> -"Make sure to have the zlib libs and headers installed."
> >> -fi
> >> -fi
> >> -LIBS="$LIBS -lz"
> >> -
> >>  ##
> >>  # lzo check
> >>  
> >> @@ -3525,6 +3508,29 @@ if ! compile_prog "$glib_cflags -Werror" 
> >> "$glib_libs" ; then
> >>  fi
> >>  fi
> >>  
> >> +#
> >> +# zlib check
> >> +
> >> +if test "$zlib" != "no" ; then
> >> +if $pkg_config --exists zlib; then
> >> +zlib_cflags=$($pkg_config --cflags zlib)
> >> +zlib_libs=$($pkg_config --libs zlib)
> >> +QEMU_CFLAGS="$zlib_cflags $QEMU_CFLAGS"
> >> +LIBS="$zlib_libs $LIBS"
> >> +else
> >> +cat > $TMPC << EOF
> >> +#include 
> >> +int main(void) { zlibVersion(); return 0; }
> >> +EOF
> >> +if compile_prog "" "-lz" ; then
> >> +LIBS="$LIBS -lz"
> >> +else
> >> +error_exit "zlib check failed" \
> >> +"Make sure to have the zlib libs and headers installed."
> >> +fi
> >> +fi
> > 
> >  this fallback support for non-pkgconfig scenarios can be entirely
> > deleted, just leaving the error_exit message. 
> 
> I have no objection. Thank you for the investigation of the zlib
> history. Removing old unneeded code is always good, but maybe that's
> something which could be done after release 3.0.
> 
> Or would you suggest to do it now? Then I can either send an updated
> patch (v2), or whoever pulls that patch can make that trivial modification.

If we're ok with adding support for pkg-config in 3.0, I think it is
reasonable to drop the fallback check at the same time.  I think it
this stills qualify as a bug fix patch in terms of our freeze rules,
since we're specificially aiming to fix build problems on one of our
supported platforms.

So personally I'd suggest sending a v2 with the fallback dropped.
for 3.0

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH] configure: Support pkg-config for zlib

2018-07-18 Thread Stefan Weil
Am 18.07.2018 um 18:21 schrieb Daniel P. Berrangé:
> On Thu, Jul 12, 2018 at 09:26:03PM +0200, Stefan Weil wrote:
>> This is needed for builds with the mingw64-* packages from Cygwin,
>> but also works for Linux.
>>
>> Move the zlib test also more to the end because users should
>> get information on the really important missing packages
>> (which also require zlib) first.
> 
> According to the zlib Changelog file pkgconfig support was added
> in 2006 !
> 
> [quote]
> Changes in 1.2.3.1 (16 August 2006)
> 
>   - Add pkgconfig support [Weigelt]
> [/quote]
> 
> Given our target build platforms support guidelines
> 
>   https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms
> 
> we can safely say that all supported platforms will have a zlib
> that contains pkgconfig support, so..
> 
>>
>> Signed-off-by: Stefan Weil 
>> ---
>>  configure | 40 +++-
>>  1 file changed, 23 insertions(+), 17 deletions(-)
>>
>> diff --git a/configure b/configure
>> index 2a7796ea80..dcaab01729 100755
>> --- a/configure
>> +++ b/configure
>> @@ -2140,23 +2140,6 @@ EOF
>>fi
>>  fi
>>  
>> -#
>> -# zlib check
>> -
>> -if test "$zlib" != "no" ; then
>> -cat > $TMPC << EOF
>> -#include 
>> -int main(void) { zlibVersion(); return 0; }
>> -EOF
>> -if compile_prog "" "-lz" ; then
>> -:
>> -else
>> -error_exit "zlib check failed" \
>> -"Make sure to have the zlib libs and headers installed."
>> -fi
>> -fi
>> -LIBS="$LIBS -lz"
>> -
>>  ##
>>  # lzo check
>>  
>> @@ -3525,6 +3508,29 @@ if ! compile_prog "$glib_cflags -Werror" "$glib_libs" 
>> ; then
>>  fi
>>  fi
>>  
>> +#
>> +# zlib check
>> +
>> +if test "$zlib" != "no" ; then
>> +if $pkg_config --exists zlib; then
>> +zlib_cflags=$($pkg_config --cflags zlib)
>> +zlib_libs=$($pkg_config --libs zlib)
>> +QEMU_CFLAGS="$zlib_cflags $QEMU_CFLAGS"
>> +LIBS="$zlib_libs $LIBS"
>> +else
>> +cat > $TMPC << EOF
>> +#include 
>> +int main(void) { zlibVersion(); return 0; }
>> +EOF
>> +if compile_prog "" "-lz" ; then
>> +LIBS="$LIBS -lz"
>> +else
>> +error_exit "zlib check failed" \
>> +"Make sure to have the zlib libs and headers installed."
>> +fi
>> +fi
> 
>  this fallback support for non-pkgconfig scenarios can be entirely
> deleted, just leaving the error_exit message. 
> 
> 
> Regards,
> Daniel

I have no objection. Thank you for the investigation of the zlib
history. Removing old unneeded code is always good, but maybe that's
something which could be done after release 3.0.

Or would you suggest to do it now? Then I can either send an updated
patch (v2), or whoever pulls that patch can make that trivial modification.

Cheers
Stefan




Re: [Qemu-devel] [RFC PATCH 1/3] balloon: Allow nested inhibits

2018-07-18 Thread Alex Williamson
On Wed, 18 Jul 2018 14:40:15 +0800
Peter Xu  wrote:

> On Tue, Jul 17, 2018 at 04:47:37PM -0600, Alex Williamson wrote:
> > A simple true/false internal state does not allow multiple users.  Fix
> > this within the existing interface by converting to a counter, so long
> > as the counter is elevated, ballooning is inhibited.
> > 
> > Signed-off-by: Alex Williamson 
> > ---
> >  balloon.c |7 ---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > diff --git a/balloon.c b/balloon.c
> > index 6bf0a9681377..2a6a7e1a22a0 100644
> > --- a/balloon.c
> > +++ b/balloon.c
> > @@ -37,16 +37,17 @@
> >  static QEMUBalloonEvent *balloon_event_fn;
> >  static QEMUBalloonStatus *balloon_stat_fn;
> >  static void *balloon_opaque;
> > -static bool balloon_inhibited;
> > +static int balloon_inhibited;
> >  
> >  bool qemu_balloon_is_inhibited(void)
> >  {
> > -return balloon_inhibited;
> > +return balloon_inhibited > 0;
> >  }
> >  
> >  void qemu_balloon_inhibit(bool state)
> >  {
> > -balloon_inhibited = state;
> > +balloon_inhibited += (state ? 1 : -1);
> > +assert(balloon_inhibited >= 0);  
> 
> Better do it atomically?

I'd assumed we're protected by the BQL anywhere this is called.  Is
that not the case?  Generally when I try to add any sort of locking to
QEMU it's shot down because the code paths are already serialized.
Thanks,

Alex



Re: [Qemu-devel] [RFC PATCH 0/3] Balloon inhibit enhancements

2018-07-18 Thread Alex Williamson
On Wed, 18 Jul 2018 14:48:03 +0800
Peter Xu  wrote:

> On Tue, Jul 17, 2018 at 04:47:31PM -0600, Alex Williamson wrote:
> > Directly assigned vfio devices have never been compatible with
> > ballooning.  Zapping MADV_DONTNEED pages happens completely
> > independent of vfio page pinning and IOMMU mapping, leaving us with
> > inconsistent GPA to HPA mapping between vCPUs and assigned devices
> > when the balloon deflates.  Mediated devices can theoretically do
> > better, if we make the assumption that the mdev vendor driver is fully
> > synchronized to the actual working set of the guest driver.  In that
> > case the guest balloon driver should never be able to allocate an mdev
> > pinned page for balloon inflation.  Unfortunately, QEMU can't know the
> > workings of the vendor driver pinning, and doesn't actually know the
> > difference between mdev devices and directly assigned devices.  Until
> > we can sort out how the vfio IOMMU backend can tell us if ballooning
> > is safe, the best approach is to disabling ballooning any time a vfio
> > devices is attached.
> > 
> > To do that, simply make the balloon inhibitor a counter rather than a
> > boolean, fixup a case where KVM can then simply use the inhibit
> > interface, and inhibit ballooning any time a vfio device is attached.
> > I'm expecting we'll expose some sort of flag similar to
> > KVM_CAP_SYNC_MMU from the vfio IOMMU for cases where we can resolve
> > this.  An addition we could consider here would be yet another device
> > option for vfio, such as x-disable-balloon-inhibit, in case there are
> > mdev devices that behave in a manner compatible with ballooning.
> > 
> > Please let me know if this looks like a good idea.  Thanks,  
> 
> IMHO patches 1-2 are good cleanup as standalone patches...
> 
> I totally have no idea on whether people would like to use vfio-pci
> and the balloon device at the same time.  After all vfio-pci are
> majorly for performance players, then I would vaguely guess that they
> don't really care thin provisioning of memory at all, hence the usage
> scenario might not exist much.  Is that the major reason that we'd
> just like to disable it (which makes sense to me)?

Well, the major reason for disabling it is that it currently doesn't
work and when the balloon is deflated, the device and vCPU are talking
to different host pages for the same GPA for previously ballooned
pages.  Regardless of the amenability of device assignment to various
usage scenarios, that's a bad thing.  I guess most device assignment
users have either realized this doesn't work and avoid it, or perhaps
they have VMs tuned more for performance than density and (again) don't
use ballooning.
 
> I'm wondering what if want to do that somehow some day... Whether
> it'll work if we just let vfio-pci devices to register some guest
> memory invalidation hook (just like the iommu notifiers, but for guest
> memory address space instead), then we map/unmap the IOMMU pages there
> for vfio-pci device to make sure the inflated balloon pages are not
> mapped and also make sure new pages are remapped with correct HPA
> after deflated.  This is a pure question out of my curiosity, and for
> sure it makes little sense if the answer of the first question above
> is positive.

This is why I mention the KVM MMU synchronization flag above.  KVM
essentially had this same problem and fixed it with with MMU notifiers
in the kernel.  They expose that KVM has the capability of handling
such a scenario via a feature flag.  We can do the same with vfio.  In
scenarios where we're able to fix this, we could expose a flag on the
container indicating support for the same sort of thing.

There are a few complications to this support though.  First ballooning
works at page size granularity, but IOMMU mapping can make use of
arbitrary superpage sizes and the IOMMU API only guarantees unmap
granularity equal to the original mapping.  Therefore we cannot unmap
individual pages unless we require that all mappings through the IOMMU
API are done with page granularity, precluding the use of superpages by
the IOMMU and thereby inflicting higher IOTLB overhead.  Unlike a CPU,
we can't invalidate the mappings and fault them back in or halt the
processor to make the page table updates appear atomic.  The device is
considered always running and interfering with that would likely lead
to functional issues.

Second MMU notifiers seem to provide invalidation, pte change notices,
and page aging interfaces, so if a page is consumed by the balloon
inflating, we can invalidate it (modulo the issues in the previous
paragraph), but how do we re-populate the mapping through the IOMMU
when the page is released as the balloon is deflated?  KVM seems to do
this by handling the page fault, but we don't really have that option
for devices.  If we try to solve this only for mdev devices, we can
request invalidation down the vendor driver with page granularity and
we could assume a vendor driver that's well 

Re: [Qemu-devel] [PATCH] configure: Support pkg-config for zlib

2018-07-18 Thread Daniel P . Berrangé
On Thu, Jul 12, 2018 at 09:26:03PM +0200, Stefan Weil wrote:
> This is needed for builds with the mingw64-* packages from Cygwin,
> but also works for Linux.
> 
> Move the zlib test also more to the end because users should
> get information on the really important missing packages
> (which also require zlib) first.

According to the zlib Changelog file pkgconfig support was added
in 2006 !

[quote]
Changes in 1.2.3.1 (16 August 2006)

  - Add pkgconfig support [Weigelt]
[/quote]

Given our target build platforms support guidelines

  https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms

we can safely say that all supported platforms will have a zlib
that contains pkgconfig support, so..

> 
> Signed-off-by: Stefan Weil 
> ---
>  configure | 40 +++-
>  1 file changed, 23 insertions(+), 17 deletions(-)
> 
> diff --git a/configure b/configure
> index 2a7796ea80..dcaab01729 100755
> --- a/configure
> +++ b/configure
> @@ -2140,23 +2140,6 @@ EOF
>fi
>  fi
>  
> -#
> -# zlib check
> -
> -if test "$zlib" != "no" ; then
> -cat > $TMPC << EOF
> -#include 
> -int main(void) { zlibVersion(); return 0; }
> -EOF
> -if compile_prog "" "-lz" ; then
> -:
> -else
> -error_exit "zlib check failed" \
> -"Make sure to have the zlib libs and headers installed."
> -fi
> -fi
> -LIBS="$LIBS -lz"
> -
>  ##
>  # lzo check
>  
> @@ -3525,6 +3508,29 @@ if ! compile_prog "$glib_cflags -Werror" "$glib_libs" 
> ; then
>  fi
>  fi
>  
> +#
> +# zlib check
> +
> +if test "$zlib" != "no" ; then
> +if $pkg_config --exists zlib; then
> +zlib_cflags=$($pkg_config --cflags zlib)
> +zlib_libs=$($pkg_config --libs zlib)
> +QEMU_CFLAGS="$zlib_cflags $QEMU_CFLAGS"
> +LIBS="$zlib_libs $LIBS"
> +else
> +cat > $TMPC << EOF
> +#include 
> +int main(void) { zlibVersion(); return 0; }
> +EOF
> +if compile_prog "" "-lz" ; then
> +LIBS="$LIBS -lz"
> +else
> +error_exit "zlib check failed" \
> +"Make sure to have the zlib libs and headers installed."
> +fi
> +fi

 this fallback support for non-pkgconfig scenarios can be entirely
deleted, just leaving the error_exit message. 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PATCH] configure: Support pkg-config for zlib

2018-07-18 Thread Stefan Hajnoczi
On Thu, Jul 12, 2018 at 09:26:03PM +0200, Stefan Weil wrote:
> This is needed for builds with the mingw64-* packages from Cygwin,
> but also works for Linux.
> 
> Move the zlib test also more to the end because users should
> get information on the really important missing packages
> (which also require zlib) first.
> 
> Signed-off-by: Stefan Weil 
> ---
>  configure | 40 +++-
>  1 file changed, 23 insertions(+), 17 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH] migration: add capability to bypass the shared memory

2018-07-18 Thread Stefan Hajnoczi
On Thu, Jul 12, 2018 at 11:02:08PM +0800, Peng Tao wrote:
> On Tue, Jul 10, 2018 at 9:40 PM, Stefan Hajnoczi  wrote:
> > Two things come to mind:
> >
> > At that point both guest kernel and agent address-space layout
> > randomization (ASLR) is finished.  ALSR makes it harder for memory
> > corruption bugs to lead to real exploits because the attacker does not
> > know the full memory layout of the process.  Cloned VMs will not benefit
> > from ASLR because much of the memory layout of the guest kernel and
> > agent will be identical across all clones.
> >
> Yes, indeed. I am not arguing that ASLR is retained with VM
> templating. Just that ASLR is also compromised if one wants to use KSM
> to save memory by sharing among different guests. Kata is already
> shipping with KSM components and we are adding VM templating as a
> better alternative.

Hang on, ASLR is *not* compromised by KSM.  The address space layout is
still unique for each guest, even if KSM deduplicates physical pages on
the host.  Remember ASLR is about virtual addresses while KSM is about
sharing the physical pages.  Therefore KSM does not affect ASLR.

The KSM issue you referred to earlier is a timing side-channel attack.
Being vulnerable to timing side-channel attacks through KSM does not
reduce the effectiveness of ASLR.

> > Software random number generators have probably been initialized at this
> > point.  This doesn't mean that all cloned VMs will produce the same
> > sequence of random numbers since they should incorporate entropy sources
> > or use hardware random number generators, but the quality of random
> > numbers might be reduced.  Someone who knows random number generators
> > should take a look at this.
> >
> As Andrea pointed out earlier in his comments, we can configure the
> random number generator to printk a warning if it's being used at boot
> before it had its "shutdown" state restored. Then we can add a new
> kata-agent request set the entropy and check for such warning after a
> new VM is cloned and before it is given to the user. This way, we are
> guaranteed that random numbers generated by each guest is created with
> a different seed. Do you have other concern with this method?

Sounds good.

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH for-3.1] qemu-iotests: Adapt to moved location of StringIO module in py3

2018-07-18 Thread Eduardo Habkost
On Wed, Jul 18, 2018 at 12:22:19PM -0300, Philippe Mathieu-Daudé wrote:
> On 07/18/2018 12:05 PM, Eduardo Habkost wrote:
> > On Wed, Jul 18, 2018 at 12:02:39PM -0300, Philippe Mathieu-Daudé wrote:
> >> Hi Eduardo,
> >>
> >> On 07/18/2018 11:53 AM, Eduardo Habkost wrote:
> >>> On Tue, Jul 17, 2018 at 08:40:15PM -0300, Philippe Mathieu-Daudé wrote:
> >>> [...]
>  -import StringIO
>  +try:
>  +from StringIO import StringIO
>  +except ImportError:
>  +from io import StringIO
> >>>
> >>> Why do we need this?  Python 2.7 has io.StringIO.
> >>
> >> Python 2 works fine, the problem is the Fedora Docker image uses Python
> >> 3 and the block tests started to fail...
> > 
> > My question is: why use StringIO.StringIO on Python 2 and
> > io.StringIO on Python 3, if io.StringIO works on both Python
> > versions?
> 
> Oh I missed your question because I was not aware of this, and looked
> how this was handled in the tree (7a5d936b6fc and 5f90af8e6b).
> 
> TIL we can use "from io import StringIO" regardless the version, the
> 2->3 conversion looks a bit less absurd, thanks!

Note that it's not always an obvious conversion: on Python 3 you
need to decide if you want a text file or binary file.
io.StringIO is a text file, io.BinaryIO is a binary file.  See
https://www.mail-archive.com/qemu-devel@nongnu.org/msg545627.html
for an example where io.StringIO wasn't appropriate.

In the case of iotests.py it looks like the code expects a text
file, though.

-- 
Eduardo



[Qemu-devel] [PATCH v1 10/17] background snapshots: adapt the page queueing code for using page copies

2018-07-18 Thread Denis Plotnikov
The background snapshot uses memeory page copying to seal the page
memory content. The patch adapts the migration infrastructure to save
copies of the pages.

Signed-off-by: Denis Plotnikov 
---
 migration/migration.c |  2 +-
 migration/ram.c   | 59 ---
 migration/ram.h   |  3 ++-
 3 files changed, 47 insertions(+), 17 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 87096d23ef..131d0904e4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1716,7 +1716,7 @@ static void migrate_handle_rp_req_pages(MigrationState 
*ms, const char* rbname,
 return;
 }
 
-if (ram_save_queue_pages(rbname, start, len)) {
+if (ram_save_queue_pages(NULL, rbname, start, len, NULL)) {
 mark_source_rp_bad(ms);
 }
 }
diff --git a/migration/ram.c b/migration/ram.c
index dc7dfe0726..3c4ccd85b4 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -976,7 +976,12 @@ static int ram_save_page(RAMState *rs, PageSearchStatus 
*pss, bool last_stage)
 RAMBlock *block = pss->block;
 ram_addr_t offset = pss->page << TARGET_PAGE_BITS;
 
-p = block->host + offset;
+if (pss->page_copy) {
+p = pss->page_copy;
+} else {
+p = block->host + offset;
+}
+
 trace_ram_save_page(block->idstr, (uint64_t)offset, p);
 
 /* In doubt sent page as normal */
@@ -1003,13 +1008,18 @@ static int ram_save_page(RAMState *rs, PageSearchStatus 
*pss, bool last_stage)
 } else {
 pages = save_zero_page(rs, block, offset, p);
 if (pages > 0) {
-/* Must let xbzrle know, otherwise a previous (now 0'd) cached
- * page would be stale
- */
-xbzrle_cache_zero_page(rs, current_addr);
-ram_release_pages(block->idstr, offset, pages);
+if (pss->page_copy) {
+qemu_madvise(p, TARGET_PAGE_SIZE, MADV_DONTNEED);
+} else {
+/* Must let xbzrle know, otherwise a previous (now 0'd) cached
+ * page would be stale
+ */
+xbzrle_cache_zero_page(rs, current_addr);
+ram_release_pages(block->idstr, offset, pages);
+}
 } else if (!rs->ram_bulk_stage &&
-   !migration_in_postcopy() && migrate_use_xbzrle()) {
+   !migration_in_postcopy() && migrate_use_xbzrle() &&
+   !migrate_background_snapshot()) {
 pages = save_xbzrle_page(rs, , current_addr, block,
  offset, last_stage);
 if (!last_stage) {
@@ -1026,9 +1036,10 @@ static int ram_save_page(RAMState *rs, PageSearchStatus 
*pss, bool last_stage)
 ram_counters.transferred +=
 save_page_header(rs, rs->f, block, offset | RAM_SAVE_FLAG_PAGE);
 if (send_async) {
-qemu_put_buffer_async(rs->f, p, TARGET_PAGE_SIZE,
-  migrate_release_ram() &
-  migration_in_postcopy());
+bool may_free = migrate_background_snapshot() ||
+(migrate_release_ram() &&
+ migration_in_postcopy());
+qemu_put_buffer_async(rs->f, p, TARGET_PAGE_SIZE, may_free);
 } else {
 qemu_put_buffer(rs->f, p, TARGET_PAGE_SIZE);
 }
@@ -1269,7 +1280,8 @@ static bool find_dirty_block(RAMState *rs, 
PageSearchStatus *pss, bool *again)
  * @rs: current RAM state
  * @offset: used to return the offset within the RAMBlock
  */
-static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t *offset)
+static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t *offset,
+  void **page_copy)
 {
 RAMBlock *block = NULL;
 
@@ -1279,10 +1291,14 @@ static RAMBlock *unqueue_page(RAMState *rs, ram_addr_t 
*offset)
 QSIMPLEQ_FIRST(>src_page_requests);
 block = entry->rb;
 *offset = entry->offset;
+*page_copy = entry->page_copy;
 
 if (entry->len > TARGET_PAGE_SIZE) {
 entry->len -= TARGET_PAGE_SIZE;
 entry->offset += TARGET_PAGE_SIZE;
+if (entry->page_copy) {
+entry->page_copy += TARGET_PAGE_SIZE / sizeof(void *);
+}
 } else {
 memory_region_unref(block->mr);
 QSIMPLEQ_REMOVE_HEAD(>src_page_requests, next_req);
@@ -1309,9 +1325,10 @@ static bool get_queued_page(RAMState *rs, 
PageSearchStatus *pss)
 RAMBlock  *block;
 ram_addr_t offset;
 bool dirty;
+void *page_copy;
 
 do {
-block = unqueue_page(rs, );
+block = unqueue_page(rs, , _copy);
 /*
  * We're sending this page, and since it's postcopy nothing else
  * will dirty it, and we must make sure it doesn't get sent again
@@ -1349,6 +1366,7 @@ static bool get_queued_page(RAMState *rs, 
PageSearchStatus *pss)
  */
 

[Qemu-devel] [PATCH v1 17/17] background snapshot: enable background snapshot

2018-07-18 Thread Denis Plotnikov
The patch enables to save vmstate to a migration thread
in the background: ram is being saved while vCPUs are running.
This is done to reduce downtime on vm snapshotting: the majority
of vmstate is ram, the rest of devices consumes only a few MB of
memory on a typical vm.
By this moment, there were no means to make a snapshot without
full vm stopping. From now, one can save a vm state having the vm
stopped only for a couple of seconds to save devices' states. Then
the vm resumed and ram is written without the vm full stopping.

To use async vm state it's needed to enable "background-snapshot"
migration capability and start the migration

Signed-off-by: Denis Plotnikov 
---
 migration/migration.c | 103 +-
 1 file changed, 101 insertions(+), 2 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 131d0904e4..935e5b6f2e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2188,6 +2188,99 @@ bool migrate_colo_enabled(void)
 return s->enabled_capabilities[MIGRATION_CAPABILITY_X_COLO];
 }
 
+static void background_snapshot_sigsegv_handler(int unused0, siginfo_t *info,
+void *unused1)
+{
+if (ram_process_page_fault(info->si_addr)) {
+assert(false);
+}
+}
+
+static void *background_snapshot_thread(void *opaque)
+{
+MigrationState *m = opaque;
+QIOChannelBuffer *bioc;
+QEMUFile *fb;
+int res = 0;
+
+rcu_register_thread();
+
+qemu_file_set_rate_limit(m->to_dst_file, INT64_MAX);
+
+qemu_mutex_lock_iothread();
+vm_stop(RUN_STATE_PAUSED);
+
+qemu_savevm_state_header(m->to_dst_file);
+qemu_mutex_unlock_iothread();
+qemu_savevm_state_setup(m->to_dst_file);
+qemu_mutex_lock_iothread();
+
+migrate_set_state(>state, MIGRATION_STATUS_SETUP,
+  MIGRATION_STATUS_ACTIVE);
+
+/* save the device state to the temporary buffer */
+bioc = qio_channel_buffer_new(4096);
+qio_channel_set_name(QIO_CHANNEL(bioc), "vmstate-buffer");
+fb = qemu_fopen_channel_output(QIO_CHANNEL(bioc));
+object_unref(OBJECT(bioc));
+
+ram_block_list_create();
+ram_block_list_set_readonly();
+
+if (global_state_store()) {
+goto exit;
+}
+
+if (qemu_savevm_state_save(fb, false, true) < 0) {
+migrate_set_state(>state, MIGRATION_STATUS_ACTIVE,
+  MIGRATION_STATUS_FAILED);
+goto exit;
+}
+
+sigsegv_user_handler_set(background_snapshot_sigsegv_handler);
+
+vm_start();
+qemu_mutex_unlock_iothread();
+
+while (!res) {
+res = qemu_savevm_state_iterate(m->to_dst_file, false);
+
+if (res < 0 || qemu_file_get_error(m->to_dst_file)) {
+migrate_set_state(>state, MIGRATION_STATUS_ACTIVE,
+  MIGRATION_STATUS_FAILED);
+goto exit;
+}
+}
+
+/*
+ * By this moment we have RAM saved into the stream
+ * The next step is to flush the device state right after the
+ * RAM saved. The rest of device state is stored in
+ * the temporary buffer. Do flush the buffer into the stream.
+ */
+qemu_put_buffer(m->to_dst_file, bioc->data, bioc->usage);
+qemu_fflush(m->to_dst_file);
+
+if (qemu_file_get_error(m->to_dst_file)) {
+migrate_set_state(>state, MIGRATION_STATUS_ACTIVE,
+  MIGRATION_STATUS_FAILED);
+goto exit;
+}
+
+migrate_set_state(>state, MIGRATION_STATUS_ACTIVE,
+ MIGRATION_STATUS_COMPLETED);
+exit:
+ram_block_list_set_writable();
+ram_block_list_destroy();
+sigsegv_user_handler_reset();
+qemu_fclose(fb);
+qemu_mutex_lock_iothread();
+qemu_savevm_state_cleanup();
+qemu_mutex_unlock_iothread();
+rcu_unregister_thread();
+return NULL;
+}
+
 /*
  * Master migration thread on the source VM.
  * It drives the migration and pumps the data down the outgoing channel.
@@ -2402,8 +2495,14 @@ void migrate_fd_connect(MigrationState *s)
 migrate_fd_cleanup(s);
 return;
 }
-qemu_thread_create(>thread, "live_migration", migration_thread, s,
-   QEMU_THREAD_JOINABLE);
+if (migrate_background_snapshot()) {
+qemu_thread_create(>thread, "bg_snapshot",
+   background_snapshot_thread, s,
+   QEMU_THREAD_JOINABLE);
+} else {
+qemu_thread_create(>thread, "live_migration", migration_thread, s,
+   QEMU_THREAD_JOINABLE);
+}
 s->migration_thread_running = true;
 }
 
-- 
2.17.0




[Qemu-devel] [PATCH v1 16/17] migration: move the device state saving logic to a separate function

2018-07-18 Thread Denis Plotnikov
The logic being split will be reused by the background snapshot.

Signed-off-by: Denis Plotnikov 
---
 migration/savevm.c | 91 +-
 migration/savevm.h |  2 +
 2 files changed, 52 insertions(+), 41 deletions(-)

diff --git a/migration/savevm.c b/migration/savevm.c
index f202c3de3a..36074ec9de 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1107,51 +1107,15 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f)
 qemu_fflush(f);
 }
 
-int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
-   bool inactivate_disks)
+int qemu_savevm_state_save(QEMUFile *f, bool inactivate_disks,
+   bool send_eof)
 {
-QJSON *vmdesc;
 int vmdesc_len;
 SaveStateEntry *se;
 int ret;
-bool in_postcopy = migration_in_postcopy();
-
-trace_savevm_state_complete_precopy();
+QJSON *vmdesc = qjson_new();
 
 cpu_synchronize_all_states();
-
-QTAILQ_FOREACH(se, _state.handlers, entry) {
-if (!se->ops ||
-(in_postcopy && se->ops->has_postcopy &&
- se->ops->has_postcopy(se->opaque)) ||
-(in_postcopy && !iterable_only) ||
-!se->ops->save_live_complete_precopy) {
-continue;
-}
-
-if (se->ops && se->ops->is_active) {
-if (!se->ops->is_active(se->opaque)) {
-continue;
-}
-}
-trace_savevm_section_start(se->idstr, se->section_id);
-
-save_section_header(f, se, QEMU_VM_SECTION_END);
-
-ret = se->ops->save_live_complete_precopy(f, se->opaque);
-trace_savevm_section_end(se->idstr, se->section_id, ret);
-save_section_footer(f, se);
-if (ret < 0) {
-qemu_file_set_error(f, ret);
-return -1;
-}
-}
-
-if (iterable_only) {
-return 0;
-}
-
-vmdesc = qjson_new();
 json_prop_int(vmdesc, "page_size", qemu_target_page_size());
 json_start_array(vmdesc, "devices");
 QTAILQ_FOREACH(se, _state.handlers, entry) {
@@ -1193,8 +1157,8 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool 
iterable_only,
 return ret;
 }
 }
-if (!in_postcopy) {
-/* Postcopy stream will still be going */
+
+if (send_eof) {
 qemu_put_byte(f, QEMU_VM_EOF);
 }
 
@@ -1213,6 +1177,51 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool 
iterable_only,
 return 0;
 }
 
+int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only,
+   bool inactivate_disks)
+{
+SaveStateEntry *se;
+int ret;
+bool in_postcopy = migration_in_postcopy();
+
+trace_savevm_state_complete_precopy();
+
+cpu_synchronize_all_states();
+
+QTAILQ_FOREACH(se, _state.handlers, entry) {
+if (!se->ops ||
+(in_postcopy && se->ops->has_postcopy &&
+ se->ops->has_postcopy(se->opaque)) ||
+(in_postcopy && !iterable_only) ||
+!se->ops->save_live_complete_precopy) {
+continue;
+}
+
+if (se->ops && se->ops->is_active) {
+if (!se->ops->is_active(se->opaque)) {
+continue;
+}
+}
+trace_savevm_section_start(se->idstr, se->section_id);
+
+save_section_header(f, se, QEMU_VM_SECTION_END);
+
+ret = se->ops->save_live_complete_precopy(f, se->opaque);
+trace_savevm_section_end(se->idstr, se->section_id, ret);
+save_section_footer(f, se);
+if (ret < 0) {
+qemu_file_set_error(f, ret);
+return -1;
+}
+}
+
+if (iterable_only) {
+return 0;
+}
+
+return qemu_savevm_state_save(f, inactivate_disks, !in_postcopy);
+}
+
 /* Give an estimate of the amount left to be transferred,
  * the result is split into the amount for units that can and
  * for units that can't do postcopy.
diff --git a/migration/savevm.h b/migration/savevm.h
index 295c4a1f2c..0e8d7aecf8 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -40,6 +40,8 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool 
iterable_only,
 void qemu_savevm_state_pending(QEMUFile *f, uint64_t max_size,
uint64_t *res_non_postcopiable,
uint64_t *res_postcopiable);
+int qemu_savevm_state_save(QEMUFile *f, bool inactivate_disks,
+   bool send_eof);
 void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
 void qemu_savevm_send_open_return_path(QEMUFile *f);
 int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len);
-- 
2.17.0




[Qemu-devel] [PATCH v1 11/17] background snapshot: add a memory page copying function

2018-07-18 Thread Denis Plotnikov
It's the only function making a memory page copy.
It supports multithreading semantics ensuring that
the page is copied by one thread only and releasing
the copied page from write protection.

Signed-off-by: Denis Plotnikov 
---
 migration/ram.c | 56 +
 migration/ram.h |  1 +
 2 files changed, 57 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 3c4ccd85b4..10b6fdf23e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1706,6 +1706,62 @@ RAMBlock *ram_bgs_block_find(uint8_t *address, 
ram_addr_t *page_offset)
 return NULL;
 }
 
+/**
+ * ram_copy_page: make a page copy
+ *
+ * Used in the background snapshot to make a copy of a memeory page.
+ * Ensures that the memeory page is copied only once.
+ * When a page copy is done, restores read/write access to the memory
+ * page.
+ * If a page is being copied by another thread, wait until the copying
+ * ends and exit.
+ *
+ * Returns:
+ *   -1 - on error
+ *0 - the page wasn't copied by the function call
+ *1 - the page has been copied
+ *
+ * @block: RAM block to use
+ * @page_nr:   the page number to copy
+ * @page_copy: the pointer to return a page copy
+ *
+ */
+int ram_copy_page(RAMBlock *block, unsigned long page_nr,
+  void **page_copy)
+{
+void *host_page;
+
+if (test_and_set_bit_atomic(page_nr, block->touched_map)) {
+while (!test_bit_atomic(page_nr, block->copied_map)) {
+/* the page is being copied -- wait for the end of the copying
+ * and check once again
+ */
+qemu_event_wait(_state->page_copying_done);
+}
+return 0;
+}
+
+*page_copy = ram_page_buffer_get();
+if (!*page_copy) {
+return -1;
+}
+
+host_page = block->host + (page_nr << TARGET_PAGE_BITS);
+memcpy(*page_copy, host_page, TARGET_PAGE_SIZE);
+
+if (ram_set_rw(host_page, TARGET_PAGE_SIZE)) {
+ram_page_buffer_free(*page_copy);
+*page_copy = NULL;
+return -1;
+}
+
+set_bit_atomic(page_nr, block->copied_map);
+qemu_event_set(_state->page_copying_done);
+qemu_event_reset(_state->page_copying_done);
+
+return 1;
+}
+
 /**
  * ram_find_and_save_block: finds a dirty page and sends it to f
  *
diff --git a/migration/ram.h b/migration/ram.h
index c3679ba65e..76ab0a3377 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -75,4 +75,5 @@ int ram_page_buffer_free(void *buffer);
 int ram_block_list_set_readonly(void);
 int ram_block_list_set_writable(void);
 
+int ram_copy_page(RAMBlock *block, unsigned long page_nr, void **page_copy);
 #endif
-- 
2.17.0




[Qemu-devel] [PATCH v1 08/17] migration: add helpers to change VM memory protection rights

2018-07-18 Thread Denis Plotnikov
Signed-off-by: Denis Plotnikov 
---
 migration/ram.c | 54 +
 migration/ram.h |  3 +++
 2 files changed, 57 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 27d3403435..ce3dead932 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1555,6 +1555,60 @@ void ram_block_list_destroy(void)
 }
 }
 
+static int mem_protect(void *addr, uint64_t length, int prot)
+{
+int ret = mprotect(addr, length, prot);
+
+if (ret < 0) {
+error_report("%s: Can't change protection on ram block at %p",
+ __func__, addr);
+}
+
+return ret;
+}
+
+static int ram_set_ro(void *addr, uint64_t length)
+{
+return mem_protect(addr, length, PROT_READ);
+}
+
+static int ram_set_rw(void *addr, uint64_t length)
+{
+return mem_protect(addr, length, PROT_READ | PROT_WRITE);
+}
+
+int ram_block_list_set_readonly(void)
+{
+RAMBlock *block = NULL;
+RamBlockList *block_list = ram_bgs_block_list_get();
+int ret = 0;
+
+QLIST_FOREACH(block, block_list, bgs_next) {
+ret = ram_set_ro(block->host, block->used_length);
+if (ret) {
+break;
+}
+}
+
+return ret;
+}
+
+int ram_block_list_set_writable(void)
+{
+RAMBlock *block = NULL;
+RamBlockList *block_list = ram_bgs_block_list_get();
+int ret = 0;
+
+QLIST_FOREACH(block, block_list, bgs_next) {
+ret = ram_set_rw(block->host, block->used_length);
+if (ret) {
+break;
+}
+}
+
+return ret;
+}
+
 static void ram_page_buffer_decrease_used(void)
 {
 qemu_event_reset(_state->page_buffer.used_decreased);
diff --git a/migration/ram.h b/migration/ram.h
index 7c802c1d7f..4c463597a5 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -71,4 +71,7 @@ RAMBlock *ram_bgs_block_find(uint8_t *address, ram_addr_t 
*page_offset);
 void *ram_page_buffer_get(void);
 int ram_page_buffer_free(void *buffer);
 
+int ram_block_list_set_readonly(void);
+int ram_block_list_set_writable(void);
+
 #endif
-- 
2.17.0




[Qemu-devel] [PATCH v1 03/17] threads: add infrastructure to process sigsegv

2018-07-18 Thread Denis Plotnikov
Allows to define sigsegv handler temporary for all threads.
This is useful to implement copy-on-write logic while
linux usefaultfd doesn't support write-protected faults.
In the future, switch to using WP userfaultfd when it's
available.

Signed-off-by: Denis Plotnikov 
---
 include/qemu/thread.h|  5 
 util/qemu-thread-posix.c | 52 
 2 files changed, 57 insertions(+)

diff --git a/include/qemu/thread.h b/include/qemu/thread.h
index 9910f49b3a..d6fed833fa 100644
--- a/include/qemu/thread.h
+++ b/include/qemu/thread.h
@@ -210,4 +210,9 @@ void qemu_lockcnt_inc_and_unlock(QemuLockCnt *lockcnt);
  */
 unsigned qemu_lockcnt_count(QemuLockCnt *lockcnt);
 
+
+typedef void (*sigsegv_handler)(int signum, siginfo_t *siginfo, void *sigctx);
+void sigsegv_user_handler_set(sigsegv_handler handler);
+void sigsegv_user_handler_reset(void);
+
 #endif
diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 7306475899..5424b7106d 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -489,6 +489,47 @@ static void qemu_thread_set_name(QemuThread *thread, const 
char *name)
 #endif
 }
 
+static sigsegv_handler sigsegv_user_handler;
+
+void sigsegv_user_handler_set(sigsegv_handler handler)
+{
+assert(handler);
+atomic_set(_user_handler, handler);
+}
+
+static sigsegv_handler sigsegv_user_handler_get(void)
+{
+return atomic_read(_user_handler);
+}
+
+void sigsegv_user_handler_reset(void)
+{
+atomic_set(_user_handler, NULL);
+}
+
+static void sigsegv_default_handler(int signum, siginfo_t *siginfo, void 
*sigctx)
+{
+sigsegv_handler handler = sigsegv_user_handler_get();
+
+if (!handler) {
+/*
+ * remove the sigsegv handler if it's not set by user
+ * this will lead to re-raising the error without a handler
+ * and exiting from the program with "Segmentation fault"
+ */
+int err;
+struct sigaction act;
+memset(, 0, sizeof(act));
+act.sa_flags = SA_RESETHAND;
+err = sigaction(SIGSEGV, , NULL);
+if (err) {
+error_exit(err, __func__);
+}
+} else {
+handler(signum, siginfo, sigctx);
+}
+}
+
 void qemu_thread_create(QemuThread *thread, const char *name,
void *(*start_routine)(void*),
void *arg, int mode)
@@ -496,14 +537,25 @@ void qemu_thread_create(QemuThread *thread, const char 
*name,
 sigset_t set, oldset;
 int err;
 pthread_attr_t attr;
+struct sigaction act;
 
 err = pthread_attr_init();
 if (err) {
 error_exit(err, __func__);
 }
 
+memset(, 0, sizeof(act));
+act.sa_flags = SA_SIGINFO;
+act.sa_sigaction = sigsegv_default_handler;
+err = sigaction(SIGSEGV, , NULL);
+if (err) {
+error_exit(err, __func__);
+}
+
 /* Leave signal handling to the iothread.  */
 sigfillset();
+/* ...all but SIGSEGV */
+sigdelset(, SIGSEGV);
 pthread_sigmask(SIG_SETMASK, , );
 err = pthread_create(>thread, , start_routine, arg);
 if (err)
-- 
2.17.0




[Qemu-devel] [PATCH v1 02/17] bitops: add some atomic versions of bitmap operations

2018-07-18 Thread Denis Plotnikov
1. test bit
2. test and set bit

Signed-off-by: Denis Plotnikov 
---
 include/qemu/bitops.h | 25 +
 1 file changed, 25 insertions(+)

diff --git a/include/qemu/bitops.h b/include/qemu/bitops.h
index 3f0926cf40..72afcfaec5 100644
--- a/include/qemu/bitops.h
+++ b/include/qemu/bitops.h
@@ -94,6 +94,21 @@ static inline int test_and_set_bit(long nr, unsigned long 
*addr)
 return (old & mask) != 0;
 }
 
+/**
+ * test_and_set_bit_atomic - Set a bit atomically and return its old value
+ * @nr: Bit to set
+ * @addr: Address to count from
+ */
+static inline int test_and_set_bit_atomic(long nr, unsigned long *addr)
+{
+unsigned long mask = BIT_MASK(nr);
+unsigned long *p = addr + BIT_WORD(nr);
+unsigned long old;
+
+old = atomic_fetch_or(p, mask);
+return (old & mask) != 0;
+}
+
 /**
  * test_and_clear_bit - Clear a bit and return its old value
  * @nr: Bit to clear
@@ -134,6 +149,16 @@ static inline int test_bit(long nr, const unsigned long 
*addr)
 return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
 }
 
+/**
+ * test_bit_atomic - Determine whether a bit is set atomicallly
+ * @nr: bit number to test
+ * @addr: Address to start counting from
+ */
+static inline int test_bit_atomic(long nr, const unsigned long *addr)
+{
+long valid_nr = nr & (BITS_PER_LONG - 1);
+return 1UL & (atomic_read([BIT_WORD(nr)]) >> valid_nr);
+}
 /**
  * find_last_bit - find the last set bit in a memory region
  * @addr: The address to start the search at
-- 
2.17.0




[Qemu-devel] [PATCH v1 07/17] background snapshot: introduce page buffer

2018-07-18 Thread Denis Plotnikov
To limit the amount of memory used by the background snapshot
a memory limiter is used which called "page buffer".
In fact, it's a memory page counter limiting the page allocation.
Currently, the limit of pages is hardcoded but its setter is
the matter of the future work.

The background snapshot makes a copy of the page before writing it
to the snapshot destination, but it doesn't use a preallocated buffer,
because the background snapshot employs the migration infrastructure
which, in turn, uses madvice to release the buffer.
The advice issuing is hard to track, so here we rely on anonymous
memory mapping and releasing it by the existing code.

Signed-off-by: Denis Plotnikov 
---
 migration/ram.c | 91 +
 migration/ram.h |  3 ++
 2 files changed, 94 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 4399c31606..27d3403435 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -192,6 +192,16 @@ struct RAMSrcPageRequest {
 QSIMPLEQ_ENTRY(RAMSrcPageRequest) next_req;
 };
 
+/* Page buffer used for background snapshot */
+typedef struct RAMPageBuffer {
+/* Page buffer capacity in host pages */
+int capacity;
+/* Current number of pages in the buffer */
+int used;
+/* Event to notify that buffer usage is under capacity */
+QemuEvent used_decreased;
+} RAMPageBuffer;
+
 /* State of RAM for migration */
 struct RAMState {
 /* QEMUFile used for this migration */
@@ -230,6 +240,11 @@ struct RAMState {
 /* Queue of outstanding page requests from the destination */
 QemuMutex src_page_req_mutex;
 QSIMPLEQ_HEAD(src_page_requests, RAMSrcPageRequest) src_page_requests;
+/* The following 2 are for background snapshot */
+/* Buffer data to store copies of ram pages while async vm saving */
+RAMPageBuffer page_buffer;
+/* Event to notify that a page copying just has finished*/
+QemuEvent page_copying_done;
 };
 typedef struct RAMState RAMState;
 
@@ -1540,6 +1555,52 @@ void ram_block_list_destroy(void)
 }
 }
 
+static void ram_page_buffer_decrease_used(void)
+{
+qemu_event_reset(_state->page_buffer.used_decreased);
+atomic_dec(_state->page_buffer.used);
+qemu_event_set(_state->page_buffer.used_decreased);
+}
+
+static void ram_page_buffer_increase_used_wait(void)
+{
+int used, *used_ptr;
+RAMState *rs = ram_state;
+used_ptr = >page_buffer.used;
+do {
+used = atomic_read(used_ptr);
+if (rs->page_buffer.capacity > used) {
+if (atomic_cmpxchg(used_ptr, used, used + 1) == used) {
+return;
+} else {
+continue;
+}
+} else {
+qemu_event_wait(>page_buffer.used_decreased);
+}
+} while (true);
+}
+
+void *ram_page_buffer_get(void)
+{
+void *page;
+ram_page_buffer_increase_used_wait();
+page = mmap(0, TARGET_PAGE_SIZE, PROT_READ | PROT_WRITE,
+MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+
+if (page == MAP_FAILED) {
+ram_page_buffer_decrease_used();
+page = NULL;
+}
+return page;
+}
+
+int ram_page_buffer_free(void *buffer)
+{
+ram_page_buffer_decrease_used();
+return qemu_madvise(buffer, TARGET_PAGE_SIZE, MADV_DONTNEED);
+}
+
 RAMBlock *ram_bgs_block_find(uint8_t *address, ram_addr_t *page_offset)
 {
 RAMBlock *block = NULL;
@@ -1559,6 +1620,7 @@ RAMBlock *ram_bgs_block_find(uint8_t *address, ram_addr_t 
*page_offset)
 
 return NULL;
 }
+
 /**
  * ram_find_and_save_block: finds a dirty page and sends it to f
  *
@@ -1651,6 +1713,13 @@ static void xbzrle_load_cleanup(void)
 
 static void ram_state_cleanup(RAMState **rsp)
 {
+if (migrate_background_snapshot()) {
+qemu_event_destroy(&(*rsp)->page_buffer.used_decreased);
+/* In case somebody is still waiting for the event */
+qemu_event_set(&(*rsp)->page_copying_done);
+qemu_event_destroy(&(*rsp)->page_copying_done);
+}
+
 migration_page_queue_free(*rsp);
 qemu_mutex_destroy(&(*rsp)->bitmap_mutex);
 qemu_mutex_destroy(&(*rsp)->src_page_req_mutex);
@@ -1689,6 +1758,13 @@ static void ram_save_cleanup(void *opaque)
 block->bmap = NULL;
 g_free(block->unsentmap);
 block->unsentmap = NULL;
+
+if (migrate_background_snapshot()) {
+g_free(block->touched_map);
+block->touched_map = NULL;
+g_free(block->copied_map);
+block->copied_map = NULL;
+}
 }
 
 xbzrle_cleanup();
@@ -1703,6 +1779,10 @@ static void ram_state_reset(RAMState *rs)
 rs->last_page = 0;
 rs->last_version = ram_list.version;
 rs->ram_bulk_stage = true;
+
+/* page buffer capacity in number of pages */
+rs->page_buffer.capacity = 1000;
+rs->page_buffer.used = 0;
 }
 
 #define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -2180,6 +2260,11 @@ static int ram_state_init(RAMState **rsp)
  */
 (*rsp)->migration_dirty_pages 

[Qemu-devel] [PATCH v1 01/17] migration: add background snapshot capability

2018-07-18 Thread Denis Plotnikov
The capability is used for the background vmstate saving
using the migration infrastructure.
Background vmstate saving means that the majority of vmstate
(RAM) is saved in the background when VM's vCPUS are running.
This helps to reduce the VM downtime on VM snapshotting.

Signed-off-by: Denis Plotnikov 
---
 migration/migration.c | 35 +++
 migration/migration.h |  1 +
 qapi/migration.json   |  6 +-
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/migration/migration.c b/migration/migration.c
index d780601f0c..87096d23ef 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -697,6 +697,12 @@ static bool migrate_caps_check(bool *cap_list,
 return false;
 }
 
+if (cap_list[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
+error_setg(errp, "Postcopy is not compatible "
+"with background snapshot");
+return false;
+}
+
 /* This check is reasonably expensive, so only when it's being
  * set the first time, also it's only the destination that needs
  * special support.
@@ -711,6 +717,26 @@ static bool migrate_caps_check(bool *cap_list,
 }
 }
 
+if (cap_list[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
+if (cap_list[MIGRATION_CAPABILITY_RELEASE_RAM]) {
+error_setg(errp, "Background snapshot is not compatible "
+"with release ram capability");
+return false;
+}
+
+if (cap_list[MIGRATION_CAPABILITY_COMPRESS]) {
+error_setg(errp, "Background snapshot is not "
+"currently compatible with compression");
+return false;
+}
+
+if (cap_list[MIGRATION_CAPABILITY_XBZRLE]) {
+error_setg(errp, "Background snapshot is not "
+"currently compatible with XBZLRE");
+return false;
+}
+}
+
 return true;
 }
 
@@ -1635,6 +1661,15 @@ bool migrate_use_block_incremental(void)
 return s->parameters.block_incremental;
 }
 
+bool migrate_background_snapshot(void)
+{
+MigrationState *s;
+
+s = migrate_get_current();
+
+return s->enabled_capabilities[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT];
+}
+
 /* migration thread support */
 /*
  * Something bad happened to the RP stream, mark an error
diff --git a/migration/migration.h b/migration/migration.h
index 663415fe48..21babbc008 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -201,6 +201,7 @@ int migrate_compress_level(void);
 int migrate_compress_threads(void);
 int migrate_decompress_threads(void);
 bool migrate_use_events(void);
+bool migrate_background_snapshot(void);
 
 /* Sending on the return path - generic and then for each message type */
 void migrate_send_rp_shut(MigrationIncomingState *mis,
diff --git a/qapi/migration.json b/qapi/migration.json
index 03f57c9616..eb5f52b7ff 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -352,12 +352,16 @@
 #
 # @x-multifd: Use more than one fd for migration (since 2.11)
 #
+# @background-snapshot: Using migration infrastructure makes VM snapshot
+# saving its RAM in background. This reduces  VM downtime. (since 2.12)
+#
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
   'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks',
'compress', 'events', 'postcopy-ram', 'x-colo', 'release-ram',
-   'block', 'return-path', 'pause-before-switchover', 'x-multifd' ] }
+   'block', 'return-path', 'pause-before-switchover', 'x-multifd',
+   'background-snapshot' ] }
 
 ##
 # @MigrationCapabilityStatus:
-- 
2.17.0




[Qemu-devel] [PATCH v1 06/17] background snapshot: add helpers to manage a copy of ram block list

2018-07-18 Thread Denis Plotnikov
The VM ramblock list may be changed during the snapshotting.
We want to make sure that we have the same ramblock list as it
was at the time of snapshot beginning.
So, we create a copy of the list at the beginning of the snapshotting
and use it during the process.

Signed-off-by: Denis Plotnikov 
---
 migration/ram.c | 51 +
 migration/ram.h |  6 ++
 2 files changed, 57 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 021d583b9b..4399c31606 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1508,6 +1508,57 @@ static int ram_save_host_page(RAMState *rs, 
PageSearchStatus *pss,
 return pages;
 }
 
+/* BackGround Snapshot Blocks */
+static RamBlockList bgs_blocks;
+
+static RamBlockList *ram_bgs_block_list_get(void)
+{
+return _blocks;
+}
+
+void ram_block_list_create(void)
+{
+RAMBlock *block = NULL;
+RamBlockList *block_list = ram_bgs_block_list_get();
+
+qemu_mutex_lock_ramlist();
+QLIST_FOREACH(block, _list.blocks, next) {
+memory_region_ref(block->mr);
+QLIST_INSERT_HEAD(block_list, block, bgs_next);
+}
+qemu_mutex_unlock_ramlist();
+}
+
+void ram_block_list_destroy(void)
+{
+RAMBlock *next, *block;
+RamBlockList *block_list = ram_bgs_block_list_get();
+
+QLIST_FOREACH_SAFE(block, block_list, bgs_next, next) {
+QLIST_REMOVE(block, bgs_next);
+memory_region_unref(block->mr);
+}
+}
+
+RAMBlock *ram_bgs_block_find(uint8_t *address, ram_addr_t *page_offset)
+{
+RAMBlock *block = NULL;
+RamBlockList *block_list = ram_bgs_block_list_get();
+
+QLIST_FOREACH(block, block_list, bgs_next) {
+/* This case append when the block is not mapped. */
+if (block->host == NULL) {
+continue;
+}
+
+if (address - block->host < block->max_length) {
+*page_offset = (address - block->host) & TARGET_PAGE_MASK;
+return block;
+}
+}
+
+return NULL;
+}
 /**
  * ram_find_and_save_block: finds a dirty page and sends it to f
  *
diff --git a/migration/ram.h b/migration/ram.h
index 64d81e9f1d..385579cae9 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -31,6 +31,7 @@
 
 #include "qemu-common.h"
 #include "exec/cpu-common.h"
+#include "exec/ramlist.h"
 
 extern MigrationStats ram_counters;
 extern XBZRLECacheStats xbzrle_counters;
@@ -61,5 +62,10 @@ void ram_handle_compressed(void *host, uint8_t ch, uint64_t 
size);
 int ramblock_recv_bitmap_test(RAMBlock *rb, void *host_addr);
 void ramblock_recv_bitmap_set(RAMBlock *rb, void *host_addr);
 void ramblock_recv_bitmap_set_range(RAMBlock *rb, void *host_addr, size_t nr);
+/* For background snapshot */
+void ram_block_list_create(void);
+void ram_block_list_destroy(void);
+
+RAMBlock *ram_bgs_block_find(uint8_t *address, ram_addr_t *page_offset);
 
 #endif
-- 
2.17.0




[Qemu-devel] [PATCH v1 12/17] ram: add background snapshot support in ram page saving part of migration

2018-07-18 Thread Denis Plotnikov
Unify the function saving the ram pages for using in both the migration
and the background snapshots.

Signed-off-by: Denis Plotnikov 
---
 migration/ram.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 10b6fdf23e..b1623e96e7 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1803,11 +1803,28 @@ static int ram_find_and_save_block(RAMState *rs, bool 
last_stage)
 if (!found) {
 /* priority queue empty, so just search for something dirty */
 found = find_dirty_block(rs, , );
+
+if (found && migrate_background_snapshot()) {
+/* make a copy of the page and
+ * pass it to the page search status */
+int ret;
+ret = ram_copy_page(pss.block, pss.page, _copy);
+if (ret == 0) {
+found = false;
+pages = 0;
+} else if (ret < 0) {
+return ret;
+}
+}
 }
 
 if (found) {
 pages = ram_save_host_page(rs, , last_stage);
 }
+
+if (pss.page_copy) {
+ram_page_buffer_decrease_used();
+}
 } while (!pages && again);
 
 rs->last_seen_block = pss.block;
-- 
2.17.0




[Qemu-devel] [PATCH v1 00/17] Background snapshots

2018-07-18 Thread Denis Plotnikov
The workflow to make a snapshot is the following:
1. Pause the vm
2. Make a snapshot of block devices using the scheme of your choice
3. Turn on background-snapshot migration capability
4. Start the migration using the destination (migration stream) of your choice.
   The migration will resume the vm execution by itself
   when it has the devices' states saved  and is ready to start ram writing
   to the migration stream.
5. Listen to the migration finish event

The bakground snapshot works with support of KVM patch:
"x86: mmu: report failed memory access to the userspace"
(not applied to the mainstream, it's in the kvm mailing list)

--
Change log:

v0 => v1:

the patch series has been split in smaller chunks

Denis Plotnikov (17):
  migration: add background snapshot capability
  bitops: add some atomic versions of bitmap operations
  threads: add infrastructure to process sigsegv
  background snapshot: make a dedicated type for ram block list
  ram: extend the data structures for background snapshotting
  background snapshot: add helpers to manage a copy of ram block list
  background snapshot: introduce page buffer
  migration: add helpers to change VM memory protection rights
  background snapshot: extend RAM request for holding a page copy
pointer
  background snapshots: adapt the page queueing code for using page
copies
  background snapshot: add a memory page copying function
  ram: add background snapshot support in ram page saving part of
migration
  background snapshot: add write-protected page access handler function
  kvm: add failed memeory access exit reason
  kvm: add vCPU failed memeory access processing
  migration: move the device state saving logic to a separate function
  background snapshot: enable background snapshot

 include/exec/ram_addr.h   |   7 +
 include/exec/ramlist.h|   4 +-
 include/qemu/bitops.h |  25 +++
 include/qemu/thread.h |   5 +
 linux-headers/linux/kvm.h |   5 +
 migration/migration.c | 140 +-
 migration/migration.h |   1 +
 migration/ram.c   | 374 --
 migration/ram.h   |  17 +-
 migration/savevm.c|  91 +-
 migration/savevm.h|   2 +
 qapi/migration.json   |   6 +-
 target/i386/kvm.c |  17 ++
 util/qemu-thread-posix.c  |  52 ++
 14 files changed, 684 insertions(+), 62 deletions(-)

-- 
2.17.0




[Qemu-devel] [PATCH v1 13/17] background snapshot: add write-protected page access handler function

2018-07-18 Thread Denis Plotnikov
The handler does all the necessary operations to save the page being
accessed to the snapshot file(stream).

Signed-off-by: Denis Plotnikov 
---
 migration/ram.c | 43 +++
 migration/ram.h |  1 +
 2 files changed, 44 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index b1623e96e7..04a4bf708d 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1762,6 +1762,49 @@ int ram_copy_page(RAMBlock *block, unsigned long page_nr,
 return 1;
 }
 
+/**
+ * ram_process_page_fault
+ *
+ * Used in the background snapshot to queue the copy of the memory
+ * page for writing.
+ *
+ * Returns:
+ *0 > - on error
+ *0   - success
+ *
+ * @address:  guest address
+ *
+ */
+int ram_process_page_fault(void *address)
+{
+int ret;
+void *page_copy = NULL;
+unsigned long page_nr;
+ram_addr_t offset;
+
+RAMBlock *block = ram_bgs_block_find(address, );
+
+if (!block) {
+return -1;
+}
+
+page_nr = offset >> TARGET_PAGE_BITS;
+
+ret = ram_copy_page(block, page_nr, _copy);
+
+if (ret < 0) {
+return ret;
+} else if (ret > 0) {
+if (ram_save_queue_pages(block, NULL, offset,
+ TARGET_PAGE_SIZE, page_copy)) {
+ram_page_buffer_free(page_copy);
+return -1;
+}
+}
+
+return 0;
+}
+
 /**
  * ram_find_and_save_block: finds a dirty page and sends it to f
  *
diff --git a/migration/ram.h b/migration/ram.h
index 76ab0a3377..2b565ce620 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -76,4 +76,5 @@ int ram_block_list_set_readonly(void);
 int ram_block_list_set_writable(void);
 
 int ram_copy_page(RAMBlock *block, unsigned long page_nr, void **page_copy);
+int ram_process_page_fault(void *address);
 #endif
-- 
2.17.0




[Qemu-devel] [PATCH v1 04/17] background snapshot: make a dedicated type for ram block list

2018-07-18 Thread Denis Plotnikov
The type will be used later to hold a list of ram blocks which
memory content to be used for the vm state saving when making
a background snapshot.

Signed-off-by: Denis Plotnikov 
---
 include/exec/ramlist.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/exec/ramlist.h b/include/exec/ramlist.h
index 2e2ac6cb99..e0231d3bec 100644
--- a/include/exec/ramlist.h
+++ b/include/exec/ramlist.h
@@ -44,11 +44,13 @@ typedef struct {
 unsigned long *blocks[];
 } DirtyMemoryBlocks;
 
+typedef QLIST_HEAD(, RAMBlock) RamBlockList;
+
 typedef struct RAMList {
 QemuMutex mutex;
 RAMBlock *mru_block;
 /* RCU-enabled, writes protected by the ramlist lock. */
-QLIST_HEAD(, RAMBlock) blocks;
+RamBlockList blocks;
 DirtyMemoryBlocks *dirty_memory[DIRTY_MEMORY_NUM];
 uint32_t version;
 QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
-- 
2.17.0




[Qemu-devel] [PATCH v1 14/17] kvm: add failed memeory access exit reason

2018-07-18 Thread Denis Plotnikov
The patch allows qemu to be aware of how to read kvm failed memeory
access.

This is a temporary patch. It sould be removed on importing
the kvm failed memeory access exit from the linux branch.

Signed-off-by: Denis Plotnikov 
---
 linux-headers/linux/kvm.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/linux-headers/linux/kvm.h b/linux-headers/linux/kvm.h
index d92c9b2f0e..ea63b681d8 100644
--- a/linux-headers/linux/kvm.h
+++ b/linux-headers/linux/kvm.h
@@ -235,6 +235,7 @@ struct kvm_hyperv_exit {
 #define KVM_EXIT_S390_STSI25
 #define KVM_EXIT_IOAPIC_EOI   26
 #define KVM_EXIT_HYPERV   27
+#define KVM_EXIT_FAIL_MEM_ACCESS  28
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -392,6 +393,10 @@ struct kvm_run {
} eoi;
/* KVM_EXIT_HYPERV */
struct kvm_hyperv_exit hyperv;
+   /* KVM_EXIT_FAIL_MEM_ACCESS */
+   struct {
+   __u64 hva;
+   } fail_mem_access;
/* Fix the size of the union. */
char padding[256];
};
-- 
2.17.0




[Qemu-devel] [PATCH v1 05/17] ram: extend the data structures for background snapshotting

2018-07-18 Thread Denis Plotnikov
Signed-off-by: Denis Plotnikov 
---
 include/exec/ram_addr.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
index 6cbc02aa0f..5b403d537d 100644
--- a/include/exec/ram_addr.h
+++ b/include/exec/ram_addr.h
@@ -36,6 +36,8 @@ struct RAMBlock {
 char idstr[256];
 /* RCU-enabled, writes protected by the ramlist lock */
 QLIST_ENTRY(RAMBlock) next;
+/* blocks used for background snapshot */
+QLIST_ENTRY(RAMBlock) bgs_next;
 QLIST_HEAD(, RAMBlockNotifier) ramblock_notifiers;
 int fd;
 size_t page_size;
@@ -49,6 +51,11 @@ struct RAMBlock {
 unsigned long *unsentmap;
 /* bitmap of already received pages in postcopy */
 unsigned long *receivedmap;
+/* The following 2 are for background snapshot */
+/* Pages currently being copied */
+unsigned long *touched_map;
+/* Pages has been copied already */
+unsigned long *copied_map;
 };
 
 static inline bool offset_in_ramblock(RAMBlock *b, ram_addr_t offset)
-- 
2.17.0




[Qemu-devel] [PATCH v1 15/17] kvm: add vCPU failed memeory access processing

2018-07-18 Thread Denis Plotnikov
Is done with support of the KVM patch returning the faulting address.

Signed-off-by: Denis Plotnikov 
---
 target/i386/kvm.c | 17 +
 1 file changed, 17 insertions(+)

diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 3ac5302bc5..55b8860d1a 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -45,6 +45,8 @@
 #include "hw/pci/msi.h"
 #include "hw/pci/msix.h"
 #include "migration/blocker.h"
+#include "migration/savevm.h"
+#include "migration/ram.h"
 #include "exec/memattrs.h"
 #include "trace.h"
 
@@ -3130,6 +3132,18 @@ static bool host_supports_vmx(void)
 return ecx & CPUID_EXT_VMX;
 }
 
+static int kvm_handle_fail_mem_access(CPUState *cpu)
+{
+struct kvm_run *run = cpu->kvm_run;
+int ret = ram_process_page_fault((void *)run->fail_mem_access.hva);
+
+if (ret >= 0) {
+cpu_resume(cpu);
+}
+
+return ret;
+}
+
 #define VMX_INVALID_GUEST_STATE 0x8021
 
 int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run)
@@ -3188,6 +3202,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run 
*run)
 ioapic_eoi_broadcast(run->eoi.vector);
 ret = 0;
 break;
+case KVM_EXIT_FAIL_MEM_ACCESS:
+ret = kvm_handle_fail_mem_access(cs);
+break;
 default:
 fprintf(stderr, "KVM: unknown exit reason %d\n", run->exit_reason);
 ret = -1;
-- 
2.17.0




Re: [Qemu-devel] [PATCH 1/7] tests: qgraph API for the qtest driver framework

2018-07-18 Thread Stefan Hajnoczi
On Wed, Jul 11, 2018 at 04:58:41PM +0200, Paolo Bonzini wrote:
> On 11/07/2018 16:28, Stefan Hajnoczi wrote:
> >> + *
> >> + * QOSGraphObject also provides a destructor, used to deallocate the
> >> + * after the test has been executed.
> >> + */
> >> +struct QOSGraphObject {
> >> +/* for produces, returns void * */
> >> +QOSGetDriver get_driver;
> > 
> > Unused?
> > 
> >> +/* for contains, returns a QOSGraphObject * */
> >> +QOSGetDevice get_device;
> > 
> > Unused?
> 
> What is unused?

Neither of these fields are used in this patch.  Please introduce them
in the first patch that actually uses them.  This way code review can
proceed linearly and it also prevents deadcode when just part of a patch
series is merged or backported.


signature.asc
Description: PGP signature


[Qemu-devel] [PATCH v1 09/17] background snapshot: extend RAM request for holding a page copy pointer

2018-07-18 Thread Denis Plotnikov
This pointer is going to be used to transfer a memory.
Once the memory page is copied the content the snapshot interested in is
saved for writing and we can make the page writable again.

Signed-off-by: Denis Plotnikov 
---
 migration/ram.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index ce3dead932..dc7dfe0726 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -188,6 +188,7 @@ struct RAMSrcPageRequest {
 RAMBlock *rb;
 hwaddroffset;
 hwaddrlen;
+void *page_copy;
 
 QSIMPLEQ_ENTRY(RAMSrcPageRequest) next_req;
 };
@@ -265,6 +266,8 @@ struct PageSearchStatus {
 unsigned long page;
 /* Set once we wrap around */
 bool complete_round;
+/* Pointer to the cached page */
+void *page_copy;
 };
 typedef struct PageSearchStatus PageSearchStatus;
 
-- 
2.17.0




Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread

2018-07-18 Thread Markus Armbruster
Peter Xu  writes:

> After the Out-Of-Band work, the monitor iothread may be accessing the
> cur_mon as well (via monitor_qmp_dispatch_one()).  Let's convert the
> cur_mon variable to be a per-thread variable to make sure there won't be
> a race between threads when accessing the variable.

Hmm... why hasn't the OOB work created such a race already?

A monitor reads, parses, dispatches and executes commands, formats and
sends replies.

Before OOB, all of that ran in the main thread.  Any access of cur_mon
should therefore be from the main thread.  No races.

OOB moves read, parse, format and send to an I/O thread.  Dispatch and
execute remain in the main thread.  *Except* for commands executed OOB,
dispatch and execute move to the I/O thread, too.

Why is this not racy?  I guess it relies on careful non-use of cur_mon
in any part that may now execute in the I/O thread.  Scary...

Should this go into 3.0 to reduce the risk of bugs?

> Note that thread variables are not initialized to a valid value when new
> thread is created.  However for our case we don't need to set it up,
> since the cur_mon variable is only used in such a pattern:
>
>   old_mon = cur_mon;
>   cur_mon = xxx;
>   (do something, read cur_mon if necessary in the stack)
>   cur_mon = old_mon;
>
> It plays a role as stack variable, so no need to be initialized at all.
> We only need to make sure the variable won't be changed unexpectedly by
> other threads.
>
> Reviewed-by: Eric Blake 
> Reviewed-by: Marc-André Lureau 
> Reviewed-by: Stefan Hajnoczi 
> [peterx: touch up commit message a bit]
> Signed-off-by: Peter Xu 



Re: [Qemu-devel] [RFC v3 2/2] tests: Add bbc:microbit / nRF51 test suite

2018-07-18 Thread Stefan Hajnoczi
On Thu, Jul 12, 2018 at 12:12:19PM +0200, Steffen Görtz wrote:
> diff --git a/tests/microbit-test.c b/tests/microbit-test.c
> new file mode 100644
> index 00..c502ee3976
> --- /dev/null
> +++ b/tests/microbit-test.c
> @@ -0,0 +1,118 @@
> + /*
> + * QTest testcase for Microbit board using the Nordic Semiconductor nRF51 
> SoC.
> + *
> + *
> + * Copyright (c) 2018 Steffen Görtz 

Please add a license.


signature.asc
Description: PGP signature


Re: [Qemu-devel] [RFC v3 1/2] arm: Add NRF51 SOC non-volatile memory controller

2018-07-18 Thread Stefan Hajnoczi
On Thu, Jul 12, 2018 at 12:12:18PM +0200, Steffen Görtz wrote:
> Changes in V3:
> - NVMs consolidated in one file
> - Removed bitfields
> - Add VMState
> - Add reset
> 
> Changes in V1/V2:
> - Code style changes
> 
> Signed-off-by: Steffen Görtz 
> ---

Everything above '---' goes into the commit description and is
permanently in the git log.  Changelogs are not useful in the git log
since their context (the earlier revisions and the discussion) is
missing.  Please put changelogs below '---' as anything there doesn't
get included in the commit description.

> +static uint64_t ficr_read(void *opaque, hwaddr offset, unsigned int size)
> +{
> +if (offset > (ARRAY_SIZE(ficr_content) - size)) {

Bug alert, the semantic types look suspicious:

  bytes_t > num_elements_t - bytes_t

You cannot subtract a byte count from the number of elements in an
array.

Did you mean:

  if (offset >= sizeof(ficr_content)) {

MemoryRegion was already declared with FICR_SIZE length, so nothing
should ever call ficr_read() with an out-of-bounds offset.

Instead of defining FICR_SIZE, which could become out-of-sync with the
definition of ficr_content[], I would use sizeof(ficr_content) as the
MemoryRegion length.  This way the code is guaranteed to be safe even if
the size of ficr_content[] is changed in the future.  No if statement
would be necessary.

> +static uint64_t uicr_read(void *opaque, hwaddr offset, unsigned int size)
> +{
> +Nrf51NVMState *s = NRF51_NVM(opaque);
> +
> +offset >>= 2;
> +
> +if (offset >= ARRAY_SIZE(s->uicr_content)) {
> +qemu_log_mask(LOG_GUEST_ERROR,
> +"%s: bad read offset 0x%" HWADDR_PRIx "\n", __func__, 
> offset);
> +return 0;
> +}

The same applies to NRF51_UICR_SIZE.  It's simpler to use
sizeof(s->uicr_content) instead of defining a separate constant and then
using an if statement to check the offset.

> +static void nrf51_nvm_init(Object *obj)
> +{
> +Nrf51NVMState *s = NRF51_NVM(obj);
> +SysBusDevice *sbd = SYS_BUS_DEVICE(obj);
> +
> +memory_region_init_io(>mmio, obj, _ops, s, "nrf51_soc.nvmc",
> +NRF51_NVMC_SIZE);
> +sysbus_init_mmio(sbd, >mmio);
> +
> +memory_region_init_io(>ficr, NULL, _ops, s, "nrf51_soc.ficr",
> +NRF51_FICR_SIZE);
> +memory_region_set_readonly(>ficr, true);
> +sysbus_init_mmio(sbd, >ficr);
> +
> +memcpy(s->uicr_content, uicr_fixture, sizeof(s->uicr_content));

uicr_content[] persists across device reset?  Doing it that way seems
fine to me, but I wanted to check that it's what you intended.

> +static void nrf51_nvm_reset(DeviceState *dev)
> +{
> +Nrf51NVMState *s = NRF51_NVM(dev);
> +
> +s->empty_page = g_malloc(s->page_size);
> +memset(s->empty_page, 0xFF, s->page_size);
> +}

This function can be called multiple times and will leak the previous
s->empty_page.  It's easier to allocate s->empty_page in
nrf51_nvm_realize() instead.

> +
> +
> +static Property nrf51_nvm_properties[] = {
> +DEFINE_PROP_UINT16("page_size", Nrf51NVMState, page_size, 0x400),
> +DEFINE_PROP_UINT32("code_size", Nrf51NVMState, code_size, 0x100),

There is an intense battle between '-' and '_' for property names:

$ git grep 'DEFINE_PROP[^(]\+("[^"]\+_' | wc -l
234
$ git grep 'DEFINE_PROP[^(]\+("[^"]\+-' | wc -l
394

I like '-' is because -device foo,my-option=x is more consistent with
commonly-used properties than -device foo,my_option=x, but there are
plenty of properties with '_' lurking around too.

Up to you :).

> diff --git a/include/hw/arm/nrf51_soc.h b/include/hw/arm/nrf51_soc.h
> index 35dd71c3db..9c32d22abe 100644
> --- a/include/hw/arm/nrf51_soc.h
> +++ b/include/hw/arm/nrf51_soc.h
> @@ -1,5 +1,5 @@
>  /*
> - * Nordic Semiconductor nRF51  SoC
> + * Nordic Semiconductor nRF51 SoC

Unrelated to this patch.  Please drop.


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 2/7] tests/qgraph: pci-pc driver and interface nodes

2018-07-18 Thread Stefan Hajnoczi
On Wed, Jul 11, 2018 at 07:46:09PM +0200, Emanuele wrote:
> On 07/11/2018 04:49 PM, Stefan Hajnoczi wrote:
> > On Mon, Jul 09, 2018 at 11:11:31AM +0200, Emanuele Giuseppe Esposito wrote:
> > > +void qpci_device_init(QPCIDevice *dev, QPCIBus *bus, int devfn)
> > > +{
> > > +if (!bus) {
> > > +return;
> > > +}
> > When does this happen and why?
> Ok maybe this is unneeded, I added it because I was creating a test with a
> NULL bus, but we ended up putting it aside. So you're right, this should be
> eliminated.

Thanks!

> > > +dev->bus = bus;
> > > +dev->devfn = devfn;
> > > +
> > > +if (qpci_config_readw(dev, PCI_VENDOR_ID) == 0x) {
> > > +printf("PCI Device not found\n");
> > > +abort();
> > > +}
> > > +qpci_device_enable(dev);
> > > +}
> > > +
> > > +void qpci_set_pc(QPCIBusPC *ret, QTestState *qts, QGuestAllocator *alloc)
> > It's not clear to me what the purpose of this function is - at least the
> > name is a bit cryptic since it seems more like an initialization
> > function than 'setting pc' on QPCIBusPC.  How about inlining this in
> > qpci_init_pc() instead of keeping a separate function?
> The idea is that, following the graph that Paolo wrote in the GSoC project
> wiki (https://wiki.qemu.org/Features/qtest_driver_framework), QPCIBusPC is
> "contained" in i440FX-pcihost. This means that the i440FX-pcihost struct has
> a QPCIBusPC field.
> 
> Therefore I had to put QPCIBusPC as public (in order to have it as field),
> even though qpci_init_pc() was not what I needed to initialize it, because
> that function is allocating a new QPCIBusPC, while I just needed to "set"
> its values.
> From here, qpci_set_pc().

I see.  Renaming qpci_set_pc() to qpci_pc_init() and including a doc
comment explaining that this is the in-place initialization function
would make things clear.

> > > +{
> > >   assert(qts);
> > >   ret->bus.pio_readb = qpci_pc_pio_readb;
> > > @@ -147,11 +164,23 @@ QPCIBus *qpci_init_pc(QTestState *qts, 
> > > QGuestAllocator *alloc)
> > >   ret->bus.mmio_alloc_ptr = 0xE000;
> > >   ret->bus.mmio_limit = 0x1ULL;
> > > +ret->obj.get_driver = qpci_get_driver;
> > > +}
> > > +
> > > +QPCIBus *qpci_init_pc(QTestState *qts, QGuestAllocator *alloc)
> > > +{
> > > +QPCIBusPC *ret = g_new0(QPCIBusPC, 1);
> > > +qpci_set_pc(ret, qts, alloc);
> > > +
> > >   return >bus;
> > >   }
> > >   void qpci_free_pc(QPCIBus *bus)
> > >   {
> > > +if (!bus) {
> > > +return;
> > > +}
> > Why is this needed now?
> Not having this would mean failure of tests like vrtio-user-test, because
> now QPCIBusPC has two fields, and QPCIBus bus; is not the first one either.
> Therefore, if I remember correctly doing something like
> container_of(NULL, QPCIBusPC, bus)
> where bus is not the first field of QPCIBusPC would result in a negative
> number, and freeing that would make the test crash.
> 
> I discovered this issue while doing some checks, and already proposed a
> patch for it, even though we ended up agreeing that this fix was only needed
> in my patch and not in the current QEMU implementation.

I see now:

  void free_ahci_device(QPCIDevice *dev)
  {
  QPCIBus *pcibus = dev ? dev->bus : NULL;

  /* libqos doesn't have a function for this, so free it manually */
  g_free(dev);
  qpci_free_pc(pcibus);
  }

The caller is assuming it's okay to pass NULL.

This is a good candidate for a separate patch so that you can explain
the rationale ("Although containerof(NULL, QPCIBusPC, bus) happens to
evaluate to NULL today thanks to the QPCIBusPC struct layout, it's
generally bad practice to rely on that.  Normally containerof() is used
precisely because an offset into the struct is required and a
straightforward cast would not work.  We got lucky, but don't depend on
it anymore.").

> > > +
> > >   QPCIBusPC *s = container_of(bus, QPCIBusPC, bus);
> > >   g_free(s);
> > > @@ -176,3 +205,11 @@ void qpci_unplug_acpi_device_test(const char *id, 
> > > uint8_t slot)
> > >   qmp_eventwait("DEVICE_DELETED");
> > >   }
> > > +
> > > +static void qpci_pc(void)
> > > +{
> > > +qos_node_create_driver("pci-bus-pc", NULL);
> > > +qos_node_produces("pci-bus-pc", "pci-bus");
> > In QOM pci-bus-pc would be a class, pci-bus would be an interface.  From
> > a driver perspective it seems QOM can already do what is needed and the
> > qgraph infrastructure isn't necessary.
> > 
> > Obviously the depth-first search *is* unique and not in QOM, although
> > QOM does offer a tree namespace which can be used for looking up object
> > instances and I guess this could be used to configure tests at runtime.
> > 
> > I'll think about this more as I read the rest of the patches.
> > 
> > > +}
> > > +
> > > +libqos_init(qpci_pc);
> > > diff --git a/tests/libqos/pci-pc.h b/tests/libqos/pci-pc.h
> > > index 491eeac756..ee381c5667 100644
> > > --- a/tests/libqos/pci-pc.h
> > > +++ b/tests/libqos/pci-pc.h
> > 

Re: [Qemu-devel] [RFC v3 0/2] Add NRF51 SOC non-volatile memory controller

2018-07-18 Thread Stefan Hajnoczi
On Thu, Jul 12, 2018 at 12:12:17PM +0200, Steffen Görtz wrote:
> Add some non-volatile memories and a non-volatile
> memory controller for the nRF51.
> Furthermore, a testsuite for the bbc:microbit and
> nrf51 soc was added.
> 
> Examination of the real device showed that
> NVMs remained unchanged when the write/erase enabled
> bits are not set in the controller, so we can
> safely ignore all writes.
> More: 
> https://github.com/douzepouze/gsoc18-qemu/blob/master/notes.md#test-nvmc-behavior-out-of-micropython-repl
> 
> The CODE/FLASH NVM is not currently included in this
> peripheral. It is hosted in the SOC and must be read-only
> to provide an accurate model.
> 
> Steffen Görtz (2):
>   arm: Add NRF51 SOC non-volatile memory controller
>   tests: Add bbc:microbit / nRF51 test suite

Please include a changelog in the cover letter so that it's clear what
has changed in this revision.


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH 2/7] tests/qgraph: pci-pc driver and interface nodes

2018-07-18 Thread Stefan Hajnoczi
On Wed, Jul 11, 2018 at 05:18:03PM +0200, Paolo Bonzini wrote:
> On 11/07/2018 16:49, Stefan Hajnoczi wrote:
> > On Mon, Jul 09, 2018 at 11:11:31AM +0200, Emanuele Giuseppe Esposito wrote:
> >> -QPCIBus *qpci_init_pc(QTestState *qts, QGuestAllocator *alloc)
> >> +static void *qpci_get_driver(void *obj, const char *interface)
> >>  {
> >> -QPCIBusPC *ret = g_new0(QPCIBusPC, 1);
> >> +QPCIBusPC *qpci = obj;
> >> +if (!g_strcmp0(interface, "pci-bus")) {
> >> +return >bus;
> >> +}
> >> +printf("%s not present in pci-bus-pc", interface);
> >> +abort();
> >> +}
> > 
> > At this point I wonder if it makes sense to use the QEMU Object Model
> > (QOM), which has interfaces and inheritance.  qgraph duplicates part of
> > the object model.
> 
> Replying for Emanuele on this point since we didn't really go through
> QOM yet; I'll let him answer the comments that are more related to the code.
> 
> QOM is much heavier weight than qgraph, and adds a lot more boilerplate:
> 
> - QOM properties interact with QAPI and bring in a lot of baggage;
> qgraph would only use "child" properties to implement containment.
> 
> - QOM objects are long-lived, qgraph objects only last for the duration
> of a single test.  qgraph doesn't need reference counting or complex
> two-phase initialization like "realize" or "user_complete"
> 
> - QOM's hierarchy is shallow, but qgraph doesn't really need _any_
> hierarchy.  Because it focuses on interface rather than hierarchy,
> everything can be expressed simply through structs contained into one
> another.
> 
> Consider that qgraph is 1/4th the size of QOM, and a large part of it is
> the graph data structure that (as you said) would not be provided by QOM.
> 
> There are two things where using QOM would save a little bit of
> duplicated concepts, namely the get_driver (get interface, what you
> quote above) and get_device (get contained object) callbacks.  However,
> it wouldn't simplify the code at all, because it would require the
> introduction of separate instance and class structs.  We didn't want to
> add all too much boilerplate for people that want to write qtest, as you
> pointed out in the review of patch 4.

Yes, I think these are good points.  QOM does involve a lot of
boilerplate for defining objects.

> >> +void qpci_set_pc(QPCIBusPC *ret, QTestState *qts, QGuestAllocator *alloc)
> > 
> > It's not clear to me what the purpose of this function is - at least the
> > name is a bit cryptic since it seems more like an initialization
> > function than 'setting pc' on QPCIBusPC.  How about inlining this in
> > qpci_init_pc() instead of keeping a separate function?
> 
> I agree about the naming.  Perhaps we can rename the existing
> qpci_init_pc to qpci_pc_new, and qpci_set_pc to qpci_pc_init.
> 
> Would you prefer if the split was done in the patch that introduces the
> user for qpci_set_pc (patch 5)?  We did it here because this patch
> prepares qpci-pc

Either way is fine.  I suggested inlining mainly because it avoids the
need to pick a good name :).

Stefan


signature.asc
Description: PGP signature


Re: [Qemu-devel] [PATCH v2] arm: Add ARMv6-M programmer's model support

2018-07-18 Thread Stefan Hajnoczi
On Wed, Jul 18, 2018 at 12:56:28PM +0300, Julia Suvorova wrote:
> Forbid stack alignment change. (CCR)
> Reserve FAULTMASK, BASEPRI registers.
> Report any fault as a HardFault. Disable MemManage, BusFault and
> UsageFault, so they always escalated to HardFault. (SHCSR)
> 
> Signed-off-by: Julia Suvorova 
> ---
> v2:
> * Changed CCR reset value
> 
>  hw/intc/armv7m_nvic.c | 10 ++
>  target/arm/cpu.c  |  4 
>  target/arm/helper.c   | 13 +++--
>  3 files changed, 25 insertions(+), 2 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [Qemu-devel] [Qemu-block] [PATCH v2] nvme: Fix nvme_init error handling

2018-07-18 Thread Stefan Hajnoczi
On Thu, Jul 12, 2018 at 10:54:20AM +0800, Fam Zheng wrote:
> It is wrong to leave this field as 1, as nvme_close() called in the
> error handling code in nvme_file_open() will use it and try to free
> s->queues again.
> 
> Another problem is the cleaning ups are duplicated between the fail*
> labels of nvme_init() and nvme_file_open(), which calls nvme_close().
> 
> A third problem is nvme_close() misses g_free() and
> event_notifier_cleanup().
> 
> Fix all of them.
> 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Fam Zheng 
> 
> ---
> 
> v2: Adopt the suggested fix by Kevin.
> ---
>  block/nvme.c | 37 -
>  1 file changed, 12 insertions(+), 25 deletions(-)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature


Re: [Qemu-devel] [Qemu-block] [PATCH for-3.1] qemu-iotests: Adapt to moved location of StringIO module in py3

2018-07-18 Thread John Snow



On 07/18/2018 11:05 AM, Eduardo Habkost wrote:
> On Wed, Jul 18, 2018 at 12:02:39PM -0300, Philippe Mathieu-Daudé wrote:
>> Hi Eduardo,
>>
>> On 07/18/2018 11:53 AM, Eduardo Habkost wrote:
>>> On Tue, Jul 17, 2018 at 08:40:15PM -0300, Philippe Mathieu-Daudé wrote:
>>> [...]
 -import StringIO
 +try:
 +from StringIO import StringIO
 +except ImportError:
 +from io import StringIO
>>>
>>> Why do we need this?  Python 2.7 has io.StringIO.
>>
>> Python 2 works fine, the problem is the Fedora Docker image uses Python
>> 3 and the block tests started to fail...
> 
> My question is: why use StringIO.StringIO on Python 2 and
> io.StringIO on Python 3, if io.StringIO works on both Python
> versions?
> 

Holdover from when 2.6 was our requisite version, surely.

--js



Re: [Qemu-devel] [Qemu-block] [PATCH] block/vvfat: Disable debug message by default

2018-07-18 Thread John Snow



On 07/18/2018 11:08 AM, Thomas Huth wrote:
> It's annoying to see this debug message every time you use vvfat.
> Disable it with the DLOG() macro by default, as it is done with the
> other debug messages in this file.
> 
> Signed-off-by: Thomas Huth 
> ---
>  block/vvfat.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/block/vvfat.c b/block/vvfat.c
> index c7d2ed2..fc41841 100644
> --- a/block/vvfat.c
> +++ b/block/vvfat.c
> @@ -1245,8 +1245,8 @@ static int vvfat_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  s->fat2 = NULL;
>  s->downcase_short_names = 1;
>  
> -fprintf(stderr, "vvfat %s chs %d,%d,%d\n",
> -dirname, cyls, heads, secs);
> +DLOG(fprintf(stderr, "vvfat %s chs %d,%d,%d\n",
> + dirname, cyls, heads, secs));
>  
>  s->sector_count = cyls * heads * secs - s->offset_to_bootsector;
>  
> 

Makes sense as a minimal change without doing more... real work here.

Reviewed-by: John Snow 



[Qemu-devel] [PATCH] block/vvfat: Fix crash when reporting error about too many files in directory

2018-07-18 Thread Thomas Huth
When using the vvfat driver with a directory that contains too many files,
QEMU currently crashes. We are trying to print the wrong path variable here.

Signed-off-by: Thomas Huth 
---
 block/vvfat.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index fc41841..6ae7458 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -975,8 +975,7 @@ static int init_directories(BDRVVVFATState* s,
 if (mapping->mode & MODE_DIRECTORY) {
 mapping->begin = cluster;
 if(read_directory(s, i)) {
-error_setg(errp, "Could not read directory %s",
-   mapping->path);
+error_setg(errp, "Could not read directory \"%s\"", s->path);
 return -1;
 }
 mapping = array_get(&(s->mapping), i);
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH for-3.1] qemu-iotests: Adapt to moved location of StringIO module in py3

2018-07-18 Thread Philippe Mathieu-Daudé
On 07/18/2018 12:05 PM, Eduardo Habkost wrote:
> On Wed, Jul 18, 2018 at 12:02:39PM -0300, Philippe Mathieu-Daudé wrote:
>> Hi Eduardo,
>>
>> On 07/18/2018 11:53 AM, Eduardo Habkost wrote:
>>> On Tue, Jul 17, 2018 at 08:40:15PM -0300, Philippe Mathieu-Daudé wrote:
>>> [...]
 -import StringIO
 +try:
 +from StringIO import StringIO
 +except ImportError:
 +from io import StringIO
>>>
>>> Why do we need this?  Python 2.7 has io.StringIO.
>>
>> Python 2 works fine, the problem is the Fedora Docker image uses Python
>> 3 and the block tests started to fail...
> 
> My question is: why use StringIO.StringIO on Python 2 and
> io.StringIO on Python 3, if io.StringIO works on both Python
> versions?

Oh I missed your question because I was not aware of this, and looked
how this was handled in the tree (7a5d936b6fc and 5f90af8e6b).

TIL we can use "from io import StringIO" regardless the version, the
2->3 conversion looks a bit less absurd, thanks!

Phil.



Re: [Qemu-devel] [PATCH 5/7 V11] mem/nvdimm: ensure write persistence to PMEM in label emulation

2018-07-18 Thread Richard Henderson
On 07/18/2018 12:48 AM, junyan...@gmx.com wrote:
> From: Junyan He 
> 
> Guest writes to vNVDIMM labels are intercepted and performed on the
> backend by QEMU. When the backend is a real persistent memort, QEMU
> needs to take proper operations to ensure its write persistence on the
> persistent memory. Otherwise, a host power failure may result in the
> loss of guest label configurations.
> 
> Signed-off-by: Haozhong Zhang 
> Reviewed-by: Stefan Hajnoczi 
> Reviewed-by: Igor Mammedov 
> ---
>  hw/mem/nvdimm.c |  9 -
>  include/qemu/pmem.h | 30 ++
>  2 files changed, 38 insertions(+), 1 deletion(-)
>  create mode 100644 include/qemu/pmem.h

Reviewed-by: Richard Henderson 


r~




Re: [Qemu-devel] spice/qlx: atomic-alignment error with clang-7

2018-07-18 Thread Philippe Mathieu-Daudé
On 07/18/2018 12:12 PM, Paolo Bonzini wrote:
> On 18/07/2018 16:56, Philippe Mathieu-Daudé wrote:
>> Using clang-7 (Debian) on x86_64 to build v3.0.0-rc1.
>>
>> Not a regression, it was probably always there:
>>
>> $ ./configure --cc=clang-7 && make hw/display/qxl.o
>> hw/display/qxl.c:1884:19: error: misaligned or large atomic operation
>> may incur significant performance penalty [-Werror,-Watomic-alignment]
>> old_pending = atomic_fetch_or(>ram->int_pending, le_events);
>>   ^
>> include/qemu/atomic.h:206:34: note: expanded from macro 'atomic_fetch_or'
>> #define atomic_fetch_or(ptr, n)  __atomic_fetch_or(ptr, n, __ATOMIC_SEQ_CST)
>>  ^
>> 1 error generated.
> 
> Hi, this is fixed in the latest SPICE git.  The upstream commit is
> 
> https://gitlab.freedesktop.org/spice/spice-protocol/commit/beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1

That was fast =)

Thanks!

Phil.



Re: [Qemu-devel] [PATCH 2/7 V11] memory, exec: switch file ram allocation functions to 'flags' parameters

2018-07-18 Thread Richard Henderson
On 07/18/2018 12:47 AM, junyan...@gmx.com wrote:
> From: Junyan He 
> 
> As more flag parameters besides the existing 'share' are going to be
> added to following functions
> memory_region_init_ram_from_file
> qemu_ram_alloc_from_fd
> qemu_ram_alloc_from_file
> let's switch them to use the 'flags' parameters so as to ease future
> flag additions.
> 
> The existing 'share' flag is converted to the RAM_SHARED bit in ram_flags,
> and other flag bits are ignored by above functions right now.
> 
> Signed-off-by: Junyan He 
> Signed-off-by: Haozhong Zhang 
> Reviewed-by: Stefan Hajnoczi 
> Reviewed-by: Igor Mammedov 
> ---
>  backends/hostmem-file.c |  3 ++-
>  exec.c  | 10 +-
>  include/exec/memory.h   |  7 +--
>  include/exec/ram_addr.h | 25 +++--
>  memory.c|  8 +---
>  numa.c  |  2 +-
>  6 files changed, 41 insertions(+), 14 deletions(-)

Reviewed-by: Richard Henderson 


r~



Re: [Qemu-devel] spice/qlx: atomic-alignment error with clang-7

2018-07-18 Thread Paolo Bonzini
On 18/07/2018 16:56, Philippe Mathieu-Daudé wrote:
> Using clang-7 (Debian) on x86_64 to build v3.0.0-rc1.
> 
> Not a regression, it was probably always there:
> 
> $ ./configure --cc=clang-7 && make hw/display/qxl.o
> hw/display/qxl.c:1884:19: error: misaligned or large atomic operation
> may incur significant performance penalty [-Werror,-Watomic-alignment]
> old_pending = atomic_fetch_or(>ram->int_pending, le_events);
>   ^
> include/qemu/atomic.h:206:34: note: expanded from macro 'atomic_fetch_or'
> #define atomic_fetch_or(ptr, n)  __atomic_fetch_or(ptr, n, __ATOMIC_SEQ_CST)
>  ^
> 1 error generated.

Hi, this is fixed in the latest SPICE git.  The upstream commit is

https://gitlab.freedesktop.org/spice/spice-protocol/commit/beda5ec7a6848be20c0cac2a9a8ef2a41e8069c1

Thanks,

Paolo



Re: [Qemu-devel] [BUG?] aio_get_linux_aio: Assertion `ctx->linux_aio' failed

2018-07-18 Thread Farhan Ali




On 07/18/2018 09:42 AM, Farhan Ali wrote:



On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote:

iiuc, this possibly implies AIO was not actually used previously on this
guest (it might have silently been falling back to threaded IO?). I
don't have access to s390x, but would it be possible to run qemu under
gdb and see if aio_setup_linux_aio is being called at all (I think it
might not be, but I'm not sure why), and if so, if it's for the context
in question?

If it's not being called first, could you see what callpath is calling
aio_get_linux_aio when this assertion trips?

Thanks!
-Nish



Hi Nishant,

 From the coredump of the guest this is the call trace that calls 
aio_get_linux_aio:



Stack trace of thread 145158:
#0  0x03ff94dbe274 raise (libc.so.6)
#1  0x03ff94da39a8 abort (libc.so.6)
#2  0x03ff94db62ce __assert_fail_base (libc.so.6)
#3  0x03ff94db634c __assert_fail (libc.so.6)
#4  0x02aa20db067a aio_get_linux_aio (qemu-system-s390x)
#5  0x02aa20d229a8 raw_aio_plug (qemu-system-s390x)
#6  0x02aa20d309ee bdrv_io_plug (qemu-system-s390x)
#7  0x02aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x)
#8  0x02aa20db2f6e aio_dispatch_handlers (qemu-system-s390x)
#9  0x02aa20db3c34 aio_poll (qemu-system-s390x)
#10 0x02aa20be32a2 iothread_run (qemu-system-s390x)
#11 0x03ff94f879a8 start_thread (libpthread.so.0)
#12 0x03ff94e797ee thread_start (libc.so.6)


Thanks for taking a look and responding.

Thanks
Farhan





Trying to debug a little further, the block device in this case is a 
"host device". And looking at your commit carefully you use the 
bdrv_attach_aio_context callback to setup a Linux AioContext.


For some reason the "host device" struct (BlockDriver bdrv_host_device 
in block/file-posix.c) does not have a bdrv_attach_aio_context defined.
So a simple change of adding the callback to the struct solves the issue 
and the guest starts fine.



diff --git a/block/file-posix.c b/block/file-posix.c
index 28824aa..b8d59fb 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = {
 .bdrv_refresh_limits = raw_refresh_limits,
 .bdrv_io_plug = raw_aio_plug,
 .bdrv_io_unplug = raw_aio_unplug,
+.bdrv_attach_aio_context = raw_aio_attach_aio_context,

 .bdrv_co_truncate   = raw_co_truncate,
 .bdrv_getlength= raw_getlength,



I am not too familiar with block device code in QEMU, so not sure if 
this is the right fix or if there are some underlying problems.


Thanks
Farhan




Re: [Qemu-devel] monitor: enable OOB by default

2018-07-18 Thread Markus Armbruster
Marc-André, one question for you inline.  Search for your name.

Peter Xu  writes:

> On Thu, Jun 28, 2018 at 03:20:41PM +0200, Markus Armbruster wrote:
>> Peter Xu  writes:
>> 
>> > On Wed, Jun 27, 2018 at 03:13:57PM +0200, Markus Armbruster wrote:
>> >> Monitor behavior changes even when the client rejects capability "oob".
>> >> 
>> >> Traditionally, the monitor reads, executes and responds to one command
>> >> after the other.  If the client sends in-band commands faster than the
>> >> server can execute them, the kernel will eventually refuse to buffer
>> >> more, and sending blocks or fails with EAGAIN.
>> >> 
>> >> To make OOB possible, we need to read and queue commands as we receive
>> >> them.  If the client sends in-band commands faster than the server can
>> >> execute them, the server will eventually drop commands to limit the
>> >> queue length.  The sever sends event COMMAND_DROPPED then.
>> >> 
>> >> However, we get the new behavior even when the client rejects capability
>> >> "oob".  We get the traditional behavior only when the server doesn't
>> >> offer "oob".
>> >> 
>> >> Is this what we want?
>> >
>> > Not really.  But I thought we kept that old behavior, haven't we?
>> >
>> > In handle_qmp_command() we have this:
>> >
>> > /*
>> >  * If OOB is not enabled on the current monitor, we'll emulate the
>> >  * old behavior that we won't process the current monitor any more
>> >  * until it has responded.  This helps make sure that as long as
>> >  * OOB is not enabled, the server will never drop any command.
>> >  */
>> > if (!qmp_oob_enabled(mon)) {
>> > monitor_suspend(mon);
>> > req_obj->need_resume = true;
>> > } else {
>> > /* Drop the request if queue is full. */
>> > if (mon->qmp.qmp_requests->length >= QMP_REQ_QUEUE_LEN_MAX) {
>> > qemu_mutex_unlock(>qmp.qmp_queue_lock);
>> > qapi_event_send_command_dropped(id,
>> > COMMAND_DROP_REASON_QUEUE_FULL,
>> > _abort);
>> > qmp_request_free(req_obj);
>> > return;
>> > }
>> > }
>> >
>> > Here assuming that we are not enabling OOB, then since we will suspend
>> > the monitor when we receive one command, then monitor_can_read() later
>> > will check with a result that "we should not read the chardev", then
>> > it'll leave all the data in the input buffer.  Then AFAIU the QMP
>> > client that didn't enable OOB will have similar behavior as before.
>> >
>> > Also, we will never receive a COMMAND_DROPPED event in that legacy
>> > mode as well since it's in the "else" block.   Am I right?
>> 
>> Hmm.  Did I get confused?  Let me look again.
>> 
>> Some places check qmp_oob_enabled(), which is true when the server
>> offered capability "oob" and the client accepted.  In terms of my reply
>> to Daniel, it distinguishes the two run time cases "OOB off" and "OOB
>> on".
>
> Exactly.
>
>> 
>> Other places check ->use_io_thr, which is true when
>> 
>> (flags & MONITOR_USE_CONTROL) && !CHARDEV_IS_MUX(chr)
>> 
>> in monitor_init().

->use_io_thr is now spelled ->use_io_thread.

>> One of these places is get_qmp_greeting().  It makes the server offer
>> "oob" when ->use_io_thr.  Makes sense.
>> 
>> In terms of my reply to Daniel, ->use_io_thr distinguishes between "OOB
>> not offered" and ("OOB offered, but rejected by client" or "OOB offered
>> and accepted").
>
> Exactly.
>
> To be more clear, let's name the three cases (as you mentioned):
>
> (A) OOB not offered
> (B) OOB offered, but rejected by client
> (C) OOB offered, and accepted
>
> Then:
>
> (1) qmp_oob_enabled() will only be return true if (C).
> (2) ->use_io_thr will only be true if (B) or (C).

Notes on (A) to be kept in mind for the remainder of the discussion:

* I don't expect (A) to occur in production

  (A) means the monitor has a chardev-mux backend.  chardev-mux
  multiplexes multiple character backends, e.g. multiplex a monitor and
  a console on stdio.  C-a c switches focus.  While such multiplexing
  may be convenient for humans, it's an unnecessary complication for
  machines, which could just as well use two UNIX domain sockets and not
  have to worry about focus and escape sequences.

* I do expect (A) to go away eventually

  (A) exists only because "not all the chardev frontends can run without
  main thread, or can run in multiple threads" [PATCH 6].  Hopefully,
  that'll be fixed eventually, so (A) can go away.

  Marc-André, your "[PATCH 04/12] Revert "qmp: isolate responses into io
  thread" states the "chardev write path is thread safe".  What's
  missing to make chardev-mux capable of supporting OOB?

>> Uses could to violate 'may use "server offered OOB" only for
>> configuration purposes', so let's check them:
>> 
>> * monitor_json_emitter()

This is now qmp_queue_response()

>>   If mon->use_io_thr, push to response queue.  Else emit directly.
>> 
>>   

[Qemu-devel] [PATCH] block/vvfat: Disable debug message by default

2018-07-18 Thread Thomas Huth
It's annoying to see this debug message every time you use vvfat.
Disable it with the DLOG() macro by default, as it is done with the
other debug messages in this file.

Signed-off-by: Thomas Huth 
---
 block/vvfat.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index c7d2ed2..fc41841 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1245,8 +1245,8 @@ static int vvfat_open(BlockDriverState *bs, QDict 
*options, int flags,
 s->fat2 = NULL;
 s->downcase_short_names = 1;
 
-fprintf(stderr, "vvfat %s chs %d,%d,%d\n",
-dirname, cyls, heads, secs);
+DLOG(fprintf(stderr, "vvfat %s chs %d,%d,%d\n",
+ dirname, cyls, heads, secs));
 
 s->sector_count = cyls * heads * secs - s->offset_to_bootsector;
 
-- 
1.8.3.1




Re: [Qemu-devel] [PATCH for-3.1] qemu-iotests: Adapt to moved location of StringIO module in py3

2018-07-18 Thread Eduardo Habkost
On Wed, Jul 18, 2018 at 12:02:39PM -0300, Philippe Mathieu-Daudé wrote:
> Hi Eduardo,
> 
> On 07/18/2018 11:53 AM, Eduardo Habkost wrote:
> > On Tue, Jul 17, 2018 at 08:40:15PM -0300, Philippe Mathieu-Daudé wrote:
> > [...]
> >> -import StringIO
> >> +try:
> >> +from StringIO import StringIO
> >> +except ImportError:
> >> +from io import StringIO
> > 
> > Why do we need this?  Python 2.7 has io.StringIO.
> 
> Python 2 works fine, the problem is the Fedora Docker image uses Python
> 3 and the block tests started to fail...

My question is: why use StringIO.StringIO on Python 2 and
io.StringIO on Python 3, if io.StringIO works on both Python
versions?

-- 
Eduardo



Re: [Qemu-devel] [PATCH for-3.1] qemu-iotests: Adapt to moved location of StringIO module in py3

2018-07-18 Thread Philippe Mathieu-Daudé
Hi Eduardo,

On 07/18/2018 11:53 AM, Eduardo Habkost wrote:
> On Tue, Jul 17, 2018 at 08:40:15PM -0300, Philippe Mathieu-Daudé wrote:
> [...]
>> -import StringIO
>> +try:
>> +from StringIO import StringIO
>> +except ImportError:
>> +from io import StringIO
> 
> Why do we need this?  Python 2.7 has io.StringIO.

Python 2 works fine, the problem is the Fedora Docker image uses Python
3 and the block tests started to fail...

Is this commit message clearer?

"Since 356dc290f the Fedora [Docker] image uses Python3 by default."



[Qemu-devel] spice/qlx: atomic-alignment error with clang-7

2018-07-18 Thread Philippe Mathieu-Daudé
Using clang-7 (Debian) on x86_64 to build v3.0.0-rc1.

Not a regression, it was probably always there:

$ ./configure --cc=clang-7 && make hw/display/qxl.o
hw/display/qxl.c:1884:19: error: misaligned or large atomic operation
may incur significant performance penalty [-Werror,-Watomic-alignment]
old_pending = atomic_fetch_or(>ram->int_pending, le_events);
  ^
include/qemu/atomic.h:206:34: note: expanded from macro 'atomic_fetch_or'
#define atomic_fetch_or(ptr, n)  __atomic_fetch_or(ptr, n, __ATOMIC_SEQ_CST)
 ^
1 error generated.

static void qxl_send_events(PCIQXLDevice *d, uint32_t events)
{
uint32_t old_pending;
uint32_t le_events = cpu_to_le32(events);
...
old_pending = atomic_fetch_or(>ram->int_pending, le_events);
if ((old_pending & le_events) == le_events) {
return;
}
...
}

Having:

typedef struct PCIQXLDevice {
QXLRam *ram;
...
} PCIQXLDevice;

And from :

typedef struct SPICE_ATTR_PACKED QXLRam {
uint32_t magic;
uint32_t int_pending;
uint32_t int_mask;
uint8_t log_buf[QXL_LOG_BUF_SIZE];
...
} QXLRam;

Any idea how this could get fixed?

Thanks

Phil.



Re: [Qemu-devel] [PATCH for-3.1] qemu-iotests: Adapt to moved location of StringIO module in py3

2018-07-18 Thread Eduardo Habkost
On Tue, Jul 17, 2018 at 08:40:15PM -0300, Philippe Mathieu-Daudé wrote:
[...]
> -import StringIO
> +try:
> +from StringIO import StringIO
> +except ImportError:
> +from io import StringIO

Why do we need this?  Python 2.7 has io.StringIO.

-- 
Eduardo



Re: [Qemu-devel] [PATCH v1 2/5] sifive_u: Fix crash when introspecting the device

2018-07-18 Thread Philippe Mathieu-Daudé
On 07/17/2018 05:28 PM, Alistair Francis wrote:
> Use the new object_initialize_child() and sysbus_init_child_obj() to
> fix the issue.
> 

Suggested-by: Thomas Huth 

> Signed-off-by: Alistair Francis 

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 

> ---
>  hw/riscv/sifive_u.c | 15 +++
>  1 file changed, 7 insertions(+), 8 deletions(-)
> 
> diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
> index 3a6ffeb437..59ae1ce24a 100644
> --- a/hw/riscv/sifive_u.c
> +++ b/hw/riscv/sifive_u.c
> @@ -244,9 +244,9 @@ static void riscv_sifive_u_init(MachineState *machine)
>  int i;
>  
>  /* Initialize SoC */
> -object_initialize(>soc, sizeof(s->soc), TYPE_RISCV_U_SOC);
> -object_property_add_child(OBJECT(machine), "soc", OBJECT(>soc),
> -  _abort);
> +object_initialize_child(OBJECT(machine), "soc", >soc,
> +sizeof(s->soc), TYPE_RISCV_U_SOC,
> +_abort, NULL);
>  object_property_set_bool(OBJECT(>soc), true, "realized",
>  _abort);
>  
> @@ -303,16 +303,15 @@ static void riscv_sifive_u_soc_init(Object *obj)
>  {
>  SiFiveUSoCState *s = RISCV_U_SOC(obj);
>  
> -object_initialize(>cpus, sizeof(s->cpus), TYPE_RISCV_HART_ARRAY);
> -object_property_add_child(obj, "cpus", OBJECT(>cpus),
> -  _abort);
> +object_initialize_child(obj, "cpus", >cpus, sizeof(s->cpus),
> +TYPE_RISCV_HART_ARRAY, _abort, NULL);
>  object_property_set_str(OBJECT(>cpus), SIFIVE_U_CPU, "cpu-type",
>  _abort);
>  object_property_set_int(OBJECT(>cpus), smp_cpus, "num-harts",
>  _abort);
>  
> -object_initialize(>gem, sizeof(s->gem), TYPE_CADENCE_GEM);
> -qdev_set_parent_bus(DEVICE(>gem), sysbus_get_default());
> +sysbus_init_child_obj(obj, "gem", >gem, sizeof(s->gem),
> +  TYPE_CADENCE_GEM);
>  }
>  
>  static void riscv_sifive_u_soc_realize(DeviceState *dev, Error **errp)
> 



Re: [Qemu-devel] [RFC v3 00/15] ARM virt: PCDIMM/NVDIMM at 2TB

2018-07-18 Thread Igor Mammedov
On Tue,  3 Jul 2018 09:19:43 +0200
Eric Auger  wrote:

> This series aims at supporting PCDIMM/NVDIMM intantiation in
> machvirt at 2TB guest physical address.
> 
> This is achieved in 3 steps:
> 1) support more than 40b IPA/GPA
will it work for TCG as well?
/important from make check pov and maybe in cases when there is no ARM system 
available to test/play with the feature/



> 2) support PCDIMM instantiation
> 3) support NVDIMM instantiation
> 
> This series reuses/rebases patches initially submitted by Shameer in [1]
> and Kwangwoo in [2].
> 
> I put all parts all together for consistency and due to dependencies
> however as soon as the kernel dependency is resolved we can consider
> upstreaming them separately.
> 
> Support more than 40b IPA/GPA [ patches 1 - 5 ]
> ---
> was "[RFC 0/6] KVM/ARM: Dynamic and larger GPA size"
> 
> At the moment the guest physical address space is limited to 40b
> due to KVM limitations. [0] bumps this limitation and allows to
> create a VM with up to 52b GPA address space.
> 
> With this series, QEMU creates a virt VM with the max IPA range
> reported by the host kernel or 40b by default.
> 
> This choice can be overriden by using the -machine kvm-type=
> option with bits within [40, 52]. If  are not supported by
> the host, the legacy 40b value is used.
> 
> Currently the EDK2 FW also hardcodes the max number of GPA bits to
> 40. This will need to be fixed.
> 
> PCDIMM Support [ patches 6 - 11 ]
> -
> was "[RFC 0/5] ARM virt: Support PC-DIMM at 2TB"
> 
> We instantiate the device_memory at 2TB. Using it obviously requires
> at least 42b of IPA/GPA. While its max capacity is currently limited
> to 2TB, the actual size depends on the initial guest RAM size and
> maxmem parameter.
> 
> Actual hot-plug and hot-unplug of PC-DIMM is not suported due to lack
> of support of those features in baremetal.
> 
> NVDIMM support [ patches 12 - 15 ]
> --
> 
> Once the memory hotplug framework is in place it is fairly
> straightforward to add support for NVDIMM. the machine "nvdimm" option
> turns the capability on.
> 
> Best Regards
> 
> Eric
> 
> References:
> 
> [0] [PATCH v3 00/20] arm64: Dynamic & 52bit IPA support
> https://www.spinics.net/lists/kernel/msg2841735.html
> 
> [1] [RFC v2 0/6] hw/arm: Add support for non-contiguous iova regions
> http://patchwork.ozlabs.org/cover/914694/
> 
> [2] [RFC PATCH 0/3] add nvdimm support on AArch64 virt platform
> https://lists.gnu.org/archive/html/qemu-devel/2016-07/msg04599.html
> 
> Tests:
> - On Cavium Gigabyte, a 48b VM was created.
> - Migration tests were performed between kernel supporting the
>   feature and destination kernel not suporting it
> - test with ACPI: to overcome the limitation of EDK2 FW, virt
>   memory map was hacked to move the device memory below 1TB.
> 
> This series can be found at:
> https://github.com/eauger/qemu/tree/v2.12.0-dimm-2tb-v3
> 
> History:
> 
> v2 -> v3:
> - fix pc_q35 and pc_piix compilation error
> - kwangwoo's email being not valid anymore, remove his address
> 
> v1 -> v2:
> - kvm_get_max_vm_phys_shift moved in arch specific file
> - addition of NVDIMM part
> - single series
> - rebase on David's refactoring
> 
> v1:
> - was "[RFC 0/6] KVM/ARM: Dynamic and larger GPA size"
> - was "[RFC 0/5] ARM virt: Support PC-DIMM at 2TB"
> 
> Best Regards
> 
> Eric
> 
> 
> Eric Auger (9):
>   linux-headers: header update for KVM/ARM KVM_ARM_GET_MAX_VM_PHYS_SHIFT
>   hw/boards: Add a MachineState parameter to kvm_type callback
>   kvm: add kvm_arm_get_max_vm_phys_shift
>   hw/arm/virt: support kvm_type property
>   hw/arm/virt: handle max_vm_phys_shift conflicts on migration
>   hw/arm/virt: Allocate device_memory
>   acpi: move build_srat_hotpluggable_memory to generic ACPI source
>   hw/arm/boot: Expose the pmem nodes in the DT
>   hw/arm/virt: Add nvdimm and nvdimm-persistence options
> 
> Kwangwoo Lee (2):
>   nvdimm: use configurable ACPI IO base and size
>   hw/arm/virt: Add nvdimm hot-plug infrastructure
> 
> Shameer Kolothum (4):
>   hw/arm/virt: Add memory hotplug framework
>   hw/arm/boot: introduce fdt_add_memory_node helper
>   hw/arm/boot: Expose the PC-DIMM nodes in the DT
>   hw/arm/virt-acpi-build: Add PC-DIMM in SRAT
> 
>  accel/kvm/kvm-all.c|   2 +-
>  default-configs/arm-softmmu.mak|   4 +
>  hw/acpi/aml-build.c|  51 
>  hw/acpi/nvdimm.c   |  28 ++-
>  hw/arm/boot.c  | 123 +++--
>  hw/arm/virt-acpi-build.c   |  10 +
>  hw/arm/virt.c  | 330 
> ++---
>  hw/i386/acpi-build.c   |  49 
>  hw/i386/pc_piix.c  |   8 +-
>  hw/i386/pc_q35.c   |   8 +-
>  hw/ppc/mac_newworld.c  | 

Re: [Qemu-devel] [RFC v3 08/15] hw/arm/boot: introduce fdt_add_memory_node helper

2018-07-18 Thread Igor Mammedov
On Tue,  3 Jul 2018 09:19:51 +0200
Eric Auger  wrote:

> From: Shameer Kolothum 
> 
> We introduce an helper to create a memory node.
> 
> Signed-off-by: Eric Auger 
> Signed-off-by: Shameer Kolothum 
> 
> ---
> 
> v1 -> v2:
> - nop of existing /memory nodes was already handled
> ---
>  hw/arm/boot.c | 54 ++
>  1 file changed, 34 insertions(+), 20 deletions(-)
> 
> diff --git a/hw/arm/boot.c b/hw/arm/boot.c
> index e09201c..5243a25 100644
> --- a/hw/arm/boot.c
> +++ b/hw/arm/boot.c
> @@ -413,6 +413,36 @@ static void set_kernel_args_old(const struct 
> arm_boot_info *info,
>  }
>  }
>  
> +static int fdt_add_memory_node(void *fdt, uint32_t acells, hwaddr mem_base,
> +   uint32_t scells, hwaddr mem_len,
> +   int numa_node_id)
> +{
> +char *nodename = NULL;
> +int ret;
> +
> +nodename = g_strdup_printf("/memory@%" PRIx64, mem_base);
> +qemu_fdt_add_subnode(fdt, nodename);
> +qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> +ret = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg", acells, 
> mem_base,
> +   scells, mem_len);
> +if (ret < 0) {
> +fprintf(stderr, "couldn't set %s/reg\n", nodename);
> +goto out;
> +}
> +if (numa_node_id < 0) {
> +goto out;
> +}
> +
> +ret = qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", numa_node_id);
> +if (ret < 0) {
> +fprintf(stderr, "couldn't set %s/numa-node-id\n", nodename);
> +}
> +
> +out:
> +g_free(nodename);
> +return ret;
> +}
> +

not related question from hotplug POV,
is entry size fixed?
can we estimate exact size for #slots number of dimms and reserve it in advance
in FDT 'rom'?

>  static void fdt_add_psci_node(void *fdt)
>  {
>  uint32_t cpu_suspend_fn;
> @@ -492,7 +522,6 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info 
> *binfo,
>  void *fdt = NULL;
>  int size, rc, n = 0;
>  uint32_t acells, scells;
> -char *nodename;
>  unsigned int i;
>  hwaddr mem_base, mem_len;
>  char **node_path;
> @@ -566,35 +595,20 @@ int arm_load_dtb(hwaddr addr, const struct 
> arm_boot_info *binfo,
>  mem_base = binfo->loader_start;
>  for (i = 0; i < nb_numa_nodes; i++) {
>  mem_len = numa_info[i].node_mem;
> -nodename = g_strdup_printf("/memory@%" PRIx64, mem_base);
> -qemu_fdt_add_subnode(fdt, nodename);
> -qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> -rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
> -  acells, mem_base,
> -  scells, mem_len);
> +rc = fdt_add_memory_node(fdt, acells, mem_base,
> + scells, mem_len, i);
>  if (rc < 0) {
> -fprintf(stderr, "couldn't set %s/reg for node %d\n", 
> nodename,
> -i);
>  goto fail;
>  }
>  
> -qemu_fdt_setprop_cell(fdt, nodename, "numa-node-id", i);
>  mem_base += mem_len;
> -g_free(nodename);
>  }
>  } else {
> -nodename = g_strdup_printf("/memory@%" PRIx64, binfo->loader_start);
> -qemu_fdt_add_subnode(fdt, nodename);
> -qemu_fdt_setprop_string(fdt, nodename, "device_type", "memory");
> -
> -rc = qemu_fdt_setprop_sized_cells(fdt, nodename, "reg",
> -  acells, binfo->loader_start,
> -  scells, binfo->ram_size);
> +rc = fdt_add_memory_node(fdt, acells, binfo->loader_start,
> + scells, binfo->ram_size, -1);
>  if (rc < 0) {
> -fprintf(stderr, "couldn't set %s reg\n", nodename);
>  goto fail;
>  }
> -g_free(nodename);
>  }
>  
>  rc = fdt_path_offset(fdt, "/chosen");
nice cleanup, but I won't stop here just yet if hotplug to be considered.

I see arm_load_dtb() as a hack called from every board
where we dump everything that might be related to DTB regardless
if it's generic for every board or only a board specific stuff.

Could we split it in several logical parts that do a single thing
and preferably user only when they are actually need?
Something along following lines:
(cleanups/refactoring should be a separate from pcdimm series as it's self 
sufficient
and it would be easier to review/merge and could simplify following up pcdimm 
series):

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index e09201c..9c41efd 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@ -486,9 +486,6 @@ static void fdt_add_psci_node(void *fdt)
 qemu_fdt_setprop_cell(fdt, "/psci", "migrate", migrate_fn);
 }
 
-int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
- 

Re: [Qemu-devel] [PATCH] nvic: Change NVIC to support ARMv6-M

2018-07-18 Thread Julia Suvorova via Qemu-devel

On 17.07.2018 15:58, Peter Maydell wrote:

On 10 July 2018 at 16:33, Julia Suvorova  wrote:

The differences from ARMv7-M NVIC are:
   * ARMv6-M only supports up to 32 external interrupts
(configurable feature already). The ICTR is reserved.
   * Active Bit Register is reserved.
   * ARMv6-M supports 4 priority levels against 256 in ARMv7-M.

Signed-off-by: Julia Suvorova 
---
  hw/intc/armv7m_nvic.c | 29 +
  1 file changed, 25 insertions(+), 4 deletions(-)

diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c
index 38aaf3dc8e..8545c87caa 100644
--- a/hw/intc/armv7m_nvic.c
+++ b/hw/intc/armv7m_nvic.c
@@ -420,6 +420,10 @@ static void set_prio(NVICState *s, unsigned irq, bool 
secure, uint8_t prio)
  assert(irq > ARMV7M_EXCP_NMI); /* only use for configurable prios */
  assert(irq < s->num_irq);

+if (!arm_feature(>cpu->env, ARM_FEATURE_V7)) {
+prio &= 0xc0;


Rather than hard-coding this, I think we should have a
num_prio_bits field in the NVICState struct which defines
how many bits of priority are implemented. This would be
2 for v6M and 8 otherwise, and can be set in
armv7m_nvic_realize. Then the mask is
   MAKE_64BIT_MASK(8 - num_prio_bits, num_prio_bits);
(For v8M the number of priority bits is configurable.)


Do I understand correctly that the check in armv7m_nvic_realize is for
Baseline, not only v6m, because Baseline has only 4 priority levels too?
And num_prio_bits should be just a field, not a property?

Best regards, Julia Suvorova.



Re: [Qemu-devel] [PATCH for-3.0] po: Don't include comments with location

2018-07-18 Thread Philippe Mathieu-Daudé
Hi Stefan,

On 07/18/2018 03:10 AM, Stefan Weil wrote:
> Those comments change often when ui/gtk.c is changed and are not
> really useful.

Reviewed-by: Philippe Mathieu-Daudé 

> 
> Add also a new translation for German (still to be done for all other
> languages).

Do you mind moving this change in a different patch?

> 
> Signed-off-by: Stefan Weil 
> ---
> 
> CC'ing all translators because of the new string which still needs
> translations.
> 
> Regards
> Stefan
> 
>  po/Makefile|  2 +-
>  po/bg.po   | 23 ---
>  po/de_DE.po| 23 ---
>  po/fr_FR.po| 23 ---
>  po/hu.po   | 23 ---
>  po/it.po   | 23 ---
>  po/messages.po | 25 +
>  po/tr.po   | 23 ---
>  po/zh_CN.po| 23 ---
>  9 files changed, 34 insertions(+), 154 deletions(-)
> 
> diff --git a/po/Makefile b/po/Makefile
> index cc630363de..e47e262ee6 100644
> --- a/po/Makefile
> +++ b/po/Makefile
> @@ -43,7 +43,7 @@ install: $(OBJS)
>  
>  $(PO_PATH)/messages.po: $(SRC_PATH)/ui/gtk.c
>   $(call quiet-command, ( cd $(SRC_PATH) && \
> -  xgettext -o - --from-code=UTF-8 --foreign-user \
> +  xgettext -o - --from-code=UTF-8 --foreign-user --no-location \
>   --package-name=QEMU --package-version=$(VERSION) \
>   --msgid-bugs-address=qemu-devel@nongnu.org -k_ -C ui/gtk.c | \
> sed -e s/CHARSET/UTF-8/) >$@,"GEN","$@")
> diff --git a/po/bg.po b/po/bg.po
> index 279d1b864a..3d8c353372 100644
> --- a/po/bg.po
> +++ b/po/bg.po
> @@ -7,7 +7,7 @@ msgid ""
>  msgstr ""
>  "Project-Id-Version: QEMU 2.6.50\n"
>  "Report-Msgid-Bugs-To: qemu-devel@nongnu.org\n"
> -"POT-Creation-Date: 2016-12-13 21:46+\n"
> +"POT-Creation-Date: 2018-07-18 07:56+0200\n"
>  "PO-Revision-Date: 2016-06-09 15:54+0300\n"
>  "Last-Translator: Alexander Shopov \n"
>  "Language-Team: Bulgarian \n"
> @@ -17,74 +17,59 @@ msgstr ""
>  "Content-Transfer-Encoding: 8bit\n"
>  "Plural-Forms: nplurals=2; plural=(n != 1);\n"
>  
> -#: ui/gtk.c:275
>  msgid " - Press Ctrl+Alt+G to release grab"
>  msgstr " — натиснете Ctrl+Alt+G, за да освободите фокуса"
>  
> -#: ui/gtk.c:279
>  msgid " [Paused]"
>  msgstr " [пауза]"
>  
> -#: ui/gtk.c:1922
>  msgid "_Pause"
>  msgstr "_Пауза"
>  
> -#: ui/gtk.c:1928
>  msgid "_Reset"
>  msgstr "_Рестартиране"
>  
> -#: ui/gtk.c:1931
>  msgid "Power _Down"
>  msgstr "_Изключване"
>  
> -#: ui/gtk.c:1937
>  msgid "_Quit"
>  msgstr "_Спиране на програмата"
>  
> -#: ui/gtk.c:2029
>  msgid "_Fullscreen"
>  msgstr "На _цял екран"
>  
> -#: ui/gtk.c:2032
>  msgid "_Copy"
>  msgstr "_Копиране"
>  
> -#: ui/gtk.c:2048
>  msgid "Zoom _In"
>  msgstr "_Увеличаване"
>  
> -#: ui/gtk.c:2055
>  msgid "Zoom _Out"
>  msgstr "_Намаляване"
>  
> -#: ui/gtk.c:2062
>  msgid "Best _Fit"
>  msgstr "По_местване"
>  
> -#: ui/gtk.c:2069
>  msgid "Zoom To _Fit"
>  msgstr "Напас_ване"
>  
> -#: ui/gtk.c:2075
>  msgid "Grab On _Hover"
>  msgstr "Прихващане при посо_чване"
>  
> -#: ui/gtk.c:2078
>  msgid "_Grab Input"
>  msgstr "Прихващане на _фокуса"
>  
> -#: ui/gtk.c:2107
>  msgid "Show _Tabs"
>  msgstr "Подпро_зорци"
>  
> -#: ui/gtk.c:2110
>  msgid "Detach Tab"
>  msgstr "Към самостоятелен подпрозорец"
>  
> -#: ui/gtk.c:2122
> +msgid "Show Menubar"
> +msgstr ""
> +
>  msgid "_Machine"
>  msgstr "_Машина"
>  
> -#: ui/gtk.c:2127
>  msgid "_View"
>  msgstr "_Изглед"
> diff --git a/po/de_DE.po b/po/de_DE.po
> index de27fcf174..1411dc52f4 100644
> --- a/po/de_DE.po
> +++ b/po/de_DE.po
> @@ -6,7 +6,7 @@ msgid ""
>  msgstr ""
>  "Project-Id-Version: QEMU 1.4.50\n"
>  "Report-Msgid-Bugs-To: qemu-devel@nongnu.org\n"
> -"POT-Creation-Date: 2016-12-13 21:46+\n"
> +"POT-Creation-Date: 2018-07-18 07:56+0200\n"
>  "PO-Revision-Date: 2012-02-28 16:00+0100\n"
>  "Last-Translator: Kevin Wolf \n"
>  "Language-Team: Deutsch \n"
> @@ -16,74 +16,59 @@ msgstr ""
>  "Content-Transfer-Encoding: 8bit\n"
>  "Plural-Forms: nplurals=2; plural=(n!=1);\n"
>  
> -#: ui/gtk.c:275
>  msgid " - Press Ctrl+Alt+G to release grab"
>  msgstr " - Strg+Alt+G drücken, um Eingabegeräte freizugeben"
>  
> -#: ui/gtk.c:279
>  msgid " [Paused]"
>  msgstr " [Angehalten]"
>  
> -#: ui/gtk.c:1922
>  msgid "_Pause"
>  msgstr "_Angehalten"
>  
> -#: ui/gtk.c:1928
>  msgid "_Reset"
>  msgstr "_Reset"
>  
> -#: ui/gtk.c:1931
>  msgid "Power _Down"
>  msgstr "_Herunterfahren"
>  
> -#: ui/gtk.c:1937
>  msgid "_Quit"
>  msgstr "_Beenden"
>  
> -#: ui/gtk.c:2029
>  msgid "_Fullscreen"
>  msgstr "_Vollbild"
>  
> -#: ui/gtk.c:2032
>  msgid "_Copy"
>  msgstr "_Kopieren"
>  
> -#: ui/gtk.c:2048
>  msgid "Zoom _In"
>  msgstr "_Heranzoomen"
>  
> -#: ui/gtk.c:2055
>  msgid "Zoom _Out"
>  msgstr "_Wegzoomen"
>  
> -#: ui/gtk.c:2062
>  msgid "Best _Fit"
>  msgstr "_Einpassen"
>  
> -#: ui/gtk.c:2069
>  msgid "Zoom To _Fit"
>  msgstr "Auf _Fenstergröße skalieren"
>  
> -#: ui/gtk.c:2075
>  msgid "Grab On 

Re: [Qemu-devel] qemu-iotests: workaround to avoid Python3 while running tests on Fedora Docker image

2018-07-18 Thread Philippe Mathieu-Daudé
Hi Daniel,

On 07/18/2018 04:25 AM, Daniel P. Berrangé wrote:
> On Tue, Jul 17, 2018 at 08:37:58PM -0300, Philippe Mathieu-Daudé wrote:
>> I noticed this while running "make docker-test-block@fedora":
>>
>> $ make docker-test-block@fedora NETWORK=1
>>   BUILD   fedora
>> RUN test-block in qemu:fedora
>> Configure options:
>> --enable-werror --prefix=/tmp/qemu-test/install
>> --python=/usr/bin/python3 --target-list=x86_64-softmmu
>> ...
>> python/usr/bin/python3 -B
>> ...
>>
>> 194 - output mismatch (see 194.out.bad)
>> --- /tmp/qemu-test/src/tests/qemu-iotests/194.out   2018-07-17
>> 22:51:10.0 +
>> +++ /tmp/qemu-test/build/tests/qemu-iotests/194.out.bad 2018-07-17
>> 22:58:01.646916625 +
>> @@ -1,18 +1,18 @@
>>  Launching VMs...
>>  Launching NBD server on destination...
>> -{u'return': {}}
>> -{u'return': {}}
>> +{'return': {}}
>> +{'return': {}}
>>
>> and many more errors, until:
>>
>> Failures: 045 132 148 152 162 169 194 205 208 218 222
>> Failed 11 of 49 tests
>> Test failed: iotests raw
>>
>> All failures are due to Python2 syntax.
>> I started to fix but noticed there are too many and this isn't to
>> correct fix for this release.
> 
> Were they all due to the unicode string prefix, or where there
> difference failures too ?

3 more kinds:

148 [failed, exit status 1] - output mismatch (see 148.out.bad)
--- /tmp/qemu-test/src/tests/qemu-iotests/148.out   2018-07-18
13:20:17.0 +
+++ /tmp/qemu-test/build/tests/qemu-iotests/148.out.bad 2018-07-18
13:25:58.649302935 +
@@ -1,5 +1,8 @@
-..
---
-Ran 2 tests
-
-OK
+Traceback (most recent call last):
+  File "148", line 139, in 
+iotests.verify_quorum()
+  File "/tmp/qemu-test/src/tests/qemu-iotests/iotests.py", line 657, in
verify_quorum
+if not supports_quorum():
+  File "/tmp/qemu-test/src/tests/qemu-iotests/iotests.py", line 653, in
supports_quorum
+return 'quorum' in qemu_img_pipe('--help')
+TypeError: a bytes-like object is required, not 'str'

169 [failed, exit status 1] - output mismatch (see 169.out.bad)
--- /tmp/qemu-test/src/tests/qemu-iotests/169.out   2018-07-18
13:20:17.0 +
+++ /tmp/qemu-test/build/tests/qemu-iotests/169.out.bad 2018-07-18
13:26:49.594192720 +
@@ -1,5 +1,4 @@
-
---
-Ran 16 tests
-
-OK
+Traceback (most recent call last):
+  File "169", line 26, in 
+import new
+ModuleNotFoundError: No module named 'new'

205 [failed, exit status 1] - output mismatch (see 205.out.bad)
--- /tmp/qemu-test/src/tests/qemu-iotests/205.out   2018-07-18
13:20:17.0 +
+++ /tmp/qemu-test/build/tests/qemu-iotests/205.out.bad 2018-07-18
13:27:46.775187975 +
@@ -1,5 +1,97 @@
-...
+EEE
+==
+ERROR: test_connect_after_remove_default (__main__.TestNbdServerRemove)
+--
+Traceback (most recent call last):
+  File "205", line 96, in test_connect_after_remove_default
+self.do_test_connect_after_remove()
+  File "205", line 87, in do_test_connect_after_remove
+self.assertReadOk(qemu_io(*args))
+  File "205", line 71, in assertReadOk
+filter_qemu_io(qemu_io_output).strip(),
+  File "/tmp/qemu-test/src/tests/qemu-iotests/iotests.py", line 218, in
filter_qemu_io
+msg = filter_win32(msg)
+  File "/tmp/qemu-test/src/tests/qemu-iotests/iotests.py", line 214, in
filter_win32
+return win32_re.sub("", msg)
+TypeError: cannot use a string pattern on a bytes-like object



Re: [Qemu-devel] [libvirt] CPU Support

2018-07-18 Thread Eduardo Habkost
CCing the AMD people who worked on this.

On Wed, Jul 18, 2018 at 12:18:45PM +0200, Pavel Hrdina wrote:
> On Wed, Jul 18, 2018 at 10:50:34AM +0100, Daniel P. Berrangé wrote:
> > On Wed, Jul 18, 2018 at 12:41:48PM +0300, Hetz Ben Hamo wrote:
> > > Hi,
> > > 
> > > I've been looking at the CPU list and although I see lots of CPU's, I
> > > cannot find 2 CPU families:
> > > 
> > > * AMD Ryzen
> > > * AMD Threadripper
> > > 
> > > Although EPYC has been added recently.
> > > 
> > > Are there any missing details which preventing adding those CPU's to the
> > > list?
> > 
> > Libvirt adds CPU models based on what QEMU supports. So from libvirt side 
> > the
> > answer is simply that QEMU doesn't expose any models for Ryzen/Threadripper,
> > but I'm not clear why it doesn't...
> > 
> > For a while I thought Ryzen/Threadripper would have same feature set as
> > EPYC, but I've seen bugs recently suggesting that is not in fact the
> > case. So it does look like having those models exposed by QEMU might
> > be useful.
> > 
> > Copy'ing QEMU devel & the CPU model maintainers for opinions.
> 
> I think that QEMU should figure out some pattern for naming CPU models
> because it's one big mess.  EPYC and Ryzen are bad names for QEMU as
> Core/Xeon would be for Intel CPUs.  It's the name of a model families
> and it will probably remain the same but with different
> microarchitecture.
> 
> Better name would be similarly like for the latest Inter CPUs,
> Skylake-Client and Skylake-Server.  Currently AMD has already two
> microarchitectures, Zen and Zen+ and there is third one Zen 2 planned.
> 
> Zen has AMD Ryzen, AMD Ryzen Threadripper and AMD Epyc.
> Zen+ has AMD Ryzen, AMD Ryzen Threadripper
> 
> And I bet that Zen 2 will follow the same model families.
> 
> We probably cannot rename EPYC now, but before we introduce Ryzen and
> Threadripper let's thing about it and come up with better names, for
> example Zen-Client/Zen-Server Zen+-Client or something like that.
> 
> Pavel

-- 
Eduardo



Re: [Qemu-devel] [PATCH for-3.0] i386: Rename enum CacheType members

2018-07-18 Thread Moger, Babu


> -Original Message-
> From: Aleksandar Markovic [mailto:amarko...@wavecomp.com]
> Sent: Wednesday, July 18, 2018 8:35 AM
> To: Philippe Mathieu-Daudé ; Eduardo Habkost
> ; qemu-devel@nongnu.org
> Cc: Moger, Babu ; Paolo Bonzini
> ; Aurelien Jarno ; Richard
> Henderson 
> Subject: Re: [PATCH for-3.0] i386: Rename enum CacheType members
> 
> 
> On 07/17/2018 04:40 PM, Eduardo Habkost wrote:
> > Rename DCACHE to DATA_CACHE and ICACHE to INSTRUCTION_CACHE.
> >
> > This avoids conflict with Linux asm/cachectl.h macros and fixes
> > build failure on mips hosts.
> >
> > Reported-by: Philippe Mathieu-Daudé 
> > Signed-off-by: Eduardo Habkost 
> 
> Acked-by: Aleksandar Markovic 
> 
Reviewed-by: Babu Moger 




Re: [Qemu-devel] [PATCH for 3.0 3/4] tests: use error_abort in places expecting errors

2018-07-18 Thread Philippe Mathieu-Daudé
On 07/18/2018 06:38 AM, Daniel P. Berrangé wrote:
> Most of the TLS related tests are passing an in a "Error" object to
> methods that are expected to fail, but then ignoring any error that is
> set and instead asserting on a return value. This means that when an
> error is unexpectedly raised, no information about it is printed out,
> making failures hard to diagnose. Changing these tests to pass in
> _abort will make unexpected failures print messages to stderr.
> 
> Signed-off-by: Daniel P. Berrangé 

Reviewed-by: Philippe Mathieu-Daudé 
Tested-by: Philippe Mathieu-Daudé 

> ---
>  tests/test-crypto-tlscredsx509.c | 11 +
>  tests/test-crypto-tlssession.c   | 76 
>  tests/test-io-channel-tls.c  | 24 --
>  3 files changed, 39 insertions(+), 72 deletions(-)
> 
> diff --git a/tests/test-crypto-tlscredsx509.c 
> b/tests/test-crypto-tlscredsx509.c
> index af2f80e89c..30f9ac4bbf 100644
> --- a/tests/test-crypto-tlscredsx509.c
> +++ b/tests/test-crypto-tlscredsx509.c
> @@ -54,7 +54,7 @@ static QCryptoTLSCreds 
> *test_tls_creds_create(QCryptoTLSCredsEndpoint endpoint,
>  "sanity-check", "yes",
>  NULL);
>  
> -if (*errp) {
> +if (!creds) {
>  return NULL;
>  }
>  return QCRYPTO_TLS_CREDS(creds);
> @@ -74,7 +74,6 @@ static void test_tls_creds(const void *opaque)
>  struct QCryptoTLSCredsTestData *data =
>  (struct QCryptoTLSCredsTestData *)opaque;
>  QCryptoTLSCreds *creds;
> -Error *err = NULL;
>  
>  #define CERT_DIR "tests/test-crypto-tlscredsx509-certs/"
>  mkdir(CERT_DIR, 0700);
> @@ -113,17 +112,11 @@ static void test_tls_creds(const void *opaque)
>   QCRYPTO_TLS_CREDS_ENDPOINT_SERVER :
>   QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT),
>  CERT_DIR,
> -);
> +data->expectFail ? NULL : _abort);
>  
>  if (data->expectFail) {
> -error_free(err);
>  g_assert(creds == NULL);
>  } else {
> -if (err) {
> -g_printerr("Failed to generate creds: %s\n",
> -   error_get_pretty(err));
> -error_free(err);
> -}
>  g_assert(creds != NULL);
>  }
>  
> diff --git a/tests/test-crypto-tlssession.c b/tests/test-crypto-tlssession.c
> index 7bd811796e..fd9acf9067 100644
> --- a/tests/test-crypto-tlssession.c
> +++ b/tests/test-crypto-tlssession.c
> @@ -52,28 +52,21 @@ static ssize_t testRead(char *buf, size_t len, void 
> *opaque)
>  
>  static QCryptoTLSCreds *test_tls_creds_psk_create(
>  QCryptoTLSCredsEndpoint endpoint,
> -const char *dir,
> -Error **errp)
> +const char *dir)
>  {
> -Error *err = NULL;
>  Object *parent = object_get_objects_root();
>  Object *creds = object_new_with_props(
>  TYPE_QCRYPTO_TLS_CREDS_PSK,
>  parent,
>  (endpoint == QCRYPTO_TLS_CREDS_ENDPOINT_SERVER ?
>   "testtlscredsserver" : "testtlscredsclient"),
> -,
> +_abort,
>  "endpoint", (endpoint == QCRYPTO_TLS_CREDS_ENDPOINT_SERVER ?
>   "server" : "client"),
>  "dir", dir,
>  "priority", "NORMAL",
>  NULL
>  );
> -
> -if (err) {
> -error_propagate(errp, err);
> -return NULL;
> -}
>  return QCRYPTO_TLS_CREDS(creds);
>  }
>  
> @@ -87,7 +80,6 @@ static void test_crypto_tls_session_psk(void)
>  int channel[2];
>  bool clientShake = false;
>  bool serverShake = false;
> -Error *err = NULL;
>  int ret;
>  
>  /* We'll use this for our fake client-server connection */
> @@ -104,25 +96,23 @@ static void test_crypto_tls_session_psk(void)
>  
>  clientCreds = test_tls_creds_psk_create(
>  QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT,
> -WORKDIR,
> -);
> +WORKDIR);
>  g_assert(clientCreds != NULL);
>  
>  serverCreds = test_tls_creds_psk_create(
>  QCRYPTO_TLS_CREDS_ENDPOINT_SERVER,
> -WORKDIR,
> -);
> +WORKDIR);
>  g_assert(serverCreds != NULL);
>  
>  /* Now the real part of the test, setup the sessions */
>  clientSess = qcrypto_tls_session_new(
>  clientCreds, NULL, NULL,
> -QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT, );
> +QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT, _abort);
> +g_assert(clientSess != NULL);
> +
>  serverSess = qcrypto_tls_session_new(
>  serverCreds, NULL, NULL,
> -QCRYPTO_TLS_CREDS_ENDPOINT_SERVER, );
> -
> -g_assert(clientSess != NULL);
> +QCRYPTO_TLS_CREDS_ENDPOINT_SERVER, _abort);
>  g_assert(serverSess != NULL);
>  
>  /* For handshake to work, we need to set the I/O callbacks
> @@ -145,7 +135,7 @@ static void test_crypto_tls_session_psk(void)
>  int rv;
>  if (!serverShake) {
>  rv = qcrypto_tls_session_handshake(serverSess,
> -   );
> +   _abort);
>  g_assert(rv 

Re: [Qemu-devel] qemu-iotests: workaround to avoid Python3 while running tests on Fedora Docker image

2018-07-18 Thread Eduardo Habkost
On Wed, Jul 18, 2018 at 08:25:48AM +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 17, 2018 at 08:37:58PM -0300, Philippe Mathieu-Daudé wrote:
> > I noticed this while running "make docker-test-block@fedora":
> > 
> > $ make docker-test-block@fedora NETWORK=1
> >   BUILD   fedora
> > RUN test-block in qemu:fedora
> > Configure options:
> > --enable-werror --prefix=/tmp/qemu-test/install
> > --python=/usr/bin/python3 --target-list=x86_64-softmmu
> > ...
> > python/usr/bin/python3 -B
> > ...
> > 
> > 194 - output mismatch (see 194.out.bad)
> > --- /tmp/qemu-test/src/tests/qemu-iotests/194.out   2018-07-17
> > 22:51:10.0 +
> > +++ /tmp/qemu-test/build/tests/qemu-iotests/194.out.bad 2018-07-17
> > 22:58:01.646916625 +
> > @@ -1,18 +1,18 @@
> >  Launching VMs...
> >  Launching NBD server on destination...
> > -{u'return': {}}
> > -{u'return': {}}
> > +{'return': {}}
> > +{'return': {}}
> > 
> > and many more errors, until:
> > 
> > Failures: 045 132 148 152 162 169 194 205 208 218 222
> > Failed 11 of 49 tests
> > Test failed: iotests raw
> > 
> > All failures are due to Python2 syntax.
> > I started to fix but noticed there are too many and this isn't to
> > correct fix for this release.
> 
> Were they all due to the unicode string prefix, or where there
> difference failures too ?
> 
> The unicode string prefix could probably be fixed once by changing
> the log() method in iotests.py to have a built-in filter that
> chomps the leading 'u'

We can implement more consistent behavior by making the QMP
methods return a dictionary object with a custom __unicode__()
method.

Using JSON when printing QMP messages (instead of the default
repr() output) is probably a good idea everywhere (not just on
iotests).

-- 
Eduardo



  1   2   >