Re: [PATCH 02/16] vl: extract accelerator option processing to a separate function

2019-11-13 Thread Marc-André Lureau
Hi

On Wed, Nov 13, 2019 at 6:39 PM Paolo Bonzini  wrote:
>
> As a first step towards supporting multiple "-accel" options, push -icount
> and -accel semantics into a new function, and use qemu_opts_foreach to
> retrieve the key/value lists instead of stashing them into globals.
>
> Signed-off-by: Paolo Bonzini 
> ---
>  vl.c | 40 
>  1 file changed, 28 insertions(+), 12 deletions(-)
>
> diff --git a/vl.c b/vl.c
> index 841fdae..5367f23 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2827,6 +2827,33 @@ static void user_register_global_props(void)
>global_init_func, NULL, NULL);
>  }
>
> +static int do_configure_icount(void *opaque, QemuOpts *opts, Error **errp)
> +{
> +if (tcg_enabled()) {
> +configure_icount(opts, errp);
> +} else {
> +error_setg(errp, "-icount is not allowed with hardware virtualization");
> +}
> +return 0;
> +}
> +
> +static int do_configure_accelerator(void *opaque, QemuOpts *opts, Error **errp)
> +{
> +if (tcg_enabled()) {
> +qemu_tcg_configure(opts, &error_fatal);
> +}
> +return 0;
> +}
> +
> +static void configure_accelerators(void)
> +{
> +qemu_opts_foreach(qemu_find_opts("icount"),
> +  do_configure_icount, NULL, &error_fatal);
> +
> +qemu_opts_foreach(qemu_find_opts("accel"),
> +  do_configure_accelerator, NULL, &error_fatal);

It used to call qemu_tcg_configure() even when no -accel option was
given. In that case it still set mttcg_enabled = default_mttcg_enabled(),
but that is now skipped. Perhaps it could be set earlier.

> +}
> +
>  int main(int argc, char **argv, char **envp)
>  {
>  int i;
> @@ -4241,18 +4268,7 @@ int main(int argc, char **argv, char **envp)
>  qemu_spice_init();
>
>  cpu_ticks_init();
> -if (icount_opts) {
> -if (!tcg_enabled()) {
> -error_report("-icount is not allowed with hardware virtualization");
> -exit(1);
> -}
> -configure_icount(icount_opts, &error_abort);
> -qemu_opts_del(icount_opts);
> -}
> -
> -if (tcg_enabled()) {
> -qemu_tcg_configure(accel_opts, &error_fatal);
> -}
> +configure_accelerators();
>
>  if (default_net) {
>  QemuOptsList *net = qemu_find_opts("net");
> --
> 1.8.3.1
>
>
>


-- 
Marc-André Lureau



Re: [SeaBIOS] Re: [PATCH] ahci: zero-initialize port struct

2019-11-13 Thread Gerd Hoffmann
On Wed, Nov 13, 2019 at 05:03:58PM +0200, Sam Eiderman wrote:
> Hi,
> 
> Does this fix a bug that actually happened?

Yes, "make check-qtest" may fail.  It's kind of random though.

> I just noticed that in my lchs patches I assumed that lchs struct is
> zeroed out in all devices (not only ahci):

ahci was the only one not zeroing out the struct (yes, I've reviewed
them all).

> Also Gerd it seems that my lchs patches were not committed in the
> latest submitted version (v4)!!!

Whoops.  Can you send a patch seabios/master ... v4 please?

IIRC not much changed, mostly the parser function, so that
should be a lot less churn than a full revert + v4 reapply.

thanks,
  Gerd




Re: [PATCH] tests: fix modules-test 'duplicate test case' error

2019-11-13 Thread Marc-André Lureau
On Thu, Nov 14, 2019 at 1:09 AM Cole Robinson  wrote:
>
> ./configure --enable-sdl --audio-drv-list=sdl --enable-modules
>
> Will generate two identical test names: /$arch/module/load/sdl
> Which generates an error like:
>
> (tests/modules-test:23814): GLib-ERROR **: 18:23:06.359: duplicate test case path: /aarch64//module/load/sdl
>
> Add the subsystem prefix in the name as well, so instead we get:
>
> /$arch/module/load/audio-sdl
> /$arch/module/load/ui-sdl
>
> Signed-off-by: Cole Robinson 

Reviewed-by: Marc-André Lureau 
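
For illustration only (not part of the patch): a minimal sketch of how the
new format string yields distinct names. It assumes the modules table is a
flat list of (prefix, name) pairs with prefixes such as "audio-" and "ui-";
the table and the helper name below are hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical (prefix, name) pairs, flattened the way the test loop walks them. */
static const char *example_modules[] = {
    "audio-", "sdl",
    "ui-",    "sdl",
};

/* Format the test path for the pair starting at index i into buf. */
static int example_test_name(char *buf, size_t len, size_t i)
{
    return snprintf(buf, len, "/module/load/%s%s",
                    example_modules[i], example_modules[i + 1]);
}
```

With only the second element in the name, both pairs above would collapse to
the same path; with the prefix included they differ.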

> ---
>  tests/modules-test.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tests/modules-test.c b/tests/modules-test.c
> index d1a6ace218..88217686e1 100644
> --- a/tests/modules-test.c
> +++ b/tests/modules-test.c
> @@ -64,7 +64,8 @@ int main(int argc, char *argv[])
>  g_test_init(&argc, &argv, NULL);
>
>  for (i = 0; i < G_N_ELEMENTS(modules); i += 2) {
> -char *testname = g_strdup_printf("/module/load/%s", modules[i + 1]);
> +char *testname = g_strdup_printf("/module/load/%s%s",
> + modules[i], modules[i + 1]);
>  qtest_add_data_func(testname, modules + i, test_modules_load);
>  g_free(testname);
>  }
> --
> 2.23.0
>




Re: [PATCH v9 QEMU 14/15] vfio: Add ioctl to get dirty pages bitmap during dma unmap.

2019-11-13 Thread Alex Williamson
On Tue, 12 Nov 2019 22:35:23 +0530
Kirti Wankhede  wrote:

> With vIOMMU, IO virtual address range can get unmapped while in pre-copy phase
> of migration. In that case, unmap ioctl should return pages pinned in that range
> and QEMU should find its correcponding guest physical addresses and report

corresponding

> those dirty.
> 
> Note: This patch is not yet tested. I'm trying to see how I can test this code
> path.
> 
> Suggested-by: Alex Williamson 
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/common.c | 65 
> 
>  1 file changed, 61 insertions(+), 4 deletions(-)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 66f1c64bf074..dc5768219d44 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -311,11 +311,30 @@ static bool vfio_devices_are_stopped_and_saving(void)
>  return true;
>  }
>  
> +static bool vfio_devices_are_running_and_saving(void)
> +{
> +VFIOGroup *group;
> +VFIODevice *vbasedev;
> +
> +QLIST_FOREACH(group, &vfio_group_list, next) {
> +QLIST_FOREACH(vbasedev, &group->device_list, next) {
> +if ((vbasedev->device_state & VFIO_DEVICE_STATE_SAVING) &&
> +(vbasedev->device_state & VFIO_DEVICE_STATE_RUNNING)) {
> +continue;
> +} else {
> +return false;
> +}
> +}
> +}
> +return true;
> +}

I'd suggest generalizing the other function so that the caller provides
the mask and value to test for.
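
A minimal sketch of what such a generalized check could look like, assuming
the device states are available as a plain array. The EX_* bit values and
the function name are placeholders for this example, not the real uAPI
definitions or an existing QEMU helper.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in bit values for this sketch; the real ones come from the VFIO uAPI header. */
#define EX_STATE_RUNNING (1u << 0)
#define EX_STATE_SAVING  (1u << 1)

/* True only if every device state satisfies (state & mask) == value. */
static bool ex_all_devices_match(const uint32_t *states, size_t n,
                                 uint32_t mask, uint32_t value)
{
    for (size_t i = 0; i < n; i++) {
        if ((states[i] & mask) != value) {
            return false;   /* single branch into the failure case */
        }
    }
    return true;
}
```

"Running and saving" would then be mask = value = RUNNING | SAVING, and
"stopped and saving" the same mask with value = SAVING, so one function
serves both callers.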

> +
>  /*
>   * DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86
>   */
>  static int vfio_dma_unmap(VFIOContainer *container,
> -  hwaddr iova, ram_addr_t size)
> +  hwaddr iova, ram_addr_t size,
> +  VFIOGuestIOMMU *giommu)
>  {
>  struct vfio_iommu_type1_dma_unmap unmap = {
>  .argsz = sizeof(unmap),
> @@ -324,6 +343,44 @@ static int vfio_dma_unmap(VFIOContainer *container,
>  .size = size,
>  };
>  
> +if (giommu && vfio_devices_are_running_and_saving()) {
> +int ret;
> +uint64_t bitmap_size;
> +struct vfio_iommu_type1_dma_unmap_bitmap unmap_bitmap = {
> +.argsz = sizeof(unmap_bitmap),
> +.flags = VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP,
> +.iova = iova,
> +.size = size,
> +};
> +
> +bitmap_size = BITS_TO_LONGS(size >> TARGET_PAGE_BITS) *
> +  sizeof(uint64_t);
> +
> +unmap_bitmap.bitmap = g_try_malloc0(bitmap_size);
> +if (!unmap_bitmap.bitmap) {
> +error_report("%s: Error allocating bitmap buffer of size 0x%lx",
> + __func__, bitmap_size);
> +return -ENOMEM;
> +}
> +
> +unmap_bitmap.bitmap_size = bitmap_size;
> +
> +ret = ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA_GET_BITMAP,
> +&unmap_bitmap);
> +
> +if (!ret) {
> +cpu_physical_memory_set_dirty_lebitmap(
> +(uint64_t *)unmap_bitmap.bitmap,
> +giommu->iommu_offset + giommu->n.start,
> +bitmap_size >> TARGET_PAGE_BITS);

+1 Yan's comments.

> +} else {
> +error_report("VFIO_IOMMU_GET_DIRTY_BITMAP: %d %d", ret, errno);
> +}
> +
> +g_free(unmap_bitmap.bitmap);
> +return ret;
> +}
> +
>  while (ioctl(container->fd, VFIO_IOMMU_UNMAP_DMA, &unmap)) {
>  /*
>   * The type1 backend has an off-by-one bug in the kernel (71a7d3d78e3c
> @@ -371,7 +428,7 @@ static int vfio_dma_map(VFIOContainer *container, hwaddr iova,
>   * the VGA ROM space.
>   */
>  if (ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0 ||
> -(errno == EBUSY && vfio_dma_unmap(container, iova, size) == 0 &&
> +(errno == EBUSY && vfio_dma_unmap(container, iova, size, NULL) == 0 &&
>   ioctl(container->fd, VFIO_IOMMU_MAP_DMA, &map) == 0)) {
>  return 0;
>  }
> @@ -511,7 +568,7 @@ static void vfio_iommu_map_notify(IOMMUNotifier *n, IOMMUTLBEntry *iotlb)
>   iotlb->addr_mask + 1, vaddr, ret);
>  }
>  } else {
> -ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1);
> +ret = vfio_dma_unmap(container, iova, iotlb->addr_mask + 1, giommu);
>  if (ret) {
>  error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
>   "0x%"HWADDR_PRIx") = %d (%m)",
> @@ -814,7 +871,7 @@ static void vfio_listener_region_del(MemoryListener *listener,
>  }
>  
>  if (try_unmap) {
> -ret = vfio_dma_unmap(container, iova, int128_get64(llsize));
> +ret = vfio_dma_unmap(container, iova, int128_get64(llsize), NULL);
>  if (ret) {
>  error_report("vfio_dma_unmap(%p, 0x%"HWADDR_PRIx", "
>   

Re: [PATCH v9 QEMU 15/15] vfio: Make vfio-pci device migration capable.

2019-11-13 Thread Alex Williamson
On Tue, 12 Nov 2019 22:35:24 +0530
Kirti Wankhede  wrote:

> If device is not failover primary device call vfio_migration_probe()
> and vfio_migration_finalize() functions for vfio-pci device to enable
> migration for vfio PCI device which support migration.
> Removed vfio_pci_vmstate structure.
> Removed migration blocker from VFIO PCI device specific structure and use
> migration blocker from generic structure of  VFIO device.
>
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/pci.c | 30 +++---
>  hw/vfio/pci.h |  1 -
>  2 files changed, 11 insertions(+), 20 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 2c22cca0c3be..3d2ebc7abfdc 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2909,21 +2909,11 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>  return;
>  }
>  
> -if (!pdev->failover_pair_id) {
> -error_setg(&vdev->migration_blocker,
> -"VFIO device doesn't support migration");
> -ret = migrate_add_blocker(vdev->migration_blocker, &err);
> -if (err) {
> -error_propagate(errp, err);
> -error_free(vdev->migration_blocker);
> -return;
> -}
> -}
> -
>  vdev->vbasedev.name = g_path_get_basename(vdev->vbasedev.sysfsdev);
>  vdev->vbasedev.ops = &vfio_pci_ops;
>  vdev->vbasedev.type = VFIO_DEVICE_TYPE_PCI;
>  vdev->vbasedev.dev = DEVICE(vdev);
> +vdev->vbasedev.device_state = 0;

But it's not.

>  
>  tmp = g_strdup_printf("%s/iommu_group", vdev->vbasedev.sysfsdev);
>  len = readlink(tmp, group_path, sizeof(group_path));
> @@ -3184,6 +3174,14 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>  }
>  }
>  
> +if (!pdev->failover_pair_id) {
> +ret = vfio_migration_probe(&vdev->vbasedev, errp);

Hmm, I suppose doing this last keeps us from breaking failover earlier in
the series, but does it make more sense to enable it earlier, even before
it's feature complete, so that we can iteratively debug?

> +if (ret) {
> +error_report("%s: Failed to setup for migration",
> + vdev->vbasedev.name);
> +}
> +}
> +
>  vfio_register_err_notifier(vdev);
>  vfio_register_req_notifier(vdev);
>  vfio_setup_resetfn_quirk(vdev);
> @@ -3196,10 +3194,6 @@ out_teardown:
>  vfio_bars_exit(vdev);
>  error:
>  error_prepend(errp, VFIO_MSG_PREFIX, vdev->vbasedev.name);
> -if (vdev->migration_blocker) {
> -migrate_del_blocker(vdev->migration_blocker);
> -error_free(vdev->migration_blocker);
> -}
>  }
>  
>  static void vfio_instance_finalize(Object *obj)
> @@ -3207,14 +3201,11 @@ static void vfio_instance_finalize(Object *obj)
>  VFIOPCIDevice *vdev = PCI_VFIO(obj);
>  VFIOGroup *group = vdev->vbasedev.group;
>  
> +vdev->vbasedev.device_state = 0;

Nor is this accurate or meaningful unless we do actually stop the
device.

>  vfio_display_finalize(vdev);
>  vfio_bars_finalize(vdev);
>  g_free(vdev->emulated_config_bits);
>  g_free(vdev->rom);
> -if (vdev->migration_blocker) {
> -migrate_del_blocker(vdev->migration_blocker);
> -error_free(vdev->migration_blocker);
> -}
>  /*
>   * XXX Leaking igd_opregion is not an oversight, we can't remove the
>   * fw_cfg entry therefore leaking this allocation seems like the safest
> @@ -3239,6 +3230,7 @@ static void vfio_exitfn(PCIDevice *pdev)
>  }
>  vfio_teardown_msi(vdev);
>  vfio_bars_exit(vdev);
> +vfio_migration_finalize(&vdev->vbasedev);
>  }
>  
>  static void vfio_pci_reset(DeviceState *dev)
> diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h
> index b329d50338b5..834a90d64686 100644
> --- a/hw/vfio/pci.h
> +++ b/hw/vfio/pci.h
> @@ -168,7 +168,6 @@ typedef struct VFIOPCIDevice {
>  bool no_vfio_ioeventfd;
>  bool enable_ramfb;
>  VFIODisplay *dpy;
> -Error *migration_blocker;
>  } VFIOPCIDevice;
>  
>  uint32_t vfio_pci_read_config(PCIDevice *pdev, uint32_t addr, int len);




Re: [PATCH v9 QEMU 13/15] vfio: Add vfio_listener_log_sync to mark dirty pages

2019-11-13 Thread Alex Williamson
On Tue, 12 Nov 2019 22:35:22 +0530
Kirti Wankhede  wrote:

> vfio_listener_log_sync gets list of dirty pages from container using
> VFIO_IOMMU_GET_DIRTY_BITMAP ioctl and mark those pages dirty when all
> devices are stopped and saving state.
> Return early for the RAM block section of mapped MMIO region.
> 
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/common.c | 103 
> +++
>  hw/vfio/trace-events |   1 +
>  2 files changed, 104 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index ade9839c28a3..66f1c64bf074 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -29,6 +29,7 @@
>  #include "hw/vfio/vfio.h"
>  #include "exec/address-spaces.h"
>  #include "exec/memory.h"
> +#include "exec/ram_addr.h"
>  #include "hw/hw.h"
>  #include "qemu/error-report.h"
>  #include "qemu/main-loop.h"
> @@ -38,6 +39,7 @@
>  #include "sysemu/reset.h"
>  #include "trace.h"
>  #include "qapi/error.h"
> +#include "migration/migration.h"
>  
>  VFIOGroupList vfio_group_list =
>  QLIST_HEAD_INITIALIZER(vfio_group_list);
> @@ -288,6 +290,28 @@ const MemoryRegionOps vfio_region_ops = {
>  };
>  
>  /*
> + * Device state interfaces
> + */
> +
> +static bool vfio_devices_are_stopped_and_saving(void)
> +{
> +VFIOGroup *group;
> +VFIODevice *vbasedev;
> +
> +QLIST_FOREACH(group, &vfio_group_list, next) {
> +QLIST_FOREACH(vbasedev, &group->device_list, next) {
> +if ((vbasedev->device_state & VFIO_DEVICE_STATE_SAVING) &&
> +!(vbasedev->device_state & VFIO_DEVICE_STATE_RUNNING)) {

(device_state & MASK) == SAVING

> +continue;

Kind of silly to have a continue rather than just changing the polarity
of the test so that we only branch into the return case.

> +} else {
> +return false;
> +}
> +}
> +}
> +return true;
> +}
> +
> +/*
>   * DMA - Mapping and unmapping for the "type1" IOMMU interface used on x86
>   */
>  static int vfio_dma_unmap(VFIOContainer *container,
> @@ -813,9 +837,88 @@ static void vfio_listener_region_del(MemoryListener *listener,
>  }
>  }
>  
> +static int vfio_get_dirty_bitmap(VFIOContainer *container,
> + MemoryRegionSection *section)
> +{
> +struct vfio_iommu_type1_dirty_bitmap range;
> +uint64_t bitmap_size;
> +int ret;
> +
> +range.argsz = sizeof(range);
> +
> +if (memory_region_is_iommu(section->mr)) {
> +VFIOGuestIOMMU *giommu;
> +IOMMUTLBEntry iotlb;
> +
> +QLIST_FOREACH(giommu, &container->giommu_list, giommu_next) {
> +if (MEMORY_REGION(giommu->iommu) == section->mr &&
> +giommu->n.start == section->offset_within_region) {
> +break;
> +}
> +}
> +
> +if (!giommu) {
> +return -EINVAL;
> +}
> +
> +iotlb = address_space_get_iotlb_entry(container->space->as,
> +   TARGET_PAGE_ALIGN(section->offset_within_address_space),
> +   true, MEMTXATTRS_UNSPECIFIED);
> +range.iova = iotlb.iova + giommu->iommu_offset;
> +range.size = iotlb.addr_mask + 1;
> +} else {
> +range.iova = TARGET_PAGE_ALIGN(section->offset_within_address_space);
> +range.size = int128_get64(section->size);
> +}
> +
> +bitmap_size = BITS_TO_LONGS(range.size >> TARGET_PAGE_BITS) *
> +  sizeof(uint64_t);
> +
> +range.bitmap = g_try_malloc0(bitmap_size);
> +if (!range.bitmap) {
> +error_report("%s: Error allocating bitmap buffer of size 0x%lx",
> + __func__, bitmap_size);
> +return -ENOMEM;

We could certainly iterate with a smaller bitmap rather than use a
single ioctl.  This doesn't seem like it scales well as VM memory size
increases.
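
A rough sketch of the chunked approach, under the assumption that the range
can be queried piecewise. The callback type, the counting callback and the
function names are hypothetical; a real callback would allocate a bitmap
sized for one chunk and issue the ioctl for that sub-range.

```c
#include <stdint.h>

/* Hypothetical callback: fetch and apply the dirty bitmap for one sub-range. */
typedef int (*ex_sync_fn)(uint64_t iova, uint64_t size, void *opaque);

/*
 * Walk [iova, iova + size) in pieces of at most chunk bytes, so the bitmap
 * buffer is bounded by the chunk size rather than by guest memory size.
 * Stops at the first error and returns it.
 */
static int ex_sync_in_chunks(uint64_t iova, uint64_t size, uint64_t chunk,
                             ex_sync_fn fn, void *opaque)
{
    if (chunk == 0) {
        return -1;
    }
    while (size > 0) {
        uint64_t len = size < chunk ? size : chunk;
        int ret = fn(iova, len, opaque);

        if (ret) {
            return ret;
        }
        iova += len;
        size -= len;
    }
    return 0;
}

/* Example callback that only counts calls and bytes. */
struct ex_counter {
    unsigned calls;
    uint64_t bytes;
};

static int ex_count(uint64_t iova, uint64_t size, void *opaque)
{
    struct ex_counter *c = opaque;

    (void)iova;
    c->calls++;
    c->bytes += size;
    return 0;
}
```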

> +}
> +
> +range.bitmap_size = bitmap_size;
> +
> +ret = ioctl(container->fd, VFIO_IOMMU_GET_DIRTY_BITMAP, &range);
> +
> +if (!ret) {
> +cpu_physical_memory_set_dirty_lebitmap((uint64_t *)range.bitmap,
> +   TARGET_PAGE_ALIGN(section->offset_within_address_space),
> +   bitmap_size >> TARGET_PAGE_BITS);


Like Yan, I think this is relative to the iova address space and needs
a translation for the vIOMMU case.

> +} else {
> +error_report("VFIO_IOMMU_GET_DIRTY_BITMAP: %d %d", ret, errno);
> +}
> +
> +trace_vfio_get_dirty_bitmap(container->fd, range.iova, range.size,
> +bitmap_size);
> +
> +g_free(range.bitmap);
> +return ret;
> +}
> +
> +static void vfio_listerner_log_sync(MemoryListener *listener,
> +MemoryRegionSection *section)
> +{
> +VFIOContainer *container = container_of(listener, VFIOContainer, listener);
> +
> +if (memory_region_is_ram_device(section->mr)) {
> +return;
> +}
> +
> +   

Re: [PATCH v9 QEMU 12/15] vfio: Add load state functions to SaveVMHandlers

2019-11-13 Thread Alex Williamson
On Tue, 12 Nov 2019 22:35:21 +0530
Kirti Wankhede  wrote:

> Sequence  during _RESUMING device state:
> While data for this device is available, repeat below steps:
> a. read data_offset from where user application should write data.
> b. write data of data_size to migration region from data_offset.
> c. write data_size which indicates vendor driver that data is written in
>staging buffer.
> 
> For user, data is opaque. User should write data in the same order as
> received.
> 
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/migration.c  | 170 
> +++
>  hw/vfio/trace-events |   3 +
>  2 files changed, 173 insertions(+)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index f890e864e174..16e12586fe8b 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -251,6 +251,33 @@ static int vfio_save_device_config_state(QEMUFile *f, void *opaque)
>  return qemu_file_get_error(f);
>  }
>  
> +static int vfio_load_device_config_state(QEMUFile *f, void *opaque)
> +{
> +VFIODevice *vbasedev = opaque;
> +uint64_t data;
> +
> +if (vbasedev->ops && vbasedev->ops->vfio_load_config) {
> +int ret;
> +
> +ret = vbasedev->ops->vfio_load_config(vbasedev, f);
> +if (ret) {
> +error_report("%s: Failed to load device config space",
> + vbasedev->name);
> +return ret;
> +}
> +}
> +
> +data = qemu_get_be64(f);
> +if (data != VFIO_MIG_FLAG_END_OF_STATE) {
> +error_report("%s: Failed loading device config space, "
> + "end flag incorrect 0x%"PRIx64, vbasedev->name, data);
> +return -EINVAL;
> +}
> +
> +trace_vfio_load_device_config_state(vbasedev->name);
> +return qemu_file_get_error(f);
> +}
> +
>  /* -- */
>  
>  static int vfio_save_setup(QEMUFile *f, void *opaque)
> @@ -410,12 +437,155 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque)
>  return ret;
>  }
>  
> +static int vfio_load_setup(QEMUFile *f, void *opaque)
> +{
> +VFIODevice *vbasedev = opaque;
> +VFIOMigration *migration = vbasedev->migration;
> +int ret = 0;
> +
> +if (migration->region.mmaps) {
> +ret = vfio_region_mmap(&migration->region);
> +if (ret) {
> +error_report("%s: Failed to mmap VFIO migration region %d: %s",
> + vbasedev->name, migration->region.nr,
> + strerror(-ret));
> +return ret;
> +}
> +}
> +
> +ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, 0);
> +if (ret) {
> +error_report("%s: Failed to set state RESUMING", vbasedev->name);
> +}
> +return ret;
> +}
> +
> +static int vfio_load_cleanup(void *opaque)
> +{
> +vfio_save_cleanup(opaque);
> +return 0;
> +}
> +
> +static int vfio_load_state(QEMUFile *f, void *opaque, int version_id)
> +{
> +VFIODevice *vbasedev = opaque;
> +VFIOMigration *migration = vbasedev->migration;
> +int ret = 0;
> +uint64_t data, data_size;
> +
> +data = qemu_get_be64(f);
> +while (data != VFIO_MIG_FLAG_END_OF_STATE) {
> +
> +trace_vfio_load_state(vbasedev->name, data);
> +
> +switch (data) {
> +case VFIO_MIG_FLAG_DEV_CONFIG_STATE:
> +{
> +ret = vfio_load_device_config_state(f, opaque);
> +if (ret) {
> +return ret;
> +}
> +break;
> +}
> +case VFIO_MIG_FLAG_DEV_SETUP_STATE:
> +{
> +data = qemu_get_be64(f);
> +if (data == VFIO_MIG_FLAG_END_OF_STATE) {
> +return ret;
> +} else {
> +error_report("%s: SETUP STATE: EOS not found 0x%"PRIx64,
> + vbasedev->name, data);
> +return -EINVAL;
> +}
> +break;
> +}
> +case VFIO_MIG_FLAG_DEV_DATA_STATE:
> +{
> +VFIORegion *region = &migration->region;
> +void *buf = NULL;
> +bool buffer_mmaped = false;
> +uint64_t data_offset = 0;
> +
> +data_size = qemu_get_be64(f);
> +if (data_size == 0) {
> +break;

We're not writing data_size = 0 to the migration region, so these
aren't used for synchronization. Why are we writing them into the
migration stream?

> +}
> +
> +ret = pread(vbasedev->fd, &data_offset, sizeof(data_offset),
> +region->fd_offset +
> +offsetof(struct vfio_device_migration_info,
> +data_offset));
> +if (ret != sizeof(data_offset)) {
> +error_report("%s:Failed to get migration buffer data offset %d",
> + vbasedev->name, ret);
> +return -EINVAL;
> 

Re: [PATCH v9 QEMU 10/15] vfio: Register SaveVMHandlers for VFIO device

2019-11-13 Thread Alex Williamson
On Tue, 12 Nov 2019 22:35:19 +0530
Kirti Wankhede  wrote:

> Define flags to be used as delimiter in migration file stream.
> Added .save_setup and .save_cleanup functions. Mapped & unmapped migration
> region from these functions at source during saving or pre-copy phase.
> Set VFIO device state depending on VM's state. During live migration, VM is
> running when .save_setup is called, _SAVING | _RUNNING state is set for VFIO
> device. During save-restore, VM is paused, _SAVING state is set for VFIO device.
> 
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/migration.c  | 70 
> 
>  hw/vfio/trace-events |  2 ++
>  2 files changed, 72 insertions(+)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 7e7aeb58647e..48aac6d29876 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -8,6 +8,7 @@
>   */
>  
>  #include "qemu/osdep.h"
> +#include "qemu/main-loop.h"
>  #include 
>  
>  #include "sysemu/runstate.h"
> @@ -24,6 +25,17 @@
>  #include "pci.h"
>  #include "trace.h"
>  
> +/*
> + * Flags used as delimiter:
> + * 0x => MSB 32-bit all 1s
> + * 0xef10 => emulated (virtual) function IO
> + * 0x => 16-bits reserved for flags
> + */
> +#define VFIO_MIG_FLAG_END_OF_STATE  (0xef11ULL)
> +#define VFIO_MIG_FLAG_DEV_CONFIG_STATE  (0xef12ULL)
> +#define VFIO_MIG_FLAG_DEV_SETUP_STATE   (0xef13ULL)
> +#define VFIO_MIG_FLAG_DEV_DATA_STATE(0xef14ULL)
> +
>  static void vfio_migration_region_exit(VFIODevice *vbasedev)
>  {
>  VFIOMigration *migration = vbasedev->migration;
> @@ -108,6 +120,63 @@ static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t set_flags,
>  return 0;
>  }
>  
> +/* -- */
> +
> +static int vfio_save_setup(QEMUFile *f, void *opaque)
> +{
> +VFIODevice *vbasedev = opaque;
> +VFIOMigration *migration = vbasedev->migration;
> +int ret;
> +
> +qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE);
> +
> +if (migration->region.mmaps) {
> +qemu_mutex_lock_iothread();
> +ret = vfio_region_mmap(&migration->region);
> +qemu_mutex_unlock_iothread();

Please add comment indicating why the iothread mutex handling is
necessary.

> +if (ret) {
> +error_report("%s: Failed to mmap VFIO migration region %d: %s",
> + vbasedev->name, migration->region.index,
> + strerror(-ret));
> +return ret;

mmaps are optional for the user, right?  This seems like a continue'able error.

> +}
> +}
> +
> +ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_SAVING, 0);
> +if (ret) {
> +error_report("%s: Failed to set state SAVING", vbasedev->name);
> +return ret;
> +}
> +
> +qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE);

Why have we bothered to write anything into the migration stream yet?

> +
> +ret = qemu_file_get_error(f);
> +if (ret) {
> +return ret;
> +}
> +
> +trace_vfio_save_setup(vbasedev->name);
> +return 0;
> +}
> +
> +static void vfio_save_cleanup(void *opaque)
> +{
> +VFIODevice *vbasedev = opaque;
> +VFIOMigration *migration = vbasedev->migration;
> +
> +if (migration->region.mmaps) {
> +vfio_region_unmap(&migration->region);
> +}
> +trace_vfio_save_cleanup(vbasedev->name);

We don't need to touch device_state here?

> +}
> +
> +static SaveVMHandlers savevm_vfio_handlers = {
> +.save_setup = vfio_save_setup,
> +.save_cleanup = vfio_save_cleanup,
> +};
> +
> +/* -- */
> +
>  static void vfio_vmstate_change(void *opaque, int running, RunState state)
>  {
>  VFIODevice *vbasedev = opaque;
> @@ -171,6 +240,7 @@ static int vfio_migration_init(VFIODevice *vbasedev,
>  return ret;
>  }
>  
> +register_savevm_live("vfio", -1, 1, &savevm_vfio_handlers, vbasedev);
>  vbasedev->vm_state = qemu_add_vm_change_state_handler(vfio_vmstate_change,
>vbasedev);
>  
> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> index 69503228f20e..4bb43f18f315 100644
> --- a/hw/vfio/trace-events
> +++ b/hw/vfio/trace-events
> @@ -149,3 +149,5 @@ vfio_migration_probe(char *name, uint32_t index) " (%s) Region %d"
>  vfio_migration_set_state(char *name, uint32_t state) " (%s) state %d"
>  vfio_vmstate_change(char *name, int running, const char *reason, uint32_t dev_state) " (%s) running %d reason %s device state %d"
>  vfio_migration_state_notifier(char *name, int state) " (%s) state %d"
> +vfio_save_setup(char *name) " (%s)"
> +vfio_save_cleanup(char *name) " (%s)"




Re: [PATCH v9 QEMU 11/15] vfio: Add save state functions to SaveVMHandlers

2019-11-13 Thread Alex Williamson
On Tue, 12 Nov 2019 22:35:20 +0530
Kirti Wankhede  wrote:

> Added .save_live_pending, .save_live_iterate and .save_live_complete_precopy
> functions. These functions handle the pre-copy and stop-and-copy phases.
> 
> In _SAVING|_RUNNING device state or pre-copy phase:
> - read pending_bytes. If pending_bytes > 0, go through below steps.
> - read data_offset - indicates kernel driver to write data to staging
>   buffer.
> - read data_size - amount of data in bytes written by vendor driver in
>   migration region.
> - read data_size bytes of data from data_offset in the migration region.
> - Write data packet to file stream as below:
> {VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data,
> VFIO_MIG_FLAG_END_OF_STATE }
> 
> In _SAVING device state or stop-and-copy phase
> a. read config space of device and save to migration file stream. This
>doesn't need to be from vendor driver. Any other special config state
>from driver can be saved as data in following iteration.
> b. read pending_bytes. If pending_bytes > 0, go through below steps.
> c. read data_offset - indicates kernel driver to write data to staging
>buffer.
> d. read data_size - amount of data in bytes written by vendor driver in
>migration region.
> e. read data_size bytes of data from data_offset in the migration region.
> f. Write data packet as below:
>{VFIO_MIG_FLAG_DEV_DATA_STATE, data_size, actual data}
> g. iterate through steps b to f while (pending_bytes > 0)
> h. Write {VFIO_MIG_FLAG_END_OF_STATE}
> 
> When data region is mapped, its user's responsibility to read data from

s/mapped/made available/

"mapped" is confusing given the mmap'd features.

> data_offset of data_size before moving to next steps.
>
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/migration.c  | 245 
> ++-
>  hw/vfio/trace-events |   6 ++
>  2 files changed, 250 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index 48aac6d29876..f890e864e174 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -120,6 +120,137 @@ static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t set_flags,
>  return 0;
>  }
>  
> +static void *find_data_region(VFIORegion *region,
> +  uint64_t data_offset,
> +  uint64_t data_size)
> +{
> +void *ptr = NULL;
> +int i;
> +
> +for (i = 0; i < region->nr_mmaps; i++) {
> +if ((data_offset >= region->mmaps[i].offset) &&
> +(data_offset < region->mmaps[i].offset + region->mmaps[i].size) &&
> +(data_size <= region->mmaps[i].size)) {

data_offset is determined to live somewhere within the mmap and
data_size is independently determined to be smaller than the entire
mmaps size.  This is broken.

> +ptr = region->mmaps[i].mmap + (data_offset -
> +   region->mmaps[i].offset);

If the data offset is mmap'd, this gives us a pointer to the start, but
we have no idea if the entire range is accessible via this pointer, nor
does the API require it to be.
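
To make the concern concrete, here is a sketch of a containment test that
checks the whole range against a single mapping. The function name and the
flat parameters are assumptions for illustration; whether falling back to
pread() for a range that straddles mappings is the right policy is a
separate question.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if [data_offset, data_offset + data_size) lies entirely inside the
 * mapping [map_offset, map_offset + map_size). Written with subtraction so
 * that none of the sums can overflow.
 */
static bool ex_range_in_mmap(uint64_t map_offset, uint64_t map_size,
                             uint64_t data_offset, uint64_t data_size)
{
    uint64_t skip;

    if (data_offset < map_offset) {
        return false;
    }
    skip = data_offset - map_offset;      /* bytes into the mapping */
    if (skip > map_size) {
        return false;
    }
    return data_size <= map_size - skip;  /* remaining bytes must cover it */
}
```

A range starting halfway into a mapping with data_size equal to the full
mapping size passes the test in the patch but fails this one.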

> +break;
> +}
> +}
> +return ptr;
> +}
> +
> +static int vfio_save_buffer(QEMUFile *f, VFIODevice *vbasedev)
> +{
> +VFIOMigration *migration = vbasedev->migration;
> +VFIORegion *region = &migration->region;
> +uint64_t data_offset = 0, data_size = 0;
> +int ret;
> +
> +ret = pread(vbasedev->fd, &data_offset, sizeof(data_offset),
> +region->fd_offset + offsetof(struct vfio_device_migration_info,
> + data_offset));
> +if (ret != sizeof(data_offset)) {
> +error_report("%s: Failed to get migration buffer data offset %d",
> + vbasedev->name, ret);
> +return -EINVAL;
> +}
> +
> +ret = pread(vbasedev->fd, &data_size, sizeof(data_size),
> +region->fd_offset + offsetof(struct vfio_device_migration_info,
> + data_size));
> +if (ret != sizeof(data_size)) {
> +error_report("%s: Failed to get migration buffer data size %d",
> + vbasedev->name, ret);
> +return -EINVAL;
> +}
> +
> +if (data_size > 0) {
> +void *buf = NULL;
> +bool buffer_mmaped;
> +
> +if (region->mmaps) {
> +buf = find_data_region(region, data_offset, data_size);
> +}
> +
> +buffer_mmaped = (buf != NULL) ? true : false;
> +
> +if (!buffer_mmaped) {
> +buf = g_try_malloc0(data_size);
> +if (!buf) {
> +error_report("%s: Error allocating buffer ", __func__);
> +return -ENOMEM;
> +}
> +
> +ret = pread(vbasedev->fd, buf, data_size,
> +region->fd_offset + data_offset);
> +if (ret != data_size) {
> +

Re: [PATCH v9 QEMU 08/15] vfio: Add VM state change handler to know state of VM

2019-11-13 Thread Alex Williamson
On Tue, 12 Nov 2019 22:35:17 +0530
Kirti Wankhede  wrote:

> VM state change handler gets called on change in VM's state. This is used to set
> VFIO device state to _RUNNING.
> 
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/migration.c   | 69 
> +++
>  hw/vfio/trace-events  |  2 ++
>  include/hw/vfio/vfio-common.h |  4 +++
>  3 files changed, 75 insertions(+)
> 
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> index c17bd1b0b934..28981a759e6c 100644
> --- a/hw/vfio/migration.c
> +++ b/hw/vfio/migration.c
> @@ -10,6 +10,7 @@
>  #include "qemu/osdep.h"
>  #include 
>  
> +#include "sysemu/runstate.h"
>  #include "hw/vfio/vfio-common.h"
>  #include "cpu.h"
>  #include "migration/migration.h"
> @@ -74,6 +75,67 @@ err:
>  return ret;
>  }
>  
> +static int vfio_migration_set_state(VFIODevice *vbasedev, uint32_t set_flags,
> +uint32_t clear_flags)
> +{

Perhaps a mask and value interface like we have elsewhere?

> +VFIOMigration *migration = vbasedev->migration;
> +VFIORegion *region = &migration->region;
> +uint32_t device_state;
> +int ret = 0;
> +
> +/* same flags should not be set or clear */
> +assert(!(set_flags & clear_flags));

mask/value avoids this sort of thing.
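
As a sketch of the mask/value form (the helper name is made up for this
example): the caller names the bits it wants to control and the value those
bits should take, so there is no way to ask for the same bit to be both set
and cleared.

```c
#include <stdint.h>

/* Replace the bits selected by mask with value, leaving all other bits alone. */
static uint32_t ex_apply_state(uint32_t old_state, uint32_t mask, uint32_t value)
{
    return (old_state & ~mask) | (value & mask);
}
```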

> +device_state = (vbasedev->device_state | set_flags) & ~clear_flags;

Don't we need to re-read device_state from the region?  We can't
predict what those reserved bits will be used for, they could be
volatile.  If we adopt that a reset returns to running, our cached
state may be stale.

> +
> +switch (device_state & VFIO_DEVICE_STATE_MASK) {
> +case VFIO_DEVICE_STATE_INVALID_CASE1:
> +case VFIO_DEVICE_STATE_INVALID_CASE2:
> +return -EINVAL;
> +}

I like the VALID macro better.

> +
> +ret = pwrite(vbasedev->fd, &device_state, sizeof(device_state),
> + region->fd_offset + offsetof(struct vfio_device_migration_info,
> +  device_state));
> +if (ret < 0) {
> +error_report("%s: Failed to set device state %d %s",
> + vbasedev->name, ret, strerror(errno));
> +return ret;
> +}
> +
> +vbasedev->device_state = device_state;

Are we opposed to re-reading device_state, here and in the error case
above?

> +trace_vfio_migration_set_state(vbasedev->name, device_state);
> +return 0;
> +}
> +
> +static void vfio_vmstate_change(void *opaque, int running, RunState state)
> +{
> +VFIODevice *vbasedev = opaque;
> +
> +if ((vbasedev->vm_running != running)) {
> +int ret;
> +uint32_t set_flags = 0, clear_flags = 0;
> +
> +if (running) {
> +set_flags = VFIO_DEVICE_STATE_RUNNING;
> +if (vbasedev->device_state & VFIO_DEVICE_STATE_RESUMING) {
> +clear_flags = VFIO_DEVICE_STATE_RESUMING;
> +}
> +} else {
> +clear_flags = VFIO_DEVICE_STATE_RUNNING;
> +}
> +
> +ret = vfio_migration_set_state(vbasedev, set_flags, clear_flags);
> +if (ret) {
> +error_report("%s: Failed to set device state 0x%x",
> + vbasedev->name, set_flags & ~clear_flags);
> +}
> +vbasedev->vm_running = running;

We're effectively storing running both in vbasedev->device_state and
vbasedev->vm_running, why?

Seems like this could trivially know the initial state of the device is
running.

> +trace_vfio_vmstate_change(vbasedev->name, running, 
> RunState_str(state),
> +  set_flags & ~clear_flags);
> +}
> +}
> +
>  static int vfio_migration_init(VFIODevice *vbasedev,
> struct vfio_region_info *info)
>  {
> @@ -89,6 +151,9 @@ static int vfio_migration_init(VFIODevice *vbasedev,
>  return ret;
>  }
>  
> +vbasedev->vm_state = 
> qemu_add_vm_change_state_handler(vfio_vmstate_change,
> +  vbasedev);
> +
>  return 0;
>  }
>  
> @@ -127,6 +192,10 @@ add_blocker:
>  
>  void vfio_migration_finalize(VFIODevice *vbasedev)
>  {
> +if (vbasedev->vm_state) {
> +qemu_del_vm_change_state_handler(vbasedev->vm_state);
> +}
> +
>  if (vbasedev->migration_blocker) {
>  migrate_del_blocker(vbasedev->migration_blocker);
>  error_free(vbasedev->migration_blocker);
> diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events
> index 191a726a1312..3d15bacd031a 100644
> --- a/hw/vfio/trace-events
> +++ b/hw/vfio/trace-events
> @@ -146,3 +146,5 @@ vfio_display_edid_write_error(void) ""
>  
>  # migration.c
>  vfio_migration_probe(char *name, uint32_t index) " (%s) Region %d"
> +vfio_migration_set_state(char *name, uint32_t state) " (%s) state %d"
> +vfio_vmstate_change(char *name, int running, const char *reason, uint32_t 
> dev_state) " (%s) running %d reason %s device state %d"

Re: [PATCH v9 QEMU 07/15] vfio: Add migration region initialization and finalize function

2019-11-13 Thread Alex Williamson
On Tue, 12 Nov 2019 22:35:16 +0530
Kirti Wankhede  wrote:

> - Migration functions are implemented for VFIO_DEVICE_TYPE_PCI device in this
>   patch series.
> - Whether a VFIO device supports migration is decided based on the migration
>   region query. If the migration region query and the migration region
>   initialization both succeed, migration is supported; otherwise migration
>   is blocked.
> 
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/Makefile.objs |   2 +-
>  hw/vfio/migration.c   | 137 
> ++
>  hw/vfio/trace-events  |   3 +
>  include/hw/vfio/vfio-common.h |  10 +++
>  4 files changed, 151 insertions(+), 1 deletion(-)
>  create mode 100644 hw/vfio/migration.c
> 
> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
> index abad8b818c9b..36033d1437c5 100644
> --- a/hw/vfio/Makefile.objs
> +++ b/hw/vfio/Makefile.objs
> @@ -1,4 +1,4 @@
> -obj-y += common.o spapr.o
> +obj-y += common.o spapr.o migration.o
>  obj-$(CONFIG_VFIO_PCI) += pci.o pci-quirks.o display.o
>  obj-$(CONFIG_VFIO_CCW) += ccw.o
>  obj-$(CONFIG_VFIO_PLATFORM) += platform.o
> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
> new file mode 100644
> index ..c17bd1b0b934
> --- /dev/null
> +++ b/hw/vfio/migration.c
> @@ -0,0 +1,137 @@
> +/*
> + * Migration support for VFIO devices
> + *
> + * Copyright NVIDIA, Inc. 2019
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the COPYING file in the top-level directory.
> + */
> +
> +#include "qemu/osdep.h"
> +#include 
> +
> +#include "hw/vfio/vfio-common.h"
> +#include "cpu.h"
> +#include "migration/migration.h"
> +#include "migration/qemu-file.h"
> +#include "migration/register.h"
> +#include "migration/blocker.h"
> +#include "migration/misc.h"
> +#include "qapi/error.h"
> +#include "exec/ramlist.h"
> +#include "exec/ram_addr.h"
> +#include "pci.h"
> +#include "trace.h"
> +
> +static void vfio_migration_region_exit(VFIODevice *vbasedev)
> +{
> +VFIOMigration *migration = vbasedev->migration;
> +
> +if (!migration) {
> +return;
> +}
> +
> +if (migration->region.size) {
> +vfio_region_exit(>region);
> +vfio_region_finalize(>region);
> +}
> +}
> +
> +static int vfio_migration_region_init(VFIODevice *vbasedev, int index)
> +{
> +VFIOMigration *migration = vbasedev->migration;
> +Object *obj = NULL;
> +int ret = -EINVAL;
> +
> +if (!vbasedev->ops || !vbasedev->ops->vfio_get_object) {

Is it possible not to have vbasedev->ops?

> +return ret;
> +}
> +
> +obj = vbasedev->ops->vfio_get_object(vbasedev);
> +if (!obj) {
> +return ret;
> +}
> +
> +ret = vfio_region_setup(obj, vbasedev, >region, index,
> +"migration");
> +if (ret) {
> +error_report("%s: Failed to setup VFIO migration region %d: %s",
> + vbasedev->name, index, strerror(-ret));
> +goto err;
> +}
> +
> +if (!migration->region.size) {
> +ret = -EINVAL;
> +error_report("%s: Invalid region size of VFIO migration region %d: 
> %s",
> + vbasedev->name, index, strerror(-ret));
> +goto err;
> +}
> +
> +return 0;
> +
> +err:
> +vfio_migration_region_exit(vbasedev);
> +return ret;
> +}
> +
> +static int vfio_migration_init(VFIODevice *vbasedev,
> +   struct vfio_region_info *info)
> +{
> +int ret;
> +
> +vbasedev->migration = g_new0(VFIOMigration, 1);
> +
> +ret = vfio_migration_region_init(vbasedev, info->index);
> +if (ret) {
> +error_report("%s: Failed to initialise migration region",
> + vbasedev->name);
> +g_free(vbasedev->migration);

Note that vbasedev->migration is not NULL, so calling
vfio_migration_region_exit() at this point will be a use-after-free
error.
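
A minimal sketch of the usual remedy, clearing the pointer after freeing it
so the exit path's NULL check still holds. The demo_* types and functions
are hypothetical stand-ins for VFIODevice/VFIOMigration, and plain free()
stands in for g_free().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the structures in the patch. */
struct demo_migration { size_t region_size; };
struct demo_device { struct demo_migration *migration; };

/* Exit path: tolerates a device whose migration state is absent. */
static void demo_region_exit(struct demo_device *dev)
{
    if (!dev->migration) {
        return;
    }
    dev->migration->region_size = 0;
}

/*
 * Failure path: free the state and clear the pointer, so a later call to
 * demo_region_exit() returns early instead of touching freed memory.
 */
static void demo_init_failed(struct demo_device *dev)
{
    free(dev->migration);
    dev->migration = NULL;
}
```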

> +return ret;
> +}
> +
> +return 0;
> +}
> +
> +/* -- */
> +
> +int vfio_migration_probe(VFIODevice *vbasedev, Error **errp)
> +{
> +struct vfio_region_info *info;
> +Error *local_err = NULL;
> +int ret;
> +
> +ret = vfio_get_dev_region_info(vbasedev, VFIO_REGION_TYPE_MIGRATION,
> +   VFIO_REGION_SUBTYPE_MIGRATION, );
> +if (ret) {
> +goto add_blocker;
> +}
> +
> +ret = vfio_migration_init(vbasedev, info);
> +if (ret) {
> +goto add_blocker;
> +}
> +
> +trace_vfio_migration_probe(vbasedev->name, info->index);
> +return 0;
> +
> +add_blocker:
> +error_setg(>migration_blocker,
> +   "VFIO device doesn't support migration");
> +ret = migrate_add_blocker(vbasedev->migration_blocker, _err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +error_free(vbasedev->migration_blocker);
> +}

This won't get 

Re: [PATCH v9 QEMU 06/15] vfio: Add save and load functions for VFIO PCI devices

2019-11-13 Thread Alex Williamson
On Tue, 12 Nov 2019 22:35:15 +0530
Kirti Wankhede  wrote:

> These functions save and restore PCI device specific data - config
> space of PCI device.
> Tested save and restore with MSI and MSIX type.
> 
> Signed-off-by: Kirti Wankhede 
> Reviewed-by: Neo Jia 
> ---
>  hw/vfio/pci.c | 168 
> ++
>  include/hw/vfio/vfio-common.h |   2 +
>  2 files changed, 170 insertions(+)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 4ae02e71622a..2c22cca0c3be 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -41,6 +41,7 @@
>  #include "trace.h"
>  #include "qapi/error.h"
>  #include "migration/blocker.h"
> +#include "migration/qemu-file.h"
>  
>  #define TYPE_VFIO_PCI "vfio-pci"
>  #define PCI_VFIO(obj)OBJECT_CHECK(VFIOPCIDevice, obj, TYPE_VFIO_PCI)
> @@ -1620,6 +1621,55 @@ static void vfio_bars_prepare(VFIOPCIDevice *vdev)
>  }
>  }
>  
> +static int vfio_bar_validate(VFIOPCIDevice *vdev, int nr)
> +{
> +PCIDevice *pdev = >pdev;
> +VFIOBAR *bar = >bars[nr];
> +uint64_t addr;
> +uint32_t addr_lo, addr_hi = 0;
> +
> +/* Skip unimplemented BARs and the upper half of 64bit BARS. */
> +if (!bar->size) {
> +return 0;
> +}
> +
> +/* skip IO BAR */
> +if (bar->ioport) {
> +return 0;
> +}

Why?

> +
> +addr_lo = pci_default_read_config(pdev, PCI_BASE_ADDRESS_0 + nr * 4, 4);
> +
> +addr_lo = addr_lo & (bar->ioport ? PCI_BASE_ADDRESS_IO_MASK :
> +   PCI_BASE_ADDRESS_MEM_MASK);

And if we've skipped IO BARs above, why are we checking for them here?

> +if (bar->type == PCI_BASE_ADDRESS_MEM_TYPE_64) {
> +addr_hi = pci_default_read_config(pdev,
> + PCI_BASE_ADDRESS_0 + (nr + 1) * 4, 
> 4);
> +}
> +
> +addr = ((uint64_t)addr_hi << 32) | addr_lo;
> +
> +if (!QEMU_IS_ALIGNED(addr, bar->size)) {
> +return -EINVAL;
> +}

Why is this function even necessary?

> +
> +return 0;
> +}
> +
> +static int vfio_bars_validate(VFIOPCIDevice *vdev)
> +{
> +int i, ret;
> +
> +for (i = 0; i < PCI_ROM_SLOT; i++) {
> +ret = vfio_bar_validate(vdev, i);
> +if (ret) {
> +error_report("vfio: BAR address %d validation failed", i);
> +return ret;
> +}
> +}
> +return 0;
> +}
> +
>  static void vfio_bar_register(VFIOPCIDevice *vdev, int nr)
>  {
>  VFIOBAR *bar = >bars[nr];
> @@ -2402,11 +2452,129 @@ static Object *vfio_pci_get_object(VFIODevice 
> *vbasedev)
>  return OBJECT(vdev);
>  }
>  
> +static void vfio_pci_save_config(VFIODevice *vbasedev, QEMUFile *f)
> +{
> +VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
> +PCIDevice *pdev = >pdev;
> +uint16_t pci_cmd;
> +int i;
> +

Is the basis for what we're selecting to save and restore based
primarily on vfio_pci_write_config()?  I'm nervous about what we're
choosing to save/load and why it isn't more extensive.

> +for (i = 0; i < PCI_ROM_SLOT; i++) {
> +uint32_t bar;
> +
> +bar = pci_default_read_config(pdev, PCI_BASE_ADDRESS_0 + i * 4, 4);
> +qemu_put_be32(f, bar);
> +}
> +
> +qemu_put_be32(f, vdev->interrupt);
> +if (vdev->interrupt == VFIO_INT_MSI) {
> +uint32_t msi_flags, msi_addr_lo, msi_addr_hi = 0, msi_data;
> +bool msi_64bit;
> +
> +msi_flags = pci_default_read_config(pdev, pdev->msi_cap + 
> PCI_MSI_FLAGS,
> +2);
> +msi_64bit = (msi_flags & PCI_MSI_FLAGS_64BIT);
> +
> +msi_addr_lo = pci_default_read_config(pdev,
> + pdev->msi_cap + PCI_MSI_ADDRESS_LO, 
> 4);
> +qemu_put_be32(f, msi_addr_lo);
> +
> +if (msi_64bit) {
> +msi_addr_hi = pci_default_read_config(pdev,
> + pdev->msi_cap + 
> PCI_MSI_ADDRESS_HI,
> + 4);
> +}
> +qemu_put_be32(f, msi_addr_hi);
> +
> +msi_data = pci_default_read_config(pdev,
> +pdev->msi_cap + (msi_64bit ? PCI_MSI_DATA_64 : 
> PCI_MSI_DATA_32),
> +2);
> +qemu_put_be32(f, msi_data);
> +} else if (vdev->interrupt == VFIO_INT_MSIX) {
> +uint16_t offset;
> +
> +/* save enable bit and maskall bit */
> +offset = pci_default_read_config(pdev,
> +   pdev->msix_cap + PCI_MSIX_FLAGS + 1, 
> 2);
> +qemu_put_be16(f, offset);
> +msix_save(pdev, f);
> +}
> +pci_cmd = pci_default_read_config(pdev, PCI_COMMAND, 2);
> +qemu_put_be16(f, pci_cmd);
> +}
> +
> +static int vfio_pci_load_config(VFIODevice *vbasedev, QEMUFile *f)
> +{
> +VFIOPCIDevice *vdev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
> +PCIDevice *pdev = >pdev;
> +uint32_t interrupt_type;
> +uint32_t msi_flags, 

[Bug 1841592] Re: ppc: softfloat float implementation issues

2019-11-13 Thread Launchpad Bug Tracker
[Expired for QEMU because there has been no activity for 60 days.]

** Changed in: qemu
   Status: Incomplete => Expired

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1841592

Title:
  ppc: softfloat float implementation issues

Status in QEMU:
  Expired

Bug description:
  Per bug #1841491, Richard Henderson (rth) said:
  > The float test failure is part of a larger problem for target/powerpc
  > in which all float routines are implemented incorrectly. They are all
  > implemented as double operations with rounding to float as a second
  > step. Which not only produces incorrect exceptions, as in this case,
  > but incorrect numerical results from the double rounding.
  >
  > This should probably be split to a separate bug...

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1841592/+subscriptions



Re: [PATCH] tests/vm: update openbsd to release 6.6

2019-11-13 Thread Brad Smith

Thanks.

Reviewed-by: Brad Smith 

On 10/18/2019 6:24 AM, Gerd Hoffmann wrote:

Signed-off-by: Gerd Hoffmann 
---
  tests/vm/openbsd | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tests/vm/openbsd b/tests/vm/openbsd
index b92c39f89a6f..9f82cd459fde 100755
--- a/tests/vm/openbsd
+++ b/tests/vm/openbsd
@@ -22,8 +22,8 @@ class OpenBSDVM(basevm.BaseVM):
  name = "openbsd"
  arch = "x86_64"
  
-link = "https://cdn.openbsd.org/pub/OpenBSD/6.5/amd64/install65.iso"
-csum = "38d1f8cadd502f1c27bf05c5abde6cc505dd28f3f34f8a941048ff9a54f9f608"
+link = "https://cdn.openbsd.org/pub/OpenBSD/6.6/amd64/install66.iso"
+csum = "b22e63df56e6266de6bbeed8e9be0fbe9ee2291551c5bc03f3cc2e4ab9436ee3"
  size = "20G"
  pkgs = [
  # tools


Re: [PATCH v3 for-4.2 0/4] Better NBD string length handling

2019-11-13 Thread Eric Blake

On 11/13/19 9:00 PM, no-re...@patchew.org wrote:

Patchew URL: https://patchew.org/QEMU/20191114024635.11363-1-ebl...@redhat.com/





  from /tmp/qemu-test/src/include/qemu/osdep.h:140,
  from /tmp/qemu-test/src/nbd/server.c:20:
/tmp/qemu-test/src/nbd/server.c: In function 'nbd_negotiate_handle_export_name':
/usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/glib-autocleanups.h:28:3: error: 'name' may be used uninitialized in this function [-Werror=maybe-uninitialized]
g_free (*pp);
^~~~
/tmp/qemu-test/src/nbd/server.c:435:22: note: 'name' was declared here
  g_autofree char *name;
   ^~~~


Ha - I posted the fix to that one minute before patchew flagged it. 
Still, I'm not sure why my gcc didn't flag this locally, while the mingw 
builder did.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v3 for-4.2 0/4] Better NBD string length handling

2019-11-13 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20191114024635.11363-1-ebl...@redhat.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the
testing commands and their output below. If you have Docker installed,
you can probably reproduce it locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

 from /tmp/qemu-test/src/include/qemu/osdep.h:140,
 from /tmp/qemu-test/src/nbd/server.c:20:
/tmp/qemu-test/src/nbd/server.c: In function 'nbd_negotiate_handle_export_name':
/usr/x86_64-w64-mingw32/sys-root/mingw/include/glib-2.0/glib/glib-autocleanups.h:28:3: error: 'name' may be used uninitialized in this function [-Werror=maybe-uninitialized]
   g_free (*pp);
   ^~~~
/tmp/qemu-test/src/nbd/server.c:435:22: note: 'name' was declared here
 g_autofree char *name;
  ^~~~
cc1: all warnings being treated as errors
make: *** [/tmp/qemu-test/src/rules.mak:69: nbd/server.o] Error 1
make: *** Waiting for unfinished jobs
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 662, in 
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=960f572775044fbdbd74adcaff59712e', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-07kxi3y_/src/docker-src.2019-11-13-21.57.58.24536:/var/tmp/qemu:z,ro',
 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=960f572775044fbdbd74adcaff59712e
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-07kxi3y_/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    2m33.836s
user    0m7.592s


The full log is available at
http://patchew.org/logs/20191114024635.11363-1-ebl...@redhat.com/testing.docker-mingw@fedora/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

Re: [PATCH v3 1/4] nbd/server: Prefer heap over stack for parsing client names

2019-11-13 Thread Eric Blake

On 11/13/19 8:46 PM, Eric Blake wrote:

As long as we limit NBD names to 256 bytes (the bare minimum permitted
by the standard), stack-allocation works for parsing a name received
from the client.  But as mentioned in a comment, we eventually want to
permit up to the 4k maximum of the NBD standard, which is too large
for stack allocation; so switch everything in the server to use heap
allocation.  For now, there is no change in actually supported name
length.

Signed-off-by: Eric Blake 
---
  include/block/nbd.h | 10 +-
  nbd/server.c| 25 +++--
  2 files changed, 20 insertions(+), 15 deletions(-)



@@ -427,7 +431,7 @@ static void nbd_check_meta_export(NBDClient *client)
  static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
  Error **errp)
  {
-char name[NBD_MAX_NAME_SIZE + 1];
+g_autofree char *name;


This needs to be:

g_autofree char *name = NULL;
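
Why the initializer matters can be shown without GLib. The sketch below uses
the GCC/Clang cleanup attribute directly, which is what g_autofree is built
on; demo_autofree, DEMO_AUTOFREE and demo_copy_len are hypothetical names
for illustration, not code from nbd/server.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Cleanup handler: receives a pointer to the annotated variable. */
static void demo_autofree(char **pp)
{
    free(*pp);
}

#define DEMO_AUTOFREE __attribute__((cleanup(demo_autofree)))

/*
 * Copy src to the heap and report its length through len.
 * Returns 0 on success, -1 if src is NULL or allocation fails.
 */
static int demo_copy_len(const char *src, size_t *len)
{
    /* Without "= NULL", the early return below would free garbage. */
    DEMO_AUTOFREE char *name = NULL;

    if (!src) {
        return -1;              /* cleanup runs free(NULL), a no-op */
    }
    name = malloc(strlen(src) + 1);
    if (!name) {
        return -1;
    }
    strcpy(name, src);
    *len = strlen(name);
    return 0;                   /* cleanup frees the copy */
}
```

Every return path runs the cleanup handler, including the ones taken before
the first assignment, which is the case the compiler warning points at.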

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH v3 for-4.2 0/4] Better NBD string length handling

2019-11-13 Thread no-reply
Patchew URL: https://patchew.org/QEMU/20191114024635.11363-1-ebl...@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the
testing commands and their output below. If you have Docker installed,
you can probably reproduce it locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

 from /tmp/qemu-test/src/include/qemu/osdep.h:140,
 from /tmp/qemu-test/src/nbd/server.c:20:
/tmp/qemu-test/src/nbd/server.c: In function 'nbd_negotiate_handle_export_name':
/usr/include/glib-2.0/glib/glib-autocleanups.h:28:10: error: 'name' may be used uninitialized in this function [-Werror=maybe-uninitialized]
   g_free (*pp);
  ^
/tmp/qemu-test/src/nbd/server.c:435:22: note: 'name' was declared here
 g_autofree char *name;
  ^
cc1: all warnings being treated as errors
make: *** [nbd/server.o] Error 1
make: *** Waiting for unfinished jobs
  CC  chardev/char-file.o
  CC  chardev/char-io.o
---
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', 
'--label', 'com.qemu.instance.uuid=9620816161004b4c84746eaba688af5a', '-u', 
'1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', 
'-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 
'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', 
'/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', 
'/var/tmp/patchew-tester-tmp-gsxzfw9n/src/docker-src.2019-11-13-21.55.21.14727:/var/tmp/qemu:z,ro',
 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit 
status 2.
filter=--filter=label=com.qemu.instance.uuid=9620816161004b4c84746eaba688af5a
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-gsxzfw9n/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    2m9.489s
user    0m7.589s


The full log is available at
http://patchew.org/logs/20191114024635.11363-1-ebl...@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-de...@redhat.com

[PATCH v3 2/4] bitmap: Enforce maximum bitmap name length

2019-11-13 Thread Eric Blake
We document that for qcow2 persistent bitmaps, the name cannot exceed
1023 bytes.  It is inconsistent if transient bitmaps do not have to
abide by the same limit, and it is unlikely that any existing client
even cares about using bitmap names this long.  It's time to codify
that ALL bitmaps managed by qemu (whether persistent in qcow2 or not)
have a documented maximum length.

Signed-off-by: Eric Blake 
---
 qapi/block-core.json |  2 +-
 include/block/dirty-bitmap.h |  2 ++
 block/dirty-bitmap.c | 12 +---
 block/qcow2-bitmap.c |  2 ++
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index aa97ee264112..0cf68fea1450 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2042,7 +2042,7 @@
 #
 # @node: name of device/node which the bitmap is tracking
 #
-# @name: name of the dirty bitmap
+# @name: name of the dirty bitmap (must be less than 1024 bytes)
 #
 # @granularity: the bitmap granularity, default is 64k for
 #   block-dirty-bitmap-add
diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h
index 958e7474fb51..e2b20ecab9a3 100644
--- a/include/block/dirty-bitmap.h
+++ b/include/block/dirty-bitmap.h
@@ -14,6 +14,8 @@ typedef enum BitmapCheckFlags {
  BDRV_BITMAP_INCONSISTENT)
 #define BDRV_BITMAP_ALLOW_RO (BDRV_BITMAP_BUSY | BDRV_BITMAP_INCONSISTENT)

+#define BDRV_BITMAP_MAX_NAME_SIZE 1023
+
 BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
   uint32_t granularity,
   const char *name,
diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 4bbb251b2c9c..7039e8252009 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -104,9 +104,15 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState 
*bs,

 assert(is_power_of_2(granularity) && granularity >= BDRV_SECTOR_SIZE);

-if (name && bdrv_find_dirty_bitmap(bs, name)) {
-error_setg(errp, "Bitmap already exists: %s", name);
-return NULL;
+if (name) {
+if (bdrv_find_dirty_bitmap(bs, name)) {
+error_setg(errp, "Bitmap already exists: %s", name);
+return NULL;
+}
+if (strlen(name) > BDRV_BITMAP_MAX_NAME_SIZE) {
+error_setg(errp, "Bitmap name too long: %s", name);
+return NULL;
+}
 }
 bitmap_size = bdrv_getlength(bs);
 if (bitmap_size < 0) {
diff --git a/block/qcow2-bitmap.c b/block/qcow2-bitmap.c
index ef9ef628a0d0..809bbc5d20c8 100644
--- a/block/qcow2-bitmap.c
+++ b/block/qcow2-bitmap.c
@@ -42,6 +42,8 @@
 #define BME_MIN_GRANULARITY_BITS 9
 #define BME_MAX_NAME_SIZE 1023

+QEMU_BUILD_BUG_ON(BME_MAX_NAME_SIZE != BDRV_BITMAP_MAX_NAME_SIZE);
+
 #if BME_MAX_TABLE_SIZE * 8ULL > INT_MAX
 #error In the code bitmap table physical size assumed to fit into int
 #endif
-- 
2.21.0




[PATCH v3 3/4] nbd: Don't send oversize strings

2019-11-13 Thread Eric Blake
Qemu as server currently won't accept export names larger than 256
bytes, nor create dirty bitmap names longer than 1023 bytes, so most
uses of qemu as client or server have no reason to get anywhere near
the NBD spec maximum of a 4k limit per string.

However, we weren't actually enforcing things, ignoring when the
remote side violates the protocol on input, and also having several
code paths where we send oversize strings on output (for example,
qemu-nbd --description could easily send more than 4k).  Tighten
things up as follows:

client:
- Perform bounds check on export name and dirty bitmap request prior
  to handing it to server
- Validate that copied server replies are not too long (ignoring
  NBD_INFO_* replies that are not copied is not too bad)
server:
- Perform bounds check on export name and description prior to
  advertising it to client
- Reject client name or metadata query that is too long
- Adjust things to allow full 4k name limit rather than previous
  256 byte limit
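
As a rough sketch of the receive-side checks listed above: a length field
taken from the wire is validated against both the remaining payload and the
string limit before anything is allocated or copied. DEMO_MAX_STRING_SIZE
and demo_check_name_len are hypothetical names used only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the 4k per-string limit discussed above. */
#define DEMO_MAX_STRING_SIZE 4096

/*
 * Validate a length field received from the peer.
 * remaining: payload bytes still unread for this reply
 * namelen:   length the peer claims for the string
 * Returns 0 if the string may be read, -1 if the peer violated the limits.
 */
static int demo_check_name_len(uint32_t remaining, uint32_t namelen)
{
    if (namelen > remaining || namelen > DEMO_MAX_STRING_SIZE) {
        return -1;
    }
    return 0;
}
```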

Signed-off-by: Eric Blake 
---
 include/block/nbd.h |  8 
 block/nbd.c | 10 ++
 blockdev-nbd.c  |  5 +
 nbd/client.c| 18 +++---
 nbd/server.c| 20 +++-
 qemu-nbd.c  |  9 +
 6 files changed, 58 insertions(+), 12 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index c306423dc85c..7f46932d80f1 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -227,11 +227,11 @@ enum {
 #define NBD_MAX_BUFFER_SIZE (32 * 1024 * 1024)

 /*
- * Maximum size of an export name. The NBD spec requires a minimum of
- * 256 and recommends that servers support up to 4096; all users use
- * malloc so we can bump this constant without worry.
+ * Maximum size of a protocol string (export name, meta context name,
+ * etc.).  Use malloc rather than stack allocation for storage of a
+ * string.
  */
-#define NBD_MAX_NAME_SIZE 256
+#define NBD_MAX_STRING_SIZE 4096

 /* Two types of reply structures */
 #define NBD_SIMPLE_REPLY_MAGIC  0x67446698
diff --git a/block/nbd.c b/block/nbd.c
index 123976171cf4..5f18f78a9471 100644
--- a/block/nbd.c
+++ b/block/nbd.c
@@ -1832,6 +1832,10 @@ static int nbd_process_options(BlockDriverState *bs, 
QDict *options,
 }

 s->export = g_strdup(qemu_opt_get(opts, "export"));
+if (s->export && strlen(s->export) > NBD_MAX_STRING_SIZE) {
+error_setg(errp, "export name too long to send to server");
+goto error;
+}

 s->tlscredsid = g_strdup(qemu_opt_get(opts, "tls-creds"));
 if (s->tlscredsid) {
@@ -1849,6 +1853,11 @@ static int nbd_process_options(BlockDriverState *bs, 
QDict *options,
 }

 s->x_dirty_bitmap = g_strdup(qemu_opt_get(opts, "x-dirty-bitmap"));
+if (s->x_dirty_bitmap && strlen(s->x_dirty_bitmap) > NBD_MAX_STRING_SIZE) {
+error_setg(errp, "x-dirty-bitmap query too long to send to server");
+goto error;
+}
+
 s->reconnect_delay = qemu_opt_get_number(opts, "reconnect-delay", 0);

 ret = 0;
@@ -1859,6 +1868,7 @@ static int nbd_process_options(BlockDriverState *bs, 
QDict *options,
 qapi_free_SocketAddress(s->saddr);
 g_free(s->export);
 g_free(s->tlscredsid);
+g_free(s->x_dirty_bitmap);
 }
 qemu_opts_del(opts);
 return ret;
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 6a8b206e1d74..8c20baa4a4b9 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -162,6 +162,11 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
 name = device;
 }

+if (strlen(name) > NBD_MAX_STRING_SIZE) {
+error_setg(errp, "export name '%s' too long", name);
+return;
+}
+
 if (nbd_export_find(name)) {
 error_setg(errp, "NBD server already has export named '%s'", name);
 return;
diff --git a/nbd/client.c b/nbd/client.c
index f6733962b49b..ba173108baab 100644
--- a/nbd/client.c
+++ b/nbd/client.c
@@ -289,8 +289,8 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, 
char **description,
 return -1;
 }
 len -= sizeof(namelen);
-if (len < namelen) {
-error_setg(errp, "incorrect option name length");
+if (len < namelen || namelen > NBD_MAX_STRING_SIZE) {
+error_setg(errp, "incorrect name length in server's list response");
 nbd_send_opt_abort(ioc);
 return -1;
 }
@@ -303,6 +303,12 @@ static int nbd_receive_list(QIOChannel *ioc, char **name, 
char **description,
 local_name[namelen] = '\0';
 len -= namelen;
 if (len) {
+if (len > NBD_MAX_STRING_SIZE) {
+error_setg(errp, "incorrect description length in server's "
+   "list response");
+nbd_send_opt_abort(ioc);
+return -1;
+}
 local_desc = g_malloc(len + 1);
 if (nbd_read(ioc, local_desc, len, "export description", errp) < 0) {
 nbd_send_opt_abort(ioc);
@@ -479,6 +485,10 @@ static int 

[PATCH v3 for-5.0 4/4] nbd: Allow description when creating NBD blockdev

2019-11-13 Thread Eric Blake
Allow blockdevs to match the feature already present in qemu-nbd -D.
Enhance iotest 223 to cover it.

Signed-off-by: Eric Blake 
Reviewed-by: Maxim Levitsky 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 qapi/block.json| 9 ++---
 blockdev-nbd.c | 9 -
 monitor/hmp-cmds.c | 4 ++--
 tests/qemu-iotests/223 | 2 +-
 tests/qemu-iotests/223.out | 1 +
 5 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/qapi/block.json b/qapi/block.json
index 145c268bb646..7898104dae42 100644
--- a/qapi/block.json
+++ b/qapi/block.json
@@ -250,9 +250,12 @@
 # @name: Export name. If unspecified, the @device parameter is used as the
 #export name. (Since 2.12)
 #
+# @description: Free-form description of the export, up to 4096 bytes.
+#   (Since 5.0)
+#
 # @writable: Whether clients should be able to write to the device via the
 # NBD connection (default false).
-
+#
 # @bitmap: Also export the dirty bitmap reachable from @device, so the
 #  NBD client can use NBD_OPT_SET_META_CONTEXT with
 #  "qemu:dirty-bitmap:NAME" to inspect the bitmap. (since 4.0)
@@ -263,8 +266,8 @@
 # Since: 1.3.0
 ##
 { 'command': 'nbd-server-add',
-  'data': {'device': 'str', '*name': 'str', '*writable': 'bool',
-   '*bitmap': 'str' } }
+  'data': {'device': 'str', '*name': 'str', '*description': 'str',
+   '*writable': 'bool', '*bitmap': 'str' } }

 ##
 # @NbdServerRemoveMode:
diff --git a/blockdev-nbd.c b/blockdev-nbd.c
index 8c20baa4a4b9..de2f2ff71320 100644
--- a/blockdev-nbd.c
+++ b/blockdev-nbd.c
@@ -144,6 +144,7 @@ void qmp_nbd_server_start(SocketAddressLegacy *addr,
 }

 void qmp_nbd_server_add(const char *device, bool has_name, const char *name,
+bool has_description, const char *description,
 bool has_writable, bool writable,
 bool has_bitmap, const char *bitmap, Error **errp)
 {
@@ -167,6 +168,11 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
 return;
 }

+if (has_description && strlen(description) > NBD_MAX_STRING_SIZE) {
+error_setg(errp, "description '%s' too long", description);
+return;
+}
+
 if (nbd_export_find(name)) {
 error_setg(errp, "NBD server already has export named '%s'", name);
 return;
@@ -195,7 +201,8 @@ void qmp_nbd_server_add(const char *device, bool has_name, 
const char *name,
 writable = false;
 }

-exp = nbd_export_new(bs, 0, len, name, NULL, bitmap, !writable, !writable,
+exp = nbd_export_new(bs, 0, len, name, description, bitmap,
+ !writable, !writable,
  NULL, false, on_eject_blk, errp);
 if (!exp) {
 goto out;
diff --git a/monitor/hmp-cmds.c b/monitor/hmp-cmds.c
index b2551c16d129..574c6321c9d0 100644
--- a/monitor/hmp-cmds.c
+++ b/monitor/hmp-cmds.c
@@ -2352,7 +2352,7 @@ void hmp_nbd_server_start(Monitor *mon, const QDict 
*qdict)
 continue;
 }

-qmp_nbd_server_add(info->value->device, false, NULL,
+qmp_nbd_server_add(info->value->device, false, NULL, false, NULL,
true, writable, false, NULL, _err);

 if (local_err != NULL) {
@@ -2374,7 +2374,7 @@ void hmp_nbd_server_add(Monitor *mon, const QDict *qdict)
 bool writable = qdict_get_try_bool(qdict, "writable", false);
 Error *local_err = NULL;

-qmp_nbd_server_add(device, !!name, name, true, writable,
+qmp_nbd_server_add(device, !!name, name, false, NULL, true, writable,
false, NULL, _err);
 hmp_handle_error(mon, _err);
 }
diff --git a/tests/qemu-iotests/223 b/tests/qemu-iotests/223
index b5a80e50bbc1..c708e479325e 100755
--- a/tests/qemu-iotests/223
+++ b/tests/qemu-iotests/223
@@ -144,7 +144,7 @@ _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-add",
   "bitmap":"b3"}}' "error" # Missing bitmap
 _send_qemu_cmd $QEMU_HANDLE '{"execute":"nbd-server-add",
   "arguments":{"device":"n", "name":"n2", "writable":true,
-  "bitmap":"b2"}}' "return"
+  "description":"some text", "bitmap":"b2"}}' "return"
 $QEMU_NBD_PROG -L -k "$SOCK_DIR/nbd"

 echo
diff --git a/tests/qemu-iotests/223.out b/tests/qemu-iotests/223.out
index 23b34fcd202e..16d597585b4f 100644
--- a/tests/qemu-iotests/223.out
+++ b/tests/qemu-iotests/223.out
@@ -49,6 +49,7 @@ exports available: 2
base:allocation
qemu:dirty-bitmap:b
  export: 'n2'
+  description: some text
   size:  4194304
   flags: 0xced ( flush fua trim zeroes df cache fast-zero )
   min block: 1
-- 
2.21.0




[PATCH v3 for-4.2 0/4] Better NBD string length handling

2019-11-13 Thread Eric Blake
This series was originally posted before soft freeze, but then KVM
Forum interfered. I think that patches 1-3 are bug fixes still
appropriate for -rc2 if they get good reviews, but patch 4 is a new
feature and now only appropriate for 5.0.

Since v2:
- Patch 1, 2: new [Vladimir]
- Patch 3: improve error messages and fix a memleak [Vladimir]
- Patch 3: bump name length from 256 to 4k (R-b dropped)
- Patch 4: add R-b, but tweak to defer to 5.0

Eric Blake (4):
  nbd/server: Prefer heap over stack for parsing client names
  bitmap: Enforce maximum bitmap name length
  nbd: Don't send oversize strings
  nbd: Allow description when creating NBD blockdev

 qapi/block-core.json |  2 +-
 qapi/block.json  |  9 +---
 include/block/dirty-bitmap.h |  2 ++
 include/block/nbd.h  | 12 +-
 block/dirty-bitmap.c | 12 +++---
 block/nbd.c  | 10 +
 block/qcow2-bitmap.c |  2 ++
 blockdev-nbd.c   | 14 +++-
 monitor/hmp-cmds.c   |  4 ++--
 nbd/client.c | 18 ---
 nbd/server.c | 43 
 qemu-nbd.c   |  9 
 tests/qemu-iotests/223   |  2 +-
 tests/qemu-iotests/223.out   |  1 +
 14 files changed, 106 insertions(+), 34 deletions(-)

-- 
2.21.0




[PATCH v3 1/4] nbd/server: Prefer heap over stack for parsing client names

2019-11-13 Thread Eric Blake
As long as we limit NBD names to 256 bytes (the bare minimum permitted
by the standard), stack-allocation works for parsing a name received
from the client.  But as mentioned in a comment, we eventually want to
permit up to the 4k maximum of the NBD standard, which is too large
for stack allocation; so switch everything in the server to use heap
allocation.  For now, there is no change in actually supported name
length.

Signed-off-by: Eric Blake 
---
 include/block/nbd.h | 10 +-
 nbd/server.c| 25 +++--
 2 files changed, 20 insertions(+), 15 deletions(-)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 316fd705a9e4..c306423dc85c 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -226,11 +226,11 @@ enum {
 /* Maximum size of a single READ/WRITE data buffer */
 #define NBD_MAX_BUFFER_SIZE (32 * 1024 * 1024)

-/* Maximum size of an export name. The NBD spec requires 256 and
- * suggests that servers support up to 4096, but we stick to only the
- * required size so that we can stack-allocate the names, and because
- * going larger would require an audit of more code to make sure we
- * aren't overflowing some other buffer. */
+/*
+ * Maximum size of an export name. The NBD spec requires a minimum of
+ * 256 and recommends that servers support up to 4096; all users use
+ * malloc so we can bump this constant without worry.
+ */
 #define NBD_MAX_NAME_SIZE 256

 /* Two types of reply structures */
diff --git a/nbd/server.c b/nbd/server.c
index d8d1e6245532..c63b76b22735 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -324,18 +324,20 @@ static int nbd_opt_skip(NBDClient *client, size_t size, Error **errp)
  *   uint32_t len (<= NBD_MAX_NAME_SIZE)
  *   len bytes string (not 0-terminated)
  *
- * @name should be enough to store NBD_MAX_NAME_SIZE+1.
+ * On success, @name will be allocated.
  * If @length is non-null, it will be set to the actual string length.
  *
  * Return -errno on I/O error, 0 if option was completely handled by
  * sending a reply about inconsistent lengths, or 1 on success.
  */
-static int nbd_opt_read_name(NBDClient *client, char *name, uint32_t *length,
+static int nbd_opt_read_name(NBDClient *client, char **name, uint32_t *length,
  Error **errp)
 {
 int ret;
 uint32_t len;
+g_autofree char *local_name = NULL;

+*name = NULL;
 ret = nbd_opt_read(client, &len, sizeof(len), errp);
 if (ret <= 0) {
 return ret;
@@ -347,15 +349,17 @@ static int nbd_opt_read_name(NBDClient *client, char *name, uint32_t *length,
"Invalid name length: %" PRIu32, len);
 }

-ret = nbd_opt_read(client, name, len, errp);
+local_name = g_malloc(len + 1);
+ret = nbd_opt_read(client, local_name, len, errp);
 if (ret <= 0) {
 return ret;
 }
-name[len] = '\0';
+local_name[len] = '\0';

 if (length) {
 *length = len;
 }
+*name = g_steal_pointer(&local_name);

 return 1;
 }
@@ -427,7 +431,7 @@ static void nbd_check_meta_export(NBDClient *client)
 static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
 Error **errp)
 {
-char name[NBD_MAX_NAME_SIZE + 1];
+g_autofree char *name;
 char buf[NBD_REPLY_EXPORT_NAME_SIZE] = "";
 size_t len;
 int ret;
@@ -441,10 +445,11 @@ static int nbd_negotiate_handle_export_name(NBDClient *client, bool no_zeroes,
 [10 .. 133]   reserved (0) [unless no_zeroes]
  */
 trace_nbd_negotiate_handle_export_name();
-if (client->optlen >= sizeof(name)) {
+if (client->optlen > NBD_MAX_NAME_SIZE) {
 error_setg(errp, "Bad length received");
 return -EINVAL;
 }
+name = g_malloc(client->optlen + 1);
 if (nbd_read(client->ioc, name, client->optlen, "export name", errp) < 0) {
 return -EIO;
 }
@@ -533,7 +538,7 @@ static int nbd_reject_length(NBDClient *client, bool fatal, Error **errp)
 static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
 {
 int rc;
-char name[NBD_MAX_NAME_SIZE + 1];
+g_autofree char *name = NULL;
 NBDExport *exp;
 uint16_t requests;
 uint16_t request;
@@ -551,7 +556,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, Error **errp)
 2 bytes: N, number of requests (can be 0)
 N * 2 bytes: N requests
 */
-rc = nbd_opt_read_name(client, name, &namelen, errp);
+rc = nbd_opt_read_name(client, &name, &namelen, errp);
 if (rc <= 0) {
 return rc;
 }
@@ -957,7 +962,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
   NBDExportMetaContexts *meta, Error **errp)
 {
 int ret;
-char export_name[NBD_MAX_NAME_SIZE + 1];
+g_autofree char *export_name = NULL;
 NBDExportMetaContexts local_meta;
 uint32_t nb_queries;
 int i;
@@ -976,7 +981,7 @@ static int nbd_negotiate_meta_queries(NBDClient *client,
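
The heap-allocation pattern that nbd_opt_read_name() adopts in this patch can be sketched outside QEMU roughly as follows. This is only an illustration under stated assumptions: it parses from an in-memory buffer instead of an NBD client channel, uses plain malloc() rather than g_malloc()/g_autofree, and read_name() and MAX_NAME_SIZE are hypothetical stand-ins, not QEMU names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME_SIZE 256 /* hypothetical stand-in for NBD_MAX_NAME_SIZE */

/*
 * Parse "32-bit big-endian length, then that many bytes (not
 * 0-terminated)" from buf.
 *
 * Returns 1 and stores a malloc'ed, NUL-terminated copy in *name on
 * success, 0 if the length is inconsistent or too large, and -1 if the
 * allocation fails.  On any failure *name is NULL.
 */
static int read_name(const uint8_t *buf, size_t buflen, char **name,
                     uint32_t *length)
{
    uint32_t len;
    char *local_name;

    assert(buf && name);
    *name = NULL;
    if (buflen < 4) {
        return 0;
    }
    len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
          ((uint32_t)buf[2] << 8) | buf[3];
    if (len > MAX_NAME_SIZE || len > buflen - 4) {
        return 0;
    }

    /* Size the buffer from the validated length, not from the maximum. */
    local_name = malloc((size_t)len + 1);
    if (!local_name) {
        return -1;
    }
    memcpy(local_name, buf + 4, len);
    local_name[len] = '\0';

    if (length) {
        *length = len;
    }
    *name = local_name; /* ownership passes to the caller */
    return 1;
}
```

Because the buffer is sized from the validated length, raising the maximum later only changes the bound check; no caller needs a larger stack frame.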

 

Re: [PATCH RFC] virtio-pci: disable vring processing when bus-mastering is disabled

2019-11-13 Thread Michael Roth
Quoting Michael S. Tsirkin (2019-11-13 04:09:02)
> On Tue, Nov 12, 2019 at 11:43:01PM -0600, Michael Roth wrote:
> > Currently the SLOF firmware for pseries guests will disable/re-enable
> > a PCI device multiple times via IO/MEM/MASTER bits of PCI_COMMAND
> > register after the initial probe/feature negotiation, as it tends to
> > work with a single device at a time at various stages like probing
> > and running block/network bootloaders without doing a full reset
> > in-between.
> > 
> > In QEMU, when PCI_COMMAND_MASTER is disabled we disable the
> > corresponding IOMMU memory region, so DMA accesses (including to vring
> > fields like idx/flags) will no longer undergo the necessary
> > translation. Normally we wouldn't expect this to happen since it would
> > be misbehavior on the driver side to continue driving DMA requests.
> > 
> > However, in the case of pseries, with iommu_platform=on, we trigger the
> > following sequence when tearing down the virtio-blk dataplane ioeventfd
> > in response to the guest unsetting PCI_COMMAND_MASTER:
> > 
> >   #2  0x55922651 in virtqueue_map_desc 
> > (vdev=vdev@entry=0x56dbcfb0, p_num_sg=p_num_sg@entry=0x7fffe657e1a8, 
> > addr=addr@entry=0x7fffe657e240, iov=iov@entry=0x7fffe6580240, 
> > max_num_sg=max_num_sg@entry=1024, is_write=is_write@entry=false, pa=0, sz=0)
> >   at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:757
> >   #3  0x55922a89 in virtqueue_pop (vq=vq@entry=0x56dc8660, 
> > sz=sz@entry=184)
> >   at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:950
> >   #4  0x558d3eca in virtio_blk_get_request (vq=0x56dc8660, 
> > s=0x56dbcfb0)
> >   at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:255
> >   #5  0x558d3eca in virtio_blk_handle_vq (s=0x56dbcfb0, 
> > vq=0x56dc8660)
> >   at /home/mdroth/w/qemu.git/hw/block/virtio-blk.c:776
> >   #6  0x5591dd66 in virtio_queue_notify_aio_vq 
> > (vq=vq@entry=0x56dc8660)
> >   at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1550
> >   #7  0x5591ecef in virtio_queue_notify_aio_vq (vq=0x56dc8660)
> >   at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:1546
> >   #8  0x5591ecef in virtio_queue_host_notifier_aio_poll 
> > (opaque=0x56dc86c8)
> >   at /home/mdroth/w/qemu.git/hw/virtio/virtio.c:2527
> >   #9  0x55d02164 in run_poll_handlers_once 
> > (ctx=ctx@entry=0x5688bfc0, timeout=timeout@entry=0x7fffe65844a8)
> >   at /home/mdroth/w/qemu.git/util/aio-posix.c:520
> >   #10 0x55d02d1b in try_poll_mode (timeout=0x7fffe65844a8, 
> > ctx=0x5688bfc0)
> >   at /home/mdroth/w/qemu.git/util/aio-posix.c:607
> >   #11 0x55d02d1b in aio_poll (ctx=ctx@entry=0x5688bfc0, 
> > blocking=blocking@entry=true)
> >   at /home/mdroth/w/qemu.git/util/aio-posix.c:639
> >   #12 0x55d0004d in aio_wait_bh_oneshot (ctx=0x5688bfc0, 
> > cb=cb@entry=0x558d5130 <virtio_blk_data_plane_stop_bh>, 
> > opaque=opaque@entry=0x56de86f0)
> >   at /home/mdroth/w/qemu.git/util/aio-wait.c:71
> >   #13 0x558d59bf in virtio_blk_data_plane_stop (vdev=<optimized out>)
> >   at /home/mdroth/w/qemu.git/hw/block/dataplane/virtio-blk.c:288
> >   #14 0x55b906a1 in virtio_bus_stop_ioeventfd 
> > (bus=bus@entry=0x56dbcf38)
> >   at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:245
> >   #15 0x55b90dbb in virtio_bus_stop_ioeventfd 
> > (bus=bus@entry=0x56dbcf38)
> >   at /home/mdroth/w/qemu.git/hw/virtio/virtio-bus.c:237
> >   #16 0x55b92a8e in virtio_pci_stop_ioeventfd (proxy=0x56db4e40)
> >   at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:292
> >   #17 0x55b92a8e in virtio_write_config (pci_dev=0x56db4e40, 
> > address=<optimized out>, val=1048832, len=<optimized out>)
> >   at /home/mdroth/w/qemu.git/hw/virtio/virtio-pci.c:613
> > 
> > I.e. the calling code is only scheduling a one-shot BH for
> > virtio_blk_data_plane_stop_bh, but somehow we end up trying to process
> > an additional virtqueue entry before we get there. This is likely due
> > to the following check in virtio_queue_host_notifier_aio_poll:
> > 
> >   static bool virtio_queue_host_notifier_aio_poll(void *opaque)
> >   {
> >   EventNotifier *n = opaque;
> >   VirtQueue *vq = container_of(n, VirtQueue, host_notifier);
> >   bool progress;
> > 
> >   if (!vq->vring.desc || virtio_queue_empty(vq)) {
> >   return false;
> >   }
> > 
> >   progress = virtio_queue_notify_aio_vq(vq);
> > 
> > namely the call to virtio_queue_empty(). In this case, since no new
> > requests have actually been issued, shadow_avail_idx == last_avail_idx,
> > so we actually try to access the vring via vring_avail_idx() to get
> > the latest non-shadowed idx:
> > 
> >   int virtio_queue_empty(VirtQueue *vq)
> >   {
> >   bool empty;
> >   ...
> > 
> >   if (vq->shadow_avail_idx != vq->last_avail_idx) {
> >   return 0;
> >   }
> > 
> >   rcu_read_lock();
> >   empty = 

Re: [PATCH v14 03/11] tests: Add test for QAPI builtin type time

2019-11-13 Thread Tao Xu

On 11/14/2019 6:06 AM, Eduardo Habkost wrote:

On Wed, Nov 13, 2019 at 09:01:29AM +0800, Tao Xu wrote:

On 11/13/2019 4:15 AM, Eduardo Habkost wrote:

On Fri, Nov 08, 2019 at 09:05:52AM +0100, Markus Armbruster wrote:

Tao Xu  writes:


On 11/7/2019 9:31 PM, Eduardo Habkost wrote:

On Thu, Nov 07, 2019 at 02:24:52PM +0800, Tao Xu wrote:

On 11/7/2019 4:53 AM, Eduardo Habkost wrote:

On Mon, Oct 28, 2019 at 03:52:12PM +0800, Tao Xu wrote:

Add tests for time input such as zero, around limit of precision,
signed upper limit, actual upper limit, beyond limits, time suffixes,
etc.

Signed-off-by: Tao Xu 
---

[...]

+/* Close to signed upper limit 0x7ffffffffffffc00 (53 msbs set) */
+qdict = keyval_parse("time1=9223372036854774784," /* 7ffffffffffffc00 */
+ "time2=9223372036854775295", /* 7ffffffffffffdff */
+ NULL, &error_abort);
+v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
+qobject_unref(qdict);
+visit_start_struct(v, NULL, NULL, 0, &error_abort);
+visit_type_time(v, "time1", &time, &error_abort);
+g_assert_cmphex(time, ==, 0x7ffffffffffffc00);
+visit_type_time(v, "time2", &time, &error_abort);
+g_assert_cmphex(time, ==, 0x7ffffffffffffc00);


I'm confused by this test case and the one below[1].  Are these
known bugs?  Shouldn't we document them as known bugs?


Because do_strtosz() and do_strtomul() actually parse with strtod(), the
precision is 53 bits, so in these cases 7ffffffffffffdff and
fffffffffffffbff are rounded.


My questions remain: why isn't this being treated like a bug?


Hi Markus,

I am confused about the code here too, because in do_strtosz() the
upper limit is

val * mul >= 0xfffffffffffffc00

So some values near the 53-bit limit may be rounded. Is there a bug?


No, but the design is surprising, and the functions lack written
contracts, except for the do_strtosz() helper, which has one that sucks.

qemu_strtosz() & friends are designed to accept fraction * unit
multiplier.  Example: 1.5M means 1.5 * 1024 * 1024 with qemu_strtosz()
and qemu_strtosz_MiB(), and 1.5 * 1000 * 1000 with
qemu_strtosz_metric().  Whether supporting fractions is a good idea is
debatable, but it's what we've got.

The implementation limits the numeric part to the precision of double,
i.e. 53 bits.  "8PiB should be enough for anybody."
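
The 53-bit limit can be reproduced without QEMU. The sketch below is illustrative only: parse_via_double() is a hypothetical helper, not the do_strtosz() implementation, and it assumes IEEE 754 doubles and a correctly rounding strtod(). Doubles just below 2^63 are spaced 1024 apart, so 9223372036854775295 (0x7ffffffffffffdff) should come back as 0x7ffffffffffffc00, which matches the value the test case above expects.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a decimal string the way a strtod()-based parser does: the
 * value passes through a double, which keeps only 53 significant bits.
 */
static uint64_t parse_via_double(const char *s)
{
    double d = strtod(s, NULL);

    /* Converting a double of 2^64 or more to uint64_t is undefined. */
    assert(d >= 0 && d < 18446744073709551616.0);
    return (uint64_t)d;
}
```

For instance, parse_via_double("9223372036854775295") and parse_via_double("9223372036854774784") are expected to return the same value under those assumptions, while small inputs such as "1024" round-trip exactly.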

Switching it from double to long double raises the limit to the
precision of long double.  At least 64 bit on common hosts, but hosts
exist where it's the same 53 bits.  Do we support any such hosts?  If
yes, then we'd make the precision depend on the host, which feels like a
bad idea.

A possible alternative is to parse the numeric part both as a double and
as a 64 bit unsigned integer, then use whatever consumes more
characters.  This enables providing full 64 bits unless you actually use
a fraction.
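
A rough sketch of that alternative, for illustration only: parse_scaled() is a hypothetical name and this is not a proposed patch for qemu_strtosz(). It assumes a decimal-only integer parse, rejects anything that does not start with a digit or '.', and takes the unit multiplier as a plain argument instead of parsing a suffix.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse the numeric part of s as both an unsigned 64-bit integer and a
 * double, keep whichever parse consumed more characters, and scale the
 * result by mul.  Returns 0 on success and -1 on error; on success *end
 * points just past the numeric part.
 */
static int parse_scaled(const char *s, uint64_t mul, uint64_t *out,
                        const char **end)
{
    char *iend, *dend;
    unsigned long long ival;
    double dval;
    int ierr;

    assert(s && out && end && mul > 0);
    /* Reject signs, leading whitespace, "inf" and "nan" up front. */
    if (!isdigit((unsigned char)*s) && *s != '.') {
        return -1;
    }

    errno = 0;
    ival = strtoull(s, &iend, 10);
    ierr = errno;
    dval = strtod(s, &dend);
    if (dend == s) {
        return -1; /* no number at all, e.g. a lone "." */
    }

    if (iend >= dend) {
        /* Pure integer: full 64-bit precision, with overflow checks. */
        if (ierr == ERANGE || ival > UINT64_MAX / mul) {
            return -1;
        }
        *out = (uint64_t)ival * mul;
        *end = iend;
        return 0;
    }

    /* Fraction or exponent present: fall back to double precision. */
    dval *= (double)mul;
    if (dval >= 18446744073709551616.0) {
        return -1;
    }
    *out = (uint64_t)dval;
    *end = dend;
    return 0;
}
```

With this shape, "9223372036854775295" keeps all of its bits because the integer parse consumes the whole string, while "1.5" with a multiplier of 1024 still yields 1536 through the double path.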



This sounds like the right thing to do, if user input is an
integer and the code in the other end is consuming an integer.



As far as I remember, the only problem we've ever had with the 53 bits
limit is developer confusion :)



Developer confusion, I can deal with.  However, exposing this
behavior on external interfaces is a bug to me.

I don't know how serious the bug is because I don't know which
interfaces are affected by it.  Do we have a list?


Patches welcome.


My first goal is to get the maintainers of that code to recognize
it as a bug.  Then I hope this will motivate somebody else to fix
it.  :)



Hi Eduardo,

If it is a bug, could the fix patch be merged during rc1-rc3? I have made 2
patches and want to submit them before HMAT (the HMAT patches are big, so
submitting them together may be slow).


Even if I convince other maintainers it is a bug, I don't think
it is serious enough to require a fix in QEMU 4.2.  I suggest
finishing the ongoing HMAT work first, and worry about this issue
later.

Or, if you really prefer to address it before HMAT, it's OK to
make the next version of the HMAT series depend on a series
that's not merged yet.  Just make this explicit in the series
cover letter (publishing a git branch to help review and testing
is also appreciated).


OK I will submit them together.



Re: About MONITOR/MWAIT in i386 CPU model

2019-11-13 Thread Tao Xu

On 11/14/2019 6:47 AM, Eduardo Habkost wrote:

On Wed, Nov 13, 2019 at 04:42:25PM +0800, Tao Xu wrote:

Hi Eduardo,

After KVM started using "-overcommit cpu-pm=on" to expose MONITOR/MWAIT
(commit id 6f131f13e68d648a8e4f083c667ab1acd88ce4cd), the MONITOR/MWAIT
feature in some CPU models (phenom, core2duo, coreduo, n270, Opteron_G3,
EPYC, Snowridge, Denverton) may be unused. For example, when we boot a guest
with the Denverton CPU model, the guest cannot detect MONITOR and boots with
no warning. Should we remove this feature from some CPU models?


Good catch, thanks!

Yes, we should remove them from Opteron_G3, EPYC, Snowridge, and
Denverton, at least.  The other older CPU models can be left
alone: they are more useful for use with TCG than with KVM, and
TCG supports MONITOR/MWAIT.

I would like to understand why this wasn't detected during
testing by Intel.  I suggest always testing CPU models using the
"enforce" flag to make sure warnings don't go unnoticed.



OK we will improve the testing.



Re: [PATCH v9 Kernel 1/5] vfio: KABI for migration interface for device state

2019-11-13 Thread Yan Zhao
On Thu, Nov 14, 2019 at 03:02:55AM +0800, Kirti Wankhede wrote:
> 
> 
> On 11/13/2019 8:53 AM, Yan Zhao wrote:
> > On Wed, Nov 13, 2019 at 06:30:05AM +0800, Alex Williamson wrote:
> >> On Tue, 12 Nov 2019 22:33:36 +0530
> >> Kirti Wankhede  wrote:
> >>
> >>> - Defined MIGRATION region type and sub-type.
> >>> - Used 3 bits to define VFIO device states.
> >>>  Bit 0 => _RUNNING
> >>>  Bit 1 => _SAVING
> >>>  Bit 2 => _RESUMING
> >>>  Combination of these bits defines VFIO device's state during migration
> >>>  _RUNNING => Normal VFIO device running state. When its reset, it
> >>>   indicates _STOPPED state. when device is changed to
> >>>   _STOPPED, driver should stop device before write()
> >>>   returns.
> >>>  _SAVING | _RUNNING => vCPUs are running, VFIO device is running but
> >>>start saving state of device i.e. pre-copy 
> >>> state
> >>>  _SAVING  => vCPUs are stopped, VFIO device should be stopped, and
> >>
> >> s/should/must/
> >>
> >>>  save device state,i.e. stop-n-copy state
> >>>  _RESUMING => VFIO device resuming state.
> >>>  _SAVING | _RESUMING and _RUNNING | _RESUMING => Invalid states
> >>
> >> A table might be useful here and in the uapi header to indicate valid
> >> states:
> >>
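> >> | _RESUMING | _SAVING | _RUNNING | Description
> >> +-----------+---------+----------+------------------------------------------
> >> |     0     |    0    |     0    | Stopped, not saving or resuming (a)
> >> +-----------+---------+----------+------------------------------------------
> >> |     0     |    0    |     1    | Running, default state
> >> +-----------+---------+----------+------------------------------------------
> >> |     0     |    1    |     0    | Stopped, migration interface in save mode
> >> +-----------+---------+----------+------------------------------------------
> >> |     0     |    1    |     1    | Running, save mode interface, iterative
> >> +-----------+---------+----------+------------------------------------------
> >> |     1     |    0    |     0    | Stopped, migration resume interface active
> >> +-----------+---------+----------+------------------------------------------
> >> |     1     |    0    |     1    | Invalid (b)
> >> +-----------+---------+----------+------------------------------------------
> >> |     1     |    1    |     0    | Invalid (c)
> >> +-----------+---------+----------+------------------------------------------
> >> |     1     |    1    |     1    | Invalid (d)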
> >> I think we need to consider whether we define (a) as generally
> >> available, for instance we might want to use it for diagnostics or a
> >> fatal error condition outside of migration.
> >>
> 
> We have to set it as init state. I'll add this.
> 
> >> Are there hidden assumptions between state transitions here or are
> >> there specific next possible state diagrams that we need to include as
> >> well?
> >>
> >> I'm curious if Intel agrees with the states marked invalid with their
> >> push for post-copy support.
> >>
> > hi Alex and Kirti,
> > Actually, for postcopy, I think we anyway need an extra POSTCOPY state
> > introduced. Reasons as below:
> > - in the target side, _RESUMING state is set in the beginning of
> >    migration. It cannot be used as a state to inform the device that
> >    it is currently in postcopy state and device DMAs are to be trapped and
> >    pre-faulted.
> >we also cannot use state (_RESUMING + _RUNNING) as an indicator of
> >postcopy, because before device & vm running in target side, some device
> >state are already loaded (e.g. page tables, pending workloads),
> >target side can do pre-pagefault at that period before target vm up.
> > - in the source side, after device is stopped, postcopy needs saving
> >device state only (as compared to device state + remaining dirty
> >pages in precopy). state (!_RUNNING + _SAVING) here again cannot
> >differentiate precopy and postcopy here.
> > 
> >>>  Bits 3 - 31 are reserved for future use. User should perform
> >>>  read-modify-write operation on this field.
> >>> - Defined vfio_device_migration_info structure which will be placed at 0th
> >>>offset of migration region to get/set VFIO device related information.
> >>>Defined members of structure and usage on read/write access:
> >>>  * device_state: (read/write)
> >>>  To convey VFIO device state to be transitioned to. Only 3 bits 
> >>> are
> >>>   used as of now, Bits 3 - 31 are reserved for future use.
> >>>  * pending bytes: (read only)
> >>>  To get pending bytes yet to be migrated for VFIO device.
> >>>  * data_offset: (read only)
> >>>  To get data offset in migration region from where data exist
> >>>   during _SAVING and from where data should be written by user space
> >>>   application during _RESUMING state.
> >>>  * data_size: 

[PATCH v5 17/20] fuzz: add support for qos-assisted fuzz targets

2019-11-13 Thread Oleinik, Alexander
Signed-off-by: Alexander Bulekov 
---
 tests/fuzz/qos_fuzz.c | 232 ++
 tests/fuzz/qos_fuzz.h |  33 ++
 2 files changed, 265 insertions(+)
 create mode 100644 tests/fuzz/qos_fuzz.c
 create mode 100644 tests/fuzz/qos_fuzz.h

diff --git a/tests/fuzz/qos_fuzz.c b/tests/fuzz/qos_fuzz.c
new file mode 100644
index 00..da76e28ca3
--- /dev/null
+++ b/tests/fuzz/qos_fuzz.c
@@ -0,0 +1,232 @@
+/*
+ * QOS-assisted fuzzing helpers
+ *
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License version 2 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qemu-common.h"
+#include "exec/memory.h"
+#include "exec/address-spaces.h"
+#include "sysemu/sysemu.h"
+#include "qemu/main-loop.h"
+
+#include 
+
+#include "libqos/malloc.h"
+#include "libqos/qgraph.h"
+#include "libqos/qgraph_internal.h"
+
+#include "fuzz.h"
+#include "qos_fuzz.h"
+#include "tests/libqos/qgraph.h"
+#include "tests/libqos/qos_external.h"
+#include "tests/libqtest.h"
+
+#include "qapi/qapi-commands-machine.h"
+#include "qapi/qapi-commands-qom.h"
+#include "qapi/qmp/qlist.h"
+
+
+void *fuzz_qos_obj;
+QGuestAllocator *fuzz_qos_alloc;
+
+static const char *fuzz_target_name;
+static char **fuzz_path_vec;
+
+/*
+ * Replaced the qmp commands with direct qmp_marshal calls.
+ * Probably there is a better way to do this
+ */
+static void qos_set_machines_devices_available(void)
+{
+QDict *req = qdict_new();
+QObject *response;
+QDict *args = qdict_new();
+QList *lst;
+Error *err = NULL;
+
+qmp_marshal_query_machines(NULL, &response, &err);
+assert(!err);
+lst = qobject_to(QList, response);
+apply_to_qlist(lst, true);
+
+qobject_unref(response);
+
+
+qdict_put_str(req, "execute", "qom-list-types");
+qdict_put_str(args, "implements", "device");
+qdict_put_bool(args, "abstract", true);
+qdict_put_obj(req, "arguments", (QObject *) args);
+
+qmp_marshal_qom_list_types(args, &response, &err);
+assert(!err);
+lst = qobject_to(QList, response);
+apply_to_qlist(lst, false);
+qobject_unref(response);
+qobject_unref(req);
+}
+
+static char **current_path;
+
+void *qos_allocate_objects(QTestState *qts, QGuestAllocator **p_alloc)
+{
+return allocate_objects(qts, current_path + 1, p_alloc);
+}
+
+static const char *qos_build_main_args(void)
+{
+char **path = fuzz_path_vec;
+QOSGraphNode *test_node;
+GString *cmd_line = g_string_new(path[0]);
+void *test_arg;
+
+/* Before test */
+current_path = path;
+test_node = qos_graph_get_node(path[(g_strv_length(path) - 1)]);
+test_arg = test_node->u.test.arg;
+if (test_node->u.test.before) {
+test_arg = test_node->u.test.before(cmd_line, test_arg);
+}
+/* Prepend the arguments that we need */
+g_string_prepend(cmd_line,
+TARGET_NAME " -display none -machine accel=qtest -m 64 ");
+return cmd_line->str;
+}
+
+/*
+ * This function is largely a copy of qos-test.c:walk_path. Since walk_path
+ * is itself a callback, its a little annoying to add another argument/layer of
+ * indirection
+ */
+static void walk_path(QOSGraphNode *orig_path, int len)
+{
+QOSGraphNode *path;
+QOSGraphEdge *edge;
+
+/* etype set to QEDGE_CONSUMED_BY so that machine can add to the command line */
+QOSEdgeType etype = QEDGE_CONSUMED_BY;
+
+/* twice QOS_PATH_MAX_ELEMENT_SIZE since each edge can have its arg */
+char **path_vec = g_new0(char *, (QOS_PATH_MAX_ELEMENT_SIZE * 2));
+int path_vec_size = 0;
+
+char *after_cmd, *before_cmd, *after_device;
+GString *after_device_str = g_string_new("");
+char *node_name = orig_path->name, *path_str;
+
+GString *cmd_line = g_string_new("");
+GString *cmd_line2 = g_string_new("");
+
+path = qos_graph_get_node(node_name); /* root */
+node_name = qos_graph_edge_get_dest(path->path_edge); /* machine name */
+
+path_vec[path_vec_size++] = node_name;
+path_vec[path_vec_size++] = qos_get_machine_type(node_name);
+
+for (;;) {
+path = qos_graph_get_node(node_name);
+if (!path->path_edge) {
+break;
+}
+
+node_name = qos_graph_edge_get_dest(path->path_edge);
+
+/* append node command line + previous edge command line */
+if (path->command_line && etype == QEDGE_CONSUMED_BY) {
+

[PATCH v5 15/20] fuzz: add fuzzer skeleton

2019-11-13 Thread Oleinik, Alexander
tests/fuzz/fuzz.c serves as the entry point for the virtual-device
fuzzer. Namely, libfuzzer invokes the LLVMFuzzerInitialize and
LLVMFuzzerTestOneInput functions, both of which are defined in this
file. This change adds a "FuzzTarget" struct, along with the
fuzz_add_target function, which should be used to define new fuzz
targets.

Signed-off-by: Alexander Bulekov 
---
 tests/fuzz/Makefile.include |   4 +-
 tests/fuzz/fuzz.c   | 179 
 tests/fuzz/fuzz.h   |  94 +++
 3 files changed, 275 insertions(+), 2 deletions(-)
 create mode 100644 tests/fuzz/fuzz.c
 create mode 100644 tests/fuzz/fuzz.h

diff --git a/tests/fuzz/Makefile.include b/tests/fuzz/Makefile.include
index 324e6c1433..b415b056b0 100644
--- a/tests/fuzz/Makefile.include
+++ b/tests/fuzz/Makefile.include
@@ -1,4 +1,4 @@
-# QEMU_PROG_FUZZ=qemu-fuzz-$(TARGET_NAME)$(EXESUF)
+QEMU_PROG_FUZZ=qemu-fuzz-$(TARGET_NAME)$(EXESUF)
 fuzz-obj-y = $(libqos-obj-y)
 fuzz-obj-y += tests/libqtest.o
-
+fuzz-obj-y += tests/fuzz/fuzz.o
diff --git a/tests/fuzz/fuzz.c b/tests/fuzz/fuzz.c
new file mode 100644
index 00..f4abaa3484
--- /dev/null
+++ b/tests/fuzz/fuzz.c
@@ -0,0 +1,179 @@
+/*
+ * fuzzing driver
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include 
+
+#include "sysemu/qtest.h"
+#include "sysemu/runstate.h"
+#include "sysemu/sysemu.h"
+#include "qemu/main-loop.h"
+#include "tests/libqtest.h"
+#include "tests/libqos/qgraph.h"
+#include "fuzz.h"
+
+#define MAX_EVENT_LOOPS 10
+
+typedef struct FuzzTargetState {
+FuzzTarget *target;
+QSLIST_ENTRY(FuzzTargetState) target_list;
+} FuzzTargetState;
+
+typedef QSLIST_HEAD(, FuzzTargetState) FuzzTargetList;
+
+static const char *fuzz_arch = TARGET_NAME;
+
+static FuzzTargetList *fuzz_target_list;
+static FuzzTarget *fuzz_target;
+static QTestState *fuzz_qts;
+
+
+
+void flush_events(QTestState *s)
+{
+int i = MAX_EVENT_LOOPS;
+while (g_main_context_pending(NULL) && i-- > 0) {
+main_loop_wait(false);
+}
+}
+
+static QTestState *qtest_setup(void)
+{
+qtest_server_set_send_handler(&qtest_client_inproc_recv, &fuzz_qts);
+return qtest_inproc_init(&fuzz_qts, false, fuzz_arch,
+&qtest_server_inproc_recv);
+}
+
+void fuzz_add_target(const FuzzTarget *target)
+{
+FuzzTargetState *tmp;
+FuzzTargetState *target_state;
+if (!fuzz_target_list) {
+fuzz_target_list = g_new0(FuzzTargetList, 1);
+}
+
+QSLIST_FOREACH(tmp, fuzz_target_list, target_list) {
+if (g_strcmp0(tmp->target->name, target->name) == 0) {
+fprintf(stderr, "Error: Fuzz target name %s already in use\n",
+target->name);
+abort();
+}
+}
+target_state = g_new0(FuzzTargetState, 1);
+target_state->target = g_new0(FuzzTarget, 1);
+*(target_state->target) = *target;
+QSLIST_INSERT_HEAD(fuzz_target_list, target_state, target_list);
+}
+
+
+
+static void usage(char *path)
+{
+printf("Usage: %s --fuzz-target=FUZZ_TARGET [LIBFUZZER ARGUMENTS]\n", path);
+printf("where FUZZ_TARGET is one of:\n");
+FuzzTargetState *tmp;
+if (!fuzz_target_list) {
+fprintf(stderr, "Fuzz target list not initialized\n");
+abort();
+}
+QSLIST_FOREACH(tmp, fuzz_target_list, target_list) {
+printf(" %s  : %s\n", tmp->target->name,
+tmp->target->description);
+}
+exit(0);
+}
+
+static FuzzTarget *fuzz_get_target(char* name)
+{
+FuzzTargetState *tmp;
+if (!fuzz_target_list) {
+fprintf(stderr, "Fuzz target list not initialized\n");
+abort();
+}
+
+QSLIST_FOREACH(tmp, fuzz_target_list, target_list) {
+if (strcmp(tmp->target->name, name) == 0) {
+return tmp->target;
+}
+}
+return NULL;
+}
+
+
+/* Executed for each fuzzing-input */
+int LLVMFuzzerTestOneInput(const unsigned char *Data, size_t Size)
+{
+/*
+ * Do the pre-fuzz-initialization before the first fuzzing iteration,
+ * instead of before the actual fuzz loop. This is needed since libfuzzer
+ * may fork off additional workers, prior to the fuzzing loop, and if
+ * pre_fuzz() sets up e.g. shared memory, this should be done for the
+ * individual worker processes
+ */
+static int pre_fuzz_done;
+if (!pre_fuzz_done && fuzz_target->pre_fuzz) {
+fuzz_target->pre_fuzz(fuzz_qts);
+pre_fuzz_done = true;
+}
+
+fuzz_target->fuzz(fuzz_qts, Data, Size);
+return 0;
+}
+
+/* Executed once, prior to fuzzing */
+int LLVMFuzzerInitialize(int *argc, char ***argv, char ***envp)
+{
+
+char *target_name;
+
+/* Initialize qgraph and modules */
+qos_graph_init();
+module_call_init(MODULE_INIT_FUZZ_TARGET);
+
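
The registration scheme that fuzz_add_target() implements can be sketched in plain C without QEMU's QSLIST macros or GLib. This is only an illustration under those assumptions: Target, TargetNode, add_target() and get_target() are hypothetical names, and unlike the patch, a duplicate name is reported through the return value instead of calling abort().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef struct Target {
    const char *name;
    const char *description;
    void (*fuzz)(const unsigned char *data, size_t size);
} Target;

typedef struct TargetNode {
    Target target;           /* private copy of the caller's struct */
    struct TargetNode *next;
} TargetNode;

static TargetNode *target_list;

/* Returns 0 on success, -1 if the name is taken or allocation fails. */
static int add_target(const Target *t)
{
    TargetNode *n;

    assert(t && t->name);
    for (n = target_list; n; n = n->next) {
        if (strcmp(n->target.name, t->name) == 0) {
            return -1;
        }
    }
    n = malloc(sizeof(*n));
    if (!n) {
        return -1;
    }
    n->target = *t;          /* copy, so a compound literal is fine */
    n->next = target_list;   /* insert at the head of the list */
    target_list = n;
    return 0;
}

/* Look a target up by name; returns NULL if it was never registered. */
static const Target *get_target(const char *name)
{
    const TargetNode *n;

    for (n = target_list; n; n = n->next) {
        if (strcmp(n->target.name, name) == 0) {
            return &n->target;
        }
    }
    return NULL;
}
```

Copying the struct at registration time is what lets callers pass a short-lived compound literal, as the i440fx targets later in this series do with `&(FuzzTarget){ ... }`.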

[PATCH v5 18/20] fuzz: add i440fx fuzz targets

2019-11-13 Thread Oleinik, Alexander
These three targets should simply fuzz reads/writes to a couple ioports,
but they mostly serve as examples of different ways to write targets.
They demonstrate using qtest and qos for fuzzing, as well as using
rebooting and forking to reset state, or not resetting it at all.

Signed-off-by: Alexander Bulekov 
---
 tests/fuzz/Makefile.include |   3 +
 tests/fuzz/i440fx_fuzz.c| 176 
 2 files changed, 179 insertions(+)
 create mode 100644 tests/fuzz/i440fx_fuzz.c

diff --git a/tests/fuzz/Makefile.include b/tests/fuzz/Makefile.include
index 687dacce04..37d6821bee 100644
--- a/tests/fuzz/Makefile.include
+++ b/tests/fuzz/Makefile.include
@@ -3,5 +3,8 @@ fuzz-obj-y = $(libqos-obj-y)
 fuzz-obj-y += tests/libqtest.o
 fuzz-obj-y += tests/fuzz/fuzz.o
 fuzz-obj-y += tests/fuzz/fork_fuzz.o
+fuzz-obj-y += tests/fuzz/qos_fuzz.o
+
+fuzz-obj-y += tests/fuzz/i440fx_fuzz.o
 
 FUZZ_LDFLAGS += -Xlinker -T$(SRC_PATH)/tests/fuzz/fork_fuzz.ld
diff --git a/tests/fuzz/i440fx_fuzz.c b/tests/fuzz/i440fx_fuzz.c
new file mode 100644
index 00..56e3315a88
--- /dev/null
+++ b/tests/fuzz/i440fx_fuzz.c
@@ -0,0 +1,176 @@
+/*
+ * I440FX Fuzzing Target
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "fuzz.h"
+#include "tests/libqtest.h"
+#include "fuzz/qos_fuzz.h"
+#include "fuzz/fork_fuzz.h"
+#include "qemu/main-loop.h"
+#include "tests/libqos/pci.h"
+#include "tests/libqos/pci-pc.h"
+
+
+#define I440FX_PCI_HOST_BRIDGE_CFG 0xcf8
+#define I440FX_PCI_HOST_BRIDGE_DATA 0xcfc
+
+enum action_id {
+WRITEB,
+WRITEW,
+WRITEL,
+READB,
+READW,
+READL,
+ACTION_MAX
+};
+
+static void i440fx_fuzz_qtest(QTestState *s,
+const unsigned char *Data, size_t Size) {
+typedef struct QTestFuzzAction {
+uint32_t value;
+uint8_t id;
+uint8_t addr;
+} QTestFuzzAction;
+QTestFuzzAction a;
+
+while (Size >= sizeof(a)) {
+memcpy(&a, Data, sizeof(a));
+uint16_t addr = a.addr % 2 ? I440FX_PCI_HOST_BRIDGE_CFG :
+  I440FX_PCI_HOST_BRIDGE_DATA;
+switch (a.id % ACTION_MAX) {
+case WRITEB:
+qtest_outb(s, addr, (uint8_t)a.value);
+break;
+case WRITEW:
+qtest_outw(s, addr, (uint16_t)a.value);
+break;
+case WRITEL:
+qtest_outl(s, addr, (uint32_t)a.value);
+break;
+case READB:
+qtest_inb(s, addr);
+break;
+case READW:
+qtest_inw(s, addr);
+break;
+case READL:
+qtest_inl(s, addr);
+break;
+}
+Size -= sizeof(a);
+Data += sizeof(a);
+}
+flush_events(s);
+}
+
+static void i440fx_fuzz_qos(QTestState *s,
+const unsigned char *Data, size_t Size) {
+
+typedef struct QOSFuzzAction {
+uint32_t value;
+int devfn;
+uint8_t offset;
+uint8_t id;
+} QOSFuzzAction;
+
+static QPCIBus *bus;
+if (!bus) {
+bus = qpci_new_pc(s, fuzz_qos_alloc);
+}
+
+QOSFuzzAction a;
+while (Size >= sizeof(a)) {
+memcpy(&a, Data, sizeof(a));
+switch (a.id % ACTION_MAX) {
+case WRITEB:
+bus->config_writeb(bus, a.devfn, a.offset, (uint8_t)a.value);
+break;
+case WRITEW:
+bus->config_writew(bus, a.devfn, a.offset, (uint16_t)a.value);
+break;
+case WRITEL:
+bus->config_writel(bus, a.devfn, a.offset, (uint32_t)a.value);
+break;
+case READB:
+bus->config_readb(bus, a.devfn, a.offset);
+break;
+case READW:
+bus->config_readw(bus, a.devfn, a.offset);
+break;
+case READL:
+bus->config_readl(bus, a.devfn, a.offset);
+break;
+}
+Size -= sizeof(a);
+Data += sizeof(a);
+}
+flush_events(s);
+}
+
+static void i440fx_fuzz_qos_fork(QTestState *s,
+const unsigned char *Data, size_t Size) {
+if (fork() == 0) {
+i440fx_fuzz_qos(s, Data, Size);
+_Exit(0);
+} else {
+wait(NULL);
+}
+}
+
+static const char *i440fx_qtest_argv = TARGET_NAME " -machine accel=qtest"
+   " -m 0 -display none";
+static const char *i440fx_argv(FuzzTarget *t)
+{
+return i440fx_qtest_argv;
+}
+
+static void fork_init(void)
+{
+counter_shm_init();
+}
+
+static void register_pci_fuzz_targets(void)
+{
+/* Uses simple qtest commands and reboots to reset state */
+fuzz_add_target(&(FuzzTarget){
+.name = "i440fx-qtest-reboot-fuzz",
+.description = "Fuzz the i440fx using raw qtest commands and"
+   

[PATCH v5 14/20] fuzz: Add target/fuzz makefile rules

2019-11-13 Thread Oleinik, Alexander
Signed-off-by: Alexander Bulekov 
---
 Makefile| 15 ++-
 Makefile.objs   |  4 +++-
 Makefile.target | 18 +-
 tests/fuzz/Makefile.include |  4 
 4 files changed, 38 insertions(+), 3 deletions(-)
 create mode 100644 tests/fuzz/Makefile.include

diff --git a/Makefile b/Makefile
index d2b2ecd3c4..571f5562c9 100644
--- a/Makefile
+++ b/Makefile
@@ -464,7 +464,7 @@ config-host.h-timestamp: config-host.mak
 qemu-options.def: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > 
$@,"GEN","$@")
 
-TARGET_DIRS_RULES := $(foreach t, all clean install, $(addsuffix /$(t), $(TARGET_DIRS)))
+TARGET_DIRS_RULES := $(foreach t, all fuzz clean install, $(addsuffix /$(t), $(TARGET_DIRS)))
 
 SOFTMMU_ALL_RULES=$(filter %-softmmu/all, $(TARGET_DIRS_RULES))
 $(SOFTMMU_ALL_RULES): $(authz-obj-y)
@@ -476,6 +476,15 @@ $(SOFTMMU_ALL_RULES): config-all-devices.mak
 $(SOFTMMU_ALL_RULES): $(edk2-decompressed)
 $(SOFTMMU_ALL_RULES): $(softmmu-main-y)
 
+SOFTMMU_FUZZ_RULES=$(filter %-softmmu/fuzz, $(TARGET_DIRS_RULES))
+$(SOFTMMU_FUZZ_RULES): $(authz-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(block-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(chardev-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(crypto-obj-y)
+$(SOFTMMU_FUZZ_RULES): $(io-obj-y)
+$(SOFTMMU_FUZZ_RULES): config-all-devices.mak
+$(SOFTMMU_FUZZ_RULES): $(edk2-decompressed)
+
 .PHONY: $(TARGET_DIRS_RULES)
 # The $(TARGET_DIRS_RULES) are of the form SUBDIR/GOAL, so that
 # $(dir $@) yields the sub-directory, and $(notdir $@) yields the sub-goal
@@ -526,6 +535,9 @@ subdir-slirp: slirp/all
 $(filter %/all, $(TARGET_DIRS_RULES)): libqemuutil.a $(common-obj-y) \
$(qom-obj-y) $(crypto-user-obj-$(CONFIG_USER_ONLY))
 
+$(filter %/fuzz, $(TARGET_DIRS_RULES)): libqemuutil.a $(common-obj-y) \
+   $(qom-obj-y) $(crypto-user-obj-$(CONFIG_USER_ONLY))
+
 ROM_DIRS = $(addprefix pc-bios/, $(ROMS))
 ROM_DIRS_RULES=$(foreach t, all clean, $(addsuffix /$(t), $(ROM_DIRS)))
 # Only keep -O and -g cflags
@@ -535,6 +547,7 @@ $(ROM_DIRS_RULES):
 
 .PHONY: recurse-all recurse-clean recurse-install
 recurse-all: $(addsuffix /all, $(TARGET_DIRS) $(ROM_DIRS))
+recurse-fuzz: $(addsuffix /fuzz, $(TARGET_DIRS) $(ROM_DIRS))
 recurse-clean: $(addsuffix /clean, $(TARGET_DIRS) $(ROM_DIRS))
 recurse-install: $(addsuffix /install, $(TARGET_DIRS))
 $(addsuffix /install, $(TARGET_DIRS)): all
diff --git a/Makefile.objs b/Makefile.objs
index 9ff9b0c6f9..5478a554f6 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -86,10 +86,12 @@ common-obj-$(CONFIG_FDT) += device_tree.o
 # qapi
 
 common-obj-y += qapi/
+softmmu-obj-y = main.o
 
-softmmu-main-y = main.o
 endif
 
+
+
 ###
 # Target-independent parts used in system and user emulation
 common-obj-y += cpus-common.o
diff --git a/Makefile.target b/Makefile.target
index ca3d14efe1..cddc8e4306 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -202,7 +202,7 @@ endif
 COMMON_LDADDS = ../libqemuutil.a
 
 # build either PROG or PROGW
-$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS)
+$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS) $(softmmu-obj-y)
$(call LINK, $(filter-out %.mak, $^))
 ifdef CONFIG_DARWIN
$(call quiet-command,Rez -append $(SRC_PATH)/pc-bios/qemu.rsrc -o 
$@,"REZ","$(TARGET_DIR)$@")
@@ -227,6 +227,22 @@ ifdef CONFIG_TRACE_SYSTEMTAP
rm -f *.stp
 endif
 
+ifdef CONFIG_FUZZ
+include $(SRC_PATH)/tests/fuzz/Makefile.include
+include $(SRC_PATH)/tests/Makefile.include
+
+fuzz: fuzz-vars
+fuzz-vars: QEMU_CFLAGS := $(FUZZ_CFLAGS) $(QEMU_CFLAGS)
+fuzz-vars: QEMU_LDFLAGS := $(FUZZ_LDFLAGS) $(QEMU_LDFLAGS)
+fuzz-vars: $(QEMU_PROG_FUZZ)
+dummy := $(call unnest-vars,, fuzz-obj-y)
+
+
+$(QEMU_PROG_FUZZ): config-devices.mak $(all-obj-y) $(COMMON_LDADDS) $(fuzz-obj-y)
+   $(call LINK, $(filter-out %.mak, $^))
+
+endif
+
 install: all
 ifneq ($(PROGS),)
$(call install-prog,$(PROGS),$(DESTDIR)$(bindir))
diff --git a/tests/fuzz/Makefile.include b/tests/fuzz/Makefile.include
new file mode 100644
index 00..324e6c1433
--- /dev/null
+++ b/tests/fuzz/Makefile.include
@@ -0,0 +1,4 @@
+# QEMU_PROG_FUZZ=qemu-fuzz-$(TARGET_NAME)$(EXESUF)
+fuzz-obj-y = $(libqos-obj-y)
+fuzz-obj-y += tests/libqtest.o
+
-- 
2.23.0




[PATCH v5 16/20] fuzz: add support for fork-based fuzzing.

2019-11-13 Thread Oleinik, Alexander
fork() is a simple way to ensure that state does not leak in between
fuzzing runs. Unfortunately, the fuzzer mutation engine relies on
bitmaps which contain coverage information for each fuzzing run, and
these bitmaps should be copied from the child to the parent (where the
mutation occurs). These bitmaps are created through compile-time
instrumentation and they are not shared with fork()-ed processes, by
default. To address this, we create a shared memory region, adjust its
size and map it _over_ the counter region. Furthermore, libfuzzer
doesn't generally expose the globals that specify the location of the
counters/coverage bitmap. As a workaround, we rely on a custom linker
script which forces all of the bitmaps we care about to be placed in a
contiguous region, which is easy to locate and mmap over.
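
For readers following along, here is a minimal standalone sketch of the
"map shared memory over an existing region" idea. It is not the code from
this patch: it uses an anonymous MAP_SHARED mapping instead of shm_open(),
and a page obtained from mmap() stands in for the linker-placed counter
section. The names share_region() and demo_counter_after_fork() are
hypothetical.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE
#endif
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Replace the private pages at [start, start + len) with shared anonymous
 * pages that hold the same contents. start must be page-aligned.
 * Returns 0 on success, -1 on failure.
 */
static int share_region(void *start, size_t len)
{
    void *saved = malloc(len);
    if (!saved) {
        return -1;
    }
    memcpy(saved, start, len);

    /* MAP_FIXED atomically replaces whatever was mapped at start. */
    void *p = mmap(start, len, PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    if (p == MAP_FAILED) {          /* mmap reports errors as MAP_FAILED */
        free(saved);
        return -1;
    }
    memcpy(start, saved, len);      /* restore the original contents */
    free(saved);
    return 0;
}

/*
 * Hypothetical demo: a one-page "counter region" is shared, a forked child
 * bumps the first counter, and the parent reads the value back.
 * Returns the value seen by the parent, or -1 on error.
 */
static int demo_counter_after_fork(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    uint8_t *counters = mmap(NULL, len, PROT_READ | PROT_WRITE,
                             MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (counters == MAP_FAILED) {
        return -1;
    }
    counters[0] = 1;

    if (share_region(counters, len) != 0) {
        munmap(counters, len);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        munmap(counters, len);
        return -1;
    }
    if (pid == 0) {
        counters[0] += 41;          /* visible to the parent: pages are shared */
        _exit(0);
    }

    int status;
    if (waitpid(pid, &status, 0) != pid) {
        munmap(counters, len);
        return -1;
    }
    int seen = counters[0];
    munmap(counters, len);
    return seen;
}
```

Without the share_region() call the parent would still see 1 after the
child exits, which is the problem the commit message describes for the
coverage bitmaps.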

Signed-off-by: Alexander Bulekov 
---
 exec.c  | 12 ++--
 tests/fuzz/Makefile.include |  3 ++
 tests/fuzz/fork_fuzz.c  | 55 +
 tests/fuzz/fork_fuzz.h  | 23 
 tests/fuzz/fork_fuzz.ld | 37 +
 tests/fuzz/fuzz.c   |  2 +-
 6 files changed, 129 insertions(+), 3 deletions(-)
 create mode 100644 tests/fuzz/fork_fuzz.c
 create mode 100644 tests/fuzz/fork_fuzz.h
 create mode 100644 tests/fuzz/fork_fuzz.ld

diff --git a/exec.c b/exec.c
index 91c8b79656..b15207b00c 100644
--- a/exec.c
+++ b/exec.c
@@ -35,6 +35,7 @@
 #include "sysemu/kvm.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/tcg.h"
+#include "sysemu/qtest.h"
 #include "qemu/timer.h"
 #include "qemu/config-file.h"
 #include "qemu/error-report.h"
@@ -2266,8 +2267,15 @@ static void ram_block_add(RAMBlock *new_block, Error **errp, bool shared)
 if (new_block->host) {
 qemu_ram_setup_dump(new_block->host, new_block->max_length);
 qemu_madvise(new_block->host, new_block->max_length, QEMU_MADV_HUGEPAGE);
-/* MADV_DONTFORK is also needed by KVM in absence of synchronous MMU */
-qemu_madvise(new_block->host, new_block->max_length, QEMU_MADV_DONTFORK);
+/*
+ * MADV_DONTFORK is also needed by KVM in absence of synchronous MMU
+ * Configure it unless the machine is a qtest server, in which case it
+ * may be forked, for fuzzing purposes
+ */
+if (!qtest_enabled()) {
+qemu_madvise(new_block->host, new_block->max_length,
+ QEMU_MADV_DONTFORK);
+}
 ram_block_notify_add(new_block->host, new_block->max_length);
 }
 }
diff --git a/tests/fuzz/Makefile.include b/tests/fuzz/Makefile.include
index b415b056b0..687dacce04 100644
--- a/tests/fuzz/Makefile.include
+++ b/tests/fuzz/Makefile.include
@@ -2,3 +2,6 @@ QEMU_PROG_FUZZ=qemu-fuzz-$(TARGET_NAME)$(EXESUF)
 fuzz-obj-y = $(libqos-obj-y)
 fuzz-obj-y += tests/libqtest.o
 fuzz-obj-y += tests/fuzz/fuzz.o
+fuzz-obj-y += tests/fuzz/fork_fuzz.o
+
+FUZZ_LDFLAGS += -Xlinker -T$(SRC_PATH)/tests/fuzz/fork_fuzz.ld
diff --git a/tests/fuzz/fork_fuzz.c b/tests/fuzz/fork_fuzz.c
new file mode 100644
index 00..2bd0851903
--- /dev/null
+++ b/tests/fuzz/fork_fuzz.c
@@ -0,0 +1,55 @@
+/*
+ * Fork-based fuzzing helpers
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include "fork_fuzz.h"
+
+
+void counter_shm_init(void)
+{
+char *shm_path = g_strdup_printf("/qemu-fuzz-cntrs.%d", getpid());
+int fd = shm_open(shm_path, O_CREAT | O_RDWR, S_IRUSR | S_IWUSR);
+g_free(shm_path);
+
+if (fd == -1) {
+perror("Error: ");
+exit(1);
+}
+if (ftruncate(fd, &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START) == -1) {
+perror("Error: ");
+exit(1);
+}
+/* Copy what's in the counter region to the shm.. */
+void *rptr = mmap(NULL ,
+&__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START,
+PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+memcpy(rptr,
+   &__FUZZ_COUNTERS_START,
+   &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START);
+
+munmap(rptr, &__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START);
+
+/* And map the shm over the counter region */
+rptr = mmap(&__FUZZ_COUNTERS_START,
+&__FUZZ_COUNTERS_END - &__FUZZ_COUNTERS_START,
+PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED, fd, 0);
+
+close(fd);
+
+if (rptr == MAP_FAILED) {
+perror("Error: ");
+exit(1);
+}
+}
+
+
diff --git a/tests/fuzz/fork_fuzz.h b/tests/fuzz/fork_fuzz.h
new file mode 100644
index 00..9ecb8b58ef
--- /dev/null
+++ b/tests/fuzz/fork_fuzz.h
@@ -0,0 +1,23 @@
+/*
+ * Fork-based fuzzing helpers
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the 

[PATCH v5 09/20] libqos: split qos-test and libqos makefile vars

2019-11-13 Thread Oleinik, Alexander
Most qos-related objects were specified in the qos-test-obj-y variable.
qos-test-obj-y also included qos-test.o which defines a main().
This made it difficult to repurpose qos-test-obj-y to link anything
beside tests/qos-test against libqos. This change separates objects that
are libqos-specific and ones that are qos-test specific into different
variables.

Signed-off-by: Alexander Bulekov 
---
 tests/Makefile.include | 71 +-
 1 file changed, 36 insertions(+), 35 deletions(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 67853d10c3..1517c4817e 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -699,52 +699,53 @@ tests/test-crypto-block$(EXESUF): tests/test-crypto-block.o $(test-crypto-obj-y)
 
 libqgraph-obj-y = tests/libqos/qgraph.o
 
-libqos-obj-y = $(libqgraph-obj-y) tests/libqos/pci.o tests/libqos/fw_cfg.o
-libqos-obj-y += tests/libqos/malloc.o
-libqos-obj-y += tests/libqos/libqos.o
-libqos-spapr-obj-y = $(libqos-obj-y) tests/libqos/malloc-spapr.o
+libqos-core-obj-y = $(libqgraph-obj-y) tests/libqos/pci.o tests/libqos/fw_cfg.o
+libqos-core-obj-y += tests/libqos/malloc.o
+libqos-core-obj-y += tests/libqos/libqos.o
+libqos-spapr-obj-y = $(libqos-core-obj-y) tests/libqos/malloc-spapr.o
 libqos-spapr-obj-y += tests/libqos/libqos-spapr.o
 libqos-spapr-obj-y += tests/libqos/rtas.o
 libqos-spapr-obj-y += tests/libqos/pci-spapr.o
-libqos-pc-obj-y = $(libqos-obj-y) tests/libqos/pci-pc.o
+libqos-pc-obj-y = $(libqos-core-obj-y) tests/libqos/pci-pc.o
 libqos-pc-obj-y += tests/libqos/malloc-pc.o tests/libqos/libqos-pc.o
 libqos-pc-obj-y += tests/libqos/ahci.o
 libqos-usb-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) tests/libqos/usb.o
 
 # Devices
-qos-test-obj-y = tests/qos-test.o $(libqgraph-obj-y)
-qos-test-obj-y += $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
-qos-test-obj-y += tests/libqos/e1000e.o
-qos-test-obj-y += tests/libqos/i2c.o
-qos-test-obj-y += tests/libqos/i2c-imx.o
-qos-test-obj-y += tests/libqos/i2c-omap.o
-qos-test-obj-y += tests/libqos/sdhci.o
-qos-test-obj-y += tests/libqos/tpci200.o
-qos-test-obj-y += tests/libqos/virtio.o
-qos-test-obj-$(CONFIG_VIRTFS) += tests/libqos/virtio-9p.o
-qos-test-obj-y += tests/libqos/virtio-balloon.o
-qos-test-obj-y += tests/libqos/virtio-blk.o
-qos-test-obj-y += tests/libqos/virtio-mmio.o
-qos-test-obj-y += tests/libqos/virtio-net.o
-qos-test-obj-y += tests/libqos/virtio-pci.o
-qos-test-obj-y += tests/libqos/virtio-pci-modern.o
-qos-test-obj-y += tests/libqos/virtio-rng.o
-qos-test-obj-y += tests/libqos/virtio-scsi.o
-qos-test-obj-y += tests/libqos/virtio-serial.o
+libqos-obj-y = $(libqgraph-obj-y)
+libqos-obj-y += $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
+libqos-obj-y += tests/libqos/e1000e.o
+libqos-obj-y += tests/libqos/i2c.o
+libqos-obj-y += tests/libqos/i2c-imx.o
+libqos-obj-y += tests/libqos/i2c-omap.o
+libqos-obj-y += tests/libqos/sdhci.o
+libqos-obj-y += tests/libqos/tpci200.o
+libqos-obj-y += tests/libqos/virtio.o
+libqos-obj-$(CONFIG_VIRTFS) += tests/libqos/virtio-9p.o
+libqos-obj-y += tests/libqos/virtio-balloon.o
+libqos-obj-y += tests/libqos/virtio-blk.o
+libqos-obj-y += tests/libqos/virtio-mmio.o
+libqos-obj-y += tests/libqos/virtio-net.o
+libqos-obj-y += tests/libqos/virtio-pci.o
+libqos-obj-y += tests/libqos/virtio-pci-modern.o
+libqos-obj-y += tests/libqos/virtio-rng.o
+libqos-obj-y += tests/libqos/virtio-scsi.o
+libqos-obj-y += tests/libqos/virtio-serial.o
 
 # Machines
-qos-test-obj-y += tests/libqos/aarch64-xlnx-zcu102-machine.o
-qos-test-obj-y += tests/libqos/arm-imx25-pdk-machine.o
-qos-test-obj-y += tests/libqos/arm-n800-machine.o
-qos-test-obj-y += tests/libqos/arm-raspi2-machine.o
-qos-test-obj-y += tests/libqos/arm-sabrelite-machine.o
-qos-test-obj-y += tests/libqos/arm-smdkc210-machine.o
-qos-test-obj-y += tests/libqos/arm-virt-machine.o
-qos-test-obj-y += tests/libqos/arm-xilinx-zynq-a9-machine.o
-qos-test-obj-y += tests/libqos/ppc64_pseries-machine.o
-qos-test-obj-y += tests/libqos/x86_64_pc-machine.o
+libqos-obj-y += tests/libqos/aarch64-xlnx-zcu102-machine.o
+libqos-obj-y += tests/libqos/arm-imx25-pdk-machine.o
+libqos-obj-y += tests/libqos/arm-n800-machine.o
+libqos-obj-y += tests/libqos/arm-raspi2-machine.o
+libqos-obj-y += tests/libqos/arm-sabrelite-machine.o
+libqos-obj-y += tests/libqos/arm-smdkc210-machine.o
+libqos-obj-y += tests/libqos/arm-virt-machine.o
+libqos-obj-y += tests/libqos/arm-xilinx-zynq-a9-machine.o
+libqos-obj-y += tests/libqos/ppc64_pseries-machine.o
+libqos-obj-y += tests/libqos/x86_64_pc-machine.o
 
 # Tests
+qos-test-obj-y = tests/qos-test.o
 qos-test-obj-y += tests/ac97-test.o
 qos-test-obj-y += tests/ds1338-test.o
 qos-test-obj-y += tests/e1000-test.o
@@ -776,7 +777,7 @@ check-unit-y += tests/test-qgraph$(EXESUF)
 tests/test-qgraph$(EXESUF): tests/test-qgraph.o $(libqgraph-obj-y)
 
 check-qtest-generic-y += tests/qos-test$(EXESUF)
-tests/qos-test$(EXESUF): $(qos-test-obj-y)
+tests/qos-test$(EXESUF): $(qos-test-obj-y) 

[PATCH v5 10/20] libqos: move useful qos-test funcs to qos_external

2019-11-13 Thread Oleinik, Alexander
The moved functions are not specific to qos-test and might be useful
elsewhere. For example the virtual-device fuzzer makes use of them for
qos-assisted fuzz-targets.

Signed-off-by: Alexander Bulekov 
---
 tests/Makefile.include  |   1 +
 tests/libqos/qos_external.c | 168 
 tests/libqos/qos_external.h |  28 ++
 tests/qos-test.c| 140 ++
 4 files changed, 202 insertions(+), 135 deletions(-)
 create mode 100644 tests/libqos/qos_external.c
 create mode 100644 tests/libqos/qos_external.h

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 1517c4817e..205ae1 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -714,6 +714,7 @@ libqos-usb-obj-y = $(libqos-spapr-obj-y) $(libqos-pc-obj-y) tests/libqos/usb.o
 # Devices
 libqos-obj-y = $(libqgraph-obj-y)
 libqos-obj-y += $(libqos-pc-obj-y) $(libqos-spapr-obj-y)
+libqos-obj-y += tests/libqos/qos_external.o
 libqos-obj-y += tests/libqos/e1000e.o
 libqos-obj-y += tests/libqos/i2c.o
 libqos-obj-y += tests/libqos/i2c-imx.o
diff --git a/tests/libqos/qos_external.c b/tests/libqos/qos_external.c
new file mode 100644
index 00..398556dde0
--- /dev/null
+++ b/tests/libqos/qos_external.c
@@ -0,0 +1,168 @@
+/*
+ * libqos driver framework
+ *
+ * Copyright (c) 2018 Emanuele Giuseppe Esposito 
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License version 2 as published by the Free Software Foundation.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see 
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include "libqtest.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qbool.h"
+#include "qapi/qmp/qstring.h"
+#include "qemu/module.h"
+#include "qapi/qmp/qlist.h"
+#include "libqos/malloc.h"
+#include "libqos/qgraph.h"
+#include "libqos/qgraph_internal.h"
+#include "libqos/qos_external.h"
+
+
+
+void apply_to_node(const char *name, bool is_machine, bool is_abstract)
+{
+char *machine_name = NULL;
+if (is_machine) {
+const char *arch = qtest_get_arch();
+machine_name = g_strconcat(arch, "/", name, NULL);
+name = machine_name;
+}
+qos_graph_node_set_availability(name, true);
+if (is_abstract) {
+qos_delete_cmd_line(name);
+}
+g_free(machine_name);
+}
+
+/**
+ * apply_to_qlist(): using QMP queries QEMU for a list of
+ * machines and devices available, and sets the respective node
+ * as true. If a node is found, also all its produced and contained
+ * child are marked available.
+ *
+ * See qos_graph_node_set_availability() for more info
+ */
+void apply_to_qlist(QList *list, bool is_machine)
+{
+const QListEntry *p;
+const char *name;
+bool abstract;
+QDict *minfo;
+QObject *qobj;
+QString *qstr;
+QBool *qbool;
+
+for (p = qlist_first(list); p; p = qlist_next(p)) {
+minfo = qobject_to(QDict, qlist_entry_obj(p));
+qobj = qdict_get(minfo, "name");
+qstr = qobject_to(QString, qobj);
+name = qstring_get_str(qstr);
+
+qobj = qdict_get(minfo, "abstract");
+if (qobj) {
+qbool = qobject_to(QBool, qobj);
+abstract = qbool_get_bool(qbool);
+} else {
+abstract = false;
+}
+
+apply_to_node(name, is_machine, abstract);
+qobj = qdict_get(minfo, "alias");
+if (qobj) {
+qstr = qobject_to(QString, qobj);
+name = qstring_get_str(qstr);
+apply_to_node(name, is_machine, abstract);
+}
+}
+}
+
+QGuestAllocator *get_machine_allocator(QOSGraphObject *obj)
+{
+return obj->get_driver(obj, "memory");
+}
+
+/**
+ * allocate_objects(): given an array of nodes @arg,
+ * walks the path invoking all constructors and
+ * passing the corresponding parameter in order to
+ * continue the objects allocation.
+ * Once the test is reached, return the object it consumes.
+ *
+ * Since the machine and QEDGE_CONSUMED_BY nodes allocate
+ * memory in the constructor, g_test_queue_destroy is used so
+ * that after execution they can be safely free'd.  (The test's
+ * ->before callback is also welcome to use g_test_queue_destroy).
+ *
+ * Note: as specified in walk_path() too, @arg is an array of
+ * char *, where arg[0] is a pointer to the command line
+ * string that will be used to properly start QEMU when executing
+ * the test, and the remaining elements represent the actual objects
+ * that will be allocated.
+ */
+void *allocate_objects(QTestState *qts, char **path, QGuestAllocator **p_alloc)

[PATCH v5 20/20] fuzz: add documentation to docs/devel/

2019-11-13 Thread Oleinik, Alexander
Signed-off-by: Alexander Bulekov 
---
 docs/devel/fuzzing.txt | 119 +
 1 file changed, 119 insertions(+)
 create mode 100644 docs/devel/fuzzing.txt

diff --git a/docs/devel/fuzzing.txt b/docs/devel/fuzzing.txt
new file mode 100644
index 00..b0cceb2a6b
--- /dev/null
+++ b/docs/devel/fuzzing.txt
@@ -0,0 +1,119 @@
+= Fuzzing =
+
+== Introduction ==
+
+This document describes the virtual-device fuzzing infrastructure in QEMU and
+how to use it to implement additional fuzzers.
+
+== Basics ==
+
+Fuzzing operates by passing inputs to an entry point/target function. The
+fuzzer tracks the code coverage triggered by the input. Based on these
+findings, the fuzzer mutates the input and repeats the fuzzing. 
+
+To fuzz QEMU, we rely on libfuzzer. Unlike other fuzzers such as AFL, libfuzzer
+is an _in-process_ fuzzer. For the developer, this means that it is their
+responsibility to ensure that state is reset between fuzzing-runs.
+
+== Building the fuzzers ==
+
+NOTE: If possible, build a 32-bit binary. When forking, the 32-bit fuzzer is
+much faster, since the page-map has a smaller size. This is due to the fact that
+AddressSanitizer mmaps ~20TB of memory, as part of its detection. This results
+in a large page-map, and a much slower fork().
+
+To build the fuzzers, install a recent version of clang.
+Configure with (substitute the clang binaries with the version you installed):
+
+CC=clang-8 CXX=clang++-8 /path/to/configure --enable-fuzzing
+
+Fuzz targets are built similarly to system/softmmu:
+
+make i386-softmmu/fuzz
+
+This builds ./i386-softmmu/qemu-fuzz-i386
+
+The first option to this command is: --fuzz-target=FUZZ_NAME
+To list all of the available fuzzers run qemu-fuzz-i386 with no arguments.
+
+eg:
+./i386-softmmu/qemu-fuzz-i386 --fuzz-target=virtio-net-fork-fuzz
+
+Internally, libfuzzer parses all arguments that do not begin with "--".
+Information about these is available by passing -help=1
+
+Now the only thing left to do is wait for the fuzzer to trigger potential
+crashes.
+
+== Adding a new fuzzer ==
+Coverage over virtual devices can be improved by adding additional fuzzers. 
+Fuzzers are kept in tests/fuzz/ and should be added to
+tests/fuzz/Makefile.include
+
+Fuzzers can rely on both qtest and libqos to communicate with virtual devices.
+
+1. Create a new source file. For example ``tests/fuzz/fuzz-foo-device.c``.
+
+2. Write the fuzzing code using the libqtest/libqos API. See existing fuzzers
+for reference.
+
+3. Register the fuzzer in ``tests/fuzz/Makefile.include`` by appending the
+corresponding object to fuzz-obj-y
+
+Fuzzers can be more-or-less thought of as special qtest programs which can
+modify the qtest commands and/or qtest command arguments based on inputs
+provided by libfuzzer. Libfuzzer passes a byte array and length. Commonly the
+fuzzer loops over the byte-array interpreting it as a list of qtest commands,
+addresses, or values.
+
+
+= Implementation Details =
+
+== The Fuzzer's Lifecycle ==
+
+The fuzzer has two entrypoints that libfuzzer calls. libfuzzer provides its
+own main(), which performs some setup, and calls the entrypoints:
+
+LLVMFuzzerInitialize: called prior to fuzzing. Used to initialize all of the
+necessary state
+
+LLVMFuzzerTestOneInput: called for each fuzzing run. Processes the input and
+resets the state at the end of each run.
+
+In more detail:
+
+LLVMFuzzerInitialize parses the arguments to the fuzzer (must start with two
+dashes, so they are ignored by libfuzzer main()). Currently, the arguments
+select the fuzz target. Then, the qtest client is initialized. If the target
+requires qos, qgraph is set up and the QOM/LIBQOS modules are initialized.
+Then the QGraph is walked and the QEMU cmd_line is determined and saved.
+
+After this, the vl.c:qemu_main is called to set up the guest. There are
+target-specific hooks that can be called before and after qemu_main, for
+additional setup (e.g. PCI setup, or VM snapshotting).
+
+LLVMFuzzerTestOneInput: Uses qtest/qos functions to act based on the fuzz
+input. It is also responsible for manually calling the main loop/main_loop_wait
+to ensure that bottom halves are executed and any cleanup required before the
+next input. 
+
+
+Since the same process is reused for many fuzzing runs, QEMU state needs to
+be reset at the end of each run. There are currently two implemented
+options for resetting state: 
+1. Reboot the guest between runs.
+   Pros: Straightforward and fast for simple fuzz targets. 
+   Cons: Depending on the device, does not reset all device state. If the
+   device requires some initialization prior to being ready for fuzzing
+   (common for QOS-based targets), this initialization needs to be done after
+   each reboot.
+   Example target: i440fx-qtest-reboot-fuzz
+2. Run each test case in a separate forked process and copy the coverage
+   information back to the parent. This is fairly similar to AFL's "deferred"
+   fork-server 

[PATCH v5 11/20] libqtest: make bufwrite rely on the TransportOps

2019-11-13 Thread Oleinik, Alexander
When using qtest "in-process" communication, qtest_sendf directly calls
a function in the server (qtest.c). Previously, bufwrite used
socket_send, which bypasses the TransportOps enabling the call into
qtest.c. This change replaces the socket_send calls with ops->send,
maintaining the benefits of the direct socket_send call, while adding
support for in-process qtest calls.
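
As an illustration of why routing through the ops table matters, here is a
small self-contained sketch. The Client and TransportOps types below are
simplified stand-ins, not the real QTestState definitions, and
inproc_send() and client_bufwrite() are hypothetical names. The point is
only that the write path calls ops.send, so the same code works whether the
backend is a socket or a direct in-process call.

```c
#include <stdio.h>
#include <string.h>

typedef struct Client Client;

/* Simplified stand-in for a transport ops table. */
typedef struct {
    void (*send)(Client *c, const char *buf);
} TransportOps;

struct Client {
    TransportOps ops;
    char rx[128];       /* what the in-process "server" has received so far */
};

/* In-process backend: append the text to the server-side buffer. */
static void inproc_send(Client *c, const char *buf)
{
    size_t used = strlen(c->rx);
    snprintf(c->rx + used, sizeof(c->rx) - used, "%s", buf);
}

/*
 * Write path that never touches a file descriptor directly. Every piece of
 * the command goes through ops.send, so swapping the backend is enough.
 */
static void client_bufwrite(Client *c, const char *encoded)
{
    c->ops.send(c, "b64write ");
    c->ops.send(c, encoded);
    c->ops.send(c, "\n");
}
```

A socket backend would provide a different send function with the same
signature; client_bufwrite() would not need to change.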

Signed-off-by: Alexander Bulekov 
---
 tests/libqtest.c | 4 ++--
 tests/libqtest.h | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/tests/libqtest.c b/tests/libqtest.c
index c406b2ea09..6d3bcb6766 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -1080,8 +1080,8 @@ void qtest_bufwrite(QTestState *s, uint64_t addr, const void *data, size_t size)
 
 bdata = g_base64_encode(data, size);
 qtest_sendf(s, "b64write 0x%" PRIx64 " 0x%zx ", addr, size);
-socket_send(s->fd, bdata, strlen(bdata));
-socket_send(s->fd, "\n", 1);
+s->ops.send(s, bdata);
+s->ops.send(s, "\n");
 qtest_rsp(s, 0);
 g_free(bdata);
 }
diff --git a/tests/libqtest.h b/tests/libqtest.h
index c9e21e05b3..0e9b8908ef 100644
--- a/tests/libqtest.h
+++ b/tests/libqtest.h
@@ -729,4 +729,7 @@ bool qtest_probe_child(QTestState *s);
  */
 void qtest_set_expected_status(QTestState *s, int status);
 
+QTestState *qtest_inproc_init(bool log, const char* arch,
+void (*send)(void*, const char*));
+void qtest_client_inproc_recv(void *opaque, const char *str);
 #endif
-- 
2.23.0




[PATCH v5 12/20] libqtest: add in-process qtest.c tx/rx handlers

2019-11-13 Thread Oleinik, Alexander
Signed-off-by: Alexander Bulekov 
---
 tests/libqtest.c | 54 
 tests/libqtest.h |  3 ++-
 2 files changed, 56 insertions(+), 1 deletion(-)

diff --git a/tests/libqtest.c b/tests/libqtest.c
index 6d3bcb6766..da0e5c7ef8 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -1368,3 +1368,57 @@ static void qtest_client_set_rx_handler(QTestState *s, QTestRecvFn recv)
 {
 s->ops.recv_line = recv;
 }
+/* A type-safe wrapper for s->send() */
+static void send_wrapper(QTestState *s, const char *buf)
+{
+s->ops.external_send(s, buf);
+}
+
+static GString *qtest_client_inproc_recv_line(QTestState *s)
+{
+GString *line;
+size_t offset;
+char *eol;
+
+eol = strchr(s->rx->str, '\n');
+offset = eol - s->rx->str;
+line = g_string_new_len(s->rx->str, offset);
+g_string_erase(s->rx, 0, offset + 1);
+return line;
+}
+
+QTestState *qtest_inproc_init(QTestState **s, bool log, const char* arch,
+void (*send)(void*, const char*))
+{
+QTestState *qts;
+qts = g_new0(QTestState, 1);
+*s = qts; /* Expose qts early on, since the query endianness relies on it */
+qts->wstatus = 0;
+for (int i = 0; i < MAX_IRQ; i++) {
+qts->irq_level[i] = false;
+}
+
+qtest_client_set_rx_handler(qts, qtest_client_inproc_recv_line);
+
+/* send() may not have a matching prototype, so use a type-safe wrapper */
+qts->ops.external_send = send;
+qtest_client_set_tx_handler(qts, send_wrapper);
+
+qts->big_endian = qtest_query_target_endianness(qts);
+gchar *bin_path = g_strconcat("/qemu-system-", arch, NULL);
+setenv("QTEST_QEMU_BINARY", bin_path, 0);
+g_free(bin_path);
+
+return qts;
+}
+
+void qtest_client_inproc_recv(void *opaque, const char *str)
+{
+QTestState *qts = *(QTestState **)opaque;
+
+if (!qts->rx) {
+qts->rx = g_string_new(NULL);
+}
+g_string_append(qts->rx, str);
+return;
+}
diff --git a/tests/libqtest.h b/tests/libqtest.h
index 0e9b8908ef..f5cf93c386 100644
--- a/tests/libqtest.h
+++ b/tests/libqtest.h
@@ -729,7 +729,8 @@ bool qtest_probe_child(QTestState *s);
  */
 void qtest_set_expected_status(QTestState *s, int status);
 
-QTestState *qtest_inproc_init(bool log, const char* arch,
+QTestState *qtest_inproc_init(QTestState **s, bool log, const char* arch,
 void (*send)(void*, const char*));
+
 void qtest_client_inproc_recv(void *opaque, const char *str);
 #endif
-- 
2.23.0




[PATCH v5 01/20] softmmu: split off vl.c:main() into main.c

2019-11-13 Thread Oleinik, Alexander
A program might rely on functions implemented in vl.c, but implement its
own main(). By placing main into a separate source file, there are no
complaints about duplicate main()s when linking against vl.o. For
example, the virtual-device fuzzer uses a main() provided by libfuzzer,
and needs to perform some initialization before running the softmmu
initialization. Now, main simply calls three vl.c functions which
handle the guest initialization, main loop and cleanup.
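
A rough sketch of the resulting shape, with hypothetical emu_* names
standing in for the three vl.c functions. The call_log buffer exists only
so the ordering can be observed. The regular entry point chains the three
phases, while an alternative entry point (such as a fuzzing harness that
does not own main()) can do its own setup first and call only the init
phase.

```c
#include <string.h>

/* Records the order in which the phases run (illustration only). */
static char call_log[64];

static void note(const char *s)
{
    strncat(call_log, s, sizeof(call_log) - strlen(call_log) - 1);
}

/* Stand-ins for the init, main-loop and cleanup phases. */
static void emu_init(int argc, char **argv)
{
    (void)argc;
    (void)argv;
    note("init;");
}

static void emu_main_loop(void)
{
    note("loop;");
}

static void emu_cleanup(void)
{
    note("cleanup;");
}

/* What a thin main() does: run the three phases in order. */
static int emu_run(int argc, char **argv)
{
    emu_init(argc, argv);
    emu_main_loop();
    emu_cleanup();
    return 0;
}

/*
 * What an external harness could do instead: perform its own setup, then
 * initialize the emulator without entering the blocking main loop.
 */
static int harness_setup(int argc, char **argv)
{
    note("harness;");
    emu_init(argc, argv);
    return 0;
}
```

Because the phases are ordinary functions rather than the body of main(),
linking them into a binary whose main() comes from elsewhere no longer
produces a duplicate-symbol error.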

Signed-off-by: Alexander Bulekov 
---
 Makefile|  1 +
 Makefile.objs   |  2 ++
 include/sysemu/sysemu.h |  4 
 main.c  | 53 +
 vl.c| 38 -
 5 files changed, 70 insertions(+), 28 deletions(-)
 create mode 100644 main.c

diff --git a/Makefile b/Makefile
index 0e994a275d..d2b2ecd3c4 100644
--- a/Makefile
+++ b/Makefile
@@ -474,6 +474,7 @@ $(SOFTMMU_ALL_RULES): $(crypto-obj-y)
 $(SOFTMMU_ALL_RULES): $(io-obj-y)
 $(SOFTMMU_ALL_RULES): config-all-devices.mak
 $(SOFTMMU_ALL_RULES): $(edk2-decompressed)
+$(SOFTMMU_ALL_RULES): $(softmmu-main-y)
 
 .PHONY: $(TARGET_DIRS_RULES)
 # The $(TARGET_DIRS_RULES) are of the form SUBDIR/GOAL, so that
diff --git a/Makefile.objs b/Makefile.objs
index 11ba1a36bd..9ff9b0c6f9 100644
--- a/Makefile.objs
+++ b/Makefile.objs
@@ -86,6 +86,8 @@ common-obj-$(CONFIG_FDT) += device_tree.o
 # qapi
 
 common-obj-y += qapi/
+
+softmmu-main-y = main.o
 endif
 
 ###
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 44f18eb739..d1dbf85414 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -114,6 +114,10 @@ QemuOpts *qemu_get_machine_opts(void);
 
 bool defaults_enabled(void);
 
+void qemu_init(int argc, char **argv, char **envp);
+void qemu_main_loop(void);
+void qemu_cleanup(void);
+
 extern QemuOptsList qemu_legacy_drive_opts;
 extern QemuOptsList qemu_common_drive_opts;
 extern QemuOptsList qemu_drive_opts;
diff --git a/main.c b/main.c
new file mode 100644
index 00..f10ceda541
--- /dev/null
+++ b/main.c
@@ -0,0 +1,53 @@
+/*
+ * QEMU System Emulator
+ *
+ * Copyright (c) 2003-2008 Fabrice Bellard
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu-common.h"
+#include "sysemu/sysemu.h"
+
+#ifdef CONFIG_SDL
+#if defined(__APPLE__) || defined(main)
+#include 
+int main(int argc, char **argv)
+{
+return qemu_main(argc, argv, NULL);
+}
+#undef main
+#define main qemu_main
+#endif
+#endif /* CONFIG_SDL */
+
+#ifdef CONFIG_COCOA
+#undef main
+#define main qemu_main
+#endif /* CONFIG_COCOA */
+
+int main(int argc, char **argv, char **envp)
+{
+qemu_init(argc, argv, envp);
+qemu_main_loop();
+qemu_cleanup();
+
+return 0;
+}
diff --git a/vl.c b/vl.c
index c389d24b2c..adb08a3d41 100644
--- a/vl.c
+++ b/vl.c
@@ -36,25 +36,6 @@
 #include "sysemu/seccomp.h"
 #include "sysemu/tcg.h"
 
-#ifdef CONFIG_SDL
-#if defined(__APPLE__) || defined(main)
-#include 
-int qemu_main(int argc, char **argv, char **envp);
-int main(int argc, char **argv)
-{
-return qemu_main(argc, argv, NULL);
-}
-#undef main
-#define main qemu_main
-#endif
-#endif /* CONFIG_SDL */
-
-#ifdef CONFIG_COCOA
-#undef main
-#define main qemu_main
-#endif /* CONFIG_COCOA */
-
-
 #include "qemu/error-report.h"
 #include "qemu/sockets.h"
 #include "sysemu/accel.h"
@@ -1797,7 +1778,7 @@ static bool main_loop_should_exit(void)
 return false;
 }
 
-static void main_loop(void)
+void qemu_main_loop(void)
 {
 #ifdef CONFIG_PROFILER
 int64_t ti;
@@ -2824,7 +2805,7 @@ static void user_register_global_props(void)
   global_init_func, NULL, NULL);
 }
 
-int main(int argc, char **argv, char **envp)
+void qemu_init(int argc, char **argv, char **envp)
 {
 int i;
 int snapshot, linux_boot;
@@ -3404,7 +3385,7 @@ int main(int argc, 

Re: virtio,iommu_platform=on

2019-11-13 Thread Alexey Kardashevskiy



On 13/11/2019 21:00, Michael S. Tsirkin wrote:
> On Wed, Nov 13, 2019 at 03:44:28PM +1100, Alexey Kardashevskiy wrote:
>>
>>
>> On 12/11/2019 18:08, Michael S. Tsirkin wrote:
>>> On Tue, Nov 12, 2019 at 02:53:49PM +1100, Alexey Kardashevskiy wrote:
>>>> Hi!
>>>>
>>>> I am enabling IOMMU for virtio in the pseries firmware (SLOF) and seeing
>>>> problems, one of them is SLOF does SCSI bus scan, then it stops the
>>>> virtio-scsi by clearing MMIO|IO|BUSMASTER from PCI_COMMAND (as SLOF
>>>> stopped using the devices) and when this happens, I see unassigned
>>>> memory access (see below) which happens because disabling busmaster
>>>> disables IOMMU and QEMU cannot access the rings to do some shutdown. And
>>>> when this happens, the device does not come back even if SLOF re-enables
>>>> it.
>>>
>>> In fact clearing bus master should disable ring access even
>>> without the IOMMU.
>>> Once you do this you should not wait for rings to be processed,
>>> it is safe to assume they won't be touched again and just
>>> free up any buffers that have not been used.
>>>
>>> Why don't you see this without IOMMU?
>>
>> Because without IOMMU, virtio can always access rings, it does not need
>> bus master address space for that.
> 
> Right and that's a bug in virtio scsi. E.g. virtio net checks
> bus mastering before each access.

You have to be specific - virtio scsi in the guest or in QEMU?


> Which is all well and good, but we can't just break the world
> so I guess we first need to fix SLOF, and then add
> a compat property. And maybe keep it broken for
> legacy ...
> 
>>
>>> It's a bug I think, probably there to work around buggy guests.
>>>
>>> So pls fix this in SLOF and then hopefully we can drop the
>>> work arounds and have clearing bus master actually block DMA.
>>
>>
>> Laszlo suggested writing 0 to the status but this does not seem to help,
>> with either ioeventfd=true or ioeventfd=false. It looks like setting/clearing
>> the busmaster bit confuses the memory region caches in QEMU's virtio. I am not
>> sure which direction to keep digging in, any suggestions? Thanks,
>>
> 
> To clarify: you reset after setting bus master, right?


I was talking about clearing the bus master, and where I call that
virtio reset does not matter. Thanks,



> 
> 
>>
>>>
>>>> Hacking SLOF to not clear BUSMASTER makes virtio-scsi work but it is
>>>> hardly a right fix.
>>>>
>>>> Is this something expected? Thanks,


 Here is the exact command line:

 /home/aik/pbuild/qemu-garrison2-ppc64/ppc64-softmmu/qemu-system-ppc64 \

 -nodefaults \

 -chardev stdio,id=STDIO0,signal=off,mux=on \

 -device spapr-vty,id=svty0,reg=0x71000110,chardev=STDIO0 \

 -mon id=MON0,chardev=STDIO0,mode=readline \

 -nographic \

 -vga none \

 -enable-kvm \
 -m 2G \

 -device
 virtio-scsi-pci,id=vscsi0,iommu_platform=on,disable-modern=off,disable-legacy=on
 \
 -drive id=DRIVE0,if=none,file=img/u1804-64le.qcow2,format=qcow2 \

 -device scsi-disk,id=scsi-disk0,drive=DRIVE0 \

 -snapshot \

 -smp 1 \

 -machine pseries \

 -L /home/aik/t/qemu-ppc64-bios/ \

 -trace events=qemu_trace_events \

 -d guest_errors \

 -chardev socket,id=SOCKET0,server,nowait,path=qemu.mon.ssh59518 \

 -mon chardev=SOCKET0,mode=control



 Here is the backtrace:

 Thread 5 "qemu-system-ppc" hit Breakpoint 8, unassigned_mem_accepts
 (opaque=0x0, addr=0x5802, size=0x2, is_write=0x0, attrs=...) at /home/
 aik/p/qemu/memory.c:1275
 1275        return false;
 #0  unassigned_mem_accepts (opaque=0x0, addr=0x5802, size=0x2,
 is_write=0x0, attrs=...) at /home/aik/p/qemu/memory.c:1275
 #1  0x100a8ac8 in memory_region_access_valid (mr=0x1105c230
 , addr=0x5802, size=0x2, is_write=0x0, attrs=...) at
 /home/aik/p/qemu/memory.c:1377
 #2  0x100a8c88 in memory_region_dispatch_read (mr=0x1105c230
 , addr=0x5802, pval=0x7550d410, op=MO_16,
 attrs=...) at /home/aik/p/qemu/memory.c:1418
 #3  0x1001cad4 in address_space_lduw_internal_cached_slow
 (cache=0x7fff68036fa0, addr=0x2, attrs=..., result=0x0,
 endian=DEVICE_LITTLE_ENDIAN) at /home/aik/p/qemu/memory_ldst.inc.c:211
 #4  0x1001cc84 in address_space_lduw_le_cached_slow
 (cache=0x7fff68036fa0, addr=0x2, attrs=..., result=0x0) at
 /home/aik/p/qemu/memory_ldst.inc.c:249
 #5  0x1019bd80 in address_space_lduw_le_cached
 (cache=0x7fff68036fa0, addr=0x2, attrs=..., result=0x0) at
 /home/aik/p/qemu/include/exec/memory_ldst_cached.inc.h:56
 #6  0x1019c10c in lduw_le_phys_cached (cache=0x7fff68036fa0,
 addr=0x2) at /home/aik/p/qemu/include/exec/memory_ldst_phys.inc.h:91
 #7  0x1019d86c in virtio_lduw_phys_cached (vdev=0x118b9110,
 cache=0x7fff68036fa0, pa=0x2) at
 /home/aik/p/qemu/include/hw/virtio/virtio-access.h:166
 #8  

[PATCH v5 19/20] fuzz: add virtio-net fuzz target

2019-11-13 Thread Oleinik, Alexander
The virtio-net fuzz target feeds inputs to all three virtio-net
virtqueues, and uses forking to avoid leaking state between fuzz runs.

Signed-off-by: Alexander Bulekov 
---
 tests/fuzz/Makefile.include  |   1 +
 tests/fuzz/virtio_net_fuzz.c | 100 +++
 2 files changed, 101 insertions(+)
 create mode 100644 tests/fuzz/virtio_net_fuzz.c

diff --git a/tests/fuzz/Makefile.include b/tests/fuzz/Makefile.include
index 37d6821bee..f1d9b46b1c 100644
--- a/tests/fuzz/Makefile.include
+++ b/tests/fuzz/Makefile.include
@@ -6,5 +6,6 @@ fuzz-obj-y += tests/fuzz/fork_fuzz.o
 fuzz-obj-y += tests/fuzz/qos_fuzz.o
 
 fuzz-obj-y += tests/fuzz/i440fx_fuzz.o
+fuzz-obj-y += tests/fuzz/virtio_net_fuzz.o
 
 FUZZ_LDFLAGS += -Xlinker -T$(SRC_PATH)/tests/fuzz/fork_fuzz.ld
diff --git a/tests/fuzz/virtio_net_fuzz.c b/tests/fuzz/virtio_net_fuzz.c
new file mode 100644
index 00..cd7d086442
--- /dev/null
+++ b/tests/fuzz/virtio_net_fuzz.c
@@ -0,0 +1,100 @@
+/*
+ * virtio-net Fuzzing Target
+ *
+ * Copyright Red Hat Inc., 2019
+ *
+ * Authors:
+ *  Alexander Bulekov   
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+
+#include "fuzz.h"
+#include "fork_fuzz.h"
+#include "qos_fuzz.h"
+#include "tests/libqtest.h"
+#include "tests/libqos/virtio-net.h"
+
+
+static void virtio_net_fuzz_multi(QTestState *s,
+const unsigned char *Data, size_t Size)
+{
+typedef struct vq_action {
+uint8_t queue;
+uint8_t length;
+uint8_t write;
+uint8_t next;
+} vq_action;
+
+uint32_t free_head = 0;
+
+QGuestAllocator *t_alloc = fuzz_qos_alloc;
+
+QVirtioNet *net_if = fuzz_qos_obj;
+QVirtioDevice *dev = net_if->vdev;
+QVirtQueue *q;
+vq_action vqa;
+while (Size >= sizeof(vqa)) {
+memcpy(&vqa, Data, sizeof(vqa));
+Data += sizeof(vqa);
+Size -= sizeof(vqa);
+
+q = net_if->queues[vqa.queue % 3];
+
+vqa.length = vqa.length >= Size ? Size : vqa.length;
+
+uint64_t req_addr = guest_alloc(t_alloc, vqa.length);
+qtest_memwrite(s, req_addr, Data, vqa.length);
+free_head = qvirtqueue_add(s, q, req_addr, vqa.length,
+vqa.write, vqa.next);
+qvirtqueue_add(s, q, req_addr, vqa.length, vqa.write , vqa.next);
+qvirtqueue_kick(s, dev, q, free_head);
+Data += vqa.length;
+Size -= vqa.length;
+}
+}
+
+static int *sv;
+
+static void *virtio_net_test_setup_socket(GString *cmd_line, void *arg)
+{
+if (!sv) {
+sv = g_new(int, 2);
+int ret = socketpair(PF_UNIX, SOCK_STREAM, 0, sv);
+fcntl(sv[0], F_SETFL, O_NONBLOCK);
+g_assert_cmpint(ret, !=, -1);
+}
+g_string_append_printf(cmd_line, " -netdev socket,fd=%d,id=hs0 ", sv[1]);
+return arg;
+}
+
+static void virtio_net_fork_fuzz(QTestState *s,
+const unsigned char *Data, size_t Size)
+{
+if (fork() == 0) {
+virtio_net_fuzz_multi(s, Data, Size);
+flush_events(s);
+_Exit(0);
+} else {
+wait(NULL);
+}
+}
+
+static void register_virtio_net_fuzz_targets(void)
+{
+fuzz_add_qos_target(&(FuzzTarget){
+.name = "virtio-net-fork-fuzz",
+.description = "Fuzz the virtio-net virtual queues, forking "
+"for each fuzz run",
+.pre_vm_init = _shm_init,
+.pre_fuzz = _init_path,
+.fuzz = virtio_net_fork_fuzz,},
+"virtio-net",
+&(QOSGraphTestOptions){.before = virtio_net_test_setup_socket}
+);
+}
+
+fuzz_target_init(register_virtio_net_fuzz_targets);
-- 
2.23.0
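
The input walk and the fork-per-run isolation used by the virtio-net target above can be illustrated without any QEMU dependencies. What follows is only a minimal sketch under stated assumptions: count_actions() and run_isolated() are hypothetical names that do not appear in the patch, and nothing here touches a virtqueue. It mirrors the "4-byte header, then payload clamped to the remaining bytes" loop and the idea that state changed by one input is thrown away when the child exits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Same layout as the vq_action header in the patch: four one-byte fields. */
typedef struct {
    uint8_t queue;
    uint8_t length;
    uint8_t write;
    uint8_t next;
} vq_action;

/*
 * Walk the buffer as a sequence of [4-byte header][payload] records and
 * return how many records were found. The payload length is clamped to
 * the bytes that remain, as the patch does. Hypothetical helper.
 */
static size_t count_actions(const unsigned char *data, size_t size)
{
    size_t n = 0;
    vq_action vqa;

    assert(data != NULL || size == 0);
    while (size >= sizeof(vqa)) {
        memcpy(&vqa, data, sizeof(vqa));
        data += sizeof(vqa);
        size -= sizeof(vqa);

        size_t len = vqa.length >= size ? size : vqa.length;
        /* A real target would pick queue (vqa.queue % 3) and enqueue here. */
        data += len;
        size -= len;
        n++;
    }
    return n;
}

/*
 * Run one input in a child process so that any state it changes is
 * discarded when the child exits. Returns the child's exit status
 * (the record count truncated to 8 bits), or -1 on failure.
 */
static int run_isolated(const unsigned char *data, size_t size)
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit((int)(count_actions(data, size) & 0xff));
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Unlike the patch, the sketch checks the fork() result and waits for the specific child; whether the real target needs that is a separate question.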




[PATCH v5 02/20] libqos: Rename i2c_send and i2c_recv

2019-11-13 Thread Oleinik, Alexander
The names i2c_send and i2c_recv collide with functions defined in
hw/i2c/core.c. This causes an error when linking against libqos and
softmmu simultaneously (for example when using qtest inproc). Rename the
libqos functions to avoid this.

Signed-off-by: Alexander Bulekov 
---
 tests/libqos/i2c.c   | 10 +-
 tests/libqos/i2c.h   |  4 ++--
 tests/pca9552-test.c | 10 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/tests/libqos/i2c.c b/tests/libqos/i2c.c
index 156114e745..38f800dbab 100644
--- a/tests/libqos/i2c.c
+++ b/tests/libqos/i2c.c
@@ -10,12 +10,12 @@
 #include "libqos/i2c.h"
 #include "libqtest.h"
 
-void i2c_send(QI2CDevice *i2cdev, const uint8_t *buf, uint16_t len)
+void qi2c_send(QI2CDevice *i2cdev, const uint8_t *buf, uint16_t len)
 {
 i2cdev->bus->send(i2cdev->bus, i2cdev->addr, buf, len);
 }
 
-void i2c_recv(QI2CDevice *i2cdev, uint8_t *buf, uint16_t len)
+void qi2c_recv(QI2CDevice *i2cdev, uint8_t *buf, uint16_t len)
 {
 i2cdev->bus->recv(i2cdev->bus, i2cdev->addr, buf, len);
 }
@@ -23,8 +23,8 @@ void i2c_recv(QI2CDevice *i2cdev, uint8_t *buf, uint16_t len)
 void i2c_read_block(QI2CDevice *i2cdev, uint8_t reg,
 uint8_t *buf, uint16_t len)
 {
-i2c_send(i2cdev, &reg, 1);
-i2c_recv(i2cdev, buf, len);
+qi2c_send(i2cdev, &reg, 1);
+qi2c_recv(i2cdev, buf, len);
 }
 
 void i2c_write_block(QI2CDevice *i2cdev, uint8_t reg,
@@ -33,7 +33,7 @@ void i2c_write_block(QI2CDevice *i2cdev, uint8_t reg,
 uint8_t *cmd = g_malloc(len + 1);
 cmd[0] = reg;
 memcpy(&cmd[1], buf, len);
-i2c_send(i2cdev, cmd, len + 1);
+qi2c_send(i2cdev, cmd, len + 1);
 g_free(cmd);
 }
 
diff --git a/tests/libqos/i2c.h b/tests/libqos/i2c.h
index 945b65b34c..c65f087834 100644
--- a/tests/libqos/i2c.h
+++ b/tests/libqos/i2c.h
@@ -47,8 +47,8 @@ struct QI2CDevice {
 void *i2c_device_create(void *i2c_bus, QGuestAllocator *alloc, void *addr);
 void add_qi2c_address(QOSGraphEdgeOptions *opts, QI2CAddress *addr);
 
-void i2c_send(QI2CDevice *dev, const uint8_t *buf, uint16_t len);
-void i2c_recv(QI2CDevice *dev, uint8_t *buf, uint16_t len);
+void qi2c_send(QI2CDevice *dev, const uint8_t *buf, uint16_t len);
+void qi2c_recv(QI2CDevice *dev, uint8_t *buf, uint16_t len);
 
 void i2c_read_block(QI2CDevice *dev, uint8_t reg,
 uint8_t *buf, uint16_t len);
diff --git a/tests/pca9552-test.c b/tests/pca9552-test.c
index 4b800d3c3e..d80ed93cd3 100644
--- a/tests/pca9552-test.c
+++ b/tests/pca9552-test.c
@@ -32,22 +32,22 @@ static void receive_autoinc(void *obj, void *data, QGuestAllocator *alloc)
 
 pca9552_init(i2cdev);
 
-i2c_send(i2cdev, , 1);
+qi2c_send(i2cdev, , 1);
 
 /* PCA9552_LS0 */
-i2c_recv(i2cdev, &resp, 1);
+qi2c_recv(i2cdev, &resp, 1);
 g_assert_cmphex(resp, ==, 0x54);
 
 /* PCA9552_LS1 */
-i2c_recv(i2cdev, &resp, 1);
+qi2c_recv(i2cdev, &resp, 1);
 g_assert_cmphex(resp, ==, 0x55);
 
 /* PCA9552_LS2 */
-i2c_recv(i2cdev, &resp, 1);
+qi2c_recv(i2cdev, &resp, 1);
 g_assert_cmphex(resp, ==, 0x55);
 
 /* PCA9552_LS3 */
-i2c_recv(i2cdev, &resp, 1);
+qi2c_recv(i2cdev, &resp, 1);
 g_assert_cmphex(resp, ==, 0x54);
 }
 
-- 
2.23.0




[PATCH v5 03/20] fuzz: Add FUZZ_TARGET module type

2019-11-13 Thread Oleinik, Alexander
Signed-off-by: Alexander Bulekov 
---
 include/qemu/module.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/qemu/module.h b/include/qemu/module.h
index 65ba596e46..684753d808 100644
--- a/include/qemu/module.h
+++ b/include/qemu/module.h
@@ -46,6 +46,7 @@ typedef enum {
 MODULE_INIT_TRACE,
 MODULE_INIT_XEN_BACKEND,
 MODULE_INIT_LIBQOS,
+MODULE_INIT_FUZZ_TARGET,
 MODULE_INIT_MAX
 } module_init_type;
 
@@ -56,7 +57,8 @@ typedef enum {
 #define xen_backend_init(function) module_init(function, \
MODULE_INIT_XEN_BACKEND)
 #define libqos_init(function) module_init(function, MODULE_INIT_LIBQOS)
-
+#define fuzz_target_init(function) module_init(function, \
+   MODULE_INIT_FUZZ_TARGET)
 #define block_module_load_one(lib) module_load_one("block-", lib)
 #define ui_module_load_one(lib) module_load_one("ui-", lib)
 #define audio_module_load_one(lib) module_load_one("audio-", lib)
-- 
2.23.0




[PATCH v5 08/20] tests: provide test variables to other targets

2019-11-13 Thread Oleinik, Alexander
Before, when tests/Makefile.include was included, the contents would be
ignored if config-host.mak was defined. Moving the ifneq responsible for
this allows a target to depend on both testing-related and host-related
objects. For example the virtual-device fuzzer relies on both
libqtest/libqos objects and softmmu objects.

Signed-off-by: Alexander Bulekov 
---
 tests/Makefile.include | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 34ec03391c..67853d10c3 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -27,7 +27,6 @@ check-help:
@echo "Default options are -k and (for $(MAKE) V=1) --verbose; they can 
be"
@echo "changed with variable GTESTER_OPTIONS."
 
-ifneq ($(wildcard config-host.mak),)
 export SRC_PATH
 
 # TODO don't duplicate $(SRC_PATH)/Makefile's qapi-py here
@@ -873,6 +872,8 @@ tests/test-qga$(EXESUF): tests/test-qga.o $(qtest-obj-y)
 
 SPEED = quick
 
+ifneq ($(wildcard config-host.mak),)
+
 # gtester tests, possibly with verbose output
 # do_test_tap runs all tests, even if some of them fail, while do_test_human
 # stops at the first failure unless -k is given on the command line
-- 
2.23.0




[PATCH v5 07/20] qtest: add in-process incoming command handler

2019-11-13 Thread Oleinik, Alexander
The handler allows a qtest client to send commands to the server by
directly calling a function, rather than using a file/CharBackend.

Signed-off-by: Alexander Bulekov 
---
 include/sysemu/qtest.h |  1 +
 qtest.c| 13 +
 2 files changed, 14 insertions(+)

diff --git a/include/sysemu/qtest.h b/include/sysemu/qtest.h
index e2f1047fd7..eedd3664f0 100644
--- a/include/sysemu/qtest.h
+++ b/include/sysemu/qtest.h
@@ -28,5 +28,6 @@ void qtest_server_init(const char *qtest_chrdev, const char *qtest_log, Error **
 
 void qtest_server_set_send_handler(void (*send)(void *, const char *),
  void *opaque);
+void qtest_server_inproc_recv(void *opaque, const char *buf);
 
 #endif
diff --git a/qtest.c b/qtest.c
index 58d7e2a6fb..1db712d302 100644
--- a/qtest.c
+++ b/qtest.c
@@ -803,3 +803,16 @@ bool qtest_driver(void)
 {
 return qtest_chr.chr != NULL;
 }
+
+void qtest_server_inproc_recv(void *dummy, const char *buf)
+{
+static GString *gstr;
+if (!gstr) {
+gstr = g_string_new(NULL);
+}
+g_string_append(gstr, buf);
+if (gstr->str[gstr->len - 1] == '\n') {
+qtest_process_inbuf(NULL, gstr);
+g_string_truncate(gstr, 0);
+}
+}
-- 
2.23.0
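
The buffering in qtest_server_inproc_recv above (append each fragment, dispatch once the buffer ends in a newline, then reset) can be sketched in plain C. This is an illustration only, assuming a fixed-size buffer instead of GString; inproc_recv(), process_line(), lines_processed and last_line are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LINE_MAX_LEN 256

/* Hypothetical stand-in for the command processor; counts complete lines. */
static int lines_processed;
static char last_line[LINE_MAX_LEN];

static void process_line(const char *line)
{
    strncpy(last_line, line, sizeof(last_line) - 1);
    last_line[sizeof(last_line) - 1] = '\0';
    lines_processed++;
}

/*
 * Append a fragment to a static buffer; once the buffer ends in '\n',
 * hand it to the processor and reset the buffer.
 */
static void inproc_recv(const char *buf)
{
    static char acc[LINE_MAX_LEN];
    static size_t len;
    size_t n = strlen(buf);

    assert(len + n < sizeof(acc));   /* sketch only: the buffer never grows */
    memcpy(acc + len, buf, n + 1);   /* copy the terminator as well */
    len += n;

    if (len > 0 && acc[len - 1] == '\n') {
        process_line(acc);
        len = 0;
        acc[0] = '\0';
    }
}
```

The sketch adds a "len > 0" guard before looking at the last byte. The handler in the patch indexes gstr->len - 1 directly, so it may be worth checking what happens there when an empty string arrives while the buffer is empty.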




[PATCH v5 13/20] fuzz: add configure flag --enable-fuzzing

2019-11-13 Thread Oleinik, Alexander
Signed-off-by: Alexander Bulekov 
---
 configure | 39 +++
 1 file changed, 39 insertions(+)

diff --git a/configure b/configure
index 3be9e92a24..aeca632dd9 100755
--- a/configure
+++ b/configure
@@ -501,6 +501,7 @@ libxml2=""
 debug_mutex="no"
 libpmem=""
 default_devices="yes"
+fuzzing="no"
 
 supported_cpu="no"
 supported_os="no"
@@ -630,6 +631,15 @@ int main(void) { return 0; }
 EOF
 }
 
+write_c_fuzzer_skeleton() {
+cat > $TMPC <
+#include 
+int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);
+int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) { return 0; }
+EOF
+}
+
 if check_define __linux__ ; then
   targetos="Linux"
 elif check_define _WIN32 ; then
@@ -1532,6 +1542,10 @@ for opt do
   ;;
   --disable-xkbcommon) xkbcommon=no
   ;;
+  --enable-fuzzing) fuzzing=yes
+  ;;
+  --disable-fuzzing) fuzzing=no
+  ;;
   *)
   echo "ERROR: unknown option $opt"
   echo "Try '$0 --help' for more information"
@@ -5911,6 +5925,15 @@ EOF
   fi
 fi
 
+##
+# checks for fuzzer
+if test "$fuzzing" = "yes" ; then
+  write_c_fuzzer_skeleton
+  if compile_prog "$CPU_CFLAGS -Werror -fsanitize=address,fuzzer" ""; then
+  have_fuzzer=yes
+  fi
+fi
+
 ##
 # check for libpmem
 
@@ -6491,6 +6514,7 @@ echo "capstone  $capstone"
 echo "libpmem support   $libpmem"
 echo "libudev   $libudev"
 echo "default devices   $default_devices"
+echo "fuzzing support   $fuzzing"
 
 if test "$supported_cpu" = "no"; then
 echo
@@ -7327,6 +7351,16 @@ fi
 if test "$sheepdog" = "yes" ; then
   echo "CONFIG_SHEEPDOG=y" >> $config_host_mak
 fi
+if test "$fuzzing" = "yes" ; then
+  if test "$have_fuzzer" = "yes"; then
+FUZZ_LDFLAGS=" -fsanitize=address,fuzzer"
+FUZZ_CFLAGS=" -fsanitize=address,fuzzer"
+CFLAGS=" -fsanitize=address"
+  else
+error_exit "Your compiler doesn't support -fsanitize=address,fuzzer"
+exit 1
+  fi
+fi
 
 if test "$tcg_interpreter" = "yes"; then
   QEMU_INCLUDES="-iquote \$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
@@ -7409,6 +7443,11 @@ if test "$libudev" != "no"; then
 echo "CONFIG_LIBUDEV=y" >> $config_host_mak
 echo "LIBUDEV_LIBS=$libudev_libs" >> $config_host_mak
 fi
+if test "$fuzzing" != "no"; then
+echo "CONFIG_FUZZ=y" >> $config_host_mak
+echo "FUZZ_CFLAGS=$FUZZ_CFLAGS" >> $config_host_mak
+echo "FUZZ_LDFLAGS=$FUZZ_LDFLAGS" >> $config_host_mak
+fi
 
 # use included Linux headers
 if test "$linux" = "yes" ; then
-- 
2.23.0
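
For readers unfamiliar with what the configure check above compiles: libFuzzer supplies main() itself and calls LLVMFuzzerTestOneInput() once per generated input. A minimal sketch of such an entry point follows. has_magic() is a hypothetical piece of "code under test" invented for this illustration; only the entry point name and signature come from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical code under test: does the input start with "QEMU"? */
static int has_magic(const uint8_t *data, size_t size)
{
    static const char magic[] = "QEMU";

    return size >= 4 && memcmp(data, magic, 4) == 0;
}

/* Entry point that libFuzzer calls once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size);
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size)
{
    assert(Data != NULL || Size == 0);
    (void)has_magic(Data, Size);
    return 0;   /* same return value as the configure skeleton */
}
```

With clang, building a file like this with -fsanitize=address,fuzzer (the flags the check uses) links in the fuzzer runtime, which is why the skeleton defines no main().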




[PATCH v5 00/20] Add virtual device fuzzing support

2019-11-13 Thread Oleinik, Alexander
This series adds a framework for coverage-guided fuzzing of
virtual-devices. Fuzzing targets are based on qtest and can make use of
the libqos abstractions.

V5:
 * misc fixes addressing V4 comments
 * cleanup in-process handlers/globals in libqtest.c
 * small fixes to fork-based fuzzing and support for multiple workers
 * changes to the virtio-net fuzzer to kick after each vq add

V4:
 * add/transfer license headers to new files
 * restructure the added QTestClientTransportOps struct
 * restructure the FuzzTarget struct and fuzzer skeleton
 * fork-based fuzzer now directly mmaps shm over the coverage bitmaps
 * fixes to i440 and virtio-net fuzz targets
 * undo the changes to qtest_memwrite
 * possible to build /fuzz and /all in the same build-dir
 * misc fixes to address V3 comments

V3:
 * rebased onto v4.1.0+
 * add the fuzzer as a new build-target type in the build-system
 * add indirection to qtest client/server communication functions
 * remove ramfile and snapshot-based fuzzing support
 * add i440fx fuzz-target as a reference for developers.
 * add linker-script to assist with fork-based fuzzer

V2:
 * split off changes to qos virtio-net and qtest server to other patches
 * move vl:main initialization into new func: qemu_init
 * moved useful functions from qos-test.c to a separate object
 * use struct of function pointers for add_fuzz_target(), instead of
   arguments
 * move ramfile to migration/qemu-file
 * rewrite fork-based fuzzer pending patch to libfuzzer
 * pass check-patch

Alexander Bulekov (20):
  softmmu: split off vl.c:main() into main.c
  libqos: Rename i2c_send and i2c_recv
  fuzz: Add FUZZ_TARGET module type
  qtest: add qtest_server_send abstraction
  libqtest: Add a layer of abstraction to send/recv
  module: check module wasn't already initialized
  qtest: add in-process incoming command handler
  tests: provide test variables to other targets
  libqos: split qos-test and libqos makefile vars
  libqos: move useful qos-test funcs to qos_external
  libqtest: make bufwrite rely on the TransportOps
  libqtest: add in-process qtest.c tx/rx handlers
  fuzz: add configure flag --enable-fuzzing
  fuzz: Add target/fuzz makefile rules
  fuzz: add fuzzer skeleton
  fuzz: add support for fork-based fuzzing.
  fuzz: add support for qos-assisted fuzz targets
  fuzz: add i440fx fuzz targets
  fuzz: add virtio-net fuzz target
  fuzz: add documentation to docs/devel/

 Makefile |  16 ++-
 Makefile.objs|   4 +
 Makefile.target  |  18 ++-
 configure|  39 ++
 docs/devel/fuzzing.txt   | 119 ++
 exec.c   |  12 +-
 include/qemu/module.h|   4 +-
 include/sysemu/qtest.h   |   4 +
 include/sysemu/sysemu.h  |   4 +
 main.c   |  53 
 qtest.c  |  31 -
 tests/Makefile.include   |  75 +--
 tests/fuzz/Makefile.include  |  11 ++
 tests/fuzz/fork_fuzz.c   |  55 +
 tests/fuzz/fork_fuzz.h   |  23 
 tests/fuzz/fork_fuzz.ld  |  37 ++
 tests/fuzz/fuzz.c| 179 +++
 tests/fuzz/fuzz.h|  94 ++
 tests/fuzz/i440fx_fuzz.c | 176 ++
 tests/fuzz/qos_fuzz.c| 232 +++
 tests/fuzz/qos_fuzz.h|  33 +
 tests/fuzz/virtio_net_fuzz.c | 100 +++
 tests/libqos/i2c.c   |  10 +-
 tests/libqos/i2c.h   |   4 +-
 tests/libqos/qos_external.c  | 168 +
 tests/libqos/qos_external.h  |  28 +
 tests/libqtest.c | 108 ++--
 tests/libqtest.h |   4 +
 tests/pca9552-test.c |  10 +-
 tests/qos-test.c | 140 +
 util/module.c|   7 ++
 vl.c |  38 ++
 32 files changed, 1607 insertions(+), 229 deletions(-)
 create mode 100644 docs/devel/fuzzing.txt
 create mode 100644 main.c
 create mode 100644 tests/fuzz/Makefile.include
 create mode 100644 tests/fuzz/fork_fuzz.c
 create mode 100644 tests/fuzz/fork_fuzz.h
 create mode 100644 tests/fuzz/fork_fuzz.ld
 create mode 100644 tests/fuzz/fuzz.c
 create mode 100644 tests/fuzz/fuzz.h
 create mode 100644 tests/fuzz/i440fx_fuzz.c
 create mode 100644 tests/fuzz/qos_fuzz.c
 create mode 100644 tests/fuzz/qos_fuzz.h
 create mode 100644 tests/fuzz/virtio_net_fuzz.c
 create mode 100644 tests/libqos/qos_external.c
 create mode 100644 tests/libqos/qos_external.h

-- 
2.23.0




[PATCH v5 04/20] qtest: add qtest_server_send abstraction

2019-11-13 Thread Oleinik, Alexander
qtest_server_send is a function pointer specifying the handler used to
transmit data to the qtest client. In the standard configuration, this
calls the CharBackend handler, but other types of handlers are now
possible, e.g. direct function calls if the qtest client and server
exist within the same process (inproc).

Signed-off-by: Alexander Bulekov 
---
 include/sysemu/qtest.h |  3 +++
 qtest.c| 18 --
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/sysemu/qtest.h b/include/sysemu/qtest.h
index 5ed09c80b1..e2f1047fd7 100644
--- a/include/sysemu/qtest.h
+++ b/include/sysemu/qtest.h
@@ -26,4 +26,7 @@ bool qtest_driver(void);
 
 void qtest_server_init(const char *qtest_chrdev, const char *qtest_log, Error **errp);
 
+void qtest_server_set_send_handler(void (*send)(void *, const char *),
+ void *opaque);
+
 #endif
diff --git a/qtest.c b/qtest.c
index 8b50e2783e..58d7e2a6fb 100644
--- a/qtest.c
+++ b/qtest.c
@@ -42,6 +42,8 @@ static GString *inbuf;
 static int irq_levels[MAX_IRQ];
 static qemu_timeval start_time;
 static bool qtest_opened;
+static void (*qtest_server_send)(void*, const char*);
+static void *qtest_server_send_opaque;
 
 #define FMT_timeval "%ld.%06ld"
 
@@ -228,8 +230,10 @@ static void GCC_FMT_ATTR(1, 2) qtest_log_send(const char *fmt, ...)
 va_end(ap);
 }
 
-static void do_qtest_send(CharBackend *chr, const char *str, size_t len)
+static void qtest_server_char_be_send(void *opaque, const char *str)
 {
+size_t len = strlen(str);
+CharBackend* chr = (CharBackend *)opaque;
 qemu_chr_fe_write_all(chr, (uint8_t *)str, len);
 if (qtest_log_fp && qtest_opened) {
 fprintf(qtest_log_fp, "%s", str);
@@ -238,7 +242,7 @@ static void do_qtest_send(CharBackend *chr, const char *str, size_t len)
 
 static void qtest_send(CharBackend *chr, const char *str)
 {
-do_qtest_send(chr, str, strlen(str));
+qtest_server_send(qtest_server_send_opaque, str);
 }
 
 static void GCC_FMT_ATTR(2, 3) qtest_sendf(CharBackend *chr,
@@ -783,6 +787,16 @@ void qtest_server_init(const char *qtest_chrdev, const char *qtest_log, Error **
 qemu_chr_fe_set_echo(&qtest_chr, true);
 
 inbuf = g_string_new("");
+
+if (!qtest_server_send) {
+qtest_server_set_send_handler(qtest_server_char_be_send, &qtest_chr);
+}
+}
+
+void qtest_server_set_send_handler(void (*send)(void*, const char*), void *opaque)
+{
+qtest_server_send = send;
+qtest_server_send_opaque = opaque;
 }
 
 bool qtest_driver(void)
-- 
2.23.0
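
The send-handler indirection above (a function pointer plus an opaque pointer, with the default installed only if nothing was set earlier) can be tried out in isolation. The sketch below is an illustration under assumptions, not the qtest code: a stdio stream stands in for the CharBackend, and server_init(), server_reply(), stream_send(), capture_send() and capture_buf are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

typedef void (*send_fn)(void *opaque, const char *str);

static send_fn server_send;
static void *server_send_opaque;

/* Install a handler and the opaque pointer that is passed back to it. */
static void server_set_send_handler(send_fn send, void *opaque)
{
    server_send = send;
    server_send_opaque = opaque;
}

/* Default handler: write to a stdio stream (stands in for the CharBackend). */
static void stream_send(void *opaque, const char *str)
{
    fputs(str, (FILE *)opaque);
}

/* In-process handler: append to a caller-provided buffer. */
typedef struct {
    char data[128];
    size_t len;
} capture_buf;

static void capture_send(void *opaque, const char *str)
{
    capture_buf *c = opaque;
    size_t n = strlen(str);

    assert(c->len + n < sizeof(c->data));
    memcpy(c->data + c->len, str, n + 1);
    c->len += n;
}

/* Mirrors the init logic: fall back to the default only if nothing is set. */
static void server_init(void)
{
    if (!server_send) {
        server_set_send_handler(stream_send, stdout);
    }
}

static void server_reply(const char *str)
{
    assert(server_send != NULL);
    server_send(server_send_opaque, str);
}
```

An in-process client would call server_set_send_handler(capture_send, &buf) before server_init(), so the default never replaces it; that ordering is the point of the "if (!qtest_server_send)" check in the patch.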




[PATCH v5 05/20] libqtest: Add a layer of abstraction to send/recv

2019-11-13 Thread Oleinik, Alexander
This makes it simple to swap the transport functions for qtest commands
to and from the qtest client. For example, it is now possible to pass
qtest commands directly to a server handler that exists within the same
process, instead of writing them to a file descriptor.

Signed-off-by: Alexander Bulekov 
---
 tests/libqtest.c | 50 +++-
 1 file changed, 41 insertions(+), 9 deletions(-)

diff --git a/tests/libqtest.c b/tests/libqtest.c
index 3706bccd8d..c406b2ea09 100644
--- a/tests/libqtest.c
+++ b/tests/libqtest.c
@@ -35,6 +35,17 @@
 #define SOCKET_TIMEOUT 50
 #define SOCKET_MAX_FDS 16
 
+
+typedef void (*QTestSendFn)(QTestState *s, const char *buf);
+typedef void (*ExternalSendFn)(void *s, const char *buf);
+typedef GString* (*QTestRecvFn)(QTestState *);
+
+typedef struct QTestClientTransportOps {
+QTestSendFn send;
+ExternalSendFn  external_send;
+QTestRecvFn recv_line;
+} QTestTransportOps;
+
 struct QTestState
 {
 int fd;
@@ -45,6 +56,7 @@ struct QTestState
 bool big_endian;
 bool irq_level[MAX_IRQ];
 GString *rx;
+QTestTransportOps ops;
 };
 
 static GHookList abrt_hooks;
@@ -52,6 +64,14 @@ static struct sigaction sigact_old;
 
 static int qtest_query_target_endianness(QTestState *s);
 
+static void qtest_client_socket_send(QTestState*, const char *buf);
+static void socket_send(int fd, const char *buf, size_t size);
+
+static GString *qtest_client_socket_recv_line(QTestState *);
+
+static void qtest_client_set_tx_handler(QTestState *s, QTestSendFn send);
+static void qtest_client_set_rx_handler(QTestState *s, QTestRecvFn recv);
+
 static int init_socket(const char *socket_path)
 {
 struct sockaddr_un addr;
@@ -234,6 +254,9 @@ QTestState *qtest_init_without_qmp_handshake(const char *extra_args)
 sock = init_socket(socket_path);
 qmpsock = init_socket(qmp_socket_path);
 
+qtest_client_set_rx_handler(s, qtest_client_socket_recv_line);
+qtest_client_set_tx_handler(s, qtest_client_socket_send);
+
 qtest_add_abrt_handler(kill_qemu_hook_func, s);
 
 command = g_strdup_printf("exec %s "
@@ -379,13 +402,9 @@ static void socket_send(int fd, const char *buf, size_t size)
 }
 }
 
-static void socket_sendf(int fd, const char *fmt, va_list ap)
+static void qtest_client_socket_send(QTestState *s, const char *buf)
 {
-gchar *str = g_strdup_vprintf(fmt, ap);
-size_t size = strlen(str);
-
-socket_send(fd, str, size);
-g_free(str);
+socket_send(s->fd, buf, strlen(buf));
 }
 
 static void GCC_FMT_ATTR(2, 3) qtest_sendf(QTestState *s, const char *fmt, ...)
@@ -393,8 +412,11 @@ static void GCC_FMT_ATTR(2, 3) qtest_sendf(QTestState *s, const char *fmt, ...)
 va_list ap;
 
 va_start(ap, fmt);
-socket_sendf(s->fd, fmt, ap);
+gchar *str = g_strdup_vprintf(fmt, ap);
 va_end(ap);
+
+s->ops.send(s, str);
+g_free(str);
 }
 
 /* Sends a message and file descriptors to the socket.
@@ -431,7 +453,7 @@ static void socket_send_fds(int socket_fd, int *fds, size_t fds_num,
 g_assert_cmpint(ret, >, 0);
 }
 
-static GString *qtest_recv_line(QTestState *s)
+static GString *qtest_client_socket_recv_line(QTestState *s)
 {
 GString *line;
 size_t offset;
@@ -468,7 +490,7 @@ static gchar **qtest_rsp(QTestState *s, int expected_args)
 int i;
 
 redo:
-line = qtest_recv_line(s);
+line = s->ops.recv_line(s);
 words = g_strsplit(line->str, " ", 0);
 g_string_free(line, TRUE);
 
@@ -1336,3 +1358,13 @@ void qmp_assert_error_class(QDict *rsp, const char *class)
 
 qobject_unref(rsp);
 }
+
+static void qtest_client_set_tx_handler(QTestState *s,
+QTestSendFn send)
+{
+s->ops.send = send;
+}
+static void qtest_client_set_rx_handler(QTestState *s, QTestRecvFn recv)
+{
+s->ops.recv_line = recv;
+}
-- 
2.23.0
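
To show what the ops table above buys on the client side, here is a small sketch in which callers only ever go through the table, so a transport can be replaced without touching them. It is a simplification under assumptions: the loopback "server" that answers "OK <cmd>" is invented for the example, recv_line returns a plain string rather than a GString, and client, transport_ops and client_command() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct client client;

typedef void (*client_send_fn)(client *c, const char *buf);
typedef const char *(*client_recv_fn)(client *c);

typedef struct {
    client_send_fn send;
    client_recv_fn recv_line;
} transport_ops;

struct client {
    transport_ops ops;
    char pending[64];   /* used only by the loopback transport */
};

/* Loopback transport: the "server" answers every command with "OK <cmd>". */
static void loopback_send(client *c, const char *buf)
{
    assert(strlen(buf) + 4 <= sizeof(c->pending));
    strcpy(c->pending, "OK ");
    strcat(c->pending, buf);
}

static const char *loopback_recv_line(client *c)
{
    return c->pending;
}

/* Callers go through the ops table and never name a transport directly. */
static const char *client_command(client *c, const char *cmd)
{
    c->ops.send(c, cmd);
    return c->ops.recv_line(c);
}
```

A socket-backed pair of functions with the same signatures could be stored in the same two slots, which is the substitution the patch performs for the default case.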




[PATCH v5 06/20] module: check module wasn't already initialized

2019-11-13 Thread Oleinik, Alexander
The virtual-device fuzzer must initialize QOM prior to running
vl:qemu_init, so that it can use the qos_graph to identify the arguments
required to initialize a guest for libqos-assisted fuzzing. This change
prevents errors when vl:qemu_init tries to (re)initialize the previously
initialized QOM module.

Signed-off-by: Alexander Bulekov 
---
 util/module.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/util/module.c b/util/module.c
index e9fe3e5422..841e490e06 100644
--- a/util/module.c
+++ b/util/module.c
@@ -30,6 +30,7 @@ typedef struct ModuleEntry
 typedef QTAILQ_HEAD(, ModuleEntry) ModuleTypeList;
 
 static ModuleTypeList init_type_list[MODULE_INIT_MAX];
+static bool modules_init_done[MODULE_INIT_MAX];
 
 static ModuleTypeList dso_init_list;
 
@@ -91,11 +92,17 @@ void module_call_init(module_init_type type)
 ModuleTypeList *l;
 ModuleEntry *e;
 
+if (modules_init_done[type]) {
+return;
+}
+
 l = find_type(type);
 
 QTAILQ_FOREACH(e, l, node) {
 e->init();
 }
+
+modules_init_done[type] = true;
 }
 
 #ifdef CONFIG_MODULES
-- 
2.23.0
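
The guard added to module_call_init above makes initialization idempotent per module type. A minimal standalone sketch of the same pattern, assuming a hypothetical two-entry enum and a counter in place of the real initializer list (init_type, run_initializers() and init_calls are invented for the illustration):

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    INIT_QOM,
    INIT_FUZZ_TARGET,
    INIT_MAX
} init_type;

static bool init_done[INIT_MAX];
static int init_calls[INIT_MAX];   /* instrumentation for the sketch */

/* Stand-in for walking the per-type list and calling each e->init(). */
static void run_initializers(init_type type)
{
    init_calls[type]++;
}

/* Run the initializers for a type at most once. */
static void call_init(init_type type)
{
    assert(type < INIT_MAX);
    if (init_done[type]) {
        return;
    }
    run_initializers(type);
    init_done[type] = true;
}
```

Calling call_init(INIT_QOM) from the fuzzer and again from the normal startup path runs the initializers once, which is the behaviour the commit message describes.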




Re: About MONITOR/MWAIT in i386 CPU model

2019-11-13 Thread Eduardo Habkost
On Wed, Nov 13, 2019 at 04:42:25PM +0800, Tao Xu wrote:
> Hi Eduardo,
> 
> After KVM started using "-overcommit cpu-pm=on" to expose MONITOR/MWAIT
> (commit 6f131f13e68d648a8e4f083c667ab1acd88ce4cd), the MONITOR/MWAIT
> feature in some CPU models (phenom, core2duo, coreduo, n270, Opteron_G3,
> EPYC, Snowridge, Denverton) may be unused. For example, when we boot a guest
> with the Denverton CPU model, the guest cannot detect MONITOR and boots with
> no warning. Should we remove this feature from some CPU models?

Good catch, thanks!

Yes, we should remove them from Opteron_G3, EPYC, Snowridge, and
Denverton, at least.  The other older CPU models can be left
alone: they are more useful for use with TCG than with KVM, and
TCG supports MONITOR/MWAIT.

I would like to understand why this wasn't detected during
testing by Intel.  I suggest always testing CPU models using the
"enforce" flag to make sure warnings don't go unnoticed.

> 
> Tested by Guo, Xuelian 
> 
> Tao Xu
> 

-- 
Eduardo




[PATCH v3 3/4] watchdog/aspeed: Improve watchdog timeout message

2019-11-13 Thread Joel Stanley
Users benefit from knowing which watchdog timer has expired. The address
of the watchdog's registers unambiguously indicates which has expired,
so log that.

Reviewed-by: Cédric Le Goater 
Reviewed-by: Alex Bennée 
Signed-off-by: Joel Stanley 
---
v2: Use HWADDR_PRIx
v3: Fix spacing
---
 hw/watchdog/wdt_aspeed.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/hw/watchdog/wdt_aspeed.c b/hw/watchdog/wdt_aspeed.c
index 145be6f99ce2..d283d07d6546 100644
--- a/hw/watchdog/wdt_aspeed.c
+++ b/hw/watchdog/wdt_aspeed.c
@@ -219,7 +219,8 @@ static void aspeed_wdt_timer_expired(void *dev)
 return;
 }
 
-qemu_log_mask(CPU_LOG_RESET, "Watchdog timer expired.\n");
+qemu_log_mask(CPU_LOG_RESET, "Watchdog timer %" HWADDR_PRIx " expired.\n",
+  s->iomem.addr);
 watchdog_perform_action();
 timer_del(s->timer);
 }
-- 
2.24.0
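
The effect of the new format string above can be checked outside QEMU. The sketch below assumes that the register base address is a 64-bit value printed with PRIx64 (standing in for hwaddr and HWADDR_PRIx); format_expired() is a hypothetical helper, and it writes to a buffer instead of calling qemu_log_mask().

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format the expiry message for the watchdog whose registers start at base. */
static int format_expired(char *out, size_t outlen, uint64_t base)
{
    assert(out != NULL && outlen > 0);
    return snprintf(out, outlen,
                    "Watchdog timer %" PRIx64 " expired.\n", base);
}
```

The address is printed as bare hex digits with no "0x" prefix, matching the format string in the patch.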




[PATCH v3 4/4] watchdog/aspeed: Fix AST2600 frequency behaviour

2019-11-13 Thread Joel Stanley
The AST2600 control register sneakily changed the meaning of bit 4
without anyone noticing. It no longer controls the 1MHz vs APB clock
select, and instead always runs at 1MHz.

The AST2500 was always 1MHz too, but it retained bit 4, making it read
only. We can model both using the same fixed 1MHz calculation.

Fixes: 6b2b2a703cad ("hw: wdt_aspeed: Add AST2600 support")
Reviewed-by: Cédric Le Goater 
Reviewed-by: Alex Bennée 
Signed-off-by: Joel Stanley 
---
v2: Fix Fixes line in commit message
---
 hw/watchdog/wdt_aspeed.c | 21 +
 include/hw/watchdog/wdt_aspeed.h |  1 +
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/hw/watchdog/wdt_aspeed.c b/hw/watchdog/wdt_aspeed.c
index d283d07d6546..122aa8daaadf 100644
--- a/hw/watchdog/wdt_aspeed.c
+++ b/hw/watchdog/wdt_aspeed.c
@@ -93,11 +93,11 @@ static uint64_t aspeed_wdt_read(void *opaque, hwaddr 
offset, unsigned size)
 
 }
 
-static void aspeed_wdt_reload(AspeedWDTState *s, bool pclk)
+static void aspeed_wdt_reload(AspeedWDTState *s)
 {
 uint64_t reload;
 
-if (pclk) {
+if (!(s->regs[WDT_CTRL] & WDT_CTRL_1MHZ_CLK)) {
 reload = muldiv64(s->regs[WDT_RELOAD_VALUE], NANOSECONDS_PER_SECOND,
   s->pclk_freq);
 } else {
@@ -109,6 +109,16 @@ static void aspeed_wdt_reload(AspeedWDTState *s, bool pclk)
 }
 }
 
+static void aspeed_wdt_reload_1mhz(AspeedWDTState *s)
+{
+uint64_t reload = s->regs[WDT_RELOAD_VALUE] * 1000ULL;
+
+if (aspeed_wdt_is_enabled(s)) {
+timer_mod(s->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + reload);
+}
+}
+
+
 static void aspeed_wdt_write(void *opaque, hwaddr offset, uint64_t data,
  unsigned size)
 {
@@ -130,13 +140,13 @@ static void aspeed_wdt_write(void *opaque, hwaddr offset, 
uint64_t data,
 case WDT_RESTART:
 if ((data & 0xFFFF) == WDT_RESTART_MAGIC) {
 s->regs[WDT_STATUS] = s->regs[WDT_RELOAD_VALUE];
-aspeed_wdt_reload(s, !(s->regs[WDT_CTRL] & WDT_CTRL_1MHZ_CLK));
+awc->wdt_reload(s);
 }
 break;
 case WDT_CTRL:
 if (enable && !aspeed_wdt_is_enabled(s)) {
 s->regs[WDT_CTRL] = data;
-aspeed_wdt_reload(s, !(data & WDT_CTRL_1MHZ_CLK));
+awc->wdt_reload(s);
 } else if (!enable && aspeed_wdt_is_enabled(s)) {
 s->regs[WDT_CTRL] = data;
 timer_del(s->timer);
@@ -283,6 +293,7 @@ static void aspeed_2400_wdt_class_init(ObjectClass *klass, 
void *data)
 awc->offset = 0x20;
 awc->ext_pulse_width_mask = 0xff;
 awc->reset_ctrl_reg = SCU_RESET_CONTROL1;
+awc->wdt_reload = aspeed_wdt_reload;
 }
 
 static const TypeInfo aspeed_2400_wdt_info = {
@@ -317,6 +328,7 @@ static void aspeed_2500_wdt_class_init(ObjectClass *klass, 
void *data)
 awc->ext_pulse_width_mask = 0xf;
 awc->reset_ctrl_reg = SCU_RESET_CONTROL1;
 awc->reset_pulse = aspeed_2500_wdt_reset_pulse;
+awc->wdt_reload = aspeed_wdt_reload_1mhz;
 }
 
 static const TypeInfo aspeed_2500_wdt_info = {
@@ -336,6 +348,7 @@ static void aspeed_2600_wdt_class_init(ObjectClass *klass, 
void *data)
 awc->ext_pulse_width_mask = 0xf; /* TODO */
 awc->reset_ctrl_reg = AST2600_SCU_RESET_CONTROL1;
 awc->reset_pulse = aspeed_2500_wdt_reset_pulse;
+awc->wdt_reload = aspeed_wdt_reload_1mhz;
 }
 
 static const TypeInfo aspeed_2600_wdt_info = {
diff --git a/include/hw/watchdog/wdt_aspeed.h b/include/hw/watchdog/wdt_aspeed.h
index dfedd7662dd1..819c22993a6e 100644
--- a/include/hw/watchdog/wdt_aspeed.h
+++ b/include/hw/watchdog/wdt_aspeed.h
@@ -47,6 +47,7 @@ typedef struct AspeedWDTClass {
 uint32_t ext_pulse_width_mask;
 uint32_t reset_ctrl_reg;
 void (*reset_pulse)(AspeedWDTState *s, uint32_t property);
+void (*wdt_reload)(AspeedWDTState *s);
 }  AspeedWDTClass;
 
 #endif /* WDT_ASPEED_H */
-- 
2.24.0
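
For readers following the two reload paths in this patch, the arithmetic can be sketched outside QEMU as below. This is only an illustration under stated assumptions: the function names are made up for the example, and a plain 64-bit multiply stands in for QEMU's muldiv64(), which is safe here because a 32-bit reload value times 1e9 fits in 64 bits. Only the factor of 1000 ns per tick at 1MHz and the reload * NANOSECONDS_PER_SECOND / pclk_freq formula are taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

#define NANOSECONDS_PER_SECOND 1000000000ULL

/* Fixed 1MHz clock: one tick lasts 1000 ns (AST2500/AST2600 behaviour). */
static uint64_t wdt_timeout_ns_1mhz(uint32_t reload_value)
{
    return (uint64_t)reload_value * 1000ULL;
}

/*
 * APB clock (pclk) path, as selected on the AST2400 when the 1MHz bit is
 * clear: ticks * 1e9 / pclk_freq. The product cannot overflow 64 bits for
 * a 32-bit reload value, so no muldiv64() equivalent is needed here.
 */
static uint64_t wdt_timeout_ns_pclk(uint32_t reload_value, uint64_t pclk_freq)
{
    assert(pclk_freq != 0);
    return (uint64_t)reload_value * NANOSECONDS_PER_SECOND / pclk_freq;
}
```

With a reload value of 1000000, the 1MHz path gives a one second timeout; the pclk path gives the same result only when pclk_freq happens to be 1MHz as well.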




[PATCH v3 0/4] arm/aspeed: Watchdog and SDRAM fixes

2019-11-13 Thread Joel Stanley
Three of these are fixes for ast2600 models that I found when testing
master. The fourth is a usability improvement that is helpful when
diagnosing why a watchdog is biting.

v3 adds some comments and fixes whitespace, and r-b from Alex. Thanks
for the review Alex.

v2 fixes some review comments from Cédric and adds his r-b.

Joel Stanley (4):
  aspeed/sdmc: Make ast2600 default 1G
  aspeed/scu: Fix W1C behavior
  watchdog/aspeed: Improve watchdog timeout message
  watchdog/aspeed: Fix AST2600 frequency behaviour

 hw/misc/aspeed_scu.c | 15 +++
 hw/misc/aspeed_sdmc.c|  6 +++---
 hw/watchdog/wdt_aspeed.c | 24 +++-
 include/hw/watchdog/wdt_aspeed.h |  1 +
 4 files changed, 34 insertions(+), 12 deletions(-)

-- 
2.24.0




[PATCH v3 1/4] aspeed/sdmc: Make ast2600 default 1G

2019-11-13 Thread Joel Stanley
Most boards have this much.

Reviewed-by: Cédric Le Goater 
Reviewed-by: Alex Bennée 
Signed-off-by: Joel Stanley 
---
 hw/misc/aspeed_sdmc.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/hw/misc/aspeed_sdmc.c b/hw/misc/aspeed_sdmc.c
index f3a63a2e01db..2df3244b53c8 100644
--- a/hw/misc/aspeed_sdmc.c
+++ b/hw/misc/aspeed_sdmc.c
@@ -208,10 +208,10 @@ static int ast2600_rambits(AspeedSDMCState *s)
 }
 
 /* use a common default */
-warn_report("Invalid RAM size 0x%" PRIx64 ". Using default 512M",
+warn_report("Invalid RAM size 0x%" PRIx64 ". Using default 1024M",
 s->ram_size);
-s->ram_size = 512 << 20;
-return ASPEED_SDMC_AST2600_512MB;
+s->ram_size = 1024 << 20;
+return ASPEED_SDMC_AST2600_1024MB;
 }
 
 static void aspeed_sdmc_reset(DeviceState *dev)
-- 
2.24.0




[PATCH v3 2/4] aspeed/scu: Fix W1C behavior

2019-11-13 Thread Joel Stanley
This models the clock write one to clear registers, and fixes up some
incorrect behavior in all of the write to clear registers.

There was also a typo in one of the register definitions.

Reviewed-by: Cédric Le Goater 
Reviewed-by: Alex Bennée 
Signed-off-by: Joel Stanley 
---
v3: Beef up the comments
---
 hw/misc/aspeed_scu.c | 15 +++
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/hw/misc/aspeed_scu.c b/hw/misc/aspeed_scu.c
index 717509bc5460..5518168e48b4 100644
--- a/hw/misc/aspeed_scu.c
+++ b/hw/misc/aspeed_scu.c
@@ -98,7 +98,7 @@
 #define AST2600_CLK_STOP_CTRL TO_REG(0x80)
 #define AST2600_CLK_STOP_CTRL_CLR TO_REG(0x84)
 #define AST2600_CLK_STOP_CTRL2 TO_REG(0x90)
-#define AST2600_CLK_STOP_CTR2L_CLR TO_REG(0x94)
+#define AST2600_CLK_STOP_CTRL2_CLR TO_REG(0x94)
 #define AST2600_SDRAM_HANDSHAKE   TO_REG(0x100)
 #define AST2600_HPLL_PARAMTO_REG(0x200)
 #define AST2600_HPLL_EXT  TO_REG(0x204)
@@ -532,11 +532,13 @@ static uint64_t aspeed_ast2600_scu_read(void *opaque, 
hwaddr offset,
 return s->regs[reg];
 }
 
-static void aspeed_ast2600_scu_write(void *opaque, hwaddr offset, uint64_t 
data,
+static void aspeed_ast2600_scu_write(void *opaque, hwaddr offset, uint64_t 
data64,
  unsigned size)
 {
 AspeedSCUState *s = ASPEED_SCU(opaque);
 int reg = TO_REG(offset);
+/* Truncate here so bitwise operations below behave as expected */
+uint32_t data = data64;
 
 if (reg >= ASPEED_AST2600_SCU_NR_REGS) {
 qemu_log_mask(LOG_GUEST_ERROR,
@@ -563,15 +565,20 @@ static void aspeed_ast2600_scu_write(void *opaque, hwaddr 
offset, uint64_t data,
 /* fall through */
 case AST2600_SYS_RST_CTRL:
 case AST2600_SYS_RST_CTRL2:
+case AST2600_CLK_STOP_CTRL:
+case AST2600_CLK_STOP_CTRL2:
 /* W1S (Write 1 to set) registers */
 s->regs[reg] |= data;
 return;
 case AST2600_SYS_RST_CTRL_CLR:
 case AST2600_SYS_RST_CTRL2_CLR:
+case AST2600_CLK_STOP_CTRL_CLR:
+case AST2600_CLK_STOP_CTRL2_CLR:
 case AST2600_HW_STRAP1_CLR:
 case AST2600_HW_STRAP2_CLR:
-/* W1C (Write 1 to clear) registers */
-s->regs[reg] &= ~data;
+/* W1C (Write 1 to clear) registers are offset by one address from
+ * the data register */
+s->regs[reg - 1] &= ~data;
 return;
 
 case AST2600_RNG_DATA:
-- 
2.24.0
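
The W1S/W1C pairing fixed by this patch can be illustrated with a tiny standalone model. This is a sketch, not the SCU code: the four-entry register file and the index names are hypothetical, and the only behaviour taken from the diff is that a write to the set register ORs bits in, while a write to the clear register (one index above) clears bits in the data register below it.

```c
#include <assert.h>
#include <stdint.h>

#define NR_REGS 4 /* hypothetical tiny register file */

/* Hypothetical layout: data register at index N, its W1C alias at N + 1. */
enum { CLK_STOP_CTRL = 0, CLK_STOP_CTRL_CLR = 1 };

static uint32_t regs[NR_REGS];

static void scu_write(int reg, uint64_t data64)
{
    /* Truncate to the 32-bit register width, mirroring the patch. */
    uint32_t data = (uint32_t)data64;

    assert(reg >= 0 && reg < NR_REGS);

    switch (reg) {
    case CLK_STOP_CTRL:
        /* W1S: bits written as 1 are set, bits written as 0 are kept. */
        regs[reg] |= data;
        break;
    case CLK_STOP_CTRL_CLR:
        /* W1C: bits written as 1 are cleared in the data register below. */
        regs[reg - 1] &= ~data;
        break;
    default:
        /* Plain read/write register. */
        regs[reg] = data;
        break;
    }
}
```

Writing 0xF0 and then 0x0F to the set register leaves 0xFF in it; writing 0x3C to the clear register afterwards leaves 0xC3, and the clear register itself never stores anything.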




Re: [PATCH v14 03/11] tests: Add test for QAPI builtin type time

2019-11-13 Thread Eduardo Habkost
On Wed, Nov 13, 2019 at 09:01:29AM +0800, Tao Xu wrote:
> On 11/13/2019 4:15 AM, Eduardo Habkost wrote:
> > On Fri, Nov 08, 2019 at 09:05:52AM +0100, Markus Armbruster wrote:
> > > Tao Xu  writes:
> > > 
> > > > On 11/7/2019 9:31 PM, Eduardo Habkost wrote:
> > > > > On Thu, Nov 07, 2019 at 02:24:52PM +0800, Tao Xu wrote:
> > > > > > On 11/7/2019 4:53 AM, Eduardo Habkost wrote:
> > > > > > > On Mon, Oct 28, 2019 at 03:52:12PM +0800, Tao Xu wrote:
> > > > > > > > Add tests for time input such as zero, around limit of 
> > > > > > > > precision,
> > > > > > > > signed upper limit, actual upper limit, beyond limits, time 
> > > > > > > > suffixes,
> > > > > > > > and etc.
> > > > > > > > 
> > > > > > > > Signed-off-by: Tao Xu 
> > > > > > > > ---
> > > > > > > [...]
> > > > > > > > +/* Close to signed upper limit 0x7ffffffffffffc00 (53 msbs set) */
> > > > > > > > +qdict = keyval_parse("time1=9223372036854774784," /* 7ffffffffffffc00 */
> > > > > > > > + "time2=9223372036854775295", /* 7ffffffffffffdff */
> > > > > > > > + NULL, &error_abort);
> > > > > > > > +v = qobject_input_visitor_new_keyval(QOBJECT(qdict));
> > > > > > > > +qobject_unref(qdict);
> > > > > > > > +visit_start_struct(v, NULL, NULL, 0, &error_abort);
> > > > > > > > +visit_type_time(v, "time1", &time, &error_abort);
> > > > > > > > +g_assert_cmphex(time, ==, 0x7ffffffffffffc00);
> > > > > > > > +visit_type_time(v, "time2", &time, &error_abort);
> > > > > > > > +g_assert_cmphex(time, ==, 0x7ffffffffffffc00);
> > > > > > > 
> > > > > > > I'm confused by this test case and the one below[1].  Are these
> > > > > > > known bugs?  Shouldn't we document them as known bugs?
> > > > > > 
> > > > > > Because do_strtosz() or do_strtomul() actually parse with strtod(), so the
> > > > > > precision is 53 bits, so in these cases, 7ffffffffffffdff and
> > > > > > fffffffffffffbff are rounded.
> > > > > 
> > > > > My questions remain: why isn't this being treated like a bug?
> > > > > 
> > > > Hi Markus,
> > > > 
> > > > I am confused about the code here too. Because in do_strtosz(), the
> > > > upper limit is
> > > > 
> > > > val * mul >= 0xfffffffffffffc00
> > > > 
> > > > So some data near 53 bit may be rounded. Is there a bug?
> > > 
> > > No, but the design is surprising, and the functions lack written
> > > contracts, except for the do_strtosz() helper, which has one that sucks.
> > > 
> > > qemu_strtosz() & friends are designed to accept fraction * unit
> > > multiplier.  Example: 1.5M means 1.5 * 1024 * 1024 with qemu_strtosz()
> > > and qemu_strtosz_MiB(), and 1.5 * 1000 * 1000 with
> > > qemu_strtosz_metric().  Whether supporting fractions is a good idea is
> > > debatable, but it's what we've got.
> > > 
> > > The implementation limits the numeric part to the precision of double,
> > > i.e. 53 bits.  "8PiB should be enough for anybody."
> > > 
> > > Switching it from double to long double raises the limit to the
> > > precision of long double.  At least 64 bit on common hosts, but hosts
> > > exist where it's the same 53 bits.  Do we support any such hosts?  If
> > > yes, then we'd make the precision depend on the host, which feels like a
> > > bad idea.
> > > 
> > > A possible alternative is to parse the numeric part both as a double and
> > > as a 64 bit unsigned integer, then use whatever consumes more
> > > characters.  This enables providing full 64 bits unless you actually use
> > > a fraction.
> > > 
> > 
> > This sounds like the right thing to do, if user input is an
> > integer and the code in the other end is consuming an integer.
> > 
> > 
> > > As far as I remember, the only problem we've ever had with the 53 bits
> > > limit is developer confusion :)
> > > 
> > 
> > Developer confusion, I can deal with.  However, exposing this
> > behavior on external interfaces is a bug to me.
> > 
> > I don't know how serious the bug is because I don't know which
> > interfaces are affected by it.  Do we have a list?
> > 
> > > Patches welcome.
> > 
> > My first goal is to get the maintainers of that code to recognize
> > it as a bug.  Then I hope this will motivate somebody else to fix
> > it.  :)
> > 
> 
> Hi Eduardo,
> 
> If it is a bug, could the fix patch merged during rc1-rc3? Because I made 2
> patches, and I want to submit before HMAT (HMAT patches is big, so submit
> together may be slow).

Even if I convince other maintainers it is a bug, I don't think
it is serious enough to require a fix in QEMU 4.2.  I suggest
finishing the ongoing HMAT work first, and worry about this issue
later.

Or, if you really prefer to address it before HMAT, it's OK to
make the next version of the HMAT series depend on a series
that's not merged yet.  Just make this explicit in the series
cover letter (publishing a git branch to help review and testing
is also appreciated).

-- 
Eduardo
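
To make the 53-bit rounding discussed in this thread concrete, here is a minimal standalone sketch. It is not QEMU's do_strtosz(): both function names are hypothetical, and parse_scaled() is only one possible reading of the "parse as double and as 64-bit integer, keep whichever consumes more characters" alternative quoted above. It assumes the default round-to-nearest floating-point mode and an IEEE 754 double.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse through a double only: at most 53 significant bits survive. */
static uint64_t parse_via_double(const char *s)
{
    double d = strtod(s, NULL);

    assert(d >= 0.0 && d < 18446744073709551616.0); /* 2^64 */
    return (uint64_t)d;
}

/*
 * Hypothetical alternative: parse the number both as an unsigned 64-bit
 * integer and as a double, and keep the parse that consumed more
 * characters. Pure integers keep all 64 bits; only inputs with a
 * fraction or exponent go through the double.
 * Returns 0 on success, -1 on error.
 */
static int parse_scaled(const char *s, uint64_t mul, uint64_t *out)
{
    char *end_int, *end_dbl;
    unsigned long long ival;
    double dval;
    int int_err;

    assert(mul != 0);

    /* strtoull() silently accepts a leading '-', so reject it up front. */
    while (isspace((unsigned char)*s)) {
        s++;
    }
    if (*s == '-') {
        return -1;
    }

    errno = 0;
    ival = strtoull(s, &end_int, 10);
    int_err = errno;
    dval = strtod(s, &end_dbl);

    if (end_int == s && end_dbl == s) {
        return -1; /* nothing parsed */
    }

    if (end_int >= end_dbl) {
        /* Pure integer input: exact 64-bit arithmetic. */
        if (int_err || ival > UINT64_MAX / mul) {
            return -1;
        }
        *out = ival * mul;
        return 0;
    }

    /* Fraction or exponent present: fall back to double precision. */
    dval *= (double)mul;
    if (!(dval >= 0.0 && dval < 18446744073709551616.0)) {
        return -1;
    }
    *out = (uint64_t)dval;
    return 0;
}
```

With this sketch, "9223372036854775295" comes back as 0x7ffffffffffffc00 from parse_via_double() but as the exact 0x7ffffffffffffdff from parse_scaled(), while "1.5" with a multiplier of 1024 * 1024 still yields 1572864 through the double path.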




Re: [PATCH 0/2] Introducing QMP query-netdevs command

2019-11-13 Thread Alexey Kirillov
That's a good idea, thanks! I'll do this in V2.

14.11.2019, 00:32, "Eric Blake" :
>
> Can we rewrite the existing HMP command to call into the new QMP command?
>


-- 
Alex Kirillov
Yandex.Cloud




Re: [PATCH 0/2] Introducing QMP query-netdevs command

2019-11-13 Thread Eric Blake

On 11/13/19 3:25 PM, Alexey Kirillov wrote:
> This patch introduces a new QMP command "query-netdevs" to get information
> about currently attached network devices.
> Potentially, this patch makes the "info_str" field of "struct NetClientState"
> and HMP command "info network" obsolete as new command gives out more
> information in a structured way.

Can we rewrite the existing HMP command to call into the new QMP command?

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PATCH 1/2] qapi: net: Add query-netdevs command

2019-11-13 Thread Eric Blake

On 11/13/19 3:25 PM, Alexey Kirillov wrote:
> Add a qmp command that provides information about currently attached
> network devices and their configuration.
>
> Signed-off-by: Alexey Kirillov
> ---

> +++ b/qapi/net.json
> @@ -754,3 +754,88 @@
>   ##
>   { 'event': 'FAILOVER_NEGOTIATED',
> 'data': {'device-id': 'str'} }
> +
> +##
> +# @NetdevInfo:
> +#
> +# Configuration of a network device.
> +#
> +# @id: Device identifier.
> +#
> +# @type: Specify the driver used for interpreting remaining arguments.
> +#
> +# @peer: Connected network device.
> +#
> +# @queues_count: Number of queues.

Unless there is a strong reason otherwise, this should be 'queues-count'.

> +#
> +# @hub: hubid of hub, if connected to.
> +#
> +# Since: 4.2
> +##
> +{ 'union': 'NetdevInfo',
> +  'base': { 'id': 'str',
> +'type': 'NetClientDriver',
> +'*peer': 'str',
> +'queues_count': 'int',
> +'*hub': 'int' },
> +  'discriminator': 'type',
> +  'data': {
> +  'nic':'NetLegacyNicOptions',
> +  'user':   'NetdevUserOptions',
> +  'tap':'NetdevTapOptions',
> +  'l2tpv3': 'NetdevL2TPv3Options',
> +  'socket': 'NetdevSocketOptions',
> +  'vde':'NetdevVdeOptions',
> +  'bridge': 'NetdevBridgeOptions',
> +  'hubport':'NetdevHubPortOptions',
> +  'netmap': 'NetdevNetmapOptions',
> +  'vhost-user': 'NetdevVhostUserOptions' } }
> +
> +##
> +# @x-query-netdevs:

What are the reasons for the x- prefix?  Are we planning on changing
this interface down the road?  If so, what changes might we make?

> +#
> +# Get a list of @NetdevInfo for all virtual network devices.
> +#
> +# Returns: a list of @NetdevInfo describing each virtual network device.
> +#
> +# Since: 4.2

This is a new feature; as such, it's too late to make it into 4.2;
you'll want to change this to 5.0.


--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3226
Virtualization:  qemu.org | libvirt.org




Re: [PULL 04/11] target/arm/cpu64: max cpu: Introduce sve properties

2019-11-13 Thread Peter Maydell
On Wed, 13 Nov 2019 at 20:17, Richard Henderson
 wrote:
>
> On 11/12/19 11:23 AM, Peter Maydell wrote:
> >> +static uint32_t sve_zcr_get_valid_len(ARMCPU *cpu, uint32_t start_len)
> >> +{
> >> +uint32_t start_vq = (start_len & 0xf) + 1;
> >> +
> >> +return arm_cpu_vq_map_next_smaller(cpu, start_vq + 1) - 1;
> >
> > "Subtract operation overflows on operands
> > arm_cpu_vq_map_next_smaller(cpu, start_vq + 1U) and 1U"
> >
> > Certainly it looks as if arm_cpu_vq_map_next_smaller() can
> > return 0, and claiming the valid length to be UINT_MAX
> > seems a bit odd in that case.
>
> The lsb is always set in the map, the minimum number we send to next_smaller 
> is
> 2 -> so the minimum number returned from next_smaller is 1.
>
> We should never return UINT_MAX.
>
> > return bitnum == vq - 1 ? 0 : bitnum + 1;
>
> But yes, this computation doesn't seem right.
>
> The beginning assert should probably be (vq >= 2 ...)
> and here we should assert bitnum != vq - 1.

Coverity may also be looking at the case where
TARGET_AARCH64 is not defined. The fallback definition
of arm_cpu_vq_map_next_smaller() for that situation
always returns 0.

thanks
-- PMM
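
The wraparound Coverity complains about can be reproduced in isolation. The sketch below uses hypothetical helper names and only shows the arithmetic: subtracting 1 from an unsigned 0 yields UINT32_MAX, which is what the "- 1" in sve_zcr_get_valid_len() would produce if the callee ever returned 0, as the non-AArch64 fallback does.

```c
#include <assert.h>
#include <stdint.h>

/* Same shape as the "- 1" in the quoted code: wraps when vq is 0. */
static uint32_t vq_to_len(uint32_t vq)
{
    return vq - 1;
}

/* Hypothetical guarded variant: make the "vq is at least 1" rule explicit. */
static uint32_t vq_to_len_checked(uint32_t vq)
{
    assert(vq >= 1);
    return vq - 1;
}
```

Calling vq_to_len(0) returns UINT32_MAX rather than failing, which is why an assertion (or a fallback that cannot return 0) is the kind of fix being discussed.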



[PATCH 2/2] tests: Add tests for query-netdevs command

2019-11-13 Thread Alexey Kirillov
Signed-off-by: Alexey Kirillov 
---
 tests/Makefile.include |   2 +
 tests/test-query-netdevs.c | 114 +
 2 files changed, 116 insertions(+)
 create mode 100644 tests/test-query-netdevs.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 534ee48743..4d199e463b 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -297,6 +297,7 @@ check-qtest-s390x-y += tests/migration-test$(EXESUF)
 check-qtest-generic-y += tests/machine-none-test$(EXESUF)
 check-qtest-generic-y += tests/qom-test$(EXESUF)
 check-qtest-generic-y += tests/test-hmp$(EXESUF)
+check-qtest-generic-y += tests/test-query-netdevs$(EXESUF)
 
 qapi-schema += alternate-any.json
 qapi-schema += alternate-array.json
@@ -844,6 +845,7 @@ tests/numa-test$(EXESUF): tests/numa-test.o
 tests/vmgenid-test$(EXESUF): tests/vmgenid-test.o tests/boot-sector.o 
tests/acpi-utils.o
 tests/cdrom-test$(EXESUF): tests/cdrom-test.o tests/boot-sector.o 
$(libqos-obj-y)
 tests/arm-cpu-features$(EXESUF): tests/arm-cpu-features.o
+tests/test-query-netdevs$(EXESUF): tests/test-query-netdevs.o
 
 tests/migration/stress$(EXESUF): tests/migration/stress.o
$(call quiet-command, $(LINKPROG) -static -O3 $(PTHREAD_LIB) -o $@ $< 
,"LINK","$(TARGET_DIR)$@")
diff --git a/tests/test-query-netdevs.c b/tests/test-query-netdevs.c
new file mode 100644
index 00..2afde36114
--- /dev/null
+++ b/tests/test-query-netdevs.c
@@ -0,0 +1,114 @@
+/*
+ * QTest testcase for the query-netdevs
+ *
+ * Copyright Yandex N.V., 2019
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+
+#include "libqtest.h"
+#include "qapi/qmp/qdict.h"
+#include "qapi/qmp/qlist.h"
+
+/*
+ * Events can get in the way of responses we are actually waiting for.
+ */
+GCC_FMT_ATTR(2, 3)
+static QObject *wait_command(QTestState *who, const char *command, ...)
+{
+va_list ap;
+QDict *response;
+QObject *result;
+
+va_start(ap, command);
+qtest_qmp_vsend(who, command, ap);
+va_end(ap);
+
+response = qtest_qmp_receive(who);
+
+result = qdict_get(response, "return");
+g_assert(result);
+qobject_ref(result);
+qobject_unref(response);
+
+return result;
+}
+
+static void qmp_query_netdevs_no_error(QTestState *qts,
+   size_t netdevs_count)
+{
+QObject *resp;
+QList *netdevs;
+
+resp = wait_command(qts, "{'execute': 'x-query-netdevs'}");
+
+netdevs = qobject_to(QList, resp);
+g_assert(netdevs);
+g_assert(qlist_size(netdevs) == netdevs_count);
+
+qobject_unref(resp);
+}
+
+static void test_query_netdevs(void)
+{
+const char *arch = qtest_get_arch();
+size_t correction = 0;
+QObject *resp;
+QTestState *state;
+
+if (strcmp(arch, "arm") == 0 ||
+strcmp(arch, "aarch64") == 0 ||
+strcmp(arch, "tricore") == 0) {
+g_test_skip("Not supported without machine type");
+return;
+}
+
+/* Archs with default not unpluggable netdev */
+if (strcmp(arch, "cris") == 0 ||
+strcmp(arch, "microblaze") == 0 ||
+strcmp(arch, "microblazeel") == 0 ||
+strcmp(arch, "sparc") == 0) {
+correction = 1;
+}
+
+state = qtest_init(
+"-nodefaults "
+"-netdev user,id=slirp0");
+g_assert(state);
+
+qmp_query_netdevs_no_error(state, 1 + correction);
+
+resp = wait_command(state,
+"{'execute': 'netdev_add', 'arguments': {"
+" 'id': 'slirp1',"
+" 'type': 'user'}}");
+qobject_unref(resp);
+
+qmp_query_netdevs_no_error(state, 2 + correction);
+
+resp = wait_command(state,
+"{'execute': 'netdev_del', 'arguments': {"
+" 'id': 'slirp1'}}");
+qobject_unref(resp);
+
+qmp_query_netdevs_no_error(state, 1 + correction);
+
+qtest_quit(state);
+}
+
+int main(int argc, char **argv)
+{
+int ret = 0;
+g_test_init(&argc, &argv, NULL);
+
+qtest_add_func("/net/qapi/query_netdevs",
+test_query_netdevs);
+
+ret = g_test_run();
+
+return ret;
+}
-- 
2.17.1




Re: QEMU for Qualcomm Hexagon - KVM Forum talk and code available

2019-11-13 Thread Peter Maydell
On Wed, 13 Nov 2019 at 10:32, Alex Bennée  wrote:
> I don't see including flex/bison as a dependency
> being a major issue (in fact we have it in our docker images so I guess
> something uses it).

They're used by the dtc submodule, so only in setups where you
need to use the submodule rather than the system libfdt.
In fact I think that dtc doesn't require them for building
libfdt, but its build machinery complains about them being
missing (it needs them for building the 'dtc' binary, which
we don't try to build), so providing them shuts up the
misleading warning.

thanks
-- PMM



[PATCH 1/2] qapi: net: Add query-netdevs command

2019-11-13 Thread Alexey Kirillov
Add a qmp command that provides information about currently attached
network devices and their configuration.

Signed-off-by: Alexey Kirillov 
---
 include/net/net.h |   1 +
 net/hub.c |   8 +++
 net/l2tpv3.c  |  19 +++
 net/net.c |  80 +
 net/netmap.c  |  13 +
 net/slirp.c   | 126 ++
 net/socket.c  |  71 ++
 net/tap-win32.c   |   9 
 net/tap.c | 103 +++--
 net/vde.c |  26 ++
 net/vhost-user.c  |  18 +--
 qapi/net.json |  85 +++
 12 files changed, 551 insertions(+), 8 deletions(-)

diff --git a/include/net/net.h b/include/net/net.h
index e175ba9677..2c8956c0b3 100644
--- a/include/net/net.h
+++ b/include/net/net.h
@@ -92,6 +92,7 @@ struct NetClientState {
 char *model;
 char *name;
 char info_str[256];
+NetdevInfo *stored_config;
 unsigned receive_disabled : 1;
 NetClientDestructor *destructor;
 unsigned int queue_index;
diff --git a/net/hub.c b/net/hub.c
index 5795a678ed..37995b5517 100644
--- a/net/hub.c
+++ b/net/hub.c
@@ -148,6 +148,7 @@ static NetHubPort *net_hub_port_new(NetHub *hub, const char 
*name,
 NetHubPort *port;
 int id = hub->num_ports++;
 char default_name[128];
+NetdevHubPortOptions *stored;
 
 if (!name) {
 snprintf(default_name, sizeof(default_name),
@@ -160,6 +161,13 @@ static NetHubPort *net_hub_port_new(NetHub *hub, const 
char *name,
 port->id = id;
 port->hub = hub;
 
+/* Store startup parameters */
+nc->stored_config = g_new0(NetdevInfo, 1);
+nc->stored_config->type = NET_CLIENT_DRIVER_HUBPORT;
+stored = &nc->stored_config->u.hubport;
+
+stored->hubid = hub->id;
+
 QLIST_INSERT_HEAD(&hub->ports, port, next);
 
 return port;
diff --git a/net/l2tpv3.c b/net/l2tpv3.c
index 55fea17c0f..f4e45e7b28 100644
--- a/net/l2tpv3.c
+++ b/net/l2tpv3.c
@@ -535,6 +535,7 @@ int net_init_l2tpv3(const Netdev *netdev,
 struct addrinfo hints;
 struct addrinfo *result = NULL;
 char *srcport, *dstport;
+NetdevL2TPv3Options *stored;
 
 nc = qemu_new_net_client(&net_l2tpv3_info, peer, "l2tpv3", name);
 
@@ -726,6 +727,24 @@ int net_init_l2tpv3(const Netdev *netdev,
 
 l2tpv3_read_poll(s, true);
 
+/* Store startup parameters */
+nc->stored_config = g_new0(NetdevInfo, 1);
+nc->stored_config->type = NET_CLIENT_DRIVER_L2TPV3;
+stored = &nc->stored_config->u.l2tpv3;
+
+memcpy(stored, l2tpv3, sizeof(NetdevL2TPv3Options));
+
+stored->src = g_strdup(l2tpv3->src);
+stored->dst = g_strdup(l2tpv3->dst);
+
+if (l2tpv3->has_srcport) {
+stored->srcport = g_strdup(l2tpv3->srcport);
+}
+
+if (l2tpv3->has_dstport) {
+stored->dstport = g_strdup(l2tpv3->dstport);
+}
+
 snprintf(s->nc.info_str, sizeof(s->nc.info_str),
  "l2tpv3: connected");
 return 0;
diff --git a/net/net.c b/net/net.c
index 84aa6d8d00..08bf78a668 100644
--- a/net/net.c
+++ b/net/net.c
@@ -54,6 +54,7 @@
 #include "sysemu/sysemu.h"
 #include "net/filter.h"
 #include "qapi/string-output-visitor.h"
+#include "qapi/clone-visitor.h"
 
 /* Net bridge is currently not supported for W32. */
 #if !defined(_WIN32)
@@ -283,6 +284,7 @@ NICState *qemu_new_nic(NetClientInfo *info,
 NetClientState **peers = conf->peers.ncs;
 NICState *nic;
 int i, queues = MAX(1, conf->peers.queues);
+NetLegacyNicOptions *stored;
 
 assert(info->type == NET_CLIENT_DRIVER_NIC);
 assert(info->size >= sizeof(NICState));
@@ -298,6 +300,22 @@ NICState *qemu_new_nic(NetClientInfo *info,
 nic->ncs[i].queue_index = i;
 }
 
+/* Store startup parameters */
+nic->ncs[0].stored_config = g_new0(NetdevInfo, 1);
+nic->ncs[0].stored_config->type = NET_CLIENT_DRIVER_NIC;
+stored = &nic->ncs[0].stored_config->u.nic;
+
+if (peers[0]) {
+stored->has_netdev = true;
+stored->netdev = g_strdup(peers[0]->name);
+}
+
+stored->has_macaddr = true;
+stored->macaddr = g_strdup_printf(MAC_FMT, MAC_ARG(conf->macaddr.a));
+
+stored->has_model = true;
+stored->model = g_strdup(model);
+
 return nic;
 }
 
@@ -344,6 +362,7 @@ static void qemu_free_net_client(NetClientState *nc)
 }
 g_free(nc->name);
 g_free(nc->model);
+qapi_free_NetdevInfo(nc->stored_config);
 if (nc->destructor) {
 nc->destructor(nc);
 }
@@ -1323,6 +1342,67 @@ RxFilterInfoList *qmp_query_rx_filter(bool has_name, 
const char *name,
 return filter_list;
 }
 
+NetdevInfoList *qmp_x_query_netdevs(Error **errp)
+{
+NetdevInfoList *list = NULL;
+NetClientState *nc;
+
+QTAILQ_FOREACH(nc, &net_clients, next) {
+/* Only look at netdevs, not for each queue */
+if (nc->stored_config) {
+NetdevInfoList *node = g_new0(NetdevInfoList, 1);
+
+node->value = QAPI_CLONE(NetdevInfo, 

[PATCH 0/2] Introducing QMP query-netdevs command

2019-11-13 Thread Alexey Kirillov
This patch introduces a new QMP command "query-netdevs" to get information
about currently attached network devices.
Potentially, this patch makes the "info_str" field of "struct NetClientState"
and HMP command "info network" obsolete as new command gives out more
information in a structured way.

Usage example:

{ "execute": "x-query-netdevs" }
{ "return": [
{
  "peer": "netdev0",
  "netdev": "netdev0",
  "model": "virtio-net-pci",
  "macaddr": "52:54:00:12:34:56",
  "queues_count": 1,
  "type": "nic",
  "id": "net0"
},
{
  "peer": "net0",
  "ipv6": true,
  "ipv4": true,
  "host": "10.0.2.2",
  "queues_count": 1,
  "ipv6-dns": "fec0::3",
  "ipv6-prefix": "fec0::",
  "net": "10.0.2.0/255.255.255.0",
  "ipv6-host": "fec0::2",
  "type": "user",
  "dns": "10.0.2.3",
  "hostfwd": [
{
  "str": "tcp::20004-:22"
}
  ],
  "ipv6-prefixlen": 64,
  "id": "netdev0",
  "restrict": false
}
  ]
}

Alexey Kirillov (2):
  qapi: net: Add query-netdevs command
  tests: Add tests for query-netdevs command

 include/net/net.h  |   1 +
 net/hub.c  |   8 +++
 net/l2tpv3.c   |  19 ++
 net/net.c  |  80 +++
 net/netmap.c   |  13 
 net/slirp.c| 126 +
 net/socket.c   |  71 +
 net/tap-win32.c|   9 +++
 net/tap.c  | 103 --
 net/vde.c  |  26 
 net/vhost-user.c   |  18 +-
 qapi/net.json  |  85 +
 tests/Makefile.include |   2 +
 tests/test-query-netdevs.c | 114 +
 14 files changed, 667 insertions(+), 8 deletions(-)
 create mode 100644 tests/test-query-netdevs.c

-- 
2.17.1




Re: QEMU for Qualcomm Hexagon - KVM Forum talk and code available

2019-11-13 Thread Richard Henderson
On 11/13/19 8:31 PM, Taylor Simpson wrote:
> [Taylor] Currently, I have the generator and the generated code sitting in 
> the source tree.  I'm flexible on this if the decision is to regenerate it 
> every time.

I would prefer to regenerate every time, and not store the generated code in
the source tree at all.  It makes it a no-brainer to modify the source and not
have to remember how to regenerate, because the rules are right there in the
makefile.


r~



[PATCH] tests: fix modules-test 'duplicate test case' error

2019-11-13 Thread Cole Robinson
./configure --enable-sdl --audio-drv-list=sdl --enable-modules

Will generate two identical test names: /$arch/module/load/sdl
Which generates an error like:

(tests/modules-test:23814): GLib-ERROR **: 18:23:06.359: duplicate test case 
path: /aarch64//module/load/sdl

Add the subsystem prefix in the name as well, so instead we get:

/$arch/module/load/audio-sdl
/$arch/module/load/ui-sdl

Signed-off-by: Cole Robinson 
---
 tests/modules-test.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tests/modules-test.c b/tests/modules-test.c
index d1a6ace218..88217686e1 100644
--- a/tests/modules-test.c
+++ b/tests/modules-test.c
@@ -64,7 +64,8 @@ int main(int argc, char *argv[])
 g_test_init(&argc, &argv, NULL);
 
 for (i = 0; i < G_N_ELEMENTS(modules); i += 2) {
-char *testname = g_strdup_printf("/module/load/%s", modules[i + 1]);
+char *testname = g_strdup_printf("/module/load/%s%s",
+ modules[i], modules[i + 1]);
 qtest_add_data_func(testname, modules + i, test_modules_load);
 g_free(testname);
 }
-- 
2.23.0




Re: [PATCH v9 Kernel 1/5] vfio: KABI for migration interface for device state

2019-11-13 Thread Alex Williamson
On Thu, 14 Nov 2019 01:47:04 +0530
Kirti Wankhede  wrote:

> On 11/14/2019 1:18 AM, Alex Williamson wrote:
> > On Thu, 14 Nov 2019 00:59:52 +0530
> > Kirti Wankhede  wrote:
> >   
> >> On 11/13/2019 11:57 PM, Alex Williamson wrote:  
> >>> On Wed, 13 Nov 2019 11:24:17 +0100
> >>> Cornelia Huck  wrote:
> >>>  
>  On Tue, 12 Nov 2019 15:30:05 -0700
>  Alex Williamson  wrote:
>  
> > On Tue, 12 Nov 2019 22:33:36 +0530
> > Kirti Wankhede  wrote:
> > 
> >> - Defined MIGRATION region type and sub-type.
> >> - Used 3 bits to define VFIO device states.
> >>   Bit 0 => _RUNNING
> >>   Bit 1 => _SAVING
> >>   Bit 2 => _RESUMING
> >>   Combination of these bits defines VFIO device's state during 
> >> migration
> >>   _RUNNING => Normal VFIO device running state. When its reset, it
> >>indicates _STOPPED state. when device is changed to
> >>_STOPPED, driver should stop device before write()
> >>returns.
> >>   _SAVING | _RUNNING => vCPUs are running, VFIO device is running 
> >> but
> >> start saving state of device i.e. pre-copy 
> >> state
> >>   _SAVING  => vCPUs are stopped, VFIO device should be stopped, 
> >> and  
> >
> > s/should/must/
> > 
> >>   save device state,i.e. stop-n-copy state
> >>   _RESUMING => VFIO device resuming state.
> >>   _SAVING | _RESUMING and _RUNNING | _RESUMING => Invalid states  
> >
> > A table might be useful here and in the uapi header to indicate valid
> > states:  
> 
>  I like that.
>  
> >
> > | _RESUMING | _SAVING | _RUNNING | Description
> > +---+-+--+--
> > | 0 |0| 0| Stopped, not saving or resuming (a)
> > +---+-+--+--
> > | 0 |0| 1| Running, default state
> > +---+-+--+--
> > | 0 |1| 0| Stopped, migration interface in save 
> > mode
> > +---+-+--+--
> > | 0 |1| 1| Running, save mode interface, 
> > iterative
> > +---+-+--+--
> > | 1 |0| 0| Stopped, migration resume interface 
> > active
> > +---+-+--+--
> > | 1 |0| 1| Invalid (b)
> > +---+-+--+--
> > | 1 |1| 0| Invalid (c)
> > +---+-+--+--
> > | 1 |1| 1| Invalid (d)
> >
> > I think we need to consider whether we define (a) as generally
> > available, for instance we might want to use it for diagnostics or a
> > fatal error condition outside of migration.
> >
> > Are there hidden assumptions between state transitions here or are
> > there specific next possible state diagrams that we need to include as
> > well?  
> 
>  Some kind of state-change diagram might be useful in addition to the
>  textual description anyway. Let me try, just to make sure I understand
>  this correctly:
>  
> >>
> >> During User application initialization, there is one more state change:
> >>
> >> 0) 0/0/0  stop to running -> 0/0/1  
> > 
> > 0/0/0 cannot be the initial state of the device, that would imply that
> > a device supporting this migration interface breaks backwards
> > compatibility with all existing vfio userspace code and that code needs
> > to learn to set the device running as part of its initialization.
> > That's absolutely unacceptable.  The initial device state must be 0/0/1.
> >   
> 
> There isn't any device state for all existing vfio userspace code right 
> now. So default its assumed to be always running.

Exactly, there is no representation of device state, therefore it's
assumed to be running, therefore when adding a representation of device
state it must default to running.
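
For illustration only, here is a minimal sketch of the validity rule implied by the table quoted above, together with the "must default to running" requirement. The bit positions follow the patch description (bit 0 _RUNNING, bit 1 _SAVING, bit 2 _RESUMING); the macro and function names are hypothetical and not taken from the patch, and rejecting unknown bits is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions as described in the patch. */
#define DEV_STATE_RUNNING   (1u << 0)
#define DEV_STATE_SAVING    (1u << 1)
#define DEV_STATE_RESUMING  (1u << 2)
#define DEV_STATE_MASK      (DEV_STATE_RUNNING | DEV_STATE_SAVING | \
                             DEV_STATE_RESUMING)

/* State the device reports before userspace ever writes the field: 0/0/1. */
#define DEV_STATE_DEFAULT   DEV_STATE_RUNNING

/* Return true for the five valid rows of the table, false otherwise. */
static bool dev_state_valid(uint32_t state)
{
    /* Assumption: bits outside the three defined ones are rejected. */
    if (state & ~DEV_STATE_MASK)
        return false;

    /* _RESUMING is only valid on its own; rows (b), (c), (d) are invalid. */
    if ((state & DEV_STATE_RESUMING) &&
        (state & (DEV_STATE_RUNNING | DEV_STATE_SAVING)))
        return false;

    return true;
}
```

With this rule the default state, plain stop (0/0/0), both saving states and plain _RESUMING pass, and any combination that includes _RESUMING with another bit is rejected.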

> With migration support, device states are explicitly getting added. For 
> example, in case of QEMU, while device is getting initialized, i.e. from 
> vfio_realize(), device_state is set to 0/0/0, but not required to convey 
> it to vendor driver.

But we have a 0/0/0 state, why would we intentionally keep an internal
state that's inconsistent with the device?

> Then with vfio_vmstate_change() notifier, device 
> state is changed to 0/0/1 when VM/vCPU are transitioned to running, at 
> this moment device state is 

Re: [PATCH v9 Kernel 3/5] vfio iommu: Add ioctl defination to unmap IOVA and return dirty bitmap

2019-11-13 Thread Alex Williamson
On Thu, 14 Nov 2019 01:22:39 +0530
Kirti Wankhede  wrote:

> On 11/13/2019 4:00 AM, Alex Williamson wrote:
> > On Tue, 12 Nov 2019 22:33:38 +0530
> > Kirti Wankhede  wrote:
> >   
> >> With vIOMMU, during pre-copy phase of migration, while CPUs are still
> >> running, IO virtual address unmap can happen while device still keeping
> >> reference of guest pfns. Those pages should be reported as dirty before
> >> unmap, so that VFIO user space application can copy content of those pages
> >> from source to destination.
> >>
> >> IOCTL defination added here add bitmap pointer, size and flag. If flag  
> > 
> > definition, adds
> >   
> >> VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP is set and bitmap memory is allocated
> >> and bitmap_size of set, then ioctl will create bitmap of pinned pages and  
> > 
> > s/of/is/
> >   
> >> then unmap those.
> >>
> >> Signed-off-by: Kirti Wankhede 
> >> Reviewed-by: Neo Jia 
> >> ---
> >>   include/uapi/linux/vfio.h | 33 +
> >>   1 file changed, 33 insertions(+)
> >>
> >> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >> index 6fd3822aa610..72fd297baf52 100644
> >> --- a/include/uapi/linux/vfio.h
> >> +++ b/include/uapi/linux/vfio.h
> >> @@ -925,6 +925,39 @@ struct vfio_iommu_type1_dirty_bitmap {
> >>   
> >>   #define VFIO_IOMMU_GET_DIRTY_BITMAP _IO(VFIO_TYPE, VFIO_BASE + 17)
> >>   
> >> +/**
> >> + * VFIO_IOMMU_UNMAP_DMA_GET_BITMAP - _IOWR(VFIO_TYPE, VFIO_BASE + 18,
> >> + *  struct vfio_iommu_type1_dma_unmap_bitmap)
> >> + *
> >> + * Unmap IO virtual addresses using the provided struct
> >> + * vfio_iommu_type1_dma_unmap_bitmap.  Caller sets argsz.
> >> + * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP should be set to get dirty bitmap
> >> + * before unmapping IO virtual addresses. If this flag is not set, only IO
> >> + * virtual address are unmapped without creating pinned pages bitmap, that
> >> + * is, behave same as VFIO_IOMMU_UNMAP_DMA ioctl.
> >> + * User should allocate memory to get bitmap and should set size of allocated
> >> + * memory in bitmap_size field. One bit in bitmap is used to represent per page
> >> + * consecutively starting from iova offset. Bit set indicates page at that
> >> + * offset from iova is dirty.
> >> + * The actual unmapped size is returned in the size field and bitmap of pages
> >> + * in the range of unmapped size is returned in bitmap if flag
> >> + * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP is set.
> >> + *
> >> + * No guarantee is made to the user that arbitrary unmaps of iova or size
> >> + * different from those used in the original mapping call will succeed.
> >> + */
> >> +struct vfio_iommu_type1_dma_unmap_bitmap {
> >> +  __u32   argsz;
> >> +  __u32   flags;
> >> +#define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0)
> >> +  __u64   iova;         /* IO virtual address */
> >> +  __u64   size;         /* Size of mapping (bytes) */
> >> +  __u64   bitmap_size;  /* in bytes */
> >> +  void __user *bitmap;  /* one bit per page */
> >> +};
> >> +
> >> +#define VFIO_IOMMU_UNMAP_DMA_GET_BITMAP _IO(VFIO_TYPE, VFIO_BASE + 18)
> >> +  
> > 
> > Why not extend VFIO_IOMMU_UNMAP_DMA to support this rather than add an
> > ioctl that duplicates the functionality and extends it??   
> 
> We do want old userspace applications to work with new kernel and 
> vice-versa, right?
> 
> If I try to change existing VFIO_IOMMU_UNMAP_DMA ioctl structure, say if 
> add 'bitmap_size' and 'bitmap' after 'size', with below code in old 
> kernel, old kernel & new userspace will work.
> 
>  minsz = offsetofend(struct vfio_iommu_type1_dma_unmap, size);
> 
>  if (copy_from_user(&unmap, (void __user *)arg, minsz))
>          return -EFAULT;
> 
>  if (unmap.argsz < minsz || unmap.flags)
>          return -EINVAL;
> 
> 
> With new kernel it would change to:
>  minsz = offsetofend(struct vfio_iommu_type1_dma_unmap, bitmap);

No, the minimum structure size still ends at size, we interpret flags
and argsz to learn if the user understands those fields and optionally
include them.  Therefore old userspace on new kernel continues to work.
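
As a rough userspace-compilable sketch of the argsz/flags handshake described in the paragraph above: the minimum size stays at the end of 'size', and the trailing fields are only read when the caller sets a flag and passes a large enough argsz. The struct and function names here are hypothetical, memcpy() stands in for copy_from_user(), and the bitmap pointer is carried as a u64 purely for the sake of the example; none of this is the actual vfio implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define offsetofend(type, member) \
    (offsetof(type, member) + sizeof(((type *)0)->member))

#define UNMAP_FLAG_GET_DIRTY_BITMAP (1u << 0)   /* hypothetical flag */

/* Layout known to old userspace. */
struct dma_unmap_old {
    uint32_t argsz;
    uint32_t flags;
    uint64_t iova;
    uint64_t size;
};

/* Extended layout: same prefix, new fields appended at the end. */
struct dma_unmap {
    uint32_t argsz;
    uint32_t flags;
    uint64_t iova;
    uint64_t size;
    uint64_t bitmap_size;   /* only read when the flag is set */
    uint64_t bitmap;        /* user pointer, carried as u64 here */
};

static int handle_unmap(const void *arg, struct dma_unmap *unmap)
{
    /* The minimum size still ends at 'size', as for old userspace. */
    size_t minsz = offsetofend(struct dma_unmap, size);

    memset(unmap, 0, sizeof(*unmap));
    memcpy(unmap, arg, minsz);          /* stands in for copy_from_user() */

    if (unmap->argsz < minsz ||
        (unmap->flags & ~UNMAP_FLAG_GET_DIRTY_BITMAP))
        return -EINVAL;

    if (unmap->flags & UNMAP_FLAG_GET_DIRTY_BITMAP) {
        size_t fullsz = offsetofend(struct dma_unmap, bitmap);

        /* The flag promises the new fields, so argsz must cover them. */
        if (unmap->argsz < fullsz)
            return -EINVAL;
        memcpy(unmap, arg, fullsz);     /* second copy_from_user() stand-in */
    }
    return 0;
}
```

An old caller passing the short struct with flags == 0 takes the first path only, so nothing beyond 'size' is ever read from its buffer.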

>  if (copy_from_user(&unmap, (void __user *)arg, minsz))
>          return -EFAULT;
> 
>  if (unmap.argsz < minsz || unmap.flags)
>          return -EINVAL;
> 
> Then old userspace app will fail because unmap.argsz < minsz and might 
> be copy_from_user would cause seg fault because userspace sdk doesn't 
> contain new member variables.
> We can't change the sequence to keep 'size' as last member, because then 
> new userspace app on old kernel will interpret it wrong.

If we have new userspace on old kernel, that userspace needs to be able
to learn that this feature exists (new flag in the
vfio_iommu_type1_info struct as suggested below) and only make use of it

Re: [PULL 04/11] target/arm/cpu64: max cpu: Introduce sve properties

2019-11-13 Thread Richard Henderson
On 11/12/19 11:23 AM, Peter Maydell wrote:
>> +static uint32_t sve_zcr_get_valid_len(ARMCPU *cpu, uint32_t start_len)
>> +{
>> +uint32_t start_vq = (start_len & 0xf) + 1;
>> +
>> +return arm_cpu_vq_map_next_smaller(cpu, start_vq + 1) - 1;
> 
> "Subtract operation overflows on operands
> arm_cpu_vq_map_next_smaller(cpu, start_vq + 1U) and 1U"
> 
> Certainly it looks as if arm_cpu_vq_map_next_smaller() can
> return 0, and claiming the valid length to be UINT_MAX
> seems a bit odd in that case.

The lsb is always set in the map, the minimum number we send to next_smaller is
2 -> so the minimum number returned from next_smaller is 1.

We should never return UINT_MAX.

> return bitnum == vq - 1 ? 0 : bitnum + 1;

But yes, this computation doesn't seem right.

The beginning assert should probably be (vq >= 2 ...)
and here we should assert bitnum != vq - 1.


r~
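
To make the argument above concrete, here is a small self-contained sketch. It assumes a simplified 16-bit map in which bit (vq - 1) set means that vector length is supported; QEMU's real helper operates on a bitmap inside ARMCPU, so the function names and the map representation below are illustrative only. The asserts mirror the suggested tightening (vq >= 2, lsb always set).

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the largest supported vq strictly smaller than 'vq'.
 * Simplified stand-in for arm_cpu_vq_map_next_smaller(): vq_map is a
 * plain 16-bit mask, bit (n - 1) set meaning "vq == n is supported".
 */
static uint32_t vq_map_next_smaller(uint32_t vq_map, uint32_t vq)
{
    uint32_t below, result = 0;

    assert(vq >= 2 && vq <= 17);
    assert(vq_map & 1);             /* the lsb is always set in the map */

    /* Keep only the bits for lengths 1 .. vq - 1. */
    below = vq_map & ((1u << (vq - 1)) - 1);

    /* Position of the highest set bit, counted from 1. */
    while (below) {
        result++;
        below >>= 1;
    }
    return result;                  /* >= 1 because bit 0 is set */
}

static uint32_t zcr_get_valid_len(uint32_t vq_map, uint32_t start_len)
{
    uint32_t start_vq = (start_len & 0xf) + 1;

    /* The argument is at least 2, so the result is at least 1. */
    return vq_map_next_smaller(vq_map, start_vq + 1) - 1;
}
```

Under these assumptions the subtraction cannot wrap: the smallest value passed in is 2, bit 0 always survives the mask, and so the helper never returns 0.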



Re: [PATCH v9 Kernel 1/5] vfio: KABI for migration interface for device state

2019-11-13 Thread Kirti Wankhede




On 11/14/2019 1:18 AM, Alex Williamson wrote:

On Thu, 14 Nov 2019 00:59:52 +0530
Kirti Wankhede  wrote:


On 11/13/2019 11:57 PM, Alex Williamson wrote:

On Wed, 13 Nov 2019 11:24:17 +0100
Cornelia Huck  wrote:
   

On Tue, 12 Nov 2019 15:30:05 -0700
Alex Williamson  wrote:
  

On Tue, 12 Nov 2019 22:33:36 +0530
Kirti Wankhede  wrote:
  

- Defined MIGRATION region type and sub-type.
- Used 3 bits to define VFIO device states.
  Bit 0 => _RUNNING
  Bit 1 => _SAVING
  Bit 2 => _RESUMING
  Combination of these bits defines VFIO device's state during migration
  _RUNNING => Normal VFIO device running state. When its reset, it
indicates _STOPPED state. when device is changed to
_STOPPED, driver should stop device before write()
returns.
  _SAVING | _RUNNING => vCPUs are running, VFIO device is running but
start saving state of device i.e. pre-copy state
  _SAVING  => vCPUs are stopped, VFIO device should be stopped, and


s/should/must/
  

  save device state,i.e. stop-n-copy state
  _RESUMING => VFIO device resuming state.
  _SAVING | _RESUMING and _RUNNING | _RESUMING => Invalid states


A table might be useful here and in the uapi header to indicate valid
states:


I like that.
  


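| _RESUMING | _SAVING | _RUNNING | Description
+-----------+---------+----------+------------------------------------------
|     0     |    0    |     0    | Stopped, not saving or resuming (a)
+-----------+---------+----------+------------------------------------------
|     0     |    0    |     1    | Running, default state
+-----------+---------+----------+------------------------------------------
|     0     |    1    |     0    | Stopped, migration interface in save mode
+-----------+---------+----------+------------------------------------------
|     0     |    1    |     1    | Running, save mode interface, iterative
+-----------+---------+----------+------------------------------------------
|     1     |    0    |     0    | Stopped, migration resume interface active
+-----------+---------+----------+------------------------------------------
|     1     |    0    |     1    | Invalid (b)
+-----------+---------+----------+------------------------------------------
|     1     |    1    |     0    | Invalid (c)
+-----------+---------+----------+------------------------------------------
|     1     |    1    |     1    | Invalid (d)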

I think we need to consider whether we define (a) as generally
available, for instance we might want to use it for diagnostics or a
fatal error condition outside of migration.

Are there hidden assumptions between state transitions here or are
there specific next possible state diagrams that we need to include as
well?


Some kind of state-change diagram might be useful in addition to the
textual description anyway. Let me try, just to make sure I understand
this correctly:
  


During User application initialization, there is one more state change:

0) 0/0/0  stop to running -> 0/0/1


0/0/0 cannot be the initial state of the device, that would imply that
a device supporting this migration interface breaks backwards
compatibility with all existing vfio userspace code and that code needs
to learn to set the device running as part of its initialization.
That's absolutely unacceptable.  The initial device state must be 0/0/1.



There isn't any device state for all existing vfio userspace code right 
now. So by default it's assumed to be always running.


With migration support, device states are explicitly getting added. For 
example, in case of QEMU, while device is getting initialized, i.e. from 
vfio_realize(), device_state is set to 0/0/0, but not required to convey 
it to vendor driver. Then with vfio_vmstate_change() notifier, device 
state is changed to 0/0/1 when VM/vCPU are transitioned to running, at 
this moment device state is conveyed to vendor driver. So vendor driver 
doesn't see 0/0/0 state.


While resuming, for userspace, for example QEMU, device state change is 
from 0/0/0 to 1/0/0, vendor driver see 1/0/0 after device basic 
initialization is done.




1) 0/0/1 ---(trigger driver to start gathering state info)---> 0/1/1


not just gathering state info, but also copy device state to be
transferred during pre-copy phase.

Below 2 state are not just to tell driver to stop, those 2 differ.
2) is device state changed from running to stop, this is when VM
shutdowns cleanly, no need to save device state


Userspace is under no obligation to perform this state change though,
backwards compatibility dictates this.
  

2) 0/0/1 ---(tell driver to stop)---> 0/0/0



3) 0/1/1 ---(tell driver to stop)---> 0/1/0


above is transition from pre-copy phase to stop-and-copy phase, where
device data should be made available to user to transfer to destination
or to save it to file in case of save VM or suspend.



4) 0/0/1 ---(tell driver to 

Re: [PATCH v9 Kernel 2/5] vfio iommu: Add ioctl defination to get dirty pages bitmap.

2019-11-13 Thread Alex Williamson
On Thu, 14 Nov 2019 01:07:21 +0530
Kirti Wankhede  wrote:

> On 11/13/2019 4:00 AM, Alex Williamson wrote:
> > On Tue, 12 Nov 2019 22:33:37 +0530
> > Kirti Wankhede  wrote:
> >   
> >> All pages pinned by vendor driver through vfio_pin_pages API should be
> >> considered as dirty during migration. IOMMU container maintains a list of
> >> all such pinned pages. Added an ioctl defination to get bitmap of such  
> > 
> > definition
> >   
> >> pinned pages for requested IO virtual address range.  
> > 
> > Additionally, all mapped pages are considered dirty when physically
> > mapped through to an IOMMU, modulo we discussed devices opting in to
> > per page pinning to indicate finer granularity with a TBD mechanism to
> > figure out if any non-opt-in devices remain.
> >   
> 
> You mean, in case of device direct assignment (device pass through)?

Yes, or IOMMU backed mdevs.  If vfio_dmas in the container are fully
pinned and mapped, then the correct dirty page set is all mapped pages.
We discussed using the vpfn list as a mechanism for vendor drivers to
reduce their migration footprint, but we also discussed that we would
need a way to determine that all participants in the container have
explicitly pinned their working pages or else we must consider the
entire potential working set as dirty.

> >> Signed-off-by: Kirti Wankhede 
> >> Reviewed-by: Neo Jia 
> >> ---
> >>   include/uapi/linux/vfio.h | 23 +++
> >>   1 file changed, 23 insertions(+)
> >>
> >> diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
> >> index 35b09427ad9f..6fd3822aa610 100644
> >> --- a/include/uapi/linux/vfio.h
> >> +++ b/include/uapi/linux/vfio.h
> >> @@ -902,6 +902,29 @@ struct vfio_iommu_type1_dma_unmap {
> >>   #define VFIO_IOMMU_ENABLE    _IO(VFIO_TYPE, VFIO_BASE + 15)
> >>   #define VFIO_IOMMU_DISABLE   _IO(VFIO_TYPE, VFIO_BASE + 16)
> >>   
> >> +/**
> >> + * VFIO_IOMMU_GET_DIRTY_BITMAP - _IOWR(VFIO_TYPE, VFIO_BASE + 17,
> >> + * struct vfio_iommu_type1_dirty_bitmap)
> >> + *
> >> + * IOCTL to get dirty pages bitmap for IOMMU container during migration.
> >> + * Get dirty pages bitmap of given IO virtual addresses range using
> >> + * struct vfio_iommu_type1_dirty_bitmap. Caller sets argsz, which is size of
> >> + * struct vfio_iommu_type1_dirty_bitmap. User should allocate memory to get
> >> + * bitmap and should set size of allocated memory in bitmap_size field.
> >> + * One bit is used to represent per page consecutively starting from iova
> >> + * offset. Bit set indicates page at that offset from iova is dirty.
> >> + */
> >> +struct vfio_iommu_type1_dirty_bitmap {
> >> +  __u32   argsz;
> >> +  __u32   flags;
> >> +  __u64   iova;          /* IO virtual address */
> >> +  __u64   size;          /* Size of iova range */
> >> +  __u64   bitmap_size;   /* in bytes */  
> > 
> > This seems redundant.  We can calculate the size of the bitmap based on
> > the iova size.
> >  
> 
> But in kernel space, we need to validate the size of memory allocated by 
> user instead of assuming user is always correct, right?

What does it buy us for the user to tell us the size?  They could be
wrong, they could be malicious.  The argsz field on the ioctl is mostly
for the handshake that the user is competent, we should get faults from
the copy-user operation if it's incorrect.
 
> >> +  void __user *bitmap;/* one bit per page */  
> > 
> > Should we define that as a __u64* to (a) help with the size
> > calculation, and (b) assure that we can use 8-byte ops on it?
> > 
> > However, who defines page size?  Is it necessarily the processor page
> > size?  A physical IOMMU may support page sizes other than the CPU page
> > size.  It might be more important to indicate the expected page size
> > than the bitmap size.  Thanks,
> >  
> 
> I see in QEMU and in vfio_iommu_type1 module, page sizes considered for 
> mapping are CPU page size, 4K. Do we still need to have such argument?

That assumption exists for backwards compatibility prior to supporting
the iova_pgsizes field in vfio_iommu_type1_info.  AFAIK the current
interface has no page size assumptions and we should not add any.
Thanks,

Alex
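
To illustrate the point that the bitmap size follows from the iova range and an explicit page size, here is a minimal sketch. It assumes the bitmap is an array of 64-bit words with bit (n % 64) of word (n / 64) standing for the n-th page from iova; that bit ordering, the rounding to whole words, and the function names are assumptions made for the example, not something the patch defines.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for a dirty bitmap covering 'size' bytes of iova space at
 * 'pgsize' granularity, rounded up to whole 64-bit words.
 * Returns 0 for an empty range or a page size that is not a power of two.
 */
static uint64_t dirty_bitmap_bytes(uint64_t size, uint64_t pgsize)
{
    uint64_t pages, words;

    if (size == 0 || pgsize == 0 || (pgsize & (pgsize - 1)))
        return 0;

    pages = size / pgsize + (size % pgsize != 0);
    words = pages / 64 + (pages % 64 != 0);
    return words * sizeof(uint64_t);
}

/* Mark the page containing (iova + offset) dirty; the caller zeroes first. */
static void dirty_bitmap_set(uint64_t *bitmap, uint64_t offset,
                             uint64_t pgsize)
{
    uint64_t page = offset / pgsize;

    bitmap[page / 64] |= UINT64_C(1) << (page % 64);
}
```

For a 1 GiB range this gives 32 KiB of bitmap at a 4 KiB page size; a larger IOMMU page size shrinks the bitmap accordingly, which is why the page size matters more than a user-supplied byte count.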




Re: [PATCH v9 Kernel 3/5] vfio iommu: Add ioctl defination to unmap IOVA and return dirty bitmap

2019-11-13 Thread Kirti Wankhede




On 11/13/2019 4:00 AM, Alex Williamson wrote:

On Tue, 12 Nov 2019 22:33:38 +0530
Kirti Wankhede  wrote:


With vIOMMU, during pre-copy phase of migration, while CPUs are still
running, IO virtual address unmap can happen while device still keeping
reference of guest pfns. Those pages should be reported as dirty before
unmap, so that VFIO user space application can copy content of those pages
from source to destination.

IOCTL defination added here add bitmap pointer, size and flag. If flag


definition, adds


VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP is set and bitmap memory is allocated
and bitmap_size of set, then ioctl will create bitmap of pinned pages and


s/of/is/


then unmap those.

Signed-off-by: Kirti Wankhede 
Reviewed-by: Neo Jia 
---
  include/uapi/linux/vfio.h | 33 +
  1 file changed, 33 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 6fd3822aa610..72fd297baf52 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -925,6 +925,39 @@ struct vfio_iommu_type1_dirty_bitmap {
  
  #define VFIO_IOMMU_GET_DIRTY_BITMAP _IO(VFIO_TYPE, VFIO_BASE + 17)
  
+/**

+ * VFIO_IOMMU_UNMAP_DMA_GET_BITMAP - _IOWR(VFIO_TYPE, VFIO_BASE + 18,
+ *   struct vfio_iommu_type1_dma_unmap_bitmap)
+ *
+ * Unmap IO virtual addresses using the provided struct
+ * vfio_iommu_type1_dma_unmap_bitmap.  Caller sets argsz.
+ * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP should be set to get dirty bitmap
+ * before unmapping IO virtual addresses. If this flag is not set, only IO
+ * virtual address are unmapped without creating pinned pages bitmap, that
+ * is, behave same as VFIO_IOMMU_UNMAP_DMA ioctl.
+ * User should allocate memory to get bitmap and should set size of allocated
+ * memory in bitmap_size field. One bit in bitmap is used to represent per page
+ * consecutively starting from iova offset. Bit set indicates page at that
+ * offset from iova is dirty.
+ * The actual unmapped size is returned in the size field and bitmap of pages
+ * in the range of unmapped size is returned in bitmap if flag
+ * VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP is set.
+ *
+ * No guarantee is made to the user that arbitrary unmaps of iova or size
+ * different from those used in the original mapping call will succeed.
+ */
+struct vfio_iommu_type1_dma_unmap_bitmap {
+   __u32   argsz;
+   __u32   flags;
+#define VFIO_DMA_UNMAP_FLAG_GET_DIRTY_BITMAP (1 << 0)
+   __u64   iova;         /* IO virtual address */
+   __u64   size;         /* Size of mapping (bytes) */
+   __u64   bitmap_size;  /* in bytes */
+   void __user *bitmap;  /* one bit per page */
+};
+
+#define VFIO_IOMMU_UNMAP_DMA_GET_BITMAP _IO(VFIO_TYPE, VFIO_BASE + 18)
+


Why not extend VFIO_IOMMU_UNMAP_DMA to support this rather than add an
ioctl that duplicates the functionality and extends it?? 


We do want old userspace applications to work with new kernel and 
vice-versa, right?


If I try to change existing VFIO_IOMMU_UNMAP_DMA ioctl structure, say if 
add 'bitmap_size' and 'bitmap' after 'size', with below code in old 
kernel, old kernel & new userspace will work.


minsz = offsetofend(struct vfio_iommu_type1_dma_unmap, size);

if (copy_from_user(&unmap, (void __user *)arg, minsz))
        return -EFAULT;

if (unmap.argsz < minsz || unmap.flags)
        return -EINVAL;


With new kernel it would change to:
minsz = offsetofend(struct vfio_iommu_type1_dma_unmap, bitmap);

if (copy_from_user(&unmap, (void __user *)arg, minsz))
        return -EFAULT;

if (unmap.argsz < minsz || unmap.flags)
        return -EINVAL;

Then old userspace app will fail because unmap.argsz < minsz and might 
be copy_from_user would cause seg fault because userspace sdk doesn't 
contain new member variables.
We can't change the sequence to keep 'size' as last member, because then 
new userspace app on old kernel will interpret it wrong.



Otherwise
same comments as previous, in fact it's too bad we can't use this ioctl
for both, but a DONT_UNMAP flag on the UNMAP_DMA ioctl seems a bit
absurd.

I suspect we also want a flags bit in VFIO_IOMMU_GET_INFO to indicate
these capabilities are supported.



Ok. I'll add that.


Maybe for both ioctls we also want to define it as the user's
responsibility to zero the bitmap, requiring the kernel to only set
bits as necessary. 


Ok. Updating comment.

Thanks,
Kirti


Thanks,

Alex


  /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
  
  /*






Re: [PATCH v9 Kernel 1/5] vfio: KABI for migration interface for device state

2019-11-13 Thread Alex Williamson
On Thu, 14 Nov 2019 00:59:52 +0530
Kirti Wankhede  wrote:

> On 11/13/2019 11:57 PM, Alex Williamson wrote:
> > On Wed, 13 Nov 2019 11:24:17 +0100
> > Cornelia Huck  wrote:
> >   
> >> On Tue, 12 Nov 2019 15:30:05 -0700
> >> Alex Williamson  wrote:
> >>  
> >>> On Tue, 12 Nov 2019 22:33:36 +0530
> >>> Kirti Wankhede  wrote:
> >>>  
>  - Defined MIGRATION region type and sub-type.
>  - Used 3 bits to define VFIO device states.
>   Bit 0 => _RUNNING
>   Bit 1 => _SAVING
>   Bit 2 => _RESUMING
>   Combination of these bits defines VFIO device's state during 
>  migration
>   _RUNNING => Normal VFIO device running state. When its reset, it
>   indicates _STOPPED state. when device is changed to
>   _STOPPED, driver should stop device before write()
>   returns.
>   _SAVING | _RUNNING => vCPUs are running, VFIO device is running but
> start saving state of device i.e. pre-copy 
>  state
>   _SAVING  => vCPUs are stopped, VFIO device should be stopped, and  
> >>>
> >>> s/should/must/
> >>>  
>   save device state,i.e. stop-n-copy state
>   _RESUMING => VFIO device resuming state.
>   _SAVING | _RESUMING and _RUNNING | _RESUMING => Invalid states  
> >>>
> >>> A table might be useful here and in the uapi header to indicate valid
> >>> states:  
> >>
> >> I like that.
> >>  
> >>>
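> >>> | _RESUMING | _SAVING | _RUNNING | Description
> >>> +-----------+---------+----------+------------------------------------------
> >>> |     0     |    0    |     0    | Stopped, not saving or resuming (a)
> >>> +-----------+---------+----------+------------------------------------------
> >>> |     0     |    0    |     1    | Running, default state
> >>> +-----------+---------+----------+------------------------------------------
> >>> |     0     |    1    |     0    | Stopped, migration interface in save mode
> >>> +-----------+---------+----------+------------------------------------------
> >>> |     0     |    1    |     1    | Running, save mode interface, iterative
> >>> +-----------+---------+----------+------------------------------------------
> >>> |     1     |    0    |     0    | Stopped, migration resume interface active
> >>> +-----------+---------+----------+------------------------------------------
> >>> |     1     |    0    |     1    | Invalid (b)
> >>> +-----------+---------+----------+------------------------------------------
> >>> |     1     |    1    |     0    | Invalid (c)
> >>> +-----------+---------+----------+------------------------------------------
> >>> |     1     |    1    |     1    | Invalid (d)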
> >>>
> >>> I think we need to consider whether we define (a) as generally
> >>> available, for instance we might want to use it for diagnostics or a
> >>> fatal error condition outside of migration.
> >>>
> >>> Are there hidden assumptions between state transitions here or are
> >>> there specific next possible state diagrams that we need to include as
> >>> well?  
> >>
> >> Some kind of state-change diagram might be useful in addition to the
> >> textual description anyway. Let me try, just to make sure I understand
> >> this correctly:
> >>  
> 
> During User application initialization, there is one more state change:
> 
> 0) 0/0/0  stop to running -> 0/0/1

0/0/0 cannot be the initial state of the device, that would imply that
a device supporting this migration interface breaks backwards
compatibility with all existing vfio userspace code and that code needs
to learn to set the device running as part of its initialization.
That's absolutely unacceptable.  The initial device state must be 0/0/1.

> >> 1) 0/0/1 ---(trigger driver to start gathering state info)---> 0/1/1  
> 
> not just gathering state info, but also copy device state to be 
> transferred during pre-copy phase.
> 
> Below 2 state are not just to tell driver to stop, those 2 differ.
> 2) is device state changed from running to stop, this is when VM 
> shutdowns cleanly, no need to save device state

Userspace is under no obligation to perform this state change though,
backwards compatibility dictates this.
 
> >> 2) 0/0/1 ---(tell driver to stop)---> 0/0/0   
> 
> >> 3) 0/1/1 ---(tell driver to stop)---> 0/1/0  
> 
> above is transition from pre-copy phase to stop-and-copy phase, where 
> device data should be made available to user to transfer to destination 
> or to save it to file in case of save VM or suspend.
> 
> 
> >> 4) 0/0/1 ---(tell driver to resume with provided info)---> 1/0/0  
> > 
> > I think this is to switch into resuming mode, the data will follow
> >  
> >> 5) 1/0/0 ---(driver is ready)---> 0/0/1
> >> 6) 0/1/1 ---(tell driver to stop saving)---> 0/0/1  
> >  
> 
> above can occur on migration cancelled or failed.
> 
> 
> > I think also:
> > 
> > 0/0/1 --> 0/1/0 If user chooses to go directly to stop and copy  
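
For reference, the transitions enumerated in the thread so far can be collected into a small lookup table. This is only a sketch of the discussion, not a proposed uapi: the names are hypothetical, states are written as _RESUMING/_SAVING/_RUNNING to match the notation used above, and the disputed 0/0/0 -> 0/0/1 initialization step is deliberately left out, since the initial state is argued to be 0/0/1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Encode a state written as _RESUMING/_SAVING/_RUNNING, e.g. ST(0, 1, 1). */
#define ST(resuming, saving, running) \
    (((resuming) << 2) | ((saving) << 1) | (running))

struct transition {
    uint8_t from;
    uint8_t to;
};

/* Transitions listed in the thread; not claimed to be exhaustive. */
static const struct transition listed[] = {
    { ST(0, 0, 1), ST(0, 1, 1) },   /* 1) start saving while running (pre-copy) */
    { ST(0, 0, 1), ST(0, 0, 0) },   /* 2) stop, nothing to save */
    { ST(0, 1, 1), ST(0, 1, 0) },   /* 3) pre-copy to stop-and-copy */
    { ST(0, 0, 1), ST(1, 0, 0) },   /* 4) switch into resuming mode */
    { ST(1, 0, 0), ST(0, 0, 1) },   /* 5) resume done, device running */
    { ST(0, 1, 1), ST(0, 0, 1) },   /* 6) migration cancelled or failed */
    { ST(0, 0, 1), ST(0, 1, 0) },   /* directly to stop-and-copy */
};

static bool transition_listed(uint8_t from, uint8_t to)
{
    for (size_t i = 0; i < sizeof(listed) / sizeof(listed[0]); i++) {
        if (listed[i].from == from && listed[i].to == to)
            return true;
    }
    return false;
}
```

Keeping the list in one table makes it easy to see which arcs are still undecided, for example anything leaving 0/1/0 or 0/0/0.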

Re: [PATCH v9 Kernel 2/5] vfio iommu: Add ioctl defination to get dirty pages bitmap.

2019-11-13 Thread Kirti Wankhede




On 11/13/2019 4:00 AM, Alex Williamson wrote:

On Tue, 12 Nov 2019 22:33:37 +0530
Kirti Wankhede  wrote:


All pages pinned by vendor driver through vfio_pin_pages API should be
considered as dirty during migration. IOMMU container maintains a list of
all such pinned pages. Added an ioctl defination to get bitmap of such


definition


pinned pages for requested IO virtual address range.


Additionally, all mapped pages are considered dirty when physically
mapped through to an IOMMU, modulo we discussed devices opting in to
per page pinning to indicate finer granularity with a TBD mechanism to
figure out if any non-opt-in devices remain.



You mean, in case of device direct assignment (device pass through)?


Signed-off-by: Kirti Wankhede 
Reviewed-by: Neo Jia 
---
  include/uapi/linux/vfio.h | 23 +++
  1 file changed, 23 insertions(+)

diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 35b09427ad9f..6fd3822aa610 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -902,6 +902,29 @@ struct vfio_iommu_type1_dma_unmap {
  #define VFIO_IOMMU_ENABLE    _IO(VFIO_TYPE, VFIO_BASE + 15)
  #define VFIO_IOMMU_DISABLE   _IO(VFIO_TYPE, VFIO_BASE + 16)
  
+/**

+ * VFIO_IOMMU_GET_DIRTY_BITMAP - _IOWR(VFIO_TYPE, VFIO_BASE + 17,
+ * struct vfio_iommu_type1_dirty_bitmap)
+ *
+ * IOCTL to get dirty pages bitmap for IOMMU container during migration.
+ * Get dirty pages bitmap of given IO virtual addresses range using
+ * struct vfio_iommu_type1_dirty_bitmap. Caller sets argsz, which is size of
+ * struct vfio_iommu_type1_dirty_bitmap. User should allocate memory to get
+ * bitmap and should set size of allocated memory in bitmap_size field.
+ * One bit is used to represent per page consecutively starting from iova
+ * offset. Bit set indicates page at that offset from iova is dirty.
+ */
+struct vfio_iommu_type1_dirty_bitmap {
+   __u32   argsz;
+   __u32   flags;
+   __u64   iova;          /* IO virtual address */
+   __u64   size;          /* Size of iova range */
+   __u64   bitmap_size;   /* in bytes */


This seems redundant.  We can calculate the size of the bitmap based on
the iova size.



But in kernel space, we need to validate the size of memory allocated by 
user instead of assuming user is always correct, right?



+   void __user *bitmap;/* one bit per page */


Should we define that as a __u64* to (a) help with the size
calculation, and (b) assure that we can use 8-byte ops on it?

However, who defines page size?  Is it necessarily the processor page
size?  A physical IOMMU may support page sizes other than the CPU page
size.  It might be more important to indicate the expected page size
than the bitmap size.  Thanks,



I see in QEMU and in vfio_iommu_type1 module, page sizes considered for 
mapping are CPU page size, 4K. Do we still need to have such argument?


Thanks,
Kirti


Alex


+};
+
+#define VFIO_IOMMU_GET_DIRTY_BITMAP _IO(VFIO_TYPE, VFIO_BASE + 17)
+
  /*  Additional API for SPAPR TCE (Server POWERPC) IOMMU  */
  
  /*






RE: QEMU for Qualcomm Hexagon - KVM Forum talk and code available

2019-11-13 Thread Taylor Simpson
Responses below ...

Taylor


Taylor Simpson  writes:

> I had discussions with several people at the KVM Forum, and I’ve been 
> thinking about how to divide up the code for community review.  Here is my 
> proposal for the steps.
>
>   1.  linux-user changes + linux-user/hexagon + skeleton of
> target/hexagon This is the minimum amount to build and run a very simple 
> program.  I
>   have an assembly program that prints “Hello” and exits.  It is
>   constructed to use very few instructions that can be added brute
>   force in the Hexagon back end.

I'm hoping most of the linux-user changes are in the hexagon runloop?
There has been quite a bit of work splitting up and cleaning up the #ifdef mess 
in linux-user over the last few years.

[Taylor] The majority of the linux-user support is in linux-user/hexagon.  
However, there were still a few changes needed in the linux-user directory.
elfload.c Needs some code to match some existing #ifdef TARGET_xyz code 
(e.g., define the init_thread function).
signal.c Needs some code to map signal 33 from the guest to something else 
on the target.
  I spoke to Laurent about this at the conference.
syscall.c Needs a definition of regpairs_aligned that returns 1.
syscall_defs.h Needs some definitions (e.g., TARGET_IOC_SIZEBITS and 
target_stat) added to the other #ifdef TARGET_xyz blocks.

>   2.  Add the code that is imported from the Hexagon simulator and the
> qemu helper generator This will allow the scalar ISA to be executed.
> This will grow the set of programs that could execute, but there will still 
> be limitations.
> In particular, there can be no packets which means the C library won’t
> work .  We have to build with -nostdlib

You could run -nostdlib system TCG tests (hello and memory) but that would 
require modelling some sort of hardware and assumes you have a simple serial 
port or semihosting solution. That said a bunch of the MIPS tests are 
linux-user and -nostdlib so that isn't a major problem in getting some of the 
tests running.

When you say code imported from the hexagon simulator I was under the 
impression you were generating code from the instruction description.
Otherwise you'll need to be very clear about your licensing grants.

[Taylor] That is correct, we are generating code from the description.  There 
are two major pieces that are imported
Instruction decode logic
Any additional functions that are called from the generated code

[Taylor] All of the code will be licensed the same way.  I want to mark the 
imported code because it does not conform to the qemu coding standards.  I 
prefer not to reformat it in order to easily get bug fixes and enhancements 
going forward.  I also hope it will make the community review easier by 
allowing people to focus on the code that is new for qemu.

>   3.  Add support for packet semantics At this point, we will be able
> to execute full programs linked with the C library.  This will include
> the check-tcg tests.

I think the interesting question is if the roll-back semantics of the hexagon 
are something we might need for other emulated architectures or is a 
particularly specific solution for Hexagon (I'm guessing the latter).

[Taylor] It is currently Hexagon-specific and isolated into the target/hexagon 
directory.  I was thinking the reviewers would have an easier time 
understanding the code if this were broken out.  However, it could also be 
merged together with step 2 if that is preferred.

>   4.  Add support for the wide vector extensions
>   5.  Add the helper overrides for performance optimization Some of
> these will be written by hand, and we’ll work with rev.ng to
>   integrate their flex/bison generator.

One thing to nail down is whether we include the generated code in the source 
tree with a tool to regenerate it (much like we do for
linux-headers) or add the dependency and regenerate each time 
from scratch. I don't see including flex/bison as a dependency being a major 
issue (in fact we have it in our docker images so I guess something uses it). 
However it might be trickier depending on libclang which was also being 
discussed.

[Taylor] Currently, I have the generator and the generated code sitting in the 
source tree.  I'm flexible on this if the decision is to regenerate it every 
time.

>
> I would love some feedback on this proposal.  Hopefully, that is enough 
> detail so that people can comment.  If anything isn’t clear, please ask 
> questions.
>
>
> Thanks,
> Taylor
>
>
> From: Qemu-devel 
> On Behalf Of Taylor Simpson
> Sent: Tuesday, November 5, 2019 10:33 AM
> To: Aleksandar Markovic 
> Cc: Alessandro Di Federico ; ni...@rev.ng;
> qemu-devel@nongnu.org; Niccolò Izzo 
> Subject: RE: QEMU for Qualcomm Hexagon - KVM Forum talk and code
> available
>
> Hi Aleksandar,
>
> Thank you – We’re glad you enjoyed the talk.
>
> One point of clarification on SIMD in Hexagon.  What we refer to as the 
> “scalar” core does 

Re: [PATCH v9 Kernel 1/5] vfio: KABI for migration interface for device state

2019-11-13 Thread Kirti Wankhede




On 11/13/2019 11:57 PM, Alex Williamson wrote:

On Wed, 13 Nov 2019 11:24:17 +0100
Cornelia Huck  wrote:


On Tue, 12 Nov 2019 15:30:05 -0700
Alex Williamson  wrote:


On Tue, 12 Nov 2019 22:33:36 +0530
Kirti Wankhede  wrote:
   

- Defined MIGRATION region type and sub-type.
- Used 3 bits to define VFIO device states.
 Bit 0 => _RUNNING
 Bit 1 => _SAVING
 Bit 2 => _RESUMING
 Combination of these bits defines VFIO device's state during migration
 _RUNNING => Normal VFIO device running state. When its reset, it
indicates _STOPPED state. when device is changed to
_STOPPED, driver should stop device before write()
returns.
 _SAVING | _RUNNING => vCPUs are running, VFIO device is running but
   start saving state of device i.e. pre-copy state
 _SAVING  => vCPUs are stopped, VFIO device should be stopped, and


s/should/must/
   

 save device state,i.e. stop-n-copy state
 _RESUMING => VFIO device resuming state.
 _SAVING | _RESUMING and _RUNNING | _RESUMING => Invalid states


A table might be useful here and in the uapi header to indicate valid
states:


I like that.



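| _RESUMING | _SAVING | _RUNNING | Description
+-----------+---------+----------+------------------------------------------
|     0     |    0    |     0    | Stopped, not saving or resuming (a)
+-----------+---------+----------+------------------------------------------
|     0     |    0    |     1    | Running, default state
+-----------+---------+----------+------------------------------------------
|     0     |    1    |     0    | Stopped, migration interface in save mode
+-----------+---------+----------+------------------------------------------
|     0     |    1    |     1    | Running, save mode interface, iterative
+-----------+---------+----------+------------------------------------------
|     1     |    0    |     0    | Stopped, migration resume interface active
+-----------+---------+----------+------------------------------------------
|     1     |    0    |     1    | Invalid (b)
+-----------+---------+----------+------------------------------------------
|     1     |    1    |     0    | Invalid (c)
+-----------+---------+----------+------------------------------------------
|     1     |    1    |     1    | Invalid (d)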

I think we need to consider whether we define (a) as generally
available, for instance we might want to use it for diagnostics or a
fatal error condition outside of migration.
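
For illustration only, the validity rule implied by the table can be sketched in a few lines of C. The macro and function names below are hypothetical (the thread only refers to the bits as _RUNNING, _SAVING and _RESUMING), and this is not the code from the proposed uapi header:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three state bits described in the thread. */
#define DEV_STATE_RUNNING  (1u << 0)
#define DEV_STATE_SAVING   (1u << 1)
#define DEV_STATE_RESUMING (1u << 2)
#define DEV_STATE_MASK     (DEV_STATE_RUNNING | DEV_STATE_SAVING | \
                            DEV_STATE_RESUMING)

/*
 * Returns true for the five combinations the table lists as valid.
 * Bits 3-31 are reserved and are ignored by this check.
 */
static bool dev_state_is_valid(uint32_t device_state)
{
    uint32_t s = device_state & DEV_STATE_MASK;

    /* _RESUMING must stand alone: rows (b), (c) and (d) are invalid. */
    if (s & DEV_STATE_RESUMING) {
        return s == DEV_STATE_RESUMING;
    }
    return true;
}
```

Whether state (a), all bits clear, should be generally available is exactly the open question raised above; the sketch simply accepts it.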

Are there hidden assumptions between state transitions here or are
there specific next possible state diagrams that we need to include as
well?


Some kind of state-change diagram might be useful in addition to the
textual description anyway. Let me try, just to make sure I understand
this correctly:



During User application initialization, there is one more state change:

0) 0/0/0  stop to running -> 0/0/1


1) 0/0/1 ---(trigger driver to start gathering state info)---> 0/1/1


Not just gathering state info, but also copying the device state to be 
transferred during the pre-copy phase.


The two transitions below do not merely tell the driver to stop; they differ.
2) is the device state changing from running to stopped. This happens when 
the VM shuts down cleanly, so there is no need to save device state.


2) 0/0/1 ---(tell driver to stop)---> 0/0/0 



3) 0/1/1 ---(tell driver to stop)---> 0/1/0


The above is the transition from the pre-copy phase to the stop-and-copy 
phase, where device data should be made available to the user to transfer to 
the destination, or to save to a file in the case of save VM or suspend.




4) 0/0/1 ---(tell driver to resume with provided info)---> 1/0/0


I think this is to switch into resuming mode, the data will follow.

5) 1/0/0 ---(driver is ready)---> 0/0/1
6) 0/1/1 ---(tell driver to stop saving)---> 0/0/1




above can occur on migration cancelled or failed.



I think also:

0/0/1 --> 0/1/0 If user chooses to go directly to stop and copy


that's right, this happens in case of save VM or suspend VM.
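
As a reading aid, the transitions enumerated so far in this exchange can be written down as a small lookup table. This is only a sketch under stated assumptions: the enum, struct and function names are hypothetical, states are encoded as _RESUMING/_SAVING/_RUNNING in bits 2/1/0, and the table contains only the transitions explicitly listed above, not any "reachable from any state" rule that may still be agreed on:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* States in the thread's x/y/z notation, _RESUMING/_SAVING/_RUNNING. */
enum {
    ST_STOPPED   = 0x0, /* 0/0/0 */
    ST_RUNNING   = 0x1, /* 0/0/1 */
    ST_STOP_COPY = 0x2, /* 0/1/0 */
    ST_PRE_COPY  = 0x3, /* 0/1/1 */
    ST_RESUMING  = 0x4, /* 1/0/0 */
};

struct transition {
    uint32_t from;
    uint32_t to;
};

/* Only the transitions spelled out in the discussion above. */
static const struct transition listed[] = {
    { ST_STOPPED,  ST_RUNNING   }, /* 0) application initialization */
    { ST_RUNNING,  ST_PRE_COPY  }, /* 1) start saving, pre-copy */
    { ST_RUNNING,  ST_STOPPED   }, /* 2) clean shutdown, nothing saved */
    { ST_PRE_COPY, ST_STOP_COPY }, /* 3) pre-copy to stop-and-copy */
    { ST_RUNNING,  ST_RESUMING  }, /* 4) switch into resuming mode */
    { ST_RESUMING, ST_RUNNING   }, /* 5) resume finished */
    { ST_PRE_COPY, ST_RUNNING   }, /* 6) migration cancelled or failed */
    { ST_RUNNING,  ST_STOP_COPY }, /* directly to stop-and-copy */
};

/* Returns true if (from -> to) appears in the table above. */
static bool transition_is_listed(uint32_t from, uint32_t to)
{
    for (size_t i = 0; i < sizeof(listed) / sizeof(listed[0]); i++) {
        if (listed[i].from == from && listed[i].to == to) {
            return true;
        }
    }
    return false;
}
```

A table like this makes it easy to see which pairs are still undecided, for example anything leaving 0/1/0.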



0/0/0 and 0/0/1 should be reachable from any state, though I could see
that a vendor driver could fail transition from 1/0/0 -> 0/0/1 if the
received state is incomplete.  Somehow though a user always needs to
return the device to the initial state, so how does device_state
interact with the reset ioctl?  Would this automatically manipulate
device_state back to 0/0/1?


why would reset occur on 1/0/0 -> 0/0/1 failure?

If 1/0/0 -> 0/0/1 fails, the user should convey to the source that 
migration has failed, and then resume at the source.


  

Not sure about the usefulness of 2).


I explained this above.


Also, is 4) the only way to
trigger resuming? 

Yes.


And is the change in 5) performed by the driver, or
by userspace?


By userspace.


Are any other state transitions valid?

(...)


+ * Sequence to be followed for _SAVING|_RUNNING device state or pre-copy phase
+ * and for _SAVING device state or stop-and-copy phase:
+ * a. read 

Re: [PATCH v9 Kernel 1/5] vfio: KABI for migration interface for device state

2019-11-13 Thread Kirti Wankhede




On 11/13/2019 8:53 AM, Yan Zhao wrote:

On Wed, Nov 13, 2019 at 06:30:05AM +0800, Alex Williamson wrote:

On Tue, 12 Nov 2019 22:33:36 +0530
Kirti Wankhede  wrote:


- Defined MIGRATION region type and sub-type.
- Used 3 bits to define VFIO device states.
 Bit 0 => _RUNNING
 Bit 1 => _SAVING
 Bit 2 => _RESUMING
 Combination of these bits defines VFIO device's state during migration
 _RUNNING => Normal VFIO device running state. When its reset, it
indicates _STOPPED state. when device is changed to
_STOPPED, driver should stop device before write()
returns.
 _SAVING | _RUNNING => vCPUs are running, VFIO device is running but
   start saving state of device i.e. pre-copy state
 _SAVING  => vCPUs are stopped, VFIO device should be stopped, and


s/should/must/


 save device state,i.e. stop-n-copy state
 _RESUMING => VFIO device resuming state.
 _SAVING | _RESUMING and _RUNNING | _RESUMING => Invalid states


A table might be useful here and in the uapi header to indicate valid
states:

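| _RESUMING | _SAVING | _RUNNING | Description
+-----------+---------+----------+------------------------------------------
|     0     |    0    |     0    | Stopped, not saving or resuming (a)
+-----------+---------+----------+------------------------------------------
|     0     |    0    |     1    | Running, default state
+-----------+---------+----------+------------------------------------------
|     0     |    1    |     0    | Stopped, migration interface in save mode
+-----------+---------+----------+------------------------------------------
|     0     |    1    |     1    | Running, save mode interface, iterative
+-----------+---------+----------+------------------------------------------
|     1     |    0    |     0    | Stopped, migration resume interface active
+-----------+---------+----------+------------------------------------------
|     1     |    0    |     1    | Invalid (b)
+-----------+---------+----------+------------------------------------------
|     1     |    1    |     0    | Invalid (c)
+-----------+---------+----------+------------------------------------------
|     1     |    1    |     1    | Invalid (d)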

I think we need to consider whether we define (a) as generally
available, for instance we might want to use it for diagnostics or a
fatal error condition outside of migration.



We have to set it as the init state. I'll add this.


Are there hidden assumptions between state transitions here or are
there specific next possible state diagrams that we need to include as
well?

I'm curious if Intel agrees with the states marked invalid with their
push for post-copy support.


hi Alex and Kirti,
Actually, for postcopy, I think we anyway need an extra POSTCOPY state
introduced. Reasons as below:
- on the target side, the _RESUMING state is set at the beginning of
   migration. It cannot be used to inform the device that it is
   currently in postcopy state and that device DMAs are to be trapped and
   pre-faulted.
   we also cannot use state (_RESUMING + _RUNNING) as an indicator of
   postcopy, because before device & vm running in target side, some device
   state are already loaded (e.g. page tables, pending workloads),
   target side can do pre-pagefault at that period before target vm up.
- in the source side, after device is stopped, postcopy needs saving
   device state only (as compared to device state + remaining dirty
   pages in precopy). state (!_RUNNING + _SAVING) here again cannot
   differentiate precopy and postcopy here.


 Bits 3 - 31 are reserved for future use. User should perform
 read-modify-write operation on this field.
- Defined vfio_device_migration_info structure which will be placed at 0th
   offset of migration region to get/set VFIO device related information.
   Defined members of structure and usage on read/write access:
 * device_state: (read/write)
 To convey VFIO device state to be transitioned to. Only 3 bits are
used as of now, Bits 3 - 31 are reserved for future use.
 * pending bytes: (read only)
 To get pending bytes yet to be migrated for VFIO device.
 * data_offset: (read only)
 To get data offset in migration region from where data exist
during _SAVING and from where data should be written by user space
application during _RESUMING state.
 * data_size: (read/write)
 To get and set size in bytes of data copied in migration region
during _SAVING and _RESUMING state.

Migration region looks like:
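 ------------------------------------------------------------------
|vfio_device_migration_info|    data section                      |
|                          |     ///////////////////////////////  |
 ------------------------------------------------------------------
 ^                              ^
 offset 0-trapped part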
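
The read-modify-write rule for device_state stated above (only three bits are defined, bits 3 - 31 are reserved) could look roughly like this on the user side. This is a minimal sketch, not the proposed interface: the macro and function names are hypothetical, and it works on a plain 32-bit value rather than on the real migration region:

```c
#include <stdint.h>

/* Hypothetical mask covering the three defined bits (bits 0-2). */
#define DEV_STATE_MASK 0x7u

/*
 * Compute the value to write back: replace the three defined state
 * bits with new_state and keep the reserved bits 3-31 as they were read.
 */
static uint32_t dev_state_update(uint32_t current, uint32_t new_state)
{
    return (current & ~DEV_STATE_MASK) | (new_state & DEV_STATE_MASK);
}
```

A user application would read the current device_state from the migration region, pass it through a helper like this, and write the result back, so that any future use of the reserved bits is left untouched.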

[PATCH v1] s390x: kvm-unit-tests: a PONG device for Sub Channels tests

2019-11-13 Thread Pierre Morel
The PONG device accepts two commands, PONG_READ and PONG_WRITE,
which allow reading from and writing to an internal buffer of
1024 bytes.

The QEMU device is named ccw-pong.
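
For readers skimming the patch, the length handling that both command handlers share can be summarized in a standalone sketch. The struct and function names here are hypothetical; the buffer size is taken from PONG_BUF_SIZE in the patch below, and the residual value corresponds to what the patch stores in the SCSW count:

```c
#include <stdint.h>

#define PONG_BUF_SIZE 0x1000

/* Hypothetical result type: bytes to transfer and bytes left over. */
struct pong_xfer {
    uint32_t len;
    uint32_t residual;
};

/*
 * Clamp a requested CCW count to the device buffer size and report
 * how many bytes could not be transferred.
 */
static struct pong_xfer pong_clamp(uint32_t count)
{
    struct pong_xfer x;

    if (count > PONG_BUF_SIZE) {
        x.len = PONG_BUF_SIZE;
        x.residual = count - PONG_BUF_SIZE;
    } else {
        x.len = count;
        x.residual = 0;
    }
    return x;
}
```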

Signed-off-by: Pierre Morel 
---
 hw/s390x/Makefile.objs  |   1 +
 hw/s390x/ccw-pong.c | 186 
 include/hw/s390x/pong.h |  47 
 3 files changed, 234 insertions(+)
 create mode 100644 hw/s390x/ccw-pong.c
 create mode 100644 include/hw/s390x/pong.h

diff --git a/hw/s390x/Makefile.objs b/hw/s390x/Makefile.objs
index ee91152..3a83438 100644
--- a/hw/s390x/Makefile.objs
+++ b/hw/s390x/Makefile.objs
@@ -32,6 +32,7 @@ obj-$(CONFIG_KVM) += tod-kvm.o
 obj-$(CONFIG_KVM) += s390-skeys-kvm.o
 obj-$(CONFIG_KVM) += s390-stattrib-kvm.o s390-mchk.o
 obj-y += s390-ccw.o
+obj-y += ccw-pong.o
 obj-y += ap-device.o
 obj-y += ap-bridge.o
 obj-y += s390-sei.o
diff --git a/hw/s390x/ccw-pong.c b/hw/s390x/ccw-pong.c
new file mode 100644
index 0000000..e7439d5
--- /dev/null
+++ b/hw/s390x/ccw-pong.c
@@ -0,0 +1,186 @@
+/*
+ * CCW PING-PONG
+ *
+ * Copyright 2019 IBM Corp.
+ * Author(s): Pierre Morel 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or (at
+ * your option) any later version. See the COPYING file in the top-level
+ * directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qapi/error.h"
+#include "qemu/module.h"
+#include "cpu.h"
+#include "exec/address-spaces.h"
+#include "hw/s390x/css.h"
+#include "hw/s390x/css-bridge.h"
+#include "hw/qdev-properties.h"
+#include "hw/s390x/pong.h"
+
+#define PONG_BUF_SIZE 0x1000
+static char buf[PONG_BUF_SIZE] = "Hello world\n";
+
+static inline int pong_rw(CCW1 *ccw, char *p, int len, bool dir)
+{
+int ret;
+
+ret = address_space_rw(&address_space_memory, ccw->cda,
+   MEMTXATTRS_UNSPECIFIED,
+   (unsigned char *)buf, len, dir);
+
+return (ret == MEMTX_OK) ? 0 : -EIO;
+}
+
+/* Handle READ ccw commands from guest */
+static int handle_payload_read(CcwPONGDevice *dev, CCW1 *ccw)
+{
+CcwDevice *ccw_dev = CCW_DEVICE(dev);
+int len;
+
+if (!ccw->cda) {
+return -EFAULT;
+}
+
+if (ccw->count > PONG_BUF_SIZE) {
+len = PONG_BUF_SIZE;
+ccw_dev->sch->curr_status.scsw.count = ccw->count - PONG_BUF_SIZE;
+} else {
+len = ccw->count;
+ccw_dev->sch->curr_status.scsw.count = 0;
+}
+
+return pong_rw(ccw, buf, len, 1);
+}
+
+/*
+ * Handle WRITE ccw commands to write data to client
+ * The SCSW count is set to the number of bytes not transferred.
+ */
+static int handle_payload_write(CcwPONGDevice *dev, CCW1 *ccw)
+{
+CcwDevice *ccw_dev = CCW_DEVICE(dev);
+int len;
+
+if (!ccw->cda) {
+ccw_dev->sch->curr_status.scsw.count = ccw->count;
+return -EFAULT;
+}
+
+if (ccw->count > PONG_BUF_SIZE) {
+len = PONG_BUF_SIZE;
+ccw_dev->sch->curr_status.scsw.count = ccw->count - PONG_BUF_SIZE;
+} else {
+len = ccw->count;
+ccw_dev->sch->curr_status.scsw.count = 0;
+}
+
+return pong_rw(ccw, buf, len, 0);
+}
+
+static int pong_ccw_cb(SubchDev *sch, CCW1 ccw)
+{
+int rc = 0;
+CcwPONGDevice *dev = sch->driver_data;
+
+switch (ccw.cmd_code) {
+case PONG_WRITE:
+rc = handle_payload_write(dev, &ccw);
+break;
+case PONG_READ:
+rc = handle_payload_read(dev, &ccw);
+break;
+default:
+rc = -ENOSYS;
+break;
+}
+
+if (rc == -EIO) {
+/* I/O error, specific devices generate specific conditions */
+SCHIB *schib = &sch->curr_status;
+
+sch->curr_status.scsw.dstat = SCSW_DSTAT_UNIT_CHECK;
+sch->sense_data[0] = 0x40;/* intervention-req */
+schib->scsw.ctrl &= ~SCSW_ACTL_START_PEND;
+schib->scsw.ctrl &= ~SCSW_CTRL_MASK_STCTL;
+schib->scsw.ctrl |= SCSW_STCTL_PRIMARY | SCSW_STCTL_SECONDARY |
+   SCSW_STCTL_ALERT | SCSW_STCTL_STATUS_PEND;
+}
+return rc;
+}
+
+static void pong_ccw_realize(DeviceState *ds, Error **errp)
+{
+uint16_t chpid;
+CcwPONGDevice *dev = CCW_PONG(ds);
+CcwDevice *cdev = CCW_DEVICE(ds);
+CCWDeviceClass *cdk = CCW_DEVICE_GET_CLASS(cdev);
+SubchDev *sch;
+Error *err = NULL;
+
+sch = css_create_sch(cdev->devno, errp);
+if (!sch) {
+return;
+}
+
+sch->driver_data = dev;
+cdev->sch = sch;
+chpid = css_find_free_chpid(sch->cssid);
+
+if (chpid > MAX_CHPID) {
+error_setg(&err, "No available chpid to use.");
+goto out_err;
+}
+
+sch->id.reserved = 0xff;
+sch->id.cu_type = CCW_PONG_CU_TYPE;
+css_sch_build_virtual_schib(sch, (uint8_t)chpid,
+CCW_PONG_CHPID_TYPE);
+sch->do_subchannel_work = do_subchannel_work_virtual;
+sch->ccw_cb = pong_ccw_cb;
+
+cdk->realize(cdev, &err);
+if (err) {
+goto out_err;
+}
+
+css_reset_sch(sch);
+return;
+
+out_err:
+error_propagate(errp, err);
+ 

Re: [EXTERNAL]Re: [PATCH v1 5/5] .travis.yml: drop 32 bit systems from MAIN_SOFTMMU_TARGETS

2019-11-13 Thread Alex Bennée


Aleksandar Markovic  writes:

>> From: Philippe Mathieu-Daudé 
>> > -- MAIN_SOFTMMU_TARGETS="aarch64-softmmu,arm-softmmu,i386-softmmu,mips-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu"
>> > +- MAIN_SOFTMMU_TARGETS="aarch64-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu"
>>
>> Aleksandar, since you mostly test 32-bit MIPS, are you OK we keep
>> mips-softmmu and drop mips64-softmmu here? Another job (acceptance-test)
>> builds the mips64el-softmmu.
>
> Philippe, thanks for bringing this to my attention. Yes, 32-bit mips
> targets are important to us, but, what can we do, time constraints are
> time constraints, so I agree with Alex change, please go ahead, Alex.
> We can test 32-bit mips targets via other acceptance tests (those that
> can run longer, so-called "slow" group), and perhaps we can extend
> them to test more 32-bit mips systems.

To be clear both gcc and clang have rules that test:

- CONFIG="--disable-user --target-list-exclude=${MAIN_SOFTMMU_TARGETS}"

So the main targets which are reducing their coverage are:

- CONFIG="--enable-debug --target-list=${MAIN_SOFTMMU_TARGETS}"

- CONFIG="--enable-modules --target-list=${MAIN_SOFTMMU_TARGETS}"

- CONFIG="--target-list=${MAIN_SOFTMMU_TARGETS} "
- CACHE_NAME="${TRAVIS_BRANCH}-linux-clang-sanitize"
  compiler: clang
  before_script:
- ./configure ${CONFIG} --extra-cflags="-fsanitize=undefined -Werror" || { cat config.log && exit 1; }

- CONFIG="--enable-gprof --enable-gcov --disable-pie --target-list=${MAIN_SOFTMMU_TARGETS}"

and the MacOSX Xcode 9.4 build:
# MacOSX builds
- env:
- CONFIG="--target-list=${MAIN_SOFTMMU_TARGETS}"
  os: osx
  osx_image: xcode9.4
  compiler: clang

The Xcode 10.3 build is already a reduced list:
- CONFIG="--target-list=i386-softmmu,ppc-softmmu,ppc64-softmmu,m68k-softmmu,x86_64-softmmu"


>
> Thanks to everybody,
> Aleksandar


--
Alex Bennée



[PATCH V2] WHPX: refactor load library

2019-11-13 Thread Sunil Muthuswamy
This refactors the load library of WHV libraries to make it more
modular. It makes a helper routine that can be called on demand.
This allows future expansion of load library/functions to support
functionality that is dependent on some feature being available.
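
The helper described above relies on the list-macro pattern (one macro that enumerates the functions, expanded several times). Below is a portable sketch of that pattern, under clear assumptions: every name in it is hypothetical, and a plain resolver callback stands in for GetProcAddress() so the example does not depend on the Windows API. It is not the code from this patch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Generic function-pointer type returned by the resolver. */
typedef void (*generic_fn)(void);

/* Hypothetical function list: X(return type, name, signature). */
#define LIST_DEMO_FUNCTIONS(X) \
    X(int, demo_add, (int a, int b)) \
    X(int, demo_neg, (int a))

/* One typedef and one struct field per listed function. */
#define DECLARE_TYPE(ret, name, sig) typedef ret (*name ## _t) sig;
#define DECLARE_FIELD(ret, name, sig) name ## _t name;
LIST_DEMO_FUNCTIONS(DECLARE_TYPE)
struct demo_dispatch {
    LIST_DEMO_FUNCTIONS(DECLARE_FIELD)
};

/* Stand-in for GetProcAddress(): maps a symbol name to a function. */
typedef generic_fn (*resolver_fn)(const char *name);

/* Fill every field of the dispatch table; fail on the first miss. */
static bool load_dispatch(struct demo_dispatch *d, resolver_fn resolve)
{
#define LOAD_FIELD(ret, name, sig) \
    d->name = (name ## _t)resolve(#name); \
    if (!d->name) { \
        return false; \
    }
    LIST_DEMO_FUNCTIONS(LOAD_FIELD)
#undef LOAD_FIELD
    return true;
}

/* Demo "library": two functions and a resolver that finds them by name. */
static int demo_add_impl(int a, int b) { return a + b; }
static int demo_neg_impl(int a) { return -a; }

static generic_fn demo_resolve(const char *name)
{
    if (strcmp(name, "demo_add") == 0) {
        return (generic_fn)demo_add_impl;
    }
    if (strcmp(name, "demo_neg") == 0) {
        return (generic_fn)demo_neg_impl;
    }
    return NULL;
}
```

Adding a function to the list macro is then enough to get its typedef, its table slot and its load step, which is what makes an on-demand, per-feature loader cheap to extend.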

Signed-off-by: Sunil Muthuswamy 
---
Changes since v1:
- Fixed typo of load_whp_dispatch_fns
- Fixed free of the right handle

 target/i386/whp-dispatch.h |  4 +++
 target/i386/whpx-all.c | 85 +++---
 2 files changed, 62 insertions(+), 27 deletions(-)

diff --git a/target/i386/whp-dispatch.h b/target/i386/whp-dispatch.h
index 23791fbb47..87d049ceab 100644
--- a/target/i386/whp-dispatch.h
+++ b/target/i386/whp-dispatch.h
@@ -50,5 +50,9 @@ extern struct WHPDispatch whp_dispatch;
 
 bool init_whp_dispatch(void);
 
+typedef enum WHPFunctionList {
+WINHV_PLATFORM_FNS_DEFAULT,
+WINHV_EMULATION_FNS_DEFAULT,
+} WHPFunctionList;
 
 #endif /* WHP_DISPATCH_H */
diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
index ed95105eae..f3c61fa5d8 100644
--- a/target/i386/whpx-all.c
+++ b/target/i386/whpx-all.c
@@ -1356,6 +1356,58 @@ static void whpx_handle_interrupt(CPUState *cpu, int mask)
 }
 }
 
+/*
+ * Load the functions from the given library, using the given handle. If a
+ * handle is provided, it is used, otherwise the library is opened. The
+ * handle will be updated on return with the opened one.
+ */
+static bool load_whp_dispatch_fns(HMODULE *handle,
+WHPFunctionList function_list)
+{
+HMODULE hLib = *handle;
+
+#define WINHV_PLATFORM_DLL "WinHvPlatform.dll"
+#define WINHV_EMULATION_DLL "WinHvEmulation.dll"
+#define WHP_LOAD_FIELD(return_type, function_name, signature) \
+whp_dispatch.function_name = \
+(function_name ## _t)GetProcAddress(hLib, #function_name); \
+if (!whp_dispatch.function_name) { \
+error_report("Could not load function %s", #function_name); \
+goto error; \
+} \
+
+#define WHP_LOAD_LIB(lib_name, handle_lib) \
+if (!handle_lib) { \
+handle_lib = LoadLibrary(lib_name); \
+if (!handle_lib) { \
+error_report("Could not load library %s.", lib_name); \
+goto error; \
+} \
+} \
+
+switch (function_list) {
+case WINHV_PLATFORM_FNS_DEFAULT:
+WHP_LOAD_LIB(WINHV_PLATFORM_DLL, hLib)
+LIST_WINHVPLATFORM_FUNCTIONS(WHP_LOAD_FIELD)
+break;
+
+case WINHV_EMULATION_FNS_DEFAULT:
+WHP_LOAD_LIB(WINHV_EMULATION_DLL, hLib)
+LIST_WINHVEMULATION_FUNCTIONS(WHP_LOAD_FIELD)
+break;
+}
+
+*handle = hLib;
+return true;
+
+error:
+if (hLib) {
+FreeLibrary(hLib);
+}
+
+return false;
+}
+
 /*
  * Partition support
  */
@@ -1491,51 +1543,30 @@ static void whpx_type_init(void)
 
 bool init_whp_dispatch(void)
 {
-const char *lib_name;
-HMODULE hLib;
-
 if (whp_dispatch_initialized) {
 return true;
 }
 
-#define WHP_LOAD_FIELD(return_type, function_name, signature) \
-whp_dispatch.function_name = \
-(function_name ## _t)GetProcAddress(hLib, #function_name); \
-if (!whp_dispatch.function_name) { \
-error_report("Could not load function %s from library %s.", \
- #function_name, lib_name); \
-goto error; \
-} \
-
-lib_name = "WinHvPlatform.dll";
-hWinHvPlatform = LoadLibrary(lib_name);
-if (!hWinHvPlatform) {
-error_report("Could not load library %s.", lib_name);
+if (!load_whp_dispatch_fns(&hWinHvPlatform, WINHV_PLATFORM_FNS_DEFAULT)) {
 goto error;
 }
-hLib = hWinHvPlatform;
-LIST_WINHVPLATFORM_FUNCTIONS(WHP_LOAD_FIELD)
 
-lib_name = "WinHvEmulation.dll";
-hWinHvEmulation = LoadLibrary(lib_name);
-if (!hWinHvEmulation) {
-error_report("Could not load library %s.", lib_name);
+if (!load_whp_dispatch_fns(&hWinHvEmulation, WINHV_EMULATION_FNS_DEFAULT)) {
 goto error;
 }
-hLib = hWinHvEmulation;
-LIST_WINHVEMULATION_FUNCTIONS(WHP_LOAD_FIELD)
 
 whp_dispatch_initialized = true;
-return true;
-
-error:
 
+return true;
+error:
 if (hWinHvPlatform) {
 FreeLibrary(hWinHvPlatform);
 }
+
 if (hWinHvEmulation) {
 FreeLibrary(hWinHvEmulation);
 }
+
 return false;
 }
 
-- 
2.16.4




Re: [PATCH v2 1/5] MAINTAINERS: Add a section on UI translation

2019-11-13 Thread Philippe Mathieu-Daudé

Hi Aleksandar,

On 11/13/19 2:47 PM, Aleksandar Markovic wrote:

From: Aleksandar Markovic 

There should be a person who will quickly evaluate new UI
translations, and find a way to update existing ones should
something change in the UI.


I appreciate your trust, but I'm afraid I know next to nothing about 
po/. I don't use QEMU's GUI myself: mostly command line, and via libvirt 
from time to time.


These files are about language translations, maybe it is easier to leave 
them unmaintained and have patches go via the trivial tree?




Signed-off-by: Aleksandar Markovic 
---
  MAINTAINERS | 5 +
  1 file changed, 5 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 363e72a..fd9ba32 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2714,6 +2714,11 @@ M: Daniel P. Berrange 
  S: Odd Fixes
  F: scripts/git-submodule.sh
  
+UI translations

+M: Aleksandar Markovic 
+R: Philippe Mathieu-Daudé 
+F: po/*.po
+
  Sphinx documentation configuration and build machinery
  M: Peter Maydell 
  S: Maintained






[PATCH v7 1/3] block: introduce compress filter driver

2019-11-13 Thread Andrey Shinkevich
Allow writing all the data compressed through the filter driver.
The written data will be aligned by the cluster size.
Based on the QEMU current implementation, that data can be written to
unallocated clusters only. May be used for a backup job.

Suggested-by: Max Reitz 
Signed-off-by: Andrey Shinkevich 
---
 block/Makefile.objs |   1 +
 block/filter-compress.c | 201 
 qapi/block-core.json|  10 ++-
 3 files changed, 208 insertions(+), 4 deletions(-)
 create mode 100644 block/filter-compress.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index e394fe0..330529b 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -43,6 +43,7 @@ block-obj-y += crypto.o
 
 block-obj-y += aio_task.o
 block-obj-y += backup-top.o
+block-obj-y += filter-compress.o
 
 common-obj-y += stream.o
 
diff --git a/block/filter-compress.c b/block/filter-compress.c
new file mode 100644
index 0000000..64b1ee5
--- /dev/null
+++ b/block/filter-compress.c
@@ -0,0 +1,201 @@
+/*
+ * Compress filter block driver
+ *
+ * Copyright (c) 2019 Virtuozzo International GmbH
+ *
+ * Author:
+ *   Andrey Shinkevich 
+ *   (based on block/copy-on-read.c by Max Reitz)
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 or
+ * (at your option) any later version of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see .
+ */
+
+#include "qemu/osdep.h"
+#include "block/block_int.h"
+#include "qemu/module.h"
+
+
+static int compress_open(BlockDriverState *bs, QDict *options, int flags,
+ Error **errp)
+{
+bs->backing = bdrv_open_child(NULL, options, "file", bs, &child_file, false,
+  errp);
+if (!bs->backing) {
+return -EINVAL;
+}
+
+bs->supported_write_flags = BDRV_REQ_WRITE_UNCHANGED |
+BDRV_REQ_WRITE_COMPRESSED |
+(BDRV_REQ_FUA & bs->backing->bs->supported_write_flags);
+
+bs->supported_zero_flags = BDRV_REQ_WRITE_UNCHANGED |
+((BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP | BDRV_REQ_NO_FALLBACK) &
+bs->backing->bs->supported_zero_flags);
+
+return 0;
+}
+
+
+#define PERM_PASSTHROUGH (BLK_PERM_CONSISTENT_READ \
+  | BLK_PERM_WRITE \
+  | BLK_PERM_RESIZE)
+#define PERM_UNCHANGED (BLK_PERM_ALL & ~PERM_PASSTHROUGH)
+
+static void compress_child_perm(BlockDriverState *bs, BdrvChild *c,
+const BdrvChildRole *role,
+BlockReopenQueue *reopen_queue,
+uint64_t perm, uint64_t shared,
+uint64_t *nperm, uint64_t *nshared)
+{
+*nperm = perm & PERM_PASSTHROUGH;
+*nshared = (shared & PERM_PASSTHROUGH) | PERM_UNCHANGED;
+
+/*
+ * We must not request write permissions for an inactive node, the child
+ * cannot provide it.
+ */
+if (!(bs->open_flags & BDRV_O_INACTIVE)) {
+*nperm |= BLK_PERM_WRITE_UNCHANGED;
+}
+}
+
+
+static int64_t compress_getlength(BlockDriverState *bs)
+{
+return bdrv_getlength(bs->backing->bs);
+}
+
+
+static int coroutine_fn compress_co_truncate(BlockDriverState *bs,
+ int64_t offset, bool exact,
+ PreallocMode prealloc,
+ Error **errp)
+{
+return bdrv_co_truncate(bs->backing, offset, exact, prealloc, errp);
+}
+
+
+static int coroutine_fn compress_co_preadv_part(BlockDriverState *bs,
+uint64_t offset, uint64_t bytes,
+QEMUIOVector *qiov,
+size_t qiov_offset,
+int flags)
+{
+return bdrv_co_preadv_part(bs->backing, offset, bytes, qiov, qiov_offset,
+   flags);
+}
+
+
+static int coroutine_fn compress_co_pwritev_part(BlockDriverState *bs,
+ uint64_t offset,
+ uint64_t bytes,
+ QEMUIOVector *qiov,
+ size_t qiov_offset, int flags)
+{
+return bdrv_co_pwritev_part(bs->backing, offset, bytes, qiov, qiov_offset,
+flags | BDRV_REQ_WRITE_COMPRESSED);
+}
+
+
+static int coroutine_fn 

[PATCH v7 2/3] qcow2: Allow writing compressed data of multiple clusters

2019-11-13 Thread Andrey Shinkevich
QEMU currently supports writing compressed data only when its size equals
one cluster. This patch allows writing QCOW2 compressed data that
exceeds one cluster. The buffered data is now split into separate clusters,
which are written compressed using the existing functionality.
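
The splitting step can be pictured with the following sequential sketch. It is only an illustration under assumptions: the callback type, the statistics struct and the function names are hypothetical, the start offset is assumed to be cluster-aligned, and the chunks are processed one after another here, whereas the patch below queues them as tasks:

```c
#include <stdint.h>

/* Hypothetical per-chunk callback; a negative return value aborts. */
typedef int (*chunk_fn)(uint64_t offset, uint64_t bytes, void *opaque);

/*
 * Walk a request in cluster-sized chunks. Only the final chunk may be
 * shorter than cluster_size.
 */
static int for_each_cluster_chunk(uint64_t offset, uint64_t bytes,
                                  uint64_t cluster_size,
                                  chunk_fn fn, void *opaque)
{
    while (bytes > 0) {
        uint64_t chunk = bytes < cluster_size ? bytes : cluster_size;
        int ret = fn(offset, chunk, opaque);

        if (ret < 0) {
            return ret;
        }
        offset += chunk;
        bytes -= chunk;
    }
    return 0;
}

/* Example callback: count the chunks and remember the last size. */
struct chunk_stats {
    unsigned count;
    uint64_t last_bytes;
};

static int count_chunk(uint64_t offset, uint64_t bytes, void *opaque)
{
    struct chunk_stats *st = opaque;

    (void)offset;
    st->count++;
    st->last_bytes = bytes;
    return 0;
}
```

With 64 KiB clusters, a 224 KiB request is visited as three full clusters followed by one 32 KiB tail.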

Suggested-by: Pavel Butsykin 
Signed-off-by: Andrey Shinkevich 
Reviewed-by: Vladimir Sementsov-Ogievskiy 
---
 block/qcow2.c | 102 ++
 1 file changed, 75 insertions(+), 27 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index 7c18721..0e03a1a 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -4222,10 +4222,8 @@ fail:
 return ret;
 }
 
-/* XXX: put compressed sectors first, then all the cluster aligned
-   tables to avoid losing bytes in alignment */
 static coroutine_fn int
-qcow2_co_pwritev_compressed_part(BlockDriverState *bs,
+qcow2_co_pwritev_compressed_task(BlockDriverState *bs,
  uint64_t offset, uint64_t bytes,
  QEMUIOVector *qiov, size_t qiov_offset)
 {
@@ -4235,32 +4233,11 @@ qcow2_co_pwritev_compressed_part(BlockDriverState *bs,
 uint8_t *buf, *out_buf;
 uint64_t cluster_offset;
 
-if (has_data_file(bs)) {
-return -ENOTSUP;
-}
-
-if (bytes == 0) {
-/* align end of file to a sector boundary to ease reading with
-   sector based I/Os */
-int64_t len = bdrv_getlength(bs->file->bs);
-if (len < 0) {
-return len;
-}
-return bdrv_co_truncate(bs->file, len, false, PREALLOC_MODE_OFF, NULL);
-}
-
-if (offset_into_cluster(s, offset)) {
-return -EINVAL;
-}
+assert(bytes == s->cluster_size || (bytes < s->cluster_size &&
+   (offset + bytes == bs->total_sectors << BDRV_SECTOR_BITS)));
 
 buf = qemu_blockalign(bs, s->cluster_size);
-if (bytes != s->cluster_size) {
-if (bytes > s->cluster_size ||
-offset + bytes != bs->total_sectors << BDRV_SECTOR_BITS)
-{
-qemu_vfree(buf);
-return -EINVAL;
-}
+if (bytes < s->cluster_size) {
 /* Zero-pad last write if image size is not cluster aligned */
 memset(buf + bytes, 0, s->cluster_size - bytes);
 }
@@ -4309,6 +4286,77 @@ fail:
 return ret;
 }
 
+static coroutine_fn int qcow2_co_pwritev_compressed_task_entry(AioTask *task)
+{
+Qcow2AioTask *t = container_of(task, Qcow2AioTask, task);
+
+assert(!t->cluster_type && !t->l2meta);
+
+return qcow2_co_pwritev_compressed_task(t->bs, t->offset, t->bytes, t->qiov,
+t->qiov_offset);
+}
+
+/*
+ * XXX: put compressed sectors first, then all the cluster aligned
+ * tables to avoid losing bytes in alignment
+ */
+static coroutine_fn int
+qcow2_co_pwritev_compressed_part(BlockDriverState *bs,
+ uint64_t offset, uint64_t bytes,
+ QEMUIOVector *qiov, size_t qiov_offset)
+{
+BDRVQcow2State *s = bs->opaque;
+AioTaskPool *aio = NULL;
+int ret = 0;
+
+if (has_data_file(bs)) {
+return -ENOTSUP;
+}
+
+if (bytes == 0) {
+/*
+ * align end of file to a sector boundary to ease reading with
+ * sector based I/Os
+ */
+int64_t len = bdrv_getlength(bs->file->bs);
+if (len < 0) {
+return len;
+}
+return bdrv_co_truncate(bs->file, len, false, PREALLOC_MODE_OFF, NULL);
+}
+
+if (offset_into_cluster(s, offset)) {
+return -EINVAL;
+}
+
+while (bytes && aio_task_pool_status(aio) == 0) {
+uint64_t chunk_size = MIN(bytes, s->cluster_size);
+
+if (!aio && chunk_size != bytes) {
+aio = aio_task_pool_new(QCOW2_MAX_WORKERS);
+}
+
+ret = qcow2_add_task(bs, aio, qcow2_co_pwritev_compressed_task_entry,
+ 0, 0, offset, chunk_size, qiov, qiov_offset, NULL);
+if (ret < 0) {
+break;
+}
+qiov_offset += chunk_size;
+offset += chunk_size;
+bytes -= chunk_size;
+}
+
+if (aio) {
+aio_task_pool_wait_all(aio);
+if (ret == 0) {
+ret = aio_task_pool_status(aio);
+}
+g_free(aio);
+}
+
+return ret;
+}
+
 static int coroutine_fn
 qcow2_co_preadv_compressed(BlockDriverState *bs,
uint64_t file_cluster_offset,
-- 
1.8.3.1




Re: [EXTERNAL]Re: [PATCH v1 5/5] .travis.yml: drop 32 bit systems from MAIN_SOFTMMU_TARGETS

2019-11-13 Thread Philippe Mathieu-Daudé

On 11/13/19 6:38 PM, Aleksandar Markovic wrote:

From: Philippe Mathieu-Daudé 

-- MAIN_SOFTMMU_TARGETS="aarch64-softmmu,arm-softmmu,i386-softmmu,mips-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu"
+- MAIN_SOFTMMU_TARGETS="aarch64-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu"


Aleksandar, since you mostly test 32-bit MIPS, are you OK we keep
mips-softmmu and drop mips64-softmmu here? Another job (acceptance-test)
builds the mips64el-softmmu.


Philippe, thanks for bringing this to my attention. Yes, 32-bit mips targets are 
important to us, but, what can we do, time constraints are time constraints, so I agree 
with Alex change, please go ahead, Alex. We can test 32-bit mips targets via other 
acceptance tests (those that can run longer, so-called "slow" group), and 
perhaps we can extend them to test more 32-bit mips systems.


OK, let's keep mips64 as suggested Alex then.

Reviewed-by: Philippe Mathieu-Daudé 




[PATCH v7 3/3] tests/qemu-iotests: add case to write compressed data of multiple clusters

2019-11-13 Thread Andrey Shinkevich
Add a case to iotest #214 that checks the possibility of writing
compressed data larger than one cluster. The test case uses the
compress filter driver and serves as a sample of its usage.

Signed-off-by: Andrey Shinkevich 
---
 tests/qemu-iotests/214 | 43 +++
 tests/qemu-iotests/214.out | 14 ++
 2 files changed, 57 insertions(+)

diff --git a/tests/qemu-iotests/214 b/tests/qemu-iotests/214
index 21ec8a2..5012112 100755
--- a/tests/qemu-iotests/214
+++ b/tests/qemu-iotests/214
@@ -89,6 +89,49 @@ _check_test_img -r all
 $QEMU_IO -c "read  -P 0x11  0 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
 $QEMU_IO -c "read  -P 0x22 4M 4M" "$TEST_IMG" 2>&1 | _filter_qemu_io | _filter_testdir
 
+echo
+echo "=== Write compressed data of multiple clusters ==="
+echo
+cluster_size=0x10000
+_make_test_img 2M -o cluster_size=$cluster_size
+
+echo "Write uncompressed data:"
+let data_size="8 * $cluster_size"
+$QEMU_IO -c "write -P 0xaa 0 $data_size" "$TEST_IMG" \
+ 2>&1 | _filter_qemu_io | _filter_testdir
+sizeA=$($QEMU_IMG info --output=json "$TEST_IMG" |
+sed -n '/"actual-size":/ s/[^0-9]//gp')
+
+_make_test_img 2M -o cluster_size=$cluster_size
+echo "Write compressed data:"
+let data_size="3 * $cluster_size + ($cluster_size / 2)"
+# Set compress=on. That will align the written data
+# by the cluster size and will write them compressed.
+QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT \
+$QEMU_IO -c "write -P 0xbb 0 $data_size" --image-opts \
+ "driver=compress,file.driver=$IMGFMT,file.file.driver=file,file.file.filename=$TEST_IMG" \
+ 2>&1 | _filter_qemu_io | _filter_testdir
+
+let offset="4 * $cluster_size"
+QEMU_IO_OPTIONS=$QEMU_IO_OPTIONS_NO_FMT \
+$QEMU_IO -c "write -P 0xcc $offset $data_size" "json:{\
+'driver': 'compress',
+'file': {'driver': '$IMGFMT',
+ 'file': {'driver': 'file',
+  'filename': '$TEST_IMG'}}}" | \
+  _filter_qemu_io | _filter_testdir
+
+sizeB=$($QEMU_IMG info --output=json "$TEST_IMG" |
+sed -n '/"actual-size":/ s/[^0-9]//gp')
+
+if [ $sizeA -le $sizeB ]
+then
+echo "Compression ERROR"
+fi
+
+$QEMU_IMG check --output=json "$TEST_IMG" |
+  sed -n 's/,$//; /"compressed-clusters":/ s/^ *//p'
+
 # success, all done
 echo '*** done'
 rm -f $seq.full
diff --git a/tests/qemu-iotests/214.out b/tests/qemu-iotests/214.out
index 0fcd8dc..4a2ec33 100644
--- a/tests/qemu-iotests/214.out
+++ b/tests/qemu-iotests/214.out
@@ -32,4 +32,18 @@ read 4194304/4194304 bytes at offset 0
 4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 read 4194304/4194304 bytes at offset 4194304
 4 MiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+=== Write compressed data of multiple clusters ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2097152
+Write uncompressed data:
+wrote 524288/524288 bytes at offset 0
+512 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=2097152
+Write compressed data:
+wrote 229376/229376 bytes at offset 0
+224 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+wrote 229376/229376 bytes at offset 262144
+224 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+"compressed-clusters": 8
 *** done
-- 
1.8.3.1




[PATCH v7 0/3] qcow2: advanced compression options

2019-11-13 Thread Andrey Shinkevich
The compression filter driver is introduced as suggested by Max.
A sample usage of the filter can be found in the test #214.
Now, multiple clusters can be written compressed.
It is useful for the backup job.

v7:
  01: The 'zip_' prefix for the compression filter functions replaced with
'compress_' one.
  02: .bdrv_co_preadv/pwritev (without _part) removed from the filter.
  03: .bdrv_refresh_limits amended.
  04: .bdrv_get_info added.
  05: In qapi/block-core.json, @compress: Since 5.0 was set.

  Discussed in the email thread with the message ID
  <1573488277-794975-1-git-send-email-andrey.shinkev...@virtuozzo.com>

Andrey Shinkevich (3):
  block: introduce compress filter driver
  qcow2: Allow writing compressed data of multiple clusters
  tests/qemu-iotests: add case to write compressed data of multiple
clusters

 block/Makefile.objs|   1 +
 block/filter-compress.c| 201 +
 block/qcow2.c  | 102 +--
 qapi/block-core.json   |  10 ++-
 tests/qemu-iotests/214 |  43 ++
 tests/qemu-iotests/214.out |  14 
 6 files changed, 340 insertions(+), 31 deletions(-)
 create mode 100644 block/filter-compress.c

-- 
1.8.3.1




Re: [PATCH v9 Kernel 1/5] vfio: KABI for migration interface for device state

2019-11-13 Thread Alex Williamson
On Wed, 13 Nov 2019 11:24:17 +0100
Cornelia Huck  wrote:

> On Tue, 12 Nov 2019 15:30:05 -0700
> Alex Williamson  wrote:
> 
> > On Tue, 12 Nov 2019 22:33:36 +0530
> > Kirti Wankhede  wrote:
> >   
> > > - Defined MIGRATION region type and sub-type.
> > > - Used 3 bits to define VFIO device states.
> > > Bit 0 => _RUNNING
> > > Bit 1 => _SAVING
> > > Bit 2 => _RESUMING
> > > Combination of these bits defines VFIO device's state during migration
> > > _RUNNING => Normal VFIO device running state. When its reset, it
> > >   indicates _STOPPED state. when device is changed to
> > >   _STOPPED, driver should stop device before write()
> > >   returns.
> > > _SAVING | _RUNNING => vCPUs are running, VFIO device is running but
> > >   start saving state of device i.e. pre-copy state
> > > _SAVING  => vCPUs are stopped, VFIO device should be stopped, and
> > 
> > s/should/must/
> >   
> > > save device state,i.e. stop-n-copy state
> > > _RESUMING => VFIO device resuming state.
> > > _SAVING | _RESUMING and _RUNNING | _RESUMING => Invalid states
> > 
> > A table might be useful here and in the uapi header to indicate valid
> > states:  
> 
> I like that.
> 
> > 
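> > | _RESUMING | _SAVING | _RUNNING | Description
> > +-----------+---------+----------+---------------------------------------------
> > |     0     |    0    |    0     | Stopped, not saving or resuming (a)
> > +-----------+---------+----------+---------------------------------------------
> > |     0     |    0    |    1     | Running, default state
> > +-----------+---------+----------+---------------------------------------------
> > |     0     |    1    |    0     | Stopped, migration interface in save mode
> > +-----------+---------+----------+---------------------------------------------
> > |     0     |    1    |    1     | Running, save mode interface, iterative
> > +-----------+---------+----------+---------------------------------------------
> > |     1     |    0    |    0     | Stopped, migration resume interface active
> > +-----------+---------+----------+---------------------------------------------
> > |     1     |    0    |    1     | Invalid (b)
> > +-----------+---------+----------+---------------------------------------------
> > |     1     |    1    |    0     | Invalid (c)
> > +-----------+---------+----------+---------------------------------------------
> > |     1     |    1    |    1     | Invalid (d)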
> > 
> > I think we need to consider whether we define (a) as generally
> > available, for instance we might want to use it for diagnostics or a
> > fatal error condition outside of migration.
> > 
> > Are there hidden assumptions between state transitions here or are
> > there specific next possible state diagrams that we need to include as
> > well?  
> 
> Some kind of state-change diagram might be useful in addition to the
> textual description anyway. Let me try, just to make sure I understand
> this correctly:
> 
> 1) 0/0/1 ---(trigger driver to start gathering state info)---> 0/1/1
> 2) 0/0/1 ---(tell driver to stop)---> 0/0/0
> 3) 0/1/1 ---(tell driver to stop)---> 0/1/0
> 4) 0/0/1 ---(tell driver to resume with provided info)---> 1/0/0

I think this is to switch into resuming mode, the data will follow

> 5) 1/0/0 ---(driver is ready)---> 0/0/1
> 6) 0/1/1 ---(tell driver to stop saving)---> 0/0/1

I think also:

0/0/1 --> 0/1/0 If user chooses to go directly to stop and copy

0/0/0 and 0/0/1 should be reachable from any state, though I could see
that a vendor driver could fail transition from 1/0/0 -> 0/0/1 if the
received state is incomplete.  Somehow though a user always needs to
return the device to the initial state, so how does device_state
interact with the reset ioctl?  Would this automatically manipulate
device_state back to 0/0/1?
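
To make the table above concrete, here is a minimal sketch of a validity check
for the three state bits. The macro and function names are hypothetical and are
not taken from the proposed uapi header; only the bit positions (bit 0 =>
_RUNNING, bit 1 => _SAVING, bit 2 => _RESUMING) come from the patch description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; bit positions follow the patch description. */
#define DEV_STATE_RUNNING  (1u << 0)
#define DEV_STATE_SAVING   (1u << 1)
#define DEV_STATE_RESUMING (1u << 2)
#define DEV_STATE_MASK \
    (DEV_STATE_RUNNING | DEV_STATE_SAVING | DEV_STATE_RESUMING)

/*
 * Return true for the five states the table lists as valid.
 * _RESUMING may only be set on its own; any combination of
 * _RUNNING and _SAVING (including neither) is accepted.
 */
static bool dev_state_valid(uint32_t state)
{
    if (state & ~DEV_STATE_MASK) {
        return false;               /* unknown bits set */
    }
    if (state & DEV_STATE_RESUMING) {
        return state == DEV_STATE_RESUMING;
    }
    return true;
}
```

This only encodes which bit combinations are valid; it says nothing about which
transitions between them are allowed, which is the open question here.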
 
> Not sure about the usefulness of 2). Also, is 4) the only way to
> trigger resuming? And is the change in 5) performed by the driver, or
> by userspace?
> 
> Are any other state transitions valid?
> 
> (...)
> 
> > > > + * Sequence to be followed for _SAVING|_RUNNING device state or pre-copy phase
> > > > + * and for _SAVING device state or stop-and-copy phase:
> > > > + * a. read pending_bytes. If pending_bytes > 0, go through below steps.
> > > > + * b. read data_offset, indicates kernel driver to write data to staging buffer.
> > > > + *    Kernel driver should return this read operation only after writing data to
> > > > + *    staging buffer is done.
> > 
> > "staging buffer" implies a vendor driver implementation, perhaps we
> > could just state that data is available from (region + data_offset) to
> > (region + data_offset + data_size) upon return of this read operation.
> >   
> > > + * c. read data_size, amount of data in bytes written by vendor driver in
> > > + *migration region.
> > > + * d. read data_size bytes of data from data_offset in the migration 
> > > 

[PATCH] migration: Fix the re-run check of the migrate-incoming command

2019-11-13 Thread Yury Kotov
The current check sets an error but doesn't fail the command.
This may cause a problem if a new connection attempt by the same URI
affects the first connection.
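
The failure mode can be illustrated with a simplified stand-in (this is not the
QEMU code: the function names, the plain string used in place of Error, and the
call counter are all hypothetical). Without the early return, the second call
would report an error and still start a second incoming migration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Counts how often the (hypothetical) start routine actually ran. */
static int start_calls;

static void start_incoming(const char *uri)
{
    (void)uri;
    start_calls++;
}

/* Simplified stand-in for a run-once command handler. */
static void migrate_incoming(const char *uri, const char **errp)
{
    static bool once = true;

    if (!once) {
        *errp = "The incoming migration has already been started";
        return;     /* without this, execution falls through to start_incoming() */
    }

    start_incoming(uri);
    once = false;
}
```

Calling migrate_incoming() twice leaves start_calls at 1 and sets the error on
the second call only.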

Signed-off-by: Yury Kotov 
---
 migration/migration.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/migration.c b/migration/migration.c
index 354ad072fa..fa2005b49f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1784,6 +1784,7 @@ void qmp_migrate_incoming(const char *uri, Error **errp)
 }
 if (!once) {
 error_setg(errp, "The incoming migration has already been started");
+return;
 }
 
 qemu_start_incoming_migration(uri, _err);
-- 
2.24.0




Re: [EXTERNAL]Re: [PATCH v1 5/5] .travis.yml: drop 32 bit systems from MAIN_SOFTMMU_TARGETS

2019-11-13 Thread Aleksandar Markovic
> From: Philippe Mathieu-Daudé 
> > -- 
> > MAIN_SOFTMMU_TARGETS="aarch64-softmmu,arm-softmmu,i386-softmmu,mips-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu"
> > +- 
> > MAIN_SOFTMMU_TARGETS="aarch64-softmmu,mips64-softmmu,ppc64-softmmu,riscv64-softmmu,s390x-softmmu,x86_64-softmmu"
> 
> Aleksandar, since you mostly test 32-bit MIPS, are you OK we keep
> mips-softmmu and drop mips64-softmmu here? Another job (acceptance-test)
> builds the mips64el-softmmu.

Philippe, thanks for bringing this to my attention. Yes, 32-bit mips targets
are important to us, but, what can we do, time constraints are time
constraints, so I agree with Alex's change. Please go ahead, Alex. We can test
32-bit mips targets via other acceptance tests (those that can run longer, the
so-called "slow" group), and perhaps we can extend them to test more 32-bit
mips systems.

Thanks to everybody,
Aleksandar


RE: [PATCH] WHPX: refactor load library

2019-11-13 Thread Sunil Muthuswamy


> Making it easier for other people to test WHPX would be nice.

Yes, we understand the concerns and I generally agree here. I am
trying to connect the different teams involved here (legal, SDK here)
and connect the dots for them, to see what can be done here.

> But in case this is not sorted out soon, I don't see a reason to
> not merge WHPX changes if they are reviewed and approved by the
> main author of that code (Justin).
> 

Justin is ready to review it, but is out for another week. Will have him
review once he is back.




RE: [PATCH] WHPX: refactor load library

2019-11-13 Thread Sunil Muthuswamy



> Can we wait for approval from the Microsoft legal department first?
> So we can start testing WHPX builds, and reduce the possibilities to
> introduce regressions.
> 
> Testing is ready, we are waiting for Microsoft to merge, see:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg646351.html
> 

Yes, we have escalated this to the right set of teams, including the legal
team. We are working through the processes here internally and will
update once we have something. Meanwhile, it would be good to see if
we can get these patches in.

> >
> > Sunil, Justin, would you like to be listed as maintainers or
> > designated reviewers for the WHPX code in QEMU?
> 
> Great idea!
It's a valid and good point. I am discussing this internally here and
will get back.




Re: [RFC v4 PATCH 41/49] multi-process/mig: Enable VMSD save in the Proxy object

2019-11-13 Thread Daniel P . Berrangé
On Wed, Nov 13, 2019 at 11:32:09AM -0500, Jag Raman wrote:
> 
> 
> On 11/13/2019 10:50 AM, Daniel P. Berrangé wrote:
> > On Thu, Oct 24, 2019 at 05:09:22AM -0400, Jagannathan Raman wrote:
> > > Collect the VMSD from remote process on the source and save
> > > it to the channel leading to the destination
> > > 
> > > Signed-off-by: Elena Ufimtseva 
> > > Signed-off-by: John G Johnson 
> > > Signed-off-by: Jagannathan Raman 
> > > ---
> > >   New patch in v4
> > > 
> > >   hw/proxy/qemu-proxy.c | 132 
> > > ++
> > >   include/hw/proxy/qemu-proxy.h |   2 +
> > >   include/io/mpqemu-link.h  |   1 +
> > >   3 files changed, 135 insertions(+)
> > > 
> > > diff --git a/hw/proxy/qemu-proxy.c b/hw/proxy/qemu-proxy.c
> > > index 623a6c5..ce72e6a 100644
> > > --- a/hw/proxy/qemu-proxy.c
> > > +++ b/hw/proxy/qemu-proxy.c
> > > @@ -52,6 +52,14 @@
> > >   #include "util/event_notifier-posix.c"
> > >   #include "hw/boards.h"
> > >   #include "include/qemu/log.h"
> > > +#include "io/channel.h"
> > > +#include "migration/qemu-file-types.h"
> > > +#include "qapi/error.h"
> > > +#include "io/channel-util.h"
> > > +#include "migration/qemu-file-channel.h"
> > > +#include "migration/qemu-file.h"
> > > +#include "migration/migration.h"
> > > +#include "migration/vmstate.h"
> > >   QEMUTimer *hb_timer;
> > >   static void pci_proxy_dev_realize(PCIDevice *dev, Error **errp);
> > > @@ -62,6 +70,9 @@ static void stop_heartbeat_timer(void);
> > >   static void childsig_handler(int sig, siginfo_t *siginfo, void *ctx);
> > >   static void broadcast_msg(MPQemuMsg *msg, bool need_reply);
> > > +#define PAGE_SIZE getpagesize()
> > > +uint8_t *mig_data;
> > > +
> > >   static void childsig_handler(int sig, siginfo_t *siginfo, void *ctx)
> > >   {
> > >   /* TODO: Add proper handler. */
> > > @@ -357,14 +368,135 @@ static void pci_proxy_dev_inst_init(Object *obj)
> > >   dev->mem_init = false;
> > >   }
> > > +typedef struct {
> > > +QEMUFile *rem;
> > > +PCIProxyDev *dev;
> > > +} proxy_mig_data;
> > > +
> > > +static void *proxy_mig_out(void *opaque)
> > > +{
> > > +proxy_mig_data *data = opaque;
> > > +PCIProxyDev *dev = data->dev;
> > > +uint8_t byte;
> > > +uint64_t data_size = PAGE_SIZE;
> > > +
> > > +mig_data = g_malloc(data_size);
> > > +
> > > +while (true) {
> > > +byte = qemu_get_byte(data->rem);
> > 
> > There is a pretty large set of APIs hiding behind the qemu_get_byte
> > call, which does not give me confidence that...
> > 
> > > +mig_data[dev->migsize++] = byte;
> > > +if (dev->migsize == data_size) {
> > > +data_size += PAGE_SIZE;
> > > +mig_data = g_realloc(mig_data, data_size);
> > > +}
> > > +}
> > > +
> > > +return NULL;
> > > +}
> > > +
> > > +static int proxy_pre_save(void *opaque)
> > > +{
> > > +PCIProxyDev *pdev = opaque;
> > > +proxy_mig_data *mig_data;
> > > +QEMUFile *f_remote;
> > > +MPQemuMsg msg = {0};
> > > +QemuThread thread;
> > > +Error *err = NULL;
> > > +QIOChannel *ioc;
> > > +uint64_t size;
> > > +int fd[2];
> > > +
> > > +if (socketpair(AF_UNIX, SOCK_STREAM, 0, fd)) {
> > > +return -1;
> > > +}
> > > +
> > > +ioc = qio_channel_new_fd(fd[0], );
> > > +if (err) {
> > > +error_report_err(err);
> > > +return -1;
> > > +}
> > > +
> > > +qio_channel_set_name(QIO_CHANNEL(ioc), "PCIProxyDevice-mig");
> > > +
> > > +f_remote = qemu_fopen_channel_input(ioc);
> > > +
> > > +pdev->migsize = 0;
> > > +
> > > +mig_data = g_malloc0(sizeof(proxy_mig_data));
> > > +mig_data->rem = f_remote;
> > > +mig_data->dev = pdev;
> > > +
> > > +qemu_thread_create(, "Proxy MIG_OUT", proxy_mig_out, mig_data,
> > > +   QEMU_THREAD_DETACHED);
> > > +
> > > +msg.cmd = START_MIG_OUT;
> > > +msg.bytestream = 0;
> > > +msg.num_fds = 2;
> > > +msg.fds[0] = fd[1];
> > > +msg.fds[1] = GET_REMOTE_WAIT;
> > > +
> > > +mpqemu_msg_send(pdev->mpqemu_link, , pdev->mpqemu_link->com);
> > > +size = wait_for_remote(msg.fds[1]);
> > > +PUT_REMOTE_WAIT(msg.fds[1]);
> > > +
> > > +assert(size != ULLONG_MAX);
> > > +
> > > +/*
> > > + * migsize is being update by a separate thread. Using volatile to
> > > + * instruct the compiler to fetch the value of this variable from
> > > + * memory during every read
> > > + */
> > > +while (*((volatile uint64_t *)>migsize) < size) {
> > > +}
> > > +
> > > +qemu_thread_cancel();
> > 
> > this is a safe way to stop the thread executing without
> > resulting in memory being leaked.
> > 
> > In addition thread cancellation is asynchronous, so the thread
> > may still be using the QEMUFile object while
> > 
> > > +qemu_fclose(f_remote);
> 
> The above "wait_for_remote()" call waits for the remote process to
> finish with Migration, and return the size of 

Re: [PATCH for 5.0 0/6] linux-user: Add support for real time clock ioctls

2019-11-13 Thread Laurent Vivier
Hi Filip,

Le 13/11/2019 à 17:41, Filip Bozuta a écrit :
> Add ioctls for all rtc features that are currently supported in the Linux kernel.
> 
> Filip Bozuta (6):
>   linux-user: Add support for enabling/disabling rtc features using
> ioctls
>   linux-user: Add set and read for rtc time and alarm using ioctls
>   linux-user: Add read and set for rtc periodic interrupt and epoch
> using ioctls
>   linux-user: Add get and set for rtc wakeup alarm using ioctls
>   linux-user: Add get and set for rtc pll correction using ioctls
>   linux-user: Add rtc voltage low detector read and clear using ioctls
> 
>  linux-user/ioctls.h| 23 +++
>  linux-user/syscall.c   |  1 +
>  linux-user/syscall_defs.h  | 36 
>  linux-user/syscall_types.h | 25 +
>  4 files changed, 85 insertions(+)
> 

Could you add to the description of each patch the names of the ioctls it
implements, their purpose (you can copy from the rtc man page) and how you
have tested them?
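
As an illustration only, a test for the enable/disable ioctls from patch 1
could look roughly like the sketch below. The helper name and the device path
passed to it are assumptions; RTC_UIE_ON and RTC_UIE_OFF are the host
definitions from <linux/rtc.h>. Such a program would be built for the target
and run under linux-user.

```c
#include <fcntl.h>
#include <linux/rtc.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: enable and then disable the update interrupt
 * on the RTC device at 'path'. Returns 0 on success, -1 if the device
 * cannot be opened or either ioctl fails.
 */
static int rtc_toggle_update_irq(const char *path)
{
    int fd = open(path, O_RDONLY);
    int ret;

    if (fd < 0) {
        return -1;
    }

    ret = ioctl(fd, RTC_UIE_ON, 0);
    if (ret == 0) {
        ret = ioctl(fd, RTC_UIE_OFF, 0);
    }

    close(fd);
    return ret == 0 ? 0 : -1;
}
```

Some RTC drivers do not support update interrupts, so a failure from this
helper does not by itself point at the emulation layer.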

Thanks,
Laurent



RE: [PATCH] WHPX: refactor load library

2019-11-13 Thread Sunil Muthuswamy


> -Original Message-
> From: Paolo Bonzini 
> Sent: Wednesday, November 13, 2019 7:00 AM
> To: Sunil Muthuswamy ; Richard Henderson 
> ; Eduardo Habkost ;
> Stefan Weil 
> Cc: qemu-devel@nongnu.org; Justin Terry (VM) 
> Subject: Re: [PATCH] WHPX: refactor load library
> 
> On 08/11/19 21:31, Sunil Muthuswamy wrote:
> >
> > +typedef enum WHPFunctionList {
> > +WINHV_PLATFORM_FNS_DEFAULT,
> > +WINHV_EMULATION_FNS_DEFAULT,
> > +} WHPFunctionList;
> >
> 
> What does "default" stand for?  I assume you have more changes to this
> function in the future.
> 
Yes, there are more functions coming, such as for XSAVE. I used "default" to
represent whatever is there currently, for lack of a better term.

> > + * Load the functions from the given library, using the given handle. If a
> > + * handle is provided, it is used, otherwise the library is opened. The
> > + * handle will be updated on return with the opened one.
> > + */
> > +static bool load_whp_dipatch_fns(HMODULE *handle, WHPFunctionList 
> > function_list)
> > +{
> 
> Typo, "dipatch" instead of "dispatch".
> >
> > +if (hLib) {
> > +FreeLibrary(hWinHvEmulation);
> > +}
> 
> The argument to FreeLibrary should be hLib.
> 

Thanks, will fix these in the next version.



Re: [SeaBIOS] Re: [PATCH] ahci: zero-initialize port struct

2019-11-13 Thread Sam Eiderman
Links to latest commits from archive.
You can see all changes in the cover letter.

[SeaBIOS] [PATCH v4 0/5] Add Qemu to SeaBIOS LCHS interface
https://mail.coreboot.org/hyperkitty/list/seab...@seabios.org/message/VLNFBEERTWLEUO6LM5BYLBNVIFCTP46M/
[SeaBIOS] [PATCH v4 1/5] geometry: Read LCHS from fw_cfg
https://mail.coreboot.org/hyperkitty/list/seab...@seabios.org/message/B3IPD3HH4UPDYJWFE4KX3HXUCNW5GPEW/
[SeaBIOS] [PATCH v4 2/5] boot: Reorder functions in boot.c
https://mail.coreboot.org/hyperkitty/list/seab...@seabios.org/message/YDVU3WIGOSKZ2RQSMR5UVQNZ66K4IG65/
[SeaBIOS] [PATCH v4 3/5] boot: Build ata and scsi paths in function
https://mail.coreboot.org/hyperkitty/list/seab...@seabios.org/message/RY33DRZZ3UK3UMQ3Q6BY2KUCHRRW4MRK/
[SeaBIOS] [PATCH v4 4/5] geometry: Add boot_lchs_find_*() utility functions
https://mail.coreboot.org/hyperkitty/list/seab...@seabios.org/message/DAJOULWFK24DX4DY3VS6WWOOQNWW3GSG/
[SeaBIOS] [PATCH v4 5/5] geometry: Apply LCHS values for boot devices
https://mail.coreboot.org/hyperkitty/list/seab...@seabios.org/message/UUCTPPJ4PTS5CUTCFLOH3YOEXGC6HQ4T/

Sam

On Wed, Nov 13, 2019 at 6:35 PM Sam Eiderman  wrote:
>
> Sure,
>
> There are two issues here.
>
> The first issue is that my commits which applied to seabios master:
>
> * 9caa19b - geometry: Apply LCHS values for boot devices
> * cb56f61 - config: Add toggle for bootdevice information
> * ad29109 - geometry: Add boot_lchs_find_*() utility functions
> * b3d2120 - boot: Reorder functions in boot.c
> * 7c66a43 - geometry: Read LCHS from fw_cfg
>
> Are not from the latest version which was submitted to the mailing list (v4)
> * fw_cfg key name has changed
> * The value of the key has changed from binary (v1) to textual (v4)
> * Other fixes and variable name changes.
>
> So these commits need to be reverted and reapplied with the latest version 
> (v4)
>
> The second issue is that my commits, (in v4 too) will require this fix
> that Gerd added ([PATCH] ahci: zero-initialize port struct) since they
> change how SeaBIOS uses lchs.
>
> Previously, before any of my commits, drive.lchs could contain "random
> crap" since it was always set before being used in
> setup_translation().
>
> After my patches, get_translation() invokes overriden_lchs_supplied()
> which checks: "return drive->lchs.cylinder || drive->lchs.head ||
> drive->lchs.sector;"
> So there is an assumption that "drive->lchs" is zeroed when lchs is
> not supplied for the host.
>
> This was true for all devices using "drive->lchs" (all were memset to
> 0) except ahci.
> (I used 'git grep "drive_s * drive"' to find them all).
>
> So Gerd's fix is indeed needed and then all devices are covered
> (drive->lchs is memset to 0).
>
> Now only the first issue remains...
>
> Sam
>
> On Wed, Nov 13, 2019 at 6:12 PM Philippe Mathieu-Daudé
>  wrote:
> >
> > Hi Sam,
> >
> > On 11/13/19 4:03 PM, Sam Eiderman wrote:
> > > Hi,
> > >
> > > Does this fix a bug that actually happened?
> > >
> > > I just noticed that in my lchs patches I assumed that lchs struct is
> > > zeroed out in all devices (not only ahci):
> > >
> > > 9caa19be0e53 (geometry: Apply LCHS values for boot devices)
> > >
> > > Seems like this is not the case but why only ahci is affected?
> > >
> > > The list of devices is at least:
> > >
> > >  * ata
> > >  * ahci
> > >  * scsi
> > >  * esp
> > >  * lsi
> > >  * megasas
> > >  * mpt
> > >  * pvscsi
> > >  * virtio
> > >  * virtio-blk
> > >
> > > As specified in the commit message.
> > >
> > > Also Gerd it seems that my lchs patches were not committed in the
> > > latest submitted version (v4)!!!
> > > The ABI of the fw config key is completely broken.
> >
> > What do you mean? Can you be more specific?
> >



[PATCH for 5.0 1/6] linux-user: Add support for enabling/disabling rtc features using ioctls

2019-11-13 Thread Filip Bozuta
Signed-off-by: Filip Bozuta 
---
 linux-user/ioctls.h   |  9 +
 linux-user/syscall.c  |  1 +
 linux-user/syscall_defs.h | 10 ++
 3 files changed, 20 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index c6b9d6a..97741c7 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -69,6 +69,15 @@
  IOCTL(KDSETLED, 0, TYPE_INT)
  IOCTL_SPECIAL(KDSIGACCEPT, 0, do_ioctl_kdsigaccept, TYPE_INT)
 
+ IOCTL(RTC_AIE_ON, 0, TYPE_NULL)
+ IOCTL(RTC_AIE_OFF, 0, TYPE_NULL)
+ IOCTL(RTC_UIE_ON, 0, TYPE_NULL)
+ IOCTL(RTC_UIE_OFF, 0, TYPE_NULL)
+ IOCTL(RTC_PIE_ON, 0, TYPE_NULL)
+ IOCTL(RTC_PIE_OFF, 0, TYPE_NULL)
+ IOCTL(RTC_WIE_ON, 0, TYPE_NULL)
+ IOCTL(RTC_WIE_OFF, 0, TYPE_NULL)
+
  IOCTL(BLKROSET, IOC_W, MK_PTR(TYPE_INT))
  IOCTL(BLKROGET, IOC_R, MK_PTR(TYPE_INT))
  IOCTL(BLKRRPART, 0, TYPE_NULL)
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ce399a5..74c3c08 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -107,6 +107,7 @@
 #include 
 #include 
 #include 
+#include 
 #include "linux_loop.h"
 #include "uname.h"
 
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 98c2119..f91579a 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -763,6 +763,16 @@ struct target_pollfd {
 #define TARGET_KDSETLED0x4B32  /* set led state [lights, not flags] */
 #define TARGET_KDSIGACCEPT 0x4B4E
 
+/* real time clock ioctls */
+#define TARGET_RTC_AIE_ON   TARGET_IO('p', 0x01)
+#define TARGET_RTC_AIE_OFF  TARGET_IO('p', 0x02)
+#define TARGET_RTC_UIE_ON   TARGET_IO('p', 0x03)
+#define TARGET_RTC_UIE_OFF  TARGET_IO('p', 0x04)
+#define TARGET_RTC_PIE_ON   TARGET_IO('p', 0x05)
+#define TARGET_RTC_PIE_OFF  TARGET_IO('p', 0x06)
+#define TARGET_RTC_WIE_ON   TARGET_IO('p', 0x0f)
+#define TARGET_RTC_WIE_OFF  TARGET_IO('p', 0x10)
+
 #if defined(TARGET_ALPHA) || defined(TARGET_MIPS) || defined(TARGET_SH4) ||
\
defined(TARGET_XTENSA)
 #define TARGET_FIOGETOWN   TARGET_IOR('f', 123, int)
-- 
2.7.4




[PATCH for 5.0 5/6] linux-user: Add get and set for rtc pll correction using ioctls

2019-11-13 Thread Filip Bozuta
Signed-off-by: Filip Bozuta 
---
 linux-user/ioctls.h|  2 ++
 linux-user/syscall_defs.h  | 14 ++
 linux-user/syscall_types.h |  9 +
 3 files changed, 25 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 5830315..eea65e1 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -87,6 +87,8 @@
  IOCTL(RTC_EPOCH_SET, IOC_W, MK_PTR(TYPE_ULONG))
  IOCTL(RTC_WKALM_SET, IOC_W, MK_PTR(MK_STRUCT(STRUCT_rtc_wkalrm)))
  IOCTL(RTC_WKALM_RD, IOC_R, MK_PTR(MK_STRUCT(STRUCT_rtc_wkalrm)))
+ IOCTL(RTC_PLL_GET, IOC_R, MK_PTR(MK_STRUCT(STRUCT_rtc_pll_info)))
+ IOCTL(RTC_PLL_SET, IOC_W, MK_PTR(MK_STRUCT(STRUCT_rtc_pll_info)))
 
  IOCTL(BLKROSET, IOC_W, MK_PTR(TYPE_INT))
  IOCTL(BLKROGET, IOC_R, MK_PTR(TYPE_INT))
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 3a0eb6b..367e9bd 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -763,6 +763,16 @@ struct target_pollfd {
 #define TARGET_KDSETLED0x4B32  /* set led state [lights, not flags] */
 #define TARGET_KDSIGACCEPT 0x4B4E
 
+struct target_rtc_pll_info {
+int pll_ctrl;
+int pll_value;
+int pll_max;
+int pll_min;
+int pll_posmult;
+int pll_negmult;
+abi_long pll_clock;
+};
+
 /* real time clock ioctls */
 #define TARGET_RTC_AIE_ON   TARGET_IO('p', 0x01)
 #define TARGET_RTC_AIE_OFF  TARGET_IO('p', 0x02)
@@ -782,6 +792,10 @@ struct target_pollfd {
 #define TARGET_RTC_EPOCH_SETTARGET_IOW('p', 0x0e, abi_ulong)
 #define TARGET_RTC_WKALM_SETTARGET_IOW('p', 0x0f, struct rtc_wkalrm)
 #define TARGET_RTC_WKALM_RD TARGET_IOR('p', 0x10, struct rtc_wkalrm)
+#define TARGET_RTC_PLL_GET  TARGET_IOR('p', 0x11,  
\
+   struct target_rtc_pll_info)
+#define TARGET_RTC_PLL_SET  TARGET_IOW('p', 0x12,  
\
+   struct target_rtc_pll_info)
 
 #if defined(TARGET_ALPHA) || defined(TARGET_MIPS) || defined(TARGET_SH4) ||
\
defined(TARGET_XTENSA)
diff --git a/linux-user/syscall_types.h b/linux-user/syscall_types.h
index 820bc8e..baf43ee 100644
--- a/linux-user/syscall_types.h
+++ b/linux-user/syscall_types.h
@@ -271,6 +271,15 @@ STRUCT(rtc_wkalrm,
TYPE_CHAR, /* pending */
MK_STRUCT(STRUCT_rtc_time)) /* time */
 
+STRUCT(rtc_pll_info,
+   TYPE_INT, /* pll_ctrl */
+   TYPE_INT, /* pll_value */
+   TYPE_INT, /* pll_max */
+   TYPE_INT, /* pll_min */
+   TYPE_INT, /* pll_posmult */
+   TYPE_INT, /* pll_negmult */
+   TYPE_ULONG) /* pll_clock */
+
 STRUCT(blkpg_ioctl_arg,
TYPE_INT, /* op */
TYPE_INT, /* flags */
-- 
2.7.4




[PATCH for 5.0 0/6] linux-user: Add support for real time clock ioctls

2019-11-13 Thread Filip Bozuta
Add ioctls for all rtc features that are currently supported in the Linux kernel.

Filip Bozuta (6):
  linux-user: Add support for enabling/disabling rtc features using
ioctls
  linux-user: Add set and read for rtc time and alarm using ioctls
  linux-user: Add read and set for rtc periodic interrupt and epoch
using ioctls
  linux-user: Add get and set for rtc wakeup alarm using ioctls
  linux-user: Add get and set for rtc pll correction using ioctls
  linux-user: Add rtc voltage low detector read and clear using ioctls

 linux-user/ioctls.h| 23 +++
 linux-user/syscall.c   |  1 +
 linux-user/syscall_defs.h  | 36 
 linux-user/syscall_types.h | 25 +
 4 files changed, 85 insertions(+)

-- 
2.7.4




Re: [PATCH] WHPX: refactor load library

2019-11-13 Thread Eduardo Habkost
On Wed, Nov 13, 2019 at 05:35:59PM +0100, Philippe Mathieu-Daudé wrote:
> On 11/12/19 8:47 PM, Eduardo Habkost wrote:
> > On Tue, Nov 12, 2019 at 06:42:00PM +, Sunil Muthuswamy wrote:
> > > 
> > > 
> > > > -Original Message-
> > > > From: Sunil Muthuswamy
> > > > Sent: Friday, November 8, 2019 12:32 PM
> > > > To: 'Paolo Bonzini' ; 'Richard Henderson' 
> > > > ; 'Eduardo Habkost' ; 'Stefan
> > > > Weil' 
> > > > Cc: 'qemu-devel@nongnu.org' ; Justin Terry (VM) 
> > > > 
> > > > Subject: [PATCH] WHPX: refactor load library
> > > > 
> > > > This refactors the load library of WHV libraries to make it more
> > > > modular. It makes a helper routine that can be called on demand.
> > > > This allows future expansion of load library/functions to support
> > > > functionality that is dependent on some feature being available.
> > > > 
> > > > Signed-off-by: Sunil Muthuswamy 
> > > > ---
> > > 
> > > Can I possibly get some eyes on this?
> > 
> > I'd be glad to queue the patch if we get a Reviewed-by line from
> > somebody who understands Windows and WHPX.  Maybe Justin?
> 
> Can we wait for approval from the Microsoft legal department first?
> So we can start testing WHPX builds, and reduce the possibilities to
> introduce regressions.
> 
> Testing is ready, we are waiting for Microsoft to merge, see:
> https://www.mail-archive.com/qemu-devel@nongnu.org/msg646351.html

Making it easier for other people to test WHPX would be nice.
But in case this is not sorted out soon, I don't see a reason to
not merge WHPX changes if they are reviewed and approved by the
main author of that code (Justin).

-- 
Eduardo




[PATCH for 5.0 6/6] linux-user: Add rtc voltage low detector read and clear using ioctls

2019-11-13 Thread Filip Bozuta
Signed-off-by: Filip Bozuta 
---
 linux-user/ioctls.h   | 2 ++
 linux-user/syscall_defs.h | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index eea65e1..330e502 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -89,6 +89,8 @@
  IOCTL(RTC_WKALM_RD, IOC_R, MK_PTR(MK_STRUCT(STRUCT_rtc_wkalrm)))
  IOCTL(RTC_PLL_GET, IOC_R, MK_PTR(MK_STRUCT(STRUCT_rtc_pll_info)))
  IOCTL(RTC_PLL_SET, IOC_W, MK_PTR(MK_STRUCT(STRUCT_rtc_pll_info)))
+ IOCTL(RTC_VL_READ, IOC_R, TYPE_INT)
+ IOCTL(RTC_VL_CLR, 0, TYPE_NULL)
 
  IOCTL(BLKROSET, IOC_W, MK_PTR(TYPE_INT))
  IOCTL(BLKROGET, IOC_R, MK_PTR(TYPE_INT))
diff --git a/linux-user/syscall_defs.h b/linux-user/syscall_defs.h
index 367e9bd..6ad827b 100644
--- a/linux-user/syscall_defs.h
+++ b/linux-user/syscall_defs.h
@@ -796,6 +796,8 @@ struct target_rtc_pll_info {
struct target_rtc_pll_info)
 #define TARGET_RTC_PLL_SET  TARGET_IOW('p', 0x12,  
\
struct target_rtc_pll_info)
+#define TARGET_RTC_VL_READ  TARGET_IOR('p', 0x13, int)
+#define TARGET_RTC_VL_CLR   TARGET_IO('p', 0x14)
 
 #if defined(TARGET_ALPHA) || defined(TARGET_MIPS) || defined(TARGET_SH4) ||
\
defined(TARGET_XTENSA)
-- 
2.7.4



