Re: [Qemu-devel] [PATCH 5/6] migration: send postcopy downtime back to source

2017-04-24 Thread Alexey
On Mon, Apr 24, 2017 at 06:26:31PM +0100, Dr. David Alan Gilbert wrote:
> * Alexey Perevalov (a.pereva...@samsung.com) wrote:
> > Right now, to initiate postcopy live migration one needs to
> > send a request to the source machine and specify the destination.
> > 
> > The user can request migration status with the query-migrate QMP command
> > on the source machine, but postcopy downtime is evaluated on the
> > destination, so it should be transmitted back to the source. The return
> > path socket was chosen for this purpose.
> > 
> > Signed-off-by: Alexey Perevalov 
> 
> That will break a migration from an older QEMU to a newer QEMU with this
> feature, since the old QEMU won't know the message type and will fail with a
>   'Received invalid message'
> 
> near the start of source_return_path_thread.
> 
> The simpler solution is to let the stat be read on the destination side
> and not bother sending it backwards over the wire.
Yes, the simplest solution was just to trace_ it. And in this patch set,
I'll keep it.

It looks like, yes, the current code can't just skip an unknown header_type.
Mmm, it is a binary protocol and the receiver has to know the *length*, but
the length is not transmitted with header_type; it's hard-coded per header
type. So MIG_RP_MSG isn't extensible.
BTW, are you going to replace that protocol in the future?
I think it's even possible to keep the MIG_RP_MSG protocol as is, and just
send an RP_METADATA message (header_type and field length) before opening
the RP, as a first approximation. But, again, an old QEMU will not know about
RP_METADATA and will fail. Or it could be JSON based; I have come across
JSON-based encapsulation for devices.


As a complete alternative, I could suggest sending a request every time the
user issues query-migrate on the src, but in this case MigrationIncomingState
would have to live forever.
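
To make the length problem concrete, here is a minimal sketch in C of a
self-describing framing in which every message carries its own length, so a
receiver can step over types it does not recognise. This is not the
MIG_RP_MSG wire format: the 16-bit type / 16-bit length layout, the
DEMO_MSG_* values and the rp_frame_* helpers are all assumptions made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical message types; the reader below only "knows" the first two. */
enum { DEMO_MSG_PONG = 1, DEMO_MSG_DOWNTIME = 2, DEMO_MSG_FUTURE = 99 };

/* Append one frame: big-endian 16-bit type, big-endian 16-bit length, payload.
 * Returns the number of bytes written, or 0 if the buffer is too small. */
static size_t rp_frame_write(uint8_t *out, size_t cap, uint16_t type,
                             const void *payload, uint16_t len)
{
    if (cap < 4u + len) {
        return 0;
    }
    out[0] = (uint8_t)(type >> 8);
    out[1] = (uint8_t)(type & 0xff);
    out[2] = (uint8_t)(len >> 8);
    out[3] = (uint8_t)(len & 0xff);
    if (len) {
        memcpy(out + 4, payload, len);
    }
    return 4u + len;
}

/* Parse the frame at the start of 'in'. Returns the total frame size,
 * or 0 if the input is truncated. */
static size_t rp_frame_next(const uint8_t *in, size_t avail, uint16_t *type,
                            const uint8_t **payload, uint16_t *len)
{
    if (avail < 4) {
        return 0;
    }
    *type = (uint16_t)((in[0] << 8) | in[1]);
    *len = (uint16_t)((in[2] << 8) | in[3]);
    if (avail - 4 < *len) {
        return 0;
    }
    *payload = in + 4;
    return 4u + *len;
}

/* Walk a stream: count frames of known types and skip unknown ones.
 * Returns the number of known frames, or -1 on a truncated stream. */
static int rp_count_known(const uint8_t *in, size_t avail, int *skipped)
{
    int known = 0;

    *skipped = 0;
    while (avail > 0) {
        uint16_t type, len;
        const uint8_t *payload;
        size_t n = rp_frame_next(in, avail, &type, &payload, &len);

        if (n == 0) {
            return -1;
        }
        if (type == DEMO_MSG_PONG || type == DEMO_MSG_DOWNTIME) {
            known++;
        } else {
            (*skipped)++;   /* the length field tells us how far to jump */
        }
        in += n;
        avail -= n;
    }
    return known;
}
```

Note that a scheme like this only helps receivers that already implement it;
as said above, a QEMU that was released before the change would still reject
the new framing.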

> 
> Dave
> 
> > ---
> >  include/migration/migration.h |  4 +++-
> >  migration/migration.c | 20 ++--
> >  migration/postcopy-ram.c  |  1 +
> >  3 files changed, 22 insertions(+), 3 deletions(-)
> > 
> > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > index 5d2c628..5535aa6 100644
> > --- a/include/migration/migration.h
> > +++ b/include/migration/migration.h
> > @@ -55,7 +55,8 @@ enum mig_rp_message_type {
> >  
> >  MIG_RP_MSG_REQ_PAGES_ID, /* data (start: be64, len: be32, id: string) 
> > */
> >  MIG_RP_MSG_REQ_PAGES,/* data (start: be64, len: be32) */
> > -
> > +MIG_RP_MSG_DOWNTIME,/* downtime value from destination,
> > +   calculated and sent in case of post copy */
> >  MIG_RP_MSG_MAX
> >  };
> >  
> > @@ -364,6 +365,7 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
> >uint32_t value);
> >  void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* 
> > rbname,
> >ram_addr_t start, size_t len);
> > +void migrate_send_rp_downtime(MigrationIncomingState *mis, uint64_t 
> > downtime);
> >  
> >  void ram_control_before_iterate(QEMUFile *f, uint64_t flags);
> >  void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
> > diff --git a/migration/migration.c b/migration/migration.c
> > index 5bac434..3134e24 100644
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -553,6 +553,19 @@ void migrate_send_rp_message(MigrationIncomingState 
> > *mis,
> >  }
> >  
> >  /*
> > + * Send postcopy migration downtime,
> > + * at the moment of calling this function migration should
> > + * be completed.
> > + */
> > +void migrate_send_rp_downtime(MigrationIncomingState *mis, uint64_t 
> > downtime)
> > +{
> > +uint64_t buf;
> > +
> > +buf = cpu_to_be64(downtime);
> > +migrate_send_rp_message(mis, MIG_RP_MSG_DOWNTIME, sizeof(downtime), &buf);
> > +}
> > +
> > +/*
> >   * Send a 'SHUT' message on the return channel with the given value
> >   * to indicate that we've finished with the RP.  Non-0 value indicates
> >   * error.
> > @@ -1483,6 +1496,7 @@ static struct rp_cmd_args {
> >  [MIG_RP_MSG_PONG]   = { .len =  4, .name = "PONG" },
> >  [MIG_RP_MSG_REQ_PAGES]  = { .len = 12, .name = "REQ_PAGES" },
> >  [MIG_RP_MSG_REQ_PAGES_ID]   = { .len = -1, .name = "REQ_PAGES_ID" },
> > +[MIG_RP_MSG_DOWNTIME]   = { .len =  8, .name = "DOWNTIME" },
> >  [MIG_RP_MSG_MAX]= { .len = -1, .name = "MAX" },
> >  };
> >  
> > @@ -1613,6 +1627,10 @@ static void *source_return_path_thread(void *opaque)
> >  migrate_handle_rp_req_pages(ms, (char *)&buf[13], start, len);
> >  break;
> >  
> > +case MIG_RP_MSG_DOWNTIME:
> > +ms->downtime = ldq_be_p(buf);
> > +break;
> > +
> >  default:
> >  break;
> >  }
> > @@ -1677,7 +1695,6 @@ static int postcopy_start(MigrationState *ms, bool 
> > *old_vm_running)
> >  int ret;
> >  QIOChannelBuffer *bioc;
> >  QEMUFile *fb;
> > -int64_t time_at_stop = 

Re: [Qemu-devel] DMG chunk size independence

2017-04-24 Thread Ashijeet Acharya
>> For testing I am first converting the images to raw format and then
>> comparing the resulting image with the one converted using v2.9.0 DMG
>> driver and after battling for 2 days with my code, it finally prints
>> "Images are identical." According to John, that should be pretty
>> conclusive and I completely agree.
>>
>
> Yes, comparing a sample.dmg against a raw file generated from the 2.9.0
> qemu-img tool should be reasonably good evidence that you have not
> altered the behavior of the tool.
>
>> Now, the real thing I wanted to ask was, if someone is aware of a DMG
>> file which has a chunk size above 64 MiB so that I can test those too.
>> If yes, please share the download link with me.
>> Currently I am testing the ones posted by Peter Wu while submitting
>> his DMG work in 2014.
>> Here -> 
>> https://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg03606.html
>>
>
> Are any of those over 64MB? I assume you're implying that they aren't.

No, they are not: none of them crashes when converted with the qemu-img
tool from 2.9.0 (which has a 64 MiB limit).

>
> Maybe Peter knows?...

Yes, I contacted him and he has been of great help so far :-)

Ashijeet



[Qemu-devel] [PATCH v3 3/3] tests: Add a tester for HMP commands

2017-04-24 Thread Thomas Huth
HMP commands do not get any automatic testing yet, so on certain
QEMU machines, some HMP commands were causing crashes in the past.
Thus we should test HMP commands in our test suite, too, to avoid
that such problems creep in again in the future.

Signed-off-by: Thomas Huth 
---
 v3:
 - Fixed the stupid "!strcmp() == 0" problem
 - Removed "isapc" from the blacklist since the "info lapic" problem
   has already been fixed (see commit c7f15bc93661a36fe)
 - Fixed the g_strdup_printf("hmp/%s", mname) memory leak

 tests/Makefile.include |   2 +
 tests/test-hmp.c   | 161 +
 2 files changed, 163 insertions(+)
 create mode 100644 tests/test-hmp.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 579ec07..31931c0 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -331,6 +331,7 @@ check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)
 check-qtest-s390x-y = tests/boot-serial-test$(EXESUF)
 
 check-qtest-generic-y += tests/qom-test$(EXESUF)
+check-qtest-generic-y += tests/test-hmp$(EXESUF)
 
 qapi-schema += alternate-any.json
 qapi-schema += alternate-array.json
@@ -720,6 +721,7 @@ tests/tpci200-test$(EXESUF): tests/tpci200-test.o
 tests/display-vga-test$(EXESUF): tests/display-vga-test.o
 tests/ipoctal232-test$(EXESUF): tests/ipoctal232-test.o
 tests/qom-test$(EXESUF): tests/qom-test.o
+tests/test-hmp$(EXESUF): tests/test-hmp.o
 tests/drive_del-test$(EXESUF): tests/drive_del-test.o $(libqos-pc-obj-y)
 tests/qdev-monitor-test$(EXESUF): tests/qdev-monitor-test.o $(libqos-pc-obj-y)
 tests/nvme-test$(EXESUF): tests/nvme-test.o
diff --git a/tests/test-hmp.c b/tests/test-hmp.c
new file mode 100644
index 000..99e35ec
--- /dev/null
+++ b/tests/test-hmp.c
@@ -0,0 +1,161 @@
+/*
+ * Test HMP commands.
+ *
+ * Copyright (c) 2017 Red Hat Inc.
+ *
+ * Author:
+ *Thomas Huth 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2
+ * or later. See the COPYING file in the top-level directory.
+ *
+ * This test calls some HMP commands for all machines that the current
+ * QEMU binary provides, to check whether they terminate successfully
+ * (i.e. do not crash QEMU).
+ */
+
+#include "qemu/osdep.h"
+#include "libqtest.h"
+
+static int verbose;
+
+static const char *hmp_cmds[] = {
+"boot_set ndc",
+"chardev-add null,id=testchardev1",
+"chardev-remove testchardev1",
+"commit all",
+"cpu-add 1",
+"cpu 0",
+"device_add ?",
+"device_add usb-mouse,id=mouse1",
+"mouse_button 7",
+"mouse_move 10 10",
+"mouse_button 0",
+"device_del mouse1",
+"dump-guest-memory /dev/null 0 4096",
+"gdbserver",
+"host_net_add user id=net0",
+"hostfwd_add tcp::43210-:43210",
+"hostfwd_remove tcp::43210-:43210",
+"host_net_remove 0 net0",
+"i /w 0",
+"log all",
+"log none",
+"memsave 0 4096 \"/dev/null\"",
+"migrate_set_cache_size 1",
+"migrate_set_downtime 1",
+"migrate_set_speed 1",
+"netdev_add user,id=net1",
+"set_link net1 off",
+"set_link net1 on",
+"netdev_del net1",
+"nmi",
+"o /w 0 0x1234",
+"object_add memory-backend-ram,id=mem1,size=256M",
+"object_del mem1",
+"pmemsave 0 4096 \"/dev/null\"",
+"p $pc + 8",
+"qom-list /",
+"qom-set /machine initrd test",
+"screendump /dev/null",
+"sendkey x",
+"singlestep on",
+"wavcapture /dev/null",
+"stopcapture 0",
+"sum 0 512",
+"x /8i 0x100",
+"xp /16x 0",
+NULL
+};
+
+/* Run through the list of pre-defined commands */
+static void test_commands(void)
+{
+char *response;
+int i;
+
+for (i = 0; hmp_cmds[i] != NULL; i++) {
+if (verbose) {
+fprintf(stderr, "\t%s\n", hmp_cmds[i]);
+}
+response = hmp(hmp_cmds[i]);
+g_free(response);
+}
+
+}
+
+/* Run through all info commands and call them blindly (without arguments) */
+static void test_info_commands(void)
+{
+char *resp, *info, *info_buf, *endp;
+
+info_buf = info = hmp("help info");
+
+while (*info) {
+/* Extract the info command, ignore parameters and description */
+g_assert(strncmp(info, "info ", 5) == 0);
+endp = strchr(&info[5], ' ');
+g_assert(endp != NULL);
+*endp = '\0';
+/* Now run the info command */
+if (verbose) {
+fprintf(stderr, "\t%s\n", info);
+}
+resp = hmp(info);
+g_free(resp);
+/* And move forward to the next line */
+info = strchr(endp + 1, '\n');
+if (!info) {
+break;
+}
+info += 1;
+}
+
+g_free(info_buf);
+}
+
+static void test_machine(gconstpointer data)
+{
+const char *machine = data;
+char *args;
+
+args = g_strdup_printf("-S -M %s", machine);
+qtest_start(args);
+
+test_info_commands();
+test_commands();
+
+qtest_end();
+g_free(args);
+g_free((void *)data);
+}
+

Re: [Qemu-devel] [PATCH v5 07/13] vfio/ccw: vfio based subchannel passthrough driver

2017-04-24 Thread Dong Jia Shi
* Alex Williamson  [2017-04-24 16:56:28 -0600]:

[...]
> > > diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> > > new file mode 100644
> > > index 000..c491bee
> > > --- /dev/null
> > > +++ b/hw/vfio/ccw.c
> > > @@ -0,0 +1,207 @@
> > > +/*
> > > + * vfio based subchannel assignment support
> > > + *
> > > + * Copyright 2017 IBM Corp.
> > > + * Author(s): Dong Jia Shi 
> > > + *Xiao Feng Ren 
> > > + *Pierre Morel 
> > > + *
> > > + * This work is licensed under the terms of the GNU GPL, version 2 or(at
> > > + * your option) any version. See the COPYING file in the top-level
> > > + * directory.
> > > + */
> > > +
> > > +#include 
> > > +#include 
> > > +
> > > +#include "qemu/osdep.h"
> > > +#include "qapi/error.h"
> > > +#include "hw/sysbus.h"
> > > +#include "hw/vfio/vfio.h"
> > > +#include "hw/vfio/vfio-common.h"
> > > +#include "hw/s390x/s390-ccw.h"
> > > +#include "hw/s390x/ccw-device.h"
> > > +
> > > +#define TYPE_VFIO_CCW "vfio-ccw"
> > > +typedef struct VFIOCCWDevice {
> > > +S390CCWDevice cdev;
> > > +VFIODevice vdev;
> > > +} VFIOCCWDevice;
> > > +
> > > +static void vfio_ccw_compute_needs_reset(VFIODevice *vdev)
> > > +{
> > > +vdev->needs_reset = false;
> > > +}
> > > +
> > > +/*
> > > + * We don't need vfio_hot_reset_multi and vfio_eoi operationis for
> 
> One more:
> 
> s/operationis/operations/
> 
Ok.

> > > + * vfio_ccw device now.
> > > + */
> > > +struct VFIODeviceOps vfio_ccw_ops = {
> > > +.vfio_compute_needs_reset = vfio_ccw_compute_needs_reset,
> > > +};
> > > +
> > > +static void vfio_ccw_reset(DeviceState *dev)
> > > +{
> > > +CcwDevice *ccw_dev = DO_UPCAST(CcwDevice, parent_obj, dev);
> > > +S390CCWDevice *cdev = DO_UPCAST(S390CCWDevice, parent_obj, ccw_dev);
> > > +VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev);
> > > +
> > > +ioctl(vcdev->vdev.fd, VFIO_DEVICE_RESET);
> > > +}
> > > +
> > > +static void vfio_put_device(VFIOCCWDevice *vcdev)
> > > +{
> > > +g_free(vcdev->vdev.name);
> > > +vfio_put_base_device(&vcdev->vdev);
> > > +}
> > > +
> > > +static VFIOGroup *vfio_ccw_get_group(S390CCWDevice *cdev, char **path,
> > > + Error **errp)
> > > +{
> > > +struct stat st;
> > > +int groupid;
> > > +GError *gerror = NULL;
> > > +
> > > +/* Check that host subchannel exists. */
> > > +path[0] = g_strdup_printf("/sys/bus/css/devices/%x.%x.%04x",
> > > +  cdev->hostid.cssid,
> > > +  cdev->hostid.ssid,
> > > +  cdev->hostid.devid);
> > > +if (stat(path[0], &st) < 0) {
> > > +error_setg(errp, "vfio: no such host subchannel %s", path[0]);
> > > +return NULL;
> > > +}
> > > +
> > > +/* Check that mediated device exists. */
> > > +path[1] = g_strdup_printf("%s/%s", path[0], cdev->mdevid);
> > > +if (stat(path[0], &st) < 0) {
> > > +error_setg(errp, "vfio: no such mediated device %s", path[1]);
> > > +return NULL;
> > > +}  
> > 
> > Isn't this all a bit circular since we build the S390CCWDevice based on
> > the sysfsdev mdev path?
> > 
Right! We don't need to verify the existence of the path here again,
since we already did that during the realization of the S390CCWDevice,
which is triggered by calling cdc->realize before vfio_ccw_get_group in
vfio_ccw_realize.

> > > +
> > > +/* Get the iommu_group patch as the interim variable. */
> > > +path[2] = g_strconcat(path[1], "/iommu_group", NULL);
> > > +
> > > +/* Get the link file path of the device iommu_group. */
> > > +path[3] = g_file_read_link(path[2], &gerror);
> > > +if (!path[3]) {
> > > +error_setg(errp, "vfio: error no iommu_group for subchannel");
> > > +return NULL;
> > > +}
> > > +
> > > +/* Get the device groupid. */
> > > +if (sscanf(basename(path[3]), "%d", &groupid) != 1) {
> > > +error_setg(errp, "vfio: error reading %s:%m", path[3]);
> > > +return NULL;
> > > +}
> > > +
> > > +return vfio_get_group(groupid, &address_space_memory, errp);
> > > +}
> > > +
> > > +static void vfio_ccw_put_group(VFIOGroup *group, char **path)
> > > +{
> > > +g_free(path);
> > > +vfio_put_group(group);
> > > +}
> > > +
> > > +static void vfio_ccw_realize(DeviceState *dev, Error **errp)
> > > +{
> > > +VFIODevice *vbasedev;
> > > +VFIOGroup *group;
> > > +CcwDevice *ccw_dev = DO_UPCAST(CcwDevice, parent_obj, dev);
> > > +S390CCWDevice *cdev = DO_UPCAST(S390CCWDevice, parent_obj, ccw_dev);
> > > +VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev);
> > > +S390CCWDeviceClass *cdc = S390_CCW_DEVICE_GET_CLASS(cdev);
> > > +char *path[4] = {NULL, NULL, NULL, NULL};  
> > 
> > I don't understand what's happening with 'path' throughout this
> > function.  vfio_ccw_get_group() allocates strings 

Re: [Qemu-devel] [PATCH] pci: deassert intx when pci device unrealize

2017-04-24 Thread Marcel Apfelbaum

On 04/25/2017 05:29 AM, Herongguang (Stephen) wrote:

If a pci device is not reset by VM (by writing into config space)
and unplugged by VM, after that when VM reboots, qemu may assert:
pcibus_reset: Assertion `bus->irq_count[i] == 0' failed

Signed-off-by: herongguang 
---
 hw/pci/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 259483b..98ccc27 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1083,6 +1083,7 @@ static void pci_qdev_unrealize(DeviceState *dev, Error 
**errp)
 pc->exit(pci_dev);
 }

+pci_device_deassert_intx(pci_dev);
 do_pci_unregister_device(pci_dev);
 }

--
1.7.12.4




Reviewed-by: Marcel Apfelbaum 

Thanks,
Marcel
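
The failure mode described in the commit message can be pictured with a toy
model in C: a bus keeps a per-pin count of devices asserting INTx, and a
device that disappears while its line is still asserted leaves that count
non-zero, so a later "all counts are zero" check at reset fails. This is not
QEMU's implementation; ToyBus, ToyDev and the toy_* helpers are hypothetical
names used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NUM_PINS 4

/* Toy bus: counts how many devices currently assert each INTx pin. */
typedef struct ToyBus {
    int irq_count[TOY_NUM_PINS];
} ToyBus;

typedef struct ToyDev {
    ToyBus *bus;
    int pin;        /* which INTx pin this device drives */
    bool asserted;  /* level currently driven by this device */
} ToyDev;

/* Change the device's interrupt level and keep the bus count in sync. */
static void toy_set_irq(ToyDev *d, bool level)
{
    if (d->asserted == level) {
        return;
    }
    d->bus->irq_count[d->pin] += level ? 1 : -1;
    d->asserted = level;
}

/* Unplug without cleanup: the bus count keeps the stale contribution. */
static void toy_unplug_leaky(ToyDev *d)
{
    d->bus = NULL;
}

/* Unplug with a deassert first, the ordering the patch above introduces. */
static void toy_unplug_clean(ToyDev *d)
{
    toy_set_irq(d, false);
    d->bus = NULL;
}

/* Returns true when the reset-time invariant (all counts zero) holds. */
static bool toy_bus_reset_ok(const ToyBus *bus)
{
    for (int i = 0; i < TOY_NUM_PINS; i++) {
        if (bus->irq_count[i] != 0) {
            return false;
        }
    }
    return true;
}
```

In this model, toy_unplug_leaky() followed by toy_bus_reset_ok() reports a
violated invariant, while toy_unplug_clean() does not.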



Re: [Qemu-devel] [PATCH RFC 1/1] vfio/pci: Fix incorrect error message

2017-04-24 Thread Dong Jia Shi
* Dong Jia Shi  [2017-04-25 06:52:01 +0200]:

Hey Alex,

Please ignore the "RFC" tag in the subject. Sorry for the mistake.

> When the "No host device provided" error occurs, the hint message
> that starts with "Use -vfio-pci," makes no sense, since "-vfio-pci"
> is not a valid command line parameter.
> 
> Correct this by replacing "-vfio-pci" with "-device vfio-pci".
> 
> Signed-off-by: Dong Jia Shi 
> ---
>  hw/vfio/pci.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
> index 03a3d01..32aca77 100644
> --- a/hw/vfio/pci.c
> +++ b/hw/vfio/pci.c
> @@ -2625,8 +2625,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>  if (!(~vdev->host.domain || ~vdev->host.bus ||
>~vdev->host.slot || ~vdev->host.function)) {
>  error_setg(errp, "No provided host device");
> -error_append_hint(errp, "Use -vfio-pci,host=:BB:DD.F "
> -  "or -vfio-pci,sysfsdev=PATH_TO_DEVICE\n");
> +error_append_hint(errp, "Use -device vfio-pci,host=:BB:DD.F "
> +  "or -device vfio-pci,sysfsdev=PATH_TO_DEVICE\n");
>  return;
>  }
>  vdev->vbasedev.sysfsdev =
> -- 
> 2.10.2
> 

-- 
Dong Jia Shi




[Qemu-devel] [PATCH RFC 1/1] vfio/pci: Fix incorrect error message

2017-04-24 Thread Dong Jia Shi
When the "No host device provided" error occurs, the hint message
that starts with "Use -vfio-pci," makes no sense, since "-vfio-pci"
is not a valid command line parameter.

Correct this by replacing "-vfio-pci" with "-device vfio-pci".

Signed-off-by: Dong Jia Shi 
---
 hw/vfio/pci.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
index 03a3d01..32aca77 100644
--- a/hw/vfio/pci.c
+++ b/hw/vfio/pci.c
@@ -2625,8 +2625,8 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
 if (!(~vdev->host.domain || ~vdev->host.bus ||
   ~vdev->host.slot || ~vdev->host.function)) {
 error_setg(errp, "No provided host device");
-error_append_hint(errp, "Use -vfio-pci,host=:BB:DD.F "
-  "or -vfio-pci,sysfsdev=PATH_TO_DEVICE\n");
+error_append_hint(errp, "Use -device vfio-pci,host=:BB:DD.F "
+  "or -device vfio-pci,sysfsdev=PATH_TO_DEVICE\n");
 return;
 }
 vdev->vbasedev.sysfsdev =
-- 
2.10.2




Re: [Qemu-devel] [PULL 0/8] Net patches

2017-04-24 Thread Jason Wang



On 2017-04-25 00:02, Cédric Le Goater wrote:

On 04/24/2017 03:49 PM, Peter Maydell wrote:

On 24 April 2017 at 06:15, Jason Wang  wrote:

The following changes since commit 32c7e0ab755745e961f1772e95cac381cc68769d:

   Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' 
into staging (2017-04-21 15:59:27 +0100)

are available in the git repository at:

   https://github.com/jasowang/qemu.git tags/net-pull-request

for you to fetch changes up to 049f6d8237dd0b14dee02e4c22b20114c43cecff:

   COLO-compare: Optimize tcp compare trace event (2017-04-24 11:30:36 +0800)





Hi. Clang picks up what looks like a typo:

/Users/pm215/src/qemu-for-merges/hw/net/ftgmac100.c:809:33: error: use
of logical '&&' with constant operand
[-Werror,-Wconstant-logical-operand]
 if (size < 64 && !(s->maccr && FTGMAC100_MACCR_RX_RUNT)) {
 ^  ~~~
/Users/pm215/src/qemu-for-merges/hw/net/ftgmac100.c:809:33: note: use
'&' for a bitwise operation
 if (size < 64 && !(s->maccr && FTGMAC100_MACCR_RX_RUNT)) {
 ^~
 &
/Users/pm215/src/qemu-for-merges/hw/net/ftgmac100.c:809:33: note:
remove constant to silence this warning
 if (size < 64 && !(s->maccr && FTGMAC100_MACCR_RX_RUNT)) {
~^~

Jason,

How do you want to handle that ? A resend of the patch or a fix ?

Thanks,

C.



The fix looks trivial, let me fix it.

Thanks
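
For readers wondering why clang flags this, a small C sketch of the
difference between the two operators follows. The helper names and the flag
value used in the usage note are hypothetical; the real value of
FTGMAC100_MACCR_RX_RUNT is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Logical AND: true whenever both operands are non-zero, regardless of
 * which bits are set. This is what the flagged line computes. */
static bool flag_test_logical(uint32_t reg, uint32_t flag)
{
    return reg && flag;
}

/* Bitwise AND: true only if the specific flag bit is set in the register.
 * This is what a register flag check normally wants. */
static bool flag_test_bitwise(uint32_t reg, uint32_t flag)
{
    return (reg & flag) != 0;
}
```

With a made-up flag of (1u << 12) and a register value of 0x1, the logical
form reports the flag as set while the bitwise form does not, which is why
the runt-packet check would misbehave for any non-zero register value.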



Re: [Qemu-devel] [PATCH V2 0/6] Add COLO-proxy virtio-net support

2017-04-24 Thread Jason Wang



On 2017-04-24 20:02, Zhang Chen wrote:



On 04/24/2017 11:48 AM, Jason Wang wrote:



On 2017-04-20 14:39, Zhang Chen wrote:

If the user uses -device virtio-net-pci, the virtio-net driver adds a header
to the raw net packet that colo-proxy can't handle. COLO-proxy only cares
about the packet payload, so we skip the virtio-net header when comparing
the packets sent by the primary guest with those sent by the secondary guest.

Zhang Chen (6):
   net/filter-mirror.c: Add filter-mirror and filter-redirector vnet
 support.
   net/net.c: Add vnet header length to SocketReadState
   net/colo-compare.c: Make colo-compare support vnet_hdr_len
   net/socket.c: Add vnet packet support in net_socket_receive()
   net/colo.c: Add vnet packet parse feature in colo-proxy
   net/colo-compare.c: Add vnet packet's tcp/udp/icmp compare

  include/net/net.h |  4 +++-
  net/colo-compare.c| 48 
+++-

  net/colo.c|  9 +
  net/colo.h|  4 +++-
  net/filter-mirror.c   | 25 -
  net/filter-rewriter.c |  2 +-
  net/net.c | 24 ++--
  net/socket.c  |  6 ++
  8 files changed, 99 insertions(+), 23 deletions(-)



A quick glance at the series and find two issues:

- We can't assume virtio-net is the only user of the vnet header; you
need to query e.g. NetClientState for the correct vnet header length.


I'm not sure I understand what you mean.
I found that I can't get vnet_hdr_len from NetClientState:

typedef struct NetClientInfo {
.
HasVnetHdr *has_vnet_hdr;
HasVnetHdrLen *has_vnet_hdr_len;
UsingVnetHdr *using_vnet_hdr;
SetOffload *set_offload;
SetVnetHdrLen *set_vnet_hdr_len;

.
}NetClientInfo;

This struct doesn't have a function like get_vnet_hdr_len.
Should I add a get_vnet_hdr_len callback here and write new functions
in tap.c, tap-win32.c and netmap.c?


Thanks
Zhang Chen


Yes, I think you need to add such callbacks.

Thanks
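
As a rough picture of what such a callback could look like, here is a
self-contained C sketch. It is not QEMU code: DemoNetClient,
DemoNetClientInfo and every demo_* function are hypothetical stand-ins, and
treating a missing callback as "header length 0" is an assumption of this
sketch rather than something decided in the thread.

```c
#include <assert.h>
#include <stddef.h>

typedef struct DemoNetClient DemoNetClient;

/* Hypothetical callback table, loosely modelled on the fields quoted above. */
typedef struct DemoNetClientInfo {
    void (*set_vnet_hdr_len)(DemoNetClient *nc, int len);
    int (*get_vnet_hdr_len)(DemoNetClient *nc);   /* the proposed addition */
} DemoNetClientInfo;

struct DemoNetClient {
    const DemoNetClientInfo *info;
    int vnet_hdr_len;   /* would be backend-private state in a real backend */
};

/* A tap-like backend that remembers the configured header length. */
static void demo_tap_set_len(DemoNetClient *nc, int len)
{
    nc->vnet_hdr_len = len;
}

static int demo_tap_get_len(DemoNetClient *nc)
{
    return nc->vnet_hdr_len;
}

static const DemoNetClientInfo demo_tap_info = {
    .set_vnet_hdr_len = demo_tap_set_len,
    .get_vnet_hdr_len = demo_tap_get_len,
};

/* A backend without vnet header support leaves both callbacks unset. */
static const DemoNetClientInfo demo_plain_info = { NULL, NULL };

/* Generic wrapper: backends without the callback report a zero length. */
static int demo_get_vnet_hdr_len(DemoNetClient *nc)
{
    if (!nc->info->get_vnet_hdr_len) {
        return 0;
    }
    return nc->info->get_vnet_hdr_len(nc);
}
```

A filter could then ask its peer for the header length through the generic
wrapper instead of assuming that the peer is virtio-net.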




- This series breaks qtest:

**
ERROR:tests/e1000e-test.c:296:e1000e_send_verify: assertion failed 
(buffer == "TEST"): ("" == "TEST")

GTester: last random seed: R02S39dd06f7f52013798111df2e4eb602c5
**
ERROR:tests/e1000e-test.c:365:e1000e_receive_verify: assertion failed 
(le32_to_cpu(descr.wb.upper.status_error) & esta_dd == esta_dd): 
(0x == 0x0001)

GTester: last random seed: R02S8c8200b8ec86358cb7addb5c6fe1303c
**
ERROR:tests/e1000e-test.c:296:e1000e_send_verify: assertion failed 
(buffer == "TEST"): ("" == "TEST")

GTester: last random seed: R02S9be86025aa7ded4902bdf644c3964a6e
**
ERROR:tests/libqos/virtio.c:94:qvirtio_wait_queue_isr: assertion 
failed: (g_get_monotonic_time() - start_time <= timeout_us)

GTester: last random seed: R02S30cac33d7a98fa56806ca59b35910ea5
**
ERROR:tests/libqos/virtio.c:94:qvirtio_wait_queue_isr: assertion 
failed: (g_get_monotonic_time() - start_time <= timeout_us)

GTester: last random seed: R02S258359836760a723622abf56cf2e61e7
^C/home/devel/git/qemu/tests/Makefile.include:815: recipe for target 
'check-qtest-x86_64' failed

make: *** [check-qtest-x86_64] Interrupt

Please fix them.

Thanks


.








Re: [Qemu-devel] [PULL 0/4] hmp queue

2017-04-24 Thread Thomas Huth
On 24.04.2017 18:57, Dr. David Alan Gilbert wrote:
> * Peter Maydell (peter.mayd...@linaro.org) wrote:
>> On 24 April 2017 at 16:32, Dr. David Alan Gilbert (git)
>> <dgilb...@redhat.com> wrote:
>>> From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
>>>
>>> The following changes since commit 4c55b1d0bad8a703f0499fe62e3761a0cd288da3:
>>>
>>>   Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-04-24' 
>>> into staging (2017-04-24 14:49:48 +0100)
>>>
>>> are available in the git repository at:
>>>
>>>   git://github.com/dagrh/qemu.git tags/pull-hmp-20170424
>>>
>>> for you to fetch changes up to e4e3992e626c4cc7514b271807c90f587771c646:
>>>
>>>   tests: Add a tester for HMP commands (2017-04-24 15:55:35 +0100)
>>>
>>> 
>>> HMP pull
>>>
>>> 
>>
>>
>> clang doesn't like some code in test-hmp.c:
>>
>> /home/petmay01/linaro/qemu-for-merges/tests/test-hmp.c:138:9: error:
>> logical not is only applied to the left hand side of this comparison
>> [-Werror,-Wlogical-not-parentheses]
>> if (!strcmp("isapc", mname) == 0 ||  !strcmp("puv3", mname)
>> ^   ~~
> 
> 
> 
>> It does look rather odd:
>>
>> +/* Ignore blacklisted machines that have known problems */
>> +if (!strcmp("isapc", mname) == 0 ||  !strcmp("puv3", mname)
>> +|| !strcmp("tricore_testboard", mname)
>> +|| !strcmp("xenfv", mname) == 0 || !strcmp("xenpv", mname)) {
>> +return;
>> +}
>>
>> since it's not using the same kind of expression to test
>> each board name -- is that deliberate, or accidental ?
>>
>> I think this expression means we'll actually skip every machine...
> 
> Yep, you're right, just tried it with logging.

Ouch, not sure how that happened ... looks like I used
"strcmp("isapc", mname) == 0" in the first version of my patch, and then
wanted to switch to "!strcmp()" when I added the xenfv and xenpv
machines, but forgot to remove the "== 0" everywhere :-( Big sorry for
that mess!

> That's accidental; hmm I should add a clang build somewhere.
> 
> Thomas: Do you want to send me a fixed version?

Yes, I'll send a fixed version, where I also correct the memory leak
that you noticed.

 Thomas
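
For the record, the precedence issue can be reproduced in isolation. The C
sketch below contrasts the accidental form with the intended one; the
function names are hypothetical, and the machine names are simply the ones
from the condition quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Accidental form: '!' binds tighter than '==', so this evaluates
 * (!strcmp(a, b)) == 0, which is true exactly when the strings DIFFER. */
static bool buggy_is_match(const char *a, const char *b)
{
    return (!strcmp(a, b)) == 0;
}

/* Intended form: true when the strings are equal. */
static bool is_match(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* Hypothetical helper: one consistent test for every blacklisted machine. */
static bool is_blacklisted(const char *mname)
{
    static const char *const blacklist[] = {
        "isapc", "puv3", "tricore_testboard", "xenfv", "xenpv",
    };

    for (size_t i = 0; i < sizeof(blacklist) / sizeof(blacklist[0]); i++) {
        if (is_match(blacklist[i], mname)) {
            return true;
        }
    }
    return false;
}
```

Since the accidental test is true for every machine that is not "isapc", and
"isapc" itself is caught by neither accidental test, the original condition
ends up skipping all machines except "isapc", which matches what Dave saw
with logging. Keeping the names in a table avoids mixing the two styles.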




Re: [Qemu-devel] [PATCH v5 06/13] s390x/css: device support for s390-ccw passthrough

2017-04-24 Thread Dong Jia Shi
* Alex Williamson  [2017-04-24 20:16:18 -0600]:

> On Tue, 25 Apr 2017 10:10:22 +0800
> Dong Jia Shi  wrote:
> 
> > * Alex Williamson  [2017-04-24 16:52:58 -0600]:
> > 
> > > On Wed, 12 Apr 2017 07:21:08 +0200
> > > Dong Jia Shi  wrote:
> > >   
> > > > In order to support subchannels pass-through, we introduce a s390
> > > > subchannel device called "s390-ccw" to hold the real subchannel info.
> > > > The s390-ccw devices inherit from the abstract CcwDevice which connect
> > > > to the existing virtual-css-bus.
> > > > 
> > > > Signed-off-by: Dong Jia Shi 
> > > > ---
> > > >  hw/s390x/Makefile.objs |   1 +
> > > >  hw/s390x/s390-ccw.c| 134 
> > > > +
> > > >  hw/s390x/s390-ccw.h|  38 ++
> > > >  3 files changed, 173 insertions(+)
> > > >  create mode 100644 hw/s390x/s390-ccw.c
> > > >  create mode 100644 hw/s390x/s390-ccw.h
> > > > 
> > > > diff --git a/hw/s390x/Makefile.objs b/hw/s390x/Makefile.objs
> > > > index 41ac4ec..72a3d37 100644
> > > > --- a/hw/s390x/Makefile.objs
> > > > +++ b/hw/s390x/Makefile.objs
> > > > @@ -13,3 +13,4 @@ obj-y += ccw-device.o
> > > >  obj-y += s390-pci-bus.o s390-pci-inst.o
> > > >  obj-y += s390-skeys.o
> > > >  obj-$(CONFIG_KVM) += s390-skeys-kvm.o
> > > > +obj-y += s390-ccw.o
> > > > diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
> > > > new file mode 100644
> > > > index 000..f3d5ed1
> > > > --- /dev/null
> > > > +++ b/hw/s390x/s390-ccw.c
> > > > @@ -0,0 +1,134 @@
> > > > +/*
> > > > + * s390 CCW Assignment Support
> > > > + *
> > > > + * Copyright 2017 IBM Corp
> > > > + * Author(s): Dong Jia Shi 
> > > > + *Xiao Feng Ren 
> > > > + *Pierre Morel 
> > > > + *
> > > > + * This work is licensed under the terms of the GNU GPL, version 2
> > > > + * or (at your option) any later version. See the COPYING file in the
> > > > + * top-level directory.
> > > > + */
> > > > +#include "qemu/osdep.h"
> > > > +#include "qapi/error.h"
> > > > +#include "hw/sysbus.h"
> > > > +#include "libgen.h"
> > > > +#include "hw/s390x/css.h"
> > > > +#include "hw/s390x/css-bridge.h"
> > > > +#include "s390-ccw.h"
> > > > +
> > > > +static void s390_ccw_get_dev_info(S390CCWDevice *cdev,
> > > > +  char *sysfsdev,
> > > > +  Error **errp)
> > > > +{
> > > > +char dev_path[PATH_MAX], *tmp;
> > > > +unsigned int cssid, ssid, devid;
> > > > +
> > > > +if (!sysfsdev) {
> > > > +error_setg(errp, "No host device provided");
> > > > +error_append_hint(errp, "Use 
> > > > -vfio-ccw,sysfsdev=PATH_TO_DEVICE\n");  
> > > 
> > > nit, the leading '-' here seems strange, either you're describing
> > > '-device vfio-ccw,...', or only the 'vfio-ccw,...' part.  Maybe I
> > > notice this because sometimes I do accidentally type -vfio-pci instead
> > > of -device vfio-pci.  Thanks,
> > > 
> > > Alex
> > >   
> > Ok. I will change it to:
> > error_append_hint(errp,
> >   "Use -device vfio-ccw,sysfsdev=PATH_TO_DEVICE\n");
> > 
> > BTW, I learned this from the pci code.
> > hw/vfio/pci.c:2628:
> > error_append_hint(errp, "Use -vfio-pci,host=:BB:DD.F "
> >   "or -vfio-pci,sysfsdev=PATH_TO_DEVICE\n")
> > Welcome a fix to that too?
> 
> Aha, maybe that's where I learned my bad habit.  Yes, please send a
> patch separate from this series.  Thanks,
> 
> Alex
> 
Will do soon. ;>

> > > > +return;
> > > > +}
> > > > +
> > > > +if (!realpath(sysfsdev, dev_path)) {
> > > > +error_setg(errp, "Host device '%s' not found", sysfsdev);
> > > > +return;
> > > > +}
> > > > +
> > > > +cdev->mdevid = g_strdup(basename(dev_path));
> > > > +
> > > > +tmp = basename(dirname(dev_path));
> > > > +sscanf(tmp, "%2x.%1x.%4x", , , );
> > > > +
> > > > +cdev->hostid.cssid = cssid;
> > > > +cdev->hostid.ssid = ssid;
> > > > +cdev->hostid.devid = devid;
> > > > +cdev->hostid.valid = true;
> > > > +}
> > > > +  
> > [...]
> > 
> 

-- 
Dong Jia Shi




[Qemu-devel] [PATCH 9/9] Add more descriptive comment for mismatch or end of test condition

2017-04-24 Thread G 3
Replace the comment "mismatch, or end of test" with "called for a mismatch,
or for the end of a test". This describes what happens better.

Signed-off-by: John Arbuckle 
---
 risu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/risu.c b/risu.c
index e7cbd57..bd771b1 100644
--- a/risu.c
+++ b/risu.c
@@ -48,7 +48,7 @@ void master_sigill(int sig, siginfo_t *si, void *uc)
  advance_pc(uc);
  return;
   default:
- /* mismatch, or end of test */
+ /* called for a mismatch, or for the end of a test */
  siglongjmp(jmpbuf, 1);
}
 }
--
2.10.2




[Qemu-devel] [PATCH 7/9] Add verbose option

2017-04-24 Thread G 3
Add an option that prints each instruction that is currently being tested.

To use this option, just add "--v" to risu's command-line.

Signed-off-by: John Arbuckle 
---
 risu.c | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/risu.c b/risu.c
index 7e42160..ed5b605 100644
--- a/risu.c
+++ b/risu.c
@@ -37,6 +37,8 @@ sigjmp_buf jmpbuf;
 /* Should we test for FP exception status bits? */
 int test_fp_exc = 0;

+int verbose_mode = 0;
+
 void master_sigill(int sig, siginfo_t *si, void *uc)
 {
switch (recv_and_compare_register_info(master_socket, uc))
@@ -125,7 +127,8 @@ int master(int sock)
 {
if (sigsetjmp(jmpbuf, 1))
{
-  return report_match_status();
+int return_status = report_match_status();
+exit(return_status);
}
master_socket = sock;
set_sigill_handler(&master_sigill);
@@ -177,6 +180,7 @@ int main(int argc, char **argv)
 { "host", required_argument, 0, 'h' },
 { "port", required_argument, 0, 'p' },
 { "test-fp-exc", no_argument, &test_fp_exc, 1 },
+{ "verbose testing", no_argument, 0, 'v'},
 { 0,0,0,0 }
  };
   int optidx = 0;
@@ -209,6 +213,11 @@ int main(int argc, char **argv)
 usage();
 exit(1);
  }
+ case 'v':  /* Prints each instruction being tested */
+ {
+verbose_mode = 1;
+break;
+ }
  default:
 abort();
   }
--
2.10.2




[Qemu-devel] [PATCH 8/9] Add end of test message

2017-04-24 Thread G 3

Print the message "End of test" on the risu host end.

Signed-off-by: John Arbuckle 
---
 risu.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/risu.c b/risu.c
index ed5b605..e7cbd57 100644
--- a/risu.c
+++ b/risu.c
@@ -63,6 +63,7 @@ void apprentice_sigill(int sig, siginfo_t *si, void *uc)
  return;
   case 1:
  /* end of test */
+ printf("End of test\n");
  exit(0);
   default:
  /* mismatch */
--
2.10.2




[Qemu-devel] [PATCH 6/9] Add ppc support to configure

2017-04-24 Thread G 3

Add ppc support to the configure script.

Signed-off-by: John Arbuckle 
---
 configure | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 055e6d6..7881b18 100755
--- a/configure
+++ b/configure
@@ -56,6 +56,8 @@ guess_arch() {
 else
 ARCH="ppc64le"
 fi
+elif check_define __ppc__ ; then
+ARCH="ppc"
 else
 echo "This cpu is not supported by risu. Try -h. " >&2
 exit 1
@@ -92,7 +94,7 @@ Some influential environment variables:
prefixed with the given string.

   ARCH force target architecture instead of trying to detect it.

-   Valid values=[arm|aarch64|ppc64|ppc64le|m68k]
+   Valid values=[arm|aarch64|ppc64|ppc64le|m68k|ppc]

   CC   C compiler command
   CFLAGS   C compiler flags
--
2.10.2




[Qemu-devel] [PATCH 5/9] Add risugen_ppc.pm file

2017-04-24 Thread G 3
Add the risugen_ppc.pm file. It is used to generate the instructions that risu
runs.

Signed-off-by: John Arbuckle 
---
 risugen_ppc.pm | 744 +++++
 1 file changed, 744 insertions(+)
 create mode 100644 risugen_ppc.pm

diff --git a/risugen_ppc.pm b/risugen_ppc.pm
new file mode 100644
index 000..2faae81
--- /dev/null
+++ b/risugen_ppc.pm
@@ -0,0 +1,744 @@
+#!/usr/bin/perl -w
+###
+# File: risugen_ppc.pm
+# Date: 4-14-2017
+# Description: Perl module for generating PowerPC instructions with risugen.
+# Based on Jose Ricardo Ziviani (IBM) - ppc64 implementation
+###
+
+# risugen -- generate a test binary file for use with risu
+# See 'risugen --help' for usage information.
+package risugen_ppc;
+
+use strict;
+use warnings;
+use bigint;
+use risugen_common;
+
+require Exporter;
+
+our @ISA= qw(Exporter);
+our @EXPORT = qw(write_test_code);
+
+# Has all debug messages printed if set to 1
+my $debug = 0;
+
+# First available slot in the stack
+my $available_stack_slot = 40;
+
+my $OP_COMPARE = 0;# compare registers
+my $OP_TESTEND = 1;# end of test, stop
+
+# prints debugging information if DEBUG is set to 1
+sub debug_print {
+if ($debug == 1) {
+my $test = sprintf(shift, @_);
+print $test;
+}
+}
+
+# setup the environment for a memory access involving two registers
+# first input: general purpose register rA
+# second input: general purpose register rB
+sub reg_plus_reg($$)
+{
+my ($rA, $rB) = @_;
+my $value;
+my $low_bits;
+my $high_bits;
+
+$value = rand();
+
+# Has to be loaded like this because we can't just load a 32 bit value
+$low_bits = $value & 0xffff;
+$high_bits = $value >> 16;
+
+# Add a value into the expected memory location
+write_lis(10, $high_bits);  # lis r10, $high_bits
+write_ori(10, 10, $low_bits);   # ori r10, r10, $low_bits
+write_stw(10, $available_stack_slot);   # stw r10, $available_stack_slot(r1)
+
+# setup the two registers to find the memory location
+write_mr($rA, 1);   # mr $rA, r1
+write_li($rB, $available_stack_slot);   # li $rB, $available_stack_slot
+return $rA;
+}
+
+# Handle register + immediate addressing mode
+# first input: base register
+# second input: immediate value (offset)
+sub reg_plus_imm($$)
+{
+my ($base_register, $number) = @_;
+
+# load a random floating point value into memory
+my $float_value = rand();
+my $ieee_value = convert_to_IEEE($float_value);
+my $ieee_high_bits; # the upper 16 bits of $ieee_value
+my $ieee_low_bits;  # the lower 16 bits of $ieee_value
+
+$ieee_high_bits = $ieee_value >> 16;
+$ieee_low_bits = $ieee_value & 0xffff;
+write_lis(10, $ieee_high_bits);   # lis r10, $ieee_high_bits
+write_ori(10, 10, $ieee_low_bits);# ori r10, r10, $ieee_low_bits
+write_stw(10, $number);   # stw r10, $number(r1)
+write_mr($base_register, 1);  # mr $base_register, r1
+return $base_register;
+}
+
+# convert float value to IEEE 754
+# input: floating point value (e.g. 1.23)
+# output: IEEE 754 single precision value
+sub convert_to_IEEE($)
+{
+my $float_value = $_[0];
+my $ieee_value = unpack "L", pack "f", $float_value;
+return $ieee_value;
+}
+
+# returns a random double precision IEEE 754 value
+# output: 64 bit number
+sub get_random_IEEE_double_value()
+{
+my @number_array = (0x40270, 0x405EDCCD,
+0x40700A66, 0x412E240CA45A1CAC,
+0x407C3E8F5C28F5C3, 0x408DBF189374BC6A);
+my $size_of_array = scalar @number_array;
+my $index = int rand($size_of_array);
+return $number_array[$index];
+}
+
+# writes ppc assembly code to load an IEEE 754 double value
+# input one: general purpose register number
+# input two: immediate value (offset)
+# output: IEEE 754 double value
+sub load_double_float_imm($$)
+{
+my ($rA, $number) = @_;
+
+my $ieee_double;   # ieee 754 double float value
+my $ieee_high_bits; # the upper 16 bits of $ieee_double
+my $ieee_low_bits;  # the lower 16 bits of $ieee_double
+my $higher_32_bits;
+my $lower_32_bits;
+
+$ieee_double = get_random_IEEE_double_value();
+
+# loading a 64 bit double value is going to take some work
+
+# load the higher 32 bits
+$higher_32_bits = $ieee_double >> 32;  # push the lower 32 bits out
+$ieee_high_bits = $higher_32_bits >> 16;
+$ieee_low_bits = $higher_32_bits & 0xffff;
+write_lis(10, $ieee_high_bits);   # lis r10, $ieee_high_bits
+write_ori(10, 10, $ieee_low_bits);# ori r10, r10, $ieee_low_bits
+write_stw(10, $number);   # stw r10, $number(r1)
+
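
The lis/ori sequences in the Perl above build a 32-bit constant from two 16-bit halves. A minimal C sketch of the same arithmetic follows, assuming the host `float` is IEEE 754 binary32. The function names are made up for the example and do not appear in risugen.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit IEEE 754 encoding. */
static uint32_t float_to_ieee_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);   /* avoids strict-aliasing problems */
    return bits;
}

/*
 * Split a 32-bit value into the halves that "lis rX, hi" followed by
 * "ori rX, rX, lo" would load into a register.
 */
static void split_halves(uint32_t value, uint16_t *hi, uint16_t *lo)
{
    *hi = (uint16_t)(value >> 16);      /* upper 16 bits, for lis */
    *lo = (uint16_t)(value & 0xffff);   /* lower 16 bits, for ori */
}
```

Recombining the halves as `((uint32_t)hi << 16) | lo` gives back the original value, which is what the generated instruction pair relies on.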

[Qemu-devel] [PATCH 4/9] Add risu_reginfo_ppc.h file

2017-04-24 Thread G 3

Add the risu_reginfo_ppc.h file. It defines the reginfo structure.

Signed-off-by: John Arbuckle 
---
 risu_reginfo_ppc.h | 32 
 1 file changed, 32 insertions(+)
 create mode 100644 risu_reginfo_ppc.h

diff --git a/risu_reginfo_ppc.h b/risu_reginfo_ppc.h
new file mode 100644
index 000..c3b342d
--- /dev/null
+++ b/risu_reginfo_ppc.h
@@ -0,0 +1,32 @@
+/**
+ * File: risu_reginfo_ppc.h
+ * Description: 32 bit powerpc registers
+ * Date: 3-26-2017
+ **/
+
+#ifndef RISU_REGINFO_PPC_H
+#define RISU_REGINFO_PPC_H
+
+struct reginfo
+{
+uint32_t faulting_insn;
+uint32_t previous_instruction;  /* The instruction before the faulting instruction */
+uint32_t second_previous_instruction;  /* The second instruction before the faulting instruction */
+
+/** User Model UISA */
+
+uint32_t gpr[32];/* General Purpose Registers */
+double fpr[32];  /* Floating-point Registers */
+uint32_t cr; /* Condition Register */
+uint32_t fpscr;  /* Floating-point Status and Control Register */
+uint32_t xer;/* Fixed-point Exception Register */
+uint32_t lr; /* Link Register */
+uint32_t ctr;/* Count Register */
+
+/*
+ * Since we can't test any registers in the Supervisor Model, they are not
+ * included.
+ */
+};
+
+#endif /* RISU_REGINFO_PPC_H */
--
2.10.2




[Qemu-devel] [PATCH 3/9] Add risu_reginfo_ppc.c file

2017-04-24 Thread G 3
Add the risu_reginfo_ppc.c file. It handles operations involving the reginfo
structure.

Signed-off-by: John Arbuckle 
---
 risu_reginfo_ppc.c | 273 +++++
 1 file changed, 273 insertions(+)
 create mode 100644 risu_reginfo_ppc.c

diff --git a/risu_reginfo_ppc.c b/risu_reginfo_ppc.c
new file mode 100644
index 000..123e5fa
--- /dev/null
+++ b/risu_reginfo_ppc.c
@@ -0,0 +1,273 @@
+/***
+ * File: risu_reginfo_ppc.c
+ * Description: PowerPC register validation code.
+ * Note: This file was made to work on Mac OS X.
+ * Date: 3-26-2017
+ ***/
+
+#include 
+#include 
+#include 
+
+#include "risu.h"
+#include "risu_reginfo_ppc.h"
+
+#define NUMBER_OF_GPR 32/* Number of general purpose registers */
+#define NUMBER_OF_FPR 32/* Number of floating point registers */
+#define DEBUG 0 /* Set to 1 for debug printing */
+#define debug_print(...) \
+do { if (DEBUG) fprintf(stderr, __VA_ARGS__); } while (0)
+
+extern int verbose_mode;
+
+/* reginfo_init: initialize with a ucontext */
+void reginfo_init(struct reginfo *ri, ucontext_t *uc)
+{
+debug_print("reginfo_init() called\n");
+
+int i;
+
+/* init the program counter (PC) */
+ri->faulting_insn = *((uint32_t *)uc->uc_mcontext->ss.srr0);
+
+/* Tells us the instruction that ran before any problem was detected */
+ri->previous_instruction = *((uint32_t *)(uc->uc_mcontext->ss.srr0 - 4));
+
+/*
+ * Holds the second instruction before the faulting one. Needed because
+ * the cleanup instruction for the register that held a memory address
+ * is in the way.
+ */
+ ri->second_previous_instruction = *((uint32_t *)(uc->uc_mcontext->ss.srr0 - 8));
+
+/* Displays each instruction and the previous instruction being tested */
+if(verbose_mode) {
+printf("testing instruction: 0x%08x 0x%08x\n", ri->previous_instruction
+  , ri->second_previous_instruction);
+}
+
+/* init the general purpose registers */
+ri->gpr[0] = uc->uc_mcontext->ss.r0;
+ri->gpr[1] = 0xdeadbeef;/* stack pointer */
+ri->gpr[2] = uc->uc_mcontext->ss.r2;
+ri->gpr[3] = uc->uc_mcontext->ss.r3;
+ri->gpr[4] = uc->uc_mcontext->ss.r4;
+ri->gpr[5] = uc->uc_mcontext->ss.r5;
+ri->gpr[6] = uc->uc_mcontext->ss.r6;
+ri->gpr[7] = uc->uc_mcontext->ss.r7;
+ri->gpr[8] = uc->uc_mcontext->ss.r8;
+ri->gpr[9] = uc->uc_mcontext->ss.r9;
+ri->gpr[10] = uc->uc_mcontext->ss.r10;
+ri->gpr[11] = uc->uc_mcontext->ss.r11;
+ri->gpr[12] = uc->uc_mcontext->ss.r12;
+ri->gpr[13] = uc->uc_mcontext->ss.r13;
+ri->gpr[14] = uc->uc_mcontext->ss.r14;
+ri->gpr[15] = uc->uc_mcontext->ss.r15;
+ri->gpr[16] = uc->uc_mcontext->ss.r16;
+ri->gpr[17] = uc->uc_mcontext->ss.r17;
+ri->gpr[18] = uc->uc_mcontext->ss.r18;
+ri->gpr[19] = uc->uc_mcontext->ss.r19;
+ri->gpr[20] = uc->uc_mcontext->ss.r20;
+ri->gpr[21] = uc->uc_mcontext->ss.r21;
+ri->gpr[22] = uc->uc_mcontext->ss.r22;
+ri->gpr[23] = uc->uc_mcontext->ss.r23;
+ri->gpr[24] = uc->uc_mcontext->ss.r24;
+ri->gpr[25] = uc->uc_mcontext->ss.r25;
+ri->gpr[26] = uc->uc_mcontext->ss.r26;
+ri->gpr[27] = uc->uc_mcontext->ss.r27;
+ri->gpr[28] = uc->uc_mcontext->ss.r28;
+ri->gpr[29] = uc->uc_mcontext->ss.r29;
+ri->gpr[30] = uc->uc_mcontext->ss.r30;
+ri->gpr[31] = uc->uc_mcontext->ss.r31;
+
+/* init the floating-point registers */
+for (i = 0; i < NUMBER_OF_FPR; i++) {
+ri->fpr[i] = uc->uc_mcontext->fs.fpregs[i];
+}
+
+/* init the condition register */
+ri->cr = uc->uc_mcontext->ss.cr;
+
+/* init the floating point status and control register */
+ri->fpscr = uc->uc_mcontext->fs.fpscr;
+
+/* init the fixed-point exception register */
+ri->xer = uc->uc_mcontext->ss.xer;
+
+/* init the link register */
+ri->lr = uc->uc_mcontext->ss.lr;
+
+/* init the count register */
+ri->ctr = uc->uc_mcontext->ss.ctr;
+}
+
+/* reginfo_is_eq: compare the reginfo structs, returns nonzero if equal */
+int reginfo_is_eq(struct reginfo *r1, struct reginfo *r2)
+{
+debug_print("reginfo_is_eq() called\n");
+
+int i;
+
+/* check each general purpose register */
+for (i = 0; i < NUMBER_OF_GPR; i++) {
+if (r1->gpr[i] != r2->gpr[i]) {
+debug_print("general purpose register %d mismatch detected\n", i);
+return 0;
+}
+}
+
+/* check each floating point register */
+for (i = 0; i < NUMBER_OF_FPR; i++) {
+if (r1->fpr[i] != r2->fpr[i]) {
+if (!(isnan(r1->fpr[i]) && isnan(r2->fpr[i]))) {
+if ( fabs(r1->fpr[i] - r2->fpr[i]) < 0.01) {
+debug_print("float point register %d mismatch  
detected\n", 
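
The floating-point check above is cut off in this archive, so its exact logic cannot be recovered here. As a hedged sketch of one way to compare two floating-point register values while treating a NaN on both sides as a match, consider the helper below. The name fpr_matches and the choice of exact equality (rather than a tolerance) are assumptions for the example, not the patch's behaviour.

```c
#include <math.h>

/*
 * Hypothetical comparison helper: returns 1 when the two values should
 * count as the same register contents, 0 otherwise.
 */
static int fpr_matches(double a, double b)
{
    /* NaN never compares equal to itself, so handle it explicitly */
    if (isnan(a) && isnan(b)) {
        return 1;
    }
    return a == b;
}
```

With a helper like this, the mismatch message would be printed only when fpr_matches() returns 0.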

[Qemu-devel] [PATCH 2/9] Add risu_ppc.c file

2017-04-24 Thread G 3

Add the risu_ppc.c file. It defines several functions used by risu.

Signed-off-by: John Arbuckle 
---
 risu_ppc.c | 41 +
 1 file changed, 41 insertions(+)
 create mode 100644 risu_ppc.c

diff --git a/risu_ppc.c b/risu_ppc.c
new file mode 100644
index 000..70a1cf7
--- /dev/null
+++ b/risu_ppc.c
@@ -0,0 +1,41 @@
+/*
+ * File: risu_ppc.c
+ * Date: 3-27-2017
+ * Description: Implement advance_pc(), set_ucontext_paramreg(),
+ *  get_reginfo_paramreg(), get_risuop()
+ */
+
+#include "risu.h"
+
+/* Advances the program counter register to the next instruction */
+void advance_pc(void *vuc)
+{
+ucontext_t *uc = (ucontext_t*)vuc;
+uc->uc_mcontext->ss.srr0 += 4;
+}
+
+/* Sets register r0 to the address of a memory block. */
+void set_ucontext_paramreg(void *vuc, uint64_t value)
+{
+ucontext_t *uc = vuc;
+uc->uc_mcontext->ss.r0 = value;
+}
+
+/*
+ * Returns the register that keeps track of a memory block address.
+ * Used for the load and store to memory instructions like LFD.
+ * Returns general purpose register r0.
+ */
+uint64_t get_reginfo_paramreg(struct reginfo *ri)
+{
+return ri->gpr[0];
+}
+
+int get_risuop(struct reginfo *ri)
+{
+uint32_t insn = ri->faulting_insn;
+uint32_t op = insn & 0xf;
+uint32_t key = insn & ~0xf;
+uint32_t risukey = 0x5af0;
+return (key != risukey) ? -1 : op;
+}
--
2.10.2




[Qemu-devel] [PATCH 1/9] Add ppc.risu file

2017-04-24 Thread G 3
Add the ppc.risu file. It defines the format for various PowerPC instructions.

Signed-off-by: John Arbuckle 
---
 ppc.risu | 527 +++
 1 file changed, 527 insertions(+)
 create mode 100644 ppc.risu

diff --git a/ppc.risu b/ppc.risu
new file mode 100644
index 000..b6d6aee
--- /dev/null
+++ b/ppc.risu
@@ -0,0 +1,527 @@
+###
+# File: ppc.risu
+# Date: 3-27-2017
+# Description: Specifies PowerPC instruction test patterns.
+###
+
+.mode ppc
+
+# Note: register r1 cannot be used because it is the stack pointer.
+# The branching, VEA, and OEA instructions cannot be used here currently.
+
+ADD PPC 01 rD:5 rA:5 rB:5 OE:1 11010 Rc:1 \
+!constraints { $rD != 1 && $rA != 1 && $rB != 1; }
+
+ADDC PPC 01 rD:5 rA:5 rB:5 OE:1 01010 Rc:1 \
+!constraints { $rD != 1 && $rA != 1 && $rB != 1; }
+
+ADDE PPC 01 rD:5 rA:5 rB:5 OE:1 010001010 Rc:1 \
+!constraints { $rD != 1 && $rA != 1 && $rB != 1; }
+
+ADDI PPC 001110 rD:5 rA:5 imm:16 \
+!constraints { $rD != 1 && $rA != 1; }
+
+ADDIC PPC 001100 rD:5 rA:5 imm:16 \
+!constraints { $rD != 1 && $rA != 1; }
+
+# ADDIC. and ADDIC are two different instructions
+ADDICp PPC 001101 rD:5 rA:5 imm:16 \
+!constraints { $rD != 1 && $rA != 1; }
+
+ADDIS PPC 00 rD:5 rA:5 imm:16 \
+!constraints { $rD != 1 && $rA != 1; }
+
+ADDME PPC 01 rD:5 rA:5 0 OE:1 011101010 Rc:1 \
+!constraints { $rD != 1 && $rA != 1; }
+
+ADDZE PPC 01 rD:5 rA:5 0 OE:1 011001010 Rc:1 \
+!constraints { $rD != 1 && $rA != 1; }
+
+AND PPC 01 rD:5 rA:5 rB:5 011100 Rc:1 \
+!constraints { $rD != 1 && $rA != 1 && $rB != 1; }
+
+ANDC PPC 01 rD:5 rA:5 rB:5 00 Rc:1 \
+!constraints { $rD != 1 && $rA != 1 && $rB != 1; }
+
+# ANDI. - p stands for period
+ANDIp PPC 011100 rD:5 rA:5 imm:16 \
+!constraints { $rD != 1 && $rA != 1; }
+
+# ANDIS. - p stands for period
+ANDISp PPC 011101 rD:5 rA:5 imm:16 \
+!constraints { $rD != 1 && $rA != 1; }
+
+# For the CMP* instructions, if you want to test the PowerPC G5, set L to 1
+# in the constraints.
+
+CMP PPC 01 crfD:3 0 L:1 rA:5 rB:5 00 0 \
+!constraints { $rA != 1 && $rB != 1 && $L == 0; }
+
+CMPI PPC 001011 crfD:3 0 L:1 rA:5 imm:16 \
+!constraints { $rA != 1 && $L == 0; }
+
+CMPL PPC 01 crfD:3 0 L:1 rA:5 rB:5 10 0 \
+!constraints { $rA != 1 && $rB != 1 && $L == 0; }
+
+CMPLI PPC 001010 crfD:3 0 L:1 rA:5 imm:16 \
+!constraints { $rA != 1 && $L == 0; }
+
+CNTLZWX PPC 01 rS:5 rA:5 0 011010 Rc:1 \
+!constraints { $rS != 1 && $rA != 1; }
+
+CRAND PPC 010011 crbD:5 crbA:5 crbB:5 010001 0
+
+CRANDC PPC 010011 crbD:5 crbA:5 crbB:5 001001 0
+
+CREQV PPC 010011 crbD:5 crbA:5 crbB:5 010011 0
+
+CRNAND PPC 010011 crbD:5 crbA:5 crbB:5 001111 0
+
+CRNOR PPC 010011 crbD:5 crbA:5 crbB:5 11 0
+
+CROR PPC 010011 crbD:5 crbA:5 crbB:5 011101 0
+
+CRORC PPC 010011 crbD:5 crbA:5 crbB:5 011011 0
+
+CRXOR PPC 010011 crbD:5 crbA:5 crbB:5 001101 0
+
+DIVW PPC 01 rD:5 rA:5 rB:5 OE:1 01011 Rc:1 \
+!constraints { $rD != 1 && $rA != 1 && $rB != 1; }
+
+DIVWU PPC 01 rD:5 rA:5 rB:5 OE:1 111001011 Rc:1 \
+!constraints { $rD != 1 && $rA != 1 && $rB != 1; }
+
+EQV PPC 01 rS:5 rA:5 rB:5 0100011100 Rc:1 \
+!constraints { $rS != 1 && $rA != 1 && $rB != 1; }
+
+EXTSB PPC 01 rS:5 rA:5 0 1110111010 Rc:1 \
+!constraints { $rS != 1 && $rA != 1; }
+
+EXTSH PPC 01 rS:5 rA:5 0 1110011010 Rc:1 \
+!constraints { $rS != 1 && $rA != 1; }
+
+FABS PPC 11 fD:5 0 fB:5 011000 Rc:1
+
+FADD PPC 11 fD:5 fA:5 fB:5 0 10101 Rc:1
+
+FADDS PPC 111011 fD:5 fA:5 fB:5 0 10101 Rc:1
+
+FCMPO PPC 11 crfD:3 00 fA:5 fB:5 10 0
+
+FCMPU PPC 11 crfD:3 00 fA:5 fB:5 00 0
+
+FCTIW PPC 11 fD:5 0 fB:5 0011100
+
+FCTIWZ PPC 11 fD:5 0 fB:5 00 Rc:1
+
+FDIV PPC 11 fD:5 fA:5 fB:5 0 10010 Rc:1
+
+FDIVS PPC 111011 fD:5 fA:5 fB:5 0 10010 Rc:1
+
+FMADD PPC 11 fD:5 fA:5 fB:5 fC:5 11101 Rc:1
+
+FMADDS PPC 111011 fD:5 fA:5 fB:5 fC:5 11101 Rc:1
+
+FMR PPC 11 fD:5 0 fB:5 0001001000 Rc:1
+
+FMSUB PPC 11 fD:5 fA:5 fB:5 fC:5 11100 Rc:1
+
+FMSUBS PPC 111011 fD:5 fA:5 fB:5 fC:5 11100 Rc:1
+
+FMUL PPC 11 fD:5 fA:5 0 fC:5 11001 Rc:1
+
+FMULS PPC 111011 fD:5 fA:5 0 fC:5 11001 Rc:1
+
+FNABS PPC 11 fD:5 0 fB:5 0010001000 Rc:1
+
+FNEG PPC 11 fD:5 0 fB:5 101000 Rc:1
+
+FNMADD PPC 11 fD:5 fA:5 fB:5 fC:5 1 Rc:1
+
+FNMADDS PPC 111011 fD:5 fA:5 fB:5 fC:5 1 Rc:1
+
+FNMSUB PPC 11 fD:5 fA:5 fB:5 fC:5 0 Rc:1
+
+FNMSUBS PPC 111011 fD:5 fA:5 fB:5 fC:5 0 Rc:1
+
+# The value placed into register fD may vary between implementations - too hard to test
+#FRES PPC 111011 fD:5 0 fB:5 0 11000 Rc:1
+
+FRSP PPC 11 fD:5 0 

[Qemu-devel] [PATCH 0/9] Add PowerPC support to risu

2017-04-24 Thread G 3

Makes risu usable on a PowerPC Macintosh running Mac OS X.

John Arbuckle (9):
  Add ppc.risu file.
  Add risu_ppc.c file.
  Add risu_reginfo_ppc.c file.
  Add risu_reginfo_ppc.h file
  Add risugen_ppc.pm file.
  Add ppc support to configure
  Add verbose option.
  Add end of test message
  Add more descriptive comment for mismatch or end of test condition.

 configure  |   4 +-
 ppc.risu   | 527 +
 risu.c |  14 +-
 risu_ppc.c |  41 +++
 risu_reginfo_ppc.c | 273 
 risu_reginfo_ppc.h |  32 +++
 risugen_ppc.pm | 744 +++++
 7 files changed, 1632 insertions(+), 3 deletions(-)
 create mode 100644 ppc.risu
 create mode 100644 risu_ppc.c
 create mode 100644 risu_reginfo_ppc.c
 create mode 100644 risu_reginfo_ppc.h
 create mode 100644 risugen_ppc.pm

--
2.10.2




[Qemu-devel] [PATCH] pci: deassert intx when pci device unrealize

2017-04-24 Thread Herongguang (Stephen)

If a pci device is not reset by VM (by writing into config space)
and unplugged by VM, after that when VM reboots, qemu may assert:
pcibus_reset: Assertion `bus->irq_count[i] == 0' failed

Signed-off-by: herongguang 
---
 hw/pci/pci.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 259483b..98ccc27 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -1083,6 +1083,7 @@ static void pci_qdev_unrealize(DeviceState *dev, Error **errp)
 pc->exit(pci_dev);
 }

+pci_device_deassert_intx(pci_dev);
 do_pci_unregister_device(pci_dev);
 }

--
1.7.12.4
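
The assertion quoted in the commit message checks that no interrupt line is still counted as raised when the bus resets. The toy model below sketches why deasserting INTx at unrealize keeps that count balanced. All types and function names here are hypothetical and far simpler than QEMU's real PCI code.

```c
#include <stdbool.h>

#define TOY_PCI_NUM_PINS 4

/* Toy bus: one counter of asserted devices per INTx pin. */
struct toy_bus {
    int irq_count[TOY_PCI_NUM_PINS];
};

/* Toy device: remembers which pin it drives and its current level. */
struct toy_dev {
    struct toy_bus *bus;
    int pin;
    int level;
};

/* Raise or lower the device's line, keeping the bus counter in sync. */
static void toy_set_intx(struct toy_dev *dev, int level)
{
    int change = level - dev->level;

    if (change == 0) {
        return;
    }
    dev->bus->irq_count[dev->pin] += change;
    dev->level = level;
}

/* Unplug path: drop the line first so the bus counter returns to zero. */
static void toy_unrealize(struct toy_dev *dev)
{
    toy_set_intx(dev, 0);
}

/* Mirrors the spirit of the assertion: every counter must be zero. */
static bool toy_bus_can_reset(const struct toy_bus *bus)
{
    for (int i = 0; i < TOY_PCI_NUM_PINS; i++) {
        if (bus->irq_count[i] != 0) {
            return false;
        }
    }
    return true;
}
```

If toy_unrealize() skipped the toy_set_intx(dev, 0) call, a device unplugged with its line raised would leave a stale count behind, which is the situation the patch addresses.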



Re: [Qemu-devel] [RFC PATCH] pci: deassert intx when pci device unrealize

2017-04-24 Thread Herongguang (Stephen)



On 2017/4/25 7:45, Michael S. Tsirkin wrote:
> On Mon, Apr 24, 2017 at 09:12:29PM +0800, Herongguang (Stephen) wrote:
>> If a pci device is not reset by VM (by writing into config space)
>> and unplugged by VM, after that when VM reboots, qemu may assert:
>> pcibus_reset: Assertion `bus->irq_count[i] == 0' failed
>>
>> Signed-off-by: herongguang
>
> Good grief, I can't believe we have had this bug for so long.
> Thanks a lot for finding this.
>
> Pls Cc stable on this patch.
>
>> ---
>> Is there need to call pci_do_device_reset()?
>
> I don't think so, why?
>
>> ---
>>   hw/pci/pci.c | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
>> index 259483b..afe6397 100644
>> --- a/hw/pci/pci.c
>> +++ b/hw/pci/pci.c
>> @@ -1078,6 +1078,7 @@ static void pci_qdev_unrealize(DeviceState *dev, Error **errp)
>>   pci_unregister_io_regions(pci_dev);
>>   pci_del_option_rom(pci_dev);
>> +pci_device_deassert_intx(pci_dev);
>
> I think the right place to call this is after class exit
> when we know no one will try to assert the interrupt.

OK, it's safer, will resend a new patch.

>>   if (pc->exit) {
>>   pc->exit(pci_dev);
>> --
>> 1.7.12.4





Re: [Qemu-devel] [PATCH 2/2] qemu-img: fix some spelling errors

2017-04-24 Thread 858585 jemmy
On Mon, Apr 24, 2017 at 11:53 PM, Eric Blake  wrote:
> On 04/24/2017 10:47 AM, Eric Blake wrote:
>> On 04/24/2017 10:37 AM, Philippe Mathieu-Daudé wrote:
>>
>>>>>  /*
>>>>> - * Returns true iff the first sector pointed to by 'buf' contains at least
>>>>> - * a non-NUL byte.
>>>>> + * Returns true if the first sector pointed to by 'buf' contains at least
>>>>> + * a non-NULL byte.
>>>>
>>>> NACK to both changes.  'iff' is an English word that is shorthand for
>>>> "if and only if".  "NUL" means the one-byte character, while "NULL"
>>>> means the 8-byte (or 4-byte, on 32-bit platform) pointer value.
>>>
>>> I agree with Lidong shorthands are not obvious from non-native speaker.
>>>
>>> What about this?
>>>
>>>  * Returns true if (and only if) the first sector pointed to by 'buf'
>>> contains
>>
>> That might be okay.
>>
>>>  * at least a non-null character.
>>
>> But that still doesn't make sense.  The character name is NUL, and
>> non-NULL refers to something that is a pointer, not a character.
>
> What's more, the NUL character can actually occupy more than one byte
> (think UTF-16, where it is the two-byte 0 value).  Referring to NUL byte
> rather than NUL character (or even the '\0' byte) makes it obvious
> that this function is NOT encoding-sensitive, and doesn't start
> mis-behaving just because the data picks a multi-byte character encoding.

How about this?

 * Returns true if (and only if) the first sector pointed to by 'buf'
 * contains at least a non-zero byte.

Thanks.

>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
>
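
The wording debate above comes down to a byte-level test that does not depend on any character encoding. A minimal C sketch of such a check follows. The function name is made up for the example; it is not the qemu-img helper under discussion.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns true if (and only if) at least one of the first len bytes
 * of buf is non-zero. No character encoding is involved: each byte is
 * compared against the value 0.
 */
static bool buffer_has_nonzero_byte(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return true;
        }
    }
    return false;
}
```

Calling it with len set to 512 would correspond to checking one sector, assuming the usual 512-byte sector size.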



Re: [Qemu-devel] [PATCH] crypto: move 'opaque' parameter to (nearly) the end of parameter list

2017-04-24 Thread Fam Zheng
On Mon, 04/24 16:36, Daniel P. Berrange wrote:
> Previous commit moved 'opaque' to be the 2nd parameter in the list:
> 
>   commit 375092332eeaa6e47561ce47fd36144cdaf964d0
>   Author: Fam Zheng 
>   Date:   Fri Apr 21 20:27:02 2017 +0800
> 
> crypto: Make errp the last parameter of functions
> 
> Move opaque to 2nd instead of the 2nd to last, so that compilers help
> check with the conversion.
> 
> this puts it back to the 2nd to last position.
> 
> Signed-off-by: Daniel P. Berrange 

Reviewed-by: Fam Zheng 



[Qemu-devel] [PATCH v8] Allow setting NUMA distance for different NUMA nodes

2017-04-24 Thread He Chen
This patch adds SLIT table support to QEMU and provides an additional
`dist` option for the `-numa` command, so that the user can set vNUMA
distances on the QEMU command line.

With this patch, when a user wants to create a guest that contains
several vNUMA nodes and also wants to set the distances among those nodes,
the QEMU command would look like:

```
-numa node,nodeid=0,cpus=0 \
-numa node,nodeid=1,cpus=1 \
-numa node,nodeid=2,cpus=2 \
-numa node,nodeid=3,cpus=3 \
-numa dist,src=0,dst=1,val=21 \
-numa dist,src=0,dst=2,val=31 \
-numa dist,src=0,dst=3,val=41 \
-numa dist,src=1,dst=2,val=21 \
-numa dist,src=1,dst=3,val=31 \
-numa dist,src=2,dst=3,val=21 \
```

Signed-off-by: He Chen 

---
Changes since v7:
* Remove unnecessary node present check.
* Minor improvement on prompt message.

Changes since v6:
* Split validate_numa_distance into 2 separate functions.
* Add comments before validate and complete numa distance functions.

Changes since v5:
* Made the generation of the SLIT dependent on `have_numa_distance`.
* Doc refinement.
---
 hw/acpi/aml-build.c |  26 ++
 hw/i386/acpi-build.c|   4 ++
 include/hw/acpi/aml-build.h |   1 +
 include/sysemu/numa.h   |   2 +
 include/sysemu/sysemu.h |   4 ++
 numa.c  | 124 
 qapi-schema.json|  30 ++-
 qemu-options.hx |  16 +-
 8 files changed, 204 insertions(+), 3 deletions(-)

diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index c6f2032..be496c8 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -24,6 +24,7 @@
 #include "hw/acpi/aml-build.h"
 #include "qemu/bswap.h"
 #include "qemu/bitops.h"
+#include "sysemu/numa.h"
 
 static GArray *build_alloc_array(void)
 {
@@ -1609,3 +1610,28 @@ void build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base,
 numamem->base_addr = cpu_to_le64(base);
 numamem->range_length = cpu_to_le64(len);
 }
+
+/*
+ * ACPI spec 5.2.17 System Locality Distance Information Table
+ * (Revision 2.0 or later)
+ */
+void build_slit(GArray *table_data, BIOSLinker *linker)
+{
+int slit_start, i, j;
+slit_start = table_data->len;
+
+acpi_data_push(table_data, sizeof(AcpiTableHeader));
+
+build_append_int_noprefix(table_data, nb_numa_nodes, 8);
+for (i = 0; i < nb_numa_nodes; i++) {
+for (j = 0; j < nb_numa_nodes; j++) {
+assert(numa_info[i].distance[j]);
+build_append_int_noprefix(table_data, numa_info[i].distance[j], 1);
+}
+}
+
+build_header(linker, table_data,
+ (void *)(table_data->data + slit_start),
+ "SLIT",
+ table_data->len - slit_start, 1, NULL, NULL);
+}
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 2073108..2458ebc 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2678,6 +2678,10 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 if (pcms->numa_nodes) {
 acpi_add_table(table_offsets, tables_blob);
 build_srat(tables_blob, tables->linker, machine);
+if (have_numa_distance) {
+acpi_add_table(table_offsets, tables_blob);
+build_slit(tables_blob, tables->linker);
+}
 }
 if (acpi_get_mcfg()) {
 acpi_add_table(table_offsets, tables_blob);
diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h
index 00c21f1..329a0d0 100644
--- a/include/hw/acpi/aml-build.h
+++ b/include/hw/acpi/aml-build.h
@@ -389,4 +389,5 @@ GCC_FMT_ATTR(2, 3);
 void build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base,
uint64_t len, int node, MemoryAffinityFlags flags);
 
+void build_slit(GArray *table_data, BIOSLinker *linker);
 #endif
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 8f09dcf..0ea1bc0 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -8,6 +8,7 @@
 #include "hw/boards.h"
 
 extern int nb_numa_nodes;   /* Number of NUMA nodes */
+extern bool have_numa_distance;
 
 struct numa_addr_range {
 ram_addr_t mem_start;
@@ -21,6 +22,7 @@ typedef struct node_info {
 struct HostMemoryBackend *node_memdev;
 bool present;
 QLIST_HEAD(, numa_addr_range) addr; /* List to store address ranges */
+uint8_t distance[MAX_NODES];
 } NodeInfo;
 
 extern NodeInfo numa_info[MAX_NODES];
diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 576c7ce..6999545 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -169,6 +169,10 @@ extern int mem_prealloc;
 
 #define MAX_NODES 128
 #define NUMA_NODE_UNASSIGNED MAX_NODES
+#define NUMA_DISTANCE_MIN 10
+#define NUMA_DISTANCE_DEFAULT 20
+#define NUMA_DISTANCE_MAX 254
+#define NUMA_DISTANCE_UNREACHABLE 255
 
 #define MAX_OPTION_ROMS 16
 typedef struct QEMUOptionRom {
diff --git a/numa.c b/numa.c
index 6fc2393..162b84b 100644
--- a/numa.c
+++ b/numa.c
@@ -51,6 +51,7 @@ static int 
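
The numa.c part of this patch is cut off in the archive, so the following is only a guess at the general idea behind completing a distance table. The command-line example above gives each pair in one direction only, so a complete SLIT needs the reverse entries and the diagonal filled in. This C sketch assumes the reverse direction mirrors the given one and that a node's distance to itself is 10 (the NUMA_DISTANCE_MIN value defined in the patch). The function name and the 4-node size are invented for the example.

```c
#include <stdint.h>

#define SKETCH_NODES 4
#define SKETCH_DISTANCE_MIN 10   /* same value as NUMA_DISTANCE_MIN above */

/*
 * Fill in missing entries of a distance matrix, where 0 means "not set".
 * Assumption for this sketch: a missing dist[i][j] is copied from
 * dist[j][i], and the diagonal defaults to the minimum distance.
 */
static void sketch_complete_distances(uint8_t dist[SKETCH_NODES][SKETCH_NODES])
{
    for (int i = 0; i < SKETCH_NODES; i++) {
        for (int j = 0; j < SKETCH_NODES; j++) {
            if (i == j) {
                if (dist[i][j] == 0) {
                    dist[i][j] = SKETCH_DISTANCE_MIN;
                }
            } else if (dist[i][j] == 0 && dist[j][i] != 0) {
                dist[i][j] = dist[j][i];
            }
        }
    }
}
```

After such a pass every entry is non-zero whenever each pair was given in at least one direction, which is the property the assert() in build_slit() relies on.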

Re: [Qemu-devel] [PATCH 1/2] qemu-img: make sure contain the consecutive number of zero bytes

2017-04-24 Thread 858585 jemmy
On Mon, Apr 24, 2017 at 10:43 PM, Eric Blake  wrote:
> On 04/23/2017 09:33 AM, jemmy858...@gmail.com wrote:
>> From: Lidong Chen 
>>
>> is_allocated_sectors_min don't guarantee to contain the
>> consecutive number of zero bytes. this patch fixes this bug.
>
> This message was sent without an 'In-Reply-To' header pointing to a 0/2
> cover letter.  When sending a series, please always thread things to a
> cover letter; you may find 'git config format.coverletter auto' to be
> helpful.

Thanks for your kind advice.

>
>>
>> Signed-off-by: Lidong Chen 
>> ---
>>  qemu-img.c | 11 ++-
>>  1 file changed, 6 insertions(+), 5 deletions(-)
>>
>> diff --git a/qemu-img.c b/qemu-img.c
>> index b220cf7..df6d165 100644
>> --- a/qemu-img.c
>> +++ b/qemu-img.c
>> @@ -1060,9 +1060,9 @@ static int is_allocated_sectors(const uint8_t *buf, 
>> int n, int *pnum)
>>  }
>>
>>  /*
>> - * Like is_allocated_sectors, but if the buffer starts with a used sector,
>> - * up to 'min' consecutive sectors containing zeros are ignored. This avoids
>> - * breaking up write requests for only small sparse areas.
>> + * Like is_allocated_sectors, but up to 'min' consecutive sectors
>> + * containing zeros are ignored. This avoids breaking up write requests
>> + * for only small sparse areas.
>>   */
>>  static int is_allocated_sectors_min(const uint8_t *buf, int n, int *pnum,
>>  int min)
>> @@ -1071,11 +1071,12 @@ static int is_allocated_sectors_min(const uint8_t 
>> *buf, int n, int *pnum,
>>  int num_checked, num_used;
>>
>>  if (n < min) {
>> -min = n;
>> +*pnum = n;
>> +return 1;
>>  }
>>
>>  ret = is_allocated_sectors(buf, n, pnum);
>> -if (!ret) {
>> +if (!ret && *pnum >= min) {
>
> I seem to recall past attempts to try and patch this function, which
> were then turned down, although I haven't scrubbed the archives for a
> quick URL to point to. I'm worried that there are more subtleties here
> than what you realize.

Hi Eric:
Do you mean this URL?
https://lists.gnu.org/archive/html/qemu-block/2017-01/msg00306.html

But I think the code is not consistent with qemu-img --help.
qemu-img --help
  '-S' indicates the consecutive number of bytes (defaults to 4k) that must
   contain only zeros for qemu-img to create a sparse image during
   conversion. If the number of bytes is 0, the source will not be scanned for
   unallocated or zero sectors, and the destination image will always be
   fully allocated.

Another reason:
if s->has_zero_init is 1 (a qcow2 image which has a backing_file), the empty
space at the beginning of the buffer still needs to be written, which invokes
blk_co_pwrite_zeroes. A single write operation is then split into two just
because there is a small empty space at the beginning.

Thanks.

>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.   +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org
>
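
For readers following the discussion of is_allocated_sectors(), the sketch below shows the basic run-length idea in standalone C: report whether the first sector holds data and how many following sectors share that status. It is a simplified illustration under an assumed 512-byte sector size, with invented names. It does not reproduce the 'min' handling that the patch changes.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SECTOR_SIZE 512

/* True if every byte of one sector is zero. */
static bool sketch_sector_is_zero(const uint8_t *sector)
{
    for (size_t i = 0; i < SKETCH_SECTOR_SIZE; i++) {
        if (sector[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Returns 1 if the first sector contains data, 0 if it is all zeros.
 * *pnum is set to the number of leading sectors with that same status.
 */
static int sketch_leading_run(const uint8_t *buf, int n, int *pnum)
{
    bool zero;
    int i;

    if (n <= 0) {
        *pnum = 0;
        return 0;
    }
    zero = sketch_sector_is_zero(buf);
    for (i = 1; i < n; i++) {
        if (sketch_sector_is_zero(buf + (size_t)i * SKETCH_SECTOR_SIZE) != zero) {
            break;
        }
    }
    *pnum = i;
    return !zero;
}
```

A caller converting an image would loop over the buffer, writing the runs reported as data and skipping or zero-writing the others.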



Re: [Qemu-devel] [PATCH v2] trace: add qemu mutex lock and unlock trace events

2017-04-24 Thread Fam Zheng
On Mon, 04/24 14:19, Jose Ricardo Ziviani wrote:
> These trace events were very useful to help me to understand and find a
> reordering issue in vfio, for example:
> 
> qemu_mutex_lock locked mutex 0x10905ad8
>   vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2020c, 4)
> qemu_mutex_unlock unlocked mutex 0x10905ad8
> qemu_mutex_lock locked mutex 0x10905ad8
>   vfio_region_write  (0001:03:00.0:region1+0xc4, 0xa, 4)
> qemu_mutex_unlock unlocked mutex 0x10905ad8
> 
> that also helped me to see the desired result after the fix:
> 
> qemu_mutex_lock locked mutex 0x10905ad8
>   vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2000c, 4)
>   vfio_region_write  (0001:03:00.0:region1+0xc4, 0xb, 4)
> qemu_mutex_unlock unlocked mutex 0x10905ad8
> 
> So it could be a good idea to have these traces implemented. It's worth
> mentioning that they should be surgically enabled during the debugging,
> otherwise it can flood the trace logs with lock/unlock messages.
> 
> How to use it:
> trace-event qemu_mutex_lock on|off
> trace-event qemu_mutex_unlock on|off
> or
> trace-event qemu_mutex* on|off
> 
> Signed-off-by: Jose Ricardo Ziviani 
> ---
> v2:
>   - removed unecessary (void*) cast
>   - renamed parameter name to lock instead of qemu_global_mutex
> 
>  util/qemu-thread-posix.c | 5 +
>  util/trace-events| 4 
>  2 files changed, 9 insertions(+)
> 
> diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
> index 73e3a0e..4f77d7b 100644
> --- a/util/qemu-thread-posix.c
> +++ b/util/qemu-thread-posix.c
> @@ -14,6 +14,7 @@
>  #include "qemu/thread.h"
>  #include "qemu/atomic.h"
>  #include "qemu/notify.h"
> +#include "trace.h"
>  
>  static bool name_threads;
>  
> @@ -60,6 +61,8 @@ void qemu_mutex_lock(QemuMutex *mutex)
>  err = pthread_mutex_lock(&mutex->lock);
>  if (err)
>  error_exit(err, __func__);
> +
> +trace_qemu_mutex_lock(&mutex->lock);
>  }
>  
>  int qemu_mutex_trylock(QemuMutex *mutex)
> @@ -74,6 +77,8 @@ void qemu_mutex_unlock(QemuMutex *mutex)
>  err = pthread_mutex_unlock(&mutex->lock);
>  if (err)
>  error_exit(err, __func__);
> +
> +trace_qemu_mutex_unlock(&mutex->lock);
>  }
>  
>  void qemu_rec_mutex_init(QemuRecMutex *mutex)
> diff --git a/util/trace-events b/util/trace-events
> index b44ef4f..70f6212 100644
> --- a/util/trace-events
> +++ b/util/trace-events
> @@ -55,3 +55,7 @@ lockcnt_futex_wait_prepare(const void *lockcnt, int 
> expected, int new) "lockcnt
>  lockcnt_futex_wait(const void *lockcnt, int val) "lockcnt %p waiting on %d"
>  lockcnt_futex_wait_resume(const void *lockcnt, int new) "lockcnt %p after 
> wait: %d"
>  lockcnt_futex_wake(const void *lockcnt) "lockcnt %p waking up one waiter"
> +
> +# util/qemu-thread-posix.c
> +qemu_mutex_lock(void *lock) "locked mutex %p"
> +qemu_mutex_unlock(void *lock) "unlocked mutex %p"
> -- 
> 2.7.4
> 

Reviewed-by: Fam Zheng 
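
Outside of QEMU's tracing framework, the same idea can be sketched with plain pthreads: emit a message after each successful lock and unlock, guarded by a switch so the output can stay off by default. Everything prefixed with sketch_ below is hypothetical. QEMU's real trace events are generated from the trace-events file instead.

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical runtime switch and counter standing in for a trace backend. */
static int sketch_trace_enabled;
static int sketch_trace_count;

static void sketch_trace(const char *what, void *lock)
{
    if (sketch_trace_enabled) {
        fprintf(stderr, "%s mutex %p\n", what, lock);
        sketch_trace_count++;
    }
}

/* Lock, then report the event only if the lock was actually taken. */
static int sketch_mutex_lock(pthread_mutex_t *mutex)
{
    int err = pthread_mutex_lock(mutex);

    if (err == 0) {
        sketch_trace("locked", (void *)mutex);
    }
    return err;
}

/* Unlock, then report the event only if the unlock succeeded. */
static int sketch_mutex_unlock(pthread_mutex_t *mutex)
{
    int err = pthread_mutex_unlock(mutex);

    if (err == 0) {
        sketch_trace("unlocked", (void *)mutex);
    }
    return err;
}
```

As the commit message warns, leaving such tracing on for every mutex produces a lot of output, so the switch should normally stay off.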



Re: [Qemu-devel] [Qemu-block] [PATCH 17/17] block: Make bdrv_is_allocated_above() byte-based

2017-04-24 Thread Eric Blake
On 04/24/2017 06:06 PM, John Snow wrote:
> 
> 
> On 04/11/2017 06:29 PM, Eric Blake wrote:
>> We are gradually moving away from sector-based interfaces, towards
>> byte-based.  In the common case, allocation is unlikely to ever use
>> values that are not naturally sector-aligned, but it is possible
>> that byte-based values will let us be more precise about allocation
>> at the end of an unaligned file that can do byte-based access.
>>
>> Changing the signature of the function to use int64_t *pnum ensures
>> that the compiler enforces that all callers are updated.  For now,
>> the io.c layer still assert()s that all callers are sector-aligned,
>> but that can be relaxed when a later patch implements byte-based
>> block status.  Therefore, for the most part this patch is just the
>> addition of scaling at the callers followed by inverse scaling at
>> bdrv_is_allocated().  But some code, particularly stream_run(),
>> gets a lot simpler because it no longer has to mess with sectors.
>>

>> +++ b/block/io.c
>> @@ -1930,52 +1930,46 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState 
>> *bs, int64_t offset,
>>  /*
>>   * Given an image chain: ... -> [BASE] -> [INTER1] -> [INTER2] -> [TOP]
>>   *
>> - * Return true if the given sector is allocated in any image between
>> + * Return true if the given offset is allocated in any image between
> 
> perhaps "range" instead of "offset"?
> 
>>   * BASE and TOP (inclusive).  BASE can be NULL to check if the given
>> - * sector is allocated in any image of the chain.  Return false otherwise,
>> + * offset is allocated in any image of the chain.  Return false otherwise,
>>   * or negative errno on failure.

Seems reasonable.


>>  /*
>> - * [sector_num, nb_sectors] is unallocated on top but intermediate
>> - * might have
>> - *
>> - * [sector_num+x, nr_sectors] allocated.
>> + * [offset, bytes] is unallocated on top but intermediate
>> + * might have [offset+x, bytes-x] allocated.
>>   */
>> -if (n > psectors_inter &&
>> +if (n > pnum_inter &&
>>  (intermediate == top ||
>> - sector_num + psectors_inter < intermediate->total_sectors)) {
> 
> 
> 
>> -n = psectors_inter;
>> + offset + pnum_inter < (intermediate->total_sectors *
>> +BDRV_SECTOR_SIZE))) {
> 
> Naive question: not worth using either bdrv_getlength for bytes, or the
> bdrv_nb_sectors helpers?

bdrv_getlength(intermediate) should indeed be the same as
intermediate->total_sectors * BDRV_SECTOR_SIZE (for now - ultimately it
would be nice to track a byte length rather than a sector length for
images). I can make that cleanup for v2.
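
A minimal sketch of that equivalence and of the byte-based clamping test
from the hunk above. The Sketch* names and the 512-byte sector size are
assumptions made for illustration; this is not the code in block/io.c:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to mirror QEMU's 512-byte sector size. */
#define SKETCH_SECTOR_BITS 9
#define SKETCH_SECTOR_SIZE (1LL << SKETCH_SECTOR_BITS)

/* Hypothetical stand-in for an image whose length is tracked in sectors. */
typedef struct SketchImage {
    int64_t total_sectors;
} SketchImage;

/* Length in bytes, derived from the sector count. */
static int64_t sketch_getlength(const SketchImage *img)
{
    assert(img->total_sectors >= 0);
    return img->total_sectors * SKETCH_SECTOR_SIZE;
}

/*
 * Byte-based form of the clamping test: shrink n to pnum_inter only if
 * the intermediate image is the top one, or if the range ends before
 * the end of the intermediate image.
 */
static int64_t sketch_clamp(int64_t n, int64_t offset, int64_t pnum_inter,
                            const SketchImage *intermediate, int is_top)
{
    if (n > pnum_inter &&
        (is_top || offset + pnum_inter < sketch_getlength(intermediate))) {
        n = pnum_inter;
    }
    return n;
}
```

Comparing against a byte length keeps every operand in the same unit, so
the caller no longer has to scale by the sector size at each use.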

> 
> Reviewed-by: John Snow 
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] dns server not working in QEMU using usermode networking (SLIRP)

2017-04-24 Thread FONNEMANN Mark
-----Original Message-----
From: Thomas Huth [mailto:th...@redhat.com] 
Sent: Monday, April 24, 2017 04:00
To: FONNEMANN Mark ; qemu-devel@nongnu.org
Cc: Samuel Thibault 
Subject: Re: [Qemu-devel] dns server not working in QEMU using usermode 
networking (SLIRP)

> IIRC there have been some DNS related fixes shortly before the final 2.9 
> release ... so does it work if you use the final 2.9.0 version instead?

I just confirmed that the problem exists in 2.9 release using 
qemu-system-i386.exe as well.


Re: [Qemu-devel] ARM virt machine boots fail with 14 ioh3420

2017-04-24 Thread Shannon Zhao


On 2017/4/24 18:16, Marcel Apfelbaum wrote:
> On 04/24/2017 01:02 PM, Laszlo Ersek wrote:
>> On 04/14/17 04:41, Shannon Zhao wrote:
>>> Hi Laszlo,
>>>
>>> Thanks a lot for your reply:)
>>>
>>> On 2017/4/14 1:09, Laszlo Ersek wrote:
 Adding Andrea, Ard, Drew and Marcel; and the main qemu list

 On 04/13/17 09:37, Shannon Zhao wrote:
> Hi,
>
> I'm testing the PCIe devices hotplug for ARM virt machine and using
> ioh3420 as root port. I found that below command line could work.
>
> qemu-system-aarch64 -machine virt,accel=kvm,usb=off -cpu host -bios
> QEMU_EFI.fd -m 12288 -smp 8,sockets=8,cores=1,threads=1  -device
> ioh3420,port=0x8,chassis=1,id=pci.1,bus=pcie.0,addr=0x1 -device
> ioh3420,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x2 -device
> ioh3420,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x3 -device
> ioh3420,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x4 -device
> ioh3420,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x5 -device
> ioh3420,port=0xd,chassis=6,id=pci.6,bus=pcie.0,addr=0x6 -device
> ioh3420,port=0xe,chassis=7,id=pci.7,bus=pcie.0,addr=0x7 -device
> ioh3420,port=0xf,chassis=8,id=pci.8,bus=pcie.0,addr=0x8 -device
> ioh3420,port=0x10,chassis=9,id=pci.9,bus=pcie.0,addr=0x9 -device
> ioh3420,port=0x11,chassis=10,id=pci.10,bus=pcie.0,addr=0xa -device
> ioh3420,port=0x12,chassis=11,id=pci.11,bus=pcie.0,addr=0xb -device
> ioh3420,port=0x13,chassis=12,id=pci.12,bus=pcie.0,addr=0xc -device
> ioh3420,port=0x14,chassis=13,id=pci.13,bus=pcie.0,addr=0xd -device
> i82801b11-bridge,id=pci.17,bus=pcie.0,addr=0x11 -device
> pci-bridge,chassis_nr=18,id=pci.18,bus=pci.17,addr=0x0 -device
> usb-ehci,id=usb,bus=pci.18,addr=0x1 -device
> virtio-scsi-pci,id=scsi0,bus=pci.1,addr=0x0,disable-legacy=on,disable-modern=off
>
> -drive
> file=/mnt/sdb/guest.raw,format=raw,if=none,id=drive-scsi0-0-0-0,cache=none,aio=native
>
> -device
> scsi-hd,bus=scsi0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0-0-0-0,id=scsi0-0-0-0,bootindex=1
>
> -netdev tap,id=hostnet1,vhost=on -device
> virtio-net-pci,netdev=hostnet1,id=net1,mac=00:16:3e:2b:cc:e1,bus=pci.2,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet2,vhost=on -device
> virtio-net-pci,netdev=hostnet2,id=net2,mac=00:16:3e:22:29:80,bus=pci.3,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet3,vhost=on -device
> virtio-net-pci,netdev=hostnet3,id=net3,mac=00:16:3e:28:07:9a,bus=pci.4,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet4,vhost=on -device
> virtio-net-pci,netdev=hostnet4,id=net4,mac=00:16:3e:3d:cd:b6,bus=pci.5,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet5,vhost=on -device
> virtio-net-pci,netdev=hostnet5,id=net5,mac=00:16:3e:64:9f:b0,bus=pci.6,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet6,vhost=on -device
> virtio-net-pci,netdev=hostnet6,id=net6,mac=00:16:3e:33:5b:d3,bus=pci.7,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet7,vhost=on -device
> virtio-net-pci,netdev=hostnet7,id=net7,mac=00:16:3e:39:7c:df,bus=pci.8,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet8,vhost=on -device
> virtio-net-pci,netdev=hostnet8,id=net8,mac=00:16:3e:0a:c1:4e,bus=pci.9,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet9,vhost=on -device
> virtio-net-pci,netdev=hostnet9,id=net9,mac=00:16:3e:0a:58:a6,bus=pci.10,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet10,vhost=on -device
> virtio-net-pci,netdev=hostnet10,id=net10,mac=00:16:3e:35:b5:80,bus=pci.11,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet11,vhost=on -device
> virtio-net-pci,netdev=hostnet11,id=net11,mac=00:16:3e:4d:b5:bb,bus=pci.12,addr=0x0,disable-legacy=on,disable-modern=off
>
> -netdev tap,id=hostnet12,vhost=on -device
> virtio-net-pci,netdev=hostnet12,id=net12,mac=00:16:3e:3b:69:e9,bus=pci.13,addr=0x0,disable-legacy=on,disable-modern=off
>
> -nographic
>
> But if I add one more ioh3420 device by appending above command with
> "-device ioh3420,port=0x15,chassis=14,id=pci.14,bus=pcie.0,addr=0xe",
> the guest can't boot. It seems that the firmware doesn't recognize the
> PCIe devices and print "Connect: PciRoot(0x0): Not Found".
>
> I'm using QEMU 2.8.1 and edk2 at commit 36a0d5c. Is there any
> limitation
> of the supported PCIe devices?

 In one sentence: you are running out of (emulated) IO space.

 Aarch64 does not have "IO space", but with QEMU, using the "virt"
 machine type, we emulate 64KB of IO space, mapped to a special MMIO
 range. This is available for PCI resource allocation, for such devices
 that have IO BARs (and for such PCI 

[Qemu-devel] [PATCH v4] qga: Add support network interface statistics in

2017-04-24 Thread ZhiPeng Lu
We can get the network interface statistics inside a virtual machine with
the guest-network-get-interfaces command. This is very useful for
monitoring and analyzing network traffic.
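
A minimal sketch of the parsing step, assuming the usual /proc/net/dev
layout of eight receive counters followed by eight transmit counters per
interface. SketchNetStats and sketch_parse_netdev_line are illustrative
names, not part of this patch:

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container; the patch itself uses GuestNetworkInterfaceStat. */
typedef struct SketchNetStats {
    long long rx_bytes, rx_packets, rx_errs, rx_dropped;
    long long tx_bytes, tx_packets, tx_errs, tx_dropped;
} SketchNetStats;

/*
 * Parse one line of /proc/net/dev ("  eth0: <8 rx fields> <8 tx fields>").
 * Returns 0 and fills *out if the line describes ifname, -1 otherwise.
 */
static int sketch_parse_netdev_line(const char *line, const char *ifname,
                                    SketchNetStats *out)
{
    const char *colon;
    size_t name_len;
    long long skip[8];          /* counters this sketch does not report */
    SketchNetStats s;

    assert(line && ifname && out);

    /* Interface names are right-aligned, so skip the padding first. */
    while (isspace((unsigned char)*line)) {
        line++;
    }
    colon = strchr(line, ':');
    if (!colon) {
        return -1;              /* header line or malformed input */
    }
    name_len = (size_t)(colon - line);
    if (name_len != strlen(ifname) || strncmp(line, ifname, name_len) != 0) {
        return -1;              /* some other interface */
    }
    if (sscanf(colon + 1,
               "%lld %lld %lld %lld %lld %lld %lld %lld "
               "%lld %lld %lld %lld %lld %lld %lld %lld",
               &s.rx_bytes, &s.rx_packets, &s.rx_errs, &s.rx_dropped,
               &skip[0], &skip[1], &skip[2], &skip[3],
               &s.tx_bytes, &s.tx_packets, &s.tx_errs, &s.tx_dropped,
               &skip[4], &skip[5], &skip[6], &skip[7]) != 16) {
        return -1;              /* fewer than 16 counters on the line */
    }
    *out = s;
    return 0;
}
```

The sketch compares the whole name between the padding and the colon, so
a request for "eth0" does not match a line for "veth0".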

Signed-off-by: ZhiPeng Lu 
Signed-off-by: Daniel P. Berrange 
---
 qga/commands-posix.c | 71 +++-
 qga/qapi-schema.json | 38 +++-
 2 files changed, 107 insertions(+), 2 deletions(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 915df9e..1e35340 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -1638,6 +1638,65 @@ guest_find_interface(GuestNetworkInterfaceList *head,
 return head;
 }
 
+ static int guest_get_network_stats(const char *path,
+   GuestNetworkInterfaceStat *stats)
+{
+int path_len;
+char const *devinfo = "/proc/net/dev";
+FILE *fp;
+char *line = NULL, *colon;
+size_t n;
+fp = fopen(devinfo, "r");
+if (!fp) {
+return -1;
+}
+path_len = strlen(path);
+while (getline(&line, &n, fp) != -1) {
+long long dummy;
+long long rx_bytes;
+long long rx_packets;
+long long rx_errs;
+long long rx_dropped;
+long long tx_bytes;
+long long tx_packets;
+long long tx_errs;
+long long tx_dropped;
+
+  /* The line looks like:
+   *"   eth0:..."
+   *Split it at the colon.
+   */
+colon = strchr(line, ':');
+if (!colon) {
+continue;
+}
+*colon = '\0';
+if (colon - path_len >= line && strcmp(colon - path_len, path) == 0) {
+if (sscanf(colon + 1,
+"%lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld %lld",
+  &rx_bytes, &rx_packets, &rx_errs, &rx_dropped,
+  &dummy, &dummy, &dummy, &dummy,
+  &tx_bytes, &tx_packets, &tx_errs, &tx_dropped,
+  &dummy, &dummy, &dummy, &dummy) != 16) {
+continue;
+}
+stats->rx_bytes = rx_bytes;
+stats->rx_packets = rx_packets;
+stats->rx_errs = rx_errs;
+stats->rx_dropped = rx_dropped;
+stats->tx_bytes = tx_bytes;
+stats->tx_packets = tx_packets;
+stats->tx_errs = tx_errs;
+stats->tx_dropped = tx_dropped;
+fclose(fp);
+return 0;
+}
+}
+fclose(fp);
+g_debug("/proc/net/dev: Interface not found");
+return -1;
+}
+
 /*
  * Build information about guest interfaces
  */
@@ -1654,6 +1713,7 @@ GuestNetworkInterfaceList 
*qmp_guest_network_get_interfaces(Error **errp)
 for (ifa = ifap; ifa; ifa = ifa->ifa_next) {
 GuestNetworkInterfaceList *info;
 GuestIpAddressList **address_list = NULL, *address_item = NULL;
+GuestNetworkInterfaceStat  *interface_stat = NULL;
 char addr4[INET_ADDRSTRLEN];
 char addr6[INET6_ADDRSTRLEN];
 int sock;
@@ -1773,7 +1833,16 @@ GuestNetworkInterfaceList 
*qmp_guest_network_get_interfaces(Error **errp)
 
 info->value->has_ip_addresses = true;
 
-
+if (!info->value->has_statistics) {
+interface_stat = g_malloc0(sizeof(*interface_stat));
+if (guest_get_network_stats(info->value->name,
+interface_stat) == -1) {
+error_setg_errno(errp, errno, "guest_get_network_stats failed");
+goto error;
+}
+info->value->statistics = interface_stat;
+info->value->has_statistics = true;
+}
 }
 
 freeifaddrs(ifap);
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index a02dbf2..948219b 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -635,6 +635,38 @@
'prefix': 'int'} }
 
 ##
+# @GuestNetworkInterfaceStat:
+#
+# @rx-bytes: total bytes received
+#
+# @rx-packets: total packets received
+#
+# @rx-errs: bad packets received
+#
+# @rx-dropped: receiver dropped packets
+#
+# @tx-bytes: total bytes transmitted
+#
+# @tx-packets: total packets transmitted
+#
+# @tx-errs: packet transmit problems
+#
+# @tx-dropped: dropped packets transmitted
+#
+# Since: 2.10
+##
+{ 'struct': 'GuestNetworkInterfaceStat',
+  'data': {'rx-bytes': 'uint64',
+'rx-packets': 'uint64',
+'rx-errs': 'uint64',
+'rx-dropped': 'uint64',
+'tx-bytes': 'uint64',
+'tx-packets': 'uint64',
+'tx-errs': 'uint64',
+'tx-dropped': 'uint64'
+   } }
+
+##
 # @GuestNetworkInterface:
 #
 # @name: The name of interface for which info are being delivered
@@ -643,12 +675,16 @@
 #
 # @ip-addresses: List of addresses assigned to @name
 #
+# @statistics: various statistic counters related to @name
+# (since 2.10)
+#
 # Since: 1.1
 ##
 { 'struct': 'GuestNetworkInterface',
   'data': {'name': 'str',
'*hardware-address': 'str',
-   '*ip-addresses': ['GuestIpAddress'] } }
+   '*ip-addresses': 

Re: [Qemu-devel] [PATCH v2 00/25] qmp: add async command type

2017-04-24 Thread John Snow


On 04/24/2017 03:10 PM, Markus Armbruster wrote:
> With 2.9 out of the way, how can we make progress on this one?
> 

Throw rocks at my window late at night. Refuse to stop until there is
consensus.

> I can see two ways to get asynchronous QMP commands accepted:
> 
> 1. We break QMP compatibility in QEMU 3.0 and convert all long-running
>tasks from "synchronous command + event" to "asynchronous command".
> 
>This is design option 1 quoted below.  *If* we decide to leave
>compatibility behind for 3.0, *and* we decide we like the
>asynchronous sufficiently better to put in the work, we can do it.
> 
>I guess there's nothing to do here until we decide on breaking
>compatibility in 3.0.
> 
> 2. We don't break QMP compatibility, but we add asynchronous commands
>anyway, because we decide that's how we want to do "jobs".
> 
>This is design option 3 quoted below.  As I said, I dislike its lack
>of orthogonality.  But if asynchronous commands help us get jobs
>done, I can bury my dislike.
> 
>I feel this is something you should investigate with John Snow.
>Going through a middleman (me) makes no sense.  So I'm going to dump
>my thoughts, then get out of the way.
> 

I'd like to discuss this on the KVM call, not this week (because that's
... today, when you read this email) but next week. Let me digest the
async QMP proposal before then and we can discuss the merits of either
approach.

I'd like to invite Marc-Andre, Markus, Stefan, and Eric Blake to join;
Jeff Cody might have some good input here too.

I don't know enough about what problems async QMP is trying to solve yet
to have anything resembling an intelligent comment yet. I'll put next
Tuesday as my deadline for not being a deadbeat.

>You need to take care not to get bogged down in the jobs project's
>complexity.  This is really only how to package up jobs for QMP.
> 
>With synchronous commands, the job-creating command creates a job,
>jobs state changes trigger events, and job termination is just
>another state change.  Job control commands interact with the job.
>  
>Events obviously need to carry a job ID.  We can either require the
>user to pass it as argument to the job-creating command (hopefully
>unique), or have the job-creating command pick one (a unique one) and
>return it.
> 
>With asynchronous commands, we could make the asynchronous command
>the job.  The only difference is that job termination triggers the
>command response.  When termination is of no interest to anyone but
>the job's creator, the termination event can be omitted then.
> 

That is, you receive no confirmation that the job is being processed
until it succeeds or fails in the "async QMP-as-jobs" model?

Not even the ACK-return? ('{return: {}}')

>Instead of a job ID, we could use the (user-specified and hopefully
>unique) command ID that ties the command response to the command.
>Perhaps throw in a monitor ID.
> 
>To be honest, I'm not sure asynchronous commands buy us much here.
>But my view is from 10,000 feet, and John might have different ideas.
> 

Withholding comment until I give it a fair shake.

> Rejecting asynchronous QMP commands is of course design option 2 quoted
> below.
> 
> 
> Markus Armbruster  writes:
> 
>> Cc'ing block job people.
>>
>> Marc-André Lureau  writes:
>>
>>> Hi,
>>>
>>> One of initial design goals of QMP was to have "asynchronous command
>>> completion" (http://wiki.qemu.org/Features/QAPI). Unfortunately, that
>>> goal was not fully achieved, and some broken bits left were removed
>>> progressively until commit 65207c59d that removed async command
>>> support.
>>
>> Correct.
>>
>> QMP's initial design stipulated that commands are asynchronous.  The
>> initial implementation made them all synchronous, with very few
>> exceptions (buggy ones, to boot).  Naturally, clients relied on the
>> actual rather than the theoretical behavior, i.e. on getting command
>> replies synchronously, in order.
>>
>>> Note that qmp events are asynchronous messages, and must be handled
>>> appropriately by the client: dispatch both reply and events after a
>>> command is sent for example.
>>
>> Yes.
>>
>>> The benefits of async commands that can be trade-off depending on the
>>> requirements are:
>>>
>>> 1) allow the command handler to re-enter the main loop if the command
>>> cannot be handled synchronously, or if it is long-lasting. This is
>>> currently not possible and make some bugs such as rhbz#1230527 tricky
>>> (see below) to solve.  Furthermore, many QMP commands do IO and could
>>> be considered 'slow' and blocking today.
>>>
>>> 2) allow concurrent commands and events. This mainly implies handling
>>> concurrency in qemu and out-of-order replies for the client. As noted
>>> earlier, a good qmp client already has to handle dispatching of
>>> received messages (reply and events).
>>
>> We 

Re: [Qemu-devel] [PATCH 08/11] blockjob: group BlockJob transaction functions together

2017-04-24 Thread John Snow


On 04/19/2017 10:42 AM, Paolo Bonzini wrote:
> Yet another pure code movement patch, preparing for the next change.
> 
> Signed-off-by: Paolo Bonzini 

Reviewed-by: John Snow 

> ---
> v1->v2: split out of block_job_completed_txn_abort patch [John]
> 
>  blockjob.c | 128 
> ++---
>  1 file changed, 64 insertions(+), 64 deletions(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 1756153..5d70d25 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -91,6 +91,39 @@ BlockJob *block_job_get(const char *id)
>  return NULL;
>  }
>  
> +BlockJobTxn *block_job_txn_new(void)
> +{
> +BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
> +QLIST_INIT(&txn->jobs);
> +txn->refcnt = 1;
> +return txn;
> +}
> +
> +static void block_job_txn_ref(BlockJobTxn *txn)
> +{
> +txn->refcnt++;
> +}
> +
> +void block_job_txn_unref(BlockJobTxn *txn)
> +{
> +if (txn && --txn->refcnt == 0) {
> +g_free(txn);
> +}
> +}
> +
> +void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
> +{
> +if (!txn) {
> +return;
> +}
> +
> +assert(!job->txn);
> +job->txn = txn;
> +
> +QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
> +block_job_txn_ref(txn);
> +}
> +
>  static void block_job_pause(BlockJob *job)
>  {
>  job->pause_count++;
> @@ -317,6 +350,37 @@ static void block_job_cancel_async(BlockJob *job)
>  job->cancelled = true;
>  }
>  
> +static int block_job_finish_sync(BlockJob *job,
> + void (*finish)(BlockJob *, Error **errp),
> + Error **errp)
> +{
> +Error *local_err = NULL;
> +int ret;
> +
> +assert(blk_bs(job->blk)->job == job);
> +
> +block_job_ref(job);
> +
> +finish(job, &local_err);
> +if (local_err) {
> +error_propagate(errp, local_err);
> +block_job_unref(job);
> +return -EBUSY;
> +}
> +/* block_job_drain calls block_job_enter, and it should be enough to
> + * induce progress until the job completes or moves to the main thread.
> +*/
> +while (!job->deferred_to_main_loop && !job->completed) {
> +block_job_drain(job);
> +}
> +while (!job->completed) {
> +aio_poll(qemu_get_aio_context(), true);
> +}
> +ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
> +block_job_unref(job);
> +return ret;
> +}
> +
>  static void block_job_completed_txn_abort(BlockJob *job)
>  {
>  AioContext *ctx;
> @@ -440,37 +504,6 @@ void block_job_cancel(BlockJob *job)
>  }
>  }
>  
> -static int block_job_finish_sync(BlockJob *job,
> - void (*finish)(BlockJob *, Error **errp),
> - Error **errp)
> -{
> -Error *local_err = NULL;
> -int ret;
> -
> -assert(blk_bs(job->blk)->job == job);
> -
> -block_job_ref(job);
> -
> -finish(job, &local_err);
> -if (local_err) {
> -error_propagate(errp, local_err);
> -block_job_unref(job);
> -return -EBUSY;
> -}
> -/* block_job_drain calls block_job_enter, and it should be enough to
> - * induce progress until the job completes or moves to the main thread.
> -*/
> -while (!job->deferred_to_main_loop && !job->completed) {
> -block_job_drain(job);
> -}
> -while (!job->completed) {
> -aio_poll(qemu_get_aio_context(), true);
> -}
> -ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
> -block_job_unref(job);
> -return ret;
> -}
> -
>  /* A wrapper around block_job_cancel() taking an Error ** parameter so it 
> may be
>   * used with block_job_finish_sync() without the need for (rather nasty)
>   * function pointer casts there. */
> @@ -883,36 +916,3 @@ void block_job_defer_to_main_loop(BlockJob *job,
>  aio_bh_schedule_oneshot(qemu_get_aio_context(),
>  block_job_defer_to_main_loop_bh, data);
>  }
> -
> -BlockJobTxn *block_job_txn_new(void)
> -{
> -BlockJobTxn *txn = g_new0(BlockJobTxn, 1);
> -QLIST_INIT(&txn->jobs);
> -txn->refcnt = 1;
> -return txn;
> -}
> -
> -static void block_job_txn_ref(BlockJobTxn *txn)
> -{
> -txn->refcnt++;
> -}
> -
> -void block_job_txn_unref(BlockJobTxn *txn)
> -{
> -if (txn && --txn->refcnt == 0) {
> -g_free(txn);
> -}
> -}
> -
> -void block_job_txn_add_job(BlockJobTxn *txn, BlockJob *job)
> -{
> -if (!txn) {
> -return;
> -}
> -
> -assert(!job->txn);
> -job->txn = txn;
> -
> -QLIST_INSERT_HEAD(&txn->jobs, job, txn_list);
> -block_job_txn_ref(txn);
> -}
> 



Re: [Qemu-devel] [RFC PATCH] pci: deassert intx when pci device unrealize

2017-04-24 Thread Michael S. Tsirkin
On Mon, Apr 24, 2017 at 09:12:29PM +0800, Herongguang (Stephen) wrote:
> If a pci device is not reset by VM (by writing into config space)
> and unplugged by VM, after that when VM reboots, qemu may assert:
> pcibus_reset: Assertion `bus->irq_count[i] == 0' failed
> 
> Signed-off-by: herongguang 

Good grief, I can't believe we have had this bug for so long.
Thanks a lot for finding this.

Pls Cc stable on this patch.

> ---
> 
> Is there need to call pci_do_device_reset()?

I don't think so, why?

> ---
>  hw/pci/pci.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/pci/pci.c b/hw/pci/pci.c
> index 259483b..afe6397 100644
> --- a/hw/pci/pci.c
> +++ b/hw/pci/pci.c
> @@ -1078,6 +1078,7 @@ static void pci_qdev_unrealize(DeviceState *dev, Error 
> **errp)
> 
>  pci_unregister_io_regions(pci_dev);
>  pci_del_option_rom(pci_dev);
> +pci_device_deassert_intx(pci_dev);

I think the right place to call this is after class exit
when we know no one will try to assert the interrupt.
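
To illustrate the ordering, here is a minimal sketch with heavily
simplified, hypothetical stand-ins for the bus and device (the Sketch*
types and helpers are assumptions, not the structures in hw/pci/pci.c).
The device's exit hook runs first, then any interrupt lines that are
still asserted are lowered, so the bus-level counters are back at zero
before the next reset:

```c
#include <assert.h>

#define SKETCH_PCI_NUM_PINS 4

/* Hypothetical bus: counts how many devices assert each INTx pin. */
typedef struct SketchBus {
    int irq_count[SKETCH_PCI_NUM_PINS];
} SketchBus;

/* Hypothetical device with an optional class exit hook. */
typedef struct SketchDev {
    SketchBus *bus;
    unsigned irq_state;                   /* bit i set: pin i asserted */
    void (*exit)(struct SketchDev *dev);  /* may be NULL */
} SketchDev;

/* Raise or lower one pin, keeping the bus counter in sync. */
static void sketch_set_intx(SketchDev *dev, int pin, int level)
{
    int old;

    assert(pin >= 0 && pin < SKETCH_PCI_NUM_PINS);
    old = (dev->irq_state >> pin) & 1;
    level = !!level;
    if (old == level) {
        return;
    }
    dev->irq_state ^= 1u << pin;
    dev->bus->irq_count[pin] += level - old;
}

static void sketch_deassert_intx(SketchDev *dev)
{
    int pin;

    for (pin = 0; pin < SKETCH_PCI_NUM_PINS; pin++) {
        sketch_set_intx(dev, pin, 0);
    }
}

/* Exit hook of a device that still touches its interrupt while exiting. */
static void sketch_noisy_exit(SketchDev *dev)
{
    sketch_set_intx(dev, 1, 1);
}

static void sketch_unrealize(SketchDev *dev)
{
    if (dev->exit) {
        dev->exit(dev);
    }
    /* After the exit hook nothing should raise the interrupt again. */
    sketch_deassert_intx(dev);
}

/* Mirrors the condition checked by the assertion at bus reset time. */
static int sketch_bus_is_quiescent(const SketchBus *bus)
{
    int pin;

    for (pin = 0; pin < SKETCH_PCI_NUM_PINS; pin++) {
        if (bus->irq_count[pin] != 0) {
            return 0;
        }
    }
    return 1;
}
```

If the deassert ran before the exit hook, a hook like sketch_noisy_exit
could leave a pin raised and the counter non-zero again.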

> 
>  if (pc->exit) {
>  pc->exit(pci_dev);
> --
> 1.7.12.4



Re: [Qemu-devel] [PULL 0/21] Please pull xen-20170421-tag for 2.10

2017-04-24 Thread Stefano Stabellini
On Mon, 24 Apr 2017, Peter Maydell wrote:
> On 24 April 2017 at 22:25, Stefano Stabellini  wrote:
> > diff --git a/hw/9pfs/xen-9pfs.h b/hw/9pfs/xen-9pfs.h
> > new file mode 100644
> > index 000..18f0ec0
> > --- /dev/null
> > +++ b/hw/9pfs/xen-9pfs.h
> > @@ -0,0 +1,14 @@
> > +/*
> > + * Xen 9p backend
> > + *
> > + * Copyright Aporeto 2017
> > + *
> > + * Authors:
> > + *  Stefano Stabellini 
> > + *
> > + */
> 
> Trivial file, but I prefer it if we have a brief license
> statement in every file, just to be clear (it might
> accumulate more code later).

Sure

> > +
> > +#include 
> > +#include "hw/xen/io/ring.h"
> > +
> > +DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);
> 
> Is it worth a comment to dissuade people from thinking they can
> inline the file back into xen-9p-backend.c ?

Thanks for the quick review! Here you go:


diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index 7e962aa..9c7f41a 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -13,12 +13,9 @@
 #include "hw/hw.h"
 #include "hw/9pfs/9p.h"
 #include "hw/xen/xen_backend.h"
-#include "hw/xen/io/ring.h"
+#include "hw/9pfs/xen-9pfs.h"
 #include "qemu/config-file.h"
 #include "fsdev/qemu-fsdev.h"
-#include 
-
-DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);
 
 #define VERSIONS "1"
 #define MAX_RINGS 8
diff --git a/hw/9pfs/xen-9pfs.h b/hw/9pfs/xen-9pfs.h
new file mode 100644
index 000..6e33d77
--- /dev/null
+++ b/hw/9pfs/xen-9pfs.h
@@ -0,0 +1,21 @@
+/*
+ * Xen 9p backend
+ *
+ * Copyright Aporeto 2017
+ *
+ * Authors:
+ *  Stefano Stabellini 
+ *
+ * This work is licensed under the terms of the GNU GPL version 2.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include 
+#include "hw/xen/io/ring.h"
+
+/*
+ * Do not merge into xen-9p-backend.c: clang doesn't allow unused static
+ * inline functions in c files.
+ */
+DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);



Re: [Qemu-devel] [PATCH 05/11] blockjob: separate monitor and blockjob APIs

2017-04-24 Thread John Snow


On 04/19/2017 10:42 AM, Paolo Bonzini wrote:
> We have two different headers for block job operations, blockjob.h
> and blockjob_int.h.  The former contains APIs called by the monitor,
> the latter contains APIs called by the block job drivers and the
> block layer itself.
> 
> Keep the two APIs separate in the blockjob.c file too.  This will
> be useful when transitioning away from the AioContext lock, because
> there will be locking policies for the two categories, too---the
> monitor will have to call new block_job_lock/unlock APIs, while blockjob
> APIs will take care of this for the users.
> 
> Signed-off-by: Paolo Bonzini 

Reviewed-by: John Snow 

> ---
> v1->v2: move blockjob_create in the blockjob_int.h category,
> rewrite commit message [John]

grazie ~

> 
>  blockjob.c | 390 
> -
>  1 file changed, 205 insertions(+), 185 deletions(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 85ad610..140e176 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -55,6 +55,21 @@ struct BlockJobTxn {
>  
>  static QLIST_HEAD(, BlockJob) block_jobs = 
> QLIST_HEAD_INITIALIZER(block_jobs);
>  
> +/*
> + * The block job API is composed of two categories of functions.
> + *
> + * The first includes functions used by the monitor.  The monitor is
> + * peculiar in that it accesses the block job list with block_job_get, and
> + * therefore needs consistency across block_job_get and the actual operation
> + * (e.g. block_job_set_speed).  The consistency is achieved with
> + * aio_context_acquire/release.  These functions are declared in blockjob.h.
> + *
> + * The second includes functions used by the block job drivers and sometimes
> + * by the core block layer.  These do not care about locking, because the
> + * whole coroutine runs under the AioContext lock, and are declared in
> + * blockjob_int.h.
> + */
> +
>  BlockJob *block_job_next(BlockJob *job)
>  {
>  if (!job) {
> @@ -216,90 +231,6 @@ int block_job_add_bdrv(BlockJob *job, const char *name, 
> BlockDriverState *bs,
>  return 0;
>  }
>  
> -void *block_job_create(const char *job_id, const BlockJobDriver *driver,
> -   BlockDriverState *bs, uint64_t perm,
> -   uint64_t shared_perm, int64_t speed, int flags,
> -   BlockCompletionFunc *cb, void *opaque, Error **errp)
> -{
> -BlockBackend *blk;
> -BlockJob *job;
> -int ret;
> -
> -if (bs->job) {
> -error_setg(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
> -return NULL;
> -}
> -
> -if (job_id == NULL && !(flags & BLOCK_JOB_INTERNAL)) {
> -job_id = bdrv_get_device_name(bs);
> -if (!*job_id) {
> -error_setg(errp, "An explicit job ID is required for this node");
> -return NULL;
> -}
> -}
> -
> -if (job_id) {
> -if (flags & BLOCK_JOB_INTERNAL) {
> -error_setg(errp, "Cannot specify job ID for internal block job");
> -return NULL;
> -}
> -
> -if (!id_wellformed(job_id)) {
> -error_setg(errp, "Invalid job ID '%s'", job_id);
> -return NULL;
> -}
> -
> -if (block_job_get(job_id)) {
> -error_setg(errp, "Job ID '%s' already in use", job_id);
> -return NULL;
> -}
> -}
> -
> -blk = blk_new(perm, shared_perm);
> -ret = blk_insert_bs(blk, bs, errp);
> -if (ret < 0) {
> -blk_unref(blk);
> -return NULL;
> -}
> -
> -job = g_malloc0(driver->instance_size);
> -job->driver= driver;
> -job->id= g_strdup(job_id);
> -job->blk   = blk;
> -job->cb= cb;
> -job->opaque= opaque;
> -job->busy  = false;
> -job->paused= true;
> -job->pause_count   = 1;
> -job->refcnt= 1;
> -
> -error_setg(&job->blocker, "block device is in use by block job: %s",
> -   BlockJobType_lookup[driver->job_type]);
> -block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
> -bs->job = job;
> -
> -blk_set_dev_ops(blk, &block_job_dev_ops, job);
> -bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
> -
> -QLIST_INSERT_HEAD(&block_jobs, job, job_list);
> -
> -blk_add_aio_context_notifier(blk, block_job_attached_aio_context,
> - block_job_detach_aio_context, job);
> -
> -/* Only set speed when necessary to avoid NotSupported error */
> -if (speed != 0) {
> -Error *local_err = NULL;
> -
> -block_job_set_speed(job, speed, &local_err);
> -if (local_err) {
> -block_job_unref(job);
> -error_propagate(errp, local_err);
> -return NULL;
> -}
> -}
> -return job;
> -}
> -
>  bool block_job_is_internal(BlockJob *job)
>  {
>  return (job->id == NULL);
> @@ -334,11 +265,6 @@ void 

Re: [Qemu-devel] [PATCH 01/11] blockjob: remove unnecessary check

2017-04-24 Thread John Snow
Worth a resend to CC qemu-block, include a cover letter, and all the
usual amenities?

On 04/19/2017 10:42 AM, Paolo Bonzini wrote:
> !job is always checked prior to the call, drop it from here.
> 
> Reviewed-by: Stefan Hajnoczi 
> Signed-off-by: Paolo Bonzini 
> ---
>  blockjob.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/blockjob.c b/blockjob.c
> index 6e48932..23022b3 100644
> --- a/blockjob.c
> +++ b/blockjob.c
> @@ -480,7 +480,7 @@ static bool block_job_should_pause(BlockJob *job)
>  
>  bool block_job_user_paused(BlockJob *job)
>  {
> -return job ? job->user_paused : 0;
> +return job->user_paused;
>  }
>  
>  void coroutine_fn block_job_pause_point(BlockJob *job)
> 

Reviewed-by: John Snow 



Re: [Qemu-devel] [Qemu-block] [PATCH 17/17] block: Make bdrv_is_allocated_above() byte-based

2017-04-24 Thread John Snow


On 04/11/2017 06:29 PM, Eric Blake wrote:
> We are gradually moving away from sector-based interfaces, towards
> byte-based.  In the common case, allocation is unlikely to ever use
> values that are not naturally sector-aligned, but it is possible
> that byte-based values will let us be more precise about allocation
> at the end of an unaligned file that can do byte-based access.
> 
> Changing the signature of the function to use int64_t *pnum ensures
> that the compiler enforces that all callers are updated.  For now,
> the io.c layer still assert()s that all callers are sector-aligned,
> but that can be relaxed when a later patch implements byte-based
> block status.  Therefore, for the most part this patch is just the
> addition of scaling at the callers followed by inverse scaling at
> bdrv_is_allocated().  But some code, particularly stream_run(),
> gets a lot simpler because it no longer has to mess with sectors.
> 
> For ease of review, bdrv_is_allocated() was tackled separately.
> 
> Signed-off-by: Eric Blake 
> ---
>  include/block/block.h |  2 +-
>  block/commit.c| 20 
>  block/io.c| 36 +++-
>  block/mirror.c|  5 -
>  block/replication.c   | 17 -
>  block/stream.c| 21 +
>  qemu-img.c| 10 +++---
>  7 files changed, 56 insertions(+), 55 deletions(-)
> 
> diff --git a/include/block/block.h b/include/block/block.h
> index 8641149..740cb86 100644
> --- a/include/block/block.h
> +++ b/include/block/block.h
> @@ -425,7 +425,7 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
>  int bdrv_is_allocated(BlockDriverState *bs, int64_t offset, int64_t bytes,
>int64_t *pnum);
>  int bdrv_is_allocated_above(BlockDriverState *top, BlockDriverState *base,
> -int64_t sector_num, int nb_sectors, int *pnum);
> +int64_t offset, int64_t bytes, int64_t *pnum);
> 
>  bool bdrv_is_read_only(BlockDriverState *bs);
>  bool bdrv_is_sg(BlockDriverState *bs);
> diff --git a/block/commit.c b/block/commit.c
> index 4d6bb2a..989de7d 100644
> --- a/block/commit.c
> +++ b/block/commit.c
> @@ -132,7 +132,7 @@ static void coroutine_fn commit_run(void *opaque)
>  int64_t offset;
>  uint64_t delay_ns = 0;
>  int ret = 0;
> -int n = 0; /* sectors */
> +int64_t n = 0; /* bytes */
>  void *buf = NULL;
>  int bytes_written = 0;
>  int64_t base_len;
> @@ -157,7 +157,7 @@ static void coroutine_fn commit_run(void *opaque)
> 
>  buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
> 
> -for (offset = 0; offset < s->common.len; offset += n * BDRV_SECTOR_SIZE) {
> +for (offset = 0; offset < s->common.len; offset += n) {
>  bool copy;
> 
>  /* Note that even when no rate limit is applied we need to yield
> @@ -169,15 +169,12 @@ static void coroutine_fn commit_run(void *opaque)
>  }
>  /* Copy if allocated above the base */
>  ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base),
> -  offset / BDRV_SECTOR_SIZE,
> -  COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
> -  &n);
> +  offset, COMMIT_BUFFER_SIZE, &n);
>  copy = (ret == 1);
> -trace_commit_one_iteration(s, offset, n * BDRV_SECTOR_SIZE, ret);
> +trace_commit_one_iteration(s, offset, n, ret);
>  if (copy) {
> -ret = commit_populate(s->top, s->base, offset,
> -  n * BDRV_SECTOR_SIZE, buf);
> -bytes_written += n * BDRV_SECTOR_SIZE;
> +ret = commit_populate(s->top, s->base, offset, n, buf);
> +bytes_written += n;
>  }
>  if (ret < 0) {
>  BlockErrorAction action =
> @@ -190,11 +187,10 @@ static void coroutine_fn commit_run(void *opaque)
>  }
>  }
>  /* Publish progress */
> -s->common.offset += n * BDRV_SECTOR_SIZE;
> +s->common.offset += n;
> 
>  if (copy && s->common.speed) {
> -delay_ns = ratelimit_calculate_delay(&s->limit,
> - n * BDRV_SECTOR_SIZE);
> +delay_ns = ratelimit_calculate_delay(&s->limit, n);
>  }
>  }
> 
> diff --git a/block/io.c b/block/io.c
> index 438a493..9218329 100644
> --- a/block/io.c
> +++ b/block/io.c
> @@ -1930,52 +1930,46 @@ int coroutine_fn bdrv_is_allocated(BlockDriverState 
> *bs, int64_t offset,
>  /*
>   * Given an image chain: ... -> [BASE] -> [INTER1] -> [INTER2] -> [TOP]
>   *
> - * Return true if the given sector is allocated in any image between
> + * Return true if the given offset is allocated in any image between

perhaps "range" instead of "offset"?

>   * BASE and TOP (inclusive).  BASE can be NULL to check if the 

Re: [Qemu-devel] [PATCH v5 07/13] vfio/ccw: vfio based subchannel passthrough driver

2017-04-24 Thread Alex Williamson
On Mon, 24 Apr 2017 16:43:38 -0600
Alex Williamson  wrote:

> On Wed, 12 Apr 2017 07:21:09 +0200
> Dong Jia Shi  wrote:
> 
> > From: Xiao Feng Ren 
> > 
> > We use the IOMMU_TYPE1 of VFIO to realize the subchannels
> > passthrough, implement a vfio based subchannels passthrough
> > driver called "vfio-ccw".
> > 
> > Support qemu parameters in the style of:
> > "-device vfio-ccw,sysfsdev=$mdev_file_path,devno=xx.x.'
> > 
> > Signed-off-by: Xiao Feng Ren 
> > Signed-off-by: Dong Jia Shi 
> > ---
> >  default-configs/s390x-softmmu.mak |   1 +
> >  hw/vfio/Makefile.objs |   1 +
> >  hw/vfio/ccw.c | 207 
> > ++
> >  include/hw/vfio/vfio-common.h |   1 +
> >  4 files changed, 210 insertions(+)
> >  create mode 100644 hw/vfio/ccw.c
> > 
> > diff --git a/default-configs/s390x-softmmu.mak 
> > b/default-configs/s390x-softmmu.mak
> > index 36e15de..5576b0a 100644
> > --- a/default-configs/s390x-softmmu.mak
> > +++ b/default-configs/s390x-softmmu.mak
> > @@ -4,4 +4,5 @@ CONFIG_VIRTIO=y
> >  CONFIG_SCLPCONSOLE=y
> >  CONFIG_S390_FLIC=y
> >  CONFIG_S390_FLIC_KVM=$(CONFIG_KVM)
> > +CONFIG_VFIO_CCW=$(CONFIG_LINUX)
> >  CONFIG_WDT_DIAG288=y
> > diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
> > index 05e7fbb..c3ab909 100644
> > --- a/hw/vfio/Makefile.objs
> > +++ b/hw/vfio/Makefile.objs
> > @@ -1,6 +1,7 @@
> >  ifeq ($(CONFIG_LINUX), y)
> >  obj-$(CONFIG_SOFTMMU) += common.o
> >  obj-$(CONFIG_PCI) += pci.o pci-quirks.o
> > +obj-$(CONFIG_VFIO_CCW) += ccw.o
> >  obj-$(CONFIG_SOFTMMU) += platform.o
> >  obj-$(CONFIG_VFIO_XGMAC) += calxeda-xgmac.o
> >  obj-$(CONFIG_VFIO_AMD_XGBE) += amd-xgbe.o
> > diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> > new file mode 100644
> > index 000..c491bee
> > --- /dev/null
> > +++ b/hw/vfio/ccw.c
> > @@ -0,0 +1,207 @@
> > +/*
> > + * vfio based subchannel assignment support
> > + *
> > + * Copyright 2017 IBM Corp.
> > + * Author(s): Dong Jia Shi 
> > + *Xiao Feng Ren 
> > + *Pierre Morel 
> > + *
> > + * This work is licensed under the terms of the GNU GPL, version 2 or(at
> > + * your option) any version. See the COPYING file in the top-level
> > + * directory.
> > + */
> > +
> > +#include 
> > +#include 
> > +
> > +#include "qemu/osdep.h"
> > +#include "qapi/error.h"
> > +#include "hw/sysbus.h"
> > +#include "hw/vfio/vfio.h"
> > +#include "hw/vfio/vfio-common.h"
> > +#include "hw/s390x/s390-ccw.h"
> > +#include "hw/s390x/ccw-device.h"
> > +
> > +#define TYPE_VFIO_CCW "vfio-ccw"
> > +typedef struct VFIOCCWDevice {
> > +S390CCWDevice cdev;
> > +VFIODevice vdev;
> > +} VFIOCCWDevice;
> > +
> > +static void vfio_ccw_compute_needs_reset(VFIODevice *vdev)
> > +{
> > +vdev->needs_reset = false;
> > +}
> > +
> > +/*
> > + * We don't need vfio_hot_reset_multi and vfio_eoi operationis for

One more:

s/operationis/operations/

> > + * vfio_ccw device now.
> > + */
> > +struct VFIODeviceOps vfio_ccw_ops = {
> > +.vfio_compute_needs_reset = vfio_ccw_compute_needs_reset,
> > +};
> > +
> > +static void vfio_ccw_reset(DeviceState *dev)
> > +{
> > +CcwDevice *ccw_dev = DO_UPCAST(CcwDevice, parent_obj, dev);
> > +S390CCWDevice *cdev = DO_UPCAST(S390CCWDevice, parent_obj, ccw_dev);
> > +VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev);
> > +
> > +ioctl(vcdev->vdev.fd, VFIO_DEVICE_RESET);
> > +}
> > +
> > +static void vfio_put_device(VFIOCCWDevice *vcdev)
> > +{
> > +g_free(vcdev->vdev.name);
> > +vfio_put_base_device(&vcdev->vdev);
> > +}
> > +
> > +static VFIOGroup *vfio_ccw_get_group(S390CCWDevice *cdev, char **path,
> > + Error **errp)
> > +{
> > +struct stat st;
> > +int groupid;
> > +GError *gerror = NULL;
> > +
> > +/* Check that host subchannel exists. */
> > +path[0] = g_strdup_printf("/sys/bus/css/devices/%x.%x.%04x",
> > +  cdev->hostid.cssid,
> > +  cdev->hostid.ssid,
> > +  cdev->hostid.devid);
> > +if (stat(path[0], &st) < 0) {
> > +error_setg(errp, "vfio: no such host subchannel %s", path[0]);
> > +return NULL;
> > +}
> > +
> > +/* Check that mediated device exists. */
> > +path[1] = g_strdup_printf("%s/%s", path[0], cdev->mdevid);
> > +if (stat(path[0], &st) < 0) {
> > +error_setg(errp, "vfio: no such mediated device %s", path[1]);
> > +return NULL;
> > +}  
> 
> Isn't this all a bit circular since we build the S390CCWDevice based on
> the sysfsdev mdev path?
> 
> > +
> > +/* Get the iommu_group path as the interim variable. */
> > +path[2] = g_strconcat(path[1], "/iommu_group", NULL);
> > +
> > +/* Get the link 

Re: [Qemu-devel] [PATCH v5 06/13] s390x/css: device support for s390-ccw passthrough

2017-04-24 Thread Alex Williamson
On Wed, 12 Apr 2017 07:21:08 +0200
Dong Jia Shi  wrote:

> In order to support subchannel pass-through, we introduce an s390
> subchannel device called "s390-ccw" to hold the real subchannel info.
> The s390-ccw devices inherit from the abstract CcwDevice, which connects
> to the existing virtual-css-bus.
> 
> Signed-off-by: Dong Jia Shi 
> ---
>  hw/s390x/Makefile.objs |   1 +
>  hw/s390x/s390-ccw.c| 134 
> +
>  hw/s390x/s390-ccw.h|  38 ++
>  3 files changed, 173 insertions(+)
>  create mode 100644 hw/s390x/s390-ccw.c
>  create mode 100644 hw/s390x/s390-ccw.h
> 
> diff --git a/hw/s390x/Makefile.objs b/hw/s390x/Makefile.objs
> index 41ac4ec..72a3d37 100644
> --- a/hw/s390x/Makefile.objs
> +++ b/hw/s390x/Makefile.objs
> @@ -13,3 +13,4 @@ obj-y += ccw-device.o
>  obj-y += s390-pci-bus.o s390-pci-inst.o
>  obj-y += s390-skeys.o
>  obj-$(CONFIG_KVM) += s390-skeys-kvm.o
> +obj-y += s390-ccw.o
> diff --git a/hw/s390x/s390-ccw.c b/hw/s390x/s390-ccw.c
> new file mode 100644
> index 000..f3d5ed1
> --- /dev/null
> +++ b/hw/s390x/s390-ccw.c
> @@ -0,0 +1,134 @@
> +/*
> + * s390 CCW Assignment Support
> + *
> + * Copyright 2017 IBM Corp
> + * Author(s): Dong Jia Shi 
> + *Xiao Feng Ren 
> + *Pierre Morel 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2
> + * or (at your option) any later version. See the COPYING file in the
> + * top-level directory.
> + */
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "hw/sysbus.h"
> +#include "libgen.h"
> +#include "hw/s390x/css.h"
> +#include "hw/s390x/css-bridge.h"
> +#include "s390-ccw.h"
> +
> +static void s390_ccw_get_dev_info(S390CCWDevice *cdev,
> +  char *sysfsdev,
> +  Error **errp)
> +{
> +char dev_path[PATH_MAX], *tmp;
> +unsigned int cssid, ssid, devid;
> +
> +if (!sysfsdev) {
> +error_setg(errp, "No host device provided");
> +error_append_hint(errp, "Use -vfio-ccw,sysfsdev=PATH_TO_DEVICE\n");

nit, the leading '-' here seems strange, either you're describing
'-device vfio-ccw,...', or only the 'vfio-ccw,...' part.  Maybe I
notice this because sometimes I do accidentally type -vfio-pci instead
of -device vfio-pci.  Thanks,

Alex

> +return;
> +}
> +
> +if (!realpath(sysfsdev, dev_path)) {
> +error_setg(errp, "Host device '%s' not found", sysfsdev);
> +return;
> +}
> +
> +cdev->mdevid = g_strdup(basename(dev_path));
> +
> +tmp = basename(dirname(dev_path));
> +sscanf(tmp, "%2x.%1x.%4x", &cssid, &ssid, &devid);
> +
> +cdev->hostid.cssid = cssid;
> +cdev->hostid.ssid = ssid;
> +cdev->hostid.devid = devid;
> +cdev->hostid.valid = true;
> +}
> +
> +static void s390_ccw_realize(S390CCWDevice *cdev, char *sysfsdev, Error 
> **errp)
> +{
> +CcwDevice *ccw_dev = CCW_DEVICE(cdev);
> +CCWDeviceClass *ck = CCW_DEVICE_GET_CLASS(ccw_dev);
> +DeviceState *parent = DEVICE(ccw_dev);
> +BusState *qbus = qdev_get_parent_bus(parent);
> +VirtualCssBus *cbus = VIRTUAL_CSS_BUS(qbus);
> +SubchDev *sch;
> +int ret;
> +Error *err = NULL;
> +
> +s390_ccw_get_dev_info(cdev, sysfsdev, errp);
> +if (*errp) {
> +return;
> +}
> +
> +sch = css_create_sch(ccw_dev->devno, false, cbus->squash_mcss, errp);
> +if (!sch) {
> +return;
> +}
> +sch->driver_data = cdev;
> +
> +ccw_dev->sch = sch;
> +ret = css_sch_build_schib(sch, &cdev->hostid);
> +if (ret) {
> +error_setg(&err, "%s: Failed to build initial schib: %d",
> +   __func__, ret);
> +goto out_err;
> +}
> +
> +ck->realize(ccw_dev, &err);
> +if (err) {
> +goto out_err;
> +}
> +
> +css_generate_sch_crws(sch->cssid, sch->ssid, sch->schid,
> +  parent->hotplugged, 1);
> +return;
> +
> +out_err:
> +error_propagate(errp, err);
> +css_subch_assign(sch->cssid, sch->ssid, sch->schid, sch->devno, NULL);
> +ccw_dev->sch = NULL;
> +g_free(sch);
> +}
> +
> +static void s390_ccw_unrealize(S390CCWDevice *cdev, Error **errp)
> +{
> +CcwDevice *ccw_dev = CCW_DEVICE(cdev);
> +SubchDev *sch = ccw_dev->sch;
> +
> +if (sch) {
> +css_subch_assign(sch->cssid, sch->ssid, sch->schid, sch->devno, 
> NULL);
> +g_free(sch);
> +ccw_dev->sch = NULL;
> +}
> +
> +g_free(cdev->mdevid);
> +}
> +
> +static void s390_ccw_class_init(ObjectClass *klass, void *data)
> +{
> +DeviceClass *dc = DEVICE_CLASS(klass);
> +S390CCWDeviceClass *cdc = S390_CCW_DEVICE_CLASS(klass);
> +
> +dc->bus_type = TYPE_VIRTUAL_CSS_BUS;
> +cdc->realize = s390_ccw_realize;
> +cdc->unrealize = s390_ccw_unrealize;
> +}
> +
> +static const TypeInfo s390_ccw_info = {

Re: [Qemu-devel] [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete

2017-04-24 Thread Anton Nefedov


On 24/04/2017 21:16, Peter Lieven wrote:

On 24.04.2017 at 18:27, Anton Nefedov wrote:

On 04/21/2017 03:37 PM, Peter Lieven wrote:

On 21.04.2017 at 14:19, Anton Nefedov wrote:

On 04/21/2017 01:44 PM, Peter Lieven wrote:

On 21.04.2017 at 12:04, Anton Nefedov wrote:
On the error path (e.g. an I/O error in one of the coroutines), it is required to
 - wait for the coroutines to complete before cleaning up the common structures
 - re-enter dependent coroutines so that they can finish

Introduced in 2d9187bc65.

Signed-off-by: Anton Nefedov 
---
[..]




And what if we error out in the read path? Wouldn't something like this be
easier?


diff --git a/qemu-img.c b/qemu-img.c
index 22f559a..4ff1085 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1903,6 +1903,16 @@ static int convert_do_copy(ImgConvertState *s)
main_loop_wait(false);
}

+/* on error path we need to enter all coroutines that are still
+ * running before cleaning up common structures */
+if (s->ret) {
+for (i = 0; i < s->num_coroutines; i++) {
+ if (s->co[i]) {
+ qemu_coroutine_enter(s->co[i]);
+ }
+}
+}
+
if (s->compressed && !s->ret) {
/* signal EOF to align */
ret = blk_pwrite_compressed(s->target, 0, NULL, 0);


Peter



It seemed a bit too daring to me to re-enter every coroutine, potentially
including the ones that yielded waiting for I/O completion.
If that's OK, it is certainly easier :)


I think we should enter every coroutine that is still running and have it
terminate correctly. It was my mistake not to do this in the original patch.
Can you check whether the above fixes the test cases that triggered the bug?



Hi, sorry I'm late with the answer.

This segfaults in bdrv_close(). It looks like it tries to complete some I/O
whose coroutine we have already entered and terminated?

(gdb) run
Starting program: /vz/anefedov/qemu-build/us/./qemu-img convert -O qcow2 
./harddisk.hdd.c ./harddisk.hdd
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
[New Thread 0x7fffeac2d700 (LWP 436020)]
[New Thread 0x7fffe4ed6700 (LWP 436021)]
qemu-img: error while reading sector 20480: Input/output error
qemu-img: error while writing sector 19712: Operation now in progress

Program received signal SIGSEGV, Segmentation fault.
aio_co_wake (co=0x0) at /mnt/code/us-qemu/util/async.c:454
454 ctx = atomic_read(&co->ctx);
(gdb) bt
#0  aio_co_wake (co=0x0) at /mnt/code/us-qemu/util/async.c:454
/* [Anton]: thread_pool_co_cb () here */
#1  0x55634629 in thread_pool_completion_bh (opaque=0x55cfe020) at 
/mnt/code/us-qemu/util/thread-pool.c:189
#2  0x55633b31 in aio_bh_call (bh=0x55cfe0f0) at 
/mnt/code/us-qemu/util/async.c:90
#3  aio_bh_poll (ctx=ctx@entry=0x55cee6d0) at 
/mnt/code/us-qemu/util/async.c:118
#4  0x55636f14 in aio_poll (ctx=ctx@entry=0x55cee6d0, 
blocking=<optimized out>) at /mnt/code/us-qemu/util/aio-posix.c:682
#5  0x555c52d4 in bdrv_drain_recurse (bs=bs@entry=0x55d22560) at 
/mnt/code/us-qemu/block/io.c:164
#6  0x555c5aed in bdrv_drained_begin (bs=bs@entry=0x55d22560) at 
/mnt/code/us-qemu/block/io.c:248
#7  0x55581443 in bdrv_close (bs=0x55d22560) at 
/mnt/code/us-qemu/block.c:2909
#8  bdrv_delete (bs=0x55d22560) at /mnt/code/us-qemu/block.c:3100
#9  bdrv_unref (bs=0x55d22560) at /mnt/code/us-qemu/block.c:4087
#10 0x555baf44 in blk_remove_bs (blk=blk@entry=0x55d22380) at 
/mnt/code/us-qemu/block/block-backend.c:552
#11 0x555bb173 in blk_delete (blk=0x55d22380) at 
/mnt/code/us-qemu/block/block-backend.c:238
#12 blk_unref (blk=blk@entry=0x55d22380) at 
/mnt/code/us-qemu/block/block-backend.c:282
#13 0x5557a22c in img_convert (argc=<optimized out>, argv=<optimized out>) at /mnt/code/us-qemu/qemu-img.c:2359
#14 0x55574189 in main (argc=5, argv=0x7fffe4a0) at 
/mnt/code/us-qemu/qemu-img.c:4464



Peter



/Anton



it seems that this is a bit tricky, can you share how your test case works?

thanks,
peter



The way I tested today was basically:

diff --git a/qemu-img.c b/qemu-img.c
index 4425aaa..3d2d506 100644
--- a/qemu-img.c
+++ b/qemu-img.c
@@ -1788,6 +1788,10 @@ static void coroutine_fn convert_co_do_copy(void 
*opaque)


 if (status == BLK_DATA) {
 ret = convert_co_read(s, sector_num, n, buf);
+const uint64_t fsector = 10*1024*1024/512;
+if (sector_num <= fsector && fsector < sector_num+n) {
+ret = -EIO;
+}
 if (ret < 0) {
 error_report("error while reading sector %" PRId64
  ": %s", sector_num, strerror(-ret));
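
The fault injection in the diff above boils down to a range check: fail any
read whose sector range covers the sector at byte offset 10 MiB. Here is a
minimal standalone sketch of that check. The names demo_fault_sector and
demo_request_hits are made up for illustration and are not part of qemu-img;
the 512-byte sector size is taken from the "/512" in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_SECTOR_SIZE 512

/* Hypothetical helper: sector index that contains the given byte offset. */
static int64_t demo_fault_sector(int64_t fault_offset_bytes)
{
    assert(fault_offset_bytes >= 0);
    return fault_offset_bytes / DEMO_SECTOR_SIZE;
}

/*
 * Hypothetical helper: true if the request covering the half-open sector
 * range [sector_num, sector_num + n) includes fsector. This is the same
 * condition as in the test diff.
 */
static bool demo_request_hits(int64_t sector_num, int64_t n, int64_t fsector)
{
    return sector_num <= fsector && fsector < sector_num + n;
}
```

With a 10 MiB offset the target is sector 20480, which matches the
"error while reading sector 20480" line in the gdb session above.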




Re: [Qemu-devel] [PATCH v5 07/13] vfio/ccw: vfio based subchannel passthrough driver

2017-04-24 Thread Alex Williamson
On Wed, 12 Apr 2017 07:21:09 +0200
Dong Jia Shi  wrote:

> From: Xiao Feng Ren 
> 
> We use the IOMMU_TYPE1 of VFIO to realize the subchannels
> passthrough, implement a vfio based subchannels passthrough
> driver called "vfio-ccw".
> 
> Support qemu parameters in the style of:
> "-device vfio-ccw,sysfsdev=$mdev_file_path,devno=xx.x.'
> 
> Signed-off-by: Xiao Feng Ren 
> Signed-off-by: Dong Jia Shi 
> ---
>  default-configs/s390x-softmmu.mak |   1 +
>  hw/vfio/Makefile.objs |   1 +
>  hw/vfio/ccw.c | 207 
> ++
>  include/hw/vfio/vfio-common.h |   1 +
>  4 files changed, 210 insertions(+)
>  create mode 100644 hw/vfio/ccw.c
> 
> diff --git a/default-configs/s390x-softmmu.mak 
> b/default-configs/s390x-softmmu.mak
> index 36e15de..5576b0a 100644
> --- a/default-configs/s390x-softmmu.mak
> +++ b/default-configs/s390x-softmmu.mak
> @@ -4,4 +4,5 @@ CONFIG_VIRTIO=y
>  CONFIG_SCLPCONSOLE=y
>  CONFIG_S390_FLIC=y
>  CONFIG_S390_FLIC_KVM=$(CONFIG_KVM)
> +CONFIG_VFIO_CCW=$(CONFIG_LINUX)
>  CONFIG_WDT_DIAG288=y
> diff --git a/hw/vfio/Makefile.objs b/hw/vfio/Makefile.objs
> index 05e7fbb..c3ab909 100644
> --- a/hw/vfio/Makefile.objs
> +++ b/hw/vfio/Makefile.objs
> @@ -1,6 +1,7 @@
>  ifeq ($(CONFIG_LINUX), y)
>  obj-$(CONFIG_SOFTMMU) += common.o
>  obj-$(CONFIG_PCI) += pci.o pci-quirks.o
> +obj-$(CONFIG_VFIO_CCW) += ccw.o
>  obj-$(CONFIG_SOFTMMU) += platform.o
>  obj-$(CONFIG_VFIO_XGMAC) += calxeda-xgmac.o
>  obj-$(CONFIG_VFIO_AMD_XGBE) += amd-xgbe.o
> diff --git a/hw/vfio/ccw.c b/hw/vfio/ccw.c
> new file mode 100644
> index 000..c491bee
> --- /dev/null
> +++ b/hw/vfio/ccw.c
> @@ -0,0 +1,207 @@
> +/*
> + * vfio based subchannel assignment support
> + *
> + * Copyright 2017 IBM Corp.
> + * Author(s): Dong Jia Shi 
> + *Xiao Feng Ren 
> + *Pierre Morel 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or(at
> + * your option) any version. See the COPYING file in the top-level
> + * directory.
> + */
> +
> +#include 
> +#include 
> +
> +#include "qemu/osdep.h"
> +#include "qapi/error.h"
> +#include "hw/sysbus.h"
> +#include "hw/vfio/vfio.h"
> +#include "hw/vfio/vfio-common.h"
> +#include "hw/s390x/s390-ccw.h"
> +#include "hw/s390x/ccw-device.h"
> +
> +#define TYPE_VFIO_CCW "vfio-ccw"
> +typedef struct VFIOCCWDevice {
> +S390CCWDevice cdev;
> +VFIODevice vdev;
> +} VFIOCCWDevice;
> +
> +static void vfio_ccw_compute_needs_reset(VFIODevice *vdev)
> +{
> +vdev->needs_reset = false;
> +}
> +
> +/*
> + * We don't need vfio_hot_reset_multi and vfio_eoi operationis for
> + * vfio_ccw device now.
> + */
> +struct VFIODeviceOps vfio_ccw_ops = {
> +.vfio_compute_needs_reset = vfio_ccw_compute_needs_reset,
> +};
> +
> +static void vfio_ccw_reset(DeviceState *dev)
> +{
> +CcwDevice *ccw_dev = DO_UPCAST(CcwDevice, parent_obj, dev);
> +S390CCWDevice *cdev = DO_UPCAST(S390CCWDevice, parent_obj, ccw_dev);
> +VFIOCCWDevice *vcdev = DO_UPCAST(VFIOCCWDevice, cdev, cdev);
> +
> +ioctl(vcdev->vdev.fd, VFIO_DEVICE_RESET);
> +}
> +
> +static void vfio_put_device(VFIOCCWDevice *vcdev)
> +{
> +g_free(vcdev->vdev.name);
> +vfio_put_base_device(&vcdev->vdev);
> +}
> +
> +static VFIOGroup *vfio_ccw_get_group(S390CCWDevice *cdev, char **path,
> + Error **errp)
> +{
> +struct stat st;
> +int groupid;
> +GError *gerror = NULL;
> +
> +/* Check that host subchannel exists. */
> +path[0] = g_strdup_printf("/sys/bus/css/devices/%x.%x.%04x",
> +  cdev->hostid.cssid,
> +  cdev->hostid.ssid,
> +  cdev->hostid.devid);
> +if (stat(path[0], &st) < 0) {
> +error_setg(errp, "vfio: no such host subchannel %s", path[0]);
> +return NULL;
> +}
> +
> +/* Check that mediated device exists. */
> +path[1] = g_strdup_printf("%s/%s", path[0], cdev->mdevid);
> +if (stat(path[0], &st) < 0) {
> +error_setg(errp, "vfio: no such mediated device %s", path[1]);
> +return NULL;
> +}

Isn't this all a bit circular since we build the S390CCWDevice based on
the sysfsdev mdev path?

> +
> +/* Get the iommu_group path as the interim variable. */
> +path[2] = g_strconcat(path[1], "/iommu_group", NULL);
> +
> +/* Get the link file path of the device iommu_group. */
> +path[3] = g_file_read_link(path[2], &gerror);
> +if (!path[3]) {
> +error_setg(errp, "vfio: error no iommu_group for subchannel");
> +return NULL;
> +}
> +
> +/* Get the device groupid. */
> +if (sscanf(basename(path[3]), "%d", &groupid) != 1) {
> +error_setg(errp, "vfio: error reading %s:%m", path[3]);
> +return NULL;
> +}
> +

[Qemu-devel] [PATCH 4/4] migration: spapr: migrate pending_events of spapr state

2017-04-24 Thread Daniel Henrique Barboza
From: Jianjun Duan 

In a race between hotplug events and a migration operation, an RTAS
hotplug event may not yet have been delivered to the source guest
when migration starts. In this case the pending_events of the spapr
state need to be transmitted to the target so that the hotplug event
can be finished on the target.

All the different fields of the events are encoded as defined by
PAPR. We can migrate them as a uint8_t binary stream without any
concerns about data padding or endianness.
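
A minimal sketch of why a plain byte stream is safe here, assuming (as stated
above) that the event fields are already stored in their PAPR-defined
big-endian encoding. The helpers demo_put_be32, demo_get_be32 and
demo_migrate_bytes are hypothetical names for illustration, not QEMU APIs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: store a 32-bit value in big-endian byte order. */
static void demo_put_be32(uint8_t *buf, uint32_t v)
{
    buf[0] = (uint8_t)(v >> 24);
    buf[1] = (uint8_t)(v >> 16);
    buf[2] = (uint8_t)(v >> 8);
    buf[3] = (uint8_t)v;
}

/* Hypothetical helper: decode it again; the result is the same on any host. */
static uint32_t demo_get_be32(const uint8_t *buf)
{
    return ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) | (uint32_t)buf[3];
}

/* Stand-in for "migrating" a payload: a byte-for-byte copy, no conversion. */
static void demo_migrate_bytes(uint8_t *dst, const uint8_t *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
}
```

Because the payload is only ever interpreted byte by byte, the copy needs no
knowledge of the host's struct layout or byte order.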

pending_events is put in a subsection in the spapr state VMSD to make
sure migration across different versions is not broken.

Signed-off-by: Jianjun Duan 
Signed-off-by: Daniel Henrique Barboza 
---
 hw/ppc/spapr.c | 33 +
 hw/ppc/spapr_events.c  | 24 +---
 include/hw/ppc/spapr.h |  3 ++-
 3 files changed, 48 insertions(+), 12 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 22f351c..a3e939b 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1419,6 +1419,38 @@ static const VMStateDescription vmstate_spapr_ccs_list = 
{
 },
 };
 
+static bool spapr_pending_events_needed(void *opaque)
+{
+sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
+return !QTAILQ_EMPTY(&spapr->pending_events);
+}
+
+static const VMStateDescription vmstate_spapr_event_entry = {
+.name = "spapreventlogentry",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_INT32(log_type, sPAPREventLogEntry),
+VMSTATE_BOOL(exception, sPAPREventLogEntry),
+VMSTATE_UINT32(data_size, sPAPREventLogEntry),
+VMSTATE_VARRAY_UINT32_ALLOC(data, sPAPREventLogEntry, data_size,
+0, vmstate_info_uint8, uint8_t),
+VMSTATE_END_OF_LIST()
+},
+};
+
+static const VMStateDescription vmstate_spapr_pending_events = {
+.name = "spaprpendingevents",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = spapr_pending_events_needed,
+.fields = (VMStateField[]) {
+VMSTATE_QTAILQ_V(pending_events, sPAPRMachineState, 1,
+ vmstate_spapr_event_entry, sPAPREventLogEntry, next),
+VMSTATE_END_OF_LIST()
+},
+};
+
 static bool spapr_ov5_cas_needed(void *opaque)
 {
 sPAPRMachineState *spapr = opaque;
@@ -1518,6 +1550,7 @@ static const VMStateDescription vmstate_spapr = {
 &vmstate_spapr_ov5_cas,
 &vmstate_spapr_patb_entry,
 &vmstate_spapr_ccs_list,
+&vmstate_spapr_pending_events,
 NULL
 }
 };
diff --git a/hw/ppc/spapr_events.c b/hw/ppc/spapr_events.c
index 24a5758..399dd49 100644
--- a/hw/ppc/spapr_events.c
+++ b/hw/ppc/spapr_events.c
@@ -342,7 +342,8 @@ static int rtas_event_log_to_irq(sPAPRMachineState *spapr, 
int log_type)
 return source->irq;
 }
 
-static void rtas_event_log_queue(int log_type, void *data, bool exception)
+static void rtas_event_log_queue(int log_type, void *data, bool exception,
+ int data_size)
 {
 sPAPRMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
 sPAPREventLogEntry *entry = g_new(sPAPREventLogEntry, 1);
@@ -351,6 +352,7 @@ static void rtas_event_log_queue(int log_type, void *data, 
bool exception)
 entry->log_type = log_type;
 entry->exception = exception;
 entry->data = data;
+entry->data_size = data_size;
 QTAILQ_INSERT_TAIL(&spapr->pending_events, entry, next);
 }
 
@@ -445,6 +447,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
 struct rtas_event_log_v6_mainb *mainb;
 struct rtas_event_log_v6_epow *epow;
 struct epow_log_full *new_epow;
+uint32_t data_size;
 
 new_epow = g_malloc0(sizeof(*new_epow));
 hdr = &new_epow->hdr;
@@ -453,14 +456,13 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
 mainb = &new_epow->mainb;
 epow = &new_epow->epow;
 
+data_size = sizeof(*new_epow);
 hdr->summary = cpu_to_be32(RTAS_LOG_VERSION_6
| RTAS_LOG_SEVERITY_EVENT
| RTAS_LOG_DISPOSITION_NOT_RECOVERED
| RTAS_LOG_OPTIONAL_PART_PRESENT
| RTAS_LOG_TYPE_EPOW);
-hdr->extended_length = cpu_to_be32(sizeof(*new_epow)
-   - sizeof(new_epow->hdr));
-
+hdr->extended_length = cpu_to_be32(data_size - sizeof(new_epow->hdr));
 spapr_init_v6hdr(v6hdr);
 spapr_init_maina(maina, 3 /* Main-A, Main-B and EPOW */);
 
@@ -479,7 +481,7 @@ static void spapr_powerdown_req(Notifier *n, void *opaque)
 epow->event_modifier = RTAS_LOG_V6_EPOW_MODIFIER_NORMAL;
 epow->extended_modifier = RTAS_LOG_V6_EPOW_XMODIFIER_PARTITION_SPECIFIC;
 
-rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true);
+rtas_event_log_queue(RTAS_LOG_TYPE_EPOW, new_epow, true, data_size);
 
 qemu_irq_pulse(xics_get_qirq(XICS_FABRIC(spapr),

[Qemu-devel] [PATCH 3/4] migration: spapr: migrate ccs_list in spapr state

2017-04-24 Thread Daniel Henrique Barboza
From: Jianjun Duan 

ccs_list in the spapr state maintains the device tree related
information on the RTAS side for hotplugged devices. In a race
between hotplug events and a migration operation, an RTAS hotplug
event could be migrated from the source guest to the target guest,
or the source guest might not yet have finished fetching the device
tree when migration starts. In both cases the target will try to
finish fetching the device tree. By migrating ccs_list, the target
can fetch the device tree properly.

In theory there are other alternatives besides migrating the
ccs_list to fix this. For example, we could add a flag that indicates
whether a device is in the middle of configure_connector during the
migration process; in post_load we could detect whether this flag
is set and return an error telling the guest to restart the
hotplug process. However, the DRC state can still be modified outside of
hotplug. Using:

   drmgr -c pci -s  -r
   drmgr -c pci -s  -a

it is possible to return a device to firmware and then later take it
back and reconfigure it. This is not a common case, but it is not prohibited,
and performing a migration between these two operations would fail because
the default coldplug state on the target assumes a configured state in
the source*. Migrating ccs_list is one solution that covers this
case as well.

ccs_list is put in a subsection in the spapr state VMSD to make
sure migration across different versions is not broken.
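
A simplified sketch of the subsection idea, not the real VMState machinery:
a subsection carries a "needed" callback and is only written to the stream
when that callback returns true, so a stream with an empty list looks the
same as one produced without this patch. All Demo* types and demo_* functions
below are stand-ins made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the machine state; only tracks how many entries are queued. */
typedef struct DemoMachine {
    size_t ccs_count;
} DemoMachine;

/* Stand-in for a VMStateDescription used as a subsection. */
typedef struct DemoSubsection {
    const char *name;
    bool (*needed)(void *opaque);
} DemoSubsection;

/* Mirrors the shape of spapr_ccs_list_needed(): send only if non-empty. */
static bool demo_ccs_list_needed(void *opaque)
{
    DemoMachine *m = opaque;
    return m->ccs_count > 0;
}

static const DemoSubsection demo_ccs_list = {
    .name = "demo/ccs_list",
    .needed = demo_ccs_list_needed,
};

/* Count the subsections of a NULL-terminated list that would be written. */
static size_t demo_count_sent(const DemoSubsection *const *subs, void *opaque)
{
    size_t sent = 0;

    assert(subs != NULL);
    for (size_t i = 0; subs[i] != NULL; i++) {
        /* Assumption of this sketch: no callback means "always send". */
        if (subs[i]->needed == NULL || subs[i]->needed(opaque)) {
            sent++;
        }
    }
    return sent;
}
```

In the common case (no hotplug in flight) the list is empty and nothing extra
is sent, which is what keeps the stream compatible.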

* see http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg01763.html
for more information on this discussion.

Signed-off-by: Jianjun Duan 
Signed-off-by: Daniel Henrique Barboza 
---
 hw/ppc/spapr.c | 32 
 1 file changed, 32 insertions(+)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 35db949..22f351c 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -1388,6 +1388,37 @@ static bool version_before_3(void *opaque, int 
version_id)
 return version_id < 3;
 }
 
+static bool spapr_ccs_list_needed(void *opaque)
+{
+sPAPRMachineState *spapr = (sPAPRMachineState *)opaque;
+return !QTAILQ_EMPTY(&spapr->ccs_list);
+}
+
+static const VMStateDescription vmstate_spapr_ccs = {
+.name = "spaprconfigureconnectorstate",
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_UINT32(drc_index, sPAPRConfigureConnectorState),
+VMSTATE_INT32(fdt_offset, sPAPRConfigureConnectorState),
+VMSTATE_INT32(fdt_depth, sPAPRConfigureConnectorState),
+VMSTATE_END_OF_LIST()
+},
+};
+
+static const VMStateDescription vmstate_spapr_ccs_list = {
+.name = "spaprccslist",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = spapr_ccs_list_needed,
+.fields = (VMStateField[]) {
+VMSTATE_QTAILQ_V(ccs_list, sPAPRMachineState, 1,
+ vmstate_spapr_ccs, sPAPRConfigureConnectorState,
+ next),
+VMSTATE_END_OF_LIST()
+},
+};
+
 static bool spapr_ov5_cas_needed(void *opaque)
 {
 sPAPRMachineState *spapr = opaque;
@@ -1486,6 +1517,7 @@ static const VMStateDescription vmstate_spapr = {
 .subsections = (const VMStateDescription*[]) {
 &vmstate_spapr_ov5_cas,
 &vmstate_spapr_patb_entry,
+&vmstate_spapr_ccs_list,
 NULL
 }
 };
-- 
2.9.3




[Qemu-devel] [PATCH 2/4] hw/ppc: migrating the DRC state of hotplugged devices

2017-04-24 Thread Daniel Henrique Barboza
In pseries, a firmware abstraction called Dynamic Reconfiguration
Connector (DRC) is used to assign a particular dynamic resource
to the guest and provide an interface to manage configuration/removal
of the resource associated with it. In other words, DRC is the
'plugged state' of a device.

Before this patch, DRC wasn't being migrated. This causes
post-migration problems due to DRC state mismatch between source and
target. The DRC state of a device X in the source might
change, while in the target the DRC state of X is still fresh. When
migrating the guest, X will not have the same hotplugged state as it
did in the source. This means that we can't hot unplug X in the
target after migration is completed because its DRC state is not consistent.
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1677552 is one
bug that is caused by this DRC state mismatch between source and
target.

To migrate the DRC state, we defined the VMStateDescription struct for
spapr_drc to enable the transmission of spapr_drc state in migration.
Not all the elements in the DRC state are migrated - only those
that can be modified by guest actions or device add/remove
operations:

- 'isolation_state', 'allocation_state' and 'configured' are involved
in the DR state transition diagram from PAPR+ 2.7, 13.4;

- 'configured' and 'signalled' are needed in attaching and detaching
devices;

- 'indicator_state' provides users with hardware state information.

These are the DRC elements that are migrated.
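
A rough sketch of the "only migrate when it matters" rule that the
spapr_drc_needed() function in this patch applies: skip connectors with no
device present, and skip those still in the cold-plugged default state. The
Demo* types are simplified stand-ins, and the default shown is the one the
patch uses for PCI connectors; LMB and CPU connectors use a different default
in the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { DEMO_ISOLATED, DEMO_UNISOLATED } DemoIsolation;
typedef enum { DEMO_UNUSABLE, DEMO_USABLE } DemoAllocation;

/* Stand-in for the migrated subset of the DRC state. */
typedef struct DemoDrc {
    bool present;                 /* is a device plugged in? */
    DemoIsolation isolation_state;
    DemoAllocation allocation_state;
    bool configured;
    bool signalled;
    bool awaiting_release;
} DemoDrc;

/* True if the state equals the cold-plugged default for a PCI connector. */
static bool demo_is_coldplug_default(const DemoDrc *drc)
{
    return drc->isolation_state == DEMO_UNISOLATED &&
           drc->allocation_state == DEMO_USABLE &&
           drc->configured && drc->signalled && !drc->awaiting_release;
}

/* Decide whether this connector's state has to go over the wire. */
static bool demo_drc_needed(const DemoDrc *drc)
{
    assert(drc != NULL);
    if (!drc->present) {
        return false;   /* nothing plugged in, nothing to migrate */
    }
    return !demo_is_coldplug_default(drc);
}
```

A device in the middle of a hot unplug (awaiting_release set) differs from the
default, so its state is sent; an untouched cold-plugged device is not.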

In this patch the DRC state is migrated for PCI, LMB and CPU
connector types. At this moment there is no support to migrate
DRC for the PHB (PCI Host Bridge) type.

The instance_id is used to identify objects in migration. We set
instance_id of DRC using the unique index so that it is the same
across migration.

In hw/ppc/spapr_pci.c, a function called spapr_pci_set_detach_cb
was created to set the detach_cb of the migrated DRC in
spapr_pci_post_load. The reason is that detach_cb is a DRC function
pointer that can't be migrated, but we need it set in the target
so that an ongoing hot-unplug event can properly finish.

Signed-off-by: Daniel Henrique Barboza 
---
 hw/ppc/spapr_drc.c | 67 ++
 hw/ppc/spapr_pci.c | 22 ++
 2 files changed, 89 insertions(+)

diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
index a1cdc87..5c2baad 100644
--- a/hw/ppc/spapr_drc.c
+++ b/hw/ppc/spapr_drc.c
@@ -651,6 +651,70 @@ static void spapr_dr_connector_instance_init(Object *obj)
 NULL, NULL, NULL, NULL);
 }
 
+static bool spapr_drc_needed(void *opaque)
+{
+sPAPRDRConnector *drc = (sPAPRDRConnector *)opaque;
+sPAPRDRConnectorClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
+bool rc = false;
+sPAPRDREntitySense value;
+drck->entity_sense(drc, &value);
+/* If no dev is plugged in there is no need to migrate the DRC state */
+if (value != SPAPR_DR_ENTITY_SENSE_PRESENT) {
+return false;
+}
+
+/*
+ * If there is dev plugged in, we need to migrate the DRC state when
+ * it is different from cold-plugged state
+ */
+switch (drc->type) {
+
+case SPAPR_DR_CONNECTOR_TYPE_PCI:
+rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_UNISOLATED) &&
+   (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_USABLE) &&
+   drc->configured && drc->signalled && !drc->awaiting_release);
+break;
+
+case SPAPR_DR_CONNECTOR_TYPE_LMB:
+rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+   (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+   drc->configured && drc->signalled && !drc->awaiting_release);
+break;
+
+case SPAPR_DR_CONNECTOR_TYPE_CPU:
+rc = !((drc->isolation_state == SPAPR_DR_ISOLATION_STATE_ISOLATED) &&
+   (drc->allocation_state == SPAPR_DR_ALLOCATION_STATE_UNUSABLE) &&
+drc->configured && drc->signalled && !drc->awaiting_release);
+break;
+
+default:
+;
+}
+return rc;
+}
+
+/* return the unique drc index as instance_id for qom interfaces*/
+static int get_instance_id(DeviceState *dev)
+{
+return (int)get_index(SPAPR_DR_CONNECTOR(OBJECT(dev)));
+}
+
+static const VMStateDescription vmstate_spapr_drc = {
+.name = "spapr_drc",
+.version_id = 1,
+.minimum_version_id = 1,
+.needed = spapr_drc_needed,
+.fields  = (VMStateField []) {
+VMSTATE_UINT32(isolation_state, sPAPRDRConnector),
+VMSTATE_UINT32(allocation_state, sPAPRDRConnector),
+VMSTATE_UINT32(indicator_state, sPAPRDRConnector),
+VMSTATE_BOOL(configured, sPAPRDRConnector),
+VMSTATE_BOOL(awaiting_release, sPAPRDRConnector),
+VMSTATE_BOOL(signalled, sPAPRDRConnector),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static void spapr_dr_connector_class_init(ObjectClass *k, void *data)
 {
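
The spapr_drc_needed() logic quoted above can be read as "migrate the DRC
only when a device is present and its state differs from the cold-plugged
defaults for its type". A minimal, self-contained sketch of that predicate
follows. All Demo* names are hypothetical stand-ins, not QEMU types; the
per-type defaults are copied from the switch statement in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified model of the fields the patch migrates. */
typedef enum { DEMO_ISOLATED, DEMO_UNISOLATED } DemoIsolation;
typedef enum { DEMO_UNUSABLE, DEMO_USABLE } DemoAllocation;
typedef enum { DEMO_TYPE_PCI, DEMO_TYPE_LMB, DEMO_TYPE_CPU } DemoType;

typedef struct DemoDrc {
    DemoType type;
    bool present;                 /* stands in for the entity_sense() result */
    DemoIsolation isolation_state;
    DemoAllocation allocation_state;
    bool configured;
    bool signalled;
    bool awaiting_release;
} DemoDrc;

/* Returns true when the state must travel with the migration stream. */
bool demo_drc_needed(const DemoDrc *drc)
{
    assert(drc != NULL);

    /* Nothing plugged in: nothing to migrate. */
    if (!drc->present) {
        return false;
    }

    /* Cold-plugged defaults per connector type, as listed in the patch. */
    DemoIsolation iso = (drc->type == DEMO_TYPE_PCI) ? DEMO_UNISOLATED
                                                     : DEMO_ISOLATED;
    DemoAllocation alloc = (drc->type == DEMO_TYPE_PCI) ? DEMO_USABLE
                                                        : DEMO_UNUSABLE;

    bool is_cold_plugged = drc->isolation_state == iso &&
                           drc->allocation_state == alloc &&
                           drc->configured && drc->signalled &&
                           !drc->awaiting_release;

    /* Only a state that deviates from the defaults needs migrating. */
    return !is_cold_plugged;
}
```

In the real patch this predicate is wired in through the .needed field of
the VMStateDescription, so a destination that receives no "spapr_drc"
section simply keeps its own cold-plugged defaults.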
 

[Qemu-devel] [PATCH 1/4] migration: alternative way to set instance_id in SaveStateEntry

2017-04-24 Thread Daniel Henrique Barboza
From: Jianjun Duan 

In QOM (QEMU Object Model) migrated objects are identified with instance_id
which is calculated automatically using their path in the QOM composition
tree. For some objects, this path could change from source to target in
migration. To migrate such objects, we need to make sure the instance_id does
not change from source to target. We add a hook in DeviceClass to do customized
instance_id calculation in such cases.

As a result, in these cases compat will not be set in the concerned
SaveStateEntry. This prevents the inconsistent idstr from being sent over in
migration. We could have set alias_id in a similar way, but that would
overload the purpose of alias_id.

The first application will be setting instance_id for pseries DRC objects using
their unique index. Doing this makes the instance_id of a DRC consistent
across migration and supports flexible management of DRC objects in migration.

Signed-off-by: Jianjun Duan 
Signed-off-by: Daniel Henrique Barboza 
---
 include/hw/qdev-core.h | 6 ++
 migration/savevm.c | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 4bf86b0..9b3914c 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -127,6 +127,12 @@ typedef struct DeviceClass {
 qdev_initfn init; /* TODO remove, once users are converted to realize */
 qdev_event exit; /* TODO remove, once users are converted to unrealize */
 const char *bus_type;
+
+/* When this field is set, qemu will use it to get a unique instance_id
+ * instead of calculating an auto idstr and instance_id for the relevant
+ * SaveStateEntry
+ */
+int (*dev_get_instance_id)(DeviceState *dev);
 } DeviceClass;
 
 typedef struct NamedGPIOList NamedGPIOList;
diff --git a/migration/savevm.c b/migration/savevm.c
index 03ae1bd..5d8135f 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -606,6 +606,9 @@ int register_savevm_live(DeviceState *dev,
  calculate_compat_instance_id(idstr) : instance_id;
 instance_id = -1;
 }
+if (DEVICE_GET_CLASS(dev)->dev_get_instance_id) {
+instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev);
+}
 }
 pstrcat(se->idstr, sizeof(se->idstr), idstr);
 
@@ -696,6 +699,9 @@ int vmstate_register_with_alias_id(DeviceState *dev, int instance_id,
 calculate_compat_instance_id(vmsd->name) : instance_id;
 instance_id = -1;
 }
+if (DEVICE_GET_CLASS(dev)->dev_get_instance_id) {
+instance_id = DEVICE_GET_CLASS(dev)->dev_get_instance_id(dev);
+}
 }
 pstrcat(se->idstr, sizeof(se->idstr), vmsd->name);
 
-- 
2.9.3
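
The hook added by this patch follows a common "optional class callback"
pattern: if the device class supplies dev_get_instance_id, its result
replaces whatever instance_id was computed automatically. A minimal sketch
of that pattern is below. The Demo* types and demo_* functions are
hypothetical and only mirror the shape of the check added to savevm.c; they
are not the QEMU definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for DeviceState and DeviceClass. */
typedef struct DemoDevice DemoDevice;

typedef struct DemoDeviceClass {
    /* Optional hook: NULL means "keep the automatically derived id". */
    int (*dev_get_instance_id)(DemoDevice *dev);
} DemoDeviceClass;

struct DemoDevice {
    const DemoDeviceClass *klass;
    unsigned int drc_index;   /* assumed stable on source and target */
};

/* Hook for a DRC-like device: its unique index doubles as the id. */
int demo_drc_get_instance_id(DemoDevice *dev)
{
    return (int)dev->drc_index;
}

/* The hook, when set, overrides the id computed before it. */
int demo_pick_instance_id(DemoDevice *dev, int auto_instance_id)
{
    assert(dev != NULL && dev->klass != NULL);

    if (dev->klass->dev_get_instance_id) {
        return dev->klass->dev_get_instance_id(dev);
    }
    return auto_instance_id;
}
```

Because the override is applied after the compat/alias handling, a class
that sets the hook never depends on the QOM path, which is the property the
commit message asks for.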




[Qemu-devel] [PATCH 0/4 v6] migration/ppc: migrating DRC, ccs_list and pending_events

2017-04-24 Thread Daniel Henrique Barboza
Hi,

This is version 6 of the pseries patches that were last sent to the mailing
list more than 6 months ago. The original v5 patchset was authored by Jianjun
Duan (see link below):

http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg00270.html

The specific pseries patches were stripped out in the original v6 patchset
and it was later pushed upstream in the v17 in the 'extend VMStateInfo' and
'migrate QTAILQ' contributions.

The changelog as far as the pseries patches are concerned:


v6: - Rebased with QEMU master after 6+ months.
- Simplified the logic in patch 1.
- Reworked patch 2: added CPU DRC migration, removed a function pointer
  from DRC class and minor improvements.
- Added clarifications from the previous v5 discussions in the commit
  messages.

v5: - Rebased to David's ppc-for-2.8.

v4: - Introduce a way to set customized instance_id in SaveStateEntry. Use it
  to set instance_id for DRC using its unique index to address David 
  Gibson's concern.
- Rename VMS_CSTM to VMS_LINKED based on Paolo Bonzini's suggestions.
- Clean up qjson stuff in put_qtailq. 
- Add trace for put_qtailq and get_qtailq based on David Gilbert's 
  suggestion.

- Based on David's ppc-for-2.7. 

v3: - Simplify overall design following discussion with Paolo. No longer need
  metadata to migrate QTAILQ.
- Extend VMStateInfo instead of adding similar fields to VMStateField.
- Clean up macros in qemu/queue.h.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg05695.html)

v2: - Introduce a general approach to migrate QTAILQ in qemu/queue.h.
- Migrate signalled field in the DRC state.
- Put the newly added migrating fields in subsections so that backward 
  migration is not broken.  
- Set detach_cb field right after migration so that a migrated hot-unplug
  event could finish its course.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-05/msg04188.html)

v1: - Initial version.
(link: https://lists.nongnu.org/archive/html/qemu-devel/2016-04/msg02601.html)


To make guest device (PCI, CPU and memory) hotplug work together
with guest migration, spapr drc state needs to be transmitted in
migration. This patch set defines the VMStateDescription struct for
spapr drc state to enable it.

To fix the potential race between hotplug events on the guest and
guest migration, ccs_list and pending_events of spapr state need to be
transmitted in migration. This patch set also takes care of it.



Daniel Henrique Barboza (1):
  hw/ppc: migrating the DRC state of hotplugged devices

Jianjun Duan (3):
  migration: alternative way to set instance_id in SaveStateEntry
  migration: spapr: migrate ccs_list in spapr state
  migration: spapr: migrate pending_events of spapr state

 hw/ppc/spapr.c | 65 
 hw/ppc/spapr_drc.c | 67 ++
 hw/ppc/spapr_events.c  | 24 +-
 hw/ppc/spapr_pci.c | 22 +
 include/hw/ppc/spapr.h |  3 ++-
 include/hw/qdev-core.h |  6 +
 migration/savevm.c |  6 +
 7 files changed, 181 insertions(+), 12 deletions(-)

-- 
2.9.3




Re: [Qemu-devel] [PULL 0/21] Please pull xen-20170421-tag for 2.10

2017-04-24 Thread Peter Maydell
On 24 April 2017 at 22:25, Stefano Stabellini  wrote:
> diff --git a/hw/9pfs/xen-9pfs.h b/hw/9pfs/xen-9pfs.h
> new file mode 100644
> index 000..18f0ec0
> --- /dev/null
> +++ b/hw/9pfs/xen-9pfs.h
> @@ -0,0 +1,14 @@
> +/*
> + * Xen 9p backend
> + *
> + * Copyright Aporeto 2017
> + *
> + * Authors:
> + *  Stefano Stabellini 
> + *
> + */

Trivial file, but I prefer it if we have a brief license
statement in every file, just to be clear (it might
accumulate more code later).

> +
> +#include 
> +#include "hw/xen/io/ring.h"
> +
> +DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);

Is it worth a comment to dissuade people from thinking they can
inline the file back into xen-9p-backend.c ?

thanks
-- PMM



Re: [Qemu-devel] [PULL 0/21] Please pull xen-20170421-tag for 2.10

2017-04-24 Thread Stefano Stabellini
On Mon, 24 Apr 2017, Peter Maydell wrote:
> On 21 April 2017 at 21:14, Stefano Stabellini  wrote:
> > The following changes since commit 55a19ad8b2d0797e3a8fe90ab99a9bb713824059:
> >
> >   Update version for v2.9.0-rc1 release (2017-03-21 17:13:29 +)
> >
> > are available in the git repository at:
> >
> >   git://xenbits.xen.org/people/sstabellini/qemu-dm.git tags/xen-20170421-tag
> >
> > for you to fetch changes up to b0d48550a2d10f0fa8c30a7cc3ff5a1cbda8d4c4:
> >
> >   move xen-mapcache.c to hw/i386/xen/ (2017-04-21 12:41:29 -0700)
> >
> > 
> > Xen 2017/04/21
> >
> > 
> 
> Hi; I'm afraid this doesn't build with clang:
> 
> 
>   CC  hw/9pfs/xen-9p-backend.o
> /home/petmay01/linaro/qemu-for-merges/hw/9pfs/xen-9p-backend.c:21:1:
> error: unused function 'xen_9pfs_get_ring_ptr' [-Werror,-Wunused-function]
> DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);
> ^
> /home/petmay01/linaro/qemu-for-merges/include/hw/xen/io/ring.h:469:79:
> note: expanded
> from macro 'DEFINE_XEN_FLEX_RING_AND_INTF'
> };
> \
>   
> ^
> /home/petmay01/linaro/qemu-for-merges/include/hw/xen/io/ring.h:387:30:
> note: expanded from macro '\
> DEFINE_XEN_FLEX_RING'
> static inline unsigned char *name##_get_ring_ptr(unsigned char *buf,  
> \
>  ^
> :151:1: note: expanded from here
> xen_9pfs_get_ring_ptr
> ^
> /home/petmay01/linaro/qemu-for-merges/hw/9pfs/xen-9p-backend.c:21:1:
> error: unused function 'xen_9pfs_write_packet'
> [-Werror,-Wunused-function]
> /home/petmay01/linaro/qemu-for-merges/include/hw/xen/io/ring.h:469:79:
> note: expanded from macro 'DEFINE_XEN_FLEX_RING_AND_INTF'
> };
> \
>   
> ^
> /home/petmay01/linaro/qemu-for-merges/include/hw/xen/io/ring.h:412:20:
> note: expanded from macro '\
> DEFINE_XEN_FLEX_RING'
> static inline void name##_write_packet(unsigned char *buf,
> \
>^
> :155:1: note: expanded from here
> xen_9pfs_write_packet
> ^
> 2 errors generated.
> /home/petmay01/linaro/qemu-for-merges/rules.mak:69: recipe for target
> 'hw/9pfs/xen-9p-backend.o' failed
> 
> 
> Clang requires that functions, even inline functions, defined in .c files
> must be used.

Thank you for finding this issue. Given that ring.h is synced from Xen,
I think it makes sense to leave it as is. I'll fix it as follows:

diff --git a/hw/9pfs/xen-9p-backend.c b/hw/9pfs/xen-9p-backend.c
index 7e962aa..9c7f41a 100644
--- a/hw/9pfs/xen-9p-backend.c
+++ b/hw/9pfs/xen-9p-backend.c
@@ -13,12 +13,9 @@
 #include "hw/hw.h"
 #include "hw/9pfs/9p.h"
 #include "hw/xen/xen_backend.h"
-#include "hw/xen/io/ring.h"
+#include "hw/9pfs/xen-9pfs.h"
 #include "qemu/config-file.h"
 #include "fsdev/qemu-fsdev.h"
-#include 
-
-DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);
 
 #define VERSIONS "1"
 #define MAX_RINGS 8
diff --git a/hw/9pfs/xen-9pfs.h b/hw/9pfs/xen-9pfs.h
new file mode 100644
index 000..18f0ec0
--- /dev/null
+++ b/hw/9pfs/xen-9pfs.h
@@ -0,0 +1,14 @@
+/*
+ * Xen 9p backend
+ *
+ * Copyright Aporeto 2017
+ *
+ * Authors:
+ *  Stefano Stabellini 
+ *
+ */
+
+#include 
+#include "hw/xen/io/ring.h"
+
+DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);
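
For readers unfamiliar with the pattern behind the diagnostics quoted above:
a macro of this kind uses token pasting (name##_suffix) to stamp out a family
of static inline helpers, and every expansion defines all of them whether or
not the caller uses each one. The sketch below shows the mechanism with a
hypothetical DEMO_DEFINE_RING macro; it is not the real
DEFINE_XEN_FLEX_RING_AND_INTF and the two helpers are invented for
illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical macro: each expansion defines two static inline helpers
 * whose names are built from the 'name' argument by token pasting.
 * ring_size is assumed to be a power of two.
 */
#define DEMO_DEFINE_RING(name)                                          \
    static inline size_t name##_mask(size_t idx, size_t ring_size)      \
    {                                                                   \
        assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);   \
        return idx & (ring_size - 1);                                   \
    }                                                                   \
    static inline size_t name##_next(size_t idx, size_t ring_size)      \
    {                                                                   \
        return name##_mask(idx + 1, ring_size);                         \
    }

/* Expands to demo_9pfs_mask() and demo_9pfs_next(). */
DEMO_DEFINE_RING(demo_9pfs)
```

When such an expansion sits directly in a .c file, clang's
-Wunused-function reports each generated helper the file never calls, which
is what the build log shows. Placing the expansion in a header, as the
follow-up diff does with xen-9pfs.h, keeps ring.h untouched while avoiding
that diagnostic.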



Re: [Qemu-devel] DMG chunk size independence

2017-04-24 Thread John Snow


On 04/23/2017 05:03 AM, Ashijeet Acharya wrote:
> Hi,
> 
> Great news!
> I have almost completed this task and the results are looking
> promising. I have not yet attended to the DMG files having bz2
> compressed chunks but that should be easy and pretty similar to my
> approach for zlib compressed files. So, no worries there.
> 
> For testing I am first converting the images to raw format and then
> comparing the resulting image with the one converted using v2.9.0 DMG
> driver and after battling for 2 days with my code, it finally prints
> "Images are identical." According to John, that should be pretty
> conclusive and I completely agree.
> 

Yes, comparing a sample.dmg against a raw file generated from the 2.9.0
qemu-img tool should be reasonably good evidence that you have not
altered the behavior of the tool.

> Now, the real thing I wanted to ask was, if someone is aware of a DMG
> file which has a chunk size above 64 MiB so that I can test those too.
> If yes, please share the download link with me.
> Currently I am testing the ones posted by Peter Wu while submitting
> his DMG work in 2014.
> Here -> https://lists.nongnu.org/archive/html/qemu-devel/2014-12/msg03606.html
> 

Are any of those over 64MB? I assume you're implying that they aren't.

Maybe Peter knows?...

> Expect v1 soon...
> 
> Ashijeet
> 




[Qemu-devel] [PATCH v2] pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot

2017-04-24 Thread Igor Mammedov
The 2.7 commit (b2a575a Add optionrom compatible with fw_cfg DMA version)
regressed migration during firmware execution time by
abusing the fwcfg.dma_enabled property to decide whether to load the
DMA version of the option rom AND by mistakenly disabling DMA
for 2.6 and earlier globally instead of only for the option rom.

So a 2.6 machine type guest is broken when it is already running
firmware in DMA mode and is migrated to qemu-2.7(pc-2.6)
at that time:

a) qemu-2.6:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=mmio)
b) qemu-2.7:pc2.6 (fwcfg.dma=off,firmware=mmio,oprom=mmio)

       to:  a    b
from
a           OK   FAIL
b           OK   OK

So forward migration from qemu-2.6 to qemu-2.[789] is currently
broken. It could, however, be fixed for 2.10 by re-enabling DMA
for the 2.[56] machine types and allowing the DMA-capable
option rom only since 2.7.
As a result, qemu should end up with:

c) qemu-2.10:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=mmio)

       to:  a    b     c
from
a           OK   FAIL  OK
b           OK   OK    OK
c           OK   FAIL  OK

where forward migration from qemu-2.6 to qemu-2.10 should
work again leaving only qemu-2.[789]:pc-2.6 broken.

The patch should also help downstream maintain migration
the way it used to be, since the DMA-capable option rom
is managed by new

Signed-off-by: Igor Mammedov 
---
v2:
  (Eduardo Habkost )
* s/linuxboot_dma_disabled/linuxboot_dma_enabled/
* add comment to linuxboot_dma_enabled field
---
 hw/i386/pc.c | 9 -
 hw/i386/pc_piix.c| 1 +
 hw/i386/pc_q35.c | 1 +
 include/hw/i386/pc.h | 7 +++
 4 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f3b372a18f..8063241140 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1047,12 +1047,10 @@ static void load_linux(PCMachineState *pcms,
 fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
 
-if (fw_cfg_dma_enabled(fw_cfg)) {
+option_rom[nb_option_roms].bootindex = 0;
+option_rom[nb_option_roms].name = "linuxboot.bin";
+if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
 option_rom[nb_option_roms].name = "linuxboot_dma.bin";
-option_rom[nb_option_roms].bootindex = 0;
-} else {
-option_rom[nb_option_roms].name = "linuxboot.bin";
-option_rom[nb_option_roms].bootindex = 0;
 }
 nb_option_roms++;
 }
@@ -2321,6 +2319,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
  * to be used at the moment, 32K should be enough for a while.  */
 pcmc->acpi_data_size = 0x2 + 0x8000;
 pcmc->save_tsc_khz = true;
+pcmc->linuxboot_dma_enabled = true;
 mc->get_hotplug_handler = pc_get_hotpug_handler;
 mc->cpu_index_to_socket_id = pc_cpu_index_to_socket_id;
 mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 9f102aa388..a11190be46 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -474,6 +474,7 @@ static void pc_i440fx_2_6_machine_options(MachineClass *m)
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_i440fx_2_7_machine_options(m);
 pcmc->legacy_cpu_hotplug = true;
+pcmc->linuxboot_dma_enabled = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_6);
 }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index dd792a8547..0a61a2070c 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -335,6 +335,7 @@ static void pc_q35_2_6_machine_options(MachineClass *m)
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_q35_2_7_machine_options(m);
 pcmc->legacy_cpu_hotplug = true;
+pcmc->linuxboot_dma_enabled = false;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_6);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index f278b3ae89..a57c607a8c 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -151,6 +151,9 @@ struct PCMachineClass {
 bool save_tsc_khz;
 /* generate legacy CPU hotplug AML */
 bool legacy_cpu_hotplug;
+
+/* use DMA capable linuxboot option rom */
+bool linuxboot_dma_enabled;
 };
 
 #define TYPE_PC_MACHINE "generic-pc-machine"
@@ -432,10 +435,6 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 #define PC_COMPAT_2_6 \
 HW_COMPAT_2_6 \
 {\
-.driver   = "fw_cfg_io",\
-.property = "dma_enabled",\
-.value= "off",\
-},{\
 .driver   = TYPE_X86_CPU,\
 .property = "cpuid-0xb",\
 .value= "off",\
-- 
2.11.0 (Apple Git-81)
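
The effect of the patch can be summarised by two small predicates: which
linuxboot option rom is selected, and which source/destination pairs from
the tables in the commit message are expected to migrate. The sketch below
is a simplified model under one stated assumption: migration during
firmware boot fails only when the source exposes fw_cfg DMA and the
destination does not. DemoConfig and the demo_* functions are hypothetical
names, not QEMU code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary of one QEMU build running a given machine type. */
typedef struct DemoConfig {
    bool fwcfg_dma;             /* fw_cfg DMA interface exposed to the guest */
    bool linuxboot_dma_enabled; /* machine class allows linuxboot_dma.bin */
} DemoConfig;

/* Same condition as the patched load_linux(): both flags must be set. */
const char *demo_option_rom(const DemoConfig *c)
{
    assert(c != NULL);
    return (c->linuxboot_dma_enabled && c->fwcfg_dma) ? "linuxboot_dma.bin"
                                                      : "linuxboot.bin";
}

/*
 * Assumed failure model: firmware that already uses DMA on the source
 * cannot continue on a destination where fw_cfg DMA is switched off.
 */
bool demo_migration_ok(const DemoConfig *src, const DemoConfig *dst)
{
    assert(src != NULL && dst != NULL);
    return !(src->fwcfg_dma && !dst->fwcfg_dma);
}
```

With a = {on, off}, b = {off, off} and c = {on, off} for the three pc-2.6
configurations listed in the commit message, demo_migration_ok() reproduces
both tables, and demo_option_rom() returns "linuxboot.bin" for all three, so
the option rom seen by a 2.6 machine type guest does not change.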




Re: [Qemu-devel] [PATCH RESEND v4 4/4] HMP: Introduce msr_get and msr_set HMP commands

2017-04-24 Thread Julian Kirsch
Good catch, thanks!

-Julian

On 24.04.2017 18:32, Dr. David Alan Gilbert wrote:
> Shouldn't the use of '-' be '_' in those to match the .name (and in the -set
> variant below)?
> 
> Dave




Re: [Qemu-devel] [PATCH] pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot

2017-04-24 Thread Eduardo Habkost
On Mon, Apr 24, 2017 at 09:56:04PM +0200, Igor Mammedov wrote:
> On Mon, 24 Apr 2017 16:37:31 -0300
> Eduardo Habkost  wrote:
> 
> > On Mon, Apr 24, 2017 at 09:32:33PM +0200, Igor Mammedov wrote:
> > > On Mon, 24 Apr 2017 16:13:17 -0300
> > > Eduardo Habkost  wrote:
> > > 
> > > > On Mon, Apr 24, 2017 at 08:58:17PM +0200, Igor Mammedov wrote:
> > > > [...]
> > > > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > > > index f3b372a18f..3f2d96da64 100644
> > > > > --- a/hw/i386/pc.c
> > > > > +++ b/hw/i386/pc.c
> > > > > @@ -1047,7 +1047,7 @@ static void load_linux(PCMachineState *pcms,
> > > > >  fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
> > > > >  fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
> > > > >  
> > > > > -if (fw_cfg_dma_enabled(fw_cfg)) {
> > > > > +if (!pcmc->linuxboot_dma_disabled && fw_cfg_dma_enabled(fw_cfg)) {
> > > > 
> > > > Why not name the flag just "linuxboot_dma", set it to true by
> > > > default at pc_machine_class_init(), and avoid the double
> > > > negative?
> > > to avoid setting it to true somewhere else, so fewer things could go wrong,
> > > but if you prefer the *_enable variant I can switch to it.
> > 
> > I would prefer to. We already have other compat flags initialized
> > inside pc_machine_class_init(), so this would fit nicely there.
> how about 'use_linuxboot_mmio' instead? It would remove the negation
> and let me avoid touching pc_machine_class_init().

Sounds good to me.

-- 
Eduardo



Re: [Qemu-devel] [PATCH] pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot

2017-04-24 Thread Igor Mammedov
On Mon, 24 Apr 2017 16:37:31 -0300
Eduardo Habkost  wrote:

> On Mon, Apr 24, 2017 at 09:32:33PM +0200, Igor Mammedov wrote:
> > On Mon, 24 Apr 2017 16:13:17 -0300
> > Eduardo Habkost  wrote:
> > 
> > > On Mon, Apr 24, 2017 at 08:58:17PM +0200, Igor Mammedov wrote:
> > > [...]
> > > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > > index f3b372a18f..3f2d96da64 100644
> > > > --- a/hw/i386/pc.c
> > > > +++ b/hw/i386/pc.c
> > > > @@ -1047,7 +1047,7 @@ static void load_linux(PCMachineState *pcms,
> > > >  fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
> > > >  fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
> > > >  
> > > > -if (fw_cfg_dma_enabled(fw_cfg)) {
> > > > +if (!pcmc->linuxboot_dma_disabled && fw_cfg_dma_enabled(fw_cfg)) {
> > > 
> > > Why not name the flag just "linuxboot_dma", set it to true by
> > > default at pc_machine_class_init(), and avoid the double
> > > negative?
> > to avoid setting it to true somewhere else, so fewer things could go wrong,
> > but if you prefer the *_enable variant I can switch to it.
> 
> I would prefer to. We already have other compat flags initialized
> inside pc_machine_class_init(), so this would fit nicely there.
how about 'use_linuxboot_mmio' instead? It would remove the negation
and let me avoid touching pc_machine_class_init().

> 
> > 
> > > 
> > > [...]
> > > > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > > > index f278b3ae89..ff6f13b61b 100644
> > > > --- a/include/hw/i386/pc.h
> > > > +++ b/include/hw/i386/pc.h
> > > > @@ -151,6 +151,8 @@ struct PCMachineClass {
> > > >  bool save_tsc_khz;
> > > >  /* generate legacy CPU hotplug AML */
> > > >  bool legacy_cpu_hotplug;
> > > > +
> > > 
> > > A one-line description of the consequences of setting/clearing
> > > the flag would be nice.
> > will fix in v2
> 
> Thanks!
> 
> > 
> > > 
> > > > +bool linuxboot_dma_disabled;
> > > >  };
> > > >  
> > > >  #define TYPE_PC_MACHINE "generic-pc-machine"
> > > > @@ -432,10 +434,6 @@ bool e820_get_entry(int, uint32_t, uint64_t *, 
> > > > uint64_t *);
> > > >  #define PC_COMPAT_2_6 \
> > > >  HW_COMPAT_2_6 \
> > > >  {\
> > > > -.driver   = "fw_cfg_io",\
> > > > -.property = "dma_enabled",\
> > > > -.value= "off",\
> > > > -},{\
> > > >  .driver   = TYPE_X86_CPU,\
> > > >  .property = "cpuid-0xb",\
> > > >  .value= "off",\
> > > > -- 
> > > > 2.11.0 (Apple Git-81)
> > > > 
> > > 
> > 
> 




Re: [Qemu-devel] [PATCH] pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot

2017-04-24 Thread Eduardo Habkost
On Mon, Apr 24, 2017 at 09:32:33PM +0200, Igor Mammedov wrote:
> On Mon, 24 Apr 2017 16:13:17 -0300
> Eduardo Habkost  wrote:
> 
> > On Mon, Apr 24, 2017 at 08:58:17PM +0200, Igor Mammedov wrote:
> > [...]
> > > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > > index f3b372a18f..3f2d96da64 100644
> > > --- a/hw/i386/pc.c
> > > +++ b/hw/i386/pc.c
> > > @@ -1047,7 +1047,7 @@ static void load_linux(PCMachineState *pcms,
> > >  fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
> > >  fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
> > >  
> > > -if (fw_cfg_dma_enabled(fw_cfg)) {
> > > +if (!pcmc->linuxboot_dma_disabled && fw_cfg_dma_enabled(fw_cfg)) {
> > 
> > Why not name the flag just "linuxboot_dma", set it to true by
> > default at pc_machine_class_init(), and avoid the double
> > negative?
> to avoid setting it to true somewhere else, so fewer things could go wrong,
> but if you prefer the *_enable variant I can switch to it.

I would prefer to. We already have other compat flags initialized
inside pc_machine_class_init(), so this would fit nicely there.

> 
> > 
> > [...]
> > > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > > index f278b3ae89..ff6f13b61b 100644
> > > --- a/include/hw/i386/pc.h
> > > +++ b/include/hw/i386/pc.h
> > > @@ -151,6 +151,8 @@ struct PCMachineClass {
> > >  bool save_tsc_khz;
> > >  /* generate legacy CPU hotplug AML */
> > >  bool legacy_cpu_hotplug;
> > > +
> > 
> > A one-line description of the consequences of setting/clearing
> > the flag would be nice.
> will fix in v2

Thanks!

> 
> > 
> > > +bool linuxboot_dma_disabled;
> > >  };
> > >  
> > >  #define TYPE_PC_MACHINE "generic-pc-machine"
> > > @@ -432,10 +434,6 @@ bool e820_get_entry(int, uint32_t, uint64_t *, 
> > > uint64_t *);
> > >  #define PC_COMPAT_2_6 \
> > >  HW_COMPAT_2_6 \
> > >  {\
> > > -.driver   = "fw_cfg_io",\
> > > -.property = "dma_enabled",\
> > > -.value= "off",\
> > > -},{\
> > >  .driver   = TYPE_X86_CPU,\
> > >  .property = "cpuid-0xb",\
> > >  .value= "off",\
> > > -- 
> > > 2.11.0 (Apple Git-81)
> > > 
> > 
> 

-- 
Eduardo



Re: [Qemu-devel] [RFC 0/7] Move accel, KVM, Xen, qtest files to accel/ subdir

2017-04-24 Thread Eduardo Habkost
On Mon, Apr 24, 2017 at 12:40:07PM +0200, Thomas Huth wrote:
> On 20.12.2016 18:43, Eduardo Habkost wrote:
> > This moves the KVM and Xen files to an accel/ subdir.
> > 
> > Instead of moving the *-stubs.c file to accel/ as-is, I tried to
> > move most of the stub code to libqemustub.a. This way the obj-y
> > logic for accel/ is simpler: obj-y includes accel/ only if
> > CONFIG_SOFTMMU is set.
> > 
> > The Xen stubs could be moved completely to stubs/, but some of
> > the KVM stubs depend on cpu.h. So most of the kvm-stub.c code was
> > moved to stubs/kvm.c, but some of that code was kept in
> > accel/kvm-stub.c.
> > 
> > About TCG:
> > --
> > 
> > It is not obvious to me which TCG-related files could be moved to
> > accel/, so this series doesn't move any of them yet.
> > 
> > About other CONFIG_SOFTMMU top-level files:
> > ---
> > 
> > I would like to know what we should do with the top-level
> > CONFIG_SOFTMMU-only files that don't belong to hw/. Some
> > candidates: arch_init.c cpus.c monitor.c gdbstub.c balloon.c
> > ioport.c bootdevice.c memory.c cputlb.c memory_mapping.c dump.c.
> > 
> > Maybe a sysemu/ subdir? In that case, should we still create an
> > accel/ subdir, or move xen-*, kvm-* and friends to sysemu/ too?
> > 
> > Cc: Paolo Bonzini 
> > Cc: k...@vger.kernel.org
> > Cc: Christoffer Dall 
> > Cc: Anthony Perard 
> > Cc: Stefano Stabellini 
> > Cc: xen-de...@lists.xensource.com
> > 
> > Eduardo Habkost (7):
> >   xen: Move xen-*-stub.c to stubs/
> >   xen: Move xen files to accel/
> >   kvm: Move some kvm-stub.c code to stubs/kvm.c
> >   kvm: Include kvm-stub.o only on CONFIG_SOFTMMU
> >   kvm: Move kvm*.c files to accel/
> >   accel: Move accel.c to accel/
> >   accel: Move qtest.c to accel/
> > 
> >  Makefile.objs  |  2 +-
> >  Makefile.target| 10 ++
> >  accel.c => accel/accel.c   |  0
> >  kvm-all.c => accel/kvm-common.c|  0
> >  kvm-stub.c => accel/kvm-stub.c | 51 --
> >  qtest.c => accel/qtest.c   |  0
> >  xen-common.c => accel/xen-common.c |  0
> >  xen-hvm.c => accel/xen-hvm.c   |  0
> >  xen-mapcache.c => accel/xen-mapcache.c |  0
> >  stubs/kvm.c| 65 ++
> >  xen-hvm-stub.c => stubs/xen-hvm.c  |  0
> >  xen-common-stub.c => stubs/xen.c   |  0
> >  MAINTAINERS|  4 +--
> >  accel/Makefile.objs|  9 +
> >  stubs/Makefile.objs|  2 ++
> >  15 files changed, 80 insertions(+), 63 deletions(-)
> >  rename accel.c => accel/accel.c (100%)
> >  rename kvm-all.c => accel/kvm-common.c (100%)
> >  rename kvm-stub.c => accel/kvm-stub.c (71%)
> >  rename qtest.c => accel/qtest.c (100%)
> >  rename xen-common.c => accel/xen-common.c (100%)
> >  rename xen-hvm.c => accel/xen-hvm.c (100%)
> >  rename xen-mapcache.c => accel/xen-mapcache.c (100%)
> >  rename xen-hvm-stub.c => stubs/xen-hvm.c (100%)
> >  rename xen-common-stub.c => stubs/xen.c (100%)
> >  create mode 100644 accel/Makefile.objs
> 
> Now that the development tree is open again ... any chance that we could
> get this series into 2.10 ?

I remember there were some suggestions about the code movements,
especially about the files being moved inside stubs/. I never
took the time to make a v2 implementing those suggestions, so if
anybody wants to volunteer to address the feedback on this RFC
and redo the series, please be my guest.  :)

-- 
Eduardo



Re: [Qemu-devel] [PATCH] pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot

2017-04-24 Thread Igor Mammedov
On Mon, 24 Apr 2017 16:13:17 -0300
Eduardo Habkost  wrote:

> On Mon, Apr 24, 2017 at 08:58:17PM +0200, Igor Mammedov wrote:
> [...]
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index f3b372a18f..3f2d96da64 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1047,7 +1047,7 @@ static void load_linux(PCMachineState *pcms,
> >  fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
> >  fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
> >  
> > -if (fw_cfg_dma_enabled(fw_cfg)) {
> > +if (!pcmc->linuxboot_dma_disabled && fw_cfg_dma_enabled(fw_cfg)) {
> 
> Why not name the flag just "linuxboot_dma", set it to true by
> default at pc_machine_class_init(), and avoid the double
> negative?
to avoid setting it to true somewhere else, so fewer things could go wrong,
but if you prefer the *_enable variant I can switch to it.

> 
> [...]
> > diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> > index f278b3ae89..ff6f13b61b 100644
> > --- a/include/hw/i386/pc.h
> > +++ b/include/hw/i386/pc.h
> > @@ -151,6 +151,8 @@ struct PCMachineClass {
> >  bool save_tsc_khz;
> >  /* generate legacy CPU hotplug AML */
> >  bool legacy_cpu_hotplug;
> > +
> 
> A one-line description of the consequences of setting/clearing
> the flag would be nice.
will fix in v2

> 
> > +bool linuxboot_dma_disabled;
> >  };
> >  
> >  #define TYPE_PC_MACHINE "generic-pc-machine"
> > @@ -432,10 +434,6 @@ bool e820_get_entry(int, uint32_t, uint64_t *, 
> > uint64_t *);
> >  #define PC_COMPAT_2_6 \
> >  HW_COMPAT_2_6 \
> >  {\
> > -.driver   = "fw_cfg_io",\
> > -.property = "dma_enabled",\
> > -.value= "off",\
> > -},{\
> >  .driver   = TYPE_X86_CPU,\
> >  .property = "cpuid-0xb",\
> >  .value= "off",\
> > -- 
> > 2.11.0 (Apple Git-81)
> > 
> 




[Qemu-devel] [PULL v2 12/12] qemu-iotests: _cleanup_qemu must be called on exit

2017-04-24 Thread Jeff Cody
For the tests that use the common.qemu functions for running a QEMU
process, _cleanup_qemu must be called in the exit function.

If it is not and the qemu process aborts, not all of the droppings
are cleaned up (e.g. pidfile, fifos).

This updates those tests that did not have a cleanup in qemu-iotests.

(I swapped spaces for tabs in test 102 as well)

Reported-by: Eric Blake 
Reviewed-by: Eric Blake 
Signed-off-by: Jeff Cody 
Message-id: d59c2f6ad6c1da8b9b3c7f357c94a7122ccfc55a.1492544096.git.jc...@redhat.com
---
 tests/qemu-iotests/028 |  1 +
 tests/qemu-iotests/094 | 11 ---
 tests/qemu-iotests/102 |  5 +++--
 tests/qemu-iotests/109 |  1 +
 tests/qemu-iotests/117 |  1 +
 tests/qemu-iotests/130 |  1 +
 tests/qemu-iotests/140 |  1 +
 tests/qemu-iotests/141 |  1 +
 tests/qemu-iotests/143 |  1 +
 tests/qemu-iotests/156 |  1 +
 10 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/028 b/tests/qemu-iotests/028
index 7783e57..97a8869 100755
--- a/tests/qemu-iotests/028
+++ b/tests/qemu-iotests/028
@@ -32,6 +32,7 @@ status=1  # failure is the default!
 
 _cleanup()
 {
+_cleanup_qemu
 rm -f "${TEST_IMG}.copy"
 _cleanup_test_img
 }
diff --git a/tests/qemu-iotests/094 b/tests/qemu-iotests/094
index 0ba0b0c..9aa01e3 100755
--- a/tests/qemu-iotests/094
+++ b/tests/qemu-iotests/094
@@ -27,7 +27,14 @@ echo "QA output created by $seq"
 here="$PWD"
 status=1   # failure is the default!
 
-trap "exit \$status" 0 1 2 3 15
+_cleanup()
+{
+_cleanup_qemu
+_cleanup_test_img
+rm -f "$TEST_DIR/source.$IMGFMT"
+}
+
+trap "_cleanup; exit \$status" 0 1 2 3 15
 
 # get standard environment, filters and checks
 . ./common.rc
@@ -73,8 +80,6 @@ _send_qemu_cmd $QEMU_HANDLE \
 
 wait=1 _cleanup_qemu
 
-_cleanup_test_img
-rm -f "$TEST_DIR/source.$IMGFMT"
 
 # success, all done
 echo '*** done'
diff --git a/tests/qemu-iotests/102 b/tests/qemu-iotests/102
index 64b4af9..87db1bb 100755
--- a/tests/qemu-iotests/102
+++ b/tests/qemu-iotests/102
@@ -25,11 +25,12 @@ seq=$(basename $0)
 echo "QA output created by $seq"
 
 here=$PWD
-status=1   # failure is the default!
+status=1# failure is the default!
 
 _cleanup()
 {
-   _cleanup_test_img
+_cleanup_qemu
+_cleanup_test_img
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
 
diff --git a/tests/qemu-iotests/109 b/tests/qemu-iotests/109
index 927151a..6161633 100755
--- a/tests/qemu-iotests/109
+++ b/tests/qemu-iotests/109
@@ -29,6 +29,7 @@ status=1  # failure is the default!
 
 _cleanup()
 {
+_cleanup_qemu
 rm -f $TEST_IMG.src
_cleanup_test_img
 }
diff --git a/tests/qemu-iotests/117 b/tests/qemu-iotests/117
index e955d52..6c83461 100755
--- a/tests/qemu-iotests/117
+++ b/tests/qemu-iotests/117
@@ -29,6 +29,7 @@ status=1  # failure is the default!
 
 _cleanup()
 {
+_cleanup_qemu
_cleanup_test_img
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
diff --git a/tests/qemu-iotests/130 b/tests/qemu-iotests/130
index f941fc9..e7e43de 100755
--- a/tests/qemu-iotests/130
+++ b/tests/qemu-iotests/130
@@ -31,6 +31,7 @@ status=1  # failure is the default!
 
 _cleanup()
 {
+_cleanup_qemu
 _cleanup_test_img
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
diff --git a/tests/qemu-iotests/140 b/tests/qemu-iotests/140
index 49f9df4..8c80a5a 100755
--- a/tests/qemu-iotests/140
+++ b/tests/qemu-iotests/140
@@ -33,6 +33,7 @@ status=1  # failure is the default!
 
 _cleanup()
 {
+_cleanup_qemu
 _cleanup_test_img
 rm -f "$TEST_DIR/nbd"
 }
diff --git a/tests/qemu-iotests/141 b/tests/qemu-iotests/141
index 27fb1cc..40a3405 100755
--- a/tests/qemu-iotests/141
+++ b/tests/qemu-iotests/141
@@ -29,6 +29,7 @@ status=1  # failure is the default!
 
 _cleanup()
 {
+_cleanup_qemu
 _cleanup_test_img
 rm -f "$TEST_DIR/{b,m,o}.$IMGFMT"
 }
diff --git a/tests/qemu-iotests/143 b/tests/qemu-iotests/143
index ec4ef22..5ff1944 100755
--- a/tests/qemu-iotests/143
+++ b/tests/qemu-iotests/143
@@ -29,6 +29,7 @@ status=1  # failure is the default!
 
 _cleanup()
 {
+_cleanup_qemu
 rm -f "$TEST_DIR/nbd"
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
diff --git a/tests/qemu-iotests/156 b/tests/qemu-iotests/156
index 78deaff..d799b73 100755
--- a/tests/qemu-iotests/156
+++ b/tests/qemu-iotests/156
@@ -37,6 +37,7 @@ status=1  # failure is the default!
 
 _cleanup()
 {
+_cleanup_qemu
 rm -f "$TEST_IMG{,.target}{,.backing,.overlay}"
 }
 trap "_cleanup; exit \$status" 0 1 2 3 15
-- 
2.9.3




[Qemu-devel] [PULL v2 10/12] block/rbd - update variable names to more apt names

2017-04-24 Thread Jeff Cody
Update 'clientname' to be 'user', which tracks better with both
the QAPI and rados variable naming.

Update 'name' to be 'image_name', as it indicates the rbd image.
Naming it 'image' would have been ideal, but we are using that for
the rbd_image_t value returned by rbd_open().

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Jeff Cody 
Reviewed-by: John Snow 
Message-id: 
b7ec1fb2e1cf36f9b6911631447a5b0422590b7d.1491597120.git.jc...@redhat.com
---
 block/rbd.c | 33 +
 1 file changed, 17 insertions(+), 16 deletions(-)

diff --git a/block/rbd.c b/block/rbd.c
index 1c43171..35853c9 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -94,7 +94,7 @@ typedef struct BDRVRBDState {
 rados_t cluster;
 rados_ioctx_t io_ctx;
 rbd_image_t image;
-char *name;
+char *image_name;
 char *snap;
 } BDRVRBDState;
 
@@ -350,7 +350,7 @@ static int qemu_rbd_create(const char *filename, QemuOpts 
*opts, Error **errp)
 int64_t bytes = 0;
 int64_t objsize;
 int obj_order = 0;
-const char *pool, *name, *conf, *clientname, *keypairs;
+const char *pool, *image_name, *conf, *user, *keypairs;
 const char *secretid;
 rados_t cluster;
 rados_ioctx_t io_ctx;
@@ -393,11 +393,11 @@ static int qemu_rbd_create(const char *filename, QemuOpts 
*opts, Error **errp)
  */
 pool   = qdict_get_try_str(options, "pool");
 conf   = qdict_get_try_str(options, "conf");
-clientname = qdict_get_try_str(options, "user");
-name   = qdict_get_try_str(options, "image");
+user   = qdict_get_try_str(options, "user");
+image_name = qdict_get_try_str(options, "image");
 keypairs   = qdict_get_try_str(options, "=keyvalue-pairs");
 
-ret = rados_create(&cluster, clientname);
+ret = rados_create(&cluster, user);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "error initializing");
 goto exit;
@@ -434,7 +434,7 @@ static int qemu_rbd_create(const char *filename, QemuOpts 
*opts, Error **errp)
 goto shutdown;
 }
 
-ret = rbd_create(io_ctx, name, bytes, &obj_order);
+ret = rbd_create(io_ctx, image_name, bytes, &obj_order);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "error rbd create");
 }
@@ -540,7 +540,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
*options, int flags,
  Error **errp)
 {
 BDRVRBDState *s = bs->opaque;
-const char *pool, *snap, *conf, *clientname, *name, *keypairs;
+const char *pool, *snap, *conf, *user, *image_name, *keypairs;
 const char *secretid;
 QemuOpts *opts;
 Error *local_err = NULL;
@@ -567,24 +567,24 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
*options, int flags,
 pool   = qemu_opt_get(opts, "pool");
 conf   = qemu_opt_get(opts, "conf");
 snap   = qemu_opt_get(opts, "snapshot");
-clientname = qemu_opt_get(opts, "user");
-name   = qemu_opt_get(opts, "image");
+user   = qemu_opt_get(opts, "user");
+image_name = qemu_opt_get(opts, "image");
 keypairs   = qemu_opt_get(opts, "=keyvalue-pairs");
 
-if (!pool || !name) {
+if (!pool || !image_name) {
 error_setg(errp, "Parameters 'pool' and 'image' are required");
 r = -EINVAL;
 goto failed_opts;
 }
 
-r = rados_create(&s->cluster, clientname);
+r = rados_create(&s->cluster, user);
 if (r < 0) {
 error_setg_errno(errp, -r, "error initializing");
 goto failed_opts;
 }
 
 s->snap = g_strdup(snap);
-s->name = g_strdup(name);
+s->image_name = g_strdup(image_name);
 
 /* try default location when conf=NULL, but ignore failure */
 r = rados_conf_read_file(s->cluster, conf);
@@ -636,9 +636,10 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
*options, int flags,
 }
 
 /* rbd_open is always r/w */
-r = rbd_open(s->io_ctx, s->name, &s->image, s->snap);
+r = rbd_open(s->io_ctx, s->image_name, &s->image, s->snap);
 if (r < 0) {
-error_setg_errno(errp, -r, "error reading header from %s", s->name);
+error_setg_errno(errp, -r, "error reading header from %s",
+ s->image_name);
 goto failed_open;
 }
 
@@ -660,7 +661,7 @@ failed_open:
 failed_shutdown:
 rados_shutdown(s->cluster);
 g_free(s->snap);
-g_free(s->name);
+g_free(s->image_name);
 failed_opts:
 qemu_opts_del(opts);
 g_free(mon_host);
@@ -674,7 +675,7 @@ static void qemu_rbd_close(BlockDriverState *bs)
 rbd_close(s->image);
 rados_ioctx_destroy(s->io_ctx);
 g_free(s->snap);
-g_free(s->name);
+g_free(s->image_name);
 rados_shutdown(s->cluster);
 }
 
-- 
2.9.3




[Qemu-devel] [PULL v2 11/12] block/rbd: Add support for reopen()

2017-04-24 Thread Jeff Cody
This adds support for reopen in rbd, for changing between r/w and r/o.

Note that this is only a flag change; however, we block a change from
r/o to r/w if we are using an RBD internal snapshot.
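
The rule described above can be modelled in isolation. This is only a
sketch: MockRBDState, MOCK_O_RDWR and mock_rbd_reopen_prepare() are
hypothetical stand-ins and not QEMU API; the real check is
qemu_rbd_reopen_prepare() in the diff that follows.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's BDRV_O_RDWR flag bit. */
#define MOCK_O_RDWR 0x0002

/* Hypothetical stand-in for BDRVRBDState: only the snapshot name matters. */
typedef struct MockRBDState {
    const char *snap;   /* NULL when no RBD internal snapshot is in use */
} MockRBDState;

/* Returns 0 if the reopen may proceed, or -EINVAL if it asks for r/w
 * access while an RBD internal snapshot is in use. */
static int mock_rbd_reopen_prepare(const MockRBDState *s, int new_flags)
{
    if (s->snap != NULL && (new_flags & MOCK_O_RDWR)) {
        return -EINVAL;
    }
    return 0;
}
```

Only the combination "snapshot in use" plus "r/w requested" is rejected
here; reopening a snapshot r/o, or a plain image r/w, passes the check.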

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Jeff Cody 
Reviewed-by: John Snow 
Message-id: 
d4e87539167ec6527d44c97b164eabcccf96e4f3.1491597120.git.jc...@redhat.com
---
 block/rbd.c | 21 +
 1 file changed, 21 insertions(+)

diff --git a/block/rbd.c b/block/rbd.c
index 35853c9..6471f4f 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -668,6 +668,26 @@ failed_opts:
 return r;
 }
 
+
+/* Since RBD is currently always opened R/W via the API,
+ * we just need to check if we are using a snapshot or not, in
+ * order to determine if we will allow it to be R/W */
+static int qemu_rbd_reopen_prepare(BDRVReopenState *state,
+   BlockReopenQueue *queue, Error **errp)
+{
+BDRVRBDState *s = state->bs->opaque;
+int ret = 0;
+
+if (s->snap && state->flags & BDRV_O_RDWR) {
+error_setg(errp,
+   "Cannot change node '%s' to r/w when using RBD snapshot",
+   bdrv_get_device_or_node_name(state->bs));
+ret = -EINVAL;
+}
+
+return ret;
+}
+
 static void qemu_rbd_close(BlockDriverState *bs)
 {
 BDRVRBDState *s = bs->opaque;
@@ -1074,6 +1094,7 @@ static BlockDriver bdrv_rbd = {
 .bdrv_parse_filename= qemu_rbd_parse_filename,
 .bdrv_file_open = qemu_rbd_open,
 .bdrv_close = qemu_rbd_close,
+.bdrv_reopen_prepare= qemu_rbd_reopen_prepare,
 .bdrv_create= qemu_rbd_create,
 .bdrv_has_zero_init = bdrv_has_zero_init_1,
 .bdrv_get_info  = qemu_rbd_getinfo,
-- 
2.9.3




[Qemu-devel] [PULL v2 08/12] block: introduce bdrv_can_set_read_only()

2017-04-24 Thread Jeff Cody
Introduce a check function for setting the read_only flag.  It returns < 0
on error, with an appropriate Error value set, and does not alter any flags.
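
The split between checking and setting can be sketched outside of QEMU.
The names below (MockBDS, MOCK_O_ALLOW_RDWR, mock_can_set_read_only(),
mock_set_read_only()) are hypothetical, and the Error reporting of the
real functions is left out; only the return codes visible in this series
(-EINVAL and -EPERM) are modelled.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for QEMU's BDRV_O_ALLOW_RDWR flag bit. */
#define MOCK_O_ALLOW_RDWR 0x2000

/* Hypothetical, reduced model of the BlockDriverState fields involved. */
typedef struct MockBDS {
    bool read_only;
    bool copy_on_read;
    int open_flags;
} MockBDS;

/* Pure check: reports whether the change would be allowed, but never
 * modifies the state. */
static int mock_can_set_read_only(const MockBDS *bs, bool read_only)
{
    /* Do not set read_only if copy_on_read is enabled */
    if (bs->copy_on_read && read_only) {
        return -EINVAL;
    }
    /* Do not clear read_only if it is prohibited */
    if (!read_only && !(bs->open_flags & MOCK_O_ALLOW_RDWR)) {
        return -EPERM;
    }
    return 0;
}

/* Setter: runs the check first and only then changes the flag. */
static int mock_set_read_only(MockBDS *bs, bool read_only)
{
    int ret = mock_can_set_read_only(bs, read_only);
    if (ret < 0) {
        return ret;
    }
    bs->read_only = read_only;
    return 0;
}
```

Keeping the check free of side effects is what lets a caller such as a
reopen "prepare" step validate a transition without committing to it.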

Signed-off-by: Jeff Cody 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: John Snow 
Message-id: 
e2bba34ac3bc76a0c42adc390413f358ae0566e8.1491597120.git.jc...@redhat.com
---
 block.c   | 14 +-
 include/block/block.h |  1 +
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 123c982..1ac05c1 100644
--- a/block.c
+++ b/block.c
@@ -197,7 +197,7 @@ bool bdrv_is_read_only(BlockDriverState *bs)
 return bs->read_only;
 }
 
-int bdrv_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
+int bdrv_can_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
 {
 /* Do not set read_only if copy_on_read is enabled */
 if (bs->copy_on_read && read_only) {
@@ -213,6 +213,18 @@ int bdrv_set_read_only(BlockDriverState *bs, bool 
read_only, Error **errp)
 return -EPERM;
 }
 
+return 0;
+}
+
+int bdrv_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
+{
+int ret = 0;
+
+ret = bdrv_can_set_read_only(bs, read_only, errp);
+if (ret < 0) {
+return ret;
+}
+
 bs->read_only = read_only;
 return 0;
 }
diff --git a/include/block/block.h b/include/block/block.h
index 1d7fd19..144df0d 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -434,6 +434,7 @@ int bdrv_is_allocated_above(BlockDriverState *top, 
BlockDriverState *base,
 int64_t sector_num, int nb_sectors, int *pnum);
 
 bool bdrv_is_read_only(BlockDriverState *bs);
+int bdrv_can_set_read_only(BlockDriverState *bs, bool read_only, Error **errp);
 int bdrv_set_read_only(BlockDriverState *bs, bool read_only, Error **errp);
 bool bdrv_is_sg(BlockDriverState *bs);
 bool bdrv_is_inserted(BlockDriverState *bs);
-- 
2.9.3




[Qemu-devel] [PULL v2 07/12] block: code movement

2017-04-24 Thread Jeff Cody
Move bdrv_is_read_only() up with its friends.

Reviewed-by: Stefan Hajnoczi 
Reviewed-by: John Snow 
Signed-off-by: Jeff Cody 
Message-id: 
73b2399459760c32506f9407efb9dddb3a2789de.1491597120.git.jc...@redhat.com
---
 block.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/block.c b/block.c
index 4752fee..123c982 100644
--- a/block.c
+++ b/block.c
@@ -192,6 +192,11 @@ void path_combine(char *dest, int dest_size,
 }
 }
 
+bool bdrv_is_read_only(BlockDriverState *bs)
+{
+return bs->read_only;
+}
+
 int bdrv_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
 {
 /* Do not set read_only if copy_on_read is enabled */
@@ -3375,11 +3380,6 @@ void bdrv_get_geometry(BlockDriverState *bs, uint64_t 
*nb_sectors_ptr)
 *nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
 }
 
-bool bdrv_is_read_only(BlockDriverState *bs)
-{
-return bs->read_only;
-}
-
 bool bdrv_is_sg(BlockDriverState *bs)
 {
 return bs->sg;
-- 
2.9.3




[Qemu-devel] [PULL v2 09/12] block: use bdrv_can_set_read_only() during reopen

2017-04-24 Thread Jeff Cody
Signed-off-by: Jeff Cody 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: John Snow 
Message-id: 
00aed7ffdd7be4b9ed9ce1007d50028a72b34ebe.1491597120.git.jc...@redhat.com
---
 block.c | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 1ac05c1..5db266b 100644
--- a/block.c
+++ b/block.c
@@ -2789,6 +2789,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
BlockReopenQueue *queue,
 BlockDriver *drv;
 QemuOpts *opts;
 const char *value;
+bool read_only;
 
 assert(reopen_state != NULL);
 assert(reopen_state->bs->drv != NULL);
@@ -2817,12 +2818,13 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
BlockReopenQueue *queue,
 qdict_put(reopen_state->options, "driver", qstring_from_str(value));
 }
 
-/* if we are to stay read-only, do not allow permission change
- * to r/w */
-if (!(reopen_state->bs->open_flags & BDRV_O_ALLOW_RDWR) &&
-reopen_state->flags & BDRV_O_RDWR) {
-error_setg(errp, "Node '%s' is read only",
-   bdrv_get_device_or_node_name(reopen_state->bs));
+/* If we are to stay read-only, do not allow permission change
+ * to r/w. Attempting to set to r/w may fail if either BDRV_O_ALLOW_RDWR is
+ * not set, or if the BDS still has copy_on_read enabled */
+read_only = !(reopen_state->flags & BDRV_O_RDWR);
+ret = bdrv_can_set_read_only(reopen_state->bs, read_only, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
 goto error;
 }
 
-- 
2.9.3




[Qemu-devel] [PULL v2 02/12] block/vxhs.c: Add qemu-iotests for new block device type "vxhs"

2017-04-24 Thread Jeff Cody
From: Ashish Mittal 

These changes use a vxhs test server that is a part of the following
repository:
https://github.com/VeritasHyperScale/libqnio.git

Signed-off-by: Ashish Mittal 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Jeff Cody 
Signed-off-by: Jeff Cody 
Message-id: 1491277689-24949-3-git-send-email-ashish.mit...@veritas.com
---
 tests/qemu-iotests/common|  6 ++
 tests/qemu-iotests/common.config | 13 +
 tests/qemu-iotests/common.filter |  1 +
 tests/qemu-iotests/common.rc | 19 +++
 4 files changed, 39 insertions(+)

diff --git a/tests/qemu-iotests/common b/tests/qemu-iotests/common
index 4d5650d..9c6f972 100644
--- a/tests/qemu-iotests/common
+++ b/tests/qemu-iotests/common
@@ -157,6 +157,7 @@ check options
 -ssh    test ssh
 -nfs    test nfs
 -luks   test luks
+-vxhs   test vxhs
 -xdiff  graphical mode diff
 -nocache    use O_DIRECT on backing file
 -misalign   misalign memory allocations
@@ -260,6 +261,11 @@ testlist options
 xpand=false
 ;;
 
+-vxhs)
+IMGPROTO=vxhs
+xpand=false
+;;
+
 -ssh)
 IMGPROTO=ssh
 xpand=false
diff --git a/tests/qemu-iotests/common.config b/tests/qemu-iotests/common.config
index 55527aa..c4b51b3 100644
--- a/tests/qemu-iotests/common.config
+++ b/tests/qemu-iotests/common.config
@@ -105,6 +105,10 @@ if [ -z "$QEMU_NBD_PROG" ]; then
 export QEMU_NBD_PROG="`set_prog_path qemu-nbd`"
 fi
 
+if [ -z "$QEMU_VXHS_PROG" ]; then
+export QEMU_VXHS_PROG="`set_prog_path qnio_server`"
+fi
+
 _qemu_wrapper()
 {
 (
@@ -156,10 +160,19 @@ _qemu_nbd_wrapper()
 )
 }
 
+_qemu_vxhs_wrapper()
+{
+(
+echo $BASHPID > "${TEST_DIR}/qemu-vxhs.pid"
+exec "$QEMU_VXHS_PROG" $QEMU_VXHS_OPTIONS "$@"
+)
+}
+
 export QEMU=_qemu_wrapper
 export QEMU_IMG=_qemu_img_wrapper
 export QEMU_IO=_qemu_io_wrapper
 export QEMU_NBD=_qemu_nbd_wrapper
+export QEMU_VXHS=_qemu_vxhs_wrapper
 
 QEMU_IMG_EXTRA_ARGS=
 if [ "$IMGOPTSSYNTAX" = "true" ]; then
diff --git a/tests/qemu-iotests/common.filter b/tests/qemu-iotests/common.filter
index 1040013..c9a2d5c 100644
--- a/tests/qemu-iotests/common.filter
+++ b/tests/qemu-iotests/common.filter
@@ -122,6 +122,7 @@ _filter_img_info()
 -e "s#$TEST_DIR#TEST_DIR#g" \
 -e "s#$IMGFMT#IMGFMT#g" \
 -e 's#nbd://127.0.0.1:10810$#TEST_DIR/t.IMGFMT#g' \
+-e 's#json.*vdisk-id.*vxhs"}}#TEST_DIR/t.IMGFMT#' \
 -e "/encrypted: yes/d" \
 -e "/cluster_size: [0-9]\\+/d" \
 -e "/table_size: [0-9]\\+/d" \
diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc
index 7d4781d..62529ee 100644
--- a/tests/qemu-iotests/common.rc
+++ b/tests/qemu-iotests/common.rc
@@ -85,6 +85,9 @@ else
 elif [ "$IMGPROTO" = "nfs" ]; then
 TEST_DIR="nfs://127.0.0.1/$TEST_DIR"
 TEST_IMG=$TEST_DIR/t.$IMGFMT
+elif [ "$IMGPROTO" = "vxhs" ]; then
+TEST_IMG_FILE=$TEST_DIR/t.$IMGFMT
+TEST_IMG="vxhs://127.0.0.1:/t.$IMGFMT"
 else
 TEST_IMG=$IMGPROTO:$TEST_DIR/t.$IMGFMT
 fi
@@ -171,6 +174,12 @@ _make_test_img()
 eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810 -f $IMGFMT  $TEST_IMG_FILE >/dev/null &"
 sleep 1 # FIXME: qemu-nbd needs to be listening before we continue
 fi
+
+# Start QNIO server on image directory for vxhs protocol
+if [ $IMGPROTO = "vxhs" ]; then
+eval "$QEMU_VXHS -d  $TEST_DIR > /dev/null &"
+sleep 1 # Wait for server to come up.
+fi
 }
 
 _rm_test_img()
@@ -197,6 +206,16 @@ _cleanup_test_img()
 fi
 rm -f "$TEST_IMG_FILE"
 ;;
+vxhs)
+if [ -f "${TEST_DIR}/qemu-vxhs.pid" ]; then
+local QEMU_VXHS_PID
+read QEMU_VXHS_PID < "${TEST_DIR}/qemu-vxhs.pid"
+kill ${QEMU_VXHS_PID} >/dev/null 2>&1
+rm -f "${TEST_DIR}/qemu-vxhs.pid"
+fi
+rm -f "$TEST_IMG_FILE"
+;;
+
 file)
 _rm_test_img "$TEST_DIR/t.$IMGFMT"
 _rm_test_img "$TEST_DIR/t.$IMGFMT.orig"
-- 
2.9.3




[Qemu-devel] [PULL v2 04/12] block: add bdrv_set_read_only() helper function

2017-04-24 Thread Jeff Cody
We have a helper wrapper for checking the BDS read_only flag;
add a helper wrapper to set the read_only flag as well.

Reviewed-by: Stefan Hajnoczi 
Signed-off-by: Jeff Cody 
Reviewed-by: John Snow 
Message-id: 
9b18972d05f5fa2ac16c014f0af98d680553048d.1491597120.git.jc...@redhat.com
---
 block.c   | 5 +
 block/bochs.c | 2 +-
 block/cloop.c | 2 +-
 block/dmg.c   | 2 +-
 block/rbd.c   | 2 +-
 block/vvfat.c | 4 ++--
 include/block/block.h | 1 +
 7 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/block.c b/block.c
index 7eda9a4..6f21145 100644
--- a/block.c
+++ b/block.c
@@ -192,6 +192,11 @@ void path_combine(char *dest, int dest_size,
 }
 }
 
+void bdrv_set_read_only(BlockDriverState *bs, bool read_only)
+{
+bs->read_only = read_only;
+}
+
 void bdrv_get_full_backing_filename_from_filename(const char *backed,
   const char *backing,
   char *dest, size_t sz,
diff --git a/block/bochs.c b/block/bochs.c
index 516da56..bdc2831 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -110,7 +110,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, 
int flags,
 return -EINVAL;
 }
 
-bs->read_only = true; /* no write support yet */
+bdrv_set_read_only(bs, true); /* no write support yet */
 
 ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
 if (ret < 0) {
diff --git a/block/cloop.c b/block/cloop.c
index a6c7b9d..11f17c8 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -72,7 +72,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, 
int flags,
 return -EINVAL;
 }
 
-bs->read_only = true;
+bdrv_set_read_only(bs, true);
 
 /* read header */
 ret = bdrv_pread(bs->file, 128, &s->block_size, 4);
diff --git a/block/dmg.c b/block/dmg.c
index a7d25fc..27ce4a6 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -420,7 +420,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, 
int flags,
 }
 
 block_module_load_one("dmg-bz2");
-bs->read_only = true;
+bdrv_set_read_only(bs, true);
 
 s->n_chunks = 0;
 s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
diff --git a/block/rbd.c b/block/rbd.c
index 1ceeeb5..6ad2904 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -641,7 +641,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
*options, int flags,
 goto failed_open;
 }
 
-bs->read_only = (s->snap != NULL);
+bdrv_set_read_only(bs, (s->snap != NULL));
 
 qemu_opts_del(opts);
 return 0;
diff --git a/block/vvfat.c b/block/vvfat.c
index af5153d..d4ce6d7 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1157,7 +1157,7 @@ static int vvfat_open(BlockDriverState *bs, QDict 
*options, int flags,
 s->current_cluster=0x;
 
 /* read only is the default for safety */
-bs->read_only = true;
+bdrv_set_read_only(bs, true);
 s->qcow = NULL;
 s->qcow_filename = NULL;
 s->fat2 = NULL;
@@ -1173,7 +1173,7 @@ static int vvfat_open(BlockDriverState *bs, QDict 
*options, int flags,
 if (ret < 0) {
 goto fail;
 }
-bs->read_only = false;
+bdrv_set_read_only(bs, false);
 }
 
 bs->total_sectors = cyls * heads * secs;
diff --git a/include/block/block.h b/include/block/block.h
index 466de49..99d49f2 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -434,6 +434,7 @@ int bdrv_is_allocated_above(BlockDriverState *top, 
BlockDriverState *base,
 int64_t sector_num, int nb_sectors, int *pnum);
 
 bool bdrv_is_read_only(BlockDriverState *bs);
+void bdrv_set_read_only(BlockDriverState *bs, bool read_only);
 bool bdrv_is_sg(BlockDriverState *bs);
 bool bdrv_is_inserted(BlockDriverState *bs);
 int bdrv_media_changed(BlockDriverState *bs);
-- 
2.9.3




[Qemu-devel] [PULL v2 05/12] block: do not set BDS read_only if copy_on_read enabled

2017-04-24 Thread Jeff Cody
A few block drivers will set the BDS read_only flag from their
.bdrv_open() function.  This means the bs->read_only flag could
be set after we enable copy_on_read, as the BDRV_O_COPY_ON_READ
flag check occurs prior to the call to bdrv->bdrv_open().

This adds an error return to bdrv_set_read_only(), and an error will be
returned if we try to set the BDS to read_only while copy_on_read is
enabled.

This patch also changes the behavior of vvfat.  Before, vvfat could
override the drive 'readonly' flag with its own, internal 'rw' flag.

For instance, this -drive parameter would result in a writable image:

"-drive format=vvfat,dir=/tmp/vvfat,rw,if=virtio,readonly=on"

This is not correct.  Now, attempting to use the above -drive parameter
will result in an error (i.e., 'rw' is incompatible with 'readonly=on').
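
The intended outcome for the two options can be sketched as a small
decision function. This is a hypothetical model, not the vvfat code:
MockVvfatOpts and mock_vvfat_resolve() are invented names, and the
-EPERM return value is an assumption.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model of the two user-visible knobs involved. */
typedef struct MockVvfatOpts {
    bool drive_readonly;  /* -drive ...,readonly=on */
    bool rw;              /* vvfat's own 'rw' option */
} MockVvfatOpts;

/* On success returns 0 and stores the resulting read-only state in
 * *read_only.  Returns -EPERM (assumed error code) when 'rw' is
 * requested on a drive that was opened with readonly=on; *read_only is
 * left untouched in that case. */
static int mock_vvfat_resolve(const MockVvfatOpts *o, bool *read_only)
{
    if (o->rw) {
        if (o->drive_readonly) {
            return -EPERM;  /* 'rw' is incompatible with 'readonly=on' */
        }
        *read_only = false;
        return 0;
    }
    /* read only is the default for safety */
    *read_only = true;
    return 0;
}
```

With this model, the -drive example above (rw together with readonly=on)
is the only combination that fails; 'rw' on its own yields a writable
image, and omitting 'rw' keeps the image read-only.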

Signed-off-by: Jeff Cody 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: John Snow 
Message-id: 
0c5b4c1cc2c651471b131f21376dfd5ea24d2196.1491597120.git.jc...@redhat.com
---
 block.c   | 10 +-
 block/bochs.c |  5 -
 block/cloop.c |  5 -
 block/dmg.c   |  6 +-
 block/rbd.c   | 11 ++-
 block/vvfat.c | 19 +++
 include/block/block.h |  2 +-
 7 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/block.c b/block.c
index 6f21145..67ae35a 100644
--- a/block.c
+++ b/block.c
@@ -192,9 +192,17 @@ void path_combine(char *dest, int dest_size,
 }
 }
 
-void bdrv_set_read_only(BlockDriverState *bs, bool read_only)
+int bdrv_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
 {
+/* Do not set read_only if copy_on_read is enabled */
+if (bs->copy_on_read && read_only) {
+error_setg(errp, "Can't set node '%s' to r/o with copy-on-read enabled",
+   bdrv_get_device_or_node_name(bs));
+return -EINVAL;
+}
+
 bs->read_only = read_only;
+return 0;
 }
 
 void bdrv_get_full_backing_filename_from_filename(const char *backed,
diff --git a/block/bochs.c b/block/bochs.c
index bdc2831..a759b6e 100644
--- a/block/bochs.c
+++ b/block/bochs.c
@@ -110,7 +110,10 @@ static int bochs_open(BlockDriverState *bs, QDict 
*options, int flags,
 return -EINVAL;
 }
 
-bdrv_set_read_only(bs, true); /* no write support yet */
+ret = bdrv_set_read_only(bs, true, errp); /* no write support yet */
+if (ret < 0) {
+return ret;
+}
 
 ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
 if (ret < 0) {
diff --git a/block/cloop.c b/block/cloop.c
index 11f17c8..d6597fc 100644
--- a/block/cloop.c
+++ b/block/cloop.c
@@ -72,7 +72,10 @@ static int cloop_open(BlockDriverState *bs, QDict *options, 
int flags,
 return -EINVAL;
 }
 
-bdrv_set_read_only(bs, true);
+ret = bdrv_set_read_only(bs, true, errp);
+if (ret < 0) {
+return ret;
+}
 
 /* read header */
 ret = bdrv_pread(bs->file, 128, &s->block_size, 4);
diff --git a/block/dmg.c b/block/dmg.c
index 27ce4a6..900ae5a 100644
--- a/block/dmg.c
+++ b/block/dmg.c
@@ -419,8 +419,12 @@ static int dmg_open(BlockDriverState *bs, QDict *options, 
int flags,
 return -EINVAL;
 }
 
+ret = bdrv_set_read_only(bs, true, errp);
+if (ret < 0) {
+return ret;
+}
+
 block_module_load_one("dmg-bz2");
-bdrv_set_read_only(bs, true);
 
 s->n_chunks = 0;
 s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
diff --git a/block/rbd.c b/block/rbd.c
index 6ad2904..1c43171 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -635,13 +635,22 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict 
*options, int flags,
 goto failed_shutdown;
 }
 
+/* rbd_open is always r/w */
 r = rbd_open(s->io_ctx, s->name, &s->image, s->snap);
 if (r < 0) {
 error_setg_errno(errp, -r, "error reading header from %s", s->name);
 goto failed_open;
 }
 
-bdrv_set_read_only(bs, (s->snap != NULL));
+/* If we are using an rbd snapshot, we must be r/o, otherwise
+ * leave as-is */
+if (s->snap != NULL) {
+r = bdrv_set_read_only(bs, true, &local_err);
+if (r < 0) {
+error_propagate(errp, local_err);
+goto failed_open;
+}
+}
 
 qemu_opts_del(opts);
 return 0;
diff --git a/block/vvfat.c b/block/vvfat.c
index d4ce6d7..b509d55 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -1156,8 +1156,6 @@ static int vvfat_open(BlockDriverState *bs, QDict 
*options, int flags,
 
 s->current_cluster=0x;
 
-/* read only is the default for safety */
-bdrv_set_read_only(bs, true);
 s->qcow = NULL;
 s->qcow_filename = NULL;
 s->fat2 = NULL;
@@ -1169,11 +1167,24 @@ static int vvfat_open(BlockDriverState *bs, QDict 
*options, int flags,
 s->sector_count = cyls * heads * secs - (s->first_sectors_number - 1);
 
 if (qemu_opt_get_bool(opts, "rw", false)) {
-ret = 

[Qemu-devel] [PULL v2 03/12] qemu-iotests: exclude vxhs from image creation via protocol

2017-04-24 Thread Jeff Cody
The protocol VXHS does not support image creation.  Some tests expect
to be able to create images through the protocol.  Exclude VXHS from
these tests.

Signed-off-by: Jeff Cody 
---
 tests/qemu-iotests/017 | 1 +
 tests/qemu-iotests/020 | 1 +
 tests/qemu-iotests/029 | 1 +
 tests/qemu-iotests/073 | 1 +
 tests/qemu-iotests/114 | 1 +
 tests/qemu-iotests/130 | 1 +
 tests/qemu-iotests/134 | 1 +
 tests/qemu-iotests/156 | 1 +
 tests/qemu-iotests/158 | 1 +
 9 files changed, 9 insertions(+)

diff --git a/tests/qemu-iotests/017 b/tests/qemu-iotests/017
index e3f9e75..4f9302d 100755
--- a/tests/qemu-iotests/017
+++ b/tests/qemu-iotests/017
@@ -41,6 +41,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 # Any format supporting backing files
 _supported_fmt qcow qcow2 vmdk qed
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 _unsupported_imgopts "subformat=monolithicFlat" "subformat=twoGbMaxExtentFlat"
 
diff --git a/tests/qemu-iotests/020 b/tests/qemu-iotests/020
index 9c4a68c..7a0 100755
--- a/tests/qemu-iotests/020
+++ b/tests/qemu-iotests/020
@@ -43,6 +43,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 # Any format supporting backing files
 _supported_fmt qcow qcow2 vmdk qed
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 _unsupported_imgopts "subformat=monolithicFlat" \
  "subformat=twoGbMaxExtentFlat" \
diff --git a/tests/qemu-iotests/029 b/tests/qemu-iotests/029
index e639ac0..30bab24 100755
--- a/tests/qemu-iotests/029
+++ b/tests/qemu-iotests/029
@@ -42,6 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 # Any format supporting intenal snapshots
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 # Internal snapshots are (currently) impossible with refcount_bits=1
 _unsupported_imgopts 'refcount_bits=1[^0-9]'
diff --git a/tests/qemu-iotests/073 b/tests/qemu-iotests/073
index ad37a61..40f85b1 100755
--- a/tests/qemu-iotests/073
+++ b/tests/qemu-iotests/073
@@ -39,6 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 CLUSTER_SIZE=64k
diff --git a/tests/qemu-iotests/114 b/tests/qemu-iotests/114
index f110d4f..5b7dc54 100755
--- a/tests/qemu-iotests/114
+++ b/tests/qemu-iotests/114
@@ -39,6 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 
diff --git a/tests/qemu-iotests/130 b/tests/qemu-iotests/130
index ecc8a5b..f941fc9 100755
--- a/tests/qemu-iotests/130
+++ b/tests/qemu-iotests/130
@@ -42,6 +42,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 qemu_comm_method="monitor"
diff --git a/tests/qemu-iotests/134 b/tests/qemu-iotests/134
index af618b8..acce946 100755
--- a/tests/qemu-iotests/134
+++ b/tests/qemu-iotests/134
@@ -39,6 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 
diff --git a/tests/qemu-iotests/156 b/tests/qemu-iotests/156
index cc95ff1..78deaff 100755
--- a/tests/qemu-iotests/156
+++ b/tests/qemu-iotests/156
@@ -48,6 +48,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2 qed
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 # Create source disk
diff --git a/tests/qemu-iotests/158 b/tests/qemu-iotests/158
index a6cdd6d..ef8d70f 100755
--- a/tests/qemu-iotests/158
+++ b/tests/qemu-iotests/158
@@ -39,6 +39,7 @@ trap "_cleanup; exit \$status" 0 1 2 3 15
 
 _supported_fmt qcow2
 _supported_proto generic
+_unsupported_proto vxhs
 _supported_os Linux
 
 
-- 
2.9.3




[Qemu-devel] [PULL v2 06/12] block: honor BDRV_O_ALLOW_RDWR when clearing bs->read_only

2017-04-24 Thread Jeff Cody
The BDRV_O_ALLOW_RDWR flag allows / prohibits the changing of
the BDS 'read_only' state, but there are a few places where it
is ignored.  In the bdrv_set_read_only() helper, make sure to
honor the flag.

Signed-off-by: Jeff Cody 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: John Snow 
Message-id: 
be2e5fb2d285cbece2b6d06bed54a6f56520d251.1491597120.git.jc...@redhat.com
---
 block.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/block.c b/block.c
index 67ae35a..4752fee 100644
--- a/block.c
+++ b/block.c
@@ -201,6 +201,13 @@ int bdrv_set_read_only(BlockDriverState *bs, bool 
read_only, Error **errp)
 return -EINVAL;
 }
 
+/* Do not clear read_only if it is prohibited */
+if (!read_only && !(bs->open_flags & BDRV_O_ALLOW_RDWR)) {
+error_setg(errp, "Node '%s' is read only",
+   bdrv_get_device_or_node_name(bs));
+return -EPERM;
+}
+
 bs->read_only = read_only;
 return 0;
 }
-- 
2.9.3




[Qemu-devel] [PULL v2 00/12] Block patches

2017-04-24 Thread Jeff Cody
The following changes since commit 4c55b1d0bad8a703f0499fe62e3761a0cd288da3:

  Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-04-24' into 
staging (2017-04-24 14:49:48 +0100)

are available in the git repository at:

  git://github.com/codyprime/qemu-kvm-jtc.git tags/block-pull-request

for you to fetch changes up to ecfa185400ade2abc9915efa924cbad1e15a21a4:

  qemu-iotests: _cleanup_qemu must be called on exit (2017-04-24 15:09:33 -0400)


Pull v2, with 32-bit errors fixed.  I don't have OS X to test compile on,
but I think it is safe to assume the cause of the compile error was the same.


Ashish Mittal (2):
  block/vxhs.c: Add support for a new block device type called "vxhs"
  block/vxhs.c: Add qemu-iotests for new block device type "vxhs"

Jeff Cody (10):
  qemu-iotests: exclude vxhs from image creation via protocol
  block: add bdrv_set_read_only() helper function
  block: do not set BDS read_only if copy_on_read enabled
  block: honor BDRV_O_ALLOW_RDWR when clearing bs->read_only
  block: code movement
  block: introduce bdrv_can_set_read_only()
  block: use bdrv_can_set_read_only() during reopen
  block/rbd - update variable names to more apt names
  block/rbd: Add support for reopen()
  qemu-iotests: _cleanup_qemu must be called on exit

 block.c  |  56 +++-
 block/Makefile.objs  |   2 +
 block/bochs.c|   5 +-
 block/cloop.c|   5 +-
 block/dmg.c  |   6 +-
 block/rbd.c  |  65 +++--
 block/trace-events   |  17 ++
 block/vvfat.c|  19 +-
 block/vxhs.c | 575 +++
 configure|  39 +++
 include/block/block.h|   2 +
 qapi/block-core.json |  23 +-
 tests/qemu-iotests/017   |   1 +
 tests/qemu-iotests/020   |   1 +
 tests/qemu-iotests/028   |   1 +
 tests/qemu-iotests/029   |   1 +
 tests/qemu-iotests/073   |   1 +
 tests/qemu-iotests/094   |  11 +-
 tests/qemu-iotests/102   |   5 +-
 tests/qemu-iotests/109   |   1 +
 tests/qemu-iotests/114   |   1 +
 tests/qemu-iotests/117   |   1 +
 tests/qemu-iotests/130   |   2 +
 tests/qemu-iotests/134   |   1 +
 tests/qemu-iotests/140   |   1 +
 tests/qemu-iotests/141   |   1 +
 tests/qemu-iotests/143   |   1 +
 tests/qemu-iotests/156   |   2 +
 tests/qemu-iotests/158   |   1 +
 tests/qemu-iotests/common|   6 +
 tests/qemu-iotests/common.config |  13 +
 tests/qemu-iotests/common.filter |   1 +
 tests/qemu-iotests/common.rc |  19 ++
 33 files changed, 844 insertions(+), 42 deletions(-)
 create mode 100644 block/vxhs.c

-- 
2.9.3




[Qemu-devel] [PULL v2 01/12] block/vxhs.c: Add support for a new block device type called "vxhs"

2017-04-24 Thread Jeff Cody
From: Ashish Mittal 

Source code for the qnio library that this code loads can be downloaded from:
https://github.com/VeritasHyperScale/libqnio.git

Sample command line using JSON syntax:
./x86_64-softmmu/qemu-system-x86_64 -name instance-0008 -S -vnc 0.0.0.0:0
-k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
-msg timestamp=on
'json:{"driver":"vxhs","vdisk-id":"c3e9095a-a5ee-4dce-afeb-2a59fb387410",
"server":{"host":"172.172.17.4","port":""}}'

Sample command line using URI syntax:
qemu-img convert -f raw -O raw -n
/var/lib/nova/instances/_base/0c5eacd5ebea5ed914b6a3e7b18f1ce734c386ad
vxhs://192.168.0.1:/c6718f6b-0401-441d-a8c3-1f0064d75ee0

Sample command line using TLS credentials (run in secure mode):
./qemu-io --object
tls-creds-x509,id=tls0,dir=/etc/pki/qemu/vxhs,endpoint=client -c 'read
-v 66000 2.5k' 'json:{"server.host": "127.0.0.1", "server.port": "",
"vdisk-id": "/test.raw", "driver": "vxhs", "tls-creds":"tls0"}'

[Jeff: Modified trace-events with the correct string formatting]

Signed-off-by: Ashish Mittal 
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Jeff Cody 
Signed-off-by: Jeff Cody 
Message-id: 1491277689-24949-2-git-send-email-ashish.mit...@veritas.com
---
 block/Makefile.objs  |   2 +
 block/trace-events   |  17 ++
 block/vxhs.c | 575 +++
 configure|  39 
 qapi/block-core.json |  23 ++-
 5 files changed, 654 insertions(+), 2 deletions(-)
 create mode 100644 block/vxhs.c

diff --git a/block/Makefile.objs b/block/Makefile.objs
index de96f8e..ea95530 100644
--- a/block/Makefile.objs
+++ b/block/Makefile.objs
@@ -19,6 +19,7 @@ block-obj-$(CONFIG_LIBNFS) += nfs.o
 block-obj-$(CONFIG_CURL) += curl.o
 block-obj-$(CONFIG_RBD) += rbd.o
 block-obj-$(CONFIG_GLUSTERFS) += gluster.o
+block-obj-$(CONFIG_VXHS) += vxhs.o
 block-obj-$(CONFIG_LIBSSH2) += ssh.o
 block-obj-y += accounting.o dirty-bitmap.o
 block-obj-y += write-threshold.o
@@ -38,6 +39,7 @@ rbd.o-cflags   := $(RBD_CFLAGS)
 rbd.o-libs := $(RBD_LIBS)
 gluster.o-cflags   := $(GLUSTERFS_CFLAGS)
 gluster.o-libs := $(GLUSTERFS_LIBS)
+vxhs.o-libs:= $(VXHS_LIBS)
 ssh.o-cflags   := $(LIBSSH2_CFLAGS)
 ssh.o-libs := $(LIBSSH2_LIBS)
 block-obj-$(if $(CONFIG_BZIP2),m,n) += dmg-bz2.o
diff --git a/block/trace-events b/block/trace-events
index 0bc5c0a..9a71c7f 100644
--- a/block/trace-events
+++ b/block/trace-events
@@ -110,3 +110,20 @@ qed_aio_write_data(void *s, void *acb, int ret, uint64_t 
offset, size_t len) "s
 qed_aio_write_prefill(void *s, void *acb, uint64_t start, size_t len, uint64_t 
offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64
 qed_aio_write_postfill(void *s, void *acb, uint64_t start, size_t len, 
uint64_t offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64
 qed_aio_write_main(void *s, void *acb, int ret, uint64_t offset, size_t len) 
"s %p acb %p ret %d offset %"PRIu64" len %zu"
+
+# block/vxhs.c
+vxhs_iio_callback(int error) "ctx is NULL: error %d"
+vxhs_iio_callback_chnfail(int err, int error) "QNIO channel failed, no i/o %d, 
%d"
+vxhs_iio_callback_unknwn(int opcode, int err) "unexpected opcode %d, errno %d"
+vxhs_aio_rw_invalid(int req) "Invalid I/O request iodir %d"
+vxhs_aio_rw_ioerr(char *guid, int iodir, uint64_t size, uint64_t off, void 
*acb, int ret, int err) "IO ERROR (vDisk %s) FOR : Read/Write = %d size = 
%"PRIu64" offset = %"PRIu64" ACB = %p. Error = %d, errno = %d"
+vxhs_get_vdisk_stat_err(char *guid, int ret, int err) "vDisk (%s) stat ioctl 
failed, ret = %d, errno = %d"
+vxhs_get_vdisk_stat(char *vdisk_guid, uint64_t vdisk_size) "vDisk %s stat 
ioctl returned size %"PRIu64
+vxhs_complete_aio(void *acb, uint64_t ret) "aio failed acb %p ret %"PRIu64
+vxhs_parse_uri_filename(const char *filename) "URI passed via 
bdrv_parse_filename %s"
+vxhs_open_vdiskid(const char *vdisk_id) "Opening vdisk-id %s"
+vxhs_open_hostinfo(char *of_vsa_addr, int port) "Adding host %s:%d to 
BDRVVXHSState"
+vxhs_open_iio_open(const char *host) "Failed to connect to storage agent on 
host %s"
+vxhs_parse_uri_hostinfo(char *host, int port) "Host: IP %s, Port %d"
+vxhs_close(char *vdisk_guid) "Closing vdisk %s"
+vxhs_get_creds(const char *cacert, const char *client_key, const char 
*client_cert) "cacert %s, client_key %s, client_cert %s"
diff --git a/block/vxhs.c b/block/vxhs.c
new file mode 100644
index 000..9ffe9d3
--- /dev/null
+++ b/block/vxhs.c
@@ -0,0 +1,575 @@
+/*
+ * QEMU Block driver for Veritas HyperScale (VxHS)
+ *
+ * Copyright (c) 2017 Veritas Technologies LLC.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or later.
+ * See the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu/osdep.h"
+#include 
+#include 
+#include "block/block_int.h"
+#include "qapi/qmp/qerror.h"
+#include "qapi/qmp/qdict.h"
+#include 

Re: [Qemu-devel] [PATCH] pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot

2017-04-24 Thread Eduardo Habkost
On Mon, Apr 24, 2017 at 08:58:17PM +0200, Igor Mammedov wrote:
[...]
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index f3b372a18f..3f2d96da64 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1047,7 +1047,7 @@ static void load_linux(PCMachineState *pcms,
>  fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
>  fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
>  
> -if (fw_cfg_dma_enabled(fw_cfg)) {
> +if (!pcmc->linuxboot_dma_disabled && fw_cfg_dma_enabled(fw_cfg)) {

Why not name the flag just "linuxboot_dma", set it to true by
default at pc_machine_class_init(), and avoid the double
negative?
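
A rough sketch of what I mean, using standalone mock types rather than the real QEMU structs. The *_sketch names and pick_linuxboot_rom() are made up for illustration, and the "linuxboot.bin" fallback is an assumption about the else branch that is not shown in the hunk above:

```c
#include <stdbool.h>

/* Mock stand-in for PCMachineClass; only the field under discussion. */
typedef struct PCMachineClassSketch {
    /* true: load_linux() may pick linuxboot_dma.bin when fw_cfg DMA is on */
    bool linuxboot_dma;
} PCMachineClassSketch;

/* Default set once at class init: DMA-capable option rom is allowed. */
static void pc_machine_class_init_sketch(PCMachineClassSketch *pcmc)
{
    pcmc->linuxboot_dma = true;
}

/* 2.6 and older machine types opt out to keep the old option rom. */
static void pc_2_6_machine_options_sketch(PCMachineClassSketch *pcmc)
{
    pc_machine_class_init_sketch(pcmc);
    pcmc->linuxboot_dma = false;
}

/* The check in load_linux() then reads without a double negative. */
static const char *pick_linuxboot_rom(const PCMachineClassSketch *pcmc,
                                      bool fw_cfg_dma_enabled)
{
    if (pcmc->linuxboot_dma && fw_cfg_dma_enabled) {
        return "linuxboot_dma.bin";
    }
    return "linuxboot.bin"; /* assumed name of the non-DMA rom */
}
```

With this shape only the old machine types need to touch the flag; everything newer inherits the default.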

[...]
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index f278b3ae89..ff6f13b61b 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -151,6 +151,8 @@ struct PCMachineClass {
>  bool save_tsc_khz;
>  /* generate legacy CPU hotplug AML */
>  bool legacy_cpu_hotplug;
> +

A one-line description of the consequences of setting/clearing
the flag would be nice.

> +bool linuxboot_dma_disabled;
>  };
>  
>  #define TYPE_PC_MACHINE "generic-pc-machine"
> @@ -432,10 +434,6 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t 
> *);
>  #define PC_COMPAT_2_6 \
>  HW_COMPAT_2_6 \
>  {\
> -.driver   = "fw_cfg_io",\
> -.property = "dma_enabled",\
> -.value= "off",\
> -},{\
>  .driver   = TYPE_X86_CPU,\
>  .property = "cpuid-0xb",\
>  .value= "off",\
> -- 
> 2.11.0 (Apple Git-81)
> 

-- 
Eduardo



Re: [Qemu-devel] postcopy migration hangs while loading virtio state

2017-04-24 Thread Dr. David Alan Gilbert
* Christian Borntraeger (borntrae...@de.ibm.com) wrote:
> On 04/24/2017 04:35 PM, Dr. David Alan Gilbert wrote:
> > * Christian Borntraeger (borntrae...@de.ibm.com) wrote:
> >> On 04/24/2017 12:53 PM, Dr. David Alan Gilbert wrote:
> >>> * Christian Borntraeger (borntrae...@de.ibm.com) wrote:
> >>>> David, Juan,
> >>>>
> >>>> I can trigger a hang of postcopy migration (when I do it early) so
> >>>> that both sides are in paused state. Looks like one thread is still
> >>>> loading vmstates for virtio and this load accesses guest memory and
> >>>> triggers a userfault.
> >>>
> >>> It's perfectly legal for the destination to cause a userfault as it
> >>> loads the virtio queue - the virtio queue should be being loaded by
> >>> the main thread from the buffer while the 'listening' thread is
> >>> waiting for the incoming page data.
> >>>
> >>> Can you turn on the following tracing please: destination: 
> >>> postcopy_ram_fault_thread_request postcopy_place_page 
> >>> postcopy_place_page_zero
> >>>
> >>> source: migrate_handle_rp_req_pages ram_save_queue_pages
> >>>
> >>> You should see: virtio does the access userfault generates a fault 
> >>> postcopy_ram_fault_thread_request sends the request to the source
> >>>
> >>> the source sees migrate_handle_rp_req_pages queues it with
> >>> ram_save_queue_pages
> >>>
> >>> the destination sees the page arrive and postcopy_place_page or
> >>> postcopy_place_page_zero
> >>>
> >>> some of that might disappear if the page was already on it's way.
> >>
> >>
> >> the last event on the source are
> >> [..]
> >> 58412@1493037953.747988:postcopy_place_page host=0x3ff92246000
> >> 58412@1493037953.747992:postcopy_place_page host=0x3ff92247000
> > 
> > How do you see those on the source???
> 
> It was the previous migrate (I did it in a loop forth and back)
> The problem happens on the migrate back.

OK, good.

> > 
> >> 58412@1493037956.804210:migrate_handle_rp_req_pages in s390.ram at 41d9000 
> >> len 1000
> >> 58412@1493037956.804216:ram_save_queue_pages s390.ram: start: 41d9000 len: 
> >> 100
> >>
> > 
> > Is that a typo? I'm expecting those two 'len' fields to be the same?
> 
> Yes, its a cut'n' paste "miss the last byte"

Good.

Ok, before digging further, is this a new bug or does it happen on older
QEMU?  Have you got a rune to reproduce it on x86?

> 
> > 
> >> On the target a see lots of
> >>
> >> 39741@1493037958.833710:postcopy_place_page_zero host=0x3ff9befa000
> >> 39741@1493037958.833716:postcopy_place_page host=0x3ff9befb000
> >> 39741@1493037958.833759:postcopy_place_page host=0x3ff9befc000
> >> 39741@1493037958.833818:postcopy_place_page host=0x3ff9befd000
> >> 39741@1493037958.833819:postcopy_place_page_zero host=0x3ff9befe000
> >> 39741@1493037958.833822:postcopy_place_page host=0x3ff9beff000
> >>
> >> So we have about 2 seconds of traffic going on after that request,
> >> I assume its precopy related.
> >>
> >> Looking back on the target history there was
> >> 39741@1493037956.804337:postcopy_ram_fault_thread_request Request for 
> >> HVA=3ff618d9000 rb=s390.ram offset=41d9000
> >>
> >> In fact it seems to be the first and only request:
> >>
> >> # cat /tmp/test0.trace | grep -v postcopy_place_page
> >>
> >> 39741@1493037956.804337:postcopy_ram_fault_thread_request Request for 
> >> HVA=3ff618d9000 rb=s390.ram offset=41d9000
> > 
> > OK, does the HVA there correspond to the address that your virtio device is 
> > blocking on?
> 
> yes it is the same page.
> 
> 
> > (or at least the start of the page)
> > Do you see a postcopy_place_page with a host= the same HVA ?
> 
> no

Hmm, that's bad.
The flow is:
Precopy
   (a) Send pages
Switch to postcopy
   (b) Send discards for pages that have been changed after
   (a)
Postcopy
   (c) Keep sending pages until we run out
   (d) But send pages we're asked for first

 (d) might be ignored if the source thinks the page was already sent or not 
dirty.

So we need to figure out:
  1) If the source sent the pages during (a)
  2) If the source discarded it during (b)
  3) If it sent it again any time in c/d
  4) If it ignored the request from d


So please turn on the traces:
get_queued_page_not_dirty                should help with (4)
get_queued_page                          should help with (4)
ram_discard_range                        should help with (2)
loadvm_postcopy_ram_handle_discard       should help with (2)
qemu_savevm_send_postcopy_ram_discard    should help with (2)

add near the top of ram_save_page in ram.c:
  fprintf(stderr, "%s: %s:%zx\n", __func__, block->idstr, (size_t)offset);

   should help with 1, 3

So lets see if your page ever gets sent in ram_save_page, and does it get
discarded prior to it hitting the point where it hangs but after the page
arrived?
Another (slimmer) possibility is the number of dirty pages is wrong
so the source thinks it has finished too soon - but we'll only look at
that if all the above doesn't help.

Dave

> > 
> 

Re: [Qemu-devel] [RFC 0/7] Move accel, KVM, Xen, qtest files to accel/ subdir

2017-04-24 Thread Stefano Stabellini
On Mon, 24 Apr 2017, Thomas Huth wrote:
> On 20.12.2016 18:43, Eduardo Habkost wrote:
> > This moves the KVM and Xen files to an accel/ subdir.
> > 
> > Instead of moving the *-stubs.c file to accel/ as-is, I tried to
> > move most of the stub code to libqemustub.a. This way the obj-y
> > logic for accel/ is simpler: obj-y includes accel/ only if
> > CONFIG_SOFTMMU is set.
> > 
> > The Xen stubs could be moved completely to stubs/, but some of
> > the KVM stubs depend on cpu.h. So most of the kvm-stub.c code was
> > moved to stubs/kvm.c, but some of that code was kept in
> > accel/kvm-stub.c.
> > 
> > About TCG:
> > --
> > 
> > It is not obvious to me which TCG-related files could be moved to
> > accel/, so this series doesn't move any of them yet.
> > 
> > About other CONFIG_SOFTMMU top-level files:
> > ---
> > 
> > I would like to know what we should do with the top-level
> > CONFIG_SOFTMMU-only files that don't belong to hw/. Some
> > candidates: arch_init.c cpus.c monitor.c gdbstub.c balloon.c
> > ioport.c bootdevice.c memory.c cputlb.c memory_mapping.c dump.c.
> > 
> > Maybe a sysemu/ subdir? In that case, should we still create an
> > accel/ subdir, or move xen-*, kvm-* and friends to sysemu/ too?
> > 
> > Cc: Paolo Bonzini 
> > Cc: k...@vger.kernel.org
> > Cc: Christoffer Dall 
> > Cc: Anthony Perard 
> > Cc: Stefano Stabellini 
> > Cc: xen-de...@lists.xensource.com
> > 
> > Eduardo Habkost (7):
> >   xen: Move xen-*-stub.c to stubs/
> >   xen: Move xen files to accel/
> >   kvm: Move some kvm-stub.c code to stubs/kvm.c
> >   kvm: Include kvm-stub.o only on CONFIG_SOFTMMU
> >   kvm: Move kvm*.c files to accel/
> >   accel: Move accel.c to accel/
> >   accel: Move qtest.c to accel/
> > 
> >  Makefile.objs  |  2 +-
> >  Makefile.target| 10 ++
> >  accel.c => accel/accel.c   |  0
> >  kvm-all.c => accel/kvm-common.c|  0
> >  kvm-stub.c => accel/kvm-stub.c | 51 --
> >  qtest.c => accel/qtest.c   |  0
> >  xen-common.c => accel/xen-common.c |  0
> >  xen-hvm.c => accel/xen-hvm.c   |  0
> >  xen-mapcache.c => accel/xen-mapcache.c |  0
> >  stubs/kvm.c| 65 
> > ++
> >  xen-hvm-stub.c => stubs/xen-hvm.c  |  0
> >  xen-common-stub.c => stubs/xen.c   |  0
> >  MAINTAINERS|  4 +--
> >  accel/Makefile.objs|  9 +
> >  stubs/Makefile.objs|  2 ++
> >  15 files changed, 80 insertions(+), 63 deletions(-)
> >  rename accel.c => accel/accel.c (100%)
> >  rename kvm-all.c => accel/kvm-common.c (100%)
> >  rename kvm-stub.c => accel/kvm-stub.c (71%)
> >  rename qtest.c => accel/qtest.c (100%)
> >  rename xen-common.c => accel/xen-common.c (100%)
> >  rename xen-hvm.c => accel/xen-hvm.c (100%)
> >  rename xen-mapcache.c => accel/xen-mapcache.c (100%)
> >  rename xen-hvm-stub.c => stubs/xen-hvm.c (100%)
> >  rename xen-common-stub.c => stubs/xen.c (100%)
> >  create mode 100644 accel/Makefile.objs
> 
> Now that the development tree is open again ... any chance that we could
> get this series into 2.10 ?

FYI I took Anthony Xu's patches to move the xen files under hw/xen/ and
hw/i386/xen, see the last 3 patches of
alpine.DEB.2.10.1704211258580.18403@sstabellini-ThinkPad-X260. 



Re: [Qemu-devel] [PATCH v2 00/25] qmp: add async command type

2017-04-24 Thread Markus Armbruster
With 2.9 out of the way, how can we make progress on this one?

I can see two ways to get asynchronous QMP commands accepted:

1. We break QMP compatibility in QEMU 3.0 and convert all long-running
   tasks from "synchronous command + event" to "asynchronous command".

   This is design option 1 quoted below.  *If* we decide to leave
   compatibility behind for 3.0, *and* we decide we like the
   asynchronous sufficiently better to put in the work, we can do it.

   I guess there's nothing to do here until we decide on breaking
   compatibility in 3.0.

2. We don't break QMP compatibility, but we add asynchronous commands
   anyway, because we decide that's how we want to do "jobs".

   This is design option 3 quoted below.  As I said, I dislike its lack
   of orthogonality.  But if asynchronous commands help us get jobs
   done, I can bury my dislike.

   I feel this is something you should investigate with John Snow.
   Going through a middleman (me) makes no sense.  So I'm going to dump
   my thoughts, then get out of the way.

   You need to take care not to get bogged down in the jobs project's
   complexity.  This is really only how to package up jobs for QMP.

   With synchronous commands, the job-creating command creates a job,
   jobs state changes trigger events, and job termination is just
   another state change.  Job control commands interact with the job.
 
   Events obviously need to carry a job ID.  We can either require the
   user to pass it as argument to the job-creating command (hopefully
   unique), or have the job-creating command pick one (a unique one) and
   return it.

   With asynchronous commands, we could make the asynchronous command
   the job.  The only difference is that job termination triggers the
   command response.  When termination is of no interest to anyone but
   the job's creator, the termination event can be omitted then.

   Instead of a job ID, we could use the (user-specified and hopefully
   unique) command ID that ties the command response to the command.
   Perhaps throw in a monitor ID.

   To be honest, I'm not sure asynchronous commands buy us much here.
   But my view is from 10,000 feet, and John might have different ideas.

Rejecting asynchronous QMP commands is of course design option 2 quoted
below.


Markus Armbruster  writes:

> Cc'ing block job people.
>
> Marc-André Lureau  writes:
>
>> Hi,
>>
>> One of initial design goals of QMP was to have "asynchronous command
>> completion" (http://wiki.qemu.org/Features/QAPI). Unfortunately, that
>> goal was not fully achieved, and some broken bits left were removed
>> progressively until commit 65207c59d that removed async command
>> support.
>
> Correct.
>
> QMP's initial design stipulated that commands are asynchronous.  The
> initial implementation made them all synchronous, with very few
> exceptions (buggy ones, to boot).  Naturally, clients relied on the
> actual rather than the theoretical behavior, i.e. on getting command
> replies synchronously, in order.
>
>> Note that qmp events are asynchronous messages, and must be handled
>> appropriately by the client: dispatch both reply and events after a
>> command is sent for example.
>
> Yes.
>
>> The benefits of async commands, which can be traded off against each
>> other depending on the requirements, are:
>>
>> 1) allow the command handler to re-enter the main loop if the command
>> cannot be handled synchronously, or if it is long-lasting. This is
>> currently not possible and make some bugs such as rhbz#1230527 tricky
>> (see below) to solve.  Furthermore, many QMP commands do IO and could
>> be considered 'slow' and blocking today.
>>
>> 2) allow concurrent commands and events. This mainly implies handling
>> concurrency in qemu and out-of-order replies for the client. As noted
>> earlier, a good qmp client already has to handle dispatching of
>> received messages (reply and events).
>
> We need to distinguish two kinds of concurrency.  One, execution of QMP
> commands concurrently with other activities in QEMU, including other QMP
> monitors.  Two, executing multiple QMP commands concurrently in the same
> monitor, which obviously requires all but one of them to be
> asynchronous.
>
>> The traditional approach to solving the above in qemu is the following
>> scheme:
>> -> { "execute": "do-foo" }
>> <- { "return": {} }
>> <- { "event": "FOO_DONE" }
>
> ... where the event may carry additional data, such as the command's
> true result.
>
>> It has several flaws:
>> - FOO_DONE event has no semantic link with do-foo in the qapi
>>   schema. It is not simple to generalize that pattern when writing
>>   qmp clients. It makes documentation and usage harder.
>> - the FOO_DONE event has no clear association with the command that
>>   triggered it: commands/events have to come up with additional
>>   specific association schemes (ids, path etc)
>
> Valid points.  Emulating asynchronous commands with a synchronous
> command + 

[Qemu-devel] [PATCH] pc/fwcfg: unbreak migration from qemu-2.5 and qemu-2.6 during firmware boot

2017-04-24 Thread Igor Mammedov
Commit b2a575a (Add optionrom compatible with fw_cfg DMA version),
merged in 2.7, regressed migration during firmware execution by
abusing the fwcfg.dma_enabled property to decide whether to load the
DMA version of the option rom AND by mistakenly disabling DMA
for 2.6 and earlier globally instead of only for the option rom.

So a 2.6 machine type guest breaks if it is migrated to
qemu-2.7(pc-2.6) while it is already running firmware in DMA mode:

a) qemu-2.6:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=mmio)
b) qemu-2.7:pc2.6 (fwcfg.dma=off,firmware=mmio,oprom=mmio)

   to:  a     b
from
a       OK    FAIL
b       OK    OK

So we currently have broken forward migration from
qemu-2.6 to qemu-2.[789]. That could however be fixed
for 2.10 by re-enabling DMA for 2.[56] machine types
and allowing the DMA-capable option rom only since 2.7.
As a result qemu should end up with:

c) qemu-2.10:pc2.6 (fwcfg.dma=on,firmware=dma,oprom=mmio)

   to:  a     b     c
from
a       OK    FAIL  OK
b       OK    OK    OK
c       OK    FAIL  OK

where forward migration from qemu-2.6 to qemu-2.10 should
work again leaving only qemu-2.[789]:pc-2.6 broken.
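
The tables above can be read as a single rule, sketched here as a standalone model. This is an illustration only, assuming the guest firmware is in DMA mode whenever the source runs with fwcfg.dma=on; Pc26Config, cfg_a/cfg_b/cfg_c and migration_ok() are hypothetical names, not QEMU code:

```c
#include <stdbool.h>

/* Hypothetical model of one QEMU build running the pc-2.6 machine type. */
typedef struct {
    const char *name;
    bool fwcfg_dma;   /* fwcfg.dma value as listed in a), b), c) */
} Pc26Config;

static const Pc26Config cfg_a = { "qemu-2.6",  true  };
static const Pc26Config cfg_b = { "qemu-2.7",  false };
static const Pc26Config cfg_c = { "qemu-2.10", true  };

/*
 * Firmware that already runs in DMA mode cannot continue on a
 * destination where fw_cfg DMA is switched off; every other
 * combination is fine.
 */
static bool migration_ok(const Pc26Config *from, const Pc26Config *to)
{
    return !(from->fwcfg_dma && !to->fwcfg_dma);
}
```

Evaluating migration_ok() for every from/to pair gives back the second table, with b as the only failing destination.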

This patch should also help downstream to maintain migration
the way it used to be, since the DMA-capable option rom
is managed by new

Signed-off-by: Igor Mammedov 
---
 hw/i386/pc.c | 2 +-
 hw/i386/pc_piix.c| 1 +
 hw/i386/pc_q35.c | 1 +
 include/hw/i386/pc.h | 6 ++
 4 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index f3b372a18f..3f2d96da64 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1047,7 +1047,7 @@ static void load_linux(PCMachineState *pcms,
 fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
 fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
 
-if (fw_cfg_dma_enabled(fw_cfg)) {
+if (!pcmc->linuxboot_dma_disabled && fw_cfg_dma_enabled(fw_cfg)) {
 option_rom[nb_option_roms].name = "linuxboot_dma.bin";
 option_rom[nb_option_roms].bootindex = 0;
 } else {
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 9f102aa388..dd3a2bb02a 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -474,6 +474,7 @@ static void pc_i440fx_2_6_machine_options(MachineClass *m)
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_i440fx_2_7_machine_options(m);
 pcmc->legacy_cpu_hotplug = true;
+pcmc->linuxboot_dma_disabled = true;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_6);
 }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index dd792a8547..9988ecc578 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -335,6 +335,7 @@ static void pc_q35_2_6_machine_options(MachineClass *m)
 PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
 pc_q35_2_7_machine_options(m);
 pcmc->legacy_cpu_hotplug = true;
+pcmc->linuxboot_dma_disabled = true;
 SET_MACHINE_COMPAT(m, PC_COMPAT_2_6);
 }
 
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index f278b3ae89..ff6f13b61b 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -151,6 +151,8 @@ struct PCMachineClass {
 bool save_tsc_khz;
 /* generate legacy CPU hotplug AML */
 bool legacy_cpu_hotplug;
+
+bool linuxboot_dma_disabled;
 };
 
 #define TYPE_PC_MACHINE "generic-pc-machine"
@@ -432,10 +434,6 @@ bool e820_get_entry(int, uint32_t, uint64_t *, uint64_t *);
 #define PC_COMPAT_2_6 \
 HW_COMPAT_2_6 \
 {\
-.driver   = "fw_cfg_io",\
-.property = "dma_enabled",\
-.value= "off",\
-},{\
 .driver   = TYPE_X86_CPU,\
 .property = "cpuid-0xb",\
 .value= "off",\
-- 
2.11.0 (Apple Git-81)




Re: [Qemu-devel] [PATCH RESEND v2 08/18] ram/COLO: Record the dirty pages that SVM received

2017-04-24 Thread Juan Quintela
zhanghailiang  wrote:
> We record the addresses of the dirty pages received;
> this will help flush the pages cached in SVM.
>
> The trick here is that we record dirty pages by re-using the migration
> dirty bitmap. In a later patch, we will start dirty logging
> for SVM, just like migration. This way we can record
> the dirty pages caused by both PVM and SVM, and we only flush those dirty
> pages from the RAM cache when doing a checkpoint.
>
> Cc: Juan Quintela 
> Signed-off-by: zhanghailiang 
> Reviewed-by: Dr. David Alan Gilbert 
> ---
>  migration/ram.c | 29 +
>  1 file changed, 29 insertions(+)
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 05d1b06..0653a24 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2268,6 +2268,9 @@ static inline void *host_from_ram_block_offset(RAMBlock 
> *block,
>  static inline void *colo_cache_from_block_offset(RAMBlock *block,
>   ram_addr_t offset)
>  {
> +unsigned long *bitmap;
> +long k;
> +
>  if (!offset_in_ramblock(block, offset)) {
>  return NULL;
>  }
> @@ -2276,6 +2279,17 @@ static inline void 
> *colo_cache_from_block_offset(RAMBlock *block,
>   __func__, block->idstr);
>  return NULL;
>  }
> +
> +k = (memory_region_get_ram_addr(block->mr) + offset) >> TARGET_PAGE_BITS;
> +bitmap = atomic_rcu_read(&ram_state.ram_bitmap)->bmap;
> +/*
> +* During colo checkpoint, we need bitmap of these migrated pages.
> +* It help us to decide which pages in ram cache should be flushed
> +* into VM's RAM later.
> +*/
> +if (!test_and_set_bit(k, bitmap)) {
> +ram_state.migration_dirty_pages++;
> +}
>  return block->colo_cache + offset;
>  }
>  
> @@ -2752,6 +2766,15 @@ int colo_init_ram_cache(void)
>  memcpy(block->colo_cache, block->host, block->used_length);
>  }
>  rcu_read_unlock();
> +/*
> +* Record the dirty pages that sent by PVM, we use this dirty bitmap 
> together
> +* with to decide which page in cache should be flushed into SVM's RAM. 
> Here
> +* we use the same name 'ram_bitmap' as for migration.
> +*/
> +ram_state.ram_bitmap = g_new0(RAMBitmap, 1);
> +ram_state.ram_bitmap->bmap = bitmap_new(last_ram_page());
> +ram_state.migration_dirty_pages = 0;
> +
>  return 0;
>  
>  out_locked:
> @@ -2770,6 +2793,12 @@ out_locked:
>  void colo_release_ram_cache(void)
>  {
>  RAMBlock *block;
> +RAMBitmap *bitmap = ram_state.ram_bitmap;
> +
> +atomic_rcu_set(&ram_state.ram_bitmap, NULL);
> +if (bitmap) {
> +call_rcu(bitmap, migration_bitmap_free, rcu);
> +}
>  
>  rcu_read_lock();
>  QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {

You can see my Split bitmap patches, I am splitting the dirty bitmap per
block, I think that it shouldn't make your life more difficult, but
please take a look.

I am wondering if it is faster/easier to use the page_cache.c that
xbzrle uses to store the dirty pages instead of copying the whole
RAMBlocks, but I don't really know.


Thanks, Juan.



Re: [Qemu-devel] [PATCH RESEND v2 07/18] COLO: Load dirty pages into SVM's RAM cache firstly

2017-04-24 Thread Juan Quintela
zhanghailiang  wrote:
> We should not load PVM's state directly into SVM, because errors may
> happen while SVM is receiving data, which would break SVM.
>
> We need to ensure all data is received before loading the state into SVM. We use
> extra memory to cache this data (PVM's ram). The ram cache on the secondary
> side is initially the same as SVM/PVM's memory. During a checkpoint,
> we first cache the dirty pages of PVM into this ram cache, so this ram cache
> is always the same as PVM's memory at every checkpoint. We then flush this
> cached ram to SVM after we receive all of PVM's state.
>
> Cc: Dr. David Alan Gilbert 
> Signed-off-by: zhanghailiang 
> Signed-off-by: Li Zhijian 
> ---
> v2:
> - Move colo_init_ram_cache() and colo_release_ram_cache() out of
>   incoming thread since both of them need the global lock, if we keep
>   colo_release_ram_cache() in incoming thread, there are potential
>   dead-lock.
> - Remove bool ram_cache_enable flag, use migration_incoming_in_state() 
> instead.
> - Remove the Reviewd-by tag because of the above changes.


> +out_locked:
> +QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +if (block->colo_cache) {
> +qemu_anon_ram_free(block->colo_cache, block->used_length);
> +block->colo_cache = NULL;
> +}
> +}
> +
> +rcu_read_unlock();
> +return -errno;
> +}
> +
> +/* It is need to hold the global lock to call this helper */
> +void colo_release_ram_cache(void)
> +{
> +RAMBlock *block;
> +
> +rcu_read_lock();
> +QLIST_FOREACH_RCU(block, &ram_list.blocks, next) {
> +if (block->colo_cache) {
> +qemu_anon_ram_free(block->colo_cache, block->used_length);
> +block->colo_cache = NULL;
> +}
> +}
> +rcu_read_unlock();
> +}

Create a function from the creation/removal?  We have exactly two copies
of the same code.  Right now the code inside the function is very small,
but it could be bigger, no?
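
Roughly what I have in mind, as a standalone sketch. BlockSketch is a mock stand-in for RAMBlock, plain malloc()/free() stand in for the anon ram helpers, an array replaces the RCU list walk, and all the *_sketch names are made up for illustration:

```c
#include <stddef.h>
#include <stdlib.h>

/* Mock stand-in for RAMBlock; only the fields the cache code touches. */
typedef struct BlockSketch {
    void *colo_cache;
    size_t used_length;
} BlockSketch;

/* Shared helper: free every allocated cache and clear the pointer. */
static void colo_free_caches_sketch(BlockSketch *blocks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (blocks[i].colo_cache) {
            free(blocks[i].colo_cache); /* qemu_anon_ram_free() in the patch */
            blocks[i].colo_cache = NULL;
        }
    }
}

/* Init: on allocation failure, undo the partial work via the helper. */
static int colo_init_caches_sketch(BlockSketch *blocks, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        blocks[i].colo_cache = NULL;
    }
    for (size_t i = 0; i < n; i++) {
        blocks[i].colo_cache = malloc(blocks[i].used_length);
        if (!blocks[i].colo_cache) {
            colo_free_caches_sketch(blocks, n);
            return -1;
        }
    }
    return 0;
}

/* Release: nothing but the helper, so the loop exists only once. */
static void colo_release_caches_sketch(BlockSketch *blocks, size_t n)
{
    colo_free_caches_sketch(blocks, n);
}
```

The out_locked path and colo_release_ram_cache() would then both call the same helper, with the locking left to the callers.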

Later, Juan.




Re: [Qemu-devel] [PATCH RESEND v2 04/18] COLO: integrate colo compare with colo frame

2017-04-24 Thread Juan Quintela
zhanghailiang  wrote:
> For COLO FT, both the PVM and SVM run at the same time,
> and only sync state when needed.
>
> So here, let SVM run while not doing a checkpoint, and change
> DEFAULT_MIGRATE_X_CHECKPOINT_DELAY to 200*100.
>
> Besides, we forgot to release colo_checkpoint_semd and
> colo_delay_timer; fix that here.
>
> Cc: Jason Wang 
> Signed-off-by: zhanghailiang 
> Reviewed-by: Dr. David Alan Gilbert 



> diff --git a/migration/migration.c b/migration/migration.c
> index 353f272..2ade2aa 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -70,7 +70,7 @@
>  /* The delay time (in ms) between two COLO checkpoints
>   * Note: Please change this default value to 1 when we support hybrid 
> mode.
>   */
> -#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY 200
> +#define DEFAULT_MIGRATE_X_CHECKPOINT_DELAY (200 * 100)
>  
>  static NotifierList migration_state_notifiers =
>  NOTIFIER_LIST_INITIALIZER(migration_state_notifiers);

1000 or 200 * 100

Please, fix value or comment?

Later, Juan.



Re: [Qemu-devel] postcopy migration hangs while loading virtio state

2017-04-24 Thread Christian Borntraeger
On 04/24/2017 04:35 PM, Dr. David Alan Gilbert wrote:
> * Christian Borntraeger (borntrae...@de.ibm.com) wrote:
>> On 04/24/2017 12:53 PM, Dr. David Alan Gilbert wrote:
>>> * Christian Borntraeger (borntrae...@de.ibm.com) wrote:
>>>> David, Juan,
>>>>
>>>> I can trigger a hang of postcopy migration (when I do it early) so
>>>> that both sides are in paused state. Looks like one thread is still
>>>> loading vmstates for virtio and this load accesses guest memory and
>>>> triggers a userfault.
>>>
>>> It's perfectly legal for the destination to cause a userfault as it
>>> loads the virtio queue - the virtio queue should be being loaded by
>>> the main thread from the buffer while the 'listening' thread is
>>> waiting for the incoming page data.
>>>
>>> Can you turn on the following tracing please: destination: 
>>> postcopy_ram_fault_thread_request postcopy_place_page 
>>> postcopy_place_page_zero
>>>
>>> source: migrate_handle_rp_req_pages ram_save_queue_pages
>>>
>>> You should see: virtio does the access userfault generates a fault 
>>> postcopy_ram_fault_thread_request sends the request to the source
>>>
>>> the source sees migrate_handle_rp_req_pages queues it with
>>> ram_save_queue_pages
>>>
>>> the destination sees the page arrive and postcopy_place_page or
>>> postcopy_place_page_zero
>>>
>>> some of that might disappear if the page was already on it's way.
>>
>>
>> the last event on the source are
>> [..]
>> 58412@1493037953.747988:postcopy_place_page host=0x3ff92246000
>> 58412@1493037953.747992:postcopy_place_page host=0x3ff92247000
> 
> How do you see those on the source???

It was the previous migrate (I did it in a loop forth and back)
The problem happens on the migrate back.
> 
>> 58412@1493037956.804210:migrate_handle_rp_req_pages in s390.ram at 41d9000 
>> len 1000
>> 58412@1493037956.804216:ram_save_queue_pages s390.ram: start: 41d9000 len: 
>> 100
>>
> 
> Is that a typo? I'm expecting those two 'len' fields to be the same?

Yes, its a cut'n' paste "miss the last byte"


> 
>> On the target a see lots of
>>
>> 39741@1493037958.833710:postcopy_place_page_zero host=0x3ff9befa000
>> 39741@1493037958.833716:postcopy_place_page host=0x3ff9befb000
>> 39741@1493037958.833759:postcopy_place_page host=0x3ff9befc000
>> 39741@1493037958.833818:postcopy_place_page host=0x3ff9befd000
>> 39741@1493037958.833819:postcopy_place_page_zero host=0x3ff9befe000
>> 39741@1493037958.833822:postcopy_place_page host=0x3ff9beff000
>>
>> So we have about 2 seconds of traffic going on after that request,
>> I assume its precopy related.
>>
>> Looking back on the target history there was
>> 39741@1493037956.804337:postcopy_ram_fault_thread_request Request for 
>> HVA=3ff618d9000 rb=s390.ram offset=41d9000
>>
>> In fact it seems to be the first and only request:
>>
>> # cat /tmp/test0.trace | grep -v postcopy_place_page
>>
>> 39741@1493037956.804337:postcopy_ram_fault_thread_request Request for 
>> HVA=3ff618d9000 rb=s390.ram offset=41d9000
> 
> OK, does the HVA there correspond to the address that your virtio device is 
> blocking on?

yes it is the same page.


> (or at least the start of the page)
> Do you see a postcopy_place_page with a host= the same HVA ?

no

> 
> From the source backtrace, I think the source thinks it's sent everything
> and is waiting for the return-path-thread to close.
> 
> Dave
> 
>>>
>>> Dave
>>>
 Thread 1 (Thread 0x3ffa2f45f00 (LWP 21122)):
 #0  0x01017130 in lduw_he_p (ptr=0x3ff498d9002) at /root/qemu/include/qemu/bswap.h:317
     < according to the host kernel this threads hangs in handle_userfault.
 #1  0x01017342 in lduw_le_p (ptr=0x3ff498d9002) at /root/qemu/include/qemu/bswap.h:359
 #2  0x01025840 in address_space_lduw_internal_cached (cache=0x283491d0, addr=2, attrs=..., result=0x0, endian=DEVICE_LITTLE_ENDIAN) at /root/qemu/memory_ldst.inc.c:284
 #3  0x010259a6 in address_space_lduw_le_cached (cache=0x283491d0, addr=2, attrs=..., result=0x0) at /root/qemu/memory_ldst.inc.c:315
 #4  0x01025ad6 in lduw_le_phys_cached (cache=0x283491d0, addr=2) at /root/qemu/memory_ldst.inc.c:334
 #5  0x01116c10 in virtio_lduw_phys_cached (vdev=0x28275090, cache=0x283491d0, pa=2) at /root/qemu/include/hw/virtio/virtio-access.h:166
 #6  0x01117940 in vring_used_idx (vq=0x3ffa2e1c090) at /root/qemu/hw/virtio/virtio.c:263
 #7  0x0111daea in virtio_load (vdev=0x28275090, f=0x28353600, version_id=2) at /root/qemu/hw/virtio/virtio.c:2168
 #8  0x0111d0cc in virtio_device_get (f=0x28353600, opaque=0x28275090, size=0, field=0x16adf38 <__compound_literal.0>) at /root/qemu/hw/virtio/virtio.c:1975
 #9  0x012a7f50 in vmstate_load_state (f=0x28353600, vmsd=0x14db480 , opaque=0x28275090, version_id=2) at /root/qemu/migration/vmstate.c:128
 #10 0x010cbc08 in vmstate_load (f=0x28353600, se=0x28279e60, version_id=2)

Re: [Qemu-devel] [Qemu-stable] [PATCH v2 1/1] qemu-img: wait for convert coroutines to complete

2017-04-24 Thread Peter Lieven


> On 24.04.2017 at 18:27, Anton Nefedov wrote:
> 
>> On 04/21/2017 03:37 PM, Peter Lieven wrote:
>>> On 21.04.2017 at 14:19, Anton Nefedov wrote:
 On 04/21/2017 01:44 PM, Peter Lieven wrote:
> On 21.04.2017 at 12:04, Anton Nefedov wrote:
> On error path (like i/o error in one of the coroutines), it's required to
>  - wait for coroutines completion before cleaning the common structures
>  - reenter dependent coroutines so they ever finish
> 
> Introduced in 2d9187bc65.
> 
> Signed-off-by: Anton Nefedov 
> ---
> [..]
> 
 
 
 And what if we error out in the read path? Wouldn't be something like this 
 easier?
 
 
 diff --git a/qemu-img.c b/qemu-img.c
 index 22f559a..4ff1085 100644
 --- a/qemu-img.c
 +++ b/qemu-img.c
 @@ -1903,6 +1903,16 @@ static int convert_do_copy(ImgConvertState *s)
 main_loop_wait(false);
 }
 
 +/* on error path we need to enter all coroutines that are still
 + * running before cleaning up common structures */
 +if (s->ret) {
 +for (i = 0; i < s->num_coroutines; i++) {
 + if (s->co[i]) {
 + qemu_coroutine_enter(s->co[i]);
 + }
 +}
 +}
 +
 if (s->compressed && !s->ret) {
 /* signal EOF to align */
 ret = blk_pwrite_compressed(s->target, 0, NULL, 0);
 
 
 Peter
 
>>> 
>>> seemed a bit too daring to me to re-enter every coroutine potentially 
>>> including the ones that yielded waiting for I/O completion.
>>> If that's ok - that is for sure easier :)
>> 
>> I think we should enter every coroutine that is still running and have it 
>> terminate correctly. It was my mistake that I have not
>> done this in the original patch. Can you check if the above fixes your test 
>> cases that triggered the bug?
>> 
> 
> hi, sorry I'm late with the answer
> 
> this segfaults in bdrv_close(). Looks like it tries to finish some i/o which 
> coroutine we have already entered and terminated?
> 
> (gdb) run
> Starting program: /vz/anefedov/qemu-build/us/./qemu-img convert -O qcow2 
> ./harddisk.hdd.c ./harddisk.hdd
> [Thread debugging using libthread_db enabled]
> Using host libthread_db library "/lib64/libthread_db.so.1".
> [New Thread 0x7fffeac2d700 (LWP 436020)]
> [New Thread 0x7fffe4ed6700 (LWP 436021)]
> qemu-img: error while reading sector 20480: Input/output error
> qemu-img: error while writing sector 19712: Operation now in progress
> 
> Program received signal SIGSEGV, Segmentation fault.
> aio_co_wake (co=0x0) at /mnt/code/us-qemu/util/async.c:454
> 454 ctx = atomic_read(&co->ctx);
> (gdb) bt
> #0  aio_co_wake (co=0x0) at /mnt/code/us-qemu/util/async.c:454
> /* [Anton]: thread_pool_co_cb () here */
> #1  0x55634629 in thread_pool_completion_bh (opaque=0x55cfe020) 
> at /mnt/code/us-qemu/util/thread-pool.c:189
> #2  0x55633b31 in aio_bh_call (bh=0x55cfe0f0) at 
> /mnt/code/us-qemu/util/async.c:90
> #3  aio_bh_poll (ctx=ctx@entry=0x55cee6d0) at 
> /mnt/code/us-qemu/util/async.c:118
> #4  0x55636f14 in aio_poll (ctx=ctx@entry=0x55cee6d0, 
> blocking=) at /mnt/code/us-qemu/util/aio-posix.c:682
> #5  0x555c52d4 in bdrv_drain_recurse (bs=bs@entry=0x55d22560) at 
> /mnt/code/us-qemu/block/io.c:164
> #6  0x555c5aed in bdrv_drained_begin (bs=bs@entry=0x55d22560) at 
> /mnt/code/us-qemu/block/io.c:248
> #7  0x55581443 in bdrv_close (bs=0x55d22560) at 
> /mnt/code/us-qemu/block.c:2909
> #8  bdrv_delete (bs=0x55d22560) at /mnt/code/us-qemu/block.c:3100
> #9  bdrv_unref (bs=0x55d22560) at /mnt/code/us-qemu/block.c:4087
> #10 0x555baf44 in blk_remove_bs (blk=blk@entry=0x55d22380) at 
> /mnt/code/us-qemu/block/block-backend.c:552
> #11 0x555bb173 in blk_delete (blk=0x55d22380) at 
> /mnt/code/us-qemu/block/block-backend.c:238
> #12 blk_unref (blk=blk@entry=0x55d22380) at 
> /mnt/code/us-qemu/block/block-backend.c:282
> #13 0x5557a22c in img_convert (argc=<optimized out>, argv=<optimized out>) at /mnt/code/us-qemu/qemu-img.c:2359
> #14 0x55574189 in main (argc=5, argv=0x7fffe4a0) at 
> /mnt/code/us-qemu/qemu-img.c:4464
> 
> 
>> Peter
>> 
> 
> /Anton
> 

it seems that this is a bit tricky, can you share how your test case works?

thanks,
peter



Re: [Qemu-devel] [Qemu-arm] [RFC 0/3] split core mmu_idx from ARMMMUIdx values

2017-04-24 Thread Peter Maydell
On 24 April 2017 at 17:33, Peter Maydell  wrote:
> For M profile, we're eventually going to want some
> new MMU index values:
>  non secure User
>  non secure Privileged
>  non secure Privileged, execution priority < 0
>  secure User
>  secure Privileged
>  secure Privileged, execution priority < 0

Hmm, Alex and Edgar were definitely on the cc when I sent it but
seem to have fallen off it. (If you got a copy anyway then you
should probably check whether you have accidentally told mailman to
"filter out duplicate messages": you never want this setting
enabled, because it means "silently drop me from the cc list".)

thanks
-- PMM



Re: [Qemu-devel] [PATCH 6/6] migration: detailed traces for postcopy

2017-04-24 Thread Dr. David Alan Gilbert
* Alexey Perevalov (a.pereva...@samsung.com) wrote:
> It could help to track down vCPU state during page fault and
> page fault sources.
> 
> This patch shows the proc status/stack/syscall files at the moment of a page fault;
> it's very interesting to know who the page fault initiator was.

This is a LOT of debug code, almost none of it is postcopy specific,
so probably a question for generic tracing code; but I'll admit to
not being happy about the idea of putting this much code in for
this type of dumping; when it gets this desperate we just normally do
a special build.

However, some specific comments as well.

> Signed-off-by: Alexey Perevalov 
> ---
>  migration/postcopy-ram.c | 98 
> +++-
>  migration/trace-events   |  6 +++
>  2 files changed, 103 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index 42330fd..513633c 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -412,7 +412,91 @@ static int ram_block_enable_notify(const char 
> *block_name, void *host_addr,
>  return 0;
>  }
>  
> -static int get_mem_fault_cpu_index(uint32_t pid)
> +#define PROC_LEN 1024
> +#define DEBUG_FAULT_PROCESS_STATUS 1
> +
> +#ifdef DEBUG_FAULT_PROCESS_STATUS
> +
> +static FILE *get_proc_file(const gchar *frmt, pid_t thread_id)
> +{
> +FILE *f = NULL;
> +gchar *file_path = g_strdup_printf(frmt, thread_id);
> +if (file_path == NULL) {
> +error_report("Couldn't allocate path for %u", thread_id);
> +return NULL;
> +}

I was going to say that I thought g_strdup_printf couldn't
return NULL; but then I looked at the source - eww it can.

> +f = fopen(file_path, "r");
> +if (!f) {
> +error_report("can't open %s", file_path);
> +}
> +
> +trace_get_proc_file(file_path);
> +g_free(file_path);
> +return f;
> +}
> +
> +typedef void(*proc_line_handler)(const char *line);
> +
> +static void proc_line_cb(const char *line)
> +{
> +/* trace_ functions are inline */
> +trace_proc_line_cb(line);
> +}
> +
> +static void foreach_line_in_file(FILE *f, proc_line_handler cb)
> +{
> +char *line = NULL;
> +ssize_t read;
> +size_t len;
> +
> +while ((read = getline(&line, &len, f)) != -1) {
> +/* workaround, trace_ infrastructure already insert \n
> + * and getline includes it */
> +ssize_t str_len = strlen(line) - 1;
> +if (str_len <= 0)
> +continue;
> +line[str_len] = '\0';
> +cb(line);
> +}
> +free(line);
> +}
> +
> +static void observe_thread_proc(const gchar *path_frmt, pid_t thread_id)
> +{
> +FILE *f = get_proc_file(path_frmt, thread_id);
> +if (!f) {
> +error_report("can't read thread's proc");
> +return;
> +}
> +
> +foreach_line_in_file(f, proc_line_cb);

> +fclose(f);
> +}
> +
> +/*
> + * for convenience tracing need to trace
> + * observe_thread_begin
> + * get_proc_file
> + * proc_line_cb
> + * observe_thread_end
> + */
> +static void observe_thread(const char *msg, pid_t thread_id)
> +{
> +trace_observe_thread_begin(msg);
> +observe_thread_proc("/proc/%d/status", thread_id);
> +observe_thread_proc("/proc/%d/syscall", thread_id);
> +observe_thread_proc("/proc/%d/stack", thread_id);

You could wrap that in something like:
  if (TRACE_PROC_LINE_CB_ENABLED) {

so it doesn't read all of the files and do all the allocation
to get to the point it realised no one cared.

Dave
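
A minimal standalone sketch of the guard suggested above: check whether
the trace event is enabled before opening files or allocating anything.
The flag trace_proc_line_enabled and the helpers below are hypothetical
stand-ins for illustration only; they are not QEMU's generated tracing
API.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a per-event "is anyone listening?" flag. */
bool trace_proc_line_enabled = false;

/* Number of lines handed to the (stand-in) trace backend. */
int lines_traced;

void trace_proc_line(const char *line)
{
    if (trace_proc_line_enabled) {
        lines_traced++;
        fprintf(stderr, "proc_line %s\n", line);
    }
}

/*
 * Dump a file line by line, but only if the event is enabled.
 * Returns the number of chunks read, 0 if tracing is off (the file is
 * never opened in that case), or -1 if the file cannot be opened.
 */
int observe_file(const char *path)
{
    char buf[1024];
    int n = 0;
    FILE *f;

    if (!trace_proc_line_enabled) {
        return 0;   /* early exit: no fopen, no allocation */
    }
    f = fopen(path, "r");
    if (!f) {
        return -1;
    }
    while (fgets(buf, sizeof(buf), f)) {
        buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
        trace_proc_line(buf);
        n++;
    }
    fclose(f);
    return n;
}
```

With the flag off, observe_file() returns before touching the
filesystem, which is the whole point of the guard.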

> +trace_observe_thread_end(msg);
> +}
> +
> +#else
> +static void observe_thread(const char *msg, pid_t thread_id)
> +{
> +}
> +
> +#endif /* DEBUG_FAULT_PROCESS_STATUS */
> +
> +static int get_mem_fault_cpu_index(pid_t pid)
>  {
>  CPUState *cpu_iter;
>  
> @@ -421,9 +505,20 @@ static int get_mem_fault_cpu_index(uint32_t pid)
> return cpu_iter->cpu_index;
>  }
>  trace_get_mem_fault_cpu_index(pid);
> +observe_thread("not a vCPU", pid);
> +
>  return -1;
>  }
>  
> +static void observe_vcpu_state(void)
> +{
> +CPUState *cpu_iter;
> +CPU_FOREACH(cpu_iter) {
> +observe_thread("vCPU", cpu_iter->thread_id);
> +trace_vcpu_state(cpu_iter->running, cpu_iter->cpu_index);
> +}
> +}
> +
>  /*
>   * Handle faults detected by the USERFAULT markings
>   */
> @@ -465,6 +560,7 @@ static void *postcopy_ram_fault_thread(void *opaque)
>  }
>  
>  ret = read(mis->userfault_fd, &msg, sizeof(msg));
> +observe_vcpu_state();
>  if (ret != sizeof(msg)) {
>  if (errno == EAGAIN) {
>  /*
> diff --git a/migration/trace-events b/migration/trace-events
> index ab2e1e4..3a74f91 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -202,6 +202,12 @@ save_xbzrle_page_overflow(void) ""
>  ram_save_iterate_big_wait(uint64_t milliconds, int iterations) "big wait: %" 
> PRIu64 " milliseconds, %d iterations"
>  

Re: [Qemu-devel] [Qemu-devel RFC v2 1/4] msf2: Add Smartfusion2 System timer

2017-04-24 Thread Peter Maydell
On 24 April 2017 at 18:44, Alistair Francis  wrote:
> Basically the simple explanation is that init is called when the
> object is created and realize is called when the object is realized.
>
> Generally for devices it will go something like this:
>  1. init
>  2. Set properties
>  3. Connect things
>  4. realize
>  5. Map to memory
>
>> Don't we need to use realize function for new models?
>
> AFAIK we still put things like: sysbus_init_irq(),
> memory_region_init_io() and sysbus_init_mmio() in the init function.
>
> I don't think we are at a stage yet to not use init functions.

Two-phase init is here to stay -- some things must be done in
init, some must be done in realize, and some can be done in
either. Some simple devices may find they can do everything
in only one function.

Must be done in init:
 * creating properties (for the cases where that is done "by hand"
   by calling object_property_add_*())
 * calling init on child objects which you want to pass through
   alias properties for

Must be done in realize:
 * anything that can fail such that we need to report the
   error and abandon creation of the device
 * anything which depends on the values of QOM properties that
   the caller might have set

We should probably sit down and write up some guidelines for
how we recommend dealing with the various things that could
be called in either function -- this is basically a code
style and consistency question.

thanks
-- PMM
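
As a rough illustration of the split described above, here is a toy
model in plain C. ToyTimer, toy_timer_init() and toy_timer_realize()
are made-up names and deliberately do not use the QOM API; the sketch
only shows the ordering constraint: init cannot fail and must not look
at property values, realize may fail and may depend on them.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy device; none of these names are QEMU API. */
typedef struct ToyTimer {
    unsigned freq_hz;   /* "property", set by the board between the phases */
    bool has_irq;       /* resource created unconditionally in init */
    bool realized;
} ToyTimer;

/* Phase 1: cannot fail and must not depend on property values. */
void toy_timer_init(ToyTimer *t)
{
    t->freq_hz = 0;
    t->has_irq = true;
    t->realized = false;
}

/*
 * Phase 2: may fail, and may depend on what the caller configured.
 * Returns 0 on success, or -1 with *errp pointing at a message.
 */
int toy_timer_realize(ToyTimer *t, const char **errp)
{
    if (t->freq_hz == 0) {
        *errp = "freq_hz property must be set before realize";
        return -1;
    }
    t->realized = true;
    return 0;
}
```

A board would call toy_timer_init(), set freq_hz, and only then call
toy_timer_realize(), checking its return value.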



Re: [Qemu-devel] [Qemu-devel RFC v2 4/4] msf2: Add Emcraft's Smartfusion2 SOM kit.

2017-04-24 Thread Alistair Francis
>>
>> Instead of calling all of these in the init function you should split
>> it up over the machines init and realize function.
>>
>> Look at the stm32f205_soc or xlnx-zynqmp files for examples of how to do 
>> this.
>>
>> It also moves away from calling qdev_create() and qdev_init_nofail()
>> and instead manually creates the objects.
>>
> I am still learning all these. Please correct me if am wrong.
> I need to create a SoC file and a board file like stm32f205 and
> xlnx-zynqmp now right?

Hey Sundeep,

I don't think you have to do it like that. I think for some
SoCs/boards it makes sense. For example the ZynqMP SoCs are included
on multiple different boards (EP108 and ZCU102) so it makes sense to
have a SoC and a board separately defined. On the other hand if you
had a SoC that is always on the same board it doesn't make as much
sense.

It is probably a good idea to split it between a board and an SoC
unless you have a good reason not to, though.

Thanks,

Alistair

>
>> Otherwise this patch looks pretty good.
>
> Thank you :)
> Sundeep
>
>>
>> Thanks,
>>
>> Alistair
>>
>>> +}
>>> +
>>> +static void msf2_machine_init(MachineClass *mc)
>>> +{
>>> +mc->desc = "SmartFusion2 SOM kit from Emcraft";
>>> +mc->init = msf2_init;
>>> +}
>>> +
>>> +DEFINE_MACHINE("smartfusion2-som", msf2_machine_init)
>>> --
>>> 2.5.0
>>>
>>>



[Qemu-devel] [PATCH v2] crypto_gen_random() now also works on windows

2017-04-24 Thread Geert Martin Ijewski
If no crypto library is included in the build, QEMU uses 
qcrypto_random_bytes() to generate random data. That function tried to 
open /dev/urandom or /dev/random, and if opening neither file worked it 
errored out.


Those files obviously do not exist on Windows, so there the code now 
uses CryptGenRandom().


Furthermore, there was some refactoring, and a new function 
qcrypto_random_init() was introduced that initializes the (platform 
specific) handles used by qcrypto_random_bytes().


Signed-off-by: Geert Martin Ijewski 
---
 crypto/init.c|  6 ++
 crypto/random-platform.c | 45 
+

 include/crypto/random.h  |  9 +
 3 files changed, 52 insertions(+), 8 deletions(-)

diff --git a/crypto/init.c b/crypto/init.c
index f65207e..f131c42
--- a/crypto/init.c
+++ b/crypto/init.c
@@ -32,6 +32,8 @@
 #include 
 #endif

+#include "crypto/random.h"
+
 /* #define DEBUG_GNUTLS */

 /*
@@ -146,5 +148,9 @@ int qcrypto_init(Error **errp)
 gcry_control(GCRYCTL_INITIALIZATION_FINISHED, 0);
 #endif

+if (qcrypto_random_init(errp) < 0) {
+return -1;
+}
+
 return 0;
 }
diff --git a/crypto/random-platform.c b/crypto/random-platform.c
index 82b755a..49d7f80
@@ -22,14 +22,23 @@

 #include "crypto/random.h"

-int qcrypto_random_bytes(uint8_t *buf G_GNUC_UNUSED,
- size_t buflen G_GNUC_UNUSED,
- Error **errp)
-{
-int fd;
-int ret = -1;
-int got;
+#ifdef _WIN32
+#include 
+HCRYPTPROV hCryptProv;
+#else
+int fd; /* a file handle to either /dev/urandom or /dev/random */
+#endif

+int qcrypto_random_init(Error **errp)
+{
+#ifdef _WIN32
+if (!CryptAcquireContext(&hCryptProv, NULL, NULL, PROV_RSA_FULL,
+ CRYPT_SILENT | CRYPT_VERIFYCONTEXT)) {
+error_setg_errno(errp, GetLastError(),
+ "Unable to create cryptographic provider");
+return -1;
+}
+#else
 /* TBD perhaps also add support for BSD getentropy / Linux
  * getrandom syscalls directly */
 fd = open("/dev/urandom", O_RDONLY);
@@ -41,6 +50,18 @@ int qcrypto_random_bytes(uint8_t *buf G_GNUC_UNUSED,
 error_setg(errp, "No /dev/urandom or /dev/random found");
 return -1;
 }
+#endif
+
+return 0;
+}
+
+int qcrypto_random_bytes(uint8_t *buf G_GNUC_UNUSED,
+ size_t buflen G_GNUC_UNUSED,
+ Error **errp)
+{
+#ifndef _WIN32
+int ret = -1;
+int got;

 while (buflen > 0) {
 got = read(fd, buf, buflen);
@@ -59,6 +80,14 @@ int qcrypto_random_bytes(uint8_t *buf G_GNUC_UNUSED,

 ret = 0;
  cleanup:
-close(fd);
 return ret;
+#else
+if (!CryptGenRandom(hCryptProv, buflen, buf)) {
+error_setg_errno(errp, GetLastError(),
+ "Unable to read random bytes");
+return -1;
+}
+
+return 0;
+#endif
 }
diff --git a/include/crypto/random.h b/include/crypto/random.h
index a101353..82a3209
--- a/include/crypto/random.h
+++ b/include/crypto/random.h
@@ -40,5 +40,14 @@ int qcrypto_random_bytes(uint8_t *buf,
  size_t buflen,
  Error **errp);

+/**
+ * qcrypto_random_init:
+ * @errp: pointer to a NULL-initialized error object
+ *
+ * Initializes the handles used by qcrypto_random_bytes
+ *
+ * Returns 0 on success, -1 on error
+ */
+int qcrypto_random_init(Error **errp);

 #endif /* QCRYPTO_RANDOM_H */
--
1.9.1
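
For readers who want to try the POSIX side in isolation, the following
is a small standalone sketch of the same "open once in init, read in a
loop later" split. The names toy_random_init() and toy_random_bytes()
are invented for this example and are not part of QEMU; error reporting
is reduced to a return code, and the Windows branch is left out.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

int rng_fd = -1;   /* opened once, reused by every later read */

/* Open /dev/urandom, falling back to /dev/random. Returns 0 or -1. */
int toy_random_init(void)
{
    rng_fd = open("/dev/urandom", O_RDONLY);
    if (rng_fd == -1 && errno == ENOENT) {
        rng_fd = open("/dev/random", O_RDONLY);
    }
    return rng_fd < 0 ? -1 : 0;
}

/* Fill buf with buflen random bytes. Returns 0 on success, -1 on error. */
int toy_random_bytes(uint8_t *buf, size_t buflen)
{
    while (buflen > 0) {
        ssize_t got = read(rng_fd, buf, buflen);

        if (got < 0) {
            if (errno == EINTR) {
                continue;       /* interrupted, just retry */
            }
            return -1;
        }
        if (got == 0) {
            return -1;          /* unexpected end of file */
        }
        buf += got;
        buflen -= (size_t)got;
    }
    return 0;
}
```

The read loop matters because read() may return fewer bytes than asked
for; a single call is not guaranteed to fill the buffer.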




Re: [Qemu-devel] [Qemu-devel RFC v2 1/4] msf2: Add Smartfusion2 System timer

2017-04-24 Thread Alistair Francis
>>> +
>>> +isr = !!(st->regs[R_RIS] & TIMER_RIS_ACK);
>>> +ier = !!(st->regs[R_CTRL] & TIMER_CTRL_INTR);
>>> +
>>> +qemu_set_irq(st->irq, (ier && isr));
>>> +}
>>> +
>>> +static uint64_t
>>> +timer_read(void *opaque, hwaddr addr, unsigned int size)
>>> +{
>>> +struct timerblock *t = opaque;
>>> +struct msf2_timer *st;
>>> +uint32_t r = 0;
>>> +unsigned int timer;
>>> +int isr;
>>> +int ier;
>>> +
>>> +addr >>= 2;
>>> +timer = timer_from_addr(addr);
>>> +st = &t->timers[timer];
>>> +
>>> +if (timer) {
>>> +addr -= 6;
>>> +}
>>
>> Isn't this timer logic just checking if (addr >> 2) == R_MAX and if it
>> is set (addr >> 2) back to zero? This seems an overly complex way to
>> check that.
> I did not get you clearly. Do you want me to write like this:
> unsigned int timer = 0;
>
> addr >>= 2;
> if (addr >= R_MAX) {
> timer = 1;
> addr =  addr - R_MAX;
> }

Yeah, I think this is clearer than what you had earlier.

Although why do you have to subtract R_MAX, shouldn't it just be an
error if accessing values larger than R_MAX?
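
One possible shape for the decode, combining both points (select the
timer from the register index, and reject anything past the last
register), is sketched below. R_MAX = 6 is inferred from the
"addr -= 6" in the patch and NUM_TIMERS = 2 is an assumption;
decode_timer_reg() is a hypothetical helper, not something in the
series.

```c
#include <stdbool.h>
#include <stdint.h>

#define R_MAX       6   /* assumed: registers per timer, from "addr -= 6" */
#define NUM_TIMERS  2   /* assumed: two timers laid out back to back */

/*
 * Split a byte offset into (timer, register) indexes.
 * Returns false if the offset is beyond the last register of the
 * last timer, so the caller can log a guest error instead of
 * silently wrapping.
 */
bool decode_timer_reg(uint64_t byte_addr, unsigned *timer, unsigned *reg)
{
    uint64_t idx = byte_addr >> 2;   /* registers are 32 bits wide */

    if (idx >= (uint64_t)R_MAX * NUM_TIMERS) {
        return false;
    }
    *timer = (unsigned)(idx / R_MAX);
    *reg = (unsigned)(idx % R_MAX);
    return true;
}
```

With these assumptions, offset 0x18 decodes to timer 1, register 0,
and offset 0x30 is rejected.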

>
>>
>>> +
>>> +switch (addr) {
>>> +case R_VAL:
>>> +r = ptimer_get_count(st->ptimer);
>>> +D(qemu_log("msf2_timer t=%d read counter=%x\n", timer, r));
>>> +break;
>>> +
>>> +case R_MIS:
>>> +isr = !!(st->regs[R_RIS] & TIMER_RIS_ACK);
>>> +ier = !!(st->regs[R_CTRL] & TIMER_CTRL_INTR);
>>> +r = ier && isr;
>>> +break;
>>> +
>>> +default:
>>> +if (addr < ARRAY_SIZE(st->regs)) {
>>> +r = st->regs[addr];
>>> +}
>>> +break;
>>> +}
>>> +D(fprintf(stderr, "%s timer=%d %x=%x\n", __func__, timer, addr * 4, 
>>> r));
>>> +return r;
>>> +}
>>> +
>>> +static void timer_update(struct msf2_timer *st)
>>> +{
>>> +uint64_t count;
>>> +
>>> +D(fprintf(stderr, "%s timer=%d\n", __func__, st->nr));
>>> +
>>> +if (!(st->regs[R_CTRL] & TIMER_CTRL_ENBL)) {
>>> +ptimer_stop(st->ptimer);
>>> +return;
>>> +}
>>> +
>>> +count = st->regs[R_LOADVAL];
>>> +ptimer_set_limit(st->ptimer, count, 1);
>>> +ptimer_run(st->ptimer, 1);
>>> +}
>>
>> The update function should be above the read/write functions.
>>
> Ok I will change
>
>>> +
>>> +static void
>>> +timer_write(void *opaque, hwaddr addr,
>>> +uint64_t val64, unsigned int size)
>>> +{
>>> +struct timerblock *t = opaque;
>>> +struct msf2_timer *st;
>>> +unsigned int timer;
>>> +uint32_t value = val64;
>>> +
>>> +addr >>= 2;
>>> +timer = timer_from_addr(addr);
>>> +st = &t->timers[timer];
>>> +D(fprintf(stderr, "%s addr=%x val=%x (timer=%d)\n",
>>> + __func__, addr * 4, value, timer));
>>> +
>>> +if (timer) {
>>> +addr -= 6;
>>> +}
>>
>> Same comment from the read function.
>>
>>> +
>>> +switch (addr) {
>>> +case R_CTRL:
>>> +st->regs[R_CTRL] = value;
>>> +timer_update(st);
>>> +break;
>>> +
>>> +case R_RIS:
>>> +if (value & TIMER_RIS_ACK) {
>>> +st->regs[R_RIS] &= ~TIMER_RIS_ACK;
>>> +}
>>> +break;
>>> +
>>> +case R_LOADVAL:
>>> +st->regs[R_LOADVAL] = value;
>>> +if (st->regs[R_CTRL] & TIMER_CTRL_ENBL) {
>>> +timer_update(st);
>>> +}
>>> +break;
>>> +
>>> +case R_BGLOADVAL:
>>> +st->regs[R_BGLOADVAL] = value;
>>> +st->regs[R_LOADVAL] = value;
>>> +break;
>>> +
>>> +case R_VAL:
>>> +case R_MIS:
>>> +break;
>>> +
>>> +default:
>>> +if (addr < ARRAY_SIZE(st->regs)) {
>>> +st->regs[addr] = value;
>>> +}
>>> +break;
>>> +}
>>> +timer_update_irq(st);
>>> +}
>>> +
>>> +static const MemoryRegionOps timer_ops = {
>>> +.read = timer_read,
>>> +.write = timer_write,
>>> +.endianness = DEVICE_NATIVE_ENDIAN,
>>> +.valid = {
>>> +.min_access_size = 4,
>>> +.max_access_size = 4
>>> +}
>>> +};
>>> +
>>> +static void timer_hit(void *opaque)
>>> +{
>>> +struct msf2_timer *st = opaque;
>>> +D(fprintf(stderr, "%s %d\n", __func__, st->nr));
>>> +st->regs[R_RIS] |= TIMER_RIS_ACK;
>>> +
>>> +if (!(st->regs[R_CTRL] & TIMER_CTRL_ONESHOT)) {
>>> +timer_update(st);
>>> +}
>>> +timer_update_irq(st);
>>> +}
>>> +
>>> +static void msf2_timer_realize(DeviceState *dev, Error **errp)
>>> +{
>>> +struct timerblock *t = MSF2_TIMER(dev);
>>> +unsigned int i;
>>> +
>>> +/* Init all the ptimers.  */
>>> +t->timers = g_malloc0((sizeof t->timers[0]) * NUM_TIMERS);
>>> +for (i = 0; i < NUM_TIMERS; i++) {
>>> +struct msf2_timer *st = &t->timers[i];
>>> +
>>> +st->parent = t;
>>> +st->nr = i;
>>> +st->bh = qemu_bh_new(timer_hit, st);
>>> +st->ptimer = ptimer_init(st->bh, PTIMER_POLICY_DEFAULT);
>>> +ptimer_set_freq(st->ptimer, t->freq_hz);
>>> +

Re: [Qemu-devel] [PATCH v2] ipmi: add SET_SENSOR_READING command

2017-04-24 Thread Michael S. Tsirkin
On Wed, Apr 12, 2017 at 09:08:50AM +0200, Cédric Le Goater wrote:
> SET_SENSOR_READING is a complex IPMI command (see IPMI spec 35.17)
> which enables the host software to set the reading value and the event
> status of sensors supporting it.
> 
> Below is a proposal for all the operations (reading, assert, deassert,
> event data) with the following limitations :
> 
>  - No events are generated for threshold-based sensors. 
>  - The case in which the BMC needs to generate its own events is not
>supported.
>
> Signed-off-by: Cédric Le Goater 

Need to look at how guests will behave if they start on
a new QEMU and migrate to old one.
Any input? Do we need a flag to behave like 2.9 did if
we want compatibility?

> ---
> 
>  Corey,
> 
>  There is some progress but I didn't add a check on the value before
>  generating events because we would not generate an event when only
>  the event bytes change. 
> 
>  Changes since v1:
>  
>  - created copies of the reading and the assertion bits before
>committing the values
>  - added some TODOs
>  - handled inconsistent Event Data Bytes operation
>  
>  hw/ipmi/ipmi_bmc_sim.c |  213 
> +
>  1 file changed, 213 insertions(+)
> 
> Index: qemu-powernv-2.9.git/hw/ipmi/ipmi_bmc_sim.c
> ===
> --- qemu-powernv-2.9.git.orig/hw/ipmi/ipmi_bmc_sim.c
> +++ qemu-powernv-2.9.git/hw/ipmi/ipmi_bmc_sim.c
> @@ -45,6 +45,7 @@
>  #define IPMI_CMD_GET_SENSOR_READING   0x2d
>  #define IPMI_CMD_SET_SENSOR_TYPE  0x2e
>  #define IPMI_CMD_GET_SENSOR_TYPE  0x2f
> +#define IPMI_CMD_SET_SENSOR_READING   0x30
>  
>  /* #define IPMI_NETFN_APP 0x06 In ipmi.h */
>  
> @@ -1739,6 +1740,217 @@ static void get_sensor_type(IPMIBmcSim *
>  rsp_buffer_push(rsp, sens->evt_reading_type_code);
>  }
>  
> +/*
> + * bytes   parameter
> + *1sensor number
> + *2operation (see below for bits meaning)
> + *3sensor reading
> + *  4:5assertion states (optional)
> + *  6:7deassertion states (optional)
> + *  8:10   event data 1,2,3 (optional)
> + */
> +static void set_sensor_reading(IPMIBmcSim *ibs,
> +   uint8_t *cmd, unsigned int cmd_len,
> +   RspBuffer *rsp)
> +{
> +IPMISensor *sens;
> +uint8_t evd1 = 0;
> +uint8_t evd2 = 0;
> +uint8_t evd3 = 0;
> +uint8_t new_reading = 0;
> +uint16_t new_assert_states = 0;
> +uint16_t new_deassert_states = 0;
> +bool change_reading = false;
> +bool change_assert = false;
> +bool change_deassert = false;
> +enum {
> +SENSOR_GEN_EVENT_NONE,
> +SENSOR_GEN_EVENT_DATA,
> +SENSOR_GEN_EVENT_BMC,
> +} do_gen_event = SENSOR_GEN_EVENT_NONE;
> +
> +if ((cmd[2] >= MAX_SENSORS) ||
> +!IPMI_SENSOR_GET_PRESENT(ibs->sensors + cmd[2])) {
> +rsp_buffer_set_error(rsp, IPMI_CC_REQ_ENTRY_NOT_PRESENT);
> +return;
> +}
> +
> +sens = ibs->sensors + cmd[2];
> +
> +/* [1:0] Sensor Reading operation */
> +switch ((cmd[3]) & 0x3) {
> +case 0: /* Do not change */
> +break;
> +case 1: /* write given value to sensor reading byte */
> +new_reading = cmd[4];
> +if (sens->reading != new_reading) {
> +change_reading = true;
> +}
> +break;
> +case 2:
> +case 3:
> +rsp_buffer_set_error(rsp, IPMI_CC_INVALID_DATA_FIELD);
> +return;
> +}
> +
> +/* [3:2] Deassertion bits operation */
> +switch ((cmd[3] >> 2) & 0x3) {
> +case 0: /* Do not change */
> +break;
> +case 1: /* write given value */
> +if (cmd_len > 7) {
> +new_deassert_states = cmd[7];
> +change_deassert = true;
> +}
> +if (cmd_len > 8) {
> +new_deassert_states |= (cmd[8] << 8);
> +}
> +break;
> +
> +case 2: /* mask on */
> +if (cmd_len > 7) {
> +new_deassert_states = (sens->deassert_states | cmd[7]);
> +change_deassert = true;
> +}
> +if (cmd_len > 8) {
> +new_deassert_states |= (sens->deassert_states | (cmd[8] << 8));
> +}
> +break;
> +
> +case 3: /* mask off */
> +if (cmd_len > 7) {
> +new_deassert_states = (sens->deassert_states & cmd[7]);
> +change_deassert = true;
> +}
> +if (cmd_len > 8) {
> +new_deassert_states |= (sens->deassert_states & (cmd[8] << 8));
> +}
> +break;
> +}
> +
> +if (change_deassert && (new_deassert_states == sens->deassert_states)) {
> +change_deassert = false;
> +}
> +
> +/* [5:4] Assertion bits operation */
> +switch ((cmd[3] >> 4) & 0x3) {
> +case 0: /* Do not change */
> +break;
> +case 1: /* write given value */
> +if (cmd_len > 5) {
> +

Re: [Qemu-devel] [PATCH 5/6] migration: send postcopy downtime back to source

2017-04-24 Thread Dr. David Alan Gilbert
* Alexey Perevalov (a.pereva...@samsung.com) wrote:
> Right now, to initiate a postcopy live migration, a request must be
> sent to the source machine, specifying the destination.
> 
> The user can request migration status with the query-migrate QMP command on
> the source machine, but postcopy downtime is evaluated on the destination,
> so it has to be transmitted back to the source. The return path
> socket was chosen for this purpose.
> 
> Signed-off-by: Alexey Perevalov 

That will break a migration from an older QEMU to a newer QEMU with this feature
since the old QEMU won't know the message type and fail with a
  'Received invalid message'

near the start of source_return_path_thread.

The simpler solution is to let the stat be read on the destination side
and not bother sending it backwards over the wire.

Dave
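
To make the compatibility problem concrete, here is a toy model of a
receiver that validates the message type against the highest value it
was built with. The TOY_* names and toy_accepts() are invented for the
example and do not reproduce the real return-path wire format; the
only point is that a receiver built before the new type existed has no
choice but to reject it.

```c
/* Message types as an "old" and a "new" build would see them. */
enum {
    TOY_INVALID = 0,
    TOY_SHUT,
    TOY_PONG,
    TOY_REQ_PAGES_ID,
    TOY_REQ_PAGES,
    TOY_OLD_MAX,                  /* first value unknown to the old build */
    TOY_DOWNTIME = TOY_OLD_MAX,   /* new type reuses that slot */
    TOY_NEW_MAX
};

/*
 * Return 0 if a receiver whose table ends at receiver_max would
 * accept header_type, or -1 if it would treat it as an invalid message.
 */
int toy_accepts(unsigned header_type, unsigned receiver_max)
{
    if (header_type == TOY_INVALID || header_type >= receiver_max) {
        return -1;
    }
    return 0;
}
```

An old source (receiver_max = TOY_OLD_MAX) rejects TOY_DOWNTIME, while
a new one (receiver_max = TOY_NEW_MAX) accepts it, which is the
old-to-new breakage described above.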

> ---
>  include/migration/migration.h |  4 +++-
>  migration/migration.c | 20 ++--
>  migration/postcopy-ram.c  |  1 +
>  3 files changed, 22 insertions(+), 3 deletions(-)
> 
> diff --git a/include/migration/migration.h b/include/migration/migration.h
> index 5d2c628..5535aa6 100644
> --- a/include/migration/migration.h
> +++ b/include/migration/migration.h
> @@ -55,7 +55,8 @@ enum mig_rp_message_type {
>  
>  MIG_RP_MSG_REQ_PAGES_ID, /* data (start: be64, len: be32, id: string) */
>  MIG_RP_MSG_REQ_PAGES,/* data (start: be64, len: be32) */
> -
> +MIG_RP_MSG_DOWNTIME,/* downtime value from destination,
> +   calculated and sent in case of post copy */
>  MIG_RP_MSG_MAX
>  };
>  
> @@ -364,6 +365,7 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
>uint32_t value);
>  void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* 
> rbname,
>ram_addr_t start, size_t len);
> +void migrate_send_rp_downtime(MigrationIncomingState *mis, uint64_t 
> downtime);
>  
>  void ram_control_before_iterate(QEMUFile *f, uint64_t flags);
>  void ram_control_after_iterate(QEMUFile *f, uint64_t flags);
> diff --git a/migration/migration.c b/migration/migration.c
> index 5bac434..3134e24 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -553,6 +553,19 @@ void migrate_send_rp_message(MigrationIncomingState *mis,
>  }
>  
>  /*
> + * Send postcopy migration downtime,
> + * at the moment of calling this function migration should
> + * be completed.
> + */
> +void migrate_send_rp_downtime(MigrationIncomingState *mis, uint64_t downtime)
> +{
> +uint64_t buf;
> +
> +buf = cpu_to_be64(downtime);
> +migrate_send_rp_message(mis, MIG_RP_MSG_DOWNTIME, sizeof(downtime), 
> &buf);
> +}
> +
> +/*
>   * Send a 'SHUT' message on the return channel with the given value
>   * to indicate that we've finished with the RP.  Non-0 value indicates
>   * error.
> @@ -1483,6 +1496,7 @@ static struct rp_cmd_args {
>  [MIG_RP_MSG_PONG]   = { .len =  4, .name = "PONG" },
>  [MIG_RP_MSG_REQ_PAGES]  = { .len = 12, .name = "REQ_PAGES" },
>  [MIG_RP_MSG_REQ_PAGES_ID]   = { .len = -1, .name = "REQ_PAGES_ID" },
> +[MIG_RP_MSG_DOWNTIME]   = { .len =  8, .name = "DOWNTIME" },
>  [MIG_RP_MSG_MAX]= { .len = -1, .name = "MAX" },
>  };
>  
> @@ -1613,6 +1627,10 @@ static void *source_return_path_thread(void *opaque)
>  migrate_handle_rp_req_pages(ms, (char *)&buf[13], start, len);
>  break;
>  
> +case MIG_RP_MSG_DOWNTIME:
> +ms->downtime = ldq_be_p(buf);
> +break;
> +
>  default:
>  break;
>  }
> @@ -1677,7 +1695,6 @@ static int postcopy_start(MigrationState *ms, bool 
> *old_vm_running)
>  int ret;
>  QIOChannelBuffer *bioc;
>  QEMUFile *fb;
> -int64_t time_at_stop = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>  bool restart_block = false;
>  migrate_set_state(&ms->state, MIGRATION_STATUS_ACTIVE,
>MIGRATION_STATUS_POSTCOPY_ACTIVE);
> @@ -1779,7 +1796,6 @@ static int postcopy_start(MigrationState *ms, bool 
> *old_vm_running)
>   */
>  ms->postcopy_after_devices = true;
>  notifier_list_notify(&migration_state_notifiers, ms);
> -ms->downtime =  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - time_at_stop;
>  
>  qemu_mutex_unlock_iothread();
>  
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index ea89f4e..42330fd 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -330,6 +330,7 @@ int postcopy_ram_incoming_cleanup(MigrationIncomingState 
> *mis)
>  }
>  
>  postcopy_state_set(POSTCOPY_INCOMING_END);
> +migrate_send_rp_downtime(mis, get_postcopy_total_downtime());
>  migrate_send_rp_shut(mis, qemu_file_get_error(mis->from_src_file) != 0);
>  
>  if (mis->postcopy_tmp_page) {
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [PATCH v2] trace: add qemu mutex lock and unlock trace events

2017-04-24 Thread Jose Ricardo Ziviani
These trace events were very useful to help me to understand and find a
reordering issue in vfio, for example:

qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2020c, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8
qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc4, 0xa, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8

that also helped me to see the desired result after the fix:

qemu_mutex_lock locked mutex 0x10905ad8
  vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2000c, 4)
  vfio_region_write  (0001:03:00.0:region1+0xc4, 0xb, 4)
qemu_mutex_unlock unlocked mutex 0x10905ad8

So it could be a good idea to have these traces implemented. It's worth
mentioning that they should be enabled selectively while debugging;
otherwise they can flood the trace logs with lock/unlock messages.

How to use it:
trace-event qemu_mutex_lock on|off
trace-event qemu_mutex_unlock on|off
or
trace-event qemu_mutex* on|off

Signed-off-by: Jose Ricardo Ziviani 
---
v2:
  - removed unnecessary (void*) cast
  - renamed parameter name to lock instead of qemu_global_mutex

 util/qemu-thread-posix.c | 5 +
 util/trace-events| 4 
 2 files changed, 9 insertions(+)

diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
index 73e3a0e..4f77d7b 100644
--- a/util/qemu-thread-posix.c
+++ b/util/qemu-thread-posix.c
@@ -14,6 +14,7 @@
 #include "qemu/thread.h"
 #include "qemu/atomic.h"
 #include "qemu/notify.h"
+#include "trace.h"
 
 static bool name_threads;
 
@@ -60,6 +61,8 @@ void qemu_mutex_lock(QemuMutex *mutex)
 err = pthread_mutex_lock(&mutex->lock);
 if (err)
 error_exit(err, __func__);
+
+trace_qemu_mutex_lock(&mutex->lock);
 }
 
 int qemu_mutex_trylock(QemuMutex *mutex)
@@ -74,6 +77,8 @@ void qemu_mutex_unlock(QemuMutex *mutex)
 err = pthread_mutex_unlock(&mutex->lock);
 if (err)
 error_exit(err, __func__);
+
+trace_qemu_mutex_unlock(&mutex->lock);
 }
 
 void qemu_rec_mutex_init(QemuRecMutex *mutex)
diff --git a/util/trace-events b/util/trace-events
index b44ef4f..70f6212 100644
--- a/util/trace-events
+++ b/util/trace-events
@@ -55,3 +55,7 @@ lockcnt_futex_wait_prepare(const void *lockcnt, int expected, 
int new) "lockcnt
 lockcnt_futex_wait(const void *lockcnt, int val) "lockcnt %p waiting on %d"
 lockcnt_futex_wait_resume(const void *lockcnt, int new) "lockcnt %p after 
wait: %d"
 lockcnt_futex_wake(const void *lockcnt) "lockcnt %p waking up one waiter"
+
+# util/qemu-thread-posix.c
+qemu_mutex_lock(void *lock) "locked mutex %p"
+qemu_mutex_unlock(void *lock) "unlocked mutex %p"
-- 
2.7.4




Re: [Qemu-devel] [PATCH 4/6] migration: calculate downtime on dst side (CPUMASK)

2017-04-24 Thread Dr. David Alan Gilbert
* Alexey (a.pereva...@samsung.com) wrote:
> Hello David,
> this mail is just for the CPUMASK discussion.
> 
> On Fri, Apr 21, 2017 at 01:00:32PM +0100, Dr. David Alan Gilbert wrote:
> > * Alexey Perevalov (a.pereva...@samsung.com) wrote:
> > > This patch provides downtime calculation per vCPU,
> > > as a summary and as an overlapped value for all vCPUs.
> > > 
> > > This approach just keeps tree with page fault addr as a key,
> > > and t1-t2 interval of pagefault time and page copy time, with
> > > affected vCPU bit mask.
> > > For more implementation details please see comment to
> > > get_postcopy_total_downtime function.
> > > 
> > > Signed-off-by: Alexey Perevalov 
> > > ---
> > >  include/migration/migration.h |  14 +++
> > >  migration/migration.c | 280 
> > > +-
> > >  migration/postcopy-ram.c  |  24 +++-
> > >  migration/qemu-file.c |   1 -
> > >  migration/trace-events|   9 +-
> > >  5 files changed, 323 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > > index 5720c88..5d2c628 100644
> > > --- a/include/migration/migration.h
> > > +++ b/include/migration/migration.h
> > > @@ -123,10 +123,24 @@ struct MigrationIncomingState {
> > >  
> > >  /* See savevm.c */
> > >  LoadStateEntry_Head loadvm_handlers;
> > > +
> > > +/*
> > > + *  Tree for keeping postcopy downtime,
> > > + *  necessary to calculate correct downtime, during multiple
> > > + *  vm suspends, it keeps host page address as a key and
> > > + *  DowntimeDuration as a data
> > > + *  NULL means kernel couldn't provide process thread id,
> > > + *  and QEMU couldn't identify which vCPU raise page fault
> > > + */
> > > +GTree *postcopy_downtime;
> > >  };
> > >  
> > >  MigrationIncomingState *migration_incoming_get_current(void);
> > >  void migration_incoming_state_destroy(void);
> > > +void mark_postcopy_downtime_begin(uint64_t addr, int cpu);
> > > +void mark_postcopy_downtime_end(uint64_t addr);
> > > +uint64_t get_postcopy_total_downtime(void);
> > > +void destroy_downtime_duration(gpointer data);
> > >  
> > >  /*
> > >   * An outstanding page request, on the source, having been received
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 79f6425..5bac434 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -38,6 +38,8 @@
> > >  #include "io/channel-tls.h"
> > >  #include "migration/colo.h"
> > >  
> > > +#define DEBUG_VCPU_DOWNTIME 1
> > > +
> > >  #define MAX_THROTTLE  (32 << 20)  /* Migration transfer speed 
> > > throttling */
> > >  
> > >  /* Amount of time to allocate to each "chunk" of bandwidth-throttled
> > > @@ -77,6 +79,19 @@ static NotifierList migration_state_notifiers =
> > >  
> > >  static bool deferred_incoming;
> > >  
> > > +typedef struct {
> > > +int64_t begin;
> > > +int64_t end;
> > > +uint64_t *cpus; /* cpus bit mask array, QEMU bit functions support
> > > + bit operation on memory regions, but doesn't check out of range */
> > > +} DowntimeDuration;
> > > +
> > > +typedef struct {
> > > +int64_t tp; /* point in time */
> > > +bool is_end;
> > > +uint64_t *cpus;
> > > +} OverlapDowntime;
> > > +
> > >  /*
> > >   * Current state of incoming postcopy; note this is not part of
> > >   * MigrationIncomingState since it's state is used during cleanup
> > > @@ -117,6 +132,13 @@ MigrationState *migrate_get_current(void)
> > >  return &current_migration;
> > >  }
> > >  
> > > +void destroy_downtime_duration(gpointer data)
> > > +{
> > > +DowntimeDuration *dd = (DowntimeDuration *)data;
> > > +g_free(dd->cpus);
> > > +g_free(data);
> > > +}
> > > +
> > >  MigrationIncomingState *migration_incoming_get_current(void)
> > >  {
> > >  static bool once;
> > > @@ -138,10 +160,13 @@ void migration_incoming_state_destroy(void)
> > >  struct MigrationIncomingState *mis = 
> > > migration_incoming_get_current();
> > >  
> > >  qemu_event_destroy(&mis->main_thread_load_event);
> > > +if (mis->postcopy_downtime) {
> > > +g_tree_destroy(mis->postcopy_downtime);
> > > +mis->postcopy_downtime = NULL;
> > > +}
> > >  loadvm_free_handlers(mis);
> > >  }
> > >  
> > > -
> > >  typedef struct {
> > >  bool optional;
> > >  uint32_t size;
> > > @@ -1754,7 +1779,6 @@ static int postcopy_start(MigrationState *ms, bool 
> > > *old_vm_running)
> > >   */
> > >  ms->postcopy_after_devices = true;
> > >  notifier_list_notify(&migration_state_notifiers, ms);
> > > -
> > 
> > Stray deletion
> > 
> > >  ms->downtime =  qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - 
> > > time_at_stop;
> > >  
> > >  qemu_mutex_unlock_iothread();
> > > @@ -2117,3 +2141,255 @@ PostcopyState postcopy_state_set(PostcopyState 
> > > new_state)
> > >  return atomic_xchg(&incoming_postcopy_state, new_state);
> > >  }
> > >  
> > > 

Re: [Qemu-devel] [PATCH 4/6] migration: calculate downtime on dst side

2017-04-24 Thread Dr. David Alan Gilbert
* Alexey (a.pereva...@samsung.com) wrote:
> Hello, David!
> 
> 
> I apologize, I forgot to check the patches with the checkpatch.pl script. I have
> now checked them and fixed the code style in the patches. I also checked the
> files as a whole: migration.c has code style errors, and so does glib-compat.h.
> I could send those patches to qemu-trivial, if you are not against it.

Feel free to send style patches to trivial;  if they're right next
to a line you're changing then you can include them in the same patch
but if they're elsewhere do as you say with a trivial patch.

Dave

> 
> On Fri, Apr 21, 2017 at 01:00:32PM +0100, Dr. David Alan Gilbert wrote:
> > * Alexey Perevalov (a.pereva...@samsung.com) wrote:
> > > This patch provides downtime calculation per vCPU,
> > > as a summary and as an overlapped value for all vCPUs.
> > > 
> > > This approach just keeps tree with page fault addr as a key,
> > > and t1-t2 interval of pagefault time and page copy time, with
> > > affected vCPU bit mask.
> > > For more implementation details please see comment to
> > > get_postcopy_total_downtime function.
> > > 
> > > Signed-off-by: Alexey Perevalov 
> > > ---
> > >  include/migration/migration.h |  14 +++
> > >  migration/migration.c | 280 
> > > +-
> > >  migration/postcopy-ram.c  |  24 +++-
> > >  migration/qemu-file.c |   1 -
> > >  migration/trace-events|   9 +-
> > >  5 files changed, 323 insertions(+), 5 deletions(-)
> > > 
> > > diff --git a/include/migration/migration.h b/include/migration/migration.h
> > > index 5720c88..5d2c628 100644
> > > --- a/include/migration/migration.h
> > > +++ b/include/migration/migration.h
> > > @@ -123,10 +123,24 @@ struct MigrationIncomingState {
> > >  
> > >  /* See savevm.c */
> > >  LoadStateEntry_Head loadvm_handlers;
> > > +
> > > +/*
> > > + *  Tree for keeping postcopy downtime,
> > > + *  necessary to calculate correct downtime, during multiple
> > > + *  vm suspends, it keeps host page address as a key and
> > > + *  DowntimeDuration as a data
> > > + *  NULL means kernel couldn't provide process thread id,
> > > + *  and QEMU couldn't identify which vCPU raise page fault
> > > + */
> > > +GTree *postcopy_downtime;
> > >  };
> > >  
> > >  MigrationIncomingState *migration_incoming_get_current(void);
> > >  void migration_incoming_state_destroy(void);
> > > +void mark_postcopy_downtime_begin(uint64_t addr, int cpu);
> > > +void mark_postcopy_downtime_end(uint64_t addr);
> > > +uint64_t get_postcopy_total_downtime(void);
> > > +void destroy_downtime_duration(gpointer data);
> > >  
> > >  /*
> > >   * An outstanding page request, on the source, having been received
> > > diff --git a/migration/migration.c b/migration/migration.c
> > > index 79f6425..5bac434 100644
> > > --- a/migration/migration.c
> > > +++ b/migration/migration.c
> > > @@ -38,6 +38,8 @@
> > >  #include "io/channel-tls.h"
> > >  #include "migration/colo.h"
> > >  
> > > +#define DEBUG_VCPU_DOWNTIME 1
> > > +
> > >  #define MAX_THROTTLE  (32 << 20)  /* Migration transfer speed 
> > > throttling */
> > >  
> > >  /* Amount of time to allocate to each "chunk" of bandwidth-throttled
> > > @@ -77,6 +79,19 @@ static NotifierList migration_state_notifiers =
> > >  
> > >  static bool deferred_incoming;
> > >  
> > > +typedef struct {
> > > +int64_t begin;
> > > +int64_t end;
> > > +uint64_t *cpus; /* cpus bit mask array, QEMU bit functions support
> > > + bit operation on memory regions, but doesn't check out of range */
> > > +} DowntimeDuration;
> > > +
> > > +typedef struct {
> > > +int64_t tp; /* point in time */
> > > +bool is_end;
> > > +uint64_t *cpus;
> > > +} OverlapDowntime;
> > > +
> > >  /*
> > >   * Current state of incoming postcopy; note this is not part of
> > >   * MigrationIncomingState since it's state is used during cleanup
> > > @@ -117,6 +132,13 @@ MigrationState *migrate_get_current(void)
> > >  return &current_migration;
> > >  }
> > >  
> > > +void destroy_downtime_duration(gpointer data)
> > > +{
> > > +DowntimeDuration *dd = (DowntimeDuration *)data;
> > > +g_free(dd->cpus);
> > > +g_free(data);
> > > +}
> > > +
> > >  MigrationIncomingState *migration_incoming_get_current(void)
> > >  {
> > >  static bool once;
> > > @@ -138,10 +160,13 @@ void migration_incoming_state_destroy(void)
> > >  struct MigrationIncomingState *mis = 
> > > migration_incoming_get_current();
> > >  
> > >  qemu_event_destroy(&mis->main_thread_load_event);
> > > +if (mis->postcopy_downtime) {
> > > +g_tree_destroy(mis->postcopy_downtime);
> > > +mis->postcopy_downtime = NULL;
> > > +}
> > >  loadvm_free_handlers(mis);
> > >  }
> > >  
> > > -
> > >  typedef struct {
> > >  bool optional;
> > >  uint32_t size;
> > > @@ -1754,7 +1779,6 @@ static int postcopy_start(MigrationState *ms, bool 
> > > *old_vm_running)
> > >   */
> 

Re: [Qemu-devel] [PATCH 3/6] migration: add UFFD_FEATURE_THREAD_ID feature support

2017-04-24 Thread Dr. David Alan Gilbert
* Alexey (a.pereva...@samsung.com) wrote:
> On Mon, Apr 24, 2017 at 04:12:29PM +0800, Peter Xu wrote:
> > On Fri, Apr 21, 2017 at 06:22:12PM +0300, Alexey wrote:
> > > On Fri, Apr 21, 2017 at 11:24:54AM +0100, Dr. David Alan Gilbert wrote:
> > > > * Alexey Perevalov (a.pereva...@samsung.com) wrote:
> > > > > The userfaultfd mechanism is able to provide the process thread id
> > > > > when the client requests it with the UFFDIO_API ioctl.
> > > > > 
> > > > > Signed-off-by: Alexey Perevalov 
> > > > 
> > > > There seem to be two parts to this:
> > > >   a) Adding the mis parameter to ufd_version_check
> > > >   b) Asking for the feature
> > > > 
> > > > Please split it into two patches.
> > > > 
> > > > Also
> > > > 
> > > > > ---
> > > > >  include/migration/postcopy-ram.h |  2 +-
> > > > >  migration/migration.c|  2 +-
> > > > >  migration/postcopy-ram.c | 12 ++--
> > > > >  migration/savevm.c   |  2 +-
> > > > >  4 files changed, 9 insertions(+), 9 deletions(-)
> > > > > 
> > > > > diff --git a/include/migration/postcopy-ram.h 
> > > > > b/include/migration/postcopy-ram.h
> > > > > index 8e036b9..809f6db 100644
> > > > > --- a/include/migration/postcopy-ram.h
> > > > > +++ b/include/migration/postcopy-ram.h
> > > > > @@ -14,7 +14,7 @@
> > > > >  #define QEMU_POSTCOPY_RAM_H
> > > > >  
> > > > >  /* Return true if the host supports everything we need to do 
> > > > > postcopy-ram */
> > > > > -bool postcopy_ram_supported_by_host(void);
> > > > > +bool postcopy_ram_supported_by_host(MigrationIncomingState *mis);
> > > > >  
> > > > >  /*
> > > > >   * Make all of RAM sensitive to accesses to areas that haven't yet 
> > > > > been written
> > > > > diff --git a/migration/migration.c b/migration/migration.c
> > > > > index ad4036f..79f6425 100644
> > > > > --- a/migration/migration.c
> > > > > +++ b/migration/migration.c
> > > > > @@ -802,7 +802,7 @@ void 
> > > > > qmp_migrate_set_capabilities(MigrationCapabilityStatusList *params,
> > > > >   * special support.
> > > > >   */
> > > > >  if (!old_postcopy_cap && runstate_check(RUN_STATE_INMIGRATE) 
> > > > > &&
> > > > > -!postcopy_ram_supported_by_host()) {
> > > > > +!postcopy_ram_supported_by_host(NULL)) {
> > > > >  /* postcopy_ram_supported_by_host will have emitted a 
> > > > > more
> > > > >   * detailed message
> > > > >   */
> > > > > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > > > > index dc80dbb..70f0480 100644
> > > > > --- a/migration/postcopy-ram.c
> > > > > +++ b/migration/postcopy-ram.c
> > > > > @@ -60,13 +60,13 @@ struct PostcopyDiscardState {
> > > > >  #include 
> > > > >  #include 
> > > > >  
> > > > > -static bool ufd_version_check(int ufd)
> > > > > +static bool ufd_version_check(int ufd, MigrationIncomingState *mis)
> > > > >  {
> > > > >  struct uffdio_api api_struct;
> > > > >  uint64_t ioctl_mask;
> > > > >  
> > > > >  api_struct.api = UFFD_API;
> > > > > -api_struct.features = 0;
> > > > > +api_struct.features = UFFD_FEATURE_THREAD_ID;
> > > > >  if (ioctl(ufd, UFFDIO_API, &api_struct)) {
> > > > >  error_report("postcopy_ram_supported_by_host: UFFDIO_API 
> > > > > failed: %s",
> > > > >   strerror(errno));
> > > > 
> > > > You're not actually using the 'mis' here - what I'd expected was
> > > > something that was going to check if the UFFDIO_API return said that it 
> > > > really
> > > > had the feature, and if so store a flag in the MIS somewhere.
> > > > 
> > > > Also, I'm not sure it's right to set 'api_struct.features' on the input 
> > > > - what
> > > > happens if this is run on an old kernel - we don't want postcopy to 
> > > > fail on
> > > > an old kernel without your feature.
> > > > I'm not 100% sure of the interface, but I think the way it works is you 
> > > > set
> > > > features = 0 before the call, and then check the api_struct.features in 
> > > > the
> > > > return - in the same way that I check for 
> > > > UFFD_FEATURE_MISSING_HUGETLBFS.
> > > > 
> > > We need to ask kernel about that feature,
> > > right,
> > > kernel returns back available features
> > > uffdio_api.features = UFFD_API_FEATURES
> > > but it also stores requested features
> > 
> > I feel like this does not go against Dave's comment; maybe we just need
> > to send the UFFDIO_API twice? Like:
> Yes, an ioctl with UFFDIO_API will fail on an old kernel if we request
> e.g. UFFD_FEATURE_THREAD_ID or another new feature.
> 
> So in general we need a per-feature request, for better error handling.

No, we don't need to - I think the way the kernel works is that you pass
features = 0 in, and it sets api_struct.features on the way out;
so if you always pass 0 in, you can then just check the features that
it returns.
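
A minimal sketch of the check described above, assuming the kernel fills in
the feature mask on return from the UFFDIO_API ioctl. ExampleUffdioApi,
EXAMPLE_FEATURE_THREAD_ID and example_has_feature() are hypothetical
stand-ins so the snippet builds without kernel headers; real code would use
struct uffdio_api and the UFFD_FEATURE_* bits from <linux/userfaultfd.h>:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct uffdio_api (only the fields used here). */
typedef struct {
    uint64_t api;
    uint64_t features;   /* passed in as 0, read back after the ioctl */
} ExampleUffdioApi;

/* Hypothetical bit value; the real one comes from <linux/userfaultfd.h>. */
#define EXAMPLE_FEATURE_THREAD_ID (UINT64_C(1) << 8)

/* Return true if the feature mask reported back contains 'feature'. */
static bool example_has_feature(const ExampleUffdioApi *reported,
                                uint64_t feature)
{
    return (reported->features & feature) != 0;
}
```

The result could then be stored as a flag in the MigrationIncomingState, so
that postcopy keeps working on a kernel that does not report the bit.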

Dave

> 
> > 
> > diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> > index 85fd8d7..fd0905f 100644
> > --- 

Re: [Qemu-devel] [PATCH] trace: add qemu mutex lock and unlock trace events

2017-04-24 Thread joserz
On Mon, Apr 24, 2017 at 10:45:52PM +0800, Fam Zheng wrote:
> On Mon, 04/24 11:28, Jose Ricardo Ziviani wrote:
> > These trace events were very useful to help me to understand and find a
> > reordering issue in vfio, for example:
> > 
> > qemu_mutex_lock locked mutex 0x10905ad8
> >   vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2020c, 4)
> > qemu_mutex_unlock unlocked mutex 0x10905ad8
> > qemu_mutex_lock locked mutex 0x10905ad8
> >   vfio_region_write  (0001:03:00.0:region1+0xc4, 0xa, 4)
> > qemu_mutex_unlock unlocked mutex 0x10905ad8
> > 
> > that also helped to see desired result after the fix:
> > 
> > qemu_mutex_lock locked mutex 0x10905ad8
> >   vfio_region_write  (0001:03:00.0:region1+0xc0, 0x2000c, 4)
> >   vfio_region_write  (0001:03:00.0:region1+0xc4, 0xb, 4)
> > qemu_mutex_unlock unlocked mutex 0x10905ad8
> > 
> > So it could be a good idea to have these traces implemented. It's worth
> > mentioning that they should be surgically enabled during the debugging,
> > otherwise it'd flood the trace logs with lock/unlock messages.
> > 
> > How to use it:
> > trace-event qemu_mutex_lock on|off
> > trace-event qemu_mutex_unlock on|off
> > or
> > trace-event qemu_mutex* on|off
> > 
> > Signed-off-by: Jose Ricardo Ziviani 
> > ---
> >  util/qemu-thread-posix.c | 5 +++++
> >  util/trace-events        | 4 ++++
> >  2 files changed, 9 insertions(+)
> > 
> > diff --git a/util/qemu-thread-posix.c b/util/qemu-thread-posix.c
> > index 73e3a0e..909c2ac 100644
> > --- a/util/qemu-thread-posix.c
> > +++ b/util/qemu-thread-posix.c
> > @@ -14,6 +14,7 @@
> >  #include "qemu/thread.h"
> >  #include "qemu/atomic.h"
> >  #include "qemu/notify.h"
> > +#include "trace.h"
> >  
> >  static bool name_threads;
> >  
> > @@ -60,6 +61,8 @@ void qemu_mutex_lock(QemuMutex *mutex)
> >  err = pthread_mutex_lock(&mutex->lock);
> >  if (err)
> >  error_exit(err, __func__);
> > +
> > +trace_qemu_mutex_lock((void *)&mutex->lock);
> 
> You don't need these casts as the parameter type is void * which accepts any
> pointers.

OK
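
For reference, a small sketch of why the cast is redundant: in C, any object
pointer converts to void * implicitly. ExampleMutex, example_trace_lock() and
example_caller() are made-up names for illustration, not QEMU's:

```c
/* Made-up stand-in for QemuMutex; only here to have a non-void pointer. */
typedef struct {
    int lock;
} ExampleMutex;

/* Made-up stand-in for a generated trace helper taking 'void *'. */
static void *example_trace_lock(void *lock)
{
    return lock; /* a real trace helper would log the pointer with %p */
}

/* No cast needed: 'int *' converts to 'void *' implicitly in C. */
static void *example_caller(ExampleMutex *mutex)
{
    return example_trace_lock(&mutex->lock);
}
```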

> 
> >  }
> >  
> >  int qemu_mutex_trylock(QemuMutex *mutex)
> > @@ -74,6 +77,8 @@ void qemu_mutex_unlock(QemuMutex *mutex)
> >  err = pthread_mutex_unlock(&mutex->lock);
> >  if (err)
> >  error_exit(err, __func__);
> > +
> > +trace_qemu_mutex_unlock((void *)&mutex->lock);
> >  }
> >  
> >  void qemu_rec_mutex_init(QemuRecMutex *mutex)
> > diff --git a/util/trace-events b/util/trace-events
> > index b44ef4f..65c33fe 100644
> > --- a/util/trace-events
> > +++ b/util/trace-events
> > @@ -55,3 +55,7 @@ lockcnt_futex_wait_prepare(const void *lockcnt, int 
> > expected, int new) "lockcnt
> >  lockcnt_futex_wait(const void *lockcnt, int val) "lockcnt %p waiting on %d"
> >  lockcnt_futex_wait_resume(const void *lockcnt, int new) "lockcnt %p after 
> > wait: %d"
> >  lockcnt_futex_wake(const void *lockcnt) "lockcnt %p waking up one waiter"
> > +
> > +# util/qemu-thread-posix.c
> > +qemu_mutex_lock(void *qemu_global_mutex) "locked mutex %p"
> > +qemu_mutex_unlock(void *qemu_global_mutex) "unlocked mutex %p"
> 
> Parameter name is slightly misleading, maybe s/qemu_global_mutex/lock/ for 
> both
> lines?

Great! I'll change it and send a v2.

Thanks for your review!

> 
> Fam
> 




Re: [Qemu-devel] [PULL 0/4] hmp queue

2017-04-24 Thread Dr. David Alan Gilbert
* Peter Maydell (peter.mayd...@linaro.org) wrote:
> On 24 April 2017 at 16:32, Dr. David Alan Gilbert (git)
> <dgilb...@redhat.com> wrote:
> > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> >
> > The following changes since commit 4c55b1d0bad8a703f0499fe62e3761a0cd288da3:
> >
> >   Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-04-24' 
> > into staging (2017-04-24 14:49:48 +0100)
> >
> > are available in the git repository at:
> >
> >   git://github.com/dagrh/qemu.git tags/pull-hmp-20170424
> >
> > for you to fetch changes up to e4e3992e626c4cc7514b271807c90f587771c646:
> >
> >   tests: Add a tester for HMP commands (2017-04-24 15:55:35 +0100)
> >
> > 
> > HMP pull
> >
> > 
> 
> 
> clang doesn't like some code in test-hmp.c:
> 
> /home/petmay01/linaro/qemu-for-merges/tests/test-hmp.c:138:9: error:
> logical not is only applied to the left hand side of this comparison
> [-Werror,-Wlogical-not-parentheses]
> if (!strcmp("isapc", mname) == 0 ||  !strcmp("puv3", mname)
> ^   ~~



> It does look rather odd:
> 
> +/* Ignore blacklisted machines that have known problems */
> +if (!strcmp("isapc", mname) == 0 ||  !strcmp("puv3", mname)
> +|| !strcmp("tricore_testboard", mname)
> +|| !strcmp("xenfv", mname) == 0 || !strcmp("xenpv", mname)) {
> +return;
> +}
> 
> since it's not using the same kind of expression to test
> each board name -- is that deliberate, or accidental ?
> 
> I think this expression means we'll actually skip every machine...

Yep, you're right, just tried it with logging.

That's accidental; hmm I should add a clang build somewhere.

Thomas: Do you want to send me a fixed version?

Dave

> thanks
> -- PMM
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v5 13/18] qcow2: add support for LUKS encryption format

2017-04-24 Thread Daniel P. Berrange
On Tue, Feb 21, 2017 at 03:13:03PM +0100, Alberto Garcia wrote:
> On Tue 21 Feb 2017 12:55:07 PM CET, Daniel P. Berrange wrote:
> >  static int qcow2_set_up_encryption(BlockDriverState *bs, QemuOpts *opts,
> > -   Error **errp)
> > +   const char *fmtstr, Error **errp)
> >  {
> >  BDRVQcow2State *s = bs->opaque;
> >  QCryptoBlockCreateOptions *cryptoopts = NULL;
> >  QCryptoBlock *crypto = NULL;
> >  int ret = -EINVAL;
> > +int fmt = qcow2_crypt_method_from_format(fmtstr);
> >  
> > -cryptoopts = block_crypto_create_opts_init(
> > -Q_CRYPTO_BLOCK_FORMAT_QCOW, opts, "aes-", errp);
> > +switch (fmt) {
> > +case QCOW_CRYPT_LUKS:
> > +cryptoopts = block_crypto_create_opts_init(
> > +Q_CRYPTO_BLOCK_FORMAT_LUKS, opts, "luks-", errp);
> > +break;
> > +case QCOW_CRYPT_AES:
> > +cryptoopts = block_crypto_create_opts_init(
> > +Q_CRYPTO_BLOCK_FORMAT_QCOW, opts, "aes-", errp);
> > +break;
> > +default:
> > +error_setg(errp, "Unknown encryption format %s", fmtstr);
> > +break;
> > +}
> > +s->crypt_method_header = fmt;
> >  if (!cryptoopts) {
> >  ret = -EINVAL;
> >  goto out;
> >  }
> 
> I don't believe it has any practical effect in this case, but I think
> it's cleaner to set s->crypt_method_header = fmt after checking that
> everything went well. Otherwise an incorrect format string will result
> in s->crypt_method_header = -EINVAL, which doesn't make sense (it's not
> even the correct type: the variable is unsigned).

Yes, makes sense.
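
A rough sketch of the ordering suggested above: validate the format first and
only then store it in the unsigned header field. All names, enum values and
the "aes"/"luks" format strings here are assumptions for illustration, not
the qcow2 code itself:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical values; the real QCOW_CRYPT_* constants live in qcow2. */
enum { EXAMPLE_CRYPT_NONE = 0, EXAMPLE_CRYPT_AES = 1, EXAMPLE_CRYPT_LUKS = 2 };

typedef struct {
    uint32_t crypt_method_header;   /* unsigned: must never hold an errno */
} ExampleState;

/* Map a format string to a method, or return -1 for an unknown format. */
static int example_method_from_format(const char *fmtstr)
{
    if (strcmp(fmtstr, "aes") == 0) {
        return EXAMPLE_CRYPT_AES;
    }
    if (strcmp(fmtstr, "luks") == 0) {
        return EXAMPLE_CRYPT_LUKS;
    }
    return -1;
}

/* Validate first, then commit the value to the state. */
static int example_set_up_encryption(ExampleState *s, const char *fmtstr)
{
    int fmt = example_method_from_format(fmtstr);

    if (fmt < 0) {
        return -1;                  /* state is left untouched on error */
    }
    s->crypt_method_header = (uint32_t)fmt;
    return 0;
}
```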

> Other than that, the patch looks good to me.
> 
> Reviewed-by: Alberto Garcia 


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] [PULL 0/4] hmp queue

2017-04-24 Thread Peter Maydell
On 24 April 2017 at 16:32, Dr. David Alan Gilbert (git)
<dgilb...@redhat.com> wrote:
> From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
>
> The following changes since commit 4c55b1d0bad8a703f0499fe62e3761a0cd288da3:
>
>   Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-04-24' 
> into staging (2017-04-24 14:49:48 +0100)
>
> are available in the git repository at:
>
>   git://github.com/dagrh/qemu.git tags/pull-hmp-20170424
>
> for you to fetch changes up to e4e3992e626c4cc7514b271807c90f587771c646:
>
>   tests: Add a tester for HMP commands (2017-04-24 15:55:35 +0100)
>
> 
> HMP pull
>
> 


clang doesn't like some code in test-hmp.c:

/home/petmay01/linaro/qemu-for-merges/tests/test-hmp.c:138:9: error:
logical not is only applied to the left hand side of this comparison
[-Werror,-Wlogical-not-parentheses]
if (!strcmp("isapc", mname) == 0 ||  !strcmp("puv3", mname)
^   ~~
/home/petmay01/linaro/qemu-for-merges/tests/test-hmp.c:138:9: note:
add parentheses after the '!' to evaluate the comparison first
if (!strcmp("isapc", mname) == 0 ||  !strcmp("puv3", mname)
^
 (  )
/home/petmay01/linaro/qemu-for-merges/tests/test-hmp.c:138:9: note:
add parentheses around left hand side expression to silence this
warning
if (!strcmp("isapc", mname) == 0 ||  !strcmp("puv3", mname)
^
(  )
/home/petmay01/linaro/qemu-for-merges/tests/test-hmp.c:140:12: error:
logical not is only applied to the left hand side of this comparison
[-Werror,-Wlogical-not-parentheses]
|| !strcmp("xenfv", mname) == 0 || !strcmp("xenpv", mname)) {
   ^   ~~
/home/petmay01/linaro/qemu-for-merges/tests/test-hmp.c:140:12: note:
add parentheses after the '!' to evaluate the comparison first
|| !strcmp("xenfv", mname) == 0 || !strcmp("xenpv", mname)) {
   ^
(  )
/home/petmay01/linaro/qemu-for-merges/tests/test-hmp.c:140:12: note:
add parentheses around left hand side expression to silence this
warning
|| !strcmp("xenfv", mname) == 0 || !strcmp("xenpv", mname)) {
   ^
   (  )


It does look rather odd:

+/* Ignore blacklisted machines that have known problems */
+if (!strcmp("isapc", mname) == 0 ||  !strcmp("puv3", mname)
+|| !strcmp("tricore_testboard", mname)
+|| !strcmp("xenfv", mname) == 0 || !strcmp("xenpv", mname)) {
+return;
+}

since it's not using the same kind of expression to test
each board name -- is that deliberate, or accidental ?

I think this expression means we'll actually skip every machine...
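
A small sketch of the difference, assuming the intent was to skip only the
listed machines. The helper names are made up; the second function writes out
the quoted condition with the parentheses the way the compiler parses it:

```c
#include <stdbool.h>
#include <string.h>

/* Intended test: true only for the machines named in the quoted list. */
static bool example_is_blacklisted(const char *mname)
{
    return !strcmp("isapc", mname) || !strcmp("puv3", mname)
        || !strcmp("tricore_testboard", mname)
        || !strcmp("xenfv", mname) || !strcmp("xenpv", mname);
}

/* The quoted condition: '!' binds to strcmp() before '== 0' is applied. */
static bool example_quoted_condition(const char *mname)
{
    return (!strcmp("isapc", mname)) == 0 || !strcmp("puv3", mname)
        || !strcmp("tricore_testboard", mname)
        || (!strcmp("xenfv", mname)) == 0 || !strcmp("xenpv", mname);
}
```

(!strcmp(a, b)) == 0 holds whenever the two names differ, and no machine name
can equal both "isapc" and "xenfv", so one of those two terms is always true.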

thanks
-- PMM



Re: [Qemu-devel] [PATCH v5 11/18] qcow2: convert QCow2 to use QCryptoBlock for encryption

2017-04-24 Thread Daniel P. Berrange
On Tue, Feb 21, 2017 at 02:30:10PM +0100, Alberto Garcia wrote:
> On Tue 21 Feb 2017 12:55:05 PM CET, Daniel P. Berrange wrote:
> > +switch (s->crypt_method_header) {
> > +case QCOW_CRYPT_NONE:
> > +break;
> > +
> > +case QCOW_CRYPT_AES:
> > +r->crypto_opts = block_crypto_open_opts_init(
> > +Q_CRYPTO_BLOCK_FORMAT_QCOW, opts, "aes-", errp);
> > +break;
> > +
> > +default:
> > +error_setg(errp, "Unsupported encryption method %d",
> > +   s->crypt_method_header);
> > +break;
> > +}
> > +if (s->crypt_method_header && !r->crypto_opts) {
> > +ret = -EINVAL;
> > +goto fail;
> > +}
> 
> This last condition relies on the assumption that QCOW_CRYPT_NONE == 0.
> 
> I think it's safe to assume that its value is never going to change and
> therefore this isn't too important, but I'm just pointing it out in case
> you want to make it explicit.

Yeah, I'll make it explicit to be kinder to future reviewers :-)

> 
> > @@ -1122,6 +1145,24 @@ static int qcow2_open(BlockDriverState *bs, QDict 
> > *options, int flags,
> >  goto fail;
> >  }
> >  
> > +if (s->crypt_method_header == QCOW_CRYPT_AES) {
> > +unsigned int cflags = 0;
> > +if (flags & BDRV_O_NO_IO) {
> > +cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
> > +}
> > +/* TODO how do we pass the same crypto opts down to the
> > + * backing file by default, so we don't have to manually
> > + * provide the same key-secret property against the full
> > + * backing chain
> > + */
> > +s->crypto = qcrypto_block_open(s->crypto_opts, NULL, NULL,
> > +   cflags, errp);
> > +if (!s->crypto) {
> > +ret = -EINVAL;
> > +goto fail;
> > +}
> > +}
> 
> Actually this has the same problem that I mentioned for patch 9: if
> qcow2_open() fails then s->crypto is leaked.

Yep, and the crypto_opts actually


Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



Re: [Qemu-devel] About QEMU BQL and dirty log switch in Migration

2017-04-24 Thread Dr. David Alan Gilbert
* Yang Hongyang (yanghongy...@huawei.com) wrote:
> 
> 
> On 2017/4/24 20:06, Juan Quintela wrote:
> > Yang Hongyang  wrote:
> >> Hi all,
> >>
> >> We found that the dirty log switch costs more than 13 seconds while
> >> migrating a 4T memory guest, and the dirty log switch is currently
> >> protected by the QEMU BQL. This causes the guest to freeze for a long time
> >> when switching the dirty log on, and the migration downtime is unacceptable.
> >> Is there any chance to optimize the time cost of the dirty log switch
> >> operation?
> >> Or to move the time-consuming operation out of the QEMU BQL?
> > 
> > Hi
> > 
> > Could you specify what you mean by dirty log switch?
> > The one inside kvm?
> > The merge between kvm one and migration bitmap?
> 
> The call of the following functions:
> memory_global_dirty_log_start/stop();

I suppose there's a few questions;
  a) Do we actually need the BQL - and if so why
  b) What actually takes 13s?  It's probably worth figuring
out where it goes,  the whole bitmap is only 1GB isn't it
even on a 4TB machine, and even the simplest way to fill
that takes way less than 13s.
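
A back-of-the-envelope sketch of that size, assuming 4 KiB pages and one bit
per page (both are assumptions, as is the helper name): a 4 TiB guest has
2^30 pages, so one such bitmap is 2^30 bits, i.e. 128 MiB.

```c
#include <stdint.h>

/* Bytes needed for a dirty bitmap with one bit per page (rounded up). */
static uint64_t example_dirty_bitmap_bytes(uint64_t ram_bytes,
                                           uint64_t page_size)
{
    uint64_t pages = (ram_bytes + page_size - 1) / page_size;

    return (pages + 7) / 8;
}
```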

Dave

> 
> > 
> > Thanks, Juan.
> > 
> 
> -- 
> Thanks,
> Yang
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] error: qcrypto_random_bytes() tried to read from /dev/[u]random, even on windows

2017-04-24 Thread Daniel P. Berrange
On Mon, Apr 24, 2017 at 06:33:26PM +0200, Geert Martin Ijewski wrote:
> > We can have the existing qcrypto_init() call a qcrypto_random_init()
> > method to do the one-time initialization task, since that's already
> > required to run early in order to initialize gnutls when we use it.
> 
> Wouldn't it make sense to also move the unix initialization to that function?

Yep, we can do that.

> And what about deinitialization? Though because that handle needs to stay
> valid throughout the whole lifetime of QEMU anyway, that probably can be
> ignored and taken care of by the OS

Correct, there's no need for any de-init if we have a global handle, as it
is needed until QEMU exits.

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|


