[Qemu-devel] [Bug 1713825] Re: Booting Windows 2016 with qxl video crashes qemu

2017-11-14 Thread Gerd Hoffmann
Guest triggerable assert() isn't exactly nice indeed.
But it's not a show stopper.
It doesn't allow exploiting the host, the guest can only DoS itself.
And you must be privileged in the guest to do so.

Most likely this is the driver placing the qxl commands in the wrong pci
bar.  See commit 86dbcdd9c7590d06db89ca256c5eaf0b4aba8858.  Seems the
impact is more than breaking live migration.  So, I can raise an error
irq and have qxl enter guest bug mode.  That doesn't improve the
situation much though, the guest will continue running but you will have
a broken display ...
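
For illustration only, here is a minimal sketch of checking the offset and
flagging a guest bug instead of asserting. This is not the qxl.c code: the
FakeQxl type and fake_ram_set_dirty() are made-up names, and only the
condition "offset < vram_size" is taken from the backtrace below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the device state; not the real PCIQXLDevice. */
typedef struct FakeQxl {
    uint8_t *vram_base;   /* start of the guest-visible video RAM */
    uint64_t vram_size;   /* size of that region in bytes */
    bool guest_bug;       /* set once the guest handed us a bad pointer */
} FakeQxl;

/*
 * Validate a guest-supplied pointer instead of asserting on it.
 * Returns true if ptr lies inside vram; otherwise flags a guest bug
 * and returns false so the caller can skip the dirty-tracking update.
 */
bool fake_ram_set_dirty(FakeQxl *qxl, const uint8_t *ptr)
{
    uintptr_t base = (uintptr_t)qxl->vram_base;
    uintptr_t addr = (uintptr_t)ptr;

    if (addr < base || addr - base >= qxl->vram_size) {
        qxl->guest_bug = true;
        return false;
    }
    /* A real device would mark the page containing ptr dirty here. */
    return true;
}
```

As noted above, this only trades the abort for a guest with a broken
display; it does not make the misbehaving driver work.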

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1713825

Title:
  Booting Windows 2016 with qxl video crashes qemu

Status in QEMU:
  New

Bug description:
  launched from libvirt.

  qemu version: 2.9.0
  host: Linux  4.9.34-gentoo #1 SMP Sat Jul 29 13:28:43 PDT 2017 
x86_64 Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz GenuineIntel GNU/Linux
  guest: Windows 2016 64 bit

  Thread 28 (Thread 0x7f0e2edff700 (LWP 29860)):
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  set = {__val = {18446744067266837079, 139698892694944, 
139699853745096, 139700858749789, 4222451712, 139694281220640, 139694281220741, 
139694281220640, 139694281220640, 139694281220810, 
  139694281220940, 139694281220640, 139694281220940, 0, 0, 0}}
  pid = 
  tid = 
  #1  0x7f0ea40b644a in __GI_abort () at abort.c:89
  save_stage = 2
  act = {__sigaction_handler = {sa_handler = 0x7f0e2edfe5c0, 
sa_sigaction = 0x7f0e2edfe5c0}, sa_mask = {__val = {139694281219872, 
139698106269697, 139698892695344, 4, 2676511744, 0, 139698892695144, 0, 
139698892694912, 1, 4737316546111099904, 139700859888720, 
4737316546111099904, 139700862161824, 139700911349760, 94211934977482}}, 
sa_flags = 416, 
sa_restorer = 0x55af6ceb0500 <__PRETTY_FUNCTION__.36381>}
  sigs = {__val = {32, 0 }}
  #2  0x7f0ea40abab6 in __assert_fail_base (fmt=, 
assertion=assertion@entry=0x55af6ceafdca "offset < qxl->vga.vram_size", 
  file=file@entry=0x55af6ceaeaa0 
"/var/tmp/portage/app-emulation/qemu-2.9.0-r2/work/qemu-2.9.0/hw/display/qxl.c",
 line=line@entry=416, 
  function=function@entry=0x55af6ceb0500 <__PRETTY_FUNCTION__.36381> 
"qxl_ram_set_dirty") at assert.c:92
  str = 0x7f0d1c026220 "\340r\002\034\r\177"
  total = 4096
  #3  0x7f0ea40abb81 in __GI___assert_fail 
(assertion=assertion@entry=0x55af6ceafdca "offset < qxl->vga.vram_size", 
  file=file@entry=0x55af6ceaeaa0 
"/var/tmp/portage/app-emulation/qemu-2.9.0-r2/work/qemu-2.9.0/hw/display/qxl.c",
 line=line@entry=416, 
  function=function@entry=0x55af6ceb0500 <__PRETTY_FUNCTION__.36381> 
"qxl_ram_set_dirty") at assert.c:101
  No locals.
  #4  0x55af6cc58805 in qxl_ram_set_dirty (qxl=, 
ptr=) at 
/var/tmp/portage/app-emulation/qemu-2.9.0-r2/work/qemu-2.9.0/hw/display/qxl.c:416
  base = 
  offset = 
  qxl = 
  ptr = 
  base = 
  offset = 
  #5  0x55af6cc5b9e2 in interface_release_resource (sin=0x55af71a91ed0, 
ext=...) at 
/var/tmp/portage/app-emulation/qemu-2.9.0-r2/work/qemu-2.9.0/hw/display/qxl.c:767
  qxl = 0x55af71a91450
  ring = 
  item = 
  id = 18446690739814400920
  __func__ = "interface_release_resource"
  #6  0x7f0ea510afa8 in red_drawable_unref (red_drawable=0x7f0d1c026120) at 
red-worker.c:101
  No locals.
  #7  0x7f0ea510b609 in red_drawable_unref (red_drawable=) 
at red-worker.c:104
  No locals.
  #8  0x7f0ea510eae9 in drawable_unref 
(drawable=drawable@entry=0x7f0e68285ac0) at display-channel.c:1438
  display = 0x55af71dbd3c0
  __FUNCTION__ = "drawable_unref"
  #9  0x7f0ea51109f7 in draw_until (display=display@entry=0x55af71dbd3c0, 
surface=surface@entry=0x7f0e6828aae8, last=0x7f0e68285ac0) at 
display-channel.c:1637
  container = 0x0
  now = 0x7f0e68285ac0
  #10 0x7f0ea510f93f in display_channel_draw (display=0x55af71dbd3c0, 
area=0x7f0e2edfe8e0, surface_id=) at display-channel.c:1729
  surface = 0x7f0e6828aae8
  last = 
  __FUNCTION__ = "display_channel_draw"
  __func__ = "display_channel_draw"




Re: [Qemu-devel] [PATCH v1 0/2] intel-iommu: Extend address width to 48 bits

2017-11-14 Thread Peter Xu
On Tue, Nov 14, 2017 at 06:13:48PM -0500, prasad.singamse...@oracle.com wrote:
> From: Prasad Singamsetty 
> 
> This pair of patches extends intel-iommu to support an address
> width of 48 bits. This is required to support QEMU guests with large
> memory (>=1TB).
> 
> Patch1 redefines macros and their usage to
> allow further changes that add support for a 48-bit address width.
> This patch doesn't change the existing functionality or behavior.
> 
> Patch2 adds support for a 48-bit address width while keeping the
> default at 39 bits.
> 
> NOTE: Peter Xu had originally started on this enhancement
> but it was not completed or integrated.
> 
> Unit testing done:
> 
> patch-1:
>* Boot vm with and without intel-iommu enabled
>* Boot vm with #cpus below and above 255 cpus
> patch-2:
>* boot vm without "x-aw-bits" or "x-aw-bits=39": guest boots with 39
>* boot vm with "x-aw-bits=48": guest boots with 48 bits
>* boot vm with invalid value for x-aw-bits: guest fails to boot
>* boot vm with >=1TB memory and "x-aw-bits=48": guest boots
> 
> Prasad Singamsetty (2):
>   intel-iommu: Redefine macros to enable supporting 48 bit address width
>   intel-iommu: Extend address width to 48 bits

Looks quite good to me!

Reviewed-by: Peter Xu 
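
As a rough illustration of why 39 bits is not enough for >=1TB guests, the
arithmetic can be sketched as below. The helper names are hypothetical and
not part of intel-iommu; the sketch also ignores holes in the guest physical
memory map, so treat it as back-of-the-envelope only.

```c
#include <stdint.h>

/* Highest address (inclusive) reachable with the given address width. */
uint64_t aw_max_address(unsigned aw_bits)
{
    return (aw_bits >= 64) ? UINT64_MAX : ((UINT64_C(1) << aw_bits) - 1);
}

/* Nonzero if mem_bytes of contiguous memory starting at 0 fits the width. */
int aw_covers(unsigned aw_bits, uint64_t mem_bytes)
{
    return mem_bytes == 0 || mem_bytes - 1 <= aw_max_address(aw_bits);
}
```

With these definitions, 39 bits covers up to 512GB (2^39 bytes), so a 1TB
guest (2^40 bytes) needs the 48-bit setting, which covers up to 256TB.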

-- 
Peter Xu



[Qemu-devel] [PATCH v2 1/3] ivshmem: Don't update non-existent MSI routes

2017-11-14 Thread Ladi Prosek
As of commit 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications"),
QEMU crashes with:

  kvm_irqchip_commit_routes: Assertion `ret == 0' failed.

if the ivshmem device is configured with more vectors than the server
supports. This is caused by ivshmem_vector_unmask() being called on
vectors that have not been initialized by ivshmem_add_kvm_msi_virq().

This commit fixes it by adding a simple check to the mask and unmask
callbacks.

Note that the opposite mismatch, if the server supplies more vectors than
what the device is configured for, is already handled and leads to output
like:

  Too many eventfd received, device has 1 vectors

To reproduce the assert, run:

  ivshmem-server -n 0

and QEMU with:

  -device ivshmem-doorbell,chardev=iv
  -chardev socket,path=/tmp/ivshmem_socket,id=iv

then load the Windows driver, at the time of writing available at:

https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

The issue is believed to have been masked by other guest drivers, notably
Linux ones, not enabling MSI-X on the device.

Fixes: 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek 
Reviewed-by: Marc-André Lureau 
Reviewed-by: Markus Armbruster 
---
 hw/misc/ivshmem.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index a5a46827fe..6e46669744 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -317,6 +317,10 @@ static int ivshmem_vector_unmask(PCIDevice *dev, unsigned 
vector,
 int ret;
 
 IVSHMEM_DPRINTF("vector unmask %p %d\n", dev, vector);
+if (!v->pdev) {
+error_report("ivshmem: vector %d route does not exist", vector);
+return -EINVAL;
+}
 
 ret = kvm_irqchip_update_msi_route(kvm_state, v->virq, msg, dev);
 if (ret < 0) {
@@ -331,12 +335,16 @@ static void ivshmem_vector_mask(PCIDevice *dev, unsigned 
vector)
 {
 IVShmemState *s = IVSHMEM_COMMON(dev);
 EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
+MSIVector *v = &s->msi_vectors[vector];
 int ret;
 
 IVSHMEM_DPRINTF("vector mask %p %d\n", dev, vector);
+if (!v->pdev) {
+error_report("ivshmem: vector %d route does not exist", vector);
+return;
+}
 
-ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n,
-s->msi_vectors[vector].virq);
+ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n, v->virq);
 if (ret != 0) {
 error_report("remove_irqfd_notifier_gsi failed");
 }
-- 
2.13.5




[Qemu-devel] [PATCH v2 3/3] ivshmem: Improve MSI irqfd error handling

2017-11-14 Thread Ladi Prosek
Adds a rollback path to ivshmem_enable_irqfd() and fixes
ivshmem_disable_irqfd() to bail if irqfd has not been enabled.
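
The rollback idiom used here can be sketched in isolation as follows. This
is a simplified stand-alone model, not the ivshmem code: Vec, vec_setup(),
vec_teardown() and the fail_at parameter are invented for the example.

```c
#include <stdbool.h>

#define NVEC 4

/* Hypothetical per-vector state; 'ready' plays the role of pdev != NULL. */
typedef struct Vec {
    bool ready;
} Vec;

/* Pretend setup step; fails for the index given in fail_at. */
bool vec_setup(Vec *v, int i, int fail_at)
{
    if (i == fail_at) {
        return false;
    }
    v[i].ready = true;
    return true;
}

void vec_teardown(Vec *v, int i)
{
    v[i].ready = false;
}

/* Set up all vectors, or none: on failure, undo the ones already done. */
bool vec_enable_all(Vec *v, int fail_at)
{
    int i;

    for (i = 0; i < NVEC; i++) {
        if (!vec_setup(v, i, fail_at)) {
            goto undo;
        }
    }
    return true;

undo:
    /* i is the index that failed; walk back over 0..i-1 only. */
    while (--i >= 0) {
        vec_teardown(v, i);
    }
    return false;
}
```

The point of the pattern is that a failure half-way through leaves no vector
set up, so a later enable attempt starts from a clean state.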

To reproduce, run:

  ivshmem-server -n 0

and QEMU with:

  -device ivshmem-doorbell,chardev=iv
  -chardev socket,path=/tmp/ivshmem_socket,id=iv

then load, unload, and load again the Windows driver, at the time of writing
available at:

https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

The issue is believed to have been masked by other guest drivers, notably
Linux ones, not enabling MSI-X on the device.

Signed-off-by: Ladi Prosek 
Reviewed-by: Markus Armbruster 
---
 hw/misc/ivshmem.c | 37 -
 1 file changed, 24 insertions(+), 13 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 91364d8364..d1bb246d12 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -786,6 +786,20 @@ static int ivshmem_setup_interrupts(IVShmemState *s, Error 
**errp)
 return 0;
 }
 
+static void ivshmem_remove_kvm_msi_virq(IVShmemState *s, int vector)
+{
+IVSHMEM_DPRINTF("ivshmem_remove_kvm_msi_virq vector:%d\n", vector);
+
+if (s->msi_vectors[vector].pdev == NULL) {
+return;
+}
+
+/* it was cleaned when masked in the frontend. */
+kvm_irqchip_release_virq(kvm_state, s->msi_vectors[vector].virq);
+
+s->msi_vectors[vector].pdev = NULL;
+}
+
 static void ivshmem_enable_irqfd(IVShmemState *s)
 {
 PCIDevice *pdev = PCI_DEVICE(s);
@@ -797,7 +811,7 @@ static void ivshmem_enable_irqfd(IVShmemState *s)
 ivshmem_add_kvm_msi_virq(s, i, &err);
 if (err) {
 error_report_err(err);
-/* TODO do we need to handle the error? */
+goto undo;
 }
 }
 
@@ -806,21 +820,14 @@ static void ivshmem_enable_irqfd(IVShmemState *s)
   ivshmem_vector_mask,
   ivshmem_vector_poll)) {
 error_report("ivshmem: msix_set_vector_notifiers failed");
+goto undo;
 }
-}
+return;
 
-static void ivshmem_remove_kvm_msi_virq(IVShmemState *s, int vector)
-{
-IVSHMEM_DPRINTF("ivshmem_remove_kvm_msi_virq vector:%d\n", vector);
-
-if (s->msi_vectors[vector].pdev == NULL) {
-return;
+undo:
+while (--i >= 0) {
+ivshmem_remove_kvm_msi_virq(s, i);
 }
-
-/* it was cleaned when masked in the frontend. */
-kvm_irqchip_release_virq(kvm_state, s->msi_vectors[vector].virq);
-
-s->msi_vectors[vector].pdev = NULL;
 }
 
 static void ivshmem_disable_irqfd(IVShmemState *s)
@@ -828,6 +835,10 @@ static void ivshmem_disable_irqfd(IVShmemState *s)
 PCIDevice *pdev = PCI_DEVICE(s);
 int i;
 
+if (!pdev->msix_vector_use_notifier) {
+return;
+}
+
 msix_unset_vector_notifiers(pdev);
 
 for (i = 0; i < s->peers[s->vm_id].nb_eventfds; i++) {
-- 
2.13.5




[Qemu-devel] [PATCH v2 0/3] ivshmem: MSI bug fixes

2017-11-14 Thread Ladi Prosek
Fixes bugs in the ivshmem device implementation uncovered with the new
Windows ivshmem driver:
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem

v1->v2:
* Patch 1 - added reproducer info to commit message (Markus)
* Patch 2 - restructured conditionals, fixed comment formatting (Markus)
* Patch 3 - added reproducer info to commit message (Markus)

Ladi Prosek (3):
  ivshmem: Don't update non-existent MSI routes
  ivshmem: Always remove irqfd notifiers
  ivshmem: Improve MSI irqfd error handling

 hw/misc/ivshmem.c | 77 +--
 1 file changed, 58 insertions(+), 19 deletions(-)

-- 
2.13.5




[Qemu-devel] [PATCH v2 2/3] ivshmem: Always remove irqfd notifiers

2017-11-14 Thread Ladi Prosek
As of commit 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications"),
QEMU crashes with:

ivshmem: msix_set_vector_notifiers failed
msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && 
dev->msix_vector_release_notifier' failed.

if MSI-X is repeatedly enabled and disabled on the ivshmem device, for example
by loading and unloading the Windows ivshmem driver. This is because
msix_unset_vector_notifiers() doesn't call any of the release notifier callbacks
since MSI-X is already disabled at that point (msix_enabled() returning false
is how this transition is detected in the first place). Thus 
ivshmem_vector_mask()
doesn't run and when MSI-X is subsequently enabled again ivshmem_vector_unmask()
fails.

This is fixed by keeping track of unmasked vectors and making sure that
ivshmem_vector_mask() always runs on MSI-X disable.
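
A stripped-down model of the bookkeeping, for illustration only: VecState,
notifier_refs and the vec_* helpers are hypothetical names, and the counter
merely stands in for the registered irqfd notifier.

```c
#include <stdbool.h>

#define NVEC 4

/* Hypothetical per-vector bookkeeping, mirroring the 'unmasked' flag. */
typedef struct VecState {
    bool unmasked;      /* true between a successful unmask and its mask */
    int notifier_refs;  /* stands in for the registered irqfd notifier */
} VecState;

void vec_unmask(VecState *v)
{
    v->notifier_refs++;
    v->unmasked = true;
}

void vec_mask(VecState *v)
{
    v->notifier_refs--;
    v->unmasked = false;
}

/* On disable, mask whatever is still unmasked so the counts stay balanced. */
void vec_disable_all(VecState *v, int n)
{
    for (int i = 0; i < n; i++) {
        if (v[i].unmasked) {
            vec_mask(&v[i]);
        }
    }
}
```

After vec_disable_all() every vector is masked and holds no notifier, so the
next enable/unmask cycle starts balanced.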

Fixes: 660c97eef6f8 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek 
Reviewed-by: Markus Armbruster 
---
 hw/misc/ivshmem.c | 32 ++--
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 6e46669744..91364d8364 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -77,6 +77,7 @@ typedef struct Peer {
 typedef struct MSIVector {
 PCIDevice *pdev;
 int virq;
+bool unmasked;
 } MSIVector;
 
 typedef struct IVShmemState {
@@ -321,6 +322,7 @@ static int ivshmem_vector_unmask(PCIDevice *dev, unsigned 
vector,
 error_report("ivshmem: vector %d route does not exist", vector);
 return -EINVAL;
 }
+assert(!v->unmasked);
 
 ret = kvm_irqchip_update_msi_route(kvm_state, v->virq, msg, dev);
 if (ret < 0) {
@@ -328,7 +330,13 @@ static int ivshmem_vector_unmask(PCIDevice *dev, unsigned 
vector,
 }
 kvm_irqchip_commit_routes(kvm_state);
 
-return kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, n, NULL, v->virq);
+ret = kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, n, NULL, v->virq);
+if (ret < 0) {
+return ret;
+}
+v->unmasked = true;
+
+return 0;
 }
 
 static void ivshmem_vector_mask(PCIDevice *dev, unsigned vector)
@@ -343,11 +351,14 @@ static void ivshmem_vector_mask(PCIDevice *dev, unsigned 
vector)
 error_report("ivshmem: vector %d route does not exist", vector);
 return;
 }
+assert(v->unmasked);
 
 ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n, v->virq);
-if (ret != 0) {
+if (ret < 0) {
 error_report("remove_irqfd_notifier_gsi failed");
+return;
 }
+v->unmasked = false;
 }
 
 static void ivshmem_vector_poll(PCIDevice *dev,
@@ -817,11 +828,20 @@ static void ivshmem_disable_irqfd(IVShmemState *s)
 PCIDevice *pdev = PCI_DEVICE(s);
 int i;
 
-for (i = 0; i < s->peers[s->vm_id].nb_eventfds; i++) {
-ivshmem_remove_kvm_msi_virq(s, i);
-}
-
 msix_unset_vector_notifiers(pdev);
+
+for (i = 0; i < s->peers[s->vm_id].nb_eventfds; i++) {
+/*
+ * MSI-X is already disabled here so msix_unset_vector_notifiers()
+ * didn't call our release notifier.  Do it now to keep our masks and
+ * unmasks balanced.
+ */
+if (s->msi_vectors[i].unmasked) {
+ivshmem_vector_mask(pdev, i);
+}
+ivshmem_remove_kvm_msi_virq(s, i);
+}
+
 }
 
 static void ivshmem_write_config(PCIDevice *pdev, uint32_t address,
-- 
2.13.5




Re: [Qemu-devel] [RESEND PATCH 2/6] memory: introduce AddressSpaceOps and IOMMUObject

2017-11-14 Thread Peter Xu
On Tue, Nov 14, 2017 at 10:52:54PM +0100, Auger Eric wrote:

[...]

> I meant, in the current intel_iommu code, vtd_find_add_as() creates 1
> IOMMU MR and 1 AS per PCIe device, right?

I think this is the trickiest point - in QEMU, IOMMU MRs do not really
have a 1:1 relationship to devices.  For Intel, it's true; for Power, it's
not.  On Power guests, one device's DMA address space can be split
into different translation windows, where each window corresponds to
one IOMMU MR.

So IMHO the real 1:1 mapping is between the device and its DMA address
space, rather than MRs.

It's been a long time since I drafted the patches.  I think at
least there should be a more general notifier mechanism compared to the
current IOMMUNotifier, which was bound to IOTLB notifications only.
AFAICT if we want to trap first-level translation changes, the current
notifier is not even close to that interface - just see the definition
of IOMMUTLBEntry: it is tailored only for MAP/UNMAP of translation
addresses, not anything else.  And IMHO that's why it's tightly bound
to MemoryRegions, and that's the root problem.  The dynamic IOMMU MR
switching problem is related to this issue as well.

I am not sure the current "get IOMMU object from address space" solution
would be best; maybe it's too big a scope.  I think it depends on
whether in the future we'll have some requirement in such a bigger
scope (say, something we want to trap from the vIOMMU and deliver to the
host IOMMU which may not even be device-related?  I don't know).
Another alternative I am thinking of is whether we can provide a
per-device notifier; it could then be bound to PCIDevice rather than
MemoryRegions, and it would be in device scope.

Thanks,

-- 
Peter Xu



[Qemu-devel] [PATCH] vhost: Cancel migration when vhost-user process restarted during migration

2017-11-14 Thread fangying
From: Ying Fang

QEMU aborts if the vhost-user process is restarted during migration, at the
point where vhost_log_global_start/stop is called. The reason is that
vhost_dev_set_log returns -1 because the network connection is temporarily lost.
To handle this situation, let's cancel the migration and report it to the user.

Signed-off-by: Ying Fang 
---
 hw/virtio/vhost.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index ddc42f0..f409b06 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -27,6 +27,7 @@
 #include "hw/virtio/virtio-access.h"
 #include "migration/blocker.h"
 #include "sysemu/dma.h"
+#include "qmp-commands.h"
 
 /* enabled until disconnected backend stabilizes */
 #define _VHOST_DEBUG 1
@@ -882,20 +883,24 @@ static int vhost_migration_log(MemoryListener *listener, 
int enable)
 static void vhost_log_global_start(MemoryListener *listener)
 {
 int r;
+Error *errp = NULL;
 
 r = vhost_migration_log(listener, true);
 if (r < 0) {
-abort();
+error_setg(errp, "Failed to start vhost migration log");
+qmp_migrate_cancel();
 }
 }
 
 static void vhost_log_global_stop(MemoryListener *listener)
 {
 int r;
+Error *errp = NULL;
 
 r = vhost_migration_log(listener, false);
 if (r < 0) {
-abort();
+error_setg(errp, "Failed to stop vhost migration log");
+qmp_migrate_cancel();
 }
 }
 
-- 
2.10.0.windows.1





Re: [Qemu-devel] [PATCH v2 for-2.11] hw/net/vmxnet3: Fix code to work on big endian hosts, too

2017-11-14 Thread Thomas Huth
On 15.11.2017 00:33, David Gibson wrote:
> On Tue, 14 Nov 2017 12:20:24 +0100
> Thomas Huth  wrote:
> 
>> Since commit ab06ec43577177a442e8 we test the vmxnet3 device in the
>> pxe-tester, too (when running "make check SPEED=slow"). This now
>> revealed that the code is not working there if the host is a big
>> endian machine (for example ppc64 or s390x) - "make check SPEED=slow"
>> is now failing on such hosts.
>>
>> The vmxnet3 code lacks endianness conversions in a couple of places.
>> Interestingly, the bitfields in the structs in vmxnet3.h already tried to
>> take care of the *bit* endianness of the C compilers - but the code missed
>> to change the *byte* endianness when reading or writing the corresponding
>> structs. So the bitfields are now wrapped into unions which allow to change
>> the byte endianness during runtime with the non-bitfield member of the union.
>> With these changes, "make check SPEED=slow" now properly works on big endian
>> hosts, too.
>>
>> Reported-by: David Gibson 
>> Signed-off-by: Thomas Huth 
>> ---
>>  v2:
>>  - Introduced vmxnet3_ring_read_curr_txdesc() & vmxnet3_pci_dma_write_rxcd()
>>helper functions to wrap the byte-swapping code that is required in
>>multiple places (as suggested by Philippe)
[...]
>> +static inline void
>> +vmxnet3_ring_read_curr_txdesc(PCIDevice *pcidev, Vmxnet3Ring *ring,
>> +  struct Vmxnet3_TxDesc *txd)
>> +{
>> +vmxnet3_ring_read_curr_cell(pcidev, ring, txd);
>> +txd->addr = le64_to_cpu(txd->addr);
>> +txd->val1 = le32_to_cpu(txd->val1);
>> +txd->val2 = le32_to_cpu(txd->val2);
> 
> Urgh.. and I dislike swabbing struct fields in place even more :(.
> Having to know the endianness of a structure field is bad enough,
> having to know what endianness it's in *right now* is much worse.
> 
> But possibly you're stuck with it, given the existing code?
> 
> Looks like this is a structure determined by the hardware, in which
> case I think it should be left in hardware endian, and only swabbed
> when you go to actually look at the fields within.
[...]
>> diff --git a/hw/net/vmxnet3.h b/hw/net/vmxnet3.h
>> index f9352c4..5b3b76b 100644
>> --- a/hw/net/vmxnet3.h
>> +++ b/hw/net/vmxnet3.h
>> @@ -226,39 +226,49 @@ enum {
>>  struct Vmxnet3_TxDesc {
>>  __le64 addr;
>>  
>> +union {
>> +struct {
>>  #ifdef __BIG_ENDIAN_BITFIELD
>> -u32 msscof:14;  /* MSS, checksum offset, flags */
>> -u32 ext1:1;
>> -u32 dtype:1;/* descriptor type */
>> -u32 rsvd:1;
>> -u32 gen:1;  /* generation bit */
>> -u32 len:14;
>> +u32 msscof:14;  /* MSS, checksum offset, flags */
>> +u32 ext1:1;
>> +u32 dtype:1;/* descriptor type */
>> +u32 rsvd:1;
>> +u32 gen:1;  /* generation bit */
>> +u32 len:14;
> 
> Blech.  This is one reason I think bitfields are a terrible idea for
> mirroring hardware (or on disk) data.  Still, I guess changing that's
> kind of out of scope for a quick fix.

David, I agree with all of your comments - I had similar thoughts when I
was working with the code. However, as you already indicated, to address
all of this, we would need to rewrite most parts of this device. That's
out of my scope here - I wanted to keep the changes as minimal as
possible, so that there is a chance that we can get the endianness
problem still fixed for 2.11. So do you think that the patch is OK for
this? ... Otherwise, I think I'd rather send a patch that removes
vmxnet3 from the pxe-tester again - then it won't be tested anymore and
we won't get any more endianness test failures.
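
For reference, the "leave it in hardware endian and convert on access"
approach suggested above could look roughly like the sketch below. It is not
the vmxnet3 code: RawTxDesc, rd_le32() and the accessors are hypothetical,
and the bit positions (len in the low 14 bits of the second word, gen at
bit 14) are inferred from the bitfield order quoted above, so treat them as
an assumption.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from raw bytes, on any host. */
uint32_t rd_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical raw descriptor: kept exactly as the guest wrote it. */
typedef struct RawTxDesc {
    uint8_t bytes[16];  /* addr (8) + val1 (4) + val2 (4), little endian */
} RawTxDesc;

/* Accessors convert at the point of use; the struct is never swapped. */
uint32_t txd_len(const RawTxDesc *d)
{
    return rd_le32(d->bytes + 8) & 0x3fff;        /* low 14 bits of val1 */
}

uint32_t txd_gen(const RawTxDesc *d)
{
    return (rd_le32(d->bytes + 8) >> 14) & 0x1;   /* bit 14 of val1 */
}
```

With accessors like these the in-memory struct always has one known byte
order, at the cost of touching every place that reads a field, which is
exactly the larger rewrite that is out of scope for 2.11.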

 Thomas





Re: [Qemu-devel] [PATCH for-2.12 1/3] qapi: Add qdict_is_null()

2017-11-14 Thread Markus Armbruster
Max Reitz  writes:

> On 2017-11-14 15:57, Markus Armbruster wrote:
>> Max Reitz  writes:
>> 
>>> Signed-off-by: Max Reitz 
>>> ---
>>>  include/qapi/qmp/qdict.h |  1 +
>>>  qobject/qdict.c  | 10 ++
>>>  2 files changed, 11 insertions(+)
>>>
>>> diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
>>> index fc218e7be6..c65ebfc748 100644
>>> --- a/include/qapi/qmp/qdict.h
>>> +++ b/include/qapi/qmp/qdict.h
>>> @@ -76,6 +76,7 @@ int64_t qdict_get_try_int(const QDict *qdict, const char 
>>> *key,
>>>int64_t def_value);
>>>  bool qdict_get_try_bool(const QDict *qdict, const char *key, bool 
>>> def_value);
>>>  const char *qdict_get_try_str(const QDict *qdict, const char *key);
>>> +bool qdict_is_qnull(const QDict *qdict, const char *key);
>>>  
>>>  void qdict_copy_default(QDict *dst, QDict *src, const char *key);
>>>  void qdict_set_default_str(QDict *dst, const char *key, const char *val);
>>> diff --git a/qobject/qdict.c b/qobject/qdict.c
>>> index e8f15f1132..a032ea629a 100644
>>> --- a/qobject/qdict.c
>>> +++ b/qobject/qdict.c
>>> @@ -294,6 +294,16 @@ const char *qdict_get_try_str(const QDict *qdict, 
>>> const char *key)
>>>  }
>>>  
>>>  /**
>>> + * qdict_is_qnull(): Return true if the value for 'key' is QNull
>>> + */
>>> +bool qdict_is_qnull(const QDict *qdict, const char *key)
>>> +{
>>> +QObject *value = qdict_get(qdict, key);
>>> +
>>> +return value && value->type == QTYPE_QNULL;
>>> +}
>>> +
>>> +/**
>>>   * qdict_iter(): Iterate over all the dictionary's stored values.
>>>   *
>>>   * This function allows the user to provide an iterator, which will be
>> 
>> As far as I can tell, the new helper function is going to be used just
>> once, by bdrv_open_inherit() in PATCH 2:
>> 
>> qdict_is_qnull(options, "backing")
>> 
>> I dislike abstracting from just one concrete instance.
>> 
>> Perhaps a more general helper could be more generally useful.  Something
>> like:
>> 
>> qobject_is(qdict_get(options, "backing"), QTYPE_QNULL)
>> 
>> There are numerous instances of
>> 
>> !obj || qobject_type(obj) == T
>> 
>> in the tree, which could then be replaced by
>> 
>> qobject_is(obj, T)
>> 
>> An alternative helper: macro qobject_dynamic_cast(obj, T) that returns
>> (T *)obj if obj is a T, else null.  Leads to something like
>> 
>> qobject_dynamic_cast(qdict_get(options, "backing"), QNull)
>
> If you think that's good, then that's good -- you know the QAPI code
> better than me, after all.

I'll play with it today.

> To explain myself: I thought it would be the natural extension of the
> qdict_get_try_*() functions for the QNull type.

I see now.  The name qdict_get_try_null() would make it obvious, but
what to return on success...
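
To make the two helper ideas from the quoted review concrete, here is a
small stand-alone sketch. MiniObject, MiniType and the mini_* functions are
invented stand-ins for illustration; they are not QEMU's QObject API, and
a real qobject_dynamic_cast() would presumably be a macro returning a typed
pointer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for QEMU's QType / QObject. */
typedef enum MiniType { MINI_NULL, MINI_NUM, MINI_STR } MiniType;

typedef struct MiniObject {
    MiniType type;
} MiniObject;

/* True only if obj exists and has the requested type. */
bool mini_is(const MiniObject *obj, MiniType type)
{
    return obj != NULL && obj->type == type;
}

/* "Dynamic cast": hand back obj if it has the requested type, else NULL. */
const MiniObject *mini_dynamic_cast(const MiniObject *obj, MiniType type)
{
    return mini_is(obj, type) ? obj : NULL;
}
```

Either form handles a missing key (NULL object) and a type mismatch in one
call, which is what the one-off qdict_is_qnull() helper was doing.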



Re: [Qemu-devel] [PULL 7/8] Add new PCI ID for i82559a

2017-11-14 Thread Stefan Weil
Hi,

I currently think that this patch is wrong and should be reverted.

It fixes a certain use case by hacking the PCI device id, but does not
correctly model how that device id is set on the real hardware.

As far as I know, all i82559 have a default PCI device id of 0x1229.
It can be changed by the EEPROM configuration, but not all network
cards do have an EEPROM.

See for example this URL for more information:
http://zoo.cs.yale.edu/classes/cs422/2010/ref/82559_eeprom.pdf

The correct solution is modeling the EEPROM and allowing QEMU
users to provide an EEPROM file.

Cheers
Stefan

On 14.11.2017 at 03:11, Jason Wang wrote:
> From: Mike Nawrocki 
> 
> Adds a new PCI ID for the i82559a (0x8086 0x1030) interface. The
> "x-use-alt-device-id" property controls whether this new ID is to be
> used, and is true by default, and set to false in a compat entry.
> 
> Signed-off-by: Mike Nawrocki 
> Reviewed-by: Michael S. Tsirkin 
> Signed-off-by: Jason Wang 
> ---
>  hw/net/eepro100.c| 13 +
>  include/hw/compat.h  |  4 
>  include/hw/pci/pci.h |  1 +
>  qemu-options.hx  |  2 +-
>  4 files changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/net/eepro100.c b/hw/net/eepro100.c
> index 91dd058..a63ed2c 100644
> --- a/hw/net/eepro100.c
> +++ b/hw/net/eepro100.c
> @@ -132,6 +132,7 @@ typedef struct {
>  const char *name;
>  const char *desc;
>  uint16_t device_id;
> +uint16_t alt_device_id;
>  uint8_t revision;
>  uint16_t subsystem_vendor_id;
>  uint16_t subsystem_id;
> @@ -276,6 +277,7 @@ typedef struct {
>  /* Quasi static device properties (no need to save them). */
>  uint16_t stats_size;
>  bool has_extended_tcb_support;
> +bool use_alt_device_id;
>  } EEPRO100State;
>  
>  /* Word indices in EEPROM. */
> @@ -1855,6 +1857,14 @@ static void e100_nic_realize(PCIDevice *pci_dev, Error 
> **errp)
>  
>  TRACE(OTHER, logout("\n"));
>  
> +/* By default, the i82559a adapter uses the legacy PCI ID (for the
> + * i82557). This allows the PCI ID to be changed to the alternate
> + * i82559 ID if needed.
> + */
> +if (s->use_alt_device_id && strcmp(info->name, "i82559a") == 0) {
> +pci_config_set_device_id(s->dev.config, info->alt_device_id);
> +}
> +
>  s->device = info->device;
>  
>  e100_pci_reset(s, &local_err);
> @@ -1974,6 +1984,7 @@ static E100PCIDeviceInfo e100_devices[] = {
>  .desc = "Intel i82559A Ethernet",
>  .device = i82559A,
>  .device_id = PCI_DEVICE_ID_INTEL_82557,
> +.alt_device_id = PCI_DEVICE_ID_INTEL_82559,
>  .revision = 0x06,
>  .stats_size = 80,
>  .has_extended_tcb_support = true,
> @@ -2067,6 +2078,8 @@ static E100PCIDeviceInfo 
> *eepro100_get_class(EEPRO100State *s)
>  
>  static Property e100_properties[] = {
>  DEFINE_NIC_PROPERTIES(EEPRO100State, conf),
> +DEFINE_PROP_BOOL("x-use-alt-device-id", EEPRO100State, use_alt_device_id,
> + true),
>  DEFINE_PROP_END_OF_LIST(),
>  };
>  
> diff --git a/include/hw/compat.h b/include/hw/compat.h
> index cf389b4..f96212c 100644
> --- a/include/hw/compat.h
> +++ b/include/hw/compat.h
> @@ -10,6 +10,10 @@
>  .driver   = "virtio-tablet-device",\
>  .property = "wheel-axis",\
>  .value= "false",\
> +},{\
> +.driver   = "i82559a",\
> +.property = "x-use-alt-device-id",\
> +.value= "false",\
>  },
>  
>  #define HW_COMPAT_2_9 \
> diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
> index 8d02a0a..f30e2cf 100644
> --- a/include/hw/pci/pci.h
> +++ b/include/hw/pci/pci.h
> @@ -70,6 +70,7 @@ extern bool pci_available;
>  /* Intel (0x8086) */
>  #define PCI_DEVICE_ID_INTEL_82551IT  0x1209
>  #define PCI_DEVICE_ID_INTEL_82557    0x1229
> +#define PCI_DEVICE_ID_INTEL_82559    0x1030
>  #define PCI_DEVICE_ID_INTEL_82801IR  0x2922
>  
>  /* Red Hat / Qumranet (for QEMU) -- see pci-ids.txt */
> diff --git a/qemu-options.hx b/qemu-options.hx
> index 3728e9b..a39c7e4 100644
> --- a/qemu-options.hx
> +++ b/qemu-options.hx
> @@ -2047,7 +2047,7 @@ that the card should have; this option currently only 
> affects virtio cards; set
>  @var{v} = 0 to disable MSI-X. If no @option{-net} option is specified, a 
> single
>  NIC is created.  QEMU can emulate several different models of network card.
>  Valid values for @var{type} are
> -@code{virtio}, @code{i82551}, @code{i82557b}, @code{i82559er},
> +@code{virtio}, @code{i82551}, @code{i82557b}, @code{i82559a}, 
> @code{i82559er},
>  @code{ne2k_pci}, @code{ne2k_isa}, @code{pcnet}, @code{rtl8139},
>  @code{e1000}, @code{smc91c111}, @code{lance} and @code{mcf_fec}.
>  Not all devices are supported on all targets.  Use @code{-net nic,model=help}
> 




Re: [Qemu-devel] Abnormal observation during migration: too many "write-not-dirty" pages

2017-11-14 Thread Chunguang Li
Some more details about this experiment:

The host is running Ubuntu-16.04 with a 4.4.0 Linux kernel and QEMU-2.5.1. The
guests are running Ubuntu-12.04, except the Memcached guest, which runs
Ubuntu-16.04.

The exact proportions of write-not-dirty pages for the first 2 pre-copy
iterations (0.445 means 44.5%):

Workload     Iteration 1   Iteration 2
Memcached    0.445         0.478
Zeusmp       0.670         0.727
Mcf          0.808         0.793
Bzip2        0.464         0.447
Milc         0.341         0.037
cactusADM    0.280         0.248
lbm          0.090         0.037
GemsFDTD     0.226         0.172
Bwaves       0.069         0.003
Astar        0.113         0.039
Xalancbmk    0.082         0.041
Wrf          0.141         0.073

Any advice? Looking forward to any response. Thank you.

 

Chunguang



Re: [Qemu-devel] QEMU abort when network service is restarted during live migration with vhost-user as the network backend

2017-11-14 Thread Yori Fang


On 2017/11/14 19:40, Marc-André Lureau wrote:
> Hi
> 
> On Tue, Nov 14, 2017 at 8:09 AM, fangying  wrote:
>> Hi all,
>>
>> We have a VM migrating with vhost-user as the network backend. We noticed
>> that QEMU aborts if openvswitch is restarted at the point where
>> MEMORY_LISTENER_CALL_GLOBAL(log_global_start, Forward) is called. The
>> reason is that vhost_dev_set_log returns -1 because the network connection
>> is temporarily lost due to the restart of the openvswitch service.
>>
>> Below is the trace of the call stack.
>>
>> #0  0x7f868ed971d7 in raise() from /usr/lib64/libc.so.6
>> #1  0x7f868ed988c8 in abort() from /usr/lib64/libc.so.6
>> #2  0x004d0d35 in vhost_log_global_start (listener=) 
>> at /usr/src/debug/qemu-kvm-2.8.1/hw/virtio/vhost.c:794
>> #2  0x00486bd2 in memory_global_dirty_log_start at 
>> /usr/src/debug/qemu-kvm-2.8.1/memory.c:2304
>> #3  0x00486dcd in ram_save_init_globals at 
>> /usr/src/debug/qemu-kvm-2.8.1/migration/ram.c:2072
>> #4  0x0048c185 in ram_save_setup (f=0x25e6ac0, opaque=<optimized out>) at 
>> /usr/src/debug/qemu-kvm-2.8.1/migration/ram.c:2093
>> #5  0x004fbee2 in qemu_savevm_state_begin at 
>> /usr/src/debug/qemu-kvm-2.8.1/migration/savevm.c:956
>> #6  0x0083d8f8 in migration_thread at migration/migration.c:2198
>>
>> static void vhost_log_global_start(MemoryListener *listener)
>> {
>> int r;
>>
>> r = vhost_migration_log(listener, true);
>> if (r < 0) {
>> abort();   /* branch taken */
>> }
>> }
>>
>> What confuses me is:
>> 1. do we really need to abort here ?
> 
> Not if we have a sane way to handle the situation. It makes sense
> though to not want to support that use case (restarting the vhost-user
> process during migration).
> 
>> 2. all callbacks in MemoryListener return void, so no caller further up
>> the call stack can see the failure.
>> Can we just cancel the migration here instead of calling abort? Like:
> 
> That would be acceptable to me, but there should be a better way than
> calling qmp_migrate_cancel() (we need to give a reason for cancelling,
> and report it to user). Juan should be able to help.

I agree with you; we'd better give more details here instead of passing NULL to
qmp_migrate_cancel.
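
To make the idea concrete, here is a minimal, self-contained sketch of a
void callback that records a failure and a reason instead of aborting, so
that the caller can cancel cleanly. It does not use QEMU's real types:
ToyListener, toy_backend_set_log(), toy_log_start() and
toy_migration_setup() are hypothetical names invented for illustration, and
a real fix would have to go through the migration code's own error
reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a listener whose callbacks return void. */
typedef struct ToyListener {
    bool backend_connected;   /* false while the vhost-user peer restarts */
    bool failed;              /* set by the callback instead of abort() */
    char reason[64];          /* human-readable reason for the failure */
} ToyListener;

/* Models a backend call that fails while the connection is down. */
static int toy_backend_set_log(const ToyListener *l)
{
    return l->backend_connected ? 0 : -1;
}

/* void callback: record the error so the caller can react to it. */
static void toy_log_start(ToyListener *l)
{
    if (toy_backend_set_log(l) < 0) {
        l->failed = true;
        snprintf(l->reason, sizeof(l->reason),
                 "vhost backend disconnected during log start");
    }
}

/* Caller: run the callback, then check the recorded state. */
static int toy_migration_setup(ToyListener *l)
{
    assert(l != NULL);
    l->failed = false;
    l->reason[0] = '\0';
    toy_log_start(l);
    return l->failed ? -1 : 0;   /* -1: cancel migration, VM keeps running */
}
```

With this shape the reason string is available to whoever reports the
cancellation to the user, which is the part that passing NULL loses.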

> 
>>
>> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
>> index ddc42f0..27ae4a2 100644
>> --- a/hw/virtio/vhost.c
>> +++ b/hw/virtio/vhost.c
>> @@ -27,6 +27,7 @@
>>  #include "hw/virtio/virtio-access.h"
>>  #include "migration/blocker.h"
>>  #include "sysemu/dma.h"
>> +#include "qmp-commands.h"
>>
>>  /* enabled until disconnected backend stabilizes */
>>  #define _VHOST_DEBUG 1
>> @@ -885,7 +886,7 @@ static void vhost_log_global_start(MemoryListener 
>> *listener)
>>
>>  r = vhost_migration_log(listener, true);
>>  if (r < 0) {
>> -abort();
>> +qmp_migrate_cancel(NULL);
>>  }
>>  }
>>
>> @@ -895,7 +896,7 @@ static void vhost_log_global_stop(MemoryListener 
>> *listener)
>>
>>  r = vhost_migration_log(listener, false);
>>  if (r < 0) {
>> -abort();
>> +qmp_migrate_cancel(NULL);
>>  }
>>  }
>>
>>
>>
>>
> 
> 
> 




Re: [Qemu-devel] [PATCH v3 for-2.11 1/3] tpm_emulator: Add a caching layer for the TPM Established flag

2017-11-14 Thread Marc-André Lureau
Hi

On Wed, Nov 15, 2017 at 2:16 AM, Stefan Berger
 wrote:
> On 11/14/2017 06:40 PM, Marc-André Lureau wrote:
>>
>> Hi
>>
>> On Tue, Nov 14, 2017 at 10:52 PM, Stefan Berger
>>  wrote:
>>>
>>> Add a caching layer for the TPM established flag so that we don't
>>> need to go to the emulator every time the flag is read by accessing
>>> the REG_ACCESS register.
>>
>> What's the impact? Isn't this just a "small" optimization? Iotw, why
>> is this for-2.11?
>
>
> The TIS has a register that contains this flag and that's being polled quite
> frequently. So it generates a lot of traffic to the emulator. This caching
> layer gets rid of most of the traffic.

I didn't notice any problem when doing my tests, and I guess Amarnath didn't
either. Perhaps it's best to delay this until after 2.11.
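
For readers following along, the caching pattern under discussion can be
sketched in isolation as below. This is only an illustration under stated
assumptions, not the patch itself: ToyTpm, toy_query_emulator(),
toy_get_established() and toy_reset_established() are hypothetical names,
and the "emulator" is just a field in the struct with a call counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a backend with an expensive remote flag. */
typedef struct ToyTpm {
    bool emulator_flag;        /* value held by the simulated emulator */
    unsigned emulator_calls;   /* how often the emulator was queried */
    bool cached_valid;         /* true once cached_flag can be trusted */
    bool cached_flag;          /* last value read from the emulator */
} ToyTpm;

/* Expensive path: one round trip to the simulated emulator. */
static bool toy_query_emulator(ToyTpm *t)
{
    t->emulator_calls++;
    return t->emulator_flag;
}

/* Polled path: only the first read after an invalidation is expensive. */
static bool toy_get_established(ToyTpm *t)
{
    assert(t != NULL);
    if (!t->cached_valid) {
        t->cached_flag = toy_query_emulator(t);
        t->cached_valid = true;
    }
    return t->cached_flag;
}

/* Resetting the flag must invalidate the cache. */
static void toy_reset_established(ToyTpm *t)
{
    assert(t != NULL);
    t->emulator_flag = false;
    t->cached_valid = false;
}
```

A thousand polls then cost one emulator query, and a reset costs one more
on the next read.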

>Stefan
>
>
>>
>>> Signed-off-by: Stefan Berger 
>>>
>>> v1->v2:
>>>   - move the caching to the backend layer since detecting the
>>> TPM 1.2 TSC_ResetEstablishmentBit() command is easier to do
>>> here.
>>> ---
>>>   hw/tpm/tpm_emulator.c | 17 ++---
>>>   1 file changed, 14 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/tpm/tpm_emulator.c b/hw/tpm/tpm_emulator.c
>>> index e1a6810..b293db7 100644
>>> --- a/hw/tpm/tpm_emulator.c
>>> +++ b/hw/tpm/tpm_emulator.c
>>> @@ -73,6 +73,9 @@ typedef struct TPMEmulator {
>>>   Error *migration_blocker;
>>>
>>>   QemuMutex mutex;
>>> +
>>> +unsigned int established_flag:1;
>>> +unsigned int established_flag_cached:1;
>>>   } TPMEmulator;
>>>
>>>
>>> @@ -287,16 +290,22 @@ static bool
>>> tpm_emulator_get_tpm_established_flag(TPMBackend *tb)
>>>   TPMEmulator *tpm_emu = TPM_EMULATOR(tb);
>>>   ptm_est est;
>>>
>>> -DPRINTF("%s", __func__);
>>> +if (tpm_emu->established_flag_cached) {
>>> +return tpm_emu->established_flag;
>>> +}
>>> +
>>>   if (tpm_emulator_ctrlcmd(tpm_emu, CMD_GET_TPMESTABLISHED, &est,
>>>0, sizeof(est)) < 0) {
>>>   error_report("tpm-emulator: Could not get the TPM established
>>> flag: %s",
>>>strerror(errno));
>>>   return false;
>>>   }
>>> -DPRINTF("established flag: %0x", est.u.resp.bit);
>>> +DPRINTF("got established flag: %0x", est.u.resp.bit);
>>> +
>>> +tpm_emu->established_flag_cached = 1;
>>> +tpm_emu->established_flag = (est.u.resp.bit != 0);
>>>
>>> -return (est.u.resp.bit != 0);
>>> +return tpm_emu->established_flag;
>>>   }
>>>
>>>   static int tpm_emulator_reset_tpm_established_flag(TPMBackend *tb,
>>> @@ -327,6 +336,8 @@ static int
>>> tpm_emulator_reset_tpm_established_flag(TPMBackend *tb,
>>>   return -1;
>>>   }
>>>
>>> +tpm_emu->established_flag_cached = 0;
>>> +
>>>   return 0;
>>>   }
>>>
>>> --
>>> 2.5.5
>>>
>>
>>
>



-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH v3 for-2.11 3/3] tpm_tis: Return 0 for every register in case of failure mode

2017-11-14 Thread Marc-André Lureau
Hi

On Wed, Nov 15, 2017 at 2:18 AM, Stefan Berger
 wrote:
> On 11/14/2017 06:47 PM, Marc-André Lureau wrote:
>>
>> Hi
>>
>> On Tue, Nov 14, 2017 at 10:52 PM, Stefan Berger
>>  wrote:
>>>
>>> Rather than returning ~0, return 0 for every register in case of
>>> failure mode. The '0' is better to indicate that there's no device
>>> there.
>>
>> For most registers, 0 makes more sense. However, I wonder if we
>> shouldn't just fail to start qemu in this case...
>>
>> Not convincing me this is 2.11 material either. Does this fix a specific
>> bug?
>
>
> Yes, SeaBIOS detects the ~0 when it probes and thinks there's a device
> there. It then hangs trying to set flags and read registers to be able to
> use the device.
>

Please update the commit message with that added,
Reviewed-by: Marc-André Lureau 


>Stefan
>
>
>>
>>> Signed-off-by: Stefan Berger 
>>> ---
>>>   hw/tpm/tpm_tis.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
>>> index fec2fc6..42d647d 100644
>>> --- a/hw/tpm/tpm_tis.c
>>> +++ b/hw/tpm/tpm_tis.c
>>> @@ -545,7 +545,7 @@ static uint64_t tpm_tis_mmio_read(void *opaque,
>>> hwaddr addr,
>>>   uint8_t v;
>>>
>>>   if (tpm_backend_had_startup_error(s->be_driver)) {
>>> -return val;
>>> +return 0;
>>>   }
>>>
>>>   switch (offset) {
>>> --
>>> 2.5.5
>>>
>>
>>
>



-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-14 Thread Wei Wang

On 11/15/2017 05:21 AM, Michael S. Tsirkin wrote:

On Tue, Nov 14, 2017 at 08:02:03PM +0800, Wei Wang wrote:

On 11/14/2017 01:32 AM, Michael S. Tsirkin wrote:

- guest2host_cmd: written by the guest to ACK to the host about the
commands that have been received. The host will clear the corresponding
bits on the host2guest_cmd register. The guest also uses this register
to send commands to the host (e.g. when finish free page reporting).

I am not sure what is the role of guest2host_cmd. Reporting of
the correct cmd id seems sufficient indication that guest
received the start command. Not getting any more seems sufficient
to detect stop.


I think the issue is when the host is waiting for the guest to report pages,
it does not know whether the guest is going to report more or the report is
done already. That's why we need a way to let the guest tell the host "the
report is done, don't wait for more", then the host continues to the next
step - sending the non-free pages to the destination. The following method
is a conclusion of other comments, with some new thought. Please have a
check if it is good.

config won't work well for this IMHO.
Writes to config register are hard to synchronize with the VQ.
For example, guest sends free pages, host says stop, meanwhile
guest sends stop for 1st set of pages.


I still don't see an issue with this. Please see below:
(before jumping into the discussion, let me make sure I've explained this
point well: host-to-guest commands are now done via config, and
guest-to-host commands are done via the free page vq)


Case: Host starts to request the reporting with cmd_id=1. Some time
later, Host writes "stop" to config; meanwhile the guest happens to finish
the reporting and plans to actively send a "stop" command from the
free_page_vq().
  Essentially, this is like a sync between two threads - if we
view the config interrupt handler as one thread, the other is the free
page reporting worker thread.


- what the config handler does is simply:
  1.1:  WRITE_ONCE(vb->reporting_stop, true);

- what the reporting thread will do is
  2.1:  WRITE_ONCE(vb->reporting_stop, true);
  2.2:  send_stop_to_host_via_vq();

From the guest point of view, no matter whether 1.1 or 2.1 is executed
first, it doesn't make a difference to the end result -
vb->reporting_stop is set.


From the host point of view, it knows that cmd_id=1 has truly stopped 
the reporting when it receives a "stop" sign via the vq.
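
The ordering argument above can be sketched with C11 atomics standing in
for WRITE_ONCE(). This is a toy model only: ToyBalloon, toy_balloon_init(),
toy_config_handler() and toy_reporting_done() are hypothetical names, and
the vq is reduced to a counter of queued "stop" buffers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the balloon state shared by both paths. */
typedef struct ToyBalloon {
    atomic_bool reporting_stop;  /* set by either path */
    atomic_int stops_sent;       /* "stop" buffers queued on the vq */
} ToyBalloon;

static void toy_balloon_init(ToyBalloon *vb)
{
    assert(vb != NULL);
    atomic_init(&vb->reporting_stop, false);
    atomic_init(&vb->stops_sent, 0);
}

/* Step 1.1: config interrupt handler, the host asked to stop. */
static void toy_config_handler(ToyBalloon *vb)
{
    atomic_store(&vb->reporting_stop, true);
}

/* Steps 2.1 and 2.2: the reporting worker finished on its own. */
static void toy_reporting_done(ToyBalloon *vb)
{
    atomic_store(&vb->reporting_stop, true);   /* 2.1 */
    atomic_fetch_add(&vb->stops_sent, 1);      /* 2.2: stop via the vq */
}
```

Both stores write the same value, so either order of 1.1 and 2.1 leaves
reporting_stop set, and the host sees exactly one "stop" on the vq.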




How about adding a buffer with "stop" in the VQ instead?
Wastes a VQ entry which you will need to reserve for this
but is it a big deal?


The free page vq is guest-to-host direction. Using it for host-to-guest 
requests will make it bidirectional, which will result in the same issue 
described before: https://lkml.org/lkml/2017/10/11/1009 (the first response)


On the other hand, I think adding another new vq for host-to-guest 
requesting doesn't make a difference in essence, compared to using 
config (same 1.1, 2.1, 2.2 above), but will be more complicated.




Two new configuration registers in total:
- cmd_reg: the command register, combined from the previous host2guest and
guest2host. I think we can use the same register for host requesting and
guest ACKing, since the guest writing will trap to QEMU, that is, all the
writes to the register are performed in QEMU, and we can keep things work in
a correct way there.
- cmd_id_reg: the sequence id of the free page report command.

-- free page report:
 - host requests the guest to start reporting by "cmd_reg |
REPORT_START";
 - guest ACKs to the host about receiving the start reporting request by
"cmd_reg | REPORT_START", host will clear the flag bit once receiving the
ACK.
 - host requests the guest to stop reporting by "cmd_reg | REPORT_STOP";
 - guest ACKs to the host about receiving the stop reporting request by
"cmd_reg | REPORT_STOP", host will clear the flag once receiving the ACK.
 - guest tells the host about the start of the reporting by writing "cmd
id" into an outbuf, which is added to the free page vq.
 - guest tells the host about the end of the reporting by writing "0"
into an outbuf, which is added to the free page vq. (we reserve "id=0" as
the stop sign)

-- ballooning:
 - host requests the guest to start ballooning by "cmd_reg | BALLOONING";
 - guest ACKs to the host about receiving the request by "cmd_reg |
BALLOONING", host will clear the flag once receiving the ACK.
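
The request/ACK handshake on cmd_reg described above could be modelled
roughly as follows. The bit values, ToyDev, toy_host_request(),
toy_host_start_report() and toy_guest_ack() are all assumptions made for
this sketch; nothing here is taken from the actual patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit assignments for the command register. */
#define TOY_REPORT_START (1u << 0)
#define TOY_REPORT_STOP  (1u << 1)
#define TOY_BALLOONING   (1u << 2)

/* Hypothetical device state kept on the host (QEMU) side. */
typedef struct ToyDev {
    uint32_t cmd_reg;      /* pending host-to-guest requests */
    uint32_t cmd_id_reg;   /* sequence id of the current report */
} ToyDev;

/* Host raises one or more request bits. */
static void toy_host_request(ToyDev *d, uint32_t cmd)
{
    assert(d != NULL);
    d->cmd_reg |= cmd;
}

/* Host starts a report: publish the id, then raise the start bit. */
static void toy_host_start_report(ToyDev *d, uint32_t cmd_id)
{
    assert(d != NULL);
    d->cmd_id_reg = cmd_id;
    toy_host_request(d, TOY_REPORT_START);
}

/*
 * Guest ACK: the guest's write traps to the host, which clears only the
 * bits that were actually pending. Returns the bits that were cleared.
 */
static uint32_t toy_guest_ack(ToyDev *d, uint32_t cmd)
{
    assert(d != NULL);
    uint32_t acked = d->cmd_reg & cmd;
    d->cmd_reg &= ~acked;
    return acked;
}
```

Because every write ends up on the host side, one register can carry both
directions without the two sides overwriting each other's bits.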


Some more explanations:
-- Why not let the host request the guest to start the free page reporting
simply by writing a new cmd id to the cmd_id_reg?
The configuration interrupt is shared among all the features - ballooning,
free page reporting, and future feature extensions which need host-to-guest
requests. Some features may need to add other feature specific configuration
registers, like free page reporting need the cmd_id_reg, which is not used
by ballooning. The 

[Qemu-devel] [Question] Qemu's Heap Becomes Very Large and Never Reduce Down

2017-11-14 Thread Xulei (Stone)
Hi, guys

I met a strange problem with qemu 2.8.1:
qemu consumes too much heap memory after several operations and cannot release
it anymore:
hot plug/unplug of disk & net, vnc connect/disconnect, guest OS reboot, etc.


01a7a000-3b4efe000 rw-p  00:00 0 [heap]
Size:           15520272 kB
Rss:            14421836 kB
Pss:            14421836 kB
Shared_Clean:          0 kB
Shared_Dirty:          0 kB
Private_Clean:      1164 kB
Private_Dirty:  14420672 kB
Referenced:      7485624 kB
Anonymous:      14421836 kB
AnonHugePages:     34816 kB
Swap:            1098140 kB
KernelPageSize:        4 kB
MMUPageSize:           4 kB
Locked:                0 kB
VmFlags: rd wr mr mw me ac sd

My steps are:
1) start several VMs, each equipped with only 8G of memory;
2) randomly combine the operations mentioned above;
3) after a few hours, qemu's virtual memory and RSS both grow very large and
never fall back down;

After analysis via /proc/$pid/smaps, I found that the VMA of pc.ram does not
occupy much memory; the growth comes only from the heap section.

I guess this is related to glibc or the qemu rcu_thread, but I cannot
figure it out.
Are there any patches that fix this problem, or does somebody have an idea?



[Qemu-devel] Reply: Re: Reply: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry

2017-11-14 Thread lu.zhipeng
32-bit build config:

./configure --enable-guest-agent --cross-prefix=i686-w64-mingw32- \
    --with-vss-sdk="/home/VSSSDK72" --disable-fdt --target-list=i386-softmmu

To have your VPlat virtual machine and Docker faults handled efficiently, please report them to: $VPlat technical support.

芦志朋 luzhipeng
IT Development Engineer
OS Product Dept./Central R&D Institute/System Product
No. 800, Middle Section of Tianfu Avenue, Chengdu, Sichuan Province
E: lu.zhip...@zte.com.cn
www.zte.com.cn

Original message
From: 芦志朋10108272
To: ;
Cc: ;
Date: 2017-11-15 10:48
Subject: Reply: Re: Reply: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry

Quoting lu.zhip...@zte.com.cn (2017-11-14 19:41:58)
> i used  xp  version:
>
> xp professional 2002 service pack 3
>
> Hmm, doesn't --cross-prefix=x86_64-w64-mingw32- result in a 64-bit
> qemu-ga.exe? How are you running this on 32-bit Windows XP?

I built two versions, 32-bit and 64-bit, and ran the 32-bit one on XP.

From: ;
To: 芦志朋10108272;
Cc: ;
Date: 2017-11-15 10:23
Subject: Re: Reply: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry


Quoting lu.zhip...@zte.com.cn (2017-11-14 19:41:58)
> i used  xp  version:
> 
> xp professional 2002 service pack 3

Hmm, doesn't --cross-prefix=x86_64-w64-mingw32- result in a 64-bit
qemu-ga.exe? How are you running this on 32-bit Windows XP?

> 
> build environment: 
> 
> root@localhost qemu-2.5.0]# cat /etc/redhat-release 
> 
> CentOS Linux release 7.0.1406 (Core) 

Thanks, I'll try to see if there's anything there that would account for
the difference.

> 
> Original message
> From: ;
> To: 芦志朋10108272;
> Cc: ;
> Date: 2017-11-15 09:22
> Subject: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry
> Quoting lu.zhip...@zte.com.cn (2017-11-14 05:09:35)
> >  i test the latest qga in xp , it run ok .
> > 
> > 
> > my qga config :
> > 
> > Configured with: './configure' '--enable-guest-agent' '--cross-prefix=
> > x86_64-w64-mingw32-' '--with-vss-sdk=/home/VSSSDK72' '--disable-fdt'
> > '--target-list=x86_64-softmmu'
> 
> Hmm, so you're testing with Windows XP x64? I was using XP 32-bit (SP3),
> but I retried with XP x64 (SP2) and I still have the same issue.
> 
> I can only get qemu-ga working if I build on top of something prior to
> commit 12f8def0e.
> 
> What build environment are you using? I've tried Fedora Core 18 and 20
> and have the same issue with both.
> 
> > 
> > used qga version info
> > 
> > [root@ceshi qemu]# git log
> > commit 533ab83ea074d5fc457769f6ac698524a12f1156
> > Author: ZhiPeng Lu 
> > Date:   Fri Nov 10 10:17:14 2017 +0800
> > 
> >     qga: fix some errors for guest_get_network_stats
> > 
> >     fix some erros:
> >     1.if building qga on Windows Vista/2008 and newer,
> >     it cann't find the link to GetIfEntry2 in windows xp.
> >     2. check valid of if_index.
> > 
> >     Signed-off-by: ZhiPeng Lu 
> > 
> > commit de597a8b27722ce4f9cc660f930f7dccc712712d
> > Author: ZhiPeng Lu 
> > Date:   Fri Nov 3 22:54:20 2017 +0800
> > 
> >     qga: replace GetIfEntry
> > 
> >     The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus using
> >     GetIfEntry2 instead of GetIfEntry.
> > 
> >     Signed-off-by: ZhiPeng Lu 
> >     *avoid CamelCase variable names
> >     *update field names for MIB_IFROW -> MIB_IF_ROW2
> >     Signed-off-by: Michael Roth 
> > 
> > commit 5ca7a3cba468736cfe555887af1f6ba754f6eac9
> > Merge: a4f0537 10a7b7e
> > Author: Peter Maydell 
> > Date:   Tue Nov 7 14:43:35 2017 +
> > 
> >     Merge remote-tracking branch 'remotes/berrange/tags/pull-2017-11-06-2' into staging
> > 
> >     Pull IO 2017/11/06 v2
> > 

[Qemu-devel] [PATCH v2] net: Transmit zero UDP checksum as 0xFFFF

2017-11-14 Thread Ed Swierk via Qemu-devel
The checksum algorithm used by IPv4, TCP and UDP allows a zero value
to be represented by either 0x0000 or 0xFFFF. But per RFC 768, a zero
UDP checksum must be transmitted as 0xFFFF, as 0x0000 is a special
value meaning no checksum.

Substitute 0xFFFF whenever a checksum is computed as zero when
modifying a UDP datagram header. Doing this on IPv4 packets and TCP
segments is unnecessary but legal. Add a wrapper for
net_checksum_finish() that makes the substitution.

(We can't just change net_checksum_finish(), as that function is also
used by receivers to verify checksums, and in that case the expected
value is always 0x0000.)

v2:

Add a wrapper net_checksum_finish_hdr() rather than duplicating the
logic at every caller.

Signed-off-by: Ed Swierk 
---
 hw/net/e1000.c | 2 +-
 hw/net/net_rx_pkt.c| 2 +-
 hw/net/net_tx_pkt.c| 6 +++---
 hw/net/vmxnet3.c   | 3 ++-
 include/net/checksum.h | 7 +++
 5 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index d642314..4e33667 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -506,7 +506,7 @@ putsum(uint8_t *data, uint32_t n, uint32_t sloc, uint32_t 
css, uint32_t cse)
 n = cse + 1;
 if (sloc < n-1) {
 sum = net_checksum_add(n-css, data+css);
-stw_be_p(data + sloc, net_checksum_finish(sum));
+stw_be_p(data + sloc, net_checksum_finish_hdr(sum));
 }
 }
 
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 1019b50..a2b2c2d 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -517,7 +517,7 @@ _net_rx_pkt_calc_l4_csum(struct NetRxPkt *pkt)
 cntr += net_checksum_add_iov(pkt->vec, pkt->vec_len,
  pkt->l4hdr_off, csl, cso);
 
-csum = net_checksum_finish(cntr);
+csum = net_checksum_finish_hdr(cntr);
 
 trace_net_rx_pkt_l4_csum_calc_csum(pkt->l4hdr_off, csl, cntr, csum);
 
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 20b2549..dc95f12 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -126,12 +126,12 @@ void net_tx_pkt_update_ip_checksums(struct NetTxPkt *pkt)
 
 /* Calculate IP pseudo header checksum */
 cntr = eth_calc_ip4_pseudo_hdr_csum(ip_hdr, pkt->payload_len, );
-csum = cpu_to_be16(~net_checksum_finish(cntr));
+csum = cpu_to_be16(~net_checksum_finish_hdr(cntr));
 } else if (gso_type == VIRTIO_NET_HDR_GSO_TCPV6) {
 /* Calculate IP pseudo header checksum */
 cntr = eth_calc_ip6_pseudo_hdr_csum(ip_hdr, pkt->payload_len,
 IP_PROTO_TCP, );
-csum = cpu_to_be16(~net_checksum_finish(cntr));
+csum = cpu_to_be16(~net_checksum_finish_hdr(cntr));
 } else {
 return;
 }
@@ -486,7 +486,7 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
 net_checksum_add_iov(iov, iov_len, pkt->virt_hdr.csum_start, csl, cso);
 
 /* Put the checksum obtained into the packet */
-csum = cpu_to_be16(net_checksum_finish(csum_cntr));
+csum = cpu_to_be16(net_checksum_finish_hdr(csum_cntr));
     iov_from_buf(iov, iov_len, csum_offset, &csum, sizeof csum);
 }
 
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 90f6943..ee7cdeb 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -970,7 +970,8 @@ static void vmxnet3_rx_need_csum_calculate(struct NetRxPkt 
*pkt,
 data = (uint8_t *)pkt_data + vhdr->csum_start;
 len = pkt_len - vhdr->csum_start;
 /* Put the checksum obtained into the packet */
-stw_be_p(data + vhdr->csum_offset, net_raw_checksum(data, len));
+stw_be_p(data + vhdr->csum_offset,
+ net_checksum_finish_hdr(net_checksum_add(len, data)));
 
 vhdr->flags &= ~VIRTIO_NET_HDR_F_NEEDS_CSUM;
 vhdr->flags |= VIRTIO_NET_HDR_F_DATA_VALID;
diff --git a/include/net/checksum.h b/include/net/checksum.h
index 7df472c..9878c8b 100644
--- a/include/net/checksum.h
+++ b/include/net/checksum.h
@@ -34,6 +34,13 @@ net_checksum_add(int len, uint8_t *buf)
 }
 
 static inline uint16_t
+net_checksum_finish_hdr(uint32_t sum)
+{
+uint16_t s = net_checksum_finish(sum);
+return s ? s : 0xFFFF;
+}
+
+static inline uint16_t
 net_raw_checksum(uint8_t *data, int length)
 {
 return net_checksum_finish(net_checksum_add(length, data));
-- 
1.9.1




Re: [Qemu-devel] [RESEND PATCH 2/6] memory: introduce AddressSpaceOps and IOMMUObject

2017-11-14 Thread Liu, Yi L
Hi Eric,

On Tue, Nov 14, 2017 at 10:52:54PM +0100, Auger Eric wrote:
> Hi Yi L,
> 
> On 14/11/2017 14:59, Liu, Yi L wrote:
> > On Tue, Nov 14, 2017 at 09:53:07AM +0100, Auger Eric wrote:
> > Hi Eric,
> > 
> >> Hi Yi L,
> >>
> >> On 13/11/2017 10:58, Liu, Yi L wrote:
> >>> On Mon, Nov 13, 2017 at 04:56:01PM +1100, David Gibson wrote:
>  On Fri, Nov 03, 2017 at 08:01:52PM +0800, Liu, Yi L wrote:
> > From: Peter Xu 
> >
> > AddressSpaceOps is similar to MemoryRegionOps, it's just for address
> > spaces to store arch-specific hooks.
> >
> > The first hook I would like to introduce is iommu_get(). Return an
> > IOMMUObject behind the AddressSpace.
> >
> > For systems that have IOMMUs, we will create a special address
> > space per device which is different from system default address
> > space for it (please refer to pci_device_iommu_address_space()).
> > Normally when that happens, there will be one specific IOMMU (or
> > say, translation unit) stands right behind that new address space.
> >
> > This iommu_get() fetches that guy behind the address space. Here,
> > the guy is defined as IOMMUObject, which includes a notifier_list
> > so far, may extend in future. Along with IOMMUObject, a new iommu
> > notifier mechanism is introduced. It would be used for virt-svm.
> > Also IOMMUObject can further have a IOMMUObjectOps which is similar
> > to MemoryRegionOps. The difference is IOMMUObjectOps is not relied
> > on MemoryRegion.
> >
> > Signed-off-by: Peter Xu 
> > Signed-off-by: Liu, Yi L 
> 
>  Hi, sorry I didn't reply to the earlier postings of this after our
>  discussion in China.  I've been sick several times and very busy.
> >>>
> >>> Hi David,
> >>>
> >>> Fully understood. I'll try my best to address your question. Also,
> >>> feel free to input further questions, anyhow, the more we discuss the
> >>> better work we done.
> >>>
>  I still don't feel like there's an adequate explanation of exactly
>  what an IOMMUObject represents.   Obviously it can represent more than
> >>>
> >>> IOMMUObject is aimed to represent the iommu itself. e.g. the iommu
> >>> specific operations. One of the key purpose of IOMMUObject is to
> >>> introduce a notifier framework to let iommu emulator to be able to
> >>> do iommu operations other than MAP/UNMAP. As IOMMU grows more and
> >>> more feature, MAP/UNMAP is not the only operation iommu emulator needs
> >>> to deal. e.g. shared virtual memory. So far, as I know AMD/ARM also
> >>> has it. may correct me on it. As my cover letter mentioned, MR based
> >>> notifier framework doesn’t work for the newly added IOMMU operations.
> >>> Like bind guest pasid table pointer to host and propagate guest's
> >>> iotlb flush to host.
> >>>
>  a single translation window - since that's represented by the
>  IOMMUMR.  But what exactly do all the MRs - or whatever else - that
>  are represented by the IOMMUObject have in common, from a functional
>  point of view.
> >>>
> >>> Let me take virt-SVM as an example. As far as I know, for virt-SVM,
> >>> the implementation of different vendors are similar. The key design
> >>> is to have a nested translation(aka. two stage translation). It is to
> >>> have guest maintain gVA->gPA mapping and hypervisor builds gPA->hPA
> >>> mapping. Similar to EPT based virt-MMU solution.
> >>>
> >>> In Qemu, gPA->hPA mapping is done through MAP/UNMAP notifier, it can
> >>> keep going. But for gVA->gPA mapping, only guest knows it, so hypervisor
> >>> needs to trap specific guest iommu operation and pass the gVA->gPA
> >>> mapping knowledge to host through a notifier(newly added one). In VT-d,
> >>> it is called bind guest pasid table to host.
> >>
> >> What I don't get is the PASID table is per extended context entry. I
> >> understand this latter is indexed by PCI device function. And today MR
> >> are created per PCIe device if I am not wrong. 
> > 
> > In my understanding, MR is more related to AddressSpace not exactly tagged
> > with PCIe device.
> I meant, in the current intel_iommu code, vtd_find_add_as() creates 1
> IOMMU MR and 1 AS per PCIe device, right?

Yes, it does. This is the PCIe device address space. It can be the guest's
physical address space if no vIOMMU is exposed, or it can be the guest IOVA
address space if a vIOMMU is exposed. Both address spaces are treated as 2nd
level translation in VT-d, which is different from the 1st level translation,
which is for the process vaddr space.
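
As a rough illustration of the two levels, the nested walk (gVA -> gPA by
the guest-owned first level, then gPA -> hPA by the second level) can be
sketched with two flat tables. This is a toy model under loud assumptions:
real VT-d tables are multi-level, and ToyTable, toy_table_init(),
toy_walk() and toy_nested_translate() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_MASK  ((1ull << TOY_PAGE_SHIFT) - 1)
#define TOY_NPAGES     8
#define TOY_INVALID    UINT64_MAX

/* Flat toy table: input page frame number -> output page frame number. */
typedef struct ToyTable {
    uint64_t pfn[TOY_NPAGES];
} ToyTable;

/* Mark every entry as not present. */
static void toy_table_init(ToyTable *t)
{
    assert(t != NULL);
    for (size_t i = 0; i < TOY_NPAGES; i++) {
        t->pfn[i] = TOY_INVALID;
    }
}

/* One level of translation; TOY_INVALID models a translation fault. */
static uint64_t toy_walk(const ToyTable *t, uint64_t addr)
{
    uint64_t idx = addr >> TOY_PAGE_SHIFT;
    if (idx >= TOY_NPAGES || t->pfn[idx] == TOY_INVALID) {
        return TOY_INVALID;
    }
    return (t->pfn[idx] << TOY_PAGE_SHIFT) | (addr & TOY_PAGE_MASK);
}

/* Nested walk: first level (gVA -> gPA), then second level (gPA -> hPA). */
static uint64_t toy_nested_translate(const ToyTable *first,
                                     const ToyTable *second, uint64_t gva)
{
    uint64_t gpa = toy_walk(first, gva);
    if (gpa == TOY_INVALID) {
        return TOY_INVALID;
    }
    return toy_walk(second, gpa);
}
```

The second table is the one the existing MAP/UNMAP notifiers keep in sync;
the first is the one only the guest knows about.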

> > 
> >> So why can't we have 1
> >> new MR notifier dedicated to PASID table passing? My understanding is
> >> the MR, having a 1-1 correspondence with a PCIe device and thus a
> >> context could be of right granularity. Then I understand the only flags
> > 
> > I didn't quite get your point regarding the "granularity" here. Could you
> > say a little bit more?
> The 

[Qemu-devel] Reply: Re: Reply: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry

2017-11-14 Thread lu.zhipeng
Quoting lu.zhip...@zte.com.cn (2017-11-14 19:41:58)
> i used  xp  version:
>
> xp professional 2002 service pack 3
>
> Hmm, doesn't --cross-prefix=x86_64-w64-mingw32- result in a 64-bit
> qemu-ga.exe? How are you running this on 32-bit Windows XP?

I built two versions, 32-bit and 64-bit, and ran the 32-bit one on XP.

To have your VPlat virtual machine and Docker faults handled efficiently, please report them to: $VPlat technical support.

芦志朋 luzhipeng
IT Development Engineer
OS Product Dept./Central R&D Institute/System Product
No. 800, Middle Section of Tianfu Avenue, Chengdu, Sichuan Province
E: lu.zhip...@zte.com.cn
www.zte.com.cn

Original message
From: ;
To: 芦志朋10108272;
Cc: ;
Date: 2017-11-15 10:23
Subject: Re: Reply: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry


Quoting lu.zhip...@zte.com.cn (2017-11-14 19:41:58)
> i used  xp  version:
> 
> xp professional 2002 service pack 3

Hmm, doesn't --cross-prefix=x86_64-w64-mingw32- result in a 64-bit
qemu-ga.exe? How are you running this on 32-bit Windows XP?

> 
> build environment: 
> 
> root@localhost qemu-2.5.0]# cat /etc/redhat-release 
> 
> CentOS Linux release 7.0.1406 (Core) 

Thanks, I'll try to see if there's anything there that would account for
the difference.

> 
> Original message
> From: ;
> To: 芦志朋10108272;
> Cc: ;
> Date: 2017-11-15 09:22
> Subject: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry
> Quoting lu.zhip...@zte.com.cn (2017-11-14 05:09:35)
> >  i test the latest qga in xp , it run ok .
> > 
> > 
> > my qga config :
> > 
> > Configured with: './configure' '--enable-guest-agent' '--cross-prefix=
> > x86_64-w64-mingw32-' '--with-vss-sdk=/home/VSSSDK72' '--disable-fdt'
> > '--target-list=x86_64-softmmu'
> 
> Hmm, so you're testing with Windows XP x64? I was using XP 32-bit (SP3),
> but I retried with XP x64 (SP2) and I still have the same issue.
> 
> I can only get qemu-ga working if I build on top of something prior to
> commit 12f8def0e.
> 
> What build environment are you using? I've tried Fedora Core 18 and 20
> and have the same issue with both.
> 
> > 
> > used qga version info
> > 
> > [root@ceshi qemu]# git log
> > commit 533ab83ea074d5fc457769f6ac698524a12f1156
> > Author: ZhiPeng Lu 
> > Date:   Fri Nov 10 10:17:14 2017 +0800
> > 
> >     qga: fix some errors for guest_get_network_stats
> > 
> >     fix some erros:
> >     1.if building qga on Windows Vista/2008 and newer,
> >     it cann't find the link to GetIfEntry2 in windows xp.
> >     2. check valid of if_index.
> > 
> >     Signed-off-by: ZhiPeng Lu 
> > 
> > commit de597a8b27722ce4f9cc660f930f7dccc712712d
> > Author: ZhiPeng Lu 
> > Date:   Fri Nov 3 22:54:20 2017 +0800
> > 
> >     qga: replace GetIfEntry
> > 
> >     The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus using
> >     GetIfEntry2 instead of GetIfEntry.
> > 
> >     Signed-off-by: ZhiPeng Lu 
> >     *avoid CamelCase variable names
> >     *update field names for MIB_IFROW -> MIB_IF_ROW2
> >     Signed-off-by: Michael Roth 
> > 
> > commit 5ca7a3cba468736cfe555887af1f6ba754f6eac9
> > Merge: a4f0537 10a7b7e
> > Author: Peter Maydell 
> > Date:   Tue Nov 7 14:43:35 2017 +
> > 
> >     Merge remote-tracking branch 'remotes/berrange/tags/pull-2017-11-06-2' into staging
> > 
> >     Pull IO 2017/11/06 v2
> > 
> > 
> > Original message
> > From: ;
> > To: 芦志朋10108272;
> > Cc: ;
> > Date: 2017-11-14 07:57
> > Subject: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry
> > Quoting lu.zhip...@zte.com.cn (2017-11-09 05:26:15)
> > >  I think this code is better:
> > > 
> > >  if (OSver.dwMajorVersion >= 6) {
> > >   MIB_IF_ROW2 aMib_ifrow;
> > >   typedef NETIOAPI_API (WINAPI *getifentry2_t)(PMIB_IF_ROW2 Row);
> > >   memset(&aMib_ifrow, 0, sizeof(aMib_ifrow));
> > >   aMib_ifrow.InterfaceIndex = nicId;
> > >   HMODULE module = GetModuleHandle("iphlpapi");
> > >   PVOID fun = 

Re: [Qemu-devel] [PATCH] net: Transmit zero UDP checksum as 0xFFFF

2017-11-14 Thread Ed Swierk via Qemu-devel
On Tue, Nov 14, 2017 at 6:10 PM, Jason Wang  wrote:
>
>
> On 2017年11月15日 07:25, Ed Swierk wrote:
>>
>> The checksum algorithm used by IPv4, TCP and UDP allows a zero value
>> to be represented by either 0x0000 or 0xFFFF. But per RFC 768, a zero
>> UDP checksum must be transmitted as 0xFFFF, as 0x0000 is a special
>> value meaning no checksum.
>>
>> Substitute 0xFFFF whenever a checksum is computed as zero on a UDP
>> datagram. Doing this on IPv4 packets and TCP segments is unnecessary
>> but legal.
>>
>> (While it is tempting to make the substitution in
>> net_checksum_finish(), that function is also used by receivers to
>> verify checksums, and in that case the expected value is always
>> 0x0000.)
>
>
> Then it looks like you'd better have a wrapper for net_checksum_finish() and
> do things there.

I'll do that in v2.

>> index 1019b50..e820132 100644
>> --- a/hw/net/net_rx_pkt.c
>> +++ b/hw/net/net_rx_pkt.c
>> @@ -588,6 +588,9 @@ bool net_rx_pkt_fix_l4_csum(struct NetRxPkt *pkt)
>> /* Calculate L4 checksum */
>>   csum = cpu_to_be16(_net_rx_pkt_calc_l4_csum(pkt));
>> +if (!csum) {
>> +csum = 0xFFFF; /* For UDP, zero checksum must be sent as 0xFFFF
>> */
>> +}
>
>
> I thought we should only do this for tx?

We need to do this any time we modify the checksum field in a UDP
datagram header for someone else to verify. Normally this happens on
the tx path, and that someone is a remote system. But here
net_rx_pkt_fix_l4_csum() is used to fill in the checksum on packets
received with the NEEDS_CSUM vhdr flag before passing them along to
the guest.
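
To show why the substitution is safe for whoever verifies the datagram,
here is a small standalone sketch of the ones' complement arithmetic. It
does not use QEMU's helpers: toy_csum_add(), toy_csum_finish() and
toy_csum_finish_hdr() are hypothetical stand-ins written for this example,
and only the last one mirrors the idea of the planned wrapper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum the buffer as big-endian 16-bit words into a 32-bit accumulator. */
static uint32_t toy_csum_add(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    for (size_t i = 0; i + 1 < len; i += 2) {
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    }
    if (len & 1) {
        sum += (uint32_t)buf[len - 1] << 8;   /* pad the odd byte with 0 */
    }
    return sum;
}

/* Fold the carries and take the ones' complement. */
static uint16_t toy_csum_finish(uint32_t sum)
{
    while (sum >> 16) {
        sum = (sum & 0xFFFF) + (sum >> 16);
    }
    return (uint16_t)~sum;
}

/* For values written into a header: never emit 0x0000. */
static uint16_t toy_csum_finish_hdr(uint32_t sum)
{
    uint16_t s = toy_csum_finish(sum);
    return s ? s : 0xFFFF;
}
```

For data that sums to 0xFFFF the plain finish step yields 0x0000, the
header variant yields 0xFFFF, and a verifier that sums the data together
with either value still ends up with 0x0000.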

--Ed



Re: [Qemu-devel] Reply: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry

2017-11-14 Thread Michael Roth
Quoting lu.zhip...@zte.com.cn (2017-11-14 19:41:58)
> i used  xp  version:
> 
> xp professional 2002 service pack 3

Hmm, doesn't --cross-prefix=x86_64-w64-mingw32- result in a 64-bit
qemu-ga.exe? How are you running this on 32-bit Windows XP?

> 
> build environment: 
> 
> root@localhost qemu-2.5.0]# cat /etc/redhat-release 
> 
> CentOS Linux release 7.0.1406 (Core) 

Thanks, I'll try to see if there's anything there that would account for
the difference.

> 
> To have your VPlat virtual machine and Docker faults handled efficiently,
> please report them to: $VPlat Technical Support.
> 
> 芦志朋 luzhipeng
> 
> 
> IT Development Engineer
> OS Product Dept./Central R&D Institute/System Product
> 
> 
> No. 800, Middle Section of Tianfu Avenue, Chengdu, Sichuan Province
> E: lu.zhip...@zte.com.cn
> www.zte.com.cn
> 
> Original message
> From: ;
> To: 芦志朋10108272;
> Cc: ;
> Date: 2017-11-15 09:22
> Subject: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry
> Quoting lu.zhip...@zte.com.cn (2017-11-14 05:09:35)
> >  I tested the latest qga on XP; it runs OK.
> > 
> > 
> > my qga config :
> > 
> > Configured with: './configure' '--enable-guest-agent' '--cross-prefix=
> > x86_64-w64-mingw32-' '--with-vss-sdk=/home/VSSSDK72' '--disable-fdt'
> > '--target-list=x86_64-softmmu'
> 
> Hmm, so you're testing with Windows XP x64? I was using XP 32-bit (SP3),
> but I retried with XP x64 (SP2) and I still have the same issue.
> 
> I can only get qemu-ga working if I build on top of something prior to
> commit 12f8def0e.
> 
> What build environment are you using? I've tried Fedora Core 18 and 20
> and have the same issue with both.
> 
> > 
> > used qga version info
> > 
> > [root@ceshi qemu]# git log
> > 
> > commit 533ab83ea074d5fc457769f6ac698524a12f1156
> > 
> > Author: ZhiPeng Lu 
> > 
> > Date:   Fri Nov 10 10:17:14 2017 +0800
> > 
> > 
> > qga: fix some errors for guest_get_network_stats
> > 
> > 
> > 
> > fix some erros:
> > 
> > 1.if building qga on Windows Vista/2008 and newer,
> > 
> > it cann't find the link to GetIfEntry2 in windows xp.
> > 
> > 2. check valid of if_index.
> > 
> > 
> > 
> > Signed-off-by: ZhiPeng Lu 
> > 
> > 
> > commit de597a8b27722ce4f9cc660f930f7dccc712712d
> > 
> > Author: ZhiPeng Lu 
> > 
> > Date:   Fri Nov 3 22:54:20 2017 +0800
> > 
> > 
> > qga: replace GetIfEntry
> > 
> > 
> > 
> > The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus using
> > GetIfEntry2 instead of GetIfEntry.
> > 
> > 
> > 
> > Signed-off-by: ZhiPeng Lu 
> > 
> > *avoid CamelCase variable names
> > 
> > *update field names for MIB_IFROW -> MIB_IF_ROW2
> > 
> > Signed-off-by: Michael Roth 
> > 
> > 
> > commit 5ca7a3cba468736cfe555887af1f6ba754f6eac9
> > 
> > Merge: a4f0537 10a7b7e
> > 
> > Author: Peter Maydell 
> > 
> > Date:   Tue Nov 7 14:43:35 2017 +0000
> > 
> > 
> > Merge remote-tracking branch 'remotes/berrange/tags/pull-2017-11-06-2' into
> > staging
> > 
> > 
> > 
> > Pull IO 2017/11/06 v2
> > 
> > 
> > 
> > 
> > To have your VPlat virtual machine and Docker faults handled efficiently,
> > please report them to: $VPlat Technical Support.
> > 
> > 芦志朋 luzhipeng
> > 
> > 
> > IT Development Engineer
> > OS Product Dept./Central R&D Institute/System Product
> > 
> > 
> > No. 800, Middle Section of Tianfu Avenue, Chengdu, Sichuan Province
> > E: lu.zhip...@zte.com.cn
> > www.zte.com.cn
> > 
> > Original message
> > From: ;
> > To: 芦志朋10108272;
> > Cc: ;
> > Date: 2017-11-14 07:57
> > Subject: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry
> > Quoting lu.zhip...@zte.com.cn (2017-11-09 05:26:15)
> > >  i think the code is better
> > > 
> > >  if (OSver.dwMajorVersion >= 6) {
> > >   MIB_IF_ROW2 aMib_ifrow;
> > >   typedef NETIOAPI_API (WINAPI *getifentry2_t)(PMIB_IF_ROW2 Row);
> > >   memset(&aMib_ifrow, 0, sizeof(aMib_ifrow));
> > >   aMib_ifrow.InterfaceIndex = nicId;
> > >   HMODULE module = GetModuleHandle("iphlpapi");
> > >   PVOID fun = GetProcAddress(module, "GetIfEntry2");
> > >   if (fun == NULL) {
> > >   error_setg(errp, QERR_QGA_COMMAND_FAILED,
> > >  "Failed to get address of GetIfEntry2");
> > >   return NULL;
> > >   }
> > > getifentry2_t getifentry2_ex = (getifentry2_t)fun;
> > > if (NO_ERROR == getifentry2_ex(&aMib_ifrow)){
> > > }
> > 
> > I've updated the patch with this change:
> >   https://github.com/mdroth/qemu/commits/qga-if-stats
> > 
> > But I'm a bit confused now: when I tried to test this on XP I realized
> > that qemu-ga no longer works on XP, and generates the following error
> > when I try to start it (even without your stats patch):
> > 
> >   "The procedure entry point AcquireSRWLockExclusive could not be located
> >

Re: [Qemu-devel] [PATCH V5] hw/pci-host: Fix x86 Host Bridges 64bit PCI hole

2017-11-14 Thread Michael S. Tsirkin
On Mon, Nov 13, 2017 at 03:07:45PM +0200, Marcel Apfelbaum wrote:
> On 11/11/2017 17:25, Marcel Apfelbaum wrote:
> > Currently there is no MMIO range over 4G
> > reserved for PCI hotplug. Since the 32bit PCI hole
> > depends on the number of cold-plugged PCI devices
> > and other factors, it is very possible that it is too small
> > to hotplug PCI devices with large BARs.
> > 
> > Fix it by reserving 2G for I440FX chipset
> > in order to comply with older Win32 Guest OSes
> > and 32G for Q35 chipset.
> > 
> > Even if the new defaults of pci-hole64-size will appear in
> > "info qtree" also for older machines, the property was
> > not implemented so no changes will be visible to guests.
> > 
> > Note this is a regression since prev QEMU versions had
> > some range reserved for 64bit PCI hotplug.
> > 
> > Reviewed-by: Laszlo Ersek 
> > Reviewed-by: Gerd Hoffmann 
> > Signed-off-by: Marcel Apfelbaum 
> > ---
> > 
> 
> Hi Michael,
> 
> Can you please merge the patch for QEMU 2.11
> if you have no further comments?
> I think it is an important fix and it is worth
> having in 2.11.
> 
> Thanks,
> Marcel

Just a note: this changes the DSDT to use
QWordMemory in the _CRS.

I expected it to cause trouble for old Windows versions such as WinXP,
but it does not seem to cause any, at least not boot crashes.

Worth checking that 32 bit hotplug still works though.
Could you pls confirm?

> > V4 -> V5:
> >   - Renamed a local variable (Laszlo)
> >   - Added a comment to q35 props (Eduardo)
> > 
> > V3 -> V4:
> >   - Addressed Laszlo's comments:
> >  - Added defines for pci-hole64 default size props.
> >  - Rounded the hole64_end to 1G
> >  - Moved some info to commit message
> >   - Addressed Michael's comments:
> >  - Added more comments.
> >   - I kept Gerd's "review-by" tag since no functional changes were made.
> > 
> > V2 -> V3:
> >   - Addressed Gerd's and others comments and re-enabled the pci-hole64-size
> > property defaulting it to 2G for I440FX and 32g for Q35.
> >   - Even if the new defaults of pci-hole64-size will appear in "info qtree"
> > also for older machines, the property was not implemented so
> > no changes will be visible to guests.
> > 
> > V1 -> V2:
> >   Addressed Igor's comments:
> >  - aligned the hole64 start to 1Gb
> >   (I think all the computations took care of it already,
> >but it can't hurt)
> >  - Init compat props to "off" instead of "false"
> > 
> >   hw/i386/pc.c  | 22 ++
> >   hw/pci-host/piix.c| 32 ++--
> >   hw/pci-host/q35.c | 42 +++---
> >   include/hw/i386/pc.h  | 10 +-
> >   include/hw/pci-host/q35.h |  1 +
> >   5 files changed, 101 insertions(+), 6 deletions(-)
> > 
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index e11a65b545..fafe5ba5cd 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1448,6 +1448,28 @@ void pc_memory_init(PCMachineState *pcms,
> >   pcms->ioapic_as = &address_space_memory;
> >   }
> > +/*
> > + * The 64bit pci hole starts after "above 4G RAM" and
> > + * potentially the space reserved for memory hotplug.
> > + */
> > +uint64_t pc_pci_hole64_start(void)
> > +{
> > +PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
> > +PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> > +uint64_t hole64_start = 0;
> > +
> > +if (pcmc->has_reserved_memory && pcms->hotplug_memory.base) {
> > +hole64_start = pcms->hotplug_memory.base;
> > +if (!pcmc->broken_reserved_end) {
> > +hole64_start += memory_region_size(&pcms->hotplug_memory.mr);
> > +}
> > +} else {
> > +hole64_start = 0x100000000ULL + pcms->above_4g_mem_size;
> > +}
> > +
> > +return ROUND_UP(hole64_start, 1ULL << 30);
> > +}
> > +
> >   qemu_irq pc_allocate_cpu_irq(void)
> >   {
> >   return qemu_allocate_irq(pic_irq_request, NULL, 0);
> > diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
> > index a7e2256870..a684a7cca9 100644
> > --- a/hw/pci-host/piix.c
> > +++ b/hw/pci-host/piix.c
> > @@ -50,6 +50,7 @@ typedef struct I440FXState {
> >   PCIHostState parent_obj;
> >   Range pci_hole;
> >   uint64_t pci_hole64_size;
> > +bool pci_hole64_fix;
> >   uint32_t short_root_bus;
> >   } I440FXState;
> > @@ -112,6 +113,9 @@ struct PCII440FXState {
> >   #define I440FX_PAM_SIZE 7
> >   #define I440FX_SMRAM0x72
> > +/* Keep it 2G to comply with older win32 guests */
> > +#define I440FX_PCI_HOST_HOLE64_SIZE_DEFAULT (1ULL << 31)
> > +
> >   /* Older coreboot versions (4.0 and older) read a config register that doesn't
> >* exist in real hardware, to get the RAM size from QEMU.
> >*/
> > @@ -238,29 +242,52 @@ static void i440fx_pcihost_get_pci_hole_end(Object *obj, Visitor *v,
> >   visit_type_uint32(v, name, &value, errp);
> >   }
> > +/*
> > + * The 64bit PCI hole start is set by the 

Re: [Qemu-devel] [PATCH] net: Transmit zero UDP checksum as 0xFFFF

2017-11-14 Thread Jason Wang



On 2017-11-15 07:25, Ed Swierk wrote:

The checksum algorithm used by IPv4, TCP and UDP allows a zero value
to be represented by either 0x0000 or 0xFFFF. But per RFC 768, a zero
UDP checksum must be transmitted as 0xFFFF, as 0x0000 is a special
value meaning no checksum.

Substitute 0xFFFF whenever a checksum is computed as zero on a UDP
datagram. Doing this on IPv4 packets and TCP segments is unnecessary
but legal.

(While it is tempting to make the substitution in
net_checksum_finish(), that function is also used by receivers to
verify checksums, and in that case the expected value is always
0x0000.)


Then it looks like you'd better have a wrapper for net_checksum_finish()
and do things there.




Signed-off-by: Ed Swierk 
---
  hw/net/e1000.c  | 5 +++--
  hw/net/net_rx_pkt.c | 3 +++
  hw/net/net_tx_pkt.c | 6 ++
  hw/net/vmxnet3.c| 7 +--
  4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index d642314..97242a1 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -505,8 +505,9 @@ putsum(uint8_t *data, uint32_t n, uint32_t sloc, uint32_t css, uint32_t cse)
  if (cse && cse < n)
  n = cse + 1;
  if (sloc < n-1) {
-sum = net_checksum_add(n-css, data+css);
-stw_be_p(data + sloc, net_checksum_finish(sum));
+sum = net_raw_checksum(data + css, n - css);
+/* For UDP, zero checksum must be sent as 0xFFFF */
+stw_be_p(data + sloc, sum ? sum : 0xFFFF);
  }
  }
  
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c

index 1019b50..e820132 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -588,6 +588,9 @@ bool net_rx_pkt_fix_l4_csum(struct NetRxPkt *pkt)
  
  /* Calculate L4 checksum */

  csum = cpu_to_be16(_net_rx_pkt_calc_l4_csum(pkt));
+if (!csum) {
+csum = 0xFFFF; /* For UDP, zero checksum must be sent as 0xFFFF */
+}


I thought we should only do this for tx?

Thanks



Re: [Qemu-devel] [PATCH v8 10/14] migration: add postcopy migration of dirty bitmaps

2017-11-14 Thread John Snow


On 10/30/2017 12:33 PM, Vladimir Sementsov-Ogievskiy wrote:
> Postcopy migration of dirty bitmaps. Only named dirty bitmaps,
> associated with root nodes and non-root named nodes are migrated.
> 
> If destination qemu is already containing a dirty bitmap with the same name
> as a migrated bitmap (for the same node), then, if their granularities are
> the same the migration will be done, otherwise the error will be generated.
> 
> If destination qemu doesn't contain such bitmap it will be created.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  include/migration/misc.h   |   3 +
>  migration/migration.h  |   3 +
>  migration/block-dirty-bitmap.c | 734 
> +

Ouch :\

>  migration/migration.c  |   3 +
>  migration/savevm.c |   2 +
>  vl.c   |   1 +
>  migration/Makefile.objs|   1 +
>  migration/trace-events |  14 +
>  8 files changed, 761 insertions(+)
>  create mode 100644 migration/block-dirty-bitmap.c
> 

Organizationally, you introduce three new 'public' prototypes:

dirty_bitmap_mig_init
dirty_bitmap_mig_before_vm_start
init_dirty_bitmap_incoming_migration

mig_init is advertised in migration/misc.h, the other two are in
migration/migration.h.
The definitions for all three are in migration/block-dirty-bitmap.c

In pure naivety, I find it weird to have something that you use in
migration.c and advertised in migration.h actually exist separately in
block-dirty-bitmap.c; but maybe this is the sanest thing to do.

> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index c079b7771b..9cc539e232 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -55,4 +55,7 @@ bool migration_has_failed(MigrationState *);
>  bool migration_in_postcopy_after_devices(MigrationState *);
>  void migration_global_dump(Monitor *mon);
>  
> +/* migration/block-dirty-bitmap.c */
> +void dirty_bitmap_mig_init(void);
> +
>  #endif
> diff --git a/migration/migration.h b/migration/migration.h
> index 50d1f01346..4e3ad04664 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -211,4 +211,7 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
>  void migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,
>ram_addr_t start, size_t len);
>  
> +void dirty_bitmap_mig_before_vm_start(void);
> +void init_dirty_bitmap_incoming_migration(void);
> +
>  #endif
> diff --git a/migration/block-dirty-bitmap.c b/migration/block-dirty-bitmap.c
> new file mode 100644
> index 0000000000..53cb20045d
> --- /dev/null
> +++ b/migration/block-dirty-bitmap.c
> @@ -0,0 +1,734 @@
> +/*
> + * Block dirty bitmap postcopy migration
> + *
> + * Copyright IBM, Corp. 2009
> + * Copyright (c) 2016-2017 Parallels International GmbH
> + *
> + * Authors:
> + *  Liran Schour   
> + *  Vladimir Sementsov-Ogievskiy 
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2.  See
> + * the COPYING file in the top-level directory.
> + * This file is derived from migration/block.c, so it's author and IBM copyright
> + * are here, although content is quite different.
> + *
> + * Contributions after 2012-01-13 are licensed under the terms of the
> + * GNU GPL, version 2 or (at your option) any later version.
> + *
> + ****
> + *
> + * Here postcopy migration of dirty bitmaps is realized. Only named dirty
> + * bitmaps, associated with root nodes and non-root named nodes are migrated.

Put another way, only QMP-addressable bitmaps. Nodes without a name that
are not the root have no way to be addressed.

> + *
> + * If destination qemu is already containing a dirty bitmap with the same name

"If the destination QEMU already contains a dirty bitmap with the same name"

> + * as a migrated bitmap (for the same node), then, if their granularities are
> + * the same the migration will be done, otherwise the error will be generated.

"an error"

> + *
> + * If destination qemu doesn't contain such bitmap it will be created.

"If the destination QEMU doesn't contain such a bitmap, it will be created."

> + *
> + * format of migration:
> + *
> + * # Header (shared for different chunk types)
> + * 1, 2 or 4 bytes: flags (see qemu_{put,put}_flags)
> + * [ 1 byte: node name size ] \  flags & DEVICE_NAME
> + * [ n bytes: node name ] /
> + * [ 1 byte: bitmap name size ] \  flags & BITMAP_NAME
> + * [ n bytes: bitmap name ] /
> + *
> + * # Start of bitmap migration (flags & START)
> + * header
> + * be64: granularity
> + * 1 byte: bitmap flags (corresponds to BdrvDirtyBitmap)
> + *   bit 0-  bitmap is enabled
> + *   bit 1-  bitmap is persistent
> + *   bit 2-  bitmap is autoloading
> + *   bits 3-7 - reserved, must be zero
> + *
> + * # Complete of bitmap migration (flags & COMPLETE)
> + * header
> + *
> + 

[Qemu-devel] Reply: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry

2017-11-14 Thread lu.zhipeng
I used this XP version:

xp professional 2002 service pack 3

Build environment:

[root@localhost qemu-2.5.0]# cat /etc/redhat-release

CentOS Linux release 7.0.1406 (Core)

To have your VPlat virtual machine and Docker faults handled efficiently,
please report them to: $VPlat Technical Support.

芦志朋 luzhipeng

IT Development Engineer
OS Product Dept./Central R&D Institute/System Product

No. 800, Middle Section of Tianfu Avenue, Chengdu, Sichuan Province
E: lu.zhip...@zte.com.cn
www.zte.com.cn

Original message

From: ;
To: 芦志朋10108272;
Cc: ;
Date: 2017-11-15 09:22
Subject: Re: Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry


Quoting lu.zhip...@zte.com.cn (2017-11-14 05:09:35)
>  I tested the latest qga on XP; it runs OK.
> 
> 
> my qga config :
> 
> Configured with: './configure' '--enable-guest-agent' '--cross-prefix=
> x86_64-w64-mingw32-' '--with-vss-sdk=/home/VSSSDK72' '--disable-fdt'
> '--target-list=x86_64-softmmu'

Hmm, so you're testing with Windows XP x64? I was using XP 32-bit (SP3),
but I retried with XP x64 (SP2) and I still have the same issue.

I can only get qemu-ga working if I build on top of something prior to
commit 12f8def0e.

What build environment are you using? I've tried Fedora Core 18 and 20
and have the same issue with both.

> 
> used qga version info
> 
> [root@ceshi qemu]# git log
> 
> commit 533ab83ea074d5fc457769f6ac698524a12f1156
> 
> Author: ZhiPeng Lu 
> 
> Date:   Fri Nov 10 10:17:14 2017 +0800
> 
> 
> qga: fix some errors for guest_get_network_stats
> 
> 
> 
> fix some erros:
> 
> 1.if building qga on Windows Vista/2008 and newer,
> 
> it cann't find the link to GetIfEntry2 in windows xp.
> 
> 2. check valid of if_index.
> 
> 
> 
> Signed-off-by: ZhiPeng Lu 
> 
> 
> commit de597a8b27722ce4f9cc660f930f7dccc712712d
> 
> Author: ZhiPeng Lu 
> 
> Date:   Fri Nov 3 22:54:20 2017 +0800
> 
> 
> qga: replace GetIfEntry
> 
> 
> 
> The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus using
> GetIfEntry2 instead of GetIfEntry.
> 
> 
> 
> Signed-off-by: ZhiPeng Lu 
> 
> *avoid CamelCase variable names
> 
> *update field names for MIB_IFROW -> MIB_IF_ROW2
> 
> Signed-off-by: Michael Roth 
> 
> 
> commit 5ca7a3cba468736cfe555887af1f6ba754f6eac9
> 
> Merge: a4f0537 10a7b7e
> 
> Author: Peter Maydell 
> 
> Date:   Tue Nov 7 14:43:35 2017 +0000
> 
> 
> Merge remote-tracking branch 'remotes/berrange/tags/pull-2017-11-06-2' into
> staging
> 
> 
> 
> Pull IO 2017/11/06 v2
> 
> 
> 
> 
> To have your VPlat virtual machine and Docker faults handled efficiently,
> please report them to: $VPlat Technical Support.
> 
> 芦志朋 luzhipeng
> 
> 
> IT Development Engineer
> OS Product Dept./Central R&D Institute/System Product
> 
> 
> No. 800, Middle Section of Tianfu Avenue, Chengdu, Sichuan Province
> E: lu.zhip...@zte.com.cn
> www.zte.com.cn
> 
> Original message
> From: ;
> To: 芦志朋10108272;
> Cc: ;
> Date: 2017-11-14 07:57
> Subject: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry
> Quoting lu.zhip...@zte.com.cn (2017-11-09 05:26:15)
> >  i think the code is better
> > 
> >  if (OSver.dwMajorVersion >= 6) {
> >   MIB_IF_ROW2 aMib_ifrow;
> >   typedef NETIOAPI_API (WINAPI *getifentry2_t)(PMIB_IF_ROW2 Row);
> >   memset(&aMib_ifrow, 0, sizeof(aMib_ifrow));
> >   aMib_ifrow.InterfaceIndex = nicId;
> >   HMODULE module = GetModuleHandle("iphlpapi");
> >   PVOID fun = GetProcAddress(module, "GetIfEntry2");
> >   if (fun == NULL) {
> >   error_setg(errp, QERR_QGA_COMMAND_FAILED,
> >  "Failed to get address of GetIfEntry2");
> >   return NULL;
> >   }
> > getifentry2_t getifentry2_ex = (getifentry2_t)fun;
> > if (NO_ERROR == getifentry2_ex(&aMib_ifrow)){
> > }
> 
> I've updated the patch with this change:
>   https://github.com/mdroth/qemu/commits/qga-if-stats
> 
> But I'm a bit confused now: when I tried to test this on XP I realized
> that qemu-ga no longer works on XP, and generates the following error
> when I try to start it (even without your stats patch):
> 
>   "The procedure entry point AcquireSRWLockExclusive could not be located
>in the dynamic link library KERNEL32.dll"
> 
> I think this may be due to the following commit, which notes that Vista+
> are now required as a result:
> 
> commit 12f8def0e02232d7c6416ad9b66640f973c531d1
> Author: Andrey Shedel 
> Date:   Fri Mar 24 15:01:41 2017 -0700
> 
> win32: replace custom mutex and condition variable with native
> primitives
> 
> So, are you actually able to run on XP currently? If so, how? And if
> not, I think we have other issues that need to be addressed if we
> want to support XP still; I'm not even sure that's realistic at this
> point.
> 
> Unless there's actually a way to test 

Re: [Qemu-devel] [PATCH v6] NUMA: Enable adding NUMA node implicitly

2017-11-14 Thread Dou Liyang

Hi Igor,

[...]

+parse_numa_node(ms, , NULL);

I get build break here:

numa.c:451:13: error: too few arguments to function ‘parse_numa_node’
 parse_numa_node(ms, , NULL);



In upstream tree, your commit

  cc001888b780 ("numa: fixup parsed NumaNodeOptions earlier")

removed an argument from parse_numa_node() recently. The function
definition is now

static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
Error **errp)

This patch is based on the upstream tree, so parse_numa_node() should have
three arguments.

I am not sure why you got this build failure; can you tell me
which branch you tested?

Thanks,
dou





Re: [Qemu-devel] Reply: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry

2017-11-14 Thread Michael Roth
Quoting lu.zhip...@zte.com.cn (2017-11-14 05:09:35)
>  I tested the latest qga on XP; it runs OK.
> 
> 
> my qga config :
> 
> Configured with: './configure' '--enable-guest-agent' '--cross-prefix=
> x86_64-w64-mingw32-' '--with-vss-sdk=/home/VSSSDK72' '--disable-fdt'
> '--target-list=x86_64-softmmu'

Hmm, so you're testing with Windows XP x64? I was using XP 32-bit (SP3),
but I retried with XP x64 (SP2) and I still have the same issue.

I can only get qemu-ga working if I build on top of something prior to
commit 12f8def0e.

What build environment are you using? I've tried Fedora Core 18 and 20
and have the same issue with both.

> 
> used qga version info
> 
> [root@ceshi qemu]# git log
> 
> commit 533ab83ea074d5fc457769f6ac698524a12f1156
> 
> Author: ZhiPeng Lu 
> 
> Date:   Fri Nov 10 10:17:14 2017 +0800
> 
> 
> qga: fix some errors for guest_get_network_stats
> 
> 
> 
> fix some erros:
> 
> 1.if building qga on Windows Vista/2008 and newer,
> 
> it cann't find the link to GetIfEntry2 in windows xp.
> 
> 2. check valid of if_index.
> 
> 
> 
> Signed-off-by: ZhiPeng Lu 
> 
> 
> commit de597a8b27722ce4f9cc660f930f7dccc712712d
> 
> Author: ZhiPeng Lu 
> 
> Date:   Fri Nov 3 22:54:20 2017 +0800
> 
> 
> qga: replace GetIfEntry
> 
> 
> 
> The data obtained by GetIfEntry is 32 bits, and it may overflow. Thus using
> GetIfEntry2 instead of GetIfEntry.
> 
> 
> 
> Signed-off-by: ZhiPeng Lu 
> 
> *avoid CamelCase variable names
> 
> *update field names for MIB_IFROW -> MIB_IF_ROW2
> 
> Signed-off-by: Michael Roth 
> 
> 
> commit 5ca7a3cba468736cfe555887af1f6ba754f6eac9
> 
> Merge: a4f0537 10a7b7e
> 
> Author: Peter Maydell 
> 
> Date:   Tue Nov 7 14:43:35 2017 +0000
> 
> 
> Merge remote-tracking branch 'remotes/berrange/tags/pull-2017-11-06-2' into
> staging
> 
> 
> 
> Pull IO 2017/11/06 v2
> 
> 
> 
> 
> To have your VPlat virtual machine and Docker faults handled efficiently,
> please report them to: $VPlat Technical Support.
> 
> 芦志朋 luzhipeng
> 
> 
> IT Development Engineer
> OS Product Dept./Central R&D Institute/System Product
> 
> 
> No. 800, Middle Section of Tianfu Avenue, Chengdu, Sichuan Province
> E: lu.zhip...@zte.com.cn
> www.zte.com.cn
> 
> Original message
> From: ;
> To: 芦志朋10108272;
> Cc: ;
> Date: 2017-11-14 07:57
> Subject: Re: Reply: Re: [PATCH v2] qga: replace GetIfEntry
> Quoting lu.zhip...@zte.com.cn (2017-11-09 05:26:15)
> >  i think the code is better
> > 
> >  if (OSver.dwMajorVersion >= 6) {
> >   MIB_IF_ROW2 aMib_ifrow;
> >   typedef NETIOAPI_API (WINAPI *getifentry2_t)(PMIB_IF_ROW2 Row);
> >   memset(&aMib_ifrow, 0, sizeof(aMib_ifrow));
> >   aMib_ifrow.InterfaceIndex = nicId;
> >   HMODULE module = GetModuleHandle("iphlpapi");
> >   PVOID fun = GetProcAddress(module, "GetIfEntry2");
> >   if (fun == NULL) {
> >   error_setg(errp, QERR_QGA_COMMAND_FAILED,
> >  "Failed to get address of GetIfEntry2");
> >   return NULL;
> >   }
> > getifentry2_t getifentry2_ex = (getifentry2_t)fun;
> > if (NO_ERROR == getifentry2_ex(&aMib_ifrow)){
> > }
> 
> I've updated the patch with this change:
>   https://github.com/mdroth/qemu/commits/qga-if-stats
> 
> But I'm a bit confused now: when I tried to test this on XP I realized
> that qemu-ga no longer works on XP, and generates the following error
> when I try to start it (even without your stats patch):
> 
>   "The procedure entry point AcquireSRWLockExclusive could not be located
>in the dynamic link library KERNEL32.dll"
> 
> I think this may be due to the following commit, which notes that Vista+
> are now required as a result:
> 
> commit 12f8def0e02232d7c6416ad9b66640f973c531d1
> Author: Andrey Shedel 
> Date:   Fri Mar 24 15:01:41 2017 -0700
> 
> win32: replace custom mutex and condition variable with native
> primitives
> 
> So, are you actually able to run on XP currently? If so, how? And if
> not, I think we have other issues that need to be addressed if we
> want to support XP still; I'm not even sure that's realistic at this
> point.
> 
> Unless there's actually a way to test QGA on XP right now, I think
> we should just get in the updated patch minus the dynamic DLL stuff,
> i.e.:
>   https://github.com/mdroth/qemu/commit/de597a8b27722ce4f9cc660f930f7dccc712712d
> 
> Make sense?
> 
> > 
> > To have your VPlat virtual machine and Docker faults handled efficiently,
> > please report them to: $VPlat Technical Support.
> > 
> > 芦志朋 luzhipeng
> > 
> > 
> > IT Development Engineer
> > OS Product Dept./Central R&D Institute/System Product
> > 
> > 
> > No. 800, Middle Section of Tianfu Avenue, Chengdu, Sichuan Province
> > E: lu.zhip...@zte.com.cn
> > www.zte.com.cn
> > 
> > Original message
> > 

Re: [Qemu-devel] [PATCH v3 for-2.11 3/3] tpm_tis: Return 0 for every register in case of failure mode

2017-11-14 Thread Stefan Berger

On 11/14/2017 06:47 PM, Marc-André Lureau wrote:

Hi

On Tue, Nov 14, 2017 at 10:52 PM, Stefan Berger
 wrote:

Rather than returning ~0, return 0 for every register in case of
failure mode. The '0' is better to indicate that there's no device
there.

For most registers, 0 makes more sense. However, I wonder if we
shouldn't just fail to start qemu in this case...

Not convincing me this is 2.11 material either. Does this fix a specific bug?


Yes, SeaBIOS detects the ~0 when it probes and thinks there's a device 
there. It then hangs trying to set flags and read registers to be able 
to use the device.


   Stefan




Signed-off-by: Stefan Berger 
---
  hw/tpm/tpm_tis.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index fec2fc6..42d647d 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -545,7 +545,7 @@ static uint64_t tpm_tis_mmio_read(void *opaque, hwaddr addr,
  uint8_t v;

  if (tpm_backend_had_startup_error(s->be_driver)) {
-return val;
+return 0;
  }

  switch (offset) {
--
2.5.5









Re: [Qemu-devel] [PATCH v3 for-2.11 1/3] tpm_emulator: Add a caching layer for the TPM Established flag

2017-11-14 Thread Stefan Berger

On 11/14/2017 06:40 PM, Marc-André Lureau wrote:

Hi

On Tue, Nov 14, 2017 at 10:52 PM, Stefan Berger
 wrote:

Add a caching layer for the TPM established flag so that we don't
need to go to the emulator every time the flag is read by accessing
the REG_ACCESS register.

What's the impact? Isn't this just a "small" optimization? Iotw, why
is this for-2.11?


The TIS has a register that contains this flag and that's being polled 
quite frequently. So it generates a lot of traffic to the emulator. This 
caching layer gets rid of most of the traffic.
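
As a rough illustration of the idea (a standalone sketch, not the code in
this patch): FlagCache, flag_cache_get(), flag_cache_invalidate() and
demo_fetch() are hypothetical names, and the fetch callback stands in for
the round trip to the emulator. Reads are served from the cache until
something that can change the flag invalidates it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache in front of a slow backend query. */
typedef struct FlagCache {
    bool value;           /* last value fetched from the backend */
    bool valid;           /* true while 'value' may be served as-is */
    unsigned fetches;     /* number of round trips actually made */
    bool (*fetch)(void);  /* stand-in for the expensive backend call */
} FlagCache;

/* Return the flag, asking the backend only when nothing is cached. */
static bool flag_cache_get(FlagCache *c)
{
    assert(c != NULL && c->fetch != NULL);
    if (!c->valid) {
        c->value = c->fetch();
        c->valid = true;
        c->fetches++;
    }
    return c->value;
}

/* Drop the cached value, e.g. after a command that resets the flag. */
static void flag_cache_invalidate(FlagCache *c)
{
    assert(c != NULL);
    c->valid = false;
}

/* Demo backend for this sketch only: always reports the flag as set. */
static bool demo_fetch(void)
{
    return true;
}
```

With a register that is polled in a loop, every poll after the first one
hits the cached value, so only invalidations cost a round trip.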


   Stefan




Signed-off-by: Stefan Berger 

v1->v2:
  - move the caching to the backend layer since detecting the
TPM 1.2 TSC_ResetEstablishmentBit() command is easier to do
here.
---
  hw/tpm/tpm_emulator.c | 17 ++---
  1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/tpm/tpm_emulator.c b/hw/tpm/tpm_emulator.c
index e1a6810..b293db7 100644
--- a/hw/tpm/tpm_emulator.c
+++ b/hw/tpm/tpm_emulator.c
@@ -73,6 +73,9 @@ typedef struct TPMEmulator {
  Error *migration_blocker;

  QemuMutex mutex;
+
+unsigned int established_flag:1;
+unsigned int established_flag_cached:1;
  } TPMEmulator;


@@ -287,16 +290,22 @@ static bool tpm_emulator_get_tpm_established_flag(TPMBackend *tb)
  TPMEmulator *tpm_emu = TPM_EMULATOR(tb);
  ptm_est est;

-DPRINTF("%s", __func__);
+if (tpm_emu->established_flag_cached) {
+return tpm_emu->established_flag;
+}
+
  if (tpm_emulator_ctrlcmd(tpm_emu, CMD_GET_TPMESTABLISHED, &est,
   0, sizeof(est)) < 0) {
  error_report("tpm-emulator: Could not get the TPM established flag: %s",
   strerror(errno));
  return false;
  }
-DPRINTF("established flag: %0x", est.u.resp.bit);
+DPRINTF("got established flag: %0x", est.u.resp.bit);
+
+tpm_emu->established_flag_cached = 1;
+tpm_emu->established_flag = (est.u.resp.bit != 0);

-return (est.u.resp.bit != 0);
+return tpm_emu->established_flag;
  }

  static int tpm_emulator_reset_tpm_established_flag(TPMBackend *tb,
@@ -327,6 +336,8 @@ static int tpm_emulator_reset_tpm_established_flag(TPMBackend *tb,
  return -1;
  }

+tpm_emu->established_flag_cached = 0;
+
  return 0;
  }

--
2.5.5









Re: [Qemu-devel] [PATCH for-2.11] target/arm: Report GICv3 sysregs present in ID registers if needed

2017-11-14 Thread Alistair Francis
On Tue, Nov 7, 2017 at 7:01 AM, Peter Maydell  wrote:
> The CPU ID registers ID_AA64PFR0_EL1, ID_PFR1_EL1 and ID_PFR1
> have a field for reporting presence of GICv3 system registers.
> We need to report this field correctly in order for Xen to
> work as a guest inside QEMU emulation. We mustn't incorrectly
> claim the sysregs exist when they don't, though, or Linux will
> crash.
>
> Unfortunately the way we've designed the GICv3 emulation in QEMU
> puts the system registers as part of the GICv3 device, which
> may be created after the CPU proper has been realized. This
> means that we don't know at the point when we define the ID
> registers what the correct value is. Handle this by switching
> them to calling a function at runtime to read the value, where
> we can fill in the GIC field appropriately.
>
> Signed-off-by: Peter Maydell 

Is this going to make it into 2.11?

Alistair

> ---
> In retrospect I think having the sysregs emulation in the
> GIC device was a bit of a design error -- we should have
> split it like the hardware does, with a defined protocol
> between the GIC and the CPU interface. (In real hardware the
> CPU can have the GIC system registers even though it's not
> connected to an actual GICv3, and we don't/can't emulate
> that with our current design.)
> ---
>  target/arm/helper.c | 44 
>  1 file changed, 40 insertions(+), 4 deletions(-)
>
> diff --git a/target/arm/helper.c b/target/arm/helper.c
> index f61fb3e..35c5bd6 100644
> --- a/target/arm/helper.c
> +++ b/target/arm/helper.c
> @@ -4549,6 +4549,33 @@ static void define_debug_regs(ARMCPU *cpu)
>  }
>  }
>
> +/* We don't know until after realize whether there's a GICv3
> + * attached, and that is what registers the gicv3 sysregs.
> + * So we have to fill in the GIC fields in ID_PFR/ID_PFR1_EL1/ID_AA64PFR0_EL1
> + * at runtime.
> + */
> +static uint64_t id_pfr1_read(CPUARMState *env, const ARMCPRegInfo *ri)
> +{
> +ARMCPU *cpu = arm_env_get_cpu(env);
> +uint64_t pfr1 = cpu->id_pfr1;
> +
> +if (env->gicv3state) {
> +pfr1 |= 1 << 28;
> +}
> +return pfr1;
> +}
> +
> +static uint64_t id_aa64pfr0_read(CPUARMState *env, const ARMCPRegInfo *ri)
> +{
> +ARMCPU *cpu = arm_env_get_cpu(env);
> +uint64_t pfr0 = cpu->id_aa64pfr0;
> +
> +if (env->gicv3state) {
> +pfr0 |= 1 << 24;
> +}
> +return pfr0;
> +}
> +
>  void register_cp_regs_for_features(ARMCPU *cpu)
>  {
>  /* Register all the coprocessor registers based on feature bits */
> @@ -4573,10 +4600,14 @@ void register_cp_regs_for_features(ARMCPU *cpu)
>.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 0,
>.access = PL1_R, .type = ARM_CP_CONST,
>.resetvalue = cpu->id_pfr0 },
> +/* ID_PFR1 is not a plain ARM_CP_CONST because we don't know
> + * the value of the GIC field until after we define these regs.
> + */
>  { .name = "ID_PFR1", .state = ARM_CP_STATE_BOTH,
>.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 1,
> -  .access = PL1_R, .type = ARM_CP_CONST,
> -  .resetvalue = cpu->id_pfr1 },
> +  .access = PL1_R, .type = ARM_CP_NO_RAW,
> +  .readfn = id_pfr1_read,
> +  .writefn = arm_cp_write_ignore },
>  { .name = "ID_DFR0", .state = ARM_CP_STATE_BOTH,
>.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 1, .opc2 = 2,
>.access = PL1_R, .type = ARM_CP_CONST,
> @@ -4692,10 +4723,15 @@ void register_cp_regs_for_features(ARMCPU *cpu)
>   * define new registers here.
>   */
>  ARMCPRegInfo v8_idregs[] = {
> +/* ID_AA64PFR0_EL1 is not a plain ARM_CP_CONST because we don't
> + * know the right value for the GIC field until after we
> + * define these regs.
> + */
>  { .name = "ID_AA64PFR0_EL1", .state = ARM_CP_STATE_AA64,
>.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 0,
> -  .access = PL1_R, .type = ARM_CP_CONST,
> -  .resetvalue = cpu->id_aa64pfr0 },
> +  .access = PL1_R, .type = ARM_CP_NO_RAW,
> +  .readfn = id_aa64pfr0_read,
> +  .writefn = arm_cp_write_ignore },
>  { .name = "ID_AA64PFR1_EL1", .state = ARM_CP_STATE_AA64,
>.opc0 = 3, .opc1 = 0, .crn = 0, .crm = 4, .opc2 = 1,
>.access = PL1_R, .type = ARM_CP_CONST,
> --
> 2.7.4
>
>
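
The runtime-read approach in the quoted patch can be sketched outside QEMU as follows. This is only a minimal illustration under stated assumptions: DemoCPU and demo_id_aa64pfr0_read() are hypothetical stand-ins, not QEMU's ARMCPU or its register callbacks. The idea is that the static ID value is kept as-is and the GIC field (bits [27:24] of ID_AA64PFR0_EL1) is ORed in at read time, once it is known whether a GICv3 is attached.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPU state; not QEMU's ARMCPU. */
typedef struct {
    uint64_t id_aa64pfr0;   /* static ID value fixed when the CPU is created */
    bool gicv3_attached;    /* only known after the machine is wired up */
} DemoCPU;

/* Compute the register value at read time instead of storing a constant. */
static uint64_t demo_id_aa64pfr0_read(const DemoCPU *cpu)
{
    uint64_t pfr0 = cpu->id_aa64pfr0;

    if (cpu->gicv3_attached) {
        /* Use a 64-bit constant so the shift is done in 64-bit arithmetic. */
        pfr0 |= UINT64_C(1) << 24;
    }
    return pfr0;
}
```

A constant register would have to be defined before the GIC is known; reading through a function defers that decision to the first guest access.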



Re: [Qemu-devel] [PATCH for-2.11] util/stats64: Fix min/max comparisons

2017-11-14 Thread Paolo Bonzini

Max Reitz wrote:
> stat64_min_slow() and stat64_max_slow() compare the wrong way.  This
> makes iotest 136 fail with clang and -m32.

Queued, thanks.

Cc: qemu-sta...@nongnu.org

Paolo

> Signed-off-by: Max Reitz 
> ---
>  util/stats64.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/util/stats64.c b/util/stats64.c
> index 9968fcceac..389c365a9e 100644
> --- a/util/stats64.c
> +++ b/util/stats64.c
> @@ -91,7 +91,7 @@ bool stat64_min_slow(Stat64 *s, uint64_t value)
>  low = atomic_read(&s->low);
>  
>  orig = ((uint64_t)high << 32) | low;
> -if (orig < value) {
> +if (value < orig) {
>  /* We have to set low before high, just like stat64_min reads
>   * high before low.  The value may become higher temporarily, but
>   * stat64_get does not notice (it takes the lock) and the only ill
> @@ -120,7 +120,7 @@ bool stat64_max_slow(Stat64 *s, uint64_t value)
>  low = atomic_read(&s->low);
>  
>  orig = ((uint64_t)high << 32) | low;
> -if (orig > value) {
> +if (value > orig) {
>  /* We have to set low before high, just like stat64_max reads
>   * high before low.  The value may become lower temporarily, but
>   * stat64_get does not notice (it takes the lock) and the only ill
> -- 
> 2.13.6
> 
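
For reference, the corrected comparison direction can be sketched like this. It is a simplified illustration, assuming a hypothetical DemoStat64 type with no locking or atomics (the real slow path takes a lock and uses atomic accessors): a minimum is only replaced when the new value is smaller, and a maximum only when the new value is larger.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 64-bit statistic stored as two 32-bit halves. */
typedef struct {
    uint32_t low;
    uint32_t high;
} DemoStat64;

static uint64_t demo_stat64_get(const DemoStat64 *s)
{
    return ((uint64_t)s->high << 32) | s->low;
}

static void demo_stat64_set(DemoStat64 *s, uint64_t value)
{
    s->low = (uint32_t)value;
    s->high = (uint32_t)(value >> 32);
}

/* Returns true if the stored minimum was lowered. */
static bool demo_stat64_min(DemoStat64 *s, uint64_t value)
{
    if (value < demo_stat64_get(s)) {
        demo_stat64_set(s, value);
        return true;
    }
    return false;
}

/* Returns true if the stored maximum was raised. */
static bool demo_stat64_max(DemoStat64 *s, uint64_t value)
{
    if (value > demo_stat64_get(s)) {
        demo_stat64_set(s, value);
        return true;
    }
    return false;
}
```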




Re: [Qemu-devel] [PATCH v2 0/2] e1000e: Reimplement e1000 as a variant of e1000e

2017-11-14 Thread Ed Swierk via Qemu-devel
On Thu, Nov 9, 2017 at 5:53 AM, Daniel P. Berrange  wrote:
> My fear is that this approach of building a new e1000-ng device in
> parallel with having the existing e1000 device is going to cause
> long term pain, possibly never getting to a state where the e1000-ng
> device can replace the e1000 device. Any time there needs to be a
> "big bang" to switch from one impl to another completely different
> impl always causes trouble IME. With need for migration wire format
> & state compatibility, this is even more difficult. From a code review
> POV it will be essentially impossible to have confidence that the new
> impl can be a viable drop-in replacement for the old impl.
>
> Is there really no way that you can change the approach to do a more
> incremental conversion of the existing code base, but still end up in
> the same place at the very end ?
>
> eg just copy all the e1000.c code into the e1000e.c file to start with.
> Then gradually merge functional areas over a longish series of patches
> to eliminate the duplication. This would make it far more practical to
> identify where any regressions come from, and will give reviewers more
> confidence that we're not breaking migration compat.

I agree an incremental conversion is the only realistic way to unearth
potential regressions; testing won't be enough. In the past couple of
days I've run into several cases where e1000 works but e1000-ng
doesn't. Apparently e1000e guest drivers just happen not to tickle
these bugs. So e1000-ng isn't ready for prime time anyway. I'll shelve
this patch series for the moment.

Meanwhile I'll post separate patches for the bugs I've encountered
with e1000 UDP checksum offload.

--Ed



Re: [Qemu-devel] [PATCH v3 for-2.11 3/3] tpm_tis: Return 0 for every register in case of failure mode

2017-11-14 Thread Marc-André Lureau
Hi

On Tue, Nov 14, 2017 at 10:52 PM, Stefan Berger
 wrote:
> Rather than returning ~0, return 0 for every register in case of
> failure mode. The '0' is better to indicate that there's no device
> there.

For most registers, 0 makes more sense. However, I wonder if we
shouldn't just fail to start qemu in this case...

I'm not convinced this is 2.11 material either. Does this fix a specific bug?

> Signed-off-by: Stefan Berger 
> ---
>  hw/tpm/tpm_tis.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
> index fec2fc6..42d647d 100644
> --- a/hw/tpm/tpm_tis.c
> +++ b/hw/tpm/tpm_tis.c
> @@ -545,7 +545,7 @@ static uint64_t tpm_tis_mmio_read(void *opaque, hwaddr addr,
>  uint8_t v;
>
>  if (tpm_backend_had_startup_error(s->be_driver)) {
> -return val;
> +return 0;
>  }
>
>  switch (offset) {
> --
> 2.5.5
>



-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH v3 for-2.11 2/3] tpm_tis: Return TPM_VERSION_UNSPEC in case of BE failure

2017-11-14 Thread Marc-André Lureau
Hi

On Tue, Nov 14, 2017 at 10:52 PM, Stefan Berger
 wrote:
> In case the backend has a failure, such as the tpm_emulator's CMD_INIT
> failing, the TIS goes into failure mode and does not respond to reads
> or writes to MMIO registers. In this case we need to prevent the ACPI
> table from being added and the straight-forward way is to indicate that
> there's no known TPM version being used.
>
> Signed-off-by: Stefan Berger 

Reviewed-by: Marc-André Lureau 

(we can probably iterate to improve the code around that later)

> ---
>  hw/tpm/tpm_tis.c | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
> index 7402528..fec2fc6 100644
> --- a/hw/tpm/tpm_tis.c
> +++ b/hw/tpm/tpm_tis.c
> @@ -1008,6 +1008,10 @@ TPMVersion tpm_tis_get_tpm_version(Object *obj)
>  {
>  TPMState *s = TPM(obj);
>
> +if (tpm_backend_had_startup_error(s->be_driver)) {
> +return TPM_VERSION_UNSPEC;
> +}
> +
>  return tpm_backend_get_tpm_version(s->be_driver);
>  }
>
> --
> 2.5.5
>



-- 
Marc-André Lureau



Re: [Qemu-devel] [PATCH v3 for-2.11 1/3] tpm_emulator: Add a caching layer for the TPM Established flag

2017-11-14 Thread Marc-André Lureau
Hi

On Tue, Nov 14, 2017 at 10:52 PM, Stefan Berger
 wrote:
> Add a caching layer for the TPM established flag so that we don't
> need to go to the emulator every time the flag is read by accessing
> the REG_ACCESS register.

What's the impact? Isn't this just a "small" optimization? In other words,
why is this for-2.11?

> Signed-off-by: Stefan Berger 
>
> v1->v2:
>  - move the caching to the backend layer since detecting the
>TPM 1.2 TSC_ResetEstablishmentBit() command is easier to do
>here.
> ---
>  hw/tpm/tpm_emulator.c | 17 ++---
>  1 file changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/hw/tpm/tpm_emulator.c b/hw/tpm/tpm_emulator.c
> index e1a6810..b293db7 100644
> --- a/hw/tpm/tpm_emulator.c
> +++ b/hw/tpm/tpm_emulator.c
> @@ -73,6 +73,9 @@ typedef struct TPMEmulator {
>  Error *migration_blocker;
>
>  QemuMutex mutex;
> +
> +unsigned int established_flag:1;
> +unsigned int established_flag_cached:1;
>  } TPMEmulator;
>
>
> @@ -287,16 +290,22 @@ static bool tpm_emulator_get_tpm_established_flag(TPMBackend *tb)
>  TPMEmulator *tpm_emu = TPM_EMULATOR(tb);
>  ptm_est est;
>
> -DPRINTF("%s", __func__);
> +if (tpm_emu->established_flag_cached) {
> +return tpm_emu->established_flag;
> +}
> +
>  if (tpm_emulator_ctrlcmd(tpm_emu, CMD_GET_TPMESTABLISHED, &est,
>   0, sizeof(est)) < 0) {
>  error_report("tpm-emulator: Could not get the TPM established flag: %s",
>   strerror(errno));
>  return false;
>  }
> -DPRINTF("established flag: %0x", est.u.resp.bit);
> +DPRINTF("got established flag: %0x", est.u.resp.bit);
> +
> +tpm_emu->established_flag_cached = 1;
> +tpm_emu->established_flag = (est.u.resp.bit != 0);
>
> -return (est.u.resp.bit != 0);
> +return tpm_emu->established_flag;
>  }
>
>  static int tpm_emulator_reset_tpm_established_flag(TPMBackend *tb,
> @@ -327,6 +336,8 @@ static int tpm_emulator_reset_tpm_established_flag(TPMBackend *tb,
>  return -1;
>  }
>
> +tpm_emu->established_flag_cached = 0;
> +
>  return 0;
>  }
>
> --
> 2.5.5
>



-- 
Marc-André Lureau
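
To make the question about impact concrete, the caching pattern in the patch can be sketched as below. This is a minimal illustration with hypothetical names (DemoFlagCache, demo_counting_query), not the tpm_emulator code: the backend is queried only when no cached value exists, and a reset invalidates the cache. Unlike the patch, the sketch does not model a failing backend query.

```c
#include <stdbool.h>

/* Hypothetical cache around an expensive boolean query. */
typedef struct {
    bool (*query)(void *opaque);  /* round trip to the backend */
    void *opaque;
    bool flag;                    /* last value returned by query() */
    bool cached;                  /* true while 'flag' is valid */
} DemoFlagCache;

/* Example backend: counts how often it is asked, always answers true. */
static bool demo_counting_query(void *opaque)
{
    int *calls = opaque;

    (*calls)++;
    return true;
}

static bool demo_flag_get(DemoFlagCache *c)
{
    if (!c->cached) {
        c->flag = c->query(c->opaque);
        c->cached = true;
    }
    return c->flag;
}

/* Called whenever the flag may have changed, e.g. after a reset command. */
static void demo_flag_invalidate(DemoFlagCache *c)
{
    c->cached = false;
}
```

With this shape, repeated register reads cost one backend round trip until the next invalidation.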



Re: [Qemu-devel] [PATCH v2 for-2.11] hw/net/vmxnet3: Fix code to work on big endian hosts, too

2017-11-14 Thread David Gibson
On Tue, 14 Nov 2017 12:20:24 +0100
Thomas Huth  wrote:

> Since commit ab06ec43577177a442e8 we test the vmxnet3 device in the
> pxe-tester, too (when running "make check SPEED=slow"). This now
> revealed that the code is not working there if the host is a big
> endian machine (for example ppc64 or s390x) - "make check SPEED=slow"
> is now failing on such hosts.
> 
> The vmxnet3 code lacks endianness conversions in a couple of places.
> Interestingly, the bitfields in the structs in vmxnet3.h already tried to
> take care of the *bit* endianness of the C compilers - but the code failed
> to change the *byte* endianness when reading or writing the corresponding
> structs. So the bitfields are now wrapped into unions which allow changing
> the byte endianness during runtime with the non-bitfield member of the union.
> With these changes, "make check SPEED=slow" now properly works on big endian
> hosts, too.
> 
> Reported-by: David Gibson 
> Signed-off-by: Thomas Huth 
> ---
>  v2:
>  - Introduced vmxnet3_ring_read_curr_txdesc() & vmxnet3_pci_dma_write_rxcd()
>helper functions to wrap the byte-swapping code that is required in
>multiple places (as suggested by Philippe)
> 
>  hw/net/vmware_utils.h |   6 ++
>  hw/net/vmxnet3.c  |  46 +++---
>  hw/net/vmxnet3.h  | 230 
> ++
>  3 files changed, 181 insertions(+), 101 deletions(-)
> 
> diff --git a/hw/net/vmware_utils.h b/hw/net/vmware_utils.h
> index 5500601..6b1e251 100644
> --- a/hw/net/vmware_utils.h
> +++ b/hw/net/vmware_utils.h
> @@ -83,6 +83,7 @@ vmw_shmem_ld16(PCIDevice *d, hwaddr addr)
>  {
>  uint16_t res;
>  pci_dma_read(d, addr, &res, 2);
> +res = le16_to_cpu(res);

There already exists an ldl_le_pci_dma() function.  If there isn't an
ldw_le_pci_dma() we should make one.

I really dislike swabbing values in place - it makes it much harder to
quickly tell what endianness a variable is supposed to be in, for both
humans and checkers like sparse.
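
A minimal sketch of what I mean, using hypothetical helper names (demo_ldw_le and demo_stw_le are illustrations, not existing QEMU functions): the raw little-endian bytes and the host-order value live in different objects, so no variable ever changes endianness in place.

```c
#include <stdint.h>

/* Decode a little-endian 16-bit value from a raw byte buffer. */
static inline uint16_t demo_ldw_le(const uint8_t *buf)
{
    return (uint16_t)(buf[0] | (buf[1] << 8));
}

/* Encode a host-order 16-bit value into a raw little-endian byte buffer. */
static inline void demo_stw_le(uint8_t *buf, uint16_t value)
{
    buf[0] = (uint8_t)(value & 0xff);
    buf[1] = (uint8_t)(value >> 8);
}
```

The byte buffer is always little-endian and the uint16_t is always host order, which behaves the same on big- and little-endian hosts.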

>  VMW_SHPRN("SHMEM load16: %" PRIx64 " (value 0x%X)", addr, res);
>  return res;
>  }
> @@ -91,6 +92,7 @@ static inline void
>  vmw_shmem_st16(PCIDevice *d, hwaddr addr, uint16_t value)
>  {
>  VMW_SHPRN("SHMEM store16: %" PRIx64 " (value 0x%X)", addr, value);
> +value = cpu_to_le16(value);
>  pci_dma_write(d, addr, &value, 2);
>  }
>  
> @@ -99,6 +101,7 @@ vmw_shmem_ld32(PCIDevice *d, hwaddr addr)
>  {
>  uint32_t res;
>  pci_dma_read(d, addr, &res, 4);
> +res = le32_to_cpu(res);
>  VMW_SHPRN("SHMEM load32: %" PRIx64 " (value 0x%X)", addr, res);
>  return res;
>  }
> @@ -107,6 +110,7 @@ static inline void
>  vmw_shmem_st32(PCIDevice *d, hwaddr addr, uint32_t value)
>  {
>  VMW_SHPRN("SHMEM store32: %" PRIx64 " (value 0x%X)", addr, value);
> +value = cpu_to_le32(value);
>  pci_dma_write(d, addr, &value, 4);
>  }
>  
> @@ -115,6 +119,7 @@ vmw_shmem_ld64(PCIDevice *d, hwaddr addr)
>  {
>  uint64_t res;
>  pci_dma_read(d, addr, &res, 8);
> +res = le64_to_cpu(res);
>  VMW_SHPRN("SHMEM load64: %" PRIx64 " (value %" PRIx64 ")", addr, res);
>  return res;
>  }
> @@ -123,6 +128,7 @@ static inline void
>  vmw_shmem_st64(PCIDevice *d, hwaddr addr, uint64_t value)
>  {
>  VMW_SHPRN("SHMEM store64: %" PRIx64 " (value %" PRIx64 ")", addr, value);
> +value = cpu_to_le64(value);
>  pci_dma_write(d, addr, &value, 8);
>  }
>  
> diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
> index 8c4bae5..bfb066a 100644
> --- a/hw/net/vmxnet3.c
> +++ b/hw/net/vmxnet3.c
> @@ -222,7 +222,7 @@ vmxnet3_dump_tx_descr(struct Vmxnet3_TxDesc *descr)
>"addr %" PRIx64 ", len: %d, gen: %d, rsvd: %d, "
>"dtype: %d, ext1: %d, msscof: %d, hlen: %d, om: %d, "
>"eop: %d, cq: %d, ext2: %d, ti: %d, tci: %d",
> -  le64_to_cpu(descr->addr), descr->len, descr->gen, descr->rsvd,
> +  descr->addr, descr->len, descr->gen, descr->rsvd,
>descr->dtype, descr->ext1, descr->msscof, descr->hlen, 
> descr->om,
>descr->eop, descr->cq, descr->ext2, descr->ti, descr->tci);
>  }
> @@ -241,7 +241,7 @@ vmxnet3_dump_rx_descr(struct Vmxnet3_RxDesc *descr)
>  {
>  VMW_PKPRN("RX DESCR: addr %" PRIx64 ", len: %d, gen: %d, rsvd: %d, "
>"dtype: %d, ext1: %d, btype: %d",
> -  le64_to_cpu(descr->addr), descr->len, descr->gen,
> +  descr->addr, descr->len, descr->gen,
>descr->rsvd, descr->dtype, descr->ext1, descr->btype);
>  }
>  
> @@ -535,7 +535,8 @@ static void vmxnet3_complete_packet(VMXNET3State *s, int 
> qidx, uint32_t tx_ridx)
>  memset(&txcq_descr, 0, sizeof(txcq_descr));
>  txcq_descr.txdIdx = tx_ridx;
>  txcq_descr.gen = vmxnet3_ring_curr_gen(&s->txq_descr[qidx].comp_ring);
> -
> +txcq_descr.val1 = cpu_to_le32(txcq_descr.val1);
> +txcq_descr.val2 = cpu_to_le32(txcq_descr.val2);
>  vmxnet3_ring_write_curr_cell(d, 

Re: [Qemu-devel] [PATCH for-2.11? v7 0/6] block: Don't compare strings in bdrv_reopen_prepare()

2017-11-14 Thread Max Reitz
On 2017-11-14 19:01, Max Reitz wrote:
> bdrv_reopen_prepare() assumes that all BDS options are strings, which is
> not necessarily correct. This series introduces a new qobject_is_equal()
> function which can be used to test whether any options have changed,
> independently of their type.

Aaand once again applied to my block branch.

(As always, thank you for reviewing, Eric -- I'm happy to have expanded
your knowledge on obscure C behavior. :-))

Max





[Qemu-devel] [PATCH] net: Transmit zero UDP checksum as 0xFFFF

2017-11-14 Thread Ed Swierk via Qemu-devel
The checksum algorithm used by IPv4, TCP and UDP allows a zero value
to be represented by either 0x0000 or 0xFFFF. But per RFC 768, a zero
UDP checksum must be transmitted as 0xFFFF, as 0x0000 is a special
value meaning no checksum.

Substitute 0xFFFF whenever a checksum is computed as zero on a UDP
datagram. Doing this on IPv4 packets and TCP segments is unnecessary
but legal.

(While it is tempting to make the substitution in
net_checksum_finish(), that function is also used by receivers to
verify checksums, and in that case the expected value is always
0x0000.)

Signed-off-by: Ed Swierk 
---
 hw/net/e1000.c  | 5 +++--
 hw/net/net_rx_pkt.c | 3 +++
 hw/net/net_tx_pkt.c | 6 ++
 hw/net/vmxnet3.c| 7 +--
 4 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index d642314..97242a1 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -505,8 +505,9 @@ putsum(uint8_t *data, uint32_t n, uint32_t sloc, uint32_t css, uint32_t cse)
 if (cse && cse < n)
 n = cse + 1;
 if (sloc < n-1) {
-sum = net_checksum_add(n-css, data+css);
-stw_be_p(data + sloc, net_checksum_finish(sum));
+sum = net_raw_checksum(data + css, n - css);
+/* For UDP, zero checksum must be sent as 0xFFFF */
+stw_be_p(data + sloc, sum ? sum : 0xFFFF);
 }
 }
 
diff --git a/hw/net/net_rx_pkt.c b/hw/net/net_rx_pkt.c
index 1019b50..e820132 100644
--- a/hw/net/net_rx_pkt.c
+++ b/hw/net/net_rx_pkt.c
@@ -588,6 +588,9 @@ bool net_rx_pkt_fix_l4_csum(struct NetRxPkt *pkt)
 
 /* Calculate L4 checksum */
 csum = cpu_to_be16(_net_rx_pkt_calc_l4_csum(pkt));
+if (!csum) {
+csum = 0xFFFF; /* For UDP, zero checksum must be sent as 0xFFFF */
+}
 
 /* Set calculated checksum to checksum word */
 iov_from_buf(pkt->vec, pkt->vec_len,
diff --git a/hw/net/net_tx_pkt.c b/hw/net/net_tx_pkt.c
index 20b2549..21194e7 100644
--- a/hw/net/net_tx_pkt.c
+++ b/hw/net/net_tx_pkt.c
@@ -136,6 +136,9 @@ void net_tx_pkt_update_ip_checksums(struct NetTxPkt *pkt)
 return;
 }
 
+if (!csum) {
+csum = 0xFFFF; /* For UDP, zero checksum must be sent as 0xFFFF */
+}
 iov_from_buf(&pkt->vec[NET_TX_PKT_PL_START_FRAG], pkt->payload_frags,
  pkt->virt_hdr.csum_offset, &csum, sizeof(csum));
 }
@@ -487,6 +490,9 @@ static void net_tx_pkt_do_sw_csum(struct NetTxPkt *pkt)
 
 /* Put the checksum obtained into the packet */
 csum = cpu_to_be16(net_checksum_finish(csum_cntr));
+if (!csum) {
+csum = 0xFFFF; /* For UDP, zero checksum must be sent as 0xFFFF */
+}
 iov_from_buf(iov, iov_len, csum_offset, &csum, sizeof csum);
 }
 
diff --git a/hw/net/vmxnet3.c b/hw/net/vmxnet3.c
index 90f6943..de9c40e 100644
--- a/hw/net/vmxnet3.c
+++ b/hw/net/vmxnet3.c
@@ -942,6 +942,7 @@ static void vmxnet3_rx_need_csum_calculate(struct NetRxPkt 
*pkt,
 bool isip4, isip6, istcp, isudp;
 uint8_t *data;
 int len;
+uint16_t sum;
 
 if (!net_rx_pkt_has_virt_hdr(pkt)) {
 return;
@@ -969,8 +970,10 @@ static void vmxnet3_rx_need_csum_calculate(struct NetRxPkt 
*pkt,
 
 data = (uint8_t *)pkt_data + vhdr->csum_start;
 len = pkt_len - vhdr->csum_start;
-/* Put the checksum obtained into the packet */
-stw_be_p(data + vhdr->csum_offset, net_raw_checksum(data, len));
+sum = net_raw_checksum(data, len);
+/* Put the checksum obtained into the packet; for UDP, zero checksum */
+/* must be sent as 0xFFFF */
+stw_be_p(data + vhdr->csum_offset, sum ? sum : 0xFFFF);
 
 vhdr->flags &= ~VIRTIO_NET_HDR_F_NEEDS_CSUM;
 vhdr->flags |= VIRTIO_NET_HDR_F_DATA_VALID;
-- 
1.9.1
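
The rule from the commit message can be sketched in isolation as follows. This is a minimal illustration with hypothetical helper names (demo_inet_checksum and demo_udp_tx_checksum are not QEMU functions): the ones' complement checksum is computed as usual, and only the transmit path maps a computed zero to 0xFFFF.

```c
#include <stddef.h>
#include <stdint.h>

/* Ones' complement Internet checksum over big-endian 16-bit words. */
static uint16_t demo_inet_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2) {
        sum += (uint32_t)((data[i] << 8) | data[i + 1]);
    }
    if (i < len) {
        sum += (uint32_t)(data[i] << 8);   /* pad an odd trailing byte */
    }
    while (sum >> 16) {
        sum = (sum & 0xffff) + (sum >> 16); /* fold the carries back in */
    }
    return (uint16_t)~sum;
}

/* Transmit side only: a computed zero is sent as 0xFFFF (RFC 768). */
static uint16_t demo_udp_tx_checksum(uint16_t csum)
{
    return csum ? csum : 0xFFFF;
}
```

A receiver that runs demo_inet_checksum() over data that already contains a valid checksum gets zero, which is why the substitution cannot live in the shared checksum routine.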




Re: [Qemu-devel] [PATCH] qapi: block-core: Clarify events emitted by 'block-job-cancel'

2017-11-14 Thread no-reply
Hi,

This series failed build test on ppc host. Please find the details below.

Type: series
Subject: [Qemu-devel] [PATCH] qapi: block-core: Clarify events emitted by 
'block-job-cancel'
Message-id: 20171114191605.22349-1-kcham...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e
echo "=== ENV ==="
env
echo "=== PACKAGES ==="
rpm -qa
echo "=== TEST BEGIN ==="
INSTALL=$PWD/install
BUILD=$PWD/build
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --prefix=$INSTALL
make -j100
# XXX: we need reliable clean up
# make check -j100 V=1
make install
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20171114191605.22349-1-kcham...@redhat.com -> 
patchew/20171114191605.22349-1-kcham...@redhat.com
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Submodule 'pixman' (git://anongit.freedesktop.org/pixman) registered for path 
'pixman'
Submodule 'roms/SLOF' (git://git.qemu-project.org/SLOF.git) registered for path 
'roms/SLOF'
Submodule 'roms/ipxe' (git://git.qemu-project.org/ipxe.git) registered for path 
'roms/ipxe'
Submodule 'roms/openbios' (git://git.qemu-project.org/openbios.git) registered 
for path 'roms/openbios'
Submodule 'roms/openhackware' (git://git.qemu-project.org/openhackware.git) 
registered for path 'roms/openhackware'
Submodule 'roms/qemu-palcode' (git://github.com/rth7680/qemu-palcode.git) 
registered for path 'roms/qemu-palcode'
Submodule 'roms/seabios' (git://git.qemu-project.org/seabios.git/) registered 
for path 'roms/seabios'
Submodule 'roms/sgabios' (git://git.qemu-project.org/sgabios.git) registered 
for path 'roms/sgabios'
Submodule 'roms/u-boot' (git://git.qemu-project.org/u-boot.git) registered for 
path 'roms/u-boot'
Submodule 'roms/vgabios' (git://git.qemu-project.org/vgabios.git/) registered 
for path 'roms/vgabios'
Cloning into 'dtc'...
Submodule path 'dtc': checked out '65cc4d2748a2c2e6f27f1cf39e07a5dbabd80ebf'
Cloning into 'pixman'...
Submodule path 'pixman': checked out '87eea99e443b389c978cf37efc52788bf03a0ee0'
Cloning into 'roms/SLOF'...
Submodule path 'roms/SLOF': checked out 
'e3d05727a074619fc12d0a67f05cf2c42c875cce'
Cloning into 'roms/ipxe'...
Submodule path 'roms/ipxe': checked out 
'04186319181298083ef28695a8309028b26fe83c'
Cloning into 'roms/openbios'...
Submodule path 'roms/openbios': checked out 
'e79bca64838c96ec44fd7acd508879c5284233dd'
Cloning into 'roms/openhackware'...
Submodule path 'roms/openhackware': checked out 
'c559da7c8eec5e45ef1f67978827af6f0b9546f5'
Cloning into 'roms/qemu-palcode'...
Submodule path 'roms/qemu-palcode': checked out 
'c87a92639b28ac42bc8f6c67443543b405dc479b'
Cloning into 'roms/seabios'...
Submodule path 'roms/seabios': checked out 
'e2fc41e24ee0ada60fc511d60b15a41b294538be'
Cloning into 'roms/sgabios'...
Submodule path 'roms/sgabios': checked out 
'23d474943dcd55d0550a3d20b3d30e9040a4f15b'
Cloning into 'roms/u-boot'...
Submodule path 'roms/u-boot': checked out 
'2072e7262965bb48d7fffb1e283101e6ed8b21a8'
Cloning into 'roms/vgabios'...
Submodule path 'roms/vgabios': checked out 
'19ea12c230ded95928ecaef0db47a82231c2e485'
warning: unable to rmdir pixman: Directory not empty
Switched to a new branch 'test'
M   dtc
M   roms/SLOF
M   roms/ipxe
M   roms/openbios
M   roms/qemu-palcode
M   roms/seabios
M   roms/sgabios
M   roms/u-boot
375c460 qapi: block-core: Clarify events emitted by 'block-job-cancel'

=== OUTPUT BEGIN ===
=== ENV ===
XDG_SESSION_ID=31750
SHELL=/bin/sh
USER=patchew
PATCHEW=/home/patchew/patchew/patchew-cli -s http://patchew.org --nodebug
PATH=/usr/bin:/bin
PWD=/var/tmp/patchew-tester-tmp-x1o6jiab/src
LANG=en_US.UTF-8
HOME=/home/patchew
SHLVL=2
LOGNAME=patchew
XDG_RUNTIME_DIR=/run/user/1000
_=/usr/bin/env
=== PACKAGES ===
plymouth-core-libs-0.8.9-0.28.20140113.el7.centos.ppc64le
vim-common-7.4.160-2.el7.ppc64le
perl-Test-Simple-0.98-243.el7.noarch
hplip-common-3.15.9-3.el7.ppc64le
valgrind-3.12.0-8.el7.ppc64le
gamin-0.1.10-16.el7.ppc64le
libpeas-loader-python-1.20.0-1.el7.ppc64le
telepathy-filesystem-0.0.2-6.el7.noarch
colord-libs-1.3.4-1.el7.ppc64le
kbd-legacy-1.15.5-13.el7.noarch
perl-CPAN-Meta-YAML-0.008-14.el7.noarch
libvirt-daemon-driver-nwfilter-3.2.0-14.el7.ppc64le
ntsysv-1.7.4-1.el7.ppc64le
kernel-bootwrapper-3.10.0-693.el7.ppc64le
telepathy-farstream-0.6.0-5.el7.ppc64le
kdenetwork-common-4.10.5-8.el7_0.noarch
elfutils-devel-0.168-8.el7.ppc64le
pm-utils-1.4.1-27.el7.ppc64le
perl-Error-0.17020-2.el7.noarch
usbmuxd-1.1.0-1.el7.ppc64le
bzip2-devel-1.0.6-13.el7.ppc64le
blktrace-1.0.5-8.el7.ppc64le
gnome-keyring-pam-3.20.0-3.el7.ppc64le
tzdata-java-2017b-1.el7.noarch
perl-devel-5.16.3-292.el7.ppc64le
gnome-getting-started-docs-3.22.0-1.el7.noarch
perl-Log-Message-Simple-0.10-2.el7.noarch

[Qemu-devel] [PATCH 2/2] e1000: Separate TSO and non-TSO contexts, fixing UDP TX corruption

2017-11-14 Thread Ed Swierk via Qemu-devel
The device is supposed to maintain two distinct contexts for transmit
offloads: one has parameters for both segmentation and checksum
offload, the other only for checksum offload. The guest driver can
send two context descriptors, one for each context (the TSE flag
specifies which). Then the guest can refer to one or the other context
in subsequent transmit data descriptors, depending on what offloads it
wants applied to each packet.

Currently the e1000 device stores just one context, and misinterprets
the TSE flags in the context and data descriptors. This is often okay:
Linux happens to send a fresh context descriptor before every data
descriptor, so forgetting the other context doesn't matter. Windows
does rely on separate contexts for TSO vs. non-TSO packets, but for
mostly-TCP traffic the two contexts have identical TCP-specific
offload parameters so confusing them doesn't matter.

One case where this confusion matters is when a Windows guest sets up
a TSO context for TCP and a non-TSO context for UDP, and then
transmits both TCP and UDP traffic in parallel. The e1000 device
sometimes ends up using TCP-specific parameters while doing checksum
offload on a UDP datagram: it writes the checksum to offset 16 (the
correct location for a TCP checksum), stomping on two bytes of UDP
data, and leaving the wrong value in the actual UDP checksum field at
offset 6. (Even worse, the host network stack may then recompute the
UDP checksum, "correcting" it to match the corrupt data before sending
it out a physical interface.)
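
The corruption described above can be illustrated with a small sketch. The names are hypothetical and the offsets are the ones quoted in this message (6 for the UDP checksum field, 16 for the TCP checksum field), both relative to the start of the L4 header.

```c
#include <stdint.h>

enum {
    DEMO_UDP_HDR_LEN  = 8,   /* UDP header size in bytes */
    DEMO_UDP_CSUM_OFF = 6,   /* checksum field within the UDP header */
    DEMO_TCP_CSUM_OFF = 16   /* checksum field within the TCP header */
};

/* Store a big-endian 16-bit checksum at 'off' bytes into the L4 header. */
static void demo_put_csum(uint8_t *l4, unsigned off, uint16_t csum)
{
    l4[off] = (uint8_t)(csum >> 8);
    l4[off + 1] = (uint8_t)(csum & 0xff);
}
```

Calling demo_put_csum(dgram, DEMO_TCP_CSUM_OFF, csum) on a UDP datagram leaves bytes 6 and 7 untouched and overwrites payload bytes 8 and 9 (offset 16 minus the 8-byte header).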

Correct this by tracking the TSO context independently of the non-TSO
context, and selecting the appropriate context based on the TSE flag
in each transmit data descriptor.

Signed-off-by: Ed Swierk 
---
 hw/net/e1000.c | 70 +-
 1 file changed, 40 insertions(+), 30 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 471cdd9..d642314 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -101,6 +101,7 @@ typedef struct E1000State_st {
 unsigned char sum_needed;
 bool cptse;
 e1000x_txd_props props;
+e1000x_txd_props tso_props;
 uint16_t tso_frames;
 } tx;
 
@@ -541,35 +542,37 @@ xmit_seg(E1000State *s)
 uint16_t len;
 unsigned int frames = s->tx.tso_frames, css, sofar;
 struct e1000_tx *tp = &s->tx;
+struct e1000x_txd_props *props = tp->cptse ? &tp->tso_props : &tp->props;
 
-if (tp->props.tse && tp->cptse) {
-css = tp->props.ipcss;
+if (tp->cptse) {
+css = props->ipcss;
 DBGOUT(TXSUM, "frames %d size %d ipcss %d\n",
frames, tp->size, css);
-if (tp->props.ip) {/* IPv4 */
+if (props->ip) {/* IPv4 */
 stw_be_p(tp->data+css+2, tp->size - css);
 stw_be_p(tp->data+css+4,
  lduw_be_p(tp->data + css + 4) + frames);
 } else { /* IPv6 */
 stw_be_p(tp->data+css+4, tp->size - css);
 }
-css = tp->props.tucss;
+css = props->tucss;
 len = tp->size - css;
-DBGOUT(TXSUM, "tcp %d tucss %d len %d\n", tp->props.tcp, css, len);
-if (tp->props.tcp) {
-sofar = frames * tp->props.mss;
+DBGOUT(TXSUM, "tcp %d tucss %d len %d\n", props->tcp, css, len);
+if (props->tcp) {
+sofar = frames * props->mss;
 stl_be_p(tp->data+css+4, ldl_be_p(tp->data+css+4)+sofar); /* seq */
-if (tp->props.paylen - sofar > tp->props.mss) {
+if (props->paylen - sofar > props->mss) {
 tp->data[css + 13] &= ~9;/* PSH, FIN */
 } else if (frames) {
 e1000x_inc_reg_if_not_full(s->mac_reg, TSCTC);
 }
-} else/* UDP */
+} else {/* UDP */
 stw_be_p(tp->data+css+4, len);
+}
 if (tp->sum_needed & E1000_TXD_POPTS_TXSM) {
 unsigned int phsum;
 // add pseudo-header length before checksum calculation
-void *sp = tp->data + tp->props.tucso;
+void *sp = tp->data + props->tucso;
 
 phsum = lduw_be_p(sp) + len;
 phsum = (phsum >> 16) + (phsum & 0xffff);
@@ -579,12 +582,10 @@ xmit_seg(E1000State *s)
 }
 
 if (tp->sum_needed & E1000_TXD_POPTS_TXSM) {
-putsum(tp->data, tp->size, tp->props.tucso,
-   tp->props.tucss, tp->props.tucse);
+putsum(tp->data, tp->size, props->tucso, props->tucss, props->tucse);
 }
 if (tp->sum_needed & E1000_TXD_POPTS_IXSM) {
-putsum(tp->data, tp->size, tp->props.ipcso,
-   tp->props.ipcss, tp->props.ipcse);
+putsum(tp->data, tp->size, props->ipcso, props->ipcss, props->ipcse);
 }
 if (tp->vlan_needed) {
 memmove(tp->vlan, tp->data, 4);
@@ -616,11 +617,11 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 
 s->mit_ide |= (txd_lower & E1000_TXD_CMD_IDE);
   

[Qemu-devel] [PATCH 1/2] e1000, e1000e: Move per-packet TX offload flags out of context state

2017-11-14 Thread Ed Swierk via Qemu-devel
sum_needed and cptse flags are received from the guest within each
transmit data descriptor. They are not part of the offload context;
instead, they determine how to apply a previously received context to
the packet being transmitted:

- If cptse is set, perform both segmentation and checksum offload
  using the parameters in the TSO context; otherwise just do checksum
  offload. (Currently the e1000 device incorrectly stores only one
  context, which will be fixed in a subsequent patch.)

- Depending on the bits set in sum_needed, possibly perform L4
  checksum offload and/or IP checksum offload, using the parameters in
  the appropriate context.

Move these flags out of struct e1000x_txd_props, which is otherwise
dedicated to storing values from a context descriptor, and into the
per-packet TX struct.

Signed-off-by: Ed Swierk 
---
 hw/net/e1000.c | 30 --
 hw/net/e1000e.c|  4 ++--
 hw/net/e1000e_core.c   | 16 
 hw/net/e1000e_core.h   |  2 ++
 hw/net/e1000x_common.h |  2 --
 5 files changed, 28 insertions(+), 26 deletions(-)

diff --git a/hw/net/e1000.c b/hw/net/e1000.c
index 9324949..471cdd9 100644
--- a/hw/net/e1000.c
+++ b/hw/net/e1000.c
@@ -98,6 +98,8 @@ typedef struct E1000State_st {
 unsigned char data[0x10000];
 uint16_t size;
 unsigned char vlan_needed;
+unsigned char sum_needed;
+bool cptse;
 e1000x_txd_props props;
 uint16_t tso_frames;
 } tx;
@@ -540,7 +542,7 @@ xmit_seg(E1000State *s)
 unsigned int frames = s->tx.tso_frames, css, sofar;
 struct e1000_tx *tp = &s->tx;
 
-if (tp->props.tse && tp->props.cptse) {
+if (tp->props.tse && tp->cptse) {
 css = tp->props.ipcss;
 DBGOUT(TXSUM, "frames %d size %d ipcss %d\n",
frames, tp->size, css);
@@ -564,7 +566,7 @@ xmit_seg(E1000State *s)
 }
 } else/* UDP */
 stw_be_p(tp->data+css+4, len);
-if (tp->props.sum_needed & E1000_TXD_POPTS_TXSM) {
+if (tp->sum_needed & E1000_TXD_POPTS_TXSM) {
 unsigned int phsum;
 // add pseudo-header length before checksum calculation
 void *sp = tp->data + tp->props.tucso;
@@ -576,11 +578,11 @@ xmit_seg(E1000State *s)
 tp->tso_frames++;
 }
 
-if (tp->props.sum_needed & E1000_TXD_POPTS_TXSM) {
+if (tp->sum_needed & E1000_TXD_POPTS_TXSM) {
 putsum(tp->data, tp->size, tp->props.tucso,
tp->props.tucss, tp->props.tucse);
 }
-if (tp->props.sum_needed & E1000_TXD_POPTS_IXSM) {
+if (tp->sum_needed & E1000_TXD_POPTS_IXSM) {
 putsum(tp->data, tp->size, tp->props.ipcso,
tp->props.ipcss, tp->props.ipcse);
 }
@@ -624,17 +626,17 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 } else if (dtype == (E1000_TXD_CMD_DEXT | E1000_TXD_DTYP_D)) {
 // data descriptor
 if (tp->size == 0) {
-tp->props.sum_needed = le32_to_cpu(dp->upper.data) >> 8;
+tp->sum_needed = le32_to_cpu(dp->upper.data) >> 8;
 }
-tp->props.cptse = (txd_lower & E1000_TXD_CMD_TSE) ? 1 : 0;
+tp->cptse = (txd_lower & E1000_TXD_CMD_TSE) ? 1 : 0;
 } else {
 // legacy descriptor
-tp->props.cptse = 0;
+tp->cptse = 0;
 }
 
 if (e1000x_vlan_enabled(s->mac_reg) &&
 e1000x_is_vlan_txd(txd_lower) &&
-(tp->props.cptse || txd_lower & E1000_TXD_CMD_EOP)) {
+(tp->cptse || txd_lower & E1000_TXD_CMD_EOP)) {
 tp->vlan_needed = 1;
 stw_be_p(tp->vlan_header,
   le16_to_cpu(s->mac_reg[VET]));
@@ -643,7 +645,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 }
 
 addr = le64_to_cpu(dp->buffer_addr);
-if (tp->props.tse && tp->props.cptse) {
+if (tp->props.tse && tp->cptse) {
 msh = tp->props.hdr_len + tp->props.mss;
 do {
 bytes = split_size;
@@ -665,7 +667,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 }
 split_size -= bytes;
 } while (bytes && split_size);
-} else if (!tp->props.tse && tp->props.cptse) {
+} else if (!tp->props.tse && tp->cptse) {
 // context descriptor TSE is not set, while data descriptor TSE is set
 DBGOUT(TXERR, "TCP segmentation error\n");
 } else {
@@ -676,14 +678,14 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
 
 if (!(txd_lower & E1000_TXD_CMD_EOP))
 return;
-if (!(tp->props.tse && tp->props.cptse && tp->size < tp->props.hdr_len)) {
+if (!(tp->props.tse && tp->cptse && tp->size < tp->props.hdr_len)) {
 xmit_seg(s);
 }
 tp->tso_frames = 0;
-tp->props.sum_needed = 0;
+tp->sum_needed = 0;
 tp->vlan_needed = 0;
 tp->size = 0;
-tp->props.cptse = 0;
+tp->cptse = 0;
 }
 
 static uint32_t
@@ -1459,7 +1461,7 @@ static const VMStateDescription 

[Qemu-devel] [PATCH 0/2] e1000: Correct TX offload context handling

2017-11-14 Thread Ed Swierk via Qemu-devel
The transmit offload implementation in QEMU's e1000 device is
deficient and causes packet data corruption in some situations.

According to the Intel 8254x software developer's manual[1], the
device maintains two separate contexts: the TCP segmentation offload
context includes parameters for both segmentation offload and checksum
offload, and the normal (checksum-offload-only) context includes only
checksum offload parameters. These parameters specify over which
packet data to compute the checksum, and where in the packet to store
the computed checksum(s).

[1] 
https://www.intel.com/content/dam/doc/manual/pci-pci-x-family-gbe-controllers-software-dev-manual.pdf

The e1000 driver can update either of these contexts by sending a
transmit context descriptor. The TSE bit in the TUCMD field controls
which context is modified by the descriptor. Crucially, a transmit
context descriptor with TSE=1 changes only the TSO context, leaving
the non-TSO context unchanged; with TSE=0 the opposite is true.

Fields in the transmit data descriptor determine which (if either) of
these two contexts the device uses when actually transmitting some
data:

- If the TSE bit in the DCMD field is set, then the device performs
  TCP segmentation offload using the parameters previously set in the
  TSO context. In addition, if TXSM and/or IXSM is set in the POPTS
  field, the device performs the appropriate checksum offloads using
  the parameters in the same (TSO) context.

- Otherwise, if the TSE bit in the DCMD field is clear, then there is
  no TCP segmentation offload. If TXSM and/or IXSM is set in the POPTS
  field, the device performs the appropriate checksum offloads using
  the parameters in the non-TSO context.
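
As a rough illustration of the two-context model described above, here is a minimal sketch. It is not QEMU code: the struct and function names (OffloadCtx, TxContexts, load_context, select_context) are made up for this example, and only two of the offload parameters are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified offload parameters (not QEMU's real struct). */
typedef struct {
    uint8_t tucss;  /* offset where checksumming starts */
    uint8_t tucso;  /* offset where the computed checksum is stored */
    bool valid;     /* set once a context descriptor has been seen */
} OffloadCtx;

typedef struct {
    OffloadCtx tso;      /* updated by context descriptors with TSE=1 */
    OffloadCtx non_tso;  /* updated by context descriptors with TSE=0 */
} TxContexts;

/* A context descriptor updates exactly one context, chosen by TUCMD.TSE. */
static void load_context(TxContexts *c, bool tucmd_tse,
                         uint8_t tucss, uint8_t tucso)
{
    OffloadCtx *dst = tucmd_tse ? &c->tso : &c->non_tso;

    dst->tucss = tucss;
    dst->tucso = tucso;
    dst->valid = true;
}

/* A data descriptor picks the context to use by DCMD.TSE. */
static const OffloadCtx *select_context(const TxContexts *c, bool dcmd_tse)
{
    assert(c != NULL);
    return dcmd_tse ? &c->tso : &c->non_tso;
}
```

With this layout, loading a TSO context never disturbs the parameters that a later TSE=0 data descriptor will use.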

The e1000 driver is free to set up the TSO and non-TSO contexts and
then transmit a mixture of data, with each data descriptor using a
different (or neither) context. This is what the e1000 driver for
Windows (Intel(R) PRO/1000 MT Network Connection, aka E1G6023E.sys)
does in certain cases, sometimes with quite undesirable results, since
the QEMU e1000 device doesn't work as described above.

Instead, the QEMU e1000 device maintains only one context in its state
structure. When it receives a transmit context descriptor from the
driver, it overwrites the context parameters regardless of the TSE bit
in the TUCMD field.

To see why this is wrong, suppose the driver first sets up a non-TSO
context with UDP checksum offload parameters (say, TUCSO pointing to
the appropriate offset for a UDP checksum, 6 bytes into the header),
and then sets up a TSO context with TCP checksum offload parameters
(TUCSO pointing to the appropriate offset for a TCP checksum, 16 bytes
into the header). The driver then sends a transmit data descriptor
with TSE=0 and TXSM=1 along with a UDP datagram. The QEMU e1000 device
computes the checksum using the last set of checksum offload
parameters, and writes the checksum to offset 16, stomping on two
bytes of UDP data, and leaving the wrong checksum in the UDP checksum
field.
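
The arithmetic in this scenario can be sketched as follows. This is only a model of a device that keeps a single context, assuming a 14-byte Ethernet header and a 20-byte IPv4 header (so the L4 header starts at offset 34); the names SingleCtx, single_ctx_load and csum_offset_in_l4_header are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a device that keeps only ONE offload context. */
typedef struct {
    uint8_t tucss;  /* checksum start: beginning of the L4 header */
    uint8_t tucso;  /* absolute offset where the checksum is written */
} SingleCtx;

/* The TSE bit is ignored: every context descriptor overwrites the state. */
static void single_ctx_load(SingleCtx *c, bool tucmd_tse,
                            uint8_t tucss, uint8_t tucso)
{
    (void)tucmd_tse;
    assert(tucso >= tucss);
    c->tucss = tucss;
    c->tucso = tucso;
}

/* Offset of the checksum field relative to the start of the L4 header. */
static unsigned csum_offset_in_l4_header(const SingleCtx *c)
{
    return (unsigned)(c->tucso - c->tucss);
}
```

Loading the UDP parameters (34, 40) and then the TCP parameters (34, 50) leaves the checksum offset at 16 bytes into the L4 header. For a UDP datagram, whose header is only 8 bytes long, that lands in the payload.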

To make matters worse, if the network stack on the host running QEMU
treats data transmitted from a VM as locally originated, it may do its
own UDP checksum computation, "correcting" it to match the corrupt
data before sending it on the wire. Now the corrupt UDP packet makes
its way all the way to the peer application.

I have reproduced this behavior with a Windows 10 guest, rather easily
with a TCP iperf and a UDP iperf running in parallel. With the
patchlet below, you'll see an error message whenever the bug is
triggered.


 --- a/hw/net/e1000.c
 +++ b/hw/net/e1000.c
 @@ -534,6 +534,30 @@ e1000_send_packet(E1000State *s, const uint8_t *buf, int size)
  }
  
  static void
 +debug_csum(struct e1000_tx *tp, uint16_t oldsum)
 +{
 +e1000x_txd_props *props = &tp->props;
 +uint8_t proto = tp->data[14 + 9];
 +uint32_t sumoff = props->tucso - props->tucss;
 +
 +if ((proto == 17 && sumoff != 6) ||
 +(proto == 6 && sumoff != 16)) {
 +DBGOUT(TXERR, "txsum bug! ver %d src %08x dst %08x len %d proto %d "
 +   "cptse %d sum_needed %x oldsum %x newsum %x realsum %x\n",
 +   tp->data[14] >> 4,
 +   ldl_be_p(tp->data + 14 + 12),
 +   ldl_be_p(tp->data + 14 + 16),
 +   lduw_be_p(tp->data + 14 + 2),
 +   proto,
 +   props->cptse,
 +   props->sum_needed,
 +   oldsum,
 +   lduw_be_p(tp->data + props->tucso),
 +   lduw_be_p(tp->data + props->tucss + (proto == 6 ? 16 : 6)));
 +}
 +}
 +
 +static void
  xmit_seg(E1000State *s)
  {
  uint16_t len;
 @@ -577,8 +601,10 @@ xmit_seg(E1000State *s)
  }
  
  if (tp->props.sum_needed & E1000_TXD_POPTS_TXSM) {
 +uint16_t oldsum = lduw_be_p(tp->data + tp->props.tucso);
  putsum(tp->data, tp->size, tp->props.tucso,
 tp->props.tucss, tp->props.tucse);
 +debug_csum(tp, oldsum); /* FIXME: remove */
   

[Qemu-devel] [PATCH for-2.11] util/stats64: Fix min/max comparisons

2017-11-14 Thread Max Reitz
stat64_min_slow() and stat64_max_slow() compare the wrong way.  This
makes iotest 136 fail with clang and -m32.
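
For reference, the intended direction of the two comparisons can be sketched with a simplified, non-atomic stand-in. Everything here is hypothetical (Stat64Sketch and the sketch_* helpers are not the util/stats64.c API) and the locking of the real slow path is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, non-atomic stand-in for a 64-bit statistic split in halves. */
typedef struct {
    uint32_t high;
    uint32_t low;
} Stat64Sketch;

static uint64_t sketch_get(const Stat64Sketch *s)
{
    return ((uint64_t)s->high << 32) | s->low;
}

static void sketch_set(Stat64Sketch *s, uint64_t value)
{
    s->high = (uint32_t)(value >> 32);
    s->low = (uint32_t)value;
}

/* Keep the smaller value: update only when the new value is lower. */
static void sketch_min(Stat64Sketch *s, uint64_t value)
{
    assert(s != 0);
    if (value < sketch_get(s)) {
        sketch_set(s, value);
    }
}

/* Keep the larger value: update only when the new value is higher. */
static void sketch_max(Stat64Sketch *s, uint64_t value)
{
    assert(s != 0);
    if (value > sketch_get(s)) {
        sketch_set(s, value);
    }
}
```

With the comparisons reversed, a minimum would only ever grow and a maximum would only ever shrink.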

Signed-off-by: Max Reitz 
---
 util/stats64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/util/stats64.c b/util/stats64.c
index 9968fcceac..389c365a9e 100644
--- a/util/stats64.c
+++ b/util/stats64.c
@@ -91,7 +91,7 @@ bool stat64_min_slow(Stat64 *s, uint64_t value)
 low = atomic_read(&s->low);
 
 orig = ((uint64_t)high << 32) | low;
-if (orig < value) {
+if (value < orig) {
 /* We have to set low before high, just like stat64_min reads
  * high before low.  The value may become higher temporarily, but
  * stat64_get does not notice (it takes the lock) and the only ill
@@ -120,7 +120,7 @@ bool stat64_max_slow(Stat64 *s, uint64_t value)
 low = atomic_read(&s->low);
 
 orig = ((uint64_t)high << 32) | low;
-if (orig > value) {
+if (value > orig) {
 /* We have to set low before high, just like stat64_max reads
  * high before low.  The value may become lower temporarily, but
  * stat64_get does not notice (it takes the lock) and the only ill
-- 
2.13.6




[Qemu-devel] [PATCH v1 2/2] intel-iommu: Extend address width to 48 bits

2017-11-14 Thread prasad . singamsetty
From: Prasad Singamsetty 

The current implementation of the Intel IOMMU code only supports a
39-bit iova address width. This patch adds a new parameter (x-aw-bits)
to intel-iommu that extends its address width to 48 bits while keeping
the default unchanged (39 bits). The default is not changed in order
to avoid potential compatibility problems with live migration of
intel-iommu enabled QEMU guests. The only valid values for the
'x-aw-bits' parameter are 39 and 48.
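
A minimal sketch of what the configurable width means for an iova range check, under simplifying assumptions: the helper names aw_bits_valid and iova_in_range are invented for this example, and the sketch ignores the context-entry AGAW that the real check also takes into account.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The two widths the commit message names as valid for x-aw-bits. */
enum { AW_39BIT = 39, AW_48BIT = 48 };

static bool aw_bits_valid(unsigned aw_bits)
{
    return aw_bits == AW_39BIT || aw_bits == AW_48BIT;
}

/* True if iova lies below 2^aw_bits, i.e. no bit at or above aw_bits is set. */
static bool iova_in_range(uint64_t iova, unsigned aw_bits)
{
    assert(aw_bits_valid(aw_bits));
    return (iova >> aw_bits) == 0;
}
```

An iova of 1TB (bit 40 set) fails this check with a 39-bit width and passes it with a 48-bit width.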

After enabling the larger address width (48), we should be able to map
larger iova addresses in the guest, for example in a QEMU guest that
is configured with large memory (>=1TB). To check whether the 48-bit
address width is enabled, grep the guest dmesg output for the line:
"DMAR: Host address width 48".

Signed-off-by: Prasad Singamsetty 
---
 hw/i386/acpi-build.c   |   3 +-
 hw/i386/intel_iommu.c  | 101 -
 hw/i386/intel_iommu_internal.h |   9 ++--
 include/hw/i386/intel_iommu.h  |   1 +
 4 files changed, 65 insertions(+), 49 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 73519ab3ac..537957c89a 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2460,6 +2460,7 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker)
 AcpiDmarDeviceScope *scope = NULL;
 /* Root complex IOAPIC use one path[0] only */
 size_t ioapic_scope_size = sizeof(*scope) + sizeof(scope->path[0]);
+IntelIOMMUState *intel_iommu = INTEL_IOMMU_DEVICE(iommu);
 
 assert(iommu);
 if (iommu->intr_supported) {
@@ -2467,7 +2468,7 @@ build_dmar_q35(GArray *table_data, BIOSLinker *linker)
 }
 
 dmar = acpi_data_push(table_data, sizeof(*dmar));
-dmar->host_address_width = VTD_HOST_ADDRESS_WIDTH - 1;
+dmar->host_address_width = intel_iommu->aw_bits - 1;
 dmar->flags = dmar_flags;
 
 /* DMAR Remapping Hardware Unit Definition structure */
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 53b3bf244d..c2380fdfdc 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -521,9 +521,9 @@ static inline dma_addr_t vtd_ce_get_slpt_base(VTDContextEntry *ce)
 return ce->lo & VTD_CONTEXT_ENTRY_SLPTPTR;
 }
 
-static inline uint64_t vtd_get_slpte_addr(uint64_t slpte)
+static inline uint64_t vtd_get_slpte_addr(uint64_t slpte, uint8_t aw)
 {
-return slpte & VTD_SL_PT_BASE_ADDR_MASK(VTD_HOST_ADDRESS_WIDTH);
+return slpte & VTD_SL_PT_BASE_ADDR_MASK(aw);
 }
 
 /* Whether the pte indicates the address of the page frame */
@@ -608,20 +608,21 @@ static inline bool vtd_ce_type_check(X86IOMMUState *x86_iommu,
 return true;
 }
 
-static inline uint64_t vtd_iova_limit(VTDContextEntry *ce)
+static inline uint64_t vtd_iova_limit(VTDContextEntry *ce, uint8_t aw)
 {
 uint32_t ce_agaw = vtd_ce_get_agaw(ce);
-return 1ULL << MIN(ce_agaw, VTD_MGAW);
+return 1ULL << MIN(ce_agaw, aw);
 }
 
 /* Return true if IOVA passes range check, otherwise false. */
-static inline bool vtd_iova_range_check(uint64_t iova, VTDContextEntry *ce)
+static inline bool vtd_iova_range_check(uint64_t iova, VTDContextEntry *ce,
+uint8_t aw)
 {
 /*
  * Check if @iova is above 2^X-1, where X is the minimum of MGAW
  * in CAP_REG and AW in context-entry.
  */
-return !(iova & ~(vtd_iova_limit(ce) - 1));
+return !(iova & ~(vtd_iova_limit(ce, aw) - 1));
 }
 
 /*
@@ -669,7 +670,7 @@ static VTDBus *vtd_find_as_from_bus_num(IntelIOMMUState *s, uint8_t bus_num)
  */
 static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova, bool is_write,
  uint64_t *slptep, uint32_t *slpte_level,
- bool *reads, bool *writes)
+ bool *reads, bool *writes, uint8_t aw_bits)
 {
 dma_addr_t addr = vtd_ce_get_slpt_base(ce);
 uint32_t level = vtd_ce_get_level(ce);
@@ -677,7 +678,7 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova, bool is_write,
 uint64_t slpte;
 uint64_t access_right_check;
 
-if (!vtd_iova_range_check(iova, ce)) {
+if (!vtd_iova_range_check(iova, ce, aw_bits)) {
 trace_vtd_err_dmar_iova_overflow(iova);
 return -VTD_FR_ADDR_BEYOND_MGAW;
 }
@@ -714,7 +715,7 @@ static int vtd_iova_to_slpte(VTDContextEntry *ce, uint64_t iova, bool is_write,
 *slpte_level = level;
 return 0;
 }
-addr = vtd_get_slpte_addr(slpte);
+addr = vtd_get_slpte_addr(slpte, aw_bits);
 level--;
 }
 }
@@ -732,11 +733,12 @@ typedef int (*vtd_page_walk_hook)(IOMMUTLBEntry *entry, void *private);
  * @read: whether parent level has read permission
  * @write: whether parent level has write permission
  * @notify_unmap: whether we should notify invalid entries
+ * @aw: maximum address width
  */
 static int vtd_page_walk_level(dma_addr_t addr, uint64_t start,
 

[Qemu-devel] [PATCH v1 1/2] intel-iommu: Redefine macros to enable supporting 48 bit address width

2017-11-14 Thread prasad . singamsetty
From: Prasad Singamsetty 

The current implementation of the Intel IOMMU code only supports a
39-bit host/iova address width, so a number of macros use hard-coded
values based on that. This patch redefines them so they can be used
with variable address widths. It doesn't add any new functionality
but enables adding support for a 48-bit address width.
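
The general shape of such a redefinition might look like the sketch below. The SKETCH_* names and their bodies are assumptions made for illustration (4K pages, mask built from the address width); they are not copied from intel_iommu_internal.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; the real macros live in intel_iommu_internal.h. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_MASK  (~((1ULL << SKETCH_PAGE_SHIFT) - 1))

/* Before: the 39-bit width is baked into the constant. */
#define SKETCH_HAW_MASK_39 ((1ULL << 39) - 1)

/* After: the address width is a macro parameter. */
#define SKETCH_HAW_MASK(aw)          ((1ULL << (aw)) - 1)
#define SKETCH_PT_BASE_ADDR_MASK(aw) (SKETCH_PAGE_MASK & SKETCH_HAW_MASK(aw))

/* Extract the page-aligned address bits of an entry for a given width. */
static uint64_t sketch_pte_addr(uint64_t pte, unsigned aw)
{
    assert(aw > SKETCH_PAGE_SHIFT && aw < 64);
    return pte & SKETCH_PT_BASE_ADDR_MASK(aw);
}
```

Passing 39 reproduces the old fixed mask, so callers that keep using the existing width see no change in behavior.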

Signed-off-by: Prasad Singamsetty 
---
 hw/i386/intel_iommu.c  | 54 --
 hw/i386/intel_iommu_internal.h | 34 +++---
 include/hw/i386/intel_iommu.h  |  6 +++--
 3 files changed, 61 insertions(+), 33 deletions(-)

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 3a5bb0bc2e..53b3bf244d 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -523,7 +523,7 @@ static inline dma_addr_t vtd_ce_get_slpt_base(VTDContextEntry *ce)
 
 static inline uint64_t vtd_get_slpte_addr(uint64_t slpte)
 {
-return slpte & VTD_SL_PT_BASE_ADDR_MASK;
+return slpte & VTD_SL_PT_BASE_ADDR_MASK(VTD_HOST_ADDRESS_WIDTH);
 }
 
 /* Whether the pte indicates the address of the page frame */
@@ -624,19 +624,12 @@ static inline bool vtd_iova_range_check(uint64_t iova, VTDContextEntry *ce)
 return !(iova & ~(vtd_iova_limit(ce) - 1));
 }
 
-static const uint64_t vtd_paging_entry_rsvd_field[] = {
-[0] = ~0ULL,
-/* For not large page */
-[1] = 0x800ULL | ~(VTD_HAW_MASK | VTD_SL_IGN_COM),
-[2] = 0x800ULL | ~(VTD_HAW_MASK | VTD_SL_IGN_COM),
-[3] = 0x800ULL | ~(VTD_HAW_MASK | VTD_SL_IGN_COM),
-[4] = 0x880ULL | ~(VTD_HAW_MASK | VTD_SL_IGN_COM),
-/* For large page */
-[5] = 0x800ULL | ~(VTD_HAW_MASK | VTD_SL_IGN_COM),
-[6] = 0x1ff800ULL | ~(VTD_HAW_MASK | VTD_SL_IGN_COM),
-[7] = 0x3800ULL | ~(VTD_HAW_MASK | VTD_SL_IGN_COM),
-[8] = 0x880ULL | ~(VTD_HAW_MASK | VTD_SL_IGN_COM),
-};
+/*
+ * Rsvd field masks for spte:
+ * Index [1] to [4] 4k pages
+ * Index [5] to [8] large pages
+ */
+static uint64_t vtd_paging_entry_rsvd_field[9];
 
 static bool vtd_slpte_nonzero_rsvd(uint64_t slpte, uint32_t level)
 {
@@ -874,7 +867,7 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
 return -VTD_FR_ROOT_ENTRY_P;
 }
 
-if (re.rsvd || (re.val & VTD_ROOT_ENTRY_RSVD)) {
+if (re.rsvd || (re.val & VTD_ROOT_ENTRY_RSVD(VTD_HOST_ADDRESS_WIDTH))) {
 trace_vtd_re_invalid(re.rsvd, re.val);
 return -VTD_FR_ROOT_ENTRY_RSVD;
 }
@@ -891,7 +884,7 @@ static int vtd_dev_to_context_entry(IntelIOMMUState *s, uint8_t bus_num,
 }
 
 if ((ce->hi & VTD_CONTEXT_ENTRY_RSVD_HI) ||
-(ce->lo & VTD_CONTEXT_ENTRY_RSVD_LO)) {
+   (ce->lo & VTD_CONTEXT_ENTRY_RSVD_LO(VTD_HOST_ADDRESS_WIDTH))) {
 trace_vtd_ce_invalid(ce->hi, ce->lo);
 return -VTD_FR_CONTEXT_ENTRY_RSVD;
 }
@@ -1207,7 +1200,7 @@ static void vtd_root_table_setup(IntelIOMMUState *s)
 {
 s->root = vtd_get_quad_raw(s, DMAR_RTADDR_REG);
 s->root_extended = s->root & VTD_RTADDR_RTT;
-s->root &= VTD_RTADDR_ADDR_MASK;
+s->root &= VTD_RTADDR_ADDR_MASK(VTD_HOST_ADDRESS_WIDTH);
 
 trace_vtd_reg_dmar_root(s->root, s->root_extended);
 }
@@ -1223,7 +1216,7 @@ static void vtd_interrupt_remap_table_setup(IntelIOMMUState *s)
 uint64_t value = 0;
 value = vtd_get_quad_raw(s, DMAR_IRTA_REG);
 s->intr_size = 1UL << ((value & VTD_IRTA_SIZE_MASK) + 1);
-s->intr_root = value & VTD_IRTA_ADDR_MASK;
+s->intr_root = value & VTD_IRTA_ADDR_MASK(VTD_HOST_ADDRESS_WIDTH);
 s->intr_eime = value & VTD_IRTA_EIME;
 
 /* Notify global invalidation */
@@ -1479,7 +1472,7 @@ static void vtd_handle_gcmd_qie(IntelIOMMUState *s, bool en)
 trace_vtd_inv_qi_enable(en);
 
 if (en) {
-s->iq = iqa_val & VTD_IQA_IQA_MASK;
+s->iq = iqa_val & VTD_IQA_IQA_MASK(VTD_HOST_ADDRESS_WIDTH);
 /* 2^(x+8) entries */
 s->iq_size = 1UL << ((iqa_val & VTD_IQA_QS) + 8);
 s->qi_enabled = true;
@@ -2772,12 +2765,12 @@ static void vtd_address_space_unmap(VTDAddressSpace *as, IOMMUNotifier *n)
  * VT-d spec), otherwise we need to consider overflow of 64 bits.
  */
 
-if (end > VTD_ADDRESS_SIZE) {
+if (end > VTD_ADDRESS_SIZE(VTD_HOST_ADDRESS_WIDTH)) {
 /*
  * Don't need to unmap regions that is bigger than the whole
  * VT-d supported address space size
  */
-end = VTD_ADDRESS_SIZE;
+end = VTD_ADDRESS_SIZE(VTD_HOST_ADDRESS_WIDTH);
 }
 
 assert(start <= end);
@@ -2866,6 +2859,7 @@ static void vtd_iommu_replay(IOMMUMemoryRegion *iommu_mr, IOMMUNotifier *n)
 static void vtd_init(IntelIOMMUState *s)
 {
 X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
+uint8_t aw_bits = VTD_HOST_ADDRESS_WIDTH;
 
 memset(s->csr, 0, DMAR_REG_SIZE);
 memset(s->wmask, 0, DMAR_REG_SIZE);
@@ -2882,10 +2876,24 @@ static void vtd_init(IntelIOMMUState *s)
 

[Qemu-devel] [PATCH v1 0/2] intel-iommu: Extend address width to 48 bits

2017-11-14 Thread prasad . singamsetty
From: Prasad Singamsetty 

This pair of patches extends intel-iommu to support a 48-bit address
width. This is required to support QEMU guests with large
memory (>=1TB).

Patch1 implements changes to redefine macros and usage to
allow further changes to add support for 48 bit address width.
This patch doesn't change the existing functionality or behavior.

Patch2 adds support for a 48-bit address width while keeping the
default at 39 bits.

NOTE: Peter Xu had originally started on this enhancement
but it was not completed or integrated.

Unit testing done:

patch-1:
   * Boot vm with and without intel-iommu enabled
   * Boot vm with #cpus below and above 255 cpus
patch-2:
   * boot vm without "x-aw-bits" or "x-aw-bits=39": guest boots with 39
   * boot vm with "x-aw-bits=48": guest boots with 48 bits
   * boot vm with invalid value for x-aw-bits: guest fails to boot
   * boot vm with >=1TB memory and "x-aw-bits=48": guest boots

Prasad Singamsetty (2):
  intel-iommu: Redefine macros to enable supporting 48 bit address width
  intel-iommu: Extend address width to 48 bits

 hw/i386/acpi-build.c   |   3 +-
 hw/i386/intel_iommu.c  | 123 +
 hw/i386/intel_iommu_internal.h |  43 +-
 include/hw/i386/intel_iommu.h  |   7 ++-
 4 files changed, 110 insertions(+), 66 deletions(-)

-- 
2.14.0-rc1




Re: [Qemu-devel] [PATCH] exec: Fix section_covers_addr() for sections with non-zero offset

2017-11-14 Thread BALATON Zoltan

On Tue, 14 Nov 2017, Paolo Bonzini wrote:

On 21/10/2017 13:24, BALATON Zoltan wrote:

diff --git a/exec.c b/exec.c
index db5ae23..a915817 100644
--- a/exec.c
+++ b/exec.c
@@ -370,7 +370,8 @@ static inline bool section_covers_addr(const MemoryRegionSection *section,
  * the section must cover the entire address space.
  */
 return int128_gethi(section->size) ||
-   range_covers_byte(section->offset_within_address_space,
+   range_covers_byte(section->offset_within_address_space +
+ section->offset_within_region,
  int128_getlo(section->size), addr);
 }


Sorry, this is incorrect.  addr is an address in the address space, and
range_covers_byte checks if it is between
section->offset_within_address_space and
section->offset_within_address_space + section->size.  I am not sure how
things don't explode completely by adding section->offset_within_region
(probably it's just because section->offset_within_region is usually 0).
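
To make the objection concrete, here is a small sketch of the two checks side by side. SectionSketch and both helpers are hypothetical simplifications (plain 64-bit sizes, no Int128, no whole-address-space special case) and do not reproduce the exec.c code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified section descriptor. */
typedef struct {
    uint64_t offset_within_address_space; /* where the section starts in the AS */
    uint64_t offset_within_region;        /* where it starts inside its region */
    uint64_t size;
} SectionSketch;

/* addr is an address-space address, so only the AS offset is relevant. */
static bool sketch_covers(const SectionSketch *s, uint64_t addr)
{
    assert(s->size > 0);
    /* Written as a subtraction so that start + size cannot overflow. */
    return addr >= s->offset_within_address_space &&
           addr - s->offset_within_address_space < s->size;
}

/* The rejected variant: shifts the window by the unrelated region offset. */
static bool sketch_covers_shifted(const SectionSketch *s, uint64_t addr)
{
    uint64_t start = s->offset_within_address_space + s->offset_within_region;

    return addr >= start && addr - start < s->size;
}
```

The two functions agree whenever offset_within_region is 0, which matches the observation that the shifted check does not usually blow up.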


I had a feeling this might not be correct, but it appeared to work, very
likely because in most cases the offset is 0 (which is why the bug wasn't
happening very often either). How about the alternative I've just sent
according to your suggestion? That also appears to fix the problem and
is hopefully more correct.


Thank you,
BALATON Zoltan



[Qemu-devel] [PATCH] exec: Skip mru section if it's a partial page and not resolving subpage

2017-11-14 Thread BALATON Zoltan
This fixes a crash caused by picking the wrong memory region in
address_space_lookup_region seen with client code accessing a device
model that uses alias memory regions.

Signed-off-by: BALATON Zoltan 
---
 exec.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/exec.c b/exec.c
index 97a24a8..e5f2b9a 100644
--- a/exec.c
+++ b/exec.c
@@ -413,6 +413,7 @@ static MemoryRegionSection *address_space_lookup_region(AddressSpaceDispatch *d,
 bool update;
 
 if (section && section != &d->map.sections[PHYS_SECTION_UNASSIGNED] &&
+(resolve_subpage || !section->offset_within_region) &&
 section_covers_addr(section, addr)) {
 update = false;
 } else {
-- 
2.7.6




[Qemu-devel] [PATCH v3 for-2.11 1/3] tpm_emulator: Add a caching layer for the TPM Established flag

2017-11-14 Thread Stefan Berger
Add a caching layer for the TPM established flag so that we don't
need to go to the emulator every time the flag is read by accessing
the REG_ACCESS register.
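
The caching idea can be sketched independently of the TPM code as a flag that is fetched once and reused until it is explicitly invalidated. FlagCache, its fields and the callback below are hypothetical and only stand in for the backend and the emulator round trip.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the backend; 'fetch' models the emulator call. */
typedef struct {
    bool (*fetch)(void *opaque);
    void *opaque;
    bool cached_value;
    bool cache_valid;
} FlagCache;

/* Return the flag, asking the (slow) source only when nothing is cached. */
static bool flag_cache_get(FlagCache *c)
{
    assert(c != NULL && c->fetch != NULL);
    if (!c->cache_valid) {
        c->cached_value = c->fetch(c->opaque);
        c->cache_valid = true;
    }
    return c->cached_value;
}

/* Call after anything that may change the flag, e.g. a reset command. */
static void flag_cache_invalidate(FlagCache *c)
{
    c->cache_valid = false;
}

/* Example fetch callback: counts calls so that the caching is observable. */
static bool counting_fetch(void *opaque)
{
    int *calls = opaque;

    (*calls)++;
    return true;
}
```

Two consecutive reads trigger a single fetch; a read after flag_cache_invalidate() goes back to the source.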

Signed-off-by: Stefan Berger 

v1->v2:
 - move the caching to the backend layer since detecting the
   TPM 1.2 TSC_ResetEstablishmentBit() command is easier to do
   here.
---
 hw/tpm/tpm_emulator.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/hw/tpm/tpm_emulator.c b/hw/tpm/tpm_emulator.c
index e1a6810..b293db7 100644
--- a/hw/tpm/tpm_emulator.c
+++ b/hw/tpm/tpm_emulator.c
@@ -73,6 +73,9 @@ typedef struct TPMEmulator {
 Error *migration_blocker;
 
 QemuMutex mutex;
+
+unsigned int established_flag:1;
+unsigned int established_flag_cached:1;
 } TPMEmulator;
 
 
@@ -287,16 +290,22 @@ static bool tpm_emulator_get_tpm_established_flag(TPMBackend *tb)
 TPMEmulator *tpm_emu = TPM_EMULATOR(tb);
 ptm_est est;
 
-DPRINTF("%s", __func__);
+if (tpm_emu->established_flag_cached) {
+return tpm_emu->established_flag;
+}
+
 if (tpm_emulator_ctrlcmd(tpm_emu, CMD_GET_TPMESTABLISHED, &est,
  0, sizeof(est)) < 0) {
 error_report("tpm-emulator: Could not get the TPM established flag: %s",
  strerror(errno));
 return false;
 }
-DPRINTF("established flag: %0x", est.u.resp.bit);
+DPRINTF("got established flag: %0x", est.u.resp.bit);
+
+tpm_emu->established_flag_cached = 1;
+tpm_emu->established_flag = (est.u.resp.bit != 0);
 
-return (est.u.resp.bit != 0);
+return tpm_emu->established_flag;
 }
 
 static int tpm_emulator_reset_tpm_established_flag(TPMBackend *tb,
@@ -327,6 +336,8 @@ static int tpm_emulator_reset_tpm_established_flag(TPMBackend *tb,
 return -1;
 }
 
+tpm_emu->established_flag_cached = 0;
+
 return 0;
 }
 
-- 
2.5.5




[Qemu-devel] [PATCH v3 for-2.11 2/3] tpm_tis: Return TPM_VERSION_UNSPEC in case of BE failure

2017-11-14 Thread Stefan Berger
In case the backend has a failure, such as the tpm_emulator's CMD_INIT
failing, the TIS goes into failure mode and does not respond to reads
or writes to MMIO registers. In this case we need to prevent the ACPI
table from being added and the straight-forward way is to indicate that
there's no known TPM version being used.

Signed-off-by: Stefan Berger 
---
 hw/tpm/tpm_tis.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index 7402528..fec2fc6 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -1008,6 +1008,10 @@ TPMVersion tpm_tis_get_tpm_version(Object *obj)
 {
 TPMState *s = TPM(obj);
 
+if (tpm_backend_had_startup_error(s->be_driver)) {
+return TPM_VERSION_UNSPEC;
+}
+
 return tpm_backend_get_tpm_version(s->be_driver);
 }
 
-- 
2.5.5




Re: [Qemu-devel] [RESEND PATCH 2/6] memory: introduce AddressSpaceOps and IOMMUObject

2017-11-14 Thread Auger Eric
Hi Yi L,

On 14/11/2017 14:59, Liu, Yi L wrote:
> On Tue, Nov 14, 2017 at 09:53:07AM +0100, Auger Eric wrote:
> Hi Eric,
> 
>> Hi Yi L,
>>
>> On 13/11/2017 10:58, Liu, Yi L wrote:
>>> On Mon, Nov 13, 2017 at 04:56:01PM +1100, David Gibson wrote:
 On Fri, Nov 03, 2017 at 08:01:52PM +0800, Liu, Yi L wrote:
> From: Peter Xu 
>
> AddressSpaceOps is similar to MemoryRegionOps, it's just for address
> spaces to store arch-specific hooks.
>
> The first hook I would like to introduce is iommu_get(). Return an
> IOMMUObject behind the AddressSpace.
>
> For systems that have IOMMUs, we will create a special address
> space per device which is different from system default address
> space for it (please refer to pci_device_iommu_address_space()).
> Normally when that happens, there will be one specific IOMMU (or
> say, translation unit) stands right behind that new address space.
>
> This iommu_get() fetches that guy behind the address space. Here,
> the guy is defined as IOMMUObject, which includes a notifier_list
> so far, may extend in future. Along with IOMMUObject, a new iommu
> notifier mechanism is introduced. It would be used for virt-svm.
> Also IOMMUObject can further have a IOMMUObjectOps which is similar
> to MemoryRegionOps. The difference is IOMMUObjectOps is not relied
> on MemoryRegion.
>
> Signed-off-by: Peter Xu 
> Signed-off-by: Liu, Yi L 

 Hi, sorry I didn't reply to the earlier postings of this after our
 discussion in China.  I've been sick several times and very busy.
>>>
>>> Hi David,
>>>
>>> Fully understood. I'll try my best to address your question. Also,
>>> feel free to ask further questions; the more we discuss, the
>>> better the work we do.
>>>
 I still don't feel like there's an adequate explanation of exactly
 what an IOMMUObject represents.   Obviously it can represent more than
>>>
>>> IOMMUObject is meant to represent the iommu itself, e.g. the iommu
>>> specific operations. One of the key purposes of IOMMUObject is to
>>> introduce a notifier framework that lets the iommu emulator
>>> do iommu operations other than MAP/UNMAP. As IOMMUs grow more and
>>> more features, MAP/UNMAP is not the only operation the iommu emulator
>>> needs to deal with, e.g. shared virtual memory. As far as I know,
>>> AMD/ARM also have it; please correct me if I'm wrong. As my cover
>>> letter mentioned, the MR based notifier framework doesn't work for the
>>> newly added IOMMU operations, like binding the guest pasid table
>>> pointer to the host and propagating the guest's iotlb flush to the host.
>>>
 a single translation window - since that's represented by the
 IOMMUMR.  But what exactly do all the MRs - or whatever else - that
 are represented by the IOMMUObject have in common, from a functional
 point of view.
>>>
>>> Let me take virt-SVM as an example. As far as I know, for virt-SVM,
>>> the implementations of different vendors are similar. The key design
>>> is to have a nested translation (aka two-stage translation): the
>>> guest maintains the gVA->gPA mapping and the hypervisor builds the
>>> gPA->hPA mapping, similar to the EPT based virt-MMU solution.
>>>
>>> In Qemu, the gPA->hPA mapping is done through the MAP/UNMAP notifier,
>>> and that can keep going. But only the guest knows the gVA->gPA mapping,
>>> so the hypervisor needs to trap specific guest iommu operations and pass
>>> the gVA->gPA mapping knowledge to the host through a (newly added)
>>> notifier. In VT-d, this is called binding the guest pasid table to the host.
>>
>> What I don't get is the PASID table is per extended context entry. I
>> understand this latter is indexed by PCI device function. And today MR
>> are created per PCIe device if I am not wrong. 
> 
> In my understanding, MR is more related to AddressSpace not exactly tagged
> with PCIe device.
I meant, in the current intel_iommu code, vtd_find_add_as() creates 1
IOMMU MR and 1 AS per PCIe device, right?
> 
>> So why can't we have 1
>> new MR notifier dedicated to PASID table passing? My understanding is
>> the MR, having a 1-1 correspondence with a PCIe device and thus a
>> context could be of right granularity. Then I understand the only flags
> 
> I didn't quite get your point with regard to the "granularity" here. Could
> you say a little bit more about it?
The PASID table is per device (contained by extended context which is
dev/fn indexed). The "record_device" notifier also is attached to a
specific PCIe device. So we can't really say they have an iommu wide
scope (PCIe device granularity would fit). However I understand from
the explanation below that the TLB invalidate notifier is not specifically
tied to a given source-id, as we are going to invalidate by PASID/page.

I think the main justification behind introducing this new framework is
that PT is set along with SVM and in this case the IOMMU MR notifiers
are not registered since the IOMMU MR is 

[Qemu-devel] [PATCH v3 for-2.11 3/3] tpm_tis: Return 0 for every register in case of failure mode

2017-11-14 Thread Stefan Berger
Rather than returning ~0, return 0 for every register in case of
failure mode. The '0' is better to indicate that there's no device
there.

Signed-off-by: Stefan Berger 
---
 hw/tpm/tpm_tis.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/tpm/tpm_tis.c b/hw/tpm/tpm_tis.c
index fec2fc6..42d647d 100644
--- a/hw/tpm/tpm_tis.c
+++ b/hw/tpm/tpm_tis.c
@@ -545,7 +545,7 @@ static uint64_t tpm_tis_mmio_read(void *opaque, hwaddr addr,
 uint8_t v;
 
 if (tpm_backend_had_startup_error(s->be_driver)) {
-return val;
+return 0;
 }
 
 switch (offset) {
-- 
2.5.5




[Qemu-devel] [PATCH v3 for-2.11 0/3] tpm: a few fixes

2017-11-14 Thread Stefan Berger
From: Stefan Berger 

The following patches fix a performance issue (patch 1) and an
error path issue (patches 2 and 3) for 2.11.

   Stefan

Stefan Berger (3):
  tpm_emulator: Add a caching layer for the TPM Established flag
  tpm_tis: Return TPM_VERSION_UNSPEC in case of BE failure
  tpm_tis: Return 0 for every register in case of failure mode

 hw/tpm/tpm_emulator.c | 17 ++---
 hw/tpm/tpm_tis.c  |  6 +-
 2 files changed, 19 insertions(+), 4 deletions(-)

-- 
2.5.5




Re: [Qemu-devel] [PATCH v2 2/2] Add new PCI ID for i82559a

2017-11-14 Thread Stefan Weil
Am 06.11.2017 um 21:35 schrieb Mike Nawrocki:
> Adds a new PCI ID for the i82559a (0x8086 0x1030) interface. Enables
> this ID with a new property "use-alt-device-id" to preserve
> compatibility.
> 
> Signed-off-by: Mike Nawrocki 
> ---
>  hw/net/eepro100.c| 12 
>  include/hw/pci/pci.h |  1 +
>  qemu-options.hx  |  2 +-
>  3 files changed, 14 insertions(+), 1 deletion(-)


Sorry that I missed this patch.
I think I should have an entry for eepro100 in MAINTAINERS.

Mike, which hardware uses i82559a with PCI device id 0x1030?

https://www.intel.com/content/www/us/en/support/articles/05612/network-and-i-o/ethernet-products.html
only lists devices with
0x1229.

Thanks,
Stefan



Re: [Qemu-devel] [PATCH v17 6/6] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_VQ

2017-11-14 Thread Michael S. Tsirkin
On Tue, Nov 14, 2017 at 08:02:03PM +0800, Wei Wang wrote:
> On 11/14/2017 01:32 AM, Michael S. Tsirkin wrote:
> > > - guest2host_cmd: written by the guest to ACK to the host about the
> > > commands that have been received. The host will clear the corresponding
> > > bits on the host2guest_cmd register. The guest also uses this register
> > > to send commands to the host (e.g. when finish free page reporting).
> > I am not sure what is the role of guest2host_cmd. Reporting of
> > the correct cmd id seems sufficient indication that guest
> > received the start command. Not getting any more seems sufficient
> > to detect stop.
> > 
> 
> I think the issue is when the host is waiting for the guest to report pages,
> it does not know whether the guest is going to report more or the report is
> done already. That's why we need a way to let the guest tell the host "the
> report is done, don't wait for more", then the host continues to the next
> step - sending the non-free pages to the destination. The following method
> is a conclusion of other comments, with some new thought. Please have a
> check if it is good.

config won't work well for this IMHO.
Writes to config register are hard to synchronize with the VQ.
For example, guest sends free pages, host says stop, meanwhile
guest sends stop for 1st set of pages.

How about adding a buffer with "stop" in the VQ instead?
Wastes a VQ entry which you will need to reserve for this
but is it a big deal?
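
A toy model of why an in-band marker avoids the synchronization problem: the "stop" entry travels through the same queue as the page reports, so the host sees it in order. The id value 0 as the stop sign follows the quoted proposal below; the function name and the flat array standing in for the VQ are assumptions made for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: id 0 is reserved as the in-band "stop" marker. */
#define SKETCH_STOP_ID 0u

/*
 * Walk the entries in the order they were queued and return how many
 * precede the first stop marker. If no stop marker has arrived yet,
 * all n entries are counted and the report is still in progress.
 */
static size_t sketch_entries_before_stop(const uint32_t *ids, size_t n)
{
    assert(ids != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (ids[i] == SKETCH_STOP_ID) {
            return i;
        }
    }
    return n;
}
```

Because ordering comes from the queue itself, the host does not need a separate register write to learn that the report is done.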


> Two new configuration registers in total:
> - cmd_reg: the command register, combined from the previous host2guest and
> guest2host. I think we can use the same register for host requesting and
> guest ACKing, since the guest writing will trap to QEMU, that is, all the
> writes to the register are performed in QEMU, and we can keep things work in
> a correct way there.
> - cmd_id_reg: the sequence id of the free page report command.
> 
> -- free page report:
> - host requests the guest to start reporting by "cmd_reg |
> REPORT_START";
> - guest ACKs to the host about receiving the start reporting request by
> "cmd_reg | REPORT_START", host will clear the flag bit once receiving the
> ACK.
> - host requests the guest to stop reporting by "cmd_reg | REPORT_STOP";
> - guest ACKs to the host about receiving the stop reporting request by
> "cmd_reg | REPORT_STOP", host will clear the flag once receiving the ACK.
> - guest tells the host about the start of the reporting by writing "cmd
> id" into an outbuf, which is added to the free page vq.
> - guest tells the host about the end of the reporting by writing "0"
> into an outbuf, which is added to the free page vq. (we reserve "id=0" as
> the stop sign)
> 
> -- ballooning:
> - host requests the guest to start ballooning by "cmd_reg | BALLOONING";
> - guest ACKs to the host about receiving the request by "cmd_reg |
> BALLOONING", host will clear the flag once receiving the ACK.
> 
> 
> Some more explanations:
> -- Why not let the host request the guest to start the free page reporting
> simply by writing a new cmd id to the cmd_id_reg?
> The configuration interrupt is shared among all the features - ballooning,
> free page reporting, and future feature extensions which need host-to-guest
> requests. Some features may need to add other feature specific configuration
> registers, like free page reporting need the cmd_id_reg, which is not used
> by ballooning. The rule here is that the feature specific registers are read
> only when that feature is requested via the cmd_reg. For example, the
> cmd_id_reg is read only when "cmd_reg | REPORT_START" is true. Otherwise,
> when the driver receives a configuration interrupt, it has to read both
> cmd_reg and cmd_id registers to know what are requested by the host - think
> about the case that ballooning requests are sent frequently while free page
> reporting isn't requested, the guest has to read the cmd_id register every
> time a ballooning request is sent by the host, which is not necessary. If
> future new features follow this style, there will be more unnecessary
> VMexits to read the unused feature specific registers.
> So I think it is good to have a central control of the feature request via
> only one cmd register - reading that one is enough to know what is requested
> by the host.
> 

Right now you are increasing the cost of a balloon request 3x though.


How about we establish a baseline with a simple interface, and
then add the command register when it's actually beneficial.
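
For readers following the thread, the dispatch rule in the quoted proposal
(read cmd_reg first, and touch a feature-specific register such as cmd_id_reg
only when its flag is set) could look roughly like the sketch below. It is
only an illustration in Python, not the patch's code: the bit values, the
ConfigSpace class and handle_config_interrupt() are hypothetical names made
up for this example.

```python
# Hypothetical flag bits; the real values are not given in the thread.
REPORT_START = 1 << 0
REPORT_STOP = 1 << 1
BALLOONING = 1 << 2


class ConfigSpace:
    """Toy stand-in for the device config space that counts register reads."""

    def __init__(self):
        self.cmd_reg = 0
        self.cmd_id_reg = 0
        self.reads = {"cmd_reg": 0, "cmd_id_reg": 0}

    def read(self, name):
        # Each read would be a VM exit on real hardware, so count them.
        self.reads[name] += 1
        return getattr(self, name)


def handle_config_interrupt(cfg):
    """Guest side: decide what the host asked for from cmd_reg alone."""
    actions = []
    cmd = cfg.read("cmd_reg")
    if cmd & REPORT_START:
        # Only now is the feature-specific register worth reading.
        actions.append(("report_start", cfg.read("cmd_id_reg")))
    if cmd & REPORT_STOP:
        actions.append(("report_stop", None))
    if cmd & BALLOONING:
        actions.append(("balloon", None))
    return actions
```

With this layout a pure ballooning request costs one register read, and
cmd_id_reg is read only when free page reporting is actually requested. The
ACK and flag-clearing round trips, which the reply above is concerned about,
are not modelled here.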



> Best,
> Wei



Re: [Qemu-devel] [PATCH] qapi: block-core: Clarify events emitted by 'block-job-cancel'

2017-11-14 Thread no-reply
Hi,

This series failed build test on s390x host. Please find the details below.

Type: series
Subject: [Qemu-devel] [PATCH] qapi: block-core: Clarify events emitted by 
'block-job-cancel'
Message-id: 20171114191605.22349-1-kcham...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
# Testing script will be invoked under the git checkout with
# HEAD pointing to a commit that has the patches applied on top of "base"
# branch
set -e
echo "=== ENV ==="
env
echo "=== PACKAGES ==="
rpm -qa
echo "=== TEST BEGIN ==="
CC=$HOME/bin/cc
INSTALL=$PWD/install
BUILD=$PWD/build
echo -n "Using CC: "
realpath $CC
mkdir -p $BUILD $INSTALL
SRC=$PWD
cd $BUILD
$SRC/configure --cc=$CC --prefix=$INSTALL
make -j4
# XXX: we need reliable clean up
# make check -j4 V=1
make install
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag] patchew/20171114191605.22349-1-kcham...@redhat.com -> 
patchew/20171114191605.22349-1-kcham...@redhat.com
Switched to a new branch 'test'
375c460 qapi: block-core: Clarify events emitted by 'block-job-cancel'

=== OUTPUT BEGIN ===
=== ENV ===
XDG_SESSION_ID=91315
SHELL=/bin/sh
USER=fam
PATCHEW=/home/fam/patchew/patchew-cli -s http://patchew.org --nodebug
PATH=/usr/bin:/bin
PWD=/var/tmp/patchew-tester-tmp-ki994nis/src
LANG=en_US.UTF-8
HOME=/home/fam
SHLVL=2
LOGNAME=fam
DBUS_SESSION_BUS_ADDRESS=unix:path=/run/user/1012/bus
XDG_RUNTIME_DIR=/run/user/1012
_=/usr/bin/env
=== PACKAGES ===
gpg-pubkey-873529b8-54e386ff
xz-libs-5.2.2-2.fc24.s390x
libxshmfence-1.2-3.fc24.s390x
giflib-4.1.6-15.fc24.s390x
trousers-lib-0.3.13-6.fc24.s390x
ncurses-base-6.0-6.20160709.fc25.noarch
gmp-6.1.1-1.fc25.s390x
libidn-1.33-1.fc25.s390x
slang-2.3.0-7.fc25.s390x
pkgconfig-0.29.1-1.fc25.s390x
alsa-lib-1.1.1-2.fc25.s390x
yum-metadata-parser-1.1.4-17.fc25.s390x
python3-slip-dbus-0.6.4-4.fc25.noarch
python2-cssselect-0.9.2-1.fc25.noarch
createrepo_c-libs-0.10.0-6.fc25.s390x
initscripts-9.69-1.fc25.s390x
parted-3.2-21.fc25.s390x
flex-2.6.0-3.fc25.s390x
colord-libs-1.3.4-1.fc25.s390x
python-osbs-client-0.33-3.fc25.noarch
perl-Pod-Simple-3.35-1.fc25.noarch
python2-simplejson-3.10.0-1.fc25.s390x
brltty-5.4-2.fc25.s390x
librados2-10.2.4-2.fc25.s390x
tcp_wrappers-7.6-83.fc25.s390x
libcephfs_jni1-10.2.4-2.fc25.s390x
nettle-devel-3.3-1.fc25.s390x
bzip2-devel-1.0.6-21.fc25.s390x
libuuid-2.28.2-2.fc25.s390x
python3-dnf-1.1.10-6.fc25.noarch
texlive-kpathsea-doc-svn41139-33.fc25.1.noarch
openssh-7.4p1-4.fc25.s390x
texlive-kpathsea-bin-svn40473-33.20160520.fc25.1.s390x
texlive-graphics-svn41015-33.fc25.1.noarch
texlive-dvipdfmx-def-svn40328-33.fc25.1.noarch
texlive-mfware-svn40768-33.fc25.1.noarch
texlive-texlive-scripts-svn41433-33.fc25.1.noarch
texlive-euro-svn22191.1.1-33.fc25.1.noarch
texlive-etex-svn37057.0-33.fc25.1.noarch
texlive-iftex-svn29654.0.2-33.fc25.1.noarch
texlive-palatino-svn31835.0-33.fc25.1.noarch
texlive-texlive-docindex-svn41430-33.fc25.1.noarch
texlive-xunicode-svn30466.0.981-33.fc25.1.noarch
texlive-koma-script-svn41508-33.fc25.1.noarch
texlive-pst-grad-svn15878.1.06-33.fc25.1.noarch
texlive-pst-blur-svn15878.2.0-33.fc25.1.noarch
texlive-jknapltx-svn19440.0-33.fc25.1.noarch
texinfo-6.1-4.fc25.s390x
openssl-devel-1.0.2k-1.fc25.s390x
jansson-2.10-2.fc25.s390x
fedora-repos-25-4.noarch
perl-Errno-1.25-387.fc25.s390x
acl-2.2.52-13.fc25.s390x
systemd-pam-231-17.fc25.s390x
NetworkManager-libnm-1.4.4-5.fc25.s390x
poppler-0.45.0-5.fc25.s390x
ccache-3.3.4-1.fc25.s390x
valgrind-3.12.0-9.fc25.s390x
perl-open-1.10-387.fc25.noarch
libgcc-6.4.1-1.fc25.s390x
libsoup-2.56.1-1.fc25.s390x
libstdc++-devel-6.4.1-1.fc25.s390x
libobjc-6.4.1-1.fc25.s390x
python2-rpm-4.13.0.1-2.fc25.s390x
python2-gluster-3.10.5-1.fc25.s390x
rpm-build-4.13.0.1-2.fc25.s390x
glibc-static-2.24-10.fc25.s390x
lz4-1.8.0-1.fc25.s390x
xapian-core-libs-1.2.24-1.fc25.s390x
elfutils-libelf-devel-0.169-1.fc25.s390x
nss-softokn-3.32.0-1.2.fc25.s390x
pango-1.40.9-1.fc25.s390x
glibc-debuginfo-common-2.24-10.fc25.s390x
libaio-0.3.110-6.fc24.s390x
libfontenc-1.1.3-3.fc24.s390x
lzo-2.08-8.fc24.s390x
isl-0.14-5.fc24.s390x
libXau-1.0.8-6.fc24.s390x
linux-atm-libs-2.5.1-14.fc24.s390x
libXext-1.3.3-4.fc24.s390x
libXxf86vm-1.1.4-3.fc24.s390x
bison-3.0.4-4.fc24.s390x
perl-srpm-macros-1-20.fc25.noarch
gawk-4.1.3-8.fc25.s390x
libwayland-client-1.12.0-1.fc25.s390x
perl-Exporter-5.72-366.fc25.noarch
perl-version-0.99.17-1.fc25.s390x
fftw-libs-double-3.3.5-3.fc25.s390x
libssh2-1.8.0-1.fc25.s390x
ModemManager-glib-1.6.4-1.fc25.s390x
newt-python3-0.52.19-2.fc25.s390x
python-munch-2.0.4-3.fc25.noarch
python-bugzilla-1.2.2-4.fc25.noarch
libedit-3.1-16.20160618cvs.fc25.s390x
createrepo_c-0.10.0-6.fc25.s390x
device-mapper-multipath-libs-0.4.9-83.fc25.s390x
yum-3.4.3-510.fc25.noarch
mozjs17-17.0.0-16.fc25.s390x
libselinux-2.5-13.fc25.s390x
python2-pyparsing-2.1.10-1.fc25.noarch
cairo-gobject-1.14.8-1.fc25.s390x
xorg-x11-proto-devel-7.7-20.fc25.noarch
brlapi-0.6.5-2.fc25.s390x
librados-devel-10.2.4-2.fc25.s390x

Re: [Qemu-devel] [PATCH] vhost-user-scsi: add missing virtqueue_size param

2017-11-14 Thread Michael S. Tsirkin
On Tue, Nov 14, 2017 at 05:28:36PM +0100, Dariusz Stojaczyk wrote:
> Commit 5c0919d0 [1] introduced virtqueue_size parameter
> for common virtio-scsi path, without updating the vhost-user-scsi
> code. vhost-user-scsi devices right now report size 0 for each vq.
> 
> This patch introduces virtqueue_size param to vhost-user-scsi,
> which can now be set by the user. Most importantly, it
> now has a default value of 128 (same as QEMU's virtio-scsi).
> 
> [1] 5c0919d0 ("virtio-scsi: Add virtqueue_size parameter
> allowing virtqueue size to be set.")
> 
> Change-Id: I70e87eab702ebf1196c028dbf17d54fdc0c89a14
> Signed-off-by: Dariusz Stojaczyk 

Reviewed-by: Michael S. Tsirkin 

> ---
>  hw/scsi/vhost-user-scsi.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
> index 500fa6a..f7561e2 100644
> --- a/hw/scsi/vhost-user-scsi.c
> +++ b/hw/scsi/vhost-user-scsi.c
> @@ -135,6 +135,8 @@ static Property vhost_user_scsi_properties[] = {
>  DEFINE_PROP_CHR("chardev", VirtIOSCSICommon, conf.chardev),
>  DEFINE_PROP_UINT32("boot_tpgt", VirtIOSCSICommon, conf.boot_tpgt, 0),
>  DEFINE_PROP_UINT32("num_queues", VirtIOSCSICommon, conf.num_queues, 1),
> +DEFINE_PROP_UINT32("virtqueue_size", VirtIOSCSICommon, 
> conf.virtqueue_size,
> +   128),
>  DEFINE_PROP_UINT32("max_sectors", VirtIOSCSICommon, conf.max_sectors,
> 0x),
>  DEFINE_PROP_UINT32("cmd_per_lun", VirtIOSCSICommon, conf.cmd_per_lun, 
> 128),
> -- 
> 2.7.4



[Qemu-devel] [Bug 1728256] Re: (Regression) Memory corruption in Windows 10 guest / amd64

2017-11-14 Thread Wüstengecko
It happened again, both with the e1000 and the rtl8139 NICs under qemu
2.11.0.rc0-7-g4ffa88c99c. Kernel is the official Arch one, right now on
4.13.12.

At this point I have no idea anymore what could be causing this, and am
unable to test without having to remove basic functionality from the VM
(e.g. the graphics card) or downgrading the host kernel (which I really
want to avoid because I'm using btrfs).

That said, during the last several days I did not experience these weird
hangup issues that I described previously, however I did see very high
CPU load in the guest that was caused by the network (listed in task
manager as System Interrupts, and going as high as one full CPU core
during large network operations).

What is most interesting though is that it survived while I tried my
best to get it to crash (stress-testing CPU and network, mostly), and
then hit me with a Bluescreen at a most unexpected time almost a week
later. Since then however it started crashing anywhere between a few
hours and about two days of consecutive uptime again, just like before.

@larsk, could you elaborate on your setup? Like, in which ways is it different 
(other than you using Ubuntu and thus different versions of the involved 
software)?
Which hardware do you pass through, if any?

** Summary changed:

- (Regression) Memory corruption in Windows 10 guest / amd64
+ Memory corruption in Windows 10 guest / amd64


Title:
  Memory corruption in Windows 10 guest / amd64

Status in QEMU:
  New

Bug description:
  I have a Win 10 Pro x64 guest inside a qemu/kvm running on an Arch x86_64 
host. The VM has a physical GPU passed through, as well as the physical USB 
controllers, as well as a dedicated SSD attached via SATA; you can find the 
complete libvirt xml here: https://pastebin.com/U1ZAXBNg
  I built qemu from source using the qemu-minimal-git AUR package; you can find 
the build script here: 
https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=qemu-minimal-git (if you 
aren't familiar with Arch, this is essentially a bash script where build() and 
package() are run to build the files, and then install them into the $pkgdir to 
later tar them up.)

  Starting with qemu v2.10.0, Windows crashes randomly with a bluescreen
  about CRITICAL_STRUCTURE_CORRUPTION. I also tested the git heads
  f90ea7ba7c, 861cd431c9 and e822e81e35, before I went back to v2.9.0,
  which is running stable for over 50 hours right now.

  During my tests I found that locking the memory pages alleviates the
  problem somewhat, but never completely avoids it. However, with the
  crashes occurring randomly, that could as well be false conclusions; I
  had crashes within minutes after boot with that too.

  I will now start `git bisect`ing; if you have any other suggestions on
  what I could try or possible patches feel free to leave them with me.




[Qemu-devel] [Bug 1713825] Re: Booting Windows 2016 with qxl video crashes qemu

2017-11-14 Thread Maciej Piechotka
It helps but I'm quite sure that lower level security systems (guest)
should never be able to crash higher level security systems
(hypervisor).

PS. It repros in 2.10.0 as well.


Title:
  Booting Windows 2016 with qxl video crashes qemu

Status in QEMU:
  New

Bug description:
  launched from libvirt.

  qemu version: 2.9.0
  host: Linux  4.9.34-gentoo #1 SMP Sat Jul 29 13:28:43 PDT 2017 
x86_64 Intel(R) Core(TM) i7-3930K CPU @ 3.20GHz GenuineIntel GNU/Linux
  guest: Windows 2016 64 bit

  Thread 28 (Thread 0x7f0e2edff700 (LWP 29860)):
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  set = {__val = {18446744067266837079, 139698892694944, 
139699853745096, 139700858749789, 4222451712, 139694281220640, 139694281220741, 
139694281220640, 139694281220640, 139694281220810, 
  139694281220940, 139694281220640, 139694281220940, 0, 0, 0}}
  pid = 
  tid = 
  #1  0x7f0ea40b644a in __GI_abort () at abort.c:89
  save_stage = 2
  act = {__sigaction_handler = {sa_handler = 0x7f0e2edfe5c0, 
sa_sigaction = 0x7f0e2edfe5c0}, sa_mask = {__val = {139694281219872, 
139698106269697, 139698892695344, 4, 2676511744, 0, 139698892695144, 0, 
139698892694912, 1, 4737316546111099904, 139700859888720, 
4737316546111099904, 139700862161824, 139700911349760, 94211934977482}}, 
sa_flags = 416, 
sa_restorer = 0x55af6ceb0500 <__PRETTY_FUNCTION__.36381>}
  sigs = {__val = {32, 0 }}
  #2  0x7f0ea40abab6 in __assert_fail_base (fmt=, 
assertion=assertion@entry=0x55af6ceafdca "offset < qxl->vga.vram_size", 
  file=file@entry=0x55af6ceaeaa0 
"/var/tmp/portage/app-emulation/qemu-2.9.0-r2/work/qemu-2.9.0/hw/display/qxl.c",
 line=line@entry=416, 
  function=function@entry=0x55af6ceb0500 <__PRETTY_FUNCTION__.36381> 
"qxl_ram_set_dirty") at assert.c:92
  str = 0x7f0d1c026220 "\340r\002\034\r\177"
  total = 4096
  #3  0x7f0ea40abb81 in __GI___assert_fail 
(assertion=assertion@entry=0x55af6ceafdca "offset < qxl->vga.vram_size", 
  file=file@entry=0x55af6ceaeaa0 
"/var/tmp/portage/app-emulation/qemu-2.9.0-r2/work/qemu-2.9.0/hw/display/qxl.c",
 line=line@entry=416, 
  function=function@entry=0x55af6ceb0500 <__PRETTY_FUNCTION__.36381> 
"qxl_ram_set_dirty") at assert.c:101
  No locals.
  #4  0x55af6cc58805 in qxl_ram_set_dirty (qxl=, 
ptr=) at 
/var/tmp/portage/app-emulation/qemu-2.9.0-r2/work/qemu-2.9.0/hw/display/qxl.c:416
  base = 
  offset = 
  qxl = 
  ptr = 
  base = 
  offset = 
  #5  0x55af6cc5b9e2 in interface_release_resource (sin=0x55af71a91ed0, 
ext=...) at 
/var/tmp/portage/app-emulation/qemu-2.9.0-r2/work/qemu-2.9.0/hw/display/qxl.c:767
  qxl = 0x55af71a91450
  ring = 
  item = 
  id = 18446690739814400920
  __func__ = "interface_release_resource"
  #6  0x7f0ea510afa8 in red_drawable_unref (red_drawable=0x7f0d1c026120) at 
red-worker.c:101
  No locals.
  #7  0x7f0ea510b609 in red_drawable_unref (red_drawable=) 
at red-worker.c:104
  No locals.
  #8  0x7f0ea510eae9 in drawable_unref 
(drawable=drawable@entry=0x7f0e68285ac0) at display-channel.c:1438
  display = 0x55af71dbd3c0
  __FUNCTION__ = "drawable_unref"
  #9  0x7f0ea51109f7 in draw_until (display=display@entry=0x55af71dbd3c0, 
surface=surface@entry=0x7f0e6828aae8, last=0x7f0e68285ac0) at 
display-channel.c:1637
  container = 0x0
  now = 0x7f0e68285ac0
  #10 0x7f0ea510f93f in display_channel_draw (display=0x55af71dbd3c0, 
area=0x7f0e2edfe8e0, surface_id=) at display-channel.c:1729
  surface = 0x7f0e6828aae8
  last = 
  __FUNCTION__ = "display_channel_draw"
  __func__ = "display_channel_draw"




Re: [Qemu-devel] using "qemu-img convert -O qcow2" to convert qcow v1 to v2 creates a qcow v3 file?

2017-11-14 Thread Max Reitz
On 2017-11-14 21:38, John Snow wrote:
> 
> 
> On 11/14/2017 03:35 PM, Max Reitz wrote:
>> On 2017-11-14 21:30, John Snow wrote:
>>>
>>>
>>> On 11/14/2017 01:46 PM, Max Reitz wrote:
>>>> On 2017-11-14 19:45, Thomas Huth wrote:
>>>>> On 14.11.2017 14:32, Max Reitz wrote:
>>>>> [...]
>>>>>> Well, do you want to document it?  I'd rather deprecate it altogether.
>>>>>
>>>>> Maybe a first step could be to change qemu-img so that it refuses to
>>>>> create new qcow1 images (but still can convert them into other formats).
>>>>> So basically make qcow1 read-only?
>>>>
>>>> Yep, and the actual first step to that is to make it issue a deprecation
>>>> warning when creating qcow v1 images (which is what I proposed). :-)
>>>>
>>>> Max
>>>>
>>>
>>> Deprecation warning is good.
>>>
>>> In future versions you can shimmy it behind a
>>> --no-really-I-want-this-old-format option, I think we ought to support
>>> creating the images for as long as is technologically convenient.
>>
>> Well, at some point you can also demand from users to just dig out some
>> old version of qemu-img to convert their qcow v1 images to qcow2.  It's
>> not like they are going to miss out on anything.
>>
> 
> As long as is convenient. I won't throw a fit that it needs to be around
> forever, but as long as it's sufficiently guarded from use and isn't
> hard to keep around I'd prefer to do that.
> 
> I suppose it's just a weak preference.

I agree that sufficiently guarding it (albeit our definitions on what is
sufficient may differ) serves all the purpose I need, that is, make
users aware of the fact that they are doing something for which I can
see no reason.

But on the other hand, I want those guards to make it dead code,
effectively.  And if something is dead code... There is no reason to
keep it around.

>> (If you deprecate emulated hardware, users may complain that they don't
>> get the newest qemu features/bugfixes/... while continuing to use that
>> hardware, so I can see that it's a tough decision whether to deprecate
>> that.  But it's not like you are going to lose any features or anything
>> if you convert your dusty images to qcow2.  On the contrary, we're
>> helping you to get more performance out of them.  Maybe qemu should just
>> silently convert qcow v1 images to qcow2 without asking the user, like
>> Apple did...)
>>
>> Max
>>
> 
> "Like Apple did" seems sufficient justification to never do that, but
> maybe that's just my own opinion.

Sorry, forgot the emoticon. :-)

Yes, that was meant to be a joke.  Although adding a qemu-img amend for
amending qcow v1 to v2/v3 images is probably mostly an issue of creating
the right internal interface for cross-format amendments.

(Silently storing qcow v1 as v2/v3 images is actually something where
users could have a problem because maybe they have some tool that only
works on qcow v1 images; and then they can't use that together with an
auto-amending qemu...)

Max





Re: [Qemu-devel] using "qemu-img convert -O qcow2" to convert qcow v1 to v2 creates a qcow v3 file?

2017-11-14 Thread John Snow


On 11/14/2017 03:35 PM, Max Reitz wrote:
> On 2017-11-14 21:30, John Snow wrote:
>>
>>
>> On 11/14/2017 01:46 PM, Max Reitz wrote:
>>> On 2017-11-14 19:45, Thomas Huth wrote:
>>>> On 14.11.2017 14:32, Max Reitz wrote:
>>>> [...]
>>>>> Well, do you want to document it?  I'd rather deprecate it altogether.
>>>>
 Maybe a first step could be to change qemu-img so that it refuses to
 create new qcow1 images (but still can convert them into other formats).
 So basically make qcow1 read-only?
>>>
>>> Yep, and the actual first step to that is to make it issue a deprecation
>>> warning when creating qcow v1 images (which is what I proposed). :-)
>>>
>>> Max
>>>
>>
>> Deprecation warning is good.
>>
>> In future versions you can shimmy it behind a
>> --no-really-I-want-this-old-format option, I think we ought to support
>> creating the images for as long as is technologically convenient.
> 
> Well, at some point you can also demand from users to just dig out some
> old version of qemu-img to convert their qcow v1 images to qcow2.  It's
> not like they are going to miss out on anything.
> 

As long as is convenient. I won't throw a fit that it needs to be around
forever, but as long as it's sufficiently guarded from use and isn't
hard to keep around I'd prefer to do that.

I suppose it's just a weak preference.

> (If you deprecate emulated hardware, users may complain that they don't
> get the newest qemu features/bugfixes/... while continuing to use that
> hardware, so I can see that it's a tough decision whether to deprecate
> that.  But it's not like you are going to lose any features or anything
> if you convert your dusty images to qcow2.  On the contrary, we're
> helping you to get more performance out of them.  Maybe qemu should just
> silently convert qcow v1 images to qcow2 without asking the user, like
> Apple did...)
> 
> Max
> 

"Like Apple did" seems sufficient justification to never do that, but
maybe that's just my own opinion.



Re: [Qemu-devel] using "qemu-img convert -O qcow2" to convert qcow v1 to v2 creates a qcow v3 file?

2017-11-14 Thread Max Reitz
On 2017-11-14 21:30, John Snow wrote:
> 
> 
> On 11/14/2017 01:46 PM, Max Reitz wrote:
>> On 2017-11-14 19:45, Thomas Huth wrote:
>>> On 14.11.2017 14:32, Max Reitz wrote:
>>> [...]
>>>> Well, do you want to document it?  I'd rather deprecate it altogether.
>>>
>>> Maybe a first step could be to change qemu-img so that it refuses to
>>> create new qcow1 images (but still can convert them into other formats).
>>> So basically make qcow1 read-only?
>>
>> Yep, and the actual first step to that is to make it issue a deprecation
>> warning when creating qcow v1 images (which is what I proposed). :-)
>>
>> Max
>>
> 
> Deprecation warning is good.
> 
> In future versions you can shimmy it behind a
> --no-really-I-want-this-old-format option, I think we ought to support
> creating the images for as long as is technologically convenient.

Well, at some point you can also demand from users to just dig out some
old version of qemu-img to convert their qcow v1 images to qcow2.  It's
not like they are going to miss out on anything.

(If you deprecate emulated hardware, users may complain that they don't
get the newest qemu features/bugfixes/... while continuing to use that
hardware, so I can see that it's a tough decision whether to deprecate
that.  But it's not like you are going to lose any features or anything
if you convert your dusty images to qcow2.  On the contrary, we're
helping you to get more performance out of them.  Maybe qemu should just
silently convert qcow v1 images to qcow2 without asking the user, like
Apple did...)

Max





Re: [Qemu-devel] using "qemu-img convert -O qcow2" to convert qcow v1 to v2 creates a qcow v3 file?

2017-11-14 Thread John Snow


On 11/14/2017 01:46 PM, Max Reitz wrote:
> On 2017-11-14 19:45, Thomas Huth wrote:
>> On 14.11.2017 14:32, Max Reitz wrote:
>> [...]
>>> Well, do you want to document it?  I'd rather deprecate it altogether.
>>
>> Maybe a first step could be to change qemu-img so that it refuses to
>> create new qcow1 images (but still can convert them into other formats).
>> So basically make qcow1 read-only?
> 
> Yep, and the actual first step to that is to make it issue a deprecation
> warning when creating qcow v1 images (which is what I proposed). :-)
> 
> Max
> 

Deprecation warning is good.

In future versions you can shimmy it behind a
--no-really-I-want-this-old-format option, I think we ought to support
creating the images for as long as is technologically convenient.



[Qemu-devel] [ANNOUNCE] QEMU 2.11.0-rc1 is now available

2017-11-14 Thread Michael Roth
Hello,

On behalf of the QEMU Team, I'd like to announce the availability of the
second release candidate for the QEMU 2.11 release.  This release is meant
for testing purposes and should not be used in a production environment.

  http://download.qemu-project.org/qemu-2.11.0-rc1.tar.xz
  http://download.qemu-project.org/qemu-2.11.0-rc1.tar.xz.sig

You can help improve the quality of the QEMU 2.11 release by testing this
release and reporting bugs on Launchpad:

  https://bugs.launchpad.net/qemu/

The release plan, as well as documented known issues for release
candidates, are available at:

  http://wiki.qemu.org/Planning/2.11

Please add entries to the ChangeLog for the 2.11 release below:

  http://wiki.qemu.org/ChangeLog/2.11

Changes since rc0:

1fa0f627d0: Update version for v2.11.0-rc1 release (Peter Maydell)
8b2d7c364d: qemu-iotests: update unsupported image formats in 194 (Jeff Cody)
1d0f37cf21: block/parallels: add migration blocker (Jeff Cody)
6c7d390b99: block/parallels: Do not update header or truncate image when 
INMIGRATE (Jeff Cody)
7479bf07c4: block/vhdx.c: Don't blindly update the header (Jeff Cody)
d04c155503: iotests: 077: Filter out 'resume' lines (Fam Zheng)
04dec3c3ae: block/snapshot: dirty all dirty bitmaps on snapshot-switch 
(Vladimir Sementsov-Ogievskiy)
bcb5270c75: qcow2: Check that corrupted images can be repaired in iotest 060 
(Alberto Garcia)
147b44be49: iotests: Use new-style NBD connections (Eric Blake)
19026817f7: iotests: Make 136 less flaky (Max Reitz)
ddc7093eec: iotests: Make 083 less flaky (Max Reitz)
bc11aee2ac: iotests: Make 055 less flaky (Max Reitz)
51c493c5cc: iotests: Add missing 'blkdebug::' in 040 (Max Reitz)
dca9b6a2b1: iotests: Make 030 less flaky (Max Reitz)
c9b83e9c23: qcow2: Assert that the crypto header does not overlap other 
metadata (Alberto Garcia)
ef083f61af: qcow2: Add iotest for an empty refcount table (Alberto Garcia)
5a45da5ef8: qcow2: Add iotest for an image with header.refcount_table_offset == 
0 (Alberto Garcia)
951053a9ec: qcow2: Don't open images with header.refcount_table_clusters == 0 
(Alberto Garcia)
8aa34834d5: qcow2: Prevent allocating compressed clusters at offset 0 (Alberto 
Garcia)
9883975050: qcow2: Prevent allocating L2 tables at offset 0 (Alberto Garcia)
6bf45d59f9: qcow2: Prevent allocating refcount blocks at offset 0 (Alberto 
Garcia)
6350b2a09b: seabios: update to 1.11 final (Gerd Hoffmann)
dcb556fc6a: xics/kvm: synchonize state before 'info pic' (Greg Kurz)
e05fba5004: target/ppc: correct htab shift for hash on radix (Sam Bobroff)
0761562687: qemu-iotests: Test I/O limits with removable media (Alberto Garcia)
c89bcf3af0: block: Leave valid throttle timers when removing a BDS from a 
backend (Alberto Garcia)
48bf7ea81a: block: Check for inserted BlockDriverState in 
blk_io_limits_disable() (Alberto Garcia)
dc868fb03b: throttle-groups: drain before detaching ThrottleState (Stefan 
Hajnoczi)
632a773543: block: all I/O should be completed before removing throttle timers. 
(Zhengui)
d25f2a7227: accel/tcg/translate-all: expand cpu_restore_state addr check (Alex 
Bennée)
7264961934: hw: add .min_cpus and .default_cpus fields to machine_class (Emilio 
G. Cota)
1342b0355e: xlnx-zcu102: Specify the max number of CPUs for the EP108 (Emilio 
G. Cota)
83926ad527: xlnx-zcu102: Add an info message deprecating the EP108 (Alistair 
Francis)
6908ec448b: xlnx-zynqmp: Properly support the smp command line option (Alistair 
Francis)
2dda635410: qom: move CPUClass.tcg_initialize to a global (Emilio G. Cota)
670bc4cbda: MAINTAINERS: Add entries for Smartfusion2 (Subbaraya Sundeep)
c5c752af8c: highbank: validate register offset before access (Prasad J Pandit)
5ca66278c8: arm/translate-a64: mark path as unreachable to eliminate warning 
(Emilio G. Cota)
bb160b571f: net/socket: fix coverity issue (Jens Freimann)
5e89dc0113: Add new PCI ID for i82559a (Mike Nawrocki)
1865e288a8: Fix eepro100 simple transmission mode (Mike Nawrocki)
8fa5ad6dfb: colo: Consolidate the duplicate code chunk into a routine (Mao 
Zhongyi)
3463218c6c: colo-compare: Fix comments (Mao Zhongyi)
8ec1440202: colo-compare: compare the packet in a specified Connection (Mao 
Zhongyi)
8850d4caa7: colo-compare: Insert packet into the suitable position of packet 
queue directly (Mao Zhongyi)
ff86d57625: net: fix check for number of parameters to -netdev socket (Jens 
Freimann)
2e9a856570: ui: use QEMU_IS_ALIGNED macro (Philippe Mathieu-Daudé)
cf7040e284: vmsvga: use ARRAY_SIZE macro (Philippe Mathieu-Daudé)
115788d7a7: vga: fix region checks in wraparound case (Gerd Hoffmann)
777c5f1e43: ui: fix dcl unregister (Gerd Hoffmann)
c53f5b89f1: virtio-gpu: fix bug in host memory calculation. (Tao Wu)
990132cda9: slirp: don't zero the whole ti_i when m == NULL (Tao Wu)
ef8c887ee0: nbd/server: Fix structured read of length 0 (Eric Blake)
b4176cb314: nbd-client: Stricter enforcing of structured reply spec (Eric Blake)
9d8f818cde: nbd-client: Short-circuit 0-length operations (Eric Blake)

Re: [Qemu-devel] [PATCH for-2.11] qcow2: Fix overly broad madvise()

2017-11-14 Thread Eric Blake
On 11/14/2017 12:41 PM, Max Reitz wrote:
> @mem_size and @offset are both size_t, thus subtracting them from one
> another will just return a big size_t if mem_size < offset -- even more
> obvious here because the result is stored in another size_t.
> 
> Checking that result to be positive is therefore not sufficient to
> excluse the case that offset > mem_size.  Thus, we currently sometimes

s/excluse/exclude/

> issue an madvise() over a very large address range.
> 
> This is triggered by iotest 163, but with -m64, this does not result in
> tangible problems.  But with -m32, this test produces three segfaults,
> all of which are fixed by this patch.
> 
> Signed-off-by: Max Reitz 
> ---
>  block/qcow2-cache.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 

Reviewed-by: Eric Blake 
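
To make the wraparound described in the commit message concrete, here is a
small Python sketch that emulates size_t subtraction. It is not the code
from block/qcow2-cache.c: size_t_sub(), length_checked_after() and
length_checked_before() are hypothetical helpers, and the "compare first"
variant is just one plausible way to avoid the problem, not necessarily what
the patch does.

```python
import ctypes

# Width of size_t on the machine running this sketch (64 on most hosts).
SIZE_T_BITS = ctypes.sizeof(ctypes.c_size_t) * 8
SIZE_T_MAX = (1 << SIZE_T_BITS) - 1


def size_t_sub(a, b):
    """Emulate C's unsigned subtraction on size_t: wraps modulo 2**bits."""
    return (a - b) & SIZE_T_MAX


def length_checked_after(mem_size, offset):
    """Hypothetical 'subtract, then test for > 0' pattern (the risky one)."""
    length = size_t_sub(mem_size, offset)
    # The wrapped value is huge and positive, so this check never catches it.
    return length if length > 0 else 0


def length_checked_before(mem_size, offset):
    """Hypothetical safer pattern: compare the operands before subtracting."""
    return mem_size - offset if mem_size > offset else 0
```

When offset is larger than mem_size, the first helper returns a value close
to the maximum size_t, which is the kind of length that would turn into an
madvise() over a very large address range; the second returns 0.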

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH v2 for-2.11] hw/net/vmxnet3: Fix code to work on big endian hosts, too

2017-11-14 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH v2 for-2.11] hw/net/vmxnet3: Fix code to work on 
big endian hosts, too
Type: series
Message-id: 1510658424-16527-1-git-send-email-th...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]
patchew/1510658424-16527-1-git-send-email-th...@redhat.com -> 
patchew/1510658424-16527-1-git-send-email-th...@redhat.com
Switched to a new branch 'test'
2d5d069b29 hw/net/vmxnet3: Fix code to work on big endian hosts, too

=== OUTPUT BEGIN ===
Checking PATCH 1/1: hw/net/vmxnet3: Fix code to work on big endian hosts, too...
ERROR: spaces required around that ':' (ctx:VxV)
#235: FILE: hw/net/vmxnet3.h:232:
+u32 msscof:14;  /* MSS, checksum offset, flags */
   ^

ERROR: spaces required around that ':' (ctx:VxV)
#236: FILE: hw/net/vmxnet3.h:233:
+u32 ext1:1;
 ^

ERROR: spaces required around that ':' (ctx:VxV)
#237: FILE: hw/net/vmxnet3.h:234:
+u32 dtype:1;/* descriptor type */
  ^

ERROR: spaces required around that ':' (ctx:VxV)
#238: FILE: hw/net/vmxnet3.h:235:
+u32 rsvd:1;
 ^

ERROR: spaces required around that ':' (ctx:VxV)
#239: FILE: hw/net/vmxnet3.h:236:
+u32 gen:1;  /* generation bit */
^

ERROR: spaces required around that ':' (ctx:VxV)
#240: FILE: hw/net/vmxnet3.h:237:
+u32 len:14;
^

ERROR: spaces required around that ':' (ctx:VxV)
#248: FILE: hw/net/vmxnet3.h:239:
+u32 len:14;
^

ERROR: spaces required around that ':' (ctx:VxV)
#249: FILE: hw/net/vmxnet3.h:240:
+u32 gen:1;  /* generation bit */
^

ERROR: spaces required around that ':' (ctx:VxV)
#250: FILE: hw/net/vmxnet3.h:241:
+u32 rsvd:1;
 ^

ERROR: spaces required around that ':' (ctx:VxV)
#251: FILE: hw/net/vmxnet3.h:242:
+u32 dtype:1;/* descriptor type */
  ^

ERROR: spaces required around that ':' (ctx:VxV)
#252: FILE: hw/net/vmxnet3.h:243:
+u32 ext1:1;
 ^

ERROR: spaces required around that ':' (ctx:VxV)
#253: FILE: hw/net/vmxnet3.h:244:
+u32 msscof:14;  /* MSS, checksum offset, flags */
   ^

ERROR: trailing whitespace
#259: FILE: hw/net/vmxnet3.h:249:
+$

WARNING: architecture specific defines should be avoided
#308: FILE: hw/net/vmxnet3.h:306:
+#ifdef __BIG_ENDIAN_BITFIELD

WARNING: architecture specific defines should be avoided
#326: FILE: hw/net/vmxnet3.h:321:
+#ifdef __BIG_ENDIAN_BITFIELD

total: 13 errors, 2 warnings, 441 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

Re: [Qemu-devel] [PATCH v2] linux-user: fix is_proc_myself to check the paths via realpath

2017-11-14 Thread Laurent Vivier
Le 11/11/2017 à 02:48, Zach Riggle a écrit :
> I wrote up a quick example to show that this should work specifically for
> /proc/self/exe:
> 
> #define _GNU_SOURCE
> #include <fcntl.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> int main(int argc, char** argv) {
>     int fd = open("/proc/self/exe", O_NOFOLLOW | O_PATH);
>     system("ls -la /proc/$PPID/fd/");
> }
> 

And what about a readlink() in a loop until we cross "/proc/" (or not)?

Thanks,
Laurent



Re: [Qemu-devel] [PATCH] qapi: block-core: Clarify events emitted by 'block-job-cancel'

2017-11-14 Thread no-reply
Hi,

This series failed automatic build test. Please find the testing commands and
their output below. If you have docker installed, you can probably reproduce it
locally.

Subject: [Qemu-devel] [PATCH] qapi: block-core: Clarify events emitted by 
'block-job-cancel'
Type: series
Message-id: 20171114191605.22349-1-kcham...@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
set -e
git submodule update --init dtc
# Let docker tests dump environment info
export SHOW_ENV=1
export J=8
time make docker-test-quick@centos6
time make docker-test-build@min-glib
time make docker-test-mingw@fedora
time make docker-test-block@fedora
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
375c460e8a qapi: block-core: Clarify events emitted by 'block-job-cancel'

=== OUTPUT BEGIN ===
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into '/var/tmp/patchew-tester-tmp-q1izxxh8/src/dtc'...
Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d'
  BUILD   centos6
make[1]: Entering directory '/var/tmp/patchew-tester-tmp-q1izxxh8/src'
  GEN 
/var/tmp/patchew-tester-tmp-q1izxxh8/src/docker-src.2017-11-14-14.25.09.27009/qemu.tar
Cloning into 
'/var/tmp/patchew-tester-tmp-q1izxxh8/src/docker-src.2017-11-14-14.25.09.27009/qemu.tar.vroot'...
done.
Checking out files: 100% (5654/5654), done.
Your branch is up-to-date with 'origin/test'.
Submodule 'dtc' (git://git.qemu-project.org/dtc.git) registered for path 'dtc'
Cloning into 
'/var/tmp/patchew-tester-tmp-q1izxxh8/src/docker-src.2017-11-14-14.25.09.27009/qemu.tar.vroot/dtc'...
Submodule path 'dtc': checked out '558cd81bdd432769b59bff01240c44f82cfb1a9d'
Submodule 'ui/keycodemapdb' (git://git.qemu.org/keycodemapdb.git) registered 
for path 'ui/keycodemapdb'
Cloning into 
'/var/tmp/patchew-tester-tmp-q1izxxh8/src/docker-src.2017-11-14-14.25.09.27009/qemu.tar.vroot/ui/keycodemapdb'...
Submodule path 'ui/keycodemapdb': checked out 

Re: [Qemu-devel] [PATCH 1/5 for-2.11?] qcow2: reject unaligned offsets in write compressed

2017-11-14 Thread Eric Blake
On 11/14/2017 12:30 PM, Anton Nefedov wrote:
> On 14/11/2017 7:50 PM, Eric Blake wrote:
>> On 11/14/2017 04:16 AM, Anton Nefedov wrote:
>>> Misaligned compressed write is not supported.
>>>
>>> Signed-off-by: Anton Nefedov 
>>> ---
>>>   block/qcow2.c | 4 
>>>   1 file changed, 4 insertions(+)
>>
>> Should this one be applied in 2.11?
>>
> 
> For the record, this one is pretty hard to trigger; backup and qemu-img
> convert currently use compressed write, both make sure they operate in
> clusters.
> 
> qemu-io is almighty though

Okay, then we definitely have a bug, and this patch is definitely 2.11
material, especially if you update the commit message to show the
trigger case:

> 
> qemu-io> write -c -P 7 512 64k
> wrote 65536/65536 bytes at offset 512
> 64 KiB, 1 ops; 0.0187 sec (3.329 MiB/sec and 53.2566 ops/sec)
> qemu-io> read -P 7 512 64k
> Pattern verification failed at offset 512, 65536 bytes
> read 65536/65536 bytes at offset 512
> 64 KiB, 1 ops; 0.0002 sec (248.016 MiB/sec and 3968.2540 ops/sec)
> qemu-io> read -P 7 0 64k
> read 65536/65536 bytes at offset 0
> 64 KiB, 1 ops; 0.0000 sec (1.606 GiB/sec and 26315.7895 ops/sec)
> 
> /Anton
> 
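
For readers following along, the kind of check the patch subject describes
can be sketched as below.  The patch body is not quoted in this thread, so
this is an assumption about its shape rather than the actual change:
check_compressed_write_offset() is a hypothetical name, and the 64 KiB
cluster size in the usage note is the qcow2 default, not something stated
in the thread.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical helper: accept a compressed write only if it starts on a
 * cluster boundary.  cluster_size is assumed to be a power of two.
 */
static int check_compressed_write_offset(uint64_t offset,
                                         uint64_t cluster_size)
{
    if (offset & (cluster_size - 1)) {
        return -EINVAL; /* misaligned compressed write: reject */
    }
    return 0;
}
```

With a 64 KiB cluster, the "write -c -P 7 512 64k" from the trigger case
would be rejected up front because offset 512 is not cluster-aligned,
instead of succeeding and then failing pattern verification on read.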

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH for-2.11? v7 6/6] tests: Add check-qobject for equality tests

2017-11-14 Thread Eric Blake
On 11/14/2017 12:01 PM, Max Reitz wrote:
> Add a new test file (check-qobject.c) for unit tests that concern
> QObjects as a whole.
> 
> Its only purpose for now is to test the qobject_is_equal() function.
> 

> + * Note that qobject_is_equal() is not really an equivalence relation,
> + * so this function may not be used for all objects (reflexivity is
> + * not guaranteed, e.g. in the case of a QNum containing NaN).
> + *
> + * The @_ argument is required because a boolean may not be the last
> + * argument before a variadic argument list (C11 7.16.1.4 para. 4).

C99 7.15.1.4 (did C11 add a section? /me goes and looks... Oh, it did.)

Okay, so it's not a typo after all.  Ignore my comment in the cover
letter, and

Reviewed-by: Eric Blake 

(And soon, I'll have to start quoting C17 instead of C99 or C11...)

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH for-2.11? v7 0/6] block: Don't compare strings in bdrv_reopen_prepare()

2017-11-14 Thread Eric Blake
On 11/14/2017 12:01 PM, Max Reitz wrote:
> bdrv_reopen_prepare() assumes that all BDS options are strings, which is
> not necessarily correct. This series introduces a new qobject_is_equal()
> function which can be used to test whether any options have changed,
> independently of their type.
> 
> 
> v7:
> - Patch 6: Fix a clang warning:
> tests/check-qobject.c:39:24: error: passing an object that undergoes
>  default argument promotion to
>  'va_start' has undefined behavior
>   TIL: You cannot use va_start(ap, foo) if @foo is a bool.  An int
>works, however.

I knew that va_arg(ap, bool) was undefined behavior, but didn't realize
va_start(ap, bool_param) was also undefined.  But sure enough, reading
C99 7.15.1.4:

4 The parameter parmN is the identifier of the rightmost parameter in
the variable parameter list in the function definition (the one just
before the , ...). If the parameter parmN is declared with the register
storage class, with a function or array type, or with a type that is
not compatible with the type that results after application of the
default argument promotions, the behavior is undefined.

>Feel free to explain the long version to me, because I don't
>think I have fully understood the issue, but it's something like
>"Using bools for variadic arguments results in their promotion to
>an int, but you have to use a type that is promoted to itself
>(like int)."

So it must be that C99 is trying to cater to platforms that have special
ABI for passing multiple bool on the stack, by stating that the moment
argument promotion is in effect, va_list adjacent to a bool may cause
problems with that special packing in the ABI.  Weird.
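
A minimal sketch of the workaround, independent of the QEMU test code:
sum_until_zero() and SUM() are made-up names for illustration.  The point
is only that the last named parameter handed to va_start() is an int,
which default argument promotion leaves unchanged, while the bool sits
safely before it.

```c
#include <stdarg.h>
#include <stdbool.h>

/*
 * Sum int arguments up to (not including) a terminating 0, optionally
 * negating the result.  "unused" exists only so that the last named
 * parameter has type int rather than bool.
 */
static int sum_until_zero(bool negate, int unused, ...)
{
    va_list ap;
    int sum = 0;
    int value;

    (void)unused;
    va_start(ap, unused); /* va_start(ap, negate) would be undefined */
    while ((value = va_arg(ap, int)) != 0) {
        sum += value;
    }
    va_end(ap);

    return negate ? -sum : sum;
}

/* Hide the dummy argument and the terminator from callers. */
#define SUM(negate, ...) sum_until_zero(negate, 0, __VA_ARGS__, 0)
```

Wrapping the call in a macro keeps the dummy argument out of sight, which
is the same trick the check_equal()/check_unequal() macros in patch 6/6
use.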

> 
> 
> git-backport-diff against v6:
> 
> Key:
> [----] : patches are identical
> [####] : number of functional differences between upstream/downstream patch
> [down] : patch is downstream-only
> The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively
> 
> 001/6:[----] [--] 'qapi/qnull: Add own header'
> 002/6:[----] [--] 'qapi/qlist: Add qlist_append_null() macro'
> 003/6:[----] [--] 'qapi: Add qobject_is_equal()'
> 004/6:[----] [--] 'block: qobject_is_equal() in bdrv_reopen_prepare()'
> 005/6:[----] [--] 'iotests: Add test for non-string option reopening'
> 006/6:[0011] [FC] 'tests: Add check-qobject for equality tests'

I still think this is 2.11 material.  Once you fix the typo I point out
separately on 6/6, the changes since v6 look reasonable, so series:
Reviewed-by: Eric Blake 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





[Qemu-devel] [PATCH] qapi: block-core: Clarify events emitted by 'block-job-cancel'

2017-11-14 Thread Kashyap Chamarthy
When you cancel an in-progress live block operation with QMP
`block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED.  However,
when `block-job-cancel` is issued after `drive-mirror` has indicated (by
emitting the event BLOCK_JOB_READY) that the source and destination
remain synchronized:

[...] # Snip `drive-mirror` invocation & outputs
{
  "execute":"block-job-cancel",
  "arguments":{
    "device":"virtio0"
  }
}

{"return": {}}

It (`block-job-cancel`) will counterintuitively emit the event
'BLOCK_JOB_COMPLETED':

{
  "timestamp":{
    "seconds":1510678024,
    "microseconds":526240
  },
  "event":"BLOCK_JOB_COMPLETED",
  "data":{
    "device":"virtio0",
    "len":41126400,
    "offset":41126400,
    "speed":0,
    "type":"mirror"
  }
}

But this is expected behaviour, where the _COMPLETED event indicates
that synchronization has successfully ended (and the destination has a
point-in-time copy, which is at the time of cancel).

So add a small note to this effect.  (Thanks: Max Reitz for reminding
me of this on IRC.)
---
 qapi/block-core.json | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/qapi/block-core.json b/qapi/block-core.json
index ab96e348e6317bb42769ae20f4a4519bac02e93a..e43a7eaeb22b92c613edcb4219ed4f0e928577b6 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -2065,6 +2065,14 @@
 # BLOCK_JOB_CANCELLED event.  Before that happens the job is still visible when
 # enumerated using query-block-jobs.
 #
+# Note: The 'block-job-cancel' command will emit the event BLOCK_JOB_COMPLETED
+# if you issue it ('block-job-cancel') after 'drive-mirror' has
+# indicated (by emitting the event BLOCK_JOB_READY) that the source and
+# destination remain synchronized.  In this case, the BLOCK_JOB_COMPLETED event
+# indicates that synchronization (from `drive-mirror`) has successfully ended
+# and the destination now has a point-in-time copy, which is at the time of
+# cancel.
+#
 # For streaming, the image file retains its backing file unless the streaming
 # operation happens to complete just as it is being cancelled.  A new streaming
 # operation can be started at a later time to finish copying all data from the
-- 
2.9.5
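
For management-application authors, the behaviour documented by this patch
can be captured in a small decision helper.  This is a toy model under
stated assumptions, not QEMU code: expected_cancel_event() is a
hypothetical name, it only covers the drive-mirror case described in the
note, and it ignores error paths.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Toy model: which event should a client expect after issuing
 * 'block-job-cancel'?  ready_seen says whether BLOCK_JOB_READY was
 * already received for this job.
 */
static const char *expected_cancel_event(const char *job_type,
                                         bool ready_seen)
{
    if (strcmp(job_type, "mirror") == 0 && ready_seen) {
        /* Source and destination were in sync: cancel ends the job
         * gracefully and the destination is a point-in-time copy. */
        return "BLOCK_JOB_COMPLETED";
    }
    return "BLOCK_JOB_CANCELLED";
}
```

A client that waits only for BLOCK_JOB_CANCELLED after cancelling a ready
mirror job would wait forever, which is why the note is worth having in
the schema documentation.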




Re: [Qemu-devel] HAXM is now open source

2017-11-14 Thread John Snow


On 11/14/2017 06:09 AM, Thomas Huth wrote:
> On 14.11.2017 09:54, Yu Ning wrote:
>> Hello,
>>
>> As some of you may have noticed, since QEMU 2.9.0, an accelerator known
>> as “hax” has been available for Windows and macOS builds of QEMU, thanks
>> to the hard work of Vincent Palatin and help from this community (Paolo
>> Bonzini, Stefan Weil, et al.).
>>
>> The accelerator requires a host kernel module (driver) known as Intel
>> Hardware Accelerated Execution Manager (HAXM), i.e. intelhaxm.sys on
>> Windows or intelhaxm.kext on macOS, similar to how the KVM accelerator
>> depends on kvm.ko on Linux.
>>
>> Today, we released the source code of the HAXM kernel module under the
>> BSD 3-clause license:
>>
>> https://github.com/intel/haxm
>>
>> We look forward to working with the community to improve HAXM (both the
>> kernel module and the accelerator). The code is accompanied by some
>> basic documentation (README.md and API.md), which is incomplete, but
>> hopefully helps people get started. If you have any questions or
>> suggestions, please create an issue or post a comment on GitHub.
> 
> That's great news! I hope this all will help to promote QEMU on Windows
> and macOS quite a bit!
> 
> However, during the past months, I noticed a couple of times that users
> ask on IRC or the qemu-discuss mailing list how they could accelerate
> their QEMU on Windows - and they are running only in TCG mode when you
> ask how they start QEMU. So it seems like there is not much knowledge
> about "--accel hax" in the public yet. Maybe you could write a nice blog
> post for the QEMU blog or something similar that explains how to use
> HAXM with QEMU on Windows for the normal users? Or maybe make it more
> prominent in the QEMU wiki? (e.g. the main page only mentions KVM and
> Xen, but not HAXM)
> 
>  Thomas
> 

A blog post advertising this new development would be an absolute
miracle for linking to people who are just getting started with QEMU on
Windows.

(It would also be really good for idiots like me, who do not use Windows
for anything other than playing video games and sometimes forget that it
is capable of doing other things.)

--js



Re: [Qemu-devel] [Nbd] [Qemu-block] How to online resize qemu disk with nbd protocol?

2017-11-14 Thread Eric Blake
On 11/14/2017 11:37 AM, Wouter Verhelst wrote:
> On Tue, Nov 14, 2017 at 10:41:39AM -0600, Eric Blake wrote:
>> Another thought - with structured replies, we finally have a way to let
>> the client ask for the server to send resize information whenever the
>> server wants, rather than having to be polled by a new client request
>> all the time.  This is possible by having the server reply with a chunk
>> without the NBD_REPLY_FLAG_DONE bit, for as many times as it wants,
>> (that is, the server never officially ends the response to the single
>> client request for on-going status, until the client sends an
>> NBD_CMD_DISC).
> 
> Hrm, yeah, that could work.
> 
> Minor downside of this would be that a client would now be expected to
> continue listening "forever" (probably needs to do a blocking read() or
> a select() on the socket), whereas with the current situation a client
> could get away with only reading for as long as it expects data.
> 
> I don't think that should be a blocker, but it might be something we
> might want to document.
> 
>> I don't think the server should go into this mode without a flag bit
>> from the client requesting it (as it potentially ties up a thread that
>> could otherwise be used for parallel processing of other requests),
> 
> Yeah. I think we should probably initiate this with a BLOCK_STATUS
> message that has a flag with which we mean "don't stop sending data on
> the given region for contexts that support it".

Now we're mixing NBD_CMD_BLOCK_STATUS and NBD_CMD_RESIZE; I was thinking
of the open-ended command for being informed of all
server-side-initiated size changes in response to RESIZE; but your
mention of an open-ended BLOCK_STATUS has an interesting connotation of
being able to get live updates as sections of a file are dirtied.

I also remember from talking with Vladimir during KVM Forum last month
that one of the shortfalls of the NBD protocol is that you can only ever
send a length of up to 32 bits on the command side (unless we introduce
structured commands in addition to our current work to add structured
replies); even querying the full status of a 1TB volume would require at
least 256 NBD_CMD_BLOCK_STATUS calls.  But having a special case of a
length of 0 meaning to report status as long as the server is interested
could reduce the number of round trips by letting the server report more
than 4G of status in response to one client query.  There was also a
thread a while ago about the possibility of a per-export flag denoting a
volume is known to contain all-zeroes on startup, which can be used as a
handy shortcut to having to query block status over each slice of the
file; we weren't sure about burning a per-export flag on that
information, but having it be a special mode of NBD_CMD_BLOCK_STATUS may
be easy enough to codify.
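
The "at least 256" figure can be checked with a line of arithmetic.  The
sketch below assumes that 1TB means 2^40 bytes and that the only limit is
the 32-bit length field; min_requests() is just an illustrative helper,
not part of any NBD implementation.

```c
#include <stdint.h>

/*
 * Minimum number of requests needed to cover total_bytes when each
 * request can describe at most max_len bytes (ceiling division).
 */
static uint64_t min_requests(uint64_t total_bytes, uint64_t max_len)
{
    return total_bytes / max_len + (total_bytes % max_len != 0);
}
```

min_requests(1ULL << 40, 1ULL << 32) gives 256, but a 32-bit field tops
out at 2^32 - 1, so min_requests(1ULL << 40, UINT32_MAX) is 257: the last
256 bytes need one more round trip.  Either way the count grows linearly
with the export size, which is the motivation for an open-ended query.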

> 
> However, I could imagine that there might be some cases wherein a server
> might be able to go into such a mode for two or more metadata contexts,
> and where a client might want to go into that mode for one of them but
> not all of them, while still wanting some information from them.
> 
> This could be covered with metadata context syntax, but it's annoying
> and shouldn't be necessary.
> 
> I'm starting to think I made a mistake when I said NBD_CMD_BLOCK_STATUS
> can't take a metadata context ID. Okay, there's no space for it, but
> that shouldn't have been a blocker.
> 
> Thoughts?

Nothing says the server has to reply the same length of information when
replying for multiple selected metadata contexts; but if we allow
different reply sizes all in one query, we may also need some way to
easily tell that the server has stopped sending metadata for one context
even though it is still providing additional replies for another context.

And maybe we do want to someday start thinking about structured
requests; where being able to do per-command selection of metadata
contexts (instead of per-export selection) may indeed be the first use case.

> 
>> and that the server could reject a repeat command with the flag if it
>> is already serving a previous open-ended request.
> 
> Right.
> 
> On the other hand, I can imagine that a client might also want to tell
> the server that it is no longer interested in an outstanding request. In
> such a case, it should be able to cancel it.

Good point - if we allow the client to request an open-ended reply, it's
also nice to let the client decide how long that open-endedness should last.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] using "qemu-img convert -O qcow2" to convert qcow v1 to v2 creates a qcow v3 file?

2017-11-14 Thread Max Reitz
On 2017-11-14 19:45, Thomas Huth wrote:
> On 14.11.2017 14:32, Max Reitz wrote:
> [...]
>> Well, do you want to document it?  I'd rather deprecate it altogether.
> 
> Maybe a first step could be to change qemu-img so that it refuses to
> create new qcow1 images (but still can convert them into other formats).
> So basically make qcow1 read-only?

Yep, and the actual first step to that is to make it issue a deprecation
warning when creating qcow v1 images (which is what I proposed). :-)

Max





Re: [Qemu-devel] using "qemu-img convert -O qcow2" to convert qcow v1 to v2 creates a qcow v3 file?

2017-11-14 Thread Thomas Huth
On 14.11.2017 14:32, Max Reitz wrote:
[...]
> Well, do you want to document it?  I'd rather deprecate it altogether.

Maybe a first step could be to change qemu-img so that it refuses to
create new qcow1 images (but still can convert them into other formats).
So basically make qcow1 read-only?

 Thomas





[Qemu-devel] [PATCH for-2.11] qcow2: Fix overly broad madvise()

2017-11-14 Thread Max Reitz
@mem_size and @offset are both size_t, thus subtracting them from one
another will just return a big size_t if mem_size < offset -- even more
obvious here because the result is stored in another size_t.

Checking that result to be positive is therefore not sufficient to
exclude the case that offset > mem_size.  Thus, we currently sometimes
issue an madvise() over a very large address range.

This is triggered by iotest 163, but with -m64, this does not result in
tangible problems.  But with -m32, this test produces three segfaults,
all of which are fixed by this patch.

Signed-off-by: Max Reitz 
---
 block/qcow2-cache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/qcow2-cache.c b/block/qcow2-cache.c
index 75746a7f43..5222a7b94d 100644
--- a/block/qcow2-cache.c
+++ b/block/qcow2-cache.c
@@ -73,7 +73,7 @@ static void qcow2_cache_table_release(BlockDriverState *bs, Qcow2Cache *c,
     size_t mem_size = (size_t) s->cluster_size * num_tables;
     size_t offset = QEMU_ALIGN_UP((uintptr_t) t, align) - (uintptr_t) t;
     size_t length = QEMU_ALIGN_DOWN(mem_size - offset, align);
-    if (length > 0) {
+    if (mem_size > offset && length > 0) {
         madvise((uint8_t *) t + offset, length, MADV_DONTNEED);
     }
 #endif
-- 
2.13.6
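
The wraparound is easy to reproduce outside QEMU.  The following sketch
uses stand-in helpers (align_down(), naive_length() and guarded_length()
are illustrative names, not the QEMU macros) and assumes a power-of-two
alignment.

```c
#include <stddef.h>

/* Round x down to a multiple of align (align must be a power of two). */
static size_t align_down(size_t x, size_t align)
{
    return x & ~(align - 1);
}

/* Unguarded computation: wraps around when offset > mem_size. */
static size_t naive_length(size_t mem_size, size_t offset, size_t align)
{
    return align_down(mem_size - offset, align);
}

/* Guarded computation: 0 when there is nothing left after the offset. */
static size_t guarded_length(size_t mem_size, size_t offset, size_t align)
{
    if (mem_size <= offset) {
        return 0;
    }
    return align_down(mem_size - offset, align);
}
```

With mem_size = 512, offset = 2048 and align = 4096, naive_length()
returns a value just below SIZE_MAX, so a "length > 0" test passes, while
guarded_length() returns 0 and the madvise() call is skipped.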




Re: [Qemu-devel] [PATCH 1/5 for-2.11?] qcow2: reject unaligned offsets in write compressed

2017-11-14 Thread Anton Nefedov

On 14/11/2017 7:50 PM, Eric Blake wrote:

On 11/14/2017 04:16 AM, Anton Nefedov wrote:

Misaligned compressed write is not supported.

Signed-off-by: Anton Nefedov 
---
  block/qcow2.c | 4 
  1 file changed, 4 insertions(+)


Should this one be applied in 2.11?



For the record, this one is pretty hard to trigger; backup and qemu-img
convert currently use compressed write, both make sure they operate in
clusters.

qemu-io is almighty though

qemu-io> write -c -P 7 512 64k
wrote 65536/65536 bytes at offset 512
64 KiB, 1 ops; 0.0187 sec (3.329 MiB/sec and 53.2566 ops/sec)
qemu-io> read -P 7 512 64k
Pattern verification failed at offset 512, 65536 bytes
read 65536/65536 bytes at offset 512
64 KiB, 1 ops; 0.0002 sec (248.016 MiB/sec and 3968.2540 ops/sec)
qemu-io> read -P 7 0 64k
read 65536/65536 bytes at offset 0
64 KiB, 1 ops; 0.0000 sec (1.606 GiB/sec and 26315.7895 ops/sec)

/Anton



Re: [Qemu-devel] [PULL 00/20] Block patches for 2.11.0-rc1

2017-11-14 Thread Peter Maydell
On 14 November 2017 at 17:23, Max Reitz  wrote:
> The following changes since commit 191b5fbfa66e5b23e2150f3c6981d30eb84418a9:
>
>   Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' 
> into staging (2017-11-14 16:11:19 +)
>
> are available in the git repository at:
>
>   git://github.com/XanClic/qemu.git tags/pull-block-2017-11-14
>
> for you to fetch changes up to 8b2d7c364d9a2491f7501f6688cd722045cf808a:
>
>   qemu-iotests: update unsupported image formats in 194 (2017-11-14 18:06:26 
> +0100)
>
> 
> Block patches for 2.11.0-rc1


Applied, thanks.

-- PMM



Re: [Qemu-devel] [Qemu-ppc] How to debug crash in TCG code?

2017-11-14 Thread Paolo Bonzini
On 15/10/2017 13:30, BALATON Zoltan wrote:
> I've got a bit further with this but still could use some hints to find
> what is happening. Here are some more details I've found so far.
> 
> The memory map I have (see below) is a bit complex but the interesting
> part is that I have sii3112.bar5 as an mmio region with sii3112.bar0-4
> as io region aliases into this. The crash is happening when the firmware
> is accessing one of these aliased io regions when
> 
> tlb_set_page_with_attrs: vaddr=d8001000 paddr=0x000c08001000 prot=3
> idx=1
> 
> is called in accel/tcg/cputlb.c:616 which then calls
> 
> 635    section = address_space_translate_for_iotlb(cpu, asidx,
> paddr, &xlat, &sz);
> 
> this in turn calls address_space_translate_internal which calls
> 
> 441    section = address_space_lookup_region(d, addr, resolve_subpage);
> 
> that eventually gets the cached section at exec.c:411
> 
> 411    MemoryRegionSection *section = atomic_read(&d->mru_section);
> 
> When this is not a region covering the address as verified by

Could it be that the cached region is only for a small part of the page,
while phys_page_find returns a subpage (and resolve_subpage is false)?

Maybe it's enough to skip mru_section if resolve_subpage is false.

Thanks,

Paolo



[Qemu-devel] [PATCH for-2.11? v7 6/6] tests: Add check-qobject for equality tests

2017-11-14 Thread Max Reitz
Add a new test file (check-qobject.c) for unit tests that concern
QObjects as a whole.

Its only purpose for now is to test the qobject_is_equal() function.

Signed-off-by: Max Reitz 
---
 tests/Makefile.include |   4 +-
 tests/check-qobject.c  | 328 +
 tests/.gitignore   |   1 +
 3 files changed, 332 insertions(+), 1 deletion(-)
 create mode 100644 tests/check-qobject.c

diff --git a/tests/Makefile.include b/tests/Makefile.include
index 434a2ce868..c002352134 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -41,6 +41,7 @@ check-unit-y += tests/check-qlist$(EXESUF)
 gcov-files-check-qlist-y = qobject/qlist.c
 check-unit-y += tests/check-qnull$(EXESUF)
 gcov-files-check-qnull-y = qobject/qnull.c
+check-unit-y += tests/check-qobject$(EXESUF)
 check-unit-y += tests/check-qjson$(EXESUF)
 gcov-files-check-qjson-y = qobject/qjson.c
 check-unit-y += tests/check-qlit$(EXESUF)
@@ -546,7 +547,7 @@ GENERATED_FILES += tests/test-qapi-types.h 
tests/test-qapi-visit.h \
tests/test-qmp-introspect.h
 
 test-obj-y = tests/check-qnum.o tests/check-qstring.o tests/check-qdict.o \
-   tests/check-qlist.o tests/check-qnull.o \
+   tests/check-qlist.o tests/check-qnull.o tests/check-qobject.o \
tests/check-qjson.o tests/check-qlit.o \
tests/test-coroutine.o tests/test-string-output-visitor.o \
tests/test-string-input-visitor.o tests/test-qobject-output-visitor.o \
@@ -580,6 +581,7 @@ tests/check-qstring$(EXESUF): tests/check-qstring.o 
$(test-util-obj-y)
 tests/check-qdict$(EXESUF): tests/check-qdict.o $(test-util-obj-y)
 tests/check-qlist$(EXESUF): tests/check-qlist.o $(test-util-obj-y)
 tests/check-qnull$(EXESUF): tests/check-qnull.o $(test-util-obj-y)
+tests/check-qobject$(EXESUF): tests/check-qobject.o $(test-util-obj-y)
 tests/check-qjson$(EXESUF): tests/check-qjson.o $(test-util-obj-y)
 tests/check-qlit$(EXESUF): tests/check-qlit.o $(test-util-obj-y)
 tests/check-qom-interface$(EXESUF): tests/check-qom-interface.o 
$(test-qom-obj-y)
diff --git a/tests/check-qobject.c b/tests/check-qobject.c
new file mode 100644
index 0000000000..03e9175113
--- /dev/null
+++ b/tests/check-qobject.c
@@ -0,0 +1,328 @@
+/*
+ * Generic QObject unit-tests.
+ *
+ * Copyright (C) 2017 Red Hat Inc.
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1 or later.
+ * See the COPYING.LIB file in the top-level directory.
+ */
+#include "qemu/osdep.h"
+
+#include "qapi/qmp/types.h"
+#include "qemu-common.h"
+
+#include <math.h>
+
+/* Marks the end of the test_equality() argument list.
+ * We cannot use NULL there because that is a valid argument. */
+static QObject test_equality_end_of_arguments;
+
+/**
+ * Test whether all variadic QObject *arguments are equal (@expected
+ * is true) or whether they are all not equal (@expected is false).
+ * Every QObject is tested to be equal to itself (to test
+ * reflexivity), all tests are done both ways (to test symmetry), and
+ * transitivity is not assumed but checked (each object is compared to
+ * every other one).
+ *
+ * Note that qobject_is_equal() is not really an equivalence relation,
+ * so this function may not be used for all objects (reflexivity is
+ * not guaranteed, e.g. in the case of a QNum containing NaN).
+ *
+ * The @_ argument is required because a boolean may not be the last
+ * argument before a variadic argument list (C11 7.16.1.4 para. 4).
+ */
+static void do_test_equality(bool expected, int _, ...)
+{
+    va_list ap_count, ap_extract;
+    QObject **args;
+    int arg_count = 0;
+    int i, j;
+
+    va_start(ap_count, _);
+    va_copy(ap_extract, ap_count);
+    while (va_arg(ap_count, QObject *) != &test_equality_end_of_arguments) {
+        arg_count++;
+    }
+    va_end(ap_count);
+
+    args = g_new(QObject *, arg_count);
+    for (i = 0; i < arg_count; i++) {
+        args[i] = va_arg(ap_extract, QObject *);
+    }
+    va_end(ap_extract);
+
+    for (i = 0; i < arg_count; i++) {
+        g_assert(qobject_is_equal(args[i], args[i]) == true);
+
+        for (j = i + 1; j < arg_count; j++) {
+            g_assert(qobject_is_equal(args[i], args[j]) == expected);
+        }
+    }
+}
+
+#define check_equal(...) \
+    do_test_equality(true, 0, __VA_ARGS__, &test_equality_end_of_arguments)
+#define check_unequal(...) \
+    do_test_equality(false, 0, __VA_ARGS__, &test_equality_end_of_arguments)
+
+static void do_free_all(int _, ...)
+{
+    va_list ap;
+    QObject *obj;
+
+    va_start(ap, _);
+    while ((obj = va_arg(ap, QObject *)) != NULL) {
+        qobject_decref(obj);
+    }
+    va_end(ap);
+}
+
+#define free_all(...) \
+    do_free_all(0, __VA_ARGS__, NULL)
+
+static void qobject_is_equal_null_test(void)
+{
+    check_unequal(qnull(), NULL);
+}
+
+static void qobject_is_equal_num_test(void)
+{
+    QNum *u0, *i0, *d0, *dnan, *um42, *im42, *dm42;
+
+    u0 = qnum_from_uint(0u);
+    i0 = qnum_from_int(0);
+    d0 = qnum_from_double(0.0);

Re: [Qemu-devel] [PATCH] exec: Fix section_covers_addr() for sections with non-zero offset

2017-11-14 Thread Paolo Bonzini
On 21/10/2017 13:24, BALATON Zoltan wrote:
> When a section with non-0 offset_within_region field is tested to
> cover an address the offset should be taken into account as well.
> 
> This fixes a crash caused by picking the wrong memory region in
> address_space_lookup_region seen with client code accessing a device
> model that uses alias memory regions.
> 
> Signed-off-by: BALATON Zoltan 
> ---
> This seems to fix the problem described in
> http://lists.nongnu.org/archive/html/qemu-devel/2017-10/msg03356.html
> but I'm not completely sure about it. This seems to be introduced in
> 729633c exec: Introduce AddressSpaceDispatch.mru_section and the patch
> before that which split off section_covers_addr from phys_page_find so
> this patch also changes that caller. Is that OK to do? It appears to
> work but I don't know this part of QEMU.
> 
> Also the bug seems to be caused by section_covers_addr accepting
> sii3112.bar5 when that's the mru_section instead of picking
> sii3112.bar0 (which it picks when going through phys_page_find) when
> client code is accessing 0xc08001006 from this address map (full
> address map is at above URL):
> 
> address-space: memory
> 000c0800-000c0800 (prio 0, i/o): alias isa_mmio @io 
> -
> 
> address-space: I/O
>   - (prio 0, i/o): io
> 1000-1007 (prio 1, i/o): alias sii3112.bar0 
> @sii3112.bar5 0080-0087
> 1008-100b (prio 1, i/o): alias sii3112.bar1 
> @sii3112.bar5 0088-008b
> 1010-1017 (prio 1, i/o): alias sii3112.bar2 
> @sii3112.bar5 00c0-00c7
> 1018-101b (prio 1, i/o): alias sii3112.bar3 
> @sii3112.bar5 00c8-00cb
> 1020-102f (prio 1, i/o): alias sii3112.bar4 
> @sii3112.bar5 -000f
> 
> which this patch fixes but would the same problem happen if the
> mru_section is bar5 but bar4 is accessed? I could not reproduce that
> case but then the offset is 0 but in this case the address would be
> above 0xc08001020 and size is 0xf so they probably won't match. But
> this is only because of the size of the region. Could that mean the
> bug is caused by something else and should be fixed elsewhere?
> 
> ---
>  exec.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/exec.c b/exec.c
> index db5ae23..a915817 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -370,7 +370,8 @@ static inline bool section_covers_addr(const 
> MemoryRegionSection *section,
>   * the section must cover the entire address space.
>   */
>  return int128_gethi(section->size) ||
> -   range_covers_byte(section->offset_within_address_space,
> +   range_covers_byte(section->offset_within_address_space +
> + section->offset_within_region,
>   int128_getlo(section->size), addr);
>  }

Sorry, this is incorrect.  addr is an address in the address space, and
range_covers_byte checks if it is between
section->offset_within_address_space and
section->offset_within_address_space + section->size.  I am not sure how
things don't explode completely by adding section->offset_within_region
(probably it's just because section->offset_within_region is usually 0).

Paolo
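
To make the objection above concrete, here is a minimal sketch. It is not
QEMU code: ToySection and the toy_* helpers are hypothetical stand-ins
that keep only the fields discussed in this thread, and the numbers are
loosely modelled on the sii3112.bar0 alias in the quoted address map
(I/O offset 0x1000, region offset 0x80, size 8).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for range_covers_byte(): true when byte lies in
 * [offset, offset + len - 1].  Assumes len > 0 and no wraparound. */
static bool toy_range_covers_byte(uint64_t offset, uint64_t len, uint64_t byte)
{
    assert(len > 0);
    return offset <= byte && byte <= offset + len - 1;
}

/* Hypothetical, trimmed-down section: only the fields discussed here. */
typedef struct ToySection {
    uint64_t offset_within_address_space; /* start of section in the AS */
    uint64_t offset_within_region;        /* start inside the target region */
    uint64_t size;
} ToySection;

/* Check as described in the reply: addr is an address-space address, so
 * it is compared against address-space offsets only. */
static bool toy_covers(const ToySection *s, uint64_t addr)
{
    return toy_range_covers_byte(s->offset_within_address_space,
                                 s->size, addr);
}

/* Check as changed by the patch: the window is shifted by the region
 * offset, which is a coordinate in a different space. */
static bool toy_covers_patched(const ToySection *s, uint64_t addr)
{
    return toy_range_covers_byte(s->offset_within_address_space +
                                 s->offset_within_region,
                                 s->size, addr);
}
```

With a section of { 0x1000, 0x80, 8 }, toy_covers() accepts 0x1006 while
toy_covers_patched() rejects it and accepts 0x1084 instead, an address
the section does not occupy in the address space.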



[Qemu-devel] [PATCH for-2.11? v7 5/6] iotests: Add test for non-string option reopening

2017-11-14 Thread Max Reitz
Signed-off-by: Max Reitz 
Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
---
 tests/qemu-iotests/133 | 9 +
 tests/qemu-iotests/133.out | 5 +
 2 files changed, 14 insertions(+)

diff --git a/tests/qemu-iotests/133 b/tests/qemu-iotests/133
index 9d35a6a1ca..af6b3e1dd4 100755
--- a/tests/qemu-iotests/133
+++ b/tests/qemu-iotests/133
@@ -83,6 +83,15 @@ $QEMU_IO -c 'reopen -o driver=qcow2' $TEST_IMG
 $QEMU_IO -c 'reopen -o file.driver=file' $TEST_IMG
 $QEMU_IO -c 'reopen -o backing.driver=qcow2' $TEST_IMG
 
+echo
+echo "=== Check that reopening works with non-string options ==="
+echo
+
+# Using the json: pseudo-protocol we can create non-string options
+# (Invoke 'info' just so we get some output afterwards)
+IMGOPTSSYNTAX=false $QEMU_IO -f null-co -c 'reopen' -c 'info' \
+"json:{'driver': 'null-co', 'size': 65536}"
+
 # success, all done
 echo "*** done"
 rm -f $seq.full
diff --git a/tests/qemu-iotests/133.out b/tests/qemu-iotests/133.out
index cc86b94880..f4a85aeb63 100644
--- a/tests/qemu-iotests/133.out
+++ b/tests/qemu-iotests/133.out
@@ -19,4 +19,9 @@ Cannot change the option 'driver'
 
 === Check that unchanged driver is okay ===
 
+
+=== Check that reopening works with non-string options ===
+
+format name: null-co
+format name: null-co
 *** done
-- 
2.13.6




[Qemu-devel] [PATCH for-2.11? v7 4/6] block: qobject_is_equal() in bdrv_reopen_prepare()

2017-11-14 Thread Max Reitz
Currently, bdrv_reopen_prepare() assumes that all BDS options are
strings. However, this is not the case if the BDS has been created
through the json: pseudo-protocol or blockdev-add.

Note that the user-invokable reopen command is an HMP command, so you
can only specify strings there. Therefore, specifying a non-string
option with the "same" value as it was when originally created will now
return an error because the values are supposedly similar (and there is
no way for the user to circumvent this but to just not specify the
option again -- however, this is still strictly better than just
crashing).

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
---
 block.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/block.c b/block.c
index 684cb018da..28889a2690 100644
--- a/block.c
+++ b/block.c
@@ -3069,19 +3069,26 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, 
BlockReopenQueue *queue,
 const QDictEntry *entry = qdict_first(reopen_state->options);
 
 do {
-QString *new_obj = qobject_to_qstring(entry->value);
-const char *new = qstring_get_str(new_obj);
+QObject *new = entry->value;
+QObject *old = qdict_get(reopen_state->bs->options, entry->key);
+
 /*
- * Caution: while qdict_get_try_str() is fine, getting
- * non-string types would require more care.  When
- * bs->options come from -blockdev or blockdev_add, its
- * members are typed according to the QAPI schema, but
- * when they come from -drive, they're all QString.
+ * TODO: When using -drive to specify blockdev options, all values
+ * will be strings; however, when using -blockdev, blockdev-add or
+ * filenames using the json:{} pseudo-protocol, they will be
+ * correctly typed.
+ * In contrast, reopening options are (currently) always strings
+ * (because you can only specify them through qemu-io; all other
+ * callers do not specify any options).
+ * Therefore, when using anything other than -drive to create a 
BDS,
+ * this cannot detect non-string options as unchanged, because
+ * qobject_is_equal() always returns false for objects of different
+ * type.  In the future, this should be remedied by correctly 
typing
+ * all options.  For now, this is not too big of an issue because
+ * the user can simply omit options which cannot be changed anyway,
+ * so they will stay unchanged.
  */
-const char *old = qdict_get_try_str(reopen_state->bs->options,
-entry->key);
-
-if (!old || strcmp(new, old)) {
+if (!qobject_is_equal(new, old)) {
 error_setg(errp, "Cannot change the option '%s'", entry->key);
 ret = -EINVAL;
 goto error;
-- 
2.13.6




[Qemu-devel] [PATCH for-2.11? v7 3/6] qapi: Add qobject_is_equal()

2017-11-14 Thread Max Reitz
This generic function (along with its implementations for different
types) determines whether two QObjects are equal.
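
As a rough illustration of the idea (a generic entry point that handles
NULL and type mismatches, then dispatches to a per-type comparison), here
is a self-contained sketch. ToyObject and all toy_* names are
hypothetical and much simpler than the real QObject types; this is not
the code added by the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum ToyType { TOY_NULL, TOY_NUM, TOY_STRING, TOY_TYPE_MAX } ToyType;

typedef struct ToyObject {
    ToyType type;
    int64_t num;     /* valid for TOY_NUM */
    const char *str; /* valid for TOY_STRING */
} ToyObject;

static bool toy_null_is_equal(const ToyObject *x, const ToyObject *y)
{
    (void)x;
    (void)y;
    return true; /* all null objects are equal to each other */
}

static bool toy_num_is_equal(const ToyObject *x, const ToyObject *y)
{
    return x->num == y->num;
}

static bool toy_string_is_equal(const ToyObject *x, const ToyObject *y)
{
    return strcmp(x->str, y->str) == 0;
}

/* One comparison function per type, indexed by the type tag. */
static bool (*const toy_is_equal_fn[TOY_TYPE_MAX])(const ToyObject *,
                                                   const ToyObject *) = {
    [TOY_NULL] = toy_null_is_equal,
    [TOY_NUM] = toy_num_is_equal,
    [TOY_STRING] = toy_string_is_equal,
};

static bool toy_object_is_equal(const ToyObject *x, const ToyObject *y)
{
    if (!x && !y) {
        return true;  /* two NULL pointers are equal */
    }
    if (!x || !y || x->type != y->type) {
        return false; /* NULL vs. object, or objects of different type */
    }
    assert(x->type < TOY_TYPE_MAX);
    return toy_is_equal_fn[x->type](x, y);
}
```

In this sketch a number 65536 and a string "65536" compare unequal
because their types differ, and a null object is not equal to a NULL
pointer.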

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Reviewed-by: Markus Armbruster 
---
 include/qapi/qmp/qbool.h   |  1 +
 include/qapi/qmp/qdict.h   |  1 +
 include/qapi/qmp/qlist.h   |  1 +
 include/qapi/qmp/qnull.h   |  2 ++
 include/qapi/qmp/qnum.h|  1 +
 include/qapi/qmp/qobject.h |  9 
 include/qapi/qmp/qstring.h |  1 +
 qobject/qbool.c|  8 +++
 qobject/qdict.c| 29 +
 qobject/qlist.c| 32 +++
 qobject/qnull.c|  9 
 qobject/qnum.c | 54 ++
 qobject/qobject.c  | 29 +
 qobject/qstring.c  |  9 
 14 files changed, 186 insertions(+)

diff --git a/include/qapi/qmp/qbool.h b/include/qapi/qmp/qbool.h
index a4c309..f77ea86c4e 100644
--- a/include/qapi/qmp/qbool.h
+++ b/include/qapi/qmp/qbool.h
@@ -24,6 +24,7 @@ typedef struct QBool {
 QBool *qbool_from_bool(bool value);
 bool qbool_get_bool(const QBool *qb);
 QBool *qobject_to_qbool(const QObject *obj);
+bool qbool_is_equal(const QObject *x, const QObject *y);
 void qbool_destroy_obj(QObject *obj);
 
 #endif /* QBOOL_H */
diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
index 7ea5120c4a..fc218e7be6 100644
--- a/include/qapi/qmp/qdict.h
+++ b/include/qapi/qmp/qdict.h
@@ -43,6 +43,7 @@ void qdict_del(QDict *qdict, const char *key);
 int qdict_haskey(const QDict *qdict, const char *key);
 QObject *qdict_get(const QDict *qdict, const char *key);
 QDict *qobject_to_qdict(const QObject *obj);
+bool qdict_is_equal(const QObject *x, const QObject *y);
 void qdict_iter(const QDict *qdict,
 void (*iter)(const char *key, QObject *obj, void *opaque),
 void *opaque);
diff --git a/include/qapi/qmp/qlist.h b/include/qapi/qmp/qlist.h
index 59d209bbae..ec3fcc1a4c 100644
--- a/include/qapi/qmp/qlist.h
+++ b/include/qapi/qmp/qlist.h
@@ -61,6 +61,7 @@ QObject *qlist_peek(QList *qlist);
 int qlist_empty(const QList *qlist);
 size_t qlist_size(const QList *qlist);
 QList *qobject_to_qlist(const QObject *obj);
+bool qlist_is_equal(const QObject *x, const QObject *y);
 void qlist_destroy_obj(QObject *obj);
 
 static inline const QListEntry *qlist_first(const QList *qlist)
diff --git a/include/qapi/qmp/qnull.h b/include/qapi/qmp/qnull.h
index d075549283..c992ee2ae1 100644
--- a/include/qapi/qmp/qnull.h
+++ b/include/qapi/qmp/qnull.h
@@ -27,4 +27,6 @@ static inline QNull *qnull(void)
 return &qnull_;
 }
 
+bool qnull_is_equal(const QObject *x, const QObject *y);
+
 #endif /* QNULL_H */
diff --git a/include/qapi/qmp/qnum.h b/include/qapi/qmp/qnum.h
index d6b0791139..c3d86794bb 100644
--- a/include/qapi/qmp/qnum.h
+++ b/include/qapi/qmp/qnum.h
@@ -69,6 +69,7 @@ double qnum_get_double(QNum *qn);
 char *qnum_to_string(QNum *qn);
 
 QNum *qobject_to_qnum(const QObject *obj);
+bool qnum_is_equal(const QObject *x, const QObject *y);
 void qnum_destroy_obj(QObject *obj);
 
 #endif /* QNUM_H */
diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h
index ef1d1a9237..38ac68845c 100644
--- a/include/qapi/qmp/qobject.h
+++ b/include/qapi/qmp/qobject.h
@@ -68,6 +68,15 @@ static inline void qobject_incref(QObject *obj)
 }
 
 /**
+ * qobject_is_equal(): Return whether the two objects are equal.
+ *
+ * Any of the pointers may be NULL; return true if both are.  Always
+ * return false if only one is (therefore a QNull object is not
+ * considered equal to a NULL pointer).
+ */
+bool qobject_is_equal(const QObject *x, const QObject *y);
+
+/**
  * qobject_destroy(): Free resources used by the object
  */
 void qobject_destroy(QObject *obj);
diff --git a/include/qapi/qmp/qstring.h b/include/qapi/qmp/qstring.h
index 10076b7c8c..65c05a9be5 100644
--- a/include/qapi/qmp/qstring.h
+++ b/include/qapi/qmp/qstring.h
@@ -31,6 +31,7 @@ void qstring_append_int(QString *qstring, int64_t value);
 void qstring_append(QString *qstring, const char *str);
 void qstring_append_chr(QString *qstring, int c);
 QString *qobject_to_qstring(const QObject *obj);
+bool qstring_is_equal(const QObject *x, const QObject *y);
 void qstring_destroy_obj(QObject *obj);
 
 #endif /* QSTRING_H */
diff --git a/qobject/qbool.c b/qobject/qbool.c
index 0606bbd2a3..ac825fc5a2 100644
--- a/qobject/qbool.c
+++ b/qobject/qbool.c
@@ -52,6 +52,14 @@ QBool *qobject_to_qbool(const QObject *obj)
 }
 
 /**
+ * qbool_is_equal(): Test whether the two QBools are equal
+ */
+bool qbool_is_equal(const QObject *x, const QObject *y)
+{
+return qobject_to_qbool(x)->value == qobject_to_qbool(y)->value;
+}
+
+/**
  * qbool_destroy_obj(): Free all memory allocated by a
  * QBool object
  */
diff --git a/qobject/qdict.c b/qobject/qdict.c
index 

[Qemu-devel] [PATCH for-2.11? v7 1/6] qapi/qnull: Add own header

2017-11-14 Thread Max Reitz
Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
Reviewed-by: Markus Armbruster 
---
 include/qapi/qmp/qdict.h|  1 +
 include/qapi/qmp/qnull.h| 30 ++
 include/qapi/qmp/qobject.h  | 12 
 include/qapi/qmp/types.h|  1 +
 qapi/qapi-clone-visitor.c   |  1 +
 qapi/string-input-visitor.c |  1 +
 qobject/qnull.c |  2 +-
 tests/check-qnull.c |  2 +-
 8 files changed, 36 insertions(+), 14 deletions(-)
 create mode 100644 include/qapi/qmp/qnull.h

diff --git a/include/qapi/qmp/qdict.h b/include/qapi/qmp/qdict.h
index 6588c7f0c8..7ea5120c4a 100644
--- a/include/qapi/qmp/qdict.h
+++ b/include/qapi/qmp/qdict.h
@@ -15,6 +15,7 @@
 
 #include "qapi/qmp/qobject.h"
 #include "qapi/qmp/qlist.h"
+#include "qapi/qmp/qnull.h"
 #include "qapi/qmp/qnum.h"
 #include "qemu/queue.h"
 
diff --git a/include/qapi/qmp/qnull.h b/include/qapi/qmp/qnull.h
new file mode 100644
index 00..d075549283
--- /dev/null
+++ b/include/qapi/qmp/qnull.h
@@ -0,0 +1,30 @@
+/*
+ * QNull
+ *
+ * Copyright (C) 2015 Red Hat, Inc.
+ *
+ * Authors:
+ *  Markus Armbruster 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.1
+ * or later.  See the COPYING.LIB file in the top-level directory.
+ */
+
+#ifndef QNULL_H
+#define QNULL_H
+
+#include "qapi/qmp/qobject.h"
+
+struct QNull {
+QObject base;
+};
+
+extern QNull qnull_;
+
+static inline QNull *qnull(void)
+{
+QINCREF(&qnull_);
+return &qnull_;
+}
+
+#endif /* QNULL_H */
diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h
index eab29edd12..ef1d1a9237 100644
--- a/include/qapi/qmp/qobject.h
+++ b/include/qapi/qmp/qobject.h
@@ -93,16 +93,4 @@ static inline QType qobject_type(const QObject *obj)
 return obj->type;
 }
 
-struct QNull {
-QObject base;
-};
-
-extern QNull qnull_;
-
-static inline QNull *qnull(void)
-{
-QINCREF(&qnull_);
-return &qnull_;
-}
-
 #endif /* QOBJECT_H */
diff --git a/include/qapi/qmp/types.h b/include/qapi/qmp/types.h
index a4bc662bfb..749ac44dcb 100644
--- a/include/qapi/qmp/types.h
+++ b/include/qapi/qmp/types.h
@@ -19,5 +19,6 @@
 #include "qapi/qmp/qstring.h"
 #include "qapi/qmp/qdict.h"
 #include "qapi/qmp/qlist.h"
+#include "qapi/qmp/qnull.h"
 
 #endif /* QAPI_QMP_TYPES_H */
diff --git a/qapi/qapi-clone-visitor.c b/qapi/qapi-clone-visitor.c
index d8b62792bc..daab6819b4 100644
--- a/qapi/qapi-clone-visitor.c
+++ b/qapi/qapi-clone-visitor.c
@@ -12,6 +12,7 @@
 #include "qapi/clone-visitor.h"
 #include "qapi/visitor-impl.h"
 #include "qapi/error.h"
+#include "qapi/qmp/qnull.h"
 
 struct QapiCloneVisitor {
 Visitor visitor;
diff --git a/qapi/string-input-visitor.c b/qapi/string-input-visitor.c
index 67a0a4a58b..b3fdd0827d 100644
--- a/qapi/string-input-visitor.c
+++ b/qapi/string-input-visitor.c
@@ -16,6 +16,7 @@
 #include "qapi/string-input-visitor.h"
 #include "qapi/visitor-impl.h"
 #include "qapi/qmp/qerror.h"
+#include "qapi/qmp/qnull.h"
 #include "qemu/option.h"
 #include "qemu/queue.h"
 #include "qemu/range.h"
diff --git a/qobject/qnull.c b/qobject/qnull.c
index 69a21d1059..bc9fd31626 100644
--- a/qobject/qnull.c
+++ b/qobject/qnull.c
@@ -12,7 +12,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu-common.h"
-#include "qapi/qmp/qobject.h"
+#include "qapi/qmp/qnull.h"
 
 QNull qnull_ = {
 .base = {
diff --git a/tests/check-qnull.c b/tests/check-qnull.c
index 5c6eb0adc8..afa4400da1 100644
--- a/tests/check-qnull.c
+++ b/tests/check-qnull.c
@@ -8,7 +8,7 @@
  */
 #include "qemu/osdep.h"
 
-#include "qapi/qmp/qobject.h"
+#include "qapi/qmp/qnull.h"
 #include "qemu-common.h"
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qobject-output-visitor.h"
-- 
2.13.6




[Qemu-devel] [PATCH for-2.11? v7 0/6] block: Don't compare strings in bdrv_reopen_prepare()

2017-11-14 Thread Max Reitz
bdrv_reopen_prepare() assumes that all BDS options are strings, which is
not necessarily correct. This series introduces a new qobject_is_equal()
function which can be used to test whether any options have changed,
independently of their type.


v7:
- Patch 6: Fix a clang warning:
tests/check-qobject.c:39:24: error: passing an object that undergoes
 default argument promotion to
 'va_start' has undefined behavior
  TIL: You cannot use va_start(ap, foo) if @foo is a bool.  An int
   works, however.
   Feel free to explain the long version to me, because I don't
   think I have fully understood the issue, but it's something like
   "Using bools for variadic arguments results in their promotion to
   an int, but you have to use a type that is promoted to itself
   (like int)."
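
A possible long version, as a sketch: C makes va_start(ap, parmN)
undefined when parmN has a type that changes under the default argument
promotions (bool, char, short, float). An int is already promoted, so it
is safe as the last named parameter. The toy_sum_all() helper and the
toy_sum() macro are hypothetical names; they only mirror the
"dummy int, then NULL-terminated pointers" shape of do_free_all() and
free_all().

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>

/* Sum a NULL-terminated list of int pointers.  The last named parameter
 * is an int on purpose: int is unchanged by the default argument
 * promotions, so handing it to va_start() is well defined.  A bool in
 * that position would be promoted to int and make va_start() undefined. */
static int toy_sum_all(int unused, ...)
{
    va_list ap;
    int *p;
    int sum = 0;

    (void)unused;
    va_start(ap, unused);
    while ((p = va_arg(ap, int *)) != NULL) {
        sum += *p;
    }
    va_end(ap);
    return sum;
}

/* Convenience wrapper: supplies the dummy int and the NULL sentinel. */
#define toy_sum(...) toy_sum_all(0, __VA_ARGS__, (int *)NULL)
```

A caller would write toy_sum(&a, &b, &c); the explicit (int *) cast on
the sentinel keeps the type read by va_arg() consistent with what was
passed.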


git-backport-diff against v6:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/6:[----] [--] 'qapi/qnull: Add own header'
002/6:[----] [--] 'qapi/qlist: Add qlist_append_null() macro'
003/6:[----] [--] 'qapi: Add qobject_is_equal()'
004/6:[----] [--] 'block: qobject_is_equal() in bdrv_reopen_prepare()'
005/6:[----] [--] 'iotests: Add test for non-string option reopening'
006/6:[0011] [FC] 'tests: Add check-qobject for equality tests'


Max Reitz (6):
  qapi/qnull: Add own header
  qapi/qlist: Add qlist_append_null() macro
  qapi: Add qobject_is_equal()
  block: qobject_is_equal() in bdrv_reopen_prepare()
  iotests: Add test for non-string option reopening
  tests: Add check-qobject for equality tests

 tests/Makefile.include   |   4 +-
 include/qapi/qmp/qbool.h |   1 +
 include/qapi/qmp/qdict.h |   2 +
 include/qapi/qmp/qlist.h |   4 +
 include/qapi/qmp/qnull.h |  32 
 include/qapi/qmp/qnum.h  |   1 +
 include/qapi/qmp/qobject.h   |  21 ++-
 include/qapi/qmp/qstring.h   |   1 +
 include/qapi/qmp/types.h |   1 +
 block.c  |  29 ++--
 qapi/qapi-clone-visitor.c|   1 +
 qapi/string-input-visitor.c  |   1 +
 qobject/qbool.c  |   8 +
 qobject/qdict.c  |  29 
 qobject/qlist.c  |  32 
 qobject/qnull.c  |  11 +-
 qobject/qnum.c   |  54 +++
 qobject/qobject.c|  29 
 qobject/qstring.c|   9 ++
 tests/check-qnull.c  |   2 +-
 tests/check-qobject.c| 328 +++
 scripts/coccinelle/qobject.cocci |   3 +
 tests/.gitignore |   1 +
 tests/qemu-iotests/133   |   9 ++
 tests/qemu-iotests/133.out   |   5 +
 25 files changed, 592 insertions(+), 26 deletions(-)
 create mode 100644 include/qapi/qmp/qnull.h
 create mode 100644 tests/check-qobject.c

-- 
2.13.6




[Qemu-devel] [PATCH for-2.11? v7 2/6] qapi/qlist: Add qlist_append_null() macro

2017-11-14 Thread Max Reitz
Besides the macro itself, this patch also adds a corresponding
Coccinelle rule.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Alberto Garcia 
---
 include/qapi/qmp/qlist.h | 3 +++
 scripts/coccinelle/qobject.cocci | 3 +++
 2 files changed, 6 insertions(+)

diff --git a/include/qapi/qmp/qlist.h b/include/qapi/qmp/qlist.h
index c4b5fdad9b..59d209bbae 100644
--- a/include/qapi/qmp/qlist.h
+++ b/include/qapi/qmp/qlist.h
@@ -15,6 +15,7 @@
 
 #include "qapi/qmp/qobject.h"
 #include "qapi/qmp/qnum.h"
+#include "qapi/qmp/qnull.h"
 #include "qemu/queue.h"
 
 typedef struct QListEntry {
@@ -37,6 +38,8 @@ typedef struct QList {
 qlist_append(qlist, qbool_from_bool(value))
 #define qlist_append_str(qlist, value) \
 qlist_append(qlist, qstring_from_str(value))
+#define qlist_append_null(qlist) \
+qlist_append(qlist, qnull())
 
 #define QLIST_FOREACH_ENTRY(qlist, var) \
 for ((var) = ((qlist)->head.tqh_first); \
diff --git a/scripts/coccinelle/qobject.cocci b/scripts/coccinelle/qobject.cocci
index 1120eb1a42..47bcafe9a9 100644
--- a/scripts/coccinelle/qobject.cocci
+++ b/scripts/coccinelle/qobject.cocci
@@ -41,4 +41,7 @@ expression Obj, E;
 |
 - qlist_append(Obj, qstring_from_str(E));
 + qlist_append_str(Obj, E);
+|
+- qlist_append(Obj, qnull());
++ qlist_append_null(Obj);
 )
-- 
2.13.6




Re: [Qemu-devel] [Nbd] [Qemu-block] How to online resize qemu disk with nbd protocol?

2017-11-14 Thread Wouter Verhelst
On Tue, Nov 14, 2017 at 10:41:39AM -0600, Eric Blake wrote:
> Another thought - with structured replies, we finally have a way to let
> the client ask for the server to send resize information whenever the
> server wants, rather than having to be polled by a new client request
> all the time.  This is possible by having the server reply with a chunk
> without the NBD_REPLY_FLAG_DONE bit, for as many times as it wants,
> (that is, the server never officially ends the response to the single
> client request for on-going status, until the client sends an
> NBD_CMD_DISC).

Hrm, yeah, that could work.

Minor downside of this would be that a client would now be expected to
continue listening "forever" (probably needs to do a blocking read() or
a select() on the socket), whereas with the current situation a client
could get away with only reading for as long as it expects data.

I don't think that should be a blocker, but it might be something we
might want to document.
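
For illustration, the client side of "keep listening until the server
says it is done" could look roughly like the sketch below. It assumes
the structured reply chunk header layout as I understand it from the NBD
protocol document (32-bit magic, 16-bit flags, 16-bit type, 64-bit
handle, 32-bit length, all big-endian, with NBD_REPLY_FLAG_DONE as bit
0). The open-ended reply itself is only the proposal discussed in this
thread, and all toy_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define TOY_STRUCTURED_REPLY_MAGIC 0x668e33efu
#define TOY_REPLY_FLAG_DONE        0x0001u
#define TOY_CHUNK_HEADER_LEN       20

/* Read exactly len bytes; returns 0 on success, -1 on error or EOF. */
static int toy_read_full(int fd, unsigned char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = read(fd, buf, len); /* blocks until data arrives */
        if (n <= 0) {
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

static uint32_t toy_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Consume chunks until one carries the DONE flag.
 * Returns the number of chunks seen, or -1 on a read or framing error. */
static int toy_drain_reply(int fd)
{
    int chunks = 0;

    assert(fd >= 0);
    for (;;) {
        unsigned char hdr[TOY_CHUNK_HEADER_LEN];
        unsigned char scratch[256];
        uint32_t flags, length;

        if (toy_read_full(fd, hdr, sizeof(hdr)) < 0 ||
            toy_be32(hdr) != TOY_STRUCTURED_REPLY_MAGIC) {
            return -1;
        }
        flags = ((uint32_t)hdr[4] << 8) | hdr[5]; /* 16-bit flags field */
        length = toy_be32(hdr + 16);              /* payload length */

        while (length > 0) {                      /* skip the payload */
            size_t want = length < sizeof(scratch) ? length : sizeof(scratch);
            if (toy_read_full(fd, scratch, want) < 0) {
                return -1;
            }
            length -= (uint32_t)want;
        }
        chunks++;
        if (flags & TOY_REPLY_FLAG_DONE) {
            return chunks;
        }
    }
}
```

A real client would multiplex this with its other traffic (for example
with select()) instead of dedicating a blocking loop to one request.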

> I don't think the server should go into this mode without a flag bit
> from the client requesting it (as it potentially ties up a thread that
> could otherwise be used for parallel processing of other requests),

Yeah. I think we should probably initiate this with a BLOCK_STATUS
message that has a flag with which we mean "don't stop sending data on
the given region for contexts that support it".

However, I could imagine that there might be some cases wherein a server
might be able to go into such a mode for two or more metadata contexts,
and where a client might want to go into that mode for one of them but
not all of them, while still wanting some information from them.

This could be covered with metadata context syntax, but it's annoying
and shouldn't be necessary.

I'm starting to think I made a mistake when I said NBD_CMD_BLOCK_STATUS
can't take a metadata context ID. Okay, there's no space for it, but
that shouldn't have been a blocker.

Thoughts?

> and that the server could reject a repeat command with the flag if it
> is already serving a previous open-ended request.

Right.

On the other hand, I can imagine that a client might also want to tell
the server that it is no longer interested in an outstanding request. In
such a case, it should be able to cancel it.

-- 
Could you people please use IRC like normal people?!?

  -- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008
 Hacklab



Re: [Qemu-devel] [PATCH] exec: Fix section_covers_addr() for sections with non-zero offset

2017-11-14 Thread BALATON Zoltan

On Fri, 27 Oct 2017, BALATON Zoltan wrote:
> On Sat, 21 Oct 2017, BALATON Zoltan wrote:
>> When a section with non-0 offset_within_region field is tested to
>> cover an address the offset should be taken into account as well.
>>
>> This fixes a crash caused by picking the wrong memory region in
>> address_space_lookup_region seen with client code accessing a device
>> model that uses alias memory regions.
>>
>> Signed-off-by: BALATON Zoltan
>> ---
>
> Ping? http://patchwork.ozlabs.org/project/qemu-devel/list/?series=9457

Ping!

>> This seems to fix the problem described in
>> http://lists.nongnu.org/archive/html/qemu-devel/2017-10/msg03356.html
>> but I'm not completely sure about it. This seems to be introduced in
>> 729633c exec: Introduce AddressSpaceDispatch.mru_section and the patch
>> before that which split off section_covers_addr from phys_page_find so
>> this patch also changes that caller. Is that OK to do? It appears to
>> work but I don't know this part of QEMU.
>>
>> Also the bug seems to be caused by section_covers_addr accepting
>> sii3112.bar5 when that's the mru_section instead of picking
>> sii3112.bar0 (which it picks when going through phys_page_find) when
>> client code is accessing 0xc08001006 from this address map (full
>> address map is at above URL):
>>
>> address-space: memory
>>   000c0800-000c0800 (prio 0, i/o): alias isa_mmio @io 
>> -
>>
>> address-space: I/O
>>   - (prio 0, i/o): io
>>     1000-1007 (prio 1, i/o): alias sii3112.bar0 
>> @sii3112.bar5 0080-0087
>>     1008-100b (prio 1, i/o): alias sii3112.bar1 
>> @sii3112.bar5 0088-008b
>>     1010-1017 (prio 1, i/o): alias sii3112.bar2 
>> @sii3112.bar5 00c0-00c7
>>     1018-101b (prio 1, i/o): alias sii3112.bar3 
>> @sii3112.bar5 00c8-00cb
>>     1020-102f (prio 1, i/o): alias sii3112.bar4 
>> @sii3112.bar5 -000f
>>
>> which this patch fixes but would the same problem happen if the
>> mru_section is bar5 but bar4 is accessed? I could not reproduce that
>> case but then the offset is 0 but in this case the address would be
>> above 0xc08001020 and size is 0xf so they probably won't match. But
>> this is only because of the size of the region. Could that mean the
>> bug is caused by something else and should be fixed elsewhere?
>>
>> ---
>>  exec.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/exec.c b/exec.c
>> index db5ae23..a915817 100644
>> --- a/exec.c
>> +++ b/exec.c
>> @@ -370,7 +370,8 @@ static inline bool section_covers_addr(const 
>> MemoryRegionSection *section,
>>   * the section must cover the entire address space.
>>   */
>>  return int128_gethi(section->size) ||
>> -   range_covers_byte(section->offset_within_address_space,
>> +   range_covers_byte(section->offset_within_address_space +
>> + section->offset_within_region,
>>   int128_getlo(section->size), addr);
>>  }



Re: [Qemu-devel] [PULL 00/20] Block patches for 2.11.0-rc1

2017-11-14 Thread Max Reitz
On 2017-11-14 18:28, Peter Maydell wrote:
> On 14 November 2017 at 17:23, Max Reitz  wrote:
>> The following changes since commit 191b5fbfa66e5b23e2150f3c6981d30eb84418a9:
>>
>>   Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' 
>> into staging (2017-11-14 16:11:19 +)
>>
>> are available in the git repository at:
>>
>>   git://github.com/XanClic/qemu.git tags/pull-block-2017-11-14
>>
>> for you to fetch changes up to 8b2d7c364d9a2491f7501f6688cd722045cf808a:
>>
>>   qemu-iotests: update unsupported image formats in 194 (2017-11-14 18:06:26 
>> +0100)
>>
> 
> I can probably squeeze this into rc1, but 17:30 GMT on rc day is at least
> an hour later than you can reasonably rely on being able to get a pull
> request into the rc

Sorry, I understand.

(I got two errors when testing the build, so I had to debug them and
drop a series -- which is one reason why it got delayed so much.  The
other is that I wanted to see if Kevin would be around today so that he
could do a combined pull request.)

I promise to do my build testing earlier next time...

Max





Re: [Qemu-devel] [PULL 00/20] Block patches for 2.11.0-rc1

2017-11-14 Thread Peter Maydell
On 14 November 2017 at 17:23, Max Reitz  wrote:
> The following changes since commit 191b5fbfa66e5b23e2150f3c6981d30eb84418a9:
>
>   Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' 
> into staging (2017-11-14 16:11:19 +)
>
> are available in the git repository at:
>
>   git://github.com/XanClic/qemu.git tags/pull-block-2017-11-14
>
> for you to fetch changes up to 8b2d7c364d9a2491f7501f6688cd722045cf808a:
>
>   qemu-iotests: update unsupported image formats in 194 (2017-11-14 18:06:26 
> +0100)
>

I can probably squeeze this into rc1, but 17:30 GMT on rc day is at least
an hour later than you can reasonably rely on being able to get a pull
request into the rc

thanks
-- PMM



[Qemu-devel] [PULL 20/20] qemu-iotests: update unsupported image formats in 194

2017-11-14 Thread Max Reitz
From: Jeff Cody 

Test 194 checks for 'luks' to exclude as an unsupported format.
However, most formats are unsupported, due to migration blockers.

Rather than specifying a blacklist of unsupported formats, whitelist
supported formats (specifically, qcow2, qed, raw, dmg).

Tested-by: Alexey Kardashevskiy 
Signed-off-by: Jeff Cody 
Message-id: 
23ca18c7f843c86a28b1529ca9ac6db4b35ca0e4.1510059970.git.jc...@redhat.com
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Denis V. Lunev 
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/194 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/194 b/tests/qemu-iotests/194
index 8d973b440f..1d4214aca3 100755
--- a/tests/qemu-iotests/194
+++ b/tests/qemu-iotests/194
@@ -21,7 +21,7 @@
 
 import iotests
 
-iotests.verify_image_format(unsupported_fmts=['luks'])
+iotests.verify_image_format(supported_fmts=['qcow2', 'qed', 'raw', 'dmg'])
 iotests.verify_platform(['linux'])
 
 with iotests.FilePath('source.img') as source_img_path, \
-- 
2.13.6




[Qemu-devel] [PULL 19/20] block/parallels: add migration blocker

2017-11-14 Thread Max Reitz
From: Jeff Cody 

Migration does not work for parallels, and has been broken for a while
(see patch 'block/parallels: Do not update header or truncate image when
 INMIGRATE').  The bdrv_invalidate_cache() method needs to be added for
migration to be supported.  Until this is done, prohibit migration.

Signed-off-by: Jeff Cody 
Reviewed-by: Fam Zheng 
Message-id: 
5e04a7c8a3089913fa58d484af42dab7993984ad.1510059970.git.jc...@redhat.com
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Denis V. Lunev 
Signed-off-by: Max Reitz 
---
 block/parallels.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/block/parallels.c b/block/parallels.c
index 7b7a3efa1d..9545761f49 100644
--- a/block/parallels.c
+++ b/block/parallels.c
@@ -35,6 +35,7 @@
 #include "qemu/module.h"
 #include "qemu/bswap.h"
 #include "qemu/bitmap.h"
+#include "migration/blocker.h"
 
 /**/
 
@@ -100,6 +101,7 @@ typedef struct BDRVParallelsState {
 unsigned int tracks;
 
 unsigned int off_multiplier;
+Error *migration_blocker;
 } BDRVParallelsState;
 
 
@@ -720,6 +722,16 @@ static int parallels_open(BlockDriverState *bs, QDict 
*options, int flags,
 s->bat_dirty_bmap =
 bitmap_new(DIV_ROUND_UP(s->header_size, s->bat_dirty_block));
 
+/* Disable migration until bdrv_invalidate_cache method is added */
+error_setg(&s->migration_blocker, "The Parallels format used by node '%s' "
+   "does not support live migration",
+   bdrv_get_device_or_node_name(bs));
+ret = migrate_add_blocker(s->migration_blocker, &local_err);
+if (local_err) {
+error_propagate(errp, local_err);
+error_free(s->migration_blocker);
+goto fail;
+}
 qemu_co_mutex_init(&s->lock);
 return 0;
 
@@ -750,6 +762,9 @@ static void parallels_close(BlockDriverState *bs)
 
 g_free(s->bat_dirty_bmap);
 qemu_vfree(s->header);
+
+migrate_del_blocker(s->migration_blocker);
+error_free(s->migration_blocker);
 }
 
 static QemuOptsList parallels_create_opts = {
-- 
2.13.6




Re: [Qemu-devel] [PULL 0/1] Seabios 1.11 final 20171114 patches

2017-11-14 Thread Peter Maydell
On 14 November 2017 at 14:40, Gerd Hoffmann <kra...@redhat.com> wrote:
> The following changes since commit 2e550e31518f90cc9cb7e5c855a1995c317463a3:
>
>   Merge remote-tracking branch 'remotes/kraxel/tags/ui-20171110-pull-request' 
> into staging (2017-11-14 08:39:50 +)
>
> are available in the git repository at:
>
>   git://git.kraxel.org/qemu tags/seabios-1.11-final-20171114-pull-request
>
> for you to fetch changes up to 6350b2a09b8a330cbfaea462a34bbb1b8c63d7b1:
>
>   seabios: update to 1.11 final (2017-11-14 15:36:08 +0100)
>
> 
> seabios: update to 1.11 final
>
> 
>
> Gerd Hoffmann (1):
>   seabios: update to 1.11 final
>
>  pc-bios/bios-256k.bin  | Bin 262144 -> 262144 bytes
>  pc-bios/bios.bin   | Bin 131072 -> 131072 bytes
>  pc-bios/vgabios-cirrus.bin | Bin 38400 -> 38400 bytes
>  pc-bios/vgabios-qxl.bin| Bin 38912 -> 38912 bytes
>  pc-bios/vgabios-stdvga.bin | Bin 38912 -> 38912 bytes
>  pc-bios/vgabios-virtio.bin | Bin 38912 -> 38912 bytes
>  pc-bios/vgabios-vmware.bin | Bin 38912 -> 38912 bytes
>  pc-bios/vgabios.bin| Bin 38400 -> 38400 bytes
>  roms/seabios   |   2 +-
>  9 files changed, 1 insertion(+), 1 deletion(-)
>

Applied, thanks.

-- PMM



[Qemu-devel] [PULL 15/20] block/snapshot: dirty all dirty bitmaps on snapshot-switch

2017-11-14 Thread Max Reitz
From: Vladimir Sementsov-Ogievskiy 

Snapshot-switch actually changes the active state of the disk, so it
should be reflected in the dirty bitmaps. Otherwise the next incremental
backup using these bitmaps will be invalid.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
Message-id: 20171023092945.54532-1-vsement...@virtuozzo.com
Reviewed-by: Eric Blake 
Reviewed-by: John Snow 
Signed-off-by: Max Reitz 
---
 block/snapshot.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/block/snapshot.c b/block/snapshot.c
index a46564e7b7..1d5ab5f90f 100644
--- a/block/snapshot.c
+++ b/block/snapshot.c
@@ -181,10 +181,24 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
 {
 BlockDriver *drv = bs->drv;
 int ret, open_ret;
+int64_t len;
 
 if (!drv) {
 return -ENOMEDIUM;
 }
+
+len = bdrv_getlength(bs);
+if (len < 0) {
+return len;
+}
+/* We should set all bits in all enabled dirty bitmaps, because dirty
+ * bitmaps reflect active state of disk and snapshot switch operation
+ * actually dirties active state.
+ * TODO: It may make sense not to set all bits but analyze block status of
+ * current state and destination snapshot and do not set bits corresponding
+ * to both-zero or both-unallocated areas. */
+bdrv_set_dirty(bs, 0, len);
+
 if (drv->bdrv_snapshot_goto) {
 return drv->bdrv_snapshot_goto(bs, snapshot_id);
 }
-- 
2.13.6




[Qemu-devel] [PULL 17/20] block/vhdx.c: Don't blindly update the header

2017-11-14 Thread Max Reitz
From: Jeff Cody 

The VHDX specification requires that before user data modification of
the vhdx image, the VHDX header file and data GUIDs need to be updated.
In vhdx_open(), if the image is set to RDWR, we go ahead and update the
header.

However, just because the image is set to RDWR does not mean we can go
ahead and write at this point. Specifically, if the QEMU run state is
INMIGRATE, the underlying file BS may be set to inactive via the BDS
open flag of BDRV_O_INACTIVE.  Attempting to write under this condition
will trigger an assertion failure in bdrv_co_pwritev().

Alternatively, we can latch the header update to the first time the
image is written.  And lo and behold, we do just that, via
vhdx_user_visible_write() in vhdx_co_writev().  This means the call to
vhdx_update_headers() in vhdx_open() is likely just vestigial, and can
be removed.

Reported-by: Alexey Kardashevskiy 
Tested-by: Alexey Kardashevskiy 
Signed-off-by: Jeff Cody 
Message-id: 659e4cdba6ef4c651737852777c8c93d27b38040.1510059970.git.jc...@redhat.com
Reviewed-by: Stefan Hajnoczi 
Reviewed-by: Denis V. Lunev 
Signed-off-by: Max Reitz 
---
 block/vhdx.c | 7 ---
 1 file changed, 7 deletions(-)

diff --git a/block/vhdx.c b/block/vhdx.c
index 7ae4589879..9956933da6 100644
--- a/block/vhdx.c
+++ b/block/vhdx.c
@@ -1008,13 +1008,6 @@ static int vhdx_open(BlockDriverState *bs, QDict *options, int flags,
         goto fail;
     }
 
-    if (flags & BDRV_O_RDWR) {
-        ret = vhdx_update_headers(bs, s, false, NULL);
-        if (ret < 0) {
-            goto fail;
-        }
-    }
-
     /* TODO: differencing files */
 
     return 0;
-- 
2.13.6
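
The "latch on first write" idea from the commit message above can be
sketched with a small toy model in C. This is not the VHDX driver:
ToyImage, toy_open() and toy_write() are hypothetical names, the
`inactive` field merely stands in for BDRV_O_INACTIVE, and the toy
returns -1 where the real block layer would hit an assertion.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model, not QEMU code. */
typedef struct ToyImage {
    bool inactive;            /* stands in for BDRV_O_INACTIVE */
    bool first_visible_write; /* latch: header not yet updated */
    int header_updates;       /* how often the header was rewritten */
} ToyImage;

/* Opening never touches the header, so it is safe even while the
 * image is inactive (for example during incoming migration). */
static void toy_open(ToyImage *img, bool inactive)
{
    img->inactive = inactive;
    img->first_visible_write = true;
    img->header_updates = 0;
}

/* The header update is deferred to the first guest-visible write and
 * performed exactly once. */
static int toy_write(ToyImage *img)
{
    if (img->inactive) {
        return -1; /* toy error code; writing here is not allowed */
    }
    if (img->first_visible_write) {
        img->first_visible_write = false;
        img->header_updates++;
    }
    return 0;
}
```

With this shape, opening an inactive image performs no write at all, and
the header is refreshed once on the first successful write after the
image becomes active.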




[Qemu-devel] [PULL 09/20] iotests: Add missing 'blkdebug::' in 040

2017-11-14 Thread Max Reitz
040 tries to invoke pause_drive() on a drive that does not use blkdebug.
Good idea, but let's use blkdebug to make it actually work.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Stefan Hajnoczi 
Message-id: 20171109203025.27493-3-mre...@redhat.com
Signed-off-by: Max Reitz 
---
 tests/qemu-iotests/040 | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040
index c284d08796..90b5b4f2ad 100755
--- a/tests/qemu-iotests/040
+++ b/tests/qemu-iotests/040
@@ -289,7 +289,7 @@ class TestSetSpeed(ImageCommitTestCase):
         qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % mid_img, test_img)
         qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0x1 0 512', test_img)
         qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xef 524288 524288', mid_img)
-        self.vm = iotests.VM().add_drive(test_img)
+        self.vm = iotests.VM().add_drive('blkdebug::' + test_img)
         self.vm.launch()
 
     def tearDown(self):
-- 
2.13.6



