Re: [Qemu-devel] Suggestion on 'virtio-pmem' implementation

2018-03-14 Thread Pankaj Gupta

Hi David,

> 
> Hi Pankaj,
> 
> I have a prototype (new one for virtio-mem I was working on over the last
> weeks) for exactly what you need. I basically factored out the notion of a
> memory device. So also virtio devices can be memory devices and get
> recognized e.g. in formerly known pc_dimm_get_free_address(), so it works
> out nicely with ordinary memory hotplug and such.

Nice. This is very useful re-factoring.

> 
> Guess your main problem right now is that you don‘t create a memory slot
> properly. But I also have code for that that you can built onto.

sure.

> 
> I‘m right now in Sri Lanka on vacation, I‘ll be back on 23. September and
> will send you the link to a branch with the prototype asap.

Cool. Enjoy your vacation and thanks for the reply.

Best regards,
Pankaj

> 
> Thanks,
> 
> David / dhildenb / da...@redhat.com
> 
> Von meinem iPhone gesendet
> 
> > Am 14.03.2018 um 10:36 schrieb Pankaj Gupta :
> > 
> > 
> > 
> > Hi,
> > 
> > 
> > I am implementing 'virtio-pmem' as a mechanism to
> > flush guest writes with 'fake DAX' flushing interface.
> > 
> > Below is the high level details of components:
> > 
> > 1] 'virtio-pmem' device expose guest physical address
> >  details(start, len).
> > 
> > 2] 'virtio-pmem' driver in guest discovers this
> >information and configures 'libnvdimm'. Guest 'pmem'
> >driver works on this memory range.
> > 
> > 3] Guest 'pmem' driver uses 'virtio-pmem' PV driver to
> >   send flush commands.
> > 
> > 
> > I need suggestion implementing part 1]
> > 
> > * When tried with 'hotplug_memory.base' address as guest physical
> >  address, I am facing 'EPT_MISCONFIG' errors when pmem does mkfs.
> >  After digging more it looks like address range I am using as guest
> >  physical address is either already mapped as MMIO or reserved.
> >  Though Guest hot-plugs this physical address into its virtual
> >  memory range when guest tries to read/write the memory KVM cannot
> >  translate the address and throw 'EPT_MISCONFIG' error.
> > 
> > * While I am trying to get the appropriate guest physical address
> >  which is free, I could see memory 'pc_dimm_memory_plug' code
> >  has a function 'pc_dimm_get_free_addr' which works with 'PC DIMM'
> >  class. As I am using 'VIRTIO', there is no way AFAIK this function
> >  can be used by VIRTIO or my PV device code.
> > 
> > 
> > I need ideas to get the free guest physical address from my PV
> > device code so that we can use this range in guest address space.
> > 
> > Find below pointer to previous discussion:
> > 
> > https://marc.info/?l=kvm=151629709903946=2
> > 
> > Thanks,
> > Pankaj
> > 
> 



Re: [Qemu-devel] [PATCH] migration: Fix rate limiting issue on RDMA migration

2018-03-14 Thread 858585 jemmy
On Thu, Mar 15, 2018 at 4:19 AM, Dr. David Alan Gilbert
 wrote:
> * Lidong Chen (jemmy858...@gmail.com) wrote:
>> RDMA migration implement save_page function for QEMUFile, but
>> ram_control_save_page do not increase bytes_xfer. So when doing
>> RDMA migration, it will use whole bandwidth.
>
> Hi,
>   Thanks for this,
>
>> Signed-off-by: Lidong Chen 
>> ---
>>  migration/qemu-file.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
>> index 2ab2bf3..217609d 100644
>> --- a/migration/qemu-file.c
>> +++ b/migration/qemu-file.c
>> @@ -253,7 +253,7 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t 
>> block_offset,
>>  if (f->hooks && f->hooks->save_page) {
>>  int ret = f->hooks->save_page(f, f->opaque, block_offset,
>>offset, size, bytes_sent);
>> -
>> +f->bytes_xfer += size;
>
> I'm a bit confused, because I know rdma.c calls acct_update_position()
> and I'd always thought that was enough.
> That calls qemu_update_position(...) which increases f->pos but not
> f->bytes_xfer.
>
> f_pos is used to calculate the 'transferred' value in
> migration_update_counters and thus the current bandwidth and downtime -
> but as you say, not the rate_limit.
>
> So really, should this f->bytes_xfer += size   go in
> qemu_update_position ?

For tcp migration, bytes_xfer is updated before qemu_fflush(f) which
actually send data.
but qemu_update_position is invoked by qemu_rdma_write_one, which
after call ibv_post_send.
and qemu_rdma_save_page is asynchronous, it may merge the page.
I think it's more safe to limiting rate before send data

>
> Juan: I'm not sure I know why we have both bytes_xfer and pos.

Maybe the reasion is bytes_xfer is updated before send data,
and bytes_xfer will be reset by migration_update_counters.

>
> Dave
>
>>  if (ret != RAM_SAVE_CONTROL_DELAYED) {
>>  if (bytes_sent && *bytes_sent > 0) {
>>  qemu_update_position(f, *bytes_sent);
>> --
>> 1.8.3.1
>>
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH 1/5] multiboot: Reject kernels exceeding the address space

2018-03-14 Thread Jack Schwartz

Hi Kevin.

My comments are inline...

On 2018-03-14 10:32, Kevin Wolf wrote:

The code path with a manually set mh_load_addr in the Multiboot header
checks that load_end_addr <= load_addr, but the path where load_end_addr
is automatically detected if 0 is given in the header misses the
corresponding check.

1) The code checks that load_end_addr >= load_addr (before letting it
through).

2) load_end_addr is used only when it is read in as non-zero, so no 
check is needed if zero.  (It gets debug-printed even when zero, but is 
used only to calculate mb_load_size and only when non-zero.)

If the kernel binary size is larger than can fit in
the address space after load_addr, we ended up with a kernel_size that
is smaller than load_size, which means that we read the file into a too
small buffer.

Add a check to reject kernel files with such Multiboot headers.

Code itself looks fine.

Modulo above comments:
    Reviewed-by: Jack Schwartz 

    Thanks,
    Jack


Signed-off-by: Kevin Wolf 
---
  hw/i386/multiboot.c | 4 
  1 file changed, 4 insertions(+)

diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c
index b9064264d8..1e215bf8d3 100644
--- a/hw/i386/multiboot.c
+++ b/hw/i386/multiboot.c
@@ -247,6 +247,10 @@ int load_multiboot(FWCfgState *fw_cfg,
  }
  mb_load_size = kernel_file_size - mb_kernel_text_offset;
  }
+if (mb_load_size > UINT32_MAX - mh_load_addr) {
+error_report("kernel does not fit in address space");
+exit(1);
+}
  if (mh_bss_end_addr) {
  if (mh_bss_end_addr < (mh_load_addr + mb_load_size)) {
  error_report("invalid bss_end_addr address");





Re: [Qemu-devel] [PATCH 0/5] multiboot: Fix buffer overflow on invalid kernels

2018-03-14 Thread Jack Schwartz

Hi Kevin.

I see an issue with the commit message of patch 1; please see my reply 
to that patch for details.  I fully understand patches 1,2,3, patch 4 
except for some of the Makefile black magic, and patch 5 looks 
reasonable to me.


So, for patches 2,3,4,5:
    Reviewed-by: Jack Schwartz 

    Thanks,
    Jack

On 2018-03-14 10:32, Kevin Wolf wrote:

Patch 1 fixes another Multiboot kernel validation bug that could cause
QEMU to load the kernel image file into a too small buffer. Patch 2 adds
another check to harden the code. The rest of the series adds Multiboot
test cases for kernels using the a.out kludge, which is where the recent
bugs were found.

Kevin Wolf (5):
   multiboot: Reject kernels exceeding the address space
   multiboot: Check validity of mh_header_addr
   tests/multiboot: Test exit code for every qemu run
   tests/multiboot: Add tests for the a.out kludge
   tests/multiboot: Add .gitignore

  hw/i386/multiboot.c |   8 +++
  tests/multiboot/.gitignore  |   3 +
  tests/multiboot/Makefile|  22 +--
  tests/multiboot/aout_kludge.S   | 138 
  tests/multiboot/aout_kludge.out |  42 
  tests/multiboot/run_test.sh |  34 ++
  6 files changed, 227 insertions(+), 20 deletions(-)
  create mode 100644 tests/multiboot/.gitignore
  create mode 100644 tests/multiboot/aout_kludge.S
  create mode 100644 tests/multiboot/aout_kludge.out






[Qemu-devel] [PULL 1/9] sii3112: Remove unneeded exit function

2018-03-14 Thread David Gibson
From: BALATON Zoltan 

An exit function was mistakenly left here but it's not needed because
the PCI bars are organised differently in this device. Calling this
exit function during device_del was causing an abort with
memory_region_del_subregion: `Assertion subregion->container == mr' failed.

Reported-by: Thomas Huth 
Signed-off-by: BALATON Zoltan 
Signed-off-by: David Gibson 
---
 hw/ide/sii3112.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/hw/ide/sii3112.c b/hw/ide/sii3112.c
index e3896c65b4..743a50ed51 100644
--- a/hw/ide/sii3112.c
+++ b/hw/ide/sii3112.c
@@ -327,17 +327,6 @@ static void sii3112_pci_realize(PCIDevice *dev, Error 
**errp)
 qemu_register_reset(sii3112_reset, s);
 }
 
-static void sii3112_pci_exitfn(PCIDevice *dev)
-{
-PCIIDEState *d = PCI_IDE(dev);
-int i;
-
-for (i = 0; i < 2; ++i) {
-memory_region_del_subregion(>bmdma_bar, >bmdma[i].extra_io);
-memory_region_del_subregion(>bmdma_bar, >bmdma[i].addr_ioport);
-}
-}
-
 static void sii3112_pci_class_init(ObjectClass *klass, void *data)
 {
 DeviceClass *dc = DEVICE_CLASS(klass);
@@ -348,7 +337,6 @@ static void sii3112_pci_class_init(ObjectClass *klass, void 
*data)
 pd->class_id = PCI_CLASS_STORAGE_RAID;
 pd->revision = 1;
 pd->realize = sii3112_pci_realize;
-pd->exit = sii3112_pci_exitfn;
 dc->desc = "SiI3112A SATA controller";
 set_bit(DEVICE_CATEGORY_STORAGE, dc->categories);
 }
-- 
2.14.3




[Qemu-devel] [PULL 4/9] hw/misc/macio: Mark the macio devices with user_creatable = false

2018-03-14 Thread David Gibson
From: Thomas Huth 

The macio devices currently cause a crash when the user tries to
instantiate them on a different machine:

$ ppc64-softmmu/qemu-system-ppc64 -device macio-newworld
Unexpected error in qemu_chr_fe_init() at chardev/char-fe.c:222:
qemu-system-ppc64: -device macio-newworld: Device 'serial0' is in use
Aborted (core dumped)

These devices are clearly not intended to be creatable by the user
since they are using serial_hds[] directly in their instance_init
function. So let's mark them with user_creatable = false.

Signed-off-by: Thomas Huth 
Reviewed-by: Mark Cave-Ayland 
Signed-off-by: David Gibson 
---
 hw/misc/macio/macio.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/misc/macio/macio.c b/hw/misc/macio/macio.c
index af1bd46b4b..454244f59e 100644
--- a/hw/misc/macio/macio.c
+++ b/hw/misc/macio/macio.c
@@ -406,6 +406,8 @@ static void macio_class_init(ObjectClass *klass, void *data)
 k->class_id = PCI_CLASS_OTHERS << 8;
 dc->props = macio_properties;
 set_bit(DEVICE_CATEGORY_BRIDGE, dc->categories);
+/* Reason: Uses serial_hds in macio_instance_init */
+dc->user_creatable = false;
 }
 
 static const TypeInfo macio_oldworld_type_info = {
-- 
2.14.3




[Qemu-devel] [PULL 6/9] hw/ppc/spapr: Allow "spapr-vlan" as NIC model name beside "ibmveth"

2018-03-14 Thread David Gibson
From: Thomas Huth 

With the new "--nic" command line parameter option, the "old" way of
specifying a NIC model via the nd_table[] is becoming more prominent
again. But for the pseries "spapr-vlan" device, there is a confusing
discrepancy between the model name that is used for "--device" (i.e.
"spapr-vlan") and the model name that has to be used for "--net nic"
or the new "--nic" parameter (i.e. "ibmveth"). Since "spapr-vlan" is
the "real" name of the device, let's allow "spapr-vlan" to be used
as model name for the nd_table[] entries, too.

Signed-off-by: Thomas Huth 
Reviewed-by: Greg Kurz 
Signed-off-by: David Gibson 
---
 hw/ppc/spapr.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 7e1c858566..dfa9e43601 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2607,10 +2607,11 @@ static void spapr_machine_init(MachineState *machine)
 NICInfo *nd = _table[i];
 
 if (!nd->model) {
-nd->model = g_strdup("ibmveth");
+nd->model = g_strdup("spapr-vlan");
 }
 
-if (strcmp(nd->model, "ibmveth") == 0) {
+if (g_str_equal(nd->model, "spapr-vlan") ||
+g_str_equal(nd->model, "ibmveth")) {
 spapr_vlan_create(spapr->vio_bus, nd);
 } else {
 pci_nic_init_nofail(_table[i], phb->bus, nd->model, NULL);
-- 
2.14.3




[Qemu-devel] [PULL 9/9] target/ppc: fix tlbsync to check privilege level depending on GTSE

2018-03-14 Thread David Gibson
From: Cédric Le Goater 

tlbsync also needs to check the Guest Translation Shootdown Enable
(GTSE) bit in the Logical Partition Control Register (LPCR) to
determine at which privilege level it is running.

See commit c6fd28fd573d ("target/ppc: Update tlbie to check privilege
level based on GTSE")

Signed-off-by: Cédric Le Goater 
Reviewed-by: Suraj Jitindar Singh 
Signed-off-by: David Gibson 
---
 target/ppc/translate.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 0a0c090c99..218665b408 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4526,7 +4526,7 @@ static void gen_tlbie(DisasContext *ctx)
 TCGv_i32 t1;
 
 if (ctx->gtse) {
-CHK_SV; /* If gtse is set then tblie is supervisor privileged */
+CHK_SV; /* If gtse is set then tlbie is supervisor privileged */
 } else {
 CHK_HV; /* Else hypervisor privileged */
 }
@@ -4553,7 +4553,12 @@ static void gen_tlbsync(DisasContext *ctx)
 #if defined(CONFIG_USER_ONLY)
 GEN_PRIV;
 #else
-CHK_HV;
+
+if (ctx->gtse) {
+CHK_SV; /* If gtse is set then tlbsync is supervisor privileged */
+} else {
+CHK_HV; /* Else hypervisor privileged */
+}
 
 /* BookS does both ptesync and tlbsync make tlbsync a nop for server */
 if (ctx->insns_flags & PPC_BOOKE) {
-- 
2.14.3




[Qemu-devel] [PULL 7/9] ppc440_pcix: Change some error_report to qemu_log_mask(LOG_UNIMP, ...)

2018-03-14 Thread David Gibson
From: BALATON Zoltan 

Using log unimp is more appropriate for these messages and this also
silences them by default so they won't clobber make check output when
tests are added for this board.

Signed-off-by: BALATON Zoltan 
Reviewed-by: Thomas Huth 
Signed-off-by: David Gibson 
---
 hw/ppc/ppc440_pcix.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/hw/ppc/ppc440_pcix.c b/hw/ppc/ppc440_pcix.c
index ab2626a9de..b1307e6477 100644
--- a/hw/ppc/ppc440_pcix.c
+++ b/hw/ppc/ppc440_pcix.c
@@ -21,6 +21,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/error-report.h"
+#include "qemu/log.h"
 #include "hw/hw.h"
 #include "hw/ppc/ppc.h"
 #include "hw/ppc/ppc4xx.h"
@@ -286,8 +287,9 @@ static void ppc440_pcix_reg_write4(void *opaque, hwaddr 
addr,
 break;
 
 default:
-error_report("%s: unhandled PCI internal register 0x%lx", __func__,
- (unsigned long)addr);
+qemu_log_mask(LOG_UNIMP,
+  "%s: unhandled PCI internal register 0x%"HWADDR_PRIx"\n",
+  __func__, addr);
 break;
 }
 }
@@ -377,8 +379,9 @@ static uint64_t ppc440_pcix_reg_read4(void *opaque, hwaddr 
addr,
 break;
 
 default:
-error_report("%s: invalid PCI internal register 0x%lx", __func__,
- (unsigned long)addr);
+qemu_log_mask(LOG_UNIMP,
+  "%s: invalid PCI internal register 0x%" HWADDR_PRIx "\n",
+  __func__, addr);
 val = 0;
 }
 
-- 
2.14.3




[Qemu-devel] [PULL 5/9] PPC e500: Fix gap between u-boot and kernel

2018-03-14 Thread David Gibson
From: David Engraf 

This patch moves the gap between u-boot and kernel at the correct location.

Signed-off-by: David Engraf 
Signed-off-by: David Gibson 
---
 hw/ppc/e500.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index 43c15d18c4..bdef2bddc6 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -1009,6 +1009,10 @@ void ppce500_init(MachineState *machine, PPCE500Params 
*params)
 }
 
 cur_base = loadaddr + payload_size;
+if (cur_base < (32 * 1024 * 1024)) {
+/* u-boot occupies memory up to 32MB, so load blobs above */
+cur_base = (32 * 1024 * 1024);
+}
 
 /* Load bare kernel only if no bios/u-boot has been provided */
 if (machine->kernel_filename && !kernel_as_payload) {
@@ -1025,11 +1029,6 @@ void ppce500_init(MachineState *machine, PPCE500Params 
*params)
 cur_base += kernel_size;
 }
 
-if (cur_base < (32 * 1024 * 1024)) {
-/* u-boot occupies memory up to 32MB, so load blobs above */
-cur_base = (32 * 1024 * 1024);
-}
-
 /* Load initrd. */
 if (machine->initrd_filename) {
 initrd_base = (cur_base + INITRD_LOAD_PAD) & ~INITRD_PAD_MASK;
-- 
2.14.3




[Qemu-devel] [PULL 8/9] tests/boot-serial: Test the sam460ex board

2018-03-14 Thread David Gibson
From: Thomas Huth 

We've got a U-Boot firmware for this board in our repository, and
the firmware prints some output to the serial console, so we can
check this board in the boot-serial tester, too.

Signed-off-by: Thomas Huth 
Signed-off-by: David Gibson 
---
 tests/boot-serial-test.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
index 5b24cd26c1..011525d8cf 100644
--- a/tests/boot-serial-test.c
+++ b/tests/boot-serial-test.c
@@ -79,12 +79,14 @@ static testdef_t tests[] = {
 { "ppc", "40p", "-boot d", "Booting from device d" },
 { "ppc", "g3beige", "", "PowerPC,750" },
 { "ppc", "mac99", "", "PowerPC,G4" },
+{ "ppc", "sam460ex", "-m 256", "DRAM:  256 MiB" },
 { "ppc64", "ppce500", "", "U-Boot" },
 { "ppc64", "prep", "-boot e", "Booting from device e" },
 { "ppc64", "40p", "-m 192", "Memory size: 192 MB" },
 { "ppc64", "mac99", "", "PowerPC,970FX" },
 { "ppc64", "pseries", "", "Open Firmware" },
 { "ppc64", "powernv", "-cpu POWER8", "OPAL" },
+{ "ppc64", "sam460ex", "-device e1000", "8086  100e" },
 { "i386", "isapc", "-cpu qemu32 -device sga", "SGABIOS" },
 { "i386", "pc", "-device sga", "SGABIOS" },
 { "i386", "q35", "-device sga", "SGABIOS" },
-- 
2.14.3




[Qemu-devel] [PULL 0/9] ppc-for-2.12 queue 20180315

2018-03-14 Thread David Gibson
The following changes since commit 026aaf47c02b79036feb830206cfebb2a726510d:

  Merge remote-tracking branch 'remotes/ehabkost/tags/python-next-pull-request' 
into staging (2018-03-13 16:26:44 +)

are available in the Git repository at:

  git://github.com/dgibson/qemu.git tags/ppc-for-2.12-20180315

for you to fetch changes up to a9ab8cc157054ea6941fb849c78d9e6c515a7730:

  target/ppc: fix tlbsync to check privilege level depending on GTSE 
(2018-03-15 11:18:31 +1100)


ppc patch queue for 2018-03-15

Here's the set of accumulated patches now that we're into soft freeze.
I've split new functionality into a ppc-for-2.13 branch, so this only
has bugfixes.  Well.. and a couple of simple cleanups to make bugfixes
easier, some test improvements and a trivial change to make command
line options more obvious.  I think those are all acceptable for soft
freeze.


BALATON Zoltan (2):
  sii3112: Remove unneeded exit function
  ppc440_pcix: Change some error_report to qemu_log_mask(LOG_UNIMP, ...)

Cédric Le Goater (1):
  target/ppc: fix tlbsync to check privilege level depending on GTSE

David Engraf (1):
  PPC e500: Fix gap between u-boot and kernel

Thomas Huth (5):
  tests/boot-serial: Check the 40p machine, too
  hw/ppc/prep: Fix implicit creation of "-drive if=scsi" devices
  hw/misc/macio: Mark the macio devices with user_creatable = false
  hw/ppc/spapr: Allow "spapr-vlan" as NIC model name beside "ibmveth"
  tests/boot-serial: Test the sam460ex board

 hw/ide/sii3112.c | 12 
 hw/misc/macio/macio.c|  2 ++
 hw/ppc/e500.c|  9 -
 hw/ppc/ppc440_pcix.c | 11 +++
 hw/ppc/prep.c|  2 +-
 hw/ppc/spapr.c   |  5 +++--
 hw/scsi/lsi53c895a.c |  7 +++
 include/hw/pci/pci.h |  1 +
 target/ppc/translate.c   |  9 +++--
 tests/boot-serial-test.c |  8 ++--
 10 files changed, 38 insertions(+), 28 deletions(-)



[Qemu-devel] [PULL 3/9] hw/ppc/prep: Fix implicit creation of "-drive if=scsi" devices

2018-03-14 Thread David Gibson
From: Thomas Huth 

The global hack for creating SCSI devices has recently been removed,
but this apparently broke SCSI devices on some boards that were not
ready for this change yet. For the 40p machine you now get:

$ ppc64-softmmu/qemu-system-ppc64 -M 40p -cdrom x.iso
qemu-system-ppc64: -cdrom x.iso: machine type does not support 
if=scsi,bus=0,unit=2

Fix it by providing a lsi53c810_create() function that takes care
of calling scsi_bus_legacy_handle_cmdline() after creating the
corresponding SCSI controller.

Fixes: 1454509726719e0933c800fad00d6999752688ea
Signed-off-by: Thomas Huth 
Signed-off-by: David Gibson 
---
 hw/ppc/prep.c| 2 +-
 hw/scsi/lsi53c895a.c | 7 +++
 include/hw/pci/pci.h | 1 +
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/hw/ppc/prep.c b/hw/ppc/prep.c
index 096d4d4cfb..3361509a85 100644
--- a/hw/ppc/prep.c
+++ b/hw/ppc/prep.c
@@ -788,7 +788,7 @@ static void ibm_40p_init(MachineState *machine)
 qdev_prop_set_uint32(dev, "equipment", 0xc0);
 qdev_init_nofail(dev);
 
-pci_create_simple(pci_bus, PCI_DEVFN(1, 0), "lsi53c810");
+lsi53c810_create(pci_bus, PCI_DEVFN(1, 0));
 
 /* XXX: s3-trio at PCI_DEVFN(2, 0) */
 pci_vga_init(pci_bus);
diff --git a/hw/scsi/lsi53c895a.c b/hw/scsi/lsi53c895a.c
index f3d4c4d230..160657f4b9 100644
--- a/hw/scsi/lsi53c895a.c
+++ b/hw/scsi/lsi53c895a.c
@@ -2279,3 +2279,10 @@ void lsi53c895a_create(PCIBus *bus)
 
 scsi_bus_legacy_handle_cmdline(>bus);
 }
+
+void lsi53c810_create(PCIBus *bus, int devfn)
+{
+LSIState *s = LSI53C895A(pci_create_simple(bus, devfn, "lsi53c810"));
+
+scsi_bus_legacy_handle_cmdline(>bus);
+}
diff --git a/include/hw/pci/pci.h b/include/hw/pci/pci.h
index d8c18c7fa4..e255941b5a 100644
--- a/include/hw/pci/pci.h
+++ b/include/hw/pci/pci.h
@@ -708,6 +708,7 @@ PCIDevice *pci_create(PCIBus *bus, int devfn, const char 
*name);
 PCIDevice *pci_create_simple(PCIBus *bus, int devfn, const char *name);
 
 void lsi53c895a_create(PCIBus *bus);
+void lsi53c810_create(PCIBus *bus, int devfn);
 
 qemu_irq pci_allocate_irq(PCIDevice *pci_dev);
 void pci_set_irq(PCIDevice *pci_dev, int level);
-- 
2.14.3




[Qemu-devel] [PULL 2/9] tests/boot-serial: Check the 40p machine, too

2018-03-14 Thread David Gibson
From: Thomas Huth 

The "40p" machine is using the Open Hack'Ware BIOS, just like the "prep"
machine, so we can test it accordingly with the boot-serial tester, too.
While we're at it, also change the strings that we are using for the
"prep" machine, so that this test now also checks some CLI parameters.

Signed-off-by: Thomas Huth 
Reviewed-by: Hervé Poussineau 
Signed-off-by: David Gibson 
---
 tests/boot-serial-test.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
index ece25c694f..5b24cd26c1 100644
--- a/tests/boot-serial-test.c
+++ b/tests/boot-serial-test.c
@@ -75,11 +75,13 @@ typedef struct testdef {
 static testdef_t tests[] = {
 { "alpha", "clipper", "", "PCI:" },
 { "ppc", "ppce500", "", "U-Boot" },
-{ "ppc", "prep", "", "Open Hack'Ware BIOS" },
+{ "ppc", "prep", "-m 96", "Memory size: 96 MB" },
+{ "ppc", "40p", "-boot d", "Booting from device d" },
 { "ppc", "g3beige", "", "PowerPC,750" },
 { "ppc", "mac99", "", "PowerPC,G4" },
 { "ppc64", "ppce500", "", "U-Boot" },
-{ "ppc64", "prep", "", "Open Hack'Ware BIOS" },
+{ "ppc64", "prep", "-boot e", "Booting from device e" },
+{ "ppc64", "40p", "-m 192", "Memory size: 192 MB" },
 { "ppc64", "mac99", "", "PowerPC,970FX" },
 { "ppc64", "pseries", "", "Open Firmware" },
 { "ppc64", "powernv", "-cpu POWER8", "OPAL" },
-- 
2.14.3




[Qemu-devel] [PATCH] block: Fix leak of ignore_children in error path

2018-03-14 Thread Fam Zheng
Reported-by: Max Reitz 
Signed-off-by: Fam Zheng 
---
 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 75a9fd49de..c1fda9fd57 100644
--- a/block.c
+++ b/block.c
@@ -3671,12 +3671,12 @@ int bdrv_drop_intermediate(BlockDriverState *top, 
BlockDriverState *base,
 GSList *ignore_children = g_slist_prepend(NULL, c);
 bdrv_check_update_perm(base, NULL, c->perm, c->shared_perm,
ignore_children, _err);
+g_slist_free(ignore_children);
 if (local_err) {
 ret = -EPERM;
 error_report_err(local_err);
 goto exit;
 }
-g_slist_free(ignore_children);
 
 /* If so, update the backing file path in the image file */
 if (c->role->update_filename) {
-- 
2.14.3




[Qemu-devel] [PATCH] vvfat: Fix inherit_options flags

2018-03-14 Thread Fam Zheng
Overriding flags violates the precedence rules of
bdrv_reopen_queue_child. Just like the read-only option, no-flush should
be put into the options. The same is done in bdrv_temp_snapshot_options.

Reported-by: Stefan Hajnoczi 
---
 block/vvfat.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/vvfat.c b/block/vvfat.c
index 4a17a49e12..1569783b0f 100644
--- a/block/vvfat.c
+++ b/block/vvfat.c
@@ -3129,7 +3129,7 @@ static void vvfat_qcow_options(int *child_flags, QDict 
*child_options,
int parent_flags, QDict *parent_options)
 {
 qdict_set_default_str(child_options, BDRV_OPT_READ_ONLY, "off");
-*child_flags = BDRV_O_NO_FLUSH;
+qdict_set_default_str(child_options, BDRV_OPT_CACHE_NO_FLUSH, "on");
 }
 
 static const BdrvChildRole child_vvfat_qcow = {
-- 
2.14.3




Re: [Qemu-devel] [virtio-dev] Re: [PATCH v4 3/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-03-14 Thread Michael S. Tsirkin
On Thu, Mar 15, 2018 at 09:15:48AM +0800, Wei Wang wrote:
> On 03/14/2018 10:12 PM, Michael S. Tsirkin wrote:
> > On Wed, Mar 14, 2018 at 02:03:19PM +0800, Wei Wang wrote:
> > > On 03/14/2018 10:53 AM, Michael S. Tsirkin wrote:
> > > > On Wed, Mar 14, 2018 at 10:43:01AM +0800, Wei Wang wrote:
> > > > > On 03/14/2018 12:49 AM, Michael S. Tsirkin wrote:
> > > > > > On Wed, Mar 07, 2018 at 08:34:24PM +0800, Wei Wang wrote:
> > > > > > 
> > > > > > > Signed-off-by: Wei Wang 
> > > > > > > Signed-off-by: Liang Li 
> > > > > > > CC: Michael S. Tsirkin 
> > > > > > > CC: Dr. David Alan Gilbert 
> > > > > > > CC: Juan Quintela 
> > > > > > I find it suspicious that neither unrealize nor reset
> > > > > > functions have been touched at all.
> > > > > > Are you sure you have thought through scenarious like
> > > > > > hot-unplug or disabling the device by guest?
> > > > > OK. I think we can call balloon_free_page_stop in unrealize and reset.
> > > > > 
> > > > > 
> > > > > > +static void *virtio_balloon_poll_free_page_hints(void *opaque)
> > > > > > +{
> > > > > > +VirtQueueElement *elem;
> > > > > > +VirtIOBalloon *dev = opaque;
> > > > > > +VirtQueue *vq = dev->free_page_vq;
> > > > > > +uint32_t id;
> > > > > > +size_t size;
> > > > > > What makes it safe to poke at this device from multiple threads?
> > > > > > I think that it would be safer to do it from e.g. BH.
> > > > > > 
> > > > > Actually the free_page_optimization thread is the only user of 
> > > > > free_page_vq,
> > > > > and there is only one optimization thread each time. Would this be 
> > > > > safe
> > > > > enough?
> > > > > 
> > > > > Best,
> > > > > Wei
> > > > Aren't there other fields there? Also things like reset affect all VQs.
> > > > 
> > > Yes. But I think BHs are used to avoid re-entrancy, which isn't the issue
> > > here.
> > Since you are adding locks to address the issue - doesn't this imply
> > reentrancy is exactly the issue?
> 
> Not really. The lock isn't intended for any reentrancy issues, since there
> will be only one run of the virtio_balloon_poll_free_page_hints function at
> any given time. Instead, the lock is used to synchronize
> virtio_balloon_poll_free_page_hints and virtio_balloon_free_page_stop to
> access dev->free_page_report_status.

I wonder whether that's enough. E.g. is there a race with guest
trying to reset the device? That resets all VQs you know.


> Please see the whole picture below:
> 
> virtio_balloon_poll_free_page_hints()
> {
> 
> while (1) {
> qemu_spin_lock();
> if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP ||
> !runstate_is_running()) {
> qemu_spin_unlock();
> break;
> }
> ...
> if (id == dev->free_page_report_cmd_id) {
> ==>dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
> ...
> qemu_spin_unlock();
> }
> }
> 
> 
> static void virtio_balloon_free_page_stop(void *opaque)
> {
> VirtIOBalloon *s = opaque;
> VirtIODevice *vdev = VIRTIO_DEVICE(s);
> 
> qemu_spin_lock();
> ...
> ==>   s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
> ...
> qemu_spin_unlock();
> }
> 
> 
> Without the lock, there are theoretical possibilities that assigning STOP
> below is overridden by START above. In that
> case,virtio_balloon_free_page_stop does not effectively stop
> virtio_balloon_poll_free_page_hints.
> I think this issue couldn't be solved by BHs.
> 
> Best,
> Wei

Don't all BHs run under the BQL?



Re: [Qemu-devel] [PATCH qemu v7 2/4] vfio/pci: Relax DMA map errors for MMIO regions

2018-03-14 Thread Alex Williamson
On Wed, 14 Mar 2018 13:40:21 +1100
Alexey Kardashevskiy  wrote:

> On 14/3/18 3:56 am, Alex Williamson wrote:
> > Actually making sure it compiles would have been nice:
> > 
> > qemu.git/hw/vfio/common.c: In function ‘vfio_listener_region_add’:
> > qemu.git/hw/vfio/common.c:550:40: error: invalid operands to binary & (have 
> > ‘Int128 {aka struct Int128}’ and ‘hwaddr {aka long long unsigned int}’)
> >  if ((iova & pgmask) || (llsize & pgmask)) {
> > ^
> > qemu.git/hw/vfio/common.c: In function ‘vfio_listener_region_del’:
> > qemu.git/hw/vfio/common.c:669:50: error: invalid operands to binary & (have 
> > ‘Int128 {aka struct Int128}’ and ‘hwaddr {aka long long unsigned int}’)
> >  try_unmap = !((iova & pgmask) || (llsize & pgmask));
> >   ^
> > Clearly llsize needs to be wrapped in int128_get64() here.  Should I
> > presume that testing of this patch is equally lacking?   
> 
> No.
> 
> I do not know how but this definitely compiles for me with these:
> 
> gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)
> gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1)
> 
> I always thought Int128 is a struct but it turns out there are __uint128_t
> and __int128_t and everywhere I compile (including x86_64 laptop) there is
> CONFIG_INT128=y in config-host.mak, hence no error.
> 
> I can see that you fixed it (thanks for that!) but how do you compile it to
> get this error? There is no way to disable gcc's types and even 4.8.5 can
> do these. Thanks,

Hmm, I always try to do a 32bit build before I send a pull request,
that's where I started this time.  I thought an Int128 was always a
structure, so I figured it always failed.  Good to know there was more
than just my testing on this commit.  Since it's in, we can fix it
during the freeze if it turns out to be broken.  Thanks,

Alex



Re: [Qemu-devel] [PATCH] dump-guest-memory: more descriptive lookup_type failure

2018-03-14 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180314142133.14166-1-drjo...@redhat.com
Subject: [Qemu-devel] [PATCH] dump-guest-memory: more descriptive lookup_type 
failure

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]   patchew/20180314142133.14166-1-drjo...@redhat.com -> 
patchew/20180314142133.14166-1-drjo...@redhat.com
Switched to a new branch 'test'
8dba220694 dump-guest-memory: more descriptive lookup_type failure

=== OUTPUT BEGIN ===
Checking PATCH 1/1: dump-guest-memory: more descriptive lookup_type failure...
WARNING: line over 80 characters
#34: FILE: scripts/dump-guest-memory.py:22:
+raise gdb.GdbError("Symbols must be loaded prior to sourcing 
dump-guest-memory.\n"

ERROR: line over 90 characters
#35: FILE: scripts/dump-guest-memory.py:23:
+   "Symbols may be loaded by first 'attach'ing a QEMU 
process id or by 'load'ing a QEMU binary.")

total: 1 errors, 1 warnings, 12 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [PATCH resend v2] qga: unset frozen state if no mount point is frozen

2018-03-14 Thread Chen Hanxiao
From: Chen Hanxiao 

If we set mountpoints to qmp_guest_fsfreeze_freeze_list,
we may got nothing to freeze as all mountpoints are
not valid.
So call ga_unset_frozen in this senario.

Also, if we return 0 frozen fs, there is no need to call
guest-fsfreeze-thaw.

Cc: Michael Roth 
Signed-off-by: Chen Hanxiao 
---
v2:
  remove has_mountpoints special case
  add qapi-schema.json section

 qga/commands-posix.c | 6 ++
 qga/qapi-schema.json | 4 +++-
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/qga/commands-posix.c b/qga/commands-posix.c
index 967061444a..05cf9caa04 100644
--- a/qga/commands-posix.c
+++ b/qga/commands-posix.c
@@ -1274,6 +1274,12 @@ int64_t qmp_guest_fsfreeze_freeze_list(bool 
has_mountpoints,
 }
 
 free_fs_mount_list();
+/* We may not issue any FIFREEZE here.
+ * Just unset ga_state here and ready for the next call.
+ */
+if (i == 0) {
+ga_unset_frozen(ga_state);
+}
 return i;
 
 error:
diff --git a/qga/qapi-schema.json b/qga/qapi-schema.json
index 17884c7c70..1045cef386 100644
--- a/qga/qapi-schema.json
+++ b/qga/qapi-schema.json
@@ -435,7 +435,9 @@
 # for up to 10 seconds by VSS.
 #
 # Returns: Number of file systems currently frozen. On error, all filesystems
-# will be thawed.
+# will be thawed. If no filesystems are frozen as a result of this call,
+# then @guest-fsfreeze-status will remain "thawed" and calling
+# @guest-fsfreeze-thaw is not necessary.
 #
 # Since: 0.15.0
 ##
-- 
2.14.3




Re: [Qemu-devel] [virtio-dev] Re: [PATCH v4 3/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-03-14 Thread Wei Wang

On 03/14/2018 10:12 PM, Michael S. Tsirkin wrote:

On Wed, Mar 14, 2018 at 02:03:19PM +0800, Wei Wang wrote:

On 03/14/2018 10:53 AM, Michael S. Tsirkin wrote:

On Wed, Mar 14, 2018 at 10:43:01AM +0800, Wei Wang wrote:

On 03/14/2018 12:49 AM, Michael S. Tsirkin wrote:

On Wed, Mar 07, 2018 at 08:34:24PM +0800, Wei Wang wrote:


Signed-off-by: Wei Wang 
Signed-off-by: Liang Li 
CC: Michael S. Tsirkin 
CC: Dr. David Alan Gilbert 
CC: Juan Quintela 

I find it suspicious that neither unrealize nor reset
functions have been touched at all.
Are you sure you have thought through scenarious like
hot-unplug or disabling the device by guest?

OK. I think we can call balloon_free_page_stop in unrealize and reset.



+static void *virtio_balloon_poll_free_page_hints(void *opaque)
+{
+VirtQueueElement *elem;
+VirtIOBalloon *dev = opaque;
+VirtQueue *vq = dev->free_page_vq;
+uint32_t id;
+size_t size;
What makes it safe to poke at this device from multiple threads?
I think that it would be safer to do it from e.g. BH.


Actually the free_page_optimization thread is the only user of free_page_vq,
and there is only one optimization thread each time. Would this be safe
enough?

Best,
Wei

Aren't there other fields there? Also things like reset affect all VQs.


Yes. But I think BHs are used to avoid re-entrancy, which isn't the issue
here.

Since you are adding locks to address the issue - doesn't this imply
reentrancy is exactly the issue?


Not really. The lock isn't intended for any reentrancy issues, since 
there will be only one run of the virtio_balloon_poll_free_page_hints 
function at any given time. Instead, the lock is used to synchronize 
virtio_balloon_poll_free_page_hints and virtio_balloon_free_page_stop to 
access dev->free_page_report_status. Please see the whole picture below:


virtio_balloon_poll_free_page_hints()
{

while (1) {
qemu_spin_lock();
if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP ||
!runstate_is_running()) {
qemu_spin_unlock();
break;
}
...
if (id == dev->free_page_report_cmd_id) {
==>dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
...
qemu_spin_unlock();
}
}


static void virtio_balloon_free_page_stop(void *opaque)
{
VirtIOBalloon *s = opaque;
VirtIODevice *vdev = VIRTIO_DEVICE(s);

qemu_spin_lock();
...
==>   s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
...
qemu_spin_unlock();
}


Without the lock, there are theoretical possibilities that assigning 
STOP below is overridden by START above. In that 
case,virtio_balloon_free_page_stop does not effectively stop 
virtio_balloon_poll_free_page_hints.

I think this issue couldn't be solved by BHs.

Best,
Wei



Re: [Qemu-devel] [PATCH] target/ppc: fix tlbsync to check privilege level depending on GTSE

2018-03-14 Thread David Gibson
On Wed, Mar 14, 2018 at 06:33:36PM +0100, Cédric Le Goater wrote:
> tlbsync also needs to check the Guest Translation Shootdown Enable
> (GTSE) bit in the Logical Partition Control Register (LPCR) to
> determine at which privilege level it is running.
> 
> See commit c6fd28fd573d ("target/ppc: Update tlbie to check privilege
> level based on GTSE")
> 
> Signed-off-by: Cédric Le Goater 
> ---
> 
>  This will have its importance when we activate the HV bit on the
>  POWER9 CPU for the PowerNV machine.
> 
>  target/ppc/translate.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)

Applied to ppc-for-2.12, thanks.

> 
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 0a0c090c9978..218665b4080b 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -4526,7 +4526,7 @@ static void gen_tlbie(DisasContext *ctx)
>  TCGv_i32 t1;
>  
>  if (ctx->gtse) {
> -CHK_SV; /* If gtse is set then tblie is supervisor privileged */
> +CHK_SV; /* If gtse is set then tlbie is supervisor privileged */
>  } else {
>  CHK_HV; /* Else hypervisor privileged */
>  }
> @@ -4553,7 +4553,12 @@ static void gen_tlbsync(DisasContext *ctx)
>  #if defined(CONFIG_USER_ONLY)
>  GEN_PRIV;
>  #else
> -CHK_HV;
> +
> +if (ctx->gtse) {
> +CHK_SV; /* If gtse is set then tlbsync is supervisor privileged */
> +} else {
> +CHK_HV; /* Else hypervisor privileged */
> +}
>  
>  /* BookS does both ptesync and tlbsync make tlbsync a nop for server */
>  if (ctx->insns_flags & PPC_BOOKE) {

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature


Re: [Qemu-devel] [Qemu-ppc] [PATCH] target/ppc: fix tlbsync to check privilege level depending on GTSE

2018-03-14 Thread Suraj Jitindar Singh
On Wed, 2018-03-14 at 18:33 +0100, Cédric Le Goater wrote:
> tlbsync also needs to check the Guest Translation Shootdown Enable
> (GTSE) bit in the Logical Partition Control Register (LPCR) to
> determine at which privilege level it is running.
> 
> See commit c6fd28fd573d ("target/ppc: Update tlbie to check privilege
> level based on GTSE")
> 
> Signed-off-by: Cédric Le Goater 

Reviewed-by: Suraj Jitindar Singh 

> ---
> 
>  This will have its importance when we activate the HV bit on the
>  POWER9 CPU for the PowerNV machine.
> 
>  target/ppc/translate.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 0a0c090c9978..218665b4080b 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -4526,7 +4526,7 @@ static void gen_tlbie(DisasContext *ctx)
>  TCGv_i32 t1;
>  
>  if (ctx->gtse) {
> -CHK_SV; /* If gtse is set then tblie is supervisor
> privileged */
> +CHK_SV; /* If gtse is set then tlbie is supervisor
> privileged */

^^^ Did that line actually change? :)

>  } else {
>  CHK_HV; /* Else hypervisor privileged */
>  }
> @@ -4553,7 +4553,12 @@ static void gen_tlbsync(DisasContext *ctx)
>  #if defined(CONFIG_USER_ONLY)
>  GEN_PRIV;
>  #else
> -CHK_HV;
> +
> +if (ctx->gtse) {
> +CHK_SV; /* If gtse is set then tlbsync is supervisor
> privileged */
> +} else {
> +CHK_HV; /* Else hypervisor privileged */
> +}
>  
>  /* BookS does both ptesync and tlbsync make tlbsync a nop for
> server */
>  if (ctx->insns_flags & PPC_BOOKE) {



Re: [Qemu-devel] [PATCH v4 2/4] migration: API to clear bits of guest free pages from the dirty bitmap

2018-03-14 Thread Michael S. Tsirkin
On Wed, Mar 14, 2018 at 07:42:59PM +, Dr. David Alan Gilbert wrote:
> * Michael S. Tsirkin (m...@redhat.com) wrote:
> > On Wed, Mar 14, 2018 at 06:11:37PM +, Dr. David Alan Gilbert wrote:
> > > > +used_len = block->used_length - offset;
> > > > +addr += used_len;
> > > > +}
> > > > +
> > > > +start = offset >> TARGET_PAGE_BITS;
> > > > +npages = used_len >> TARGET_PAGE_BITS;
> > > > +ram_state->migration_dirty_pages -=
> > > > +  bitmap_count_one_with_offset(block->bmap, start, 
> > > > npages);
> > > > +bitmap_clear(block->bmap, start, npages);
> > > 
> > > If this is happening while the migration is running, this isn't safe -
> > > the migration code could clear a bit at about the same point this
> > > happens, so that the count returned by bitmap_count_one_with_offset
> > > wouldn't match the word that was cleared by bitmap_clear.
> > > 
> > > The only way I can see to fix it is to run over the range using
> > > bitmap_test_and_clear_atomic, using the return value to decrement
> > > the number of dirty pages.
> > > But you also need to be careful with the update of the
> > > migration_dirty_pages value itself, because that's also being read
> > > by the migration thread.
> > > 
> > > Dave
> > 
> > I see that there's migration_bitmap_sync but it does not seem to be
> 
> Do you mean bitmap_mutex?

Yes. Sorry.

> > taken on all paths. E.g. migration_bitmap_clear_dirty and
> > migration_bitmap_find_dirty are called without that lock sometimes.
> > Thoughts?
> 
> Hmm, that doesn't seem to protect much at all!  It looks like it was
> originally added to handle hotplug causing the bitmaps to be resized;
> that extension code was removed in 66103a5 so that lock can probably go.
> 
> I don't see how the lock would help us though; the migration thread is
> scanning it most of the time so would have to have the lock held
> most of the time.
> 
> Dave
> 
> > -- 
> > MST
> --
> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PULL 18/41] blockjobs: add block-job-finalize

2018-03-14 Thread John Snow


On 03/13/2018 02:47 PM, Eric Blake wrote:
> On 03/13/2018 11:17 AM, Kevin Wolf wrote:
>> From: John Snow 
>>
>> Instead of automatically transitioning from PENDING to CONCLUDED, gate
>> the .prepare() and .commit() phases behind an explicit acknowledgement
>> provided by the QMP monitor if auto_finalize = false has been requested.
>>
> 
>>   ##
>> +# @block-job-finalize:
>> +#
>> +# Once a job that has manual=true reaches the pending state, it can be
> 
> Is this wording stale, given that you add two separate auto-* bool flags
> in 19/41?  You may want to prepare a followup patch (doc bug fixes are
> safe during softfreeze, so it need not hold up this pull request) that
> tweaks this and any similar stale wording.
> 

Fixed up in my local branch, will send out once the dust settles on master.

Thanks!

--js



Re: [Qemu-devel] [PATCH] migration: Fix rate limiting issue on RDMA migration

2018-03-14 Thread Dr. David Alan Gilbert
* Lidong Chen (jemmy858...@gmail.com) wrote:
> RDMA migration implement save_page function for QEMUFile, but
> ram_control_save_page do not increase bytes_xfer. So when doing
> RDMA migration, it will use whole bandwidth.

Hi,
  Thanks for this,

> Signed-off-by: Lidong Chen 
> ---
>  migration/qemu-file.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index 2ab2bf3..217609d 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -253,7 +253,7 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t 
> block_offset,
>  if (f->hooks && f->hooks->save_page) {
>  int ret = f->hooks->save_page(f, f->opaque, block_offset,
>offset, size, bytes_sent);
> -
> +f->bytes_xfer += size;

I'm a bit confused, because I know rdma.c calls acct_update_position()
and I'd always thought that was enough.
That calls qemu_update_position(...) which increases f->pos but not
f->bytes_xfer.

f_pos is used to calculate the 'transferred' value in
migration_update_counters and thus the current bandwidth and downtime -
but as you say, not the rate_limit.

So really, should this f->bytes_xfer += size   go in
qemu_update_position ?

Juan: I'm not sure I know why we have both bytes_xfer and pos.

Dave

>  if (ret != RAM_SAVE_CONTROL_DELAYED) {
>  if (bytes_sent && *bytes_sent > 0) {
>  qemu_update_position(f, *bytes_sent);
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v4 4/4] migration: use the free page hint feature from balloon

2018-03-14 Thread Dr. David Alan Gilbert
* Wei Wang (wei.w.w...@intel.com) wrote:
> Start the free page optimization after the migration bitmap is
> synchronized. This can't be used in the stop phase since the guest
> is paused. Make sure the guest reporting has stopped before
> synchronizing the migration dirty bitmap. Currently, the optimization is
> added to precopy only.
> 
> Signed-off-by: Wei Wang 
> CC: Dr. David Alan Gilbert 
> CC: Juan Quintela 
> CC: Michael S. Tsirkin 
> ---
>  migration/ram.c | 19 ++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/ram.c b/migration/ram.c
> index e172798..7b4c9b1 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -51,6 +51,8 @@
>  #include "qemu/rcu_queue.h"
>  #include "migration/colo.h"
>  #include "migration/block.h"
> +#include "sysemu/balloon.h"
> +#include "sysemu/sysemu.h"
>  
>  /***/
>  /* ram save/restore */
> @@ -208,6 +210,8 @@ struct RAMState {
>  uint32_t last_version;
>  /* We are in the first round */
>  bool ram_bulk_stage;
> +/* The free pages optimization feature is supported */
> +bool free_page_support;
>  /* How many times we have dirty too many pages */
>  int dirty_rate_high_cnt;
>  /* these variables are used for bitmap sync */
> @@ -775,7 +779,7 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, 
> RAMBlock *rb,
>  unsigned long *bitmap = rb->bmap;
>  unsigned long next;
>  
> -if (rs->ram_bulk_stage && start > 0) {
> +if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) {
>  next = start + 1;

An easier thing is just to clear the ram_bulk_stage flag (and if you're
doing it in the middle of the migration you need to reset some of the
pointers; see postcopy_start for an example).

>  } else {
>  next = find_next_bit(bitmap, size, start);
> @@ -833,6 +837,10 @@ static void migration_bitmap_sync(RAMState *rs)
>  int64_t end_time;
>  uint64_t bytes_xfer_now;
>  
> +if (rs->free_page_support) {
> +balloon_free_page_stop();

Does balloon_free_page_stop cause it to immediately stop, or does it
just ask nicely?   Could a slow guest keep pumping advice to us even
when it was stopped?

> +}
> +
>  ram_counters.dirty_sync_count++;
>  
>  if (!rs->time_last_bitmap_sync) {
> @@ -899,6 +907,10 @@ static void migration_bitmap_sync(RAMState *rs)
>  if (migrate_use_events()) {
>  qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL);
>  }
> +
> +if (rs->free_page_support && runstate_is_running()) {
> +balloon_free_page_start();
> +}
>  }
>  
>  /**
> @@ -1656,6 +1668,8 @@ static void ram_state_reset(RAMState *rs)
>  rs->last_page = 0;
>  rs->last_version = ram_list.version;
>  rs->ram_bulk_stage = true;
> +rs->free_page_support = balloon_free_page_support() &
> +!migration_in_postcopy();

That's probably the wrong test for postcopy; I think it'll always
be false there.  Using migrate_postcopy_ram() tells you whether
postcopy-ram is enabled; although not necessarily in use at that
point.

Dave

>  }
>  
>  #define MAX_WAIT 50 /* ms, half buffered_file limit */
> @@ -2330,6 +2344,9 @@ static int ram_save_iterate(QEMUFile *f, void *opaque)
>  
>  ret = qemu_file_get_error(f);
>  if (ret < 0) {
> +if (rs->free_page_support) {
> +balloon_free_page_stop();
> +}
>  return ret;
>  }
>  
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v4 2/4] migration: API to clear bits of guest free pages from the dirty bitmap

2018-03-14 Thread Dr. David Alan Gilbert
* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Wed, Mar 14, 2018 at 06:11:37PM +, Dr. David Alan Gilbert wrote:
> > > +used_len = block->used_length - offset;
> > > +addr += used_len;
> > > +}
> > > +
> > > +start = offset >> TARGET_PAGE_BITS;
> > > +npages = used_len >> TARGET_PAGE_BITS;
> > > +ram_state->migration_dirty_pages -=
> > > +  bitmap_count_one_with_offset(block->bmap, start, 
> > > npages);
> > > +bitmap_clear(block->bmap, start, npages);
> > 
> > If this is happening while the migration is running, this isn't safe -
> > the migration code could clear a bit at about the same point this
> > happens, so that the count returned by bitmap_count_one_with_offset
> > wouldn't match the word that was cleared by bitmap_clear.
> > 
> > The only way I can see to fix it is to run over the range using
> > bitmap_test_and_clear_atomic, using the return value to decrement
> > the number of dirty pages.
> > But you also need to be careful with the update of the
> > migration_dirty_pages value itself, because that's also being read
> > by the migration thread.
> > 
> > Dave
> 
> I see that there's migration_bitmap_sync but it does not seem to be

Do you mean bitmap_mutex?

> taken on all paths. E.g. migration_bitmap_clear_dirty and
> migration_bitmap_find_dirty are called without that lock sometimes.
> Thoughts?

Hmm, that doesn't seem to protect much at all!  It looks like it was
originally added to handle hotplug causing the bitmaps to be resized;
that extension code was removed in 66103a5 so that lock can probably go.

I don't see how the lock would help us though; the migration thread is
scanning it most of the time so would have to have the lock held
most of the time.

Dave

> -- 
> MST
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH v4 2/4] migration: API to clear bits of guest free pages from the dirty bitmap

2018-03-14 Thread Michael S. Tsirkin
On Wed, Mar 14, 2018 at 06:11:37PM +, Dr. David Alan Gilbert wrote:
> > +used_len = block->used_length - offset;
> > +addr += used_len;
> > +}
> > +
> > +start = offset >> TARGET_PAGE_BITS;
> > +npages = used_len >> TARGET_PAGE_BITS;
> > +ram_state->migration_dirty_pages -=
> > +  bitmap_count_one_with_offset(block->bmap, start, 
> > npages);
> > +bitmap_clear(block->bmap, start, npages);
> 
> If this is happening while the migration is running, this isn't safe -
> the migration code could clear a bit at about the same point this
> happens, so that the count returned by bitmap_count_one_with_offset
> wouldn't match the word that was cleared by bitmap_clear.
> 
> The only way I can see to fix it is to run over the range using
> bitmap_test_and_clear_atomic, using the return value to decrement
> the number of dirty pages.
> But you also need to be careful with the update of the
> migration_dirty_pages value itself, because that's also being read
> by the migration thread.
> 
> Dave

I see that there's migration_bitmap_sync but it does not seem to be
taken on all paths. E.g. migration_bitmap_clear_dirty and
migration_bitmap_find_dirty are called without that lock sometimes.
Thoughts?

-- 
MST



[Qemu-devel] [PATCH 2/2 v3] slirp: Add classless static routes support to DHCP server

2018-03-14 Thread Benjamin Drung
This patch will allow the user to specify classless static routes for
the replies from the built-in DHCP server, for example:

  qemu --net user,route=10.0.2.0/24,route=192.168.0.0/16 [...]

The QMP schema for the "route" option is ['str'], because the opts
visitor code only supports lists with a single mandatory scalar member.
The option "ipv6-net" takes two pieces and is split in net_client_init()
into "ipv6-prefix" and "ipv6-prefixlen" before calling
visit_type_Netdev()/visit_type_NetLegacy(). But that logic cannot be
used for the "route" option, because there is no intermediate format to
store the split parts for further processing without overhauling the
visitor code. Once the opts visitor supports lists with multiple
members, please update the QMP schema to:

{ 'struct': 'RouteEntry',
  'data': {
'subnet': 'str',
'mask_width': 'uint8',
'*gateway': 'str' } }
[...]
'route': [ 'RouteEntry' ]

Signed-off-by: Benjamin Drung 
---
v2: Address all remarks from Samuel Thibault:
 * Initialize values directly before usage
 * Make :gateway part optional
 * change formula for significant octet calculation
 * do not calculate significant_octets/option_length twice

v3:
 * change QMP member 'route' from ['String'] to ['str']
 * use '--net' in the documentation

 net/slirp.c  | 68 +---
 qapi/net.json|  4 
 qemu-options.hx  | 14 +++-
 slirp/bootp.c| 20 +
 slirp/bootp.h|  2 ++
 slirp/libslirp.h |  9 +++-
 slirp/slirp.c|  7 +-
 slirp/slirp.h|  2 ++
 8 files changed, 120 insertions(+), 6 deletions(-)

diff --git a/net/slirp.c b/net/slirp.c
index 8c08e5644f..6611eb257f 100644
--- a/net/slirp.c
+++ b/net/slirp.c
@@ -158,7 +158,7 @@ static int net_slirp_init(NetClientState *peer, const char 
*model,
   const char *vnameserver, const char *vnameserver6,
   const char *smb_export, const char *vsmbserver,
   const char **dnssearch, const char *vdomainname,
-  Error **errp)
+  const strList *vroutes, Error **errp)
 {
 /* default settings according to historic slirp */
 struct in_addr net  = { .s_addr = htonl(0x0a000200) }; /* 10.0.2.0 */
@@ -177,8 +177,12 @@ static int net_slirp_init(NetClientState *peer, const char 
*model,
 char buf[20];
 uint32_t addr;
 int shift;
+unsigned int i;
 char *end;
+unsigned int route_count;
 struct slirp_config_str *config;
+struct StaticRoute *routes = NULL;
+const strList *iter;
 
 if (!ipv4 && (vnetwork || vhost || vnameserver)) {
 error_setg(errp, "IPv4 disabled but netmask/host/dns provided");
@@ -365,6 +369,61 @@ static int net_slirp_init(NetClientState *peer, const char 
*model,
 return -1;
 }
 
+iter = vroutes;
+route_count = 0;
+while (iter) {
+route_count++;
+iter = iter->next;
+}
+routes = g_malloc(route_count * sizeof(StaticRoute));
+
+i = 0;
+iter = vroutes;
+while(iter != NULL) {
+char buf2[20];
+const char *gateway = iter->value;
+const char *mask;
+char *end;
+long mask_width;
+
+// Split "subnet/mask[:gateway]" into its components
+if (get_str_sep(buf2, sizeof(buf2), , ':') < 0) {
+// Fallback to default gateway
+routes[i].gateway.s_addr = host.s_addr;
+mask = gateway;
+} else {
+if (!inet_aton(gateway, [i].gateway)) {
+error_setg(errp, "Failed to parse route gateway '%s'", 
gateway);
+return -1;
+}
+mask = buf2;
+}
+
+if (get_str_sep(buf, sizeof(buf), , '/') < 0) {
+error_setg(errp, "Failed to parse route: No slash found in '%s'",
+   mask);
+return -1;
+}
+if (!inet_aton(buf, [i].subnet)) {
+error_setg(errp, "Failed to parse route subnet '%s'", buf);
+return -1;
+}
+
+mask_width = strtol(mask, , 10);
+if (*end != '\0') {
+error_setg(errp,
+   "Failed to parse netmask '%s' (trailing chars)", mask);
+return -1;
+} else if (mask_width < 0 || mask_width > 32) {
+error_setg(errp,
+   "Invalid netmask provided (must be in range 0-32)");
+return -1;
+}
+routes[i].mask_width = (uint8_t)mask_width;
+
+iter = iter->next;
+i++;
+}
 
 nc = qemu_new_net_client(_slirp_info, peer, model, name);
 
@@ -377,7 +436,8 @@ static int net_slirp_init(NetClientState *peer, const char 
*model,
 s->slirp = slirp_init(restricted, ipv4, net, mask, host,
   ipv6, ip6_prefix, vprefix6_len, ip6_host,
   vhostname, tftp_export, bootfile, dhcp,
-

Re: [Qemu-devel] [PATCH 2/2 v2] slirp: Add classless static routes support to DHCP server

2018-03-14 Thread Benjamin Drung
Am Donnerstag, den 08.03.2018, 14:21 -0600 schrieb Eric Blake:
> On 03/08/2018 02:07 PM, Benjamin Drung wrote:
> > Am Donnerstag, den 08.03.2018, 13:46 -0600 schrieb Eric Blake:
> > > On 03/08/2018 12:57 PM, Benjamin Drung wrote:
> > > >'*dnssearch': ['String'],
> > > >'*domainname': 'str',
> > > > +'*route': ['String'],
> > > 
> > > I know we've used ['String'] for previous members, but that's
> > > rather
> > > heavyweight - it transmits over QMP as:
> > > 
> > > "dnssearch": [ { "str": "foo" }, { "str": "bar" } ]
> > > 
> > > Nicer is ['str'], which transmits as:
> > > 
> > > "route": [ "foo", "bar" ]
> > > 
> > > so the question boils down to whether cross-member consistency is
> > > more
> > > important than making your additions concise.
> > 
> > Agreed that ['str'] is nicer. I will update the patch.
> 
> The problem is that ['str'] might not work easily for the command
> line 
> glue; I'm more familiar with how QMP exposes things than with the 
> command line parsing, and Markus, who is trying to improve command
> line 
> parsing to share more common infrastructure with QMP, might have
> better 
> comments on the topic, except that he's on leave for a few weeks and 
> won't respond until after 2.12 is frozen.  Using ['String'] for 
> consistency is therefore okay, if you can't get ['str'] working
> quickly.

['str'] works and was easily changeable.

> > > > @@ -1904,7 +1904,7 @@ DEF("netdev", HAS_ARG,
> > > > QEMU_OPTION_netdev,
> > > >" [,ipv6[=on|off]][,ipv6-net=addr[/int]][,ipv6-
> > > > host=addr]\n"
> 
> Here's an example where we made the command line smart.  ipv6-net
> takes 
> TWO pieces of information: addr/int; on the QMP side, we spelled it 
> '*ipv6-prefix':'str' + 'ipv6-prefixlen':'int'.  So somewhere in the 
> command line parsing code for --net (which I'm less familiar with), 
> there is some glue code taking the compact representation and
> splitting 
> it into the more verbose but more direct QMP representation - well,
> that 
> is, if we are converting it into QMP form at all (part of the problem
> is 
> that our command line and runtime control don't always share code, 
> although we're trying to get better at that).

The split is done net_client_init() before iterating over the options.
This is equivalent to having following command line parameter:

  [,ipv6-prefix=addr][,ipv6-prefixlen=int]

instead of

  [,ipv6-net=addr[/int]]

> > > > +" [,route=addr/mask[:gateway]][,tftp=dir][,bootfil
> > > > e=f]
> > > > [,hostfwd=rule][,guestfwd=rule]"
> > > 
> > > Urgh - your QMP interface HAS to be further parsed to get to the
> > > useful
> > > information.  While it's nice to have compact syntax on the
> > > command
> > > line, it is really worth thinking about making information easier
> > > to
> > > consume (that is, NO further parsing required once the
> > > information is
> > > in
> > > JSON format).  Would it be any better to send things over the
> > > wire
> > > as:
> > > 
> > > "route": [ { "addr": "...", "mask": 24, "gateway": "..." } ]
> > 
> > That's looks good.
> 
> Okay, doing that would mean using something like:
> 
> { 'struct': 'RouteEntry', 'data': { 'addr': 'str', '*mask': 'int', 
> '*gateway': 'str' } }
> ...
> 'route': [ 'RouteEntry' ]
> 
> (but reuse, rather than inventing a new type, if one of the existing
> QMP 
> types already resembles what I proposed for RouteEntry)
> 
> The command line can still use route=addr/mask:gateway syntax, parse
> it 
> down into components, then compile the QMP array of already-parsed 
> structs (rather than making QMP take a direct ['String'] that still 
> needs further parsing).  It may take more glue code, but the idea is 
> that all the glue code should live on the front end, so that the QMP 
> backend should be easy to work with.

The problem is that the opts visitor code only supports lists with a
single mandatory scalar member. Bigger changes are needed to support
splitting 'route'. I have looked into the code for several hours, but
found no simple and non-ugly way to add the splitting without major
changes to the code. I am willing to adapt the patch if someone changes
the opts visitor to support lists with multiple members.

-- 
Benjamin Drung
System Developer
Debian & Ubuntu Developer

ProfitBricks GmbH
Greifswalder Str. 207
D - 10405 Berlin

Email: benjamin.dr...@profitbricks.com
URL: https://www.profitbricks.de

Sitz der Gesellschaft: Berlin
Registergericht: Amtsgericht Charlottenburg, HRB 125506 B
Geschäftsführer: Achim Weiss, Matthias Steinberg



Re: [Qemu-devel] [PATCH v4 3/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-03-14 Thread Dr. David Alan Gilbert
* Wei Wang (wei.w.w...@intel.com) wrote:
> The new feature enables the virtio-balloon device to receive hints of
> guest free pages from the free page vq.
> 
> balloon_free_page_start - start guest free page hint reporting.
> balloon_free_page_stop - stop guest free page hint reporting.
> 
> Note: balloon will report pages which were free at the time
> of this call. As the reporting happens asynchronously, dirty bit logging
> must be enabled before this call is made. Guest reporting must be
> disabled before the migration dirty bitmap is synchronized.
> 
> Signed-off-by: Wei Wang 
> Signed-off-by: Liang Li 
> CC: Michael S. Tsirkin 
> CC: Dr. David Alan Gilbert 
> CC: Juan Quintela 
> ---
>  balloon.c   |  49 +--
>  hw/virtio/virtio-balloon.c  | 183 
> +---
>  include/hw/virtio/virtio-balloon.h  |  15 +-
>  include/standard-headers/linux/virtio_balloon.h |   7 +
>  include/sysemu/balloon.h|  15 +-
>  5 files changed, 240 insertions(+), 29 deletions(-)



> +qemu_thread_create(>free_page_thread, "free_page_optimization_thread",

Note the maximum size of thread name is normally 14 chars, and also we
don't need to say 'thread' since we know it's a thread; I suggest
shortening it to "free_page_opt" or "balloon_fpo"

Dave

> +   virtio_balloon_poll_free_page_hints, s,
> +   QEMU_THREAD_JOINABLE);
> +}
> +
> +static void virtio_balloon_free_page_stop(void *opaque)
> +{
> +VirtIOBalloon *s = opaque;
> +VirtIODevice *vdev = VIRTIO_DEVICE(s);
> +
> +switch (s->free_page_report_status) {
> +case FREE_PAGE_REPORT_S_REQUESTED:
> +case FREE_PAGE_REPORT_S_START:
> +/*
> + * The guest hasn't done the reporting, so host sends a notification
> + * to the guest to actively stop the reporting before joining the
> + * optimization thread.
> + */
> +s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
> +virtio_notify_config(vdev);
> +case FREE_PAGE_REPORT_S_STOP:
> +/* The guest has stopped the reporting. Join the optimization thread 
> */
> +qemu_thread_join(>free_page_thread);
> +s->free_page_report_status = FREE_PAGE_REPORT_S_EXIT;
> +case FREE_PAGE_REPORT_S_EXIT:
> +/* The optimization thread has gone. No actions needded so far. */
> +break;
> +}
> +}
> +
>  static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t 
> *config_data)
>  {
>  VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
> @@ -315,6 +421,15 @@ static void virtio_balloon_get_config(VirtIODevice 
> *vdev, uint8_t *config_data)
>  
>  config.num_pages = cpu_to_le32(dev->num_pages);
>  config.actual = cpu_to_le32(dev->actual);
> +config.poison_val = cpu_to_le32(dev->poison_val);
> +
> +if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP) {
> +config.free_page_report_cmd_id =
> +   cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID);
> +} else {
> +config.free_page_report_cmd_id =
> +   cpu_to_le32(dev->free_page_report_cmd_id);
> +}
>  
>  trace_virtio_balloon_get_config(config.num_pages, config.actual);
>  memcpy(config_data, , sizeof(struct virtio_balloon_config));
> @@ -368,6 +483,7 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
>  ((ram_addr_t) dev->actual << 
> VIRTIO_BALLOON_PFN_SHIFT),
>  _abort);
>  }
> +dev->poison_val = le32_to_cpu(config.poison_val);
>  trace_virtio_balloon_set_config(dev->actual, oldactual);
>  }
>  
> @@ -377,6 +493,11 @@ static uint64_t virtio_balloon_get_features(VirtIODevice 
> *vdev, uint64_t f,
>  VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
>  f |= dev->host_features;
>  virtio_add_feature(, VIRTIO_BALLOON_F_STATS_VQ);
> +
> +if (dev->host_features & 1ULL << VIRTIO_BALLOON_F_FREE_PAGE_HINT) {
> +virtio_add_feature(, VIRTIO_BALLOON_F_PAGE_POISON);
> +}
> +
>  return f;
>  }
>  
> @@ -413,6 +534,18 @@ static int virtio_balloon_post_load_device(void *opaque, 
> int version_id)
>  return 0;
>  }
>  
> +static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> +.name = "virtio-balloon-device/free-page-report",
> +.version_id = 1,
> +.minimum_version_id = 1,
> +.needed = virtio_balloon_free_page_support,
> +.fields = (VMStateField[]) {
> +VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> +VMSTATE_UINT32(poison_val, VirtIOBalloon),
> +VMSTATE_END_OF_LIST()
> +}
> +};
> +
>  static const VMStateDescription vmstate_virtio_balloon_device = {
>  .name = "virtio-balloon-device",
>  .version_id = 1,
> @@ -423,30 +556,30 @@ static const VMStateDescription 
> 

Re: [Qemu-devel] [PATCH RFC] configure: shorthand for only enabling native softmmu target

2018-03-14 Thread Peter Maydell
On 14 March 2018 at 12:09, Daniel P. Berrangé  wrote:
> With the huge number of QEMU targets, a default configuration will take
> a very long time to rebuild. When developing most code changes, it is
> sufficient to test compilation with a single target - rebuilding all
> targets just extends compile times while not detecting any new problems.
>
> Developers will often thus specify a single target for configure,
> commonly matching the host architecture. eg
>
>   ./configure --target-list=x86_64-softmmu
>
> This works fine, but is a bit of a verbose thing to type out everytime
> configure is invoked. There are already short-hand args to disable all
> user targets, all softmmu targets, or all tcg targets. This adds one
> further shorthand to disable all non-native architecture targets.
>
>   ./configure --native

How common actually is that, though? Almost all the time when
I'm picking targets I want something that's *not* the native
target...

More to the point, I actually only fairly rarely run configure
by hand at all. I have a source tree, with a subdir build/
which I have lots of subdirectories of for each config I
care about or have cared about. So if I want to do something
with sparc I'll just use 'make -C build/sparc' which will
automatically rerun configure with the right arguments, because
they're in the config.status in that build tree.

And as Alex points out you already usually want to feed
configure a pile of options, like --enable-debug. I also
like --with-pkgversion=foo, because without that the
version depends on the git commit hash, which means
every time the git hash changes multiple files and
executables get pointlessly rebuilt because the version
string changed. Keeping build dirs around means that
all this sort of customisation of options stays around.

So I'm not hugely keen on this patch, because it saves
20 characters on a command line that really ought not to
need typing more than once a month at most, but it
adds another messy case statement in configure that
knows about multiple architecture and subarchitecture
names.

thanks
-- PMM



Re: [Qemu-devel] [PATCH v4 2/4] migration: API to clear bits of guest free pages from the dirty bitmap

2018-03-14 Thread Dr. David Alan Gilbert
* Wei Wang (wei.w.w...@intel.com) wrote:
> This patch adds an API to clear bits corresponding to guest free pages
> from the dirty bitmap. Spilt the free page block if it crosses the QEMU
> RAMBlock boundary.
> 
> Signed-off-by: Wei Wang 
> CC: Dr. David Alan Gilbert 
> CC: Juan Quintela 
> CC: Michael S. Tsirkin 
> ---
>  include/migration/misc.h |  2 ++
>  migration/ram.c  | 21 +
>  2 files changed, 23 insertions(+)
> 
> diff --git a/include/migration/misc.h b/include/migration/misc.h
> index 77fd4f5..fae1acf 100644
> --- a/include/migration/misc.h
> +++ b/include/migration/misc.h
> @@ -14,11 +14,13 @@
>  #ifndef MIGRATION_MISC_H
>  #define MIGRATION_MISC_H
>  
> +#include "exec/cpu-common.h"
>  #include "qemu/notify.h"
>  
>  /* migration/ram.c */
>  
>  void ram_mig_init(void);
> +void qemu_guest_free_page_hint(void *addr, size_t len);
>  
>  /* migration/block.c */
>  
> diff --git a/migration/ram.c b/migration/ram.c
> index 5e33e5c..e172798 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -2189,6 +2189,27 @@ static int ram_init_all(RAMState **rsp)
>  return 0;
>  }
>  

This could do with some comments

> +void qemu_guest_free_page_hint(void *addr, size_t len)
> +{
> +RAMBlock *block;
> +ram_addr_t offset;
> +size_t used_len, start, npages;

From your use I think the addr and len are coming raw from the guest;
so we need to take some care.

> +
> +for (used_len = len; len > 0; len -= used_len) {

That initialisation of used_len is unusual; I'd rather put that
in the body.

> +block = qemu_ram_block_from_host(addr, false, );

CHeck for block != 0

> +if (unlikely(offset + len > block->used_length)) {

I think to make that overflow safe, that should be:
  if (len > (block->used_length - offset)) {

But we'll need another test before it, because qemu_ram_block_from_host
seems to check max_length not used_length, so we need to check
for offset > block->used_length first

> +used_len = block->used_length - offset;
> +addr += used_len;
> +}
> +
> +start = offset >> TARGET_PAGE_BITS;
> +npages = used_len >> TARGET_PAGE_BITS;
> +ram_state->migration_dirty_pages -=
> +  bitmap_count_one_with_offset(block->bmap, start, 
> npages);
> +bitmap_clear(block->bmap, start, npages);

If this is happening while the migration is running, this isn't safe -
the migration code could clear a bit at about the same point this
happens, so that the count returned by bitmap_count_one_with_offset
wouldn't match the word that was cleared by bitmap_clear.

The only way I can see to fix it is to run over the range using
bitmap_test_and_clear_atomic, using the return value to decrement
the number of dirty pages.
But you also need to be careful with the update of the
migration_dirty_pages value itself, because that's also being read
by the migration thread.

Dave

> +}
> +}
> +
>  /*
>   * Each of ram_save_setup, ram_save_iterate and ram_save_complete has
>   * long-running RCU critical section.  When rcu-reclaims in the code
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH RFC] configure: shorthand for only enabling native softmmu target

2018-03-14 Thread Alex Bennée

Daniel P. Berrangé  writes:

> With the huge number of QEMU targets, a default configuration will take
> a very long time to rebuild. When developing most code changes, it is
> sufficient to test compilation with a single target - rebuilding all
> targets just extends compile times while not detecting any new problems.
>
> Developers will often thus specify a single target for configure,
> commonly matching the host architecture. eg
>
>   ./configure --target-list=x86_64-softmmu

A while back I messed with a patch that allowed stems in --target-list
so you could quickly select targets with stems:

  Subject: [PATCH 0/4] Current Travis queue
  Date: Fri, 15 Apr 2016 16:56:57 +0100
  Message-Id: <1460735821-12775-1-git-send-email-alex.ben...@linaro.org>

but if I recall Peter was worried about it breaking existing configure
lines.

>
> This works fine, but is a bit of a verbose thing to type out everytime
> configure is invoked. There are already short-hand args to disable all
> user targets, all softmmu targets, or all tcg targets. This adds one
> further shorthand to disable all non-native architecture targets.
>
>   ./configure --native

I'm not sure this is really the case. My history tends to be littered
with things like:

  ./configure --enable-debug --enable-debug-tcg --extra-cflags="-O0 -g3" 
--target-list=aarch64-linux-user

when compile time is an issue although my development box is an x86.
Normally I do compile all targets and rely on ccache to keep the compile
time reasonable.

>
> Signed-off-by: Daniel P. Berrangé 
> ---
>
> Suggestions welcomed for better names than --native, but bear in mind
> the goal is to minimise amount of typing so nothing too verbose, hence
> why I didn't do something like --disable-non-native ...

I would argue it's "almost" equivalent to --disable-tcg as most native
users in my experience aren't looking to run X on X via TCG. I could be
wrong of course.

>
>  configure | 24 
>  1 file changed, 24 insertions(+)
>
> diff --git a/configure b/configure
> index af72fc852e..807af93116 100755
> --- a/configure
> +++ b/configure
> @@ -233,6 +233,22 @@ supported_whpx_target() {
>  return 1
>  }
>
> +supported_native_target() {
> +glob "$1" "*-softmmu" || return 1
> +case "${1%-softmmu}:$cpu" in
> +arm:arm | aarch64:aarch64 | \
> +i386:i386 | i386:x32 | \
> +x86_64:x86_64 | \
> +mips:mips | mipsel:mips | \
> +ppc:ppc | ppcemb:ppc | \
> +ppc64:ppc64 | \
> +s390x:s390x)
> +return 0
> +;;
> +esac
> +return 1
> +}
> +

This strikes me as another place to mess about with when doing target
specific changes to configure.

>  supported_target() {
>  case "$1" in
>  *-softmmu)
> @@ -254,6 +270,10 @@ supported_target() {
>  return 1
>  ;;
>  esac
> +if test "$native" = "yes"
> +then
> + supported_native_target "$1" || return 1
> +fi
>  test "$tcg" = "yes" && return 0
>  supported_kvm_target "$1" && return 0
>  supported_xen_target "$1" && return 0
> @@ -390,6 +410,7 @@ cocoa="no"
>  softmmu="yes"
>  linux_user="no"
>  bsd_user="no"
> +native="no"
>  blobs="yes"
>  pkgversion=""
>  pie=""
> @@ -1112,6 +1133,8 @@ for opt do
>cocoa="yes" ;
>audio_drv_list="coreaudio $(echo $audio_drv_list | sed s,coreaudio,,g)"
>;;
> +  --native) native="yes"
> +  ;;
>--disable-system) softmmu="no"
>;;
>--enable-system) softmmu="yes"
> @@ -1540,6 +1563,7 @@ Advanced options (experts only):
> xen pv domain builder
>--enable-debug-stack-usage
> track the maximum stack usage of stacks created 
> by qemu_alloc_stack
> +  --native only enable the softmmu target matching host 
> architecture
>
>  Optional features, enabled with --enable-FEATURE and
>  disabled with --disable-FEATURE, default is enabled if available:


--
Alex Bennée



Re: [Qemu-devel] CVE-2018-7550 (was: multiboot: bss_end_addr can be zero / cleanup)

2018-03-14 Thread Kevin Wolf
Am 14.03.2018 um 18:35 hat Konrad Rzeszutek Wilk geschrieben:
> On March 14, 2018 1:23:51 PM EDT, Kevin Wolf  wrote:
> >Am 21.12.2017 um 18:25 hat Jack Schwartz geschrieben:
> >> Properly account for the possibility of multiboot kernels with a zero
> >> bss_end_addr.  The Multiboot Specification, section 3.1.3 allows for
> >> kernels without a bss section, by allowing a zeroed bss_end_addr
> >multiboot
> >> header field.
> >> 
> >> Do some cleanup to multiboot.c as well:
> >> - Remove some unused variables.
> >> - Use more intuitive header names when displaying fields in messages.
> >> - Change fprintf(stderr...) to error_report
> >
> >[ Cc: qemu-stable ]
> >
> >This series happens to fix CVE-2018-7550.
> >http://www.openwall.com/lists/oss-security/2018/03/08/4
> >
> >Just a shame that we weren't told before merging it so that the
> >appropriate tags could have been set in the commit message (and all of
> >the problems could have been addressed; I'm going to send another
> >Multiboot series now).
> 
> Huh?
> 
> You mean the CVE tags that were created in 2018 for a patch posted in
> 2017?

Well, it seems to me that this patch was created for a different
purpose, but it happens to fix the bug for which this CVE was assigned
now. It's not your or Jack's fault, that's just how things go sometimes.

I think PJP knew that this CVE was coming before the patches were merged
into master, so if he had told us, we could have had a better commit
message. But either way, it's not a disaster to have a suboptimal commit
message.

> Or that the reporter of the security issue didn't point to this particular 
> patch?
> 
> Irrespective of that,  is there a write-up  of how security process
> works at QEMU?
> 
> That is what is the usual embargo period, the list of security folks,
> how one can become one, what are the responsibilities, how changes to
> process are being carried out (and discussed), what breath of testing
> and PoC work is done , how security fixes are being reviewed, etc?

I don't think a problem like this would be embargoed at all. Anyway,
have a look here:

https://wiki.qemu.org/SecurityProcess

Kevin



Re: [Qemu-devel] [PATCH 5/5] tests/multiboot: Add .gitignore

2018-03-14 Thread Eric Blake

On 03/14/2018 12:43 PM, Eric Blake wrote:

On 03/14/2018 12:32 PM, Kevin Wolf wrote:

Signed-off-by: Kevin Wolf 
---
  tests/multiboot/.gitignore | 3 +++
  1 file changed, 3 insertions(+)
  create mode 100644 tests/multiboot/.gitignore


Reviewed-by: Eric Blake 


Huh - and I even proposed something similar a while back:
https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg00305.html

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH 5/5] tests/multiboot: Add .gitignore

2018-03-14 Thread Eric Blake

On 03/14/2018 12:32 PM, Kevin Wolf wrote:

Signed-off-by: Kevin Wolf 
---
  tests/multiboot/.gitignore | 3 +++
  1 file changed, 3 insertions(+)
  create mode 100644 tests/multiboot/.gitignore


Reviewed-by: Eric Blake 

--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



[Qemu-devel] [PATCH] target/ppc: fix tlbsync to check privilege level depending on GTSE

2018-03-14 Thread Cédric Le Goater
tlbsync also needs to check the Guest Translation Shootdown Enable
(GTSE) bit in the Logical Partition Control Register (LPCR) to
determine at which privilege level it is running.

See commit c6fd28fd573d ("target/ppc: Update tlbie to check privilege
level based on GTSE")

Signed-off-by: Cédric Le Goater 
---

 This will have its importance when we activate the HV bit on the
 POWER9 CPU for the PowerNV machine.

 target/ppc/translate.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 0a0c090c9978..218665b4080b 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4526,7 +4526,7 @@ static void gen_tlbie(DisasContext *ctx)
 TCGv_i32 t1;
 
 if (ctx->gtse) {
-CHK_SV; /* If gtse is set then tblie is supervisor privileged */
+CHK_SV; /* If gtse is set then tlbie is supervisor privileged */
 } else {
 CHK_HV; /* Else hypervisor privileged */
 }
@@ -4553,7 +4553,12 @@ static void gen_tlbsync(DisasContext *ctx)
 #if defined(CONFIG_USER_ONLY)
 GEN_PRIV;
 #else
-CHK_HV;
+
+if (ctx->gtse) {
+CHK_SV; /* If gtse is set then tlbsync is supervisor privileged */
+} else {
+CHK_HV; /* Else hypervisor privileged */
+}
 
 /* BookS does both ptesync and tlbsync make tlbsync a nop for server */
 if (ctx->insns_flags & PPC_BOOKE) {
-- 
2.13.6




Re: [Qemu-devel] CVE-2018-7550 (was: multiboot: bss_end_addr can be zero / cleanup)

2018-03-14 Thread Konrad Rzeszutek Wilk
On March 14, 2018 1:23:51 PM EDT, Kevin Wolf  wrote:
>Am 21.12.2017 um 18:25 hat Jack Schwartz geschrieben:
>> Properly account for the possibility of multiboot kernels with a zero
>> bss_end_addr.  The Multiboot Specification, section 3.1.3 allows for
>> kernels without a bss section, by allowing a zeroed bss_end_addr
>multiboot
>> header field.
>> 
>> Do some cleanup to multiboot.c as well:
>> - Remove some unused variables.
>> - Use more intuitive header names when displaying fields in messages.
>> - Change fprintf(stderr...) to error_report
>
>[ Cc: qemu-stable ]
>
>This series happens to fix CVE-2018-7550.
>http://www.openwall.com/lists/oss-security/2018/03/08/4
>
>Just a shame that we weren't told before merging it so that the
>appropriate tags could have been set in the commit message (and all of
>the problems could have been addressed; I'm going to send another
>Multiboot series now).

Huh?

You mean the CVE tags that were created in 2018 for a patch posted in 2017?

Or that the reporter of the security issue didn't point to this particular 
patch?

Irrespective of that,  is there a write-up  of how security process works at 
QEMU?

 That is what is the usual embargo period, the list of security folks, how one 
can become one, what are the responsibilities, how changes to process are being 
carried out (and discussed), what breath of testing and PoC work is done , how 
security fixes are being reviewed, etc?

Thanks!

>
>Kevin




[Qemu-devel] [PATCH 5/5] tests/multiboot: Add .gitignore

2018-03-14 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
---
 tests/multiboot/.gitignore | 3 +++
 1 file changed, 3 insertions(+)
 create mode 100644 tests/multiboot/.gitignore

diff --git a/tests/multiboot/.gitignore b/tests/multiboot/.gitignore
new file mode 100644
index 00..93ef99800b
--- /dev/null
+++ b/tests/multiboot/.gitignore
@@ -0,0 +1,3 @@
+*.bin
+*.elf
+test.out
-- 
2.13.6




[Qemu-devel] [PATCH 4/5] tests/multiboot: Add tests for the a.out kludge

2018-03-14 Thread Kevin Wolf
Signed-off-by: Kevin Wolf 
---
 tests/multiboot/Makefile|  22 +--
 tests/multiboot/aout_kludge.S   | 138 
 tests/multiboot/aout_kludge.out |  42 
 tests/multiboot/run_test.sh |  10 ++-
 4 files changed, 204 insertions(+), 8 deletions(-)
 create mode 100644 tests/multiboot/aout_kludge.S
 create mode 100644 tests/multiboot/aout_kludge.out

diff --git a/tests/multiboot/Makefile b/tests/multiboot/Makefile
index 36f01dc647..ed4225e7d1 100644
--- a/tests/multiboot/Makefile
+++ b/tests/multiboot/Makefile
@@ -3,16 +3,26 @@ CCFLAGS=-m32 -Wall -Wextra -Werror -fno-stack-protector 
-nostdinc -fno-builtin
 ASFLAGS=-m32
 
 LD=ld
-LDFLAGS=-melf_i386 -T link.ld
+LDFLAGS_ELF=-melf_i386 -T link.ld
+LDFLAGS_BIN=-melf_i386 -T link.ld --oformat=binary
 LIBS=$(shell $(CC) $(CCFLAGS) -print-libgcc-file-name)
 
-all: mmap.elf modules.elf
+AOUT_KLUDGE_BIN=$(foreach x,$(shell seq 1 9),aout_kludge_$x.bin)
 
-mmap.elf: start.o mmap.o libc.o
-   $(LD) $(LDFLAGS) -o $@ $^ $(LIBS)
+all: mmap.elf modules.elf $(AOUT_KLUDGE_BIN)
 
-modules.elf: start.o modules.o libc.o
-   $(LD) $(LDFLAGS) -o $@ $^ $(LIBS)
+mmap.elf: start.o mmap.o libc.o link.ld
+   $(LD) $(LDFLAGS_ELF) -o $@ $^ $(LIBS)
+
+modules.elf: start.o modules.o libc.o link.ld
+   $(LD) $(LDFLAGS_ELF) -o $@ $^ $(LIBS)
+
+aout_kludge_%.bin: aout_kludge_%.o link.ld
+   $(LD) $(LDFLAGS_BIN) -o $@ $^ $(LIBS)
+
+.PRECIOUS: aout_kludge_%.o
+aout_kludge_%.o: aout_kludge.S
+   $(CC) $(ASFLAGS) -DSCENARIO=$* -c -o $@ $^
 
 %.o: %.c
$(CC) $(CCFLAGS) -c -o $@ $^
diff --git a/tests/multiboot/aout_kludge.S b/tests/multiboot/aout_kludge.S
new file mode 100644
index 00..52e8ebd766
--- /dev/null
+++ b/tests/multiboot/aout_kludge.S
@@ -0,0 +1,138 @@
+/*
+ * Copyright (c) 2018 Kevin Wolf 
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+
+.section multiboot
+
+#define MB_MAGIC 0x1badb002
+#define MB_FLAGS 0x1
+#define MB_CHECKSUM -(MB_MAGIC + MB_FLAGS)
+
+.align  4
+.intMB_MAGIC
+.intMB_FLAGS
+.intMB_CHECKSUM
+
+#define LAST_BYTE_VALUE 0xa5
+
+/*
+ * Order of fields in the a.out kludge header fields:
+ *
+ * header_addr
+ * load_addr
+ * load_end_addr
+ * bss_end_addr
+ * entry_addr
+ */
+#if SCENARIO == 1
+/* Well-behaved kernel file with explicit bss_end */
+.int0x10
+.int0x10
+.intdata_end
+.intdata_end
+.int_start
+#elif SCENARIO == 2
+/* Well-behaved kernel file with default bss_end */
+.int0x10
+.int0x10
+.intdata_end
+.int0
+.int_start
+#elif SCENARIO == 3
+/* Well-behaved kernel file with default load_end */
+.int0x10
+.int0x10
+.int0
+.int0
+.int_start
+#elif SCENARIO == 4
+/* Well-behaved kernel file with load_end < data_end and bss > data_end */
+#undef LAST_BYTE_VALUE
+#define LAST_BYTE_VALUE 0
+.int0x10
+.int0x10
+.intcode_end
+.int0x14
+.int_start
+#elif SCENARIO == 5
+/* header < load */
+.int0x1
+.int0x10
+.intdata_end
+.intdata_end
+.int_start
+#elif SCENARIO == 6
+/* load_end < load */
+.int0x10
+.int0x10
+.int0x1
+.intdata_end
+.int_start
+#elif SCENARIO == 7
+/* header much larger than in reality with default load_end */
+.int0x8000
+.int0x10
+.int0
+.intdata_end
+.int_start
+#elif SCENARIO == 8
+/* bss_end < load_end - load (regression test for CVE-2018-7550) */
+.int0x10
+.int0x10
+.intdata_end
+.intcode_end
+.int_start
+#elif SCENARIO == 9
+/* Default load_end_addr, load_addr + kernel_file_size > UINT32_MAX */
+.int0xf000
+.int0xf000
+.int0
+.int0xf001
+.int_start
+#else
+#error Invalid SCENARIO
+#endif
+
+.section .text
+.global _start
+_start:
+xor %eax, %eax
+
+cmpb$LAST_BYTE_VALUE, last_byte
+je  

[Qemu-devel] [PATCH 1/5] multiboot: Reject kernels exceeding the address space

2018-03-14 Thread Kevin Wolf
The code path with a manually set mh_load_addr in the Multiboot header
checks that load_end_addr <= load_addr, but the path where load_end_addr
is automatically detected if 0 is given in the header misses the
corresponding check. If the kernel binary size is larger than can fit in
the address space after load_addr, we ended up with a kernel_size that
is smaller than load_size, which means that we read the file into a too
small buffer.

Add a check to reject kernel files with such Multiboot headers.

Signed-off-by: Kevin Wolf 
---
 hw/i386/multiboot.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c
index b9064264d8..1e215bf8d3 100644
--- a/hw/i386/multiboot.c
+++ b/hw/i386/multiboot.c
@@ -247,6 +247,10 @@ int load_multiboot(FWCfgState *fw_cfg,
 }
 mb_load_size = kernel_file_size - mb_kernel_text_offset;
 }
+if (mb_load_size > UINT32_MAX - mh_load_addr) {
+error_report("kernel does not fit in address space");
+exit(1);
+}
 if (mh_bss_end_addr) {
 if (mh_bss_end_addr < (mh_load_addr + mb_load_size)) {
 error_report("invalid bss_end_addr address");
-- 
2.13.6




[Qemu-devel] [PATCH 3/5] tests/multiboot: Test exit code for every qemu run

2018-03-14 Thread Kevin Wolf
Testing the exit code only once after a whole group of tests has
completed is not enough, it catches errors only in the very last qemu
invocation. We need to have the check after each qemu run.

The logging and diff with the reference output is still done once per
group to keep things more managable. This is not a problem because the
log file accumulates the output of all runs.

Signed-off-by: Kevin Wolf 
---
 tests/multiboot/run_test.sh | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/tests/multiboot/run_test.sh b/tests/multiboot/run_test.sh
index 0278148b43..bc9c3670af 100755
--- a/tests/multiboot/run_test.sh
+++ b/tests/multiboot/run_test.sh
@@ -38,6 +38,17 @@ run_qemu() {
 ret=$?
 
 cat test.out >> test.log
+
+debugexit=$((ret & 0x1))
+ret=$((ret >> 1))
+
+if [ $debugexit != 1 ]; then
+printf %b "\e[31m ?? \e[0m $kernel $* (no debugexit used, exit code 
$ret)\n"
+pass=0
+elif [ $ret != 0 ]; then
+printf %b "\e[31mFAIL\e[0m $kernel $* (exit code $ret)\n"
+pass=0
+fi
 }
 
 mmap() {
@@ -61,19 +72,8 @@ make all
 for t in mmap modules; do
 
 echo > test.log
-$t
-
-debugexit=$((ret & 0x1))
-ret=$((ret >> 1))
 pass=1
-
-if [ $debugexit != 1 ]; then
-printf %b "\e[31m ?? \e[0m $t (no debugexit used, exit code $ret)\n"
-pass=0
-elif [ $ret != 0 ]; then
-printf %b "\e[31mFAIL\e[0m $t (exit code $ret)\n"
-pass=0
-fi
+$t
 
 if ! diff $t.out test.log > /dev/null 2>&1; then
 printf %b "\e[31mFAIL\e[0m $t (output difference)\n"
-- 
2.13.6




[Qemu-devel] [PATCH 0/5] multiboot: Fix buffer overflow on invalid kernels

2018-03-14 Thread Kevin Wolf
Patch 1 fixes another Multiboot kernel validation bug that could cause
QEMU to load the kernel image file into a too small buffer. Patch 2 adds
another check to harden the code. The rest of the series adds Multiboot
test cases for kernels using the a.out kludge, which is where the recent
bugs were found.

Kevin Wolf (5):
  multiboot: Reject kernels exceeding the address space
  multiboot: Check validity of mh_header_addr
  tests/multiboot: Test exit code for every qemu run
  tests/multiboot: Add tests for the a.out kludge
  tests/multiboot: Add .gitignore

 hw/i386/multiboot.c |   8 +++
 tests/multiboot/.gitignore  |   3 +
 tests/multiboot/Makefile|  22 +--
 tests/multiboot/aout_kludge.S   | 138 
 tests/multiboot/aout_kludge.out |  42 
 tests/multiboot/run_test.sh |  34 ++
 6 files changed, 227 insertions(+), 20 deletions(-)
 create mode 100644 tests/multiboot/.gitignore
 create mode 100644 tests/multiboot/aout_kludge.S
 create mode 100644 tests/multiboot/aout_kludge.out

-- 
2.13.6




[Qemu-devel] [PATCH 2/5] multiboot: Check validity of mh_header_addr

2018-03-14 Thread Kevin Wolf
I couldn't find a case where this prevents something bad from happening
that isn't already caught by other checks, but let's err on the safe
side and check that mh_header_addr is as expected.

Signed-off-by: Kevin Wolf 
---
 hw/i386/multiboot.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c
index 1e215bf8d3..5bc0a2cddb 100644
--- a/hw/i386/multiboot.c
+++ b/hw/i386/multiboot.c
@@ -229,6 +229,10 @@ int load_multiboot(FWCfgState *fw_cfg,
 error_report("invalid load_addr address");
 exit(1);
 }
+if (mh_header_addr - mh_load_addr > i) {
+error_report("invalid header_addr address");
+exit(1);
+}
 
 uint32_t mb_kernel_text_offset = i - (mh_header_addr - mh_load_addr);
 uint32_t mb_load_size = 0;
-- 
2.13.6




[Qemu-devel] CVE-2018-7550 (was: multiboot: bss_end_addr can be zero / cleanup)

2018-03-14 Thread Kevin Wolf
Am 21.12.2017 um 18:25 hat Jack Schwartz geschrieben:
> Properly account for the possibility of multiboot kernels with a zero
> bss_end_addr.  The Multiboot Specification, section 3.1.3 allows for
> kernels without a bss section, by allowing a zeroed bss_end_addr multiboot
> header field.
> 
> Do some cleanup to multiboot.c as well:
> - Remove some unused variables.
> - Use more intuitive header names when displaying fields in messages.
> - Change fprintf(stderr...) to error_report

[ Cc: qemu-stable ]

This series happens to fix CVE-2018-7550.
http://www.openwall.com/lists/oss-security/2018/03/08/4

Just a shame that we weren't told before merging it so that the
appropriate tags could have been set in the commit message (and all of
the problems could have been addressed; I'm going to send another
Multiboot series now).

Kevin



Re: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS

2018-03-14 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180314142018.13612-1-james.cowg...@mips.com
Subject: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 t [tag update]
patchew/20180312220753.20096-1-faro...@linux.vnet.ibm.com -> 
patchew/20180312220753.20096-1-faro...@linux.vnet.ibm.com
 * [new tag]   
patchew/20180314142018.13612-1-james.cowg...@mips.com -> 
patchew/20180314142018.13612-1-james.cowg...@mips.com
 t [tag update]patchew/20180314142133.14166-1-drjo...@redhat.com -> 
patchew/20180314142133.14166-1-drjo...@redhat.com
Switched to a new branch 'test'
6270103e8e linux-user: implement HWCAP bits on MIPS

=== OUTPUT BEGIN ===
Checking PATCH 1/1: linux-user: implement HWCAP bits on MIPS...
ERROR: braces {} are necessary for all arms of this statement
#35: FILE: linux-user/elfload.c:967:
+do { if (cpu->env.insn_flags & (flag)) { hwcaps |= hwcap; } } while (0)
[...]

total: 1 errors, 0 warnings, 30 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [PATCH v6 8/8] [RFH] tests: Add migration compress threads tests

2018-03-14 Thread Juan Quintela
Yeap, it is still not working. trying to learn how to debug threads
for guests running from the testt hardness.

For some reason, compression is not working at the moment, test is
disabled until I found why.

Signed-off-by: Juan Quintela 
---
 tests/migration-test.c | 52 ++
 1 file changed, 52 insertions(+)

diff --git a/tests/migration-test.c b/tests/migration-test.c
index eba40812fc..38c6d281c7 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -785,6 +785,55 @@ static void test_multifd_tcp(void)
 test_migrate_end(from, to, true);
 }
 
+static void test_compress(const char *uri)
+{
+QTestState *from, *to;
+
+test_migrate_start(, , uri, false);
+
+/* We want to pick a speed slow enough that the test completes
+ * quickly, but that it doesn't complete precopy even on a slow
+ * machine, so also set the downtime.
+ */
+/* 1 ms should make it not converge*/
+migrate_set_parameter(from, "downtime-limit", "1");
+/* 1GB/s */
+migrate_set_parameter(from, "max-bandwidth", "10");
+
+migrate_set_parameter(from, "compress-threads", "4");
+migrate_set_parameter(to, "decompress-threads", "3");
+
+migrate_set_capability(from, "compress", "true");
+migrate_set_capability(to, "compress", "true");
+/* Wait for the first serial output from the source */
+wait_for_serial("src_serial");
+
+migrate(from, uri);
+
+wait_for_migration_pass(from);
+
+/* 300ms it should converge */
+migrate_set_parameter(from, "downtime-limit", "300");
+
+if (!got_stop) {
+qtest_qmp_eventwait(from, "STOP");
+}
+qtest_qmp_eventwait(to, "RESUME");
+
+wait_for_serial("dest_serial");
+wait_for_migration_complete(from);
+
+test_migrate_end(from, to, true);
+}
+
+static void test_compress_unix(void)
+{
+char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+
+test_compress(uri);
+g_free(uri);
+}
+
 int main(int argc, char **argv)
 {
 char template[] = "/tmp/migration-test-XX";
@@ -811,6 +860,9 @@ int main(int argc, char **argv)
 qtest_add_func("/migration/precopy/tcp", test_precopy_tcp);
 qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
 qtest_add_func("/migration/multifd/tcp", test_multifd_tcp);
+if (0) {
+qtest_add_func("/migration/compress/unix", test_compress_unix);
+}
 
 ret = g_test_run();
 
-- 
2.14.3




[Qemu-devel] [PATCH v6 1/8] qemu-sockets: Export SocketAddress_to_str

2018-03-14 Thread Juan Quintela
Migration code needs that function in hmp.c (so we need to export it),
and it needs it on tests/migration-test.c, so we need to move it to a
place where it is compiled into the test framework.

Signed-off-by: Juan Quintela 
---
 chardev/char-socket.c  | 29 -
 include/qemu/sockets.h |  3 +++
 util/qemu-sockets.c| 29 +
 3 files changed, 32 insertions(+), 29 deletions(-)

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index a220803c01..dc27ecce81 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -373,35 +373,6 @@ static void tcp_chr_free_connection(Chardev *chr)
 s->connected = 0;
 }
 
-static char *SocketAddress_to_str(const char *prefix, SocketAddress *addr,
-  bool is_listen, bool is_telnet)
-{
-switch (addr->type) {
-case SOCKET_ADDRESS_TYPE_INET:
-return g_strdup_printf("%s%s:%s:%s%s", prefix,
-   is_telnet ? "telnet" : "tcp",
-   addr->u.inet.host,
-   addr->u.inet.port,
-   is_listen ? ",server" : "");
-break;
-case SOCKET_ADDRESS_TYPE_UNIX:
-return g_strdup_printf("%sunix:%s%s", prefix,
-   addr->u.q_unix.path,
-   is_listen ? ",server" : "");
-break;
-case SOCKET_ADDRESS_TYPE_FD:
-return g_strdup_printf("%sfd:%s%s", prefix, addr->u.fd.str,
-   is_listen ? ",server" : "");
-break;
-case SOCKET_ADDRESS_TYPE_VSOCK:
-return g_strdup_printf("%svsock:%s:%s", prefix,
-   addr->u.vsock.cid,
-   addr->u.vsock.port);
-default:
-abort();
-}
-}
-
 static void update_disconnected_filename(SocketChardev *s)
 {
 Chardev *chr = CHARDEV(s);
diff --git a/include/qemu/sockets.h b/include/qemu/sockets.h
index e88d4c37ab..a8bc2d7cfb 100644
--- a/include/qemu/sockets.h
+++ b/include/qemu/sockets.h
@@ -109,4 +109,7 @@ SocketAddress *socket_remote_address(int fd, Error **errp);
  */
 SocketAddress *socket_address_flatten(SocketAddressLegacy *addr);
 
+char *SocketAddress_to_str(const char *prefix, SocketAddress *addr,
+   bool is_listen, bool is_telnet);
+
 #endif /* QEMU_SOCKETS_H */
diff --git a/util/qemu-sockets.c b/util/qemu-sockets.c
index 7f13e8a338..25965c6ee1 100644
--- a/util/qemu-sockets.c
+++ b/util/qemu-sockets.c
@@ -1301,3 +1301,32 @@ SocketAddress 
*socket_address_flatten(SocketAddressLegacy *addr_legacy)
 
 return addr;
 }
+
+char *SocketAddress_to_str(const char *prefix, SocketAddress *addr,
+  bool is_listen, bool is_telnet)
+{
+switch (addr->type) {
+case SOCKET_ADDRESS_TYPE_INET:
+return g_strdup_printf("%s%s:%s:%s%s", prefix,
+   is_telnet ? "telnet" : "tcp",
+   addr->u.inet.host,
+   addr->u.inet.port,
+   is_listen ? ",server" : "");
+break;
+case SOCKET_ADDRESS_TYPE_UNIX:
+return g_strdup_printf("%sunix:%s%s", prefix,
+   addr->u.q_unix.path,
+   is_listen ? ",server" : "");
+break;
+case SOCKET_ADDRESS_TYPE_FD:
+return g_strdup_printf("%sfd:%s%s", prefix, addr->u.fd.str,
+   is_listen ? ",server" : "");
+break;
+case SOCKET_ADDRESS_TYPE_VSOCK:
+return g_strdup_printf("%svsock:%s:%s", prefix,
+   addr->u.vsock.cid,
+   addr->u.vsock.port);
+default:
+abort();
+}
+}
-- 
2.14.3




[Qemu-devel] [PATCH v6 7/8] migration: Add multifd test

2018-03-14 Thread Juan Quintela
We set the x-multifd-page-count and x-multifd-channels.

Signed-off-by: Juan Quintela 
Reviewed-by: Dr. David Alan Gilbert 
---
 tests/migration-test.c | 48 
 1 file changed, 48 insertions(+)

diff --git a/tests/migration-test.c b/tests/migration-test.c
index 8f367ea1d7..eba40812fc 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -738,6 +738,53 @@ static void test_precopy_tcp(void)
 g_free(uri);
 }
 
+static void test_multifd_tcp(void)
+{
+char *uri;
+QTestState *from, *to;
+
+test_migrate_start(, , "tcp:127.0.0.1:0", false);
+
+/* We want to pick a speed slow enough that the test completes
+ * quickly, but that it doesn't complete precopy even on a slow
+ * machine, so also set the downtime.
+ */
+/* 1 ms should make it not converge*/
+migrate_set_parameter(from, "downtime-limit", "1");
+/* 1GB/s */
+migrate_set_parameter(from, "max-bandwidth", "10");
+
+migrate_set_parameter(from, "x-multifd-channels", "4");
+migrate_set_parameter(to, "x-multifd-channels", "4");
+
+migrate_set_parameter(from, "x-multifd-page-count", "64");
+migrate_set_parameter(to, "x-multifd-page-count", "64");
+
+migrate_set_capability(from, "x-multifd", "true");
+migrate_set_capability(to, "x-multifd", "true");
+/* Wait for the first serial output from the source */
+wait_for_serial("src_serial");
+
+uri = migrate_get_socket_address(to, "x-socket-address");
+
+migrate(from, uri);
+
+wait_for_migration_pass(from);
+
+/* 300ms it should converge */
+migrate_set_parameter(from, "downtime-limit", "300");
+
+if (!got_stop) {
+qtest_qmp_eventwait(from, "STOP");
+}
+qtest_qmp_eventwait(to, "RESUME");
+
+wait_for_serial("dest_serial");
+wait_for_migration_complete(from);
+
+test_migrate_end(from, to, true);
+}
+
 int main(int argc, char **argv)
 {
 char template[] = "/tmp/migration-test-XX";
@@ -763,6 +810,7 @@ int main(int argc, char **argv)
 qtest_add_func("/migration/precopy/unix", test_precopy_unix);
 qtest_add_func("/migration/precopy/tcp", test_precopy_tcp);
 qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
+qtest_add_func("/migration/multifd/tcp", test_multifd_tcp);
 
 ret = g_test_run();
 
-- 
2.14.3




[Qemu-devel] [PATCH v6 6/8] tests: Add basic migration precopy tcp test

2018-03-14 Thread Juan Quintela
Not sharing code from precopy/unix because we have to read back the
tcp parameter.

Signed-off-by: Juan Quintela 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Peter Xu 
---
 tests/migration-test.c | 79 --
 1 file changed, 76 insertions(+), 3 deletions(-)

diff --git a/tests/migration-test.c b/tests/migration-test.c
index 4a94d3d598..8f367ea1d7 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -19,6 +19,9 @@
 #include "qemu/sockets.h"
 #include "chardev/char.h"
 #include "sysemu/sysemu.h"
+#include "qapi/qapi-visit-sockets.h"
+#include "qapi/qobject-input-visitor.h"
+#include "qapi/qobject-output-visitor.h"
 
 const unsigned start_address = 1024 * 1024;
 const unsigned end_address = 100 * 1024 * 1024;
@@ -277,8 +280,28 @@ static void cleanup(const char *filename)
 g_free(path);
 }
 
-static void migrate_check_parameter(QTestState *who, const char *parameter,
-const char *value)
+static char *migrate_get_socket_address(QTestState *who, const char *parameter)
+{
+QDict *rsp, *rsp_return;
+char *result;
+Error *local_err = NULL;
+SocketAddress *saddr = NULL;
+Visitor *iv = NULL;
+QObject *object;
+
+rsp = wait_command(who, "{ 'execute': 'query-migrate-parameters' }");
+rsp_return = qdict_get_qdict(rsp, "return");
+object = qdict_get(rsp_return, parameter);
+
+iv = qobject_input_visitor_new(object);
+visit_type_SocketAddress(iv, NULL, , _err);
+result = g_strdup_printf("%s",
+ SocketAddress_to_str("", saddr, false, false));
+QDECREF(rsp);
+return result;
+}
+
+static char *migrate_get_parameter(QTestState *who, const char *parameter)
 {
 QDict *rsp, *rsp_return;
 char *result;
@@ -287,9 +310,18 @@ static void migrate_check_parameter(QTestState *who, const 
char *parameter,
 rsp_return = qdict_get_qdict(rsp, "return");
 result = g_strdup_printf("%" PRId64,
  qdict_get_try_int(rsp_return,  parameter, -1));
+QDECREF(rsp);
+return result;
+}
+
+static void migrate_check_parameter(QTestState *who, const char *parameter,
+const char *value)
+{
+char *result;
+
+result = migrate_get_parameter(who, parameter);
 g_assert_cmpstr(result, ==, value);
 g_free(result);
-QDECREF(rsp);
 }
 
 static void migrate_set_parameter(QTestState *who, const char *parameter,
@@ -666,6 +698,46 @@ static void test_xbzrle_unix(void)
 g_free(uri);
 }
 
+static void test_precopy_tcp(void)
+{
+char *uri;
+QTestState *from, *to;
+
+test_migrate_start(, , "tcp:127.0.0.1:0", false);
+
+/* We want to pick a speed slow enough that the test completes
+ * quickly, but that it doesn't complete precopy even on a slow
+ * machine, so also set the downtime.
+ */
+/* 1 ms should make it not converge*/
+migrate_set_parameter(from, "downtime-limit", "1");
+/* 1GB/s */
+migrate_set_parameter(from, "max-bandwidth", "10");
+
+/* Wait for the first serial output from the source */
+wait_for_serial("src_serial");
+
+uri = migrate_get_socket_address(to, "x-socket-address");
+
+migrate(from, uri);
+
+wait_for_migration_pass(from);
+
+/* 300ms should converge */
+migrate_set_parameter(from, "downtime-limit", "300");
+
+if (!got_stop) {
+qtest_qmp_eventwait(from, "STOP");
+}
+qtest_qmp_eventwait(to, "RESUME");
+
+wait_for_serial("dest_serial");
+wait_for_migration_complete(from);
+
+test_migrate_end(from, to, true);
+g_free(uri);
+}
+
 int main(int argc, char **argv)
 {
 char template[] = "/tmp/migration-test-XX";
@@ -689,6 +761,7 @@ int main(int argc, char **argv)
 qtest_add_func("/migration/deprecated", test_deprecated);
 qtest_add_func("/migration/bad_dest", test_baddest);
 qtest_add_func("/migration/precopy/unix", test_precopy_unix);
+qtest_add_func("/migration/precopy/tcp", test_precopy_tcp);
 qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
 
 ret = g_test_run();
-- 
2.14.3




[Qemu-devel] [PATCH v6 5/8] tests: Migration ppc now inlines its program

2018-03-14 Thread Juan Quintela
No need to write it to a file.  Just need a proper firmware O:-)

Signed-off-by: Juan Quintela 
CC: Laurent Vivier 
---
 tests/migration-test.c | 41 +
 1 file changed, 5 insertions(+), 36 deletions(-)

diff --git a/tests/migration-test.c b/tests/migration-test.c
index fd885ba909..4a94d3d598 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -19,9 +19,6 @@
 #include "qemu/sockets.h"
 #include "chardev/char.h"
 #include "sysemu/sysemu.h"
-#include "hw/nvram/chrp_nvram.h"
-
-#define MIN_NVRAM_SIZE 8192 /* from spapr_nvram.c */
 
 const unsigned start_address = 1024 * 1024;
 const unsigned end_address = 100 * 1024 * 1024;
@@ -90,36 +87,6 @@ static void init_bootfile_x86(const char *bootpath)
 fclose(bootfile);
 }
 
-static void init_bootfile_ppc(const char *bootpath)
-{
-FILE *bootfile;
-char buf[MIN_NVRAM_SIZE];
-ChrpNvramPartHdr *header = (ChrpNvramPartHdr *)buf;
-
-memset(buf, 0, MIN_NVRAM_SIZE);
-
-/* Create a "common" partition in nvram to store boot-command property */
-
-header->signature = CHRP_NVPART_SYSTEM;
-memcpy(header->name, "common", 6);
-chrp_nvram_finish_partition(header, MIN_NVRAM_SIZE);
-
-/* FW_MAX_SIZE is 4MB, but slof.bin is only 900KB,
- * so let's modify memory between 1MB and 100MB
- * to do like PC bootsector
- */
-
-sprintf(buf + 16,
-"boot-command=hex .\" _\" begin %x %x do i c@ 1 + i c! 1000 +loop "
-".\" B\" 0 until", end_address, start_address);
-
-/* Write partition to the NVRAM file */
-
-bootfile = fopen(bootpath, "wb");
-g_assert_cmpint(fwrite(buf, MIN_NVRAM_SIZE, 1, bootfile), ==, 1);
-fclose(bootfile);
-}
-
 /*
  * Wait for some output in the serial output file,
  * we get an 'A' followed by an endless string of 'B's
@@ -410,12 +377,14 @@ static void test_migrate_start(QTestState **from, 
QTestState **to,
 if (access("/sys/module/kvm_hv", F_OK)) {
 accel = "tcg";
 }
-init_bootfile_ppc(bootpath);
 cmd_src = g_strdup_printf("-machine accel=%s -m 256M"
   " -name source,debug-threads=on"
   " -serial file:%s/src_serial"
-  " -drive file=%s,if=pflash,format=raw",
-  accel, tmpfs, bootpath);
+  " -prom-env '"
+  "boot-command=hex .\" _\" begin %x %x "
+  "do i c@ 1 + i c! 1000 +loop .\" B\" 0 "
+  "until'",  accel, tmpfs, end_address,
+  start_address);
 cmd_dst = g_strdup_printf("-machine accel=%s -m 256M"
   " -name target,debug-threads=on"
   " -serial file:%s/dest_serial"
-- 
2.14.3




[Qemu-devel] [PATCH v6 3/8] tests: Add migration xbzrle test

2018-03-14 Thread Juan Quintela
Signed-off-by: Juan Quintela 
Reviewed-by: Peter Xu 
---
 tests/migration-test.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/tests/migration-test.c b/tests/migration-test.c
index 834cdf50f2..fd885ba909 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -512,6 +512,20 @@ static void deprecated_set_speed(QTestState *who, const 
char *value)
 migrate_check_parameter(who, "max-bandwidth", value);
 }
 
+static void deprecated_set_cache_size(QTestState *who, const char *value)
+{
+QDict *rsp;
+gchar *cmd;
+
+cmd = g_strdup_printf("{ 'execute': 'migrate-set-cache-size',"
+  "'arguments': { 'value': %s } }", value);
+rsp = qtest_qmp(who, cmd);
+g_free(cmd);
+g_assert(qdict_haskey(rsp, "return"));
+QDECREF(rsp);
+migrate_check_parameter(who, "xbzrle-cache-size", value);
+}
+
 static void test_deprecated(void)
 {
 QTestState *from;
@@ -520,6 +534,7 @@ static void test_deprecated(void)
 
 deprecated_set_downtime(from, 0.12345);
 deprecated_set_speed(from, "12345");
+deprecated_set_cache_size(from, "4096");
 
 qtest_quit(from);
 }
@@ -634,6 +649,54 @@ static void test_precopy_unix(void)
 g_free(uri);
 }
 
+static void test_xbzrle(const char *uri)
+{
+QTestState *from, *to;
+
+test_migrate_start(, , uri, false);
+
+/* We want to pick a speed slow enough that the test completes
+ * quickly, but that it doesn't complete precopy even on a slow
+ * machine, so also set the downtime.
+ */
+/* 1 ms should make it not converge*/
+migrate_set_parameter(from, "downtime-limit", "1");
+/* 1GB/s */
+migrate_set_parameter(from, "max-bandwidth", "10");
+
+migrate_set_parameter(from, "xbzrle-cache-size", "33554432");
+
+migrate_set_capability(from, "xbzrle", "true");
+migrate_set_capability(to, "xbzrle", "true");
+/* Wait for the first serial output from the source */
+wait_for_serial("src_serial");
+
+migrate(from, uri);
+
+wait_for_migration_pass(from);
+
+/* 300ms should converge */
+migrate_set_parameter(from, "downtime-limit", "300");
+
+if (!got_stop) {
+qtest_qmp_eventwait(from, "STOP");
+}
+qtest_qmp_eventwait(to, "RESUME");
+
+wait_for_serial("dest_serial");
+wait_for_migration_complete(from);
+
+test_migrate_end(from, to, true);
+}
+
+static void test_xbzrle_unix(void)
+{
+char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+
+test_xbzrle(uri);
+g_free(uri);
+}
+
 int main(int argc, char **argv)
 {
 char template[] = "/tmp/migration-test-XX";
@@ -657,6 +720,7 @@ int main(int argc, char **argv)
 qtest_add_func("/migration/deprecated", test_deprecated);
 qtest_add_func("/migration/bad_dest", test_baddest);
 qtest_add_func("/migration/precopy/unix", test_precopy_unix);
+qtest_add_func("/migration/xbzrle/unix", test_xbzrle_unix);
 
 ret = g_test_run();
 
-- 
2.14.3




[Qemu-devel] [PATCH v6 0/8] Add make check tests for Migration

2018-03-14 Thread Juan Quintela

Hi

This is v6, it differest from the patches that I sent with previous
multifd post:
- Rename x-tcp-port to x-socket-address
  This is more *complicated* that it looks as:

  * it is a pointer, so I need to use QAPI_CLONE() to make info
migrate work

  * this is an union of structs.  In QAPI. So, a dict of strings.  The
only way that I was able to make things work is parsing the qdict
to a SocketAddress and then output a SocketAddress as an str.  It
needs to be an easier way, for sure.

  * Cleanups here andthere.

Please, review, Juan.

[v5]
- Several patches moved to pull request
- merge info_migrate and migration_tests
  only missing bit is tcp_port, needed for tcp tests
- Rename tcp-port to x-tcp-port
  We will get better naming from David at some point, and we will use that bit
- ppc: use inline code as suggested by lvivier

Please, review.

It is based on my previous pull request

Based-on: 20180129120932.12874-1-quint...@redhat.com

[v4]
- rebase on top on v4 info_migrate patches
- Tune sleeps to make patches fast
- Create a deprecated test for deprecated commands (i.e. make peterxu happy)
- create migrate_start_postcopy function
- fix naming/sizes between power and x86
- cleanup comments to match code

[v3]

- No more tests for deprecated parameters. Now I only use
  migrate_set_parameter.  If there is a deprecated command for that,
  we tests it there.
- free "result" string, always good to return memory (Peter found it)
- use the new tcp_port parameter from info migrate.  So we are
  handling well the tcp case.
- lots of code movement around to make everything consistent.
- Several patches already integrated upstream.

[v2]
- to make review easier, I started renaming postcopy-test.c to migration-test.c
- Did cleanups/refactoring there
- Don't use global-qtest anymore
- check that the parameters that we sent got really set
- RFH: comrpress threads tests is not working for some weird reason.  Using the 
same code on command line works.
  still investigating why.

ToDoo:

- tcp: after discussions with dave, we ended in conclusion that we
  need to use the 0 port and let the system gives us a free one

  But  that means that we need to be able to get that port back somehow.
  "info migrate" woring on destination side?

- compression threads.  There is some weird interaction with the test
  hardness and every migration thread get waiting in a different
  semaphore.  Investigating if it is a race/bug/whateverr

- deprecated commands: There was a suggestion to make
  migrate_set_parameter look at the parameter name and test old/new
  depending on something.  Not sure what to do here.

- testing commands: Is there a way to launch qemu and just sent
  qmp/hmp commands without having to really run anything else?

[v1]
- add test for precopy for unix/tcp
  exec and fd to came, don't know how to test rdma without hardware
- add tests using deprecated interfaces
- add test for xbzrle
  Note to myself, there is no way to set the cache size with 
migraton_set_parameters
- Add test for compress threads
  disabled on the series, right now it appears that compression is not working 
at all
- Move postcopy to use new results
  Idea is to move it on top of migration-test.c, but first I want some reviews 
on basic idea

Juan Quintela (8):
  qemu-sockets: Export SocketAddress_to_str
  tests: Add migration precopy test
  tests: Add migration xbzrle test
  migration: Create x-socket-address parameter
  tests: Migration ppc now inlines its program
  tests: Add basic migration precopy tcp test
  migration: Add multifd test
  [RFH] tests: Add migration compress threads tests

 chardev/char-socket.c  |  29 -
 hmp.c  |   6 +
 include/qemu/sockets.h |   3 +
 migration/migration.c  |  18 +++
 migration/migration.h  |   2 +
 migration/socket.c |  27 +++-
 qapi/migration.json|  14 ++-
 tests/migration-test.c | 328 ++---
 util/qemu-sockets.c|  29 +
 9 files changed, 379 insertions(+), 77 deletions(-)

-- 
2.14.3




[Qemu-devel] [PATCH v6 2/8] tests: Add migration precopy test

2018-03-14 Thread Juan Quintela
Signed-off-by: Juan Quintela 
Reviewed-by: Dr. David Alan Gilbert 
Reviewed-by: Peter Xu 
---
 tests/migration-test.c | 44 ++--
 1 file changed, 42 insertions(+), 2 deletions(-)

diff --git a/tests/migration-test.c b/tests/migration-test.c
index 422bf1afdf..834cdf50f2 100644
--- a/tests/migration-test.c
+++ b/tests/migration-test.c
@@ -524,7 +524,7 @@ static void test_deprecated(void)
 qtest_quit(from);
 }
 
-static void test_migrate(void)
+static void test_postcopy(void)
 {
 char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
 QTestState *from, *to;
@@ -595,6 +595,45 @@ static void test_baddest(void)
 test_migrate_end(from, to, false);
 }
 
+static void test_precopy_unix(void)
+{
+char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+QTestState *from, *to;
+
+test_migrate_start(, , uri, false);
+
+/* We want to pick a speed slow enough that the test completes
+ * quickly, but that it doesn't complete precopy even on a slow
+ * machine, so also set the downtime.
+ */
+/* 1 ms should make it not converge*/
+migrate_set_parameter(from, "downtime-limit", "1");
+/* 1GB/s */
+migrate_set_parameter(from, "max-bandwidth", "10");
+
+/* Wait for the first serial output from the source */
+wait_for_serial("src_serial");
+
+migrate(from, uri);
+
+wait_for_migration_pass(from);
+
+/* 300 ms should converge */
+migrate_set_parameter(from, "downtime-limit", "300");
+
+if (!got_stop) {
+qtest_qmp_eventwait(from, "STOP");
+}
+
+qtest_qmp_eventwait(to, "RESUME");
+
+wait_for_serial("dest_serial");
+wait_for_migration_complete(from);
+
+test_migrate_end(from, to, true);
+g_free(uri);
+}
+
 int main(int argc, char **argv)
 {
 char template[] = "/tmp/migration-test-XX";
@@ -614,9 +653,10 @@ int main(int argc, char **argv)
 
 module_call_init(MODULE_INIT_QOM);
 
-qtest_add_func("/migration/postcopy/unix", test_migrate);
+qtest_add_func("/migration/postcopy/unix", test_postcopy);
 qtest_add_func("/migration/deprecated", test_deprecated);
 qtest_add_func("/migration/bad_dest", test_baddest);
+qtest_add_func("/migration/precopy/unix", test_precopy_unix);
 
 ret = g_test_run();
 
-- 
2.14.3




[Qemu-devel] [PATCH v6 4/8] migration: Create x-socket-address parameter

2018-03-14 Thread Juan Quintela
It will be used to store the uri parameter. We want this only for tcp,
so we don't set it for other uris.  We need it to know what port is
migration running.

Signed-off-by: Juan Quintela 

--

This used to be uri parameter, but it has so many troubles to
reproduce that it don't just make sense.

This used to be a port parameter.  I was asked to move to
SocketAddress, done.
I also merged the setting of the migration tcp port in this one
because now I need to free the address, and this makes it easier.
---
 hmp.c |  6 ++
 migration/migration.c | 18 ++
 migration/migration.h |  2 ++
 migration/socket.c| 27 ++-
 qapi/migration.json   | 14 --
 5 files changed, 60 insertions(+), 7 deletions(-)

diff --git a/hmp.c b/hmp.c
index ba9e299ee2..eee84cfd5f 100644
--- a/hmp.c
+++ b/hmp.c
@@ -355,6 +355,12 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict 
*qdict)
 monitor_printf(mon, "%s: %" PRIu64 "\n",
 MigrationParameter_str(MIGRATION_PARAMETER_XBZRLE_CACHE_SIZE),
 params->xbzrle_cache_size);
+if (params->has_x_socket_address) {
+monitor_printf(mon, "%s: %s\n",
+MigrationParameter_str(MIGRATION_PARAMETER_X_SOCKET_ADDRESS),
+SocketAddress_to_str("", params->x_socket_address,
+ false, false));
+}
 }
 
 qapi_free_MigrationParameters(params);
diff --git a/migration/migration.c b/migration/migration.c
index 6a4780ef6f..3b811c213a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -31,6 +31,8 @@
 #include "migration/vmstate.h"
 #include "block/block.h"
 #include "qapi/error.h"
+#include "qapi/clone-visitor.h"
+#include "qapi/qapi-visit-sockets.h"
 #include "qapi/qapi-commands-migration.h"
 #include "qapi/qapi-events-migration.h"
 #include "qapi/qmp/qerror.h"
@@ -268,6 +270,14 @@ int migrate_send_rp_req_pages(MigrationIncomingState *mis, 
const char *rbname,
 return migrate_send_rp_message(mis, msg_type, msglen, bufc);
 }
 
+void migrate_set_address(SocketAddress *address)
+{
+MigrationState *s = migrate_get_current();
+
+s->parameters.has_x_socket_address = true;
+s->parameters.x_socket_address = address;
+}
+
 void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
 const char *p;
@@ -545,6 +555,11 @@ MigrationParameters *qmp_query_migrate_parameters(Error 
**errp)
 params->x_multifd_page_count = s->parameters.x_multifd_page_count;
 params->has_xbzrle_cache_size = true;
 params->xbzrle_cache_size = s->parameters.xbzrle_cache_size;
+if (s->parameters.x_socket_address) {
+params->has_x_socket_address = true;
+params->x_socket_address = QAPI_CLONE(SocketAddress,
+  s->parameters.x_socket_address);
+}
 
 return params;
 }
@@ -2542,6 +2557,9 @@ static void migration_instance_finalize(Object *obj)
 qemu_mutex_destroy(>error_mutex);
 g_free(params->tls_hostname);
 g_free(params->tls_creds);
+if (params->x_socket_address) {
+qapi_free_SocketAddress(params->x_socket_address);
+}
 qemu_sem_destroy(>pause_sem);
 error_free(ms->error);
 }
diff --git a/migration/migration.h b/migration/migration.h
index 08c5d2ded1..36b9c70fd6 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -234,4 +234,6 @@ void migrate_send_rp_pong(MigrationIncomingState *mis,
 int migrate_send_rp_req_pages(MigrationIncomingState *mis, const char* rbname,
   ram_addr_t start, size_t len);
 
+void migrate_set_address(SocketAddress *address);
+
 #endif
diff --git a/migration/socket.c b/migration/socket.c
index 8a93fb1af5..52db0c0c09 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -15,6 +15,7 @@
  */
 
 #include "qemu/osdep.h"
+#include "qemu/cutils.h"
 
 #include "qemu-common.h"
 #include "qemu/error-report.h"
@@ -161,17 +162,24 @@ out:
 }
 
 
-static void socket_start_incoming_migration(SocketAddress *saddr,
-Error **errp)
+static SocketAddress *socket_start_incoming_migration(SocketAddress *saddr,
+  Error **errp)
 {
 QIOChannelSocket *listen_ioc = qio_channel_socket_new();
+SocketAddress *address;
 
 qio_channel_set_name(QIO_CHANNEL(listen_ioc),
  "migration-socket-listener");
 
 if (qio_channel_socket_listen_sync(listen_ioc, saddr, errp) < 0) {
 object_unref(OBJECT(listen_ioc));
-return;
+return NULL;
+}
+
+address = qio_channel_socket_get_local_address(listen_ioc, errp);
+if (address < 0) {
+object_unref(OBJECT(listen_ioc));
+return NULL;
 }
 
 qio_channel_add_watch(QIO_CHANNEL(listen_ioc),
@@ -179,14 +187,20 @@ static void socket_start_incoming_migration(SocketAddress 
*saddr,
   

Re: [Qemu-devel] [PATCH v4 1/4] bitmap: bitmap_count_one_with_offset

2018-03-14 Thread Dr. David Alan Gilbert
* Wei Wang (wei.w.w...@intel.com) wrote:
> Count the number of 1s in a bitmap starting from an offset.
> 
> Signed-off-by: Wei Wang 
> CC: Dr. David Alan Gilbert 
> CC: Juan Quintela 
> CC: Michael S. Tsirkin 

Reviewed-by: Dr. David Alan Gilbert 

> ---
>  include/qemu/bitmap.h | 13 +
>  1 file changed, 13 insertions(+)
> 
> diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h
> index 509eedd..e3f31f1 100644
> --- a/include/qemu/bitmap.h
> +++ b/include/qemu/bitmap.h
> @@ -228,6 +228,19 @@ static inline long bitmap_count_one(const unsigned long 
> *bitmap, long nbits)
>  }
>  }
>  
> +static inline long bitmap_count_one_with_offset(const unsigned long *bitmap,
> +long offset, long nbits)
> +{
> +long aligned_offset = QEMU_ALIGN_DOWN(offset, BITS_PER_LONG);
> +long redundant_bits = offset - aligned_offset;
> +long bits_to_count = nbits + redundant_bits;
> +const unsigned long *bitmap_start = bitmap +
> +aligned_offset / BITS_PER_LONG;
> +
> +return bitmap_count_one(bitmap_start, bits_to_count) -
> +   bitmap_count_one(bitmap_start, redundant_bits);
> +}
> +
>  void bitmap_set(unsigned long *map, long i, long len);
>  void bitmap_set_atomic(unsigned long *map, long i, long len);
>  void bitmap_clear(unsigned long *map, long start, long nr);
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS

2018-03-14 Thread Laurent Vivier
Le 14/03/2018 à 16:31, James Cowgill a écrit :
> Add support for the two currently defined HWCAP bits on MIPS - R6 and
> MSA.
> 
> Buglink: https://bugs.launchpad.net/qemu/+bug/1754372
> Signed-off-by: James Cowgill 
> ---
> This was resent because I think I messed up my email config. Apologies if you
> receive this twice.
> 
>  linux-user/elfload.c | 24 
>  1 file changed, 24 insertions(+)
> 
> diff --git a/linux-user/elfload.c b/linux-user/elfload.c
> index 5fc130cc20..747b0ed10b 100644
> --- a/linux-user/elfload.c
> +++ b/linux-user/elfload.c
> @@ -950,6 +950,30 @@ static void elf_core_copy_regs(target_elf_gregset_t 
> *regs, const CPUMIPSState *e
>  #define USE_ELF_CORE_DUMP
>  #define ELF_EXEC_PAGESIZE4096
>  
> +/* See arch/mips/include/uapi/hwcap.h.  */

in fact arch/mips/include/uapi/asm/hwcap.h

> +enum {
> +HWCAP_MIPS_R6   = (1 << 0),
> +HWCAP_MIPS_MSA  = (1 << 1),
> +};

We have this for ARM only in elfload.c since:

afce2927aa Arm AT_HWCAP AUXV entry (Paul Brook) [2005]

but they have been added in include/elf.h since:

41d9ea80ac tcg-arm: Use qemu_getauxval [Richard Henderson, 2013]

and I think we should remove them (they are prefixed by ARM_)

So the MIPS ones should be in include/elf.h (with the #define form).

Richard, any comment?

Thanks,
Laurent



[Qemu-devel] [PATCH] target-mips: Add initrd support for the Boston board

2018-03-14 Thread Aleksandar Rikalo
From: Aleksandar Rikalo 

Add support for initial ramdisk loading for the Mips Boston board.

Signed-off-by: Aleksandar Rikalo 
---
 hw/mips/boston.c | 54 +-
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/hw/mips/boston.c b/hw/mips/boston.c
index fb23161..67ca54f 100644
--- a/hw/mips/boston.c
+++ b/hw/mips/boston.c
@@ -30,6 +30,7 @@
 #include "hw/loader-fit.h"
 #include "hw/mips/cps.h"
 #include "hw/mips/cpudevs.h"
+#include "hw/mips/mips.h"
 #include "hw/pci-host/xilinx-pcie.h"
 #include "qapi/error.h"
 #include "qemu/cutils.h"
@@ -333,10 +334,12 @@ static const void *boston_fdt_filter(void *opaque, const 
void *fdt_orig,
 {
 BostonState *s = BOSTON(opaque);
 MachineState *machine = s->mach;
-const char *cmdline;
+GString *cmdline;
 int err;
 void *fdt;
 size_t fdt_sz, ram_low_sz, ram_high_sz;
+long initrd_size;
+ram_addr_t initrd_offset;
 
 fdt_sz = fdt_totalsize(fdt_orig) * 2;
 fdt = g_malloc0(fdt_sz);
@@ -347,20 +350,53 @@ static const void *boston_fdt_filter(void *opaque, const 
void *fdt_orig,
 return NULL;
 }
 
-cmdline = (machine->kernel_cmdline && machine->kernel_cmdline[0])
-? machine->kernel_cmdline : " ";
-err = qemu_fdt_setprop_string(fdt, "/chosen", "bootargs", cmdline);
-if (err < 0) {
-fprintf(stderr, "couldn't set /chosen/bootargs\n");
-return NULL;
-}
-
 ram_low_sz = MIN(256 * M_BYTE, machine->ram_size);
 ram_high_sz = machine->ram_size - ram_low_sz;
 qemu_fdt_setprop_sized_cells(fdt, "/memory@0", "reg",
  1, 0x, 1, ram_low_sz,
  1, 0x9000, 1, ram_high_sz);
 
+cmdline = g_string_new(machine->kernel_cmdline);
+
+/* load initrd */
+initrd_offset = 0;
+if (machine->initrd_filename) {
+initrd_size = get_image_size(machine->initrd_filename);
+if (initrd_size > 0) {
+/* The kernel allocates the bootmap memory in the low memory after
+   the initrd.  It takes at most 128kiB for 2GB RAM and 4kiB
+   pages.  */
+initrd_offset = (ram_low_sz - initrd_size - 131072
+ - ~INITRD_PAGE_MASK) & INITRD_PAGE_MASK;
+
+if ((int64_t)cpu_mips_kseg0_to_phys(NULL, *load_addr + fdt_sz)
+>= (int64_t)initrd_offset) {
+error_report("memory too small for initial ram disk '%s'",
+ machine->initrd_filename);
+exit(1);
+}
+
+initrd_size = load_image_targphys(machine->initrd_filename,
+  initrd_offset,
+  initrd_size);
+}
+if (initrd_size == (target_ulong) -1) {
+error_report("could not load initial ram disk '%s'",
+ machine->initrd_filename);
+exit(1);
+}
+g_string_append_printf(cmdline, " rd_start=0x%" PRIx64 " rd_size=%li",
+   cpu_mips_phys_to_kseg0(NULL, initrd_offset),
+   initrd_size);
+}
+
+err = qemu_fdt_setprop_string(fdt, "/chosen", "bootargs", cmdline->str);
+g_string_free(cmdline, true);
+if (err < 0) {
+fprintf(stderr, "couldn't set /chosen/bootargs\n");
+return NULL;
+}
+
 fdt = g_realloc(fdt, fdt_totalsize(fdt));
 qemu_fdt_dumpdtb(fdt, fdt_sz);
 
-- 
1.9.1




[Qemu-devel] [Bug 1587065] Re: btrfs qemu-ga - multiple mounts block fsfreeze

2018-03-14 Thread Janåke Rönnblom
This affects Ubuntu 16.04 as in #4



** Also affects: qemu (Ubuntu)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1587065

Title:
  btrfs qemu-ga - multiple mounts block fsfreeze

Status in QEMU:
  Fix Released
Status in qemu package in Ubuntu:
  New

Bug description:
  Having two mounts of the same device makes fsfreeze through qemu-ga 
impossible.
  root@CmsrvMTA:/# mount -l | grep /dev/vda2
  /dev/vda2 on / type btrfs (rw,relatime,space_cache,subvolid=257,subvol=/@)
  /dev/vda2 on /home type btrfs 
(rw,relatime,space_cache,subvolid=258,subvol=/@home)

  Having two mounts is rather common with btrfs, so the feature fsfreeze
  is unusable on these systems.

  
  Below more information about how we encountered this issue...

  Message send to qemu-disc...@nongnu.org:

  Message 1:
  --
  I use external snapshots to backup my guests. I use the 'quiesce' option to 
flush and frees the guest file system with the qemu guest agent.

  With the exeption of one guest, this procedure works fine. On the 'unwilling' 
guest, I get this error message:
  "ERROR 2016-05-25 00:51:19 | T25-bakVMSCmsrvVH2 | fout: internal error: 
unable to execute QEMU agent command 'guest-fsfreeze-freeze': failed to freeze 
/: Device or resource busy"

  I don't think this is not some sort of time-out error, because
  activation of the fsfreeze and the error message happen immediately
  after each other:

  $ grep qemu-ga syslog.1
  May 25 00:51:19 CmsrvMTA qemu-ga: info: guest-fsfreeze called

  This is the only entry of the qemu guest agent in syslog.

  $ sudo virsh version
  Compiled against library: libvirt 1.3.1
  Using library: libvirt 1.3.1
  Gebruikte API: QEMU 1.3.1
  Draaiende hypervisor: QEMU 2.5.0

  $ virsh qemu-agent-command CmsrvMTA '{"execute": "guest-info"}'
  {"return":{"version":"2.5.0", ... 
,{"enabled":true,"name":"guest-fstrim","success-response":true},{"enabled":true,"name":"guest-fsfreeze-thaw","success-response":true},{"enabled":true,"name":"guest-fsfreeze-status","success-response":true},{"enabled":true,"name":"guest-fsfreeze-freeze-list","success-response":true},{"enabled":true,"name":"guest-fsfreeze-freeze","success-response":true},
 ... }

  For making an external snapshot, I use this command:
  $ virsh snapshot-create-as --domain CmsrvMTA sn1 --disk-only --atomic 
--quiesce --no-metadata --diskspec vda,file=/srv/poolVMS/CmsrvMTA.sn1

  Piece of reply 1:
  -
  I have encountered this before. Some operating systems
   may have bind-mounts that let a device appear multiple times in the mount 
list. Unfortunately the guest agent is not smart enough to consider a device 
that has been frozen as succesfull and keep going. This causes this specific 
error.

  Piece of reply 2:
  -
  Ok, that seems to be it.

  I’ve got the ‘/’ and ‘/home’ on the same device formatted as btrfs on
  two separate sub volumes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1587065/+subscriptions



Re: [Qemu-devel] [PATCH 0/3] WHPX introduce changes for Windows Insider SDK 17110

2018-03-14 Thread Paolo Bonzini
On 14/03/2018 15:52, Justin Terry (VM) wrote:
> This change set fixes two breaking changes that were introduced in the
> Windows Insider SDK 17110. First, a change to the WHvGetCapability function
> decl to include the 'out' WrittenSizeInBytes. Second, changes to the
> WHvSetPartitionProperty function decl and WHV_PARTITION_PROPERTY structure
> to directly pass the PropertyCode at invocation.
> 
> Lastly, it introduces a major performance improvement in whpx_vcpu_post_run
> using the VpContext exit structure rather than another round trip call to
> WHvGetVirtualProcessorRegisters to synchronize vp state.
> 
> QEMU compiled against this SDK (17110+) is expected to work on all Windows
> Insider Builds (17115+).
> 
> Thanks,
> Justin Terry
> 
> Justin Terry (VM) (3):
>   WHPX fix WHvGetCapability out WrittenSizeInBytes
>   WHPX fix WHvSetPartitionProperty in PropertyCode
>   WHPX improve vcpu_post_run perf
> 
>  configure  |  4 +++-
>  target/i386/whpx-all.c | 46 ++
>  2 files changed, 17 insertions(+), 33 deletions(-)
> 

Queued, thanks.

Paolo



[Qemu-devel] block-layer: questions on manipulation of internal nodes

2018-03-14 Thread Stefano Panella
Hi everybody,

I am a relatively new user of qemu block layer. I am interested in it mainly 
because it looks very powerful and general and I am hoping to integrate it on 
our product and to contribute to it for new usecases.

I have existing use cases where we work with a model of a disk process per VM 
disk and I am experimenting with qemu and qmp to  build something similar.

At the moment I have managed to build a new binary, called qemu-dp (probably 
should be called qemu-bl for block layer) which is basically starting as a qmp 
server and accepting qmp block layer commands to operate on disks.

just to give you an example this is the kind of thing I am doing:

EXTERNALLY:
/usr/lib64/xen/bin/qemu-img create -f qcow2 -o size=1M /root/a
/usr/lib64/xen/bin/qemu-img create -f qcow2 -b /root/a -o size=1M /root/b
/usr/lib64/xen/bin/qemu-img create -f qcow2 -b /root/b -o size=1M /root/c
/usr/lib64/xen/bin/qemu-img create -f qcow2 -b /root/c -o size=1M /root/d
/usr/lib64/xen/bin/qemu-img create -f qcow2 -b /root/d -o size=1M /root/e

let's assume there were some data in every layer

Than:

USING QMP:
{ "execute": "qmp_capabilities" }
{
"execute": "blockdev-add",  
"arguments": {
"driver": "qcow2", 
"node-name": "qemu_node",
"discard": "unmap",
"cache": {
"direct": true
},
"file": {
"driver": "file",
"filename": "/root/e"
}
}
}

{
"execute": "nbd-server-start",
"arguments": { 
"addr": {
"type": "unix",
"data": {
"path": "/tmp/nbd.test1"
}
}
}
}

{
"execute": "nbd-server-add",
"arguments": { 
"device": "qemu_node",
"writable": true
}
}

after this the chain looks like:

a < b < c < d < e < NBD_server

now I make a full copy of b and c which I call b1 and c1 and for example I run 
externally qemu-img commit c1 -> b1 while qemu-dp has still the chain opened.

I would now like to send a qmp command to tell qemu-dp to hold any IO from the 
NBD_server and forget about a, b, c and insert b1 as d's child, like this:

a < b1 < d < e < NBD_server

I have tried to implement this qmp command and looked at 

qmp_change_backing_file()
qmp_x_blockdev_change()
https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg02660.html

but did not figure out a way of doing that yet...

I suspect my problem is that I am still very confused about the semantics of 
the object model in the block layer, the ref counting, the graph manipulation, 
the monitor etc. etc.

I have tried to have some interactive chats on irc and they have been very 
useful so far (thanks again stefanha, kwolf, berto, eblake) but maybe a proper 
email would be a good starting point as stefanha has suggested.

Please if somebody could point me to a bit of code to achieve my example that 
would be great, otherwise if there is no code for that kind of functionality, 
it would be good to have a little guide on the sequence of block primiteve I 
should call and on which node, including refs, locking, drain, caches, reopen 
etc...

Thanks a lot,

Stefano



[Qemu-devel] [Bug 1754372] Re: Set MIPS MSA in ELF Auxiliary Vectors

2018-03-14 Thread James Cowgill
Patch:
https://lists.nongnu.org/archive/html/qemu-devel/2018-03/msg04399.html

** Changed in: qemu
   Status: New => In Progress

** Changed in: qemu
 Assignee: (unassigned) => James Cowgill (jcowgill)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1754372

Title:
  Set MIPS MSA in ELF Auxiliary Vectors

Status in QEMU:
  In Progress

Bug description:
  The MIPS MSA feature is currently not set in the ELF auxiliary vector.

  That is, querying the AT_HWCAP key of the ELF auxiliary vectors for a
  MIPS CPU that has the MSA feature should return a value that has the
  second bit [0] set.

  From [0], `HWCAP_MIPS_MSA` is defined to `1 << 1`.

  [0]:
  
https://github.com/torvalds/linux/blob/master/arch/mips/include/uapi/asm/hwcap.h

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1754372/+subscriptions



Re: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS

2018-03-14 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180314153121.23838-1-james.cowg...@mips.com
Subject: [Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]   
patchew/20180314153121.23838-1-james.cowg...@mips.com -> 
patchew/20180314153121.23838-1-james.cowg...@mips.com
Switched to a new branch 'test'
4b00115726 linux-user: implement HWCAP bits on MIPS

=== OUTPUT BEGIN ===
Checking PATCH 1/1: linux-user: implement HWCAP bits on MIPS...
ERROR: braces {} are necessary for all arms of this statement
#35: FILE: linux-user/elfload.c:967:
+do { if (cpu->env.insn_flags & (flag)) { hwcaps |= hwcap; } } while (0)
[...]

total: 1 errors, 0 warnings, 30 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

=== OUTPUT END ===

Test command exited with code: 1


---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org

[Qemu-devel] [PATCH v2] dump-guest-memory: more descriptive lookup_type failure

2018-03-14 Thread Andrew Jones
We've seen a few reports of

 (gdb) source /usr/share/qemu-kvm/dump-guest-memory.py
 Traceback (most recent call last):
   File "/usr/share/qemu-kvm/dump-guest-memory.py", line 19, in 
 UINTPTR_T = gdb.lookup_type("uintptr_t")
 gdb.error: No type named uintptr_t.

This occurs when symbols haven't been loaded first, i.e. neither a
QEMU binary was loaded nor a QEMU process was attached first. Let's
better inform the user of how to fix the issue themselves in order
to avoid more reports.

Acked-by: Janosch Frank 
Signed-off-by: Andrew Jones 
---
v2: Not quite so long a long line (< 90 only generates warnings)
Pick up Janosch's ack

 scripts/dump-guest-memory.py | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/scripts/dump-guest-memory.py b/scripts/dump-guest-memory.py
index 51acfcd0c053..276eebf0c27e 100644
--- a/scripts/dump-guest-memory.py
+++ b/scripts/dump-guest-memory.py
@@ -16,7 +16,12 @@ the COPYING file in the top-level directory.
 import ctypes
 import struct
 
-UINTPTR_T = gdb.lookup_type("uintptr_t")
+try:
+UINTPTR_T = gdb.lookup_type("uintptr_t")
+except Exception as inst:
+raise gdb.GdbError("Symbols must be loaded prior to sourcing 
dump-guest-memory.\n"
+   "Symbols may be loaded by 'attach'ing a QEMU process id 
or by "
+   "'load'ing a QEMU binary.")
 
 TARGET_PAGE_SIZE = 0x1000
 TARGET_PAGE_MASK = 0xF000
-- 
2.13.6




[Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS

2018-03-14 Thread James Cowgill
Add support for the two currently defined HWCAP bits on MIPS - R6 and
MSA.

Buglink: https://bugs.launchpad.net/qemu/+bug/1754372
Signed-off-by: James Cowgill 
---
This was resent because I think I messed up my email config. Apologies if you
receive this twice.

 linux-user/elfload.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 5fc130cc20..747b0ed10b 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -950,6 +950,30 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs, 
const CPUMIPSState *e
 #define USE_ELF_CORE_DUMP
 #define ELF_EXEC_PAGESIZE4096
 
+/* See arch/mips/include/uapi/hwcap.h.  */
+enum {
+HWCAP_MIPS_R6   = (1 << 0),
+HWCAP_MIPS_MSA  = (1 << 1),
+};
+
+#define ELF_HWCAP get_elf_hwcap()
+
+static uint32_t get_elf_hwcap(void)
+{
+MIPSCPU *cpu = MIPS_CPU(thread_cpu);
+uint32_t hwcaps = 0;
+
+#define GET_FEATURE(flag, hwcap) \
+do { if (cpu->env.insn_flags & (flag)) { hwcaps |= hwcap; } } while (0)
+
+GET_FEATURE(ISA_MIPS32R6 | ISA_MIPS64R6, HWCAP_MIPS_R6);
+GET_FEATURE(ASE_MSA, HWCAP_MIPS_MSA);
+
+#undef GET_FEATURE
+
+return hwcaps;
+}
+
 #endif /* TARGET_MIPS */
 
 #ifdef TARGET_MICROBLAZE
-- 
2.16.2




[Qemu-devel] [PATCH] linux-user: implement HWCAP bits on MIPS

2018-03-14 Thread James Cowgill
Add support for the two currently defined HWCAP bits on MIPS - R6 and
MSA.

Buglink: https://bugs.launchpad.net/qemu/+bug/1754372
Signed-off-by: James Cowgill 
---
 linux-user/elfload.c | 24 
 1 file changed, 24 insertions(+)

diff --git a/linux-user/elfload.c b/linux-user/elfload.c
index 5fc130cc20..747b0ed10b 100644
--- a/linux-user/elfload.c
+++ b/linux-user/elfload.c
@@ -950,6 +950,30 @@ static void elf_core_copy_regs(target_elf_gregset_t *regs, 
const CPUMIPSState *e
 #define USE_ELF_CORE_DUMP
 #define ELF_EXEC_PAGESIZE4096
 
+/* See arch/mips/include/uapi/hwcap.h.  */
+enum {
+HWCAP_MIPS_R6   = (1 << 0),
+HWCAP_MIPS_MSA  = (1 << 1),
+};
+
+#define ELF_HWCAP get_elf_hwcap()
+
+static uint32_t get_elf_hwcap(void)
+{
+MIPSCPU *cpu = MIPS_CPU(thread_cpu);
+uint32_t hwcaps = 0;
+
+#define GET_FEATURE(flag, hwcap) \
+do { if (cpu->env.insn_flags & (flag)) { hwcaps |= hwcap; } } while (0)
+
+GET_FEATURE(ISA_MIPS32R6 | ISA_MIPS64R6, HWCAP_MIPS_R6);
+GET_FEATURE(ASE_MSA, HWCAP_MIPS_MSA);
+
+#undef GET_FEATURE
+
+return hwcaps;
+}
+
 #endif /* TARGET_MIPS */
 
 #ifdef TARGET_MICROBLAZE
-- 
2.16.2




Re: [Qemu-devel] [Qemu-block] [PATCH v2 3/5] block/throttle: Remove protocol-related fields

2018-03-14 Thread Alberto Garcia
On Mon 12 Mar 2018 11:07:51 PM CET, Fabiano Rosas wrote:
> The throttle driver is not a protocol so it should implement bdrv_open
> instead of bdrv_file_open and not provide a protocol_name.
>
> Attempts to invoke this driver using protocol syntax
> (i.e. throttle:) will now fail gracefully:
>
>   $ qemu-img info throttle:foo
>   qemu-img: Could not open 'throttle:foo': Unknown protocol 'throttle'
>
> Signed-off-by: Fabiano Rosas 
> Reviewed-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH v2 2/5] block/quorum: Remove protocol-related fields

2018-03-14 Thread Alberto Garcia
On Mon 12 Mar 2018 11:07:50 PM CET, Fabiano Rosas wrote:
> The quorum driver is not a protocol so it should implement bdrv_open
> instead of bdrv_file_open and not provide a protocol_name.
>
> Attempts to invoke this driver using protocol syntax
> (i.e. quorum:) will now fail gracefully:
>
>   $ qemu-img info quorum:foo
>   qemu-img: Could not open 'quorum:foo': Unknown protocol 'quorum'
>
> Signed-off-by: Fabiano Rosas 
> Reviewed-by: Max Reitz 

Reviewed-by: Alberto Garcia 

Berto



Re: [Qemu-devel] [PATCH] dump-guest-memory: more descriptive lookup_type failure

2018-03-14 Thread Janosch Frank
On 14.03.2018 15:21, Andrew Jones wrote:
> We've seen a few reports of
> 
>  (gdb) source /usr/share/qemu-kvm/dump-guest-memory.py
>  Traceback (most recent call last):
>File "/usr/share/qemu-kvm/dump-guest-memory.py", line 19, in 
>  UINTPTR_T = gdb.lookup_type("uintptr_t")
>  gdb.error: No type named uintptr_t.

Oh yeah, I remember that particular error.
Acked-by: Janosch Frank 

> 
> This occurs when symbols haven't been loaded first, i.e. neither a
> QEMU binary was loaded nor a QEMU process was attached first. Let's
> better inform the user of how to fix the issue themselves in order
> to avoid more reports.
> 
> Signed-off-by: Andrew Jones 
> ---
>  scripts/dump-guest-memory.py | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/scripts/dump-guest-memory.py b/scripts/dump-guest-memory.py
> index 51acfcd0c053..e56fff6d7e82 100644
> --- a/scripts/dump-guest-memory.py
> +++ b/scripts/dump-guest-memory.py
> @@ -16,7 +16,11 @@ the COPYING file in the top-level directory.
>  import ctypes
>  import struct
> 
> -UINTPTR_T = gdb.lookup_type("uintptr_t")
> +try:
> +UINTPTR_T = gdb.lookup_type("uintptr_t")
> +except Exception as inst:
> +raise gdb.GdbError("Symbols must be loaded prior to sourcing 
> dump-guest-memory.\n"
> +   "Symbols may be loaded by first 'attach'ing a QEMU 
> process id or by 'load'ing a QEMU binary.")>
>  TARGET_PAGE_SIZE = 0x1000
>  TARGET_PAGE_MASK = 0xF000
> 




signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v10 08/24] migration: Add multifd test

2018-03-14 Thread Dr. David Alan Gilbert
* Juan Quintela (quint...@redhat.com) wrote:
> "Dr. David Alan Gilbert"  wrote:
> > * Juan Quintela (quint...@redhat.com) wrote:
> >> We set the x-multifd-page-count and x-multifd-channels.
> >> 
> >> Signed-off-by: Juan Quintela 
> >
> >
> > This should probably go nearer the end of the series;
> 
> it is _much_ better here.  It makes so much easier to test that I don't
> break neither migration nor multifd while developing O:-)

OK, not a biggy.

> > we've also got the problem that things are a bit delicate with TCG so
> > adding more migration tests probably shouldn't happen until Paolo's
> > TCG fixes are worked out.
> 
> 
> 
> > Also, should we be checking for some stats to show all 4 channels were
> > used?
> 
> There are traces now, I can add new counters is that is what you want.

If it's easy then it's worth it.

Dave

> Later, Juan.
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



[Qemu-devel] [PATCH 2/3] WHPX fix WHvSetPartitionProperty in PropertyCode

2018-03-14 Thread Justin Terry (VM) via Qemu-devel
This fixes a breaking change to WHvSetPartitionProperty to pass the 'in'
PropertyCode on function invocation introduced in Windows Insider SDK 17110.
Usage of this indicates the PropertyCode of the opaque PropertyBuffer passed in
on function invocation.

Also fixes the removal of the PropertyCode parameter from the
WHV_PARTITION_PROPERTY struct as it is now passed to the function directly.

Signed-off-by: Justin Terry (VM) 
---
 target/i386/whpx-all.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
index 2080d58c4c..63e6e1b6f2 100644
--- a/target/i386/whpx-all.c
+++ b/target/i386/whpx-all.c
@@ -1278,9 +1278,9 @@ static int whpx_accel_init(MachineState *ms)
 }
 
 memset(, 0, sizeof(WHV_PARTITION_PROPERTY));
-prop.PropertyCode = WHvPartitionPropertyCodeProcessorCount;
 prop.ProcessorCount = smp_cpus;
 hr = WHvSetPartitionProperty(whpx->partition,
+ WHvPartitionPropertyCodeProcessorCount,
  ,
  sizeof(WHV_PARTITION_PROPERTY));
 
-- 
2.13.6




[Qemu-devel] [PATCH 3/3] WHPX improve vcpu_post_run perf

2018-03-14 Thread Justin Terry (VM) via Qemu-devel
This removes the additional call to WHvGetVirtualProcessorRegisters in
whpx_vcpu_post_run now that the WHV_VP_EXIT_CONTEXT is returned in all
WHV_RUN_VP_EXIT_CONTEXT structures.

Signed-off-by: Justin Terry (VM) 
---
 target/i386/whpx-all.c | 41 +++--
 1 file changed, 11 insertions(+), 30 deletions(-)

diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
index 63e6e1b6f2..63f2b68910 100644
--- a/target/i386/whpx-all.c
+++ b/target/i386/whpx-all.c
@@ -153,7 +153,7 @@ struct whpx_vcpu {
 bool interruptable;
 uint64_t tpr;
 uint64_t apic_base;
-WHV_X64_PENDING_INTERRUPTION_REGISTER interrupt_in_flight;
+bool interruption_pending;
 
 /* Must be the last field as it may have a tail */
 WHV_RUN_VP_EXIT_CONTEXT exit_ctx;
@@ -695,7 +695,7 @@ static void whpx_vcpu_pre_run(CPUState *cpu)
 qemu_mutex_lock_iothread();
 
 /* Inject NMI */
-if (!vcpu->interrupt_in_flight.InterruptionPending &&
+if (!vcpu->interruption_pending &&
 cpu->interrupt_request & (CPU_INTERRUPT_NMI | CPU_INTERRUPT_SMI)) {
 if (cpu->interrupt_request & CPU_INTERRUPT_NMI) {
 cpu->interrupt_request &= ~CPU_INTERRUPT_NMI;
@@ -724,7 +724,7 @@ static void whpx_vcpu_pre_run(CPUState *cpu)
 }
 
 /* Get pending hard interruption or replay one that was overwritten */
-if (!vcpu->interrupt_in_flight.InterruptionPending &&
+if (!vcpu->interruption_pending &&
 vcpu->interruptable && (env->eflags & IF_MASK)) {
 assert(!new_int.InterruptionPending);
 if (cpu->interrupt_request & CPU_INTERRUPT_HARD) {
@@ -781,44 +781,25 @@ static void whpx_vcpu_pre_run(CPUState *cpu)
 
 static void whpx_vcpu_post_run(CPUState *cpu)
 {
-HRESULT hr;
-struct whpx_state *whpx = _global;
 struct whpx_vcpu *vcpu = get_whpx_vcpu(cpu);
 struct CPUX86State *env = (CPUArchState *)(cpu->env_ptr);
 X86CPU *x86_cpu = X86_CPU(cpu);
-WHV_REGISTER_VALUE reg_values[4];
-const WHV_REGISTER_NAME reg_names[4] = {
-WHvX64RegisterRflags,
-WHvX64RegisterCr8,
-WHvRegisterPendingInterruption,
-WHvRegisterInterruptState,
-};
 
-hr = WHvGetVirtualProcessorRegisters(whpx->partition, cpu->cpu_index,
- reg_names, 4, reg_values);
-if (FAILED(hr)) {
-error_report("WHPX: Failed to get interrupt state regusters,"
- " hr=%08lx", hr);
-vcpu->interruptable = false;
-return;
-}
+env->eflags = vcpu->exit_ctx.VpContext.Rflags;
 
-assert(reg_names[0] == WHvX64RegisterRflags);
-env->eflags = reg_values[0].Reg64;
-
-assert(reg_names[1] == WHvX64RegisterCr8);
-if (vcpu->tpr != reg_values[1].Reg64) {
-vcpu->tpr = reg_values[1].Reg64;
+uint64_t tpr = vcpu->exit_ctx.VpContext.Cr8;
+if (vcpu->tpr != tpr) {
+vcpu->tpr = tpr;
 qemu_mutex_lock_iothread();
 cpu_set_apic_tpr(x86_cpu->apic_state, vcpu->tpr);
 qemu_mutex_unlock_iothread();
 }
 
-assert(reg_names[2] == WHvRegisterPendingInterruption);
-vcpu->interrupt_in_flight = reg_values[2].PendingInterruption;
+vcpu->interruption_pending =
+vcpu->exit_ctx.VpContext.ExecutionState.InterruptionPending;
 
-assert(reg_names[3] == WHvRegisterInterruptState);
-vcpu->interruptable = !reg_values[3].InterruptState.InterruptShadow;
+vcpu->interruptable =
+!vcpu->exit_ctx.VpContext.ExecutionState.InterruptShadow;
 
 return;
 }
-- 
2.13.6




[Qemu-devel] [PATCH 1/3] WHPX fix WHvGetCapability out WrittenSizeInBytes

2018-03-14 Thread Justin Terry (VM) via Qemu-devel
This fixes a breaking change to WHvGetCapability to include the 'out'
WrittenSizeInBytes introduced in Windows Insider SDK 17110.

This specifies on return the safe length to read into the WHV_CAPABILITY
structure passed to the call.

Signed-off-by: Justin Terry (VM) 
---
 configure  | 4 +++-
 target/i386/whpx-all.c | 3 ++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index af72fc852e..1ad153cdfb 100755
--- a/configure
+++ b/configure
@@ -2491,7 +2491,9 @@ if test "$whpx" != "no" ; then
 #include 
 int main(void) {
 WHV_CAPABILITY whpx_cap;
-WHvGetCapability(WHvCapabilityCodeFeatures, _cap, sizeof(whpx_cap));
+UINT32 writtenSize;
+WHvGetCapability(WHvCapabilityCodeFeatures, _cap, sizeof(whpx_cap),
+ );
 return 0;
 }
 EOF
diff --git a/target/i386/whpx-all.c b/target/i386/whpx-all.c
index 940bbe590d..2080d58c4c 100644
--- a/target/i386/whpx-all.c
+++ b/target/i386/whpx-all.c
@@ -1254,6 +1254,7 @@ static int whpx_accel_init(MachineState *ms)
 int ret;
 HRESULT hr;
 WHV_CAPABILITY whpx_cap;
+UINT32 whpx_cap_size;
 WHV_PARTITION_PROPERTY prop;
 
 whpx = _global;
@@ -1262,7 +1263,7 @@ static int whpx_accel_init(MachineState *ms)
 whpx->mem_quota = ms->ram_size;
 
 hr = WHvGetCapability(WHvCapabilityCodeHypervisorPresent, _cap,
-  sizeof(whpx_cap));
+  sizeof(whpx_cap), _cap_size);
 if (FAILED(hr) || !whpx_cap.HypervisorPresent) {
 error_report("WHPX: No accelerator found, hr=%08lx", hr);
 ret = -ENOSPC;
-- 
2.13.6




[Qemu-devel] [PATCH 0/3] WHPX introduce changes for Windows Insider SDK 17110

2018-03-14 Thread Justin Terry (VM) via Qemu-devel
This change set fixes two breaking changes that were introduced in the
Windows Insider SDK 17110. First, a change to the WHvGetCapability function
decl to include the 'out' WrittenSizeInBytes. Second, changes to the
WHvSetPartitionProperty function decl and WHV_PARTITION_PROPERTY structure
to directly pass the PropertyCode at invocation.

Lastly, it introduces a major performance improvement in whpx_vcpu_post_run
using the VpContext exit structure rather than another round trip call to
WHvGetVirtualProcessorRegisters to synchronize vp state.

QEMU compiled against this SDK (17110+) is expected to work on all Windows
Insider Builds (17115+).

Thanks,
Justin Terry

Justin Terry (VM) (3):
  WHPX fix WHvGetCapability out WrittenSizeInBytes
  WHPX fix WHvSetPartitionProperty in PropertyCode
  WHPX improve vcpu_post_run perf

 configure  |  4 +++-
 target/i386/whpx-all.c | 46 ++
 2 files changed, 17 insertions(+), 33 deletions(-)

-- 
2.13.6




Re: [Qemu-devel] [PATCH v10 08/24] migration: Add multifd test

2018-03-14 Thread Juan Quintela
"Dr. David Alan Gilbert"  wrote:
> * Juan Quintela (quint...@redhat.com) wrote:
>> We set the x-multifd-page-count and x-multifd-channels.
>> 
>> Signed-off-by: Juan Quintela 
>
>
> This should probably go nearer the end of the series;

it is _much_ better here.  It makes so much easier to test that I don't
break neither migration nor multifd while developing O:-)

> we've also got the problem that things are a bit delicate with TCG so
> adding more migration tests probably shouldn't happen until Paolo's
> TCG fixes are worked out.



> Also, should we be checking for some stats to show all 4 channels were
> used?

There are traces now, I can add new counters is that is what you want.

Later, Juan.



Re: [Qemu-devel] [PATCH v10 04/24] migration: Set the migration tcp port

2018-03-14 Thread Juan Quintela
Daniel P. Berrange  wrote:
> On Wed, Mar 07, 2018 at 11:59:50AM +0100, Juan Quintela wrote:
>> We can set the port parameter as zero.  This patch lets us know what
>> port the system was choosen for us.  Now we can migrate to this place.
>> 
>> Signed-off-by: Juan Quintela 

>> +void migrate_set_port(const uint16_t port, Error **errp)
>> +{
>> +MigrateSetParameters p = {
>> +.has_x_tcp_port = true,
>> +.x_tcp_port = port,
>> +};
>> +
>> +qmp_migrate_set_parameters(, errp);
>> +}
>
> This is really not nice - it is requiring the QMP  'migrate-set-parameters'
> command to accept an extra field that is never something we want the end
> user to be allowed to set. We should not use the public QMP schema for
> data items we are just passing between 2 internal pieces of code.

void migrate_set_address(SocketAddress *address)
{
MigrationState *s = migrate_get_current();

s->parameters.has_x_socket_address = true;
s->parameters.x_socket_address = address;
}


I hope that is ok with you O:-)

Later, Juan.

PD.  Yes, I agree about not using QMP inside two pieces of code, but on
 the other hand, it make this so *future* proof O:-)



Re: [Qemu-devel] [PATCH for 2.12] iotests: Avoid realpath

2018-03-14 Thread Eric Blake

On 03/14/2018 09:38 AM, Eric Blake wrote:

CentOS 6 lacks a realpath binary on the base install, which makes
all iotests runs fail:

001 - output mismatch (see 001.out.bad)
./check: line 815: realpath: command not found
diff: missing operand after `/home/dummy/qemu/tests/qemu-iotests/001.out'
diff: Try `diff --help' for more information.

Many of the uses of 'realpath' in the check script were being
used on the output of 'type -p' - but that is already an
absolute file name.  We don't care if the name is canonical,
merely that it was an executable file with an absolute path.


Appears to have been broken in cceaf1db.



The remaining use was using realpath to convert a possibly
relative filename into an absolute one before calling diff,
but diff works just fine on the relative name.


Hmm, this last change reverts commit 93e53fb6 that added realpath on 
purpose for ease of diagnosing failed tests.  Maybe it's worth a v2 that 
tests whether realpath exists, and if so uses it, but does a safe 
fallback to just using the filename as-is.




Signed-off-by: Eric Blake 
---

This doesn't magically fix all iotests on CentOS 6, but lets
it get a lot further.  Bug fix, so safe after freeze.

  tests/qemu-iotests/check | 12 ++--
  1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 469142cd586..b719e9eee0c 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -92,7 +92,7 @@ set_prog_path()
  {
  p=`command -v $1 2> /dev/null`
  if [ -n "$p" -a -x "$p" ]; then
-realpath -- "$(type -p "$p")"
+type -p "$p"
  else
  return 1
  fi
@@ -554,7 +554,7 @@ then
  [ "$QEMU_PROG" = "" ] && _init_error "qemu not found"
  fi
  fi
-export QEMU_PROG=$(realpath -- "$(type -p "$QEMU_PROG")")
+export QEMU_PROG="$(type -p "$QEMU_PROG")"

  if [ -z "$QEMU_IMG_PROG" ]; then
  if [ -x "$build_iotests/qemu-img" ]; then
@@ -565,7 +565,7 @@ if [ -z "$QEMU_IMG_PROG" ]; then
  _init_error "qemu-img not found"
  fi
  fi
-export QEMU_IMG_PROG=$(realpath -- "$(type -p "$QEMU_IMG_PROG")")
+export QEMU_IMG_PROG="$(type -p "$QEMU_IMG_PROG")"

  if [ -z "$QEMU_IO_PROG" ]; then
  if [ -x "$build_iotests/qemu-io" ]; then
@@ -576,7 +576,7 @@ if [ -z "$QEMU_IO_PROG" ]; then
  _init_error "qemu-io not found"
  fi
  fi
-export QEMU_IO_PROG=$(realpath -- "$(type -p "$QEMU_IO_PROG")")
+export QEMU_IO_PROG="$(type -p "$QEMU_IO_PROG")"

  if [ -z $QEMU_NBD_PROG ]; then
  if [ -x "$build_iotests/qemu-nbd" ]; then
@@ -587,7 +587,7 @@ if [ -z $QEMU_NBD_PROG ]; then
  _init_error "qemu-nbd not found"
  fi
  fi
-export QEMU_NBD_PROG=$(realpath -- "$(type -p "$QEMU_NBD_PROG")")
+export QEMU_NBD_PROG="$(type -p "$QEMU_NBD_PROG")"

  if [ -z "$QEMU_VXHS_PROG" ]; then
export QEMU_VXHS_PROG="`set_prog_path qnio_server`"
@@ -811,7 +811,7 @@ do
  else
  echo " - output mismatch (see $seq.out.bad)"
  mv $tmp.out $seq.out.bad
-$diff -w "$reference" $(realpath $seq.out.bad)
+$diff -w "$reference" $seq.out.bad
  err=true
  fi
  fi



--
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



Re: [Qemu-devel] [PATCH v10 03/24] migration: Create tcp_port parameter

2018-03-14 Thread Juan Quintela
Daniel P. Berrange  wrote:
> On Wed, Mar 07, 2018 at 11:59:49AM +0100, Juan Quintela wrote:
>> It will be used to store the uri tcp_port parameter.  This is the only
>> parameter than can change and we can need to be able to connect to it.
>> 
>> Signed-off-by: Juan Quintela 
>> 
>> --
>>  # TODO either fuse back into MigrationParameters, or make
>> @@ -584,7 +591,8 @@
>>  '*block-incremental': 'bool',
>>  '*x-multifd-channels': 'int',
>>  '*x-multifd-page-count': 'int',
>> -'*xbzrle-cache-size': 'size' } }
>> +'*xbzrle-cache-size': 'size',
>> +'*x-tcp-port': 'uint16'} }
>
> This should not exist - this exposes this parameter in migate-set-parameters
> as a end user settable property, which is definitely not desirable.
>
> It is only something we should report with 'query-migrate' / 'info migrate'

Oops, my understanding was that the three places have to be in sync.
Now I stand corrected.  Thanks.


>
>>  # @migrate-set-parameters:
>> @@ -667,6 +675,10 @@
>>  # needs to be a multiple of the target page size
>>  # and a power of 2
>>  # (Since 2.11)
>> +#
>> +# @x-tcp-port: Only used for tcp, to know what the real port is
>> +# (Since 2.12)
>> +#
>>  # Since: 2.4
>>  ##
>>  { 'struct': 'MigrationParameters',
>> @@ -683,7 +695,8 @@
>>  '*block-incremental': 'bool' ,
>>  '*x-multifd-channels': 'uint8',
>>  '*x-multifd-page-count': 'uint32',
>> -'*xbzrle-cache-size': 'size' } }
>> +'*xbzrle-cache-size': 'size',
>> +'*x-tcp-port': 'uint16'} }
>
> As mentioned in previous review, IMHO we should be reporting the full
> socket address, and allow an array of them, since we're going to have
> more than one address available. i.e.
>
>'*socket-address': ['SocketAddress']

This is weird, really weird.  But was done.

- this needs to be copied, because it is a pointer, so we end needing
  QAPI_CLONE(), yes, I know.
- it is really weird that I have to:

rsp_return = qdict_get_qdict(rsp, "return");
object = qdict_get(rsp_return, parameter);

iv = qobject_input_visitor_new(object);
visit_type_SocketAddress(iv, NULL, , _err);
result = g_strdup_printf("%s",
 SocketAddress_to_str("", saddr, false, false));
QDECREF(rsp);

  Remember, it is a *series* of strings in the ddict, we have code to
  _print_ qdicts, as info migrate works.  But I haven't found an easier
  way of getting from a qdict to an string than:
* parsing it
* convert it back to text
  sniff

> It doesn't cover non-socket based URIs, but that's fine, because for those
> the mgmt app already knows how the channel is setup. We just need the
> array of SocketAddress, because for socket URIs, the hostname, gets turned
> into an array of addresses, and the mgmt app can't discover them.

SocketAddress are a mess, there is not a single example that I can find
on how to use it on the whole tree.  I *think* that I did that right,
will see your comments on the next post.

Later, Juan.



Re: [Qemu-devel] [PATCH v4 4/4] migration: use the free page hint feature from balloon

2018-03-14 Thread Michael S. Tsirkin
On Wed, Mar 14, 2018 at 02:50:44PM +0800, Wei Wang wrote:
> On 03/14/2018 10:51 AM, Michael S. Tsirkin wrote:
> > On Wed, Mar 14, 2018 at 10:41:36AM +0800, Wei Wang wrote:
> > > On 03/14/2018 12:35 AM, Michael S. Tsirkin wrote:
> > > > On Wed, Mar 07, 2018 at 08:34:25PM +0800, Wei Wang wrote:
> > > > > Start the free page optimization after the migration bitmap is
> > > > > synchronized. This can't be used in the stop phase since the 
> > > > > guest
> > > > > is paused. Make sure the guest reporting has stopped before
> > > > > synchronizing the migration dirty bitmap. Currently, the optimization 
> > > > > is
> > > > > added to precopy only.
> > > > > 
> > > > > Signed-off-by: Wei Wang 
> > > > > CC: Dr. David Alan Gilbert 
> > > > > CC: Juan Quintela 
> > > > > CC: Michael S. Tsirkin 
> > > > > ---
> > > > >migration/ram.c | 19 ++-
> > > > >1 file changed, 18 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/migration/ram.c b/migration/ram.c
> > > > > index e172798..7b4c9b1 100644
> > > > > --- a/migration/ram.c
> > > > > +++ b/migration/ram.c
> > > > > @@ -51,6 +51,8 @@
> > > > >#include "qemu/rcu_queue.h"
> > > > >#include "migration/colo.h"
> > > > >#include "migration/block.h"
> > > > > +#include "sysemu/balloon.h"
> > > > > +#include "sysemu/sysemu.h"
> > > > >/***/
> > > > >/* ram save/restore */
> > > > > @@ -208,6 +210,8 @@ struct RAMState {
> > > > >uint32_t last_version;
> > > > >/* We are in the first round */
> > > > >bool ram_bulk_stage;
> > > > > +/* The free pages optimization feature is supported */
> > > > > +bool free_page_support;
> > > > >/* How many times we have dirty too many pages */
> > > > >int dirty_rate_high_cnt;
> > > > >/* these variables are used for bitmap sync */
> > > > > @@ -775,7 +779,7 @@ unsigned long 
> > > > > migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb,
> > > > >unsigned long *bitmap = rb->bmap;
> > > > >unsigned long next;
> > > > > -if (rs->ram_bulk_stage && start > 0) {
> > > > > +if (rs->ram_bulk_stage && start > 0 && !rs->free_page_support) {
> > > > >next = start + 1;
> > > > >} else {
> > > > >next = find_next_bit(bitmap, size, start);
> > > > > @@ -833,6 +837,10 @@ static void migration_bitmap_sync(RAMState *rs)
> > > > >int64_t end_time;
> > > > >uint64_t bytes_xfer_now;
> > > > > +if (rs->free_page_support) {
> > > > > +balloon_free_page_stop();
> > > > > +}
> > > > > +
> > > > >ram_counters.dirty_sync_count++;
> > > > >if (!rs->time_last_bitmap_sync) {
> > > > > @@ -899,6 +907,10 @@ static void migration_bitmap_sync(RAMState *rs)
> > > > >if (migrate_use_events()) {
> > > > >
> > > > > qapi_event_send_migration_pass(ram_counters.dirty_sync_count, NULL);
> > > > >}
> > > > > +
> > > > > +if (rs->free_page_support && runstate_is_running()) {
> > > > > +balloon_free_page_start();
> > > > > +}
> > > > >}
> > > > I think some of these conditions should go into
> > > > balloon_free_page_start/stop.
> > > > 
> > > > Checking runstate is generally problematic unless you
> > > > also handle run state change notifiers as it can
> > > > be manipulated from QMP.
> > > How about moving the check of runstate to
> > > virtio_balloon_poll_free_page_hints:
> > > 
> > > while (dev->free_page_report_status < FREE_PAGE_REPORT_S_STOP &&
> > > runstate_is_running()) {
> > > ...
> > > }
> > Hard to tell on the outset. E.g. why is just stop affected?  Pls add
> > comments explaining what happens if VM is not running when start or stop
> > is called.
> > 
> > 
> > > In this case, I think we won't need a notifier - if the run state is 
> > > changed
> > > by qmp, the optimization thread will just exist.
> > But you need to wake it up and notify the guest presumably?
> > 
> 
> 
> I think it's not necessary to wake it up, because when the VM is not
> running, there will be no hints reported to the vq, and the optimization
> thread exits. (there is no issue in that case)
> Probably we can add a notifier which calls virtio_balloon_free_page_stop()
> when qmp wakes up the VM.
> 
> Best,
> Wei

set_status callback is invoked so you can use that maybe.

Might be a good idea to handle a bunch of other corner
cases e.g. if guest driver is loaded when migration
is already in progress.

-- 
MST



[Qemu-devel] [PATCH for 2.12] iotests: Avoid realpath

2018-03-14 Thread Eric Blake
CentOS 6 lacks a realpath binary on the base install, which makes
all iotests runs fail:

001 - output mismatch (see 001.out.bad)
./check: line 815: realpath: command not found
diff: missing operand after `/home/dummy/qemu/tests/qemu-iotests/001.out'
diff: Try `diff --help' for more information.

Many of the uses of 'realpath' in the check script were being
used on the output of 'type -p' - but that is already an
absolute file name.  We don't care if the name is canonical,
merely that it was an executable file with an absolute path.

The remaining use was using realpath to convert a possibly
relative filename into an absolute one before calling diff,
but diff works just fine on the relative name.

Signed-off-by: Eric Blake 
---

This doesn't magically fix all iotests on CentOS 6, but lets
it get a lot further.  Bug fix, so safe after freeze.

 tests/qemu-iotests/check | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/tests/qemu-iotests/check b/tests/qemu-iotests/check
index 469142cd586..b719e9eee0c 100755
--- a/tests/qemu-iotests/check
+++ b/tests/qemu-iotests/check
@@ -92,7 +92,7 @@ set_prog_path()
 {
 p=`command -v $1 2> /dev/null`
 if [ -n "$p" -a -x "$p" ]; then
-realpath -- "$(type -p "$p")"
+type -p "$p"
 else
 return 1
 fi
@@ -554,7 +554,7 @@ then
 [ "$QEMU_PROG" = "" ] && _init_error "qemu not found"
 fi
 fi
-export QEMU_PROG=$(realpath -- "$(type -p "$QEMU_PROG")")
+export QEMU_PROG="$(type -p "$QEMU_PROG")"

 if [ -z "$QEMU_IMG_PROG" ]; then
 if [ -x "$build_iotests/qemu-img" ]; then
@@ -565,7 +565,7 @@ if [ -z "$QEMU_IMG_PROG" ]; then
 _init_error "qemu-img not found"
 fi
 fi
-export QEMU_IMG_PROG=$(realpath -- "$(type -p "$QEMU_IMG_PROG")")
+export QEMU_IMG_PROG="$(type -p "$QEMU_IMG_PROG")"

 if [ -z "$QEMU_IO_PROG" ]; then
 if [ -x "$build_iotests/qemu-io" ]; then
@@ -576,7 +576,7 @@ if [ -z "$QEMU_IO_PROG" ]; then
 _init_error "qemu-io not found"
 fi
 fi
-export QEMU_IO_PROG=$(realpath -- "$(type -p "$QEMU_IO_PROG")")
+export QEMU_IO_PROG="$(type -p "$QEMU_IO_PROG")"

 if [ -z $QEMU_NBD_PROG ]; then
 if [ -x "$build_iotests/qemu-nbd" ]; then
@@ -587,7 +587,7 @@ if [ -z $QEMU_NBD_PROG ]; then
 _init_error "qemu-nbd not found"
 fi
 fi
-export QEMU_NBD_PROG=$(realpath -- "$(type -p "$QEMU_NBD_PROG")")
+export QEMU_NBD_PROG="$(type -p "$QEMU_NBD_PROG")"

 if [ -z "$QEMU_VXHS_PROG" ]; then
   export QEMU_VXHS_PROG="`set_prog_path qnio_server`"
@@ -811,7 +811,7 @@ do
 else
 echo " - output mismatch (see $seq.out.bad)"
 mv $tmp.out $seq.out.bad
-$diff -w "$reference" $(realpath $seq.out.bad)
+$diff -w "$reference" $seq.out.bad
 err=true
 fi
 fi
-- 
2.14.3




Re: [Qemu-devel] [PULL 00/18] Linux user for 2.12 patches

2018-03-14 Thread no-reply
Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20180313173355.4468-1-laur...@vivier.eu
Subject: [Qemu-devel] [PULL 00/18] Linux user for 2.12 patches

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
failed=1
echo
fi
n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]   patchew/20180313173355.4468-1-laur...@vivier.eu -> 
patchew/20180313173355.4468-1-laur...@vivier.eu
 t [tag update]patchew/cover.1520952419.git.be...@igalia.com -> 
patchew/cover.1520952419.git.be...@igalia.com
Switched to a new branch 'test'
8632a757c2 linux-user: init_guest_space: Add a comment about search strategy
ba6b6b2c88 linux-user: init_guest_space: Don't try to align if we'll reject it
28c11cb5f4 linux-user: init_guest_space: Clean up control flow a bit
07bb6931a7 linux-user: init_guest_commpage: Add a comment about size check
861df567d3 linux-user: init_guest_space: Clarify page alignment logic
122cb68f59 linux-user: init_guest_space: Correctly handle guest_start in 
commpage initialization
5300fd1f33 linux-user: init_guest_space: Clean up if we can't initialize the 
commpage
e2363e1081 linux-user: Rename validate_guest_space => init_guest_commpage
8d0f3a270b linux-user: Use #if to only call validate_guest_space for 32-bit ARM 
target
f56dd5f653 qemu-binfmt-conf.sh: add qemu-xtensa
f15bd1b0d6 linux-user: drop unused target_msync function
aabcc316af linux-user: fix target_mprotect/target_munmap error return values
cf497a7694 linux-user: fix assertion in shmdt
b2cf1df32c linux-user: fix mmap/munmap/mprotect/mremap/shmat
5e5ec53930 linux-user: Support f_flags in statfs when available.
a80208de3f linux-user: allows to use "--systemd ALL" with qemu-binfmt-conf.sh
f0f44061ce linux-user: Remove the unused "not implemented" signal handling stubs
9b371941f0 linux-user: Drop unicore32 code

=== OUTPUT BEGIN ===
Checking PATCH 1/18: linux-user: Drop unicore32 code...
Checking PATCH 2/18: linux-user: Remove the unused "not implemented" signal 
handling stubs...
Checking PATCH 3/18: linux-user: allows to use "--systemd ALL" with 
qemu-binfmt-conf.sh...
Checking PATCH 4/18: linux-user: Support f_flags in statfs when available
ERROR: code indent should never use tabs
#57: FILE: linux-user/syscall_defs.h:2216:
+^Iint32_t^I^I^If_flags;$

ERROR: code indent should never use tabs
#58: FILE: linux-user/syscall_defs.h:2217:
+^Iint32_t^I^I^If_spare[5];$

ERROR: code indent should never use tabs
#67: FILE: linux-user/syscall_defs.h:2233:
+^Iabi_long^I^If_flags;$

ERROR: code indent should never use tabs
#68: FILE: linux-user/syscall_defs.h:2234:
+^Iabi_long^I^If_spare[5];$

ERROR: code indent should never use tabs
#77: FILE: linux-user/syscall_defs.h:2250:
+^Iuint32_t^If_flags;$

ERROR: code indent should never use tabs
#78: FILE: linux-user/syscall_defs.h:2251:
+^Iuint32_t^If_spare[5];$

ERROR: code indent should never use tabs
#87: FILE: linux-user/syscall_defs.h:2267:
+^Iabi_long f_flags;$

ERROR: code indent should never use tabs
#88: FILE: linux-user/syscall_defs.h:2268:
+^Iabi_long f_spare[4];$

ERROR: code indent should never use tabs
#97: FILE: linux-user/syscall_defs.h:2282:
+^Iabi_long f_flags;$

ERROR: code indent should never use tabs
#98: FILE: linux-user/syscall_defs.h:2283:
+^Iabi_long f_spare[4];$

ERROR: code indent should never use tabs
#128: FILE: linux-user/syscall_defs.h:2328:
+^Iuint32_t f_flags;$

ERROR: code indent should never use tabs
#129: FILE: linux-user/syscall_defs.h:2329:
+^Iuint32_t f_spare[4];$

ERROR: code indent should never use tabs
#138: FILE: linux-user/syscall_defs.h:2343:
+^Iuint32_t f_flags;$

ERROR: code indent should never use tabs
#139: FILE: linux-user/syscall_defs.h:2344:
+^Iuint32_t f_spare[4];$

total: 14 errors, 0 warnings, 112 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 5/18: linux-user: fix mmap/munmap/mprotect/mremap/shmat...
Checking PATCH 6/18: linux-user: fix assertion in shmdt...
Checking PATCH 7/18: linux-user: fix target_mprotect/target_munmap error return 
values...
Checking PATCH 8/18: linux-user: drop unused target_msync function...
Checking PATCH 9/18: qemu-binfmt-conf.sh: add qemu-xtensa...
WARNING: line over 80 characters
#38: FILE: scripts/qemu-binfmt-conf.sh:111:

[Qemu-devel] [PATCH] dump-guest-memory: more descriptive lookup_type failure

2018-03-14 Thread Andrew Jones
We've seen a few reports of

 (gdb) source /usr/share/qemu-kvm/dump-guest-memory.py
 Traceback (most recent call last):
   File "/usr/share/qemu-kvm/dump-guest-memory.py", line 19, in 
 UINTPTR_T = gdb.lookup_type("uintptr_t")
 gdb.error: No type named uintptr_t.

This occurs when symbols haven't been loaded first, i.e. neither a
QEMU binary was loaded nor a QEMU process was attached first. Let's
better inform the user of how to fix the issue themselves in order
to avoid more reports.

Signed-off-by: Andrew Jones 
---
 scripts/dump-guest-memory.py | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/scripts/dump-guest-memory.py b/scripts/dump-guest-memory.py
index 51acfcd0c053..e56fff6d7e82 100644
--- a/scripts/dump-guest-memory.py
+++ b/scripts/dump-guest-memory.py
@@ -16,7 +16,11 @@ the COPYING file in the top-level directory.
 import ctypes
 import struct
 
-UINTPTR_T = gdb.lookup_type("uintptr_t")
+try:
+UINTPTR_T = gdb.lookup_type("uintptr_t")
+except Exception as inst:
+raise gdb.GdbError("Symbols must be loaded prior to sourcing 
dump-guest-memory.\n"
+   "Symbols may be loaded by first 'attach'ing a QEMU 
process id or by 'load'ing a QEMU binary.")
 
 TARGET_PAGE_SIZE = 0x1000
 TARGET_PAGE_MASK = 0xF000
-- 
2.13.6




Re: [Qemu-devel] [virtio-dev] Re: [PATCH v4 3/4] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT

2018-03-14 Thread Michael S. Tsirkin
On Wed, Mar 14, 2018 at 02:03:19PM +0800, Wei Wang wrote:
> On 03/14/2018 10:53 AM, Michael S. Tsirkin wrote:
> > On Wed, Mar 14, 2018 at 10:43:01AM +0800, Wei Wang wrote:
> > > On 03/14/2018 12:49 AM, Michael S. Tsirkin wrote:
> > > > On Wed, Mar 07, 2018 at 08:34:24PM +0800, Wei Wang wrote:
> > > > 
> > > > > Signed-off-by: Wei Wang 
> > > > > Signed-off-by: Liang Li 
> > > > > CC: Michael S. Tsirkin 
> > > > > CC: Dr. David Alan Gilbert 
> > > > > CC: Juan Quintela 
> > > > I find it suspicious that neither unrealize nor reset
> > > > functions have been touched at all.
> > > > Are you sure you have thought through scenarious like
> > > > hot-unplug or disabling the device by guest?
> > > OK. I think we can call balloon_free_page_stop in unrealize and reset.
> > > 
> > > 
> > > > +static void *virtio_balloon_poll_free_page_hints(void *opaque)
> > > > +{
> > > > +VirtQueueElement *elem;
> > > > +VirtIOBalloon *dev = opaque;
> > > > +VirtQueue *vq = dev->free_page_vq;
> > > > +uint32_t id;
> > > > +size_t size;
> > > > What makes it safe to poke at this device from multiple threads?
> > > > I think that it would be safer to do it from e.g. BH.
> > > > 
> > > Actually the free_page_optimization thread is the only user of 
> > > free_page_vq,
> > > and there is only one optimization thread each time. Would this be safe
> > > enough?
> > > 
> > > Best,
> > > Wei
> > Aren't there other fields there? Also things like reset affect all VQs.
> > 
> 
> Yes. But I think BHs are used to avoid re-entrancy, which isn't the issue
> here.

Since you are adding locks to address the issue - doesn't this imply
reentrancy is exactly the issue?

> The potential issue I could observe here is that
> "dev->free_page_report_status" is read and written by the optimization
> thread, and it is also modified by the migration thread and reset via
> virtio_balloon_free_page_stop.
> 
> How about adding a QEMU SpinLock, like this:
> 
> virtio_balloon_poll_free_page_hints()
> {
> 
> while (1) {
> qemu_spin_lock();
> /* If the status has been changed to STOP or EXIT, or the VM is
> stopped, just exit */
> if (dev->free_page_report_status >= FREE_PAGE_REPORT_S_STOP ||
> !runstate_is_running()) {
> qemu_spin_unlock();
> break;
> }
> 
> qemu_spin_unlock();
> }
> }
> 
> 
> Best,
> Wei

That will address the issue but it does look weird.

-- 
MST



[Qemu-devel] [PATCH v5 2/3] target/arm: Add "_S" suffix to the secure version of a sysreg

2018-03-14 Thread Abdallah Bouassida
This is a preparation for the coming feature of creating dynamically an XML
description for the ARM sysregs.
Add "_S" suffix to the secure version of sysregs that have both S and NS views
Replace (S) and (NS) by _S and _NS for the register that are manually defined,
so all the registers follow the same convention.

Signed-off-by: Abdallah Bouassida 
Reviewed-by: Peter Maydell 
---
 target/arm/helper.c | 29 ++---
 1 file changed, 18 insertions(+), 11 deletions(-)

diff --git a/target/arm/helper.c b/target/arm/helper.c
index db8c925..1360a14 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -695,12 +695,12 @@ static const ARMCPRegInfo cp_reginfo[] = {
  * the secure register to be properly reset and migrated. There is also no
  * v8 EL1 version of the register so the non-secure instance stands alone.
  */
-{ .name = "FCSEIDR(NS)",
+{ .name = "FCSEIDR",
   .cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 0,
   .access = PL1_RW, .secure = ARM_CP_SECSTATE_NS,
   .fieldoffset = offsetof(CPUARMState, cp15.fcseidr_ns),
   .resetvalue = 0, .writefn = fcse_write, .raw_writefn = raw_write, },
-{ .name = "FCSEIDR(S)",
+{ .name = "FCSEIDR_S",
   .cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 0,
   .access = PL1_RW, .secure = ARM_CP_SECSTATE_S,
   .fieldoffset = offsetof(CPUARMState, cp15.fcseidr_s),
@@ -716,7 +716,7 @@ static const ARMCPRegInfo cp_reginfo[] = {
   .access = PL1_RW, .secure = ARM_CP_SECSTATE_NS,
   .fieldoffset = offsetof(CPUARMState, cp15.contextidr_el[1]),
   .resetvalue = 0, .writefn = contextidr_write, .raw_writefn = raw_write, 
},
-{ .name = "CONTEXTIDR(S)", .state = ARM_CP_STATE_AA32,
+{ .name = "CONTEXTIDR_S", .state = ARM_CP_STATE_AA32,
   .cp = 15, .opc1 = 0, .crn = 13, .crm = 0, .opc2 = 1,
   .access = PL1_RW, .secure = ARM_CP_SECSTATE_S,
   .fieldoffset = offsetof(CPUARMState, cp15.contextidr_s),
@@ -1967,7 +1967,7 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
cp15.c14_timer[GTIMER_PHYS].ctl),
   .writefn = gt_phys_ctl_write, .raw_writefn = raw_write,
 },
-{ .name = "CNTP_CTL(S)",
+{ .name = "CNTP_CTL_S",
   .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 1,
   .secure = ARM_CP_SECSTATE_S,
   .type = ARM_CP_IO | ARM_CP_ALIAS, .access = PL1_RW | PL0_R,
@@ -2006,7 +2006,7 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
   .accessfn = gt_ptimer_access,
   .readfn = gt_phys_tval_read, .writefn = gt_phys_tval_write,
 },
-{ .name = "CNTP_TVAL(S)",
+{ .name = "CNTP_TVAL_S",
   .cp = 15, .crn = 14, .crm = 2, .opc1 = 0, .opc2 = 0,
   .secure = ARM_CP_SECSTATE_S,
   .type = ARM_CP_NO_RAW | ARM_CP_IO, .access = PL1_RW | PL0_R,
@@ -2060,7 +2060,7 @@ static const ARMCPRegInfo generic_timer_cp_reginfo[] = {
   .accessfn = gt_ptimer_access,
   .writefn = gt_phys_cval_write, .raw_writefn = raw_write,
 },
-{ .name = "CNTP_CVAL(S)", .cp = 15, .crm = 14, .opc1 = 2,
+{ .name = "CNTP_CVAL_S", .cp = 15, .crm = 14, .opc1 = 2,
   .secure = ARM_CP_SECSTATE_S,
   .access = PL1_RW | PL0_R,
   .type = ARM_CP_64BIT | ARM_CP_IO | ARM_CP_ALIAS,
@@ -5563,7 +5563,8 @@ CpuDefinitionInfoList *arch_query_cpu_definitions(Error 
**errp)
 
 static void add_cpreg_to_hashtable(ARMCPU *cpu, const ARMCPRegInfo *r,
void *opaque, int state, int secstate,
-   int crm, int opc1, int opc2)
+   int crm, int opc1, int opc2,
+   const char *name)
 {
 /* Private utility function for define_one_arm_cp_reg_with_opaque():
  * add a single reginfo struct to the hash table.
@@ -5573,6 +5574,7 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const 
ARMCPRegInfo *r,
 int is64 = (r->type & ARM_CP_64BIT) ? 1 : 0;
 int ns = (secstate & ARM_CP_SECSTATE_NS) ? 1 : 0;
 
+r2->name = g_strdup(name);
 /* Reset the secure state to the specific incoming state.  This is
  * necessary as the register may have been defined with both states.
  */
@@ -5804,19 +5806,24 @@ void define_one_arm_cp_reg_with_opaque(ARMCPU *cpu,
 /* Under AArch32 CP registers can be common
  * (same for secure and non-secure world) or banked.
  */
+char *name;
+
 switch (r->secure) {
 case ARM_CP_SECSTATE_S:
 case ARM_CP_SECSTATE_NS:
 add_cpreg_to_hashtable(cpu, r, opaque, state,
-   r->secure, crm, opc1, opc2);
+   r->secure, crm, opc1, opc2,
+   

[Qemu-devel] [PATCH v5 3/3] target/arm: Add the XML dynamic generation

2018-03-14 Thread Abdallah Bouassida
Generate an XML description for the cp-regs.
Register these regs with the gdb_register_coprocessor().
Add arm_gdb_get_sysreg() to use it as a callback to read those regs.
Add a dummy arm_gdb_set_sysreg().

Signed-off-by: Abdallah Bouassida 
---
 gdbstub.c| 10 
 include/qom/cpu.h|  5 +++-
 target/arm/cpu.c |  1 +
 target/arm/cpu.h | 26 +++
 target/arm/gdbstub.c | 71 
 target/arm/helper.c  | 27 
 6 files changed, 139 insertions(+), 1 deletion(-)

diff --git a/gdbstub.c b/gdbstub.c
index f1d5148..09065bc 100644
--- a/gdbstub.c
+++ b/gdbstub.c
@@ -674,6 +674,16 @@ static const char *get_feature_xml(const char *p, const 
char **newp,
 }
 return target_xml;
 }
+if (cc->gdb_get_dynamic_xml) {
+CPUState *cpu = first_cpu;
+char *xmlname = g_strndup(p, len);
+const char *xml = cc->gdb_get_dynamic_xml(cpu, xmlname);
+
+free(xmlname);
+if (xml) {
+return xml;
+}
+}
 for (i = 0; ; i++) {
 name = xml_builtin[i][0];
 if (!name || (strncmp(name, p, len) == 0 && strlen(name) == len))
diff --git a/include/qom/cpu.h b/include/qom/cpu.h
index dc6d495..be6d84d 100644
--- a/include/qom/cpu.h
+++ b/include/qom/cpu.h
@@ -132,6 +132,9 @@ struct TranslationBlock;
  *   before the insn which triggers a watchpoint rather than after it.
  * @gdb_arch_name: Optional callback that returns the architecture name known
  * to GDB. The caller must free the returned string with g_free.
+ * @gdb_get_dynamic_xml: Callback to return dynamically generated XML for the
+ *   gdb stub. Returns a pointer to the XML contents for the specified XML file
+ *   or NULL if the CPU doesn't have a dynamically generated content for it.
  * @cpu_exec_enter: Callback for cpu_exec preparation.
  * @cpu_exec_exit: Callback for cpu_exec cleanup.
  * @cpu_exec_interrupt: Callback for processing interrupts in cpu_exec.
@@ -198,7 +201,7 @@ typedef struct CPUClass {
 const struct VMStateDescription *vmsd;
 const char *gdb_core_xml_file;
 gchar * (*gdb_arch_name)(CPUState *cpu);
-
+const char *(*gdb_get_dynamic_xml)(CPUState *cpu, const char *xmlname);
 void (*cpu_exec_enter)(CPUState *cpu);
 void (*cpu_exec_exit)(CPUState *cpu);
 bool (*cpu_exec_interrupt)(CPUState *cpu, int interrupt_request);
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 022d8c5..38b8b1c 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1879,6 +1879,7 @@ static void arm_cpu_class_init(ObjectClass *oc, void 
*data)
 cc->gdb_num_core_regs = 26;
 cc->gdb_core_xml_file = "arm-core.xml";
 cc->gdb_arch_name = arm_gdb_arch_name;
+cc->gdb_get_dynamic_xml = arm_gdb_get_dynamic_xml;
 cc->gdb_stop_before_watchpoint = true;
 cc->debug_excp_handler = arm_debug_excp_handler;
 cc->debug_check_watchpoint = arm_debug_check_watchpoint;
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 5a6ea24..5664f58 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -133,6 +133,19 @@ enum {
s<2n+1> maps to the most significant half of d
  */
 
+/**
+ * DynamicGDBXMLInfo:
+ * @desc: Contains the XML descriptions.
+ * @num_cpregs: Number of the Coprocessor registers seen by GDB.
+ * @cpregs_keys: Array that contains the corresponding Key of
+ * a given cpreg with the same order of the cpreg in the XML description.
+ */
+typedef struct DynamicGDBXMLInfo {
+char *desc;
+int num_cpregs;
+uint32_t *cpregs_keys;
+} DynamicGDBXMLInfo;
+
 /* CPU state for each instance of a generic timer (in cp15 c14) */
 typedef struct ARMGenericTimer {
 uint64_t cval; /* Timer CompareValue register */
@@ -682,6 +695,8 @@ struct ARMCPU {
 uint64_t *cpreg_vmstate_values;
 int32_t cpreg_vmstate_array_len;
 
+DynamicGDBXMLInfo dyn_xml;
+
 /* Timers used by the generic (architected) timer */
 QEMUTimer *gt_timer[NUM_GTIMERS];
 /* GPIO outputs for generic timer */
@@ -863,6 +878,17 @@ hwaddr arm_cpu_get_phys_page_attrs_debug(CPUState *cpu, 
vaddr addr,
 int arm_cpu_gdb_read_register(CPUState *cpu, uint8_t *buf, int reg);
 int arm_cpu_gdb_write_register(CPUState *cpu, uint8_t *buf, int reg);
 
+/* Dynamically generates for gdb stub an XML description of the sysregs from
+ * the cp_regs hashtable. Returns the registered sysregs number.
+ */
+int arm_gen_dynamic_xml(CPUState *cpu);
+
+/* Returns the dynamically generated XML for the gdb stub.
+ * Returns a pointer to the XML contents for the specified XML file or NULL
+ * if the XML name doesn't match the predefined one.
+ */
+const char *arm_gdb_get_dynamic_xml(CPUState *cpu, const char *xmlname);
+
 int arm_cpu_write_elf64_note(WriteCoreDumpFunction f, CPUState *cs,
  int cpuid, void *opaque);
 int arm_cpu_write_elf32_note(WriteCoreDumpFunction f, CPUState *cs,
diff --git 

[Qemu-devel] [PATCH v5 0/3] target/arm: Add a dynamic XML-description of the cp-registers to GDB

2018-03-14 Thread Abdallah Bouassida
The previous version:
http://patchwork.ozlabs.org/project/qemu-devel/list/?series=33190

Abdallah Bouassida (3):
  target/arm: Add "ARM_CP_NO_GDB" as a new bit field for ARMCPRegInfo
type
  target/arm: Add "_S" suffix to the secure version of a sysreg
  target/arm: Add the XML dynamic generation

 gdbstub.c| 10 
 include/qom/cpu.h|  5 +++-
 target/arm/cpu.c |  1 +
 target/arm/cpu.h | 29 -
 target/arm/gdbstub.c | 71 
 target/arm/helper.c  | 58 +-
 6 files changed, 160 insertions(+), 14 deletions(-)

-- 
2.7.4




[Qemu-devel] [PATCH v5 1/3] target/arm: Add "ARM_CP_NO_GDB" as a new bit field for ARMCPRegInfo type

2018-03-14 Thread Abdallah Bouassida
This is a preparation for the coming feature of creating dynamically an XML
description for the ARM sysregs.
A register has ARM_CP_NO_GDB enabled will not be shown in the dynamic XML.
This bit is enabled automatically when creating CP_ANY wildcard aliases.
This bit could be enabled manually for any register we want to remove from the
dynamic XML description.

Signed-off-by: Abdallah Bouassida 
Reviewed-by: Peter Maydell 
---
 target/arm/cpu.h| 3 ++-
 target/arm/helper.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index 1e7e1f8..5a6ea24 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1815,10 +1815,11 @@ static inline uint64_t cpreg_to_kvm_id(uint32_t cpregid)
 #define ARM_LAST_SPECIAL ARM_CP_DC_ZVA
 #define ARM_CP_FPU   0x1000
 #define ARM_CP_SVE   0x2000
+#define ARM_CP_NO_GDB0x4000
 /* Used only as a terminator for ARMCPRegInfo lists */
 #define ARM_CP_SENTINEL  0x
 /* Mask of only the flag bits in a type field */
-#define ARM_CP_FLAG_MASK 0x30ff
+#define ARM_CP_FLAG_MASK 0x70ff
 
 /* Valid values for ARMCPRegInfo state field, indicating which of
  * the AArch32 and AArch64 execution states this register is visible in.
diff --git a/target/arm/helper.c b/target/arm/helper.c
index 09893e3..db8c925 100644
--- a/target/arm/helper.c
+++ b/target/arm/helper.c
@@ -5664,7 +5664,7 @@ static void add_cpreg_to_hashtable(ARMCPU *cpu, const 
ARMCPRegInfo *r,
 if (((r->crm == CP_ANY) && crm != 0) ||
 ((r->opc1 == CP_ANY) && opc1 != 0) ||
 ((r->opc2 == CP_ANY) && opc2 != 0)) {
-r2->type |= ARM_CP_ALIAS;
+r2->type |= ARM_CP_ALIAS | ARM_CP_NO_GDB;
 }
 
 /* Check that raw accesses are either forbidden or handled. Note that
-- 
2.7.4




Re: [Qemu-devel] Suggestion on 'virtio-pmem' implementation

2018-03-14 Thread David Hildenbrand
Hi Pankaj,

I have a prototype (new one for virtio-mem I was working on over the last 
weeks) for exactly what you need. I basically factored out the notion of a 
memory device. So also virtio devices can be memory devices and get recognized 
e.g. in formerly known pc_dimm_get_free_address(), so it works out nicely with 
ordinary memory hotplug and such. 

Guess your main problem right now is that you don‘t create a memory slot 
properly. But I also have code for that that you can built onto.

I‘m right now in Sri Lanka on vacation, I‘ll be back on 23. September and will 
send you the link to a branch with the prototype asap.

Thanks,

David / dhildenb / da...@redhat.com

Von meinem iPhone gesendet

> Am 14.03.2018 um 10:36 schrieb Pankaj Gupta :
> 
> 
> 
> Hi,
> 
> 
> I am implementing 'virtio-pmem' as a mechanism to
> flush guest writes with 'fake DAX' flushing interface.
> 
> Below is the high level details of components:
> 
> 1] 'virtio-pmem' device expose guest physical address
>  details(start, len).   
> 
> 2] 'virtio-pmem' driver in guest discovers this 
>information and configures 'libnvdimm'. Guest 'pmem' 
>driver works on this memory range.
> 
> 3] Guest 'pmem' driver uses 'virtio-pmem' PV driver to 
>   send flush commands.
> 
> 
> I need suggestion implementing part 1]
> 
> * When tried with 'hotplug_memory.base' address as guest physical 
>  address, I am facing 'EPT_MISCONFIG' errors when pmem does mkfs. 
>  After digging more it looks like address range I am using as guest
>  physical address is either already mapped as MMIO or reserved. 
>  Though Guest hot-plugs this physical address into its virtual 
>  memory range when guest tries to read/write the memory KVM cannot 
>  translate the address and throw 'EPT_MISCONFIG' error.
> 
> * While I am trying to get the appropriate guest physical address
>  which is free, I could see memory 'pc_dimm_memory_plug' code 
>  has a function 'pc_dimm_get_free_addr' which works with 'PC DIMM'
>  class. As I am using 'VIRTIO', there is no way AFAIK this function 
>  can be used by VIRTIO or my PV device code.
> 
> 
> I need ideas to get the free guest physical address from my PV 
> device code so that we can use this range in guest address space.
> 
> Find below pointer to previous discussion:
> 
> https://marc.info/?l=kvm=151629709903946=2
> 
> Thanks,
> Pankaj  
> 


Re: [Qemu-devel] [PATCH v5 0/2] qmp: 'wakeup-suspend-support' in query-target

2018-03-14 Thread Daniel Henrique Barboza

Ping

On 02/19/2018 11:12 AM, Daniel Henrique Barboza wrote:

v5:
- removed a paragraph in the recently added qemu_register_wakeup_notifier
comment that was added. That paragraph was adding too much in-depth
information about the current design of the system_wakeup, making it
harder to understand the whole point (suggested by Mike Roth).
- previous version link:
https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg00879.html

v4:
- added a comment in 'qemu_register_wakeup_notifier' about the effects
of adding a wakeup notifier without proper suspend/wakeup support in the
logic of the new wakeup-suspend-support flag, as suggested by Mike Roth
- previous version link:
https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg00358.html

v3:
- added a "(since 2.12)" notation in the new flag, as suggested by
Eric Blake
- added a "backwards compatible" note in the commit msg
- previous version link:
https://lists.gnu.org/archive/html/qemu-devel/2018-01/msg00093.html

v2:
- changed the approach based on v1 discussions: instead of a new API, add
the required flag in QMP query-target
- dropped patch 2 since query-target does not have an HMP counterpart
- previous version link:
https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg00889.html


Daniel Henrique Barboza (2):
   qmp: adding 'wakeup-suspend-support' in query-target
   qga: update guest-suspend-ram and guest-suspend-hybrid descriptions

  arch_init.c |  1 +
  include/sysemu/sysemu.h |  1 +
  qapi-schema.json|  4 +++-
  qga/qapi-schema.json| 14 ++
  vl.c| 21 +
  5 files changed, 36 insertions(+), 5 deletions(-)





Re: [Qemu-devel] [PATCH] i386: Disable Intel PT if packets IP payloads have LIP values

2018-03-14 Thread Eduardo Habkost
On Wed, Mar 14, 2018 at 03:26:31AM +0800, Luwei Kang wrote:
> Intel processor trace should be disabled when
> CPUID.(EAX=14H,ECX=0H).ECX.[bit31] is set.
> Generated packets which contain IP payloads will have LIP
> values when this bit is set, or IP payloads will have RIP
> values.
> Currently, The information of CPUID 14H is constant to make
> live migration safty and this bit is always 0 in guest even
> if host support LIP values.
> Guest sees the bit is 0 will expect IP payloads with RIP
> values, but the host CPU will generate IP payloads with
> LIP values if this bit is set in HW.
> To make sure the value of IP payloads correctly, Intel PT
> should be disabled when bit[31] is set.
> 
> Signed-off-by: Luwei Kang 

Queued, thanks.

-- 
Eduardo



Re: [Qemu-devel] [PATCH] i386: add KNM cpu model

2018-03-14 Thread Daniel P . Berrangé
On Wed, Mar 14, 2018 at 03:29:59PM +0800, Boqun Feng wrote:
> A new cpu model called "KNM" is added to model Knights Mill processors.

Why the obscure acryonym? Can't we just call it KnightsMill so it is
obvious what it is to everyone, as we've done for all other Intel
CPU model names in the past.

> Compared to "Skylake-Server" cpu model, the following features are
> added:
> 
>   avx512_4vnniw avx512_4fmaps avx512pf avx512er avx512_vpopcntdq
> 
> and the following features are removed:
> 
>   pcid invpcid clflushopt avx512dq avx512bw axv512cd clwb smap rtm
>   mpx xsavec xgetbv1 hle

"pcid" was one of the features critical to mitigate performance
downside of the Meltdown bug fix in guests. Will Knights Mill have
a fix for Meltdown in hardware, to avoid the need for split table
pages and thus avoid the perf hit in guests ?

> 
> Signed-off-by: Boqun Feng 
> ---
>  target/i386/cpu.c | 42 ++
>  1 file changed, 42 insertions(+)
> 
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 2c04645ceac9..215a9ee6026a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -1795,6 +1795,48 @@ static X86CPUDefinition builtin_x86_defs[] = {
>  .xlevel = 0x8008,
>  .model_id = "Intel Xeon Processor (Skylake, IBRS)",
>  },
> +{
> +.name = "KNM",
> +.level = 0xd,
> +.vendor = CPUID_VENDOR_INTEL,
> +.family = 6,
> +.model = 133,
> +.stepping = 0,
> +.features[FEAT_1_EDX] =
> +CPUID_VME | CPUID_SS | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR |
> +CPUID_MMX | CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV 
> |
> +CPUID_MCA | CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC |
> +CPUID_CX8 | CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC |
> +CPUID_PSE | CPUID_DE | CPUID_FP87,
> +.features[FEAT_1_ECX] =
> +CPUID_EXT_AVX | CPUID_EXT_XSAVE | CPUID_EXT_AES |
> +CPUID_EXT_POPCNT | CPUID_EXT_X2APIC | CPUID_EXT_SSE42 |
> +CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
> +CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 |
> +CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_FMA | CPUID_EXT_MOVBE |
> +CPUID_EXT_F16C | CPUID_EXT_RDRAND,
> +.features[FEAT_8000_0001_EDX] =
> +CPUID_EXT2_LM | CPUID_EXT2_PDPE1GB | CPUID_EXT2_RDTSCP |
> +CPUID_EXT2_NX | CPUID_EXT2_SYSCALL,
> +.features[FEAT_8000_0001_ECX] =
> +CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM | CPUID_EXT3_3DNOWPREFETCH,
> +.features[FEAT_7_0_EBX] =
> +CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 
> |
> +CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS |
> +CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_AVX512F 
> |
> +CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_AVX512PF |
> +CPUID_7_0_EBX_AVX512ER,
> +.features[FEAT_7_0_ECX] =
> +CPUID_7_0_ECX_AVX512_VPOPCNTDQ,
> +.features[FEAT_7_0_EDX] =
> +CPUID_7_0_EDX_AVX512_4VNNIW | CPUID_7_0_EDX_AVX512_4FMAPS,
> +.features[FEAT_XSAVE] =
> +CPUID_XSAVE_XSAVEOPT,
> +.features[FEAT_6_EAX] =
> +CPUID_6_EAX_ARAT,
> +.xlevel = 0x8008,
> +.model_id = "Intel Xeon Phi Processor (Knights Mill)",
> +},
>  {
>  .name = "Opteron_G1",
>  .level = 5,
> -- 
> 2.16.1
> 
> 

Regards,
Daniel
-- 
|: https://berrange.com  -o-https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o-https://fstop138.berrange.com :|
|: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|



[Qemu-devel] [PATCH] i386: add KNM cpu model

2018-03-14 Thread Boqun Feng
A new cpu model called "KNM" is added to model Knights Mill processors.
Compared to "Skylake-Server" cpu model, the following features are
added:

avx512_4vnniw avx512_4fmaps avx512pf avx512er avx512_vpopcntdq

and the following features are removed:

pcid invpcid clflushopt avx512dq avx512bw axv512cd clwb smap rtm
mpx xsavec xgetbv1 hle

Signed-off-by: Boqun Feng 
---
 target/i386/cpu.c | 42 ++
 1 file changed, 42 insertions(+)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 2c04645ceac9..215a9ee6026a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -1795,6 +1795,48 @@ static X86CPUDefinition builtin_x86_defs[] = {
 .xlevel = 0x8008,
 .model_id = "Intel Xeon Processor (Skylake, IBRS)",
 },
+{
+.name = "KNM",
+.level = 0xd,
+.vendor = CPUID_VENDOR_INTEL,
+.family = 6,
+.model = 133,
+.stepping = 0,
+.features[FEAT_1_EDX] =
+CPUID_VME | CPUID_SS | CPUID_SSE2 | CPUID_SSE | CPUID_FXSR |
+CPUID_MMX | CPUID_CLFLUSH | CPUID_PSE36 | CPUID_PAT | CPUID_CMOV |
+CPUID_MCA | CPUID_PGE | CPUID_MTRR | CPUID_SEP | CPUID_APIC |
+CPUID_CX8 | CPUID_MCE | CPUID_PAE | CPUID_MSR | CPUID_TSC |
+CPUID_PSE | CPUID_DE | CPUID_FP87,
+.features[FEAT_1_ECX] =
+CPUID_EXT_AVX | CPUID_EXT_XSAVE | CPUID_EXT_AES |
+CPUID_EXT_POPCNT | CPUID_EXT_X2APIC | CPUID_EXT_SSE42 |
+CPUID_EXT_SSE41 | CPUID_EXT_CX16 | CPUID_EXT_SSSE3 |
+CPUID_EXT_PCLMULQDQ | CPUID_EXT_SSE3 |
+CPUID_EXT_TSC_DEADLINE_TIMER | CPUID_EXT_FMA | CPUID_EXT_MOVBE |
+CPUID_EXT_F16C | CPUID_EXT_RDRAND,
+.features[FEAT_8000_0001_EDX] =
+CPUID_EXT2_LM | CPUID_EXT2_PDPE1GB | CPUID_EXT2_RDTSCP |
+CPUID_EXT2_NX | CPUID_EXT2_SYSCALL,
+.features[FEAT_8000_0001_ECX] =
+CPUID_EXT3_ABM | CPUID_EXT3_LAHF_LM | CPUID_EXT3_3DNOWPREFETCH,
+.features[FEAT_7_0_EBX] =
+CPUID_7_0_EBX_FSGSBASE | CPUID_7_0_EBX_BMI1 | CPUID_7_0_EBX_AVX2 |
+CPUID_7_0_EBX_SMEP | CPUID_7_0_EBX_BMI2 | CPUID_7_0_EBX_ERMS |
+CPUID_7_0_EBX_RDSEED | CPUID_7_0_EBX_ADX | CPUID_7_0_EBX_AVX512F |
+CPUID_7_0_EBX_AVX512CD | CPUID_7_0_EBX_AVX512PF |
+CPUID_7_0_EBX_AVX512ER,
+.features[FEAT_7_0_ECX] =
+CPUID_7_0_ECX_AVX512_VPOPCNTDQ,
+.features[FEAT_7_0_EDX] =
+CPUID_7_0_EDX_AVX512_4VNNIW | CPUID_7_0_EDX_AVX512_4FMAPS,
+.features[FEAT_XSAVE] =
+CPUID_XSAVE_XSAVEOPT,
+.features[FEAT_6_EAX] =
+CPUID_6_EAX_ARAT,
+.xlevel = 0x8008,
+.model_id = "Intel Xeon Phi Processor (Knights Mill)",
+},
 {
 .name = "Opteron_G1",
 .level = 5,
-- 
2.16.1




Re: [Qemu-devel] [PULL 00/36] QAPI patches for 2018-03-12, 2.12 softfreeze

2018-03-14 Thread Peter Maydell
On 13 March 2018 at 21:55, Eric Blake  wrote:
> On 03/13/2018 09:17 AM, Eric Blake wrote:

 This builds and passes 'make check', so even though the OOB portion
 depends on chardev fixes that are still pending a pull request from
 Paolo, that dependence can only be observed at runtime by clients
 that use the new oob feature.  Given the timing of soft freeze, and
 the fact that the chardev fixes do not form a build dependency, I
 think it's okay if this pull request gets processed before Paolo's
 (but it's also okay if Paolo's goes in first).
>>
>>
>> Based on the testsuite failures, it looks like Paolo's pull request with
>> chardev fixes DOES have to go in first.
>
>
> Nearing the end of my workday; I'm not sure what the status is on Paolo's
> pull request with the chardev fixes, but hope that it doesn't invalidate
> this pull request from still being considered as soft freeze material.

The freeze approach is that if you get v1 of the pull on the list
before the date, then the purpose of the following week before rc0
is to get the stragglers in and so that we can do re-spins if there
were minor issues with anything. So it's ok.

thanks
-- PMM



[Qemu-devel] [PATCH v6 28/29] libvhost-user: Claim support for postcopy

2018-03-14 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Tell QEMU we understand the protocol features needed for postcopy.

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Marc-André Lureau 
---
 contrib/libvhost-user/libvhost-user.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/contrib/libvhost-user/libvhost-user.c 
b/contrib/libvhost-user/libvhost-user.c
index 504ff5ea59..beeed0c43f 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -185,6 +185,35 @@ vmsg_close_fds(VhostUserMsg *vmsg)
 }
 }
 
+/* A test to see if we have userfault available */
+static bool
+have_userfault(void)
+{
+#if defined(__linux__) && defined(__NR_userfaultfd) &&\
+defined(UFFD_FEATURE_MISSING_SHMEM) &&\
+defined(UFFD_FEATURE_MISSING_HUGETLBFS)
+/* Now test the kernel we're running on really has the features */
+int ufd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
+struct uffdio_api api_struct;
+if (ufd < 0) {
+return false;
+}
+
+api_struct.api = UFFD_API;
+api_struct.features = UFFD_FEATURE_MISSING_SHMEM |
+  UFFD_FEATURE_MISSING_HUGETLBFS;
+if (ioctl(ufd, UFFDIO_API, _struct)) {
+close(ufd);
+return false;
+}
+close(ufd);
+return true;
+
+#else
+return false;
+#endif
+}
+
 static bool
 vu_message_read(VuDev *dev, int conn_fd, VhostUserMsg *vmsg)
 {
@@ -939,6 +968,10 @@ vu_get_protocol_features_exec(VuDev *dev, VhostUserMsg 
*vmsg)
 uint64_t features = 1ULL << VHOST_USER_PROTOCOL_F_LOG_SHMFD |
 1ULL << VHOST_USER_PROTOCOL_F_SLAVE_REQ;
 
+if (have_userfault()) {
+features |= 1ULL << VHOST_USER_PROTOCOL_F_PAGEFAULT;
+}
+
 if (dev->iface->get_protocol_features) {
 features |= dev->iface->get_protocol_features(dev);
 }
-- 
2.14.3




[Qemu-devel] [PATCH RFC] configure: shorthand for only enabling native softmmu target

2018-03-14 Thread Daniel P . Berrangé
With the huge number of QEMU targets, a default configuration will take
a very long time to rebuild. When developing most code changes, it is
sufficient to test compilation with a single target - rebuilding all
targets just extends compile times while not detecting any new problems.

Developers will often thus specify a single target for configure,
commonly matching the host architecture. eg

  ./configure --target-list=x86_64-softmmu

This works fine, but is a bit of a verbose thing to type out everytime
configure is invoked. There are already short-hand args to disable all
user targets, all softmmu targets, or all tcg targets. This adds one
further shorthand to disable all non-native architecture targets.

  ./configure --native

Signed-off-by: Daniel P. Berrangé 
---

Suggestions welcomed for better names than --native, but bear in mind
the goal is to minimise amount of typing so nothing too verbose, hence
why I didn't do something like --disable-non-native ...

 configure | 24 
 1 file changed, 24 insertions(+)

diff --git a/configure b/configure
index af72fc852e..807af93116 100755
--- a/configure
+++ b/configure
@@ -233,6 +233,22 @@ supported_whpx_target() {
 return 1
 }
 
+supported_native_target() {
+glob "$1" "*-softmmu" || return 1
+case "${1%-softmmu}:$cpu" in
+arm:arm | aarch64:aarch64 | \
+i386:i386 | i386:x32 | \
+x86_64:x86_64 | \
+mips:mips | mipsel:mips | \
+ppc:ppc | ppcemb:ppc | \
+ppc64:ppc64 | \
+s390x:s390x)
+return 0
+;;
+esac
+return 1
+}
+
 supported_target() {
 case "$1" in
 *-softmmu)
@@ -254,6 +270,10 @@ supported_target() {
 return 1
 ;;
 esac
+if test "$native" = "yes"
+then
+   supported_native_target "$1" || return 1
+fi
 test "$tcg" = "yes" && return 0
 supported_kvm_target "$1" && return 0
 supported_xen_target "$1" && return 0
@@ -390,6 +410,7 @@ cocoa="no"
 softmmu="yes"
 linux_user="no"
 bsd_user="no"
+native="no"
 blobs="yes"
 pkgversion=""
 pie=""
@@ -1112,6 +1133,8 @@ for opt do
   cocoa="yes" ;
   audio_drv_list="coreaudio $(echo $audio_drv_list | sed s,coreaudio,,g)"
   ;;
+  --native) native="yes"
+  ;;
   --disable-system) softmmu="no"
   ;;
   --enable-system) softmmu="yes"
@@ -1540,6 +1563,7 @@ Advanced options (experts only):
xen pv domain builder
   --enable-debug-stack-usage
track the maximum stack usage of stacks created by 
qemu_alloc_stack
+  --native only enable the softmmu target matching host 
architecture
 
 Optional features, enabled with --enable-FEATURE and
 disabled with --disable-FEATURE, default is enabled if available:
-- 
2.14.3




[Qemu-devel] [PATCH v6 26/29] vhost: Huge page align and merge

2018-03-14 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Align RAMBlocks to page size alignment, and adjust the merging code
to deal with partial overlap due to that alignment.

This is needed for postcopy so that we can place/fetch whole hugepages
when under userfault.

Signed-off-by: Dr. David Alan Gilbert 
---
 hw/virtio/trace-events |  3 ++-
 hw/virtio/vhost.c  | 66 ++
 2 files changed, 58 insertions(+), 11 deletions(-)

diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events
index 857c495e65..1422ff03ab 100644
--- a/hw/virtio/trace-events
+++ b/hw/virtio/trace-events
@@ -3,7 +3,8 @@
 # hw/virtio/vhost.c
 vhost_commit(bool started, bool changed) "Started: %d Changed: %d"
 vhost_region_add_section(const char *name, uint64_t gpa, uint64_t size, 
uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
-vhost_region_add_section_abut(const char *name, uint64_t new_size) "%s: 
0x%"PRIx64
+vhost_region_add_section_merge(const char *name, uint64_t new_size, uint64_t 
gpa, uint64_t owr) "%s: size: 0x%"PRIx64 " gpa: 0x%"PRIx64 " owr: 0x%"PRIx64
+vhost_region_add_section_aligned(const char *name, uint64_t gpa, uint64_t 
size, uint64_t host) "%s: 0x%"PRIx64"+0x%"PRIx64" @ 0x%"PRIx64
 vhost_section(const char *name, int r) "%s:%d"
 
 # hw/virtio/vhost-user.c
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 17262d2189..5bd4081683 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -521,10 +521,28 @@ static void vhost_region_add_section(struct vhost_dev 
*dev,
 uint64_t mrs_gpa = section->offset_within_address_space;
 uintptr_t mrs_host = (uintptr_t)memory_region_get_ram_ptr(section->mr) +
  section->offset_within_region;
+RAMBlock *mrs_rb = section->mr->ram_block;
+size_t mrs_page = qemu_ram_pagesize(mrs_rb);
 
 trace_vhost_region_add_section(section->mr->name, mrs_gpa, mrs_size,
mrs_host);
 
+/* Round the section to it's page size */
+/* First align the start down to a page boundary */
+uint64_t alignage = mrs_host & (mrs_page - 1);
+if (alignage) {
+mrs_host -= alignage;
+mrs_size += alignage;
+mrs_gpa  -= alignage;
+}
+/* Now align the size up to a page boundary */
+alignage = mrs_size & (mrs_page - 1);
+if (alignage) {
+mrs_size += mrs_page - alignage;
+}
+trace_vhost_region_add_section_aligned(section->mr->name, mrs_gpa, 
mrs_size,
+   mrs_host);
+
 if (dev->n_tmp_sections) {
 /* Since we already have at least one section, lets see if
  * this extends it; since we're scanning in order, we only
@@ -541,18 +559,46 @@ static void vhost_region_add_section(struct vhost_dev 
*dev,
 prev_sec->offset_within_region;
 uint64_t prev_host_end   = range_get_last(prev_host_start, prev_size);
 
-if (prev_gpa_end + 1 == mrs_gpa &&
-prev_host_end + 1 == mrs_host &&
-section->mr == prev_sec->mr &&
-(!dev->vhost_ops->vhost_backend_can_merge ||
-dev->vhost_ops->vhost_backend_can_merge(dev,
+if (mrs_gpa <= (prev_gpa_end + 1)) {
+/* OK, looks like overlapping/intersecting - it's possible that
+ * the rounding to page sizes has made them overlap, but they 
should
+ * match up in the same RAMBlock if they do.
+ */
+if (mrs_gpa < prev_gpa_start) {
+error_report("%s:Section rounded to %"PRIx64
+ " prior to previous %"PRIx64,
+ __func__, mrs_gpa, prev_gpa_start);
+/* A way to cleanly fail here would be better */
+return;
+}
+/* Offset from the start of the previous GPA to this GPA */
+size_t offset = mrs_gpa - prev_gpa_start;
+
+if (prev_host_start + offset == mrs_host &&
+section->mr == prev_sec->mr &&
+(!dev->vhost_ops->vhost_backend_can_merge ||
+ dev->vhost_ops->vhost_backend_can_merge(dev,
 mrs_host, mrs_size,
 prev_host_start, prev_size))) {
-/* The two sections abut */
-need_add = false;
-prev_sec->size = int128_add(prev_sec->size, section->size);
-trace_vhost_region_add_section_abut(section->mr->name,
-mrs_size + prev_size);
+uint64_t max_end = MAX(prev_host_end, mrs_host + mrs_size);
+need_add = false;
+prev_sec->offset_within_address_space =
+MIN(prev_gpa_start, mrs_gpa);
+prev_sec->offset_within_region =
+MIN(prev_host_start, mrs_host) -
+(uintptr_t)memory_region_get_ram_ptr(prev_sec->mr);
+prev_sec->size 

[Qemu-devel] [PATCH v6 24/29] vhost-user: Add VHOST_USER_POSTCOPY_END message

2018-03-14 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

This message is sent just before the end of postcopy to get the
client to stop using userfault since we wont respond to any more
requests.  It should close userfaultfd so that any other pages
get mapped to the backing file automatically by the kernel, since
at this point we know we've received everything.

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Peter Xu 
Reviewed-by: Marc-André Lureau 
---
 contrib/libvhost-user/libvhost-user.c | 23 +++
 contrib/libvhost-user/libvhost-user.h |  1 +
 docs/interop/vhost-user.txt   | 12 
 hw/virtio/vhost-user.c|  1 +
 4 files changed, 37 insertions(+)

diff --git a/contrib/libvhost-user/libvhost-user.c 
b/contrib/libvhost-user/libvhost-user.c
index 5feed52098..504ff5ea59 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -99,6 +99,7 @@ vu_request_to_string(unsigned int req)
 REQ(VHOST_USER_SET_CONFIG),
 REQ(VHOST_USER_POSTCOPY_ADVISE),
 REQ(VHOST_USER_POSTCOPY_LISTEN),
+REQ(VHOST_USER_POSTCOPY_END),
 REQ(VHOST_USER_MAX),
 };
 #undef REQ
@@ -1094,6 +1095,26 @@ vu_set_postcopy_listen(VuDev *dev, VhostUserMsg *vmsg)
 vmsg->payload.u64 = 0; /* Success */
 return true;
 }
+
+static bool
+vu_set_postcopy_end(VuDev *dev, VhostUserMsg *vmsg)
+{
+DPRINT("%s: Entry\n", __func__);
+dev->postcopy_listening = false;
+if (dev->postcopy_ufd > 0) {
+close(dev->postcopy_ufd);
+dev->postcopy_ufd = -1;
+DPRINT("%s: Done close\n", __func__);
+}
+
+vmsg->fd_num = 0;
+vmsg->payload.u64 = 0;
+vmsg->size = sizeof(vmsg->payload.u64);
+vmsg->flags = VHOST_USER_VERSION |  VHOST_USER_REPLY_MASK;
+DPRINT("%s: exit\n", __func__);
+return true;
+}
+
 static bool
 vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
 {
@@ -1169,6 +1190,8 @@ vu_process_message(VuDev *dev, VhostUserMsg *vmsg)
 return vu_set_postcopy_advise(dev, vmsg);
 case VHOST_USER_POSTCOPY_LISTEN:
 return vu_set_postcopy_listen(dev, vmsg);
+case VHOST_USER_POSTCOPY_END:
+return vu_set_postcopy_end(dev, vmsg);
 default:
 vmsg_close_fds(vmsg);
 vu_panic(dev, "Unhandled request: %d", vmsg->request);
diff --git a/contrib/libvhost-user/libvhost-user.h 
b/contrib/libvhost-user/libvhost-user.h
index ed505cf0c1..79f7a53ee8 100644
--- a/contrib/libvhost-user/libvhost-user.h
+++ b/contrib/libvhost-user/libvhost-user.h
@@ -87,6 +87,7 @@ typedef enum VhostUserRequest {
 VHOST_USER_CLOSE_CRYPTO_SESSION = 27,
 VHOST_USER_POSTCOPY_ADVISE  = 28,
 VHOST_USER_POSTCOPY_LISTEN  = 29,
+VHOST_USER_POSTCOPY_END = 30,
 VHOST_USER_MAX
 } VhostUserRequest;
 
diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index e295ef12ca..c058c407df 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -729,6 +729,18 @@ Master message types
   This is always sent sometime after a VHOST_USER_POSTCOPY_ADVISE, and
   thus only when VHOST_USER_PROTOCOL_F_PAGEFAULT is supported.
 
+ * VHOST_USER_POSTCOPY_END
+  Id: 30
+  Slave payload: u64
+
+  Master advises that postcopy migration has now completed.  The
+  slave must disable the userfaultfd. The response is an acknowledgement
+  only.
+  When VHOST_USER_PROTOCOL_F_PAGEFAULT is supported, this message
+  is sent at the end of the migration, after VHOST_USER_POSTCOPY_LISTEN
+  was previously sent.
+  The value returned is an error indication; 0 is success.
+
 Slave message types
 ---
 
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 7c887c368a..c3d2053303 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -82,6 +82,7 @@ typedef enum VhostUserRequest {
 VHOST_USER_CLOSE_CRYPTO_SESSION = 27,
 VHOST_USER_POSTCOPY_ADVISE  = 28,
 VHOST_USER_POSTCOPY_LISTEN  = 29,
+VHOST_USER_POSTCOPY_END = 30,
 VHOST_USER_MAX
 } VhostUserRequest;
 
-- 
2.14.3




[Qemu-devel] [PATCH v6 23/29] libvhost-user: mprotect & madvises for postcopy

2018-03-14 Thread Dr. David Alan Gilbert (git)
From: "Dr. David Alan Gilbert" 

Clear the area and turn off THP.
PROT_NONE the area until after we've userfault advised it
to catch any unexpected changes.

Signed-off-by: Dr. David Alan Gilbert 
Reviewed-by: Marc-André Lureau 
---
 contrib/libvhost-user/libvhost-user.c | 47 +++
 1 file changed, 42 insertions(+), 5 deletions(-)

diff --git a/contrib/libvhost-user/libvhost-user.c 
b/contrib/libvhost-user/libvhost-user.c
index 6314549b65..5feed52098 100644
--- a/contrib/libvhost-user/libvhost-user.c
+++ b/contrib/libvhost-user/libvhost-user.c
@@ -454,7 +454,7 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg 
*vmsg)
 int i;
 VhostUserMemory *memory = >payload.memory;
 dev->nregions = memory->nregions;
-/* TODO: Postcopy specific code */
+
 DPRINT("Nregions: %d\n", memory->nregions);
 for (i = 0; i < dev->nregions; i++) {
 void *mmap_addr;
@@ -478,9 +478,12 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg 
*vmsg)
 
 /* We don't use offset argument of mmap() since the
  * mapped address has to be page aligned, and we use huge
- * pages.  */
+ * pages.
+ * In postcopy we're using PROT_NONE here to catch anyone
+ * accessing it before we userfault
+ */
 mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset,
- PROT_READ | PROT_WRITE, MAP_SHARED,
+ PROT_NONE, MAP_SHARED,
  vmsg->fds[i], 0);
 
 if (mmap_addr == MAP_FAILED) {
@@ -519,12 +522,38 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg 
*vmsg)
 /* OK, now we can go and register the memory and generate faults */
 for (i = 0; i < dev->nregions; i++) {
 VuDevRegion *dev_region = >regions[i];
+int ret;
 #ifdef UFFDIO_REGISTER
 /* We should already have an open ufd. Mark each memory
  * range as ufd.
- * Note: Do we need any madvises? Well it's not been accessed
- * yet, still probably need no THP to be safe, discard to be safe?
+ * Discard any mapping we have here; note I can't use MADV_REMOVE
+ * or fallocate to make the hole since I don't want to lose
+ * data that's already arrived in the shared process.
+ * TODO: How to do hugepage
  */
+ret = madvise((void *)dev_region->mmap_addr,
+  dev_region->size + dev_region->mmap_offset,
+  MADV_DONTNEED);
+if (ret) {
+fprintf(stderr,
+"%s: Failed to madvise(DONTNEED) region %d: %s\n",
+__func__, i, strerror(errno));
+}
+/* Turn off transparent hugepages so we dont get lose wakeups
+ * in neighbouring pages.
+ * TODO: Turn this backon later.
+ */
+ret = madvise((void *)dev_region->mmap_addr,
+  dev_region->size + dev_region->mmap_offset,
+  MADV_NOHUGEPAGE);
+if (ret) {
+/* Note: This can happen legally on kernels that are configured
+ * without madvise'able hugepages
+ */
+fprintf(stderr,
+"%s: Failed to madvise(NOHUGEPAGE) region %d: %s\n",
+__func__, i, strerror(errno));
+}
 struct uffdio_register reg_struct;
 reg_struct.range.start = (uintptr_t)dev_region->mmap_addr;
 reg_struct.range.len = dev_region->size + dev_region->mmap_offset;
@@ -546,6 +575,14 @@ vu_set_mem_table_exec_postcopy(VuDev *dev, VhostUserMsg 
*vmsg)
 }
 DPRINT("%s: region %d: Registered userfault for %llx + %llx\n",
 __func__, i, reg_struct.range.start, reg_struct.range.len);
+/* Now it's registered we can let the client at it */
+if (mprotect((void *)dev_region->mmap_addr,
+ dev_region->size + dev_region->mmap_offset,
+ PROT_READ | PROT_WRITE)) {
+vu_panic(dev, "failed to mprotect region %d for postcopy (%s)",
+ i, strerror(errno));
+return false;
+}
 /* TODO: Stash 'zero' support flags somewhere */
 #endif
 }
-- 
2.14.3




  1   2   >