Re: [Qemu-devel] [Qemu-stable] [2.2 PATCH] virtio-net: fix unmap leak

2014-11-26 Thread Fam Zheng
On Thu, 11/27 13:59, Jason Wang wrote:
> virtio_net_handle_ctrl() and other functions that process control vq
> request call iov_discard_front() which will shorten the iov. This will
> lead unmapping in virtqueue_push() leaks mapping.
> 
> Fixes this by keeping the original iov untouched and using a temp variable
> in those functions.

I'm a little confused. Commit message mentioned "other functions" but in this
patch only virtio_net_handle_ctrl is changed. So is this complete?

Fam

> 
> Cc: Wen Congyang 
> Cc: Stefano Stabellini 
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Jason Wang 
> ---
>  hw/net/virtio-net.c | 9 +++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 9b88775..fdb4edd 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -798,7 +798,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
> VirtQueue *vq)
>  virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
>  VirtQueueElement elem;
>  size_t s;
> -struct iovec *iov;
> +struct iovec *iov, *iov2;
>  unsigned int iov_cnt;
>  
>  while (virtqueue_pop(vq, &elem)) {
> @@ -808,8 +808,12 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
> VirtQueue *vq)
>  exit(1);
>  }
>  
> -iov = elem.out_sg;
>  iov_cnt = elem.out_num;
> +s = sizeof(struct iovec) * elem.out_num;
> +iov = g_malloc(s);
> +memcpy(iov, elem.out_sg, s);
> +iov2 = iov;
> +
>  s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));
>  iov_discard_front(&iov, &iov_cnt, sizeof(ctrl));
>  if (s != sizeof(ctrl)) {
> @@ -833,6 +837,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
> VirtQueue *vq)
>  
>  virtqueue_push(vq, &elem, sizeof(status));
>  virtio_notify(vdev, vq);
> +g_free(iov2);
>  }
>  }
>  
> -- 
> 1.9.1
> 
> 



[Qemu-devel] [2.2 PATCH] virtio-net: fix unmap leak

2014-11-26 Thread Jason Wang
virtio_net_handle_ctrl() and other functions that process control vq
request call iov_discard_front() which will shorten the iov. This will
lead unmapping in virtqueue_push() leaks mapping.

Fixes this by keeping the original iov untouched and using a temp variable
in those functions.

Cc: Wen Congyang 
Cc: Stefano Stabellini 
Cc: qemu-sta...@nongnu.org
Signed-off-by: Jason Wang 
---
 hw/net/virtio-net.c | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 9b88775..fdb4edd 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -798,7 +798,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
 virtio_net_ctrl_ack status = VIRTIO_NET_ERR;
 VirtQueueElement elem;
 size_t s;
-struct iovec *iov;
+struct iovec *iov, *iov2;
 unsigned int iov_cnt;
 
 while (virtqueue_pop(vq, &elem)) {
@@ -808,8 +808,12 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
 exit(1);
 }
 
-iov = elem.out_sg;
 iov_cnt = elem.out_num;
+s = sizeof(struct iovec) * elem.out_num;
+iov = g_malloc(s);
+memcpy(iov, elem.out_sg, s);
+iov2 = iov;
+
 s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));
 iov_discard_front(&iov, &iov_cnt, sizeof(ctrl));
 if (s != sizeof(ctrl)) {
@@ -833,6 +837,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
 
 virtqueue_push(vq, &elem, sizeof(status));
 virtio_notify(vdev, vq);
+g_free(iov2);
 }
 }
 
-- 
1.9.1




[Qemu-devel] [PATCH] Fix for crash after migration in virtio-rng on bi-endian targets

2014-11-26 Thread David Gibson
VirtIO devices now remember which endianness they're operating in in order
to support targets which may have guests of either endianness, such as
powerpc.  This endianness state is transferred in a subsection of the
virtio device's information.

With virtio-rng this can lead to an abort after a loadvm hitting the
assert() in virtio_is_big_endian().  This can be reproduced by doing a
migrate and load from file on a bi-endian target with a virtio-rng device.
The actual guest state isn't particularly important to triggering this.

The cause is that virtio_rng_load_device() calls virtio_rng_process() which
accesses the ring and thus needs the endianness.  However,
virtio_rng_process() is called via virtio_load() before it loads the
subsections.  Essentially the ->load callback in VirtioDeviceClass should
only be used for actually reading the device state from the stream, not for
post-load re-initialization.

This patch fixes the bug by moving the virtio_rng_process() after the call
to virtio_load().  Better yet would be to convert virtio to use vmsd and
have the virtio_rng_process() as a post_load callback, but that's a bigger
project for another day.

This is bugfix, and should be considered for the 2.2 branch.

Signed-off-by: David Gibson 
---
 hw/virtio/virtio-rng.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c
index e85a979..473c044 100644
--- a/hw/virtio/virtio-rng.c
+++ b/hw/virtio/virtio-rng.c
@@ -113,20 +113,22 @@ static void virtio_rng_save(QEMUFile *f, void *opaque)
 
 static int virtio_rng_load(QEMUFile *f, void *opaque, int version_id)
 {
+VirtIORNG *vrng = opaque;
+int ret;
+
 if (version_id != 1) {
 return -EINVAL;
 }
-return virtio_load(VIRTIO_DEVICE(opaque), f, version_id);
-}
+ret = virtio_load(VIRTIO_DEVICE(vrng), f, version_id);
+if (ret != 0) {
+return ret;
+}
 
-static int virtio_rng_load_device(VirtIODevice *vdev, QEMUFile *f,
-  int version_id)
-{
 /* We may have an element ready but couldn't process it due to a quota
  * limit.  Make sure to try again after live migration when the quota may
  * have been reset.
  */
-virtio_rng_process(VIRTIO_RNG(vdev));
+virtio_rng_process(vrng);
 
 return 0;
 }
@@ -231,7 +233,6 @@ static void virtio_rng_class_init(ObjectClass *klass, void 
*data)
 vdc->realize = virtio_rng_device_realize;
 vdc->unrealize = virtio_rng_device_unrealize;
 vdc->get_features = get_features;
-vdc->load = virtio_rng_load_device;
 }
 
 static void virtio_rng_initfn(Object *obj)
-- 
1.9.3




[Qemu-devel] [Bug 1396497] Re: 'qemu-img snapshot' allows new snapshot to be created with the name of an existing snapshot

2014-11-26 Thread Serge Hallyn
I'd agree that at least the last part - removing the oldest snapshot
first - seems like a bug.

** Also affects: qemu
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1396497

Title:
  'qemu-img snapshot' allows new snapshot to be created with the name of
  an existing snapshot

Status in QEMU:
  New
Status in “qemu” package in Ubuntu:
  New

Bug description:
  qemu-img _may_ be working as designed, but it feels like this could be
  a bug. I'd certainly prefer to only allow unique snapshot names
  (unless maybe something like a "--force-non-unique-snapshot-names" was
  also specified).

  If this really is correct behaviour, it should be documented as qemu-
  img(1) currently specifies no details whatsoever regarding expected
  behaviour or valid snapshot names.

  $ qemu-img snapshot -l image.cow 
  $ qemu-img snapshot -c foo image.cow
  $ qemu-img snapshot -l image.cow
  Snapshot list:
  IDTAG VM SIZEDATE   VM CLOCK
  1 foo   0 2014-11-26 08:30:53   00:00:00.000
  $ qemu-img snapshot -c foo image.cow 
  $ qemu-img snapshot -l image.cow
  Snapshot list:
  IDTAG VM SIZEDATE   VM CLOCK
  1 foo   0 2014-11-26 08:30:53   00:00:00.000
  2 foo   0 2014-11-26 08:30:58   00:00:00.000
  $ qemu-img snapshot -c foo image.cow 
  $ qemu-img snapshot -l image.cow
  Snapshot list:
  IDTAG VM SIZEDATE   VM CLOCK
  1 foo   0 2014-11-26 08:30:53   00:00:00.000
  2 foo   0 2014-11-26 08:30:58   00:00:00.000
  3 foo   0 2014-11-26 08:31:00   00:00:00.000
  $ qemu-img snapshot -d foo image.cow
  $ qemu-img snapshot -l image.cow
  Snapshot list:
  IDTAG VM SIZEDATE   VM CLOCK
  2 foo   0 2014-11-26 08:30:58   00:00:00.000
  3 foo   0 2014-11-26 08:31:00   00:00:00.000
  $ qemu-img snapshot -d foo image.cow 
  $ qemu-img snapshot -l image.cow
  Snapshot list:
  IDTAG VM SIZEDATE   VM CLOCK
  3 foo   0 2014-11-26 08:31:00   00:00:00.000
  $ qemu-img snapshot -d foo image.cow 
  $ qemu-img snapshot -l image.cow 
  $

  Note also how snapshot deletion works in reverse order - the oldest
  snapshot with a given name is deleted first.

  ProblemType: Bug
  DistroRelease: Ubuntu 15.04
  Package: qemu-utils 2.1+dfsg-4ubuntu9
  ProcVersionSignature: Ubuntu 3.16.0-25.33-generic 3.16.7
  Uname: Linux 3.16.0-25-generic x86_64
  ApportVersion: 2.14.7-0ubuntu10
  Architecture: amd64
  CurrentDesktop: Unity
  Date: Wed Nov 26 08:28:16 2014
  InstallationDate: Installed on 2014-04-11 (228 days ago)
  InstallationMedia: Ubuntu 14.04 LTS "Trusty Tahr" - Daily amd64 (20140409)
  KvmCmdLine:
   COMMAND STAT  EUID  RUID   PID  PPID %CPU COMMAND
   kvm-irqfd-clean S<   0 0   719 2  0.0 [kvm-irqfd-clean]
  MachineType: LENOVO 20AQCTO1WW
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.16.0-25-generic 
root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
  SourcePackage: qemu
  UpgradeStatus: Upgraded to vivid on 2014-05-08 (201 days ago)
  dmi.bios.date: 02/10/2014
  dmi.bios.vendor: LENOVO
  dmi.bios.version: GJET71WW (2.21 )
  dmi.board.asset.tag: Not Available
  dmi.board.name: 20AQCTO1WW
  dmi.board.vendor: LENOVO
  dmi.board.version: 0B98405 STD
  dmi.chassis.asset.tag: No Asset Information
  dmi.chassis.type: 10
  dmi.chassis.vendor: LENOVO
  dmi.chassis.version: Not Available
  dmi.modalias: 
dmi:bvnLENOVO:bvrGJET71WW(2.21):bd02/10/2014:svnLENOVO:pn20AQCTO1WW:pvrThinkPadT440s:rvnLENOVO:rn20AQCTO1WW:rvr0B98405STD:cvnLENOVO:ct10:cvrNotAvailable:
  dmi.product.name: 20AQCTO1WW
  dmi.product.version: ThinkPad T440s
  dmi.sys.vendor: LENOVO

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1396497/+subscriptions



Re: [Qemu-devel] [Xen-devel] virtio leaks cpu mappings, was: qemu crash with virtio on Xen domUs (backtrace included)

2014-11-26 Thread Jason Wang



On 11/26/2014 06:53 PM, Stefano Stabellini wrote:

On Wed, 26 Nov 2014, Jason Wang wrote:

>On 11/25/2014 09:53 PM, Stefano Stabellini wrote:

> >On Tue, 25 Nov 2014, Jason Wang wrote:

> >>On 11/25/2014 02:44 AM, Stefano Stabellini wrote:

> >>>On Mon, 24 Nov 2014, Stefano Stabellini wrote:

> On Mon, 24 Nov 2014, Stefano Stabellini wrote:

> >CC'ing Paolo.
> >
> >
> >Wen,
> >thanks for the logs.
> >
> >I investigated a little bit and it seems to me that the bug occurs when
> >QEMU tries to unmap only a portion of a memory region previously mapped.
> >That doesn't work with xen-mapcache.
> >
> >See these logs for example:
> >
> >DEBUG address_space_map phys_addr=78ed8b44 vaddr=7fab50afbb68 len=0xa
> >DEBUG address_space_unmap vaddr=7fab50afbb68 len=0x6

> Sorry the logs don't quite match, it was supposed to be:
> 
> DEBUG address_space_map phys_addr=78ed8b44 vaddr=7fab50afbb64 len=0xa
> DEBUG address_space_unmap vaddr=7fab50afbb68 len=0x6

> >>>It looks like the problem is caused by iov_discard_front, called by
> >>>virtio_net_handle_ctrl. By changing iov_base after the sg has already
> >>>been mapped (cpu_physical_memory_map), it causes a leak in the mapping
> >>>because the corresponding cpu_physical_memory_unmap will only unmap a
> >>>portion of the original sg.  On Xen the problem is worse because
> >>>xen-mapcache aborts.
> >>>
> >>>diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> >>>index 2ac6ce5..b2b5c2d 100644
> >>>--- a/hw/net/virtio-net.c
> >>>+++ b/hw/net/virtio-net.c
> >>>@@ -775,7 +775,7 @@ static void virtio_net_handle_ctrl(VirtIODevice *vdev, 
VirtQueue *vq)
> >>>  struct iovec *iov;
> >>>  unsigned int iov_cnt;
> >>>
> >>>-while (virtqueue_pop(vq, &elem)) {
> >>>+while (virtqueue_pop_nomap(vq, &elem)) {
> >>>  if (iov_size(elem.in_sg, elem.in_num) < sizeof(status) ||
> >>>  iov_size(elem.out_sg, elem.out_num) < sizeof(ctrl)) {
> >>>  error_report("virtio-net ctrl missing headers");
> >>>@@ -784,8 +784,12 @@ static void virtio_net_handle_ctrl(VirtIODevice 
*vdev, VirtQueue *vq)
> >>>
> >>>  iov = elem.out_sg;
> >>>  iov_cnt = elem.out_num;
> >>>-s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));
> >>>  iov_discard_front(&iov, &iov_cnt, sizeof(ctrl));
> >>>+
> >>>+virtqueue_map_sg(elem.in_sg, elem.in_addr, elem.in_num, 1);
> >>>+virtqueue_map_sg(elem.out_sg, elem.out_addr, elem.out_num, 0);
> >>>+
> >>>+s = iov_to_buf(iov, iov_cnt, 0, &ctrl, sizeof(ctrl));

> >>Does this really work?

> >It seems to work here, as in it doesn't crash QEMU and I am able to boot
> >a guest with network. I didn't try any MAC related commands.
> >

>
>It was because the guest (not a recent kernel?) never issue commands
>through control vq.
>
>We'd better hide the implementation details such as virtqueue_map_sg()
>in virtio core instead of letting device call it directly.

> >>The code in fact skips the location that contains
> >>virtio_net_ctrl_hdr. And virtio_net_handle_mac() still calls
> >>iov_discard_front().
> >>
> >>How about copy iov to a temp variable and use it in this function?

> >That would only work if I moved the cpu_physical_memory_unmap call
> >outside of virtqueue_fill, so that we can pass different iov to them.
> >We need to unmap the same iov that was previously mapped by
> >virtqueue_pop.
> >

>
>I mean something like following or just passing the offset of iov to
>virtio_net_handle_*().

Sorry, you are right, your patch works too. I tried something like this
yesterday but I was confused because even if a crash doesn't happen
anymore, virtio-net still doesn't work on Xen (it boots but the network
doesn't work properly within the guest).
But that seems to be a separate issue and it affects my series too.

A possible problem with this approach is that virtqueue_push is now
called passing the original iov, not the shortened one.

Are you sure that is OK?


It's ok, except for unmapping, virtqueue_push does not care iov at all.

If so we can drop my series and use this instead.



I will submit a formal patch for this.

Thanks



[Qemu-devel] [PATCH] nvme: 64kB page size fixes

2014-11-26 Thread Anton Blanchard
Initialise our maximum page size capability to 64kB and increase
the page_size variable from 16 to 32 bits.

Signed-off-by: Anton Blanchard 
--

diff --git a/hw/block/nvme.c b/hw/block/nvme.c
index 1327658..aa1ed98 100644
--- a/hw/block/nvme.c
+++ b/hw/block/nvme.c
@@ -811,6 +811,7 @@ static int nvme_init(PCIDevice *pci_dev)
 NVME_CAP_SET_AMS(n->bar.cap, 1);
 NVME_CAP_SET_TO(n->bar.cap, 0xf);
 NVME_CAP_SET_CSS(n->bar.cap, 1);
+NVME_CAP_SET_MPSMAX(n->bar.cap, 4);
 
 n->bar.vs = 0x00010001;
 n->bar.intmc = n->bar.intms = 0;
diff --git a/hw/block/nvme.h b/hw/block/nvme.h
index 993c511..b6ccb65 100644
--- a/hw/block/nvme.h
+++ b/hw/block/nvme.h
@@ -688,7 +688,7 @@ typedef struct NvmeCtrl {
 NvmeBar  bar;
 BlockConfconf;
 
-uint16_tpage_size;
+uint32_tpage_size;
 uint16_tpage_bits;
 uint16_tmax_prp_ents;
 uint16_tcqe_size;



Re: [Qemu-devel] [Xen-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

2014-11-26 Thread Don Slutz


On 11/26/14 13:17, Stefano Stabellini wrote:

On Tue, 25 Nov 2014, Andrew Cooper wrote:

On 25/11/14 17:45, Stefano Stabellini wrote:

Increase maxmem before calling xc_domain_populate_physmap_exact to avoid
the risk of running out of guest memory. This way we can also avoid
complex memory calculations in libxl at domain construction time.

This patch fixes an abort() when assigning more than 4 NICs to a VM.

Signed-off-by: Stefano Stabellini 

diff --git a/xen-hvm.c b/xen-hvm.c
index 5c69a8d..38e08c3 100644
--- a/xen-hvm.c
+++ b/xen-hvm.c
@@ -218,6 +218,7 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, 
MemoryRegion *mr)
  unsigned long nr_pfn;
  xen_pfn_t *pfn_list;
  int i;
+xc_dominfo_t info;
  
  if (runstate_check(RUN_STATE_INMIGRATE)) {

  /* RAM already populated in Xen */
@@ -240,6 +241,13 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, 
MemoryRegion *mr)
  pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;
  }
  
+if (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) < 0) {

xc_domain_getinfo()'s interface is mad, and provides no guarantee that
it returns the information for the domain you requested.  It also won't
return -1 on error.  The correct error handing is:

(xc_domain_getinfo(xen_xc, xen_domid, 1, &info) != 1) || (info.domid !=
xen_domid)

It might be wiser to switch to xc_domain_getinfolist


Either needs the same tests, since both return an vector of info.




~Andrew


+hw_error("xc_domain_getinfo failed");
+}
+if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
+(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {


There are two big issues and 1 minor one with this.
1) You will allocate the videoram again.
2) You will never use the 1 MB already allocated for option ROMs.

And the minor one is that you can increase maxmem more then is needed.


Here is a better if:

-if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
-(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
+max_pages = info.max_memkb * 1024 / XC_PAGE_SIZE;
+free_pages = max_pages - info.nr_pages;
+need_pages = nr_pfn - free_pages;
+if ((free_pages < nr_pfn) &&
+   (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
+(need_pages * XC_PAGE_SIZE / 1024)) < 0)) {


My testing shows a free 32 pages that I am not sure where they come 
from.  But

the code about is passing my 8 nics of e1000.


   -Don Slutz



+hw_error("xc_domain_setmaxmem failed");
+}
  if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, nr_pfn, 0, 0, 
pfn_list)) {
  hw_error("xen: failed to populate ram at " RAM_ADDR_FMT, ram_addr);
  }

___
Xen-devel mailing list
xen-de...@lists.xen.org
http://lists.xen.org/xen-devel







Re: [Qemu-devel] [PATCH] qmp: extend QMP to provide read/write access to physical memory

2014-11-26 Thread Fam Zheng
On Wed, 11/26 12:27, Bryan D. Payne wrote:
> This patch adds a new QMP command that sets up a domain socket. This
> socket can then be used for fast read/write access to the guest's
> physical memory. The key benefit to this system over existing solutions
> is speed. Using this patch, guest memory can be copied out at a rate of
> ~200MB/sec, depending on the hardware. Existing solutions only achieve
> a small fraction of this speed.

Out of curiosity, what are existing solutions?

> 
> Signed-off-by: Bryan D. Payne 
> ---
>  Makefile.target  |   2 +-
>  memory-access.c  | 200 
> +++
>  memory-access.h  |  11 +++
>  monitor.c|   6 ++
>  qapi-schema.json |  12 
>  qmp-commands.hx  |  28 
>  6 files changed, 258 insertions(+), 1 deletion(-)
>  create mode 100644 memory-access.c
>  create mode 100644 memory-access.h
> 
> diff --git a/Makefile.target b/Makefile.target
> index e9ff1ee..4b3cd99 100644
> --- a/Makefile.target
> +++ b/Makefile.target
> @@ -127,7 +127,7 @@ endif #CONFIG_BSD_USER
>  # System emulator target
>  ifdef CONFIG_SOFTMMU
>  obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
> -obj-y += qtest.o bootdevice.o
> +obj-y += qtest.o bootdevice.o memory-access.o
>  obj-y += hw/
>  obj-$(CONFIG_FDT) += device_tree.o
>  obj-$(CONFIG_KVM) += kvm-all.o
> diff --git a/memory-access.c b/memory-access.c
> new file mode 100644
> index 000..f696d7b
> --- /dev/null
> +++ b/memory-access.c
> @@ -0,0 +1,200 @@
> +/*
> + * Access guest physical memory via a domain socket.
> + *
> + * Copyright (c) 2014 Bryan D. Payne (bdpa...@acm.org)
> + */
> +
> +#include "memory-access.h"
> +#include "qemu-common.h"
> +#include "exec/cpu-common.h"
> +#include "config.h"
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +struct request {
> +uint8_t type;  /* 0 quit, 1 read, 2 write, ... rest reserved */
> +uint64_t address;  /* address to read from OR write to */
> +uint64_t length;   /* number of bytes to read OR write */
> +};

Please add QEMU_PACKED to this structure, and probably name it QEMUMARequest,
for name collision avoidance and CamelCase convension.

> +
> +static uint64_t
> +connection_read_memory(uint64_t user_paddr, void *buf, uint64_t user_len)
> +{
> +hwaddr paddr = (hwaddr) user_paddr;
> +hwaddr len = (hwaddr) user_len;
> +void *guestmem = cpu_physical_memory_map(paddr, &len, 0);

Accessing guest memory from another thread is not safe without holding the
lock: qemu_mutex_lock_iothread is needed before processing a request, and
unlock it after responding, to let guest continue.

> +if (!guestmem) {
> +return 0;
> +}
> +memcpy(buf, guestmem, len);
> +cpu_physical_memory_unmap(guestmem, len, 0, len);
> +
> +return len;
> +}
> +
> +static uint64_t
> +connection_write_memory(uint64_t user_paddr,
> +const void *buf,
> +uint64_t user_len)
> +{
> +hwaddr paddr = (hwaddr) user_paddr;
> +hwaddr len = (hwaddr) user_len;
> +void *guestmem = cpu_physical_memory_map(paddr, &len, 1);
> +if (!guestmem) {
> +return 0;
> +}
> +memcpy(guestmem, buf, len);
> +cpu_physical_memory_unmap(guestmem, len, 0, len);
> +
> +return len;
> +}
> +
> +static void
> +send_success_ack(int connection_fd)
> +{
> +uint8_t success = 1;
> +int nbytes = write(connection_fd, &success, 1);
> +if (nbytes != 1) {
> +printf("QemuMemoryAccess: failed to send success ack\n");
> +}
> +}
> +
> +static void
> +send_fail_ack(int connection_fd)
> +{
> +uint8_t fail = 0;
> +int nbytes = write(connection_fd, &fail, 1);
> +if (nbytes != 1) {
> +printf("QemuMemoryAccess: failed to send fail ack\n");

fprintf(stderr ?

> +}
> +}
> +
> +static void
> +connection_handler(int connection_fd)
> +{
> +int nbytes;
> +struct request req;
> +
> +while (1) {
> +/* client request should match the struct request format */
> +nbytes = read(connection_fd, &req, sizeof(struct request));
> +if (nbytes != sizeof(struct request)) {
> +/* error */
> +send_fail_ack(connection_fd);
> +continue;
> +} else if (req.type == 0) {

I suggest using macro names for enum instead of magic values 0, 1, 2.

> +/* request to quit, goodbye */
> +break;
> +} else if (req.type == 1) {
> +/* request to read */
> +char *buf = g_malloc(req.length + 1);

Is there a limit for r/w length? We could abort if req.length is too big.

> +nbytes = connection_read_memory(req.address, buf, req.length);
> +if (nbytes != req.length) {
> +/* read failure, return failure message */
> +buf[req.length] = 0; /* set last byte to 0 for failure */
> +

Re: [Qemu-devel] [PATCH] ppc: do not use get_clock_realtime()

2014-11-26 Thread Tony Breeds
On Wed, Nov 26, 2014 at 03:01:01PM +0100, Paolo Bonzini wrote:
> Use the external qemu-timer API instead.

Perhaps it's obvious but is it reasonable to expand on why you're doign this in
the commit message?
 
Yours Tony.


pgpbbAXtQoPFT.pgp
Description: PGP signature


Re: [Qemu-devel] virtio_blk: fix defaults for max_hw_sectors and max_segment_size

2014-11-26 Thread Mike Snitzer
On Wed, Nov 26 2014 at  4:53pm -0500,
Jens Axboe  wrote:

> On 11/26/2014 02:51 PM, Mike Snitzer wrote:
> > 
> > But while you're here, I wouldn't mind getting your take on virtio-blk
> > setting max_hw_sectors to -1U.
> > 
> > As I said in my original reply to mst: it only makes sense to set a
> > really high initial upper bound like that in a driver if that driver
> > goes on to stack an underlying device's limit.
> 
> -1U should just work, IMHO, there's no reason we should need to cap it
> at some synthetic value.  That said, it seems it should be one of
> those parameters that should be negotiated up and set appropriately. 

I'm saying set it to the underlying device's value for max_hw_sectors --
not some synthetic value.  So I think we're saying the same thing.

But it isn't immediately clear (to me) how that benefits virtio-blk
users (obviously they are getting by today).  So until that is pinned
down I imagine nobody will care to extend the virtio-blk protocol to
allow stacking max_hw_sectors and max_sectors up.



Re: [Qemu-devel] virtio_blk: fix defaults for max_hw_sectors and max_segment_size

2014-11-26 Thread Jens Axboe
On 11/26/2014 02:51 PM, Mike Snitzer wrote:
> On Wed, Nov 26 2014 at  3:54pm -0500,
> Jens Axboe  wrote:
> 
>> On 11/26/2014 01:51 PM, Mike Snitzer wrote:
>>> On Wed, Nov 26 2014 at  2:48pm -0500,
>>> Jens Axboe  wrote:
>>>

 That code isn't even in mainline, as far as I can tell...
>>>
>>> Right, it is old RHEL6 code.
>>>
>>> But I've yet to determine what changed upstream that enables this to
>>> "just work" with a really large max_sectors (I haven't been looking
>>> either).
>>
>> Kind of hard for the rest of us to say, since it's triggering a BUG in
>> code we don't have :-)
> 
> I never asked you or others to weigh in on old RHEL6 code.  Once I
> realized upstream worked even if max_sectors is _really_ high I said
> "sorry for the noise".
> 
> But while you're here, I wouldn't mind getting your take on virtio-blk
> setting max_hw_sectors to -1U.
> 
> As I said in my original reply to mst: it only makes sense to set a
> really high initial upper bound like that in a driver if that driver
> goes on to stack an underlying device's limit.

-1U should just work, IMHO, there's no reason we should need to cap it
at some synthetic value. That said, it seems it should be one of those
parameters that should be negotiated up and set appropriately.

-- 
Jens Axboe




Re: [Qemu-devel] virtio_blk: fix defaults for max_hw_sectors and max_segment_size

2014-11-26 Thread Mike Snitzer
On Wed, Nov 26 2014 at  3:54pm -0500,
Jens Axboe  wrote:

> On 11/26/2014 01:51 PM, Mike Snitzer wrote:
> > On Wed, Nov 26 2014 at  2:48pm -0500,
> > Jens Axboe  wrote:
> > 
> >>
> >> That code isn't even in mainline, as far as I can tell...
> > 
> > Right, it is old RHEL6 code.
> > 
> > But I've yet to determine what changed upstream that enables this to
> > "just work" with a really large max_sectors (I haven't been looking
> > either).
> 
> Kind of hard for the rest of us to say, since it's triggering a BUG in
> code we don't have :-)

I never asked you or others to weigh in on old RHEL6 code.  Once I
realized upstream worked even if max_sectors is _really_ high I said
"sorry for the noise".

But while you're here, I wouldn't mind getting your take on virtio-blk
setting max_hw_sectors to -1U.

As I said in my original reply to mst: it only makes sense to set a
really high initial upper bound like that in a driver if that driver
goes on to stack an underlying device's limit.



Re: [Qemu-devel] MinGW build

2014-11-26 Thread Liviu Ionescu

On 26 Nov 2014, at 22:13, Peter Maydell  wrote:

> ... We have a workaround already for a similar thing in
> include/sysemu/os-win32.h,

ok, I'll investigate.

> Do you have any experience with mingw-w64?

unfortunately not, but, time permitting, I'll check it.

> In any case, thanks very much for writing up the instructions -- we do
> get people wanting to know how to build on Windows from time to time,
> so it will be useful to be able to point them at instructions.

you're welcome! 

I'll try to keep the GNU ARM Eclipse wiki as up-to-date as possible.

> You might consider tweaking the instructions to recommend building in
> a subdirectory...

right, that's a very good idea, I did not know that configure can do this.

I already updated the wiki.


regards,

Liviu




Re: [Qemu-devel] virtio_blk: fix defaults for max_hw_sectors and max_segment_size

2014-11-26 Thread Jens Axboe
On 11/26/2014 01:51 PM, Mike Snitzer wrote:
> On Wed, Nov 26 2014 at  2:48pm -0500,
> Jens Axboe  wrote:
> 
>> On 11/21/2014 08:49 AM, Mike Snitzer wrote:
>>> On Fri, Nov 21 2014 at  4:54am -0500,
>>> Christoph Hellwig  wrote:
>>>
 On Thu, Nov 20, 2014 at 02:00:59PM -0500, Mike Snitzer wrote:
> virtio_blk incorrectly established -1U as the default for these
> queue_limits.  Set these limits to sane default values to avoid crashing
> the kernel.  But the virtio-blk protocol should probably be extended to
> allow proper stacking of the disk's limits from the host.
>
> This change fixes a crash that was reported when virtio-blk was used to
> test linux-dm.git commit 604ea90641b4 ("dm thin: adjust max_sectors_kb
> based on thinp blocksize") that will initially set max_sectors to
> max_hw_sectors and then rounddown to the first power-of-2 factor of the
> DM thin-pool's blocksize.  Basically that commit assumes drivers don't
> suck when establishing max_hw_sectors so it acted like a canary in the
> coal mine.

 Is that a crash in the host or guest?  What kind of mishandling did you
 see?  Unless the recent virtio standard changed anything the host
 should be able to handle our arbitrary limits, and even if it doesn't
 that something we need to hash out with qemu and the virtio standards
 folks.
>>>
>>> Some good news: this guest crash isn't an issue with recent kernels (so
>>> upstream, fedora 20, RHEL7, etc aren't impacted -- Jens feel free to
>>> drop my virtio_blk patch; even though some of it's limits are clearly
>>> broken I'll defer to the virtio_blk developers on the best way forward
>>> -- sorry for the noise!).
>>>
>>> The BUG I saw only seems to impact RHEL6 kernels so far (note to self,
>>> actually _test_ on upstream before reporting a crash against upstream!)
>>>
>>> [root@RHEL-6 ~]# echo 1073741824 > /sys/block/vdc/queue/max_sectors_kb
>>> [root@RHEL-6 ~]# lvs
>>>
>>> Message from syslogd@RHEL-6 at Nov 21 15:32:15 ...
>>>  kernel:Kernel panic - not syncing: Fatal exception
>>>
>>> Here is the RHEL6 guest crash, just for full disclosure:
>>>
>>> kernel BUG at fs/direct-io.c:696!
>>> invalid opcode:  [#1] SMP
>>> last sysfs file: /sys/devices/virtual/block/dm-4/dev
>>> CPU 0
>>> Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 ext2 
>>> dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio libcrc32c dm_mirror 
>>> dm_region_hash dm_log dm_mod microcode virtio_balloon i2c_piix4 i2c_core 
>>> virtio_net ext4 jbd2 mbcache virtio_blk virtio_pci virtio_ring virtio 
>>> pata_acpi ata_generic ata_piix [last unloaded: speedstep_lib]
>>>
>>> Pid: 1679, comm: lvs Not tainted 2.6.32 #6 Bochs Bochs
>>> RIP: 0010:[]  [] 
>>> __blockdev_direct_IO_newtrunc+0x986/0x1270
>>> RSP: 0018:88011a11ba48  EFLAGS: 00010287
>>> RAX:  RBX: 8801192fbd28 RCX: 1000
>>> RDX: ea0003b3d218 RSI: 88011aac4300 RDI: 880118572378
>>> RBP: 88011a11bbe8 R08:  R09: 
>>> R10:  R11:  R12: 8801192fbd00
>>> R13:  R14: 880118c3cac0 R15: 
>>> FS:  7fde78bc37a0() GS:88002820() knlGS:
>>> CS:  0010 DS:  ES:  CR0: 80050033
>>> CR2: 012706f0 CR3: 00011a432000 CR4: 000407f0
>>> DR0:  DR1:  DR2: 
>>> DR3:  DR6: 0ff0 DR7: 0400
>>> Process lvs (pid: 1679, threadinfo 88011a11a000, task 8801185a4aa0)
>>> Stack:
>>>  88011a11bb48 88011a11baa8 8801000c 88011a11bb18
>>>    88011a11bdc8 88011a11beb8
>>>  000c1a11baa8 880118c3cb98  18c3ccb8
>>> Call Trace:
>>>  [] ? blkdev_get_block+0x0/0x20
>>>  [] __blockdev_direct_IO+0x77/0xe0
>>>  [] ? blkdev_get_block+0x0/0x20
>>>  [] blkdev_direct_IO+0x57/0x60
>>>  [] ? blkdev_get_block+0x0/0x20
>>>  [] generic_file_aio_read+0x6bb/0x700
>>>  [] ? blkdev_get+0x10/0x20
>>>  [] ? blkdev_open+0x0/0xc0
>>>  [] ? __dentry_open+0x23f/0x360
>>>  [] blkdev_aio_read+0x51/0x80
>>>  [] do_sync_read+0xfa/0x140
>>>  [] ? autoremove_wake_function+0x0/0x40
>>>  [] ? block_ioctl+0x3c/0x40
>>>  [] ? vfs_ioctl+0x22/0xa0
>>>  [] ? do_vfs_ioctl+0x84/0x580
>>>  [] ? security_file_permission+0x16/0x20
>>>  [] vfs_read+0xb5/0x1a0
>>>  [] sys_read+0x51/0x90
>>>  [] ? __audit_syscall_exit+0x25e/0x290
>>>  [] system_call_fastpath+0x16/0x1b
>>> Code: fe ff ff c7 85 fc fe ff ff 00 00 00 00 48 89 95 10 ff ff ff 8b 95 34 
>>> ff ff ff e8 46 ac ff ff 3b 85 34 ff ff ff 0f 84 fc 02 00 00 <0f> 0b eb fe 
>>> 8b 9d 34 ff ff ff 8b 85 30 ff ff ff 01 d8 85 c0 0f
>>> RIP  [] __blockdev_direct_IO_newtrunc+0x986/0x1270
>>>  RSP 
>>> ---[ end trace 73be5dcaf8939399 ]---
>>> Kernel panic - not syncing: Fatal exception
>>
>> That code isn't even in mainline, a

Re: [Qemu-devel] virtio_blk: fix defaults for max_hw_sectors and max_segment_size

2014-11-26 Thread Mike Snitzer
On Wed, Nov 26 2014 at  2:48pm -0500,
Jens Axboe  wrote:

> On 11/21/2014 08:49 AM, Mike Snitzer wrote:
> > On Fri, Nov 21 2014 at  4:54am -0500,
> > Christoph Hellwig  wrote:
> > 
> >> On Thu, Nov 20, 2014 at 02:00:59PM -0500, Mike Snitzer wrote:
> >>> virtio_blk incorrectly established -1U as the default for these
> >>> queue_limits.  Set these limits to sane default values to avoid crashing
> >>> the kernel.  But the virtio-blk protocol should probably be extended to
> >>> allow proper stacking of the disk's limits from the host.
> >>>
> >>> This change fixes a crash that was reported when virtio-blk was used to
> >>> test linux-dm.git commit 604ea90641b4 ("dm thin: adjust max_sectors_kb
> >>> based on thinp blocksize") that will initially set max_sectors to
> >>> max_hw_sectors and then rounddown to the first power-of-2 factor of the
> >>> DM thin-pool's blocksize.  Basically that commit assumes drivers don't
> >>> suck when establishing max_hw_sectors so it acted like a canary in the
> >>> coal mine.
> >>
> >> Is that a crash in the host or guest?  What kind of mishandling did you
> >> see?  Unless the recent virtio standard changed anything the host
> >> should be able to handle our arbitrary limits, and even if it doesn't
> >> that something we need to hash out with qemu and the virtio standards
> >> folks.
> > 
> > Some good news: this guest crash isn't an issue with recent kernels (so
> > upstream, fedora 20, RHEL7, etc aren't impacted -- Jens feel free to
> > drop my virtio_blk patch; even though some of it's limits are clearly
> > broken I'll defer to the virtio_blk developers on the best way forward
> > -- sorry for the noise!).
> > 
> > The BUG I saw only seems to impact RHEL6 kernels so far (note to self,
> > actually _test_ on upstream before reporting a crash against upstream!)
> > 
> > [root@RHEL-6 ~]# echo 1073741824 > /sys/block/vdc/queue/max_sectors_kb
> > [root@RHEL-6 ~]# lvs
> > 
> > Message from syslogd@RHEL-6 at Nov 21 15:32:15 ...
> >  kernel:Kernel panic - not syncing: Fatal exception
> > 
> > Here is the RHEL6 guest crash, just for full disclosure:
> > 
> > kernel BUG at fs/direct-io.c:696!
> > invalid opcode:  [#1] SMP
> > last sysfs file: /sys/devices/virtual/block/dm-4/dev
> > CPU 0
> > Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 ext2 
> > dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio libcrc32c dm_mirror 
> > dm_region_hash dm_log dm_mod microcode virtio_balloon i2c_piix4 i2c_core 
> > virtio_net ext4 jbd2 mbcache virtio_blk virtio_pci virtio_ring virtio 
> > pata_acpi ata_generic ata_piix [last unloaded: speedstep_lib]
> > 
> > Pid: 1679, comm: lvs Not tainted 2.6.32 #6 Bochs Bochs
> > RIP: 0010:[]  [] 
> > __blockdev_direct_IO_newtrunc+0x986/0x1270
> > RSP: 0018:88011a11ba48  EFLAGS: 00010287
> > RAX:  RBX: 8801192fbd28 RCX: 1000
> > RDX: ea0003b3d218 RSI: 88011aac4300 RDI: 880118572378
> > RBP: 88011a11bbe8 R08:  R09: 
> > R10:  R11:  R12: 8801192fbd00
> > R13:  R14: 880118c3cac0 R15: 
> > FS:  7fde78bc37a0() GS:88002820() knlGS:
> > CS:  0010 DS:  ES:  CR0: 80050033
> > CR2: 012706f0 CR3: 00011a432000 CR4: 000407f0
> > DR0:  DR1:  DR2: 
> > DR3:  DR6: 0ff0 DR7: 0400
> > Process lvs (pid: 1679, threadinfo 88011a11a000, task 8801185a4aa0)
> > Stack:
> >  88011a11bb48 88011a11baa8 8801000c 88011a11bb18
> >    88011a11bdc8 88011a11beb8
> >  000c1a11baa8 880118c3cb98  18c3ccb8
> > Call Trace:
> >  [] ? blkdev_get_block+0x0/0x20
> >  [] __blockdev_direct_IO+0x77/0xe0
> >  [] ? blkdev_get_block+0x0/0x20
> >  [] blkdev_direct_IO+0x57/0x60
> >  [] ? blkdev_get_block+0x0/0x20
> >  [] generic_file_aio_read+0x6bb/0x700
> >  [] ? blkdev_get+0x10/0x20
> >  [] ? blkdev_open+0x0/0xc0
> >  [] ? __dentry_open+0x23f/0x360
> >  [] blkdev_aio_read+0x51/0x80
> >  [] do_sync_read+0xfa/0x140
> >  [] ? autoremove_wake_function+0x0/0x40
> >  [] ? block_ioctl+0x3c/0x40
> >  [] ? vfs_ioctl+0x22/0xa0
> >  [] ? do_vfs_ioctl+0x84/0x580
> >  [] ? security_file_permission+0x16/0x20
> >  [] vfs_read+0xb5/0x1a0
> >  [] sys_read+0x51/0x90
> >  [] ? __audit_syscall_exit+0x25e/0x290
> >  [] system_call_fastpath+0x16/0x1b
> > Code: fe ff ff c7 85 fc fe ff ff 00 00 00 00 48 89 95 10 ff ff ff 8b 95 34 
> > ff ff ff e8 46 ac ff ff 3b 85 34 ff ff ff 0f 84 fc 02 00 00 <0f> 0b eb fe 
> > 8b 9d 34 ff ff ff 8b 85 30 ff ff ff 01 d8 85 c0 0f
> > RIP  [] __blockdev_direct_IO_newtrunc+0x986/0x1270
> >  RSP 
> > ---[ end trace 73be5dcaf8939399 ]---
> > Kernel panic - not syncing: Fatal exception
> 
> That code isn't even in mainline, as far as I can tell...

Right, it is old RHE

[Qemu-devel] [Bug 1395217] Re: Networking in qemu 2.0.0 and beyond is not compatible with Open Solaris (Illumos) 5.11

2014-11-26 Thread Tim Dawson
Much appreciated!  Please let me know if there is anything else I can do
to help this bug progress . . . .

- Tim

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1395217

Title:
  Networking in qemu 2.0.0 and beyond is not compatible with Open
  Solaris (Illumos) 5.11

Status in QEMU:
  New

Bug description:
  The networking code in qemu in versions 2.0.0 and beyond is non-
  functional with Solaris/Illumos 5.11 images.

  Building 1.7.1, 2.0.0, 2.0.2, 2.1.2,and 2.2.0rc1with the following
  standard Slackware config:

  # From Slackware build tree . . . 
  ./configure \
--prefix=/usr \
--libdir=/usr/lib64 \
--sysconfdir=/etc \
--localstatedir=/var \
--enable-gtk \
--enable-system \
--enable-kvm \
--disable-debug-info \
--enable-virtfs \
--enable-sdl \
--audio-drv-list=alsa,oss,sdl,esd \
--enable-libusb \
--disable-vnc \
--target-list=x86_64-linux-user,i386-linux-user,x86_64-softmmu,i386-softmmu 
\
--enable-spice \
--enable-usb-redir 

  
  And attempting to run the same VM image with the following command (or via 
virt-manager):

  macaddress="DE:AD:BE:EF:3F:A4"

  qemu-system-x86_64 nex4x -cdrom /dev/cdrom -name "Nex41" -cpu Westmere
  -machine accel=kvm -smp 2 -m 4000 -net nic,macaddr=$macaddress  -net 
bridge,br=b
  r0 -net dump,file=/usr1/tmp/ -drive file=nex4x_d1 -drive 
file=nex4x_d2
   -enable-kvm

  Gives success on 1.7.1, and a deaf VM on all subsequent versions.

  Notable in validating my config, is that a Windows 7 image runs
  cleanly with networking on *all* builds, so my configuration appears
  to be good - qemu just hates Solaris at this point.

  Watching with wireshark (as well as pulling network traces from qemu
  as noted above) it appears that the notable difference in the two
  configs is that for some reason, Solaris gets stuck arping for it's
  own interface on startup, and never really comes on line on the
  network.  If other hosts attempt to ping the Solaris instance, they
  can successfully arp the bad VM, but not the other way around.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1395217/+subscriptions



[Qemu-devel] [PATCH] qmp: extend QMP to provide read/write access to physical memory

2014-11-26 Thread Bryan D. Payne
This patch adds a new QMP command that sets up a domain socket. This
socket can then be used for fast read/write access to the guest's
physical memory. The key benefit to this system over existing solutions
is speed. Using this patch, guest memory can be copied out at a rate of
~200MB/sec, depending on the hardware. Existing solutions only achieve
a small fraction of this speed.

Signed-off-by: Bryan D. Payne 
---
 Makefile.target  |   2 +-
 memory-access.c  | 200 +++
 memory-access.h  |  11 +++
 monitor.c|   6 ++
 qapi-schema.json |  12 
 qmp-commands.hx  |  28 
 6 files changed, 258 insertions(+), 1 deletion(-)
 create mode 100644 memory-access.c
 create mode 100644 memory-access.h

diff --git a/Makefile.target b/Makefile.target
index e9ff1ee..4b3cd99 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -127,7 +127,7 @@ endif #CONFIG_BSD_USER
 # System emulator target
 ifdef CONFIG_SOFTMMU
 obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
-obj-y += qtest.o bootdevice.o
+obj-y += qtest.o bootdevice.o memory-access.o
 obj-y += hw/
 obj-$(CONFIG_FDT) += device_tree.o
 obj-$(CONFIG_KVM) += kvm-all.o
diff --git a/memory-access.c b/memory-access.c
new file mode 100644
index 000..f696d7b
--- /dev/null
+++ b/memory-access.c
@@ -0,0 +1,200 @@
+/*
+ * Access guest physical memory via a domain socket.
+ *
+ * Copyright (c) 2014 Bryan D. Payne (bdpa...@acm.org)
+ */
+
+#include "memory-access.h"
+#include "qemu-common.h"
+#include "exec/cpu-common.h"
+#include "config.h"
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+struct request {
+uint8_t type;  /* 0 quit, 1 read, 2 write, ... rest reserved */
+uint64_t address;  /* address to read from OR write to */
+uint64_t length;   /* number of bytes to read OR write */
+};
+
+static uint64_t
+connection_read_memory(uint64_t user_paddr, void *buf, uint64_t user_len)
+{
+hwaddr paddr = (hwaddr) user_paddr;
+hwaddr len = (hwaddr) user_len;
+void *guestmem = cpu_physical_memory_map(paddr, &len, 0);
+if (!guestmem) {
+return 0;
+}
+memcpy(buf, guestmem, len);
+cpu_physical_memory_unmap(guestmem, len, 0, len);
+
+return len;
+}
+
+static uint64_t
+connection_write_memory(uint64_t user_paddr,
+const void *buf,
+uint64_t user_len)
+{
+hwaddr paddr = (hwaddr) user_paddr;
+hwaddr len = (hwaddr) user_len;
+void *guestmem = cpu_physical_memory_map(paddr, &len, 1);
+if (!guestmem) {
+return 0;
+}
+memcpy(guestmem, buf, len);
+cpu_physical_memory_unmap(guestmem, len, 0, len);
+
+return len;
+}
+
+static void
+send_success_ack(int connection_fd)
+{
+uint8_t success = 1;
+int nbytes = write(connection_fd, &success, 1);
+if (nbytes != 1) {
+printf("QemuMemoryAccess: failed to send success ack\n");
+}
+}
+
+static void
+send_fail_ack(int connection_fd)
+{
+uint8_t fail = 0;
+int nbytes = write(connection_fd, &fail, 1);
+if (nbytes != 1) {
+printf("QemuMemoryAccess: failed to send fail ack\n");
+}
+}
+
+static void
+connection_handler(int connection_fd)
+{
+int nbytes;
+struct request req;
+
+while (1) {
+/* client request should match the struct request format */
+nbytes = read(connection_fd, &req, sizeof(struct request));
+if (nbytes != sizeof(struct request)) {
+/* error */
+send_fail_ack(connection_fd);
+continue;
+} else if (req.type == 0) {
+/* request to quit, goodbye */
+break;
+} else if (req.type == 1) {
+/* request to read */
+char *buf = g_malloc(req.length + 1);
+nbytes = connection_read_memory(req.address, buf, req.length);
+if (nbytes != req.length) {
+/* read failure, return failure message */
+buf[req.length] = 0; /* set last byte to 0 for failure */
+nbytes = write(connection_fd, buf, 1);
+} else {
+/* read success, return bytes */
+buf[req.length] = 1; /* set last byte to 1 for success */
+nbytes = write(connection_fd, buf, nbytes + 1);
+}
+g_free(buf);
+} else if (req.type == 2) {
+/* request to write */
+void *write_buf = g_malloc(req.length);
+nbytes = read(connection_fd, &write_buf, req.length);
+if (nbytes != req.length) {
+/* failed reading the message to write */
+send_fail_ack(connection_fd);
+} else{
+/* do the write */
+nbytes = connection_write_memory(req.address,
+ write_buf,
+ req.le

Re: [Qemu-devel] MinGW build

2014-11-26 Thread Stefan Weil
Am 26.11.2014 um 21:13 schrieb Peter Maydell:
> On 26 November 2014 at 19:55, Liviu Ionescu  wrote:
>> since I had some troubles to build QEMU on Windows, I compiled a page with 
>> some instructions:
>>
>> http://gnuarmeclipse.livius.net/wiki/How_to_build_QEMU#Windows
>>
>> the purpose was to generate a standalone executable, that requires no 
>> libraries.
>>
>>
>> although the procedure is fully functional, I had to apply a small patch, 
>> basically:
>>
>> #ifdef __MINGW32__
>> #if !defined(ffs)
>> #define ffs __builtin_ffs
>> #endif
>> #endif
>>
>> in several files.
> 
> Hmm. We have a workaround already for a similar thing in
> include/sysemu/os-win32.h,
> but that works by declaring a prototype rather than using a #define, and it's
> guarded by defined(_WIN64). I wonder if some of those workarounds in that file
> need to be guarded by more specific checks than just _WIN64...
> 
> Do you have any experience with mingw-w64? When I was doing Windows
> (cross-compiled) builds mingw-w64 seemed to be more actively maintained
> (even for 32 bit windows targets) than the original mingw project,
> and I think there was at least one issue where the fix was basically
> "don't try to use mingw, it's just too old, use mingw-w64 instead".
> 
> In any case, thanks very much for writing up the instructions -- we do
> get people wanting to know how to build on Windows from time to time,
> so it will be useful to be able to point them at instructions.
> 
> You might consider tweaking the instructions to recommend building in
> a subdirectory (ie 'mkdir -p build/windows; cd build/windows;
> ../../configure; make'). This is good if you're going to be building
> more than one configuration, and it also makes it easy to blow away
> an old build by removing the whole build tree (distclean is not always
> 100% reliable especially if you track head-of-git often).
> 
> -- PMM
> 


I was just writing nearly the same as Peter. So now I can only add one
remark: there is also a QEMU Wiki which has some rather old instructions
on building for Windows: http://wiki.qemu.org/Hosts/W32. Maybe you could
add updated / new instructions there.

Thanks
Stefan




[Qemu-devel] [PATCH v2] qmp: extend QMP to provide read/write access to physical memory

2014-11-26 Thread Bryan D. Payne
Thanks for the feedback Eric, I've updated the patch.

v2 changes:
- added QMP command contract to qapi-schema.json
- corrected some comments
- rewired QMP command to use schema code



[Qemu-devel] Not able to emulate host processor (2Sockets, 8Core/Socket, 2NUMA nodes) using host pass through

2014-11-26 Thread [J]
Hi, All,

I'm using qemu-system-x86_64 in stable-2.1 and trying to use cpu pass-through 
(-cpu host) to emulate my host processor architecture, which is Intel Xeon CPU 
E5-2670 2.6GHz, 16 cores (2Sockets, 8Cores/Socket, 2Numa nodes).
But it seems that the virtual machine is stuck at somewhere, can not be 
launched.



The command I used to launch virtual machine is as follows.
/usr/local/qemu-2.1/bin/qemu-system-x86_64 -enable-kvm -boot c -cpu host -smp 
16,sockets=2,cores=8,threads=1 -numa node,cpus=0-7 -numa node,cpus=8-15 -m 
24576 -hda /var/lib/libvirt/images/test1.img -net 
nic,macaddr=52-54-00-12-34-4,model=virtio -net tap,ifname=tap0,script=no 
-device pci-assign,host=02:01.0,id=hostdev0



However, if I use "-cpu SandyBridge" the virtual machine can be launched, but 
L3 cache size is is missing in VM. I tried qemu-1.7 earlier, it's also same 
issue.


Is there any idea on how to fix it? Any help would be appreciated!


The "lscpu" output of host cpu is as follows:
Architecture:  x86_64
CPU op-mode(s):32-bit, 64-bit
Byte Order:Little Endian
CPU(s):16
On-line CPU(s) list:   0-15
Thread(s) per core:1
Core(s) per socket:8
CPU socket(s): 2
NUMA node(s):  2
Vendor ID: GenuineIntel
CPU family:6
Model: 45
Stepping:  7
CPU MHz:   2600.038
BogoMIPS:  5199.26
Virtualization:VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache:  256K
L3 cache:  20480K
NUMA node0 CPU(s): 0-7
NUMA node1 CPU(s): 8-15

Re: [Qemu-devel] MinGW build

2014-11-26 Thread Peter Maydell
On 26 November 2014 at 19:55, Liviu Ionescu  wrote:
> since I had some troubles to build QEMU on Windows, I compiled a page with 
> some instructions:
>
> http://gnuarmeclipse.livius.net/wiki/How_to_build_QEMU#Windows
>
> the purpose was to generate a standalone executable, that requires no 
> libraries.
>
>
> although the procedure is fully functional, I had to apply a small patch, 
> basically:
>
> #ifdef __MINGW32__
> #if !defined(ffs)
> #define ffs __builtin_ffs
> #endif
> #endif
>
> in several files.

Hmm. We have a workaround already for a similar thing in
include/sysemu/os-win32.h,
but that works by declaring a prototype rather than using a #define, and it's
guarded by defined(_WIN64). I wonder if some of those workarounds in that file
need to be guarded by more specific checks than just _WIN64...

Do you have any experience with mingw-w64? When I was doing Windows
(cross-compiled) builds mingw-w64 seemed to be more actively maintained
(even for 32 bit windows targets) than the original mingw project,
and I think there was at least one issue where the fix was basically
"don't try to use mingw, it's just too old, use mingw-w64 instead".

In any case, thanks very much for writing up the instructions -- we do
get people wanting to know how to build on Windows from time to time,
so it will be useful to be able to point them at instructions.

You might consider tweaking the instructions to recommend building in
a subdirectory (ie 'mkdir -p build/windows; cd build/windows;
../../configure; make'). This is good if you're going to be building
more than one configuration, and it also makes it easy to blow away
an old build by removing the whole build tree (distclean is not always
100% reliable especially if you track head-of-git often).

-- PMM



[Qemu-devel] MinGW build

2014-11-26 Thread Liviu Ionescu
since I had some troubles to build QEMU on Windows, I compiled a page with some 
instructions:

http://gnuarmeclipse.livius.net/wiki/How_to_build_QEMU#Windows

the purpose was to generate a standalone executable, that requires no libraries.


although the procedure is fully functional, I had to apply a small patch, 
basically:

#ifdef __MINGW32__
#if !defined(ffs)
#define ffs __builtin_ffs
#endif
#endif

in several files.


- any suggestions on how to simplify the MinGW build procedure?

- anyone interested in this patch?


regards,

Liviu




Re: [Qemu-devel] virtio_blk: fix defaults for max_hw_sectors and max_segment_size

2014-11-26 Thread Jens Axboe
On 11/21/2014 08:49 AM, Mike Snitzer wrote:
> On Fri, Nov 21 2014 at  4:54am -0500,
> Christoph Hellwig  wrote:
> 
>> On Thu, Nov 20, 2014 at 02:00:59PM -0500, Mike Snitzer wrote:
>>> virtio_blk incorrectly established -1U as the default for these
>>> queue_limits.  Set these limits to sane default values to avoid crashing
>>> the kernel.  But the virtio-blk protocol should probably be extended to
>>> allow proper stacking of the disk's limits from the host.
>>>
>>> This change fixes a crash that was reported when virtio-blk was used to
>>> test linux-dm.git commit 604ea90641b4 ("dm thin: adjust max_sectors_kb
>>> based on thinp blocksize") that will initially set max_sectors to
>>> max_hw_sectors and then rounddown to the first power-of-2 factor of the
>>> DM thin-pool's blocksize.  Basically that commit assumes drivers don't
>>> suck when establishing max_hw_sectors so it acted like a canary in the
>>> coal mine.
>>
>> Is that a crash in the host or guest?  What kind of mishandling did you
>> see?  Unless the recent virtio standard changed anything the host
>> should be able to handle our arbitrary limits, and even if it doesn't
>> that something we need to hash out with qemu and the virtio standards
>> folks.
> 
> Some good news: this guest crash isn't an issue with recent kernels (so
> upstream, fedora 20, RHEL7, etc aren't impacted -- Jens feel free to
> drop my virtio_blk patch; even though some of it's limits are clearly
> broken I'll defer to the virtio_blk developers on the best way forward
> -- sorry for the noise!).
> 
> The BUG I saw only seems to impact RHEL6 kernels so far (note to self,
> actually _test_ on upstream before reporting a crash against upstream!)
> 
> [root@RHEL-6 ~]# echo 1073741824 > /sys/block/vdc/queue/max_sectors_kb
> [root@RHEL-6 ~]# lvs
> 
> Message from syslogd@RHEL-6 at Nov 21 15:32:15 ...
>  kernel:Kernel panic - not syncing: Fatal exception
> 
> Here is the RHEL6 guest crash, just for full disclosure:
> 
> kernel BUG at fs/direct-io.c:696!
> invalid opcode:  [#1] SMP
> last sysfs file: /sys/devices/virtual/block/dm-4/dev
> CPU 0
> Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl sunrpc ipv6 ext2 
> dm_thin_pool dm_bio_prison dm_persistent_data dm_bufio libcrc32c dm_mirror 
> dm_region_hash dm_log dm_mod microcode virtio_balloon i2c_piix4 i2c_core 
> virtio_net ext4 jbd2 mbcache virtio_blk virtio_pci virtio_ring virtio 
> pata_acpi ata_generic ata_piix [last unloaded: speedstep_lib]
> 
> Pid: 1679, comm: lvs Not tainted 2.6.32 #6 Bochs Bochs
> RIP: 0010:[]  [] 
> __blockdev_direct_IO_newtrunc+0x986/0x1270
> RSP: 0018:88011a11ba48  EFLAGS: 00010287
> RAX:  RBX: 8801192fbd28 RCX: 1000
> RDX: ea0003b3d218 RSI: 88011aac4300 RDI: 880118572378
> RBP: 88011a11bbe8 R08:  R09: 
> R10:  R11:  R12: 8801192fbd00
> R13:  R14: 880118c3cac0 R15: 
> FS:  7fde78bc37a0() GS:88002820() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: 012706f0 CR3: 00011a432000 CR4: 000407f0
> DR0:  DR1:  DR2: 
> DR3:  DR6: 0ff0 DR7: 0400
> Process lvs (pid: 1679, threadinfo 88011a11a000, task 8801185a4aa0)
> Stack:
>  88011a11bb48 88011a11baa8 8801000c 88011a11bb18
>    88011a11bdc8 88011a11beb8
>  000c1a11baa8 880118c3cb98  18c3ccb8
> Call Trace:
>  [] ? blkdev_get_block+0x0/0x20
>  [] __blockdev_direct_IO+0x77/0xe0
>  [] ? blkdev_get_block+0x0/0x20
>  [] blkdev_direct_IO+0x57/0x60
>  [] ? blkdev_get_block+0x0/0x20
>  [] generic_file_aio_read+0x6bb/0x700
>  [] ? blkdev_get+0x10/0x20
>  [] ? blkdev_open+0x0/0xc0
>  [] ? __dentry_open+0x23f/0x360
>  [] blkdev_aio_read+0x51/0x80
>  [] do_sync_read+0xfa/0x140
>  [] ? autoremove_wake_function+0x0/0x40
>  [] ? block_ioctl+0x3c/0x40
>  [] ? vfs_ioctl+0x22/0xa0
>  [] ? do_vfs_ioctl+0x84/0x580
>  [] ? security_file_permission+0x16/0x20
>  [] vfs_read+0xb5/0x1a0
>  [] sys_read+0x51/0x90
>  [] ? __audit_syscall_exit+0x25e/0x290
>  [] system_call_fastpath+0x16/0x1b
> Code: fe ff ff c7 85 fc fe ff ff 00 00 00 00 48 89 95 10 ff ff ff 8b 95 34 ff 
> ff ff e8 46 ac ff ff 3b 85 34 ff ff ff 0f 84 fc 02 00 00 <0f> 0b eb fe 8b 9d 
> 34 ff ff ff 8b 85 30 ff ff ff 01 d8 85 c0 0f
> RIP  [] __blockdev_direct_IO_newtrunc+0x986/0x1270
>  RSP 
> ---[ end trace 73be5dcaf8939399 ]---
> Kernel panic - not syncing: Fatal exception

That code isn't even in mainline, as far as I can tell...

-- 
Jens Axboe




[Qemu-devel] [Bug 1395217] Re: Networking in qemu 2.0.0 and beyond is not compatible with Open Solaris (Illumos) 5.11

2014-11-26 Thread Eduardo Habkost
So, if it breaks even with -cpu SandyBridge and -cpu host, it is likely
to be a KVM or QEMU bug. Thanks for the testing!

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1395217

Title:
  Networking in qemu 2.0.0 and beyond is not compatible with Open
  Solaris (Illumos) 5.11

Status in QEMU:
  New

Bug description:
  The networking code in qemu in versions 2.0.0 and beyond is non-
  functional with Solaris/Illumos 5.11 images.

  Building 1.7.1, 2.0.0, 2.0.2, 2.1.2,and 2.2.0rc1with the following
  standard Slackware config:

  # From Slackware build tree . . . 
  ./configure \
--prefix=/usr \
--libdir=/usr/lib64 \
--sysconfdir=/etc \
--localstatedir=/var \
--enable-gtk \
--enable-system \
--enable-kvm \
--disable-debug-info \
--enable-virtfs \
--enable-sdl \
--audio-drv-list=alsa,oss,sdl,esd \
--enable-libusb \
--disable-vnc \
--target-list=x86_64-linux-user,i386-linux-user,x86_64-softmmu,i386-softmmu 
\
--enable-spice \
--enable-usb-redir 

  
  And attempting to run the same VM image with the following command (or via 
virt-manager):

  macaddress="DE:AD:BE:EF:3F:A4"

  qemu-system-x86_64 nex4x -cdrom /dev/cdrom -name "Nex41" -cpu Westmere
  -machine accel=kvm -smp 2 -m 4000 -net nic,macaddr=$macaddress  -net 
bridge,br=b
  r0 -net dump,file=/usr1/tmp/ -drive file=nex4x_d1 -drive 
file=nex4x_d2
   -enable-kvm

  Gives success on 1.7.1, and a deaf VM on all subsequent versions.

  Notable in validating my config, is that a Windows 7 image runs
  cleanly with networking on *all* builds, so my configuration appears
  to be good - qemu just hates Solaris at this point.

  Watching with wireshark (as well as pulling network traces from qemu
  as noted above) it appears that the notable difference in the two
  configs is that for some reason, Solaris gets stuck arping for it's
  own interface on startup, and never really comes on line on the
  network.  If other hosts attempt to ping the Solaris instance, they
  can successfully arp the bad VM, but not the other way around.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1395217/+subscriptions



Re: [Qemu-devel] [PATCH] hmp: fix regression of HMP device_del auto-completion

2014-11-26 Thread Marcel Apfelbaum
On Wed, 2014-11-26 at 13:05 -0500, Luiz Capitulino wrote:
> On Wed, 26 Nov 2014 13:50:01 +0200
> Marcel Apfelbaum  wrote:
> 
> > The commits:
> >  - 6a1fa9f5 (monitor: add del completion for peripheral device)
> >  - 66e56b13 (qdev: add qdev_build_hotpluggable_device_list helper)
> > 
> > cause a QEMU crash when trying to use HMP device_del auto-completion.
> > It can be easily reproduced by:
> >  -enable-kvm  ~/images/fedora.qcow2 -monitor stdio -device 
> > virtio-net-pci,id=vnet
> > 
> > (qemu) device_del
> > 
> > /home/mapfelba/git/upstream/qemu/hw/core/qdev.c:941:qdev_build_hotpluggable_device_list:
> >  Object 0x7f6ce04e4fe0 is not an instance of type device
> > Aborted (core dumped)
> > 
> > The root cause is qdev_build_hotpluggable_device_list going recursively over
> > all peripherals and their children assuming all are devices. It doesn't work
> > since PCI devices have at least on child which is a memory region (bus 
> > master).
> > 
> > Solved by observing that all devices appear as direct children of
> > /machine/peripheral container. No need of going recursively
> > over all the children.
> > 
> > Signed-off-by: Marcel Apfelbaum 
> 
> Peter, can you apply this patch directly to master to avoid me a pull
> request? Maybe it's a good idea to wait until tomorrow for more
> reviewers though.
Yes, another review would be welcomed, especially a QOM reviewer.
 
> 
> Marcel, thanks a lot for taking care of this!
No problem, glad to do it.

I forgot to add:
Reported-by: Gal Hammer 

Thanks,
Marcel
> 
> > ---
> >  hw/core/qdev.c | 12 ++--
> >  include/hw/qdev-core.h |  2 +-
> >  monitor.c  | 11 ---
> >  3 files changed, 15 insertions(+), 10 deletions(-)
> > 
> > diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> > index 413b413..35fd00d 100644
> > --- a/hw/core/qdev.c
> > +++ b/hw/core/qdev.c
> > @@ -935,7 +935,7 @@ void qdev_alias_all_properties(DeviceState *target, 
> > Object *source)
> >  } while (class != object_class_by_name(TYPE_DEVICE));
> >  }
> >  
> > -int qdev_build_hotpluggable_device_list(Object *obj, void *opaque)
> > +static int qdev_add_hotpluggable_device(Object *obj, void *opaque)
> >  {
> >  GSList **list = opaque;
> >  DeviceState *dev = DEVICE(obj);
> > @@ -944,10 +944,18 @@ int qdev_build_hotpluggable_device_list(Object *obj, 
> > void *opaque)
> >  *list = g_slist_append(*list, dev);
> >  }
> >  
> > -object_child_foreach(obj, qdev_build_hotpluggable_device_list, opaque);
> >  return 0;
> >  }
> >  
> > +GSList *qdev_build_hotpluggable_device_list(Object *peripheral)
> > +{
> > +GSList *list = NULL;
> > +
> > +object_child_foreach(peripheral, qdev_add_hotpluggable_device, &list);
> > +
> > +return list;
> > +}
> > +
> >  static bool device_get_realized(Object *obj, Error **errp)
> >  {
> >  DeviceState *dev = DEVICE(obj);
> > diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> > index d3a2940..589bbe7 100644
> > --- a/include/hw/qdev-core.h
> > +++ b/include/hw/qdev-core.h
> > @@ -365,7 +365,7 @@ extern int qdev_hotplug;
> >  
> >  char *qdev_get_dev_path(DeviceState *dev);
> >  
> > -int qdev_build_hotpluggable_device_list(Object *obj, void *opaque);
> > +GSList *qdev_build_hotpluggable_device_list(Object *peripheral);
> >  
> >  void qbus_set_hotplug_handler(BusState *bus, DeviceState *handler,
> >Error **errp);
> > diff --git a/monitor.c b/monitor.c
> > index fa00594..f1031a1 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -4321,17 +4321,14 @@ void object_add_completion(ReadLineState *rs, int 
> > nb_args, const char *str)
> >  static void peripheral_device_del_completion(ReadLineState *rs,
> >   const char *str, size_t len)
> >  {
> > -Object *peripheral;
> > -GSList *list = NULL, *item;
> > +Object *peripheral = container_get(qdev_get_machine(), "/peripheral");
> > +GSList *list, *item;
> >  
> > -peripheral = object_resolve_path("/machine/peripheral/", NULL);
> > -if (peripheral == NULL) {
> > +list = qdev_build_hotpluggable_device_list(peripheral);
> > +if (!list) {
> >  return;
> >  }
> >  
> > -object_child_foreach(peripheral, qdev_build_hotpluggable_device_list,
> > - &list);
> > -
> >  for (item = list; item; item = g_slist_next(item)) {
> >  DeviceState *dev = item->data;
> >  
> 






Re: [Qemu-devel] TCG Multithreading performance improvement

2014-11-26 Thread Mark Burton

> On 26 Nov 2014, at 20:12, Lluís Vilanova  wrote:
> 
> Mark Burton writes:
> 
>> Hi all,
>> We are now actively going to pursue TCG Multithreading to improve the
>> performance of the TCG for Qemu models that include multiple cores.
> 
>> We have set up a wiki page to track the project 
>> http://wiki.qemu.org/Features/tcg-multithread
> 
>> At this point, I would like to invite everybody to email us ideas about how 
>> the
>> project should progress, and ideas that might be useful (e.g. people who have
>> tried this before, source code that might be helpful, what order we should
>> attack things in etc)…
> 
>> So - PLEASE let us know if you have interest in this topic, or information 
>> that
>> might help
> 
> This is great news! I've added a new subsection on the wiki to hold some
> problems that should be kept in mind while designing a solution.
> 
> For now, I've added memory consistency and instruction atomicity issues in it.

Thanks !  - Those are the two big issues I think :-)))


Cheers

Mark.

> 
> 
> Best,
>  Lluis
> 
> -- 
> "And it's much the same thing with knowledge, for whenever you learn
> something new, the whole world becomes that much richer."
> -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
> Tollbooth


 +44 (0)20 7100 3485 x 210
 +33 (0)5 33 52 01 77x 210

+33 (0)603762104
mark.burton




Re: [Qemu-devel] TCG Multithreading performance improvement

2014-11-26 Thread Lluís Vilanova
Mark Burton writes:

> Hi all,
> We are now actively going to pursue TCG Multithreading to improve the
> performance of the TCG for Qemu models that include multiple cores.

> We have set up a wiki page to track the project 
> http://wiki.qemu.org/Features/tcg-multithread

> At this point, I would like to invite everybody to email us ideas about how 
> the
> project should progress, and ideas that might be useful (e.g. people who have
> tried this before, source code that might be helpful, what order we should
> attack things in etc)…

> So - PLEASE let us know if you have interest in this topic, or information 
> that
> might help

This is great news! I've added a new subsection on the wiki to hold some
problems that should be kept in mind while designing a solution.

For now, I've added memory consistency and instruction atomicity issues in it.


Best,
  Lluis

-- 
 "And it's much the same thing with knowledge, for whenever you learn
 something new, the whole world becomes that much richer."
 -- The Princess of Pure Reason, as told by Norton Juster in The Phantom
 Tollbooth



[Qemu-devel] [Bug 1395217] Re: Networking in qemu 2.0.0 and beyond is not compatible with Open Solaris (Illumos) 5.11

2014-11-26 Thread Tim Dawson
Note that this Illumos image is certified/runs cleanly on Intel hardware
from the last 5 years when natively on it.  I doubt that it is a kernel
problem with Illumos with regard to the actual CPU architecture.  Older
releases that are OpenSolaris based also see the problem.

Generally speaking, I don't think that an issue of this nature has ever
been seen with this OS image on any Intel or AMD CPU ever tested . . .
so unless there is something in Illumos that is only triggered by qemu,
I find it hard to imagine it being an Illumos bug, but then again, it's
not like oddities like this never happen . . .

And thanks for all the quick attention! If nothing else, it got me to a
point whereby I can work around the problem, and not be stuck on older
builds that virt-manager hates . . . .

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1395217

Title:
  Networking in qemu 2.0.0 and beyond is not compatible with Open
  Solaris (Illumos) 5.11

Status in QEMU:
  New

Bug description:
  The networking code in qemu in versions 2.0.0 and beyond is non-
  functional with Solaris/Illumos 5.11 images.

  Building 1.7.1, 2.0.0, 2.0.2, 2.1.2,and 2.2.0rc1with the following
  standard Slackware config:

  # From Slackware build tree . . . 
  ./configure \
--prefix=/usr \
--libdir=/usr/lib64 \
--sysconfdir=/etc \
--localstatedir=/var \
--enable-gtk \
--enable-system \
--enable-kvm \
--disable-debug-info \
--enable-virtfs \
--enable-sdl \
--audio-drv-list=alsa,oss,sdl,esd \
--enable-libusb \
--disable-vnc \
--target-list=x86_64-linux-user,i386-linux-user,x86_64-softmmu,i386-softmmu 
\
--enable-spice \
--enable-usb-redir 

  
  And attempting to run the same VM image with the following command (or via 
virt-manager):

  macaddress="DE:AD:BE:EF:3F:A4"

  qemu-system-x86_64 nex4x -cdrom /dev/cdrom -name "Nex41" -cpu Westmere
  -machine accel=kvm -smp 2 -m 4000 -net nic,macaddr=$macaddress  -net 
bridge,br=b
  r0 -net dump,file=/usr1/tmp/ -drive file=nex4x_d1 -drive 
file=nex4x_d2
   -enable-kvm

  Gives success on 1.7.1, and a deaf VM on all subsequent versions.

  Notable in validating my config, is that a Windows 7 image runs
  cleanly with networking on *all* builds, so my configuration appears
  to be good - qemu just hates Solaris at this point.

  Watching with wireshark (as well as pulling network traces from qemu
  as noted above) it appears that the notable difference in the two
  configs is that for some reason, Solaris gets stuck arping for it's
  own interface on startup, and never really comes on line on the
  network.  If other hosts attempt to ping the Solaris instance, they
  can successfully arp the bad VM, but not the other way around.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1395217/+subscriptions



[Qemu-devel] [Bug 1395217] Re: Networking in qemu 2.0.0 and beyond is not compatible with Open Solaris (Illumos) 5.11

2014-11-26 Thread Tim Dawson
(Wow . . . that last was incredibly redundant . . . staying up most of
the night working on this has apparently left me a bit stupid this
morning/afternoon . . . sorry!)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1395217

Title:
  Networking in qemu 2.0.0 and beyond is not compatible with Open
  Solaris (Illumos) 5.11

Status in QEMU:
  New

Bug description:
  The networking code in qemu in versions 2.0.0 and beyond is non-
  functional with Solaris/Illumos 5.11 images.

  Building 1.7.1, 2.0.0, 2.0.2, 2.1.2,and 2.2.0rc1with the following
  standard Slackware config:

  # From Slackware build tree . . . 
  ./configure \
--prefix=/usr \
--libdir=/usr/lib64 \
--sysconfdir=/etc \
--localstatedir=/var \
--enable-gtk \
--enable-system \
--enable-kvm \
--disable-debug-info \
--enable-virtfs \
--enable-sdl \
--audio-drv-list=alsa,oss,sdl,esd \
--enable-libusb \
--disable-vnc \
--target-list=x86_64-linux-user,i386-linux-user,x86_64-softmmu,i386-softmmu 
\
--enable-spice \
--enable-usb-redir 

  
  And attempting to run the same VM image with the following command (or via 
virt-manager):

  macaddress="DE:AD:BE:EF:3F:A4"

  qemu-system-x86_64 nex4x -cdrom /dev/cdrom -name "Nex41" -cpu Westmere
  -machine accel=kvm -smp 2 -m 4000 -net nic,macaddr=$macaddress  -net 
bridge,br=b
  r0 -net dump,file=/usr1/tmp/ -drive file=nex4x_d1 -drive 
file=nex4x_d2
   -enable-kvm

  Gives success on 1.7.1, and a deaf VM on all subsequent versions.

  Notable in validating my config, is that a Windows 7 image runs
  cleanly with networking on *all* builds, so my configuration appears
  to be good - qemu just hates Solaris at this point.

  Watching with wireshark (as well as pulling network traces from qemu
  as noted above) it appears that the notable difference in the two
  configs is that for some reason, Solaris gets stuck arping for it's
  own interface on startup, and never really comes on line on the
  network.  If other hosts attempt to ping the Solaris instance, they
  can successfully arp the bad VM, but not the other way around.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1395217/+subscriptions



Re: [Qemu-devel] [PATCH] hmp: fix regression of HMP device_del auto-completion

2014-11-26 Thread Peter Maydell
On 26 November 2014 at 18:05, Luiz Capitulino  wrote:
> Peter, can you apply this patch directly to master to avoid me a pull
> request? Maybe it's a good idea to wait until tomorrow for more
> reviewers though.

Sure. I'll apply it to master some time tomorrow afternoon I expect.

-- PMM



[Qemu-devel] [Bug 1395217] Re: Networking in qemu 2.0.0 and beyond is not compatible with Open Solaris (Illumos) 5.11

2014-11-26 Thread Tim Dawson
Broadwell - Fails, Host won't support it:

warning: host doesn't support requested feature: CPUID.01H:ECX.fma [bit 12]
warning: host doesn't support requested feature: CPUID.01H:ECX.movbe [bit 22]
warning: host doesn't support requested feature: CPUID.07H:EBX.fsgsbase [bit 0]
warning: host doesn't support requested feature: CPUID.07H:EBX.bmi1 [bit 3]
warning: host doesn't support requested feature: CPUID.07H:EBX.hle [bit 4]
warning: host doesn't support requested feature: CPUID.07H:EBX.avx2 [bit 5]
warning: host doesn't support requested feature: CPUID.07H:EBX.smep [bit 7]
warning: host doesn't support requested feature: CPUID.07H:EBX.bmi2 [bit 8]
warning: host doesn't support requested feature: CPUID.07H:EBX.erms [bit 9]
warning: host doesn't support requested feature: CPUID.07H:EBX.invpcid [bit 10]
warning: host doesn't support requested feature: CPUID.07H:EBX.rtm [bit 11]
warning: host doesn't support requested feature: CPUID.07H:EBX.rdseed [bit 18]
warning: host doesn't support requested feature: CPUID.07H:EBX.adx [bit 19]
warning: host doesn't support requested feature: CPUID.07H:EBX.smap [bit 20]
warning: host doesn't support requested feature: 
CPUID.8001H:ECX.3dnowprefetch [bit 8]
qemu-system-x86_64: Host doesn't support requested features

Haswell fails, host won't support it:

warning: host doesn't support requested feature: CPUID.01H:ECX.fma [bit 12]
warning: host doesn't support requested feature: CPUID.01H:ECX.movbe [bit 22]
warning: host doesn't support requested feature: CPUID.07H:EBX.fsgsbase [bit 0]
warning: host doesn't support requested feature: CPUID.07H:EBX.bmi1 [bit 3]
warning: host doesn't support requested feature: CPUID.07H:EBX.hle [bit 4]
warning: host doesn't support requested feature: CPUID.07H:EBX.avx2 [bit 5]
warning: host doesn't support requested feature: CPUID.07H:EBX.smep [bit 7]
warning: host doesn't support requested feature: CPUID.07H:EBX.bmi2 [bit 8]
warning: host doesn't support requested feature: CPUID.07H:EBX.erms [bit 9]
warning: host doesn't support requested feature: CPUID.07H:EBX.invpcid [bit 10]
warning: host doesn't support requested feature: CPUID.07H:EBX.rtm [bit 11]
qemu-system-x86_64: Host doesn't support requested features


SandyBridge (this is the test box physical CPU) fails, no errors, networking 
dead, as per initial problem.

Westmere fails, no networking.

Nehalem fails, no networking

Panryn fails, no networking

Conroe fails, no networking

host fails, no networking

Just to ensure that all else was good, I tested SandyBridge, Westmere,
Conroe, and host with "-x2apic" and every one works with x2apic
disabled.

This test box is a laptop, and I am only testing on it since I am away
from my primary server (Dell 2950) for the holiday.  Both Intel, but not
even close to the same CPU . . . same problem observed on both, although
workaround not tested yet on primary.


Test box (for this data) CPU into:

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 42
model name  : Intel(R) Core(TM) i5-2520M CPU @ 2.50GHz
stepping: 7
microcode   : 0x25
cpu MHz : 1200.000
cache size  : 3072 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 2
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm c
onstant_tsc arch_perfmon pebs bts nopl xtopology nonstop_tsc aperfmperf eagerfpu
 pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid s
se4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx lahf_lm ida arat epb
 xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid
bogomips: 4984.29
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

(Repeats for 4 cores)



Primary system:

processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 15
model name  : Intel(R) Xeon(R) CPU   E5345  @ 2.33GHz
stepping: 7
microcode   : 0x6b
cpu MHz : 2000.000
cache size  : 4096 KB
physical id : 0
siblings: 4
core id : 0
cpu cores   : 4
apicid  : 0
initial apicid  : 0
fpu : yes
fpu_exception   : yes
cpuid level : 10
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx lm constant
_tsc arch_perfmon pebs bts rep_good nopl aperfmperf pni dtes64 monitor ds_cpl vm
x est tm2 ssse3 cx16 xtpr pdcm dca lahf_lm dtherm tpr_shadow
bogomips: 4655.23
clflush size: 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

(Repeats for 8 cores)

-- 
You received this bug notification because you are 

Re: [Qemu-devel] [PATCH RFC v3 05/12] virtio: introduce legacy virtio devices

2014-11-26 Thread Michael S. Tsirkin
On Wed, Nov 26, 2014 at 07:46:38PM +0100, Greg Kurz wrote:
> On Wed, 26 Nov 2014 18:28:36 +0100
> Cornelia Huck  wrote:
> 
> > Introduce a helper function to indicate  whether a virtio device is
> > operating in legacy or virtio standard mode.
> > 
> > It may be used to make decisions about the endianess of virtio accesses
> > and other virtio-1 specific changes, enabling us to support transitional
> > devices.
> > 
> > Reviewed-by: Thomas Huth 
> > Signed-off-by: Cornelia Huck 
> > ---
> >  hw/virtio/virtio.c|6 +-
> >  include/hw/virtio/virtio-access.h |4 
> >  include/hw/virtio/virtio.h|   13 +++--
> >  3 files changed, 20 insertions(+), 3 deletions(-)
> > 
> > diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> > index 2eb5d3c..4149f45 100644
> > --- a/hw/virtio/virtio.c
> > +++ b/hw/virtio/virtio.c
> > @@ -883,7 +883,11 @@ static bool virtio_device_endian_needed(void *opaque)
> >  VirtIODevice *vdev = opaque;
> > 
> >  assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> > -return vdev->device_endian != virtio_default_endian();
> > +if (virtio_device_is_legacy(vdev)) {
> > +return vdev->device_endian != virtio_default_endian();
> > +}
> > +/* Devices conforming to VIRTIO 1.0 or later are always LE. */
> > +return vdev->device_endian != VIRTIO_DEVICE_ENDIAN_LITTLE;
> >  }
> > 
> 
> Sorry but I still don't understand why we need to stream the device_endian
> subsection if we have a virtio-1 device... this field is only used on
> legacy device paths. Can you share an example where it is needed ?

I think it's needed.
A transitional device can be used with legacy native endian and
modern little endian format.


> >  static const VMStateDescription vmstate_virtio_device_endian = {
> > diff --git a/include/hw/virtio/virtio-access.h 
> > b/include/hw/virtio/virtio-access.h
> > index 46456fd..c123ee0 100644
> > --- a/include/hw/virtio/virtio-access.h
> > +++ b/include/hw/virtio/virtio-access.h
> > @@ -19,6 +19,10 @@
> > 
> >  static inline bool virtio_access_is_big_endian(VirtIODevice *vdev)
> >  {
> > +if (!virtio_device_is_legacy(vdev)) {
> > +/* Devices conforming to VIRTIO 1.0 or later are always LE. */
> > +return false;
> > +}
> >  #if defined(TARGET_IS_BIENDIAN)
> >  return virtio_is_big_endian(vdev);
> >  #elif defined(TARGET_WORDS_BIGENDIAN)
> > diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> > index b408166..40e567c 100644
> > --- a/include/hw/virtio/virtio.h
> > +++ b/include/hw/virtio/virtio.h
> > @@ -275,9 +275,18 @@ void 
> > virtio_queue_set_host_notifier_fd_handler(VirtQueue *vq, bool assign,
> >  void virtio_queue_notify_vq(VirtQueue *vq);
> >  void virtio_irq(VirtQueue *vq);
> > 
> > +static inline bool virtio_device_is_legacy(VirtIODevice *vdev)
> > +{
> > +return !(vdev->guest_features[1] & (1 << (VIRTIO_F_VERSION_1 - 32)));
> > +}
> > +
> >  static inline bool virtio_is_big_endian(VirtIODevice *vdev)
> >  {
> > -assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> > -return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
> > +if (virtio_device_is_legacy(vdev)) {
> > +assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> > +return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
> > +}
> > +/* Devices conforming to VIRTIO 1.0 or later are always LE. */
> > +return false;
> >  }
> >  #endif



Re: [Qemu-devel] [PATCH RFC v3 11/12] virtio-net/virtio-blk: enable virtio 1.0

2014-11-26 Thread Michael S. Tsirkin
On Wed, Nov 26, 2014 at 06:28:42PM +0100, Cornelia Huck wrote:
> virtio-net (non-vhost) and virtio-blk have everything in place to support
> virtio 1.0: let's enable the feature bit for them.

Hmm I doubt that.
At least not without more patches.

For block, scsi, wce and config-wce must be off, and
it must support ANY_LAYOUT (which might be a bit more code).

For net, header size is different when mergeable bufs are off,
and mac is read-only.



> Note that VIRTIO_F_VERSION_1 is technically a transport feature; once
> every device is ready for virtio 1.0, we can move this setting this
> feature bit out of the individual devices.
> 
> Signed-off-by: Cornelia Huck 
> ---
>  hw/block/virtio-blk.c |4 
>  hw/net/virtio-net.c   |4 
>  2 files changed, 8 insertions(+)
> 
> diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
> index 6d86f60..3781f98 100644
> --- a/hw/block/virtio-blk.c
> +++ b/hw/block/virtio-blk.c
> @@ -569,6 +569,10 @@ static uint32_t virtio_blk_get_features(VirtIODevice 
> *vdev, unsigned int index,
>  {
>  VirtIOBlock *s = VIRTIO_BLK(vdev);
>  
> +if (index == 1) {
> +features |= (1 << (VIRTIO_F_VERSION_1 - 32));
> +}
> +
>  if (index > 0) {
>  return features;
>  }
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index 1e214b5..fcfc95f 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -447,6 +447,10 @@ static uint32_t virtio_net_get_features(VirtIODevice 
> *vdev, unsigned int index,
>  VirtIONet *n = VIRTIO_NET(vdev);
>  NetClientState *nc = qemu_get_queue(n->nic);
>  
> +if (index == 1 && !get_vhost_net(nc->peer)) {
> +features |= (1 << (VIRTIO_F_VERSION_1 - 32));
> +}
> +
>  if (index > 0) {
>  return features;
>  }
> -- 
> 1.7.9.5



Re: [Qemu-devel] [PATCH RFC v3 05/12] virtio: introduce legacy virtio devices

2014-11-26 Thread Greg Kurz
On Wed, 26 Nov 2014 18:28:36 +0100
Cornelia Huck  wrote:

> Introduce a helper function to indicate  whether a virtio device is
> operating in legacy or virtio standard mode.
> 
> It may be used to make decisions about the endianess of virtio accesses
> and other virtio-1 specific changes, enabling us to support transitional
> devices.
> 
> Reviewed-by: Thomas Huth 
> Signed-off-by: Cornelia Huck 
> ---
>  hw/virtio/virtio.c|6 +-
>  include/hw/virtio/virtio-access.h |4 
>  include/hw/virtio/virtio.h|   13 +++--
>  3 files changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 2eb5d3c..4149f45 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -883,7 +883,11 @@ static bool virtio_device_endian_needed(void *opaque)
>  VirtIODevice *vdev = opaque;
> 
>  assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> -return vdev->device_endian != virtio_default_endian();
> +if (virtio_device_is_legacy(vdev)) {
> +return vdev->device_endian != virtio_default_endian();
> +}
> +/* Devices conforming to VIRTIO 1.0 or later are always LE. */
> +return vdev->device_endian != VIRTIO_DEVICE_ENDIAN_LITTLE;
>  }
> 

Sorry but I still don't understand why we need to stream the device_endian
subsection if we have a virtio-1 device... this field is only used on
legacy device paths. Can you share an example where it is needed ?

>  static const VMStateDescription vmstate_virtio_device_endian = {
> diff --git a/include/hw/virtio/virtio-access.h 
> b/include/hw/virtio/virtio-access.h
> index 46456fd..c123ee0 100644
> --- a/include/hw/virtio/virtio-access.h
> +++ b/include/hw/virtio/virtio-access.h
> @@ -19,6 +19,10 @@
> 
>  static inline bool virtio_access_is_big_endian(VirtIODevice *vdev)
>  {
> +if (!virtio_device_is_legacy(vdev)) {
> +/* Devices conforming to VIRTIO 1.0 or later are always LE. */
> +return false;
> +}
>  #if defined(TARGET_IS_BIENDIAN)
>  return virtio_is_big_endian(vdev);
>  #elif defined(TARGET_WORDS_BIGENDIAN)
> diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
> index b408166..40e567c 100644
> --- a/include/hw/virtio/virtio.h
> +++ b/include/hw/virtio/virtio.h
> @@ -275,9 +275,18 @@ void virtio_queue_set_host_notifier_fd_handler(VirtQueue 
> *vq, bool assign,
>  void virtio_queue_notify_vq(VirtQueue *vq);
>  void virtio_irq(VirtQueue *vq);
> 
> +static inline bool virtio_device_is_legacy(VirtIODevice *vdev)
> +{
> +return !(vdev->guest_features[1] & (1 << (VIRTIO_F_VERSION_1 - 32)));
> +}
> +
>  static inline bool virtio_is_big_endian(VirtIODevice *vdev)
>  {
> -assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> -return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
> +if (virtio_device_is_legacy(vdev)) {
> +assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
> +return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
> +}
> +/* Devices conforming to VIRTIO 1.0 or later are always LE. */
> +return false;
>  }
>  #endif




Re: [Qemu-devel] [Xen-devel] [PATCH] increase maxmem before calling xc_domain_populate_physmap

2014-11-26 Thread Stefano Stabellini
On Tue, 25 Nov 2014, Andrew Cooper wrote:
> On 25/11/14 17:45, Stefano Stabellini wrote:
> > Increase maxmem before calling xc_domain_populate_physmap_exact to avoid
> > the risk of running out of guest memory. This way we can also avoid
> > complex memory calculations in libxl at domain construction time.
> >
> > This patch fixes an abort() when assigning more than 4 NICs to a VM.
> >
> > Signed-off-by: Stefano Stabellini 
> >
> > diff --git a/xen-hvm.c b/xen-hvm.c
> > index 5c69a8d..38e08c3 100644
> > --- a/xen-hvm.c
> > +++ b/xen-hvm.c
> > @@ -218,6 +218,7 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t 
> > size, MemoryRegion *mr)
> >  unsigned long nr_pfn;
> >  xen_pfn_t *pfn_list;
> >  int i;
> > +xc_dominfo_t info;
> >  
> >  if (runstate_check(RUN_STATE_INMIGRATE)) {
> >  /* RAM already populated in Xen */
> > @@ -240,6 +241,13 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t 
> > size, MemoryRegion *mr)
> >  pfn_list[i] = (ram_addr >> TARGET_PAGE_BITS) + i;
> >  }
> >  
> > +if (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) < 0) {
> 
> xc_domain_getinfo()'s interface is mad, and provides no guarantee that
> it returns the information for the domain you requested.  It also won't
> return -1 on error.  The correct error handing is:
> 
> (xc_domain_getinfo(xen_xc, xen_domid, 1, &info) != 1) || (info.domid !=
> xen_domid)

It might be wiser to switch to xc_domain_getinfolist


> ~Andrew
> 
> > +hw_error("xc_domain_getinfo failed");
> > +}
> > +if (xc_domain_setmaxmem(xen_xc, xen_domid, info.max_memkb +
> > +(nr_pfn * XC_PAGE_SIZE / 1024)) < 0) {
> > +hw_error("xc_domain_setmaxmem failed");
> > +}
> >  if (xc_domain_populate_physmap_exact(xen_xc, xen_domid, nr_pfn, 0, 0, 
> > pfn_list)) {
> >  hw_error("xen: failed to populate ram at " RAM_ADDR_FMT, ram_addr);
> >  }
> >
> > ___
> > Xen-devel mailing list
> > xen-de...@lists.xen.org
> > http://lists.xen.org/xen-devel
> 
> 



Re: [Qemu-devel] [PATCH RFC v3 07/12] dataplane: allow virtio-1 devices

2014-11-26 Thread Greg Kurz
On Wed, 26 Nov 2014 18:28:38 +0100
Cornelia Huck  wrote:

> Handle endianness conversion for virtio-1 virtqueues correctly.
> 
> Note that dataplane now needs to be built per-target.
> 
> Signed-off-by: Cornelia Huck 

Build is ok now.

Acked-by: Greg Kurz 

> ---
>  hw/block/dataplane/virtio-blk.c   |4 +-
>  hw/scsi/virtio-scsi-dataplane.c   |2 +-
>  hw/virtio/Makefile.objs   |2 +-
>  hw/virtio/dataplane/Makefile.objs |2 +-
>  hw/virtio/dataplane/vring.c   |   86 
> ++---
>  include/hw/virtio/dataplane/vring-accessors.h |   75 +
>  include/hw/virtio/dataplane/vring.h   |   14 +---
>  7 files changed, 131 insertions(+), 54 deletions(-)
>  create mode 100644 include/hw/virtio/dataplane/vring-accessors.h
> 
> diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
> index 1222a37..2d8cc15 100644
> --- a/hw/block/dataplane/virtio-blk.c
> +++ b/hw/block/dataplane/virtio-blk.c
> @@ -16,7 +16,9 @@
>  #include "qemu/iov.h"
>  #include "qemu/thread.h"
>  #include "qemu/error-report.h"
> +#include "hw/virtio/virtio-access.h"
>  #include "hw/virtio/dataplane/vring.h"
> +#include "hw/virtio/dataplane/vring-accessors.h"
>  #include "sysemu/block-backend.h"
>  #include "hw/virtio/virtio-blk.h"
>  #include "virtio-blk.h"
> @@ -75,7 +77,7 @@ static void complete_request_vring(VirtIOBlockReq *req, 
> unsigned char status)
>  VirtIOBlockDataPlane *s = req->dev->dataplane;
>  stb_p(&req->in->status, status);
> 
> -vring_push(&req->dev->dataplane->vring, &req->elem,
> +vring_push(s->vdev, &req->dev->dataplane->vring, &req->elem,
> req->qiov.size + sizeof(*req->in));
> 
>  /* Suppress notification to guest by BH and its scheduled
> diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
> index 03a1e8c..418d73b 100644
> --- a/hw/scsi/virtio-scsi-dataplane.c
> +++ b/hw/scsi/virtio-scsi-dataplane.c
> @@ -94,7 +94,7 @@ void virtio_scsi_vring_push_notify(VirtIOSCSIReq *req)
>  {
>  VirtIODevice *vdev = VIRTIO_DEVICE(req->vring->parent);
> 
> -vring_push(&req->vring->vring, &req->elem,
> +vring_push(vdev, &req->vring->vring, &req->elem,
> req->qsgl.size + req->resp_iov.size);
> 
>  if (vring_should_notify(vdev, &req->vring->vring)) {
> diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
> index d21c397..19b224a 100644
> --- a/hw/virtio/Makefile.objs
> +++ b/hw/virtio/Makefile.objs
> @@ -2,7 +2,7 @@ common-obj-y += virtio-rng.o
>  common-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
>  common-obj-y += virtio-bus.o
>  common-obj-y += virtio-mmio.o
> -common-obj-$(CONFIG_VIRTIO) += dataplane/
> +obj-$(CONFIG_VIRTIO) += dataplane/
> 
>  obj-y += virtio.o virtio-balloon.o 
>  obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
> diff --git a/hw/virtio/dataplane/Makefile.objs 
> b/hw/virtio/dataplane/Makefile.objs
> index 9a8cfc0..753a9ca 100644
> --- a/hw/virtio/dataplane/Makefile.objs
> +++ b/hw/virtio/dataplane/Makefile.objs
> @@ -1 +1 @@
> -common-obj-y += vring.o
> +obj-y += vring.o
> diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
> index a051775..0da8d6b 100644
> --- a/hw/virtio/dataplane/vring.c
> +++ b/hw/virtio/dataplane/vring.c
> @@ -18,7 +18,9 @@
>  #include "hw/hw.h"
>  #include "exec/memory.h"
>  #include "exec/address-spaces.h"
> +#include "hw/virtio/virtio-access.h"
>  #include "hw/virtio/dataplane/vring.h"
> +#include "hw/virtio/dataplane/vring-accessors.h"
>  #include "qemu/error-report.h"
> 
>  /* vring_map can be coupled with vring_unmap or (if you still have the
> @@ -83,7 +85,7 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
>  vring_init(&vring->vr, virtio_queue_get_num(vdev, n), vring_ptr, 4096);
> 
>  vring->last_avail_idx = virtio_queue_get_last_avail_idx(vdev, n);
> -vring->last_used_idx = vring->vr.used->idx;
> +vring->last_used_idx = vring_get_used_idx(vdev, vring);
>  vring->signalled_used = 0;
>  vring->signalled_used_valid = false;
> 
> @@ -104,7 +106,7 @@ void vring_teardown(Vring *vring, VirtIODevice *vdev, int 
> n)
>  void vring_disable_notification(VirtIODevice *vdev, Vring *vring)
>  {
>  if (!(vdev->guest_features[0] & (1 << VIRTIO_RING_F_EVENT_IDX))) {
> -vring->vr.used->flags |= VRING_USED_F_NO_NOTIFY;
> +vring_set_used_flags(vdev, vring, VRING_USED_F_NO_NOTIFY);
>  }
>  }
> 
> @@ -117,10 +119,10 @@ bool vring_enable_notification(VirtIODevice *vdev, 
> Vring *vring)
>  if (vdev->guest_features[0] & (1 << VIRTIO_RING_F_EVENT_IDX)) {
>  vring_avail_event(&vring->vr) = vring->vr.avail->idx;
>  } else {
> -vring->vr.used->flags &= ~VRING_USED_F_NO_NOTIFY;
> +vring_clear_used_flags(vdev, vring, VRING_USED_F_NO_NOTIFY);
>  }
>  smp_mb(); /* ensure update is seen before reading avail_idx */
> -return !vring_more_avail(

Re: [Qemu-devel] [PATCH] hmp: fix regression of HMP device_del auto-completion

2014-11-26 Thread Luiz Capitulino
On Wed, 26 Nov 2014 13:50:01 +0200
Marcel Apfelbaum  wrote:

> The commits:
>  - 6a1fa9f5 (monitor: add del completion for peripheral device)
>  - 66e56b13 (qdev: add qdev_build_hotpluggable_device_list helper)
> 
> cause a QEMU crash when trying to use HMP device_del auto-completion.
> It can be easily reproduced by:
>  -enable-kvm  ~/images/fedora.qcow2 -monitor stdio -device 
> virtio-net-pci,id=vnet
> 
> (qemu) device_del
> 
> /home/mapfelba/git/upstream/qemu/hw/core/qdev.c:941:qdev_build_hotpluggable_device_list:
>  Object 0x7f6ce04e4fe0 is not an instance of type device
> Aborted (core dumped)
> 
> The root cause is qdev_build_hotpluggable_device_list going recursively over
> all peripherals and their children assuming all are devices. It doesn't work
> since PCI devices have at least on child which is a memory region (bus 
> master).
> 
> Solved by observing that all devices appear as direct children of
> /machine/peripheral container. No need of going recursively
> over all the children.
> 
> Signed-off-by: Marcel Apfelbaum 

Peter, can you apply this patch directly to master to avoid me a pull
request? Maybe it's a good idea to wait until tomorrow for more
reviewers though.

Marcel, thanks a lot for taking care of this!

> ---
>  hw/core/qdev.c | 12 ++--
>  include/hw/qdev-core.h |  2 +-
>  monitor.c  | 11 ---
>  3 files changed, 15 insertions(+), 10 deletions(-)
> 
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 413b413..35fd00d 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -935,7 +935,7 @@ void qdev_alias_all_properties(DeviceState *target, 
> Object *source)
>  } while (class != object_class_by_name(TYPE_DEVICE));
>  }
>  
> -int qdev_build_hotpluggable_device_list(Object *obj, void *opaque)
> +static int qdev_add_hotpluggable_device(Object *obj, void *opaque)
>  {
>  GSList **list = opaque;
>  DeviceState *dev = DEVICE(obj);
> @@ -944,10 +944,18 @@ int qdev_build_hotpluggable_device_list(Object *obj, 
> void *opaque)
>  *list = g_slist_append(*list, dev);
>  }
>  
> -object_child_foreach(obj, qdev_build_hotpluggable_device_list, opaque);
>  return 0;
>  }
>  
> +GSList *qdev_build_hotpluggable_device_list(Object *peripheral)
> +{
> +GSList *list = NULL;
> +
> +object_child_foreach(peripheral, qdev_add_hotpluggable_device, &list);
> +
> +return list;
> +}
> +
>  static bool device_get_realized(Object *obj, Error **errp)
>  {
>  DeviceState *dev = DEVICE(obj);
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index d3a2940..589bbe7 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -365,7 +365,7 @@ extern int qdev_hotplug;
>  
>  char *qdev_get_dev_path(DeviceState *dev);
>  
> -int qdev_build_hotpluggable_device_list(Object *obj, void *opaque);
> +GSList *qdev_build_hotpluggable_device_list(Object *peripheral);
>  
>  void qbus_set_hotplug_handler(BusState *bus, DeviceState *handler,
>Error **errp);
> diff --git a/monitor.c b/monitor.c
> index fa00594..f1031a1 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -4321,17 +4321,14 @@ void object_add_completion(ReadLineState *rs, int 
> nb_args, const char *str)
>  static void peripheral_device_del_completion(ReadLineState *rs,
>   const char *str, size_t len)
>  {
> -Object *peripheral;
> -GSList *list = NULL, *item;
> +Object *peripheral = container_get(qdev_get_machine(), "/peripheral");
> +GSList *list, *item;
>  
> -peripheral = object_resolve_path("/machine/peripheral/", NULL);
> -if (peripheral == NULL) {
> +list = qdev_build_hotpluggable_device_list(peripheral);
> +if (!list) {
>  return;
>  }
>  
> -object_child_foreach(peripheral, qdev_build_hotpluggable_device_list,
> - &list);
> -
>  for (item = list; item; item = g_slist_next(item)) {
>  DeviceState *dev = item->data;
>  




Re: [Qemu-devel] TCG Multithreading performance improvement

2014-11-26 Thread Chih-Min Chao
Another good slide for PQemu
http://people.cs.nctu.edu.tw/~chenwj/slide/QEMU/PQEMU_presentation_(IIS_Sinica_2)_.pptx

On Wed, Nov 26, 2014 at 5:31 PM, Mark Burton 
wrote:

>
>
> Hi all,
>
> We are now actively going to pursue TCG Multithreading to improve the
> performance of the TCG for Qemu models that include multiple cores.
>
> We have set up a wiki page to track the project
> http://wiki.qemu.org/Features/tcg-multithread
>
> At this point, I would like to invite everybody to email us ideas about
> how the project should progress, and ideas that might be useful (e.g.
> people who have tried this before, source code that might be helpful, what
> order we should attack things in etc)…
>
> So - PLEASE let us know if you have interest in this topic, or information
> that might help
>
> Cheers
>
> Mark.
>
>
>
>
>
> --
>  +44 (0)20 7100 3485 x 210
>  +33 (0)5 33 52 01 77x 210
>
> +33 (0)603762104mark.burton 
>
>


[Qemu-devel] [PATCH v8 01/10] qapi: Add optional field "name" to block dirty bitmap

2014-11-26 Thread John Snow
From: Fam Zheng 

This field will be set for user created dirty bitmap. Also pass in an
error pointer to bdrv_create_dirty_bitmap, so when a name is already
taken on this BDS, it can report an error message. This is not global
check, two BDSes can have dirty bitmap with a common name.

Implemented bdrv_find_dirty_bitmap to find a dirty bitmap by name, will
be used later when other QMP commands want to reference dirty bitmap by
name.

Add bdrv_dirty_bitmap_make_anon. This unsets the name of dirty bitmap.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 block-migration.c |  2 +-
 block.c   | 32 +++-
 block/mirror.c|  2 +-
 include/block/block.h |  7 ++-
 qapi/block-core.json  |  4 +++-
 5 files changed, 42 insertions(+), 5 deletions(-)

diff --git a/block-migration.c b/block-migration.c
index 08db01a..6f3aa18 100644
--- a/block-migration.c
+++ b/block-migration.c
@@ -319,7 +319,7 @@ static int set_dirty_tracking(void)
 
 QSIMPLEQ_FOREACH(bmds, &block_mig_state.bmds_list, entry) {
 bmds->dirty_bitmap = bdrv_create_dirty_bitmap(bmds->bs, BLOCK_SIZE,
-  NULL);
+  NULL, NULL);
 if (!bmds->dirty_bitmap) {
 ret = -errno;
 goto fail;
diff --git a/block.c b/block.c
index a612594..f94b753 100644
--- a/block.c
+++ b/block.c
@@ -53,6 +53,7 @@
 
 struct BdrvDirtyBitmap {
 HBitmap *bitmap;
+char *name;
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -5291,7 +5292,28 @@ bool bdrv_qiov_is_aligned(BlockDriverState *bs, 
QEMUIOVector *qiov)
 return true;
 }
 
-BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs, int 
granularity,
+BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
+{
+BdrvDirtyBitmap *bm;
+
+assert(name);
+QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
+if (bm->name && !strcmp(name, bm->name)) {
+return bm;
+}
+}
+return NULL;
+}
+
+void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
+{
+g_free(bitmap->name);
+bitmap->name = NULL;
+}
+
+BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
+  int granularity,
+  const char *name,
   Error **errp)
 {
 int64_t bitmap_size;
@@ -5299,6 +5321,10 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs, int granularity,
 
 assert((granularity & (granularity - 1)) == 0);
 
+if (name && bdrv_find_dirty_bitmap(bs, name)) {
+error_setg(errp, "Bitmap already exists: %s", name);
+return NULL;
+}
 granularity >>= BDRV_SECTOR_BITS;
 assert(granularity);
 bitmap_size = bdrv_nb_sectors(bs);
@@ -5309,6 +5335,7 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs, int granularity,
 }
 bitmap = g_new0(BdrvDirtyBitmap, 1);
 bitmap->bitmap = hbitmap_alloc(bitmap_size, ffs(granularity) - 1);
+bitmap->name = g_strdup(name);
 QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
 return bitmap;
 }
@@ -5320,6 +5347,7 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, 
BdrvDirtyBitmap *bitmap)
 if (bm == bitmap) {
 QLIST_REMOVE(bitmap, list);
 hbitmap_free(bitmap->bitmap);
+g_free(bitmap->name);
 g_free(bitmap);
 return;
 }
@@ -5338,6 +5366,8 @@ BlockDirtyInfoList 
*bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 info->count = bdrv_get_dirty_count(bs, bm);
 info->granularity =
 ((int64_t) BDRV_SECTOR_SIZE << hbitmap_granularity(bm->bitmap));
+info->has_name = !!bm->name;
+info->name = g_strdup(bm->name);
 entry->value = info;
 *plist = entry;
 plist = &entry->next;
diff --git a/block/mirror.c b/block/mirror.c
index 2c6dd2a..858e4ff 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -699,7 +699,7 @@ static void mirror_start_job(BlockDriverState *bs, 
BlockDriverState *target,
 s->granularity = granularity;
 s->buf_size = MAX(buf_size, granularity);
 
-s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, errp);
+s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
 if (!s->dirty_bitmap) {
 return;
 }
diff --git a/include/block/block.h b/include/block/block.h
index 5450610..977f7b5 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -428,8 +428,13 @@ bool bdrv_qiov_is_aligned(BlockDriverState *bs, 
QEMUIOVector *qiov);
 
 struct HBitmapIter;
 typedef struct BdrvDirtyBitmap BdrvDirtyBitmap;
-BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs, int 
granularity,
+BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
+  int granularity,
+  

[Qemu-devel] [PATCH v8 10/10] qemu-iotests: Add tests for drive-backup sync=dirty-bitmap

2014-11-26 Thread John Snow
From: Fam Zheng 

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 tests/qemu-iotests/056| 33 ++---
 tests/qemu-iotests/056.out|  4 ++--
 tests/qemu-iotests/iotests.py |  8 
 3 files changed, 40 insertions(+), 5 deletions(-)

diff --git a/tests/qemu-iotests/056 b/tests/qemu-iotests/056
index 54e4bd0..fc9114e 100755
--- a/tests/qemu-iotests/056
+++ b/tests/qemu-iotests/056
@@ -23,17 +23,17 @@
 import time
 import os
 import iotests
-from iotests import qemu_img, qemu_io, create_image
+from iotests import qemu_img, qemu_img_map_assert, qemu_io, create_image
 
 backing_img = os.path.join(iotests.test_dir, 'backing.img')
 test_img = os.path.join(iotests.test_dir, 'test.img')
 target_img = os.path.join(iotests.test_dir, 'target.img')
 
-class TestSyncModesNoneAndTop(iotests.QMPTestCase):
+class TestSyncModes(iotests.QMPTestCase):
 image_len = 64 * 1024 * 1024 # MB
 
 def setUp(self):
-create_image(backing_img, TestSyncModesNoneAndTop.image_len)
+create_image(backing_img, TestSyncModes.image_len)
 qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % 
backing_img, test_img)
 qemu_io('-c', 'write -P0x41 0 512', test_img)
 qemu_io('-c', 'write -P0xd5 1M 32k', test_img)
@@ -64,6 +64,33 @@ class TestSyncModesNoneAndTop(iotests.QMPTestCase):
 self.assertTrue(iotests.compare_images(test_img, target_img),
 'target image does not match source after backup')
 
+def test_sync_dirty_bitmap_missing(self):
+self.assert_no_active_block_jobs()
+result = self.vm.qmp('drive-backup', device='drive0', 
sync='dirty-bitmap',
+ format=iotests.imgfmt, target=target_img)
+self.assert_qmp(result, 'error/class', 'GenericError')
+
+def test_sync_dirty_bitmap_not_found(self):
+self.assert_no_active_block_jobs()
+result = self.vm.qmp('drive-backup', device='drive0', 
sync='dirty-bitmap',
+ bitmap='unknown',
+ format=iotests.imgfmt, target=target_img)
+self.assert_qmp(result, 'error/class', 'GenericError')
+
+def test_sync_dirty_bitmap(self):
+self.assert_no_active_block_jobs()
+result = self.vm.qmp('block-dirty-bitmap-add', device='drive0', 
name='bitmap0')
+self.assert_qmp(result, 'return', {})
+self.vm.hmp_qemu_io('drive0', 'write -P0x5a 0 512')
+self.vm.hmp_qemu_io('drive0', 'write -P0x5a 48M 512')
+result = self.vm.qmp('drive-backup', device='drive0', 
sync='dirty-bitmap',
+ bitmap='bitmap0',
+ format=iotests.imgfmt, target=target_img)
+self.assert_qmp(result, 'return', {})
+self.wait_until_completed(check_offset=False)
+self.assert_no_active_block_jobs()
+qemu_img_map_assert(target_img, [0, 0x300])
+
 def test_cancel_sync_none(self):
 self.assert_no_active_block_jobs()
 
diff --git a/tests/qemu-iotests/056.out b/tests/qemu-iotests/056.out
index fbc63e6..914e373 100644
--- a/tests/qemu-iotests/056.out
+++ b/tests/qemu-iotests/056.out
@@ -1,5 +1,5 @@
-..
+.
 --
-Ran 2 tests
+Ran 5 tests
 
 OK
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index f57f154..95bb959 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -55,6 +55,14 @@ def qemu_img_pipe(*args):
 '''Run qemu-img and return its output'''
 return subprocess.Popen(qemu_img_args + list(args), 
stdout=subprocess.PIPE).communicate()[0]
 
+def qemu_img_map_assert(img, offsets):
+'''Run qemu-img map on img and check the mapped ranges'''
+offs = []
+for line in qemu_img_pipe('map', img).splitlines()[1:]:
+offset, length, mapped, fname = line.split()
+offs.append(int(offset, 16))
+assert set(offs) == set(offsets), "mapped offsets in image '%s' not equal 
to '%s'" % (str(offs), str(offsets))
+
 def qemu_io(*args):
 '''Run qemu-io and return the stdout data'''
 args = qemu_io_args + list(args)
-- 
1.9.3




[Qemu-devel] [PATCH v8 03/10] block: Introduce bdrv_dirty_bitmap_granularity()

2014-11-26 Thread John Snow
From: Fam Zheng 

This returns the granularity (in bytes) of dirty bitmap,
which matches the QMP interface and the existing query
interface.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 block.c   | 9 +++--
 include/block/block.h | 2 ++
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/block.c b/block.c
index c794316..c851a03 100644
--- a/block.c
+++ b/block.c
@@ -5364,8 +5364,7 @@ BlockDirtyInfoList 
*bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
 BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
 info->count = bdrv_get_dirty_count(bs, bm);
-info->granularity =
-((int64_t) BDRV_SECTOR_SIZE << hbitmap_granularity(bm->bitmap));
+info->granularity = bdrv_dirty_bitmap_granularity(bs, bm);
 info->has_name = !!bm->name;
 info->name = g_strdup(bm->name);
 entry->value = info;
@@ -5404,6 +5403,12 @@ uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState 
*bs)
 return granularity;
 }
 
+int64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap)
+{
+return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
+}
+
 void bdrv_dirty_iter_init(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
 {
diff --git a/include/block/block.h b/include/block/block.h
index 3579e7f..c3daa19 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -438,6 +438,8 @@ void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, 
BdrvDirtyBitmap *bitmap);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs);
+int64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
+  BdrvDirtyBitmap *bitmap);
 int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap, int64_t 
sector);
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector, int nr_sectors);
 void bdrv_reset_dirty(BlockDriverState *bs, int64_t cur_sector, int 
nr_sectors);
-- 
1.9.3




[Qemu-devel] [PATCH v8 06/10] qmp: Add block-dirty-bitmap-enable and block-dirty-bitmap-disable

2014-11-26 Thread John Snow
From: Fam Zheng 

This allows to put the dirty bitmap into a disabled state where no more
writes will be tracked.

It will be used before backup or writing to persistent file.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 block.c   | 15 +
 blockdev.c| 61 +++
 include/block/block.h |  2 ++
 qapi/block-core.json  | 28 +++
 qmp-commands.hx   | 10 +
 5 files changed, 116 insertions(+)

diff --git a/block.c b/block.c
index 40cb9cf..1aa723b 100644
--- a/block.c
+++ b/block.c
@@ -56,6 +56,7 @@ struct BdrvDirtyBitmap {
 int64_t size;
 int64_t granularity;
 char *name;
+bool enabled;
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
 
@@ -5361,6 +5362,7 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs,
 bitmap->granularity = granularity;
 bitmap->bitmap = hbitmap_alloc(bitmap->size, ffs(sector_granularity) - 1);
 bitmap->name = g_strdup(name);
+bitmap->enabled = true;
 QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
 return bitmap;
 }
@@ -5379,6 +5381,16 @@ void bdrv_release_dirty_bitmap(BlockDriverState *bs, 
BdrvDirtyBitmap *bitmap)
 }
 }
 
+void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+bitmap->enabled = false;
+}
+
+void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
+{
+bitmap->enabled = true;
+}
+
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 {
 BdrvDirtyBitmap *bm;
@@ -5447,6 +5459,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t 
cur_sector,
 {
 BdrvDirtyBitmap *bitmap;
 QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
+if (!bitmap->enabled) {
+continue;
+}
 hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
 }
 }
diff --git a/blockdev.c b/blockdev.c
index af1152f..276a31b 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1159,6 +1159,41 @@ SnapshotInfo 
*qmp_blockdev_snapshot_delete_internal_sync(const char *device,
 return info;
 }
 
+/**
+ * Return a dirty bitmap (if present), after validating
+ * the device and bitmap names. Returns NULL on error,
+ * including when the device and/or bitmap is not found.
+ */
+static BdrvDirtyBitmap *block_dirty_bitmap_lookup(const char *device,
+  const char *name,
+  Error **errp)
+{
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+
+if (!device) {
+error_setg(errp, "Device cannot be NULL");
+return NULL;
+}
+if (!name) {
+error_setg(errp, "Bitmap name cannot be NULL");
+return NULL;
+}
+
+bs = bdrv_lookup_bs(device, NULL, errp);
+if (!bs) {
+return NULL;
+}
+
+bitmap = bdrv_find_dirty_bitmap(bs, name);
+if (!bitmap) {
+error_setg(errp, "Dirty bitmap not found: %s", name);
+return NULL;
+}
+
+return bitmap;
+}
+
 /* New and old BlockDriverState structs for group snapshots */
 
 typedef struct BlkTransactionState BlkTransactionState;
@@ -1864,6 +1899,32 @@ void qmp_block_dirty_bitmap_remove(const char *device, 
const char *name,
 bdrv_release_dirty_bitmap(bs, bitmap);
 }
 
+void qmp_block_dirty_bitmap_enable(const char *device, const char *name,
+   Error **errp)
+{
+BdrvDirtyBitmap *bitmap;
+
+bitmap = block_dirty_bitmap_lookup(device, name, errp);
+if (!bitmap) {
+return;
+}
+
+bdrv_enable_dirty_bitmap(bitmap);
+}
+
+void qmp_block_dirty_bitmap_disable(const char *device, const char *name,
+Error **errp)
+{
+BdrvDirtyBitmap *bitmap;
+
+bitmap = block_dirty_bitmap_lookup(device, name, errp);
+if (!bitmap) {
+return;
+}
+
+bdrv_disable_dirty_bitmap(bitmap);
+}
+
 int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
 const char *id = qdict_get_str(qdict, "id");
diff --git a/include/block/block.h b/include/block/block.h
index 9326705..14a0632 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -440,6 +440,8 @@ BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState 
*bs,
 const BdrvDirtyBitmap *bitmap,
 const char *name);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
+void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
+void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs);
 int64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 51638b3..d77f19d 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -919,6 +919,34 @@
   'data': 'BlockDirtyBitmap' }
 
 ##
+# @block-dirty-bitmap-enable
+#
+# En

[Qemu-devel] [PATCH v8 08/10] qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}

2014-11-26 Thread John Snow
From: Fam Zheng 

This adds three qmp commands to transactions.

Users can stop a dirty bitmap, start backup of it, and start another
dirty bitmap atomically, so that the dirty bitmap is tracked
incrementally and we don't miss any write.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 blockdev.c   | 85 
 qapi-schema.json |  5 +++-
 2 files changed, 89 insertions(+), 1 deletion(-)

diff --git a/blockdev.c b/blockdev.c
index 1a56959..275eb43 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1532,6 +1532,76 @@ static void drive_backup_abort(BlkTransactionState 
*common)
 }
 }
 
+static void block_dirty_bitmap_add_prepare(BlkTransactionState *common,
+   Error **errp)
+{
+BlockDirtyBitmapAdd *action;
+
+action = common->action->block_dirty_bitmap_add;
+qmp_block_dirty_bitmap_add(action->device, action->name,
+   action->has_granularity, action->granularity,
+   errp);
+}
+
+static void block_dirty_bitmap_add_abort(BlkTransactionState *common)
+{
+BlockDirtyBitmapAdd *action;
+BdrvDirtyBitmap *bm;
+BlockDriverState *bs;
+
+action = common->action->block_dirty_bitmap_add;
+bs = bdrv_lookup_bs(action->device, NULL, NULL);
+if (bs) {
+bm = bdrv_find_dirty_bitmap(bs, action->name);
+if (bm) {
+bdrv_release_dirty_bitmap(bs, bm);
+}
+}
+}
+
+typedef struct BlockDirtyBitmapState {
+BlkTransactionState common;
+BdrvDirtyBitmap *bitmap;
+} BlockDirtyBitmapState;
+
+/**
+ * Enable and Disable re-uses the same preparation.
+ */
+static void block_dirty_bitmap_en_toggle_prepare(BlkTransactionState *common,
+ Error **errp)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+BlockDirtyBitmap *action;
+
+/* We may be used by either enable or disable;
+ * We use the "enable" member of the union here,
+ * but "disable" should be functionally equivalent: */
+action = common->action->block_dirty_bitmap_enable;
+assert(action == common->action->block_dirty_bitmap_disable);
+
+state->bitmap = block_dirty_bitmap_lookup(action->device,
+  action->name,
+  errp);
+if (!state->bitmap) {
+return;
+}
+}
+
+static void block_dirty_bitmap_enable_commit(BlkTransactionState *common)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+bdrv_enable_dirty_bitmap(state->bitmap);
+}
+
+static void block_dirty_bitmap_disable_commit(BlkTransactionState *common)
+{
+BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
+ common, common);
+bdrv_disable_dirty_bitmap(state->bitmap);
+}
+
 static void abort_prepare(BlkTransactionState *common, Error **errp)
 {
 error_setg(errp, "Transaction aborted using Abort action");
@@ -1564,6 +1634,21 @@ static const BdrvActionOps actions[] = {
 .prepare  = internal_snapshot_prepare,
 .abort = internal_snapshot_abort,
 },
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ADD] = {
+.instance_size = sizeof(BlkTransactionState),
+.prepare = block_dirty_bitmap_add_prepare,
+.abort = block_dirty_bitmap_add_abort,
+},
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_ENABLE] = {
+.instance_size = sizeof(BlockDirtyBitmapState),
+.prepare = block_dirty_bitmap_en_toggle_prepare,
+.commit = block_dirty_bitmap_enable_commit,
+},
+[TRANSACTION_ACTION_KIND_BLOCK_DIRTY_BITMAP_DISABLE] = {
+.instance_size = sizeof(BlockDirtyBitmapState),
+.prepare = block_dirty_bitmap_en_toggle_prepare,
+.commit = block_dirty_bitmap_disable_commit,
+},
 };
 
 /*
diff --git a/qapi-schema.json b/qapi-schema.json
index 9ffdcf8..793031b 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1260,7 +1260,10 @@
'blockdev-snapshot-sync': 'BlockdevSnapshot',
'drive-backup': 'DriveBackup',
'abort': 'Abort',
-   'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal'
+   'blockdev-snapshot-internal-sync': 'BlockdevSnapshotInternal',
+   'block-dirty-bitmap-add': 'BlockDirtyBitmapAdd',
+   'block-dirty-bitmap-enable': 'BlockDirtyBitmap',
+   'block-dirty-bitmap-disable': 'BlockDirtyBitmap'
} }
 
 ##
-- 
1.9.3




[Qemu-devel] [PATCH v8 05/10] block: Add bdrv_copy_dirty_bitmap and bdrv_reset_dirty_bitmap

2014-11-26 Thread John Snow
From: Fam Zheng 

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 block.c   | 35 +++
 include/block/block.h |  4 
 2 files changed, 35 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index c851a03..40cb9cf 100644
--- a/block.c
+++ b/block.c
@@ -53,6 +53,8 @@
 
 struct BdrvDirtyBitmap {
 HBitmap *bitmap;
+int64_t size;
+int64_t granularity;
 char *name;
 QLIST_ENTRY(BdrvDirtyBitmap) list;
 };
@@ -5311,6 +5313,26 @@ void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, 
BdrvDirtyBitmap *bitmap)
 bitmap->name = NULL;
 }
 
+BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,
+const BdrvDirtyBitmap *bitmap,
+const char *name)
+{
+BdrvDirtyBitmap *new_bitmap;
+
+new_bitmap = g_malloc0(sizeof(BdrvDirtyBitmap));
+new_bitmap->bitmap = hbitmap_copy(bitmap->bitmap);
+new_bitmap->size = bitmap->size;
+new_bitmap->granularity = bitmap->granularity;
+new_bitmap->name = g_strdup(name);
+QLIST_INSERT_HEAD(&bs->dirty_bitmaps, new_bitmap, list);
+return new_bitmap;
+}
+
+void bdrv_reset_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
+{
+hbitmap_reset(bitmap->bitmap, 0, bitmap->size);
+}
+
 BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
   int granularity,
   const char *name,
@@ -5318,6 +5340,7 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs,
 {
 int64_t bitmap_size;
 BdrvDirtyBitmap *bitmap;
+int sector_granularity;
 
 assert((granularity & (granularity - 1)) == 0);
 
@@ -5325,8 +5348,8 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs,
 error_setg(errp, "Bitmap already exists: %s", name);
 return NULL;
 }
-granularity >>= BDRV_SECTOR_BITS;
-assert(granularity);
+sector_granularity = granularity >> BDRV_SECTOR_BITS;
+assert(sector_granularity);
 bitmap_size = bdrv_nb_sectors(bs);
 if (bitmap_size < 0) {
 error_setg_errno(errp, -bitmap_size, "could not get length of device");
@@ -5334,7 +5357,9 @@ BdrvDirtyBitmap 
*bdrv_create_dirty_bitmap(BlockDriverState *bs,
 return NULL;
 }
 bitmap = g_new0(BdrvDirtyBitmap, 1);
-bitmap->bitmap = hbitmap_alloc(bitmap_size, ffs(granularity) - 1);
+bitmap->size = bitmap_size;
+bitmap->granularity = granularity;
+bitmap->bitmap = hbitmap_alloc(bitmap->size, ffs(sector_granularity) - 1);
 bitmap->name = g_strdup(name);
 QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
 return bitmap;
@@ -5406,7 +5431,9 @@ uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState 
*bs)
 int64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap)
 {
-return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
+g_assert(BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap) ==
+ bitmap->granularity);
+return bitmap->granularity;
 }
 
 void bdrv_dirty_iter_init(BlockDriverState *bs,
diff --git a/include/block/block.h b/include/block/block.h
index c3daa19..9326705 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -435,6 +435,10 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState 
*bs,
 BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs,
 const char *name);
 void bdrv_dirty_bitmap_make_anon(BlockDriverState *bs, BdrvDirtyBitmap 
*bitmap);
+void bdrv_reset_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
+BdrvDirtyBitmap *bdrv_copy_dirty_bitmap(BlockDriverState *bs,
+const BdrvDirtyBitmap *bitmap,
+const char *name);
 void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap);
 BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs);
 uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs);
-- 
1.9.3




[Qemu-devel] [PATCH v8 04/10] hbitmap: Add hbitmap_copy

2014-11-26 Thread John Snow
From: Fam Zheng 

This makes a deep copy of an HBitmap.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 include/qemu/hbitmap.h |  8 
 util/hbitmap.c | 16 
 2 files changed, 24 insertions(+)

diff --git a/include/qemu/hbitmap.h b/include/qemu/hbitmap.h
index 550d7ce..b645cfc 100644
--- a/include/qemu/hbitmap.h
+++ b/include/qemu/hbitmap.h
@@ -65,6 +65,14 @@ struct HBitmapIter {
 HBitmap *hbitmap_alloc(uint64_t size, int granularity);
 
 /**
+ * hbitmap_copy:
+ * @bitmap: The original bitmap to copy.
+ *
+ * Copy a HBitmap.
+ */
+HBitmap *hbitmap_copy(const HBitmap *bitmap);
+
+/**
  * hbitmap_empty:
  * @hb: HBitmap to operate on.
  *
diff --git a/util/hbitmap.c b/util/hbitmap.c
index b3060e6..8aa7406 100644
--- a/util/hbitmap.c
+++ b/util/hbitmap.c
@@ -395,3 +395,19 @@ HBitmap *hbitmap_alloc(uint64_t size, int granularity)
 hb->levels[0][0] |= 1UL << (BITS_PER_LONG - 1);
 return hb;
 }
+
+HBitmap *hbitmap_copy(const HBitmap *bitmap)
+{
+int i;
+uint64_t size;
+HBitmap *hb = g_memdup(bitmap, sizeof(HBitmap));
+
+size = bitmap->size;
+for (i = HBITMAP_LEVELS - 1; i >= 0; i--) {
+size = MAX((size + BITS_PER_LONG - 1) >> BITS_PER_LEVEL, 1);
+hb->levels[i] = g_memdup(bitmap->levels[i],
+ size * sizeof(unsigned long));
+}
+
+return hb;
+}
-- 
1.9.3




[Qemu-devel] [PATCH v8 07/10] qmp: Add support of "dirty-bitmap" sync mode for drive-backup

2014-11-26 Thread John Snow
From: Fam Zheng 

For "dirty-bitmap" sync mode, the block job will iterate through the
given dirty bitmap to decide if a sector needs backup (backup all the
dirty clusters and skip clean ones), just as allocation conditions of
"top" sync mode.

There are two bitmap use modes for sync=dirty-bitmap:

 - reset: backup job makes a copy of bitmap and resets the original
   one.
 - consume: backup job makes the original anonymous (invisible to user)
   and releases it after use.

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 block.c   |   5 ++
 block/backup.c| 130 ++
 block/mirror.c|   4 ++
 blockdev.c|  18 ++-
 hmp.c |   4 +-
 include/block/block.h |   1 +
 include/block/block_int.h |   6 +++
 qapi/block-core.json  |  30 +--
 qmp-commands.hx   |   7 +--
 9 files changed, 175 insertions(+), 30 deletions(-)

diff --git a/block.c b/block.c
index 1aa723b..e341844 100644
--- a/block.c
+++ b/block.c
@@ -5454,6 +5454,11 @@ void bdrv_dirty_iter_init(BlockDriverState *bs,
 hbitmap_iter_init(hbi, bitmap->bitmap, 0);
 }
 
+void bdrv_dirty_iter_set(HBitmapIter *hbi, int64_t offset)
+{
+hbitmap_iter_init(hbi, hbi->hb, offset);
+}
+
 void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
 int nr_sectors)
 {
diff --git a/block/backup.c b/block/backup.c
index 792e655..2aab68f 100644
--- a/block/backup.c
+++ b/block/backup.c
@@ -37,6 +37,10 @@ typedef struct CowRequest {
 typedef struct BackupBlockJob {
 BlockJob common;
 BlockDriverState *target;
+/* bitmap for sync=dirty-bitmap */
+BdrvDirtyBitmap *sync_bitmap;
+/* dirty bitmap granularity */
+int64_t sync_bitmap_gran;
 MirrorSyncMode sync_mode;
 RateLimit limit;
 BlockdevOnError on_source_error;
@@ -242,6 +246,31 @@ static void backup_complete(BlockJob *job, void *opaque)
 g_free(data);
 }
 
+static bool yield_and_check(BackupBlockJob *job)
+{
+if (block_job_is_cancelled(&job->common)) {
+return true;
+}
+
+/* we need to yield so that qemu_aio_flush() returns.
+ * (without, VM does not reboot)
+ */
+if (job->common.speed) {
+uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
+  job->sectors_read);
+job->sectors_read = 0;
+block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
+} else {
+block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
+}
+
+if (block_job_is_cancelled(&job->common)) {
+return true;
+}
+
+return false;
+}
+
 static void coroutine_fn backup_run(void *opaque)
 {
 BackupBlockJob *job = opaque;
@@ -254,13 +283,13 @@ static void coroutine_fn backup_run(void *opaque)
 };
 int64_t start, end;
 int ret = 0;
+bool error_is_read;
 
 QLIST_INIT(&job->inflight_reqs);
 qemu_co_rwlock_init(&job->flush_rwlock);
 
 start = 0;
-end = DIV_ROUND_UP(job->common.len / BDRV_SECTOR_SIZE,
-   BACKUP_SECTORS_PER_CLUSTER);
+end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
 
 job->bitmap = hbitmap_alloc(end, 0);
 
@@ -278,28 +307,44 @@ static void coroutine_fn backup_run(void *opaque)
 qemu_coroutine_yield();
 job->common.busy = true;
 }
+} else if (job->sync_mode == MIRROR_SYNC_MODE_DIRTY_BITMAP) {
+/* Dirty Bitmap sync has a slightly different iteration method */
+HBitmapIter hbi;
+int64_t sector;
+int64_t cluster;
+bool polyrhythmic;
+
+bdrv_dirty_iter_init(bs, job->sync_bitmap, &hbi);
+/* Does the granularity happen to match our backup cluster size? */
+polyrhythmic = (job->sync_bitmap_gran != BACKUP_CLUSTER_SIZE);
+
+/* Find the next dirty /sector/ and copy that /cluster/ */
+while ((sector = hbitmap_iter_next(&hbi)) != -1) {
+if (yield_and_check(job)) {
+goto leave;
+}
+cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
+
+do {
+ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
+BACKUP_SECTORS_PER_CLUSTER, 
&error_is_read);
+if ((ret < 0) &&
+backup_error_action(job, error_is_read, -ret) ==
+BLOCK_ERROR_ACTION_REPORT) {
+goto leave;
+}
+} while (ret < 0);
+
+/* Advance (or rewind) our iterator if we need to. */
+if (polyrhythmic) {
+bdrv_dirty_iter_set(&hbi,
+(cluster + 1) * 
BACKUP_SECTORS_PER_CLUSTER);
+}
+}
 } else {
 /* Both FULL and TOP SYNC_MODE's require copying.. */
 for (; start < end; start++) {
-bool error_is_read;
-
-if (bloc

[Qemu-devel] [PATCH v8 00/10] block: Incremental backup series

2014-11-26 Thread John Snow
This is the in memory part of the incremental backup feature.

With the added commands, we can create a bitmap on a block backend, from which
point of time all the writes are tracked by the bitmap, marking sectors as
dirty.  Later, we call drive-backup and pass the bitmap to it, to do an
incremental backup.

See the last patch which adds some tests for this use case.

Fam

==

This is the next iteration of Fam's incremenetal backup feature.
I have since taken this over from him and have (slowly) worked through
the feedback to the last version of his patch and have made many
tiny little edits.

For convenience: https://github.com/jnsnow/qemu/commits/dbm-backup

John.

v8:
 - Changed int64_t return for bdrv_dbm_calc_def_granularity to uint64_t (2/10)
 - Updated commit message (2/10)
 - Removed redundant check for null in device parameter (2/10)
 - Removed comment cruft. (2/10)
 - Removed redundant local_err propagation (several)
 - Updated commit message (3/10)
 - Fix HBitmap copy loop index (4/10)
 - Remove redundant ternary (5/10)
 - Shift up the block_dirty_bitmap_lookup function (6/10)
 - Error messages cleanup (7/10)
 - Add an assertion to explain the re-use of .prepare() for two transactions.
   (8/10)
 - Removed BDS argument from bitmap enable/disable helper; it was unused. (8/10)

v7: (highlights)
 - First version being authored by jsnow
 - Addressed most list feedback from V6, many small changes.
   All feedback was either addressed on-list (as a wontfix) or patched.
 - Replaced all error_set with error_setg
 - Replaced all bdrv_find with bdrv_lookup_bs()
 - Adjusted the max granularity to share a common function with
   backup/mirror that attempts to "guess" a reasonable default.
   It clamps between [4K,64K] currently.
 - The BdrvDirtyBitmap object now counts granularity exclusively in
   bytes to match its interface.
   It leaves the sector granularity concerns to HBitmap.
 - Reworked the backup loop to utilize the hbitmap iterator.
   There are some extra concerns to handle arrhythmic cases where the
   granularity of the bitmap does not match the backup cluster size.
   This iteration works best when it does match, but it's not a
   deal-breaker if it doesn't -- it just gets less efficient.
 - Reworked the transactional functions so that abort() wouldn't "undo"
   a redundant command. They now have been split into a prepare and a
   commit function (with state) and do not provide an abort command.
 - Added a block_dirty_bitmap_lookup(device, name, errp) function to
   shorten a few of the commands added in this series, particularly
   qmp_enable, qmp_disable, and the transaction preparations.

v6: Re-send of v5.

v5: Rebase to master.

v4: Last version tailored by Fam Zheng.

Fam Zheng (10):
  qapi: Add optional field "name" to block dirty bitmap
  qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove
  block: Introduce bdrv_dirty_bitmap_granularity()
  hbitmap: Add hbitmap_copy
  block: Add bdrv_copy_dirty_bitmap and bdrv_reset_dirty_bitmap
  qmp: Add block-dirty-bitmap-enable and block-dirty-bitmap-disable
  qmp: Add support of "dirty-bitmap" sync mode for drive-backup
  qapi: Add transaction support to block-dirty-bitmap-{add, enable,
disable}
  qmp: Add dirty bitmap 'enabled' field in query-block
  qemu-iotests: Add tests for drive-backup sync=dirty-bitmap

 block-migration.c |   2 +-
 block.c   | 114 --
 block/backup.c| 130 +
 block/mirror.c|  16 ++--
 blockdev.c| 218 +-
 hmp.c |   4 +-
 include/block/block.h |  17 +++-
 include/block/block_int.h |   6 ++
 include/qemu/hbitmap.h|   8 ++
 qapi-schema.json  |   5 +-
 qapi/block-core.json  | 120 ++-
 qmp-commands.hx   |  66 -
 tests/qemu-iotests/056|  33 ++-
 tests/qemu-iotests/056.out|   4 +-
 tests/qemu-iotests/iotests.py |   8 ++
 util/hbitmap.c|  16 
 16 files changed, 712 insertions(+), 55 deletions(-)

-- 
1.9.3




[Qemu-devel] [PATCH v8 02/10] qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove

2014-11-26 Thread John Snow
From: Fam Zheng 

The new command pair is added to manage user created dirty bitmap. The
dirty bitmap's name is mandatory and must be unique for the same device,
but different devices can have bitmaps with the same names.

The granularity is an optional field. If it is not specified, we will
choose a default granularity based on the cluster size if available,
clamped to between 4K and 64K (To mirror how the 'mirror' code was
already choosing granularity.) If we do not have cluster size info
available, we choose 64K. This code has been factored out into helper
shared with block/mirror.

The types added to block-core.json will be re-used in future patches
in this series, see:
'qapi: Add transaction support to block-dirty-bitmap-{add, enable, disable}'

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 block.c   | 19 ++
 block/mirror.c| 10 +-
 blockdev.c| 54 ++
 include/block/block.h |  1 +
 qapi/block-core.json  | 55 +++
 qmp-commands.hx   | 49 +
 6 files changed, 179 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index f94b753..c794316 100644
--- a/block.c
+++ b/block.c
@@ -5385,6 +5385,25 @@ int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap 
*bitmap, int64_t sector
 }
 }
 
+#define BDB_MIN_DEF_GRANULARITY 4096
+#define BDB_MAX_DEF_GRANULARITY 65536
+#define BDB_DEFAULT_GRANULARITY BDB_MAX_DEF_GRANULARITY
+
+uint64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs)
+{
+BlockDriverInfo bdi;
+uint64_t granularity;
+
+if (bdrv_get_info(bs, &bdi) >= 0 && bdi.cluster_size != 0) {
+granularity = MAX(BDB_MIN_DEF_GRANULARITY, bdi.cluster_size);
+granularity = MIN(BDB_MAX_DEF_GRANULARITY, granularity);
+} else {
+granularity = BDB_DEFAULT_GRANULARITY;
+}
+
+return granularity;
+}
+
 void bdrv_dirty_iter_init(BlockDriverState *bs,
   BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
 {
diff --git a/block/mirror.c b/block/mirror.c
index 858e4ff..3633632 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -664,15 +664,7 @@ static void mirror_start_job(BlockDriverState *bs, 
BlockDriverState *target,
 MirrorBlockJob *s;
 
 if (granularity == 0) {
-/* Choose the default granularity based on the target file's cluster
- * size, clamped between 4k and 64k.  */
-BlockDriverInfo bdi;
-if (bdrv_get_info(target, &bdi) >= 0 && bdi.cluster_size != 0) {
-granularity = MAX(4096, bdi.cluster_size);
-granularity = MIN(65536, granularity);
-} else {
-granularity = 65536;
-}
+granularity = bdrv_dbm_calc_def_granularity(target);
 }
 
 assert ((granularity & (granularity - 1)) == 0);
diff --git a/blockdev.c b/blockdev.c
index 57910b8..af1152f 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1810,6 +1810,60 @@ void qmp_block_set_io_throttle(const char *device, 
int64_t bps, int64_t bps_rd,
 aio_context_release(aio_context);
 }
 
+void qmp_block_dirty_bitmap_add(const char *device, const char *name,
+bool has_granularity, int64_t granularity,
+Error **errp)
+{
+BlockDriverState *bs;
+
+bs = bdrv_lookup_bs(device, NULL, errp);
+if (!bs) {
+return;
+}
+
+if (!name || name[0] == '\0') {
+error_setg(errp, "Bitmap name cannot be empty");
+return;
+}
+if (has_granularity) {
+if (granularity < 512 || !is_power_of_2(granularity)) {
+error_setg(errp, "Granularity must be power of 2 "
+ "and at least 512");
+return;
+}
+} else {
+/* Default to cluster size, if available: */
+granularity = bdrv_dbm_calc_def_granularity(bs);
+}
+
+bdrv_create_dirty_bitmap(bs, granularity, name, errp);
+}
+
+void qmp_block_dirty_bitmap_remove(const char *device, const char *name,
+   Error **errp)
+{
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+
+bs = bdrv_lookup_bs(device, NULL, errp);
+if (!bs) {
+return;
+}
+
+if (!name || name[0] == '\0') {
+error_setg(errp, "Bitmap name cannot be empty");
+return;
+}
+bitmap = bdrv_find_dirty_bitmap(bs, name);
+if (!bitmap) {
+error_setg(errp, "Dirty bitmap not found: %s", name);
+return;
+}
+
+bdrv_dirty_bitmap_make_anon(bs, bitmap);
+bdrv_release_dirty_bitmap(bs, bitmap);
+}
+
 int do_drive_del(Monitor *mon, const QDict *qdict, QObject **ret_data)
 {
 const char *id = qdict_get_str(qdict, "id");
diff --git a/include/block/block.h b/include/block/block.h
index 977f7b5..3579e7f 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -437,6 +437,7 @@ BdrvDirtyBitmap *bdrv_find_dirty_bit

[Qemu-devel] [PATCH v8 09/10] qmp: Add dirty bitmap 'enabled' field in query-block

2014-11-26 Thread John Snow
From: Fam Zheng 

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
 block.c  | 1 +
 qapi/block-core.json | 5 -
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index e341844..3ad1af3 100644
--- a/block.c
+++ b/block.c
@@ -5404,6 +5404,7 @@ BlockDirtyInfoList 
*bdrv_query_dirty_bitmaps(BlockDriverState *bs)
 info->granularity = bdrv_dirty_bitmap_granularity(bs, bm);
 info->has_name = !!bm->name;
 info->name = g_strdup(bm->name);
+info->enabled = bm->enabled;
 entry->value = info;
 *plist = entry;
 plist = &entry->next;
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 3ca9566..b948dcc 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -310,10 +310,13 @@
 #
 # @granularity: granularity of the dirty bitmap in bytes (since 1.4)
 #
+# @enabled: whether the dirty bitmap is enabled (Since 2.3)
+#
 # Since: 1.3
 ##
 { 'type': 'BlockDirtyInfo',
-  'data': {'*name': 'str', 'count': 'int', 'granularity': 'int'} }
+  'data': {'*name': 'str', 'count': 'int', 'granularity': 'int',
+   'enabled': 'bool'} }
 
 ##
 # @BlockInfo:
-- 
1.9.3




[Qemu-devel] [PATCH RFC v3 12/12] s390x/virtio-ccw: enable virtio 1.0

2014-11-26 Thread Cornelia Huck
virtio-ccw should now have everything in place to operate virtio 1.0
devices, so let's enable revision 1.

Signed-off-by: Cornelia Huck 
---
 hw/s390x/virtio-ccw.h |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/s390x/virtio-ccw.h b/hw/s390x/virtio-ccw.h
index 03d5955..08edd8d 100644
--- a/hw/s390x/virtio-ccw.h
+++ b/hw/s390x/virtio-ccw.h
@@ -73,7 +73,7 @@ typedef struct VirtIOCCWDeviceClass {
 #define VIRTIO_CCW_FEATURE_SIZE NR_VIRTIO_FEATURE_WORDS
 
 /* The maximum virtio revision we support. */
-#define VIRTIO_CCW_REV_MAX 0
+#define VIRTIO_CCW_REV_MAX 1
 
 /* Performance improves when virtqueue kick processing is decoupled from the
  * vcpu thread using ioeventfd for some devices. */
-- 
1.7.9.5




[Qemu-devel] [PATCH RFC v3 08/12] s390x/css: Add a callback for when subchannel gets disabled

2014-11-26 Thread Cornelia Huck
From: Thomas Huth 

We need a possibility to run code when a subchannel gets disabled.
This patch adds the necessary infrastructure.

Signed-off-by: Thomas Huth 
Signed-off-by: Cornelia Huck 
---
 hw/s390x/css.c |   12 
 hw/s390x/css.h |1 +
 2 files changed, 13 insertions(+)

diff --git a/hw/s390x/css.c b/hw/s390x/css.c
index b67c039..735ec55 100644
--- a/hw/s390x/css.c
+++ b/hw/s390x/css.c
@@ -588,6 +588,7 @@ int css_do_msch(SubchDev *sch, SCHIB *orig_schib)
 {
 SCSW *s = &sch->curr_status.scsw;
 PMCW *p = &sch->curr_status.pmcw;
+uint16_t oldflags;
 int ret;
 SCHIB schib;
 
@@ -610,6 +611,7 @@ int css_do_msch(SubchDev *sch, SCHIB *orig_schib)
 copy_schib_from_guest(&schib, orig_schib);
 /* Only update the program-modifiable fields. */
 p->intparm = schib.pmcw.intparm;
+oldflags = p->flags;
 p->flags &= ~(PMCW_FLAGS_MASK_ISC | PMCW_FLAGS_MASK_ENA |
   PMCW_FLAGS_MASK_LM | PMCW_FLAGS_MASK_MME |
   PMCW_FLAGS_MASK_MP);
@@ -625,6 +627,12 @@ int css_do_msch(SubchDev *sch, SCHIB *orig_schib)
 (PMCW_CHARS_MASK_MBFC | PMCW_CHARS_MASK_CSENSE);
 sch->curr_status.mba = schib.mba;
 
+/* Has the channel been disabled? */
+if (sch->disable_cb && (oldflags & PMCW_FLAGS_MASK_ENA) != 0
+&& (p->flags & PMCW_FLAGS_MASK_ENA) == 0) {
+sch->disable_cb(sch);
+}
+
 ret = 0;
 
 out:
@@ -1443,6 +1451,10 @@ void css_reset_sch(SubchDev *sch)
 {
 PMCW *p = &sch->curr_status.pmcw;
 
+if ((p->flags & PMCW_FLAGS_MASK_ENA) != 0 && sch->disable_cb) {
+sch->disable_cb(sch);
+}
+
 p->intparm = 0;
 p->flags &= ~(PMCW_FLAGS_MASK_ISC | PMCW_FLAGS_MASK_ENA |
   PMCW_FLAGS_MASK_LM | PMCW_FLAGS_MASK_MME |
diff --git a/hw/s390x/css.h b/hw/s390x/css.h
index 33104ac..7fa807b 100644
--- a/hw/s390x/css.h
+++ b/hw/s390x/css.h
@@ -81,6 +81,7 @@ struct SubchDev {
 uint8_t ccw_no_data_cnt;
 /* transport-provided data: */
 int (*ccw_cb) (SubchDev *, CCW1);
+void (*disable_cb)(SubchDev *);
 SenseId id;
 void *driver_data;
 };
-- 
1.7.9.5




[Qemu-devel] [PATCH RFC v3 07/12] dataplane: allow virtio-1 devices

2014-11-26 Thread Cornelia Huck
Handle endianness conversion for virtio-1 virtqueues correctly.

Note that dataplane now needs to be built per-target.

Signed-off-by: Cornelia Huck 
---
 hw/block/dataplane/virtio-blk.c   |4 +-
 hw/scsi/virtio-scsi-dataplane.c   |2 +-
 hw/virtio/Makefile.objs   |2 +-
 hw/virtio/dataplane/Makefile.objs |2 +-
 hw/virtio/dataplane/vring.c   |   86 ++---
 include/hw/virtio/dataplane/vring-accessors.h |   75 +
 include/hw/virtio/dataplane/vring.h   |   14 +---
 7 files changed, 131 insertions(+), 54 deletions(-)
 create mode 100644 include/hw/virtio/dataplane/vring-accessors.h

diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c
index 1222a37..2d8cc15 100644
--- a/hw/block/dataplane/virtio-blk.c
+++ b/hw/block/dataplane/virtio-blk.c
@@ -16,7 +16,9 @@
 #include "qemu/iov.h"
 #include "qemu/thread.h"
 #include "qemu/error-report.h"
+#include "hw/virtio/virtio-access.h"
 #include "hw/virtio/dataplane/vring.h"
+#include "hw/virtio/dataplane/vring-accessors.h"
 #include "sysemu/block-backend.h"
 #include "hw/virtio/virtio-blk.h"
 #include "virtio-blk.h"
@@ -75,7 +77,7 @@ static void complete_request_vring(VirtIOBlockReq *req, 
unsigned char status)
 VirtIOBlockDataPlane *s = req->dev->dataplane;
 stb_p(&req->in->status, status);
 
-vring_push(&req->dev->dataplane->vring, &req->elem,
+vring_push(s->vdev, &req->dev->dataplane->vring, &req->elem,
req->qiov.size + sizeof(*req->in));
 
 /* Suppress notification to guest by BH and its scheduled
diff --git a/hw/scsi/virtio-scsi-dataplane.c b/hw/scsi/virtio-scsi-dataplane.c
index 03a1e8c..418d73b 100644
--- a/hw/scsi/virtio-scsi-dataplane.c
+++ b/hw/scsi/virtio-scsi-dataplane.c
@@ -94,7 +94,7 @@ void virtio_scsi_vring_push_notify(VirtIOSCSIReq *req)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(req->vring->parent);
 
-vring_push(&req->vring->vring, &req->elem,
+vring_push(vdev, &req->vring->vring, &req->elem,
req->qsgl.size + req->resp_iov.size);
 
 if (vring_should_notify(vdev, &req->vring->vring)) {
diff --git a/hw/virtio/Makefile.objs b/hw/virtio/Makefile.objs
index d21c397..19b224a 100644
--- a/hw/virtio/Makefile.objs
+++ b/hw/virtio/Makefile.objs
@@ -2,7 +2,7 @@ common-obj-y += virtio-rng.o
 common-obj-$(CONFIG_VIRTIO_PCI) += virtio-pci.o
 common-obj-y += virtio-bus.o
 common-obj-y += virtio-mmio.o
-common-obj-$(CONFIG_VIRTIO) += dataplane/
+obj-$(CONFIG_VIRTIO) += dataplane/
 
 obj-y += virtio.o virtio-balloon.o 
 obj-$(CONFIG_LINUX) += vhost.o vhost-backend.o vhost-user.o
diff --git a/hw/virtio/dataplane/Makefile.objs 
b/hw/virtio/dataplane/Makefile.objs
index 9a8cfc0..753a9ca 100644
--- a/hw/virtio/dataplane/Makefile.objs
+++ b/hw/virtio/dataplane/Makefile.objs
@@ -1 +1 @@
-common-obj-y += vring.o
+obj-y += vring.o
diff --git a/hw/virtio/dataplane/vring.c b/hw/virtio/dataplane/vring.c
index a051775..0da8d6b 100644
--- a/hw/virtio/dataplane/vring.c
+++ b/hw/virtio/dataplane/vring.c
@@ -18,7 +18,9 @@
 #include "hw/hw.h"
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
+#include "hw/virtio/virtio-access.h"
 #include "hw/virtio/dataplane/vring.h"
+#include "hw/virtio/dataplane/vring-accessors.h"
 #include "qemu/error-report.h"
 
 /* vring_map can be coupled with vring_unmap or (if you still have the
@@ -83,7 +85,7 @@ bool vring_setup(Vring *vring, VirtIODevice *vdev, int n)
 vring_init(&vring->vr, virtio_queue_get_num(vdev, n), vring_ptr, 4096);
 
 vring->last_avail_idx = virtio_queue_get_last_avail_idx(vdev, n);
-vring->last_used_idx = vring->vr.used->idx;
+vring->last_used_idx = vring_get_used_idx(vdev, vring);
 vring->signalled_used = 0;
 vring->signalled_used_valid = false;
 
@@ -104,7 +106,7 @@ void vring_teardown(Vring *vring, VirtIODevice *vdev, int n)
 void vring_disable_notification(VirtIODevice *vdev, Vring *vring)
 {
 if (!(vdev->guest_features[0] & (1 << VIRTIO_RING_F_EVENT_IDX))) {
-vring->vr.used->flags |= VRING_USED_F_NO_NOTIFY;
+vring_set_used_flags(vdev, vring, VRING_USED_F_NO_NOTIFY);
 }
 }
 
@@ -117,10 +119,10 @@ bool vring_enable_notification(VirtIODevice *vdev, Vring 
*vring)
 if (vdev->guest_features[0] & (1 << VIRTIO_RING_F_EVENT_IDX)) {
 vring_avail_event(&vring->vr) = vring->vr.avail->idx;
 } else {
-vring->vr.used->flags &= ~VRING_USED_F_NO_NOTIFY;
+vring_clear_used_flags(vdev, vring, VRING_USED_F_NO_NOTIFY);
 }
 smp_mb(); /* ensure update is seen before reading avail_idx */
-return !vring_more_avail(vring);
+return !vring_more_avail(vdev, vring);
 }
 
 /* This is stolen from linux/drivers/vhost/vhost.c:vhost_notify() */
@@ -134,12 +136,13 @@ bool vring_should_notify(VirtIODevice *vdev, Vring *vring)
 smp_mb();
 
 if ((vdev->guest_features[0] & VIRTIO_F_NOTIFY_ON_EMPTY) &&
-unlikely(vring->vr.avail->idx

[Qemu-devel] [PATCH RFC v3 06/12] virtio: allow virtio-1 queue layout

2014-11-26 Thread Cornelia Huck
For virtio-1 devices, we allow a more complex queue layout that doesn't
require descriptor table and rings on a physically-contigous memory area:
add virtio_queue_set_rings() to allow transports to set this up.

Signed-off-by: Cornelia Huck 
---
 hw/virtio/virtio.c |   16 
 include/hw/virtio/virtio.h |2 ++
 2 files changed, 18 insertions(+)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 4149f45..2c6bb91 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -96,6 +96,13 @@ static void virtqueue_init(VirtQueue *vq)
 {
 hwaddr pa = vq->pa;
 
+if (pa == -1ULL) {
+/*
+ * This is a virtio-1 style vq that has already been setup
+ * in virtio_queue_set.
+ */
+return;
+}
 vq->vring.desc = pa;
 vq->vring.avail = pa + vq->vring.num * sizeof(VRingDesc);
 vq->vring.used = vring_align(vq->vring.avail +
@@ -719,6 +726,15 @@ hwaddr virtio_queue_get_addr(VirtIODevice *vdev, int n)
 return vdev->vq[n].pa;
 }
 
+void virtio_queue_set_rings(VirtIODevice *vdev, int n, hwaddr desc,
+hwaddr avail, hwaddr used)
+{
+vdev->vq[n].pa = -1ULL;
+vdev->vq[n].vring.desc = desc;
+vdev->vq[n].vring.avail = avail;
+vdev->vq[n].vring.used = used;
+}
+
 void virtio_queue_set_num(VirtIODevice *vdev, int n, int num)
 {
 /* Don't allow guest to flip queue between existent and
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index 40e567c..f840320 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -227,6 +227,8 @@ void virtio_queue_set_addr(VirtIODevice *vdev, int n, 
hwaddr addr);
 hwaddr virtio_queue_get_addr(VirtIODevice *vdev, int n);
 void virtio_queue_set_num(VirtIODevice *vdev, int n, int num);
 int virtio_queue_get_num(VirtIODevice *vdev, int n);
+void virtio_queue_set_rings(VirtIODevice *vdev, int n, hwaddr desc,
+hwaddr avail, hwaddr used);
 void virtio_queue_set_align(VirtIODevice *vdev, int n, int align);
 void virtio_queue_notify(VirtIODevice *vdev, int n);
 uint16_t virtio_queue_vector(VirtIODevice *vdev, int n);
-- 
1.7.9.5




[Qemu-devel] [PATCH RFC v3 09/12] s390x/virtio-ccw: add virtio set-revision call

2014-11-26 Thread Cornelia Huck
From: Thomas Huth 

Handle the virtio-ccw revision according to what the guest sets.
When revision 1 is selected, we have a virtio-1 standard device
with byteswapping for the virtio rings.

When a channel gets disabled, we have to revert to the legacy behavior
in case the next user of the device does not negotiate the revision 1
anymore (e.g. the boot firmware uses revision 1, but the operating
system only uses the legacy mode).

Note that revisions > 0 are still disabled; but we still extend the
feature bit size to be able to handle the VERSION_1 bit.

Signed-off-by: Thomas Huth 
Signed-off-by: Cornelia Huck 
---
 hw/s390x/virtio-ccw.c |   52 +
 hw/s390x/virtio-ccw.h |7 ++-
 2 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 62ec852..e79f3b8 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -20,9 +20,11 @@
 #include "hw/virtio/virtio-net.h"
 #include "hw/sysbus.h"
 #include "qemu/bitops.h"
+#include "hw/virtio/virtio-access.h"
 #include "hw/virtio/virtio-bus.h"
 #include "hw/s390x/adapter.h"
 #include "hw/s390x/s390_flic.h"
+#include "linux/virtio_config.h"
 
 #include "ioinst.h"
 #include "css.h"
@@ -260,6 +262,12 @@ typedef struct VirtioThinintInfo {
 uint8_t isc;
 } QEMU_PACKED VirtioThinintInfo;
 
+typedef struct VirtioRevInfo {
+uint16_t revision;
+uint16_t length;
+uint8_t data[0];
+} QEMU_PACKED VirtioRevInfo;
+
 /* Specify where the virtqueues for the subchannel are in guest memory. */
 static int virtio_ccw_set_vqs(SubchDev *sch, uint64_t addr, uint32_t align,
   uint16_t index, uint16_t num)
@@ -299,6 +307,7 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 {
 int ret;
 VqInfoBlock info;
+VirtioRevInfo revinfo;
 uint8_t status;
 VirtioFeatDesc features;
 void *config;
@@ -373,6 +382,13 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
ccw.cda + sizeof(features.features));
 if (features.index < ARRAY_SIZE(dev->host_features)) {
 features.features = dev->host_features[features.index];
+/*
+ * Don't offer version 1 to the guest if it did not
+ * negotiate at least revision 1.
+ */
+if (features.index == 1 && dev->revision <= 0) {
+features.features &= ~(1 << (VIRTIO_F_VERSION_1 - 32));
+}
 } else {
 /* Return zeroes if the guest supports more feature bits. */
 features.features = 0;
@@ -400,6 +416,13 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
ccw.cda + sizeof(features.features));
 features.features = ldl_le_phys(&address_space_memory, ccw.cda);
 if (features.index < ARRAY_SIZE(vdev->guest_features)) {
+/*
+ * The guest should not set version 1 if it didn't
+ * negotiate a revision >= 1.
+ */
+if (features.index == 1 && dev->revision <= 0) {
+features.features &= ~(1 << (VIRTIO_F_VERSION_1 - 32));
+}
 virtio_set_features(vdev, features.index, features.features);
 } else {
 /*
@@ -600,6 +623,25 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 }
 }
 break;
+case CCW_CMD_SET_VIRTIO_REV:
+len = sizeof(revinfo);
+if (ccw.count < len || (check_len && ccw.count > len)) {
+ret = -EINVAL;
+break;
+}
+if (!ccw.cda) {
+ret = -EFAULT;
+break;
+}
+cpu_physical_memory_read(ccw.cda, &revinfo, len);
+if (dev->revision >= 0 ||
+revinfo.revision > VIRTIO_CCW_REV_MAX) {
+ret = -ENOSYS;
+break;
+}
+ret = 0;
+dev->revision = revinfo.revision;
+break;
 default:
 ret = -ENOSYS;
 break;
@@ -607,6 +649,13 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 return ret;
 }
 
+static void virtio_sch_disable_cb(SubchDev *sch)
+{
+VirtioCcwDevice *dev = sch->driver_data;
+
+dev->revision = -1;
+}
+
 static int virtio_ccw_device_init(VirtioCcwDevice *dev, VirtIODevice *vdev)
 {
 unsigned int cssid = 0;
@@ -733,6 +782,7 @@ static int virtio_ccw_device_init(VirtioCcwDevice *dev, 
VirtIODevice *vdev)
 css_sch_build_virtual_schib(sch, 0, VIRTIO_CCW_CHPID_TYPE);
 
 sch->ccw_cb = virtio_ccw_cb;
+sch->disable_cb = virtio_sch_disable_cb;
 
 /* Build senseid data. */
 memset(&sch->id, 0, sizeof(SenseId));
@@ -740,6 +790,8 @@ static int virtio_ccw_device_init(VirtioCcwDevice *dev, 
VirtIODevice *vdev)
 sch->id.cu_type = VIRTIO_CCW_CU_TYPE;
 sch->id.cu_model = vdev->device_id;
 
+dev->revision = -1;
+
 

[Qemu-devel] [PATCH RFC v3 05/12] virtio: introduce legacy virtio devices

2014-11-26 Thread Cornelia Huck
Introduce a helper function to indicate  whether a virtio device is
operating in legacy or virtio standard mode.

It may be used to make decisions about the endianess of virtio accesses
and other virtio-1 specific changes, enabling us to support transitional
devices.

Reviewed-by: Thomas Huth 
Signed-off-by: Cornelia Huck 
---
 hw/virtio/virtio.c|6 +-
 include/hw/virtio/virtio-access.h |4 
 include/hw/virtio/virtio.h|   13 +++--
 3 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 2eb5d3c..4149f45 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -883,7 +883,11 @@ static bool virtio_device_endian_needed(void *opaque)
 VirtIODevice *vdev = opaque;
 
 assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
-return vdev->device_endian != virtio_default_endian();
+if (virtio_device_is_legacy(vdev)) {
+return vdev->device_endian != virtio_default_endian();
+}
+/* Devices conforming to VIRTIO 1.0 or later are always LE. */
+return vdev->device_endian != VIRTIO_DEVICE_ENDIAN_LITTLE;
 }
 
 static const VMStateDescription vmstate_virtio_device_endian = {
diff --git a/include/hw/virtio/virtio-access.h 
b/include/hw/virtio/virtio-access.h
index 46456fd..c123ee0 100644
--- a/include/hw/virtio/virtio-access.h
+++ b/include/hw/virtio/virtio-access.h
@@ -19,6 +19,10 @@
 
 static inline bool virtio_access_is_big_endian(VirtIODevice *vdev)
 {
+if (!virtio_device_is_legacy(vdev)) {
+/* Devices conforming to VIRTIO 1.0 or later are always LE. */
+return false;
+}
 #if defined(TARGET_IS_BIENDIAN)
 return virtio_is_big_endian(vdev);
 #elif defined(TARGET_WORDS_BIGENDIAN)
diff --git a/include/hw/virtio/virtio.h b/include/hw/virtio/virtio.h
index b408166..40e567c 100644
--- a/include/hw/virtio/virtio.h
+++ b/include/hw/virtio/virtio.h
@@ -275,9 +275,18 @@ void virtio_queue_set_host_notifier_fd_handler(VirtQueue 
*vq, bool assign,
 void virtio_queue_notify_vq(VirtQueue *vq);
 void virtio_irq(VirtQueue *vq);
 
+static inline bool virtio_device_is_legacy(VirtIODevice *vdev)
+{
+return !(vdev->guest_features[1] & (1 << (VIRTIO_F_VERSION_1 - 32)));
+}
+
 static inline bool virtio_is_big_endian(VirtIODevice *vdev)
 {
-assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
-return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
+if (virtio_device_is_legacy(vdev)) {
+assert(vdev->device_endian != VIRTIO_DEVICE_ENDIAN_UNKNOWN);
+return vdev->device_endian == VIRTIO_DEVICE_ENDIAN_BIG;
+}
+/* Devices conforming to VIRTIO 1.0 or later are always LE. */
+return false;
 }
 #endif
-- 
1.7.9.5




[Qemu-devel] [PATCH RFC v3 04/12] s390x/virtio-ccw: fix check for WRITE_FEAT

2014-11-26 Thread Cornelia Huck
We need to check guest feature size, not host feature size to find
out whether we should call virtio_set_features(). This check is
possible now that vdev->guest_features is an array.

Reviewed-by: Thomas Huth 
Signed-off-by: Cornelia Huck 
---
 hw/s390x/virtio-ccw.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index 2f52b82..62ec852 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -399,7 +399,7 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
 features.index = ldub_phys(&address_space_memory,
ccw.cda + sizeof(features.features));
 features.features = ldl_le_phys(&address_space_memory, ccw.cda);
-if (features.index < ARRAY_SIZE(dev->host_features)) {
+if (features.index < ARRAY_SIZE(vdev->guest_features)) {
 virtio_set_features(vdev, features.index, features.features);
 } else {
 /*
-- 
1.7.9.5




[Qemu-devel] [PATCH RFC v3 02/12] virtio: cull virtio_bus_set_vdev_features

2014-11-26 Thread Cornelia Huck
The only user of this function was virtio-ccw, and it should use
virtio_set_features() like everybody else: We need to make sure
that bad features are masked out properly, which this function did
not do.

Reviewed-by: Thomas Huth 
Signed-off-by: Cornelia Huck 
---
 hw/s390x/virtio-ccw.c  |3 +--
 hw/virtio/virtio-bus.c |   14 --
 include/hw/virtio/virtio-bus.h |3 ---
 3 files changed, 1 insertion(+), 19 deletions(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index ea236c9..84f17bc 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -400,8 +400,7 @@ static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
ccw.cda + sizeof(features.features));
 features.features = ldl_le_phys(&address_space_memory, ccw.cda);
 if (features.index < ARRAY_SIZE(dev->host_features)) {
-virtio_bus_set_vdev_features(&dev->bus, features.features);
-vdev->guest_features = features.features;
+virtio_set_features(vdev, features.features);
 } else {
 /*
  * If the guest supports more feature bits, assert that it
diff --git a/hw/virtio/virtio-bus.c b/hw/virtio/virtio-bus.c
index eb77019..a8ffa07 100644
--- a/hw/virtio/virtio-bus.c
+++ b/hw/virtio/virtio-bus.c
@@ -109,20 +109,6 @@ uint32_t virtio_bus_get_vdev_features(VirtioBusState *bus,
 return k->get_features(vdev, requested_features);
 }
 
-/* Set the features of the plugged device. */
-void virtio_bus_set_vdev_features(VirtioBusState *bus,
-  uint32_t requested_features)
-{
-VirtIODevice *vdev = virtio_bus_get_device(bus);
-VirtioDeviceClass *k;
-
-assert(vdev != NULL);
-k = VIRTIO_DEVICE_GET_CLASS(vdev);
-if (k->set_features != NULL) {
-k->set_features(vdev, requested_features);
-}
-}
-
 /* Get bad features of the plugged device. */
 uint32_t virtio_bus_get_vdev_bad_features(VirtioBusState *bus)
 {
diff --git a/include/hw/virtio/virtio-bus.h b/include/hw/virtio/virtio-bus.h
index 0756545..0d2e7b4 100644
--- a/include/hw/virtio/virtio-bus.h
+++ b/include/hw/virtio/virtio-bus.h
@@ -84,9 +84,6 @@ size_t virtio_bus_get_vdev_config_len(VirtioBusState *bus);
 /* Get the features of the plugged device. */
 uint32_t virtio_bus_get_vdev_features(VirtioBusState *bus,
 uint32_t requested_features);
-/* Set the features of the plugged device. */
-void virtio_bus_set_vdev_features(VirtioBusState *bus,
-  uint32_t requested_features);
 /* Get bad features of the plugged device. */
 uint32_t virtio_bus_get_vdev_bad_features(VirtioBusState *bus);
 /* Get config of the plugged device. */
-- 
1.7.9.5




[Qemu-devel] [PATCH RFC v3 03/12] virtio: support more feature bits

2014-11-26 Thread Cornelia Huck
With virtio-1, we support more than 32 feature bits. Let's make
vdev->guest_features depend on the number of supported feature bits,
allowing us to grow the feature bits automatically.

We also need to enhance the internal functions dealing with getting
and setting features with an additional index field, so that all feature
bits may be accessed (in chunks of 32 bits).

vhost and migration have been ignored for now.

Reviewed-by: Thomas Huth 
Signed-off-by: Cornelia Huck 
---
 hw/9pfs/virtio-9p-device.c |7 ++-
 hw/block/virtio-blk.c  |9 +++--
 hw/char/virtio-serial-bus.c|9 +++--
 hw/net/virtio-net.c|   38 ++
 hw/s390x/s390-virtio-bus.c |9 +
 hw/s390x/virtio-ccw.c  |   17 ++---
 hw/scsi/vhost-scsi.c   |7 +--
 hw/scsi/virtio-scsi.c  |   10 +-
 hw/virtio/dataplane/vring.c|   10 +-
 hw/virtio/virtio-balloon.c |8 ++--
 hw/virtio/virtio-bus.c |9 +
 hw/virtio/virtio-mmio.c|9 +
 hw/virtio/virtio-pci.c |   13 +++--
 hw/virtio/virtio-rng.c |2 +-
 hw/virtio/virtio.c |   29 +
 include/hw/virtio/virtio-bus.h |7 ---
 include/hw/virtio/virtio.h |   19 ++-
 17 files changed, 135 insertions(+), 77 deletions(-)

diff --git a/hw/9pfs/virtio-9p-device.c b/hw/9pfs/virtio-9p-device.c
index 2572747..c29c8c8 100644
--- a/hw/9pfs/virtio-9p-device.c
+++ b/hw/9pfs/virtio-9p-device.c
@@ -21,8 +21,13 @@
 #include "virtio-9p-coth.h"
 #include "hw/virtio/virtio-access.h"
 
-static uint32_t virtio_9p_get_features(VirtIODevice *vdev, uint32_t features)
+static uint32_t virtio_9p_get_features(VirtIODevice *vdev, unsigned int index,
+   uint32_t features)
 {
+if (index > 0) {
+return features;
+}
+
 features |= 1 << VIRTIO_9P_MOUNT_TAG;
 return features;
 }
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index b19b102..6d86f60 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -564,10 +564,15 @@ static void virtio_blk_set_config(VirtIODevice *vdev, 
const uint8_t *config)
 aio_context_release(blk_get_aio_context(s->blk));
 }
 
-static uint32_t virtio_blk_get_features(VirtIODevice *vdev, uint32_t features)
+static uint32_t virtio_blk_get_features(VirtIODevice *vdev, unsigned int index,
+uint32_t features)
 {
 VirtIOBlock *s = VIRTIO_BLK(vdev);
 
+if (index > 0) {
+return features;
+}
+
 features |= (1 << VIRTIO_BLK_F_SEG_MAX);
 features |= (1 << VIRTIO_BLK_F_GEOMETRY);
 features |= (1 << VIRTIO_BLK_F_TOPOLOGY);
@@ -601,7 +606,7 @@ static void virtio_blk_set_status(VirtIODevice *vdev, 
uint8_t status)
 return;
 }
 
-features = vdev->guest_features;
+features = vdev->guest_features[0];
 
 /* A guest that supports VIRTIO_BLK_F_CONFIG_WCE must be able to send
  * cache flushes.  Thus, the "auto writethrough" behavior is never
diff --git a/hw/char/virtio-serial-bus.c b/hw/char/virtio-serial-bus.c
index a7b1b68..55de504 100644
--- a/hw/char/virtio-serial-bus.c
+++ b/hw/char/virtio-serial-bus.c
@@ -75,7 +75,7 @@ static VirtIOSerialPort *find_port_by_name(char *name)
 static bool use_multiport(VirtIOSerial *vser)
 {
 VirtIODevice *vdev = VIRTIO_DEVICE(vser);
-return vdev->guest_features & (1 << VIRTIO_CONSOLE_F_MULTIPORT);
+return vdev->guest_features[0] & (1 << VIRTIO_CONSOLE_F_MULTIPORT);
 }
 
 static size_t write_to_port(VirtIOSerialPort *port,
@@ -467,10 +467,15 @@ static void handle_input(VirtIODevice *vdev, VirtQueue 
*vq)
 {
 }
 
-static uint32_t get_features(VirtIODevice *vdev, uint32_t features)
+static uint32_t get_features(VirtIODevice *vdev, unsigned int index,
+ uint32_t features)
 {
 VirtIOSerial *vser;
 
+if (index > 0) {
+return features;
+}
+
 vser = VIRTIO_SERIAL(vdev);
 
 if (vser->bus.max_nr_ports > 1) {
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 9b88775..1e214b5 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -86,7 +86,7 @@ static void virtio_net_set_config(VirtIODevice *vdev, const 
uint8_t *config)
 
 memcpy(&netcfg, config, n->config_size);
 
-if (!(vdev->guest_features >> VIRTIO_NET_F_CTRL_MAC_ADDR & 1) &&
+if (!(vdev->guest_features[0] >> VIRTIO_NET_F_CTRL_MAC_ADDR & 1) &&
 memcmp(netcfg.mac, n->mac, ETH_ALEN)) {
 memcpy(n->mac, netcfg.mac, ETH_ALEN);
 qemu_format_nic_info_str(qemu_get_queue(n->nic), n->mac);
@@ -305,7 +305,7 @@ static RxFilterInfo 
*virtio_net_query_rxfilter(NetClientState *nc)
 info->multicast_table = str_list;
 info->vlan_table = get_vlan_table(n);
 
-if (!((1 << VIRTIO_NET_F_CTRL_VLAN) & vdev->guest_features)) {
+if (!((1 << VIRTIO_NET_F_CTRL_VLAN) & vdev->g

[Qemu-devel] [PATCH RFC v3 11/12] virtio-net/virtio-blk: enable virtio 1.0

2014-11-26 Thread Cornelia Huck
virtio-net (non-vhost) and virtio-blk have everything in place to support
virtio 1.0: let's enable the feature bit for them.

Note that VIRTIO_F_VERSION_1 is technically a transport feature; once
every device is ready for virtio 1.0, we can move this setting this
feature bit out of the individual devices.

Signed-off-by: Cornelia Huck 
---
 hw/block/virtio-blk.c |4 
 hw/net/virtio-net.c   |4 
 2 files changed, 8 insertions(+)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index 6d86f60..3781f98 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -569,6 +569,10 @@ static uint32_t virtio_blk_get_features(VirtIODevice 
*vdev, unsigned int index,
 {
 VirtIOBlock *s = VIRTIO_BLK(vdev);
 
+if (index == 1) {
+features |= (1 << (VIRTIO_F_VERSION_1 - 32));
+}
+
 if (index > 0) {
 return features;
 }
diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index 1e214b5..fcfc95f 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -447,6 +447,10 @@ static uint32_t virtio_net_get_features(VirtIODevice 
*vdev, unsigned int index,
 VirtIONet *n = VIRTIO_NET(vdev);
 NetClientState *nc = qemu_get_queue(n->nic);
 
+if (index == 1 && !get_vhost_net(nc->peer)) {
+features |= (1 << (VIRTIO_F_VERSION_1 - 32));
+}
+
 if (index > 0) {
 return features;
 }
-- 
1.7.9.5




[Qemu-devel] [PATCH RFC v3 00/12] qemu: towards virtio-1 host support

2014-11-26 Thread Cornelia Huck
Next version of virtio-1 patches for qemu.

Only change from v2 is splitting out the vring accessors into a
separate header file - should hopefully fix the build issues.

Cornelia Huck (9):
  virtio: cull virtio_bus_set_vdev_features
  virtio: support more feature bits
  s390x/virtio-ccw: fix check for WRITE_FEAT
  virtio: introduce legacy virtio devices
  virtio: allow virtio-1 queue layout
  dataplane: allow virtio-1 devices
  s390x/virtio-ccw: support virtio-1 set_vq format
  virtio-net/virtio-blk: enable virtio 1.0
  s390x/virtio-ccw: enable virtio 1.0

Thomas Huth (3):
  linux-headers/virtio_config: Update with VIRTIO_F_VERSION_1
  s390x/css: Add a callback for when subchannel gets disabled
  s390x/virtio-ccw: add virtio set-revision call

 hw/9pfs/virtio-9p-device.c|7 +-
 hw/block/dataplane/virtio-blk.c   |4 +-
 hw/block/virtio-blk.c |   13 +-
 hw/char/virtio-serial-bus.c   |9 +-
 hw/net/virtio-net.c   |   42 --
 hw/s390x/css.c|   12 ++
 hw/s390x/css.h|1 +
 hw/s390x/s390-virtio-bus.c|9 +-
 hw/s390x/virtio-ccw.c |  186 +++--
 hw/s390x/virtio-ccw.h |7 +-
 hw/scsi/vhost-scsi.c  |7 +-
 hw/scsi/virtio-scsi-dataplane.c   |2 +-
 hw/scsi/virtio-scsi.c |   10 +-
 hw/virtio/Makefile.objs   |2 +-
 hw/virtio/dataplane/Makefile.objs |2 +-
 hw/virtio/dataplane/vring.c   |   96 +++--
 hw/virtio/virtio-balloon.c|8 +-
 hw/virtio/virtio-bus.c|   23 +--
 hw/virtio/virtio-mmio.c   |9 +-
 hw/virtio/virtio-pci.c|   13 +-
 hw/virtio/virtio-rng.c|2 +-
 hw/virtio/virtio.c|   51 +--
 include/hw/virtio/dataplane/vring-accessors.h |   75 ++
 include/hw/virtio/dataplane/vring.h   |   14 +-
 include/hw/virtio/virtio-access.h |4 +
 include/hw/virtio/virtio-bus.h|   10 +-
 include/hw/virtio/virtio.h|   34 -
 linux-headers/linux/virtio_config.h   |3 +
 28 files changed, 467 insertions(+), 188 deletions(-)
 create mode 100644 include/hw/virtio/dataplane/vring-accessors.h

-- 
1.7.9.5




[Qemu-devel] [PATCH RFC v3 10/12] s390x/virtio-ccw: support virtio-1 set_vq format

2014-11-26 Thread Cornelia Huck
Support the new CCW_CMD_SET_VQ format for virtio-1 devices.

While we're at it, refactor the code a bit and enforce big endian
fields (which had always been required, even for legacy).

Reviewed-by: Thomas Huth 
Signed-off-by: Cornelia Huck 
---
 hw/s390x/virtio-ccw.c |  114 ++---
 1 file changed, 80 insertions(+), 34 deletions(-)

diff --git a/hw/s390x/virtio-ccw.c b/hw/s390x/virtio-ccw.c
index e79f3b8..60d67a3 100644
--- a/hw/s390x/virtio-ccw.c
+++ b/hw/s390x/virtio-ccw.c
@@ -238,11 +238,20 @@ VirtualCssBus *virtual_css_bus_init(void)
 }
 
 /* Communication blocks used by several channel commands. */
-typedef struct VqInfoBlock {
+typedef struct VqInfoBlockLegacy {
 uint64_t queue;
 uint32_t align;
 uint16_t index;
 uint16_t num;
+} QEMU_PACKED VqInfoBlockLegacy;
+
+typedef struct VqInfoBlock {
+uint64_t desc;
+uint32_t res0;
+uint16_t index;
+uint16_t num;
+uint64_t avail;
+uint64_t used;
 } QEMU_PACKED VqInfoBlock;
 
 typedef struct VqConfigBlock {
@@ -269,17 +278,20 @@ typedef struct VirtioRevInfo {
 } QEMU_PACKED VirtioRevInfo;
 
 /* Specify where the virtqueues for the subchannel are in guest memory. */
-static int virtio_ccw_set_vqs(SubchDev *sch, uint64_t addr, uint32_t align,
-  uint16_t index, uint16_t num)
+static int virtio_ccw_set_vqs(SubchDev *sch, VqInfoBlock *info,
+  VqInfoBlockLegacy *linfo)
 {
 VirtIODevice *vdev = virtio_ccw_get_vdev(sch);
+uint16_t index = info ? info->index : linfo->index;
+uint16_t num = info ? info->num : linfo->num;
+uint64_t desc = info ? info->desc : linfo->queue;
 
 if (index > VIRTIO_PCI_QUEUE_MAX) {
 return -EINVAL;
 }
 
 /* Current code in virtio.c relies on 4K alignment. */
-if (addr && (align != 4096)) {
+if (linfo && desc && (linfo->align != 4096)) {
 return -EINVAL;
 }
 
@@ -287,8 +299,12 @@ static int virtio_ccw_set_vqs(SubchDev *sch, uint64_t 
addr, uint32_t align,
 return -EINVAL;
 }
 
-virtio_queue_set_addr(vdev, index, addr);
-if (!addr) {
+if (info) {
+virtio_queue_set_rings(vdev, index, desc, info->avail, info->used);
+} else {
+virtio_queue_set_addr(vdev, index, desc);
+}
+if (!desc) {
 virtio_queue_set_vector(vdev, index, 0);
 } else {
 /* Fail if we don't have a big enough queue. */
@@ -303,10 +319,66 @@ static int virtio_ccw_set_vqs(SubchDev *sch, uint64_t 
addr, uint32_t align,
 return 0;
 }
 
-static int virtio_ccw_cb(SubchDev *sch, CCW1 ccw)
+static int virtio_ccw_handle_set_vq(SubchDev *sch, CCW1 ccw, bool check_len,
+bool is_legacy)
 {
 int ret;
 VqInfoBlock info;
+VqInfoBlockLegacy linfo;
+size_t info_len = is_legacy ? sizeof(linfo) : sizeof(info);
+
+if (check_len) {
+if (ccw.count != info_len) {
+return -EINVAL;
+}
+} else if (ccw.count < info_len) {
+/* Can't execute command. */
+return -EINVAL;
+}
+if (!ccw.cda) {
+return -EFAULT;
+}
+if (is_legacy) {
+linfo.queue = ldq_be_phys(&address_space_memory, ccw.cda);
+linfo.align = ldl_be_phys(&address_space_memory,
+  ccw.cda + sizeof(linfo.queue));
+linfo.index = lduw_be_phys(&address_space_memory,
+   ccw.cda + sizeof(linfo.queue)
+   + sizeof(linfo.align));
+linfo.num = lduw_be_phys(&address_space_memory,
+ ccw.cda + sizeof(linfo.queue)
+ + sizeof(linfo.align)
+ + sizeof(linfo.index));
+ret = virtio_ccw_set_vqs(sch, NULL, &linfo);
+} else {
+info.desc = ldq_be_phys(&address_space_memory, ccw.cda);
+info.index = lduw_be_phys(&address_space_memory,
+  ccw.cda + sizeof(info.desc)
+  + sizeof(info.res0));
+info.num = lduw_be_phys(&address_space_memory,
+ccw.cda + sizeof(info.desc)
+  + sizeof(info.res0)
+  + sizeof(info.index));
+info.avail = ldq_be_phys(&address_space_memory,
+ ccw.cda + sizeof(info.desc)
+ + sizeof(info.res0)
+ + sizeof(info.index)
+ + sizeof(info.num));
+info.used = ldq_be_phys(&address_space_memory,
+ccw.cda + sizeof(info.desc)
++ sizeof(info.res0)
++ sizeof(info.index)
++ sizeof(info.num)
++ sizeof(info.avail));
+ret = virtio_ccw_set_vqs(sch, &info, NULL);
+}
+  

[Qemu-devel] [PATCH RFC v3 01/12] linux-headers/virtio_config: Update with VIRTIO_F_VERSION_1

2014-11-26 Thread Cornelia Huck
From: Thomas Huth 

Add the new VIRTIO_F_VERSION_1 definition to the virtio_config.h
linux header.

Signed-off-by: Thomas Huth 
Signed-off-by: Cornelia Huck 
---
 linux-headers/linux/virtio_config.h |3 +++
 1 file changed, 3 insertions(+)

diff --git a/linux-headers/linux/virtio_config.h 
b/linux-headers/linux/virtio_config.h
index 75dc20b..16aa289 100644
--- a/linux-headers/linux/virtio_config.h
+++ b/linux-headers/linux/virtio_config.h
@@ -54,4 +54,7 @@
 /* Can the device handle any descriptor layout? */
 #define VIRTIO_F_ANY_LAYOUT27
 
+/* v1.0 compliant. */
+#define VIRTIO_F_VERSION_1 32
+
 #endif /* _LINUX_VIRTIO_CONFIG_H */
-- 
1.7.9.5




[Qemu-devel] [PATCH for-2.3] softmmu: clarify meaning of TLB_NOTDIRTY bit

2014-11-26 Thread Stefan Hajnoczi
It wasn't obvious to me why the TLB entries should cache the memory
dirty bitmap state.

Signed-off-by: Stefan Hajnoczi 
---
 include/exec/cpu-all.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/include/exec/cpu-all.h b/include/exec/cpu-all.h
index c085804..e71e47e 100644
--- a/include/exec/cpu-all.h
+++ b/include/exec/cpu-all.h
@@ -329,7 +329,8 @@ extern RAMList ram_list;
 /* Zero if TLB entry is valid.  */
 #define TLB_INVALID_MASK   (1 << 3)
 /* Set if TLB entry references a clean RAM page.  The iotlb entry will
-   contain the page physical address.  */
+   contain the page physical address.  Forces us to take the slow path so pages
+   get marked dirty, whereas the fast path does not mark pages dirty.  */
 #define TLB_NOTDIRTY(1 << 4)
 /* Set if TLB entry is an IO callback.  */
 #define TLB_MMIO(1 << 5)
-- 
2.1.0




Re: [Qemu-devel] [PATCH 1/3] blockdev: Add read-only option to change-blockdev

2014-11-26 Thread Eric Blake
On 11/26/2014 09:36 AM, Max Reitz wrote:

>>>   -qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL,
>>> errp);
>>> +qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, &err);
>>> +
>>> +if (err) {
>>> +if (read_only == BLOCKDEV_CHANGE_READ_ONLY_MODE_AUTO) {
>>> +error_free(err);
>>> +err = NULL;
>>> +
>>> +/* RDWR did not work, try RO now */
>>> +bdrv_flags &= ~BDRV_O_RDWR;
>>> +qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv,
>>> NULL, errp);
>>> +} else {
>>> +error_propagate(errp, err);
>>> +}
>> Umm, why are you propagating the error here manually, when it was
>> previously propagated as part of the fall-through into the out: label?
> 
> Is it? I don't see any error_propagate() after that
> qmp_bdrv_open_encrypted_call()... And also, that call takes "errp" as a
> parameter, so having error_propagate() afterwards would be kind of strange.

Oh, I read too fast; I'm missing the difference between '&err' and
'errp'.  I think we generally use the name 'local_err' instead of plain
'err', so that it is more obvious when we are collecting into
'&local_err' with a need to propagate, vs. reusing the caller's 'errp'
directly because we aren't further checking things.

Okay, your code is correct after all.  The pre-existing confusion of
'err' instead of 'local_err' might be worth fixing if you have a reason
for a respin, but is not itself a reason for me to withhold approval.

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 0/4] blockdev: support dataplane in QMP 'transaction' command

2014-11-26 Thread Stefan Hajnoczi
On Fri, Nov 21, 2014 at 10:48:56AM +, Stefan Hajnoczi wrote:
> These patches make the QMP 'transaction' command work with virtio-blk
> dataplane.  Each 'transaction' action must take care to acquire AioContext
> around BlockDriverState accesses.  Once that protection is in place we can
> unblock the op blockers for these commands.
> 
> The meat is in Patch 3.
> 
> Patches 1 & 2 are minor cleanups.
> Patch 4 protects the external snapshot command (oops, we forgot!).
> 
> Stefan Hajnoczi (4):
>   blockdev: update outdated qmp_transaction() comments
>   blockdev: drop unnecessary DriveBackupState field assignment
>   blockdev: acquire AioContext in QMP 'transaction' actions
>   blockdev: check for BLOCK_OP_TYPE_EXTERNAL_SNAPSHOT
> 
>  blockdev.c  | 74 
> +
>  hw/block/dataplane/virtio-blk.c |  2 ++
>  2 files changed, 63 insertions(+), 13 deletions(-)
> 
> -- 
> 2.1.0
> 
> 

Applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan


pgpd7ET78bXl_.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH v4 3/6] block: JSON filenames and relative backing files

2014-11-26 Thread Eric Blake
On 11/26/2014 09:20 AM, Max Reitz wrote:
> When using a relative backing file name, qemu needs to know the
> directory of the top image file. For JSON filenames, such a directory
> cannot be easily determined (e.g. how do you determine the directory of
> a qcow2 BDS directly on top of a quorum BDS?). Therefore, do not allow
> relative filenames for the backing file of BDSs only having a JSON
> filename.
> 
> Furthermore, BDS::exact_filename should be used whenever possible. If
> BDS::filename is not equal to BDS::exact_filename, the former will
> always be a JSON object.
> 
> Signed-off-by: Max Reitz 
> ---
>  block.c   | 28 ++--
>  block/qapi.c  |  7 ++-
>  include/block/block.h |  5 +++--
>  3 files changed, 31 insertions(+), 9 deletions(-)
> 

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH v4 1/6] checkpatch: Brace handling on multi-line condition

2014-11-26 Thread Eric Blake
On 11/26/2014 09:20 AM, Max Reitz wrote:
> CODING_STYLE states the following about braces around blocks:
> 
>> The opening brace is on the line that contains the control flow
>> statement that introduces the new block; [...]
> 
> This is obviously impossible with multi-line conditions. Therefore,
> CODING_STYLE does not make any clear statement about where to put the
> opening brace after a multi-line condition.

I'll leave review of this one to others (personally, I'm 80:20 against
applying it, but my vote is very weak if you scale it by the percentage
of actual C code contributions existing in the tree under my name - so
I'll live with majority rules).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 1/3] blockdev: Add read-only option to change-blockdev

2014-11-26 Thread Max Reitz

On 2014-11-26 at 17:24, Eric Blake wrote:

On 11/20/2014 05:44 AM, Max Reitz wrote:

Add an option to qmp_change_blockdev() which allows changing the
read-only status of the block device to be changed.

Some drives do not have a inherently fixed read-only status; for
instance, floppy disks can be set read-only or writable independently of
the drive. Some users may find it useful to be able to therefore change
the read-only status of a block device when changing the medium.

Signed-off-by: Max Reitz 
---
  blockdev.c| 41 ++---
  include/sysemu/blockdev.h |  3 ++-
  qapi-schema.json  | 20 
  qmp.c |  3 ++-
  4 files changed, 62 insertions(+), 5 deletions(-)

  
-qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, errp);

+qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, &err);
+
+if (err) {
+if (read_only == BLOCKDEV_CHANGE_READ_ONLY_MODE_AUTO) {
+error_free(err);
+err = NULL;
+
+/* RDWR did not work, try RO now */
+bdrv_flags &= ~BDRV_O_RDWR;
+qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, errp);
+} else {
+error_propagate(errp, err);
+}

Umm, why are you propagating the error here manually, when it was
previously propagated as part of the fall-through into the out: label?


Is it? I don't see any error_propagate() after that 
qmp_bdrv_open_encrypted_call()... And also, that call takes "errp" as a 
parameter, so having error_propagate() afterwards would be kind of strange.


The tree I'm looking at is master, commit 
3ef4ebcc5c0360629bb05f49d9b961774247d6ca. In my block-next tree, which 
this patch is actually based against, commit 
656ddfaafd67af2b9234567e51cd3418588b2ddd.



Particularly since the second open_encrypted call still relies on
fallthrough for propagating the error?  I think this should be
simplified to:

if (err && read_only == BLOCKDEV_CHANGE_READ_ONLY_MODE_AUTO) {
 error_free(err);
 err = NULL;
 /* RDWR did not work, try RO now */
 bdrv_flags &= ~BDRV_O_RDWR;
 qmp_bdrv_open_encrypted(...);
}


But then you're not propagating the error anymore (in case of !err || 
read_only != BLOCKDEV_CHANGE_READ_ONLY_MODE_AUTO). As I said, at least 
in my tree, there is no error_propagate(errp, err); after this block. I 
may add it, though.


(to be completely sure: 
http://git.qemu.org/?p=qemu.git;a=blob;f=blockdev.c;h=57910b82c7adc3ce59173afeeebcd37ff2a3dfd0;hb=3ef4ebcc5c0360629bb05f49d9b961774247d6ca#l1727 
and https://github.com/XanClic/qemu/blob/block-next/blockdev.c#L1760)


Max


Otherwise, looks okay to me.





Re: [Qemu-devel] [PATCH 3/3] hmp: Expose read-only option for 'change'

2014-11-26 Thread Eric Blake
On 11/20/2014 05:44 AM, Max Reitz wrote:
> Expose the new read-only option of qmp_change_blockdev() for the
> 'change' HMP command.
> 
> Signed-off-by: Max Reitz 
> ---
>  hmp-commands.hx | 24 +---
>  hmp.c   | 17 -
>  2 files changed, 37 insertions(+), 4 deletions(-)

Reviewed-by: Eric Blake 

Even if you go with my suggestion of a new QMP command in 2/3, I'm just
fine with extending the existing HMP command without adding a new name
(although this patch would then have to be modified to call into the new
QMP command as appropriate).

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 2/3] qmp: Expose read-only option for 'change'

2014-11-26 Thread Eric Blake
On 11/20/2014 05:44 AM, Max Reitz wrote:
> Expose the new read-only option of qmp_change_blockdev() for the
> 'change' QMP command. Leave it unset for HMP for now.
> 
> Signed-off-by: Max Reitz 
> ---
>  hmp.c|  2 +-
>  qapi-schema.json |  7 ++-
>  qmp-commands.hx  | 24 +++-
>  qmp.c| 15 ---
>  4 files changed, 42 insertions(+), 6 deletions(-)
> 
> diff --git a/hmp.c b/hmp.c
> index 94b27df..0719fa3 100644
> --- a/hmp.c
> +++ b/hmp.c
> @@ -1133,7 +1133,7 @@ void hmp_change(Monitor *mon, const QDict *qdict)
>  }
>  }
>  
> -qmp_change(device, target, !!arg, arg, &err);
> +qmp_change(device, target, !!arg, arg, false, 0, &err);

Passing raw '0' looks a bit odd for an enum value, but since the
has_read_only parameter is false, I guess it doesn't matter.

>  if (strcmp(device, "vnc") == 0) {
> +if (has_read_only) {
> +error_setg(errp, "Parameter 'read-only' is invalid for VNC");
> +return;
> +}

The 'change' command is quite ugly.  It would be nice if we could
convert it to an automatic union, where the 'device' parameter is the
discriminated union that determines whether we support or reject the
optional 'read-only' argument - except that it is not a true
discriminator (it is either 'vnc' or a device name, with the further
implication that users CANNOT name their block device 'vnc' and still be
able to use this QMP command).  I wonder if it would be better to leave
the ugly command alone, and instead introduce a new command that isn't
multiplexed.  If someone wants to set a disk's read-only status on
change, then they MUST use the nice new command, and leave the old ugly
command for backward compatibility only with no modifications to make it
even uglier.  Furthermore, adding a new command would let someone name
their block device 'vnc' (not that libvirt has any plans to do so).

At any rate, if we decide to live with extending the existing ugly
'change' command, your patch is correct, so here's a reluctant:

Reviewed-by: Eric Blake 

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 1/3] blockdev: Add read-only option to change-blockdev

2014-11-26 Thread Eric Blake
On 11/20/2014 05:44 AM, Max Reitz wrote:
> Add an option to qmp_change_blockdev() which allows changing the
> read-only status of the block device to be changed.
> 
> Some drives do not have a inherently fixed read-only status; for
> instance, floppy disks can be set read-only or writable independently of
> the drive. Some users may find it useful to be able to therefore change
> the read-only status of a block device when changing the medium.
> 
> Signed-off-by: Max Reitz 
> ---
>  blockdev.c| 41 ++---
>  include/sysemu/blockdev.h |  3 ++-
>  qapi-schema.json  | 20 
>  qmp.c |  3 ++-
>  4 files changed, 62 insertions(+), 5 deletions(-)
> 

>  
> -qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, errp);
> +qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, &err);
> +
> +if (err) {
> +if (read_only == BLOCKDEV_CHANGE_READ_ONLY_MODE_AUTO) {
> +error_free(err);
> +err = NULL;
> +
> +/* RDWR did not work, try RO now */
> +bdrv_flags &= ~BDRV_O_RDWR;
> +qmp_bdrv_open_encrypted(bs, filename, bdrv_flags, drv, NULL, 
> errp);
> +} else {
> +error_propagate(errp, err);
> +}

Umm, why are you propagating the error here manually, when it was
previously propagated as part of the fall-through into the out: label?
Particularly since the second open_encrypted call still relies on
fallthrough for propagating the error?  I think this should be
simplified to:

if (err && read_only == BLOCKDEV_CHANGE_READ_ONLY_MODE_AUTO) {
error_free(err);
err = NULL;
/* RDWR did not work, try RO now */
bdrv_flags &= ~BDRV_O_RDWR;
qmp_bdrv_open_encrypted(...);
}

Otherwise, looks okay to me.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] TCG Multithreading performance improvement

2014-11-26 Thread Alex Bennée

Mark Burton  writes:

> Hi all,
>
> We are now actively going to pursue TCG Multithreading to improve the 
> performance of the TCG for Qemu models that include multiple cores.
>
> We have set up a wiki page to track the project 
> http://wiki.qemu.org/Features/tcg-multithread 
> 
>
> At this point, I would like to invite everybody to email us ideas about how 
> the project should progress, and ideas that might be useful (e.g. people who 
> have tried this before, source code that might be helpful, what order we 
> should attack things in etc)…
>
> So - PLEASE let us know if you have interest in this topic, or
> information that might help

I've added some *incomplete* notes to the wiki page. I'm also happy to
get involved and help with design and code review.

-- 
Alex Bennée



[Qemu-devel] [PATCH v4 3/6] block: JSON filenames and relative backing files

2014-11-26 Thread Max Reitz
When using a relative backing file name, qemu needs to know the
directory of the top image file. For JSON filenames, such a directory
cannot be easily determined (e.g. how do you determine the directory of
a qcow2 BDS directly on top of a quorum BDS?). Therefore, do not allow
relative filenames for the backing file of BDSs only having a JSON
filename.

Furthermore, BDS::exact_filename should be used whenever possible. If
BDS::filename is not equal to BDS::exact_filename, the former will
always be a JSON object.

Signed-off-by: Max Reitz 
---
 block.c   | 28 ++--
 block/qapi.c  |  7 ++-
 include/block/block.h |  5 +++--
 3 files changed, 31 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index 0c1be37..0d1869c 100644
--- a/block.c
+++ b/block.c
@@ -305,19 +305,28 @@ void path_combine(char *dest, int dest_size,
 
 void bdrv_get_full_backing_filename_from_filename(const char *backed,
   const char *backing,
-  char *dest, size_t sz)
+  char *dest, size_t sz,
+  Error **errp)
 {
-if (backing[0] == '\0' || path_has_protocol(backing)) {
+if (backing[0] == '\0' || path_has_protocol(backing) ||
+path_is_absolute(backing))
+{
 pstrcpy(dest, sz, backing);
+} else if (backed[0] == '\0' || strstart(backed, "json:", NULL)) {
+error_setg(errp, "Cannot use relative backing file names for '%s'",
+   backed);
 } else {
 path_combine(dest, sz, backed, backing);
 }
 }
 
-void bdrv_get_full_backing_filename(BlockDriverState *bs, char *dest, size_t 
sz)
+void bdrv_get_full_backing_filename(BlockDriverState *bs, char *dest, size_t 
sz,
+Error **errp)
 {
-bdrv_get_full_backing_filename_from_filename(bs->filename, 
bs->backing_file,
- dest, sz);
+char *backed = bs->exact_filename[0] ? bs->exact_filename : bs->filename;
+
+bdrv_get_full_backing_filename_from_filename(backed, bs->backing_file,
+ dest, sz, errp);
 }
 
 void bdrv_register(BlockDriver *bdrv)
@@ -1209,7 +1218,14 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict 
*options, Error **errp)
 QDECREF(options);
 goto free_exit;
 } else {
-bdrv_get_full_backing_filename(bs, backing_filename, PATH_MAX);
+bdrv_get_full_backing_filename(bs, backing_filename, PATH_MAX,
+   &local_err);
+if (local_err) {
+ret = -EINVAL;
+error_propagate(errp, local_err);
+QDECREF(options);
+goto free_exit;
+}
 }
 
 if (!bs->drv || !bs->drv->supports_backing) {
diff --git a/block/qapi.c b/block/qapi.c
index fa68ba7..a6fd6f7 100644
--- a/block/qapi.c
+++ b/block/qapi.c
@@ -214,7 +214,12 @@ void bdrv_query_image_info(BlockDriverState *bs,
 info->backing_filename = g_strdup(backing_filename);
 info->has_backing_filename = true;
 bdrv_get_full_backing_filename(bs, backing_filename2,
-   sizeof(backing_filename2));
+   sizeof(backing_filename2), &err);
+if (err) {
+error_propagate(errp, err);
+qapi_free_ImageInfo(info);
+return;
+}
 
 if (strcmp(backing_filename, backing_filename2) != 0) {
 info->full_backing_filename =
diff --git a/include/block/block.h b/include/block/block.h
index 1c7ecaa..f34f63c 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -398,10 +398,11 @@ const char *bdrv_get_encrypted_filename(BlockDriverState 
*bs);
 void bdrv_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);
 void bdrv_get_full_backing_filename(BlockDriverState *bs,
-char *dest, size_t sz);
+char *dest, size_t sz, Error **errp);
 void bdrv_get_full_backing_filename_from_filename(const char *backed,
   const char *backing,
-  char *dest, size_t sz);
+  char *dest, size_t sz,
+  Error **errp);
 int bdrv_is_snapshot(BlockDriverState *bs);
 
 int path_is_absolute(const char *path);
-- 
1.9.3




Re: [Qemu-devel] [PATCH v2] tests: Use "command -v" instead of which(1) in shell scripts

2014-11-26 Thread Stefan Hajnoczi
On Wed, Nov 19, 2014 at 03:07:12PM +0800, Fam Zheng wrote:
> When which(1) is not installed, we would complain "perl not found"
> because it's the first set_prog_path check. The error message is
> wrong.
> 
> Fix it by using "command -v", a native way to query the existence of a
> command.
> 
> Suggested-by: Eric Blake 
> Signed-off-by: Fam Zheng 
> 
> ---
> v2: Use "command -v" as suggested by Eric. Also change a few other
> occasions of which(1) in tests/qemu-iotests/common.
> ---
>  tests/qemu-iotests/common| 8 
>  tests/qemu-iotests/common.config | 2 +-
>  2 files changed, 5 insertions(+), 5 deletions(-)

Thanks, applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan


pgphGRfvooPu7.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH v3 0/9] raw: Prohibit dangerous writes for probed images

2014-11-26 Thread Stefan Hajnoczi
On Thu, Nov 20, 2014 at 04:27:05PM +0100, Kevin Wolf wrote:
> See the commit message of patch 7 for the why and how. This series
> will probably be only part of the solution and doesn't mean that we
> should stop looking for other patches which improve different parts of
> the problem.
> 
> See the mailing list thread "Image probing: how it can be insecure, and
> what we could do about it" for the complete context.
> 
> v3:
> - Patch 5/6: Improved function comment [Max]
> - Patch 7: Handle nb_sectors == 0 case [Stefan]
> - Patch 7: Even longer error message [Eric]
> - Patch 9: Don't create a vhdx image in a test for raw, it might not be
>   compiled in. Use sample images instead, including the more exotic formats.
>   Add a new sample image containing a GRUB MBR. [Max, Eric]
> 
> v2:
> - Fixed offset in qemu_iovec_concat [Kevin]
> - Added paragraph to patch 7 explaining that we're not breaking
>   additional cases, but only change the failure mode of already
>   broken scenarios [Max]
> - Added a warning when opening an image in "restricted raw" mode,
>   which required a few more patches to make the test cases avoid
>   this warning [Markus]
> 
> 
> Kevin Wolf (8):
>   qemu-io: Allow explicitly specifying format
>   qemu-iotests: Use qemu-io -f $IMGFMT
>   qemu-iotests: Add qemu-io format option in Python tests
>   qtests: Specify image format explicitly
>   block: Read only one sector for format probing
>   raw: Prohibit dangerous writes for probed images
>   qemu-iotests: Fix stderr handling in common.qemu
>   qemu-iotests: Test writing non-raw image headers to raw image
> 
> Markus Armbruster (1):
>   block: Factor bdrv_probe_all() out of find_image_format()
> 
>  block.c   |  51 +++--
>  block/raw_bsd.c   |  64 +-
>  include/block/block_int.h |   5 +
>  qemu-io.c |  28 ++-
>  tests/ahci-test.c |   3 +-
>  tests/bios-tables-test.c  |   2 +-
>  tests/drive_del-test.c|   2 +-
>  tests/fdc-test.c  |   2 +-
>  tests/hd-geo-test.c   |   2 +-
>  tests/i440fx-test.c   |   5 +-
>  tests/ide-test.c  |   9 +-
>  tests/nvme-test.c |   2 +-
>  tests/qemu-iotests/016|  11 +-
>  tests/qemu-iotests/030|  22 +--
>  tests/qemu-iotests/040|  32 +--
>  tests/qemu-iotests/048|   2 +-
>  tests/qemu-iotests/055|  18 +-
>  tests/qemu-iotests/058|  11 +-
>  tests/qemu-iotests/071|  10 +-
>  tests/qemu-iotests/071.out|   6 +-
>  tests/qemu-iotests/077|   2 +-
>  tests/qemu-iotests/081|   8 +-
>  tests/qemu-iotests/081.out|   2 +-
>  tests/qemu-iotests/089|   6 +-
>  tests/qemu-iotests/109| 132 +
>  tests/qemu-iotests/109.out| 231 
> ++
>  tests/qemu-iotests/common |   2 +-
>  tests/qemu-iotests/common.qemu|   3 +-
>  tests/qemu-iotests/group  |   1 +
>  tests/qemu-iotests/sample_images/grub_mbr.raw.bz2 | Bin 0 -> 552 bytes
>  tests/usb-hcd-uhci-test.c |   2 +-
>  tests/usb-hcd-xhci-test.c |   2 +-
>  tests/virtio-blk-test.c   |   4 +-
>  tests/virtio-scsi-test.c  |   4 +-
>  34 files changed, 584 insertions(+), 102 deletions(-)
>  create mode 100755 tests/qemu-iotests/109
>  create mode 100644 tests/qemu-iotests/109.out
>  create mode 100644 tests/qemu-iotests/sample_images/grub_mbr.raw.bz2
> 
> -- 
> 1.8.3.1
> 
> 

Thanks, applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan


pgp4RDtgg2fNa.pgp
Description: PGP signature


[Qemu-devel] [PATCH v4 6/6] iotests: Add test for relative backing file names

2014-11-26 Thread Max Reitz
Sometimes, qemu does not have a filename to work with, so it does not
know which directory to use for a backing file specified by a relative
filename. Add a test which tests that qemu exits with an appropriate
error message.

Additionally, add a test for qemu-img create with a backing filename
relative to the backed image's base directory while omitting the image
size.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Fam Zheng 
---
 tests/qemu-iotests/110 | 94 ++
 tests/qemu-iotests/110.out | 19 ++
 tests/qemu-iotests/group   |  1 +
 3 files changed, 114 insertions(+)
 create mode 100755 tests/qemu-iotests/110
 create mode 100644 tests/qemu-iotests/110.out

diff --git a/tests/qemu-iotests/110 b/tests/qemu-iotests/110
new file mode 100755
index 000..a687f95
--- /dev/null
+++ b/tests/qemu-iotests/110
@@ -0,0 +1,94 @@
+#!/bin/bash
+#
+# Test case for relative backing file names in complex BDS trees
+#
+# Copyright (C) 2014 Red Hat, Inc.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see .
+#
+
+# creator
+owner=mre...@redhat.com
+
+seq="$(basename $0)"
+echo "QA output created by $seq"
+
+here="$PWD"
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+   _cleanup_test_img
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+# Any format supporting backing files
+_supported_fmt qed qcow qcow2 vmdk
+_supported_proto file
+_supported_os Linux
+_unsupported_imgopts "subformat=monolithicFlat" "subformat=twoGbMaxExtentFlat"
+
+TEST_IMG_REL=$(basename "$TEST_IMG")
+
+echo
+echo '=== Reconstructable filename ==='
+echo
+
+TEST_IMG="$TEST_IMG.base" _make_test_img 64M
+_make_test_img -b "$TEST_IMG_REL.base" 64M
+# qemu should be able to reconstruct the filename, so relative backing names
+# should work
+TEST_IMG="json:{'driver':'$IMGFMT','file':{'driver':'file','filename':'$TEST_IMG'}}"
 \
+_img_info | _filter_img_info
+
+echo
+echo '=== Non-reconstructable filename ==='
+echo
+
+# Across blkdebug without a config file, you cannot reconstruct filenames, so
+# qemu is incapable of knowing the directory of the top image
+TEST_IMG="json:{
+'driver': '$IMGFMT',
+'file': {
+'driver': 'blkdebug',
+'image': {
+'driver': 'file',
+'filename': '$TEST_IMG'
+},
+'set-state': [
+{
+'event': 'read_aio',
+'new_state': 42
+}
+]
+}
+}" _img_info | _filter_img_info
+
+echo
+echo '=== Backing name is always relative to the backed image ==='
+echo
+
+# omit the image size; it should work anyway
+_make_test_img -b "$TEST_IMG_REL.base"
+
+
+# success, all done
+echo '*** done'
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/110.out b/tests/qemu-iotests/110.out
new file mode 100644
index 000..056ffec
--- /dev/null
+++ b/tests/qemu-iotests/110.out
@@ -0,0 +1,19 @@
+QA output created by 110
+
+=== Reconstructable filename ===
+
+Formatting 'TEST_DIR/t.IMGFMT.base', fmt=IMGFMT size=67108864
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
backing_file='t.IMGFMT.base'
+image: TEST_DIR/t.IMGFMT
+file format: IMGFMT
+virtual size: 64M (67108864 bytes)
+backing file: t.IMGFMT.base (actual path: TEST_DIR/t.IMGFMT.base)
+
+=== Non-reconstructable filename ===
+
+qemu-img: Cannot use relative backing file names for 'json:{"driver": 
"IMGFMT", "file": {"set-state": [{"new_state": 42, "state": 0, "event": 
"read_aio"}], "image": {"driver": "file", "filename": "TEST_DIR/t.IMGFMT"}, 
"driver": "blkdebug"}}'
+
+=== Backing name is always relative to the backed image ===
+
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=67108864 
backing_file='t.IMGFMT.base'
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index 7dfe469..de3c643 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -111,4 +111,5 @@
 105 rw auto quick
 107 rw auto quick
 108 rw auto quick
+110 rw auto backing quick
 111 rw auto quick
-- 
1.9.3




[Qemu-devel] [PATCH v4 4/6] block: Relative backing file for image creation

2014-11-26 Thread Max Reitz
Relative backing filenames are always relative to the backed image's
directory; the same applies to image creation. Therefore, if the backing
file has to be opened for determining its size (in case the size has not
been explicitly specified) its filename should be interpreted relative
to the new image's base directory and not relative to qemu's working
directory.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Fam Zheng 
---
 block.c | 12 +++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/block.c b/block.c
index 0d1869c..65f31ef 100644
--- a/block.c
+++ b/block.c
@@ -5634,16 +5634,26 @@ void bdrv_img_create(const char *filename, const char 
*fmt,
 if (size == -1) {
 if (backing_file) {
 BlockDriverState *bs;
+char *full_backing = g_new0(char, PATH_MAX);
 int64_t size;
 int back_flags;
 
+bdrv_get_full_backing_filename_from_filename(filename, 
backing_file,
+ full_backing, 
PATH_MAX,
+ &local_err);
+if (local_err) {
+g_free(full_backing);
+goto out;
+}
+
 /* backing files always opened read-only */
 back_flags =
 flags & ~(BDRV_O_RDWR | BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING);
 
 bs = NULL;
-ret = bdrv_open(&bs, backing_file, NULL, NULL, back_flags,
+ret = bdrv_open(&bs, full_backing, NULL, NULL, back_flags,
 backing_drv, &local_err);
+g_free(full_backing);
 if (ret < 0) {
 goto out;
 }
-- 
1.9.3




[Qemu-devel] [PATCH v4 1/6] checkpatch: Brace handling on multi-line condition

2014-11-26 Thread Max Reitz
CODING_STYLE states the following about braces around blocks:

> The opening brace is on the line that contains the control flow
> statement that introduces the new block; [...]

This is obviously impossible with multi-line conditions. Therefore,
CODING_STYLE does not make any clear statement about where to put the
opening brace after a multi-line condition.

There is a reason to prefer to place the opening brace on an own line
after such a condition while still placing it on the same line as the
"control flow statement" if possible; that reason is that the last line
of a multi-line condition is indented, in the case of "if", it is often
indented by four spaces, just as much as the first statement in the
block will be indented. This is hard to read as there is no clearly
visible distinction between condition and block. Placing the opening
brace on a separate line solves this issue.

Also, there are cases where placing the opening brace on a separate line
is the only viable option; if the previous line had nearly 80 characters
and splitting it is not desirable, the opening brace is naturally placed
on an own line.

This patch fixes checkpatch.pl to not complain about braces on own lines
if the condition introducing the block spanned more than one line, or if
the previous line had 79 or 80 characters.

Furthermore, the warning about not having braces around a block is fixed
to mind braces not being on the last line of the condition.

Signed-off-by: Max Reitz 
---
 scripts/checkpatch.pl | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 053e432..5df61f9 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1639,7 +1639,13 @@ sub process {
#print "realcnt<$realcnt> ctx_cnt<$ctx_cnt>\n";
#print 
"pre<$pre_ctx>\nline<$line>\nctx<$ctx>\nnext<$lines[$ctx_ln - 1]>\n";
 
-   if ($ctx !~ /{\s*/ && defined($lines[$ctx_ln -1]) && 
$lines[$ctx_ln - 1] =~ /^\+\s*{/) {
+   # The length of the "previous line" is checked against 
80 because it
+   # includes the + at the beginning of the line (if the 
actual line has
+   # 79 or 80 characters, it is no longer possible to add 
a space and an
+   # opening brace there)
+   if ($#ctx == 0 && $ctx !~ /{\s*/ &&
+   defined($lines[$ctx_ln - 1]) && $lines[$ctx_ln - 1] 
=~ /^\+\s*{/ &&
+   defined($lines[$ctx_ln - 2]) && 
length($lines[$ctx_ln - 2]) < 80) {
ERROR("that open brace { should be on the 
previous line\n" .
"$here\n$ctx\n$rawlines[$ctx_ln - 
1]\n");
}
@@ -2542,7 +2548,10 @@ sub process {
 
substr($block, 0, length($cond), '');
 
-   $seen++ if ($block =~ /^\s*{/);
+   my $spaced_block = $block;
+   $spaced_block =~ s/\n\+/ /g;
+
+   $seen++ if ($spaced_block =~ /^\s*{/);
 
 print "APW: cond<$cond> block<$block> 
allowed<$allowed>\n"
 if $dbg_adv_apw;
-- 
1.9.3




[Qemu-devel] [PATCH v4 0/6] block: Relative backing files

2014-11-26 Thread Max Reitz
Cover letter of v1:

Sometimes, qemu does not have a filename to work with (it then generates
a JSON filename), so it does not know which directory to use for a
backing file specified by a relative filename.

In this case, qemu should not somehow try to append the backing file's
name to the JSON object, but rather just print an error and bail out.


Part of the cover letter specific to v2 (with patch numbers fixed):

Stefan already applied v1 before realizing that the test added in the
series was actually broken for vmdk (which I somehow missed myself).
This was due to vmdk trying to open the backing file on creation in
order to determine its format; however, normally the backing file path
is interpreted relatively to the backed image's base directory, whereas
in this case vmdk directly used the user-specified filename which was
therefore interpreted relatively to qemu's working directory.

Patch 5 of this v2 (v4) fixes this. A similar issue exists directly in
bdrv_img_create() which opens the backing file in order to determine its
size in case the size of the new image has not been specified. This is
fixed by patch 4.

The function both patches use is factored out from
bdrv_get_full_backing_filename() (which we cannot use here because it
requires a BDS which does not necessarily exist during image creation)
in patch 2. Patch 3 was then modified so it modifies this new function
(bdrv_get_full_backing_filename_from_filename()) and the old
bdrv_get_full_backing_filename(), which is now more or less just a
wrapper.

And finally, I added a test to patch 6 which tests creation of a backed
image in another directory only using relative paths while omitting the
image size (which is therefore inferred from the backing file).


v4:
- Added patch 1: Patch 3 (prev. 2) failed checkpatch.pl, so do the
  obvious: Fix checkpatch.pl.
- Patch 3 (prev. 2): Fix overly long line [Fam]
- Patch 6 (prev. 5): s/shoud/should/ [Eric]


git-backport-diff against v3:

Key:
[] : patches are identical
[] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/6:[down] 'checkpatch: Brace handling on multi-line condition'
002/6:[] [--] 'block: Get full backing filename from string'
003/6:[0003] [FC] 'block: JSON filenames and relative backing files'
004/6:[] [--] 'block: Relative backing file for image creation'
005/6:[] [--] 'block/vmdk: Relative backing file for creation'
006/6:[0002] [FC] 'iotests: Add test for relative backing file names'


Max Reitz (6):
  checkpatch: Brace handling on multi-line condition
  block: Get full backing filename from string
  block: JSON filenames and relative backing files
  block: Relative backing file for image creation
  block/vmdk: Relative backing file for creation
  iotests: Add test for relative backing file names

 block.c| 46 ---
 block/qapi.c   |  7 +++-
 block/vmdk.c   | 13 ++-
 include/block/block.h  |  6 ++-
 scripts/checkpatch.pl  | 13 ++-
 tests/qemu-iotests/110 | 94 ++
 tests/qemu-iotests/110.out | 19 ++
 tests/qemu-iotests/group   |  1 +
 8 files changed, 188 insertions(+), 11 deletions(-)
 create mode 100755 tests/qemu-iotests/110
 create mode 100644 tests/qemu-iotests/110.out

-- 
1.9.3




[Qemu-devel] [PATCH v4 5/6] block/vmdk: Relative backing file for creation

2014-11-26 Thread Max Reitz
When a vmdk image is created with a backing file, it is opened to check
whether it is indeed a vmdk file by letting qemu probe it. When doing
so, the backing filename is relative to the image's base directory so it
should be interpreted accordingly.

Signed-off-by: Max Reitz 
Reviewed-by: Eric Blake 
Reviewed-by: Fam Zheng 
---
 block/vmdk.c | 13 -
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/block/vmdk.c b/block/vmdk.c
index 2cbfd3e..aea8f07 100644
--- a/block/vmdk.c
+++ b/block/vmdk.c
@@ -1868,8 +1868,19 @@ static int vmdk_create(const char *filename, QemuOpts 
*opts, Error **errp)
 }
 if (backing_file) {
 BlockDriverState *bs = NULL;
-ret = bdrv_open(&bs, backing_file, NULL, NULL, BDRV_O_NO_BACKING, NULL,
+char *full_backing = g_new0(char, PATH_MAX);
+bdrv_get_full_backing_filename_from_filename(filename, backing_file,
+ full_backing, PATH_MAX,
+ &local_err);
+if (local_err) {
+g_free(full_backing);
+error_propagate(errp, local_err);
+ret = -ENOENT;
+goto exit;
+}
+ret = bdrv_open(&bs, full_backing, NULL, NULL, BDRV_O_NO_BACKING, NULL,
 errp);
+g_free(full_backing);
 if (ret != 0) {
 goto exit;
 }
-- 
1.9.3




[Qemu-devel] [PATCH v4 2/6] block: Get full backing filename from string

2014-11-26 Thread Max Reitz
Introduce bdrv_get_full_backing_filename_from_filename(), a function
which takes the name of the backed file and a potentially relative
backing filename to produce the full (absolute) backing filename.

Use this function from bdrv_get_full_backing_filename().

Signed-off-by: Max Reitz 
Reviewed-by: Kevin Wolf 
Reviewed-by: Eric Blake 
Reviewed-by: Fam Zheng 
---
 block.c   | 16 
 include/block/block.h |  3 +++
 2 files changed, 15 insertions(+), 4 deletions(-)

diff --git a/block.c b/block.c
index 866c8b4..0c1be37 100644
--- a/block.c
+++ b/block.c
@@ -303,15 +303,23 @@ void path_combine(char *dest, int dest_size,
 }
 }
 
-void bdrv_get_full_backing_filename(BlockDriverState *bs, char *dest, size_t 
sz)
+void bdrv_get_full_backing_filename_from_filename(const char *backed,
+  const char *backing,
+  char *dest, size_t sz)
 {
-if (bs->backing_file[0] == '\0' || path_has_protocol(bs->backing_file)) {
-pstrcpy(dest, sz, bs->backing_file);
+if (backing[0] == '\0' || path_has_protocol(backing)) {
+pstrcpy(dest, sz, backing);
 } else {
-path_combine(dest, sz, bs->filename, bs->backing_file);
+path_combine(dest, sz, backed, backing);
 }
 }
 
+void bdrv_get_full_backing_filename(BlockDriverState *bs, char *dest, size_t 
sz)
+{
+bdrv_get_full_backing_filename_from_filename(bs->filename, 
bs->backing_file,
+ dest, sz);
+}
+
 void bdrv_register(BlockDriver *bdrv)
 {
 /* Block drivers without coroutine functions need emulation */
diff --git a/include/block/block.h b/include/block/block.h
index 610be9f..1c7ecaa 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -399,6 +399,9 @@ void bdrv_get_backing_filename(BlockDriverState *bs,
char *filename, int filename_size);
 void bdrv_get_full_backing_filename(BlockDriverState *bs,
 char *dest, size_t sz);
+void bdrv_get_full_backing_filename_from_filename(const char *backed,
+  const char *backing,
+  char *dest, size_t sz);
 int bdrv_is_snapshot(BlockDriverState *bs);
 
 int path_is_absolute(const char *path);
-- 
1.9.3




Re: [Qemu-devel] [PATCH v2 0/6] nbd: Use BlockBackend

2014-11-26 Thread Stefan Hajnoczi
On Tue, Nov 18, 2014 at 12:21:13PM +0100, Max Reitz wrote:
> From the block layer's perspective, the nbd server is pretty similar to
> a guest device. Therefore, it should use BlockBackend to access block
> devices, just like any other guest device does.
> 
> This series consequently makes the nbd server use BlockBackend for
> referencing block devices.
> 
> 
> v2:
> - Added patch 6, which converts qemu-nbd to BlockBackend as far as
>   reasonable [Paolo]
> 
> 
> git-backport-diff against v1:
> 
> Key:
> [] : patches are identical
> [] : number of functional differences between upstream/downstream patch
> [down] : patch is downstream-only
> The flags [FC] indicate (F)unctional and (C)ontextual differences, 
> respectively
> 
> 001/6:[] [--] 'block: Lift more functions into BlockBackend'
> 002/6:[] [--] 'block: Add AioContextNotifier functions to BB'
> 003/6:[] [--] 'block: Add blk_add_close_notifier() for BB'
> 004/6:[] [--] 'nbd: Change external interface to BlockBackend'
> 005/6:[] [--] 'nbd: Use BlockBackend internally'
> 006/6:[down] 'qemu-nbd: Use BlockBackend where reasonable'
> 
> 
> Max Reitz (6):
>   block: Lift more functions into BlockBackend
>   block: Add AioContextNotifier functions to BB
>   block: Add blk_add_close_notifier() for BB
>   nbd: Change external interface to BlockBackend
>   nbd: Use BlockBackend internally
>   qemu-nbd: Use BlockBackend where reasonable
> 
>  block/block-backend.c  | 38 +
>  blockdev-nbd.c | 15 +-
>  include/block/nbd.h|  7 ++---
>  include/sysemu/block-backend.h | 12 
>  nbd.c  | 63 
> +-
>  qemu-nbd.c | 12 
>  6 files changed, 99 insertions(+), 48 deletions(-)
> 
> -- 
> 1.9.3
> 
> 

Thanks, applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan


pgpJkt3yg5dPW.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH 0/3] blockdev: Add read-only option to change-blockdev

2014-11-26 Thread Kevin Wolf
Am 20.11.2014 um 13:44 hat Max Reitz geschrieben:
> The 'change' QMP and HMP command allows replacing the medium in drives
> which support this, e.g. floppy disk drives. For some drives, the medium
> carries information about whether it can be written to or not (again,
> floppy drives). Therefore, it should be possible to change the read-only
> state of block devices when changing the loaded medium.
> 
> This series adds an optional additional parameter to the 'change' QMP
> and HMP command which allows changing the read-only state in four ways:
> 
> - 'retain': Just keep the status as it was before; this is the current
>   behavior and thus this will be the default.
> - 'ro': Force read-only access
> - 'rw': Force writable access
> - 'auto': This opens the new file R/W first, if that fails, the file is
>   opened read-only.

Not sure if 'auto' is worth implementing (it's a typical HMP default
action that no QMP client would use, except that it isn't even the
default for HMP), but the implementation looks correct at least.

Reviewed-by: Kevin Wolf 



Re: [Qemu-devel] [2.3 PATCH v7 05/10] block: Add bdrv_copy_dirty_bitmap and bdrv_reset_dirty_bitmap

2014-11-26 Thread Eric Blake
On 11/26/2014 05:43 AM, Max Reitz wrote:
> On 2014-11-25 at 20:46, John Snow wrote:
>> From: Fam Zheng 
>>
>> Signed-off-by: Fam Zheng 
>> Signed-off-by: John Snow 
>> ---
>>   block.c   | 35 +++
>>   include/block/block.h |  4 
>>   2 files changed, 35 insertions(+), 4 deletions(-)
>>

>> @@ -5406,7 +5431,9 @@ int64_t
>> bdrv_dbm_calc_def_granularity(BlockDriverState *bs)
>>   int64_t bdrv_dirty_bitmap_granularity(BlockDriverState *bs,
>> BdrvDirtyBitmap *bitmap)
>>   {
>> -return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
>> +g_assert(BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap)
>> == \
>> + bitmap->granularity);
> 
> Do you really need that backslash?
> 
> If you don't and remove it, or if you do and keep it, and with either
> "name ? g_strdup(name) : NULL" left as-is or replace by "g_strdup(name)":
> 

g_strdup(NULL) is safe, so 'name ? g_strdup(name) : NULL' is pointless
and can be written as 'g_strdup(name)' with no change in functionality.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 0/2] ahci: small cleanups

2014-11-26 Thread Stefan Hajnoczi
On Thu, Nov 13, 2014 at 10:24:39AM +, Stefan Hajnoczi wrote:
> Two small cleanups for QEMU 2.3:
> 
> 1. Avoid DEBUG_AHCI bitrot
> 2. Use FIS type constants instead of magic numbers
> 
> See commit descriptions for details.
> 
> Stefan Hajnoczi (2):
>   ahci: avoid #ifdef DEBUG_AHCI bitrot
>   ahci: replace SATA FIS type magic numbers with constants
> 
>  hw/ide/ahci.c | 24 +++-
>  hw/ide/ahci.h |  5 -
>  2 files changed, 15 insertions(+), 14 deletions(-)
> 
> -- 
> 2.1.0
> 
> 

Applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan


pgp6FWnuVxJu3.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH RESEND] Support vhd type VHD_DIFFERENCING

2014-11-26 Thread Stefan Hajnoczi
On Thu, Nov 06, 2014 at 10:43:50PM +0800, Xiaodong Gong wrote:
> Now qemu only supports vhd type VHD_FIXED and VHD_DYNAMIC, so qemu
> can't read snapshot volume of vhd, and can't support other storage
> features of vhd file.
> 
> This patch add read parent information in function "vpc_open", read
> bitmap in "vpc_read", and change bitmap in "vpc_write".
> 
> Signed-off-by: Xiaodong Gong 
> Reviewed-by: Ding xiao 
> ---

Have you run tests/qemu-iotests/check for all 3 subformats (fixed,
dynamic, and differencing)?

http://qemu-project.org/Documentation/QemuIoTests

The tests need to pass before we can merge a new subformat.

Stefan


pgpO77lJ2eig6.pgp
Description: PGP signature


Re: [Qemu-devel] [PATCH v3 0/2] block: Improve filename generation for blkdebug

2014-11-26 Thread Stefan Hajnoczi
On Tue, Nov 11, 2014 at 10:23:43AM +0100, Max Reitz wrote:
> Eric has gloriously reviewed v2 of this series, and by adopting v2
> averted a kitten crisis and the need for me to find a freely usable
> kitten picture (although I guess Wikipedia would have been the way to
> go). I do have own pictures, but they are currently about 400 km air
> distance away from me.
> 
> 
> The changes in v3:
> - Patch 1: Added a check for the "config" option being NULL (which
>   happens if no config file is given, either by just not giving the
>   option or by using the blkdebug::foo syntax); in this case, omit the
>   config filename from the output (to receive "blkdebug::foo") [Eric]
> - Patch 2: Add a test for this case
> 
> 
> git-backport-diff output against v2 (and v1):
> 
> Key:
> [] : patches are identical
> [] : number of functional differences between upstream/downstream patch
> [down] : patch is downstream-only
> The flags [FC] indicate (F)unctional and (C)ontextual differences, 
> respectively
> 
> 001/2:[0003] [FC] 'blkdebug: Simplify and improve filename generation'
> 002/2:[0010] [FC] 'iotests: Plain blkdebug filename generation'
> 
> 
> Max Reitz (2):
>   blkdebug: Simplify and improve filename generation
>   iotests: Plain blkdebug filename generation
> 
>  block/blkdebug.c   | 99 
> +-
>  tests/qemu-iotests/099 | 15 ++-
>  tests/qemu-iotests/099.out | 12 +-
>  3 files changed, 52 insertions(+), 74 deletions(-)
> 
> -- 
> 1.9.3
> 
> 

Thanks, applied to my block-next tree:
https://github.com/stefanha/qemu/commits/block-next

Stefan


pgprhPpo2nltx.pgp
Description: PGP signature


Re: [Qemu-devel] [2.3 PATCH v7 02/10] qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove

2014-11-26 Thread Max Reitz

On 2014-11-26 at 16:39, John Snow wrote:



On 11/26/2014 07:19 AM, Max Reitz wrote:

On 2014-11-25 at 20:46, John Snow wrote:

From: Fam Zheng 

The new command pair is added to manage user created dirty bitmap. The
dirty bitmap's name is mandatory and must be unique for the same 
device,

but different devices can have bitmaps with the same names.

The types added to block-core.json will be re-used in future patches
in this series, see:
'qapi: Add transaction support to block-dirty-bitmap-{add, enable,
disable}'


Thanks, this helps. :-)


Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
  block.c   | 19 
  block/mirror.c| 10 +---
  blockdev.c| 63
+++
  include/block/block.h |  1 +
  qapi/block-core.json  | 58
+++
  qmp-commands.hx   | 49 +++
  6 files changed, 191 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index f94b753..a940345 100644
--- a/block.c
+++ b/block.c
@@ -5385,6 +5385,25 @@ int bdrv_get_dirty(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, int64_t sector
  }
  }
+#define BDB_MIN_DEF_GRANULARITY 4096
+#define BDB_MAX_DEF_GRANULARITY 65536
+#define BDB_DEFAULT_GRANULARITY BDB_MAX_DEF_GRANULARITY


You mean this is the default for the default? ;-)


+
+int64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs)


You may want to make this a uint64_t so it's clear that this function
does not return errors.


+{
+BlockDriverInfo bdi;
+int64_t granularity;
+
+if (bdrv_get_info(bs, &bdi) >= 0 && bdi.cluster_size != 0) {
+granularity = MAX(BDB_MIN_DEF_GRANULARITY, bdi.cluster_size);
+granularity = MIN(BDB_MAX_DEF_GRANULARITY, granularity);
+} else {
+granularity = BDB_DEFAULT_GRANULARITY;
+}
+
+return granularity;
+}
+
  void bdrv_dirty_iter_init(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
  {
diff --git a/block/mirror.c b/block/mirror.c
index 858e4ff..3633632 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -664,15 +664,7 @@ static void mirror_start_job(BlockDriverState
*bs, BlockDriverState *target,
  MirrorBlockJob *s;
  if (granularity == 0) {
-/* Choose the default granularity based on the target file's
cluster
- * size, clamped between 4k and 64k.  */
-BlockDriverInfo bdi;
-if (bdrv_get_info(target, &bdi) >= 0 && bdi.cluster_size != 
0) {

-granularity = MAX(4096, bdi.cluster_size);
-granularity = MIN(65536, granularity);
-} else {
-granularity = 65536;
-}
+granularity = bdrv_dbm_calc_def_granularity(target);


Maybe you should note this replacement in the commit message.


  }
  assert ((granularity & (granularity - 1)) == 0);
diff --git a/blockdev.c b/blockdev.c
index 57910b8..e2fe687 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1810,6 +1810,69 @@ void qmp_block_set_io_throttle(const char
*device, int64_t bps, int64_t bps_rd,
  aio_context_release(aio_context);
  }
+void qmp_block_dirty_bitmap_add(const char *device, const char *name,
+bool has_granularity, int64_t
granularity,
+Error **errp)
+{
+BlockDriverState *bs;
+Error *local_err = NULL;
+
+if (!device) {
+error_setg(errp, "Device to add dirty bitmap to must not be
null");
+return;
+}


I don't know if checking for that case makes sense, but of course it
won't hurt. But...[1]


+
+bs = bdrv_lookup_bs(device, NULL, &local_err);


Fair enough, I'd still like blk_by_name() and blk_bs() more
(bdrv_lookup_bs() uses blk_bs(blk_by_name()) for the device name just as
bdrv_find() did), but this is at least not a completely trivial wrapper.


+if (!bs) {
+error_propagate(errp, local_err);


Simply calling bdrv_lookup_bs(device, NULL, errp); suffices, no need for
a local Error object and error_propagate(). But I'm fine with it either
way.


I found other cases where we do this in the code, I actually thought 
it was a "style thing," which is why I did it so often.


See my reply to patch 8 for an explanation.

I still think we could have made nice use of error_propagate() for 
showing some kind of backtrace (with a certain configure option), but 
that idea was rejected...



I'll happily cut it out.


+return;
+}
+
+if (!name || name[0] == '\0') {
+error_setg(errp, "Bitmap name cannot be empty");
+return;
+}
+if (has_granularity) {
+if (granularity < 512 || !is_power_of_2(granularity)) {
+error_setg(errp, "Granularity must be power of 2 "
+ "and at least 512");
+return;
+}
+} else {
+/* Default to cluster size, if available: */
+granularity = bdrv_dbm_calc_def_granularity(bs);
+}
+
+bd

Re: [Qemu-devel] [2.3 PATCH v7 02/10] qmp: Add block-dirty-bitmap-add and block-dirty-bitmap-remove

2014-11-26 Thread John Snow



On 11/26/2014 07:19 AM, Max Reitz wrote:

On 2014-11-25 at 20:46, John Snow wrote:

From: Fam Zheng 

The new command pair is added to manage user created dirty bitmap. The
dirty bitmap's name is mandatory and must be unique for the same device,
but different devices can have bitmaps with the same names.

The types added to block-core.json will be re-used in future patches
in this series, see:
'qapi: Add transaction support to block-dirty-bitmap-{add, enable,
disable}'


Thanks, this helps. :-)


Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
  block.c   | 19 
  block/mirror.c| 10 +---
  blockdev.c| 63
+++
  include/block/block.h |  1 +
  qapi/block-core.json  | 58
+++
  qmp-commands.hx   | 49 +++
  6 files changed, 191 insertions(+), 9 deletions(-)

diff --git a/block.c b/block.c
index f94b753..a940345 100644
--- a/block.c
+++ b/block.c
@@ -5385,6 +5385,25 @@ int bdrv_get_dirty(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, int64_t sector
  }
  }
+#define BDB_MIN_DEF_GRANULARITY 4096
+#define BDB_MAX_DEF_GRANULARITY 65536
+#define BDB_DEFAULT_GRANULARITY BDB_MAX_DEF_GRANULARITY


You mean this is the default for the default? ;-)


+
+int64_t bdrv_dbm_calc_def_granularity(BlockDriverState *bs)


You may want to make this a uint64_t so it's clear that this function
does not return errors.


+{
+BlockDriverInfo bdi;
+int64_t granularity;
+
+if (bdrv_get_info(bs, &bdi) >= 0 && bdi.cluster_size != 0) {
+granularity = MAX(BDB_MIN_DEF_GRANULARITY, bdi.cluster_size);
+granularity = MIN(BDB_MAX_DEF_GRANULARITY, granularity);
+} else {
+granularity = BDB_DEFAULT_GRANULARITY;
+}
+
+return granularity;
+}
+
  void bdrv_dirty_iter_init(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
  {
diff --git a/block/mirror.c b/block/mirror.c
index 858e4ff..3633632 100644
--- a/block/mirror.c
+++ b/block/mirror.c
@@ -664,15 +664,7 @@ static void mirror_start_job(BlockDriverState
*bs, BlockDriverState *target,
  MirrorBlockJob *s;
  if (granularity == 0) {
-/* Choose the default granularity based on the target file's
cluster
- * size, clamped between 4k and 64k.  */
-BlockDriverInfo bdi;
-if (bdrv_get_info(target, &bdi) >= 0 && bdi.cluster_size != 0) {
-granularity = MAX(4096, bdi.cluster_size);
-granularity = MIN(65536, granularity);
-} else {
-granularity = 65536;
-}
+granularity = bdrv_dbm_calc_def_granularity(target);


Maybe you should note this replacement in the commit message.


  }
  assert ((granularity & (granularity - 1)) == 0);
diff --git a/blockdev.c b/blockdev.c
index 57910b8..e2fe687 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -1810,6 +1810,69 @@ void qmp_block_set_io_throttle(const char
*device, int64_t bps, int64_t bps_rd,
  aio_context_release(aio_context);
  }
+void qmp_block_dirty_bitmap_add(const char *device, const char *name,
+bool has_granularity, int64_t
granularity,
+Error **errp)
+{
+BlockDriverState *bs;
+Error *local_err = NULL;
+
+if (!device) {
+error_setg(errp, "Device to add dirty bitmap to must not be
null");
+return;
+}


I don't know if checking for that case makes sense, but of course it
won't hurt. But...[1]


+
+bs = bdrv_lookup_bs(device, NULL, &local_err);


Fair enough, I'd still like blk_by_name() and blk_bs() more
(bdrv_lookup_bs() uses blk_bs(blk_by_name()) for the device name just as
bdrv_find() did), but this is at least not a completely trivial wrapper.


+if (!bs) {
+error_propagate(errp, local_err);


Simply calling bdrv_lookup_bs(device, NULL, errp); suffices, no need for
a local Error object and error_propagate(). But I'm fine with it either
way.


I found other cases where we do this in the code, I actually thought it 
was a "style thing," which is why I did it so often.


I'll happily cut it out.


+return;
+}
+
+if (!name || name[0] == '\0') {
+error_setg(errp, "Bitmap name cannot be empty");
+return;
+}
+if (has_granularity) {
+if (granularity < 512 || !is_power_of_2(granularity)) {
+error_setg(errp, "Granularity must be power of 2 "
+ "and at least 512");
+return;
+}
+} else {
+/* Default to cluster size, if available: */
+granularity = bdrv_dbm_calc_def_granularity(bs);
+}
+
+bdrv_create_dirty_bitmap(bs, granularity, name, errp);
+}
+
+void qmp_block_dirty_bitmap_remove(const char *device, const char *name,
+   Error **errp)
+{
+BlockDriverState *bs;
+BdrvDirtyBitmap *bitmap;
+Error *local

Re: [Qemu-devel] [PATCH RESEND] Support vhd type VHD_DIFFERENCING

2014-11-26 Thread Stefan Hajnoczi
On Thu, Nov 06, 2014 at 10:43:50PM +0800, Xiaodong Gong wrote:
> +} else if (platform == PLATFORM_W2RU) {
> +/* Must be UTF16-LE to ASCII */
> +char *out, *optr;
> +int j;
> +
> +optr = out = (char *) g_malloc(data_length + 1);
> +if (out == NULL) {
> +ret = -1;
> +return ret;
> +}
> +
> +for (j = 0; j < data_length + 1; j++) {
> +out[j] = bs->backing_file[2 * j];
> +}
> +out[data_length + 1] = '\0';
> +
> +while (*optr != '\0') {
> +if (*optr == '\\') {
> +*optr = '/';
> +}
> +optr++;
> +}
> +
> +strncpy(bs->backing_file, out, data_length + 1);
> +
> +g_free(out);
> +out = NULL;
> +
> +done = true;
> +}

Please convert from UTF-16 LE to the local file system character set:
https://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html

Also, using ->backing_file[] when the data is UTF-16 LE encoded is not
ideal since it halves the maximum size of the string!  It would be
better to read into a temporary buffer that is 2 *
sizeof(backing_file[]) big before writing into ->backing_file[].

> @@ -286,6 +411,37 @@ static int vpc_open(BlockDriverState *bs, QDict 
> *options, int flags,
>  s->free_data_block_offset =
>  (s->bat_offset + (s->max_table_entries * 4) + 511) & ~511;
>  
> +/* Read tdbatmap header by offset */
> +if (footer->version >= VHD_VERSION(1, 2)) {

Missing be32_to_cpu(footer->version)

> +static int vpc_write(BlockDriverState *bs, int64_t sector_num,
> +const uint8_t *buf, int nb_sectors)
> +{
> +BDRVVPCState *s = bs->opaque;
> +VHDFooter *footer = (VHDFooter *) s->footer_buf;
> +int64_t sectors_per_block = s->block_size >> BDRV_SECTOR_BITS;
> +int64_t offset, sectors;
> +bool diff = true;
> +int ret = 0;
> +
> +switch (be32_to_cpu(footer->type)) {
> +case VHD_FIXED:
> +return bdrv_write(bs->file, sector_num, buf, nb_sectors);
> +case VHD_DYNAMIC:
> +case VHD_DIFF:
> +if (be32_to_cpu(footer->type) == VHD_DYNAMIC) {
> +diff = false;
>  }

This can be done with a fall-through case instead of checking
footer->type again:

case VHD_DYNAMIC:
diff = false;
/* fall-through */
case VHD_DIFF:

> +ret = bdrv_pwrite(bs->file, offset, buf,
> +sectors * BDRV_SECTOR_SIZE);
> +if (ret != sectors * BDRV_SECTOR_SIZE) {
> +return -1;
> +}
>  
> -nb_sectors -= sectors;
> -sector_num += sectors;
> -buf += sectors * BDRV_SECTOR_SIZE;
> -}
> +if (diff) {
> +ret = write_bitmap(bs, sector_num, sectors);
> +if (ret < 0) {
> +return -1;
> +}
> +}
>  
> -return 0;
> +nb_sectors -= sectors;
> +sector_num += sectors;
> +buf += sectors * BDRV_SECTOR_SIZE;
> +}
> +break;
> +default:
> +return -1;
> +}
> +return ret;

In the VHD_DYNAMIC case we must *not* return the number of bytes from
bdrv_pwrite().  Should this be return 0?


pgpAzoE2Fyebl.pgp
Description: PGP signature


Re: [Qemu-devel] [RFC PATCH v5 03/31] replay: global variables and function stubs

2014-11-26 Thread Eric Blake
On 11/26/2014 03:39 AM, Pavel Dovgalyuk wrote:
> This patch adds global variables, defines, functions declarations,
> and function stubs for deterministic VM replay used by external modules.
> 
> Signed-off-by: Pavel Dovgalyuk 
> ---


> +# Since: 2.3
> +##
> +{ 'enum': 'ReplaySubmode',
> +  'data': [ 'unknown', 'normal' ] }
> diff --git a/replay/Makefile.objs b/replay/Makefile.objs

> +++ b/stubs/replay.c
> @@ -0,0 +1,8 @@
> +#include "replay/replay.h"
> +
> +ReplayMode replay_mode;
> +
> +ReplaySubmode replay_get_play_submode(void)
> +{
> +return 0;

Although QMP code generation happens to assign 0 to the first listed
enum (in this case, REPLAY_SUBMODE_UNKNOWN), it is safer to explicitly
use an enum value rather than an open-coded '0' in this function.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 01/12] block: qcow2 driver may not be found

2014-11-26 Thread Kevin Wolf
Am 26.11.2014 um 16:20 hat Max Reitz geschrieben:
> On 2014-11-26 at 16:19, Eric Blake wrote:
> >On 11/26/2014 02:13 AM, Max Reitz wrote:
> >>On 2014-11-26 at 08:23, Markus Armbruster wrote:
> >>>Max Reitz  writes:
> >>>
> Albeit absolutely impossible right now, bdrv_find_format("qcow2") may
> fail. bdrv_append_temp_snapshot() should heed that case.
> >>>Impossible because we always compile in bdrv_qcow2.
> +++ b/block.c
> @@ -1320,6 +1320,12 @@ int bdrv_append_temp_snapshot(BlockDriverState
> *bs, int flags, Error **errp)
>    }
>  bdrv_qcow2 = bdrv_find_format("qcow2");
> +if (!bdrv_qcow2) {
> >Would it be shorter to 'assert(bdrv_qcow2);' to still silence Coverity?
> 
> Sounds like a good compromise. Will do.

I think it's better to have either proper error handling for the case
that someone compiles it out, like implemented by this patch, or to
reference the symbol so that compiling it out already breaks the build.
The assert() would potentially be a crash of a running VM, which is not
as nice.

Kevin



Re: [Qemu-devel] [PATCH 01/12] block: qcow2 driver may not be found

2014-11-26 Thread Max Reitz

On 2014-11-26 at 16:19, Eric Blake wrote:

On 11/26/2014 02:13 AM, Max Reitz wrote:

On 2014-11-26 at 08:23, Markus Armbruster wrote:

Max Reitz  writes:


Albeit absolutely impossible right now, bdrv_find_format("qcow2") may
fail. bdrv_append_temp_snapshot() should heed that case.

Impossible because we always compile in bdrv_qcow2.

+++ b/block.c
@@ -1320,6 +1320,12 @@ int bdrv_append_temp_snapshot(BlockDriverState
*bs, int flags, Error **errp)
   }
 bdrv_qcow2 = bdrv_find_format("qcow2");
+if (!bdrv_qcow2) {

Would it be shorter to 'assert(bdrv_qcow2);' to still silence Coverity?


Sounds like a good compromise. Will do.

Max



Re: [Qemu-devel] [PATCH 01/12] block: qcow2 driver may not be found

2014-11-26 Thread Eric Blake
On 11/26/2014 02:13 AM, Max Reitz wrote:
> On 2014-11-26 at 08:23, Markus Armbruster wrote:
>> Max Reitz  writes:
>>
>>> Albeit absolutely impossible right now, bdrv_find_format("qcow2") may
>>> fail. bdrv_append_temp_snapshot() should heed that case.
>> Impossible because we always compile in bdrv_qcow2.
> 

>>> +++ b/block.c
>>> @@ -1320,6 +1320,12 @@ int bdrv_append_temp_snapshot(BlockDriverState
>>> *bs, int flags, Error **errp)
>>>   }
>>> bdrv_qcow2 = bdrv_find_format("qcow2");
>>> +if (!bdrv_qcow2) {

Would it be shorter to 'assert(bdrv_qcow2);' to still silence Coverity?

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH 1/1] qmp: extend QMP to provide read/write access to physical memory

2014-11-26 Thread Eric Blake
On 11/26/2014 01:41 AM, Bryan D. Payne wrote:
> This patch adds a new QMP command that sets up a domain socket. This
> socket can then be used for fast read/write access to the guest's
> physical memory. The key benefit to this system over existing solutions
> is speed. Using this patch, guest memory can be copied out at a rate of
> ~200MB/sec, depending on the hardware. Existing solutions only achieve
> a small fraction of this speed.
> 
> Signed-off-by: Bryan D. Payne 
> ---
>  Makefile.target |   2 +-
>  memory-access.c | 200 
> 
>  memory-access.h |  11 
>  monitor.c   |  10 +++
>  qmp-commands.hx |  27 

Where is the *.json file that adds the QMP command contract?

> +++ b/qmp-commands.hx
> @@ -609,6 +609,33 @@ Example:
>  EQMP
>  
>  {
> +.name   = "pmemaccess",
> +.args_type  = "path:s",
> +.params = "path",
> +.help   = "mount guest physical memory image at 'path'",
> +.user_print = monitor_user_noop,
> +.mhandler.cmd_new = do_physical_memory_access,
> +},
> +
> +SQMP
> +pmemaccess
> +--
> +
> +Mount guest physical memory image at 'path'.
> +
> +Arguments:
> +
> +- "path": mount point path (json-string)
> +
> +Example:
> +
> +-> { "execute": "pmemaccess",
> + "arguments": { "path": "/tmp/guestname" } }

Sounds like you are missing this entry (probably best to put it in the
top-level qapi-schema.json):

{ 'command': 'pmemaccess', 'data': { 'path': 'str' } }

as well as documentation that mentions it is targetted for addition in 2.3.

-- 
Eric Blake   eblake redhat com+1-919-301-3266
Libvirt virtualization library http://libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [2.3 PATCH v7 10/10] qemu-iotests: Add tests for drive-backup sync=dirty-bitmap

2014-11-26 Thread Max Reitz

On 2014-11-25 at 20:46, John Snow wrote:

From: Fam Zheng 

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
  tests/qemu-iotests/056| 33 ++---
  tests/qemu-iotests/056.out|  4 ++--
  tests/qemu-iotests/iotests.py |  8 
  3 files changed, 40 insertions(+), 5 deletions(-)


Reviewed-by: Max Reitz 



Re: [Qemu-devel] [PATCH v6 1/3] linux-aio: fix submit aio as a batch

2014-11-26 Thread Kevin Wolf
Am 25.11.2014 um 08:23 hat Ming Lei geschrieben:
> In the submit path, we can't complete request directly,
> otherwise "Co-routine re-entered recursively" may be caused,

One more thing: Can you write a qemu-iotests case to test this
code path? (Which should fail on master and be fixed with this patch,
obviously.)

Kevin



Re: [Qemu-devel] [PATCH v7 09/16] hw/vfio/platform: add vfio-platform support

2014-11-26 Thread Eric Auger
On 11/26/2014 12:20 PM, Alexander Graf wrote:
> 
> 
> On 26.11.14 11:48, Eric Auger wrote:
>> On 11/26/2014 11:24 AM, Alexander Graf wrote:
>>>
>>>
>>> On 26.11.14 10:45, Eric Auger wrote:
 On 11/05/2014 11:29 AM, Alexander Graf wrote:
>
>
> On 31.10.14 15:05, Eric Auger wrote:
>> Minimal VFIO platform implementation supporting
>> - register space user mapping,
>> - IRQ assignment based on eventfds handled on qemu side.
>>
>> irqfd kernel acceleration comes in a subsequent patch.
>>
>> Signed-off-by: Kim Phillips 
>> Signed-off-by: Eric Auger 
>>>
>>> [...]
>>>
>> +/*
>> + * Mechanics to program/start irq injection on machine init done 
>> notifier:
>> + * this is needed since at finalize time, the device IRQ are not yet
>> + * bound to the platform bus IRQ. It is assumed here dynamic 
>> instantiation
>> + * always is used. Binding to the platform bus IRQ happens on a machine
>> + * init done notifier registered by the machine file. After its 
>> execution
>> + * we execute a new notifier that actually starts the injection. When 
>> using
>> + * irqfd, programming the injection consists in associating eventfds to
>> + * GSI number,ie. virtual IRQ number
>> + */
>> +
>> +typedef struct VfioIrqStarterNotifierParams {
>> +unsigned int platform_bus_first_irq;
>> +Notifier notifier;
>> +} VfioIrqStarterNotifierParams;
>> +
>> +typedef struct VfioIrqStartParams {
>> +PlatformBusDevice *pbus;
>> +int platform_bus_first_irq;
>> +} VfioIrqStartParams;
>> +
>> +/* Start injection of IRQ for a specific VFIO device */
>> +static int vfio_irq_starter(SysBusDevice *sbdev, void *opaque)
>> +{
>> +int i;
>> +VfioIrqStartParams *p = opaque;
>> +VFIOPlatformDevice *vdev;
>> +VFIODevice *vbasedev;
>> +uint64_t irq_number;
>> +PlatformBusDevice *pbus = p->pbus;
>> +int platform_bus_first_irq = p->platform_bus_first_irq;
>> +
>> +if (object_dynamic_cast(OBJECT(sbdev), TYPE_VFIO_PLATFORM)) {
>> +vdev = VFIO_PLATFORM_DEVICE(sbdev);
>> +vbasedev = &vdev->vbasedev;
>> +for (i = 0; i < vbasedev->num_irqs; i++) {
>> +irq_number = platform_bus_get_irqn(pbus, sbdev, i)
>> + + platform_bus_first_irq;
>> +vfio_start_irq_injection(sbdev, i, irq_number);
>> +}
>> +}
>> +return 0;
>> +}
>> +
>> +/* loop on all VFIO platform devices and start their IRQ injection */
>> +static void vfio_irq_starter_notify(Notifier *notifier, void *data)
>> +{
>> +VfioIrqStarterNotifierParams *p =
>> +container_of(notifier, VfioIrqStarterNotifierParams, notifier);
>> +DeviceState *dev =
>> +qdev_find_recursive(sysbus_get_default(), 
>> TYPE_PLATFORM_BUS_DEVICE);
>> +PlatformBusDevice *pbus = PLATFORM_BUS_DEVICE(dev);
>> +
>> +if (pbus->done_gathering) {
>> +VfioIrqStartParams data = {
>> +.pbus = pbus,
>> +.platform_bus_first_irq = p->platform_bus_first_irq,
>> +};
>> +
>> +foreach_dynamic_sysbus_device(vfio_irq_starter, &data);
>> +}
>> +}
>> +
>> +/* registers the machine init done notifier that will start VFIO IRQ */
>> +void vfio_register_irq_starter(int platform_bus_first_irq)
>> +{
>> +VfioIrqStarterNotifierParams *p = 
>> g_new(VfioIrqStarterNotifierParams, 1);
>> +
>> +p->platform_bus_first_irq = platform_bus_first_irq;
>> +p->notifier.notify = vfio_irq_starter_notify;
>> +qemu_add_machine_init_done_notifier(&p->notifier);
>
> Could you add a notifier for each device instead? Then the notifier
> would be part of the vfio device struct and not some dangling random
> pointer :).
>
> Of course instead of foreach_dynamic_sysbus_device() you would directly
> know the device you're dealing with and only handle a single device per
> notifier.

 Hi Alex,

 I don't see how to practically follow your request:

 - at machine init time, VFIO devices are not yet instantiated so I
 cannot call foreach_dynamic_sysbus_device() there - I was definitively
 wrong in my first reply :-().

 - I can't register a per VFIO device notifier in the VFIO device
 finalize function because this latter is called after the platform bus
 instantiation. So the IRQ binding notifier (registered in platform bus
 finalize fn) would be called after the IRQ starter notifier.

 - then to simplify things a bit I could use a qemu_register_reset in
 place of a machine init done notifier (would relax the call order
 constraint) but the problem consists in passing the platform bus first
 irq (all the more so 

Re: [Qemu-devel] [2.3 PATCH v7 09/10] qmp: Add dirty bitmap 'enabled' field in query-block

2014-11-26 Thread Max Reitz

On 2014-11-25 at 20:46, John Snow wrote:

From: Fam Zheng 

Signed-off-by: Fam Zheng 
Signed-off-by: John Snow 
---
  block.c  | 1 +
  qapi/block-core.json | 5 -
  2 files changed, 5 insertions(+), 1 deletion(-)


Reviewed-by: Max Reitz 

I like how you dropped my R-bs even for functionally unchanged patches. 
Are you afraid I might revoke my R-b if you're involved? If so, I'm very 
sorry. ;-)


Max



[Qemu-devel] [RFC PATCH 0/3] linux-aio: Convert to coroutines

2014-11-26 Thread Kevin Wolf
This is my current branch for converting the linux-aio interface to
coroutines. I started this as a performance optimisation, but I think it
also makes it much easier to avoid the recursive coroutine reentrace
that Ming Lei has sent a relatively complex callback-based patch for.
See patch 3 for a quick attempt on fixing the same problem in a
coroutine-based linux-aio implementation.

Not ready to be merged mainly because of incomplete testing and
benchmarking.

Kevin Wolf (3):
  qemu-img bench
  raw-posix: Convert Linux AIO submission to coroutines
  linux-aio: Don't reenter request coroutine recursively

 block/linux-aio.c |  89 +++-
 block/raw-aio.h   |   5 +-
 block/raw-posix.c |  62 +--
 qemu-img-cmds.hx  |   6 ++
 qemu-img.c| 174 ++
 qemu-img.texi |  10 
 6 files changed, 256 insertions(+), 90 deletions(-)

-- 
1.8.3.1




  1   2   3   >