date:20140806

Gonglei (Arei) arei.gong...@huawei.com writes:

 Hi,

 
  $WHATEVER: don't use 'Yoda conditions'
 
  'Yoda conditions' are not part of idiomatic QEMU coding
  style, so rewrite them in the more usual order.
 
 
 OK but why stop at these files? How about this
 instead?
 
 I just search c files by using key words like NULL == etc.

 I don't think we should change conditional statements like  and =.

Eric pointed out it's actually incorrect for NaNs.

If you want to touch inequalities, separate patch(es) please, because
they need more thorough review, both for correctness and for style.

 BTW, just using like value == NULL instead of NULL == value in all files
 is not a good idea, which we have discussed in my patch serials v2. So, I 
 posted
 v3, add change log  imitate nearby code about using '!value' or
 value == NULL' at
 every patch  .

Re not a good idea: I think rewriting NULL == value to value ==
NULL *is* a good idea, but rewriting it to !value where that blends
in with surrounding code is a *better* idea.

Gonglei's patches do that, Michael's don't, but are more complete.
Therefore:

 So, maybe you can post patches for those files I have missed in the serials, 
 but not simply instead all by semantic script IMO, thanks!

Easy: apply Gonglei's patches before you run the script.

You may have to split patches along subsystem boundaries to get them in.
Bothersome, as it involves guessing boundaries.  Not a request from me,
just a warning of possible misfortune :)

[Qemu-devel] [PATCH v2 2/4] monitor: fix access freed memory

The function monitor_fdset_dup_fd_find_remove() references member of 'mon_fdset'
which may be freed in function monitor_fdset_cleanup()

Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
---
 monitor.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/monitor.c b/monitor.c
index 5bc70a6..41e46a6 100644
--- a/monitor.c
+++ b/monitor.c
@@ -2532,8 +2532,10 @@ static int monitor_fdset_dup_fd_find_remove(int dup_fd, 
bool remove)
 {
 MonFdset *mon_fdset;
 MonFdsetFd *mon_fdset_fd_dup;
+int64_t id = -1;
 
 QLIST_FOREACH(mon_fdset, mon_fdsets, next) {
+id = mon_fdset-id;
 QLIST_FOREACH(mon_fdset_fd_dup, mon_fdset-dup_fds, next) {
 if (mon_fdset_fd_dup-fd == dup_fd) {
 if (remove) {
@@ -2542,7 +2544,7 @@ static int monitor_fdset_dup_fd_find_remove(int dup_fd, 
bool remove)
 monitor_fdset_cleanup(mon_fdset);
 }
 }
-return mon_fdset-id;
+return id;
 }
 }
 }
-- 
1.7.12.4

[Qemu-devel] [PATCH v2 1/4] l2cap: fix access freed memory

Pointer 'ch' will be used in function 'l2cap_channel_open_req_msg' after
it was previously freed in 'l2cap_channel_open'.
Assigned it to NULL after it is freed.

Reviewed-by: Alex Bennée alex.ben...@linaro.org
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
---
 hw/bt/l2cap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/bt/l2cap.c b/hw/bt/l2cap.c
index 2301d6f..591e047 100644
--- a/hw/bt/l2cap.c
+++ b/hw/bt/l2cap.c
@@ -429,7 +429,7 @@ static struct l2cap_chan_s *l2cap_channel_open(struct 
l2cap_instance_s *l2cap,
 status = L2CAP_CS_NO_INFO;
 } else {
 g_free(ch);
-
+ch = NULL;
 result = L2CAP_CR_NO_MEM;
 status = L2CAP_CS_NO_INFO;
 }
-- 
1.7.12.4

[Qemu-devel] [PATCH v2 4/4] ivshmem: check the value returned by fstat()

The function fstat() may fail, so check its return value.

Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
---
 hw/misc/ivshmem.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
index 768e528..5d939d2 100644
--- a/hw/misc/ivshmem.c
+++ b/hw/misc/ivshmem.c
@@ -324,7 +324,11 @@ static int check_shm_size(IVShmemState *s, int fd) {
 
 struct stat buf;
 
-fstat(fd, buf);
+if (fstat(fd, buf)  0) {
+fprintf(stderr, ivshmem: exiting for fstat fd '%d' failed: %s\n,
+fd, strerror(errno));
+return -1;
+}
 
 if (s-ivshmem_size  buf.st_size) {
 fprintf(stderr,
-- 
1.7.12.4

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

William Dauchy wdau...@gmail.com writes:

 On Tue, Aug 5, 2014 at 8:57 PM, Dr. David Alan Gilbert
 dgilb...@redhat.com wrote:
 Can you confirm this is on the final 2.1 release (there was a fix that
 went in just around rc5).

 for the receiver, I'm using 2.1
 # qemu-system-x86_64 --version
 QEMU emulator version 2.1.0, Copyright (c) 2003-2008 Fabrice Bellar


 for the sender side, here is the qmp result:

 (QEMU) query-version
 {   u'return': {   u'package': u'',
u'qemu': {   u'major': 2, u'micro': 50, u'minor': 0}}}

 What's your command line on both ends?

 the receiver:

 qemu-system-x86_64 -m 2048 -cpu host,level=9 -nodefconfig -nodefaults
 -nographic -readconfig /var/lib/qemu/VM_A/config -pidfile
 /var/lib/qemu/VM_A/pid -serial pty -D /var/lib/qemu/VM_A/log -d
 unimp,guest_errors -incoming tcp:0:32768

 /var/lib/qemu/VM_A/config:
 [name]
   guest = VM_A
   process = VM_A

 [chardev compat_monitor0]
   backend = socket
   path = /var/lib/qemu/VM_A/sock
   server = on
   wait = off

 [device]
   driver = virtio-balloon

 [rtc]
   base = utc

 [mon compat_monitor0]
   mode = control
   chardev = compat_monitor0
   default = on

 [machine]
   kernel = /boot/bzImage
   append = root=/dev/sda console=ttyS0 ro
   accel = kvm

 [device]
   driver = virtio-rng-pci

 [device scsi0]
   driver = virtio-scsi-pci
   hotplug = on

 [...] vif config

 [...] disk config

 [smp-opts]
   cpus = 2
   sockets = 1
   cores = 1
   threads = 1
   maxcpus = 12


 the sender has the exact same command (without tcp receiver) and also
 the same config; I'm just issuing a migrate command through qmp.

Looks like you're not specifying a machine type.

If that's the case, your source machine uses the old QEMU's default
machine type, probably pc-i440fx-2.0, and your destination machine uses
the new default, probably pc-i440fx-2.1.

Migration requires identical machine types on source and destination.
If they differ, migration may work anyway (this is the very lucky case),
fail outright (lucky case), or it may succeed, then make the guest
misbehave or blow up on the destination, possibly after some delay.

Please try with 'type = pc-i440fx-2.0' in your '[machine]' section.

[Qemu-devel] [PATCH v2 3/4] virtio-blk: fix reference a pointer which might be freed

In function virtio_blk_handle_request, it may freed memory pointed by req,
So do not access member of req after calling this function.

Reviewed-by: Stefan Hajnoczi stefa...@redhat.com
Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
---
 hw/block/virtio-blk.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index c241c50..54a853a 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -458,7 +458,7 @@ static void virtio_blk_handle_output(VirtIODevice *vdev, 
VirtQueue *vq)
 static void virtio_blk_dma_restart_bh(void *opaque)
 {
 VirtIOBlock *s = opaque;
-VirtIOBlockReq *req = s-rq;
+VirtIOBlockReq *req = s-rq, *next = NULL;
 MultiReqBuffer mrb = {
 .num_writes = 0,
 };
@@ -469,8 +469,9 @@ static void virtio_blk_dma_restart_bh(void *opaque)
 s-rq = NULL;
 
 while (req) {
+next = req-next;
 virtio_blk_handle_request(req, mrb);
-req = req-next;
+req = next;
 }
 
 virtio_submit_multiwrite(s-bs, mrb);
-- 
1.7.12.4

Re: [Qemu-devel] [v3][PATCH 2/4] hw:i386:pc_piix: split pc_init1()


On 2014/8/4 21:48, Michael S. Tsirkin wrote:

On Thu, Jul 31, 2014 at 08:09:32PM +0800, Tiejun Chen wrote:

We'd like to split pc_init1 and then we can share something
with other stuff.

Signed-off-by: Tiejun Chen tiejun.c...@intel.com


With patch 1 in place, this should not be
necessary - just propage the correct types
around.



Will rebase this in next version.

Thanks
Tiejun

[Qemu-devel] [PATCH v2 0/4] fix several bugs about use-after-free and an api abuse

v1 - v2:
-ivshmem: modified the log message according to reviewing suggestion of Michael

zhanghailiang (4):
  l2cap: fix access freed memory
  monitor: fix access freed memory
  virtio-blk: fix reference a pointer which might be freed
  ivshmem: check the value returned by fstat()

 hw/block/virtio-blk.c | 5 +++--
 hw/bt/l2cap.c | 2 +-
 hw/misc/ivshmem.c | 6 +-
 monitor.c | 4 +++-
 4 files changed, 12 insertions(+), 5 deletions(-)

-- 
1.7.12.4

Re: [Qemu-devel] [questions] about qemu log

Zhang Haoyu zhan...@sangfor.com writes:

 The output is on qemu's stderr.  You are in control of what that
 stderr is.

 I don't get why we can configure
 -D /path/to/unique/file/name.log

 but we also have to redirect stderr (I didn't checked if the daemonize
 option was closing it). What's the purpose of this logfile option?


Well -D will log to file only loggable (i.e. qemu_log()) information
(which has all sorts of options and switches). Stderr, is a little
more static and should in theory be limited to genuine errors. But if
you want a combined log of both you can simply omit -D to default
qemu_log output to stderr. This gives you a combined log that you can
redirect anywhere. To be honest, this is what I do as a matter of
course (2 foo rather than -D foo).

 Maybe we can introduce a new qemu option to specify a error logfile
 where stderr be redirected, like below,
 DEF(elogfile, HAS_ARG, QEMU_OPTION_elogfile, \
 -elogfile logfile redirect stderr log to logfile(default
 /var/log/qemu/vm name##.log)\n,
 QEMU_ARCH_ALL)
 STEXI
 @item -elogfile @var{logfile}
 @findex -elogfile
 redirect stderr in @var{logfile}
 ETEXI
 then we can set the error log file through qemu command,
 /var/log/qemu/vm name##.log as default.


This sounds out-of-scope for QEMU to me and makes a standard flow
non-standard. If prints are going to stderr where should be going
elsewhere they probably should be fixed. Do you have specific examples
of information going to stderr that you would rather go to a log (be
it an error log or something else?).

 I use proxmox to manage vm, it dose not redirect qemu's stderr, and
 start vm with -daemonize option,
 so the error log disappeared.
 I want to redirect the error log of qemu to a specified logfile, if
 fault happened, I can use the error log to analyze the fault.

 And, why qemu output the error log to stderr instead of a error
 logfile which can be configure?

Because the code is a mess in that regard.

You don't fix that by redirecting stderr wholesale, because that just
adds to the mess.  You fix it at the root, one ill-advised fprintf() at
a time, as Peter advises:

[...]
There's plently of tree wide work to clean up the cases where stderr
is used where qemu_log should be. If you are finding that log
information is going to stderr instead of the log, patches would be
welcome.

If you want to redirect stderr in the interim, do it in whatever runs
QEMU.

Re: [Qemu-devel] [v3][PATCH 1/4] i440fx: make types configurable at run-time


On 2014/8/4 21:48, Michael S. Tsirkin wrote:

On Thu, Jul 31, 2014 at 08:09:31PM +0800, Tiejun Chen wrote:

Xen wants to supply a different pci and host devices,
inheriting i440fx devices. Make types configurable.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com


You should have a From: line too, otherwise
git will set the Author field incorrectly.



Will update as follows:

From: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com

Thanks
Tiejun

Re: [Qemu-devel] [v3][PATCH 3/4] xen:hw:pci-host:piix: create host bridge to passthrough


On 2014/8/4 21:50, Michael S. Tsirkin wrote:

On Thu, Jul 31, 2014 at 08:09:33PM +0800, Tiejun Chen wrote:

Implement a pci host bridge specific to passthrough. Actually
this just inherits the standard one.

This is based on http://patchwork.ozlabs.org/patch/363810/.

Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
  hw/pci-host/piix.c   | 41 +
  include/hw/i386/pc.h |  2 ++
  2 files changed, 43 insertions(+)

v3:

* Rebase

v2:

* Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 0cd82b8..26ba1e5 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -93,6 +93,9 @@ typedef struct PIIX3State {
  #define I440FX_PCI_DEVICE(obj) \
  OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)

+#define XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE(obj) \
+OBJECT_CHECK(PCII440FXState, (obj), 
TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE)
+
  struct PCII440FXState {
  /* private */
  PCIDevice parent_obj;
@@ -303,6 +306,16 @@ static int i440fx_initfn(PCIDevice *dev)
  return 0;
  }

+static int xen_igd_passthrough_i440fx_initfn(PCIDevice *dev)
+{
+PCII440FXState *d = XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE(dev);
+
+dev-config[I440FX_SMRAM] = 0x02;
+
+cpu_smm_register(i440fx_set_smm, d);
+return 0;
+}
+


This is exactly i440fx_initfn.
Don't copy and paste code like this, reuse existing functions.



We can't reuse i440fx_initfn() simply after we introduce 
XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE(). But with a further concern I 
think we also pass type into I440FX_PCI_DEVIC() to index different 
objects here. Please see next version.


Thanks
Tiejun

Re: [Qemu-devel] [v3][PATCH 0/5] xen: introduce new machine for IGD passthrough


On 2014/8/4 21:51, Michael S. Tsirkin wrote:

On Thu, Jul 31, 2014 at 08:09:30PM +0800, Tiejun Chen wrote:

v3:

* Drop patch #4
* Add one patch #1 from Michael
* Rebase


You added my patch but don't use it, so most of
my comment weren't addressed.


I guess I should cover those comments and posted some reply as well. But 
maybe something is confused.



Do you plan to send v4 to address them?



Anyway, I will send v4 to narrow down that gap to your expectoration.

Thanks
Tiejun

[Qemu-devel] [Bug 1353149] [NEW] qemu 2.1.0 fails to start if number of cores is greater than 1.

2014-08-06 Thread asavah

Public bug reported:

qemu (kvm) 2.1.0 (built from sources) fails to start if number of cores
is greater than 1.

relevant part of commandline arguments:

/usr/bin/qemu-system-x86_64 -name test3 -S -machine pc-
i440fx-2.1,accel=kvm,usb=off -cpu Westmere -m 4096 -realtime mlock=off
-smp 1,maxcpus=4,sockets=1,cores=4,threads=1

the error reported is:

qemu-system-x86_64: /home/asavah/pkgbuild/qemu-2.1.0/hw/i386/smbios.c:825: 
smbios_get_tables: Assertion `smbios_smp_sockets = 1' failed.
2014-08-05 21:45:35.825+: shutting down

however setting 4 sockets with 1 core each allows me to start the
machine just fine.

the system is debian wheezy
Linux hostname 3.16.0-hostname2 #2 SMP Mon Aug 4 17:02:16 EEST 2014 x86_64 
GNU/Linux

libvirt 1.2.7 (built from sources)

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353149

Title:
  qemu 2.1.0 fails to start if number of cores is greater than 1.

Status in QEMU:
  New

Bug description:
  qemu (kvm) 2.1.0 (built from sources) fails to start if number of
  cores is greater than 1.

  relevant part of commandline arguments:

  /usr/bin/qemu-system-x86_64 -name test3 -S -machine pc-
  i440fx-2.1,accel=kvm,usb=off -cpu Westmere -m 4096 -realtime mlock=off
  -smp 1,maxcpus=4,sockets=1,cores=4,threads=1

  the error reported is:

  qemu-system-x86_64: /home/asavah/pkgbuild/qemu-2.1.0/hw/i386/smbios.c:825: 
smbios_get_tables: Assertion `smbios_smp_sockets = 1' failed.
  2014-08-05 21:45:35.825+: shutting down

  however setting 4 sockets with 1 core each allows me to start the
  machine just fine.

  the system is debian wheezy
  Linux hostname 3.16.0-hostname2 #2 SMP Mon Aug 4 17:02:16 EEST 2014 x86_64 
GNU/Linux

  libvirt 1.2.7 (built from sources)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353149/+subscriptions

Re: [Qemu-devel] about -enable-kvm options

2014-08-06 Thread Gareth

Thanks Richard :)


On Sat, Aug 2, 2014 at 4:47 AM, Richard W.M. Jones rjo...@redhat.com
wrote:

 On Fri, Aug 01, 2014 at 11:15:29AM +0800, Gareth wrote:
  Hi all
 
  What does '-enable-kvm' option mean? I have heard two versions of
 answers:

 It's a shortcut for:

   $qemu -machine accel=kvm

  a) guest OS would have /dev/kvm device and which could help vm in guest
 OS
  (nested vm)

 That's nested KVM, which is enabled as a module option in the kernel
 module (on the host), eg:

   # modprobe kvm_intel nested=1

 On AMD it's enabled by default.

  b) use /dev/kvm and intel-vt on host OS which could help vm run more fast
  than pure emulator.

 Nearly.  It uses /dev/kvm on the host, which may or may not be
 implemented using Intel VT, or a variety of other techniques.

 You can also have fallbacks to software emulation (TCG):

   $qemu -machine accel=kvm:tcg

 Rich.

 --
 Richard Jones, Virtualization Group, Red Hat
 http://people.redhat.com/~rjones
 Read my programming and virtualization blog: http://rwmj.wordpress.com
 virt-builder quickly builds VMs from scratch
 http://libguestfs.org/virt-builder.1.html




-- 
Gareth

*Cloud Computing, OpenStack, Distributed Storage, Fitness, Basketball*
*OpenStack contributor, kun_huang@freenode*
*My promise: if you find any spelling or grammar mistakes in my email from
Mar 1 2013, notify me *
*and I'll donate $1 or ￥1 to an open organization you specify.*

[Qemu-devel] [v4][PATCH 1/5] i440fx: make types configurable at run-time

2014-08-06 Thread Tiejun Chen

From: Michael S. Tsirkin m...@redhat.com

Xen wants to supply a different pci and host devices,
inheriting i440fx devices. Make types configurable.

Signed-off-by: Michael S. Tsirkin m...@redhat.com
Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
 hw/i386/pc_piix.c| 4 +++-
 hw/pci-host/piix.c   | 9 -
 include/hw/i386/pc.h | 6 +-
 3 files changed, 12 insertions(+), 7 deletions(-)

v4:

* Just add From: Michael S. Tsirkin m...@redhat.com

v3:

* New patch from Michael

diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 4f22be8..bf26550 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -201,7 +201,9 @@ static void pc_init1(MachineState *machine,
 }
 
 if (pci_enabled) {
-pci_bus = i440fx_init(i440fx_state, piix3_devfn, isa_bus, gsi,
+pci_bus = i440fx_init(TYPE_I440FX_PCI_HOST_BRIDGE,
+  TYPE_I440FX_PCI_DEVICE,
+  i440fx_state, piix3_devfn, isa_bus, gsi,
   system_memory, system_io, machine-ram_size,
   below_4g_mem_size,
   above_4g_mem_size,
diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index e0e0946..0cd82b8 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -40,7 +40,6 @@
  * http://download.intel.com/design/chipsets/datashts/29054901.pdf
  */
 
-#define TYPE_I440FX_PCI_HOST_BRIDGE i440FX-pcihost
 #define I440FX_PCI_HOST_BRIDGE(obj) \
 OBJECT_CHECK(I440FXState, (obj), TYPE_I440FX_PCI_HOST_BRIDGE)
 
@@ -91,7 +90,6 @@ typedef struct PIIX3State {
 MemoryRegion rcr_mem;
 } PIIX3State;
 
-#define TYPE_I440FX_PCI_DEVICE i440FX
 #define I440FX_PCI_DEVICE(obj) \
 OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
 
@@ -305,7 +303,8 @@ static int i440fx_initfn(PCIDevice *dev)
 return 0;
 }
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state,
 int *piix3_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
@@ -325,7 +324,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
 unsigned i;
 I440FXState *i440fx;
 
-dev = qdev_create(NULL, TYPE_I440FX_PCI_HOST_BRIDGE);
+dev = qdev_create(NULL, host_type);
 s = PCI_HOST_BRIDGE(dev);
 b = pci_bus_new(dev, NULL, pci_address_space,
 address_space_io, 0, TYPE_PCI_BUS);
@@ -333,7 +332,7 @@ PCIBus *i440fx_init(PCII440FXState **pi440fx_state,
 object_property_add_child(qdev_get_machine(), i440fx, OBJECT(dev), NULL);
 qdev_init_nofail(dev);
 
-d = pci_create_simple(b, 0, TYPE_I440FX_PCI_DEVICE);
+d = pci_create_simple(b, 0, pci_type);
 *pi440fx_state = I440FX_PCI_DEVICE(d);
 f = *pi440fx_state;
 f-system_memory = address_space_mem;
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 863eefb..11fb72f 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -235,7 +235,11 @@ extern int no_hpet;
 struct PCII440FXState;
 typedef struct PCII440FXState PCII440FXState;
 
-PCIBus *i440fx_init(PCII440FXState **pi440fx_state, int *piix_devfn,
+#define TYPE_I440FX_PCI_HOST_BRIDGE i440FX-pcihost
+#define TYPE_I440FX_PCI_DEVICE i440FX
+
+PCIBus *i440fx_init(const char *host_type, const char *pci_type,
+PCII440FXState **pi440fx_state, int *piix_devfn,
 ISABus **isa_bus, qemu_irq *pic,
 MemoryRegion *address_space_mem,
 MemoryRegion *address_space_io,
-- 
1.9.1

[Qemu-devel] [v4][PATCH 0/5] xen: introduce new machine for IGD passthrough

2014-08-06 Thread Tiejun Chen

v4:

* Rebase on latest tree
* Drop patch #2
* Regenerate patches after Michael introduce patch #1
* We need to use this pci_type as a index to reuse I440FX_PCI_DEVICE()
* Test: boot with a preinstalled winxp
  ./i386-softmmu/qemu-system-i386 -hda winxp-32.img -m 2560 -boot c -machine pc

v3:

* Drop patch #4
* Add one patch #1 from Michael
* Rebase
* In./i386-softmmu/qemu-system-i386 -hda test.img -m 2560 -boot c -machine pc

v2:

* Fix some coding style
* New patch to separate i440fx_init
* Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough
* Based on patch #2 to regenerate
* Unify prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough like patch #3
* Test: boot with a preinstalled ubuntu 14.04
  ./i386-softmmu/qemu-system-i386 -hda test.img -m 2560 -boot c -machine pc

As we discussed we need to create a separate machine to support current
IGD passthrough.


Michael S. Tsirkin (1):
  i440fx: make types configurable at run-time

Tiejun Chen (4):
  pc_init1: pass parameters just with types
  I440FX_PCI_DEVICE: add pci_type to index
  xen:hw:pci-host:piix: create host bridge to passthrough
  xen:hw:i386:pc_piix: introduce new machine for IGD passthrough

 hw/i386/pc_piix.c| 60 
+++-
 hw/pci-host/piix.c   | 58 
--
 include/hw/i386/pc.h |  8 +++-
 3 files changed, 110 insertions(+), 16 deletions(-)

Thanks
Tiejun

[Qemu-devel] [v4][PATCH 4/5] xen:hw:pci-host:piix: create host bridge to passthrough

2014-08-06 Thread Tiejun Chen

Implement a pci host bridge specific to passthrough. Actually
this just inherits the standard one.

This is based on http://patchwork.ozlabs.org/patch/363810/.

Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
 hw/pci-host/piix.c   | 39 +++
 include/hw/i386/pc.h |  2 ++
 2 files changed, 41 insertions(+)

v4:

* Rebase

v3:

* Rebase

v2:

* Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 4330599..5cac398 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -303,6 +303,17 @@ static int i440fx_initfn(PCIDevice *dev)
 return 0;
 }
 
+static int xen_igd_passthrough_i440fx_initfn(PCIDevice *dev)
+{
+PCII440FXState *d = I440FX_PCI_DEVICE(dev,
+ TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE);
+
+dev-config[I440FX_SMRAM] = 0x02;
+
+cpu_smm_register(i440fx_set_smm, d);
+return 0;
+}
+
 PCIBus *i440fx_init(const char *host_type, const char *pci_type,
 PCII440FXState **pi440fx_state,
 int *piix3_devfn,
@@ -703,6 +714,33 @@ static const TypeInfo i440fx_info = {
 .class_init= i440fx_class_init,
 };
 
+static void xen_igd_passthrough_i440fx_class_init(ObjectClass *klass, void 
*data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k-init = xen_igd_passthrough_i440fx_initfn;
+k-vendor_id = PCI_VENDOR_ID_INTEL;
+k-device_id = PCI_DEVICE_ID_INTEL_82441;
+k-revision = 0x02;
+k-class_id = PCI_CLASS_BRIDGE_HOST;
+dc-desc = IGD PT XEN Host bridge;
+dc-vmsd = vmstate_i440fx;
+/*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+dc-cannot_instantiate_with_device_add_yet = true;
+dc-hotpluggable   = false;
+}
+
+static const TypeInfo xen_igd_passthrough_i440fx_info = {
+.name  = TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
+.parent= TYPE_PCI_DEVICE,
+.instance_size = sizeof(PCII440FXState),
+.class_init= xen_igd_passthrough_i440fx_class_init,
+};
+
 static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
 PCIBus *rootbus)
 {
@@ -744,6 +782,7 @@ static const TypeInfo i440fx_pcihost_info = {
 static void i440fx_register_types(void)
 {
 type_register_static(i440fx_info);
+type_register_static(xen_igd_passthrough_i440fx_info);
 type_register_static(piix3_info);
 type_register_static(piix3_xen_info);
 type_register_static(i440fx_pcihost_info);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 11fb72f..de34aa6 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -238,6 +238,8 @@ typedef struct PCII440FXState PCII440FXState;
 #define TYPE_I440FX_PCI_HOST_BRIDGE i440FX-pcihost
 #define TYPE_I440FX_PCI_DEVICE i440FX
 
+#define TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE xen-igd-passthrough-i440FX
+
 PCIBus *i440fx_init(const char *host_type, const char *pci_type,
 PCII440FXState **pi440fx_state, int *piix_devfn,
 ISABus **isa_bus, qemu_irq *pic,
-- 
1.9.1

Re: [Qemu-devel] [PATCH v3 for-2.2 0/8] don't use Yoda conditions

2014-08-06 Thread Michael S. Tsirkin

On Tue, Aug 05, 2014 at 07:53:51PM -0600, Eric Blake wrote:
 On 08/05/2014 08:02 AM, Michael S. Tsirkin wrote:
  On Fri, Aug 01, 2014 at 03:46:08PM +0800, arei.gong...@huawei.com wrote:
  From: Gonglei arei.gong...@huawei.com
 
  $WHATEVER: don't use 'Yoda conditions'
 
  'Yoda conditions' are not part of idiomatic QEMU coding
  style, so rewrite them in the more usual order.
  
  
  OK but why stop at these files? How about this
  instead?
  
 
  @ disable commeq @
  expression E;
  constant C;
  @@
  - C == E
  + E == C
  @ disable commeq @
  expression E;
  constant C;
  @@
  - C == E
  + E == C
 
 Why is this listed twice?
 
  @ disable gtr_lss @
  expression E;
  constant C;
  @@
  - C  E
  + E  C
 
 This is wrong for floating point (think NaN); you'd have to audit the
 results to make sure only integers are commuted.

I did, take a look at the results :)

  @ disable gtr_lss_eq @
  expression E;
  constant C;
  @@
  - C = E
  + E = C
 
 Ditto.
 
  
  Signed-off-by: Michael S. Tsirkin m...@redhat.com
 
 The idea seems okay to me, but I haven't closely reviewed the patch yet.
 
 -- 
 Eric Blake   eblake redhat com+1-919-301-3266
 Libvirt virtualization library http://libvirt.org

Re: [Qemu-devel] [questions] about qemu log

2014-08-06 Thread Zhang Haoyu

 The output is on qemu's stderr.  You are in control of what that
 stderr is.

 I don't get why we can configure
 -D /path/to/unique/file/name.log

 but we also have to redirect stderr (I didn't checked if the daemonize
 option was closing it). What's the purpose of this logfile option?


Well -D will log to file only loggable (i.e. qemu_log()) information
(which has all sorts of options and switches). Stderr, is a little
more static and should in theory be limited to genuine errors. But if
you want a combined log of both you can simply omit -D to default
qemu_log output to stderr. This gives you a combined log that you can
redirect anywhere. To be honest, this is what I do as a matter of
course (2 foo rather than -D foo).

 Maybe we can introduce a new qemu option to specify a error logfile
 where stderr be redirected, like below,
 DEF(elogfile, HAS_ARG, QEMU_OPTION_elogfile, \
 -elogfile logfile redirect stderr log to logfile(default
 /var/log/qemu/vm name##.log)\n,
 QEMU_ARCH_ALL)
 STEXI
 @item -elogfile @var{logfile}
 @findex -elogfile
 redirect stderr in @var{logfile}
 ETEXI
 then we can set the error log file through qemu command,
 /var/log/qemu/vm name##.log as default.


This sounds out-of-scope for QEMU to me and makes a standard flow
non-standard. If prints are going to stderr where should be going
elsewhere they probably should be fixed. Do you have specific examples
of information going to stderr that you would rather go to a log (be
it an error log or something else?).

 I use proxmox to manage vm, it dose not redirect qemu's stderr, and
 start vm with -daemonize option,
 so the error log disappeared.
 I want to redirect the error log of qemu to a specified logfile, if
 fault happened, I can use the error log to analyze the fault.

 And, why qemu output the error log to stderr instead of a error
 logfile which can be configure?

Because the code is a mess in that regard.

You don't fix that by redirecting stderr wholesale, because that just
adds to the mess.  You fix it at the root, one ill-advised fprintf() at
a time, as Peter advises:


Sorry, I'm afraid I misunderstand what you mean,
should I replace all of fprintf(stderr, ...) with qemu_log() ?
or only some cases where stderr is used where qemu_log should be, as Perter 
advises?
If so, should I still need to redirect the stderr to specified logfile in 
qemu's parent shell/process ?

Thanks,
Zhang Haoyu

[...]
There's plently of tree wide work to clean up the cases where stderr
is used where qemu_log should be. If you are finding that log
information is going to stderr instead of the log, patches would be
welcome.

If you want to redirect stderr in the interim, do it in whatever runs
QEMU.

Re: [Qemu-devel] [Bug 1353149] [NEW] qemu 2.1.0 fails to start if number of cores is greater than 1.

2014-08-06 Thread David Cruz

I can confirm this in Centos 7, also.

Qemu 2.1.0 with latest libvirt.

David Cruz

2014-08-06 0:12 GMT+02:00 asavah irher...@gmail.com:
 Public bug reported:

 qemu (kvm) 2.1.0 (built from sources) fails to start if number of cores
 is greater than 1.

 relevant part of commandline arguments:

 /usr/bin/qemu-system-x86_64 -name test3 -S -machine pc-
 i440fx-2.1,accel=kvm,usb=off -cpu Westmere -m 4096 -realtime mlock=off
 -smp 1,maxcpus=4,sockets=1,cores=4,threads=1

 the error reported is:

 qemu-system-x86_64: /home/asavah/pkgbuild/qemu-2.1.0/hw/i386/smbios.c:825: 
 smbios_get_tables: Assertion `smbios_smp_sockets = 1' failed.
 2014-08-05 21:45:35.825+: shutting down

 however setting 4 sockets with 1 core each allows me to start the
 machine just fine.

 the system is debian wheezy
 Linux hostname 3.16.0-hostname2 #2 SMP Mon Aug 4 17:02:16 EEST 2014 x86_64 
 GNU/Linux

 libvirt 1.2.7 (built from sources)

 ** Affects: qemu
  Importance: Undecided
  Status: New

 --
 You received this bug notification because you are a member of qemu-
 devel-ml, which is subscribed to QEMU.
 https://bugs.launchpad.net/bugs/1353149

 Title:
   qemu 2.1.0 fails to start if number of cores is greater than 1.

 Status in QEMU:
   New

 Bug description:
   qemu (kvm) 2.1.0 (built from sources) fails to start if number of
   cores is greater than 1.

   relevant part of commandline arguments:

   /usr/bin/qemu-system-x86_64 -name test3 -S -machine pc-
   i440fx-2.1,accel=kvm,usb=off -cpu Westmere -m 4096 -realtime mlock=off
   -smp 1,maxcpus=4,sockets=1,cores=4,threads=1

   the error reported is:

   qemu-system-x86_64: /home/asavah/pkgbuild/qemu-2.1.0/hw/i386/smbios.c:825: 
 smbios_get_tables: Assertion `smbios_smp_sockets = 1' failed.
   2014-08-05 21:45:35.825+: shutting down

   however setting 4 sockets with 1 core each allows me to start the
   machine just fine.

   the system is debian wheezy
   Linux hostname 3.16.0-hostname2 #2 SMP Mon Aug 4 17:02:16 EEST 2014 x86_64 
 GNU/Linux

   libvirt 1.2.7 (built from sources)

 To manage notifications about this bug go to:
 https://bugs.launchpad.net/qemu/+bug/1353149/+subscriptions

Re: [Qemu-devel] [PATCH 1/2] virtio-serial: create a linked list of all active devices

Amit Shah amit.s...@redhat.com writes:

 On (Mon) 04 Aug 2014 [13:33:56], Markus Armbruster wrote:
 Amit Shah amit.s...@redhat.com writes:
 
  To ensure two virtserialports don't get added to the system with the
  same 'name' parameter, we need to access all the ports on all the
  devices added, and compare the names.
 
  We currently don't have a list of all VirtIOSerial devices added to the
  system.  This commit adds a simple linked list in which devices are put
  when they're initialized, and removed when they go away.

 snip

  +struct VirtIOSerialDevices {
  +QLIST_HEAD(, VirtIOSerial) devices;
  +} vserdevices;
  +
 
 Any particular reason for stuffing the list into a struct?

 Not strictly needed for this patch alone, but I think it's cleaner to
 keep it this way, and when something else comes up that needs a
 per-device variable, this list will be around.

Adding the struct when it's needed will be easy, so why pay the
notational overhead now?

 Also, this is also the
 way it's done in the kernel, so that uniformity helps as well.

Two uglies make a pretty?

Anyway, matter of taste.

[...]

Re: [Qemu-devel] [PATCH 0/2] virtio-serial: Prevent addition of ports with same name

Amit Shah amit.s...@redhat.com writes:

 Hi,

 The 2nd patch prevents adding ports to the system with conflicts in
 the 'name' parameter.  This is done by going through all the ports in
 the system, including ports on other virtio-serial devices.

 The first patch prepares for this by creating a linked list of all
 available virtio-serial devices, so they can be walked over to check
 their ports' names.

 Please review.

No advice from Andreas on better ways to iterate over devices, yet
(vacation time?).  If there is one, we can always use it on top.

Reviewed-by: Markus Armbruster arm...@redhat.com

[Qemu-devel] [Bug 1353149] Re: qemu 2.1.0 fails to start if number of cores is greater than 1.

2014-08-06 Thread Andrey Korolyov

-smp 1,maxcpus=4,sockets=1,cores=4,threads=1

should be

-smp 4,maxcpus=4,sockets=1,cores=4,threads=1

although more human-friendly error is more appropriate there (better
than a silent fallback to either 1- or 4- core topology)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353149

Title:
  qemu 2.1.0 fails to start if number of cores is greater than 1.

Status in QEMU:
  New

Bug description:
  qemu (kvm) 2.1.0 (built from sources) fails to start if number of
  cores is greater than 1.

  relevant part of commandline arguments:

  /usr/bin/qemu-system-x86_64 -name test3 -S -machine pc-
  i440fx-2.1,accel=kvm,usb=off -cpu Westmere -m 4096 -realtime mlock=off
  -smp 1,maxcpus=4,sockets=1,cores=4,threads=1

  the error reported is:

  qemu-system-x86_64: /home/asavah/pkgbuild/qemu-2.1.0/hw/i386/smbios.c:825: 
smbios_get_tables: Assertion `smbios_smp_sockets = 1' failed.
  2014-08-05 21:45:35.825+: shutting down

  however setting 4 sockets with 1 core each allows me to start the
  machine just fine.

  the system is debian wheezy
  Linux hostname 3.16.0-hostname2 #2 SMP Mon Aug 4 17:02:16 EEST 2014 x86_64 
GNU/Linux

  libvirt 1.2.7 (built from sources)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353149/+subscriptions

Re: [Qemu-devel] [PATCH 1/2] virtio-serial: create a linked list of all active devices

2014-08-06 Thread Amit Shah

On (Wed) 06 Aug 2014 [09:27:02], Markus Armbruster wrote:
 Amit Shah amit.s...@redhat.com writes:
 
  On (Mon) 04 Aug 2014 [13:33:56], Markus Armbruster wrote:
  Amit Shah amit.s...@redhat.com writes:
  
   To ensure two virtserialports don't get added to the system with the
   same 'name' parameter, we need to access all the ports on all the
   devices added, and compare the names.
  
   We currently don't have a list of all VirtIOSerial devices added to the
   system.  This commit adds a simple linked list in which devices are put
   when they're initialized, and removed when they go away.
 
  snip
 
   +struct VirtIOSerialDevices {
   +QLIST_HEAD(, VirtIOSerial) devices;
   +} vserdevices;
   +
  
  Any particular reason for stuffing the list into a struct?
 
  Not strictly needed for this patch alone, but I think it's cleaner to
  keep it this way, and when something else comes up that needs a
  per-device variable, this list will be around.
 
 Adding the struct when it's needed will be easy, so why pay the
 notational overhead now?

True.

  Also, this is also the
  way it's done in the kernel, so that uniformity helps as well.
 
 Two uglies make a pretty?

Hah!

Amit

Re: [Qemu-devel] [PATCH 0/2] virtio-serial: Prevent addition of ports with same name

2014-08-06 Thread Amit Shah

On (Wed) 06 Aug 2014 [09:27:04], Markus Armbruster wrote:
 Amit Shah amit.s...@redhat.com writes:
 
  Hi,
 
  The 2nd patch prevents adding ports to the system with conflicts in
  the 'name' parameter.  This is done by going through all the ports in
  the system, including ports on other virtio-serial devices.
 
  The first patch prepares for this by creating a linked list of all
  available virtio-serial devices, so they can be walked over to check
  their ports' names.
 
  Please review.
 
 No advice from Andreas on better ways to iterate over devices, yet
 (vacation time?).  If there is one, we can always use it on top.
 
 Reviewed-by: Markus Armbruster arm...@redhat.com

Thanks!

I'll wait for a week before submitting a pull req.


Amit

Re: [Qemu-devel] [questions] about qemu log

Zhang Haoyu zhan...@sangfor.com writes:

 The output is on qemu's stderr.  You are in control of what that
 stderr is.

 I don't get why we can configure
 -D /path/to/unique/file/name.log

 but we also have to redirect stderr (I didn't checked if the daemonize
 option was closing it). What's the purpose of this logfile option?


Well -D will log to file only loggable (i.e. qemu_log()) information
(which has all sorts of options and switches). Stderr, is a little
more static and should in theory be limited to genuine errors. But if
you want a combined log of both you can simply omit -D to default
qemu_log output to stderr. This gives you a combined log that you can
redirect anywhere. To be honest, this is what I do as a matter of
course (2 foo rather than -D foo).

 Maybe we can introduce a new qemu option to specify a error logfile
 where stderr be redirected, like below,
 DEF(elogfile, HAS_ARG, QEMU_OPTION_elogfile, \
 -elogfile logfile redirect stderr log to logfile(default
 /var/log/qemu/vm name##.log)\n,
 QEMU_ARCH_ALL)
 STEXI
 @item -elogfile @var{logfile}
 @findex -elogfile
 redirect stderr in @var{logfile}
 ETEXI
 then we can set the error log file through qemu command,
 /var/log/qemu/vm name##.log as default.


This sounds out-of-scope for QEMU to me and makes a standard flow
non-standard. If prints are going to stderr where should be going
elsewhere they probably should be fixed. Do you have specific examples
of information going to stderr that you would rather go to a log (be
it an error log or something else?).

 I use proxmox to manage vm, it dose not redirect qemu's stderr, and
 start vm with -daemonize option,
 so the error log disappeared.
 I want to redirect the error log of qemu to a specified logfile, if
 fault happened, I can use the error log to analyze the fault.

 And, why qemu output the error log to stderr instead of a error
 logfile which can be configure?

Because the code is a mess in that regard.

You don't fix that by redirecting stderr wholesale, because that just
adds to the mess.  You fix it at the root, one ill-advised fprintf() at
a time, as Peter advises:


 Sorry, I'm afraid I misunderstand what you mean,
 should I replace all of fprintf(stderr, ...) with qemu_log() ?
 or only some cases where stderr is used where qemu_log should be, as
 Perter advises?

I didn't mean to suggest blind conversion from fprintf() to qemu_log().
Each instance of fprintf() needs to be reviewed before conversion.  I
think that's exactly Peter's advice, too.

 If so, should I still need to redirect the stderr to specified logfile
 in qemu's parent shell/process ?

Probably.  Libvirt does it.

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

Il 05/08/2014 20:57, Dr. David Alan Gilbert ha scritto:
 (CC Paolo)
 
 * William Dauchy (wdau...@gmail.com) wrote:
 Hello,

 For a qemu live migration from 2.0 to 2.1 I'm getting:

 qemu-system-x86_64: Length mismatch: /rom@etc/acpi/tables: 3000 in != 2
 qemu-system-x86_64: Unknown ramblock /table-loader, cannot accept migration
 qemu-system-x86_64: Ack, bad migration stream!
 qemu-system-x86_64: Illegal RAM offset f0e2d500f06000
 qemu: warning: error while loading state for instance 0x0 of device 'ram'
 qemu-system-x86_64: load of migration failed: Invalid argument

 I was wondering if it was a bug or if I was supposed to do something
 to avoid this error.
 
 That should work.
 
 I saw the bug report for qemu1.7 but it's not my case.
 
 Can you confirm this is on the final 2.1 release (there was a fix that
 went in just around rc5).
 
 What's your command line on both ends?

I suspect he's using -M pc on both.  You must use -M pc-i440fx-2.0
on both if you're migrating from 2.0 to a different version.

Paolo

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

Il 06/08/2014 08:16, Markus Armbruster ha scritto:
 for the receiver, I'm using 2.1
 # qemu-system-x86_64 --version
 QEMU emulator version 2.1.0, Copyright (c) 2003-2008 Fabrice Bellar


 for the sender side, here is the qmp result:

 (QEMU) query-version
 {   u'return': {   u'package': u'',
u'qemu': {   u'major': 2, u'micro': 50, u'minor': 0}}}


This is a development version between 2.0.0 and 2.0.0-rc1; it may have
bugs such as the one that was fixed in 2.0.0-rc5.

Migration from development versions is not supported.  You must either
upgrade to 2.1.0, or downgrade to the released 2.0.0 version.

Paolo

Re: [Qemu-devel] [PATCH v3 for-2.2 0/8] don't use Yoda conditions

Michael S. Tsirkin m...@redhat.com writes:

 On Wed, Aug 06, 2014 at 08:05:46AM +0200, Markus Armbruster wrote:
 Gonglei (Arei) arei.gong...@huawei.com writes:
 
  Hi,
 
  
   $WHATEVER: don't use 'Yoda conditions'
  
   'Yoda conditions' are not part of idiomatic QEMU coding
   style, so rewrite them in the more usual order.
  
  
  OK but why stop at these files? How about this
  instead?
  
  I just search c files by using key words like NULL == etc.
 
  I don't think we should change conditional statements like  and =.
 
 Eric pointed out it's actually incorrect for NaNs.
 
 If you want to touch inequalities, separate patch(es) please, because
 they need more thorough review, both for correctness and for style.
 
  BTW, just using like value == NULL instead of NULL == value in all 
  files
  is not a good idea, which we have discussed in my patch serials
  v2. So, I posted
  v3, add change log  imitate nearby code about using '!value' or
  value == NULL' at
  every patch  .
 
 Re not a good idea: I think rewriting NULL == value to value ==
 NULL *is* a good idea, but rewriting it to !value where that blends
 in with surrounding code is a *better* idea.
 
 Gonglei's patches do that, Michael's don't, but are more complete.
 Therefore:

 Yes but it's unrelated to Yoda: we have x != NULL without Yoda
 in a lot of places. So this seems, to me, an unrelated issue.

Actually, the relation is clear to me.  The patch cleans up style, by
converting Yoda conditionals into the locally appropriate form.

Locally appropriate because the optimal choice between x == NULL and
!x depends on which form the context uses.

Gonglei made those choices.  It's only a small improvement, and I
wouldn't demand it, but since we already have it, let's not throw it
away.

 If people feel this == NULL - !x is desired, it's better to do it all at
 once IMHO, and do x != NULL - x at the same time.

Nope, because there's no consensus.  See thread leading to
https://lists.nongnu.org/archive/html/qemu-devel/2014-08/msg00033.html

 Easy to run another script to do it on top.

 
  So, maybe you can post patches for those files I have missed in
  the serials,
  but not simply instead all by semantic script IMO, thanks!
 
 Easy: apply Gonglei's patches before you run the script.
 
 You may have to split patches along subsystem boundaries to get them in.
 Bothersome, as it involves guessing boundaries.  Not a request from me,
 just a warning of possible misfortune :)

 It's going in through trivial tree, I don't think split-up is necessary.

Hope you have better luck than me, because when I try to route tree-wide
cleanups via -trivial, I usually don't have any.

[Qemu-devel] [PATCH] linux-user: redirect openat calls

2014-08-06 Thread riku . voipio

From: Riku Voipio riku.voi...@linaro.org

While Mikhail fixed /proc/self/maps, it was noticed openat calls are
not redirected currently. Some archs don't have open at all, so
openat needs to be redirected.

Fix this by consolidating open/openat code to do_openat - open
is implemented using openat(AT_FDCWD, ... ), which according
to open(2) man page is identical.

Since all targets now have openat, remove the ifdef around sys_openat
and openat: case in do_syscall.

Cc: Mikhail Ilin m.i...@samsung.com
Signed-off-by: Riku Voipio riku.voi...@linaro.org
---
 linux-user/syscall.c | 23 +--
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index c8c2b4c..dd77673 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -294,7 +294,6 @@ static int sys_getcwd1(char *buf, size_t size)
   return strlen(buf)+1;
 }
 
-#ifdef TARGET_NR_openat
 static int sys_openat(int dirfd, const char *pathname, int flags, mode_t mode)
 {
   /*
@@ -306,7 +305,6 @@ static int sys_openat(int dirfd, const char *pathname, int 
flags, mode_t mode)
   }
   return (openat(dirfd, pathname, flags));
 }
-#endif
 
 #ifdef TARGET_NR_utimensat
 #ifdef CONFIG_UTIMENSAT
@@ -5274,7 +5272,7 @@ static int open_net_route(void *cpu_env, int fd)
 }
 #endif
 
-static int do_open(void *cpu_env, const char *pathname, int flags, mode_t mode)
+static int do_openat(void *cpu_env, int dirfd, const char *pathname, int 
flags, mode_t mode)
 {
 struct fake_open {
 const char *filename;
@@ -5295,7 +5293,7 @@ static int do_open(void *cpu_env, const char *pathname, 
int flags, mode_t mode)
 
 if (is_proc_myself(pathname, exe)) {
 int execfd = qemu_getauxval(AT_EXECFD);
-return execfd ? execfd : get_errno(open(exec_path, flags, mode));
+return execfd ? execfd : get_errno(sys_openat(dirfd, exec_path, flags, 
mode));
 }
 
 for (fake_open = fakes; fake_open-filename; fake_open++) {
@@ -5329,7 +5327,7 @@ static int do_open(void *cpu_env, const char *pathname, 
int flags, mode_t mode)
 return fd;
 }
 
-return get_errno(open(path(pathname), flags, mode));
+return get_errno(sys_openat(dirfd, path(pathname), flags, mode));
 }
 
 /* do_syscall() should always have a single exit point at the end so
@@ -5404,22 +5402,19 @@ abi_long do_syscall(void *cpu_env, int num, abi_long 
arg1,
 case TARGET_NR_open:
 if (!(p = lock_user_string(arg1)))
 goto efault;
-ret = get_errno(do_open(cpu_env, p,
-target_to_host_bitmask(arg2, fcntl_flags_tbl),
-arg3));
+ret = get_errno(do_openat(cpu_env, AT_FDCWD, p,
+  target_to_host_bitmask(arg2, 
fcntl_flags_tbl),
+  arg3));
 unlock_user(p, arg1, 0);
 break;
-#if defined(TARGET_NR_openat)  defined(__NR_openat)
 case TARGET_NR_openat:
 if (!(p = lock_user_string(arg2)))
 goto efault;
-ret = get_errno(sys_openat(arg1,
-   path(p),
-   target_to_host_bitmask(arg3, 
fcntl_flags_tbl),
-   arg4));
+ret = get_errno(do_openat(cpu_env, arg1, p,
+  target_to_host_bitmask(arg3, 
fcntl_flags_tbl),
+  arg4));
 unlock_user(p, arg2, 0);
 break;
-#endif
 case TARGET_NR_close:
 ret = get_errno(close(arg1));
 break;
-- 
2.0.1

Re: [Qemu-devel] [PATCH] linux-user: /proc/self/maps content

2014-08-06 Thread Riku Voipio

On Tue, Aug 05, 2014 at 05:24:27PM +0400, Mikhail Ilin wrote:
 I've tested the sample for Aarch64 myself and found that the
 approach should also work fine.
 
 Translation layout:
 
 $ qemu-aarch64 -strace /tmp/busybox-static cat /proc/self/maps
 
 startend  size prot
 0040-005ba000 001ba000 r-x
 005c9000-005d3000 a000 rw-
 0040-00401000 1000 ---
 00401000-004000801000 0080 rw-
 
 /proc/self/maps output:
 
 0040-005ba000 r-xp  08:01 28837016  /tmp/busybox-static
 005ba000-005c9000 ---p  00:00 0
 005c9000-005cc000 rw-p 001b9000 08:01 28837016  /tmp/busybox-static
 005cc000-005f4000 rw-p  00:00 0
 6000-602eb000 r-xp  08:01 55578769
 /home/michail/my1/bin/qemu-aarch64
 604eb000-604f6000 rw-p 002eb000 08:01 55578769
 /home/michail/my1/bin/qemu-aarch64
 604f6000-6054a000 rw-p  00:00 0
 6054a000-6254b000 rwxp  00:00 0
 6254b000-62577000 rw-p  00:00 0
 63396000-633da000 rw-p  00:00 0  [heap]
 40-401000 ---p  00:00 0
 401000-4000801000 rw-p  00:00 0
 7ff830cab000-7ff8348fb000 rw-p  00:00 0
 7fffb26ed000-7fffb270e000 rw-p  00:00 0  [stack]
 7fffb27bb000-7fffb27bd000 r-xp  00:00 0  [vdso]
 ff60-ff601000 r-xp  00:00 0  [vsyscall]
 
 And the reason why it doesn't work for Aarch64 is openat call which is
 used instead of open.
 
 $ qemu-aarch64 -strace /tmp/busybox-static cat /proc/self/maps
 
 483 setgid(1000,0,47,45,0,274886296116) = 0
 483 setuid(1000,0,47,45,0,274886296116) = 0
 483 openat(AT_FDCWD,/proc/self/maps,O_RDONLY) = 3
 483 read(3,0x7febf0,4096) = 1071
 
 this call doesn't have additional preprocessing and called directly.
 
case TARGET_NR_openat:
 if (!(p = lock_user_string(arg2)))
 goto efault;
 ret = get_errno(sys_openat(arg1,
path(p),
target_to_host_bitmask(arg3,
 fcntl_flags_tbl),
arg4));
 
 I believe OpenRISC case looks the same.

Thanks for looking into it. I just sent a patch that adds preprocessing
to openat, and seems to clear the issue for both aarch64 and OpenRISC.

Riku

 
 On 05.08.2014 15:47, Riku Voipio wrote:
 Hi,
 
 On Tue, Aug 05, 2014 at 03:10:07PM +0400, Mikhail Ilyin wrote:
 Build /proc/self/maps doing a match against guest memory translation table.
 Output only that map records which are valid for guest memory layout.
 
 This is clear improvement, for most archs. But seems aarch64, openrisc still
 leak host maps. It's not a regression, same issue before the patch.
 
 + /home/voipio/linaro/qemu/obj/alpha-linux-user/qemu-alpha 
 /home/voipio/linaro/qemu-smoke/alpha/busybox cat /proc/self/maps
 00012000-0001201cc000 r-xp  fe:00 8784862  
 /home/voipio/linaro/qemu-smoke/alpha/busybox
 0001201dc000-0001201e rw-p 001cc000 fe:00 8784862  
 /home/voipio/linaro/qemu-smoke/alpha/busybox
 0001201e-00012020a000 rw-p  00:00 0
 0040-00402000 ---p  00:00 0
 00402000-004000802000 rw-p  00:00 0
 [stack]
 + /home/voipio/linaro/qemu/obj/arm-linux-user/qemu-arm 
 /home/voipio/linaro/qemu-smoke/armel/busybox cat /proc/self/maps
 8000-0014b000 r-xp  fe:00 8784905  
 /home/voipio/linaro/qemu-smoke/armel/busybox
 00153000-00154000 rw-p 00143000 fe:00 8784905  
 /home/voipio/linaro/qemu-smoke/armel/busybox
 00154000-0017b000 rw-p  00:00 0
 f67ff000-f680 ---p  00:00 0
 f680-f700 rw-p  00:00 0[stack]
 + /home/voipio/linaro/qemu/obj/aarch64-linux-user/qemu-aarch64 
 /home/voipio/linaro/qemu-smoke/arm64/busybox cat /proc/self/maps
 0040-00572000 r-xp  fe:00 8784917
 /home/voipio/linaro/qemu-smoke/arm64/busybox
 00572000-00581000 ---p  00:00 0
 00581000-00584000 rw-p 00171000 fe:00 8784917
 /home/voipio/linaro/qemu-smoke/arm64/busybox
 00584000-005ac000 rw-p  00:00 0
 40-401000 ---p  00:00 0
 401000-4000811000 rw-p  00:00 0
 7f38e312b000-7f38e6d2b000 rw-p  00:00 0
 7f38e6d2b000-7f38e6d86000 r-xp  fe:00 5242918
 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
 7f38e6d86000-7f38e6f86000 ---p 0005b000 fe:00 5242918
 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
 7f38e6f86000-7f38e6f87000 rw-p 0005b000 fe:00 5242918
 /lib/x86_64-linux-gnu/libpcre.so.3.13.1
 7f38e6f87000-7f38e7126000 r-xp  fe:00 5248993
 /lib/x86_64-linux-gnu/libc-2.19.so
 7f38e7126000-7f38e7326000 ---p 0019f000 fe:00 5248993
 /lib/x86_64-linux-gnu/libc-2.19.so

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 3:45 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 06/08/2014 07:33, Ming Lei ha scritto:
  I played a bit with the following, I hope it's not too naive. I couldn't
  see a difference with your patches, but at least one reason for this is
  probably that my laptop SSD isn't fast enough to make the CPU the
  bottleneck. Haven't tried ramdisk yet, that would probably be the next
  thing. (I actually wrote the patch up just for some profiling on my own,
  not for comparing throughput, but it should be usable for that as well.)
 This might not be good for the test since it is basically a sequential
 read test, which can be optimized a lot by kernel. And I always use
 randread benchmark.

 A microbenchmark already exists in tests/test-coroutine.c, and doesn't
 really tell us much; it's obvious that coroutines execute more code, the
 question is why it affects the iops performance.

Could you take a look at the coroutine benchmark I worte?  The running
result shows coroutine does decrease performance a lot compared with
bypass coroutine like the patchset is doing.


 The sequential read should be the right workload.  For fio, you want to
 get as many iops as possible to QEMU and so you need randread.  But
 qemu-img is not run in a guest and if the kernel optimizes sequential
 reads then the bypass should have even more benefits because it makes
 userspace proportionally more expensive.

 In any case, the patches as written have no hope of being accepted.  If
 you invert the logic from aio-co to co-aio, that would be much
 better even if it's tedious.

Let's not talk the bypass patch, and see if the coroutine is really the cause
of performance drop first.

Thanks,

[Qemu-devel] [Bug 1353346] [NEW] ARMv7-M software-triggered interrupts-- unexpected behaviour

2014-08-06 Thread Boris Feigin

Public bug reported:

The handling of the NVIC Software Triggered Interrupt Register in
qemu-2.1.0/hw/intc/armv7m_nvic.c:375 isn't quite right.  As things
stand, writing a zero to the STIR ends up transferring control to vector
table entry zero, which, on ARMv7-M, holds the reset value of the stack
pointer.  That's what I see with lm3s811evb emulation, and that's not
what happens on my STM NUCLEO-F103RB board (Cortex-M3).

Seems like an oversight-- the handler probably wants
armv7m_nvic_set_pending(), not gic_set_pending_private(), and the IRQ
number needs 16 added onto it to get the exception number for the
interrupt.

ARM DUI 0552A (Cortex-M3 Devices: Generic User's Guide), p. 134:
  Interrupt ID of the interrupt to trigger, in the range 0-239. For example, a 
value of 0x03 specifies interrupt IRQ3.

Cheers,
Boris

** Affects: qemu
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353346

Title:
  ARMv7-M software-triggered interrupts-- unexpected behaviour

Status in QEMU:
  New

Bug description:
  The handling of the NVIC Software Triggered Interrupt Register in
  qemu-2.1.0/hw/intc/armv7m_nvic.c:375 isn't quite right.  As things
  stand, writing a zero to the STIR ends up transferring control to
  vector table entry zero, which, on ARMv7-M, holds the reset value of
  the stack pointer.  That's what I see with lm3s811evb emulation, and
  that's not what happens on my STM NUCLEO-F103RB board (Cortex-M3).

  Seems like an oversight-- the handler probably wants
  armv7m_nvic_set_pending(), not gic_set_pending_private(), and the IRQ
  number needs 16 added onto it to get the exception number for the
  interrupt.

  ARM DUI 0552A (Cortex-M3 Devices: Generic User's Guide), p. 134:
Interrupt ID of the interrupt to trigger, in the range 0-239. For example, 
a value of 0x03 specifies interrupt IRQ3.

  Cheers,
  Boris

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353346/+subscriptions

Re: [Qemu-devel] [RFC PATCH v2 09/10] virtio-scsi-dataplane: Code to run virtio-scsi on iothread

Il 06/08/2014 07:35, Fam Zheng ha scritto:
  ifeq ($(CONFIG_VIRTIO),y)
 -obj-y += virtio-scsi.o
 +obj-y += virtio-scsi.o virtio-scsi-dataplane.o
  obj-$(CONFIG_VHOST_SCSI) += vhost-scsi.o
  endif

I first thought that this must be conditional on 
CONFIG_VIRTIO_BLK_DATA_PLANE.  However, CONFIG_VIRTIO_BLK_DATA_PLANE is 
itself obsolete:

##
# adjust virtio-blk-data-plane based on linux-aio

if test $virtio_blk_data_plane = yes -a \
$linux_aio != yes ; then
  error_exit virtio-blk-data-plane requires Linux AIO, please try 
--enable-linux-aio
elif test -z $virtio_blk_data_plane ; then
  virtio_blk_data_plane=$linux_aio
fi

and there's no requirement to have Linux AIO anymore.  Can you prepare a
patch to drop CONFIG_VIRTIO_BLK_DATA_PLANE, and replace patch 1 with it?

We can leave --disable-virtio-blk-data-plane and --enable-virtio-blk-data-plane
in for a couple of releases.

Paolo

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
 Hi Kevin,
 
 On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf kw...@redhat.com wrote:
  Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
  I have been wondering how to prove that the root cause is the ucontext
  coroutine mechanism (stack switching).  Here is an idea:
 
  Hack your bypass code path to run the request inside a coroutine.
  That way you can compare bypass without coroutine against bypass with
  coroutine.
 
  Right now I think there are doubts because the bypass code path is
  indeed a different (and not 100% correct) code path.  So this approach
  might prove that the coroutines are adding the overhead and not
  something that you bypassed.
 
  My doubts aren't only that the overhead might not come from the
  coroutines, but also whether any coroutine-related overhead is really
  unavoidable. If we can optimise coroutines, I'd strongly prefer to do
  just that instead of introducing additional code paths.
 
 OK, thank you for taking look at the problem, and hope we can
 figure out the root cause, :-)
 
 
  Another thought I had was this: If the performance difference is indeed
  only coroutines, then that is completely inside the block layer and we
  don't actually need a VM to test it. We could instead have something
  like a simple qemu-img based benchmark and should be observing the same.
 
 Even it is simpler to run a coroutine-only benchmark, and I just
 wrote a raw one, and looks coroutine does decrease performance
 a lot, please see the attachment patch, and thanks for your template
 to help me add the 'co_bench' command in qemu-img.

Yes, we can look at coroutines microbenchmarks in isolation. I actually
did do that yesterday with the yield test from tests/test-coroutine.c.
And in fact profiling immediately showed something to optimise:
pthread_getspecific() was quite high, replacing it by __thread on
systems where it works is more efficient and helped the numbers a bit.
Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
in qemu-img bench), maybe there's even something that can be done here.

However, I just wasn't sure whether a change on this level would be
relevant in a realistic environment. This is the reason why I wanted to
get a benchmark involving the block layer and some I/O.

 From the profiling data in below link:
 
 http://pastebin.com/YwH2uwbq
 
 With coroutine, the running time for same loading is increased
 ~50%(1.325s vs. 0.903s), and dcache load events is increased
 ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
 1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).
 
 The bypass code in the benchmark is very similar with the approach
 used in the bypass patch, since linux-aio with O_DIRECT seldom
 blocks in the the kernel I/O path.
 
 Maybe the benchmark is a bit extremely, but given modern storage
 device may reach millions of IOPS, and it is very easy to slow down
 the I/O by coroutine.

I think in order to optimise coroutines, such benchmarks are fair game.
It's just not guaranteed that the effects are exactly the same on real
workloads, so we should take the results with a grain of salt.

Anyhow, the coroutine version of your benchmark is buggy, it leaks all
coroutines instead of exiting them, so it can't make any use of the
coroutine pool. On my laptop, I get this (where fixed coroutine is a
version that simply removes the yield at the end):

| bypass| fixed coro| buggy coro
+---+---+--
time| 1.09s | 1.10s | 1.62s
L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
insns per cycle | 2.39  | 2.39  | 1.90

Begs the question whether you see a similar effect on a real qemu and
the coroutine pool is still not big enough? With correct use of
coroutines, the difference seems to be barely measurable even without
any I/O involved.

  I played a bit with the following, I hope it's not too naive. I couldn't
  see a difference with your patches, but at least one reason for this is
  probably that my laptop SSD isn't fast enough to make the CPU the
  bottleneck. Haven't tried ramdisk yet, that would probably be the next
  thing. (I actually wrote the patch up just for some profiling on my own,
  not for comparing throughput, but it should be usable for that as well.)
 
 This might not be good for the test since it is basically a sequential
 read test, which can be optimized a lot by kernel. And I always use
 randread benchmark.

Yes, I shortly pondered whether I should implement random offsets
instead. But then I realised that a quicker kernel operation would only
help the benchmark because we want it to test the CPU consumption in
userspace. So the faster the kernel gets, the better for us, because it
should make the impact of coroutines bigger.

Kevin

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

Il 06/08/2014 10:38, Ming Lei ha scritto:
 On Wed, Aug 6, 2014 at 3:45 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 06/08/2014 07:33, Ming Lei ha scritto:
 I played a bit with the following, I hope it's not too naive. I couldn't
 see a difference with your patches, but at least one reason for this is
 probably that my laptop SSD isn't fast enough to make the CPU the
 bottleneck. Haven't tried ramdisk yet, that would probably be the next
 thing. (I actually wrote the patch up just for some profiling on my own,
 not for comparing throughput, but it should be usable for that as well.)
 This might not be good for the test since it is basically a sequential
 read test, which can be optimized a lot by kernel. And I always use
 randread benchmark.

 A microbenchmark already exists in tests/test-coroutine.c, and doesn't
 really tell us much; it's obvious that coroutines execute more code, the
 question is why it affects the iops performance.
 
 Could you take a look at the coroutine benchmark I worte?  The running
 result shows coroutine does decrease performance a lot compared with
 bypass coroutine like the patchset is doing.

Your benchmark is synchronous, while disk I/O is asynchronous.

Your benchmark doesn't add much compared to time tests/test-coroutine
-m perf  -p /perf/yield.  It takes 8 seconds on my machine, and 10^8
function calls obviously take less than 8 seconds.  I've sent a patch to
add a baseline function call benchmark to test-coroutine.

 The sequential read should be the right workload.  For fio, you want to
 get as many iops as possible to QEMU and so you need randread.  But
 qemu-img is not run in a guest and if the kernel optimizes sequential
 reads then the bypass should have even more benefits because it makes
 userspace proportionally more expensive.

Do you agree with this?

Paolo

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

On Wed, Aug 6, 2014 at 9:47 AM, Paolo Bonzini pbonz...@redhat.com wrote:
 I suspect he's using -M pc on both.

Is it the default value? because it's not the case in my command line.

 You must use -M pc-i440fx-2.0
 on both if you're migrating from 2.0 to a different version.

wow. I wasn't expecting such behavior; i.e migration between qemu
version does not seem trivial and it removes the benefits; better
using stop/start.
thanks for the info.

-- 
William

Re: [Qemu-devel] [RFC PATCH v2 09/10] virtio-scsi-dataplane: Code to run virtio-scsi on iothread

2014-08-06 Thread Fam Zheng

On Wed, 08/06 10:45, Paolo Bonzini wrote:
 Il 06/08/2014 07:35, Fam Zheng ha scritto:
   ifeq ($(CONFIG_VIRTIO),y)
  -obj-y += virtio-scsi.o
  +obj-y += virtio-scsi.o virtio-scsi-dataplane.o
   obj-$(CONFIG_VHOST_SCSI) += vhost-scsi.o
   endif
 
 I first thought that this must be conditional on 
 CONFIG_VIRTIO_BLK_DATA_PLANE.  However, CONFIG_VIRTIO_BLK_DATA_PLANE is 
 itself obsolete:
 
 ##
 # adjust virtio-blk-data-plane based on linux-aio
 
 if test $virtio_blk_data_plane = yes -a \
 $linux_aio != yes ; then
   error_exit virtio-blk-data-plane requires Linux AIO, please try 
 --enable-linux-aio
 elif test -z $virtio_blk_data_plane ; then
   virtio_blk_data_plane=$linux_aio
 fi
 
 and there's no requirement to have Linux AIO anymore.  Can you prepare a
 patch to drop CONFIG_VIRTIO_BLK_DATA_PLANE, and replace patch 1 with it?
 
 We can leave --disable-virtio-blk-data-plane and 
 --enable-virtio-blk-data-plane
 in for a couple of releases.
 

Yes. Sounds a good idea.

Fam

Re: [Qemu-devel] Are -cdrom/-hda (or -drive if=ide) supposed to work in q35?

Am 05.08.2014 um 23:14 hat John Snow geschrieben:
 
 
 On 08/05/2014 04:30 AM, Kevin Wolf wrote:
 Am 01.08.2014 um 22:10 hat John Snow geschrieben:
 
 On 06/12/2014 05:03 AM, Markus Armbruster wrote:
 -drive mixes up configuration of backend and frontend (a.k.a. device
   model), as follows:
 
 1. It always defines a backend.  info block shows them.
 
 2. It always defines a few frontend configuration bits for the device
 models to pick up.
 
 3. Except with if=none, it posts a request to board code to create a
 suitable frontend.  It's up to the board code to honor, reject or
 ignore the request.  The i440FX boards honor it, the Q35 boards
 ignore it.
 
 Nobody has gotten around to making the Q35 boards honor it, in part
 because there has been some confusion on what if=ide is supposed to
 mean on Q35.  Should it connect an ide-hd / ide-cd in SATA mode or in
 legacy PATA mode?
 
 I've always argued for SATA, because for me if=ide does *not* imply a
 specific kind of HBA any more than if=scsi does, and the natural
 HBA for Q35 is AHCI in SATA mode.
 
 Kevin (cc'ed) has argued for a way to make it connect in legacy PATA
 mode.  I'd be fine with that, as long as it's off by default.
 
 Patches welcome.
 
 
 Kevin, (or anyone else with an opinion for that matter), what is the
 reasoning behind wanting -cdrom to use the old PATA interfaces?
 
 The assumption that makes it a problem is that sooner or later we'll
 want to make Q35 the default. Most of the changes this brings in will
 make the guest see different, but generally still compatible hardware.
 AHCI however is not compatible with IDE in the sense that an OS that has
 only an IDE driver won't work with AHCI.
 
 It wouldn't be reasonable to break things like '-hda winxp.qcow2'. And
 in fact, if the internet is right, even newer Windows version give you
 trouble when you suddenly change from IDE to AHCI. So after an upgrade
 many users would find their existing guests to be broken.
 
 For at least the immediate future, the AHCI device doesn't support
 the mixed-mode SATA/PATA access models, though I suppose we could,
 it seems like a more obvious and simple solution to just allow the
 shorthand syntactic sugar commands to use the native bus of the
 system until you specify otherwise.
 
 I think I will probably begin writing a patch under this assumption
 unless there is a better technical reason not to.
 
 I think the solution that was generally agreed on was to introduce a
 machine option that decides whether to provide an IDE or AHCI interface
 (similar to the BIOS option that commonly exists on real hardware). We
 just need someone to implement it.
 
 Kevin
 
 
 OK, though for what it's worth I think that on real hardware, that
 BIOS switch toggles configuration bits on the AHCI device that
 actually changes its signature into a new device.

I'm not completely sure, but afaik these bits aren't actually in the
AHCI standard, but vendor-specific extensions. If you toggle them,
apparently the device magically turns into a different one, including
different PCI IDs and things like that.

I think we can safely take the shortcut of directly creating the right
device and leaving the BIOS alone.

 I suppose in our case we could create a machine option that toggled
 the behavior of the AHCI device appropriately to be either IDE or
 AHCI and treated the syntactic sugar commands appropriately, though
 you still may run into problems with Windows guests if you ever
 change the default machine type -- I have a hunch that Windows would
 get rather grumpy if you swapped out its IDE device from under it
 with even the emulated ICH9 one in legacy mode, but I suppose we can
 cross that bridge when we get there.
 
 For now, in the meantime, I will assume that -M q35 also implies the
 usage of AHCI, and treat the shorthands accordingly.

Okay.

Kevin

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

On Wed, Aug 6, 2014 at 9:49 AM, Paolo Bonzini pbonz...@redhat.com wrote:
 This is a development version between 2.0.0 and 2.0.0-rc1; it may have
 bugs such as the one that was fixed in 2.0.0-rc5.

you mean 2.1.0-rc5

 Migration from development versions is not supported.  You must either
 upgrade to 2.1.0, or downgrade to the released 2.0.0 version.

in all case I would have the same issue with 2.0.0 version since the
fix has been pushed in 2.1.0-rc5

-- 
William

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

2014-08-06 Thread Dr. David Alan Gilbert

* William Dauchy (wdau...@gmail.com) wrote:
 On Wed, Aug 6, 2014 at 9:47 AM, Paolo Bonzini pbonz...@redhat.com wrote:
  I suspect he's using -M pc on both.
 
 Is it the default value? because it's not the case in my command line.
 
  You must use -M pc-i440fx-2.0
  on both if you're migrating from 2.0 to a different version.
 
 wow. I wasn't expecting such behavior; i.e migration between qemu
 version does not seem trivial and it removes the benefits; better
 using stop/start.
 thanks for the info.

The trick is to pick a -M value and stick with it; then you should be
able to keep migrating to newer QEMU versions easily (just don't
go with dev versions because things are often broken in them).

-M pc   is special, don't use that if you want to be able to migrate

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

Re: [Qemu-devel] qemu live migration error from 2.0 to 2.1

On Wed, Aug 6, 2014 at 11:17 AM, Dr. David Alan Gilbert
dgilb...@redhat.com wrote:
 The trick is to pick a -M value and stick with it; then you should be
 able to keep migrating to newer QEMU versions easily (just don't
 go with dev versions because things are often broken in them).

ok
for dev versions, I'm just cherry picking interesting fixes which are
not yet released in stable.

 -M pc   is special, don't use that if you want to be able to migrate

understood, I guess it's the default value
-- 
William

[Qemu-devel] [PATCH] test-coroutine: add baseline test that times the cost of function calls

This can be used to compute the cost of coroutine operations.  In the
end the cost of the function call is a few clock cycles, so it's pretty
cheap for now, but it may become more relevant as the coroutine code
is optimized.

For example, here are the results on my machine:

   Function call 1 iterations: 0.173884 s
   Yield 1 iterations: 8.445064 s
   Lifecycle 100 iterations: 0.098445 s
   Nesting 1 iterations of 1000 depth each: 7.406431 s

One yield takes 83 nanoseconds, one enter takes 97 nanoseconds,
one coroutine allocation takes (roughly, since some of the allocations
in the nesting test do hit the pool) 739 nanoseconds:

   (8.445064 - 0.173884) * 10^9 / 1 = 82.7
   (0.098445 * 100 - 0.173884) * 10^9 / 1 = 96.7
   (7.406431 * 10 - 0.173884) * 10^9 / 1 = 738.9

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 tests/test-coroutine.c | 24 
 1 files changed, 24 insertions(+)

diff --git a/tests/test-coroutine.c b/tests/test-coroutine.c
index 760636d..6e634f4 100644
--- a/tests/test-coroutine.c
+++ b/tests/test-coroutine.c
@@ -288,6 +288,29 @@ static void perf_yield(void)
 maxcycles, duration);
 }
 
+static __attribute__((noinline)) void dummy(unsigned *i)
+{
+(*i)--;
+}
+
+static void perf_baseline(void)
+{
+unsigned int i, maxcycles;
+double duration;
+
+maxcycles = 1;
+i = maxcycles;
+
+g_test_timer_start();
+while (i  0) {
+dummy(i);
+}
+duration = g_test_timer_elapsed();
+
+g_test_message(Function call %u iterations: %f s\n,
+maxcycles, duration);
+}
+
 int main(int argc, char **argv)
 {
 g_test_init(argc, argv, NULL);
@@ -301,6 +324,7 @@ int main(int argc, char **argv)
 g_test_add_func(/perf/lifecycle, perf_lifecycle);
 g_test_add_func(/perf/nesting, perf_nesting);
 g_test_add_func(/perf/yield, perf_yield);
+g_test_add_func(/perf/function-call, perf_baseline);
 }
 return g_test_run();
 }
-- 
1.9.3

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 4:48 PM, Kevin Wolf kw...@redhat.com wrote:
 Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
 Hi Kevin,

 On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf kw...@redhat.com wrote:
  Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
  I have been wondering how to prove that the root cause is the ucontext
  coroutine mechanism (stack switching).  Here is an idea:
 
  Hack your bypass code path to run the request inside a coroutine.
  That way you can compare bypass without coroutine against bypass with
  coroutine.
 
  Right now I think there are doubts because the bypass code path is
  indeed a different (and not 100% correct) code path.  So this approach
  might prove that the coroutines are adding the overhead and not
  something that you bypassed.
 
  My doubts aren't only that the overhead might not come from the
  coroutines, but also whether any coroutine-related overhead is really
  unavoidable. If we can optimise coroutines, I'd strongly prefer to do
  just that instead of introducing additional code paths.

 OK, thank you for taking look at the problem, and hope we can
 figure out the root cause, :-)

 
  Another thought I had was this: If the performance difference is indeed
  only coroutines, then that is completely inside the block layer and we
  don't actually need a VM to test it. We could instead have something
  like a simple qemu-img based benchmark and should be observing the same.

 Even it is simpler to run a coroutine-only benchmark, and I just
 wrote a raw one, and looks coroutine does decrease performance
 a lot, please see the attachment patch, and thanks for your template
 to help me add the 'co_bench' command in qemu-img.

 Yes, we can look at coroutines microbenchmarks in isolation. I actually
 did do that yesterday with the yield test from tests/test-coroutine.c.
 And in fact profiling immediately showed something to optimise:
 pthread_getspecific() was quite high, replacing it by __thread on
 systems where it works is more efficient and helped the numbers a bit.
 Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
 in qemu-img bench), maybe there's even something that can be done here.

The lock/unlock in dataplane is often from memory_region_find(), and Paolo
should have done lots of work on that.


 However, I just wasn't sure whether a change on this level would be
 relevant in a realistic environment. This is the reason why I wanted to
 get a benchmark involving the block layer and some I/O.

 From the profiling data in below link:

 http://pastebin.com/YwH2uwbq

 With coroutine, the running time for same loading is increased
 ~50%(1.325s vs. 0.903s), and dcache load events is increased
 ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
 1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).

 The bypass code in the benchmark is very similar with the approach
 used in the bypass patch, since linux-aio with O_DIRECT seldom
 blocks in the the kernel I/O path.

 Maybe the benchmark is a bit extremely, but given modern storage
 device may reach millions of IOPS, and it is very easy to slow down
 the I/O by coroutine.

 I think in order to optimise coroutines, such benchmarks are fair game.
 It's just not guaranteed that the effects are exactly the same on real
 workloads, so we should take the results with a grain of salt.

 Anyhow, the coroutine version of your benchmark is buggy, it leaks all
 coroutines instead of exiting them, so it can't make any use of the
 coroutine pool. On my laptop, I get this (where fixed coroutine is a
 version that simply removes the yield at the end):

 | bypass| fixed coro| buggy coro
 +---+---+--
 time| 1.09s | 1.10s | 1.62s
 L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
 insns per cycle | 2.39  | 2.39  | 1.90

 Begs the question whether you see a similar effect on a real qemu and
 the coroutine pool is still not big enough? With correct use of
 coroutines, the difference seems to be barely measurable even without
 any I/O involved.

When I comment qemu_coroutine_yield(), looks result of
bypass and fixed coro is very similar as your test, and I am just
wondering if stack is always switched in qemu_coroutine_enter()
without calling qemu_coroutine_yield().

Without the yield, the benchmark can't emulate coroutine usage in
bdrv_aio_readv/writev() path any more, and bypass in the patchset
skips two qemu_coroutine_enter() and one qemu_coroutine_yield()
for each bdrv_aio_readv/writev().


  I played a bit with the following, I hope it's not too naive. I couldn't
  see a difference with your patches, but at least one reason for this is
  probably that my laptop SSD isn't fast enough to make the CPU the
  bottleneck. Haven't tried ramdisk yet, that would probably be the next
  thing. (I actually wrote the patch up just for some profiling on my

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

2014-08-06 Thread Stefan Hajnoczi

On Wed, Aug 06, 2014 at 01:33:36PM +0800, Ming Lei wrote:
 With coroutine, the running time for same loading is increased
 ~50%(1.325s vs. 0.903s), and dcache load events is increased

I agree with Paolo about microbenchmarks.  We need to do I/O to get a
realistic picture of performance, since there is little point is
optimizing something that is not a significant factor in overall
performance.

But I also wanted to say that these benchmark durations are so short
that they can be greatly affected by outliers (e.g. scheduler behavior,
system background activity, etc).  Run benchmarks for 2 minutes to
reduce variance and give the system time to warm up.


pgpdVv9EmCnOH.pgp
Description: PGP signature

[Qemu-devel] [Bug 1353149] Re: qemu 2.1.0 fails to start if number of cores is greater than 1.

2014-08-06 Thread asavah

I forgot to mention that VM was created and managed remotely via virt-manager 
0.9.5 from another host.
I used custom cpu topology.

however this config worked fine on qemu 2.0.0 with virt-manager 0.9.5

just tried virt-manager 1.0.1 - it creates the proper argument -smp
4,maxcpus=4,sockets=1,cores=4,threads=1

so this was a virt-manager bug already fixed upstream.

this bug can be closed :)

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353149

Title:
  qemu 2.1.0 fails to start if number of cores is greater than 1.

Status in QEMU:
  New

Bug description:
  qemu (kvm) 2.1.0 (built from sources) fails to start if number of
  cores is greater than 1.

  relevant part of commandline arguments:

  /usr/bin/qemu-system-x86_64 -name test3 -S -machine pc-
  i440fx-2.1,accel=kvm,usb=off -cpu Westmere -m 4096 -realtime mlock=off
  -smp 1,maxcpus=4,sockets=1,cores=4,threads=1

  the error reported is:

  qemu-system-x86_64: /home/asavah/pkgbuild/qemu-2.1.0/hw/i386/smbios.c:825: 
smbios_get_tables: Assertion `smbios_smp_sockets = 1' failed.
  2014-08-05 21:45:35.825+: shutting down

  however setting 4 sockets with 1 core each allows me to start the
  machine just fine.

  the system is debian wheezy
  Linux hostname 3.16.0-hostname2 #2 SMP Mon Aug 4 17:02:16 EEST 2014 x86_64 
GNU/Linux

  libvirt 1.2.7 (built from sources)

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353149/+subscriptions

Re: [Qemu-devel] [v4][PATCH 4/5] xen:hw:pci-host:piix: create host bridge to passthrough

2014-08-06 Thread Michael S. Tsirkin

On Wed, Aug 06, 2014 at 02:50:34PM +0800, Tiejun Chen wrote:
 Implement a pci host bridge specific to passthrough. Actually
 this just inherits the standard one.
 
 This is based on http://patchwork.ozlabs.org/patch/363810/.
 
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com
 ---
  hw/pci-host/piix.c   | 39 +++
  include/hw/i386/pc.h |  2 ++
  2 files changed, 41 insertions(+)
 
 v4:
 
 * Rebase
 
 v3:
 
 * Rebase
 
 v2:
 
 * Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough
 
 diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
 index 4330599..5cac398 100644
 --- a/hw/pci-host/piix.c
 +++ b/hw/pci-host/piix.c
 @@ -303,6 +303,17 @@ static int i440fx_initfn(PCIDevice *dev)
  return 0;
  }
  
 +static int xen_igd_passthrough_i440fx_initfn(PCIDevice *dev)
 +{
 +PCII440FXState *d = I440FX_PCI_DEVICE(dev,
 + TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE);
 +
 +dev-config[I440FX_SMRAM] = 0x02;
 +
 +cpu_smm_register(i440fx_set_smm, d);
 +return 0;
 +}
 +
  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
  PCII440FXState **pi440fx_state,
  int *piix3_devfn,

don't duplicate code.
Reuse the function from regular piix.

 @@ -703,6 +714,33 @@ static const TypeInfo i440fx_info = {
  .class_init= i440fx_class_init,
  };
  
 +static void xen_igd_passthrough_i440fx_class_init(ObjectClass *klass, void 
 *data)
 +{
 +DeviceClass *dc = DEVICE_CLASS(klass);
 +PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
 +
 +k-init = xen_igd_passthrough_i440fx_initfn;
 +k-vendor_id = PCI_VENDOR_ID_INTEL;
 +k-device_id = PCI_DEVICE_ID_INTEL_82441;
 +k-revision = 0x02;
 +k-class_id = PCI_CLASS_BRIDGE_HOST;
 +dc-desc = IGD PT XEN Host bridge;
 +dc-vmsd = vmstate_i440fx;
 +/*
 + * PCI-facing part of the host bridge, not usable without the
 + * host-facing part, which can't be device_add'ed, yet.
 + */
 +dc-cannot_instantiate_with_device_add_yet = true;
 +dc-hotpluggable   = false;
 +}
 +
 +static const TypeInfo xen_igd_passthrough_i440fx_info = {
 +.name  = TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
 +.parent= TYPE_PCI_DEVICE,
 +.instance_size = sizeof(PCII440FXState),
 +.class_init= xen_igd_passthrough_i440fx_class_init,
 +};
 +
  static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
  PCIBus *rootbus)
  {
 @@ -744,6 +782,7 @@ static const TypeInfo i440fx_pcihost_info = {
  static void i440fx_register_types(void)
  {
  type_register_static(i440fx_info);
 +type_register_static(xen_igd_passthrough_i440fx_info);
  type_register_static(piix3_info);
  type_register_static(piix3_xen_info);
  type_register_static(i440fx_pcihost_info);
 diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
 index 11fb72f..de34aa6 100644
 --- a/include/hw/i386/pc.h
 +++ b/include/hw/i386/pc.h
 @@ -238,6 +238,8 @@ typedef struct PCII440FXState PCII440FXState;
  #define TYPE_I440FX_PCI_HOST_BRIDGE i440FX-pcihost
  #define TYPE_I440FX_PCI_DEVICE i440FX
  
 +#define TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE 
 xen-igd-passthrough-i440FX
 +
  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
  PCII440FXState **pi440fx_state, int *piix_devfn,
  ISABus **isa_bus, qemu_irq *pic,
 -- 
 1.9.1

Re: [Qemu-devel] [PATCH v2 00/30] AHCI test suite framework

2014-08-06 Thread Stefan Hajnoczi

On Mon, Aug 04, 2014 at 05:11:01PM -0400, John Snow wrote:
 This patch series introduces a number of small fixes and tweaks to
 help support an AHCI test suite that in the future I hope to expand
 to a fuller regression suite to help guide the development of the
 AHCI device support under, in particular, the Q35 machine type in QEMU.
 
 Paolo Bonzini has contributed a number of cleanup and refactoring patches
 that support changes to the PIO setup FIS packet construction code, which
 is necessary for testing ths specification adherence of the IDENTIFY command,
 which issues its data exclusively via PIO mechanisms.
 
 The ahci-test code being checked in represents a minimum of functionality
 needed in order to issue and receive commands from the AHCI HBA under the
 libqos / qtest environment.
 
 In V2, as detailed below, these tests are not currently expected to pass.
 I will post a complementary patch outside of this set that highlights
 the exact set of tests that will not pass, which can help verify at least
 the portions of these tests that do work correctly.
 
 Assertions that currently fail:
 - Ordering of PCI capabilities as defined by either AHCI or Intel ICH9
 - Boot-time values of the PxTFD register, which should not have valid
   data until after a D2H FIS is received, but does in Qemu 2.1
 - Boot-time values of the PxSIG register, which should have a specific
   placeholder signature until the first D2H FIS is received, but is
   currently blank.
 - The Descriptor Processed interrupt is expected after the IDENTIFY
   command exhausts the given PRDT, but is not seen.

I guess these are the assertion failures:
ERROR:tests/ahci-test.c:777:ahci_test_pci_spec: assertion failed ((data  0xFF) 
== PCI_CAP_ID_MSI): (0x0012 == 0x0005)
GTester: last random seed: R02Sd92815a5d013e8433808b903b2b13fb0
**
ERROR:tests/ahci-test.c:1165:ahci_test_port_spec: assertion failed ((reg)  
((0x01)) == ((0x01))): (0x == 0x0001)
GTester: last random seed: R02S4d6c05e864dc777e64141cdc6d2a18cf
**
ERROR:tests/ahci-test.c:1360:ahci_test_identify: assertion failed ((reg)  
((0x20)) == ((0x20))): (0x == 0x0020)
GTester: last random seed: R02S2b3b330b83a66badb24da80b48120b1d

Why publish this patch series if the test fails?  We can't merge failing
tests.


pgpxF10PzmRWz.pgp
Description: PGP signature

Re: [Qemu-devel] [v4][PATCH 3/5] I440FX_PCI_DEVICE: add pci_type to index

2014-08-06 Thread Michael S. Tsirkin

On Wed, Aug 06, 2014 at 02:50:33PM +0800, Tiejun Chen wrote:
 We need to use this index to reuse this macro later
 
 Signed-off-by: Tiejun Chen tiejun.c...@intel.com

Which index?
Most users don't need to change.
Just open-code OBJECT_CHECK where necessary, or add
a new wrapper.

 ---
  hw/pci-host/piix.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)
 
 v4:
 
 * New patch to extend I440FX_PCI_DEVICE
 
 diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
 index 0cd82b8..4330599 100644
 --- a/hw/pci-host/piix.c
 +++ b/hw/pci-host/piix.c
 @@ -90,8 +90,8 @@ typedef struct PIIX3State {
  MemoryRegion rcr_mem;
  } PIIX3State;
  
 -#define I440FX_PCI_DEVICE(obj) \
 -OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
 +#define I440FX_PCI_DEVICE(obj, type) \
 +OBJECT_CHECK(PCII440FXState, (obj), type)
  
  struct PCII440FXState {
  /* private */
 @@ -155,7 +155,7 @@ static void i440fx_set_smm(int val, void *arg)
  static void i440fx_write_config(PCIDevice *dev,
  uint32_t address, uint32_t val, int len)
  {
 -PCII440FXState *d = I440FX_PCI_DEVICE(dev);
 +PCII440FXState *d = I440FX_PCI_DEVICE(dev, TYPE_I440FX_PCI_DEVICE);
  
  /* XXX: implement SMRAM.D_LOCK */
  pci_default_write_config(dev, address, val, len);
 @@ -295,7 +295,7 @@ static void i440fx_pcihost_realize(DeviceState *dev, 
 Error **errp)
  
  static int i440fx_initfn(PCIDevice *dev)
  {
 -PCII440FXState *d = I440FX_PCI_DEVICE(dev);
 +PCII440FXState *d = I440FX_PCI_DEVICE(dev, TYPE_I440FX_PCI_DEVICE);
  
  dev-config[I440FX_SMRAM] = 0x02;
  
 @@ -333,7 +333,7 @@ PCIBus *i440fx_init(const char *host_type, const char 
 *pci_type,
  qdev_init_nofail(dev);
  
  d = pci_create_simple(b, 0, pci_type);
 -*pi440fx_state = I440FX_PCI_DEVICE(d);
 +*pi440fx_state = I440FX_PCI_DEVICE(d, pci_type);
  f = *pi440fx_state;
  f-system_memory = address_space_mem;
  f-pci_address_space = pci_address_space;
 -- 
 1.9.1

Re: [Qemu-devel] [v4][PATCH 4/5] xen:hw:pci-host:piix: create host bridge to passthrough


On 2014/8/6 17:42, Michael S. Tsirkin wrote:

On Wed, Aug 06, 2014 at 02:50:34PM +0800, Tiejun Chen wrote:

Implement a pci host bridge specific to passthrough. Actually
this just inherits the standard one.

This is based on http://patchwork.ozlabs.org/patch/363810/.

Signed-off-by: Tiejun Chen tiejun.c...@intel.com
---
  hw/pci-host/piix.c   | 39 +++
  include/hw/i386/pc.h |  2 ++
  2 files changed, 41 insertions(+)

v4:

* Rebase

v3:

* Rebase

v2:

* Just add prefix with XEN_IGD_PASSTHROUGH/xen_igd_passthrough

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 4330599..5cac398 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -303,6 +303,17 @@ static int i440fx_initfn(PCIDevice *dev)
  return 0;
  }

+static int xen_igd_passthrough_i440fx_initfn(PCIDevice *dev)
+{
+PCII440FXState *d = I440FX_PCI_DEVICE(dev,
+ TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE);


Sorry, I don't understand what you mean.

i440fx_initfn(PCIDevice *dev) just have one parameter, dev, how to 
multiplex that here?


Thanks
Tiejun


+
+dev-config[I440FX_SMRAM] = 0x02;
+
+cpu_smm_register(i440fx_set_smm, d);
+return 0;
+}
+
  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
  PCII440FXState **pi440fx_state,
  int *piix3_devfn,


don't duplicate code.
Reuse the function from regular piix.







@@ -703,6 +714,33 @@ static const TypeInfo i440fx_info = {
  .class_init= i440fx_class_init,
  };

+static void xen_igd_passthrough_i440fx_class_init(ObjectClass *klass, void 
*data)
+{
+DeviceClass *dc = DEVICE_CLASS(klass);
+PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
+
+k-init = xen_igd_passthrough_i440fx_initfn;
+k-vendor_id = PCI_VENDOR_ID_INTEL;
+k-device_id = PCI_DEVICE_ID_INTEL_82441;
+k-revision = 0x02;
+k-class_id = PCI_CLASS_BRIDGE_HOST;
+dc-desc = IGD PT XEN Host bridge;
+dc-vmsd = vmstate_i440fx;
+/*
+ * PCI-facing part of the host bridge, not usable without the
+ * host-facing part, which can't be device_add'ed, yet.
+ */
+dc-cannot_instantiate_with_device_add_yet = true;
+dc-hotpluggable   = false;
+}
+
+static const TypeInfo xen_igd_passthrough_i440fx_info = {
+.name  = TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE,
+.parent= TYPE_PCI_DEVICE,
+.instance_size = sizeof(PCII440FXState),
+.class_init= xen_igd_passthrough_i440fx_class_init,
+};
+
  static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
  PCIBus *rootbus)
  {
@@ -744,6 +782,7 @@ static const TypeInfo i440fx_pcihost_info = {
  static void i440fx_register_types(void)
  {
  type_register_static(i440fx_info);
+type_register_static(xen_igd_passthrough_i440fx_info);
  type_register_static(piix3_info);
  type_register_static(piix3_xen_info);
  type_register_static(i440fx_pcihost_info);
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 11fb72f..de34aa6 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -238,6 +238,8 @@ typedef struct PCII440FXState PCII440FXState;
  #define TYPE_I440FX_PCI_HOST_BRIDGE i440FX-pcihost
  #define TYPE_I440FX_PCI_DEVICE i440FX

+#define TYPE_XEN_IGD_PASSTHROUGH_I440FX_PCI_DEVICE xen-igd-passthrough-i440FX
+
  PCIBus *i440fx_init(const char *host_type, const char *pci_type,
  PCII440FXState **pi440fx_state, int *piix_devfn,
  ISABus **isa_bus, qemu_irq *pic,
--
1.9.1

[Qemu-devel] [PATCH v3] po: Add Chinese translation

2014-08-06 Thread Fam Zheng

Signed-off-by: Fam Zheng f...@redhat.com
Reviewed-by: Amos Kong ak...@redhat.com
Reviewed-by: Dongsheng Song songdongsh...@live.cn
Reviewed-by: Wei Huang wehu...@redhat.com

---
v3: Add a missing space for
msgid  - Press Ctrl+Alt+G to release grab
As pointed out by Amos.
Meanwhile adding people's review-by's. [Hope no one in the rev-by
list is against with that change.  To Wei: I changed your Acked-by
to Reviewed-by which I think is more appropriate here.]
Cc'ing qemu-trivial.

v2: Fix typo for _View.
---
 po/zh_CN.po | 86 +
 1 file changed, 86 insertions(+)
 create mode 100644 po/zh_CN.po

diff --git a/po/zh_CN.po b/po/zh_CN.po
new file mode 100644
index 000..2b1d42e
--- /dev/null
+++ b/po/zh_CN.po
@@ -0,0 +1,86 @@
+# Chinese translation for QEMU.
+# This file is put in the public domain.
+#
+# Fam Zheng f...@redhat.com, 2014
+msgid 
+msgstr 
+Project-Id-Version: QEMU 2.2\n
+Report-Msgid-Bugs-To: qemu-devel@nongnu.org\n
+POT-Creation-Date: 2014-07-31 10:03+0800\n
+PO-Revision-Date: 2014-07-31 10:00+0800\n
+Last-Translator: Fam Zheng f...@redhat.com\n
+Language-Team: Chinese z...@li.org\n
+Language: zh\n
+MIME-Version: 1.0\n
+Content-Type: text/plain; charset=UTF-8\n
+Content-Transfer-Encoding: 8bit\n
+Plural-Forms: nplurals=2; plural=n != 1;\n
+X-Generator: Lokalize 1.4\n
+
+#: ui/gtk.c:321
+msgid  - Press Ctrl+Alt+G to release grab
+msgstr  - 按下 Ctrl+Alt+G 取消捕获
+
+#: ui/gtk.c:325
+msgid  [Paused]
+msgstr  [已暂停]
+
+#: ui/gtk.c:1601
+msgid _Pause
+msgstr 暂停(_P)
+
+#: ui/gtk.c:1607
+msgid _Reset
+msgstr 重置(_R)
+
+#: ui/gtk.c:1610
+msgid Power _Down
+msgstr 关闭电源(_D)
+
+#: ui/gtk.c:1616
+msgid _Quit
+msgstr 退出(_Q)
+
+#: ui/gtk.c:1692
+msgid _Fullscreen
+msgstr 全屏(_F)
+
+#: ui/gtk.c:1702
+msgid Zoom _In
+msgstr 放大(_I)
+
+#: ui/gtk.c:1709
+msgid Zoom _Out
+msgstr 缩小(_O)
+
+#: ui/gtk.c:1716
+msgid Best _Fit
+msgstr 最合适大小(_F)
+
+#: ui/gtk.c:1723
+msgid Zoom To _Fit
+msgstr 缩放以适应大小(_F)
+
+#: ui/gtk.c:1729
+msgid Grab On _Hover
+msgstr 鼠标经过时捕获(_H)
+
+#: ui/gtk.c:1732
+msgid _Grab Input
+msgstr 捕获输入(_G)
+
+#: ui/gtk.c:1761
+msgid Show _Tabs
+msgstr 显示标签页(_T)
+
+#: ui/gtk.c:1764
+msgid Detach Tab
+msgstr 分离标签页
+
+#: ui/gtk.c:1778
+msgid _Machine
+msgstr 虚拟机(_M)
+
+#: ui/gtk.c:1783
+msgid _View
+msgstr 视图(_V)
-- 
2.0.3

Re: [Qemu-devel] [questions] about KVM asaMicrosoft-compatiblehypervisor

2014-08-06 Thread Vadim Rozenfeld

On Mon, 2014-08-04 at 14:29 +0800, Zhang Haoyu wrote:
 Hi Zhang,
 
 No I haven't seen such problem
 Which kernel version are you running?
 Host kernel: RHEL7-RC1(linux-3.10.0).
 
 Does it include the latest lazy eli changes?
 
 lazy eli or lazy eoi?
EOI
 How to confirm whether lazy eli has been included?
 
not in linux-3.10.0
 Btw, hv_spinlocks=0xfff is a pretty huge value.
 
 which value do you advise to use?
MS seems to be using 0x as a default.
best regards,
Vadim.
 
 Thanks,
 Zhang Haoyu
 Best regards,
 Vadim.

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

Am 06.08.2014 um 11:37 hat Ming Lei geschrieben:
 On Wed, Aug 6, 2014 at 4:48 PM, Kevin Wolf kw...@redhat.com wrote:
  Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
  Hi Kevin,
 
  On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf kw...@redhat.com wrote:
   Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
   I have been wondering how to prove that the root cause is the ucontext
   coroutine mechanism (stack switching).  Here is an idea:
  
   Hack your bypass code path to run the request inside a coroutine.
   That way you can compare bypass without coroutine against bypass with
   coroutine.
  
   Right now I think there are doubts because the bypass code path is
   indeed a different (and not 100% correct) code path.  So this approach
   might prove that the coroutines are adding the overhead and not
   something that you bypassed.
  
   My doubts aren't only that the overhead might not come from the
   coroutines, but also whether any coroutine-related overhead is really
   unavoidable. If we can optimise coroutines, I'd strongly prefer to do
   just that instead of introducing additional code paths.
 
  OK, thank you for taking look at the problem, and hope we can
  figure out the root cause, :-)
 
  
   Another thought I had was this: If the performance difference is indeed
   only coroutines, then that is completely inside the block layer and we
   don't actually need a VM to test it. We could instead have something
   like a simple qemu-img based benchmark and should be observing the same.
 
  Even it is simpler to run a coroutine-only benchmark, and I just
  wrote a raw one, and looks coroutine does decrease performance
  a lot, please see the attachment patch, and thanks for your template
  to help me add the 'co_bench' command in qemu-img.
 
  Yes, we can look at coroutines microbenchmarks in isolation. I actually
  did do that yesterday with the yield test from tests/test-coroutine.c.
  And in fact profiling immediately showed something to optimise:
  pthread_getspecific() was quite high, replacing it by __thread on
  systems where it works is more efficient and helped the numbers a bit.
  Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
  in qemu-img bench), maybe there's even something that can be done here.
 
 The lock/unlock in dataplane is often from memory_region_find(), and Paolo
 should have done lots of work on that.
 
 
  However, I just wasn't sure whether a change on this level would be
  relevant in a realistic environment. This is the reason why I wanted to
  get a benchmark involving the block layer and some I/O.
 
  From the profiling data in below link:
 
  http://pastebin.com/YwH2uwbq
 
  With coroutine, the running time for same loading is increased
  ~50%(1.325s vs. 0.903s), and dcache load events is increased
  ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
  1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).
 
  The bypass code in the benchmark is very similar with the approach
  used in the bypass patch, since linux-aio with O_DIRECT seldom
  blocks in the the kernel I/O path.
 
  Maybe the benchmark is a bit extremely, but given modern storage
  device may reach millions of IOPS, and it is very easy to slow down
  the I/O by coroutine.
 
  I think in order to optimise coroutines, such benchmarks are fair game.
  It's just not guaranteed that the effects are exactly the same on real
  workloads, so we should take the results with a grain of salt.
 
  Anyhow, the coroutine version of your benchmark is buggy, it leaks all
  coroutines instead of exiting them, so it can't make any use of the
  coroutine pool. On my laptop, I get this (where fixed coroutine is a
  version that simply removes the yield at the end):
 
  | bypass| fixed coro| buggy coro
  +---+---+--
  time| 1.09s | 1.10s | 1.62s
  L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
  insns per cycle | 2.39  | 2.39  | 1.90
 
  Begs the question whether you see a similar effect on a real qemu and
  the coroutine pool is still not big enough? With correct use of
  coroutines, the difference seems to be barely measurable even without
  any I/O involved.
 
 When I comment qemu_coroutine_yield(), looks result of
 bypass and fixed coro is very similar as your test, and I am just
 wondering if stack is always switched in qemu_coroutine_enter()
 without calling qemu_coroutine_yield().

Yes, definitely. qemu_coroutine_enter() always involves calling
qemu_coroutine_switch(), which is the stack switch.

 Without the yield, the benchmark can't emulate coroutine usage in
 bdrv_aio_readv/writev() path any more, and bypass in the patchset
 skips two qemu_coroutine_enter() and one qemu_coroutine_yield()
 for each bdrv_aio_readv/writev().

It's not completely comparable anyway because you're not going through a
main loop and callbacks

Re: [Qemu-devel] [v4][PATCH 3/5] I440FX_PCI_DEVICE: add pci_type to index


On 2014/8/6 17:45, Michael S. Tsirkin wrote:

On Wed, Aug 06, 2014 at 02:50:33PM +0800, Tiejun Chen wrote:

We need to use this index to reuse this macro later

Signed-off-by: Tiejun Chen tiejun.c...@intel.com


Which index?
Most users don't need to change.
Just open-code OBJECT_CHECK where necessary, or add
a new wrapper.


Okay so what about this?

hw:pci-host:piix: define I440FX_PCI_DEVICE_FROM_TYPE

We need to introduce I440FX_PCI_DEVICE_FROM_TYPE to get
object with type, then we can reuse i440fx_init() simply.

Signed-off-by: Tiejun Chen tiejun.c...@intel.com

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 0cd82b8..8c74653 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -93,6 +93,9 @@ typedef struct PIIX3State {
 #define I440FX_PCI_DEVICE(obj) \
 OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)

+#define I440FX_PCI_DEVICE_FROM_TYPE(obj, type) \
+OBJECT_CHECK(PCII440FXState, (obj), type)
+
 struct PCII440FXState {
 /* private */
 PCIDevice parent_obj;
@@ -333,7 +336,7 @@ PCIBus *i440fx_init(const char *host_type, const 
char *pci_type,

 qdev_init_nofail(dev);

 d = pci_create_simple(b, 0, pci_type);
-*pi440fx_state = I440FX_PCI_DEVICE(d);
+*pi440fx_state = I440FX_PCI_DEVICE_FROM_TYPE(d, pci_type);
 f = *pi440fx_state;
 f-system_memory = address_space_mem;
 f-pci_address_space = pci_address_space;


Thanks
Tiejun




---
  hw/pci-host/piix.c | 10 +-
  1 file changed, 5 insertions(+), 5 deletions(-)

v4:

* New patch to extend I440FX_PCI_DEVICE

diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
index 0cd82b8..4330599 100644
--- a/hw/pci-host/piix.c
+++ b/hw/pci-host/piix.c
@@ -90,8 +90,8 @@ typedef struct PIIX3State {
  MemoryRegion rcr_mem;
  } PIIX3State;

-#define I440FX_PCI_DEVICE(obj) \
-OBJECT_CHECK(PCII440FXState, (obj), TYPE_I440FX_PCI_DEVICE)
+#define I440FX_PCI_DEVICE(obj, type) \
+OBJECT_CHECK(PCII440FXState, (obj), type)

  struct PCII440FXState {
  /* private */
@@ -155,7 +155,7 @@ static void i440fx_set_smm(int val, void *arg)
  static void i440fx_write_config(PCIDevice *dev,
  uint32_t address, uint32_t val, int len)
  {
-PCII440FXState *d = I440FX_PCI_DEVICE(dev);
+PCII440FXState *d = I440FX_PCI_DEVICE(dev, TYPE_I440FX_PCI_DEVICE);

  /* XXX: implement SMRAM.D_LOCK */
  pci_default_write_config(dev, address, val, len);
@@ -295,7 +295,7 @@ static void i440fx_pcihost_realize(DeviceState *dev, Error 
**errp)

  static int i440fx_initfn(PCIDevice *dev)
  {
-PCII440FXState *d = I440FX_PCI_DEVICE(dev);
+PCII440FXState *d = I440FX_PCI_DEVICE(dev, TYPE_I440FX_PCI_DEVICE);

  dev-config[I440FX_SMRAM] = 0x02;

@@ -333,7 +333,7 @@ PCIBus *i440fx_init(const char *host_type, const char 
*pci_type,
  qdev_init_nofail(dev);

  d = pci_create_simple(b, 0, pci_type);
-*pi440fx_state = I440FX_PCI_DEVICE(d);
+*pi440fx_state = I440FX_PCI_DEVICE(d, pci_type);
  f = *pi440fx_state;
  f-system_memory = address_space_mem;
  f-pci_address_space = pci_address_space;
--
1.9.1

Re: [Qemu-devel] [edk2] license for binary drivers

2014-08-06 Thread Laszlo Ersek

On 08/06/14 09:40, Reza Jelveh wrote:
Hello,

EDK2 integrates FAT as a binary driver. What is the license of the FAT driver?

https://svn.code.sf.net/p/edk2/code/trunk/edk2/FatBinPkg/License.txt

What are the guidelines for use of binary drivers with EDK2? Specifically if
you want to bundle an OVMF firmware with qemu?

I'm not a lawyer. You should consult a lawyer.

That said, the binary or source code nature of the FAT driver is
inconsequential in this instance. You can actually find the source code
for the FAT driver, and build it yourself, you just need to import it in
the edk2 tree, and patch OvmfPkg*.{dsc,fdf) slightly.

https://svn.code.sf.net/p/edk2/code/trunk/edk2/FatBinPkg/ReadMe.txt
https://github.com/tianocore/edk2-FatPkg/

The main issue is that the FAT driver is not free software. It doesn't
satisfy the four freedom requirements listed here:

https://www.gnu.org/philosophy/free-sw.html

[...]
The freedom to run the program as you wish, for any purpose
(freedom 0).
[...]

The FAT driver license does not give you this freedom.

The FAT driver's license is more restrictive than the 3-clause BSDL that
it is based upon.

The 3-clause BSDL is GPL compatible:

http://en.wikipedia.org/wiki/BSD_licenses#3-clause_license_.28.22Revised_BSD_License.22.2C_.22New_BSD_License.22.2C_or_.22Modified_BSD_License.22.29

But the additional restriction in the FAT driver's license makes it
incompatible with the GPLv2 (which QEMU as a whole us released under,
see LICENSE).

The additional restriction most likely originates from
http://msdn.microsoft.com/en-us/gg463080.aspx, the Microsoft EFI FAT32
File System Specification:

Note: The download license agreement permits you to use the
Microsoft EFI FAT32 File System Specification only in connection
with a firmware implementation of the Extensible Firmware Initiative
Specification, v. 1.0. If you plan to implement the FAT32 File
System specification for other purposes, you must obtain an
additional license from Microsoft. For example, you must obtain an
additional license in order to create a file system for reading or
reading and writing FAT32 in digital cameras recording to flash
media, in computer operating systems reading and writing
internal/external hard disks or flash media, or in set-top boxes
reading FAT-formatted media.

(This is precisely what freedom 0 would be about.)

So no, you can't ship an OVMF binary (or source tarball) that contains
the FAT driver, bundled as part of the GPLv2 (+compatible) QEMU
distribution, either in source or in binary form.

Linux distributions have moved OVMF to their non-free channels accordingly.

https://packages.debian.org/sid/ovmf
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=745698

http://packages.ubuntu.com/trusty/misc/ovmf

What you can do is build an OVMF binary that doesn't include the FAT
driver. The result will be then covered by 3-clause BSDL (further
restricted by the OpenSSL license, if you include -D SECURE_BOOT), and
that one you can bundle with QEMU. However, such an OVMF binary will be
useless, unless users themselves can augment it somehow at build time or
at runtime with some FAT driver. (The UEFI specification requires FAT32
support.)

UEFI should have standardized a preexistent, but unencumbered, file
system. That horse is out of the barn.

Disclaimer: I am not a lawyer, and this is my personal opinion only, not
necessarily my employer's.

Laszlo

Re: [Qemu-devel] [questions] about qemu log

On Wed, Aug 6, 2014 at 12:40 AM, Peter Crosthwaite
peter.crosthwa...@xilinx.com wrote:
 Well -D will log to file only loggable (i.e. qemu_log()) information
 (which has all sorts of options and switches). Stderr, is a little
 more static and should in theory be limited to genuine errors. But if
 you want a combined log of both you can simply omit -D to default
 qemu_log output to stderr. This gives you a combined log that you can
 redirect anywhere. To be honest, this is what I do as a matter of
 course (2 foo rather than -D foo).

understood; this make it incompatible with -daemonize option.
there should be a possibility to detach the process and also redirect
stderr somewhere.
-- 
William

Re: [Qemu-devel] [PATCH RFC v2 01/12] QEMUSizedBuffer/QEMUFile

2014-08-06 Thread Dr. David Alan Gilbert

* Eric Blake (ebl...@redhat.com) wrote:
 On 07/25/2014 09:39 AM, Sanidhya Kashyap wrote:
  From: Dr. David Alan Gilbert dgilb...@redhat.com
  
  Stefan Berger's to create a QEMUFile that goes to a memory buffer;
 
 Missing something.  Maybe you meant:
 
 This is based on Stefan Berger's patch to create ...

I'll update that in my version.

  from:
  
  http://lists.gnu.org/archive/html/qemu-devel/2013-03/msg05036.html
  
  Using the QEMUFile interface, this patch adds support functions for
  operating
  on in-memory sized buffers that can be written to or read from.
 
 Odd line breaking.
 
  
  Signed-off-by: Stefan Berger stef...@linux.vnet.ibm.com
  Signed-off-by: Joel Schopp jsch...@linux.vnet.ibm.com
 
 Still looks weird to have David as author but not listed in S-o-B.

I'm thinking of trying to put this code in via a stand-alone patch
(with a testcase use); since we've got 3+ users of this code in unmerged
things and at least 3 implementations of something similar, it would
be best to get it in before someone writes a 4th.

  ---
   include/migration/qemu-file.h |  27 +++
   qemu-file.c   | 411 
  ++
   2 files changed, 438 insertions(+)
  
 
  +/**
  + * Set the length of the buffer; The primary usage of this
 
 s/The/the/

  + * function is to truncate the number of used bytes in the buffer.
  + * The size will not be extended beyond the current  number of
 
 no need for double space
 
  + * allocated bytes in the QEMUSizedBuffer.
  + *
  + * @qsb: A QEMUSizedBuffer
  + * @new_len : The new length of bytes in the buffer
  + *
  + * Returns the number of bytes the buffer was trucated or extended
 
 s/trucated/truncated/

I've picked up these typos in my world.  Thanks.

  +
  +QEMUFile *qemu_bufopen(const char *mode, QEMUSizedBuffer *input)
  +{
  +QEMUBuffer *s;
  +
  +if (mode == NULL || (mode[0] != 'r'  mode[0] != 'w') || mode[1] != 
  0) {
  +fprintf(stderr, qemu_bufopen: Argument validity check failed\n);
 
 Should this function be directly printing to stderr, or should it be
 converted to use Error **errp semantics?

It's consistent with the open functions for other file types
 ( qemu_fdopen, qemu_popen_cmd, qemu_fopen ).

Dave

 
 -- 
 Eric Blake   eblake redhat com+1-919-301-3266
 Libvirt virtualization library http://libvirt.org
 


--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

[Qemu-devel] QEMU with KVM does not start Win8 on kernel 3.4.67 and core2duo

2014-08-06 Thread Erik Rull

Hi all,

I did already several tests and I'm not completely sure what's going wrong, but
here my scenario:

When I start up QEMU w/ KVM 1.7.0 on a Core2Duo machine running a vanilla kernel
3.4.67 to run a Windows 8.0 guest, the guest freezes at boot without any error.
When I dump the CPU registers via info registers, nothing changes, that means
the system really stalled. Same happens with QEMU 2.0.0.

But - when I run the very same guest using Kernel 2.6.32.12 and QEMU 1.7.0 on
the host side it works on the Core2Duo. Also the system above but just with an
i3 or i5 CPU it works, too.

I already disabled networking and USB for the guest and changed the graphics
card - no effect. I assume that some mean bits and bytes have to be set up
properly to get the thing running.

Any hint what to change / test would be really appreciated.

Thanks in advance,

Best regards,

Erik

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 6:09 PM, Kevin Wolf kw...@redhat.com wrote:
 Am 06.08.2014 um 11:37 hat Ming Lei geschrieben:
 On Wed, Aug 6, 2014 at 4:48 PM, Kevin Wolf kw...@redhat.com wrote:
  Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
  Hi Kevin,
 
  On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf kw...@redhat.com wrote:
   Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
   I have been wondering how to prove that the root cause is the ucontext
   coroutine mechanism (stack switching).  Here is an idea:
  
   Hack your bypass code path to run the request inside a coroutine.
   That way you can compare bypass without coroutine against bypass 
   with
   coroutine.
  
   Right now I think there are doubts because the bypass code path is
   indeed a different (and not 100% correct) code path.  So this approach
   might prove that the coroutines are adding the overhead and not
   something that you bypassed.
  
   My doubts aren't only that the overhead might not come from the
   coroutines, but also whether any coroutine-related overhead is really
   unavoidable. If we can optimise coroutines, I'd strongly prefer to do
   just that instead of introducing additional code paths.
 
  OK, thank you for taking look at the problem, and hope we can
  figure out the root cause, :-)
 
  
   Another thought I had was this: If the performance difference is indeed
   only coroutines, then that is completely inside the block layer and we
   don't actually need a VM to test it. We could instead have something
   like a simple qemu-img based benchmark and should be observing the same.
 
  Even it is simpler to run a coroutine-only benchmark, and I just
  wrote a raw one, and looks coroutine does decrease performance
  a lot, please see the attachment patch, and thanks for your template
  to help me add the 'co_bench' command in qemu-img.
 
  Yes, we can look at coroutines microbenchmarks in isolation. I actually
  did do that yesterday with the yield test from tests/test-coroutine.c.
  And in fact profiling immediately showed something to optimise:
  pthread_getspecific() was quite high, replacing it by __thread on
  systems where it works is more efficient and helped the numbers a bit.
  Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
  in qemu-img bench), maybe there's even something that can be done here.

 The lock/unlock in dataplane is often from memory_region_find(), and Paolo
 should have done lots of work on that.

 
  However, I just wasn't sure whether a change on this level would be
  relevant in a realistic environment. This is the reason why I wanted to
  get a benchmark involving the block layer and some I/O.
 
  From the profiling data in below link:
 
  http://pastebin.com/YwH2uwbq
 
  With coroutine, the running time for same loading is increased
  ~50%(1.325s vs. 0.903s), and dcache load events is increased
  ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
  1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).
 
  The bypass code in the benchmark is very similar with the approach
  used in the bypass patch, since linux-aio with O_DIRECT seldom
  blocks in the the kernel I/O path.
 
  Maybe the benchmark is a bit extremely, but given modern storage
  device may reach millions of IOPS, and it is very easy to slow down
  the I/O by coroutine.
 
  I think in order to optimise coroutines, such benchmarks are fair game.
  It's just not guaranteed that the effects are exactly the same on real
  workloads, so we should take the results with a grain of salt.
 
  Anyhow, the coroutine version of your benchmark is buggy, it leaks all
  coroutines instead of exiting them, so it can't make any use of the
  coroutine pool. On my laptop, I get this (where fixed coroutine is a
  version that simply removes the yield at the end):
 
  | bypass| fixed coro| buggy coro
  +---+---+--
  time| 1.09s | 1.10s | 1.62s
  L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
  insns per cycle | 2.39  | 2.39  | 1.90
 
  Begs the question whether you see a similar effect on a real qemu and
  the coroutine pool is still not big enough? With correct use of
  coroutines, the difference seems to be barely measurable even without
  any I/O involved.

 When I comment qemu_coroutine_yield(), looks result of
 bypass and fixed coro is very similar as your test, and I am just
 wondering if stack is always switched in qemu_coroutine_enter()
 without calling qemu_coroutine_yield().

 Yes, definitely. qemu_coroutine_enter() always involves calling
 qemu_coroutine_switch(), which is the stack switch.

 Without the yield, the benchmark can't emulate coroutine usage in
 bdrv_aio_readv/writev() path any more, and bypass in the patchset
 skips two qemu_coroutine_enter() and one qemu_coroutine_yield()
 for each bdrv_aio_readv/writev().

 It's not completely

Re: [Qemu-devel] [PATCH v2 00/30] AHCI test suite framework

Stefan Hajnoczi stefa...@redhat.com writes:

 On Mon, Aug 04, 2014 at 05:11:01PM -0400, John Snow wrote:
 This patch series introduces a number of small fixes and tweaks to
 help support an AHCI test suite that in the future I hope to expand
 to a fuller regression suite to help guide the development of the
 AHCI device support under, in particular, the Q35 machine type in QEMU.
 
 Paolo Bonzini has contributed a number of cleanup and refactoring patches
 that support changes to the PIO setup FIS packet construction code, which
 is necessary for testing ths specification adherence of the IDENTIFY command,
 which issues its data exclusively via PIO mechanisms.
 
 The ahci-test code being checked in represents a minimum of functionality
 needed in order to issue and receive commands from the AHCI HBA under the
 libqos / qtest environment.
 
 In V2, as detailed below, these tests are not currently expected to pass.
 I will post a complementary patch outside of this set that highlights
 the exact set of tests that will not pass, which can help verify at least
 the portions of these tests that do work correctly.
 
 Assertions that currently fail:
 - Ordering of PCI capabilities as defined by either AHCI or Intel ICH9
 - Boot-time values of the PxTFD register, which should not have valid
   data until after a D2H FIS is received, but does in Qemu 2.1
 - Boot-time values of the PxSIG register, which should have a specific
   placeholder signature until the first D2H FIS is received, but is
   currently blank.
 - The Descriptor Processed interrupt is expected after the IDENTIFY
   command exhausts the given PRDT, but is not seen.

 I guess these are the assertion failures:
 ERROR:tests/ahci-test.c:777:ahci_test_pci_spec: assertion failed
 ((data  0xFF) == PCI_CAP_ID_MSI): (0x0012 == 0x0005)
 GTester: last random seed: R02Sd92815a5d013e8433808b903b2b13fb0
 **
 ERROR:tests/ahci-test.c:1165:ahci_test_port_spec: assertion failed
 ((reg)  ((0x01)) == ((0x01))): (0x == 0x0001)
 GTester: last random seed: R02S4d6c05e864dc777e64141cdc6d2a18cf
 **
 ERROR:tests/ahci-test.c:1360:ahci_test_identify: assertion failed
 ((reg)  ((0x20)) == ((0x20))): (0x == 0x0020)
 GTester: last random seed: R02S2b3b330b83a66badb24da80b48120b1d

 Why publish this patch series if the test fails?  We can't merge failing
 tests.

Correct.

What I do when I want to start some bug fixing work with tests is to
write the tests to expect the actual, incorrect behavior, with a
greppable comment documenting the correct behavior.  Then clean that up
as the bugs get fixed.

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 7:28 PM, Ming Lei ming@canonical.com wrote:
 On Wed, Aug 6, 2014 at 6:09 PM, Kevin Wolf kw...@redhat.com wrote:


 I use the /dev/nullb0 block device to test, which is available in linux kernel
 3.13+, and follows the difference, which looks not very big( 10%):

 And I added two parameter to your img-bench patch:

   -c CNT  # which is passed to 'data.n'
   -b   #enable bypass coroutine introduced in this patchset

 Another difference is that dataplane uses its own thread, and this
 bench takes main_loop.

 ming@:~/git/qemu$ sudo ~/bin/perf stat -e
 L1-dcache-loads,L1-dcache-load-misses,cpu-cycles,instructions,branch-instructions,branch-misses,branch-loads,branch-load-misses,dTLB-loads,dTLB-load-misses
 ./qemu-img bench -f raw -t off -n -c 1000  -b /dev/nullb0
 read time: 58024ms

  Performance counter stats for './qemu-img bench -f raw -t off -n -c
 1000 -b /dev/nullb0':

 34,874,462,357  L1-dcache-loads
   [40.00%]
714,018,039  L1-dcache-load-misses #2.05% of all
 L1-dcache hits   [40.00%]
133,897,794,677  cpu-cycles[40.05%]
116,714,230,004  instructions  #0.87  insns per
 cycle [50.02%]
 22,689,223,546  branch-instructions
   [50.01%]
391,673,952  branch-misses #1.73% of all
 branches [50.00%]
 22,726,856,215  branch-loads
   [50.01%]
 18,570,766,783  branch-load-misses
   [49.98%]
 34,944,839,907  dTLB-loads
   [39.99%]
 24,405,944  dTLB-load-misses  #0.07% of all
 dTLB cache hits  [39.99%]

   58.040785989 seconds time elapsed


 ming@:~/git/qemu$ sudo ~/bin/perf stat -e
 L1-dcache-loads,L1-dcache-load-misses,cpu-cycles,instructions,branch-instructions,branch-misses,branch-loads,branch-load-misses,dTLB-loads,dTLB-load-misses
 ./qemu-img bench -f raw -t off -n -c 1000  /dev/nullb0
 read time: 63369ms

BTW, Stefan's coroutine resize patch is applied in both the
tests(qem-img bench).

Thanks,

Re: [Qemu-devel] [questions] about qemu log

On Wed, Aug 6, 2014 at 5:42 PM, Markus Armbruster arm...@redhat.com wrote:
 Zhang Haoyu zhan...@sangfor.com writes:

 The output is on qemu's stderr.  You are in control of what that
 stderr is.

 I don't get why we can configure
 -D /path/to/unique/file/name.log

 but we also have to redirect stderr (I didn't checked if the daemonize
 option was closing it). What's the purpose of this logfile option?


Well -D will log to file only loggable (i.e. qemu_log()) information
(which has all sorts of options and switches). Stderr, is a little
more static and should in theory be limited to genuine errors. But if
you want a combined log of both you can simply omit -D to default
qemu_log output to stderr. This gives you a combined log that you can
redirect anywhere. To be honest, this is what I do as a matter of
course (2 foo rather than -D foo).

 Maybe we can introduce a new qemu option to specify a error logfile
 where stderr be redirected, like below,
 DEF(elogfile, HAS_ARG, QEMU_OPTION_elogfile, \
 -elogfile logfile redirect stderr log to logfile(default
 /var/log/qemu/vm name##.log)\n,
 QEMU_ARCH_ALL)
 STEXI
 @item -elogfile @var{logfile}
 @findex -elogfile
 redirect stderr in @var{logfile}
 ETEXI
 then we can set the error log file through qemu command,
 /var/log/qemu/vm name##.log as default.


This sounds out-of-scope for QEMU to me and makes a standard flow
non-standard. If prints are going to stderr where should be going
elsewhere they probably should be fixed. Do you have specific examples
of information going to stderr that you would rather go to a log (be
it an error log or something else?).

 I use proxmox to manage vm, it dose not redirect qemu's stderr, and
 start vm with -daemonize option,
 so the error log disappeared.
 I want to redirect the error log of qemu to a specified logfile, if
 fault happened, I can use the error log to analyze the fault.

 And, why qemu output the error log to stderr instead of a error
 logfile which can be configure?

Because the code is a mess in that regard.

You don't fix that by redirecting stderr wholesale, because that just
adds to the mess.  You fix it at the root, one ill-advised fprintf() at
a time, as Peter advises:


 Sorry, I'm afraid I misunderstand what you mean,
 should I replace all of fprintf(stderr, ...) with qemu_log() ?
 or only some cases where stderr is used where qemu_log should be, as
 Perter advises?

 I didn't mean to suggest blind conversion from fprintf() to qemu_log().
 Each instance of fprintf() needs to be reviewed before conversion.

Yes i'm afraid there is no quick one-shot fix to your problem. The
best is the suggested workarounds using stderr redirection.

  I
 think that's exactly Peter's advice, too.


Correct.

Regards,
Peter

 If so, should I still need to redirect the stderr to specified logfile
 in qemu's parent shell/process ?

 Probably.  Libvirt does it.

Re: [Qemu-devel] [PATCH v5 1/6] exec: add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao hu...@cn.fujitsu.com wrote:
 Add parameter errp to qemu_ram_alloc and qemu_ram_alloc_from_ptr so that
 we can handler errors.

handle


 Signed-off-by: Hu Tao hu...@cn.fujitsu.com
 ---
  exec.c  | 32 +++-
  include/exec/ram_addr.h |  4 ++--
  memory.c|  6 +++---
  3 files changed, 28 insertions(+), 14 deletions(-)

 diff --git a/exec.c b/exec.c
 index 765bd94..7e60a44 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -1224,7 +1224,7 @@ static int memory_try_enable_merging(void *addr, size_t 
 len)
  return qemu_madvise(addr, len, QEMU_MADV_MERGEABLE);
  }

 -static ram_addr_t ram_block_add(RAMBlock *new_block)
 +static ram_addr_t ram_block_add(RAMBlock *new_block, Error **errp)
  {
  RAMBlock *block;
  ram_addr_t old_ram_size, new_ram_size;
 @@ -1241,9 +1241,11 @@ static ram_addr_t ram_block_add(RAMBlock *new_block)
  } else {
  new_block-host = phys_mem_alloc(new_block-length);
  if (!new_block-host) {
 -fprintf(stderr, Cannot set up guest memory '%s': %s\n,
 -new_block-mr-name, strerror(errno));
 -exit(1);
 +error_setg_errno(errp, errno,
 + cannot set up guest memory '%s',
 + new_block-mr-name);
 +qemu_mutex_unlock_ramlist();
 +return -1;
  }
  memory_try_enable_merging(new_block-host, new_block-length);
  }
 @@ -1294,6 +1296,7 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
 MemoryRegion *mr,
  Error **errp)
  {
  RAMBlock *new_block;
 +ram_addr_t addr;

  if (xen_enabled()) {
  error_setg(errp, -mem-path not supported with Xen);
 @@ -1323,14 +1326,20 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
 MemoryRegion *mr,
  return -1;
  }

 -return ram_block_add(new_block);
 +addr = ram_block_add(new_block, errp);
 +if (errp  *errp) {
 +g_free(new_block);

The free being conditional on errp will cause a leak if clients
(validly) pass a NULL errp in. This free needs to be unconditional.
The way to achieve that is the local_err error_propagate pattern.

 +return -1;
 +}
 +return addr;
  }
  #endif

  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
 -   MemoryRegion *mr)
 +   MemoryRegion *mr, Error **errp)
  {
  RAMBlock *new_block;
 +ram_addr_t addr;

  size = TARGET_PAGE_ALIGN(size);
  new_block = g_malloc0(sizeof(*new_block));
 @@ -1341,12 +1350,17 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, 
 void *host,
  if (host) {
  new_block-flags |= RAM_PREALLOC;
  }
 -return ram_block_add(new_block);
 +addr = ram_block_add(new_block, errp);
 +if (errp  *errp) {
 +g_free(new_block);

ditto.

Regards,
Peter

 +return -1;
 +}
 +return addr;
  }

 -ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr)
 +ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr, Error **errp)
  {
 -return qemu_ram_alloc_from_ptr(size, NULL, mr);
 +return qemu_ram_alloc_from_ptr(size, NULL, mr, errp);
  }

  void qemu_ram_free_from_ptr(ram_addr_t addr)
 diff --git a/include/exec/ram_addr.h b/include/exec/ram_addr.h
 index 6593be1..cf1d4c7 100644
 --- a/include/exec/ram_addr.h
 +++ b/include/exec/ram_addr.h
 @@ -26,8 +26,8 @@ ram_addr_t qemu_ram_alloc_from_file(ram_addr_t size, 
 MemoryRegion *mr,
  bool share, const char *mem_path,
  Error **errp);
  ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
 -   MemoryRegion *mr);
 -ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr);
 +   MemoryRegion *mr, Error **errp);
 +ram_addr_t qemu_ram_alloc(ram_addr_t size, MemoryRegion *mr, Error **errp);
  int qemu_get_ram_fd(ram_addr_t addr);
  void *qemu_get_ram_block_host_ptr(ram_addr_t addr);
  void *qemu_get_ram_ptr(ram_addr_t addr);
 diff --git a/memory.c b/memory.c
 index 64d7176..59d9935 100644
 --- a/memory.c
 +++ b/memory.c
 @@ -1169,7 +1169,7 @@ void memory_region_init_ram(MemoryRegion *mr,
  mr-ram = true;
  mr-terminates = true;
  mr-destructor = memory_region_destructor_ram;
 -mr-ram_addr = qemu_ram_alloc(size, mr);
 +mr-ram_addr = qemu_ram_alloc(size, mr, error_abort);
  }

  #ifdef __linux__
 @@ -1199,7 +1199,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
  mr-ram = true;
  mr-terminates = true;
  mr-destructor = memory_region_destructor_ram_from_ptr;
 -mr-ram_addr = qemu_ram_alloc_from_ptr(size, ptr, mr);
 +mr-ram_addr = qemu_ram_alloc_from_ptr(size, ptr, mr, error_abort);
  }

  void

Re: [Qemu-devel] [PATCH v5 2/6] memory: add parameter errp to memory_region_init_ram

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao hu...@cn.fujitsu.com wrote:
 Add parameter errp to memory_region_init_ram and update all call sites
 to pass in error_abort.

 Signed-off-by: Hu Tao hu...@cn.fujitsu.com

Liking it.

Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com

 ---
  backends/hostmem-ram.c   |  2 +-
  hw/alpha/typhoon.c   |  3 ++-
  hw/arm/armv7m.c  |  7 ---
  hw/arm/cubieboard.c  |  2 +-
  hw/arm/digic_boards.c|  2 +-
  hw/arm/exynos4210.c  |  9 +
  hw/arm/highbank.c|  5 +++--
  hw/arm/integratorcp.c|  5 +++--
  hw/arm/kzm.c |  4 ++--
  hw/arm/mainstone.c   |  3 ++-
  hw/arm/musicpal.c|  6 --
  hw/arm/omap1.c   |  6 --
  hw/arm/omap2.c   |  6 --
  hw/arm/omap_sx1.c|  6 --
  hw/arm/palm.c|  3 ++-
  hw/arm/pxa2xx.c  | 11 +++
  hw/arm/realview.c|  9 ++---
  hw/arm/spitz.c   |  2 +-
  hw/arm/strongarm.c   |  3 ++-
  hw/arm/tosa.c|  2 +-
  hw/arm/versatilepb.c |  3 ++-
  hw/arm/vexpress.c| 15 ++-
  hw/arm/virt.c|  3 ++-
  hw/arm/xilinx_zynq.c |  6 --
  hw/block/onenand.c   |  2 +-
  hw/core/loader.c |  2 +-
  hw/cris/axis_dev88.c |  6 --
  hw/display/cg3.c |  6 --
  hw/display/qxl.c |  6 +++---
  hw/display/sm501.c   |  2 +-
  hw/display/tc6393xb.c|  3 ++-
  hw/display/tcx.c |  5 +++--
  hw/display/vga.c |  3 ++-
  hw/display/vmware_vga.c  |  3 ++-
  hw/i386/kvm/pci-assign.c |  3 ++-
  hw/i386/pc.c |  3 ++-
  hw/i386/pc_sysfw.c   |  5 +++--
  hw/input/milkymist-softusb.c |  4 ++--
  hw/lm32/lm32_boards.c|  6 --
  hw/lm32/milkymist.c  |  3 ++-
  hw/m68k/an5206.c |  4 ++--
  hw/m68k/dummy_m68k.c |  2 +-
  hw/m68k/mcf5208.c|  4 ++--
  hw/microblaze/petalogix_ml605_mmu.c  |  5 +++--
  hw/microblaze/petalogix_s3adsp1800_mmu.c |  6 --
  hw/mips/mips_fulong2e.c  |  5 +++--
  hw/mips/mips_jazz.c  |  8 +---
  hw/mips/mips_malta.c |  6 --
  hw/mips/mips_mipssim.c   |  6 --
  hw/mips/mips_r4k.c   |  5 +++--
  hw/moxie/moxiesim.c  |  4 ++--
  hw/net/milkymist-minimac2.c  |  2 +-
  hw/openrisc/openrisc_sim.c   |  2 +-
  hw/pci-host/prep.c   |  3 ++-
  hw/pci/pci.c |  2 +-
  hw/ppc/mac_newworld.c|  3 ++-
  hw/ppc/mac_oldworld.c|  3 ++-
  hw/ppc/ppc405_boards.c   |  8 +---
  hw/ppc/ppc405_uc.c   |  3 ++-
  hw/s390x/s390-virtio-ccw.c   |  2 +-
  hw/s390x/s390-virtio.c   |  2 +-
  hw/sh4/r2d.c |  2 +-
  hw/sh4/shix.c|  8 +---
  hw/sparc/leon3.c |  4 ++--
  hw/sparc/sun4m.c | 10 ++
  hw/sparc64/sun4u.c   |  6 --
  hw/unicore32/puv3.c  |  3 ++-
  hw/xtensa/sim.c  |  4 ++--
  hw/xtensa/xtfpga.c   |  8 +---
  include/exec/memory.h|  4 +++-
  memory.c |  5 +++--
  numa.c   |  4 ++--
  xen-hvm.c|  3 ++-
  73 files changed, 203 insertions(+), 128 deletions(-)

 diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
 index d9a8290..e55d066 100644
 --- a/backends/hostmem-ram.c
 +++ b/backends/hostmem-ram.c
 @@ -27,7 +27,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error 
 **errp)

  path = object_get_canonical_path_component(OBJECT(backend));
  memory_region_init_ram(backend-mr, OBJECT(backend), path,
 -   backend-size);
 +   backend-size, error_abort);
  g_free(path);
  }

 diff --git a/hw/alpha/typhoon.c b/hw/alpha/typhoon.c
 index 67a1070..058b723 100644
 --- a/hw/alpha/typhoon.c
 +++ b/hw/alpha/typhoon.c
 @@ -843,7 +843,8 @@ PCIBus *typhoon_init(ram_addr_t ram_size, ISABus 
 **isa_bus,

  /* Main memory

Re: [Qemu-devel] [PATCH v5 3/6] memory: add parameter errp to memory_region_init_ram_ptr

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao hu...@cn.fujitsu.com wrote:
 Add parameter errp to memory_region_init_ram_ptr and update all call
 sites to pass in error_abort.

 Signed-off-by: Hu Tao hu...@cn.fujitsu.com

Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com

 ---
  hw/display/g364fb.c  | 2 +-
  hw/i386/kvm/pci-assign.c | 3 ++-
  hw/misc/ivshmem.c| 5 +++--
  hw/misc/vfio.c   | 3 ++-
  hw/ppc/spapr.c   | 2 +-
  include/exec/memory.h| 4 +++-
  memory.c | 5 +++--
  7 files changed, 15 insertions(+), 9 deletions(-)

 diff --git a/hw/display/g364fb.c b/hw/display/g364fb.c
 index 46f7b41..cce33ae 100644
 --- a/hw/display/g364fb.c
 +++ b/hw/display/g364fb.c
 @@ -487,7 +487,7 @@ static void g364fb_init(DeviceState *dev, G364State *s)

  memory_region_init_io(s-mem_ctrl, NULL, g364fb_ctrl_ops, s, ctrl, 
 0x18);
  memory_region_init_ram_ptr(s-mem_vram, NULL, vram,
 -   s-vram_size, s-vram);
 +   s-vram_size, s-vram, error_abort);
  vmstate_register_ram(s-mem_vram, dev);
  memory_region_set_coalescing(s-mem_vram);
  }
 diff --git a/hw/i386/kvm/pci-assign.c b/hw/i386/kvm/pci-assign.c
 index 5dcd2d5..d2013af 100644
 --- a/hw/i386/kvm/pci-assign.c
 +++ b/hw/i386/kvm/pci-assign.c
 @@ -456,7 +456,8 @@ static void assigned_dev_register_regions(PCIRegion 
 *io_regions,
   object_get_typename(OBJECT(pci_dev)), i);
  memory_region_init_ram_ptr(pci_dev-v_addrs[i].real_iomem,
 OBJECT(pci_dev), name,
 -   cur_region-size, virtbase);
 +   cur_region-size, virtbase,
 +   error_abort);
  vmstate_register_ram(pci_dev-v_addrs[i].real_iomem,
   pci_dev-dev.qdev);
  }
 diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
 index 768e528..0949c15 100644
 --- a/hw/misc/ivshmem.c
 +++ b/hw/misc/ivshmem.c
 @@ -348,7 +348,7 @@ static void create_shared_memory_BAR(IVShmemState *s, int 
 fd) {
  ptr = mmap(0, s-ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);

  memory_region_init_ram_ptr(s-ivshmem, OBJECT(s), ivshmem.bar2,
 -   s-ivshmem_size, ptr);
 +   s-ivshmem_size, ptr, error_abort);
  vmstate_register_ram(s-ivshmem, DEVICE(s));
  memory_region_add_subregion(s-bar, 0, s-ivshmem);

 @@ -476,7 +476,8 @@ static void ivshmem_read(void *opaque, const uint8_t * 
 buf, int flags)
  map_ptr = mmap(0, s-ivshmem_size, PROT_READ|PROT_WRITE, MAP_SHARED,
  incoming_fd, 0);
  memory_region_init_ram_ptr(s-ivshmem, OBJECT(s),
 -   ivshmem.bar2, s-ivshmem_size, map_ptr);
 +   ivshmem.bar2, s-ivshmem_size, map_ptr,
 +   error_abort);
  vmstate_register_ram(s-ivshmem, DEVICE(s));

  IVSHMEM_DPRINTF(guest h/w addr = % PRIu64 , size = % PRIu64 \n,
 diff --git a/hw/misc/vfio.c b/hw/misc/vfio.c
 index 0b9eba0..91d2c95 100644
 --- a/hw/misc/vfio.c
 +++ b/hw/misc/vfio.c
 @@ -2894,7 +2894,8 @@ static int vfio_mmap_bar(VFIODevice *vdev, VFIOBAR *bar,
  goto empty_region;
  }

 -memory_region_init_ram_ptr(submem, OBJECT(vdev), name, size, *map);
 +memory_region_init_ram_ptr(submem, OBJECT(vdev), name, size, *map,
 +   error_abort);
  } else {
  empty_region:
  /* Create a zero sized sub-region to make cleanup easy. */
 diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
 index d01978f..4dfe40a 100644
 --- a/hw/ppc/spapr.c
 +++ b/hw/ppc/spapr.c
 @@ -1342,7 +1342,7 @@ static void ppc_spapr_init(MachineState *machine)
  if (rma_alloc_size  rma) {
  rma_region = g_new(MemoryRegion, 1);
  memory_region_init_ram_ptr(rma_region, NULL, ppc_spapr.rma,
 -   rma_alloc_size, rma);
 +   rma_alloc_size, rma, error_abort);
  vmstate_register_ram_global(rma_region);
  memory_region_add_subregion(sysmem, 0, rma_region);
  }
 diff --git a/include/exec/memory.h b/include/exec/memory.h
 index ec6299b..caa988d 100644
 --- a/include/exec/memory.h
 +++ b/include/exec/memory.h
 @@ -351,12 +351,14 @@ void memory_region_init_ram_from_file(MemoryRegion *mr,
   * @name: the name of the region.
   * @size: size of the region.
   * @ptr: memory to be mapped; must contain at least @size bytes.
 + * @errp: pointer to Error*, to store an error if it happens.
   */
  void memory_region_init_ram_ptr(MemoryRegion *mr,
  struct Object *owner,
  const char *name,
  uint64_t

[Qemu-devel] [Bug 1353456] [NEW] qemu-io: Failure on a qcow2 image with the fuzzed refcount table

Public bug reported:

'qemu-io -c write' and 'qemu-io -c aio_write' crashes on a qcow2 image
with a fuzzed refcount table.

Sequence:
 1. Unpack the attached archive, make a copy of test.img
 2. Put copy.img and backing_img.file in the same directory
 3. Execute
qemu-io copy.img -c write 279552 322560
  or
   qemu-io copy.img -c aio_write 836608 166400

Result: qemu-io was killed by SIGIOT with the reason:

qemu-io: block/qcow2-cluster.c:1291: qcow2_alloc_cluster_offset:
Assertion `*host_offset != 0' failed.

qemu.git HEAD 69f87f713069f1f

** Affects: qemu
 Importance: Undecided
 Status: New

** Attachment added: traces.n.images.tar.gz
   
https://bugs.launchpad.net/bugs/1353456/+attachment/4171103/+files/traces.n.images.tar.gz

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1353456

Title:
  qemu-io: Failure on a qcow2 image with the fuzzed refcount table

Status in QEMU:
  New

Bug description:
  'qemu-io -c write' and 'qemu-io -c aio_write' crashes on a qcow2 image
  with a fuzzed refcount table.

  Sequence:
   1. Unpack the attached archive, make a copy of test.img
   2. Put copy.img and backing_img.file in the same directory
   3. Execute
  qemu-io copy.img -c write 279552 322560
or
 qemu-io copy.img -c aio_write 836608 166400

  Result: qemu-io was killed by SIGIOT with the reason:

  qemu-io: block/qcow2-cluster.c:1291: qcow2_alloc_cluster_offset:
  Assertion `*host_offset != 0' failed.

  qemu.git HEAD 69f87f713069f1f

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1353456/+subscriptions

Re: [Qemu-devel] [PATCH v2 4/4] ivshmem: check the value returned by fstat()

2014-08-06 Thread Levente Kurusa

 The function fstat() may fail, so check its return value.
 
 Signed-off-by: zhanghailiang zhang.zhanghaili...@huawei.com
 ---
  hw/misc/ivshmem.c | 6 +-
  1 file changed, 5 insertions(+), 1 deletion(-)
 
 diff --git a/hw/misc/ivshmem.c b/hw/misc/ivshmem.c
 index 768e528..5d939d2 100644
 --- a/hw/misc/ivshmem.c
 +++ b/hw/misc/ivshmem.c
 @@ -324,7 +324,11 @@ static int check_shm_size(IVShmemState *s, int fd) {
  
  struct stat buf;
  
 -fstat(fd, buf);
 +if (fstat(fd, buf)  0) {
 +fprintf(stderr, ivshmem: exiting for fstat fd '%d' failed: %s\n,
 +fd, strerror(errno));

exiting for fstat?

I would go with something like this:

ivshmem: exiting: fstat on fd %d failed: %s

... or something among the lines.

Thanks!
Levente Kurusa

Re: [Qemu-devel] [PATCH v5 4/6] memory: add parameter errp to memory_region_init_rom_device

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao hu...@cn.fujitsu.com wrote:
 Add parameter errp to memory_region_init_rom_device and update all call
 sites to pass in error_abort.

 Signed-off-by: Hu Tao hu...@cn.fujitsu.com

Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com

 ---
  hw/block/pflash_cfi01.c | 2 +-
  hw/block/pflash_cfi02.c | 2 +-
  include/exec/memory.h   | 4 +++-
  memory.c| 5 +++--
  4 files changed, 8 insertions(+), 5 deletions(-)

 diff --git a/hw/block/pflash_cfi01.c b/hw/block/pflash_cfi01.c
 index f9507b4..649565d 100644
 --- a/hw/block/pflash_cfi01.c
 +++ b/hw/block/pflash_cfi01.c
 @@ -770,7 +770,7 @@ static void pflash_cfi01_realize(DeviceState *dev, Error 
 **errp)
  memory_region_init_rom_device(
  pfl-mem, OBJECT(dev),
  pfl-be ? pflash_cfi01_ops_be : pflash_cfi01_ops_le, pfl,
 -pfl-name, total_len);
 +pfl-name, total_len, error_abort);
  vmstate_register_ram(pfl-mem, DEVICE(pfl));
  pfl-storage = memory_region_get_ram_ptr(pfl-mem);
  sysbus_init_mmio(SYS_BUS_DEVICE(dev), pfl-mem);
 diff --git a/hw/block/pflash_cfi02.c b/hw/block/pflash_cfi02.c
 index 8d4b828..49db02d 100644
 --- a/hw/block/pflash_cfi02.c
 +++ b/hw/block/pflash_cfi02.c
 @@ -608,7 +608,7 @@ static void pflash_cfi02_realize(DeviceState *dev, Error 
 **errp)

  memory_region_init_rom_device(pfl-orig_mem, OBJECT(pfl), pfl-be ?
pflash_cfi02_ops_be : 
 pflash_cfi02_ops_le,
 -  pfl, pfl-name, chip_len);
 +  pfl, pfl-name, chip_len, error_abort);

We probably should take the opportunity to error_propagate in these
cases, to prepare support for hotplug of devs like this. But I think
your blind conversions are a good first step as they will preserve
existing behaviour. So lets call that follow up.

Regards,
Peter

  vmstate_register_ram(pfl-orig_mem, DEVICE(pfl));
  pfl-storage = memory_region_get_ram_ptr(pfl-orig_mem);
  pfl-chip_len = chip_len;
 diff --git a/include/exec/memory.h b/include/exec/memory.h
 index caa988d..71bed47 100644
 --- a/include/exec/memory.h
 +++ b/include/exec/memory.h
 @@ -388,13 +388,15 @@ void memory_region_init_alias(MemoryRegion *mr,
   * @ops: callbacks for write access handling.
   * @name: the name of the region.
   * @size: size of the region.
 + * @errp: pointer to Error*, to store an error if it happens.
   */
  void memory_region_init_rom_device(MemoryRegion *mr,
 struct Object *owner,
 const MemoryRegionOps *ops,
 void *opaque,
 const char *name,
 -   uint64_t size);
 +   uint64_t size,
 +   Error **errp);

  /**
   * memory_region_init_reservation: Initialize a memory region that reserves
 diff --git a/memory.c b/memory.c
 index bcebfd8..06a7e1b 100644
 --- a/memory.c
 +++ b/memory.c
 @@ -1223,7 +1223,8 @@ void memory_region_init_rom_device(MemoryRegion *mr,
 const MemoryRegionOps *ops,
 void *opaque,
 const char *name,
 -   uint64_t size)
 +   uint64_t size,
 +   Error **errp)
  {
  memory_region_init(mr, owner, name, size);
  mr-ops = ops;
 @@ -1231,7 +1232,7 @@ void memory_region_init_rom_device(MemoryRegion *mr,
  mr-terminates = true;
  mr-rom_device = true;
  mr-destructor = memory_region_destructor_rom_device;
 -mr-ram_addr = qemu_ram_alloc(size, mr, error_abort);
 +mr-ram_addr = qemu_ram_alloc(size, mr, errp);
  }

  void memory_region_init_iommu(MemoryRegion *mr,
 --
 1.9.3

Re: [Qemu-devel] [PATCH v5 5/6] hostmem-ram: don't exit qemu if size of memory-backend-ram is way too big

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao hu...@cn.fujitsu.com wrote:
 When using monitor command object_add to add a memory backend whose
 size is way too big to allocate memory for it, qemu just exits. In
 the case we'd better give an error message and keep guest running.

 The problem can be reproduced as follows:

 1. run qemu
 2. (monitor)object_add memory-backend-ram,size=10G,id=ram0

 Signed-off-by: Hu Tao hu...@cn.fujitsu.com

Reviewed-by: Peter Crosthwaite peter.crosthwa...@xilinx.com

 ---
  backends/hostmem-ram.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

 diff --git a/backends/hostmem-ram.c b/backends/hostmem-ram.c
 index e55d066..a67a134 100644
 --- a/backends/hostmem-ram.c
 +++ b/backends/hostmem-ram.c
 @@ -27,7 +27,7 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error 
 **errp)

  path = object_get_canonical_path_component(OBJECT(backend));
  memory_region_init_ram(backend-mr, OBJECT(backend), path,
 -   backend-size, error_abort);
 +   backend-size, errp);
  g_free(path);
  }

 --
 1.9.3

Re: [Qemu-devel] [PATCH v5 6/6] exec: improve error handling and reporting in file_ram_alloc() and gethugepagesize()

You subject line is excessively long. How about just improve RAM file
error handling and elaborate in a commit msg para?

On Wed, Aug 6, 2014 at 3:36 PM, Hu Tao hu...@cn.fujitsu.com wrote:
 This patch fixes two problems of memory-backend-file:


It looks like two self contained changes. Any reason to not split?

 1. If user adds a memory-backend-file object using object_add command,
specifying a non-existing directory for property mem-path, qemu
will core dump with message:

  /nonexistingdir: No such file or directory
  Bad ram offset f000
  Aborted (core dumped)

with this patch, qemu reports error message like:

  qemu-system-x86_64: -object 
 memory-backend-file,mem-path=/nonexistingdir,id=mem-file0,size=128M:
  failed to stat file /nonexistingdir: No such file or directory

 2. If user adds a memory-backend-file object using object_add command,
specifying a size that is less than huge page size, qemu
will core dump with message:

  Bad ram offset f000
  Aborted (core dumped)

with this patch, qemu reports error message like:

  qemu-system-x86_64: -object 
 memory-backend-file,mem-path=/hugepages,id=mem-file0,size=1M: memory
  size 0x10 should be euqal or larger than huge page size 0x20

equal.


 Signed-off-by: Hu Tao hu...@cn.fujitsu.com
 ---
  exec.c | 21 -
  1 file changed, 12 insertions(+), 9 deletions(-)

 diff --git a/exec.c b/exec.c
 index 7e60a44..6512820 100644
 --- a/exec.c
 +++ b/exec.c
 @@ -996,7 +996,7 @@ void qemu_mutex_unlock_ramlist(void)


  #define HUGETLBFS_MAGIC   0x958458f6

 -static long gethugepagesize(const char *path)
 +static long gethugepagesize(const char *path, Error **errp)
  {
  struct statfs fs;
  int ret;
 @@ -1006,7 +1006,7 @@ static long gethugepagesize(const char *path)
  } while (ret != 0  errno == EINTR);

  if (ret != 0) {
 -perror(path);
 +error_setg_errno(errp, errno, failed to get size of file %s, path);

I think your error message is imprecise. It's not the file size you
are trying to get its the page size for that file (or its underlying
file system I think).

  return 0;
  }

 @@ -1024,17 +1024,20 @@ static void *file_ram_alloc(RAMBlock *block,
  char *filename;
  char *sanitized_name;
  char *c;
 -void *area;
 +void *area = NULL;
  int fd;
 -unsigned long hpagesize;
 +uint64_t hpagesize;

 -hpagesize = gethugepagesize(path);
 -if (!hpagesize) {
 +hpagesize = gethugepagesize(path, errp);
 +if (errp  *errp) {

More flow control dependent on non NULL errp. I think you want a
local_err for safety here.

  goto error;
  }

  if (memory  hpagesize) {
 -return NULL;
 +error_setg(errp, memory size 0x RAM_ADDR_FMT  must be euqal to 

equal

Regards,
Peter

 +   or larger than huge page size 0x% PRIx64,
 +   memory, hpagesize);
 +goto error;
  }

  if (kvm_enabled()  !kvm_has_sync_mmu()) {
 @@ -1094,8 +1097,8 @@ static void *file_ram_alloc(RAMBlock *block,
  return area;

  error:
 -if (mem_prealloc) {
 -exit(1);
 +if (area  area != MAP_FAILED) {
 +munmap(area, memory);
  }
  return NULL;
  }
 --
 1.9.3

[Qemu-devel] [PATCH V5 0/5] tests: Add the image fuzzer with qcow2 support


This patch series introduces the image fuzzer, a tool for stability and
reliability testing.
Its approach is to run large amount of tests in background. During every test a
program (e.g. qemu-img) is called to read or modify an invalid test image.
A test image has valid inner structure defined by its format specification with
some fields having random invalid values.

Patch 1 contains documentation for the image fuzzer, patch 2 is the test runner
and remaining ones relate to the image generator for qcow2 format.

This patch series was created for the 'block-next' branch.

v4 - v5:
Runner:
 * Added a warning message if a backing file failed to be created (based on
   the review of Fam Zheng)
 * Back up a test image before every test command
 * Fixed always logged messages of a program under test
 * Added a warning message if a wrong name of a program under test is specified
 * Made offset and length qemu-io arguments multiple of a sector size
 * Print all errors to stderr

Layout:
 * Fixed estimation of the feature name table length

Fuzz:
 * Added fuzzing vector for 8bit values

Overall:
 * Missed white spaces (based on reviews of Fam Zheng and Stefan Hajnoczi)
 * Simplified attribute calls (based on the review of Stefan Hajnoczi)

Maria Kustova (5):
  docs: Specification for the image fuzzer
  runner: Tool for fuzz tests execution
  fuzz: Fuzzing functions for qcow2 images
  layout: Generator of fuzzed qcow2 images
  package: Public API for image-fuzzer/runner/runner.py

 tests/image-fuzzer/docs/image-fuzzer.txt | 239 ++
 tests/image-fuzzer/qcow2/__init__.py |   1 +
 tests/image-fuzzer/qcow2/fuzz.py | 327 +
 tests/image-fuzzer/qcow2/layout.py   | 369 
 tests/image-fuzzer/runner/runner.py  | 405 +++
 5 files changed, 1341 insertions(+)
 create mode 100644 tests/image-fuzzer/docs/image-fuzzer.txt
 create mode 100644 tests/image-fuzzer/qcow2/__init__.py
 create mode 100644 tests/image-fuzzer/qcow2/fuzz.py
 create mode 100644 tests/image-fuzzer/qcow2/layout.py
 create mode 100755 tests/image-fuzzer/runner/runner.py

-- 
1.9.3

[Qemu-devel] [PATCH V5 5/5] package: Public API for image-fuzzer/runner/runner.py

__init__.py provides the public API required by the test runner

Signed-off-by: Maria Kustova mari...@catit.be
---
 tests/image-fuzzer/qcow2/__init__.py | 1 +
 1 file changed, 1 insertion(+)
 create mode 100644 tests/image-fuzzer/qcow2/__init__.py

diff --git a/tests/image-fuzzer/qcow2/__init__.py 
b/tests/image-fuzzer/qcow2/__init__.py
new file mode 100644
index 000..e2ebe19
--- /dev/null
+++ b/tests/image-fuzzer/qcow2/__init__.py
@@ -0,0 +1 @@
+from layout import create_image
-- 
1.9.3

[Qemu-devel] [PATCH V5 1/5] docs: Specification for the image fuzzer

'Overall fuzzer requirements' chapter contains the current product vision and
features done and to be done. This chapter is still in progress.

Signed-off-by: Maria Kustova mari...@catit.be
---
 tests/image-fuzzer/docs/image-fuzzer.txt | 239 +++
 1 file changed, 239 insertions(+)
 create mode 100644 tests/image-fuzzer/docs/image-fuzzer.txt

diff --git a/tests/image-fuzzer/docs/image-fuzzer.txt 
b/tests/image-fuzzer/docs/image-fuzzer.txt
new file mode 100644
index 000..efe0ed4
--- /dev/null
+++ b/tests/image-fuzzer/docs/image-fuzzer.txt
@@ -0,0 +1,239 @@
+# Specification for the fuzz testing tool
+#
+# Copyright (C) 2014 Maria Kustova mari...@catit.be
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+
+
+Image fuzzer
+
+
+Description
+---
+
+The goal of the image fuzzer is to catch crashes of qemu-io/qemu-img
+by providing to them randomly corrupted images.
+Test images are generated from scratch and have valid inner structure with some
+elements, e.g. L1/L2 tables, having random invalid values.
+
+
+Test runner
+---
+
+The test runner generates test images, executes tests utilizing generated
+images, indicates their results and collects all test related artifacts (logs,
+core dumps, test images).
+The test means execution of all available commands under test with the same
+generated test image.
+By default, the test runner generates new tests and executes them until
+keyboard interruption. But if a test seed is specified via the '--seed' runner
+parameter, then only one test with this seed will be executed, after its finish
+the runner will exit.
+
+The runner uses an external image fuzzer to generate test images. An image
+generator should be specified as a mandatory parameter of the test runner.
+Details about interactions between the runner and fuzzers see Module
+interfaces.
+
+The runner activates generation of core dumps during test executions, but it
+assumes that core dumps will be generated in the current working directory.
+For comprehensive test results, please, set up your test environment
+properly.
+
+Paths to binaries under test (SUTs) qemu-img and qemu-io are retrieved from
+environment variables. If the environment check fails the runner will
+use SUTs installed in system paths.
+qemu-img is required for creation of backing files, so it's mandatory to set
+the related environment variable if it's not installed in the system path.
+For details about environment variables see qemu-iotests/check.
+
+The runner accepts via the '--config' argument a JSON array of fields expected
+to be fuzzed, e.g.
+
+   '[[feature_name_table], [header, l1_table_offset]]'
+
+Each sublist can be have one or two strings defining image structure elements.
+In the latter case a parent element should be placed on the first position,
+and a field name on the second one.
+
+The runner accepts a list of commands under test as a JSON array via
+the '--command' argument. Each command is a list containing a SUT and all its
+arguments, e.g.
+
+   runner.py -c '[[qemu-io, $test_img, -c, write $off $len]]'
+ /tmp/test ../qcow2
+
+For variable arguments next aliases can be used:
+- $test_img for a fuzzed img
+- $off for an offset in the fuzzed image
+- $len for a data size
+
+Values for last two aliases will be generated based on the size of the virtal
+disk in the fuzzed image.
+In case when no commands are specified the runner will execute commands from
+the default list:
+- qemu-img check
+- qemu-img info
+- qemu-img convert
+- qemu-io -c read
+- qemu-io -c write
+- qemu-io -c aio_read
+- qemu-io -c aio_write
+- qemu-io -c flush
+- qemu-io -c discard
+- qemu-io -c truncate
+
+
+Qcow2 image generator
+-
+
+The 'qcow2' generator is a Python package providing 'create_image' method as
+a single public API. See details in 'Test runner/image fuzzer' in 'Module
+interfaces'.
+
+Qcow2 contains two submodules: fuzz.py and layout.py.
+
+'fuzz.py' contains all fuzzing functions, one per image field. It's assumed
+that after code analysis every field will have own constraints for its value.
+For now only universal potentially dangerous values are used, e.g. type limits
+for integers or unsafe symbols as '%s' for strings. For bitmasks random amount
+of bits are set to ones. All fuzzed values are checked on

[Qemu-devel] [PATCH V5 4/5] layout: Generator of fuzzed qcow2 images

The layout submodule of the qcow2 package creates a random valid image,
randomly selects some amount of its fields, fuzzes them and write the fuzzed
image to the file. Fuzzing process can be controlled by an external
configuration.

Signed-off-by: Maria Kustova mari...@catit.be
---
 tests/image-fuzzer/qcow2/layout.py | 369 +
 1 file changed, 369 insertions(+)
 create mode 100644 tests/image-fuzzer/qcow2/layout.py

diff --git a/tests/image-fuzzer/qcow2/layout.py 
b/tests/image-fuzzer/qcow2/layout.py
new file mode 100644
index 000..4c08202
--- /dev/null
+++ b/tests/image-fuzzer/qcow2/layout.py
@@ -0,0 +1,369 @@
+# Generator of fuzzed qcow2 images
+#
+# Copyright (C) 2014 Maria Kustova mari...@catit.be
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+import random
+import struct
+import fuzz
+
+MAX_IMAGE_SIZE = 10 * (1  20)
+# Standard sizes
+UINT32_S = 4
+UINT64_S = 8
+
+
+class Field(object):
+
+Atomic image element (field).
+
+The class represents an image field as quadruple of a data format
+of value necessary for its packing to binary form, an offset from
+the beginning of the image, a value and a name.
+
+The field can be iterated as a list [format, offset, value].
+
+
+__slots__ = ('fmt', 'offset', 'value', 'name')
+
+def __init__(self, fmt, offset, val, name):
+self.fmt = fmt
+self.offset = offset
+self.value = val
+self.name = name
+
+def __iter__(self):
+return iter([self.fmt, self.offset, self.value])
+
+def __repr__(self):
+return Field(fmt='%s', offset=%d, value=%s, name=%s) % \
+(self.fmt, self.offset, str(self.value), self.name)
+
+
+class FieldsList(object):
+
+List of fields.
+
+The class allows access to a field in the list by its name and joins
+several list in one via in-place addition.
+
+
+def __init__(self, meta_data=None):
+if meta_data is None:
+self.data = []
+else:
+self.data = [Field(f[0], f[1], f[2], f[3])
+ for f in meta_data]
+
+def __getitem__(self, name):
+return [x for x in self.data if x.name == name]
+
+def __iter__(self):
+return iter(self.data)
+
+def __iadd__(self, other):
+self.data += other.data
+return self
+
+def __len__(self):
+return len(self.data)
+
+
+class Image(object):
+
+ Qcow2 image object.
+
+This class allows to create qcow2 images with random valid structures and
+values, fuzz them via external qcow2.fuzz module and write the result to
+a file.
+
+
+@staticmethod
+def _size_params():
+Generate a random image size aligned to a random correct
+cluster size.
+
+cluster_bits = random.randrange(9, 21)
+cluster_size = 1  cluster_bits
+img_size = random.randrange(0, MAX_IMAGE_SIZE + 1, cluster_size)
+return (cluster_bits, img_size)
+
+@staticmethod
+def _header(cluster_bits, img_size, backing_file_name=None):
+Generate a random valid header.
+meta_header = [
+['4s', 0, QFI\xfb, 'magic'],
+['I', 4, random.randint(2, 3), 'version'],
+['Q', 8, 0, 'backing_file_offset'],
+['I', 16, 0, 'backing_file_size'],
+['I', 20, cluster_bits, 'cluster_bits'],
+['Q', 24, img_size, 'size'],
+['I', 32, 0, 'crypt_method'],
+['I', 36, 0, 'l1_size'],
+['Q', 40, 0, 'l1_table_offset'],
+['Q', 48, 0, 'refcount_table_offset'],
+['I', 56, 0, 'refcount_table_clusters'],
+['I', 60, 0, 'nb_snapshots'],
+['Q', 64, 0, 'snapshots_offset'],
+['Q', 72, 0, 'incompatible_features'],
+['Q', 80, 0, 'compatible_features'],
+['Q', 88, 0, 'autoclear_features'],
+# Only refcount_order = 4 is supported by current (07.2014)
+# implementation of QEMU
+['I', 96, 4, 'refcount_order'],
+['I', 100, 0, 'header_length']
+]
+v_header = FieldsList(meta_header)
+
+if v_header['version'][0].value == 2:
+v_header['header_length'][0].value = 72
+else:
+v_header['incompatible_features'][0].value = random.getrandbits(2)
+

[Qemu-devel] [PATCH V5 2/5] runner: Tool for fuzz tests execution

The purpose of the test runner is to prepare the test environment (e.g. create
a work directory, a test image, etc), execute a program under test with
parameters, indicate a test failure if the program was killed during the test
execution and collect core dumps, logs and other test artifacts.

The test runner doesn't depend on an image format or a program will be tested,
so it can be used with any external image generator and program under test.

Signed-off-by: Maria Kustova mari...@catit.be
---
 tests/image-fuzzer/runner/runner.py | 405 
 1 file changed, 405 insertions(+)
 create mode 100755 tests/image-fuzzer/runner/runner.py

diff --git a/tests/image-fuzzer/runner/runner.py 
b/tests/image-fuzzer/runner/runner.py
new file mode 100755
index 000..3fa7fca
--- /dev/null
+++ b/tests/image-fuzzer/runner/runner.py
@@ -0,0 +1,405 @@
+#!/usr/bin/env python
+
+# Tool for running fuzz tests
+#
+# Copyright (C) 2014 Maria Kustova mari...@catit.be
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+import sys
+import os
+import signal
+import subprocess
+import random
+import shutil
+from itertools import count
+import getopt
+import StringIO
+import resource
+
+try:
+import json
+except ImportError:
+try:
+import simplejson as json
+except ImportError:
+print sys.stderr, \
+Warning: Module for JSON processing is not found.\n \
+'--config' and '--command' options are not supported.
+
+# Backing file sizes in MB
+MAX_BACKING_FILE_SIZE = 10
+MIN_BACKING_FILE_SIZE = 1
+
+
+def multilog(msg, *output):
+ Write an object to all of specified file descriptors.
+for fd in output:
+fd.write(msg)
+fd.flush()
+
+
+def str_signal(sig):
+ Convert a numeric value of a system signal to the string one
+defined by the current operational system.
+
+for k, v in signal.__dict__.items():
+if v == sig:
+return k
+
+
+def run_app(fd, q_args):
+Start an application with specified arguments and return its exit code
+or kill signal depending on the result of execution.
+
+devnull = open('/dev/null', 'r+')
+process = subprocess.Popen(q_args, stdin=devnull,
+   stdout=subprocess.PIPE,
+   stderr=subprocess.PIPE)
+out, err = process.communicate()
+fd.write(out)
+fd.write(err)
+return process.returncode
+
+
+class TestException(Exception):
+Exception for errors risen by TestEnv objects.
+pass
+
+
+class TestEnv(object):
+
+Test object.
+
+The class sets up test environment, generates backing and test images
+and executes application under tests with specified arguments and a test
+image provided.
+
+All logs are collected.
+
+The summary log will contain short descriptions and statuses of tests in
+a run.
+
+The test log will include application (e.g. 'qemu-img') logs besides info
+sent to the summary log.
+
+
+def __init__(self, test_id, seed, work_dir, run_log,
+ cleanup=True, log_all=False):
+Set test environment in a specified work directory.
+
+Path to qemu-img and qemu-io will be retrieved from 'QEMU_IMG' and
+'QEMU_IO' environment variables.
+
+if seed is not None:
+self.seed = seed
+else:
+self.seed = str(random.randint(0, sys.maxint))
+random.seed(self.seed)
+
+self.init_path = os.getcwd()
+self.work_dir = work_dir
+self.current_dir = os.path.join(work_dir, 'test-' + test_id)
+self.qemu_img = os.environ.get('QEMU_IMG', 'qemu-img')\
+  .strip().split(' ')
+self.qemu_io = os.environ.get('QEMU_IO', 'qemu-io').strip().split(' ')
+self.commands = [['qemu-img', 'check', '-f', 'qcow2', '$test_img'],
+ ['qemu-img', 'info', '-f', 'qcow2', '$test_img'],
+ ['qemu-io', '$test_img', '-c', 'read $off $len'],
+ ['qemu-io', '$test_img', '-c', 'write $off $len'],
+ ['qemu-io', '$test_img', '-c',
+  'aio_read $off $len'],
+ ['qemu-io', '$test_img', '-c',
+  'aio_write $off $len'],
+ ['qemu-io', '$test_img', '-c',

[Qemu-devel] [PATCH V5 3/5] fuzz: Fuzzing functions for qcow2 images

The fuzz submodule of the qcow2 image generator contains fuzzing functions for
image fields.
Each fuzzing function contains a list of constraints and a call of a helper
function that randomly selects a fuzzed value satisfied to one of constraints.
For now constraints include only known as invalid or potentially dangerous
values. But after investigation of code coverage by fuzz tests they will be
expanded by heuristic values based on inner checks and flows of a program
under test.

Now fuzzing of a header, header extensions and a backing file name is
supported.

Signed-off-by: Maria Kustova mari...@catit.be
---
 tests/image-fuzzer/qcow2/fuzz.py | 327 +++
 1 file changed, 327 insertions(+)
 create mode 100644 tests/image-fuzzer/qcow2/fuzz.py

diff --git a/tests/image-fuzzer/qcow2/fuzz.py b/tests/image-fuzzer/qcow2/fuzz.py
new file mode 100644
index 000..a53c84f
--- /dev/null
+++ b/tests/image-fuzzer/qcow2/fuzz.py
@@ -0,0 +1,327 @@
+# Fuzzing functions for qcow2 fields
+#
+# Copyright (C) 2014 Maria Kustova mari...@catit.be
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see http://www.gnu.org/licenses/.
+#
+
+import random
+
+
+UINT8 = 0xff
+UINT32 = 0x
+UINT64 = 0x
+# Most significant bit orders
+UINT32_M = 31
+UINT64_M = 63
+# Fuzz vectors
+UINT8_V = [0, 0x10, UINT8/4, UINT8/2 - 1, UINT8/2, UINT8/2 + 1, UINT8 - 1,
+   UINT8]
+UINT32_V = [0, 0x100, 0x1000, 0x1, 0x10, UINT32/4, UINT32/2 - 1,
+UINT32/2, UINT32/2 + 1, UINT32 - 1, UINT32]
+UINT64_V = UINT32_V + [0x100, 0x1000, 0x1, UINT64/4,
+   UINT64/2 - 1, UINT64/2, UINT64/2 + 1, UINT64 - 1,
+   UINT64]
+STRING_V = ['%s%p%x%d', '.1024d', '%.2049d', '%p%p%p%p', '%x%x%x%x',
+'%d%d%d%d', '%s%s%s%s', '%999s', '%08x', '%%20d', '%%20n',
+'%%20x', '%%20s', '%s%s%s%s%s%s%s%s%s%s', '%p%p%p%p%p%p%p%p%p%p',
+'%#0123456x%08x%x%s%p%d%n%o%u%c%h%l%q%j%z%Z%t%i%e%g%f%a%C%S%08x%%',
+'%s x 129', '%x x 257']
+
+
+def random_from_intervals(intervals):
+Select a random integer number from the list of specified intervals.
+
+Each interval is a tuple of lower and upper limits of the interval. The
+limits are included. Intervals in a list should not overlap.
+
+total = reduce(lambda x, y: x + y[1] - y[0] + 1, intervals, 0)
+r = random.randint(0, total - 1) + intervals[0][0]
+for x in zip(intervals, intervals[1:]):
+r = r + (r  x[0][1]) * (x[1][0] - x[0][1] - 1)
+return r
+
+
+def random_bits(bit_ranges):
+Generate random binary mask with ones in the specified bit ranges.
+
+Each bit_ranges is a list of tuples of lower and upper limits of bit
+positions will be fuzzed. The limits are included. Random amount of bits
+in range limits will be set to ones. The mask is returned in decimal
+integer format.
+
+bit_numbers = []
+# Select random amount of random positions in bit_ranges
+for rng in bit_ranges:
+bit_numbers += random.sample(range(rng[0], rng[1] + 1),
+ random.randint(0, rng[1] - rng[0] + 1))
+val = 0
+# Set bits on selected positions to ones
+for bit in bit_numbers:
+val |= 1  bit
+return val
+
+
+def truncate_string(strings, length):
+Return strings truncated to specified length.
+if type(strings) == list:
+return [s[:length] for s in strings]
+else:
+return strings[:length]
+
+
+def validator(current, pick, choices):
+Return a value not equal to the current selected by the pick
+function from choices.
+
+while True:
+val = pick(choices)
+if not val == current:
+return val
+
+
+def int_validator(current, intervals):
+Return a random value from intervals not equal to the current.
+
+This function is useful for selection from valid values except current one.
+
+return validator(current, random_from_intervals, intervals)
+
+
+def bit_validator(current, bit_ranges):
+Return a random bit mask not equal to the current.
+
+This function is useful for selection from valid values except current one.
+
+return validator(current, random_bits, bit_ranges)
+
+
+def string_validator(current, strings):
+Return a random string value from the list not equal to the current.
+
+This

Re: [Qemu-devel] [edk2] license for binary drivers

Il 06/08/2014 12:34, Laszlo Ersek ha scritto:
 So no, you can't ship an OVMF binary (or source tarball) that contains
 the FAT driver, bundled as part of the GPLv2 (+compatible) QEMU
 distribution, either in source or in binary form.

What Laszlo said is mostly my understanding too (IANAL etc.).

Just one thing: the GPL does allow you to merely aggregate the OVMF
binaries with QEMU sources or QEMU binaries; and a lawyer could also
tell you the if QEMU were to ship OVMF binaries in its tarball, it would
be mere aggregation.  It would then be allowed by the GPL.

I wouldn't disagree; the OVMF binaries are just data as far as QEMU is
concerned.

However, the non-free nature of the OVMF binaries mean that QEMU will
never ever ship OVMF binaries until the license is fixed for the
offending FAT driver.  Not only because we don't want to get into legal
minefields, but also because QEMU is free software and wants to keep its
distributed releases entirely free.

Paolo

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

On Wed, Aug 6, 2014 at 4:50 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 06/08/2014 10:38, Ming Lei ha scritto:
 On Wed, Aug 6, 2014 at 3:45 PM, Paolo Bonzini pbonz...@redhat.com wrote:
 Il 06/08/2014 07:33, Ming Lei ha scritto:
 I played a bit with the following, I hope it's not too naive. I couldn't
 see a difference with your patches, but at least one reason for this is
 probably that my laptop SSD isn't fast enough to make the CPU the
 bottleneck. Haven't tried ramdisk yet, that would probably be the next
 thing. (I actually wrote the patch up just for some profiling on my own,
 not for comparing throughput, but it should be usable for that as well.)
 This might not be good for the test since it is basically a sequential
 read test, which can be optimized a lot by kernel. And I always use
 randread benchmark.

 A microbenchmark already exists in tests/test-coroutine.c, and doesn't
 really tell us much; it's obvious that coroutines execute more code, the
 question is why it affects the iops performance.

 Could you take a look at the coroutine benchmark I worte?  The running
 result shows coroutine does decrease performance a lot compared with
 bypass coroutine like the patchset is doing.

 Your benchmark is synchronous, while disk I/O is asynchronous.

It can be thought as asynchronous too, since it won't sleep like
synchronous I/O.

Basically the IO thread is CPU bound type in case of linux-aio
since both submission and completion won't block CPU mostly,
so my benchmark still fits if we thought the completion as nop.

The current problem is that from single coroutine benchmark,
looks it doesn't hurt performance with stack switch, but in Kevin's
block aio benchmark, bypass coroutine can still obtain observable
improvement.


 Your benchmark doesn't add much compared to time tests/test-coroutine
 -m perf  -p /perf/yield.  It takes 8 seconds on my machine, and 10^8
 function calls obviously take less than 8 seconds.  I've sent a patch to
 add a baseline function call benchmark to test-coroutine.

 The sequential read should be the right workload.  For fio, you want to
 get as many iops as possible to QEMU and so you need randread.  But
 qemu-img is not run in a guest and if the kernel optimizes sequential
 reads then the bypass should have even more benefits because it makes
 userspace proportionally more expensive.

 Do you agree with this?

Yes, I have posted the result of the benchmark, and looks the result
is basically similar with my previous test on dataplane.

Thanks,

[Qemu-devel] [PATCH V2 0/3] image-fuzzer: Support L1/L2 tables in the qcow2 image generator

This patch series adds support of L1/L2 tables to the qcow2 image generator.

This patch series was created for the 'block-next' branch and based on the next
series:
 [PATCH V5 0/5] tests: Add the image fuzzer with qcow2 support.

v1 - v2:
   * Rebased to the new version of the parent patch series
   * Fixed wrong maximum number of L2 tables
   * Fixed missed whitespaces (based on the review of Stefan Hajnoczi)

Maria Kustova (3):
  docs: Expand the list of supported image elements with L1/L2 tables
  fuzz: Add fuzzing functions for L1/L2 table entries
  layout: Add generators of L1/L2 tables

 tests/image-fuzzer/docs/image-fuzzer.txt |   3 +-
 tests/image-fuzzer/qcow2/fuzz.py |  28 
 tests/image-fuzzer/qcow2/layout.py   | 273 ---
 3 files changed, 240 insertions(+), 64 deletions(-)

-- 
1.9.3

[Qemu-devel] [PATCH V2 1/3] docs: Expand the list of supported image elements with L1/L2 tables

Signed-off-by: Maria Kustova mari...@catit.be
---
 tests/image-fuzzer/docs/image-fuzzer.txt | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/tests/image-fuzzer/docs/image-fuzzer.txt 
b/tests/image-fuzzer/docs/image-fuzzer.txt
index efe0ed4..2e8e3b9 100644
--- a/tests/image-fuzzer/docs/image-fuzzer.txt
+++ b/tests/image-fuzzer/docs/image-fuzzer.txt
@@ -125,8 +125,7 @@ If a fuzzer configuration is specified, then it has the 
next interpretation:
 will be always fuzzed for every test. This case is useful for regression
 testing.
 
-For now only header fields and header extensions are generated.
-
+For now only header fields, header extensions and L1/L2 tables are generated.
 
 Module interfaces
 -
-- 
1.9.3

[Qemu-devel] [PATCH V2 3/3] layout: Add generators of L1/L2 tables

Valid L2 entries contain offsets to image clusters filled with random data.
L2 entries have random positions inside L2 tables. L1 entries contain offsets
to generated L2 tables and also have random positions inside the L1 table.
Clusters for L1/L2 tables and guest data are selected randomly.

Signed-off-by: Maria Kustova mari...@catit.be
---
 tests/image-fuzzer/qcow2/layout.py | 273 -
 1 file changed, 211 insertions(+), 62 deletions(-)

diff --git a/tests/image-fuzzer/qcow2/layout.py 
b/tests/image-fuzzer/qcow2/layout.py
index 4c08202..7839d2c 100644
--- a/tests/image-fuzzer/qcow2/layout.py
+++ b/tests/image-fuzzer/qcow2/layout.py
@@ -19,6 +19,8 @@
 import random
 import struct
 import fuzz
+from math import ceil
+from os import urandom
 
 MAX_IMAGE_SIZE = 10 * (1  20)
 # Standard sizes
@@ -102,7 +104,66 @@ class Image(object):
 return (cluster_bits, img_size)
 
 @staticmethod
-def _header(cluster_bits, img_size, backing_file_name=None):
+def _get_available_clusters(used, number):
+Return a set of indices of not allocated clusters.
+
+'used' contains indices of currently allocated clusters.
+All clusters that cannot be allocated between 'used' clusters will have
+indices appended to the end of 'used'.
+
+append_id = max(used) + 1
+free = set(range(1, append_id)) - used
+if len(free) = number:
+return set(random.sample(free, number))
+else:
+return free | set(range(append_id, append_id + number - len(free)))
+
+@staticmethod
+def _get_adjacent_clusters(used, size):
+Return an index of the first cluster in the sequence of free ones.
+
+'used' contains indices of currently allocated clusters. 'size' is the
+length of the sequence of free clusters.
+If the sequence of 'size' is not available between 'used' clusters, its
+first index will be append to the end of 'used'.
+
+def get_cluster_id(lst, length):
+Return the first index of the sequence of the specified length
+or None if the sequence cannot be inserted in the list.
+
+if len(lst) != 0:
+pairs = []
+pair = (lst[0], 1)
+for i in range(1, len(lst)):
+if lst[i] == lst[i-1] + 1:
+pair = (lst[i], pair[1] + 1)
+else:
+pairs.append(pair)
+pair = (lst[i], 1)
+pairs.append(pair)
+random.shuffle(pairs)
+for x, s in pairs:
+if s = length:
+return x - length + 1
+return None
+
+append_id = max(used) + 1
+free = list(set(range(1, append_id)) - used)
+idx = get_cluster_id(free, size)
+if idx is None:
+return append_id
+else:
+return idx
+
+@staticmethod
+def _alloc_data(img_size, cluster_size):
+Return a set of random indices of clusters allocated for guest data.
+
+num_of_cls = img_size/cluster_size
+return set(random.sample(range(1, num_of_cls + 1),
+ random.randint(0, num_of_cls)))
+
+def create_header(self, cluster_bits, backing_file_name=None):
 Generate a random valid header.
 meta_header = [
 ['4s', 0, QFI\xfb, 'magic'],
@@ -110,7 +171,7 @@ class Image(object):
 ['Q', 8, 0, 'backing_file_offset'],
 ['I', 16, 0, 'backing_file_size'],
 ['I', 20, cluster_bits, 'cluster_bits'],
-['Q', 24, img_size, 'size'],
+['Q', 24, self.image_size, 'size'],
 ['I', 32, 0, 'crypt_method'],
 ['I', 36, 0, 'l1_size'],
 ['Q', 40, 0, 'l1_table_offset'],
@@ -126,63 +187,59 @@ class Image(object):
 ['I', 96, 4, 'refcount_order'],
 ['I', 100, 0, 'header_length']
 ]
-v_header = FieldsList(meta_header)
+self.header = FieldsList(meta_header)
 
-if v_header['version'][0].value == 2:
-v_header['header_length'][0].value = 72
+if self.header['version'][0].value == 2:
+self.header['header_length'][0].value = 72
 else:
-v_header['incompatible_features'][0].value = random.getrandbits(2)
-v_header['compatible_features'][0].value = random.getrandbits(1)
-v_header['header_length'][0].value = 104
-
-max_header_len = struct.calcsize(v_header['header_length'][0].fmt) + \
- v_header['header_length'][0].offset
+self.header['incompatible_features'][0].value = \
+random.getrandbits(2)
+self.header['compatible_features'][0].value = random.getrandbits(1)
+

[Qemu-devel] [PATCH V2 2/3] fuzz: Add fuzzing functions for L1/L2 table entries

Signed-off-by: Maria Kustova mari...@catit.be
---
 tests/image-fuzzer/qcow2/fuzz.py | 28 
 1 file changed, 28 insertions(+)

diff --git a/tests/image-fuzzer/qcow2/fuzz.py b/tests/image-fuzzer/qcow2/fuzz.py
index a53c84f..57527f9 100644
--- a/tests/image-fuzzer/qcow2/fuzz.py
+++ b/tests/image-fuzzer/qcow2/fuzz.py
@@ -325,3 +325,31 @@ def feature_name(current):
 truncate_string(STRING_V, 46)  # Fuzz padding (field length = 46)
 ]
 return selector(current, constraints, string_validator)
+
+
+def l1_entry(current):
+Fuzz an entry of the L1 table.
+constraints = UINT64_V
+# Reserved bits are ignored
+# Added a possibility when only flags are fuzzed
+offset = 0x7fff  random.choice([selector(current,
+  constraints),
+ current])
+is_cow = random.randint(0, 1)
+return offset + (is_cow  UINT64_M)
+
+
+def l2_entry(current):
+Fuzz an entry of an L2 table.
+constraints = UINT64_V
+# Reserved bits are ignored
+# Add a possibility when only flags are fuzzed
+offset = 0x3ffe  random.choice([selector(current,
+  constraints),
+ current])
+is_compressed = random.randint(0, 1)
+is_cow = random.randint(0, 1)
+is_zero = random.randint(0, 1)
+value = offset + (is_cow  UINT64_M) + \
+(is_compressed  UINT64_M - 1) + is_zero
+return value
-- 
1.9.3

Re: [Qemu-devel] [PATCH v1 00/17] dataplane: optimization and multi virtqueue support

Am 06.08.2014 um 13:28 hat Ming Lei geschrieben:
 On Wed, Aug 6, 2014 at 6:09 PM, Kevin Wolf kw...@redhat.com wrote:
  Am 06.08.2014 um 11:37 hat Ming Lei geschrieben:
  On Wed, Aug 6, 2014 at 4:48 PM, Kevin Wolf kw...@redhat.com wrote:
   Am 06.08.2014 um 07:33 hat Ming Lei geschrieben:
   Hi Kevin,
  
   On Tue, Aug 5, 2014 at 10:47 PM, Kevin Wolf kw...@redhat.com wrote:
Am 05.08.2014 um 15:48 hat Stefan Hajnoczi geschrieben:
I have been wondering how to prove that the root cause is the 
ucontext
coroutine mechanism (stack switching).  Here is an idea:
   
Hack your bypass code path to run the request inside a coroutine.
That way you can compare bypass without coroutine against bypass 
with
coroutine.
   
Right now I think there are doubts because the bypass code path is
indeed a different (and not 100% correct) code path.  So this 
approach
might prove that the coroutines are adding the overhead and not
something that you bypassed.
   
My doubts aren't only that the overhead might not come from the
coroutines, but also whether any coroutine-related overhead is really
unavoidable. If we can optimise coroutines, I'd strongly prefer to do
just that instead of introducing additional code paths.
  
   OK, thank you for taking look at the problem, and hope we can
   figure out the root cause, :-)
  
   
Another thought I had was this: If the performance difference is 
indeed
only coroutines, then that is completely inside the block layer and we
don't actually need a VM to test it. We could instead have something
like a simple qemu-img based benchmark and should be observing the 
same.
  
   Even it is simpler to run a coroutine-only benchmark, and I just
   wrote a raw one, and looks coroutine does decrease performance
   a lot, please see the attachment patch, and thanks for your template
   to help me add the 'co_bench' command in qemu-img.
  
   Yes, we can look at coroutines microbenchmarks in isolation. I actually
   did do that yesterday with the yield test from tests/test-coroutine.c.
   And in fact profiling immediately showed something to optimise:
   pthread_getspecific() was quite high, replacing it by __thread on
   systems where it works is more efficient and helped the numbers a bit.
   Also, a lot of time seems to be spent in pthread_mutex_lock/unlock (even
   in qemu-img bench), maybe there's even something that can be done here.
 
  The lock/unlock in dataplane is often from memory_region_find(), and Paolo
  should have done lots of work on that.

qemu-img bench doesn't run that code. We have a few more locks that are
taken, and one of them (the coroutine pool lock) is avoided by your
bypass patches.

  
   However, I just wasn't sure whether a change on this level would be
   relevant in a realistic environment. This is the reason why I wanted to
   get a benchmark involving the block layer and some I/O.
  
   From the profiling data in below link:
  
   http://pastebin.com/YwH2uwbq
  
   With coroutine, the running time for same loading is increased
   ~50%(1.325s vs. 0.903s), and dcache load events is increased
   ~35%(693M vs. 512M), insns per cycle is decreased by ~50%(
   1.35 vs. 1.63), compared with bypassing coroutine(-b parameter).
  
   The bypass code in the benchmark is very similar with the approach
   used in the bypass patch, since linux-aio with O_DIRECT seldom
   blocks in the the kernel I/O path.
  
   Maybe the benchmark is a bit extremely, but given modern storage
   device may reach millions of IOPS, and it is very easy to slow down
   the I/O by coroutine.
  
   I think in order to optimise coroutines, such benchmarks are fair game.
   It's just not guaranteed that the effects are exactly the same on real
   workloads, so we should take the results with a grain of salt.
  
   Anyhow, the coroutine version of your benchmark is buggy, it leaks all
   coroutines instead of exiting them, so it can't make any use of the
   coroutine pool. On my laptop, I get this (where fixed coroutine is a
   version that simply removes the yield at the end):
  
   | bypass| fixed coro| buggy coro
   +---+---+--
   time| 1.09s | 1.10s | 1.62s
   L1-dcache-loads | 921,836,360   | 932,781,747   | 1,298,067,438
   insns per cycle | 2.39  | 2.39  | 1.90
  
   Begs the question whether you see a similar effect on a real qemu and
   the coroutine pool is still not big enough? With correct use of
   coroutines, the difference seems to be barely measurable even without
   any I/O involved.
 
  When I comment qemu_coroutine_yield(), looks result of
  bypass and fixed coro is very similar as your test, and I am just
  wondering if stack is always switched in qemu_coroutine_enter()
  without calling qemu_coroutine_yield().
 
  Yes, definitely. qemu_coroutine_enter() always involves

[Qemu-devel] [PATCH] layout: Reduce number of generator functions in init

Some issues can be found only when a fuzzed image has a partial structure,
e.g. has L1/L2 tables but no refcount ones. Generation of an entirely
defined image limits these cases. Now the Image constructor creates only
a header and a backing file name (if any), other image elements are generated
in the 'create_image' API.

This patch series was created for the 'block-next' branch and based on the next
series:
[PATCH V2 0/3] image-fuzzer: Support L1/L2 tables in the qcow2 image
 generator


Signed-off-by: Maria Kustova mari...@catit.be
---
 tests/image-fuzzer/qcow2/layout.py | 92 --
 1 file changed, 49 insertions(+), 43 deletions(-)

diff --git a/tests/image-fuzzer/qcow2/layout.py 
b/tests/image-fuzzer/qcow2/layout.py
index 7839d2c..98673c4 100644
--- a/tests/image-fuzzer/qcow2/layout.py
+++ b/tests/image-fuzzer/qcow2/layout.py
@@ -156,6 +156,14 @@ class Image(object):
 return idx
 
 @staticmethod
+def _get_cluster_ids(fields_list, cluster_size):
+Return indices of clusters allocated by fields of fields_list.
+ids = set()
+for x in fields_list:
+ids.add(x.offset/cluster_size)
+return ids
+
+@staticmethod
 def _alloc_data(img_size, cluster_size):
 Return a set of random indices of clusters allocated for guest data.
 
@@ -196,12 +204,12 @@ class Image(object):
 random.getrandbits(2)
 self.header['compatible_features'][0].value = random.getrandbits(1)
 self.header['header_length'][0].value = 104
-
-max_header_len = struct.calcsize(
+# Extensions starts at the header last field offset and the field size
+self.ext_offset = struct.calcsize(
 self.header['header_length'][0].fmt) + \
 self.header['header_length'][0].offset
 end_of_extension_area_len = 2 * UINT32_S
-free_space = self.cluster_size - max_header_len - \
+free_space = self.cluster_size - self.ext_offset - \
  end_of_extension_area_len
 # If the backing file name specified and there is enough space for it
 # in the first cluster, then it's placed in the very end of the first
@@ -224,24 +232,18 @@ class Image(object):
 [data_fmt, self.header['backing_file_offset'][0].value,
  backing_file_name, 'bf_name']
 ])
-else:
-self.backing_file_name = FieldsList()
 
 def set_backing_file_format(self, backing_file_fmt=None):
 Generate the header extension for the backing file
 format.
 
-self.backing_file_format = FieldsList()
-offset = struct.calcsize(self.header['header_length'][0].fmt) + \
- self.header['header_length'][0].offset
-
 if backing_file_fmt is not None:
 # Calculation of the free space available in the first cluster
 end_of_extension_area_len = 2 * UINT32_S
 high_border = (self.header['backing_file_offset'][0].value or
(self.cluster_size - 1)) - \
 end_of_extension_area_len
-free_space = high_border - offset
+free_space = high_border - self.ext_offset
 ext_size = 2 * UINT32_S + ((len(backing_file_fmt) + 7)  ~7)
 
 if free_space = ext_size:
@@ -249,18 +251,19 @@ class Image(object):
 ext_data_fmt = '' + str(ext_data_len) + 's'
 ext_padding_len = 7 - (ext_data_len - 1) % 8
 self.backing_file_format = FieldsList([
-['I', offset, 0xE2792ACA, 'ext_magic'],
-['I', offset + UINT32_S, ext_data_len, 'ext_length'],
-[ext_data_fmt, offset + UINT32_S * 2, backing_file_fmt,
- 'bf_format']
+['I', self.ext_offset, 0xE2792ACA, 'ext_magic'],
+['I', self.ext_offset + UINT32_S, ext_data_len,
+ 'ext_length'],
+[ext_data_fmt, self.ext_offset + UINT32_S * 2,
+ backing_file_fmt, 'bf_format']
 ])
-offset = self.backing_file_format['bf_format'][0].offset + \
- struct.calcsize(self.backing_file_format[
- 'bf_format'][0].fmt) + ext_padding_len
-
-return offset
+self.ext_offset = \
+struct.calcsize(
+self.backing_file_format['bf_format'][0].fmt) + \
+ext_padding_len + \
+self.backing_file_format['bf_format'][0].offset
 
-def create_feature_name_table(self, offset):
+def create_feature_name_table(self):
 Generate a random header extension for names of features used in
 the image.
 
@@ -272,7 +275,7 @@ class Image(object):
 high_border =

Re: [Qemu-devel] [PATCH v2 1/2] hw/misc/arm_sp810: Create SP810 device

2014-08-06 Thread Aggeler Fabian


On 06 Aug 2014, at 01:03, Peter Crosthwaite peter.crosthwa...@xilinx.com 
wrote:

 On Tue, Aug 5, 2014 at 7:32 PM, Fabian Aggeler aggel...@ethz.ch wrote:
 This adds a device model for the PrimeXsys System Controller (SP810)
 which is present in the Versatile Express motherboards. It is
 so far read-only but allows to set the SCCTRL register.
 
 Signed-off-by: Fabian Aggeler aggel...@ethz.ch
 ---
 default-configs/arm-softmmu.mak |  1 +
 hw/misc/Makefile.objs   |  1 +
 hw/misc/arm_sp810.c | 98 
 +
 include/hw/misc/arm_sp810.h | 85 +++
 4 files changed, 185 insertions(+)
 create mode 100644 hw/misc/arm_sp810.c
 create mode 100644 include/hw/misc/arm_sp810.h
 
 diff --git a/default-configs/arm-softmmu.mak 
 b/default-configs/arm-softmmu.mak
 index f3513fa..67ba99a 100644
 --- a/default-configs/arm-softmmu.mak
 +++ b/default-configs/arm-softmmu.mak
 @@ -55,6 +55,7 @@ CONFIG_PL181=y
 CONFIG_PL190=y
 CONFIG_PL310=y
 CONFIG_PL330=y
 +CONFIG_SP810=y
 CONFIG_CADENCE=y
 CONFIG_XGMAC=y
 CONFIG_EXYNOS4=y
 diff --git a/hw/misc/Makefile.objs b/hw/misc/Makefile.objs
 index 979e532..49d023b 100644
 --- a/hw/misc/Makefile.objs
 +++ b/hw/misc/Makefile.objs
 @@ -13,6 +13,7 @@ common-obj-$(CONFIG_PL310) += arm_l2x0.o
 common-obj-$(CONFIG_INTEGRATOR_DEBUG) += arm_integrator_debug.o
 common-obj-$(CONFIG_A9SCU) += a9scu.o
 common-obj-$(CONFIG_ARM11SCU) += arm11scu.o
 +common-obj-$(CONFIG_SP810) += arm_sp810.o
 
 # PKUnity SoC devices
 common-obj-$(CONFIG_PUV3) += puv3_pm.o
 diff --git a/hw/misc/arm_sp810.c b/hw/misc/arm_sp810.c
 new file mode 100644
 index 000..21eb816
 --- /dev/null
 +++ b/hw/misc/arm_sp810.c
 @@ -0,0 +1,98 @@
 +/*
 + * ARM PrimeXsys System Controller (SP810)
 + *
 + * Copyright (C) 2014 Fabian Aggeler aggel...@ethz.ch
 + *
 + * This program is free software; you can redistribute it and/or modify
 + * it under the terms of the GNU General Public License as published by
 + * the Free Software Foundation; either version 2 of the License, or
 + * (at your option) any later version.
 + *
 + * This program is distributed in the hope that it will be useful,
 + * but WITHOUT ANY WARRANTY; without even the implied warranty of
 + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
 + * GNU General Public License for more details.
 + *
 + */
 +
 +#include hw/misc/arm_sp810.h
 +
 +static const VMStateDescription vmstate_arm_sysctl = {
 +.name = arm_sp810,
 +.version_id = 1,
 +.minimum_version_id = 1,
 +.fields = (VMStateField[]) {
 +VMSTATE_UINT32(sc_ctrl, arm_sp810_state),
 +VMSTATE_END_OF_LIST()
 +}
 +};
 +
 +static uint64_t arm_sp810_read(void *opaque, hwaddr offset, unsigned size)
 +{
 +arm_sp810_state *s = opaque;
 +
 +switch (offset) {
 +case SCCTRL:
 +return s-sc_ctrl;
 +default:
 +qemu_log_mask(LOG_UNIMP,
 +arm_sp810_read: Register not yet implemented: %s\n,
 +HWADDR_PRIx);
 +return 0;
 +}
 +}
 +
 +static void arm_sp810_write(void *opaque, hwaddr offset,
 +uint64_t val, unsigned size)
 +{
 +switch (offset) {
 +default:
 +qemu_log_mask(LOG_UNIMP,
 +arm_sp810_write: Register not yet implemented: %s\n,
 +HWADDR_PRIx);
 +return;
 +}
 +}
 +
 +static const MemoryRegionOps arm_sp810_ops = {
 +.read = arm_sp810_read,
 +.write = arm_sp810_write,
 +.endianness = DEVICE_NATIVE_ENDIAN,
 +};
 +
 +static void arm_sp810_init(Object *obj)
 +{
 +arm_sp810_state *s = ARM_SP810(obj);
 +
 +memory_region_init_io(s-iomem, obj, arm_sp810_ops, s, arm-sp810,
 +  0x1000);
 +sysbus_init_mmio(SYS_BUS_DEVICE(obj), s-iomem);
 +}
 +
 +static Property arm_sp810_properties[] = {
 +DEFINE_PROP_UINT32(sc-ctrl, arm_sp810_state, sc_ctrl, 0x09),
 +DEFINE_PROP_END_OF_LIST(),
 +};
 +
 
 So it looks to me like the SCCTRL contains multiple register fields
 for multiple configurable options. Although i'm having trouble getting
 my head around it as I could not easily find public docco for this
 core (have a link handy?).

Unfortunately not as it seems ARM marked it as obsolete. We found 
the document offline. I could not find it on the web either. 
See also 
http://lists.infradead.org/pipermail/linux-arm-kernel/2011-January/038722.html

Indeed SCCTRL has many configuration bits, not just these 4 configuration 
options. 

 
 Anyways, the thinking is you should define multiple configurable
 options as invididual QOM properties, rather than have board level
 code init registers. This would mean that you
 
 DEFINE_PROP_UINT8(timeren0-sel, ...)
 DEFINE_PROP_UINT8(timeren1-sel, ...)
 
 Have a look at hw/dma/pl330.c for an example of invidividual configs
 getting packed into a single sw visible register.

Okay, I will have a look at pl330 then.

 
 However these look like software configurable

[Qemu-devel] [Bug 1285708] Re: FreeBSD Guest crash on boot due to xsave instruction issue

2014-08-06 Thread Chris J Arges

** Changed in: linux (Ubuntu Precise)
   Status: New = Won't Fix

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1285708

Title:
  FreeBSD Guest crash on boot due to xsave instruction issue

Status in QEMU:
  New
Status in qemu-kvm:
  New
Status in “linux” package in Ubuntu:
  Fix Released
Status in “linux” source package in Precise:
  Won't Fix
Status in “linux” source package in Trusty:
  In Progress

Bug description:
  When trying to boot a working FreeBSD 9.1/9.2 guest on a kvm/qemu host
  with the following command:

  kvm -m 256 -cdrom FreeBSD-9.2-RELEASE-amd64-disc1.iso -drive
  file=FreeBSD-9.2-RELEASE-amd64.qcow2,if=virtio -net nic,model=virtio
  -net user -nographic -vnc :10 -enable-kvm -balloon virtio -cpu
  core2duo,+xsave

  The FreeBSD Guest will kernel crash on boot with the following error:

  panic: CPU0 does not support X87 or SSE: 0

  When launching the guest without the cpu flags, it works just fine.

  This bug has been resolved in source:
  https://lkml.org/lkml/2014/2/22/58

  Can this fix be included in Precise ASAP!

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1285708/+subscriptions

[Qemu-devel] [PULL 02/11] icount: put icount variables into TimerState.

From: KONRAD Frederic fred.kon...@greensocs.com

This puts qemu_icount and qemu_icount_bias into TimerState structure to allow
them to be migrated.

Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com
Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c | 29 -
 1 file changed, 16 insertions(+), 13 deletions(-)

diff --git a/cpus.c b/cpus.c
index 5e7f2cf..127de1c 100644
--- a/cpus.c
+++ b/cpus.c
@@ -102,17 +102,12 @@ static bool all_cpu_threads_idle(void)
 
 /* Protected by TimersState seqlock */
 
-/* Compensate for varying guest execution speed.  */
-static int64_t qemu_icount_bias;
 static int64_t vm_clock_warp_start;
 /* Conversion factor from emulated instructions to virtual clock ticks.  */
 static int icount_time_shift;
 /* Arbitrarily pick 1MIPS as the minimum allowable speed.  */
 #define MAX_ICOUNT_SHIFT 10
 
-/* Only written by TCG thread */
-static int64_t qemu_icount;
-
 static QEMUTimer *icount_rt_timer;
 static QEMUTimer *icount_vm_timer;
 static QEMUTimer *icount_warp_timer;
@@ -129,6 +124,11 @@ typedef struct TimersState {
 int64_t cpu_clock_offset;
 int32_t cpu_ticks_enabled;
 int64_t dummy;
+
+/* Compensate for varying guest execution speed.  */
+int64_t qemu_icount_bias;
+/* Only written by TCG thread */
+int64_t qemu_icount;
 } TimersState;
 
 static TimersState timers_state;
@@ -139,14 +139,14 @@ static int64_t cpu_get_icount_locked(void)
 int64_t icount;
 CPUState *cpu = current_cpu;
 
-icount = qemu_icount;
+icount = timers_state.qemu_icount;
 if (cpu) {
 if (!cpu_can_do_io(cpu)) {
 fprintf(stderr, Bad clock read\n);
 }
 icount -= (cpu-icount_decr.u16.low + cpu-icount_extra);
 }
-return qemu_icount_bias + (icount  icount_time_shift);
+return timers_state.qemu_icount_bias + (icount  icount_time_shift);
 }
 
 int64_t cpu_get_icount(void)
@@ -284,7 +284,8 @@ static void icount_adjust(void)
 icount_time_shift++;
 }
 last_delta = delta;
-qemu_icount_bias = cur_icount - (qemu_icount  icount_time_shift);
+timers_state.qemu_icount_bias = cur_icount
+  - (timers_state.qemu_icount  
icount_time_shift);
 seqlock_write_unlock(timers_state.vm_clock_seqlock);
 }
 
@@ -333,7 +334,7 @@ static void icount_warp_rt(void *opaque)
 int64_t delta = cur_time - cur_icount;
 warp_delta = MIN(warp_delta, delta);
 }
-qemu_icount_bias += warp_delta;
+timers_state.qemu_icount_bias += warp_delta;
 }
 vm_clock_warp_start = -1;
 seqlock_write_unlock(timers_state.vm_clock_seqlock);
@@ -351,7 +352,7 @@ void qtest_clock_warp(int64_t dest)
 int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
 int64_t warp = qemu_soonest_timeout(dest - clock, deadline);
 seqlock_write_lock(timers_state.vm_clock_seqlock);
-qemu_icount_bias += warp;
+timers_state.qemu_icount_bias += warp;
 seqlock_write_unlock(timers_state.vm_clock_seqlock);
 
 qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
@@ -1250,7 +1251,8 @@ static int tcg_cpu_exec(CPUArchState *env)
 int64_t count;
 int64_t deadline;
 int decr;
-qemu_icount -= (cpu-icount_decr.u16.low + cpu-icount_extra);
+timers_state.qemu_icount -= (cpu-icount_decr.u16.low
++ cpu-icount_extra);
 cpu-icount_decr.u16.low = 0;
 cpu-icount_extra = 0;
 deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
@@ -1265,7 +1267,7 @@ static int tcg_cpu_exec(CPUArchState *env)
 }
 
 count = qemu_icount_round(deadline);
-qemu_icount += count;
+timers_state.qemu_icount += count;
 decr = (count  0x) ? 0x : count;
 count -= decr;
 cpu-icount_decr.u16.low = decr;
@@ -1278,7 +1280,8 @@ static int tcg_cpu_exec(CPUArchState *env)
 if (use_icount) {
 /* Fold pending instructions back into the
instruction counter, and clear the interrupt flag.  */
-qemu_icount -= (cpu-icount_decr.u16.low + cpu-icount_extra);
+timers_state.qemu_icount -= (cpu-icount_decr.u16.low
++ cpu-icount_extra);
 cpu-icount_decr.u32 = 0;
 cpu-icount_extra = 0;
 }
-- 
1.9.3

[Qemu-devel] [PULL 00/11] KVM, icount changes for 2014-08-06

The following changes since commit 41a1a9c42c4e0fb5f1b94aa8b72e42f66ebde3d9:

  po: Update German translation (2014-07-28 23:37:17 +0200)

are available in the git repository at:

  git://github.com/bonzini/qemu.git tags/for-upstream

for you to fetch changes up to 92627748f7f7355bb2ea676a45791bd66b84aee0:

  target-mips: Ignore unassigned accesses with KVM (2014-08-06 17:53:07 +0200)


KVM changes include a MIPS patch and the testdev backend used by the
ARM kvm-unit-tests.  icount include the first part of reverse execution
and Sebastian Tanase's patches to slow down -icount execution to the
desired speed of the target.


James Hogan (1):
  target-mips: Ignore unassigned accesses with KVM

KONRAD Frederic (3):
  icount: put icount variables into TimerState.
  migration: migrate icount fields.
  timer: add cpu_icount_to_ns function.

Paolo Bonzini (1):
  backends: Introduce chr-testdev

Sebastian Tanase (6):
  icount: Fix virtual clock start value on ARM
  icount: Add QemuOpts for icount
  icount: Add align option to icount
  cpu-exec: Add sleeping algorithm
  cpu-exec: Print to console if the guest is late
  monitor: Add drift info to 'info jit'

 backends/Makefile.objs  |   2 +-
 backends/testdev.c  | 131 
 cpu-exec.c  | 116 ++
 cpus.c  | 114 ++---
 include/qemu-common.h   |   8 ++-
 include/qemu/timer.h|   2 +
 include/sysemu/char.h   |   3 ++
 monitor.c   |   1 +
 qapi-schema.json|   3 +-
 qemu-char.c |   4 ++
 qemu-options.hx |  17 +--
 qtest.c |  13 -
 stubs/Makefile.objs |   1 +
 stubs/chr-testdev.c |   7 +++
 target-mips/op_helper.c |  11 
 vl.c|  39 +++---
 16 files changed, 440 insertions(+), 32 deletions(-)
 create mode 100644 backends/testdev.c
 create mode 100644 stubs/chr-testdev.c
-- 
1.9.3

[Qemu-devel] [PULL 03/11] migration: migrate icount fields.

From: KONRAD Frederic fred.kon...@greensocs.com

This fixes a bug where qemu_icount and qemu_icount_bias are not migrated.
It adds a subsection timer/icount to vmstate_timers so icount is migrated only
when needed.

Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com
Reviewed-by: Amit Shah amit.s...@redhat.com
Reviewed-by: Juan Quintela quint...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c | 27 +++
 1 file changed, 27 insertions(+)

diff --git a/cpus.c b/cpus.c
index 127de1c..a6b6557 100644
--- a/cpus.c
+++ b/cpus.c
@@ -429,6 +429,25 @@ void qemu_clock_warp(QEMUClockType type)
 }
 }
 
+static bool icount_state_needed(void *opaque)
+{
+return use_icount;
+}
+
+/*
+ * This is a subsection for icount migration.
+ */
+static const VMStateDescription icount_vmstate_timers = {
+.name = timer/icount,
+.version_id = 1,
+.minimum_version_id = 1,
+.fields = (VMStateField[]) {
+VMSTATE_INT64(qemu_icount_bias, TimersState),
+VMSTATE_INT64(qemu_icount, TimersState),
+VMSTATE_END_OF_LIST()
+}
+};
+
 static const VMStateDescription vmstate_timers = {
 .name = timer,
 .version_id = 2,
@@ -438,6 +457,14 @@ static const VMStateDescription vmstate_timers = {
 VMSTATE_INT64(dummy, TimersState),
 VMSTATE_INT64_V(cpu_clock_offset, TimersState, 2),
 VMSTATE_END_OF_LIST()
+},
+.subsections = (VMStateSubsection[]) {
+{
+.vmsd = icount_vmstate_timers,
+.needed = icount_state_needed,
+}, {
+/* empty */
+}
 }
 };
 
-- 
1.9.3

[Qemu-devel] [PULL 08/11] cpu-exec: Add sleeping algorithm

From: Sebastian Tanase sebastian.tan...@openwide.fr

The goal is to sleep qemu whenever the guest clock
is in advance compared to the host clock (we use
the monotonic clocks). The amount of time to sleep
is calculated in the execution loop in cpu_exec.

At first, we tried to approximate at each for loop the real time elapsed
while searching for a TB (generating or retrieving from cache) and
executing it. We would then approximate the virtual time corresponding
to the number of virtual instructions executed. The difference between
these 2 values would allow us to know if the guest is in advance or delayed.
However, the function used for measuring the real time
(qemu_clock_get_ns(QEMU_CLOCK_REALTIME)) proved to be very expensive.
We had an added overhead of 13% of the total run time.

Therefore, we modified the algorithm and only take into account the
difference between the 2 clocks at the begining of the cpu_exec function.
During the for loop we try to reduce the advance of the guest only by
computing the virtual time elapsed and sleeping if necessary. The overhead
is thus reduced to 3%. Even though this method still has a noticeable
overhead, it no longer is a bottleneck in trying to achieve a better
guest frequency for which the guest clock is faster than the host one.

As for the the alignement of the 2 clocks, with the first algorithm
the guest clock was oscillating between -1 and 1ms compared to the host clock.
Using the second algorithm we notice that the guest is 5ms behind the host, 
which
is still acceptable for our use case.

The tests where conducted using fio and stress. The host machine in an i5 CPU at
3.10GHz running Debian Jessie (kernel 3.12). The guest machine is an arm 
versatile-pb
built with buildroot.

Currently, on our test machine, the lowest icount we can achieve that is 
suitable for
aligning the 2 clocks is 6. However, we observe that the IO tests (using fio) 
are
slower than the cpu tests (using stress).

Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr
Tested-by: Camille Bégué camille.be...@openwide.fr
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpu-exec.c   | 79 
 cpus.c   | 17 +++
 include/qemu/timer.h |  1 +
 3 files changed, 97 insertions(+)

diff --git a/cpu-exec.c b/cpu-exec.c
index 38e5f02..68f82b6 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -22,6 +22,72 @@
 #include tcg.h
 #include qemu/atomic.h
 #include sysemu/qtest.h
+#include qemu/timer.h
+
+/* -icount align implementation. */
+
+typedef struct SyncClocks {
+int64_t diff_clk;
+int64_t last_cpu_icount;
+} SyncClocks;
+
+#if !defined(CONFIG_USER_ONLY)
+/* Allow the guest to have a max 3ms advance.
+ * The difference between the 2 clocks could therefore
+ * oscillate around 0.
+ */
+#define VM_CLOCK_ADVANCE 300
+
+static void align_clocks(SyncClocks *sc, const CPUState *cpu)
+{
+int64_t cpu_icount;
+
+if (!icount_align_option) {
+return;
+}
+
+cpu_icount = cpu-icount_extra + cpu-icount_decr.u16.low;
+sc-diff_clk += cpu_icount_to_ns(sc-last_cpu_icount - cpu_icount);
+sc-last_cpu_icount = cpu_icount;
+
+if (sc-diff_clk  VM_CLOCK_ADVANCE) {
+#ifndef _WIN32
+struct timespec sleep_delay, rem_delay;
+sleep_delay.tv_sec = sc-diff_clk / 10LL;
+sleep_delay.tv_nsec = sc-diff_clk % 10LL;
+if (nanosleep(sleep_delay, rem_delay)  0) {
+sc-diff_clk -= (sleep_delay.tv_sec - rem_delay.tv_sec) * 
10LL;
+sc-diff_clk -= sleep_delay.tv_nsec - rem_delay.tv_nsec;
+} else {
+sc-diff_clk = 0;
+}
+#else
+Sleep(sc-diff_clk / SCALE_MS);
+sc-diff_clk = 0;
+#endif
+}
+}
+
+static void init_delay_params(SyncClocks *sc,
+  const CPUState *cpu)
+{
+if (!icount_align_option) {
+return;
+}
+sc-diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
+   qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
+   cpu_get_clock_offset();
+sc-last_cpu_icount = cpu-icount_extra + cpu-icount_decr.u16.low;
+}
+#else
+static void align_clocks(SyncClocks *sc, const CPUState *cpu)
+{
+}
+
+static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
+{
+}
+#endif /* CONFIG USER ONLY */
 
 void cpu_loop_exit(CPUState *cpu)
 {
@@ -227,6 +293,8 @@ int cpu_exec(CPUArchState *env)
 TranslationBlock *tb;
 uint8_t *tc_ptr;
 uintptr_t next_tb;
+SyncClocks sc;
+
 /* This must be volatile so it is not trashed by longjmp() */
 volatile bool have_tb_lock = false;
 
@@ -283,6 +351,13 @@ int cpu_exec(CPUArchState *env)
 #endif
 cpu-exception_index = -1;
 
+/* Calculate difference between guest clock and host clock.
+ * This delay includes the delay of the last cycle, so
+ * what we have to do is sleep until it is 0. As for the
+ * advance/delay we gain here, we try to fix it next time.
+

[Qemu-devel] [PULL 05/11] icount: Fix virtual clock start value on ARM

From: Sebastian Tanase sebastian.tan...@openwide.fr

When using the icount option on ARM, the virtual
clock starts counting at realtime clock but it
should start at 0.

The reason why the virtual clock starts at realtime clock
is because the first time we call qemu_clock_warp (which
calls icount_warp_rt) in tcg_exec_all, qemu_icount_bias
(which is part of the virtual time computation mechanism)
will increment by realtime - vm_clock_warp_start, with
vm_clock_warp_start being 0 (see icount_warp_rt in cpus.c).

By changing the value of vm_clock_warp_start from 0 to -1,
the first time we call qemu_clock_warp which calls
icount_warp_rt, we will return immediatly because
icount_warp_rt first checks if vm_clock_warp_start is -1
and if it's the case it returns. Therefore, qemu_icount_bias
will first be incremented by the value of a virtual timer
deadline when the virtual cpu goes from active to inactive.

The virtual time will start at 0 and increment based
on the instruction counter when the vcpu is active or
the qemu_icount_bias value when inactive.

Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index 62636a6..bbb8d4e 100644
--- a/cpus.c
+++ b/cpus.c
@@ -102,7 +102,7 @@ static bool all_cpu_threads_idle(void)
 
 /* Protected by TimersState seqlock */
 
-static int64_t vm_clock_warp_start;
+static int64_t vm_clock_warp_start = -1;
 /* Conversion factor from emulated instructions to virtual clock ticks.  */
 static int icount_time_shift;
 /* Arbitrarily pick 1MIPS as the minimum allowable speed.  */
-- 
1.9.3

[Qemu-devel] [PULL 04/11] timer: add cpu_icount_to_ns function.

From: KONRAD Frederic fred.kon...@greensocs.com

This adds cpu_icount_to_ns function which is needed for reverse execution.

It returns the time for a specific instruction.

Signed-off-by: KONRAD Frederic fred.kon...@greensocs.com
Reviewed-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c   | 7 ++-
 include/qemu/timer.h | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/cpus.c b/cpus.c
index a6b6557..62636a6 100644
--- a/cpus.c
+++ b/cpus.c
@@ -146,7 +146,7 @@ static int64_t cpu_get_icount_locked(void)
 }
 icount -= (cpu-icount_decr.u16.low + cpu-icount_extra);
 }
-return timers_state.qemu_icount_bias + (icount  icount_time_shift);
+return timers_state.qemu_icount_bias + cpu_icount_to_ns(icount);
 }
 
 int64_t cpu_get_icount(void)
@@ -162,6 +162,11 @@ int64_t cpu_get_icount(void)
 return icount;
 }
 
+int64_t cpu_icount_to_ns(int64_t icount)
+{
+return icount  icount_time_shift;
+}
+
 /* return the host CPU cycle counter and handle stop/restart */
 /* Caller must hold the BQL */
 int64_t cpu_get_ticks(void)
diff --git a/include/qemu/timer.h b/include/qemu/timer.h
index 7f9a074..e12c714 100644
--- a/include/qemu/timer.h
+++ b/include/qemu/timer.h
@@ -745,6 +745,7 @@ static inline int64_t get_clock(void)
 /* icount */
 int64_t cpu_get_icount(void);
 int64_t cpu_get_clock(void);
+int64_t cpu_icount_to_ns(int64_t icount);
 
 /***/
 /* host CPU ticks (if available) */
-- 
1.9.3

[Qemu-devel] [PULL 06/11] icount: Add QemuOpts for icount

From: Sebastian Tanase sebastian.tan...@openwide.fr

Make icount parameter use QemuOpts style options in order
to easily add other suboptions.

Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr
Tested-by: Camille Bégué camille.be...@openwide.fr
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c| 10 +-
 include/qemu-common.h |  3 ++-
 qemu-options.hx   |  4 ++--
 qtest.c   | 13 +++--
 vl.c  | 35 ---
 5 files changed, 52 insertions(+), 13 deletions(-)

diff --git a/cpus.c b/cpus.c
index bbb8d4e..8291044 100644
--- a/cpus.c
+++ b/cpus.c
@@ -473,13 +473,21 @@ static const VMStateDescription vmstate_timers = {
 }
 };
 
-void configure_icount(const char *option)
+void configure_icount(QemuOpts *opts, Error **errp)
 {
+const char *option;
+
 seqlock_init(timers_state.vm_clock_seqlock, NULL);
 vmstate_register(NULL, 0, vmstate_timers, timers_state);
+option = qemu_opt_get(opts, shift);
 if (!option) {
 return;
 }
+/* When using -icount shift, the shift option will be
+   misinterpreted as a boolean */
+if (strcmp(option, on) == 0 || strcmp(option, off) == 0) {
+error_setg(errp, The shift option must be a number or auto);
+}
 
 icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
   icount_warp_rt, NULL);
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 6ef8282..04b0769 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -41,6 +41,7 @@
 #include assert.h
 #include signal.h
 #include glib-compat.h
+#include qemu/option.h
 
 #ifdef _WIN32
 #include sysemu/os-win32.h
@@ -105,7 +106,7 @@ static inline char *realpath(const char *path, char 
*resolved_path)
 #endif
 
 /* icount */
-void configure_icount(const char *option);
+void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
 
 #include qemu/osdep.h
diff --git a/qemu-options.hx b/qemu-options.hx
index 1549625..5a1b001 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3011,11 +3011,11 @@ re-inject them.
 ETEXI
 
 DEF(icount, HAS_ARG, QEMU_OPTION_icount, \
--icount [N|auto]\n \
+-icount [shift=N|auto]\n \
 enable virtual instruction counter with 2^N clock ticks 
per\n \
 instruction\n, QEMU_ARCH_ALL)
 STEXI
-@item -icount [@var{N}|auto]
+@item -icount [shift=@var{N}|auto]
 @findex -icount
 Enable virtual instruction counter.  The virtual cpu will execute one
 instruction every 2^@var{N} ns of virtual time.  If @code{auto} is specified
diff --git a/qtest.c b/qtest.c
index 04a6dc1..ef0d991 100644
--- a/qtest.c
+++ b/qtest.c
@@ -19,6 +19,9 @@
 #include hw/irq.h
 #include sysemu/sysemu.h
 #include sysemu/cpus.h
+#include qemu/config-file.h
+#include qemu/option.h
+#include qemu/error-report.h
 
 #define MAX_IRQ 256
 
@@ -509,10 +512,16 @@ static void qtest_event(void *opaque, int event)
 }
 }
 
-int qtest_init_accel(MachineClass *mc)
+static void configure_qtest_icount(const char *options)
 {
-configure_icount(0);
+QemuOpts *opts  = qemu_opts_parse(qemu_find_opts(icount), options, 1);
+configure_icount(opts, error_abort);
+qemu_opts_del(opts);
+}
 
+int qtest_init_accel(MachineClass *mc)
+{
+configure_qtest_icount(0);
 return 0;
 }
 
diff --git a/vl.c b/vl.c
index fe451aa..f2621a5 100644
--- a/vl.c
+++ b/vl.c
@@ -537,6 +537,20 @@ static QemuOptsList qemu_mem_opts = {
 },
 };
 
+static QemuOptsList qemu_icount_opts = {
+.name = icount,
+.implied_opt_name = shift,
+.merge_lists = true,
+.head = QTAILQ_HEAD_INITIALIZER(qemu_icount_opts.head),
+.desc = {
+{
+.name = shift,
+.type = QEMU_OPT_STRING,
+},
+{ /* end of list */ }
+},
+};
+
 /**
  * Get machine options
  *
@@ -2908,13 +2922,12 @@ int main(int argc, char **argv, char **envp)
 {
 int i;
 int snapshot, linux_boot;
-const char *icount_option = NULL;
 const char *initrd_filename;
 const char *kernel_filename, *kernel_cmdline;
 const char *boot_order;
 DisplayState *ds;
 int cyls, heads, secs, translation;
-QemuOpts *hda_opts = NULL, *opts, *machine_opts;
+QemuOpts *hda_opts = NULL, *opts, *machine_opts, *icount_opts = NULL;
 QemuOptsList *olist;
 int optind;
 const char *optarg;
@@ -2979,6 +2992,7 @@ int main(int argc, char **argv, char **envp)
 qemu_add_opts(qemu_msg_opts);
 qemu_add_opts(qemu_name_opts);
 qemu_add_opts(qemu_numa_opts);
+qemu_add_opts(qemu_icount_opts);
 
 runstate_init();
 
@@ -3830,7 +3844,11 @@ int main(int argc, char **argv, char **envp)
 }
 break;
 case QEMU_OPTION_icount:
-icount_option = optarg;
+icount_opts = qemu_opts_parse(qemu_find_opts(icount),
+  optarg, 1);
+if

[Qemu-devel] [PULL 09/11] cpu-exec: Print to console if the guest is late

From: Sebastian Tanase sebastian.tan...@openwide.fr

If the align option is enabled, we print to the user whenever
the guest clock is behind the host clock in order for he/she
to have a hint about the actual performance. The maximum
print interval is 2s and we limit the number of messages to 100.
If desired, this can be changed in cpu-exec.c

Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr
Tested-by: Camille Bégué camille.be...@openwide.fr
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpu-exec.c | 33 -
 1 file changed, 32 insertions(+), 1 deletion(-)

diff --git a/cpu-exec.c b/cpu-exec.c
index 68f82b6..3c14502 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -29,6 +29,7 @@
 typedef struct SyncClocks {
 int64_t diff_clk;
 int64_t last_cpu_icount;
+int64_t realtime_clock;
 } SyncClocks;
 
 #if !defined(CONFIG_USER_ONLY)
@@ -37,6 +38,9 @@ typedef struct SyncClocks {
  * oscillate around 0.
  */
 #define VM_CLOCK_ADVANCE 300
+#define THRESHOLD_REDUCE 1.5
+#define MAX_DELAY_PRINT_RATE 20LL
+#define MAX_NB_PRINTS 100
 
 static void align_clocks(SyncClocks *sc, const CPUState *cpu)
 {
@@ -68,16 +72,43 @@ static void align_clocks(SyncClocks *sc, const CPUState 
*cpu)
 }
 }
 
+static void print_delay(const SyncClocks *sc)
+{
+static float threshold_delay;
+static int64_t last_realtime_clock;
+static int nb_prints;
+
+if (icount_align_option 
+sc-realtime_clock - last_realtime_clock = MAX_DELAY_PRINT_RATE 
+nb_prints  MAX_NB_PRINTS) {
+if ((-sc-diff_clk / (float)10LL  threshold_delay) ||
+(-sc-diff_clk / (float)10LL 
+ (threshold_delay - THRESHOLD_REDUCE))) {
+threshold_delay = (-sc-diff_clk / 10LL) + 1;
+printf(Warning: The guest is now late by %.1f to %.1f seconds\n,
+   threshold_delay - 1,
+   threshold_delay);
+nb_prints++;
+last_realtime_clock = sc-realtime_clock;
+}
+}
+}
+
 static void init_delay_params(SyncClocks *sc,
   const CPUState *cpu)
 {
 if (!icount_align_option) {
 return;
 }
+sc-realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
 sc-diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) -
-   qemu_clock_get_ns(QEMU_CLOCK_REALTIME) +
+   sc-realtime_clock +
cpu_get_clock_offset();
 sc-last_cpu_icount = cpu-icount_extra + cpu-icount_decr.u16.low;
+
+/* Print every 2s max if the guest is late. We limit the number
+   of printed messages to NB_PRINT_MAX(currently 100) */
+print_delay(sc);
 }
 #else
 static void align_clocks(SyncClocks *sc, const CPUState *cpu)
-- 
1.9.3

[Qemu-devel] [PULL 01/11] backends: Introduce chr-testdev

From: Paolo Bonzini pbonz...@redhat.com

chr-testdev enables a virtio serial channel to be used for guest
initiated qemu exits. hw/misc/debugexit already enables guest
initiated qemu exits, but only for PC targets. chr-testdev supports
any virtio-capable target. kvm-unit-tests/arm is already making use
of this backend.

Currently there is a single command implemented, q.  It takes a
(prefix) argument for the exit code, thus an exit is implemented by
writing, e.g. 1q, to the virtio-serial port.

It can be used as:
   $QEMU ... \
 -device virtio-serial-device \
 -device virtserialport,chardev=ctd -chardev testdev,id=ctd

or, use:
   $QEMU ... \
 -device virtio-serial-device \
 -device virtconsole,chardev=ctd -chardev testdev,id=ctd

to bind it to virtio-serial port0.

Signed-off-by: Paolo Bonzini pbonz...@redhat.com
Signed-off-by: Andrew Jones drjo...@redhat.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 backends/Makefile.objs |   2 +-
 backends/testdev.c | 131 +
 include/sysemu/char.h  |   3 ++
 qapi-schema.json   |   3 +-
 qemu-char.c|   4 ++
 stubs/Makefile.objs|   1 +
 stubs/chr-testdev.c|   7 +++
 7 files changed, 149 insertions(+), 2 deletions(-)
 create mode 100644 backends/testdev.c
 create mode 100644 stubs/chr-testdev.c

diff --git a/backends/Makefile.objs b/backends/Makefile.objs
index 506a46c..31a3a89 100644
--- a/backends/Makefile.objs
+++ b/backends/Makefile.objs
@@ -1,7 +1,7 @@
 common-obj-y += rng.o rng-egd.o
 common-obj-$(CONFIG_POSIX) += rng-random.o
 
-common-obj-y += msmouse.o
+common-obj-y += msmouse.o testdev.o
 common-obj-$(CONFIG_BRLAPI) += baum.o
 baum.o-cflags := $(SDL_CFLAGS)
 
diff --git a/backends/testdev.c b/backends/testdev.c
new file mode 100644
index 000..70d63b3
--- /dev/null
+++ b/backends/testdev.c
@@ -0,0 +1,131 @@
+/*
+ * QEMU Char Device for testsuite control
+ *
+ * Copyright (c) 2014 Red Hat, Inc.
+ *
+ * Author: Paolo Bonzini pbonz...@redhat.com
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the Software), to 
deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING 
FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include qemu-common.h
+#include sysemu/char.h
+
+#define BUF_SIZE 32
+
+typedef struct {
+CharDriverState *chr;
+uint8_t in_buf[32];
+int in_buf_used;
+} TestdevCharState;
+
+/* Try to interpret a whole incoming packet */
+static int testdev_eat_packet(TestdevCharState *testdev)
+{
+const uint8_t *cur = testdev-in_buf;
+int len = testdev-in_buf_used;
+uint8_t c;
+int arg;
+
+#define EAT(c) do { \
+if (!len--) {   \
+return 0;   \
+}   \
+c = *cur++; \
+} while (0)
+
+EAT(c);
+
+while (isspace(c)) {
+EAT(c);
+}
+
+arg = 0;
+while (isdigit(c)) {
+arg = arg * 10 + c - '0';
+EAT(c);
+}
+
+while (isspace(c)) {
+EAT(c);
+}
+
+switch (c) {
+case 'q':
+exit((arg  1) | 1);
+break;
+default:
+break;
+}
+return cur - testdev-in_buf;
+}
+
+/* The other end is writing some data.  Store it and try to interpret */
+static int testdev_write(CharDriverState *chr, const uint8_t *buf, int len)
+{
+TestdevCharState *testdev = chr-opaque;
+int tocopy, eaten, orig_len = len;
+
+while (len) {
+/* Complete our buffer as much as possible */
+tocopy = MIN(len, BUF_SIZE - testdev-in_buf_used);
+
+memcpy(testdev-in_buf + testdev-in_buf_used, buf, tocopy);
+testdev-in_buf_used += tocopy;
+buf += tocopy;
+len -= tocopy;
+
+/* Interpret it as much as possible */
+while (testdev-in_buf_used  0 
+   (eaten = testdev_eat_packet(testdev))  0) {
+memmove(testdev-in_buf, testdev-in_buf + eaten,
+testdev-in_buf_used - eaten);
+testdev-in_buf_used -= eaten;
+}
+}
+return orig_len;
+}
+
+static void testdev_close(struct CharDriverState *chr)
+{
+TestdevCharState

[Qemu-devel] [PULL 07/11] icount: Add align option to icount

From: Sebastian Tanase sebastian.tan...@openwide.fr

The align option is used for activating the align algorithm
in order to synchronise the host clock and the guest clock.

Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr
Tested-by: Camille Bégué camille.be...@openwide.fr
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpus.c| 19 ---
 include/qemu-common.h |  1 +
 qemu-options.hx   | 15 +--
 vl.c  |  4 
 4 files changed, 30 insertions(+), 9 deletions(-)

diff --git a/cpus.c b/cpus.c
index 8291044..7e09538 100644
--- a/cpus.c
+++ b/cpus.c
@@ -476,25 +476,30 @@ static const VMStateDescription vmstate_timers = {
 void configure_icount(QemuOpts *opts, Error **errp)
 {
 const char *option;
+char *rem_str = NULL;
 
 seqlock_init(timers_state.vm_clock_seqlock, NULL);
 vmstate_register(NULL, 0, vmstate_timers, timers_state);
 option = qemu_opt_get(opts, shift);
 if (!option) {
+if (qemu_opt_get(opts, align) != NULL) {
+error_setg(errp, Please specify shift option when using align);
+}
 return;
 }
-/* When using -icount shift, the shift option will be
-   misinterpreted as a boolean */
-if (strcmp(option, on) == 0 || strcmp(option, off) == 0) {
-error_setg(errp, The shift option must be a number or auto);
-}
-
+icount_align_option = qemu_opt_get_bool(opts, align, false);
 icount_warp_timer = timer_new_ns(QEMU_CLOCK_REALTIME,
   icount_warp_rt, NULL);
 if (strcmp(option, auto) != 0) {
-icount_time_shift = strtol(option, NULL, 0);
+errno = 0;
+icount_time_shift = strtol(option, rem_str, 0);
+if (errno != 0 || *rem_str != '\0' || !strlen(option)) {
+error_setg(errp, icount: Invalid shift value);
+}
 use_icount = 1;
 return;
+} else if (icount_align_option) {
+error_setg(errp, shift=auto and align=on are incompatible);
 }
 
 use_icount = 2;
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 04b0769..5d10ac2 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -108,6 +108,7 @@ static inline char *realpath(const char *path, char 
*resolved_path)
 /* icount */
 void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
+extern int icount_align_option;
 
 #include qemu/osdep.h
 #include qemu/bswap.h
diff --git a/qemu-options.hx b/qemu-options.hx
index 5a1b001..96516c1 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3011,9 +3011,9 @@ re-inject them.
 ETEXI
 
 DEF(icount, HAS_ARG, QEMU_OPTION_icount, \
--icount [shift=N|auto]\n \
+-icount [shift=N|auto][,align=on|off]\n \
 enable virtual instruction counter with 2^N clock ticks 
per\n \
-instruction\n, QEMU_ARCH_ALL)
+instruction and enable aligning the host and virtual 
clocks\n, QEMU_ARCH_ALL)
 STEXI
 @item -icount [shift=@var{N}|auto]
 @findex -icount
@@ -3026,6 +3026,17 @@ Note that while this option can give deterministic 
behavior, it does not
 provide cycle accurate emulation.  Modern CPUs contain superscalar out of
 order cores with complex cache hierarchies.  The number of instructions
 executed often has little or no correlation with actual performance.
+
+@option{align=on} will activate the delay algorithm which will try to
+to synchronise the host clock and the virtual clock. The goal is to
+have a guest running at the real frequency imposed by the shift option.
+Whenever the guest clock is behind the host clock and if
+@option{align=on} is specified then we print a messsage to the user
+to inform about the delay.
+Currently this option does not work when @option{shift} is @code{auto}.
+Note: The sync algorithm will work for those shift values for which
+the guest clock runs ahead of the host clock. Typically this happens
+when the shift value is high (how high depends on the host machine).
 ETEXI
 
 DEF(watchdog, HAS_ARG, QEMU_OPTION_watchdog, \
diff --git a/vl.c b/vl.c
index f2621a5..a8029d5 100644
--- a/vl.c
+++ b/vl.c
@@ -183,6 +183,7 @@ uint8_t *boot_splash_filedata;
 size_t boot_splash_filedata_size;
 uint8_t qemu_extra_params_fw[2];
 
+int icount_align_option;
 typedef struct FWBootEntry FWBootEntry;
 
 struct FWBootEntry {
@@ -546,6 +547,9 @@ static QemuOptsList qemu_icount_opts = {
 {
 .name = shift,
 .type = QEMU_OPT_STRING,
+}, {
+.name = align,
+.type = QEMU_OPT_BOOL,
 },
 { /* end of list */ }
 },
-- 
1.9.3

[Qemu-devel] [PULL 11/11] target-mips: Ignore unassigned accesses with KVM

From: James Hogan james.ho...@imgtec.com

MIPS registers an unassigned access handler which raises a guest bus
error exception. However this causes QEMU to crash when KVM is enabled
as it isn't called from the main execution loop so longjmp() gets called
without a corresponding setjmp().

Until the KVM API can be updated to trigger a guest exception in
response to an MMIO exit, prevent the bus error exception being raised
from mips_cpu_unassigned_access() if KVM is enabled.

The check is at run time since the do_unassigned_access callback is
initialised before it is known whether KVM will be enabled.

The problem can be triggered with Malta emulation by making the guest
write to the reset region at physical address 0x1bf0, since it is
marked read-only which is treated as unassigned for writes.

Signed-off-by: James Hogan james.ho...@imgtec.com
Reviewed-by: Aurelien Jarno aurel...@aurel32.net
Cc: Peter Maydell peter.mayd...@linaro.org
Cc: Paolo Bonzini pbonz...@redhat.com
Cc: Gleb Natapov g...@redhat.com
Cc: Christoffer Dall christoffer.d...@linaro.org
Cc: Sanjay Lal sanj...@kymasys.com
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 target-mips/op_helper.c | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/target-mips/op_helper.c b/target-mips/op_helper.c
index 27651a4..df97b35 100644
--- a/target-mips/op_helper.c
+++ b/target-mips/op_helper.c
@@ -21,6 +21,7 @@
 #include qemu/host-utils.h
 #include exec/helper-proto.h
 #include exec/cpu_ldst.h
+#include sysemu/kvm.h
 
 #ifndef CONFIG_USER_ONLY
 static inline void cpu_mips_tlb_flush (CPUMIPSState *env, int flush_global);
@@ -2168,6 +2169,16 @@ void mips_cpu_unassigned_access(CPUState *cs, hwaddr 
addr,
 MIPSCPU *cpu = MIPS_CPU(cs);
 CPUMIPSState *env = cpu-env;
 
+/*
+ * Raising an exception with KVM enabled will crash because it won't be 
from
+ * the main execution loop so the longjmp won't have a matching setjmp.
+ * Until we can trigger a bus error exception through KVM lets just ignore
+ * the access.
+ */
+if (kvm_enabled()) {
+return;
+}
+
 if (is_exec) {
 helper_raise_exception(env, EXCP_IBE);
 } else {
-- 
1.9.3

[Qemu-devel] [PULL 10/11] monitor: Add drift info to 'info jit'

From: Sebastian Tanase sebastian.tan...@openwide.fr

Show in 'info jit' the current delay between the host clock
and the guest clock. In addition, print the maximum advance
and delay of the guest compared to the host.

Signed-off-by: Sebastian Tanase sebastian.tan...@openwide.fr
Tested-by: Camille Bégué camille.be...@openwide.fr
Signed-off-by: Paolo Bonzini pbonz...@redhat.com
---
 cpu-exec.c|  6 ++
 cpus.c| 15 +++
 include/qemu-common.h |  4 
 monitor.c |  1 +
 4 files changed, 26 insertions(+)

diff --git a/cpu-exec.c b/cpu-exec.c
index 3c14502..cbc8067 100644
--- a/cpu-exec.c
+++ b/cpu-exec.c
@@ -105,6 +105,12 @@ static void init_delay_params(SyncClocks *sc,
sc-realtime_clock +
cpu_get_clock_offset();
 sc-last_cpu_icount = cpu-icount_extra + cpu-icount_decr.u16.low;
+if (sc-diff_clk  max_delay) {
+max_delay = sc-diff_clk;
+}
+if (sc-diff_clk  max_advance) {
+max_advance = sc-diff_clk;
+}
 
 /* Print every 2s max if the guest is late. We limit the number
of printed messages to NB_PRINT_MAX(currently 100) */
diff --git a/cpus.c b/cpus.c
index 19245e9..a5f78c1 100644
--- a/cpus.c
+++ b/cpus.c
@@ -64,6 +64,8 @@
 #endif /* CONFIG_LINUX */
 
 static CPUState *next_cpu;
+int64_t max_delay;
+int64_t max_advance;
 
 bool cpu_is_stopped(CPUState *cpu)
 {
@@ -1552,3 +1554,16 @@ void qmp_inject_nmi(Error **errp)
 error_set(errp, QERR_UNSUPPORTED);
 #endif
 }
+
+void dump_drift_info(FILE *f, fprintf_function cpu_fprintf)
+{
+cpu_fprintf(f, Host - Guest clock  %ld(ms)\n,
+(cpu_get_clock() - cpu_get_icount())/SCALE_MS);
+if (icount_align_option) {
+cpu_fprintf(f, Max guest delay %ld(ms)\n, -max_delay/SCALE_MS);
+cpu_fprintf(f, Max guest advance   %ld(ms)\n, max_advance/SCALE_MS);
+} else {
+cpu_fprintf(f, Max guest delay NA\n);
+cpu_fprintf(f, Max guest advance   NA\n);
+}
+}
diff --git a/include/qemu-common.h b/include/qemu-common.h
index 5d10ac2..bcf7a6a 100644
--- a/include/qemu-common.h
+++ b/include/qemu-common.h
@@ -109,6 +109,10 @@ static inline char *realpath(const char *path, char 
*resolved_path)
 void configure_icount(QemuOpts *opts, Error **errp);
 extern int use_icount;
 extern int icount_align_option;
+/* drift information for info jit command */
+extern int64_t max_delay;
+extern int64_t max_advance;
+void dump_drift_info(FILE *f, fprintf_function cpu_fprintf);
 
 #include qemu/osdep.h
 #include qemu/bswap.h
diff --git a/monitor.c b/monitor.c
index 5bc70a6..cdbaa60 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1047,6 +1047,7 @@ static void do_info_registers(Monitor *mon, const QDict 
*qdict)
 static void do_info_jit(Monitor *mon, const QDict *qdict)
 {
 dump_exec_info((FILE *)mon, monitor_fprintf);
+dump_drift_info((FILE *)mon, monitor_fprintf);
 }
 
 static void do_info_history(Monitor *mon, const QDict *qdict)
-- 
1.9.3

[Qemu-devel] [Bug 1285708] Re: FreeBSD Guest crash on boot due to xsave instruction issue