date:20160715

Re: [Qemu-devel] [PATCH 2/2] tests: Check serial output of firmware boot of some machines

2016-07-15 Thread David Gibson

On Fri, Jul 15, 2016 at 09:25:16AM +0200, Thomas Huth wrote:
> On 15.07.2016 05:21, David Gibson wrote:
> > On Thu, Jul 14, 2016 at 11:57:46AM +0200, Thomas Huth wrote:
> >> Some of the machines that we have got a firmware image for write
> >> some output to the serial console while booting up. We can use
> >> this output to make sure that the machine is basically working,
> >> so this adds a test that checks the output of these machines
> >> for some well-known "magic" strings.
> >>
> >> Signed-off-by: Thomas Huth 
> > 
> > I love this idea.  A couple of queries about the implemntation.
> > 
> >> ---
> >>  tests/Makefile.include   |   8 
> >>  tests/boot-serial-test.c | 110 
> >> +++
> >>  2 files changed, 118 insertions(+)
> >>  create mode 100644 tests/boot-serial-test.c
> >>
> >> diff --git a/tests/Makefile.include b/tests/Makefile.include
> >> index b7784d3..ba1cc8d 100644
> >> --- a/tests/Makefile.include
> >> +++ b/tests/Makefile.include
> >> @@ -194,6 +194,7 @@ check-qtest-i386-y += tests/hd-geo-test$(EXESUF)
> >>  gcov-files-i386-y += hw/block/hd-geometry.c
> >>  check-qtest-i386-y += tests/boot-order-test$(EXESUF)
> >>  check-qtest-i386-y += tests/bios-tables-test$(EXESUF)
> >> +check-qtest-i386-y += tests/boot-serial-test$(EXESUF)
> >>  check-qtest-i386-y += tests/pxe-test$(EXESUF)
> >>  check-qtest-i386-y += tests/rtc-test$(EXESUF)
> >>  check-qtest-i386-y += tests/ipmi-kcs-test$(EXESUF)
> >> @@ -241,6 +242,8 @@ check-qtest-x86_64-y += $(check-qtest-i386-y)
> >>  gcov-files-i386-y += i386-softmmu/hw/timer/mc146818rtc.c
> >>  gcov-files-x86_64-y = $(subst 
> >> i386-softmmu/,x86_64-softmmu/,$(gcov-files-i386-y))
> >>  
> >> +check-qtest-alpha-y = tests/boot-serial-test$(EXESUF)
> >> +
> >>  check-qtest-mips-y = tests/endianness-test$(EXESUF)
> >>  check-qtest-mips64-y = tests/endianness-test$(EXESUF)
> >>  check-qtest-mips64el-y = tests/endianness-test$(EXESUF)
> >> @@ -248,12 +251,14 @@ check-qtest-mips64el-y = 
> >> tests/endianness-test$(EXESUF)
> >>  check-qtest-ppc-y = tests/endianness-test$(EXESUF)
> >>  check-qtest-ppc-y += tests/boot-order-test$(EXESUF)
> >>  check-qtest-ppc-y += tests/prom-env-test$(EXESUF)
> >> +check-qtest-ppc-y += tests/boot-serial-test$(EXESUF)
> >>  
> >>  check-qtest-ppc64-y = tests/endianness-test$(EXESUF)
> >>  check-qtest-ppc64-y += tests/boot-order-test$(EXESUF)
> >>  check-qtest-ppc64-y += tests/spapr-phb-test$(EXESUF)
> >>  gcov-files-ppc64-y += ppc64-softmmu/hw/ppc/spapr_pci.c
> >>  check-qtest-ppc64-y += tests/prom-env-test$(EXESUF)
> >> +check-qtest-ppc64-y += tests/boot-serial-test$(EXESUF)
> >>  
> >>  check-qtest-sh4-y = tests/endianness-test$(EXESUF)
> >>  check-qtest-sh4eb-y = tests/endianness-test$(EXESUF)
> >> @@ -277,6 +282,8 @@ gcov-files-arm-y += arm-softmmu/hw/block/virtio-blk.c
> >>  check-qtest-microblazeel-y = $(check-qtest-microblaze-y)
> >>  check-qtest-xtensaeb-y = $(check-qtest-xtensa-y)
> >>  
> >> +check-qtest-s390x-y = tests/boot-serial-test$(EXESUF)
> >> +
> >>  check-qtest-generic-y += tests/qom-test$(EXESUF)
> >>  
> >>  qapi-schema += alternate-any.json
> >> @@ -575,6 +582,7 @@ tests/ipmi-kcs-test$(EXESUF): tests/ipmi-kcs-test.o
> >>  tests/ipmi-bt-test$(EXESUF): tests/ipmi-bt-test.o
> >>  tests/hd-geo-test$(EXESUF): tests/hd-geo-test.o
> >>  tests/boot-order-test$(EXESUF): tests/boot-order-test.o $(libqos-obj-y)
> >> +tests/boot-serial-test$(EXESUF): tests/boot-serial-test.o $(libqos-obj-y)
> >>  tests/bios-tables-test$(EXESUF): tests/bios-tables-test.o \
> >>tests/boot-sector.o $(libqos-obj-y)
> >>  tests/pxe-test$(EXESUF): tests/pxe-test.o tests/boot-sector.o 
> >> $(libqos-obj-y)
> >> diff --git a/tests/boot-serial-test.c b/tests/boot-serial-test.c
> >> new file mode 100644
> >> index 000..3263dcd
> >> --- /dev/null
> >> +++ b/tests/boot-serial-test.c
> >> @@ -0,0 +1,110 @@
> >> +/*
> >> + * Test serial output of some machines.
> >> + *
> >> + * Copyright 2016 Thomas Huth, Red Hat Inc.
> >> + *
> >> + * This work is licensed under the terms of the GNU GPL, version 2
> >> + * or later. See the COPYING file in the top-level directory.
> >> + *
> >> + * This test is used to check that the serial output of the firmware
> >> + * (that we provide for some machines) contains an expected string.
> >> + * Thus we check that the firmware still boots at least to a certain
> >> + * point and so we know that the machine is not completely broken.
> >> + */
> >> +
> >> +#include "qemu/osdep.h"
> >> +#include "libqtest.h"
> >> +
> >> +typedef struct testdef {
> >> +const char *arch;   /* Target architecture */
> >> +const char *machine;/* Name of the machine */
> >> +const char *extra;  /* Additional parameters */
> >> +const char *expect; /* Expected string in the serial output */
> >> +} testdef_t;
> >> +
> >> +static testdef_t tests[] = {
> >> +{ "alpha", "clipper", "", "PCI:" },
> >> +{ "ppc", "ppce500", "", "U-Boot" },
> >>

Re: [Qemu-devel] [PATCH V2 1/1] linux-aio: prevent submitting more than MAX_EVENTS

2016-07-15 Thread Roman Penyaev

On Wed, Jul 13, 2016 at 1:45 PM, Kevin Wolf  wrote:
> Am 13.07.2016 um 13:33 hat Roman Penyaev geschrieben:
>> Just to be sure that we are on the same page:
>>
>> 1. We have this commit "linux-aio: Cancel BH if not needed" which
>>
>>a) introduces performance regression on my fio workloads on the
>>   following config: "iothread=1, VCPU=8, MQ=8". Performance
>>   dropped from 1878MB/s to 1606MB/s with Stefan's fix, that is
>>   ~14%.
>
> Do we already understand why the performance regresses with the patch?
> As long as we don't, everything we do is just guesswork.

Eventually the issue is clear.  I test on /dev/nullb0, which completes
all submitted bios almost immediately.  That means, that after io_submit()
is called it is worth trying to check completed requests and not to
accumulate them in-flight.

That is the theory.  On practise happens the following:

---
>>> sys_poll
<<< sys_poll
>>> aio_dispatch
>>> aio_bh_poll
<<< aio_bh_poll
>>> node->io_read
!!! ioq_submit(), submitted=98
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=49
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=47
<<< node->io_read
<<< aio_dispatch
>>> sys_poll
<<< sys_poll
>>> aio_dispatch
>>> aio_bh_poll
<<< aio_bh_poll
>>> node->io_read
!!! ioq_submit(), submitted=50
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=43
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=43
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=8
<<< node->io_read
>>> node->io_read
~~~ qemu_laio_completion_bh completed=338
<<< node->io_read
<<< aio_dispatch
---
* this run gave 1461MB/s *

This is the very common hunk of the log which I see running fio load with
the "linux-aio: Cancel BH if not needed" patch applied.

The important thing which is worth paying attention to is submission of
338 requests (almost whole ring buffer of AIO context) before consuming
requests completions.

Very fast backend device completes submitted requests almost immediately,
but we get a chance to fetch completions only some time later.

The following is the common part of the log when
"linux-aio: Cancel BH if not needed" is reverted:

---
>>> sys_poll
<<< sys_poll
>>> dispatch
>>> aio_bh_poll
~~~ qemu_laio_completion_bh completed=199
<<< aio_bh_poll
>>> node->io_read
!!! ioq_submit(), submitted=47
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=49
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=50
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=43
<<< node->io_read
>>> node->io_read
<<< node->io_read
<<< dispatch
>>> sys_poll
<<< sys_poll
>>> dispatch
>>> aio_bh_poll
~~~ qemu_laio_completion_bh, completed=189
<<< aio_bh_poll
>>> node->io_read
!!! ioq_submit(), submitted=46
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=46
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=51
<<< node->io_read
>>> node->io_read
!!! ioq_submit(), submitted=51
<<< node->io_read
<<< dispatch
---
* this run gave 1805MB/s *

According to this part of the log I can say, that completions happen
frequently, i.e. we get a chance to fetch completions more often, thus
queue is always "refreshed" by new comming requests.

To be more precise I collected some statistics: each time I enter
qemu_laio_completion_bh() I account the number of collected requests in the
bucket, e.g.:

   "~~~ qemu_laio_completion_bh completed=199"

   bucket[199] += 1;

   "~~~ qemu_laio_completion_bh, completed=189"

   bucket[189] += 1;

When fio finishes I have a distribution of number of completed requests
which I have observed in the ring buffer.

Here is the sheet:

https://docs.google.com/spreadsheets/d/12CIt6EKiJLqNx0OHNqiabR-oFBrqkH0LN3mjzZ5jGeo/edit?usp=sharing

(Frankly, I could not think of anything better than to send a link on
 google docs, sorry if that insults someone).

There is a chart which shows the whole picture of distribution:

   o  X axis is a number of requests completed at once.
   o  Y axis is a number of times we observe that

Re: [Qemu-devel] [PATCH] ppc: Yet another fix for the huge page support detection mechanism

2016-07-15 Thread Greg Kurz

On Fri, 15 Jul 2016 10:10:25 +0200
Thomas Huth  wrote:

> Commit 86b50f2e1bef ("Disable huge page support if it is not available
> for main RAM") already made sure that huge page support is not announced
> to the guest if the normal RAM of non-NUMA configurations is not backed
> by a huge page filesystem. However, there is one more case that can go
> wrong: NUMA is enabled, but the RAM of the NUMA nodes are not configured
> with huge page support (and only the memory of a DIMM is configured with
> it). When QEMU is started with the following command line for example,
> the Linux guest currently crashes because it is trying to use huge pages
> on a memory region that does not support huge pages:
> 
>  qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \
>memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 
> \
>-device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \
>-numa node,nodeid=0 -numa node,nodeid=1
> 
> To fix this issue, we've got to make sure to disable huge page support,
> too, when there is a NUMA node that is not using a memory backend with
> huge page support.
> 
> Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740

According to http://patchwork.ozlabs.org/patch/584741/ , it is best worded

"Broken in commit 86b50f2e1bef"

> Signed-off-by: Thomas Huth 
> ---
>  target-ppc/kvm.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 884d564..7a8f555 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -389,12 +389,16 @@ static long getrampagesize(void)
>  
>  object_child_foreach(memdev_root, find_max_supported_pagesize, );
>  
> -if (hpsize == LONG_MAX) {
> +if (hpsize == LONG_MAX || hpsize == getpagesize()) {
>  return getpagesize();
>  }
>  
> -if (nb_numa_nodes == 0 && hpsize > getpagesize()) {
> -/* No NUMA nodes and normal RAM without -mem-path ==> no huge pages! 
> */
> +/* If NUMA is disabled or the NUMA nodes are not backed with a
> + * memory-backend, then there is at least one node using "normal"
> + * RAM. And since normal RAM has not been configured with "-mem-path"
> + * (what we've checked earlier here already), we can not use huge pages!
> + */
> +if (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL) {

Dumb question: why only checking numa_info[0] ?

>  static bool warned;
>  if (!warned) {
>  error_report("Huge page support disabled (n/a for main 
> memory).");

Re: [Qemu-devel] [PATCH for 2.7 resend] linux-aio: share one LinuxAioState within an AioContext

2016-07-15 Thread Stefan Hajnoczi

On Mon, Jul 04, 2016 at 06:33:20PM +0200, Paolo Bonzini wrote:
> This has better performance because it executes fewer system calls
> and does not use a bottom half per disk.
> 
> Originally proposed by Ming Lei.
> 
> Acked-by: Stefan Hajnoczi 
> Signed-off-by: Paolo Bonzini 
> ---
>  async.c|  23 +++
>  block/linux-aio.c  |  10 ++--
>  block/raw-posix.c  | 119 
> +
>  block/raw-win32.c  |   2 +-
>  include/block/aio.h|  13 
>  {block => include/block}/raw-aio.h |   0
>  6 files changed, 57 insertions(+), 110 deletions(-)
>  rename {block => include/block}/raw-aio.h (100%)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v3 12/14] virtio-gpu: Use migrate_add_blocker for virgl migration blocking

2016-07-15 Thread Cornelia Huck

On Thu, 14 Jul 2016 18:22:54 +0100
"Dr. David Alan Gilbert (git)"  wrote:

> From: "Dr. David Alan Gilbert" 
> 
> virgl conditionally registers a vmstate as unmigratable when virgl
> is enabled; instead use the migrate_add_blocker mechanism.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> ---
>  hw/display/virtio-gpu.c| 19 +--
>  include/hw/virtio/virtio-gpu.h |  2 ++
>  2 files changed, 15 insertions(+), 6 deletions(-)
> 

> @@ -1169,13 +1165,23 @@ static void virtio_gpu_device_realize(DeviceState 
> *qdev, Error **errp)
>  }
> 
>  if (virtio_gpu_virgl_enabled(g->conf)) {
> -vmstate_register(qdev, -1, _virtio_gpu_unmigratable, g);
> +error_setg(>migration_blocker, "virgl is not yet migratable");

Suggest prepending with "virtio-gpu:".

> +migrate_add_blocker(g->migration_blocker);
>  } else {
>  register_savevm(qdev, "virtio-gpu", -1, VIRTIO_GPU_VM_VERSION,
>  virtio_gpu_save, virtio_gpu_load, g);
>  }
>  }

In any case,

Reviewed-by: Cornelia Huck

[Qemu-devel] [PATCH] net: fix incorrect access to pointer

2016-07-15 Thread Paolo Bonzini

This is not dereferencing the pointer, and instead checking only
the value of the pointer.

Signed-off-by: Paolo Bonzini 
---
 net/eth.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/eth.c b/net/eth.c
index 0be59c2..df81efb 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -211,7 +211,7 @@ void eth_get_protocols(const struct iovec *iov, int iovcnt,
  *l4hdr_off, sizeof(l4hdr_info->hdr.tcp),
  _info->hdr.tcp);
 
-if (istcp) {
+if (*istcp) {
 *l5hdr_off = *l4hdr_off +
 TCP_HEADER_DATA_OFFSET(_info->hdr.tcp);
 
-- 
2.7.4

[Qemu-devel] [PATCH] vnc-tight: fix regression with libxenstore

2016-07-15 Thread Peter Lieven

commit 095497ff added thread local storage for the color counting
palette. Unfortunately, a VncPalette is about 7kB on a x86_64 system.
This memory is reserved from the stack of every thread and it
exhausted the stack space of a libxenstore thread.

Fix this by allocating memory only for the VNC encoding thread.

Fixes: 095497ffc66b7f031ff2a17f1e50f5cb105ce588
Reported-by: Juergen Gross 
Tested-by: Juergen Gross 
Signed-off-by: Peter Lieven 
---
 ui/vnc-enc-tight.c | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/ui/vnc-enc-tight.c b/ui/vnc-enc-tight.c
index b8581dd..2b58739 100644
--- a/ui/vnc-enc-tight.c
+++ b/ui/vnc-enc-tight.c
@@ -1457,11 +1457,17 @@ static int send_sub_rect_jpeg(VncState *vs, int x, int 
y, int w, int h,
 }
 #endif
 
-static __thread VncPalette color_count_palette;
+static __thread VncPalette *color_count_palette;
+static __thread Notifier vnc_tight_cleanup_notifier;
+
+static void vnc_tight_cleanup(Notifier *n, void *value)
+{
+g_free(color_count_palette);
+color_count_palette = NULL;
+}
 
 static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
 {
-VncPalette *palette = _count_palette;
 uint32_t bg = 0, fg = 0;
 int colors;
 int ret = 0;
@@ -1470,6 +1476,12 @@ static int send_sub_rect(VncState *vs, int x, int y, int 
w, int h)
 bool allow_jpeg = true;
 #endif
 
+if (!color_count_palette) {
+color_count_palette = g_malloc(sizeof(VncPalette));
+vnc_tight_cleanup_notifier.notify = vnc_tight_cleanup;
+qemu_thread_atexit_add(_tight_cleanup_notifier);
+}
+
 vnc_framebuffer_update(vs, x, y, w, h, vs->tight.type);
 
 vnc_tight_start(vs);
@@ -1490,17 +1502,19 @@ static int send_sub_rect(VncState *vs, int x, int y, 
int w, int h)
 }
 #endif
 
-colors = tight_fill_palette(vs, x, y, w * h, , , palette);
+colors = tight_fill_palette(vs, x, y, w * h, , , 
color_count_palette);
 
 #ifdef CONFIG_VNC_JPEG
 if (allow_jpeg && vs->tight.quality != (uint8_t)-1) {
-ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, palette,
- force_jpeg);
+ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors,
+ color_count_palette, force_jpeg);
 } else {
-ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
+ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors,
+   color_count_palette);
 }
 #else
-ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
+ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors,
+   color_count_palette);
 #endif
 
 return ret;
-- 
1.9.1

Re: [Qemu-devel] [PATCH v3 12/14] virtio-gpu: Use migrate_add_blocker for virgl migration blocking

2016-07-15 Thread Gerd Hoffmann

On Do, 2016-07-14 at 18:22 +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> virgl conditionally registers a vmstate as unmigratable when virgl
> is enabled; instead use the migrate_add_blocker mechanism.
> 
> Signed-off-by: Dr. David Alan Gilbert 

Reviewed-by: Gerd Hoffmann

Re: [Qemu-devel] [PATCH V2 1/1] linux-aio: prevent submitting more than MAX_EVENTS

2016-07-15 Thread Paolo Bonzini

On 15/07/2016 11:18, Roman Penyaev wrote:
> Those 3 red spikes and a blue hill is what we have to focus on.  The
> blue hill at the right corner of the chart means that almost always the
> ring buffer was observed as full, i.e. qemu_laio_completion_bh() got
> a chance to reap completions not very often, meanwhile completed
> requests stand in the ring buffer for quite a long time which degrades
> the overall performance.
> 
> The results covered by the red line are much better and that can be
> explained by those 3 red spikes, which are almost in the middle of the
> whole distribution, i.e. qemu_laio_completion_bh() is called more often,
> completed requests do not stall, giving fio a chance to submit new fresh
> requests.
> 
> The theoretical fix would be to schedule completion BH just after
> successful io_submit, i.e.:

What about removing the qemu_bh_cancel but keeping the rest of the patch?

I'm also interested in a graph with this patch ("linux-aio: prevent
submitting more than MAX_EVENTS") on top of origin/master.

Thanks for the analysis.  Sometimes a picture _is_ worth a thousand
words, even if it's measuring "only" second-order effects (# of
completions is not what causes the slowdown, but # of completions
affects latency which causes the slowdown).

Paolo

Re: [Qemu-devel] [Qemu-block] [PATCH] aio_ctx_check: follow CODING_STYLE

2016-07-15 Thread Stefan Hajnoczi

On Fri, Jul 15, 2016 at 09:48:50AM +0800, Cao jin wrote:
> On 07/14/2016 10:08 PM, Eric Blake wrote:
> > On 07/14/2016 07:10 AM, Cao jin wrote:
> > > replace tab with spaces
> > > 
> > > Signed-off-by: Cao jin 
> > > ---
> > >   async.c | 2 +-
> > >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > Whitespace-only changes are best done as part of a series that is
> > already touching nearby code for other reasons (depending on the size of
> > the whitespace changes and on the rest of your patch, it may be okay to
> > squash the whitespace change in place, or better to split into separate
> > patches to make review of both patches easier).  Otherwise, it just
> > makes 'git blame' output dirtier.
> 
> I see.
> Since async.c & aio-posix.c are belong to the same maintaiers, so, Fam &
> Stefan, is it ok to squash this into that "remove useless parameter" patch?
> If not, we can just forget this one.

The "remove useless parameter" patch doesn't touch the function you are
modifying here.  Please don't squash the patches.

Since you have already posted this patch I will merge it.  In the future
please don't submit whitespace changes, tiny coding style cleanups, etc
in by themselves.

Thanks for all your contributions.  I do not want to discourage you but
my view is that code changes should only be made if they fix a bug,
improve performance measurably, add a feature, or significantly improve
the code.

Every patch has a cost in terms of code review, merging/testing, backporting,
bisecting, documentation, etc.  We could discuss each of these in detail
but basically a code change creates work or takes time from one or more
people in these areas.

In a perfect world with unlimited resources all patches would be equally
welcome.  Due to limited resources it's best to submit the types of
patches I mentioned above where the cost/benefit ratio is favorable.

Thanks,
Stefan

signature.asc
Description: PGP signature

Re: [Qemu-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Juergen Gross

On 15/07/16 12:35, Paolo Bonzini wrote:
> 
> 
> On 15/07/2016 12:12, Gerd Hoffmann wrote:
>> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>>
>>> On 15/07/2016 10:47, Juergen Gross wrote:
 Nothing scaring and no real difference between working and not working
 variant.

 Meanwhile I've been digging a little bit deeper and found the reason:
 libxenstore is setting up a reader thread which is waiting for the
 watch to fire. With above commit the stack size of that thread (16kB)
 is too small. Setting it to 32kB made qemu work again.
>>>
>>> This makes very little sense (not your fault)...  The commit doesn't
>>> change stack usage at all, TLS should not be on the stack.
>>>
>>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>>> it's only due to inlining decision on the compiler, in which case
>>> Peter's patch from today is only a bandaid.
>>
>> Hmm, I guess I hold off the vnc pull request for now ...
> 
> Go ahead.  I looked at glibc source code and the patch is okay.

Paolo, do you know of any interface to obtain the size of the TLS area
taken from the stack (before calling pthread_create() )? This would
enable me to modify libxenstore to set the stack size to a sensible
value without having to choose a magic number which might fit for qemu,
but not for other users of libxenstore in the future.


Juergen

Re: [Qemu-devel] [PATCH V2 1/1] linux-aio: prevent submitting more than MAX_EVENTS

2016-07-15 Thread Roman Penyaev

On Fri, Jul 15, 2016 at 12:17 PM, Roman Penyaev
 wrote:
> On Fri, Jul 15, 2016 at 11:58 AM, Paolo Bonzini  wrote:
>>
>>
>> On 15/07/2016 11:18, Roman Penyaev wrote:
>>> Those 3 red spikes and a blue hill is what we have to focus on.  The
>>> blue hill at the right corner of the chart means that almost always the
>>> ring buffer was observed as full, i.e. qemu_laio_completion_bh() got
>>> a chance to reap completions not very often, meanwhile completed
>>> requests stand in the ring buffer for quite a long time which degrades
>>> the overall performance.
>>>
>>> The results covered by the red line are much better and that can be
>>> explained by those 3 red spikes, which are almost in the middle of the
>>> whole distribution, i.e. qemu_laio_completion_bh() is called more often,
>>> completed requests do not stall, giving fio a chance to submit new fresh
>>> requests.
>>>
>>> The theoretical fix would be to schedule completion BH just after
>>> successful io_submit, i.e.:
>>
>> What about removing the qemu_bh_cancel but keeping the rest of the patch?
>
> That exactly what I did.  Numbers go to expected from ~1600MB/s to ~1800MB/s.
> So basically this hunk of the debatable patch:
>
>  if (event_notifier_test_and_clear(>e)) {
> -qemu_bh_schedule(s->completion_bh);
> +qemu_laio_completion_bh(s);
>  }
>
> does not have any impact and can be ignored.  At least I did not notice
> anything important.
>
>>
>> I'm also interested in a graph with this patch ("linux-aio: prevent
>> submitting more than MAX_EVENTS") on top of origin/master.
>
> I can plot it also of course.

So, finally I have it.

Same link:
https://docs.google.com/spreadsheets/d/12CIt6EKiJLqNx0OHNqiabR-oFBrqkH0LN3mjzZ5jGeo/edit?usp=sharing

last sheet:
"1789MB/s"

Not that much interesting: almost all the time we complete maximum:
MAX_LIMIT requests at once.  But of course that expected on such
device.  Probably other good metrics should be taken into account.

--
Roman

Re: [Qemu-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Juergen Gross

On 15/07/16 09:39, Peter Lieven wrote:
> Am 15.07.2016 um 08:32 schrieb Juergen Gross:
>> Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
>> thread local storage for palette") introduced a regression with Xen:
>> Since this commit qemu used as a device model is no longer capable
>> to register Xenstore watches (that's the effect visible to a user).
>> Reverting this commit makes qemu behave well again. I have no idea
>> why that commit would have this effect with Xen, may be some memory
>> is clobbered?
> 
> I personally have no idea, maybe @Paolo has?
> 
> Maybe the corruption happens somewhere else and is just visible
> due to this change.
> 
> Do you see sth when you ran qemu/xen in valgrind?

Nothing scaring and no real difference between working and not working
variant.

Meanwhile I've been digging a little bit deeper and found the reason:
libxenstore is setting up a reader thread which is waiting for the
watch to fire. With above commit the stack size of that thread (16kB)
is too small. Setting it to 32kB made qemu work again.

So I'd recommend to have just a thread local palette pointer and
allocate the palette when needed and don't free it after using it but
keep it for reuse. Do you want to write that patch or should I do it?

Juergen

[Qemu-devel] [PATCH] exec: avoid realloc in phys_map_node_reserve

2016-07-15 Thread Peter Lieven

this is the first step in reducing the brk heap fragmentation
created by the map->nodes memory allocation. Since the introduction
of RCU the freeing of the PhysPageMaps is delayed so that sometimes
several hundred are allocated at the same time.

Even worse the memory for map->nodes is allocated and shortly
afterwards reallocated. Since the number of nodes it grows
to in the end is the same for all PhysPageMaps remember this value
and at least avoid the reallocation.

The large number of simultaneous allocations (about 450 x 70kB in
my configuration) has to be addressed later.

Signed-off-by: Peter Lieven 
---
 exec.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/exec.c b/exec.c
index 011babd..60cf46a 100644
--- a/exec.c
+++ b/exec.c
@@ -187,10 +187,12 @@ struct CPUAddressSpace {
 
 static void phys_map_node_reserve(PhysPageMap *map, unsigned nodes)
 {
+static unsigned alloc_hint = 16;
 if (map->nodes_nb + nodes > map->nodes_nb_alloc) {
-map->nodes_nb_alloc = MAX(map->nodes_nb_alloc * 2, 16);
+map->nodes_nb_alloc = MAX(map->nodes_nb_alloc, alloc_hint);
 map->nodes_nb_alloc = MAX(map->nodes_nb_alloc, map->nodes_nb + nodes);
 map->nodes = g_renew(Node, map->nodes, map->nodes_nb_alloc);
+alloc_hint = map->nodes_nb_alloc;
 }
 }
 
-- 
1.9.1

Re: [Qemu-devel] [PATCH] vnc-tight: fix regression with libxenstore

2016-07-15 Thread Peter Lieven

Am 15.07.2016 um 12:07 schrieb Gerd Hoffmann:
> On Fr, 2016-07-15 at 11:45 +0200, Peter Lieven wrote:
>> commit 095497ff added thread local storage for the color counting
>> palette. Unfortunately, a VncPalette is about 7kB on a x86_64 system.
>> This memory is reserved from the stack of every thread and it
>> exhausted the stack space of a libxenstore thread.
>>
>> Fix this by allocating memory only for the VNC encoding thread.
> Added to vnc queue.

Please wait. Paolo mentioned that TLS is not allocated from the stack.
Maybe this patch is ok, but we need a different commit message then.

Peter

Re: [Qemu-devel] [PATCH] aio_ctx_check: follow CODING_STYLE

2016-07-15 Thread Stefan Hajnoczi

On Thu, Jul 14, 2016 at 09:10:43PM +0800, Cao jin wrote:
> replace tab with spaces
> 
> Signed-off-by: Cao jin 
> ---
>  async.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


signature.asc
Description: PGP signature

[Qemu-devel] [RFC PATCH V6 2/6] colo-base: add colo-base to define and handle packet

2016-07-15 Thread Zhang Chen

COLO-base used by colo-compare and filter-rewriter.
this can share common data structure like:net packet,
and share other functions.

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
 net/Makefile.objs  |   1 +
 net/colo-base.c|  74 +
 net/colo-base.h|  38 +
 net/colo-compare.c | 119 -
 trace-events   |   3 ++
 5 files changed, 233 insertions(+), 2 deletions(-)
 create mode 100644 net/colo-base.c
 create mode 100644 net/colo-base.h

diff --git a/net/Makefile.objs b/net/Makefile.objs
index ba92f73..119589f 100644
--- a/net/Makefile.objs
+++ b/net/Makefile.objs
@@ -17,3 +17,4 @@ common-obj-y += filter.o
 common-obj-y += filter-buffer.o
 common-obj-y += filter-mirror.o
 common-obj-y += colo-compare.o
+common-obj-y += colo-base.o
diff --git a/net/colo-base.c b/net/colo-base.c
new file mode 100644
index 000..f5d5de9
--- /dev/null
+++ b/net/colo-base.c
@@ -0,0 +1,74 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "net/colo-base.h"
+
+int parse_packet_early(Packet *pkt)
+{
+int network_length;
+uint8_t *data = pkt->data;
+uint16_t l3_proto;
+ssize_t l2hdr_len = eth_get_l2_hdr_length(data);
+
+if (pkt->size < ETH_HLEN) {
+error_report("pkt->size < ETH_HLEN");
+return 1;
+}
+pkt->network_layer = data + ETH_HLEN;
+l3_proto = eth_get_l3_proto(data, l2hdr_len);
+if (l3_proto != ETH_P_IP) {
+return 1;
+}
+
+network_length = pkt->ip->ip_hl * 4;
+if (pkt->size < ETH_HLEN + network_length) {
+error_report("pkt->size < network_layer + network_length");
+return 1;
+}
+pkt->transport_layer = pkt->network_layer + network_length;
+if (!pkt->transport_layer) {
+error_report("pkt->transport_layer is valid");
+return 1;
+}
+
+return 0;
+}
+
+Packet *packet_new(const void *data, int size)
+{
+Packet *pkt = g_slice_new(Packet);
+
+pkt->data = g_memdup(data, size);
+pkt->size = size;
+
+return pkt;
+}
+
+void packet_destroy(void *opaque, void *user_data)
+{
+Packet *pkt = opaque;
+
+g_free(pkt->data);
+g_slice_free(Packet, pkt);
+}
+
+/*
+ * Clear hashtable, stop this hash growing really huge
+ */
+void connection_hashtable_reset(GHashTable *connection_track_table)
+{
+g_hash_table_remove_all(connection_track_table);
+}
diff --git a/net/colo-base.h b/net/colo-base.h
new file mode 100644
index 000..48835e7
--- /dev/null
+++ b/net/colo-base.h
@@ -0,0 +1,38 @@
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen 
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_BASE_H
+#define QEMU_COLO_BASE_H
+
+#include "slirp/slirp.h"
+#include "qemu/jhash.h"
+
+#define HASHTABLE_MAX_SIZE 16384
+
+typedef struct Packet {
+void *data;
+union {
+uint8_t *network_layer;
+struct ip *ip;
+};
+uint8_t *transport_layer;
+int size;
+} Packet;
+
+int parse_packet_early(Packet *pkt);
+void connection_hashtable_reset(GHashTable *connection_track_table);
+Packet *packet_new(const void *data, int size);
+void packet_destroy(void *opaque, void *user_data);
+
+#endif /* QEMU_COLO_BASE_H */
diff --git a/net/colo-compare.c b/net/colo-compare.c
index 0402958..7c52cc8 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -27,13 +27,38 @@
 #include "sysemu/char.h"
 #include "qemu/sockets.h"
 #include "qapi-visit.h"
+#include "net/colo-base.h"
+#include "trace.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
 OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
+#define MAX_QUEUE_SIZE 1024
 
+/*
+  + CompareState ++
+  |   |
+  +---+   +---+ +---+
+  |conn list  +--->conn   +->conn   |
+  +---+   +---+ +---+
+  |   | |   | |  |
+

Re: [Qemu-devel] [PATCH v5 04/10] block: Support meta dirty bitmap

2016-07-15 Thread Max Reitz

On 15.07.2016 14:04, Max Reitz wrote:
> On 14.07.2016 22:00, John Snow wrote:
>> On 06/22/2016 11:53 AM, Max Reitz wrote:
>>> On 03.06.2016 06:32, Fam Zheng wrote:
 The added group of operations enables tracking of the changed bits in
 the dirty bitmap.

 Signed-off-by: Fam Zheng 
 ---
  block/dirty-bitmap.c | 52 
 
  include/block/dirty-bitmap.h |  9 
  2 files changed, 61 insertions(+)

 diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
 index 628b77c..9c53c56 100644
 --- a/block/dirty-bitmap.c
 +++ b/block/dirty-bitmap.c
 @@ -38,6 +38,7 @@
   */
  struct BdrvDirtyBitmap {
  HBitmap *bitmap;/* Dirty sector bitmap implementation */
 +HBitmap *meta;  /* Meta dirty bitmap */
  BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status 
 */
  char *name; /* Optional non-empty unique ID */
  int64_t size;   /* Size of the bitmap (Number of sectors) 
 */
 @@ -103,6 +104,56 @@ BdrvDirtyBitmap 
 *bdrv_create_dirty_bitmap(BlockDriverState *bs,
  return bitmap;
  }
  
 +/* bdrv_create_meta_dirty_bitmap
 + *
 + * Create a meta dirty bitmap that tracks the changes of bits in @bitmap. 
 I.e.
 + * when a dirty status bit in @bitmap is changed (either from reset to 
 set or
 + * the other way around), its respective meta dirty bitmap bit will be 
 marked
 + * dirty as well.
>>>
>>> Not wrong, but I'd like a note here that this is not an
>>> when-and-only-when relationship, i.e. that bits in the meta bitmap may
>>> be set even without the tracked bits in the dirty bitmap having changed.
>>>
>>
>> How?
>>
>> You mean, if the caller manually starts setting things in the meta
>> bitmap object?
>>
>> That's their fault then, isn't it?
> 
> No, I mean something that I mentioned in a reply to some previous
> version (the patch adding the test):
> 
> http://lists.nongnu.org/archive/html/qemu-block/2016-03/msg00332.html
> 
> Fam's reply is here:
> 
> http://lists.nongnu.org/archive/html/qemu-block/2016-06/msg00097.html
> 
> (Interesting how that reply took nearly three months and yours took
> nearly one, there most be something about this series that makes
> replying to replies very cumbersome :-))

I just remembered that it's very much justified now, as you have only
recently adopted this series.

It's just always funny to get a “What are you talking about?” reply to
some nagging I sent out long enough in the past that I can't even
remember myself, so I have to look it up, too.

Sorry :-)

Max

> What I meant by “then it would update meta” is that it would update *all
> of the range* even though only a single bit has actually been changed.
> 
> So the answer to your “how” is: See patch 2, the changes to
> hbitmap_set() (and hbitmap_reset()). If any of the bits in the given
> range is changed, all of the range is marked as having changed in the
> meta bitmap.
> 
> So all we guarantee is that every time a bit is changed, the
> corresponding bit in the meta bitmap will be set. But we do not
> guarantee that a bit in the meta bitmap stays cleared as long as the
> corresponding range of the underlying bitmap stays the same.
> 
> Max
> 
>>
>>> Maybe this should be mentioned somewhere in patch 2, too. Or maybe only
>>> in patch 2.
>>>
 + *
 + * @bitmap: the block dirty bitmap for which to create a meta dirty 
 bitmap.
 + * @chunk_size: how many bytes of bitmap data does each bit in the meta 
 bitmap
 + * track.
 + */
 +void bdrv_create_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap,
 +   int chunk_size)
 +{
 +assert(!bitmap->meta);
 +bitmap->meta = hbitmap_create_meta(bitmap->bitmap,
 +   chunk_size * BITS_PER_BYTE);
 +}
 +
 +void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap)
 +{
 +assert(bitmap->meta);
 +hbitmap_free_meta(bitmap->bitmap);
 +bitmap->meta = NULL;
 +}
 +
 +int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
 +   BdrvDirtyBitmap *bitmap, int64_t sector,
 +   int nb_sectors)
 +{
 +uint64_t i;
 +int gran = bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
 +
 +/* To optimize: we can make hbitmap to internally check the range in a
 + * coarse level, or at least do it word by word. */
>>>
>>> We could also multiply gran by the granularity of the meta bitmap. Each
>>> bit of the meta bitmap tracks at least eight bits of the dirty bitmap,
>>> so we're calling hbitmap_get() at least eight times as often as
>>> necessary here.
>>>
>>> Or we just use int gran = hbitmap_granularity(bitmap->meta);.
>>>
>>> Not exactly

[Qemu-devel] [PATCH] net: fix incorrect argument to iov_to_buf

2016-07-15 Thread Paolo Bonzini

Coverity reports a "suspicious sizeof" which is indeed wrong.

Signed-off-by: Paolo Bonzini 
---
 net/eth.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/eth.c b/net/eth.c
index 95fe15c..0be59c2 100644
--- a/net/eth.c
+++ b/net/eth.c
@@ -418,7 +418,7 @@ _eth_get_rss_ex_dst_addr(const struct iovec *pkt, int 
pkt_frags,
 
 bytes_read = iov_to_buf(pkt, pkt_frags,
 rthdr_offset + sizeof(*ext_hdr),
-dst_addr, sizeof(dst_addr));
+dst_addr, sizeof(*dst_addr));
 
 return bytes_read == sizeof(dst_addr);
 }
@@ -467,7 +467,7 @@ _eth_get_rss_ex_src_addr(const struct iovec *pkt, int 
pkt_frags,
 
 bytes_read = iov_to_buf(pkt, pkt_frags,
 opt_offset + sizeof(opthdr),
-src_addr, sizeof(src_addr));
+src_addr, sizeof(*src_addr));
 
 return bytes_read == sizeof(src_addr);
 }
-- 
2.7.4

Re: [Qemu-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Juergen Gross

On 15/07/16 11:03, Peter Lieven wrote:
> Am 15.07.2016 um 10:47 schrieb Juergen Gross:
>> On 15/07/16 09:39, Peter Lieven wrote:
>>> Am 15.07.2016 um 08:32 schrieb Juergen Gross:
 Commit 095497ffc66b7f031ff2a17f1e50f5cb105ce588 ("vnc-enc-tight: use
 thread local storage for palette") introduced a regression with Xen:
 Since this commit qemu used as a device model is no longer capable
 to register Xenstore watches (that's the effect visible to a user).
 Reverting this commit makes qemu behave well again. I have no idea
 why that commit would have this effect with Xen, may be some memory
 is clobbered?
>>> I personally have no idea, maybe @Paolo has?
>>>
>>> Maybe the corruption happens somewhere else and is just visible
>>> due to this change.
>>>
>>> Do you see sth when you ran qemu/xen in valgrind?
>> Nothing scaring and no real difference between working and not working
>> variant.
>>
>> Meanwhile I've been digging a little bit deeper and found the reason:
>> libxenstore is setting up a reader thread which is waiting for the
>> watch to fire. With above commit the stack size of that thread (16kB)
>> is too small. Setting it to 32kB made qemu work again.
>>
>> So I'd recommend to have just a thread local palette pointer and
>> allocate the palette when needed and don't free it after using it but
>> keep it for reuse. Do you want to write that patch or should I do it?
> 
> As you like. But as I have introduced this regression, maybe I should
> fix it ;-)

Sure.

> Actually I do not understand what libxenstore confuses about 16 and 32kB,
> but I have no knowledge about XEN. However, let me know if this here works 
> for you:

Thanks, is working again. You can add

Tested-by: Juergen Gross 

when submitting it.

The thread local static variable you added is occupying stack space in
each thread started by qemu. As it is rather large (about 6kB on a 64
bit system) the stack of the thread created by libxenstore is exhausted
resulting in a failing library call.


Juergen

> 
> diff --git a/ui/vnc-enc-tight.c b/ui/vnc-enc-tight.c
> index b8581dd..5731cf6 100644
> --- a/ui/vnc-enc-tight.c
> +++ b/ui/vnc-enc-tight.c
> @@ -1457,11 +1457,18 @@ static int send_sub_rect_jpeg(VncState *vs, int x, 
> int y, int w, int h,
>  }
>  #endif
>  
> -static __thread VncPalette color_count_palette;
> +static __thread VncPalette *color_count_palette = NULL;
> +static __thread Notifier vnc_tight_cleanup_notifier;
> +
> +static void vnc_tight_cleanup(Notifier *n, void *value)
> +{
> +printf("thread %d: free tight palette %p\n", qemu_get_thread_id(), 
> color_count_palette);
> +g_free(color_count_palette);
> +color_count_palette = NULL;
> +}
>  
>  static int send_sub_rect(VncState *vs, int x, int y, int w, int h)
>  {
> -VncPalette *palette = _count_palette;
>  uint32_t bg = 0, fg = 0;
>  int colors;
>  int ret = 0;
> @@ -1470,6 +1477,13 @@ static int send_sub_rect(VncState *vs, int x, int y, 
> int w, int h)
>  bool allow_jpeg = true;
>  #endif
>  
> +if (!color_count_palette) {
> +color_count_palette = g_malloc(sizeof(VncPalette));
> +vnc_tight_cleanup_notifier.notify = vnc_tight_cleanup;
> +qemu_thread_atexit_add(_tight_cleanup_notifier);
> +printf("thread %d: alloc tight palette %p\n", qemu_get_thread_id(), 
> color_count_palette);
> +}
> +
>  vnc_framebuffer_update(vs, x, y, w, h, vs->tight.type);
>  
>  vnc_tight_start(vs);
> @@ -1490,17 +1504,17 @@ static int send_sub_rect(VncState *vs, int x, int y, 
> int w, int h)
>  }
>  #endif
>  
> -colors = tight_fill_palette(vs, x, y, w * h, , , palette);
> +colors = tight_fill_palette(vs, x, y, w * h, , , 
> color_count_palette);
>  
>  #ifdef CONFIG_VNC_JPEG
>  if (allow_jpeg && vs->tight.quality != (uint8_t)-1) {
> -ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, palette,
> +ret = send_sub_rect_jpeg(vs, x, y, w, h, bg, fg, colors, 
> color_count_palette,
>   force_jpeg);
>  } else {
> -ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
> +ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, 
> color_count_palette);
>  }
>  #else
> -ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, palette);
> +ret = send_sub_rect_nojpeg(vs, x, y, w, h, bg, fg, colors, 
> color_count_palette);
>  #endif
>  
>  return ret;
> 
> Peter
> 
>

Re: [Qemu-devel] [PATCH] vnc-enc-tight: fix off-by-one bug

2016-07-15 Thread Gerd Hoffmann

On Di, 2016-07-12 at 17:31 +0800, Herongguang (Stephen) wrote:
> In tight_encode_indexed_rect32, buf(or src)’s size is count. In for loop,
> the logic is supposed to be that i is an index into src, i should be
> incremented when incrementing src.
> 
> This is broken when src is incremented but i is not before while loop,
> resulting in off-by-one bug in while loop.
> 
> Signed-off-by: He Rongguang 

Added to vnc queue.

Patch is whitespace mangled, had to use "patch --ignore-whitespace" to
get it applied.  Can you please use 'git send-email' to send patches in
the future?  That is the best way to avoid your mail client breaking
patches.

thanks,
  Gerd

Re: [Qemu-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Paolo Bonzini



On 15/07/2016 12:12, Gerd Hoffmann wrote:
> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>>
>> On 15/07/2016 10:47, Juergen Gross wrote:
>>> Nothing scaring and no real difference between working and not working
>>> variant.
>>>
>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>> libxenstore is setting up a reader thread which is waiting for the
>>> watch to fire. With above commit the stack size of that thread (16kB)
>>> is too small. Setting it to 32kB made qemu work again.
>>
>> This makes very little sense (not your fault)...  The commit doesn't
>> change stack usage at all, TLS should not be on the stack.
>>
>> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
>> it's only due to inlining decision on the compiler, in which case
>> Peter's patch from today is only a bandaid.
> 
> Hmm, I guess I hold off the vnc pull request for now ...

Go ahead.  I looked at glibc source code and the patch is okay.

Paolo

Re: [Qemu-devel] [PATCH V2 1/1] linux-aio: prevent submitting more than MAX_EVENTS

2016-07-15 Thread Paolo Bonzini



On 15/07/2016 12:17, Roman Penyaev wrote:
> On Fri, Jul 15, 2016 at 11:58 AM, Paolo Bonzini  wrote:
>>
>>
>> On 15/07/2016 11:18, Roman Penyaev wrote:
>>> Those 3 red spikes and a blue hill is what we have to focus on.  The
>>> blue hill at the right corner of the chart means that almost always the
>>> ring buffer was observed as full, i.e. qemu_laio_completion_bh() got
>>> a chance to reap completions not very often, meanwhile completed
>>> requests stand in the ring buffer for quite a long time which degrades
>>> the overall performance.
>>>
>>> The results covered by the red line are much better and that can be
>>> explained by those 3 red spikes, which are almost in the middle of the
>>> whole distribution, i.e. qemu_laio_completion_bh() is called more often,
>>> completed requests do not stall, giving fio a chance to submit new fresh
>>> requests.
>>>
>>> The theoretical fix would be to schedule completion BH just after
>>> successful io_submit, i.e.:
>>
>> What about removing the qemu_bh_cancel but keeping the rest of the patch?
> 
> That exactly what I did.  Numbers go to expected from ~1600MB/s to ~1800MB/s.
> So basically this hunk of the debatable patch:
> 
>  if (event_notifier_test_and_clear(>e)) {
> -qemu_bh_schedule(s->completion_bh);
> +qemu_laio_completion_bh(s);
>  }
> 
> does not have any impact and can be ignored.  At least I did not notice
> anything important.
> 
>>
>> I'm also interested in a graph with this patch ("linux-aio: prevent
>> submitting more than MAX_EVENTS") on top of origin/master.
> 
> I can plot it also of course.
> 
>>
>> Thanks for the analysis.  Sometimes a picture _is_ worth a thousand
>> words, even if it's measuring "only" second-order effects (# of
>> completions is not what causes the slowdown, but # of completions
>> affects latency which causes the slowdown).
> 
> Yes, you are right, latency.  With userspace io_getevents ~0 costs we
> can peek requests as often as we like to decrease latency on very
> fast devices.  That can also bring something.  Probably after each
> io_submit() it makes sense to peek and complete something.

Right, especially 1) because io_getevents with timeout 0 is cheap (it
peeks at the ring buffer before the syscall); 2) because we want anyway
to replace io_getevents with userspace code through your other patch.

Paolo

Re: [Qemu-devel] [Qemu-block] [PATCH] aio_ctx_check: follow CODING_STYLE

2016-07-15 Thread Cao jin




On 07/15/2016 06:40 PM, Stefan Hajnoczi wrote:

On Fri, Jul 15, 2016 at 09:48:50AM +0800, Cao jin wrote:

On 07/14/2016 10:08 PM, Eric Blake wrote:

On 07/14/2016 07:10 AM, Cao jin wrote:

replace tab with spaces

Signed-off-by: Cao jin 
---
   async.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)


Whitespace-only changes are best done as part of a series that is
already touching nearby code for other reasons (depending on the size of
the whitespace changes and on the rest of your patch, it may be okay to
squash the whitespace change in place, or better to split into separate
patches to make review of both patches easier).  Otherwise, it just
makes 'git blame' output dirtier.


I see.
Since async.c & aio-posix.c are belong to the same maintaiers, so, Fam &
Stefan, is it ok to squash this into that "remove useless parameter" patch?
If not, we can just forget this one.


The "remove useless parameter" patch doesn't touch the function you are
modifying here.  Please don't squash the patches.

Since you have already posted this patch I will merge it.  In the future
please don't submit whitespace changes, tiny coding style cleanups, etc
in by themselves.

Thanks for all your contributions.  I do not want to discourage you but
my view is that code changes should only be made if they fix a bug,
improve performance measurably, add a feature, or significantly improve
the code.

Every patch has a cost in terms of code review, merging/testing, backporting,
bisecting, documentation, etc.  We could discuss each of these in detail
but basically a code change creates work or takes time from one or more
people in these areas.

In a perfect world with unlimited resources all patches would be equally
welcome.  Due to limited resources it's best to submit the types of
patches I mentioned above where the cost/benefit ratio is favorable.

Thanks,
Stefan



Thanks Stefan, and sorry for the inconvenience brought to you. I thought 
this kind of tiny things would be very simple for maintainers, now I 
understand

--
Yours Sincerely,

Cao jin

[Qemu-devel] [RFC PATCH V6 5/6] colo-compare: introduce packet comparison thread

2016-07-15 Thread Zhang Chen

If primary packet is same with secondary packet,
we will send primary packet and drop secondary
packet, otherwise notify COLO frame to do checkpoint.
If primary packet comes and secondary packet not,
after REGULAR_PACKET_CHECK_MS milliseconds we set
the primary packet as old_packet,then do a checkpoint.

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
 net/colo-base.c|   1 +
 net/colo-base.h|   3 +
 net/colo-compare.c | 216 +
 trace-events   |   2 +
 4 files changed, 222 insertions(+)

diff --git a/net/colo-base.c b/net/colo-base.c
index 7e91dec..eb1b631 100644
--- a/net/colo-base.c
+++ b/net/colo-base.c
@@ -132,6 +132,7 @@ Packet *packet_new(const void *data, int size)
 
 pkt->data = g_memdup(data, size);
 pkt->size = size;
+pkt->creation_ms = qemu_clock_get_ms(QEMU_CLOCK_HOST);
 
 return pkt;
 }
diff --git a/net/colo-base.h b/net/colo-base.h
index 0505608..06d6dca 100644
--- a/net/colo-base.h
+++ b/net/colo-base.h
@@ -17,6 +17,7 @@
 
 #include "slirp/slirp.h"
 #include "qemu/jhash.h"
+#include "qemu/timer.h"
 
 #define HASHTABLE_MAX_SIZE 16384
 
@@ -28,6 +29,8 @@ typedef struct Packet {
 };
 uint8_t *transport_layer;
 int size;
+/* Time of packet creation, in wall clock ms */
+int64_t creation_ms;
 } Packet;
 
 typedef struct ConnectionKey {
diff --git a/net/colo-compare.c b/net/colo-compare.c
index 5f87710..942e326 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -36,6 +36,8 @@
 
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
+/* TODO: Should be configurable */
+#define REGULAR_PACKET_CHECK_MS 3000
 
 /*
   + CompareState ++
@@ -83,6 +85,10 @@ typedef struct CompareState {
 GQueue unprocessed_connections;
 /* proxy current hash size */
 uint32_t hashtable_size;
+/* compare thread, a thread for each NIC */
+QemuThread thread;
+/* Timer used on the primary to find packets that are never matched */
+QEMUTimer *timer;
 } CompareState;
 
 typedef struct CompareClass {
@@ -170,6 +176,112 @@ static int packet_enqueue(CompareState *s, int mode)
 return 0;
 }
 
+/*
+ * The IP packets sent by primary and secondary
+ * will be compared in here
+ * TODO support ip fragment, Out-Of-Order
+ * return:0  means packet same
+ *> 0 || < 0 means packet different
+ */
+static int colo_packet_compare(Packet *ppkt, Packet *spkt)
+{
+trace_colo_compare_ip_info(ppkt->size, inet_ntoa(ppkt->ip->ip_src),
+   inet_ntoa(ppkt->ip->ip_dst), spkt->size,
+   inet_ntoa(spkt->ip->ip_src),
+   inet_ntoa(spkt->ip->ip_dst));
+
+if (ppkt->size == spkt->size) {
+return memcmp(ppkt->data, spkt->data, spkt->size);
+} else {
+return -1;
+}
+}
+
+static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
+{
+trace_colo_compare_main("compare all");
+return colo_packet_compare(ppkt, spkt);
+}
+
+static void colo_old_packet_check_one(void *opaque_packet,
+  void *opaque_found)
+{
+int64_t now;
+bool *found_old = (bool *)opaque_found;
+Packet *ppkt = (Packet *)opaque_packet;
+
+if (*found_old) {
+/* Someone found an old packet earlier in the queue */
+return;
+}
+
+now = qemu_clock_get_ms(QEMU_CLOCK_HOST);
+if ((now - ppkt->creation_ms) > REGULAR_PACKET_CHECK_MS) {
+trace_colo_old_packet_check_found(ppkt->creation_ms);
+*found_old = true;
+}
+}
+
+static void colo_old_packet_check_one_conn(void *opaque,
+   void *user_data)
+{
+bool found_old = false;
+Connection *conn = opaque;
+
+g_queue_foreach(>primary_list, colo_old_packet_check_one,
+_old);
+if (found_old) {
+/* do checkpoint will flush old packet */
+/* TODO: colo_notify_checkpoint();*/
+}
+}
+
+/*
+ * Look for old packets that the secondary hasn't matched,
+ * if we have some then we have to checkpoint to wake
+ * the secondary up.
+ */
+static void colo_old_packet_check(void *opaque)
+{
+CompareState *s = opaque;
+
+g_queue_foreach(>conn_list, colo_old_packet_check_one_conn, NULL);
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare connection
+ */
+static void colo_compare_connection(void *opaque, void *user_data)
+{
+CompareState *s = user_data;
+Connection *conn = opaque;
+Packet *pkt = NULL;
+GList *result = NULL;
+int ret;
+
+while (!g_queue_is_empty(>primary_list) &&
+   !g_queue_is_empty(>secondary_list)) {
+pkt = g_queue_pop_tail(>primary_list);
+result = g_queue_find_custom(>secondary_list,
+  pkt, (GCompareFunc)colo_packet_compare_all);
+
+if (result) {
+

[Qemu-devel] [PATCH] e1000e: fix incorrect access to pointer

2016-07-15 Thread Paolo Bonzini

This is not dereferencing the pointer, and instead checking only
the value of the pointer.

Signed-off-by: Paolo Bonzini 
---
 hw/net/e1000e_core.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/net/e1000e_core.c b/hw/net/e1000e_core.c
index 6050d8b..badb1fe 100644
--- a/hw/net/e1000e_core.c
+++ b/hw/net/e1000e_core.c
@@ -281,7 +281,7 @@ e1000e_intrmgr_delay_rx_causes(E1000ECore *core, uint32_t 
*causes)
 
 /* Check if delayed RX interrupts disabled by client
or if there are causes that cannot be delayed */
-if ((rdtr == 0) || (causes != 0)) {
+if ((rdtr == 0) || (*causes != 0)) {
 return false;
 }
 
@@ -322,7 +322,7 @@ e1000e_intrmgr_delay_tx_causes(E1000ECore *core, uint32_t 
*causes)
 *causes &= ~delayable_causes;
 
 /* If there are causes that cannot be delayed */
-if (causes != 0) {
+if (*causes != 0) {
 return false;
 }
 
-- 
2.7.4

Re: [Qemu-devel] [PATCH 1/1] spapr: Ensure CPU cores are added contiguously and removed in LIFO order

2016-07-15 Thread Igor Mammedov

On Fri, 15 Jul 2016 15:29:01 +1000
David Gibson  wrote:

> On Thu, Jul 14, 2016 at 10:27:15AM +0200, Igor Mammedov wrote:
> > On Thu, 14 Jul 2016 10:51:27 +1000
> > David Gibson  wrote:
> >   
> > > On Wed, Jul 13, 2016 at 12:20:20PM +0530, Bharata B Rao wrote:  
> > > > If CPU core addition or removal is allowed in random order leading to
> > > > holes in the core id range (and hence in the cpu_index range), migration
> > > > can fail as migration with holes in cpu_index range isn't yet handled
> > > > correctly.
> > > > 
> > > > Prevent this situation by enforcing the addition in contiguous order
> > > > and removal in LIFO order so that we never end up with holes in
> > > > cpu_index range.
> > > > 
> > > > Signed-off-by: Bharata B Rao 
> > > > ---
> > > > While there is work in progress to support migration when there are 
> > > > holes
> > > > in cpu_index range resulting from out-of-order plug or unplug, this 
> > > > patch
> > > > is intended as a last resort if no easy, risk-free and elegant solution
> > > > emerges before 2.7 dev cycle ends.
> > > 
> > > Applied to ppc-for-2.7.  We can revert it once the problems with
> > > cpu_index are sorted out.  
> > You'd need to add machine type specific compat option here,
> > so that new-qemu -M 2.7 wouldn't allow out of order too and
> > could be migrated to old-qemu -M 2.7  
> 
> Hmm, do we care about migration from newer back to older versions of
> qemu upstream?
upstream we don't, though it would reduce maintenance head-ache.
(if isn't hard then why not do it upstream either)

[Qemu-devel] [PATCH] checkpatch: consider git extended headers valid patches

2016-07-15 Thread Stefan Hajnoczi

Renames look like this with git-diff(1) when diff.renames = true is set:

  diff --git a/a b/b
  similarity index 100%
  rename from a
  rename to b

This raises the "Does not appear to be a unified-diff format patch"
error because checkpatch.pl only considers a diff valid if it contains
at least one "@@" hunk.

This patch accepts renames and copies too so that checkpatch.pl exits
successfully when a diff only renames/copies files.  The git diff
extended header format is described on the git-diff(1) man page.

Reported-by: Colin Lord 
Signed-off-by: Stefan Hajnoczi 
---
 scripts/checkpatch.pl | 5 +
 1 file changed, 5 insertions(+)

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index cf32c8f..afa7f79 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -1279,6 +1279,11 @@ sub process {
}
}
 
+# Accept git diff extended headers as valid patches
+   if ($line =~ /^(?:rename|copy) (?:from|to) [\w\/\.\-]+\s*$/) {
+   $is_patch = 1;
+   }
+
 #check the patch for a signoff:
if ($line =~ /^\s*signed-off-by:/i) {
# This is a signoff, if ugly, so do not double report.
-- 
2.7.4

Re: [Qemu-devel] [PATCH 7/8] pc: acpi: memhp: nvdimm hotplug support

2016-07-15 Thread Stefan Hajnoczi

On Fri, Jul 15, 2016 at 03:49:12PM +0800, Xiao Guangrong wrote:
> 
> 
> On 07/14/2016 08:17 PM, Stefan Hajnoczi wrote:
> > On Mon, Jul 11, 2016 at 09:45:17PM +0800, Xiao Guangrong wrote:
> > > diff --git a/hw/acpi/memory_hotplug.c b/hw/acpi/memory_hotplug.c
> > > index 73fa62d..d1c2e92 100644
> > > --- a/hw/acpi/memory_hotplug.c
> > > +++ b/hw/acpi/memory_hotplug.c
> > > @@ -239,11 +239,6 @@ void acpi_memory_plug_cb(HotplugHandler 
> > > *hotplug_dev, MemHotplugState *mem_st,
> > >DeviceState *dev, Error **errp)
> > >   {
> > >   MemStatus *mdev;
> > > -DeviceClass *dc = DEVICE_GET_CLASS(dev);
> > > -
> > > -if (!dc->hotpluggable) {
> > > -return;
> > > -}
> > > 
> > >   mdev = acpi_memory_slot_status(mem_st, dev, errp);
> > >   if (!mdev) {
> > 
> > Did you mean to include this hunk in the patch?  Looks like something
> > left over from debugging/prototyping.
> > 
> 
> As all memory devices, both pc-dimm and nvdimm, have supported hotplug now, i
> think this chunk can be removed.
> 
> In fact, this check was introduced by nvdimm. :)

Fair enough.

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v3 13/14] virtio-gpu: Wrap in vmstate

2016-07-15 Thread Gerd Hoffmann

On Do, 2016-07-14 at 18:22 +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Forcibly convert it to a vmstate wrapper;  proper conversion
> comes later.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> Reviewed-by: Cornelia Huck 
> ---
>  hw/display/virtio-gpu.c | 17 +++--
>  1 file changed, 7 insertions(+), 10 deletions(-)

Reviewed-by: Gerd Hoffmann

Re: [Qemu-devel] [PATCH v3 11/14] virtio-input: Wrap in vmstate

2016-07-15 Thread Gerd Hoffmann

On Do, 2016-07-14 at 18:22 +0100, Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" 
> 
> Forcibly convert it to a vmstate wrapper;  proper conversion
> comes later.
> 
> Signed-off-by: Dr. David Alan Gilbert 
> Reviewed-by: Cornelia Huck 

Reviewed-by: Gerd Hoffmann

[Qemu-devel] [PATCH v4] aio-posix: remove useless parameter

2016-07-15 Thread Cao jin

Parameter **errp of aio_context_setup() is useless, remove it
and clean up the related code.

Cc: Stefan Hajnoczi 
Cc: Fam Zheng 
Cc: Eric Blake 
Signed-off-by: Cao jin 
---
 aio-posix.c | 3 ++-
 aio-win32.c | 2 +-
 async.c | 8 ++--
 include/block/aio.h | 2 +-
 4 files changed, 6 insertions(+), 9 deletions(-)

v4 changelog:
1. replace plain errno with strerror(errno)  (Eric)

diff --git a/aio-posix.c b/aio-posix.c
index 6006122..43162a9 100644
--- a/aio-posix.c
+++ b/aio-posix.c
@@ -485,12 +485,13 @@ bool aio_poll(AioContext *ctx, bool blocking)
 return progress;
 }
 
-void aio_context_setup(AioContext *ctx, Error **errp)
+void aio_context_setup(AioContext *ctx)
 {
 #ifdef CONFIG_EPOLL_CREATE1
 assert(!ctx->epollfd);
 ctx->epollfd = epoll_create1(EPOLL_CLOEXEC);
 if (ctx->epollfd == -1) {
+fprintf(stderr, "Failed to create epoll instance: %s", 
strerror(errno));
 ctx->epoll_available = false;
 } else {
 ctx->epoll_available = true;
diff --git a/aio-win32.c b/aio-win32.c
index 6aaa32a..c8c249e 100644
--- a/aio-win32.c
+++ b/aio-win32.c
@@ -371,6 +371,6 @@ bool aio_poll(AioContext *ctx, bool blocking)
 return progress;
 }
 
-void aio_context_setup(AioContext *ctx, Error **errp)
+void aio_context_setup(AioContext *ctx)
 {
 }
diff --git a/async.c b/async.c
index fb7dd92..8589017 100644
--- a/async.c
+++ b/async.c
@@ -327,14 +327,10 @@ AioContext *aio_context_new(Error **errp)
 {
 int ret;
 AioContext *ctx;
-Error *local_err = NULL;
 
 ctx = (AioContext *) g_source_new(_source_funcs, sizeof(AioContext));
-aio_context_setup(ctx, _err);
-if (local_err) {
-error_propagate(errp, local_err);
-goto fail;
-}
+aio_context_setup(ctx);
+
 ret = event_notifier_init(>notifier, false);
 if (ret < 0) {
 error_setg_errno(errp, -ret, "Failed to initialize event notifier");
diff --git a/include/block/aio.h b/include/block/aio.h
index 88a64ee..0922b69 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -439,6 +439,6 @@ static inline bool aio_node_check(AioContext *ctx, bool 
is_external)
  *
  * Initialize the aio context.
  */
-void aio_context_setup(AioContext *ctx, Error **errp);
+void aio_context_setup(AioContext *ctx);
 
 #endif
-- 
2.1.0

Re: [Qemu-devel] [PATCH] vnc-tight: fix regression with libxenstore

2016-07-15 Thread Paolo Bonzini



On 15/07/2016 12:10, Peter Lieven wrote:
> Am 15.07.2016 um 12:07 schrieb Gerd Hoffmann:
>> On Fr, 2016-07-15 at 11:45 +0200, Peter Lieven wrote:
>>> commit 095497ff added thread local storage for the color counting
>>> palette. Unfortunately, a VncPalette is about 7kB on a x86_64 system.
>>> This memory is reserved from the stack of every thread and it
>>> exhausted the stack space of a libxenstore thread.
>>>
>>> Fix this by allocating memory only for the VNC encoding thread.
>> Added to vnc queue.
> 
> Please wait. Paolo mentioned that TLS is not allocated from the stack.

Actually it does---which is not a problem, but then the stack size from
pthread attributes should be increased IMHO.  Anyway, the patch is okay.

Paolo

> Maybe this patch is ok, but we need a different commit message then.

[Qemu-devel] [RFC PATCH V6 1/6] colo-compare: introduce colo compare initialization

2016-07-15 Thread Zhang Chen

This a COLO net ascii figure:

 Primary qemu   
Secondary qemu
+--+   
++
| +-+  |   |  
+---+ |
| | |  |   |  | 
  | |
| |guest|  |   |  | 
   guest  | |
| | |  |   |  | 
  | |
| +---^--+--+  |   |  
+-+++ |
| |  | |   |
^|  |
| |  | |   |
||  |
| |  +--+  |
||  |
|netfilter|  |   | ||  |   
netfilter||  |
| +--+ ---+||  |  
+---+ |
| |   |  |   ||||  |  | 
||  filter excute order   | |
| |   |  |   ||||  |  | 
|| +--->  | |
| |   |  |   ||||  |  | 
||   TCP  | |
| | +-+--+--+ +--v-+  | ++ ||  |  | 
++  +---++---v+rewriter++  ++ | |
| | |   | ||  | || ||  |  | |   
 |  ||  |  || | |
| | |  filter   | |   filter   +>   colo <+ +>  
filter   +--> adjust |   adjust +-->   filter   | | |
| | |  mirror   | | redirector |  | |  compare   | |  ||  | | 
redirector |  | ack|   seq|  | redirector | | |
| | |   | ||  | || |  ||  | |   
 |  ||  |  || | |
| | +^--+ ++  | +-+--+ |  ||  | 
++  ++--+  +---++ | |
| |  | tx rx  |   ||  ||  | 
   txall   |  rx  | |
| |  ||   ||  ||  
+---+ |
| |  ||   ||  ||
   ||
| |  |   filter excute order  |   ||  ||
   ||
| |  |  +---> |   ||  
++|
| +---+   ||   |
|
||||   |
|
+--+   
++
 |guest receive   |guest send
 ||
++v+
|  |
  NOTE: filter direction is rx/tx/all
| tap  |
  rx:receive packets sent to the netdev
|  |
  tx:receive packets sent by the netdev
+--+

In COLO-compare.
Packets coming from the primary char indev will be sent to outdev
Packets coming from the secondary char dev will be dropped
colo-comapre need two input chardev and one output chardev:
primary_in=chardev1-id
secondary_in=chardev2-id
outdev=chardev3-id

usage:

primary:
-netdev tap,id=hn0,vhost=off,script=/etc/qemu-ifup,downscript=/etc/qemu-ifdown
-device

[Qemu-devel] [RFC PATCH V6 3/6] Jhash: add linux kernel jhashtable in qemu

2016-07-15 Thread Zhang Chen

Jhash used by colo-compare and filter-rewriter
to save and lookup net connection info

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
 include/qemu/jhash.h | 61 
 1 file changed, 61 insertions(+)
 create mode 100644 include/qemu/jhash.h

diff --git a/include/qemu/jhash.h b/include/qemu/jhash.h
new file mode 100644
index 000..0fcd875
--- /dev/null
+++ b/include/qemu/jhash.h
@@ -0,0 +1,61 @@
+/* jhash.h: Jenkins hash support.
+  *
+  * Copyright (C) 2006. Bob Jenkins (bob_jenk...@burtleburtle.net)
+  *
+  * http://burtleburtle.net/bob/hash/
+  *
+  * These are the credits from Bob's sources:
+  *
+  * lookup3.c, by Bob Jenkins, May 2006, Public Domain.
+  *
+  * These are functions for producing 32-bit hashes for hash table lookup.
+  * hashword(), hashlittle(), hashlittle2(), hashbig(), mix(), and final()
+  * are externally useful functions.  Routines to test the hash are
+included
+  * if SELF_TEST is defined.  You can use this free for any purpose.
+It's in
+  * the public domain.  It has no warranty.
+  *
+  * Copyright (C) 2009-2010 Jozsef Kadlecsik (kad...@blackhole.kfki.hu)
+  *
+  * I've modified Bob's hash to be useful in the Linux kernel, and
+  * any bugs present are my fault.
+  * Jozsef
+  */
+
+#ifndef QEMU_JHASH_H__
+#define QEMU_JHASH_H__
+
+#include "qemu/bitops.h"
+
+/*
+ * hashtable relation copy from linux kernel jhash
+ */
+
+/* __jhash_mix -- mix 3 32-bit values reversibly. */
+#define __jhash_mix(a, b, c)\
+{   \
+a -= c;  a ^= rol32(c, 4);  c += b; \
+b -= a;  b ^= rol32(a, 6);  a += c; \
+c -= b;  c ^= rol32(b, 8);  b += a; \
+a -= c;  a ^= rol32(c, 16); c += b; \
+b -= a;  b ^= rol32(a, 19); a += c; \
+c -= b;  c ^= rol32(b, 4);  b += a; \
+}
+
+/* __jhash_final - final mixing of 3 32-bit values (a,b,c) into c */
+#define __jhash_final(a, b, c)  \
+{   \
+c ^= b; c -= rol32(b, 14);  \
+a ^= c; a -= rol32(c, 11);  \
+b ^= a; b -= rol32(a, 25);  \
+c ^= b; c -= rol32(b, 16);  \
+a ^= c; a -= rol32(c, 4);   \
+b ^= a; b -= rol32(a, 14);  \
+c ^= b; c -= rol32(b, 24);  \
+}
+
+/* An arbitrary initial parameter */
+#define JHASH_INITVAL   0xdeadbeef
+
+#endif /* QEMU_JHASH_H__ */
-- 
2.7.4

Re: [Qemu-devel] [PATCH] Move README to markdown

2016-07-15 Thread Stefan Hajnoczi

On Fri, Jul 15, 2016 at 12:31:11AM -0400, Pranith Kumar wrote:
> Move the README file to markdown so that it makes the github page look
> prettier. I know that github repo is a mirror and not the official
> repo, but I think it doesn't hurt to have it in markdown format.
> 
> Signed-off-by: Pranith Kumar 
> ---
>  README => README.md | 41 -
>  1 file changed, 20 insertions(+), 21 deletions(-)
>  rename README => README.md (85%)

Reviewed-by: Stefan Hajnoczi 


signature.asc
Description: PGP signature

Re: [Qemu-devel] [SeaBIOS] [PATCH 5/5] [wip] sercon: initial split-output implementation

2016-07-15 Thread Gerd Hoffmann

  Hi,

> I'm okay with the cut-and-paste.  But, another option would be to use
> the iretw at the end of the existing irqentry_extrastack to implement
> the ljmpw into the main vgabios.  Something like (totally untested):
> 
> entry_10_hooked:
> pushfw  // Setup for iretw in irqentry_arg
> pushl %cs:sercon_int10_hook_resume
> 
> pushl $handle_10
> #if CONFIG_ENTRY_EXTRASTACK
> jmp irqentry_arg_extrastack
> #else
> jmp irqentry_arg
> #endif

Good idea, I'll try it.

> Separately, have you considered choosing a separate entry point for
> entry_10_hooked.  That is, changing the above pushl to
> $handle_sercon_hooked and introducing that function in sercon.c.  It
> seems it would reduce a number of "if (!sercon_splitmode())" checks in
> the main code, as handle_sercon_hooked() could just use a smaller
> switch statement and ignore requests it doesn't need to support.

Makes sense indeed.  All functions which only return information are not
needed in the splitmode case.

> Finally, one high level observation is that we know there are a number
> of quirks in various vgabios emulators.  For example, we know some
> emulators don't handle certain 32bit instructions when in 16bit mode
> (hence scripts/vgafixup.py), we know some versions of Windows use an
> emulator that doesn't like some stack relative instructions (hence the
> vgabios is compiled without -fomit-frame-pointer), and we know Windows
> Vista doesn't like the extra stack in high ram (the skifree bug).  Any
> thoughts on what happens with these quirks if the main seabios code
> hooks int10?

Good question.  Do the emulators (both win, xorg) use the int10 vector
set by seabios in the first place?  Or go they load the vgabios and run
it, including the initialization, and use whatever entry point the init
code sets up?  I suspect it is the latter.  But needs investigation and
testing.

/me places the item on the todo list.

Also a serial console for windows guests isn't that useful, so I
wouldn't worry too much about windows emulator issues.

cheers,
  Gerd

Re: [Qemu-devel] [PATCH v3 00/14] virtio migration: Flip outer layer to vmstate

2016-07-15 Thread Cornelia Huck

On Thu, 14 Jul 2016 18:22:42 +0100
"Dr. David Alan Gilbert (git)"  wrote:

> From: "Dr. David Alan Gilbert" 
> 
> Hi,
>   This series converts the outer most layer of virtio to
> use VMState macros;  this is the easy bit, but I'm hoping that
> having done that, the next trick is to nibble away at the virtio_save/load
> functions and all of the zillions of device/bus helpers.

Looks good, and I'd like to see this in 2.7.

[Qemu-devel] [PATCH] AioContext: correct comments

2016-07-15 Thread Cao jin

Correct comments of field notify_me

Cc: Kevin Wolf 
Cc: Max Reitz 
Signed-off-by: Cao jin 
---
 include/block/aio.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/block/aio.h b/include/block/aio.h
index 0922b69..f89a1df 100644
--- a/include/block/aio.h
+++ b/include/block/aio.h
@@ -71,7 +71,7 @@ struct AioContext {
  * event_notifier_set necessary.
  *
  * Bit 0 is reserved for GSource usage of the AioContext, and is 1
- * between a call to aio_ctx_check and the next call to aio_ctx_dispatch.
+ * between a call to aio_ctx_prepare and the next call to aio_ctx_check.
  * Bits 1-31 simply count the number of active calls to aio_poll
  * that are in the prepare or poll phase.
  *
-- 
2.1.0

Re: [Qemu-devel] [PATCH] ppc: Yet another fix for the huge page support detection mechanism

2016-07-15 Thread David Gibson

On Fri, Jul 15, 2016 at 10:10:25AM +0200, Thomas Huth wrote:
> Commit 86b50f2e1bef ("Disable huge page support if it is not available
> for main RAM") already made sure that huge page support is not announced
> to the guest if the normal RAM of non-NUMA configurations is not backed
> by a huge page filesystem. However, there is one more case that can go
> wrong: NUMA is enabled, but the RAM of the NUMA nodes are not configured
> with huge page support (and only the memory of a DIMM is configured with
> it). When QEMU is started with the following command line for example,
> the Linux guest currently crashes because it is trying to use huge pages
> on a memory region that does not support huge pages:
> 
>  qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \
>memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 
> \
>-device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \
>-numa node,nodeid=0 -numa node,nodeid=1
> 
> To fix this issue, we've got to make sure to disable huge page support,
> too, when there is a NUMA node that is not using a memory backend with
> huge page support.
> 
> Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740
> Signed-off-by: Thomas Huth 
> ---
>  target-ppc/kvm.c | 10 +++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
> index 884d564..7a8f555 100644
> --- a/target-ppc/kvm.c
> +++ b/target-ppc/kvm.c
> @@ -389,12 +389,16 @@ static long getrampagesize(void)
>  
>  object_child_foreach(memdev_root, find_max_supported_pagesize, );
>  
> -if (hpsize == LONG_MAX) {
> +if (hpsize == LONG_MAX || hpsize == getpagesize()) {
>  return getpagesize();
>  }
>  
> -if (nb_numa_nodes == 0 && hpsize > getpagesize()) {
> -/* No NUMA nodes and normal RAM without -mem-path ==> no huge pages! 
> */
> +/* If NUMA is disabled or the NUMA nodes are not backed with a
> + * memory-backend, then there is at least one node using "normal"
> + * RAM. And since normal RAM has not been configured with "-mem-path"
> + * (what we've checked earlier here already), we can not use huge pages!
> + */
> +if (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL) {

Is that second clause sufficient, or do you need to loop through and
check the memdev of every node?

>  static bool warned;
>  if (!warned) {
>  error_report("Huge page support disabled (n/a for main 
> memory).");

-- 
David Gibson| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson


signature.asc
Description: PGP signature

Re: [Qemu-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Peter Lieven

Am 15.07.2016 um 12:02 schrieb Paolo Bonzini:
>
> On 15/07/2016 10:47, Juergen Gross wrote:
>> Nothing scaring and no real difference between working and not working
>> variant.
>>
>> Meanwhile I've been digging a little bit deeper and found the reason:
>> libxenstore is setting up a reader thread which is waiting for the
>> watch to fire. With above commit the stack size of that thread (16kB)
>> is too small. Setting it to 32kB made qemu work again.
> This makes very little sense (not your fault)...  The commit doesn't
> change stack usage at all, TLS should not be on the stack.

But we still allocate the VncPalette for every thread, right? Even
if it has nothing todo with VNC.

Peter

Re: [Qemu-devel] [PATCH] vnc-tight: fix regression with libxenstore

2016-07-15 Thread Gerd Hoffmann

On Fr, 2016-07-15 at 11:45 +0200, Peter Lieven wrote:
> commit 095497ff added thread local storage for the color counting
> palette. Unfortunately, a VncPalette is about 7kB on a x86_64 system.
> This memory is reserved from the stack of every thread and it
> exhausted the stack space of a libxenstore thread.
> 
> Fix this by allocating memory only for the VNC encoding thread.

Added to vnc queue.

thanks,
  Gerd

Re: [Qemu-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Peter Lieven

Am 15.07.2016 um 12:12 schrieb Paolo Bonzini:
>
> On 15/07/2016 12:07, Peter Lieven wrote:
>> Am 15.07.2016 um 12:02 schrieb Paolo Bonzini:
>>> On 15/07/2016 10:47, Juergen Gross wrote:
 Nothing scaring and no real difference between working and not working
 variant.

 Meanwhile I've been digging a little bit deeper and found the reason:
 libxenstore is setting up a reader thread which is waiting for the
 watch to fire. With above commit the stack size of that thread (16kB)
 is too small. Setting it to 32kB made qemu work again.
>>> This makes very little sense (not your fault)...  The commit doesn't
>>> change stack usage at all, TLS should not be on the stack.
>> But we still allocate the VncPalette for every thread, right? Even
>> if it has nothing todo with VNC.
> Yes, I'm just trying to understand the root cause.  Which is that glibc
> actually does carve out TLS space from the requested stack size.  That
> means that a program that has a lot of TLS variables, or has big TLS
> variables, will fail in weird ways.
>
> So that's two reasons why your patch is okay. :)

Okay, then I will wait for the analysis and then resubmit that Patch
with a different commit message.

Peter

Re: [Qemu-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Paolo Bonzini



On 15/07/2016 12:07, Peter Lieven wrote:
> Am 15.07.2016 um 12:02 schrieb Paolo Bonzini:
>>
>> On 15/07/2016 10:47, Juergen Gross wrote:
>>> Nothing scaring and no real difference between working and not working
>>> variant.
>>>
>>> Meanwhile I've been digging a little bit deeper and found the reason:
>>> libxenstore is setting up a reader thread which is waiting for the
>>> watch to fire. With above commit the stack size of that thread (16kB)
>>> is too small. Setting it to 32kB made qemu work again.
>> This makes very little sense (not your fault)...  The commit doesn't
>> change stack usage at all, TLS should not be on the stack.
> 
> But we still allocate the VncPalette for every thread, right? Even
> if it has nothing todo with VNC.

Yes, I'm just trying to understand the root cause.  Which is that glibc
actually does carve out TLS space from the requested stack size.  That
means that a program that has a lot of TLS variables, or has big TLS
variables, will fail in weird ways.

So that's two reasons why your patch is okay. :)

Paolo

[Qemu-devel] [PATCH] linux-user: Fix type for SIOCATMARK ioctl

2016-07-15 Thread Peter Maydell

The SIOCATMARK ioctl takes an argument which should be a
pointer to an integer where the kernel will write the result.
We were incorrectly declaring it as TYPE_NULL which would mean
it would always fail (with EFAULT) when it should succeed.
Correct the type.

Signed-off-by: Peter Maydell 
---
This fixes a failure in the LTP sockioctl01 test.

 linux-user/ioctls.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/linux-user/ioctls.h b/linux-user/ioctls.h
index 4b36baa..7e2c133 100644
--- a/linux-user/ioctls.h
+++ b/linux-user/ioctls.h
@@ -120,7 +120,7 @@
MK_PTR(MK_STRUCT(STRUCT_fiemap)))
 #endif
 
-  IOCTL(SIOCATMARK, 0, TYPE_NULL)
+  IOCTL(SIOCATMARK, IOC_R, MK_PTR(TYPE_INT))
   IOCTL(SIOCGIFNAME, IOC_RW, MK_PTR(TYPE_INT))
   IOCTL(SIOCGIFFLAGS, IOC_W | IOC_R, MK_PTR(MK_STRUCT(STRUCT_short_ifreq)))
   IOCTL(SIOCSIFFLAGS, IOC_W, MK_PTR(MK_STRUCT(STRUCT_short_ifreq)))
-- 
1.9.1

Re: [Qemu-devel] [PATCH v3 1/1] linux-aio: prevent submitting more than MAX_EVENTS

2016-07-15 Thread Stefan Hajnoczi

On Wed, Jul 13, 2016 at 03:03:24PM +0200, Roman Pen wrote:
> Invoking io_setup(MAX_EVENTS) we ask kernel to create ring buffer for us
> with specified number of events.  But kernel ring buffer allocation logic
> is a bit tricky (ring buffer is page size aligned + some percpu allocation
> are required) so eventually more than requested events number is allocated.
> 
> From a userspace side we have to follow the convention and should not try
> to io_submit() more or logic, which consumes completed events, should be
> changed accordingly.  The pitfall is in the following sequence:
> 
> MAX_EVENTS = 128
> io_setup(MAX_EVENTS)
> 
> io_submit(MAX_EVENTS)
> io_submit(MAX_EVENTS)
> 
> /* now 256 events are in-flight */
> 
> io_getevents(MAX_EVENTS) = 128
> 
> /* we can handle only 128 events at once, to be sure
>  * that nothing is pended the io_getevents(MAX_EVENTS)
>  * call must be invoked once more or hang will happen. */
> 
> To prevent the hang or reiteration of io_getevents() call this patch
> restricts the number of in-flights, which is now limited to MAX_EVENTS.
> 
> Signed-off-by: Roman Pen 
> Reviewed-by: Fam Zheng 
> Reviewed-by: Paolo Bonzini 
> Cc: Stefan Hajnoczi 
> Cc: qemu-devel@nongnu.org
> ---
> v3:
>   o comment tweaks.
> 
> v2:
>   o comment tweaks.
>   o fix QEMU coding style.
> 
>  block/linux-aio.c | 26 --
>  1 file changed, 16 insertions(+), 10 deletions(-)

Thanks, applied to my block tree:
https://github.com/stefanha/qemu/commits/block

Stefan


signature.asc
Description: PGP signature

Re: [Qemu-devel] [PATCH v5 04/10] block: Support meta dirty bitmap

2016-07-15 Thread Max Reitz

On 14.07.2016 22:00, John Snow wrote:
> On 06/22/2016 11:53 AM, Max Reitz wrote:
>> On 03.06.2016 06:32, Fam Zheng wrote:
>>> The added group of operations enables tracking of the changed bits in
>>> the dirty bitmap.
>>>
>>> Signed-off-by: Fam Zheng 
>>> ---
>>>  block/dirty-bitmap.c | 52 
>>> 
>>>  include/block/dirty-bitmap.h |  9 
>>>  2 files changed, 61 insertions(+)
>>>
>>> diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
>>> index 628b77c..9c53c56 100644
>>> --- a/block/dirty-bitmap.c
>>> +++ b/block/dirty-bitmap.c
>>> @@ -38,6 +38,7 @@
>>>   */
>>>  struct BdrvDirtyBitmap {
>>>  HBitmap *bitmap;/* Dirty sector bitmap implementation */
>>> +HBitmap *meta;  /* Meta dirty bitmap */
>>>  BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status 
>>> */
>>>  char *name; /* Optional non-empty unique ID */
>>>  int64_t size;   /* Size of the bitmap (Number of sectors) 
>>> */
>>> @@ -103,6 +104,56 @@ BdrvDirtyBitmap 
>>> *bdrv_create_dirty_bitmap(BlockDriverState *bs,
>>>  return bitmap;
>>>  }
>>>  
>>> +/* bdrv_create_meta_dirty_bitmap
>>> + *
>>> + * Create a meta dirty bitmap that tracks the changes of bits in @bitmap. 
>>> I.e.
>>> + * when a dirty status bit in @bitmap is changed (either from reset to set 
>>> or
>>> + * the other way around), its respective meta dirty bitmap bit will be 
>>> marked
>>> + * dirty as well.
>>
>> Not wrong, but I'd like a note here that this is not an
>> when-and-only-when relationship, i.e. that bits in the meta bitmap may
>> be set even without the tracked bits in the dirty bitmap having changed.
>>
> 
> How?
> 
> You mean, if the caller manually starts setting things in the meta
> bitmap object?
> 
> That's their fault then, isn't it?

No, I mean something that I mentioned in a reply to some previous
version (the patch adding the test):

http://lists.nongnu.org/archive/html/qemu-block/2016-03/msg00332.html

Fam's reply is here:

http://lists.nongnu.org/archive/html/qemu-block/2016-06/msg00097.html

(Interesting how that reply took nearly three months and yours took
nearly one, there most be something about this series that makes
replying to replies very cumbersome :-))

What I meant by “then it would update meta” is that it would update *all
of the range* even though only a single bit has actually been changed.

So the answer to your “how” is: See patch 2, the changes to
hbitmap_set() (and hbitmap_reset()). If any of the bits in the given
range is changed, all of the range is marked as having changed in the
meta bitmap.

So all we guarantee is that every time a bit is changed, the
corresponding bit in the meta bitmap will be set. But we do not
guarantee that a bit in the meta bitmap stays cleared as long as the
corresponding range of the underlying bitmap stays the same.

Max

> 
>> Maybe this should be mentioned somewhere in patch 2, too. Or maybe only
>> in patch 2.
>>
>>> + *
>>> + * @bitmap: the block dirty bitmap for which to create a meta dirty bitmap.
>>> + * @chunk_size: how many bytes of bitmap data does each bit in the meta 
>>> bitmap
>>> + * track.
>>> + */
>>> +void bdrv_create_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap,
>>> +   int chunk_size)
>>> +{
>>> +assert(!bitmap->meta);
>>> +bitmap->meta = hbitmap_create_meta(bitmap->bitmap,
>>> +   chunk_size * BITS_PER_BYTE);
>>> +}
>>> +
>>> +void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap)
>>> +{
>>> +assert(bitmap->meta);
>>> +hbitmap_free_meta(bitmap->bitmap);
>>> +bitmap->meta = NULL;
>>> +}
>>> +
>>> +int bdrv_dirty_bitmap_get_meta(BlockDriverState *bs,
>>> +   BdrvDirtyBitmap *bitmap, int64_t sector,
>>> +   int nb_sectors)
>>> +{
>>> +uint64_t i;
>>> +int gran = bdrv_dirty_bitmap_granularity(bitmap) >> BDRV_SECTOR_BITS;
>>> +
>>> +/* To optimize: we can make hbitmap to internally check the range in a
>>> + * coarse level, or at least do it word by word. */
>>
>> We could also multiply gran by the granularity of the meta bitmap. Each
>> bit of the meta bitmap tracks at least eight bits of the dirty bitmap,
>> so we're calling hbitmap_get() at least eight times as often as
>> necessary here.
>>
>> Or we just use int gran = hbitmap_granularity(bitmap->meta);.
>>
>> Not exactly wrong, though, so:
>>
>> Reviewed-by: Max Reitz 
>>
>>> +for (i = sector; i < sector + nb_sectors; i += gran) {
>>> +if (hbitmap_get(bitmap->meta, i)) {
>>> +return true;
>>> +}
>>> +}
>>> +return false;
>>> +}
>>
> 




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v6 5/6] tests: removed skipped flushes from block test traces

2016-07-15 Thread Evgeny Yakovlev




On 14.07.2016 19:06, Eric Blake wrote:

On 07/14/2016 06:29 AM, Denis V. Lunev wrote:

From: Evgeny Yakovlev 

bdrv_co_flush is now skipping flushes in case underlying media has no
actual changes. This affected some blkdebug testcases that were
expecting error logs from failure-injected flushes which are now
skipped entirely.

This change removes expected flush error logs from block tests 026 071 089

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Max Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: John Snow 
---
  tests/qemu-iotests/026.out.nocache | 50 --
  tests/qemu-iotests/071.out |  8 --
  tests/qemu-iotests/089.out |  2 --
  3 files changed, 60 deletions(-)

If the previous patch broke the testsuite, then this should be squashed
in with that patch so that bisection doesn't land on a broken testsuite.

Seeing the testsuite change alongside code change is just fine; it
proves what impact the code change has.



Will do.

[Qemu-devel] [PATCH] tap: fix memory leak on failure to create a multiqueue tap device

2016-07-15 Thread Paolo Bonzini

Reported by Coverity.

Signed-off-by: Paolo Bonzini 
---
 net/tap.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/net/tap.c b/net/tap.c
index e9c32f3..6a2cedc 100644
--- a/net/tap.c
+++ b/net/tap.c
@@ -787,8 +787,8 @@ int net_init_tap(const NetClientOptions *opts, const char 
*name,
 return -1;
 }
 } else if (tap->has_fds) {
-char **fds = g_new(char *, MAX_TAP_QUEUES);
-char **vhost_fds = g_new(char *, MAX_TAP_QUEUES);
+char **fds = g_new0(char *, MAX_TAP_QUEUES);
+char **vhost_fds = g_new0(char *, MAX_TAP_QUEUES);
 int nfds, nvhosts;
 
 if (tap->has_ifname || tap->has_script || tap->has_downscript ||
@@ -806,7 +806,7 @@ int net_init_tap(const NetClientOptions *opts, const char 
*name,
 if (nfds != nvhosts) {
 error_setg(errp, "The number of fds passed does not match "
"the number of vhostfds passed");
-return -1;
+goto free_fail;
 }
 }
 
@@ -814,7 +814,7 @@ int net_init_tap(const NetClientOptions *opts, const char 
*name,
 fd = monitor_fd_param(cur_mon, fds[i], );
 if (fd == -1) {
 error_propagate(errp, err);
-return -1;
+goto free_fail;
 }
 
 fcntl(fd, F_SETFL, O_NONBLOCK);
@@ -824,7 +824,7 @@ int net_init_tap(const NetClientOptions *opts, const char 
*name,
 } else if (vnet_hdr != tap_probe_vnet_hdr(fd)) {
 error_setg(errp,
"vnet_hdr not consistent across given tap fds");
-return -1;
+goto free_fail;
 }
 
 net_init_tap_one(tap, peer, "tap", name, ifname,
@@ -833,11 +833,21 @@ int net_init_tap(const NetClientOptions *opts, const char 
*name,
  vnet_hdr, fd, );
 if (err) {
 error_propagate(errp, err);
-return -1;
+goto free_fail;
 }
 }
 g_free(fds);
 g_free(vhost_fds);
+return 0;
+
+free_fail:
+for (i = 0; i < nfds; i++) {
+g_free(fds[i]);
+g_free(vhost_fds[i]);
+}
+g_free(fds);
+g_free(vhost_fds);
+return -1;
 } else if (tap->has_helper) {
 if (tap->has_ifname || tap->has_script || tap->has_downscript ||
 tap->has_vnet_hdr || tap->has_queues || tap->has_vhostfds) {
-- 
2.7.4

[Qemu-devel] [Bug 955379] Re: cmake hangs with qemu-arm-static

2016-07-15 Thread Peter Maydell

Please provide exact reproduction instructions -- I need enough
information that I can completely replicate your setup and what you're
doing: exactly how you've set up any chroot or whatever other guest
setup you have, what cmake command you're running, and so on.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/955379

Title:
  cmake hangs with qemu-arm-static

Status in QEMU:
  Fix Committed
Status in Linaro QEMU:
  Confirmed
Status in qemu-linaro package in Ubuntu:
  Confirmed

Bug description:
  I'm using git commit 3e7ecd976b06f... configured with --target-list
  =arm-linux-user --static in a chroot environment to compile some
  things. I ran into this problem with both pcl and opencv-2.3.1. cmake
  consistently freezes at some point during its execution, though in a
  different spot each time, usually during a step when it's searching
  for some libraries. For instance, pcl most commonly stops after:

  [snip]
  -- Boost version: 1.46.1
  -- Found the following Boost libraries:
  --   system
  --   filesystem
  --   thread
  --   date_time
  -- checking for module 'eigen3'
  --   found eigen3, version 3.0.1

  which is perplexing because it freezes after finding what it wants,
  not during the search. When it does get past that point, it does so
  almost immediately but freezes somewhere else.

  I'm using 64-bit Ubuntu 11.10 with kernel release 3.0.0-16-generic
  with an Intel i5.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/955379/+subscriptions

Re: [Qemu-devel] [PATCH v4 05/16] pc: enforce adding CPUs contiguously and removing them in opposit order

2016-07-15 Thread Igor Mammedov

On Thu, 14 Jul 2016 14:10:24 -0400
Bandan Das  wrote:

> Igor Mammedov  writes:
> 
> > it will still allow us to use cpu_index as migration instance_id
> > since when CPUs are added contiguously (from the first to the last)
> > and removed in opposite order, cpu_index stays stable and it's
> > reproducable on destination side.
> >
> > Signed-off-by: Igor Mammedov 
> > ---
> > While there is work in progress to support migration when there are holes
> > in cpu_index range resulting from out-of-order plug or unplug, this patch
> > is intended as a last resort if no easy, risk-free and elegant solution
> > emerges before 2.7 dev cycle ends.  
> 
> I think this (or a modified version) is appropriate comment
> material to accompany the changes. Ok if you are sure this code
> is short-lived, but if it stays longer, a comment is definitely
> helpful. Maybe a bit of reasoning added to the error message is
> fine too.
dwg is looking at cpu_index refactoring but that's not 2.7 material,
this patch is doing what the similar spapr patch did
(which David applied to his ppc queue).

Perhaps moving comment under --- to commit message itself would
be better as to leave trace of future refactoring plans.

> > ---
> >  hw/i386/pc.c | 34 ++
> >  1 file changed, 34 insertions(+)
> >
> > diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> > index 33c5f97..75a92d0 100644
> > --- a/hw/i386/pc.c
> > +++ b/hw/i386/pc.c
> > @@ -1762,6 +1762,23 @@ static void pc_cpu_unplug_request_cb(HotplugHandler 
> > *hotplug_dev,
> >  goto out;
> >  }
> >  
> > +if (idx < pcms->possible_cpus->len - 1 &&
> > +pcms->possible_cpus->cpus[idx + 1].cpu != NULL) {
> > +X86CPU *cpu;
> > +
> > +for (idx = pcms->possible_cpus->len - 1;
> > + pcms->possible_cpus->cpus[idx].cpu == NULL; idx--) {
> > +;;
> > +}
> > +
> > +cpu = X86_CPU(pcms->possible_cpus->cpus[idx].cpu);
> > +error_setg(_err, "CPU [socket-id: %u, core-id: %u,"
> > +   " thread-id: %u] should be removed first",
> > +   cpu->socket_id, cpu->core_id, cpu->thread_id);
> > +goto out;
> > +
> > +}
> > +
> >  hhc = HOTPLUG_HANDLER_GET_CLASS(pcms->acpi_dev);
> >  hhc->unplug_request(HOTPLUG_HANDLER(pcms->acpi_dev), dev, _err);
> >  
> > @@ -1860,6 +1877,23 @@ static void pc_cpu_pre_plug(HotplugHandler 
> > *hotplug_dev,
> >  return;
> >  }
> >  
> > +if (idx != 0 && pcms->possible_cpus->cpus[idx - 1].cpu == NULL) {
> > +PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> > +
> > +for (idx = 1; pcms->possible_cpus->cpus[idx].cpu != NULL; idx++) {
> > +;;
> > +}
> > +
> > +x86_topo_ids_from_apicid(pcms->possible_cpus->cpus[idx].arch_id,
> > + smp_cores, smp_threads, );
> > +
> > +if (!pcmc->legacy_cpu_hotplug) {
> > +error_setg(errp, "CPU [socket: %u, core: %u, thread: %u] 
> > should be"
> > +   " added first", topo.pkg_id, topo.core_id, 
> > topo.smt_id);
> > +return;
> > +}
> > +}
> > +
> >  /* if 'address' properties socket-id/core-id/thread-id are not set, 
> > set them
> >   * so that query_hotpluggable_cpus would show correct values
> >   */

Re: [Qemu-devel] [PATCH v2 12/13] virtio-gpu: Wrap in vmstate

2016-07-15 Thread Gerd Hoffmann

  Hi,

> Hmm yes; I think the right fix here is to convert that into
>   migrate_add_blocker / migrate_del_blocker

Oh, didn't notice we have that, yes that sounds good.

cheers,
  Gerd

Re: [Qemu-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Gerd Hoffmann

On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
> 
> On 15/07/2016 10:47, Juergen Gross wrote:
> > Nothing scaring and no real difference between working and not working
> > variant.
> > 
> > Meanwhile I've been digging a little bit deeper and found the reason:
> > libxenstore is setting up a reader thread which is waiting for the
> > watch to fire. With above commit the stack size of that thread (16kB)
> > is too small. Setting it to 32kB made qemu work again.
> 
> This makes very little sense (not your fault)...  The commit doesn't
> change stack usage at all, TLS should not be on the stack.
> 
> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
> it's only due to inlining decision on the compiler, in which case
> Peter's patch from today is only a bandaid.

Hmm, I guess I hold off the vnc pull request for now ...

cheers,
  Gerd

Re: [Qemu-devel] [PATCH V2 1/1] linux-aio: prevent submitting more than MAX_EVENTS

2016-07-15 Thread Roman Penyaev

On Fri, Jul 15, 2016 at 11:58 AM, Paolo Bonzini  wrote:
>
>
> On 15/07/2016 11:18, Roman Penyaev wrote:
>> Those 3 red spikes and a blue hill is what we have to focus on.  The
>> blue hill at the right corner of the chart means that almost always the
>> ring buffer was observed as full, i.e. qemu_laio_completion_bh() got
>> a chance to reap completions not very often, meanwhile completed
>> requests stand in the ring buffer for quite a long time which degrades
>> the overall performance.
>>
>> The results covered by the red line are much better and that can be
>> explained by those 3 red spikes, which are almost in the middle of the
>> whole distribution, i.e. qemu_laio_completion_bh() is called more often,
>> completed requests do not stall, giving fio a chance to submit new fresh
>> requests.
>>
>> The theoretical fix would be to schedule completion BH just after
>> successful io_submit, i.e.:
>
> What about removing the qemu_bh_cancel but keeping the rest of the patch?

That exactly what I did.  Numbers go to expected from ~1600MB/s to ~1800MB/s.
So basically this hunk of the debatable patch:

 if (event_notifier_test_and_clear(>e)) {
-qemu_bh_schedule(s->completion_bh);
+qemu_laio_completion_bh(s);
 }

does not have any impact and can be ignored.  At least I did not notice
anything important.

>
> I'm also interested in a graph with this patch ("linux-aio: prevent
> submitting more than MAX_EVENTS") on top of origin/master.

I can plot it also of course.

>
> Thanks for the analysis.  Sometimes a picture _is_ worth a thousand
> words, even if it's measuring "only" second-order effects (# of
> completions is not what causes the slowdown, but # of completions
> affects latency which causes the slowdown).

Yes, you are right, latency.  With userspace io_getevents ~0 costs we
can peek requests as often as we like to decrease latency on very
fast devices.  That can also bring something.  Probably after each
io_submit() it makes sense to peek and complete something.

--
Roman

[Qemu-devel] [RFC PATCH V6 6/6] colo-compare: add TCP, UDP, ICMP packet comparison

2016-07-15 Thread Zhang Chen

We add TCP,UDP,ICMP packet comparison to replace
IP packet comparison. This can increase the
accuracy of the package comparison.
less checkpoint more efficiency.

Signed-off-by: Zhang Chen 
Signed-off-by: Li Zhijian 
Signed-off-by: Wen Congyang 
---
 net/colo-compare.c | 174 +++--
 trace-events   |   4 ++
 2 files changed, 174 insertions(+), 4 deletions(-)

diff --git a/net/colo-compare.c b/net/colo-compare.c
index 942e326..9737ec6 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -18,6 +18,7 @@
 #include "qapi/qmp/qerror.h"
 #include "qapi/error.h"
 #include "net/net.h"
+#include "net/eth.h"
 #include "net/vhost_net.h"
 #include "qom/object_interfaces.h"
 #include "qemu/iov.h"
@@ -197,9 +198,158 @@ static int colo_packet_compare(Packet *ppkt, Packet *spkt)
 }
 }
 
-static int colo_packet_compare_all(Packet *spkt, Packet *ppkt)
+/*
+ * called from the compare thread on the primary
+ * for compare tcp packet
+ * compare_tcp copied from Dr. David Alan Gilbert's branch
+ */
+static int colo_packet_compare_tcp(Packet *spkt, Packet *ppkt)
+{
+struct tcphdr *ptcp, *stcp;
+int res;
+char *sdebug, *ddebug;
+
+trace_colo_compare_main("compare tcp");
+if (ppkt->size != spkt->size) {
+if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+trace_colo_compare_main("pkt size not same");
+}
+return -1;
+}
+
+ptcp = (struct tcphdr *)ppkt->transport_layer;
+stcp = (struct tcphdr *)spkt->transport_layer;
+
+if (ptcp->th_seq != stcp->th_seq) {
+if (trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+trace_colo_compare_main("pkt tcp seq not same");
+}
+return -1;
+}
+
+/*
+ * The 'identification' field in the IP header is *very* random
+ * it almost never matches.  Fudge this by ignoring differences in
+ * unfragmented packets; they'll normally sort themselves out if different
+ * anyway, and it should recover at the TCP level.
+ * An alternative would be to get both the primary and secondary to rewrite
+ * somehow; but that would need some sync traffic to sync the state
+ */
+if (ntohs(ppkt->ip->ip_off) & IP_DF) {
+spkt->ip->ip_id = ppkt->ip->ip_id;
+/* and the sum will be different if the IDs were different */
+spkt->ip->ip_sum = ppkt->ip->ip_sum;
+}
+
+res = memcmp(ppkt->data + ETH_HLEN, spkt->data + ETH_HLEN,
+(spkt->size - ETH_HLEN));
+
+if (res != 0 && trace_event_get_state(TRACE_COLO_COMPARE_MISCOMPARE)) {
+sdebug = strdup(inet_ntoa(ppkt->ip->ip_src));
+ddebug = strdup(inet_ntoa(ppkt->ip->ip_dst));
+fprintf(stderr, "%s: src/dst: %s/%s p: seq/ack=%u/%u"
+" s: seq/ack=%u/%u res=%d flags=%x/%x\n", __func__,
+   sdebug, ddebug,
+   ntohl(ptcp->th_seq), ntohl(ptcp->th_ack),
+   ntohl(stcp->th_seq), ntohl(stcp->th_ack),
+   res, ptcp->th_flags, stcp->th_flags);
+
+trace_colo_compare_tcp_miscompare("Primary len", ppkt->size);
+qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
+trace_colo_compare_tcp_miscompare("Secondary len", spkt->size);
+qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
+
+g_free(sdebug);
+g_free(ddebug);
+}
+
+return res;
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare udp packet
+ */
+static int colo_packet_compare_udp(Packet *spkt, Packet *ppkt)
+{
+int ret;
+
+trace_colo_compare_main("compare udp");
+ret = colo_packet_compare(ppkt, spkt);
+
+if (ret) {
+trace_colo_compare_udp_miscompare("primary pkt size", ppkt->size);
+qemu_hexdump((char *)ppkt->data, stderr, "colo-compare", ppkt->size);
+trace_colo_compare_udp_miscompare("Secondary pkt size", spkt->size);
+qemu_hexdump((char *)spkt->data, stderr, "colo-compare", spkt->size);
+}
+
+return ret;
+}
+
+/*
+ * called from the compare thread on the primary
+ * for compare icmp packet
+ */
+static int colo_packet_compare_icmp(Packet *spkt, Packet *ppkt)
 {
-trace_colo_compare_main("compare all");
+int network_length;
+struct icmp *icmp_ppkt, *icmp_spkt;
+
+trace_colo_compare_main("compare icmp");
+network_length = ppkt->ip->ip_hl * 4;
+if (ppkt->size != spkt->size ||
+ppkt->size < network_length + ETH_HLEN) {
+return -1;
+}
+icmp_ppkt = (struct icmp *)(ppkt->data + network_length + ETH_HLEN);
+icmp_spkt = (struct icmp *)(spkt->data + network_length + ETH_HLEN);
+
+if ((icmp_ppkt->icmp_type == icmp_spkt->icmp_type) &&
+(icmp_ppkt->icmp_code == icmp_spkt->icmp_code)) {
+if (icmp_ppkt->icmp_type == ICMP_REDIRECT) {
+if (icmp_ppkt->icmp_gwaddr.s_addr !=
+

Re: [Qemu-devel] [PATCH] ppc: Yet another fix for the huge page support detection mechanism

2016-07-15 Thread Thomas Huth

On 15.07.2016 10:35, David Gibson wrote:
> On Fri, Jul 15, 2016 at 10:10:25AM +0200, Thomas Huth wrote:
>> Commit 86b50f2e1bef ("Disable huge page support if it is not available
>> for main RAM") already made sure that huge page support is not announced
>> to the guest if the normal RAM of non-NUMA configurations is not backed
>> by a huge page filesystem. However, there is one more case that can go
>> wrong: NUMA is enabled, but the RAM of the NUMA nodes are not configured
>> with huge page support (and only the memory of a DIMM is configured with
>> it). When QEMU is started with the following command line for example,
>> the Linux guest currently crashes because it is trying to use huge pages
>> on a memory region that does not support huge pages:
>>
>>  qemu-system-ppc64 -enable-kvm ... -m 1G,slots=4,maxmem=32G -object \
>>
>> memory-backend-file,policy=default,mem-path=/hugepages,size=1G,id=mem-mem1 \
>>-device pc-dimm,id=dimm-mem1,memdev=mem-mem1 -smp 2 \
>>-numa node,nodeid=0 -numa node,nodeid=1
>>
>> To fix this issue, we've got to make sure to disable huge page support,
>> too, when there is a NUMA node that is not using a memory backend with
>> huge page support.
>>
>> Fixes: 86b50f2e1befc33407bdfeb6f45f7b0d2439a740
>> Signed-off-by: Thomas Huth 
>> ---
>>  target-ppc/kvm.c | 10 +++---
>>  1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/target-ppc/kvm.c b/target-ppc/kvm.c
>> index 884d564..7a8f555 100644
>> --- a/target-ppc/kvm.c
>> +++ b/target-ppc/kvm.c
>> @@ -389,12 +389,16 @@ static long getrampagesize(void)
>>  
>>  object_child_foreach(memdev_root, find_max_supported_pagesize, );
>>  
>> -if (hpsize == LONG_MAX) {
>> +if (hpsize == LONG_MAX || hpsize == getpagesize()) {
>>  return getpagesize();
>>  }
>>  
>> -if (nb_numa_nodes == 0 && hpsize > getpagesize()) {
>> -/* No NUMA nodes and normal RAM without -mem-path ==> no huge 
>> pages! */
>> +/* If NUMA is disabled or the NUMA nodes are not backed with a
>> + * memory-backend, then there is at least one node using "normal"
>> + * RAM. And since normal RAM has not been configured with "-mem-path"
>> + * (what we've checked earlier here already), we can not use huge pages!
>> + */
>> +if (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL) {
> 
> Is that second clause sufficient, or do you need to loop through and
> check the memdev of every node?

Checking the first entry should be sufficient. QEMU forces you to
specify either a memory backend for all NUMA nodes (which we should have
looked at during the object_child_foreach() some lines earlier), or you
must not specify a memory backend for any NUMA node at all. You can not
mix the settings, so checking numa_info[0] is enough.

 Thomas




signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH 3/7] hw/mips: fix PCI bus initialization

2016-07-15 Thread Leon Alrae

On Thu, Jul 14, 2016 at 04:43:42PM +0300, Marcel Apfelbaum wrote:
> Delay the host-bridge 'realization' until the
> PCI root bus is attached.
> 
> Signed-off-by: Marcel Apfelbaum 
> ---
>  hw/mips/gt64xxx_pci.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Leon Alrae 
Tested-by: Leon Alrae 

Thanks,
Leon

> 
> diff --git a/hw/mips/gt64xxx_pci.c b/hw/mips/gt64xxx_pci.c
> index 3f4523d..4811843 100644
> --- a/hw/mips/gt64xxx_pci.c
> +++ b/hw/mips/gt64xxx_pci.c
> @@ -1167,7 +1167,6 @@ PCIBus *gt64120_register(qemu_irq *pic)
>  DeviceState *dev;
>  
>  dev = qdev_create(NULL, TYPE_GT64120_PCI_HOST_BRIDGE);
> -qdev_init_nofail(dev);
>  d = GT64120_PCI_HOST_BRIDGE(dev);
>  phb = PCI_HOST_BRIDGE(dev);
>  memory_region_init(>pci0_mem, OBJECT(dev), "pci0-mem", UINT32_MAX);
> @@ -1178,6 +1177,7 @@ PCIBus *gt64120_register(qemu_irq *pic)
>  >pci0_mem,
>  get_system_io(),
>  PCI_DEVFN(18, 0), 4, TYPE_PCI_BUS);
> +qdev_init_nofail(dev);
>  memory_region_init_io(>ISD_mem, OBJECT(dev), _mem_ops, d, 
> "isd-mem", 0x1000);
>  
>  pci_create_simple(phb->bus, PCI_DEVFN(0, 0), "gt64120_pci");
> -- 
> 2.4.3
>

Re: [Qemu-devel] [PATCH V2 1/1] linux-aio: prevent submitting more than MAX_EVENTS

2016-07-15 Thread Paolo Bonzini



On 15/07/2016 13:35, Roman Penyaev wrote:
> On Fri, Jul 15, 2016 at 12:17 PM, Roman Penyaev
>  wrote:
>> On Fri, Jul 15, 2016 at 11:58 AM, Paolo Bonzini  wrote:
>>>
>>>
>>> On 15/07/2016 11:18, Roman Penyaev wrote:
 Those 3 red spikes and a blue hill is what we have to focus on.  The
 blue hill at the right corner of the chart means that almost always the
 ring buffer was observed as full, i.e. qemu_laio_completion_bh() got
 a chance to reap completions not very often, meanwhile completed
 requests stand in the ring buffer for quite a long time which degrades
 the overall performance.

 The results covered by the red line are much better and that can be
 explained by those 3 red spikes, which are almost in the middle of the
 whole distribution, i.e. qemu_laio_completion_bh() is called more often,
 completed requests do not stall, giving fio a chance to submit new fresh
 requests.

 The theoretical fix would be to schedule completion BH just after
 successful io_submit, i.e.:
>>>
>>> What about removing the qemu_bh_cancel but keeping the rest of the patch?
>>
>> That exactly what I did.  Numbers go to expected from ~1600MB/s to ~1800MB/s.
>> So basically this hunk of the debatable patch:
>>
>>  if (event_notifier_test_and_clear(>e)) {
>> -qemu_bh_schedule(s->completion_bh);
>> +qemu_laio_completion_bh(s);
>>  }
>>
>> does not have any impact and can be ignored.  At least I did not notice
>> anything important.

Thanks, this means that we should either add back the other line, or
wrap qemu_laio_completion_bh in a loop.  The rationale is that an
io_getevents that doesn't find any event is extremely cheap.

>>> I'm also interested in a graph with this patch ("linux-aio: prevent
>>> submitting more than MAX_EVENTS") on top of origin/master.
>>
>> I can plot it also of course.
> 
> So, finally I have it.
> 
> Same link:
> https://docs.google.com/spreadsheets/d/12CIt6EKiJLqNx0OHNqiabR-oFBrqkH0LN3mjzZ5jGeo/edit?usp=sharing
> 
> last sheet:
> "1789MB/s"
> 
> Not that much interesting: almost all the time we complete maximum:
> MAX_LIMIT requests at once.  But of course that expected on such
> device.  Probably other good metrics should be taken into account.

And this means that we probably should raise MAX_LIMIT.

Paolo

[Qemu-devel] [kvm-unit-tests PATCH v3 08/10] arm/arm64: gicv2: add an IPI test

2016-07-15 Thread Andrew Jones

Signed-off-by: Andrew Jones 
---
v2: add more details in the output if a test fails,
report spurious interrupts if we get them
---
 arm/Makefile.common |   6 +-
 arm/gic.c   | 194 
 arm/unittests.cfg   |   7 ++
 3 files changed, 204 insertions(+), 3 deletions(-)
 create mode 100644 arm/gic.c

diff --git a/arm/Makefile.common b/arm/Makefile.common
index 41239c37e0920..bc38183ab86e0 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -9,9 +9,9 @@ ifeq ($(LOADADDR),)
LOADADDR = 0x4000
 endif
 
-tests-common = \
-   $(TEST_DIR)/selftest.flat \
-   $(TEST_DIR)/spinlock-test.flat
+tests-common  = $(TEST_DIR)/selftest.flat
+tests-common += $(TEST_DIR)/spinlock-test.flat
+tests-common += $(TEST_DIR)/gic.flat
 
 all: test_cases
 
diff --git a/arm/gic.c b/arm/gic.c
new file mode 100644
index 0..cf7ec1c90413c
--- /dev/null
+++ b/arm/gic.c
@@ -0,0 +1,194 @@
+/*
+ * GIC tests
+ *
+ * GICv2
+ *   . test sending/receiving IPIs
+ *
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+static int gic_version;
+static int acked[NR_CPUS], spurious[NR_CPUS];
+static cpumask_t ready;
+
+static void nr_cpu_check(int nr)
+{
+   if (nr_cpus < nr)
+   report_abort("At least %d cpus required", nr);
+}
+
+static void wait_on_ready(void)
+{
+   cpumask_set_cpu(smp_processor_id(), );
+   while (!cpumask_full())
+   cpu_relax();
+}
+
+static void check_acked(cpumask_t *mask)
+{
+   int missing = 0, extra = 0, unexpected = 0;
+   int nr_pass, cpu, i;
+
+   /* Wait up to 5s for all interrupts to be delivered */
+   for (i = 0; i < 50; ++i) {
+   mdelay(100);
+   nr_pass = 0;
+   for_each_present_cpu(cpu) {
+   smp_rmb();
+   nr_pass += cpumask_test_cpu(cpu, mask) ?
+   acked[cpu] == 1 : acked[cpu] == 0;
+   }
+   if (nr_pass == nr_cpus) {
+   report("Completed in %d ms", true, ++i * 100);
+   return;
+   }
+   }
+
+   for_each_present_cpu(cpu) {
+   if (cpumask_test_cpu(cpu, mask)) {
+   if (!acked[cpu])
+   ++missing;
+   else if (acked[cpu] > 1)
+   ++extra;
+   } else {
+   if (acked[cpu])
+   ++unexpected;
+   }
+   }
+
+   report("Timed-out (5s). ACKS: missing=%d extra=%d unexpected=%d",
+  false, missing, extra, unexpected);
+}
+
+static void ipi_handler(struct pt_regs *regs __unused)
+{
+   u32 iar = readl(gicv2_cpu_base() + GIC_CPU_INTACK);
+
+   if (iar != GICC_INT_SPURIOUS) {
+   writel(iar, gicv2_cpu_base() + GIC_CPU_EOI);
+   smp_rmb(); /* pairs with wmb in ipi_test functions */
+   ++acked[smp_processor_id()];
+   smp_wmb(); /* pairs with rmb in check_acked */
+   } else {
+   ++spurious[smp_processor_id()];
+   smp_wmb();
+   }
+}
+
+static void ipi_test_self(void)
+{
+   cpumask_t mask;
+
+   report_prefix_push("self");
+   memset(acked, 0, sizeof(acked));
+   smp_wmb();
+   cpumask_clear();
+   cpumask_set_cpu(0, );
+   writel(2 << 24, gicv2_dist_base() + GIC_DIST_SOFTINT);
+   check_acked();
+   report_prefix_pop();
+}
+
+static void ipi_test_smp(void)
+{
+   cpumask_t mask;
+   unsigned long tlist;
+
+   report_prefix_push("target-list");
+   memset(acked, 0, sizeof(acked));
+   smp_wmb();
+   tlist = cpumask_bits(_present_mask)[0] & 0xaa;
+   cpumask_bits()[0] = tlist;
+   writel((u8)tlist << 16, gicv2_dist_base() + GIC_DIST_SOFTINT);
+   check_acked();
+   report_prefix_pop();
+
+   report_prefix_push("broadcast");
+   memset(acked, 0, sizeof(acked));
+   smp_wmb();
+   cpumask_copy(, _present_mask);
+   cpumask_clear_cpu(0, );
+   writel(1 << 24, gicv2_dist_base() + GIC_DIST_SOFTINT);
+   check_acked();
+   report_prefix_pop();
+}
+
+static void ipi_enable(void)
+{
+   gicv2_enable_defaults();
+#ifdef __arm__
+   install_exception_handler(EXCPTN_IRQ, ipi_handler);
+#else
+   install_irq_handler(EL1H_IRQ, ipi_handler);
+#endif
+   local_irq_enable();
+}
+
+static void ipi_recv(void)
+{
+   ipi_enable();
+   cpumask_set_cpu(smp_processor_id(), );
+   while (1)
+   wfi();
+}
+
+int main(int argc, char **argv)
+{
+   char pfx[8];
+   int cpu;
+
+   gic_version = gic_init();
+   if (!gic_version)
+   report_abort("No gic present!");
+

[Qemu-devel] [kvm-unit-tests PATCH v3 07/10] arm/arm64: add initial gicv3 support

2016-07-15 Thread Andrew Jones

Signed-off-by: Andrew Jones 

---
v2: configure irqs as NS GRP1
---
 lib/arm/asm/arch_gicv3.h   | 184 ++
 lib/arm/asm/gic-v3.h   | 321 +
 lib/arm/asm/gic.h  |   1 +
 lib/arm/gic.c  |  73 +++
 lib/arm64/asm/arch_gicv3.h | 169 
 lib/arm64/asm/gic-v3.h |   1 +
 lib/arm64/asm/sysreg.h |  44 +++
 7 files changed, 793 insertions(+)
 create mode 100644 lib/arm/asm/arch_gicv3.h
 create mode 100644 lib/arm/asm/gic-v3.h
 create mode 100644 lib/arm64/asm/arch_gicv3.h
 create mode 100644 lib/arm64/asm/gic-v3.h
 create mode 100644 lib/arm64/asm/sysreg.h

diff --git a/lib/arm/asm/arch_gicv3.h b/lib/arm/asm/arch_gicv3.h
new file mode 100644
index 0..d529a7eb62807
--- /dev/null
+++ b/lib/arm/asm/arch_gicv3.h
@@ -0,0 +1,184 @@
+/*
+ * All ripped off from arch/arm/include/asm/arch_gicv3.h
+ *
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#ifndef _ASMARM_ARCH_GICV3_H_
+#define _ASMARM_ARCH_GICV3_H_
+
+#ifndef __ASSEMBLY__
+
+#include 
+#include 
+#include 
+
+#define __stringify xstr
+
+
+#define __ACCESS_CP15(CRn, Op1, CRm, Op2)  p15, Op1, %0, CRn, CRm, Op2
+#define __ACCESS_CP15_64(Op1, CRm) p15, Op1, %Q0, %R0, CRm
+
+#define ICC_EOIR1  __ACCESS_CP15(c12, 0, c12, 1)
+#define ICC_DIR__ACCESS_CP15(c12, 0, c11, 1)
+#define ICC_IAR1   __ACCESS_CP15(c12, 0, c12, 0)
+#define ICC_SGI1R  __ACCESS_CP15_64(0, c12)
+#define ICC_PMR__ACCESS_CP15(c4, 0, c6, 0)
+#define ICC_CTLR   __ACCESS_CP15(c12, 0, c12, 4)
+#define ICC_SRE__ACCESS_CP15(c12, 0, c12, 5)
+#define ICC_IGRPEN1__ACCESS_CP15(c12, 0, c12, 7)
+
+#define ICC_HSRE   __ACCESS_CP15(c12, 4, c9, 5)
+
+#define ICH_VSEIR  __ACCESS_CP15(c12, 4, c9, 4)
+#define ICH_HCR__ACCESS_CP15(c12, 4, c11, 0)
+#define ICH_VTR__ACCESS_CP15(c12, 4, c11, 1)
+#define ICH_MISR   __ACCESS_CP15(c12, 4, c11, 2)
+#define ICH_EISR   __ACCESS_CP15(c12, 4, c11, 3)
+#define ICH_ELSR   __ACCESS_CP15(c12, 4, c11, 5)
+#define ICH_VMCR   __ACCESS_CP15(c12, 4, c11, 7)
+
+#define __LR0(x)   __ACCESS_CP15(c12, 4, c12, x)
+#define __LR8(x)   __ACCESS_CP15(c12, 4, c13, x)
+
+#define ICH_LR0__LR0(0)
+#define ICH_LR1__LR0(1)
+#define ICH_LR2__LR0(2)
+#define ICH_LR3__LR0(3)
+#define ICH_LR4__LR0(4)
+#define ICH_LR5__LR0(5)
+#define ICH_LR6__LR0(6)
+#define ICH_LR7__LR0(7)
+#define ICH_LR8__LR8(0)
+#define ICH_LR9__LR8(1)
+#define ICH_LR10   __LR8(2)
+#define ICH_LR11   __LR8(3)
+#define ICH_LR12   __LR8(4)
+#define ICH_LR13   __LR8(5)
+#define ICH_LR14   __LR8(6)
+#define ICH_LR15   __LR8(7)
+
+/* LR top half */
+#define __LRC0(x)  __ACCESS_CP15(c12, 4, c14, x)
+#define __LRC8(x)  __ACCESS_CP15(c12, 4, c15, x)
+
+#define ICH_LRC0   __LRC0(0)
+#define ICH_LRC1   __LRC0(1)
+#define ICH_LRC2   __LRC0(2)
+#define ICH_LRC3   __LRC0(3)
+#define ICH_LRC4   __LRC0(4)
+#define ICH_LRC5   __LRC0(5)
+#define ICH_LRC6   __LRC0(6)
+#define ICH_LRC7   __LRC0(7)
+#define ICH_LRC8   __LRC8(0)
+#define ICH_LRC9   __LRC8(1)
+#define ICH_LRC10  __LRC8(2)
+#define ICH_LRC11  __LRC8(3)
+#define ICH_LRC12  __LRC8(4)
+#define ICH_LRC13  __LRC8(5)
+#define ICH_LRC14  __LRC8(6)
+#define ICH_LRC15  __LRC8(7)
+
+#define __AP0Rx(x) __ACCESS_CP15(c12, 4, c8, x)
+#define ICH_AP0R0  __AP0Rx(0)
+#define ICH_AP0R1  __AP0Rx(1)
+#define ICH_AP0R2  __AP0Rx(2)
+#define ICH_AP0R3  __AP0Rx(3)
+
+#define __AP1Rx(x) __ACCESS_CP15(c12, 4, c9, x)
+#define ICH_AP1R0  __AP1Rx(0)
+#define ICH_AP1R1  __AP1Rx(1)
+#define

Re: [Qemu-devel] [PULL 1/1] Add optionrom compatible with fw_cfg DMA version

2016-07-15 Thread Stefan Hajnoczi

On Thu, Jul 14, 2016 at 2:52 PM, Paolo Bonzini  wrote:
> From: Marc Marí 
>
> This optionrom is based on linuxboot.S.
>
> Signed-off-by: Marc Marí 
> Signed-off-by: Richard W.M. Jones 
> Message-Id: <1464027093-24073-2-git-send-email-rjo...@redhat.com>
> [Add -fno-toplevel-reorder, support clang without -m16. - Paolo]
> Signed-off-by: Paolo Bonzini 
> ---
>  .gitignore|   4 +
>  Makefile  |   2 +-
>  hw/i386/pc.c  |  10 +-
>  hw/nvram/fw_cfg.c |   2 +-
>  include/hw/i386/pc.h  |   4 +
>  include/hw/nvram/fw_cfg.h |   1 +
>  pc-bios/linuxboot_dma.bin | Bin 0 -> 1024 bytes
>  pc-bios/optionrom/Makefile|  42 --
>  pc-bios/optionrom/code16gcc.h |   3 +
>  pc-bios/optionrom/linuxboot_dma.c | 294 
> ++
>  10 files changed, 349 insertions(+), 13 deletions(-)
>  create mode 100644 pc-bios/linuxboot_dma.bin
>  create mode 100644 pc-bios/optionrom/code16gcc.h
>  create mode 100644 pc-bios/optionrom/linuxboot_dma.c

  CCoptionrom/linuxboot_dma.o
clang-3.8: error: unsupported argument '-32' to option 'Wa,'

$ rpm -qi clang
Name: clang
Version : 3.8.0
Release : 2.fc24
Architecture: x86_64

Stefan

[Qemu-devel] [PATCH Qemu] Change spice-server protocol for GL texture passing

2016-07-15 Thread Frediano Ziglio

---
 ui/spice-core.c|  5 -
 ui/spice-display.c | 29 -
 2 files changed, 8 insertions(+), 26 deletions(-)

diff --git a/ui/spice-core.c b/ui/spice-core.c
index da05054..f7647f7 100644
--- a/ui/spice-core.c
+++ b/ui/spice-core.c
@@ -828,11 +828,6 @@ void qemu_spice_init(void)
 
 #ifdef HAVE_SPICE_GL
 if (qemu_opt_get_bool(opts, "gl", 0)) {
-if ((port != 0) || (tls_port != 0)) {
-error_report("SPICE GL support is local-only for now and "
- "incompatible with -spice port/tls-port");
-exit(1);
-}
 if (egl_rendernode_init() != 0) {
 error_report("Failed to initialize EGL render node for SPICE GL");
 exit(1);
diff --git a/ui/spice-display.c b/ui/spice-display.c
index 2a77a54..72137bd 100644
--- a/ui/spice-display.c
+++ b/ui/spice-display.c
@@ -852,6 +852,10 @@ static void qemu_spice_gl_block_timer(void *opaque)
 static QEMUGLContext qemu_spice_gl_create_context(DisplayChangeListener *dcl,
   QEMUGLParams *params)
 {
+SimpleSpiceDisplay *ssd = container_of(dcl, SimpleSpiceDisplay, dcl);
+
+spice_qxl_gl_init(>qxl, qemu_egl_display, qemu_egl_rn_ctx);
+
 eglMakeCurrent(qemu_egl_display, EGL_NO_SURFACE, EGL_NO_SURFACE,
qemu_egl_rn_ctx);
 return qemu_egl_create_context(dcl, params);
@@ -864,28 +868,11 @@ static void qemu_spice_gl_scanout(DisplayChangeListener 
*dcl,
   uint32_t w, uint32_t h)
 {
 SimpleSpiceDisplay *ssd = container_of(dcl, SimpleSpiceDisplay, dcl);
-EGLint stride = 0, fourcc = 0;
-int fd = -1;
-
-if (tex_id) {
-fd = egl_get_fd_for_texture(tex_id, , );
-if (fd < 0) {
-fprintf(stderr, "%s: failed to get fd for texture\n", __func__);
-return;
-}
-dprint(1, "%s: %dx%d (stride %d, fourcc 0x%x)\n", __func__,
-   w, h, stride, fourcc);
-} else {
-dprint(1, "%s: no texture (no framebuffer)\n", __func__);
-}
-
-assert(!tex_id || fd >= 0);
 
-/* note: spice server will close the fd */
-spice_qxl_gl_scanout(>qxl, fd,
- surface_width(ssd->ds),
- surface_height(ssd->ds),
- stride, fourcc, y_0_top);
+spice_qxl_gl_scanout_texture(>qxl, tex_id,
+ surface_width(ssd->ds),
+ surface_height(ssd->ds),
+ y_0_top);
 
 qemu_spice_gl_monitor_config(ssd, x, y, w, h);
 }
-- 
2.7.4

Re: [Qemu-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Paolo Bonzini



On 15/07/2016 12:41, Juergen Gross wrote:
> On 15/07/16 12:35, Paolo Bonzini wrote:
>>
>>
>> On 15/07/2016 12:12, Gerd Hoffmann wrote:
>>> On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:

 On 15/07/2016 10:47, Juergen Gross wrote:
> Nothing scaring and no real difference between working and not working
> variant.
>
> Meanwhile I've been digging a little bit deeper and found the reason:
> libxenstore is setting up a reader thread which is waiting for the
> watch to fire. With above commit the stack size of that thread (16kB)
> is too small. Setting it to 32kB made qemu work again.

 This makes very little sense (not your fault)...  The commit doesn't
 change stack usage at all, TLS should not be on the stack.

 Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
 it's only due to inlining decision on the compiler, in which case
 Peter's patch from today is only a bandaid.
>>>
>>> Hmm, I guess I hold off the vnc pull request for now ...
>>
>> Go ahead.  I looked at glibc source code and the patch is okay.
> 
> Paolo, do you know of any interface to obtain the size of the TLS area
> taken from the stack (before calling pthread_create() )? 

https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01643.html has a patch
that _removes_ code to do this from the golang runtime.  The comments
there say that only with glibc before version 2.16 the static TLS size
is taken out of the stack size...

What version of glibc are you using?

Paolo

[Qemu-devel] [kvm-unit-tests PATCH v3 06/10] arm/arm64: add initial gicv2 support

2016-07-15 Thread Andrew Jones

Add some gicv2 support. This just adds init and enable
functions, allowing unit tests to start messing with it.

Signed-off-by: Andrew Jones 
---
 arm/Makefile.common|  1 +
 lib/arm/asm/gic-v2.h   | 74 ++
 lib/arm/asm/gic.h  | 20 ++
 lib/arm/gic.c  | 69 ++
 lib/arm64/asm/gic-v2.h |  1 +
 lib/arm64/asm/gic.h|  1 +
 6 files changed, 166 insertions(+)
 create mode 100644 lib/arm/asm/gic-v2.h
 create mode 100644 lib/arm/asm/gic.h
 create mode 100644 lib/arm/gic.c
 create mode 100644 lib/arm64/asm/gic-v2.h
 create mode 100644 lib/arm64/asm/gic.h

diff --git a/arm/Makefile.common b/arm/Makefile.common
index ccb554d9251a4..41239c37e0920 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -42,6 +42,7 @@ cflatobjs += lib/arm/mmu.o
 cflatobjs += lib/arm/bitops.o
 cflatobjs += lib/arm/psci.o
 cflatobjs += lib/arm/smp.o
+cflatobjs += lib/arm/gic.o
 
 libeabi = lib/arm/libeabi.a
 eabiobjs = lib/arm/eabi_compat.o
diff --git a/lib/arm/asm/gic-v2.h b/lib/arm/asm/gic-v2.h
new file mode 100644
index 0..973c2bf3cc796
--- /dev/null
+++ b/lib/arm/asm/gic-v2.h
@@ -0,0 +1,74 @@
+/*
+ * All GIC* defines are lifted from include/linux/irqchip/arm-gic.h
+ *
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#ifndef _ASMARM_GIC_V2_H_
+#define _ASMARM_GIC_V2_H_
+
+#define GIC_CPU_CTRL   0x00
+#define GIC_CPU_PRIMASK0x04
+#define GIC_CPU_BINPOINT   0x08
+#define GIC_CPU_INTACK 0x0c
+#define GIC_CPU_EOI0x10
+#define GIC_CPU_RUNNINGPRI 0x14
+#define GIC_CPU_HIGHPRI0x18
+#define GIC_CPU_ALIAS_BINPOINT 0x1c
+#define GIC_CPU_ACTIVEPRIO 0xd0
+#define GIC_CPU_IDENT  0xfc
+#define GIC_CPU_DEACTIVATE 0x1000
+
+#define GICC_ENABLE0x1
+#define GICC_INT_PRI_THRESHOLD 0xf0
+
+#define GIC_CPU_CTRL_EOImodeNS (1 << 9)
+
+#define GICC_IAR_INT_ID_MASK   0x3ff
+#define GICC_INT_SPURIOUS  1023
+#define GICC_DIS_BYPASS_MASK   0x1e0
+
+#define GIC_DIST_CTRL  0x000
+#define GIC_DIST_CTR   0x004
+#define GIC_DIST_IGROUP0x080
+#define GIC_DIST_ENABLE_SET0x100
+#define GIC_DIST_ENABLE_CLEAR  0x180
+#define GIC_DIST_PENDING_SET   0x200
+#define GIC_DIST_PENDING_CLEAR 0x280
+#define GIC_DIST_ACTIVE_SET0x300
+#define GIC_DIST_ACTIVE_CLEAR  0x380
+#define GIC_DIST_PRI   0x400
+#define GIC_DIST_TARGET0x800
+#define GIC_DIST_CONFIG0xc00
+#define GIC_DIST_SOFTINT   0xf00
+#define GIC_DIST_SGI_PENDING_CLEAR 0xf10
+#define GIC_DIST_SGI_PENDING_SET   0xf20
+
+#define GICD_ENABLE0x1
+#define GICD_DISABLE   0x0
+#define GICD_INT_ACTLOW_LVLTRIG0x0
+#define GICD_INT_EN_CLR_X320x
+#define GICD_INT_EN_SET_SGI0x
+#define GICD_INT_EN_CLR_PPI0x
+#define GICD_INT_DEF_PRI   0xa0
+#define GICD_INT_DEF_PRI_X4((GICD_INT_DEF_PRI << 24) |\
+   (GICD_INT_DEF_PRI << 16) |\
+   (GICD_INT_DEF_PRI << 8) |\
+   GICD_INT_DEF_PRI)
+#ifndef __ASSEMBLY__
+
+struct gicv2_data {
+   void *dist_base;
+   void *cpu_base;
+};
+extern struct gicv2_data gicv2_data;
+
+#define gicv2_dist_base()  (gicv2_data.dist_base)
+#define gicv2_cpu_base()   (gicv2_data.cpu_base)
+
+extern int gicv2_init(void);
+extern void gicv2_enable_defaults(void);
+
+#endif /* !__ASSEMBLY__ */
+#endif /* _ASMARM_GIC_V2_H_ */
diff --git a/lib/arm/asm/gic.h b/lib/arm/asm/gic.h
new file mode 100644
index 0..b1237d1c5ef22
--- /dev/null
+++ b/lib/arm/asm/gic.h
@@ -0,0 +1,20 @@
+/*
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#ifndef _ASMARM_GIC_H_
+#define _ASMARM_GIC_H_
+
+#include 
+
+/*
+ * gic_init will try to find all known gics, and then
+ * initialize the gic data for the one found.
+ * returns
+ *  0   : no gic was found
+ *  > 0 : the gic version of the gic found
+ */
+extern int gic_init(void);
+
+#endif /* _ASMARM_GIC_H_ */
diff --git a/lib/arm/gic.c b/lib/arm/gic.c
new file mode 100644
index 0..64a3049c9e8ce
--- /dev/null
+++ b/lib/arm/gic.c
@@ -0,0 +1,69 @@
+/*
+ * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
+ *
+ * This work is licensed under the terms of the GNU LGPL, version 2.
+ */
+#include 
+#include

[Qemu-devel] [kvm-unit-tests PATCH v3 00/10] arm/arm64: add gic framework

2016-07-15 Thread Andrew Jones

v3:
 - Rebased on latest master
 - Added Alex's r-b's

v2:
 Rebased on latest master + my "populate argv[0]" series (will
 send a REPOST for that shortly. Additionally a few patches got
 fixes/features;
 07/10 got same fix as kernel 7c9b973061 "irqchip/gic-v3: Configure
   all interrupts as non-secure Group-1" in order to continue
   working over TCG, as the gicv3 code for TCG removed a hack
   it had there to make Linux happy.
 08/10 added more output for when things fail (if they fail)
 09/10 switched gicv3 broadcast implementation to using IRM. This
   found a bug in a recent (but not tip) kernel, which I was
   about to fix, but then I saw MarcZ beat me to it.
 10/10 actually check that the input irq is the received irq


Import defines, and steal enough helper functions, from Linux to
enable programming of the gic (v2 and v3). Then use the framework
to add an initial test (an ipi test; self, target-list, broadcast).

It's my hope that this framework will be a suitable base on which
more tests may be easily added, particularly because we have
vgic-new and tcg gicv3 emulation getting close to merge. (v3 UPDATE:
vgic-new and tcg gicv3 are merged now)

To run it, along with other tests, just do

 ./configure [ --arch=[arm|arm64] --cross-prefix=$PREFIX ]
 make
 export QEMU=$PATH_TO_QEMU
 ./run_tests.sh

To run it separately do, e.g.

$QEMU -machine virt,accel=tcg -cpu cortex-a57 \
 -device virtio-serial-device \
 -device virtconsole,chardev=ctd -chardev testdev,id=ctd \
 -display none -serial stdio \
 -kernel arm/gic.flat \
 -smp 123 -machine gic-version=3 -append ipi

  ^^ note, we can go nuts with nr-cpus on TCG :-)

Or, a KVM example using a different "sender" cpu and irq (other than zero)

$QEMU -machine virt,accel=kvm -cpu host \
 -device virtio-serial-device \
 -device virtconsole,chardev=ctd -chardev testdev,id=ctd \
 -display none -serial stdio \
 -kernel arm/gic.flat \
 -smp 48 -machine gic-version=3 -append 'ipi sender=42 irq=1'


Patches:
01-05: fixes and functionality needed by the later gic patches
06-07: code theft from Linux (defines, helper functions)
08-10: arm/gic.flat (the base of the gic unit test), currently just IPI

Available here: https://github.com/rhdrjones/kvm-unit-tests/commits/arm/gic


Andrew Jones (10):
  lib: xstr: allow multiple args
  arm64: fix get_"sysreg32" and make MPIDR 64bit
  arm/arm64: smp: support more than 8 cpus
  arm/arm64: add some delay routines
  arm/arm64: irq enable/disable
  arm/arm64: add initial gicv2 support
  arm/arm64: add initial gicv3 support
  arm/arm64: gicv2: add an IPI test
  arm/arm64: gicv3: add an IPI test
  arm/arm64: gic: don't just use zero

 arm/Makefile.common|   7 +-
 arm/gic.c  | 381 +
 arm/run|  19 ++-
 arm/selftest.c |   5 +-
 arm/unittests.cfg  |  13 ++
 lib/arm/asm/arch_gicv3.h   | 184 ++
 lib/arm/asm/gic-v2.h   |  74 +
 lib/arm/asm/gic-v3.h   | 321 ++
 lib/arm/asm/gic.h  |  21 +++
 lib/arm/asm/processor.h|  38 -
 lib/arm/asm/setup.h|   4 +-
 lib/arm/gic.c  | 142 +
 lib/arm/processor.c|  15 ++
 lib/arm/setup.c|  12 +-
 lib/arm64/asm/arch_gicv3.h | 169 
 lib/arm64/asm/gic-v2.h |   1 +
 lib/arm64/asm/gic-v3.h |   1 +
 lib/arm64/asm/gic.h|   1 +
 lib/arm64/asm/processor.h  |  53 ++-
 lib/arm64/asm/sysreg.h |  44 ++
 lib/arm64/processor.c  |  15 ++
 lib/libcflat.h |   4 +-
 22 files changed, 1498 insertions(+), 26 deletions(-)
 create mode 100644 arm/gic.c
 create mode 100644 lib/arm/asm/arch_gicv3.h
 create mode 100644 lib/arm/asm/gic-v2.h
 create mode 100644 lib/arm/asm/gic-v3.h
 create mode 100644 lib/arm/asm/gic.h
 create mode 100644 lib/arm/gic.c
 create mode 100644 lib/arm64/asm/arch_gicv3.h
 create mode 100644 lib/arm64/asm/gic-v2.h
 create mode 100644 lib/arm64/asm/gic-v3.h
 create mode 100644 lib/arm64/asm/gic.h
 create mode 100644 lib/arm64/asm/sysreg.h

-- 
2.7.4

[Qemu-devel] [kvm-unit-tests PATCH v3 04/10] arm/arm64: add some delay routines

2016-07-15 Thread Andrew Jones

Allow a thread to wait some specified amount of time. Can
specify in cycles, usecs, and msecs.

Reviewed-by: Alex Bennée 
Signed-off-by: Andrew Jones 
---
 lib/arm/asm/processor.h   | 19 +++
 lib/arm/processor.c   | 15 +++
 lib/arm64/asm/processor.h | 19 +++
 lib/arm64/processor.c | 15 +++
 4 files changed, 68 insertions(+)

diff --git a/lib/arm/asm/processor.h b/lib/arm/asm/processor.h
index d2048f5f5f7e6..afc903ca7d4ab 100644
--- a/lib/arm/asm/processor.h
+++ b/lib/arm/asm/processor.h
@@ -5,7 +5,9 @@
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
+#include 
 #include 
+#include 
 
 enum vector {
EXCPTN_RST,
@@ -51,4 +53,21 @@ extern int mpidr_to_cpu(unsigned long mpidr);
 extern void start_usr(void (*func)(void *arg), void *arg, unsigned long 
sp_usr);
 extern bool is_user(void);
 
+static inline u64 get_cntvct(void)
+{
+   u64 vct;
+   isb();
+   asm volatile("mrrc p15, 1, %Q0, %R0, c14" : "=r" (vct));
+   return vct;
+}
+
+extern void delay(u64 cycles);
+extern void udelay(unsigned long usecs);
+
+static inline void mdelay(unsigned long msecs)
+{
+   while (msecs--)
+   udelay(1000);
+}
+
 #endif /* _ASMARM_PROCESSOR_H_ */
diff --git a/lib/arm/processor.c b/lib/arm/processor.c
index 54fdb87ef0196..c2ee360df6884 100644
--- a/lib/arm/processor.c
+++ b/lib/arm/processor.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const char *processor_modes[] = {
"USER_26", "FIQ_26" , "IRQ_26" , "SVC_26" ,
@@ -141,3 +142,17 @@ bool is_user(void)
 {
return current_thread_info()->flags & TIF_USER_MODE;
 }
+
+void delay(u64 cycles)
+{
+   u64 start = get_cntvct();
+   while ((get_cntvct() - start) < cycles)
+   cpu_relax();
+}
+
+void udelay(unsigned long usec)
+{
+   unsigned int frq;
+   asm volatile("mrc p15, 0, %0, c14, c0, 0" : "=r" (frq));
+   delay((u64)usec * frq / 100);
+}
diff --git a/lib/arm64/asm/processor.h b/lib/arm64/asm/processor.h
index 7e448dc81a6aa..94f7ce35b65c1 100644
--- a/lib/arm64/asm/processor.h
+++ b/lib/arm64/asm/processor.h
@@ -17,8 +17,10 @@
 #define SCTLR_EL1_M(1 << 0)
 
 #ifndef __ASSEMBLY__
+#include 
 #include 
 #include 
+#include 
 
 enum vector {
EL1T_SYNC,
@@ -89,5 +91,22 @@ extern int mpidr_to_cpu(unsigned long mpidr);
 extern void start_usr(void (*func)(void *arg), void *arg, unsigned long 
sp_usr);
 extern bool is_user(void);
 
+static inline u64 get_cntvct(void)
+{
+   u64 vct;
+   isb();
+   asm volatile("mrs %0, cntvct_el0" : "=r" (vct));
+   return vct;
+}
+
+extern void delay(u64 cycles);
+extern void udelay(unsigned long usecs);
+
+static inline void mdelay(unsigned long msecs)
+{
+   while (msecs--)
+   udelay(1000);
+}
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASMARM64_PROCESSOR_H_ */
diff --git a/lib/arm64/processor.c b/lib/arm64/processor.c
index deeab4ec9c8ac..50fa835c6f1e3 100644
--- a/lib/arm64/processor.c
+++ b/lib/arm64/processor.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 
 static const char *vector_names[] = {
"el1t_sync",
@@ -253,3 +254,17 @@ bool is_user(void)
 {
return current_thread_info()->flags & TIF_USER_MODE;
 }
+
+void delay(u64 cycles)
+{
+   u64 start = get_cntvct();
+   while ((get_cntvct() - start) < cycles)
+   cpu_relax();
+}
+
+void udelay(unsigned long usec)
+{
+   unsigned int frq;
+   asm volatile("mrs %0, cntfrq_el0" : "=r" (frq));
+   delay((u64)usec * frq / 100);
+}
-- 
2.7.4

Re: [Qemu-devel] [PATCH v3 0/2] Report format specific info for LUKS block driver

2016-07-15 Thread Max Reitz

On 14.06.2016 18:24, Daniel P. Berrange wrote:
> This is a followup to:
> 
>   v1: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg01723.html
>   v2: https://lists.gnu.org/archive/html/qemu-devel/2016-06/msg03642.html
> 
> The 'qemu-img info' tool has ability to print format specific
> information, eg with qcow2 it reports two extra items:
> 
>   $ qemu-img info ~/VirtualMachines/demo.qcow2
>   image: /home/berrange/VirtualMachines/demo.qcow2
>   file format: qcow2
>   virtual size: 3.0G (3221225472 bytes)
>   disk size: 140K
>   cluster_size: 65536
>   Format specific information:
>   compat: 0.10
>   refcount bits: 16
> 
> 
> This is not currently wired up for the LUKS driver. This patch
> series adds that support so that we can report useful data about
> the LUKS volume such as the crypto algorithm choices, key slot
> usage and other volume metadata.
> 
> The first patch extends the crypto API to allow querying of the
> format specific metadata
> 
> The second patches extends the block API to allow the LUKS driver
> to report the format specific metadata.
> 
> $ qemu-img info ~/VirtualMachines/demo.luks
> image: /home/berrange/VirtualMachines/demo.luks
> file format: luks
> virtual size: 98M (102760448 bytes)
> disk size: 100M
> encrypted: yes
> Format specific information:
> ivgen alg: plain64
> hash alg: sha1
> cipher alg: aes-128
> uuid: 6ddee74b-3a22-408c-8909-6789d4fa2594
> cipher mode: xts
> slots:
> [0]:
> active: true
> iters: 572706
> key offset: 4096
> stripes: 4000
> [1]:
> active: false
> key offset: 135168
> [2]:
> active: false
> key offset: 266240
> [3]:
> active: false
> key offset: 397312
> [4]:
> active: false
> key offset: 528384
> [5]:
> active: false
> key offset: 659456
> [6]:
> active: false
> key offset: 790528
> [7]:
> active: false
> key offset: 921600
> payload offset: 2097152
> master key iters: 142375
> 
> Technically most of the code changes here are in the crypto
> layer, rather than the block layer. I'm fine with both patches
> going through the block maintainer tree, or can submit a both
> patches myself as, for sake of simplicity of merge.
> 
> Changed in v3:
> 
>  - Do full struct copy instead of field-by-field copy (Max)
>  - Simplify handling of linked list pointers (Max)
>  - Use g_strndup with uuid to guarantee null termination (Max)
>  - Misc typos (Max)
> 
> Changed in v2:
> 
>  - Drop patches related to creating a text output visitor to
>format the ImageInfoSpecific data. This will be continued
>in a separate patch series
>  - Fix key offset to be in bytes instead of sectors
>  - Drop the duplicated ImageInfoSpecificLUKS type and just
>directly use QCryptoBlockInfoLUKS type in block layer
>  - Skip reporting stripes/iters if keyslot is inactive
>  - Add missing QAPI schema docs
> 
> 
> 
> Daniel P. Berrange (2):
>   crypto: add support for querying parameters for block encryption
>   block: export LUKS specific data to qemu-img info
> 
>  block/crypto.c | 49 
>  crypto/block-luks.c| 67 
>  crypto/block.c | 17 +++
>  crypto/blockpriv.h |  4 +++
>  include/crypto/block.h | 16 +++
>  qapi/block-core.json   |  6 +++-
>  qapi/crypto.json   | 76 
> ++
>  7 files changed, 234 insertions(+), 1 deletion(-)

Thanks, I've applied the series to my block branch:

https://github.com/XanClic/qemu/commits/block

Max



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v7 1/4] ide: refactor retry_unit set and clear into separate function

2016-07-15 Thread Denis V. Lunev

From: Evgeny Yakovlev 

Code to set and clear state associated with retry in moved into
ide_set_retry and ide_clear_retry to make adding retry setups easier.

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Denis V. Lunev 
Reviewed-by: Paolo Bonzini 
CC: Kevin Wolf 
CC: Max Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: John Snow 
---
 hw/ide/core.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index 029f6b9..b72346e 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -466,6 +466,20 @@ void ide_abort_command(IDEState *s)
 s->error = ABRT_ERR;
 }
 
+static void ide_set_retry(IDEState *s)
+{
+s->bus->retry_unit = s->unit;
+s->bus->retry_sector_num = ide_get_sector(s);
+s->bus->retry_nsector = s->nsector;
+}
+
+static void ide_clear_retry(IDEState *s)
+{
+s->bus->retry_unit = -1;
+s->bus->retry_sector_num = 0;
+s->bus->retry_nsector = 0;
+}
+
 /* prepare data transfer and tell what to do after */
 void ide_transfer_start(IDEState *s, uint8_t *buf, int size,
 EndTransferFunc *end_transfer_func)
@@ -756,9 +770,7 @@ void dma_buf_commit(IDEState *s, uint32_t tx_bytes)
 void ide_set_inactive(IDEState *s, bool more)
 {
 s->bus->dma->aiocb = NULL;
-s->bus->retry_unit = -1;
-s->bus->retry_sector_num = 0;
-s->bus->retry_nsector = 0;
+ide_clear_retry(s);
 if (s->bus->dma->ops->set_inactive) {
 s->bus->dma->ops->set_inactive(s->bus->dma, more);
 }
@@ -914,9 +926,7 @@ static void ide_sector_start_dma(IDEState *s, enum 
ide_dma_cmd dma_cmd)
 void ide_start_dma(IDEState *s, BlockCompletionFunc *cb)
 {
 s->io_buffer_index = 0;
-s->bus->retry_unit = s->unit;
-s->bus->retry_sector_num = ide_get_sector(s);
-s->bus->retry_nsector = s->nsector;
+ide_set_retry(s);
 if (s->bus->dma->ops->start_dma) {
 s->bus->dma->ops->start_dma(s->bus->dma, s, cb);
 }
-- 
2.1.4

[Qemu-devel] [PATCH v7 0/4] block: ignore flush requests when storage is clean

2016-07-15 Thread Denis V. Lunev

Changes from v6:
- squashed patches 5-6 into patch 4 to avoid test faults on git bissect
- changed sector number from 0 to 1 in patch 3

Changes from v5:
- Removed failed flush traces in block tests 026 071 089
- Changed BLOCK_JOB_READY event order in block tests 141 144

Changes from v4:
- Moved to write generation scheme instead of dirty flag
- Added retry setup to IDE PIO and FLUSH requests

Changes from v3:
- Fixed a typo in commit message
- Rebased on Kevin'n origin/block

Changes from v2:
- Better comments
- Rebased on latest master

Changes from v1:
- Flush requests that should be skipped will now wait for completion
of any previous requests already in flight
- Fixed IDE and AHCI tests to dirty media for new flush behaviour
- Fixed a problem in IDE CMD_FLUSH_CACHE failure handling

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Denis V. Lunev 
CC: Kevin Wolf 
CC: Max Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: John Snow 

Evgeny Yakovlev (4):
  ide: refactor retry_unit set and clear into separate function
  ide: set retry_unit for PIO and FLUSH requests
  tests: in IDE and AHCI tests perform DMA write before flushing
  block: ignore flush requests when storage is clean

 block.c|  3 +++
 block/io.c | 21 
 hw/ide/core.c  | 24 +-
 include/block/block_int.h  |  5 
 tests/ahci-test.c  | 34 --
 tests/ide-test.c   | 43 
 tests/qemu-iotests/026.out.nocache | 50 --
 tests/qemu-iotests/071.out |  8 --
 tests/qemu-iotests/089.out |  2 --
 tests/qemu-iotests/141.out |  4 +--
 tests/qemu-iotests/144.out |  2 +-
 11 files changed, 125 insertions(+), 71 deletions(-)

-- 
2.1.4

Re: [Qemu-devel] [PATCH v18 4/4] block/gluster: add support for multiple gluster servers

2016-07-15 Thread Prasanna Kalever

On Thu, Jul 14, 2016 at 5:35 PM, Markus Armbruster  wrote:
> Interface and error message review only.
>
> Prasanna Kumar Kalever  writes:
>
>> This patch adds a way to specify multiple volfile servers to the gluster
>> block backend of QEMU with tcp|rdma transport types and their port numbers.
>>
>> Problem:
>>
>> Currently VM Image on gluster volume is specified like this:
>>
>> file=gluster[+tcp]://host[:port]/testvol/a.img
>>
>> Assuming we have three hosts in trusted pool with replica 3 volume
>> in action and unfortunately host (mentioned in the command above) went down
>> for some reason, since the volume is replica 3 we now have other 2 hosts
>> active from which we can boot the VM.
>>
>> But currently there is no mechanism to pass the other 2 gluster host
>> addresses to qemu.
>
> Awkward.  Perhaps:
>
>   Say we have three hosts in a trusted pool with replica 3 volume
>   in action.  When the host mentioned in the command above goes down
>   for some reason, the other two hosts are still available.  But there's
>   currently no way to tell QEMU about them.

Will incorporate in v19

>
>> Solution:
>>
>> New way of specifying VM Image on gluster volume with volfile servers:
>> (We still support old syntax to maintain backward compatibility)
>>
>> Basic command line syntax looks like:
>>
>> Pattern I:
>>  -drive driver=gluster,
>> volume=testvol,path=/path/a.raw,
>> server.0.host=1.2.3.4,
>>[server.0.port=24007,]
>>[server.0.transport=tcp,]
>> server.1.host=5.6.7.8,
>>[server.1.port=24008,]
>>[server.1.transport=rdma,]
>
> Don't forget to update this line when you drop transport 'rdma'.  More
> of the same below.
>
>> server.2.host=/var/run/glusterd.socket,
>> server.2.transport=unix ...
>>
>> Pattern II:
>>  'json:{"driver":"qcow2","file":{"driver":"gluster",
>>"volume":"testvol","path":"/path/a.qcow2",
>>"server":[{tuple0},{tuple1}, ...{tupleN}]}}'
>
> Suggest to add -drive here, for symmetry with pattern I.
>
> JSON calls the things in { ... } objects, not tuples.  Let's stick to
> JSON terminology: [{object0},{object1}, ...].  But I think spelling
> things out to a similar degree as in pattern I would be clearer:
>
>-drive 'json:{"driver": "qcow2",
>  "file": { "driver": "gluster",
>"volume": "testvol",
>"path": "/path/a.qcow2",
>"server": [
>   { "host": "1.2.3.4",
> "port": 24007,
> "transport": "tcp" },
>   { "host": "5.6.7.8"
> ... },
>   ... ] }
>  ... }'

IMO, this doesn't work, I have given a try.

>
>>driver  => 'gluster' (protocol name)
>>volume  => name of gluster volume where our VM image resides
>>path=> absolute path of image in gluster volume
>>
>>   {tuple}  => {"host":"1.2.3.4"[,"port":"24007","transport":"tcp"]}
>>
>>host=> host address (hostname/ipv4/ipv6 addresses/socket path)
>>port=> port number on which glusterd is listening. (default 24007)
>>transport   => transport type used to connect to gluster management 
>> daemon,
>>it can be tcp|rdma|unix (default 'tcp')
>>
>> Examples:
>> 1.
>>  -drive driver=qcow2,file.driver=gluster,
>> file.volume=testvol,file.path=/path/a.qcow2,
>> file.server.0.host=1.2.3.4,
>> file.server.0.port=24007,
>> file.server.0.transport=tcp,
>> file.server.1.host=5.6.7.8,
>> file.server.1.port=24008,
>> file.server.1.transport=rdma,
>> file.server.2.host=/var/run/glusterd.socket
>> file.server.1.transport=unix
>> 2.
>>  'json:{"driver":"qcow2","file":{"driver":"gluster","volume":"testvol",
>>  "path":"/path/a.qcow2","server":
>>  [{"host":"1.2.3.4","port":"24007","transport":"tcp"},
>>   {"host":"4.5.6.7","port":"24008","transport":"rdma"},
>>   {"host":"/var/run/glusterd.socket","transport":"unix"}] } }'
>
> Ah, working examples.  Good!  No need to expand your abbreviated
> description of pattern II then, just say object instead of tuple.  But if
> you prefer to expand it, go ahead, your choice.
>
>> This patch gives a mechanism to provide all the server addresses, which are 
>> in
>> replica set, so in case host1 is down VM can still boot from any of the
>> active hosts.
>>
>> This is equivalent to the backup-volfile-servers option supported by
>> mount.glusterfs (FUSE way of mounting gluster volume)
>>
>> Credits: Sincere thanks to Kevin Wolf  and
>> "Deepak C Shetty"  for inputs and all their support
>>
>> Signed-off-by: Prasanna Kumar Kalever 
>> ---
>>  block/gluster.c

Re: [Qemu-devel] [PATCH v3 00/14] virtio migration: Flip outer layer to vmstate

2016-07-15 Thread Amit Shah

On (Fri) 15 Jul 2016 [10:23:10], Cornelia Huck wrote:
> On Thu, 14 Jul 2016 18:22:42 +0100
> "Dr. David Alan Gilbert (git)"  wrote:
> 
> > From: "Dr. David Alan Gilbert" 
> > 
> > Hi,
> >   This series converts the outer most layer of virtio to
> > use VMState macros;  this is the easy bit, but I'm hoping that
> > having done that, the next trick is to nibble away at the virtio_save/load
> > functions and all of the zillions of device/bus helpers.
> 
> Looks good, and I'd like to see this in 2.7.

Michael mentioned he's interested in reviewing, so we'll wait a few
days for him to go through this.

Amit

[Qemu-devel] [kvm-unit-tests PATCH v3 01/10] lib: xstr: allow multiple args

2016-07-15 Thread Andrew Jones

Make implementation equivalent to Linux's include/linux/stringify.h

Signed-off-by: Andrew Jones 
---
 lib/libcflat.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/libcflat.h b/lib/libcflat.h
index 72b1bf9668ef1..82005f5d014fb 100644
--- a/lib/libcflat.h
+++ b/lib/libcflat.h
@@ -27,8 +27,8 @@
 
 #define __unused __attribute__((__unused__))
 
-#define xstr(s) xxstr(s)
-#define xxstr(s) #s
+#define xstr(s...) xxstr(s)
+#define xxstr(s...) #s
 
 #define __ALIGN_MASK(x, mask)  (((x) + (mask)) & ~(mask))
 #define __ALIGN(x, a)  __ALIGN_MASK(x, (typeof(x))(a) - 1)
-- 
2.7.4

[Qemu-devel] [kvm-unit-tests PATCH v3 02/10] arm64: fix get_"sysreg32" and make MPIDR 64bit

2016-07-15 Thread Andrew Jones

mrs is always 64bit, so we should always use a 64bit register.
Sometimes we'll only want to return the lower 32, but not for
MPIDR, as that does define fields in the upper 32.

Reviewed-by: Alex Bennée 
Signed-off-by: Andrew Jones 
---
 lib/arm64/asm/processor.h | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/lib/arm64/asm/processor.h b/lib/arm64/asm/processor.h
index 84d5c7ce752b0..9a208ff729b7e 100644
--- a/lib/arm64/asm/processor.h
+++ b/lib/arm64/asm/processor.h
@@ -66,14 +66,17 @@ static inline unsigned long current_level(void)
return el & 0xc;
 }
 
-#define DEFINE_GET_SYSREG32(reg)   \
-static inline unsigned int get_##reg(void) \
+#define DEFINE_GET_SYSREG(reg, type)   \
+static inline type get_##reg(void) \
 {  \
-   unsigned int reg;   \
-   asm volatile("mrs %0, " #reg "_el1" : "=r" (reg));  \
-   return reg; \
+   unsigned long r;\
+   asm volatile("mrs %0, " #reg "_el1" : "=r" (r));\
+   return (type)r; \
 }
-DEFINE_GET_SYSREG32(mpidr)
+#define DEFINE_GET_SYSREG32(reg) DEFINE_GET_SYSREG(reg, unsigned int)
+#define DEFINE_GET_SYSREG64(reg) DEFINE_GET_SYSREG(reg, unsigned long)
+
+DEFINE_GET_SYSREG64(mpidr)
 
 /* Only support Aff0 for now, gicv2 only */
 #define mpidr_to_cpu(mpidr) ((int)((mpidr) & 0xff))
-- 
2.7.4

[Qemu-devel] [kvm-unit-tests PATCH v3 03/10] arm/arm64: smp: support more than 8 cpus

2016-07-15 Thread Andrew Jones

Reviewed-by: Alex Bennée 
Signed-off-by: Andrew Jones 
---
 arm/run   | 19 ---
 arm/selftest.c|  5 -
 lib/arm/asm/processor.h   |  9 +++--
 lib/arm/asm/setup.h   |  4 ++--
 lib/arm/setup.c   | 12 +++-
 lib/arm64/asm/processor.h |  9 +++--
 6 files changed, 43 insertions(+), 15 deletions(-)

diff --git a/arm/run b/arm/run
index a2f35ef6a7e63..2d0698619606e 100755
--- a/arm/run
+++ b/arm/run
@@ -31,13 +31,6 @@ if [ -z "$ACCEL" ]; then
fi
 fi
 
-if [ "$HOST" = "aarch64" ] && [ "$ACCEL" = "kvm" ]; then
-   processor="host"
-   if [ "$ARCH" = "arm" ]; then
-   processor+=",aarch64=off"
-   fi
-fi
-
 qemu="${QEMU:-qemu-system-$ARCH_NAME}"
 qpath=$(which $qemu 2>/dev/null)
 
@@ -53,6 +46,18 @@ fi
 
 M='-machine virt'
 
+if [ "$ACCEL" = "kvm" ]; then
+   if $qemu $M,\? 2>&1 | grep gic-version > /dev/null; then
+   M+=',gic-version=host'
+   fi
+   if [ "$HOST" = "aarch64" ]; then
+   processor="host"
+   if [ "$ARCH" = "arm" ]; then
+   processor+=",aarch64=off"
+   fi
+   fi
+fi
+
 if ! $qemu $M -device '?' 2>&1 | grep virtconsole > /dev/null; then
echo "$qpath doesn't support virtio-console for chr-testdev. Exiting."
exit 2
diff --git a/arm/selftest.c b/arm/selftest.c
index 196164f5313de..2f117f795d2dc 100644
--- a/arm/selftest.c
+++ b/arm/selftest.c
@@ -312,9 +312,10 @@ static bool psci_check(void)
 static cpumask_t smp_reported;
 static void cpu_report(void)
 {
+   unsigned long mpidr = get_mpidr();
int cpu = smp_processor_id();
 
-   report("CPU%d online", true, cpu);
+   report("CPU(%3d) mpidr=%lx", mpidr_to_cpu(mpidr) == cpu, cpu, mpidr);
cpumask_set_cpu(cpu, _reported);
halt();
 }
@@ -343,6 +344,7 @@ int main(int argc, char **argv)
 
} else if (strcmp(argv[1], "smp") == 0) {
 
+   unsigned long mpidr = get_mpidr();
int cpu;
 
report("PSCI version", psci_check());
@@ -353,6 +355,7 @@ int main(int argc, char **argv)
smp_boot_secondary(cpu, cpu_report);
}
 
+   report("CPU(%3d) mpidr=%lx", mpidr_to_cpu(mpidr) == 0, 0, 
mpidr);
cpumask_set_cpu(0, _reported);
while (!cpumask_full(_reported))
cpu_relax();
diff --git a/lib/arm/asm/processor.h b/lib/arm/asm/processor.h
index f25e7eee3666c..d2048f5f5f7e6 100644
--- a/lib/arm/asm/processor.h
+++ b/lib/arm/asm/processor.h
@@ -40,8 +40,13 @@ static inline unsigned int get_mpidr(void)
return mpidr;
 }
 
-/* Only support Aff0 for now, up to 4 cpus */
-#define mpidr_to_cpu(mpidr) ((int)((mpidr) & 0xff))
+#define MPIDR_HWID_BITMASK 0xff
+extern int mpidr_to_cpu(unsigned long mpidr);
+
+#define MPIDR_LEVEL_SHIFT(level) \
+   (((1 << level) >> 1) << 3)
+#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
+   ((mpidr >> MPIDR_LEVEL_SHIFT(level)) & 0xff)
 
 extern void start_usr(void (*func)(void *arg), void *arg, unsigned long 
sp_usr);
 extern bool is_user(void);
diff --git a/lib/arm/asm/setup.h b/lib/arm/asm/setup.h
index cb8fdbd38dd5d..c501c6ddd8657 100644
--- a/lib/arm/asm/setup.h
+++ b/lib/arm/asm/setup.h
@@ -10,8 +10,8 @@
 #include 
 #include 
 
-#define NR_CPUS8
-extern u32 cpus[NR_CPUS];
+#define NR_CPUS255
+extern u64 cpus[NR_CPUS];
 extern int nr_cpus;
 
 #define NR_MEM_REGIONS 8
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 7e7b39f11dde1..b6e2d5815e723 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -24,12 +24,22 @@ extern unsigned long stacktop;
 extern void io_init(void);
 extern void setup_args_progname(const char *args);
 
-u32 cpus[NR_CPUS] = { [0 ... NR_CPUS-1] = (~0U) };
+u64 cpus[NR_CPUS] = { [0 ... NR_CPUS-1] = (~0U) };
 int nr_cpus;
 
 struct mem_region mem_regions[NR_MEM_REGIONS];
 phys_addr_t __phys_offset, __phys_end;
 
+int mpidr_to_cpu(unsigned long mpidr)
+{
+   int i;
+
+   for (i = 0; i < nr_cpus; ++i)
+   if (cpus[i] == (mpidr & MPIDR_HWID_BITMASK))
+   return i;
+   return -1;
+}
+
 static void cpu_set(int fdtnode __unused, u32 regval, void *info __unused)
 {
int cpu = nr_cpus++;
diff --git a/lib/arm64/asm/processor.h b/lib/arm64/asm/processor.h
index 9a208ff729b7e..7e448dc81a6aa 100644
--- a/lib/arm64/asm/processor.h
+++ b/lib/arm64/asm/processor.h
@@ -78,8 +78,13 @@ static inline type get_##reg(void)   
\
 
 DEFINE_GET_SYSREG64(mpidr)
 
-/* Only support Aff0 for now, gicv2 only */
-#define mpidr_to_cpu(mpidr) ((int)((mpidr) & 0xff))
+#define MPIDR_HWID_BITMASK 0xff00ff
+extern int mpidr_to_cpu(unsigned long mpidr);
+
+#define MPIDR_LEVEL_SHIFT(level) \
+   (((1 << level) >> 1) << 3)
+#define MPIDR_AFFINITY_LEVEL(mpidr, level) \
+

[Qemu-devel] [kvm-unit-tests PATCH v3 05/10] arm/arm64: irq enable/disable

2016-07-15 Thread Andrew Jones

Reviewed-by: Alex Bennée 
Signed-off-by: Andrew Jones 
---
 lib/arm/asm/processor.h   | 10 ++
 lib/arm64/asm/processor.h | 10 ++
 2 files changed, 20 insertions(+)

diff --git a/lib/arm/asm/processor.h b/lib/arm/asm/processor.h
index afc903ca7d4ab..75a8d08b89330 100644
--- a/lib/arm/asm/processor.h
+++ b/lib/arm/asm/processor.h
@@ -35,6 +35,16 @@ static inline unsigned long current_cpsr(void)
 
 #define current_mode() (current_cpsr() & MODE_MASK)
 
+static inline void local_irq_enable(void)
+{
+   asm volatile("cpsie i" : : : "memory", "cc");
+}
+
+static inline void local_irq_disable(void)
+{
+   asm volatile("cpsid i" : : : "memory", "cc");
+}
+
 static inline unsigned int get_mpidr(void)
 {
unsigned int mpidr;
diff --git a/lib/arm64/asm/processor.h b/lib/arm64/asm/processor.h
index 94f7ce35b65c1..d54a4ed1c1876 100644
--- a/lib/arm64/asm/processor.h
+++ b/lib/arm64/asm/processor.h
@@ -68,6 +68,16 @@ static inline unsigned long current_level(void)
return el & 0xc;
 }
 
+static inline void local_irq_enable(void)
+{
+   asm volatile("msr daifclr, #2" : : : "memory");
+}
+
+static inline void local_irq_disable(void)
+{
+   asm volatile("msr daifset, #2" : : : "memory");
+}
+
 #define DEFINE_GET_SYSREG(reg, type)   \
 static inline type get_##reg(void) \
 {  \
-- 
2.7.4

[Qemu-devel] [PATCH v7 4/4] block: ignore flush requests when storage is clean

2016-07-15 Thread Denis V. Lunev

From: Evgeny Yakovlev 

Some guests (win2008 server for example) do a lot of unnecessary
flushing when underlying media has not changed. This adds additional
overhead on host when calling fsync/fdatasync.

This change introduces a write generation scheme in BlockDriverState.
Current write generation is checked against last flushed generation to
avoid unnessesary flushes.

The problem with excessive flushing was found by a performance test
which does parallel directory tree creation (from 2 processes).
Results improved from 0.424 loops/sec to 0.432 loops/sec.
Each loop creates 10^3 directories with 10 files in each.

This affected some blkdebug testcases that were expecting error logs from
failure-injected flushes which are now skipped entirely
(tests 026 071 089).

This also affects the performance of block jobs and thus BLOCK_JOB_READY
events for driver-mirror and active block-commit commands now arrives
faster, before QMP send successfully returns to caller (tests 141 144).

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Denis V. Lunev 
Reviewed-by: Paolo Bonzini 
CC: Kevin Wolf 
CC: Max Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: John Snow 
---
 block.c|  3 +++
 block/io.c | 21 
 include/block/block_int.h  |  5 
 tests/qemu-iotests/026.out.nocache | 50 --
 tests/qemu-iotests/071.out |  8 --
 tests/qemu-iotests/089.out |  2 --
 tests/qemu-iotests/141.out |  4 +--
 tests/qemu-iotests/144.out |  2 +-
 8 files changed, 32 insertions(+), 63 deletions(-)

diff --git a/block.c b/block.c
index 823ff1d..060f88e 100644
--- a/block.c
+++ b/block.c
@@ -234,6 +234,8 @@ BlockDriverState *bdrv_new(void)
 bs->refcnt = 1;
 bs->aio_context = qemu_get_aio_context();
 
+qemu_co_queue_init(>flush_queue);
+
 QTAILQ_INSERT_TAIL(_bdrv_states, bs, bs_list);
 
 return bs;
@@ -2472,6 +2474,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
 ret = refresh_total_sectors(bs, offset >> BDRV_SECTOR_BITS);
 bdrv_dirty_bitmap_truncate(bs);
 bdrv_parent_cb_resize(bs);
+++bs->write_gen;
 }
 return ret;
 }
diff --git a/block/io.c b/block/io.c
index 7086908..f181ff7 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1303,6 +1303,7 @@ static int coroutine_fn 
bdrv_aligned_pwritev(BlockDriverState *bs,
 }
 bdrv_debug_event(bs, BLKDBG_PWRITEV_DONE);
 
+++bs->write_gen;
 bdrv_set_dirty(bs, start_sector, end_sector - start_sector);
 
 if (bs->wr_highest_offset < offset + bytes) {
@@ -2235,6 +2236,15 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 
 tracked_request_begin(, bs, 0, 0, BDRV_TRACKED_FLUSH);
 
+int current_gen = bs->write_gen;
+
+/* Wait until any previous flushes are completed */
+while (bs->flush_started_gen != bs->flushed_gen) {
+qemu_co_queue_wait(>flush_queue);
+}
+
+bs->flush_started_gen = current_gen;
+
 /* Write back all layers by calling one driver function */
 if (bs->drv->bdrv_co_flush) {
 ret = bs->drv->bdrv_co_flush(bs);
@@ -2255,6 +2265,11 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 goto flush_parent;
 }
 
+/* Check if we really need to flush anything */
+if (bs->flushed_gen == current_gen) {
+goto flush_parent;
+}
+
 BLKDBG_EVENT(bs->file, BLKDBG_FLUSH_TO_DISK);
 if (bs->drv->bdrv_co_flush_to_disk) {
 ret = bs->drv->bdrv_co_flush_to_disk(bs);
@@ -2285,6 +2300,7 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
  */
 ret = 0;
 }
+
 if (ret < 0) {
 goto out;
 }
@@ -2295,6 +2311,10 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
 flush_parent:
 ret = bs->file ? bdrv_co_flush(bs->file->bs) : 0;
 out:
+/* Notify any pending flushes that we have completed */
+bs->flushed_gen = current_gen;
+qemu_co_queue_restart_all(>flush_queue);
+
 tracked_request_end();
 return ret;
 }
@@ -2420,6 +2440,7 @@ int coroutine_fn bdrv_co_discard(BlockDriverState *bs, 
int64_t sector_num,
 }
 ret = 0;
 out:
+++bs->write_gen;
 bdrv_set_dirty(bs, req.offset >> BDRV_SECTOR_BITS,
req.bytes >> BDRV_SECTOR_BITS);
 tracked_request_end();
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 47b9aac..396bd2b 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -439,6 +439,11 @@ struct BlockDriverState {
 int copy_on_read; /* if nonzero, copy read backing sectors into image.
  note this is a reference count */
 
+CoQueue flush_queue;/* Serializing flush queue */
+unsigned int write_gen; /* Current data generation */
+

Re: [Qemu-devel] [PATCH Qemu] Change spice-server protocol for GL texture passing

2016-07-15 Thread Frediano Ziglio

Forgot to add RFC to the subject

Frediano

> 
> ---
>  ui/spice-core.c|  5 -
>  ui/spice-display.c | 29 -
>  2 files changed, 8 insertions(+), 26 deletions(-)
> 
> diff --git a/ui/spice-core.c b/ui/spice-core.c
> index da05054..f7647f7 100644
> --- a/ui/spice-core.c
> +++ b/ui/spice-core.c
> @@ -828,11 +828,6 @@ void qemu_spice_init(void)
>  
>  #ifdef HAVE_SPICE_GL
>  if (qemu_opt_get_bool(opts, "gl", 0)) {
> -if ((port != 0) || (tls_port != 0)) {
> -error_report("SPICE GL support is local-only for now and "
> - "incompatible with -spice port/tls-port");
> -exit(1);
> -}
>  if (egl_rendernode_init() != 0) {
>  error_report("Failed to initialize EGL render node for SPICE
>  GL");
>  exit(1);
> diff --git a/ui/spice-display.c b/ui/spice-display.c
> index 2a77a54..72137bd 100644
> --- a/ui/spice-display.c
> +++ b/ui/spice-display.c
> @@ -852,6 +852,10 @@ static void qemu_spice_gl_block_timer(void *opaque)
>  static QEMUGLContext qemu_spice_gl_create_context(DisplayChangeListener
>  *dcl,
>QEMUGLParams *params)
>  {
> +SimpleSpiceDisplay *ssd = container_of(dcl, SimpleSpiceDisplay, dcl);
> +
> +spice_qxl_gl_init(>qxl, qemu_egl_display, qemu_egl_rn_ctx);
> +
>  eglMakeCurrent(qemu_egl_display, EGL_NO_SURFACE, EGL_NO_SURFACE,
> qemu_egl_rn_ctx);
>  return qemu_egl_create_context(dcl, params);
> @@ -864,28 +868,11 @@ static void qemu_spice_gl_scanout(DisplayChangeListener
> *dcl,
>uint32_t w, uint32_t h)
>  {
>  SimpleSpiceDisplay *ssd = container_of(dcl, SimpleSpiceDisplay, dcl);
> -EGLint stride = 0, fourcc = 0;
> -int fd = -1;
> -
> -if (tex_id) {
> -fd = egl_get_fd_for_texture(tex_id, , );
> -if (fd < 0) {
> -fprintf(stderr, "%s: failed to get fd for texture\n", __func__);
> -return;
> -}
> -dprint(1, "%s: %dx%d (stride %d, fourcc 0x%x)\n", __func__,
> -   w, h, stride, fourcc);
> -} else {
> -dprint(1, "%s: no texture (no framebuffer)\n", __func__);
> -}
> -
> -assert(!tex_id || fd >= 0);
>  
> -/* note: spice server will close the fd */
> -spice_qxl_gl_scanout(>qxl, fd,
> - surface_width(ssd->ds),
> - surface_height(ssd->ds),
> - stride, fourcc, y_0_top);
> +spice_qxl_gl_scanout_texture(>qxl, tex_id,
> + surface_width(ssd->ds),
> + surface_height(ssd->ds),
> + y_0_top);
>  
>  qemu_spice_gl_monitor_config(ssd, x, y, w, h);
>  }

[Qemu-devel] [PATCH 1/3] linux-user: Fix handling of iovec counts

2016-07-15 Thread Peter Maydell

In the kernel the length of an iovec is generally handled as
an unsigned long, not an integer; fix the parameter to
lock_iovec() accordingly.

Signed-off-by: Peter Maydell 
---
 linux-user/syscall.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 9dbd711..8d36d6c 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -2647,7 +2647,7 @@ static abi_long do_getsockopt(int sockfd, int level, int 
optname,
 }
 
 static struct iovec *lock_iovec(int type, abi_ulong target_addr,
-int count, int copy)
+abi_ulong count, int copy)
 {
 struct target_iovec *target_vec;
 struct iovec *vec;
@@ -2660,7 +2660,7 @@ static struct iovec *lock_iovec(int type, abi_ulong 
target_addr,
 errno = 0;
 return NULL;
 }
-if (count < 0 || count > IOV_MAX) {
+if (count > IOV_MAX) {
 errno = EINVAL;
 return NULL;
 }
@@ -2735,7 +2735,7 @@ static struct iovec *lock_iovec(int type, abi_ulong 
target_addr,
 }
 
 static void unlock_iovec(struct iovec *vec, abi_ulong target_addr,
- int count, int copy)
+ abi_ulong count, int copy)
 {
 struct target_iovec *target_vec;
 int i;
@@ -2962,7 +2962,7 @@ static abi_long do_sendrecvmsg_locked(int fd, struct 
target_msghdr *msgp,
 {
 abi_long ret, len;
 struct msghdr msg;
-int count;
+abi_ulong count;
 struct iovec *vec;
 abi_ulong target_vec;
 
-- 
1.9.1

[Qemu-devel] [PATCH 2/3] linux-user: Fix errno for sendrecvmsg with large iovec length

2016-07-15 Thread Peter Maydell

The sendmsg and recvmsg syscalls use a different errno to indicate
an overlarge iovec length from readv and writev. Handle this
special case in do_sendrcvmsg_locked() to avoid getting the
default errno returned by lock_iovec().

Signed-off-by: Peter Maydell 
---
 linux-user/syscall.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 8d36d6c..a21ee59 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -2985,6 +2985,15 @@ static abi_long do_sendrecvmsg_locked(int fd, struct 
target_msghdr *msgp,
 
 count = tswapal(msgp->msg_iovlen);
 target_vec = tswapal(msgp->msg_iov);
+
+if (count > IOV_MAX) {
+/* sendrcvmsg returns a different errno for this condition than
+ * readv/writev, so we must catch it here before lock_iovec() does.
+ */
+ret = -TARGET_EMSGSIZE;
+goto out2;
+}
+
 vec = lock_iovec(send ? VERIFY_READ : VERIFY_WRITE,
  target_vec, count, send);
 if (vec == NULL) {
-- 
1.9.1

Re: [Qemu-devel] [PATCH v3 01/11] util/qht: Document memory ordering assumptions

2016-07-15 Thread Sergey Fedorov

On 13/07/16 14:13, Paolo Bonzini wrote:
> diff --git a/include/qemu/qht.h b/include/qemu/qht.h
> index 70bfc68..f4f1d55 100644
> --- a/include/qemu/qht.h
> +++ b/include/qemu/qht.h
> @@ -69,6 +69,9 @@ void qht_destroy(struct qht *ht);
>   * Attempting to insert a NULL @p is a bug.
>   * Inserting the same pointer @p with different @hash values is a bug.
>   *
> + * In case of successful operation, smp_wmb() is implied before the pointer 
> is
> + * inserted into the hash table.
> + *
>   * Returns true on sucess.
>   * Returns false if the @p-@hash pair already exists in the hash table.
>   */
> @@ -83,6 +86,8 @@ bool qht_insert(struct qht *ht, void *p, uint32_t hash);
>   *
>   * Needs to be called under an RCU read-critical section.
>   *
> + * smp_read_barrier_depends() is implied before the call to @func.
> + *
>   * The user-provided @func compares pointers in QHT against @userp.
>   * If the function returns true, a match has been found.
>   *
> @@ -105,6 +110,10 @@ void *qht_lookup(struct qht *ht, qht_lookup_func_t func, 
> const void *userp,
>   * This guarantees that concurrent lookups will always compare against valid
>   * data.
>   *
> + * In case of successful operation, a smp_wmb() barrier is implied before and
> + * after the pointer is removed from the hash table.  In other words,
> + * a successful qht_remove acts as a bidirectional write barrier.
> + *

I understand why an implied wmb can be expected after the entry is
removed: so that the caller can trash the contents of the object
removed. However that would require double-check on lookup side to make
sure the entry hadn't been removed after the first lookup (something
like seqlock read side). But I have no idea why we might like to promise
wmb before the removal. Could you please share your point regarding this?

I tempt to remove any promises on qht_remove() because it is not clear
for me what would be the natural expectations on memory ordering.

As of qht_insert() and qht_lookup(), I agree and this is enough
guarantees for the series.

Thanks,
Sergey

>   * Returns true on success.
>   * Returns false if the @p-@hash pair was not found.
>   */
> diff --git a/util/qht.c b/util/qht.c
> index 40d6e21..d38948e 100644
> --- a/util/qht.c
> +++ b/util/qht.c
> @@ -445,7 +445,11 @@ void *qht_do_lookup(struct qht_bucket *head, 
> qht_lookup_func_t func,
>  do {
>  for (i = 0; i < QHT_BUCKET_ENTRIES; i++) {
>  if (b->hashes[i] == hash) {
> -void *p = atomic_read(>pointers[i]);
> +/* The pointer is dereferenced before seqlock_read_retry,
> + * so (unlike qht_insert__locked) we need to use
> + * atomic_rcu_read here.
> + */
> +void *p = atomic_rcu_read(>pointers[i]);
>  
>  if (likely(p) && likely(func(p, userp))) {
>  return p;
> @@ -535,6 +539,7 @@ static bool qht_insert__locked(struct qht *ht, struct 
> qht_map *map,
>  atomic_rcu_set(>next, b);
>  }
>  b->hashes[i] = hash;
> +/* smp_wmb() implicit in seqlock_write_begin.  */
>  atomic_set(>pointers[i], p);
>  seqlock_write_end(>sequence);
>  return true;
> @@ -659,6 +664,9 @@ bool qht_remove__locked(struct qht_map *map, struct 
> qht_bucket *head,
>  }
>  if (q == p) {
>  qht_debug_assert(b->hashes[i] == hash);
> +/* seqlock_write_begin and seqlock_write_end provide write
> + * memory barrier semantics to callers of qht_remove.
> + */
>  seqlock_write_begin(>sequence);
>  qht_bucket_remove_entry(b, i);
>  seqlock_write_end(>sequence);

Re: [Qemu-devel] [PATCH v2] qcow2: do not allocate extra memory

2016-07-15 Thread Max Reitz

On 14.07.2016 18:59, Vladimir Sementsov-Ogievskiy wrote:
> There are no needs to allocate more than one cluster, as we set
> avail_out for deflate to one cluster.
> 
> Zlib docs (http://www.zlib.net/manual.html) says:
> "deflate compresses as much data as possible, and stops when the input
> buffer becomes empty or the output buffer becomes full."
> 
> So, deflate will not write more than avail_out to output buffer. If
> there is no enough space in output buffer for compressed data (it may be
> larger than input data) deflate just returns Z_OK. (if all data is
> compressed and written to output buffer deflate returns Z_STREAM_END).
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
> 
> v2: improve commit message
> 
>  block/qcow.c  | 2 +-
>  block/qcow2.c | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)

Thanks Vladimir, applied to my block branch (with s/no/not/ as proposed
by Eric):

https://github.com/XanClic/qemu/commits/block

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 01/11] util/qht: Document memory ordering assumptions

2016-07-15 Thread Paolo Bonzini



On 15/07/2016 14:37, Sergey Fedorov wrote:
> On 13/07/16 14:13, Paolo Bonzini wrote:
>> diff --git a/include/qemu/qht.h b/include/qemu/qht.h
>> index 70bfc68..f4f1d55 100644
>> --- a/include/qemu/qht.h
>> +++ b/include/qemu/qht.h
>> @@ -69,6 +69,9 @@ void qht_destroy(struct qht *ht);
>>   * Attempting to insert a NULL @p is a bug.
>>   * Inserting the same pointer @p with different @hash values is a bug.
>>   *
>> + * In case of successful operation, smp_wmb() is implied before the pointer 
>> is
>> + * inserted into the hash table.
>> + *
>>   * Returns true on sucess.
>>   * Returns false if the @p-@hash pair already exists in the hash table.
>>   */
>> @@ -83,6 +86,8 @@ bool qht_insert(struct qht *ht, void *p, uint32_t hash);
>>   *
>>   * Needs to be called under an RCU read-critical section.
>>   *
>> + * smp_read_barrier_depends() is implied before the call to @func.
>> + *
>>   * The user-provided @func compares pointers in QHT against @userp.
>>   * If the function returns true, a match has been found.
>>   *
>> @@ -105,6 +110,10 @@ void *qht_lookup(struct qht *ht, qht_lookup_func_t 
>> func, const void *userp,
>>   * This guarantees that concurrent lookups will always compare against valid
>>   * data.
>>   *
>> + * In case of successful operation, a smp_wmb() barrier is implied before 
>> and
>> + * after the pointer is removed from the hash table.  In other words,
>> + * a successful qht_remove acts as a bidirectional write barrier.
>> + *
> 
> I understand why an implied wmb can be expected after the entry is
> removed: so that the caller can trash the contents of the object
> removed. However that would require double-check on lookup side to make
> sure the entry hadn't been removed after the first lookup (something
> like seqlock read side). But I have no idea why we might like to promise
> wmb before the removal. Could you please share your point regarding this?

The reasoning was to make qht_remove() "look to be ordered" with respect
to writes.  So anything after remove wouldn't bleed into it, nor would
anything before.

In the case of this series, it would let you remove the smp_wmb() after
tb_mark_invalid().  However, it's also okay to only handle qht_insert()
and qht_lookup(), and keep the memory barrier after the TB is marked
invalid (though it is unnecessary).

Paolo

> As of qht_insert() and qht_lookup(), I agree and this is enough
> guarantees for the series.
> 
> Thanks,
> Sergey
> 
>>   * Returns true on success.
>>   * Returns false if the @p-@hash pair was not found.
>>   */
>> diff --git a/util/qht.c b/util/qht.c
>> index 40d6e21..d38948e 100644
>> --- a/util/qht.c
>> +++ b/util/qht.c
>> @@ -445,7 +445,11 @@ void *qht_do_lookup(struct qht_bucket *head, 
>> qht_lookup_func_t func,
>>  do {
>>  for (i = 0; i < QHT_BUCKET_ENTRIES; i++) {
>>  if (b->hashes[i] == hash) {
>> -void *p = atomic_read(>pointers[i]);
>> +/* The pointer is dereferenced before seqlock_read_retry,
>> + * so (unlike qht_insert__locked) we need to use
>> + * atomic_rcu_read here.
>> + */
>> +void *p = atomic_rcu_read(>pointers[i]);
>>  
>>  if (likely(p) && likely(func(p, userp))) {
>>  return p;
>> @@ -535,6 +539,7 @@ static bool qht_insert__locked(struct qht *ht, struct 
>> qht_map *map,
>>  atomic_rcu_set(>next, b);
>>  }
>>  b->hashes[i] = hash;
>> +/* smp_wmb() implicit in seqlock_write_begin.  */
>>  atomic_set(>pointers[i], p);
>>  seqlock_write_end(>sequence);
>>  return true;
>> @@ -659,6 +664,9 @@ bool qht_remove__locked(struct qht_map *map, struct 
>> qht_bucket *head,
>>  }
>>  if (q == p) {
>>  qht_debug_assert(b->hashes[i] == hash);
>> +/* seqlock_write_begin and seqlock_write_end provide write
>> + * memory barrier semantics to callers of qht_remove.
>> + */
>>  seqlock_write_begin(>sequence);
>>  qht_bucket_remove_entry(b, i);
>>  seqlock_write_end(>sequence);
>

[Qemu-devel] [kvm-unit-tests PATCH v3 10/10] arm/arm64: gic: don't just use zero

2016-07-15 Thread Andrew Jones

Allow user to select who sends ipis and with which irq,
rather than just always sending irq=0 from cpu0.

Signed-off-by: Andrew Jones 

---
v2: actually check that the irq received was the irq sent,
and (for gicv2) that the sender is the expected one.
---
 arm/gic.c | 80 ++-
 1 file changed, 64 insertions(+), 16 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index fc7ef241de3e2..d3ab97d4ae470 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -11,6 +11,7 @@
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -33,6 +34,8 @@ static struct gic *gic;
 static int gic_version;
 static int acked[NR_CPUS], spurious[NR_CPUS];
 static cpumask_t ready;
+static int sender;
+static u32 irq;
 
 static void nr_cpu_check(int nr)
 {
@@ -85,7 +88,16 @@ static void check_acked(cpumask_t *mask)
 
 static u32 gicv2_read_iar(void)
 {
-   return readl(gicv2_cpu_base() + GIC_CPU_INTACK);
+   u32 iar = readl(gicv2_cpu_base() + GIC_CPU_INTACK);
+   int src = (iar >> 10) & 7;
+
+   if (src != sender) {
+   report("cpu%d received IPI from unexpected source cpu%d "
+  "(expected cpu%d)",
+  false, smp_processor_id(), src, sender);
+   }
+
+   return iar & 0x3ff;
 }
 
 static void gicv2_write_eoi(u32 irq)
@@ -99,9 +111,15 @@ static void ipi_handler(struct pt_regs *regs __unused)
 
if (iar != GICC_INT_SPURIOUS) {
gic->write_eoi(iar);
-   smp_rmb(); /* pairs with wmb in ipi_test functions */
-   ++acked[smp_processor_id()];
-   smp_wmb(); /* pairs with rmb in check_acked */
+   if (iar == irq) {
+   smp_rmb(); /* pairs with wmb in ipi_test functions */
+   ++acked[smp_processor_id()];
+   smp_wmb(); /* pairs with rmb in check_acked */
+   } else {
+   report("cpu%d received unexpected irq %u "
+  "(expected %u)",
+  false, smp_processor_id(), iar, irq);
+   }
} else {
++spurious[smp_processor_id()];
smp_wmb();
@@ -110,19 +128,19 @@ static void ipi_handler(struct pt_regs *regs __unused)
 
 static void gicv2_ipi_send_self(void)
 {
-   writel(2 << 24, gicv2_dist_base() + GIC_DIST_SOFTINT);
+   writel(2 << 24 | irq, gicv2_dist_base() + GIC_DIST_SOFTINT);
 }
 
 static void gicv2_ipi_send_tlist(cpumask_t *mask)
 {
u8 tlist = (u8)cpumask_bits(mask)[0];
 
-   writel(tlist << 16, gicv2_dist_base() + GIC_DIST_SOFTINT);
+   writel(tlist << 16 | irq, gicv2_dist_base() + GIC_DIST_SOFTINT);
 }
 
 static void gicv2_ipi_send_broadcast(void)
 {
-   writel(1 << 24, gicv2_dist_base() + GIC_DIST_SOFTINT);
+   writel(1 << 24 | irq, gicv2_dist_base() + GIC_DIST_SOFTINT);
 }
 
 #define ICC_SGI1R_AFFINITY_1_SHIFT 16
@@ -165,7 +183,7 @@ static void gicv3_ipi_send_tlist(cpumask_t *mask)
 
sgi1r = (MPIDR_TO_SGI_AFFINITY(cluster_id, 3)   |
 MPIDR_TO_SGI_AFFINITY(cluster_id, 2)   |
-/* irq << 24   | */
+irq << 24  |
 MPIDR_TO_SGI_AFFINITY(cluster_id, 1)   |
 tlist);
 
@@ -187,7 +205,7 @@ static void gicv3_ipi_send_self(void)
 
 static void gicv3_ipi_send_broadcast(void)
 {
-   gicv3_write_sgi1r(1ULL << 40);
+   gicv3_write_sgi1r(1ULL << 40 | irq << 24);
isb();
 }
 
@@ -199,7 +217,7 @@ static void ipi_test_self(void)
memset(acked, 0, sizeof(acked));
smp_wmb();
cpumask_clear();
-   cpumask_set_cpu(0, );
+   cpumask_set_cpu(smp_processor_id(), );
gic->ipi.send_self();
check_acked();
report_prefix_pop();
@@ -214,7 +232,7 @@ static void ipi_test_smp(void)
memset(acked, 0, sizeof(acked));
smp_wmb();
cpumask_copy(, _present_mask);
-   for (i = 0; i < nr_cpus; i += 2)
+   for (i = smp_processor_id() & 1; i < nr_cpus; i += 2)
cpumask_clear_cpu(i, );
gic->ipi.send_tlist();
check_acked();
@@ -224,7 +242,7 @@ static void ipi_test_smp(void)
memset(acked, 0, sizeof(acked));
smp_wmb();
cpumask_copy(, _present_mask);
-   cpumask_clear_cpu(0, );
+   cpumask_clear_cpu(smp_processor_id(), );
gic->ipi.send_broadcast();
check_acked();
report_prefix_pop();
@@ -241,6 +259,15 @@ static void ipi_enable(void)
local_irq_enable();
 }
 
+static void ipi_send(void)
+{
+   ipi_enable();
+   wait_on_ready();
+   ipi_test_self();
+   ipi_test_smp();
+   exit(report_summary());
+}
+
 static void ipi_recv(void)
 {
ipi_enable();
@@ -300,19 +327,40 @@ int

[Qemu-devel] [kvm-unit-tests PATCH v3 09/10] arm/arm64: gicv3: add an IPI test

2016-07-15 Thread Andrew Jones

Signed-off-by: Andrew Jones 

---
v2: use IRM for gicv3 broadcast
---
 arm/gic.c | 157 ++
 arm/unittests.cfg |   6 +++
 2 files changed, 154 insertions(+), 9 deletions(-)

diff --git a/arm/gic.c b/arm/gic.c
index cf7ec1c90413c..fc7ef241de3e2 100644
--- a/arm/gic.c
+++ b/arm/gic.c
@@ -3,6 +3,8 @@
  *
  * GICv2
  *   . test sending/receiving IPIs
+ * GICv3
+ *   . test sending/receiving IPIs
  *
  * Copyright (C) 2016, Red Hat Inc, Andrew Jones 
  *
@@ -16,6 +18,18 @@
 #include 
 #include 
 
+struct gic {
+   struct {
+   void (*enable)(void);
+   void (*send_self)(void);
+   void (*send_tlist)(cpumask_t *);
+   void (*send_broadcast)(void);
+   } ipi;
+   u32 (*read_iar)(void);
+   void (*write_eoi)(u32);
+};
+
+static struct gic *gic;
 static int gic_version;
 static int acked[NR_CPUS], spurious[NR_CPUS];
 static cpumask_t ready;
@@ -69,12 +83,22 @@ static void check_acked(cpumask_t *mask)
   false, missing, extra, unexpected);
 }
 
+static u32 gicv2_read_iar(void)
+{
+   return readl(gicv2_cpu_base() + GIC_CPU_INTACK);
+}
+
+static void gicv2_write_eoi(u32 irq)
+{
+   writel(irq, gicv2_cpu_base() + GIC_CPU_EOI);
+}
+
 static void ipi_handler(struct pt_regs *regs __unused)
 {
-   u32 iar = readl(gicv2_cpu_base() + GIC_CPU_INTACK);
+   u32 iar = gic->read_iar();
 
if (iar != GICC_INT_SPURIOUS) {
-   writel(iar, gicv2_cpu_base() + GIC_CPU_EOI);
+   gic->write_eoi(iar);
smp_rmb(); /* pairs with wmb in ipi_test functions */
++acked[smp_processor_id()];
smp_wmb(); /* pairs with rmb in check_acked */
@@ -84,6 +108,89 @@ static void ipi_handler(struct pt_regs *regs __unused)
}
 }
 
+static void gicv2_ipi_send_self(void)
+{
+   writel(2 << 24, gicv2_dist_base() + GIC_DIST_SOFTINT);
+}
+
+static void gicv2_ipi_send_tlist(cpumask_t *mask)
+{
+   u8 tlist = (u8)cpumask_bits(mask)[0];
+
+   writel(tlist << 16, gicv2_dist_base() + GIC_DIST_SOFTINT);
+}
+
+static void gicv2_ipi_send_broadcast(void)
+{
+   writel(1 << 24, gicv2_dist_base() + GIC_DIST_SOFTINT);
+}
+
+#define ICC_SGI1R_AFFINITY_1_SHIFT 16
+#define ICC_SGI1R_AFFINITY_2_SHIFT 32
+#define ICC_SGI1R_AFFINITY_3_SHIFT 48
+#define MPIDR_TO_SGI_AFFINITY(cluster_id, level) \
+   (MPIDR_AFFINITY_LEVEL(cluster_id, level) << ICC_SGI1R_AFFINITY_## level 
## _SHIFT)
+
+static void gicv3_ipi_send_tlist(cpumask_t *mask)
+{
+   u16 tlist;
+   int cpu;
+
+   for_each_cpu(cpu, mask) {
+   u64 mpidr = cpus[cpu], sgi1r;
+   u64 cluster_id = mpidr & ~0xffUL;
+
+   tlist = 0;
+
+   while (cpu < nr_cpus) {
+   if ((mpidr & 0xff) >= 16) {
+   printf("cpu%d MPIDR:aff0 is %d (>= 16)!\n",
+   cpu, (int)(mpidr & 0xff));
+   break;
+   }
+
+   tlist |= 1 << (mpidr & 0xf);
+
+   cpu = cpumask_next(cpu, mask);
+   if (cpu >= nr_cpus)
+   break;
+
+   mpidr = cpus[cpu];
+
+   if (cluster_id != (mpidr & ~0xffUL)) {
+   --cpu;
+   break;
+   }
+   }
+
+   sgi1r = (MPIDR_TO_SGI_AFFINITY(cluster_id, 3)   |
+MPIDR_TO_SGI_AFFINITY(cluster_id, 2)   |
+/* irq << 24   | */
+MPIDR_TO_SGI_AFFINITY(cluster_id, 1)   |
+tlist);
+
+   gicv3_write_sgi1r(sgi1r);
+   }
+
+   /* Force the above writes to ICC_SGI1R_EL1 to be executed */
+   isb();
+}
+
+static void gicv3_ipi_send_self(void)
+{
+   cpumask_t mask;
+
+   cpumask_clear();
+   cpumask_set_cpu(smp_processor_id(), );
+   gicv3_ipi_send_tlist();
+}
+
+static void gicv3_ipi_send_broadcast(void)
+{
+   gicv3_write_sgi1r(1ULL << 40);
+   isb();
+}
+
 static void ipi_test_self(void)
 {
cpumask_t mask;
@@ -93,7 +200,7 @@ static void ipi_test_self(void)
smp_wmb();
cpumask_clear();
cpumask_set_cpu(0, );
-   writel(2 << 24, gicv2_dist_base() + GIC_DIST_SOFTINT);
+   gic->ipi.send_self();
check_acked();
report_prefix_pop();
 }
@@ -101,14 +208,15 @@ static void ipi_test_self(void)
 static void ipi_test_smp(void)
 {
cpumask_t mask;
-   unsigned long tlist;
+   int i;
 
report_prefix_push("target-list");
memset(acked, 0, sizeof(acked));
smp_wmb();
-   tlist = cpumask_bits(_present_mask)[0] & 0xaa;
-   cpumask_bits()[0] = tlist;
-   writel((u8)tlist << 16,

Re: [Qemu-devel] [Xen-devel] Regression with commit 095497ffc66b7f031

2016-07-15 Thread Juergen Gross

On 15/07/16 14:42, Paolo Bonzini wrote:
> 
> 
> On 15/07/2016 12:41, Juergen Gross wrote:
>> On 15/07/16 12:35, Paolo Bonzini wrote:
>>>
>>>
>>> On 15/07/2016 12:12, Gerd Hoffmann wrote:
 On Fr, 2016-07-15 at 12:02 +0200, Paolo Bonzini wrote:
>
> On 15/07/2016 10:47, Juergen Gross wrote:
>> Nothing scaring and no real difference between working and not working
>> variant.
>>
>> Meanwhile I've been digging a little bit deeper and found the reason:
>> libxenstore is setting up a reader thread which is waiting for the
>> watch to fire. With above commit the stack size of that thread (16kB)
>> is too small. Setting it to 32kB made qemu work again.
>
> This makes very little sense (not your fault)...  The commit doesn't
> change stack usage at all, TLS should not be on the stack.
>
> Can you capture a backtrace where the 16K stack is exceeded?  Perhaps
> it's only due to inlining decision on the compiler, in which case
> Peter's patch from today is only a bandaid.

 Hmm, I guess I hold off the vnc pull request for now ...
>>>
>>> Go ahead.  I looked at glibc source code and the patch is okay.
>>
>> Paolo, do you know of any interface to obtain the size of the TLS area
>> taken from the stack (before calling pthread_create() )? 
> 
> https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01643.html has a patch
> that _removes_ code to do this from the golang runtime.  The comments
> there say that only with glibc before version 2.16 the static TLS size
> is taken out of the stack size...
> 
> What version of glibc are you using?

2.19. But according to:

https://sourceware.org/bugzilla/show_bug.cgi?id=11787

the issue is still present today.


Juergen

Re: [Qemu-devel] [PATCH v4 11/11] nbd-server: Allow node name for nbd-server-add

2016-07-15 Thread Max Reitz

On 14.07.2016 23:36, Eric Blake wrote:
> On 07/14/2016 07:28 AM, Kevin Wolf wrote:
>> There is no reason why an NBD server couldn't be started for any node,
>> even if it's not on the top level. This converts nbd-server-add to
>> accept a node-name.
>>
>> Note that there is a semantic difference between using a BlockBackend
>> name and the node name of its root: In the former case, the NBD server
>> is closed on eject; in the latter case, the NBD server doesn't drop its
>> reference and keeps the image file open this way.
>>
>> Signed-off-by: Kevin Wolf 
>> ---
>>  blockdev-nbd.c  | 21 +
>>  qapi/block.json |  4 ++--
>>  2 files changed, 11 insertions(+), 14 deletions(-)
>>
>> diff --git a/blockdev-nbd.c b/blockdev-nbd.c
>> index c437d32..ca41cc6 100644
>> --- a/blockdev-nbd.c
>> +++ b/blockdev-nbd.c
>> @@ -145,7 +145,8 @@ void qmp_nbd_server_start(SocketAddress *addr,
>>  void qmp_nbd_server_add(const char *device, bool has_writable, bool 
>> writable,
>>  Error **errp)
>>  {
>> -BlockBackend *blk;
>> +BlockDriverState *bs = NULL;
>> +BlockBackend *on_eject_blk;
>>  NBDExport *exp;
>>  
>>  if (!nbd_server) {
>> @@ -158,26 +159,22 @@ void qmp_nbd_server_add(const char *device, bool 
>> has_writable, bool writable,
>>  return;
> 
> Do we want to do any sanity checking that writing should only be
> permitted on a root, and that when using a node name that is not a root
> that writable must be false so as not to negatively change the BDS out
> of under the feet of the other root?  Do op-blockers already cover that?

Well, one could argue that it's possible to create an NBD server on a
non-root node today anyway, since creating BBs is not restricted to root
nodes:

blockdev-add(id=foo, other arguments...)
blockdev-add(id=bar, backing=foo, other arguments...)

And then you can create an NBD server on bar. I agree that this is not
how it should be, though. However, I think that the fact that you need
to specify a BB name for now deters people from doing stuff like that.
If you can specify a node name, people will think it's completely fine
to do so.

Also note that only allowing NBD servers to be created on a root node
doesn't really help you:

blockdev-add(node-name=foo, ...)
nbd-server-add(device=foo)
blockdev-add(id=bar, backing=foo, ...)

So, yeah, I think we just need the new op-blockers for this, I don't
think the current op blockers cover this.

Max



signature.asc
Description: OpenPGP digital signature

[Qemu-devel] [PATCH v7 2/4] ide: set retry_unit for PIO and FLUSH requests

2016-07-15 Thread Denis V. Lunev

From: Evgeny Yakovlev 

The following sequence of tests discovered a problem in IDE emulation:
1. Send DMA write to IDE device 0
2. Send CMD_FLUSH_CACHE to same IDE device which will be failed by block
layer using blkdebug script in tests/ide-test:test_retry_flush

When doing DMA request ide/core.c will set s->retry_unit to s->unit in
ide_start_dma. When dma completes ide_set_inactive sets retry_unit to -1.
After that ide_flush_cache runs and fails thanks to blkdebug.
ide_flush_cb calls ide_handle_rw_error which asserts that s->retry_unit
== s->unit. But s->retry_unit is still -1 after previous DMA completion
and flush does not use anything related to retry.

This patch restricts retry unit assertion only to ops that actually use
retry logic.

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Denis V. Lunev 
Reviewed-by: Paolo Bonzini 
CC: Kevin Wolf 
CC: Max Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: John Snow 
---
 hw/ide/core.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/hw/ide/core.c b/hw/ide/core.c
index b72346e..14f03d2 100644
--- a/hw/ide/core.c
+++ b/hw/ide/core.c
@@ -487,6 +487,7 @@ void ide_transfer_start(IDEState *s, uint8_t *buf, int size,
 s->end_transfer_func = end_transfer_func;
 s->data_ptr = buf;
 s->data_end = buf + size;
+ide_set_retry(s);
 if (!(s->status & ERR_STAT)) {
 s->status |= DRQ_STAT;
 }
@@ -1056,6 +1057,7 @@ static void ide_flush_cache(IDEState *s)
 }
 
 s->status |= BUSY_STAT;
+ide_set_retry(s);
 block_acct_start(blk_get_stats(s->blk), >acct, 0, BLOCK_ACCT_FLUSH);
 s->pio_aiocb = blk_aio_flush(s->blk, ide_flush_cb, s);
 }
-- 
2.1.4

[Qemu-devel] [PATCH 0/3] linux-user: fix corner cases in sendrecvmsg

2016-07-15 Thread Peter Maydell

These patches fix some bugs in handling corner cases of
sendrecvmsg; this allows us to pass the LTP 'recvmsg01'
test case.

thanks
-- PMM

Peter Maydell (3):
  linux-user: Fix handling of iovec counts
  linux-user: Fix errno for sendrecvmsg with large iovec length
  linux-user: Allow bad msg_name for recvfrom on connected socket

 linux-user/syscall.c | 28 ++--
 1 file changed, 22 insertions(+), 6 deletions(-)

-- 
1.9.1

Re: [Qemu-devel] [PATCH] qmp: add support for mixed typed input visitor

2016-07-15 Thread Markus Armbruster

Eric Blake  writes:

> On 07/14/2016 08:39 AM, Daniel P. Berrange wrote:
>> On Thu, Jul 14, 2016 at 08:23:18AM -0600, Eric Blake wrote:
>>> On 07/14/2016 08:16 AM, Daniel P. Berrange wrote:
 Add a qmp_mixed_input_visitor_new() method which returns
 a QMP input visitor that accepts either strings or the
 native data types.
>
> Question: do we want to allow: "key":1 when the QAPI is written
> 'key':'str'?  Your current patches allow the converse (allowing
> "key":"1" when the QAPI is written 'key':'int').  To allow native types
> to be consumed in mixed-mode where string is expected would require yet
> another method for deciding how to handle non-strings in
> v->visitor.type_str.  Where it might be useful is in SocketAddress
> parsing, in particular where InetSocketAddress.port is currently 'str'
> but where it often takes an integer port number in addition to a string
> for a named port alias; callers currently have to pass a stringized
> integer, where mixed mode might make it easier to fudge things.

I think we shouldn't do that.  Let me explain why.

The QMP input visitor is designed for QMP input.  Works like this: the
JSON parser converts a JSON value to a QObject, and the QMP input
visitor converts the QObject to a QAPI object.  Observations:

* In JSON, types are obvious.  The JSON parser's conversion to QObject
  is mostly straightforward: JSON object becomes QDict, array becomes
  QList, string becomes QString, false and true become QBool, null
  becomes qnull().  The only complication is JSON number, which becomes
  either QInt or QFloat, depending on its value.

* QInt represents int64_t.  Integers outside its range become QFloat.
  In particular, INT64_MAX+1..UINT64_MAX become QFloat.

* The QMP input visitor checks the QObject's type matches the QAPI
  object's type.  For object and array, this is recursive.  To
  compensate for the split of JSON number into QInt and QFloat, it
  accepts both for QAPI type 'number'.

* Despite its name, the QMP input visitor doesn't visit QMP, it visits a
  QObject.  Makes it useful in other contexts, such as QemuOpts input.

QemuOpts input works like this: the QemuOpts parser converts a
key=value,... string to a QemuOpts, qemu_opts_to_qdict() converts to a
QObject, and the QMP input visitor converts to a QAPI object.
Observations:

* In the key=value,... string, types are ambiguous.  The QemuOpts parser
  disambiguates to bool, uint64_t, char * when given an option
  description (opts->list->desc[] not empty), else it treats all values
  as strings.  Even when it disambiguates, it retains the string value.

* QemuOpts that are ultimately fed to the QMP input visitor typically
  have no option description.  Even if they have one, the types are
  thrown away: qemu_opts_to_qdict() works with the strings.

* Since all scalars are strings, the QMP input visitor's type checking
  will fail when it runs into a scalar QAPI type other than str.  This
  is the problem Dan needs solved.

Let's compare the two pipelines JSON -> QObject -> QAPI object and
key=value,... -> QObject -> QAPI object.  Their first stages are
conceptually similar, and the remaining stages are identical.  The
difference of interest here is how they pick QObject types:

* The JSON pipeline picks in its first stage.

* The QemuOpts pipeline can't do that, but to be able to reuse the rest
  of the pipeline, it arbitrarily picks QString.  Good enough for its
  intial uses.  Not good for the uses we need to support now.

I believe we should change the QemuOpts pipeline to resolve types in its
last stage.  This requires a different input visitor: one that resolves
types rather than checking them.  I guess this is basically what Dan's
"[PATCH v7 3/7] qapi: add a QmpInputVisitor that does string conversion"
does.  It's even less a *QMP* input visitor than the original, but let's
not worry about that now.

Now it gets ugly.

A long time ago, under a lot of pressure to have QMP reach feature
parity with HMP, I shoehorned device_add into QMP.  QAPI didn't exist
back then, so the pipeline was just JSON -> QObject.  I added a ->
QemuOpts stage, done by qemu_opts_from_qdict(), so I could use the
existing qdev_device_add() unmodified.  Despite such shortcuts, it took
me ~50 patches (commit 0aef426..8bc27249).  I've regretted it ever
since.

This JSON -> QObject -> QemuOpts pipeline is problematic: its second
stage throws away all type information.  It preserves JSON string
values, but anything else gets converted to a string.  Which may, if
you're lucky, even resemble your JSON token string.  Examples:

JSONQemuOpts value of key "foo"
 "foo": "abracadabra"   "abracadabra"
 "foo": -1  "-1"
 "foo": 1.024e3 "1024"
 "foo": 9223372036854775808 "9.2233720368547758e+18"
 "foo": 6.022140857e23  "6.022140857002e+23"
 "foo": false   "off"
 "foo": null

Re: [Qemu-devel] [Qemu-block] [PATCH] AioContext: correct comments

2016-07-15 Thread Max Reitz

On 15.07.2016 11:44, Cao jin wrote:
> Correct comments of field notify_me
> 
> Cc: Kevin Wolf 
> Cc: Max Reitz 
> Signed-off-by: Cao jin 
> ---
>  include/block/aio.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/include/block/aio.h b/include/block/aio.h
> index 0922b69..f89a1df 100644
> --- a/include/block/aio.h
> +++ b/include/block/aio.h
> @@ -71,7 +71,7 @@ struct AioContext {
>   * event_notifier_set necessary.
>   *
>   * Bit 0 is reserved for GSource usage of the AioContext, and is 1
> - * between a call to aio_ctx_check and the next call to aio_ctx_dispatch.
> + * between a call to aio_ctx_prepare and the next call to aio_ctx_check.
>   * Bits 1-31 simply count the number of active calls to aio_poll
>   * that are in the prepare or poll phase.
>   *

Reviewed-by: Max Reitz 

CC-ing Stefan and Fam, since I think this file should actually be under
their maintainership (should we add it to MAINTAINERS?).

Max



signature.asc
Description: OpenPGP digital signature

Re: [Qemu-devel] [PATCH v3 01/11] util/qht: Document memory ordering assumptions

2016-07-15 Thread Sergey Fedorov

On 15/07/16 15:51, Paolo Bonzini wrote:
>
> On 15/07/2016 14:37, Sergey Fedorov wrote:
>> I understand why an implied wmb can be expected after the entry is
>> removed: so that the caller can trash the contents of the object
>> removed. However that would require double-check on lookup side to make
>> sure the entry hadn't been removed after the first lookup (something
>> like seqlock read side). But I have no idea why we might like to promise
>> wmb before the removal. Could you please share your point regarding this?
> The reasoning was to make qht_remove() "look to be ordered" with respect
> to writes.  So anything after remove wouldn't bleed into it, nor would
> anything before.
>
> In the case of this series, it would let you remove the smp_wmb() after
> tb_mark_invalid().  However, it's also okay to only handle qht_insert()
> and qht_lookup(), and keep the memory barrier after the TB is marked
> invalid (though it is unnecessary).
>

I'm pretty sure the smp_wmb() after tb_mark_invalid() is unnecessary
anyway. We don't rely on it at all because we're to recheck for
tb_is_invalid() under tb_lock before tb_add_jump(). However we have to
invalidate the CPU state atomically since we're going to check for it
out of tb_lock.

Kind regards,
Sergey

[Qemu-devel] [PATCH v7 3/4] tests: in IDE and AHCI tests perform DMA write before flushing

2016-07-15 Thread Denis V. Lunev

From: Evgeny Yakovlev 

Due to changes in flush behaviour clean disks stopped generating
flush_to_disk events and IDE and AHCI tests that test flush commands
started to fail.

This change adds additional DMA writes to affected tests before sending
flush commands so that bdrv_flush actually generates flush_to_disk event.

Signed-off-by: Evgeny Yakovlev 
Signed-off-by: Denis V. Lunev 
Reviewed-by: Paolo Bonzini 
CC: Kevin Wolf 
CC: Max Reitz 
CC: Stefan Hajnoczi 
CC: Fam Zheng 
CC: John Snow 
---
 tests/ahci-test.c | 34 --
 tests/ide-test.c  | 43 +++
 2 files changed, 75 insertions(+), 2 deletions(-)

diff --git a/tests/ahci-test.c b/tests/ahci-test.c
index 57dc44c..9c0adce 100644
--- a/tests/ahci-test.c
+++ b/tests/ahci-test.c
@@ -1063,11 +1063,34 @@ static void test_dma_fragmented(void)
 g_free(tx);
 }
 
+/*
+ * Write sector 1 with random data to make AHCI storage dirty
+ * Needed for flush tests so that flushes actually go though the block layer
+ */
+static void make_dirty(AHCIQState* ahci, uint8_t port)
+{
+uint64_t ptr;
+unsigned bufsize = 512;
+
+ptr = ahci_alloc(ahci, bufsize);
+g_assert(ptr);
+
+ahci_guest_io(ahci, port, CMD_WRITE_DMA, ptr, bufsize, 1);
+ahci_free(ahci, ptr);
+}
+
 static void test_flush(void)
 {
 AHCIQState *ahci;
+uint8_t port;
 
 ahci = ahci_boot_and_enable(NULL);
+
+port = ahci_port_select(ahci);
+ahci_port_clear(ahci, port);
+
+make_dirty(ahci, port);
+
 ahci_test_flush(ahci);
 ahci_shutdown(ahci);
 }
@@ -1087,10 +1110,13 @@ static void test_flush_retry(void)
 debug_path,
 tmp_path, imgfmt);
 
-/* Issue Flush Command and wait for error */
 port = ahci_port_select(ahci);
 ahci_port_clear(ahci, port);
 
+/* Issue write so that flush actually goes to disk */
+make_dirty(ahci, port);
+
+/* Issue Flush Command and wait for error */
 cmd = ahci_guest_io_halt(ahci, port, CMD_FLUSH_CACHE, 0, 0, 0);
 ahci_guest_io_resume(ahci, cmd);
 
@@ -1343,9 +1369,13 @@ static void test_flush_migrate(void)
 
 set_context(src->parent);
 
-/* Issue Flush Command */
 px = ahci_port_select(src);
 ahci_port_clear(src, px);
+
+/* Dirty device so that flush reaches disk */
+make_dirty(src, px);
+
+/* Issue Flush Command */
 cmd = ahci_command_create(CMD_FLUSH_CACHE);
 ahci_command_commit(src, cmd, px);
 ahci_command_issue_async(src, cmd);
diff --git a/tests/ide-test.c b/tests/ide-test.c
index fed1b2e..8466d0f 100644
--- a/tests/ide-test.c
+++ b/tests/ide-test.c
@@ -499,6 +499,39 @@ static void test_identify(void)
 ide_test_quit();
 }
 
+/*
+ * Write sector 1 with random data to make IDE storage dirty
+ * Needed for flush tests so that flushes actually go though the block layer
+ */
+static void make_dirty(uint8_t device)
+{
+uint8_t status;
+size_t len = 512;
+uintptr_t guest_buf;
+void* buf;
+
+guest_buf = guest_alloc(guest_malloc, len);
+buf = g_malloc(len);
+g_assert(guest_buf);
+g_assert(buf);
+
+memwrite(guest_buf, buf, len);
+
+PrdtEntry prdt[] = {
+{
+.addr = cpu_to_le32(guest_buf),
+.size = cpu_to_le32(len | PRDT_EOT),
+},
+};
+
+status = send_dma_request(CMD_WRITE_DMA, 1, 1, prdt,
+  ARRAY_SIZE(prdt), NULL);
+g_assert_cmphex(status, ==, BM_STS_INTR);
+assert_bit_clear(inb(IDE_BASE + reg_status), DF | ERR);
+
+g_free(buf);
+}
+
 static void test_flush(void)
 {
 uint8_t data;
@@ -507,6 +540,11 @@ static void test_flush(void)
 "-drive file=blkdebug::%s,if=ide,cache=writeback,format=raw",
 tmp_path);
 
+qtest_irq_intercept_in(global_qtest, "ioapic");
+
+/* Dirty media so that CMD_FLUSH_CACHE will actually go to disk */
+make_dirty(0);
+
 /* Delay the completion of the flush request until we explicitly do it */
 g_free(hmp("qemu-io ide0-hd0 \"break flush_to_os A\""));
 
@@ -549,6 +587,11 @@ static void test_retry_flush(const char *machine)
 "rerror=stop,werror=stop",
 debug_path, tmp_path);
 
+qtest_irq_intercept_in(global_qtest, "ioapic");
+
+/* Dirty media so that CMD_FLUSH_CACHE will actually go to disk */
+make_dirty(0);
+
 /* FLUSH CACHE command on device 0*/
 outb(IDE_BASE + reg_device, 0);
 outb(IDE_BASE + reg_command, CMD_FLUSH_CACHE);
-- 
2.1.4

Re: [Qemu-devel] [PATCH] linux-aio: keep processing events if MAX_EVENTS reached

2016-07-15 Thread Stefan Hajnoczi

On Tue, Jul 12, 2016 at 04:12:42PM +0200, Roman Penyaev wrote:
> On Tue, Jun 28, 2016 at 11:42 AM, Stefan Hajnoczi  wrote:
> > On Mon, Jun 27, 2016 at 08:36:19PM +0200, Roman Penyaev wrote:
> >> On Jun 27, 2016 6:37 PM, "Stefan Hajnoczi"  wrote:
> >> >
> >> > Commit ccb9dc10129954d0bcd7814298ed445e684d5a2a ("linux-aio: Cancel BH
> >> > if not needed") exposed a side-effect of scheduling the BH for nested
> >> > event loops.  When io_getevents(2) fetches MAX_EVENTS events then it's
> >> > necessary to call it again.  Failure to do so can lead to hung I/O
> >> > because the eventfd has already been cleared and the completion BH will
> >> > not run again.
> >> >
> >> > This patch fixes the hang by calling io_getevents(2) again if the events
> >> > array was totally full.
> >> >
> >> > Reported-by: Roman Penyaev 
> >> > Signed-off-by: Stefan Hajnoczi 
> >> > ---
> >> >  block/linux-aio.c | 5 +
> >> >  1 file changed, 5 insertions(+)
> >> >
> >> > diff --git a/block/linux-aio.c b/block/linux-aio.c
> >> > index e468960..af15f85 100644
> >> > --- a/block/linux-aio.c
> >> > +++ b/block/linux-aio.c
> >> > @@ -117,6 +117,7 @@ static void qemu_laio_completion_bh(void *opaque)
> >> >  LinuxAioState *s = opaque;
> >> >
> >> >  /* Fetch more completion events when empty */
> >> > +more_events:
> >> >  if (s->event_idx == s->event_max) {
> >> >  do {
> >> >  struct timespec ts = { 0 };
> >> > @@ -146,6 +147,10 @@ static void qemu_laio_completion_bh(void *opaque)
> >> >  qemu_laio_process_completion(laiocb);
> >> >  }
> >> >
> >> > +if (s->event_idx == MAX_EVENTS) {
> >> > +goto more_events; /* there might still be events waiting for us
> >> */
> >> > +}
> >> > +
> >>
> >> I am on vacation and can't check anything, but seems you also
> >> could reproduce an issue and seems you are right: what I've
> >> also noticed is that io hangs only if total number of requests
> >> in guest > than 128 (what is defined in linux-aio.c).
> >>
> >> But I do not understand why you have to repeat io_getevents in
> >> case of return value == MAX_EVENTS. According to my
> >> understanding you cant submit more than 128, so the first
> >> io_getevents call returns you exactly what you've submitted.
> >
> > Hi Roman,
> > There is nothing like discussions on qemu-devel from the beach.  True
> > vacation!
> >
> > The issue is:
> >
> > 1. s->events[] is only MAX_EVENTS large so each io_getevents() call can
> >only fetch that maximum number of events.
> >
> > 2. qemu_laio_completion_cb() clears the eventfd so the event loop will
> >not call us again (unless additional I/O requests complete).
> >
> > Therefore, returning from qemu_laio_completion_bh() without having
> > processed all events causes a hang.
> 
> Hi Stefan,
> 
> The issue is clear now.  The thing is that I had an assumption, that we
> never submit more than MAX_EVENTS.  Now I see that according to the
> ioq_submit() this is not true, so we can have inflights > MAX_EVENTS.
> 
> Frankly, that seems a bug to me, because we promise the kernel [when
> we call io_setup(MAX_EVENTS)] not to submit more than specified value.
> Kernel allocates ring buffer aligned to the page size, also does some
> compensations to have enough free requests for each CPU, so the final
> io events number *can be* > than we have requested.  So eventually
> kernel can "swallow" more events than MAX_EVENTS.

You are right.  QEMU shouldn't exceed MAX_EVENTS in-flight requests,
even though the kernel overallocates resources so we don't see an error
(yet).

Your patch makes linux-aio.c more correct overall, so let's take that.

> I did the following patch (at the bottom of this email), which restricts
> submission more than MAX_EVENTS and the interesting thing, that it works
> faster than your current fix for this hang:
> 
> my setup: 1 iothread, VCPU=8, MQ=8
> 
> your current patch
> "linux-aio: keep processing events if MAX_EVENTS reached":
> 
>READ: io=48199MB, aggrb=1606.5MB/s, minb=1606.5MB/s,
> maxb=1606.5MB/s, mint=30003msec, maxt=30003msec
>   WRITE: io=48056MB, aggrb=1601.8MB/s, minb=1601.8MB/s,
> maxb=1601.8MB/s, mint=30003msec, maxt=30003msec
> 
> mine changes:
> 
>READ: io=53294MB, aggrb=1776.3MB/s, minb=1776.3MB/s,
> maxb=1776.3MB/s, mint=30003msec, maxt=30003msec
>   WRITE: io=53177MB, aggrb=1772.4MB/s, minb=1772.4MB/s,
> maxb=1772.4MB/s, mint=30003msec, maxt=30003msec
> 
> But what is the most important thing here is, that reverting
> "linux-aio: Cancel BH if not needed" brings these numbers:
> 
>READ: io=56362MB, aggrb=1878.4MB/s, minb=1878.4MB/s,
> maxb=1878.4MB/s, mint=30007msec, maxt=30007msec
>   WRITE: io=56255MB, aggrb=1874.8MB/s, minb=1874.8MB/s,
> maxb=1874.8MB/s, mint=30007msec, maxt=30007msec
> 
> So, it seems to me that "linux-aio: Cancel BH if not needed" introduces
> regression.

Unfortunately the patch does two

Re: [Qemu-devel] [PATCH v2 0/2] trace: [*-user] Add commandline arguments to control tracing

2016-07-15 Thread Stefan Hajnoczi

On Wed, Jun 22, 2016 at 12:04:30PM +0200, Lluís Vilanova wrote:
> Adds three commandline arguments to the main *-user programs, following what's
> already available in softmmu:
> 
> * -trace-enable
> * -trace-events
> * -trace-file
> 
> 
> Changes in v2
> =
> 
> * Tell user to use 'help' instead of '?' [Eric Blake].
> * Remove newlines on argument docs for bsd-user [Eric Blake].
> 
> 
> Signed-off-by: Lluís Vilanova 
> ---
> 
> Lluís Vilanova (2):
>   trace: [linux-user] Commandline arguments to control tracing
>   trace: [bsd-user] Commandline arguments to control tracing
> 
> 
>  bsd-user/main.c   |   19 +++
>  linux-user/main.c |   28 
>  2 files changed, 47 insertions(+)

Hi Lluís,
Commit e9e0bb2af2248eabafb54402e3127f9f8a8690f5 ("trace: move
qemu_trace_opts to trace/control.c") made trace_events_init() static.
This conflicts with your patch series.

I suggest changing this series to use -trace ... just like
qemu/qemu-img/qemu-nbd.

Stefan


signature.asc
Description: PGP signature

[Qemu-devel] [RFC PATCH 0/4] translate: [tcg] Generic translation framework

2016-07-15 Thread Lluís Vilanova

This series proposes a generic (target-agnostic) instruction translation
framework.

It basically provides a generic main loop for instruction disassembly, which
calls target-specific functions when necessary. This generalization makes
inserting new code in the main loop easier, and helps in keeping all targets in
synch as to the contents of it.

I've only ported i386 as an example to get some feedback, but I'm planning on
porting ARM next to see how well it fits into the current organization.

Signed-off-by: Lluís Vilanova 
---

Lluís Vilanova (4):
  Pass generic CPUState to gen_intermediate_code()
  queue: Add macro for incremental traversal
  target: [tcg] Add generic translation framework
  target: [tcg,i386] Port to generic translation framework


 include/exec/exec-all.h   |2 
 include/exec/translate-all_template.h |   58 +++
 include/qemu/queue.h  |5 +
 include/qom/cpu.h |   21 ++
 target-alpha/translate.c  |   11 +
 target-arm/translate.c|   24 +--
 target-cris/translate.c   |   17 +-
 target-i386/cpu.h |2 
 target-i386/translate.c   |  290 +++--
 target-lm32/translate.c   |   22 +--
 target-m68k/translate.c   |   15 +-
 target-microblaze/translate.c |   24 +--
 target-mips/translate.c   |   15 +-
 target-moxie/translate.c  |   14 +-
 target-openrisc/translate.c   |   24 +--
 target-ppc/translate.c|   15 +-
 target-s390x/translate.c  |   13 +
 target-sh4/translate.c|   15 +-
 target-sparc/translate.c  |   11 +
 target-tilegx/translate.c |7 -
 target-tricore/translate.c|9 -
 target-unicore32/translate.c  |   17 +-
 target-xtensa/translate.c |   13 +
 translate-all.c   |2 
 translate-all_template.h  |  160 ++
 25 files changed, 503 insertions(+), 303 deletions(-)
 create mode 100644 include/exec/translate-all_template.h
 create mode 100644 translate-all_template.h


To: qemu-devel@nongnu.org
Cc: Paolo Bonzini 
Cc: Peter Crosthwaite 
Cc: Richard Henderson

[Qemu-devel] [PATCH 2/4] queue: Add macro for incremental traversal

2016-07-15 Thread Lluís Vilanova

Adds macro QTAILQ_FOREACH_CONTINUE to support incremental list
traversal.

Signed-off-by: Lluís Vilanova 
---
 include/qemu/queue.h |5 +
 1 file changed, 5 insertions(+)

diff --git a/include/qemu/queue.h b/include/qemu/queue.h
index f781aa2..c19f7ee 100644
--- a/include/qemu/queue.h
+++ b/include/qemu/queue.h
@@ -414,6 +414,11 @@ struct {   
 \
 (var);  \
 (var) = ((var)->field.tqe_next))
 
+#define QTAILQ_FOREACH_CONTINUE(var, field) \
+for ((var) = ((var)->field.tqe_next);   \
+(var);  \
+(var) = ((var)->field.tqe_next))
+
 #define QTAILQ_FOREACH_SAFE(var, head, field, next_var) \
 for ((var) = ((head)->tqh_first);   \
 (var) && ((next_var) = ((var)->field.tqe_next), 1); \

[Qemu-devel] [PATCH] test-logging: don't hard-code paths in /tmp

2016-07-15 Thread Sascha Silbe

Since f6880b7f [qemu-log: support simple pid substitution for logs],
test-logging creates files with hard-coded names in /tmp. In the best
case, this prevents multiple developers from running "make check" on
the same machine. In the worst case, it allows for symlink attacks,
enabling an attacker to overwrite files that are writable to the
developer running "make check".

Instead of hard-coding the paths, create a temporary directory using
g_dir_make_tmp() and clean it up afterwards.

Fixes: f6880b7f ("qemu-log: support simple pid substitution for logs")
Signed-off-by: Sascha Silbe 
---
 tests/test-logging.c | 42 +++---
 1 file changed, 35 insertions(+), 7 deletions(-)

diff --git a/tests/test-logging.c b/tests/test-logging.c
index cdf13c6..faebc35 100644
--- a/tests/test-logging.c
+++ b/tests/test-logging.c
@@ -86,24 +86,52 @@ static void test_parse_range(void)
 error_free_or_abort();
 }
 
-static void test_parse_path(void)
+static void test_parse_path(gconstpointer data)
 {
+gchar const *tmp_path = data;
+gchar *plain_path = g_build_filename(tmp_path, "qemu.log", NULL);
+gchar *pid_infix_path = g_build_filename(tmp_path, "qemu-%d.log", NULL);
+gchar *pid_suffix_path = g_build_filename(tmp_path, "qemu.log.%d", NULL);
+gchar *double_pid_path = g_build_filename(tmp_path, "qemu-%d%d.log", NULL);
 Error *err = NULL;
 
-qemu_set_log_filename("/tmp/qemu.log", _abort);
-qemu_set_log_filename("/tmp/qemu-%d.log", _abort);
-qemu_set_log_filename("/tmp/qemu.log.%d", _abort);
+qemu_set_log_filename(plain_path, _abort);
+qemu_set_log_filename(pid_infix_path, _abort);
+qemu_set_log_filename(pid_suffix_path, _abort);
 
-qemu_set_log_filename("/tmp/qemu-%d%d.log", );
+qemu_set_log_filename(double_pid_path, );
 error_free_or_abort();
+
+g_free(double_pid_path);
+g_free(pid_suffix_path);
+g_free(pid_infix_path);
+g_free(plain_path);
+}
+
+static void rmtree(gchar const *root)
+{
+/* There should really be a g_rmtree(). Implementing it ourselves
+ * isn't really worth it just for a test, so let's just use rm. */
+gchar const *rm_args[] = { "rm", "-rf", root, NULL };
+g_spawn_sync(NULL, (gchar **)rm_args, NULL,
+ G_SPAWN_SEARCH_PATH, NULL, NULL,
+ NULL, NULL, NULL, NULL);
 }
 
 int main(int argc, char **argv)
 {
+gchar *tmp_path = g_dir_make_tmp("qemu-test-logging.XX", NULL);
+int rc;
+
 g_test_init(, , NULL);
+g_assert_nonnull(tmp_path);
 
 g_test_add_func("/logging/parse_range", test_parse_range);
-g_test_add_func("/logging/parse_path", test_parse_path);
+g_test_add_data_func("/logging/parse_path", tmp_path, test_parse_path);
 
-return g_test_run();
+rc = g_test_run();
+
+rmtree(tmp_path);
+g_free(tmp_path);
+return rc;
 }
-- 
1.9.1

[Qemu-devel] [PATCH] compiler: never omit assertions if using a static analysis tool

2016-07-15 Thread Paolo Bonzini

Assertions help both Coverity and the clang static analyzer avoid
false positives, but on the other hand both are confused when
the condition is compiled as (void)(x != FOO).  Always expand
assertion macros when using Coverity or clang, through a new
QEMU_STATIC_ANALYSIS preprocessor symbol.

This fixes a couple false positives in TCG.

Signed-off-by: Paolo Bonzini 
---
 include/qemu/compiler.h | 3 +++
 tcg/tcg.h   | 2 +-
 2 files changed, 4 insertions(+), 1 deletion(-)

diff --git a/include/qemu/compiler.h b/include/qemu/compiler.h
index b64f899..338d3a6 100644
--- a/include/qemu/compiler.h
+++ b/include/qemu/compiler.h
@@ -3,6 +3,9 @@
 #ifndef COMPILER_H
 #define COMPILER_H
 
+#if defined __clang_analyzer__ || defined __COVERITY__
+#define QEMU_STATIC_ANALYSIS 1
+#endif
 
 /*
 | The macro QEMU_GNUC_PREREQ tests for minimum version of the GNU C compiler.
diff --git a/tcg/tcg.h b/tcg/tcg.h
index 66ae0c7..6046dcd 100644
--- a/tcg/tcg.h
+++ b/tcg/tcg.h
@@ -191,7 +191,7 @@ typedef uint64_t tcg_insn_unit;
 #endif
 
 
-#ifdef CONFIG_DEBUG_TCG
+#if defined CONFIG_DEBUG_TCG || defined QEMU_STATIC_ANALYSIS
 # define tcg_debug_assert(X) do { assert(X); } while (0)
 #elif QEMU_GNUC_PREREQ(4, 5)
 # define tcg_debug_assert(X) \
-- 
2.7.4

Re: [Qemu-devel] QOM: best way for parents to pass information to children? (was Re: [PATCH RFC 07/16] qom/cpu: make nr-cores, nr-threads real properties)

2016-07-15 Thread Andreas Färber

Am 15.07.2016 um 18:10 schrieb Eduardo Habkost:
> On Fri, Jul 15, 2016 at 11:11:38AM +0200, Igor Mammedov wrote:
>> On Fri, 15 Jul 2016 08:35:30 +0200
>> Andrew Jones  wrote:
>>> On Thu, Jul 14, 2016 at 05:07:43PM -0300, Eduardo Habkost wrote:

 First of all, sorry for the horrible delay in replying to this
 thread.

 On Wed, Jun 15, 2016 at 10:56:20AM +1000, David Gibson wrote:  
> On Tue, Jun 14, 2016 at 08:19:49AM +0200, Andrew Jones wrote:  
>> On Tue, Jun 14, 2016 at 12:12:16PM +1000, David Gibson wrote:  
>>> On Sun, Jun 12, 2016 at 03:48:10PM +0200, Andrew Jones wrote:  
> [...]
>> +static Property cpu_common_properties[] = {
>> +DEFINE_PROP_INT32("nr-cores", CPUState, nr_cores, 1),
>> +DEFINE_PROP_INT32("nr-threads", CPUState, nr_threads, 1),
>> +DEFINE_PROP_END_OF_LIST()
>> +};  
>
> Are you aware of the current CPU hotplug discussion that is going on? 
>  

 I'm aware of it going on, but haven't been following it.
   
> I'm not very involved there, but I think some of these reworks also 
> move
> "nr_threads" into the CPU state already, e.g. see:  

 nr_threads (and nr_cores) are already state in CPUState. This patch 
 just
 exposes that state via properties.
   
>
> https://github.com/dgibson/qemu/commit/9d07719784ecbeebea71
>
> ... so you might want to check these patches first to see whether you
> can base your rework on them?  

 Every cpu, and thus every machine, uses CPUState for its cpus. I'm
 not sure every machine will want to use that new abstract core class
 though. If they did, then we could indeed use nr_threads from there
 instead (and remove it from CPUState), but we'd still need nr_cores
 from the abstract cpu package class (CPUState).  
>>>
>>> Hmm.  Since the CPUState object represents just a single thread, it
>>> seems weird to me that it would have nr_threads and nr_cores
>>> information.  

 Agreed it is weird, and I think we should try to move it away
 from CPUState, not make it part of the TYPE_CPU interface.
 nr_threads belongs to the actual container of the Thread objects,
 and nr_cores in the actual container of the Core objects.

 The problem is how to implement that in a non-intrusive way that
 would require changing the object hierarchy of all architectures.

   
>>>
>>> Exposing those as properties makes that much worse, because it's now
>>> ABI, rather than internal detail we can clean up at some future time.  
>>
>> CPUState is supposed to be "State of one CPU core or thread", which
>> justifies having nr_threads state, as it may be describing a core.  
>
> Um.. does it ever actually represent a (multithread) core in practice?
> It would need to have duplicated register state for every thread were
> that the case.  

 AFAIK, CPUState is still always thread state. Or has this changed
 in some architectures, already?
   
>   
>> I guess there's no justification for having nr_cores in there though.
>> I agree adding the Core class is a good idea, assuming it will get used
>> by all machines, and CPUState then gets changed to a Thread class. The
>> question then, though, is do we also create a Socket class that contains
>> nr_cores?  
>
> That was roughly our intention with the way the cross platform hotplug
> stuff is evolving.  But the intention was that the Socket objects
> would only need to be constructed for machine types where it makes
> sense.  So for example on the paravirt pseries platform, we'll only
> have Core objects, because the socket distinction isn't really
> meaningful.
>   
>> And how will a Thread method get that information when it
>> needs to emulate, e.g. CPUID, that requires it? It's a bit messy, so
>> I'm open to all suggestions on it.  
>
> So, if the Thread needs this information, I'm not opposed to it having
> it internally (presumably populated earlier from the Core object).
> But I am opposed to it being a locked in part of the interface by
> having it as an exposed property.  

 I agree we don't want to make this part of the external
 interface. In this case, if we don't add the properties, how
 exactly is the Machine or Core code supposed to pass that
 information to the Thread object?

 Maybe the intermediate steps could be:

 * Make the Thread code that uses CPUState::nr_{cores,threads} and
   smp_{cores,threads} get that info from MachineState instead.  
>>>
>>> I have some patches already headed down this road.
>>>
 * On the architectures where we already have a reasonable

[Qemu-devel] [PATCH] migration: set state to post-migrate on failure

2016-07-15 Thread Dr. David Alan Gilbert (git)

From: "Dr. David Alan Gilbert" 

If a migration fails/is cancelled during the postcopy stage we currently
end up with the runstate as finish-migrate, where it should be post-migrate.
There's a small window in precopy where I think the same thing can
happen, but I've never seen it.

It rarely matters; the only postcopy case is if you restart a migration, which
again is a case that rarely matters in postcopy because it's only
safe to restart the migration if you know the destination hasn't
been running (which you might if you started the destination with -S
and hadn't got around to 'c' ing it before the postcopy failed).
Even then it's a small window but potentially you could hit if
there's a problem loading the devices on the destination.

This corresponds to:
https://bugzilla.redhat.com/show_bug.cgi?id=1355683

Signed-off-by: Dr. David Alan Gilbert 
---
 migration/migration.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/migration/migration.c b/migration/migration.c
index c4e0193..955d5ee 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1837,6 +1837,10 @@ static void *migration_thread(void *opaque)
 } else {
 if (old_vm_running && !entered_postcopy) {
 vm_start();
+} else {
+if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
+runstate_set(RUN_STATE_POSTMIGRATE);
+}
 }
 }
 qemu_bh_schedule(s->cleanup_bh);
-- 
2.7.4

1 2 3 >

1 - 100 of 274 matches

Mail list logo