Re: [Qemu-devel] [RfC PATCH] Add udmabuf misc device

2018-04-11 Thread Oleksandr Andrushchenko

On 04/10/2018 08:26 PM, Dongwon Kim wrote:

On Tue, Apr 10, 2018 at 09:37:53AM +0300, Oleksandr Andrushchenko wrote:

On 04/06/2018 09:57 PM, Dongwon Kim wrote:

On Fri, Apr 06, 2018 at 03:36:03PM +0300, Oleksandr Andrushchenko wrote:

On 04/06/2018 02:57 PM, Gerd Hoffmann wrote:

   Hi,


I fail to see any common ground for xen-zcopy and udmabuf ...

Does the above mean you can assume that xen-zcopy and udmabuf
can co-exist as two different solutions?

Well, udmabuf route isn't fully clear yet, but yes.

See also gvt (intel vgpu), where the hypervisor interface is abstracted
away into a separate kernel modules even though most of the actual vgpu
emulation code is common.

Thank you for your input, I'm just trying to figure out
which of the three z-copy solutions intersect and by how much

And what about hyper-dmabuf?

The xen z-copy solution is fundamentally pretty similar to hyper_dmabuf
in terms of these core sharing features:

1. the sharing process - import a prime/dmabuf from the producer -> extract
the underlying pages and get those shared -> return references for the shared pages

Another thing is that danvet was somewhat against the idea of importing an existing
dmabuf/prime buffer and forwarding it to the other domain due to synchronization
issues. He proposed making hyper_dmabuf work only as an exporter so that it
can have full control over the buffer. I think we need to talk about this
further as well.

Yes, I saw this. But this limits the use-cases a lot.
For instance, running Android as a guest (which uses ION to allocate
buffers) means that eventually the HW composer will import the dma-buf into
the DRM driver. Then, in the case of xen-front for example, it needs to be
shared with the backend (host side). Of course, we can change user-space
to make xen-front allocate the buffers (make it the exporter), but what we try
to avoid is changing user-space which would otherwise have remained
unchanged.
So, I do think we have to support this use-case and just have to understand
the complexity.



danvet, can you comment on this topic?


2. the page sharing mechanism - it uses Xen-grant-table.

And to give you a quick summary of the differences, as far as I understand
them, between the two implementations (please correct me if I am wrong, Oleksandr):

1. xen-zcopy is DRM specific - it can import only a DRM PRIME buffer,
while hyper_dmabuf can export any dmabuf regardless of its originator

Well, this is true. And at the same time this is just a matter
of extending the API: xen-zcopy is a helper driver designed for the
xen-front/back use-case, which is why it only has a DRM PRIME API.

2. xen-zcopy doesn't seem to have dma-buf synchronization between the two VMs,
while hyper_dmabuf (what danvet called remote dma-buf API sharing) sends
out synchronization messages to the exporting VM.

This is true. Again, this is because of the use-cases it covers.
But having synchronization for a generic solution seems to be a good idea.

Yeah, understood: xen-zcopy works OK for your use case. But I am just curious
whether it is OK not to have any inter-domain synchronization in this sharing model.

The synchronization is done with displif protocol [1]

The buffer being shared is technically a dma-buf, and the originator needs to be able
to keep track of it.

As I am working in DRM terms, the tracking is done by the DRM core
for me for free. (This might be one of the reasons Daniel sees a DRM-based
implementation as a very good fit from a code-reuse POV.)



3. 1-level references - when using the grant table for sharing pages, there will
be the same number of refs (each 8 bytes)

To be precise, a grant ref is 4 bytes

You are right. Thanks for the correction. ;)


as the number of shared pages, which is passed to
userspace to be shared with the importing VM in the case of xen-zcopy.

The reason for that is that xen-zcopy is a helper driver, e.g.
the grant references come from the display backend [1], which implements
the Xen display protocol [2]. So, effectively, the backend extracts references
from the frontend's requests and passes them to xen-zcopy as an array
of refs.

  Compared
to this, hyper_dmabuf does multi-level addressing to generate only one
reference id that represents all shared pages.

In the protocol [2] only one reference to the gref directory is passed
between VMs (and the gref directory is a singly-linked list of shared pages
containing all of the grefs of the buffer).

OK, good to know. I will look into its implementation in more detail, but is
this gref directory (chained grefs) something that can be used for any general
memory sharing use case, or is it just for xen-display (in the current code base)?

Not to mislead you: one grant ref is passed via the displif protocol,
but the page it references contains the rest of the grant refs.

As to whether this can be used for any memory: yes. It is the same for
the sndif and displif Xen protocols, but defined twice since, strictly speaking,
sndif and displif are two separate protocols.
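
For readers new to grant tables, the chained gref directory described above
boils down to a layout like the following (an illustrative C sketch of the
idea, not a copy of the displif header; the names are made up):

/* One granted page of the directory.  The first field chains to the next
 * directory page (0 terminates the list); the rest of the page is filled
 * with grant references of the actual buffer pages.  grant_ref_t is the
 * 4-byte reference mentioned above. */
struct gref_directory_page {
    grant_ref_t next_dir_gref;    /* gref of the next directory page, or 0 */
    grant_ref_t gref[(XEN_PAGE_SIZE - sizeof(grant_ref_t)) /
                     sizeof(grant_ref_t)];    /* grefs of buffer pages */
};

The frontend only hands a single gref (that of the first directory page) to
the backend; the backend then walks the chain to collect all of the buffer's
grefs.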

While reviewing your RFC v2 one of the comments I had [2] was that if we
can start 

Re: [Qemu-devel] [PATCH] qmp: add pmemload command

2018-04-11 Thread Simon Ruderich
On Tue, Apr 10, 2018 at 04:33:03PM -0500, Eric Blake wrote:
>> +void qmp_pmemload(int64_t addr, int64_t size, const char *filename,
>> +  Error **errp)
>> +{
>> +FILE *f;
>> +size_t l;
>> +uint8_t buf[1024];
>> +
>> +f = fopen(filename, "rb");
>
> Use qemu_fopen() here, so that you can automagically support /dev/fdset/
> magic filenames that work on file descriptors passed via SCM_RIGHTS.

Hello,

I can't find qemu_fopen() in the source. Did you mean
qemu_open()? From reading qemu_close() I guess that I can't use
fdopen() (as there's no qemu_fclose()) but must work with the raw
fd. Or is there an easy way to fdopen? (I prefer the FILE *
interface which is easier to work with.)

But I just copied the code from qmp_pmemsave. Should I change it
as well to use qemu_open()?
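
For what it's worth, a minimal sketch of what the handler could look like on
top of qemu_open() and the raw fd (illustration only, not the posted patch;
error handling is abbreviated and the exact includes are assumed):

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "exec/cpu-common.h"

void qmp_pmemload(int64_t addr, int64_t size, const char *filename,
                  Error **errp)
{
    uint8_t buf[1024];
    ssize_t len;
    int fd;

    /* qemu_open() understands the /dev/fdset/ magic filenames */
    fd = qemu_open(filename, O_RDONLY | O_BINARY);
    if (fd < 0) {
        error_setg_file_open(errp, errno, filename);
        return;
    }
    while (size > 0) {
        len = read(fd, buf, MIN(size, (int64_t)sizeof(buf)));
        if (len <= 0) {
            if (len < 0) {
                error_setg_errno(errp, errno, "reading '%s' failed", filename);
            }
            break;
        }
        /* copy this chunk into guest physical memory */
        cpu_physical_memory_write(addr, buf, len);
        addr += len;
        size -= len;
    }
    qemu_close(fd);
}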

>> +++ b/qapi-schema.json
>> @@ -1136,6 +1136,24 @@
>>  { 'command': 'pmemsave',
>>'data': {'val': 'int', 'size': 'int', 'filename': 'str'} }
>>
>> +##
>> +# @pmemload:
>> +#
>> +# Load a portion of guest physical memory from a file.
>> +#
>> +# @val: the physical address of the guest to start from
>
> Is 'val' really the best name for this, or would 'phys-addr' or similar
> be a more descriptive name?

I copied it from pmemsave to keep the argument names consistent.
Should I change it only for pmemload? Changing it for pmemsave is
problematic as it will break the existing API.

>> +#
>> +# @size: the size of memory region to load
>> +#
>> +# @filename: the file to load the memory from as binary data
>> +#
>> +# Returns: Nothing on success
>> +#
>> +# Since: 2.10
>
> You've missed 2.10 by a long shot.  The earliest this new feature could
> appear is 2.13.

Will change.

> Do you additionally need an offset where to start reading from within
> the file (that is, since you already have the 'size' parameter to avoid
> reading the entire file, and the 'val' parameter to target anywhere in
> physical memory, how do I start reading anywhere from the file)?

I didn't need it for my usage and wanted to keep the patch as
simple as possible. But I see how it can be useful and will add
it to my patch in the next iteration.
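
For illustration, with such an offset the schema entry could end up looking
roughly like this (a sketch only; the parameter name and its optionality are
assumptions, not the next revision of the patch):

##
# @pmemload:
#
# Load a portion of guest physical memory from a file.
#
# @val: the physical address of the guest to start from
#
# @size: the size of memory region to load
#
# @filename: the file to load the memory from as binary data
#
# @offset: the offset within the file to start reading from
#          (optional, defaults to 0)
#
# Returns: Nothing on success
#
# Since: 2.13
##
{ 'command': 'pmemload',
  'data': {'val': 'int', 'size': 'int', 'filename': 'str', '*offset': 'int'} }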

Thank you for the review!

Regards
Simon
-- 
+ privacy is necessary
+ using gnupg http://gnupg.org
+ public key id: 0x92FEFDB7E44C32F9



Re: [Qemu-devel] [PATCH] iotests: fix 169

2018-04-11 Thread Vladimir Sementsov-Ogievskiy

03.04.2018 23:13, John Snow wrote:


On 04/03/2018 12:23 PM, Max Reitz wrote:

On 2018-03-30 18:10, Vladimir Sementsov-Ogievskiy wrote:

Use MIGRATION events instead of RESUME. Also, make a TODO: enable
dirty-bitmaps capability for offline case.

This (likely) fixes racy faults at least of the following types:

 - timeout on waiting for RESUME event
 - sha256 mismatch on 136 (138 after this patch)

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

This patch is a real change for the test anyway. But I don't understand
why (and whether it really does) it fixes things. And I'm not sure whether we
really have a bug in bitmap migration or persistence. So it's up to you
whether to take it into 2.12...

It was already discussed that the "STOP" event is bad for tests. What about
"RESUME"? How can we miss it? And the sha256 mismatch is really strange.

Max, please check whether it fixes 169 for you.

  tests/qemu-iotests/169 | 44 +++-
  1 file changed, 23 insertions(+), 21 deletions(-)

This makes the test pass (thanks!), but it still leaves behind five cats...

Max



Hmm:

jhuston  14772  0.0  0.0   4296   784 pts/3S16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14796  0.0  0.0   4296   764 pts/3S16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14940  0.0  0.0   4296   788 pts/3S16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14964  0.0  0.0   4296   720 pts/3S16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  15052  0.0  0.0   4296   768 pts/3S16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file

Why do these get left behind? Nothing to consume the data...?


Ah, I understand. It is due to the last vm_b.shutdown() and vm_b.launch() in
the should_migrate case. So, at the end of the test I restart vm_b with the
-incoming parameter. But it looks like a bug anyway: if we start qemu
with -incoming "exec", shouldn't we kill the cat process if there was no
migration?
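
For reference, the MIGRATION-event approach taken by the patch boils down to
a loop like the following (a condensed sketch assuming the event_wait()
helper of the iotests VM class, not the exact hunk):

# Wait until migration has actually finished on this VM instead of
# racing against the RESUME event.
while True:
    event = vm.event_wait('MIGRATION')
    if event['data']['status'] in ('completed', 'failed'):
        break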


--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH 09/10] target/s390x: avoid integer overflow in next_page PC check

2018-04-11 Thread David Hildenbrand
On 10.04.2018 18:19, Emilio G. Cota wrote:
> If the PC is in the last page of the address space, next_page_start
> overflows to 0. Fix it.
> 
> Cc: Cornelia Huck 
> Cc: Alexander Graf 
> Cc: David Hildenbrand 
> Cc: qemu-s3...@nongnu.org
> Signed-off-by: Emilio G. Cota 
> ---
>  target/s390x/translate.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/target/s390x/translate.c b/target/s390x/translate.c
> index 7d39ab3..9f1 100644
> --- a/target/s390x/translate.c
> +++ b/target/s390x/translate.c
> @@ -6163,7 +6163,7 @@ void gen_intermediate_code(CPUState *cs, struct 
> TranslationBlock *tb)
>  CPUS390XState *env = cs->env_ptr;
>  DisasContext dc;
>  target_ulong pc_start;
> -uint64_t next_page_start;
> +uint64_t page_start;
>  int num_insns, max_insns;
>  ExitStatus status;
>  bool do_debug;
> @@ -6181,7 +6181,7 @@ void gen_intermediate_code(CPUState *cs, struct 
> TranslationBlock *tb)
>  dc.ex_value = tb->cs_base;
>  do_debug = dc.singlestep_enabled = cs->singlestep_enabled;
>  
> -next_page_start = (pc_start & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
> +page_start = pc_start & TARGET_PAGE_MASK;
>  
>  num_insns = 0;
>  max_insns = tb_cflags(tb) & CF_COUNT_MASK;
> @@ -6218,7 +6218,7 @@ void gen_intermediate_code(CPUState *cs, struct 
> TranslationBlock *tb)
>  /* If we reach a page boundary, are single stepping,
> or exhaust instruction count, stop generation.  */
>  if (status == NO_EXIT
> -&& (dc.pc >= next_page_start
> +&& (dc.pc - page_start >= TARGET_PAGE_SIZE
>  || tcg_op_buf_full()
>  || num_insns >= max_insns
>  || singlestep
> 

Reviewed-by: David Hildenbrand 

-- 

Thanks,

David / dhildenb
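
The arithmetic behind the fix, in isolation: computing the end of the page
can wrap around to 0 when the PC sits in the last page of the 64-bit address
space, whereas comparing the offset into the page cannot. A standalone sketch
(illustrative values, not QEMU code):

#include <assert.h>
#include <stdint.h>

#define TARGET_PAGE_SIZE 0x1000ULL
#define TARGET_PAGE_MASK (~(TARGET_PAGE_SIZE - 1))

int main(void)
{
    uint64_t pc = UINT64_MAX - 0x10;    /* a PC inside the last page */

    /* old form: the page end wraps to 0, so a "pc >= next_page_start"
     * check fires immediately */
    uint64_t next_page_start = (pc & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
    assert(next_page_start == 0);

    /* new form: the offset within the page never wraps */
    uint64_t page_start = pc & TARGET_PAGE_MASK;
    assert(pc - page_start < TARGET_PAGE_SIZE);

    return 0;
}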



Re: [Qemu-devel] [Qemu-block] [PATCH] iotests: Split 214 off of 122

2018-04-11 Thread Max Reitz
On 2018-04-09 11:28, Alberto Garcia wrote:
> On Fri 06 Apr 2018 06:41:08 PM CEST, Max Reitz  wrote:
>> Commit abd3622cc03cf41ed542126a540385f30a4c0175 added a case to 122
>> regarding how the qcow2 driver handles an incorrect compressed data
>> length value.  This does not really fit into 122, as that file is
>> supposed to contain qemu-img convert test cases, which this case is not.
>> So this patch splits it off into its own file; maybe we will even get
>> more qcow2-only compression tests in the future.
>>
>> Also, that test case does not work with refcount_bits=1, so mark that
>> option as unsupported.
>>
>> Signed-off-by: Max Reitz 
> 
> Looks good to me
> 
>> I was a bit lost what to do about the copyright text, since this test
>> case was written by Berto.  I figured I'd drop the "owner" variable
>> (it isn't used anyway), but I put "Red Hat" into the copyright line --
>> currently every test has copyright information, so I decided it'd be
>> difficult to leave that out, and I figured I simply cannot claim
>> copyright for Igalia.  So, here we go.
> 
> The new file contains only my test, right?

Yep.

>You can use
> 
> Copyright (C) 2018 Igalia, S.L.
> Author: Alberto Garcia 
> 
> and add
> 
> Signed-off-by: Alberto Garcia 

Thanks! :-)

Max



Re: [Qemu-devel] [PATCH v1 20/24] tests/tcg: enable building for s390x

2018-04-11 Thread Cornelia Huck
On Tue, 10 Apr 2018 20:39:15 +0100
Alex Bennée  wrote:

> This doesn't add any additional tests but enables building the
> multiarch tests for s390x.
> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/tcg/s390x/Makefile.include | 2 ++
>  1 file changed, 2 insertions(+)
>  create mode 100644 tests/tcg/s390x/Makefile.include
> 
> diff --git a/tests/tcg/s390x/Makefile.include 
> b/tests/tcg/s390x/Makefile.include
> new file mode 100644
> index 00..1f58115d96
> --- /dev/null
> +++ b/tests/tcg/s390x/Makefile.include
> @@ -0,0 +1,2 @@
> +DOCKER_IMAGE=debian-s390x-cross
> +DOCKER_CROSS_COMPILER=s390x-linux-gnu-gcc

Acked-by: Cornelia Huck 



Re: [Qemu-devel] [PATCH v1 01/24] configure: add test for docker availability

2018-04-11 Thread Fam Zheng
On Tue, 04/10 20:38, Alex Bennée wrote:
> This tests for a working docker installation without sudo and sets up
> config-host.mak accordingly. This will be useful from cross compiling
> things in the future.
> 
> Signed-off-by: Alex Bennée 
> ---
>  configure | 23 +++
>  1 file changed, 23 insertions(+)
> 
> diff --git a/configure b/configure
> index 4d0e92c96c..b402befe94 100755
> --- a/configure
> +++ b/configure
> @@ -451,6 +451,7 @@ jemalloc="no"
>  replication="yes"
>  vxhs=""
>  libxml2=""
> +docker="no"
>  
>  supported_cpu="no"
>  supported_os="no"
> @@ -5396,6 +5397,23 @@ EOF
>fi
>  fi
>  
> +##
> +# Docker and cross-compiler support
> +#
> +# This is specifically for building test
> +# cases for foreign architectures, not
> +# cross-compiling QEMU itself.
> +
> +if has "docker"; then
> +if docker images  >/dev/null 2>&1 ; then
> +docker="yes"
> +else
> +# docker may be available but using sudo
> +# so we won't use it for cross-building
> +docker="maybe"

What is the problem with using sudo for cross-building?

Fam

> +fi
> +fi
> +
>  ##
>  # End of CC checks
>  # After here, no more $cc or $ld runs
> @@ -5857,6 +5875,7 @@ echo "avx2 optimization $avx2_opt"
>  echo "replication support $replication"
>  echo "VxHS block device $vxhs"
>  echo "capstone  $capstone"
> +echo "docker$docker"
>  
>  if test "$sdl_too_old" = "yes"; then
>  echo "-> Your SDL version is too old - please upgrade to have SDL support"
> @@ -6680,6 +6699,10 @@ if test "$gcov" = "yes" ; then
>echo "GCOV=$gcov_tool" >> $config_host_mak
>  fi
>  
> +if test "$docker" = "yes"; then
> +echo "HAVE_USER_DOCKER=y" >> $config_host_mak
> +fi
> +
>  # use included Linux headers
>  if test "$linux" = "yes" ; then
>mkdir -p linux-headers
> -- 
> 2.16.2
> 



Re: [Qemu-devel] [Bug 1759264] Re: fpu/softfloat: round_to_int_and_pack refactor broke TriCore ftoi insns

2018-04-11 Thread Bastian Koppelmann
On 04/10/2018 10:07 PM, Alex Bennée wrote:
> Yeah it looks like it was missed, the round_to_uint code does it.
> 
> Do you have a test case I can verify?
> 

For the NaN input 0x the expected result for the flags is that
flag_invalid is raised.

I can provide you with some TriCore asm, but it is a bit of pain to get
the gnu assembler to build, since the public version is a decade old.

Cheers,
Bastian
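
For reference, the expected flag behaviour can be checked directly against
softfloat, independent of the TriCore assembler (a standalone sketch; the
NaN bit pattern used here is just an example, not the value from the bug
report):

#include "qemu/osdep.h"
#include "fpu/softfloat.h"

/* Converting any NaN to an integer must raise float_flag_invalid. */
static bool nan_to_int_raises_invalid(void)
{
    float_status s = { 0 };

    set_float_exception_flags(0, &s);
    (void)float32_to_int32(make_float32(0x7fc00000), &s);  /* a quiet NaN */
    return get_float_exception_flags(&s) & float_flag_invalid;
}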



[Qemu-devel] [Bug 1762558] Re: Many crashes with "memslot_get_virt: slot_id 170 too big"-type errors in 2.12.0 rc2

2018-04-11 Thread Dr. David Alan Gilbert
IMHO it's best to keep this open until we find out what's going on;
it's not impossible it's something that's changed in qemu, and even if
it isn't qemu's fault then you won't be the only person who ends up
reporting it here, so it'll be good to get the answer.

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1762558

Title:
  Many crashes with "memslot_get_virt: slot_id 170 too big"-type errors
  in 2.12.0 rc2

Status in QEMU:
  New

Bug description:
  Since qemu 2.12.0 rc2 - qemu-2.12.0-0.6.rc2.fc29 - landed in Fedora
  Rawhide, just about all of our openQA-automated tests of Rawhide
  guests which run with qxl / SPICE graphics in the guest have died
  partway in, always shortly after the test switches from the installer
  (an X environment) to a console on a tty. qemu is, I think, hanging.
  There are always some errors like this right around the time of the
  hang:

  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: id 0, group 0, virt start 0, 
virt end , generation 0, delta 0
  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: id 1, group 1, virt start 
7f42dbc0, virt end 7f42dfbfe000, generation 0, delta 7f42dbc0
  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: id 2, group 1, virt start 
7f42d7a0, virt end 7f42dba0, generation 0, delta 7f42d7a0
  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: 
  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: (process:45812): Spice-CRITICAL 
**: memslot.c:111:memslot_get_virt: slot_id 218 too big, addr=da8e21fbda8e21fb

  or occasionally like this:

  [2018-04-09T20:13:58.0717 UTC] [debug] QEMU: id 0, group 0, virt start 0, 
virt end , generation 0, delta 0
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: id 1, group 1, virt start 
7ff093c0, virt end 7ff097bfe000, generation 0, delta 7ff093c0
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: id 2, group 1, virt start 
7ff08fa0, virt end 7ff093a0, generation 0, delta 7ff08fa0
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: 
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: (process:25622): Spice-WARNING 
**: memslot.c:68:memslot_validate_virt: virtual address out of range
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: virt=0x0+0x18 slot_id=0 
group_id=1
  [2018-04-09T20:13:58.0721 UTC] [debug] QEMU: slot=0x0-0x0 delta=0x0
  [2018-04-09T20:13:58.0721 UTC] [debug] QEMU: 
  [2018-04-09T20:13:58.0721 UTC] [debug] QEMU: (process:25622): Spice-WARNING 
**: display-channel.c:2426:display_channel_validate_surface: invalid surface_id 
1048576
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: id 0, group 0, virt start 0, 
virt end , generation 0, delta 0
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: id 1, group 1, virt start 
7ff093c0, virt end 7ff097bfe000, generation 0, delta 7ff093c0
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: id 2, group 1, virt start 
7ff08fa0, virt end 7ff093a0, generation 0, delta 7ff08fa0
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: 
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: (process:25622): Spice-CRITICAL 
**: memslot.c:122:memslot_get_virt: address generation is not valid, group_id 
1, slot_id 0, gen 110, slot_gen 0

  The same tests running on Fedora 28 guests on the same hosts are not
  hanging, and the same tests were not hanging right before the qemu
  package got updated, so this seems very strongly tied to the new qemu.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1762558/+subscriptions



[Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Tiwei Bie
This patch introduces VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB
feature for vhost-user. By default, vhost-user backend needs
to query the IOTLBs from QEMU after meeting unknown IOVAs.
With this protocol feature negotiated, QEMU will provide all
the IOTLBs to vhost-user backend without waiting for the
queries from backend. This is helpful when using a hardware
accelerator which is not able to handle unknown IOVAs at the
vhost-user backend.

Signed-off-by: Tiwei Bie 
---
The idea of this patch is to let QEMU push all the IOTLBs
to vhost-user backend without waiting for the queries from
the backend. Because hardware accelerator at the vhost-user
backend may not be able to handle unknown IOVAs.

This is just a RFC for now. It seems that, it doesn't work
as expected when guest is using kernel driver (To handle
this case, it seems that some RAM regions' events also need
to be listened). Any comments would be appreciated! Thanks!

 docs/interop/vhost-user.txt   |  9 
 hw/virtio/vhost-backend.c |  7 ++
 hw/virtio/vhost-user.c|  8 +++
 hw/virtio/vhost.c | 47 ---
 include/hw/virtio/vhost-backend.h |  3 +++
 5 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
index 534caab18a..73e07f9dad 100644
--- a/docs/interop/vhost-user.txt
+++ b/docs/interop/vhost-user.txt
@@ -380,6 +380,7 @@ Protocol features
 #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
 #define VHOST_USER_PROTOCOL_F_PAGEFAULT  8
 #define VHOST_USER_PROTOCOL_F_CONFIG 9
+#define VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB 10
 
 Master message types
 
@@ -797,3 +798,11 @@ resilient for selective requests.
 For the message types that already solicit a reply from the client, the
 presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set brings
 no behavioural change. (See the 'Communication' section for details.)
+
+VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB
+
+By default, vhost-user backend needs to query the IOTLBs from QEMU after
+meeting unknown IOVAs. With this protocol feature negotiated, QEMU will
+provide all the IOTLBs to vhost backend without waiting for the queries
+from backend. This is helpful when using a hardware accelerator which is
+not able to handle unknown IOVAs at the vhost-user backend.
diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
index 7f09efab8b..d72994e9a5 100644
--- a/hw/virtio/vhost-backend.c
+++ b/hw/virtio/vhost-backend.c
@@ -233,6 +233,11 @@ static void vhost_kernel_set_iotlb_callback(struct 
vhost_dev *dev,
 qemu_set_fd_handler((uintptr_t)dev->opaque, NULL, NULL, NULL);
 }
 
+static bool vhost_kernel_need_all_device_iotlb(struct vhost_dev *dev)
+{
+return false;
+}
+
 static const VhostOps kernel_ops = {
 .backend_type = VHOST_BACKEND_TYPE_KERNEL,
 .vhost_backend_init = vhost_kernel_init,
@@ -264,6 +269,8 @@ static const VhostOps kernel_ops = {
 #endif /* CONFIG_VHOST_VSOCK */
 .vhost_set_iotlb_callback = vhost_kernel_set_iotlb_callback,
 .vhost_send_device_iotlb_msg = vhost_kernel_send_device_iotlb_msg,
+.vhost_backend_need_all_device_iotlb =
+vhost_kernel_need_all_device_iotlb,
 };
 
 int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType 
backend_type)
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index 38da8692bb..30e0b1c13b 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -47,6 +47,7 @@ enum VhostUserProtocolFeature {
 VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
 VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
 VHOST_USER_PROTOCOL_F_CONFIG = 9,
+VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB = 10,
 VHOST_USER_PROTOCOL_F_MAX
 };
 
@@ -1581,6 +1582,12 @@ vhost_user_crypto_close_session(struct vhost_dev *dev, 
uint64_t session_id)
 return 0;
 }
 
+static bool vhost_user_need_all_device_iotlb(struct vhost_dev *dev)
+{
+return virtio_has_feature(dev->protocol_features,
+  VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB);
+}
+
 const VhostOps user_ops = {
 .backend_type = VHOST_BACKEND_TYPE_USER,
 .vhost_backend_init = vhost_user_init,
@@ -1611,4 +1618,5 @@ const VhostOps user_ops = {
 .vhost_set_config = vhost_user_set_config,
 .vhost_crypto_create_session = vhost_user_crypto_create_session,
 .vhost_crypto_close_session = vhost_user_crypto_close_session,
+.vhost_backend_need_all_device_iotlb = 
vhost_user_need_all_device_iotlb,
 };
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index f51bf573d5..70922ee998 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -48,6 +48,9 @@ static unsigned int used_memslots;
 static QLIST_HEAD(, vhost_dev) vhost_devices =
 QLIST_HEAD_INITIALIZER(vhost_devices);
 
+static int vhost_memory_region_lookup(struct vhost_dev *hdev, 

[Qemu-devel] [PATCH v2] timer: remove replay clock probe in deadline calculation

2018-04-11 Thread Pavel Dovgalyuk
Ciro Santilli reported that commit a5ed352596a8b7eb2f9acce34371b944ac3056c4
breaks the execution replay. It happens due to the probing the clock
for the new instances of iothread.
However, this probing was made in replay mode for the timer lists that
are empty.
This patch removes clock probing in replay mode.
It is an artifact of the old version with another thread model.

Signed-off-by: Pavel Dovgalyuk 
---
 util/qemu-timer.c |   11 ++-
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/util/qemu-timer.c b/util/qemu-timer.c
index 2ed1bf2..86bfe84 100644
--- a/util/qemu-timer.c
+++ b/util/qemu-timer.c
@@ -578,17 +578,10 @@ int64_t timerlistgroup_deadline_ns(QEMUTimerListGroup 
*tlg)
 {
 int64_t deadline = -1;
 QEMUClockType type;
-bool play = replay_mode == REPLAY_MODE_PLAY;
 for (type = 0; type < QEMU_CLOCK_MAX; type++) {
 if (qemu_clock_use_for_deadline(type)) {
-if (!play || type == QEMU_CLOCK_REALTIME) {
-deadline = qemu_soonest_timeout(deadline,
-
timerlist_deadline_ns(tlg->tl[type]));
-} else {
-/* Read clock from the replay file and
-   do not calculate the deadline, based on virtual clock. */
-qemu_clock_get_ns(type);
-}
+deadline = qemu_soonest_timeout(deadline,
+
timerlist_deadline_ns(tlg->tl[type]));
 }
 }
 return deadline;




Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Tiwei Bie
On Wed, Apr 11, 2018 at 04:01:19PM +0800, Jason Wang wrote:
> On 2018年04月11日 15:20, Tiwei Bie wrote:
> > This patch introduces VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB
> > feature for vhost-user. By default, vhost-user backend needs
> > to query the IOTLBs from QEMU after meeting unknown IOVAs.
> > With this protocol feature negotiated, QEMU will provide all
> > the IOTLBs to vhost-user backend without waiting for the
> > queries from backend. This is helpful when using a hardware
> > accelerator which is not able to handle unknown IOVAs at the
> > vhost-user backend.
> > 
> > Signed-off-by: Tiwei Bie
> > ---
> > The idea of this patch is to let QEMU push all the IOTLBs
> > to vhost-user backend without waiting for the queries from
> > the backend. Because hardware accelerator at the vhost-user
> > backend may not be able to handle unknown IOVAs.
> > 
> > This is just a RFC for now. It seems that, it doesn't work
> > as expected when guest is using kernel driver (To handle
> > this case, it seems that some RAM regions' events also need
> > to be listened). Any comments would be appreciated! Thanks!
> 
> Interesting, a quick question is why this is needed? Can we just use exist
> IOTLB update message?

Yeah, we are still using the existing IOTLB update messages
to send the IOTLBs to the backend. The only difference
is that QEMU won't wait for the queries before sending the
IOTLB update messages.

> 
> It looks to me at least kernel does not need this.

Something similar exists for kernel vhost: QEMU needs to push
the IOTLBs of some ring addresses to the kernel vhost backend
without waiting for the queries.

Best regards,
Tiwei Bie

> 
> Thanks



Re: [Qemu-devel] [PATCH 2/2] qemu-thread: let cur_mon be per-thread

2018-04-11 Thread Paolo Bonzini
On 11/04/2018 11:35, Peter Xu wrote:
> Yeah, the inheritance will only make sure cur_mon be initialized
> always with correct value just like when we are without Out-Of-Band.
> For example, it's still possible a thread is created within a QMP
> handler.  If without current change, the cur_mon in the new thread
> would be NULL.
> 
> AFAIU even if cur_mon==NULL we should mostly be fine (e.g.,
> error_vprintf will handle that case well).  If any of you can help me
> confirm this, then I agree that this patch is not really needed.

Yes, though the solution is to not use error_vprintf from auxiliary
threads. :)  Just make sure all the communication happens through a
struct that's passed in the thread entry point, and possibly bottom
halves from the auxiliary thread to the monitor iothread.

Paolo

> If so, maybe even we don't need to setup cur_mon at entry of monitor
> iothread, since cur_mon is always used in the way like:




Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Peter Xu
On Wed, Apr 11, 2018 at 03:20:27PM +0800, Tiwei Bie wrote:

[...]

> This is just a RFC for now. It seems that, it doesn't work
> as expected when guest is using kernel driver (To handle
> this case, it seems that some RAM regions' events also need
> to be listened). Any comments would be appreciated! Thanks!

Hi, Tiwei,

What's your kernel command line in the guest?  Is iommu=pt there?

-- 
Peter Xu



Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Jason Wang



On 2018年04月11日 15:20, Tiwei Bie wrote:

This patch introduces VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB
feature for vhost-user. By default, vhost-user backend needs
to query the IOTLBs from QEMU after meeting unknown IOVAs.
With this protocol feature negotiated, QEMU will provide all
the IOTLBs to vhost-user backend without waiting for the
queries from backend. This is helpful when using a hardware
accelerator which is not able to handle unknown IOVAs at the
vhost-user backend.

Signed-off-by: Tiwei Bie
---
The idea of this patch is to let QEMU push all the IOTLBs
to vhost-user backend without waiting for the queries from
the backend. Because hardware accelerator at the vhost-user
backend may not be able to handle unknown IOVAs.

This is just a RFC for now. It seems that, it doesn't work
as expected when guest is using kernel driver (To handle
this case, it seems that some RAM regions' events also need
to be listened). Any comments would be appreciated! Thanks!


Interesting, a quick question is why this is needed? Can we just use 
exist IOTLB update message?


It looks to me at least kernel does not need this.

Thanks



Re: [Qemu-devel] [PULL 0/7] Block layer patches for 2.12.0-rc3

2018-04-11 Thread Peter Maydell
On 10 April 2018 at 16:37, Kevin Wolf  wrote:
> The following changes since commit df6378eb0e6cfd58a22a1c3ff8fa4a9039f1eaa8:
>
>   Merge remote-tracking branch 'remotes/kraxel/tags/ui-20180410-pull-request' 
> into staging (2018-04-10 14:04:27 +0100)
>
> are available in the git repository at:
>
>   git://repo.or.cz/qemu/kevin.git tags/for-upstream
>
> for you to fetch changes up to c1de5696d6a25b426432c147dfd7fb8a9eb86b89:
>
>   qemu-iotests: update 185 output (2018-04-10 16:34:38 +0200)
>
> 
> Block layer patches
>

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH v1 12/24] tests/tcg/i386: Build fix for hello-i386

2018-04-11 Thread Thomas Huth
On 10.04.2018 21:39, Alex Bennée wrote:
> From: Fam Zheng 
> 
> We have -Werror=missing-prototype, add a dummy prototype to avoid that
> warning.
> 
> Signed-off-by: Fam Zheng 
> ---
>  tests/tcg/i386/hello-i386.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/tests/tcg/i386/hello-i386.c b/tests/tcg/i386/hello-i386.c
> index fa00380de2..cfeb24b2f5 100644
> --- a/tests/tcg/i386/hello-i386.c
> +++ b/tests/tcg/i386/hello-i386.c
> @@ -20,6 +20,7 @@ static inline int write(int fd, const char * buf, int len)
>return status;
>  }
>  
> +void _start(void);
>  void _start(void)
>  {
>  write(1, "Hello World\n", 12);
> 

Reviewed-by: Thomas Huth 



Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Tiwei Bie
Hi Peter,

On Wed, Apr 11, 2018 at 04:00:36PM +0800, Peter Xu wrote:
> On Wed, Apr 11, 2018 at 03:20:27PM +0800, Tiwei Bie wrote:
> 
> [...]
> 
> > This is just a RFC for now. It seems that, it doesn't work
> > as expected when guest is using kernel driver (To handle
> > this case, it seems that some RAM regions' events also need
> > to be listened). Any comments would be appreciated! Thanks!
> 
> Hi, Tiwei,
> 
> What's your kernel command line in the guest?  Is iommu=pt there?

Yeah, you are right! The related settings in the kernel command line are:

iommu=pt intel_iommu=on

Hmm, how will this parameter affect the vIOMMU's behaviour?

Best regards,
Tiwei Bie

> 
> -- 
> Peter Xu



Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Peter Xu
On Wed, Apr 11, 2018 at 04:25:56PM +0800, Tiwei Bie wrote:
> Hi Peter,
> 
> On Wed, Apr 11, 2018 at 04:00:36PM +0800, Peter Xu wrote:
> > On Wed, Apr 11, 2018 at 03:20:27PM +0800, Tiwei Bie wrote:
> > 
> > [...]
> > 
> > > This is just a RFC for now. It seems that, it doesn't work
> > > as expected when guest is using kernel driver (To handle
> > > this case, it seems that some RAM regions' events also need
> > > to be listened). Any comments would be appreciated! Thanks!
> > 
> > Hi, Tiwei,
> > 
> > What's your kernel command line in the guest?  Is iommu=pt there?
> 
> Yeah, you are right! The related things in kernel command line are:
> 
> iommu=pt intel_iommu=on
> 
> Hmm, how will this param affect vIOMMU's behaviour?..

If iommu=pt is there, guest kernel will try to bypass IOMMU, the IOMMU
regions will be disabled completely in that case, hence it's very
possible that your IOMMU memory listeners won't get anything useful.

Maybe you can consider removing iommu=pt in the guest parameter to see
whether the guest kernel driver could work.

-- 
Peter Xu



Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Peter Xu
On Wed, Apr 11, 2018 at 04:55:25PM +0800, Tiwei Bie wrote:
> On Wed, Apr 11, 2018 at 04:37:16PM +0800, Peter Xu wrote:
> > On Wed, Apr 11, 2018 at 04:25:56PM +0800, Tiwei Bie wrote:
> > > On Wed, Apr 11, 2018 at 04:00:36PM +0800, Peter Xu wrote:
> > > > On Wed, Apr 11, 2018 at 03:20:27PM +0800, Tiwei Bie wrote:
> > > > 
> > > > [...]
> > > > 
> > > > > This is just a RFC for now. It seems that, it doesn't work
> > > > > as expected when guest is using kernel driver (To handle
> > > > > this case, it seems that some RAM regions' events also need
> > > > > to be listened). Any comments would be appreciated! Thanks!
> > > > 
> > > > Hi, Tiwei,
> > > > 
> > > > What's your kernel command line in the guest?  Is iommu=pt there?
> > > 
> > > Yeah, you are right! The related things in kernel command line are:
> > > 
> > > iommu=pt intel_iommu=on
> > > 
> > > Hmm, how will this param affect vIOMMU's behaviour?..
> > 
> > If iommu=pt is there, guest kernel will try to bypass IOMMU, the IOMMU
> > regions will be disabled completely in that case, hence it's very
> > possible that your IOMMU memory listeners won't get anything useful.
> > 
> > Maybe you can consider removing iommu=pt in the guest parameter to see
> > whether the guest kernel driver could work.
> 
> Cool. I'll give it a try! Considering we may also need to
> handle the iommu=pt case, is there any event in QEMU can
> be used to know whether the IOMMU regions are disabled
> or enabled by the guest?

You may consider using a similar approach to the patch below to detect that:

  https://patchwork.kernel.org/patch/9735739/

That was a patch trying to preheat the vhost cache.  It was refused at
the time, but IMHO the logic can be reused.  You can just send the
updates only if your new flag is set.  Then it won't be a preheat any
more, but a must for your hardware to work.

-- 
Peter Xu
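
One possible shape for such a detection, as a rough sketch only (the names
are made up and this is not the refused patch referenced above): register a
MemoryListener on the device's DMA address space and treat IOMMU memory
regions appearing or disappearing as the guest enabling or disabling DMA
remapping.

/* Sketch: watch IOMMU regions come and go in the device's address space.
 * With iommu=pt (or no vIOMMU use), no IOMMU region shows up and the
 * backend can fall back to GPA == IOVA. */
static void need_all_iotlb_region_add(MemoryListener *listener,
                                      MemoryRegionSection *section)
{
    if (memory_region_is_iommu(section->mr)) {
        /* guest enabled DMA remapping: start pushing IOTLB entries */
    }
}

static void need_all_iotlb_region_del(MemoryListener *listener,
                                      MemoryRegionSection *section)
{
    if (memory_region_is_iommu(section->mr)) {
        /* guest disabled DMA remapping again */
    }
}

static MemoryListener need_all_iotlb_listener = {
    .region_add = need_all_iotlb_region_add,
    .region_del = need_all_iotlb_region_del,
};

/* registered somewhere during device setup, e.g.:
 * memory_listener_register(&need_all_iotlb_listener, vdev->dma_as); */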



Re: [Qemu-devel] [Bug 1759264] Re: fpu/softfloat: round_to_int_and_pack refactor broke TriCore ftoi insns

2018-04-11 Thread Alex Bennée

Bastian Koppelmann  writes:

> On 04/10/2018 10:07 PM, Alex Bennée wrote:
>> Yeah it looks like it was missed, the round_to_uint code does it.
>>
>> Do you have a test case I can verify?
>>
>
> For the NaN input 0x the expected result for the flags is that
> flag_invalid is raised.
>
> I can provide you with some TriCore asm, but it is a bit of pain to get
> the gnu assembler to build, since the public version is a decade old.

I'll trust you if you send me a static binary for this particular
verification ;-)

It would be nice to TriCore tests building in tests/tcg/ but I guess we
need an up to date cross compile environment somewhere.

>
> Cheers,
> Bastian


--
Alex Bennée



Re: [Qemu-devel] [Qemu-arm] Crash when running hello-world unikernel for ARM

2018-04-11 Thread Ajay Garg
Hi All.

Just wondering if there is something specific that needs to be changed at
https://github.com/rumpkernel/rumprun/tree/master/platform/hw/arch/arm/integrator
in order to get a hello-world app to run on the "virt" machine?

If so, I would ask the rumprun guys to kindly shed some light on what needs
to be done in order to have something run on a "virt" machine in a QEMU
context.


Thanks and Regards,
Ajay

On Tue, Apr 10, 2018 at 3:49 PM, Ajay Garg  wrote:
> Thanks Peter .. my sincere gratitude.
> You pin-pointed the real issue ..
>
>
>
> On Tue, Apr 10, 2018 at 2:50 PM, Peter Maydell  
> wrote:
>> On 10 April 2018 at 09:16, Ajay Garg  wrote:
>>> On Tue, Apr 10, 2018 at 1:08 PM, Peter Maydell  
>>> wrote:
 What hardware (what CPU, board, etc) is this "rumprun" software
 expecting to run on?
>>>
>>> Yep, just to ensure that there are no cross-compiling issues, I am
>>> building rumprun on the pseudo-real hardware itself.
>>> In our case, the pseudo-real hardware are :
>>>
>>> a)
>>> An ARM32 "virt" hardware/machine in a qemu environment
>>> (https://translatedcode.wordpress.com/2016/11/03/installing-debian-on-qemus-32-bit-arm-virt-board/)
>>>
>>> Once I start  this machine, all environment is arm32, and I compile
>>> rumprun within this environemnt without any cross-compiling.
>>>
>>> b)
>>> A beaglebone-green-wireless.
>>> This is a arm32 machine bottoms-up, so no question of cross-compiling
>>> whatsoever here :)
>>>
>>> In both cases, I then use qemu-system-arm (on the "virt" machine, and
>>> beaglebone-green-wireless itself).
>>
>> That's telling me what setups you're trying to compile in,
>> which doesn't correspond necessarily to what the guest
>> OS is built to run on.
>>
>>> One query : It is apparent that there is nested qemu-virtualization in
>>> step a), could that be an issue?
>>
>> Why are you running this in a nested setup? I don't understand
>> the purpose of doing that. It would be simpler and faster to
>> just run the guest on a QEMU running in your native host system.
>>
>> Assuming this is the source for the guest you're trying to run:
>>
>> https://github.com/rumpkernel/rumprun/tree/master/platform/hw/arch
>>
>> that suggests that the only Arm board it supports is "integrator"
>> (which is an absolutely ancient devboard with very little memory,
>> no PCI and no virtio support). You need to confirm what Arm hardware
>> this 'rumpkernel' is actually intended to run on, and then give QEMU
>> the right command line arguments to emulate that hardware. I can't
>> really help any further, I'm afraid -- you need somebody who knows
>> about this guest OS.
>>
>> thanks
>> -- PMM
>
>
>
> --
> Regards,
> Ajay



-- 
Regards,
Ajay



Re: [Qemu-devel] [Qemu-ppc] [PATCH for 2.13 v2 1/2] spapr: Add ibm, max-associativity-domains property

2018-04-11 Thread Greg Kurz
On Wed, 11 Apr 2018 10:03:48 +1000
David Gibson  wrote:

> On Tue, Apr 10, 2018 at 02:12:25PM -0400, Serhii Popovych wrote:
> > Now recent kernels (i.e. since linux-stable commit a346137e9142
> > ("powerpc/numa: Use ibm,max-associativity-domains to discover possible 
> > nodes")
> > support this property to mark initially memory-less NUMA nodes as "possible"
> > to allow further memory hot-add to them.
> > 
> > Advertise this property for pSeries machines to let guest kernels detect
> > maximum supported node configuration and benefit from kernel side change
> > when hot-add memory to specific, possibly empty before, NUMA node.
> > 
> > Signed-off-by: Serhii Popovych 
> > ---
> >  hw/ppc/spapr.c | 11 +++
> >  1 file changed, 11 insertions(+)
> > 
> > diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> > index 2c0be8c..3f61785 100644
> > --- a/hw/ppc/spapr.c
> > +++ b/hw/ppc/spapr.c
> > @@ -909,6 +909,14 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, 
> > void *fdt)
> >  0, cpu_to_be32(SPAPR_MEMORY_BLOCK_SIZE),
> >  cpu_to_be32(max_cpus / smp_threads),
> >  };
> > +uint32_t maxdomains[] = {
> > +cpu_to_be32(5),
> > +cpu_to_be32(0),
> > +cpu_to_be32(0),
> > +cpu_to_be32(0),
> > +cpu_to_be32(nb_numa_nodes - 1),
> > +cpu_to_be32(0),
> > +};  
> 
> Ah.. close, but not quite right.  This is saying that there's exactly
> one node at the bottom (cpu) level, which isn't what we want.  Instead
> of setting it to 0, we want to completely drop that layer, keeping it
> unspecified.
> 
> To do that you need to change the first cell from 5 to 4 (since only 4
> levels will be listed) and drop the last cell entirely.
> 

Alternatively, if we don't want to do any assumptions on the guest
expectations, it is possible to pass the right value in the 6th cell:

cpu_to_be32(spapr_vcpu_id(spapr, max_cpus - 1))

> >  _FDT(rtas = fdt_add_subnode(fdt, 0, "rtas"));
> >  
> > @@ -945,6 +953,9 @@ static void spapr_dt_rtas(sPAPRMachineState *spapr, 
> > void *fdt)
> >  _FDT(fdt_setprop(fdt, rtas, "ibm,associativity-reference-points",
> >   refpoints, sizeof(refpoints)));
> >  
> > +_FDT(fdt_setprop(fdt, rtas, "ibm,max-associativity-domains",
> > + maxdomains, sizeof(maxdomains)));
> > +
> >  _FDT(fdt_setprop_cell(fdt, rtas, "rtas-error-log-max",
> >RTAS_ERROR_LOG_MAX));
> >  _FDT(fdt_setprop_cell(fdt, rtas, "rtas-event-scan-rate",  
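
Putting David's suggestion into code, the respun property would list only
four levels, something along these lines (a sketch of the suggested shape,
not the actual follow-up patch):

uint32_t maxdomains[] = {
    cpu_to_be32(4),                    /* four levels listed, CPU level dropped */
    cpu_to_be32(0),
    cpu_to_be32(0),
    cpu_to_be32(0),
    cpu_to_be32(nb_numa_nodes - 1),    /* highest NUMA node number */
};

(Or, following Greg's alternative, keep five levels and put
spapr_vcpu_id(spapr, max_cpus - 1) into the sixth cell.)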
> 




Re: [Qemu-devel] [PATCH 2/2] qemu-thread: let cur_mon be per-thread

2018-04-11 Thread Peter Xu
On Wed, Apr 11, 2018 at 11:23:57AM +0200, Paolo Bonzini wrote:
> On 11/04/2018 05:49, Peter Xu wrote:
> > On Wed, Apr 11, 2018 at 09:45:32AM +0800, Stefan Hajnoczi wrote:
> >> On Tue, Apr 10, 2018 at 08:49:13PM +0800, Peter Xu wrote:
> >>> cur_mon was only used in main loop so we don't really need that to be
> >>> per-thread variable.  Now it's possible that we have more than one
> >>> thread to operate on it.  Let's start to let it be per-thread variable.
> >> Trying to understand the reason for this patch:
> >>
> >> Are there any users of per-thread cur_mon?
> > 
> > Currently no.  But if considering future OOB-capable commands, they
> > will modify cur_mon in monitor IOThread at least.
> 
> That's fine, but it shouldn't need the inheritance part.  The monitor
> IOThread can set cur_mon when it starts.

Yeah, the inheritance will only make sure cur_mon is always initialized
with the correct value, just like when we are without Out-Of-Band.
For example, it's still possible that a thread is created within a QMP
handler.  Without the current change, cur_mon in that new thread
would be NULL.

AFAIU even if cur_mon==NULL we should mostly be fine (e.g.,
error_vprintf will handle that case well).  If any of you can help me
confirm this, then I agree that this patch is not really needed.

If so, maybe we don't even need to set up cur_mon at the entry of the
monitor iothread, since cur_mon is always used in a way like:

  old_mon = cur_mon;
  cur_mon = xxx;
  ... (do something)
  cur_mon = old_mon;

And it'll be fine if old_mon==NULL here.  Then IMHO the only thing we
need to do is mark cur_mon as per-thread and we're done.

Thanks,

-- 
Peter Xu



Re: [Qemu-devel] [PATCH] iotests: fix 169

2018-04-11 Thread Vladimir Sementsov-Ogievskiy

11.04.2018 12:02, Vladimir Sementsov-Ogievskiy wrote:

03.04.2018 23:13, John Snow wrote:


On 04/03/2018 12:23 PM, Max Reitz wrote:

On 2018-03-30 18:10, Vladimir Sementsov-Ogievskiy wrote:

Use MIGRATION events instead of RESUME. Also, make a TODO: enable
dirty-bitmaps capability for offline case.

This (likely) fixes racy faults at least of the following types:

 - timeout on waiting for RESUME event
 - sha256 mismatch on 136 (138 after this patch)

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---

This patch is a real change for the test anyway. But I don't understand
why (and whether it really does) it fixes things. And I'm not sure whether we
really have a bug in bitmap migration or persistence. So it's up to you
whether to take it into 2.12...

It was already discussed that the "STOP" event is bad for tests. What about
"RESUME"? How can we miss it? And the sha256 mismatch is really strange.

Max, please check whether it fixes 169 for you.

  tests/qemu-iotests/169 | 44 +++-
  1 file changed, 23 insertions(+), 21 deletions(-)

This makes the test pass (thanks!), but it still leaves behind five cats...


Max



Hmm:

jhuston  14772  0.0  0.0   4296   784 pts/3    S    16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14796  0.0  0.0   4296   764 pts/3    S    16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14940  0.0  0.0   4296   788 pts/3    S    16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14964  0.0  0.0   4296   720 pts/3    S    16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  15052  0.0  0.0   4296   768 pts/3    S    16:12   0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file

Why do these get left behind? Nothing to consume the data...?


Ah, I understand. It is due to the last vm_b.shutdown() and vm_b.launch() in
the should_migrate case. So, at the end of the test I restart vm_b with the
-incoming parameter. But it looks like a bug anyway: if we start qemu
with -incoming "exec", shouldn't we kill the cat process if there was no
migration?




A third type of failure, without this patch:

+==
+ERROR: test__persistent__migbitmap__offline_shared (__main__.TestDirtyBitmapMigration)
+methodcaller(name, ...) --> methodcaller object
+--
+Traceback (most recent call last):
+  File "169", line 135, in do_test_migration
+    self.vm_b.launch()
+  File "/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qemu.py", line 221, in launch
+    self._launch()
+  File "/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qemu.py", line 244, in _launch
+    self._post_launch()
+  File "/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qtest.py", line 100, in _post_launch
+    super(QEMUQtestMachine, self)._post_launch()
+  File "/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qemu.py", line 196, in _post_launch
+    self._qmp.accept()
+  File "/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 157, in accept
+    return self.__negotiate_capabilities()
+  File "/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 75, in __negotiate_capabilities
+    resp = self.cmd('qmp_capabilities')
+  File "/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 191, in cmd
+    return self.cmd_obj(qmp_cmd)
+  File "/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 174, in cmd_obj
+    resp = self.__json_read()
+  File "/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", line 82, in __json_read
+    data = self.__sockfile.readline()
+  File "/usr/lib64/python2.7/socket.py", line 447, in readline
+    data = self._sock.recv(self._rbufsize)
+error: [Errno 104] Connection reset by peer
+


--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCH 2/2] qemu-thread: let cur_mon be per-thread

2018-04-11 Thread Peter Xu
On Wed, Apr 11, 2018 at 11:38:58AM +0200, Paolo Bonzini wrote:
> On 11/04/2018 11:35, Peter Xu wrote:
> > Yeah, the inheritance will only make sure cur_mon be initialized
> > always with correct value just like when we are without Out-Of-Band.
> > For example, it's still possible a thread is created within a QMP
> > handler.  If without current change, the cur_mon in the new thread
> > would be NULL.
> > 
> > AFAIU even if cur_mon==NULL we should mostly be fine (e.g.,
> > error_vprintf will handle that case well).  If any of you can help me
> > confirm this, then I agree that this patch is not really needed.
> 
> Yes, though the solution is to not use error_vprintf from auxiliary
> threads. :)  Just make sure all the communication happens through a
> struct that's passed in the thread entry point, and possibly bottom
> halves from the auxiliary thread to the monitor iothread.

Okay. :) Thanks for confirming.  Then let me repost this patch without
touching the qemu-threads.

Btw, do you want me to repost the first patch separately too, or keep
the code as is?  I believe it depends on whether you treat that one as
a cleanup or not.  Thanks,

-- 
Peter Xu
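
As an aside, the pattern Paolo describes (quoted above) looks roughly like
this in code - a minimal sketch with made-up names, passing everything
through a context struct and reporting back via a bottom half instead of
touching cur_mon from the auxiliary thread:

typedef struct WorkerCtx {
    AioContext *mon_ctx;    /* the monitor iothread's AioContext */
    char *result;           /* filled in by the worker */
} WorkerCtx;

static void report_bh(void *opaque)
{
    WorkerCtx *ctx = opaque;

    /* runs in the monitor iothread: safe to talk to the monitor here */
    g_free(ctx->result);
    g_free(ctx);
}

static void *worker_thread(void *opaque)
{
    WorkerCtx *ctx = opaque;

    ctx->result = g_strdup("done");    /* the real work goes here */
    aio_bh_schedule_oneshot(ctx->mon_ctx, report_bh, ctx);
    return NULL;
}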



Re: [Qemu-devel] [PATCH] migration: Don't activate block devices if using -S

2018-04-11 Thread Jiri Denemark
On Tue, Apr 10, 2018 at 16:47:56 +0200, Kevin Wolf wrote:
> Am 10.04.2018 um 16:22 hat Dr. David Alan Gilbert geschrieben:
> > * Kevin Wolf (kw...@redhat.com) wrote:
> > > Am 10.04.2018 um 12:40 hat Dr. David Alan Gilbert geschrieben:
> > > > Hmm; having chatted to Jiri I'm OK with reverting it, on the condition
> > > > that I actually understand how this alternative would work first.
> > > > 
> > > > I can't currently see how a block-inactivate would be used.
> > > > I also can't see how a block-activate unless it's also with the
> > > > change that you're asking to revert.
> > > > 
> > > > Can you explain the way you see it working?
> > > 
> > > The key is making the delayed activation of block devices (and probably
> > > delayed announcement of NICs? - you didn't answer that part) optional
> > > instead of making it the default.
> > 
> > NIC announcments are broken in similar but slightly different ways;  we
> > did have a series on list to help a while ago but it never got merged;
> > I'd like to keep that mess separate.
> 
> Okay. I just thought that it would make sense to have clear migration
> phases that are the same for all external resources that the QEMU
> processes use.

I don't think NIC announcements should be delayed in this specific case
since we're dealing with a failure recovery which should be rare in
comparison to successful migration when we want NIC announcements to be
send early. In other words, any NIC issues should be solved separately
and Laine would likely be a better person for discussing them since he
has a broader knowledge of all the fancy network stuff which libvirt
needs to coordinate with.

Jirka



Re: [Qemu-devel] [PATCH v1 10/24] tests/tcg/multiarch: Build fix for linux-test

2018-04-11 Thread Thomas Huth
On 10.04.2018 21:39, Alex Bennée wrote:
> From: Fam Zheng 
> 
> To keep the compiler happy, and to fit in our buildsys flags:
> 
> - Make local functions "static"
> - #ifdef out unused functions
> - drop cutils/osdep dependencies
> 
> Signed-off-by: Fam Zheng 
> [AJB: drop cutils/osdep dependencies]
> Signed-off-by: Alex Bennée 
> ---
>  tests/tcg/multiarch/linux-test.c | 68 
> +---
>  1 file changed, 21 insertions(+), 47 deletions(-)

Reviewed-by: Thomas Huth 



Re: [Qemu-devel] [PATCH v1 11/24] tests/tcg: move i386 specific tests into subdir

2018-04-11 Thread Thomas Huth
On 10.04.2018 21:39, Alex Bennée wrote:
> These only need to be built for i386 guests. This includes a stub
> tests/tcg/i386/Makfile.target which absorbs some of what was in
> tests/tcg/Makefile.
> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/tcg/README|  39 
> 
>  tests/tcg/i386/Makefile.target  |  10 
>  tests/tcg/i386/README   |  38 +++
>  tests/tcg/{ => i386}/hello-i386.c   |   0
>  tests/tcg/{ => i386}/pi_10.com  | Bin
>  tests/tcg/{ => i386}/runcom.c   |   0
>  tests/tcg/{ => i386}/test-i386-code16.S |   0
>  tests/tcg/{ => i386}/test-i386-fprem.c  |   0
>  tests/tcg/{ => i386}/test-i386-muldiv.h |   0
>  tests/tcg/{ => i386}/test-i386-shift.h  |   0
>  tests/tcg/{ => i386}/test-i386-ssse3.c  |   0
>  tests/tcg/{ => i386}/test-i386-vm86.S   |   0

Maybe remove the "-i386-" part from the file name now?

Anyway:

Reviewed-by: Thomas Huth 



[Qemu-devel] [PATCH] cpu: skip unplugged cpu when querying cpus

2018-04-11 Thread linzhecheng
From: XuYandong 

After the vcpu1 thread exits, the vcpu0 thread (which received the
notification) is still waiting to take qemu_global_mutex in cpu_remove_sync;
at this moment, vcpu1 is still in the global cpus list.
If the main thread grabs qemu_global_mutex in order to handle the qmp command
"info cpus", qmp_query_cpus visiting the unplugged vcpu1 will lead the qemu
process to exit.

Signed-off-by: XuYandong 
---
 cpus.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/cpus.c b/cpus.c
index 2cb0af9..9b3a6c4 100644
--- a/cpus.c
+++ b/cpus.c
@@ -2018,6 +2018,11 @@ CpuInfoList *qmp_query_cpus(Error **errp)
 
 CPU_FOREACH(cpu) {
 CpuInfoList *info;
+
+if (cpu->unplug) {
+continue;
+}
+
 #if defined(TARGET_I386)
 X86CPU *x86_cpu = X86_CPU(cpu);
 CPUX86State *env = &x86_cpu->env;
-- 
1.8.3.1





Re: [Qemu-devel] [Bug 1759264] Re: fpu/softfloat: round_to_int_and_pack refactor broke TriCore ftoi insns

2018-04-11 Thread Bastian Koppelmann
On 04/11/2018 01:01 PM, Alex Bennée wrote:
> 
> Bastian Koppelmann  writes:
> 
>> On 04/10/2018 10:07 PM, Alex Bennée wrote:
>>> Yeah it looks like it was missed, the round_to_uint code does it.
>>>
>>> Do you have a test case I can verify?
>>>
>>
>> For the NaN input 0x the expected result for the flags is that
>> flag_invalid is raised.
>>
>> I can provide you with some TriCore asm, but it is a bit of pain to get
>> the gnu assembler to build, since the public version is a decade old.
> 
> I'll trust you if you send me a static binary for this particular
> verification ;-)
> 
> It would be nice to TriCore tests building in tests/tcg/ but I guess we
> need an up to date cross compile environment somewhere.

That is the problem. There is a gcc/binutils port done by some German
company, and it's not easy to get the source. The one they provide is
pretty old and needs some patching to get it building on modern
machines. Right now I'm trying to set up a test environment.

Cheers,
Bastian



Re: [Qemu-devel] [PATCH 2/2] qemu-thread: let cur_mon be per-thread

2018-04-11 Thread Paolo Bonzini
On 11/04/2018 05:49, Peter Xu wrote:
> On Wed, Apr 11, 2018 at 09:45:32AM +0800, Stefan Hajnoczi wrote:
>> On Tue, Apr 10, 2018 at 08:49:13PM +0800, Peter Xu wrote:
>>> cur_mon was only used in main loop so we don't really need that to be
>>> per-thread variable.  Now it's possible that we have more than one
>>> thread to operate on it.  Let's start to let it be per-thread variable.
>> Trying to understand the reason for this patch:
>>
>> Are there any users of per-thread cur_mon?
> 
> Currently no.  But if considering future OOB-capable commands, they
> will modify cur_mon in monitor IOThread at least.

That's fine, but it shouldn't need the inheritance part.  The monitor
IOThread can set cur_mon when it starts.

In general, relying on cur_mon should be avoided.

Paolo



[Qemu-devel] [Bug 1762179] Re: Record and replay replay fails with: "ERROR:replay/replay-time.c:49:replay_read_clock: assertion failed"

2018-04-11 Thread Ciro Santilli 六四事件 法轮功
** Description changed:

  QEMU master at 915d34c5f99b0ab91517c69f54272bfdb6ca2b32 Ubuntu 17.10
  host.
  
  QEMU commands:
  
  ```
  #!/usr/bin/env bash
  cmd="\
  time \
  ./x86_64-softmmu/qemu-system-x86_64 \
  -append 'root=/dev/sda console=ttyS0 nokaslr printk.time=y - 
lkmc_eval=\"/rand_check.out;/sbin/ifup -a;wget -S google.com;/poweroff.out;\"' \
  -kernel 'out/x86_64/buildroot/images/bzImage' \
  -nographic \
  \
  -drive 
file=out/x86_64/buildroot/images/rootfs.ext2.qcow2,if=none,id=img-direct,format=qcow2
 \
  -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay \
  -device ide-hd,drive=img-blkreplay \
  \
  -netdev user,id=net1 \
  -device rtl8139,netdev=net1 \
  -object filter-replay,id=replay,netdev=net1 \
  "
  echo "$cmd"
  eval "$cmd -icount 'shift=7,rr=record,rrfile=replay.bin'"
  eval "$cmd -icount 'shift=7,rr=replay,rrfile=replay.bin'"
  ```
  
  This tries to stay as close as possible to the documented commands:
  
https://github.com/qemu/qemu/blob/08e173f29461396575c85510eb41474b993cb1fb/docs/replay.txt#L28
  
  Images uploaded to: https://github.com/cirosantilli/linux-kernel-module-
  cheat/releases/download/test-replay-arm/images4.zip
  
  Images generated with: https://github.com/cirosantilli/linux-kernel-
  module-cheat/tree/9513c162ef57e6cb70006dfe870856f94ee9a133
  
  The replay failed straight out with:
  
  ```
  ERROR:replay/replay-time.c:49:replay_read_clock: assertion failed: 
(replay_file && replay_mutex_locked())
  ```
  
  QEMU configure:
  
  ```
- ./configure --enable-debug --enable-trace-backends=simple 
--target-list="x86_64-softmmu"
+ ./configure --enable-debug --enable-trace-backends=simple 
--target-list=x86_64-softmmu
  ```

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1762179

Title:
  Record and replay replay fails with: "ERROR:replay/replay-
  time.c:49:replay_read_clock: assertion failed"

Status in QEMU:
  New

Bug description:
  QEMU master at 915d34c5f99b0ab91517c69f54272bfdb6ca2b32 Ubuntu 17.10
  host.

  QEMU commands:

  ```
  #!/usr/bin/env bash
  cmd="\
  time \
  ./x86_64-softmmu/qemu-system-x86_64 \
  -append 'root=/dev/sda console=ttyS0 nokaslr printk.time=y - 
lkmc_eval=\"/rand_check.out;/sbin/ifup -a;wget -S google.com;/poweroff.out;\"' \
  -kernel 'out/x86_64/buildroot/images/bzImage' \
  -nographic \
  \
  -drive 
file=out/x86_64/buildroot/images/rootfs.ext2.qcow2,if=none,id=img-direct,format=qcow2
 \
  -drive driver=blkreplay,if=none,image=img-direct,id=img-blkreplay \
  -device ide-hd,drive=img-blkreplay \
  \
  -netdev user,id=net1 \
  -device rtl8139,netdev=net1 \
  -object filter-replay,id=replay,netdev=net1 \
  "
  echo "$cmd"
  eval "$cmd -icount 'shift=7,rr=record,rrfile=replay.bin'"
  eval "$cmd -icount 'shift=7,rr=replay,rrfile=replay.bin'"
  ```

  This tries to stay as close as possible to the documented commands:
  
https://github.com/qemu/qemu/blob/08e173f29461396575c85510eb41474b993cb1fb/docs/replay.txt#L28

  Images uploaded to: https://github.com/cirosantilli/linux-kernel-
  module-cheat/releases/download/test-replay-arm/images4.zip

  Images generated with: https://github.com/cirosantilli/linux-kernel-
  module-cheat/tree/9513c162ef57e6cb70006dfe870856f94ee9a133

  The replay failed straight out with:

  ```
  ERROR:replay/replay-time.c:49:replay_read_clock: assertion failed: 
(replay_file && replay_mutex_locked())
  ```

  QEMU configure:

  ```
  ./configure --enable-debug --enable-trace-backends=simple 
--target-list=x86_64-softmmu
  ```

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1762179/+subscriptions



Re: [Qemu-devel] [PATCH v1 01/24] configure: add test for docker availability

2018-04-11 Thread Alex Bennée

Fam Zheng  writes:

> On Tue, 04/10 20:38, Alex Bennée wrote:
>> This tests for a working docker installation without sudo and sets up
>> config-host.mak accordingly. This will be useful for cross compiling
>> things in the future.
>>
>> Signed-off-by: Alex Bennée 
>> ---
>>  configure | 23 +++
>>  1 file changed, 23 insertions(+)
>>
>> diff --git a/configure b/configure
>> index 4d0e92c96c..b402befe94 100755
>> --- a/configure
>> +++ b/configure
>> @@ -451,6 +451,7 @@ jemalloc="no"
>>  replication="yes"
>>  vxhs=""
>>  libxml2=""
>> +docker="no"
>>
>>  supported_cpu="no"
>>  supported_os="no"
>> @@ -5396,6 +5397,23 @@ EOF
>>fi
>>  fi
>>
>> +##
>> +# Docker and cross-compiler support
>> +#
>> +# This is specifically for building test
>> +# cases for foreign architectures, not
>> +# cross-compiling QEMU itself.
>> +
>> +if has "docker"; then
>> +if docker images  >/dev/null 2>&1 ; then
>> +docker="yes"
>> +else
>> +# docker may be available but using sudo
>> +# so we won't use it for cross-building
>> +docker="maybe"
>
> What is the problem with using sudo for cross-building?

Nothing in particular, but we need some way of testing if the sudo is
passwordless; otherwise you might find the build stuck waiting for user
interaction. This is fine for "make docker-foo", but for an eventual
unattended "make check" this may cause problems.

Is there a way we can test for this? Maybe we can push the docker probe
into docker.py and just return to configure if it can run docker
unattended?

--
Alex Bennée



Re: [Qemu-devel] [PULL 0/1] migration queue

2018-04-11 Thread Peter Maydell
On 10 April 2018 at 16:03, Dr. David Alan Gilbert (git)
 wrote:
> From: "Dr. David Alan Gilbert" 
>
> The following changes since commit df6378eb0e6cfd58a22a1c3ff8fa4a9039f1eaa8:
>
>   Merge remote-tracking branch 'remotes/kraxel/tags/ui-20180410-pull-request' 
> into staging (2018-04-10 14:04:27 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/dagrh/qemu.git tags/pull-migration-20180410a
>
> for you to fetch changes up to a18a73d7472e6ff5bc1e5f7fb9f7a42464295d03:
>
>   Revert "migration: Don't activate block devices if using -S" (2018-04-10 
> 15:28:42 +0100)
>
> 
> Migration reversion pull for 2.12
>
> One to revert after we decided it needs some more thinking.
>
> 
> Dr. David Alan Gilbert (1):
>   Revert "migration: Don't activate block devices if using -S"
>
>  migration/migration.c | 22 +++---
>  1 file changed, 7 insertions(+), 15 deletions(-)
>

Applied, thanks.

-- PMM



Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Tiwei Bie
On Wed, Apr 11, 2018 at 04:37:16PM +0800, Peter Xu wrote:
> On Wed, Apr 11, 2018 at 04:25:56PM +0800, Tiwei Bie wrote:
> > On Wed, Apr 11, 2018 at 04:00:36PM +0800, Peter Xu wrote:
> > > On Wed, Apr 11, 2018 at 03:20:27PM +0800, Tiwei Bie wrote:
> > > 
> > > [...]
> > > 
> > > > This is just a RFC for now. It seems that, it doesn't work
> > > > as expected when guest is using kernel driver (To handle
> > > > this case, it seems that some RAM regions' events also need
> > > > to be listened). Any comments would be appreciated! Thanks!
> > > 
> > > Hi, Tiwei,
> > > 
> > > What's your kernel command line in the guest?  Is iommu=pt there?
> > 
> > Yeah, you are right! The related things in kernel command line are:
> > 
> > iommu=pt intel_iommu=on
> > 
> > Hmm, how will this param affect vIOMMU's behaviour?..
> 
> If iommu=pt is there, guest kernel will try to bypass IOMMU, the IOMMU
> regions will be disabled completely in that case, hence it's very
> possible that your IOMMU memory listeners won't get anything useful.
> 
> Maybe you can consider removing iommu=pt in the guest parameter to see
> whether the guest kernel driver could work.

Cool. I'll give it a try! Considering we may also need to
handle the iommu=pt case, is there any event in QEMU that can
be used to know whether the IOMMU regions are disabled
or enabled by the guest?

Best regards,
Tiwei Bie

> 
> -- 
> Peter Xu



Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Tiwei Bie
On Wed, Apr 11, 2018 at 05:16:47PM +0800, Peter Xu wrote:
> On Wed, Apr 11, 2018 at 04:55:25PM +0800, Tiwei Bie wrote:
> > On Wed, Apr 11, 2018 at 04:37:16PM +0800, Peter Xu wrote:
> > > On Wed, Apr 11, 2018 at 04:25:56PM +0800, Tiwei Bie wrote:
> > > > On Wed, Apr 11, 2018 at 04:00:36PM +0800, Peter Xu wrote:
> > > > > On Wed, Apr 11, 2018 at 03:20:27PM +0800, Tiwei Bie wrote:
> > > > > 
> > > > > [...]
> > > > > 
> > > > > > This is just a RFC for now. It seems that, it doesn't work
> > > > > > as expected when guest is using kernel driver (To handle
> > > > > > this case, it seems that some RAM regions' events also need
> > > > > > to be listened). Any comments would be appreciated! Thanks!
> > > > > 
> > > > > Hi, Tiwei,
> > > > > 
> > > > > What's your kernel command line in the guest?  Is iommu=pt there?
> > > > 
> > > > Yeah, you are right! The related things in kernel command line are:
> > > > 
> > > > iommu=pt intel_iommu=on
> > > > 
> > > > Hmm, how will this param affect vIOMMU's behaviour?..
> > > 
> > > If iommu=pt is there, guest kernel will try to bypass IOMMU, the IOMMU
> > > regions will be disabled completely in that case, hence it's very
> > > possible that your IOMMU memory listeners won't get anything useful.
> > > 
> > > Maybe you can consider removing iommu=pt in the guest parameter to see
> > > whether the guest kernel driver could work.
> > 
> > Cool. I'll give it a try! Considering we may also need to
> > handle the iommu=pt case, is there any event in QEMU that can
> > be used to know whether the IOMMU regions are disabled
> > or enabled by the guest?
> 
> You may consider to use similar way like below patch to detect that:
> 
>   https://patchwork.kernel.org/patch/9735739/
> 
> That was a patch trying to preheat the vhost cache.  It was refused at
> that time, but IMHO the logic can be used.  You can just send the
> updates only if your new flag is set.  Then that won't be a preheat any
> more, but a must for your hardware to work.

It looks pretty helpful in my case! Thank you so much!

Best regards,
Tiwei Bie

> 
> -- 
> Peter Xu
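
(For reference, the kind of detection Peter points at could be sketched as a
MemoryListener whose region_add callback watches for IOMMU memory regions
appearing in the address space. This is purely illustrative and is not code
from the referenced patch.)

    #include "qemu/osdep.h"
    #include "exec/memory.h"

    /* Sketch: get notified when the guest turns its vIOMMU on, i.e. when
     * IOMMU memory regions become part of the address space. */
    static void iommu_region_add(MemoryListener *listener,
                                 MemoryRegionSection *section)
    {
        if (memory_region_is_iommu(section->mr)) {
            /* The guest enabled DMA remapping for this range; start
             * pushing IOTLB entries to the backend here. */
        }
    }

    static MemoryListener iommu_detect_listener = {
        .region_add = iommu_region_add,
    };

    static void iommu_detect_register(void)
    {
        memory_listener_register(&iommu_detect_listener,
                                 &address_space_memory);
    }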



Re: [Qemu-devel] [PATCH v1 09/24] tests/tcg: move architecture independent tests into subdir

2018-04-11 Thread Thomas Huth
On 10.04.2018 21:39, Alex Bennée wrote:
> We will want to build these for all supported guest architectures, so
> let's move them all into one place. We also drop test_path at this
> point because it needs qemu utils and glib bits which are hard to
> support when cross compiling.
> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/tcg/README   |  10 +--
>  tests/tcg/multiarch/README |   1 +
>  tests/tcg/{ => multiarch}/linux-test.c |   0
>  tests/tcg/{ => multiarch}/sha1.c   |   0
>  tests/tcg/{ => multiarch}/test-mmap.c  |   0
>  tests/tcg/{ => multiarch}/testthread.c |   0
>  tests/tcg/test_path.c  | 157 
> -
>  7 files changed, 5 insertions(+), 163 deletions(-)
>  create mode 100644 tests/tcg/multiarch/README
>  rename tests/tcg/{ => multiarch}/linux-test.c (100%)
>  rename tests/tcg/{ => multiarch}/sha1.c (100%)
>  rename tests/tcg/{ => multiarch}/test-mmap.c (100%)
>  rename tests/tcg/{ => multiarch}/testthread.c (100%)
>  delete mode 100644 tests/tcg/test_path.c

Reviewed-by: Thomas Huth 



Re: [Qemu-devel] [PATCH for-2.12 v2] qemu-iotests: update 185 output

2018-04-11 Thread Max Reitz
On 2018-04-10 10:11, Stefan Hajnoczi wrote:
> On Wed, Apr 04, 2018 at 06:16:12PM +0200, Max Reitz wrote:
>> On 2018-04-04 17:01, Stefan Hajnoczi wrote:
>>  === Start mirror job and exit qemu ===
>>
>> This seems to be independent of whether there is actually data on
>> TEST_IMG (the commit source), so something doesn't seem quite right with
>> the block job throttling here...?
> 
> I've been playing around with this test failure.  Let's leave it broken
> (!) in QEMU 2.12 because this test has uncovered a block job ratelimit
> issue that's not a regression.  The fix shouldn't be rushed.

OK for me.

> block/mirror.c calculates delay_ns but then discards it:
> 
>   static void coroutine_fn mirror_run(void *opaque)
>   {
>   ...
> 
>   for (;;) {
>   ...delay_ns is calculated based on job speed...
> 
>   ret = 0;
>   trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
>   if (block_job_is_cancelled(&s->common) && s->common.force) {
>   break;
>   } else if (!should_complete) {
>   delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
>   ^--- we discard it!
>   block_job_sleep_ns(&s->common, delay_ns);
>   }
>   s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>   }
> 
>   ...
>   }

Uh, nice.

> This is why the mirror-based tests are completing even though we'd
> expect them to only reach the "next" iteration due to the job speed.
> 
> Increasing the 4 MB write operation in the test to >16 MB (the mirror
> buffer size) doesn't solve the problem because the next QMP command will
> race with the job's 0 ns sleep.  It would just make the test unreliable.
> 
> I'll work on the following for QEMU 2.13:
> 
> 1. Avoid spurious block_job_enter() from block_job_drain().  This
>eliminates the extra block job iteration that changed the output in
>the first place.
> 
> 2. Honor the job speed in block/mirror.c.
> 
> The end result will be that the test output will not require changes.

Thanks!

Max
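
(As a purely illustrative sketch of the fix direction mentioned above, i.e.
honouring the delay computed from the job speed instead of unconditionally
overwriting it; this assumes the MirrorBlockJob fields shown in the quoted
code and is not the actual block/mirror.c change.)

    /* Sketch: the decision the loop needs to make, pulled into a helper. */
    static int64_t mirror_pick_sleep_ns(MirrorBlockJob *s, int64_t delay_ns,
                                        int64_t cnt)
    {
        if (s->in_flight == 0 && cnt == 0) {
            return SLICE_TIME;  /* idle: wake up again after one slice */
        }
        return delay_ns;        /* busy: honour the rate-limit delay */
    }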



signature.asc
Description: OpenPGP digital signature


[Qemu-devel] [PATCH v5 0/1] i386/kvm: TSC page clocksource for Hyper-V-on-KVM fixes

2018-04-11 Thread Vitaly Kuznetsov
Changes since v4:
- Rebase on top of Roman's patches.
- Drop PATCH2 as it is no longer needed (after adding explicit
  hv_frequencies).

Previously, Ladi was working on enabling TSC page clocksource for nested
Hyper-V-on-KVM workloads. He found out that if Hyper-V frequency MSRs are
exposed to L1 as well as INVTSC flag Hyper-V enables TSC page clocksource
to its guests. Qemu doesn't pass INVTSC by default as it is a migration
blocker.

I found out there's a different way to make Hyper-V like us: expose
Reenlightenment MSRs to it. KVM doesn't fully support the feature as
we're still unable to migrate nested environments, but the rudimentary
support we have there is enough.

Enable Hyper-V reenlightenment MSRs to make things work.

Vitaly Kuznetsov (1):
  i386/kvm: add support for Hyper-V reenlightenment MSRs

 target/i386/cpu.c  |  4 +++-
 target/i386/cpu.h  |  4 
 target/i386/hyperv-proto.h |  9 -
 target/i386/kvm.c  | 39 ++-
 target/i386/machine.c  | 24 
 5 files changed, 77 insertions(+), 3 deletions(-)

-- 
2.14.3




[Qemu-devel] [PATCH v5 1/1] i386/kvm: add support for Hyper-V reenlightenment MSRs

2018-04-11 Thread Vitaly Kuznetsov
KVM recently gained support for Hyper-V Reenlightenment MSRs which are
required to make KVM-on-Hyper-V enable TSC page clocksource to its guests
when INVTSC is not passed to it (and it is not passed by default in Qemu
as it effectively blocks migration).

Signed-off-by: Vitaly Kuznetsov 
---
Changes since v4:
- Rebase on top of Roman's patches.
---
 target/i386/cpu.c  |  4 +++-
 target/i386/cpu.h  |  4 
 target/i386/hyperv-proto.h |  9 -
 target/i386/kvm.c  | 39 ++-
 target/i386/machine.c  | 24 
 5 files changed, 77 insertions(+), 3 deletions(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 1a6b082b6f..e0e7a16d21 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -409,7 +409,8 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
 NULL /* hv_vpindex_access */, NULL /* hv_msr_reset_access */,
 NULL /* hv_msr_stats_access */, NULL /* hv_reftsc_access */,
 NULL /* hv_msr_idle_access */, NULL /* hv_msr_frequency_access */,
-NULL, NULL, NULL, NULL,
+NULL /* hv_msr_debug_access */, NULL /* 
hv_msr_reenlightenment_access */,
+NULL, NULL,
 NULL, NULL, NULL, NULL,
 NULL, NULL, NULL, NULL,
 NULL, NULL, NULL, NULL,
@@ -4762,6 +4763,7 @@ static Property x86_cpu_properties[] = {
 DEFINE_PROP_BOOL("hv-synic", X86CPU, hyperv_synic, false),
 DEFINE_PROP_BOOL("hv-stimer", X86CPU, hyperv_stimer, false),
 DEFINE_PROP_BOOL("hv-frequencies", X86CPU, hyperv_frequencies, false),
+DEFINE_PROP_BOOL("hv-reenlightenment", X86CPU, hyperv_reenlightenment, 
false),
 DEFINE_PROP_BOOL("check", X86CPU, check_cpuid, true),
 DEFINE_PROP_BOOL("enforce", X86CPU, enforce_cpuid, false),
 DEFINE_PROP_BOOL("kvm", X86CPU, expose_kvm, true),
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 1b219fafc4..b58b779bff 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -1174,6 +1174,9 @@ typedef struct CPUX86State {
 uint64_t msr_hv_synic_sint[HV_SINT_COUNT];
 uint64_t msr_hv_stimer_config[HV_STIMER_COUNT];
 uint64_t msr_hv_stimer_count[HV_STIMER_COUNT];
+uint64_t msr_hv_reenlightenment_control;
+uint64_t msr_hv_tsc_emulation_control;
+uint64_t msr_hv_tsc_emulation_status;
 
 uint64_t msr_rtit_ctrl;
 uint64_t msr_rtit_status;
@@ -1297,6 +1300,7 @@ struct X86CPU {
 bool hyperv_synic;
 bool hyperv_stimer;
 bool hyperv_frequencies;
+bool hyperv_reenlightenment;
 bool check_cpuid;
 bool enforce_cpuid;
 bool expose_kvm;
diff --git a/target/i386/hyperv-proto.h b/target/i386/hyperv-proto.h
index cb4d7f2b7a..93352ebd2a 100644
--- a/target/i386/hyperv-proto.h
+++ b/target/i386/hyperv-proto.h
@@ -35,7 +35,7 @@
 #define HV_RESET_AVAILABLE   (1u << 7)
 #define HV_REFERENCE_TSC_AVAILABLE   (1u << 9)
 #define HV_ACCESS_FREQUENCY_MSRS (1u << 11)
-
+#define HV_ACCESS_REENLIGHTENMENTS_CONTROL  (1u << 13)
 
 /*
  * HV_CPUID_FEATURES.EDX bits
@@ -129,6 +129,13 @@
 #define HV_X64_MSR_CRASH_CTL0x4105
 #define HV_CRASH_CTL_NOTIFY (1ull << 63)
 
+/*
+ * Reenlightenment notification MSRs
+ */
+#define HV_X64_MSR_REENLIGHTENMENT_CONTROL  0x4106
+#define HV_X64_MSR_TSC_EMULATION_CONTROL0x4107
+#define HV_X64_MSR_TSC_EMULATION_STATUS 0x4108
+
 /*
  * Hypercall status code
  */
diff --git a/target/i386/kvm.c b/target/i386/kvm.c
index 6c49954e68..da4b19 100644
--- a/target/i386/kvm.c
+++ b/target/i386/kvm.c
@@ -90,6 +90,7 @@ static bool has_msr_hv_runtime;
 static bool has_msr_hv_synic;
 static bool has_msr_hv_stimer;
 static bool has_msr_hv_frequencies;
+static bool has_msr_hv_reenlightenment;
 static bool has_msr_xss;
 static bool has_msr_spec_ctrl;
 static bool has_msr_smi_count;
@@ -583,7 +584,8 @@ static bool hyperv_enabled(X86CPU *cpu)
 cpu->hyperv_vpindex ||
 cpu->hyperv_runtime ||
 cpu->hyperv_synic ||
-cpu->hyperv_stimer);
+cpu->hyperv_stimer ||
+cpu->hyperv_reenlightenment);
 }
 
 static int kvm_arch_set_tsc_khz(CPUState *cs)
@@ -669,6 +671,16 @@ static int hyperv_handle_properties(CPUState *cs)
 }
 env->features[FEAT_HYPERV_EDX] |= HV_GUEST_CRASH_MSR_AVAILABLE;
 }
+if (cpu->hyperv_reenlightenment) {
+if (!has_msr_hv_reenlightenment) {
+fprintf(stderr,
+"Hyper-V Reenlightenment MSRs "
+"(requested by 'hv-reenlightenment' cpu flag) "
+"are not supported by kernel\n");
+return -ENOSYS;
+}
+env->features[FEAT_HYPERV_EAX] |= HV_ACCESS_REENLIGHTENMENTS_CONTROL;
+}
 env->features[FEAT_HYPERV_EDX] |= HV_CPU_DYNAMIC_PARTITIONING_AVAILABLE;
 if (cpu->hyperv_reset) {
 if (!has_msr_hv_reset) {
@@ -1215,6 +1227,9 @@ 

Re: [Qemu-devel] [PATCH v5 1/1] i386/kvm: add support for Hyper-V reenlightenment MSRs

2018-04-11 Thread Roman Kagan
On Wed, Apr 11, 2018 at 01:50:36PM +0200, Vitaly Kuznetsov wrote:
> KVM recently gained support for Hyper-V Reenlightenment MSRs which are
> required to make KVM-on-Hyper-V enable TSC page clocksource to its guests
> when INVTSC is not passed to it (and it is not passed by default in Qemu
> as it effectively blocks migration).
> 
> Signed-off-by: Vitaly Kuznetsov 
> ---
> Changes since v4:
> - Rebase on top of Roman's patches.
> ---
>  target/i386/cpu.c  |  4 +++-
>  target/i386/cpu.h  |  4 
>  target/i386/hyperv-proto.h |  9 -
>  target/i386/kvm.c  | 39 ++-
>  target/i386/machine.c  | 24 
>  5 files changed, 77 insertions(+), 3 deletions(-)

Reviewed-by: Roman Kagan 



Re: [Qemu-devel] [PATCH 0/5] Enable postcopy RDMA live migration

2018-04-11 Thread Dr. David Alan Gilbert
* Lidong Chen (jemmy858...@gmail.com) wrote:
> Current Qemu RDMA communication does not support send and receive
> data at the same time, so when RDMA live migration with postcopy
> enabled, the source qemu return path thread get qemu file error.
> 
> Those patch add the postcopy support for RDMA live migration.

This description is a little misleading; it doesn't really
do RDMA during the postcopy phase - what it really does is disable
the RDMA page sending during the postcopy phase, relying on the 
RDMA code's stream emulation to send the pages.

That's not necessarily a bad fix; you get the nice performance of RDMA
during the precopy phase, but how bad are you finding the performance
during the postcopy phase - the RDMA code we have was only really
designed for sending small commands over the stream?

Dave

> Lidong Chen (5):
>   migration: create a dedicated connection for rdma return path
>   migration: add the interface to set get_return_path
>   migration: implement the get_return_path for RDMA iochannel
>   migration: fix qemu crash when RDMA live migration
>   migration: disable RDMA WRITE after postcopy started.
> 
>  migration/qemu-file-channel.c |  12 ++--
>  migration/qemu-file.c |  13 +++-
>  migration/qemu-file.h |   2 +-
>  migration/rdma.c  | 148 
> --
>  4 files changed, 163 insertions(+), 12 deletions(-)
> 
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [Bug 1759264] Re: fpu/softfloat: round_to_int_and_pack refactor broke TriCore ftoi insns

2018-04-11 Thread Bastian Koppelmann
On 04/11/2018 01:01 PM, Alex Bennée wrote:
> Bastian Koppelmann  writes:
> 
>> On 04/10/2018 10:07 PM, Alex Bennée wrote:
>>> Yeah it looks like it was missed, the round_to_uint code does it.
>>>
>>> Do you have a test case I can verify?
>>>
>>
>> For the NaN input 0x the expected result for the flags is that
>> flag_invalid is raised.
>>
>> I can provide you with some TriCore asm, but it is a bit of pain to get
>> the gnu assembler to build, since the public version is a decade old.
> 
> I'll trust you if you send me a static binary for this particular
> verification ;-)

I set up a github repo with working binutils and the corresponding
testcase:

https://github.com/bkoppelmann/tricore-fpu

The one caveat is that we cannot produce any binaries with the TriCore
ISA > 1.3.

In this testcase the last ftoi instruction is supposed to raise the
invalid flag and return 0 since the input was NaN. We did that by
checking for NaN only if any flag was raised after ftoi: we then do the
NaN check and, if it is positive, return 0.
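
(For illustration, the pattern described above might look roughly like the
sketch below, assuming QEMU's softfloat helpers and an env->fp_status field
as in other targets; this is hypothetical, not the actual target/tricore
code.)

    /* Sketch of the float-to-int helper logic described above. */
    uint32_t helper_ftoi_sketch(CPUTriCoreState *env, uint32_t arg)
    {
        float32 f_arg = make_float32(arg);
        int32_t result = float32_to_int32(f_arg, &env->fp_status);

        if (get_float_exception_flags(&env->fp_status)) {
            /* Only do the NaN check if the conversion raised a flag. */
            if (float32_is_any_nan(f_arg)) {
                float_raise(float_flag_invalid, &env->fp_status);
                result = 0;
            }
        }
        return (uint32_t)result;
    }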

Cheers,
Bastian

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1759264

Title:
  fpu/softfloat: round_to_int_and_pack refactor broke TriCore ftoi insns

Status in QEMU:
  New

Bug description:
  After the refactor from ab52f973a504f8de0c5df64631ba4caea70a7d9e the
  behaviour of int32_to_float32() was altered.

  helper_ftoi() in target/tricore/fpu_helper.c relied on
  int32_to_float32 to raise the invalid flag if the input was NaN to
  properly return 0. Likewise if the input is infinity.

  The obvious fix for softfloat would be to raise this flag in 
round_to_int_and_pack(). However,
  I'm not sure if this breaks other targets and I have no easy way to test it.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1759264/+subscriptions



Re: [Qemu-devel] [PATCH v3 6/7] s390x/kvm: handle AP instruction interception

2018-04-11 Thread Tony Krowiak

On 04/09/2018 06:51 AM, Cornelia Huck wrote:

On Mon, 9 Apr 2018 12:37:42 +0200
Halil Pasic  wrote:


On 04/09/2018 11:32 AM, Cornelia Huck wrote:

We can kind of (i.e. modulo EECA.28) ensure this in a different fashion, I
think. How about proclaiming 'has ap instructions, but nothing to see here'
in the SIE interpreted flavor (ECA.28 set) as the default way of having ap
instructions under KVM. This should be easily accomplished with an all-zero
CRYCB and ECA.28 set. Then for the guest to actually get real work done with
AP we would still require some sort of driver to either provide a non-zero
matrix by altering the CRYCB or unset ECA.28 and do the intercepted flavor.

Please notice, the cpu facility ap would still keep its semantics of
'has ap instructions' (as opposed to 'has ap instructions implemented in
the SIE interpreted flavor'), and give us all the flexibility.

Yet implementing what we want to have in the absence of a driver would
become much easier (under the assumption that ECA.28 equals EECA.28).

How about this?

Unfortunately, this is really hard to follow without the AR... let me
summarize it to check whether I got the gist of it :)

- If the "ap" cpu feature is specified, set a bit that indicates "hey,
   we basically have AP support" and create the basics, but don't
   enable actual SIE handling. This means the guest gets exceptions from
   the SIE already and we don't need to emulate them.

Kind of. The bit is ECA.28 and tells SIE 'hey SIE shall interpret ap
instructions for the guest (as specified)'. Then SIE has an SD satellite
called CRYCB that specifies which ap resources the guest is authorized
to use. These are the masks. If we set each mask to all zero, we will
effectively accomplish 'hey, we basically have AP support but no
resources at the moment'. So, right, we don't have to emulate that.

I don't know what do you mean by exceptions. For most legit requests the
SIE should say APQN invalid, with QCI being a notable exception. But
of course SIE would inject program exceptions (access, specification,
and privileged operation) accordingly I guess.

I meant "emulate exceptions"...



In short, the SIE would do what we are trying to emulate in this patch.

...so yes, exactly that.


- Actually enable the missing pieces if a vfio device is created. This
   would enable processing by the SIE, and we would not need to do
   emulation, either (for most of it, IIRC).

Yes. It would actually assign resources to the guest. That would enable
doing real work with the AP instructions.

Ok.


I may be all wrong, though... can we at least have a translation of
ECA.28 and EECA.28 (the "ap is there" bit and the "ap instructions are
interpreted" bit?)
   

I think we have a misunderstanding here. I will wait for Tony. Maybe
he can understand this better or explain in more accessible way.

 From what I get from your explanation, this approach sounds like a good
way forward. But let's wait for Tony.


I agree, this solves the problem of installing AP facilities
on the guest (-cpu xxx,ap=on) without configuring a vfio-ap device
(-device vfio-ap,sysfsdev=$path-to-mediated-device).

To recap:
The realize() function for the vfio-ap device opens the mediated matrix
device file descriptor to set up the communication pathway with the vfio_ap
kernel driver. When the fd is opened, the vfio_ap driver sets ECA.28 for the
guest to instruct SIE to interpret AP instructions. The driver also
configures the AP matrix for the guest in its SIE state description.
Consequently, if a vfio-ap device is not configured for the guest, but AP
facilities are installed, all AP instructions will be intercepted and routed
to QEMU. If there are no interception handlers for the AP instructions, QEMU
injects an operation exception into the guest. This causes the
initialization of the AP bus on the guest to terminate. This patch was
intended to circumvent that problem.

With Halil's suggestion, there is no need to provide these handlers. If
ECA.28 is set for the guest by default when the AP facilities are installed,
then the AP instructions will be interpreted and the AP bus will get
initialized on the guest. Since there is no vfio-ap device to provide the AP
matrix configuration for the guest, the AP bus will not detect any devices,
but that's okay. AP instructions targeting an APQN will execute successfully
and set a response code in the status block returned from the instruction
indicating that the APQN is invalid, but there will be no exception unless
there is truly an exception condition caused by the execution of the
instruction.








Re: [Qemu-devel] [PATCH v1 01/24] configure: add test for docker availability

2018-04-11 Thread Fam Zheng
On Wed, 04/11 11:58, Alex Bennée wrote:
> 
> Fam Zheng  writes:
> 
> > On Tue, 04/10 20:38, Alex Bennée wrote:
> >> This tests for a working docker installation without sudo and sets up
> >> config-host.mak accordingly. This will be useful for cross compiling
> >> things in the future.
> >>
> >> Signed-off-by: Alex Bennée 
> >> ---
> >>  configure | 23 +++
> >>  1 file changed, 23 insertions(+)
> >>
> >> diff --git a/configure b/configure
> >> index 4d0e92c96c..b402befe94 100755
> >> --- a/configure
> >> +++ b/configure
> >> @@ -451,6 +451,7 @@ jemalloc="no"
> >>  replication="yes"
> >>  vxhs=""
> >>  libxml2=""
> >> +docker="no"
> >>
> >>  supported_cpu="no"
> >>  supported_os="no"
> >> @@ -5396,6 +5397,23 @@ EOF
> >>fi
> >>  fi
> >>
> >> +##
> >> +# Docker and cross-compiler support
> >> +#
> >> +# This is specifically for building test
> >> +# cases for foreign architectures, not
> >> +# cross-compiling QEMU itself.
> >> +
> >> +if has "docker"; then
> >> +if docker images  >/dev/null 2>&1 ; then
> >> +docker="yes"
> >> +else
> >> +# docker may be available but using sudo
> >> +# so we won't use it for cross-building
> >> +docker="maybe"
> >
> > What is the problem with using sudo for cross-building?
> 
> Nothing in particular, but we need some way of testing if the sudo is
> passwordless; otherwise you might find the build stuck waiting for user
> interaction. This is fine for "make docker-foo", but for an eventual
> unattended "make check" this may cause problems.
> 
> Is there a way we can test for this? Maybe we can push the docker probe
> into docker.py and just return to configure if it can run docker
> unattended?

We can try 'sudo -n -k docker' to test if passwordless docker works. According
to the manpage, -k ignores the credential cache, and -n ensures non-interaction.

Fam



Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Jason Wang



On 2018年04月11日 21:22, Michael S. Tsirkin wrote:

On Wed, Apr 11, 2018 at 03:20:27PM +0800, Tiwei Bie wrote:

This patch introduces VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB
feature for vhost-user. By default, vhost-user backend needs
to query the IOTLBs from QEMU after meeting unknown IOVAs.
With this protocol feature negotiated, QEMU will provide all
the IOTLBs to vhost-user backend without waiting for the
queries from backend. This is helpful when using a hardware
accelerator which is not able to handle unknown IOVAs at the
vhost-user backend.

Signed-off-by: Tiwei Bie

This is potentially a large amount of data to be sent
on a socket.

I had an impression that a hardware accelerator was using
VFIO anyway. Given this, can't we have QEMU program
the shadow IOMMU tables into VFIO directly?




So if we want to do this, my understanding is we need to implement the 
driver inside qemu (or link it to qemu).


Thanks




[Qemu-devel] [PATCH v2 0/2] bitmaps persistent and migration fixes

2018-04-11 Thread Vladimir Sementsov-Ogievskiy
v2:

01: new, proposed by Max
02: fix leaving cat processes

Vladimir Sementsov-Ogievskiy (2):
  qcow2: try load bitmaps only once
  iotests: fix 169

 block/qcow2.h  |  1 +
 block/qcow2.c  | 16 
 tests/qemu-iotests/169 | 48 +++-
 3 files changed, 36 insertions(+), 29 deletions(-)

-- 
2.11.1




[Qemu-devel] [PATCH v2 2/2] iotests: fix 169

2018-04-11 Thread Vladimir Sementsov-Ogievskiy
Improve and fix 169:
- use MIGRATION events instead of RESUME
- make a TODO: enable dirty-bitmaps capability for offline case
- recreate vm_b without -incoming near test end

This (likely) fixes racy faults at least of the following types:

- timeout on waiting for RESUME event
- sha256 mismatch on line 136 (142 after this patch)
- fail to self.vm_b.launch() on line 135 (141 now after this patch)

And it surely fixes the cat processes left behind after the test finishes.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 tests/qemu-iotests/169 | 48 +++-
 1 file changed, 27 insertions(+), 21 deletions(-)

diff --git a/tests/qemu-iotests/169 b/tests/qemu-iotests/169
index 153b10b6e7..f243db9955 100755
--- a/tests/qemu-iotests/169
+++ b/tests/qemu-iotests/169
@@ -31,6 +31,8 @@ disk_a = os.path.join(iotests.test_dir, 'disk_a')
 disk_b = os.path.join(iotests.test_dir, 'disk_b')
 size = '1M'
 mig_file = os.path.join(iotests.test_dir, 'mig_file')
+mig_cmd = 'exec: cat > ' + mig_file
+incoming_cmd = 'exec: cat ' + mig_file
 
 
 class TestDirtyBitmapMigration(iotests.QMPTestCase):
@@ -49,7 +51,6 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 self.vm_a.launch()
 
 self.vm_b = iotests.VM(path_suffix='b')
-self.vm_b.add_incoming("exec: cat '" + mig_file + "'")
 
 def add_bitmap(self, vm, granularity, persistent):
 params = {'node': 'drive0',
@@ -86,36 +87,30 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
(0xa0201, 0x1000))
 
 should_migrate = migrate_bitmaps or persistent and shared_storage
+mig_caps = [{'capability': 'events', 'state': True}]
+if migrate_bitmaps:
+mig_caps.append({'capability': 'dirty-bitmaps', 'state': True})
 
+result = self.vm_a.qmp('migrate-set-capabilities',
+   capabilities=mig_caps)
+self.assert_qmp(result, 'return', {})
+
+self.vm_b.add_incoming(incoming_cmd if online else "defer")
 self.vm_b.add_drive(disk_a if shared_storage else disk_b)
 
 if online:
 os.mkfifo(mig_file)
 self.vm_b.launch()
+result = self.vm_b.qmp('migrate-set-capabilities',
+   capabilities=mig_caps)
+self.assert_qmp(result, 'return', {})
 
 self.add_bitmap(self.vm_a, granularity, persistent)
 for r in regions:
 self.vm_a.hmp_qemu_io('drive0', 'write %d %d' % r)
 sha256 = self.get_bitmap_hash(self.vm_a)
 
-if migrate_bitmaps:
-capabilities = [{'capability': 'dirty-bitmaps', 'state': True}]
-
-result = self.vm_a.qmp('migrate-set-capabilities',
-   capabilities=capabilities)
-self.assert_qmp(result, 'return', {})
-
-if online:
-result = self.vm_b.qmp('migrate-set-capabilities',
-   capabilities=capabilities)
-self.assert_qmp(result, 'return', {})
-
-result = self.vm_a.qmp('migrate-set-capabilities',
-   capabilities=[{'capability': 'events',
-  'state': True}])
-self.assert_qmp(result, 'return', {})
-
-result = self.vm_a.qmp('migrate', uri='exec:cat>' + mig_file)
+result = self.vm_a.qmp('migrate', uri=mig_cmd)
 while True:
 event = self.vm_a.event_wait('MIGRATION')
 if event['data']['status'] == 'completed':
@@ -124,14 +119,25 @@ class TestDirtyBitmapMigration(iotests.QMPTestCase):
 if not online:
 self.vm_a.shutdown()
 self.vm_b.launch()
-# TODO enable bitmap capability for vm_b in this case
+result = self.vm_b.qmp('migrate-set-capabilities',
+   capabilities=mig_caps)
+self.assert_qmp(result, 'return', {})
+result = self.vm_b.qmp('migrate-incoming', uri=incoming_cmd)
+self.assert_qmp(result, 'return', {})
 
-self.vm_b.event_wait("RESUME", timeout=10.0)
+while True:
+event = self.vm_b.event_wait('MIGRATION')
+if event['data']['status'] == 'completed':
+break
 
 self.check_bitmap(self.vm_b, sha256 if should_migrate else False)
 
 if should_migrate:
 self.vm_b.shutdown()
+# recreate vm_b, as we don't want -incoming option (this will lead
+# to "cat" process left alive after test finish)
+self.vm_b = iotests.VM(path_suffix='b')
+self.vm_b.add_drive(disk_a if shared_storage else disk_b)
 self.vm_b.launch()
 self.check_bitmap(self.vm_b, sha256 if persistent else False)
 
-- 
2.11.1




[Qemu-devel] [PATCH v2 1/2] qcow2: try load bitmaps only once

2018-04-11 Thread Vladimir Sementsov-Ogievskiy
Checking reopen by existence of some bitmaps is wrong, as it may be
some other bitmaps, or on the other hand, user may remove bitmaps. This
criteria is bad. To simplify things and make behavior more predictable
let's just add a flag to remember, that we've already tried to load
bitmaps on open and do not want do it again.

Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
 block/qcow2.h |  1 +
 block/qcow2.c | 16 
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/block/qcow2.h b/block/qcow2.h
index d301f77cea..adf5c3950f 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -298,6 +298,7 @@ typedef struct BDRVQcow2State {
 uint32_t nb_bitmaps;
 uint64_t bitmap_directory_size;
 uint64_t bitmap_directory_offset;
+bool dirty_bitmaps_loaded;
 
 int flags;
 int qcow_version;
diff --git a/block/qcow2.c b/block/qcow2.c
index 486f3e83b7..4242e99f1e 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -1142,6 +1142,7 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 uint64_t ext_end;
 uint64_t l1_vm_state_index;
 bool update_header = false;
+bool header_updated = false;
 
 ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
 if (ret < 0) {
@@ -1480,10 +1481,9 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 s->autoclear_features &= QCOW2_AUTOCLEAR_MASK;
 }
 
-if (bdrv_dirty_bitmap_next(bs, NULL)) {
-/* It's some kind of reopen with already existing dirty bitmaps. There
- * are no known cases where we need loading bitmaps in such situation,
- * so it's safer don't load them.
+if (s->dirty_bitmaps_loaded) {
+/* It's some kind of reopen. There are no known cases where we need
+ * loading bitmaps in such situation, so it's safer don't load them.
  *
  * Moreover, if we have some readonly bitmaps and we are reopening for
  * rw we should reopen bitmaps correspondingly.
@@ -1491,13 +1491,13 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
 if (bdrv_has_readonly_bitmaps(bs) &&
 !bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & BDRV_O_INACTIVE))
 {
-bool header_updated = false;
 qcow2_reopen_bitmaps_rw_hint(bs, &header_updated, &local_err);
-update_header = update_header && !header_updated;
 }
-} else if (qcow2_load_dirty_bitmaps(bs, &local_err)) {
-update_header = false;
+} else {
+header_updated = qcow2_load_dirty_bitmaps(bs, &local_err);
+s->dirty_bitmaps_loaded = true;
 }
+update_header = update_header && !header_updated;
 if (local_err != NULL) {
 error_propagate(errp, local_err);
 ret = -EINVAL;
-- 
2.11.1




Re: [Qemu-devel] [PATCH] qmp: add pmemload command

2018-04-11 Thread Eric Blake
On 04/11/2018 02:36 AM, Simon Ruderich wrote:
> On Tue, Apr 10, 2018 at 04:33:03PM -0500, Eric Blake wrote:
>>> +void qmp_pmemload(int64_t addr, int64_t size, const char *filename,
>>> +  Error **errp)
>>> +{
>>> +FILE *f;
>>> +size_t l;
>>> +uint8_t buf[1024];
>>> +
>>> +f = fopen(filename, "rb");
>>
>> Use qemu_fopen() here, so that you can automagically support /dev/fdset/
>> magic filenames that work on file descriptors passed via SCM_RIGHTS.
> 
> Hello,
> 
> I can't find qemu_fopen() in the source. Did you mean
> qemu_open()?

Looks like you're right; we don't have an automatic FILE wrapper.

> From reading qemu_close() I guess that I can't use
> fdopen() (as there's no qemu_fclose()) but must work with the raw
> fd. Or is there an easy way to fdopen? (I prefer the FILE *
> interface which is easier to work with.)

You could always add qemu_fopen/qemu_fclose to match the existing
qemu_open/qemu_close.  But you do have a point that you can't call
qemu_close/fclose (because fclose would be left with a stale fd that
might spuriously close something that has been opened in another thread
during the race window), nor fclose/qemu_close.

The FILE interface may sometimes be easier to work with, but it also
comes with buffering issues that you don't have to worry about when
using the fd interface.

> 
> But I just copied the code from qmp_pmemsave. Should I change it
> as well to use qemu_open()?

Good point - but yes, ideally it should always be possible to pass in an
fd instead of having qemu itself open a file, because there are
execution environments where qemu is intentionally prohibited from
directly calling open() for security reasons (where the management app,
such as libvirt, opens and then passes fds to qemu instead).
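
For illustration, a rough fd-based variant could look like the sketch below
(it keeps the read-loop shape of the posted patch, but this is a sketch of
the suggestion, not the actual code):

    void qmp_pmemload(int64_t addr, int64_t size, const char *filename,
                      Error **errp)
    {
        int fd;
        ssize_t l;
        uint8_t buf[1024];

        /* qemu_open() understands /dev/fdset/NN names, so a management
         * application can pass in a pre-opened file descriptor. */
        fd = qemu_open(filename, O_RDONLY | O_BINARY);
        if (fd < 0) {
            error_setg_file_open(errp, errno, filename);
            return;
        }

        while (size != 0) {
            l = MIN(size, (int64_t)sizeof(buf));
            l = read(fd, buf, l);
            if (l < 0) {
                error_setg_errno(errp, errno, "error reading '%s'", filename);
                break;
            }
            if (l == 0) {
                break;              /* short file: nothing more to load */
            }
            cpu_physical_memory_write(addr, buf, l);
            addr += l;
            size -= l;
        }

        qemu_close(fd);
    }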


>>> +##
>>> +# @pmemload:
>>> +#
>>> +# Load a portion of guest physical memory from a file.
>>> +#
>>> +# @val: the physical address of the guest to start from
>>
>> Is 'val' really the best name for this, or would 'phys-addr' or similar
>> be a more descriptive name?
> 
> I copied it from pmemsave to keep the argument names consistent.
> Should I change it only for pmemload? Changing it for pmemsave is
> problematic as it will break the existing API.

You are correct that we can't really change the existing interface; so
documenting in the commit message that your choice of names is for
consistency reasons may be sufficient.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Jason Wang



On 2018年04月11日 16:38, Tiwei Bie wrote:

On Wed, Apr 11, 2018 at 04:01:19PM +0800, Jason Wang wrote:

On 2018年04月11日 15:20, Tiwei Bie wrote:

This patch introduces VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB
feature for vhost-user. By default, vhost-user backend needs
to query the IOTLBs from QEMU after meeting unknown IOVAs.
With this protocol feature negotiated, QEMU will provide all
the IOTLBs to vhost-user backend without waiting for the
queries from backend. This is helpful when using a hardware
accelerator which is not able to handle unknown IOVAs at the
vhost-user backend.

Signed-off-by: Tiwei Bie
---
The idea of this patch is to let QEMU push all the IOTLBs
to vhost-user backend without waiting for the queries from
the backend. Because hardware accelerator at the vhost-user
backend may not be able to handle unknown IOVAs.

This is just a RFC for now. It seems that, it doesn't work
as expected when guest is using kernel driver (To handle
this case, it seems that some RAM regions' events also need
to be listened). Any comments would be appreciated! Thanks!

Interesting, a quick question is why this is needed? Can we just use exist
IOTLB update message?

Yeah, we are still using the existing IOTLB update messages
to send the IOTLB messages to backend. The only difference
is that, QEMU won't wait for the queries before sending the
IOTLB update messages.


Yes, my question was not very clear. I mean, why do we need a new feature
bit? It looks to me like the qemu code can work without it.


Thanks




It looks to me at least kernel does not need this.

Something similar in kernel vhost is that, for kernel vhost,
QEMU needs to push the IOTLBs of some ring addrs to kernel
vhost backend without waiting for the queries.

Best regards,
Tiwei Bie


Thanks





Re: [Qemu-devel] [PATCH] cpu: skip unplugged cpu when querying cpus

2018-04-11 Thread Igor Mammedov
On Wed, 11 Apr 2018 19:16:02 +0800
linzhecheng  wrote:

> From: XuYandong 
> 
> After the vcpu1 thread exits, the vcpu0 thread (which received the
> notification) is still waiting to acquire qemu_global_mutex in
> cpu_remove_sync; at this moment, vcpu1 is still in the global cpus list.
> If the main thread grabs qemu_global_mutex in order to handle the qmp
> command "info cpus", qmp_query_cpus will visit the unplugged vcpu1, which
> leads the qemu process to exit.
Add the exact error here, or better a stack trace in case it crashes.

 
> Signed-off-by: XuYandong 
> ---
>  cpus.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/cpus.c b/cpus.c
> index 2cb0af9..9b3a6c4 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -2018,6 +2018,11 @@ CpuInfoList *qmp_query_cpus(Error **errp)
>  
>  CPU_FOREACH(cpu) {
>  CpuInfoList *info;
> +
> +if (cpu->unplug) {
> +continue;
> +}
Shouldn't it be done for qmp_query_cpus_fast() as well?

> +
>  #if defined(TARGET_I386)
>  X86CPU *x86_cpu = X86_CPU(cpu);
>  CPUX86State *env = _cpu->env;




Re: [Qemu-devel] [PATCH] migration: Don't activate block devices if using -S

2018-04-11 Thread Dr. David Alan Gilbert
* Kevin Wolf (kw...@redhat.com) wrote:
> Am 11.04.2018 um 12:01 hat Jiri Denemark geschrieben:
> > On Tue, Apr 10, 2018 at 16:47:56 +0200, Kevin Wolf wrote:
> > > Am 10.04.2018 um 16:22 hat Dr. David Alan Gilbert geschrieben:
> > > > * Kevin Wolf (kw...@redhat.com) wrote:
> > > > > Am 10.04.2018 um 12:40 hat Dr. David Alan Gilbert geschrieben:
> > > > > > Hmm; having chatted to Jiri I'm OK with reverting it, on the 
> > > > > > condition
> > > > > > that I actually understand how this alternative would work first.
> > > > > > 
> > > > > > I can't currently see how a block-inactivate would be used.
> > > > > > I also can't see how a block-activate unless it's also with the
> > > > > > change that you're asking to revert.
> > > > > > 
> > > > > > Can you explain the way you see it working?
> > > > > 
> > > > > The key is making the delayed activation of block devices (and 
> > > > > probably
> > > > > delayed announcement of NICs? - you didn't answer that part) optional
> > > > > instead of making it the default.
> > > > 
> > > > NIC announcments are broken in similar but slightly different ways;  we
> > > > did have a series on list to help a while ago but it never got merged;
> > > > I'd like to keep that mess separate.
> > > 
> > > Okay. I just thought that it would make sense to have clear migration
> > > phases that are the same for all external resources that the QEMU
> > > processes use.
> > 
> > I don't think NIC announcements should be delayed in this specific case
> > since we're dealing with a failure recovery which should be rare in
> > comparison to successful migration when we want NIC announcements to be
> > send early. In other words, any NIC issues should be solved separately
> > and Laine would likely be a better person for discussing them since he
> > has a broader knowledge of all the fancy network stuff which libvirt
> > needs to coordinate with.
> 
> Well, if I were the migration maintainer, I would insist on a properly
> designed phase model that solves the problem once and for all because it
> would be clear where everything belongs. We could still have bugs in the
> future, but that would be internal implementation bugs with no effect on
> the API.

My main reason for believing this wouldn't work is that most of the
things we've had recently have been cases where we've found out about
subtle constraints that we previously didn't realise, and hence if we
were writing down the mythical phase model we wouldn't have put them in.

I'd have loved to have had some more discussion about what those
requirements were _before_ block locking went in a few versions back,
because unsurprisingly adding hard locking constraints shook a lot of
problems out (and IMHO was a much bigger API change than this change)

> But I'm not the maintainer and Dave prefers to deal with it basically as
> a bunch of one-off fixes, and that will work, too. It will probably
> clutter up the external API a bit (because the management tool will have
> to separately address migration of block devices, network devices and
> possibly other things in the future), but that shouldn't matter much for
> libvirt. Maybe what we do need is some documentation of the recommended
> process for performing a live migration so that management tools know
> which QMP commands they need to issue when.

I'd like to keep the networking stuff separate because it's got a whole
bunch of other interactions that we found out last time we tried to fix
it; in particular OpenStack's networking interaction with Libvirt isn't
quite what's expected and OpenStack have a whole bunch of different
network configurations whose behaviour when we change something is fun.
Oh and it depends heavily on the guest - which fortunately the block
stuff doesn't (because on modern virtio-net guests it does the modern announce
from the guest and so only ever happens once the guest CPU is running
which simplifies stuff massively).

Dave



> Kevin
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK



Re: [Qemu-devel] [RFC] vhost-user: introduce F_NEED_ALL_IOTLB protocol feature

2018-04-11 Thread Michael S. Tsirkin
On Wed, Apr 11, 2018 at 03:20:27PM +0800, Tiwei Bie wrote:
> This patch introduces VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB
> feature for vhost-user. By default, vhost-user backend needs
> to query the IOTLBs from QEMU after meeting unknown IOVAs.
> With this protocol feature negotiated, QEMU will provide all
> the IOTLBs to vhost-user backend without waiting for the
> queries from backend. This is helpful when using a hardware
> accelerator which is not able to handle unknown IOVAs at the
> vhost-user backend.
> 
> Signed-off-by: Tiwei Bie 

This is potentially a large amount of data to be sent
on a socket.

I had an impression that a hardware accelerator was using
VFIO anyway. Given this, can't we have QEMU program
the shadow IOMMU tables into VFIO directly?


> ---
> The idea of this patch is to let QEMU push all the IOTLBs
> to vhost-user backend without waiting for the queries from
> the backend. Because hardware accelerator at the vhost-user
> backend may not be able to handle unknown IOVAs.
> 
> This is just a RFC for now. It seems that, it doesn't work
> as expected when guest is using kernel driver (To handle
> this case, it seems that some RAM regions' events also need
> to be listened). Any comments would be appreciated! Thanks!
> 
>  docs/interop/vhost-user.txt   |  9 
>  hw/virtio/vhost-backend.c |  7 ++
>  hw/virtio/vhost-user.c|  8 +++
>  hw/virtio/vhost.c | 47 
> ---
>  include/hw/virtio/vhost-backend.h |  3 +++
>  5 files changed, 71 insertions(+), 3 deletions(-)
> 
> diff --git a/docs/interop/vhost-user.txt b/docs/interop/vhost-user.txt
> index 534caab18a..73e07f9dad 100644
> --- a/docs/interop/vhost-user.txt
> +++ b/docs/interop/vhost-user.txt
> @@ -380,6 +380,7 @@ Protocol features
>  #define VHOST_USER_PROTOCOL_F_CRYPTO_SESSION 7
>  #define VHOST_USER_PROTOCOL_F_PAGEFAULT  8
>  #define VHOST_USER_PROTOCOL_F_CONFIG 9
> +#define VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB 10
>  
>  Master message types
>  
> @@ -797,3 +798,11 @@ resilient for selective requests.
>  For the message types that already solicit a reply from the client, the
>  presence of VHOST_USER_PROTOCOL_F_REPLY_ACK or need_reply bit being set 
> brings
>  no behavioural change. (See the 'Communication' section for details.)
> +
> +VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB
> +
> +By default, vhost-user backend needs to query the IOTLBs from QEMU after
> +meeting unknown IOVAs. With this protocol feature negotiated, QEMU will
> +provide all the IOTLBs to vhost backend without waiting for the queries
> +from backend. This is helpful when using a hardware accelerator which is
> +not able to handle unknown IOVAs at the vhost-user backend.
> diff --git a/hw/virtio/vhost-backend.c b/hw/virtio/vhost-backend.c
> index 7f09efab8b..d72994e9a5 100644
> --- a/hw/virtio/vhost-backend.c
> +++ b/hw/virtio/vhost-backend.c
> @@ -233,6 +233,11 @@ static void vhost_kernel_set_iotlb_callback(struct 
> vhost_dev *dev,
>  qemu_set_fd_handler((uintptr_t)dev->opaque, NULL, NULL, NULL);
>  }
>  
> +static bool vhost_kernel_need_all_device_iotlb(struct vhost_dev *dev)
> +{
> +return false;
> +}
> +
>  static const VhostOps kernel_ops = {
>  .backend_type = VHOST_BACKEND_TYPE_KERNEL,
>  .vhost_backend_init = vhost_kernel_init,
> @@ -264,6 +269,8 @@ static const VhostOps kernel_ops = {
>  #endif /* CONFIG_VHOST_VSOCK */
>  .vhost_set_iotlb_callback = vhost_kernel_set_iotlb_callback,
>  .vhost_send_device_iotlb_msg = vhost_kernel_send_device_iotlb_msg,
> +.vhost_backend_need_all_device_iotlb =
> +vhost_kernel_need_all_device_iotlb,
>  };
>  
>  int vhost_set_backend_type(struct vhost_dev *dev, VhostBackendType 
> backend_type)
> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
> index 38da8692bb..30e0b1c13b 100644
> --- a/hw/virtio/vhost-user.c
> +++ b/hw/virtio/vhost-user.c
> @@ -47,6 +47,7 @@ enum VhostUserProtocolFeature {
>  VHOST_USER_PROTOCOL_F_CRYPTO_SESSION = 7,
>  VHOST_USER_PROTOCOL_F_PAGEFAULT = 8,
>  VHOST_USER_PROTOCOL_F_CONFIG = 9,
> +VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB = 10,
>  VHOST_USER_PROTOCOL_F_MAX
>  };
>  
> @@ -1581,6 +1582,12 @@ vhost_user_crypto_close_session(struct vhost_dev *dev, 
> uint64_t session_id)
>  return 0;
>  }
>  
> +static bool vhost_user_need_all_device_iotlb(struct vhost_dev *dev)
> +{
> +return virtio_has_feature(dev->protocol_features,
> +  VHOST_USER_PROTOCOL_F_NEED_ALL_IOTLB);
> +}
> +
>  const VhostOps user_ops = {
>  .backend_type = VHOST_BACKEND_TYPE_USER,
>  .vhost_backend_init = vhost_user_init,
> @@ -1611,4 +1618,5 @@ const VhostOps user_ops = {
>  .vhost_set_config = vhost_user_set_config,
>  .vhost_crypto_create_session = 

Re: [Qemu-devel] [PATCH v2 1/2] qcow2: try load bitmaps only once

2018-04-11 Thread Eric Blake
On 04/11/2018 07:26 AM, Vladimir Sementsov-Ogievskiy wrote:
> Checking reopen by existence of some bitmaps is wrong, as it may be
> some other bitmaps, or on the other hand, user may remove bitmaps. This
> criteria is bad. To simplify things and make behavior more predictable
> let's just add a flag to remember, that we've already tried to load
> bitmaps on open and do not want do it again.

Wording suggestion:

Checking whether we are reopening an image based on the existence of
some bitmaps is wrong, as the set of bitmaps may have changed (for
example, the user may have added or removed bitmaps in the meantime).
To simplify things and make behavior more predictable, let's just add a
flag to remember whether we've already tried to load bitmaps, so that we
only attempt it on the initial open.

> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/qcow2.h |  1 +
>  block/qcow2.c | 16 
>  2 files changed, 9 insertions(+), 8 deletions(-)
> 
> @@ -1480,10 +1481,9 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
> *bs, QDict *options,
>  s->autoclear_features &= QCOW2_AUTOCLEAR_MASK;
>  }
>  
> -if (bdrv_dirty_bitmap_next(bs, NULL)) {
> -/* It's some kind of reopen with already existing dirty bitmaps. 
> There
> - * are no known cases where we need loading bitmaps in such 
> situation,
> - * so it's safer don't load them.
> +if (s->dirty_bitmaps_loaded) {
> +/* It's some kind of reopen. There are no known cases where we need
> + * loading bitmaps in such situation, so it's safer don't load them.

Pre-existing wording, but sounds better as:

There are no known cases where we need to reload bitmaps in such a
situation, so it's safer to skip them.

>   *
>   * Moreover, if we have some readonly bitmaps and we are reopening 
> for
>   * rw we should reopen bitmaps correspondingly.
> @@ -1491,13 +1491,13 @@ static int coroutine_fn 
> qcow2_do_open(BlockDriverState *bs, QDict *options,
>  if (bdrv_has_readonly_bitmaps(bs) &&
>  !bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & 
> BDRV_O_INACTIVE))
>  {
> -bool header_updated = false;
>  qcow2_reopen_bitmaps_rw_hint(bs, &header_updated, &local_err);
> -update_header = update_header && !header_updated;
>  }
> -} else if (qcow2_load_dirty_bitmaps(bs, &local_err)) {
> -update_header = false;
> +} else {
> +header_updated = qcow2_load_dirty_bitmaps(bs, &local_err);
> +s->dirty_bitmaps_loaded = true;
>  }
> +update_header = update_header && !header_updated;

Could write this as 'update_header &= !header_updated;'

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org



signature.asc
Description: OpenPGP digital signature


Re: [Qemu-devel] [PATCH] iotests: fix 169

2018-04-11 Thread Vladimir Sementsov-Ogievskiy

11.04.2018 16:05, Vladimir Sementsov-Ogievskiy wrote:

11.04.2018 12:36, Vladimir Sementsov-Ogievskiy wrote:

11.04.2018 12:02, Vladimir Sementsov-Ogievskiy wrote:

03.04.2018 23:13, John Snow wrote:


On 04/03/2018 12:23 PM, Max Reitz wrote:

On 2018-03-30 18:10, Vladimir Sementsov-Ogievskiy wrote:

Use MIGRATION events instead of RESUME. Also, make a TODO: enable
dirty-bitmaps capability for offline case.

This (likely) fixes racy faults at least of the following types:

 - timeout on waiting for RESUME event
 - sha256 mismatch on 136 (138 after this patch)

Signed-off-by: Vladimir Sementsov-Ogievskiy 


---

This patch is a true change for the test anyway. But I don't understand
why (and whether it really) fixes these things. And I'm not sure whether
we really have a bug in bitmap migration or persistence. So, it's up to
you whether to take it into 2.12...

It was already discussed, that "STOP" event is bad for tests. 
What about
"RESUME"? How can we miss it? And sha256 mismatch is really 
something

strange.

Max, please check, do it fix 169 for you.

  tests/qemu-iotests/169 | 44 
+++-

  1 file changed, 23 insertions(+), 21 deletions(-)
This makes the test pass (thanks!), but it still leaves behind 
five cats...


Max



Hmm:

jhuston  14772  0.0  0.0   4296   784 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14796  0.0  0.0   4296   764 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14940  0.0  0.0   4296   788 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14964  0.0  0.0   4296   720 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  15052  0.0  0.0   4296   768 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file

Why do these get left behind? Nothing to consume the data...?


Aha, I understand. It is due to the last vm_b.shutdown() and vm_b.launch()
in the case of should_migrate. So, at the end of the test I restart vm_b
with the -incoming parameter. But it looks like a bug anyway: if we
start qemu with -incoming "exec", shouldn't we kill the cat process if
there was no migration?




A third type of failure, without this patch:

+==
+ERROR: test__persistent__migbitmap__offline_shared 
(__main__.TestDirtyBitmapMigration)

+methodcaller(name, ...) --> methodcaller object
+--
+Traceback (most recent call last):
+  File "169", line 135, in do_test_migration
+    self.vm_b.launch()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qemu.py", 
line 221, in launch

+    self._launch()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qemu.py", 
line 244, in _launch

+    self._post_launch()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qtest.py", 
line 100, in _post_launch

+    super(QEMUQtestMachine, self)._post_launch()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qemu.py", 
line 196, in _post_launch

+    self._qmp.accept()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 157, in accept

+    return self.__negotiate_capabilities()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 75, in __negotiate_capabilities

+    resp = self.cmd('qmp_capabilities')
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 191, in cmd

+    return self.cmd_obj(qmp_cmd)
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 174, in cmd_obj

+    resp = self.__json_read()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 82, in __json_read

+    data = self.__sockfile.readline()
+  File "/usr/lib64/python2.7/socket.py", line 447, in readline
+    data = self._sock.recv(self._rbufsize)
+error: [Errno 104] Connection reset by peer
+




Hmm, the first type? Now I'm not sure whether I really saw a sha256
mismatch, or something like this (it should be an error, but a bitmap is found):


--- /work/src/qemu/up-169/tests/qemu-iotests/169.out    2018-04-11 
15:35:10.055027392 +0300
+++ /work/src/qemu/up-169/tests/qemu-iotests/169.out.bad 2018-04-11 
15:58:09.300450045 +0300

@@ -1,5 +1,20 @@
-
+F...
+==
+FAIL: test__not_persistent__migbitmap__offline 
(__main__.TestDirtyBitmapMigration)

+methodcaller(name, ...) --> methodcaller object
+--
+Traceback (most recent call last):
+  File "169", line 136, in do_test_migration
+    self.check_bitmap(self.vm_b, sha256 if persistent else False)
+  File "169", 

Re: [Qemu-devel] [PATCH] iotests: fix 169

2018-04-11 Thread Max Reitz
On 2018-04-11 11:02, Vladimir Sementsov-Ogievskiy wrote:
> 03.04.2018 23:13, John Snow wrote:
>>
>> On 04/03/2018 12:23 PM, Max Reitz wrote:
>>> On 2018-03-30 18:10, Vladimir Sementsov-Ogievskiy wrote:
 Use MIGRATION events instead of RESUME. Also, make a TODO: enable
 dirty-bitmaps capability for offline case.

 This (likely) fixes racy faults at least of the following types:

  - timeout on waiting for RESUME event
  - sha256 mismatch on 136 (138 after this patch)

 Signed-off-by: Vladimir Sementsov-Ogievskiy 
 ---

 This patch is a true change for the test anyway. But I don't understand
 why (and whether) it really fixes things, and I'm not sure whether we
 really have a bug in bitmap migration or persistence. So it's up to you
 whether to take it into 2.12...

 It was already discussed that the "STOP" event is bad for tests. What about
 "RESUME"? How can we miss it? And the sha256 mismatch is really something
 strange.

 Max, please check whether it fixes 169 for you.

   tests/qemu-iotests/169 | 44
 +++-
   1 file changed, 23 insertions(+), 21 deletions(-)
>>> This makes the test pass (thanks!), but it still leaves behind five
>>> cats...
>>>
>>> Max
>>>
>>>
>> Hmm:
>>
>> jhuston  14772  0.0  0.0   4296   784 pts/3    S    16:12   0:00 cat
>> /home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
>> jhuston  14796  0.0  0.0   4296   764 pts/3    S    16:12   0:00 cat
>> /home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
>> jhuston  14940  0.0  0.0   4296   788 pts/3    S    16:12   0:00 cat
>> /home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
>> jhuston  14964  0.0  0.0   4296   720 pts/3    S    16:12   0:00 cat
>> /home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
>> jhuston  15052  0.0  0.0   4296   768 pts/3    S    16:12   0:00 cat
>> /home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
>>
>> Why do these get left behind? Nothing to consume the data...?
> 
> Aha, I understand: it is due to the last vm_b.shutdown() and vm_b.launch()
> in the should_migrate case. So, at the end of the test I restart vm_b with
> the -incoming parameter. But it looks like a bug anyway: if we start qemu
> with -incoming "exec", shouldn't we kill the cat process if there was no
> migration?

I agree, but it's your choice whether you want to fix that bug or just
change the test slightly -- I'm responsible for the iotests, but not for
migration, so I have to admit I don't mind just changing this test to
accommodate. O:-)

It appears that just removing the mig_file before the second
vm_b.launch() is sufficient (and enclosing the removal of that file in
tearDown() in a try/except block).  I suppose the cat process will
complain, but that doesn't stop the test from passing.  If you want to
be really nice, I suppose you could just overwrite the FIFO mig_file
with an empty regular file, but I don't think it's actually necessary...

Max



Re: [Qemu-devel] [PATCH] migration: Don't activate block devices if using -S

2018-04-11 Thread Kevin Wolf
Am 11.04.2018 um 12:01 hat Jiri Denemark geschrieben:
> On Tue, Apr 10, 2018 at 16:47:56 +0200, Kevin Wolf wrote:
> > Am 10.04.2018 um 16:22 hat Dr. David Alan Gilbert geschrieben:
> > > * Kevin Wolf (kw...@redhat.com) wrote:
> > > > Am 10.04.2018 um 12:40 hat Dr. David Alan Gilbert geschrieben:
> > > > > Hmm; having chatted to Jiri I'm OK with reverting it, on the condition
> > > > > that I actually understand how this alternative would work first.
> > > > > 
> > > > > I can't currently see how a block-inactivate would be used.
> > > > > I also can't see how a block-activate unless it's also with the
> > > > > change that you're asking to revert.
> > > > > 
> > > > > Can you explain the way you see it working?
> > > > 
> > > > The key is making the delayed activation of block devices (and probably
> > > > delayed announcement of NICs? - you didn't answer that part) optional
> > > > instead of making it the default.
> > > 
> > > NIC announcments are broken in similar but slightly different ways;  we
> > > did have a series on list to help a while ago but it never got merged;
> > > I'd like to keep that mess separate.
> > 
> > Okay. I just thought that it would make sense to have clear migration
> > phases that are the same for all external resources that the QEMU
> > processes use.
> 
> I don't think NIC announcements should be delayed in this specific case
> since we're dealing with a failure recovery which should be rare in
> comparison to successful migration when we want NIC announcements to be
> send early. In other words, any NIC issues should be solved separately
> and Laine would likely be a better person for discussing them since he
> has a broader knowledge of all the fancy network stuff which libvirt
> needs to coordinate with.

Well, if I were the migration maintainer, I would insist on a properly
designed phase model that solves the problem once and for all because it
would be clear where everything belongs. We could still have bugs in the
future, but that would be internal implementation bugs with no effect on
the API.

But I'm not the maintainer and Dave prefers to deal with it basically as
a bunch of one-off fixes, and that will work, too. It will probably
clutter up the external API a bit (because the management tool will have
to separately address migration of block devices, network devices and
possibly other things in the future), but that shouldn't matter much for
libvirt. Maybe what we do need is some documentation of the recommended
process for performing a live migration so that management tools know
which QMP commands they need to issue when.

Kevin



Re: [Qemu-devel] [PATCH] iotests: fix 169

2018-04-11 Thread Vladimir Sementsov-Ogievskiy

11.04.2018 12:36, Vladimir Sementsov-Ogievskiy wrote:

11.04.2018 12:02, Vladimir Sementsov-Ogievskiy wrote:

03.04.2018 23:13, John Snow wrote:


On 04/03/2018 12:23 PM, Max Reitz wrote:

On 2018-03-30 18:10, Vladimir Sementsov-Ogievskiy wrote:

Use MIGRATION events instead of RESUME. Also, make a TODO: enable
dirty-bitmaps capability for offline case.

This (likely) fixes racy faults at least of the following types:

 - timeout on waiting for RESUME event
 - sha256 mismatch on 136 (138 after this patch)

Signed-off-by: Vladimir Sementsov-Ogievskiy 


---

This patch is a true change for the test anyway. But I don't understand
why (and whether) it really fixes things, and I'm not sure whether we
really have a bug in bitmap migration or persistence. So it's up to you
whether to take it into 2.12...

It was already discussed that the "STOP" event is bad for tests. What about
"RESUME"? How can we miss it? And the sha256 mismatch is really something
strange.

Max, please check whether it fixes 169 for you.

  tests/qemu-iotests/169 | 44 
+++-

  1 file changed, 23 insertions(+), 21 deletions(-)
This makes the test pass (thanks!), but it still leaves behind five 
cats...


Max



Hmm:

jhuston  14772  0.0  0.0   4296   784 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14796  0.0  0.0   4296   764 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14940  0.0  0.0   4296   788 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  14964  0.0  0.0   4296   720 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file
jhuston  15052  0.0  0.0   4296   768 pts/3    S    16:12 0:00 cat
/home/bos/jhuston/src/qemu/bin/git/tests/qemu-iotests/scratch/mig_file

Why do these get left behind? Nothing to consume the data...?


Aha, I understand: it is due to the last vm_b.shutdown() and vm_b.launch()
in the should_migrate case. So, at the end of the test I restart vm_b
with the -incoming parameter. But it looks like a bug anyway: if we
start qemu with -incoming "exec", shouldn't we kill the cat process if
there was no migration?




third type of fail, without this patch:

+==
+ERROR: test__persistent__migbitmap__offline_shared 
(__main__.TestDirtyBitmapMigration)

+methodcaller(name, ...) --> methodcaller object
+--
+Traceback (most recent call last):
+  File "169", line 135, in do_test_migration
+    self.vm_b.launch()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qemu.py", line 
221, in launch

+    self._launch()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qemu.py", line 
244, in _launch

+    self._post_launch()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qtest.py", 
line 100, in _post_launch

+    super(QEMUQtestMachine, self)._post_launch()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qemu.py", line 
196, in _post_launch

+    self._qmp.accept()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 157, in accept

+    return self.__negotiate_capabilities()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 75, in __negotiate_capabilities

+    resp = self.cmd('qmp_capabilities')
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 191, in cmd

+    return self.cmd_obj(qmp_cmd)
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 174, in cmd_obj

+    resp = self.__json_read()
+  File 
"/work/src/qemu/up-169/tests/qemu-iotests/../../scripts/qmp/qmp.py", 
line 82, in __json_read

+    data = self.__sockfile.readline()
+  File "/usr/lib64/python2.7/socket.py", line 447, in readline
+    data = self._sock.recv(self._rbufsize)
+error: [Errno 104] Connection reset by peer
+




Hmm, the first type? Now I'm not sure whether I really saw a sha256
mismatch, or something like this (it should be an error, but a bitmap is found):


--- /work/src/qemu/up-169/tests/qemu-iotests/169.out    2018-04-11 
15:35:10.055027392 +0300
+++ /work/src/qemu/up-169/tests/qemu-iotests/169.out.bad 2018-04-11 
15:58:09.300450045 +0300

@@ -1,5 +1,20 @@
-
+F...
+==
+FAIL: test__not_persistent__migbitmap__offline 
(__main__.TestDirtyBitmapMigration)

+methodcaller(name, ...) --> methodcaller object
+--
+Traceback (most recent call last):
+  File "169", line 136, in do_test_migration
+    self.check_bitmap(self.vm_b, sha256 if persistent else False)
+  File "169", line 77, in check_bitmap
+    "Dirty bitmap 'bitmap0' 

Re: [Qemu-devel] [PATCH 2/2] qemu-thread: let cur_mon be per-thread

2018-04-11 Thread Eric Blake
On 04/11/2018 04:48 AM, Peter Xu wrote:

> Okay. :) Thanks for confirming.  Then let me repost this patch without
> touching the qemu-threads.
> 
> Btw, do you want me to repost the first patch separately too, or keep
> the code as is?  I believe it depends on whether you treat that one as
> a cleanup or not.  Thanks,

The first patch is no longer necessary for your new approach, but as it
is a cleanup and you've already written it, it does not hurt to still
send it as a separate cleanup patch useful in its own right.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.   +1-919-301-3266
Virtualization:  qemu.org | libvirt.org





Re: [Qemu-devel] [PATCH] iotests: Split 214 off of 122

2018-04-11 Thread Max Reitz
On 2018-04-06 18:41, Max Reitz wrote:
> Commit abd3622cc03cf41ed542126a540385f30a4c0175 added a case to 122
> regarding how the qcow2 driver handles an incorrect compressed data
> length value.  This does not really fit into 122, as that file is
> supposed to contain qemu-img convert test cases, which this case is not.
> So this patch splits it off into its own file; maybe we will even get
> more qcow2-only compression tests in the future.
> 
> Also, that test case does not work with refcount_bits=1, so mark that
> option as unsupported.
> 
> Signed-off-by: Max Reitz 
> ---
> Kind of a v2 for "iotests: 122 needs at least two refcount bits now"
> (fulfills the same purpose, but also splits the case into its own file
>  so you can still run 122 with refcount_bits=1 [Eric]).
> 
> I was a bit lost what to do about the copyright text, since this test
> case was written by Berto.  I figured I'd drop the "owner" variable (it
> isn't used anyway), but I put "Red Hat" into the copyright line --
> currently every test has copyright information, so I decided it'd be
> difficult to leave that out, and I figured I simply cannot claim
> copyright for Igalia.  So, here we go.
> ---
>  tests/qemu-iotests/122 | 47 ---
>  tests/qemu-iotests/122.out | 33 
>  tests/qemu-iotests/214 | 96 
> ++
>  tests/qemu-iotests/214.out | 35 +
>  tests/qemu-iotests/group   |  1 +
>  5 files changed, 132 insertions(+), 80 deletions(-)
>  create mode 100755 tests/qemu-iotests/214
>  create mode 100644 tests/qemu-iotests/214.out

Changed the copyright information, added Berto's S-o-b (and Eric's R-b)
and applied to my block-next branch.

Max



Re: [Qemu-devel] [PATCH v2 1/2] qcow2: try load bitmaps only once

2018-04-11 Thread Vladimir Sementsov-Ogievskiy

11.04.2018 16:22, Eric Blake wrote:

On 04/11/2018 07:26 AM, Vladimir Sementsov-Ogievskiy wrote:

Checking reopen by existence of some bitmaps is wrong, as it may be
some other bitmaps, or on the other hand, user may remove bitmaps. This
criteria is bad. To simplify things and make behavior more predictable
let's just add a flag to remember, that we've already tried to load
bitmaps on open and do not want do it again.

Wording suggestion:

Checking whether we are reopening an image based on the existence of
some bitmaps is wrong, as the set of bitmaps may have changed (for
example, the user may have added or removed bitmaps in the meantime).
To simplify things and make behavior more predictable, let's just add a
flag to remember whether we've already tried to load bitmaps, so that we
only attempt it on the initial open.


You've dropped an idea suggested by Max about migration-related bitmaps.
I said "some other bitmaps" because now I doubt whether that is possible:
we have migration-related bitmaps on the source, not on the target.

Anyway, I'm OK with all your suggestions, thank you. I can respin, or
they may be squashed in in flight.





Signed-off-by: Vladimir Sementsov-Ogievskiy 
---
  block/qcow2.h |  1 +
  block/qcow2.c | 16 
  2 files changed, 9 insertions(+), 8 deletions(-)

@@ -1480,10 +1481,9 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
  s->autoclear_features &= QCOW2_AUTOCLEAR_MASK;
  }
  
-if (bdrv_dirty_bitmap_next(bs, NULL)) {

-/* It's some kind of reopen with already existing dirty bitmaps. There
- * are no known cases where we need loading bitmaps in such situation,
- * so it's safer don't load them.
+if (s->dirty_bitmaps_loaded) {
+/* It's some kind of reopen. There are no known cases where we need
+ * loading bitmaps in such situation, so it's safer don't load them.

Pre-existing wording, but sounds better as:

There are no known cases where we need to reload bitmaps in such a
situation, so it's safer to skip them.


   *
   * Moreover, if we have some readonly bitmaps and we are reopening for
   * rw we should reopen bitmaps correspondingly.
@@ -1491,13 +1491,13 @@ static int coroutine_fn qcow2_do_open(BlockDriverState 
*bs, QDict *options,
  if (bdrv_has_readonly_bitmaps(bs) &&
  !bdrv_is_read_only(bs) && !(bdrv_get_flags(bs) & BDRV_O_INACTIVE))
  {
-bool header_updated = false;
  qcow2_reopen_bitmaps_rw_hint(bs, &header_updated, &local_err);
-update_header = update_header && !header_updated;
  }
-} else if (qcow2_load_dirty_bitmaps(bs, &local_err)) {
-update_header = false;
+} else {
+header_updated = qcow2_load_dirty_bitmaps(bs, &local_err);
+s->dirty_bitmaps_loaded = true;
  }
+update_header = update_header && !header_updated;

Could write this as 'update_header &= !header_updated;'
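
As a quick aside, for C99 bool operands the two forms are equivalent; a tiny
standalone check (purely illustrative, not part of the patch):

#include <assert.h>
#include <stdbool.h>

int main(void)
{
    /* exhaustively compare 'a = a && !b' with 'a &= !b' for bool values */
    for (int a = 0; a <= 1; a++) {
        for (int b = 0; b <= 1; b++) {
            bool lhs = a, rhs = a, updated = b;
            lhs = lhs && !updated;
            rhs &= !updated;
            assert(lhs == rhs);
        }
    }
    return 0;
}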




--
Best regards,
Vladimir




Re: [Qemu-devel] [PATCHv3 for-2.13 0/2] Helpers to obtain host page sizes for guest RAM

2018-04-11 Thread Paolo Bonzini
On 11/04/2018 09:04, David Gibson wrote:
> This series makes some small changes to make it easier to obtain the
> host page size backing given portions of guest RAM.  We use this in a
> couple of places currently, and I have one or two more to add in some
> upcoming code.
> 
> Assuming there are no objections, what should be the procedure for
> staging this?  Can I put it in my for-2.13 ppc tree, even though it's
> technically generic code, or should it go through someone else's tree?

Yes, go ahead:

Acked-by: Paolo Bonzini 

Paolo



Re: [Qemu-devel] [PULL v2 0/3] linux-user fixes for -rc3

2018-04-11 Thread Peter Maydell
On 10 April 2018 at 17:01, Laurent Vivier  wrote:
> The following changes since commit df6378eb0e6cfd58a22a1c3ff8fa4a9039f1eaa8:
>
>   Merge remote-tracking branch 'remotes/kraxel/tags/ui-20180410-pull-request' 
> into staging (2018-04-10 14:04:27 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/vivier/qemu.git tags/linux-user-for-2.12-pull-request
>
> for you to fetch changes up to 46a1ee4f397ffd305da95fb65dc73be49667ff6d:
>
>   linux-user: implement HWCAP bits on MIPS (2018-04-10 18:00:14 +0200)
>
> 
> Trivial fixes for microblaze
> v2: add HWCAP bits on MIPS
> 
>
> James Cowgill (1):
>   linux-user: implement HWCAP bits on MIPS
>
> Laurent Vivier (2):
>   linux-user: fix microblaze get_sp_from_cpustate()
>   linux-user: add microblaze/microblazeel magic numbers in
> qemu-binfmt-conf.sh
>
>  linux-user/elfload.c  | 24 
>  linux-user/microblaze/target_signal.h |  2 +-
>  scripts/qemu-binfmt-conf.sh   | 12 ++--
>  3 files changed, 35 insertions(+), 3 deletions(-)

Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH v3 6/7] s390x/kvm: handle AP instruction interception

2018-04-11 Thread Halil Pasic


On 04/11/2018 03:20 PM, Tony Krowiak wrote:
 I may be all wrong, though... can we at least have a translation of
 ECA.28 and EECA.28 (the "ap is there" bit and the "ap instructions are
 interpreted" bit?)
    
>>> I think we have a misunderstanding here. I will wait for Tony. Maybe
>>> he can understand this better or explain in more accessible way.
>>  From what I get from your explanation, this approach sounds like a good
>> way forward. But let's wait for Tony.
> 
> I agree, this solves the problem of installing AP facilities on the guest
> (-cpu xxx,ap=on) without configuring a vfio-ap device
> (-device vfio-ap,sysfsdev=$path-to-mediated-device).
> 
> To recap:
> The realize() function for the vfio-ap device opens the mediated matrix
> device file descriptor to set up the communication pathway with the vfio_ap
> kernel driver. When the fd is opened, the vfio_ap driver sets ECA.28 for the
> guest to instruct SIE to interpret AP instructions. The driver also
> configures the AP matrix for the guest in its SIE state description. 
> Consequently,
> if a vfio-ap device is not configured for the guest, but AP facilities are
> installed, all AP instructions will be intercepted and routed to QEMU. If 
> there
> are no interception handlers for the AP instructions, QEMU injects an 
> operation
> exception into the guest. This results in initialization of the AP bus on the
> guest to terminate. This patch was intended to circumvent that problem.
> 
> With Halil's suggestion, there is no need to provide these handlers. If ECA.28
> is set for the guest by default when the AP facilities are installed, then 
> the AP
> instructions will be interpreted and the AP bus will get initialized on the
> guest. Since there is no vfio-ap device to provide the AP matrix configuration
> for the guest, the AP bus will not detect any devices, but that's okay. AP
> instructions targeting at an APQN will execute successfully and set a response
> code in the status block returned from the instruction indicating the
> APQN is invalid  but there will be no exception unless there is truly
> an exception condition caused by the execution of the instruction.

That's exactly the idea, but it will only work out this way if the effective
ECA.28 bit (i.e. EECA.28) is also set.

Could you please comment the first paragraph quoted in this email?

Halil



Re: [Qemu-devel] [PATCH] iotests: fix 169

2018-04-11 Thread Max Reitz
On 2018-04-11 15:05, Vladimir Sementsov-Ogievskiy wrote:

[...]

> Hmm, the first type? Now I'm not sure whether I really saw a sha256
> mismatch, or something like this (it should be an error, but a bitmap is found):
> 
> --- /work/src/qemu/up-169/tests/qemu-iotests/169.out    2018-04-11
> 15:35:10.055027392 +0300
> +++ /work/src/qemu/up-169/tests/qemu-iotests/169.out.bad 2018-04-11
> 15:58:09.300450045 +0300
> @@ -1,5 +1,20 @@
> -
> +F...
> +==
> +FAIL: test__not_persistent__migbitmap__offline
> (__main__.TestDirtyBitmapMigration)
> +methodcaller(name, ...) --> methodcaller object
> +--
> +Traceback (most recent call last):
> +  File "169", line 136, in do_test_migration
> +    self.check_bitmap(self.vm_b, sha256 if persistent else False)
> +  File "169", line 77, in check_bitmap
> +    "Dirty bitmap 'bitmap0' not found");
> +  File "/work/src/qemu/up-169/tests/qemu-iotests/iotests.py", line 389,
> in assert_qmp
> +    result = self.dictpath(d, path)
> +  File "/work/src/qemu/up-169/tests/qemu-iotests/iotests.py", line 348,
> in dictpath
> +    self.fail('failed path traversal for "%s" in "%s"' % (path, str(d)))
> +AssertionError: failed path traversal for "error/desc" in "{u'return':
> {u'sha256':
> u'01d2ebedcb8f549a2547dbf8e231c410e3e747a9479e98909fc936e0035cf8b1'}}"
> 
> 
> Max, did you really see a sha256 mismatch or only something like this?

I'm pretty sure I did see mismatches.

Max



[Qemu-devel] [PATCH 11/19] test-bdrv-drain: Test node deletion in subtree recursion

2018-04-11 Thread Kevin Wolf
If bdrv_do_drained_begin() polls during its subtree recursion, the graph
can change and mess up the bs->children iteration. Test that this
doesn't happen.

Signed-off-by: Kevin Wolf 
---
 tests/test-bdrv-drain.c | 38 +-
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index 05768213be..4af6571ca3 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -875,7 +875,8 @@ static void coroutine_fn test_co_delete_by_drain(void 
*opaque)
  * If @detach_instead_of_delete is set, the BDS is not going to be
  * deleted but will only detach all of its children.
  */
-static void do_test_delete_by_drain(bool detach_instead_of_delete)
+static void do_test_delete_by_drain(bool detach_instead_of_delete,
+enum drain_type drain_type)
 {
 BlockBackend *blk;
 BlockDriverState *bs, *child_bs, *null_bs;
@@ -931,9 +932,23 @@ static void do_test_delete_by_drain(bool 
detach_instead_of_delete)
  * test_co_delete_by_drain() resuming.  Thus, @bs will be deleted
  * and the coroutine will exit while this drain operation is still
  * in progress. */
-bdrv_ref(child_bs);
-bdrv_drain(child_bs);
-bdrv_unref(child_bs);
+switch (drain_type) {
+case BDRV_DRAIN:
+bdrv_ref(child_bs);
+bdrv_drain(child_bs);
+bdrv_unref(child_bs);
+break;
+case BDRV_SUBTREE_DRAIN:
+/* Would have to ref/unref bs here for !detach_instead_of_delete, but
+ * then the whole test becomes pointless because the graph changes
+ * don't occur during the drain any more. */
+assert(detach_instead_of_delete);
+bdrv_subtree_drained_begin(bs);
+bdrv_subtree_drained_end(bs);
+break;
+default:
+g_assert_not_reached();
+}
 
 while (!dbdd.done) {
 aio_poll(qemu_get_aio_context(), true);
@@ -946,15 +961,19 @@ static void do_test_delete_by_drain(bool 
detach_instead_of_delete)
 }
 }
 
-
 static void test_delete_by_drain(void)
 {
-do_test_delete_by_drain(false);
+do_test_delete_by_drain(false, BDRV_DRAIN);
 }
 
 static void test_detach_by_drain(void)
 {
-do_test_delete_by_drain(true);
+do_test_delete_by_drain(true, BDRV_DRAIN);
+}
+
+static void test_detach_by_drain_subtree(void)
+{
+do_test_delete_by_drain(true, BDRV_SUBTREE_DRAIN);
 }
 
 
@@ -1005,8 +1024,9 @@ int main(int argc, char **argv)
 g_test_add_func("/bdrv-drain/blockjob/drain_subtree",
 test_blockjob_drain_subtree);
 
-g_test_add_func("/bdrv-drain/deletion", test_delete_by_drain);
-g_test_add_func("/bdrv-drain/detach", test_detach_by_drain);
+g_test_add_func("/bdrv-drain/deletion/drain", test_delete_by_drain);
+g_test_add_func("/bdrv-drain/detach/drain", test_detach_by_drain);
+g_test_add_func("/bdrv-drain/detach/drain_subtree", 
test_detach_by_drain_subtree);
 
 ret = g_test_run();
 qemu_event_destroy(&done_event);
-- 
2.13.6




[Qemu-devel] [PATCH 16/19] block: Allow AIO_WAIT_WHILE with NULL ctx

2018-04-11 Thread Kevin Wolf
bdrv_drain_all() wants to have a single polling loop for draining the
in-flight requests of all nodes. This means that the AIO_WAIT_WHILE()
condition relies on activity in multiple AioContexts, which is polled
from the mainloop context. We must therefore call AIO_WAIT_WHILE() from
the mainloop thread and use the AioWait notification mechanism.

Just randomly picking the AioContext of any non-mainloop thread would
work, but instead of bothering to find such a context in the caller, we
can just as well accept NULL for ctx.

Signed-off-by: Kevin Wolf 
---
 include/block/aio-wait.h | 13 +
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
index 783d3678dd..c85a62f798 100644
--- a/include/block/aio-wait.h
+++ b/include/block/aio-wait.h
@@ -57,7 +57,8 @@ typedef struct {
 /**
  * AIO_WAIT_WHILE:
  * @wait: the aio wait object
- * @ctx: the aio context
+ * @ctx: the aio context, or NULL if multiple aio contexts (for which the
+ *   caller does not hold a lock) are involved in the polling condition.
  * @cond: wait while this conditional expression is true
  *
  * Wait while a condition is true.  Use this to implement synchronous
@@ -75,7 +76,7 @@ typedef struct {
 bool waited_ = false;  \
 AioWait *wait_ = (wait);   \
 AioContext *ctx_ = (ctx);  \
-if (in_aio_context_home_thread(ctx_)) {\
+if (ctx_ && in_aio_context_home_thread(ctx_)) {\
 while ((cond)) {   \
 aio_poll(ctx_, true);  \
 waited_ = true;\
@@ -86,9 +87,13 @@ typedef struct {
 /* Increment wait_->num_waiters before evaluating cond. */ \
 atomic_inc(&wait_->num_waiters);   \
 while ((cond)) {   \
-aio_context_release(ctx_); \
+if (ctx_) {\
+aio_context_release(ctx_); \
+}  \
 aio_poll(qemu_get_aio_context(), true);\
-aio_context_acquire(ctx_); \
+if (ctx_) {\
+aio_context_acquire(ctx_); \
+}  \
 waited_ = true;\
 }  \
 atomic_dec(&wait_->num_waiters);   \
-- 
2.13.6




Re: [Qemu-devel] [PATCH 00/10] Avoid integer overflow in next_page_start

2018-04-11 Thread Emilio G. Cota
On Wed, Apr 11, 2018 at 10:08:58 +1000, Richard Henderson wrote:
> On 04/11/2018 02:19 AM, Emilio G. Cota wrote:
> > Richard pointed out in another thread that when computing
> > next_page_start we can break checks for the last page in the
> > address space due to integer overflow. This affects several targets;
> > the appended fixes them.
> > 
> > You can fetch the patches from:
> >   https://github.com/cota/qemu/tree/next_page_overflow
> 
> Reviewed-by: Richard Henderson 

Thanks!

To ease an eventual merge I'll be updating the patches' R-b tags as
they come in this branch:
  https://github.com/cota/qemu/tree/next_page_overflow-r-b

BTW to avoid conflicts we should merge this before the translator loop
conversion series; I'll make that clear when I send a new version
of that patch set.

Emilio



Re: [Qemu-devel] [PATCH v1 15/24] tests/tcg/i368: fix hello-i386

2018-04-11 Thread Thomas Huth
On 10.04.2018 21:39, Alex Bennée wrote:
> This is a direct syscall test so needs additional CFLAGS and LDFLAGS.
> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/tcg/i386/Makefile.target | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/tests/tcg/i386/Makefile.target b/tests/tcg/i386/Makefile.target
> index 1df69e0dab..44803c0382 100644
> --- a/tests/tcg/i386/Makefile.target
> +++ b/tests/tcg/i386/Makefile.target
> @@ -9,6 +9,12 @@ CFLAGS+=$(CROSS_CC_GUEST_CFLAGS)
>  endif
>  endif
>  
> +#
> +# hello-i386 is a barebones app
> +#
> +hello-i386: CFLAGS+=-ffreestanding
> +hello-i386: LDFLAGS+=-nostdlib
> +
>  #
>  # test-386 includes a couple of additional objects that need to be linked 
> together
>  #
> 

Reviewed-by: Thomas Huth 



[Qemu-devel] [Bug 1762558] Re: Many crashes with "memslot_get_virt: slot_id 170 too big"-type errors in 2.12.0 rc2

2018-04-11 Thread Adam Williamson
Up to you, of course. Just realized I didn't mention here that I also
reported this downstream, and since it turns out to be not triggered by
a qemu change I've been doing most of the investigation there:

https://bugzilla.redhat.com/show_bug.cgi?id=1565354

So far it's looking like the change that triggered this is going from
kernel 4.16rc7 to kernel 4.17rc4.

** Bug watch added: Red Hat Bugzilla #1565354
   https://bugzilla.redhat.com/show_bug.cgi?id=1565354

-- 
You received this bug notification because you are a member of qemu-
devel-ml, which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1762558

Title:
  Many crashes with "memslot_get_virt: slot_id 170 too big"-type errors
  in 2.12.0 rc2

Status in QEMU:
  New

Bug description:
  Since qemu 2.12.0 rc2 - qemu-2.12.0-0.6.rc2.fc29 - landed in Fedora
  Rawhide, just about all of our openQA-automated tests of Rawhide
  guests which run with qxl / SPICE graphics in the guest have died
  partway in, always shortly after the test switches from the installer
  (an X environment) to a console on a tty. qemu is, I think, hanging.
  There are always some errors like this right around the time of the
  hang:

  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: id 0, group 0, virt start 0, 
virt end , generation 0, delta 0
  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: id 1, group 1, virt start 
7f42dbc0, virt end 7f42dfbfe000, generation 0, delta 7f42dbc0
  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: id 2, group 1, virt start 
7f42d7a0, virt end 7f42dba0, generation 0, delta 7f42d7a0
  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: 
  [2018-04-09T20:13:42.0736 UTC] [debug] QEMU: (process:45812): Spice-CRITICAL 
**: memslot.c:111:memslot_get_virt: slot_id 218 too big, addr=da8e21fbda8e21fb

  or occasionally like this:

  [2018-04-09T20:13:58.0717 UTC] [debug] QEMU: id 0, group 0, virt start 0, 
virt end , generation 0, delta 0
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: id 1, group 1, virt start 
7ff093c0, virt end 7ff097bfe000, generation 0, delta 7ff093c0
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: id 2, group 1, virt start 
7ff08fa0, virt end 7ff093a0, generation 0, delta 7ff08fa0
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: 
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: (process:25622): Spice-WARNING 
**: memslot.c:68:memslot_validate_virt: virtual address out of range
  [2018-04-09T20:13:58.0720 UTC] [debug] QEMU: virt=0x0+0x18 slot_id=0 
group_id=1
  [2018-04-09T20:13:58.0721 UTC] [debug] QEMU: slot=0x0-0x0 delta=0x0
  [2018-04-09T20:13:58.0721 UTC] [debug] QEMU: 
  [2018-04-09T20:13:58.0721 UTC] [debug] QEMU: (process:25622): Spice-WARNING 
**: display-channel.c:2426:display_channel_validate_surface: invalid surface_id 
1048576
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: id 0, group 0, virt start 0, 
virt end , generation 0, delta 0
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: id 1, group 1, virt start 
7ff093c0, virt end 7ff097bfe000, generation 0, delta 7ff093c0
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: id 2, group 1, virt start 
7ff08fa0, virt end 7ff093a0, generation 0, delta 7ff08fa0
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: 
  [2018-04-09T20:14:14.0728 UTC] [debug] QEMU: (process:25622): Spice-CRITICAL 
**: memslot.c:122:memslot_get_virt: address generation is not valid, group_id 
1, slot_id 0, gen 110, slot_gen 0

  The same tests running on Fedora 28 guests on the same hosts are not
  hanging, and the same tests were not hanging right before the qemu
  package got updated, so this seems very strongly tied to the new qemu.

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1762558/+subscriptions



Re: [Qemu-devel] [PATCH 09/10] target/s390x: avoid integer overflow in next_page PC check

2018-04-11 Thread Cornelia Huck
On Tue, 10 Apr 2018 12:19:45 -0400
"Emilio G. Cota"  wrote:

> If the PC is in the last page of the address space, next_page_start
> overflows to 0. Fix it.
> 
> Cc: Cornelia Huck 
> Cc: Alexander Graf 
> Cc: David Hildenbrand 
> Cc: qemu-s3...@nongnu.org
> Signed-off-by: Emilio G. Cota 
> ---
>  target/s390x/translate.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

Acked-by: Cornelia Huck 
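
A minimal standalone illustration of the overflow described above (a sketch
with made-up constants, not the actual translator code; the patches in this
series may structure the fix differently): with 32-bit target addresses,
computing the start of the page after the last page wraps to 0, so a
"pc < next_page_start" style bound can never hold there, while comparing the
offset within the page avoids the wrap.

#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

typedef uint32_t target_ulong;           /* pretend 32-bit guest addresses */
#define TARGET_PAGE_SIZE 0x1000u
#define TARGET_PAGE_MASK ((target_ulong)-TARGET_PAGE_SIZE)

int main(void)
{
    target_ulong pc = 0xfffff123;        /* somewhere in the last page */

    /* broken: wraps to 0, so "pc < next_page_start" never holds here */
    target_ulong next_page_start = (pc & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
    printf("next_page_start = 0x%" PRIx32 "\n", next_page_start);
    assert(next_page_start == 0);

    /* overflow-free alternative: reason about the offset inside the page */
    target_ulong page_off = pc & ~TARGET_PAGE_MASK;
    assert(page_off + 4 <= TARGET_PAGE_SIZE);   /* e.g. one 4-byte insn fits */

    return 0;
}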



Re: [Qemu-devel] [PATCH v2 1/2] qcow2: try load bitmaps only once

2018-04-11 Thread Max Reitz
On 2018-04-11 14:26, Vladimir Sementsov-Ogievskiy wrote:
> Checking reopen by existence of some bitmaps is wrong, as it may be
> some other bitmaps, or on the other hand, user may remove bitmaps. This
> criteria is bad. To simplify things and make behavior more predictable
> let's just add a flag to remember, that we've already tried to load
> bitmaps on open and do not want do it again.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  block/qcow2.h |  1 +
>  block/qcow2.c | 16 
>  2 files changed, 9 insertions(+), 8 deletions(-)

Reviewed-by: Max Reitz 



[Qemu-devel] [PATCH 01/19] test-bdrv-drain: bdrv_drain() works with cross-AioContext events

2018-04-11 Thread Kevin Wolf
As long as nobody keeps the other I/O thread from working, there is no
reason why bdrv_drain() wouldn't work with cross-AioContext events. The
key is that the root request we're waiting for is in the AioContext
we're polling (which it always is for bdrv_drain()) so that aio_poll()
is woken up in the end.

Add a test case that shows that it works. Remove the comment in
bdrv_drain() that claims otherwise.

Signed-off-by: Kevin Wolf 
---
 block/io.c  |   4 --
 tests/test-bdrv-drain.c | 187 +++-
 2 files changed, 186 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index bd9a19a9c4..28e71221b0 100644
--- a/block/io.c
+++ b/block/io.c
@@ -369,10 +369,6 @@ void bdrv_unapply_subtree_drain(BdrvChild *child, 
BlockDriverState *old_parent)
  *
  * Note that unlike bdrv_drain_all(), the caller must hold the BlockDriverState
  * AioContext.
- *
- * Only this BlockDriverState's AioContext is run, so in-flight requests must
- * not depend on events in other AioContexts.  In that case, use
- * bdrv_drain_all() instead.
  */
 void coroutine_fn bdrv_co_drain(BlockDriverState *bs)
 {
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index 7673de1062..29634102d8 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -27,9 +27,13 @@
 #include "block/blockjob_int.h"
 #include "sysemu/block-backend.h"
 #include "qapi/error.h"
+#include "iothread.h"
+
+static QemuEvent done_event;
 
 typedef struct BDRVTestState {
 int drain_count;
+AioContext *bh_indirection_ctx;
 } BDRVTestState;
 
 static void coroutine_fn bdrv_test_co_drain_begin(BlockDriverState *bs)
@@ -50,16 +54,29 @@ static void bdrv_test_close(BlockDriverState *bs)
 g_assert_cmpint(s->drain_count, >, 0);
 }
 
+static void co_reenter_bh(void *opaque)
+{
+aio_co_wake(opaque);
+}
+
 static int coroutine_fn bdrv_test_co_preadv(BlockDriverState *bs,
 uint64_t offset, uint64_t bytes,
 QEMUIOVector *qiov, int flags)
 {
+BDRVTestState *s = bs->opaque;
+
 /* We want this request to stay until the polling loop in drain waits for
  * it to complete. We need to sleep a while as bdrv_drain_invoke() comes
  * first and polls its result, too, but it shouldn't accidentally complete
  * this request yet. */
 qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 10);
 
+if (s->bh_indirection_ctx) {
+aio_bh_schedule_oneshot(s->bh_indirection_ctx, co_reenter_bh,
+qemu_coroutine_self());
+qemu_coroutine_yield();
+}
+
 return 0;
 }
 
@@ -490,6 +507,164 @@ static void test_graph_change(void)
 blk_unref(blk_b);
 }
 
+struct test_iothread_data {
+BlockDriverState *bs;
+enum drain_type drain_type;
+int *aio_ret;
+};
+
+static void test_iothread_drain_entry(void *opaque)
+{
+struct test_iothread_data *data = opaque;
+
+aio_context_acquire(bdrv_get_aio_context(data->bs));
+do_drain_begin(data->drain_type, data->bs);
+g_assert_cmpint(*data->aio_ret, ==, 0);
+do_drain_end(data->drain_type, data->bs);
+aio_context_release(bdrv_get_aio_context(data->bs));
+
+qemu_event_set(&done_event);
+}
+
+static void test_iothread_aio_cb(void *opaque, int ret)
+{
+int *aio_ret = opaque;
+*aio_ret = ret;
+qemu_event_set(&done_event);
+}
+
+/*
+ * Starts an AIO request on a BDS that runs in the AioContext of iothread 1.
+ * The request involves a BH on iothread 2 before it can complete.
+ *
+ * @drain_thread = 0 means that do_drain_begin/end are called from the main
+ * thread, @drain_thread = 1 means that they are called from iothread 1. Drain
+ * for this BDS cannot be called from iothread 2 because only the main thread
+ * may do cross-AioContext polling.
+ */
+static void test_iothread_common(enum drain_type drain_type, int drain_thread)
+{
+BlockBackend *blk;
+BlockDriverState *bs;
+BDRVTestState *s;
+BlockAIOCB *acb;
+int aio_ret;
+struct test_iothread_data data;
+
+IOThread *a = iothread_new();
+IOThread *b = iothread_new();
+AioContext *ctx_a = iothread_get_aio_context(a);
+AioContext *ctx_b = iothread_get_aio_context(b);
+
+QEMUIOVector qiov;
+struct iovec iov = {
+.iov_base = NULL,
+.iov_len = 0,
+};
+qemu_iovec_init_external(&qiov, &iov, 1);
+
+/* bdrv_drain_all() may only be called from the main loop thread */
+if (drain_type == BDRV_DRAIN_ALL && drain_thread != 0) {
+goto out;
+}
+
+blk = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
+bs = bdrv_new_open_driver(&bdrv_test, "test-node", BDRV_O_RDWR,
+  &error_abort);
+s = bs->opaque;
+blk_insert_bs(blk, bs, &error_abort);
+
+blk_set_aio_context(blk, ctx_a);
+aio_context_acquire(ctx_a);
+
+s->bh_indirection_ctx = ctx_b;
+
+aio_ret = -EINPROGRESS;
+if (drain_thread == 0) {
+acb = 

[Qemu-devel] [PATCH 07/19] block: Really pause block jobs on drain

2018-04-11 Thread Kevin Wolf
We already requested that block jobs be paused in .bdrv_drained_begin,
but no guarantee was made that the job was actually inactive at the
point where bdrv_drained_begin() returned.

This introduces a new callback BdrvChildRole.bdrv_drained_poll() and
uses it to make bdrv_drain_poll() consider block jobs using the node to
be drained.

For the test case to work as expected, we have to switch from
block_job_sleep_ns() to qemu_co_sleep_ns() so that the test job is even
considered active and must be waited for when draining the node.

Signed-off-by: Kevin Wolf 
---
 include/block/block.h|  7 +++
 include/block/block_int.h|  7 +++
 include/block/blockjob_int.h |  8 
 block.c  |  9 +
 block/io.c   | 27 ---
 block/mirror.c   |  8 
 blockjob.c   | 21 +
 tests/test-bdrv-drain.c  | 18 ++
 8 files changed, 94 insertions(+), 11 deletions(-)

diff --git a/include/block/block.h b/include/block/block.h
index cdec3639a3..23dee3c114 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -559,6 +559,13 @@ void bdrv_parent_drained_begin(BlockDriverState *bs, 
BdrvChild *ignore);
 void bdrv_parent_drained_end(BlockDriverState *bs, BdrvChild *ignore);
 
 /**
+ * bdrv_drain_poll:
+ *
+ * Poll for pending requests in @bs. This is part of bdrv_drained_begin.
+ */
+bool bdrv_drain_poll(BlockDriverState *bs, bool top_level);
+
+/**
  * bdrv_drained_begin:
  *
  * Begin a quiesced section for exclusive access to the BDS, by disabling
diff --git a/include/block/block_int.h b/include/block/block_int.h
index c4dd1d4bb8..dc6985e3ae 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -575,6 +575,13 @@ struct BdrvChildRole {
 void (*drained_begin)(BdrvChild *child);
 void (*drained_end)(BdrvChild *child);
 
+/*
+ * Returns whether the parent has pending requests for the child. This
+ * callback is polled after .drained_begin() has been called until all
+ * activity on the child has stopped.
+ */
+bool (*drained_poll)(BdrvChild *child);
+
 /* Notifies the parent that the child has been activated/inactivated (e.g.
  * when migration is completing) and it can start/stop requesting
  * permissions and doing I/O on it. */
diff --git a/include/block/blockjob_int.h b/include/block/blockjob_int.h
index 642adce68b..3a2f851d3f 100644
--- a/include/block/blockjob_int.h
+++ b/include/block/blockjob_int.h
@@ -106,6 +106,14 @@ struct BlockJobDriver {
 void coroutine_fn (*resume)(BlockJob *job);
 
 /*
+ * Returns whether the job has pending requests for the child or will
+ * submit new requests before the next pause point. This callback is polled
+ * in the context of draining a job node after requesting that the job be
+ * paused, until all activity on the child has stopped.
+ */
+bool (*drained_poll)(BlockJob *job);
+
+/*
  * If the callback is not NULL, it will be invoked before the job is
  * resumed in a new AioContext.  This is the place to move any resources
  * besides job->blk to the new AioContext.
diff --git a/block.c b/block.c
index a2caadf0a0..462287bdfb 100644
--- a/block.c
+++ b/block.c
@@ -820,6 +820,12 @@ static void bdrv_child_cb_drained_begin(BdrvChild *child)
 bdrv_drained_begin(bs);
 }
 
+static bool bdrv_child_cb_drained_poll(BdrvChild *child)
+{
+BlockDriverState *bs = child->opaque;
+return bdrv_drain_poll(bs, false);
+}
+
 static void bdrv_child_cb_drained_end(BdrvChild *child)
 {
 BlockDriverState *bs = child->opaque;
@@ -904,6 +910,7 @@ const BdrvChildRole child_file = {
 .get_parent_desc = bdrv_child_get_parent_desc,
 .inherit_options = bdrv_inherited_options,
 .drained_begin   = bdrv_child_cb_drained_begin,
+.drained_poll= bdrv_child_cb_drained_poll,
 .drained_end = bdrv_child_cb_drained_end,
 .attach  = bdrv_child_cb_attach,
 .detach  = bdrv_child_cb_detach,
@@ -928,6 +935,7 @@ const BdrvChildRole child_format = {
 .get_parent_desc = bdrv_child_get_parent_desc,
 .inherit_options = bdrv_inherited_fmt_options,
 .drained_begin   = bdrv_child_cb_drained_begin,
+.drained_poll= bdrv_child_cb_drained_poll,
 .drained_end = bdrv_child_cb_drained_end,
 .attach  = bdrv_child_cb_attach,
 .detach  = bdrv_child_cb_detach,
@@ -1047,6 +1055,7 @@ const BdrvChildRole child_backing = {
 .detach  = bdrv_backing_detach,
 .inherit_options = bdrv_backing_options,
 .drained_begin   = bdrv_child_cb_drained_begin,
+.drained_poll= bdrv_child_cb_drained_poll,
 .drained_end = bdrv_child_cb_drained_end,
 .inactivate  = bdrv_child_cb_inactivate,
 .update_filename = bdrv_backing_update_filename,
diff --git a/block/io.c b/block/io.c
index 6f580f49ff..0a778eeff4 100644
--- 
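
To make the commit message above concrete, here is a rough self-contained
model of the idea (illustrative C only, not QEMU code or the real API):
drained_begin merely requests the pause, and the drain code then keeps
polling a drained_poll-style predicate until the job really has nothing in
flight any more.

#include <stdbool.h>
#include <stdio.h>

/* Toy stand-ins; the real code works on BlockJob/BdrvChild objects. */
typedef struct Job {
    bool pause_requested;
    int requests_in_flight;
} Job;

/* Spirit of the new drained_poll callback: "is there still activity that
 * the drained section has to wait for?" */
static bool job_drained_poll(const Job *job)
{
    return job->requests_in_flight > 0;
}

/* Stand-in for aio_poll() making progress on outstanding work. */
static void poll_once(Job *job)
{
    if (job->requests_in_flight > 0) {
        job->requests_in_flight--;
    }
}

int main(void)
{
    Job job = { .pause_requested = false, .requests_in_flight = 3 };

    /* drained_begin: only *request* quiescence, no waiting yet */
    job.pause_requested = true;

    /* polling phase: wait until the job is actually inactive */
    while (job_drained_poll(&job)) {
        poll_once(&job);
    }

    printf("job quiescent, drained section can proceed\n");
    return 0;
}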

[Qemu-devel] [PATCH 00/19] Drain fixes and cleanups, part 3

2018-04-11 Thread Kevin Wolf
This is the third and hopefully for now last part of my work to fix
drain. The main goal of this series is to make drain robust against
graph changes that happen in any callbacks of in-flight requests while
we drain a block node.

The individual patches describe the details, but the rough plan is to
change all three drain types (single node, subtree and all) to work like
this:

1. First call all the necessary callbacks to quiesce external sources
   for new requests. This includes the block driver callbacks, the child
   node callbacks and disabling external AioContext events. This is done
   recursively.

   Much of the trouble we had with drain resulted from the fact that the
   graph changed while we were traversing the graph recursively. None of
   the callbacks called in this phase may change the graph.

2. Then do a single AIO_WAIT_WHILE() to drain the requests of all
   affected nodes. The aio_poll() called by it is where graph changes
   can happen and we need to be careful.

   However, while evaluating the loop condition, the graph can't change,
   so we can safely call all necessary callbacks, if needed recursively,
   to determine whether there are still pending requests in any affected
   nodes. We just need to make sure that we don't rely on the set of
   nodes being the same between any two evaluations of the condition.

There are a few more smaller, mostly self-contained changes needed
before we're actually safe, but this is the main mechanism that will
help you understand what we're working towards during the series.

Kevin Wolf (18):
  test-bdrv-drain: bdrv_drain() works with cross-AioContext events
  block: Use bdrv_do_drain_begin/end in bdrv_drain_all()
  block: Remove 'recursive' parameter from bdrv_drain_invoke()
  block: Don't manually poll in bdrv_drain_all()
  tests/test-bdrv-drain: bdrv_drain_all() works in coroutines now
  block: Avoid unnecessary aio_poll() in AIO_WAIT_WHILE()
  block: Really pause block jobs on drain
  block: Remove bdrv_drain_recurse()
  block: Drain recursively with a single BDRV_POLL_WHILE()
  test-bdrv-drain: Test node deletion in subtree recursion
  block: Don't poll in parent drain callbacks
  test-bdrv-drain: Graph change through parent callback
  block: Defer .bdrv_drain_begin callback to polling phase
  test-bdrv-drain: Test that bdrv_drain_invoke() doesn't poll
  block: Allow AIO_WAIT_WHILE with NULL ctx
  block: Move bdrv_drain_all_begin() out of coroutine context
  block: Allow graph changes in bdrv_drain_all_begin/end sections
  test-bdrv-drain: Test graph changes in drain_all section

Max Reitz (1):
  test-bdrv-drain: Add test for node deletion

 include/block/aio-wait.h |  25 +-
 include/block/block.h|  17 +
 include/block/block_int.h|   8 +
 include/block/blockjob_int.h |   8 +
 block.c  |  31 +-
 block/io.c   | 280 +---
 block/mirror.c   |   8 +
 blockjob.c   |  21 ++
 tests/test-bdrv-drain.c  | 738 ---
 9 files changed, 974 insertions(+), 162 deletions(-)

-- 
2.13.6




[Qemu-devel] [PATCH 05/19] tests/test-bdrv-drain: bdrv_drain_all() works in coroutines now

2018-04-11 Thread Kevin Wolf
Since we use bdrv_do_drained_begin/end() for bdrv_drain_all_begin/end(),
coroutine context is automatically left with a BH, preventing the
deadlocks that made bdrv_drain_all*() unsafe in coroutine context. We
can consider it compatible now at the latest, after having removed the old
polling code as dead code.

Enable the coroutine test cases for bdrv_drain_all().

Signed-off-by: Kevin Wolf 
---
 tests/test-bdrv-drain.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index cd870609c5..6de525b509 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -233,6 +233,11 @@ static void test_drv_cb_drain_subtree(void)
 test_drv_cb_common(BDRV_SUBTREE_DRAIN, true);
 }
 
+static void test_drv_cb_co_drain_all(void)
+{
+call_in_coroutine(test_drv_cb_drain_all);
+}
+
 static void test_drv_cb_co_drain(void)
 {
 call_in_coroutine(test_drv_cb_drain);
@@ -289,6 +294,11 @@ static void test_quiesce_drain_subtree(void)
 test_quiesce_common(BDRV_SUBTREE_DRAIN, true);
 }
 
+static void test_quiesce_co_drain_all(void)
+{
+call_in_coroutine(test_quiesce_drain_all);
+}
+
 static void test_quiesce_co_drain(void)
 {
 call_in_coroutine(test_quiesce_drain);
@@ -795,7 +805,8 @@ int main(int argc, char **argv)
 g_test_add_func("/bdrv-drain/driver-cb/drain_subtree",
 test_drv_cb_drain_subtree);
 
-// XXX bdrv_drain_all() doesn't work in coroutine context
+g_test_add_func("/bdrv-drain/driver-cb/co/drain_all",
+test_drv_cb_co_drain_all);
 g_test_add_func("/bdrv-drain/driver-cb/co/drain", test_drv_cb_co_drain);
 g_test_add_func("/bdrv-drain/driver-cb/co/drain_subtree",
 test_drv_cb_co_drain_subtree);
@@ -806,7 +817,8 @@ int main(int argc, char **argv)
 g_test_add_func("/bdrv-drain/quiesce/drain_subtree",
 test_quiesce_drain_subtree);
 
-// XXX bdrv_drain_all() doesn't work in coroutine context
+g_test_add_func("/bdrv-drain/quiesce/co/drain_all",
+test_quiesce_co_drain_all);
 g_test_add_func("/bdrv-drain/quiesce/co/drain", test_quiesce_co_drain);
 g_test_add_func("/bdrv-drain/quiesce/co/drain_subtree",
 test_quiesce_co_drain_subtree);
-- 
2.13.6




[Qemu-devel] [PATCH 19/19] test-bdrv-drain: Test graph changes in drain_all section

2018-04-11 Thread Kevin Wolf
This tests both adding and removing a node between bdrv_drain_all_begin()
and bdrv_drain_all_end(), and enables the existing detach test for
drain_all.

Signed-off-by: Kevin Wolf 
---
 tests/test-bdrv-drain.c | 75 +++--
 1 file changed, 73 insertions(+), 2 deletions(-)

diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index 97ca0743c6..462842a761 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -457,7 +457,7 @@ static void test_multiparent(void)
 blk_unref(blk_b);
 }
 
-static void test_graph_change(void)
+static void test_graph_change_drain_subtree(void)
 {
 BlockBackend *blk_a, *blk_b;
 BlockDriverState *bs_a, *bs_b, *backing;
@@ -536,6 +536,63 @@ static void test_graph_change(void)
 blk_unref(blk_b);
 }
 
+static void test_graph_change_drain_all(void)
+{
+BlockBackend *blk_a, *blk_b;
+BlockDriverState *bs_a, *bs_b;
+BDRVTestState *a_s, *b_s;
+
+/* Create node A with a BlockBackend */
+blk_a = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
+bs_a = bdrv_new_open_driver(&bdrv_test, "test-node-a", BDRV_O_RDWR,
+&error_abort);
+a_s = bs_a->opaque;
+blk_insert_bs(blk_a, bs_a, &error_abort);
+
+g_assert_cmpint(bs_a->quiesce_counter, ==, 0);
+g_assert_cmpint(a_s->drain_count, ==, 0);
+
+/* Call bdrv_drain_all_begin() */
+bdrv_drain_all_begin();
+
+g_assert_cmpint(bs_a->quiesce_counter, ==, 1);
+g_assert_cmpint(a_s->drain_count, ==, 1);
+
+/* Create node B with a BlockBackend */
+blk_b = blk_new(BLK_PERM_ALL, BLK_PERM_ALL);
+bs_b = bdrv_new_open_driver(&bdrv_test, "test-node-b", BDRV_O_RDWR,
+&error_abort);
+b_s = bs_b->opaque;
+blk_insert_bs(blk_b, bs_b, &error_abort);
+
+g_assert_cmpint(bs_a->quiesce_counter, ==, 1);
+g_assert_cmpint(bs_b->quiesce_counter, ==, 1);
+g_assert_cmpint(a_s->drain_count, ==, 1);
+g_assert_cmpint(b_s->drain_count, ==, 1);
+
+/* Unref and finally delete node A */
+blk_unref(blk_a);
+
+g_assert_cmpint(bs_a->quiesce_counter, ==, 1);
+g_assert_cmpint(bs_b->quiesce_counter, ==, 1);
+g_assert_cmpint(a_s->drain_count, ==, 1);
+g_assert_cmpint(b_s->drain_count, ==, 1);
+
+bdrv_unref(bs_a);
+
+g_assert_cmpint(bs_b->quiesce_counter, ==, 1);
+g_assert_cmpint(b_s->drain_count, ==, 1);
+
+/* End the drained section */
+bdrv_drain_all_end();
+
+g_assert_cmpint(bs_b->quiesce_counter, ==, 0);
+g_assert_cmpint(b_s->drain_count, ==, 0);
+
+bdrv_unref(bs_b);
+blk_unref(blk_b);
+}
+
 struct test_iothread_data {
 BlockDriverState *bs;
 enum drain_type drain_type;
@@ -971,6 +1028,10 @@ static void do_test_delete_by_drain(bool 
detach_instead_of_delete,
 bdrv_subtree_drained_begin(bs);
 bdrv_subtree_drained_end(bs);
 break;
+case BDRV_DRAIN_ALL:
+bdrv_drain_all_begin();
+bdrv_drain_all_end();
+break;
 default:
 g_assert_not_reached();
 }
@@ -991,6 +1052,11 @@ static void test_delete_by_drain(void)
 do_test_delete_by_drain(false, BDRV_DRAIN);
 }
 
+static void test_detach_by_drain_all(void)
+{
+do_test_delete_by_drain(true, BDRV_DRAIN_ALL);
+}
+
 static void test_detach_by_drain(void)
 {
 do_test_delete_by_drain(true, BDRV_DRAIN);
@@ -1219,7 +1285,11 @@ int main(int argc, char **argv)
 
 g_test_add_func("/bdrv-drain/nested", test_nested);
 g_test_add_func("/bdrv-drain/multiparent", test_multiparent);
-g_test_add_func("/bdrv-drain/graph-change", test_graph_change);
+
+g_test_add_func("/bdrv-drain/graph-change/drain_subtree",
+test_graph_change_drain_subtree);
+g_test_add_func("/bdrv-drain/graph-change/drain_all",
+test_graph_change_drain_all);
 
 g_test_add_func("/bdrv-drain/iothread/drain_all", test_iothread_drain_all);
 g_test_add_func("/bdrv-drain/iothread/drain", test_iothread_drain);
@@ -1232,6 +1302,7 @@ int main(int argc, char **argv)
 test_blockjob_drain_subtree);
 
 g_test_add_func("/bdrv-drain/deletion/drain", test_delete_by_drain);
+g_test_add_func("/bdrv-drain/detach/drain_all", test_detach_by_drain_all);
 g_test_add_func("/bdrv-drain/detach/drain", test_detach_by_drain);
 g_test_add_func("/bdrv-drain/detach/drain_subtree", 
test_detach_by_drain_subtree);
 g_test_add_func("/bdrv-drain/detach/parent_cb", test_detach_by_parent_cb);
-- 
2.13.6




[Qemu-devel] [PATCH 14/19] block: Defer .bdrv_drain_begin callback to polling phase

2018-04-11 Thread Kevin Wolf
We cannot allow aio_poll() in bdrv_drain_invoke(begin=true) until we're
done with propagating the drain through the graph and are doing the
single final BDRV_POLL_WHILE().

Just schedule the coroutine with the callback and increase bs->in_flight
to make sure that the polling phase will wait for it.

Signed-off-by: Kevin Wolf 
---
 block/io.c | 28 +++-
 1 file changed, 23 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index f372b9ffb0..fc26c1a2f8 100644
--- a/block/io.c
+++ b/block/io.c
@@ -178,22 +178,40 @@ static void coroutine_fn bdrv_drain_invoke_entry(void 
*opaque)
 
 /* Set data->done before reading bs->wakeup.  */
 atomic_mb_set(>done, true);
-bdrv_wakeup(bs);
+bdrv_dec_in_flight(bs);
+
+if (data->begin) {
+g_free(data);
+}
 }
 
 /* Recursively call BlockDriver.bdrv_co_drain_begin/end callbacks */
 static void bdrv_drain_invoke(BlockDriverState *bs, bool begin)
 {
-BdrvCoDrainData data = { .bs = bs, .done = false, .begin = begin};
+BdrvCoDrainData *data;
 
 if (!bs->drv || (begin && !bs->drv->bdrv_co_drain_begin) ||
 (!begin && !bs->drv->bdrv_co_drain_end)) {
 return;
 }
 
-data.co = qemu_coroutine_create(bdrv_drain_invoke_entry, &data);
-bdrv_coroutine_enter(bs, data.co);
-BDRV_POLL_WHILE(bs, !data.done);
+data = g_new(BdrvCoDrainData, 1);
+*data = (BdrvCoDrainData) {
+.bs = bs,
+.done = false,
+.begin = begin
+};
+
+/* Make sure the driver callback completes during the polling phase for
+ * drain_begin. */
+bdrv_inc_in_flight(bs);
+data->co = qemu_coroutine_create(bdrv_drain_invoke_entry, data);
+aio_co_schedule(bdrv_get_aio_context(bs), data->co);
+
+if (!begin) {
+BDRV_POLL_WHILE(bs, !data->done);
+g_free(data);
+}
 }
 
 /* Returns true if BDRV_POLL_WHILE() should go into a blocking aio_poll() */
-- 
2.13.6




Re: [Qemu-devel] [PATCH 00/10] Avoid integer overflow in next_page_start

2018-04-11 Thread Cornelia Huck
On Wed, 11 Apr 2018 11:29:42 -0400
"Emilio G. Cota"  wrote:

> On Wed, Apr 11, 2018 at 10:08:58 +1000, Richard Henderson wrote:
> > On 04/11/2018 02:19 AM, Emilio G. Cota wrote:  
> > > Richard pointed out in another thread that when computing
> > > next_page_start we can break checks for the last page in the
> > > address space due to integer overflow. This affects several targets;
> > > the appended fixes them.
> > > 
> > > You can fetch the patches from:
> > >   https://github.com/cota/qemu/tree/next_page_overflow  
> > 
> > Reviewed-by: Richard Henderson   
> 
> Thanks!
> 
> To ease an eventual merge I'll be updating the patches' R-b tags as
> they come in this branch:
>   https://github.com/cota/qemu/tree/next_page_overflow-r-b
> 
> BTW to avoid conflicts we should merge this before the translator loop
> conversion series; I'll make that clear when I send a new version
> of that patch set.
> 
>   Emilio

So, this series will be merged in one go, then? I'll ack the s390x
patch.



Re: [Qemu-devel] [Qemu-arm] [PATCH for-2.12] hw/char/cmsdk-apb-uart.c: Correctly clear INTSTATUS bits on writes

2018-04-11 Thread Philippe Mathieu-Daudé
On 04/10/2018 10:42 AM, Peter Maydell wrote:
> The CMSDK APB UART INTSTATUS register bits are all write-one-to-clear.
> We were getting this correct for the TXO and RXO bits (which need
> special casing because their state lives in the STATE register),
> but had forgotten to handle the normal bits for RX and TX which
> we do store in our s->intstatus field.
> 
> Perform the W1C operation on the bits in s->intstatus too.
> 
> Fixes: https://bugs.launchpad.net/qemu/+bug/1760262
> Cc: qemu-sta...@nongnu.org
> Signed-off-by: Peter Maydell 
> ---
> Not a disaster if this doesn't get into 2.12, I guess. I think
> it's missed the rc3 boat, so if we need an rc4 for some other
> reason we can put it in.
> 
>  hw/char/cmsdk-apb-uart.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/hw/char/cmsdk-apb-uart.c b/hw/char/cmsdk-apb-uart.c
> index 1ad1e14295..9c0929d8a2 100644
> --- a/hw/char/cmsdk-apb-uart.c
> +++ b/hw/char/cmsdk-apb-uart.c
> @@ -274,6 +274,7 @@ static void uart_write(void *opaque, hwaddr offset, uint64_t value,
>   * is then reflected into the intstatus value by the update 
> function).

The comment was correct :)

Reviewed-by: Philippe Mathieu-Daudé 

>   */
>  s->state &= ~(value & (R_INTSTATUS_TXO_MASK | R_INTSTATUS_RXO_MASK));
> +s->intstatus &= ~value;
>  cmsdk_apb_uart_update(s);
>  break;
>  case A_BAUDDIV:
> 
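
The write-one-to-clear behaviour being fixed reduces to a one-liner (generic
illustration, not the device model itself):

    #include <stdint.h>

    /* W1C update: bits written as 1 are cleared, bits written as 0 are kept. */
    static inline uint32_t w1c_update(uint32_t status, uint32_t written)
    {
        return status & ~written;
    }

    /* e.g. w1c_update(0x05, 0x04) == 0x01 */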



Re: [Qemu-devel] [PATCH v1 13/24] tests/tcg/i386: move test-i386-sse.c to tests/tcg/x86_64/test-sse.c

2018-04-11 Thread Alex Bennée

Thomas Huth  writes:

> On 10.04.2018 21:39, Alex Bennée wrote:
>> The test mixes up 32bit and 64 bit code. It should probably be split
>> into two distinct test cases. However for now just move it out of the
>> way of the i386 build.
>>
>> Signed-off-by: Alex Bennée 
>> ---
>>  tests/tcg/{i386/test-i386-ssse3.c => x86_64/test-sse.c} | 6 ++
>>  1 file changed, 2 insertions(+), 4 deletions(-)
>>  rename tests/tcg/{i386/test-i386-ssse3.c => x86_64/test-sse.c} (93%)
>
> Do we really want to have a separate x86_64 folder here? We also have
> 64-bit code in hw/i386/ and target/i386/ so I don't think that we should
> handle this differently for tests/tcg/i386 ? Wouldn't it be sufficient
> to simply move this code to a separate file? ... just my 0.02 €

It's certainly simpler for the build rules to have a simple mapping from
$(TARGET_NAME) to tests/tcg/FOO

--
Alex Bennée



[Qemu-devel] [PATCH 02/19] block: Use bdrv_do_drain_begin/end in bdrv_drain_all()

2018-04-11 Thread Kevin Wolf
bdrv_do_drain_begin/end() already implement everything that
bdrv_drain_all_begin/end() need and currently still do manually: disable
external events, call parent drain callbacks, call block driver
callbacks.

It also does two more things:

The first is incrementing bs->quiesce_counter. bdrv_drain_all() already
stood out in the test case by behaving differently from the other drain
variants. Adding this is not only safe, but in fact a bug fix.

The second is calling bdrv_drain_recurse(). We already do that later in
the same function in a loop, so basically doing an early first iteration
doesn't hurt.

Signed-off-by: Kevin Wolf 
---
 block/io.c  | 10 ++
 tests/test-bdrv-drain.c | 14 --
 2 files changed, 6 insertions(+), 18 deletions(-)

diff --git a/block/io.c b/block/io.c
index 28e71221b0..cad59db2f4 100644
--- a/block/io.c
+++ b/block/io.c
@@ -412,11 +412,8 @@ void bdrv_drain_all_begin(void)
 for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
 AioContext *aio_context = bdrv_get_aio_context(bs);
 
-/* Stop things in parent-to-child order */
 aio_context_acquire(aio_context);
-aio_disable_external(aio_context);
-bdrv_parent_drained_begin(bs, NULL);
-bdrv_drain_invoke(bs, true, true);
+bdrv_do_drained_begin(bs, true, NULL);
 aio_context_release(aio_context);
 
 if (!g_slist_find(aio_ctxs, aio_context)) {
@@ -457,11 +454,8 @@ void bdrv_drain_all_end(void)
 for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
 AioContext *aio_context = bdrv_get_aio_context(bs);
 
-/* Re-enable things in child-to-parent order */
 aio_context_acquire(aio_context);
-bdrv_drain_invoke(bs, false, true);
-bdrv_parent_drained_end(bs, NULL);
-aio_enable_external(aio_context);
+bdrv_do_drained_end(bs, true, NULL);
 aio_context_release(aio_context);
 }
 }
diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index 29634102d8..cd870609c5 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -276,8 +276,7 @@ static void test_quiesce_common(enum drain_type drain_type, bool recursive)
 
 static void test_quiesce_drain_all(void)
 {
-// XXX drain_all doesn't quiesce
-//test_quiesce_common(BDRV_DRAIN_ALL, true);
+test_quiesce_common(BDRV_DRAIN_ALL, true);
 }
 
 static void test_quiesce_drain(void)
@@ -319,12 +318,7 @@ static void test_nested(void)
 
 for (outer = 0; outer < DRAIN_TYPE_MAX; outer++) {
 for (inner = 0; inner < DRAIN_TYPE_MAX; inner++) {
-/* XXX bdrv_drain_all() doesn't increase the quiesce_counter */
-int bs_quiesce  = (outer != BDRV_DRAIN_ALL) +
-  (inner != BDRV_DRAIN_ALL);
-int backing_quiesce = (outer == BDRV_SUBTREE_DRAIN) +
-  (inner == BDRV_SUBTREE_DRAIN);
-int backing_cb_cnt  = (outer != BDRV_DRAIN) +
+int backing_quiesce = (outer != BDRV_DRAIN) +
   (inner != BDRV_DRAIN);
 
 g_assert_cmpint(bs->quiesce_counter, ==, 0);
@@ -335,10 +329,10 @@ static void test_nested(void)
 do_drain_begin(outer, bs);
 do_drain_begin(inner, bs);
 
-g_assert_cmpint(bs->quiesce_counter, ==, bs_quiesce);
+g_assert_cmpint(bs->quiesce_counter, ==, 2);
 g_assert_cmpint(backing->quiesce_counter, ==, backing_quiesce);
 g_assert_cmpint(s->drain_count, ==, 2);
-g_assert_cmpint(backing_s->drain_count, ==, backing_cb_cnt);
+g_assert_cmpint(backing_s->drain_count, ==, backing_quiesce);
 
 do_drain_end(inner, bs);
 do_drain_end(outer, bs);
-- 
2.13.6
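
In terms of the reworked test above, the observable effect is roughly this
(sketch reusing the helpers from tests/test-bdrv-drain.c, not a complete test
function): any two nested drain sections, including BDRV_DRAIN_ALL, now each
bump bs->quiesce_counter.

    do_drain_begin(BDRV_DRAIN_ALL, bs);
    do_drain_begin(BDRV_DRAIN, bs);
    g_assert_cmpint(bs->quiesce_counter, ==, 2);  /* both variants count now */
    do_drain_end(BDRV_DRAIN, bs);
    do_drain_end(BDRV_DRAIN_ALL, bs);
    g_assert_cmpint(bs->quiesce_counter, ==, 0);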




[Qemu-devel] [PATCH 08/19] block: Remove bdrv_drain_recurse()

2018-04-11 Thread Kevin Wolf
For bdrv_drain(), recursively waiting for child node requests is
pointless because we didn't quiesce their parents, so new requests could
come in anyway. Letting the function work only on a single node makes it
more consistent.

For subtree drains and drain_all, we already have the recursion in
bdrv_do_drained_begin(), so the extra recursion doesn't add anything
either.

Remove the useless code.

Signed-off-by: Kevin Wolf 
---
 block/io.c | 36 +++-
 1 file changed, 3 insertions(+), 33 deletions(-)

diff --git a/block/io.c b/block/io.c
index 0a778eeff4..f24f39c278 100644
--- a/block/io.c
+++ b/block/io.c
@@ -211,38 +211,6 @@ bool bdrv_drain_poll(BlockDriverState *bs, bool top_level)
 return atomic_read(&bs->in_flight);
 }
 
-static bool bdrv_drain_recurse(BlockDriverState *bs)
-{
-BdrvChild *child, *tmp;
-bool waited;
-
-/* Wait for drained requests to finish */
-waited = BDRV_POLL_WHILE(bs, bdrv_drain_poll(bs, true));
-
-QLIST_FOREACH_SAFE(child, &bs->children, next, tmp) {
-BlockDriverState *bs = child->bs;
-bool in_main_loop =
-qemu_get_current_aio_context() == qemu_get_aio_context();
-assert(bs->refcnt > 0);
-if (in_main_loop) {
-/* In case the recursive bdrv_drain_recurse processes a
- * block_job_defer_to_main_loop BH and modifies the graph,
- * let's hold a reference to bs until we are done.
- *
- * IOThread doesn't have such a BH, and it is not safe to call
- * bdrv_unref without BQL, so skip doing it there.
- */
-bdrv_ref(bs);
-}
-waited |= bdrv_drain_recurse(bs);
-if (in_main_loop) {
-bdrv_unref(bs);
-}
-}
-
-return waited;
-}
-
 static void bdrv_do_drained_begin(BlockDriverState *bs, bool recursive,
   BdrvChild *parent);
 static void bdrv_do_drained_end(BlockDriverState *bs, bool recursive,
@@ -310,7 +278,9 @@ void bdrv_do_drained_begin(BlockDriverState *bs, bool recursive,
 
 bdrv_parent_drained_begin(bs, parent);
 bdrv_drain_invoke(bs, true);
-bdrv_drain_recurse(bs);
+
+/* Wait for drained requests to finish */
+BDRV_POLL_WHILE(bs, bdrv_drain_poll(bs, true));
 
 if (recursive) {
 bs->recursive_quiesce_counter++;
-- 
2.13.6




[Qemu-devel] [PATCH 04/19] block: Don't manually poll in bdrv_drain_all()

2018-04-11 Thread Kevin Wolf
All involved nodes are already idle; we have called bdrv_do_drained_begin() on
them.

The comment in the code suggested that this was not correct because the
completion of a request on one node could spawn a new request on a
different node (which might have been drained before, so we wouldn't
drain the new request). In reality, new requests to different nodes
aren't spawned out of nothing, but only in the context of a parent
request, and they aren't submitted to random nodes, but only to child
nodes. As long as we still poll for the completion of the parent request
(which we do), draining each root node separately is good enough.

Remove the additional polling code from bdrv_drain_all_begin() and
replace it with an assertion that all nodes are already idle after we
drained them separately.

Signed-off-by: Kevin Wolf 
---
 block/io.c | 41 -
 1 file changed, 12 insertions(+), 29 deletions(-)

diff --git a/block/io.c b/block/io.c
index d2bd89c3bb..ea6f9f023a 100644
--- a/block/io.c
+++ b/block/io.c
@@ -376,6 +376,16 @@ void bdrv_drain(BlockDriverState *bs)
 bdrv_drained_end(bs);
 }
 
+static void bdrv_drain_assert_idle(BlockDriverState *bs)
+{
+BdrvChild *child, *next;
+
+assert(atomic_read(&bs->in_flight) == 0);
+QLIST_FOREACH_SAFE(child, &bs->children, next, next) {
+bdrv_drain_assert_idle(child->bs);
+}
+}
+
 /*
  * Wait for pending requests to complete across all BlockDriverStates
  *
@@ -390,11 +400,8 @@ void bdrv_drain(BlockDriverState *bs)
  */
 void bdrv_drain_all_begin(void)
 {
-/* Always run first iteration so any pending completion BHs run */
-bool waited = true;
 BlockDriverState *bs;
 BdrvNextIterator it;
-GSList *aio_ctxs = NULL, *ctx;
 
 /* BDRV_POLL_WHILE() for a node can only be called from its own I/O thread
  * or the main loop AioContext. We potentially use BDRV_POLL_WHILE() on
@@ -408,35 +415,11 @@ void bdrv_drain_all_begin(void)
 aio_context_acquire(aio_context);
 bdrv_do_drained_begin(bs, true, NULL);
 aio_context_release(aio_context);
-
-if (!g_slist_find(aio_ctxs, aio_context)) {
-aio_ctxs = g_slist_prepend(aio_ctxs, aio_context);
-}
 }
 
-/* Note that completion of an asynchronous I/O operation can trigger any
- * number of other I/O operations on other devices---for example a
- * coroutine can submit an I/O request to another device in response to
- * request completion.  Therefore we must keep looping until there was no
- * more activity rather than simply draining each device independently.
- */
-while (waited) {
-waited = false;
-
-for (ctx = aio_ctxs; ctx != NULL; ctx = ctx->next) {
-AioContext *aio_context = ctx->data;
-
-aio_context_acquire(aio_context);
-for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
-if (aio_context == bdrv_get_aio_context(bs)) {
-waited |= bdrv_drain_recurse(bs);
-}
-}
-aio_context_release(aio_context);
-}
+for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
+bdrv_drain_assert_idle(bs);
 }
-
-g_slist_free(aio_ctxs);
 }
 
 void bdrv_drain_all_end(void)
-- 
2.13.6




[Qemu-devel] [PATCH 03/19] block: Remove 'recursive' parameter from bdrv_drain_invoke()

2018-04-11 Thread Kevin Wolf
All callers pass false for the 'recursive' parameter now. Remove it.

Signed-off-by: Kevin Wolf 
---
 block/io.c | 13 +++--
 1 file changed, 3 insertions(+), 10 deletions(-)

diff --git a/block/io.c b/block/io.c
index cad59db2f4..d2bd89c3bb 100644
--- a/block/io.c
+++ b/block/io.c
@@ -167,9 +167,8 @@ static void coroutine_fn bdrv_drain_invoke_entry(void *opaque)
 }
 
 /* Recursively call BlockDriver.bdrv_co_drain_begin/end callbacks */
-static void bdrv_drain_invoke(BlockDriverState *bs, bool begin, bool recursive)
+static void bdrv_drain_invoke(BlockDriverState *bs, bool begin)
 {
-BdrvChild *child, *tmp;
 BdrvCoDrainData data = { .bs = bs, .done = false, .begin = begin};
 
 if (!bs->drv || (begin && !bs->drv->bdrv_co_drain_begin) ||
@@ -180,12 +179,6 @@ static void bdrv_drain_invoke(BlockDriverState *bs, bool begin, bool recursive)
 data.co = qemu_coroutine_create(bdrv_drain_invoke_entry, &data);
 bdrv_coroutine_enter(bs, data.co);
 BDRV_POLL_WHILE(bs, !data.done);
-
-if (recursive) {
-QLIST_FOREACH_SAFE(child, &bs->children, next, tmp) {
-bdrv_drain_invoke(child->bs, begin, true);
-}
-}
 }
 
 static bool bdrv_drain_recurse(BlockDriverState *bs)
@@ -286,7 +279,7 @@ void bdrv_do_drained_begin(BlockDriverState *bs, bool recursive,
 }
 
 bdrv_parent_drained_begin(bs, parent);
-bdrv_drain_invoke(bs, true, false);
+bdrv_drain_invoke(bs, true);
 bdrv_drain_recurse(bs);
 
 if (recursive) {
@@ -321,7 +314,7 @@ void bdrv_do_drained_end(BlockDriverState *bs, bool recursive,
 old_quiesce_counter = atomic_fetch_dec(>quiesce_counter);
 
 /* Re-enable things in child-to-parent order */
-bdrv_drain_invoke(bs, false, false);
+bdrv_drain_invoke(bs, false);
 bdrv_parent_drained_end(bs, parent);
 if (old_quiesce_counter == 1) {
 aio_enable_external(bdrv_get_aio_context(bs));
-- 
2.13.6




[Qemu-devel] [PATCH 06/19] block: Avoid unnecessary aio_poll() in AIO_WAIT_WHILE()

2018-04-11 Thread Kevin Wolf
Commit 91af091f923 added an additional aio_poll() to BDRV_POLL_WHILE()
in order to make sure that all pending BHs are executed on drain. This
was the wrong place to make the fix, as it is useless overhead for all
other users of the macro and unnecessarily complicates the mechanism.

This patch effectively reverts said commit (the context has changed a
bit and the code has moved to AIO_WAIT_WHILE()) and instead polls in the
loop condition for drain.

The effect is probably hard to measure in any real-world use case
because actual I/O will dominate, but if I run only the initialisation
part of 'qemu-img convert' where it calls bdrv_block_status() for the
whole image to find out how much data there is to copy, this phase actually
needs only roughly half the time after this patch.

Signed-off-by: Kevin Wolf 
---
 include/block/aio-wait.h | 22 --
 block/io.c   | 11 ++-
 2 files changed, 18 insertions(+), 15 deletions(-)

diff --git a/include/block/aio-wait.h b/include/block/aio-wait.h
index 8c90a2e66e..783d3678dd 100644
--- a/include/block/aio-wait.h
+++ b/include/block/aio-wait.h
@@ -73,29 +73,23 @@ typedef struct {
  */
 #define AIO_WAIT_WHILE(wait, ctx, cond) ({ \
 bool waited_ = false;  \
-bool busy_ = true; \
 AioWait *wait_ = (wait);   \
 AioContext *ctx_ = (ctx);  \
 if (in_aio_context_home_thread(ctx_)) {\
-while ((cond) || busy_) {  \
-busy_ = aio_poll(ctx_, (cond));\
-waited_ |= !!(cond) | busy_;   \
+while ((cond)) {   \
+aio_poll(ctx_, true);  \
+waited_ = true;\
 }  \
 } else {   \
 assert(qemu_get_current_aio_context() ==   \
qemu_get_aio_context());\
 /* Increment wait_->num_waiters before evaluating cond. */ \
 atomic_inc(&wait_->num_waiters);   \
-while (busy_) {\
-if ((cond)) {  \
-waited_ = busy_ = true;\
-aio_context_release(ctx_); \
-aio_poll(qemu_get_aio_context(), true);\
-aio_context_acquire(ctx_); \
-} else {   \
-busy_ = aio_poll(ctx_, false); \
-waited_ |= busy_;  \
-}  \
+while ((cond)) {   \
+aio_context_release(ctx_); \
+aio_poll(qemu_get_aio_context(), true);\
+aio_context_acquire(ctx_); \
+waited_ = true;\
 }  \
 atomic_dec(&wait_->num_waiters);   \
 }  \
diff --git a/block/io.c b/block/io.c
index ea6f9f023a..6f580f49ff 100644
--- a/block/io.c
+++ b/block/io.c
@@ -181,13 +181,22 @@ static void bdrv_drain_invoke(BlockDriverState *bs, bool begin)
 BDRV_POLL_WHILE(bs, !data.done);
 }
 
+/* Returns true if BDRV_POLL_WHILE() should go into a blocking aio_poll() */
+static bool bdrv_drain_poll(BlockDriverState *bs)
+{
+/* Execute pending BHs first and check everything else only after the BHs
+ * have executed. */
+while (aio_poll(bs->aio_context, false));
+return atomic_read(&bs->in_flight);
+}
+
 static bool bdrv_drain_recurse(BlockDriverState *bs)
 {
 BdrvChild *child, *tmp;
 bool waited;
 
 /* Wait for drained requests to finish */
-waited = BDRV_POLL_WHILE(bs, atomic_read(&bs->in_flight) > 0);
+waited = BDRV_POLL_WHILE(bs, bdrv_drain_poll(bs));
 
 QLIST_FOREACH_SAFE(child, &bs->children, next, tmp) {
 BlockDriverState *bs = child->bs;
-- 
2.13.6
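
For the drain caller this means the wait collapses to the shape below (sketch,
condensed from the hunk above): the BH flushing lives in the polled condition,
and AIO_WAIT_WHILE()/BDRV_POLL_WHILE() itself only blocks in aio_poll() while
that condition holds.

    /* bdrv_drain_poll() first runs pending BHs with non-blocking aio_poll(),
     * then reports whether any request is still in flight. */
    waited = BDRV_POLL_WHILE(bs, bdrv_drain_poll(bs));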




Re: [Qemu-devel] [PATCH v1 08/24] docker: Makefile.include introduce DOCKER_SCRIPT

2018-04-11 Thread Philippe Mathieu-Daudé
On 04/10/2018 04:39 PM, Alex Bennée wrote:
> Define this in one place to make it easy to re-use.
> 
> Signed-off-by: Alex Bennée 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  tests/docker/Makefile.include | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/tests/docker/Makefile.include b/tests/docker/Makefile.include
> index de87341528..6a5aa9ec71 100644
> --- a/tests/docker/Makefile.include
> +++ b/tests/docker/Makefile.include
> @@ -14,6 +14,8 @@ DOCKER_TESTS := $(notdir $(shell \
>  
>  DOCKER_TOOLS := travis
>  
> +DOCKER_SCRIPT=$(SRC_PATH)/tests/docker/docker.py
> +
>  TESTS ?= %
>  IMAGES ?= %
>  
> @@ -37,7 +39,7 @@ docker-image-%: $(DOCKER_FILES_DIR)/%.docker
>   echo WARNING: EXECUTABLE is not set, debootstrap may fail. 2>&1 
> ; \
>   fi
>   $(call quiet-command,\
> - $(SRC_PATH)/tests/docker/docker.py build qemu:$* $< \
> + $(DOCKER_SCRIPT) build qemu:$* $< \
>   $(if $V,,--quiet) $(if $(NOCACHE),--no-cache) \
>   $(if $(NOUSER),,--add-current-user) \
>   $(if $(EXTRA_FILES),--extra-files $(EXTRA_FILES))\
> @@ -129,11 +131,11 @@ docker-run: docker-qemu-src
>   fi
>   $(if $(EXECUTABLE), \
>   $(call quiet-command,   \
> - $(SRC_PATH)/tests/docker/docker.py update   \
> + $(DOCKER_SCRIPT) update \
>   $(IMAGE) $(EXECUTABLE), \
>   "  COPYING $(EXECUTABLE) to $(IMAGE)"))
>   $(call quiet-command,   \
> - $(SRC_PATH)/tests/docker/docker.py run  \
> + $(DOCKER_SCRIPT) run\
>   $(if $(NOUSER),,-u $(shell id -u))  \
>   --security-opt seccomp=unconfined   \
>   $(if $V,,--rm)  \
> @@ -163,4 +165,4 @@ docker-run-%:
>   @$(MAKE) docker-run TEST=$(CMD) IMAGE=qemu:$(IMAGE)
>  
>  docker-clean:
> - $(call quiet-command, $(SRC_PATH)/tests/docker/docker.py clean)
> + $(call quiet-command, $(DOCKER_SCRIPT) clean)
> 



Re: [Qemu-devel] [PATCH v2 0/2] bitmaps persistent and migration fixes

2018-04-11 Thread Max Reitz
On 2018-04-11 14:26, Vladimir Sementsov-Ogievskiy wrote:
> v2:
> 
> 01: new, proposed by Max
> 02: fix leaving cat processes
> 
> Vladimir Sementsov-Ogievskiy (2):
>   qcow2: try load bitmaps only once
>   iotests: fix 169
> 
>  block/qcow2.h  |  1 +
>  block/qcow2.c  | 16 
>  tests/qemu-iotests/169 | 48 +++-
>  3 files changed, 36 insertions(+), 29 deletions(-)

Thanks, changed the comment wording in patch 1 according to Eric's
suggestion and applied to my block branch:

https://github.com/XanClic/qemu/commits/block

I left the commit wording unchanged because as you said there might be
more cases where we have bitmaps at the node although we haven't yet
loaded any from the file.  I certainly don't want to have to think too
much about those cases and that to me is the main use of patch 1.

If there is going to be an rc4 (it appears there is), I will send a pull
request with these patches for it.  If not, I probably won't.  Then I'll
just add a qemu-stable CC and get them into 2.13 -- unless you disagree
and you think that this series should indeed be in 2.12.

Max



[Qemu-devel] [PATCH 17/19] block: Move bdrv_drain_all_begin() out of coroutine context

2018-04-11 Thread Kevin Wolf
Before we can introduce a single polling loop for all nodes in
bdrv_drain_all_begin(), we must make sure to run it outside of coroutine
context like we already do for bdrv_do_drained_begin().

Signed-off-by: Kevin Wolf 
---
 block/io.c | 22 +-
 1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/block/io.c b/block/io.c
index fc26c1a2f8..db6810b35f 100644
--- a/block/io.c
+++ b/block/io.c
@@ -255,11 +255,16 @@ static void bdrv_co_drain_bh_cb(void *opaque)
 Coroutine *co = data->co;
 BlockDriverState *bs = data->bs;
 
-bdrv_dec_in_flight(bs);
-if (data->begin) {
-bdrv_do_drained_begin(bs, data->recursive, data->parent, data->poll);
+if (bs) {
+bdrv_dec_in_flight(bs);
+if (data->begin) {
+bdrv_do_drained_begin(bs, data->recursive, data->parent, data->poll);
+} else {
+bdrv_do_drained_end(bs, data->recursive, data->parent);
+}
 } else {
-bdrv_do_drained_end(bs, data->recursive, data->parent);
+assert(data->begin);
+bdrv_drain_all_begin();
 }
 
 data->done = true;
@@ -285,7 +290,9 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs,
 .parent = parent,
 .poll = poll,
 };
-bdrv_inc_in_flight(bs);
+if (bs) {
+bdrv_inc_in_flight(bs);
+}
 aio_bh_schedule_oneshot(bdrv_get_aio_context(bs),
 bdrv_co_drain_bh_cb, &data);
 
@@ -455,6 +462,11 @@ void bdrv_drain_all_begin(void)
 BlockDriverState *bs;
 BdrvNextIterator it;
 
+if (qemu_in_coroutine()) {
+bdrv_co_yield_to_drain(NULL, true, false, NULL, true);
+return;
+}
+
 /* BDRV_POLL_WHILE() for a node can only be called from its own I/O thread
  * or the main loop AioContext. We potentially use BDRV_POLL_WHILE() on
  * nodes in several different AioContexts, so make sure we're in the main
-- 
2.13.6
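
The shape of the change is the usual "bounce to a bottom half" pattern (rough
sketch, simplified from the hunks above; a NULL bs is used to mean "drain all"):

    void bdrv_drain_all_begin(void)
    {
        if (qemu_in_coroutine()) {
            /* re-enter via a BH so polling happens outside coroutine context */
            bdrv_co_yield_to_drain(NULL, true, false, NULL, true);
            return;
        }
        /* ... the normal, non-coroutine path continues here ... */
    }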




Re: [Qemu-devel] [PATCH v1 13/24] tests/tcg/i386: move test-i386-sse.c to tests/tcg/x86_64/test-sse.c

2018-04-11 Thread Thomas Huth
On 10.04.2018 21:39, Alex Bennée wrote:
> The test mixes up 32bit and 64 bit code. It should probably be split
> into two distinct test cases. However for now just move it out of the
> way of the i386 build.
> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/tcg/{i386/test-i386-ssse3.c => x86_64/test-sse.c} | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>  rename tests/tcg/{i386/test-i386-ssse3.c => x86_64/test-sse.c} (93%)

Do we really want to have a separate x86_64 folder here? We also have
64-bit code in hw/i386/ and target/i386/ so I don't think that we should
handle this differently for tests/tcg/i386 ? Wouldn't it be sufficient
to simply move this code to a separate file? ... just my 0.02 €

 Thomas



Re: [Qemu-devel] [PATCH v2 06/17] target/mips: convert to DisasJumpType

2018-04-11 Thread Emilio G. Cota
On Wed, Apr 11, 2018 at 09:27:57 +1000, Richard Henderson wrote:
> On 04/11/2018 12:23 AM, Emilio G. Cota wrote:
> >  case DISAS_STOP:
> > -gen_goto_tb(&ctx, 0, ctx.pc);
> > +tcg_gen_lookup_and_goto_ptr();
> 
> You need to write ctx.pc back to the pc first, e.g.
> 
> gen_save_pc(ctx.pc);
> tcg_gen_lookup_and_goto_ptr();

Thanks, fixed now.

Emilio
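
Put together, the DISAS_STOP exit then looks roughly like this (sketch using
the names from the thread; gen_save_pc() is the target's existing helper):

    case DISAS_STOP:
        gen_save_pc(ctx.pc);            /* write ctx.pc back to the CPU state first */
        tcg_gen_lookup_and_goto_ptr();  /* then chain through the TB lookup helper */
        break;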



[Qemu-devel] [PATCH 15/19] test-bdrv-drain: Test that bdrv_drain_invoke() doesn't poll

2018-04-11 Thread Kevin Wolf
This adds a test case that goes wrong if bdrv_drain_invoke() calls
aio_poll().

Signed-off-by: Kevin Wolf 
---
 tests/test-bdrv-drain.c | 102 +---
 1 file changed, 88 insertions(+), 14 deletions(-)

diff --git a/tests/test-bdrv-drain.c b/tests/test-bdrv-drain.c
index fdf3ce19ea..79d845ecbb 100644
--- a/tests/test-bdrv-drain.c
+++ b/tests/test-bdrv-drain.c
@@ -34,12 +34,16 @@ static QemuEvent done_event;
 typedef struct BDRVTestState {
 int drain_count;
 AioContext *bh_indirection_ctx;
+bool sleep_in_drain_begin;
 } BDRVTestState;
 
 static void coroutine_fn bdrv_test_co_drain_begin(BlockDriverState *bs)
 {
 BDRVTestState *s = bs->opaque;
 s->drain_count++;
+if (s->sleep_in_drain_begin) {
+qemu_co_sleep_ns(QEMU_CLOCK_REALTIME, 10);
+}
 }
 
 static void coroutine_fn bdrv_test_co_drain_end(BlockDriverState *bs)
@@ -80,6 +84,22 @@ static int coroutine_fn bdrv_test_co_preadv(BlockDriverState *bs,
 return 0;
 }
 
+static void bdrv_test_child_perm(BlockDriverState *bs, BdrvChild *c,
+ const BdrvChildRole *role,
+ BlockReopenQueue *reopen_queue,
+ uint64_t perm, uint64_t shared,
+ uint64_t *nperm, uint64_t *nshared)
+{
+/* bdrv_format_default_perms() accepts only these two, so disguise
+ * detach_by_driver_cb_role as one of them. */
+if (role != &child_file && role != &child_backing) {
+role = &child_file;
+}
+
+bdrv_format_default_perms(bs, c, role, reopen_queue, perm, shared,
+  nperm, nshared);
+}
+
 static BlockDriver bdrv_test = {
 .format_name= "test",
 .instance_size  = sizeof(BDRVTestState),
@@ -90,7 +110,7 @@ static BlockDriver bdrv_test = {
 .bdrv_co_drain_begin= bdrv_test_co_drain_begin,
 .bdrv_co_drain_end  = bdrv_test_co_drain_end,
 
-.bdrv_child_perm= bdrv_format_default_perms,
+.bdrv_child_perm= bdrv_test_child_perm,
 };
 
 static void aio_ret_cb(void *opaque, int ret)
@@ -982,13 +1002,14 @@ struct detach_by_parent_data {
 BdrvChild *child_b;
 BlockDriverState *c;
 BdrvChild *child_c;
+bool by_parent_cb;
 };
+static struct detach_by_parent_data detach_by_parent_data;
 
-static void detach_by_parent_aio_cb(void *opaque, int ret)
+static void detach_indirect_bh(void *opaque)
 {
 struct detach_by_parent_data *data = opaque;
 
-g_assert_cmpint(ret, ==, 0);
 bdrv_unref_child(data->parent_b, data->child_b);
 
 bdrv_ref(data->c);
@@ -996,6 +1017,25 @@ static void detach_by_parent_aio_cb(void *opaque, int ret)
   &child_file, &error_abort);
 }
 
+static void detach_by_parent_aio_cb(void *opaque, int ret)
+{
+struct detach_by_parent_data *data = &detach_by_parent_data;
+
+g_assert_cmpint(ret, ==, 0);
+if (data->by_parent_cb) {
+detach_indirect_bh(data);
+}
+}
+
+static void detach_by_driver_cb_drained_begin(BdrvChild *child)
+{
+aio_bh_schedule_oneshot(qemu_get_current_aio_context(),
+detach_indirect_bh, &detach_by_parent_data);
+child_file.drained_begin(child);
+}
+
+static BdrvChildRole detach_by_driver_cb_role;
+
 /*
  * Initial graph:
  *
@@ -1003,17 +1043,25 @@ static void detach_by_parent_aio_cb(void *opaque, int ret)
  *\ /   \
  * A B C
  *
- * PA has a pending write request whose callback changes the child nodes of PB:
- * It removes B and adds C instead. The subtree of PB is drained, which will
- * indirectly drain the write request, too.
+ * by_parent_cb == true:  Test that parent callbacks don't poll
+ *
+ * PA has a pending write request whose callback changes the child nodes of
+ * PB: It removes B and adds C instead. The subtree of PB is drained, which
+ * will indirectly drain the write request, too.
+ *
+ * by_parent_cb == false: Test that bdrv_drain_invoke() doesn't poll
+ *
+ * PA's BdrvChildRole has a .drained_begin callback that schedules a BH
+ * that does the same graph change. If bdrv_drain_invoke() calls it, the
+ * state is messed up, but if it is only polled in the single
+ * BDRV_POLL_WHILE() at the end of the drain, this should work fine.
  */
-static void test_detach_by_parent_cb(void)
+static void test_detach_indirect(bool by_parent_cb)
 {
 BlockBackend *blk;
 BlockDriverState *parent_a, *parent_b, *a, *b, *c;
 BdrvChild *child_a, *child_b;
 BlockAIOCB *acb;
-struct detach_by_parent_data data;
 
 QEMUIOVector qiov;
 struct iovec iov = {
@@ -1022,6 +1070,12 @@ static void test_detach_by_parent_cb(void)
 };
 qemu_iovec_init_external(&qiov, &iov, 1);
 
+if (!by_parent_cb) {
+detach_by_driver_cb_role = child_file;
+detach_by_driver_cb_role.drained_begin =
+detach_by_driver_cb_drained_begin;
+}
+
 /* Create all involved nodes */
 

Re: [Qemu-devel] [PATCH 4/5] migration: fix qemu carsh when RDMA live migration

2018-04-11 Thread Dr. David Alan Gilbert
* Lidong Chen (jemmy858...@gmail.com) wrote:
> After postcopy, the destination qemu works in a dedicated
> thread, so only invoke yield_until_fd_readable before postcopy
> migration.

The subject line needs to be more descriptive:
   migration: Stop rdma yielding during incoming postcopy

I think.
(Also please check the subject spellings)

> Signed-off-by: Lidong Chen 
> ---
>  migration/rdma.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 53773c7..81be482 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -1489,11 +1489,13 @@ static int qemu_rdma_wait_comp_channel(RDMAContext *rdma)
>   * Coroutine doesn't start until migration_fd_process_incoming()
>   * so don't yield unless we know we're running inside of a coroutine.
>   */
> -if (rdma->migration_started_on_destination) {
> +if (rdma->migration_started_on_destination &&
> +migration_incoming_get_current()->state == MIGRATION_STATUS_ACTIVE) {

OK, that's a bit delicate; watch if it ever gets called in a failure
case or similar - and also watch out if we make more use of the status
on the destination, but otherwise, and with a fix for the subject;


Reviewed-by: Dr. David Alan Gilbert 

>  yield_until_fd_readable(rdma->comp_channel->fd);
>  } else {
>  /* This is the source side, we're in a separate thread
>   * or destination prior to migration_fd_process_incoming()
> + * after postcopy, the destination also runs in a separate thread.
>   * we can't yield; so we have to poll the fd.
>   * But we need to be able to handle 'cancel' or an error
>   * without hanging forever.
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
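
Spelled out, the rule under discussion is (illustration only; apart from
yield_until_fd_readable() the names are placeholders, not the real
migration/rdma.c code):

    if (destination_before_postcopy) {
        /* running inside the incoming-migration coroutine: safe to yield */
        yield_until_fd_readable(rdma->comp_channel->fd);
    } else {
        /* source side, or destination after postcopy (dedicated thread):
         * must not yield, so poll the fd with a timeout instead */
        poll_comp_channel_fd(rdma->comp_channel->fd);
    }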



Re: [Qemu-devel] [PATCH v2 2/2] iotests: fix 169

2018-04-11 Thread Max Reitz
On 2018-04-11 14:26, Vladimir Sementsov-Ogievskiy wrote:
> Improve and fix 169:
> - use MIGRATION events instead of RESUME
> - make a TODO: enable dirty-bitmaps capability for offline case
> - recreate vm_b without -incoming near test end
> 
> This (likely) fixes racy faults at least of the following types:
> 
> - timeout on waiting for RESUME event
> - sha256 mismatch on line 136 (142 after this patch)
> - fail to self.vm_b.launch() on line 135 (141 now after this patch)
> 
> And it surely fixes the cat processes left after the test finishes.
> 
> Signed-off-by: Vladimir Sementsov-Ogievskiy 
> ---
>  tests/qemu-iotests/169 | 48 +++-
>  1 file changed, 27 insertions(+), 21 deletions(-)

Looks good, makes the test pass reliably (here) and doesn't result in
overly loud meowing due to an abundance of orphaned cats.  Thanks!

Reviewed-by: Max Reitz 



Re: [Qemu-devel] [PATCH v1 16/24] tests/tcg/i386: fix test-i386-fprem

2018-04-11 Thread Philippe Mathieu-Daudé
On 04/10/2018 04:39 PM, Alex Bennée wrote:
> Remove dependencies on QEMU's source tree and build directly.
> 
> Signed-off-by: Alex Bennée 

Reviewed-by: Philippe Mathieu-Daudé 

> ---
>  tests/tcg/i386/test-i386-fprem.c | 9 ++---
>  1 file changed, 6 insertions(+), 3 deletions(-)
> 
> diff --git a/tests/tcg/i386/test-i386-fprem.c b/tests/tcg/i386/test-i386-fprem.c
> index 1a71623204..771dc07433 100644
> --- a/tests/tcg/i386/test-i386-fprem.c
> +++ b/tests/tcg/i386/test-i386-fprem.c
> @@ -23,7 +23,10 @@
>   *  along with this program; if not, see .
>   */
>  
> -#include "qemu/osdep.h"
> +#include 
> +#include 
> +
> +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
>  
>  /*
>   * Inspired by 's union ieee854_long_double, but with single
> @@ -39,7 +42,7 @@ union float80u {
>  unsigned int exponent:15;
>  unsigned int negative:1;
>  unsigned int empty:16;
> -} QEMU_PACKED ieee;
> +} __attribute__((packed)) ieee;
>  
>  /* This is for NaNs in the IEEE 854 double-extended-precision format.  */
>  struct {
> @@ -49,7 +52,7 @@ union float80u {
>  unsigned int exponent:15;
>  unsigned int negative:1;
>  unsigned int empty:16;
> -} QEMU_PACKED ieee_nan;
> +} __attribute__((packed)) ieee_nan;
>  };
>  
>  #define IEEE854_LONG_DOUBLE_BIAS 0x3fff
> 



Re: [Qemu-devel] [PULL for-2.12 0/1] tcg patch queue

2018-04-11 Thread Peter Maydell
On 11 April 2018 at 00:31, Richard Henderson
<richard.hender...@linaro.org> wrote:
> The following changes since commit 26d6a7c87b05017ffabffb5e16837a0fccf67e90:
>
>   Merge remote-tracking branch 'remotes/ericb/tags/pull-qapi-2018-04-10' into 
> staging (2018-04-10 22:16:19 +0100)
>
> are available in the Git repository at:
>
>   git://github.com/rth7680/qemu.git tags/pull-tcg-20180411
>
> for you to fetch changes up to afd46fcad2dceffda35c0586f5723c127b6e09d8:
>
>   icount: fix cpu_restore_state_from_tb for non-tb-exit cases (2018-04-11 
> 09:05:22 +1000)
>
> 
> Handle read-modify-write i/o with icount
>
> 
> Pavel Dovgalyuk (1):
>   icount: fix cpu_restore_state_from_tb for non-tb-exit cases
>
>  include/exec/exec-all.h  |  5 -
>  accel/tcg/cpu-exec-common.c  | 10 +-
>  accel/tcg/cpu-exec.c |  1 -
>  accel/tcg/translate-all.c| 27 ++-
>  accel/tcg/user-exec.c|  2 +-
>  hw/misc/mips_itu.c   |  3 +--
>  target/alpha/helper.c|  2 +-
>  target/alpha/mem_helper.c|  6 ++
>  target/arm/op_helper.c   |  6 +++---
>  target/cris/op_helper.c  |  4 ++--
>  target/i386/helper.c |  2 +-
>  target/i386/svm_helper.c |  2 +-
>  target/m68k/op_helper.c  |  4 ++--
>  target/moxie/helper.c|  2 +-
>  target/openrisc/sys_helper.c |  8 
>  target/tricore/op_helper.c   |  2 +-
>  target/xtensa/op_helper.c|  4 ++--
>  17 files changed, 45 insertions(+), 45 deletions(-)


Applied, thanks.

-- PMM



Re: [Qemu-devel] [PATCH v1 17/24] tests/tcg: move ARM specific tests into subdir

2018-04-11 Thread Thomas Huth
On 10.04.2018 21:39, Alex Bennée wrote:
> These only need to be built for ARM guests.
> 
> Signed-off-by: Alex Bennée 
> ---
>  tests/tcg/README  |  9 -
>  tests/tcg/arm/README  | 11 +++
>  tests/tcg/{ => arm}/hello-arm.c   |  0
>  tests/tcg/{ => arm}/test-arm-iwmmxt.s |  0
>  4 files changed, 11 insertions(+), 9 deletions(-)
>  create mode 100644 tests/tcg/arm/README
>  rename tests/tcg/{ => arm}/hello-arm.c (100%)
>  rename tests/tcg/{ => arm}/test-arm-iwmmxt.s (100%)

Reviewed-by: Thomas Huth 



Re: [Qemu-devel] [PATCH 5/5] migration: disable RDMA WRITR after postcopy started.

2018-04-11 Thread Dr. David Alan Gilbert
* Lidong Chen (jemmy858...@gmail.com) wrote:
> RDMA write operations are performed with no notification to the destination
> qemu, so the destination qemu cannot wake up. Disable RDMA WRITE after
> postcopy has started.
> 
> Signed-off-by: Lidong Chen 

This patch needs to be near the beginning of the series; at the moment a
bisect would lead you to the middle of the series which had return
paths, but then would fail to work properly because it would try and use
the RDMA code.

> ---
>  migration/qemu-file.c |  3 ++-
>  migration/rdma.c  | 12 
>  2 files changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/qemu-file.c b/migration/qemu-file.c
> index 8acb574..a64ac3a 100644
> --- a/migration/qemu-file.c
> +++ b/migration/qemu-file.c
> @@ -260,7 +260,8 @@ size_t ram_control_save_page(QEMUFile *f, ram_addr_t block_offset,
>  int ret = f->hooks->save_page(f, f->opaque, block_offset,
>offset, size, bytes_sent);
>  f->bytes_xfer += size;
> -if (ret != RAM_SAVE_CONTROL_DELAYED) {
> +if (ret != RAM_SAVE_CONTROL_DELAYED &&
> +ret != RAM_SAVE_CONTROL_NOT_SUPP) {

What about f->bytes_xfer in this case?

Is there anything we have to do at the switchover into postcopy to make
sure that all pages have been received?

Dave

>  if (bytes_sent && *bytes_sent > 0) {
>  qemu_update_position(f, *bytes_sent);
>  } else if (ret < 0) {
> diff --git a/migration/rdma.c b/migration/rdma.c
> index 81be482..8529ddd 100644
> --- a/migration/rdma.c
> +++ b/migration/rdma.c
> @@ -2964,6 +2964,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f, void *opaque,
>  
>  CHECK_ERROR_STATE();
>  
> +if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> +return RAM_SAVE_CONTROL_NOT_SUPP;
> +}
> +
>  qemu_fflush(f);
>  
>  if (size > 0) {
> @@ -3528,6 +3532,10 @@ static int qemu_rdma_registration_start(QEMUFile *f, void *opaque,
>  
>  CHECK_ERROR_STATE();
>  
> +if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> +return 0;
> +}
> +
>  trace_qemu_rdma_registration_start(flags);
>  qemu_put_be64(f, RAM_SAVE_FLAG_HOOK);
>  qemu_fflush(f);
> @@ -3550,6 +3558,10 @@ static int qemu_rdma_registration_stop(QEMUFile *f, void *opaque,
>  
>  CHECK_ERROR_STATE();
>  
> +if (migrate_get_current()->state == MIGRATION_STATUS_POSTCOPY_ACTIVE) {
> +return 0;
> +}
> +
>  qemu_fflush(f);
>  ret = qemu_rdma_drain_cq(f, rdma);
>  
> -- 
> 1.8.3.1
> 
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK


