Re: [Qemu-devel] [PATCH] target-i386: adds PV_TLB_FLUSH CPUID feature bit
2017-11-10 15:45 GMT+08:00 Wanpeng Li:
> From: Wanpeng Li
>
> Adds PV_TLB_FLUSH CPUID feature bit.
>
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Cc: Richard Henderson
> Cc: Eduardo Habkost
> Signed-off-by: Wanpeng Li
> ---
>  target/i386/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 6f21a5e..ecebc5a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -347,7 +347,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>      .feat_names = {
>          "kvmclock", "kvm-nopiodelay", "kvm-mmu", "kvmclock",
>          "kvm-asyncpf", "kvm-steal-time", "kvm-pv-eoi", "kvm-pv-unhalt",
> -        NULL, NULL, NULL, NULL,
> +        NULL, "kvm-pv-tlb-flush", NULL, NULL,

Note: bit 8 is reserved for PV_DEDICATED, which has been posted separately by another contributor in the KVM community.

>          NULL, NULL, NULL, NULL,
>          NULL, NULL, NULL, NULL,
>          NULL, NULL, NULL, NULL,
> --
> 2.7.4
[Qemu-devel] [PATCH] target-i386: adds PV_TLB_FLUSH CPUID feature bit
From: Wanpeng Li

Adds PV_TLB_FLUSH CPUID feature bit.

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Richard Henderson
Cc: Eduardo Habkost
Signed-off-by: Wanpeng Li
---
 target/i386/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6f21a5e..ecebc5a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -347,7 +347,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
     .feat_names = {
         "kvmclock", "kvm-nopiodelay", "kvm-mmu", "kvmclock",
         "kvm-asyncpf", "kvm-steal-time", "kvm-pv-eoi", "kvm-pv-unhalt",
-        NULL, NULL, NULL, NULL,
+        NULL, "kvm-pv-tlb-flush", NULL, NULL,
         NULL, NULL, NULL, NULL,
         NULL, NULL, NULL, NULL,
         NULL, NULL, NULL, NULL,
--
2.7.4
Re: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Type: series
Message-id: 20171108225340.10194-1-lep...@google.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]  patchew/20171108225340.10194-1-lep...@google.com -> patchew/20171108225340.10194-1-lep...@google.com
Switched to a new branch 'test'
566ba17a0b slirp: don't zero ti_i since we access it later.
=== OUTPUT BEGIN ===
Checking PATCH 1/1: slirp: don't zero ti_i since we access it later...
ERROR: code indent should never use tabs
#21: FILE: slirp/tcp_subr.c:151:
+^I^Iswitch (af) {$

ERROR: code indent should never use tabs
#22: FILE: slirp/tcp_subr.c:152:
+^I^Icase AF_INET:$

ERROR: code indent should never use tabs
#23: FILE: slirp/tcp_subr.c:153:
+^I^Iti->ti.ti_i4.ih_x1 = 0;$

ERROR: code indent should never use tabs
#24: FILE: slirp/tcp_subr.c:154:
+^I^Ibreak;$

ERROR: code indent should never use tabs
#25: FILE: slirp/tcp_subr.c:155:
+^I^Icase AF_INET6:$

ERROR: code indent should never use tabs
#26: FILE: slirp/tcp_subr.c:156:
+^I^Iti->ti.ti_i6.ih_x1 = 0;$

ERROR: code indent should never use tabs
#27: FILE: slirp/tcp_subr.c:157:
+^I^Ibreak;$

ERROR: code indent should never use tabs
#28: FILE: slirp/tcp_subr.c:158:
+^I^Idefault:$

ERROR: code indent should never use tabs
#29: FILE: slirp/tcp_subr.c:159:
+^I^Ig_assert_not_reached();$

ERROR: code indent should never use tabs
#30: FILE: slirp/tcp_subr.c:160:
+^I^I}$

total: 10 errors, 0 warnings, 17 lines checked

Your patch has style problems, please review. If any of these errors are
false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1

---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org
Re: [Qemu-devel] [PULL 0/1] slirp update
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PULL 0/1] slirp update
Type: series
Message-id: 20171109180142.13923-1-samuel.thiba...@ens-lyon.org

=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]  patchew/20171109180142.13923-1-samuel.thiba...@ens-lyon.org -> patchew/20171109180142.13923-1-samuel.thiba...@ens-lyon.org
 * [new tag]  patchew/20171109181741.31318-1-lep...@google.com -> patchew/20171109181741.31318-1-lep...@google.com
Switched to a new branch 'test'
9f23f4da08 slirp: don't zero the whole ti_i when m == NULL

=== OUTPUT BEGIN ===
Checking PATCH 1/1: slirp: don't zero the whole ti_i when m == NULL...
ERROR: code indent should never use tabs
#29: FILE: slirp/tcp_subr.c:151:
+^I^Iswitch (af) {$

ERROR: code indent should never use tabs
#30: FILE: slirp/tcp_subr.c:152:
+^I^Icase AF_INET:$

ERROR: code indent should never use tabs
#31: FILE: slirp/tcp_subr.c:153:
+^I^Iti->ti.ti_i4.ih_x1 = 0;$

ERROR: code indent should never use tabs
#32: FILE: slirp/tcp_subr.c:154:
+^I^Ibreak;$

ERROR: code indent should never use tabs
#33: FILE: slirp/tcp_subr.c:155:
+^I^Icase AF_INET6:$

ERROR: code indent should never use tabs
#34: FILE: slirp/tcp_subr.c:156:
+^I^Iti->ti.ti_i6.ih_x1 = 0;$

ERROR: code indent should never use tabs
#35: FILE: slirp/tcp_subr.c:157:
+^I^Ibreak;$

ERROR: code indent should never use tabs
#36: FILE: slirp/tcp_subr.c:158:
+^I^Idefault:$

ERROR: code indent should never use tabs
#37: FILE: slirp/tcp_subr.c:159:
+^I^Ig_assert_not_reached();$

ERROR: code indent should never use tabs
#38: FILE: slirp/tcp_subr.c:160:
+^I^I}$

total: 10 errors, 0 warnings, 17 lines checked

Your patch has style problems, please review. If any of these errors are
false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1

---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org
Re: [Qemu-devel] [PATCH v2] aarch64: advertise the GIC system register interface
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH v2] aarch64: advertise the GIC system register interface
Type: series
Message-id: alpine.DEB.2.10.1711061412330.30448@sstabellini-ThinkPad-X260

=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]  patchew/alpine.DEB.2.10.1711061412330.30448@sstabellini-ThinkPad-X260 -> patchew/alpine.DEB.2.10.1711061412330.30448@sstabellini-ThinkPad-X260
Switched to a new branch 'test'
d3e1cec549 aarch64: advertise the GIC system register interface

=== OUTPUT BEGIN ===
Checking PATCH 1/1: aarch64: advertise the GIC system register interface...
ERROR: spaces required around that '|' (ctx:VxV)
#67: FILE: target/arm/helper.c:4690:
+ .resetvalue = cpu->gicv3_sysregs ? (cpu->id_aa64pfr0|0x0100) :
^

total: 1 errors, 0 warnings, 34 lines checked

Your patch has style problems, please review. If any of these errors are
false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1

---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org
Re: [Qemu-devel] [PULL, 08/14] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)
On Wed, Nov 08, 2017 at 09:52:12AM +1100, Alexey Kardashevskiy wrote:
> On 07/11/17 19:42, Laurent Vivier wrote:
> > On 05/07/2016 07:31, David Gibson wrote:
> >> From: Alexey Kardashevskiy
> >>
> >> This adds support for the Dynamic DMA Windows (DDW) option defined by
> >> the SPAPR specification, which allows additional DMA window(s).
> >>
> >> The "ddw" property is enabled by default on a PHB, but for compatibility
> >> the pseries-2.6 machine and older disable it.
> >> This also creates a single DMA window for the older machines to
> >> maintain backward migration.
> >>
> >> This implements DDW for PHBs with emulated and VFIO devices. The host
> >> kernel support is required. The advertised IOMMU page sizes are 4K and
> >> 64K; 16M pages are supported but not advertised by default. In order to
> >> enable them, the user has to specify the "pgsz" property for the PHB and
> >> enable huge pages for RAM.
> >
> > Why is it not advertised by default?
>
> I do not remember clearly, but this kind of automation is usually less
> manageable. What if we do not want huge IOMMU pages for some reason?

The reason is basically migration. Because of the compatibility requirements
for that, we generally try to avoid making any guest-visible changes to the
virtual machine based on host capabilities or configuration, because it makes
migration a total nightmare.

That said, in this case the difference we'd need here - availability of
hugepages in the guest - is *already* guest visible for the normal (non-IOMMU)
page mapping. So maybe automatically changing the IOMMU mask as well would be
ok. Or at least no worse than what we have already.

> > When we start qemu with hugepage memory ("mount -t hugetlbfs none
> > /mnt/kvm_hugepage" and ".. -mem-path /mnt/kvm_hugepage .."), we have an
> > ugly message:
> >
> > "qemu-kvm: System page size 0x100 is not enabled in page_size_mask
> > (0x11000).
Performance may be slow"
> >
> > I understand if we want to use this with VFIO, we need something like
> > "-global spapr-pci-host-bridge.pgsz=0x1011000".
> >
> > But is it needed if we don't use VFIO?
>
> Yes, TCE tables are still created in KVM so the size matters.
>
> > Is it a way QEMU adds automatically the 0x100 mask to page_size_mask?
>
> No. That thing which decides about -mem-path should also add the pgsz mask.

--
David Gibson                   | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
                               | _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [Qemu-devel] kvm: virtio-net: saved image requires TUN_F_UFO support
On 2017年11月08日 19:22, Jason Wang wrote: On 2017年11月08日 18:46, Paolo Bonzini wrote: On 08/11/2017 09:21, Jason Wang wrote: On 2017年11月08日 17:05, Stefan Priebe - Profihost AG wrote: Am 08.11.2017 um 08:54 schrieb Jason Wang: On 2017年11月08日 15:41, Stefan Priebe - Profihost AG wrote: Hi Paolo, Am 06.11.2017 um 12:27 schrieb Paolo Bonzini: On 06/11/2017 12:09, Stefan Priebe - Profihost AG wrote: HI Paolo, could this patchset be related? Uh oh, yes it should. Jason, any ways to fix it? I suppose we need to disable UFO in the newest machine types, but do we also have to do (software) UFO in vhost-net and QEMU for migration compatibility? Any news on this? Greets, Since we probe UFO support, it will be disabled automatically on device startup. For the issue of migration, I think the only way is trying to fix it in kernel. But 4.14 is around the corner and the patchset is already included. Shouldn't this get fixed before 4.14 is released? We will try to seek a solution as soon as possible. If we can't catch 4.14, we will do it for stable for sure. Jason, can you write to netdev and Cc Linus and me? Thanks, Paolo Paolo, see this https://www.spinics.net/lists/netdev/msg465454.html Just notice this today since I'm not on the cc list. Thanks An update: After some discussions on netdev, we will try to bring UFO back partially for just TAP. Willem promise to fix this. 4.14 is too late consider the fix is not trivial, it will go through stable tree. Thanks
Re: [Qemu-devel] [Qemu-block] segfault in parallel blockjobs (iotest 30)
On Thu, 11/09 17:26, Alberto Garcia wrote: > On Wed 08 Nov 2017 03:45:38 PM CET, Alberto Garcia wrote: > > > I'm thinking that perhaps we should add the pause point directly to > > block_job_defer_to_main_loop(), to prevent any block job from running > > the exit function when it's paused. > > I was trying this and unfortunately this breaks the mirror job at least > (reproduced with a simple block-commit on the topmost node, which uses > commit_active_start() -> mirror_start_job()). > > So what happens is that mirror_run() always calls bdrv_drained_begin() > before returning, pausing the block job. The closing bdrv_drained_end() > call is at the end of mirror_exit(), already in the main loop. > > So the mirror job is always calling block_job_defer_to_main_loop() and > mirror_exit() when the job is paused. > FWIW, I think Max's report on 194 failures is related: https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01822.html so perhaps it's worth testing his patch too: https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01835.html Fam
Re: [Qemu-devel] [PATCH for-2.11] block: Keep strong reference when draining all BDS
On Thu, 11/09 21:43, Max Reitz wrote: > Draining a BDS may lead to graph modifications, which in turn may result > in it and other BDS being stripped of their current references. If > bdrv_drain_all_begin() and bdrv_drain_all_end() do not keep strong > references themselves, the BDS they are trying to drain (or undrain) may > disappear right under their feet -- or, more specifically, under the > feet of BDRV_POLL_WHILE() in bdrv_drain_recurse(). > > This fixes an occasional hang of iotest 194. > > Signed-off-by: Max Reitz> --- > block/io.c | 47 --- > 1 file changed, 44 insertions(+), 3 deletions(-) > > diff --git a/block/io.c b/block/io.c > index 3d5ef2cabe..a0a2833e8e 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -340,7 +340,10 @@ void bdrv_drain_all_begin(void) > bool waited = true; > BlockDriverState *bs; > BdrvNextIterator it; > -GSList *aio_ctxs = NULL, *ctx; > +GSList *aio_ctxs = NULL, *ctx, *bs_list = NULL, *bs_list_entry; > + > +/* Must be called from the main loop */ > +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > > block_job_pause_all(); > > @@ -355,6 +358,12 @@ void bdrv_drain_all_begin(void) > if (!g_slist_find(aio_ctxs, aio_context)) { > aio_ctxs = g_slist_prepend(aio_ctxs, aio_context); > } > + > +/* Keep a strong reference to all root BDS and copy them into > + * an own list because draining them may lead to graph > + * modifications. 
*/ > +bdrv_ref(bs); > +bs_list = g_slist_prepend(bs_list, bs); > } > > /* Note that completion of an asynchronous I/O operation can trigger any > @@ -370,7 +379,11 @@ void bdrv_drain_all_begin(void) > AioContext *aio_context = ctx->data; > > aio_context_acquire(aio_context); > -for (bs = bdrv_first(); bs; bs = bdrv_next()) { > +for (bs_list_entry = bs_list; bs_list_entry; > + bs_list_entry = bs_list_entry->next) > +{ > +bs = bs_list_entry->data; > + > if (aio_context == bdrv_get_aio_context(bs)) { > waited |= bdrv_drain_recurse(bs, true); > } > @@ -379,24 +392,52 @@ void bdrv_drain_all_begin(void) > } > } > > +for (bs_list_entry = bs_list; bs_list_entry; > + bs_list_entry = bs_list_entry->next) > +{ > +bdrv_unref(bs_list_entry->data); > +} > + > g_slist_free(aio_ctxs); > +g_slist_free(bs_list); > } > > void bdrv_drain_all_end(void) > { > BlockDriverState *bs; > BdrvNextIterator it; > +GSList *bs_list = NULL, *bs_list_entry; > + > +/* Must be called from the main loop */ > +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > > +/* Keep a strong reference to all root BDS and copy them into an > + * own list because draining them may lead to graph modifications. > + */ > for (bs = bdrv_first(); bs; bs = bdrv_next()) { > -AioContext *aio_context = bdrv_get_aio_context(bs); > +bdrv_ref(bs); > +bs_list = g_slist_prepend(bs_list, bs); > +} > + > +for (bs_list_entry = bs_list; bs_list_entry; > + bs_list_entry = bs_list_entry->next) > +{ > +AioContext *aio_context; > + > +bs = bs_list_entry->data; > +aio_context = bdrv_get_aio_context(bs); > > aio_context_acquire(aio_context); > aio_enable_external(aio_context); > bdrv_parent_drained_end(bs); > bdrv_drain_recurse(bs, false); > aio_context_release(aio_context); > + > +bdrv_unref(bs); > } > > +g_slist_free(bs_list); > + > block_job_resume_all(); > } > > -- > 2.13.6 > > It is better to put the references into BdrvNextIterator and introduce bdrv_next_iterator_destroy() to free them? 
You'll need to touch all callers because it is not C++, but it secures all of rest, which seems vulnerable in the same pattern, for example the aio_poll() in iothread_stop_all(). Fam
Re: [Qemu-devel] Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage migration)
On Thu, 11/09 20:31, Max Reitz wrote: > On 2017-11-09 16:30, Fam Zheng wrote: > > On Thu, 11/09 16:14, Max Reitz wrote: > >> On 2017-11-09 05:21, Fam Zheng wrote: > >>> On Thu, 11/09 01:48, Max Reitz wrote: > Hi, > > More exciting news from the bdrv_drain() front! > > I've noticed in the past that iotest 194 sometimes hangs. I usually run > the tests on tmpfs, but I've just now verified that it happens on my SSD > just as well. > > So the reproducer is a plain: > > while ./check -raw 194; do; done > >>> > >>> I cannot produce it on my machine. > >> > >> Hm, too bad. I see it both on my work laptop (with Fedora) and my > >> desktop (with Arch)... > >> > (No difference between raw or qcow2, though.) > > And then, after a couple of runs (or a couple ten), it will just hang. > The reason is that the source VM lingers around and doesn't quit > voluntarily -- the test itself was successful, but it just can't exit. > > If you force it to exit by killing the VM (e.g. through pkill -11 qemu), > this is the backtrace: > > #0 0x7f7cfc297e06 in ppoll () at /lib64/libc.so.6 > #1 0x563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, > __nfds=, __fds=) at > /usr/include/bits/poll2.h:77 > >>> > >>> Looking at the 0 timeout it seems we are in the aio_poll(ctx, > >>> blocking=false); > >>> branches of BDRV_POLL_WHILE? Is it a busy loop? If so I wonder what is > >>> making > >>> progress and causing the return value to be true in aio_poll(). 
> >>> > #2 0x563b846bcac9 in qemu_poll_ns (fds=, > nfds=, timeout=) at util/qemu-timer.c:322 > #3 0x563b846be711 in aio_poll (ctx=ctx@entry=0x563b856e3e80, > blocking=) at util/aio-posix.c:629 > #4 0x563b8463afa4 in bdrv_drain_recurse > (bs=bs@entry=0x563b865568a0, begin=begin@entry=true) at block/io.c:201 > #5 0x563b8463baff in bdrv_drain_all_begin () at block/io.c:381 > #6 0x563b8463bc99 in bdrv_drain_all () at block/io.c:411 > #7 0x563b8459888b in block_migration_cleanup (opaque= out>) at migration/block.c:714 > #8 0x563b845883be in qemu_savevm_state_cleanup () at > migration/savevm.c:1251 > #9 0x563b845811fd in migration_thread (opaque=0x563b856f1da0) at > migration/migration.c:2298 > #10 0x7f7cfc56f36d in start_thread () at /lib64/libpthread.so.0 > #11 0x7f7cfc2a3e1f in clone () at /lib64/libc.so.6 > > > And when you make bdrv_drain_all_begin() print what we are trying to > drain, you can see that it's the format node (managed by the "raw" > driver in this case). > >>> > >>> So what is the value of bs->in_flight? > >> > >> gdb: > >>> (gdb) print bs->in_flight > >>> $3 = 2307492233 > >> > >> "That's weird, why would it..." 
> >> > >>> (gdb) print *bs > >>> $4 = {open_flags = -1202160144, read_only = 161, encrypted = 85, sg = > >>> false, probed = false, force_share = 96, implicit = 159, drv = 0x0, > >>> opaque = 0x0, aio_context = 0x8989898989898989, aio_notifiers = {lh_first > >>> = 0x8989898989898989}, > >>> walking_aio_notifiers = 137, filename = '\211' , > >>> backing_file = '\211' , backing_format = '\211' > >>> , full_open_options = 0x8989898989898989, > >>> exact_filename = '\211' , backing = > >>> 0x8989898989898989, file = 0x8989898989898989, bl = {request_alignment = > >>> 2307492233, max_pdiscard = -1987475063, pdiscard_alignment = 2307492233, > >>> max_pwrite_zeroes = -1987475063, pwrite_zeroes_alignment = > >>> 2307492233, opt_transfer = 2307492233, max_transfer = 2307492233, > >>> min_mem_alignment = 9910603678816504201, opt_mem_alignment = > >>> 9910603678816504201, max_iov = -1987475063}, > >>> supported_write_flags = 2307492233, supported_zero_flags = 2307492233, > >>> node_name = '\211' , node_list = {tqe_next = > >>> 0x8989898989898989, tqe_prev = 0x8989898989898989}, bs_list = {tqe_next = > >>> 0x8989898989898989, > >>> tqe_prev = 0x8989898989898989}, monitor_list = {tqe_next = > >>> 0x8989898989898989, tqe_prev = 0x8989898989898989}, refcnt = -1987475063, > >>> op_blockers = {{lh_first = 0x8989898989898989} }, job = > >>> 0x8989898989898989, > >>> inherits_from = 0x8989898989898989, children = {lh_first = > >>> 0x8989898989898989}, parents = {lh_first = 0x8989898989898989}, options = > >>> 0x8989898989898989, explicit_options = 0x8989898989898989, detect_zeroes > >>> = 2307492233, > >>> backing_blocker = 0x8989898989898989, total_sectors = > >>> -8536140394893047415, before_write_notifiers = {notifiers = {lh_first = > >>> 0x8989898989898989}}, write_threshold_offset = 9910603678816504201, > >>> write_threshold_notifier = {notify = > >>> 0x8989898989898989, node = {le_next = 0x8989898989898989, le_prev = > >>> 0x8989898989898989}}, dirty_bitmap_mutex = {lock = 
{__data = {__lock = > >>> -1987475063, __count = 2307492233, __owner =
Re: [Qemu-devel] [PATCH v2 1/1] Makefile: Capstone: Add support for cross compile ranlib
On Thu, Nov 9, 2017 at 3:09 AM, Peter Maydellwrote: > On 8 November 2017 at 15:10, Alistair Francis > wrote: >> When cross compiling QEMU for Windows we need to specify the cross >> version of ranlib to avoid build errors when building capstone. This >> patch ensures we use the same cross prefix on ranlib as other toolchain >> components. >> >> This patch fixes this build error when building on RHEL (or Fedora) with >> MinGW: >> /scratch/jenkins_tmp/workspace/QEMU-BuildAll/label/ssw_rhel7/target/windows/build-windows/capstone/capstone.lib: >> error adding symbols: Archive has no index; run ranlib to add one >> collect2: error: ld returned 1 exit status >> make: *** [qemu-io.exe] Error 1 >> >> Signed-off-by: Alistair Francis >> Suggested-by: Peter Maydell >> Reviewed-by: Philippe Mathieu-Daud >> Tested-by: Philippe Mathieu-Daud > > Something in your patch handling workflow seems to be > breaking non-ASCII characters, and it has dropped the > last character in Philippe's surname. It would be good > if you could look into fixing this -- we have quite a > few contributors whose names are not representable > purely in ASCII. I think it's the way that I copy lines into the patches. I'll be sure to make sure I get all the characters next time. Thanks for pointing it out. Alistair > > thanks > -- PMM
Re: [Qemu-devel] [PATCH v8 01/14] block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()
On 10/30/2017 12:32 PM, Vladimir Sementsov-Ogievskiy wrote: > Enabling bitmap successor is necessary to enable successors of bitmaps > being migrated before target vm start. > > Signed-off-by: Vladimir Sementsov-Ogievskiy> --- > include/block/dirty-bitmap.h | 1 + > block/dirty-bitmap.c | 8 > 2 files changed, 9 insertions(+) > > diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h > index 3579a7597c..93d4336505 100644 > --- a/include/block/dirty-bitmap.h > +++ b/include/block/dirty-bitmap.h > @@ -20,6 +20,7 @@ BdrvDirtyBitmap > *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs, > BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs, > BdrvDirtyBitmap *bitmap, > Error **errp); > +void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap); > BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, > const char *name); > void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap); > diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c > index bd04e991b1..81adbeb6d4 100644 > --- a/block/dirty-bitmap.c > +++ b/block/dirty-bitmap.c > @@ -237,6 +237,14 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState > *bs, > return 0; > } > > +/* Called with BQL taken. */ > +void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap) > +{ > +qemu_mutex_lock(bitmap->mutex); > +bdrv_enable_dirty_bitmap(bitmap->successor); > +qemu_mutex_unlock(bitmap->mutex); > +} > + > /** > * For a bitmap with a successor, yield our name to the successor, > * delete the old bitmap, and return a handle to the new bitmap. > Mechanically correct; though I haven't looked forward to how it's used yet. Reviewed-by: John Snow
[Qemu-devel] [PATCH] thread-posix: fix qemu_rec_mutex_trylock macro
We never noticed because it has no users.

Signed-off-by: Emilio G. Cota
---
 include/qemu/thread-posix.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index f4296d3..f3f47e4 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -7,7 +7,7 @@ typedef QemuMutex QemuRecMutex;
 #define qemu_rec_mutex_destroy qemu_mutex_destroy
 #define qemu_rec_mutex_lock qemu_mutex_lock
-#define qemu_rec_mutex_try_lock qemu_mutex_try_lock
+#define qemu_rec_mutex_trylock qemu_mutex_trylock
 #define qemu_rec_mutex_unlock qemu_mutex_unlock

 struct QemuMutex {
--
2.7.4
Re: [Qemu-devel] [Qemu devel PATCH] MAINTAINERS: Add entries for Smartfusion2
On 11/09/2017 08:55 PM, Peter Maydell wrote:
> On 9 November 2017 at 21:46, Philippe Mathieu-Daudé wrote:
>> Hi Subbaraya,
>>
>> On 11/09/2017 09:02 AM, Subbaraya Sundeep wrote:
>>> add voluntarily myself as maintainer for Smartfusion2
>>
>> You need to share your GnuPG key signed, I couldn't find it using
>> http://pgp.mit.edu/pks/lookup?search=Subbaraya+Sundeep
>>
>> from https://wiki.qemu.org/Contribute/SubmitAPullRequest :
>
> I don't in general expect to take pull requests from
> everybody listed as a maintainer in the MAINTAINERS file.
> That just means "I'm going to be reviewing and should
> be cc'd on patches". Pull requests are sent by people
> who are maintainers for a subsystem. Rule of thumb:
> unless somebody asks you to send a pull request, you
> don't need to do it.

OK, please excuse my misunderstanding. I still think the M: entry stands for
'Maintainer' rather than 'Mail', and I still don't understand the difference
from a "Designated reviewer" (R: entry):

  M: Mail patches to: FullName
  R: Designated reviewer: FullName
     These reviewers should be CCed on patches.

"Designated reviewer" seems to duplicate the M: entry and is therefore
confusing. Can we simply remove it instead?

When it was introduced in fdf6fab4df4, the explanation was:

--
Some people are not content with the amount of mail they get, and would
like to be CCed on patches for areas they do not maintain. Let them
satisfy their own appetite for qemu-devel messages.

Seriously: the purpose here is a bit different from the Linux kernel.
While Linux uses "R" to designate non-maintainers for reviewing patches
in a given area, in QEMU I would also like to use "R" so that people can
delegate sending pull requests while keeping some degree of oversight.
--

Regards,

Phil.
Re: [Qemu-devel] [Qemu devel PATCH] MAINTAINERS: Add entries for Smartfusion2
On 9 November 2017 at 21:46, Philippe Mathieu-Daudéwrote: > Hi Subbaraya, > > On 11/09/2017 09:02 AM, Subbaraya Sundeep wrote: >> add voluntarily myself as maintainer for Smartfusion2 > > You need to share your GnuPG key signed, I couldn't find it using > http://pgp.mit.edu/pks/lookup?search=Subbaraya+Sundeep > > from https://wiki.qemu.org/Contribute/SubmitAPullRequest : I don't in general expect to take pull requests from everybody listed as a maintainer in the MAINTAINERS file. That just means "I'm going to be reviewing and should be cc'd on patches". Pull requests are sent by people who are maintainers for a subsystem. Rule of thumb: unless somebody asks you to send a pull request, you don't need to do it. thanks -- PMM
Re: [Qemu-devel] [PATCH for-2.11] iotests: Use new-style NBD connections
On 2017-11-09 23:12, Eric Blake wrote:
> Old-style NBD is deprecated upstream (it is documented, but no
> longer implemented in the reference implementation), and it is
> severely limited (it cannot support structured replies, which
> means it cannot support efficient handling of zeroes), when
> compared to new-style NBD. We are better off having our iotests
> favor new-style everywhere (although some explicit tests,
> particularly 83, still cover old-style for back-compat reasons);
> this is as simple as supplying the empty string as the default
> export name, as it does not change the URI needed to connect a
> client to the server. This also gives us more coverage of the
> just-added structured reply code, when not overriding $QEMU_NBD
> to intentionally point to an older server.
>
> Signed-off-by: Eric Blake
> ---
>
> Proposing this for 2.11; it can either go in through the NBD
> tree (although I just sent my 2.11-rc1 pull request) or through
> Max' iotest tree.

random.org said 1, so:

Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max
[Qemu-devel] [Bug 1731347] [NEW] VFIO Passthrough of SAS2008-based HBA card fails on E3-1225v3 due to failed DMA mapping (-14)
Public bug reported:

There is a bug preventing multiple people with my combination of hardware
from using PCI passthrough. I am not actually sure whether the bug is in
kernel/kvm, vfio or qemu; however, as qemu is the highest-level of these, I
am reporting the bug here as you will likely know better where the origin of
the bug may be found.

When attempting to pass through this device to a KVM using VFIO, this
results in error -14 (Bad Address):

# qemu-system-x86_64 -enable-kvm -m 10G -net none -monitor stdio -serial
# none -parallel none -vnc :1 -device vfio-pci,host=1:00.0 -S
QEMU 2.9.1 monitor - type 'help' for more information
(qemu) c
(qemu) qemu-system-x86_64: VFIO_MAP_DMA: -14
qemu-system-x86_64: vfio_dma_map(0x7f548f0a1fc0, 0xfebd, 0x2000, 0x7f54a909d000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue

See also:
https://bugzilla.proxmox.com/show_bug.cgi?id=1556
https://www.redhat.com/archives/vfio-users/2016-May/msg00088.html

This has occurred on Proxmox (Proxmox and Debian packages, Ubuntu kernel),
Ubuntu, and pure Debian packages and kernel on Proxmox.

However, this error reportedly does NOT occur for:
- different distributions(!) (Fedora 24, 25)
- different HBA cards (SAS2308, SAS3008)
- different CPU (E3-1220v5)

I would be thankful for any input and I'll be happy to provide any further
info necessary. This is my first time delving this deep into anything close
to the kernel.

Thanks and best regards,
Johannes Falke

** Affects: qemu
   Importance: Undecided
   Status: New

** Attachment added: "strace.txt"
   https://bugs.launchpad.net/bugs/1731347/+attachment/5006683/+files/strace.txt

--
You received this bug notification because you are a member of qemu-devel-ml,
which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1731347

Title:
  VFIO Passthrough of SAS2008-based HBA card fails on E3-1225v3 due to
  failed DMA mapping (-14)

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1731347/+subscriptions
Re: [Qemu-devel] [PATCH v2 0/5] iotests: Make some tests less flaky
On 2017-11-09 21:30, Max Reitz wrote: > There are a couple of tests that fail (on my machine) from time to > time (and by that I mean that recently I've rarely ever had a test run > with both 083 and 136 working on first try). > This series should fix most (at least the issues I am aware of). Fixed the commit message in patch 4 as suggested by Eric and applied to my block branch: https://github.com/XanClic/qemu/commits/block Max
[Qemu-devel] [PATCH for-2.11] iotests: Use new-style NBD connections
Old-style NBD is deprecated upstream (it is documented, but no longer implemented in the reference implementation), and it is severely limited (it cannot support structured replies, which means it cannot support efficient handling of zeroes), when compared to new-style NBD. We are better off having our iotests favor new-style everywhere (although some explicit tests, particularly 83, still cover old-style for back-compat reasons); this is as simple as supplying the empty string as the default export name, as it does not change the URI needed to connect a client to the server. This also gives us more coverage of the just-added structured reply code, when not overriding $QEMU_NBD to intentionally point to an older server. Signed-off-by: Eric Blake --- Proposing this for 2.11; it can either go in through the NBD tree (although I just sent my 2.11-rc1 pull request) or through Max' iotest tree. tests/qemu-iotests/common.rc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index 0e8a33c696..dbae7d74ba 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -242,7 +242,7 @@ _make_test_img() if [ $IMGPROTO = "nbd" ]; then # Pass a sufficiently high number to -e that should be enough for all # tests -eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810 -f $IMGFMT -e 42 $TEST_IMG_FILE >/dev/null &" +eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810 -f $IMGFMT -e 42 -x '' $TEST_IMG_FILE >/dev/null &" sleep 1 # FIXME: qemu-nbd needs to be listening before we continue fi -- 2.13.6
[Qemu-devel] [PATCH for 2.11 v2 2/2] xlnx-zcu102: Add an info message deprecating the EP108
The EP108 was an early-access development board that is no longer used. Add an info message to convert any users to the ZCU102 instead. On QEMU they are both identical. This patch also updates the qemu-doc.texi file to indicate that the EP108 has been deprecated. Signed-off-by: Alistair Francis --- V2: - Update the qemu-doc.texi file hw/arm/xlnx-zcu102.c | 3 +++ qemu-doc.texi| 7 +++ 2 files changed, 10 insertions(+) diff --git a/hw/arm/xlnx-zcu102.c b/hw/arm/xlnx-zcu102.c index 7ec03dad42..a234a1 100644 --- a/hw/arm/xlnx-zcu102.c +++ b/hw/arm/xlnx-zcu102.c @@ -164,6 +164,9 @@ static void xlnx_ep108_init(MachineState *machine) { XlnxZCU102 *s = EP108_MACHINE(machine); +info_report("The Xilinx EP108 machine is deprecated, please use the " +"ZCU102 machine instead. It has the same features supported."); + xlnx_zynqmp_init(s, machine); } diff --git a/qemu-doc.texi b/qemu-doc.texi index 8c10956a66..d383ac44d4 100644 --- a/qemu-doc.texi +++ b/qemu-doc.texi @@ -2537,6 +2537,13 @@ or ``ivshmem-doorbell`` device types. The ``spapr-pci-vfio-host-bridge'' device type is replaced by the ``spapr-pci-host-bridge'' device type. +@section System emulator machines + +@subsection Xilinx EP108 (since 2.11.0) + +The ``xlnx-ep108'' machine has been replaced by the ``xlnx-zcu102'' machine. +The ``xlnx-zcu102'' machine has the same features and capabilities in QEMU. + @node License @appendix License -- 2.14.1
[Qemu-devel] [PATCH for 2.11 v2 0/2] Xilinx ZCU102 fixes for 2.11
These are two small fixes for 2.11. V2: - Update qemu-doc.texi Alistair Francis (2): xlnx-zynqmp: Properly support the smp command line option xlnx-zcu102: Add an info message deprecating the EP108 hw/arm/xlnx-zcu102.c | 6 +- hw/arm/xlnx-zynqmp.c | 26 -- qemu-doc.texi| 7 +++ 3 files changed, 28 insertions(+), 11 deletions(-) -- 2.14.1
Re: [Qemu-devel] [Qemu devel PATCH] MAINTAINERS: Add entries for Smartfusion2
Hi Subbaraya, On 11/09/2017 09:02 AM, Subbaraya Sundeep wrote: > add voluntarily myself as maintainer for Smartfusion2 You need to get your GnuPG key signed, I couldn't find it using http://pgp.mit.edu/pks/lookup?search=Subbaraya+Sundeep from https://wiki.qemu.org/Contribute/SubmitAPullRequest : All pull requests must be signed. If your key is not already signed by members of the QEMU community, you should make arrangements to attend a KeySigningParty (for example at KVM Forum) or make alternative arrangements to have your key signed by an attendee. Key signing requires meeting another community member *in person* so please make appropriate arrangements. IMHO it would be great if you found some time to review some patches (qemu-...@nongnu.org is not that verbose): https://wiki.qemu.org/Contribute/SubmitAPatch#Return_the_favor > > Signed-off-by: Subbaraya Sundeep > --- > MAINTAINERS | 17 + > 1 file changed, 17 insertions(+) > > diff --git a/MAINTAINERS b/MAINTAINERS > index 0cd4d02..dae08bd 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -564,6 +564,23 @@ M: Alistair Francis > S: Maintained > F: hw/arm/netduino2.c > > +MSF2 SoC SmartFusion2 > +M: Subbaraya Sundeep > +S: Maintained > +F: hw/arm/msf2-soc.c > +F: hw/misc/msf2-sysreg.c > +F: hw/timer/mss-timer.c > +F: hw/ssi/mss-spi.c > +F: include/hw/arm/msf2-soc.h > +F: include/hw/misc/msf2-sysreg.h > +F: include/hw/timer/mss-timer.h > +F: include/hw/ssi/mss-spi.h > + Emcraft M2S-FG484 > +MSF2 SOM board > +M: Subbaraya Sundeep > +S: Maintained > +F: hw/arm/msf2-som.c > + > CRIS Machines > - > Axis Dev88 Regards, Phil.
Re: [Qemu-devel] [Qemu-block] [PATCH] blockdev-backup: enable non-root nodes for backup
On 11/09/2017 09:16 AM, Vladimir Sementsov-Ogievskiy wrote: > What was the reason to abandon non-root nodes? Eric had it correct: we were never convinced it would work properly, so we went with a smaller set.
Re: [Qemu-devel] [PATCH for-2.11] block: Keep strong reference when draining all BDS
On 11/09/2017 02:43 PM, Max Reitz wrote: > Draining a BDS may lead to graph modifications, which in turn may result > in it and other BDS being stripped of their current references. If > bdrv_drain_all_begin() and bdrv_drain_all_end() do not keep strong > references themselves, the BDS they are trying to drain (or undrain) may > disappear right under their feet -- or, more specifically, under the > feet of BDRV_POLL_WHILE() in bdrv_drain_recurse(). > > This fixes an occasional hang of iotest 194. > > Signed-off-by: Max Reitz > --- > block/io.c | 47 --- > 1 file changed, 44 insertions(+), 3 deletions(-) > + > +/* Keep a strong reference to all root BDS and copy them into > + * an own list because draining them may lead to graph 'an own' sounds awkward; maybe 'copy them into a local list' > + * modifications. */ > +bdrv_ref(bs); > +bs_list = g_slist_prepend(bs_list, bs); > } > void bdrv_drain_all_end(void) > { > BlockDriverState *bs; > BdrvNextIterator it; > +GSList *bs_list = NULL, *bs_list_entry; > + > +/* Must be called from the main loop */ > +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > > +/* Keep a strong reference to all root BDS and copy them into an > + * own list because draining them may lead to graph modifications. And again. With that tweak, Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v2 4/5] iotests: Make 083 less flaky
On 2017-11-09 21:58, Eric Blake wrote: > On 11/09/2017 02:30 PM, Max Reitz wrote: >> 083 has (at least) two issues: >> >> 1. By launching the nbd-fault-injector in background, it may not be >>scheduled until the first grep on its output file is executed. >>However, until then, that file may not have been created yet -- so it >>either does not exist yet (thus making the grep emit an error), or it >>does exist but contains stale data (thus making the rest of the test >>case connect to a wrong address). >>Fix this by explicitly overwriting the output file before executing >>nbd-fault-injector. >> >> 2. The nbd-fault-injector prints things other than "Listening on...". >>It also prints a "Closing connection" message from time to time. We >>currently invoke sed on the whole file in the hope of it only >>containing the "Listening on..." line yet. That hope is sometimes >>shattered by the brutal reality of race conditions, so invoke grep >>before sed. > > Comment is now stale; s/invoke grep before sed/make the sed script more > robust/ *sigh* It appears my hope of easily fixing patches is often shattered by the brutal reality, too. > Reviewed-by: Eric Blake Thanks! I'll fix the sentence when applying the series. Max
Re: [Qemu-devel] [PATCH v2 4/5] iotests: Make 083 less flaky
On 11/09/2017 02:30 PM, Max Reitz wrote: > 083 has (at least) two issues: > > 1. By launching the nbd-fault-injector in background, it may not be >scheduled until the first grep on its output file is executed. >However, until then, that file may not have been created yet -- so it >either does not exist yet (thus making the grep emit an error), or it >does exist but contains stale data (thus making the rest of the test >case connect to a wrong address). >Fix this by explicitly overwriting the output file before executing >nbd-fault-injector. > > 2. The nbd-fault-injector prints things other than "Listening on...". >It also prints a "Closing connection" message from time to time. We >currently invoke sed on the whole file in the hope of it only >containing the "Listening on..." line yet. That hope is sometimes >shattered by the brutal reality of race conditions, so invoke grep >before sed. Comment is now stale; s/invoke grep before sed/make the sed script more robust/ Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
[Qemu-devel] [PATCH for-2.11] block: Keep strong reference when draining all BDS
Draining a BDS may lead to graph modifications, which in turn may result in it and other BDS being stripped of their current references. If bdrv_drain_all_begin() and bdrv_drain_all_end() do not keep strong references themselves, the BDS they are trying to drain (or undrain) may disappear right under their feet -- or, more specifically, under the feet of BDRV_POLL_WHILE() in bdrv_drain_recurse(). This fixes an occasional hang of iotest 194. Signed-off-by: Max Reitz--- block/io.c | 47 --- 1 file changed, 44 insertions(+), 3 deletions(-) diff --git a/block/io.c b/block/io.c index 3d5ef2cabe..a0a2833e8e 100644 --- a/block/io.c +++ b/block/io.c @@ -340,7 +340,10 @@ void bdrv_drain_all_begin(void) bool waited = true; BlockDriverState *bs; BdrvNextIterator it; -GSList *aio_ctxs = NULL, *ctx; +GSList *aio_ctxs = NULL, *ctx, *bs_list = NULL, *bs_list_entry; + +/* Must be called from the main loop */ +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); block_job_pause_all(); @@ -355,6 +358,12 @@ void bdrv_drain_all_begin(void) if (!g_slist_find(aio_ctxs, aio_context)) { aio_ctxs = g_slist_prepend(aio_ctxs, aio_context); } + +/* Keep a strong reference to all root BDS and copy them into + * an own list because draining them may lead to graph + * modifications. 
*/ +bdrv_ref(bs); +bs_list = g_slist_prepend(bs_list, bs); } /* Note that completion of an asynchronous I/O operation can trigger any @@ -370,7 +379,11 @@ void bdrv_drain_all_begin(void) AioContext *aio_context = ctx->data; aio_context_acquire(aio_context); -for (bs = bdrv_first(); bs; bs = bdrv_next()) { +for (bs_list_entry = bs_list; bs_list_entry; + bs_list_entry = bs_list_entry->next) +{ +bs = bs_list_entry->data; + if (aio_context == bdrv_get_aio_context(bs)) { waited |= bdrv_drain_recurse(bs, true); } @@ -379,24 +392,52 @@ void bdrv_drain_all_begin(void) } } +for (bs_list_entry = bs_list; bs_list_entry; + bs_list_entry = bs_list_entry->next) +{ +bdrv_unref(bs_list_entry->data); +} + g_slist_free(aio_ctxs); +g_slist_free(bs_list); } void bdrv_drain_all_end(void) { BlockDriverState *bs; BdrvNextIterator it; +GSList *bs_list = NULL, *bs_list_entry; + +/* Must be called from the main loop */ +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); +/* Keep a strong reference to all root BDS and copy them into an + * own list because draining them may lead to graph modifications. + */ for (bs = bdrv_first(); bs; bs = bdrv_next()) { -AioContext *aio_context = bdrv_get_aio_context(bs); +bdrv_ref(bs); +bs_list = g_slist_prepend(bs_list, bs); +} + +for (bs_list_entry = bs_list; bs_list_entry; + bs_list_entry = bs_list_entry->next) +{ +AioContext *aio_context; + +bs = bs_list_entry->data; +aio_context = bdrv_get_aio_context(bs); aio_context_acquire(aio_context); aio_enable_external(aio_context); bdrv_parent_drained_end(bs); bdrv_drain_recurse(bs, false); aio_context_release(aio_context); + +bdrv_unref(bs); } +g_slist_free(bs_list); + block_job_resume_all(); } -- 2.13.6
[Qemu-devel] [PATCH v2 5/5] iotests: Make 136 less flaky
136 executes some AIO requests without a final aio_flush; then it advances the virtual clock and thus expects the last access time of the device to be less than the current time when queried (i.e. idle_time_ns to be greater than 0). However, without the aio_flush, some requests may be settled after the clock_step invocation. In that case, idle_time_ns would be 0 and the test fails. Fix this by adding an aio_flush if any AIO request other than some other aio_flush has been executed. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Stefan Hajnoczi --- tests/qemu-iotests/136 | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/136 b/tests/qemu-iotests/136 index 4b994897af..88b97ea7c6 100644 --- a/tests/qemu-iotests/136 +++ b/tests/qemu-iotests/136 @@ -238,6 +238,18 @@ sector = "%d" for i in range(failed_wr_ops): ops.append("aio_write %d 512" % bad_offset) +# We need an extra aio_flush to settle all outstanding AIO +# operations before we can advance the virtual clock, so that +# the last access happens before clock_step and idle_time_ns +# will be greater than 0 +extra_flush = 0 +if rd_ops + wr_ops + invalid_rd_ops + invalid_wr_ops + \ +failed_rd_ops + failed_wr_ops > 0: +extra_flush = 1 + +if extra_flush > 0: +ops.append("aio_flush") + if failed_wr_ops > 0: highest_offset = max(highest_offset, bad_offset + 512) @@ -251,7 +263,7 @@ sector = "%d" self.total_wr_bytes += wr_ops * wr_size self.total_wr_ops += wr_ops self.total_wr_merged += wr_merged -self.total_flush_ops += flush_ops +self.total_flush_ops += flush_ops + extra_flush self.invalid_rd_ops += invalid_rd_ops self.invalid_wr_ops += invalid_wr_ops self.failed_rd_ops += failed_rd_ops -- 2.13.6
[Qemu-devel] [PATCH v2 4/5] iotests: Make 083 less flaky
083 has (at least) two issues: 1. By launching the nbd-fault-injector in background, it may not be scheduled until the first grep on its output file is executed. However, until then, that file may not have been created yet -- so it either does not exist yet (thus making the grep emit an error), or it does exist but contains stale data (thus making the rest of the test case connect to a wrong address). Fix this by explicitly overwriting the output file before executing nbd-fault-injector. 2. The nbd-fault-injector prints things other than "Listening on...". It also prints a "Closing connection" message from time to time. We currently invoke sed on the whole file in the hope of it only containing the "Listening on..." line yet. That hope is sometimes shattered by the brutal reality of race conditions, so make the sed script more robust. Signed-off-by: Max Reitz --- tests/qemu-iotests/083 | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083 index 0306f112da..3c1adbf0fb 100755 --- a/tests/qemu-iotests/083 +++ b/tests/qemu-iotests/083 @@ -86,6 +86,7 @@ EOF rm -f "$TEST_DIR/nbd.sock" +echo > "$TEST_DIR/nbd-fault-injector.out" $PYTHON nbd-fault-injector.py $extra_args "$nbd_addr" "$TEST_DIR/nbd-fault-injector.conf" >"$TEST_DIR/nbd-fault-injector.out" 2>&1 & # Wait for server to be ready @@ -94,7 +95,8 @@ EOF done # Extract the final address (port number has now been assigned in tcp case) - nbd_addr=$(sed 's/Listening on \(.*\)$/\1/' "$TEST_DIR/nbd-fault-injector.out") +nbd_addr=$(sed -n 's/^Listening on //p' \ + "$TEST_DIR/nbd-fault-injector.out") if [ "$proto" = "tcp" ]; then nbd_url="nbd+tcp://$nbd_addr/$export_name" -- 2.13.6
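Both fixes can be reproduced outside the test suite with a self-contained sketch; the background subshell below stands in for nbd-fault-injector.py, and all file names, delays, and the address are made up for illustration:

```shell
out=$(mktemp)

# Fix 1: clear the output file up front, so a stale address from a
# previous run cannot be read before the background process is scheduled.
echo > "$out"

# Stand-in for nbd-fault-injector.py: announces its address after a
# delay and also prints an unrelated message, as the real one does.
( sleep 0.2
  echo "Listening on 127.0.0.1:10810" >> "$out"
  echo "Closing connection" >> "$out" ) &

# Wait for the "server" to be ready.
while ! grep -q 'Listening on ' "$out"; do
    sleep 0.05
done

# Fix 2: with -n, sed prints only lines where the substitution matched,
# so the "Closing connection" line can no longer leak into the result.
nbd_addr=$(sed -n 's/^Listening on //p' "$out")
echo "$nbd_addr"   # prints 127.0.0.1:10810

wait
rm -f "$out"
```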
[Qemu-devel] [PATCH v2 3/5] iotests: Make 055 less flaky
First of all, test 055 does a valiant job of invoking pause_drive() sometimes, but that is worth nothing without blkdebug. So the first thing to do is to sprinkle a couple of "blkdebug::" in there -- with the exception of the transaction tests, because the blkdebug break points make the transaction QMP command hang (which is bad). In that case, we can get away with throttling the block job that it effectively is paused. Then, 055 usually does not pause the drive before starting a block job that should be cancelled. This means that the backup job might be completed already before block-job-cancel is invoked; thus making the test either fail (currently) or moot if cancel_and_wait() ignored this condition. Fix this by pausing the drive before starting the job. Signed-off-by: Max ReitzReviewed-by: Eric Blake Reviewed-by: Stefan Hajnoczi --- tests/qemu-iotests/055 | 25 - 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055 index e1206caf9b..8a5d9fd269 100755 --- a/tests/qemu-iotests/055 +++ b/tests/qemu-iotests/055 @@ -48,7 +48,7 @@ class TestSingleDrive(iotests.QMPTestCase): def setUp(self): qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, str(image_len)) -self.vm = iotests.VM().add_drive(test_img) +self.vm = iotests.VM().add_drive('blkdebug::' + test_img) self.vm.add_drive(blockdev_target_img, interface="none") if iotests.qemu_default_machine == 'pc': self.vm.add_drive(None, 'media=cdrom', 'ide') @@ -65,10 +65,11 @@ class TestSingleDrive(iotests.QMPTestCase): def do_test_cancel(self, cmd, target): self.assert_no_active_block_jobs() +self.vm.pause_drive('drive0') result = self.vm.qmp(cmd, device='drive0', target=target, sync='full') self.assert_qmp(result, 'return', {}) -event = self.cancel_and_wait() +event = self.cancel_and_wait(resume=True) self.assert_qmp(event, 'data/type', 'backup') def test_cancel_drive_backup(self): @@ -166,7 +167,7 @@ class TestSetSpeed(iotests.QMPTestCase): def setUp(self): 
qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, str(image_len)) -self.vm = iotests.VM().add_drive(test_img) +self.vm = iotests.VM().add_drive('blkdebug::' + test_img) self.vm.add_drive(blockdev_target_img, interface="none") self.vm.launch() @@ -246,6 +247,8 @@ class TestSetSpeed(iotests.QMPTestCase): def test_set_speed_invalid_blockdev_backup(self): self.do_test_set_speed_invalid('blockdev-backup', 'drive1') +# Note: We cannot use pause_drive() here, or the transaction command +# would stall. Instead, we limit the block job speed here. class TestSingleTransaction(iotests.QMPTestCase): def setUp(self): qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, str(image_len)) @@ -271,7 +274,8 @@ class TestSingleTransaction(iotests.QMPTestCase): 'type': cmd, 'data': { 'device': 'drive0', 'target': target, - 'sync': 'full' }, + 'sync': 'full', + 'speed': 64 * 1024 }, } ]) @@ -289,12 +293,12 @@ class TestSingleTransaction(iotests.QMPTestCase): def do_test_pause(self, cmd, target, image): self.assert_no_active_block_jobs() -self.vm.pause_drive('drive0') result = self.vm.qmp('transaction', actions=[{ 'type': cmd, 'data': { 'device': 'drive0', 'target': target, - 'sync': 'full' }, + 'sync': 'full', + 'speed': 64 * 1024 }, } ]) self.assert_qmp(result, 'return', {}) @@ -302,7 +306,9 @@ class TestSingleTransaction(iotests.QMPTestCase): result = self.vm.qmp('block-job-pause', device='drive0') self.assert_qmp(result, 'return', {}) -self.vm.resume_drive('drive0') +result = self.vm.qmp('block-job-set-speed', device='drive0', speed=0) +self.assert_qmp(result, 'return', {}) + self.pause_job('drive0') result = self.vm.qmp('query-block-jobs') @@ -461,7 +467,7 @@ class TestDriveCompression(iotests.QMPTestCase): pass def do_prepare_drives(self, fmt, args, attach_target): -self.vm = iotests.VM().add_drive(test_img) +self.vm = iotests.VM().add_drive('blkdebug::' + test_img) qemu_img('create', '-f', fmt, blockdev_target_img, str(TestDriveCompression.image_len), *args) @@ 
-500,10 +506,11 @@ class TestDriveCompression(iotests.QMPTestCase): self.assert_no_active_block_jobs() +self.vm.pause_drive('drive0') result =
[Qemu-devel] [PATCH v2 2/5] iotests: Add missing 'blkdebug::' in 040
040 tries to invoke pause_drive() on a drive that does not use blkdebug. Good idea, but let's use blkdebug to make it actually work. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Stefan Hajnoczi --- tests/qemu-iotests/040 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040 index c284d08796..90b5b4f2ad 100755 --- a/tests/qemu-iotests/040 +++ b/tests/qemu-iotests/040 @@ -289,7 +289,7 @@ class TestSetSpeed(ImageCommitTestCase): qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % mid_img, test_img) qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0x1 0 512', test_img) qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xef 524288 524288', mid_img) -self.vm = iotests.VM().add_drive(test_img) +self.vm = iotests.VM().add_drive('blkdebug::' + test_img) self.vm.launch() def tearDown(self): -- 2.13.6
[Qemu-devel] [PATCH v2 1/5] iotests: Make 030 less flaky
This patch fixes two race conditions in 030: 1. The first is in TestENOSPC.test_enospc(). After resuming the job, querying it to confirm it is no longer paused may fail because in the meantime it might have completed already. The same was fixed in TestEIO.test_ignore() already (in commit 2c3b44da07d341557a8203cc509ea07fe3605e11). 2. The second is in TestSetSpeed.test_set_speed_invalid(): Here, a stream job is started on a drive without any break points, with a block-job-set-speed invoked subsequently. However, without any break points, the job might have completed in the meantime (on tmpfs at least); or it might complete before cancel_and_wait() which expects the job to still exist. This can be fixed like everywhere else by pausing the drive (installing break points) before starting the job and letting cancel_and_wait() resume it. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Stefan Hajnoczi --- tests/qemu-iotests/030 | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030 index 18838948fa..457984b8e9 100755 --- a/tests/qemu-iotests/030 +++ b/tests/qemu-iotests/030 @@ -666,6 +666,7 @@ class TestENOSPC(TestErrors): if event['event'] == 'BLOCK_JOB_ERROR': self.assert_qmp(event, 'data/device', 'drive0') self.assert_qmp(event, 'data/operation', 'read') +error = True result = self.vm.qmp('query-block-jobs') self.assert_qmp(result, 'return[0]/paused', True) @@ -676,9 +677,11 @@ class TestENOSPC(TestErrors): self.assert_qmp(result, 'return', {}) result = self.vm.qmp('query-block-jobs') +if result == {'return': []}: +# Race; likely already finished. Check. 
+continue self.assert_qmp(result, 'return[0]/paused', False) self.assert_qmp(result, 'return[0]/io-status', 'ok') -error = True elif event['event'] == 'BLOCK_JOB_COMPLETED': self.assertTrue(error, 'job completed unexpectedly') self.assert_qmp(event, 'data/type', 'stream') @@ -792,13 +795,14 @@ class TestSetSpeed(iotests.QMPTestCase): self.assert_no_active_block_jobs() +self.vm.pause_drive('drive0') result = self.vm.qmp('block-stream', device='drive0') self.assert_qmp(result, 'return', {}) result = self.vm.qmp('block-job-set-speed', device='drive0', speed=-1) self.assert_qmp(result, 'error/class', 'GenericError') -self.cancel_and_wait() +self.cancel_and_wait(resume=True) if __name__ == '__main__': iotests.main(supported_fmts=['qcow2', 'qed']) -- 2.13.6
[Qemu-devel] [PATCH v2 0/5] iotests: Make some tests less flaky
There are a couple of tests that fail (on my machine) from time to time (and by that I mean that recently I've rarely ever had a test run with both 083 and 136 working on first try). This series should fix most (at least the issues I am aware of). Notes: - 083 might have another issue, but if so it occurs extremely rarely and so I was unable to debug it. - 129 is flaky, too, because it tries to use block jobs with BB throttling -- however, block jobs ignore that these days. Making it use a throttle filter node will require quite a bit of work. See http://lists.nongnu.org/archive/html/qemu-block/2017-11/msg00111.html for more. - 194 sometimes hangs because the source VM fails to drain its drive. This is probably not an issue with the test, but actually an issue in qemu. See http://lists.nongnu.org/archive/html/qemu-block/2017-11/msg00256.html for more. v2: - Spelling fixes [Eric] - Contract grep | sed to a single sed in 083 [Eric] git-backport-diff against v1: Key: [] : patches are identical [] : number of functional differences between upstream/downstream patch [down] : patch is downstream-only The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively 001/5:[] [--] 'iotests: Make 030 less flaky' 002/5:[] [--] 'iotests: Add missing 'blkdebug::' in 040' 003/5:[] [--] 'iotests: Make 055 less flaky' 004/5:[0003] [FC] 'iotests: Make 083 less flaky' 005/5:[] [--] 'iotests: Make 136 less flaky' Max Reitz (5): iotests: Make 030 less flaky iotests: Add missing 'blkdebug::' in 040 iotests: Make 055 less flaky iotests: Make 083 less flaky iotests: Make 136 less flaky tests/qemu-iotests/030 | 8 ++-- tests/qemu-iotests/040 | 2 +- tests/qemu-iotests/055 | 25 - tests/qemu-iotests/083 | 4 +++- tests/qemu-iotests/136 | 14 +- 5 files changed, 39 insertions(+), 14 deletions(-) -- 2.13.6
Re: [Qemu-devel] [PATCH 4/5] iotests: Make 083 less flaky
On 2017-11-09 15:11, Eric Blake wrote: > On 11/08/2017 07:38 PM, Max Reitz wrote: >> 083 has (at least) two issues: > > I think I hit one of them intermittently yesterday; thanks for > diagnosing these (and like you say, there may be more lurking, but we'll > whack them separately if we can reproduce and identify them). > >> >> 1. By launching the nbd-fault-injector in background, it may not be >>scheduled until the first grep on its output file is executed. >>However, until then, that file may not have been created yet -- so it >>either does not exist yet (thus making the grep emit an error), or it >>does exist but contains stale data (thus making the rest of the test >>case work connect to a wrong address). >>Fix this by explicitly overwriting the output file before executing >>nbd-fault-injector. >> >> 2. The nbd-fault-injector prints things other than "Listening on...". >>It also prints a "Closing connection" message from time to time. We >>currently invoke sed on the whole file in the hope of it only >>containing the "Listening on..." line yet. That hope is sometimes >>shattered by the brutal reality of race conditions, so invoke grep >>before sed. > > Invoking 'grep | sed' is almost always a waste of a process; sed can do > the job alone. > >> >> Signed-off-by: Max Reitz>> --- >> tests/qemu-iotests/083 | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083 >> index 0306f112da..2f6444eeb9 100755 >> --- a/tests/qemu-iotests/083 >> +++ b/tests/qemu-iotests/083 >> @@ -86,6 +86,7 @@ EOF >> >> rm -f "$TEST_DIR/nbd.sock" >> >> +echo > "$TEST_DIR/nbd-fault-injector.out" > > This makes the file contain a blank line. Would it be any better as a > truly empty file, as in: > > : > "$TEST_DIR/nbd-fault-injector.out" Although ":>" looks funny, I guess I'd rather stay with the echo... Yes, it may look stupid to you, but I would have to look up what exactly that will do, whereas the echo is clear. 
And in the end, both work equally fine. >> $PYTHON nbd-fault-injector.py $extra_args "$nbd_addr" >> "$TEST_DIR/nbd-fault-injector.conf" >"$TEST_DIR/nbd-fault-injector.out" 2>&1 >> & >> >> # Wait for server to be ready >> @@ -94,7 +95,7 @@ EOF >> done >> >> # Extract the final address (port number has now been assigned in tcp >> case) >> -nbd_addr=$(sed 's/Listening on \(.*\)$/\1/' >> "$TEST_DIR/nbd-fault-injector.out") >> +nbd_addr=$(grep 'Listening on ' "$TEST_DIR/nbd-fault-injector.out" >> | sed 's/Listening on \(.*\)$/\1/') > > Fixing TAB damage while at it - nice. > > Here's how to do it using just sed, and with less typing: > > nbd_addr=$(sed -n 's/^Listening on //p' \ >"$TEST_DIR/nbd-fault-injector.out") Oh, nice! Will do, thanks. Max
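The two variants discussed above can be compared directly; the log contents below are invented for the demonstration, and both commands extract the same address:

```shell
log=$(mktemp)
printf 'Listening on 127.0.0.1:10810\nClosing connection\n' > "$log"

# v1 approach: grep selects the line, sed strips the prefix (two processes).
addr_grep_sed=$(grep 'Listening on ' "$log" | sed 's/Listening on \(.*\)$/\1/')

# Eric's suggestion: -n suppresses default output, and the trailing p
# flag prints only lines where the substitution succeeded (one process).
addr_sed_only=$(sed -n 's/^Listening on //p' "$log")

echo "$addr_grep_sed"   # 127.0.0.1:10810
echo "$addr_sed_only"   # 127.0.0.1:10810
rm -f "$log"
```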
Re: [Qemu-devel] [RFC v2 3/6] possible_cpus: add CPUArchId::type field
On Thu, Nov 09, 2017 at 05:58:03PM +1100, David Gibson wrote: > On Tue, Nov 07, 2017 at 04:04:04PM +0100, Cornelia Huck wrote: > > On Mon, 6 Nov 2017 16:02:16 -0200 > > Eduardo Habkostwrote: > > > > > On Tue, Oct 31, 2017 at 03:01:14PM +0100, Igor Mammedov wrote: > > > > On Thu, 19 Oct 2017 17:31:51 +1100 > > > > David Gibson wrote: > > > > > > > > > On Wed, Oct 18, 2017 at 01:12:12PM +0200, Igor Mammedov wrote: > > > > > > For enabling early cpu to numa node configuration at runtime > > > > > > qmp_query_hotpluggable_cpus() should provide a list of available > > > > > > cpu slots at early stage, before machine_init() is called and > > > > > > the 1st cpu is created, so that mgmt might be able to call it > > > > > > and use output to set numa mapping. > > > > > > Use MachineClass::possible_cpu_arch_ids() callback to set > > > > > > cpu type info, along with the rest of possible cpu properties, > > > > > > to let machine define which cpu type* will be used. > > > > > > > > > > > > * for SPAPR it will be a spapr core type and for ARM/s390x/x86 > > > > > > a respective descendant of CPUClass. > > > > > > > > > > > > Move parse_numa_opts() in vl.c after cpu_model is parsed into > > > > > > cpu_type so that possible_cpu_arch_ids() would know which > > > > > > cpu_type to use during layout initialization. 
> > > > > > > > > > > > Signed-off-by: Igor Mammedov > > > > > > > > > > Reviewed-by: David Gibson > > > > > > > > > > > --- > > > > > > v2: > > > > > > - fix NULL dereference caused by not initialized > > > > > >MachineState::cpu_type at the time parse_numa_opts() > > > > > >were called > > > > > > --- > > > > > > include/hw/boards.h| 2 ++ > > > > > > hw/arm/virt.c | 3 ++- > > > > > > hw/core/machine.c | 12 ++-- > > > > > > hw/i386/pc.c | 4 +++- > > > > > > hw/ppc/spapr.c | 13 - > > > > > > hw/s390x/s390-virtio-ccw.c | 1 + > > > > > > vl.c | 3 +-- > > > > > > 7 files changed, 23 insertions(+), 15 deletions(-) > > > > > > > > > > > > diff --git a/include/hw/boards.h b/include/hw/boards.h > > > > > > index 191a5b3..fa21758 100644 > > > > > > --- a/include/hw/boards.h > > > > > > +++ b/include/hw/boards.h > > > > > > @@ -80,6 +80,7 @@ void machine_set_cpu_numa_node(MachineState > > > > > > *machine, > > > > > > * CPUArchId: > > > > > > * @arch_id - architecture-dependent CPU ID of present or possible > > > > > > CPU > > > > > > > > > > I know this isn't really in scope for this patch, but is @arch_id here > > > > > supposed to have meaning defined by the target, or by the machine? > > > > > > > > > > If it's the machime, it could do with a rename - "arch" means target > > > > > to most people (thanks to Linux). > > > > > > > > > > If it's the target, it's kind of bogus, because it doesn't necessarily > > > > > have a clear meaning per target - get_arch_id in CPUClass has the same > > > > > problem, which is probably one reason it's basically only used by the > > > > > x86 code at present. > > > > > > > > > > e.g. for target/ppc, what do we use? There's the PIR, which is in the > > > > > CPU.. but only on some cpu models, not all. There will generally be > > > > > some kind of master PIC id, but there are different PIC models on > > > > > different boards. What goes in the devicetree? 
Well only some > > > > > machines use devicetree, and they might define the cpu reg > > > > > differently. > > > > > > > > > > Board designs will generally try to make some if not all of those > > > > > possible values equal for simplicity, but there's still no real way of > > > > > defining a sensible arch_id independent of machine / board. > > > > I'd say arch_id is machine specific so far, it was introduced when we > > > > didn't have CpuInstanceProperties and at that time we considered only > > > > vcpus (threads) and doesn't really apply to spapr cores. > > > > > > > > In general we could do away with arch_id and use CpuInstanceProperties > > > > instead, but arch_id also serves aux purpose, it allows machine to > > > > pre-calculate(cache) apic-id/mpidr values in one place and then they > > > > are/(could be) used by arch in-depended code to build acpi tables. > > > > So if we drop arch_id we would need to introduce a machine hook, > > > > which would translate CpuInstanceProperties into current arch_id. > > > > > > I think we need to do a better to job documenting where exactly > > > we expect arch_id to be used and how, so people know what it's > > > supposed to return. > > > > > > If the only place where it's useful now is ACPI code (is it?), > > > should we rename it to something like get_acpi_id()? > > > > It is also used in hw/s390x/sclp.c to fill out a control block, so acpi > > isn't the only user. > > Yeah.. this is kind of bogus. The
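The machine hook Igor mentions, translating CpuInstanceProperties into the current arch_id, can be illustrated with a toy model. This is only a sketch with hypothetical names: real x86 code computes the APIC ID in include/hw/i386/topology.h and additionally rounds core/thread counts up to powers of two, which this sketch skips.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified CpuInstanceProperties: just the topology coordinates. */
typedef struct {
    unsigned socket, core, thread;
} CpuProps;

/* Hypothetical machine hook: translate topology coordinates into a flat
 * arch_id.  Real x86 APIC-ID packing pads cores/threads to powers of
 * two; this sketch uses a plain mixed-radix encoding instead. */
static uint64_t cpu_props_to_arch_id(CpuProps p,
                                     unsigned cores_per_socket,
                                     unsigned threads_per_core)
{
    return ((uint64_t)p.socket * cores_per_socket + p.core)
               * threads_per_core + p.thread;
}
```

With such a hook, the machine could cache the result once per possible CPU slot, which is the auxiliary purpose arch_id serves today.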
Re: [Qemu-devel] [Qemu devel PATCH] MAINTAINERS: Add entries for Smartfusion2
On Thu, Nov 9, 2017 at 4:02 AM, Subbaraya Sundeep wrote: > add voluntarily myself as maintainer for Smartfusion2 This should be: Voluntarily add myself as maintainer for Smartfusion2. > > Signed-off-by: Subbaraya Sundeep With the commit message fixed: Reviewed-by: Alistair Francis Alistair > --- > MAINTAINERS | 17 + > 1 file changed, 17 insertions(+) > > diff --git a/MAINTAINERS b/MAINTAINERS > index 0cd4d02..dae08bd 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -564,6 +564,23 @@ M: Alistair Francis > S: Maintained > F: hw/arm/netduino2.c > > +MSF2 SoC > +M: Subbaraya Sundeep > +S: Maintained > +F: hw/arm/msf2-soc.c > +F: hw/misc/msf2-sysreg.c > +F: hw/timer/mss-timer.c > +F: hw/ssi/mss-spi.c > +F: include/hw/arm/msf2-soc.h > +F: include/hw/misc/msf2-sysreg.h > +F: include/hw/timer/mss-timer.h > +F: include/hw/ssi/mss-spi.h > + > +MSF2 SOM board > +M: Subbaraya Sundeep > +S: Maintained > +F: hw/arm/msf2-som.c > + > CRIS Machines > - > Axis Dev88 > -- > 2.5.0 >
Re: [Qemu-devel] [PATCH] smbios: support setting OEM strings table
On Thu, Nov 09, 2017 at 02:40:05PM +, Daniel P. Berrange wrote: > Ping, any thoughts on this ? (not expecting it in this 2.11 release of course) I queued this up. > On Sat, Oct 28, 2017 at 09:51:36PM +0100, Daniel P. Berrange wrote: > > The cloud-init program currently allows fetching of its data by repurposing > > of > > the 'system' type 'serial' field. This is a clear abuse of the serial field > > that > > would clash with other valid usage a virt management app might have for that > > field. > > > > Fortunately the SMBIOS defines an "OEM Strings" table whose purpose is to > > allow > > exposing of arbitrary vendor specific strings to the operating system. This > > is > > perfect for use with cloud-init, or as a way to pass arguments to OS > > installers > > such as anaconda. > > > > This patch makes it easier to support this with QEMU. e.g. > > > > $QEMU -smbios type=11,value=Hello,value=World,value=Tricky,,value=test > > > > Which results in the guest seeing dmidecode data > > > > Handle 0x0E00, DMI type 11, 5 bytes > > OEM Strings > > String 1: Hello > > String 2: World > > String 3: Tricky,value=test > > > > It is suggested that any app wanting to make use of this OEM strings > > capability > > for accepting data from the host mgmt layer should use its name as a string > > prefix. e.g. to expose OEM strings targeting both cloud init and anaconda > > in > > parallel the mgmt app could set > > > > $QEMU -smbios > > type=11,value=cloud-init:ds=nocloud-net;s=http://10.10.0.1:8000/,\ > > > > value=anaconda:method=http://dl.fedoraproject.org/pub/fedora/linux/releases/25/x86_64/os > > > > which would appear as > > > > Handle 0x0E00, DMI type 11, 5 bytes > > OEM Strings > > String 1: cloud-init:ds=nocloud-net;s=http://10.10.0.1:8000/ > > String 2: > > anaconda:method=http://dl.fedoraproject.org/pub/fedora/linux/releases/25/x86_64/os > > > > Use of such string prefixes means the app won't have to care which string > > slot > > its data appears in. 
> > > > Signed-off-by: Daniel P. Berrange> > --- > > hw/smbios/smbios.c | 72 > > ++ > > hw/smbios/smbios_build.h | 12 > > include/hw/smbios/smbios.h | 6 > > 3 files changed, 90 insertions(+) > > > > diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c > > index 1a5437a07d..5d11f01874 100644 > > --- a/hw/smbios/smbios.c > > +++ b/hw/smbios/smbios.c > > @@ -96,6 +96,11 @@ static struct { > > } type4; > > > > static struct { > > +size_t nvalues; > > +const char **values; > > +} type11; > > + > > +static struct { > > const char *loc_pfx, *bank, *manufacturer, *serial, *asset, *part; > > uint16_t speed; > > } type17; > > @@ -282,6 +287,14 @@ static const QemuOptDesc qemu_smbios_type4_opts[] = { > > { /* end of list */ } > > }; > > > > +static const QemuOptDesc qemu_smbios_type11_opts[] = { > > +{ > > +.name = "value", > > +.type = QEMU_OPT_STRING, > > +.help = "OEM string data", > > +}, > > +}; > > + > > static const QemuOptDesc qemu_smbios_type17_opts[] = { > > { > > .name = "type", > > @@ -590,6 +603,27 @@ static void smbios_build_type_4_table(unsigned > > instance) > > smbios_type4_count++; > > } > > > > +static void smbios_build_type_11_table(void) > > +{ > > +char count_str[128]; > > +size_t i; > > + > > +if (type11.nvalues == 0) { > > +return; > > +} > > + > > +SMBIOS_BUILD_TABLE_PRE(11, 0xe00, true); /* required */ > > + > > +snprintf(count_str, sizeof(count_str), "%zu", type11.nvalues); > > +t->count = type11.nvalues; > > + > > +for (i = 0; i < type11.nvalues; i++) { > > +SMBIOS_TABLE_SET_STR_LIST(11, type11.values[i]); > > +} > > + > > +SMBIOS_BUILD_TABLE_POST; > > +} > > + > > #define ONE_KB ((ram_addr_t)1 << 10) > > #define ONE_MB ((ram_addr_t)1 << 20) > > #define ONE_GB ((ram_addr_t)1 << 30) > > @@ -832,6 +866,8 @@ void smbios_get_tables(const struct > > smbios_phys_mem_area *mem_array, > > smbios_build_type_4_table(i); > > } > > > > +smbios_build_type_11_table(); > > + > > #define MAX_DIMM_SZ (16ll * ONE_GB) > > #define GET_DIMM_SZ ((i < dimm_cnt - 1) ? 
MAX_DIMM_SZ \ > > : ((ram_size - 1) % MAX_DIMM_SZ) + > > 1) > > @@ -882,6 +918,38 @@ static void save_opt(const char **dest, QemuOpts > > *opts, const char *name) > > } > > } > > > > + > > +struct opt_list { > > +const char *name; > > +size_t *ndest; > > +const char ***dest; > > +}; > > + > > +static int save_opt_one(void *opaque, > > +const char *name, const char *value, > > +Error **errp) > > +{ > > +struct opt_list *opt = opaque; > > + > > +if (!g_str_equal(name, opt->name)) { > > +return 0; > > +} > > +
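The `Tricky,,value=test` example in the commit message above relies on QEMU's option-string escaping, where a doubled comma stands for a literal comma inside a value and a single comma separates fields. A minimal standalone sketch of that rule (hypothetical helper name; QEMU's real implementation is get_opt_value() in util/qemu-option.c):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read one option field into 'out', treating ",," as a literal comma
 * and a lone ',' as the field separator.  Returns a pointer to the
 * start of the next field (or the trailing NUL). */
static const char *read_opt_field(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    while (*in) {
        if (*in == ',') {
            if (in[1] == ',') {
                in++;              /* ",," -> keep one literal ',' */
            } else {
                in++;              /* field separator: stop here */
                break;
            }
        }
        if (n + 1 < outsz) {
            out[n++] = *in;
        }
        in++;
    }
    out[n] = '\0';
    return in;
}
```

This is why `value=Tricky,,value=test` produced the single string `Tricky,value=test` in the dmidecode output rather than two values.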
Re: [Qemu-devel] [PATCH V3] hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
On Thu, Nov 09, 2017 at 12:02:10PM +0100, Laszlo Ersek wrote: [...] > (3) An idea for the property defaults: you remove > DEFAULT_PCI_HOLE64_SIZE, which is cool. How about introducing (in the > proper header files) > > #define DEFAULT_I440FX_PCI_HOLE64_SIZE (1ULL << 31) > #define DEFAULT_Q35_PCI_HOLE64_SIZE (1ULL << 35) > > The main reason for my suggestion is that (1ULL << 35) is used twice in > the code, for an obscure qdev/qom reason. The comments definitely help, > so keep them, but a greppable macro would help even more, IMO. > > And, once we add DEFAULT_Q35_PCI_HOLE64_SIZE, we should add > DEFAULT_I440FX_PCI_HOLE64_SIZE too, for consistency. Agreed, especially considering how the code that initializes the defaults is non-obvious. > > Anyway, I'll leave this up to you as well. If we do that, I recommend adding a bigger warning to q35_host_props. The one in this patch is very easy to miss if people try to touch the defaults for other mch.* properties in the future. I suggest something like: /* * NOTE: setting defaults for the mch.* fields in this table * don't work, because mch is a separate QOM object that is * zeroed by the object_initialize(&s->mch, ...) call inside * q35_host_initfn(). The default values for those * properties need to be initialized manually by * q35_host_initfn() after the object_initialize() call. */ > > > (4) I also have a suggestion for the commit message: please move the > paragraph that starts with > > "Even if the new defaults..." > > from the v2->v3 changelog to the commit message proper. IMO it is > important information. > > > With (4) addressed: > > Reviewed-by: Laszlo Ersek > > Thanks! > Laszlo -- Eduardo
Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.12 v2 00/12] spapr: introduce an IRQ allocator at the machine level
David, On 11/09/2017 10:14 AM, Cédric Le Goater wrote: > Hello, > > Currently, the ICSState 'ics' object of the sPAPR machine acts as the > global interrupt source handler and also as the IRQ number allocator > for the machine. Some IRQ numbers are allocated very early in the > machine initialization sequence to populate the device tree, and this > is a problem to introduce the new POWER XIVE interrupt model, as it > needs to share the IRQ numbers with the older model. > > To prepare ground for XIVE, here is a proposal adding a set of new > XICSFabric operations to let the machine handle directly the IRQ > number allocation and to decorrelate the allocation from the interrupt > source object : > > bool (*irq_test)(XICSFabric *xi, int irq); > int (*irq_alloc_block)(XICSFabric *xi, int count, int align); > void (*irq_free_block)(XICSFabric *xi, int irq, int num); > bool (*irq_is_lsi)(XICSFabric *xi, int irq); > > In these prototypes, the 'irq' parameter refers to a number in the > global IRQ number space. > > On the sPAPR platform, these operations are simply backed by a bitmap > and to handle migration compatibility, we introduce a machine class > flag 'has_irq_bitmap', which is set to false for older machines. Talking with Greg, I have found a better solution to handle migration compatibility which is to do like the phb_placement. That is to use two sets of XICS IRQ operations : one for the latest pseries machine using the bitmap and another one for the older pseries machine using the ICSIRQState. It impacts mostly patch 5. This solution is cleaner and I will send a v3 in my morning unless you start reviewing and have some obvious comments in the v2 series. Thanks, C. > To completely remove the use of the ICSState object (required to > introduce XIVE), we also need to change how the nature of an > interrupt, MSI or LSI, is stored. Today, this is done using the flag > attribute of the ICSIRQState array. 
We change that by splitting the > IRQ number space of the machine in two: first the LSIs and then the > MSIs. This has the benefit to keep the LSI IRQ numbers in a well known > range which is useful for PHB hotplug. > > The git repo for this pachset can be found here along with the latest > XIVE model: > > https://github.com/legoater/qemu/commits/xive > > Thanks, > > C. > > Tests : > > - make check on each patch > - migration : > qemu-2.12 (pseries-2.12) <-> qemu-2.12 (pseries-2.12) > qemu-2.12 (pseries-2.10) <-> qemu-2.12 (pseries-2.10) > qemu-2.10 (pseries-2.10) <-> qemu-2.12 (pseries-2.10) > > Changes since v1 : > > - reorganised patchset to introduce the XICSFabric operations before >the major changes: bitmap and IRQ number space split > - introduced a reference bitmap to save some state in migration > > Cédric Le Goater (12): > spapr: add pseries 2.12 machine type > ppc/xics: remove useless if condition > spapr: introduce new XICSFabric operations for an IRQ allocator > spapr: move current IRQ allocation under the machine > spapr: introduce an IRQ allocator using a bitmap > spapr: store a reference IRQ bitmap > spapr: introduce an 'irq_base' number > spapr: remove the use of ics_valid_irq() > spapr: introduce a XICSFabric is_lsi() operation > spapr: split the IRQ number space for LSI interrupts > sparp: merge ics_set_irq_type() in irq_alloc_block() operation > spapr: use sPAPRMachineState in spapr_ics_ prototypes > > hw/intc/trace-events | 2 - > hw/intc/xics.c | 37 ++- > hw/intc/xics_kvm.c | 4 +- > hw/intc/xics_spapr.c | 76 - > hw/ppc/pnv.c | 34 ++ > hw/ppc/pnv_psi.c | 4 -- > hw/ppc/spapr.c | 174 > - > hw/ppc/spapr_events.c | 4 +- > hw/ppc/spapr_pci.c | 8 +-- > hw/ppc/spapr_vio.c | 2 +- > hw/ppc/trace-events| 2 + > include/hw/ppc/spapr.h | 6 ++ > include/hw/ppc/xics.h | 20 -- > 13 files changed, 267 insertions(+), 106 deletions(-) >
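The proposed irq_alloc_block()/irq_free_block() operations backed by a machine-level bitmap can be sketched as follows. This toy version scans a plain bool array; an actual implementation would presumably use QEMU's bitmap helpers (e.g. bitmap_find_next_zero_area()), and the names below only loosely mirror the prototypes in the cover letter.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IRQS 64

static bool irq_claimed[NR_IRQS];

/* Find 'count' consecutive free IRQ numbers starting at a multiple of
 * 'align', claim them, and return the first number, or -1 on failure. */
static int irq_alloc_block(int count, int align)
{
    for (int base = 0; base + count <= NR_IRQS; base += align) {
        int i;
        for (i = 0; i < count && !irq_claimed[base + i]; i++) {
            ;
        }
        if (i == count) {            /* whole block free: claim it */
            for (int j = 0; j < count; j++) {
                irq_claimed[base + j] = true;
            }
            return base;
        }
    }
    return -1;                       /* no free aligned block */
}

static void irq_free_block(int irq, int num)
{
    for (int j = 0; j < num; j++) {
        irq_claimed[irq + j] = false;
    }
}
```

Freed blocks become reusable, which is exactly why the series needs a reference bitmap to keep migration of older machine types stable.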
Re: [Qemu-devel] Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage migration)
On 2017-11-09 16:30, Fam Zheng wrote: > On Thu, 11/09 16:14, Max Reitz wrote: >> On 2017-11-09 05:21, Fam Zheng wrote: >>> On Thu, 11/09 01:48, Max Reitz wrote: Hi, More exciting news from the bdrv_drain() front! I've noticed in the past that iotest 194 sometimes hangs. I usually run the tests on tmpfs, but I've just now verified that it happens on my SSD just as well. So the reproducer is a plain: while ./check -raw 194; do; done >>> >>> I cannot produce it on my machine. >> >> Hm, too bad. I see it both on my work laptop (with Fedora) and my >> desktop (with Arch)... >> (No difference between raw or qcow2, though.) And then, after a couple of runs (or a couple ten), it will just hang. The reason is that the source VM lingers around and doesn't quit voluntarily -- the test itself was successful, but it just can't exit. If you force it to exit by killing the VM (e.g. through pkill -11 qemu), this is the backtrace: #0 0x7f7cfc297e06 in ppoll () at /lib64/libc.so.6 #1 0x563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=, __fds=) at /usr/include/bits/poll2.h:77 >>> >>> Looking at the 0 timeout it seems we are in the aio_poll(ctx, >>> blocking=false); >>> branches of BDRV_POLL_WHILE? Is it a busy loop? If so I wonder what is >>> making >>> progress and causing the return value to be true in aio_poll(). 

>>> #2 0x563b846bcac9 in qemu_poll_ns (fds=, nfds=, timeout=) at util/qemu-timer.c:322 #3 0x563b846be711 in aio_poll (ctx=ctx@entry=0x563b856e3e80, blocking=) at util/aio-posix.c:629 #4 0x563b8463afa4 in bdrv_drain_recurse (bs=bs@entry=0x563b865568a0, begin=begin@entry=true) at block/io.c:201 #5 0x563b8463baff in bdrv_drain_all_begin () at block/io.c:381 #6 0x563b8463bc99 in bdrv_drain_all () at block/io.c:411 #7 0x563b8459888b in block_migration_cleanup (opaque=<optimized out>) at migration/block.c:714 #8 0x563b845883be in qemu_savevm_state_cleanup () at migration/savevm.c:1251 #9 0x563b845811fd in migration_thread (opaque=0x563b856f1da0) at migration/migration.c:2298 #10 0x7f7cfc56f36d in start_thread () at /lib64/libpthread.so.0 #11 0x7f7cfc2a3e1f in clone () at /lib64/libc.so.6 And when you make bdrv_drain_all_begin() print what we are trying to drain, you can see that it's the format node (managed by the "raw" driver in this case). >>> >>> So what is the value of bs->in_flight? >> >> gdb: >>> (gdb) print bs->in_flight >>> $3 = 2307492233 >> >> "That's weird, why would it..."
>> >>> (gdb) print *bs >>> $4 = {open_flags = -1202160144, read_only = 161, encrypted = 85, sg = >>> false, probed = false, force_share = 96, implicit = 159, drv = 0x0, opaque >>> = 0x0, aio_context = 0x8989898989898989, aio_notifiers = {lh_first = >>> 0x8989898989898989}, >>> walking_aio_notifiers = 137, filename = '\211' , >>> backing_file = '\211' , backing_format = '\211' >>> , full_open_options = 0x8989898989898989, >>> exact_filename = '\211' , backing = >>> 0x8989898989898989, file = 0x8989898989898989, bl = {request_alignment = >>> 2307492233, max_pdiscard = -1987475063, pdiscard_alignment = 2307492233, >>> max_pwrite_zeroes = -1987475063, pwrite_zeroes_alignment = 2307492233, >>> opt_transfer = 2307492233, max_transfer = 2307492233, min_mem_alignment = >>> 9910603678816504201, opt_mem_alignment = 9910603678816504201, max_iov = >>> -1987475063}, >>> supported_write_flags = 2307492233, supported_zero_flags = 2307492233, >>> node_name = '\211' , node_list = {tqe_next = >>> 0x8989898989898989, tqe_prev = 0x8989898989898989}, bs_list = {tqe_next = >>> 0x8989898989898989, >>> tqe_prev = 0x8989898989898989}, monitor_list = {tqe_next = >>> 0x8989898989898989, tqe_prev = 0x8989898989898989}, refcnt = -1987475063, >>> op_blockers = {{lh_first = 0x8989898989898989} }, job = >>> 0x8989898989898989, >>> inherits_from = 0x8989898989898989, children = {lh_first = >>> 0x8989898989898989}, parents = {lh_first = 0x8989898989898989}, options = >>> 0x8989898989898989, explicit_options = 0x8989898989898989, detect_zeroes = >>> 2307492233, >>> backing_blocker = 0x8989898989898989, total_sectors = >>> -8536140394893047415, before_write_notifiers = {notifiers = {lh_first = >>> 0x8989898989898989}}, write_threshold_offset = 9910603678816504201, >>> write_threshold_notifier = {notify = >>> 0x8989898989898989, node = {le_next = 0x8989898989898989, le_prev = >>> 0x8989898989898989}}, dirty_bitmap_mutex = {lock = {__data = {__lock = >>> -1987475063, __count = 2307492233, __owner = 
-1987475063, __nusers = >>> 2307492233, >>> __kind = -1987475063, __spins = -30327, __elision = -30327, __list >>> = {__prev = 0x8989898989898989, __next = 0x8989898989898989}}, __size = >>> '\211' , __align = -8536140394893047415},
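For reference, `2307492233` is 0x89898989, and every field in the dumped BDS repeats the byte 0x89 — the classic signature of reading memory that was freed and poison-filled, i.e. a use-after-free of the BlockDriverState. A tiny sketch of the poison-on-free debugging technique that produces such a pattern (illustrative only; nothing here is QEMU's actual allocator behavior):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POISON_BYTE 0x89

/* Stand-in for a few BlockDriverState fields. */
typedef struct {
    unsigned in_flight;
    int refcnt;
} FakeBDS;

/* Debug-free that poison-fills the object before releasing it, so a
 * later use-after-free shows up as an unmistakable repeating pattern
 * instead of silently reading stale-but-plausible values.  A real
 * version would call free(p) afterwards; it is omitted here so the
 * test below can legally inspect the poison. */
static void poison_free(void *p, size_t sz)
{
    memset(p, POISON_BYTE, sz);
}
```

Filling a 32-bit field with 0x89 bytes yields exactly the 2307492233 seen in the gdb session above.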
Re: [Qemu-devel] [PATCH 6/7] s390x/pci: move the memory region write from pcistg
On Tue, 7 Nov 2017 18:24:38 +0100 Pierre Morelwrote: > Let's move the memory region write from pcistg into a dedicated > function. > This allows us to prepare a later patch searching for subregions > inside of the memory region. OK, so here is the memory region write. Do we have any sleeping endianness bugs in there for when we wire up tcg? I'm not sure how this plays with the bswaps (see patch 1). But maybe I've just gotten lost somewhere. > > Signed-off-by: Pierre Morel > Reviewed-by: Yi Min Zhao > --- > hw/s390x/s390-pci-inst.c | 27 +-- > 1 file changed, 17 insertions(+), 10 deletions(-) > > diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > index 50135a0..97f62b5 100644 > --- a/hw/s390x/s390-pci-inst.c > +++ b/hw/s390x/s390-pci-inst.c > @@ -455,12 +455,27 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t > offset, uint8_t pcias) > } > } > > +static MemTxResult zpci_write_bar(S390PCIBusDevice *pbdev, uint8_t pcias, > + uint64_t offset, uint64_t data, uint8_t > len) > +{ > +MemoryRegion *mr; > + > +if (trap_msix(pbdev, offset, pcias)) { > +offset = offset - pbdev->msix.table_offset; > +mr = >pdev->msix_table_mmio; > +} else { > +mr = pbdev->pdev->io_regions[pcias].memory; > +} > + > +return memory_region_dispatch_write(mr, offset, data, len, > +MEMTXATTRS_UNSPECIFIED); > +} > + > int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) > { > CPUS390XState *env = >env; > uint64_t offset, data; > S390PCIBusDevice *pbdev; > -MemoryRegion *mr; > MemTxResult result; > uint8_t len; > uint32_t fh; > @@ -517,15 +532,7 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > return 0; > } > > -if (trap_msix(pbdev, offset, pcias)) { > -offset = offset - pbdev->msix.table_offset; > -mr = >pdev->msix_table_mmio; > -} else { > -mr = pbdev->pdev->io_regions[pcias].memory; > -} > - > -result = memory_region_dispatch_write(mr, offset, data, len, > - MEMTXATTRS_UNSPECIFIED); > +result = zpci_write_bar(pbdev, pcias, offset, data, len); > if 
(result != MEMTX_OK) { > program_interrupt(env, PGM_OPERAND, 4); > return 0;
Re: [Qemu-devel] [PATCH 1/7] s390x/pci: factor out endianess conversion
On Thu, 9 Nov 2017 15:55:46 -0300 Philippe Mathieu-Daudéwrote: > On 11/09/2017 01:38 PM, Cornelia Huck wrote: > > On Tue, 7 Nov 2017 18:24:33 +0100 > > Pierre Morel wrote: > > > >> There are two places where the same endianness conversion > >> is done. > >> Let's factor this out into a static function. > >> > >> Signed-off-by: Pierre Morel > >> Reviewed-by: Yi Min Zhao > >> --- > >> hw/s390x/s390-pci-inst.c | 58 > >> ++-- > >> 1 file changed, 32 insertions(+), 26 deletions(-) > >> > >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > >> index 8e088f3..8fcb02d 100644 > >> --- a/hw/s390x/s390-pci-inst.c > >> +++ b/hw/s390x/s390-pci-inst.c > >> @@ -314,6 +314,35 @@ out: > >> return 0; > >> } > >> > >> +/** > >> + * This function swaps the data at ptr according from one > >> + * endianness to the other. > >> + * valid data in the uint64_t data field. > > > > I'm not sure what that line is supposed to mean? > > > >> + * @ptr: a pointer to a uint64_t data field > >> + * @len: the length of the valid data, must be 1,2,4 or 8 > >> + */ > >> +static int zpci_endian_swap(uint64_t *ptr, uint8_t len) > >> +{ > >> +uint64_t data = *ptr; > >> +switch (len) { > >> +case 1: > >> +break; > >> +case 2: > >> +data = bswap16(data); > >> +break; > >> +case 4: > >> +data = bswap32(data); > >> +break; > >> +case 8: > >> +data = bswap64(data); > >> +break; > >> +default: > >> +return -EINVAL; > >> +} > >> +*ptr = data; > >> +return 0; > >> +} > > This is usually care taken by memory::adjust_endianness() ... Yes, but that's not a memory region write. > > > I was expecting more code to use a similar pattern, but it seems > > surprisingly uncommon. > > Which ring a bell for latent bug? Looking at this, it seems there *is* a latent bug, which has not popped up so far as the pci instructions are not wired up in tcg yet. This code is only called from the kvm path... 
> > This remind me of a similar issue on ppc: > > http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05121.html > ... > http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05666.html
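The zpci_endian_swap() helper under discussion is easy to reproduce standalone. A sketch using GCC/Clang builtins in place of QEMU's bswap.h macros — behaviorally the same switch as in the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Byte-swap the low 'len' bytes of *ptr; reject invalid lengths.
 * Mirrors the patch's helper, with __builtin_bswap* standing in for
 * QEMU's bswap16/32/64. */
static int zpci_endian_swap(uint64_t *ptr, uint8_t len)
{
    uint64_t data = *ptr;

    switch (len) {
    case 1:
        break;
    case 2:
        data = __builtin_bswap16(data);
        break;
    case 4:
        data = __builtin_bswap32(data);
        break;
    case 8:
        data = __builtin_bswap64(data);
        break;
    default:
        return -EINVAL;
    }
    *ptr = data;
    return 0;
}
```

Note that on the invalid-length path the value is left untouched, which is why the callers can program-check without cleanup.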
[Qemu-devel] KVM Forum BoF session notes: QEMU configuration/command-line/QMP
Hi, Below are the notes I collected during the BoF session about QEMU configuration, command-line and QMP. Sorry for taking so long to send them. All inaccuracies and mistakes below are my own fault. (I learned the hard way that trying to take notes while participating in the discussion at the same time is not a good idea.) KVM Forum 2017 BoF session: configuration, command-line and QMP 2017-10-26 People: Jiri Denemark, Laine Stump, Peter Krempa, Markus Armbruster, Eduardo Habkost, Igor Mammedov, Kevin Wolf, Eduardo Otubo, Alistair Francis, Kashyap Chamarthy, Max Reitz First topic: early QMP and -paused series by Igor ehabkost tried to explain the original problem and Igor's proposal[1]. Markus: startup time is important, some things can make it slow. Personally, not so much concerned about querying stuff. Just cache it! Markus, about starting a monitor earlier: "it just makes sense". One problem is: command-line processing ordering mess in QEMU. It's tempting but very scary. very easy to screw up. About command-line ordering, “I'd just go left-to-right”. ehabkost: is gradual change with NUMA stuff the right path? (Markus: "what's the right path?"; ehabkost: "I don't know"). If you are using CPU hotplug you still might have some stuff not appearing in the command-line, because it uses query-hotpluggable-cpus. pkrempa: if you're using cpu hotplug you don't care about startup time. NUMA is not different. Markus's objection to current -paused series: the need for another option/state. We basically have 3 states/phases: 1: between exec and monitor available 2: between monitor and machines start to runs (in that phase a device_add is/can be a cold plug) 3: then the machine actually runs, and all device_add is hotplug ehabkost: that's how I understand it, too. ehabkost: question for libvirt developers: is there an "expansion" operation where libvirt could fill-up missing data on the XML based on info queried from QEMU? 
(Answer seems to be yes) Some comments from Igor and Peter about query-hotpluggable-cpus and the need to choose the right device type (sometimes TYPE_CPU is not appropriate for device_add, but only CPU cores, like on Power). ehabkost: that's probably another use case for query-device-slots: knowing that power machines can't device_add threads, but just cores. Tricky part: providing slot information at qemu-probing time (with -machine none), before any machine was built. Some discussion about QEMU capability caching, why we need it, what we need to get rid of it (answer: early QMP?). Markus: would --query-FOO for some FOOs help? libvirt developers seem to prefer early QMP. Now that libvirt queries QEMU for host CPUID capabilities, there are more interesting ways the cache may need to be invalidated. Markus: “there's a bit of a combinatorial explosion, i.e. query results can depend on so much more than just the QEMU version and machine type. Makes invalidating the cache tricky, and possibly even finding something useful in it insufficiently likely to be worth the trouble.” Some discussion about libvirt/qemu versions compatibility. Markus proposes that we could choose to make newer QEMU require a more recent libvirt version. OpenStack Nova re-evaluates every 6 months what next minimum libvirt and QEMU versions to use (via constants: NEXT_MIN_LIBVIRT_VERSION, NEXT_MIN_QEMU_VERSION) Markus: “The version decoupling we have is really nice, but I think an occasional dependency would be tolerable. Consider it when it saves lots of work.” Markus: “Also: drop support for old versions of QEMU. 
Anything before 1.5?” References and notes: [1] References for “Early QMP”: * Igor’s proposal at: Subject: [Qemu-devel] [RFC 0/6] enable numa configuration before machine_init() from HMP/QMP https://www.mail-archive.com/qemu-devel@nongnu.org/msg488601.html * And what Eduardo has suggested at: http://www.linux-kvm.org/images/4/46/03x06A-Eduardo_HabkostMachine-type_Introspection_and_Configuration_Where_Are_We_Going.pdf Comments from Markus: > Related: "Why /two/ config interfaces?" in my KVM Forum talk. The two > being CLI and QMP. > > We've long held the idea to support starting QEMU with a minimal command > line, then do all configuration via QMP. Not possible today, because > lots of things can only be done via CLI. In part because QMP becomes > available only after quite a few config boats have sailed. > > Making QMP available earlier during startup ("early QMP") would be a > stepping stone towards the "only QMP" goal. > > Related: rebasing CLI onto QAPI would make replicating CLI-only > interfaces in QMP easier. -- Eduardo
Re: [Qemu-devel] [PATCH 1/7] s390x/pci: factor out endianess conversion
On 11/09/2017 01:38 PM, Cornelia Huck wrote: > On Tue, 7 Nov 2017 18:24:33 +0100 > Pierre Morelwrote: > >> There are two places where the same endianness conversion >> is done. >> Let's factor this out into a static function. >> >> Signed-off-by: Pierre Morel >> Reviewed-by: Yi Min Zhao >> --- >> hw/s390x/s390-pci-inst.c | 58 >> ++-- >> 1 file changed, 32 insertions(+), 26 deletions(-) >> >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c >> index 8e088f3..8fcb02d 100644 >> --- a/hw/s390x/s390-pci-inst.c >> +++ b/hw/s390x/s390-pci-inst.c >> @@ -314,6 +314,35 @@ out: >> return 0; >> } >> >> +/** >> + * This function swaps the data at ptr according from one >> + * endianness to the other. >> + * valid data in the uint64_t data field. > > I'm not sure what that line is supposed to mean? > >> + * @ptr: a pointer to a uint64_t data field >> + * @len: the length of the valid data, must be 1,2,4 or 8 >> + */ >> +static int zpci_endian_swap(uint64_t *ptr, uint8_t len) >> +{ >> +uint64_t data = *ptr; >> +switch (len) { >> +case 1: >> +break; >> +case 2: >> +data = bswap16(data); >> +break; >> +case 4: >> +data = bswap32(data); >> +break; >> +case 8: >> +data = bswap64(data); >> +break; >> +default: >> +return -EINVAL; >> +} >> +*ptr = data; >> +return 0; >> +} This is usually care taken by memory::adjust_endianness() ... > I was expecting more code to use a similar pattern, but it seems > surprisingly uncommon. Which ring a bell for latent bug? This remind me of a similar issue on ppc: http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05121.html ... http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05666.html
Re: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Tao Wu (吴涛@Eng), on Thu, 09 Nov 2017 10:48:27 -0800, wrote: > Thanks. Actually this is a follow up with my previous effort to fix this bug. > I was busy on something else and then got lost in that old thread. Now I just > checked some my local patch > to see if they've merged to upstream and then found it out. > > This is old thread about this: > [1]http://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg05544.html I knew I had seen that proposal somewhere before :) Thanks for the patch, Samuel
Re: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Thanks. Actually this is a follow up with my previous effort to fix this bug. I was busy on something else and then got lost in that old thread. Now I just checked some my local patch to see if they've merged to upstream and then found it out. This is old thread about this: http://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg05544.html On Thu, Nov 9, 2017 at 2:50 AM, Marc-André Lureau < marcandre.lur...@gmail.com> wrote: > Hi > > Adding Guillaume in CC, who wrote that line in commit 98c63057d2144 > > On Wed, Nov 8, 2017 at 11:53 PM, Tao Wu via Qemu-devel wrote: > > The current code looks buggy, we zero ti_i while we access > > ti_dst/ti_src later. > > Could you describe the symptoms and why you fixed it that way? > > thanks > > > > > Signed-off-by: Tao Wu > > --- > > slirp/tcp_subr.c | 11 ++- > > 1 file changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/slirp/tcp_subr.c b/slirp/tcp_subr.c > > index dc8b4bbb50..da0d53743f 100644 > > --- a/slirp/tcp_subr.c > > +++ b/slirp/tcp_subr.c > > @@ -148,7 +148,16 @@ tcp_respond(struct tcpcb *tp, struct tcpiphdr *ti, struct mbuf *m, > > m->m_data += IF_MAXLINKHDR; > > *mtod(m, struct tcpiphdr *) = *ti; > > ti = mtod(m, struct tcpiphdr *); > > - memset(&ti->ti, 0, sizeof(ti->ti)); > > + switch (af) { > > + case AF_INET: > > + ti->ti.ti_i4.ih_x1 = 0; > > + break; > > + case AF_INET6: > > + ti->ti.ti_i6.ih_x1 = 0; > > + break; > > + default: > > + g_assert_not_reached(); > > + } > > flags = TH_ACK; > > } else { > > /* > > -- > > 2.15.0.448.gf294e3d99a-goog > > > > > > > > -- > Marc-André Lureau >
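The bug being fixed: the old memset() cleared the whole ti header overlay, wiping the source/destination addresses that tcp_respond() still reads when building the reply, while the patch zeroes only the ih_x1 pad field of the active address family. A toy model of why the narrower clear matters (the field layout here is illustrative, not slirp's actual struct):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for slirp's IP header overlay: ih_x1 is padding that
 * must be zeroed before checksumming, but ih_src/ih_dst must survive
 * because the response is built from them afterwards. */
struct toy_ipovly {
    uint8_t  ih_x1[9];      /* padding: safe to zero */
    uint32_t ih_src;        /* still needed later */
    uint32_t ih_dst;        /* still needed later */
};

/* The fixed behavior: clear just the pad bytes. */
static void zero_padding_only(struct toy_ipovly *ih)
{
    memset(ih->ih_x1, 0, sizeof(ih->ih_x1));
}

/* The buggy behavior: clear the whole overlay, losing the addresses. */
static void zero_whole_header(struct toy_ipovly *ih)
{
    memset(ih, 0, sizeof(*ih));
}
```

The test below shows the addresses surviving the narrow clear and being destroyed by the wide one, which is the symptom the patch removes.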
Re: [Qemu-devel] [PATCH] virtio-gpu: fix bug in host memory calculation.
Thanks. Sent out a new version adding comments to say that we rely on pixman create_bits to fail. On Thu, Nov 9, 2017 at 2:34 AM, Marc-André Lureau wrote: > Hi > > - Original Message - >> The old code treats bits as bytes when calculating host memory usage. >> Change it to be consistent with allocation logic in pixman library. >> > > Good catch > >> Signed-off-by: Tao Wu >> --- >> hw/display/virtio-gpu.c | 13 +++-- >> 1 file changed, 11 insertions(+), 2 deletions(-) >> >> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c >> index 43bbe09ea0..428786f291 100644 >> --- a/hw/display/virtio-gpu.c >> +++ b/hw/display/virtio-gpu.c >> @@ -322,6 +322,15 @@ static pixman_format_code_t get_pixman_format(uint32_t >> virtio_gpu_format) >> } >> } >> >> +static uint32_t calc_image_hostmem(pixman_format_code_t pformat, >> + uint32_t width, uint32_t height) >> +{ >> +/* copied from pixman/pixman-bits-image.c, skip integer overflow check. >> */ > > So we rely on pixman create_bits() to fail if overflow happened? perhaps it's > worth a comment. 
> >> +int bpp = PIXMAN_FORMAT_BPP(pformat); >> +int stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t); >> +return height * stride; >> +} >> + >> static void virtio_gpu_resource_create_2d(VirtIOGPU *g, >>struct virtio_gpu_ctrl_command >>*cmd) >> { >> @@ -366,7 +375,7 @@ static void virtio_gpu_resource_create_2d(VirtIOGPU *g, >> return; >> } >> >> -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * c2d.width * c2d.height; >> +res->hostmem = calc_image_hostmem(pformat, c2d.width, c2d.height); >> if (res->hostmem + g->hostmem < g->conf.max_hostmem) { >> res->image = pixman_image_create_bits(pformat, >>c2d.width, >> @@ -1087,7 +1096,7 @@ static int virtio_gpu_load(QEMUFile *f, void *opaque, >> size_t size, >> return -EINVAL; >> } >> >> -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * res->width * >> res->height; >> +res->hostmem = calc_image_hostmem(pformat, res->width, res->height); >> >> res->addrs = g_new(uint64_t, res->iov_cnt); >> res->iov = g_new(struct iovec, res->iov_cnt); >> -- >> 2.15.0.403.gc27cc4dac6-goog >> > > looks good otherwise, > > Reviewed-by: Marc-André Lureau > >>
Re: [Qemu-devel] [PATCH] virtio-gpu: fix bug in host memory calculation.
On Thu, Nov 9, 2017 at 7:17 PM, Tao Wu via Qemu-devel wrote: > The old code treats bits as bytes when calculating host memory usage. > Change it to be consistent with allocation logic in pixman library. > > Signed-off-by: Tao Wu Reviewed-by: Marc-André Lureau > --- > hw/display/virtio-gpu.c | 16 ++-- > 1 file changed, 14 insertions(+), 2 deletions(-) > > diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c > index 43bbe09ea0..274e365713 100644 > --- a/hw/display/virtio-gpu.c > +++ b/hw/display/virtio-gpu.c > @@ -322,6 +322,18 @@ static pixman_format_code_t get_pixman_format(uint32_t > virtio_gpu_format) > } > } > > +static uint32_t calc_image_hostmem(pixman_format_code_t pformat, > + uint32_t width, uint32_t height) > +{ > +/* Copied from pixman/pixman-bits-image.c, skip integer overflow check. > + * pixman_image_create_bits will fail in case it overflows. > + */ > + > +int bpp = PIXMAN_FORMAT_BPP(pformat); > +int stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t); > +return height * stride; > +} > + > static void virtio_gpu_resource_create_2d(VirtIOGPU *g, >struct virtio_gpu_ctrl_command > *cmd) > { > @@ -366,7 +378,7 @@ static void virtio_gpu_resource_create_2d(VirtIOGPU *g, > return; > } > > -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * c2d.width * c2d.height; > +res->hostmem = calc_image_hostmem(pformat, c2d.width, c2d.height); > if (res->hostmem + g->hostmem < g->conf.max_hostmem) { > res->image = pixman_image_create_bits(pformat, >c2d.width, > @@ -1087,7 +1099,7 @@ static int virtio_gpu_load(QEMUFile *f, void *opaque, > size_t size, > return -EINVAL; > } > > -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * res->width * res->height; > +res->hostmem = calc_image_hostmem(pformat, res->width, res->height); > > res->addrs = g_new(uint64_t, res->iov_cnt); > res->iov = g_new(struct iovec, res->iov_cnt); > -- > 2.15.0.448.gf294e3d99a-goog > > -- Marc-André Lureau
[Qemu-devel] [PATCH] virtio-gpu: fix bug in host memory calculation.
The old code treats bits as bytes when calculating host memory usage. Change it to be consistent with allocation logic in pixman library. Signed-off-by: Tao Wu --- hw/display/virtio-gpu.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index 43bbe09ea0..274e365713 100644 --- a/hw/display/virtio-gpu.c +++ b/hw/display/virtio-gpu.c @@ -322,6 +322,18 @@ static pixman_format_code_t get_pixman_format(uint32_t virtio_gpu_format) } } +static uint32_t calc_image_hostmem(pixman_format_code_t pformat, + uint32_t width, uint32_t height) +{ +/* Copied from pixman/pixman-bits-image.c, skip integer overflow check. + * pixman_image_create_bits will fail in case it overflows. + */ + +int bpp = PIXMAN_FORMAT_BPP(pformat); +int stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t); +return height * stride; +} + static void virtio_gpu_resource_create_2d(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd) { @@ -366,7 +378,7 @@ static void virtio_gpu_resource_create_2d(VirtIOGPU *g, return; } -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * c2d.width * c2d.height; +res->hostmem = calc_image_hostmem(pformat, c2d.width, c2d.height); if (res->hostmem + g->hostmem < g->conf.max_hostmem) { res->image = pixman_image_create_bits(pformat, c2d.width, @@ -1087,7 +1099,7 @@ static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size, return -EINVAL; } -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * res->width * res->height; +res->hostmem = calc_image_hostmem(pformat, res->width, res->height); res->addrs = g_new(uint64_t, res->iov_cnt); res->iov = g_new(struct iovec, res->iov_cnt); -- 2.15.0.448.gf294e3d99a-goog
[Qemu-devel] [PULL 0/1] slirp update
warning: redirecting to https://people.debian.org/~sthibault/qemu.git/ The following changes since commit b0fbe46ad82982b289a44ee2495b59b0bad8a842: Update version for v2.11.0-rc0 release (2017-11-07 16:05:28 +) are available in the git repository at: http://people.debian.org/~sthibault/qemu.git tags/samuel-thibault for you to fetch changes up to 990132cda9baa27bdc558df6c9c15aacb0322d2c: slirp: don't zero the whole ti_i when m == NULL (2017-11-09 18:59:22 +0100) slirp updates Tao Wu (1): slirp: don't zero the whole ti_i when m == NULL This fixes the qemu slirp behavior in some rare cases (some RST cases, keep-alive probes) where packets would be sent to 0.0.0.0. slirp/tcp_subr.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-)
[Qemu-devel] [PULL 1/1] slirp: don't zero the whole ti_i when m == NULL
From: Tao Wu 98c63057d2144fb81681580cd84c13c93794c96e ('slirp: Factorizing tcpiphdr structure with an union') introduced a memset call to clear possibly-undefined fields in ti. This however overwrites src/dst/pr which are used below. So let us clear only the unused fields. This should fix some rare cases (some RST cases, keep-alive probes) where packets would be sent to 0.0.0.0. Signed-off-by: Tao Wu Signed-off-by: Samuel Thibault --- slirp/tcp_subr.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/slirp/tcp_subr.c b/slirp/tcp_subr.c index dc8b4bbb50..da0d53743f 100644 --- a/slirp/tcp_subr.c +++ b/slirp/tcp_subr.c @@ -148,7 +148,16 @@ tcp_respond(struct tcpcb *tp, struct tcpiphdr *ti, struct mbuf *m, m->m_data += IF_MAXLINKHDR; *mtod(m, struct tcpiphdr *) = *ti; ti = mtod(m, struct tcpiphdr *); - memset(&ti->ti, 0, sizeof(ti->ti)); + switch (af) { + case AF_INET: + ti->ti.ti_i4.ih_x1 = 0; + break; + case AF_INET6: + ti->ti.ti_i6.ih_x1 = 0; + break; + default: + g_assert_not_reached(); + } flags = TH_ACK; } else { /* -- 2.14.2
Re: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Hello, Tao Wu, on Wed, 08 Nov 2017 14:53:40 -0800, wrote: > The current code looks buggy, we zero ti_i while we access > ti_dst/ti_src later. Mmm, indeed, looking again at how it was introduced, it was too much. Samuel
Re: [Qemu-devel] [PATCH] blockdev-backup: enable non-root nodes for backup
On 09.11.2017 at 17:33, Eric Blake wrote: > On 11/09/2017 08:16 AM, Vladimir Sementsov-Ogievskiy wrote: > > This is needed to implement image-fleecing scheme, when we create > > a temporary node, mark our active node to be backing for the temp, > > and start backup(sync=none) from active node to the temp node. > > Temp node then represents a kind of snapshot and may be used > > for external backup through NBD. > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy > > --- > > > > What was the reason to abandon non-root nodes? > > I think the original restriction was that we didn't know all the > implications of having multiple readers to an intermediate node, so it > was easier to prevent it with plans to add it later than to add it up > front and deal with the fallout. But I think that now we are > sufficiently versed in handling BDS trees with multiple readers, with > proper op-blocking in place; so you are right that we can probably > support it now. Op blockers are actually the reason why I'm not so sure. The old blockers often still only work on the top level, and we haven't fully replaced them yet. In particular, graph modifications might not be protected well enough by the new system yet. But then, backup works only on a single node in the source tree, not on a whole subchain, so I don't see what other operation could conflict with a backup job. The only thing is resizes, but that is covered by the new system. So in the end, I think this specific change should be okay. > However, I'm a bit worried that there is no documentation change about > this in a .json QAPI file, nor any easy way to introspect via QMP > whether a particular qemu implementation supports this (short of trying > it and seeing whether it works). I'm also thinking that this is 2.12 > material, unless we can prove it is fixing a bug for 2.11 that was not > previously present. 
Sounds like another use case for the capability annotations that Markus wants to introduce for commands in the QAPI schema. Kevin signature.asc Description: PGP signature
[Qemu-devel] [PATCH V4] hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
Currently there is no MMIO range over 4G reserved for PCI hotplug. Since the 32bit PCI hole depends on the number of cold-plugged PCI devices and other factors, it is very possible it is too small to hotplug PCI devices with large BARs. Fix it by reserving 2G for the i440FX chipset in order to comply with older Win32 Guest OSes and 32G for the Q35 chipset. Even if the new defaults of pci-hole64-size will appear in "info qtree" also for older machines, the property was not implemented so no changes will be visible to guests. Note this is a regression since previous QEMU versions had some range reserved for 64bit PCI hotplug. Reviewed-by: Laszlo Ersek Reviewed-by: Gerd Hoffmann Signed-off-by: Marcel Apfelbaum --- V3 -> V4: - Addressed Laszlo's comments: - Added defines for pci-hole64 default size props. - Rounded the hole64_end to 1G - Moved some info to commit message - Addressed Michael's comments: - Added more comments. - I kept Gerd's "review-by" tag since no functional changes were made. V2 -> V3: - Addressed Gerd's and others' comments and re-enabled the pci-hole64-size property defaulting it to 2G for I440FX and 32G for Q35. - Even if the new defaults of pci-hole64-size will appear in "info qtree" also for older machines, the property was not implemented so no changes will be visible to guests. 
V1 -> V2: Addressed Igor's comments: - aligned the hole64 start to 1Gb (I think all the computations took care of it already, but it can't hurt) - Init compat props to "off" instead of "false" hw/i386/pc.c | 22 ++ hw/pci-host/piix.c| 32 ++-- hw/pci-host/q35.c | 35 --- include/hw/i386/pc.h | 10 +- include/hw/pci-host/q35.h | 1 + 5 files changed, 94 insertions(+), 6 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index e11a65b545..fafe5ba5cd 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1448,6 +1448,28 @@ void pc_memory_init(PCMachineState *pcms, pcms->ioapic_as = &address_space_memory; } +/* + * The 64bit pci hole starts after "above 4G RAM" and + * potentially the space reserved for memory hotplug. + */ +uint64_t pc_pci_hole64_start(void) +{ +PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); +PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms); +uint64_t hole64_start = 0; + +if (pcmc->has_reserved_memory && pcms->hotplug_memory.base) { +hole64_start = pcms->hotplug_memory.base; +if (!pcmc->broken_reserved_end) { +hole64_start += memory_region_size(&pcms->hotplug_memory.mr); +} +} else { +hole64_start = 0x100000000ULL + pcms->above_4g_mem_size; +} + +return ROUND_UP(hole64_start, 1ULL << 30); +} + qemu_irq pc_allocate_cpu_irq(void) { return qemu_allocate_irq(pic_irq_request, NULL, 0); diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index a7e2256870..f689c31d12 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -50,6 +50,7 @@ typedef struct I440FXState { PCIHostState parent_obj; Range pci_hole; uint64_t pci_hole64_size; +bool pci_hole64_fix; uint32_t short_root_bus; } I440FXState; @@ -112,6 +113,9 @@ struct PCII440FXState { #define I440FX_PAM_SIZE 7 #define I440FX_SMRAM0x72 +/* Keep it 2G to comply with older win32 guests */ +#define I440FX_PCI_HOST_HOLE64_SIZE_DEFAULT (1ULL << 31) + /* Older coreboot versions (4.0 and older) read a config register that doesn't * exist in real hardware, to get the RAM size from QEMU. 
*/ @@ -238,29 +242,52 @@ static void i440fx_pcihost_get_pci_hole_end(Object *obj, Visitor *v, visit_type_uint32(v, name, &value, errp); } +/* + * The 64bit PCI hole start is set by the Guest firmware + * as the address of the first 64bit PCI MEM resource. + * If no PCI device has resources on the 64bit area, + * the 64bit PCI hole will start after "over 4G RAM" and the + * reserved space for memory hotplug if any. + */ static void i440fx_pcihost_get_pci_hole64_start(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { PCIHostState *h = PCI_HOST_BRIDGE(obj); +I440FXState *s = I440FX_PCI_HOST_BRIDGE(obj); Range w64; uint64_t value; pci_bus_get_w64_range(h->bus, &w64); value = range_is_empty(&w64) ? 0 : range_lob(&w64); +if (!value && s->pci_hole64_fix) { +value = pc_pci_hole64_start(); +} visit_type_uint64(v, name, &value, errp); } +/* + * The 64bit PCI hole end is set by the Guest firmware + * as the address of the last 64bit PCI MEM resource. + * Then it is expanded to the PCI_HOST_PROP_PCI_HOLE64_SIZE + * that can be configured by the user. + */ static void i440fx_pcihost_get_pci_hole64_end(Object *obj, Visitor *v, const char
Re: [Qemu-devel] [Qemu-block] [PATCH v2] throttle: fix a qemu crash problem when calling blk_delete
On Tue, Oct 24, 2017 at 11:33:51AM +0800, sochin jiang wrote: > commit 7ca7f0 moves the throttling related part of the BDS life cycle > management to BlockBackend, adds call to > throttle_timers_detach_aio_context in blk_remove_bs. commit 1606e > removes a block device from its throttle group in blk_delete by calling > blk_io_limits_disable, this fixes an easily reproducible qemu crash. But > deleting a BB without a BDS inserted could easily cause a qemu crash too > by calling bdrv_drained_begin in blk_io_limits_disable. Say, a simple > drive_add and then a drive_del command. > > This patch removes draining BDS by calling throttle_group_unregister_tgm > directly instead of blk_io_limits_disable, leaves draining operation to > blk_remove_bs in case that there is no BDS inserted. Furthermore, make sure > throttle timers are initialized or attached before throttle_timers_destroy > is called in throttle_group_unregister_tgm. > > Signed-off-by: sochin jiang > --- > block/block-backend.c | 2 +- > block/throttle-groups.c | 4 +++- > 2 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/block/block-backend.c b/block/block-backend.c > index 45d9101..39c7cca 100644 > --- a/block/block-backend.c > +++ b/block/block-backend.c > @@ -341,7 +341,7 @@ static void blk_delete(BlockBackend *blk) > assert(!blk->name); > assert(!blk->dev); > if (blk->public.throttle_group_member.throttle_state) { > -blk_io_limits_disable(blk); > +throttle_group_unregister_tgm(&blk->public.throttle_group_member); The following assertions fail without the drain when there are pending requests: void throttle_group_unregister_tgm(ThrottleGroupMember *tgm) { ThrottleState *ts = tgm->throttle_state; ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts); ThrottleGroupMember *token; int i; if (!ts) { /* Discard already unregistered tgm */ return; } assert(tgm->pending_reqs[0] == 0 && tgm->pending_reqs[1] == 0); assert(qemu_co_queue_empty(&tgm->throttled_reqs[0])); assert(qemu_co_queue_empty(&tgm->throttled_reqs[1])); A 
safer approach is making blk_io_limits_disable(blk) skip the draining when blk_bs(blk) == NULL. That is the only case where we are 100% sure that there are no pending requests. signature.asc Description: PGP signature
[Qemu-devel] [PULL 8/8] nbd/server: Fix structured read of length 0
The NBD spec was recently clarified to state that a read of length 0 should not be attempted by a compliant client; but that a server must still handle it correctly in an unspecified manner (that is, either a successful no-op or an error reply, but not a crash) [1]. However, it also implies that NBD_REPLY_TYPE_OFFSET_DATA must have a non-zero payload length, but our existing code was replying with a chunk that a picky client could reject as invalid because it was missing a payload (our own client implementation was recently patched to be that picky, after first fixing it to not send 0-length requests). We are already doing successful no-ops for 0-length writes and for non-structured reads; so for consistency, we want structured reply reads to also be a no-op. The easiest way to do this is to return a NBD_REPLY_TYPE_NONE chunk; this is best done via a new helper function (especially since future patches for other structured replies may benefit from using the same helper). [1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037 Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-8-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- nbd/server.c | 21 - nbd/trace-events | 1 + 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/nbd/server.c b/nbd/server.c index 6ebb7d9c2e..df771fd42f 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1273,6 +1273,21 @@ static inline void set_be_chunk(NBDStructuredReplyChunk *chunk, uint16_t flags, stl_be_p(&chunk->length, length); } +static int coroutine_fn nbd_co_send_structured_done(NBDClient *client, +uint64_t handle, +Error **errp) +{ +NBDStructuredReplyChunk chunk; +struct iovec iov[] = { +{.iov_base = &chunk, .iov_len = sizeof(chunk)}, +}; + +trace_nbd_co_send_structured_done(handle); +set_be_chunk(&chunk, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE, handle, 0); + +return nbd_co_send_iov(client, iov, 1, errp); +} + static int coroutine_fn nbd_co_send_structured_read(NBDClient *client, uint64_t handle, uint64_t offset, 
@@ -1286,6 +1301,7 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client, {.iov_base = data, .iov_len = size} }; +assert(size); trace_nbd_co_send_structured_read(handle, offset, data, size); set_be_chunk(&chunk, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_OFFSET_DATA, handle, sizeof(chunk) - sizeof(chunk.h) + size); @@ -1544,10 +1560,13 @@ reply: if (ret < 0) { ret = nbd_co_send_structured_error(req->client, request.handle, -ret, msg, &local_err); -} else { +} else if (reply_data_len) { ret = nbd_co_send_structured_read(req->client, request.handle, request.from, req->data, reply_data_len, &local_err); +} else { +ret = nbd_co_send_structured_done(req->client, request.handle, + &local_err); } } else { ret = nbd_co_send_simple_reply(req->client, request.handle, diff --git a/nbd/trace-events b/nbd/trace-events index bbc75f6414..92568edce5 100644 --- a/nbd/trace-events +++ b/nbd/trace-events @@ -55,6 +55,7 @@ nbd_receive_request(uint32_t magic, uint16_t flags, uint16_t type, uint64_t from nbd_blk_aio_attached(const char *name, void *ctx) "Export %s: Attaching clients to AIO context %p\n" nbd_blk_aio_detach(const char *name, void *ctx) "Export %s: Detaching clients from AIO context %p\n" nbd_co_send_simple_reply(uint64_t handle, uint32_t error, const char *errname, int len) "Send simple reply: handle = %" PRIu64 ", error = %" PRIu32 " (%s), len = %d" +nbd_co_send_structured_done(uint64_t handle) "Send structured reply done: handle = %" PRIu64 nbd_co_send_structured_read(uint64_t handle, uint64_t offset, void *data, size_t size) "Send structured read data reply: handle = %" PRIu64 ", offset = %" PRIu64 ", data = %p, len = %zu" nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'" nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)" -- 2.13.6
[Qemu-devel] [PULL 5/8] nbd: Fix struct name for structured reads
A closer read of the NBD spec shows that a structured reply chunk for a hole is not quite identical to the prefix of a data chunk, because the hole has to also send a 32-bit size field. Although we do not yet send holes, we should fix the misleading information in our header and make it easier for a future patch to support sparse reads. Messed up in commit bae245d1. Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-5-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- include/block/nbd.h | 18 +- nbd/server.c| 2 +- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index 92d1723d7c..113c707a5e 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -86,15 +86,23 @@ typedef union NBDReply { } QEMU_PACKED; } NBDReply; -/* Header of NBD_REPLY_TYPE_OFFSET_DATA, complete NBD_REPLY_TYPE_OFFSET_HOLE */ -typedef struct NBDStructuredRead { -NBDStructuredReplyChunk h; +/* Header of chunk for NBD_REPLY_TYPE_OFFSET_DATA */ +typedef struct NBDStructuredReadData { +NBDStructuredReplyChunk h; /* h.length >= 9 */ uint64_t offset; -} QEMU_PACKED NBDStructuredRead; +/* At least one byte of data payload follows, calculated from h.length */ +} QEMU_PACKED NBDStructuredReadData; + +/* Complete chunk for NBD_REPLY_TYPE_OFFSET_HOLE */ +typedef struct NBDStructuredReadHole { +NBDStructuredReplyChunk h; /* h.length == 12 */ +uint64_t offset; +uint32_t length; +} QEMU_PACKED NBDStructuredReadHole; /* Header of all NBD_REPLY_TYPE_ERROR* errors */ typedef struct NBDStructuredError { -NBDStructuredReplyChunk h; +NBDStructuredReplyChunk h; /* h.length >= 6 */ uint32_t error; uint16_t message_length; } QEMU_PACKED NBDStructuredError; diff --git a/nbd/server.c b/nbd/server.c index bcf0cdb47c..6ebb7d9c2e 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1280,7 +1280,7 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client, size_t size, Error **errp) { -NBDStructuredRead chunk; +NBDStructuredReadData chunk; struct iovec iov[] = { {.iov_base = &chunk, .iov_len = sizeof(chunk)}, {.iov_base = data, .iov_len = size} -- 2.13.6
[Qemu-devel] [PULL 6/8] nbd-client: Short-circuit 0-length operations
The NBD spec was recently clarified to state that clients should not send 0-length requests to the server, as the server behavior is undefined [1]. We know that qemu-nbd's behavior is a successful no-op (once it has filtered for read-only exports), but other NBD implementations might return an error. To avoid any questionable server implementations, it is better to just short-circuit such requests on the client side (we are relying on the block layer to already filter out requests such as invalid offset, write to a read-only volume, and so forth); do the short-circuit as late as possible to still benefit from protections from assertions that the block layer is not violating our assumptions. [1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037 Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-6-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index daa4392531..0a675d0fab 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -674,6 +674,9 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, assert(bytes <= NBD_MAX_BUFFER_SIZE); assert(!flags); +if (!bytes) { +return 0; +} ret = nbd_co_send_request(bs, &request, NULL); if (ret < 0) { return ret; @@ -705,6 +708,9 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, assert(bytes <= NBD_MAX_BUFFER_SIZE); +if (!bytes) { +return 0; +} return nbd_co_request(bs, &request, qiov); } @@ -731,6 +737,9 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, request.flags |= NBD_CMD_FLAG_NO_HOLE; } +if (!bytes) { +return 0; +} return nbd_co_request(bs, &request, NULL); } @@ -759,7 +768,7 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) }; assert(!(client->info.flags & NBD_FLAG_READ_ONLY)); -if (!(client->info.flags & NBD_FLAG_SEND_TRIM)) { +if (!(client->info.flags & NBD_FLAG_SEND_TRIM) || !bytes) { return 0; } -- 
2.13.6
[Qemu-devel] [PULL 4/8] nbd/client: Nicer trace of structured reply
It's useful to know which structured reply chunk is being processed. Missed in commit d2febedb. Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-4-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- nbd/client.c | 4 +++- nbd/trace-events | 2 +- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/nbd/client.c b/nbd/client.c index 3d680e63e1..1880103d2a 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -979,6 +979,7 @@ static int nbd_receive_structured_reply_chunk(QIOChannel *ioc, int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) { int ret; +const char *type; ret = nbd_read_eof(ioc, &reply->magic, sizeof(reply->magic), errp); if (ret <= 0) { @@ -1008,8 +1009,9 @@ int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) if (ret < 0) { break; } +type = nbd_reply_type_lookup(reply->structured.type); trace_nbd_receive_structured_reply_chunk(reply->structured.flags, - reply->structured.type, + reply->structured.type, type, reply->structured.handle, reply->structured.length); break; diff --git a/nbd/trace-events b/nbd/trace-events index 4a13757524..bbc75f6414 100644 --- a/nbd/trace-events +++ b/nbd/trace-events @@ -27,7 +27,7 @@ nbd_client_clear_queue(void) "Clearing NBD queue" nbd_client_clear_socket(void) "Clearing NBD socket" nbd_send_request(uint64_t from, uint32_t len, uint64_t handle, uint16_t flags, uint16_t type, const char *name) "Sending request to server: { .from = %" PRIu64", .len = %" PRIu32 ", .handle = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) }" nbd_receive_simple_reply(int32_t error, const char *errname, uint64_t handle) "Got simple reply: { .error = %" PRId32 " (%s), handle = %" PRIu64" }" -nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, uint64_t handle, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d, handle = %" PRIu64 ", length = %" PRIu32 " }" +nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, const char *name, 
uint64_t handle, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d (%s), handle = %" PRIu64 ", length = %" PRIu32 " }" # nbd/common.c nbd_unknown_error(int err) "Squashing unexpected error %d to EINVAL" -- 2.13.6
[Qemu-devel] [PULL 1/8] nbd/server: fix nbd_negotiate_handle_info
From: Vladimir Sementsov-Ogievskiy namelen should be here, length is unrelated, and always 0 at this point. Broken since its introduction in commit f37708f6, but mostly harmless (replying with '' as the name does not violate protocol, and does not confuse qemu as the nbd client since our implementation does not ask for the name; but might confuse some other client that does ask for the name especially if the default export is different than the export name being queried). Adding an assert makes it obvious that we are not skipping any bytes in the client's message, as well as making it obvious that we were using the wrong variable. Signed-off-by: Vladimir Sementsov-Ogievskiy CC: qemu-sta...@nongnu.org Message-Id: <20171101154204.27146-1-vsement...@virtuozzo.com> [eblake: improve commit message, squash in assert addition] Signed-off-by: Eric Blake --- nbd/server.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/nbd/server.c b/nbd/server.c index 70b40ed27e..bcf0cdb47c 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -423,6 +423,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint32_t length, break; } } +assert(length == 0); exp = nbd_export_find(name); if (!exp) { @@ -433,7 +434,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint32_t length, /* Don't bother sending NBD_INFO_NAME unless client requested it */ if (sendname) { -rc = nbd_negotiate_send_info(client, opt, NBD_INFO_NAME, length, name, +rc = nbd_negotiate_send_info(client, opt, NBD_INFO_NAME, namelen, name, errp); if (rc < 0) { return rc; -- 2.13.6
[Qemu-devel] [PULL 7/8] nbd-client: Stricter enforcing of structured reply spec
Ensure that the server is not sending unexpected chunk lengths for either the NONE or the OFFSET_DATA chunk, nor unexpected hole length for OFFSET_HOLE. This will flag any server as broken that responds to a zero-length read with an OFFSET_DATA (what our server currently does, but that's about to be fixed) or with OFFSET_HOLE, even though we previously fixed our client to never be able to send such a request over the wire. Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-7-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index 0a675d0fab..bcfed0133d 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -216,7 +216,7 @@ static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk, offset = payload_advance64(&payload); hole_size = payload_advance32(&payload); -if (offset < orig_offset || hole_size > qiov->size || +if (!hole_size || offset < orig_offset || hole_size > qiov->size || offset > orig_offset + qiov->size - hole_size) { error_setg(errp, "Protocol error: server sent chunk exceeding requested" " region"); @@ -281,7 +281,8 @@ static int nbd_co_receive_offset_data_payload(NBDClientSession *s, assert(nbd_reply_is_structured(&s->reply)); -if (chunk->length < sizeof(offset)) { +/* The NBD spec requires at least one byte of payload */ +if (chunk->length <= sizeof(offset)) { error_setg(errp, "Protocol error: invalid payload for " "NBD_REPLY_TYPE_OFFSET_DATA"); return -EINVAL; @@ -293,6 +294,7 @@ static int nbd_co_receive_offset_data_payload(NBDClientSession *s, be64_to_cpus(&offset); data_size = chunk->length - sizeof(offset); +assert(data_size); if (offset < orig_offset || data_size > qiov->size || offset > orig_offset + qiov->size - data_size) { error_setg(errp, "Protocol error: server sent chunk exceeding requested" @@ -411,6 +413,11 @@ static coroutine_fn int nbd_co_do_receive_one_chunk( " NBD_REPLY_FLAG_DONE flag set"); 
return -EINVAL; } +if (chunk->length) { +error_setg(errp, "Protocol error: NBD_REPLY_TYPE_NONE chunk with" + " nonzero length"); +return -EINVAL; +} return 0; } -- 2.13.6
[Qemu-devel] [PULL 3/8] nbd-client: Refuse read-only client with BDRV_O_RDWR
The NBD spec says that clients should not try to write/trim to an export advertised as read-only by the server. But we failed to check that, and would allow the block layer to use NBD with BDRV_O_RDWR even when the server is read-only, which meant we were depending on the server sending a proper EPERM failure for various commands, and also exposes a leaky abstraction: using qemu-io in read-write mode would succeed on 'w -z 0 0' because of local short-circuiting logic, but 'w 0 0' would send a request over the wire (where it then depends on the server, and fails at least for qemu-nbd but might pass for other NBD implementations). With this patch, a client MUST request read-only mode to access a server that is doing a read-only export, or else it will get a message like: can't open device nbd://localhost:10809/foo: request for write access conflicts with read-only export It is no longer possible to even attempt writes over the wire (including the corner case of 0-length writes), because the block layer enforces the explicit read-only request; this matches the behavior of qcow2 when backed by a read-only POSIX file. Fix several iotests to comply with the new behavior (since qemu-nbd of an internal snapshot, as well as nbd-server-add over QMP, default to a read-only export, we must tell blockdev-add/qemu-io to set up a read-only client). 
CC: qemu-sta...@nongnu.org Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-3-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 9 + tests/qemu-iotests/058 | 8 tests/qemu-iotests/140 | 4 ++-- tests/qemu-iotests/147 | 1 + 4 files changed, 16 insertions(+), 6 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index de6c153328..daa4392531 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -697,6 +697,7 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, .len = bytes, }; +assert(!(client->info.flags & NBD_FLAG_READ_ONLY)); if (flags & BDRV_REQ_FUA) { assert(client->info.flags & NBD_FLAG_SEND_FUA); request.flags |= NBD_CMD_FLAG_FUA; @@ -717,6 +718,7 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, .len = bytes, }; +assert(!(client->info.flags & NBD_FLAG_READ_ONLY)); if (!(client->info.flags & NBD_FLAG_SEND_WRITE_ZEROES)) { return -ENOTSUP; } @@ -756,6 +758,7 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) .len = bytes, }; +assert(!(client->info.flags & NBD_FLAG_READ_ONLY)); if (!(client->info.flags & NBD_FLAG_SEND_TRIM)) { return 0; } @@ -814,6 +817,12 @@ int nbd_client_init(BlockDriverState *bs, logout("Failed to negotiate with the NBD server\n"); return ret; } +if (client->info.flags & NBD_FLAG_READ_ONLY && +!bdrv_is_read_only(bs)) { +error_setg(errp, + "request for write access conflicts with read-only export"); +return -EACCES; +} if (client->info.flags & NBD_FLAG_SEND_FUA) { bs->supported_write_flags = BDRV_REQ_FUA; bs->supported_zero_flags |= BDRV_REQ_FUA; diff --git a/tests/qemu-iotests/058 b/tests/qemu-iotests/058 index 2253c6a6d1..5eb8784669 100755 --- a/tests/qemu-iotests/058 +++ b/tests/qemu-iotests/058 @@ -117,15 +117,15 @@ _export_nbd_snapshot sn1 echo echo "== verifying the exported snapshot with patterns, method 1 ==" -$QEMU_IO_NBD -c 'read -P 0xa 0x1000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io -$QEMU_IO_NBD -c 
'read -P 0xb 0x2000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io +$QEMU_IO_NBD -r -c 'read -P 0xa 0x1000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io +$QEMU_IO_NBD -r -c 'read -P 0xb 0x2000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io _export_nbd_snapshot1 sn1 echo echo "== verifying the exported snapshot with patterns, method 2 ==" -$QEMU_IO_NBD -c 'read -P 0xa 0x1000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io -$QEMU_IO_NBD -c 'read -P 0xb 0x2000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io +$QEMU_IO_NBD -r -c 'read -P 0xa 0x1000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io +$QEMU_IO_NBD -r -c 'read -P 0xb 0x2000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io $QEMU_IMG convert "$TEST_IMG" -l sn1 -O qcow2 "$converted_image" diff --git a/tests/qemu-iotests/140 b/tests/qemu-iotests/140 index f89d0d6789..a8fc95145c 100755 --- a/tests/qemu-iotests/140 +++ b/tests/qemu-iotests/140 @@ -78,7 +78,7 @@ _send_qemu_cmd $QEMU_HANDLE \ 'arguments': { 'device': 'drv' }}" \ 'return' -$QEMU_IO_PROG -f raw -c 'read -P 42 0 64k' \ +$QEMU_IO_PROG -f raw -r -c 'read -P 42 0 64k' \ "nbd+unix:///drv?socket=$TEST_DIR/nbd" 2>&1 \ | _filter_qemu_io | _filter_nbd @@ -87,7 +87,7 @@ _send_qemu_cmd $QEMU_HANDLE \ 'arguments': { 'device': 'drv' }}" \ 'return' -$QEMU_IO_PROG -f raw -c close
[Qemu-devel] [PULL 0/8] NBD patches for 2.11-rc1
The following changes since commit b0fbe46ad82982b289a44ee2495b59b0bad8a842: Update version for v2.11.0-rc0 release (2017-11-07 16:05:28 +) are available in the git repository at: git://repo.or.cz/qemu/ericb.git tags/pull-nbd-2017-11-09 for you to fetch changes up to ef8c887ee01a4e4c8c5c28c86ea5b45162c51bcd: nbd/server: Fix structured read of length 0 (2017-11-09 10:25:11 -0600) nbd patches for 2017-11-09 - Vladimir Sementsov-Ogievskiy: nbd/server: fix nbd_negotiate_handle_info - Eric Blake: 0/7 various NBD fixes for 2.11 Eric Blake (7): nbd-client: Fix error message typos nbd-client: Refuse read-only client with BDRV_O_RDWR nbd/client: Nicer trace of structured reply nbd: Fix struct name for structured reads nbd-client: Short-circuit 0-length operations nbd-client: Stricter enforcing of structured reply spec nbd/server: Fix structured read of length 0 Vladimir Sementsov-Ogievskiy (1): nbd/server: fix nbd_negotiate_handle_info include/block/nbd.h| 18 +- block/nbd-client.c | 37 +++-- nbd/client.c | 4 +++- nbd/server.c | 26 +++--- nbd/trace-events | 3 ++- tests/qemu-iotests/058 | 8 tests/qemu-iotests/140 | 4 ++-- tests/qemu-iotests/147 | 1 + 8 files changed, 79 insertions(+), 22 deletions(-) -- 2.13.6
[Qemu-devel] [PULL 2/8] nbd-client: Fix error message typos
Provide missing spaces that are required when using string concatenation to break error messages across source lines. Introduced in commit f140e300. Signed-off-by: Eric BlakeMessage-Id: <20171108215703.9295-2-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index b44d4d4a01..de6c153328 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -248,7 +248,7 @@ static int nbd_parse_error_payload(NBDStructuredReplyChunk *chunk, error = nbd_errno_to_system_errno(payload_advance32()); if (error == 0) { -error_setg(errp, "Protocol error: server sent structured error chunk" +error_setg(errp, "Protocol error: server sent structured error chunk " "with error = 0"); return -EINVAL; } @@ -257,7 +257,7 @@ static int nbd_parse_error_payload(NBDStructuredReplyChunk *chunk, message_size = payload_advance16(); if (message_size > chunk->length - sizeof(error) - sizeof(message_size)) { -error_setg(errp, "Protocol error: server sent structured error chunk" +error_setg(errp, "Protocol error: server sent structured error chunk " "with incorrect message size"); return -EINVAL; } @@ -408,7 +408,7 @@ static coroutine_fn int nbd_co_do_receive_one_chunk( if (chunk->type == NBD_REPLY_TYPE_NONE) { if (!(chunk->flags & NBD_REPLY_FLAG_DONE)) { error_setg(errp, "Protocol error: NBD_REPLY_TYPE_NONE chunk without" - "NBD_REPLY_FLAG_DONE flag set"); + " NBD_REPLY_FLAG_DONE flag set"); return -EINVAL; } return 0; -- 2.13.6
Re: [Qemu-devel] [Qemu-block] [PATCH] block: all I/O should be completed before removing throttle timers.
On Sat, Oct 21, 2017 at 01:34:00PM +0800, Zhengui Li wrote: > From: Zhengui > > In blk_remove_bs, all I/O should be completed before removing throttle > timers. If there is inflight I/O, removing the throttle timers here will > cause the inflight I/O to never return. > This patch adds bdrv_drained_begin before throttle_timers_detach_aio_context > to let all I/O complete before removing the throttle timers. > > Signed-off-by: Zhengui > --- > block/block-backend.c | 4 > 1 file changed, 4 insertions(+) Reviewed-by: Stefan Hajnoczi
Re: [Qemu-devel] [PATCH 3/7] s390x/pci: rework PCI LOAD
On Tue, 7 Nov 2017 18:24:35 +0100 Pierre Morel wrote: > Enhance the fault detection and correct the fault reporting. Basically the same comments as for the previous patch (but looks good in general). > > Signed-off-by: Pierre Morel > Reviewed-by: Yi Min Zhao > --- > hw/s390x/s390-pci-inst.c | 27 --- > 1 file changed, 16 insertions(+), 11 deletions(-)
Re: [Qemu-devel] [PATCH 2/7] s390x/pci: rework PCI STORE
On Tue, 7 Nov 2017 18:24:34 +0100 Pierre Morelwrote: > Enhance the fault detection, correction of the fault reporting. > > Signed-off-by: Pierre Morel > Reviewed-by: Yi Min Zhao > --- > hw/s390x/s390-pci-inst.c | 41 - > 1 file changed, 24 insertions(+), 17 deletions(-) > > diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > index 8fcb02d..4a2f996 100644 > --- a/hw/s390x/s390-pci-inst.c > +++ b/hw/s390x/s390-pci-inst.c > @@ -469,6 +469,12 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > pcias = (env->regs[r2] >> 16) & 0xf; > len = env->regs[r2] & 0xf; > offset = env->regs[r2 + 1]; > +data = env->regs[r1]; > + > +if (!(fh & FH_MASK_ENABLE)) { This covers the reserved/standby/disabled states, right? > +setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); > +return 0; > +} > > pbdev = s390_pci_find_dev_by_fh(s390_get_phb(), fh); > if (!pbdev) { > @@ -478,12 +484,7 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > } > > switch (pbdev->state) { > -case ZPCI_FS_RESERVED: > -case ZPCI_FS_STANDBY: > -case ZPCI_FS_DISABLED: > case ZPCI_FS_PERMANENT_ERROR: > -setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); > -return 0; > case ZPCI_FS_ERROR: > setcc(cpu, ZPCI_PCI_LS_ERR); > s390_set_status_code(env, r2, ZPCI_PCI_ST_BLOCKED); > @@ -492,9 +493,13 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > break; > } > > -data = env->regs[r1]; > -if (pcias < 6) { > -if ((8 - (offset & 0x7)) < len) { > +/* Test that the pcias is a valid pcias and do the according operations > */ s/according/appropriate/ ? > +switch (pcias) { > +case 0 ... 
5: > +/* Check length: > + * A length of 0 is invalid and length should not cross a double word > + */ > +if (!len || (len > (8 - (offset & 0x7 { > program_interrupt(env, PGM_OPERAND, 4); > return 0; > } > @@ -512,21 +517,23 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > program_interrupt(env, PGM_OPERAND, 4); > return 0; > } > -} else if (pcias == 15) { > -if ((4 - (offset & 0x3)) < len) { > -program_interrupt(env, PGM_OPERAND, 4); > -return 0; > -} > - > -if (zpci_endian_swap(, len)) { > +break; > +case 15: > +/* ZPCI uses the pseudo BAR number 15 as configuration space */ > +/* possible access length are 1,2,4 and must not cross a word */ s/length/lengths/ > +if (!len || (len > (4 - (offset & 0x3))) || len == 3) { > program_interrupt(env, PGM_OPERAND, 4); > return 0; > } > - > +/* len = 1,2,4 so we do not need to test */ > +zpci_endian_swap(, len); > pci_host_config_write_common(pbdev->pdev, offset, > pci_config_size(pbdev->pdev), > data, len); > -} else { > +break; > +case PCI_ROM_SLOT: So, will this be filled in a later patch? (Reading from the top.) > +/* Fall through */ > +default: > DPRINTF("pcistg invalid space\n"); > setcc(cpu, ZPCI_PCI_LS_ERR); > s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS); On the whole, I think this improves readability, especially for folks that do not have access to the spec.
Re: [Qemu-devel] [PATCH V5] hw/pcie-pci-bridge: restrict to X86 and ARM
Hi Marcel, Cc'ing Paul and Yongbok Since I'm not sure their Boston board could also use it. On 11/09/2017 12:46 PM, Marcel Apfelbaum wrote: > The PCIE-PCI bridge is specific to "pure" PCIe systems > (on QEMU we have X86 and ARM), it does not make sense to > have it in other archs. > > Reported-by: Thomas Huth> Signed-off-by: Marcel Apfelbaum > --- > > V4 -> V5 > - Since all other tries failed, conditioned the > device on the PCIe Root Port. > > V3 -> V4: > - Move the config line to pci.mak (Thomas) > > V2 -> V3: > - Another tweak in subject s/if/it (Cornelia) > > V1 -> V2: > Addressed Thomas and Cornelia comments: > - Conditioned the pcie-pci-bridge compilation on >the PCIe Root CONFIG_PCIE_PORT > - Tweaked subject PCI -> PCIe > > Thanks, > Marcel > > > hw/pci-bridge/Makefile.objs | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs > index 666db37da2..1b05023662 100644 > --- a/hw/pci-bridge/Makefile.objs > +++ b/hw/pci-bridge/Makefile.objs > @@ -1,5 +1,5 @@ > -common-obj-y += pci_bridge_dev.o pcie_pci_bridge.o > -common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o > +common-obj-y += pci_bridge_dev.o > +common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o > pcie_pci_bridge.o KISS :) Reviewed-by: Philippe Mathieu-Daudé Tested-by: Philippe Mathieu-Daudé > common-obj-$(CONFIG_PXB) += pci_expander_bridge.o > common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o > common-obj-$(CONFIG_IOH3420) += ioh3420.o >
Re: [Qemu-devel] [PATCH 1/7] s390x/pci: factor out endianess conversion
On Tue, 7 Nov 2017 18:24:33 +0100 Pierre Morelwrote: > There are two places where the same endianness conversion > is done. > Let's factor this out into a static function. > > Signed-off-by: Pierre Morel > Reviewed-by: Yi Min Zhao > --- > hw/s390x/s390-pci-inst.c | 58 > ++-- > 1 file changed, 32 insertions(+), 26 deletions(-) > > diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > index 8e088f3..8fcb02d 100644 > --- a/hw/s390x/s390-pci-inst.c > +++ b/hw/s390x/s390-pci-inst.c > @@ -314,6 +314,35 @@ out: > return 0; > } > > +/** > + * This function swaps the data at ptr according from one > + * endianness to the other. > + * valid data in the uint64_t data field. I'm not sure what that line is supposed to mean? > + * @ptr: a pointer to a uint64_t data field > + * @len: the length of the valid data, must be 1,2,4 or 8 > + */ > +static int zpci_endian_swap(uint64_t *ptr, uint8_t len) > +{ > +uint64_t data = *ptr; > +switch (len) { > +case 1: > +break; > +case 2: > +data = bswap16(data); > +break; > +case 4: > +data = bswap32(data); > +break; > +case 8: > +data = bswap64(data); > +break; > +default: > +return -EINVAL; > +} > +*ptr = data; > +return 0; > +} > + I was expecting more code to use a similar pattern, but it seems surprisingly uncommon.
Re: [Qemu-devel] [PATCH V5] hw/pcie-pci-bridge: restrict to X86 and ARM
On 09.11.2017 16:46, Marcel Apfelbaum wrote: > The PCIE-PCI bridge is specific to "pure" PCIe systems > (on QEMU we have X86 and ARM), it does not make sense to > have it in other archs. > > Reported-by: Thomas Huth> Signed-off-by: Marcel Apfelbaum > --- > > V4 -> V5 > - Since all other tries failed, conditioned the > device on the PCIe Root Port. > > V3 -> V4: > - Move the config line to pci.mak (Thomas) > > V2 -> V3: > - Another tweak in subject s/if/it (Cornelia) > > V1 -> V2: > Addressed Thomas and Cornelia comments: > - Conditioned the pcie-pci-bridge compilation on >the PCIe Root CONFIG_PCIE_PORT > - Tweaked subject PCI -> PCIe > > Thanks, > Marcel > > > hw/pci-bridge/Makefile.objs | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs > index 666db37da2..1b05023662 100644 > --- a/hw/pci-bridge/Makefile.objs > +++ b/hw/pci-bridge/Makefile.objs > @@ -1,5 +1,5 @@ > -common-obj-y += pci_bridge_dev.o pcie_pci_bridge.o > -common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o > +common-obj-y += pci_bridge_dev.o > +common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o > pcie_pci_bridge.o > common-obj-$(CONFIG_PXB) += pci_expander_bridge.o > common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o > common-obj-$(CONFIG_IOH3420) += ioh3420.o Reviewed-by: Thomas Huth
Re: [Qemu-devel] [PATCH] blockdev-backup: enable non-root nodes for backup
On 11/09/2017 08:16 AM, Vladimir Sementsov-Ogievskiy wrote: > This is needed to implement image-fleecing scheme, when we create > a temporary node, mark our active node to be backing for the temp, > and start backup(sync=none) from active node to the temp node. > Temp node then represents a kind of snapshot and may be used > for external backup through NBD. > > Signed-off-by: Vladimir Sementsov-Ogievskiy> --- > > What was the reason to abandon non-root nodes? I think the original restriction was that we didn't know all the implications to having multiple readers to an intermediate node, so it was easier to prevent it with plans to add it later than to add it up front and deal with the fallout. But I think that now we are sufficiently versed in handling BDS trees with multiple readers, with proper op-blocking in place; so you are right that we can probably support it now. However, I'm a bit worried that there is no documentation change about this in a .json QAPI file, nor any easy way to introspect via QMP whether a particular qemu implementation supports this (short of trying it and seeing whether it works). I'm also thinking that this is 2.12 material, unless we can prove it is fixing a bug for 2.11 that was not previously present. > > blockdev.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/blockdev.c b/blockdev.c > index 56a6b24a0b..d0a5a107f0 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -3345,7 +3345,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, > BlockJobTxn *txn, > backup->compress = false; > } > > -bs = qmp_get_root_bs(backup->device, errp); > +bs = bdrv_lookup_bs(backup->device, backup->device, errp); > if (!bs) { > return NULL; > } > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [Qemu-block] segfault in parallel blockjobs (iotest 30)
On Wed 08 Nov 2017 03:45:38 PM CET, Alberto Garcia wrote: > I'm thinking that perhaps we should add the pause point directly to > block_job_defer_to_main_loop(), to prevent any block job from running > the exit function when it's paused. I was trying this and unfortunately this breaks the mirror job at least (reproduced with a simple block-commit on the topmost node, which uses commit_active_start() -> mirror_start_job()). So what happens is that mirror_run() always calls bdrv_drained_begin() before returning, pausing the block job. The closing bdrv_drained_end() call is at the end of mirror_exit(), already in the main loop. So the mirror job is always calling block_job_defer_to_main_loop() and mirror_exit() when the job is paused. Berto
Re: [Qemu-devel] [Qemu-block] Drainage in bdrv_replace_child_noperm()
On 08.11.2017 21:16, Max Reitz wrote: > On 2017-11-07 15:22, Kevin Wolf wrote: > > I think the issue is much simpler, even though it still has two parts. > > It's the old story of bdrv_drain mixing two separate concepts: > > > > 1. Wait synchronously for the completion of all my requests to this > >node. This needs to be propagated down the graph to the children. > > So, flush without flushing protocol drivers? :-) No, flush is a completely different thing. Drain is about completing all in-flight requests, whereas flush is about writing out all caches. You can do either one without the other. In particular, a flush doesn't involve a drain; you can still issue requests while a flush request is in flight. The semantics are that a flush makes sure that all requests which had already completed when the flush request was started are stable on disk. Later requests may or may not be stable on disk yet. > > 2. Make sure that nobody else sends new requests to this node. This > >needs to be propagated up the graph to the parents. > > > > Some callers want only 1. (usually callers of bdrv_drain_all() without a > > begin/end pair), some callers want both 1. and 2. (that's the begin/end > > construction for getting exclusive access). Not sure if there are > > callers that want only 2., but possibly. > > > > If we actually take this separation seriously, the first step to a fix > > could be that BdrvChildRole.drained_begin shouldn't be propagated to the > > children. > > That seems good to me, but considering that the default was to just > propagate it to every child, I thought that I was missing something. bdrv_child_cb_drained_begin() calls bdrv_drained_begin() to wait for the completion of all running requests on its current node. The propagation to every child is an unwanted side effect, but so far it didn't seem to hurt anyone, so we didn't care about preventing it.
> > We may still need to drain the requests at the node itself: > > > Imagine a drained backing file of a qcow2 node. Making sure that the qcow2 > > node doesn't get new requests isn't enough, we also must make sure that > > in-flight requests don't access the backing file any more. This means > > draining the qcow2 node, though we don't care whether its child nodes > > still have requests in flight. > > If you want to stop the qcow2 node from generating further requests, you > can either flush them (as you said) or pause them (whatever that means...). Pausing is probably way too complicated. qcow2 never issues requests by itself. It only has requests running if someone else has a request running on the qcow2 node. So it's enough to just drain the request queue of the qcow2 node to get to the state we want. The only thing we need to make sure is that we drain it _before_ its child node is replaced with a drained one, so that its requests can complete. In fact, I think graph modifications should only be done with drained nodes anyway. Otherwise you could switch over in the middle of a single high-level operation and possibly get callbacks from a node that isn't a child any more. Maybe we should add some assertions to that effect and clean up whatever code breaks after it. > However, if you flush them, you also need to flush the children you have > flushed them to... So what I wrote was technically wrong ("don't pass > parent drainage back to the child because the child is already > drained"), instead it should be "don't pass parent drainage back to the > child because the child is going to be drained (flushed) anyway". > > So, pause requests first (either by parking them or flushing them, > driver's choice), then flush the tree. > > Of course that's easier said than done... First, we would need a > non-recursive flush. Then, every node that is visited by the drain > would (*after* recursing to its parents) need to flush itself.
> > (Note that I'm completely disregarding nodes which are below the > original node that is supposed to be drained, and which therefore do not > drain their parents (or do they?) because I'd like to retain at least > some of my sanity for now.) I don't follow, but I suppose this is related to the flush/drain confusion. > Secondly we have exactly the issue you describe below... > > > The big question is whether bdrv_drain() would still work for a single > > node without recursing to the children, but as it uses bs->in_flight, I > > think that should be okay these days. > > > >> (Most importantly, ideally we'd want to attach the new child to the > >> parent and then drain the parent: This would give us exactly the state > >> we want. However, attaching the child first and then draining the > >> parent is unsafe, so we cannot do it...) > >> > >> [1] Whether the parent (un-)drains the child depends on the > >> BdrvChildRole.drained_{begin,end}() implementation, strictly speaking. > >> We cannot say it generally. > >> > >> OK, so how to fix it? I don't know, so I'm asking you. :-) > > > >
Re: [Qemu-devel] [PATCH v4 0/4] Don't write headers if BDS is INACTIVE
On 11/07/2017 04:10 PM, Jeff Cody wrote: > Changes from v3->v4: > > Patch 3: Add migrate_del_blocker and error_free (Thanks Stefan) > > > git-backport-diff -r qemu/master.. -u ba11b69 > > 001/4:[] [--] 'block/vhdx.c: Don't blindly update the header' > 002/4:[] [--] 'block/parallels: Do not update header or truncate image > when INMIGRATE' > 003/4:[0003] [FC] 'block/parallels: add migration blocker' > 004/4:[] [--] 'qemu-iotests: update unsupported image formats in 194' > > > Changes from v2->v3: > > Patch 2: Uh... fix that misspelling. Thanks Stefan :) > Patch 3: New patch to block migration in parallels > > git-backport-diff -r qemu/master.. -u 6dc6acb > > 001/4:[] [--] 'block/vhdx.c: Don't blindly update the header' > 002/4:[] [--] 'block/parallels: Do not update header or truncate image > when INMIGRATE' > 003/4:[down] 'block/parallels: add migration blocker' > 004/4:[] [--] 'qemu-iotests: update unsupported image formats in 194' > > > Changes from v1->v2: > > * Drop previous parallels patches, just check BDRV_O_INACTIVE now > (Kevin) > > git-backport-diff -r qemu/master.. -u github/master > Key: > [] : patches are identical > [] : number of functional differences between upstream/downstream patch > [down] : patch is downstream-only > The flags [FC] indicate (F)unctional and (C)ontextual differences, > respectively > > 001/3:[] [--] 'block/vhdx.c: Don't blindly update the header' > 002/3:[down] 'block/parallals: Do not update header or truncate image when > INMIGRATE' > 003/3:[] [--] 'qemu-iotests: update unsupported image formats in 194' > > > v1: > > VHDX and Parallels both blindly write headers to the image file > if the images are opened R/W. This causes an assert if the QEMU run > state is INMIGRATE. 
> > Jeff Cody (4): > block/vhdx.c: Don't blindly update the header > block/parallels: Do not update header or truncate image when INMIGRATE > block/parallels: add migration blocker > qemu-iotests: update unsupported image formats in 194 > > block/parallels.c | 22 +- > block/vhdx.c | 7 --- > tests/qemu-iotests/194 | 2 +- > 3 files changed, 18 insertions(+), 13 deletions(-) > Reviewed-by: Denis V. Lunev
Re: [Qemu-devel] [Qemu-block] [PATCH 0/5] iotests: Make some tests less flaky
On Thu, Nov 09, 2017 at 02:37:59AM +0100, Max Reitz wrote: > There are a couple of tests that fail (on my machine) from time to > time (and by that I mean that recently I've rarely ever had a test run > with both 083 and 136 working on first try). > This series should fix most (at least the issues I am aware of). > > Notes: > - 083 might have another issue, but if so it occurs extremely rarely and > so I was unable to debug it. > > - 129 is flaky, too, because it tries to use block jobs with BB > throttling -- however, block jobs ignore that these days. Making it > use a throttle filter node will require quite a bit of work. See > http://lists.nongnu.org/archive/html/qemu-block/2017-11/msg00111.html > for more. > > - 194 sometimes hangs because the source VM fails to drain its drive. > This is probably not an issue with the test, but actually an issue in > qemu. See > http://lists.nongnu.org/archive/html/qemu-block/2017-11/msg00256.html > for more. > > > "All tests have passed, let's ship it!" > -- Me, 2:36 am > > > (Editor's note: "all" means raw/file, qcow2/file, and raw/nbd.) > > > Max Reitz (5): > iotests: Make 030 less flaky > iotests: Add missing 'blkdebug::' in 040 > iotests: Make 055 less flaky > iotests: Make 083 less flaky > iotests: Make 136 less flaky > > tests/qemu-iotests/030 | 8 ++-- > tests/qemu-iotests/040 | 2 +- > tests/qemu-iotests/055 | 25 - > tests/qemu-iotests/083 | 3 ++- > tests/qemu-iotests/136 | 14 +- > 5 files changed, 38 insertions(+), 14 deletions(-) > > -- > 2.13.6 > > Thanks for these fixes! QEMU 2.11 material. Reviewed-by: Stefan Hajnoczisignature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH] AMD Processor Topology Information
Hi Philippe, Most of the code for 'case 0x801E:' is a copy of 'case 4:' of the same procedure in both cpu.c and kvm.c Values were changes for AMD Zen architecture. The only new code is 'case 0x801D:' which defines core topology. Hope this info helps wit review. On 2017-11-08 13:44, Philippe Mathieu-Daudé wrote: Hi Stanislav, This does not seem so trivial ;) Cc'ing more reviewers. On 11/03/2017 02:30 PM, Stanislav Lanci wrote: V2: Adds information about cache size and topology on leaf 0x801D for family 17h Without the added cache topology guest with SMT suffers latency problems Add CPUID 0x801E for describing AMD Processor Topology Information Disables warning about smt for 17h family of AMD CPUs Signed-off-by: Stanislav Lanci--- target/i386/cpu.c | 93 ++- target/i386/kvm.c | 28 +++-- 2 files changed, 117 insertions(+), 4 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index ddc45abd70..1545e3fe31 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -113,7 +113,9 @@ /* L1 instruction cache: */ #define L1I_LINE_SIZE 64 #define L1I_ASSOCIATIVITY 8 +#define L1I_ASSOC_AMD_ZEN 4 #define L1I_SETS 64 +#define L1I_SETS_AMD_ZEN 256 #define L1I_PARTITIONS 1 /* Size = LINE_SIZE*ASSOCIATIVITY*SETS*PARTITIONS = 32KiB */ #define L1I_DESCRIPTOR CPUID_2_L1I_32KB_8WAY_64B @@ -125,7 +127,9 @@ /* Level 2 unified cache: */ #define L2_LINE_SIZE 64 #define L2_ASSOCIATIVITY 16 +#define L2_ASSOC_AMD_ZEN 8 #define L2_SETS 4096 +#define L2_SETS_AMD_ZEN 1024 #define L2_PARTITIONS 1 /* Size = LINE_SIZE*ASSOCIATIVITY*SETS*PARTITIONS = 4MiB */ /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */ @@ -142,6 +146,7 @@ #define L3_N_LINE_SIZE 64 #define L3_N_ASSOCIATIVITY 16 #define L3_N_SETS 16384 +#define L3_N_SETS_AMD_ZEN4096 #define L3_N_PARTITIONS 1 #define L3_N_DESCRIPTOR CPUID_2_L3_16MB_16WAY_64B #define L3_N_LINES_PER_TAG 1 @@ -3072,6 +3077,91 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, *edx = 0; } break; +case 0x801D: /* AMD 
TOPOEXT cache info for ZEN */ +if (cpu->cache_info_passthrough) { +host_cpuid(index, count, eax, ebx, ecx, edx); +break; +} else if ((env->cpuid_version & 0xFF00F00) == 0x800F00) { +*eax = 0; +switch (count) { +case 0: /* L1 dcache info */ +*eax |= CPUID_4_TYPE_DCACHE | \ +CPUID_4_LEVEL(1) | \ +CPUID_4_SELF_INIT_LEVEL | \ +((cs->nr_threads - 1) << 14); +*ebx = (L1D_LINE_SIZE - 1) | \ + ((L1D_PARTITIONS - 1) << 12) | \ + ((L1D_ASSOCIATIVITY - 1) << 22); +*ecx = L1D_SETS - 1; +*edx = 0; +break; +case 1: /* L1 icache info */ +*eax |= CPUID_4_TYPE_ICACHE | \ +CPUID_4_LEVEL(1) | \ +CPUID_4_SELF_INIT_LEVEL | \ +((cs->nr_threads - 1) << 14); +*ebx = (L1I_LINE_SIZE - 1) | \ + ((L1I_PARTITIONS - 1) << 12) | \ + ((L1I_ASSOC_AMD_ZEN - 1) << 22); +*ecx = L1I_SETS_AMD_ZEN - 1; +*edx = 0; +break; +case 2: /* L2 cache info */ +*eax |= CPUID_4_TYPE_UNIFIED | \ +CPUID_4_LEVEL(2) | \ +CPUID_4_SELF_INIT_LEVEL | \ +((cs->nr_threads - 1) << 14); +*ebx = (L2_LINE_SIZE - 1) | \ + ((L2_PARTITIONS - 1) << 12) | \ + ((L2_ASSOC_AMD_ZEN - 1) << 22); +*ecx = L2_SETS_AMD_ZEN - 1; +*edx = CPUID_4_INCLUSIVE; +break; +case 3: /* L3 cache info */ +if (!cpu->enable_l3_cache) { +*eax = 0; +*ebx = 0; +*ecx = 0; +*edx = 0; +break; +} +*eax |= CPUID_4_TYPE_UNIFIED | \ +CPUID_4_LEVEL(3) | \ +CPUID_4_SELF_INIT_LEVEL | \ +((cs->nr_cores * cs->nr_threads - 1) << 14); +*ebx = (L3_N_LINE_SIZE - 1) | \ + ((L3_N_PARTITIONS - 1) << 12) | \ + ((L3_N_ASSOCIATIVITY - 1) << 22); +*ecx = L3_N_SETS_AMD_ZEN - 1; +*edx = CPUID_4_NO_INVD_SHARING; +break; +default: /* end of info */ +*eax = 0; +*ebx = 0; +*ecx = 0; +
Re: [Qemu-devel] [Qemu-block] RFC: use case for adding QMP, block jobs & multiple exports to qemu-nbd ?
On Thu, Nov 09, 2017 at 02:54:35PM +0100, Markus Armbruster wrote: > Max Reitzwrites: > > > On 2017-11-02 13:02, Daniel P. Berrange wrote: > > [...] > >> One alternative approach to doing this would be to suggest that we should > >> instead just spawn qemu-system-x86_64 with '--machine none' and use that > >> as a replacement for qemu-nbd, since it already has a built-in NBD server > >> which can do many exports at once and arbitrary block jobs. > > > > As far as I know, we had wanted to add QMP support to qemu-nbd maybe one > > or two years ago, but nobody ever did it. > > > > I've had some discussions about this with Markus and Kevin at KVM Forum. > > They appeared to strongly prefer this approach. I agree with them that > > design-wise, a qemu with no machine at all (and not even -M none) and > > early QMP is the way we want to go anyway, and then this would be the > > correct tool to use. > > "Strongly" is perhaps a bit strong, at least as far as I'm concerned. I > just believe that we want the capability to run QEMU without a machine > anyway, and if that's good enough, then why bother duplicating so much > of qemu-system-FOO in qemu-nbd & friends? Besides, once you start to > duplicate, you'll likely find it hard to stop. > > >> I'm concerned that this could end up being a be a game of whack-a-mole > >> though, constantly trying to cut out/down all the bits of system emulation > >> in the machine emulators to get its resource overhead to match the low > >> overhead of standalone qemu-nbd. > > > > However, I personally share your concern. Especially, I think that > > getting to a point where we can have no machine at all and early QMP > > will take much longer than just adding QMP to qemu-nbd -- or adding a > > qmp command to qemu-img (because you can add NBD exports through QMP, so > > qemu-nbd's hardcoded focus on NBD exports seems kind of weird then)[1]. > > > > I'm very much torn here. 
There are two approaches: Stripping fat qemu > > down, or fattening lean qemu-img (?) up. The latter is very simple. > > "Very simple" is perhaps debatable, but I think we can agree on > "temptingly simple". My other concern with using the QEMU system emulator binary is that even if you make it possible to run it with no guest machine instantiated, it is still a massive binary containing all the stuff you get with a QEMU system emulator. To be able to lock down the security of this QEMU to the same level that we could do with qemu-nbd will take even more work than we would already have to do to make "no machine" be possible. Even then, getting the right security level would require the invoker to enable the right magic combo of options. And then we would also need full QEMU backend modularization to be done so that we could have a binary for serving NBD which didn't pull in irrelevant stuff like SPICE, GTK & Xorg libraries. This all just sounds like we are queuing up unfeasibly large amounts of work, a lot of which we've talked about for 10 years and still not been able to make a step forward on implementing. It feels like the key important factor is to avoid re-inventing the wheel in multiple places. I don't think this implies that we need to have a single binary for doing two completely separate tasks. On the qemu-nbd side it should basically involve assembling building blocks that we already have available through the QEMU code base. The block layer is already well isolated & reusable, as evidenced by the fact that it is used across many programs already with little code duplication. In theory the QMP monitor is fairly well isolated, since we already reuse the infra for the QEMU guest agent too. For qemu-nbd we could reuse even more than with the guest agent, since we can pull in some of the actual command impls for sharing.
There's doubtless still some refactoring of QMP needed to make it possible, but it's nowhere near the kind of scope we'd be looking at to take the QEMU system emulator and enable a "no machine" startup, make it secure by default and modularize all its dependencies. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Re: [Qemu-devel] [Qemu-block] [PATCH] coroutine: simplify co_aio_sleep_ns() prototype
On Thu, Nov 09, 2017 at 10:26:52AM +, Stefan Hajnoczi wrote: > The AioContext pointer argument to co_aio_sleep_ns() is only used for > the sleep timer. It does not affect where the caller coroutine is > resumed. > > Due to changes to coroutine and AIO APIs it is now possible to drop the > AioContext pointer argument. This is safe to do since no caller has > specific requirements for which AioContext the timer must run in. > > This patch drops the AioContext pointer argument and renames the > function to simplify the API. > > Reported-by: Paolo Bonzini > Reported-by: Eric Blake > Signed-off-by: Stefan Hajnoczi > --- > include/qemu/coroutine.h| 6 +- > block/null.c| 3 +-- > block/sheepdog.c| 3 +-- > blockjob.c | 2 +- > util/qemu-coroutine-sleep.c | 4 ++-- > 5 files changed, 6 insertions(+), 12 deletions(-) Thanks, applied to my block-next tree for QEMU 2.12: https://github.com/stefanha/qemu/commits/block-next Stefan
Re: [Qemu-devel] [PATCH 0/2] add qmp nbd-server-remove
On 11/09/2017 09:40 AM, Vladimir Sementsov-Ogievskiy wrote: > Add command to remove nbd export, pair to nbd-server-add. > The whole thing and description are in patch 02. > > Vladimir Sementsov-Ogievskiy (2): > nbd/server: add additional assert to nbd_export_put > qmp: add nbd-server-remove > > qapi/block.json | 20 > blockdev-nbd.c | 27 +++ > nbd/server.c| 1 + > 3 files changed, 48 insertions(+) Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH 2/2] qmp: add nbd-server-remove
On 11/09/2017 09:40 AM, Vladimir Sementsov-Ogievskiy wrote: > Add command for export removing. It is needed for cases when we > don't want to keep export after the operation on it was completed. > The other example is temporary node, created with blockdev-add. > If we want to delete it we should firstly remove corresponding > NBD export. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > qapi/block.json | 20 > blockdev-nbd.c | 27 +++ > 2 files changed, 47 insertions(+) It would be nice (okay as a followup patch) to add iotest coverage of the new command. > ## > +# @nbd-server-remove: > +# > +# Stop exporting block node through QEMU's embedded NBD server. > +# > +# @device: The device name or node name of the exported node. Should be equal > +# to @device parameter for corresponding nbd-server-add command > call. > +# > +# @force: Whether active connections to the export should be closed. If this > +# parameter is false the export is only removed from named exports > list, > +# so new connections are impossible and it would be freed after all > +# clients are disconnected (default false). > +# > +# Returns: error if the server is not running or the device is not marked for > +# export. > +# > +# Since: 2.12 You are correct that this is too late for 2.11, but I like the concept. Once we have some testsuite coverage, I'll queue it for 2.12. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH] Add missing parameters to mon option documentation
Hi On Thu, Nov 9, 2017 at 1:19 PM, Vicente Jimenez Aguilar wrote: > Documentation missed 'mon' option's 'pretty' and 'default' parameters > > Signed-off-by: Vicente Jimenez Aguilar > --- > qemu-options.hx | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/qemu-options.hx b/qemu-options.hx > index 3728e9b4dd..72cf48a8e5 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -3477,9 +3477,9 @@ Like -qmp but uses pretty JSON formatting. > ETEXI > > DEF("mon", HAS_ARG, QEMU_OPTION_mon, \ > -"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL) > +"-mon > [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]\n", > QEMU_ARCH_ALL) > STEXI > -@item -mon [chardev=]name[,mode=readline|control] > +@item -mon > [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]] "default" is deprecated since commit 06ac27f (2.4.0). It does nothing anymore, so it should probably not be documented. "pretty" was added in commit 39eaab9ac2a82f. Could you write some documentation for it based on the commit message, perhaps? thanks > @findex -mon > Setup monitor on chardev @var{name}. > ETEXI > -- > 2.14.1 > > -- Marc-André Lureau
[Qemu-devel] [PATCH V5] hw/pcie-pci-bridge: restrict to X86 and ARM
The PCIE-PCI bridge is specific to "pure" PCIe systems (on QEMU we have X86 and ARM); it does not make sense to have it in other archs. Reported-by: Thomas Huth Signed-off-by: Marcel Apfelbaum --- V4 -> V5 - Since all other tries failed, conditioned the device on the PCIe Root Port. V3 -> V4: - Move the config line to pci.mak (Thomas) V2 -> V3: - Another tweak in subject s/if/it (Cornelia) V1 -> V2: Addressed Thomas and Cornelia comments: - Conditioned the pcie-pci-bridge compilation on the PCIe Root CONFIG_PCIE_PORT - Tweaked subject PCI -> PCIe Thanks, Marcel hw/pci-bridge/Makefile.objs | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs index 666db37da2..1b05023662 100644 --- a/hw/pci-bridge/Makefile.objs +++ b/hw/pci-bridge/Makefile.objs @@ -1,5 +1,5 @@ -common-obj-y += pci_bridge_dev.o pcie_pci_bridge.o -common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o +common-obj-y += pci_bridge_dev.o +common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci_bridge.o common-obj-$(CONFIG_PXB) += pci_expander_bridge.o common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o common-obj-$(CONFIG_IOH3420) += ioh3420.o -- 2.13.5
[Qemu-devel] [PATCH 1/2] nbd/server: add additional assert to nbd_export_put
This place is not obvious: nbd_export_close() may theoretically reduce the refcount to 0. It may happen if someone calls nbd_export_put() on a named export not through nbd_export_set_name() when the refcount is 1. Signed-off-by: Vladimir Sementsov-Ogievskiy --- nbd/server.c | 1 + 1 file changed, 1 insertion(+) diff --git a/nbd/server.c b/nbd/server.c index 70b40ed27e..2f2e05943f 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1179,6 +1179,7 @@ void nbd_export_put(NBDExport *exp) nbd_export_close(exp); } +assert(exp->refcount > 0); if (--exp->refcount == 0) { assert(exp->name == NULL); assert(exp->description == NULL); -- 2.11.1
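The failure mode described in the commit message is easier to see in a toy model. The sketch below is a hypothetical Python re-implementation of the put/close interaction, not QEMU's actual code; it only illustrates why the new assert catches a refcount that has already dropped to 0 before the decrement:

```python
class Export:
    """Toy model of NBDExport refcounting (illustrative, not QEMU's code)."""

    def __init__(self, name):
        self.refcount = 1   # the creator's reference
        self.name = name

    def close(self):
        # Closing a named export drops the reference held by the name.
        if self.name is not None:
            self.name = None
            self.refcount -= 1

    def put(self):
        # Mirrors nbd_export_put(): a last-reference put on a named export
        # first closes it, which may itself take the refcount to 0.
        if self.refcount == 1:
            self.close()
        assert self.refcount > 0, "refcount already 0 before decrement"
        self.refcount -= 1


exp = Export("drive0")
try:
    exp.put()               # close() already took the refcount to 0
except AssertionError:
    print("assert fired")   # the situation the patch makes visible
```

Dropping the name through a set_name-style path first (so `close()` finds no name reference to release) avoids the problem, which is what the commit message alludes to.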
[Qemu-devel] [PATCH 0/2] add qmp nbd-server-remove
Add command to remove nbd export, pair to nbd-server-add. The whole thing and description are in patch 02. Vladimir Sementsov-Ogievskiy (2): nbd/server: add additional assert to nbd_export_put qmp: add nbd-server-remove qapi/block.json | 20 blockdev-nbd.c | 27 +++ nbd/server.c| 1 + 3 files changed, 48 insertions(+) -- 2.11.1
[Qemu-devel] [PATCH 2/2] qmp: add nbd-server-remove
Add command for export removing. It is needed for cases when we don't want to keep export after the operation on it was completed. The other example is temporary node, created with blockdev-add. If we want to delete it we should firstly remove corresponding NBD export. Signed-off-by: Vladimir Sementsov-Ogievskiy --- qapi/block.json | 20 blockdev-nbd.c | 27 +++ 2 files changed, 47 insertions(+) diff --git a/qapi/block.json b/qapi/block.json index f093fa3f27..1827940717 100644 --- a/qapi/block.json +++ b/qapi/block.json @@ -223,6 +223,26 @@ { 'command': 'nbd-server-add', 'data': {'device': 'str', '*writable': 'bool'} } ## +# @nbd-server-remove: +# +# Stop exporting block node through QEMU's embedded NBD server. +# +# @device: The device name or node name of the exported node. Should be equal +# to @device parameter for corresponding nbd-server-add command call. +# +# @force: Whether active connections to the export should be closed. If this +# parameter is false the export is only removed from named exports list, +# so new connections are impossible and it would be freed after all +# clients are disconnected (default false). +# +# Returns: error if the server is not running or the device is not marked for +# export. 
+# +# Since: 2.12 +## +{ 'command': 'nbd-server-remove', 'data': {'device': 'str', '*force': 'bool'} } + +## # @nbd-server-stop: # # Stop QEMU's embedded NBD server, and unregister all devices previously diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 28f551a7b0..5f66951c33 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -203,6 +203,33 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable, nbd_export_put(exp); } +void qmp_nbd_server_remove(const char *device, bool has_force, bool force, + Error **errp) +{ +NBDExport *exp; + +if (!nbd_server) { +error_setg(errp, "NBD server not running"); +return; +} + +exp = nbd_export_find(device); +if (exp == NULL) { +error_setg(errp, "'%s' is not exported", device); +return; +} + +if (!has_force) { +force = false; +} + +if (force) { +nbd_export_close(exp); +} else { +nbd_export_set_name(exp, NULL); +} +} + void qmp_nbd_server_stop(Error **errp) { nbd_export_close_all(); -- 2.11.1
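As a usage sketch, here is what the add/remove pair would look like on the QMP wire. The command and argument names come from the schema in this patch; the device name and the framing are illustrative (a QMP client sends one JSON object per command):

```python
import json

# nbd-server-add followed by nbd-server-remove, per the schema in this patch.
# 'force' is optional; when omitted it defaults to false, so the export name
# is dropped but already-connected clients stay until they disconnect.
commands = [
    {"execute": "nbd-server-add",
     "arguments": {"device": "drive0", "writable": False}},
    {"execute": "nbd-server-remove",
     "arguments": {"device": "drive0", "force": True}},
]

for cmd in commands:
    line = json.dumps(cmd)
    print(line)  # one JSON object per line on the monitor socket
```

With `force: true` the server would close active connections immediately, matching the `nbd_export_close()` branch in the qmp_nbd_server_remove() implementation above.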
[Qemu-devel] [PULL for-2.11 2/2] target/s390x: Finish implementing RISBGN
From: Richard Henderson We added the entry to insn-data.def, but failed to update op_risbg to match. No need to special-case the imask inversion, since that is already ~0 for RISBG (and now RISBGN). Fixes: 375ee58bedcda359011fe7fa99e0647f66f9ffa0 Fixes: https://bugs.launchpad.net/qemu/+bug/1701798 (s390x part) Signed-off-by: Richard Henderson Message-Id: <20171107145546.767-1-richard.hender...@linaro.org> Reviewed-by: Thomas Huth Tested-by: Peter Maydell Signed-off-by: Cornelia Huck --- target/s390x/translate.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/target/s390x/translate.c b/target/s390x/translate.c index dee72a787d..85d0a6c3af 100644 --- a/target/s390x/translate.c +++ b/target/s390x/translate.c @@ -3432,6 +3432,7 @@ static ExitStatus op_risbg(DisasContext *s, DisasOps *o) /* Adjust the arguments for the specific insn. */ switch (s->fields->op2) { case 0x55: /* risbg */ +case 0x59: /* risbgn */ i3 &= 63; i4 &= 63; pmask = ~0; break; @@ -3447,7 +3448,7 @@ static ExitStatus op_risbg(DisasContext *s, DisasOps *o) pmask = 0xull; break; default: -abort(); +g_assert_not_reached(); } /* MASK is the set of bits to be inserted from R2. @@ -3464,11 +3465,7 @@ static ExitStatus op_risbg(DisasContext *s, DisasOps *o) insns, we need to keep the other half of the register. */ imask = ~mask | ~pmask; if (do_zero) { -if (s->fields->op2 == 0x55) { -imask = 0; -} else { -imask = ~pmask; -} +imask = ~pmask; } len = i4 - i3 + 1; -- 2.13.6
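The point about the redundant special case can be checked numerically. Below is a hedged Python sketch of the mask computation in op_risbg (IBM bit numbering, bit 0 = most significant; the helper name is ours, not QEMU's). For RISBG and RISBGN, pmask is ~0, so the do_zero path already yields imask == 0 without the removed op2 == 0x55 branch:

```python
M64 = (1 << 64) - 1

def risbg_masks(i3, i4, do_zero, pmask=M64):
    """Sketch of the mask logic in op_risbg (bit 0 is the MSB)."""
    hi = M64 >> i3                   # bits i3..63 set
    lo = (M64 << (63 - i4)) & M64    # bits 0..i4 set
    # MASK: bits i3..i4 inclusive, wrapping around when i3 > i4.
    mask = hi & lo if i3 <= i4 else (hi | lo) & M64
    # IMASK: bits of R1 to keep; for the 32-bit variants pmask limits the
    # affected half, so the untouched half is always kept.
    imask = (~mask | ~pmask) & M64
    if do_zero:
        imask = ~pmask & M64         # == 0 when pmask == ~0 (RISBG/RISBGN)
    return mask, imask

# Insert bits 8..15 and zero the rest: the old special case (imask = 0
# for op2 == 0x55) falls out of the generic expression.
mask, imask = risbg_masks(8, 15, do_zero=True)
assert mask == 0x00FF000000000000 and imask == 0
```

The wrap-around case (i3 > i4) selects both ends of the register, which is why the mask is built from the union of the two partial masks there.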
[Qemu-devel] [PULL for-2.11 0/2] s390x changes for 2.11-rc1
The following changes since commit b0fbe46ad82982b289a44ee2495b59b0bad8a842: Update version for v2.11.0-rc0 release (2017-11-07 16:05:28 +) are available in the git repository at: git://github.com/cohuck/qemu tags/s390x-20171109 for you to fetch changes up to fdaae351435147b9be6161d0f136ca7c40308059: target/s390x: Finish implementing RISBGN (2017-11-09 10:36:06 +0100) s390x changes: let pci devices start out in a usable state, and make RISBGN work in tcg. Christian Borntraeger (1): s390x/pci: let pci devices start in configured mode Richard Henderson (1): target/s390x: Finish implementing RISBGN hw/s390x/s390-pci-bus.c | 2 +- target/s390x/translate.c | 9 +++-- 2 files changed, 4 insertions(+), 7 deletions(-) -- 2.13.6
[Qemu-devel] [PULL for-2.11 1/2] s390x/pci: let pci devices start in configured mode
From: Christian Borntraeger Currently, to enable a pci device in the guest, the user has to issue echo 1 > /sys/bus/pci/slots//power. This is not what people expect. On an LPAR, the user can put a PCI device in configured or deconfigured state via IOCDS. The "start in deconfigured state" can be used for "sharing" a pci function across LPARs. This is not what we are going to use in KVM, so always start configured. Signed-off-by: Christian Borntraeger Acked-by: Yi Min Zhao Reviewed-by: Pierre Morel Message-Id: <20171107175455.73793-2-borntrae...@de.ibm.com> Signed-off-by: Cornelia Huck --- hw/s390x/s390-pci-bus.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c index e7a58e81f7..2b1e1409bf 100644 --- a/hw/s390x/s390-pci-bus.c +++ b/hw/s390x/s390-pci-bus.c @@ -715,7 +715,7 @@ static void s390_pcihost_hot_plug(HotplugHandler *hotplug_dev, pbdev->pdev = pdev; pbdev->iommu = s390_pci_get_iommu(s, pdev->bus, pdev->devfn); pbdev->iommu->pbdev = pbdev; -pbdev->state = ZPCI_FS_STANDBY; +pbdev->state = ZPCI_FS_DISABLED; if (s390_pci_msix_init(pbdev)) { error_setg(errp, "MSI-X support is mandatory " -- 2.13.6
Re: [Qemu-devel] Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage migration)
On Thu, 11/09 16:14, Max Reitz wrote: > On 2017-11-09 05:21, Fam Zheng wrote: > > On Thu, 11/09 01:48, Max Reitz wrote: > >> Hi, > >> > >> More exciting news from the bdrv_drain() front! > >> > >> I've noticed in the past that iotest 194 sometimes hangs. I usually run > >> the tests on tmpfs, but I've just now verified that it happens on my SSD > >> just as well. > >> > >> So the reproducer is a plain: > >> > >> while ./check -raw 194; do; done > > > > I cannot produce it on my machine. > > Hm, too bad. I see it both on my work laptop (with Fedora) and my > desktop (with Arch)... > > >> (No difference between raw or qcow2, though.) > >> > >> And then, after a couple of runs (or a couple ten), it will just hang. > >> The reason is that the source VM lingers around and doesn't quit > >> voluntarily -- the test itself was successful, but it just can't exit. > >> > >> If you force it to exit by killing the VM (e.g. through pkill -11 qemu), > >> this is the backtrace: > >> > >> #0 0x7f7cfc297e06 in ppoll () at /lib64/libc.so.6 > >> #1 0x563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, > >> __nfds=, __fds=) at > >> /usr/include/bits/poll2.h:77 > > > > Looking at the 0 timeout it seems we are in the aio_poll(ctx, > > blocking=false); > > branches of BDRV_POLL_WHILE? Is it a busy loop? If so I wonder what is > > making > > progress and causing the return value to be true in aio_poll(). 
> > > >> #2 0x563b846bcac9 in qemu_poll_ns (fds=, > >> nfds=, timeout=) at util/qemu-timer.c:322 > >> #3 0x563b846be711 in aio_poll (ctx=ctx@entry=0x563b856e3e80, > >> blocking=) at util/aio-posix.c:629 > >> #4 0x563b8463afa4 in bdrv_drain_recurse > >> (bs=bs@entry=0x563b865568a0, begin=begin@entry=true) at block/io.c:201 > >> #5 0x563b8463baff in bdrv_drain_all_begin () at block/io.c:381 > >> #6 0x563b8463bc99 in bdrv_drain_all () at block/io.c:411 > >> #7 0x563b8459888b in block_migration_cleanup (opaque= >> out>) at migration/block.c:714 > >> #8 0x563b845883be in qemu_savevm_state_cleanup () at > >> migration/savevm.c:1251 > >> #9 0x563b845811fd in migration_thread (opaque=0x563b856f1da0) at > >> migration/migration.c:2298 > >> #10 0x7f7cfc56f36d in start_thread () at /lib64/libpthread.so.0 > >> #11 0x7f7cfc2a3e1f in clone () at /lib64/libc.so.6 > >> > >> > >> And when you make bdrv_drain_all_begin() print what we are trying to > >> drain, you can see that it's the format node (managed by the "raw" > >> driver in this case). > > > > So what is the value of bs->in_flight? > > gdb: > > (gdb) print bs->in_flight > > $3 = 2307492233 > > "That's weird, why would it..." 
> > > (gdb) print *bs > > $4 = {open_flags = -1202160144, read_only = 161, encrypted = 85, sg = > > false, probed = false, force_share = 96, implicit = 159, drv = 0x0, opaque > > = 0x0, aio_context = 0x8989898989898989, aio_notifiers = {lh_first = > > 0x8989898989898989}, > > walking_aio_notifiers = 137, filename = '\211' , > > backing_file = '\211' , backing_format = '\211' > > , full_open_options = 0x8989898989898989, > > exact_filename = '\211' , backing = > > 0x8989898989898989, file = 0x8989898989898989, bl = {request_alignment = > > 2307492233, max_pdiscard = -1987475063, pdiscard_alignment = 2307492233, > > max_pwrite_zeroes = -1987475063, pwrite_zeroes_alignment = 2307492233, > > opt_transfer = 2307492233, max_transfer = 2307492233, min_mem_alignment = > > 9910603678816504201, opt_mem_alignment = 9910603678816504201, max_iov = > > -1987475063}, > > supported_write_flags = 2307492233, supported_zero_flags = 2307492233, > > node_name = '\211' , node_list = {tqe_next = > > 0x8989898989898989, tqe_prev = 0x8989898989898989}, bs_list = {tqe_next = > > 0x8989898989898989, > > tqe_prev = 0x8989898989898989}, monitor_list = {tqe_next = > > 0x8989898989898989, tqe_prev = 0x8989898989898989}, refcnt = -1987475063, > > op_blockers = {{lh_first = 0x8989898989898989} }, job = > > 0x8989898989898989, > > inherits_from = 0x8989898989898989, children = {lh_first = > > 0x8989898989898989}, parents = {lh_first = 0x8989898989898989}, options = > > 0x8989898989898989, explicit_options = 0x8989898989898989, detect_zeroes = > > 2307492233, > > backing_blocker = 0x8989898989898989, total_sectors = > > -8536140394893047415, before_write_notifiers = {notifiers = {lh_first = > > 0x8989898989898989}}, write_threshold_offset = 9910603678816504201, > > write_threshold_notifier = {notify = > > 0x8989898989898989, node = {le_next = 0x8989898989898989, le_prev = > > 0x8989898989898989}}, dirty_bitmap_mutex = {lock = {__data = {__lock = > > -1987475063, __count = 2307492233, __owner = 
-1987475063, __nusers = > > 2307492233, > > __kind = -1987475063, __spins = -30327, __elision = -30327, __list > > = {__prev = 0x8989898989898989, __next = 0x8989898989898989}}, __size = > > '\211' , __align = -8536140394893047415}, initialized = > > 137}, > > dirty_bitmaps
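One detail worth noting about the gdb dump above: every field prints as the same repeating byte. Assuming the 0x89 fill is a free-memory poison pattern (a guess from the byte pattern, not something the thread states explicitly), the "weird" values decode consistently, which points at a use-after-free of the BlockDriverState rather than a genuinely stuck counter:

```python
# Values copied from the gdb dump above.
in_flight     = 2307492233            # printed as unsigned 32-bit
refcnt        = -1987475063           # printed as signed 32-bit
total_sectors = -8536140394893047415  # printed as signed 64-bit

# All three are the byte 0x89 repeated across the field width,
# i.e. the whole structure reads as poisoned/freed memory.
assert in_flight == 0x89898989
assert refcnt & 0xFFFFFFFF == 0x89898989
assert total_sectors & 0xFFFFFFFFFFFFFFFF == 0x8989898989898989
```

So bdrv_drain_recurse() is polling on an `in_flight` counter read from memory that was already released, which would explain the busy loop in BDRV_POLL_WHILE never terminating.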
Re: [Qemu-devel] [PATCH V4] hw/pcie-pci-bridge: restrict to X86 and ARM
On 09/11/2017 16:37, Fam Zheng wrote: On Thu, 11/09 16:08, Marcel Apfelbaum wrote: make[1]: *** No rule to make target `../hw/pci-bridge/pcie_pci_bridge.o', needed by `qemu-system-aarch64'. Stop. make[1]: *** Waiting for unfinished jobs/tmp/qemu-test/src/target/arm/translate-a64.c: In function 'handle_shri_with_rndacc': /tmp/qemu-test/src/target/arm/translate-a64.c:6390: warning: 'tcg_src_hi' may be used uninitialized in this function /tmp/qemu-test/src/target/arm/translate-a64.c: In function 'disas_simd_scalar_two_reg_misc': /tmp/qemu-test/src/target/arm/translate-a64.c:8117: warning: 'rmode' may be used uninitialized in this function make: *** [subdir-aarch64-softmmu] Error 2 Traceback (most recent call last): File "./tests/docker/docker.py", line 385, in sys.exit(main()) File "./tests/docker/docker.py", line 382, in main return args.cmdobj.run(args, argv) File "./tests/docker/docker.py", line 239, in run return Docker().run(argv, args.keep, quiet=args.quiet) File "./tests/docker/docker.py", line 207, in run quiet=quiet) File "./tests/docker/docker.py", line 125, in _do_check return subprocess.check_call(self._command + cmd, **kwargs) File "/usr/lib64/python2.7/subprocess.py", line 186, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['docker', 'run', '--label', 'com.qemu.instance.uuid=561e8180c54e11e79bd052540069c830', '-u', '0', '--security-opt', 'seccomp=unconfined', '--rm', '--net=none', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=8', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/root/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-gp94i687/src/docker-src.2017-11-09-08.03.01.2879:/var/tmp/qemu:z,ro', 'qemu:centos6', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2 make[1]: *** [tests/docker/Makefile.include:129: docker-run] Error 1 make[1]: Leaving directory '/var/tmp/patchew-tester-tmp-gp94i687/src' 
make: *** [tests/docker/Makefile.include:163: docker-run-test-quick@centos6] Error 2 real 1m28.235s user 0m4.283s sys 0m4.399s === OUTPUT END === Test command exited with code: 2 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-de...@freelists.org Hi Fam, I can compile just fine on my workstation, can you please advise? I can see the same error with make docker-run-test-quick@centos6 V=1 J=8 and also a normal out-of-tree build with $QEMU_SRC/configure --enable-debug --target-list=x86_64-softmmu,aarch64-softmmu Yes, I succeeded to reproduce it, thanks! Can you install docker and try it? I wanted to try it eventually, now it seems a good time. I will let you know if I hit any problems. Thanks, Marcel Fam
[Qemu-devel] [Bug 1731277] Re: Provide target specific qemu man pages
I'm kind of hoping that moving to Sphinx for our docs toolchain will allow us to for instance have the board specific information in doc comments in each board source file, which could then be automatically assembled into the right documentation. The current manpages are autobuilt from the monolithic qemu-doc.texi, which is woefully out of date in many areas... -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1731277 Title: Provide target specific qemu man pages Status in QEMU: New Bug description: Right now, all qemu target binaries (qemu-system-...) share the same man page. The current man page is primarily focused on x86, and therefore the information given is entirely wrong for e.g. arm, powerpc or s390x. NAME qemu-doc - QEMU Emulator User Documentation SYNOPSIS qemu-system-i386 [options] [disk_image] DESCRIPTION The QEMU PC System emulator simulates the following peripherals: - i440FX host PCI bridge and PIIX3 PCI to ISA bridge - Cirrus CLGD 5446 PCI VGA card or dummy VGA card with Bochs VESA extensions (hardware level, including all non standard modes). - PS/2 mouse and keyboard - 2 PCI IDE interfaces with hard disk and CD-ROM support - Floppy disk ... We should have target specific man pages, with the common options/settings factored out, so they are included in all target specific man pages. "man qemu-system-s390x" should give s390x specific (+common) information and "man qemu-system-x86_64" should contain x86 specific (+common) information. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1731277/+subscriptions
Re: [Qemu-devel] [PATCH V4] hw/pcie-pci-bridge: restrict to X86 and ARM
On 09/11/2017 16:29, Philippe Mathieu-Daudé wrote: Hi Marcel, On 11/09/2017 11:10 AM, Marcel Apfelbaum wrote: On 09/11/2017 16:03, Thomas Huth wrote: On 09.11.2017 14:04, no-re...@patchew.org wrote: Hi, This series failed automatic build test. Please find the testing commands and their output below. If you have docker installed, you can probably reproduce it locally. [...] CC aarch64-softmmu/hw/scsi/virtio-scsi.o CC aarch64-softmmu/hw/scsi/virtio-scsi-dataplane.o make[1]: *** No rule to make target `../hw/pci-bridge/pcie_pci_bridge.o', needed by `qemu-system-x86_64'. Stop. make[1]: *** Waiting for unfinished jobs CC aarch64-softmmu/hw/scsi/vhost-scsi-common.o CC aarch64-softmmu/hw/scsi/vhost-scsi.o CC aarch64-softmmu/hw/scsi/vhost-user-scsi.o CC aarch64-softmmu/hw/sd/omap_mmc.o CC aarch64-softmmu/hw/sd/pxa2xx_mmci.o CC aarch64-softmmu/hw/sd/bcm2835_sdhost.o make: *** [subdir-x86_64-softmmu] Error 2 make: *** Waiting for unfinished jobs CC aarch64-softmmu/hw/ssi/omap_spi.o CC aarch64-softmmu/hw/ssi/imx_spi.o CC aarch64-softmmu/hw/timer/exynos4210_mct.o CC aarch64-softmmu/hw/timer/exynos4210_pwm.o [...] CC aarch64-softmmu/hw/timer/exynos4210_rtc.o CC aarch64-softmmu/target/arm/translate-a64.o make[1]: *** No rule to make target `../hw/pci-bridge/pcie_pci_bridge.o', needed by `qemu-system-aarch64'. Stop. make[1]: *** Waiting for unfinished jobs/tmp/qemu-test/src/target/arm/translate-a64.c: In function 'handle_shri_with_rndacc': Hmm, looks like this was not working as expected? In my workstation works just fine (clean and compiled all archs), I sent a mail to Fam. Applying your patch on v2.11.0-rc0 I also get this: make[1]: *** No rule to make target '../hw/pci-bridge/pcie_pci_bridge.o', needed by 'qemu-system-aarch64'. Stop. 
Makefile:383: recipe for target 'subdir-aarch64-softmmu' failed make: *** [subdir-aarch64-softmmu] Error 2 $ fgrep CONFIG_PCIE config-all-devices.mak CONFIG_PCIE_PCI_BRIDGE:=$(findstring y,$(CONFIG_PCIE_PCI_BRIDGE)$(CONFIG_PCIE_PORT)) CONFIG_PCIE_PORT:=$(findstring y,$(CONFIG_PCIE_PORT)y) $ make subdir-aarch64-softmmu --debug=v 2>/dev/null | fgrep -i pcie_pci_bridge Reading makefile '../hw/pci-bridge/pcie_pci_bridge.d' (search path) (don't care) (no ~ expansion)... Considering target file '../hw/pci-bridge/pcie_pci_bridge.o'. File '../hw/pci-bridge/pcie_pci_bridge.o' does not exist. Finished prerequisites of target file '../hw/pci-bridge/pcie_pci_bridge.o'. Must remake target '../hw/pci-bridge/pcie_pci_bridge.o'. Hi Philippe, Thanks for trying it! I succeeded to reproduce with an out of tree build. For some reason the make system doesn't work as I would want to. I'll resend a working version, Thanks, Marcel
Re: [Qemu-devel] Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage migration)
On 2017-11-09 05:21, Fam Zheng wrote: > On Thu, 11/09 01:48, Max Reitz wrote: >> Hi, >> >> More exciting news from the bdrv_drain() front! >> >> I've noticed in the past that iotest 194 sometimes hangs. I usually run >> the tests on tmpfs, but I've just now verified that it happens on my SSD >> just as well. >> >> So the reproducer is a plain: >> >> while ./check -raw 194; do; done > > I cannot produce it on my machine. Hm, too bad. I see it both on my work laptop (with Fedora) and my desktop (with Arch)... >> (No difference between raw or qcow2, though.) >> >> And then, after a couple of runs (or a couple ten), it will just hang. >> The reason is that the source VM lingers around and doesn't quit >> voluntarily -- the test itself was successful, but it just can't exit. >> >> If you force it to exit by killing the VM (e.g. through pkill -11 qemu), >> this is the backtrace: >> >> #0 0x7f7cfc297e06 in ppoll () at /lib64/libc.so.6 >> #1 0x563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, >> __nfds=, __fds=) at >> /usr/include/bits/poll2.h:77 > > Looking at the 0 timeout it seems we are in the aio_poll(ctx, blocking=false); > branches of BDRV_POLL_WHILE? Is it a busy loop? If so I wonder what is making > progress and causing the return value to be true in aio_poll(). 
> >> #2 0x563b846bcac9 in qemu_poll_ns (fds=, >> nfds=, timeout=) at util/qemu-timer.c:322 >> #3 0x563b846be711 in aio_poll (ctx=ctx@entry=0x563b856e3e80, >> blocking=) at util/aio-posix.c:629 >> #4 0x563b8463afa4 in bdrv_drain_recurse >> (bs=bs@entry=0x563b865568a0, begin=begin@entry=true) at block/io.c:201 >> #5 0x563b8463baff in bdrv_drain_all_begin () at block/io.c:381 >> #6 0x563b8463bc99 in bdrv_drain_all () at block/io.c:411 >> #7 0x563b8459888b in block_migration_cleanup (opaque=> out>) at migration/block.c:714 >> #8 0x563b845883be in qemu_savevm_state_cleanup () at >> migration/savevm.c:1251 >> #9 0x563b845811fd in migration_thread (opaque=0x563b856f1da0) at >> migration/migration.c:2298 >> #10 0x7f7cfc56f36d in start_thread () at /lib64/libpthread.so.0 >> #11 0x7f7cfc2a3e1f in clone () at /lib64/libc.so.6 >> >> >> And when you make bdrv_drain_all_begin() print what we are trying to >> drain, you can see that it's the format node (managed by the "raw" >> driver in this case). > > So what is the value of bs->in_flight? gdb: > (gdb) print bs->in_flight > $3 = 2307492233 "That's weird, why would it..." 
> (gdb) print *bs > $4 = {open_flags = -1202160144, read_only = 161, encrypted = 85, sg = false, > probed = false, force_share = 96, implicit = 159, drv = 0x0, opaque = 0x0, > aio_context = 0x8989898989898989, aio_notifiers = {lh_first = > 0x8989898989898989}, > walking_aio_notifiers = 137, filename = '\211' , > backing_file = '\211' , backing_format = '\211' 16 times>, full_open_options = 0x8989898989898989, > exact_filename = '\211' , backing = 0x8989898989898989, > file = 0x8989898989898989, bl = {request_alignment = 2307492233, max_pdiscard > = -1987475063, pdiscard_alignment = 2307492233, > max_pwrite_zeroes = -1987475063, pwrite_zeroes_alignment = 2307492233, > opt_transfer = 2307492233, max_transfer = 2307492233, min_mem_alignment = > 9910603678816504201, opt_mem_alignment = 9910603678816504201, max_iov = > -1987475063}, > supported_write_flags = 2307492233, supported_zero_flags = 2307492233, > node_name = '\211' , node_list = {tqe_next = > 0x8989898989898989, tqe_prev = 0x8989898989898989}, bs_list = {tqe_next = > 0x8989898989898989, > tqe_prev = 0x8989898989898989}, monitor_list = {tqe_next = > 0x8989898989898989, tqe_prev = 0x8989898989898989}, refcnt = -1987475063, > op_blockers = {{lh_first = 0x8989898989898989} }, job = > 0x8989898989898989, > inherits_from = 0x8989898989898989, children = {lh_first = > 0x8989898989898989}, parents = {lh_first = 0x8989898989898989}, options = > 0x8989898989898989, explicit_options = 0x8989898989898989, detect_zeroes = > 2307492233, > backing_blocker = 0x8989898989898989, total_sectors = -8536140394893047415, > before_write_notifiers = {notifiers = {lh_first = 0x8989898989898989}}, > write_threshold_offset = 9910603678816504201, write_threshold_notifier = > {notify = > 0x8989898989898989, node = {le_next = 0x8989898989898989, le_prev = > 0x8989898989898989}}, dirty_bitmap_mutex = {lock = {__data = {__lock = > -1987475063, __count = 2307492233, __owner = -1987475063, __nusers = > 2307492233, > __kind = -1987475063, 
__spins = -30327, __elision = -30327, __list = > {__prev = 0x8989898989898989, __next = 0x8989898989898989}}, __size = '\211' > , __align = -8536140394893047415}, initialized = 137}, > dirty_bitmaps = {lh_first = 0x8989898989898989}, wr_highest_offset = {value > = 9910603678816504201}, copy_on_read = -1987475063, in_flight = 2307492233, > serialising_in_flight = 2307492233, wakeup = 137, io_plugged = 2307492233, > enable_write_cache = -1987475063,
Re: [Qemu-devel] [PATCH] Add missing parameters to mon option documentation
On Thu, Nov 09, 2017 at 01:19:03PM +0100, Vicente Jimenez Aguilar wrote: > Documentation missed 'mon' option's 'pretty' and 'default' parameters > > Signed-off-by: Vicente Jimenez Aguilar > --- > qemu-options.hx | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/qemu-options.hx b/qemu-options.hx > index 3728e9b4dd..72cf48a8e5 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -3477,9 +3477,9 @@ Like -qmp but uses pretty JSON formatting. > ETEXI > > DEF("mon", HAS_ARG, QEMU_OPTION_mon, \ > -"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL) > +"-mon > [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]\n", > QEMU_ARCH_ALL) > STEXI > -@item -mon [chardev=]name[,mode=readline|control] > +@item -mon > [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]] The 'default' option is deprecated and unused, so it is intentionally not documented: https://qemu.weilnetz.de/doc/qemu-doc.html#g_t_002dmon-default_003don-_0028since-2_002e4_002e0_0029 Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [Qemu-devel] [PATCH V3] hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
On Wed, Nov 08, 2017 at 04:48:09PM +0200, Marcel Apfelbaum wrote: > Currently there is no MMIO range over 4G > reserved for PCI hotplug. Since the 32bit PCI hole > depends on the number of cold-plugged PCI devices > and other factors, it is very possible that it is too small > to hotplug PCI devices with large BARs. > > Fix it by reserving 2G for the I440FX chipset > in order to comply with older Win32 Guest OSes > and 32G for the Q35 chipset. > > Note this is a regression since prev QEMU versions had > some range reserved for 64bit PCI hotplug. > > Signed-off-by: Marcel Apfelbaum Looks sane to me, but could we add some documentation please? Short term code comments. Long term let's add a document. > --- > > V2 -> V3: > - Addressed Gerd's and others comments and re-enabled the pci-hole64-size >property defaulting it to 2G for I440FX and 32G for Q35. > - Even if the new defaults of pci-hole64-size will appear in "info qtree" >also for older machines, the property was not implemented so >no changes will be visible to guests. 
> V1 -> V2:
> Addressed Igor's comments:
>  - aligned the hole64 start to 1Gb
>    (I think all the computations took care of it already,
>    but it can't hurt)
>  - Init compat props to "off" instead of "false"
>
> Thanks,
> Marcel
>
>  hw/i386/pc.c              | 18 ++++++++++++++++++
>  hw/pci-host/piix.c        | 13 ++++++++++++-
>  hw/pci-host/q35.c         | 17 +++++++++++++++--
>  include/hw/i386/pc.h      | 10 +++++++++-
>  include/hw/pci-host/q35.h |  1 +
>  5 files changed, 55 insertions(+), 4 deletions(-)
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index e11a65b545..af3177cca5 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1448,6 +1448,24 @@ void pc_memory_init(PCMachineState *pcms,
>      pcms->ioapic_as = &address_space_memory;
>  }
>
> +uint64_t pc_pci_hole64_start(void)
> +{
> +    PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    uint64_t hole64_start = 0;
> +
> +    if (pcmc->has_reserved_memory && pcms->hotplug_memory.base) {
> +        hole64_start = pcms->hotplug_memory.base;
> +        if (!pcmc->broken_reserved_end) {
> +            hole64_start += memory_region_size(&pcms->hotplug_memory.mr);
> +        }
> +    } else {
> +        hole64_start = 0x100000000ULL + pcms->above_4g_mem_size;
> +    }
> +
> +    return ROUND_UP(hole64_start, 1ULL << 30);
> +}
> +
>  qemu_irq pc_allocate_cpu_irq(void)
>  {
>      return qemu_allocate_irq(pic_irq_request, NULL, 0);
> diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
> index a7e2256870..e033ee7df8 100644
> --- a/hw/pci-host/piix.c
> +++ b/hw/pci-host/piix.c
> @@ -50,6 +50,7 @@ typedef struct I440FXState {
>      PCIHostState parent_obj;
>      Range pci_hole;
>      uint64_t pci_hole64_size;
> +    bool pci_hole64_fix;
>      uint32_t short_root_bus;
>  } I440FXState;
>
> @@ -243,11 +244,15 @@ static void i440fx_pcihost_get_pci_hole64_start(Object *obj, Visitor *v,
>                                                  void *opaque, Error **errp)
>  {
>      PCIHostState *h = PCI_HOST_BRIDGE(obj);
> +    I440FXState *s = I440FX_PCI_HOST_BRIDGE(obj);
>      Range w64;
>      uint64_t value;
>
>      pci_bus_get_w64_range(h->bus, &w64);
>      value = range_is_empty(&w64) ? 0 : range_lob(&w64);
> +    if (!value && s->pci_hole64_fix) {
> +        value = pc_pci_hole64_start();
> +    }
>      visit_type_uint64(v, name, &value, errp);
>  }
>
> @@ -256,11 +261,16 @@ static void i440fx_pcihost_get_pci_hole64_end(Object *obj, Visitor *v,
>                                                Error **errp)
>  {
>      PCIHostState *h = PCI_HOST_BRIDGE(obj);
> +    I440FXState *s = I440FX_PCI_HOST_BRIDGE(obj);
> +    uint64_t hole64_start = pc_pci_hole64_start();
>      Range w64;
>      uint64_t value;
>
>      pci_bus_get_w64_range(h->bus, &w64);
>      value = range_is_empty(&w64) ? 0 : range_upb(&w64) + 1;
> +    if (s->pci_hole64_fix && value < (hole64_start + s->pci_hole64_size)) {
> +        value = hole64_start + s->pci_hole64_size;
> +    }
>      visit_type_uint64(v, name, &value, errp);
>  }
>
> @@ -863,8 +873,9 @@ static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
>
>  static Property i440fx_props[] = {
>      DEFINE_PROP_SIZE(PCI_HOST_PROP_PCI_HOLE64_SIZE, I440FXState,
> -                     pci_hole64_size, DEFAULT_PCI_HOLE64_SIZE),
> +                     pci_hole64_size, 1ULL << 31),
>      DEFINE_PROP_UINT32("short_root_bus", I440FXState, short_root_bus, 0),
> +    DEFINE_PROP_BOOL("x-pci-hole64-fix", I440FXState, pci_hole64_fix, true),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>
> diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
> index ddaa7d1b44..f95337c13a 100644
> --- a/hw/pci-host/q35.c
> +++ b/hw/pci-host/q35.c
> @@ -104,11 +104,15 @@ static void q35_host_get_pci_hole64_start(Object *obj, Visitor *v,
>                                            Error **errp)
>  {
>      PCIHostState
[Qemu-devel] [PATCH 2/2] Add mon default parameter to help documentation
Spotted after reading mon_init_func from vl.c

Signed-off-by: Vicente Jimenez Aguilar
---
 qemu-options.hx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 2692a48f63..12192a20e4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3477,7 +3477,7 @@ Like -qmp but uses pretty JSON formatting.
 ETEXI
 
 DEF("mon", HAS_ARG, QEMU_OPTION_mon, \
-"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]]\n", QEMU_ARCH_ALL)
+"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]\n", QEMU_ARCH_ALL)
 STEXI
 @item -mon [chardev=]name[,mode=readline|control]
 @findex -mon
-- 
2.14.1
[Qemu-devel] [PATCH 1/2] Add mon pretty parameter to help documentation
Signed-off-by: Vicente Jimenez Aguilar
---
 qemu-options.hx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3728e9b4dd..2692a48f63 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3477,7 +3477,7 @@ Like -qmp but uses pretty JSON formatting.
 ETEXI
 
 DEF("mon", HAS_ARG, QEMU_OPTION_mon, \
-"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL)
+"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]]\n", QEMU_ARCH_ALL)
 STEXI
 @item -mon [chardev=]name[,mode=readline|control]
 @findex -mon
-- 
2.14.1
[Qemu-devel] [PATCH] Add missing parameters to mon option documentation
Documentation missed 'mon' option's 'pretty' and 'default' parameters

Signed-off-by: Vicente Jimenez Aguilar
---
 qemu-options.hx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3728e9b4dd..72cf48a8e5 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3477,9 +3477,9 @@ Like -qmp but uses pretty JSON formatting.
 ETEXI
 
 DEF("mon", HAS_ARG, QEMU_OPTION_mon, \
-"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL)
+"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]\n", QEMU_ARCH_ALL)
 STEXI
-@item -mon [chardev=]name[,mode=readline|control]
+@item -mon [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]
 @findex -mon
 Setup monitor on chardev @var{name}.
 ETEXI
-- 
2.14.1
[Qemu-devel] [PATCH] Add mon pretty parameter to help documentation
Signed-off-by: Vicente Jimenez Aguilar
---
 qemu-options.hx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3728e9b4dd..2692a48f63 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3477,7 +3477,7 @@ Like -qmp but uses pretty JSON formatting.
 ETEXI
 
 DEF("mon", HAS_ARG, QEMU_OPTION_mon, \
-"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL)
+"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]]\n", QEMU_ARCH_ALL)
 STEXI
 @item -mon [chardev=]name[,mode=readline|control]
 @findex -mon
-- 
2.14.1
Re: [Qemu-devel] [RISU PATCH 07/10] risugen: add --sve support
Dave Martin writes:

> On Tue, Nov 07, 2017 at 03:05:55PM +, Alex Bennée wrote:
>> This is similar to the approach used by the FP/simd data in so far as
>> we generate a block of random data and then load into it. As there are
>> no post-index SVE operations we need to emit an additional incp
>> instruction to generate our offset into the array.
>>
>> Signed-off-by: Alex Bennée
>> ---
>>  risugen        |  3 +++
>>  risugen_arm.pm | 57 +++++++++++++++++++++++++++++++++++++++++++++---------
>>  2 files changed, 53 insertions(+), 7 deletions(-)
>>
>> diff --git a/risugen b/risugen
>> index aba4bb7..0ac8e86 100755
>> --- a/risugen
>> +++ b/risugen
>> @@ -317,6 +317,7 @@ sub main()
>>      my $condprob = 0;
>>      my $fpscr = 0;
>>      my $fp_enabled = 1;
>> +    my $sve_enabled = 1;
>>      my $big_endian = 0;
>>      my ($infile, $outfile);
>>
>> @@ -334,6 +335,7 @@ sub main()
>>          },
>>          "be" => sub { $big_endian = 1; },
>>          "no-fp" => sub { $fp_enabled = 0; },
>> +        "sve" => sub { $sve_enabled = 1; },
>>      ) or return 1;
>>      # allow "--pattern re,re" and "--pattern re --pattern re"
>>      @pattern_re = split(/,/,join(',',@pattern_re));
>> @@ -361,6 +363,7 @@ sub main()
>>          'fpscr' => $fpscr,
>>          'numinsns' => $numinsns,
>>          'fp_enabled' => $fp_enabled,
>> +        'sve_enabled' => $sve_enabled,
>>          'outfile' => $outfile,
>>          'details' => \%insn_details,
>>          'keys' => \@insn_keys,
>> diff --git a/risugen_arm.pm b/risugen_arm.pm
>> index 2f10d58..8d1e1fd 100644
>> --- a/risugen_arm.pm
>> +++ b/risugen_arm.pm
>> @@ -472,9 +472,47 @@ sub write_random_aarch64_fpdata()
>>      }
>>  }
>>
>> -sub write_random_aarch64_regdata($)
>> +sub write_random_aarch64_svedata()
>>  {
>> -    my ($fp_enabled) = @_;
>> +    # Load SVE registers
>> +    my $align = 16;
>> +    my $vl = 16; # number of vqs
>
> Would this be better phrased
>
>     my $vq = 16;    # quadwords per vector
>
>> +    my $datalen = (32 * $vl * 16) + $align;
>> +
>> +    write_pc_adr(0, (3 * 4) + ($align - 1)); # insn 1
>> +    write_align_reg(0, $align);              # insn 2
>> +    write_jump_fwd($datalen);                # insn 3
>> +
>> +    # align safety
>> +    for (my $i = 0; $i < ($align / 4); $i++) {
>> +        # align with nops
>> +        insn32(0xd503201f);
>> +    };
>> +
>> +    for (my $rt = 0; $rt <= 31; $rt++) {
>> +        for (my $q = 0; $q < $vl; $q++) {
>> +            write_random_fpreg_var(4); # quad
>> +        }
>> +    }
>> +
>> +    # Reset all the predicate registers to all true
>> +    for (my $p = 0; $p < 16; $p++) {
>> +        insn32(0x2518e3e0 | $p);
>> +    }
>> +
>> +    # there is no post index load so we do this by hand
>> +    write_mov_ri(1, 0);
>> +    for (my $rt = 0; $rt <= 31; $rt++) {
>> +        # ld1d z0.d, p0/z, [x0, x1, lsl #3]
>> +        insn32(0xa5e14000 | $rt);
>> +        # incp x1, p0.d
>> +        insn32(0x25ec8801);
>
> You could avoid this with the unpredicated form LDR (vector).
> (LD1x scalar+immediate doesn't provide enough immediate range).
>
>     # ldr z$rt, [x0, #$rt, mul vl]
>     insn32(0x85804000 + $rt + (($rt & 7) << 10) + (($rt & 0x18) << 13));
>
> which is what the kernel does.

Ahh, thanks.

> No harm in exercising different instructions though! The kernel uses
> embarrassingly few.

Well, that's what the actual random instructions are for. I haven't yet
considered if we need to group things together to properly exercise the
predicate code. So far RISU doesn't exercise conditional branches, as it's
hard to do that without generating code that would lock up. It's very much
the "easy 80%" of instructions we target.

> Does it matter that the stride will depend on the actual current VL?
> If x0 just points to a block of random data, I guess it doesn't matter:
> some trailing data remains unused, but that doesn't make the used data
> any less random.

Not really. We do try and re-generate every N instructions so all the
floats don't just denormalise.

>
> [...]
>
> Cheers
> ---Dave

--
Alex Bennée