Re: [Qemu-devel] [PATCH] target-i386: adds PV_TLB_FLUSH CPUID feature bit
2017-11-10 15:45 GMT+08:00 Wanpeng Li:
> From: Wanpeng Li
>
> Adds PV_TLB_FLUSH CPUID feature bit.
>
> Cc: Paolo Bonzini
> Cc: Radim Krčmář
> Cc: Richard Henderson
> Cc: Eduardo Habkost
> Signed-off-by: Wanpeng Li
> ---
>  target/i386/cpu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/target/i386/cpu.c b/target/i386/cpu.c
> index 6f21a5e..ecebc5a 100644
> --- a/target/i386/cpu.c
> +++ b/target/i386/cpu.c
> @@ -347,7 +347,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
>      .feat_names = {
>          "kvmclock", "kvm-nopiodelay", "kvm-mmu", "kvmclock",
>          "kvm-asyncpf", "kvm-steal-time", "kvm-pv-eoi", "kvm-pv-unhalt",
> -        NULL, NULL, NULL, NULL,
> +        NULL, "kvm-pv-tlb-flush", NULL, NULL,

Note: bit 8 is reserved for PV_DEDICATED, which has been posted separately by another contributor in the KVM community.

>          NULL, NULL, NULL, NULL,
>          NULL, NULL, NULL, NULL,
>          NULL, NULL, NULL, NULL,
> --
> 2.7.4
[Qemu-devel] [PATCH] target-i386: adds PV_TLB_FLUSH CPUID feature bit
From: Wanpeng Li

Adds PV_TLB_FLUSH CPUID feature bit.

Cc: Paolo Bonzini
Cc: Radim Krčmář
Cc: Richard Henderson
Cc: Eduardo Habkost
Signed-off-by: Wanpeng Li
---
 target/i386/cpu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index 6f21a5e..ecebc5a 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -347,7 +347,7 @@ static FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
     .feat_names = {
         "kvmclock", "kvm-nopiodelay", "kvm-mmu", "kvmclock",
         "kvm-asyncpf", "kvm-steal-time", "kvm-pv-eoi", "kvm-pv-unhalt",
-        NULL, NULL, NULL, NULL,
+        NULL, "kvm-pv-tlb-flush", NULL, NULL,
         NULL, NULL, NULL, NULL,
         NULL, NULL, NULL, NULL,
         NULL, NULL, NULL, NULL,
--
2.7.4
Re: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Type: series
Message-id: 20171108225340.10194-1-lep...@google.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]  patchew/20171108225340.10194-1-lep...@google.com -> patchew/20171108225340.10194-1-lep...@google.com
Switched to a new branch 'test'
566ba17a0b slirp: don't zero ti_i since we access it later.
=== OUTPUT BEGIN ===
Checking PATCH 1/1: slirp: don't zero ti_i since we access it later...
ERROR: code indent should never use tabs
#21: FILE: slirp/tcp_subr.c:151:
+^I^Iswitch (af) {$

ERROR: code indent should never use tabs
#22: FILE: slirp/tcp_subr.c:152:
+^I^Icase AF_INET:$

ERROR: code indent should never use tabs
#23: FILE: slirp/tcp_subr.c:153:
+^I^Iti->ti.ti_i4.ih_x1 = 0;$

ERROR: code indent should never use tabs
#24: FILE: slirp/tcp_subr.c:154:
+^I^Ibreak;$

ERROR: code indent should never use tabs
#25: FILE: slirp/tcp_subr.c:155:
+^I^Icase AF_INET6:$

ERROR: code indent should never use tabs
#26: FILE: slirp/tcp_subr.c:156:
+^I^Iti->ti.ti_i6.ih_x1 = 0;$

ERROR: code indent should never use tabs
#27: FILE: slirp/tcp_subr.c:157:
+^I^Ibreak;$

ERROR: code indent should never use tabs
#28: FILE: slirp/tcp_subr.c:158:
+^I^Idefault:$

ERROR: code indent should never use tabs
#29: FILE: slirp/tcp_subr.c:159:
+^I^Ig_assert_not_reached();$

ERROR: code indent should never use tabs
#30: FILE: slirp/tcp_subr.c:160:
+^I^I}$

total: 10 errors, 0 warnings, 17 lines checked

Your patch has style problems, please review. If any of these errors are
false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1

---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org
Re: [Qemu-devel] [PULL 0/1] slirp update
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PULL 0/1] slirp update
Type: series
Message-id: 20171109180142.13923-1-samuel.thiba...@ens-lyon.org

=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]  patchew/20171109180142.13923-1-samuel.thiba...@ens-lyon.org -> patchew/20171109180142.13923-1-samuel.thiba...@ens-lyon.org
 * [new tag]  patchew/20171109181741.31318-1-lep...@google.com -> patchew/20171109181741.31318-1-lep...@google.com
Switched to a new branch 'test'
9f23f4da08 slirp: don't zero the whole ti_i when m == NULL

=== OUTPUT BEGIN ===
Checking PATCH 1/1: slirp: don't zero the whole ti_i when m == NULL...
ERROR: code indent should never use tabs
#29: FILE: slirp/tcp_subr.c:151:
+^I^Iswitch (af) {$

ERROR: code indent should never use tabs
#30: FILE: slirp/tcp_subr.c:152:
+^I^Icase AF_INET:$

ERROR: code indent should never use tabs
#31: FILE: slirp/tcp_subr.c:153:
+^I^Iti->ti.ti_i4.ih_x1 = 0;$

ERROR: code indent should never use tabs
#32: FILE: slirp/tcp_subr.c:154:
+^I^Ibreak;$

ERROR: code indent should never use tabs
#33: FILE: slirp/tcp_subr.c:155:
+^I^Icase AF_INET6:$

ERROR: code indent should never use tabs
#34: FILE: slirp/tcp_subr.c:156:
+^I^Iti->ti.ti_i6.ih_x1 = 0;$

ERROR: code indent should never use tabs
#35: FILE: slirp/tcp_subr.c:157:
+^I^Ibreak;$

ERROR: code indent should never use tabs
#36: FILE: slirp/tcp_subr.c:158:
+^I^Idefault:$

ERROR: code indent should never use tabs
#37: FILE: slirp/tcp_subr.c:159:
+^I^Ig_assert_not_reached();$

ERROR: code indent should never use tabs
#38: FILE: slirp/tcp_subr.c:160:
+^I^I}$

total: 10 errors, 0 warnings, 17 lines checked

Your patch has style problems, please review. If any of these errors are
false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1

---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org
Re: [Qemu-devel] [PATCH v2] aarch64: advertise the GIC system register interface
Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [Qemu-devel] [PATCH v2] aarch64: advertise the GIC system register interface
Type: series
Message-id: alpine.DEB.2.10.1711061412330.30448@sstabellini-ThinkPad-X260

=== TEST SCRIPT BEGIN ===
#!/bin/bash
BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0
git config --local diff.renamelimit 0
git config --local diff.renames True
commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done
exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]  patchew/alpine.DEB.2.10.1711061412330.30448@sstabellini-ThinkPad-X260 -> patchew/alpine.DEB.2.10.1711061412330.30448@sstabellini-ThinkPad-X260
Switched to a new branch 'test'
d3e1cec549 aarch64: advertise the GIC system register interface

=== OUTPUT BEGIN ===
Checking PATCH 1/1: aarch64: advertise the GIC system register interface...
ERROR: spaces required around that '|' (ctx:VxV)
#67: FILE: target/arm/helper.c:4690:
+ .resetvalue = cpu->gicv3_sysregs ? (cpu->id_aa64pfr0|0x0100) :
^

total: 1 errors, 0 warnings, 34 lines checked

Your patch has style problems, please review. If any of these errors are
false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1

---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-de...@freelists.org
Re: [Qemu-devel] [PULL, 08/14] spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW)
On Wed, Nov 08, 2017 at 09:52:12AM +1100, Alexey Kardashevskiy wrote:
> On 07/11/17 19:42, Laurent Vivier wrote:
> > On 05/07/2016 07:31, David Gibson wrote:
> >> From: Alexey Kardashevskiy
> >>
> >> This adds support for the Dynamic DMA Windows (DDW) option defined by
> >> the SPAPR specification, which allows additional DMA window(s).
> >>
> >> The "ddw" property is enabled by default on a PHB, but for compatibility
> >> the pseries-2.6 machine and older disable it.
> >> This also creates a single DMA window for the older machines to
> >> maintain backward migration.
> >>
> >> This implements DDW for PHBs with emulated and VFIO devices. The host
> >> kernel support is required. The advertised IOMMU page sizes are 4K and
> >> 64K; 16M pages are supported but not advertised by default. In order to
> >> enable them, the user has to specify the "pgsz" property for the PHB and
> >> enable huge pages for RAM.
> >
> > Why is it not advertised by default?
>
> I do not remember clearly, but this kind of automation is usually less
> manageable. What if we do not want huge IOMMU pages for some reason?

The reason is basically migration. Because of the compatibility requirements
for that, we generally try to avoid making any guest-visible changes to the
virtual machine based on host capabilities or configuration, because it makes
migration a total nightmare.

That said, in this case the difference we'd need here - availability of
hugepages in the guest - is *already* guest visible for the normal (non-IOMMU)
page mapping. So maybe automatically changing the IOMMU mask as well would be
ok. Or at least no worse than what we have already.

> > When we start qemu with hugepage memory ("mount -t hugetlbfs none
> > /mnt/kvm_hugepage" and ".. -mem-path /mnt/kvm_hugepage .."), we have an
> > ugly message:
> >
> > "qemu-kvm: System page size 0x100 is not enabled in page_size_mask
> > (0x11000).
Performance may be slow"
> >
> > I understand if we want to use this with VFIO, we need something like
> > "-global spapr-pci-host-bridge.pgsz=0x1011000".
> >
> > But is it needed if we don't use VFIO?
>
> Yes, TCE tables are still created in KVM so the size matters.
>
> > Is it a way QEMU adds automatically the 0x100 mask to page_size_mask?
>
> No. That thing which decides about -mem-path should also add the pgsz mask.

--
David Gibson                   | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
                               | _way_ _around_!
http://www.ozlabs.org/~dgibson
Re: [Qemu-devel] kvm: virtio-net: saved image requires TUN_F_UFO support
On 2017年11月08日 19:22, Jason Wang wrote: On 2017年11月08日 18:46, Paolo Bonzini wrote: On 08/11/2017 09:21, Jason Wang wrote: On 2017年11月08日 17:05, Stefan Priebe - Profihost AG wrote: Am 08.11.2017 um 08:54 schrieb Jason Wang: On 2017年11月08日 15:41, Stefan Priebe - Profihost AG wrote: Hi Paolo, Am 06.11.2017 um 12:27 schrieb Paolo Bonzini: On 06/11/2017 12:09, Stefan Priebe - Profihost AG wrote: HI Paolo, could this patchset be related? Uh oh, yes it should. Jason, any ways to fix it? I suppose we need to disable UFO in the newest machine types, but do we also have to do (software) UFO in vhost-net and QEMU for migration compatibility? Any news on this? Greets, Since we probe UFO support, it will be disabled automatically on device startup. For the issue of migration, I think the only way is trying to fix it in kernel. But 4.14 is around the corner and the patchset is already included. Shouldn't this get fixed before 4.14 is released? We will try to seek a solution as soon as possible. If we can't catch 4.14, we will do it for stable for sure. Jason, can you write to netdev and Cc Linus and me? Thanks, Paolo Paolo, see this https://www.spinics.net/lists/netdev/msg465454.html Just notice this today since I'm not on the cc list. Thanks An update: After some discussions on netdev, we will try to bring UFO back partially for just TAP. Willem promise to fix this. 4.14 is too late consider the fix is not trivial, it will go through stable tree. Thanks
Re: [Qemu-devel] [Qemu-block] segfault in parallel blockjobs (iotest 30)
On Thu, 11/09 17:26, Alberto Garcia wrote: > On Wed 08 Nov 2017 03:45:38 PM CET, Alberto Garcia wrote: > > > I'm thinking that perhaps we should add the pause point directly to > > block_job_defer_to_main_loop(), to prevent any block job from running > > the exit function when it's paused. > > I was trying this and unfortunately this breaks the mirror job at least > (reproduced with a simple block-commit on the topmost node, which uses > commit_active_start() -> mirror_start_job()). > > So what happens is that mirror_run() always calls bdrv_drained_begin() > before returning, pausing the block job. The closing bdrv_drained_end() > call is at the end of mirror_exit(), already in the main loop. > > So the mirror job is always calling block_job_defer_to_main_loop() and > mirror_exit() when the job is paused. > FWIW, I think Max's report on 194 failures is related: https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01822.html so perhaps it's worth testing his patch too: https://lists.gnu.org/archive/html/qemu-devel/2017-11/msg01835.html Fam
Re: [Qemu-devel] [PATCH for-2.11] block: Keep strong reference when draining all BDS
On Thu, 11/09 21:43, Max Reitz wrote: > Draining a BDS may lead to graph modifications, which in turn may result > in it and other BDS being stripped of their current references. If > bdrv_drain_all_begin() and bdrv_drain_all_end() do not keep strong > references themselves, the BDS they are trying to drain (or undrain) may > disappear right under their feet -- or, more specifically, under the > feet of BDRV_POLL_WHILE() in bdrv_drain_recurse(). > > This fixes an occasional hang of iotest 194. > > Signed-off-by: Max Reitz> --- > block/io.c | 47 --- > 1 file changed, 44 insertions(+), 3 deletions(-) > > diff --git a/block/io.c b/block/io.c > index 3d5ef2cabe..a0a2833e8e 100644 > --- a/block/io.c > +++ b/block/io.c > @@ -340,7 +340,10 @@ void bdrv_drain_all_begin(void) > bool waited = true; > BlockDriverState *bs; > BdrvNextIterator it; > -GSList *aio_ctxs = NULL, *ctx; > +GSList *aio_ctxs = NULL, *ctx, *bs_list = NULL, *bs_list_entry; > + > +/* Must be called from the main loop */ > +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > > block_job_pause_all(); > > @@ -355,6 +358,12 @@ void bdrv_drain_all_begin(void) > if (!g_slist_find(aio_ctxs, aio_context)) { > aio_ctxs = g_slist_prepend(aio_ctxs, aio_context); > } > + > +/* Keep a strong reference to all root BDS and copy them into > + * an own list because draining them may lead to graph > + * modifications. 
*/ > +bdrv_ref(bs); > +bs_list = g_slist_prepend(bs_list, bs); > } > > /* Note that completion of an asynchronous I/O operation can trigger any > @@ -370,7 +379,11 @@ void bdrv_drain_all_begin(void) > AioContext *aio_context = ctx->data; > > aio_context_acquire(aio_context); > -for (bs = bdrv_first(); bs; bs = bdrv_next()) { > +for (bs_list_entry = bs_list; bs_list_entry; > + bs_list_entry = bs_list_entry->next) > +{ > +bs = bs_list_entry->data; > + > if (aio_context == bdrv_get_aio_context(bs)) { > waited |= bdrv_drain_recurse(bs, true); > } > @@ -379,24 +392,52 @@ void bdrv_drain_all_begin(void) > } > } > > +for (bs_list_entry = bs_list; bs_list_entry; > + bs_list_entry = bs_list_entry->next) > +{ > +bdrv_unref(bs_list_entry->data); > +} > + > g_slist_free(aio_ctxs); > +g_slist_free(bs_list); > } > > void bdrv_drain_all_end(void) > { > BlockDriverState *bs; > BdrvNextIterator it; > +GSList *bs_list = NULL, *bs_list_entry; > + > +/* Must be called from the main loop */ > +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > > +/* Keep a strong reference to all root BDS and copy them into an > + * own list because draining them may lead to graph modifications. > + */ > for (bs = bdrv_first(); bs; bs = bdrv_next()) { > -AioContext *aio_context = bdrv_get_aio_context(bs); > +bdrv_ref(bs); > +bs_list = g_slist_prepend(bs_list, bs); > +} > + > +for (bs_list_entry = bs_list; bs_list_entry; > + bs_list_entry = bs_list_entry->next) > +{ > +AioContext *aio_context; > + > +bs = bs_list_entry->data; > +aio_context = bdrv_get_aio_context(bs); > > aio_context_acquire(aio_context); > aio_enable_external(aio_context); > bdrv_parent_drained_end(bs); > bdrv_drain_recurse(bs, false); > aio_context_release(aio_context); > + > +bdrv_unref(bs); > } > > +g_slist_free(bs_list); > + > block_job_resume_all(); > } > > -- > 2.13.6 > > It is better to put the references into BdrvNextIterator and introduce bdrv_next_iterator_destroy() to free them? 
You'll need to touch all callers because it is not C++, but it secures all of rest, which seems vulnerable in the same pattern, for example the aio_poll() in iothread_stop_all(). Fam
Re: [Qemu-devel] Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage migration)
On Thu, 11/09 20:31, Max Reitz wrote: > On 2017-11-09 16:30, Fam Zheng wrote: > > On Thu, 11/09 16:14, Max Reitz wrote: > >> On 2017-11-09 05:21, Fam Zheng wrote: > >>> On Thu, 11/09 01:48, Max Reitz wrote: > Hi, > > More exciting news from the bdrv_drain() front! > > I've noticed in the past that iotest 194 sometimes hangs. I usually run > the tests on tmpfs, but I've just now verified that it happens on my SSD > just as well. > > So the reproducer is a plain: > > while ./check -raw 194; do; done > >>> > >>> I cannot produce it on my machine. > >> > >> Hm, too bad. I see it both on my work laptop (with Fedora) and my > >> desktop (with Arch)... > >> > (No difference between raw or qcow2, though.) > > And then, after a couple of runs (or a couple ten), it will just hang. > The reason is that the source VM lingers around and doesn't quit > voluntarily -- the test itself was successful, but it just can't exit. > > If you force it to exit by killing the VM (e.g. through pkill -11 qemu), > this is the backtrace: > > #0 0x7f7cfc297e06 in ppoll () at /lib64/libc.so.6 > #1 0x563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, > __nfds=, __fds=) at > /usr/include/bits/poll2.h:77 > >>> > >>> Looking at the 0 timeout it seems we are in the aio_poll(ctx, > >>> blocking=false); > >>> branches of BDRV_POLL_WHILE? Is it a busy loop? If so I wonder what is > >>> making > >>> progress and causing the return value to be true in aio_poll(). 
> >>> > #2 0x563b846bcac9 in qemu_poll_ns (fds=, > nfds=, timeout=) at util/qemu-timer.c:322 > #3 0x563b846be711 in aio_poll (ctx=ctx@entry=0x563b856e3e80, > blocking=) at util/aio-posix.c:629 > #4 0x563b8463afa4 in bdrv_drain_recurse > (bs=bs@entry=0x563b865568a0, begin=begin@entry=true) at block/io.c:201 > #5 0x563b8463baff in bdrv_drain_all_begin () at block/io.c:381 > #6 0x563b8463bc99 in bdrv_drain_all () at block/io.c:411 > #7 0x563b8459888b in block_migration_cleanup (opaque= out>) at migration/block.c:714 > #8 0x563b845883be in qemu_savevm_state_cleanup () at > migration/savevm.c:1251 > #9 0x563b845811fd in migration_thread (opaque=0x563b856f1da0) at > migration/migration.c:2298 > #10 0x7f7cfc56f36d in start_thread () at /lib64/libpthread.so.0 > #11 0x7f7cfc2a3e1f in clone () at /lib64/libc.so.6 > > > And when you make bdrv_drain_all_begin() print what we are trying to > drain, you can see that it's the format node (managed by the "raw" > driver in this case). > >>> > >>> So what is the value of bs->in_flight? > >> > >> gdb: > >>> (gdb) print bs->in_flight > >>> $3 = 2307492233 > >> > >> "That's weird, why would it..." 
> >> > >>> (gdb) print *bs > >>> $4 = {open_flags = -1202160144, read_only = 161, encrypted = 85, sg = > >>> false, probed = false, force_share = 96, implicit = 159, drv = 0x0, > >>> opaque = 0x0, aio_context = 0x8989898989898989, aio_notifiers = {lh_first > >>> = 0x8989898989898989}, > >>> walking_aio_notifiers = 137, filename = '\211' , > >>> backing_file = '\211' , backing_format = '\211' > >>> , full_open_options = 0x8989898989898989, > >>> exact_filename = '\211' , backing = > >>> 0x8989898989898989, file = 0x8989898989898989, bl = {request_alignment = > >>> 2307492233, max_pdiscard = -1987475063, pdiscard_alignment = 2307492233, > >>> max_pwrite_zeroes = -1987475063, pwrite_zeroes_alignment = > >>> 2307492233, opt_transfer = 2307492233, max_transfer = 2307492233, > >>> min_mem_alignment = 9910603678816504201, opt_mem_alignment = > >>> 9910603678816504201, max_iov = -1987475063}, > >>> supported_write_flags = 2307492233, supported_zero_flags = 2307492233, > >>> node_name = '\211' , node_list = {tqe_next = > >>> 0x8989898989898989, tqe_prev = 0x8989898989898989}, bs_list = {tqe_next = > >>> 0x8989898989898989, > >>> tqe_prev = 0x8989898989898989}, monitor_list = {tqe_next = > >>> 0x8989898989898989, tqe_prev = 0x8989898989898989}, refcnt = -1987475063, > >>> op_blockers = {{lh_first = 0x8989898989898989} }, job = > >>> 0x8989898989898989, > >>> inherits_from = 0x8989898989898989, children = {lh_first = > >>> 0x8989898989898989}, parents = {lh_first = 0x8989898989898989}, options = > >>> 0x8989898989898989, explicit_options = 0x8989898989898989, detect_zeroes > >>> = 2307492233, > >>> backing_blocker = 0x8989898989898989, total_sectors = > >>> -8536140394893047415, before_write_notifiers = {notifiers = {lh_first = > >>> 0x8989898989898989}}, write_threshold_offset = 9910603678816504201, > >>> write_threshold_notifier = {notify = > >>> 0x8989898989898989, node = {le_next = 0x8989898989898989, le_prev = > >>> 0x8989898989898989}}, dirty_bitmap_mutex = {lock = 
{__data = {__lock = > >>> -1987475063, __count = 2307492233, __owner =
Re: [Qemu-devel] [PATCH v2 1/1] Makefile: Capstone: Add support for cross compile ranlib
On Thu, Nov 9, 2017 at 3:09 AM, Peter Maydellwrote: > On 8 November 2017 at 15:10, Alistair Francis > wrote: >> When cross compiling QEMU for Windows we need to specify the cross >> version of ranlib to avoid build errors when building capstone. This >> patch ensures we use the same cross prefix on ranlib as other toolchain >> components. >> >> This patch fixes this build error when building on RHEL (or Fedora) with >> MinGW: >> /scratch/jenkins_tmp/workspace/QEMU-BuildAll/label/ssw_rhel7/target/windows/build-windows/capstone/capstone.lib: >> error adding symbols: Archive has no index; run ranlib to add one >> collect2: error: ld returned 1 exit status >> make: *** [qemu-io.exe] Error 1 >> >> Signed-off-by: Alistair Francis >> Suggested-by: Peter Maydell >> Reviewed-by: Philippe Mathieu-Daud >> Tested-by: Philippe Mathieu-Daud > > Something in your patch handling workflow seems to be > breaking non-ASCII characters, and it has dropped the > last character in Philippe's surname. It would be good > if you could look into fixing this -- we have quite a > few contributors whose names are not representable > purely in ASCII. I think it's the way that I copy lines into the patches. I'll be sure to make sure I get all the characters next time. Thanks for pointing it out. Alistair > > thanks > -- PMM
Re: [Qemu-devel] [PATCH v8 01/14] block/dirty-bitmap: add bdrv_dirty_bitmap_enable_successor()
On 10/30/2017 12:32 PM, Vladimir Sementsov-Ogievskiy wrote: > Enabling bitmap successor is necessary to enable successors of bitmaps > being migrated before target vm start. > > Signed-off-by: Vladimir Sementsov-Ogievskiy> --- > include/block/dirty-bitmap.h | 1 + > block/dirty-bitmap.c | 8 > 2 files changed, 9 insertions(+) > > diff --git a/include/block/dirty-bitmap.h b/include/block/dirty-bitmap.h > index 3579a7597c..93d4336505 100644 > --- a/include/block/dirty-bitmap.h > +++ b/include/block/dirty-bitmap.h > @@ -20,6 +20,7 @@ BdrvDirtyBitmap > *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs, > BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs, > BdrvDirtyBitmap *bitmap, > Error **errp); > +void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap); > BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, > const char *name); > void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap); > diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c > index bd04e991b1..81adbeb6d4 100644 > --- a/block/dirty-bitmap.c > +++ b/block/dirty-bitmap.c > @@ -237,6 +237,14 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState > *bs, > return 0; > } > > +/* Called with BQL taken. */ > +void bdrv_dirty_bitmap_enable_successor(BdrvDirtyBitmap *bitmap) > +{ > +qemu_mutex_lock(bitmap->mutex); > +bdrv_enable_dirty_bitmap(bitmap->successor); > +qemu_mutex_unlock(bitmap->mutex); > +} > + > /** > * For a bitmap with a successor, yield our name to the successor, > * delete the old bitmap, and return a handle to the new bitmap. > Mechanically correct; though I haven't looked forward to how it's used yet. Reviewed-by: John Snow
[Qemu-devel] [PATCH] thread-posix: fix qemu_rec_mutex_trylock macro
We never noticed because it has no users.

Signed-off-by: Emilio G. Cota
---
 include/qemu/thread-posix.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/qemu/thread-posix.h b/include/qemu/thread-posix.h
index f4296d3..f3f47e4 100644
--- a/include/qemu/thread-posix.h
+++ b/include/qemu/thread-posix.h
@@ -7,7 +7,7 @@ typedef QemuMutex QemuRecMutex;
 #define qemu_rec_mutex_destroy qemu_mutex_destroy
 #define qemu_rec_mutex_lock qemu_mutex_lock
-#define qemu_rec_mutex_try_lock qemu_mutex_try_lock
+#define qemu_rec_mutex_trylock qemu_mutex_trylock
 #define qemu_rec_mutex_unlock qemu_mutex_unlock

 struct QemuMutex {
--
2.7.4
Re: [Qemu-devel] [Qemu devel PATCH] MAINTAINERS: Add entries for Smartfusion2
On 11/09/2017 08:55 PM, Peter Maydell wrote:
> On 9 November 2017 at 21:46, Philippe Mathieu-Daudé wrote:
>> Hi Subbaraya,
>>
>> On 11/09/2017 09:02 AM, Subbaraya Sundeep wrote:
>>> add voluntarily myself as maintainer for Smartfusion2
>>
>> You need to share your GnuPG key signed, I couldn't find it using
>> http://pgp.mit.edu/pks/lookup?search=Subbaraya+Sundeep
>>
>> from https://wiki.qemu.org/Contribute/SubmitAPullRequest :
>
> I don't in general expect to take pull requests from
> everybody listed as a maintainer in the MAINTAINERS file.
> That just means "I'm going to be reviewing and should
> be cc'd on patches". Pull requests are sent by people
> who are maintainers for a subsystem. Rule of thumb:
> unless somebody asks you to send a pull request, you
> don't need to do it.

OK, please excuse my misunderstanding. I still think the M: entry stands for
'Maintainer' rather than 'Mail', and I still don't understand the difference
from a "Designated reviewer" (R: entry):

  M: Mail patches to: FullName
  R: Designated reviewer: FullName
     These reviewers should be CCed on patches.

"Designated reviewer" seems to duplicate the M: entry and is therefore
confusing. Can we simply remove it instead?

When it was introduced in fdf6fab4df4, the explanation was:

--
Some people are not content with the amount of mail they get, and would
like to be CCed on patches for areas they do not maintain. Let them
satisfy their own appetite for qemu-devel messages.

Seriously: the purpose here is a bit different from the Linux kernel.
While Linux uses "R" to designate non-maintainers for reviewing patches
in a given area, in QEMU I would also like to use "R" so that people can
delegate sending pull requests while keeping some degree of oversight.
--

Regards,

Phil.
Re: [Qemu-devel] [Qemu devel PATCH] MAINTAINERS: Add entries for Smartfusion2
On 9 November 2017 at 21:46, Philippe Mathieu-Daudéwrote: > Hi Subbaraya, > > On 11/09/2017 09:02 AM, Subbaraya Sundeep wrote: >> add voluntarily myself as maintainer for Smartfusion2 > > You need to share your GnuPG key signed, I couldn't find it using > http://pgp.mit.edu/pks/lookup?search=Subbaraya+Sundeep > > from https://wiki.qemu.org/Contribute/SubmitAPullRequest : I don't in general expect to take pull requests from everybody listed as a maintainer in the MAINTAINERS file. That just means "I'm going to be reviewing and should be cc'd on patches". Pull requests are sent by people who are maintainers for a subsystem. Rule of thumb: unless somebody asks you to send a pull request, you don't need to do it. thanks -- PMM
Re: [Qemu-devel] [PATCH for-2.11] iotests: Use new-style NBD connections
On 2017-11-09 23:12, Eric Blake wrote:
> Old-style NBD is deprecated upstream (it is documented, but no
> longer implemented in the reference implementation), and it is
> severely limited (it cannot support structured replies, which
> means it cannot support efficient handling of zeroes), when
> compared to new-style NBD. We are better off having our iotests
> favor new-style everywhere (although some explicit tests,
> particularly 83, still cover old-style for back-compat reasons);
> this is as simple as supplying the empty string as the default
> export name, as it does not change the URI needed to connect a
> client to the server. This also gives us more coverage of the
> just-added structured reply code, when not overriding $QEMU_NBD
> to intentionally point to an older server.
>
> Signed-off-by: Eric Blake
> ---
>
> Proposing this for 2.11; it can either go in through the NBD
> tree (although I just sent my 2.11-rc1 pull request) or through
> Max' iotest tree.

random.org said 1, so:

Thanks, applied to my block branch:

https://github.com/XanClic/qemu/commits/block

Max
[Qemu-devel] [Bug 1731347] [NEW] VFIO Passthrough of SAS2008-based HBA card fails on E3-1225v3 due to failed DMA mapping (-14)
Public bug reported:

There is a bug preventing multiple people with my combination of hardware
from using PCI passthrough. I am not actually sure whether the bug is in
kernel/kvm, vfio or qemu; however, as qemu is the highest-level of these, I
am reporting the bug here as you will likely know better where the origin of
the bug may be found.

When attempting to pass through this device to a KVM using VFIO, this
results in error -14 (Bad Address):

# qemu-system-x86_64 -enable-kvm -m 10G -net none -monitor stdio -serial
# none -parallel none -vnc :1 -device vfio-pci,host=1:00.0 -S
QEMU 2.9.1 monitor - type 'help' for more information
(qemu) c
(qemu) qemu-system-x86_64: VFIO_MAP_DMA: -14
qemu-system-x86_64: vfio_dma_map(0x7f548f0a1fc0, 0xfebd, 0x2000, 0x7f54a909d000) = -14 (Bad address)
qemu: hardware error: vfio: DMA mapping failed, unable to continue

See also:
https://bugzilla.proxmox.com/show_bug.cgi?id=1556
https://www.redhat.com/archives/vfio-users/2016-May/msg00088.html

This has occurred on Proxmox (Proxmox and Debian packages, Ubuntu kernel),
Ubuntu, and pure Debian packages and kernel on Proxmox.

However, this error reportedly does NOT occur for:
- different distributions(!) (Fedora 24, 25)
- different HBA cards (SAS2308, SAS3008)
- different CPU (E3-1220v5)

I would be thankful for any input and I'll be happy to provide any further
info necessary. This is my first time delving this deep into anything close
to the kernel.

Thanks and best regards,
Johannes Falke

** Affects: qemu
   Importance: Undecided
   Status: New

** Attachment added: "strace.txt"
   https://bugs.launchpad.net/bugs/1731347/+attachment/5006683/+files/strace.txt

--
You received this bug notification because you are a member of qemu-devel-ml,
which is subscribed to QEMU.
https://bugs.launchpad.net/bugs/1731347

Title:
  VFIO Passthrough of SAS2008-based HBA card fails on E3-1225v3 due to
  failed DMA mapping (-14)

Status in QEMU:
  New

To manage notifications about this bug go to:
https://bugs.launchpad.net/qemu/+bug/1731347/+subscriptions
Re: [Qemu-devel] [PATCH v2 0/5] iotests: Make some tests less flaky
On 2017-11-09 21:30, Max Reitz wrote: > There are a couple of tests that fail (on my machine) from time to > time (and by that I mean that recently I've rarely ever had a test run > with both 083 and 136 working on first try). > This series should fix most (at least the issues I am aware of). Fixed the commit message in patch 4 as suggested by Eric and applied to my block branch: https://github.com/XanClic/qemu/commits/block Max
[Qemu-devel] [PATCH for-2.11] iotests: Use new-style NBD connections
Old-style NBD is deprecated upstream (it is documented, but no longer implemented in the reference implementation), and it is severely limited (it cannot support structured replies, which means it cannot support efficient handling of zeroes), when compared to new-style NBD. We are better off having our iotests favor new-style everywhere (although some explicit tests, particularly 83, still cover old-style for back-compat reasons); this is as simple as supplying the empty string as the default export name, as it does not change the URI needed to connect a client to the server. This also gives us more coverage of the just-added structured reply code, when not overriding $QEMU_NBD to intentionally point to an older server. Signed-off-by: Eric Blake --- Proposing this for 2.11; it can either go in through the NBD tree (although I just sent my 2.11-rc1 pull request) or through Max' iotest tree. tests/qemu-iotests/common.rc | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qemu-iotests/common.rc b/tests/qemu-iotests/common.rc index 0e8a33c696..dbae7d74ba 100644 --- a/tests/qemu-iotests/common.rc +++ b/tests/qemu-iotests/common.rc @@ -242,7 +242,7 @@ _make_test_img() if [ $IMGPROTO = "nbd" ]; then # Pass a sufficiently high number to -e that should be enough for all # tests -eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810 -f $IMGFMT -e 42 $TEST_IMG_FILE >/dev/null &" +eval "$QEMU_NBD -v -t -b 127.0.0.1 -p 10810 -f $IMGFMT -e 42 -x '' $TEST_IMG_FILE >/dev/null &" sleep 1 # FIXME: qemu-nbd needs to be listening before we continue fi -- 2.13.6
[Qemu-devel] [PATCH for 2.11 v2 2/2] xlnx-zcu102: Add an info message deprecating the EP108
The EP108 was an early-access development board that is no longer used. Add an info message to convert any users to the ZCU102 instead. On QEMU they are both identical. This patch also updates the qemu-doc.texi file to indicate that the EP108 has been deprecated. Signed-off-by: Alistair Francis --- V2: - Update the qemu-doc.texi file hw/arm/xlnx-zcu102.c | 3 +++ qemu-doc.texi| 7 +++ 2 files changed, 10 insertions(+) diff --git a/hw/arm/xlnx-zcu102.c b/hw/arm/xlnx-zcu102.c index 7ec03dad42..a234a1 100644 --- a/hw/arm/xlnx-zcu102.c +++ b/hw/arm/xlnx-zcu102.c @@ -164,6 +164,9 @@ static void xlnx_ep108_init(MachineState *machine) { XlnxZCU102 *s = EP108_MACHINE(machine); +info_report("The Xilinx EP108 machine is deprecated, please use the " +"ZCU102 machine instead. It has the same features supported."); + xlnx_zynqmp_init(s, machine); } diff --git a/qemu-doc.texi b/qemu-doc.texi index 8c10956a66..d383ac44d4 100644 --- a/qemu-doc.texi +++ b/qemu-doc.texi @@ -2537,6 +2537,13 @@ or ``ivshmem-doorbell`` device types. The ``spapr-pci-vfio-host-bridge'' device type is replaced by the ``spapr-pci-host-bridge'' device type. +@section System emulator machines + +@subsection Xilinx EP108 (since 2.11.0) + +The ``xlnx-ep108'' machine has been replaced by the ``xlnx-zcu102'' machine. +The ``xlnx-zcu102'' machine has the same features and capabilities in QEMU. + @node License @appendix License -- 2.14.1
[Qemu-devel] [PATCH for 2.11 v2 0/2] Xilinx ZCU102 fixes for 2.11
These are two small fixes for 2.11. V2: - Update qemu-doc.texi Alistair Francis (2): xlnx-zynqmp: Properly support the smp command line option xlnx-zcu102: Add an info message deprecating the EP108 hw/arm/xlnx-zcu102.c | 6 +- hw/arm/xlnx-zynqmp.c | 26 -- qemu-doc.texi| 7 +++ 3 files changed, 28 insertions(+), 11 deletions(-) -- 2.14.1
Re: [Qemu-devel] [Qemu devel PATCH] MAINTAINERS: Add entries for Smartfusion2
Hi Subbaraya, On 11/09/2017 09:02 AM, Subbaraya Sundeep wrote: > add voluntarily myself as maintainer for Smartfusion2 You need to get your GnuPG key signed, I couldn't find it using http://pgp.mit.edu/pks/lookup?search=Subbaraya+Sundeep from https://wiki.qemu.org/Contribute/SubmitAPullRequest : All pull requests must be signed. If your key is not already signed by members of the QEMU community, you should make arrangements to attend a KeySigningParty (for example at KVM Forum) or make alternative arrangements to have your key signed by an attendee. Key signing requires meeting another community member *in person* so please make appropriate arrangements. IMHO it would be great if you found some time to review some patches (qemu-...@nongnu.org is not that verbose): https://wiki.qemu.org/Contribute/SubmitAPatch#Return_the_favor > > Signed-off-by: Subbaraya Sundeep > --- > MAINTAINERS | 17 + > 1 file changed, 17 insertions(+) > > diff --git a/MAINTAINERS b/MAINTAINERS > index 0cd4d02..dae08bd 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -564,6 +564,23 @@ M: Alistair Francis > S: Maintained > F: hw/arm/netduino2.c > > +MSF2 SoC SmartFusion2 > +M: Subbaraya Sundeep > +S: Maintained > +F: hw/arm/msf2-soc.c > +F: hw/misc/msf2-sysreg.c > +F: hw/timer/mss-timer.c > +F: hw/ssi/mss-spi.c > +F: include/hw/arm/msf2-soc.h > +F: include/hw/misc/msf2-sysreg.h > +F: include/hw/timer/mss-timer.h > +F: include/hw/ssi/mss-spi.h > + Emcraft M2S-FG484 > +MSF2 SOM board > +M: Subbaraya Sundeep > +S: Maintained > +F: hw/arm/msf2-som.c > + > CRIS Machines > - > Axis Dev88 Regards, Phil.
Re: [Qemu-devel] [Qemu-block] [PATCH] blockdev-backup: enable non-root nodes for backup
On 11/09/2017 09:16 AM, Vladimir Sementsov-Ogievskiy wrote: > What was the reason to abandon non-root nodes? Eric had it correct: we were never convinced it would work properly, so we went with a smaller set.
Re: [Qemu-devel] [PATCH for-2.11] block: Keep strong reference when draining all BDS
On 11/09/2017 02:43 PM, Max Reitz wrote: > Draining a BDS may lead to graph modifications, which in turn may result > in it and other BDS being stripped of their current references. If > bdrv_drain_all_begin() and bdrv_drain_all_end() do not keep strong > references themselves, the BDS they are trying to drain (or undrain) may > disappear right under their feet -- or, more specifically, under the > feet of BDRV_POLL_WHILE() in bdrv_drain_recurse(). > > This fixes an occasional hang of iotest 194. > > Signed-off-by: Max Reitz > --- > block/io.c | 47 --- > 1 file changed, 44 insertions(+), 3 deletions(-) > + > +/* Keep a strong reference to all root BDS and copy them into > + * an own list because draining them may lead to graph 'an own' sounds awkward; maybe 'copy them into a local list' > + * modifications. */ > +bdrv_ref(bs); > +bs_list = g_slist_prepend(bs_list, bs); > } > void bdrv_drain_all_end(void) > { > BlockDriverState *bs; > BdrvNextIterator it; > +GSList *bs_list = NULL, *bs_list_entry; > + > +/* Must be called from the main loop */ > +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); > > +/* Keep a strong reference to all root BDS and copy them into an > + * own list because draining them may lead to graph modifications. And again. With that tweak, Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH v2 4/5] iotests: Make 083 less flaky
On 2017-11-09 21:58, Eric Blake wrote: > On 11/09/2017 02:30 PM, Max Reitz wrote: >> 083 has (at least) two issues: >> >> 1. By launching the nbd-fault-injector in background, it may not be >>scheduled until the first grep on its output file is executed. >>However, until then, that file may not have been created yet -- so it >>either does not exist yet (thus making the grep emit an error), or it >>does exist but contains stale data (thus making the rest of the test >>case connect to a wrong address). >>Fix this by explicitly overwriting the output file before executing >>nbd-fault-injector. >> >> 2. The nbd-fault-injector prints things other than "Listening on...". >>It also prints a "Closing connection" message from time to time. We >>currently invoke sed on the whole file in the hope of it only >>containing the "Listening on..." line yet. That hope is sometimes >>shattered by the brutal reality of race conditions, so invoke grep >>before sed. > > Comment is now stale; s/invoke grep before sed/make the sed script more > robust/ *sigh* It appears my hope of easily fixing patches is often shattered by the brutal reality, too. > Reviewed-by: Eric Blake Thanks! I'll fix the sentence when applying the series. Max
Re: [Qemu-devel] [PATCH v2 4/5] iotests: Make 083 less flaky
On 11/09/2017 02:30 PM, Max Reitz wrote: > 083 has (at least) two issues: > > 1. By launching the nbd-fault-injector in background, it may not be >scheduled until the first grep on its output file is executed. >However, until then, that file may not have been created yet -- so it >either does not exist yet (thus making the grep emit an error), or it >does exist but contains stale data (thus making the rest of the test >case connect to a wrong address). >Fix this by explicitly overwriting the output file before executing >nbd-fault-injector. > > 2. The nbd-fault-injector prints things other than "Listening on...". >It also prints a "Closing connection" message from time to time. We >currently invoke sed on the whole file in the hope of it only >containing the "Listening on..." line yet. That hope is sometimes >shattered by the brutal reality of race conditions, so invoke grep >before sed. Comment is now stale; s/invoke grep before sed/make the sed script more robust/ Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
[Qemu-devel] [PATCH for-2.11] block: Keep strong reference when draining all BDS
Draining a BDS may lead to graph modifications, which in turn may result in it and other BDS being stripped of their current references. If bdrv_drain_all_begin() and bdrv_drain_all_end() do not keep strong references themselves, the BDS they are trying to drain (or undrain) may disappear right under their feet -- or, more specifically, under the feet of BDRV_POLL_WHILE() in bdrv_drain_recurse(). This fixes an occasional hang of iotest 194. Signed-off-by: Max Reitz--- block/io.c | 47 --- 1 file changed, 44 insertions(+), 3 deletions(-) diff --git a/block/io.c b/block/io.c index 3d5ef2cabe..a0a2833e8e 100644 --- a/block/io.c +++ b/block/io.c @@ -340,7 +340,10 @@ void bdrv_drain_all_begin(void) bool waited = true; BlockDriverState *bs; BdrvNextIterator it; -GSList *aio_ctxs = NULL, *ctx; +GSList *aio_ctxs = NULL, *ctx, *bs_list = NULL, *bs_list_entry; + +/* Must be called from the main loop */ +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); block_job_pause_all(); @@ -355,6 +358,12 @@ void bdrv_drain_all_begin(void) if (!g_slist_find(aio_ctxs, aio_context)) { aio_ctxs = g_slist_prepend(aio_ctxs, aio_context); } + +/* Keep a strong reference to all root BDS and copy them into + * an own list because draining them may lead to graph + * modifications. 
*/ +bdrv_ref(bs); +bs_list = g_slist_prepend(bs_list, bs); } /* Note that completion of an asynchronous I/O operation can trigger any @@ -370,7 +379,11 @@ void bdrv_drain_all_begin(void) AioContext *aio_context = ctx->data; aio_context_acquire(aio_context); -for (bs = bdrv_first(); bs; bs = bdrv_next()) { +for (bs_list_entry = bs_list; bs_list_entry; + bs_list_entry = bs_list_entry->next) +{ +bs = bs_list_entry->data; + if (aio_context == bdrv_get_aio_context(bs)) { waited |= bdrv_drain_recurse(bs, true); } @@ -379,24 +392,52 @@ void bdrv_drain_all_begin(void) } } +for (bs_list_entry = bs_list; bs_list_entry; + bs_list_entry = bs_list_entry->next) +{ +bdrv_unref(bs_list_entry->data); +} + g_slist_free(aio_ctxs); +g_slist_free(bs_list); } void bdrv_drain_all_end(void) { BlockDriverState *bs; BdrvNextIterator it; +GSList *bs_list = NULL, *bs_list_entry; + +/* Must be called from the main loop */ +assert(qemu_get_current_aio_context() == qemu_get_aio_context()); +/* Keep a strong reference to all root BDS and copy them into an + * own list because draining them may lead to graph modifications. + */ for (bs = bdrv_first(); bs; bs = bdrv_next()) { -AioContext *aio_context = bdrv_get_aio_context(bs); +bdrv_ref(bs); +bs_list = g_slist_prepend(bs_list, bs); +} + +for (bs_list_entry = bs_list; bs_list_entry; + bs_list_entry = bs_list_entry->next) +{ +AioContext *aio_context; + +bs = bs_list_entry->data; +aio_context = bdrv_get_aio_context(bs); aio_context_acquire(aio_context); aio_enable_external(aio_context); bdrv_parent_drained_end(bs); bdrv_drain_recurse(bs, false); aio_context_release(aio_context); + +bdrv_unref(bs); } +g_slist_free(bs_list); + block_job_resume_all(); } -- 2.13.6
[Qemu-devel] [PATCH v2 5/5] iotests: Make 136 less flaky
136 executes some AIO requests without a final aio_flush; then it advances the virtual clock and thus expects the last access time of the device to be less than the current time when queried (i.e. idle_time_ns to be greater than 0). However, without the aio_flush, some requests may be settled after the clock_step invocation. In that case, idle_time_ns would be 0 and the test fails. Fix this by adding an aio_flush if any AIO request other than some other aio_flush has been executed. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Stefan Hajnoczi --- tests/qemu-iotests/136 | 14 +- 1 file changed, 13 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/136 b/tests/qemu-iotests/136 index 4b994897af..88b97ea7c6 100644 --- a/tests/qemu-iotests/136 +++ b/tests/qemu-iotests/136 @@ -238,6 +238,18 @@ sector = "%d" for i in range(failed_wr_ops): ops.append("aio_write %d 512" % bad_offset) +# We need an extra aio_flush to settle all outstanding AIO +# operations before we can advance the virtual clock, so that +# the last access happens before clock_step and idle_time_ns +# will be greater than 0 +extra_flush = 0 +if rd_ops + wr_ops + invalid_rd_ops + invalid_wr_ops + \ +failed_rd_ops + failed_wr_ops > 0: +extra_flush = 1 + +if extra_flush > 0: +ops.append("aio_flush") + if failed_wr_ops > 0: highest_offset = max(highest_offset, bad_offset + 512) @@ -251,7 +263,7 @@ sector = "%d" self.total_wr_bytes += wr_ops * wr_size self.total_wr_ops += wr_ops self.total_wr_merged += wr_merged -self.total_flush_ops += flush_ops +self.total_flush_ops += flush_ops + extra_flush self.invalid_rd_ops += invalid_rd_ops self.invalid_wr_ops += invalid_wr_ops self.failed_rd_ops += failed_rd_ops -- 2.13.6
[Qemu-devel] [PATCH v2 4/5] iotests: Make 083 less flaky
083 has (at least) two issues: 1. By launching the nbd-fault-injector in background, it may not be scheduled until the first grep on its output file is executed. However, until then, that file may not have been created yet -- so it either does not exist yet (thus making the grep emit an error), or it does exist but contains stale data (thus making the rest of the test case connect to a wrong address). Fix this by explicitly overwriting the output file before executing nbd-fault-injector. 2. The nbd-fault-injector prints things other than "Listening on...". It also prints a "Closing connection" message from time to time. We currently invoke sed on the whole file in the hope of it only containing the "Listening on..." line yet. That hope is sometimes shattered by the brutal reality of race conditions, so make the sed script more robust. Signed-off-by: Max Reitz --- tests/qemu-iotests/083 | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083 index 0306f112da..3c1adbf0fb 100755 --- a/tests/qemu-iotests/083 +++ b/tests/qemu-iotests/083 @@ -86,6 +86,7 @@ EOF rm -f "$TEST_DIR/nbd.sock" +echo > "$TEST_DIR/nbd-fault-injector.out" $PYTHON nbd-fault-injector.py $extra_args "$nbd_addr" "$TEST_DIR/nbd-fault-injector.conf" >"$TEST_DIR/nbd-fault-injector.out" 2>&1 & # Wait for server to be ready @@ -94,7 +95,8 @@ EOF done # Extract the final address (port number has now been assigned in tcp case) - nbd_addr=$(sed 's/Listening on \(.*\)$/\1/' "$TEST_DIR/nbd-fault-injector.out") +nbd_addr=$(sed -n 's/^Listening on //p' \ + "$TEST_DIR/nbd-fault-injector.out") if [ "$proto" = "tcp" ]; then nbd_url="nbd+tcp://$nbd_addr/$export_name" -- 2.13.6
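Both fixes can be reproduced outside the test suite with a self-contained sketch; the background subshell below stands in for nbd-fault-injector.py, and all file names, delays, and the address are made up for illustration:

```shell
out=$(mktemp)

# Fix 1: clear the output file up front, so a stale address from a
# previous run cannot be read before the background process is scheduled.
echo > "$out"

# Stand-in for nbd-fault-injector.py: announces its address after a
# delay and also prints an unrelated message, as the real one does.
( sleep 0.2
  echo "Listening on 127.0.0.1:10810" >> "$out"
  echo "Closing connection" >> "$out" ) &

# Wait for the "server" to be ready.
while ! grep -q 'Listening on ' "$out"; do
    sleep 0.05
done

# Fix 2: with -n, sed prints only lines where the substitution matched,
# so the "Closing connection" line can no longer leak into the result.
nbd_addr=$(sed -n 's/^Listening on //p' "$out")
echo "$nbd_addr"   # prints 127.0.0.1:10810

wait
rm -f "$out"
```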
[Qemu-devel] [PATCH v2 3/5] iotests: Make 055 less flaky
First of all, test 055 does a valiant job of invoking pause_drive() sometimes, but that is worth nothing without blkdebug. So the first thing to do is to sprinkle a couple of "blkdebug::" in there -- with the exception of the transaction tests, because the blkdebug break points make the transaction QMP command hang (which is bad). In that case, we can get away with throttling the block job that it effectively is paused. Then, 055 usually does not pause the drive before starting a block job that should be cancelled. This means that the backup job might be completed already before block-job-cancel is invoked; thus making the test either fail (currently) or moot if cancel_and_wait() ignored this condition. Fix this by pausing the drive before starting the job. Signed-off-by: Max ReitzReviewed-by: Eric Blake Reviewed-by: Stefan Hajnoczi --- tests/qemu-iotests/055 | 25 - 1 file changed, 16 insertions(+), 9 deletions(-) diff --git a/tests/qemu-iotests/055 b/tests/qemu-iotests/055 index e1206caf9b..8a5d9fd269 100755 --- a/tests/qemu-iotests/055 +++ b/tests/qemu-iotests/055 @@ -48,7 +48,7 @@ class TestSingleDrive(iotests.QMPTestCase): def setUp(self): qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, str(image_len)) -self.vm = iotests.VM().add_drive(test_img) +self.vm = iotests.VM().add_drive('blkdebug::' + test_img) self.vm.add_drive(blockdev_target_img, interface="none") if iotests.qemu_default_machine == 'pc': self.vm.add_drive(None, 'media=cdrom', 'ide') @@ -65,10 +65,11 @@ class TestSingleDrive(iotests.QMPTestCase): def do_test_cancel(self, cmd, target): self.assert_no_active_block_jobs() +self.vm.pause_drive('drive0') result = self.vm.qmp(cmd, device='drive0', target=target, sync='full') self.assert_qmp(result, 'return', {}) -event = self.cancel_and_wait() +event = self.cancel_and_wait(resume=True) self.assert_qmp(event, 'data/type', 'backup') def test_cancel_drive_backup(self): @@ -166,7 +167,7 @@ class TestSetSpeed(iotests.QMPTestCase): def setUp(self): 
qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, str(image_len)) -self.vm = iotests.VM().add_drive(test_img) +self.vm = iotests.VM().add_drive('blkdebug::' + test_img) self.vm.add_drive(blockdev_target_img, interface="none") self.vm.launch() @@ -246,6 +247,8 @@ class TestSetSpeed(iotests.QMPTestCase): def test_set_speed_invalid_blockdev_backup(self): self.do_test_set_speed_invalid('blockdev-backup', 'drive1') +# Note: We cannot use pause_drive() here, or the transaction command +# would stall. Instead, we limit the block job speed here. class TestSingleTransaction(iotests.QMPTestCase): def setUp(self): qemu_img('create', '-f', iotests.imgfmt, blockdev_target_img, str(image_len)) @@ -271,7 +274,8 @@ class TestSingleTransaction(iotests.QMPTestCase): 'type': cmd, 'data': { 'device': 'drive0', 'target': target, - 'sync': 'full' }, + 'sync': 'full', + 'speed': 64 * 1024 }, } ]) @@ -289,12 +293,12 @@ class TestSingleTransaction(iotests.QMPTestCase): def do_test_pause(self, cmd, target, image): self.assert_no_active_block_jobs() -self.vm.pause_drive('drive0') result = self.vm.qmp('transaction', actions=[{ 'type': cmd, 'data': { 'device': 'drive0', 'target': target, - 'sync': 'full' }, + 'sync': 'full', + 'speed': 64 * 1024 }, } ]) self.assert_qmp(result, 'return', {}) @@ -302,7 +306,9 @@ class TestSingleTransaction(iotests.QMPTestCase): result = self.vm.qmp('block-job-pause', device='drive0') self.assert_qmp(result, 'return', {}) -self.vm.resume_drive('drive0') +result = self.vm.qmp('block-job-set-speed', device='drive0', speed=0) +self.assert_qmp(result, 'return', {}) + self.pause_job('drive0') result = self.vm.qmp('query-block-jobs') @@ -461,7 +467,7 @@ class TestDriveCompression(iotests.QMPTestCase): pass def do_prepare_drives(self, fmt, args, attach_target): -self.vm = iotests.VM().add_drive(test_img) +self.vm = iotests.VM().add_drive('blkdebug::' + test_img) qemu_img('create', '-f', fmt, blockdev_target_img, str(TestDriveCompression.image_len), *args) @@ 
-500,10 +506,11 @@ class TestDriveCompression(iotests.QMPTestCase): self.assert_no_active_block_jobs() +self.vm.pause_drive('drive0') result =
[Qemu-devel] [PATCH v2 2/5] iotests: Add missing 'blkdebug::' in 040
040 tries to invoke pause_drive() on a drive that does not use blkdebug. Good idea, but let's use blkdebug to make it actually work. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Stefan Hajnoczi --- tests/qemu-iotests/040 | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/qemu-iotests/040 b/tests/qemu-iotests/040 index c284d08796..90b5b4f2ad 100755 --- a/tests/qemu-iotests/040 +++ b/tests/qemu-iotests/040 @@ -289,7 +289,7 @@ class TestSetSpeed(ImageCommitTestCase): qemu_img('create', '-f', iotests.imgfmt, '-o', 'backing_file=%s' % mid_img, test_img) qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0x1 0 512', test_img) qemu_io('-f', iotests.imgfmt, '-c', 'write -P 0xef 524288 524288', mid_img) -self.vm = iotests.VM().add_drive(test_img) +self.vm = iotests.VM().add_drive('blkdebug::' + test_img) self.vm.launch() def tearDown(self): -- 2.13.6
[Qemu-devel] [PATCH v2 1/5] iotests: Make 030 less flaky
This patch fixes two race conditions in 030: 1. The first is in TestENOSPC.test_enospc(). After resuming the job, querying it to confirm it is no longer paused may fail because in the meantime it might have completed already. The same was fixed in TestEIO.test_ignore() already (in commit 2c3b44da07d341557a8203cc509ea07fe3605e11). 2. The second is in TestSetSpeed.test_set_speed_invalid(): Here, a stream job is started on a drive without any break points, with a block-job-set-speed invoked subsequently. However, without any break points, the job might have completed in the meantime (on tmpfs at least); or it might complete before cancel_and_wait() which expects the job to still exist. This can be fixed like everywhere else by pausing the drive (installing break points) before starting the job and letting cancel_and_wait() resume it. Signed-off-by: Max Reitz Reviewed-by: Eric Blake Reviewed-by: Stefan Hajnoczi --- tests/qemu-iotests/030 | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030 index 18838948fa..457984b8e9 100755 --- a/tests/qemu-iotests/030 +++ b/tests/qemu-iotests/030 @@ -666,6 +666,7 @@ class TestENOSPC(TestErrors): if event['event'] == 'BLOCK_JOB_ERROR': self.assert_qmp(event, 'data/device', 'drive0') self.assert_qmp(event, 'data/operation', 'read') +error = True result = self.vm.qmp('query-block-jobs') self.assert_qmp(result, 'return[0]/paused', True) @@ -676,9 +677,11 @@ class TestENOSPC(TestErrors): self.assert_qmp(result, 'return', {}) result = self.vm.qmp('query-block-jobs') +if result == {'return': []}: +# Race; likely already finished. Check. 
+continue self.assert_qmp(result, 'return[0]/paused', False) self.assert_qmp(result, 'return[0]/io-status', 'ok') -error = True elif event['event'] == 'BLOCK_JOB_COMPLETED': self.assertTrue(error, 'job completed unexpectedly') self.assert_qmp(event, 'data/type', 'stream') @@ -792,13 +795,14 @@ class TestSetSpeed(iotests.QMPTestCase): self.assert_no_active_block_jobs() +self.vm.pause_drive('drive0') result = self.vm.qmp('block-stream', device='drive0') self.assert_qmp(result, 'return', {}) result = self.vm.qmp('block-job-set-speed', device='drive0', speed=-1) self.assert_qmp(result, 'error/class', 'GenericError') -self.cancel_and_wait() +self.cancel_and_wait(resume=True) if __name__ == '__main__': iotests.main(supported_fmts=['qcow2', 'qed']) -- 2.13.6
[Qemu-devel] [PATCH v2 0/5] iotests: Make some tests less flaky
There are a couple of tests that fail (on my machine) from time to time (and by that I mean that recently I've rarely ever had a test run with both 083 and 136 working on first try). This series should fix most (at least the issues I am aware of). Notes: - 083 might have another issue, but if so it occurs extremely rarely and so I was unable to debug it. - 129 is flaky, too, because it tries to use block jobs with BB throttling -- however, block jobs ignore that these days. Making it use a throttle filter node will require quite a bit of work. See http://lists.nongnu.org/archive/html/qemu-block/2017-11/msg00111.html for more. - 194 sometimes hangs because the source VM fails to drain its drive. This is probably not an issue with the test, but actually an issue in qemu. See http://lists.nongnu.org/archive/html/qemu-block/2017-11/msg00256.html for more. v2: - Spelling fixes [Eric] - Contract grep | sed to a single sed in 083 [Eric] git-backport-diff against v1: Key: [] : patches are identical [] : number of functional differences between upstream/downstream patch [down] : patch is downstream-only The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively 001/5:[] [--] 'iotests: Make 030 less flaky' 002/5:[] [--] 'iotests: Add missing 'blkdebug::' in 040' 003/5:[] [--] 'iotests: Make 055 less flaky' 004/5:[0003] [FC] 'iotests: Make 083 less flaky' 005/5:[] [--] 'iotests: Make 136 less flaky' Max Reitz (5): iotests: Make 030 less flaky iotests: Add missing 'blkdebug::' in 040 iotests: Make 055 less flaky iotests: Make 083 less flaky iotests: Make 136 less flaky tests/qemu-iotests/030 | 8 ++-- tests/qemu-iotests/040 | 2 +- tests/qemu-iotests/055 | 25 - tests/qemu-iotests/083 | 4 +++- tests/qemu-iotests/136 | 14 +- 5 files changed, 39 insertions(+), 14 deletions(-) -- 2.13.6
Re: [Qemu-devel] [PATCH 4/5] iotests: Make 083 less flaky
On 2017-11-09 15:11, Eric Blake wrote: > On 11/08/2017 07:38 PM, Max Reitz wrote: >> 083 has (at least) two issues: > > I think I hit one of them intermittently yesterday; thanks for > diagnosing these (and like you say, there may be more lurking, but we'll > whack them separately if we can reproduce and identify them). > >> >> 1. By launching the nbd-fault-injector in background, it may not be >>scheduled until the first grep on its output file is executed. >>However, until then, that file may not have been created yet -- so it >>either does not exist yet (thus making the grep emit an error), or it >>does exist but contains stale data (thus making the rest of the test >>case work connect to a wrong address). >>Fix this by explicitly overwriting the output file before executing >>nbd-fault-injector. >> >> 2. The nbd-fault-injector prints things other than "Listening on...". >>It also prints a "Closing connection" message from time to time. We >>currently invoke sed on the whole file in the hope of it only >>containing the "Listening on..." line yet. That hope is sometimes >>shattered by the brutal reality of race conditions, so invoke grep >>before sed. > > Invoking 'grep | sed' is almost always a waste of a process; sed can do > the job alone. > >> >> Signed-off-by: Max Reitz>> --- >> tests/qemu-iotests/083 | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/tests/qemu-iotests/083 b/tests/qemu-iotests/083 >> index 0306f112da..2f6444eeb9 100755 >> --- a/tests/qemu-iotests/083 >> +++ b/tests/qemu-iotests/083 >> @@ -86,6 +86,7 @@ EOF >> >> rm -f "$TEST_DIR/nbd.sock" >> >> +echo > "$TEST_DIR/nbd-fault-injector.out" > > This makes the file contain a blank line. Would it be any better as a > truly empty file, as in: > > : > "$TEST_DIR/nbd-fault-injector.out" Although ":>" looks funny, I guess I'd rather stay with the echo... Yes, it may look stupid to you, but I would have to look up what exactly that will do, whereas the echo is clear. 
And in the end, both work equally fine. >> $PYTHON nbd-fault-injector.py $extra_args "$nbd_addr" >> "$TEST_DIR/nbd-fault-injector.conf" >"$TEST_DIR/nbd-fault-injector.out" 2>&1 >> & >> >> # Wait for server to be ready >> @@ -94,7 +95,7 @@ EOF >> done >> >> # Extract the final address (port number has now been assigned in tcp >> case) >> -nbd_addr=$(sed 's/Listening on \(.*\)$/\1/' >> "$TEST_DIR/nbd-fault-injector.out") >> +nbd_addr=$(grep 'Listening on ' "$TEST_DIR/nbd-fault-injector.out" >> | sed 's/Listening on \(.*\)$/\1/') > > Fixing TAB damage while at it - nice. > > Here's how to do it using just sed, and with less typing: > > nbd_addr=$(sed -n 's/^Listening on //p' \ >"$TEST_DIR/nbd-fault-injector.out") Oh, nice! Will do, thanks. Max
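The two variants discussed above can be compared directly; the log contents below are invented for the demonstration, and both commands extract the same address:

```shell
log=$(mktemp)
printf 'Listening on 127.0.0.1:10810\nClosing connection\n' > "$log"

# v1 approach: grep selects the line, sed strips the prefix (two processes).
addr_grep_sed=$(grep 'Listening on ' "$log" | sed 's/Listening on \(.*\)$/\1/')

# Eric's suggestion: -n suppresses default output, and the trailing p
# flag prints only lines where the substitution succeeded (one process).
addr_sed_only=$(sed -n 's/^Listening on //p' "$log")

echo "$addr_grep_sed"   # 127.0.0.1:10810
echo "$addr_sed_only"   # 127.0.0.1:10810
rm -f "$log"
```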
Re: [Qemu-devel] [RFC v2 3/6] possible_cpus: add CPUArchId::type field
On Thu, Nov 09, 2017 at 05:58:03PM +1100, David Gibson wrote: > On Tue, Nov 07, 2017 at 04:04:04PM +0100, Cornelia Huck wrote: > > On Mon, 6 Nov 2017 16:02:16 -0200 > > Eduardo Habkostwrote: > > > > > On Tue, Oct 31, 2017 at 03:01:14PM +0100, Igor Mammedov wrote: > > > > On Thu, 19 Oct 2017 17:31:51 +1100 > > > > David Gibson wrote: > > > > > > > > > On Wed, Oct 18, 2017 at 01:12:12PM +0200, Igor Mammedov wrote: > > > > > > For enabling early cpu to numa node configuration at runtime > > > > > > qmp_query_hotpluggable_cpus() should provide a list of available > > > > > > cpu slots at early stage, before machine_init() is called and > > > > > > the 1st cpu is created, so that mgmt might be able to call it > > > > > > and use output to set numa mapping. > > > > > > Use MachineClass::possible_cpu_arch_ids() callback to set > > > > > > cpu type info, along with the rest of possible cpu properties, > > > > > > to let machine define which cpu type* will be used. > > > > > > > > > > > > * for SPAPR it will be a spapr core type and for ARM/s390x/x86 > > > > > > a respective descendant of CPUClass. > > > > > > > > > > > > Move parse_numa_opts() in vl.c after cpu_model is parsed into > > > > > > cpu_type so that possible_cpu_arch_ids() would know which > > > > > > cpu_type to use during layout initialization. 
> > > > > > > > > > > > Signed-off-by: Igor Mammedov > > > > > > > > > > Reviewed-by: David Gibson > > > > > > > > > > > --- > > > > > > v2: > > > > > > - fix NULL dereference caused by not initialized > > > > > >MachineState::cpu_type at the time parse_numa_opts() > > > > > >were called > > > > > > --- > > > > > > include/hw/boards.h| 2 ++ > > > > > > hw/arm/virt.c | 3 ++- > > > > > > hw/core/machine.c | 12 ++-- > > > > > > hw/i386/pc.c | 4 +++- > > > > > > hw/ppc/spapr.c | 13 - > > > > > > hw/s390x/s390-virtio-ccw.c | 1 + > > > > > > vl.c | 3 +-- > > > > > > 7 files changed, 23 insertions(+), 15 deletions(-) > > > > > > > > > > > > diff --git a/include/hw/boards.h b/include/hw/boards.h > > > > > > index 191a5b3..fa21758 100644 > > > > > > --- a/include/hw/boards.h > > > > > > +++ b/include/hw/boards.h > > > > > > @@ -80,6 +80,7 @@ void machine_set_cpu_numa_node(MachineState > > > > > > *machine, > > > > > > * CPUArchId: > > > > > > * @arch_id - architecture-dependent CPU ID of present or possible > > > > > > CPU > > > > > > > > > > I know this isn't really in scope for this patch, but is @arch_id here > > > > > supposed to have meaning defined by the target, or by the machine? > > > > > > > > > > If it's the machime, it could do with a rename - "arch" means target > > > > > to most people (thanks to Linux). > > > > > > > > > > If it's the target, it's kind of bogus, because it doesn't necessarily > > > > > have a clear meaning per target - get_arch_id in CPUClass has the same > > > > > problem, which is probably one reason it's basically only used by the > > > > > x86 code at present. > > > > > > > > > > e.g. for target/ppc, what do we use? There's the PIR, which is in the > > > > > CPU.. but only on some cpu models, not all. There will generally be > > > > > some kind of master PIC id, but there are different PIC models on > > > > > different boards. What goes in the devicetree? 
Well only some > > > > > machines use devicetree, and they might define the cpu reg > > > > > differently. > > > > > > > > > > Board designs will generally try to make some if not all of those > > > > > possible values equal for simplicity, but there's still no real way of > > > > > defining a sensible arch_id independent of machine / board. > > > > I'd say arch_id is machine specific so far, it was introduced when we > > > > didn't have CpuInstanceProperties and at that time we considered only > > > > vcpus (threads) and doesn't really apply to spapr cores. > > > > > > > > In general we could do away with arch_id and use CpuInstanceProperties > > > > instead, but arch_id also serves aux purpose, it allows machine to > > > > pre-calculate(cache) apic-id/mpidr values in one place and then they > > > > are/(could be) used by arch in-depended code to build acpi tables. > > > > So if we drop arch_id we would need to introduce a machine hook, > > > > which would translate CpuInstanceProperties into current arch_id. > > > > > > I think we need to do a better to job documenting where exactly > > > we expect arch_id to be used and how, so people know what it's > > > supposed to return. > > > > > > If the only place where it's useful now is ACPI code (is it?), > > > should we rename it to something like get_acpi_id()? > > > > It is also used in hw/s390x/sclp.c to fill out a control block, so acpi > > isn't the only user. > > Yeah.. this is kind of bogus. The
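The machine hook Igor mentions, translating CpuInstanceProperties into the current arch_id, can be illustrated with a toy model. This is only a sketch with hypothetical names: real x86 code computes the APIC ID in include/hw/i386/topology.h and additionally rounds core/thread counts up to powers of two, which this sketch skips.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified CpuInstanceProperties: just the topology coordinates. */
typedef struct {
    unsigned socket, core, thread;
} CpuProps;

/* Hypothetical machine hook: translate topology coordinates into a flat
 * arch_id.  Real x86 APIC-ID packing pads cores/threads to powers of
 * two; this sketch uses a plain mixed-radix encoding instead. */
static uint64_t cpu_props_to_arch_id(CpuProps p,
                                     unsigned cores_per_socket,
                                     unsigned threads_per_core)
{
    return ((uint64_t)p.socket * cores_per_socket + p.core)
               * threads_per_core + p.thread;
}
```

With such a hook, the machine could cache the result once per possible CPU slot, which is the auxiliary purpose arch_id serves today.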
Re: [Qemu-devel] [Qemu devel PATCH] MAINTAINERS: Add entries for Smartfusion2
On Thu, Nov 9, 2017 at 4:02 AM, Subbaraya Sundeep wrote: > add voluntarily myself as maintainer for Smartfusion2 This should be: Voluntarily add myself as maintainer for Smartfusion2. > > Signed-off-by: Subbaraya Sundeep With the commit message fixed: Reviewed-by: Alistair Francis Alistair > --- > MAINTAINERS | 17 + > 1 file changed, 17 insertions(+) > > diff --git a/MAINTAINERS b/MAINTAINERS > index 0cd4d02..dae08bd 100644 > --- a/MAINTAINERS > +++ b/MAINTAINERS > @@ -564,6 +564,23 @@ M: Alistair Francis > S: Maintained > F: hw/arm/netduino2.c > > +MSF2 SoC > +M: Subbaraya Sundeep > +S: Maintained > +F: hw/arm/msf2-soc.c > +F: hw/misc/msf2-sysreg.c > +F: hw/timer/mss-timer.c > +F: hw/ssi/mss-spi.c > +F: include/hw/arm/msf2-soc.h > +F: include/hw/misc/msf2-sysreg.h > +F: include/hw/timer/mss-timer.h > +F: include/hw/ssi/mss-spi.h > + > +MSF2 SOM board > +M: Subbaraya Sundeep > +S: Maintained > +F: hw/arm/msf2-som.c > + > CRIS Machines > - > Axis Dev88 > -- > 2.5.0 >
Re: [Qemu-devel] [PATCH] smbios: support setting OEM strings table
On Thu, Nov 09, 2017 at 02:40:05PM +, Daniel P. Berrange wrote: > Ping, any thoughts on this ? (not expecting it in this 2.11 release of course) I queued this up. > On Sat, Oct 28, 2017 at 09:51:36PM +0100, Daniel P. Berrange wrote: > > The cloud-init program currently allows fetching of its data by repurposing > > of > > the 'system' type 'serial' field. This is a clear abuse of the serial field > > that > > would clash with other valid usage a virt management app might have for that > > field. > > > > Fortunately the SMBIOS defines an "OEM Strings" table whose purpose is to > > allow > > exposing of arbitrary vendor specific strings to the operating system. This > > is > > perfect for use with cloud-init, or as a way to pass arguments to OS > > installers > > such as anaconda. > > > > This patch makes it easier to support this with QEMU. e.g. > > > > $QEMU -smbios type=11,value=Hello,value=World,value=Tricky,,value=test > > > > Which results in the guest seeing dmidecode data > > > > Handle 0x0E00, DMI type 11, 5 bytes > > OEM Strings > > String 1: Hello > > String 2: World > > String 3: Tricky,value=test > > > > It is suggested that any app wanting to make use of this OEM strings > > capability > > for accepting data from the host mgmt layer should use its name as a string > > prefix. e.g. to expose OEM strings targeting both cloud init and anaconda > > in > > parallel the mgmt app could set > > > > $QEMU -smbios > > type=11,value=cloud-init:ds=nocloud-net;s=http://10.10.0.1:8000/,\ > > > > value=anaconda:method=http://dl.fedoraproject.org/pub/fedora/linux/releases/25/x86_64/os > > > > which would appear as > > > > Handle 0x0E00, DMI type 11, 5 bytes > > OEM Strings > > String 1: cloud-init:ds=nocloud-net;s=http://10.10.0.1:8000/ > > String 2: > > anaconda:method=http://dl.fedoraproject.org/pub/fedora/linux/releases/25/x86_64/os > > > > Use of such string prefixes means the app won't have to care which string > > slot > > its data appears in. 
> > > > Signed-off-by: Daniel P. Berrange> > --- > > hw/smbios/smbios.c | 72 > > ++ > > hw/smbios/smbios_build.h | 12 > > include/hw/smbios/smbios.h | 6 > > 3 files changed, 90 insertions(+) > > > > diff --git a/hw/smbios/smbios.c b/hw/smbios/smbios.c > > index 1a5437a07d..5d11f01874 100644 > > --- a/hw/smbios/smbios.c > > +++ b/hw/smbios/smbios.c > > @@ -96,6 +96,11 @@ static struct { > > } type4; > > > > static struct { > > +size_t nvalues; > > +const char **values; > > +} type11; > > + > > +static struct { > > const char *loc_pfx, *bank, *manufacturer, *serial, *asset, *part; > > uint16_t speed; > > } type17; > > @@ -282,6 +287,14 @@ static const QemuOptDesc qemu_smbios_type4_opts[] = { > > { /* end of list */ } > > }; > > > > +static const QemuOptDesc qemu_smbios_type11_opts[] = { > > +{ > > +.name = "value", > > +.type = QEMU_OPT_STRING, > > +.help = "OEM string data", > > +}, > > +}; > > + > > static const QemuOptDesc qemu_smbios_type17_opts[] = { > > { > > .name = "type", > > @@ -590,6 +603,27 @@ static void smbios_build_type_4_table(unsigned > > instance) > > smbios_type4_count++; > > } > > > > +static void smbios_build_type_11_table(void) > > +{ > > +char count_str[128]; > > +size_t i; > > + > > +if (type11.nvalues == 0) { > > +return; > > +} > > + > > +SMBIOS_BUILD_TABLE_PRE(11, 0xe00, true); /* required */ > > + > > +snprintf(count_str, sizeof(count_str), "%zu", type11.nvalues); > > +t->count = type11.nvalues; > > + > > +for (i = 0; i < type11.nvalues; i++) { > > +SMBIOS_TABLE_SET_STR_LIST(11, type11.values[i]); > > +} > > + > > +SMBIOS_BUILD_TABLE_POST; > > +} > > + > > #define ONE_KB ((ram_addr_t)1 << 10) > > #define ONE_MB ((ram_addr_t)1 << 20) > > #define ONE_GB ((ram_addr_t)1 << 30) > > @@ -832,6 +866,8 @@ void smbios_get_tables(const struct > > smbios_phys_mem_area *mem_array, > > smbios_build_type_4_table(i); > > } > > > > +smbios_build_type_11_table(); > > + > > #define MAX_DIMM_SZ (16ll * ONE_GB) > > #define GET_DIMM_SZ ((i < dimm_cnt - 1) ? 
MAX_DIMM_SZ \ > > : ((ram_size - 1) % MAX_DIMM_SZ) + > > 1) > > @@ -882,6 +918,38 @@ static void save_opt(const char **dest, QemuOpts > > *opts, const char *name) > > } > > } > > > > + > > +struct opt_list { > > +const char *name; > > +size_t *ndest; > > +const char ***dest; > > +}; > > + > > +static int save_opt_one(void *opaque, > > +const char *name, const char *value, > > +Error **errp) > > +{ > > +struct opt_list *opt = opaque; > > + > > +if (!g_str_equal(name, opt->name)) { > > +return 0; > > +} > > +
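The `Tricky,,value=test` example in the commit message above relies on QEMU's option-string escaping, where a doubled comma stands for a literal comma inside a value and a single comma separates fields. A minimal standalone sketch of that rule (hypothetical helper name; QEMU's real implementation is get_opt_value() in util/qemu-option.c):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Read one option field into 'out', treating ",," as a literal comma
 * and a lone ',' as the field separator.  Returns a pointer to the
 * start of the next field (or the trailing NUL). */
static const char *read_opt_field(const char *in, char *out, size_t outsz)
{
    size_t n = 0;

    while (*in) {
        if (*in == ',') {
            if (in[1] == ',') {
                in++;              /* ",," -> keep one literal ',' */
            } else {
                in++;              /* field separator: stop here */
                break;
            }
        }
        if (n + 1 < outsz) {
            out[n++] = *in;
        }
        in++;
    }
    out[n] = '\0';
    return in;
}
```

This is why `value=Tricky,,value=test` produced the single string `Tricky,value=test` in the dmidecode output rather than two values.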
Re: [Qemu-devel] [PATCH V3] hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
On Thu, Nov 09, 2017 at 12:02:10PM +0100, Laszlo Ersek wrote: [...] > (3) An idea for the property defaults: you remove > DEFAULT_PCI_HOLE64_SIZE, which is cool. How about introducing (in the > proper header files) > > #define DEFAULT_I440FX_PCI_HOLE64_SIZE (1ULL << 31) > #define DEFAULT_Q35_PCI_HOLE64_SIZE (1ULL << 35) > > The main reason for my suggestion is that (1ULL << 35) is used twice in > the code, for an obscure qdev/qom reason. The comments definitely help, > so keep them, but a greppable macro would help even more, IMO. > > And, once we add DEFAULT_Q35_PCI_HOLE64_SIZE, we should add > DEFAULT_I440FX_PCI_HOLE64_SIZE too, for consistency. Agreed, especially considering how the code that initializes the defaults is non-obvious. > > Anyway, I'll leave this up to you as well. If we do that, I recommend adding a bigger warning to q35_host_props. The one in this patch is very easy to miss if people try to touch the defaults for other mch.* properties in the future. I suggest something like: /* * NOTE: setting defaults for the mch.* fields in this table * don't work, because mch is a separate QOM object that is * zeroed by the object_initialize(&s->mch, ...) call inside * q35_host_initfn(). The default values for those * properties need to be initialized manually by * q35_host_initfn() after the object_initialize() call. */ > > > (4) I also have a suggestion for the commit message: please move the > paragraph that starts with > > "Even if the new defaults..." > > from the v2->v3 changelog to the commit message proper. IMO it is > important information. > > > With (4) addressed: > > Reviewed-by: Laszlo Ersek > > Thanks! > Laszlo -- Eduardo
Re: [Qemu-devel] [Qemu-ppc] [PATCH for-2.12 v2 00/12] spapr: introduce an IRQ allocator at the machine level
David, On 11/09/2017 10:14 AM, Cédric Le Goater wrote: > Hello, > > Currently, the ICSState 'ics' object of the sPAPR machine acts as the > global interrupt source handler and also as the IRQ number allocator > for the machine. Some IRQ numbers are allocated very early in the > machine initialization sequence to populate the device tree, and this > is a problem to introduce the new POWER XIVE interrupt model, as it > needs to share the IRQ numbers with the older model. > > To prepare ground for XIVE, here is a proposal adding a set of new > XICSFabric operations to let the machine handle directly the IRQ > number allocation and to decorrelate the allocation from the interrupt > source object : > > bool (*irq_test)(XICSFabric *xi, int irq); > int (*irq_alloc_block)(XICSFabric *xi, int count, int align); > void (*irq_free_block)(XICSFabric *xi, int irq, int num); > bool (*irq_is_lsi)(XICSFabric *xi, int irq); > > In these prototypes, the 'irq' parameter refers to a number in the > global IRQ number space. > > On the sPAPR platform, these operations are simply backed by a bitmap > and to handle migration compatibility, we introduce a machine class > flag 'has_irq_bitmap', which is set to false for older machines. Talking with Greg, I have found a better solution to handle migration compatibility which is to do like the phb_placement. That is to use two sets of XICS IRQ operations : one for the latest pseries machine using the bitmap and another one for the older pseries machine using the ICSIRQState. It impacts mostly patch 5. This solution is cleaner and I will send a v3 in my morning unless you start reviewing and have some obvious comments in the v2 series. Thanks, C. > To completely remove the use of the ICSState object (required to > introduce XIVE), we also need to change how the nature of an > interrupt, MSI or LSI, is stored. Today, this is done using the flag > attribute of the ICSIRQState array. 
We change that by splitting the > IRQ number space of the machine in two: first the LSIs and then the > MSIs. This has the benefit to keep the LSI IRQ numbers in a well known > range which is useful for PHB hotplug. > > The git repo for this pachset can be found here along with the latest > XIVE model: > > https://github.com/legoater/qemu/commits/xive > > Thanks, > > C. > > Tests : > > - make check on each patch > - migration : > qemu-2.12 (pseries-2.12) <-> qemu-2.12 (pseries-2.12) > qemu-2.12 (pseries-2.10) <-> qemu-2.12 (pseries-2.10) > qemu-2.10 (pseries-2.10) <-> qemu-2.12 (pseries-2.10) > > Changes since v1 : > > - reorganised patchset to introduce the XICSFabric operations before >the major changes: bitmap and IRQ number space split > - introduced a reference bitmap to save some state in migration > > Cédric Le Goater (12): > spapr: add pseries 2.12 machine type > ppc/xics: remove useless if condition > spapr: introduce new XICSFabric operations for an IRQ allocator > spapr: move current IRQ allocation under the machine > spapr: introduce an IRQ allocator using a bitmap > spapr: store a reference IRQ bitmap > spapr: introduce an 'irq_base' number > spapr: remove the use of ics_valid_irq() > spapr: introduce a XICSFabric is_lsi() operation > spapr: split the IRQ number space for LSI interrupts > sparp: merge ics_set_irq_type() in irq_alloc_block() operation > spapr: use sPAPRMachineState in spapr_ics_ prototypes > > hw/intc/trace-events | 2 - > hw/intc/xics.c | 37 ++- > hw/intc/xics_kvm.c | 4 +- > hw/intc/xics_spapr.c | 76 - > hw/ppc/pnv.c | 34 ++ > hw/ppc/pnv_psi.c | 4 -- > hw/ppc/spapr.c | 174 > - > hw/ppc/spapr_events.c | 4 +- > hw/ppc/spapr_pci.c | 8 +-- > hw/ppc/spapr_vio.c | 2 +- > hw/ppc/trace-events| 2 + > include/hw/ppc/spapr.h | 6 ++ > include/hw/ppc/xics.h | 20 -- > 13 files changed, 267 insertions(+), 106 deletions(-) >
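The proposed irq_alloc_block()/irq_free_block() operations backed by a machine-level bitmap can be sketched as follows. This toy version scans a plain bool array; an actual implementation would presumably use QEMU's bitmap helpers (e.g. bitmap_find_next_zero_area()), and the names below only loosely mirror the prototypes in the cover letter.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_IRQS 64

static bool irq_claimed[NR_IRQS];

/* Find 'count' consecutive free IRQ numbers starting at a multiple of
 * 'align', claim them, and return the first number, or -1 on failure. */
static int irq_alloc_block(int count, int align)
{
    for (int base = 0; base + count <= NR_IRQS; base += align) {
        int i;
        for (i = 0; i < count && !irq_claimed[base + i]; i++) {
            ;
        }
        if (i == count) {            /* whole block free: claim it */
            for (int j = 0; j < count; j++) {
                irq_claimed[base + j] = true;
            }
            return base;
        }
    }
    return -1;                       /* no free aligned block */
}

static void irq_free_block(int irq, int num)
{
    for (int j = 0; j < num; j++) {
        irq_claimed[irq + j] = false;
    }
}
```

Freed blocks become reusable, which is exactly why the series needs a reference bitmap to keep migration of older machine types stable.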
Re: [Qemu-devel] Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage migration)
On 2017-11-09 16:30, Fam Zheng wrote: > On Thu, 11/09 16:14, Max Reitz wrote: >> On 2017-11-09 05:21, Fam Zheng wrote: >>> On Thu, 11/09 01:48, Max Reitz wrote: Hi, More exciting news from the bdrv_drain() front! I've noticed in the past that iotest 194 sometimes hangs. I usually run the tests on tmpfs, but I've just now verified that it happens on my SSD just as well. So the reproducer is a plain: while ./check -raw 194; do; done >>> >>> I cannot produce it on my machine. >> >> Hm, too bad. I see it both on my work laptop (with Fedora) and my >> desktop (with Arch)... >> (No difference between raw or qcow2, though.) And then, after a couple of runs (or a couple ten), it will just hang. The reason is that the source VM lingers around and doesn't quit voluntarily -- the test itself was successful, but it just can't exit. If you force it to exit by killing the VM (e.g. through pkill -11 qemu), this is the backtrace: #0 0x7f7cfc297e06 in ppoll () at /lib64/libc.so.6 #1 0x563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=, __fds=) at /usr/include/bits/poll2.h:77 >>> >>> Looking at the 0 timeout it seems we are in the aio_poll(ctx, >>> blocking=false); >>> branches of BDRV_POLL_WHILE? Is it a busy loop? If so I wonder what is >>> making >>> progress and causing the return value to be true in aio_poll(). 

>>> #2 0x563b846bcac9 in qemu_poll_ns (fds=, nfds=, timeout=) at util/qemu-timer.c:322 #3 0x563b846be711 in aio_poll (ctx=ctx@entry=0x563b856e3e80, blocking=) at util/aio-posix.c:629 #4 0x563b8463afa4 in bdrv_drain_recurse (bs=bs@entry=0x563b865568a0, begin=begin@entry=true) at block/io.c:201 #5 0x563b8463baff in bdrv_drain_all_begin () at block/io.c:381 #6 0x563b8463bc99 in bdrv_drain_all () at block/io.c:411 #7 0x563b8459888b in block_migration_cleanup (opaque=<optimized out>) at migration/block.c:714 #8 0x563b845883be in qemu_savevm_state_cleanup () at migration/savevm.c:1251 #9 0x563b845811fd in migration_thread (opaque=0x563b856f1da0) at migration/migration.c:2298 #10 0x7f7cfc56f36d in start_thread () at /lib64/libpthread.so.0 #11 0x7f7cfc2a3e1f in clone () at /lib64/libc.so.6 And when you make bdrv_drain_all_begin() print what we are trying to drain, you can see that it's the format node (managed by the "raw" driver in this case). >>> >>> So what is the value of bs->in_flight? >> >> gdb: >>> (gdb) print bs->in_flight >>> $3 = 2307492233 >> >> "That's weird, why would it..."
>> >>> (gdb) print *bs >>> $4 = {open_flags = -1202160144, read_only = 161, encrypted = 85, sg = >>> false, probed = false, force_share = 96, implicit = 159, drv = 0x0, opaque >>> = 0x0, aio_context = 0x8989898989898989, aio_notifiers = {lh_first = >>> 0x8989898989898989}, >>> walking_aio_notifiers = 137, filename = '\211' , >>> backing_file = '\211' , backing_format = '\211' >>> , full_open_options = 0x8989898989898989, >>> exact_filename = '\211' , backing = >>> 0x8989898989898989, file = 0x8989898989898989, bl = {request_alignment = >>> 2307492233, max_pdiscard = -1987475063, pdiscard_alignment = 2307492233, >>> max_pwrite_zeroes = -1987475063, pwrite_zeroes_alignment = 2307492233, >>> opt_transfer = 2307492233, max_transfer = 2307492233, min_mem_alignment = >>> 9910603678816504201, opt_mem_alignment = 9910603678816504201, max_iov = >>> -1987475063}, >>> supported_write_flags = 2307492233, supported_zero_flags = 2307492233, >>> node_name = '\211' , node_list = {tqe_next = >>> 0x8989898989898989, tqe_prev = 0x8989898989898989}, bs_list = {tqe_next = >>> 0x8989898989898989, >>> tqe_prev = 0x8989898989898989}, monitor_list = {tqe_next = >>> 0x8989898989898989, tqe_prev = 0x8989898989898989}, refcnt = -1987475063, >>> op_blockers = {{lh_first = 0x8989898989898989} }, job = >>> 0x8989898989898989, >>> inherits_from = 0x8989898989898989, children = {lh_first = >>> 0x8989898989898989}, parents = {lh_first = 0x8989898989898989}, options = >>> 0x8989898989898989, explicit_options = 0x8989898989898989, detect_zeroes = >>> 2307492233, >>> backing_blocker = 0x8989898989898989, total_sectors = >>> -8536140394893047415, before_write_notifiers = {notifiers = {lh_first = >>> 0x8989898989898989}}, write_threshold_offset = 9910603678816504201, >>> write_threshold_notifier = {notify = >>> 0x8989898989898989, node = {le_next = 0x8989898989898989, le_prev = >>> 0x8989898989898989}}, dirty_bitmap_mutex = {lock = {__data = {__lock = >>> -1987475063, __count = 2307492233, __owner = 
-1987475063, __nusers = >>> 2307492233, >>> __kind = -1987475063, __spins = -30327, __elision = -30327, __list >>> = {__prev = 0x8989898989898989, __next = 0x8989898989898989}}, __size = >>> '\211' , __align = -8536140394893047415},
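For reference, `2307492233` is 0x89898989, and every field in the dumped BDS repeats the byte 0x89 — the classic signature of reading memory that was freed and poison-filled, i.e. a use-after-free of the BlockDriverState. A tiny sketch of the poison-on-free debugging technique that produces such a pattern (illustrative only; nothing here is QEMU's actual allocator behavior):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POISON_BYTE 0x89

/* Stand-in for a few BlockDriverState fields. */
typedef struct {
    unsigned in_flight;
    int refcnt;
} FakeBDS;

/* Debug-free that poison-fills the object before releasing it, so a
 * later use-after-free shows up as an unmistakable repeating pattern
 * instead of silently reading stale-but-plausible values.  A real
 * version would call free(p) afterwards; it is omitted here so the
 * test below can legally inspect the poison. */
static void poison_free(void *p, size_t sz)
{
    memset(p, POISON_BYTE, sz);
}
```

Filling a 32-bit field with 0x89 bytes yields exactly the 2307492233 seen in the gdb session above.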
Re: [Qemu-devel] [PATCH 6/7] s390x/pci: move the memory region write from pcistg
On Tue, 7 Nov 2017 18:24:38 +0100 Pierre Morelwrote: > Let's move the memory region write from pcistg into a dedicated > function. > This allows us to prepare a later patch searching for subregions > inside of the memory region. OK, so here is the memory region write. Do we have any sleeping endianness bugs in there for when we wire up tcg? I'm not sure how this plays with the bswaps (see patch 1). But maybe I've just gotten lost somewhere. > > Signed-off-by: Pierre Morel > Reviewed-by: Yi Min Zhao > --- > hw/s390x/s390-pci-inst.c | 27 +-- > 1 file changed, 17 insertions(+), 10 deletions(-) > > diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > index 50135a0..97f62b5 100644 > --- a/hw/s390x/s390-pci-inst.c > +++ b/hw/s390x/s390-pci-inst.c > @@ -455,12 +455,27 @@ static int trap_msix(S390PCIBusDevice *pbdev, uint64_t > offset, uint8_t pcias) > } > } > > +static MemTxResult zpci_write_bar(S390PCIBusDevice *pbdev, uint8_t pcias, > + uint64_t offset, uint64_t data, uint8_t > len) > +{ > +MemoryRegion *mr; > + > +if (trap_msix(pbdev, offset, pcias)) { > +offset = offset - pbdev->msix.table_offset; > +mr = >pdev->msix_table_mmio; > +} else { > +mr = pbdev->pdev->io_regions[pcias].memory; > +} > + > +return memory_region_dispatch_write(mr, offset, data, len, > +MEMTXATTRS_UNSPECIFIED); > +} > + > int pcistg_service_call(S390CPU *cpu, uint8_t r1, uint8_t r2) > { > CPUS390XState *env = >env; > uint64_t offset, data; > S390PCIBusDevice *pbdev; > -MemoryRegion *mr; > MemTxResult result; > uint8_t len; > uint32_t fh; > @@ -517,15 +532,7 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > return 0; > } > > -if (trap_msix(pbdev, offset, pcias)) { > -offset = offset - pbdev->msix.table_offset; > -mr = >pdev->msix_table_mmio; > -} else { > -mr = pbdev->pdev->io_regions[pcias].memory; > -} > - > -result = memory_region_dispatch_write(mr, offset, data, len, > - MEMTXATTRS_UNSPECIFIED); > +result = zpci_write_bar(pbdev, pcias, offset, data, len); > if 
(result != MEMTX_OK) { > program_interrupt(env, PGM_OPERAND, 4); > return 0;
Re: [Qemu-devel] [PATCH 1/7] s390x/pci: factor out endianess conversion
On Thu, 9 Nov 2017 15:55:46 -0300 Philippe Mathieu-Daudéwrote: > On 11/09/2017 01:38 PM, Cornelia Huck wrote: > > On Tue, 7 Nov 2017 18:24:33 +0100 > > Pierre Morel wrote: > > > >> There are two places where the same endianness conversion > >> is done. > >> Let's factor this out into a static function. > >> > >> Signed-off-by: Pierre Morel > >> Reviewed-by: Yi Min Zhao > >> --- > >> hw/s390x/s390-pci-inst.c | 58 > >> ++-- > >> 1 file changed, 32 insertions(+), 26 deletions(-) > >> > >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > >> index 8e088f3..8fcb02d 100644 > >> --- a/hw/s390x/s390-pci-inst.c > >> +++ b/hw/s390x/s390-pci-inst.c > >> @@ -314,6 +314,35 @@ out: > >> return 0; > >> } > >> > >> +/** > >> + * This function swaps the data at ptr according from one > >> + * endianness to the other. > >> + * valid data in the uint64_t data field. > > > > I'm not sure what that line is supposed to mean? > > > >> + * @ptr: a pointer to a uint64_t data field > >> + * @len: the length of the valid data, must be 1,2,4 or 8 > >> + */ > >> +static int zpci_endian_swap(uint64_t *ptr, uint8_t len) > >> +{ > >> +uint64_t data = *ptr; > >> +switch (len) { > >> +case 1: > >> +break; > >> +case 2: > >> +data = bswap16(data); > >> +break; > >> +case 4: > >> +data = bswap32(data); > >> +break; > >> +case 8: > >> +data = bswap64(data); > >> +break; > >> +default: > >> +return -EINVAL; > >> +} > >> +*ptr = data; > >> +return 0; > >> +} > > This is usually care taken by memory::adjust_endianness() ... Yes, but that's not a memory region write. > > > I was expecting more code to use a similar pattern, but it seems > > surprisingly uncommon. > > Which ring a bell for latent bug? Looking at this, it seems there *is* a latent bug, which has not popped up so far as the pci instructions are not wired up in tcg yet. This code is only called from the kvm path... 
> > This remind me of a similar issue on ppc: > > http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05121.html > ... > http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05666.html
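The zpci_endian_swap() helper under discussion is easy to reproduce standalone. A sketch using GCC/Clang builtins in place of QEMU's bswap.h macros — behaviorally the same switch as in the patch:

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Byte-swap the low 'len' bytes of *ptr; reject invalid lengths.
 * Mirrors the patch's helper, with __builtin_bswap* standing in for
 * QEMU's bswap16/32/64. */
static int zpci_endian_swap(uint64_t *ptr, uint8_t len)
{
    uint64_t data = *ptr;

    switch (len) {
    case 1:
        break;
    case 2:
        data = __builtin_bswap16(data);
        break;
    case 4:
        data = __builtin_bswap32(data);
        break;
    case 8:
        data = __builtin_bswap64(data);
        break;
    default:
        return -EINVAL;
    }
    *ptr = data;
    return 0;
}
```

Note that on the invalid-length path the value is left untouched, which is why the callers can program-check without cleanup.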
[Qemu-devel] KVM Forum BoF session notes: QEMU configuration/command-line/QMP
Hi, Below are the notes I collected during the BoF session about QEMU configuration, command-line and QMP. Sorry for taking so long to send them. All inaccuracies and mistakes below are my own fault. (I learned the hard way that trying to take notes while participating in the discussion at the same time is not a good idea.) KVM Forum 2017 BoF session: configuration, command-line and QMP 2017-10-26 People: Jiri Denemark, Laine Stump, Peter Krempa, Markus Armbruster, Eduardo Habkost, Igor Mammedov, Kevin Wolf, Eduardo Otubo, Alistair Francis, Kashyap Chamarthy, Max Reitz First topic: early QMP and -paused series by Igor ehabkost tried to explain the original problem and Igor's proposal[1]. Markus: startup time is important, some things can make it slow. Personally, not so much concerned about querying stuff. Just cache it! Markus, about starting a monitor earlier: "it just makes sense". One problem is: command-line processing ordering mess in QEMU. It's tempting but very scary. very easy to screw up. About command-line ordering, “I'd just go left-to-right”. ehabkost: is gradual change with NUMA stuff the right path? (Markus: "what's the right path?"; ehabkost: "I don't know"). If you are using CPU hotplug you still might have some stuff not appearing in the command-line, because it uses query-hotpluggable-cpus. pkrempa: if you're using cpu hotplug you don't care about startup time. NUMA is not different. Markus's objection to current -paused series: the need for another option/state. We basically have 3 states/phases: 1: between exec and monitor available 2: between monitor and machines start to runs (in that phase a device_add is/can be a cold plug) 3: then the machine actually runs, and all device_add is hotplug ehabkost: that's how I understand it, too. ehabkost: question for libvirt developers: is there an "expansion" operation where libvirt could fill-up missing data on the XML based on info queried from QEMU? 
(Answer seems to be yes) Some comments from Igor and Peter about query-hotpluggable-cpus and the need to choose the right device type (sometimes TYPE_CPU is not appropriate for device_add, but only CPU cores, like on Power). ehabkost: that's probably another use case for query-device-slots: knowing that power machines can't device_add threads, but just cores. Tricky part: providing slot information at qemu-probing time (with -machine none), before any machine was built. Some discussion about QEMU capability caching, why we need it, what we need to get rid of it (answer: early QMP?). Markus: would --query-FOO for some FOOs help? libvirt developers seem to prefer early QMP. Now that libvirt queries QEMU for host CPUID capabilities, there are more interesting ways the cache may need to be invalidated. Markus: “there's a bit of a combinatorial explosion, i.e. query results can depend on so much more than just the QEMU version and machine type. Makes invalidating the cache tricky, and possibly even finding something useful in it insufficiently likely to be worth the trouble.” Some discussion about libvirt/qemu versions compatibility. Markus proposes that we could choose to make newer QEMU require a more recent libvirt version. OpenStack Nova re-evaluates every 6 months what next minimum libvirt and QEMU versions to use (via constants: NEXT_MIN_LIBVIRT_VERSION, NEXT_MIN_QEMU_VERSION) Markus: “The version decoupling we have is really nice, but I think an occasional dependency would be tolerable. Consider it when it saves lots of work.” Markus: “Also: drop support for old versions of QEMU. 
Anything before 1.5?” References and notes: [1] References for “Early QMP”: * Igor’s proposal at: Subject: [Qemu-devel] [RFC 0/6] enable numa configuration before machine_init() from HMP/QMP https://www.mail-archive.com/qemu-devel@nongnu.org/msg488601.html * And what Eduardo has suggested at: http://www.linux-kvm.org/images/4/46/03x06A-Eduardo_HabkostMachine-type_Introspection_and_Configuration_Where_Are_We_Going.pdf Comments from Markus: > Related: "Why /two/ config interfaces?" in my KVM Forum talk. The two > being CLI and QMP. > > We've long held the idea to support starting QEMU with a minimal command > line, then do all configuration via QMP. Not possible today, because > lots of things can only be done via CLI. In part because QMP becomes > available only after quite a few config boats have sailed. > > Making QMP available earlier during startup ("early QMP") would be a > stepping stone towards the "only QMP" goal. > > Related: rebasing CLI onto QAPI would make replicating CLI-only > interfaces in QMP easier. -- Eduardo
Re: [Qemu-devel] [PATCH 1/7] s390x/pci: factor out endianess conversion
On 11/09/2017 01:38 PM, Cornelia Huck wrote: > On Tue, 7 Nov 2017 18:24:33 +0100 > Pierre Morelwrote: > >> There are two places where the same endianness conversion >> is done. >> Let's factor this out into a static function. >> >> Signed-off-by: Pierre Morel >> Reviewed-by: Yi Min Zhao >> --- >> hw/s390x/s390-pci-inst.c | 58 >> ++-- >> 1 file changed, 32 insertions(+), 26 deletions(-) >> >> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c >> index 8e088f3..8fcb02d 100644 >> --- a/hw/s390x/s390-pci-inst.c >> +++ b/hw/s390x/s390-pci-inst.c >> @@ -314,6 +314,35 @@ out: >> return 0; >> } >> >> +/** >> + * This function swaps the data at ptr according from one >> + * endianness to the other. >> + * valid data in the uint64_t data field. > > I'm not sure what that line is supposed to mean? > >> + * @ptr: a pointer to a uint64_t data field >> + * @len: the length of the valid data, must be 1,2,4 or 8 >> + */ >> +static int zpci_endian_swap(uint64_t *ptr, uint8_t len) >> +{ >> +uint64_t data = *ptr; >> +switch (len) { >> +case 1: >> +break; >> +case 2: >> +data = bswap16(data); >> +break; >> +case 4: >> +data = bswap32(data); >> +break; >> +case 8: >> +data = bswap64(data); >> +break; >> +default: >> +return -EINVAL; >> +} >> +*ptr = data; >> +return 0; >> +} This is usually care taken by memory::adjust_endianness() ... > I was expecting more code to use a similar pattern, but it seems > surprisingly uncommon. Which ring a bell for latent bug? This remind me of a similar issue on ppc: http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05121.html ... http://lists.nongnu.org/archive/html/qemu-devel/2017-02/msg05666.html
Re: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Tao Wu (吴涛@Eng), on Thu, 09 Nov 2017 10:48:27 -0800, wrote: > Thanks. Actually this is a follow up with my previous effort to fix this bug. > I was busy on something else and then got lost in that old thread. Now I just > checked some my local patch > to see if they've merged to upstream and then found it out. > > This is old thread about this: > [1]http://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg05544.html I knew I had seen that proposal somewhere before :) Thanks for the patch, Samuel
Re: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Thanks. Actually this is a follow up with my previous effort to fix this bug. I was busy on something else and then got lost in that old thread. Now I just checked some my local patch to see if they've merged to upstream and then found it out. This is old thread about this: http://lists.nongnu.org/archive/html/qemu-devel/2017-04/msg05544.html On Thu, Nov 9, 2017 at 2:50 AM, Marc-André Lureau < marcandre.lur...@gmail.com> wrote: > Hi > > Adding Guillaume in CC, who wrote that line in commit 98c63057d2144 > > On Wed, Nov 8, 2017 at 11:53 PM, Tao Wu via Qemu-devel wrote: > > The current code looks buggy, we zero ti_i while we access > > ti_dst/ti_src later. > > Could you describe the symptoms and why you fixed it that way? > > thanks > > > > > Signed-off-by: Tao Wu > > --- > > slirp/tcp_subr.c | 11 ++- > > 1 file changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/slirp/tcp_subr.c b/slirp/tcp_subr.c > > index dc8b4bbb50..da0d53743f 100644 > > --- a/slirp/tcp_subr.c > > +++ b/slirp/tcp_subr.c > > @@ -148,7 +148,16 @@ tcp_respond(struct tcpcb *tp, struct tcpiphdr *ti, struct mbuf *m, > > m->m_data += IF_MAXLINKHDR; > > *mtod(m, struct tcpiphdr *) = *ti; > > ti = mtod(m, struct tcpiphdr *); > > - memset(&ti->ti, 0, sizeof(ti->ti)); > > + switch (af) { > > + case AF_INET: > > + ti->ti.ti_i4.ih_x1 = 0; > > + break; > > + case AF_INET6: > > + ti->ti.ti_i6.ih_x1 = 0; > > + break; > > + default: > > + g_assert_not_reached(); > > + } > > flags = TH_ACK; > > } else { > > /* > > -- > > 2.15.0.448.gf294e3d99a-goog > > > > > > > > -- > Marc-André Lureau >
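The bug being fixed: the old memset() cleared the whole ti header overlay, wiping the source/destination addresses that tcp_respond() still reads when building the reply, while the patch zeroes only the ih_x1 pad field of the active address family. A toy model of why the narrower clear matters (the field layout here is illustrative, not slirp's actual struct):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for slirp's IP header overlay: ih_x1 is padding that
 * must be zeroed before checksumming, but ih_src/ih_dst must survive
 * because the response is built from them afterwards. */
struct toy_ipovly {
    uint8_t  ih_x1[9];      /* padding: safe to zero */
    uint32_t ih_src;        /* still needed later */
    uint32_t ih_dst;        /* still needed later */
};

/* The fixed behavior: clear just the pad bytes. */
static void zero_padding_only(struct toy_ipovly *ih)
{
    memset(ih->ih_x1, 0, sizeof(ih->ih_x1));
}

/* The buggy behavior: clear the whole overlay, losing the addresses. */
static void zero_whole_header(struct toy_ipovly *ih)
{
    memset(ih, 0, sizeof(*ih));
}
```

The test below shows the addresses surviving the narrow clear and being destroyed by the wide one, which is the symptom the patch removes.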
Re: [Qemu-devel] [PATCH] virtio-gpu: fix bug in host memory calculation.
Thanks. Sent out a new version adding comments to say that we rely on pixman create_bits to fail. On Thu, Nov 9, 2017 at 2:34 AM, Marc-André Lureau wrote: > Hi > > - Original Message - >> The old code treats bits as bytes when calculating host memory usage. >> Change it to be consistent with allocation logic in pixman library. >> > > Good catch > >> Signed-off-by: Tao Wu >> --- >> hw/display/virtio-gpu.c | 13 +++-- >> 1 file changed, 11 insertions(+), 2 deletions(-) >> >> diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c >> index 43bbe09ea0..428786f291 100644 >> --- a/hw/display/virtio-gpu.c >> +++ b/hw/display/virtio-gpu.c >> @@ -322,6 +322,15 @@ static pixman_format_code_t get_pixman_format(uint32_t >> virtio_gpu_format) >> } >> } >> >> +static uint32_t calc_image_hostmem(pixman_format_code_t pformat, >> + uint32_t width, uint32_t height) >> +{ >> +/* copied from pixman/pixman-bits-image.c, skip integer overflow check. >> */ > > So we rely on pixman create_bits() to fail if overflow happened? perhaps it's > worth a comment. 
> >> +int bpp = PIXMAN_FORMAT_BPP(pformat); >> +int stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t); >> +return height * stride; >> +} >> + >> static void virtio_gpu_resource_create_2d(VirtIOGPU *g, >>struct virtio_gpu_ctrl_command >>*cmd) >> { >> @@ -366,7 +375,7 @@ static void virtio_gpu_resource_create_2d(VirtIOGPU *g, >> return; >> } >> >> -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * c2d.width * c2d.height; >> +res->hostmem = calc_image_hostmem(pformat, c2d.width, c2d.height); >> if (res->hostmem + g->hostmem < g->conf.max_hostmem) { >> res->image = pixman_image_create_bits(pformat, >>c2d.width, >> @@ -1087,7 +1096,7 @@ static int virtio_gpu_load(QEMUFile *f, void *opaque, >> size_t size, >> return -EINVAL; >> } >> >> -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * res->width * >> res->height; >> +res->hostmem = calc_image_hostmem(pformat, res->width, res->height); >> >> res->addrs = g_new(uint64_t, res->iov_cnt); >> res->iov = g_new(struct iovec, res->iov_cnt); >> -- >> 2.15.0.403.gc27cc4dac6-goog >> > > looks good otherwise, > > Reviewed-by: Marc-André Lureau > >>
Re: [Qemu-devel] [PATCH] virtio-gpu: fix bug in host memory calculation.
On Thu, Nov 9, 2017 at 7:17 PM, Tao Wu via Qemu-devel wrote: > The old code treats bits as bytes when calculating host memory usage. > Change it to be consistent with allocation logic in pixman library. > > Signed-off-by: Tao Wu Reviewed-by: Marc-André Lureau > --- > hw/display/virtio-gpu.c | 16 ++-- > 1 file changed, 14 insertions(+), 2 deletions(-) > > diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c > index 43bbe09ea0..274e365713 100644 > --- a/hw/display/virtio-gpu.c > +++ b/hw/display/virtio-gpu.c > @@ -322,6 +322,18 @@ static pixman_format_code_t get_pixman_format(uint32_t > virtio_gpu_format) > } > } > > +static uint32_t calc_image_hostmem(pixman_format_code_t pformat, > + uint32_t width, uint32_t height) > +{ > +/* Copied from pixman/pixman-bits-image.c, skip integer overflow check. > + * pixman_image_create_bits will fail in case it overflows. > + */ > + > +int bpp = PIXMAN_FORMAT_BPP(pformat); > +int stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t); > +return height * stride; > +} > + > static void virtio_gpu_resource_create_2d(VirtIOGPU *g, >struct virtio_gpu_ctrl_command > *cmd) > { > @@ -366,7 +378,7 @@ static void virtio_gpu_resource_create_2d(VirtIOGPU *g, > return; > } > > -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * c2d.width * c2d.height; > +res->hostmem = calc_image_hostmem(pformat, c2d.width, c2d.height); > if (res->hostmem + g->hostmem < g->conf.max_hostmem) { > res->image = pixman_image_create_bits(pformat, >c2d.width, > @@ -1087,7 +1099,7 @@ static int virtio_gpu_load(QEMUFile *f, void *opaque, > size_t size, > return -EINVAL; > } > > -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * res->width * res->height; > +res->hostmem = calc_image_hostmem(pformat, res->width, res->height); > > res->addrs = g_new(uint64_t, res->iov_cnt); > res->iov = g_new(struct iovec, res->iov_cnt); > -- > 2.15.0.448.gf294e3d99a-goog > > -- Marc-André Lureau
[Qemu-devel] [PATCH] virtio-gpu: fix bug in host memory calculation.
The old code treats bits as bytes when calculating host memory usage. Change it to be consistent with allocation logic in pixman library. Signed-off-by: Tao Wu --- hw/display/virtio-gpu.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/hw/display/virtio-gpu.c b/hw/display/virtio-gpu.c index 43bbe09ea0..274e365713 100644 --- a/hw/display/virtio-gpu.c +++ b/hw/display/virtio-gpu.c @@ -322,6 +322,18 @@ static pixman_format_code_t get_pixman_format(uint32_t virtio_gpu_format) } } +static uint32_t calc_image_hostmem(pixman_format_code_t pformat, + uint32_t width, uint32_t height) +{ +/* Copied from pixman/pixman-bits-image.c, skip integer overflow check. + * pixman_image_create_bits will fail in case it overflows. + */ + +int bpp = PIXMAN_FORMAT_BPP(pformat); +int stride = ((width * bpp + 0x1f) >> 5) * sizeof(uint32_t); +return height * stride; +} + static void virtio_gpu_resource_create_2d(VirtIOGPU *g, struct virtio_gpu_ctrl_command *cmd) { @@ -366,7 +378,7 @@ static void virtio_gpu_resource_create_2d(VirtIOGPU *g, return; } -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * c2d.width * c2d.height; +res->hostmem = calc_image_hostmem(pformat, c2d.width, c2d.height); if (res->hostmem + g->hostmem < g->conf.max_hostmem) { res->image = pixman_image_create_bits(pformat, c2d.width, @@ -1087,7 +1099,7 @@ static int virtio_gpu_load(QEMUFile *f, void *opaque, size_t size, return -EINVAL; } -res->hostmem = PIXMAN_FORMAT_BPP(pformat) * res->width * res->height; +res->hostmem = calc_image_hostmem(pformat, res->width, res->height); res->addrs = g_new(uint64_t, res->iov_cnt); res->iov = g_new(struct iovec, res->iov_cnt); -- 2.15.0.448.gf294e3d99a-goog
[Qemu-devel] [PULL 0/1] slirp update
warning: redirecting to https://people.debian.org/~sthibault/qemu.git/ The following changes since commit b0fbe46ad82982b289a44ee2495b59b0bad8a842: Update version for v2.11.0-rc0 release (2017-11-07 16:05:28 +) are available in the git repository at: http://people.debian.org/~sthibault/qemu.git tags/samuel-thibault for you to fetch changes up to 990132cda9baa27bdc558df6c9c15aacb0322d2c: slirp: don't zero the whole ti_i when m == NULL (2017-11-09 18:59:22 +0100) slirp updates Tao Wu (1): slirp: don't zero the whole ti_i when m == NULL This fixes the qemu slirp behavior in some rare cases (some RST cases, keep-alive probes) where packets would be sent to 0.0.0.0. slirp/tcp_subr.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-)
[Qemu-devel] [PULL 1/1] slirp: don't zero the whole ti_i when m == NULL
From: Tao Wu 98c63057d2144fb81681580cd84c13c93794c96e ('slirp: Factorizing tcpiphdr structure with an union') introduced a memset call to clear possibly-undefined fields in ti. This however overwrites src/dst/pr which are used below. So let us clear only the unused fields. This should fix some rare cases (some RST cases, keep-alive probes) where packets would be sent to 0.0.0.0. Signed-off-by: Tao Wu Signed-off-by: Samuel Thibault --- slirp/tcp_subr.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/slirp/tcp_subr.c b/slirp/tcp_subr.c index dc8b4bbb50..da0d53743f 100644 --- a/slirp/tcp_subr.c +++ b/slirp/tcp_subr.c @@ -148,7 +148,16 @@ tcp_respond(struct tcpcb *tp, struct tcpiphdr *ti, struct mbuf *m, m->m_data += IF_MAXLINKHDR; *mtod(m, struct tcpiphdr *) = *ti; ti = mtod(m, struct tcpiphdr *); - memset(&ti->ti, 0, sizeof(ti->ti)); + switch (af) { + case AF_INET: + ti->ti.ti_i4.ih_x1 = 0; + break; + case AF_INET6: + ti->ti.ti_i6.ih_x1 = 0; + break; + default: + g_assert_not_reached(); + } flags = TH_ACK; } else { /* -- 2.14.2
Re: [Qemu-devel] [PATCH] slirp: don't zero ti_i since we access it later.
Hello, Tao Wu, on Wed, 08 Nov 2017 14:53:40 -0800, wrote: > The current code looks buggy, we zero ti_i while we access > ti_dst/ti_src later. Mmm, indeed, looking again at how it was introduced, it was too much. Samuel
Re: [Qemu-devel] [PATCH] blockdev-backup: enable non-root nodes for backup
On 09.11.2017 at 17:33, Eric Blake wrote: > On 11/09/2017 08:16 AM, Vladimir Sementsov-Ogievskiy wrote: > > This is needed to implement image-fleecing scheme, when we create > > a temporary node, mark our active node to be backing for the temp, > > and start backup(sync=none) from active node to the temp node. > > Temp node then represents a kind of snapshot and may be used > > for external backup through NBD. > > > > Signed-off-by: Vladimir Sementsov-Ogievskiy > > --- > > > > What was the reason to abandon non-root nodes? > > I think the original restriction was that we didn't know all the > implications of having multiple readers to an intermediate node, so it > was easier to prevent it with plans to add it later than to add it up > front and deal with the fallout. But I think that now we are > sufficiently versed in handling BDS trees with multiple readers, with > proper op-blocking in place; so you are right that we can probably > support it now. Op blockers are actually the reason why I'm not so sure. The old blockers often still only work on the top level, and we haven't fully replaced them yet. In particular, graph modifications might not be protected well enough by the new system yet. But then, backup works only on a single node in the source tree, not on a whole subchain, so I don't see what other operation could conflict with a backup job. The only thing is resizes, but that is covered by the new system. So in the end, I think this specific change should be okay. > However, I'm a bit worried that there is no documentation change about > this in a .json QAPI file, nor any easy way to introspect via QMP > whether a particular qemu implementation supports this (short of trying > it and seeing whether it works). I'm also thinking that this is 2.12 > material, unless we can prove it is fixing a bug for 2.11 that was not > previously present. 
Sounds like another use case for the capability annotations that Markus wants to introduce for commands in the QAPI schema. Kevin signature.asc Description: PGP signature
[Qemu-devel] [PATCH V4] hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
Currently there is no MMIO range over 4G reserved for PCI hotplug. Since the 32bit PCI hole depends on the number of cold-plugged PCI devices and other factors, it is very possible it is too small to hotplug PCI devices with large BARs. Fix it by reserving 2G for the i440FX chipset in order to comply with older Win32 Guest OSes and 32G for the Q35 chipset. Even if the new defaults of pci-hole64-size will appear in "info qtree" also for older machines, the property was not implemented so no changes will be visible to guests. Note this is a regression since previous QEMU versions had some range reserved for 64bit PCI hotplug. Reviewed-by: Laszlo Ersek Reviewed-by: Gerd Hoffmann Signed-off-by: Marcel Apfelbaum --- V3 -> V4: - Addressed Laszlo's comments: - Added defines for pci-hole64 default size props. - Rounded the hole64_end to 1G - Moved some info to commit message - Addressed Michael's comments: - Added more comments. - I kept Gerd's "review-by" tag since no functional changes were made. V2 -> V3: - Addressed Gerd's and others' comments and re-enabled the pci-hole64-size property defaulting it to 2G for I440FX and 32G for Q35. - Even if the new defaults of pci-hole64-size will appear in "info qtree" also for older machines, the property was not implemented so no changes will be visible to guests. 
V1 -> V2: Addressed Igor's comments: - aligned the hole64 start to 1Gb (I think all the computations took care of it already, but it can't hurt) - Init compat props to "off" instead of "false" hw/i386/pc.c | 22 ++ hw/pci-host/piix.c| 32 ++-- hw/pci-host/q35.c | 35 --- include/hw/i386/pc.h | 10 +- include/hw/pci-host/q35.h | 1 + 5 files changed, 94 insertions(+), 6 deletions(-) diff --git a/hw/i386/pc.c b/hw/i386/pc.c index e11a65b545..fafe5ba5cd 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -1448,6 +1448,28 @@ void pc_memory_init(PCMachineState *pcms, pcms->ioapic_as = &address_space_memory; } +/* + * The 64bit pci hole starts after "above 4G RAM" and + * potentially the space reserved for memory hotplug. + */ +uint64_t pc_pci_hole64_start(void) +{ +PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); +PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms); +uint64_t hole64_start = 0; + +if (pcmc->has_reserved_memory && pcms->hotplug_memory.base) { +hole64_start = pcms->hotplug_memory.base; +if (!pcmc->broken_reserved_end) { +hole64_start += memory_region_size(&pcms->hotplug_memory.mr); +} +} else { +hole64_start = 0x100000000ULL + pcms->above_4g_mem_size; +} + +return ROUND_UP(hole64_start, 1ULL << 30); +} + qemu_irq pc_allocate_cpu_irq(void) { return qemu_allocate_irq(pic_irq_request, NULL, 0); diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c index a7e2256870..f689c31d12 100644 --- a/hw/pci-host/piix.c +++ b/hw/pci-host/piix.c @@ -50,6 +50,7 @@ typedef struct I440FXState { PCIHostState parent_obj; Range pci_hole; uint64_t pci_hole64_size; +bool pci_hole64_fix; uint32_t short_root_bus; } I440FXState; @@ -112,6 +113,9 @@ struct PCII440FXState { #define I440FX_PAM_SIZE 7 #define I440FX_SMRAM0x72 +/* Keep it 2G to comply with older win32 guests */ +#define I440FX_PCI_HOST_HOLE64_SIZE_DEFAULT (1ULL << 31) + /* Older coreboot versions (4.0 and older) read a config register that doesn't * exist in real hardware, to get the RAM size from QEMU. 
*/ @@ -238,29 +242,52 @@ static void i440fx_pcihost_get_pci_hole_end(Object *obj, Visitor *v, visit_type_uint32(v, name, &value, errp); } +/* + * The 64bit PCI hole start is set by the Guest firmware + * as the address of the first 64bit PCI MEM resource. + * If no PCI device has resources on the 64bit area, + * the 64bit PCI hole will start after "over 4G RAM" and the + * reserved space for memory hotplug if any. + */ static void i440fx_pcihost_get_pci_hole64_start(Object *obj, Visitor *v, const char *name, void *opaque, Error **errp) { PCIHostState *h = PCI_HOST_BRIDGE(obj); +I440FXState *s = I440FX_PCI_HOST_BRIDGE(obj); Range w64; uint64_t value; pci_bus_get_w64_range(h->bus, &w64); value = range_is_empty(&w64) ? 0 : range_lob(&w64); +if (!value && s->pci_hole64_fix) { +value = pc_pci_hole64_start(); +} visit_type_uint64(v, name, &value, errp); } +/* + * The 64bit PCI hole end is set by the Guest firmware + * as the address of the last 64bit PCI MEM resource. + * Then it is expanded to the PCI_HOST_PROP_PCI_HOLE64_SIZE + * that can be configured by the user. + */ static void i440fx_pcihost_get_pci_hole64_end(Object *obj, Visitor *v, const char
Re: [Qemu-devel] [Qemu-block] [PATCH v2] throttle: fix a qemu crash problem when calling blk_delete
On Tue, Oct 24, 2017 at 11:33:51AM +0800, sochin jiang wrote: > commit 7ca7f0 moves the throttling related part of the BDS life cycle > management to BlockBackend, adds call to > throttle_timers_detach_aio_context in blk_remove_bs. commit 1606e > removes a block device from its throttle group in blk_delete by calling > blk_io_limits_disable, this fixes an easily reproducible qemu crash. But > deleting a BB without a BDS inserted could easily cause a qemu crash too > by calling bdrv_drained_begin in blk_io_limits_disable. Say, a simple > drive_add and then a drive_del command. > > This patch removes draining BDS by calling throttle_group_unregister_tgm > directly instead of blk_io_limits_disable, leaves draining operation to > blk_remove_bs in case that there is no BDS inserted. Furthermore, make sure > throttle timers are initialized or attached before throttle_timers_destroy > is called in throttle_group_unregister_tgm. > > Signed-off-by: sochin jiang > --- > block/block-backend.c | 2 +- > block/throttle-groups.c | 4 +++- > 2 files changed, 4 insertions(+), 2 deletions(-) > > diff --git a/block/block-backend.c b/block/block-backend.c > index 45d9101..39c7cca 100644 > --- a/block/block-backend.c > +++ b/block/block-backend.c > @@ -341,7 +341,7 @@ static void blk_delete(BlockBackend *blk) > assert(!blk->name); > assert(!blk->dev); > if (blk->public.throttle_group_member.throttle_state) { > -blk_io_limits_disable(blk); > +throttle_group_unregister_tgm(&blk->public.throttle_group_member); The following assertions fail without the drain when there are pending requests: void throttle_group_unregister_tgm(ThrottleGroupMember *tgm) { ThrottleState *ts = tgm->throttle_state; ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts); ThrottleGroupMember *token; int i; if (!ts) { /* Discard already unregistered tgm */ return; } assert(tgm->pending_reqs[0] == 0 && tgm->pending_reqs[1] == 0); assert(qemu_co_queue_empty(&tgm->throttled_reqs[0])); assert(qemu_co_queue_empty(&tgm->throttled_reqs[1])); A 
safer approach is making blk_io_limits_disable(blk) skip the draining when blk_bs(blk) == NULL. That is the only case where we are 100% sure that there are no pending requests. signature.asc Description: PGP signature
[Qemu-devel] [PULL 8/8] nbd/server: Fix structured read of length 0
The NBD spec was recently clarified to state that a read of length 0 should not be attempted by a compliant client; but that a server must still handle it correctly in an unspecified manner (that is, either a successful no-op or an error reply, but not a crash) [1]. However, it also implies that NBD_REPLY_TYPE_OFFSET_DATA must have a non-zero payload length, but our existing code was replying with a chunk that a picky client could reject as invalid because it was missing a payload (our own client implementation was recently patched to be that picky, after first fixing it to not send 0-length requests). We are already doing successful no-ops for 0-length writes and for non-structured reads; so for consistency, we want structured reply reads to also be a no-op. The easiest way to do this is to return a NBD_REPLY_TYPE_NONE chunk; this is best done via a new helper function (especially since future patches for other structured replies may benefit from using the same helper). [1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037 Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-8-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- nbd/server.c | 21 - nbd/trace-events | 1 + 2 files changed, 21 insertions(+), 1 deletion(-) diff --git a/nbd/server.c b/nbd/server.c index 6ebb7d9c2e..df771fd42f 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1273,6 +1273,21 @@ static inline void set_be_chunk(NBDStructuredReplyChunk *chunk, uint16_t flags, stl_be_p(&chunk->length, length); } +static int coroutine_fn nbd_co_send_structured_done(NBDClient *client, +uint64_t handle, +Error **errp) +{ +NBDStructuredReplyChunk chunk; +struct iovec iov[] = { +{.iov_base = &chunk, .iov_len = sizeof(chunk)}, +}; + +trace_nbd_co_send_structured_done(handle); +set_be_chunk(&chunk, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_NONE, handle, 0); + +return nbd_co_send_iov(client, iov, 1, errp); +} + static int coroutine_fn nbd_co_send_structured_read(NBDClient *client, uint64_t handle, uint64_t offset, 
@@ -1286,6 +1301,7 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client, {.iov_base = data, .iov_len = size} }; +assert(size); trace_nbd_co_send_structured_read(handle, offset, data, size); set_be_chunk(&chunk, NBD_REPLY_FLAG_DONE, NBD_REPLY_TYPE_OFFSET_DATA, handle, sizeof(chunk) - sizeof(chunk.h) + size); @@ -1544,10 +1560,13 @@ reply: if (ret < 0) { ret = nbd_co_send_structured_error(req->client, request.handle, -ret, msg, &local_err); -} else { +} else if (reply_data_len) { ret = nbd_co_send_structured_read(req->client, request.handle, request.from, req->data, reply_data_len, &local_err); +} else { +ret = nbd_co_send_structured_done(req->client, request.handle, + &local_err); } } else { ret = nbd_co_send_simple_reply(req->client, request.handle, diff --git a/nbd/trace-events b/nbd/trace-events index bbc75f6414..92568edce5 100644 --- a/nbd/trace-events +++ b/nbd/trace-events @@ -55,6 +55,7 @@ nbd_receive_request(uint32_t magic, uint16_t flags, uint16_t type, uint64_t from nbd_blk_aio_attached(const char *name, void *ctx) "Export %s: Attaching clients to AIO context %p\n" nbd_blk_aio_detach(const char *name, void *ctx) "Export %s: Detaching clients from AIO context %p\n" nbd_co_send_simple_reply(uint64_t handle, uint32_t error, const char *errname, int len) "Send simple reply: handle = %" PRIu64 ", error = %" PRIu32 " (%s), len = %d" +nbd_co_send_structured_done(uint64_t handle) "Send structured reply done: handle = %" PRIu64 nbd_co_send_structured_read(uint64_t handle, uint64_t offset, void *data, size_t size) "Send structured read data reply: handle = %" PRIu64 ", offset = %" PRIu64 ", data = %p, len = %zu" nbd_co_send_structured_error(uint64_t handle, int err, const char *errname, const char *msg) "Send structured error reply: handle = %" PRIu64 ", error = %d (%s), msg = '%s'" nbd_co_receive_request_decode_type(uint64_t handle, uint16_t type, const char *name) "Decoding type: handle = %" PRIu64 ", type = %" PRIu16 " (%s)" -- 2.13.6
[Qemu-devel] [PULL 5/8] nbd: Fix struct name for structured reads
A closer read of the NBD spec shows that a structured reply chunk for a hole is not quite identical to the prefix of a data chunk, because the hole has to also send a 32-bit size field. Although we do not yet send holes, we should fix the misleading information in our header and make it easier for a future patch to support sparse reads. Messed up in commit bae245d1. Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-5-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- include/block/nbd.h | 18 +- nbd/server.c| 2 +- 2 files changed, 14 insertions(+), 6 deletions(-) diff --git a/include/block/nbd.h b/include/block/nbd.h index 92d1723d7c..113c707a5e 100644 --- a/include/block/nbd.h +++ b/include/block/nbd.h @@ -86,15 +86,23 @@ typedef union NBDReply { } QEMU_PACKED; } NBDReply; -/* Header of NBD_REPLY_TYPE_OFFSET_DATA, complete NBD_REPLY_TYPE_OFFSET_HOLE */ -typedef struct NBDStructuredRead { -NBDStructuredReplyChunk h; +/* Header of chunk for NBD_REPLY_TYPE_OFFSET_DATA */ +typedef struct NBDStructuredReadData { +NBDStructuredReplyChunk h; /* h.length >= 9 */ uint64_t offset; -} QEMU_PACKED NBDStructuredRead; +/* At least one byte of data payload follows, calculated from h.length */ +} QEMU_PACKED NBDStructuredReadData; + +/* Complete chunk for NBD_REPLY_TYPE_OFFSET_HOLE */ +typedef struct NBDStructuredReadHole { +NBDStructuredReplyChunk h; /* h.length == 12 */ +uint64_t offset; +uint32_t length; +} QEMU_PACKED NBDStructuredReadHole; /* Header of all NBD_REPLY_TYPE_ERROR* errors */ typedef struct NBDStructuredError { -NBDStructuredReplyChunk h; +NBDStructuredReplyChunk h; /* h.length >= 6 */ uint32_t error; uint16_t message_length; } QEMU_PACKED NBDStructuredError; diff --git a/nbd/server.c b/nbd/server.c index bcf0cdb47c..6ebb7d9c2e 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1280,7 +1280,7 @@ static int coroutine_fn nbd_co_send_structured_read(NBDClient *client, size_t size, Error **errp) { -NBDStructuredRead chunk; +NBDStructuredReadData chunk; struct iovec iov[] = { {.iov_base = &chunk, .iov_len = sizeof(chunk)}, {.iov_base = data, .iov_len = size} -- 2.13.6
[Qemu-devel] [PULL 6/8] nbd-client: Short-circuit 0-length operations
The NBD spec was recently clarified to state that clients should not send 0-length requests to the server, as the server behavior is undefined [1]. We know that qemu-nbd's behavior is a successful no-op (once it has filtered for read-only exports), but other NBD implementations might return an error. To avoid any questionable server implementations, it is better to just short-circuit such requests on the client side (we are relying on the block layer to already filter out requests such as invalid offset, write to a read-only volume, and so forth); do the short-circuit as late as possible to still benefit from protections from assertions that the block layer is not violating our assumptions. [1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037 Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-6-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 11 ++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index daa4392531..0a675d0fab 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -674,6 +674,9 @@ int nbd_client_co_preadv(BlockDriverState *bs, uint64_t offset, assert(bytes <= NBD_MAX_BUFFER_SIZE); assert(!flags); +if (!bytes) { +return 0; +} ret = nbd_co_send_request(bs, &request, NULL); if (ret < 0) { return ret; @@ -705,6 +708,9 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, assert(bytes <= NBD_MAX_BUFFER_SIZE); +if (!bytes) { +return 0; +} return nbd_co_request(bs, &request, qiov); } @@ -731,6 +737,9 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, request.flags |= NBD_CMD_FLAG_NO_HOLE; } +if (!bytes) { +return 0; +} return nbd_co_request(bs, &request, NULL); } @@ -759,7 +768,7 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) }; assert(!(client->info.flags & NBD_FLAG_READ_ONLY)); -if (!(client->info.flags & NBD_FLAG_SEND_TRIM)) { +if (!(client->info.flags & NBD_FLAG_SEND_TRIM) || !bytes) { return 0; } -- 
2.13.6
[Qemu-devel] [PULL 4/8] nbd/client: Nicer trace of structured reply
It's useful to know which structured reply chunk is being processed. Missed in commit d2febedb. Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-4-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- nbd/client.c | 4 +++- nbd/trace-events | 2 +- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/nbd/client.c b/nbd/client.c index 3d680e63e1..1880103d2a 100644 --- a/nbd/client.c +++ b/nbd/client.c @@ -979,6 +979,7 @@ static int nbd_receive_structured_reply_chunk(QIOChannel *ioc, int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) { int ret; +const char *type; ret = nbd_read_eof(ioc, &reply->magic, sizeof(reply->magic), errp); if (ret <= 0) { @@ -1008,8 +1009,9 @@ int nbd_receive_reply(QIOChannel *ioc, NBDReply *reply, Error **errp) if (ret < 0) { break; } +type = nbd_reply_type_lookup(reply->structured.type); trace_nbd_receive_structured_reply_chunk(reply->structured.flags, - reply->structured.type, + reply->structured.type, type, reply->structured.handle, reply->structured.length); break; diff --git a/nbd/trace-events b/nbd/trace-events index 4a13757524..bbc75f6414 100644 --- a/nbd/trace-events +++ b/nbd/trace-events @@ -27,7 +27,7 @@ nbd_client_clear_queue(void) "Clearing NBD queue" nbd_client_clear_socket(void) "Clearing NBD socket" nbd_send_request(uint64_t from, uint32_t len, uint64_t handle, uint16_t flags, uint16_t type, const char *name) "Sending request to server: { .from = %" PRIu64", .len = %" PRIu32 ", .handle = %" PRIu64 ", .flags = 0x%" PRIx16 ", .type = %" PRIu16 " (%s) }" nbd_receive_simple_reply(int32_t error, const char *errname, uint64_t handle) "Got simple reply: { .error = %" PRId32 " (%s), handle = %" PRIu64" }" -nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, uint64_t handle, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d, handle = %" PRIu64 ", length = %" PRIu32 " }" +nbd_receive_structured_reply_chunk(uint16_t flags, uint16_t type, const char *name, 
uint64_t handle, uint32_t length) "Got structured reply chunk: { flags = 0x%" PRIx16 ", type = %d (%s), handle = %" PRIu64 ", length = %" PRIu32 " }" # nbd/common.c nbd_unknown_error(int err) "Squashing unexpected error %d to EINVAL" -- 2.13.6
[Qemu-devel] [PULL 1/8] nbd/server: fix nbd_negotiate_handle_info
From: Vladimir Sementsov-Ogievskiy namelen should be here, length is unrelated, and always 0 at this point. Broken since its introduction in commit f37708f6, but mostly harmless (replying with '' as the name does not violate protocol, and does not confuse qemu as the nbd client since our implementation does not ask for the name; but might confuse some other client that does ask for the name especially if the default export is different than the export name being queried). Adding an assert makes it obvious that we are not skipping any bytes in the client's message, as well as making it obvious that we were using the wrong variable. Signed-off-by: Vladimir Sementsov-Ogievskiy CC: qemu-sta...@nongnu.org Message-Id: <20171101154204.27146-1-vsement...@virtuozzo.com> [eblake: improve commit message, squash in assert addition] Signed-off-by: Eric Blake --- nbd/server.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/nbd/server.c b/nbd/server.c index 70b40ed27e..bcf0cdb47c 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -423,6 +423,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint32_t length, break; } } +assert(length == 0); exp = nbd_export_find(name); if (!exp) { @@ -433,7 +434,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint32_t length, /* Don't bother sending NBD_INFO_NAME unless client requested it */ if (sendname) { -rc = nbd_negotiate_send_info(client, opt, NBD_INFO_NAME, length, name, +rc = nbd_negotiate_send_info(client, opt, NBD_INFO_NAME, namelen, name, errp); if (rc < 0) { return rc; -- 2.13.6
[Qemu-devel] [PULL 7/8] nbd-client: Stricter enforcing of structured reply spec
Ensure that the server is not sending unexpected chunk lengths for either the NONE or the OFFSET_DATA chunk, nor unexpected hole length for OFFSET_HOLE. This will flag any server as broken that responds to a zero-length read with an OFFSET_DATA (what our server currently does, but that's about to be fixed) or with OFFSET_HOLE, even though we previously fixed our client to never be able to send such a request over the wire. Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-7-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index 0a675d0fab..bcfed0133d 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -216,7 +216,7 @@ static int nbd_parse_offset_hole_payload(NBDStructuredReplyChunk *chunk, offset = payload_advance64(&payload); hole_size = payload_advance32(&payload); -if (offset < orig_offset || hole_size > qiov->size || +if (!hole_size || offset < orig_offset || hole_size > qiov->size || offset > orig_offset + qiov->size - hole_size) { error_setg(errp, "Protocol error: server sent chunk exceeding requested" " region"); @@ -281,7 +281,8 @@ static int nbd_co_receive_offset_data_payload(NBDClientSession *s, assert(nbd_reply_is_structured(&s->reply)); -if (chunk->length < sizeof(offset)) { +/* The NBD spec requires at least one byte of payload */ +if (chunk->length <= sizeof(offset)) { error_setg(errp, "Protocol error: invalid payload for " "NBD_REPLY_TYPE_OFFSET_DATA"); return -EINVAL; @@ -293,6 +294,7 @@ static int nbd_co_receive_offset_data_payload(NBDClientSession *s, be64_to_cpus(&offset); data_size = chunk->length - sizeof(offset); +assert(data_size); if (offset < orig_offset || data_size > qiov->size || offset > orig_offset + qiov->size - data_size) { error_setg(errp, "Protocol error: server sent chunk exceeding requested" @@ -411,6 +413,11 @@ static coroutine_fn int nbd_co_do_receive_one_chunk( " NBD_REPLY_FLAG_DONE flag set"); 
return -EINVAL; } +if (chunk->length) { +error_setg(errp, "Protocol error: NBD_REPLY_TYPE_NONE chunk with" + " nonzero length"); +return -EINVAL; +} return 0; } -- 2.13.6
[Qemu-devel] [PULL 3/8] nbd-client: Refuse read-only client with BDRV_O_RDWR
The NBD spec says that clients should not try to write/trim to an export advertised as read-only by the server. But we failed to check that, and would allow the block layer to use NBD with BDRV_O_RDWR even when the server is read-only, which meant we were depending on the server sending a proper EPERM failure for various commands, and also exposes a leaky abstraction: using qemu-io in read-write mode would succeed on 'w -z 0 0' because of local short-circuiting logic, but 'w 0 0' would send a request over the wire (where it then depends on the server, and fails at least for qemu-nbd but might pass for other NBD implementations). With this patch, a client MUST request read-only mode to access a server that is doing a read-only export, or else it will get a message like: can't open device nbd://localhost:10809/foo: request for write access conflicts with read-only export It is no longer possible to even attempt writes over the wire (including the corner case of 0-length writes), because the block layer enforces the explicit read-only request; this matches the behavior of qcow2 when backed by a read-only POSIX file. Fix several iotests to comply with the new behavior (since qemu-nbd of an internal snapshot, as well as nbd-server-add over QMP, default to a read-only export, we must tell blockdev-add/qemu-io to set up a read-only client). 
CC: qemu-sta...@nongnu.org Signed-off-by: Eric Blake Message-Id: <20171108215703.9295-3-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 9 + tests/qemu-iotests/058 | 8 tests/qemu-iotests/140 | 4 ++-- tests/qemu-iotests/147 | 1 + 4 files changed, 16 insertions(+), 6 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index de6c153328..daa4392531 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -697,6 +697,7 @@ int nbd_client_co_pwritev(BlockDriverState *bs, uint64_t offset, .len = bytes, }; +assert(!(client->info.flags & NBD_FLAG_READ_ONLY)); if (flags & BDRV_REQ_FUA) { assert(client->info.flags & NBD_FLAG_SEND_FUA); request.flags |= NBD_CMD_FLAG_FUA; @@ -717,6 +718,7 @@ int nbd_client_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset, .len = bytes, }; +assert(!(client->info.flags & NBD_FLAG_READ_ONLY)); if (!(client->info.flags & NBD_FLAG_SEND_WRITE_ZEROES)) { return -ENOTSUP; } @@ -756,6 +758,7 @@ int nbd_client_co_pdiscard(BlockDriverState *bs, int64_t offset, int bytes) .len = bytes, }; +assert(!(client->info.flags & NBD_FLAG_READ_ONLY)); if (!(client->info.flags & NBD_FLAG_SEND_TRIM)) { return 0; } @@ -814,6 +817,12 @@ int nbd_client_init(BlockDriverState *bs, logout("Failed to negotiate with the NBD server\n"); return ret; } +if (client->info.flags & NBD_FLAG_READ_ONLY && +!bdrv_is_read_only(bs)) { +error_setg(errp, + "request for write access conflicts with read-only export"); +return -EACCES; +} if (client->info.flags & NBD_FLAG_SEND_FUA) { bs->supported_write_flags = BDRV_REQ_FUA; bs->supported_zero_flags |= BDRV_REQ_FUA; diff --git a/tests/qemu-iotests/058 b/tests/qemu-iotests/058 index 2253c6a6d1..5eb8784669 100755 --- a/tests/qemu-iotests/058 +++ b/tests/qemu-iotests/058 @@ -117,15 +117,15 @@ _export_nbd_snapshot sn1 echo echo "== verifying the exported snapshot with patterns, method 1 ==" -$QEMU_IO_NBD -c 'read -P 0xa 0x1000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io -$QEMU_IO_NBD -c 
'read -P 0xb 0x2000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io +$QEMU_IO_NBD -r -c 'read -P 0xa 0x1000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io +$QEMU_IO_NBD -r -c 'read -P 0xb 0x2000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io _export_nbd_snapshot1 sn1 echo echo "== verifying the exported snapshot with patterns, method 2 ==" -$QEMU_IO_NBD -c 'read -P 0xa 0x1000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io -$QEMU_IO_NBD -c 'read -P 0xb 0x2000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io +$QEMU_IO_NBD -r -c 'read -P 0xa 0x1000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io +$QEMU_IO_NBD -r -c 'read -P 0xb 0x2000 0x1000' "$nbd_snapshot_img" | _filter_qemu_io $QEMU_IMG convert "$TEST_IMG" -l sn1 -O qcow2 "$converted_image" diff --git a/tests/qemu-iotests/140 b/tests/qemu-iotests/140 index f89d0d6789..a8fc95145c 100755 --- a/tests/qemu-iotests/140 +++ b/tests/qemu-iotests/140 @@ -78,7 +78,7 @@ _send_qemu_cmd $QEMU_HANDLE \ 'arguments': { 'device': 'drv' }}" \ 'return' -$QEMU_IO_PROG -f raw -c 'read -P 42 0 64k' \ +$QEMU_IO_PROG -f raw -r -c 'read -P 42 0 64k' \ "nbd+unix:///drv?socket=$TEST_DIR/nbd" 2>&1 \ | _filter_qemu_io | _filter_nbd @@ -87,7 +87,7 @@ _send_qemu_cmd $QEMU_HANDLE \ 'arguments': { 'device': 'drv' }}" \ 'return' -$QEMU_IO_PROG -f raw -c close
[Qemu-devel] [PULL 0/8] NBD patches for 2.11-rc1
The following changes since commit b0fbe46ad82982b289a44ee2495b59b0bad8a842: Update version for v2.11.0-rc0 release (2017-11-07 16:05:28 +) are available in the git repository at: git://repo.or.cz/qemu/ericb.git tags/pull-nbd-2017-11-09 for you to fetch changes up to ef8c887ee01a4e4c8c5c28c86ea5b45162c51bcd: nbd/server: Fix structured read of length 0 (2017-11-09 10:25:11 -0600) nbd patches for 2017-11-09 - Vladimir Sementsov-Ogievskiy: nbd/server: fix nbd_negotiate_handle_info - Eric Blake: 0/7 various NBD fixes for 2.11 Eric Blake (7): nbd-client: Fix error message typos nbd-client: Refuse read-only client with BDRV_O_RDWR nbd/client: Nicer trace of structured reply nbd: Fix struct name for structured reads nbd-client: Short-circuit 0-length operations nbd-client: Stricter enforcing of structured reply spec nbd/server: Fix structured read of length 0 Vladimir Sementsov-Ogievskiy (1): nbd/server: fix nbd_negotiate_handle_info include/block/nbd.h| 18 +- block/nbd-client.c | 37 +++-- nbd/client.c | 4 +++- nbd/server.c | 26 +++--- nbd/trace-events | 3 ++- tests/qemu-iotests/058 | 8 tests/qemu-iotests/140 | 4 ++-- tests/qemu-iotests/147 | 1 + 8 files changed, 79 insertions(+), 22 deletions(-) -- 2.13.6
[Qemu-devel] [PULL 2/8] nbd-client: Fix error message typos
Provide missing spaces that are required when using string concatenation to break error messages across source lines. Introduced in commit f140e300. Signed-off-by: Eric BlakeMessage-Id: <20171108215703.9295-2-ebl...@redhat.com> Reviewed-by: Vladimir Sementsov-Ogievskiy --- block/nbd-client.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/block/nbd-client.c b/block/nbd-client.c index b44d4d4a01..de6c153328 100644 --- a/block/nbd-client.c +++ b/block/nbd-client.c @@ -248,7 +248,7 @@ static int nbd_parse_error_payload(NBDStructuredReplyChunk *chunk, error = nbd_errno_to_system_errno(payload_advance32()); if (error == 0) { -error_setg(errp, "Protocol error: server sent structured error chunk" +error_setg(errp, "Protocol error: server sent structured error chunk " "with error = 0"); return -EINVAL; } @@ -257,7 +257,7 @@ static int nbd_parse_error_payload(NBDStructuredReplyChunk *chunk, message_size = payload_advance16(); if (message_size > chunk->length - sizeof(error) - sizeof(message_size)) { -error_setg(errp, "Protocol error: server sent structured error chunk" +error_setg(errp, "Protocol error: server sent structured error chunk " "with incorrect message size"); return -EINVAL; } @@ -408,7 +408,7 @@ static coroutine_fn int nbd_co_do_receive_one_chunk( if (chunk->type == NBD_REPLY_TYPE_NONE) { if (!(chunk->flags & NBD_REPLY_FLAG_DONE)) { error_setg(errp, "Protocol error: NBD_REPLY_TYPE_NONE chunk without" - "NBD_REPLY_FLAG_DONE flag set"); + " NBD_REPLY_FLAG_DONE flag set"); return -EINVAL; } return 0; -- 2.13.6
Re: [Qemu-devel] [Qemu-block] [PATCH] block: all I/O should be completed before removing throttle timers.
On Sat, Oct 21, 2017 at 01:34:00PM +0800, Zhengui Li wrote: > From: Zhengui > > In blk_remove_bs, all I/O should be completed before removing throttle > timers. If there is inflight I/O, removing the throttle timers here will > cause the inflight I/O to never return. > This patch adds bdrv_drained_begin before throttle_timers_detach_aio_context > to let all I/O complete before removing the throttle timers. > > Signed-off-by: Zhengui > --- > block/block-backend.c | 4 > 1 file changed, 4 insertions(+) Reviewed-by: Stefan Hajnoczi
Re: [Qemu-devel] [PATCH 3/7] s390x/pci: rework PCI LOAD
On Tue, 7 Nov 2017 18:24:35 +0100 Pierre Morel wrote: > Enhance the fault detection and correct the fault reporting. Basically the same comments as for the previous patch (but looks good in general). > > Signed-off-by: Pierre Morel > Reviewed-by: Yi Min Zhao > --- > hw/s390x/s390-pci-inst.c | 27 --- > 1 file changed, 16 insertions(+), 11 deletions(-)
Re: [Qemu-devel] [PATCH 2/7] s390x/pci: rework PCI STORE
On Tue, 7 Nov 2017 18:24:34 +0100 Pierre Morelwrote: > Enhance the fault detection, correction of the fault reporting. > > Signed-off-by: Pierre Morel > Reviewed-by: Yi Min Zhao > --- > hw/s390x/s390-pci-inst.c | 41 - > 1 file changed, 24 insertions(+), 17 deletions(-) > > diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > index 8fcb02d..4a2f996 100644 > --- a/hw/s390x/s390-pci-inst.c > +++ b/hw/s390x/s390-pci-inst.c > @@ -469,6 +469,12 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > pcias = (env->regs[r2] >> 16) & 0xf; > len = env->regs[r2] & 0xf; > offset = env->regs[r2 + 1]; > +data = env->regs[r1]; > + > +if (!(fh & FH_MASK_ENABLE)) { This covers the reserved/standby/disabled states, right? > +setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); > +return 0; > +} > > pbdev = s390_pci_find_dev_by_fh(s390_get_phb(), fh); > if (!pbdev) { > @@ -478,12 +484,7 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > } > > switch (pbdev->state) { > -case ZPCI_FS_RESERVED: > -case ZPCI_FS_STANDBY: > -case ZPCI_FS_DISABLED: > case ZPCI_FS_PERMANENT_ERROR: > -setcc(cpu, ZPCI_PCI_LS_INVAL_HANDLE); > -return 0; > case ZPCI_FS_ERROR: > setcc(cpu, ZPCI_PCI_LS_ERR); > s390_set_status_code(env, r2, ZPCI_PCI_ST_BLOCKED); > @@ -492,9 +493,13 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > break; > } > > -data = env->regs[r1]; > -if (pcias < 6) { > -if ((8 - (offset & 0x7)) < len) { > +/* Test that the pcias is a valid pcias and do the according operations > */ s/according/appropriate/ ? > +switch (pcias) { > +case 0 ... 
5: > +/* Check length: > + * A length of 0 is invalid and length should not cross a double word > + */ > +if (!len || (len > (8 - (offset & 0x7 { > program_interrupt(env, PGM_OPERAND, 4); > return 0; > } > @@ -512,21 +517,23 @@ int pcistg_service_call(S390CPU *cpu, uint8_t r1, > uint8_t r2) > program_interrupt(env, PGM_OPERAND, 4); > return 0; > } > -} else if (pcias == 15) { > -if ((4 - (offset & 0x3)) < len) { > -program_interrupt(env, PGM_OPERAND, 4); > -return 0; > -} > - > -if (zpci_endian_swap(, len)) { > +break; > +case 15: > +/* ZPCI uses the pseudo BAR number 15 as configuration space */ > +/* possible access length are 1,2,4 and must not cross a word */ s/length/lengths/ > +if (!len || (len > (4 - (offset & 0x3))) || len == 3) { > program_interrupt(env, PGM_OPERAND, 4); > return 0; > } > - > +/* len = 1,2,4 so we do not need to test */ > +zpci_endian_swap(, len); > pci_host_config_write_common(pbdev->pdev, offset, > pci_config_size(pbdev->pdev), > data, len); > -} else { > +break; > +case PCI_ROM_SLOT: So, will this be filled in a later patch? (Reading from the top.) > +/* Fall through */ > +default: > DPRINTF("pcistg invalid space\n"); > setcc(cpu, ZPCI_PCI_LS_ERR); > s390_set_status_code(env, r2, ZPCI_PCI_ST_INVAL_AS); On the whole, I think this improves readability, especially for folks that do not have access to the spec.
Re: [Qemu-devel] [PATCH V5] hw/pcie-pci-bridge: restrict to X86 and ARM
Hi Marcel, Cc'ing Paul and Yongbok Since I'm not sure their Boston board could also use it. On 11/09/2017 12:46 PM, Marcel Apfelbaum wrote: > The PCIE-PCI bridge is specific to "pure" PCIe systems > (on QEMU we have X86 and ARM), it does not make sense to > have it in other archs. > > Reported-by: Thomas Huth> Signed-off-by: Marcel Apfelbaum > --- > > V4 -> V5 > - Since all other tries failed, conditioned the > device on the PCIe Root Port. > > V3 -> V4: > - Move the config line to pci.mak (Thomas) > > V2 -> V3: > - Another tweak in subject s/if/it (Cornelia) > > V1 -> V2: > Addressed Thomas and Cornelia comments: > - Conditioned the pcie-pci-bridge compilation on >the PCIe Root CONFIG_PCIE_PORT > - Tweaked subject PCI -> PCIe > > Thanks, > Marcel > > > hw/pci-bridge/Makefile.objs | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs > index 666db37da2..1b05023662 100644 > --- a/hw/pci-bridge/Makefile.objs > +++ b/hw/pci-bridge/Makefile.objs > @@ -1,5 +1,5 @@ > -common-obj-y += pci_bridge_dev.o pcie_pci_bridge.o > -common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o > +common-obj-y += pci_bridge_dev.o > +common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o > pcie_pci_bridge.o KISS :) Reviewed-by: Philippe Mathieu-Daudé Tested-by: Philippe Mathieu-Daudé > common-obj-$(CONFIG_PXB) += pci_expander_bridge.o > common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o > common-obj-$(CONFIG_IOH3420) += ioh3420.o >
Re: [Qemu-devel] [PATCH 1/7] s390x/pci: factor out endianess conversion
On Tue, 7 Nov 2017 18:24:33 +0100 Pierre Morelwrote: > There are two places where the same endianness conversion > is done. > Let's factor this out into a static function. > > Signed-off-by: Pierre Morel > Reviewed-by: Yi Min Zhao > --- > hw/s390x/s390-pci-inst.c | 58 > ++-- > 1 file changed, 32 insertions(+), 26 deletions(-) > > diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c > index 8e088f3..8fcb02d 100644 > --- a/hw/s390x/s390-pci-inst.c > +++ b/hw/s390x/s390-pci-inst.c > @@ -314,6 +314,35 @@ out: > return 0; > } > > +/** > + * This function swaps the data at ptr according from one > + * endianness to the other. > + * valid data in the uint64_t data field. I'm not sure what that line is supposed to mean? > + * @ptr: a pointer to a uint64_t data field > + * @len: the length of the valid data, must be 1,2,4 or 8 > + */ > +static int zpci_endian_swap(uint64_t *ptr, uint8_t len) > +{ > +uint64_t data = *ptr; > +switch (len) { > +case 1: > +break; > +case 2: > +data = bswap16(data); > +break; > +case 4: > +data = bswap32(data); > +break; > +case 8: > +data = bswap64(data); > +break; > +default: > +return -EINVAL; > +} > +*ptr = data; > +return 0; > +} > + I was expecting more code to use a similar pattern, but it seems surprisingly uncommon.
Re: [Qemu-devel] [PATCH V5] hw/pcie-pci-bridge: restrict to X86 and ARM
On 09.11.2017 16:46, Marcel Apfelbaum wrote: > The PCIE-PCI bridge is specific to "pure" PCIe systems > (on QEMU we have X86 and ARM), it does not make sense to > have it in other archs. > > Reported-by: Thomas Huth> Signed-off-by: Marcel Apfelbaum > --- > > V4 -> V5 > - Since all other tries failed, conditioned the > device on the PCIe Root Port. > > V3 -> V4: > - Move the config line to pci.mak (Thomas) > > V2 -> V3: > - Another tweak in subject s/if/it (Cornelia) > > V1 -> V2: > Addressed Thomas and Cornelia comments: > - Conditioned the pcie-pci-bridge compilation on >the PCIe Root CONFIG_PCIE_PORT > - Tweaked subject PCI -> PCIe > > Thanks, > Marcel > > > hw/pci-bridge/Makefile.objs | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs > index 666db37da2..1b05023662 100644 > --- a/hw/pci-bridge/Makefile.objs > +++ b/hw/pci-bridge/Makefile.objs > @@ -1,5 +1,5 @@ > -common-obj-y += pci_bridge_dev.o pcie_pci_bridge.o > -common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o > +common-obj-y += pci_bridge_dev.o > +common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o > pcie_pci_bridge.o > common-obj-$(CONFIG_PXB) += pci_expander_bridge.o > common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o > common-obj-$(CONFIG_IOH3420) += ioh3420.o Reviewed-by: Thomas Huth
Re: [Qemu-devel] [PATCH] blockdev-backup: enable non-root nodes for backup
On 11/09/2017 08:16 AM, Vladimir Sementsov-Ogievskiy wrote: > This is needed to implement image-fleecing scheme, when we create > a temporary node, mark our active node to be backing for the temp, > and start backup(sync=none) from active node to the temp node. > Temp node then represents a kind of snapshot and may be used > for external backup through NBD. > > Signed-off-by: Vladimir Sementsov-Ogievskiy> --- > > What was the reason to abandon non-root nodes? I think the original restriction was that we didn't know all the implications to having multiple readers to an intermediate node, so it was easier to prevent it with plans to add it later than to add it up front and deal with the fallout. But I think that now we are sufficiently versed in handling BDS trees with multiple readers, with proper op-blocking in place; so you are right that we can probably support it now. However, I'm a bit worried that there is no documentation change about this in a .json QAPI file, nor any easy way to introspect via QMP whether a particular qemu implementation supports this (short of trying it and seeing whether it works). I'm also thinking that this is 2.12 material, unless we can prove it is fixing a bug for 2.11 that was not previously present. > > blockdev.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/blockdev.c b/blockdev.c > index 56a6b24a0b..d0a5a107f0 100644 > --- a/blockdev.c > +++ b/blockdev.c > @@ -3345,7 +3345,7 @@ BlockJob *do_blockdev_backup(BlockdevBackup *backup, > BlockJobTxn *txn, > backup->compress = false; > } > > -bs = qmp_get_root_bs(backup->device, errp); > +bs = bdrv_lookup_bs(backup->device, backup->device, errp); > if (!bs) { > return NULL; > } > -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org signature.asc Description: OpenPGP digital signature
Re: [Qemu-devel] [Qemu-block] segfault in parallel blockjobs (iotest 30)
On Wed 08 Nov 2017 03:45:38 PM CET, Alberto Garcia wrote: > I'm thinking that perhaps we should add the pause point directly to > block_job_defer_to_main_loop(), to prevent any block job from running > the exit function when it's paused. I was trying this and unfortunately this breaks the mirror job at least (reproduced with a simple block-commit on the topmost node, which uses commit_active_start() -> mirror_start_job()). So what happens is that mirror_run() always calls bdrv_drained_begin() before returning, pausing the block job. The closing bdrv_drained_end() call is at the end of mirror_exit(), already in the main loop. So the mirror job is always calling block_job_defer_to_main_loop() and mirror_exit() when the job is paused. Berto
Re: [Qemu-devel] [Qemu-block] Drainage in bdrv_replace_child_noperm()
On 08.11.2017 21:16, Max Reitz wrote: > On 2017-11-07 15:22, Kevin Wolf wrote: > > I think the issue is much simpler, even though it still has two parts. > > It's the old story of bdrv_drain mixing two separate concepts: > > > > 1. Wait synchronously for the completion of all my requests to this > >node. This needs to be propagated down the graph to the children. > > So, flush without flushing protocol drivers? :-) No, flush is a completely different thing. Drain is about completing all in-flight requests, whereas flush is about writing out all caches. You can do either one without the other. In particular, a flush doesn't involve a drain; you can still issue requests while a flush request is in flight. The semantics are that a flush makes sure that all requests which had already completed when the flush request was started are stable on disk. Later requests may or may not be stable on disk yet. > > 2. Make sure that nobody else sends new requests to this node. This > >needs to be propagated up the graph to the parents. > > > > Some callers want only 1. (usually callers of bdrv_drain_all() without a > > begin/end pair), some callers want both 1. and 2. (that's the begin/end > > construction for getting exclusive access). Not sure if there are > > callers that want only 2., but possibly. > > > > If we actually take this separation seriously, the first step to a fix > > could be that BdrvChildRole.drained_begin shouldn't be propagated to the > > children. > > That seems good to me, but considering that the default was to just > propagate it to every child, I thought that I was missing something. bdrv_child_cb_drained_begin() calls bdrv_drained_begin() to wait for the completion of all running requests on its current node. The propagation to every child is an unwanted side effect, but so far it didn't seem to hurt anyone, so we didn't care about preventing it.
> > We may still need to drain the requests at the node itself: > > > Imagine a drained backing file of a qcow2 node. Making sure that the qcow2 > > node doesn't get new requests isn't enough, we also must make sure that > > in-flight requests don't access the backing file any more. This means > > draining the qcow2 node, though we don't care whether its child nodes > > still have requests in flight. > > If you want to stop the qcow2 node from generating further requests, you > can either flush them (as you said) or pause them (whatever that means...). Pausing is probably way too complicated. qcow2 never issues requests by itself. It only has requests running if someone else has a request running on the qcow2 node. So it's enough to just drain the request queue of the qcow2 node to get to the state we want. The only thing we need to make sure is that we drain it _before_ its child node is replaced with a drained one, so that its requests can complete. In fact, I think graph modifications should only be done with drained nodes anyway. Otherwise you could switch over in the middle of a single high-level operation and possibly get callbacks from a node that isn't a child any more. Maybe we should add some assertions to that effect and clean up whatever code breaks after it. > However, if you flush them, you also need to flush the children you have > flushed them to... So what I wrote was technically wrong ("don't pass > parent drainage back to the child because the child is already > drained"), instead it should be "don't pass parent drainage back to the > child because the child is going to be drained (flushed) anyway". > > So, pause requests first (either by parking them or flushing them, > driver's choice), then flush the tree. > > Of course that's easier said than done... First, we would need a > non-recursive flush. Then, every node that is visited by the drain > would (*after* recursing to its parents) need to flush itself.
> > (Note that I'm completely disregarding nodes which are below the > original node that is supposed to be drained, and which therefore do not > drain their parents (or do they?) because I'd like to retain at least > some of my sanity for now.) I don't follow, but I suppose this is related to the flush/drain confusion. > Secondly we have exactly the issue you describe below... > > > The big question is whether bdrv_drain() would still work for a single > > node without recursing to the children, but as it uses bs->in_flight, I > > think that should be okay these days. > > > >> (Most importantly, ideally we'd want to attach the new child to the > >> parent and then drain the parent: This would give us exactly the state > >> we want. However, attaching the child first and then draining the > >> parent is unsafe, so we cannot do it...) > >> > >> [1] Whether the parent (un-)drains the child depends on the > >> BdrvChildRole.drained_{begin,end}() implementation, strictly speaking. > >> We cannot say it generally. > >> > >> OK, so how to fix it? I don't know, so I'm asking you. :-) > > > >
Re: [Qemu-devel] [PATCH v4 0/4] Don't write headers if BDS is INACTIVE
On 11/07/2017 04:10 PM, Jeff Cody wrote: > Changes from v3->v4: > > Patch 3: Add migrate_del_blocker and error_free (Thanks Stefan) > > > git-backport-diff -r qemu/master.. -u ba11b69 > > 001/4:[] [--] 'block/vhdx.c: Don't blindly update the header' > 002/4:[] [--] 'block/parallels: Do not update header or truncate image > when INMIGRATE' > 003/4:[0003] [FC] 'block/parallels: add migration blocker' > 004/4:[] [--] 'qemu-iotests: update unsupported image formats in 194' > > > Changes from v2->v3: > > Patch 2: Uh... fix that misspelling. Thanks Stefan :) > Patch 3: New patch to block migration in parallels > > git-backport-diff -r qemu/master.. -u 6dc6acb > > 001/4:[] [--] 'block/vhdx.c: Don't blindly update the header' > 002/4:[] [--] 'block/parallels: Do not update header or truncate image > when INMIGRATE' > 003/4:[down] 'block/parallels: add migration blocker' > 004/4:[] [--] 'qemu-iotests: update unsupported image formats in 194' > > > Changes from v1->v2: > > * Drop previous parallels patches, just check BDRV_O_INACTIVE now > (Kevin) > > git-backport-diff -r qemu/master.. -u github/master > Key: > [] : patches are identical > [] : number of functional differences between upstream/downstream patch > [down] : patch is downstream-only > The flags [FC] indicate (F)unctional and (C)ontextual differences, > respectively > > 001/3:[] [--] 'block/vhdx.c: Don't blindly update the header' > 002/3:[down] 'block/parallals: Do not update header or truncate image when > INMIGRATE' > 003/3:[] [--] 'qemu-iotests: update unsupported image formats in 194' > > > v1: > > VHDX and Parallels both blindly write headers to the image file > if the images are opened R/W. This causes an assert if the QEMU run > state is INMIGRATE. 
> > Jeff Cody (4): > block/vhdx.c: Don't blindly update the header > block/parallels: Do not update header or truncate image when INMIGRATE > block/parallels: add migration blocker > qemu-iotests: update unsupported image formats in 194 > > block/parallels.c | 22 +- > block/vhdx.c | 7 --- > tests/qemu-iotests/194 | 2 +- > 3 files changed, 18 insertions(+), 13 deletions(-) > Reviewed-by: Denis V. Lunev
Re: [Qemu-devel] [Qemu-block] [PATCH 0/5] iotests: Make some tests less flaky
On Thu, Nov 09, 2017 at 02:37:59AM +0100, Max Reitz wrote: > There are a couple of tests that fail (on my machine) from time to > time (and by that I mean that recently I've rarely ever had a test run > with both 083 and 136 working on first try). > This series should fix most (at least the issues I am aware of). > > Notes: > - 083 might have another issue, but if so it occurs extremely rarely and > so I was unable to debug it. > > - 129 is flaky, too, because it tries to use block jobs with BB > throttling -- however, block jobs ignore that these days. Making it > use a throttle filter node will require quite a bit of work. See > http://lists.nongnu.org/archive/html/qemu-block/2017-11/msg00111.html > for more. > > - 194 sometimes hangs because the source VM fails to drain its drive. > This is probably not an issue with the test, but actually an issue in > qemu. See > http://lists.nongnu.org/archive/html/qemu-block/2017-11/msg00256.html > for more. > > > "All tests have passed, let's ship it!" > -- Me, 2:36 am > > > (Editor's note: "all" means raw/file, qcow2/file, and raw/nbd.) > > > Max Reitz (5): > iotests: Make 030 less flaky > iotests: Add missing 'blkdebug::' in 040 > iotests: Make 055 less flaky > iotests: Make 083 less flaky > iotests: Make 136 less flaky > > tests/qemu-iotests/030 | 8 ++-- > tests/qemu-iotests/040 | 2 +- > tests/qemu-iotests/055 | 25 - > tests/qemu-iotests/083 | 3 ++- > tests/qemu-iotests/136 | 14 +- > 5 files changed, 38 insertions(+), 14 deletions(-) > > -- > 2.13.6 > > Thanks for these fixes! QEMU 2.11 material. Reviewed-by: Stefan Hajnoczisignature.asc Description: PGP signature
Re: [Qemu-devel] [PATCH] AMD Processor Topology Information
Hi Philippe, Most of the code for 'case 0x801E:' is a copy of 'case 4:' of the same procedure in both cpu.c and kvm.c Values were changes for AMD Zen architecture. The only new code is 'case 0x801D:' which defines core topology. Hope this info helps wit review. On 2017-11-08 13:44, Philippe Mathieu-Daudé wrote: Hi Stanislav, This does not seem so trivial ;) Cc'ing more reviewers. On 11/03/2017 02:30 PM, Stanislav Lanci wrote: V2: Adds information about cache size and topology on leaf 0x801D for family 17h Without the added cache topology guest with SMT suffers latency problems Add CPUID 0x801E for describing AMD Processor Topology Information Disables warning about smt for 17h family of AMD CPUs Signed-off-by: Stanislav Lanci--- target/i386/cpu.c | 93 ++- target/i386/kvm.c | 28 +++-- 2 files changed, 117 insertions(+), 4 deletions(-) diff --git a/target/i386/cpu.c b/target/i386/cpu.c index ddc45abd70..1545e3fe31 100644 --- a/target/i386/cpu.c +++ b/target/i386/cpu.c @@ -113,7 +113,9 @@ /* L1 instruction cache: */ #define L1I_LINE_SIZE 64 #define L1I_ASSOCIATIVITY 8 +#define L1I_ASSOC_AMD_ZEN 4 #define L1I_SETS 64 +#define L1I_SETS_AMD_ZEN 256 #define L1I_PARTITIONS 1 /* Size = LINE_SIZE*ASSOCIATIVITY*SETS*PARTITIONS = 32KiB */ #define L1I_DESCRIPTOR CPUID_2_L1I_32KB_8WAY_64B @@ -125,7 +127,9 @@ /* Level 2 unified cache: */ #define L2_LINE_SIZE 64 #define L2_ASSOCIATIVITY 16 +#define L2_ASSOC_AMD_ZEN 8 #define L2_SETS 4096 +#define L2_SETS_AMD_ZEN 1024 #define L2_PARTITIONS 1 /* Size = LINE_SIZE*ASSOCIATIVITY*SETS*PARTITIONS = 4MiB */ /*FIXME: CPUID leaf 2 descriptor is inconsistent with CPUID leaf 4 */ @@ -142,6 +146,7 @@ #define L3_N_LINE_SIZE 64 #define L3_N_ASSOCIATIVITY 16 #define L3_N_SETS 16384 +#define L3_N_SETS_AMD_ZEN4096 #define L3_N_PARTITIONS 1 #define L3_N_DESCRIPTOR CPUID_2_L3_16MB_16WAY_64B #define L3_N_LINES_PER_TAG 1 @@ -3072,6 +3077,91 @@ void cpu_x86_cpuid(CPUX86State *env, uint32_t index, uint32_t count, *edx = 0; } break; +case 0x801D: /* AMD 
TOPOEXT cache info for ZEN */ +if (cpu->cache_info_passthrough) { +host_cpuid(index, count, eax, ebx, ecx, edx); +break; +} else if ((env->cpuid_version & 0xFF00F00) == 0x800F00) { +*eax = 0; +switch (count) { +case 0: /* L1 dcache info */ +*eax |= CPUID_4_TYPE_DCACHE | \ +CPUID_4_LEVEL(1) | \ +CPUID_4_SELF_INIT_LEVEL | \ +((cs->nr_threads - 1) << 14); +*ebx = (L1D_LINE_SIZE - 1) | \ + ((L1D_PARTITIONS - 1) << 12) | \ + ((L1D_ASSOCIATIVITY - 1) << 22); +*ecx = L1D_SETS - 1; +*edx = 0; +break; +case 1: /* L1 icache info */ +*eax |= CPUID_4_TYPE_ICACHE | \ +CPUID_4_LEVEL(1) | \ +CPUID_4_SELF_INIT_LEVEL | \ +((cs->nr_threads - 1) << 14); +*ebx = (L1I_LINE_SIZE - 1) | \ + ((L1I_PARTITIONS - 1) << 12) | \ + ((L1I_ASSOC_AMD_ZEN - 1) << 22); +*ecx = L1I_SETS_AMD_ZEN - 1; +*edx = 0; +break; +case 2: /* L2 cache info */ +*eax |= CPUID_4_TYPE_UNIFIED | \ +CPUID_4_LEVEL(2) | \ +CPUID_4_SELF_INIT_LEVEL | \ +((cs->nr_threads - 1) << 14); +*ebx = (L2_LINE_SIZE - 1) | \ + ((L2_PARTITIONS - 1) << 12) | \ + ((L2_ASSOC_AMD_ZEN - 1) << 22); +*ecx = L2_SETS_AMD_ZEN - 1; +*edx = CPUID_4_INCLUSIVE; +break; +case 3: /* L3 cache info */ +if (!cpu->enable_l3_cache) { +*eax = 0; +*ebx = 0; +*ecx = 0; +*edx = 0; +break; +} +*eax |= CPUID_4_TYPE_UNIFIED | \ +CPUID_4_LEVEL(3) | \ +CPUID_4_SELF_INIT_LEVEL | \ +((cs->nr_cores * cs->nr_threads - 1) << 14); +*ebx = (L3_N_LINE_SIZE - 1) | \ + ((L3_N_PARTITIONS - 1) << 12) | \ + ((L3_N_ASSOCIATIVITY - 1) << 22); +*ecx = L3_N_SETS_AMD_ZEN - 1; +*edx = CPUID_4_NO_INVD_SHARING; +break; +default: /* end of info */ +*eax = 0; +*ebx = 0; +*ecx = 0; +
Re: [Qemu-devel] [Qemu-block] RFC: use case for adding QMP, block jobs & multiple exports to qemu-nbd ?
On Thu, Nov 09, 2017 at 02:54:35PM +0100, Markus Armbruster wrote: > Max Reitzwrites: > > > On 2017-11-02 13:02, Daniel P. Berrange wrote: > > [...] > >> One alternative approach to doing this would be to suggest that we should > >> instead just spawn qemu-system-x86_64 with '--machine none' and use that > >> as a replacement for qemu-nbd, since it already has a built-in NBD server > >> which can do many exports at once and arbitrary block jobs. > > > > As far as I know, we had wanted to add QMP support to qemu-nbd maybe one > > or two years ago, but nobody ever did it. > > > > I've had some discussions about this with Markus and Kevin at KVM Forum. > > They appeared to strongly prefer this approach. I agree with them that > > design-wise, a qemu with no machine at all (and not even -M none) and > > early QMP is the way we want to go anyway, and then this would be the > > correct tool to use. > > "Strongly" is perhaps a bit strong, at least as far as I'm concerned. I > just believe that we want the capability to run QEMU without a machine > anyway, and if that's good enough, then why bother duplicating so much > of qemu-system-FOO in qemu-nbd & friends? Besides, once you start to > duplicate, you'll likely find it hard to stop. > > >> I'm concerned that this could end up being a be a game of whack-a-mole > >> though, constantly trying to cut out/down all the bits of system emulation > >> in the machine emulators to get its resource overhead to match the low > >> overhead of standalone qemu-nbd. > > > > However, I personally share your concern. Especially, I think that > > getting to a point where we can have no machine at all and early QMP > > will take much longer than just adding QMP to qemu-nbd -- or adding a > > qmp command to qemu-img (because you can add NBD exports through QMP, so > > qemu-nbd's hardcoded focus on NBD exports seems kind of weird then)[1]. > > > > I'm very much torn here. 
There are two approaches: Stripping fat qemu > > down, or fattening lean qemu-img (?) up. The latter is very simple. > > "Very simple" is perhaps debatable, but I think we can agree on > "temptingly simple". My other concern with using the QEMU system emulator binary is that even if you make it possible to run it with no guest machine instantiated, it is still a massive binary containing all the stuff you get with a QEMU system emulator. To be able to lock down the security of this QEMU to the same level that we could do with qemu-nbd will take even more work than we would already have to do to make "no machine" be possible. Even then, getting the right security level would require the invoker to enable the right magic combo of options. And then we would also need full QEMU backend modularization to be done so that we could have a binary for serving NBD which didn't pull in irrelevant stuff like SPICE, GTK & Xorg libraries. This all just sounds like we are queuing up unfeasibly large amounts of work, a lot of which we've talked about for 10 years and still not been able to make a step forward on implementing. It feels like the key important factor is to avoid re-inventing the wheel in multiple places. I don't think this implies that we need to have a single binary for doing two completely separate tasks. On the qemu-nbd side it should basically involve assembling building blocks that we already have available through the QEMU code base. The block layer is already well isolated & reusable, as evidenced by the fact that it is used across many programs already with little code duplication. In theory the QMP monitor is fairly well isolated, since we already reuse the infra for the QEMU guest agent too. For qemu-nbd we could reuse even more than with the guest agent, since we can pull in some of the actual command impls for sharing.
There's doubtless still some refactoring of QMP needed to make it possible, but it's nowhere near the kind of scope we'd be looking at to take the QEMU system emulator and enable a "no machine" startup, make it secure by default and modularize all its dependencies. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
Re: [Qemu-devel] [Qemu-block] [PATCH] coroutine: simplify co_aio_sleep_ns() prototype
On Thu, Nov 09, 2017 at 10:26:52AM +, Stefan Hajnoczi wrote: > The AioContext pointer argument to co_aio_sleep_ns() is only used for > the sleep timer. It does not affect where the caller coroutine is > resumed. > > Due to changes to coroutine and AIO APIs it is now possible to drop the > AioContext pointer argument. This is safe to do since no caller has > specific requirements for which AioContext the timer must run in. > > This patch drops the AioContext pointer argument and renames the > function to simplify the API. > > Reported-by: Paolo Bonzini > Reported-by: Eric Blake > Signed-off-by: Stefan Hajnoczi > --- > include/qemu/coroutine.h| 6 +- > block/null.c| 3 +-- > block/sheepdog.c| 3 +-- > blockjob.c | 2 +- > util/qemu-coroutine-sleep.c | 4 ++-- > 5 files changed, 6 insertions(+), 12 deletions(-) Thanks, applied to my block-next tree for QEMU 2.12: https://github.com/stefanha/qemu/commits/block-next Stefan
Re: [Qemu-devel] [PATCH 0/2] add qmp nbd-server-remove
On 11/09/2017 09:40 AM, Vladimir Sementsov-Ogievskiy wrote: > Add command to remove nbd export, pair to nbd-server-add. > The whole thing and description are in patch 02. > > Vladimir Sementsov-Ogievskiy (2): > nbd/server: add additional assert to nbd_export_put > qmp: add nbd-server-remove > > qapi/block.json | 20 > blockdev-nbd.c | 27 +++ > nbd/server.c| 1 + > 3 files changed, 48 insertions(+) Reviewed-by: Eric Blake -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH 2/2] qmp: add nbd-server-remove
On 11/09/2017 09:40 AM, Vladimir Sementsov-Ogievskiy wrote: > Add command for export removing. It is needed for cases when we > don't want to keep export after the operation on it was completed. > The other example is temporary node, created with blockdev-add. > If we want to delete it we should firstly remove corresponding > NBD export. > > Signed-off-by: Vladimir Sementsov-Ogievskiy > --- > qapi/block.json | 20 > blockdev-nbd.c | 27 +++ > 2 files changed, 47 insertions(+) It would be nice (okay as a followup patch) to add iotest coverage of the new command. > ## > +# @nbd-server-remove: > +# > +# Stop exporting block node through QEMU's embedded NBD server. > +# > +# @device: The device name or node name of the exported node. Should be equal > +# to @device parameter for corresponding nbd-server-add command > call. > +# > +# @force: Whether active connections to the export should be closed. If this > +# parameter is false the export is only removed from named exports > list, > +# so new connections are impossible and it would be freed after all > +# clients are disconnected (default false). > +# > +# Returns: error if the server is not running or the device is not marked for > +# export. > +# > +# Since: 2.12 You are correct that this is too late for 2.11, but I like the concept. Once we have some testsuite coverage, I'll queue it for 2.12. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Re: [Qemu-devel] [PATCH] Add missing parameters to mon option documentation
Hi On Thu, Nov 9, 2017 at 1:19 PM, Vicente Jimenez Aguilar wrote: > Documentation missed 'mon' option's 'pretty' and 'default' parameters > > Signed-off-by: Vicente Jimenez Aguilar > --- > qemu-options.hx | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/qemu-options.hx b/qemu-options.hx > index 3728e9b4dd..72cf48a8e5 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -3477,9 +3477,9 @@ Like -qmp but uses pretty JSON formatting. > ETEXI > > DEF("mon", HAS_ARG, QEMU_OPTION_mon, \ > -"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL) > +"-mon > [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]\n", > QEMU_ARCH_ALL) > STEXI > -@item -mon [chardev=]name[,mode=readline|control] > +@item -mon > [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]] "default" is deprecated since commit 06ac27f (2.4.0). It does nothing anymore, so it should probably not be documented. "pretty" was added in commit 39eaab9ac2a82f. Could you write some documentation for it based on the commit message, perhaps? thanks > @findex -mon > Setup monitor on chardev @var{name}. > ETEXI > -- > 2.14.1 > > -- Marc-André Lureau
[Qemu-devel] [PATCH V5] hw/pcie-pci-bridge: restrict to X86 and ARM
The PCIE-PCI bridge is specific to "pure" PCIe systems (on QEMU we have X86 and ARM); it does not make sense to have it in other archs. Reported-by: Thomas Huth Signed-off-by: Marcel Apfelbaum --- V4 -> V5 - Since all other tries failed, conditioned the device on the PCIe Root Port. V3 -> V4: - Move the config line to pci.mak (Thomas) V2 -> V3: - Another tweak in subject s/if/it (Cornelia) V1 -> V2: Addressed Thomas and Cornelia comments: - Conditioned the pcie-pci-bridge compilation on the PCIe Root CONFIG_PCIE_PORT - Tweaked subject PCI -> PCIe Thanks, Marcel hw/pci-bridge/Makefile.objs | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/pci-bridge/Makefile.objs b/hw/pci-bridge/Makefile.objs index 666db37da2..1b05023662 100644 --- a/hw/pci-bridge/Makefile.objs +++ b/hw/pci-bridge/Makefile.objs @@ -1,5 +1,5 @@ -common-obj-y += pci_bridge_dev.o pcie_pci_bridge.o -common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o +common-obj-y += pci_bridge_dev.o +common-obj-$(CONFIG_PCIE_PORT) += pcie_root_port.o gen_pcie_root_port.o pcie_pci_bridge.o common-obj-$(CONFIG_PXB) += pci_expander_bridge.o common-obj-$(CONFIG_XIO3130) += xio3130_upstream.o xio3130_downstream.o common-obj-$(CONFIG_IOH3420) += ioh3420.o -- 2.13.5
[Qemu-devel] [PATCH 1/2] nbd/server: add additional assert to nbd_export_put
This place is not obvious: nbd_export_close() may theoretically reduce the refcount to 0. It may happen if someone calls nbd_export_put() on a named export not through nbd_export_set_name() when the refcount is 1. Signed-off-by: Vladimir Sementsov-Ogievskiy --- nbd/server.c | 1 + 1 file changed, 1 insertion(+) diff --git a/nbd/server.c b/nbd/server.c index 70b40ed27e..2f2e05943f 100644 --- a/nbd/server.c +++ b/nbd/server.c @@ -1179,6 +1179,7 @@ void nbd_export_put(NBDExport *exp) nbd_export_close(exp); } +assert(exp->refcount > 0); if (--exp->refcount == 0) { assert(exp->name == NULL); assert(exp->description == NULL); -- 2.11.1
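The failure mode described in the commit message is easier to see in a toy model. The sketch below is a hypothetical Python re-implementation of the put/close interaction, not QEMU's actual code; it only illustrates why the new assert catches a refcount that has already dropped to 0 before the decrement:

```python
class Export:
    """Toy model of NBDExport refcounting (illustrative, not QEMU's code)."""

    def __init__(self, name):
        self.refcount = 1   # the creator's reference
        self.name = name

    def close(self):
        # Closing a named export drops the reference held by the name.
        if self.name is not None:
            self.name = None
            self.refcount -= 1

    def put(self):
        # Mirrors nbd_export_put(): a last-reference put on a named export
        # first closes it, which may itself take the refcount to 0.
        if self.refcount == 1:
            self.close()
        assert self.refcount > 0, "refcount already 0 before decrement"
        self.refcount -= 1


exp = Export("drive0")
try:
    exp.put()               # close() already took the refcount to 0
except AssertionError:
    print("assert fired")   # the situation the patch makes visible
```

Dropping the name through a set_name-style path first (so `close()` finds no name reference to release) avoids the problem, which is what the commit message alludes to.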
[Qemu-devel] [PATCH 0/2] add qmp nbd-server-remove
Add command to remove nbd export, pair to nbd-server-add. The whole thing and description are in patch 02. Vladimir Sementsov-Ogievskiy (2): nbd/server: add additional assert to nbd_export_put qmp: add nbd-server-remove qapi/block.json | 20 blockdev-nbd.c | 27 +++ nbd/server.c| 1 + 3 files changed, 48 insertions(+) -- 2.11.1
[Qemu-devel] [PATCH 2/2] qmp: add nbd-server-remove
Add command for export removing. It is needed for cases when we don't want to keep export after the operation on it was completed. The other example is temporary node, created with blockdev-add. If we want to delete it we should firstly remove corresponding NBD export. Signed-off-by: Vladimir Sementsov-Ogievskiy --- qapi/block.json | 20 blockdev-nbd.c | 27 +++ 2 files changed, 47 insertions(+) diff --git a/qapi/block.json b/qapi/block.json index f093fa3f27..1827940717 100644 --- a/qapi/block.json +++ b/qapi/block.json @@ -223,6 +223,26 @@ { 'command': 'nbd-server-add', 'data': {'device': 'str', '*writable': 'bool'} } ## +# @nbd-server-remove: +# +# Stop exporting block node through QEMU's embedded NBD server. +# +# @device: The device name or node name of the exported node. Should be equal +# to @device parameter for corresponding nbd-server-add command call. +# +# @force: Whether active connections to the export should be closed. If this +# parameter is false the export is only removed from named exports list, +# so new connections are impossible and it would be freed after all +# clients are disconnected (default false). +# +# Returns: error if the server is not running or the device is not marked for +# export. 
+# +# Since: 2.12 +## +{ 'command': 'nbd-server-remove', 'data': {'device': 'str', '*force': 'bool'} } + +## # @nbd-server-stop: # # Stop QEMU's embedded NBD server, and unregister all devices previously diff --git a/blockdev-nbd.c b/blockdev-nbd.c index 28f551a7b0..5f66951c33 100644 --- a/blockdev-nbd.c +++ b/blockdev-nbd.c @@ -203,6 +203,33 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable, nbd_export_put(exp); } +void qmp_nbd_server_remove(const char *device, bool has_force, bool force, + Error **errp) +{ +NBDExport *exp; + +if (!nbd_server) { +error_setg(errp, "NBD server not running"); +return; +} + +exp = nbd_export_find(device); +if (exp == NULL) { +error_setg(errp, "'%s' is not exported", device); +return; +} + +if (!has_force) { +force = false; +} + +if (force) { +nbd_export_close(exp); +} else { +nbd_export_set_name(exp, NULL); +} +} + void qmp_nbd_server_stop(Error **errp) { nbd_export_close_all(); -- 2.11.1
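As a usage sketch, here is what the add/remove pair would look like on the QMP wire. The command and argument names come from the schema in this patch; the device name and the framing are illustrative (a QMP client sends one JSON object per command):

```python
import json

# nbd-server-add followed by nbd-server-remove, per the schema in this patch.
# 'force' is optional; when omitted it defaults to false, so the export name
# is dropped but already-connected clients stay until they disconnect.
commands = [
    {"execute": "nbd-server-add",
     "arguments": {"device": "drive0", "writable": False}},
    {"execute": "nbd-server-remove",
     "arguments": {"device": "drive0", "force": True}},
]

for cmd in commands:
    line = json.dumps(cmd)
    print(line)  # one JSON object per line on the monitor socket
```

With `force: true` the server would close active connections immediately, matching the `nbd_export_close()` branch in the qmp_nbd_server_remove() implementation above.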
[Qemu-devel] [PULL for-2.11 2/2] target/s390x: Finish implementing RISBGN
From: Richard Henderson We added the entry to insn-data.def, but failed to update op_risbg to match. No need to special-case the imask inversion, since that is already ~0 for RISBG (and now RISBGN). Fixes: 375ee58bedcda359011fe7fa99e0647f66f9ffa0 Fixes: https://bugs.launchpad.net/qemu/+bug/1701798 (s390x part) Signed-off-by: Richard Henderson Message-Id: <20171107145546.767-1-richard.hender...@linaro.org> Reviewed-by: Thomas Huth Tested-by: Peter Maydell Signed-off-by: Cornelia Huck --- target/s390x/translate.c | 9 +++-- 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/target/s390x/translate.c b/target/s390x/translate.c index dee72a787d..85d0a6c3af 100644 --- a/target/s390x/translate.c +++ b/target/s390x/translate.c @@ -3432,6 +3432,7 @@ static ExitStatus op_risbg(DisasContext *s, DisasOps *o) /* Adjust the arguments for the specific insn. */ switch (s->fields->op2) { case 0x55: /* risbg */ +case 0x59: /* risbgn */ i3 &= 63; i4 &= 63; pmask = ~0; break; @@ -3447,7 +3448,7 @@ static ExitStatus op_risbg(DisasContext *s, DisasOps *o) pmask = 0xull; break; default: -abort(); +g_assert_not_reached(); } /* MASK is the set of bits to be inserted from R2. @@ -3464,11 +3465,7 @@ static ExitStatus op_risbg(DisasContext *s, DisasOps *o) insns, we need to keep the other half of the register. */ imask = ~mask | ~pmask; if (do_zero) { -if (s->fields->op2 == 0x55) { -imask = 0; -} else { -imask = ~pmask; -} +imask = ~pmask; } len = i4 - i3 + 1; -- 2.13.6
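The point about the redundant special case can be checked numerically. Below is a hedged Python sketch of the mask computation in op_risbg (IBM bit numbering, bit 0 = most significant; the helper name is ours, not QEMU's). For RISBG and RISBGN, pmask is ~0, so the do_zero path already yields imask == 0 without the removed op2 == 0x55 branch:

```python
M64 = (1 << 64) - 1

def risbg_masks(i3, i4, do_zero, pmask=M64):
    """Sketch of the mask logic in op_risbg (bit 0 is the MSB)."""
    hi = M64 >> i3                   # bits i3..63 set
    lo = (M64 << (63 - i4)) & M64    # bits 0..i4 set
    # MASK: bits i3..i4 inclusive, wrapping around when i3 > i4.
    mask = hi & lo if i3 <= i4 else (hi | lo) & M64
    # IMASK: bits of R1 to keep; for the 32-bit variants pmask limits the
    # affected half, so the untouched half is always kept.
    imask = (~mask | ~pmask) & M64
    if do_zero:
        imask = ~pmask & M64         # == 0 when pmask == ~0 (RISBG/RISBGN)
    return mask, imask

# Insert bits 8..15 and zero the rest: the old special case (imask = 0
# for op2 == 0x55) falls out of the generic expression.
mask, imask = risbg_masks(8, 15, do_zero=True)
assert mask == 0x00FF000000000000 and imask == 0
```

The wrap-around case (i3 > i4) selects both ends of the register, which is why the mask is built from the union of the two partial masks there.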
[Qemu-devel] [PULL for-2.11 0/2] s390x changes for 2.11-rc1
The following changes since commit b0fbe46ad82982b289a44ee2495b59b0bad8a842: Update version for v2.11.0-rc0 release (2017-11-07 16:05:28 +) are available in the git repository at: git://github.com/cohuck/qemu tags/s390x-20171109 for you to fetch changes up to fdaae351435147b9be6161d0f136ca7c40308059: target/s390x: Finish implementing RISBGN (2017-11-09 10:36:06 +0100) s390x changes: let pci devices start out in a usable state, and make RISBGN work in tcg. Christian Borntraeger (1): s390x/pci: let pci devices start in configured mode Richard Henderson (1): target/s390x: Finish implementing RISBGN hw/s390x/s390-pci-bus.c | 2 +- target/s390x/translate.c | 9 +++-- 2 files changed, 4 insertions(+), 7 deletions(-) -- 2.13.6
[Qemu-devel] [PULL for-2.11 1/2] s390x/pci: let pci devices start in configured mode
From: Christian Borntraeger Currently, to enable a pci device in the guest, the user has to issue echo 1 > /sys/bus/pci/slots//power. This is not what people expect. On an LPAR, the user can put a PCI device in configured or deconfigured state via IOCDS. The "start in deconfigured state" can be used for "sharing" a pci function across LPARs. This is not what we are going to use in KVM, so always start configured. Signed-off-by: Christian Borntraeger Acked-by: Yi Min Zhao Reviewed-by: Pierre Morel Message-Id: <20171107175455.73793-2-borntrae...@de.ibm.com> Signed-off-by: Cornelia Huck --- hw/s390x/s390-pci-bus.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/s390x/s390-pci-bus.c b/hw/s390x/s390-pci-bus.c index e7a58e81f7..2b1e1409bf 100644 --- a/hw/s390x/s390-pci-bus.c +++ b/hw/s390x/s390-pci-bus.c @@ -715,7 +715,7 @@ static void s390_pcihost_hot_plug(HotplugHandler *hotplug_dev, pbdev->pdev = pdev; pbdev->iommu = s390_pci_get_iommu(s, pdev->bus, pdev->devfn); pbdev->iommu->pbdev = pbdev; -pbdev->state = ZPCI_FS_STANDBY; +pbdev->state = ZPCI_FS_DISABLED; if (s390_pci_msix_init(pbdev)) { error_setg(errp, "MSI-X support is mandatory " -- 2.13.6
Re: [Qemu-devel] Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage migration)
On Thu, 11/09 16:14, Max Reitz wrote: > On 2017-11-09 05:21, Fam Zheng wrote: > > On Thu, 11/09 01:48, Max Reitz wrote: > >> Hi, > >> > >> More exciting news from the bdrv_drain() front! > >> > >> I've noticed in the past that iotest 194 sometimes hangs. I usually run > >> the tests on tmpfs, but I've just now verified that it happens on my SSD > >> just as well. > >> > >> So the reproducer is a plain: > >> > >> while ./check -raw 194; do; done > > > > I cannot produce it on my machine. > > Hm, too bad. I see it both on my work laptop (with Fedora) and my > desktop (with Arch)... > > >> (No difference between raw or qcow2, though.) > >> > >> And then, after a couple of runs (or a couple ten), it will just hang. > >> The reason is that the source VM lingers around and doesn't quit > >> voluntarily -- the test itself was successful, but it just can't exit. > >> > >> If you force it to exit by killing the VM (e.g. through pkill -11 qemu), > >> this is the backtrace: > >> > >> #0 0x7f7cfc297e06 in ppoll () at /lib64/libc.so.6 > >> #1 0x563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, > >> __nfds=, __fds=) at > >> /usr/include/bits/poll2.h:77 > > > > Looking at the 0 timeout it seems we are in the aio_poll(ctx, > > blocking=false); > > branches of BDRV_POLL_WHILE? Is it a busy loop? If so I wonder what is > > making > > progress and causing the return value to be true in aio_poll(). 
> > > >> #2 0x563b846bcac9 in qemu_poll_ns (fds=, > >> nfds=, timeout=) at util/qemu-timer.c:322 > >> #3 0x563b846be711 in aio_poll (ctx=ctx@entry=0x563b856e3e80, > >> blocking=) at util/aio-posix.c:629 > >> #4 0x563b8463afa4 in bdrv_drain_recurse > >> (bs=bs@entry=0x563b865568a0, begin=begin@entry=true) at block/io.c:201 > >> #5 0x563b8463baff in bdrv_drain_all_begin () at block/io.c:381 > >> #6 0x563b8463bc99 in bdrv_drain_all () at block/io.c:411 > >> #7 0x563b8459888b in block_migration_cleanup (opaque= >> out>) at migration/block.c:714 > >> #8 0x563b845883be in qemu_savevm_state_cleanup () at > >> migration/savevm.c:1251 > >> #9 0x563b845811fd in migration_thread (opaque=0x563b856f1da0) at > >> migration/migration.c:2298 > >> #10 0x7f7cfc56f36d in start_thread () at /lib64/libpthread.so.0 > >> #11 0x7f7cfc2a3e1f in clone () at /lib64/libc.so.6 > >> > >> > >> And when you make bdrv_drain_all_begin() print what we are trying to > >> drain, you can see that it's the format node (managed by the "raw" > >> driver in this case). > > > > So what is the value of bs->in_flight? > > gdb: > > (gdb) print bs->in_flight > > $3 = 2307492233 > > "That's weird, why would it..." 
> > > (gdb) print *bs > > $4 = {open_flags = -1202160144, read_only = 161, encrypted = 85, sg = > > false, probed = false, force_share = 96, implicit = 159, drv = 0x0, opaque > > = 0x0, aio_context = 0x8989898989898989, aio_notifiers = {lh_first = > > 0x8989898989898989}, > > walking_aio_notifiers = 137, filename = '\211' , > > backing_file = '\211' , backing_format = '\211' > > , full_open_options = 0x8989898989898989, > > exact_filename = '\211' , backing = > > 0x8989898989898989, file = 0x8989898989898989, bl = {request_alignment = > > 2307492233, max_pdiscard = -1987475063, pdiscard_alignment = 2307492233, > > max_pwrite_zeroes = -1987475063, pwrite_zeroes_alignment = 2307492233, > > opt_transfer = 2307492233, max_transfer = 2307492233, min_mem_alignment = > > 9910603678816504201, opt_mem_alignment = 9910603678816504201, max_iov = > > -1987475063}, > > supported_write_flags = 2307492233, supported_zero_flags = 2307492233, > > node_name = '\211' , node_list = {tqe_next = > > 0x8989898989898989, tqe_prev = 0x8989898989898989}, bs_list = {tqe_next = > > 0x8989898989898989, > > tqe_prev = 0x8989898989898989}, monitor_list = {tqe_next = > > 0x8989898989898989, tqe_prev = 0x8989898989898989}, refcnt = -1987475063, > > op_blockers = {{lh_first = 0x8989898989898989} }, job = > > 0x8989898989898989, > > inherits_from = 0x8989898989898989, children = {lh_first = > > 0x8989898989898989}, parents = {lh_first = 0x8989898989898989}, options = > > 0x8989898989898989, explicit_options = 0x8989898989898989, detect_zeroes = > > 2307492233, > > backing_blocker = 0x8989898989898989, total_sectors = > > -8536140394893047415, before_write_notifiers = {notifiers = {lh_first = > > 0x8989898989898989}}, write_threshold_offset = 9910603678816504201, > > write_threshold_notifier = {notify = > > 0x8989898989898989, node = {le_next = 0x8989898989898989, le_prev = > > 0x8989898989898989}}, dirty_bitmap_mutex = {lock = {__data = {__lock = > > -1987475063, __count = 2307492233, __owner = 
-1987475063, __nusers = > > 2307492233, > > __kind = -1987475063, __spins = -30327, __elision = -30327, __list > > = {__prev = 0x8989898989898989, __next = 0x8989898989898989}}, __size = > > '\211' , __align = -8536140394893047415}, initialized = > > 137}, > > dirty_bitmaps
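One detail worth noting about the gdb dump above: every field prints as the same repeating byte. Assuming the 0x89 fill is a free-memory poison pattern (a guess from the byte pattern, not something the thread states explicitly), the "weird" values decode consistently, which points at a use-after-free of the BlockDriverState rather than a genuinely stuck counter:

```python
# Values copied from the gdb dump above.
in_flight     = 2307492233            # printed as unsigned 32-bit
refcnt        = -1987475063           # printed as signed 32-bit
total_sectors = -8536140394893047415  # printed as signed 64-bit

# All three are the byte 0x89 repeated across the field width,
# i.e. the whole structure reads as poisoned/freed memory.
assert in_flight == 0x89898989
assert refcnt & 0xFFFFFFFF == 0x89898989
assert total_sectors & 0xFFFFFFFFFFFFFFFF == 0x8989898989898989
```

So bdrv_drain_recurse() is polling on an `in_flight` counter read from memory that was already released, which would explain the busy loop in BDRV_POLL_WHILE never terminating.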
Re: [Qemu-devel] [PATCH V4] hw/pcie-pci-bridge: restrict to X86 and ARM
On 09/11/2017 16:37, Fam Zheng wrote: On Thu, 11/09 16:08, Marcel Apfelbaum wrote: make[1]: *** No rule to make target `../hw/pci-bridge/pcie_pci_bridge.o', needed by `qemu-system-aarch64'. Stop. make[1]: *** Waiting for unfinished jobs/tmp/qemu-test/src/target/arm/translate-a64.c: In function 'handle_shri_with_rndacc': /tmp/qemu-test/src/target/arm/translate-a64.c:6390: warning: 'tcg_src_hi' may be used uninitialized in this function /tmp/qemu-test/src/target/arm/translate-a64.c: In function 'disas_simd_scalar_two_reg_misc': /tmp/qemu-test/src/target/arm/translate-a64.c:8117: warning: 'rmode' may be used uninitialized in this function make: *** [subdir-aarch64-softmmu] Error 2 Traceback (most recent call last): File "./tests/docker/docker.py", line 385, in sys.exit(main()) File "./tests/docker/docker.py", line 382, in main return args.cmdobj.run(args, argv) File "./tests/docker/docker.py", line 239, in run return Docker().run(argv, args.keep, quiet=args.quiet) File "./tests/docker/docker.py", line 207, in run quiet=quiet) File "./tests/docker/docker.py", line 125, in _do_check return subprocess.check_call(self._command + cmd, **kwargs) File "/usr/lib64/python2.7/subprocess.py", line 186, in check_call raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['docker', 'run', '--label', 'com.qemu.instance.uuid=561e8180c54e11e79bd052540069c830', '-u', '0', '--security-opt', 'seccomp=unconfined', '--rm', '--net=none', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=8', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/root/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-gp94i687/src/docker-src.2017-11-09-08.03.01.2879:/var/tmp/qemu:z,ro', 'qemu:centos6', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2 make[1]: *** [tests/docker/Makefile.include:129: docker-run] Error 1 make[1]: Leaving directory '/var/tmp/patchew-tester-tmp-gp94i687/src' 
make: *** [tests/docker/Makefile.include:163: docker-run-test-quick@centos6] Error 2 real 1m28.235s user 0m4.283s sys 0m4.399s === OUTPUT END === Test command exited with code: 2 --- Email generated automatically by Patchew [http://patchew.org/]. Please send your feedback to patchew-de...@freelists.org Hi Fam, I can compile just fine on my workstation, can you please advise? I can see the same error with make docker-run-test-quick@centos6 V=1 J=8 and also a normal out-of-tree build with $QEMU_SRC/configure --enable-debug --target-list=x86_64-softmmu,aarch64-softmmu Yes, I succeeded to reproduce it, thanks! Can you install docker and try it? I wanted to try it eventually, now it seems a good time. I will let you know if I hit any problems. Thanks, Marcel Fam
[Qemu-devel] [Bug 1731277] Re: Provide target specific qemu man pages
I'm kind of hoping that moving to Sphinx for our docs toolchain will allow us to for instance have the board specific information in doc comments in each board source file, which could then be automatically assembled into the right documentation. The current manpages are autobuilt from the monolithic qemu-doc.texi, which is woefully out of date in many areas... -- You received this bug notification because you are a member of qemu-devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1731277 Title: Provide target specific qemu man pages Status in QEMU: New Bug description: Right now, all qemu target binaries (qemu-system-...) share the same man page. The current man page is primarily focused on x86, and therefore the information given is entirely wrong for e.g. arm, powerpc or s390x. NAME qemu-doc - QEMU Emulator User Documentation SYNOPSIS qemu-system-i386 [options] [disk_image] DESCRIPTION The QEMU PC System emulator simulates the following peripherals: - i440FX host PCI bridge and PIIX3 PCI to ISA bridge - Cirrus CLGD 5446 PCI VGA card or dummy VGA card with Bochs VESA extensions (hardware level, including all non standard modes). - PS/2 mouse and keyboard - 2 PCI IDE interfaces with hard disk and CD-ROM support - Floppy disk ... We should have target specific man pages, with the common options/settings factored out, so they are included in all target specific man pages. "man qemu-system-s390x" should give s390x specific (+common) information and "man qemu-system-x86_64" should contain x86 specific (+common) information. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1731277/+subscriptions
Re: [Qemu-devel] [PATCH V4] hw/pcie-pci-bridge: restrict to X86 and ARM
On 09/11/2017 16:29, Philippe Mathieu-Daudé wrote: Hi Marcel, On 11/09/2017 11:10 AM, Marcel Apfelbaum wrote: On 09/11/2017 16:03, Thomas Huth wrote: On 09.11.2017 14:04, no-re...@patchew.org wrote: Hi, This series failed automatic build test. Please find the testing commands and their output below. If you have docker installed, you can probably reproduce it locally. [...] CC aarch64-softmmu/hw/scsi/virtio-scsi.o CC aarch64-softmmu/hw/scsi/virtio-scsi-dataplane.o make[1]: *** No rule to make target `../hw/pci-bridge/pcie_pci_bridge.o', needed by `qemu-system-x86_64'. Stop. make[1]: *** Waiting for unfinished jobs CC aarch64-softmmu/hw/scsi/vhost-scsi-common.o CC aarch64-softmmu/hw/scsi/vhost-scsi.o CC aarch64-softmmu/hw/scsi/vhost-user-scsi.o CC aarch64-softmmu/hw/sd/omap_mmc.o CC aarch64-softmmu/hw/sd/pxa2xx_mmci.o CC aarch64-softmmu/hw/sd/bcm2835_sdhost.o make: *** [subdir-x86_64-softmmu] Error 2 make: *** Waiting for unfinished jobs CC aarch64-softmmu/hw/ssi/omap_spi.o CC aarch64-softmmu/hw/ssi/imx_spi.o CC aarch64-softmmu/hw/timer/exynos4210_mct.o CC aarch64-softmmu/hw/timer/exynos4210_pwm.o [...] CC aarch64-softmmu/hw/timer/exynos4210_rtc.o CC aarch64-softmmu/target/arm/translate-a64.o make[1]: *** No rule to make target `../hw/pci-bridge/pcie_pci_bridge.o', needed by `qemu-system-aarch64'. Stop. make[1]: *** Waiting for unfinished jobs/tmp/qemu-test/src/target/arm/translate-a64.c: In function 'handle_shri_with_rndacc': Hmm, looks like this was not working as expected? In my workstation works just fine (clean and compiled all archs), I sent a mail to Fam. Applying your patch on v2.11.0-rc0 I also get this: make[1]: *** No rule to make target '../hw/pci-bridge/pcie_pci_bridge.o', needed by 'qemu-system-aarch64'. Stop. 
Makefile:383: recipe for target 'subdir-aarch64-softmmu' failed make: *** [subdir-aarch64-softmmu] Error 2 $ fgrep CONFIG_PCIE config-all-devices.mak CONFIG_PCIE_PCI_BRIDGE:=$(findstring y,$(CONFIG_PCIE_PCI_BRIDGE)$(CONFIG_PCIE_PORT)) CONFIG_PCIE_PORT:=$(findstring y,$(CONFIG_PCIE_PORT)y) $ make subdir-aarch64-softmmu --debug=v 2>/dev/null | fgrep -i pcie_pci_bridge Reading makefile '../hw/pci-bridge/pcie_pci_bridge.d' (search path) (don't care) (no ~ expansion)... Considering target file '../hw/pci-bridge/pcie_pci_bridge.o'. File '../hw/pci-bridge/pcie_pci_bridge.o' does not exist. Finished prerequisites of target file '../hw/pci-bridge/pcie_pci_bridge.o'. Must remake target '../hw/pci-bridge/pcie_pci_bridge.o'. Hi Philippe, Thanks for trying it! I succeeded to reproduce with an out of tree build. For some reason the make system doesn't work as I would want to. I'll resend a working version, Thanks, Marcel
Re: [Qemu-devel] Intermittent hang of iotest 194 (bdrv_drain_all after non-shared storage migration)
On 2017-11-09 05:21, Fam Zheng wrote: > On Thu, 11/09 01:48, Max Reitz wrote: >> Hi, >> >> More exciting news from the bdrv_drain() front! >> >> I've noticed in the past that iotest 194 sometimes hangs. I usually run >> the tests on tmpfs, but I've just now verified that it happens on my SSD >> just as well. >> >> So the reproducer is a plain: >> >> while ./check -raw 194; do; done > > I cannot produce it on my machine. Hm, too bad. I see it both on my work laptop (with Fedora) and my desktop (with Arch)... >> (No difference between raw or qcow2, though.) >> >> And then, after a couple of runs (or a couple ten), it will just hang. >> The reason is that the source VM lingers around and doesn't quit >> voluntarily -- the test itself was successful, but it just can't exit. >> >> If you force it to exit by killing the VM (e.g. through pkill -11 qemu), >> this is the backtrace: >> >> #0 0x7f7cfc297e06 in ppoll () at /lib64/libc.so.6 >> #1 0x563b846bcac9 in ppoll (__ss=0x0, __timeout=0x0, >> __nfds=, __fds=) at >> /usr/include/bits/poll2.h:77 > > Looking at the 0 timeout it seems we are in the aio_poll(ctx, blocking=false); > branches of BDRV_POLL_WHILE? Is it a busy loop? If so I wonder what is making > progress and causing the return value to be true in aio_poll(). 
> >> #2 0x563b846bcac9 in qemu_poll_ns (fds=, >> nfds=, timeout=) at util/qemu-timer.c:322 >> #3 0x563b846be711 in aio_poll (ctx=ctx@entry=0x563b856e3e80, >> blocking=) at util/aio-posix.c:629 >> #4 0x563b8463afa4 in bdrv_drain_recurse >> (bs=bs@entry=0x563b865568a0, begin=begin@entry=true) at block/io.c:201 >> #5 0x563b8463baff in bdrv_drain_all_begin () at block/io.c:381 >> #6 0x563b8463bc99 in bdrv_drain_all () at block/io.c:411 >> #7 0x563b8459888b in block_migration_cleanup (opaque=> out>) at migration/block.c:714 >> #8 0x563b845883be in qemu_savevm_state_cleanup () at >> migration/savevm.c:1251 >> #9 0x563b845811fd in migration_thread (opaque=0x563b856f1da0) at >> migration/migration.c:2298 >> #10 0x7f7cfc56f36d in start_thread () at /lib64/libpthread.so.0 >> #11 0x7f7cfc2a3e1f in clone () at /lib64/libc.so.6 >> >> >> And when you make bdrv_drain_all_begin() print what we are trying to >> drain, you can see that it's the format node (managed by the "raw" >> driver in this case). > > So what is the value of bs->in_flight? gdb: > (gdb) print bs->in_flight > $3 = 2307492233 "That's weird, why would it..." 
> (gdb) print *bs > $4 = {open_flags = -1202160144, read_only = 161, encrypted = 85, sg = false, > probed = false, force_share = 96, implicit = 159, drv = 0x0, opaque = 0x0, > aio_context = 0x8989898989898989, aio_notifiers = {lh_first = > 0x8989898989898989}, > walking_aio_notifiers = 137, filename = '\211' , > backing_file = '\211' , backing_format = '\211' 16 times>, full_open_options = 0x8989898989898989, > exact_filename = '\211' , backing = 0x8989898989898989, > file = 0x8989898989898989, bl = {request_alignment = 2307492233, max_pdiscard > = -1987475063, pdiscard_alignment = 2307492233, > max_pwrite_zeroes = -1987475063, pwrite_zeroes_alignment = 2307492233, > opt_transfer = 2307492233, max_transfer = 2307492233, min_mem_alignment = > 9910603678816504201, opt_mem_alignment = 9910603678816504201, max_iov = > -1987475063}, > supported_write_flags = 2307492233, supported_zero_flags = 2307492233, > node_name = '\211' , node_list = {tqe_next = > 0x8989898989898989, tqe_prev = 0x8989898989898989}, bs_list = {tqe_next = > 0x8989898989898989, > tqe_prev = 0x8989898989898989}, monitor_list = {tqe_next = > 0x8989898989898989, tqe_prev = 0x8989898989898989}, refcnt = -1987475063, > op_blockers = {{lh_first = 0x8989898989898989} }, job = > 0x8989898989898989, > inherits_from = 0x8989898989898989, children = {lh_first = > 0x8989898989898989}, parents = {lh_first = 0x8989898989898989}, options = > 0x8989898989898989, explicit_options = 0x8989898989898989, detect_zeroes = > 2307492233, > backing_blocker = 0x8989898989898989, total_sectors = -8536140394893047415, > before_write_notifiers = {notifiers = {lh_first = 0x8989898989898989}}, > write_threshold_offset = 9910603678816504201, write_threshold_notifier = > {notify = > 0x8989898989898989, node = {le_next = 0x8989898989898989, le_prev = > 0x8989898989898989}}, dirty_bitmap_mutex = {lock = {__data = {__lock = > -1987475063, __count = 2307492233, __owner = -1987475063, __nusers = > 2307492233, > __kind = -1987475063, 
__spins = -30327, __elision = -30327, __list = > {__prev = 0x8989898989898989, __next = 0x8989898989898989}}, __size = '\211' > , __align = -8536140394893047415}, initialized = 137}, > dirty_bitmaps = {lh_first = 0x8989898989898989}, wr_highest_offset = {value > = 9910603678816504201}, copy_on_read = -1987475063, in_flight = 2307492233, > serialising_in_flight = 2307492233, wakeup = 137, io_plugged = 2307492233, > enable_write_cache = -1987475063,
Re: [Qemu-devel] [PATCH] Add missing parameters to mon option documentation
On Thu, Nov 09, 2017 at 01:19:03PM +0100, Vicente Jimenez Aguilar wrote: > Documentation missed 'mon' option's 'pretty' and 'default' parameters > > Signed-off-by: Vicente Jimenez Aguilar > --- > qemu-options.hx | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/qemu-options.hx b/qemu-options.hx > index 3728e9b4dd..72cf48a8e5 100644 > --- a/qemu-options.hx > +++ b/qemu-options.hx > @@ -3477,9 +3477,9 @@ Like -qmp but uses pretty JSON formatting. > ETEXI > > DEF("mon", HAS_ARG, QEMU_OPTION_mon, \ > -"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL) > +"-mon > [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]\n", > QEMU_ARCH_ALL) > STEXI > -@item -mon [chardev=]name[,mode=readline|control] > +@item -mon > [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]] The 'default' option is deprecated and unused, so it is intentionally not documented: https://qemu.weilnetz.de/doc/qemu-doc.html#g_t_002dmon-default_003don-_0028since-2_002e4_002e0_0029 Regards, Daniel -- |: https://berrange.com -o-https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o-https://fstop138.berrange.com :| |: https://entangle-photo.org-o-https://www.instagram.com/dberrange :|
Re: [Qemu-devel] [PATCH V3] hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
On Wed, Nov 08, 2017 at 04:48:09PM +0200, Marcel Apfelbaum wrote: > Currently there is no MMIO range over 4G > reserved for PCI hotplug. Since the 32bit PCI hole > depends on the number of cold-plugged PCI devices > and other factors, it is very possible that it is too small > to hotplug PCI devices with large BARs. > > Fix it by reserving 2G for the I440FX chipset > in order to comply with older Win32 Guest OSes > and 32G for the Q35 chipset. > > Note this is a regression since prev QEMU versions had > some range reserved for 64bit PCI hotplug. > > Signed-off-by: Marcel Apfelbaum Looks sane to me, but could we add some documentation please? Short term code comments. Long term let's add a document. > --- > > V2 -> V3: > - Addressed Gerd's and others comments and re-enabled the pci-hole64-size >property defaulting it to 2G for I440FX and 32G for Q35. > - Even if the new defaults of pci-hole64-size will appear in "info qtree" >also for older machines, the property was not implemented so >no changes will be visible to guests. 
> V1 -> V2:
> Addressed Igor's comments:
>  - aligned the hole64 start to 1Gb
>    (I think all the computations took care of it already,
>    but it can't hurt)
>  - Init compat props to "off" instead of "false"
>
> Thanks,
> Marcel
>
>  hw/i386/pc.c              | 18 ++++++++++++++++++
>  hw/pci-host/piix.c        | 13 ++++++++++++-
>  hw/pci-host/q35.c         | 17 +++++++++++++++--
>  include/hw/i386/pc.h      | 10 +++++++++-
>  include/hw/pci-host/q35.h |  1 +
>  5 files changed, 55 insertions(+), 4 deletions(-)
>
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index e11a65b545..af3177cca5 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1448,6 +1448,24 @@ void pc_memory_init(PCMachineState *pcms,
>      pcms->ioapic_as = &address_space_memory;
>  }
>
> +uint64_t pc_pci_hole64_start(void)
> +{
> +    PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    uint64_t hole64_start = 0;
> +
> +    if (pcmc->has_reserved_memory && pcms->hotplug_memory.base) {
> +        hole64_start = pcms->hotplug_memory.base;
> +        if (!pcmc->broken_reserved_end) {
> +            hole64_start += memory_region_size(&pcms->hotplug_memory.mr);
> +        }
> +    } else {
> +        hole64_start = 0x100000000ULL + pcms->above_4g_mem_size;
> +    }
> +
> +    return ROUND_UP(hole64_start, 1ULL << 30);
> +}
> +
>  qemu_irq pc_allocate_cpu_irq(void)
>  {
>      return qemu_allocate_irq(pic_irq_request, NULL, 0);
> diff --git a/hw/pci-host/piix.c b/hw/pci-host/piix.c
> index a7e2256870..e033ee7df8 100644
> --- a/hw/pci-host/piix.c
> +++ b/hw/pci-host/piix.c
> @@ -50,6 +50,7 @@ typedef struct I440FXState {
>      PCIHostState parent_obj;
>      Range pci_hole;
>      uint64_t pci_hole64_size;
> +    bool pci_hole64_fix;
>      uint32_t short_root_bus;
>  } I440FXState;
>
> @@ -243,11 +244,15 @@ static void i440fx_pcihost_get_pci_hole64_start(Object *obj, Visitor *v,
>                                                  void *opaque, Error **errp)
>  {
>      PCIHostState *h = PCI_HOST_BRIDGE(obj);
> +    I440FXState *s = I440FX_PCI_HOST_BRIDGE(obj);
>      Range w64;
>      uint64_t value;
>
>      pci_bus_get_w64_range(h->bus, &w64);
>      value = range_is_empty(&w64) ? 0 : range_lob(&w64);
> +    if (!value && s->pci_hole64_fix) {
> +        value = pc_pci_hole64_start();
> +    }
>      visit_type_uint64(v, name, &value, errp);
>  }
>
> @@ -256,11 +261,16 @@ static void i440fx_pcihost_get_pci_hole64_end(Object *obj, Visitor *v,
>                                                Error **errp)
>  {
>      PCIHostState *h = PCI_HOST_BRIDGE(obj);
> +    I440FXState *s = I440FX_PCI_HOST_BRIDGE(obj);
> +    uint64_t hole64_start = pc_pci_hole64_start();
>      Range w64;
>      uint64_t value;
>
>      pci_bus_get_w64_range(h->bus, &w64);
>      value = range_is_empty(&w64) ? 0 : range_upb(&w64) + 1;
> +    if (s->pci_hole64_fix && value < (hole64_start + s->pci_hole64_size)) {
> +        value = hole64_start + s->pci_hole64_size;
> +    }
>      visit_type_uint64(v, name, &value, errp);
>  }
>
> @@ -863,8 +873,9 @@ static const char *i440fx_pcihost_root_bus_path(PCIHostState *host_bridge,
>
>  static Property i440fx_props[] = {
>      DEFINE_PROP_SIZE(PCI_HOST_PROP_PCI_HOLE64_SIZE, I440FXState,
> -                     pci_hole64_size, DEFAULT_PCI_HOLE64_SIZE),
> +                     pci_hole64_size, 1ULL << 31),
>      DEFINE_PROP_UINT32("short_root_bus", I440FXState, short_root_bus, 0),
> +    DEFINE_PROP_BOOL("x-pci-hole64-fix", I440FXState, pci_hole64_fix, true),
>      DEFINE_PROP_END_OF_LIST(),
>  };
>
> diff --git a/hw/pci-host/q35.c b/hw/pci-host/q35.c
> index ddaa7d1b44..f95337c13a 100644
> --- a/hw/pci-host/q35.c
> +++ b/hw/pci-host/q35.c
> @@ -104,11 +104,15 @@ static void q35_host_get_pci_hole64_start(Object *obj, Visitor *v,
>                                            Error **errp)
>  {
>      PCIHostState
[Qemu-devel] [PATCH 2/2] Add mon default parameter to help documentation
Spotted after reading mon_init_func from vl.c

Signed-off-by: Vicente Jimenez Aguilar
---
 qemu-options.hx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 2692a48f63..12192a20e4 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3477,7 +3477,7 @@ Like -qmp but uses pretty JSON formatting.
 ETEXI
 
 DEF("mon", HAS_ARG, QEMU_OPTION_mon, \
-"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]]\n", QEMU_ARCH_ALL)
+"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]\n", QEMU_ARCH_ALL)
 STEXI
 @item -mon [chardev=]name[,mode=readline|control]
 @findex -mon
-- 
2.14.1
[Qemu-devel] [PATCH 1/2] Add mon pretty parameter to help documentation
Signed-off-by: Vicente Jimenez Aguilar
---
 qemu-options.hx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3728e9b4dd..2692a48f63 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3477,7 +3477,7 @@ Like -qmp but uses pretty JSON formatting.
 ETEXI
 
 DEF("mon", HAS_ARG, QEMU_OPTION_mon, \
-"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL)
+"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]]\n", QEMU_ARCH_ALL)
 STEXI
 @item -mon [chardev=]name[,mode=readline|control]
 @findex -mon
-- 
2.14.1
[Qemu-devel] [PATCH] Add missing parameters to mon option documentation
Documentation missed 'mon' option's 'pretty' and 'default' parameters

Signed-off-by: Vicente Jimenez Aguilar
---
 qemu-options.hx | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3728e9b4dd..72cf48a8e5 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3477,9 +3477,9 @@ Like -qmp but uses pretty JSON formatting.
 ETEXI
 
 DEF("mon", HAS_ARG, QEMU_OPTION_mon, \
-"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL)
+"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]\n", QEMU_ARCH_ALL)
 STEXI
-@item -mon [chardev=]name[,mode=readline|control]
+@item -mon [chardev=]name[,mode=readline|control][,pretty[=on|off]][,default[=on|off]]
 @findex -mon
 Setup monitor on chardev @var{name}.
 ETEXI
-- 
2.14.1
[Qemu-devel] [PATCH] Add mon pretty parameter to help documentation
Signed-off-by: Vicente Jimenez Aguilar
---
 qemu-options.hx | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/qemu-options.hx b/qemu-options.hx
index 3728e9b4dd..2692a48f63 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -3477,7 +3477,7 @@ Like -qmp but uses pretty JSON formatting.
 ETEXI
 
 DEF("mon", HAS_ARG, QEMU_OPTION_mon, \
-"-mon [chardev=]name[,mode=readline|control]\n", QEMU_ARCH_ALL)
+"-mon [chardev=]name[,mode=readline|control][,pretty[=on|off]]\n", QEMU_ARCH_ALL)
 STEXI
 @item -mon [chardev=]name[,mode=readline|control]
 @findex -mon
-- 
2.14.1
Re: [Qemu-devel] [RISU PATCH 07/10] risugen: add --sve support
Dave Martin writes:

> On Tue, Nov 07, 2017 at 03:05:55PM +, Alex Bennée wrote:
>> This is similar to the approach used by the FP/simd data in so far as
>> we generate a block of random data and then load into it. As there are
>> no post-index SVE operations we need to emit an additional incp
>> instruction to generate our offset into the array.
>>
>> Signed-off-by: Alex Bennée
>> ---
>>  risugen        |  3 +++
>>  risugen_arm.pm | 57 +++++++++++++++++++++++++++++++++++++++++++++---------
>>  2 files changed, 53 insertions(+), 7 deletions(-)
>>
>> diff --git a/risugen b/risugen
>> index aba4bb7..0ac8e86 100755
>> --- a/risugen
>> +++ b/risugen
>> @@ -317,6 +317,7 @@ sub main()
>>      my $condprob = 0;
>>      my $fpscr = 0;
>>      my $fp_enabled = 1;
>> +    my $sve_enabled = 1;
>>      my $big_endian = 0;
>>      my ($infile, $outfile);
>>
>> @@ -334,6 +335,7 @@ sub main()
>>          },
>>          "be" => sub { $big_endian = 1; },
>>          "no-fp" => sub { $fp_enabled = 0; },
>> +        "sve" => sub { $sve_enabled = 1; },
>>      ) or return 1;
>>      # allow "--pattern re,re" and "--pattern re --pattern re"
>>      @pattern_re = split(/,/,join(',',@pattern_re));
>> @@ -361,6 +363,7 @@ sub main()
>>          'fpscr' => $fpscr,
>>          'numinsns' => $numinsns,
>>          'fp_enabled' => $fp_enabled,
>> +        'sve_enabled' => $sve_enabled,
>>          'outfile' => $outfile,
>>          'details' => \%insn_details,
>>          'keys' => \@insn_keys,
>> diff --git a/risugen_arm.pm b/risugen_arm.pm
>> index 2f10d58..8d1e1fd 100644
>> --- a/risugen_arm.pm
>> +++ b/risugen_arm.pm
>> @@ -472,9 +472,47 @@ sub write_random_aarch64_fpdata()
>>      }
>>  }
>>
>> -sub write_random_aarch64_regdata($)
>> +sub write_random_aarch64_svedata()
>>  {
>> -    my ($fp_enabled) = @_;
>> +    # Load SVE registers
>> +    my $align = 16;
>> +    my $vl = 16; # number of vqs
>
> Would this be better phrased
>
>     my $vq = 16;    # quadwords per vector
>
>> +    my $datalen = (32 * $vl * 16) + $align;
>> +
>> +    write_pc_adr(0, (3 * 4) + ($align - 1)); # insn 1
>> +    write_align_reg(0, $align);              # insn 2
>> +    write_jump_fwd($datalen);                # insn 3
>> +
>> +    # align safety
>> +    for (my $i = 0; $i < ($align / 4); $i++) {
>> +        # align with nops
>> +        insn32(0xd503201f);
>> +    };
>> +
>> +    for (my $rt = 0; $rt <= 31; $rt++) {
>> +        for (my $q = 0; $q < $vl; $q++) {
>> +            write_random_fpreg_var(4); # quad
>> +        }
>> +    }
>> +
>> +    # Reset all the predicate registers to all true
>> +    for (my $p = 0; $p < 16; $p++) {
>> +        insn32(0x2518e3e0 | $p);
>> +    }
>> +
>> +    # there is no post index load so we do this by hand
>> +    write_mov_ri(1, 0);
>> +    for (my $rt = 0; $rt <= 31; $rt++) {
>> +        # ld1d z0.d, p0/z, [x0, x1, lsl #3]
>> +        insn32(0xa5e14000 | $rt);
>> +        # incp x1, p0.d
>> +        insn32(0x25ec8801);
>
> You could avoid this with the unpredicated form LDR (vector).
> (LD1x scalar+immediate doesn't provide enough immediate range).
>
>     # ldr z$rt, [x0, #$rt, mul vl]
>     insn32(0x85804000 + $rt + (($rt & 7) << 10) + (($rt & 0x18) << 13));
>
> which is what the kernel does.

Ahh, thanks.

> No harm in exercising different instructions though! The kernel uses
> embarrassingly few.

Well, that's what the actual random instructions are for. I haven't yet
considered if we need to group things together to properly exercise the
predicate code. So far RISU doesn't exercise conditional branches, as it's
hard to do that without generating code that would lock up. It's very much
the "easy 80%" of instructions we target.

> Does it matter that the stride will depend on the actual current VL?
> If x0 just points to a block of random data, I guess it doesn't matter:
> some trailing data remains unused, but that doesn't make the used data
> any less random.

Not really. We do try and re-generate every N instructions so all the
floats don't just denormalise.

>
> [...]
>
> Cheers
> ---Dave

--
Alex Bennée