Re: [Bug 1805256] Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues
Patchew URL: https://patchew.org/QEMU/20191011175536.GB25464@xps13.dannf/ Hi, This series seems to have some coding style problems. See output below for more information: Subject: [Bug 1805256] Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues Type: series Message-id: 20191011175536.GB25464@xps13.dannf === TEST SCRIPT BEGIN === #!/bin/bash git rev-parse base > /dev/null || exit 0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram ./scripts/checkpatch.pl --mailback base.. === TEST SCRIPT END === Switched to a new branch 'test' d3707c3 qemu_futex_wait() lockups in ARM64: 2 possible issues === OUTPUT BEGIN === WARNING: Block comments use a leading /* on a separate line #66: FILE: util/async.c:345: +/* Using atomic_mb_read ensures that e.g. bh->scheduled is written before ERROR: code indent should never use tabs #74: FILE: util/async.c:350: +^Ievent_notifier_set(>notifier);$ ERROR: Missing Signed-off-by: line(s) total: 2 errors, 1 warnings, 33 lines checked Commit d3707c31b5da (qemu_futex_wait() lockups in ARM64: 2 possible issues) has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 The full log is available at http://patchew.org/logs/20191011175536.GB25464@xps13.dannf/testing.checkpatch/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
Re: [Bug 1805256] Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues
Patchew URL: https://patchew.org/QEMU/20191011175536.GB25464@xps13.dannf/ Hi, This series failed the docker-mingw@fedora build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #! /bin/bash export ARCH=x86_64 make docker-image-fedora V=1 NETWORK=1 time make docker-test-mingw@fedora J=14 NETWORK=1 === TEST SCRIPT END === In file included from /tmp/qemu-test/src/include/qemu/osdep.h:51, from /tmp/qemu-test/src/util/async.c:26: /tmp/qemu-test/src/util/async.c: In function 'aio_notify': /tmp/qemu-test/src/include/qemu/compiler.h:86:36: error: static assertion failed: "not expecting: sizeof(*>notify_me) > ATOMIC_REG_SIZE" #define QEMU_BUILD_BUG_MSG(x, msg) _Static_assert(!(x), msg) ^~ /tmp/qemu-test/src/include/qemu/compiler.h:94:30: note: in expansion of macro 'QEMU_BUILD_BUG_MSG' --- /tmp/qemu-test/src/util/async.c:349:9: note: in expansion of macro 'atomic_mb_read' if (atomic_mb_read(>notify_me)) { ^~ make: *** [/tmp/qemu-test/src/rules.mak:69: util/async.o] Error 1 make: *** Waiting for unfinished jobs Traceback (most recent call last): File "./tests/docker/docker.py", line 662, in --- raise CalledProcessError(retcode, cmd) subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=b90449ea684e4a14ad6b34f4f5dcb0c9', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-0y2eb54d/src/docker-src.2019-10-11-20.12.57.30254:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2. filter=--filter=label=com.qemu.instance.uuid=b90449ea684e4a14ad6b34f4f5dcb0c9 make[1]: *** [docker-run] Error 1 make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-0y2eb54d/src' make: *** [docker-run-test-mingw@fedora] Error 2 real11m27.735s user0m7.532s The full log is available at http://patchew.org/logs/20191011175536.GB25464@xps13.dannf/testing.docker-mingw@fedora/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-de...@redhat.com
[Bug 1805256] Re: [Qemu-devel] qemu_futex_wait() lockups in ARM64: 2 possible issues
On Fri, Oct 11, 2019 at 08:30:02AM +, Jan Glauber wrote: > On Fri, Oct 11, 2019 at 10:18:18AM +0200, Paolo Bonzini wrote: > > On 11/10/19 08:05, Jan Glauber wrote: > > > On Wed, Oct 09, 2019 at 11:15:04AM +0200, Paolo Bonzini wrote: > > >>> ...but if I bump notify_me size to uint64_t the issue goes away. > > >> > > >> Ouch. :) Is this with or without my patch(es)? > > > > You didn't answer this question. > > Oh, sorry... I did but the mail probably didn't make it out. > I have both of your changes applied (as I think they make sense). > > > >> Also, what if you just add a dummy uint32_t after notify_me? > > > > > > With the dummy the testcase also runs fine for 500 iterations. > > > > You might be lucky and causing list_lock to be in another cache line. > > What if you add __attribute__((aligned(16)) to notify_me (and keep the > > dummy)? > > Good point. I'll try to force both into the same cacheline. On the Hi1620, this still hangs in the first iteration: diff --git a/include/block/aio.h b/include/block/aio.h index 6b0d52f732..00e56a5412 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -82,7 +82,7 @@ struct AioContext { * Instead, the aio_poll calls include both the prepare and the * dispatch phase, hence a simple counter is enough for them. */ -uint32_t notify_me; +__attribute__((aligned(16))) uint64_t notify_me; /* A lock to protect between QEMUBH and AioHandler adders and deleter, * and to ensure that no callbacks are removed while we're walking and diff --git a/util/async.c b/util/async.c index ca83e32c7f..024c4c567d 100644 --- a/util/async.c +++ b/util/async.c @@ -242,7 +242,7 @@ aio_ctx_check(GSource *source) aio_notify_accept(ctx); for (bh = ctx->first_bh; bh; bh = bh->next) { -if (bh->scheduled) { +if (atomic_mb_read(>scheduled)) { return true; } } @@ -342,12 +342,12 @@ LinuxAioState *aio_get_linux_aio(AioContext *ctx) void aio_notify(AioContext *ctx) { -/* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs - * with atomic_or in aio_ctx_prepare or atomic_add in aio_poll. +/* Using atomic_mb_read ensures that e.g. bh->scheduled is written before + * ctx->notify_me is read. Pairs with atomic_or in aio_ctx_prepare or + * atomic_add in aio_poll. */ -smp_mb(); -if (ctx->notify_me) { -event_notifier_set(>notifier); +if (atomic_mb_read(>notify_me)) { + event_notifier_set(>notifier); atomic_mb_set(>notified, true); } } -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1805256 Title: qemu-img hangs on rcu_call_ready_event logic in Aarch64 when converting images Status in kunpeng920: New Status in QEMU: In Progress Status in qemu package in Ubuntu: In Progress Status in qemu source package in Bionic: New Status in qemu source package in Disco: New Status in qemu source package in Eoan: In Progress Status in qemu source package in FF-Series: New Bug description: Command: qemu-img convert -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2 Hangs indefinitely approximately 30% of the runs. Workaround: qemu-img convert -m 1 -f qcow2 -O qcow2 ./disk01.qcow2 ./output.qcow2 Run "qemu-img convert" with "a single coroutine" to avoid this issue. (gdb) thread 1 ... (gdb) bt #0 0xbf1ad81c in __GI_ppoll #1 0xaabcf73c in ppoll #2 qemu_poll_ns #3 0xaabd0764 in os_host_main_loop_wait #4 main_loop_wait ... (gdb) thread 2 ... (gdb) bt #0 syscall () #1 0xaabd41cc in qemu_futex_wait #2 qemu_event_wait (ev=ev@entry=0xaac86ce8 ) #3 0xaabed05c in call_rcu_thread #4 0xaabd34c8 in qemu_thread_start #5 0xbf25c880 in start_thread #6 0xbf1b6b9c in thread_start () (gdb) thread 3 ... (gdb) bt #0 0xbf11aa20 in __GI___sigtimedwait #1 0xbf2671b4 in __sigwait #2 0xaabd1ddc in sigwait_compat #3 0xaabd34c8 in qemu_thread_start #4 0xbf25c880 in start_thread #5 0xbf1b6b9c in thread_start (gdb) run Starting program: /usr/bin/qemu-img convert -f qcow2 -O qcow2 ./disk01.ext4.qcow2 ./output.qcow2 [New Thread 0xbec5ad90 (LWP 72839)] [New Thread 0xbe459d90 (LWP 72840)] [New Thread 0xbdb57d90 (LWP 72841)] [New Thread 0xacac9d90 (LWP 72859)] [New Thread 0xa7ffed90 (LWP 72860)] [New Thread 0xa77fdd90 (LWP 72861)] [New Thread 0xa6ffcd90 (LWP 72862)] [New Thread 0xa67fbd90 (LWP 72863)] [New Thread 0xa5ffad90 (LWP 72864)] [Thread 0xa5ffad90 (LWP 72864) exited] [Thread 0xa6ffcd90 (LWP 72862) exited] [Thread 0xa77fdd90 (LWP 72861) exited] [Thread 0xbdb57d90 (LWP 72841) exited] [Thread 0xa67fbd90 (LWP 72863) exited] [Thread 0xacac9d90 (LWP