Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-22 Thread Claudio Fontana
On 7/10/21 3:30 PM, Peter Maydell wrote:
> I've noticed recently that intermittently 'make check' will hang on
> my aarch32 test system (really an aarch64 box with an aarch32 chroot).
> 
> I think from grep that this must be the vhost-user-blk test.
> 
> Here's the process tree:
> 
> pmaydell 13126  0.0  0.0   8988  6416 ?        S    Jul09   0:01 make
> -C build/all-a32 check V=1 GCC_COLORS= -j9
> pmaydell 19632  0.0  0.0   4432  2096 ?        S    Jul09   0:00  \_
> bash -o pipefail -c echo 'MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((
> ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_IMG=./qemu-img
> G_TEST_DBUS_DAEMON=/home/peter.maydell/qemu/tests/dbus-vmstate-daemon.sh
> QTEST_QEMU_BINARY=./qemu-system-i386
> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
> tests/qtest/qos-test --tap -k' &&
> MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
> QTEST_QEMU_IMG=./qemu-img
> G_TEST_DBUS_DAEMON=/home/peter.maydell/qemu/tests/dbus-vmstate-daemon.sh
> QTEST_QEMU_BINARY=./qemu-system-i386
> QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
> tests/qtest/qos-test --tap -k -m quick < /dev/null |
> ./scripts/tap-driver.pl --test-name="qtest-i386/qos-test"
> pmaydell 19634  0.0  0.0  13608  3076 ?        Sl   Jul09   0:02
> \_ tests/qtest/qos-test --tap -k -m quick
> pmaydell 20679  0.0  0.0 109076 16100 ?        Sl   Jul09   0:00
> |   \_ ./storage-daemon/qemu-storage-daemon --blockdev
> driver=file,node-name=disk0,filename=qtest.X7RL2X --export
> type=vhost-user-blk,id=disk0,addr.type=unix,addr.path=/tmp/qtest-19634-sock.9LJoHn,node-name=disk0,writable=on,num-queues=1
> pmaydell 20681  0.0  0.2 447828 46544 ?        Sl   Jul09   0:00
> |   \_ ./qemu-system-i386 -qtest unix:/tmp/qtest-19634.sock -qtest-log
> /dev/null -chardev socket,path=/tmp/qtest-19634.qmp,id=char0 -mon
> chardev=char0,mode=control -display none -M pc -device
> vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 -object
> memory-backend-memfd,id=mem,size=256M,share=on -M memory-backend=mem
> -m 256M -chardev socket,id=char1,path=/tmp/qtest-19634-sock.9LJoHn
> -accel qtest
> pmaydell 19635  0.0  0.0  10256  7176 ?        S    Jul09   0:00
> \_ perl ./scripts/tap-driver.pl --test-name=qtest-i386/qos-test
> 
> 
> Backtrace from tests/qtest/qos-test (not as helpful as it could
> be since this is an optimized build):
> 
> (gdb) thread apply all bt
> 
> Thread 2 (Thread 0xf76ff240 (LWP 19636)):
> #0  syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
> #1  0x005206de in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /home/peter.maydell/qemu/include/qemu/futex.h:29
> #2  qemu_event_wait (ev=ev@entry=0x5816fc) at
> ../../util/qemu-thread-posix.c:480
> #3  0x005469c2 in call_rcu_thread (opaque=<optimized out>) at
> ../../util/rcu.c:258
> #4  0x0051fbc2 in qemu_thread_start (args=<optimized out>) at
> ../../util/qemu-thread-posix.c:541
> #5  0xf785a614 in start_thread (arg=0xf6ce711c) at pthread_create.c:463
> #6  0xf77f57ec in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:73
> from /lib/arm-linux-gnueabihf/libc.so.6
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> 
> Thread 1 (Thread 0xf7a04010 (LWP 19634)):
> #0  __libc_do_syscall () at 
> ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
> #1  0xf7861d8c in __libc_read (fd=12, buf=buf@entry=0xff9ce8e4,
> nbytes=nbytes@entry=1024) at ../sysdeps/unix/sysv/linux/read.c:27
> #2  0x004ebc5a in read (__nbytes=1024, __buf=0xff9ce8e4,
> __fd=<optimized out>) at
> /usr/include/arm-linux-gnueabihf/bits/unistd.h:44
> #3  qtest_client_socket_recv_line (s=0x1a46cb8) at
> ../../tests/qtest/libqtest.c:494
> #4  0x004ebd4e in qtest_rsp_args (s=s@entry=0x1a46cb8,
> expected_args=expected_args@entry=1) at
> ../../tests/qtest/libqtest.c:521
> #5  0x004ec1ee in qtest_query_target_endianness (s=0x1a46cb8) at
> ../../tests/qtest/libqtest.c:570
> #6  0x004ec94a in qtest_init_without_qmp_handshake
> (extra_args=<optimized out>) at ../../tests/qtest/libqtest.c:332
> #7  0x004ecd7a in qtest_init (extra_args=<optimized out>) at
> ../../tests/qtest/libqtest.c:339
> #8  0x004ded10 in qtest_start (
> args=0x1a63710 "-M pc  -device
> vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 -object
> memory-backend-memfd,id=mem,size=256M,share=on  -M memory-backend=mem
> -m 256M -chardev socket,id=char1,path=/tmp/qtest-19634-so"...) at
> ../../tests/qtest/libqtest-single.h:29
> #9  restart_qemu_or_continue (
> path=0x1a63710 "-M pc  -device
> vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 -object
> memory-backend-memfd,id=mem,size=256M,share=on  -M memory-backend=mem
> -m 256M -chardev socket,id=char1,path=/tmp/qtest-19634-so"...) at
> ../../tests/qtest/qos-test.c:105
> #10 run_one_test (arg=<optimized out>) at ../../tests/qtest/qos-test.c:178
> #11 0xf794ee74 in ?? () from /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
> 
> 
> Backtrace from qemu-system-i386:
> 
> (gdb) thread apply all bt
> 
> Thread 4 (Thread 0xdfd0cb90 (LWP 20734)):
> #0  0xf6f85206 in 

Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-16 Thread Kevin Wolf
Am 11.07.2021 um 17:53 hat Peter Maydell geschrieben:
> On Sat, 10 Jul 2021 at 14:30, Peter Maydell  wrote:
> >
> > I've noticed recently that intermittently 'make check' will hang on
> > my aarch32 test system (really an aarch64 box with an aarch32 chroot).
> >
> > I think from grep that this must be the vhost-user-blk test.
> 
> I've also now seen this on qemu-system-i386 guest x86-64 Linux host:

Your two stack traces look very different to me, the common thing is
just that one process requests something and the other seems to have
ignored it and is just idle.

In the first stack trace, it was the qtest sending the very first
command ('endianness') over the qtest socket and QEMU seemed to ignore
it. In the second stack trace, it is the vhost-user-blk realize() code
in QEMU sending a request to the export in qemu-storage-daemon and never
getting an answer.

If this is the same bug, it looks to me as if it's something with the
event notification in the main loop? Can we check if there would
actually be an event pending in the apparently idle process if ppoll()
returned?

Kevin
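
One way to approach the question above from outside the process, as a rough sketch only: on a Linux host with iproute2 installed, `ss -x` lists UNIX domain sockets with a Recv-Q column, which for connected sockets should report data that has been queued but not yet read by the owner. The helper name `pending_unix_sockets` is made up for illustration; it only filters `ss`-style text read from stdin and locates the column by its header name.

```shell
#!/bin/sh
# Hypothetical helper (not part of QEMU): print the rows of `ss -x`-style
# output whose Recv-Q column is non-zero, i.e. sockets that appear to hold
# data the owning process has not read yet.
# The Recv-Q column is found by name in the header line, not by position.
pending_unix_sockets() {
    awk '
        NR == 1 {
            for (i = 1; i <= NF; i++)
                if ($i == "Recv-Q") col = i
            next
        }
        col && $col + 0 > 0 { print }
    '
}

# Usage sketch (assumption: run on the host while the test is hung):
#   ss -xp | pending_unix_sockets
```

If a socket owned by the apparently idle process shows up here, that would suggest data is waiting and the main loop is not noticing it. Matching rows to a specific process relies on the `-p` output, which may need extra privileges for processes of other users.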




Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-16 Thread Coiby Xu

On Mon, Jul 12, 2021 at 10:39:50AM +0100, Peter Maydell wrote:
>On Sun, 11 Jul 2021 at 23:55, Coiby Xu wrote:
>>
>> On Mon, Jul 12, 2021 at 06:20:33AM +0800, Coiby Xu wrote:
>> >On Sun, Jul 11, 2021 at 04:53:51PM +0100, Peter Maydell wrote:
>> >>On Sat, 10 Jul 2021 at 14:30, Peter Maydell wrote:
>> >>>
>> >>>I've noticed recently that intermittently 'make check' will hang on
>> >>>my aarch32 test system (really an aarch64 box with an aarch32 chroot).
>> >>>
>> >>>I think from grep that this must be the vhost-user-blk test.
>> >>
>> >>I've also now seen this on qemu-system-i386 guest x86-64 Linux host:
>> >
>> >Good to know that! This makes it much easier for me to debug this
>> >issue.
>>
>> Which i386 image do you use for the guest? Could you share the download
>> link? I can't find a suitable i386 qcow2 image. For example, [1] is
>> outdated.
>
>I'm just running "make check" on the x86-64 host, which runs tests
>on qemu-system-i386 (assuming you built i386 targets).


How often does this issue occur? Unfortunately, I haven't been able to
reproduce it over the past four days, running the tests every now and then
on my own laptop and two OpenStack machines:

git clone git://git.qemu.org/qemu.g
mkdir build && cd build
../configure --target-list=i386-softmmu 
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} \
  QTEST_QEMU_BINARY=build/i386-softmmu/qemu-system-i386 \
  QTEST_QEMU_IMG=build/qemu-img \
  QTEST_QEMU_STORAGE_DAEMON_BINARY=build/storage-daemon/qemu-storage-daemon \
  build/tests/qtest/qos-test

I've also tried running the whole test suite using "make check" or "make
check-qtest", but the results were the same.



>-- PMM


--
Best regards,
Coiby
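
Since the hang is intermittent, a single run says little about how often it happens. A minimal sketch of a retry loop follows, assuming a POSIX shell and the coreutils `timeout` utility. The function name `repeat_until_hang`, the run count and the time limit are all made up for illustration and are not part of the QEMU test suite.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to N times, each under a time limit.
# Stops at the first run that fails or exceeds the limit and returns that
# run's exit status. GNU coreutils `timeout` exits with 124 when the limit
# was hit.
repeat_until_hang() {
    runs=$1
    limit=$2
    shift 2
    i=1
    while [ "$i" -le "$runs" ]; do
        timeout "$limit" "$@"
        status=$?
        if [ "$status" -ne 0 ]; then
            echo "run $i: exit status $status" >&2
            return "$status"
        fi
        i=$((i + 1))
    done
    return 0
}

# Usage sketch (values are arbitrary):
#   repeat_until_hang 200 600 make check-qtest
```

Note that `timeout` terminates the process it started, so this only gives a rough idea of how often the hang occurs. To collect backtraces like the ones in this thread, the hung processes have to be left running and inspected with gdb instead.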



Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-12 Thread Peter Maydell
On Sun, 11 Jul 2021 at 23:55, Coiby Xu  wrote:
>
> On Mon, Jul 12, 2021 at 06:20:33AM +0800, Coiby Xu wrote:
> >On Sun, Jul 11, 2021 at 04:53:51PM +0100, Peter Maydell wrote:
> >>On Sat, 10 Jul 2021 at 14:30, Peter Maydell  
> >>wrote:
> >>>
> >>>I've noticed recently that intermittently 'make check' will hang on
> >>>my aarch32 test system (really an aarch64 box with an aarch32 chroot).
> >>>
> >>>I think from grep that this must be the vhost-user-blk test.
> >>
> >>I've also now seen this on qemu-system-i386 guest x86-64 Linux host:
> >
> >Good to know that! This makes it much easier for me to debug this
> >issue.
>
> Which i386 image do you use for the guest? Could you share the download
> link? I can't find a suitable i386 qcow2 image. For example, [1] is
> outdated.

I'm just running "make check" on the x86-64 host, which runs tests
on qemu-system-i386 (assuming you built i386 targets).

-- PMM



Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-11 Thread Coiby Xu

On Mon, Jul 12, 2021 at 06:20:33AM +0800, Coiby Xu wrote:
>On Sun, Jul 11, 2021 at 04:53:51PM +0100, Peter Maydell wrote:
>>On Sat, 10 Jul 2021 at 14:30, Peter Maydell wrote:
>>>
>>>I've noticed recently that intermittently 'make check' will hang on
>>>my aarch32 test system (really an aarch64 box with an aarch32 chroot).
>>>
>>>I think from grep that this must be the vhost-user-blk test.
>>
>>I've also now seen this on qemu-system-i386 guest x86-64 Linux host:
>
>Good to know that! This makes it much easier for me to debug this
>issue.


Which i386 image do you use for the guest? Could you share the download
link? I can't find a suitable i386 qcow2 image. For example, [1] is
outdated.

[1] http://people.debian.org/~aurel32/qemu

--
Best regards,
Coiby



Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-11 Thread Coiby Xu

On Sun, Jul 11, 2021 at 04:53:51PM +0100, Peter Maydell wrote:
>On Sat, 10 Jul 2021 at 14:30, Peter Maydell wrote:
>>
>>I've noticed recently that intermittently 'make check' will hang on
>>my aarch32 test system (really an aarch64 box with an aarch32 chroot).
>>
>>I think from grep that this must be the vhost-user-blk test.
>
>I've also now seen this on qemu-system-i386 guest x86-64 Linux host:

Good to know that! This makes it much easier for me to debug this
issue.




Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-11 Thread Coiby Xu

On Sun, Jul 11, 2021 at 06:23:41AM -0700, Richard Henderson wrote:
>On 7/11/21 5:16 AM, Peter Maydell wrote:
>>On Sun, 11 Jul 2021 at 13:10, Coiby Xu wrote:
>>>
>>>Hi Peter,
>>>
>>>On Sat, Jul 10, 2021 at 02:30:36PM +0100, Peter Maydell wrote:
>>>>I've noticed recently that intermittently 'make check' will hang on
>>>>my aarch32 test system (really an aarch64 box with an aarch32 chroot).
>>>
>>>I have a newbie question. How do you do an aarch32 chroot on an aarch64
>>>box? At least, this issue seems to be not reproducible on an aarch64 box
>>>directly. I specifically ran the qos-test for 5 consecutive times and
>>>each time the test could finish successfully,
>>
>>Your aarch64 host CPU needs to support aarch32 at EL0 (some
>>AArch64 CPUs are pure-64 bit these days). The host kernel needs
>>to implement the 32-bit compat layer. It probably also needs to be
>>built for 4K pages (which mostly means "not RedHat"). Then you can
>>set up the 32-bit chroot however you'd normally set up a chroot
>>(for Debian you can do this with debootstrap; other distros will vary;
>>schroot is also a bit nicer than raw chroot IMHO.)
>
>If you do have a kernel built with 64k pages ("RedHat"), but you do
>have a host cpu that supports aarch32 at EL1 and EL0, then you can run
>aarch32 under KVM.
>
>The command-line I use is
>
>../run/bin/qemu-system-aarch64 -m 4096 -smp 8 -nographic \
> -M virt -cpu host,aarch64=off --accel kvm \
> -kernel vmlinuz-4.19.0-16-armmp-lpae \
> -initrd initrd.img-4.19.0-16-armmp-lpae \
> -append 'console=ttyAMA0 root=/dev/vda2' \
> -drive if=none,file=hda.q,format=qcow2,id=hd,discard=on \
> -device virtio-blk-device,drive=hd \
> -netdev tap,id=tap0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
> -device virtio-net-device,netdev=tap0
>
>I believe that I had to perform the install under tcg because I
>couldn't find the right magic to boot off the debian cdrom with kvm.

Thanks for the instructions! Since this issue is also reproducible on
qemu-system-i386 guest x86-64 Linux host according to Peter's new email,
I'll check it on i386 guest first.

>r~


--
Best regards,
Coiby



Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-11 Thread Peter Maydell
On Sat, 10 Jul 2021 at 14:30, Peter Maydell  wrote:
>
> I've noticed recently that intermittently 'make check' will hang on
> my aarch32 test system (really an aarch64 box with an aarch32 chroot).
>
> I think from grep that this must be the vhost-user-blk test.

I've also now seen this on qemu-system-i386 guest x86-64 Linux host:

Process tree:
petmay01 28992  0.0  0.0 123812  8612 ?        Sl   14:46   0:01
   \_ tests/qtest/qos-test --tap -k -m quick
petmay01 30068  0.0  0.0 379204 20580 ?        Sl   14:46   0:00
   |   \_ ./storage-daemon/qemu-storage-daemon
--blockdev driver=file,node-name=disk0,filename=qtest.6kY6px --export
type=vhost-user-blk,id=disk0,addr.type=unix,addr.path=/tmp/qtest-28992-sock.4Kgtk1,node-name=disk0,writable=on,num-queues=1
petmay01 30070  0.0  0.1 1083248 63748 ?   Sl   14:46   0:00
   |   \_ ./qemu-system-i386 -qtest
unix:/tmp/qtest-28992.sock -qtest-log /dev/null -chardev
socket,path=/tmp/qtest-28992.qmp,id=char0 -mon
chardev=char0,mode=control -display none -M pc -device
vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 -object
memory-backend-memfd,id=mem,size=256M,share=on -M memory-backend=mem
-m 256M -chardev socket,id=char1,path=/tmp/qtest-28992-sock.4Kgtk1
-accel qtest


Backtrace, qos-test:
(gdb) thread apply all bt

Thread 2 (Thread 0x7fd086f1c700 (LWP 28995)):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x56448599484b in qemu_futex_wait (val=<optimized out>,
f=<optimized out>)
at /mnt/nvmedisk/linaro/qemu-for-merges/include/qemu/futex.h:29
#2  qemu_event_wait (ev=ev@entry=0x564485c322e8)
at ../../util/qemu-thread-posix.c:480
#3  0x56448599dc18 in call_rcu_thread (opaque=opaque@entry=0x0) at
../../util/rcu.c:258
#4  0x564485993966 in qemu_thread_start (args=<optimized out>)
at ../../util/qemu-thread-posix.c:541
#5  0x7fd088b446db in start_thread (arg=0x7fd086f1c700) at
pthread_create.c:463
#6  0x7fd08886d71f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 1 (Thread 0x7fd089d9a900 (LWP 28992)):
#0  0x7fd088b4e474 in __libc_read (fd=6,
buf=buf@entry=0x7fff05f024f0, nbytes=nbytes@entry=1024)
at ../sysdeps/unix/sysv/linux/read.c:27
#1  0x564485947cb2 in read (__nbytes=1024, __buf=0x7fff05f024f0,
__fd=<optimized out>)
at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
#2  qtest_client_socket_recv_line (s=0x5644866f38b0) at
../../tests/qtest/libqtest.c:494
#3  0x564485947e61 in qtest_rsp_args (s=s@entry=0x5644866f38b0,
expected_args=expected_args@entry=1) at ../../tests/qtest/libqtest.c:521
#4  0x56448594846f in qtest_query_target_endianness (s=0x5644866f38b0)
at ../../tests/qtest/libqtest.c:570
#5  0x564485948ed2 in qtest_init_without_qmp_handshake
(extra_args=<optimized out>)
at ../../tests/qtest/libqtest.c:332
#6  0x564485949616 in qtest_init (extra_args=<optimized out>) at
../../tests/qtest/libqtest.c:339
#7  0x5644859338cd in qtest_start (
args=0x5644866f6d00 "-M pc  -device
vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 -object
memory-backend-memfd,id=mem,size=256M,share=on  -M memory-backend=mem
-m 256M -chardev socket,id=char1,path=/tmp/qtest-28992-so"...) at
../../tests/qtest/libqtest-single.h:29
#8  restart_qemu_or_continue (
path=0x5644866f6d00 "-M pc  -device
vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 -object
memory-backend-memfd,id=mem,size=256M,share=on  -M memory-backend=mem
-m 256M -chardev socket,id=char1,path=/tmp/qtest-28992-so"...) at
../../tests/qtest/qos-test.c:105
#9  run_one_test (arg=<optimized out>) at ../../tests/qtest/qos-test.c:178
#10 0x7fd08990c05a in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#11 0x7fd08990bf8b in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#12 0x7fd08990bf8b in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#13 0x7fd08990bf8b in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#14 0x7fd08990bf8b in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#15 0x7fd08990bf8b in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#16 0x7fd08990bf8b in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#17 0x7fd08990bf8b in ?? () from /usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#18 0x7fd08990c232 in g_test_run_suite () from
/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#19 0x7fd08990c251 in g_test_run () from
/usr/lib/x86_64-linux-gnu/libglib-2.0.so.0
#20 0x564485932359 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>)
at ../../tests/qtest/qos-test.c:338

Backtrace, qemu-system-i386:
Thread 4 (Thread 0x7f965ac7f700 (LWP 30079)):
#0  0x7f9674b6938c in __GI___sigtimedwait (set=<optimized out>,
set@entry=0x7f965ac7c090, info=info@entry=0x7f965ac7bfd0,
timeout=timeout@entry=0x0)
at ../sysdeps/unix/sysv/linux/sigtimedwait.c:42
#1  0x7f9674f2c54c in __sigwait (set=set@entry=0x7f965ac7c090,
sig=sig@entry=0x7f965ac7c08c)
at ../sysdeps/unix/sysv/linux/sigwait.c:28
#2  0x55c2a04af6b3 in dummy_cpu_thread_fn (arg=arg@entry=0x55c2a1727aa0)
at 

Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-11 Thread Richard Henderson

On 7/11/21 7:21 AM, Peter Maydell wrote:
> On Sun, 11 Jul 2021 at 14:23, Richard Henderson wrote:
>> I believe that I had to perform the install under tcg because I couldn't find
>> the right magic to boot off the debian cdrom with kvm.
>
> Weird, it ought not in theory to care...


Looking back at the install script I used, I had u-boot boot off the cdrom, and I'm 
booting the kernel directly for kvm.  I guess there's something about the specific u-boot 
image I had that didn't work with kvm.  It has been long enough that I don't recall any 
further details.



r~



Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-11 Thread Peter Maydell
On Sun, 11 Jul 2021 at 14:23, Richard Henderson wrote:
> I believe that I had to perform the install under tcg because I couldn't find
> the right magic to boot off the debian cdrom with kvm.

Weird, it ought not in theory to care...

-- PMM



Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-11 Thread Richard Henderson

On 7/11/21 5:16 AM, Peter Maydell wrote:
> On Sun, 11 Jul 2021 at 13:10, Coiby Xu wrote:
>>
>> Hi Peter,
>>
>> On Sat, Jul 10, 2021 at 02:30:36PM +0100, Peter Maydell wrote:
>>> I've noticed recently that intermittently 'make check' will hang on
>>> my aarch32 test system (really an aarch64 box with an aarch32 chroot).
>>
>> I have a newbie question. How do you do an aarch32 chroot on an aarch64
>> box? At least, this issue seems to be not reproducible on an aarch64 box
>> directly. I specifically ran the qos-test for 5 consecutive times and
>> each time the test could finish successfully,
>
> Your aarch64 host CPU needs to support aarch32 at EL0 (some
> AArch64 CPUs are pure-64 bit these days). The host kernel needs
> to implement the 32-bit compat layer. It probably also needs to be
> built for 4K pages (which mostly means "not RedHat"). Then you can
> set up the 32-bit chroot however you'd normally set up a chroot
> (for Debian you can do this with debootstrap; other distros will vary;
> schroot is also a bit nicer than raw chroot IMHO.)


If you do have a kernel built with 64k pages ("RedHat"), but you do have a host cpu that 
supports aarch32 at EL1 and EL0, then you can run aarch32 under KVM.


The command-line I use is

../run/bin/qemu-system-aarch64 -m 4096 -smp 8 -nographic \
  -M virt -cpu host,aarch64=off --accel kvm \
  -kernel vmlinuz-4.19.0-16-armmp-lpae \
  -initrd initrd.img-4.19.0-16-armmp-lpae \
  -append 'console=ttyAMA0 root=/dev/vda2' \
  -drive if=none,file=hda.q,format=qcow2,id=hd,discard=on \
  -device virtio-blk-device,drive=hd \
  -netdev tap,id=tap0,br=virbr0,helper=/usr/libexec/qemu-bridge-helper \
  -device virtio-net-device,netdev=tap0

I believe that I had to perform the install under tcg because I couldn't find the right 
magic to boot off the debian cdrom with kvm.



r~



Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-11 Thread Peter Maydell
On Sun, 11 Jul 2021 at 13:10, Coiby Xu  wrote:
>
> Hi Peter,
>
> On Sat, Jul 10, 2021 at 02:30:36PM +0100, Peter Maydell wrote:
> >I've noticed recently that intermittently 'make check' will hang on
> >my aarch32 test system (really an aarch64 box with an aarch32 chroot).
>
> I have a newbie question. How do you do an aarch32 chroot on an aarch64
> box? At least, this issue seems to be not reproducible on an aarch64 box
> directly. I specifically ran the qos-test for 5 consecutive times and
> each time the test could finish successfully,

Your aarch64 host CPU needs to support aarch32 at EL0 (some
AArch64 CPUs are pure-64 bit these days). The host kernel needs
to implement the 32-bit compat layer. It probably also needs to be
built for 4K pages (which mostly means "not RedHat"). Then you can
set up the 32-bit chroot however you'd normally set up a chroot
(for Debian you can do this with debootstrap; other distros will vary;
schroot is also a bit nicer than raw chroot IMHO.)

-- PMM
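
The page-size prerequisite mentioned above can be checked before going to the trouble of building a chroot. This is only a sketch: `getconf PAGESIZE` is a standard query, but the helper name `is_4k_page_size` is made up for illustration, and the debootstrap line is left as a commented template with placeholders because the suite, target directory and mirror depend on the setup.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given page size (in bytes) is 4K,
# which the message above describes as probably required for running
# 32-bit userspace through the compat layer.
is_4k_page_size() {
    [ "$1" -eq 4096 ]
}

page_size=$(getconf PAGESIZE)
if is_4k_page_size "$page_size"; then
    echo "host kernel uses 4K pages ($page_size bytes)"
else
    echo "host kernel uses $page_size-byte pages; an aarch32 chroot may not work"
fi

# Template only, to be run as root with real values filled in
# (armhf is Debian's 32-bit hard-float ARM architecture):
#   debootstrap --arch=armhf <suite> <target-dir> <mirror>
```

This does not check whether the CPU supports aarch32 at EL0, which is the other prerequisite named above.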



Re: intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-11 Thread Coiby Xu

Hi Peter,

On Sat, Jul 10, 2021 at 02:30:36PM +0100, Peter Maydell wrote:
>I've noticed recently that intermittently 'make check' will hang on
>my aarch32 test system (really an aarch64 box with an aarch32 chroot).


I have a newbie question. How do you do an aarch32 chroot on an aarch64
box? At least, this issue seems to be not reproducible on an aarch64 box
directly. I specifically ran the qos-test for 5 consecutive times and
each time the test could finish successfully, 


$ MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} \
  QTEST_QEMU_BINARY=build/i386-softmmu/qemu-system-i386 \
  QTEST_QEMU_IMG=build/qemu-img \
  QTEST_QEMU_STORAGE_DAEMON_BINARY=build/storage-daemon/qemu-storage-daemon \
  build/tests/qtest/qos-test




intermittent hang in qos-test for qemu-system-i386 on 32-bit arm host

2021-07-10 Thread Peter Maydell
I've noticed recently that intermittently 'make check' will hang on
my aarch32 test system (really an aarch64 box with an aarch32 chroot).

I think from grep that this must be the vhost-user-blk test.

Here's the process tree:

pmaydell 13126  0.0  0.0   8988  6416 ?        S    Jul09   0:01 make
-C build/all-a32 check V=1 GCC_COLORS= -j9
pmaydell 19632  0.0  0.0   4432  2096 ?        S    Jul09   0:00  \_
bash -o pipefail -c echo 'MALLOC_PERTURB_=${MALLOC_PERTURB_:-$((
${RANDOM:-0} % 255 + 1))} QTEST_QEMU_IMG=./qemu-img
G_TEST_DBUS_DAEMON=/home/peter.maydell/qemu/tests/dbus-vmstate-daemon.sh
QTEST_QEMU_BINARY=./qemu-system-i386
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
tests/qtest/qos-test --tap -k' &&
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))}
QTEST_QEMU_IMG=./qemu-img
G_TEST_DBUS_DAEMON=/home/peter.maydell/qemu/tests/dbus-vmstate-daemon.sh
QTEST_QEMU_BINARY=./qemu-system-i386
QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon
tests/qtest/qos-test --tap -k -m quick < /dev/null |
./scripts/tap-driver.pl --test-name="qtest-i386/qos-test"
pmaydell 19634  0.0  0.0  13608  3076 ?        Sl   Jul09   0:02
\_ tests/qtest/qos-test --tap -k -m quick
pmaydell 20679  0.0  0.0 109076 16100 ?        Sl   Jul09   0:00
|   \_ ./storage-daemon/qemu-storage-daemon --blockdev
driver=file,node-name=disk0,filename=qtest.X7RL2X --export
type=vhost-user-blk,id=disk0,addr.type=unix,addr.path=/tmp/qtest-19634-sock.9LJoHn,node-name=disk0,writable=on,num-queues=1
pmaydell 20681  0.0  0.2 447828 46544 ?        Sl   Jul09   0:00
|   \_ ./qemu-system-i386 -qtest unix:/tmp/qtest-19634.sock -qtest-log
/dev/null -chardev socket,path=/tmp/qtest-19634.qmp,id=char0 -mon
chardev=char0,mode=control -display none -M pc -device
vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 -object
memory-backend-memfd,id=mem,size=256M,share=on -M memory-backend=mem
-m 256M -chardev socket,id=char1,path=/tmp/qtest-19634-sock.9LJoHn
-accel qtest
pmaydell 19635  0.0  0.0  10256  7176 ?        S    Jul09   0:00
\_ perl ./scripts/tap-driver.pl --test-name=qtest-i386/qos-test


Backtrace from tests/qtest/qos-test (not as helpful as it could
be since this is an optimized build):

(gdb) thread apply all bt

Thread 2 (Thread 0xf76ff240 (LWP 19636)):
#0  syscall () at ../sysdeps/unix/sysv/linux/arm/syscall.S:37
#1  0x005206de in qemu_futex_wait (val=<optimized out>, f=<optimized out>) at /home/peter.maydell/qemu/include/qemu/futex.h:29
#2  qemu_event_wait (ev=ev@entry=0x5816fc) at
../../util/qemu-thread-posix.c:480
#3  0x005469c2 in call_rcu_thread (opaque=<optimized out>) at
../../util/rcu.c:258
#4  0x0051fbc2 in qemu_thread_start (args=<optimized out>) at
../../util/qemu-thread-posix.c:541
#5  0xf785a614 in start_thread (arg=0xf6ce711c) at pthread_create.c:463
#6  0xf77f57ec in ?? () at ../sysdeps/unix/sysv/linux/arm/clone.S:73
from /lib/arm-linux-gnueabihf/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

Thread 1 (Thread 0xf7a04010 (LWP 19634)):
#0  __libc_do_syscall () at ../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:46
#1  0xf7861d8c in __libc_read (fd=12, buf=buf@entry=0xff9ce8e4,
nbytes=nbytes@entry=1024) at ../sysdeps/unix/sysv/linux/read.c:27
#2  0x004ebc5a in read (__nbytes=1024, __buf=0xff9ce8e4,
__fd=<optimized out>) at
/usr/include/arm-linux-gnueabihf/bits/unistd.h:44
#3  qtest_client_socket_recv_line (s=0x1a46cb8) at
../../tests/qtest/libqtest.c:494
#4  0x004ebd4e in qtest_rsp_args (s=s@entry=0x1a46cb8,
expected_args=expected_args@entry=1) at
../../tests/qtest/libqtest.c:521
#5  0x004ec1ee in qtest_query_target_endianness (s=0x1a46cb8) at
../../tests/qtest/libqtest.c:570
#6  0x004ec94a in qtest_init_without_qmp_handshake
(extra_args=<optimized out>) at ../../tests/qtest/libqtest.c:332
#7  0x004ecd7a in qtest_init (extra_args=<optimized out>) at
../../tests/qtest/libqtest.c:339
#8  0x004ded10 in qtest_start (
args=0x1a63710 "-M pc  -device
vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 -object
memory-backend-memfd,id=mem,size=256M,share=on  -M memory-backend=mem
-m 256M -chardev socket,id=char1,path=/tmp/qtest-19634-so"...) at
../../tests/qtest/libqtest-single.h:29
#9  restart_qemu_or_continue (
path=0x1a63710 "-M pc  -device
vhost-user-blk-pci,id=drv0,chardev=char1,addr=4.0 -object
memory-backend-memfd,id=mem,size=256M,share=on  -M memory-backend=mem
-m 256M -chardev socket,id=char1,path=/tmp/qtest-19634-so"...) at
../../tests/qtest/qos-test.c:105
#10 run_one_test (arg=<optimized out>) at ../../tests/qtest/qos-test.c:178
#11 0xf794ee74 in ?? () from /usr/lib/arm-linux-gnueabihf/libglib-2.0.so.0
Backtrace stopped: previous frame identical to this frame (corrupt stack?)


Backtrace from qemu-system-i386:

(gdb) thread apply all bt

Thread 4 (Thread 0xdfd0cb90 (LWP 20734)):
#0  0xf6f85206 in __libc_do_syscall () at
../sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:47
#1  0xf6f93492 in __GI___sigtimedwait (set=<optimized out>,
set@entry=0xdfd0c3c4, info=info@entry=0xdfd0c324,
timeout=timeout@entry=0x0) at
../sysdeps/unix/sysv/linux/sigtimedwait.c:42
#2  0xf7073e6c