* Laurent Vivier (lviv...@redhat.com) wrote:
> On 18/07/2017 15:07, Laurent Vivier wrote:
> > On 21/06/2017 15:23, Gerd Hoffmann wrote:
> >> Drop the temporary workaround for the broken display updates.
> >> All display adapters are updated, so this should be safe without
> >> causing regressions.
> >
> > It seems it breaks QMP command 'migrate "exec:cat>mig"'.
> >
> > The command hangs and doesn't create the file.
> >
> > It happens with qemu-system-ppc64 on x86 (so TCG mode).
> >
> > my command:
> >
> >   ./ppc64-softmmu/qemu-system-ppc64 -serial mon:stdio
> >
> > I wait SLOF fails to find an OS, and:
> >
> >   Ctrl-a c
> >   (qemu) migrate -d "exec:cat>mig"
> >
> > The file is not created and the command hangs:
> >
> > #0 in __lll_lock_wait
> > #1 in pthread_mutex_lock
> > #2 in qemu_mutex_lock
> > #3 in rcu_init_lock
> > #4 in fork
> > #5 in qemu_fork
> > #6 in qio_channel_command_new_spawn
> > #7 in exec_start_outgoing_migration
> > #8 in qmp_migrate
> > ...
> >
> > It looks like a deadlock.
>
> I think this patch is not the cause of the problem, the one it removes
> just unlocks the deadlock by playing with locks.
>
> We have a rcu_init_lock() on fork() because of:
>
> utils/rcu.c:
>
> static void __attribute__((__constructor__)) rcu_init(void)
> {
> #ifdef CONFIG_POSIX
>     pthread_atfork(rcu_init_lock, rcu_init_unlock, rcu_init_unlock);
> #endif
>     rcu_init_complete();
> }
>
> The QMP thread hangs on:
>
> (gdb) p rcu_sync_lock
> $1 = {lock = {__data = {__lock = 2, __count = 0, __owner = 23865,
> __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {
> __prev = 0x0, __next = 0x0}},
> __size = "\002\000\000\000\000\000\000\000\071]\000\000\001", '\000'
> <repeats 26 times>, __align = 2}, initialized = true}
>
>
> The lock is already taken by thread 2:
>
> (gdb) info thread
>   Id   Target Id         Frame
>   1    Thread 0x7f1cf02fdf00 (LWP 23864) "qemu-system-ppc"
> 0x00007f1cd914b37d in __lll_lock_wait () from /lib64/libpthread.so.0
> * 2    Thread 0x7f1cc9762700 (LWP 23865) "qemu-system-ppc"
> 0x00007f1cd410daa9 in syscall () from /lib64/libc.so.6
>   3    Thread 0x7f1cbf8d5700 (LWP 23866) "qemu-system-ppc"
> 0x00007f1cd914b37d in __lll_lock_wait () from /lib64/libpthread.so.0
>
> (gdb) bt
> #0  0x00007f1cd410daa9 in syscall () at /lib64/libc.so.6
> #1  0x000055ab028ddda2 in qemu_futex_wait
> #2  0x000055ab028ddda2 in qemu_event_wait
> #3  0x000055ab028eda2b in wait_for_readers
> #4  0x000055ab028eda2b in synchronize_rcu
> #5  0x000055ab028edc5b in call_rcu_thread
> #6  0x00007f1cd914273a in start_thread ()
> #7  0x00007f1cd4113e0f in clone ()
>
> So it seems we cannot fork() from QMP?
> [cc: Paolo]
>
> Any comments?
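
For anyone following along, the shape of the hang described above (a
pthread_atfork() prepare handler wanting a lock that another thread
already owns) can be sketched in a few lines of plain C. This is only an
illustration, not QEMU code: demo_lock, holder(), demo_prepare(),
demo_release() and fork_would_block() are made-up names, it assumes
Linux/glibc with POSIX unnamed semaphores, and the prepare handler uses
pthread_mutex_trylock() so that the demo reports the contention rather
than hanging the way a blocking lock would.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for rcu_sync_lock; none of this is QEMU code. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t handlers_once = PTHREAD_ONCE_INIT;
static sem_t lock_taken;     /* posted once the holder owns demo_lock */
static sem_t release_lock;   /* posted when the holder may let go */
static int prepare_blocked;  /* prepare handler found the lock busy */
static int prepare_locked;   /* prepare handler took the lock itself */

/* Plays the role of the thread that already holds the lock. */
static void *holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    sem_post(&lock_taken);
    sem_wait(&release_lock);
    pthread_mutex_unlock(&demo_lock);
    return NULL;
}

/* Analogue of rcu_init_lock(), but with trylock so it cannot hang. */
static void demo_prepare(void)
{
    if (pthread_mutex_trylock(&demo_lock) == 0) {
        prepare_locked = 1;
    } else {
        prepare_locked = 0;
        prepare_blocked = 1;
    }
}

/* Analogue of rcu_init_unlock(); runs in both parent and child. */
static void demo_release(void)
{
    if (prepare_locked) {
        pthread_mutex_unlock(&demo_lock);
        prepare_locked = 0;
    }
}

static void register_handlers(void)
{
    pthread_atfork(demo_prepare, demo_release, demo_release);
}

/* Returns 1 if a blocking prepare handler would have stalled fork(),
 * 0 if the lock was free, -1 if fork() itself failed. */
int fork_would_block(void)
{
    pthread_t tid;
    pid_t pid;

    prepare_blocked = 0;
    sem_init(&lock_taken, 0, 0);
    sem_init(&release_lock, 0, 0);
    pthread_once(&handlers_once, register_handlers);

    pthread_create(&tid, NULL, holder, NULL);
    sem_wait(&lock_taken);       /* holder now owns demo_lock */

    pid = fork();                /* demo_prepare() runs before the fork */
    if (pid == 0) {
        _exit(0);                /* child has nothing else to do */
    }
    if (pid > 0) {
        waitpid(pid, NULL, 0);
    }

    sem_post(&release_lock);     /* let the holder drop the lock */
    pthread_join(tid, NULL);
    sem_destroy(&lock_taken);
    sem_destroy(&release_lock);

    assert(!prepare_locked);     /* handlers must not leave the lock held */
    return pid < 0 ? -1 : prepare_blocked;
}
```

Calling fork_would_block() from a small main() (built with -pthread)
returns 1. If demo_prepare() used pthread_mutex_lock() instead, the
fork() in this sketch would never return, because the holder only lets
go after fork() has completed.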
I remembered hitting this in the past - but I can only trigger it rarely
for me on x86; the following script triggers it after ~100 iterations on
my laptop:

#!/bin/bash
while true
do
  OURPIPE=/tmp/delaystop.$$
  mknod "$OURPIPE" p
  ./try/x86_64-softmmu/qemu-system-x86_64 -nographic -M pc,accel=kvm -smp 8 < "$OURPIPE" &
  QEMUPID=$!
  exec 10> "$OURPIPE"
  # Flip the mon to hmp
  echo -e '\001c' >&10
  # just a test
  echo 'migrate -d "exec: cat > /dev/null"' >&10
  sleep $(printf ".%05d" $RANDOM)
  echo "info status" >&10
  echo "q" >&10
  rm "$OURPIPE"
  wait $QEMUPID || break
done

(From my notes, I mentioned that to Paolo about 18 months ago after he
nailed a different case)

Dave

> Laurent
>
--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK