On Fri, 17 Jul 2020 at 18:24, John Snow <js...@redhat.com> wrote:
>
> - The real problem, though: Why is QEMU hanging? It might need a longer
> timeout, or it might be having problems with the console socket again.
>
> (CC Robert Foley who has been working on the Console socket draining
> problems. Maybe he has some insight here?)

When we did see the console issues, we would see a hung stack like this:

#0  0x0000aaaad43d141c in qemu_chr_write_buffer
#1  0x0000aaaad43d194c in qemu_chr_write
#2  0x0000aaaad43d3968 in qemu_chr_fe_write_all
#3  0x0000aaaad417cf80 in pl011_write
#4  0x0000aaaad3f3c7b0 in memory_region_write_accessor
#5  0x0000aaaad3f3a1fc in access_with_adjusted_size
#6  0x0000aaaad3f3e828 in memory_region_dispatch_write
#7  0x0000aaaad3f517b0 in io_writex
#8  0x0000ffff574a1d34 in code_gen_buffer ()
#9  0x0000aaaad3f67228 in cpu_tb_exec
#10 0x0000aaaad3f67228 in cpu_loop_exec_tb
#11 0x0000aaaad3f67228 in cpu_exec
#12 0x0000aaaad3f2dbe4 in tcg_cpu_exec
#13 0x0000aaaad3f305e8 in qemu_tcg_cpu_thread_fn
#14 0x0000aaaad4441d88 in qemu_thread_start
#15 0x0000ffff85bec088 in start_thread
#16 0x0000ffff85b5c4ec in thread_start

However, adding the console socket draining thread seems to have fixed
this, and basevm.py should presently be using console draining for
vm-build-openbsd.

When QEMU hangs and exceeds our shutdown timeout, could we (optionally)
send something like a SIGABRT to QEMU to force a core dump, so we can get
the stack and see where QEMU is hung? I suppose that presumes the hang is
reproducible, but it might help to remove doubt in cases where QEMU hangs.

-Rob

>
> --js
>
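
To make the SIGABRT suggestion a bit more concrete, here is a rough sketch of what an optional "abort on hang" path could look like. This is not the existing basevm.py or machine.py code; `wait_or_abort` and `abort_on_hang` are hypothetical names, and it assumes QEMU is held as a plain `subprocess.Popen` object. Whether a core file is actually written also depends on the host's core size limit (e.g. `ulimit -c`).

```python
import signal
import subprocess
import sys


def wait_or_abort(proc, timeout, abort_on_hang=False):
    """Wait for proc to exit; if it hangs past timeout, stop it.

    Hypothetical helper, not part of QEMU's Python code.
    With abort_on_hang=True the process gets SIGABRT, whose default
    action is to terminate with a core dump (if core limits allow).
    Otherwise it is killed with SIGKILL, which leaves no core.
    Returns the process return code.
    """
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        if abort_on_hang:
            proc.send_signal(signal.SIGABRT)
        else:
            proc.kill()
        # Reap the process so the return code reflects the signal.
        return proc.wait()


# Example use (assumes `qemu` is a subprocess.Popen for the guest):
#   rc = wait_or_abort(qemu, timeout=30, abort_on_hang=True)
#   if rc == -signal.SIGABRT:
#       print("QEMU hung on shutdown; check for a core file", file=sys.stderr)
```

Keeping it behind a flag means normal runs still get the plain kill, and the core dump is only requested when someone is actively chasing a hang.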