On Tue, 4 Oct 2022 at 13:00, Daniel P. Berrangé <berra...@redhat.com> wrote:
>
> The g_slice custom allocator is not async signal safe because of its
> mutexes. When a multithreaded program running in the QEMU user
> emulator forks, it can end up deadlocking in the g_slice
> allocator:
>
>   Thread 1:
>   #0 syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
>   #1 0x00007f54e190c77c in g_mutex_lock_slowpath (mutex=mutex@entry=0x7f54e1dc7600 <allocator+96>) at ../glib/gthread-posix.c:1462
>   #2 0x00007f54e190d222 in g_mutex_lock (mutex=mutex@entry=0x7f54e1dc7600 <allocator+96>) at ../glib/gthread-posix.c:1486
>   #3 0x00007f54e18e39f2 in magazine_cache_pop_magazine (countp=0x7f54280e6638, ix=2) at ../glib/gslice.c:769
>   #4 thread_memory_magazine1_reload (ix=2, tmem=0x7f54280e6600) at ../glib/gslice.c:845
>   #5 g_slice_alloc (mem_size=mem_size@entry=40) at ../glib/gslice.c:1058
>   #6 0x00007f54e18f06fa in g_tree_node_new (value=0x7f54d4066540 <code_gen_buffer+419091>, key=0x7f54d4066560 <code_gen_buffer+419123>) at ../glib/gtree.c:517
>   #7 g_tree_insert_internal (tree=0x555556aed800, key=0x7f54d4066560 <code_gen_buffer+419123>, value=0x7f54d4066540 <code_gen_buffer+419091>, replace=0) at ../glib/gtree.c:517
>   #8 0x00007f54e186b755 in tcg_tb_insert (tb=0x7f54d4066540 <code_gen_buffer+419091>) at ../tcg/tcg.c:534
>   #9 0x00007f54e1820545 in tb_gen_code (cpu=0x7f54980b4b60, pc=274906407438, cs_base=0, flags=24832, cflags=-16252928) at ../accel/tcg/translate-all.c:2118
>   #10 0x00007f54e18034a5 in tb_find (cpu=0x7f54980b4b60, last_tb=0x7f54d4066440 <code_gen_buffer+418835>, tb_exit=0, cf_mask=524288) at ../accel/tcg/cpu-exec.c:462
>   #11 0x00007f54e1803bd9 in cpu_exec (cpu=0x7f54980b4b60) at ../accel/tcg/cpu-exec.c:818
>   #12 0x00007f54e1735a4c in cpu_loop (env=0x7f54980bce40) at ../linux-user/riscv/cpu_loop.c:37
>   #13 0x00007f54e1844b22 in clone_func (arg=0x7f5402f3b080) at ../linux-user/syscall.c:6422
>   #14 0x00007f54e191950a in start_thread (arg=<optimized out>) at pthread_create.c:477
>   #15 0x00007f54e19a52a3 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> The only known workaround for this problem is to disable the g_slice
> custom allocator, in favor of system malloc which is believed to be
> async signal safe on all platforms QEMU officially targets.
>
> g_slice uses a one-time initializer to check the G_SLICE env variable,
> making it hard for QEMU to set the env before any GLib API call has
> triggered the initializer. Even __attribute__((constructor)) is not
> sufficient as QEMU has many constructors and there is no ordering
> guarantee between them.
>
> This patch attempts to work around this by re-exec()ing the QEMU user
> emulators if the G_SLICE env variable is not already set. This means
> the env variable will be inherited down the process tree spawned
> from there onwards. There is a possibility this could have unexpected
> consequences, but this has to be balanced against the real known
> problem of QEMU user emulators randomly deadlocking.
>
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/285
> Signed-off-by: Daniel P. Berrangé <berra...@redhat.com>
> ---
>
> Can't say I especially like this but I'm out of other ideas for how
> to guarantee a solution. Users can't set env vars prior to launching
> QEMU user emulators when using binfmt.
>
> NB, I tested the linux-user impl and it stops the hangs in my
> testing. I've not even compile-tested the bsd-user impl, just
> blindly copied the linux-user code.

I suspect a simple re-exec won't play nicely with all the possible
ways you can use binfmt-misc...
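
For readers following along, the re-exec idea described in the quoted
commit message could look roughly like the sketch below. This is an
illustration only, not the code from the patch: the helper names
g_slice_needs_reexec() and reexec_with_g_slice() are hypothetical, the
value "always-malloc" is assumed to be the G_SLICE setting intended, and
re-executing via /proc/self/exe is an assumption that only holds on
Linux.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if G_SLICE is absent from the
 * environment, 0 if it is already set (to any value). */
int g_slice_needs_reexec(void)
{
    return getenv("G_SLICE") == NULL;
}

/*
 * Hypothetical helper, not taken from the patch.
 *
 * If G_SLICE is unset, set it and replace the current process image
 * with a fresh copy of the same binary, so that GLib's one-time
 * initializer sees the variable from the very start of the new image.
 * On success execv() does not return. If anything fails, the function
 * returns and the caller carries on with the default allocator.
 */
void reexec_with_g_slice(char **argv)
{
    if (!g_slice_needs_reexec()) {
        return; /* already set, possibly by an earlier re-exec */
    }
    if (setenv("G_SLICE", "always-malloc", 1) != 0) {
        perror("setenv");
        return;
    }
    /* Assumption: /proc/self/exe refers to the running emulator binary. */
    execv("/proc/self/exe", argv);
    perror("execv"); /* only reached if execv() failed */
}
```

Such a helper would be called at the top of main() with the original
argv. Because the variable is set after the first pass, the re-executed
process skips the exec, so there is no loop. The sketch passes argv
through unchanged and so says nothing about whether that is correct
under every binfmt-misc configuration, which is the concern raised
above.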

-- PMM
