On Mon, 4 Nov 2019 at 07:13, Alex Bennée <alex.ben...@linaro.org> wrote:
> I wonder if using RCU to swap out the new and old FD would make things a > bit smoother here? You would have to tweak the qemu_log_lock/unlock > functions to preserve the current FD around the lock and call_rcu a > freeing function when a new handle is set. Thanks for your comments on this. We agree that using RCU is a great solution here. We have this patch well underway and plan to post a patch in the next few days. -Rob On Mon, 4 Nov 2019 at 07:13, Alex Bennée <alex.ben...@linaro.org> wrote: > > > Robert Foley <robert.fo...@linaro.org> writes: > > > We hit an issue when trying to change the log file from the monitor > > console. The root of the issue here is that the qemu_logfile handle > > is not thread safe. So when we try to close the file, we end up with > > a seg fault. The full analysis is below along with some possible > > solutions. > > Will plan to post a patch soon, but any comments or opinions on our > > proposed solution would be appreciated. Thanks. > > > > The version of QEMU we are using is: master as of about Oct 15, > > 9020e9526cd08c4dc99d54dba48730de2908c970. > > > > This is what we did to reproduce the issue. > > First we enable logging and select the log file. > > (qemu) log in_asm,out_asm,op > > (qemu) logfile asm.out > > > > Then we start this command in the guest. This just keeps the guest > > performing operations that result in logging to be constantly > > generated. > > $ for i in {0..1000}; do ls -l; done > > > > Next we switch to the monitor console and change the file. > > (qemu) logfile asm_new.log > > > > This action causes a seg fault. Please see the stack trace (below). > > > > The code, which changes the log file unconditionally > > (qemu_set_log_filename()), closes the qemu_logfile, sets it to NULL, > > and then opens the new file. > > Since the file handle is still in use, we end up with a seg fault when > > the code which is trying to log ends up using a NULL file handle. > > Yep, good catch. > > > We are considering a few solutions. > > > > A straightforward solution would be to simply prevent the file from > > being changed while logging is enabled. In other words, force the > > user to first disable logging before changing the log file. > > This solution seems to cover the general case. However, if a user > > were to disable logging and change the log file in quick succession, > > we would still be subject to a similar race. A log call could still > > make it through the logging enable check and proceed to use a file > > handle that gets changed to NULL. > > > > Another option is to add a mutex to prevent the qemu_logfile handle > > from being changed while it is in use. This certainly works and has > > the advantage of being fairly straightforward. Also we are thinking > > that since the mutex would only be used when logging is enabled it has > > the advantage of not having an effect on the normal case performance. > > Another option is to implement a simple atomic ref count and prevent > > the file from being changed while there are outstanding references. > > The mutex is the simplest but I wonder if it show up on heavy usage? The > log file is already doing file locking for the areas that want > contiguous log statements. > > I wonder if using RCU to swap out the new and old FD would make things a > bit smoother here? You would have to tweak the qemu_log_lock/unlock > functions to preserve the current FD around the lock and call_rcu a > freeing function when a new handle is set. > > > We are leaning towards the mutex option, and plan to post a patch > > soon, but would appreciate comments or opinions on this solution. > > > > Thanks, > > Rob Foley > > > > stack trace > > ========================================== > > Thread 10 "qemu-system-aar" received signal SIGSEGV, Segmentation fault. > > [Switching to Thread 0xffff113f9d90 (LWP 9493)] __flockfile > > (stream=0x0) at ../sysdeps/pthread/flockfile.c:27 > > 27 ../sysdeps/pthread/flockfile.c: No such file or directory. > > (gdb) bt > > #0 __flockfile (stream=0x0) at ../sysdeps/pthread/flockfile.c:27 > > #1 0x0000aaaae0fac8b8 in qemu_flockfile (f=<optimized out>) at > > /home/rob/qemu/qemu_unchanged/include/sysemu/os-posix.h:87 > > #2 qemu_log_lock () at /home/rob/qemu/qemu_unchanged/include/qemu/log.h:57 > > #3 translator_loop (ops=0xaaaae17f1348 <aarch64_translator_ops>, > > db=0xffff113f9088, db@entry=0xffff113f9098, > > cpu=cpu@entry=0xaaab0a52bc50, > > tb=tb@entry=0xffff4c92d000 <code_gen_buffer+814927796>, > > max_insns=max_insns@entry=512) at > > /home/rob/qemu/qemu_unchanged/accel/tcg/translator.c:121 > > #4 0x0000aaaae10c1c18 in gen_intermediate_code > > (cpu=cpu@entry=0xaaab0a52bc50, tb=tb@entry=0xffff4c92d000 > > <code_gen_buffer+814927796>, max_insns=max_insns@entry=512) > > at /home/rob/qemu/qemu_unchanged/target/arm/translate.c:11320 > > #5 0x0000aaaae0fab248 in tb_gen_code (cpu=0xaaab0a52bc50, > > cpu@entry=0xffffabe2a000, pc=187650897458448, cs_base=65536, > > flags=43690, cflags=-16252928, cflags@entry=524288) > > at /home/rob/qemu/qemu_unchanged/accel/tcg/translate-all.c:1738 > > #6 0x0000aaaae0fa8e74 in tb_find (cf_mask=524288, tb_exit=0, > > last_tb=0xffff4c92cc40 <code_gen_buffer+814926836>, > > cpu=0xffffabe2a000) > > at /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:408 > > #7 cpu_exec (cpu=0xffffabe2a000, cpu@entry=0xaaab0a52bc50) at > > /home/rob/qemu/qemu_unchanged/accel/tcg/cpu-exec.c:730 > > #8 0x0000aaaae0f6de24 in tcg_cpu_exec (cpu=0xaaab0a52bc50) at > > /home/rob/qemu/qemu_unchanged/cpus.c:1454 > > #9 0x0000aaaae0f70908 in qemu_tcg_cpu_thread_fn (arg=0xaaab0a52bc50) > > at /home/rob/qemu/qemu_unchanged/cpus.c:1762 > > #10 0x0000aaaae145bd38 in qemu_thread_start (args=<optimized out>) at > > /home/rob/qemu/qemu_unchanged/util/qemu-thread-posix.c:519 > > #11 0x0000ffffabe0a088 in start_thread (arg=0xffffcc20410f) at > > pthread_create.c:463 > > #12 0x0000ffffabd7a4ec in thread_start () at > > ../sysdeps/unix/sysv/linux/aarch64/clone.S:78 > > > -- > Alex Bennée