On 14/02/20 15:49, Alex Bennée wrote: > The bug describes a race whereby cpu_exec_step_atomic can acquire a TB > which is invalidated by a tb_flush before we execute it. This doesn't > affect the other cpu_exec modes as a tb_flush by it's nature can only > occur on a quiescent system. The race was described as: > > B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code > B3. tcg_tb_alloc obtains a new TB > > C3. TB obtained with tb_lookup__cpu_state or tb_gen_code > (same TB as B2) > > A3. start_exclusive critical section entered > A4. do_tb_flush is called, TB memory freed/re-allocated > A5. end_exclusive exits critical section > > B2. tcg_cpu_exec => cpu_exec => tb_find => tb_gen_code > B3. tcg_tb_alloc reallocates TB from B2 > > C4. start_exclusive critical section entered > C5. cpu_tb_exec executes the TB code that was free in A4 > > The simplest fix is to widen the exclusive period to include the TB > lookup. As a result we can drop the complication of checking we are in > the exclusive region before we end it.
Reviewed-by: Paolo Bonzini <pbonz...@redhat.com> > Signed-off-by: Alex Bennée <alex.ben...@linaro.org> > Cc: Yifan <m...@yifanlu.com> > Cc: Bug 1863025 <1863...@bugs.launchpad.net> > --- > accel/tcg/cpu-exec.c | 21 +++++++++++---------- > 1 file changed, 11 insertions(+), 10 deletions(-) > > diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c > index 2560c90eec7..d95c4848a47 100644 > --- a/accel/tcg/cpu-exec.c > +++ b/accel/tcg/cpu-exec.c > @@ -240,6 +240,8 @@ void cpu_exec_step_atomic(CPUState *cpu) > uint32_t cf_mask = cflags & CF_HASH_MASK; > > if (sigsetjmp(cpu->jmp_env, 0) == 0) { > + start_exclusive(); > + > tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask); > if (tb == NULL) { > mmap_lock(); > @@ -247,8 +249,6 @@ void cpu_exec_step_atomic(CPUState *cpu) > mmap_unlock(); > } > > - start_exclusive(); > - > /* Since we got here, we know that parallel_cpus must be true. */ > parallel_cpus = false; > cc->cpu_exec_enter(cpu); > @@ -271,14 +271,15 @@ void cpu_exec_step_atomic(CPUState *cpu) > qemu_plugin_disable_mem_helpers(cpu); > } > > - if (cpu_in_exclusive_context(cpu)) { > - /* We might longjump out of either the codegen or the > - * execution, so must make sure we only end the exclusive > - * region if we started it. > - */ > - parallel_cpus = true; > - end_exclusive(); > - } > + > + /* > + * As we start the exclusive region before codegen we must still > + * be in the region if we longjump out of either the codegen or > + * the execution. > + */ > + g_assert(cpu_in_exclusive_context(cpu)); > + parallel_cpus = true; > + end_exclusive(); > } > > struct tb_desc { >