On 10/11/2017 09:20, Pavel Dovgalyuk wrote: >> From: Paolo Bonzini [mailto:pbonz...@redhat.com] >> On 03/11/2017 09:27, Pavel Dovgalyuk wrote: >>>> From: Paolo Bonzini [mailto:pbonz...@redhat.com] >>>> On 02/11/2017 12:33, Paolo Bonzini wrote: >>>>> On 02/11/2017 12:24, Pavel Dovgalyuk wrote: >>>>>>> I am not sure about this. I think if instead you should return false >>>>>>> from here and EXCP_INTERRUPT from cpu_exec. >>>>>> The problem is inside the TB. It checks cpu->icount_decr.u16.high which >>>>>> is -1. >>>>>> And we have to enter the TB to cause an exception (because it exists in >>>>>> replay log). >>>>>> That is why we reset this flag and try to execute the TB. >>>>> >>>>> But if u16.high is -1, shouldn't you return EXCP_INTERRUPT first (via >>>>> "Finally, check if we need to exit to the main loop" in >>>>> cpu_handle_interrupt)? Then only cause the exception when that one is >>>>> processed. >>>> >>>> ... indeed, you probably need something like: >>>> >>>> /* Clear the interrupt flag now since we're processing >>>> * cpu->interrupt_request and cpu->exit_request. >>>> */ >>>> insns_left = atomic_read(&cpu->icount_decr.u32); >>>> atomic_set(&cpu->icount_decr.u16.high, 0); >>>> if (unlikely(insns_left < 0) { >>>> /* Ensure the zeroing of icount_decr comes before the next read >>>> * of cpu->exit_request or cpu->interrupt_request. >>>> */ >>>> smb_mb(); >>>> } >>>> >>>> at the top of cpu_handle_interrupt. Then you can remove the same >>>> atomic_set+smp_mb in cpu_loop_exec_tb, like >>>> >>>> *last_tb = NULL; >>>> insns_left = atomic_read(&cpu->icount_decr.u32); >>>> if (insns_left < 0) { >>>> /* Something asked us to stop executing chained TBs; just >>>> * continue round the main loop. Whatever requested the exit >>>> * will also have set something else (eg exit_request or >>>> * interrupt_request) which will be handled by >>>> * cpu_handle_interrupt. cpu_handle_interrupt will also >>>> * clear cpu->icount_decr.u16.high. >>>> */ >>>> return; >>>> } >>> >>> I tried this approach and it didn't work. >>> I think iothread sets u16.high flag after resetting it in >>> cpu_handle_interrupt. >> >> But why is this a problem? The TB would exit immediately and go again >> to cpu_handle_interrupt. cpu_handle_interrupt returns true and >> cpu_handle_exception causes the exception via cpu_exec_nocache. > > I've tested your variant more thoroughly. > It seems, that iothread calls cpu_exec between > atomic_set(&cpu->icount_decr.u16.high, 0); > in cpu_handle_interrupt and cpu_exec_nocache in cpu_handle_exception. > I see no other reason, because this happens not for the every time. > And cpu_handle_interrupt is not called again, because cpu_handle_exception > returns true. > Therefore we have an infinite loop, because no other code here resets > cpu->icount_decr.u16.high.
Then returning true unconditionally is wrong in the cpu_exec_nocache case. What if you do: diff --git a/accel/tcg/cpu-exec.c b/accel/tcg/cpu-exec.c index 61297f8f4a..fb5446be3e 100644 --- a/accel/tcg/cpu-exec.c +++ b/accel/tcg/cpu-exec.c @@ -470,7 +470,19 @@ static inline void cpu_handle_debug_exception(CPUState *cpu) static inline bool cpu_handle_exception(CPUState *cpu, int *ret) { - if (cpu->exception_index >= 0) { + if (cpu->exception_index < 0) { +#ifndef CONFIG_USER_ONLY + if (replay_has_exception() + && cpu->icount_decr.u16.low + cpu->icount_extra == 0) { + /* try to cause an exception pending in the log */ + cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0, curr_cflags()), true); + } +#endif + if (cpu->exception_index < 0) { + return; + } + } + if (cpu->exception_index >= EXCP_INTERRUPT) { /* exit request from the cpu execution loop */ *ret = cpu->exception_index; @@ -505,16 +517,6 @@ static inline bool cpu_handle_exception(CPUState *cpu, int *ret) } #endif } -#ifndef CONFIG_USER_ONLY - } else if (replay_has_exception() - && cpu->icount_decr.u16.low + cpu->icount_extra == 0) { - /* try to cause an exception pending in the log */ - cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0, curr_cflags()), true); - *ret = -1; - return true; -#endif - } - return false; }