> From: Paolo Bonzini [mailto:pbonz...@redhat.com] > On 03/11/2017 09:27, Pavel Dovgalyuk wrote: > >> From: Paolo Bonzini [mailto:pbonz...@redhat.com] > >> On 02/11/2017 12:33, Paolo Bonzini wrote: > >>> On 02/11/2017 12:24, Pavel Dovgalyuk wrote: > >>>>> I am not sure about this. I think if instead you should return false > >>>>> from here and EXCP_INTERRUPT from cpu_exec. > >>>> The problem is inside the TB. It checks cpu->icount_decr.u16.high which > >>>> is -1. > >>>> And we have to enter the TB to cause an exception (because it exists in > >>>> replay log). > >>>> That is why we reset this flag and try to execute the TB. > >>> > >>> But if u16.high is -1, shouldn't you return EXCP_INTERRUPT first (via > >>> "Finally, check if we need to exit to the main loop" in > >>> cpu_handle_interrupt)? Then only cause the exception when that one is > >>> processed. > >> > >> ... indeed, you probably need something like: > >> > >> /* Clear the interrupt flag now since we're processing > >> * cpu->interrupt_request and cpu->exit_request. > >> */ > >> insns_left = atomic_read(&cpu->icount_decr.u32); > >> atomic_set(&cpu->icount_decr.u16.high, 0); > >> if (unlikely(insns_left < 0) { > >> /* Ensure the zeroing of icount_decr comes before the next read > >> * of cpu->exit_request or cpu->interrupt_request. > >> */ > >> smb_mb(); > >> } > >> > >> at the top of cpu_handle_interrupt. Then you can remove the same > >> atomic_set+smp_mb in cpu_loop_exec_tb, like > >> > >> *last_tb = NULL; > >> insns_left = atomic_read(&cpu->icount_decr.u32); > >> if (insns_left < 0) { > >> /* Something asked us to stop executing chained TBs; just > >> * continue round the main loop. Whatever requested the exit > >> * will also have set something else (eg exit_request or > >> * interrupt_request) which will be handled by > >> * cpu_handle_interrupt. cpu_handle_interrupt will also > >> * clear cpu->icount_decr.u16.high. > >> */ > >> return; > >> } > > > > I tried this approach and it didn't work. > > I think iothread sets u16.high flag after resetting it in > > cpu_handle_interrupt. > > But why is this a problem? The TB would exit immediately and go again > to cpu_handle_interrupt. cpu_handle_interrupt returns true and > cpu_handle_exception causes the exception via cpu_exec_nocache.
I've tested your variant more thoroughly. It seems, that iothread calls cpu_exec between atomic_set(&cpu->icount_decr.u16.high, 0); in cpu_handle_interrupt and cpu_exec_nocache in cpu_handle_exception. I see no other reason, because this happens not for the every time. And cpu_handle_interrupt is not called again, because cpu_handle_exception returns true. Therefore we have an infinite loop, because no other code here resets cpu->icount_decr.u16.high. Pavel Dovgalyuk