On 29/03/16 00:21, Paolo Bonzini wrote: > > On 28/03/2016 17:18, Sergey Fedorov wrote: >> The use pattern of 'tb_invalidated_flag' is a bit intricate; correct me, >> if I'm wrong about the following. Basically, 'tb_invalidated_flag' was >> meant to catch two events: >> * some TB has been invalidated by tb_phys_invalidate(); > This is patch 4. > >> * the whole translation buffer has been flushed by tb_flush(). > This is patch 5. > >> Then it is checked to ensure: >> * the last executed TB can be safely patched to directly call the next >> one in cpu_exec(); >> * the original TB should be provided for further possible invalidation >> along with the temporarily generated TB when in cpu_exec_nocache(). >> >> [...] I would suggest the following solution: >> (1) Use 'tb->pc' as an indicator of whether TB is valid; check for it >> in cpu_exec() when deciding on whether to patch the last executed >> TB or not >> (2) Use 'tcg_ctx.tb_ctx.tb_flush_count' to check for translation buffer >> flushes; capture it before calling tb_gen_code() and compare to it >> afterwards to check if tb_flush() has been called in between > Of course that would work, but it would be slower.
What's going to be slower? > I think it is > unnecessary for two reasons: > > 1) There are two calls to cpu_exec_nocache. One exits immediately with > "break;", the other always sets "next_tb = 0;". Therefore it is safe in > both cases for cpu_exec_nocache to hijack cpu->tb_invalidated_flag. > > 2) if it were broken, it would _also_ be broken before these patches > because cpu_exec_nocache always runs with tb_lock taken. I can't see how cpu_exec_nocache() always runs with tb_lock taken. > So I think > documenting the assumptions is better than changing them at the same > time as doing other changes. I'm not sure I understand you here exactly, but if implementing my proposal, it'd rather be a separate patch/series, I think. > Your observation that tb->pc==-1 is not necessarily safe still holds of > course. Probably the best thing is an inline that can do one of: > > 1) set cs_base to an invalid value (anything nonzero is enough except on > x86 and SPARC; SPARC can use all-ones) > > 2) sets the flags to an invalid combination (x86 can use all ones) > > 3) sets the PC to an invalid value (no one really needs it) It's a bit tricky. Does it really worth doing so instead of using a separate dedicated flag? Mainly, it should cost one extra compare on TB look-up. I suppose it's a kind of trade-off between performance and code clarity. Kind regards, Sergey