On 05/04/16 18:32, Alex Bennée wrote: > From: Paolo Bonzini <pbonz...@redhat.com> > > softmmu requires more functions to be thread-safe, because translation > blocks can be invalidated from e.g. notdirty callbacks. Probably the > same holds for user-mode emulation, it's just that no one has ever > tried to produce a coherent locking there. > > This patch will guide the introduction of more tb_lock and tb_unlock > calls for system emulation. > > Note that after this patch some (most) of the mentioned functions are > still called outside tb_lock/tb_unlock. The next one will rectify this. > > Signed-off-by: Paolo Bonzini <pbonz...@redhat.com> > Signed-off-by: Alex Bennée <alex.ben...@linaro.org> (snip) > diff --git a/include/qom/cpu.h b/include/qom/cpu.h > index 6931db9..13eeaae 100644 > --- a/include/qom/cpu.h > +++ b/include/qom/cpu.h > @@ -306,7 +306,10 @@ struct CPUState { > > void *env_ptr; /* CPUArchState */ > struct TranslationBlock *current_tb; > + > + /* Writes protected by tb_lock, reads not thread-safe */ > struct TranslationBlock *tb_jmp_cache[TB_JMP_CACHE_SIZE];
What does "reads not thread-safe" mean? Where does it get read outside of either actually held tb_lock or promised in a comment added by this patch? > + > struct GDBRegisterState *gdb_regs; > int gdb_num_regs; > int gdb_num_g_regs; > diff --git a/tcg/tcg.h b/tcg/tcg.h > index ea4ff02..a46d17c 100644 > --- a/tcg/tcg.h > +++ b/tcg/tcg.h > @@ -609,6 +609,7 @@ static inline bool tcg_op_buf_full(void) > > /* pool based memory allocation */ > > +/* tb_lock must be held for tcg_malloc_internal. */ How far are we ready to go in commenting such functions? The functions can be divided into three categories: * extern * static, called only from another one function (for better code structuring) * static, called from multiple other functions (for encapsulation of common code) I think we should decide how to comment locking requirements and follow this consistently. Concerning this particular case, I think there's no much point in making tcg_malloc_internal() and tcg_malloc() externally visible and commenting locking requirement for them. We'd better have a separate header file under include/ for external TCG interface declarations and use this one as internal only in tcg/. > void *tcg_malloc_internal(TCGContext *s, int size); > void tcg_pool_reset(TCGContext *s); > void tcg_pool_delete(TCGContext *s); > @@ -617,6 +618,7 @@ void tb_lock(void); > void tb_unlock(void); > void tb_lock_reset(void); > > +/* Called with tb_lock held. */ > static inline void *tcg_malloc(int size) > { > TCGContext *s = &tcg_ctx; > diff --git a/translate-all.c b/translate-all.c > index d923008..a7ff5e7 100644 > --- a/translate-all.c > +++ b/translate-all.c > @@ -248,7 +248,9 @@ static int encode_search(TranslationBlock *tb, uint8_t > *block) > return p - block; > } > > -/* The cpu state corresponding to 'searched_pc' is restored. */ > +/* The cpu state corresponding to 'searched_pc' is restored. > + * Called with tb_lock held. > + */ > static int cpu_restore_state_from_tb(CPUState *cpu, TranslationBlock *tb, > uintptr_t searched_pc) > { > @@ -306,8 +308,10 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr) cpu_restore_state_from_tb() called right here also requires tb_lock(). > if (tb->cflags & CF_NOCACHE) { > /* one-shot translation, invalidate it immediately */ > cpu->current_tb = NULL; > + tb_lock(); > tb_phys_invalidate(tb, -1); > tb_free(tb); > + tb_unlock(); > } > return true; > } > @@ -399,6 +403,7 @@ static void page_init(void) > } > > /* If alloc=1: > + * Called with tb_lock held for system emulation. > * Called with mmap_lock held for user-mode emulation. There's a number of functions where their comment states that tb_lock should be held for system emulation but mmap_lock for user-mode emulation. BTW, what is the purpose of mmap_lock? And how it is related to tb_lock? > */ > static PageDesc *page_find_alloc(tb_page_addr_t index, int alloc) (snip) > @@ -1429,7 +1446,9 @@ void tb_invalidate_phys_page_fast(tb_page_addr_t start, > int len) > } > if (!p->code_bitmap && > ++p->code_write_count >= SMC_BITMAP_USE_THRESHOLD) { > - /* build code bitmap */ > + /* build code bitmap. FIXME: writes should be protected by > + * tb_lock, reads by tb_lock or RCU. > + */ Probably, page_find() called earlier in this function requires tb_lock held as well as tb_invalidate_phys_page_range() which can be called later. Maybe tb_invalidate_phys_page_fast() is a good candidate to be always called with tb_lock held. > build_page_bitmap(p); > } > if (p->code_bitmap) { (snip) A list of candidates (as of rth/tcg-next) to also have such a comment includes: tb_find_physical() tb_find_slow() tb_hash_remove() tb_page_remove() tb_remove_from_jmp_list() tb_jmp_unlink() build_page_bitmap() tb_link_page() tb_invalidate_phys_range() tb_invalidate_phys_page_range() page_find() invalidate_page_bitmap() tb_invalidate_check() tb_invalidate_phys_page_fast() tb_invalidate_phys_page() tb_invalidate_phys_addr() cpu_io_recompile() However, there's no sensible description of what is protected by tb_lock and mmap_lock. I think we need to have a clear documented description of the TCG locking scheme in order to be sure we do right things in MTTCG. Kind regards, Sergey