On Thu, Nov 15, 2018 at 12:32:00 +0100, Richard Henderson wrote:
> On 11/14/18 2:00 AM, Emilio G. Cota wrote:
> > The following might be related: I'm seeing segfaults with -smp 8
> > and beyond when doing bootup+shutdown of an aarch64 guest on
> > an x86-64 host.
>
> I'm not seeing that. Anything else special on the command-line?
> Are the segv in the code_gen_buffer or elsewhere?

I just spent some time on this. I've noticed two issues:

- All TCG contexts end up using the same hash table, since
  we only allocate one table in tcg_context_init. This leads
  to memory corruption.
  This fixes it (confirmed that there aren't races with helgrind):

--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -763,6 +763,14 @@ void tcg_register_thread(void)
     err = tcg_region_initial_alloc__locked(tcg_ctx);
     g_assert(!err);
     qemu_mutex_unlock(&region.lock);
+
+#ifdef TCG_TARGET_NEED_LDST_OOL_LABELS
+    /* if n == 0, keep the hash table we allocated in tcg_context_init */
+    if (n) {
+        /* Both key and value are raw pointers. */
+        s->ldst_ool_thunks = g_hash_table_new(NULL, NULL);
+    }
+#endif
 }
 #endif /* !CONFIG_USER_ONLY */

- Segfault in code_gen_buffer. This one I don't have a fix for,
  but it's *much* easier to reproduce when -tb-size is very small,
  e.g. "-tb-size 5 -smp 2" (BTW it crashes with x86_64 guests too.)
  So at first I thought the code cache flushing was the problem,
  but I don't see how that could be, at least from a TCGContext
  viewpoint -- I agree that clearing the hash table in
  tcg_region_assign is a good place to do so.

Thanks,

		Emilio