Michael Tokarev <m...@tls.msk.ru> writes:
> 09.09.2023 13:27, Michael Tokarev wrote: >> From: Richard Henderson <richard.hender...@linaro.org> >> Acked-by: Alex Bennée <alex.ben...@linaro.org> >> Suggested-by: Alex Bennée <alex.ben...@linaro.org> >> Signed-off-by: Richard Henderson <richard.hender...@linaro.org> >> (cherry picked from commit 86e4f93d827d3c1efd00cd8a906e38a2c0f2b5bc) >> Signed-off-by: Michael Tokarev <m...@tls.msk.ru> >> diff --git a/softmmu/physmem.c b/softmmu/physmem.c >> index 3df73542e1..7597dc1c39 100644 >> --- a/softmmu/physmem.c >> +++ b/softmmu/physmem.c >> @@ -2413,9 +2413,15 @@ MemoryRegionSection *iotlb_to_section(CPUState *cpu, >> int asidx = cpu_asidx_from_attrs(cpu, attrs); >> CPUAddressSpace *cpuas = &cpu->cpu_ases[asidx]; >> AddressSpaceDispatch *d = qatomic_rcu_read(&cpuas->memory_dispatch); >> - MemoryRegionSection *sections = d->map.sections; >> + int section_index = index & ~TARGET_PAGE_MASK; >> + MemoryRegionSection *ret; >> + >> + assert(section_index < d->map.sections_nb); > > This assert now triggers on staging-8.1 > > https://ci.debian.net/data/autopkgtest/testing/amd64/d/dropbear/37993610/log.gz > https://ci.debian.net/data/autopkgtest/testing/amd64/c/cryptsetup/37993606/log.gz The asserts are basically there to detect when we attempt to do a MR lookup and it is not fully committed. If they are firing its because things have gone wrong (which we know because we still have patches in flight for the full fix). The main benefit of the assert is we fail early rather than later on in various cryptic ways (evidenced by the fact we raised 3 different bugs which exhibited slightly different symptoms that where all fundamentally caused by getting bogus memory region data). > >> + ret = d->map.sections + section_index; >> + assert(ret->mr); >> + assert(ret->mr->ops); >> - return §ions[index & ~TARGET_PAGE_MASK]; >> + return ret; >> } >> static void io_mem_init(void) > > In this upload I removed softmmu-Use-async_run_on_cpu-in-tcg_commit.patch > (0d58c660689f6da1), > and the test run uses tcg and -smp 4, which is the configuration which > 0d58c6606 > was supposed to fix. > > qemu-system-x86_64 -no-user-config -nodefaults -name > autopkgtest-cryptsetup-cryptroot-sysvinit \ > -machine type=q35,graphics=off -cpu qemu64,-svm,-vmx -smp cpus=4 -m size=2G \ > -vga none -display none -object rng.... > > I wonder if I should keep 0d58c6606 for 8.1.1 (the deadline is > tomorrow).. Unfortunately 0d58c is not the full fix, it papered over one crack but revealed others. It might be leading to a false sense of security. So I would argue: - keep the assert - better to fail early than to fail later in a hard to understand way - toss a coin for the 0d58c66 fix, if we include it we may end up reverting later once we have the "complete" fix but at least its slightly better for x86 while definitely breaking MIPS > > Thanks, > > /mjt -- Alex Bennée Virtualisation Tech Lead @ Linaro