On Thu, 8 May 2025 14:29:18 +0100
Jonathan Cameron <jonathan.came...@huawei.com> wrote:

> On Tue, 29 Apr 2025 19:43:05 -0700
> Richard Henderson <richard.hender...@linaro.org> wrote:
> 
> > On 4/29/25 14:35, Alistair Francis wrote:  
> > > On Sat, Apr 26, 2025 at 3:36 AM Jonathan Cameron via
> > > <qemu-devel@nongnu.org> wrote:    
> > >>
> > >> On Tue, 22 Apr 2025 12:26:55 -0700
> > >> Richard Henderson <richard.hender...@linaro.org> wrote:
> > >>    
> > >>> Recover two bits from the inline flags.    
> > >>
> > >>
> > >> Hi Richard,
> > >>
> > >> Early days, but something (I'm fairly sure in this patch) is tripping up
> > >> my favourite TCG corner case of running code out of MMIO memory
> > >> (interleaved CXL memory).
> > >>
> > >> So far I'm only seeing it on the arm64 tests, which aren't upstream yet...
> > >> (guess what I was getting ready to post today)
> > >>
> > >> Back trace is:
> > >>
> > >> #0  0x0000555555fd4296 in cpu_atomic_fetch_andq_le_mmu 
> > >> (env=0x555557ee19b0, addr=18442241572520067072, 
> > >> val=18446744073701163007, oi=8244, retaddr=<optimized out>) at 
> > >> ../../accel/tcg/atomic_template.h:140
> > >> #1  0x00007fffb6894125 in code_gen_buffer ()
> > >> #2  0x0000555555fc4c46 in cpu_tb_exec (cpu=cpu@entry=0x555557ededf0, 
> > >> itb=itb@entry=0x7fffb6894000 <code_gen_buffer+200511443>, 
> > >> tb_exit=tb_exit@entry=0x7ffff4bfb744) at ../../accel/tcg/cpu-exec.c:455
> > >> #3  0x0000555555fc51c2 in cpu_loop_exec_tb (tb_exit=0x7ffff4bfb744, 
> > >> last_tb=<synthetic pointer>, pc=<optimized out>, tb=0x7fffb6894000 
> > >> <code_gen_buffer+200511443>, cpu=0x555557ededf0) at 
> > >> ../../accel/tcg/cpu-exec.c:904
> > >> #4  cpu_exec_loop (cpu=cpu@entry=0x555557ededf0, 
> > >> sc=sc@entry=0x7ffff4bfb7f0) at ../../accel/tcg/cpu-exec.c:1018
> > >> #5  0x0000555555fc58f1 in cpu_exec_setjmp (cpu=cpu@entry=0x555557ededf0, 
> > >> sc=sc@entry=0x7ffff4bfb7f0) at ../../accel/tcg/cpu-exec.c:1035
> > >> #6  0x0000555555fc5f6c in cpu_exec (cpu=cpu@entry=0x555557ededf0) at 
> > >> ../../accel/tcg/cpu-exec.c:1061
> > >> #7  0x0000555556146ac3 in tcg_cpu_exec (cpu=cpu@entry=0x555557ededf0) at 
> > >> ../../accel/tcg/tcg-accel-ops.c:81
> > >> #8  0x0000555556146ee3 in mttcg_cpu_thread_fn 
> > >> (arg=arg@entry=0x555557ededf0) at 
> > >> ../../accel/tcg/tcg-accel-ops-mttcg.c:94
> > >> #9  0x00005555561f6450 in qemu_thread_start (args=0x555557f8f430) at 
> > >> ../../util/qemu-thread-posix.c:541
> > >> #10 0x00007ffff7750aa4 in start_thread (arg=<optimized out>) at 
> > >> ./nptl/pthread_create.c:447
> > >> #11 0x00007ffff77ddc3c in clone3 () at 
> > >> ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
> > >>
> > >> I haven't pushed out the rebased tree yet, which makes this a truly awful
> > >> bug report.
> > >>
> > >> The pull request you sent with this in wasn't bisectable, so this was a
> > >> bit of a guessing game, but I only see the segfault after this patch.
> > > 
> > > I see the same thing with some RISC-V tests. I can provide the test
> > > images if you want as well.
> > 
> > 
> > Yes please.
> > 
> > 
> > r~  
> 
> I'm guessing Alistair is busy.
> 
> I got around to testing this on x86 and indeed the blow-up is the same.
> 
> 0x0000555555e3dd77 in cpu_atomic_add_fetchl_le_mmu (env=0x55555736bef0, 
> addr=140271756837240, val=1, oi=34, retaddr=<optimized out>) at 
> ../../accel/tcg/atomic_template.h:143
> 143     GEN_ATOMIC_HELPER(add_fetch)
> (gdb) bt
> #0  0x0000555555e3dd77 in cpu_atomic_add_fetchl_le_mmu (env=0x55555736bef0, 
> addr=140271756837240, val=1, oi=34, retaddr=<optimized out>) at 
> ../../accel/tcg/atomic_template.h:143
> #1  0x00007fffbc31c6f0 in code_gen_buffer ()
> #2  0x0000555555e23aa6 in cpu_tb_exec (cpu=cpu@entry=0x555557369330, 
> itb=itb@entry=0x7fffbc31c600 <code_gen_buffer+295441875>, 
> tb_exit=tb_exit@entry=0x7ffff4bfd6ec) at ../../accel/tcg/cpu-exec.c:438
> #3  0x0000555555e24025 in cpu_loop_exec_tb (tb_exit=0x7ffff4bfd6ec, 
> last_tb=<synthetic pointer>, pc=<optimized out>, tb=0x7fffbc31c600 
> <code_gen_buffer+295441875>, cpu=0x555557369330) at 
> ../../accel/tcg/cpu-exec.c:872
> #4  cpu_exec_loop (cpu=cpu@entry=0x555557369330, sc=sc@entry=0x7ffff4bfd7b0) 
> at ../../accel/tcg/cpu-exec.c:982
> #5  0x0000555555e247a1 in cpu_exec_setjmp (cpu=cpu@entry=0x555557369330, 
> sc=sc@entry=0x7ffff4bfd7b0) at ../../accel/tcg/cpu-exec.c:999
> #6  0x0000555555e24e2c in cpu_exec (cpu=cpu@entry=0x555557369330) at 
> ../../accel/tcg/cpu-exec.c:1025
> #7  0x0000555555e42c73 in tcg_cpu_exec (cpu=cpu@entry=0x555557369330) at 
> ../../accel/tcg/tcg-accel-ops.c:81
> #8  0x0000555555e43093 in mttcg_cpu_thread_fn (arg=arg@entry=0x555557369330) 
> at ../../accel/tcg/tcg-accel-ops-mttcg.c:94
> #9  0x0000555555ef2250 in qemu_thread_start (args=0x5555573e6e20) at 
> ../../util/qemu-thread-posix.c:541
> #10 0x00007ffff7750aa4 in start_thread (arg=<optimized out>) at 
> ./nptl/pthread_create.c:447
> #11 0x00007ffff77ddc3c in clone3 () at 
> ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78
> 
> I need one extra patch for my particular setup to work around some DMA buffer
> issues in virtio (similar to a patch for PCI space last year). I've been
> meaning to post an RFC to get feedback on how to handle this, but I haven't
> gotten to it yet!
> 
> From 801e47897c5959a22ed050d7e7feebbbd3a12588 Mon Sep 17 00:00:00 2001
> From: Jonathan Cameron <jonathan.came...@huawei.com>
> Date: Mon, 22 Apr 2024 13:54:37 +0100
> Subject: [PATCH] physmem: Increase bounce buffers for "memory" address space.
> 
> Doesn't need to be this big and should be configurable.
> 
> Signed-off-by: Jonathan Cameron <jonathan.came...@huawei.com>
> ---
>  system/physmem.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/system/physmem.c b/system/physmem.c
> index 3f4fd69d9a..651b875827 100644
> --- a/system/physmem.c
> +++ b/system/physmem.c
> @@ -2798,6 +2798,7 @@ static void memory_map_init(void)
>      memory_region_init(system_memory, NULL, "system", UINT64_MAX);
>      address_space_init(&address_space_memory, system_memory, "memory");
>  
> +    address_space_memory.max_bounce_buffer_size = 1024 * 1024 * 1024;
>      system_io = g_malloc(sizeof(*system_io));
>      memory_region_init_io(system_io, NULL, &unassigned_io_ops, NULL, "io",
>                            65536);

Hi Richard

As discussed on Friday, I've put a test kernel up at
https://gitlab.com/jic23/qemu-debug
It's just a build of mainline as checked out today. I'll commit the kernel
config as well for information. Nothing particularly special, just a lot of
stuff built in so you don't need to fuss around with modules in the root
fs / initrd etc.

The readme.md file in that repo has instructions to replicate with a typical
setup, plus shell scripts. The only thing you'll need to install on the
standard Debian nocloud image mentioned there is numactl; otherwise it's all
cut-and-paste scripts.

Let me know either if this doesn't work for you (it should segfault on
numactl -m 2 ls) or if there is anything else I can do to help debug this one.
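
As an aside, on the "should be configurable" note in the bounce-buffer hack
quoted above: here is a minimal sketch of what I have in mind, reading the
limit from an environment variable so it can be tuned without rebuilding. The
QEMU_MAX_BOUNCE_BUFFER_SIZE variable is purely a made-up local debugging knob,
not an upstream interface, and the default stays at the 1 GiB used in the hack.

/*
 * Hypothetical debug-only helper for memory_map_init() in system/physmem.c.
 * QEMU_MAX_BOUNCE_BUFFER_SIZE is not an upstream option, just a local hack
 * so the bounce-buffer limit can be changed without recompiling.
 */
static uint64_t debug_max_bounce_buffer_size(void)
{
    const char *sz = getenv("QEMU_MAX_BOUNCE_BUFFER_SIZE");

    return sz ? strtoull(sz, NULL, 0) : 1024 * 1024 * 1024ULL;
}

and then in memory_map_init():

    address_space_memory.max_bounce_buffer_size = debug_max_bounce_buffer_size();

Longer term this presumably wants to be a proper property rather than anything
like the above, but that's a discussion for the RFC I mentioned.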

Thanks,

Jonathan

