On 230310 0723, Alexander Bulekov wrote: > On 230310 1214, Fiona Ebner wrote: > > Am 05.02.23 um 05:07 schrieb Alexander Bulekov: > > > Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA. > > > This flag is set/checked prior to calling a device's MemoryRegion > > > handlers, and set when device code initiates DMA. The purpose of this > > > flag is to prevent two types of DMA-based reentrancy issues: > > > > > > 1.) mmio -> dma -> mmio case > > > 2.) bh -> dma write -> mmio case > > > > > > These issues have led to problems such as stack-exhaustion and > > > use-after-frees. > > > > > > Summary of the problem from Peter Maydell: > > > https://lore.kernel.org/qemu-devel/cafeaca_23vc7he3iam-jva6w38lk4hjowae5kcknhprd5fp...@mail.gmail.com > > > > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62 > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540 > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541 > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556 > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557 > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827 > > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282 > > > > > > Reviewed-by: Darren Kenny <darren.ke...@oracle.com> > > > Reviewed-by: Stefan Hajnoczi <stefa...@redhat.com> > > > Signed-off-by: Alexander Bulekov <alx...@bu.edu> > > > Acked-by: Peter Xu <pet...@redhat.com> > > > --- > > > include/hw/qdev-core.h | 7 +++++++ > > > softmmu/memory.c | 17 +++++++++++++++++ > > > softmmu/trace-events | 1 + > > > 3 files changed, 25 insertions(+) > > > > > Hi, > > there seems to be an issue with this patch or existing issue exposed by > > this patch in combination with the LSI SCSI controller. > > > > After applying this patch on current master (i.e. > > ee59483267de29056b5b2ee2421ef3844e5c9932), a Debian 11 with the LSI > > controller would not boot properly anymore: > > > [ 7.540907] sym0: <895a> rev 0x0 at pci 0000:00:05.0 irq 10 > > > [ 7.546028] sym0: No NVRAM, ID 7, Fast-40, LVD, parity checking > > > [ 7.559724] sym0: SCSI BUS has been reset. > > > [ 7.560820] sym0: interrupted SCRIPT address not found. > > > [ 7.563802] scsi host2: sym-2.2.3 > > > [ 7.881111] e1000 0000:00:03.0 eth0: (PCI:33MHz:32-bit) > > > 52:54:00:12:34:56 > > > [ 7.881998] e1000 0000:00:03.0 eth0: Intel(R) PRO/1000 Network > > > Connection > > > [ 7.925902] e1000 0000:00:03.0 ens3: renamed from eth0 > > > [ 32.654811] scsi 2:0:0:0: tag#192 ABORT operation started > > > [ 37.764283] scsi 2:0:0:0: ABORT operation timed-out. > > > [ 37.774974] scsi 2:0:0:0: tag#192 DEVICE RESET operation started > > > [ 42.882488] scsi 2:0:0:0: DEVICE RESET operation timed-out. > > > [ 42.883606] scsi 2:0:0:0: tag#192 BUS RESET operation started > > > [ 48.002437] scsi 2:0:0:0: BUS RESET operation timed-out. > > > [ 48.003030] scsi 2:0:0:0: tag#192 HOST RESET operation started > > > [ 48.010226] sym0: SCSI BUS has been reset. > > > [ 53.122472] scsi 2:0:0:0: HOST RESET operation timed-out. > > > [ 53.123030] scsi 2:0:0:0: Device offlined - not ready after error > > > recovery > > > > The commandline I used is: > > ./qemu-system-x86_64 \ > > -cpu 'kvm64' \ > > -m 4096 \ > > -serial 'stdio' \ > > -device 'lsi,id=scsihw0,bus=pci.0,addr=0x5' \ > > -drive > > 'file=/dev/zvol/myzpool/vm-9006-disk-0,if=none,id=drive-scsi0,format=raw' \ > > -device > > 'scsi-hd,bus=scsihw0.0,scsi-id=0,drive=drive-scsi0,id=scsi0,bootindex=100' \ > > -machine 'pc' > > > > Happy to provide any more information if necessary! > > > > CC-ing Fam Zheng (reviewer:SCSI) > > > > Originally reported by one of our community members: > > https://forum.proxmox.com/threads/123843/ > > > > Best Regards, > > Fiona > > > > Thanks, I confirmed this by booting up a livecd iso with an lsi device > attached. I will do some digging > > Stack-trace: > > #0 trace_memory_region_reentrant_io (cpu_index=<optimized out>, > mr=<optimized out>, offset=<optimized out>, size=<optimized out>) at > trace/trace-softmmu.h:337 > #1 0x000055555815ce67 in access_with_adjusted_size (addr=addr@entry=0x1000, > value=0x7ffef01fb980, size=size@entry=0x4, access_size_min=0x1, > access_size_min@entry=0x0, access_size_max=0x4, access_fn=0x555558181370 > <memory_region_read_accessor>, mr=0x627000000c50, attrs=... > ) at ../softmmu/memory.c:552 > #2 0x000055555815aec7 in memory_region_dispatch_read1 (mr=0x627000000c50, > addr=0x1000, pval=<optimized out>, size=0x4, attrs=...) at > ../softmmu/memory.c:1448
This MR seems to be "lsi-ram". >From hw/scsi/lsi53c895a.c: memory_region_init_io(&s->ram_io, OBJECT(s), &lsi_ram_ops, s, "lsi-ram", 0x2000); So the LSI device is reading an LSI "Script" from its own IO region.. In this particular case, I think there was no reason to use memory_region_init_io rather than memory_region_init_ram, but this makes me worried that there are other devices that use something like this. -Alex