On 200903 1154, Jason Wang wrote: > > On 2020/9/3 上午12:22, Li Qiang wrote: > > The qemu device fuzzer has found several DMA to MMIO issue. > > These issues is caused by the guest driver programs the DMA > > address, then in the device MMIO handler it trigger the DMA > > and as the DMA address is MMIO it will trigger another dispatch > > and reenter the MMIO handler again. However most of the device > > is not reentrant. > > > > DMA to MMIO will cause issues depend by the device emulator, > > mostly it will crash the qemu. Following is three classic > > DMA to MMIO issue. > > > > e1000e: https://bugs.launchpad.net/qemu/+bug/1886362 > > xhci: https://bugs.launchpad.net/qemu/+bug/1891354 > > virtio-gpu: https://bugs.launchpad.net/qemu/+bug/1888606 > > > > The DMA to MMIO issue I think can be classified as following: > > 1. DMA to the device itself > > 2. device A DMA to device B and to device C > > 3. device A DMA to device B and to device A > > > > The first case of course should not be allowed. > > The second case I think it ok as the device IO handler has no > > assumption about the IO data came from no matter it come from > > device or other device. This is for P2P DMA. > > The third case I think it also should not be allowed. > > > > So our issue has been reduced by one case: not allowed the > > device's IO handler reenter. > > > > Paolo suggested that we can refactor the device emulation with > > BH. However it is a lot of work. > > I have thought several propose to address this, also discuss > > this with Jason Wang in private email. > > > > I have can solve this issue in core framework or in specific device. > > After try several methods I choose address it in per-device for > > following reason: > > 1. If we address it in core framwork we have to recored and check the > > device or MR info in MR dispatch write function. Unfortunally we have > > no these info in core framework. > > 2. The performance will also be decrease largely > > 3. Only the device itself know its IO > > > I think we still need to seek a way to address this issue completely. > > How about adding a flag in MemoryRegionOps and detect the reentrancy through > that flag?
What happens for devices with multiple MemoryRegions? Make all the MemoryRegionOps share the same flag? What about the virtio-gpu bug, where the problem happens in a bh->mmio access rather than an mmio->mmio access? -Alex > Thanks > > > > > > The (most of the) device emulation is protected by BQL one time only > > a device emulation code can be run. We can add a flag to indicate the > > IO is running. The first two patches does this. For simplicity at the > > RFC stage I just set it while enter the IO callback and clear it exit > > the IO callback. It should be check/set/clean according the per-device's > > IO emulation. > > The second issue which itself suffers a race condition so I uses a > > atomic. > > > > > > > > > > Li Qiang (3): > > e1000e: make the IO handler reentrant > > xhci: make the IO handler reentrant > > virtio-gpu: make the IO handler reentrant > > > > hw/display/virtio-gpu.c | 10 ++++++ > > hw/net/e1000e.c | 35 +++++++++++++++++++- > > hw/usb/hcd-xhci.c | 60 ++++++++++++++++++++++++++++++++++ > > hw/usb/hcd-xhci.h | 1 + > > include/hw/virtio/virtio-gpu.h | 1 + > > 5 files changed, 106 insertions(+), 1 deletion(-) > > >