Hey Paolo,

Thanks for confirming that AVX is not supported for MMIO space.
So for the emulated device, I basically have to force the compiler to avoid using vmovdqu. I am curious about how KVM emulates those instructions. Do you mind sharing some pointers to the related code?

Thanks,
Xu

On Mar 4, 2024, at 6:39 PM, Paolo Bonzini <pbonz...@redhat.com> wrote:

> On 3/4/24 22:59, Alex Williamson wrote:
>> Since you're not seeing a KVM_EXIT_MMIO I'd guess this is more of a KVM issue than QEMU (Cc kvm list). Possibly KVM doesn't emulate vmovdqu relative to an MMIO access, but honestly I'm not positive that AVX instructions are meant to work on MMIO space. I'll let x86 KVM experts more familiar with specific opcode semantics weigh in on that.
>
> Indeed, KVM's instruction emulator supports some SSE MOV instructions but not the corresponding AVX instructions. Vector instructions however do work on MMIO space, and they are used occasionally especially in combination with write-combining memory.
>
> SSE support was added to KVM because some operating systems used SSE instructions to read and write to VRAM. However, so far we've never received any reports of OSes using AVX instructions on devices that QEMU can emulate (as opposed to, for example, GPU VRAM that is passed through).
>
> Thanks,
>
> Paolo
>
>> Is your "program" just doing a memcpy() with an mmap() of the PCI BAR acquired through pci-sysfs or a userspace vfio-pci driver within the guest?
>>
>> In QEMU 4a2e242bbb30 ("memory: Don't use memcpy for ram_device regions") we resolved an issue[1] where QEMU itself was doing a memcpy() to assigned device MMIO space resulting in breaking functionality of the device. IIRC memcpy() was using an SSE instruction that didn't fault, but didn't work correctly relative to MMIO space either.
>>
>> So I also wouldn't rule out that the program isn't inherently misbehaving by using memcpy() and thereby ignoring the nature of the device MMIO access semantics. Thanks,
>>
>> Alex
>>
>> [1] https://bugs.launchpad.net/qemu/+bug/1384892
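
For reference, here is a minimal sketch of one way a guest program could keep vmovdqu out of a BAR copy: replace memcpy() with explicit fixed-width volatile stores. The helper name bar_copy_to is hypothetical (it is not from QEMU, KVM, or the kernel), and the sketch assumes the BAR pointer is 4-byte aligned and that the device accepts both 32-bit and 8-bit writes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper (not from QEMU, KVM, or the kernel): copy len bytes
 * from ordinary memory into an mmap()ed BAR using only 32-bit and 8-bit
 * stores. The volatile qualifier tells the compiler to perform each store
 * as written instead of merging them into wider (vector) stores.
 */
static void bar_copy_to(volatile void *bar, const void *src, size_t len)
{
    volatile uint32_t *dst32 = bar;
    const uint8_t *s = src;
    size_t i;

    /* Assumption: the destination is 4-byte aligned. */
    assert(((uintptr_t)bar & 3) == 0);

    for (i = 0; i + 4 <= len; i += 4) {
        uint32_t v;
        memcpy(&v, s + i, sizeof(v)); /* fixed 4-byte load from normal RAM */
        dst32[i / 4] = v;             /* one 32-bit store to the device */
    }

    /* Remaining 0..3 bytes go out as single-byte stores. */
    volatile uint8_t *dst8 = (volatile uint8_t *)bar + i;
    for (; i < len; i++)
        *dst8++ = s[i];
}
```

The program would call bar_copy_to(bar, buf, len) wherever it currently calls memcpy(bar, buf, len). Building with -mno-avx alone may not be enough, since the C library's memcpy() typically selects its implementation at run time regardless of how the caller was compiled.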