On 01/04/18 15:29, Vitaly Kuznetsov wrote:
> Laszlo Ersek <ler...@redhat.com> writes:
> In fact, the only writew() that needs patching is in vp_notify(); when I
> replace it with 'asm volatile' everything works.
>
>> * Does it make a difference if you disable EPT in the L1 KVM
>> configuration? (EPT is probably primarily controlled by the CPU features
>> exposed by L0 Hyper-V, and secondarily by the "ept" parameter of the
>> "kvm_intel" module in L1.)
>>
>> Asking about EPT because the virtio rings and descriptors are in RAM,
>> accessing which in L2 should "normally" never trap to L1/L0. However (I
>> *guess*), when those pages are accessed for the very first time in L2,
>> they likely do trap, and then the EPT setting in L1 might make a difference.
>
> Disabling EPT helps!

OK...

> I also tried tracing L1 KVM and the difference between working and
> non-working cases seems to be:
>
> 1) Working:
>
> ...
> <...>-51387 [014] 64765.695019: kvm_page_fault: address fe007000 error_code 182
> <...>-51387 [014] 64765.695024: kvm_emulate_insn: 0:eca87: 66 89 14 30
> <...>-51387 [014] 64765.695026: vcpu_match_mmio: gva 0xfe007000 gpa 0xfe007000 Write GPA
> <...>-51387 [014] 64765.695026: kvm_mmio: mmio write len 2 gpa 0xfe007000 val 0x0
> <...>-51387 [014] 64765.695033: kvm_entry: vcpu 0
> <...>-51387 [014] 64765.695042: kvm_exit: reason EPT_VIOLATION rip 0xeae17 info 181 306
> <...>-51387 [014] 64765.695043: kvm_page_fault: address f0694 error_code 181
> <...>-51387 [014] 64765.695044: kvm_entry: vcpu 0
> ...
>
> 2) Broken:
>
> ...
> <...>-38071 [014] 63385.241117: kvm_page_fault: address fe007000 error_code 182
> <...>-38071 [014] 63385.241121: kvm_emulate_insn: 0:ecffb: 66 89 06
> <...>-38071 [014] 63385.241123: vcpu_match_mmio: gva 0xfe007000 gpa 0xfe007000 Write GPA
> <...>-38071 [014] 63385.241124: kvm_mmio: mmio write len 2 gpa 0xfe007000 val 0x0
> <...>-38071 [014] 63385.241143: kvm_entry: vcpu 0
> <...>-38071 [014] 63385.241162: kvm_exit: reason EXTERNAL_INTERRUPT rip 0xecffe info 0 800000f6
> <...>-38071 [014] 63385.241162: kvm_entry: vcpu 0
> ...
>
> The 'kvm_emulate_insn' difference is actually the different versions of
> 'mov' we get with the current code and with my 'asm volatile'
> version. What makes me wonder is where the 'EXTERNAL_INTERRUPT' (only
> seen in broken version) comes from.
>

I don't think said interrupt matters.

I also don't think the MOV differences matter; after all, in both cases
we end up with the identical

  vcpu_match_mmio: gva 0xfe007000 gpa 0xfe007000 Write GPA
  kvm_mmio: mmio write len 2 gpa 0xfe007000 val 0x0

sequence.

Here's another random idea:

I'll admit that I have no clue how SeaBIOS uses SMM, but I found an
earlier email from Paolo
<886757208.6870637.1484133921200.javamail.zim...@redhat.com> where he
wrote, "the main reason for it [i.e., SMM], is that it provides a safer
way to access a PCI device's memory BARs". (SeaBIOS commit 55215cd425d36
seems to give some background.)

And that kind of access is what vp_notify()/writew() does, and I see
"call32_smm" / "handle_smi" log entries in your thread starter,
intermixed with "vp notify".

Downstream we disabled SMM in SeaBIOS because we deemed the additional
safety (see above) unnecessary for our limited BIOS service use cases
(=mostly grub), while SMM caused obscure problems:
- https://bugzilla.redhat.com/show_bug.cgi?id=1378006
- https://bugzilla.redhat.com/show_bug.cgi?id=1425516

So... can you rebuild SeaBIOS with "CONFIG_USE_SMM=n"?
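
For reference, a minimal sketch of such a rebuild. It assumes an upstream
SeaBIOS source tree with its usual Kconfig-based build, and that the L2
guest is launched by hand with QEMU; the working directory and the QEMU
command line are assumptions, not details taken from this thread.

```shell
# Assumption: run from the top of an upstream SeaBIOS source tree.
make menuconfig                 # clear the SMM option (CONFIG_USE_SMM)
grep CONFIG_USE_SMM .config     # expect it to show as disabled
make                            # writes the image to out/bios.bin

# Assumption: the L2 guest is started by hand; add your usual options.
qemu-system-x86_64 -bios out/bios.bin
```

If "vp notify" then works without the 'asm volatile' change, that would
point at the SMM path rather than at the MOV encoding.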

(If you originally encountered the strange behavior with downstream
SeaBIOS, which already has CONFIG_USE_SMM=n, then please ignore...)

Thanks,
Laszlo