Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On Thu, May 14, 2015 at 11:37:46AM +0100, Peter Maydell wrote:
> On 14 May 2015 at 11:31, Andrew Jones drjo...@redhat.com wrote:
>> Forgot to (4): switch from setting userspace's mapping to device memory to normal, non-cacheable. Using device memory caused a problem that Alex Graf found, and Peter Maydell suggested using normal, non-cacheable instead.
>
> Did you check that non-cacheable is definitely the correct kind of Normal memory attribute we want? (ie not write-through).

I was concerned that write-through wouldn't be sufficient. If the guest writes to its non-cached memory, and QEMU needs to see what it wrote, then won't write-through fail to work? Unless we somehow invalidate the cache first?

drew
___
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
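The write-through concern can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the `HostMapping` class, the `demo` helper and the address `0x1000` are hypothetical names invented here, and real hardware behaviour (speculative fetches, mismatched-attribute rules, cache maintenance by VA) is not modelled. It shows that a write-through mapping still serves reads from a cached copy, so a guest write that bypasses the cache can go unseen until the line is invalidated.

```python
class HostMapping:
    """Toy model of QEMU's (userspace) view of a guest memory region.

    policy is "write-through" (reads may be served from a cached copy)
    or "non-cacheable" (every access goes straight to RAM).
    """

    def __init__(self, ram, policy):
        self.ram = ram
        self.policy = policy
        self.lines = {}  # address -> cached copy (always empty if non-cacheable)

    def read(self, addr):
        if self.policy == "non-cacheable":
            return self.ram[addr]
        if addr not in self.lines:        # miss: allocate the line from RAM
            self.lines[addr] = self.ram[addr]
        return self.lines[addr]           # hit: RAM is not consulted

    def write(self, addr, value):
        self.ram[addr] = value            # both policies reach RAM immediately
        if addr in self.lines:
            self.lines[addr] = value      # keep an existing line in sync

    def invalidate(self, addr):
        self.lines.pop(addr, None)


def guest_uncached_write(ram, addr, value):
    """The guest maps the region non-cacheable, so its write lands in RAM."""
    ram[addr] = value


def demo(policy, invalidate_first=False):
    ram = {0x1000: "old"}
    host = HostMapping(ram, policy)
    host.read(0x1000)                     # host touches the location once
    guest_uncached_write(ram, 0x1000, "new")
    if invalidate_first:
        host.invalidate(0x1000)
    return host.read(0x1000)
```

In this model `demo("write-through")` returns the stale value, while `demo("write-through", invalidate_first=True)` and `demo("non-cacheable")` both return what the guest wrote, which matches the reasoning in the message above.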
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On Thu, May 14, 2015 at 01:38:38PM +0200, Paolo Bonzini wrote:
> On 14/05/2015 13:36, Christoffer Dall wrote:
>>>>> (It's probably worth looking at the documentation in the first hunk too, under the commit message.)
>>>>
>>>> Why is this a hack/unintuitive? Is the semantics of the QEMU PCI bus not simply that MMIO regions are coherent?
>>>
>>> Only until device assignment gets into the picture.
>>
>> Will UEFI have to deal with device assignment in any respect?
>
> Why not? For example you could do network boot from an assigned network card.
>
> In fact, anything that UEFI has to deal with, the OS has to deal with too. If you need a UEFI hack, chances are you need or will need a Linux hack too.

Fair enough. I was thinking that UEFI needs to be built with knowledge of all the hardware present including any passthrough devices, but I guess this is plainly not true with PCI (and might not even be true with the level of DT parsing we do for the virtual platform).

So, getting back to my original question. Is the point then that UEFI must assume (from ACPI/DT) the cache-coherency properties of the PCI controller which exists in hardware on the system you're running on, even for the virtual PCI bus because that will be the semantics for assigned devices?

And in that case, we have no way to distinguish between passthrough devices and virtual devices plugged into the virtual PCI bus?

What about the idea of having two virtual PCI buses on your system where one is always cache-coherent and used for virtual devices, and the other is whatever the hardware is and used for passthrough devices?

-Christoffer
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On 14 May 2015 at 13:28, Paolo Bonzini pbonz...@redhat.com wrote:
> Well, PCI BARs are generally MMIO resources, and hence should not be cached. As an optimization, OS drivers can mark them as cacheable or write-combining or something like that, but in general it's a safe default to leave them uncached---one would think.

Isn't this handled by the OS mapping them in the 'prefetchable' MMIO window rather than the 'non-prefetchable' one? (QEMU's generic-PCIe device doesn't yet support the prefetchable window.)

-- PMM
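For readers unfamiliar with the prefetchable distinction: a memory BAR advertises it in bit 3 of its low dword, and that bit decides which bridge window the BAR may be placed in. The helper below is a minimal sketch of decoding those low bits; the function name and the sample register values are made up for illustration, and window assignment itself is not shown.

```python
def decode_bar(bar_low):
    """Decode the low dword of a PCI Base Address Register.

    Bit 0 selects I/O space (1) or memory space (0). For memory BARs,
    bits 2:1 give the type (0b00 = 32-bit, 0b10 = 64-bit) and bit 3 is
    the prefetchable flag.
    """
    if bar_low & 0x1:
        return {"space": "io"}
    bar_type = (bar_low >> 1) & 0x3
    return {
        "space": "memory",
        "is_64bit": bar_type == 0x2,
        "prefetchable": bool(bar_low & 0x8),
        "base_low": bar_low & 0xFFFFFFF0,   # address bits, flags masked off
    }
```

For example, a hypothetical register value of `0xFE00000C` decodes as a 64-bit prefetchable memory BAR, while `0xE0000000` decodes as a 32-bit non-prefetchable one.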
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On Thu, May 14, 2015 at 02:08:49PM +0200, Paolo Bonzini wrote:
> On 14/05/2015 14:00, Christoffer Dall wrote:
>> So, getting back to my original question. Is the point then that UEFI must assume (from ACPI/DT) the cache-coherency properties of the PCI controller which exists in hardware on the system you're running on, even for the virtual PCI bus because that will be the semantics for assigned devices?
>>
>> And in that case, we have no way to distinguish between passthrough devices and virtual devices plugged into the virtual PCI bus?
>
> Well, we could use the subsystem id. But it's a hack, and may cause incompatibilities with some drivers. Michael, any ideas?
>
>> What about the idea of having two virtual PCI buses on your system where one is always cache-coherent and used for virtual devices, and the other is whatever the hardware is and used for passthrough devices?
>
> I think that was rejected before.

Do you remember where? I just remember Catalin mentioning the idea to me verbally.

Besides the slightly heavy added use of resources etc. it seems like it would address some of our issues in a good way.

But I'm still not sure why UEFI/Linux currently sees our PCI bus as being non-coherent when in fact it is and we have no passthrough issues currently. Are all PCI controllers always non-coherent for some reason and therefore we model it as such too?

-Christoffer
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On Thu, May 14, 2015 at 02:28:49PM +0200, Paolo Bonzini wrote:
> On 14/05/2015 14:24, Christoffer Dall wrote:
>> On Thu, May 14, 2015 at 02:08:49PM +0200, Paolo Bonzini wrote:
>>> On 14/05/2015 14:00, Christoffer Dall wrote:
>>>> So, getting back to my original question. Is the point then that UEFI must assume (from ACPI/DT) the cache-coherency properties of the PCI controller which exists in hardware on the system you're running on, even for the virtual PCI bus because that will be the semantics for assigned devices?
>>>>
>>>> And in that case, we have no way to distinguish between passthrough devices and virtual devices plugged into the virtual PCI bus?
>>>
>>> Well, we could use the subsystem id. But it's a hack, and may cause incompatibilities with some drivers. Michael, any ideas?
>>>
>>>> What about the idea of having two virtual PCI buses on your system where one is always cache-coherent and used for virtual devices, and the other is whatever the hardware is and used for passthrough devices?
>>>
>>> I think that was rejected before.
>>
>> Do you remember where? I just remember Catalin mentioning the idea to me verbally.
>
> In the last centithread on the subject. :)
>
> At least I and Peter disagreed. It's not about the heavy added use of resources, it's more about it being really easy to misconfigure.
>
>> But I'm still not sure why UEFI/Linux currently sees our PCI bus as being non-coherent when in fact it is and we have no passthrough issues currently. Are all PCI controllers always non-coherent for some reason and therefore we model it as such too?
>
> Well, PCI BARs are generally MMIO resources, and hence should not be cached. As an optimization, OS drivers can mark them as cacheable or write-combining or something like that, but in general it's a safe default to leave them uncached---one would think.

ok, I guess this series makes sense then, assuming it works, and assuming we don't kill performance by going to RAM all the time when we don't have to...

Thanks,
-Christoffer
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On 14 May 2015 at 14:03, Andrew Jones drjo...@redhat.com wrote:
> On Thu, May 14, 2015 at 11:37:46AM +0100, Peter Maydell wrote:
>> On 14 May 2015 at 11:31, Andrew Jones drjo...@redhat.com wrote:
>>> Forgot to (4): switch from setting userspace's mapping to device memory to normal, non-cacheable. Using device memory caused a problem that Alex Graf found, and Peter Maydell suggested using normal, non-cacheable instead.
>>
>> Did you check that non-cacheable is definitely the correct kind of Normal memory attribute we want? (ie not write-through).
>
> I was concerned that write-through wouldn't be sufficient. If the guest writes to its non-cached memory, and QEMU needs to see what it wrote, then won't write-through fail to work? Unless we somehow invalidate the cache first?

Well, I meant more that the correct mapping for userspace is the same as the guest, whatever that is, and so somebody needs to look at what the guest actually does rather than merely hoping NormalNC is OK. (For instance, do we need to provide support for QEMU to map both NC and writethrough?)

-- PMM
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On 05/14/15 14:34, Christoffer Dall wrote:
> On Thu, May 14, 2015 at 02:28:49PM +0200, Paolo Bonzini wrote:
>> On 14/05/2015 14:24, Christoffer Dall wrote:
>>> On Thu, May 14, 2015 at 02:08:49PM +0200, Paolo Bonzini wrote:
>>>> On 14/05/2015 14:00, Christoffer Dall wrote:
>>>>> So, getting back to my original question. Is the point then that UEFI must assume (from ACPI/DT) the cache-coherency properties of the PCI controller which exists in hardware on the system you're running on, even for the virtual PCI bus because that will be the semantics for assigned devices?
>>>>>
>>>>> And in that case, we have no way to distinguish between passthrough devices and virtual devices plugged into the virtual PCI bus?
>>>>
>>>> Well, we could use the subsystem id. But it's a hack, and may cause incompatibilities with some drivers. Michael, any ideas?
>>>>
>>>>> What about the idea of having two virtual PCI buses on your system where one is always cache-coherent and used for virtual devices, and the other is whatever the hardware is and used for passthrough devices?
>>>>
>>>> I think that was rejected before.
>>>
>>> Do you remember where? I just remember Catalin mentioning the idea to me verbally.
>>
>> In the last centithread on the subject. :)
>>
>> At least I and Peter disagreed. It's not about the heavy added use of resources, it's more about it being really easy to misconfigure.
>>
>>> But I'm still not sure why UEFI/Linux currently sees our PCI bus as being non-coherent when in fact it is and we have no passthrough issues currently. Are all PCI controllers always non-coherent for some reason and therefore we model it as such too?
>>
>> Well, PCI BARs are generally MMIO resources, and hence should not be cached. As an optimization, OS drivers can mark them as cacheable or write-combining or something like that, but in general it's a safe default to leave them uncached---one would think.
>
> ok, I guess this series makes sense then, assuming it works, and assuming we don't kill performance by going to RAM all the time when we don't have to...

The idea Paolo and I had discussed in the past is:

- Remove the kludge from UEFI, and map all MMIO BARs as uncached by default. This should be a theoretically correct approach, and for assigned devices, correct in practice too.

- At an appropriate place in the firmware (specifically, right before this line: [1]), when PCI devices have been enumerated, but their particular drivers (especially VGA) have not been connected yet, check the subsystem id / vendor id / etc for each, and if we can tell it's virtual, then set the attributes for all of its MMIO bars to writeback.

It doesn't seem hard to implement, I've just been shying away from actually coding it up because I'd like to see it make a difference for an actual assigned device. That is, reproducing the current (statically kludged) behavior wouldn't be hard, but I prefer not to write a new patch until I can test it both ways. UC is broken and WB works for virtual devices, fine; now let's see if the exact reverse holds for assigned devices in reality.

... Testing of which will require someone to send a PCI card (NIC or GPU) -- with an AARCH64 UEFI driver oprom on it -- to my place, so that I can plug into my Mustang. ;)

Thanks
Laszlo

[1] https://github.com/tianocore/edk2/blob/99d9ade85aad554a0fa08fff8586b0fd40570ac3/ArmPlatformPkg/ArmVirtualizationPkg/Library/PlatformIntelBdsLib/IntelBdsPlatform.c#L366
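The two-step policy described above can be sketched as follows. This is not edk2 code: the device dictionaries, the function names and the set of vendor IDs treated as "virtual" are assumptions made for illustration (0x1AF4 and 0x1B36 are Red Hat IDs used by virtio and other QEMU devices, 0x1234 is used by QEMU's standard VGA), and a real implementation would have to decide which identifiers are trustworthy.

```python
# ASSUMPTION: illustrative list of vendor IDs that mark an emulated device.
ASSUMED_VIRTUAL_VENDOR_IDS = {0x1AF4, 0x1B36, 0x1234}

UNCACHED = "uncached"    # default for every MMIO BAR
WRITEBACK = "writeback"  # override for devices recognised as virtual


def looks_virtual(device):
    """Guess from vendor / subsystem vendor ID whether a device is emulated."""
    ids = (device["vendor_id"], device.get("subsystem_vendor_id"))
    return any(i in ASSUMED_VIRTUAL_VENDOR_IDS for i in ids)


def choose_bar_attributes(devices):
    """Return {(device name, BAR index): attribute} for all MMIO BARs.

    Runs after enumeration and before drivers are connected: start from
    uncached everywhere, then relax to writeback for virtual devices.
    """
    plan = {}
    for device in devices:
        attribute = WRITEBACK if looks_virtual(device) else UNCACHED
        for bar_index in device["mmio_bars"]:
            plan[(device["name"], bar_index)] = attribute
    return plan
```

With a hypothetical emulated VGA device (vendor 0x1234) and an assigned NIC (vendor 0x8086), the plan maps the VGA BARs as writeback and leaves the NIC BARs uncached. The misconfiguration risk raised earlier in the thread applies here too: any device missing from the ID list silently falls back to uncached.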
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On 14/05/2015 14:00, Christoffer Dall wrote:
> So, getting back to my original question. Is the point then that UEFI must assume (from ACPI/DT) the cache-coherency properties of the PCI controller which exists in hardware on the system you're running on, even for the virtual PCI bus because that will be the semantics for assigned devices?
>
> And in that case, we have no way to distinguish between passthrough devices and virtual devices plugged into the virtual PCI bus?

Well, we could use the subsystem id. But it's a hack, and may cause incompatibilities with some drivers. Michael, any ideas?

> What about the idea of having two virtual PCI buses on your system where one is always cache-coherent and used for virtual devices, and the other is whatever the hardware is and used for passthrough devices?

I think that was rejected before.

Paolo
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On 05/14/15 12:30, Christoffer Dall wrote:
> On Wed, May 13, 2015 at 01:31:51PM +0200, Andrew Jones wrote:
>> Introduce a new memory region flag, KVM_MEM_UNCACHED, which is needed by ARM. This flag informs KVM that the given memory region is typically mapped by the guest as non-cacheable. KVM for ARM then ensures that that memory is indeed mapped non-cacheable by the guest, and also remaps that region as non-cacheable for userspace, allowing them both to maintain a coherent view.
>>
>> Changes since v1:
>> 1) don't pin pages [Paolo]
>> 2) ensure the guest maps the memory non-cacheable [me]
>> 3) clean up memslot flag documentation [Christoffer]
>>
>> changes 1 and 2 effectively redesigned/rewrote v1. Find v1 here http://www.spinics.net/lists/kvm-arm/msg14022.html
>>
>> The QEMU series for v1 hasn't really changed. Only the linux header hack needed to bump KVM_CAP_UNCACHED_MEM from 107 to 116. Find the series here http://www.spinics.net/lists/kvm-arm/msg14026.html
>>
>> Testing:
>> This series still needs lots of testing, but I thought I'd kick it to the list early, as there's been recent interest in solving this problem, and I'd like to get test results and opinions on this approach from others sooner than later. I've tested with AAVMF (UEFI for AArch64 mach-virt guests). AAVMF has a kludge in it to avoid the coherency problem.
>
> How does the 'kludge' work?

https://github.com/tianocore/edk2/commit/f9a8be42

(It's probably worth looking at the documentation in the first hunk too, under the commit message.)

Thanks
Laszlo

>> I've tested both with and without that kludge active. Both worked for me (almost). Sometimes with the non-kludged version I was still able to see a bit of corruption in grub's output after edk2 loaded it - not much, and not always, but something.
>
> Remind me, this is a VGA framebuffer corruption with a PCI-plugged VGA card?
>
> Thanks,
> -Christoffer
>
>> Anyway, it's quite frustrating, as I'm not sure what I'm missing...
>>
>> This series applies to Linus' 110bc76729d4, but I tested with a version backported to the current RHELSA kernel.
>>
>> Thanks for reviews and testing!
>>
>> drew
>>
>> Andrew Jones (3):
>>   arm/arm64: pageattr: add set_memory_nc
>>   KVM: promote KVM_MEMSLOT_INCOHERENT to uapi
>>   arm/arm64: KVM: implement 'uncached' mem coherency
>>
>>  Documentation/virtual/kvm/api.txt     | 20 --
>>  arch/arm/include/asm/cacheflush.h     |  1 +
>>  arch/arm/include/asm/kvm_mmu.h        |  5 -
>>  arch/arm/include/asm/pgtable-3level.h |  1 +
>>  arch/arm/include/asm/pgtable.h        |  1 +
>>  arch/arm/include/uapi/asm/kvm.h       |  1 +
>>  arch/arm/kvm/arm.c                    |  1 +
>>  arch/arm/kvm/mmu.c                    | 39 ++-
>>  arch/arm/mm/pageattr.c                |  7 +++
>>  arch/arm64/include/asm/cacheflush.h   |  1 +
>>  arch/arm64/include/asm/kvm_mmu.h      |  5 -
>>  arch/arm64/include/asm/memory.h       |  1 +
>>  arch/arm64/include/asm/pgtable.h      |  1 +
>>  arch/arm64/include/uapi/asm/kvm.h     |  1 +
>>  arch/arm64/mm/pageattr.c              |  8 +++
>>  include/linux/kvm_host.h              |  1 -
>>  include/uapi/linux/kvm.h              |  2 ++
>>  virt/kvm/kvm_main.c                   |  7 ++-
>>  18 files changed, 79 insertions(+), 24 deletions(-)
>>
>> --
>> 2.1.0
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On Thu, May 14, 2015 at 01:09:34PM +0200, Laszlo Ersek wrote:
> On 05/14/15 12:30, Christoffer Dall wrote:
>> On Wed, May 13, 2015 at 01:31:51PM +0200, Andrew Jones wrote:
>>> Introduce a new memory region flag, KVM_MEM_UNCACHED, which is needed by ARM. This flag informs KVM that the given memory region is typically mapped by the guest as non-cacheable. KVM for ARM then ensures that that memory is indeed mapped non-cacheable by the guest, and also remaps that region as non-cacheable for userspace, allowing them both to maintain a coherent view.
>>>
>>> Changes since v1:
>>> 1) don't pin pages [Paolo]
>>> 2) ensure the guest maps the memory non-cacheable [me]
>>> 3) clean up memslot flag documentation [Christoffer]
>>>
>>> changes 1 and 2 effectively redesigned/rewrote v1. Find v1 here http://www.spinics.net/lists/kvm-arm/msg14022.html
>>>
>>> The QEMU series for v1 hasn't really changed. Only the linux header hack needed to bump KVM_CAP_UNCACHED_MEM from 107 to 116. Find the series here http://www.spinics.net/lists/kvm-arm/msg14026.html
>>>
>>> Testing:
>>> This series still needs lots of testing, but I thought I'd kick it to the list early, as there's been recent interest in solving this problem, and I'd like to get test results and opinions on this approach from others sooner than later. I've tested with AAVMF (UEFI for AArch64 mach-virt guests). AAVMF has a kludge in it to avoid the coherency problem.
>>
>> How does the 'kludge' work?
>
> https://github.com/tianocore/edk2/commit/f9a8be42
>
> (It's probably worth looking at the documentation in the first hunk too, under the commit message.)

Why is this a hack/unintuitive? Is the semantics of the QEMU PCI bus not simply that MMIO regions are coherent?

-Christoffer
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On 14/05/2015 13:29, Christoffer Dall wrote:
>> (It's probably worth looking at the documentation in the first hunk too, under the commit message.)
>
> Why is this a hack/unintuitive? Is the semantics of the QEMU PCI bus not simply that MMIO regions are coherent?

Only until device assignment gets into the picture.

Paolo
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On Thu, May 14, 2015 at 01:31:03PM +0200, Paolo Bonzini wrote:
> On 14/05/2015 13:29, Christoffer Dall wrote:
>>> (It's probably worth looking at the documentation in the first hunk too, under the commit message.)
>>
>> Why is this a hack/unintuitive? Is the semantics of the QEMU PCI bus not simply that MMIO regions are coherent?
>
> Only until device assignment gets into the picture.

Will UEFI have to deal with device assignment in any respect?

-Christoffer
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On 14/05/2015 13:36, Christoffer Dall wrote:
>>>> (It's probably worth looking at the documentation in the first hunk too, under the commit message.)
>>>
>>> Why is this a hack/unintuitive? Is the semantics of the QEMU PCI bus not simply that MMIO regions are coherent?
>>
>> Only until device assignment gets into the picture.
>
> Will UEFI have to deal with device assignment in any respect?

Why not? For example you could do network boot from an assigned network card.

In fact, anything that UEFI has to deal with, the OS has to deal with too. If you need a UEFI hack, chances are you need or will need a Linux hack too.

Paolo
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On Wed, May 13, 2015 at 01:31:51PM +0200, Andrew Jones wrote:
> Introduce a new memory region flag, KVM_MEM_UNCACHED, which is needed by ARM. This flag informs KVM that the given memory region is typically mapped by the guest as non-cacheable. KVM for ARM then ensures that that memory is indeed mapped non-cacheable by the guest, and also remaps that region as non-cacheable for userspace, allowing them both to maintain a coherent view.
>
> Changes since v1:
> 1) don't pin pages [Paolo]
> 2) ensure the guest maps the memory non-cacheable [me]
> 3) clean up memslot flag documentation [Christoffer]

Forgot to (4): switch from setting userspace's mapping to device memory to normal, non-cacheable. Using device memory caused a problem that Alex Graf found, and Peter Maydell suggested using normal, non-cacheable instead.

> changes 1 and 2 effectively redesigned/rewrote v1. Find v1 here http://www.spinics.net/lists/kvm-arm/msg14022.html
>
> The QEMU series for v1 hasn't really changed. Only the linux header hack needed to bump KVM_CAP_UNCACHED_MEM from 107 to 116. Find the series here http://www.spinics.net/lists/kvm-arm/msg14026.html
>
> Testing:
> This series still needs lots of testing, but I thought I'd kick it to the list early, as there's been recent interest in solving this problem, and I'd like to get test results and opinions on this approach from others sooner than later. I've tested with AAVMF (UEFI for AArch64 mach-virt guests). AAVMF has a kludge in it to avoid the coherency problem. I've tested both with and without that kludge active. Both worked for me (almost). Sometimes with the non-kludged version I was still able to see a bit of corruption in grub's output after edk2 loaded it - not much, and not always, but something. Anyway, it's quite frustrating, as I'm not sure what I'm missing...
>
> This series applies to Linus' 110bc76729d4, but I tested with a version backported to the current RHELSA kernel.
>
> Thanks for reviews and testing!
>
> drew
>
> Andrew Jones (3):
>   arm/arm64: pageattr: add set_memory_nc
>   KVM: promote KVM_MEMSLOT_INCOHERENT to uapi
>   arm/arm64: KVM: implement 'uncached' mem coherency
>
>  Documentation/virtual/kvm/api.txt     | 20 --
>  arch/arm/include/asm/cacheflush.h     |  1 +
>  arch/arm/include/asm/kvm_mmu.h        |  5 -
>  arch/arm/include/asm/pgtable-3level.h |  1 +
>  arch/arm/include/asm/pgtable.h        |  1 +
>  arch/arm/include/uapi/asm/kvm.h       |  1 +
>  arch/arm/kvm/arm.c                    |  1 +
>  arch/arm/kvm/mmu.c                    | 39 ++-
>  arch/arm/mm/pageattr.c                |  7 +++
>  arch/arm64/include/asm/cacheflush.h   |  1 +
>  arch/arm64/include/asm/kvm_mmu.h      |  5 -
>  arch/arm64/include/asm/memory.h       |  1 +
>  arch/arm64/include/asm/pgtable.h      |  1 +
>  arch/arm64/include/uapi/asm/kvm.h     |  1 +
>  arch/arm64/mm/pageattr.c              |  8 +++
>  include/linux/kvm_host.h              |  1 -
>  include/uapi/linux/kvm.h              |  2 ++
>  virt/kvm/kvm_main.c                   |  7 ++-
>  18 files changed, 79 insertions(+), 24 deletions(-)
>
> --
> 2.1.0
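To make the proposed interface concrete, here is a rough sketch of how userspace might describe a memory slot carrying the new flag. It only builds the 32-byte `struct kvm_userspace_memory_region` payload (u32 slot, u32 flags, then three u64 fields) and issues no ioctl. The two existing flag values come from the KVM uapi header; the bit chosen for `KVM_MEM_UNCACHED` is an assumption, since the thread does not state it, and `pack_region` plus all addresses below are hypothetical.

```python
import struct

KVM_MEM_LOG_DIRTY_PAGES = 1 << 0  # existing uapi flag
KVM_MEM_READONLY = 1 << 1         # existing uapi flag
KVM_MEM_UNCACHED = 1 << 2         # ASSUMPTION: bit position not given in the thread

# Layout of struct kvm_userspace_memory_region:
#   u32 slot, u32 flags, u64 guest_phys_addr, u64 memory_size, u64 userspace_addr
_REGION = struct.Struct("=IIQQQ")


def pack_region(slot, flags, guest_phys_addr, memory_size, userspace_addr):
    """Serialise one memslot description as KVM_SET_USER_MEMORY_REGION expects it."""
    return _REGION.pack(slot, flags, guest_phys_addr, memory_size, userspace_addr)


def describe_framebuffer_slot(slot, guest_phys_addr, size, host_addr):
    """Hypothetical helper: a region the guest is expected to map non-cacheable."""
    return pack_region(slot, KVM_MEM_UNCACHED, guest_phys_addr, size, host_addr)
```

A VMM would pass such a buffer to the `KVM_SET_USER_MEMORY_REGION` ioctl on the VM file descriptor, presumably after checking the `KVM_CAP_UNCACHED_MEM` capability mentioned in the cover letter; both steps are omitted here because they need a kernel carrying this series.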
Re: [RFC/RFT PATCH v2 0/3] KVM: Introduce KVM_MEM_UNCACHED
On 14 May 2015 at 11:31, Andrew Jones drjo...@redhat.com wrote:
> Forgot to (4): switch from setting userspace's mapping to device memory to normal, non-cacheable. Using device memory caused a problem that Alex Graf found, and Peter Maydell suggested using normal, non-cacheable instead.

Did you check that non-cacheable is definitely the correct kind of Normal memory attribute we want? (ie not write-through).

-- PMM