Re: [PATCH 12/16] mm/ioremap: Add arch-specific callbacks on ioremap/iounmap calls
On Wed, Jul 28, 2021 at 12:01:53PM +0100, Marc Zyngier wrote:
> On Tue, 27 Jul 2021 19:12:04 +0100,
> Will Deacon wrote:
> >
> > On Thu, Jul 15, 2021 at 05:31:55PM +0100, Marc Zyngier wrote:
> > > Add a pair of hooks (ioremap_page_range_hook/iounmap_page_range_hook)
> > > that can be implemented by an architecture.
> > >
> > > Signed-off-by: Marc Zyngier
> > > ---
> > >  include/linux/io.h |  3 +++
> > >  mm/ioremap.c       | 13 ++++++++++++-
> > >  mm/vmalloc.c       |  8 ++++++++
> > >  3 files changed, 23 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/io.h b/include/linux/io.h
> > > index 9595151d800d..0ffc265f114c 100644
> > > --- a/include/linux/io.h
> > > +++ b/include/linux/io.h
> > > @@ -21,6 +21,9 @@ void __ioread32_copy(void *to, const void __iomem *from, size_t count);
> > >  void __iowrite64_copy(void __iomem *to, const void *from, size_t count);
> > >
> > >  #ifdef CONFIG_MMU
> > > +void ioremap_page_range_hook(unsigned long addr, unsigned long end,
> > > +			     phys_addr_t phys_addr, pgprot_t prot);
> > > +void iounmap_page_range_hook(phys_addr_t phys_addr, size_t size);
> > >  int ioremap_page_range(unsigned long addr, unsigned long end,
> > >  		       phys_addr_t phys_addr, pgprot_t prot);
> > >  #else
> >
> > Can we avoid these hooks by instead not registering the regions proactively
> > in the guest and moving that logic to a fault handler which runs off the
> > back of the injected data abort? From there, we could check if the faulting
> > IPA is a memory address and register it as MMIO if not.
> >
> > Dunno, you've spent more time than me thinking about this, but just
> > wondering if you'd had a crack at doing it that way, as it _seems_ simpler
> > to my naive brain.
>
> I thought about it, but couldn't work out whether it was always
> possible for the guest to handle these faults (first access in an
> interrupt context, for example?).

If the check is a simple pfn_valid() I think it should be ok, but yes,
we'd definitely not want to do anything more involved given that this
could run in all sorts of horrible contexts.

> Also, this changes the semantics of the protection this is supposed to
> offer: any access out of the RAM space will generate an abort, and the
> fault handler will grant MMIO forwarding for this page. Stray accesses
> that would normally be properly handled as fatal would now succeed and
> be forwarded to userspace, even if there were no emulated devices
> there.

That's true, it would offer much weaker guarantees to the guest. It's
more like a guarantee that memory never traps to the VMM. It also then
wouldn't help with the write-combine fun. It would be simpler, though,
but with less functionality.

> For this to work, we'd need to work out whether there is any existing
> device mapping that actually points to this page. And whether it
> actually is supposed to be forwarded to userspace. Do we have a rmap
> for device mappings?

I don't think this would be possible given your comments above, so
let's stick with the approach you've taken. It just feels like there
should be a way to do this without introducing new hooks into the core
code. If it wasn't for pci_remap_iospace(), we could simply hook our
definition of __ioremap_caller(). Another avenue to explore would be
looking at the IO resource instead; I see x86 already uses
IORES_MAP_ENCRYPTED and IORES_MAP_SYSTEM_RAM to drive pgprot...

Will
_______________________________________________
kvmarm mailing list
kvmarm@lists.cs.columbia.edu
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
Re: [PATCH 12/16] mm/ioremap: Add arch-specific callbacks on ioremap/iounmap calls
On Tue, 27 Jul 2021 19:12:04 +0100,
Will Deacon wrote:
>
> On Thu, Jul 15, 2021 at 05:31:55PM +0100, Marc Zyngier wrote:
> > Add a pair of hooks (ioremap_page_range_hook/iounmap_page_range_hook)
> > that can be implemented by an architecture.
> >
> > Signed-off-by: Marc Zyngier
> > ---
> >  include/linux/io.h |  3 +++
> >  mm/ioremap.c       | 13 ++++++++++++-
> >  mm/vmalloc.c       |  8 ++++++++
> >  3 files changed, 23 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/io.h b/include/linux/io.h
> > index 9595151d800d..0ffc265f114c 100644
> > --- a/include/linux/io.h
> > +++ b/include/linux/io.h
> > @@ -21,6 +21,9 @@ void __ioread32_copy(void *to, const void __iomem *from, size_t count);
> >  void __iowrite64_copy(void __iomem *to, const void *from, size_t count);
> >
> >  #ifdef CONFIG_MMU
> > +void ioremap_page_range_hook(unsigned long addr, unsigned long end,
> > +			     phys_addr_t phys_addr, pgprot_t prot);
> > +void iounmap_page_range_hook(phys_addr_t phys_addr, size_t size);
> >  int ioremap_page_range(unsigned long addr, unsigned long end,
> >  		       phys_addr_t phys_addr, pgprot_t prot);
> >  #else
>
> Can we avoid these hooks by instead not registering the regions proactively
> in the guest and moving that logic to a fault handler which runs off the
> back of the injected data abort? From there, we could check if the faulting
> IPA is a memory address and register it as MMIO if not.
>
> Dunno, you've spent more time than me thinking about this, but just
> wondering if you'd had a crack at doing it that way, as it _seems_ simpler
> to my naive brain.

I thought about it, but couldn't work out whether it was always
possible for the guest to handle these faults (first access in an
interrupt context, for example?).

Also, this changes the semantics of the protection this is supposed to
offer: any access out of the RAM space will generate an abort, and the
fault handler will grant MMIO forwarding for this page. Stray accesses
that would normally be properly handled as fatal would now succeed and
be forwarded to userspace, even if there were no emulated devices
there.

For this to work, we'd need to work out whether there is any existing
device mapping that actually points to this page. And whether it
actually is supposed to be forwarded to userspace. Do we have a rmap
for device mappings?

	M.

-- 
Without deviation from the norm, progress is not possible.
Re: [PATCH 12/16] mm/ioremap: Add arch-specific callbacks on ioremap/iounmap calls
On Thu, Jul 15, 2021 at 05:31:55PM +0100, Marc Zyngier wrote:
> Add a pair of hooks (ioremap_page_range_hook/iounmap_page_range_hook)
> that can be implemented by an architecture.
>
> Signed-off-by: Marc Zyngier
> ---
>  include/linux/io.h |  3 +++
>  mm/ioremap.c       | 13 ++++++++++++-
>  mm/vmalloc.c       |  8 ++++++++
>  3 files changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/io.h b/include/linux/io.h
> index 9595151d800d..0ffc265f114c 100644
> --- a/include/linux/io.h
> +++ b/include/linux/io.h
> @@ -21,6 +21,9 @@ void __ioread32_copy(void *to, const void __iomem *from, size_t count);
>  void __iowrite64_copy(void __iomem *to, const void *from, size_t count);
>
>  #ifdef CONFIG_MMU
> +void ioremap_page_range_hook(unsigned long addr, unsigned long end,
> +			     phys_addr_t phys_addr, pgprot_t prot);
> +void iounmap_page_range_hook(phys_addr_t phys_addr, size_t size);
>  int ioremap_page_range(unsigned long addr, unsigned long end,
>  		       phys_addr_t phys_addr, pgprot_t prot);
>  #else

Can we avoid these hooks by instead not registering the regions proactively
in the guest and moving that logic to a fault handler which runs off the
back of the injected data abort? From there, we could check if the faulting
IPA is a memory address and register it as MMIO if not.

Dunno, you've spent more time than me thinking about this, but just
wondering if you'd had a crack at doing it that way, as it _seems_ simpler
to my naive brain.

Will
[PATCH 12/16] mm/ioremap: Add arch-specific callbacks on ioremap/iounmap calls
Add a pair of hooks (ioremap_page_range_hook/iounmap_page_range_hook)
that can be implemented by an architecture.

Signed-off-by: Marc Zyngier
---
 include/linux/io.h |  3 +++
 mm/ioremap.c       | 13 ++++++++++++-
 mm/vmalloc.c       |  8 ++++++++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/linux/io.h b/include/linux/io.h
index 9595151d800d..0ffc265f114c 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -21,6 +21,9 @@ void __ioread32_copy(void *to, const void __iomem *from, size_t count);
 void __iowrite64_copy(void __iomem *to, const void *from, size_t count);
 
 #ifdef CONFIG_MMU
+void ioremap_page_range_hook(unsigned long addr, unsigned long end,
+			     phys_addr_t phys_addr, pgprot_t prot);
+void iounmap_page_range_hook(phys_addr_t phys_addr, size_t size);
 int ioremap_page_range(unsigned long addr, unsigned long end,
 		       phys_addr_t phys_addr, pgprot_t prot);
 #else
diff --git a/mm/ioremap.c b/mm/ioremap.c
index 8ee0136f8cb0..bd77a86088f2 100644
--- a/mm/ioremap.c
+++ b/mm/ioremap.c
@@ -28,10 +28,21 @@ early_param("nohugeiomap", set_nohugeiomap);
 static const unsigned int iomap_max_page_shift = PAGE_SHIFT;
 #endif	/* CONFIG_HAVE_ARCH_HUGE_VMAP */
 
+void __weak ioremap_page_range_hook(unsigned long addr, unsigned long end,
+				    phys_addr_t phys_addr, pgprot_t prot)
+{
+}
+
 int ioremap_page_range(unsigned long addr, unsigned long end,
 		       phys_addr_t phys_addr, pgprot_t prot)
 {
-	return vmap_range(addr, end, phys_addr, prot, iomap_max_page_shift);
+	int ret;
+
+	ret = vmap_range(addr, end, phys_addr, prot, iomap_max_page_shift);
+	if (!ret)
+		ioremap_page_range_hook(addr, end, phys_addr, prot);
+
+	return ret;
 }
 
 #ifdef CONFIG_GENERIC_IOREMAP
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d5cd52805149..af18a6141093 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -38,6 +38,7 @@
 #include <linux/pgtable.h>
 #include <linux/uaccess.h>
 #include <linux/hugetlb.h>
+#include <linux/io.h>
 #include <asm/tlbflush.h>
 #include <asm/shmparam.h>
@@ -2551,6 +2552,10 @@ static void vm_remove_mappings(struct vm_struct *area, int deallocate_pages)
 	set_area_direct_map(area, set_direct_map_default_noflush);
 }
 
+void __weak iounmap_page_range_hook(phys_addr_t phys_addr, size_t size)
+{
+}
+
 static void __vunmap(const void *addr, int deallocate_pages)
 {
 	struct vm_struct *area;
@@ -2574,6 +2579,9 @@ static void __vunmap(const void *addr, int deallocate_pages)
 
 	kasan_poison_vmalloc(area->addr, get_vm_area_size(area));
 
+	if (area->flags & VM_IOREMAP)
+		iounmap_page_range_hook(area->phys_addr, get_vm_area_size(area));
+
 	vm_remove_mappings(area, deallocate_pages);
 
 	if (deallocate_pages) {
-- 
2.30.2