Re: [PATCH 12/16] mm/ioremap: Add arch-specific callbacks on ioremap/iounmap calls

2021-07-30 Thread Will Deacon
On Wed, Jul 28, 2021 at 12:01:53PM +0100, Marc Zyngier wrote:
> On Tue, 27 Jul 2021 19:12:04 +0100,
> Will Deacon  wrote:
> > 
> > On Thu, Jul 15, 2021 at 05:31:55PM +0100, Marc Zyngier wrote:
> > > Add a pair of hooks (ioremap_page_range_hook/iounmap_page_range_hook)
> > > that can be implemented by an architecture.
> > > 
> > > Signed-off-by: Marc Zyngier 
> > > ---
> > >  include/linux/io.h |  3 +++
> > >  mm/ioremap.c   | 13 ++++++++++++-
> > >  mm/vmalloc.c   |  8 ++++++++
> > >  3 files changed, 23 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/include/linux/io.h b/include/linux/io.h
> > > index 9595151d800d..0ffc265f114c 100644
> > > --- a/include/linux/io.h
> > > +++ b/include/linux/io.h
> > > @@ -21,6 +21,9 @@ void __ioread32_copy(void *to, const void __iomem *from, size_t count);
> > >  void __iowrite64_copy(void __iomem *to, const void *from, size_t count);
> > >  
> > >  #ifdef CONFIG_MMU
> > > +void ioremap_page_range_hook(unsigned long addr, unsigned long end,
> > > +  phys_addr_t phys_addr, pgprot_t prot);
> > > +void iounmap_page_range_hook(phys_addr_t phys_addr, size_t size);
> > >  int ioremap_page_range(unsigned long addr, unsigned long end,
> > >  phys_addr_t phys_addr, pgprot_t prot);
> > >  #else
> > 
> > Can we avoid these hooks by instead not registering the regions proactively
> > in the guest and moving that logic to a fault handler which runs off the
> > back of the injected data abort? From there, we could check if the faulting
> > IPA is a memory address and register it as MMIO if not.
> > 
> > Dunno, you've spent more time than me thinking about this, but just
> > wondering if you'd had a crack at doing it that way, as it _seems_ simpler
> > to my naive brain.
> 
> I thought about it, but couldn't work out whether it was always
> possible for the guest to handle these faults (first access in an
> interrupt context, for example?).

If the check is a simple pfn_valid() I think it should be ok, but yes, we'd
definitely not want to do anything more involved given that this could run
in all sorts of horrible contexts.
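
For illustration, the lazy scheme being discussed might look roughly
like the sketch below in the guest's abort handler. This is not code
from the series: mmio_guard_map() is a hypothetical stand-in for the
hypercall that would grant MMIO forwarding for a single page.

/* Sketch: lazily register MMIO pages from the data abort handler. */
static bool handle_mmio_guard_abort(phys_addr_t ipa)
{
	/*
	 * Keep the check cheap: this may run in interrupt context and
	 * other horrible places, so pfn_valid() and nothing fancier.
	 */
	if (pfn_valid(PHYS_PFN(ipa)))
		return false;	/* RAM: the abort is genuinely fatal */

	/* Not backed by memory: assume MMIO and grant forwarding. */
	mmio_guard_map(ALIGN_DOWN(ipa, PAGE_SIZE));
	return true;		/* retry the faulting access */
}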

> Also, this changes the semantics of the protection this is supposed to
> offer: any access out of the RAM space will generate an abort, and the
> fault handler will grant MMIO forwarding for this page. Stray accesses
> that would normally be properly handled as fatal would now succeed and
> be forwarded to userspace, even if there were no emulated devices
> there.

That's true, it would offer much weaker guarantees to the guest. It's more
like a guarantee that memory never traps to the VMM. It also then wouldn't
help with the write-combine fun. It would be simpler, though, with less
functionality.

> For this to work, we'd need to work out whether there is any existing
> device mapping that actually points to this page. And whether it
actually is supposed to be forwarded to userspace. Do we have an rmap
> for device mappings?

I don't think this would be possible given your comments above.

So let's stick with the approach you've taken. It just feels like there
should be a way to do this without introducing new hooks into the core
code. If it wasn't for pci_remap_iospace(), we could simply hook our
definition of __ioremap_caller(). Another avenue to explore would be
looking at the IO resource instead; I see x86 already uses
IORES_MAP_ENCRYPTED and IORES_MAP_SYSTEM_RAM to drive pgprot...
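
To make that last idea concrete, a rough sketch (again, not code from
this series) of how the IO resource tree could drive the decision.
region_intersects() and IORESOURCE_SYSTEM_RAM are existing kernel APIs;
mmio_guard_register() is a hypothetical registration hypercall wrapper.

/* Sketch: key MMIO registration off the resource tree, not new hooks. */
static void arch_ioremap_resource_check(phys_addr_t phys, size_t size)
{
	/* System RAM must never be granted MMIO forwarding. */
	if (region_intersects(phys, size, IORESOURCE_SYSTEM_RAM,
			      IORES_DESC_NONE) != REGION_DISJOINT)
		return;

	mmio_guard_register(phys, size);
}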

Will


Re: [PATCH 12/16] mm/ioremap: Add arch-specific callbacks on ioremap/iounmap calls

2021-07-28 Thread Marc Zyngier
On Tue, 27 Jul 2021 19:12:04 +0100,
Will Deacon  wrote:
> 
> On Thu, Jul 15, 2021 at 05:31:55PM +0100, Marc Zyngier wrote:
> > Add a pair of hooks (ioremap_page_range_hook/iounmap_page_range_hook)
> > that can be implemented by an architecture.
> > 
> > Signed-off-by: Marc Zyngier 
> > ---
> >  include/linux/io.h |  3 +++
> >  mm/ioremap.c   | 13 ++++++++++++-
> >  mm/vmalloc.c   |  8 ++++++++
> >  3 files changed, 23 insertions(+), 1 deletion(-)
> > 
> > diff --git a/include/linux/io.h b/include/linux/io.h
> > index 9595151d800d..0ffc265f114c 100644
> > --- a/include/linux/io.h
> > +++ b/include/linux/io.h
> > @@ -21,6 +21,9 @@ void __ioread32_copy(void *to, const void __iomem *from, size_t count);
> >  void __iowrite64_copy(void __iomem *to, const void *from, size_t count);
> >  
> >  #ifdef CONFIG_MMU
> > +void ioremap_page_range_hook(unsigned long addr, unsigned long end,
> > +phys_addr_t phys_addr, pgprot_t prot);
> > +void iounmap_page_range_hook(phys_addr_t phys_addr, size_t size);
> >  int ioremap_page_range(unsigned long addr, unsigned long end,
> >phys_addr_t phys_addr, pgprot_t prot);
> >  #else
> 
> Can we avoid these hooks by instead not registering the regions proactively
> in the guest and moving that logic to a fault handler which runs off the
> back of the injected data abort? From there, we could check if the faulting
> IPA is a memory address and register it as MMIO if not.
> 
> Dunno, you've spent more time than me thinking about this, but just
> wondering if you'd had a crack at doing it that way, as it _seems_ simpler
> to my naive brain.

I thought about it, but couldn't work out whether it was always
possible for the guest to handle these faults (first access in an
interrupt context, for example?).

Also, this changes the semantics of the protection this is supposed to
offer: any access out of the RAM space will generate an abort, and the
fault handler will grant MMIO forwarding for this page. Stray accesses
that would normally be properly handled as fatal would now succeed and
be forwarded to userspace, even if there were no emulated devices
there.

For this to work, we'd need to work out whether there is any existing
device mapping that actually points to this page. And whether it
actually is supposed to be forwarded to userspace. Do we have an rmap
for device mappings?

M.

-- 
Without deviation from the norm, progress is not possible.


Re: [PATCH 12/16] mm/ioremap: Add arch-specific callbacks on ioremap/iounmap calls

2021-07-27 Thread Will Deacon
On Thu, Jul 15, 2021 at 05:31:55PM +0100, Marc Zyngier wrote:
> Add a pair of hooks (ioremap_page_range_hook/iounmap_page_range_hook)
> that can be implemented by an architecture.
> 
> Signed-off-by: Marc Zyngier 
> ---
>  include/linux/io.h |  3 +++
>  mm/ioremap.c   | 13 ++++++++++++-
>  mm/vmalloc.c   |  8 ++++++++
>  3 files changed, 23 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/io.h b/include/linux/io.h
> index 9595151d800d..0ffc265f114c 100644
> --- a/include/linux/io.h
> +++ b/include/linux/io.h
> @@ -21,6 +21,9 @@ void __ioread32_copy(void *to, const void __iomem *from, size_t count);
>  void __iowrite64_copy(void __iomem *to, const void *from, size_t count);
>  
>  #ifdef CONFIG_MMU
> +void ioremap_page_range_hook(unsigned long addr, unsigned long end,
> +  phys_addr_t phys_addr, pgprot_t prot);
> +void iounmap_page_range_hook(phys_addr_t phys_addr, size_t size);
>  int ioremap_page_range(unsigned long addr, unsigned long end,
>  phys_addr_t phys_addr, pgprot_t prot);
>  #else

Can we avoid these hooks by instead not registering the regions proactively
in the guest and moving that logic to a fault handler which runs off the
back of the injected data abort? From there, we could check if the faulting
IPA is a memory address and register it as MMIO if not.

Dunno, you've spent more time than me thinking about this, but just
wondering if you'd had a crack at doing it that way, as it _seems_ simpler
to my naive brain.

Will


[PATCH 12/16] mm/ioremap: Add arch-specific callbacks on ioremap/iounmap calls

2021-07-15 Thread Marc Zyngier
Add a pair of hooks (ioremap_page_range_hook/iounmap_page_range_hook)
that can be implemented by an architecture.

Signed-off-by: Marc Zyngier 
---
 include/linux/io.h |  3 +++
 mm/ioremap.c   | 13 ++++++++++++-
 mm/vmalloc.c   |  8 ++++++++
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/linux/io.h b/include/linux/io.h
index 9595151d800d..0ffc265f114c 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -21,6 +21,9 @@ void __ioread32_copy(void *to, const void __iomem *from, size_t count);
 void __iowrite64_copy(void __iomem *to, const void *from, size_t count);
 
 #ifdef CONFIG_MMU
+void ioremap_page_range_hook(unsigned long addr, unsigned long end,
+phys_addr_t phys_addr, pgprot_t prot);
+void iounmap_page_range_hook(phys_addr_t phys_addr, size_t size);
 int ioremap_page_range(unsigned long addr, unsigned long end,
   phys_addr_t phys_addr, pgprot_t prot);
 #else
diff --git a/mm/ioremap.c b/mm/ioremap.c
index 8ee0136f8cb0..bd77a86088f2 100644
--- a/mm/ioremap.c
+++ b/mm/ioremap.c
@@ -28,10 +28,21 @@ early_param("nohugeiomap", set_nohugeiomap);
 static const unsigned int iomap_max_page_shift = PAGE_SHIFT;
 #endif /* CONFIG_HAVE_ARCH_HUGE_VMAP */
 
+void __weak ioremap_page_range_hook(unsigned long addr, unsigned long end,
+   phys_addr_t phys_addr, pgprot_t prot)
+{
+}
+
 int ioremap_page_range(unsigned long addr,
   unsigned long end, phys_addr_t phys_addr, pgprot_t prot)
 {
-   return vmap_range(addr, end, phys_addr, prot, iomap_max_page_shift);
+   int ret;
+
+   ret = vmap_range(addr, end, phys_addr, prot, iomap_max_page_shift);
+   if (!ret)
+   ioremap_page_range_hook(addr, end, phys_addr, prot);
+
+   return ret;
 }
 
 #ifdef CONFIG_GENERIC_IOREMAP
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index d5cd52805149..af18a6141093 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -38,6 +38,7 @@
 #include <linux/pgtable.h>
 #include <linux/uaccess.h>
 #include <linux/hugetlb.h>
+#include <linux/io.h>
 #include <asm/tlbflush.h>
 #include <asm/shmparam.h>
 
@@ -2551,6 +2552,10 @@ static void vm_remove_mappings(struct vm_struct *area, int deallocate_pages)
set_area_direct_map(area, set_direct_map_default_noflush);
 }
 
+void __weak iounmap_page_range_hook(phys_addr_t phys_addr, size_t size)
+{
+}
+
 static void __vunmap(const void *addr, int deallocate_pages)
 {
struct vm_struct *area;
@@ -2574,6 +2579,9 @@ static void __vunmap(const void *addr, int deallocate_pages)
 
kasan_poison_vmalloc(area->addr, get_vm_area_size(area));
 
+   if (area->flags & VM_IOREMAP)
+   iounmap_page_range_hook(area->phys_addr, get_vm_area_size(area));
+
vm_remove_mappings(area, deallocate_pages);
 
if (deallocate_pages) {
-- 
2.30.2
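
For context, a minimal sketch of how an architecture might override
the two weak hooks added above. kvm_mmio_guard_map()/_unmap() stand in
for the per-page hypercalls introduced elsewhere in this series; the
real arm64 implementation lives in a later patch and may differ.

/* Sketch of an arch-side override of the weak hooks (illustrative only). */
void ioremap_page_range_hook(unsigned long addr, unsigned long end,
			     phys_addr_t phys_addr, pgprot_t prot)
{
	/* Tell the hypervisor about each page of the new MMIO mapping. */
	for (; addr < end; addr += PAGE_SIZE, phys_addr += PAGE_SIZE)
		kvm_mmio_guard_map(phys_addr);
}

void iounmap_page_range_hook(phys_addr_t phys_addr, size_t size)
{
	phys_addr_t end = phys_addr + size;

	/* Drop the hypervisor's tracking when the mapping goes away. */
	for (; phys_addr < end; phys_addr += PAGE_SIZE)
		kvm_mmio_guard_unmap(phys_addr);
}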
