Re: [PATCH v4 0/9] mm: Use vm_map_pages() and vm_map_pages_zero() API

2019-02-20 Thread Souptick Joarder
On Fri, Feb 15, 2019 at 8:06 AM Souptick Joarder  wrote:
>
> Previouly drivers have their own way of mapping range of
> kernel pages/memory into user vma and this was done by
> invoking vm_insert_page() within a loop.
>
> As this pattern is common across different drivers, it can
> be generalized by creating new functions and use it across
> the drivers.
>
> vm_map_pages() is the API which could be used to map
> kernel memory/pages in drivers which has considered vm_pgoff.
>
> vm_map_pages_zero() is the API which could be used to map
> range of kernel memory/pages in drivers which has not considered
> vm_pgoff. vm_pgoff is passed default as 0 for those drivers.
>
> We _could_ then at a later "fix" these drivers which are using
> vm_map_pages_zero() to behave according to the normal vm_pgoff
> offsetting simply by removing the _zero suffix on the function
> name and if that causes regressions, it gives us an easy way to revert.
>
> Tested on Rockchip hardware and display is working fine, including talking
> to Lima via prime.
>
> v1 -> v2:
> Few Reviewed-by.
>
> Updated the change log in [8/9]
>
> In [7/9], vm_pgoff is treated in V4L2 API as a 'cookie'
> to select a buffer, not as a in-buffer offset by design
> and it always want to mmap a whole buffer from its beginning.
> Added additional changes after discussing with Marek and
> vm_map_pages() could be used instead of vm_map_pages_zero().
>
> v2 -> v3:
> Corrected the documentation as per review comment.
>
> As suggested in v2, renaming the interfaces to -
> *vm_insert_range() -> vm_map_pages()* and
> *vm_insert_range_buggy() -> vm_map_pages_zero()*.
> As the interface is renamed, modified the code accordingly,
> updated the change logs and modified the subject lines to use the
> new interfaces. There is no other change apart from renaming and
> using the new interface.
>
> Patch[1/9] & [4/9], Tested on Rockchip hardware.
>
> v3 -> v4:
> Fixed build warnings on patch [8/9] reported by kbuild test robot.
>
> Souptick Joarder (9):
>   mm: Introduce new vm_map_pages() and vm_map_pages_zero() API
>   arm: mm: dma-mapping: Convert to use vm_map_pages()
>   drivers/firewire/core-iso.c: Convert to use vm_map_pages_zero()
>   drm/rockchip/rockchip_drm_gem.c: Convert to use vm_map_pages()
>   drm/xen/xen_drm_front_gem.c: Convert to use vm_map_pages()
>   iommu/dma-iommu.c: Convert to use vm_map_pages()
>   videobuf2/videobuf2-dma-sg.c: Convert to use vm_map_pages()
>   xen/gntdev.c: Convert to use vm_map_pages()
>   xen/privcmd-buf.c: Convert to use vm_map_pages_zero()

If no further comment, is it fine to take these patches to -mm
tree for regression ?

>
>  arch/arm/mm/dma-mapping.c  | 22 ++
>  drivers/firewire/core-iso.c| 15 +---
>  drivers/gpu/drm/rockchip/rockchip_drm_gem.c| 17 +
>  drivers/gpu/drm/xen/xen_drm_front_gem.c| 18 ++---
>  drivers/iommu/dma-iommu.c  | 12 +---
>  drivers/media/common/videobuf2/videobuf2-core.c|  7 ++
>  .../media/common/videobuf2/videobuf2-dma-contig.c  |  6 --
>  drivers/media/common/videobuf2/videobuf2-dma-sg.c  | 22 ++
>  drivers/xen/gntdev.c   | 11 ++-
>  drivers/xen/privcmd-buf.c  |  8 +--
>  include/linux/mm.h |  4 ++
>  mm/memory.c| 81 
> ++
>  mm/nommu.c | 14 
>  13 files changed, 134 insertions(+), 103 deletions(-)
>
> --
> 1.9.1
>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 7/7] iommu/ipmmu-vmsa: Add suspend/resume support

2019-02-20 Thread Geert Uytterhoeven
Hi Laurent,

On Wed, Feb 20, 2019 at 5:11 PM Laurent Pinchart
 wrote:
> On Wed, Feb 20, 2019 at 05:05:49PM +0100, Geert Uytterhoeven wrote:
> > On Wed, Feb 20, 2019 at 4:42 PM Laurent Pinchart wrote:
> > > On Wed, Feb 20, 2019 at 04:05:31PM +0100, Geert Uytterhoeven wrote:
> > >> During PSCI system suspend, R-Car Gen3 SoCs are powered down, and all
> > >> IPMMU state is lost.  Hence after s2ram, devices wired behind an IPMMU,
> > >> and configured to use it, will see their DMA operations hang.
> > >>
> > >> To fix this, restore all IPMMU contexts, and re-enable all active
> > >> micro-TLBs during system resume.
> > >>
> > >> To avoid overhead on platforms not needing it, the resume code has a
> > >> build time dependency on sleep and PSCI support, and a runtime
> > >> dependency on PSCI.

> > >> --- a/drivers/iommu/ipmmu-vmsa.c
> > >> +++ b/drivers/iommu/ipmmu-vmsa.c

> > >> @@ -1158,10 +1166,52 @@ static int ipmmu_remove(struct platform_device 
> > >> *pdev)
> > >>   return 0;
> > >>  }
> > >>
> > >> +#if defined(CONFIG_PM_SLEEP) && defined(CONFIG_ARM_PSCI_FW)
> > >> +static int ipmmu_resume_noirq(struct device *dev)
> > >> +{
> > >> + struct ipmmu_vmsa_device *mmu = dev_get_drvdata(dev);
> > >> + unsigned int i;
> > >> +
> > >> + /* This is the best we can do to check for the presence of PSCI */
> > >> + if (!psci_ops.cpu_suspend)
> > >> + return 0;
> > >
> > > PSCI suspend disabling power to the SoC completely may be a common
> > > behaviour on our development boards, but isn't mandated by the PSCI
> > > specification if I'm not mistaken. Is there a way to instead detect that
> > > power has been lost, perhaps by checking whether a register has been
> > > reset to its default value ?
> >
> > The approach here is the same as in the clk and pinctrl drivers.
> >
> > I think we could check if the IMCTR registers for allocated domains in root
> > IPMMUs are non-zero.  But that's about as expensive as doing the full
> > restore, I think.
>
> Would reading just one register be more expensive that full
> reconfiguration ? Or is there no single register that could serve this
> purpose ?
>
> > And it may have to be done for each and every IPMMU instance, or do
> > you trust caching for this?
>
> If we can find a single register I think that reading it for every IPMMU
> instance wouldn't be an issue.

Upon more thought, probably it can be done by reading the IMCTR
for the first non-zero domain. Will look into it...

> > >> +static const struct dev_pm_ops ipmmu_pm  = {
> > >> + SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(NULL, ipmmu_resume_noirq)
> > >> +};
> > >> +#define DEV_PM_OPS   _pm
> > >> +#else
> > >> +#define DEV_PM_OPS   NULL
> > >> +#endif /* CONFIG_PM_SLEEP && CONFIG_ARM_PSCI_FW */
> > >> +
> > >>  static struct platform_driver ipmmu_driver = {
> > >>   .driver = {
> > >>   .name = "ipmmu-vmsa",
> > >>   .of_match_table = of_match_ptr(ipmmu_of_ids),
> > >> + .pm = DEV_PM_OPS,
> > >
> > > I would have used conditional compilation here instead of using a
> > > DEV_PM_OPS macro, as I think the macro decreases readability (and also
> > > given how its generic name could later conflict with something else).
> >
> > You mean
> >
> > #ifdef ...
> > .pm = _pm,
> > #endif
> >
> > and marking ipmmu_pm __maybe_unused__?
>
> Yes. Up to you.

I'm not such a big fan of __maybe_unused. It's easy to add, and too
easy to forget to remove it when it's no longer needed (casts have the
same problem).

Usually people just annotate the actual suspend/resume methods with
__maybe_unused, which still leaves the (large) struct dev_pm_ops in
memory.

So I started preferring the DEV_PM_OPS approach...

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/dmar: fix buffer overflow during PCI bus notification

2019-02-20 Thread Julia Cartwright
Commit 57384592c433 ("iommu/vt-d: Store bus information in RMRR PCI
device path") changed the type of the path data, however, the change in
path type was not reflected in size calculations.  Update to use the
correct type and prevent a buffer overflow.

This bug manifests in systems with deep PCI hierarchies, and can lead to
an overflow of the static allocated buffer (dmar_pci_notify_info_buf),
or can lead to overflow of slab-allocated data.

   BUG: KASAN: global-out-of-bounds in dmar_alloc_pci_notify_info+0x1d5/0x2e0
   Write of size 1 at addr 90445d80 by task swapper/0/1
   CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW   
4.14.87-rt49-02406-gd0a0e96 #1
   Call Trace:
? dump_stack+0x46/0x59
? print_address_description+0x1df/0x290
? dmar_alloc_pci_notify_info+0x1d5/0x2e0
? kasan_report+0x256/0x340
? dmar_alloc_pci_notify_info+0x1d5/0x2e0
? e820__memblock_setup+0xb0/0xb0
? dmar_dev_scope_init+0x424/0x48f
? __down_write_common+0x1ec/0x230
? dmar_dev_scope_init+0x48f/0x48f
? dmar_free_unused_resources+0x109/0x109
? cpumask_next+0x16/0x20
? __kmem_cache_create+0x392/0x430
? kmem_cache_create+0x135/0x2f0
? e820__memblock_setup+0xb0/0xb0
? intel_iommu_init+0x170/0x1848
? _raw_spin_unlock_irqrestore+0x32/0x60
? migrate_enable+0x27a/0x5b0
? sched_setattr+0x20/0x20
? migrate_disable+0x1fc/0x380
? task_rq_lock+0x170/0x170
? try_to_run_init_process+0x40/0x40
? locks_remove_file+0x85/0x2f0
? dev_prepare_static_identity_mapping+0x78/0x78
? rt_spin_unlock+0x39/0x50
? lockref_put_or_lock+0x2a/0x40
? dput+0x128/0x2f0
? __rcu_read_unlock+0x66/0x80
? __fput+0x250/0x300
? __rcu_read_lock+0x1b/0x30
? mntput_no_expire+0x38/0x290
? e820__memblock_setup+0xb0/0xb0
? pci_iommu_init+0x25/0x63
? pci_iommu_init+0x25/0x63
? do_one_initcall+0x7e/0x1c0
? initcall_blacklisted+0x120/0x120
? kernel_init_freeable+0x27b/0x307
? rest_init+0xd0/0xd0
? kernel_init+0xf/0x120
? rest_init+0xd0/0xd0
? ret_from_fork+0x1f/0x40
   The buggy address belongs to the variable:
dmar_pci_notify_info_buf+0x40/0x60

Fixes: 57384592c433 ("iommu/vt-d: Store bus information in RMRR PCI device 
path")
Signed-off-by: Julia Cartwright 
---
 drivers/iommu/dmar.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
index 6b7df25e1488..9c49300e9fb7 100644
--- a/drivers/iommu/dmar.c
+++ b/drivers/iommu/dmar.c
@@ -145,7 +145,7 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned 
long event)
for (tmp = dev; tmp; tmp = tmp->bus->self)
level++;
 
-   size = sizeof(*info) + level * sizeof(struct acpi_dmar_pci_path);
+   size = sizeof(*info) + level * sizeof(info->path[0]);
if (size <= sizeof(dmar_pci_notify_info_buf)) {
info = (struct dmar_pci_notify_info *)dmar_pci_notify_info_buf;
} else {
-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/dmar: fix buffer overflow during PCI bus notification

2019-02-20 Thread Julia Cartwright
On Wed, Feb 20, 2019 at 10:46:31AM -0600, Julia Cartwright wrote:
> Commit 57384592c433 ("iommu/vt-d: Store bus information in RMRR PCI
> device path") changed the type of the path data, however, the change in
> path type was not reflected in size calculations.  Update to use the
> correct type and prevent a buffer overflow.
> 
> This bug manifests in systems with deep PCI hierarchies, and can lead to
> an overflow of the static allocated buffer (dmar_pci_notify_info_buf),
> or can lead to overflow of slab-allocated data.
> 
>BUG: KASAN: global-out-of-bounds in dmar_alloc_pci_notify_info+0x1d5/0x2e0
>Write of size 1 at addr 90445d80 by task swapper/0/1
>CPU: 0 PID: 1 Comm: swapper/0 Tainted: GW   
> 4.14.87-rt49-02406-gd0a0e96 #1
>Call Trace:
> ? dump_stack+0x46/0x59
> ? print_address_description+0x1df/0x290
[..]
>The buggy address belongs to the variable:
> dmar_pci_notify_info_buf+0x40/0x60
> 
> Fixes: 57384592c433 ("iommu/vt-d: Store bus information in RMRR PCI device 
> path")
> Signed-off-by: Julia Cartwright 
> ---
>  drivers/iommu/dmar.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/dmar.c b/drivers/iommu/dmar.c
> index 6b7df25e1488..9c49300e9fb7 100644
> --- a/drivers/iommu/dmar.c
> +++ b/drivers/iommu/dmar.c
> @@ -145,7 +145,7 @@ dmar_alloc_pci_notify_info(struct pci_dev *dev, unsigned 
> long event)
>   for (tmp = dev; tmp; tmp = tmp->bus->self)
>   level++;
>  
> - size = sizeof(*info) + level * sizeof(struct acpi_dmar_pci_path);
> + size = sizeof(*info) + level * sizeof(info->path[0]);

This is probably a candidate for struct_size() instead, if that's what
is preferred.

   Julia
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 7/7] iommu/ipmmu-vmsa: Add suspend/resume support

2019-02-20 Thread Laurent Pinchart
Hi Geert,

On Wed, Feb 20, 2019 at 05:05:49PM +0100, Geert Uytterhoeven wrote:
> On Wed, Feb 20, 2019 at 4:42 PM Laurent Pinchart wrote:
> > On Wed, Feb 20, 2019 at 04:05:31PM +0100, Geert Uytterhoeven wrote:
> >> During PSCI system suspend, R-Car Gen3 SoCs are powered down, and all
> >> IPMMU state is lost.  Hence after s2ram, devices wired behind an IPMMU,
> >> and configured to use it, will see their DMA operations hang.
> >>
> >> To fix this, restore all IPMMU contexts, and re-enable all active
> >> micro-TLBs during system resume.
> >>
> >> To avoid overhead on platforms not needing it, the resume code has a
> >> build time dependency on sleep and PSCI support, and a runtime
> >> dependency on PSCI.
> >>
> >> Signed-off-by: Geert Uytterhoeven 
> >> ---
> >> This patch takes a different approach than the BSP, which implements a
> >> bulk save/restore of all registers during system suspend/resume.
> >
> > I like this approach better too.
> 
> Thanks ;-)
> 
> >> --- a/drivers/iommu/ipmmu-vmsa.c
> >> +++ b/drivers/iommu/ipmmu-vmsa.c
> 
> >> @@ -58,6 +62,7 @@ struct ipmmu_vmsa_device {
> >>   spinlock_t lock;/* Protects ctx and 
> >> domains[] */
> >>   DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
> >>   struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
> >> + s8 utlb_ctx[IPMMU_UTLB_MAX];
> >
> > How about making this a bitmask instead to save memory ? I would also
> > rename it as utlb_ctx doesn't really carry the meaning of the field,
> > whose purpose is to store whether the µTLB is enabled or disabled.
> 
> This field isn't just a binary flag, but stores the context used for the
> uTLB, so we can map from micro-TLB to context.
> Given there can be 8 contexts, plus the need to indicate unused contexts,
> that means 4 bits/micro-TLB.  So the overhead is just 24 bytes per IPMMU
> instance.

My bad, I've overlooked that.

> I considered allocating the array dynamically (by having s8 utlb_ctx[]
> at the end of the structure), but didn't go that route, as the domains[]
> array already uses more memory.
> 
> >> @@ -1158,10 +1166,52 @@ static int ipmmu_remove(struct platform_device 
> >> *pdev)
> >>   return 0;
> >>  }
> >>
> >> +#if defined(CONFIG_PM_SLEEP) && defined(CONFIG_ARM_PSCI_FW)
> >> +static int ipmmu_resume_noirq(struct device *dev)
> >> +{
> >> + struct ipmmu_vmsa_device *mmu = dev_get_drvdata(dev);
> >> + unsigned int i;
> >> +
> >> + /* This is the best we can do to check for the presence of PSCI */
> >> + if (!psci_ops.cpu_suspend)
> >> + return 0;
> >
> > PSCI suspend disabling power to the SoC completely may be a common
> > behaviour on our development boards, but isn't mandated by the PSCI
> > specification if I'm not mistaken. Is there a way to instead detect that
> > power has been lost, perhaps by checking whether a register has been
> > reset to its default value ?
> 
> The approach here is the same as in the clk and pinctrl drivers.
> 
> I think we could check if the IMCTR registers for allocated domains in root
> IPMMUs are non-zero.  But that's about as expensive as doing the full
> restore, I think.

Would reading just one register be more expensive that full
reconfiguration ? Or is there no single register that could serve this
purpose ?

> And it may have to be done for each and every IPMMU instance, or do
> you trust caching for this?

If we can find a single register I think that reading it for every IPMMU
instance wouldn't be an issue.

> >> +
> >> + /* Reset root MMU and restore contexts */
> >
> > I think the rest of the code adds a period at the end of sentences in
> > comments.
> 
> The balance seems to be just under 50% ;-)
> 
> >> + if (ipmmu_is_root(mmu)) {
> >> + ipmmu_device_reset(mmu);
> >> +
> >> + for (i = 0; i < mmu->num_ctx; i++) {
> >> + if (!mmu->domains[i])
> >> + continue;
> >> +
> >> + ipmmu_context_init(mmu->domains[i]);
> >> + }
> >> + }
> >> +
> >> + /* Re-enable active micro-TLBs */
> >> + for (i = 0; i < mmu->features->num_utlbs; i++) {
> >> + if (mmu->utlb_ctx[i] == IPMMU_CTX_INVALID)
> >> + continue;
> >> +
> >> + ipmmu_utlb_enable(mmu->root->domains[mmu->utlb_ctx[i]], i);
> >> + }
> >> +
> >> + return 0;
> >> +}
> >> +
> >> +static const struct dev_pm_ops ipmmu_pm  = {
> >> + SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(NULL, ipmmu_resume_noirq)
> >> +};
> >> +#define DEV_PM_OPS   _pm
> >> +#else
> >> +#define DEV_PM_OPS   NULL
> >> +#endif /* CONFIG_PM_SLEEP && CONFIG_ARM_PSCI_FW */
> >> +
> >>  static struct platform_driver ipmmu_driver = {
> >>   .driver = {
> >>   .name = "ipmmu-vmsa",
> >>   .of_match_table = of_match_ptr(ipmmu_of_ids),
> >> + .pm = DEV_PM_OPS,
> >
> > I would have used conditional compilation here instead of using a
> > DEV_PM_OPS macro, as I think the macro 

Re: [PATCH 7/7] iommu/ipmmu-vmsa: Add suspend/resume support

2019-02-20 Thread Geert Uytterhoeven
Hi Laurent,

On Wed, Feb 20, 2019 at 4:42 PM Laurent Pinchart
 wrote:
> On Wed, Feb 20, 2019 at 04:05:31PM +0100, Geert Uytterhoeven wrote:
> > During PSCI system suspend, R-Car Gen3 SoCs are powered down, and all
> > IPMMU state is lost.  Hence after s2ram, devices wired behind an IPMMU,
> > and configured to use it, will see their DMA operations hang.
> >
> > To fix this, restore all IPMMU contexts, and re-enable all active
> > micro-TLBs during system resume.
> >
> > To avoid overhead on platforms not needing it, the resume code has a
> > build time dependency on sleep and PSCI support, and a runtime
> > dependency on PSCI.
> >
> > Signed-off-by: Geert Uytterhoeven 
> > ---
> > This patch takes a different approach than the BSP, which implements a
> > bulk save/restore of all registers during system suspend/resume.
>
> I like this approach better too.

Thanks ;-)

> > --- a/drivers/iommu/ipmmu-vmsa.c
> > +++ b/drivers/iommu/ipmmu-vmsa.c

> > @@ -58,6 +62,7 @@ struct ipmmu_vmsa_device {
> >   spinlock_t lock;/* Protects ctx and domains[] 
> > */
> >   DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
> >   struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
> > + s8 utlb_ctx[IPMMU_UTLB_MAX];
>
> How about making this a bitmask instead to save memory ? I would also
> rename it as utlb_ctx doesn't really carry the meaning of the field,
> whose purpose is to store whether the µTLB is enabled or disabled.

This field isn't just a binary flag, but stores the context used for the
uTLB, so we can map from micro-TLB to context.
Given there can be 8 contexts, plus the need to indicate unused contexts,
that means 4 bits/micro-TLB.  So the overhead is just 24 bytes per IPMMU
instance.

I considered allocating the array dynamically (by having s8 utlb_ctx[]
at the end of the structure), but didn't go that route, as the domains[]
array already uses more memory.

> > @@ -1158,10 +1166,52 @@ static int ipmmu_remove(struct platform_device 
> > *pdev)
> >   return 0;
> >  }
> >
> > +#if defined(CONFIG_PM_SLEEP) && defined(CONFIG_ARM_PSCI_FW)
> > +static int ipmmu_resume_noirq(struct device *dev)
> > +{
> > + struct ipmmu_vmsa_device *mmu = dev_get_drvdata(dev);
> > + unsigned int i;
> > +
> > + /* This is the best we can do to check for the presence of PSCI */
> > + if (!psci_ops.cpu_suspend)
> > + return 0;
>
> PSCI suspend disabling power to the SoC completely may be a common
> behaviour on our development boards, but isn't mandated by the PSCI
> specification if I'm not mistaken. Is there a way to instead detect that
> power has been lost, perhaps by checking whether a register has been
> reset to its default value ?

The approach here is the same as in the clk and pinctrl drivers.

I think we could check if the IMCTR registers for allocated domains in root
IPMMUs are non-zero.  But that's about as expensive as doing the full
restore, I think.  And it may have to be done for each and every IPMMU
instance, or do you trust caching for this?

> > +
> > + /* Reset root MMU and restore contexts */
>
> I think the rest of the code adds a period at the end of sentences in
> comments.

The balance seems to be just under 50% ;-)

> > + if (ipmmu_is_root(mmu)) {
> > + ipmmu_device_reset(mmu);
> > +
> > + for (i = 0; i < mmu->num_ctx; i++) {
> > + if (!mmu->domains[i])
> > + continue;
> > +
> > + ipmmu_context_init(mmu->domains[i]);
> > + }
> > + }
> > +
> > + /* Re-enable active micro-TLBs */
> > + for (i = 0; i < mmu->features->num_utlbs; i++) {
> > + if (mmu->utlb_ctx[i] == IPMMU_CTX_INVALID)
> > + continue;
> > +
> > + ipmmu_utlb_enable(mmu->root->domains[mmu->utlb_ctx[i]], i);
> > + }
> > +
> > + return 0;
> > +}
> > +
> > +static const struct dev_pm_ops ipmmu_pm  = {
> > + SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(NULL, ipmmu_resume_noirq)
> > +};
> > +#define DEV_PM_OPS   _pm
> > +#else
> > +#define DEV_PM_OPS   NULL
> > +#endif /* CONFIG_PM_SLEEP && CONFIG_ARM_PSCI_FW */
> > +
> >  static struct platform_driver ipmmu_driver = {
> >   .driver = {
> >   .name = "ipmmu-vmsa",
> >   .of_match_table = of_match_ptr(ipmmu_of_ids),
> > + .pm = DEV_PM_OPS,
>
> I would have used conditional compilation here instead of using a
> DEV_PM_OPS macro, as I think the macro decreases readability (and also
> given how its generic name could later conflict with something else).

You mean

#ifdef ...
.pm = _pm,
#endif

and marking ipmmu_pm __maybe_unused__?

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
  

Re: [PATCH 6/7] iommu/ipmmu-vmsa: Extract hardware context initialization

2019-02-20 Thread Geert Uytterhoeven
Hi Laurent,

On Wed, Feb 20, 2019 at 4:35 PM Laurent Pinchart
 wrote:
> On Wed, Feb 20, 2019 at 04:05:30PM +0100, Geert Uytterhoeven wrote:
> > ipmmu_domain_init_context() takes care of (1) initializing the software
> > domain, and (2) initializing the hardware context for the domain.
> >
> > Extract the code to initialize the hardware context into a new subroutine
> > ipmmu_context_init(), to prepare for later reuse.
> >
> > Signed-off-by: Geert Uytterhoeven 
> > ---
> >  drivers/iommu/ipmmu-vmsa.c | 91 --
> >  1 file changed, 48 insertions(+), 43 deletions(-)
> >
> > diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> > index 0a21e734466eb1bd..92a766dd8b459f0c 100644
> > --- a/drivers/iommu/ipmmu-vmsa.c
> > +++ b/drivers/iommu/ipmmu-vmsa.c
> > @@ -404,52 +404,10 @@ static void ipmmu_domain_free_context(struct 
> > ipmmu_vmsa_device *mmu,
> >   spin_unlock_irqrestore(>lock, flags);
> >  }
> >
> > -static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
> > +static void ipmmu_context_init(struct ipmmu_vmsa_domain *domain)
>
> ipmmu_context_init() vs. ipmmmu_domain_init_context() is confusing. You
> could call this one ipmmu_domain_setup_context() maybe ?

Thanks, that name was actually on my shortlist, and may make most sense.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 5/7] iommu/ipmmu-vmsa: Move num_utlbs to SoC-specific features

2019-02-20 Thread Laurent Pinchart
Hi Geert,

Thank you for the patch.

On Wed, Feb 20, 2019 at 04:05:29PM +0100, Geert Uytterhoeven wrote:
> The maximum number of micro-TLBs per IPMMU instance is not fixed, but
> depends on the SoC type.  Hence move it from struct ipmmu_vmsa_device to
> struct ipmmu_features, and set up the correct value for both R-Car Gen2
> and Gen3 SoCs.
> 
> Note that currently no code uses this value.
> 
> Signed-off-by: Geert Uytterhoeven 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/iommu/ipmmu-vmsa.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index 4e3134a9a52f8d87..0a21e734466eb1bd 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -42,6 +42,7 @@ struct ipmmu_features {
>   bool use_ns_alias_offset;
>   bool has_cache_leaf_nodes;
>   unsigned int number_of_contexts;
> + unsigned int num_utlbs;
>   bool setup_imbuscr;
>   bool twobit_imttbcr_sl0;
>   bool reserved_context;
> @@ -53,7 +54,6 @@ struct ipmmu_vmsa_device {
>   struct iommu_device iommu;
>   struct ipmmu_vmsa_device *root;
>   const struct ipmmu_features *features;
> - unsigned int num_utlbs;
>   unsigned int num_ctx;
>   spinlock_t lock;/* Protects ctx and domains[] */
>   DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
> @@ -972,6 +972,7 @@ static const struct ipmmu_features ipmmu_features_default 
> = {
>   .use_ns_alias_offset = true,
>   .has_cache_leaf_nodes = false,
>   .number_of_contexts = 1, /* software only tested with one context */
> + .num_utlbs = 32,
>   .setup_imbuscr = true,
>   .twobit_imttbcr_sl0 = false,
>   .reserved_context = false,
> @@ -981,6 +982,7 @@ static const struct ipmmu_features 
> ipmmu_features_rcar_gen3 = {
>   .use_ns_alias_offset = false,
>   .has_cache_leaf_nodes = true,
>   .number_of_contexts = 8,
> + .num_utlbs = 48,
>   .setup_imbuscr = false,
>   .twobit_imttbcr_sl0 = true,
>   .reserved_context = true,
> @@ -1033,7 +1035,6 @@ static int ipmmu_probe(struct platform_device *pdev)
>   }
>  
>   mmu->dev = >dev;
> - mmu->num_utlbs = 48;
>   spin_lock_init(>lock);
>   bitmap_zero(mmu->ctx, IPMMU_CTX_MAX);
>   mmu->features = of_device_get_match_data(>dev);

-- 
Regards,

Laurent Pinchart
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 7/7] iommu/ipmmu-vmsa: Add suspend/resume support

2019-02-20 Thread Laurent Pinchart
Hi Geert,

Thank you for the patch.

On Wed, Feb 20, 2019 at 04:05:31PM +0100, Geert Uytterhoeven wrote:
> During PSCI system suspend, R-Car Gen3 SoCs are powered down, and all
> IPMMU state is lost.  Hence after s2ram, devices wired behind an IPMMU,
> and configured to use it, will see their DMA operations hang.
> 
> To fix this, restore all IPMMU contexts, and re-enable all active
> micro-TLBs during system resume.
> 
> To avoid overhead on platforms not needing it, the resume code has a
> build time dependency on sleep and PSCI support, and a runtime
> dependency on PSCI.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
> This patch takes a different approach than the BSP, which implements a
> bulk save/restore of all registers during system suspend/resume.

I like this approach better too.

>  drivers/iommu/ipmmu-vmsa.c | 52 +-
>  1 file changed, 51 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index 92a766dd8b459f0c..5d22139914e8f033 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -22,6 +22,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -36,7 +37,10 @@
>  #define arm_iommu_detach_device(...) do {} while (0)
>  #endif
>  
> -#define IPMMU_CTX_MAX 8U
> +#define IPMMU_CTX_MAX8U
> +#define IPMMU_CTX_INVALID-1
> +
> +#define IPMMU_UTLB_MAX   48U
>  
>  struct ipmmu_features {
>   bool use_ns_alias_offset;
> @@ -58,6 +62,7 @@ struct ipmmu_vmsa_device {
>   spinlock_t lock;/* Protects ctx and domains[] */
>   DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
>   struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
> + s8 utlb_ctx[IPMMU_UTLB_MAX];

How about making this a bitmask instead to save memory ? I would also
rename it as utlb_ctx doesn't really carry the meaning of the field,
whose purpose is to store whether the µTLB is enabled or disabled.

>  
>   struct iommu_group *group;
>   struct dma_iommu_mapping *mapping;
> @@ -335,6 +340,7 @@ static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain 
> *domain,
>   ipmmu_write(mmu, IMUCTR(utlb),
>   IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_FLUSH |
>   IMUCTR_MMUEN);
> + mmu->utlb_ctx[utlb] = domain->context_id;
>  }
>  
>  /*
> @@ -346,6 +352,7 @@ static void ipmmu_utlb_disable(struct ipmmu_vmsa_domain 
> *domain,
>   struct ipmmu_vmsa_device *mmu = domain->mmu;
>  
>   ipmmu_write(mmu, IMUCTR(utlb), 0);
> + mmu->utlb_ctx[utlb] = IPMMU_CTX_INVALID;
>  }
>  
>  static void ipmmu_tlb_flush_all(void *cookie)
> @@ -1043,6 +1050,7 @@ static int ipmmu_probe(struct platform_device *pdev)
>   spin_lock_init(>lock);
>   bitmap_zero(mmu->ctx, IPMMU_CTX_MAX);
>   mmu->features = of_device_get_match_data(>dev);
> + memset(mmu->utlb_ctx, IPMMU_CTX_INVALID, mmu->features->num_utlbs);
>   dma_set_mask_and_coherent(>dev, DMA_BIT_MASK(40));
>  
>   /* Map I/O memory and request IRQ. */
> @@ -1158,10 +1166,52 @@ static int ipmmu_remove(struct platform_device *pdev)
>   return 0;
>  }
>  
> +#if defined(CONFIG_PM_SLEEP) && defined(CONFIG_ARM_PSCI_FW)
> +static int ipmmu_resume_noirq(struct device *dev)
> +{
> + struct ipmmu_vmsa_device *mmu = dev_get_drvdata(dev);
> + unsigned int i;
> +
> + /* This is the best we can do to check for the presence of PSCI */
> + if (!psci_ops.cpu_suspend)
> + return 0;

PSCI suspend disabling power to the SoC completely may be a common
behaviour on our development boards, but isn't mandated by the PSCI
specification if I'm not mistaken. Is there a way to instead detect that
power has been lost, perhaps by checking whether a register has been
reset to its default value ?

> +
> + /* Reset root MMU and restore contexts */

I think the rest of the code adds a period at the end of sentences in
comments.

> + if (ipmmu_is_root(mmu)) {
> + ipmmu_device_reset(mmu);
> +
> + for (i = 0; i < mmu->num_ctx; i++) {
> + if (!mmu->domains[i])
> + continue;
> +
> + ipmmu_context_init(mmu->domains[i]);
> + }
> + }
> +
> + /* Re-enable active micro-TLBs */
> + for (i = 0; i < mmu->features->num_utlbs; i++) {
> + if (mmu->utlb_ctx[i] == IPMMU_CTX_INVALID)
> + continue;
> +
> + ipmmu_utlb_enable(mmu->root->domains[mmu->utlb_ctx[i]], i);
> + }
> +
> + return 0;
> +}
> +
> +static const struct dev_pm_ops ipmmu_pm  = {
> + SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(NULL, ipmmu_resume_noirq)
> +};
> +#define DEV_PM_OPS   _pm
> +#else
> +#define DEV_PM_OPS   NULL
> +#endif /* CONFIG_PM_SLEEP && CONFIG_ARM_PSCI_FW */
> +
>  static struct platform_driver ipmmu_driver = {
>   .driver = {
>   .name = "ipmmu-vmsa",
>   

Re: [PATCH 3/7] iommu/ipmmu-vmsa: Prepare to handle 40-bit error addresses

2019-02-20 Thread Geert Uytterhoeven
Hi Laurent,

On Wed, Feb 20, 2019 at 4:31 PM Laurent Pinchart
 wrote:
> On Wed, Feb 20, 2019 at 04:05:27PM +0100, Geert Uytterhoeven wrote:
> > On R-Car Gen3, the faulting virtual address is a 40-bit address, and
> > comprised of two registers.  Read the upper address part, and combine
> > both parts, when running on a 64-bit system.
> >
> > Signed-off-by: Geert Uytterhoeven 
> > ---
> > Apart from this, the driver doesn't support 40-bit IOVA addresses yet.
> > ---
> >  drivers/iommu/ipmmu-vmsa.c | 16 ++--
> >  1 file changed, 10 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> > index ac70cb967ff821c6..4d07c26c97848b65 100644
> > --- a/drivers/iommu/ipmmu-vmsa.c
> > +++ b/drivers/iommu/ipmmu-vmsa.c
> > @@ -186,7 +186,9 @@ static struct ipmmu_vmsa_device *to_ipmmu(struct device 
> > *dev)
> >  #define IMMAIR_ATTR_IDX_WBRWA1
> >  #define IMMAIR_ATTR_IDX_DEV  2
> >
> > -#define IMEAR0x0030
> > +#define IMEAR0x0030  /* R-Car Gen2 */
> > +#define IMELAR   IMEAR   /* R-Car Gen3 */
>
> I would have defined that as a single macro.

Fair enough.

> > +#define IMEUAR   0x0034  /* R-Car Gen3 */
> >
> >  #define IMPCTR   0x0200
> >  #define IMPSTR   0x0208
> > @@ -521,14 +523,16 @@ static irqreturn_t ipmmu_domain_irq(struct 
> > ipmmu_vmsa_domain *domain)
> >  {
> >   const u32 err_mask = IMSTR_MHIT | IMSTR_ABORT | IMSTR_PF | IMSTR_TF;
> >   struct ipmmu_vmsa_device *mmu = domain->mmu;
> > + unsigned long iova;
>
> Isn't there a more appropriate type, such as dma_addr_t possibly ?

Good question! Thanks for opening the can of worms ;-)

I used "unsigned long", as that's what the "iova" parameter of
report_iommu_fault() (called in this function) takes.

However, dma_addr_t would probably be a better choice.

Note that most iommu_ops methods take unsigned long, except for
.iova_to_phys(), which takes dma_addr_t.
Also, all io_pgtable_ops methods take unsigned long, including
.iova_to_phys().

The latter is interesting, as several IOMMU drivers forward the iova
received in their iommu_ops.iova_to_phys() methods to
io_pgtable_ops.iova_to_phys(), thus truncating the value on e.g. arm32
with LPAE.

Gr{oetje,eeting}s,

Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 6/7] iommu/ipmmu-vmsa: Extract hardware context initialization

2019-02-20 Thread Laurent Pinchart
Hi Geert,

Thank you for the patch.

On Wed, Feb 20, 2019 at 04:05:30PM +0100, Geert Uytterhoeven wrote:
> ipmmu_domain_init_context() takes care of (1) initializing the software
> domain, and (2) initializing the hardware context for the domain.
> 
> Extract the code to initialize the hardware context into a new subroutine
> ipmmu_context_init(), to prepare for later reuse.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
>  drivers/iommu/ipmmu-vmsa.c | 91 --
>  1 file changed, 48 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index 0a21e734466eb1bd..92a766dd8b459f0c 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -404,52 +404,10 @@ static void ipmmu_domain_free_context(struct 
> ipmmu_vmsa_device *mmu,
>   spin_unlock_irqrestore(>lock, flags);
>  }
>  
> -static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
> +static void ipmmu_context_init(struct ipmmu_vmsa_domain *domain)

ipmmu_context_init() vs. ipmmmu_domain_init_context() is confusing. You
could call this one ipmmu_domain_setup_context() maybe ?

>  {
>   u64 ttbr;
>   u32 tmp;
> - int ret;
> -
> - /*
> -  * Allocate the page table operations.
> -  *
> -  * VMSA states in section B3.6.3 "Control of Secure or Non-secure memory
> -  * access, Long-descriptor format" that the NStable bit being set in a
> -  * table descriptor will result in the NStable and NS bits of all child
> -  * entries being ignored and considered as being set. The IPMMU seems
> -  * not to comply with this, as it generates a secure access page fault
> -  * if any of the NStable and NS bits isn't set when running in
> -  * non-secure mode.
> -  */
> - domain->cfg.quirks = IO_PGTABLE_QUIRK_ARM_NS;
> - domain->cfg.pgsize_bitmap = SZ_1G | SZ_2M | SZ_4K;
> - domain->cfg.ias = 32;
> - domain->cfg.oas = 40;
> - domain->cfg.tlb = _gather_ops;
> - domain->io_domain.geometry.aperture_end = DMA_BIT_MASK(32);
> - domain->io_domain.geometry.force_aperture = true;
> - /*
> -  * TODO: Add support for coherent walk through CCI with DVM and remove
> -  * cache handling. For now, delegate it to the io-pgtable code.
> -  */
> - domain->cfg.iommu_dev = domain->mmu->root->dev;
> -
> - /*
> -  * Find an unused context.
> -  */
> - ret = ipmmu_domain_allocate_context(domain->mmu->root, domain);
> - if (ret < 0)
> - return ret;
> -
> - domain->context_id = ret;
> -
> - domain->iop = alloc_io_pgtable_ops(ARM_32_LPAE_S1, >cfg,
> -domain);
> - if (!domain->iop) {
> - ipmmu_domain_free_context(domain->mmu->root,
> -   domain->context_id);
> - return -EINVAL;
> - }
>  
>   /* TTBR0 */
>   ttbr = domain->cfg.arm_lpae_s1_cfg.ttbr[0];
> @@ -495,7 +453,54 @@ static int ipmmu_domain_init_context(struct 
> ipmmu_vmsa_domain *domain)
>*/
>   ipmmu_ctx_write_all(domain, IMCTR,
>   IMCTR_INTEN | IMCTR_FLUSH | IMCTR_MMUEN);
> +}
> +
> +static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
> +{
> + int ret;
> +
> + /*
> +  * Allocate the page table operations.
> +  *
> +  * VMSA states in section B3.6.3 "Control of Secure or Non-secure memory
> +  * access, Long-descriptor format" that the NStable bit being set in a
> +  * table descriptor will result in the NStable and NS bits of all child
> +  * entries being ignored and considered as being set. The IPMMU seems
> +  * not to comply with this, as it generates a secure access page fault
> +  * if any of the NStable and NS bits isn't set when running in
> +  * non-secure mode.
> +  */
> + domain->cfg.quirks = IO_PGTABLE_QUIRK_ARM_NS;
> + domain->cfg.pgsize_bitmap = SZ_1G | SZ_2M | SZ_4K;
> + domain->cfg.ias = 32;
> + domain->cfg.oas = 40;
> + domain->cfg.tlb = _gather_ops;
> + domain->io_domain.geometry.aperture_end = DMA_BIT_MASK(32);
> + domain->io_domain.geometry.force_aperture = true;
> + /*
> +  * TODO: Add support for coherent walk through CCI with DVM and remove
> +  * cache handling. For now, delegate it to the io-pgtable code.
> +  */
> + domain->cfg.iommu_dev = domain->mmu->root->dev;
> +
> + /*
> +  * Find an unused context.
> +  */
> + ret = ipmmu_domain_allocate_context(domain->mmu->root, domain);
> + if (ret < 0)
> + return ret;
> +
> + domain->context_id = ret;
> +
> + domain->iop = alloc_io_pgtable_ops(ARM_32_LPAE_S1, >cfg,
> +domain);
> + if (!domain->iop) {
> + ipmmu_domain_free_context(domain->mmu->root,
> +   domain->context_id);
> + return -EINVAL;
> + }

Re: [PATCH 3/7] iommu/ipmmu-vmsa: Prepare to handle 40-bit error addresses

2019-02-20 Thread Laurent Pinchart
Hi Geert,

Thank you for the patch.

On Wed, Feb 20, 2019 at 04:05:27PM +0100, Geert Uytterhoeven wrote:
> On R-Car Gen3, the faulting virtual address is a 40-bit address, and
> comprised of two registers.  Read the upper address part, and combine
> both parts, when running on a 64-bit system.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
> Apart from this, the driver doesn't support 40-bit IOVA addresses yet.
> ---
>  drivers/iommu/ipmmu-vmsa.c | 16 ++--
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index ac70cb967ff821c6..4d07c26c97848b65 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -186,7 +186,9 @@ static struct ipmmu_vmsa_device *to_ipmmu(struct device 
> *dev)
>  #define IMMAIR_ATTR_IDX_WBRWA1
>  #define IMMAIR_ATTR_IDX_DEV  2
>  
> -#define IMEAR0x0030
> +#define IMEAR0x0030  /* R-Car Gen2 */
> +#define IMELAR   IMEAR   /* R-Car Gen3 */

I would have defined that as a single macro.

> +#define IMEUAR   0x0034  /* R-Car Gen3 */
>  
>  #define IMPCTR   0x0200
>  #define IMPSTR   0x0208
> @@ -521,14 +523,16 @@ static irqreturn_t ipmmu_domain_irq(struct 
> ipmmu_vmsa_domain *domain)
>  {
>   const u32 err_mask = IMSTR_MHIT | IMSTR_ABORT | IMSTR_PF | IMSTR_TF;
>   struct ipmmu_vmsa_device *mmu = domain->mmu;
> + unsigned long iova;

Isn't there a more appropriate type, such as dma_addr_t possibly ?

>   u32 status;
> - u32 iova;
>  
>   status = ipmmu_ctx_read_root(domain, IMSTR);
>   if (!(status & err_mask))
>   return IRQ_NONE;
>  
> - iova = ipmmu_ctx_read_root(domain, IMEAR);
> + iova = ipmmu_ctx_read_root(domain, IMELAR);
> + if (IS_ENABLED(CONFIG_64BIT))
> + iova |= (u64)ipmmu_ctx_read_root(domain, IMEUAR) << 32;
>  
>   /*
>* Clear the error status flags. Unlike traditional interrupt flag
> @@ -540,10 +544,10 @@ static irqreturn_t ipmmu_domain_irq(struct 
> ipmmu_vmsa_domain *domain)
>  
>   /* Log fatal errors. */
>   if (status & IMSTR_MHIT)
> - dev_err_ratelimited(mmu->dev, "Multiple TLB hits @0x%08x\n",
> + dev_err_ratelimited(mmu->dev, "Multiple TLB hits @0x%lx\n",
>   iova);
>   if (status & IMSTR_ABORT)
> - dev_err_ratelimited(mmu->dev, "Page Table Walk Abort @0x%08x\n",
> + dev_err_ratelimited(mmu->dev, "Page Table Walk Abort @0x%lx\n",
>   iova);
>  
>   if (!(status & (IMSTR_PF | IMSTR_TF)))
> @@ -559,7 +563,7 @@ static irqreturn_t ipmmu_domain_irq(struct 
> ipmmu_vmsa_domain *domain)
>   return IRQ_HANDLED;
>  
>   dev_err_ratelimited(mmu->dev,
> - "Unhandled fault: status 0x%08x iova 0x%08x\n",
> + "Unhandled fault: status 0x%08x iova 0x%lx\n",
>   status, iova);
>  
>   return IRQ_HANDLED;

-- 
Regards,

Laurent Pinchart
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 4/7] iommu/ipmmu-vmsa: Make IPMMU_CTX_MAX unsigned

2019-02-20 Thread Laurent Pinchart
Hi Geert,

Thank you for the patch.

On Wed, Feb 20, 2019 at 04:05:28PM +0100, Geert Uytterhoeven wrote:
> Make the IPMMU_CTX_MAX constant unsigned, to match the type of
> ipmmu_features.number_of_contexts.
> 
> This allows to use plain min() instead of type-casting min_t().
> 
> Signed-off-by: Geert Uytterhoeven 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/iommu/ipmmu-vmsa.c | 5 ++---
>  1 file changed, 2 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index 4d07c26c97848b65..4e3134a9a52f8d87 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -36,7 +36,7 @@
>  #define arm_iommu_detach_device(...) do {} while (0)
>  #endif
>  
> -#define IPMMU_CTX_MAX 8
> +#define IPMMU_CTX_MAX 8U
>  
>  struct ipmmu_features {
>   bool use_ns_alias_offset;
> @@ -1060,8 +1060,7 @@ static int ipmmu_probe(struct platform_device *pdev)
>   if (mmu->features->use_ns_alias_offset)
>   mmu->base += IM_NS_ALIAS_OFFSET;
>  
> - mmu->num_ctx = min_t(unsigned int, IPMMU_CTX_MAX,
> -  mmu->features->number_of_contexts);
> + mmu->num_ctx = min(IPMMU_CTX_MAX, mmu->features->number_of_contexts);
>  
>   irq = platform_get_irq(pdev, 0);
>  

-- 
Regards,

Laurent Pinchart
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/7] iommu/ipmmu-vmsa: Call ipmmu_ctx_write_root() instead of open coding

2019-02-20 Thread Laurent Pinchart
Hi Geert,

Thank you for the patch.

On Wed, Feb 20, 2019 at 04:05:26PM +0100, Geert Uytterhoeven wrote:
> There is a helper to write to the root IPMMU instance's registers, so
> let's use it.
> 
> Signed-off-by: Geert Uytterhoeven 
> ---
>  drivers/iommu/ipmmu-vmsa.c | 3 +--
>  1 file changed, 1 insertion(+), 2 deletions(-)
> 
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index 9f2b781e20a0eba6..ac70cb967ff821c6 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -280,8 +280,7 @@ static void ipmmu_ctx_write_all(struct ipmmu_vmsa_domain 
> *domain,
>   ipmmu_write(domain->mmu,
>   domain->context_id * IM_CTX_SIZE + reg, data);
>  
> - ipmmu_write(domain->mmu->root,
> - domain->context_id * IM_CTX_SIZE + reg, data);
> + ipmmu_ctx_write_root(domain, reg, data);

This makes the code harder to read in my opinion, and I don't think it
bring much optimization.

>  }
>  
>  /* 
> -

-- 
Regards,

Laurent Pinchart
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/7] iommu/ipmmu-vmsa: Link IOMMUs and devices in sysfs

2019-02-20 Thread Laurent Pinchart
Hi Geert,

Thank you for the patch.

On Wed, Feb 20, 2019 at 04:05:25PM +0100, Geert Uytterhoeven wrote:
> As of commit 7af9a5fdb9e0ca33 ("iommu/ipmmu-vmsa: Use
> iommu_device_sysfs_add()/remove()"), IOMMU devices show up under
> /sys/class/iommus/, but their "devices" subdirectories are empty.
> Likewise, devices tied to an IOMMU do not have an "iommu" backlink.
> 
> Make sure all links are created, on both arm32 and arm64.
> 
> Signed-off-by: Geert Uytterhoeven 

Reviewed-by: Laurent Pinchart 

> ---
>  drivers/iommu/ipmmu-vmsa.c | 24 +---
>  1 file changed, 17 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index 9a380c10655e182d..9f2b781e20a0eba6 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -885,27 +885,37 @@ static int ipmmu_init_arm_mapping(struct device *dev)
>  
>  static int ipmmu_add_device(struct device *dev)
>  {
> + struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
>   struct iommu_group *group;
> + int ret;
>  
>   /*
>* Only let through devices that have been verified in xlate()
>*/
> - if (!to_ipmmu(dev))
> + if (!mmu)
>   return -ENODEV;
>  
> - if (IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_IOMMU_DMA))
> - return ipmmu_init_arm_mapping(dev);
> + if (IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_IOMMU_DMA)) {
> + ret = ipmmu_init_arm_mapping(dev);
> + if (ret)
> + return ret;
> + } else {
> + group = iommu_group_get_for_dev(dev);
> + if (IS_ERR(group))
> + return PTR_ERR(group);
>  
> - group = iommu_group_get_for_dev(dev);
> - if (IS_ERR(group))
> - return PTR_ERR(group);
> + iommu_group_put(group);
> + }
>  
> - iommu_group_put(group);
> + iommu_device_link(>iommu, dev);
>   return 0;
>  }
>  
>  static void ipmmu_remove_device(struct device *dev)
>  {
> + struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
> +
> + iommu_device_unlink(>iommu, dev);
>   arm_iommu_detach_device(dev);
>   iommu_group_remove_device(dev);
>  }

-- 
Regards,

Laurent Pinchart
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-02-20 Thread Pingfan Liu
On Mon, Feb 18, 2019 at 9:48 AM Dave Young  wrote:
>
> On 02/15/19 at 11:24am, Borislav Petkov wrote:
> > On Tue, Feb 12, 2019 at 04:48:16AM +0800, Dave Young wrote:
> > > Even we make it automatic in kernel, but we have to have some default
> > > value for swiotlb in case crashkernel can not find a free region under 4G.
> > > So this default value can not work for every use cases, people need
> > > manually use crashkernel=,low and crashkernel=,high in case
> > > crashkernel=X does not work.
> >
> > Why would the user need to find swiotlb range? The kernel has all the
> > information it requires at its finger tips in order to decide properly.
> >
> > The user wants a crashkernel range, the kernel tries the low range =>
> > no workie, then it tries the next range => workie but needs to allocate
> > swiotlb range so that DMA can happen too. Doh, then the kernel does
> > allocate that too.
>
> It is ideal if kernel can do it automatically, but I'm not sure if
> kernel can predict the swiotlb reserved size automatically.
>
Agreed, I think it is hard to decide the reserved size automatically.
We do not know the requirement for memory of ZONE_DMA32 at boot time.
The requirement depends on how many DMA32 devices, and the dynamic
payload of them.

> Let's add more people to seek for comments.
>
> >
> > Why would the user need to do anything here?!
> >
> > --
> > Regards/Gruss,
> > Boris.
> >
> > Good mailing practices for 400: avoid top-posting and trim the reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-02-20 Thread Pingfan Liu
On Wed, Feb 20, 2019 at 5:41 PM Dave Young  wrote:
>
> On 02/20/19 at 09:32am, Borislav Petkov wrote:
> > On Mon, Feb 18, 2019 at 09:48:20AM +0800, Dave Young wrote:
> > > It is ideal if kernel can do it automatically, but I'm not sure if
> > > kernel can predict the swiotlb reserved size automatically.
> >
> > Do you see how even more absurd this gets?
> >
> > If the kernel cannot know the swiotlb reserved size automatically, how
> > is the normal user even supposed to know?!
> >
I think swiotlb is bounce-buffer, if we enlarge it, we can get better
performance. Default size should be enough for platform to work. But
in case of reserving low memory for crashkernel, things are different.
The reserve low memory = swiotlb_size_or_default() + DMA32 memory for
devices. And the 2nd item in the right of the equation varies, based
on machine type and dynamic payload

> > I see swiotlb_size_or_default() so we have a sane default which we fall
> > back to. Now where's the problem with that?
>
> Good question, I expect some answer from people who know more about the
> background.  It would be good to have some actual test results, Pingfan
> is trying to do some tests.
>
Not following the idea, I do not think the following test result can
tell much. (We need various type of machine to get a final result.)
I do a quick test on "HPE ProLiant DL380 Gen10/ProLiant DL380 Gen10",
command line "crashkernel=180M,high crashkernel=64M,low" can work for
the 2nd kernel. Although it complained some memory shortage issue:
[7.655591] fbcon: mgadrmfb (fb0) is primary device
[7.655639] Console: switching to colour frame buffer device 128x48
[7.660609] systemd-udevd: page allocation failure: order:0, mode:0x280d4
[7.660611] CPU: 0 PID: 180 Comm: systemd-udevd Not tainted
3.10.0-957.el7.x86_64 #1
[7.660612] Hardware name: HPE ProLiant DL380 Gen10/ProLiant DL380
Gen10, BIOS U30 06/20/2018
[7.660612] Call Trace:
[7.660621]  [] dump_stack+0x19/0x1b
[7.660625]  [] warn_alloc_failed+0x110/0x180
[7.660628]  [] __alloc_pages_slowpath+0x6b6/0x724
[7.660631]  [] __alloc_pages_nodemask+0x405/0x420
[7.660633]  [] alloc_pages_current+0x98/0x110
[7.660638]  [] ttm_pool_populate+0x3d2/0x4b0 [ttm]
[7.660641]  [] ttm_tt_populate+0x7d/0x90 [ttm]
[7.660644]  [] ttm_bo_kmap+0x124/0x240 [ttm]
[7.660648]  [] ? __wake_up_sync_key+0x4f/0x60
[7.660650]  [] mga_dirty_update+0x25e/0x310 [mgag200]
[7.660653]  [] mga_imageblit+0x2f/0x40 [mgag200]
[7.660657]  [] soft_cursor+0x1ba/0x260
[7.660659]  [] bit_cursor+0x663/0x6a0
[7.660662]  [] ? console_trylock+0x19/0x70
[7.660664]  [] fbcon_cursor+0x13d/0x1c0
[7.660665]  [] ? bit_clear+0x120/0x120
[7.660668]  [] hide_cursor+0x2e/0xa0
[7.660669]  [] redraw_screen+0x188/0x270
[7.660671]  [] do_bind_con_driver+0x316/0x340
[7.660672]  [] do_take_over_console+0x49/0x60
[7.660674]  [] do_fbcon_takeover+0x63/0xd0
[7.660675]  [] fbcon_event_notify+0x61d/0x730
[7.660678]  [] notifier_call_chain+0x4f/0x70
[7.660681]  [] __blocking_notifier_call_chain+0x4d/0x70
[7.660683]  [] blocking_notifier_call_chain+0x16/0x20
[7.660684]  [] fb_notifier_call_chain+0x1b/0x20
[7.660686]  [] register_framebuffer+0x1f6/0x340
[7.660690]  []
__drm_fb_helper_initial_config_and_unlock+0x252/0x3e0 [drm_kms_helper]
[7.660694]  []
drm_fb_helper_initial_config+0x3e/0x50 [drm_kms_helper]
[7.660697]  [] mgag200_fbdev_init+0xe3/0x100 [mgag200]
[7.660699]  [] mgag200_modeset_init+0x154/0x1d0 [mgag200]
[7.660701]  [] mgag200_driver_load+0x41d/0x5b0 [mgag200]
[7.660708]  [] drm_dev_register+0x15f/0x1f0 [drm]
[7.660711]  [] ? pci_enable_device_flags+0xe8/0x140
[7.660718]  [] drm_get_pci_dev+0x8a/0x1a0 [drm]
[7.660720]  [] mga_pci_probe+0x9b/0xc0 [mgag200]
[7.660722]  [] local_pci_probe+0x4a/0xb0
[7.660723]  [] pci_device_probe+0x109/0x160
[7.660726]  [] driver_probe_device+0xc5/0x3e0
[7.660727]  [] __driver_attach+0x93/0xa0
[7.660728]  [] ? __device_attach+0x50/0x50
[7.660730]  [] bus_for_each_dev+0x75/0xc0
[7.660731]  [] driver_attach+0x1e/0x20
[7.660733]  [] bus_add_driver+0x200/0x2d0
[7.660734]  [] driver_register+0x64/0xf0
[7.660735]  [] __pci_register_driver+0xa5/0xc0
[7.660737]  [] ? 0xc012cfff
[7.660739]  [] mgag200_init+0x39/0x1000 [mgag200]
[7.660742]  [] do_one_initcall+0xba/0x240
[7.660745]  [] load_module+0x272c/0x2bc0
[7.660748]  [] ? ddebug_proc_write+0x100/0x100
[7.660750]  [] SyS_init_module+0xef/0x140
[7.660752]  [] system_call_fastpath+0x22/0x27
[7.660753] Mem-Info:
[7.660756] active_anon:3364 inactive_anon:6661 isolated_anon:0
[7.660756]  active_file:0 inactive_file:0 isolated_file:0
[7.660756]  unevictable:0 dirty:0 writeback:0 unstable:0
[7.660756]  slab_reclaimable:1492 slab_unreclaimable:3116
[7.660756]  mapped:1223 shmem:8449 pagetables:179 bounce:0
[7.660756]  

[PATCH 1/7] iommu/ipmmu-vmsa: Link IOMMUs and devices in sysfs

2019-02-20 Thread Geert Uytterhoeven
As of commit 7af9a5fdb9e0ca33 ("iommu/ipmmu-vmsa: Use
iommu_device_sysfs_add()/remove()"), IOMMU devices show up under
/sys/class/iommus/, but their "devices" subdirectories are empty.
Likewise, devices tied to an IOMMU do not have an "iommu" backlink.

Make sure all links are created, on both arm32 and arm64.

Signed-off-by: Geert Uytterhoeven 
---
 drivers/iommu/ipmmu-vmsa.c | 24 +---
 1 file changed, 17 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 9a380c10655e182d..9f2b781e20a0eba6 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -885,27 +885,37 @@ static int ipmmu_init_arm_mapping(struct device *dev)
 
 static int ipmmu_add_device(struct device *dev)
 {
+   struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
struct iommu_group *group;
+   int ret;
 
/*
 * Only let through devices that have been verified in xlate()
 */
-   if (!to_ipmmu(dev))
+   if (!mmu)
return -ENODEV;
 
-   if (IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_IOMMU_DMA))
-   return ipmmu_init_arm_mapping(dev);
+   if (IS_ENABLED(CONFIG_ARM) && !IS_ENABLED(CONFIG_IOMMU_DMA)) {
+   ret = ipmmu_init_arm_mapping(dev);
+   if (ret)
+   return ret;
+   } else {
+   group = iommu_group_get_for_dev(dev);
+   if (IS_ERR(group))
+   return PTR_ERR(group);
 
-   group = iommu_group_get_for_dev(dev);
-   if (IS_ERR(group))
-   return PTR_ERR(group);
+   iommu_group_put(group);
+   }
 
-   iommu_group_put(group);
+   iommu_device_link(>iommu, dev);
return 0;
 }
 
 static void ipmmu_remove_device(struct device *dev)
 {
+   struct ipmmu_vmsa_device *mmu = to_ipmmu(dev);
+
+   iommu_device_unlink(>iommu, dev);
arm_iommu_detach_device(dev);
iommu_group_remove_device(dev);
 }
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 7/7] iommu/ipmmu-vmsa: Add suspend/resume support

2019-02-20 Thread Geert Uytterhoeven
During PSCI system suspend, R-Car Gen3 SoCs are powered down, and all
IPMMU state is lost.  Hence after s2ram, devices wired behind an IPMMU,
and configured to use it, will see their DMA operations hang.

To fix this, restore all IPMMU contexts, and re-enable all active
micro-TLBs during system resume.

To avoid overhead on platforms not needing it, the resume code has a
build time dependency on sleep and PSCI support, and a runtime
dependency on PSCI.

Signed-off-by: Geert Uytterhoeven 
---
This patch takes a different approach than the BSP, which implements a
bulk save/restore of all registers during system suspend/resume.

 drivers/iommu/ipmmu-vmsa.c | 52 +-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 92a766dd8b459f0c..5d22139914e8f033 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -36,7 +37,10 @@
 #define arm_iommu_detach_device(...)   do {} while (0)
 #endif
 
-#define IPMMU_CTX_MAX 8U
+#define IPMMU_CTX_MAX  8U
+#define IPMMU_CTX_INVALID  -1
+
+#define IPMMU_UTLB_MAX 48U
 
 struct ipmmu_features {
bool use_ns_alias_offset;
@@ -58,6 +62,7 @@ struct ipmmu_vmsa_device {
spinlock_t lock;/* Protects ctx and domains[] */
DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
struct ipmmu_vmsa_domain *domains[IPMMU_CTX_MAX];
+   s8 utlb_ctx[IPMMU_UTLB_MAX];
 
struct iommu_group *group;
struct dma_iommu_mapping *mapping;
@@ -335,6 +340,7 @@ static void ipmmu_utlb_enable(struct ipmmu_vmsa_domain 
*domain,
ipmmu_write(mmu, IMUCTR(utlb),
IMUCTR_TTSEL_MMU(domain->context_id) | IMUCTR_FLUSH |
IMUCTR_MMUEN);
+   mmu->utlb_ctx[utlb] = domain->context_id;
 }
 
 /*
@@ -346,6 +352,7 @@ static void ipmmu_utlb_disable(struct ipmmu_vmsa_domain 
*domain,
struct ipmmu_vmsa_device *mmu = domain->mmu;
 
ipmmu_write(mmu, IMUCTR(utlb), 0);
+   mmu->utlb_ctx[utlb] = IPMMU_CTX_INVALID;
 }
 
 static void ipmmu_tlb_flush_all(void *cookie)
@@ -1043,6 +1050,7 @@ static int ipmmu_probe(struct platform_device *pdev)
spin_lock_init(>lock);
bitmap_zero(mmu->ctx, IPMMU_CTX_MAX);
mmu->features = of_device_get_match_data(>dev);
+   memset(mmu->utlb_ctx, IPMMU_CTX_INVALID, mmu->features->num_utlbs);
dma_set_mask_and_coherent(>dev, DMA_BIT_MASK(40));
 
/* Map I/O memory and request IRQ. */
@@ -1158,10 +1166,52 @@ static int ipmmu_remove(struct platform_device *pdev)
return 0;
 }
 
+#if defined(CONFIG_PM_SLEEP) && defined(CONFIG_ARM_PSCI_FW)
+static int ipmmu_resume_noirq(struct device *dev)
+{
+   struct ipmmu_vmsa_device *mmu = dev_get_drvdata(dev);
+   unsigned int i;
+
+   /* This is the best we can do to check for the presence of PSCI */
+   if (!psci_ops.cpu_suspend)
+   return 0;
+
+   /* Reset root MMU and restore contexts */
+   if (ipmmu_is_root(mmu)) {
+   ipmmu_device_reset(mmu);
+
+   for (i = 0; i < mmu->num_ctx; i++) {
+   if (!mmu->domains[i])
+   continue;
+
+   ipmmu_context_init(mmu->domains[i]);
+   }
+   }
+
+   /* Re-enable active micro-TLBs */
+   for (i = 0; i < mmu->features->num_utlbs; i++) {
+   if (mmu->utlb_ctx[i] == IPMMU_CTX_INVALID)
+   continue;
+
+   ipmmu_utlb_enable(mmu->root->domains[mmu->utlb_ctx[i]], i);
+   }
+
+   return 0;
+}
+
+static const struct dev_pm_ops ipmmu_pm  = {
+   SET_NOIRQ_SYSTEM_SLEEP_PM_OPS(NULL, ipmmu_resume_noirq)
+};
+#define DEV_PM_OPS _pm
+#else
+#define DEV_PM_OPS NULL
+#endif /* CONFIG_PM_SLEEP && CONFIG_ARM_PSCI_FW */
+
 static struct platform_driver ipmmu_driver = {
.driver = {
.name = "ipmmu-vmsa",
.of_match_table = of_match_ptr(ipmmu_of_ids),
+   .pm = DEV_PM_OPS,
},
.probe = ipmmu_probe,
.remove = ipmmu_remove,
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/7] iommu/ipmmu-vmsa: Suspend/resume support and assorted cleanups

2019-02-20 Thread Geert Uytterhoeven
Hi Jörg, Magnus,

On R-Car Gen3 systems with PSCI, PSCI may power down the SoC during
system suspend, thus losing all IOMMU state.  Hence after s2ram, devices
behind an IPMMU (e.g. SATA), and configured to use it, will fail to
complete their I/O operations.

This patch series adds suspend/resume support to the Renesas IPMMU-VMSA
IOMMU driver, and performs some smaller cleanups and fixes during the
process.  Most patches are fairly independent, except for patch 7/7,
which depends on patches 5/7 and 6/7.

This has been tested on Salvator-XS with R-Car H3 ES2.0, with IPMMU
suport for SATA enabled.  To play safe, the resume operation has also
been tested on R-Car M2-W, where it is currently not enabled due to the
absence of PSCI in the firmware.

Thanks for your comments!

Geert Uytterhoeven (7):
  iommu/ipmmu-vmsa: Link IOMMUs and devices in sysfs
  iommu/ipmmu-vmsa: Call ipmmu_ctx_write_root() instead of open coding
  iommu/ipmmu-vmsa: Prepare to handle 40-bit error addresses
  iommu/ipmmu-vmsa: Make IPMMU_CTX_MAX unsigned
  iommu/ipmmu-vmsa: Move num_utlbs to SoC-specific features
  iommu/ipmmu-vmsa: Extract hardware context initialization
  iommu/ipmmu-vmsa: Add suspend/resume support

 drivers/iommu/ipmmu-vmsa.c | 194 +
 1 file changed, 131 insertions(+), 63 deletions(-)

-- 
2.17.1

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 2/7] iommu/ipmmu-vmsa: Call ipmmu_ctx_write_root() instead of open coding

2019-02-20 Thread Geert Uytterhoeven
There is a helper to write to the root IPMMU instance's registers, so
let's use it.

Signed-off-by: Geert Uytterhoeven 
---
 drivers/iommu/ipmmu-vmsa.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 9f2b781e20a0eba6..ac70cb967ff821c6 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -280,8 +280,7 @@ static void ipmmu_ctx_write_all(struct ipmmu_vmsa_domain 
*domain,
ipmmu_write(domain->mmu,
domain->context_id * IM_CTX_SIZE + reg, data);
 
-   ipmmu_write(domain->mmu->root,
-   domain->context_id * IM_CTX_SIZE + reg, data);
+   ipmmu_ctx_write_root(domain, reg, data);
 }
 
 /* 
-
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 5/7] iommu/ipmmu-vmsa: Move num_utlbs to SoC-specific features

2019-02-20 Thread Geert Uytterhoeven
The maximum number of micro-TLBs per IPMMU instance is not fixed, but
depends on the SoC type.  Hence move it from struct ipmmu_vmsa_device to
struct ipmmu_features, and set up the correct value for both R-Car Gen2
and Gen3 SoCs.

Note that currently no code uses this value.

Signed-off-by: Geert Uytterhoeven 
---
 drivers/iommu/ipmmu-vmsa.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 4e3134a9a52f8d87..0a21e734466eb1bd 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -42,6 +42,7 @@ struct ipmmu_features {
bool use_ns_alias_offset;
bool has_cache_leaf_nodes;
unsigned int number_of_contexts;
+   unsigned int num_utlbs;
bool setup_imbuscr;
bool twobit_imttbcr_sl0;
bool reserved_context;
@@ -53,7 +54,6 @@ struct ipmmu_vmsa_device {
struct iommu_device iommu;
struct ipmmu_vmsa_device *root;
const struct ipmmu_features *features;
-   unsigned int num_utlbs;
unsigned int num_ctx;
spinlock_t lock;/* Protects ctx and domains[] */
DECLARE_BITMAP(ctx, IPMMU_CTX_MAX);
@@ -972,6 +972,7 @@ static const struct ipmmu_features ipmmu_features_default = 
{
.use_ns_alias_offset = true,
.has_cache_leaf_nodes = false,
.number_of_contexts = 1, /* software only tested with one context */
+   .num_utlbs = 32,
.setup_imbuscr = true,
.twobit_imttbcr_sl0 = false,
.reserved_context = false,
@@ -981,6 +982,7 @@ static const struct ipmmu_features ipmmu_features_rcar_gen3 
= {
.use_ns_alias_offset = false,
.has_cache_leaf_nodes = true,
.number_of_contexts = 8,
+   .num_utlbs = 48,
.setup_imbuscr = false,
.twobit_imttbcr_sl0 = true,
.reserved_context = true,
@@ -1033,7 +1035,6 @@ static int ipmmu_probe(struct platform_device *pdev)
}
 
mmu->dev = >dev;
-   mmu->num_utlbs = 48;
spin_lock_init(>lock);
bitmap_zero(mmu->ctx, IPMMU_CTX_MAX);
mmu->features = of_device_get_match_data(>dev);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 6/7] iommu/ipmmu-vmsa: Extract hardware context initialization

2019-02-20 Thread Geert Uytterhoeven
ipmmu_domain_init_context() takes care of (1) initializing the software
domain, and (2) initializing the hardware context for the domain.

Extract the code to initialize the hardware context into a new subroutine
ipmmu_context_init(), to prepare for later reuse.

Signed-off-by: Geert Uytterhoeven 
---
 drivers/iommu/ipmmu-vmsa.c | 91 --
 1 file changed, 48 insertions(+), 43 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 0a21e734466eb1bd..92a766dd8b459f0c 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -404,52 +404,10 @@ static void ipmmu_domain_free_context(struct 
ipmmu_vmsa_device *mmu,
spin_unlock_irqrestore(>lock, flags);
 }
 
-static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
+static void ipmmu_context_init(struct ipmmu_vmsa_domain *domain)
 {
u64 ttbr;
u32 tmp;
-   int ret;
-
-   /*
-* Allocate the page table operations.
-*
-* VMSA states in section B3.6.3 "Control of Secure or Non-secure memory
-* access, Long-descriptor format" that the NStable bit being set in a
-* table descriptor will result in the NStable and NS bits of all child
-* entries being ignored and considered as being set. The IPMMU seems
-* not to comply with this, as it generates a secure access page fault
-* if any of the NStable and NS bits isn't set when running in
-* non-secure mode.
-*/
-   domain->cfg.quirks = IO_PGTABLE_QUIRK_ARM_NS;
-   domain->cfg.pgsize_bitmap = SZ_1G | SZ_2M | SZ_4K;
-   domain->cfg.ias = 32;
-   domain->cfg.oas = 40;
-   domain->cfg.tlb = _gather_ops;
-   domain->io_domain.geometry.aperture_end = DMA_BIT_MASK(32);
-   domain->io_domain.geometry.force_aperture = true;
-   /*
-* TODO: Add support for coherent walk through CCI with DVM and remove
-* cache handling. For now, delegate it to the io-pgtable code.
-*/
-   domain->cfg.iommu_dev = domain->mmu->root->dev;
-
-   /*
-* Find an unused context.
-*/
-   ret = ipmmu_domain_allocate_context(domain->mmu->root, domain);
-   if (ret < 0)
-   return ret;
-
-   domain->context_id = ret;
-
-   domain->iop = alloc_io_pgtable_ops(ARM_32_LPAE_S1, >cfg,
-  domain);
-   if (!domain->iop) {
-   ipmmu_domain_free_context(domain->mmu->root,
- domain->context_id);
-   return -EINVAL;
-   }
 
/* TTBR0 */
ttbr = domain->cfg.arm_lpae_s1_cfg.ttbr[0];
@@ -495,7 +453,54 @@ static int ipmmu_domain_init_context(struct 
ipmmu_vmsa_domain *domain)
 */
ipmmu_ctx_write_all(domain, IMCTR,
IMCTR_INTEN | IMCTR_FLUSH | IMCTR_MMUEN);
+}
+
+static int ipmmu_domain_init_context(struct ipmmu_vmsa_domain *domain)
+{
+   int ret;
+
+   /*
+* Allocate the page table operations.
+*
+* VMSA states in section B3.6.3 "Control of Secure or Non-secure memory
+* access, Long-descriptor format" that the NStable bit being set in a
+* table descriptor will result in the NStable and NS bits of all child
+* entries being ignored and considered as being set. The IPMMU seems
+* not to comply with this, as it generates a secure access page fault
+* if any of the NStable and NS bits isn't set when running in
+* non-secure mode.
+*/
+   domain->cfg.quirks = IO_PGTABLE_QUIRK_ARM_NS;
+   domain->cfg.pgsize_bitmap = SZ_1G | SZ_2M | SZ_4K;
+   domain->cfg.ias = 32;
+   domain->cfg.oas = 40;
+   domain->cfg.tlb = _gather_ops;
+   domain->io_domain.geometry.aperture_end = DMA_BIT_MASK(32);
+   domain->io_domain.geometry.force_aperture = true;
+   /*
+* TODO: Add support for coherent walk through CCI with DVM and remove
+* cache handling. For now, delegate it to the io-pgtable code.
+*/
+   domain->cfg.iommu_dev = domain->mmu->root->dev;
+
+   /*
+* Find an unused context.
+*/
+   ret = ipmmu_domain_allocate_context(domain->mmu->root, domain);
+   if (ret < 0)
+   return ret;
+
+   domain->context_id = ret;
+
+   domain->iop = alloc_io_pgtable_ops(ARM_32_LPAE_S1, >cfg,
+  domain);
+   if (!domain->iop) {
+   ipmmu_domain_free_context(domain->mmu->root,
+ domain->context_id);
+   return -EINVAL;
+   }
 
+   ipmmu_context_init(domain);
return 0;
 }
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 3/7] iommu/ipmmu-vmsa: Prepare to handle 40-bit error addresses

2019-02-20 Thread Geert Uytterhoeven
On R-Car Gen3, the faulting virtual address is a 40-bit address, and
comprised of two registers.  Read the upper address part, and combine
both parts, when running on a 64-bit system.

Signed-off-by: Geert Uytterhoeven 
---
Apart from this, the driver doesn't support 40-bit IOVA addresses yet.
---
 drivers/iommu/ipmmu-vmsa.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index ac70cb967ff821c6..4d07c26c97848b65 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -186,7 +186,9 @@ static struct ipmmu_vmsa_device *to_ipmmu(struct device 
*dev)
 #define IMMAIR_ATTR_IDX_WBRWA  1
 #define IMMAIR_ATTR_IDX_DEV2
 
-#define IMEAR  0x0030
+#define IMEAR  0x0030  /* R-Car Gen2 */
+#define IMELAR IMEAR   /* R-Car Gen3 */
+#define IMEUAR 0x0034  /* R-Car Gen3 */
 
 #define IMPCTR 0x0200
 #define IMPSTR 0x0208
@@ -521,14 +523,16 @@ static irqreturn_t ipmmu_domain_irq(struct 
ipmmu_vmsa_domain *domain)
 {
const u32 err_mask = IMSTR_MHIT | IMSTR_ABORT | IMSTR_PF | IMSTR_TF;
struct ipmmu_vmsa_device *mmu = domain->mmu;
+   unsigned long iova;
u32 status;
-   u32 iova;
 
status = ipmmu_ctx_read_root(domain, IMSTR);
if (!(status & err_mask))
return IRQ_NONE;
 
-   iova = ipmmu_ctx_read_root(domain, IMEAR);
+   iova = ipmmu_ctx_read_root(domain, IMELAR);
+   if (IS_ENABLED(CONFIG_64BIT))
+   iova |= (u64)ipmmu_ctx_read_root(domain, IMEUAR) << 32;
 
/*
 * Clear the error status flags. Unlike traditional interrupt flag
@@ -540,10 +544,10 @@ static irqreturn_t ipmmu_domain_irq(struct 
ipmmu_vmsa_domain *domain)
 
/* Log fatal errors. */
if (status & IMSTR_MHIT)
-   dev_err_ratelimited(mmu->dev, "Multiple TLB hits @0x%08x\n",
+   dev_err_ratelimited(mmu->dev, "Multiple TLB hits @0x%lx\n",
iova);
if (status & IMSTR_ABORT)
-   dev_err_ratelimited(mmu->dev, "Page Table Walk Abort @0x%08x\n",
+   dev_err_ratelimited(mmu->dev, "Page Table Walk Abort @0x%lx\n",
iova);
 
if (!(status & (IMSTR_PF | IMSTR_TF)))
@@ -559,7 +563,7 @@ static irqreturn_t ipmmu_domain_irq(struct 
ipmmu_vmsa_domain *domain)
return IRQ_HANDLED;
 
dev_err_ratelimited(mmu->dev,
-   "Unhandled fault: status 0x%08x iova 0x%08x\n",
+   "Unhandled fault: status 0x%08x iova 0x%lx\n",
status, iova);
 
return IRQ_HANDLED;
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 4/7] iommu/ipmmu-vmsa: Make IPMMU_CTX_MAX unsigned

2019-02-20 Thread Geert Uytterhoeven
Make the IPMMU_CTX_MAX constant unsigned, to match the type of
ipmmu_features.number_of_contexts.

This allows to use plain min() instead of type-casting min_t().

Signed-off-by: Geert Uytterhoeven 
---
 drivers/iommu/ipmmu-vmsa.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
index 4d07c26c97848b65..4e3134a9a52f8d87 100644
--- a/drivers/iommu/ipmmu-vmsa.c
+++ b/drivers/iommu/ipmmu-vmsa.c
@@ -36,7 +36,7 @@
 #define arm_iommu_detach_device(...)   do {} while (0)
 #endif
 
-#define IPMMU_CTX_MAX 8
+#define IPMMU_CTX_MAX 8U
 
 struct ipmmu_features {
bool use_ns_alias_offset;
@@ -1060,8 +1060,7 @@ static int ipmmu_probe(struct platform_device *pdev)
if (mmu->features->use_ns_alias_offset)
mmu->base += IM_NS_ALIAS_OFFSET;
 
-   mmu->num_ctx = min_t(unsigned int, IPMMU_CTX_MAX,
-mmu->features->number_of_contexts);
+   mmu->num_ctx = min(IPMMU_CTX_MAX, mmu->features->number_of_contexts);
 
irq = platform_get_irq(pdev, 0);
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/1] iommu: Bind process address spaces to devices

2019-02-20 Thread Jean-Philippe Brucker
Add bind() and unbind() operations to the IOMMU API. Bind() returns a
PASID to the device driver (by convention, a 20-bit system-wide ID
representing the address space). When programming DMA addresses, device
drivers include this PASID in a device-specific manner, to let the device
access the given address space. Since the process memory may be paged out,
device and IOMMU must support I/O page faults (e.g. PCI PRI).

Device drivers pass an mm_exit() callback to bind(), that is called by the
IOMMU driver if the process exits before the device driver called
unbind(). In mm_exit(), device driver should disable DMA from the given
context, so that the core IOMMU can reallocate the PASID.

To use these functions, device driver must first enable the
IOMMU_DEV_FEAT_SVA device feature with iommu_dev_enable_feature().

Signed-off-by: Jean-Philippe Brucker 
---
 drivers/iommu/iommu.c | 104 ++
 include/linux/iommu.h |  24 ++
 2 files changed, 128 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 1629354255c3..5feba98566b2 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2351,3 +2351,107 @@ int iommu_aux_get_pasid(struct iommu_domain *domain, 
struct device *dev)
return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_aux_get_pasid);
+
+/**
+ * iommu_sva_bind_device() - Bind a process address space to a device
+ * @dev: the device
+ * @mm: the mm to bind, caller must hold a reference to it
+ * @pasid: valid address where the PASID will be stored
+ * @mm_exit: notifier function to call when the mm exits
+ * @drvdata: private data passed to the mm exit handler
+ *
+ * Create a bond between device and task, allowing the device to access the mm
+ * using the returned PASID. If unbind() isn't called first, a subsequent 
bind()
+ * for the same device and mm fails with -EEXIST.
+ *
+ * iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) must be called first, to
+ * initialize the required SVA features.
+ *
+ * @mm_exit is called when the mm is about to be torn down by exit_mmap. After
+ * @mm_exit returns, the device must not issue any more transaction with the
+ * PASID given as argument.
+ *
+ * The @mm_exit handler is allowed to sleep. Be careful about the locks taken 
in
+ * @mm_exit, because they might lead to deadlocks if they are also held when
+ * dropping references to the mm. Consider the following call chain:
+ *   mutex_lock(A); mmput(mm) -> exit_mm() -> @mm_exit() -> mutex_lock(A)
+ * Using mmput_async() prevents this scenario.
+ *
+ * On success, 0 is returned and @pasid contains a valid ID. Otherwise, an 
error
+ * is returned.
+ */
+int iommu_sva_bind_device(struct device *dev, struct mm_struct *mm, int *pasid,
+ iommu_mm_exit_handler_t mm_exit, void *drvdata)
+{
+   int ret = -EINVAL;
+   struct iommu_group *group;
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+   if (!pasid)
+   return -EINVAL;
+
+   if (!ops || !ops->bind_mm)
+   return -ENODEV;
+
+   group = iommu_group_get(dev);
+   if (!group)
+   return -ENODEV;
+
+   /* Ensure device count and domain don't change while we're binding */
+   mutex_lock(>mutex);
+
+   /*
+* To keep things simple, SVA currently doesn't support IOMMU groups
+* with more than one device. Existing SVA-capable systems are not
+* affected by the problems that required IOMMU groups (lack of ACS
+* isolation, device ID aliasing and other hardware issues).
+*/
+   if (iommu_group_device_count(group) != 1)
+   goto out_unlock;
+
+   ret = ops->bind_mm(dev, mm, pasid, mm_exit, drvdata);
+
+out_unlock:
+   mutex_unlock(>mutex);
+   iommu_group_put(group);
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(iommu_sva_bind_device);
+
+/**
+ * iommu_sva_unbind_device() - Remove a bond created with iommu_sva_bind_device
+ * @dev: the device
+ * @pasid: the pasid returned by bind()
+ *
+ * Remove bond between device and address space identified by @pasid. Users
+ * should not call unbind() if the corresponding mm exited (as the PASID might
+ * have been reallocated for another process).
+ *
+ * The device must not be issuing any more transaction for this PASID. All
+ * outstanding page requests for this PASID must have been flushed to the 
IOMMU.
+ *
+ * Returns 0 on success, or an error value
+ */
+int iommu_sva_unbind_device(struct device *dev, int pasid)
+{
+   int ret = -EINVAL;
+   struct iommu_group *group;
+   const struct iommu_ops *ops = dev->bus->iommu_ops;
+
+   if (!ops || !ops->unbind_mm)
+   return -ENODEV;
+
+   group = iommu_group_get(dev);
+   if (!group)
+   return -ENODEV;
+
+   mutex_lock(>mutex);
+   ret = ops->unbind_mm(dev, pasid);
+   mutex_unlock(>mutex);
+
+   iommu_group_put(group);
+
+   return ret;
+}

[PATCH 0/1] IOMMU SVA device driver interface

2019-02-20 Thread Jean-Philippe Brucker
This series focuses on the device driver API for SVA, as I'd like to get
at least parts of it merged in v5.2 [1]. If we can get consensus on the
interface, it will be easier for device drivers to start adding SVA
support while we're improving the IOMMU side.

Since v3 [2] I changed the interface as requested, and changed iommu-sva
to be a set of helpers rather than the main entry point. This way
intel-svm and amd_iommu_v2 can support the common bind() API without
having to move to the generic implementation (which I'm still rewriting).

The four dev_feature functions are implemented by Lu Baolu's IOMMU-aware
mdev series [3].
 
* iommu_dev_has_feature(dev, IOMMU_DEV_FEAT_SVA) -> true/false
  - reports if SVA is supported by IOMMU and device
  - iommu_ops->dev_has_feat()

* iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_SVA) -> 0/err
  - enables SVA if IOMMU and device support it.
  - iommu_ops->dev_enable_feat()

* iommu_dev_disable_feature(dev, IOMMU_DEV_FEAT_SVA) -> 0/err
  - disable SVA if it was enabled
  - iommu_ops->dev_disable_feat()

* iommu_dev_feature_enabled(dev, IOMMU_DEV_FEAT_SVA) -> true/false
  - shows the SVA state (could be implemented in iommu.c)
  - iommu_ops->dev_feature_enabled()

* iommu_sva_bind(dev, mm, , mm_exit_cb, drvdata) -> 0/err
  - Binds dev to mm
  - Sets a callback to disable the PASID, in case mm exits before the dd
calls unbind()
  - iommu_ops->bind_mm()

* iommu_sva_unbind(dev, )
  - Unbinds dev from mm
  - iommu_ops->unbind_mm()

Please have a look and tell me what needs to change to be compatible
with intel-svm, amd_iommu_v2 and AMD KFD. As the only SVA user currently
upstream AMD KFD will get preferential treatment, but to keep this
simple I didn't add the min/max_pasid params that AMD KFD requires to
set non-discoverable PASID limits. I could add iommu_dev_set_sva_param()
or something like that?

Thanks,
Jean

[1] The full patch stack contains:
* bind()/unbind() API
* fault reporting API
* ATS for SMMUv3
* generic IOASID
* IO mm tracking
* IO page fault handler
* PRI + stall for SMMUv3
* PASID for SMMUv3
* VFIO usage example
I'd like to get at least the first three into v5.2

git://linux-arm.org/linux-jpb.git sva/current

http://www.linux-arm.org/git?p=linux-jpb.git;a=shortlog;h=refs/heads/sva/current
Passes all my tests, but needs some refinement.

[2] [PATCH v3 00/10] Shared Virtual Addressing for the IOMMU
https://www.spinics.net/lists/iommu/msg30076.html
See also, for more recent interface discussion:
[RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3
https://www.spinics.net/lists/iommu/msg30637.html

[3] [PATCH v6 0/9] vfio/mdev: IOMMU aware mediated device

https://lore.kernel.org/lkml/20190213040301.23021-10-baolu...@linux.intel.com/T/


---
If you're not sure what this all means:

Shared Virtual Addressing (SVA) is the ability to share process address
spaces with devices. It is called "SVM" (Shared Virtual Memory) by OpenCL
and some IOMMU architectures, but since that abbreviation is already used
for AMD virtualisation in Linux (Secure Virtual Machine), we prefer the
less ambiguous "SVA".

Sharing process address spaces with devices allows to rely on core kernel
memory management for DMA, removing some complexity from application and
device drivers. After binding to a device, applications can instruct it to
perform DMA on buffers obtained with malloc.

The device, bus and IOMMU must support the following features:

* Multiple address spaces per device, for example using the PCI PASID
  (Process Address Space ID) extension. The IOMMU driver allocates a
  PASID and the device uses it in DMA transactions.
* I/O Page Faults (IOPF), for example PCI PRI (Page Request Interface) or
  Arm SMMU stall. The core mm handles translation faults from the IOMMU.
* MMU and IOMMU implement compatible page table formats.
---

Jean-Philippe Brucker (1):
  iommu: Bind process address spaces to devices

 drivers/iommu/iommu.c | 104 ++
 include/linux/iommu.h |  24 ++
 2 files changed, 128 insertions(+)

-- 
2.20.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/3] iommu: Document iommu_ops.iotlb_sync_map()

2019-02-20 Thread Dmitry Osipenko
20.02.2019 16:00, Geert Uytterhoeven пишет:
> Add missing kerneldoc for iommu_ops.iotlb_sync_map().
> 
> Fixes: 1d7ae53b152dbc5b ("iommu: Introduce iotlb_sync_map callback")
> Signed-off-by: Geert Uytterhoeven 
> ---
>  include/linux/iommu.h | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
> index 848fb07026b67169..3ed5d372b090af54 100644
> --- a/include/linux/iommu.h
> +++ b/include/linux/iommu.h
> @@ -169,6 +169,7 @@ struct iommu_resv_region {
>   * @unmap: unmap a physically contiguous memory region from an iommu domain
>   * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
>   * @iotlb_range_add: Add a given iova range to the flush queue for this 
> domain
> + * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
>   * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty 
> flush
>   *queue
>   * @iova_to_phys: translate iova to physical address
> 

Pretty sure there was a kerneldoc in some version of the original patch that 
introduced the iotlb_sync_map(), but probably it got lost after a rebase. The 
kerneldoc comment is correct, thank you! BTW, for some reason gmail marked this 
series as a spam, maybe you need to check the email headers and whatnot.

Reviewed-by: Dmitry Osipenko 
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 0/3] iommu: Kerneldoc improvements

2019-02-20 Thread Geert Uytterhoeven
Hi Jörg,

This series contains a fix for an incorrect kerneldoc parameter, and
adds the missing kerneldoc for two recently added IOMMU methods.

Thanks!

Geert Uytterhoeven (3):
  iommu: Fix kerneldoc for iommu_ops.flush_iotlb_all()
  iommu: Document iommu_ops.iotlb_sync_map()
  iommu: Document iommu_ops.is_attach_deferred()

 include/linux/iommu.h | 5 -
 1 file changed, 4 insertions(+), 1 deletion(-)

-- 
2.17.1

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- ge...@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH 3/3] iommu: Document iommu_ops.is_attach_deferred()

2019-02-20 Thread Geert Uytterhoeven
Add missing kerneldoc for iommu_ops.is_attach_deferred().

Fixes: e01d1913b0d08171 ("iommu: Add is_attach_deferred call-back to iommu-ops")
Signed-off-by: Geert Uytterhoeven 
---
 include/linux/iommu.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 3ed5d372b090af54..ffbbc7e39ceeba3e 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -184,6 +184,8 @@ struct iommu_resv_region {
  * @domain_window_enable: Configure and enable a particular window for a domain
  * @domain_window_disable: Disable a particular window for a domain
  * @of_xlate: add OF master IDs to iommu grouping
+ * @is_attach_deferred: Check if domain attach should be deferred from iommu
+ *  driver init to device driver init (default no)
  * @pgsize_bitmap: bitmap of all possible supported page sizes
  */
 struct iommu_ops {
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/3] iommu: Document iommu_ops.iotlb_sync_map()

2019-02-20 Thread Geert Uytterhoeven
Add missing kerneldoc for iommu_ops.iotlb_sync_map().

Fixes: 1d7ae53b152dbc5b ("iommu: Introduce iotlb_sync_map callback")
Signed-off-by: Geert Uytterhoeven 
---
 include/linux/iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 848fb07026b67169..3ed5d372b090af54 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -169,6 +169,7 @@ struct iommu_resv_region {
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
  * @iotlb_range_add: Add a given iova range to the flush queue for this domain
+ * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *queue
  * @iova_to_phys: translate iova to physical address
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/3] iommu: Fix kerneldoc for iommu_ops.flush_iotlb_all()

2019-02-20 Thread Geert Uytterhoeven
While the API wrapper is called iommu_flush_tlb_all(), the actual
iommu_ops method is called .flush_iotlb_all(), not .flush_tlb_all().

Fixes: add02cfdc9bc2987 ("iommu: Introduce Interface for IOMMU TLB Flushing")
Signed-off-by: Geert Uytterhoeven 
---
 include/linux/iommu.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 477ef47c357c0553..848fb07026b67169 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -167,7 +167,7 @@ struct iommu_resv_region {
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
- * @flush_tlb_all: Synchronously flush all hardware TLBs for this domain
+ * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
  * @iotlb_range_add: Add a given iova range to the flush queue for this domain
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
  *queue
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu: Fix IOMMU debugfs fallout

2019-02-20 Thread Geert Uytterhoeven
A change made in the final version of IOMMU debugfs support replaced the
public function iommu_debugfs_new_driver_dir() by the public dentry
iommu_debugfs_dir in , but forgot to update both the
implementation in iommu-debugfs.c, and the patch description.

Fix this by exporting iommu_debugfs_dir, and removing the reference to
and implementation of iommu_debugfs_new_driver_dir().

Fixes: bad614b24293ae46 ("iommu: Enable debugfs exposure of IOMMU driver 
internals")
Signed-off-by: Geert Uytterhoeven 
---
 drivers/iommu/iommu-debugfs.c | 23 ---
 1 file changed, 4 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/iommu-debugfs.c b/drivers/iommu/iommu-debugfs.c
index 3b1bf88fd1b0494a..f0354894209648fd 100644
--- a/drivers/iommu/iommu-debugfs.c
+++ b/drivers/iommu/iommu-debugfs.c
@@ -12,6 +12,7 @@
 #include 
 
 struct dentry *iommu_debugfs_dir;
+EXPORT_SYMBOL_GPL(iommu_debugfs_dir);
 
 /**
  * iommu_debugfs_setup - create the top-level iommu directory in debugfs
@@ -23,9 +24,9 @@ struct dentry *iommu_debugfs_dir;
  * Emit a strong warning at boot time to indicate that this feature is
  * enabled.
  *
- * This function is called from iommu_init; drivers may then call
- * iommu_debugfs_new_driver_dir() to instantiate a vendor-specific
- * directory to be used to expose internal data.
+ * This function is called from iommu_init; drivers may then use
+ * iommu_debugfs_dir to instantiate a vendor-specific directory to be used
+ * to expose internal data.
  */
 void iommu_debugfs_setup(void)
 {
@@ -48,19 +49,3 @@ void iommu_debugfs_setup(void)

pr_warn("*\n");
}
 }
-
-/**
- * iommu_debugfs_new_driver_dir - create a vendor directory under debugfs/iommu
- * @vendor: name of the vendor-specific subdirectory to create
- *
- * This function is called by an IOMMU driver to create the top-level debugfs
- * directory for that driver.
- *
- * Return: upon success, a pointer to the dentry for the new directory.
- * NULL in case of failure.
- */
-struct dentry *iommu_debugfs_new_driver_dir(const char *vendor)
-{
-   return debugfs_create_dir(vendor, iommu_debugfs_dir);
-}
-EXPORT_SYMBOL_GPL(iommu_debugfs_new_driver_dir);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-02-20 Thread Dave Young
On 02/20/19 at 09:32am, Borislav Petkov wrote:
> On Mon, Feb 18, 2019 at 09:48:20AM +0800, Dave Young wrote:
> > It is ideal if kernel can do it automatically, but I'm not sure if
> > kernel can predict the swiotlb reserved size automatically.
> 
> Do you see how even more absurd this gets?
> 
> If the kernel cannot know the swiotlb reserved size automatically, how
> is the normal user even supposed to know?!
> 
> I see swiotlb_size_or_default() so we have a sane default which we fall
> back to. Now where's the problem with that?

Good question, I expect some answer from people who know more about the
background.  It would be good to have some actual test results, Pingfan
is trying to do some tests.

Previously Joerg posted below patch, maybe he has some idea. Joerg?

commit 94fb9334182284e8e7e4bcb9125c25dc33af19d4
Author: Joerg Roedel 
Date:   Wed Jun 10 17:49:42 2015 +0200

x86/crash: Allocate enough low memory when crashkernel=high

Thanks
Dave
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 21/22] iommu/mediatek: Fix iova_to_phys PA start for 4GB mode

2019-02-20 Thread Yong Wu
On Tue, 2019-02-19 at 15:33 -0800, Evan Green wrote:
> On Sun, Feb 17, 2019 at 1:09 AM Yong Wu  wrote:
> >
> > In the 4GB mode, the physical address is remapped,
> >
> > Here is the detailed remap relationship.
> > CPU PA ->HW PA
> > 0x4000_  0x1_4000_ (Add bit32)
> > 0x8000_  0x1_8000_ ...
> > 0xc000_  0x1_c000_ ...
> > 0x1__0x1__ (No change)
> >
> > Thus, we always add bit32 for PA when entering mtk_iommu_map.
> > But in the iova_to_phys, the CPU don't need this bit32 if the
> > PA is from 0x1_4000_ to 0x1__.
> > This patch discards the bit32 in this iova_to_phys in the 4GB mode.
> >
> > Signed-off-by: Yong Wu 
> > ---
> >  drivers/iommu/mtk_iommu.c | 18 ++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> > index 0277396..076d333 100644
> > --- a/drivers/iommu/mtk_iommu.c
> > +++ b/drivers/iommu/mtk_iommu.c
> > @@ -119,6 +119,19 @@ struct mtk_iommu_domain {
> >
> >  static const struct iommu_ops mtk_iommu_ops;
> >
> > +/*
> > + * In M4U 4GB mode, the physical address is remapped as below:
> > + *  CPU PA ->   M4U HW PA
> > + *  0x4000_ 0x1_4000_ (Add bit32)
> > + *  0x8000_ 0x1_8000_ ...
> > + *  0xc000_ 0x1_c000_ ...
> > + *  0x1__   0x1__ (No change)
> > + *
> > + * Thus, We always add BIT32 in the iommu_map and disable BIT32 if PA is >=
> > + * 0x1_4000_ in the iova_to_phys.
> > + */
> > +#define MTK_IOMMU_4GB_MODE_PA_14000 0x14000UL
> > +
> >  static LIST_HEAD(m4ulist); /* List all the M4U HWs */
> >
> >  #define for_each_m4u(data) list_for_each_entry(data, , list)
> > @@ -415,6 +428,7 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct 
> > iommu_domain *domain,
> >   dma_addr_t iova)
> >  {
> > struct mtk_iommu_domain *dom = to_mtk_domain(domain);
> > +   struct mtk_iommu_data *data = mtk_iommu_get_m4u_data();
> > unsigned long flags;
> > phys_addr_t pa;
> >
> > @@ -422,6 +436,10 @@ static phys_addr_t mtk_iommu_iova_to_phys(struct 
> > iommu_domain *domain,
> > pa = dom->iop->iova_to_phys(dom->iop, iova);
> > spin_unlock_irqrestore(>pgtlock, flags);
> >
> > +   if (data->plat_data->has_4gb_mode && data->dram_is_4gb &&
> > +   pa >= MTK_IOMMU_4GB_MODE_PA_14000)
> > +   pa &= ~BIT_ULL(32);
> > +
> 
> The define doesn't really make it much better, but I guess it doesn't
> make it worse either. As I was reviewing this I was thinking that this
> should be rolled into patch 6 "iommu/io-pgtable-arm-v7s: Extend
> MediaTek 4GB Mode". But I guess this was returning bad PAs since
> before this series, right? So does this need a Fixes tag?

Thanks very much for your reviewing so many patches.

Yes. The issue exist before this series, It was introduced by this
commit:
30e2fccf9512 ("iommu/mediatek: Enlarge the validate PA range for 4GB
mode")

I will send a new version to add this tag.

> -Evan
> 
> ___
> Linux-mediatek mailing list
> linux-media...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-mediatek


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCHv7] x86/kdump: bugfix, make the behavior of crashkernel=X consistent with kaslr

2019-02-20 Thread Borislav Petkov
On Mon, Feb 18, 2019 at 09:48:20AM +0800, Dave Young wrote:
> It is ideal if kernel can do it automatically, but I'm not sure if
> kernel can predict the swiotlb reserved size automatically.

Do you see how even more absurd this gets?

If the kernel cannot know the swiotlb reserved size automatically, how
is the normal user even supposed to know?!

I see swiotlb_size_or_default() so we have a sane default which we fall
back to. Now where's the problem with that?

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu