Re: [PATCH 1/2] iommu: Fix race condition during default domain allocation

2021-06-16 Thread Ashish Mhetre



On 6/11/2021 6:19 PM, Robin Murphy wrote:


On 2021-06-11 11:45, Will Deacon wrote:

On Thu, Jun 10, 2021 at 09:46:53AM +0530, Ashish Mhetre wrote:

The domain is getting created more than once during the asynchronous probe
of multiple display heads (devices). All the display heads share the same
SID and are expected to be in the same domain. As the
iommu_alloc_default_domain() call is not protected, group->default_domain
and group->domain end up with different domains, leading to subsequent
IOMMU faults.
Fix this by protecting the iommu_alloc_default_domain() call with
group->mutex.


Can you provide some more information about exactly what the h/w
configuration is, and the callstack which exhibits the race, please?


It'll be basically the same as the issue reported long ago, where PCI
groups were not constructed correctly in the absence of ACS. Triggering
the iommu_probe_device() replay in of_iommu_configure() off the back of
driver probe is way too late and allows calls to happen in the wrong
order, or indeed race in parallel as here. Fixing that is still on my
radar, but will not be simple, and will probably go hand-in-hand with
phasing out the bus ops (for the multiple-driver-coexistence problem).


For iommu group creation, the stack flow during race is like:
Display device 1:
iommu_probe_device -> iommu_group_get_for_dev -> arm_smmu_device_group
Display device 2:
iommu_probe_device -> iommu_group_get_for_dev -> arm_smmu_device_group

And this way it ends up creating 2 groups for 2 display devices sharing
the same SID.
Ideally, for the 2nd display device, the iommu_group_get call from
iommu_group_get_for_dev should return the same group as for the 1st
display device. But due to the race, it ends up with 2 groups.


For default domain, the stack flow during race is like:
Display device 1:
iommu_probe_device -> iommu_alloc_default_domain -> arm_smmu_domain_alloc
Display device 2:
iommu_probe_device -> iommu_alloc_default_domain -> arm_smmu_domain_alloc

Here too, the 2nd device should already have a domain allocated, and the
'if (group->default_domain)' condition in iommu_alloc_default_domain
should be true for the 2nd device.


The issue with this is that IOVA accesses from the 2nd device result in
context faults.


Signed-off-by: Ashish Mhetre 
---
  drivers/iommu/iommu.c | 2 ++
  1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 808ab70..2700500 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -273,7 +273,9 @@ int iommu_probe_device(struct device *dev)
  * support default domains, so the return value is not yet
  * checked.
  */
+    mutex_lock(&group->mutex);
 iommu_alloc_default_domain(group, dev);
+    mutex_unlock(&group->mutex);


It feels wrong to serialise this for everybody just to cater for systems
with aliasing SIDs between devices.


If two or more devices are racing at this point then they're already
going to be serialised by at least iommu_group_add_device(), so I doubt
there would be much impact - only the first device through here will
hold the mutex for any appreciable length of time. Every other path
which modifies group->domain does so with the mutex held (note the
"expected" default domain allocation flow in bus_iommu_probe() in
particular), so not holding it here does seem like a straightforward
oversight.

Robin.
Serialization will only happen for devices sharing the same group. Only
the first device in the group will hold the mutex until the domain is
created. The rest of the devices will just check for an existing domain
in iommu_alloc_default_domain, then return and release the mutex.
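
As an illustration of this pattern - the first device in the group
allocates the default domain under group->mutex while later devices find
group->default_domain already set - here is a minimal userspace sketch
using pthreads (the struct and function names are hypothetical stand-ins,
not the kernel API):
---
/* Minimal sketch of the serialized default-domain allocation pattern. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct iommu_group_sketch {
	pthread_mutex_t mutex;
	void *default_domain;
};

static struct iommu_group_sketch grp = {
	.mutex = PTHREAD_MUTEX_INITIALIZER,
};

/* Stands in for arm_smmu_domain_alloc(). */
static void *alloc_domain(void)
{
	return malloc(1);
}

/* Mirrors the fixed flow: only the first device allocates; the rest see
 * the existing domain, return, and release the mutex. */
static void *probe_device(void *arg)
{
	long devno = (long)arg;

	pthread_mutex_lock(&grp.mutex);
	if (!grp.default_domain) {
		grp.default_domain = alloc_domain();
		printf("device %ld allocated the default domain\n", devno);
	} else {
		printf("device %ld reuses the existing domain\n", devno);
	}
	pthread_mutex_unlock(&grp.mutex);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	/* Two display heads probing in parallel, as in the reported race. */
	pthread_create(&t1, NULL, probe_device, (void *)1L);
	pthread_create(&t2, NULL, probe_device, (void *)2L);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return 0;
}
---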




Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)

2021-06-16 Thread Dominique MARTINET
Christoph Hellwig wrote on Thu, Jun 17, 2021 at 07:12:32AM +0200:
> On Thu, Jun 17, 2021 at 09:39:15AM +0900, Dominique MARTINET wrote:
> > Konrad Rzeszutek Wilk wrote on Wed, Jun 16, 2021 at 08:27:39PM -0400:
> > > Thank you for testing that - and this is a bummer indeed.
> > 
> > Hm, actually not that surprising if it was working without the offset
> > adjustments and doing non-aligned mappings -- perhaps the nvme code just
> > needs to round the offsets down instead of expecting swiotlb to do it?
> 
> It can't.  The whole point of the series was to keep the original offsets.

Right, now that I'm reading this again there are two kinds of offsets
(quoting code from today's master)
---
static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr,
			   size_t size, enum dma_data_direction dir)
{
	struct io_tlb_mem *mem = io_tlb_default_mem;
	int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
	phys_addr_t orig_addr = mem->slots[index].orig_addr;
---

There is:
 - (tlb_addr - mem->start) alignment that Linus added up
 - mem->slots[index].orig_addr alignment (within IO_TLB_SIZE blocks)


I would assume that series made it possible to preserve offsets within a
block for orig_addr, but in the process broke the offsets of a bounce
within a memory slot (the first one); I assume we want to restore here
the offset within the IO_TLB_SIZE block in orig_addr, so it needs another
offsetting by that orig_addr offset. E.g., taking a block and offsets
within blocks, we have at the start of the function:

 |-|---|--|
 ^ ^   ^
 block start   slot orig addr  tlb_addr

and want the orig_addr variable to align with tlb_addr.
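
To make the arithmetic concrete, a minimal userspace sketch of the
offset-within-slot computation being discussed (IO_TLB_SHIFT taken as the
kernel's 11, i.e. 2 KiB slots; the addresses are made up):
---
#include <stdio.h>

#define IO_TLB_SHIFT	11			/* kernel value: 2 KiB slots */
#define IO_TLB_SIZE	(1UL << IO_TLB_SHIFT)

int main(void)
{
	unsigned long mem_start = 0x80000000UL;	/* bounce buffer start */
	unsigned long tlb_addr  = 0x80000a30UL;	/* bounced address */
	unsigned long slot_orig = 0x40001800UL;	/* mem->slots[index].orig_addr */

	unsigned long index  = (tlb_addr - mem_start) >> IO_TLB_SHIFT;
	/* Offset of tlb_addr inside its IO_TLB_SIZE slot... */
	unsigned long offset = (tlb_addr - mem_start) & (IO_TLB_SIZE - 1);
	/* ...which, per the diagram, would need to be applied to orig_addr
	 * so that orig_addr and tlb_addr line up within the slot. */
	unsigned long orig_addr = slot_orig + offset;

	printf("index=%lu offset=0x%lx orig_addr=0x%lx\n",
	       index, offset, orig_addr);
	return 0;
}
---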


So I was a bit hasty in saying nvme needs to remove offsets; it's more
that the current code only has the second one working, while the quick
fix breaks the second one in the process of fixing the first...



Jianxiong Gao, before spending more time on this, could you also try
Chanho Park's patch?
https://lore.kernel.org/linux-iommu/20210510091816.ga2...@lst.de/T/#m0d0df6490350a08dcc24c9086c8edc165b402d6f

I frankly don't understand many details of that code at this point,
in particular I have no idea why or if the patch needs another offset
with mem->start or where the dma_get_min_align_mask(dev) comes from,
but it'll be interesting to test.


Thanks,
-- 
Dominique


Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)

2021-06-16 Thread Christoph Hellwig
On Thu, Jun 17, 2021 at 09:39:15AM +0900, Dominique MARTINET wrote:
> Konrad Rzeszutek Wilk wrote on Wed, Jun 16, 2021 at 08:27:39PM -0400:
> > Thank you for testing that - and this is a bummer indeed.
> 
> Hm, actually not that surprising if it was working without the offset
> adjustments and doing non-aligned mappings -- perhaps the nvme code just
> needs to round the offsets down instead of expecting swiotlb to do it?

It can't.  The whole point of the series was to keep the original offsets.


Re: Plan for /dev/ioasid RFC v2

2021-06-16 Thread Liu Yi L
Hi Alex,

On Wed, 16 Jun 2021 13:39:37 -0600, Alex Williamson wrote:

> On Wed, 16 Jun 2021 06:43:23 +
> "Tian, Kevin"  wrote:
> 
> > > From: Alex Williamson 
> > > Sent: Wednesday, June 16, 2021 12:12 AM
> > > 
> > > On Tue, 15 Jun 2021 02:31:39 +
> > > "Tian, Kevin"  wrote:
> > > 
> > > > > From: Alex Williamson 
> > > > > Sent: Tuesday, June 15, 2021 12:28 AM
> > > > >
> > > > [...]
> > > > > > IOASID. Today the group fd requires an IOASID before it hands out a
> > > > > > device_fd. With iommu_fd the device_fd will not allow IOCTLs until 
> > > > > > it
> > > > > > has a blocked DMA IOASID and is successefully joined to an 
> > > > > > iommu_fd.
> > > > >
> > > > > Which is the root of my concern.  Who owns ioctls to the device fd?
> > > > > It's my understanding this is a vfio provided file descriptor and it's
> > > > > therefore vfio's responsibility.  A device-level IOASID interface
> > > > > therefore requires that vfio manage the group aspect of device access.
> > > > > AFAICT, that means that device access can therefore only begin when 
> > > > > all
> > > > > devices for a given group are attached to the IOASID and must halt for
> > > > > all devices in the group if any device is ever detached from an 
> > > > > IOASID,
> > > > > even temporarily.  That suggests a lot more oversight of the IOASIDs 
> > > > > by
> > > > > vfio than I'd prefer.
> > > > >
> > > >
> > > > This is possibly the point that is worthy of more clarification and
> > > > alignment, as it sounds like the root of controversy here.
> > > >
> > > > I feel the goal of vfio group management is more about ownership, i.e.
> > > > all devices within a group must be assigned to a single user. Following
> > > > the three rules defined by Jason, what we really care is whether a group
> > > > of devices can be isolated from the rest of the world, i.e. no access to
> > > > memory/device outside of its security context and no access to its
> > > > security context from devices outside of this group. This can be 
> > > > achieved
> > > > as long as every device in the group is either in block-DMA state when
> > > > it's not attached to any security context or attached to an IOASID 
> > > > context
> > > > in IOMMU fd.
> > > >
> > > > As long as group-level isolation is satisfied, how devices within a 
> > > > group
> > > > are further managed is decided by the user (unattached, all attached to
> > > > same IOASID, attached to different IOASIDs) as long as the user
> > > > understands the implication of lacking of isolation within the group. 
> > > > This
> > > > is what a device-centric model comes to play. Misconfiguration just 
> > > > hurts
> > > > the user itself.
> > > >
> > > > If this rationale can be agreed, then I didn't see the point of having 
> > > > VFIO
> > > > to mandate all devices in the group must be attached/detached in
> > > > lockstep.
> > > 
> > > In theory this sounds great, but there are still too many assumptions
> > > and too much hand waving about where isolation occurs for me to feel
> > > like I really have the complete picture.  So let's walk through some
> > > examples.  Please fill in and correct where I'm wrong.
> > 
> > Thanks for putting these examples. They are helpful for clearing the 
> > whole picture.
> > 
> > Before filling in let's first align on what is the key difference between
> > current VFIO model and this new proposal. With this comparison we'll
> > know which of following questions are answered with existing VFIO
> > mechanism and which are handled differently.
> > 
> > With Yi's help we figured out the current mechanism:
> > 
> > 1) vfio_group_viable. The code comment explains the intention clearly:
> > 
> > --
> > * A vfio group is viable for use by userspace if all devices are in
> >  * one of the following states:
> >  *  - driver-less
> >  *  - bound to a vfio driver
> >  *  - bound to an otherwise allowed driver
> >  *  - a PCI interconnect device
> > --
> > 
> > Note this check is not related to an IOMMU security context.  
> 
> Because this is a pre-requisite for imposing that IOMMU security
> context.
>  
> > 2) vfio_iommu_group_notifier. When an IOMMU_GROUP_NOTIFY_
> > BOUND_DRIVER event is notified, vfio_group_viable is re-evaluated.
> > If the affected group was previously viable but now becomes not 
> > viable, BUG_ON() as it implies that this device is bound to a non-vfio 
> > driver which breaks the group isolation.  
> 
> This notifier action is conditional on there being users of devices
> within a secure group IOMMU context.
> 
> > 3) vfio_group_get_device_fd. User can acquire a device fd only after
> > a) the group is viable;
> > b) the group is attached to a container;
> > c) iommu is set on the container (implying a security context
> > established);  
> 
> The order is actually b) a) c) but arguably b) is a no-op until:
> 
> d) a device fd is provided to the user

Per the code in QEMU vfio_get_group(). The 

Re: [RFC PATCH] iommu: add domain->nested

2021-06-16 Thread Zhangfei Gao



On 2021/6/16 10:44 PM, Christoph Hellwig wrote:

On Wed, Jun 16, 2021 at 10:38:02PM +0800, Zhangfei Gao wrote:

+++ b/include/linux/iommu.h
@@ -87,6 +87,7 @@ struct iommu_domain {
void *handler_token;
struct iommu_domain_geometry geometry;
void *iova_cookie;
+   int nested;

This should probably be a bool : 1;

Also this needs a user, so please just queue up a variant of this for
the code that eventually relies on this information.

Thanks Christoph

Got it, will do this.



Re: [PATCH v12 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-06-16 Thread Stefano Stabellini
On Wed, 16 Jun 2021, Claire Chang wrote:
> Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
> use it to determine whether to bounce the data or not. This will be
> useful later to allow for different pools.
> 
> Signed-off-by: Claire Chang 
> ---
>  include/linux/swiotlb.h | 11 +++
>  kernel/dma/direct.c |  2 +-
>  kernel/dma/direct.h |  2 +-
>  kernel/dma/swiotlb.c|  4 
>  4 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
> index dd1c30a83058..8d8855c77d9a 100644
> --- a/include/linux/swiotlb.h
> +++ b/include/linux/swiotlb.h
> @@ -84,6 +84,7 @@ extern enum swiotlb_force swiotlb_force;
>   *   unmap calls.
>   * @debugfs: The dentry to debugfs.
>   * @late_alloc:  %true if allocated using the page allocator
> + * @force_bounce: %true if swiotlb bouncing is forced
>   */
>  struct io_tlb_mem {
>   phys_addr_t start;
> @@ -94,6 +95,7 @@ struct io_tlb_mem {
>   spinlock_t lock;
>   struct dentry *debugfs;
>   bool late_alloc;
> + bool force_bounce;
>   struct io_tlb_slot {
>   phys_addr_t orig_addr;
>   size_t alloc_size;
> @@ -109,6 +111,11 @@ static inline bool is_swiotlb_buffer(struct device *dev, 
> phys_addr_t paddr)
>   return mem && paddr >= mem->start && paddr < mem->end;
>  }
>  
> +static inline bool is_swiotlb_force_bounce(struct device *dev)
> +{
> + return dev->dma_io_tlb_mem->force_bounce;
> +}
>  void __init swiotlb_exit(void);
>  unsigned int swiotlb_max_segment(void);
>  size_t swiotlb_max_mapping_size(struct device *dev);
> @@ -120,6 +127,10 @@ static inline bool is_swiotlb_buffer(struct device *dev, 
> phys_addr_t paddr)
>  {
>   return false;
>  }
> +static inline bool is_swiotlb_force_bounce(struct device *dev)
> +{
> + return false;
> +}
>  static inline void swiotlb_exit(void)
>  {
>  }
> diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
> index 7a88c34d0867..a92465b4eb12 100644
> --- a/kernel/dma/direct.c
> +++ b/kernel/dma/direct.c
> @@ -496,7 +496,7 @@ size_t dma_direct_max_mapping_size(struct device *dev)
>  {
>   /* If SWIOTLB is active, use its maximum mapping size */
>   if (is_swiotlb_active(dev) &&
> - (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE))
> + (dma_addressing_limited(dev) || is_swiotlb_force_bounce(dev)))
>   return swiotlb_max_mapping_size(dev);
>   return SIZE_MAX;
>  }
> diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
> index 13e9e7158d94..4632b0f4f72e 100644
> --- a/kernel/dma/direct.h
> +++ b/kernel/dma/direct.h
> @@ -87,7 +87,7 @@ static inline dma_addr_t dma_direct_map_page(struct device 
> *dev,
>   phys_addr_t phys = page_to_phys(page) + offset;
>   dma_addr_t dma_addr = phys_to_dma(dev, phys);
>  
> - if (unlikely(swiotlb_force == SWIOTLB_FORCE))
> + if (is_swiotlb_force_bounce(dev))
>   return swiotlb_map(dev, phys, size, dir, attrs);
>
>   if (unlikely(!dma_capable(dev, dma_addr, size, true))) {

Should we also make the same change in
drivers/xen/swiotlb-xen.c:xen_swiotlb_map_page ?

If I make that change, I can see that everything is working as
expected for a restricted-dma device with Linux running as dom0 on Xen.
However, is_swiotlb_force_bounce returns non-zero even for normal
non-restricted-dma devices. That shouldn't happen, right?

It looks like struct io_tlb_slot is not zeroed on allocation.
Adding memset(mem, 0x0, struct_size) in swiotlb_late_init_with_tbl
solves the issue.

With those two changes, the series passes my tests and you can add my
tested-by.
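
As an aside, the zeroing fix being suggested can be pictured with a tiny
standalone sketch; the struct below is a stripped-down stand-in for the
kernel's io_tlb_mem, and malloc()/memset() stand in for the kernel
allocators:
---
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct io_tlb_slot { unsigned long orig_addr; size_t alloc_size; };

struct io_tlb_mem_sketch {
	bool force_bounce;
	size_t nslabs;
	struct io_tlb_slot slots[];	/* flexible array member */
};

int main(void)
{
	size_t nslabs = 128;
	size_t sz = sizeof(struct io_tlb_mem_sketch) +
		    nslabs * sizeof(struct io_tlb_slot);
	struct io_tlb_mem_sketch *mem = malloc(sz);

	/* Without this, force_bounce (and everything else) is garbage. */
	memset(mem, 0, sz);
	mem->nslabs = nslabs;

	printf("force_bounce=%d\n", mem->force_bounce);	/* reliably 0 now */
	free(mem);
	return 0;
}
---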


Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)

2021-06-16 Thread Dominique MARTINET
Konrad Rzeszutek Wilk wrote on Wed, Jun 16, 2021 at 08:27:39PM -0400:
> Thank you for testing that - and this is a bummer indeed.

Hm, actually not that surprising if it was working without the offset
adjustments and doing non-aligned mappings -- perhaps the nvme code just
needs to round the offsets down instead of expecting swiotlb to do it?

Note I didn't look at that part of the code at all, so I might be
stating the obvious in a way that's difficult to adjust...


> Dominique, Horia,
> 
> Are those crypto devices somehow easily available to test out the
> patches?

The one I have is included in the iMX8MP and iMX8MQ SoCs; the latter is
included in the MNT Reform and Librem 5, and both have evaluation
toolkits, but I wouldn't quite say they are easy to get...

I'm happy to test different patch variants if Horia doesn't beat me to
it though, it's not as practical as having the device but don't hesitate
to ask if I can run with extra debugs or something.

-- 
Dominique


Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)

2021-06-16 Thread Konrad Rzeszutek Wilk
On Wed, Jun 16, 2021 at 01:49:54PM -0700, Jianxiong Gao wrote:
> On Fri, Jun 11, 2021 at 3:35 AM Konrad Rzeszutek Wilk
>  wrote:
> >
> > On Fri, Jun 11, 2021 at 08:21:53AM +0200, Christoph Hellwig wrote:
> > > On Thu, Jun 10, 2021 at 05:52:07PM +0300, Horia Geantă wrote:
> > > > I've noticed the failure also in v5.10 and v5.11 stable kernels,
> > > > since the patch set has been backported.
> > >
> > > FYI, there has been a patch on the list that should have fixed this
> > > for about a month:
> > >
> > > https://lore.kernel.org/linux-iommu/20210510091816.ga2...@lst.de/T/#m0d0df6490350a08dcc24c9086c8edc165b402d6f
> > >
> > > but it seems like it never got picked up.
> >
> > Jianxiong,
> > Would you be up for testing this patch on your NVMe rig please? I don't
> > forsee a problem.. but just in case
> >
> I have tested the attached patch and it generates an error when
> formatting a disk to xfs format in Rhel 8 environment:

Thank you for testing that - and this is a bummer indeed.

Jianxiong,
How unique is this NVMe? Should I be able to reproduce this with any
type or is it specific to Google Cloud?

Dominique, Horia,

Are those crypto devices somehow easily available to test out the
patches?

P.S.
Most unfortunate timing - I am out in rural areas in US with not great
Internet, so won't be able to get fully down to this until Monday.

Re: [PATCH v12 11/12] dt-bindings: of: Add restricted DMA pool

2021-06-16 Thread Stefano Stabellini
On Wed, 16 Jun 2021, Claire Chang wrote:
> Introduce the new compatible string, restricted-dma-pool, for restricted
> DMA. One can specify the address and length of the restricted DMA memory
> region by restricted-dma-pool in the reserved-memory node.
> 
> Signed-off-by: Claire Chang 
> ---
>  .../reserved-memory/reserved-memory.txt   | 36 +--
>  1 file changed, 33 insertions(+), 3 deletions(-)
> 
> diff --git 
> a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt 
> b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> index e8d3096d922c..46804f24df05 100644
> --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> @@ -51,6 +51,23 @@ compatible (optional) - standard definition
>used as a shared pool of DMA buffers for a set of devices. It can
>be used by an operating system to instantiate the necessary pool
>management subsystem if necessary.
> +- restricted-dma-pool: This indicates a region of memory meant to be
> +  used as a pool of restricted DMA buffers for a set of devices. The
> +  memory region would be the only region accessible to those devices.
> +  When using this, the no-map and reusable properties must not be 
> set,
> +  so the operating system can create a virtual mapping that will be 
> used
> +  for synchronization. The main purpose for restricted DMA is to
> +  mitigate the lack of DMA access control on systems without an 
> IOMMU,
> +  which could result in the DMA accessing the system memory at
> +  unexpected times and/or unexpected addresses, possibly leading to 
> data
> +  leakage or corruption. The feature on its own provides a basic 
> level
> +  of protection against the DMA overwriting buffer contents at
> +  unexpected times. However, to protect against general data leakage 
> and
> +  system memory corruption, the system needs to provide way to lock 
> down
> +  the memory access, e.g., MPU. Note that since coherent allocation
> +  needs remapping, one must set up another device coherent pool by
> +  shared-dma-pool and use dma_alloc_from_dev_coherent instead for 
> atomic
> +  coherent allocation.
>  - vendor specific string in the form ,[-]
>  no-map (optional) - empty property
>  - Indicates the operating system must not create a virtual mapping
> @@ -85,10 +102,11 @@ memory-region-names (optional) - a list of names, one 
> for each corresponding
>  
>  Example
>  ---
> -This example defines 3 contiguous regions are defined for Linux kernel:
> +This example defines 4 contiguous regions for Linux kernel:
>  one default of all device drivers (named linux,cma@7200 and 64MiB in 
> size),
> -one dedicated to the framebuffer device (named framebuffer@7800, 8MiB), 
> and
> -one for multimedia processing (named multimedia-memory@7700, 64MiB).
> +one dedicated to the framebuffer device (named framebuffer@7800, 8MiB),
> +one for multimedia processing (named multimedia-memory@7700, 64MiB), and
> +one for restricted dma pool (named restricted_dma_reserved@0x5000, 
> 64MiB).
>  
>  / {
>   #address-cells = <1>;
> @@ -120,6 +138,11 @@ one for multimedia processing (named 
> multimedia-memory@7700, 64MiB).
>   compatible = "acme,multimedia-memory";
>   reg = <0x7700 0x400>;
>   };
> +
> + restricted_dma_reserved: restricted_dma_reserved {
> + compatible = "restricted-dma-pool";
> + reg = <0x5000 0x400>;
> + };
>   };
>  
>   /* ... */
> @@ -138,4 +161,11 @@ one for multimedia processing (named 
> multimedia-memory@7700, 64MiB).
>   memory-region = <&multimedia_reserved>;
>   /* ... */
>   };
> +
> + pcie_device: pcie_device@0,0 {
> + reg = <0x8301 0x0 0x 0x0 0x0010
> +0x8301 0x0 0x0010 0x0 0x0010>;
> +memory-region = <&restricted_dma_mem_reserved>;

Shouldn't it be &restricted_dma_reserved ?



Re: swiotlb/caamjr regression (Was: [GIT PULL] (swiotlb) stable/for-linus-5.12)

2021-06-16 Thread Jianxiong Gao via iommu
On Fri, Jun 11, 2021 at 3:35 AM Konrad Rzeszutek Wilk
 wrote:
>
> On Fri, Jun 11, 2021 at 08:21:53AM +0200, Christoph Hellwig wrote:
> > On Thu, Jun 10, 2021 at 05:52:07PM +0300, Horia Geantă wrote:
> > > I've noticed the failure also in v5.10 and v5.11 stable kernels,
> > > since the patch set has been backported.
> >
> > FYI, there has been a patch on the list that should have fixed this
> > for about a month:
> >
> > https://lore.kernel.org/linux-iommu/20210510091816.ga2...@lst.de/T/#m0d0df6490350a08dcc24c9086c8edc165b402d6f
> >
> > but it seems like it never got picked up.
>
> Jianxiong,
> Would you be up for testing this patch on your NVMe rig please? I don't
> forsee a problem.. but just in case
>
I have tested the attached patch and it generates an error when
formatting a disk to xfs format in Rhel 8 environment:

sudo mkfs.xfs -f /dev/nvme0n2
meta-data=/dev/nvme0n2   isize=512agcount=4, agsize=32768000 blks
 =   sectsz=512   attr=2, projid32bit=1
 =   crc=1finobt=1, sparse=1, rmapbt=0
 =   reflink=1
data =   bsize=4096   blocks=131072000, imaxpct=25
 =   sunit=0  swidth=0 blks
naming   =version 2  bsize=4096   ascii-ci=0, ftype=1
log  =internal log   bsize=4096   blocks=64000, version=2
 =   sectsz=512   sunit=0 blks, lazy-count=1
realtime =none   extsz=4096   blocks=0, rtextents=0
Discarding blocks...Done.
bad magic number
bad magic number
Metadata corruption detected at 0x56211de4c0c8, xfs_sb block 0x0/0x200
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x200
releasing dirty buffer (bulk) to free list!

I applied the patch on commit 06af8679449d.

Re: Plan for /dev/ioasid RFC v2

2021-06-16 Thread Alex Williamson
On Wed, 16 Jun 2021 06:43:23 +
"Tian, Kevin"  wrote:

> > From: Alex Williamson 
> > Sent: Wednesday, June 16, 2021 12:12 AM
> > 
> > On Tue, 15 Jun 2021 02:31:39 +
> > "Tian, Kevin"  wrote:
> >   
> > > > From: Alex Williamson 
> > > > Sent: Tuesday, June 15, 2021 12:28 AM
> > > >  
> > > [...]  
> > > > > IOASID. Today the group fd requires an IOASID before it hands out a
> > > > > device_fd. With iommu_fd the device_fd will not allow IOCTLs until it
> > > > > has a blocked DMA IOASID and is successefully joined to an iommu_fd.  
> > > >
> > > > Which is the root of my concern.  Who owns ioctls to the device fd?
> > > > It's my understanding this is a vfio provided file descriptor and it's
> > > > therefore vfio's responsibility.  A device-level IOASID interface
> > > > therefore requires that vfio manage the group aspect of device access.
> > > > AFAICT, that means that device access can therefore only begin when all
> > > > devices for a given group are attached to the IOASID and must halt for
> > > > all devices in the group if any device is ever detached from an IOASID,
> > > > even temporarily.  That suggests a lot more oversight of the IOASIDs by
> > > > vfio than I'd prefer.
> > > >  
> > >
> > > This is possibly the point that is worthy of more clarification and
> > > alignment, as it sounds like the root of controversy here.
> > >
> > > I feel the goal of vfio group management is more about ownership, i.e.
> > > all devices within a group must be assigned to a single user. Following
> > > the three rules defined by Jason, what we really care is whether a group
> > > of devices can be isolated from the rest of the world, i.e. no access to
> > > memory/device outside of its security context and no access to its
> > > security context from devices outside of this group. This can be achieved
> > > as long as every device in the group is either in block-DMA state when
> > > it's not attached to any security context or attached to an IOASID context
> > > in IOMMU fd.
> > >
> > > As long as group-level isolation is satisfied, how devices within a group
> > > are further managed is decided by the user (unattached, all attached to
> > > same IOASID, attached to different IOASIDs) as long as the user
> > > understands the implication of lacking of isolation within the group. This
> > > is what a device-centric model comes to play. Misconfiguration just hurts
> > > the user itself.
> > >
> > > If this rationale can be agreed, then I didn't see the point of having 
> > > VFIO
> > > to mandate all devices in the group must be attached/detached in
> > > lockstep.  
> > 
> > In theory this sounds great, but there are still too many assumptions
> > and too much hand waving about where isolation occurs for me to feel
> > like I really have the complete picture.  So let's walk through some
> > examples.  Please fill in and correct where I'm wrong.  
> 
> Thanks for putting these examples. They are helpful for clearing the 
> whole picture.
> 
> Before filling in let's first align on what is the key difference between
> current VFIO model and this new proposal. With this comparison we'll
> know which of following questions are answered with existing VFIO
> mechanism and which are handled differently.
> 
> With Yi's help we figured out the current mechanism:
> 
> 1) vfio_group_viable. The code comment explains the intention clearly:
> 
> --
> * A vfio group is viable for use by userspace if all devices are in
>  * one of the following states:
>  *  - driver-less
>  *  - bound to a vfio driver
>  *  - bound to an otherwise allowed driver
>  *  - a PCI interconnect device
> --
> 
> Note this check is not related to an IOMMU security context.

Because this is a pre-requisite for imposing that IOMMU security
context.
 
> 2) vfio_iommu_group_notifier. When an IOMMU_GROUP_NOTIFY_
> BOUND_DRIVER event is notified, vfio_group_viable is re-evaluated.
> If the affected group was previously viable but now becomes not 
> viable, BUG_ON() as it implies that this device is bound to a non-vfio 
> driver which breaks the group isolation.

This notifier action is conditional on there being users of devices
within a secure group IOMMU context.

> 3) vfio_group_get_device_fd. User can acquire a device fd only after
>   a) the group is viable;
>   b) the group is attached to a container;
>   c) iommu is set on the container (implying a security context
>   established);

The order is actually b) a) c) but arguably b) is a no-op until:

d) a device fd is provided to the user
 
> The new device-centric proposal suggests:
> 
> 1) vfio_group_viable;
> 2) vfio_iommu_group_notifier;
> 3) block-DMA if a device is detached from previous domain (instead of
> switching back to default domain as today);

I'm literally begging for specifics in this thread, but none are
provided here.  What is the "previous domain"?  How is a device placed
into a DMA blocking IOMMU context?  Is this the IOMMU default domain?

Re: [PATCH] dt-bindings: Drop redundant minItems/maxItems

2021-06-16 Thread Wolfram Sang
On Tue, Jun 15, 2021 at 01:15:43PM -0600, Rob Herring wrote:
> If a property has an 'items' list, then a 'minItems' or 'maxItems' with the
> same size as the list is redundant and can be dropped. Note that is DT
> schema specific behavior and not standard json-schema behavior. The tooling
> will fixup the final schema adding any unspecified minItems/maxItems.
> 
> This condition is partially checked with the meta-schema already, but
> only if both 'minItems' and 'maxItems' are equal to the 'items' length.
> An improved meta-schema is pending.
> 
> Cc: Jens Axboe 
> Cc: Stephen Boyd 
> Cc: Herbert Xu 
> Cc: "David S. Miller" 
> Cc: David Airlie 
> Cc: Daniel Vetter 
> Cc: Vinod Koul 
> Cc: Bartosz Golaszewski 
> Cc: Kamal Dasu 
> Cc: Jonathan Cameron 
> Cc: Lars-Peter Clausen 
> Cc: Thomas Gleixner 
> Cc: Marc Zyngier 
> Cc: Joerg Roedel 
> Cc: Jassi Brar 
> Cc: Mauro Carvalho Chehab 
> Cc: Krzysztof Kozlowski 
> Cc: Ulf Hansson 
> Cc: Jakub Kicinski 
> Cc: Wolfgang Grandegger 
> Cc: Marc Kleine-Budde 
> Cc: Andrew Lunn 
> Cc: Vivien Didelot 
> Cc: Vladimir Oltean 
> Cc: Bjorn Helgaas 
> Cc: Kishon Vijay Abraham I 
> Cc: Linus Walleij 
> Cc: "Uwe Kleine-König" 
> Cc: Lee Jones 
> Cc: Ohad Ben-Cohen 
> Cc: Mathieu Poirier 
> Cc: Philipp Zabel 
> Cc: Paul Walmsley 
> Cc: Palmer Dabbelt 
> Cc: Albert Ou 
> Cc: Alessandro Zummo 
> Cc: Alexandre Belloni 
> Cc: Greg Kroah-Hartman 
> Cc: Mark Brown 
> Cc: Zhang Rui 
> Cc: Daniel Lezcano 
> Cc: Wim Van Sebroeck 
> Cc: Guenter Roeck 
> Signed-off-by: Rob Herring 

Acked-by: Wolfram Sang  # for I2C




[PATCH v4 7/7] iommu/amd: Use only natural aligned flushes in a VM

2021-06-16 Thread Nadav Amit
From: Nadav Amit 

When running on an AMD vIOMMU, it is better to avoid TLB flushes
of unmodified PTEs. vIOMMUs require the hypervisor to synchronize the
virtualized IOMMU's PTEs with the physical ones. This process induces
overhead.

AMD's IOMMU allows us to flush any range that is aligned to a power of
2. So when running on top of a vIOMMU, break the range into sub-ranges
that are naturally aligned, and flush each one separately. This approach
is better when running with a vIOMMU, but on physical IOMMUs, the
penalty of IOTLB misses due to unnecessarily flushed entries is likely to
be low.

Repurpose (i.e., keeping the name, changing the logic)
domain_flush_pages() so it is used to choose whether to perform one
flush of the whole range or multiple ones to avoid flushing unnecessary
ranges. Use NpCache, as usual, to infer whether the IOMMU is physical or
virtual.

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Jiajun Cao 
Cc: Lu Baolu 
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Suggested-by: Robin Murphy 
Signed-off-by: Nadav Amit 
---
 drivers/iommu/amd/iommu.c | 47 ++-
 1 file changed, 42 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index ce8e970aac9a..ec0b6ad27e48 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1262,15 +1262,52 @@ static void __domain_flush_pages(struct 
protection_domain *domain,
 }
 
 static void domain_flush_pages(struct protection_domain *domain,
-  u64 address, size_t size)
+  u64 address, size_t size, int pde)
 {
-   __domain_flush_pages(domain, address, size, 0);
+   if (likely(!amd_iommu_np_cache)) {
+   __domain_flush_pages(domain, address, size, pde);
+   return;
+   }
+
+   /*
+* When NpCache is on, we infer that we run in a VM and use a vIOMMU.
+* In such setups it is best to avoid flushes of ranges which are not
+* naturally aligned, since it would lead to flushes of unmodified
+* PTEs. Such flushes would require the hypervisor to do more work than
+* necessary. Therefore, perform repeated flushes of aligned ranges
+* until you cover the range. Each iteration flush the smaller between
+* the natural alignment of the address that we flush and the highest
+* bit that is set in the remaining size.
+*/
+   while (size != 0) {
+   int addr_alignment = __ffs(address);
+   int size_alignment = __fls(size);
+   int min_alignment;
+   size_t flush_size;
+
+   /*
+* size is always non-zero, but address might be zero, causing
+* addr_alignment to be negative. As the casting of the
+* argument in __ffs(address) to long might trim the high bits
+* of the address on x86-32, cast to long when doing the check.
+*/
+   if (likely((unsigned long)address != 0))
+   min_alignment = min(addr_alignment, size_alignment);
+   else
+   min_alignment = size_alignment;
+
+   flush_size = 1ul << min_alignment;
+
+   __domain_flush_pages(domain, address, flush_size, pde);
+   address += flush_size;
+   size -= flush_size;
+   }
 }
 
 /* Flush the whole IO/TLB for a given protection domain - including PDE */
 void amd_iommu_domain_flush_tlb_pde(struct protection_domain *domain)
 {
-   __domain_flush_pages(domain, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, 1);
+   domain_flush_pages(domain, 0, CMD_INV_IOMMU_ALL_PAGES_ADDRESS, 1);
 }
 
 void amd_iommu_domain_flush_complete(struct protection_domain *domain)
@@ -1297,7 +1334,7 @@ static void domain_flush_np_cache(struct 
protection_domain *domain,
unsigned long flags;
 
	spin_lock_irqsave(&domain->lock, flags);
-   domain_flush_pages(domain, iova, size);
+   domain_flush_pages(domain, iova, size, 1);
	amd_iommu_domain_flush_complete(domain);
	spin_unlock_irqrestore(&domain->lock, flags);
}
@@ -2205,7 +2242,7 @@ static void amd_iommu_iotlb_sync(struct iommu_domain 
*domain,
unsigned long flags;
 
	spin_lock_irqsave(&dom->lock, flags);
-   __domain_flush_pages(dom, gather->start, gather->end - gather->start, 1);
+   domain_flush_pages(dom, gather->start, gather->end - gather->start, 1);
	amd_iommu_domain_flush_complete(dom);
	spin_unlock_irqrestore(&dom->lock, flags);
 }
-- 
2.25.1
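
For illustration, the naturally aligned splitting performed by the loop
above can be reproduced in a standalone userspace program. Here
__builtin_ctzl()/__builtin_clzl() stand in for the kernel's
__ffs()/__fls(), a 64-bit long is assumed, and the range is made up:
---
#include <stdio.h>

static void flush(unsigned long address, unsigned long size)
{
	printf("flush 0x%lx + 0x%lx\n", address, size);
}

int main(void)
{
	unsigned long address = 0x3000, size = 0x5000;	/* made-up range */

	while (size != 0) {
		/* Natural alignment of the address vs. highest set bit of
		 * the remaining size; flush the smaller of the two. */
		int addr_align = address ? __builtin_ctzl(address) : 63;
		int size_align = 63 - __builtin_clzl(size);
		int min_align = addr_align < size_align ? addr_align
							: size_align;
		unsigned long flush_size = 1UL << min_align;

		flush(address, flush_size);
		address += flush_size;
		size -= flush_size;
	}
	return 0;
}
---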



[PATCH v4 6/7] iommu/amd: Sync once for scatter-gather operations

2021-06-16 Thread Nadav Amit
From: Nadav Amit 

On virtual machines, software must flush the IOTLB after each page table
entry update.

The iommu_map_sg() code iterates through the given scatter-gather list
and invokes iommu_map() for each element in the scatter-gather list,
which calls into the vendor IOMMU driver through the iommu_ops callback.
As a result, a single sg mapping may lead to multiple IOTLB flushes.

Fix this by adding an amd_iotlb_sync_map() callback and flushing at this
point, after all sg mappings have been set.

This commit is followed and inspired by commit 933fcd01e97e2
("iommu/vt-d: Add iotlb_sync_map callback").

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Jiajun Cao 
Cc: Robin Murphy 
Cc: Lu Baolu 
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Nadav Amit 
---
 drivers/iommu/amd/iommu.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 63048aabaf5d..ce8e970aac9a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2027,6 +2027,16 @@ static int amd_iommu_attach_device(struct iommu_domain 
*dom,
return ret;
 }
 
+static void amd_iommu_iotlb_sync_map(struct iommu_domain *dom,
+unsigned long iova, size_t size)
+{
+   struct protection_domain *domain = to_pdomain(dom);
+   struct io_pgtable_ops *ops = &domain->iop.iop.ops;
+
+   if (ops->map)
+   domain_flush_np_cache(domain, iova, size);
+}
+
 static int amd_iommu_map(struct iommu_domain *dom, unsigned long iova,
 phys_addr_t paddr, size_t page_size, int iommu_prot,
 gfp_t gfp)
@@ -2045,10 +2055,8 @@ static int amd_iommu_map(struct iommu_domain *dom, 
unsigned long iova,
if (iommu_prot & IOMMU_WRITE)
prot |= IOMMU_PROT_IW;
 
-   if (ops->map) {
+   if (ops->map)
ret = ops->map(ops, iova, paddr, page_size, prot, gfp);
-   domain_flush_np_cache(domain, iova, page_size);
-   }
 
return ret;
 }
@@ -2228,6 +2236,7 @@ const struct iommu_ops amd_iommu_ops = {
.attach_dev = amd_iommu_attach_device,
.detach_dev = amd_iommu_detach_device,
.map = amd_iommu_map,
+   .iotlb_sync_map = amd_iommu_iotlb_sync_map,
.unmap = amd_iommu_unmap,
.iova_to_phys = amd_iommu_iova_to_phys,
.probe_device = amd_iommu_probe_device,
-- 
2.25.1
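
To picture the behavioral change - one flush per scatter-gather list
instead of one per element - here is a minimal userspace sketch; the
struct and function names are hypothetical, not the driver's:
---
#include <stdio.h>

struct sg_elem { unsigned long iova, size; };

static void map_one(struct sg_elem *e)
{
	printf("map   0x%lx + 0x%lx\n", e->iova, e->size);
}

static void flush(unsigned long iova, unsigned long size)
{
	printf("flush 0x%lx + 0x%lx\n", iova, size);
}

int main(void)
{
	struct sg_elem list[] = {
		{ 0x1000, 0x1000 }, { 0x2000, 0x1000 }, { 0x3000, 0x1000 },
	};
	unsigned long start = ~0UL, end = 0;

	for (int i = 0; i < 3; i++) {
		map_one(&list[i]);
		if (list[i].iova < start)
			start = list[i].iova;
		if (list[i].iova + list[i].size > end)
			end = list[i].iova + list[i].size;
	}
	/* iotlb_sync_map(): a single flush covering everything just mapped,
	 * instead of one flush per element. */
	flush(start, end - start);
	return 0;
}
---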



[PATCH v4 4/7] iommu: Factor iommu_iotlb_gather_is_disjoint() out

2021-06-16 Thread Nadav Amit
From: Nadav Amit 

Refactor iommu_iotlb_gather_add_page() and factor out the logic that
detects whether IOTLB gather range and a new range are disjoint. To be
used by the next patch that implements different gathering logic for
AMD.

Note that updating gather->pgsize unconditionally does not affect
correctness as the function had (and has) an invariant, in which
gather->pgsize always represents the flushing granularity of its range.
Arguably, "size" should never be zero, but let's assume for the sake of
discussion that it might.

If "size" equals "gather->pgsize", then the assignment in question
has no impact.

Otherwise, if "size" is non-zero, then iommu_iotlb_sync() would
initialize the size and range (see iommu_iotlb_gather_init()), and the
invariant is kept.

Otherwise, "size" is zero, and "gather" already holds a range, so
gather->pgsize is non-zero and (gather->pgsize && gather->pgsize !=
size) is true. Therefore, again, iommu_iotlb_sync() would be called and
initialize the size.

Cc: Joerg Roedel 
Cc: Jiajun Cao 
Cc: Robin Murphy 
Cc: Lu Baolu 
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Acked-by: Will Deacon 
Signed-off-by: Nadav Amit 
---
 include/linux/iommu.h | 34 ++
 1 file changed, 26 insertions(+), 8 deletions(-)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index e554871db46f..979a5ceeea55 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -497,6 +497,28 @@ static inline void iommu_iotlb_sync(struct iommu_domain 
*domain,
iommu_iotlb_gather_init(iotlb_gather);
 }
 
+/**
+ * iommu_iotlb_gather_is_disjoint - Checks whether a new range is disjoint
+ *
+ * @gather: TLB gather data
+ * @iova: start of page to invalidate
+ * @size: size of page to invalidate
+ *
+ * Helper for IOMMU drivers to check whether a new range and the gathered range
+ * are disjoint. For many IOMMUs, flushing the IOMMU in this case is better
+ * than merging the two, which might lead to unnecessary invalidations.
+ */
+static inline
+bool iommu_iotlb_gather_is_disjoint(struct iommu_iotlb_gather *gather,
+   unsigned long iova, size_t size)
+{
+   unsigned long start = iova, end = start + size - 1;
+
+   return gather->end != 0 &&
+   (end + 1 < gather->start || start > gather->end + 1);
+}
+
+
 /**
  * iommu_iotlb_gather_add_range - Gather for address-based TLB invalidation
  * @gather: TLB gather data
@@ -533,20 +555,16 @@ static inline void iommu_iotlb_gather_add_page(struct 
iommu_domain *domain,
   struct iommu_iotlb_gather 
*gather,
   unsigned long iova, size_t size)
 {
-   unsigned long start = iova, end = start + size - 1;
-
/*
 * If the new page is disjoint from the current range or is mapped at
 * a different granularity, then sync the TLB so that the gather
 * structure can be rewritten.
 */
-   if (gather->pgsize != size ||
-   end + 1 < gather->start || start > gather->end + 1) {
-   if (gather->pgsize)
-   iommu_iotlb_sync(domain, gather);
-   gather->pgsize = size;
-   }
+   if ((gather->pgsize && gather->pgsize != size) ||
+   iommu_iotlb_gather_is_disjoint(gather, iova, size))
+   iommu_iotlb_sync(domain, gather);
 
+   gather->pgsize = size;
iommu_iotlb_gather_add_range(gather, iova, size);
 }
 
-- 
2.25.1
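
A standalone userspace check of the factored-out predicate, using the same
inclusive-end convention as the helper above (the ranges are made up):
---
#include <stdbool.h>
#include <stdio.h>

struct gather { unsigned long start, end; };	/* end is inclusive */

static bool is_disjoint(struct gather *g, unsigned long iova,
			unsigned long size)
{
	unsigned long start = iova, end = start + size - 1;

	return g->end != 0 && (end + 1 < g->start || start > g->end + 1);
}

int main(void)
{
	struct gather g = { .start = 0x2000, .end = 0x2fff };

	/* 0x3000 is adjacent to the gathered range: merge, don't sync. */
	printf("%d\n", is_disjoint(&g, 0x3000, 0x1000));	/* 0 */
	/* 0x10000 is far away: disjoint, so sync first. */
	printf("%d\n", is_disjoint(&g, 0x10000, 0x1000));	/* 1 */
	return 0;
}
---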


[PATCH v4 5/7] iommu/amd: Tailored gather logic for AMD

2021-06-16 Thread Nadav Amit
From: Nadav Amit 

AMD's IOMMU can flush efficiently (i.e., in a single flush) any range.
This is in contrast, for instance, to Intel IOMMUs that have a limit on
the number of pages that can be flushed in a single flush.  In addition,
AMD's IOMMU does not care about the page-size, so changes of the page size
do not need to trigger a TLB flush.

So in most cases, a TLB flush due to disjoint range is not needed for
AMD. Yet, vIOMMUs require the hypervisor to synchronize the virtualized
IOMMU's PTEs with the physical ones. This process induces overhead, so
it is better not to cause unnecessary flushes, i.e., flushes of PTEs
that were not modified.

Implement and use amd_iommu_iotlb_gather_add_page() and use it instead
of the generic iommu_iotlb_gather_add_page(). Ignore disjoint regions
unless "non-present cache" feature is reported by the IOMMU
capabilities, as this is an indication we are running on a physical
IOMMU. A similar indication is used by VT-d (see "caching mode"). The
new logic retains the same flushing behavior that we had before the
introduction of page-selective IOTLB flushes for AMD.

In virtualized environments, check whether the newly flushed region and
the gathered one are disjoint, and flush if they are.

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Jiajun Cao 
Cc: Lu Baolu 
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Cc: Robin Murphy 
Signed-off-by: Nadav Amit 
---
 drivers/iommu/amd/iommu.c | 23 ++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3e40f6610b6a..63048aabaf5d 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2053,6 +2053,27 @@ static int amd_iommu_map(struct iommu_domain *dom, 
unsigned long iova,
return ret;
 }
 
+static void amd_iommu_iotlb_gather_add_page(struct iommu_domain *domain,
+   struct iommu_iotlb_gather *gather,
+   unsigned long iova, size_t size)
+{
+   /*
+* AMD's IOMMU can flush as many pages as necessary in a single flush.
+* Unless we run in a virtual machine, which can be inferred according
+* to whether "non-present cache" is on, it is probably best to prefer
+* (potentially) too extensive TLB flushing (i.e., more misses) over
+* mutliple TLB flushes (i.e., more flushes). For virtual machines the
+* hypervisor needs to synchronize the host IOMMU PTEs with those of
+* the guest, and the trade-off is different: unnecessary TLB flushes
+* should be avoided.
+*/
+   if (amd_iommu_np_cache && gather->end != 0 &&
+   iommu_iotlb_gather_is_disjoint(gather, iova, size))
+   iommu_iotlb_sync(domain, gather);
+
+   iommu_iotlb_gather_add_range(gather, iova, size);
+}
+
 static size_t amd_iommu_unmap(struct iommu_domain *dom, unsigned long iova,
  size_t page_size,
  struct iommu_iotlb_gather *gather)
@@ -2067,7 +2088,7 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, 
unsigned long iova,
 
r = (ops->unmap) ? ops->unmap(ops, iova, page_size, gather) : 0;
 
-   iommu_iotlb_gather_add_page(dom, gather, iova, page_size);
+   amd_iommu_iotlb_gather_add_page(dom, gather, iova, page_size);
 
return r;
 }
-- 
2.25.1



[PATCH v4 3/7] iommu: Improve iommu_iotlb_gather helpers

2021-06-16 Thread Nadav Amit
From: Robin Murphy 

The Mediatek driver is not the only one which might want a basic
address-based gathering behaviour, so although it's arguably simple
enough to open-code, let's factor it out for the sake of cleanliness.
Let's also take this opportunity to document the intent of these
helpers for clarity.

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Jiajun Cao 
Cc: Robin Murphy 
Cc: Lu Baolu 
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Robin Murphy 
Signed-off-by: Nadav Amit 

---

Changes from Robin's version:
* Added iommu_iotlb_gather_add_range() stub !CONFIG_IOMMU_API
* Use iommu_iotlb_gather_add_range() in iommu_iotlb_gather_add_page()
---
 drivers/iommu/mtk_iommu.c |  6 +-
 include/linux/iommu.h | 38 +-
 2 files changed, 34 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index e06b8a0e2b56..cd457487ce81 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -521,12 +521,8 @@ static size_t mtk_iommu_unmap(struct iommu_domain *domain,
  struct iommu_iotlb_gather *gather)
 {
struct mtk_iommu_domain *dom = to_mtk_domain(domain);
-   unsigned long end = iova + size - 1;
 
-   if (gather->start > iova)
-   gather->start = iova;
-   if (gather->end < end)
-   gather->end = end;
+   iommu_iotlb_gather_add_range(gather, iova, size);
return dom->iop->unmap(dom->iop, iova, size, gather);
 }
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 32d448050bf7..e554871db46f 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -497,6 +497,38 @@ static inline void iommu_iotlb_sync(struct iommu_domain 
*domain,
iommu_iotlb_gather_init(iotlb_gather);
 }
 
+/**
+ * iommu_iotlb_gather_add_range - Gather for address-based TLB invalidation
+ * @gather: TLB gather data
+ * @iova: start of page to invalidate
+ * @size: size of page to invalidate
+ *
+ * Helper for IOMMU drivers to build arbitrarily-sized invalidation commands
+ * where only the address range matters, and simply minimising intermediate
+ * syncs is preferred.
+ */
+static inline void iommu_iotlb_gather_add_range(struct iommu_iotlb_gather 
*gather,
+   unsigned long iova, size_t size)
+{
+   unsigned long end = iova + size - 1;
+
+   if (gather->start > iova)
+   gather->start = iova;
+   if (gather->end < end)
+   gather->end = end;
+}
+
+/**
+ * iommu_iotlb_gather_add_page - Gather for page-based TLB invalidation
+ * @domain: IOMMU domain to be invalidated
+ * @gather: TLB gather data
+ * @iova: start of page to invalidate
+ * @size: size of page to invalidate
+ *
+ * Helper for IOMMU drivers to build invalidation commands based on individual
+ * pages, or with page size/table level hints which cannot be gathered if they
+ * differ.
+ */
 static inline void iommu_iotlb_gather_add_page(struct iommu_domain *domain,
   struct iommu_iotlb_gather 
*gather,
   unsigned long iova, size_t size)
@@ -515,11 +547,7 @@ static inline void iommu_iotlb_gather_add_page(struct 
iommu_domain *domain,
gather->pgsize = size;
}
 
-   if (gather->end < end)
-   gather->end = end;
-
-   if (gather->start > start)
-   gather->start = start;
+   iommu_iotlb_gather_add_range(gather, iova, size);
 }
 
 /* PCI device grouping function */
-- 
2.25.1



[PATCH v4 2/7] iommu/amd: Do not use flush-queue when NpCache is on

2021-06-16 Thread Nadav Amit
From: Nadav Amit 

Do not use flush-queue on virtualized environments, where the NpCache
capability of the IOMMU is set. This is required to reduce
virtualization overheads.

This change follows a similar change to Intel's VT-d; a detailed
explanation of the rationale is given in commit 29b32839725f
("iommu/vt-d: Do not use flush-queue when caching-mode is on").

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Jiajun Cao 
Cc: Robin Murphy 
Cc: Lu Baolu 
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Nadav Amit 
---
 drivers/iommu/amd/init.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index d006724f4dc2..4a52d22d0d6f 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -1850,8 +1850,13 @@ static int __init iommu_init_pci(struct amd_iommu *iommu)
if (ret)
return ret;
 
-   if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE))
+   if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE)) {
+   if (!amd_iommu_unmap_flush)
+   pr_warn("IOMMU batching is disabled due to 
virtualization");
+
amd_iommu_np_cache = true;
+   amd_iommu_unmap_flush = true;
+   }
 
init_iommu_perf_ctr(iommu);
 
-- 
2.25.1



[PATCH v4 1/7] iommu/amd: Selective flush on unmap

2021-06-16 Thread Nadav Amit
From: Nadav Amit 

A recent patch attempted to enable selective page flushes on AMD IOMMU but
neglected to adapt amd_iommu_iotlb_sync() to use the selective flushes.

Adapt amd_iommu_iotlb_sync() to use selective flushes and change
amd_iommu_unmap() to collect the flushes. As a defensive measure, to
avoid potential issues as those that the Intel IOMMU driver encountered
recently, flush the page-walk caches by always setting the "pde"
parameter. This can be removed later.

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Jiajun Cao 
Cc: Robin Murphy 
Cc: Lu Baolu 
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org
Signed-off-by: Nadav Amit 
---
 drivers/iommu/amd/iommu.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 3ac42bbdefc6..3e40f6610b6a 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -2059,12 +2059,17 @@ static size_t amd_iommu_unmap(struct iommu_domain *dom, 
unsigned long iova,
 {
struct protection_domain *domain = to_pdomain(dom);
	struct io_pgtable_ops *ops = &domain->iop.iop.ops;
+   size_t r;
 
if ((amd_iommu_pgtable == AMD_IOMMU_V1) &&
(domain->iop.mode == PAGE_MODE_NONE))
return 0;
 
-   return (ops->unmap) ? ops->unmap(ops, iova, page_size, gather) : 0;
+   r = (ops->unmap) ? ops->unmap(ops, iova, page_size, gather) : 0;
+
+   iommu_iotlb_gather_add_page(dom, gather, iova, page_size);
+
+   return r;
 }
 
 static phys_addr_t amd_iommu_iova_to_phys(struct iommu_domain *dom,
@@ -2167,7 +2172,13 @@ static void amd_iommu_flush_iotlb_all(struct 
iommu_domain *domain)
 static void amd_iommu_iotlb_sync(struct iommu_domain *domain,
 struct iommu_iotlb_gather *gather)
 {
-   amd_iommu_flush_iotlb_all(domain);
+   struct protection_domain *dom = to_pdomain(domain);
+   unsigned long flags;
+
+   spin_lock_irqsave(&dom->lock, flags);
+   __domain_flush_pages(dom, gather->start, gather->end - gather->start, 1);
+   amd_iommu_domain_flush_complete(dom);
+   spin_unlock_irqrestore(&dom->lock, flags);
 }
 
 static int amd_iommu_def_domain_type(struct device *dev)
-- 
2.25.1



[PATCH v4 0/7] iommu/amd: Enable page-selective flushes

2021-06-16 Thread Nadav Amit
From: Nadav Amit 

The previous patch, commit 268aa4548277 ("iommu/amd: Page-specific
invalidations for more than one page") was supposed to enable
page-selective IOTLB flushes on AMD.

Besides the bug that was already fixed by commit a017c567915f
("iommu/amd: Fix wrong parentheses on page-specific invalidations")
there are several remaining matters to enable and benefit from
page-selective IOTLB flushes on AMD:

1. Enable selective flushes on unmap (patch 1)
2. Avoid using flush-queue on vIOMMUs (patch 2)
3. Relaxed flushes when gathering, excluding vIOMMUs (patches 3-5)
4. Syncing once on scatter-gather map operations (patch 6)
5. Breaking flushes to naturally aligned ranges on vIOMMU (patch 7)

The main difference in this version is that the logic that flushes
vIOMMU was improved based on Robin's feedback. Batching decisions are
not based on alignment anymore, but instead the flushing range is broken
into naturally aligned regions on sync. Doing so allows us to flush only
the entries that we modified with the minimal number of flushes.

Cc: Joerg Roedel 
Cc: Will Deacon 
Cc: Jiajun Cao 
Cc: Robin Murphy 
Cc: Lu Baolu 
Cc: iommu@lists.linux-foundation.org
Cc: linux-ker...@vger.kernel.org

---

v3->v4:
* Breaking flushes to naturally aligned ranges on vIOMMU [Robin]
* Removing unnecessary stubs; fixing comment [Robin]
* Removing unused variable [Yong]
* Changing pr_warn_once() to pr_warn() [Robin]
* Improving commit log [Will]

v2->v3:
* Rebase on v5.13-rc5
* Refactoring (patches 4-5) [Robin]
* Rework flush logic (patch 5): more relaxed on native
* Syncing once on scatter-gather operations (patch 6)

v1->v2:
* Rebase on v5.13-rc3

Nadav Amit (6):
  iommu/amd: Selective flush on unmap
  iommu/amd: Do not use flush-queue when NpCache is on
  iommu: Factor iommu_iotlb_gather_is_disjoint() out
  iommu/amd: Tailored gather logic for AMD
  iommu/amd: Sync once for scatter-gather operations
  iommu/amd: Use only natural aligned flushes in a VM

Robin Murphy (1):
  iommu: Improve iommu_iotlb_gather helpers

 drivers/iommu/amd/init.c  |  7 ++-
 drivers/iommu/amd/iommu.c | 96 +++
 drivers/iommu/mtk_iommu.c |  6 +--
 include/linux/iommu.h | 72 +++--
 4 files changed, 153 insertions(+), 28 deletions(-)

-- 
2.25.1



Re: [PATCH v4 5/6] iommu/dma: Simplify calls to iommu_setup_dma_ops()

2021-06-16 Thread Robin Murphy

On 2021-06-10 08:51, Jean-Philippe Brucker wrote:

dma-iommu uses the address bounds described in domain->geometry during
IOVA allocation. The address size parameters of iommu_setup_dma_ops()
are useful for describing additional limits set by the platform
firmware, but aren't needed for drivers that call this function from
probe_finalize(). The base parameter can be zero because dma-iommu
already removes the first IOVA page, and the limit parameter can be
U64_MAX because it's only checked against the domain geometry. Simplify
calls to iommu_setup_dma_ops().

Signed-off-by: Jean-Philippe Brucker 
---
  drivers/iommu/amd/iommu.c   |  9 +
  drivers/iommu/dma-iommu.c   |  4 +++-
  drivers/iommu/intel/iommu.c | 10 +-
  3 files changed, 5 insertions(+), 18 deletions(-)

diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index 94b96d81fcfd..d3123bc05c08 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1708,14 +1708,7 @@ static struct iommu_device 
*amd_iommu_probe_device(struct device *dev)
  
  static void amd_iommu_probe_finalize(struct device *dev)

  {
-   struct iommu_domain *domain;
-
-   /* Domains are initialized for this device - have a look what we ended 
up with */
-   domain = iommu_get_domain_for_dev(dev);
-   if (domain->type == IOMMU_DOMAIN_DMA)
-   iommu_setup_dma_ops(dev, IOVA_START_PFN << PAGE_SHIFT, U64_MAX);
-   else
-   set_dma_ops(dev, NULL);
+   iommu_setup_dma_ops(dev, 0, U64_MAX);
  }
  
  static void amd_iommu_release_device(struct device *dev)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index c62e19bed302..175f8eaeb5b3 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -1322,7 +1322,9 @@ void iommu_setup_dma_ops(struct device *dev, u64 
dma_base, u64 dma_limit)
if (domain->type == IOMMU_DOMAIN_DMA) {
if (iommu_dma_init_domain(domain, dma_base, dma_limit, dev))
goto out_err;
-   dev->dma_ops = &iommu_dma_ops;
+   set_dma_ops(dev, &iommu_dma_ops);
+   } else {
+   set_dma_ops(dev, NULL);


I'm not keen on moving this here, since iommu-dma only knows that its 
own ops are right for devices it *is* managing; it can't assume any 
particular ops are appropriate for devices it isn't. The idea here is 
that arch_setup_dma_ops() may have already set the appropriate ops for 
the non-IOMMU case, so if the default domain type is passthrough then we 
leave those in place.


For example, I do still plan to revisit my conversion of arch/arm 
someday, at which point I'd have to undo this for that reason.


Simplifying the base and size arguments is of course fine, but TBH I'd 
say rip the whole bloody lot out of the arch_setup_dma_ops() flow now. 
It's a considerable faff passing them around for nothing but a tenuous 
sanity check in iommu_dma_init_domain(), and now that dev->dma_range_map 
is a common thing we should expect that to give us any relevant 
limitations if we even still care.


That said, those are all things which can be fixed up later if the 
series is otherwise ready to go and there's still a chance of landing it 
for 5.14. If you do have any other reason to respin, then I think the 
x86 probe_finalize functions simply want an unconditional 
set_dma_ops(dev, NULL) before the iommu_setup_dma_ops() call.
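
Something like this (untested sketch), taking the Intel one as the
example:

	static void intel_iommu_probe_finalize(struct device *dev)
	{
		set_dma_ops(dev, NULL);
		iommu_setup_dma_ops(dev, 0, U64_MAX);
	}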


Cheers,
Robin.


}
  
  	return;

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 85f18342603c..8d866940692a 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5165,15 +5165,7 @@ static void intel_iommu_release_device(struct device 
*dev)
  
  static void intel_iommu_probe_finalize(struct device *dev)

  {
-   dma_addr_t base = IOVA_START_PFN << VTD_PAGE_SHIFT;
-   struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
-   struct dmar_domain *dmar_domain = to_dmar_domain(domain);
-
-   if (domain && domain->type == IOMMU_DOMAIN_DMA)
-   iommu_setup_dma_ops(dev, base,
-   __DOMAIN_MAX_ADDR(dmar_domain->gaw));
-   else
-   set_dma_ops(dev, NULL);
+   iommu_setup_dma_ops(dev, 0, U64_MAX);
  }
  
  static void intel_iommu_get_resv_regions(struct device *device,





Re: [PATCH v4 6/6] iommu/virtio: Enable x86 support

2021-06-16 Thread Eric Auger
Hi Jean,

On 6/10/21 9:51 AM, Jean-Philippe Brucker wrote:
> With the VIOT support in place, x86 platforms can now use the
> virtio-iommu.
>
> Because the other x86 IOMMU drivers aren't yet ready to use the
> acpi_dma_setup() path, x86 doesn't implement arch_setup_dma_ops() at the
> moment. Similarly to VT-d and AMD IOMMU, call iommu_setup_dma_ops() from
> probe_finalize().
>
> Acked-by: Joerg Roedel 
> Acked-by: Michael S. Tsirkin 
> Signed-off-by: Jean-Philippe Brucker 
Reviewed-by: Eric Auger 

Eric
> ---
>  drivers/iommu/Kconfig| 3 ++-
>  drivers/iommu/dma-iommu.c| 1 +
>  drivers/iommu/virtio-iommu.c | 8 
>  3 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index aff8a4830dd1..07b7c25cbed8 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -400,8 +400,9 @@ config HYPERV_IOMMU
>  config VIRTIO_IOMMU
>   tristate "Virtio IOMMU driver"
>   depends on VIRTIO
> - depends on ARM64
> + depends on (ARM64 || X86)
>   select IOMMU_API
> + select IOMMU_DMA
>   select INTERVAL_TREE
>   select ACPI_VIOT if ACPI
>   help
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 175f8eaeb5b3..46ed43c400cf 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1332,6 +1332,7 @@ void iommu_setup_dma_ops(struct device *dev, u64 
> dma_base, u64 dma_limit)
>pr_warn("Failed to set up IOMMU for device %s; retaining platform DMA 
> ops\n",
>dev_name(dev));
>  }
> +EXPORT_SYMBOL_GPL(iommu_setup_dma_ops);
>  
>  static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev,
>   phys_addr_t msi_addr, struct iommu_domain *domain)
> diff --git a/drivers/iommu/virtio-iommu.c b/drivers/iommu/virtio-iommu.c
> index 218fe8560e8d..77aee1207ced 100644
> --- a/drivers/iommu/virtio-iommu.c
> +++ b/drivers/iommu/virtio-iommu.c
> @@ -1026,6 +1026,13 @@ static struct iommu_device *viommu_probe_device(struct 
> device *dev)
>   return ERR_PTR(ret);
>  }
>  
> +static void viommu_probe_finalize(struct device *dev)
> +{
> +#ifndef CONFIG_ARCH_HAS_SETUP_DMA_OPS
> + iommu_setup_dma_ops(dev, 0, U64_MAX);
> +#endif
> +}
> +
>  static void viommu_release_device(struct device *dev)
>  {
>   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> @@ -1062,6 +1069,7 @@ static struct iommu_ops viommu_ops = {
>   .iova_to_phys   = viommu_iova_to_phys,
>   .iotlb_sync = viommu_iotlb_sync,
>   .probe_device   = viommu_probe_device,
> + .probe_finalize = viommu_probe_finalize,
>   .release_device = viommu_release_device,
>   .device_group   = viommu_device_group,
>   .get_resv_regions   = viommu_get_resv_regions,



Re: [PATCH v4 5/6] iommu/dma: Simplify calls to iommu_setup_dma_ops()

2021-06-16 Thread Eric Auger
Hi Jean,

On 6/10/21 9:51 AM, Jean-Philippe Brucker wrote:
> dma-iommu uses the address bounds described in domain->geometry during
> IOVA allocation. The address size parameters of iommu_setup_dma_ops()
> are useful for describing additional limits set by the platform
> firmware, but aren't needed for drivers that call this function from
> probe_finalize(). The base parameter can be zero because dma-iommu
> already removes the first IOVA page, and the limit parameter can be
> U64_MAX because it's only checked against the domain geometry. Simplify
> calls to iommu_setup_dma_ops().
>
> Signed-off-by: Jean-Philippe Brucker 
Reviewed-by: Eric Auger 

Eric
> ---
>  drivers/iommu/amd/iommu.c   |  9 +
>  drivers/iommu/dma-iommu.c   |  4 +++-
>  drivers/iommu/intel/iommu.c | 10 +-
>  3 files changed, 5 insertions(+), 18 deletions(-)
>
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 94b96d81fcfd..d3123bc05c08 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -1708,14 +1708,7 @@ static struct iommu_device 
> *amd_iommu_probe_device(struct device *dev)
>  
>  static void amd_iommu_probe_finalize(struct device *dev)
>  {
> - struct iommu_domain *domain;
> -
> - /* Domains are initialized for this device - have a look what we ended 
> up with */
> - domain = iommu_get_domain_for_dev(dev);
> - if (domain->type == IOMMU_DOMAIN_DMA)
> - iommu_setup_dma_ops(dev, IOVA_START_PFN << PAGE_SHIFT, U64_MAX);
> - else
> - set_dma_ops(dev, NULL);
> + iommu_setup_dma_ops(dev, 0, U64_MAX);
>  }
>  
>  static void amd_iommu_release_device(struct device *dev)
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index c62e19bed302..175f8eaeb5b3 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -1322,7 +1322,9 @@ void iommu_setup_dma_ops(struct device *dev, u64 
> dma_base, u64 dma_limit)
>   if (domain->type == IOMMU_DOMAIN_DMA) {
>   if (iommu_dma_init_domain(domain, dma_base, dma_limit, dev))
>   goto out_err;
> - dev->dma_ops = &iommu_dma_ops;
> + set_dma_ops(dev, &iommu_dma_ops);
> + } else {
> + set_dma_ops(dev, NULL);
>   }
>  
>   return;
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 85f18342603c..8d866940692a 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -5165,15 +5165,7 @@ static void intel_iommu_release_device(struct device 
> *dev)
>  
>  static void intel_iommu_probe_finalize(struct device *dev)
>  {
> - dma_addr_t base = IOVA_START_PFN << VTD_PAGE_SHIFT;
> - struct iommu_domain *domain = iommu_get_domain_for_dev(dev);
> - struct dmar_domain *dmar_domain = to_dmar_domain(domain);
> -
> - if (domain && domain->type == IOMMU_DOMAIN_DMA)
> - iommu_setup_dma_ops(dev, base,
> - __DOMAIN_MAX_ADDR(dmar_domain->gaw));
> - else
> - set_dma_ops(dev, NULL);
> + iommu_setup_dma_ops(dev, 0, U64_MAX);
>  }
>  
>  static void intel_iommu_get_resv_regions(struct device *device,



Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Christoph Hellwig
On Wed, Jun 16, 2021 at 11:39:07AM -0400, Boris Ostrovsky wrote:
> 
> On 6/16/21 11:35 AM, Christoph Hellwig wrote:
> > On Wed, Jun 16, 2021 at 11:33:50AM -0400, Boris Ostrovsky wrote:
> >> Isn't the expectation of virt_to_page() that it only works on 
> >> non-vmalloc'd addresses? (This is not a rhetorical question, I actually 
> >> don't know).
> > Yes.  That is why I'd suggest just doing the vmalloc_to_page or
> > virt_to_page dance in ops_helpers.c and continuing to use that.
> 
> 
> Ah, OK, so something along the lines of what I suggested. (I thought by 
> "helpers" you meant virt_to_page()).

Yes.  Just keeping it contained in the common code without duplicating it
into a xen-specific version.


Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Boris Ostrovsky


On 6/16/21 11:35 AM, Christoph Hellwig wrote:
> On Wed, Jun 16, 2021 at 11:33:50AM -0400, Boris Ostrovsky wrote:
>> Isn't the expectation of virt_to_page() that it only works on non-vmalloc'd 
>> addresses? (This is not a rhetorical question, I actually don't know).
> Yes.  That is why I'd suggest just doing the vmalloc_to_page or
> virt_to_page dance in ops_helpers.c and continuing to use that.


Ah, OK, so something along the lines of what I suggested. (I thought by 
"helpers" you meant virt_to_page()).


-boris



Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Christoph Hellwig
On Wed, Jun 16, 2021 at 11:33:50AM -0400, Boris Ostrovsky wrote:
> Isn't the expectation of virt_to_page() that it only works on non-vmalloc'd 
> addresses? (This is not a rhetorical question, I actually don't know).

Yes.  That is why I'd suggest just doing the vmalloc_to_page or
virt_to_page dance in ops_helpers.c and continuing to use that.


Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Boris Ostrovsky

On 6/16/21 10:21 AM, Christoph Hellwig wrote:
> On Wed, Jun 16, 2021 at 10:12:55AM -0400, Boris Ostrovsky wrote:
>> I wonder now whether we could avoid code duplication between here and 
>> dma_common_mmap()/dma_common_get_sgtable() and use your helper there.
>>
>>
>> Christoph, would that work?  I.e. something like
> You should not duplicate the code at all, and just make the common
> helpers work with vmalloc addresses.


Isn't the expectation of virt_to_page() that it only works on non-vmalloc'd 
addresses? (This is not a rhetorical question, I actually don't know).



-boris


Re: [PATCH v4 4/6] iommu/dma: Pass address limit rather than size to iommu_setup_dma_ops()

2021-06-16 Thread Eric Auger
Hi Jean,

On 6/10/21 9:51 AM, Jean-Philippe Brucker wrote:
> Passing a 64-bit address width to iommu_setup_dma_ops() is valid on
> virtual platforms, but isn't currently possible. The overflow check in
> iommu_dma_init_domain() prevents this even when @dma_base isn't 0. Pass
> a limit address instead of a size, so callers don't have to fake a size
> to work around the check.
>
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  include/linux/dma-iommu.h   |  4 ++--
>  arch/arm64/mm/dma-mapping.c |  2 +-
>  drivers/iommu/amd/iommu.c   |  2 +-
>  drivers/iommu/dma-iommu.c   | 12 ++--
>  drivers/iommu/intel/iommu.c |  2 +-
>  5 files changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h
> index 6e75a2d689b4..758ca4694257 100644
> --- a/include/linux/dma-iommu.h
> +++ b/include/linux/dma-iommu.h
> @@ -19,7 +19,7 @@ int iommu_get_msi_cookie(struct iommu_domain *domain, 
> dma_addr_t base);
>  void iommu_put_dma_cookie(struct iommu_domain *domain);
>  
>  /* Setup call for arch DMA mapping code */
> -void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size);
> +void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit);
>  
>  /* The DMA API isn't _quite_ the whole story, though... */
>  /*
> @@ -50,7 +50,7 @@ struct msi_msg;
>  struct device;
>  
>  static inline void iommu_setup_dma_ops(struct device *dev, u64 dma_base,
> - u64 size)
> +u64 dma_limit)
>  {
>  }
>  
> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
> index 4bf1dd3eb041..7bd1d2199141 100644
> --- a/arch/arm64/mm/dma-mapping.c
> +++ b/arch/arm64/mm/dma-mapping.c
> @@ -50,7 +50,7 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, 
> u64 size,
>  
>   dev->dma_coherent = coherent;
>   if (iommu)
> - iommu_setup_dma_ops(dev, dma_base, size);
> + iommu_setup_dma_ops(dev, dma_base, size - dma_base - 1);
I don't get size - dma_base - 1? Shouldn't the limit of the range
[dma_base, dma_base + size) be dma_base + size - 1?
>  
>  #ifdef CONFIG_XEN
>   if (xen_swiotlb_detect())
> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> index 3ac42bbdefc6..94b96d81fcfd 100644
> --- a/drivers/iommu/amd/iommu.c
> +++ b/drivers/iommu/amd/iommu.c
> @@ -1713,7 +1713,7 @@ static void amd_iommu_probe_finalize(struct device *dev)
>   /* Domains are initialized for this device - have a look what we ended 
> up with */
>   domain = iommu_get_domain_for_dev(dev);
>   if (domain->type == IOMMU_DOMAIN_DMA)
> - iommu_setup_dma_ops(dev, IOVA_START_PFN << PAGE_SHIFT, 0);
> + iommu_setup_dma_ops(dev, IOVA_START_PFN << PAGE_SHIFT, U64_MAX);
>   else
>   set_dma_ops(dev, NULL);
>  }
> diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
> index 7bcdd1205535..c62e19bed302 100644
> --- a/drivers/iommu/dma-iommu.c
> +++ b/drivers/iommu/dma-iommu.c
> @@ -319,16 +319,16 @@ static bool dev_is_untrusted(struct device *dev)
>   * iommu_dma_init_domain - Initialise a DMA mapping domain
>   * @domain: IOMMU domain previously prepared by iommu_get_dma_cookie()
>   * @base: IOVA at which the mappable address space starts
> - * @size: Size of IOVA space
> + * @limit: Last address of the IOVA space
>   * @dev: Device the domain is being initialised for
>   *
> - * @base and @size should be exact multiples of IOMMU page granularity to
> + * @base and @limit + 1 should be exact multiples of IOMMU page granularity 
> to
>   * avoid rounding surprises. If necessary, we reserve the page at address 0
>   * to ensure it is an invalid IOVA. It is safe to reinitialise a domain, but
>   * any change which could make prior IOVAs invalid will fail.
>   */
>  static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t 
> base,
> - u64 size, struct device *dev)
> +  dma_addr_t limit, struct device *dev)
>  {
>   struct iommu_dma_cookie *cookie = domain->iova_cookie;
>   unsigned long order, base_pfn;
> @@ -346,7 +346,7 @@ static int iommu_dma_init_domain(struct iommu_domain 
> *domain, dma_addr_t base,
>   /* Check the domain allows at least some access to the device... */
>   if (domain->geometry.force_aperture) {
>   if (base > domain->geometry.aperture_end ||
> - base + size <= domain->geometry.aperture_start) {
> + limit < domain->geometry.aperture_start) {
>   pr_warn("specified DMA range outside IOMMU 
> capability\n");
>   return -EFAULT;
>   }
> @@ -1308,7 +1308,7 @@ static const struct dma_map_ops iommu_dma_ops = {
>   * The IOMMU core code allocates the default DMA domain, which the underlying
>   * IOMMU driver needs to support via the dma-iommu layer.
>   */
> -void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 size)
> +void iommu_setup_dma_ops(struct device *dev, u64 dma_base, u64 dma_limit)
>  {
>   struct 

Re: [RFC PATCH] iommu: add domain->nested

2021-06-16 Thread Christoph Hellwig
On Wed, Jun 16, 2021 at 10:38:02PM +0800, Zhangfei Gao wrote:
> +++ b/include/linux/iommu.h
> @@ -87,6 +87,7 @@ struct iommu_domain {
>   void *handler_token;
>   struct iommu_domain_geometry geometry;
>   void *iova_cookie;
> + int nested;

This should probably be a bool : 1;

Also this needs a user, so please just queue up a variant of this for
the code that eventually relies on this information.
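
For illustration only (hypothetical code, invented names), the kind of
user I mean:

	/* e.g. somewhere in a driver's invalidation path */
	if (domain->nested)
		ret = handle_nested_invalidate(domain, inv_info);
	else
		ret = handle_flat_invalidate(domain, inv_info);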


[RFC PATCH] iommu: add domain->nested

2021-06-16 Thread Zhangfei Gao
Add domain->nested to indicate whether the domain is in nesting mode,
since the DOMAIN_ATTR_NESTING attribute was removed in the patches:
7876a83 iommu: remove iommu_domain_{get,set}_attr
7e14754 iommu: remove DOMAIN_ATTR_NESTING

Signed-off-by: Zhangfei Gao 
---

Nesting info is still required for vsva according to
https://patchwork.kernel.org/project/linux-arm-kernel/patch/20210301084257.945454-16-...@lst.de/

 drivers/iommu/iommu.c | 8 +++-
 include/linux/iommu.h | 1 +
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 808ab70..ba26ad0 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2684,11 +2684,17 @@ core_initcall(iommu_init);
 
 int iommu_enable_nesting(struct iommu_domain *domain)
 {
+   int ret;
+
if (domain->type != IOMMU_DOMAIN_UNMANAGED)
return -EINVAL;
if (!domain->ops->enable_nesting)
return -EINVAL;
-   return domain->ops->enable_nesting(domain);
+   ret = domain->ops->enable_nesting(domain);
+   if (!ret)
+   domain->nested = 1;
+
+   return ret;
 }
 EXPORT_SYMBOL_GPL(iommu_enable_nesting);
 
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 32d4480..179f849 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -87,6 +87,7 @@ struct iommu_domain {
void *handler_token;
struct iommu_domain_geometry geometry;
void *iova_cookie;
+   int nested;
 };
 
 enum iommu_cap {
-- 
2.7.4



Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Christoph Hellwig
On Wed, Jun 16, 2021 at 10:12:55AM -0400, Boris Ostrovsky wrote:
> I wonder now whether we could avoid code duplication between here and 
> dma_common_mmap()/dma_common_get_sgtable() and use your helper there.
> 
> 
> Christoph, would that work?  I.e. something like

You should not duplicate the code at all, and just make the common
helpers work with vmalloc addresses.


Re: [PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Boris Ostrovsky

On 6/16/21 7:42 AM, Roman Skakun wrote:
> This commit fixes an incorrect conversion from cpu_addr to a page
> address for the case where we get a virtual address that was
> allocated through xen_swiotlb_alloc_coherent() and may be mapped
> in the vmalloc range.
> In that case virt_to_page() cannot convert the address properly
> and returns an incorrect page address.
>
> We need to detect such cases and obtain the page address using
> vmalloc_to_page() instead.
>
> The reference code for mmap() and get_sgtable() was copied from
> kernel/dma/ops_helpers.c and modified to add the detection
> described above.
>
> To simplify the code, a new cpu_addr_to_page() helper was added.
>
> Signed-off-by: Roman Skakun 
> Reviewed-by: Andrii Anisov 
> ---
>  drivers/xen/swiotlb-xen.c | 42 +++
>  1 file changed, 34 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
> index 90bc5fc321bc..9331a8500547 100644
> --- a/drivers/xen/swiotlb-xen.c
> +++ b/drivers/xen/swiotlb-xen.c
> @@ -118,6 +118,14 @@ static int is_xen_swiotlb_buffer(struct device *dev, 
> dma_addr_t dma_addr)
>   return 0;
>  }
>  
> +static struct page *cpu_addr_to_page(void *cpu_addr)
> +{
> + if (is_vmalloc_addr(cpu_addr))
> + return vmalloc_to_page(cpu_addr);
> + else
> + return virt_to_page(cpu_addr);
> +}
> +
>  static int
>  xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
>  {
> @@ -337,7 +345,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
> size, void *vaddr,
>   int order = get_order(size);
>   phys_addr_t phys;
>   u64 dma_mask = DMA_BIT_MASK(32);
> - struct page *page;
> + struct page *page = cpu_addr_to_page(vaddr);
>  
>   if (hwdev && hwdev->coherent_dma_mask)
>   dma_mask = hwdev->coherent_dma_mask;
> @@ -349,11 +357,6 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t 
> size, void *vaddr,
>   /* Convert the size to actually allocated. */
>   size = 1UL << (order + XEN_PAGE_SHIFT);
>  
> - if (is_vmalloc_addr(vaddr))
> - page = vmalloc_to_page(vaddr);
> - else
> - page = virt_to_page(vaddr);
> -
>   if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
>range_straddles_page_boundary(phys, size)) &&
>   TestClearPageXenRemapped(page))
> @@ -573,7 +576,23 @@ xen_swiotlb_dma_mmap(struct device *dev, struct 
> vm_area_struct *vma,
>void *cpu_addr, dma_addr_t dma_addr, size_t size,
>unsigned long attrs)
>  {
> - return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
> + unsigned long user_count = vma_pages(vma);
> + unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
> + unsigned long off = vma->vm_pgoff;
> + struct page *page = cpu_addr_to_page(cpu_addr);
> + int ret;
> +
> + vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
> +
> + if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
> + return ret;
> +
> + if (off >= count || user_count > count - off)
> + return -ENXIO;
> +
> + return remap_pfn_range(vma, vma->vm_start,
> + page_to_pfn(page) + vma->vm_pgoff,
> + user_count << PAGE_SHIFT, vma->vm_page_prot);
>  }


I wonder now whether we could avoid code duplication between here and 
dma_common_mmap()/dma_common_get_sgtable() and use your helper there.


Christoph, would that work?  I.e. something like


diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c
index 910ae69cae77..43411c2fa47b 100644
--- a/kernel/dma/ops_helpers.c
+++ b/kernel/dma/ops_helpers.c
@@ -12,7 +12,7 @@ int dma_common_get_sgtable(struct device *dev, struct 
sg_table *sgt,
 void *cpu_addr, dma_addr_t dma_addr, size_t size,
 unsigned long attrs)
 {
-   struct page *page = virt_to_page(cpu_addr);
+   struct page *page = cpu_addr_to_page(cpu_addr);
    int ret;
 
    ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
@@ -43,7 +43,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct 
*vma,
    return -ENXIO;
 
    return remap_pfn_range(vma, vma->vm_start,
-   page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff,
+   page_to_pfn(cpu_addr_to_page(cpu_addr)) + vma->vm_pgoff,
    user_count << PAGE_SHIFT, vma->vm_page_prot);
 #else
    return -ENXIO;
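
For that to build, ops_helpers.c would of course also need the helper
itself, e.g. with the same logic as in the swiotlb-xen patch above:

	static struct page *cpu_addr_to_page(void *cpu_addr)
	{
		if (is_vmalloc_addr(cpu_addr))
			return vmalloc_to_page(cpu_addr);
		return virt_to_page(cpu_addr);
	}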


-boris


>  
>  /*
> @@ -585,7 +604,14 @@ xen_swiotlb_get_sgtable(struct device *dev, struct 
> sg_table *sgt,
>   void *cpu_addr, dma_addr_t handle, size_t size,
>   unsigned long attrs)
>  {
> - return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
> + struct page *page = cpu_addr_to_page(cpu_addr);
> + int ret;
> +
> + ret = sg_alloc_table(sgt, 1, 

Re: [PATCH] dt-bindings: Drop redundant minItems/maxItems

2021-06-16 Thread Marc Kleine-Budde
On 15.06.2021 13:15:43, Rob Herring wrote:
> If a property has an 'items' list, then a 'minItems' or 'maxItems' with the
> same size as the list is redundant and can be dropped. Note that this is DT
> schema specific behavior and not standard json-schema behavior. The tooling
> will fixup the final schema adding any unspecified minItems/maxItems.
> 
> This condition is partially checked with the meta-schema already, but
> only if both 'minItems' and 'maxItems' are equal to the 'items' length.
> An improved meta-schema is pending.
[...]
>  Documentation/devicetree/bindings/net/can/bosch,m_can.yaml  | 2 --

Acked-by: Marc Kleine-Budde 

regards,
Marc

-- 
Pengutronix e.K. | Marc Kleine-Budde   |
Embedded Linux   | https://www.pengutronix.de  |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax:   +49-5121-206917- |



[PATCH v7 02/15] iommu: Add an unmap_pages() op for IOMMU drivers

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Add a callback for IOMMU drivers to provide a path for the
IOMMU framework to call into an IOMMU driver, which can call
into the io-pgtable code, to unmap a virtually contiguous
range of pages of the same size.

For IOMMU drivers that do not specify an unmap_pages() callback,
the existing logic of unmapping memory one page block at a time
will be used.
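
As a sketch of the core-side dispatch this enables (illustrative only;
the real version is wired up later in this series):

	if (ops->unmap_pages)
		unmapped = ops->unmap_pages(domain, iova, pgsize, count,
					    iotlb_gather);
	else
		unmapped = ops->unmap(domain, iova, pgsize, iotlb_gather);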

Signed-off-by: Isaac J. Manjarres 
Suggested-by: Will Deacon 
Signed-off-by: Will Deacon 
Acked-by: Lu Baolu 
Signed-off-by: Georgi Djakov 
---
 include/linux/iommu.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 32d448050bf7..25a844121be5 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -181,6 +181,7 @@ struct iommu_iotlb_gather {
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
  * @unmap: unmap a physically contiguous memory region from an iommu domain
+ * @unmap_pages: unmap a number of pages of the same size from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
  * @iotlb_sync_map: Sync mappings created recently using @map to the hardware
  * @iotlb_sync: Flush all queued ranges from the hardware TLBs and empty flush
@@ -231,6 +232,9 @@ struct iommu_ops {
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *iotlb_gather);
+   size_t (*unmap_pages)(struct iommu_domain *domain, unsigned long iova,
+ size_t pgsize, size_t pgcount,
+ struct iommu_iotlb_gather *iotlb_gather);
void (*flush_iotlb_all)(struct iommu_domain *domain);
void (*iotlb_sync_map)(struct iommu_domain *domain, unsigned long iova,
   size_t size);


[PATCH v7 09/15] iommu/io-pgtable-arm: Prepare PTE methods for handling multiple entries

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

The PTE methods currently operate on a single entry. In preparation
for manipulating multiple PTEs in one map or unmap call, allow them
to handle multiple PTEs.
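
After this change a caller can, for example (illustration only),
install four identical leaf entries and pay for at most one cache sync:

	__arm_lpae_set_pte(ptep, pte, 4, cfg);	/* writes ptep[0..3] */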

Signed-off-by: Isaac J. Manjarres 
Suggested-by: Robin Murphy 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/io-pgtable-arm.c | 78 --
 1 file changed, 44 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index 87def58e79b5..ea66b10c04c4 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -232,20 +232,23 @@ static void __arm_lpae_free_pages(void *pages, size_t 
size,
free_pages((unsigned long)pages, get_order(size));
 }
 
-static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep,
+static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int num_entries,
struct io_pgtable_cfg *cfg)
 {
dma_sync_single_for_device(cfg->iommu_dev, __arm_lpae_dma_addr(ptep),
-  sizeof(*ptep), DMA_TO_DEVICE);
+  sizeof(*ptep) * num_entries, DMA_TO_DEVICE);
 }
 
 static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
-  struct io_pgtable_cfg *cfg)
+  int num_entries, struct io_pgtable_cfg *cfg)
 {
-   *ptep = pte;
+   int i;
+
+   for (i = 0; i < num_entries; i++)
+   ptep[i] = pte;
 
if (!cfg->coherent_walk)
-   __arm_lpae_sync_pte(ptep, cfg);
+   __arm_lpae_sync_pte(ptep, num_entries, cfg);
 }
 
 static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
@@ -255,47 +258,54 @@ static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable 
*data,
 
 static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
phys_addr_t paddr, arm_lpae_iopte prot,
-   int lvl, arm_lpae_iopte *ptep)
+   int lvl, int num_entries, arm_lpae_iopte *ptep)
 {
arm_lpae_iopte pte = prot;
+   struct io_pgtable_cfg *cfg = &data->iop.cfg;
+   size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
+   int i;
 
if (data->iop.fmt != ARM_MALI_LPAE && lvl == ARM_LPAE_MAX_LEVELS - 1)
pte |= ARM_LPAE_PTE_TYPE_PAGE;
else
pte |= ARM_LPAE_PTE_TYPE_BLOCK;
 
-   pte |= paddr_to_iopte(paddr, data);
+   for (i = 0; i < num_entries; i++)
+   ptep[i] = pte | paddr_to_iopte(paddr + i * sz, data);
 
-   __arm_lpae_set_pte(ptep, pte, &data->iop.cfg);
+   if (!cfg->coherent_walk)
+   __arm_lpae_sync_pte(ptep, num_entries, cfg);
 }
 
 static int arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
 unsigned long iova, phys_addr_t paddr,
-arm_lpae_iopte prot, int lvl,
+arm_lpae_iopte prot, int lvl, int num_entries,
 arm_lpae_iopte *ptep)
 {
-   arm_lpae_iopte pte = *ptep;
-
-   if (iopte_leaf(pte, lvl, data->iop.fmt)) {
-   /* We require an unmap first */
-   WARN_ON(!selftest_running);
-   return -EEXIST;
-   } else if (iopte_type(pte) == ARM_LPAE_PTE_TYPE_TABLE) {
-   /*
-* We need to unmap and free the old table before
-* overwriting it with a block entry.
-*/
-   arm_lpae_iopte *tblp;
-   size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
-
-   tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data);
-   if (__arm_lpae_unmap(data, NULL, iova, sz, lvl, tblp) != sz) {
-   WARN_ON(1);
-   return -EINVAL;
+   int i;
+
+   for (i = 0; i < num_entries; i++)
+   if (iopte_leaf(ptep[i], lvl, data->iop.fmt)) {
+   /* We require an unmap first */
+   WARN_ON(!selftest_running);
+   return -EEXIST;
+   } else if (iopte_type(ptep[i]) == ARM_LPAE_PTE_TYPE_TABLE) {
+   /*
+* We need to unmap and free the old table before
+* overwriting it with a block entry.
+*/
+   arm_lpae_iopte *tblp;
+   size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
+
+   tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data);
+   if (__arm_lpae_unmap(data, NULL, iova + i * sz, sz,
+lvl, tblp) != sz) {
+   WARN_ON(1);
+   return -EINVAL;
+   }
}
-   }
 
-   __arm_lpae_init_pte(data, paddr, prot, lvl, ptep);
+   __arm_lpae_init_pte(data, paddr, prot, lvl, num_entries, ptep);
return 0;
 }
 
@@ -323,7 

[PATCH v7 06/15] iommu: Split 'addr_merge' argument to iommu_pgsize() into separate parts

2021-06-16 Thread Georgi Djakov
From: Will Deacon 

The 'addr_merge' parameter to iommu_pgsize() is a fabricated address
intended to describe the alignment requirements to consider when
choosing an appropriate page size. On the iommu_map() path, this address
is the logical OR of the virtual and physical addresses.

Subsequent improvements to iommu_pgsize() will need to check the
alignment of the virtual and physical components of 'addr_merge'
independently, so pass them in as separate parameters and reconstruct
'addr_merge' locally.

No functional change.
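
For example (illustration only), a 2 MB-aligned IOVA mapped to a merely
4 KB-aligned physical address is limited by the weaker of the two
alignments, which the OR captures:

	unsigned long addr_merge = 0x200000UL | 0x201000UL; /* = 0x201000 */
	size_t max_pgsize = addr_merge & -addr_merge;	    /* = 4 KB     */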

Signed-off-by: Will Deacon 
Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/iommu.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 80e471ada358..80e14c139d40 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2375,12 +2375,13 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain 
*domain, dma_addr_t iova)
 }
 EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
 
-static size_t iommu_pgsize(struct iommu_domain *domain,
-  unsigned long addr_merge, size_t size)
+static size_t iommu_pgsize(struct iommu_domain *domain, unsigned long iova,
+  phys_addr_t paddr, size_t size)
 {
unsigned int pgsize_idx;
unsigned long pgsizes;
size_t pgsize;
+   unsigned long addr_merge = paddr | iova;
 
/* Page sizes supported by the hardware and small enough for @size */
pgsizes = domain->pgsize_bitmap & GENMASK(__fls(size), 0);
@@ -2433,7 +2434,7 @@ static int __iommu_map(struct iommu_domain *domain, 
unsigned long iova,
pr_debug("map: iova 0x%lx pa %pa size 0x%zx\n", iova, , size);
 
while (size) {
-   size_t pgsize = iommu_pgsize(domain, iova | paddr, size);
+   size_t pgsize = iommu_pgsize(domain, iova, paddr, size);
 
pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
 iova, &paddr, pgsize);
@@ -2521,8 +2522,9 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 * or we hit an area that isn't mapped.
 */
while (unmapped < size) {
-   size_t pgsize = iommu_pgsize(domain, iova, size - unmapped);
+   size_t pgsize;
 
+   pgsize = iommu_pgsize(domain, iova, iova, size - unmapped);
unmapped_page = ops->unmap(domain, iova, pgsize, iotlb_gather);
if (!unmapped_page)
break;


[PATCH v7 15/15] iommu/arm-smmu: Implement the map_pages() IOMMU driver callback

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Implement the map_pages() callback for the ARM SMMU driver
to allow calls from iommu_map to map multiple pages of
the same size in one call. Also, remove the map() callback
for the ARM SMMU driver, as it will no longer be used.

Signed-off-by: Isaac J. Manjarres 
Suggested-by: Will Deacon 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 593a15cfa8d5..c1ca3b49a620 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1193,8 +1193,9 @@ static int arm_smmu_attach_dev(struct iommu_domain 
*domain, struct device *dev)
return ret;
 }
 
-static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
-   phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+static int arm_smmu_map_pages(struct iommu_domain *domain, unsigned long iova,
+ phys_addr_t paddr, size_t pgsize, size_t pgcount,
+ int prot, gfp_t gfp, size_t *mapped)
 {
struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
@@ -1204,7 +1205,7 @@ static int arm_smmu_map(struct iommu_domain *domain, 
unsigned long iova,
return -ENODEV;
 
arm_smmu_rpm_get(smmu);
-   ret = ops->map(ops, iova, paddr, size, prot, gfp);
+   ret = ops->map_pages(ops, iova, paddr, pgsize, pgcount, prot, gfp, 
mapped);
arm_smmu_rpm_put(smmu);
 
return ret;
@@ -1574,7 +1575,7 @@ static struct iommu_ops arm_smmu_ops = {
.domain_alloc   = arm_smmu_domain_alloc,
.domain_free= arm_smmu_domain_free,
.attach_dev = arm_smmu_attach_dev,
-   .map= arm_smmu_map,
+   .map_pages  = arm_smmu_map_pages,
.unmap_pages= arm_smmu_unmap_pages,
.flush_iotlb_all= arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,


[PATCH v7 00/15] Optimizing iommu_[map/unmap] performance

2021-06-16 Thread Georgi Djakov
When unmapping a buffer from an IOMMU domain, the IOMMU framework unmaps
the buffer at a granule of the largest page size that is supported by
the IOMMU hardware and fits within the buffer. For every block that
is unmapped, the IOMMU framework will call into the IOMMU driver, and
then the io-pgtable framework to walk the page tables to find the entry
that corresponds to the IOVA, and then unmaps the entry.

This can be suboptimal in scenarios where a buffer or a piece of a
buffer can be split into several contiguous page blocks of the same size.
For example, consider an IOMMU that supports 4 KB page blocks, 2 MB page
blocks, and 1 GB page blocks, and a buffer that is 4 MB in size is being
unmapped at IOVA 0. The current call-flow will result in 4 indirect calls,
and 2 page table walks, to unmap 2 entries that are next to each other in
the page-tables, when both entries could have been unmapped in one shot
by clearing both page table entries in the same call.
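
In terms of the callbacks introduced below, that example looks like
this (illustrative only):

	/* current flow: one indirect call per 2 MB block */
	ops->unmap(domain, iova, SZ_2M, gather);
	ops->unmap(domain, iova + SZ_2M, SZ_2M, gather);

	/* new flow: both leaf entries cleared in a single call */
	ops->unmap_pages(domain, iova, SZ_2M, 2, gather);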

The same optimization is applicable to mapping buffers as well, so
these patches implement a set of callbacks called unmap_pages and
map_pages to the io-pgtable code and IOMMU drivers which unmaps or maps
an IOVA range that consists of a number of pages of the same
page size that is supported by the IOMMU hardware, and allows for
manipulating multiple page table entries in the same set of indirect
calls. The reason for introducing these callbacks is to give other IOMMU
drivers/io-pgtable formats time to change to using the new callbacks, so
that the transition to using this approach can be done piecemeal.

Changes since V6:
(https://lore.kernel.org/r/1623776913-390160-1-git-send-email-quic_c_gdj...@quicinc.com/)

* Fix compiler warning (patch 08/15)
* Free underlying page tables for large mappings (patch 10/15)
Consider the case where a buffer of 2N MB (with N > 1) is composed
entirely of 4 KB pages. This means that at the second to last level,
the buffer will have N non-leaf entries that point to page tables
with 4 KB mappings.

When the buffer is unmapped, all N entries will be cleared at the
second to last level. However, the existing logic only checks if
it needs to free the underlying page tables for the first non-leaf
entry. Therefore, the page table memory for the other entries N-1
entries will be leaked.

Fix this memory leak by ensuring that we apply the same check to
all N entries that are being unmapped.

When unmapping multiple entries, __arm_lpae_unmap() should unmap
one entry at a time and perform TLB maintenance as required for that
entry.

Changes since V5: 
(https://lore.kernel.org/r/20210408171402.12607-1-isa...@codeaurora.org/)

* Rebased on next-20210515.
* Fixed minor checkpatch warnings - indentation, extra blank lines.
* Use the correct function argument in __arm_lpae_map(). (chenxiang)

Changes since V4:

* Fixed type for addr_merge from phys_addr_t to unsigned long so
  that GENMASK() can be used.
* Hooked up arm_v7s_[unmap/map]_pages to the io-pgtable ops.
* Introduced a macro for calculating the number of page table entries
  for the ARM LPAE io-pgtable format.

Changes since V3:

* Removed usage of ULL variants of bitops from Will's patches, as
  they were not needed.
* Instead of unmapping/mapping pgcount pages, unmap_pages() and
  map_pages() will at most unmap and map pgcount pages, allowing
  for part of the pages in pgcount to be mapped and unmapped. This
  was done to simplify the handling in the io-pgtable layer.
* Extended the existing PTE manipulation methods in io-pgtable-arm
  to handle multiple entries, per Robin's suggestion, eliminating
  the need to add functions to clear multiple PTEs.
* Implemented a naive form of [map/unmap]_pages() for ARM v7s io-pgtable
  format.
* arm_[v7s/lpae]_[map/unmap] will call
  arm_[v7s/lpae]_[map_pages/unmap_pages] with an argument of 1 page.
* The arm_smmu_[map/unmap] functions have been removed, since they
  have been replaced by arm_smmu_[map/unmap]_pages.

Changes since V2:

* Added a check in __iommu_map() to check for the existence
  of either the map or map_pages callback as per Lu's suggestion.

Changes since V1:

* Implemented the map_pages() callbacks
* Integrated Will's patches into this series which
  address several concerns about how iommu_pgsize() partitioned a
  buffer (I made a minor change to the patch which changes
  iommu_pgsize() to use bitmaps by using the ULL variants of
  the bitops)

Isaac J. Manjarres (12):
  iommu/io-pgtable: Introduce unmap_pages() as a page table op
  iommu: Add an unmap_pages() op for IOMMU drivers
  iommu/io-pgtable: Introduce map_pages() as a page table op
  iommu: Add a map_pages() op for IOMMU drivers
  iommu: Add support for the map_pages() callback
  iommu/io-pgtable-arm: Prepare PTE methods for handling multiple
entries
  iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages()
  iommu/io-pgtable-arm: Implement arm_lpae_map_pages()
  iommu/io-pgtable-arm-v7s: Implement 

[PATCH v7 12/15] iommu/io-pgtable-arm-v7s: Implement arm_v7s_unmap_pages()

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Implement the unmap_pages() callback for the ARM v7s io-pgtable
format.

Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 24 +---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index d4004bcf333a..1af060686985 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -710,15 +710,32 @@ static size_t __arm_v7s_unmap(struct arm_v7s_io_pgtable 
*data,
return __arm_v7s_unmap(data, gather, iova, size, lvl + 1, ptep);
 }
 
-static size_t arm_v7s_unmap(struct io_pgtable_ops *ops, unsigned long iova,
-   size_t size, struct iommu_iotlb_gather *gather)
+static size_t arm_v7s_unmap_pages(struct io_pgtable_ops *ops, unsigned long 
iova,
+ size_t pgsize, size_t pgcount,
+ struct iommu_iotlb_gather *gather)
 {
struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
+   size_t unmapped = 0, ret;
 
if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias)))
return 0;
 
-   return __arm_v7s_unmap(data, gather, iova, size, 1, data->pgd);
+   while (pgcount--) {
+   ret = __arm_v7s_unmap(data, gather, iova, pgsize, 1, data->pgd);
+   if (!ret)
+   break;
+
+   unmapped += pgsize;
+   iova += pgsize;
+   }
+
+   return unmapped;
+}
+
+static size_t arm_v7s_unmap(struct io_pgtable_ops *ops, unsigned long iova,
+   size_t size, struct iommu_iotlb_gather *gather)
+{
+   return arm_v7s_unmap_pages(ops, iova, size, 1, gather);
 }
 
 static phys_addr_t arm_v7s_iova_to_phys(struct io_pgtable_ops *ops,
@@ -781,6 +798,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
data->iop.ops = (struct io_pgtable_ops) {
.map= arm_v7s_map,
.unmap  = arm_v7s_unmap,
+   .unmap_pages= arm_v7s_unmap_pages,
.iova_to_phys   = arm_v7s_iova_to_phys,
};
 


[PATCH v7 14/15] iommu/arm-smmu: Implement the unmap_pages() IOMMU driver callback

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Implement the unmap_pages() callback for the ARM SMMU driver
to allow calls from iommu_unmap to unmap multiple pages of
the same size in one call. Also, remove the unmap() callback
for the SMMU driver, as it will no longer be used.

Signed-off-by: Isaac J. Manjarres 
Suggested-by: Will Deacon 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 9 +
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 61233bcc4588..593a15cfa8d5 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1210,8 +1210,9 @@ static int arm_smmu_map(struct iommu_domain *domain, 
unsigned long iova,
return ret;
 }
 
-static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
-size_t size, struct iommu_iotlb_gather *gather)
+static size_t arm_smmu_unmap_pages(struct iommu_domain *domain, unsigned long 
iova,
+  size_t pgsize, size_t pgcount,
+  struct iommu_iotlb_gather *iotlb_gather)
 {
struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
struct arm_smmu_device *smmu = to_smmu_domain(domain)->smmu;
@@ -1221,7 +1222,7 @@ static size_t arm_smmu_unmap(struct iommu_domain *domain, 
unsigned long iova,
return 0;
 
arm_smmu_rpm_get(smmu);
-   ret = ops->unmap(ops, iova, size, gather);
+   ret = ops->unmap_pages(ops, iova, pgsize, pgcount, iotlb_gather);
arm_smmu_rpm_put(smmu);
 
return ret;
@@ -1574,7 +1575,7 @@ static struct iommu_ops arm_smmu_ops = {
.domain_free= arm_smmu_domain_free,
.attach_dev = arm_smmu_attach_dev,
.map= arm_smmu_map,
-   .unmap  = arm_smmu_unmap,
+   .unmap_pages= arm_smmu_unmap_pages,
.flush_iotlb_all= arm_smmu_flush_iotlb_all,
.iotlb_sync = arm_smmu_iotlb_sync,
.iova_to_phys   = arm_smmu_iova_to_phys,


[PATCH v7 08/15] iommu: Add support for the map_pages() callback

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Since iommu_pgsize can calculate how many pages of the
same size can be mapped/unmapped before the next largest
page size boundary, add support for invoking an IOMMU
driver's map_pages() callback, if it provides one.

Signed-off-by: Isaac J. Manjarres 
Suggested-by: Will Deacon 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/iommu.c | 43 +++
 1 file changed, 35 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 725622c7e603..70a729ce88b1 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2429,6 +2429,30 @@ static size_t iommu_pgsize(struct iommu_domain *domain, 
unsigned long iova,
return pgsize;
 }
 
+static int __iommu_map_pages(struct iommu_domain *domain, unsigned long iova,
+phys_addr_t paddr, size_t size, int prot,
+gfp_t gfp, size_t *mapped)
+{
+   const struct iommu_ops *ops = domain->ops;
+   size_t pgsize, count;
+   int ret;
+
+   pgsize = iommu_pgsize(domain, iova, paddr, size, &count);
+
+   pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx count %zu\n",
+iova, &paddr, pgsize, count);
+
+   if (ops->map_pages) {
+   ret = ops->map_pages(domain, iova, paddr, pgsize, count, prot,
+gfp, mapped);
+   } else {
+   ret = ops->map(domain, iova, paddr, pgsize, prot, gfp);
+   *mapped = ret ? 0 : pgsize;
+   }
+
+   return ret;
+}
+
 static int __iommu_map(struct iommu_domain *domain, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
 {
@@ -2439,7 +2463,7 @@ static int __iommu_map(struct iommu_domain *domain, 
unsigned long iova,
phys_addr_t orig_paddr = paddr;
int ret = 0;
 
-   if (unlikely(ops->map == NULL ||
+   if (unlikely(!(ops->map || ops->map_pages) ||
 domain->pgsize_bitmap == 0UL))
return -ENODEV;
 
@@ -2463,18 +2487,21 @@ static int __iommu_map(struct iommu_domain *domain, 
unsigned long iova,
pr_debug("map: iova 0x%lx pa %pa size 0x%zx\n", iova, , size);
 
while (size) {
-   size_t pgsize = iommu_pgsize(domain, iova, paddr, size, NULL);
+   size_t mapped = 0;
 
-   pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
-iova, &paddr, pgsize);
-   ret = ops->map(domain, iova, paddr, pgsize, prot, gfp);
+   ret = __iommu_map_pages(domain, iova, paddr, size, prot, gfp,
+   &mapped);
+   /*
+* Some pages may have been mapped, even if an error occurred,
+* so we should account for those so they can be unmapped.
+*/
+   size -= mapped;
 
if (ret)
break;
 
-   iova += pgsize;
-   paddr += pgsize;
-   size -= pgsize;
+   iova += mapped;
+   paddr += mapped;
}
 
/* unroll mapping in case something went wrong */


[PATCH v7 01/15] iommu/io-pgtable: Introduce unmap_pages() as a page table op

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

The io-pgtable code expects to operate on a single block or
granule of memory that is supported by the IOMMU hardware when
unmapping memory.

This means that when a large buffer that consists of multiple
such blocks is unmapped, the io-pgtable code will walk the page
tables to the correct level to unmap each block, even for blocks
that are virtually contiguous and at the same level, which can
incur an overhead in performance.

Introduce the unmap_pages() page table op to express to the
io-pgtable code that it should unmap a number of blocks of
the same size, instead of a single block. Doing so allows
multiple blocks to be unmapped in one call to the io-pgtable
code, reducing the number of page table walks, and indirect
calls.
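
For example (illustrative only), a format that implements the new op
can tear down 2 MB worth of 4 KB mappings with a single walk:

	unmapped = ops->unmap_pages(ops, iova, SZ_4K, 512, gather);
	/* instead of 512 separate ops->unmap(ops, iova, SZ_4K, gather)
	 * calls */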

Signed-off-by: Isaac J. Manjarres 
Suggested-by: Will Deacon 
Signed-off-by: Will Deacon 
Signed-off-by: Georgi Djakov 
---
 include/linux/io-pgtable.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 4d40dfa75b55..9391c5fa71e6 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -144,6 +144,7 @@ struct io_pgtable_cfg {
  *
  * @map:  Map a physically contiguous memory region.
  * @unmap:Unmap a physically contiguous memory region.
+ * @unmap_pages:  Unmap a range of virtually contiguous pages of the same size.
  * @iova_to_phys: Translate iova to physical address.
  *
  * These functions map directly onto the iommu_ops member functions with
@@ -154,6 +155,9 @@ struct io_pgtable_ops {
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
size_t (*unmap)(struct io_pgtable_ops *ops, unsigned long iova,
size_t size, struct iommu_iotlb_gather *gather);
+   size_t (*unmap_pages)(struct io_pgtable_ops *ops, unsigned long iova,
+ size_t pgsize, size_t pgcount,
+ struct iommu_iotlb_gather *gather);
phys_addr_t (*iova_to_phys)(struct io_pgtable_ops *ops,
unsigned long iova);
 };


[PATCH v7 10/15] iommu/io-pgtable-arm: Implement arm_lpae_unmap_pages()

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Implement the unmap_pages() callback for the ARM LPAE io-pgtable
format.

Signed-off-by: Isaac J. Manjarres 
Suggested-by: Will Deacon 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/io-pgtable-arm.c | 120 +
 1 file changed, 74 insertions(+), 46 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index ea66b10c04c4..fe8fa0ee9c98 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -46,6 +46,9 @@
 #define ARM_LPAE_PGD_SIZE(d)   \
(sizeof(arm_lpae_iopte) << (d)->pgd_bits)
 
+#define ARM_LPAE_PTES_PER_TABLE(d) \
+   (ARM_LPAE_GRANULE(d) >> ilog2(sizeof(arm_lpae_iopte)))
+
 /*
  * Calculate the index at level l used to map virtual address a using the
  * pagetable in d.
@@ -239,22 +242,19 @@ static void __arm_lpae_sync_pte(arm_lpae_iopte *ptep, int 
num_entries,
   sizeof(*ptep) * num_entries, DMA_TO_DEVICE);
 }
 
-static void __arm_lpae_set_pte(arm_lpae_iopte *ptep, arm_lpae_iopte pte,
-  int num_entries, struct io_pgtable_cfg *cfg)
+static void __arm_lpae_clear_pte(arm_lpae_iopte *ptep, struct io_pgtable_cfg 
*cfg)
 {
-   int i;
 
-   for (i = 0; i < num_entries; i++)
-   ptep[i] = pte;
+   *ptep = 0;
 
if (!cfg->coherent_walk)
-   __arm_lpae_sync_pte(ptep, num_entries, cfg);
+   __arm_lpae_sync_pte(ptep, 1, cfg);
 }
 
 static size_t __arm_lpae_unmap(struct arm_lpae_io_pgtable *data,
   struct iommu_iotlb_gather *gather,
-  unsigned long iova, size_t size, int lvl,
-  arm_lpae_iopte *ptep);
+  unsigned long iova, size_t size, size_t pgcount,
+  int lvl, arm_lpae_iopte *ptep);
 
 static void __arm_lpae_init_pte(struct arm_lpae_io_pgtable *data,
phys_addr_t paddr, arm_lpae_iopte prot,
@@ -298,7 +298,7 @@ static int arm_lpae_init_pte(struct arm_lpae_io_pgtable 
*data,
size_t sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
 
tblp = ptep - ARM_LPAE_LVL_IDX(iova, lvl, data);
-   if (__arm_lpae_unmap(data, NULL, iova + i * sz, sz,
+   if (__arm_lpae_unmap(data, NULL, iova + i * sz, sz, 1,
 lvl, tblp) != sz) {
WARN_ON(1);
return -EINVAL;
@@ -526,14 +526,15 @@ static size_t arm_lpae_split_blk_unmap(struct 
arm_lpae_io_pgtable *data,
   struct iommu_iotlb_gather *gather,
   unsigned long iova, size_t size,
   arm_lpae_iopte blk_pte, int lvl,
-  arm_lpae_iopte *ptep)
+  arm_lpae_iopte *ptep, size_t pgcount)
 {
struct io_pgtable_cfg *cfg = &data->iop.cfg;
arm_lpae_iopte pte, *tablep;
phys_addr_t blk_paddr;
size_t tablesz = ARM_LPAE_GRANULE(data);
size_t split_sz = ARM_LPAE_BLOCK_SIZE(lvl, data);
-   int i, unmap_idx = -1;
+   int ptes_per_table = ARM_LPAE_PTES_PER_TABLE(data);
+   int i, unmap_idx_start = -1, num_entries = 0, max_entries;
 
if (WARN_ON(lvl == ARM_LPAE_MAX_LEVELS))
return 0;
@@ -542,15 +543,18 @@ static size_t arm_lpae_split_blk_unmap(struct 
arm_lpae_io_pgtable *data,
if (!tablep)
return 0; /* Bytes unmapped */
 
-   if (size == split_sz)
-   unmap_idx = ARM_LPAE_LVL_IDX(iova, lvl, data);
+   if (size == split_sz) {
+   unmap_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data);
+   max_entries = ptes_per_table - unmap_idx_start;
+   num_entries = min_t(int, pgcount, max_entries);
+   }
 
blk_paddr = iopte_to_paddr(blk_pte, data);
pte = iopte_prot(blk_pte);
 
-   for (i = 0; i < tablesz / sizeof(pte); i++, blk_paddr += split_sz) {
+   for (i = 0; i < ptes_per_table; i++, blk_paddr += split_sz) {
/* Unmap! */
-   if (i == unmap_idx)
+   if (i >= unmap_idx_start && i < (unmap_idx_start + num_entries))
continue;
 
__arm_lpae_init_pte(data, blk_paddr, pte, lvl, 1, [i]);
@@ -568,76 +572,92 @@ static size_t arm_lpae_split_blk_unmap(struct 
arm_lpae_io_pgtable *data,
return 0;
 
tablep = iopte_deref(pte, data);
-   } else if (unmap_idx >= 0) {
-   io_pgtable_tlb_add_page(&data->iop, gather, iova, size);
-   return size;
+   } else if (unmap_idx_start >= 0) {
+   for (i = 0; i < num_entries; i++)
+   

[PATCH v7 13/15] iommu/io-pgtable-arm-v7s: Implement arm_v7s_map_pages()

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Implement the map_pages() callback for the ARM v7s io-pgtable
format.

Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 26 ++
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 1af060686985..5db90d7ce2ec 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -519,11 +519,12 @@ static int __arm_v7s_map(struct arm_v7s_io_pgtable *data, 
unsigned long iova,
return __arm_v7s_map(data, iova, paddr, size, prot, lvl + 1, cptep, 
gfp);
 }
 
-static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
-   phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+static int arm_v7s_map_pages(struct io_pgtable_ops *ops, unsigned long iova,
+phys_addr_t paddr, size_t pgsize, size_t pgcount,
+int prot, gfp_t gfp, size_t *mapped)
 {
struct arm_v7s_io_pgtable *data = io_pgtable_ops_to_data(ops);
-   int ret;
+   int ret = -EINVAL;
 
if (WARN_ON(iova >= (1ULL << data->iop.cfg.ias) ||
paddr >= (1ULL << data->iop.cfg.oas)))
@@ -533,7 +534,17 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, 
unsigned long iova,
if (!(prot & (IOMMU_READ | IOMMU_WRITE)))
return 0;
 
-   ret = __arm_v7s_map(data, iova, paddr, size, prot, 1, data->pgd, gfp);
+   while (pgcount--) {
+   ret = __arm_v7s_map(data, iova, paddr, pgsize, prot, 1, 
data->pgd,
+   gfp);
+   if (ret)
+   break;
+
+   iova += pgsize;
+   paddr += pgsize;
+   if (mapped)
+   *mapped += pgsize;
+   }
/*
 * Synchronise all PTE updates for the new mapping before there's
 * a chance for anything to kick off a table walk for the new iova.
@@ -543,6 +554,12 @@ static int arm_v7s_map(struct io_pgtable_ops *ops, 
unsigned long iova,
return ret;
 }
 
+static int arm_v7s_map(struct io_pgtable_ops *ops, unsigned long iova,
+  phys_addr_t paddr, size_t size, int prot, gfp_t gfp)
+{
+   return arm_v7s_map_pages(ops, iova, paddr, size, 1, prot, gfp, NULL);
+}
+
 static void arm_v7s_free_pgtable(struct io_pgtable *iop)
 {
struct arm_v7s_io_pgtable *data = io_pgtable_to_data(iop);
@@ -797,6 +814,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
 
data->iop.ops = (struct io_pgtable_ops) {
.map= arm_v7s_map,
+   .map_pages  = arm_v7s_map_pages,
.unmap  = arm_v7s_unmap,
.unmap_pages= arm_v7s_unmap_pages,
.iova_to_phys   = arm_v7s_iova_to_phys,
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 07/15] iommu: Hook up '->unmap_pages' driver callback

2021-06-16 Thread Georgi Djakov
From: Will Deacon 

Extend iommu_pgsize() to populate an optional 'count' parameter so that
we can direct the unmapping operation to the ->unmap_pages callback if it
has been provided by the driver.
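
A worked example (illustrative; assuming a {4 KB, 2 MB, 1 GB}
pgsize_bitmap): unmapping 4 MB at IOVA 0 now resolves to one call
covering two 2 MB entries:

	size_t count;
	size_t pgsize = iommu_pgsize(domain, 0, 0, SZ_4M, &count);
	/* pgsize == SZ_2M, count == 2, so the whole range goes to a
	 * single ->unmap_pages() invocation */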

Signed-off-by: Will Deacon 
Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/iommu.c | 59 +++
 1 file changed, 50 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 80e14c139d40..725622c7e603 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2376,11 +2376,11 @@ phys_addr_t iommu_iova_to_phys(struct iommu_domain 
*domain, dma_addr_t iova)
 EXPORT_SYMBOL_GPL(iommu_iova_to_phys);
 
 static size_t iommu_pgsize(struct iommu_domain *domain, unsigned long iova,
-  phys_addr_t paddr, size_t size)
+  phys_addr_t paddr, size_t size, size_t *count)
 {
-   unsigned int pgsize_idx;
+   unsigned int pgsize_idx, pgsize_idx_next;
unsigned long pgsizes;
-   size_t pgsize;
+   size_t offset, pgsize, pgsize_next;
unsigned long addr_merge = paddr | iova;
 
/* Page sizes supported by the hardware and small enough for @size */
@@ -2396,7 +2396,36 @@ static size_t iommu_pgsize(struct iommu_domain *domain, 
unsigned long iova,
/* Pick the biggest page size remaining */
pgsize_idx = __fls(pgsizes);
pgsize = BIT(pgsize_idx);
+   if (!count)
+   return pgsize;
 
+   /* Find the next biggest supported page size, if it exists */
+   pgsizes = domain->pgsize_bitmap & ~GENMASK(pgsize_idx, 0);
+   if (!pgsizes)
+   goto out_set_count;
+
+   pgsize_idx_next = __ffs(pgsizes);
+   pgsize_next = BIT(pgsize_idx_next);
+
+   /*
+* There's no point trying a bigger page size unless the virtual
+* and physical addresses are similarly offset within the larger page.
+*/
+   if ((iova ^ paddr) & (pgsize_next - 1))
+   goto out_set_count;
+
+   /* Calculate the offset to the next page size alignment boundary */
+   offset = pgsize_next - (addr_merge & (pgsize_next - 1));
+
+   /*
+* If size is big enough to accommodate the larger page, reduce
+* the number of smaller pages.
+*/
+   if (offset + pgsize_next <= size)
+   size = offset;
+
+out_set_count:
+   *count = size >> pgsize_idx;
return pgsize;
 }
 
@@ -2434,7 +2463,7 @@ static int __iommu_map(struct iommu_domain *domain, 
unsigned long iova,
pr_debug("map: iova 0x%lx pa %pa size 0x%zx\n", iova, , size);
 
while (size) {
-   size_t pgsize = iommu_pgsize(domain, iova, paddr, size);
+   size_t pgsize = iommu_pgsize(domain, iova, paddr, size, NULL);
 
pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
 iova, &paddr, pgsize);
@@ -2485,6 +2514,19 @@ int iommu_map_atomic(struct iommu_domain *domain, 
unsigned long iova,
 }
 EXPORT_SYMBOL_GPL(iommu_map_atomic);
 
+static size_t __iommu_unmap_pages(struct iommu_domain *domain,
+ unsigned long iova, size_t size,
+ struct iommu_iotlb_gather *iotlb_gather)
+{
+   const struct iommu_ops *ops = domain->ops;
+   size_t pgsize, count;
+
+   pgsize = iommu_pgsize(domain, iova, iova, size, &count);
+   return ops->unmap_pages ?
+  ops->unmap_pages(domain, iova, pgsize, count, iotlb_gather) :
+  ops->unmap(domain, iova, pgsize, iotlb_gather);
+}
+
 static size_t __iommu_unmap(struct iommu_domain *domain,
unsigned long iova, size_t size,
struct iommu_iotlb_gather *iotlb_gather)
@@ -2494,7 +2536,7 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
unsigned long orig_iova = iova;
unsigned int min_pagesz;
 
-   if (unlikely(ops->unmap == NULL ||
+   if (unlikely(!(ops->unmap || ops->unmap_pages) ||
 domain->pgsize_bitmap == 0UL))
return 0;
 
@@ -2522,10 +2564,9 @@ static size_t __iommu_unmap(struct iommu_domain *domain,
 * or we hit an area that isn't mapped.
 */
while (unmapped < size) {
-   size_t pgsize;
-
-   pgsize = iommu_pgsize(domain, iova, iova, size - unmapped);
-   unmapped_page = ops->unmap(domain, iova, pgsize, iotlb_gather);
+   unmapped_page = __iommu_unmap_pages(domain, iova,
+   size - unmapped,
+   iotlb_gather);
if (!unmapped_page)
break;
 


[PATCH v7 11/15] iommu/io-pgtable-arm: Implement arm_lpae_map_pages()

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Implement the map_pages() callback for the ARM LPAE io-pgtable
format.
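
One caller-side detail worth noting: on failure, part of the range may
already be mapped, and '*mapped' tells the caller how much to unwind.
A sketch (the cleanup helper is hypothetical):

static int example_map_16_pages(struct io_pgtable_ops *ops, unsigned long iova,
                                phys_addr_t paddr, int prot)
{
        size_t mapped = 0;
        int ret;

        /* Try to map 16 contiguous 4K pages in a single call */
        ret = ops->map_pages(ops, iova, paddr, SZ_4K, 16, prot, GFP_KERNEL,
                             &mapped);
        if (ret)
                my_unwind_mapping(ops, iova, mapped);   /* hypothetical */

        return ret;
}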

Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/io-pgtable-arm.c | 41 +++--
 1 file changed, 31 insertions(+), 10 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
index fe8fa0ee9c98..053df4048a29 100644
--- a/drivers/iommu/io-pgtable-arm.c
+++ b/drivers/iommu/io-pgtable-arm.c
@@ -341,20 +341,30 @@ static arm_lpae_iopte 
arm_lpae_install_table(arm_lpae_iopte *table,
 }
 
 static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, unsigned long iova,
- phys_addr_t paddr, size_t size, arm_lpae_iopte prot,
- int lvl, arm_lpae_iopte *ptep, gfp_t gfp)
+ phys_addr_t paddr, size_t size, size_t pgcount,
+ arm_lpae_iopte prot, int lvl, arm_lpae_iopte *ptep,
+ gfp_t gfp, size_t *mapped)
 {
arm_lpae_iopte *cptep, pte;
size_t block_size = ARM_LPAE_BLOCK_SIZE(lvl, data);
size_t tblsz = ARM_LPAE_GRANULE(data);
struct io_pgtable_cfg *cfg = &data->iop.cfg;
+   int ret = 0, num_entries, max_entries, map_idx_start;
 
/* Find our entry at the current level */
-   ptep += ARM_LPAE_LVL_IDX(iova, lvl, data);
+   map_idx_start = ARM_LPAE_LVL_IDX(iova, lvl, data);
+   ptep += map_idx_start;
 
/* If we can install a leaf entry at this level, then do so */
-   if (size == block_size)
-   return arm_lpae_init_pte(data, iova, paddr, prot, lvl, 1, ptep);
+   if (size == block_size) {
+   max_entries = ARM_LPAE_PTES_PER_TABLE(data) - map_idx_start;
+   num_entries = min_t(int, pgcount, max_entries);
+   ret = arm_lpae_init_pte(data, iova, paddr, prot, lvl, 
num_entries, ptep);
+   if (!ret && mapped)
+   *mapped += num_entries * size;
+
+   return ret;
+   }
 
/* We can't allocate tables at the final level */
if (WARN_ON(lvl >= ARM_LPAE_MAX_LEVELS - 1))
@@ -383,7 +393,8 @@ static int __arm_lpae_map(struct arm_lpae_io_pgtable *data, 
unsigned long iova,
}
 
/* Rinse, repeat */
-   return __arm_lpae_map(data, iova, paddr, size, prot, lvl + 1, cptep, 
gfp);
+   return __arm_lpae_map(data, iova, paddr, size, pgcount, prot, lvl + 1,
+ cptep, gfp, mapped);
 }
 
 static arm_lpae_iopte arm_lpae_prot_to_pte(struct arm_lpae_io_pgtable *data,
@@ -450,8 +461,9 @@ static arm_lpae_iopte arm_lpae_prot_to_pte(struct 
arm_lpae_io_pgtable *data,
return pte;
 }
 
-static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
-   phys_addr_t paddr, size_t size, int iommu_prot, gfp_t 
gfp)
+static int arm_lpae_map_pages(struct io_pgtable_ops *ops, unsigned long iova,
+ phys_addr_t paddr, size_t pgsize, size_t pgcount,
+ int iommu_prot, gfp_t gfp, size_t *mapped)
 {
struct arm_lpae_io_pgtable *data = io_pgtable_ops_to_data(ops);
struct io_pgtable_cfg *cfg = &data->iop.cfg;
@@ -460,7 +472,7 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, 
unsigned long iova,
arm_lpae_iopte prot;
long iaext = (s64)iova >> cfg->ias;
 
-   if (WARN_ON(!size || (size & cfg->pgsize_bitmap) != size))
+   if (WARN_ON(!pgsize || (pgsize & cfg->pgsize_bitmap) != pgsize))
return -EINVAL;
 
if (cfg->quirks & IO_PGTABLE_QUIRK_ARM_TTBR1)
@@ -473,7 +485,8 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, 
unsigned long iova,
return 0;
 
prot = arm_lpae_prot_to_pte(data, iommu_prot);
-   ret = __arm_lpae_map(data, iova, paddr, size, prot, lvl, ptep, gfp);
+   ret = __arm_lpae_map(data, iova, paddr, pgsize, pgcount, prot, lvl,
+ptep, gfp, mapped);
/*
 * Synchronise all PTE updates for the new mapping before there's
 * a chance for anything to kick off a table walk for the new iova.
@@ -483,6 +496,13 @@ static int arm_lpae_map(struct io_pgtable_ops *ops, 
unsigned long iova,
return ret;
 }
 
+static int arm_lpae_map(struct io_pgtable_ops *ops, unsigned long iova,
+   phys_addr_t paddr, size_t size, int iommu_prot, gfp_t 
gfp)
+{
+   return arm_lpae_map_pages(ops, iova, paddr, size, 1, iommu_prot, gfp,
+ NULL);
+}
+
 static void __arm_lpae_free_pgtable(struct arm_lpae_io_pgtable *data, int lvl,
arm_lpae_iopte *ptep)
 {
@@ -787,6 +807,7 @@ arm_lpae_alloc_pgtable(struct io_pgtable_cfg *cfg)
 
data->iop.ops = (struct io_pgtable_ops) {
.map= arm_lpae_map,
+   .map_pages  = arm_lpae_map_pages,
.unmap  = arm_lpae_unmap,

[PATCH v7 05/15] iommu: Use bitmap to calculate page size in iommu_pgsize()

2021-06-16 Thread Georgi Djakov
From: Will Deacon 

Avoid the potential for shifting values by amounts greater than the
width of their type by using a bitmap to compute page size in
iommu_pgsize().
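
To see the failure mode the bitmap avoids, consider a standalone
userspace sketch (assuming a 64-bit unsigned long): if the top bit of
'size' is set, the old mask computation shifts by the full word width,
which is undefined behaviour in C.

#include <stdio.h>

int main(void)
{
        unsigned long size = 1UL << 63;                 /* top bit set */
        unsigned int pgsize_idx = 63 - __builtin_clzl(size); /* __fls() == 63 */

        /* Old: (1UL << (pgsize_idx + 1)) - 1 would shift by 64 -- undefined */
        /* New: GENMASK(pgsize_idx, 0) builds the mask without that shift   */
        unsigned long mask = ~0UL >> (63 - pgsize_idx);

        printf("mask=%#lx\n", mask);                    /* all ones */
        return 0;
}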

Signed-off-by: Will Deacon 
Signed-off-by: Isaac J. Manjarres 
Signed-off-by: Georgi Djakov 
---
 drivers/iommu/iommu.c | 31 ---
 1 file changed, 12 insertions(+), 19 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 5419c4b9f27a..80e471ada358 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -8,6 +8,7 @@
 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -2378,30 +2379,22 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
   unsigned long addr_merge, size_t size)
 {
unsigned int pgsize_idx;
+   unsigned long pgsizes;
size_t pgsize;
 
-   /* Max page size that still fits into 'size' */
-   pgsize_idx = __fls(size);
+   /* Page sizes supported by the hardware and small enough for @size */
+   pgsizes = domain->pgsize_bitmap & GENMASK(__fls(size), 0);
 
-   /* need to consider alignment requirements ? */
-   if (likely(addr_merge)) {
-   /* Max page size allowed by address */
-   unsigned int align_pgsize_idx = __ffs(addr_merge);
-   pgsize_idx = min(pgsize_idx, align_pgsize_idx);
-   }
-
-   /* build a mask of acceptable page sizes */
-   pgsize = (1UL << (pgsize_idx + 1)) - 1;
-
-   /* throw away page sizes not supported by the hardware */
-   pgsize &= domain->pgsize_bitmap;
+   /* Constrain the page sizes further based on the maximum alignment */
+   if (likely(addr_merge))
+   pgsizes &= GENMASK(__ffs(addr_merge), 0);
 
-   /* make sure we're still sane */
-   BUG_ON(!pgsize);
+   /* Make sure we have at least one suitable page size */
+   BUG_ON(!pgsizes);
 
-   /* pick the biggest page */
-   pgsize_idx = __fls(pgsize);
-   pgsize = 1UL << pgsize_idx;
+   /* Pick the biggest page size remaining */
+   pgsize_idx = __fls(pgsizes);
+   pgsize = BIT(pgsize_idx);
 
return pgsize;
 }


[PATCH v7 04/15] iommu: Add a map_pages() op for IOMMU drivers

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Add a callback for IOMMU drivers to provide a path for the
IOMMU framework to call into an IOMMU driver, which can
call into the io-pgtable code, to map a physically contiguous
range of pages of the same size.

For IOMMU drivers that do not specify a map_pages() callback,
the existing logic of mapping memory one page block at a time
will be used.
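
For illustration, a driver-side implementation can be as thin as the
sketch below (the "my_*" names are hypothetical, not from an in-tree
driver); it simply forwards the whole range to the io-pgtable code:

static int my_iommu_map_pages(struct iommu_domain *domain, unsigned long iova,
                              phys_addr_t paddr, size_t pgsize, size_t pgcount,
                              int prot, gfp_t gfp, size_t *mapped)
{
        struct my_iommu_domain *dom = to_my_iommu_domain(domain);
        struct io_pgtable_ops *ops = dom->pgtbl_ops;

        if (!ops)
                return -ENODEV;

        /* Forward the whole range to the io-pgtable code in one call */
        return ops->map_pages(ops, iova, paddr, pgsize, pgcount, prot, gfp,
                              mapped);
}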

Signed-off-by: Isaac J. Manjarres 
Suggested-by: Will Deacon 
Acked-by: Lu Baolu 
Signed-off-by: Georgi Djakov 
---
 include/linux/iommu.h | 5 +
 1 file changed, 5 insertions(+)

diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 25a844121be5..d7989d4a7404 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -180,6 +180,8 @@ struct iommu_iotlb_gather {
  * @attach_dev: attach device to an iommu domain
  * @detach_dev: detach device from an iommu domain
  * @map: map a physically contiguous memory region to an iommu domain
+ * @map_pages: map a physically contiguous set of pages of the same size to
+ * an iommu domain.
  * @unmap: unmap a physically contiguous memory region from an iommu domain
  * @unmap_pages: unmap a number of pages of the same size from an iommu domain
  * @flush_iotlb_all: Synchronously flush all hardware TLBs for this domain
@@ -230,6 +232,9 @@ struct iommu_ops {
void (*detach_dev)(struct iommu_domain *domain, struct device *dev);
int (*map)(struct iommu_domain *domain, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+   int (*map_pages)(struct iommu_domain *domain, unsigned long iova,
+phys_addr_t paddr, size_t pgsize, size_t pgcount,
+int prot, gfp_t gfp, size_t *mapped);
size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 size_t size, struct iommu_iotlb_gather *iotlb_gather);
size_t (*unmap_pages)(struct iommu_domain *domain, unsigned long iova,


[PATCH v7 03/15] iommu/io-pgtable: Introduce map_pages() as a page table op

2021-06-16 Thread Georgi Djakov
From: "Isaac J. Manjarres" 

Mapping memory into io-pgtables follows the same semantics
that unmapping memory used to follow (i.e. a buffer will be
mapped one page block per call to the io-pgtable code). This
means that it can be optimized in the same way that unmapping
memory was, so add a map_pages() callback to the io-pgtable
ops structure, so that a range of pages of the same size
can be mapped within the same call.
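
Semantically, a map_pages() implementation must behave like pgcount
calls of map() over the range, which is what callers effectively do
today one call at a time. A minimal sketch of that equivalence
(illustrative only, not the actual kernel fallback):

static int map_pages_fallback(struct io_pgtable_ops *ops, unsigned long iova,
                              phys_addr_t paddr, size_t pgsize, size_t pgcount,
                              int prot, gfp_t gfp, size_t *mapped)
{
        int ret;

        while (pgcount--) {
                ret = ops->map(ops, iova, paddr, pgsize, prot, gfp);
                if (ret)
                        return ret;     /* '*mapped' reports partial progress */

                iova += pgsize;
                paddr += pgsize;
                *mapped += pgsize;
        }

        return 0;
}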

Signed-off-by: Isaac J. Manjarres 
Suggested-by: Will Deacon 
Signed-off-by: Georgi Djakov 
---
 include/linux/io-pgtable.h | 4 
 1 file changed, 4 insertions(+)

diff --git a/include/linux/io-pgtable.h b/include/linux/io-pgtable.h
index 9391c5fa71e6..c43f3b899d2a 100644
--- a/include/linux/io-pgtable.h
+++ b/include/linux/io-pgtable.h
@@ -143,6 +143,7 @@ struct io_pgtable_cfg {
  * struct io_pgtable_ops - Page table manipulation API for IOMMU drivers.
  *
  * @map:  Map a physically contiguous memory region.
+ * @map_pages:Map a physically contiguous range of pages of the same size.
  * @unmap:Unmap a physically contiguous memory region.
  * @unmap_pages:  Unmap a range of virtually contiguous pages of the same size.
  * @iova_to_phys: Translate iova to physical address.
@@ -153,6 +154,9 @@ struct io_pgtable_cfg {
 struct io_pgtable_ops {
int (*map)(struct io_pgtable_ops *ops, unsigned long iova,
   phys_addr_t paddr, size_t size, int prot, gfp_t gfp);
+   int (*map_pages)(struct io_pgtable_ops *ops, unsigned long iova,
+phys_addr_t paddr, size_t pgsize, size_t pgcount,
+int prot, gfp_t gfp, size_t *mapped);
size_t (*unmap)(struct io_pgtable_ops *ops, unsigned long iova,
size_t size, struct iommu_iotlb_gather *gather);
size_t (*unmap_pages)(struct io_pgtable_ops *ops, unsigned long iova,


Re: [PATCH v4 3/6] ACPI: Add driver for the VIOT table

2021-06-16 Thread Eric Auger
Hi Jean,

On 6/10/21 9:51 AM, Jean-Philippe Brucker wrote:
> The ACPI Virtual I/O Translation Table describes topology of
> para-virtual platforms, similarly to vendor tables DMAR, IVRS and IORT.
> For now it describes the relation between virtio-iommu and the endpoints
> it manages.
>
> Three steps are needed to configure DMA of endpoints:
>
> (1) acpi_viot_init(): parse the VIOT table, find or create the fwnode
> associated to each vIOMMU device.
>
> (2) When probing the vIOMMU device, the driver registers its IOMMU ops
> within the IOMMU subsystem. This step doesn't require any
> intervention from the VIOT driver.
>
> (3) viot_iommu_configure(): before binding the endpoint to a driver,
> find the associated IOMMU ops. Register them, along with the
> endpoint ID, into the device's iommu_fwspec.
>
> If step (3) happens before step (2), it is deferred until the IOMMU is
> initialized, then retried.
>
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  drivers/acpi/Kconfig  |   3 +
>  drivers/iommu/Kconfig |   1 +
>  drivers/acpi/Makefile |   2 +
>  include/linux/acpi_viot.h |  19 ++
>  drivers/acpi/bus.c|   2 +
>  drivers/acpi/scan.c   |   3 +
>  drivers/acpi/viot.c   | 364 ++
>  MAINTAINERS   |   8 +
>  8 files changed, 402 insertions(+)
>  create mode 100644 include/linux/acpi_viot.h
>  create mode 100644 drivers/acpi/viot.c
>
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index eedec61e3476..3758c6940ed7 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -526,6 +526,9 @@ endif
>  
>  source "drivers/acpi/pmic/Kconfig"
>  
> +config ACPI_VIOT
> + bool
> +
>  endif# ACPI
>  
>  config X86_PM_TIMER
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 1f111b399bca..aff8a4830dd1 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -403,6 +403,7 @@ config VIRTIO_IOMMU
>   depends on ARM64
>   select IOMMU_API
>   select INTERVAL_TREE
> + select ACPI_VIOT if ACPI
>   help
> Para-virtualised IOMMU driver with virtio.
>  
> diff --git a/drivers/acpi/Makefile b/drivers/acpi/Makefile
> index 700b41adf2db..a6e644c48987 100644
> --- a/drivers/acpi/Makefile
> +++ b/drivers/acpi/Makefile
> @@ -118,3 +118,5 @@ video-objs+= acpi_video.o 
> video_detect.o
>  obj-y+= dptf/
>  
>  obj-$(CONFIG_ARM64)  += arm64/
> +
> +obj-$(CONFIG_ACPI_VIOT)  += viot.o
> diff --git a/include/linux/acpi_viot.h b/include/linux/acpi_viot.h
> new file mode 100644
> index ..1eb8ee5b0e5f
> --- /dev/null
> +++ b/include/linux/acpi_viot.h
> @@ -0,0 +1,19 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +
> +#ifndef __ACPI_VIOT_H__
> +#define __ACPI_VIOT_H__
> +
> +#include 
> +
> +#ifdef CONFIG_ACPI_VIOT
> +void __init acpi_viot_init(void);
> +int viot_iommu_configure(struct device *dev);
> +#else
> +static inline void acpi_viot_init(void) {}
> +static inline int viot_iommu_configure(struct device *dev)
> +{
> + return -ENODEV;
> +}
> +#endif
> +
> +#endif /* __ACPI_VIOT_H__ */
> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
> index be7da23fad76..b835ca702ff0 100644
> --- a/drivers/acpi/bus.c
> +++ b/drivers/acpi/bus.c
> @@ -27,6 +27,7 @@
>  #include 
>  #endif
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1339,6 +1340,7 @@ static int __init acpi_init(void)
>   pci_mmcfg_late_init();
>   acpi_iort_init();
>   acpi_scan_init();
> + acpi_viot_init();
>   acpi_ec_init();
>   acpi_debugfs_init();
>   acpi_sleep_proc_init();
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 0c53c8533300..4fa684fdfda8 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -9,6 +9,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -1556,6 +1557,8 @@ static const struct iommu_ops 
> *acpi_iommu_configure_id(struct device *dev,
>   return ops;
>  
>   err = iort_iommu_configure_id(dev, id_in);
> + if (err && err != -EPROBE_DEFER)
> + err = viot_iommu_configure(dev);
>  
>   /*
>* If we have reason to believe the IOMMU driver missed the initial
> diff --git a/drivers/acpi/viot.c b/drivers/acpi/viot.c
> new file mode 100644
> index ..892cd9fa7b6d
> --- /dev/null
> +++ b/drivers/acpi/viot.c
> @@ -0,0 +1,364 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Virtual I/O topology
> + *
> + * The Virtual I/O Translation Table (VIOT) describes the topology of
> + * para-virtual IOMMUs and the endpoints they manage. The OS uses it to
> + * initialize devices in the right order, preventing endpoints from issuing 
> DMA
> + * before their IOMMU is ready.
> + *
> + * When binding a driver to a device, before calling the device driver's 
> probe()
> + * method, the driver infrastructure calls 

Re: [PATCH v4 0/6] Add support for ACPI VIOT

2021-06-16 Thread Eric Auger
Hi Jean,

On 6/10/21 9:51 AM, Jean-Philippe Brucker wrote:
> Add a driver for the ACPI VIOT table, which provides topology
> information for para-virtual IOMMUs. Enable virtio-iommu on
> non-devicetree platforms, including x86.
>
> Since v3 [1] I fixed a build bug for !CONFIG_IOMMU_API. Joerg offered to
> take this series through the IOMMU tree, which requires Acks for patches
> 1-3.
>
> You can find a QEMU implementation at [2], with extra support for
> testing all VIOT nodes including MMIO-based endpoints and IOMMU.
> This series is at [3].
>
> [1] 
> https://lore.kernel.org/linux-iommu/2021060215.1077006-1-jean-phili...@linaro.org/
> [2] https://jpbrucker.net/git/qemu/log/?h=virtio-iommu/acpi
> [3] https://jpbrucker.net/git/linux/log/?h=virtio-iommu/acpi

I tested the series on both aarch64 and x86_64 with qemu. It works for me.
Feel free to add my T-b.

Tested-by: Eric Auger 

Thanks

Eric

>
>
> Jean-Philippe Brucker (6):
>   ACPI: arm64: Move DMA setup operations out of IORT
>   ACPI: Move IOMMU setup code out of IORT
>   ACPI: Add driver for the VIOT table
>   iommu/dma: Pass address limit rather than size to
> iommu_setup_dma_ops()
>   iommu/dma: Simplify calls to iommu_setup_dma_ops()
>   iommu/virtio: Enable x86 support
>
>  drivers/acpi/Kconfig |   3 +
>  drivers/iommu/Kconfig|   4 +-
>  drivers/acpi/Makefile|   2 +
>  drivers/acpi/arm64/Makefile  |   1 +
>  include/acpi/acpi_bus.h  |   3 +
>  include/linux/acpi.h |   3 +
>  include/linux/acpi_iort.h|  14 +-
>  include/linux/acpi_viot.h|  19 ++
>  include/linux/dma-iommu.h|   4 +-
>  arch/arm64/mm/dma-mapping.c  |   2 +-
>  drivers/acpi/arm64/dma.c |  50 +
>  drivers/acpi/arm64/iort.c| 129 ++---
>  drivers/acpi/bus.c   |   2 +
>  drivers/acpi/scan.c  |  78 +++-
>  drivers/acpi/viot.c  | 364 +++
>  drivers/iommu/amd/iommu.c|   9 +-
>  drivers/iommu/dma-iommu.c|  17 +-
>  drivers/iommu/intel/iommu.c  |  10 +-
>  drivers/iommu/virtio-iommu.c |   8 +
>  MAINTAINERS  |   8 +
>  20 files changed, 580 insertions(+), 150 deletions(-)
>  create mode 100644 include/linux/acpi_viot.h
>  create mode 100644 drivers/acpi/arm64/dma.c
>  create mode 100644 drivers/acpi/viot.c
>



Re: [PATCH] dt-bindings: Drop redundant minItems/maxItems

2021-06-16 Thread Jonathan Cameron
On Tue, 15 Jun 2021 13:15:43 -0600
Rob Herring  wrote:

> If a property has an 'items' list, then a 'minItems' or 'maxItems' with the
> same size as the list is redundant and can be dropped. Note that this is DT
> schema specific behavior and not standard json-schema behavior. The tooling
> will fixup the final schema adding any unspecified minItems/maxItems.
> 
> This condition is partially checked with the meta-schema already, but
> only if both 'minItems' and 'maxItems' are equal to the 'items' length.
> An improved meta-schema is pending.
> 

...

>  .../devicetree/bindings/iio/adc/amlogic,meson-saradc.yaml   | 1 -

For this one, the fact it overrides maxItems elsewhere makes this a little
bit odd.  I guess we can get used to it being implicit.

>  .../devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.yaml | 2 --

Acked-by: Jonathan Cameron 




Re: [PATCH v12 00/12] Restricted DMA

2021-06-16 Thread Will Deacon
Hi Claire,

On Wed, Jun 16, 2021 at 02:21:45PM +0800, Claire Chang wrote:
> This series implements mitigations for lack of DMA access control on
> systems without an IOMMU, which could result in the DMA accessing the
> system memory at unexpected times and/or unexpected addresses, possibly
> leading to data leakage or corruption.
> 
> For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is
> not behind an IOMMU. As PCI-e, by design, gives the device full access to
> system memory, a vulnerability in the Wi-Fi firmware could easily escalate
> to a full system exploit (remote wifi exploits: [1a], [1b] that shows a
> full chain of exploits; [2], [3]).
> 
> To mitigate the security concerns, we introduce restricted DMA. Restricted
> DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a
> specially allocated region and does memory allocation from the same region.
> The feature on its own provides a basic level of protection against the DMA
> overwriting buffer contents at unexpected times. However, to protect
> against general data leakage and system memory corruption, the system needs
> to provide a way to restrict the DMA to a predefined memory region (this is
> usually done at firmware level, e.g. MPU in ATF on some ARM platforms [4]).
> 
> [1a] 
> https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html
> [1b] 
> https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html
> [2] https://blade.tencent.com/en/advisories/qualpwn/
> [3] 
> https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/
> [4] 
> https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132
> 
> v12:
> Split is_dev_swiotlb_force into is_swiotlb_force_bounce (patch 06/12) and
> is_swiotlb_for_alloc (patch 09/12)

I took this for a spin in an arm64 KVM guest with virtio devices using the
DMA API and it works as expected on top of swiotlb devel/for-linus-5.14, so:

Tested-by: Will Deacon 

Thanks!

Will


[PATCH 9/9] memory: mtk-smi: mt8195: Add initial setting for smi-larb

2021-06-16 Thread Yong Wu
To improve performance, add some initial settings for the smi-larbs.
There are two parts:
1) Each port has its own ostd (outstanding) value within each larb;
   the per-larb ostd tables are zero-terminated, so programming stops
   at the first zero entry.
2) Two general settings for each larb.

On some SoCs these settings may be changed dynamically for special
cases like 4K, but this initial setting is enough on mt8195.

Signed-off-by: Yong Wu 
---
 drivers/memory/mtk-smi.c | 74 +++-
 1 file changed, 73 insertions(+), 1 deletion(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 08b28e96fd8c..33f497b58f7b 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -32,6 +32,14 @@
 #define SMI_DUMMY  0x444
 
 /* SMI LARB */
+#define SMI_LARB_CMD_THRT_CON  0x24
+#define SMI_LARB_THRT_EN   0x370256
+
+#define SMI_LARB_SW_FLAG   0x40
+#define SMI_LARB_SW_FLAG_1 0x1
+
+#define SMI_LARB_OSTDL_PORT0x200
+#define SMI_LARB_OSTDL_PORTx(id)   (SMI_LARB_OSTDL_PORT + (((id) & 0x1f) 
<< 2))
 
 /* Below are about mmu enable registers, they are different in SoCs */
 /* mt2701 */
@@ -67,6 +75,11 @@
 })
 
 #define SMI_COMMON_INIT_REGS_NR6
+#define SMI_LARB_PORT_NR_MAX   32
+
+#define MTK_SMI_FLAG_LARB_THRT_EN  BIT(0)
+#define MTK_SMI_FLAG_LARB_SW_FLAG  BIT(1)
+#define MTK_SMI_CAPS(flags, _x)(!!((flags) & (_x)))
 
 struct mtk_smi_reg_pair {
unsigned intoffset;
@@ -100,6 +113,8 @@ struct mtk_smi_larb_gen {
int port_in_larb[MTK_LARB_NR_MAX + 1];
void (*config_port)(struct device *dev);
unsigned intlarb_direct_to_common_mask;
+   const u8(*ostd)[SMI_LARB_PORT_NR_MAX];
+   unsigned intflags_general;
 };
 
 struct mtk_smi {
@@ -187,12 +202,22 @@ static void mtk_smi_larb_config_port_mt8173(struct device 
*dev)
 static void mtk_smi_larb_config_port_gen2_general(struct device *dev)
 {
struct mtk_smi_larb *larb = dev_get_drvdata(dev);
-   u32 reg;
+   u32 reg, flags_general = larb->larb_gen->flags_general;
+   const u8 *larbostd = larb->larb_gen->ostd[larb->larbid];
int i;
 
if (BIT(larb->larbid) & larb->larb_gen->larb_direct_to_common_mask)
return;
 
+   if (MTK_SMI_CAPS(flags_general, MTK_SMI_FLAG_LARB_THRT_EN))
+   writel_relaxed(SMI_LARB_THRT_EN, larb->base + 
SMI_LARB_CMD_THRT_CON);
+
+   if (MTK_SMI_CAPS(flags_general, MTK_SMI_FLAG_LARB_SW_FLAG))
+   writel_relaxed(SMI_LARB_SW_FLAG_1, larb->base + 
SMI_LARB_SW_FLAG);
+
+   for (i = 0; i < SMI_LARB_PORT_NR_MAX && larbostd && !!larbostd[i]; i++)
+   writel_relaxed(larbostd[i], larb->base + 
SMI_LARB_OSTDL_PORTx(i));
+
for_each_set_bit(i, (unsigned long *)larb->mmu, 32) {
reg = readl_relaxed(larb->base + SMI_LARB_NONSEC_CON(i));
reg |= F_MMU_EN;
@@ -263,6 +288,51 @@ static const struct component_ops 
mtk_smi_larb_component_ops = {
.unbind = mtk_smi_larb_unbind,
 };
 
+static const u8 mtk_smi_larb_mt8195_ostd[][SMI_LARB_PORT_NR_MAX] = {
+   [0] = {0x0a, 0xc, 0x22, 0x22, 0x01, 0x0a,}, /* larb0 */
+   [1] = {0x0a, 0xc, 0x22, 0x22, 0x01, 0x0a,}, /* larb1 */
+   [2] = {0x12, 0x12, 0x12, 0x12, 0x0a,},  /* ... */
+   [3] = {0x12, 0x12, 0x12, 0x12, 0x28, 0x28, 0x0a,},
+   [4] = {0x06, 0x01, 0x17, 0x06, 0x0a,},
+   [5] = {0x06, 0x01, 0x17, 0x06, 0x06, 0x01, 0x06, 0x0a,},
+   [6] = {0x06, 0x01, 0x06, 0x0a,},
+   [7] = {0x0c, 0x0c, 0x12,},
+   [8] = {0x0c, 0x0c, 0x12,},
+   [9] = {0x0a, 0x08, 0x04, 0x06, 0x01, 0x01, 0x10, 0x18, 0x11, 0x0a,
+   0x08, 0x04, 0x11, 0x06, 0x02, 0x06, 0x01, 0x11, 0x11, 0x06,},
+   [10] = {0x18, 0x08, 0x01, 0x01, 0x20, 0x12, 0x18, 0x06, 0x05, 0x10,
+   0x08, 0x08, 0x10, 0x08, 0x08, 0x18, 0x0c, 0x09, 0x0b, 0x0d,
+   0x0d, 0x06, 0x10, 0x10,},
+   [11] = {0x0e, 0x0e, 0x0e, 0x0e, 0x0e, 0x0e, 0x01, 0x01, 0x01, 0x01,},
+   [12] = {0x09, 0x09, 0x05, 0x05, 0x0c, 0x18, 0x02, 0x02, 0x04, 0x02,},
+   [13] = {0x02, 0x02, 0x12, 0x12, 0x02, 0x02, 0x02, 0x02, 0x08, 0x01,},
+   [14] = {0x12, 0x12, 0x02, 0x02, 0x02, 0x02, 0x16, 0x01, 0x16, 0x01,
+   0x01, 0x02, 0x02, 0x08, 0x02,},
+   [15] = {}, /* */
+   [16] = {0x28, 0x02, 0x02, 0x12, 0x02, 0x12, 0x10, 0x02, 0x02, 0x0a,
+   0x12, 0x02, 0x0a, 0x16, 0x02, 0x04,},
+   [17] = {0x1a, 0x0e, 0x0a, 0x0a, 0x0c, 0x0e, 0x10,},
+   [18] = {0x12, 0x06, 0x12, 0x06,},
+   [19] = {0x01, 0x04, 0x01, 0x01, 0x01, 0x01, 0x01, 0x04, 0x04, 0x01,
+   0x01, 0x01, 0x04, 0x0a, 0x06, 0x01, 0x01, 0x01, 0x0a, 0x06,
+   0x01, 0x01, 0x05, 0x03, 0x03, 0x04, 0x01,},
+   [20] = {0x01, 0x04, 0x01, 0x01, 0x01, 0x01, 0x01, 0x04, 0x04, 0x01,
+   0x01, 0x01, 0x04, 0x0a, 0x06, 0x01, 0x01, 0x01, 0x0a, 0x06,
+   

[PATCH 7/9] memory: mtk-smi: mt8195: Add smi support

2021-06-16 Thread Yong Wu
mt8195 has two smi-common instances. The IP is the same; only the larbs
that connect to each smi-common differ, so the bus_sel value is
different for the two smi-commons.

Signed-off-by: Yong Wu 
---
 drivers/memory/mtk-smi.c | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index fa3a14605dc2..8b1bfef47ecd 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -286,6 +286,10 @@ static const struct mtk_smi_larb_gen mtk_smi_larb_mt8192 = 
{
.config_port= mtk_smi_larb_config_port_gen2_general,
 };
 
+static const struct mtk_smi_larb_gen mtk_smi_larb_mt8195 = {
+   .config_port= mtk_smi_larb_config_port_gen2_general,
+};
+
 static const struct of_device_id mtk_smi_larb_of_ids[] = {
{.compatible = "mediatek,mt2701-smi-larb", .data = 
_smi_larb_mt2701},
{.compatible = "mediatek,mt2712-smi-larb", .data = 
_smi_larb_mt2712},
@@ -294,6 +298,7 @@ static const struct of_device_id mtk_smi_larb_of_ids[] = {
{.compatible = "mediatek,mt8173-smi-larb", .data = 
_smi_larb_mt8173},
{.compatible = "mediatek,mt8183-smi-larb", .data = 
_smi_larb_mt8183},
{.compatible = "mediatek,mt8192-smi-larb", .data = 
_smi_larb_mt8192},
+   {.compatible = "mediatek,mt8195-smi-larb", .data = 
_smi_larb_mt8195},
{}
 };
 
@@ -408,6 +413,21 @@ static const struct mtk_smi_common_plat 
mtk_smi_common_mt8192 = {
F_MMU1_LARB(6),
 };
 
+static const struct mtk_smi_common_plat mtk_smi_common_mt8195_vdo = {
+   .type = MTK_SMI_GEN2,
+   .bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(3) | F_MMU1_LARB(5) |
+   F_MMU1_LARB(7),
+};
+
+static const struct mtk_smi_common_plat mtk_smi_common_mt8195_vpp = {
+   .type = MTK_SMI_GEN2,
+   .bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(7),
+};
+
+static const struct mtk_smi_common_plat mtk_smi_sub_common_mt8195 = {
+   .type = MTK_SMI_GEN2_SUB_COMM,
+};
+
 static const struct of_device_id mtk_smi_common_of_ids[] = {
{.compatible = "mediatek,mt2701-smi-common", .data = 
_smi_common_gen1},
{.compatible = "mediatek,mt2712-smi-common", .data = 
_smi_common_gen2},
@@ -416,6 +436,9 @@ static const struct of_device_id mtk_smi_common_of_ids[] = {
{.compatible = "mediatek,mt8173-smi-common", .data = 
_smi_common_gen2},
{.compatible = "mediatek,mt8183-smi-common", .data = 
_smi_common_mt8183},
{.compatible = "mediatek,mt8192-smi-common", .data = 
_smi_common_mt8192},
+   {.compatible = "mediatek,mt8195-smi-common-vdo", .data = 
_smi_common_mt8195_vdo},
+   {.compatible = "mediatek,mt8195-smi-common-vpp", .data = 
_smi_common_mt8195_vpp},
+   {.compatible = "mediatek,mt8195-smi-sub-common", .data = 
_smi_sub_common_mt8195},
{}
 };
 
-- 
2.18.0



[PATCH 8/9] memory: mtk-smi: mt8195: Add initial setting for smi-common

2021-06-16 Thread Yong Wu
To improve performance, this patch adds an initial setting for the
smi-common. Some registers use fixed values (suggested by the design
engineers); the register/value pairs are applied at resume time.

Signed-off-by: Yong Wu 
---
 drivers/memory/mtk-smi.c | 42 
 1 file changed, 38 insertions(+), 4 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 8b1bfef47ecd..08b28e96fd8c 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -18,11 +18,19 @@
 #include 
 
 /* SMI COMMON */
+#define SMI_L1LEN  0x100
+
 #define SMI_BUS_SEL0x220
 #define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
 /* All are MMU0 defaultly. Only specialize mmu1 here. */
 #define F_MMU1_LARB(larbid)(0x1 << SMI_BUS_LARB_SHIFT(larbid))
 
+#define SMI_M4U_TH 0x234
+#define SMI_FIFO_TH1   0x238
+#define SMI_FIFO_TH2   0x23c
+#define SMI_DCM0x300
+#define SMI_DUMMY  0x444
+
 /* SMI LARB */
 
 /* Below are about mmu enable registers, they are different in SoCs */
@@ -58,6 +66,13 @@
(_id << 8 | _id << 10 | _id << 12 | _id << 14); \
 })
 
+#define SMI_COMMON_INIT_REGS_NR6
+
+struct mtk_smi_reg_pair {
+   unsigned intoffset;
+   u32 value;
+};
+
 enum mtk_smi_type {
MTK_SMI_GEN1,
MTK_SMI_GEN2,   /* gen2 smi common */
@@ -77,6 +92,8 @@ static const char * const mtk_smi_larb_clocks[] = {
 struct mtk_smi_common_plat {
enum mtk_smi_type   type;
u32 bus_sel; /* Balance some larbs to enter mmu0 or 
mmu1 */
+
+   const struct mtk_smi_reg_pair   *init;
 };
 
 struct mtk_smi_larb_gen {
@@ -387,6 +404,15 @@ static struct platform_driver mtk_smi_larb_driver = {
}
 };
 
+static const struct mtk_smi_reg_pair 
mtk_smi_common_mt8195_init[SMI_COMMON_INIT_REGS_NR] = {
+   {SMI_L1LEN, 0xb},
+   {SMI_M4U_TH, 0xe100e10},
+   {SMI_FIFO_TH1, 0x506090a},
+   {SMI_FIFO_TH2, 0x506090a},
+   {SMI_DCM, 0x4f1},
+   {SMI_DUMMY, 0x1},
+};
+
 static const struct mtk_smi_common_plat mtk_smi_common_gen1 = {
.type = MTK_SMI_GEN1,
 };
@@ -417,11 +443,13 @@ static const struct mtk_smi_common_plat 
mtk_smi_common_mt8195_vdo = {
.type = MTK_SMI_GEN2,
.bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(3) | F_MMU1_LARB(5) |
F_MMU1_LARB(7),
+   .init = mtk_smi_common_mt8195_init,
 };
 
 static const struct mtk_smi_common_plat mtk_smi_common_mt8195_vpp = {
.type = MTK_SMI_GEN2,
.bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(7),
+   .init = mtk_smi_common_mt8195_init,
 };
 
 static const struct mtk_smi_common_plat mtk_smi_sub_common_mt8195 = {
@@ -514,15 +542,21 @@ static int mtk_smi_common_remove(struct platform_device 
*pdev)
 static int __maybe_unused mtk_smi_common_resume(struct device *dev)
 {
struct mtk_smi *common = dev_get_drvdata(dev);
-   u32 bus_sel = common->plat->bus_sel;
-   int ret;
+   const struct mtk_smi_reg_pair *init = common->plat->init;
+   u32 bus_sel = common->plat->bus_sel; /* default is 0 */
+   int ret, i;
 
ret = clk_bulk_prepare_enable(common->clk_num, common->clks);
if (ret)
return ret;
 
-   if (common->plat->type == MTK_SMI_GEN2 && bus_sel)
-   writel(bus_sel, common->base + SMI_BUS_SEL);
+   if (common->plat->type != MTK_SMI_GEN2)
+   return 0;
+
+   for (i = 0; i < SMI_COMMON_INIT_REGS_NR && init && init[i].offset; i++)
+   writel_relaxed(init[i].value, common->base + init[i].offset);
+
+   writel(bus_sel, common->base + SMI_BUS_SEL);
return 0;
 }
 
-- 
2.18.0



[PATCH 6/9] memory: mtk-smi: Add smi sub common support

2021-06-16 Thread Yong Wu
This patch adds smi-sub-common support. Some larbs may connect to a
smi-sub-common, which in turn connects to the smi-common.

Previously we created a device link between the smi-larb and the
smi-common. If there is a sub-common, we should instead link the
smi-larb to the smi-sub-common, and then link the smi-sub-common to
the smi-common. This enables the clocks/power automatically along the
whole path. Move the device-link code into a new helper so it can be
reused.

There is no extra SW setting needed for the smi-sub-common.
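
The resulting runtime-PM supplier chain is (sketch):

    larb --device_link--> smi-sub-common --device_link--> smi-common

so pm_runtime_resume_and_get() on a larb powers up and clocks the whole
path to the smi-common automatically.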

Signed-off-by: Yong Wu 
---
 drivers/memory/mtk-smi.c | 78 ++--
 1 file changed, 52 insertions(+), 26 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 6858877ac859..fa3a14605dc2 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -60,7 +60,8 @@
 
 enum mtk_smi_type {
MTK_SMI_GEN1,
-   MTK_SMI_GEN2
+   MTK_SMI_GEN2,   /* gen2 smi common */
+   MTK_SMI_GEN2_SUB_COMM,  /* gen2 smi sub common */
 };
 
 #define MTK_SMI_CLK_NR_MAX 4
@@ -93,13 +94,14 @@ struct mtk_smi {
void __iomem*smi_ao_base; /* only for gen1 */
void __iomem*base;/* only for gen2 */
};
+   struct device   *smi_common_dev; /* for sub common */
const struct mtk_smi_common_plat *plat;
 };
 
 struct mtk_smi_larb { /* larb: local arbiter */
struct mtk_smi  smi;
void __iomem*base;
-   struct device   *smi_common_dev;
+   struct device   *smi_common_dev; /* common or 
sub-common dev */
const struct mtk_smi_larb_gen   *larb_gen;
int larbid;
u32 *mmu;
@@ -206,6 +208,39 @@ mtk_smi_larb_unbind(struct device *dev, struct device 
*master, void *data)
/* Do nothing as the iommu is always enabled. */
 }
 
+static int mtk_smi_device_link_common(struct device *dev, struct device 
**com_dev)
+{
+   struct platform_device *smi_com_pdev;
+   struct device_node *smi_com_node;
+   struct device *smi_com_dev;
+   struct device_link *link;
+
+   smi_com_node = of_parse_phandle(dev->of_node, "mediatek,smi", 0);
+   if (!smi_com_node)
+   return -EINVAL;
+
+   smi_com_pdev = of_find_device_by_node(smi_com_node);
+   of_node_put(smi_com_node);
+   if (smi_com_pdev) {
+   /* smi common is the supplier, Make sure it is ready before */
+   if (!platform_get_drvdata(smi_com_pdev))
+   return -EPROBE_DEFER;
+   smi_com_dev = &smi_com_pdev->dev;
+   link = device_link_add(dev, smi_com_dev,
+  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+   if (!link) {
+   dev_err(dev, "Unable to link smi-common dev\n");
+   return -ENODEV;
+   }
+   *com_dev = smi_com_dev;
+   } else {
+   dev_err(dev, "Failed to get the smi_common device\n");
+   return -EINVAL;
+   }
+
+   return 0;
+}
+
 static const struct component_ops mtk_smi_larb_component_ops = {
.bind = mtk_smi_larb_bind,
.unbind = mtk_smi_larb_unbind,
@@ -267,9 +302,6 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
struct mtk_smi_larb *larb;
struct resource *res;
struct device *dev = &pdev->dev;
-   struct device_node *smi_node;
-   struct platform_device *smi_pdev;
-   struct device_link *link;
int i, ret;
 
larb = devm_kzalloc(dev, sizeof(*larb), GFP_KERNEL);
@@ -291,27 +323,9 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
return ret;
 
larb->smi.dev = dev;
-
-   smi_node = of_parse_phandle(dev->of_node, "mediatek,smi", 0);
-   if (!smi_node)
-   return -EINVAL;
-
-   smi_pdev = of_find_device_by_node(smi_node);
-   of_node_put(smi_node);
-   if (smi_pdev) {
-   if (!platform_get_drvdata(smi_pdev))
-   return -EPROBE_DEFER;
-   larb->smi_common_dev = &smi_pdev->dev;
-   link = device_link_add(dev, larb->smi_common_dev,
-  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
-   if (!link) {
-   dev_err(dev, "Unable to link smi-common dev\n");
-   return -ENODEV;
-   }
-   } else {
-   dev_err(dev, "Failed to get the smi_common device\n");
-   return -EINVAL;
-   }
+   ret = mtk_smi_device_link_common(dev, &larb->smi_common_dev);
+   if (ret < 0)
+   return ret;
 
pm_runtime_enable(dev);
platform_set_drvdata(pdev, larb);
@@ -451,6 +465,14 @@ static int mtk_smi_common_probe(struct platform_device 
*pdev)
if (IS_ERR(common->base))
 

[PATCH 5/9] memory: mtk-smi: Adjust some code position

2021-06-16 Thread Yong Wu
This patch has no functional change; it only moves code around to make
it more readable:
1. Put the smi-common registers above the smi-larb ones, in preparation
   for adding many more register settings.
2. Put mtk_smi_larb_bind() near mtk_smi_larb_unbind().
3. Sort the SoC data alphabetically, and put each entry on one line as
   the current kernel coding style allows.

Signed-off-by: Yong Wu 
---
 drivers/memory/mtk-smi.c | 219 ---
 1 file changed, 90 insertions(+), 129 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index 8eb39b46a6c8..6858877ac859 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -17,12 +17,15 @@
 #include 
 #include 
 
-/* mt8173 */
-#define SMI_LARB_MMU_EN0xf00
+/* SMI COMMON */
+#define SMI_BUS_SEL0x220
+#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
+/* All are MMU0 defaultly. Only specialize mmu1 here. */
+#define F_MMU1_LARB(larbid)(0x1 << SMI_BUS_LARB_SHIFT(larbid))
 
-/* mt8167 */
-#define MT8167_SMI_LARB_MMU_EN 0xfc0
+/* SMI LARB */
 
+/* Below are about mmu enable registers, they are different in SoCs */
 /* mt2701 */
 #define REG_SMI_SECUR_CON_BASE 0x5c0
 
@@ -41,20 +44,20 @@
 /* mt2701 domain should be set to 3 */
#define SMI_SECUR_CON_VAL_DOMAIN(id)   (0x3 << ((((id) & 0x7) << 2) + 1))
 
-/* mt2712 */
-#define SMI_LARB_NONSEC_CON(id)(0x380 + ((id) * 4))
-#define F_MMU_EN   BIT(0)
-#define BANK_SEL(id)   ({  \
+/* mt8167 */
+#define MT8167_SMI_LARB_MMU_EN 0xfc0
+
+/* mt8173 */
+#define MT8173_SMI_LARB_MMU_EN 0xf00
+
+/* larb gen2 */
+#define SMI_LARB_NONSEC_CON(id)(0x380 + ((id) * 4))
+#define F_MMU_EN   BIT(0)
+#define BANK_SEL(id)   ({  \
u32 _id = (id) & 0x3;   \
(_id << 8 | _id << 10 | _id << 12 | _id << 14); \
 })
 
-/* SMI COMMON */
-#define SMI_BUS_SEL0x220
-#define SMI_BUS_LARB_SHIFT(larbid) ((larbid) << 1)
-/* All are MMU0 defaultly. Only specialize mmu1 here. */
-#define F_MMU1_LARB(larbid)(0x1 << SMI_BUS_LARB_SHIFT(larbid))
-
 enum mtk_smi_type {
MTK_SMI_GEN1,
MTK_SMI_GEN2
@@ -117,55 +120,6 @@ void mtk_smi_larb_put(struct device *larbdev)
 }
 EXPORT_SYMBOL_GPL(mtk_smi_larb_put);
 
-static int
-mtk_smi_larb_bind(struct device *dev, struct device *master, void *data)
-{
-   struct mtk_smi_larb *larb = dev_get_drvdata(dev);
-   struct mtk_smi_larb_iommu *larb_mmu = data;
-   unsigned int i;
-
-   for (i = 0; i < MTK_LARB_NR_MAX; i++) {
-   if (dev == larb_mmu[i].dev) {
-   larb->larbid = i;
-   larb->mmu = &larb_mmu[i].mmu;
-   larb->bank = larb_mmu[i].bank;
-   return 0;
-   }
-   }
-   return -ENODEV;
-}
-
-static void mtk_smi_larb_config_port_gen2_general(struct device *dev)
-{
-   struct mtk_smi_larb *larb = dev_get_drvdata(dev);
-   u32 reg;
-   int i;
-
-   if (BIT(larb->larbid) & larb->larb_gen->larb_direct_to_common_mask)
-   return;
-
-   for_each_set_bit(i, (unsigned long *)larb->mmu, 32) {
-   reg = readl_relaxed(larb->base + SMI_LARB_NONSEC_CON(i));
-   reg |= F_MMU_EN;
-   reg |= BANK_SEL(larb->bank[i]);
-   writel(reg, larb->base + SMI_LARB_NONSEC_CON(i));
-   }
-}
-
-static void mtk_smi_larb_config_port_mt8173(struct device *dev)
-{
-   struct mtk_smi_larb *larb = dev_get_drvdata(dev);
-
-   writel(*larb->mmu, larb->base + SMI_LARB_MMU_EN);
-}
-
-static void mtk_smi_larb_config_port_mt8167(struct device *dev)
-{
-   struct mtk_smi_larb *larb = dev_get_drvdata(dev);
-
-   writel(*larb->mmu, larb->base + MT8167_SMI_LARB_MMU_EN);
-}
-
 static void mtk_smi_larb_config_port_gen1(struct device *dev)
 {
struct mtk_smi_larb *larb = dev_get_drvdata(dev);
@@ -197,6 +151,55 @@ static void mtk_smi_larb_config_port_gen1(struct device 
*dev)
}
 }
 
+static void mtk_smi_larb_config_port_mt8167(struct device *dev)
+{
+   struct mtk_smi_larb *larb = dev_get_drvdata(dev);
+
+   writel(*larb->mmu, larb->base + MT8167_SMI_LARB_MMU_EN);
+}
+
+static void mtk_smi_larb_config_port_mt8173(struct device *dev)
+{
+   struct mtk_smi_larb *larb = dev_get_drvdata(dev);
+
+   writel(*larb->mmu, larb->base + MT8173_SMI_LARB_MMU_EN);
+}
+
+static void mtk_smi_larb_config_port_gen2_general(struct device *dev)
+{
+   struct mtk_smi_larb *larb = dev_get_drvdata(dev);
+   u32 reg;
+   int i;
+
+   if (BIT(larb->larbid) & larb->larb_gen->larb_direct_to_common_mask)
+   return;
+
+   for_each_set_bit(i, (unsigned long *)larb->mmu, 32) {
+   reg = readl_relaxed(larb->base + SMI_LARB_NONSEC_CON(i));
+   reg |= F_MMU_EN;
+   reg 

[PATCH 3/9] memory: mtk-smi: Use clk_bulk instead of the clk ops

2021-06-16 Thread Yong Wu
SMI has several clocks: apb/smi/gals.
This patch uses the clk_bulk interface instead of the original one to
simplify the code.

gals is an optional clock (some larbs may not have one), so use the
optional clk_bulk variant, and then remove the has_gals flag.

Also remove the clock failure logs, since the bulk interface already
logs failures.
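
The resulting pattern, as a minimal sketch (the function name is
illustrative; clks[i].id must already hold the clock names):

static int example_clk_get_and_enable(struct device *dev,
                                      struct clk_bulk_data *clks, int num)
{
        int ret;

        /* Optional clocks that are absent come back as NULL handles */
        ret = devm_clk_bulk_get_optional(dev, num, clks);
        if (ret)
                return ret;

        /* One call replaces the hand-rolled enable/rollback chain */
        return clk_bulk_prepare_enable(num, clks);
}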

Signed-off-by: Yong Wu 
---
 drivers/memory/mtk-smi.c | 124 +++
 1 file changed, 34 insertions(+), 90 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index c5fb51f73b34..bcd2bf130655 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -60,9 +60,18 @@ enum mtk_smi_gen {
MTK_SMI_GEN2
 };
 
+#define MTK_SMI_CLK_NR_MAX 4
+
+static const char * const mtk_smi_common_clocks[] = {
+   "apb", "smi", "gals0", "gals1", /* glas is optional */
+};
+
+static const char * const mtk_smi_larb_clocks[] = {
+   "apb",  "smi", "gals"
+};
+
 struct mtk_smi_common_plat {
enum mtk_smi_gen gen;
-   bool has_gals;
u32  bus_sel; /* Balance some larbs to enter mmu0 or mmu1 */
 };
 
@@ -70,13 +79,12 @@ struct mtk_smi_larb_gen {
int port_in_larb[MTK_LARB_NR_MAX + 1];
void (*config_port)(struct device *dev);
unsigned intlarb_direct_to_common_mask;
-   boolhas_gals;
 };
 
 struct mtk_smi {
struct device   *dev;
-   struct clk  *clk_apb, *clk_smi;
-   struct clk  *clk_gals0, *clk_gals1;
+   unsigned intclk_num;
+   struct clk_bulk_dataclks[MTK_SMI_CLK_NR_MAX];
struct clk  *clk_async; /*only needed by mt2701*/
union {
void __iomem*smi_ao_base; /* only for gen1 */
@@ -95,45 +103,6 @@ struct mtk_smi_larb { /* larb: local arbiter */
unsigned char   *bank;
 };
 
-static int mtk_smi_clk_enable(const struct mtk_smi *smi)
-{
-   int ret;
-
-   ret = clk_prepare_enable(smi->clk_apb);
-   if (ret)
-   return ret;
-
-   ret = clk_prepare_enable(smi->clk_smi);
-   if (ret)
-   goto err_disable_apb;
-
-   ret = clk_prepare_enable(smi->clk_gals0);
-   if (ret)
-   goto err_disable_smi;
-
-   ret = clk_prepare_enable(smi->clk_gals1);
-   if (ret)
-   goto err_disable_gals0;
-
-   return 0;
-
-err_disable_gals0:
-   clk_disable_unprepare(smi->clk_gals0);
-err_disable_smi:
-   clk_disable_unprepare(smi->clk_smi);
-err_disable_apb:
-   clk_disable_unprepare(smi->clk_apb);
-   return ret;
-}
-
-static void mtk_smi_clk_disable(const struct mtk_smi *smi)
-{
-   clk_disable_unprepare(smi->clk_gals1);
-   clk_disable_unprepare(smi->clk_gals0);
-   clk_disable_unprepare(smi->clk_smi);
-   clk_disable_unprepare(smi->clk_apb);
-}
-
 int mtk_smi_larb_get(struct device *larbdev)
 {
int ret = pm_runtime_resume_and_get(larbdev);
@@ -270,7 +239,6 @@ static const struct mtk_smi_larb_gen mtk_smi_larb_mt6779 = {
 };
 
 static const struct mtk_smi_larb_gen mtk_smi_larb_mt8183 = {
-   .has_gals   = true,
.config_port= mtk_smi_larb_config_port_gen2_general,
.larb_direct_to_common_mask = BIT(2) | BIT(3) | BIT(7),
  /* IPU0 | IPU1 | CCU */
@@ -320,6 +288,7 @@ static int mtk_smi_larb_probe(struct platform_device *pdev)
struct device_node *smi_node;
struct platform_device *smi_pdev;
struct device_link *link;
+   int i, ret;
 
larb = devm_kzalloc(dev, sizeof(*larb), GFP_KERNEL);
if (!larb)
@@ -331,22 +300,14 @@ static int mtk_smi_larb_probe(struct platform_device 
*pdev)
if (IS_ERR(larb->base))
return PTR_ERR(larb->base);
 
-   larb->smi.clk_apb = devm_clk_get(dev, "apb");
-   if (IS_ERR(larb->smi.clk_apb))
-   return PTR_ERR(larb->smi.clk_apb);
-
-   larb->smi.clk_smi = devm_clk_get(dev, "smi");
-   if (IS_ERR(larb->smi.clk_smi))
-   return PTR_ERR(larb->smi.clk_smi);
-
-   if (larb->larb_gen->has_gals) {
-   /* The larbs may still haven't gals even if the SoC support.*/
-   larb->smi.clk_gals0 = devm_clk_get(dev, "gals");
-   if (PTR_ERR(larb->smi.clk_gals0) == -ENOENT)
-   larb->smi.clk_gals0 = NULL;
-   else if (IS_ERR(larb->smi.clk_gals0))
-   return PTR_ERR(larb->smi.clk_gals0);
-   }
+   larb->smi.clk_num = ARRAY_SIZE(mtk_smi_larb_clocks);
+   for (i = 0; i < larb->smi.clk_num; i++)
+   larb->smi.clks[i].id = mtk_smi_larb_clocks[i];
+
+   ret = devm_clk_bulk_get_optional(dev, larb->smi.clk_num, 
larb->smi.clks);
+   if (ret)
+   return ret;
+
larb->smi.dev = dev;
 
   

[PATCH 4/9] memory: mtk-smi: Rename smi_gen to smi_type

2021-06-16 Thread Yong Wu
This is a preparatory patch for adding smi sub-common support.

The previous smi_gen had gen1/gen2 values that stand for the HW
generation number. A new type (sub_common) is about to be added, after
which "gen" is no longer the proper name, so this patch renames it to
"type". No functional change.

Signed-off-by: Yong Wu 
---
 drivers/memory/mtk-smi.c | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index bcd2bf130655..8eb39b46a6c8 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -55,7 +55,7 @@
 /* All are MMU0 defaultly. Only specialize mmu1 here. */
 #define F_MMU1_LARB(larbid)(0x1 << SMI_BUS_LARB_SHIFT(larbid))
 
-enum mtk_smi_gen {
+enum mtk_smi_type {
MTK_SMI_GEN1,
MTK_SMI_GEN2
 };
@@ -71,8 +71,8 @@ static const char * const mtk_smi_larb_clocks[] = {
 };
 
 struct mtk_smi_common_plat {
-   enum mtk_smi_gen gen;
-   u32  bus_sel; /* Balance some larbs to enter mmu0 or mmu1 */
+   enum mtk_smi_type   type;
+   u32 bus_sel; /* Balance some larbs to enter mmu0 or 
mmu1 */
 };
 
 struct mtk_smi_larb_gen {
@@ -387,27 +387,27 @@ static struct platform_driver mtk_smi_larb_driver = {
 };
 
 static const struct mtk_smi_common_plat mtk_smi_common_gen1 = {
-   .gen = MTK_SMI_GEN1,
+   .type = MTK_SMI_GEN1,
 };
 
 static const struct mtk_smi_common_plat mtk_smi_common_gen2 = {
-   .gen = MTK_SMI_GEN2,
+   .type = MTK_SMI_GEN2,
 };
 
 static const struct mtk_smi_common_plat mtk_smi_common_mt6779 = {
-   .gen= MTK_SMI_GEN2,
-   .bus_sel= F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(4) |
- F_MMU1_LARB(5) | F_MMU1_LARB(6) | F_MMU1_LARB(7),
+   .type = MTK_SMI_GEN2,
+   .bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(4) |
+   F_MMU1_LARB(5) | F_MMU1_LARB(6) | F_MMU1_LARB(7),
 };
 
 static const struct mtk_smi_common_plat mtk_smi_common_mt8183 = {
-   .gen  = MTK_SMI_GEN2,
+   .type = MTK_SMI_GEN2,
.bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
F_MMU1_LARB(7),
 };
 
 static const struct mtk_smi_common_plat mtk_smi_common_mt8192 = {
-   .gen  = MTK_SMI_GEN2,
+   .type = MTK_SMI_GEN2,
.bus_sel  = F_MMU1_LARB(1) | F_MMU1_LARB(2) | F_MMU1_LARB(5) |
F_MMU1_LARB(6),
 };
@@ -471,7 +471,7 @@ static int mtk_smi_common_probe(struct platform_device 
*pdev)
 * clock into emi clock domain, but for mtk smi gen2, there's no smi ao
 * base.
 */
-   if (common->plat->gen == MTK_SMI_GEN1) {
+   if (common->plat->type == MTK_SMI_GEN1) {
res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
common->smi_ao_base = devm_ioremap_resource(dev, res);
if (IS_ERR(common->smi_ao_base))
@@ -511,7 +511,7 @@ static int __maybe_unused mtk_smi_common_resume(struct 
device *dev)
if (ret)
return ret;
 
-   if (common->plat->gen == MTK_SMI_GEN2 && bus_sel)
+   if (common->plat->type == MTK_SMI_GEN2 && bus_sel)
writel(bus_sel, common->base + SMI_BUS_SEL);
return 0;
 }
-- 
2.18.0



[PATCH 2/9] dt-bindings: memory: mediatek: Add mt8195 smi sub common

2021-06-16 Thread Yong Wu
This patch adds the binding for smi-sub-common. The SMI block diagram
looks like this:

             IOMMU
              |  |
           smi-common
           ----------
           |        |
        larb0 ... larb7      <- max is 8

The smi-common connects the smi-larbs with the IOMMU. At most 8 larbs
can connect to one smi-common. If there are more than 8 engines, we
sometimes use a smi-sub-common, which is nearly the same as a
smi-common but supports up to 8 inputs and only 1 output (a smi-common
has 2 outputs). Something like:

             IOMMU
              |  |
           smi-common
           ------------------
           |        |    ...
        larb0  sub-common ...   <- max is 8
               ----------
               |    |  ...      <- max is 8 too
            larb2  larb5

No extra SW setting is needed for the smi-sub-common; it only has
special clocks that must be enabled when the engines access DRAM.

A sub-common should have a "mediatek,smi" phandle pointing to its
smi-common. Also, a sub-common has only one gals clock.

Signed-off-by: Yong Wu 
---
 .../mediatek,smi-common.yaml  | 25 +++
 1 file changed, 25 insertions(+)

diff --git 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
index 6317025bd203..11515afdfb2e 100644
--- 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
+++ 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
@@ -38,6 +38,7 @@ properties:
   - mediatek,mt8192-smi-common
   - mediatek,mt8195-smi-common-vdo
   - mediatek,mt8195-smi-common-vpp
+  - mediatek,mt8195-smi-sub-common
 
   - description: for mt7623
 items:
@@ -69,6 +70,10 @@ properties:
 minItems: 2
 maxItems: 4
 
+  mediatek,smi:
+$ref: /schemas/types.yaml#/definitions/phandle-array
+description: a phandle to the smi-common node above. Only for sub-common.
+
 required:
   - compatible
   - reg
@@ -95,6 +100,26 @@ allOf:
 - const: smi
 - const: async
 
+  - if:  # only for sub common
+  properties:
+compatible:
+  contains:
+enum:
+  - mediatek,mt8195-smi-sub-common
+then:
+  required:
+- mediatek,smi
+  properties:
+clock:
+  items:
+minItems: 3
+maxItems: 3
+clock-names:
+  items:
+- const: apb
+- const: smi
+- const: gals0
+
   - if:  # for gen2 HW that have gals
   properties:
 compatible:
-- 
2.18.0



[PATCH 1/9] dt-bindings: memory: mediatek: Add mt8195 smi binding

2021-06-16 Thread Yong Wu
This patch adds mt8195 SMI support in the bindings.

In mt8195 there are two smi-common HW instances: one for vdo (video
output) and the other for vpp (video processing pipe). They connect to
different smi-larbs, so some settings (bus_sel) differ. Differentiate
them with the compatible string.

Something like this:

    IOMMU(VDO)        IOMMU(VPP)
        |                  |
  SMI_COMMON_VDO     SMI_COMMON_VPP
  --------------     --------------
  |     |   ...      |     |   ...
larb0 larb2 ...    larb1 larb3 ...

Signed-off-by: Yong Wu 
---
 .../bindings/memory-controllers/mediatek,smi-common.yaml| 6 +-
 .../bindings/memory-controllers/mediatek,smi-larb.yaml  | 3 +++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
index a08a32340987..6317025bd203 100644
--- 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
+++ 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-common.yaml
@@ -16,7 +16,7 @@ description: |
   MediaTek SMI have two generations of HW architecture, here is the list
   which generation the SoCs use:
   generation 1: mt2701 and mt7623.
-  generation 2: mt2712, mt6779, mt8167, mt8173, mt8183 and mt8192.
+  generation 2: mt2712, mt6779, mt8167, mt8173, mt8183, mt8192 and mt8195.
 
   There's slight differences between the two SMI, for generation 2, the
   register which control the iommu port is at each larb's register base. But
@@ -36,6 +36,8 @@ properties:
   - mediatek,mt8173-smi-common
   - mediatek,mt8183-smi-common
   - mediatek,mt8192-smi-common
+  - mediatek,mt8195-smi-common-vdo
+  - mediatek,mt8195-smi-common-vpp
 
   - description: for mt7623
 items:
@@ -100,6 +102,8 @@ allOf:
 - mediatek,mt6779-smi-common
 - mediatek,mt8183-smi-common
 - mediatek,mt8192-smi-common
+- mediatek,mt8195-smi-common-vdo
+- mediatek,mt8195-smi-common-vpp
 
 then:
   properties:
diff --git 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
index 7ed7839ff0a7..a100283903bd 100644
--- 
a/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
+++ 
b/Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
@@ -24,6 +24,7 @@ properties:
   - mediatek,mt8173-smi-larb
   - mediatek,mt8183-smi-larb
   - mediatek,mt8192-smi-larb
+  - mediatek,mt8195-smi-larb
 
   - description: for mt7623
 items:
@@ -75,6 +76,7 @@ allOf:
 compatible:
   enum:
 - mediatek,mt8183-smi-larb
+- mediatek,mt8195-smi-larb
 
 then:
   properties:
@@ -109,6 +111,7 @@ allOf:
   - mediatek,mt6779-smi-larb
   - mediatek,mt8167-smi-larb
   - mediatek,mt8192-smi-larb
+  - mediatek,mt8195-smi-larb
 
 then:
   required:
-- 
2.18.0



[PATCH 0/9] MT8195 SMI support

2021-06-16 Thread Yong Wu
This patchset mainly adds SMI support for mt8195.

Compared with the previous version, it adds two new features:
a) smi sub-common support
b) initial settings for smi-common and smi-larb.

Yong Wu (9):
  dt-bindings: memory: mediatek: Add mt8195 smi binding
  dt-bindings: memory: mediatek: Add mt8195 smi sub common
  memory: mtk-smi: Use clk_bulk instead of the clk ops
  memory: mtk-smi: Rename smi_gen to smi_type
  memory: mtk-smi: Adjust some code position
  memory: mtk-smi: Add smi sub common support
  memory: mtk-smi: mt8195: Add smi support
  memory: mtk-smi: mt8195: Add initial setting for smi-common
  memory: mtk-smi: mt8195: Add initial setting for smi-larb

 .../mediatek,smi-common.yaml  |  31 +-
 .../memory-controllers/mediatek,smi-larb.yaml |   3 +
 drivers/memory/mtk-smi.c  | 568 ++
 3 files changed, 347 insertions(+), 255 deletions(-)

-- 
2.18.0




Re: [PATCH] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Roman Skakun
> We make sure that we allocate contiguous memory in 
> xen_swiotlb_alloc_coherent().

I understood.
Thanks!

-- 
Best Regards, Roman.


[PATCH 2/2] swiotlb-xen: override common mmap and get_sgtable dma ops

2021-06-16 Thread Roman Skakun
This commit fixes the incorrect conversion from cpu_addr to page
address for virtual addresses that were allocated through
xen_swiotlb_alloc_coherent() and may be mapped in the vmalloc range.
In that case, virt_to_page() cannot convert the address properly and
returns an incorrect page address.

We need to detect such cases and obtain the page address using
vmalloc_to_page() instead.

The reference code for mmap() and get_sgtable() was copied from
kernel/dma/ops_helpers.c and modified to provide the additional
detection described above.

To simplify the code, a new cpu_addr_to_page() helper was added.

Signed-off-by: Roman Skakun 
Reviewed-by: Andrii Anisov 
---
 drivers/xen/swiotlb-xen.c | 42 +++
 1 file changed, 34 insertions(+), 8 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 90bc5fc321bc..9331a8500547 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -118,6 +118,14 @@ static int is_xen_swiotlb_buffer(struct device *dev, dma_addr_t dma_addr)
return 0;
 }
 
+static struct page *cpu_addr_to_page(void *cpu_addr)
+{
+   if (is_vmalloc_addr(cpu_addr))
+   return vmalloc_to_page(cpu_addr);
+   else
+   return virt_to_page(cpu_addr);
+}
+
 static int
 xen_swiotlb_fixup(void *buf, size_t size, unsigned long nslabs)
 {
@@ -337,7 +345,7 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
int order = get_order(size);
phys_addr_t phys;
u64 dma_mask = DMA_BIT_MASK(32);
-   struct page *page;
+   struct page *page = cpu_addr_to_page(vaddr);
 
if (hwdev && hwdev->coherent_dma_mask)
dma_mask = hwdev->coherent_dma_mask;
@@ -349,11 +357,6 @@ xen_swiotlb_free_coherent(struct device *hwdev, size_t size, void *vaddr,
/* Convert the size to actually allocated. */
size = 1UL << (order + XEN_PAGE_SHIFT);
 
-   if (is_vmalloc_addr(vaddr))
-   page = vmalloc_to_page(vaddr);
-   else
-   page = virt_to_page(vaddr);
-
if (!WARN_ON((dev_addr + size - 1 > dma_mask) ||
 range_straddles_page_boundary(phys, size)) &&
TestClearPageXenRemapped(page))
@@ -573,7 +576,23 @@ xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
 void *cpu_addr, dma_addr_t dma_addr, size_t size,
 unsigned long attrs)
 {
-   return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
+   unsigned long user_count = vma_pages(vma);
+   unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT;
+   unsigned long off = vma->vm_pgoff;
+   struct page *page = cpu_addr_to_page(cpu_addr);
+   int ret;
+
+   vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs);
+
+   if (dma_mmap_from_dev_coherent(dev, vma, cpu_addr, size, &ret))
+   return ret;
+
+   if (off >= count || user_count > count - off)
+   return -ENXIO;
+
+   return remap_pfn_range(vma, vma->vm_start,
+   page_to_pfn(page) + vma->vm_pgoff,
+   user_count << PAGE_SHIFT, vma->vm_page_prot);
 }
 
 /*
@@ -585,7 +604,14 @@ xen_swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
void *cpu_addr, dma_addr_t handle, size_t size,
unsigned long attrs)
 {
-   return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
+   struct page *page = cpu_addr_to_page(cpu_addr);
+   int ret;
+
+   ret = sg_alloc_table(sgt, 1, GFP_KERNEL);
+   if (!ret)
+   sg_set_page(sgt->sgl, page, PAGE_ALIGN(size), 0);
+
+   return ret;
 }
 
 const struct dma_map_ops xen_swiotlb_dma_ops = {
-- 
2.25.1



[PATCH 1/2] Revert "swiotlb-xen: remove xen_swiotlb_dma_mmap and xen_swiotlb_dma_get_sgtable"

2021-06-16 Thread Roman Skakun
This reverts commit 922659ea771b3fd728149262c5ea15608fab9719.

Signed-off-by: Roman Skakun 
---
 drivers/xen/swiotlb-xen.c | 29 +++--
 1 file changed, 27 insertions(+), 2 deletions(-)

diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 2b385c1b4a99..90bc5fc321bc 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -563,6 +563,31 @@ xen_swiotlb_dma_supported(struct device *hwdev, u64 mask)
return xen_virt_to_bus(hwdev, xen_io_tlb_end - 1) <= mask;
 }
 
+/*
+ * Create userspace mapping for the DMA-coherent memory.
+ * This function should be called with the pages from the current domain only,
+ * passing pages mapped from other domains would lead to memory corruption.
+ */
+static int
+xen_swiotlb_dma_mmap(struct device *dev, struct vm_area_struct *vma,
+void *cpu_addr, dma_addr_t dma_addr, size_t size,
+unsigned long attrs)
+{
+   return dma_common_mmap(dev, vma, cpu_addr, dma_addr, size, attrs);
+}
+
+/*
+ * This function should be called with the pages from the current domain only,
+ * passing pages mapped from other domains would lead to memory corruption.
+ */
+static int
+xen_swiotlb_get_sgtable(struct device *dev, struct sg_table *sgt,
+   void *cpu_addr, dma_addr_t handle, size_t size,
+   unsigned long attrs)
+{
+   return dma_common_get_sgtable(dev, sgt, cpu_addr, handle, size, attrs);
+}
+
 const struct dma_map_ops xen_swiotlb_dma_ops = {
.alloc = xen_swiotlb_alloc_coherent,
.free = xen_swiotlb_free_coherent,
@@ -575,8 +600,8 @@ const struct dma_map_ops xen_swiotlb_dma_ops = {
.map_page = xen_swiotlb_map_page,
.unmap_page = xen_swiotlb_unmap_page,
.dma_supported = xen_swiotlb_dma_supported,
-   .mmap = dma_common_mmap,
-   .get_sgtable = dma_common_get_sgtable,
+   .mmap = xen_swiotlb_dma_mmap,
+   .get_sgtable = xen_swiotlb_get_sgtable,
.alloc_pages = dma_common_alloc_pages,
.free_pages = dma_common_free_pages,
 };
-- 
2.25.1



Re: [PATCH] dt-bindings: Drop redundant minItems/maxItems

2021-06-16 Thread Mark Brown
On Tue, Jun 15, 2021 at 01:15:43PM -0600, Rob Herring wrote:
> If a property has an 'items' list, then a 'minItems' or 'maxItems' with the
> same size as the list is redundant and can be dropped. Note that this is DT
> schema specific behavior and not standard json-schema behavior. The tooling
> will fixup the final schema adding any unspecified minItems/maxItems.

Acked-by: Mark Brown 



[PATCH v13 6/6] iommu: Remove mode argument from iommu_set_dma_strict()

2021-06-16 Thread John Garry
We now only ever set strict mode enabled in iommu_set_dma_strict(), so
just remove the argument.

Signed-off-by: John Garry 
Reviewed-by: Robin Murphy 
---
 drivers/iommu/amd/init.c| 2 +-
 drivers/iommu/intel/iommu.c | 6 +++---
 drivers/iommu/iommu.c   | 5 ++---
 include/linux/iommu.h   | 2 +-
 4 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index fb3618af643b..7bc460052678 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3099,7 +3099,7 @@ static int __init parse_amd_iommu_options(char *str)
for (; *str; ++str) {
if (strncmp(str, "fullflush", 9) == 0) {
pr_warn("amd_iommu=fullflush deprecated; use 
iommu.strict instead\n");
-   iommu_set_dma_strict(true);
+   iommu_set_dma_strict();
}
if (strncmp(str, "force_enable", 12) == 0)
amd_iommu_force_enable = true;
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d586990fa751..0618c35cfb51 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -454,7 +454,7 @@ static int __init intel_iommu_setup(char *str)
iommu_dma_forcedac = true;
} else if (!strncmp(str, "strict", 6)) {
pr_warn("intel_iommu=strict deprecated; use 
iommu.strict instead\n");
-   iommu_set_dma_strict(true);
+   iommu_set_dma_strict();
} else if (!strncmp(str, "sp_off", 6)) {
pr_info("Disable supported super page\n");
intel_iommu_superpage = 0;
@@ -4382,7 +4382,7 @@ int __init intel_iommu_init(void)
 */
if (cap_caching_mode(iommu->cap)) {
pr_warn("IOMMU batching disallowed due to 
virtualization\n");
-   iommu_set_dma_strict(true);
+   iommu_set_dma_strict();
}
iommu_device_sysfs_add(&iommu->iommu, NULL,
   intel_iommu_groups,
@@ -5699,7 +5699,7 @@ static void quirk_calpella_no_shadow_gtt(struct pci_dev *dev)
} else if (dmar_map_gfx) {
/* we have to ensure the gfx device is idle before we flush */
pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n");
-   iommu_set_dma_strict(true);
+   iommu_set_dma_strict();
}
 }
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0040, quirk_calpella_no_shadow_gtt);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 60b1ec42e73b..ff221d3ddcbc 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -349,10 +349,9 @@ static int __init iommu_dma_setup(char *str)
 }
 early_param("iommu.strict", iommu_dma_setup);
 
-void iommu_set_dma_strict(bool strict)
+void iommu_set_dma_strict(void)
 {
-   if (strict || !(iommu_cmd_line & IOMMU_CMD_LINE_STRICT))
-   iommu_dma_strict = strict;
+   iommu_dma_strict = true;
 }
 
 bool iommu_get_dma_strict(struct iommu_domain *domain)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 32d448050bf7..754f67d6dd90 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -476,7 +476,7 @@ int iommu_enable_nesting(struct iommu_domain *domain);
 int iommu_set_pgtable_quirks(struct iommu_domain *domain,
unsigned long quirks);
 
-void iommu_set_dma_strict(bool val);
+void iommu_set_dma_strict(void);
 bool iommu_get_dma_strict(struct iommu_domain *domain);
 
 extern int report_iommu_fault(struct iommu_domain *domain, struct device *dev,
-- 
2.26.2



[PATCH v13 5/6] iommu/amd: Add support for IOMMU default DMA mode build options

2021-06-16 Thread John Garry
From: Zhen Lei 

Make IOMMU_DEFAULT_LAZY default for when AMD_IOMMU config is set, which
matches current behaviour.

For "fullflush" param, just call iommu_set_dma_strict(true) directly.

Since we get a strict vs lazy mode print already in iommu_subsys_init(),
and maintain a deprecation print when "fullflush" param is passed, drop the
prints in amd_iommu_init_dma_ops().

Finally drop the global flag amd_iommu_unmap_flush, as it no longer has any
purpose.

[jpg: Rebase for relocated file and drop amd_iommu_unmap_flush]
Signed-off-by: Zhen Lei 
Signed-off-by: John Garry 
---
 drivers/iommu/Kconfig   | 2 +-
 drivers/iommu/amd/amd_iommu_types.h | 6 --
 drivers/iommu/amd/init.c| 3 +--
 drivers/iommu/amd/iommu.c   | 6 --
 4 files changed, 2 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index c214a36eb2dc..fd1ad28dd5ee 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,7 +94,7 @@ choice
prompt "IOMMU default DMA IOTLB invalidation mode"
depends on IOMMU_DMA
 
-   default IOMMU_DEFAULT_LAZY if INTEL_IOMMU
+   default IOMMU_DEFAULT_LAZY if (AMD_IOMMU || INTEL_IOMMU)
default IOMMU_DEFAULT_STRICT
help
  This option allows an IOMMU DMA IOTLB invalidation mode to be
diff --git a/drivers/iommu/amd/amd_iommu_types.h b/drivers/iommu/amd/amd_iommu_types.h
index 94c1a7a9876d..8dbe61e2b3c1 100644
--- a/drivers/iommu/amd/amd_iommu_types.h
+++ b/drivers/iommu/amd/amd_iommu_types.h
@@ -779,12 +779,6 @@ extern u16 amd_iommu_last_bdf;
 /* allocation bitmap for domain ids */
 extern unsigned long *amd_iommu_pd_alloc_bitmap;
 
-/*
- * If true, the addresses will be flushed on unmap time, not when
- * they are reused
- */
-extern bool amd_iommu_unmap_flush;
-
 /* Smallest max PASID supported by any IOMMU in the system */
 extern u32 amd_iommu_max_pasid;
 
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 9f3096d650aa..fb3618af643b 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -161,7 +161,6 @@ u16 amd_iommu_last_bdf; /* largest PCI device id we have
    to handle */
 LIST_HEAD(amd_iommu_unity_map); /* a list of required unity
    mappings we find in ACPI */
-bool amd_iommu_unmap_flush; /* if true, flush on every unmap */
 
 LIST_HEAD(amd_iommu_list); /* list of all AMD IOMMUs in the
   system */
@@ -3100,7 +3099,7 @@ static int __init parse_amd_iommu_options(char *str)
for (; *str; ++str) {
if (strncmp(str, "fullflush", 9) == 0) {
pr_warn("amd_iommu=fullflush deprecated; use 
iommu.strict instead\n");
-   amd_iommu_unmap_flush = true;
+   iommu_set_dma_strict(true);
}
if (strncmp(str, "force_enable", 12) == 0)
amd_iommu_force_enable = true;
diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
index b1fbf2c83df5..32b541ee2e11 100644
--- a/drivers/iommu/amd/iommu.c
+++ b/drivers/iommu/amd/iommu.c
@@ -1775,12 +1775,6 @@ void amd_iommu_domain_update(struct protection_domain *domain)
 static void __init amd_iommu_init_dma_ops(void)
 {
swiotlb = (iommu_default_passthrough() || sme_me_mask) ? 1 : 0;
-
-   if (amd_iommu_unmap_flush)
-   pr_info("IO/TLB flush on unmap enabled\n");
-   else
-   pr_info("Lazy IO/TLB flushing enabled\n");
-   iommu_set_dma_strict(amd_iommu_unmap_flush);
 }
 
 int __init amd_iommu_init_api(void)
-- 
2.26.2



[PATCH v13 4/6] iommu/vt-d: Add support for IOMMU default DMA mode build options

2021-06-16 Thread John Garry
From: Zhen Lei 

Make IOMMU_DEFAULT_LAZY default for when INTEL_IOMMU config is set,
as is current behaviour.

Also delete global flag intel_iommu_strict:
- In intel_iommu_setup(), call iommu_set_dma_strict(true) directly. Also
  remove the print, as iommu_subsys_init() prints the mode and we have
  already marked this param as deprecated.

- For the cap_caching_mode() check in intel_iommu_init(), call
  iommu_set_dma_strict(true) directly, and reword the accompanying print
  and add the missing '\n'.

- For Ironlake GPU, again call iommu_set_dma_strict(true) directly and
  keep the accompanying print.

[jpg: Remove intel_iommu_strict]
Signed-off-by: Zhen Lei 
Signed-off-by: John Garry 
---
 drivers/iommu/Kconfig   |  1 +
 drivers/iommu/intel/iommu.c | 15 ++-
 2 files changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 0327a942fdb7..c214a36eb2dc 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,6 +94,7 @@ choice
prompt "IOMMU default DMA IOTLB invalidation mode"
depends on IOMMU_DMA
 
+   default IOMMU_DEFAULT_LAZY if INTEL_IOMMU
default IOMMU_DEFAULT_STRICT
help
  This option allows an IOMMU DMA IOTLB invalidation mode to be
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 821d8227a4e6..d586990fa751 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -361,7 +361,6 @@ int intel_iommu_enabled = 0;
 EXPORT_SYMBOL_GPL(intel_iommu_enabled);
 
 static int dmar_map_gfx = 1;
-static int intel_iommu_strict;
 static int intel_iommu_superpage = 1;
 static int iommu_identity_mapping;
 static int iommu_skip_te_disable;
@@ -455,8 +454,7 @@ static int __init intel_iommu_setup(char *str)
iommu_dma_forcedac = true;
} else if (!strncmp(str, "strict", 6)) {
pr_warn("intel_iommu=strict deprecated; use 
iommu.strict instead\n");
-   pr_info("Disable batched IOTLB flush\n");
-   intel_iommu_strict = 1;
+   iommu_set_dma_strict(true);
} else if (!strncmp(str, "sp_off", 6)) {
pr_info("Disable supported super page\n");
intel_iommu_superpage = 0;
@@ -4382,9 +4380,9 @@ int __init intel_iommu_init(void)
 * is likely to be much lower than the overhead of synchronizing
 * the virtual and physical IOMMU page-tables.
 */
-   if (!intel_iommu_strict && cap_caching_mode(iommu->cap)) {
-   pr_warn("IOMMU batching is disabled due to 
virtualization");
-   intel_iommu_strict = 1;
+   if (cap_caching_mode(iommu->cap)) {
+   pr_warn("IOMMU batching disallowed due to 
virtualization\n");
+   iommu_set_dma_strict(true);
}
iommu_device_sysfs_add(&iommu->iommu, NULL,
   intel_iommu_groups,
@@ -4393,7 +4391,6 @@ int __init intel_iommu_init(void)
}
up_read(&dmar_global_lock);
 
-   iommu_set_dma_strict(intel_iommu_strict);
bus_set_iommu(&pci_bus_type, &intel_iommu_ops);
if (si_domain && !hw_pass_through)
register_memory_notifier(&intel_iommu_memory_nb);
@@ -5702,8 +5699,8 @@ static void quirk_calpella_no_shadow_gtt(struct pci_dev *dev)
} else if (dmar_map_gfx) {
/* we have to ensure the gfx device is idle before we flush */
pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n");
-   intel_iommu_strict = 1;
-   }
+   iommu_set_dma_strict(true);
+   }
 }
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0040, quirk_calpella_no_shadow_gtt);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, quirk_calpella_no_shadow_gtt);
-- 
2.26.2



[PATCH v13 3/6] iommu: Enhance IOMMU default DMA mode build options

2021-06-16 Thread John Garry
From: Zhen Lei 

First, add build options IOMMU_DEFAULT_{LAZY|STRICT}, so that we have the
opportunity to set {lazy|strict} mode as default at build time. Then put
the two config options in a choice, as they are mutually exclusive.

[jpg: Make choice between strict and lazy only (and not passthrough)]
Signed-off-by: Zhen Lei 
Signed-off-by: John Garry 
Reviewed-by: Robin Murphy 
---
 .../admin-guide/kernel-parameters.txt |  3 +-
 drivers/iommu/Kconfig | 40 +++
 drivers/iommu/iommu.c |  2 +-
 3 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fcbb36d6eea7..d8fb36363be0 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -2052,9 +2052,10 @@
  throughput at the cost of reduced device isolation.
  Will fall back to strict mode if not supported by
  the relevant IOMMU driver.
-   1 - Strict mode (default).
+   1 - Strict mode.
  DMA unmap operations invalidate IOMMU hardware TLBs
  synchronously.
+   unset - Use value of CONFIG_IOMMU_DEFAULT_{LAZY,STRICT}.
Note: on x86, the default behaviour depends on the
equivalent driver-specific parameters, but a strict
mode explicitly specified by either method takes
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 1f111b399bca..0327a942fdb7 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -90,6 +90,46 @@ config IOMMU_DEFAULT_PASSTHROUGH
 
  If unsure, say N here.
 
+choice
+   prompt "IOMMU default DMA IOTLB invalidation mode"
+   depends on IOMMU_DMA
+
+   default IOMMU_DEFAULT_STRICT
+   help
+ This option allows an IOMMU DMA IOTLB invalidation mode to be
+ chosen at build time, to override the default mode of each ARCH,
+ removing the need to pass in kernel parameters through command line.
+ It is still possible to provide common boot params to override this
+ config.
+
+ If unsure, keep the default.
+
+config IOMMU_DEFAULT_STRICT
+   bool "strict"
+   help
+ For every IOMMU DMA unmap operation, the flush operation of IOTLB and
+ the free operation of IOVA are guaranteed to be done in the unmap
+ function.
+
+config IOMMU_DEFAULT_LAZY
+   bool "lazy"
+   help
+ Support lazy mode, where for every IOMMU DMA unmap operation, the
+ flush operation of IOTLB and the free operation of IOVA are deferred.
+ They are only guaranteed to be done before the related IOVA will be
+ reused.
+
+ The isolation provided in this mode is not as secure as STRICT mode,
+ such that a vulnerable time window may be created between the DMA
+ unmap and the mappings cached in the IOMMU IOTLB or device TLB
+ finally being invalidated, where the device could still access the
+ memory which has already been unmapped by the device driver.
+ However this mode may provide better performance in high throughput
+ scenarios, and is still considerably more secure than passthrough
+ mode or no IOMMU.
+
+endchoice
+
 config OF_IOMMU
def_bool y
depends on OF && IOMMU_API
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cf58949cc2f3..60b1ec42e73b 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -29,7 +29,7 @@ static struct kset *iommu_group_kset;
 static DEFINE_IDA(iommu_group_ida);
 
 static unsigned int iommu_def_domain_type __read_mostly;
-static bool iommu_dma_strict __read_mostly = true;
+static bool iommu_dma_strict __read_mostly = IS_ENABLED(CONFIG_IOMMU_DEFAULT_STRICT);
 static u32 iommu_cmd_line __read_mostly;
 
 struct iommu_group {
-- 
2.26.2



[PATCH v13 2/6] iommu: Print strict or lazy mode at init time

2021-06-16 Thread John Garry
As well as the default domain type, it's useful to know whether strict
or lazy mode is in use for DMA domains, so add this info in a separate
print.

The (strict/lazy) mode may also be set via the iommu.strict early param,
but this will be processed prior to iommu_subsys_init(), so the print
will be accurate for drivers which don't set the mode via custom means.

For the drivers which set the mode via custom means - AMD and Intel drivers
- they maintain prints to inform a change in policy or that custom cmdline
methods to change policy are deprecated.
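
For illustration, assuming the "iommu: " pr_fmt prefix used by iommu.c, the
new line in the boot log would look like:

  iommu: DMA domain TLB invalidation policy: lazy mode (set via kernel command line)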

Signed-off-by: John Garry 
Reviewed-by: Robin Murphy 
---
 drivers/iommu/iommu.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 5419c4b9f27a..cf58949cc2f3 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -138,6 +138,11 @@ static int __init iommu_subsys_init(void)
(iommu_cmd_line & IOMMU_CMD_LINE_DMA_API) ?
"(set via kernel command line)" : "");
 
+   pr_info("DMA domain TLB invalidation policy: %s mode %s\n",
+   iommu_dma_strict ? "strict" : "lazy",
+   (iommu_cmd_line & IOMMU_CMD_LINE_STRICT) ?
+   "(set via kernel command line)" : "");
+
return 0;
 }
 subsys_initcall(iommu_subsys_init);
-- 
2.26.2



[PATCH v13 0/6] iommu: Enhance IOMMU default DMA mode build options

2021-06-16 Thread John Garry
This is a reboot of Zhen Lei's series from a couple of years ago, which
never made it across the line.

I still think that it has some value, so taking up the mantle.

Motivation:
Allow lazy mode to be the default mode for DMA domains for all ARCHs, and
not only those which hardcode it (to be lazy). For ARM64, currently we must
use a kernel command line parameter to get lazy mode, which is less than
ideal.
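For example, getting lazy mode on arm64 today means booting with
iommu.strict=0 on the kernel command line.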

I have now included the print for strict/lazy mode, which I originally
sent in:
https://lore.kernel.org/linux-iommu/72eb3de9-1d1c-ae46-c5a9-95f26525d...@huawei.com/

There was some concern there about drivers and their custom prints
conflicting with the print in that patch, but I think that it
should be ok.

Differences to v12:
- Rebase to next-20210611 and include patch "iommu: Update "iommu.strict"
  documentation" as a baseline
- Add Robin's RB tags (thanks!)
- Please let me know if not ok with kernel-parameters.txt update
  in 3/6
- Add a patch to mark x86 strict cmdline params as deprecated
- Improve wording in Kconfig change and tweak iommu_dma_strict declaration

Differences to v11:
- Rebase to next-20210610
- Drop strict mode globals in Intel and AMD drivers
- Include patch to print strict vs lazy mode
- Include patch to remove argument from iommu_set_dma_strict()

Differences to v10:
- Rebase to v5.13-rc4
- Correct comment and typo in Kconfig (Randy)
- Make Kconfig choice depend on relevant architectures
- Update kernel-parameters.txt for CONFIG_IOMMU_DEFAULT_{LAZY,STRICT}

John Garry (3):
  iommu: Deprecate Intel and AMD cmdline methods to enable strict mode
  iommu: Print strict or lazy mode at init time
  iommu: Remove mode argument from iommu_set_dma_strict()

Zhen Lei (3):
  iommu: Enhance IOMMU default DMA mode build options
  iommu/vt-d: Add support for IOMMU default DMA mode build options
  iommu/amd: Add support for IOMMU default DMA mode build options

 .../admin-guide/kernel-parameters.txt |  8 ++--
 drivers/iommu/Kconfig | 41 +++
 drivers/iommu/amd/amd_iommu_types.h   |  6 ---
 drivers/iommu/amd/init.c  |  7 ++--
 drivers/iommu/amd/iommu.c |  6 ---
 drivers/iommu/intel/iommu.c   | 16 
 drivers/iommu/iommu.c | 12 --
 include/linux/iommu.h |  2 +-
 8 files changed, 66 insertions(+), 32 deletions(-)

-- 
2.26.2



[PATCH v13 1/6] iommu: Deprecate Intel and AMD cmdline methods to enable strict mode

2021-06-16 Thread John Garry
Now that the x86 drivers support iommu.strict, deprecate the custom
methods.

Signed-off-by: John Garry 
---
 Documentation/admin-guide/kernel-parameters.txt | 5 +++--
 drivers/iommu/amd/init.c| 4 +++-
 drivers/iommu/intel/iommu.c | 1 +
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 30e9dd52464e..fcbb36d6eea7 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -290,7 +290,8 @@
amd_iommu=  [HW,X86-64]
Pass parameters to the AMD IOMMU driver in the system.
Possible values are:
-   fullflush - enable flushing of IO/TLB entries when
+   fullflush   [Deprecated, use iommu.strict instead]
+ - enable flushing of IO/TLB entries when
they are unmapped. Otherwise they are
flushed before they will be reused, which
is a lot of faster
@@ -1947,7 +1948,7 @@
bypassed by not enabling DMAR with this option. In
this case, gfx device will use physical address for
DMA.
-   strict [Default Off]
+   strict [Default Off] [Deprecated, use iommu.strict instead]
With this option on every unmap_single operation will
result in a hardware IOTLB flush operation as opposed
to batching them for performance.
diff --git a/drivers/iommu/amd/init.c b/drivers/iommu/amd/init.c
index 46280e6e1535..9f3096d650aa 100644
--- a/drivers/iommu/amd/init.c
+++ b/drivers/iommu/amd/init.c
@@ -3098,8 +3098,10 @@ static int __init parse_amd_iommu_intr(char *str)
 static int __init parse_amd_iommu_options(char *str)
 {
for (; *str; ++str) {
-   if (strncmp(str, "fullflush", 9) == 0)
+   if (strncmp(str, "fullflush", 9) == 0) {
+   pr_warn("amd_iommu=fullflush deprecated; use 
iommu.strict instead\n");
amd_iommu_unmap_flush = true;
+   }
if (strncmp(str, "force_enable", 12) == 0)
amd_iommu_force_enable = true;
if (strncmp(str, "off", 3) == 0)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index bd93c7ec879e..821d8227a4e6 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -454,6 +454,7 @@ static int __init intel_iommu_setup(char *str)
pr_warn("intel_iommu=forcedac deprecated; use 
iommu.forcedac instead\n");
iommu_dma_forcedac = true;
} else if (!strncmp(str, "strict", 6)) {
+   pr_warn("intel_iommu=strict deprecated; use 
iommu.strict instead\n");
pr_info("Disable batched IOTLB flush\n");
intel_iommu_strict = 1;
} else if (!strncmp(str, "sp_off", 6)) {
-- 
2.26.2



Re: [PATCH] dt-bindings: Drop redundant minItems/maxItems

2021-06-16 Thread Philipp Zabel
On Tue, 2021-06-15 at 13:15 -0600, Rob Herring wrote:
> If a property has an 'items' list, then a 'minItems' or 'maxItems' with the
> same size as the list is redundant and can be dropped. Note that this is DT
> schema specific behavior and not standard json-schema behavior. The tooling
> will fixup the final schema adding any unspecified minItems/maxItems.
> 
> This condition is partially checked with the meta-schema already, but
> only if both 'minItems' and 'maxItems' are equal to the 'items' length.
> An improved meta-schema is pending.
[...]
>  Documentation/devicetree/bindings/reset/fsl,imx-src.yaml| 1 -
[...]
> diff --git a/Documentation/devicetree/bindings/reset/fsl,imx-src.yaml 
> b/Documentation/devicetree/bindings/reset/fsl,imx-src.yaml
> index 27c5e34a3ac6..b11ac533f914 100644
> --- a/Documentation/devicetree/bindings/reset/fsl,imx-src.yaml
> +++ b/Documentation/devicetree/bindings/reset/fsl,imx-src.yaml
> @@ -59,7 +59,6 @@ properties:
>- description: SRC interrupt
>- description: CPU WDOG interrupts out of SRC
>  minItems: 1
> -maxItems: 2
>  
>'#reset-cells':
>  const: 1

Acked-by: Philipp Zabel 

regards
Philipp


[GIT PULL] iommu/arm-smmu: Updates for 5.14

2021-06-16 Thread Will Deacon
Hi Joerg,

Please pull these Arm SMMU updates for 5.14.

Of particular note is the support for stalling faults with platform devices
using SMMUv3 -- this concludes the bulk of the SVA work from Jean-Philippe.

Other than that, one thing to note is that the patch from Thierry adding
the '->probe_finalize' implementation hook is also shared with the
memory-controller tree, since they build on top of it to get the SMMU
working with an NVIDIA SoC. Unfortunately, that patch also caused a NULL
dereference on other platforms, so there's a subsequent patch on top here
addressing that.

Due to the above, the sooner this lands in -next, the better.

Cheers,

Will

--->8

The following changes since commit c4681547bcce777daf576925a966ffa824edd09d:

  Linux 5.13-rc3 (2021-05-23 11:42:48 -1000)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git tags/arm-smmu-updates

for you to fetch changes up to ddd25670d39b2181c7bec33301f2d24cdcf25dde:

  Merge branch 'for-thierry/arm-smmu' into for-joerg/arm-smmu/updates (2021-06-16 11:30:55 +0100)


Arm SMMU updates for 5.14

- SMMUv3:

  * Support stalling faults for platform devices

  * Decrease defaults sizes for the event and PRI queues

- SMMUv2:

  * Support for a new '->probe_finalize' hook, needed by Nvidia

  * Even more Qualcomm compatible strings

  * Avoid Adreno TTBR1 quirk for DB820C platform

- Misc:

  * Trivial cleanups/refactoring


Amey Narkhede (1):
  iommu/arm: Cleanup resources in case of probe error path

Bixuan Cui (1):
  iommu/arm-smmu-v3: Change *array into *const array

Eric Anholt (2):
  iommu/arm-smmu-qcom: Skip the TTBR1 quirk for db820c.
  arm64: dts: msm8996: Mark the GPU's SMMU as an adreno one.

Jean-Philippe Brucker (4):
  dt-bindings: Document stall property for IOMMU masters
  ACPI/IORT: Enable stall support for platform devices
  iommu/arm-smmu-v3: Add stall support for platform devices
  iommu/arm-smmu-v3: Ratelimit event dump

Martin Botka (1):
  iommu/arm-smmu-qcom: Add sm6125 compatible

Sai Prakash Ranjan (2):
  iommu/arm-smmu-qcom: Add SC7280 SMMU compatible
  iommu/arm-smmu-qcom: Move the adreno smmu specific impl

Shawn Guo (2):
  iommu/arm-smmu-qcom: hook up qcom_smmu_impl for ACPI boot
  iommu/arm-smmu-qcom: Protect acpi_match_platform_list() call with CONFIG_ACPI

Thierry Reding (1):
  iommu/arm-smmu: Implement ->probe_finalize()

Will Deacon (2):
  iommu/arm-smmu: Check smmu->impl pointer before dereferencing
  Merge branch 'for-thierry/arm-smmu' into for-joerg/arm-smmu/updates

Xiyu Yang (2):
  iommu/arm-smmu: Fix arm_smmu_device refcount leak when arm_smmu_rpm_get fails
  iommu/arm-smmu: Fix arm_smmu_device refcount leak in address translation

Zhen Lei (2):
  iommu/arm-smmu-v3: Decrease the queue size of evtq and priq
  iommu/arm-smmu-v3: Remove unnecessary oom message

 Documentation/devicetree/bindings/iommu/iommu.txt |  18 ++
 arch/arm64/boot/dts/qcom/msm8996.dtsi |   2 +-
 drivers/acpi/arm64/iort.c |   4 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-sva.c   |  59 +-
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 222 --
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  48 -
 drivers/iommu/arm/arm-smmu/arm-smmu-qcom.c|  43 -
 drivers/iommu/arm/arm-smmu/arm-smmu.c |  38 +++-
 drivers/iommu/arm/arm-smmu/arm-smmu.h |   1 +
 drivers/iommu/arm/arm-smmu/qcom_iommu.c   |  13 +-
 10 files changed, 409 insertions(+), 39 deletions(-)


Re: [PATCH] dt-bindings: Drop redundant minItems/maxItems

2021-06-16 Thread Vinod Koul
On 15-06-21, 13:15, Rob Herring wrote:
> If a property has an 'items' list, then a 'minItems' or 'maxItems' with the
> same size as the list is redundant and can be dropped. Note that this is DT
> schema specific behavior and not standard json-schema behavior. The tooling
> will fixup the final schema adding any unspecified minItems/maxItems.
> 
> This condition is partially checked with the meta-schema already, but
> only if both 'minItems' and 'maxItems' are equal to the 'items' length.
> An improved meta-schema is pending.

>  .../devicetree/bindings/dma/renesas,rcar-dmac.yaml  | 1 -

>  Documentation/devicetree/bindings/phy/brcm,sata-phy.yaml| 1 -
>  Documentation/devicetree/bindings/phy/mediatek,tphy.yaml| 2 --
>  .../devicetree/bindings/phy/phy-cadence-sierra.yaml | 2 --
>  .../devicetree/bindings/phy/phy-cadence-torrent.yaml| 4 
>  .../devicetree/bindings/phy/qcom,ipq806x-usb-phy-hs.yaml| 1 -
>  .../devicetree/bindings/phy/qcom,ipq806x-usb-phy-ss.yaml| 1 -
>  Documentation/devicetree/bindings/phy/qcom,qmp-phy.yaml | 1 -
>  Documentation/devicetree/bindings/phy/qcom,qusb2-phy.yaml   | 2 --
>  Documentation/devicetree/bindings/phy/renesas,usb2-phy.yaml | 2 --
>  Documentation/devicetree/bindings/phy/renesas,usb3-phy.yaml | 1 -

Acked-By: Vinod Koul 

-- 
~Vinod


Re: [PATCH 1/1] iommu/arm-smmu-v3: remove unnecessary oom message

2021-06-16 Thread Will Deacon
On Wed, Jun 16, 2021 at 09:47:18AM +0800, Leizhen (ThunderTown) wrote:
> 
> 
> On 2021/6/15 19:55, Will Deacon wrote:
> > On Tue, Jun 15, 2021 at 12:51:38PM +0100, Robin Murphy wrote:
> >> On 2021-06-15 12:34, Will Deacon wrote:
> >>> On Tue, Jun 15, 2021 at 07:22:10PM +0800, Leizhen (ThunderTown) wrote:
> 
> 
>  On 2021/6/11 18:32, Will Deacon wrote:
> > On Wed, Jun 09, 2021 at 08:54:38PM +0800, Zhen Lei wrote:
> >> Fixes scripts/checkpatch.pl warning:
> >> WARNING: Possible unnecessary 'out of memory' message
> >>
> >> Removing it can help us save a bit of memory.
> >>
> >> Signed-off-by: Zhen Lei 
> >> ---
> >>   drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 8 ++--
> >>   1 file changed, 2 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c 
> >> b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> index 2ddc3cd5a7d1..fd7c55b44881 100644
> >> --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> >> @@ -2787,10 +2787,8 @@ static int arm_smmu_init_l1_strtab(struct 
> >> arm_smmu_device *smmu)
> >>void *strtab = smmu->strtab_cfg.strtab;
> >>cfg->l1_desc = devm_kzalloc(smmu->dev, size, GFP_KERNEL);
> >> -  if (!cfg->l1_desc) {
> >> -  dev_err(smmu->dev, "failed to allocate l1 stream table 
> >> desc\n");
> >> +  if (!cfg->l1_desc)
> >
> > What error do you get if devm_kzalloc() fails? I'd like to make sure 
> > it's
> > easy to track down _which_ allocation failed in that case -- does it 
> > give
> > you a line number, for example?
> 
>  When devm_kzalloc() fails, the OOM information is printed. No line 
>  number information, but the
>  size(order) and call stack is printed. It doesn't matter which 
>  allocation failed, the failure
>  is caused by insufficient system memory rather than the fault of the 
>  SMMU driver. Therefore,
>  the current printing is not helpful for locating the problem of 
>  insufficient memory. After all,
>  when memory allocation fails, the SMMU driver cannot work at a lower 
>  specification.
> >>>
> >>> I don't entirely agree. Another reason for the failure is because the 
> >>> driver
> >>> might be asking for a huge (or negative) allocation, in which case it 
> >>> might
> >>> be instructive to have a look at the actual caller, particularly if the
> >>> size is derived from hardware or firmware properties.
> >>
> >> Agreed - other than deliberately-contrived situations I don't think I've
> >> ever hit a genuine OOM, but I definitely have debugged attempts to allocate
> >> -1 of something. If the driver-specific message actually calls out the
> >> critical information, e.g. "failed to allocate %d stream table entries", it
> >> gives debugging a head start since the miscalculation is obvious, but a
> >> static message that only identifies the callsite really only saves a quick
> >> trip to scripts/faddr2line, and personally I've never found that
> >> particularly valuable.
> > 
> > So it sounds like this particular patch is fine, but the one for smmuv2
> > should leave the IRQ allocation message alone (by virtue of it printing
> > something a bit more useful -- the number of irqs).
> 
> num_irqs = 0;
> while ((res = platform_get_resource(pdev, IORESOURCE_IRQ, num_irqs))) 
> {
> num_irqs++;
> }
> 
> As the above code, num_irqs is calculated based on the number of dtb or acpi
> configuration items, it can't be too large. That is, there is almost zero 
> chance
> that devm_kcalloc() will fail because num_irqs is too large.

Right, because firmware is never wrong about anything :)

Will


Re: [PATCH v4 1/6] ACPI: arm64: Move DMA setup operations out of IORT

2021-06-16 Thread Eric Auger
Hi jean,

On 6/10/21 9:51 AM, Jean-Philippe Brucker wrote:
> Extract generic DMA setup code out of IORT, so it can be reused by VIOT.
> Keep it in drivers/acpi/arm64 for now, since it could break x86
> platforms that haven't run this code so far, if they have invalid
> tables.
>
> Signed-off-by: Jean-Philippe Brucker 
Reviewed-by: Eric Auger 


Eric

> ---
>  drivers/acpi/arm64/Makefile |  1 +
>  include/linux/acpi.h|  3 +++
>  include/linux/acpi_iort.h   |  6 ++---
>  drivers/acpi/arm64/dma.c| 50 ++
>  drivers/acpi/arm64/iort.c   | 54 ++---
>  drivers/acpi/scan.c |  2 +-
>  6 files changed, 66 insertions(+), 50 deletions(-)
>  create mode 100644 drivers/acpi/arm64/dma.c
>
> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
> index 6ff50f4ed947..66acbe77f46e 100644
> --- a/drivers/acpi/arm64/Makefile
> +++ b/drivers/acpi/arm64/Makefile
> @@ -1,3 +1,4 @@
>  # SPDX-License-Identifier: GPL-2.0-only
>  obj-$(CONFIG_ACPI_IORT)  += iort.o
>  obj-$(CONFIG_ACPI_GTDT)  += gtdt.o
> +obj-y+= dma.o
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index c60745f657e9..7aaa9559cc19 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -259,9 +259,12 @@ void acpi_numa_x2apic_affinity_init(struct 
> acpi_srat_x2apic_cpu_affinity *pa);
>  
>  #ifdef CONFIG_ARM64
>  void acpi_numa_gicc_affinity_init(struct acpi_srat_gicc_affinity *pa);
> +void acpi_arch_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size);
>  #else
>  static inline void
>  acpi_numa_gicc_affinity_init(struct acpi_srat_gicc_affinity *pa) { }
> +static inline void
> +acpi_arch_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size) { }
>  #endif
>  
>  int acpi_numa_memory_affinity_init (struct acpi_srat_mem_affinity *ma);
> diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
> index 1a12baa58e40..f7f054833afd 100644
> --- a/include/linux/acpi_iort.h
> +++ b/include/linux/acpi_iort.h
> @@ -34,7 +34,7 @@ struct irq_domain *iort_get_device_domain(struct device 
> *dev, u32 id,
>  void acpi_configure_pmsi_domain(struct device *dev);
>  int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id);
>  /* IOMMU interface */
> -void iort_dma_setup(struct device *dev, u64 *dma_addr, u64 *size);
> +int iort_dma_get_ranges(struct device *dev, u64 *size);
>  const struct iommu_ops *iort_iommu_configure_id(struct device *dev,
>   const u32 *id_in);
>  int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head 
> *head);
> @@ -48,8 +48,8 @@ static inline struct irq_domain *iort_get_device_domain(
>  { return NULL; }
>  static inline void acpi_configure_pmsi_domain(struct device *dev) { }
>  /* IOMMU interface */
> -static inline void iort_dma_setup(struct device *dev, u64 *dma_addr,
> -   u64 *size) { }
> +static inline int iort_dma_get_ranges(struct device *dev, u64 *size)
> +{ return -ENODEV; }
>  static inline const struct iommu_ops *iort_iommu_configure_id(
> struct device *dev, const u32 *id_in)
>  { return NULL; }
> diff --git a/drivers/acpi/arm64/dma.c b/drivers/acpi/arm64/dma.c
> new file mode 100644
> index ..f16739ad3cc0
> --- /dev/null
> +++ b/drivers/acpi/arm64/dma.c
> @@ -0,0 +1,50 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +void acpi_arch_dma_setup(struct device *dev, u64 *dma_addr, u64 *dma_size)
> +{
> + int ret;
> + u64 end, mask;
> + u64 dmaaddr = 0, size = 0, offset = 0;
> +
> + /*
> +  * If @dev is expected to be DMA-capable then the bus code that created
> +  * it should have initialised its dma_mask pointer by this point. For
> +  * now, we'll continue the legacy behaviour of coercing it to the
> +  * coherent mask if not, but we'll no longer do so quietly.
> +  */
> + if (!dev->dma_mask) {
> + dev_warn(dev, "DMA mask not set\n");
> + dev->dma_mask = &dev->coherent_dma_mask;
> + }
> +
> + if (dev->coherent_dma_mask)
> + size = max(dev->coherent_dma_mask, dev->coherent_dma_mask + 1);
> + else
> + size = 1ULL << 32;
> +
> + ret = acpi_dma_get_range(dev, &dmaaddr, &offset, &size);
> + if (ret == -ENODEV)
> + ret = iort_dma_get_ranges(dev, );
> + if (!ret) {
> + /*
> +  * Limit coherent and dma mask based on size retrieved from
> +  * firmware.
> +  */
> + end = dmaaddr + size - 1;
> + mask = DMA_BIT_MASK(ilog2(end) + 1);
> + dev->bus_dma_limit = end;
> + dev->coherent_dma_mask = min(dev->coherent_dma_mask, mask);
> + *dev->dma_mask = min(*dev->dma_mask, mask);
> + }
> +
> + *dma_addr = dmaaddr;
> + *dma_size = size;
> +
> + ret = 

Re: [PATCH v4 2/6] ACPI: Move IOMMU setup code out of IORT

2021-06-16 Thread Eric Auger
Hi jean,
On 6/10/21 9:51 AM, Jean-Philippe Brucker wrote:
> Extract the code that sets up the IOMMU infrastructure from IORT, since
> it can be reused by VIOT. Move it one level up into a new
> acpi_iommu_configure_id() function, which calls the IORT parsing
> function which in turn calls the acpi_iommu_fwspec_init() helper.
>
> Signed-off-by: Jean-Philippe Brucker 
> ---
>  include/acpi/acpi_bus.h   |  3 ++
>  include/linux/acpi_iort.h |  8 ++---
>  drivers/acpi/arm64/iort.c | 75 +--
>  drivers/acpi/scan.c   | 73 -
>  4 files changed, 87 insertions(+), 72 deletions(-)
>
> diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> index 3a82faac5767..41f092a269f6 100644
> --- a/include/acpi/acpi_bus.h
> +++ b/include/acpi/acpi_bus.h
> @@ -588,6 +588,9 @@ struct acpi_pci_root {
>  
>  bool acpi_dma_supported(struct acpi_device *adev);
>  enum dev_dma_attr acpi_get_dma_attr(struct acpi_device *adev);
> +int acpi_iommu_fwspec_init(struct device *dev, u32 id,
> +struct fwnode_handle *fwnode,
> +const struct iommu_ops *ops);
>  int acpi_dma_get_range(struct device *dev, u64 *dma_addr, u64 *offset,
>  u64 *size);
>  int acpi_dma_configure_id(struct device *dev, enum dev_dma_attr attr,
> diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h
> index f7f054833afd..f1f0842a2cb2 100644
> --- a/include/linux/acpi_iort.h
> +++ b/include/linux/acpi_iort.h
> @@ -35,8 +35,7 @@ void acpi_configure_pmsi_domain(struct device *dev);
>  int iort_pmsi_get_dev_id(struct device *dev, u32 *dev_id);
>  /* IOMMU interface */
>  int iort_dma_get_ranges(struct device *dev, u64 *size);
> -const struct iommu_ops *iort_iommu_configure_id(struct device *dev,
> - const u32 *id_in);
> +int iort_iommu_configure_id(struct device *dev, const u32 *id_in);
>  int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head 
> *head);
>  phys_addr_t acpi_iort_dma_get_max_cpu_address(void);
>  #else
> @@ -50,9 +49,8 @@ static inline void acpi_configure_pmsi_domain(struct device 
> *dev) { }
>  /* IOMMU interface */
>  static inline int iort_dma_get_ranges(struct device *dev, u64 *size)
>  { return -ENODEV; }
> -static inline const struct iommu_ops *iort_iommu_configure_id(
> -   struct device *dev, const u32 *id_in)
> -{ return NULL; }
> +static inline int iort_iommu_configure_id(struct device *dev, const u32 
> *id_in)
> +{ return -ENODEV; }
>  static inline
>  int iort_iommu_msi_get_resv_regions(struct device *dev, struct list_head 
> *head)
>  { return 0; }
> diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c
> index a940be1cf2af..b5b021e064b6 100644
> --- a/drivers/acpi/arm64/iort.c
> +++ b/drivers/acpi/arm64/iort.c
> @@ -806,23 +806,6 @@ static struct acpi_iort_node 
> *iort_get_msi_resv_iommu(struct device *dev)
>   return NULL;
>  }
>  
> -static inline const struct iommu_ops *iort_fwspec_iommu_ops(struct device 
> *dev)
> -{
> - struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> -
> - return (fwspec && fwspec->ops) ? fwspec->ops : NULL;
> -}
> -
> -static inline int iort_add_device_replay(struct device *dev)
> -{
> - int err = 0;
> -
> - if (dev->bus && !device_iommu_mapped(dev))
> - err = iommu_probe_device(dev);
> -
> - return err;
> -}
> -
>  /**
>   * iort_iommu_msi_get_resv_regions - Reserved region driver helper
>   * @dev: Device from iommu_get_resv_regions()
> @@ -900,18 +883,6 @@ static inline bool iort_iommu_driver_enabled(u8 type)
>   }
>  }
>  
> -static int arm_smmu_iort_xlate(struct device *dev, u32 streamid,
> -struct fwnode_handle *fwnode,
> -const struct iommu_ops *ops)
> -{
> - int ret = iommu_fwspec_init(dev, fwnode, ops);
> -
> - if (!ret)
> - ret = iommu_fwspec_add_ids(dev, &streamid, 1);
> -
> - return ret;
> -}
> -
>  static bool iort_pci_rc_supports_ats(struct acpi_iort_node *node)
>  {
>   struct acpi_iort_root_complex *pci_rc;
> @@ -946,7 +917,7 @@ static int iort_iommu_xlate(struct device *dev, struct 
> acpi_iort_node *node,
>   return iort_iommu_driver_enabled(node->type) ?
>  -EPROBE_DEFER : -ENODEV;
>  
> - return arm_smmu_iort_xlate(dev, streamid, iort_fwnode, ops);
> + return acpi_iommu_fwspec_init(dev, streamid, iort_fwnode, ops);
>  }
>  
>  struct iort_pci_alias_info {
> @@ -1020,24 +991,14 @@ static int iort_nc_iommu_map_id(struct device *dev,
>   * @dev: device to configure
>   * @id_in: optional input id const value pointer
>   *
> - * Returns: iommu_ops pointer on configuration success
> - *  NULL on configuration failure
> + * Returns: 0 on success, <0 on failure
>   */
> -const struct iommu_ops *iort_iommu_configure_id(struct device *dev,
> -   

Re: [PATCH] iommu/io-pgtable-arm: Optimize partial walk flush for large scatter-gather list

2021-06-16 Thread Sai Prakash Ranjan

On 2021-06-16 12:28, Sai Prakash Ranjan wrote:

Hi Robin,

On 2021-06-15 19:23, Robin Murphy wrote:

On 2021-06-15 12:51, Sai Prakash Ranjan wrote:


...

Hi @Robin, from these discussions it seems they are not ok with the 
change
for all SoC vendor implementations and do not have any data on such 
impact.
As I mentioned above, on QCOM platforms we do have several 
optimizations in HW
for TLBIs and would like to make use of it and reduce the unmap 
latency.

What do you think, should this be made implementation specific?


Yes, it sounds like there's enough uncertainty for now that this needs
to be an opt-in feature. However, I still think that non-strict mode
could use it generically, since that's all about over-invalidating to
save time on individual unmaps - and relatively non-deterministic -
already.

So maybe we have a second set of iommu_flush_ops, or just a flag
somewhere to control the tlb_flush_walk functions internally, and the
choice can be made in the iommu_get_dma_strict() test, but also forced
on all the time by your init_context hook. What do you reckon?



Sounds good to me. Since you mentioned non-strict mode using it
generically, can't we just set tlb_flush_all() in
io_pgtable_tlb_flush_walk() like below based on quirk, so that we don't
need to add any check in iommu_get_dma_strict() and just force the new
flush_ops in the init_context hook?

if (iop->cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT) {
iop->cfg.tlb->tlb_flush_all(iop->cookie);
return;
}



Instead of new flush_ops in the init_context hook, perhaps an io_pgtable
quirk, since this is related to the TLB (probably a bad name, but say
IO_PGTABLE_QUIRK_TLB_INV), which would be set in the init_context impl
hook; the previous condition in io_pgtable_tlb_flush_walk() then becomes
something like below. That seems very minimal and neat, instead of poking
into the tlb_flush_walk functions or touching dma strict with some flag.

if (iop->cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT ||
iop->cfg.quirks & IO_PGTABLE_QUIRK_TLB_INV) {
iop->cfg.tlb->tlb_flush_all(iop->cookie);
return;
}
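
For concreteness, a rough sketch of how the pieces could fit together;
IO_PGTABLE_QUIRK_TLB_INV is the hypothetical name from above, and the quirk
bit value and the qcom hook shown are assumptions, not actual code:

/* Hypothetical quirk bit; the name and value are placeholders. */
#define IO_PGTABLE_QUIRK_TLB_INV	BIT(7)

static inline void
io_pgtable_tlb_flush_walk(struct io_pgtable *iop, unsigned long iova,
			  size_t size, size_t granule)
{
	if (iop->cfg.quirks & (IO_PGTABLE_QUIRK_NON_STRICT |
			       IO_PGTABLE_QUIRK_TLB_INV)) {
		/* Over-invalidate: issue one full TLB invalidation
		 * instead of walking the unmapped range. */
		iop->cfg.tlb->tlb_flush_all(iop->cookie);
		return;
	}
	if (iop->cfg.tlb && iop->cfg.tlb->tlb_flush_walk)
		iop->cfg.tlb->tlb_flush_walk(iova, size, granule, iop->cookie);
}

/* An implementation's init_context hook would then opt in, e.g.: */
static int qcom_smmu_init_context(struct arm_smmu_domain *smmu_domain,
				  struct io_pgtable_cfg *pgtbl_cfg,
				  struct device *dev)
{
	pgtbl_cfg->quirks |= IO_PGTABLE_QUIRK_TLB_INV;
	return 0;
}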

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


Re: [PATCH v12 3/5] iommu/vt-d: Add support for IOMMU default DMA mode build options

2021-06-16 Thread Lu Baolu

On 2021/6/15 16:25, Robin Murphy wrote:

On 2021-06-15 08:26, Lu Baolu wrote:

Hi John,

On 6/14/21 4:03 PM, John Garry wrote:

On 12/06/2021 03:14, Lu Baolu wrote:

On 2021/6/11 20:20, John Garry wrote:

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 2a71347611d4..4467353f981b 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -94,6 +94,7 @@ choice
  prompt "IOMMU default DMA mode"
  depends on IOMMU_DMA
+    default IOMMU_DEFAULT_LAZY if INTEL_IOMMU
  default IOMMU_DEFAULT_STRICT


If the two default values are different, which one will be overridden?


If I understand your question correctly, I think you are asking if 
both are set:

CONFIG_IOMMU_DEFAULT_LAZY=y
CONFIG_IOMMU_DEFAULT_STRICT=y

If this happens, then make defconfig complains about both being set, 
and selects the 2nd, whatever that is.


On x86, Intel or AMD, this will be

 prompt "IOMMU default DMA mode"
   depends on IOMMU_DMA
   default IOMMU_DEFAULT_LAZY
   default IOMMU_DEFAULT_STRICT

which will be the default, LAZY or STRICT? Will it cause complaints?


See Documentation/kbuild/kconfig-language.rst:

   A config option can have any number of default values. If multiple
   default values are visible, only the first defined one is active.


Get it. Thank you, Robin.

Best regards,
baolu

Re: [PATCH v12 09/12] swiotlb: Add restricted DMA alloc/free support

2021-06-16 Thread Christoph Hellwig
On Wed, Jun 16, 2021 at 02:21:54PM +0800, Claire Chang wrote:
> Add the functions, swiotlb_{alloc,free} and is_swiotlb_for_alloc to
> support the memory allocation from restricted DMA pool.
> 
> The restricted DMA pool is preferred if available.
> 
> Note that since coherent allocation needs remapping, one must set up
> another device coherent pool by shared-dma-pool and use
> dma_alloc_from_dev_coherent instead for atomic coherent allocation.
> 
> Signed-off-by: Claire Chang 
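
For orientation, the allocation preference described above reduces to
roughly the following sketch. swiotlb_alloc() and is_swiotlb_for_alloc()
are the functions named in the commit message; the surrounding glue
function is illustrative:

/* Sketch: devices with a restricted DMA pool must allocate from it. */
static struct page *alloc_dma_pages(struct device *dev, size_t size)
{
	if (is_swiotlb_for_alloc(dev))
		return swiotlb_alloc(dev, size);

	return alloc_pages(GFP_KERNEL, get_order(size));
}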

Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH v12 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-06-16 Thread Christoph Hellwig
On Wed, Jun 16, 2021 at 02:21:51PM +0800, Claire Chang wrote:
> Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
> use it to determine whether to bounce the data or not. This will be
> useful later to allow for different pools.
> 
> Signed-off-by: Claire Chang 
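
In sketch form, the resulting per-pool check looks roughly like this
(dev->dma_io_tlb_mem is the per-device pool pointer from this series;
exact details may differ):

/* Sketch: the bounce decision now comes from the device's own pool. */
static inline bool is_swiotlb_force_bounce(struct device *dev)
{
	struct io_tlb_mem *mem = dev->dma_io_tlb_mem;

	return mem && mem->force_bounce;
}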

Looks good,

Reviewed-by: Christoph Hellwig 


Re: [PATCH] iommu/io-pgtable-arm: Optimize partial walk flush for large scatter-gather list

2021-06-16 Thread Sai Prakash Ranjan

Hi Robin,

On 2021-06-15 19:23, Robin Murphy wrote:

On 2021-06-15 12:51, Sai Prakash Ranjan wrote:


...

Hi @Robin, from these discussions it seems they are not ok with the 
change
for all SoC vendor implementations and do not have any data on such 
impact.
As I mentioned above, on QCOM platforms we do have several 
optimizations in HW
for TLBIs and would like to make use of it and reduce the unmap 
latency.

What do you think, should this be made implementation specific?


Yes, it sounds like there's enough uncertainty for now that this needs
to be an opt-in feature. However, I still think that non-strict mode
could use it generically, since that's all about over-invalidating to
save time on individual unmaps - and relatively non-deterministic -
already.

So maybe we have a second set of iommu_flush_ops, or just a flag
somewhere to control the tlb_flush_walk functions internally, and the
choice can be made in the iommu_get_dma_strict() test, but also forced
on all the time by your init_context hook. What do you reckon?



Sounds good to me. Since you mentioned non-strict mode using it
generically, can't we just set tlb_flush_all() in
io_pgtable_tlb_flush_walk() like below based on quirk, so that we don't
need to add any check in iommu_get_dma_strict() and just force the new
flush_ops in the init_context hook?

if (iop->cfg.quirks & IO_PGTABLE_QUIRK_NON_STRICT) {
iop->cfg.tlb->tlb_flush_all(iop->cookie);
return;
}

Thanks,
Sai

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a 
member

of Code Aurora Forum, hosted by The Linux Foundation


RE: Plan for /dev/ioasid RFC v2

2021-06-16 Thread Tian, Kevin
> From: Alex Williamson 
> Sent: Wednesday, June 16, 2021 12:56 AM
> 
> On Tue, 15 Jun 2021 01:21:35 +
> "Tian, Kevin"  wrote:
> 
> > > From: Jason Gunthorpe 
> > > Sent: Monday, June 14, 2021 9:38 PM
> > >
> > > On Mon, Jun 14, 2021 at 03:09:31AM +, Tian, Kevin wrote:
> > >
> > > > If a device can be always blocked from accessing memory in the IOMMU
> > > > before it's bound to a driver or more specifically before the driver
> > > > moves it to a new security context, then there is no need for VFIO
> > > > to track whether IOASIDfd has taken over ownership of the DMA
> > > > context for all devices within a group.
> > >
> > > I've been assuming we'd do something like this, where when a device is
> > > first turned into a VFIO it tells the IOMMU layer that this device
> > > should be DMA blocked unless an IOASID is attached to
> > > it. Disconnecting an IOASID returns it to blocked.
> >
> > Or just make sure a device is in block-DMA when it's unbound from a
> > driver or a security context. Then no need to explicitly tell IOMMU layer
> > to do so when it's bound to a new driver.
> >
> > Currently the default domain type applies even when a device is not
> > bound. This implies that if iommu=passthrough a device is always
> > allowed to access arbitrary system memory with or without a driver.
> > I feel the current domain type (identity, dma, unmanged) should apply
> > only when a driver is loaded...
> 
> Note that vfio does not currently require all devices in the group to
> be bound to drivers.  Other devices within the group, those bound to
> vfio drivers, can be used in this configuration.  This is not
> necessarily recommended though as a non-vfio, non-stub driver binding
> to one of those devices can trigger a BUG_ON.

This is a good point that I didn't realize before.

As explained in the previous mail, we need to reuse the group_viable
mechanism to trigger the BUG_ON.

> 
> > > > If this works I didn't see the need for vfio to keep the sequence.
> > > > VFIO still keeps group fd to claim ownership of all devices in a
> > > > group.
> > >
> > > As Alex says you still have to deal with the problem that device A in
> > > a group can gain control of device B in the same group.
> >
> > There is no isolation in the group, so how could vfio prevent device
> > A from gaining control of device B? For example, when both are attached
> > to the same GPA address space with the device MMIO bars included, devA
> > can do p2p to devB. It's entirely the user's policy how to deal with
> > devices within the group.
> 
> The latter is user policy, yes, but it's a system security issue that
> the user cannot use device A to control device B if the user doesn't
> have access to both devices, ie. doesn't own the group.  vfio would
> prevent this by not allowing access to device A while device B is
> insecure and would require that all devices within the group remain in
> a secure, user owned state for the extent of access to device A.
> 
> > > This means device A and B cannot be used in two different
> > > security contexts.
> >
> > It depends on how the security context is defined. From the iommu
> > layer's p.o.v., an IOASID is a security context which isolates a device
> > from the rest of the system (but not from siblings in the same group).
> > As you suggested earlier, it's completely sane if a user wants to attach
> > devices in a group to different IOASIDs. Here I'm just stating this fact.
> 
> This is sane, yes, but that doesn't give us license to allow the user
> to access device A regardless of the state of device B.
> 
> > >
> > > If the /dev/iommu FD is the security context then the tracking is
> > > needed there.
> > >
> >
> > As I replied to Alex, my point is that VFIO doesn't need to know the
> > attachment status of each device in a group before it can allow the
> > user to access a device. As long as every device in a group is either
> > in block-DMA or switched to a new address space created via the
> > /dev/iommu FD, there's no problem with allowing the user to access it.
> > The user cannot do harm to the world outside of the group, and the user
> > knows there is no isolation within the group. That is it.
> 
> This is self contradictory, "vfio doesn't need to know the attachment
> status"... "[a]s long as every device in a group is either in block-DMA
> or switched to a new address space".  So vfio does need to know the
> latter.  How does it know that?  Thanks,
> 

My point was that, if a device can only be in two states, block-DMA or
attached to a new address space, both of which are secure, then vfio
doesn't need to track which state a device is actually in.
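
To illustrate the invariant (a sketch; these names are hypothetical and
purely for illustration):

/*
 * The two states a device can be in under the proposal. Both are secure
 * by construction, so VFIO need not distinguish between them.
 */
enum dev_dma_state {
	DEV_DMA_BLOCKED,	/* detached/unbound: IOMMU blocks all DMA */
	DEV_DMA_IOASID,		/* attached to an IOASID via /dev/iommu */
};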

Thanks
Kevin

RE: Plan for /dev/ioasid RFC v2

2021-06-16 Thread Tian, Kevin
> From: Alex Williamson 
> Sent: Wednesday, June 16, 2021 12:12 AM
> 
> On Tue, 15 Jun 2021 02:31:39 +
> "Tian, Kevin"  wrote:
> 
> > > From: Alex Williamson 
> > > Sent: Tuesday, June 15, 2021 12:28 AM
> > >
> > [...]
> > > > IOASID. Today the group fd requires an IOASID before it hands out a
> > > > device_fd. With iommu_fd the device_fd will not allow IOCTLs until it
> > > has a blocked DMA IOASID and is successfully joined to an iommu_fd.
> > >
> > > Which is the root of my concern.  Who owns ioctls to the device fd?
> > > It's my understanding this is a vfio provided file descriptor and it's
> > > therefore vfio's responsibility.  A device-level IOASID interface
> > > therefore requires that vfio manage the group aspect of device access.
> > > AFAICT, that means that device access can therefore only begin when all
> > > devices for a given group are attached to the IOASID and must halt for
> > > all devices in the group if any device is ever detached from an IOASID,
> > > even temporarily.  That suggests a lot more oversight of the IOASIDs by
> > > vfio than I'd prefer.
> > >
> >
> > This is possibly the point most worthy of clarification and
> > alignment, as it sounds like the root of the controversy here.
> >
> > I feel the goal of vfio group management is more about ownership, i.e.
> > all devices within a group must be assigned to a single user. Following
> > the three rules defined by Jason, what we really care about is whether
> > a group of devices can be isolated from the rest of the world, i.e. no
> > access to memory/devices outside of its security context and no access
> > to its security context from devices outside of this group. This can be
> > achieved as long as every device in the group is either in the block-DMA
> > state when it's not attached to any security context, or attached to an
> > IOASID context in the IOMMU fd.
> >
> > As long as group-level isolation is satisfied, how devices within a
> > group are further managed is decided by the user (unattached, all
> > attached to the same IOASID, attached to different IOASIDs), as long as
> > the user understands the implication of the lack of isolation within
> > the group. This is where a device-centric model comes into play.
> > Misconfiguration only hurts the user itself.
> >
> > If this rationale can be agreed on, then I don't see the point of
> > having VFIO mandate that all devices in the group be attached/detached
> > in lockstep.
> 
> In theory this sounds great, but there are still too many assumptions
> and too much hand waving about where isolation occurs for me to feel
> like I really have the complete picture.  So let's walk through some
> examples.  Please fill in and correct where I'm wrong.

Thanks for putting together these examples. They are helpful for
clarifying the whole picture.

Before filling them in, let's first align on the key difference between
the current VFIO model and this new proposal. With this comparison we'll
know which of the following questions are answered by the existing VFIO
mechanism and which are handled differently.

With Yi's help we figured out the current mechanism:

1) vfio_group_viable. The code comment explains the intention clearly:

/*
 * A vfio group is viable for use by userspace if all devices are in
 * one of the following states:
 *  - driver-less
 *  - bound to a vfio driver
 *  - bound to an otherwise allowed driver
 *  - a PCI interconnect device
 */

Note this check is not related to an IOMMU security context.

2) vfio_iommu_group_notifier. When an IOMMU_GROUP_NOTIFY_BOUND_DRIVER
event is notified, vfio_group_viable is re-evaluated. If the affected
group was previously viable but now becomes non-viable, BUG_ON(), as it
implies that this device is bound to a non-vfio driver which breaks the
group isolation.

3) vfio_group_get_device_fd. User can acquire a device fd only after:
a) the group is viable;
b) the group is attached to a container;
c) iommu is set on the container (implying a security context is
established).
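
Roughly, the gate in today's vfio_group_get_device_fd() looks like the
following (a paraphrased sketch of drivers/vfio/vfio.c, not an exact
copy):

static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
{
	/* a) + b) + c): viable, attached to a container, iommu set */
	if (atomic_read(&group->container_users) == 0 ||
	    !group->container->iommu_driver || !vfio_group_viable(group))
		return -EINVAL;

	/* ... look up the device by name and hand out the fd ... */
}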

The new device-centric proposal suggests:

1) vfio_group_viable;
2) vfio_iommu_group_notifier;
3) block-DMA if a device is detached from its previous domain (instead of
switching back to the default domain as today);
4) vfio_group_get_device_fd. User can acquire a device fd once the group
is viable;
5) device-centric when binding to IOMMU fd or attaching to IOASID

In this model the group viability mechanism is kept, but there is no need
for VFIO to track the actual attach status.
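
Under the proposal, the same gate would shrink to the viability check
alone (hypothetical sketch, only to make the contrast concrete):

static int vfio_group_get_device_fd(struct vfio_group *group, char *buf)
{
	/*
	 * Only ownership/viability matters; per-device attach state is
	 * not tracked, because a detached device sits in block-DMA anyway.
	 */
	if (!vfio_group_viable(group))
		return -EINVAL;

	/* ... hand out the fd; binding to the IOMMU fd and attaching to
	 * an IOASID happen later, per device ... */
}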

Now let's look at how the new model works.

> 
> 1) A dual-function PCIe e1000e NIC where the functions are grouped
>    together due to ACS isolation issues.
> 
>    a) Initial state: functions 0 & 1 are both bound to e1000e driver.
> 
>    b) Admin uses driverctl to bind function 1 to vfio-pci, creating
>       vfio device file, which is chmod'd to grant to a user.

This implies that function 1 is in block-DMA mode when it's unbound
from e1000e.

> 
>    c) User opens vfio function 

Re: [PATCH v4 0/6] Add support for ACPI VIOT

2021-06-16 Thread Jean-Philippe Brucker
Hi Rafael,

On Thu, Jun 10, 2021 at 09:51:27AM +0200, Jean-Philippe Brucker wrote:
> Add a driver for the ACPI VIOT table, which provides topology
> information for para-virtual IOMMUs. Enable virtio-iommu on
> non-devicetree platforms, including x86.
> 
> Since v3 [1] I fixed a build bug for !CONFIG_IOMMU_API. Joerg offered to
> take this series through the IOMMU tree, which requires Acks for patches
> 1-3.

I was wondering if you could take a look at patches 1-3, otherwise we'll
miss the mark for 5.14 since I won't be able to resend next week. The
series adds support for virtio-iommu on QEMU and cloud hypervisor.

Thanks,
Jean

> 
> You can find a QEMU implementation at [2], with extra support for
> testing all VIOT nodes including MMIO-based endpoints and IOMMU.
> This series is at [3].
> 
> [1] 
> https://lore.kernel.org/linux-iommu/2021060215.1077006-1-jean-phili...@linaro.org/
> [2] https://jpbrucker.net/git/qemu/log/?h=virtio-iommu/acpi
> [3] https://jpbrucker.net/git/linux/log/?h=virtio-iommu/acpi
> 
> 
> Jean-Philippe Brucker (6):
>   ACPI: arm64: Move DMA setup operations out of IORT
>   ACPI: Move IOMMU setup code out of IORT
>   ACPI: Add driver for the VIOT table
>   iommu/dma: Pass address limit rather than size to
> iommu_setup_dma_ops()
>   iommu/dma: Simplify calls to iommu_setup_dma_ops()
>   iommu/virtio: Enable x86 support
> 
>  drivers/acpi/Kconfig |   3 +
>  drivers/iommu/Kconfig|   4 +-
>  drivers/acpi/Makefile|   2 +
>  drivers/acpi/arm64/Makefile  |   1 +
>  include/acpi/acpi_bus.h  |   3 +
>  include/linux/acpi.h |   3 +
>  include/linux/acpi_iort.h|  14 +-
>  include/linux/acpi_viot.h|  19 ++
>  include/linux/dma-iommu.h|   4 +-
>  arch/arm64/mm/dma-mapping.c  |   2 +-
>  drivers/acpi/arm64/dma.c |  50 +
>  drivers/acpi/arm64/iort.c| 129 ++---
>  drivers/acpi/bus.c   |   2 +
>  drivers/acpi/scan.c  |  78 +++-
>  drivers/acpi/viot.c  | 364 +++
>  drivers/iommu/amd/iommu.c|   9 +-
>  drivers/iommu/dma-iommu.c|  17 +-
>  drivers/iommu/intel/iommu.c  |  10 +-
>  drivers/iommu/virtio-iommu.c |   8 +
>  MAINTAINERS  |   8 +
>  20 files changed, 580 insertions(+), 150 deletions(-)
>  create mode 100644 include/linux/acpi_viot.h
>  create mode 100644 drivers/acpi/arm64/dma.c
>  create mode 100644 drivers/acpi/viot.c
> 
> -- 
> 2.31.1
> 


Re: [PATCH v11 00/12] Restricted DMA

2021-06-16 Thread Claire Chang
v12: https://lore.kernel.org/patchwork/cover/1447254/

On Wed, Jun 16, 2021 at 11:52 AM Claire Chang  wrote:
>
> This series implements mitigations for lack of DMA access control on
> systems without an IOMMU, which could result in the DMA accessing the
> system memory at unexpected times and/or unexpected addresses, possibly
> leading to data leakage or corruption.
>
> For example, we plan to use the PCI-e bus for Wi-Fi and that PCI-e bus is
> not behind an IOMMU. As PCI-e, by design, gives the device full access to
> system memory, a vulnerability in the Wi-Fi firmware could easily escalate
> to a full system exploit (remote wifi exploits: [1a], [1b] that shows a
> full chain of exploits; [2], [3]).
>
> To mitigate the security concerns, we introduce restricted DMA. Restricted
> DMA utilizes the existing swiotlb to bounce streaming DMA in and out of a
> specially allocated region and does memory allocation from the same region.
> The feature on its own provides a basic level of protection against the DMA
> overwriting buffer contents at unexpected times. However, to protect
> against general data leakage and system memory corruption, the system needs
> to provide a way to restrict the DMA to a predefined memory region (this is
> usually done at firmware level, e.g. MPU in ATF on some ARM platforms [4]).
>
> [1a] 
> https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_4.html
> [1b] 
> https://googleprojectzero.blogspot.com/2017/04/over-air-exploiting-broadcoms-wi-fi_11.html
> [2] https://blade.tencent.com/en/advisories/qualpwn/
> [3] 
> https://www.bleepingcomputer.com/news/security/vulnerabilities-found-in-highly-popular-firmware-for-wifi-chips/
> [4] 
> https://github.com/ARM-software/arm-trusted-firmware/blob/master/plat/mediatek/mt8183/drivers/emi_mpu/emi_mpu.c#L132
>
> v11:
> - Rebase against swiotlb devel/for-linus-5.14
> - s/mempry/memory/g
> - exchange the order of patch 09/12 and 10/12
> https://lore.kernel.org/patchwork/cover/1446882/
>
> v10:
> Address the comments in v9 to
>   - fix the dev->dma_io_tlb_mem assignment
>   - propagate swiotlb_force setting into io_tlb_default_mem->force
>   - move set_memory_decrypted out of swiotlb_init_io_tlb_mem
>   - move debugfs_dir declaration into the main CONFIG_DEBUG_FS block
>   - add swiotlb_ prefix to find_slots and release_slots
>   - merge the 3 alloc/free related patches
>   - move the CONFIG_DMA_RESTRICTED_POOL later
>
> v9:
> Address the comments in v7 to
>   - set swiotlb active pool to dev->dma_io_tlb_mem
>   - get rid of get_io_tlb_mem
>   - dig out the device struct for is_swiotlb_active
>   - move debugfs_create_dir out of swiotlb_create_debugfs
>   - do set_memory_decrypted conditionally in swiotlb_init_io_tlb_mem
>   - use IS_ENABLED in kernel/dma/direct.c
>   - fix redefinition of 'of_dma_set_restricted_buffer'
> https://lore.kernel.org/patchwork/cover/1445081/
>
> v8:
> - Fix reserved-memory.txt and add the reg property in example.
> - Fix sizeof for of_property_count_elems_of_size in
>   drivers/of/address.c#of_dma_set_restricted_buffer.
> - Apply Will's suggestion to try the OF node having DMA configuration in
>   drivers/of/address.c#of_dma_set_restricted_buffer.
> - Fix typo in the comment of 
> drivers/of/address.c#of_dma_set_restricted_buffer.
> - Add error message for PageHighMem in
>   kernel/dma/swiotlb.c#rmem_swiotlb_device_init and move it to
>   rmem_swiotlb_setup.
> - Fix the message string in rmem_swiotlb_setup.
> https://lore.kernel.org/patchwork/cover/1437112/
>
> v7:
> Fix debugfs, PageHighMem and comment style in rmem_swiotlb_device_init
> https://lore.kernel.org/patchwork/cover/1431031/
>
> v6:
> Address the comments in v5
> https://lore.kernel.org/patchwork/cover/1423201/
>
> v5:
> Rebase on latest linux-next
> https://lore.kernel.org/patchwork/cover/1416899/
>
> v4:
> - Fix spinlock bad magic
> - Use rmem->name for debugfs entry
> - Address the comments in v3
> https://lore.kernel.org/patchwork/cover/1378113/
>
> v3:
> Using only one reserved memory region for both streaming DMA and memory
> allocation.
> https://lore.kernel.org/patchwork/cover/1360992/
>
> v2:
> Building on top of swiotlb.
> https://lore.kernel.org/patchwork/cover/1280705/
>
> v1:
> Using dma_map_ops.
> https://lore.kernel.org/patchwork/cover/1271660/
>
> Claire Chang (12):
>   swiotlb: Refactor swiotlb init functions
>   swiotlb: Refactor swiotlb_create_debugfs
>   swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used
>   swiotlb: Update is_swiotlb_buffer to add a struct device argument
>   swiotlb: Update is_swiotlb_active to add a struct device argument
>   swiotlb: Use is_dev_swiotlb_force for swiotlb data bouncing
>   swiotlb: Move alloc_size to swiotlb_find_slots
>   swiotlb: Refactor swiotlb_tbl_unmap_single
>   swiotlb: Add restricted DMA alloc/free support
>   swiotlb: Add restricted DMA pool initialization
>   dt-bindings: of: Add restricted DMA pool
>   of: Add plumbing for restricted DMA pool
>
>  

[PATCH v12 12/12] of: Add plumbing for restricted DMA pool

2021-06-16 Thread Claire Chang
If a device is not behind an IOMMU, we look up the device node and set
up the restricted DMA when the restricted-dma-pool property is present.
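
For context, the resulting probe-time flow is roughly (a call-graph
sketch of the functions touched by this patch, not additional code):

	of_dma_configure_id(dev, np, ...)
	  -> arch_setup_dma_ops(...)
	  -> if (!iommu)
		 of_dma_set_restricted_buffer(dev, np)
		   -> of_reserved_mem_device_init_by_idx(...)
		      /* when a "restricted-dma-pool" memory-region is found */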

Signed-off-by: Claire Chang 
---
 drivers/of/address.c| 33 +
 drivers/of/device.c |  3 +++
 drivers/of/of_private.h |  6 ++
 3 files changed, 42 insertions(+)

diff --git a/drivers/of/address.c b/drivers/of/address.c
index 73ddf2540f3f..cdf700fba5c4 100644
--- a/drivers/of/address.c
+++ b/drivers/of/address.c
@@ -8,6 +8,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -1022,6 +1023,38 @@ int of_dma_get_range(struct device_node *np, const struct bus_dma_region **map)
of_node_put(node);
return ret;
 }
+
+int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np)
+{
+   struct device_node *node, *of_node = dev->of_node;
+   int count, i;
+
+   count = of_property_count_elems_of_size(of_node, "memory-region",
+   sizeof(u32));
+   /*
+* If dev->of_node doesn't exist or doesn't contain memory-region, try
+* the OF node having DMA configuration.
+*/
+   if (count <= 0) {
+   of_node = np;
+   count = of_property_count_elems_of_size(
+   of_node, "memory-region", sizeof(u32));
+   }
+
+   for (i = 0; i < count; i++) {
+   node = of_parse_phandle(of_node, "memory-region", i);
+   /*
+* There might be multiple memory regions, but only one
+* restricted-dma-pool region is allowed.
+*/
+   if (of_device_is_compatible(node, "restricted-dma-pool") &&
+   of_device_is_available(node))
+   return of_reserved_mem_device_init_by_idx(dev, of_node,
+ i);
+   }
+
+   return 0;
+}
 #endif /* CONFIG_HAS_DMA */
 
 /**
diff --git a/drivers/of/device.c b/drivers/of/device.c
index 6cb86de404f1..e68316836a7a 100644
--- a/drivers/of/device.c
+++ b/drivers/of/device.c
@@ -165,6 +165,9 @@ int of_dma_configure_id(struct device *dev, struct device_node *np,
 
arch_setup_dma_ops(dev, dma_start, size, iommu, coherent);
 
+   if (!iommu)
+   return of_dma_set_restricted_buffer(dev, np);
+
return 0;
 }
 EXPORT_SYMBOL_GPL(of_dma_configure_id);
diff --git a/drivers/of/of_private.h b/drivers/of/of_private.h
index d9e6a324de0a..25cebbed5f02 100644
--- a/drivers/of/of_private.h
+++ b/drivers/of/of_private.h
@@ -161,12 +161,18 @@ struct bus_dma_region;
 #if defined(CONFIG_OF_ADDRESS) && defined(CONFIG_HAS_DMA)
 int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map);
+int of_dma_set_restricted_buffer(struct device *dev, struct device_node *np);
 #else
 static inline int of_dma_get_range(struct device_node *np,
const struct bus_dma_region **map)
 {
return -ENODEV;
 }
+static inline int of_dma_set_restricted_buffer(struct device *dev,
+  struct device_node *np)
+{
+   return -ENODEV;
+}
 #endif
 
 #endif /* _LINUX_OF_PRIVATE_H */
-- 
2.32.0.272.g935e593368-goog



[PATCH v12 11/12] dt-bindings: of: Add restricted DMA pool

2021-06-16 Thread Claire Chang
Introduce the new compatible string, restricted-dma-pool, for restricted
DMA. One can specify the address and length of the restricted DMA memory
region by restricted-dma-pool in the reserved-memory node.

Signed-off-by: Claire Chang 
---
 .../reserved-memory/reserved-memory.txt   | 36 +--
 1 file changed, 33 insertions(+), 3 deletions(-)

diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
index e8d3096d922c..46804f24df05 100644
--- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
@@ -51,6 +51,23 @@ compatible (optional) - standard definition
   used as a shared pool of DMA buffers for a set of devices. It can
   be used by an operating system to instantiate the necessary pool
   management subsystem if necessary.
+- restricted-dma-pool: This indicates a region of memory meant to be
+  used as a pool of restricted DMA buffers for a set of devices. The
+  memory region would be the only region accessible to those devices.
+  When using this, the no-map and reusable properties must not be set,
+  so the operating system can create a virtual mapping that will be used
+  for synchronization. The main purpose for restricted DMA is to
+  mitigate the lack of DMA access control on systems without an IOMMU,
+  which could result in the DMA accessing the system memory at
+  unexpected times and/or unexpected addresses, possibly leading to data
+  leakage or corruption. The feature on its own provides a basic level
+  of protection against the DMA overwriting buffer contents at
+  unexpected times. However, to protect against general data leakage and
+  system memory corruption, the system needs to provide a way to lock
+  down the memory access, e.g., MPU. Note that since coherent allocation
+  needs remapping, one must set up another device coherent pool by
+  shared-dma-pool and use dma_alloc_from_dev_coherent instead for atomic
+  coherent allocation.
 - vendor specific string in the form ,[-]
 no-map (optional) - empty property
 - Indicates the operating system must not create a virtual mapping
@@ -85,10 +102,11 @@ memory-region-names (optional) - a list of names, one for each corresponding
 
 Example
 ---
-This example defines 3 contiguous regions are defined for Linux kernel:
+This example defines 4 contiguous regions for Linux kernel:
 one default of all device drivers (named linux,cma@7200 and 64MiB in size),
-one dedicated to the framebuffer device (named framebuffer@7800, 8MiB), and
-one for multimedia processing (named multimedia-memory@7700, 64MiB).
+one dedicated to the framebuffer device (named framebuffer@7800, 8MiB),
+one for multimedia processing (named multimedia-memory@7700, 64MiB), and
+one for restricted dma pool (named restricted_dma_reserved@0x5000, 64MiB).
 
 / {
#address-cells = <1>;
@@ -120,6 +138,11 @@ one for multimedia processing (named multimedia-memory@7700, 64MiB).
compatible = "acme,multimedia-memory";
reg = <0x7700 0x400>;
};
+
+   restricted_dma_reserved: restricted_dma_reserved {
+   compatible = "restricted-dma-pool";
+   reg = <0x5000 0x400>;
+   };
};
 
/* ... */
@@ -138,4 +161,11 @@ one for multimedia processing (named multimedia-memory@7700, 64MiB).
 	memory-region = <&multimedia_reserved>;
/* ... */
};
+
+   pcie_device: pcie_device@0,0 {
+   reg = <0x8301 0x0 0x 0x0 0x0010
+  0x8301 0x0 0x0010 0x0 0x0010>;
+		memory-region = <&restricted_dma_reserved>;
+   /* ... */
+   };
 };
-- 
2.32.0.272.g935e593368-goog



[PATCH v12 10/12] swiotlb: Add restricted DMA pool initialization

2021-06-16 Thread Claire Chang
Add the initialization function to create restricted DMA pools from
matching reserved-memory nodes.

Regardless of swiotlb setting, the restricted DMA pool is preferred if
available.

The restricted DMA pools provide a basic level of protection against the
DMA overwriting buffer contents at unexpected times. However, to protect
against general data leakage and system memory corruption, the system
needs to provide a way to lock down the memory access, e.g., MPU.

Signed-off-by: Claire Chang 
Reviewed-by: Christoph Hellwig 
---
 include/linux/swiotlb.h |  3 +-
 kernel/dma/Kconfig  | 14 
 kernel/dma/swiotlb.c| 76 +
 3 files changed, 92 insertions(+), 1 deletion(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index a73fad460162..175b6c113ed8 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -73,7 +73,8 @@ extern enum swiotlb_force swiotlb_force;
  * range check to see if the memory was in fact allocated by this
  * API.
  * @nslabs:The number of IO TLB blocks (in groups of 64) between @start and
- * @end. This is command line adjustable via setup_io_tlb_npages.
+ * @end. For default swiotlb, this is command line adjustable via
+ * setup_io_tlb_npages.
  * @used:  The number of used IO TLB block.
  * @list:  The free list describing the number of free entries available
  * from each index.
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index 77b405508743..3e961dc39634 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -80,6 +80,20 @@ config SWIOTLB
bool
select NEED_DMA_MAP_STATE
 
+config DMA_RESTRICTED_POOL
+   bool "DMA Restricted Pool"
+   depends on OF && OF_RESERVED_MEM
+   select SWIOTLB
+   help
+ This enables support for restricted DMA pools which provide a level of
+ DMA memory protection on systems with limited hardware protection
+ capabilities, such as those lacking an IOMMU.
+
+	  For more information see
+	  <Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt>
+	  and <kernel/dma/swiotlb.c>.
+ If unsure, say "n".
+
 #
 # Should be selected if we can mmap non-coherent mappings to userspace.
 # The only thing that is really required is a way to set an uncached bit
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index d3d4f1a25fee..8a4d4ad4335e 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -39,6 +39,13 @@
 #ifdef CONFIG_DEBUG_FS
 #include 
 #endif
+#ifdef CONFIG_DMA_RESTRICTED_POOL
+#include 
+#include 
+#include 
+#include 
+#include 
+#endif
 
 #include 
 #include 
@@ -735,4 +742,73 @@ bool swiotlb_free(struct device *dev, struct page *page, size_t size)
return true;
 }
 
+static int rmem_swiotlb_device_init(struct reserved_mem *rmem,
+   struct device *dev)
+{
+   struct io_tlb_mem *mem = rmem->priv;
+   unsigned long nslabs = rmem->size >> IO_TLB_SHIFT;
+
+   /*
+* Since multiple devices can share the same pool, the private data,
+* io_tlb_mem struct, will be initialized by the first device attached
+* to it.
+*/
+   if (!mem) {
+   mem = kzalloc(struct_size(mem, slots, nslabs), GFP_KERNEL);
+   if (!mem)
+   return -ENOMEM;
+
+   swiotlb_init_io_tlb_mem(mem, rmem->base, nslabs, false);
+   mem->force_bounce = true;
+   mem->for_alloc = true;
+   set_memory_decrypted((unsigned long)phys_to_virt(rmem->base),
+rmem->size >> PAGE_SHIFT);
+
+   rmem->priv = mem;
+
+   if (IS_ENABLED(CONFIG_DEBUG_FS)) {
+   mem->debugfs =
+   debugfs_create_dir(rmem->name, debugfs_dir);
+   swiotlb_create_debugfs_files(mem);
+   }
+   }
+
+   dev->dma_io_tlb_mem = mem;
+
+   return 0;
+}
+
+static void rmem_swiotlb_device_release(struct reserved_mem *rmem,
+   struct device *dev)
+{
+   dev->dma_io_tlb_mem = io_tlb_default_mem;
+}
+
+static const struct reserved_mem_ops rmem_swiotlb_ops = {
+   .device_init = rmem_swiotlb_device_init,
+   .device_release = rmem_swiotlb_device_release,
+};
+
+static int __init rmem_swiotlb_setup(struct reserved_mem *rmem)
+{
+   unsigned long node = rmem->fdt_node;
+
+   if (of_get_flat_dt_prop(node, "reusable", NULL) ||
+   of_get_flat_dt_prop(node, "linux,cma-default", NULL) ||
+   of_get_flat_dt_prop(node, "linux,dma-default", NULL) ||
+   of_get_flat_dt_prop(node, "no-map", NULL))
+   return -EINVAL;
+
+	if (PageHighMem(pfn_to_page(PHYS_PFN(rmem->base)))) {
+		pr_err("Restricted DMA pool must be accessible within the linear mapping.");
+   return -EINVAL;
+   }
+
+	rmem->ops = &rmem_swiotlb_ops;
+  

[PATCH v12 09/12] swiotlb: Add restricted DMA alloc/free support

2021-06-16 Thread Claire Chang
Add the functions, swiotlb_{alloc,free} and is_swiotlb_for_alloc to
support the memory allocation from restricted DMA pool.

The restricted DMA pool is preferred if available.

Note that since coherent allocation needs remapping, one must set up
another device coherent pool by shared-dma-pool and use
dma_alloc_from_dev_coherent instead for atomic coherent allocation.
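
In short, the allocation path becomes (a sketch distilled from the diff
below, not additional code):

	__dma_direct_alloc_pages()
	  -> if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) &&
		 is_swiotlb_for_alloc(dev))
		 swiotlb_alloc(dev, size);	/* restricted pool */
	     else
		 dma_alloc_contiguous(dev, size, gfp);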

Signed-off-by: Claire Chang 
---
 include/linux/swiotlb.h | 26 ++
 kernel/dma/direct.c | 49 +++--
 kernel/dma/swiotlb.c| 38 ++--
 3 files changed, 99 insertions(+), 14 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 8d8855c77d9a..a73fad460162 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -85,6 +85,7 @@ extern enum swiotlb_force swiotlb_force;
  * @debugfs:   The dentry to debugfs.
  * @late_alloc:%true if allocated using the page allocator
  * @force_bounce: %true if swiotlb bouncing is forced
+ * @for_alloc:  %true if the pool is used for memory allocation
  */
 struct io_tlb_mem {
phys_addr_t start;
@@ -96,6 +97,7 @@ struct io_tlb_mem {
struct dentry *debugfs;
bool late_alloc;
bool force_bounce;
+   bool for_alloc;
struct io_tlb_slot {
phys_addr_t orig_addr;
size_t alloc_size;
@@ -156,4 +158,28 @@ static inline void swiotlb_adjust_size(unsigned long size)
 extern void swiotlb_print_info(void);
 extern void swiotlb_set_max_segment(unsigned int);
 
+#ifdef CONFIG_DMA_RESTRICTED_POOL
+struct page *swiotlb_alloc(struct device *dev, size_t size);
+bool swiotlb_free(struct device *dev, struct page *page, size_t size);
+
+static inline bool is_swiotlb_for_alloc(struct device *dev)
+{
+   return dev->dma_io_tlb_mem->for_alloc;
+}
+#else
+static inline struct page *swiotlb_alloc(struct device *dev, size_t size)
+{
+   return NULL;
+}
+static inline bool swiotlb_free(struct device *dev, struct page *page,
+   size_t size)
+{
+   return false;
+}
+static inline bool is_swiotlb_for_alloc(struct device *dev)
+{
+   return false;
+}
+#endif /* CONFIG_DMA_RESTRICTED_POOL */
+
 #endif /* __LINUX_SWIOTLB_H */
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index a92465b4eb12..2de33e5d302b 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -75,6 +75,15 @@ static bool dma_coherent_ok(struct device *dev, phys_addr_t phys, size_t size)
min_not_zero(dev->coherent_dma_mask, dev->bus_dma_limit);
 }
 
+static void __dma_direct_free_pages(struct device *dev, struct page *page,
+   size_t size)
+{
+   if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) &&
+   swiotlb_free(dev, page, size))
+   return;
+   dma_free_contiguous(dev, page, size);
+}
+
 static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
gfp_t gfp)
 {
@@ -86,6 +95,16 @@ static struct page *__dma_direct_alloc_pages(struct device *dev, size_t size,
 
 	gfp |= dma_direct_optimal_gfp_mask(dev, dev->coherent_dma_mask,
 					   &phys_limit);
+   if (IS_ENABLED(CONFIG_DMA_RESTRICTED_POOL) &&
+   is_swiotlb_for_alloc(dev)) {
+   page = swiotlb_alloc(dev, size);
+   if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
+   __dma_direct_free_pages(dev, page, size);
+   return NULL;
+   }
+   return page;
+   }
+
page = dma_alloc_contiguous(dev, size, gfp);
if (page && !dma_coherent_ok(dev, page_to_phys(page), size)) {
dma_free_contiguous(dev, page, size);
@@ -142,7 +161,7 @@ void *dma_direct_alloc(struct device *dev, size_t size,
gfp |= __GFP_NOWARN;
 
if ((attrs & DMA_ATTR_NO_KERNEL_MAPPING) &&
-   !force_dma_unencrypted(dev)) {
+   !force_dma_unencrypted(dev) && !is_swiotlb_for_alloc(dev)) {
page = __dma_direct_alloc_pages(dev, size, gfp & ~__GFP_ZERO);
if (!page)
return NULL;
@@ -155,18 +174,23 @@ void *dma_direct_alloc(struct device *dev, size_t size,
}
 
if (!IS_ENABLED(CONFIG_ARCH_HAS_DMA_SET_UNCACHED) &&
-   !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) &&
-   !dev_is_dma_coherent(dev))
+   !IS_ENABLED(CONFIG_DMA_DIRECT_REMAP) && !dev_is_dma_coherent(dev) &&
+   !is_swiotlb_for_alloc(dev))
return arch_dma_alloc(dev, size, dma_handle, gfp, attrs);
 
/*
 * Remapping or decrypting memory may block. If either is required and
 * we can't block, allocate the memory from the atomic pools.
+* If restricted DMA (i.e., is_swiotlb_for_alloc) is required, one must
+* set up another device coherent pool by shared-dma-pool and use
+* 

[PATCH v12 08/12] swiotlb: Refactor swiotlb_tbl_unmap_single

2021-06-16 Thread Claire Chang
Add a new function, swiotlb_release_slots, to make the code reusable for
supporting different bounce buffer pools.

Signed-off-by: Claire Chang 
Reviewed-by: Christoph Hellwig 
---
 kernel/dma/swiotlb.c | 35 ---
 1 file changed, 20 insertions(+), 15 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index b59e689aa79d..688c6e0c43ff 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -555,27 +555,15 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
return tlb_addr;
 }
 
-/*
- * tlb_addr is the physical address of the bounce buffer to unmap.
- */
-void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
- size_t mapping_size, enum dma_data_direction dir,
- unsigned long attrs)
+static void swiotlb_release_slots(struct device *dev, phys_addr_t tlb_addr)
 {
-   struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem;
+   struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
unsigned long flags;
-   unsigned int offset = swiotlb_align_offset(hwdev, tlb_addr);
+   unsigned int offset = swiotlb_align_offset(dev, tlb_addr);
int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
int nslots = nr_slots(mem->slots[index].alloc_size + offset);
int count, i;
 
-   /*
-* First, sync the memory before unmapping the entry
-*/
-   if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
-   (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL))
-   swiotlb_bounce(hwdev, tlb_addr, mapping_size, DMA_FROM_DEVICE);
-
/*
 * Return the buffer to the free list by setting the corresponding
 * entries to indicate the number of contiguous entries available.
@@ -610,6 +598,23 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
 	spin_unlock_irqrestore(&mem->lock, flags);
 }
 
+/*
+ * tlb_addr is the physical address of the bounce buffer to unmap.
+ */
+void swiotlb_tbl_unmap_single(struct device *dev, phys_addr_t tlb_addr,
+ size_t mapping_size, enum dma_data_direction dir,
+ unsigned long attrs)
+{
+   /*
+* First, sync the memory before unmapping the entry
+*/
+   if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
+   (dir == DMA_FROM_DEVICE || dir == DMA_BIDIRECTIONAL))
+   swiotlb_bounce(dev, tlb_addr, mapping_size, DMA_FROM_DEVICE);
+
+   swiotlb_release_slots(dev, tlb_addr);
+}
+
 void swiotlb_sync_single_for_device(struct device *dev, phys_addr_t tlb_addr,
size_t size, enum dma_data_direction dir)
 {
-- 
2.32.0.272.g935e593368-goog



[PATCH v12 07/12] swiotlb: Move alloc_size to swiotlb_find_slots

2021-06-16 Thread Claire Chang
Rename find_slots to swiotlb_find_slots and move the maintenance of
alloc_size to it for better code reusability later.

Signed-off-by: Claire Chang 
Reviewed-by: Christoph Hellwig 
---
 kernel/dma/swiotlb.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index b5a9c4c0b4db..b59e689aa79d 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -431,8 +431,8 @@ static unsigned int wrap_index(struct io_tlb_mem *mem, unsigned int index)
  * Find a suitable number of IO TLB entries size that will fit this request and
  * allocate a buffer from that IO TLB pool.
  */
-static int find_slots(struct device *dev, phys_addr_t orig_addr,
-   size_t alloc_size)
+static int swiotlb_find_slots(struct device *dev, phys_addr_t orig_addr,
+ size_t alloc_size)
 {
struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
unsigned long boundary_mask = dma_get_seg_boundary(dev);
@@ -487,8 +487,11 @@ static int find_slots(struct device *dev, phys_addr_t orig_addr,
return -1;
 
 found:
-   for (i = index; i < index + nslots; i++)
+   for (i = index; i < index + nslots; i++) {
mem->slots[i].list = 0;
+   mem->slots[i].alloc_size =
+   alloc_size - ((i - index) << IO_TLB_SHIFT);
+   }
for (i = index - 1;
 io_tlb_offset(i) != IO_TLB_SEGSIZE - 1 &&
 mem->slots[i].list; i--)
@@ -529,7 +532,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
return (phys_addr_t)DMA_MAPPING_ERROR;
}
 
-   index = find_slots(dev, orig_addr, alloc_size + offset);
+   index = swiotlb_find_slots(dev, orig_addr, alloc_size + offset);
if (index == -1) {
if (!(attrs & DMA_ATTR_NO_WARN))
dev_warn_ratelimited(dev,
@@ -543,11 +546,8 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
 * This is needed when we sync the memory.  Then we sync the buffer if
 * needed.
 */
-   for (i = 0; i < nr_slots(alloc_size + offset); i++) {
+   for (i = 0; i < nr_slots(alloc_size + offset); i++)
mem->slots[index + i].orig_addr = slot_addr(orig_addr, i);
-   mem->slots[index + i].alloc_size =
-   alloc_size - (i << IO_TLB_SHIFT);
-   }
tlb_addr = slot_addr(mem->start, index) + offset;
if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) &&
(dir == DMA_TO_DEVICE || dir == DMA_BIDIRECTIONAL))
-- 
2.32.0.272.g935e593368-goog



[PATCH v12 06/12] swiotlb: Use is_swiotlb_force_bounce for swiotlb data bouncing

2021-06-16 Thread Claire Chang
Propagate the swiotlb_force into io_tlb_default_mem->force_bounce and
use it to determine whether to bounce the data or not. This will be
useful later to allow for different pools.

Signed-off-by: Claire Chang 
---
 include/linux/swiotlb.h | 11 +++
 kernel/dma/direct.c |  2 +-
 kernel/dma/direct.h |  2 +-
 kernel/dma/swiotlb.c|  4 
 4 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index dd1c30a83058..8d8855c77d9a 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -84,6 +84,7 @@ extern enum swiotlb_force swiotlb_force;
  * unmap calls.
  * @debugfs:   The dentry to debugfs.
  * @late_alloc:%true if allocated using the page allocator
+ * @force_bounce: %true if swiotlb bouncing is forced
  */
 struct io_tlb_mem {
phys_addr_t start;
@@ -94,6 +95,7 @@ struct io_tlb_mem {
spinlock_t lock;
struct dentry *debugfs;
bool late_alloc;
+   bool force_bounce;
struct io_tlb_slot {
phys_addr_t orig_addr;
size_t alloc_size;
@@ -109,6 +111,11 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
return mem && paddr >= mem->start && paddr < mem->end;
 }
 
+static inline bool is_swiotlb_force_bounce(struct device *dev)
+{
+   return dev->dma_io_tlb_mem->force_bounce;
+}
+
 void __init swiotlb_exit(void);
 unsigned int swiotlb_max_segment(void);
 size_t swiotlb_max_mapping_size(struct device *dev);
@@ -120,6 +127,10 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
 {
return false;
 }
+static inline bool is_swiotlb_force_bounce(struct device *dev)
+{
+   return false;
+}
 static inline void swiotlb_exit(void)
 {
 }
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 7a88c34d0867..a92465b4eb12 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -496,7 +496,7 @@ size_t dma_direct_max_mapping_size(struct device *dev)
 {
/* If SWIOTLB is active, use its maximum mapping size */
if (is_swiotlb_active(dev) &&
-   (dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE))
+   (dma_addressing_limited(dev) || is_swiotlb_force_bounce(dev)))
return swiotlb_max_mapping_size(dev);
return SIZE_MAX;
 }
diff --git a/kernel/dma/direct.h b/kernel/dma/direct.h
index 13e9e7158d94..4632b0f4f72e 100644
--- a/kernel/dma/direct.h
+++ b/kernel/dma/direct.h
@@ -87,7 +87,7 @@ static inline dma_addr_t dma_direct_map_page(struct device *dev,
phys_addr_t phys = page_to_phys(page) + offset;
dma_addr_t dma_addr = phys_to_dma(dev, phys);
 
-   if (unlikely(swiotlb_force == SWIOTLB_FORCE))
+   if (is_swiotlb_force_bounce(dev))
return swiotlb_map(dev, phys, size, dir, attrs);
 
if (unlikely(!dma_capable(dev, dma_addr, size, true))) {
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 101abeb0a57d..b5a9c4c0b4db 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -179,6 +179,10 @@ static void swiotlb_init_io_tlb_mem(struct io_tlb_mem *mem, phys_addr_t start,
mem->end = mem->start + bytes;
mem->index = 0;
mem->late_alloc = late_alloc;
+
+   if (swiotlb_force == SWIOTLB_FORCE)
+   mem->force_bounce = true;
+
 	spin_lock_init(&mem->lock);
for (i = 0; i < mem->nslabs; i++) {
mem->slots[i].list = IO_TLB_SEGSIZE - io_tlb_offset(i);
-- 
2.32.0.272.g935e593368-goog



[PATCH v12 05/12] swiotlb: Update is_swiotlb_active to add a struct device argument

2021-06-16 Thread Claire Chang
Update is_swiotlb_active to add a struct device argument. This will be
useful later to allow for different pools.

Signed-off-by: Claire Chang 
Reviewed-by: Christoph Hellwig 
---
 drivers/gpu/drm/i915/gem/i915_gem_internal.c | 2 +-
 drivers/gpu/drm/nouveau/nouveau_ttm.c| 2 +-
 drivers/pci/xen-pcifront.c   | 2 +-
 include/linux/swiotlb.h  | 4 ++--
 kernel/dma/direct.c  | 2 +-
 kernel/dma/swiotlb.c | 4 ++--
 6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_internal.c b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
index a9d65fc8aa0e..4b7afa0fc85d 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_internal.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_internal.c
@@ -42,7 +42,7 @@ static int i915_gem_object_get_pages_internal(struct drm_i915_gem_object *obj)
 
max_order = MAX_ORDER;
 #ifdef CONFIG_SWIOTLB
-   if (is_swiotlb_active()) {
+   if (is_swiotlb_active(obj->base.dev->dev)) {
unsigned int max_segment;
 
max_segment = swiotlb_max_segment();
diff --git a/drivers/gpu/drm/nouveau/nouveau_ttm.c b/drivers/gpu/drm/nouveau/nouveau_ttm.c
index 9662522aa066..be15bfd9e0ee 100644
--- a/drivers/gpu/drm/nouveau/nouveau_ttm.c
+++ b/drivers/gpu/drm/nouveau/nouveau_ttm.c
@@ -321,7 +321,7 @@ nouveau_ttm_init(struct nouveau_drm *drm)
}
 
 #if IS_ENABLED(CONFIG_SWIOTLB) && IS_ENABLED(CONFIG_X86)
-   need_swiotlb = is_swiotlb_active();
+   need_swiotlb = is_swiotlb_active(dev->dev);
 #endif
 
 	ret = ttm_bo_device_init(&drm->ttm.bdev, &nouveau_bo_driver,
diff --git a/drivers/pci/xen-pcifront.c b/drivers/pci/xen-pcifront.c
index b7a8f3a1921f..0d56985bfe81 100644
--- a/drivers/pci/xen-pcifront.c
+++ b/drivers/pci/xen-pcifront.c
@@ -693,7 +693,7 @@ static int pcifront_connect_and_init_dma(struct pcifront_device *pdev)
 
 	spin_unlock(&pcifront_dev_lock);
 
-   if (!err && !is_swiotlb_active()) {
+	if (!err && !is_swiotlb_active(&pdev->xdev->dev)) {
err = pci_xen_swiotlb_init_late();
if (err)
 			dev_err(&pdev->xdev->dev, "Could not setup SWIOTLB!\n");
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index d1f3d95881cd..dd1c30a83058 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -112,7 +112,7 @@ static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
 void __init swiotlb_exit(void);
 unsigned int swiotlb_max_segment(void);
 size_t swiotlb_max_mapping_size(struct device *dev);
-bool is_swiotlb_active(void);
+bool is_swiotlb_active(struct device *dev);
 void __init swiotlb_adjust_size(unsigned long size);
 #else
 #define swiotlb_force SWIOTLB_NO_FORCE
@@ -132,7 +132,7 @@ static inline size_t swiotlb_max_mapping_size(struct device *dev)
return SIZE_MAX;
 }
 
-static inline bool is_swiotlb_active(void)
+static inline bool is_swiotlb_active(struct device *dev)
 {
return false;
 }
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 84c9feb5474a..7a88c34d0867 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -495,7 +495,7 @@ int dma_direct_supported(struct device *dev, u64 mask)
 size_t dma_direct_max_mapping_size(struct device *dev)
 {
/* If SWIOTLB is active, use its maximum mapping size */
-   if (is_swiotlb_active() &&
+   if (is_swiotlb_active(dev) &&
(dma_addressing_limited(dev) || swiotlb_force == SWIOTLB_FORCE))
return swiotlb_max_mapping_size(dev);
return SIZE_MAX;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index a9f5c08dd94a..101abeb0a57d 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -663,9 +663,9 @@ size_t swiotlb_max_mapping_size(struct device *dev)
return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE;
 }
 
-bool is_swiotlb_active(void)
+bool is_swiotlb_active(struct device *dev)
 {
-   return io_tlb_default_mem != NULL;
+   return dev->dma_io_tlb_mem != NULL;
 }
 EXPORT_SYMBOL_GPL(is_swiotlb_active);
 
-- 
2.32.0.272.g935e593368-goog



[PATCH v12 04/12] swiotlb: Update is_swiotlb_buffer to add a struct device argument

2021-06-16 Thread Claire Chang
Update is_swiotlb_buffer to add a struct device argument. This will be
useful later to allow for different pools.

Signed-off-by: Claire Chang 
Reviewed-by: Christoph Hellwig 
---
 drivers/iommu/dma-iommu.c | 12 ++--
 drivers/xen/swiotlb-xen.c |  2 +-
 include/linux/swiotlb.h   |  7 ---
 kernel/dma/direct.c   |  6 +++---
 kernel/dma/direct.h   |  6 +++---
 5 files changed, 17 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 3087d9fa6065..10997ef541f8 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -507,7 +507,7 @@ static void __iommu_dma_unmap_swiotlb(struct device *dev, dma_addr_t dma_addr,
 
__iommu_dma_unmap(dev, dma_addr, size);
 
-   if (unlikely(is_swiotlb_buffer(phys)))
+   if (unlikely(is_swiotlb_buffer(dev, phys)))
swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
 }
 
@@ -578,7 +578,7 @@ static dma_addr_t __iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys,
}
 
iova = __iommu_dma_map(dev, phys, aligned_size, prot, dma_mask);
-   if (iova == DMA_MAPPING_ERROR && is_swiotlb_buffer(phys))
+   if (iova == DMA_MAPPING_ERROR && is_swiotlb_buffer(dev, phys))
swiotlb_tbl_unmap_single(dev, phys, org_size, dir, attrs);
return iova;
 }
@@ -749,7 +749,7 @@ static void iommu_dma_sync_single_for_cpu(struct device *dev,
if (!dev_is_dma_coherent(dev))
arch_sync_dma_for_cpu(phys, size, dir);
 
-   if (is_swiotlb_buffer(phys))
+   if (is_swiotlb_buffer(dev, phys))
swiotlb_sync_single_for_cpu(dev, phys, size, dir);
 }
 
@@ -762,7 +762,7 @@ static void iommu_dma_sync_single_for_device(struct device *dev,
return;
 
phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_handle);
-   if (is_swiotlb_buffer(phys))
+   if (is_swiotlb_buffer(dev, phys))
swiotlb_sync_single_for_device(dev, phys, size, dir);
 
if (!dev_is_dma_coherent(dev))
@@ -783,7 +783,7 @@ static void iommu_dma_sync_sg_for_cpu(struct device *dev,
if (!dev_is_dma_coherent(dev))
arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir);
 
-   if (is_swiotlb_buffer(sg_phys(sg)))
+   if (is_swiotlb_buffer(dev, sg_phys(sg)))
swiotlb_sync_single_for_cpu(dev, sg_phys(sg),
sg->length, dir);
}
@@ -800,7 +800,7 @@ static void iommu_dma_sync_sg_for_device(struct device *dev,
return;
 
for_each_sg(sgl, sg, nelems, i) {
-   if (is_swiotlb_buffer(sg_phys(sg)))
+   if (is_swiotlb_buffer(dev, sg_phys(sg)))
swiotlb_sync_single_for_device(dev, sg_phys(sg),
   sg->length, dir);
 
diff --git a/drivers/xen/swiotlb-xen.c b/drivers/xen/swiotlb-xen.c
index 4c89afc0df62..0c6ed09f8513 100644
--- a/drivers/xen/swiotlb-xen.c
+++ b/drivers/xen/swiotlb-xen.c
@@ -100,7 +100,7 @@ static int is_xen_swiotlb_buffer(struct device *dev, dma_addr_t dma_addr)
 * in our domain. Therefore _only_ check address within our domain.
 */
if (pfn_valid(PFN_DOWN(paddr)))
-   return is_swiotlb_buffer(paddr);
+   return is_swiotlb_buffer(dev, paddr);
return 0;
 }
 
diff --git a/include/linux/swiotlb.h b/include/linux/swiotlb.h
index 216854a5e513..d1f3d95881cd 100644
--- a/include/linux/swiotlb.h
+++ b/include/linux/swiotlb.h
@@ -2,6 +2,7 @@
 #ifndef __LINUX_SWIOTLB_H
 #define __LINUX_SWIOTLB_H
 
+#include 
 #include 
 #include 
 #include 
@@ -101,9 +102,9 @@ struct io_tlb_mem {
 };
 extern struct io_tlb_mem *io_tlb_default_mem;
 
-static inline bool is_swiotlb_buffer(phys_addr_t paddr)
+static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
 {
-   struct io_tlb_mem *mem = io_tlb_default_mem;
+   struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
 
return mem && paddr >= mem->start && paddr < mem->end;
 }
@@ -115,7 +116,7 @@ bool is_swiotlb_active(void);
 void __init swiotlb_adjust_size(unsigned long size);
 #else
 #define swiotlb_force SWIOTLB_NO_FORCE
-static inline bool is_swiotlb_buffer(phys_addr_t paddr)
+static inline bool is_swiotlb_buffer(struct device *dev, phys_addr_t paddr)
 {
return false;
 }
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index f737e3347059..84c9feb5474a 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/direct.c
@@ -343,7 +343,7 @@ void dma_direct_sync_sg_for_device(struct device *dev,
for_each_sg(sgl, sg, nents, i) {
phys_addr_t paddr = dma_to_phys(dev, sg_dma_address(sg));
 
-   if (unlikely(is_swiotlb_buffer(paddr)))
+   if (unlikely(is_swiotlb_buffer(dev, paddr)))
swiotlb_sync_single_for_device(dev, paddr, sg->length,

[PATCH v12 03/12] swiotlb: Set dev->dma_io_tlb_mem to the swiotlb pool used

2021-06-16 Thread Claire Chang
Always have the pointer to the swiotlb pool used in struct device. This
could help simplify the code for other pools.

Signed-off-by: Claire Chang 
Reviewed-by: Christoph Hellwig 
---
 drivers/base/core.c| 4 
 include/linux/device.h | 4 
 kernel/dma/swiotlb.c   | 8 
 3 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/drivers/base/core.c b/drivers/base/core.c
index f29839382f81..cb3123e3954d 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -27,6 +27,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include  /* for dma_default_coherent */
 
@@ -2736,6 +2737,9 @@ void device_initialize(struct device *dev)
 defined(CONFIG_ARCH_HAS_SYNC_DMA_FOR_CPU_ALL)
dev->dma_coherent = dma_default_coherent;
 #endif
+#ifdef CONFIG_SWIOTLB
+   dev->dma_io_tlb_mem = io_tlb_default_mem;
+#endif
 }
 EXPORT_SYMBOL_GPL(device_initialize);
 
diff --git a/include/linux/device.h b/include/linux/device.h
index ba660731bd25..240d652a0696 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -416,6 +416,7 @@ struct dev_links_info {
  * @dma_pools: Dma pools (if dma'ble device).
  * @dma_mem:   Internal for coherent mem override.
  * @cma_area:  Contiguous memory area for dma allocations
+ * @dma_io_tlb_mem: Pointer to the swiotlb pool used.  Not for driver use.
  * @archdata:  For arch-specific additions.
  * @of_node:   Associated device tree node.
  * @fwnode:Associated device node supplied by platform firmware.
@@ -518,6 +519,9 @@ struct device {
 #ifdef CONFIG_DMA_CMA
struct cma *cma_area;   /* contiguous memory area for dma
   allocations */
+#endif
+#ifdef CONFIG_SWIOTLB
+   struct io_tlb_mem *dma_io_tlb_mem;
 #endif
/* arch specific additions */
struct dev_archdata archdata;
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index af416bcd1914..a9f5c08dd94a 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -339,7 +339,7 @@ void __init swiotlb_exit(void)
 static void swiotlb_bounce(struct device *dev, phys_addr_t tlb_addr, size_t size,
   enum dma_data_direction dir)
 {
-   struct io_tlb_mem *mem = io_tlb_default_mem;
+   struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
int index = (tlb_addr - mem->start) >> IO_TLB_SHIFT;
unsigned int offset = (tlb_addr - mem->start) & (IO_TLB_SIZE - 1);
phys_addr_t orig_addr = mem->slots[index].orig_addr;
@@ -430,7 +430,7 @@ static unsigned int wrap_index(struct io_tlb_mem *mem, unsigned int index)
 static int find_slots(struct device *dev, phys_addr_t orig_addr,
size_t alloc_size)
 {
-   struct io_tlb_mem *mem = io_tlb_default_mem;
+   struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
unsigned long boundary_mask = dma_get_seg_boundary(dev);
dma_addr_t tbl_dma_addr =
phys_to_dma_unencrypted(dev, mem->start) & boundary_mask;
@@ -507,7 +507,7 @@ phys_addr_t swiotlb_tbl_map_single(struct device *dev, phys_addr_t orig_addr,
size_t mapping_size, size_t alloc_size,
enum dma_data_direction dir, unsigned long attrs)
 {
-   struct io_tlb_mem *mem = io_tlb_default_mem;
+   struct io_tlb_mem *mem = dev->dma_io_tlb_mem;
unsigned int offset = swiotlb_align_offset(dev, orig_addr);
unsigned int i;
int index;
@@ -558,7 +558,7 @@ void swiotlb_tbl_unmap_single(struct device *hwdev, phys_addr_t tlb_addr,
  size_t mapping_size, enum dma_data_direction dir,
  unsigned long attrs)
 {
-   struct io_tlb_mem *mem = io_tlb_default_mem;
+   struct io_tlb_mem *mem = hwdev->dma_io_tlb_mem;
unsigned long flags;
unsigned int offset = swiotlb_align_offset(hwdev, tlb_addr);
int index = (tlb_addr - offset - mem->start) >> IO_TLB_SHIFT;
-- 
2.32.0.272.g935e593368-goog



[PATCH v12 02/12] swiotlb: Refactor swiotlb_create_debugfs

2021-06-16 Thread Claire Chang
Split the debugfs creation to make the code reusable for supporting
different bounce buffer pools.

Signed-off-by: Claire Chang 
Reviewed-by: Christoph Hellwig 
---
 kernel/dma/swiotlb.c | 21 ++---
 1 file changed, 14 insertions(+), 7 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 3ba0f08a39a1..af416bcd1914 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -670,19 +670,26 @@ bool is_swiotlb_active(void)
 EXPORT_SYMBOL_GPL(is_swiotlb_active);
 
 #ifdef CONFIG_DEBUG_FS
+static struct dentry *debugfs_dir;
 
-static int __init swiotlb_create_debugfs(void)
+static void swiotlb_create_debugfs_files(struct io_tlb_mem *mem)
 {
-   struct io_tlb_mem *mem = io_tlb_default_mem;
-
-   if (!mem)
-   return 0;
-   mem->debugfs = debugfs_create_dir("swiotlb", NULL);
debugfs_create_ulong("io_tlb_nslabs", 0400, mem->debugfs, >nslabs);
debugfs_create_ulong("io_tlb_used", 0400, mem->debugfs, >used);
+}
+
+static int __init swiotlb_create_default_debugfs(void)
+{
+   struct io_tlb_mem *mem = io_tlb_default_mem;
+
+   debugfs_dir = debugfs_create_dir("swiotlb", NULL);
+   if (mem) {
+   mem->debugfs = debugfs_dir;
+   swiotlb_create_debugfs_files(mem);
+   }
return 0;
 }
 
-late_initcall(swiotlb_create_debugfs);
+late_initcall(swiotlb_create_default_debugfs);
 
 #endif
-- 
2.32.0.272.g935e593368-goog


