Re: [PATCH v2 01/10] memory: tegra: Implement SID override programming

2021-04-26 Thread Thierry Reding
On Mon, Apr 26, 2021 at 10:28:43AM +0200, Krzysztof Kozlowski wrote:
> On 20/04/2021 19:26, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > Instead of programming all SID overrides during early boot, perform the
> > operation on-demand after the SMMU translations have been set up for a
> > device. This reuses data from device tree to match memory clients for a
> > device and programs the SID specified in device tree, which corresponds
> > to the SID used for the SMMU context banks for the device.
> > 
> > Signed-off-by: Thierry Reding 
> > ---
> >  drivers/memory/tegra/mc.c   |  9 +
> >  drivers/memory/tegra/tegra186.c | 72 +
> >  include/soc/tegra/mc.h  |  3 ++
> >  3 files changed, 84 insertions(+)
> > 
> > diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> > index c854639cf30c..bace5ecfe770 100644
> > --- a/drivers/memory/tegra/mc.c
> > +++ b/drivers/memory/tegra/mc.c
> > @@ -97,6 +97,15 @@ struct tegra_mc *devm_tegra_memory_controller_get(struct 
> > device *dev)
> >  }
> >  EXPORT_SYMBOL_GPL(devm_tegra_memory_controller_get);
> >  
> > +int tegra_mc_probe_device(struct tegra_mc *mc, struct device *dev)
> > +{
> > +   if (mc->soc->ops && mc->soc->ops->probe_device)
> > +   return mc->soc->ops->probe_device(mc, dev);
> > +
> > +   return 0;
> > +}
> > +EXPORT_SYMBOL_GPL(tegra_mc_probe_device);
> > +
> >  static int tegra_mc_block_dma_common(struct tegra_mc *mc,
> >  const struct tegra_mc_reset *rst)
> >  {
> > diff --git a/drivers/memory/tegra/tegra186.c 
> > b/drivers/memory/tegra/tegra186.c
> > index 1f87915ccd62..e65eac5764d4 100644
> > --- a/drivers/memory/tegra/tegra186.c
> > +++ b/drivers/memory/tegra/tegra186.c
> > @@ -4,6 +4,7 @@
> >   */
> >  
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -15,6 +16,10 @@
> >  #include 
> >  #endif
> >  
> > +#define MC_SID_STREAMID_OVERRIDE_MASK GENMASK(7, 0)
> > +#define MC_SID_STREAMID_SECURITY_WRITE_ACCESS_DISABLED BIT(16)
> > +#define MC_SID_STREAMID_SECURITY_OVERRIDE BIT(8)
> > +
> >  static void tegra186_mc_program_sid(struct tegra_mc *mc)
> >  {
> > unsigned int i;
> > @@ -66,10 +71,77 @@ static int tegra186_mc_resume(struct tegra_mc *mc)
> > return 0;
> >  }
> >  
> > +static void tegra186_mc_client_sid_override(struct tegra_mc *mc,
> > +   const struct tegra_mc_client 
> > *client,
> > +   unsigned int sid)
> > +{
> > +   u32 value, old;
> > +
> > +   value = readl(mc->regs + client->regs.sid.security);
> > +   if ((value & MC_SID_STREAMID_SECURITY_OVERRIDE) == 0) {
> > +   /*
> > +* If the secure firmware has locked this down the override
> > +* for this memory client, there's nothing we can do here.
> > +*/
> > +   if (value & MC_SID_STREAMID_SECURITY_WRITE_ACCESS_DISABLED)
> > +   return;
> > +
> > +   /*
> > +* Otherwise, try to set the override itself. Typically the
> > +* secure firmware will never have set this configuration.
> > +* Instead, it will either have disabled write access to
> > +* this field, or it will already have set an explicit
> > +* override itself.
> > +*/
> > +   WARN_ON((value & MC_SID_STREAMID_SECURITY_OVERRIDE) == 0);
> > +
> > +   value |= MC_SID_STREAMID_SECURITY_OVERRIDE;
> > +   writel(value, mc->regs + client->regs.sid.security);
> > +   }
> > +
> > +   value = readl(mc->regs + client->regs.sid.override);
> > +   old = value & MC_SID_STREAMID_OVERRIDE_MASK;
> > +
> > +   if (old != sid) {
> > +   dev_dbg(mc->dev, "overriding SID %x for %s with %x\n", old,
> > +   client->name, sid);
> > +   writel(sid, mc->regs + client->regs.sid.override);
> > +   }
> > +}
> > +
> > +static int tegra186_mc_probe_device(struct tegra_mc *mc, struct device 
> > *dev)
> > +{
> > +#if IS_ENABLED(CONFIG_IOMMU_API)
> 
> Is this part really build-time dependent? I don't see here any uses of
> IOMMU specific fields, so maybe this should be runtime choice based on
> enabled inte

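The override sequence in the quoted tegra186_mc_client_sid_override() can be modelled in user space to make the three cases easier to follow: firmware has locked the client, the override bit is not yet set, or the override is already enabled. This is only a sketch. The bit positions come from the defines in the patch, but struct fake_client_regs and fake_sid_override() are hypothetical names, and plain struct members stand in for the readl()/writel() register accesses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout taken from the defines in the patch. */
#define SID_OVERRIDE_MASK         0xffu       /* GENMASK(7, 0) */
#define SID_SECURITY_OVERRIDE     (1u << 8)   /* BIT(8) */
#define SID_WRITE_ACCESS_DISABLED (1u << 16)  /* BIT(16) */

/* Hypothetical stand-in for one memory client's two SID registers. */
struct fake_client_regs {
	uint32_t security;
	uint32_t override;
};

/*
 * Returns true if the override register holds `sid` afterwards, false if
 * firmware has locked the client so the override could not be enabled.
 */
bool fake_sid_override(struct fake_client_regs *regs, uint32_t sid)
{
	/* The SID must fit into the 8-bit override field. */
	assert((sid & ~SID_OVERRIDE_MASK) == 0);

	if ((regs->security & SID_SECURITY_OVERRIDE) == 0) {
		/* Locked down by secure firmware: nothing we can do. */
		if (regs->security & SID_WRITE_ACCESS_DISABLED)
			return false;

		/* Otherwise enable the override ourselves. */
		regs->security |= SID_SECURITY_OVERRIDE;
	}

	/* Only touch the override register if the SID actually changes. */
	if ((regs->override & SID_OVERRIDE_MASK) != sid)
		regs->override = sid;

	return true;
}
```

Calling fake_sid_override() on a client whose security register has bit 16 set leaves both registers untouched, which mirrors the early return in the patch.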
Re: [PATCH v1 2/2] iommu/tegra-smmu: Revert workaround that was needed for Nyan Big Chromebook

2021-04-26 Thread Thierry Reding
On Sat, Apr 24, 2021 at 11:27:10PM +0300, Dmitry Osipenko wrote:
> 23.04.2021 18:23, Dmitry Osipenko wrote:
> > 23.04.2021 18:01, Guillaume Tucker wrote:
> >> On 02/04/2021 15:40, Dmitry Osipenko wrote:
> >>> 01.04.2021 11:55, Nicolin Chen wrote:
> >>>> On Mon, Mar 29, 2021 at 02:32:56AM +0300, Dmitry Osipenko wrote:
> >>>>> The previous commit fixes problem where display client was attaching too
> >>>>> early to IOMMU during kernel boot in a multi-platform kernel configuration
> >>>>> which enables CONFIG_ARM_DMA_USE_IOMMU=y. The workaround that helped to
> >>>>> defer the IOMMU attachment for Nyan Big Chromebook isn't needed anymore,
> >>>>> revert it.
> >>>>
> >>>> Sorry for the late reply. I have been busy with downstream tasks.
> >>>>
> >>>> I will give them a try by the end of the week. Yet, probably it'd
> >>>> be better to include Guillaume also as he has the Nyan platform.
> >>>>
> >>>
> >>> Indeed, thanks. Although, I'm pretty sure that it's the same issue which
> >>> I reproduced on Nexus 7.
> >>>
> >>> Guillaume, could you please give a test to these patches on Nyan Big?
> >>> There should be no EMEM errors in the kernel log with these patches.
> >>>
> >>> https://patchwork.ozlabs.org/project/linux-tegra/list/?series=236215
> >>
> >> So sorry for the very late reply.  I have tried the patches but
> >> hit some issues on linux-next, it's not reaching a login prompt
> >> with next-20210422.  So I then tried with next-20210419 which
> >> does boot but shows the IOMMU error:
> >>
> >> <6>[2.995341] tegra-dc 5420.dc: Adding to iommu group 1
> >> <4>[3.001070] Failed to attached device 5420.dc to IOMMU_mapping  
> >>
> >>   https://lava.collabora.co.uk/scheduler/job/3570052#L1120
> >>
> >> The branch I'm using with the patches applied can be found here:
> >>
> >>   
> >> https://gitlab.collabora.com/gtucker/linux/-/commits/next-20210419-nyan-big-drm-read/
> >>
> >> Hope this helps, let me know if you need anything else to be
> >> tested.
> > 
> > 
> > Hello Guillaume,
> > 
> > The current linux-next doesn't boot on all ARM (AFAIK), the older
> > next-20210413 works. The above message should be unrelated to the boot
> > problem. It should be okay to ignore that message as it should be
> > harmless in your case.
> > 
> 
> Although, the 20210419 should be good.
> 
> Thierry, do you know what those SOR and Nouveau issues are about?

There's a use-after-free (though it's really a use-before-init) issue in
linux-next at the moment, but a fix has been suggested. The fix for this
along with an additional leak plug is here:

http://patchwork.ozlabs.org/project/linux-tegra/list/?series=240569

I'm not aware of any Nouveau issues. What version and platform are those
happening on? Are there any logs? I can't seem to find them in this
thread.

Thierry


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

[PATCH v2 5/5] iommu/tegra-smmu: Support managed domains

2021-04-23 Thread Thierry Reding
From: Navneet Kumar 

Allow creating identity and DMA API compatible IOMMU domains. When
creating a DMA API compatible domain, make sure to also create the
required cookie.

Signed-off-by: Navneet Kumar 
Signed-off-by: Thierry Reding 
---
 drivers/iommu/tegra-smmu.c | 47 --
 1 file changed, 30 insertions(+), 17 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 6bf7654371c5..40647e1f03ae 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -281,8 +282,11 @@ static bool tegra_smmu_capable(enum iommu_cap cap)
 static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
 {
struct tegra_smmu_as *as;
+   int ret;
 
-   if (type != IOMMU_DOMAIN_UNMANAGED)
+   if (type != IOMMU_DOMAIN_UNMANAGED &&
+   type != IOMMU_DOMAIN_DMA &&
+   type != IOMMU_DOMAIN_IDENTITY)
return NULL;
 
as = kzalloc(sizeof(*as), GFP_KERNEL);
@@ -291,26 +295,23 @@ static struct iommu_domain 
*tegra_smmu_domain_alloc(unsigned type)
 
as->attr = SMMU_PD_READABLE | SMMU_PD_WRITABLE | SMMU_PD_NONSECURE;
 
-   as->pd = alloc_page(GFP_KERNEL | __GFP_DMA | __GFP_ZERO);
-   if (!as->pd) {
-   kfree(as);
-   return NULL;
+   if (type == IOMMU_DOMAIN_DMA) {
+   ret = iommu_get_dma_cookie(&as->domain);
+   if (ret)
+   goto free_as;
}
 
+   as->pd = alloc_page(GFP_KERNEL | __GFP_DMA | __GFP_ZERO);
+   if (!as->pd)
+   goto put_dma_cookie;
+
as->count = kcalloc(SMMU_NUM_PDE, sizeof(u32), GFP_KERNEL);
-   if (!as->count) {
-   __free_page(as->pd);
-   kfree(as);
-   return NULL;
-   }
+   if (!as->count)
+   goto free_pd_range;
 
as->pts = kcalloc(SMMU_NUM_PDE, sizeof(*as->pts), GFP_KERNEL);
-   if (!as->pts) {
-   kfree(as->count);
-   __free_page(as->pd);
-   kfree(as);
-   return NULL;
-   }
+   if (!as->pts)
+   goto free_count;
 
spin_lock_init(&as->lock);
 
@@ -320,6 +321,18 @@ static struct iommu_domain *tegra_smmu_domain_alloc(unsigned type)
as->domain.geometry.force_aperture = true;
 
return &as->domain;
+
+free_count:
+   kfree(as->count);
+free_pd_range:
+   __free_page(as->pd);
+put_dma_cookie:
+   if (type == IOMMU_DOMAIN_DMA)
+   iommu_put_dma_cookie(&as->domain);
+free_as:
+   kfree(as);
+
+   return NULL;
 }
 
 static void tegra_smmu_domain_free(struct iommu_domain *domain)
@@ -1051,7 +1064,7 @@ static const struct iommu_ops tegra_smmu_ops = {
.map = tegra_smmu_map,
.unmap = tegra_smmu_unmap,
.iova_to_phys = tegra_smmu_iova_to_phys,
-   .get_resv_regions = of_iommu_get_resv_regions,
+   .get_resv_regions = iommu_dma_get_resv_regions,
.put_resv_regions = generic_iommu_put_resv_regions,
.apply_resv_region = tegra_smmu_apply_resv_region,
.of_xlate = tegra_smmu_of_xlate,
-- 
2.30.2


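The allocation path in this patch uses the usual kernel goto-unwind idiom: each failure jumps to a label that releases everything acquired so far, in reverse order. A minimal user-space sketch of that idiom follows. All names (struct example_as, example_as_alloc(), the fail_at test hook and the example_live_allocs counter) are hypothetical and only serve to make the unwinding observable; calloc()/free() stand in for the kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object with three dependent allocations. */
struct example_as {
	void *pd;
	unsigned int *count;
	void **pts;
};

/* Number of allocations that have not been freed yet. */
int example_live_allocs;

/* Allocation wrapper that fails on the fail_at-th call (0: never). */
static void *example_calloc(int *step, int fail_at, size_t n, size_t size)
{
	void *p;

	if (++(*step) == fail_at)
		return NULL;

	p = calloc(n, size);
	if (p)
		example_live_allocs++;

	return p;
}

static void example_free(void *p)
{
	if (p) {
		example_live_allocs--;
		free(p);
	}
}

struct example_as *example_as_alloc(int fail_at)
{
	struct example_as *as;
	int step = 0;

	as = example_calloc(&step, fail_at, 1, sizeof(*as));
	if (!as)
		return NULL;

	as->pd = example_calloc(&step, fail_at, 1, 4096);
	if (!as->pd)
		goto free_as;

	as->count = example_calloc(&step, fail_at, 1024, sizeof(*as->count));
	if (!as->count)
		goto free_pd;

	as->pts = example_calloc(&step, fail_at, 1024, sizeof(*as->pts));
	if (!as->pts)
		goto free_count;

	return as;

	/* Labels run in reverse order of acquisition. */
free_count:
	example_free(as->count);
free_pd:
	example_free(as->pd);
free_as:
	example_free(as);

	return NULL;
}

void example_as_free(struct example_as *as)
{
	example_free(as->pts);
	example_free(as->count);
	example_free(as->pd);
	example_free(as);
}
```

Each label is named after the resource it frees, and a failing step jumps to the label for the most recently acquired resource, so nothing allocated earlier is skipped.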

[PATCH v2 4/5] iommu/tegra-smmu: Add support for reserved regions

2021-04-23 Thread Thierry Reding
From: Thierry Reding 

The Tegra DRM driver currently uses the IOMMU API explicitly. This means
that it has fine-grained control over when exactly the translation
through the IOMMU is enabled. This currently happens after the driver
probes, so the driver is in a DMA quiesced state when the IOMMU
translation is enabled.

During the transition of the Tegra DRM driver to use the DMA API instead
of the IOMMU API explicitly, it was observed that on certain platforms
the display controllers were still actively fetching from memory. When a
DMA IOMMU domain is created as part of the DMA/IOMMU API setup during
boot, the IOMMU translation for the display controllers can be enabled a
significant amount of time before the driver has had a chance to reset
the hardware into a sane state. This causes the SMMU to detect faults on
the addresses that the display controller is trying to fetch.

To avoid this, and as a byproduct paving the way for seamless transition
of display from the bootloader to the kernel, add support for reserved
regions in the Tegra SMMU driver. This is implemented using the standard
reserved memory device tree bindings, which let us describe regions of
memory which the kernel is forbidden from using for regular allocations.
The Tegra SMMU driver will parse the nodes associated with each device
via the "memory-region" property and return reserved regions that the
IOMMU core will then create direct mappings for prior to attaching the
IOMMU domains to the devices. This ensures that a 1:1 mapping is in
place when IOMMU translation starts and prevents the SMMU from detecting
any faults.

Signed-off-by: Thierry Reding 
---
 drivers/iommu/tegra-smmu.c | 76 ++
 1 file changed, 76 insertions(+)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 0a281833f611..6bf7654371c5 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -10,6 +10,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -539,6 +540,38 @@ static void tegra_smmu_set_pde(struct tegra_smmu_as *as, 
unsigned long iova,
struct tegra_smmu *smmu = as->smmu;
u32 *pd = page_address(as->pd);
unsigned long offset = pd_index * sizeof(*pd);
+   bool unmap = false;
+
+   /*
+* XXX Move this outside of this function. Perhaps add a struct
+* iommu_domain parameter to ->{get,put}_resv_regions() so that
+* the mapping can be done there.
+*
+* The problem here is that as->smmu is only known once we attach
+* the domain to a device (because then we look up the right SMMU
+* instance via the dev->archdata.iommu pointer). When the direct
+* mappings are created for reserved regions, the domain has not
+* been attached to a device yet, so we don't know. We currently
+* fix that up in ->apply_resv_regions() because that is the first
+* time where we have access to a struct device that will be used
+* with the IOMMU domain. However, that's asymmetric and doesn't
+* take care of the page directory mapping either, so we need to
+* come up with something better.
+*/
+   if (as->pd_dma == 0) {
+   as->pd_dma = dma_map_page(smmu->dev, as->pd, 0, SMMU_SIZE_PD,
+ DMA_TO_DEVICE);
+   if (dma_mapping_error(smmu->dev, as->pd_dma))
+   return;
+
+   if (!smmu_dma_addr_valid(smmu, as->pd_dma)) {
+   dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD,
+  DMA_TO_DEVICE);
+   return;
+   }
+
+   unmap = true;
+   }
 
/* Set the page directory entry first */
pd[pd_index] = value;
@@ -551,6 +584,12 @@ static void tegra_smmu_set_pde(struct tegra_smmu_as *as, 
unsigned long iova,
smmu_flush_ptc(smmu, as->pd_dma, offset);
smmu_flush_tlb_section(smmu, as->id, iova);
smmu_flush(smmu);
+
+   if (unmap) {
+   dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD,
+  DMA_TO_DEVICE);
+   as->pd_dma = 0;
+   }
 }
 
 static u32 *tegra_smmu_pte_offset(struct page *pt_page, unsigned long iova)
@@ -945,6 +984,40 @@ static struct iommu_group *tegra_smmu_device_group(struct 
device *dev)
return group->group;
 }
 
+static void tegra_smmu_apply_resv_region(struct device *dev,
+struct iommu_domain *domain,
+struct iommu_resv_region *region)
+{
+   struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
+   struct tegra_smmu_as *as = to_smmu_as(domain);
+
+   /*
+* ->attach_dev() may not have been called yet at this point, so the
+* address space may not have b

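The commit message above argues that a 1:1 mapping has to exist before translation is switched on, otherwise a device that is still fetching from memory will fault. A toy model can illustrate the ordering. It is a sketch under simplifying assumptions: struct toy_iommu, toy_map_direct() and toy_translate() are hypothetical, the "page table" is a flat array of 16 pages of 4 KiB, and nothing here reflects the real Tegra SMMU page table layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_NUM_PAGES  16

/* Hypothetical single-level IOMMU: one entry per 4 KiB IOVA page. */
struct toy_iommu {
	bool enabled;
	bool valid[TOY_NUM_PAGES];
	uint64_t phys[TOY_NUM_PAGES];
};

/* Create a 1:1 mapping for [start, start + size). Returns 0 or -1. */
int toy_map_direct(struct toy_iommu *mmu, uint64_t start, uint64_t size)
{
	uint64_t first, last, i;

	if (size == 0)
		return -1;

	first = start >> TOY_PAGE_SHIFT;
	last = (start + size - 1) >> TOY_PAGE_SHIFT;
	if (last >= TOY_NUM_PAGES)
		return -1;

	for (i = first; i <= last; i++) {
		mmu->valid[i] = true;
		mmu->phys[i] = i << TOY_PAGE_SHIFT;
	}

	return 0;
}

/* Translate a device access; returns false on a translation fault. */
bool toy_translate(const struct toy_iommu *mmu, uint64_t iova, uint64_t *pa)
{
	uint64_t page = iova >> TOY_PAGE_SHIFT;

	/* With translation disabled, accesses pass through unchanged. */
	if (!mmu->enabled) {
		*pa = iova;
		return true;
	}

	if (page >= TOY_NUM_PAGES || !mmu->valid[page])
		return false;

	*pa = mmu->phys[page] | (iova & ((1u << TOY_PAGE_SHIFT) - 1));
	return true;
}
```

Enabling the toy IOMMU without calling toy_map_direct() first makes every access fault, while mapping the reserved range beforehand lets the same accesses resolve to their original physical addresses.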
[PATCH v2 3/5] iommu: dma: Use of_iommu_get_resv_regions()

2021-04-23 Thread Thierry Reding
From: Thierry Reding 

For device tree nodes, use the standard of_iommu_get_resv_regions()
implementation to obtain the reserved memory regions associated with a
device.

Cc: Rob Herring 
Cc: Frank Rowand 
Cc: devicet...@vger.kernel.org
Signed-off-by: Thierry Reding 
---
 drivers/iommu/dma-iommu.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c
index 7bcdd1205535..52b424176241 100644
--- a/drivers/iommu/dma-iommu.c
+++ b/drivers/iommu/dma-iommu.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -190,6 +191,8 @@ void iommu_dma_get_resv_regions(struct device *dev, struct 
list_head *list)
if (!is_of_node(dev_iommu_fwspec_get(dev)->iommu_fwnode))
iort_iommu_msi_get_resv_regions(dev, list);
 
+   if (dev->of_node)
+   of_iommu_get_resv_regions(dev, list);
 }
 EXPORT_SYMBOL(iommu_dma_get_resv_regions);
 
-- 
2.30.2



[PATCH v2 2/5] iommu: Implement of_iommu_get_resv_regions()

2021-04-23 Thread Thierry Reding
From: Thierry Reding 

This is an implementation that IOMMU drivers can use to obtain reserved
memory regions from a device tree node. It uses the reserved-memory DT
bindings to find the regions associated with a given device. If these
regions are marked accordingly, identity mappings will be created for
them in the IOMMU domain that the devices will be attached to.

Cc: Frank Rowand 
Cc: devicet...@vger.kernel.org
Reviewed-by: Rob Herring 
Signed-off-by: Thierry Reding 
---
Changes in v3:
- change "active" property to identity mapping flag that is part of the
  memory region specifier (as defined by #memory-region-cells) to allow
  per-reference flags to be used

Changes in v2:
- use "active" property to determine whether direct mappings are needed
---
 drivers/iommu/of_iommu.c | 54 
 include/linux/of_iommu.h |  8 ++
 2 files changed, 62 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index a9d2df001149..321ebd5fdaba 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -11,12 +11,15 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
 #include 
 #include 
 
+#include 
+
 #define NO_IOMMU   1
 
 /**
@@ -240,3 +243,54 @@ const struct iommu_ops *of_iommu_configure(struct device 
*dev,
 
return ops;
 }
+
+/**
+ * of_iommu_get_resv_regions - reserved region driver helper for device tree
+ * @dev: device for which to get reserved regions
+ * @list: reserved region list
+ *
+ * IOMMU drivers can use this to implement their .get_resv_regions() callback
+ * for memory regions attached to a device tree node. See the reserved-memory
+ * device tree bindings on how to use these:
+ *
+ *   Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+ */
+void of_iommu_get_resv_regions(struct device *dev, struct list_head *list)
+{
+   struct of_phandle_iterator it;
+   int err;
+
+	of_for_each_phandle(&it, err, dev->of_node, "memory-region", "#memory-region-cells", 0) {
+		struct iommu_resv_region *region;
+		struct of_phandle_args args;
+		struct resource res;
+
+		args.args_count = of_phandle_iterator_args(&it, args.args, MAX_PHANDLE_ARGS);
+
+		err = of_address_to_resource(it.node, 0, &res);
+		if (err < 0) {
+			dev_err(dev, "failed to parse memory region %pOF: %d\n",
+				it.node, err);
+			continue;
+		}
+
+		if (args.args_count > 0) {
+			/*
+			 * Active memory regions are expected to be accessed by hardware during
+			 * boot and must therefore have an identity mapping created prior to the
+			 * driver taking control of the hardware. This ensures that non-quiescent
+			 * hardware doesn't cause IOMMU faults during boot.
+			 */
+			if (args.args[0] & MEMORY_REGION_IDENTITY_MAPPING) {
+				region = iommu_alloc_resv_region(res.start, resource_size(&res),
+								 IOMMU_READ | IOMMU_WRITE,
+								 IOMMU_RESV_DIRECT_RELAXABLE);
+				if (!region)
+					continue;
+
+				list_add_tail(&region->list, list);
+			}
+		}
+	}
+   }
+}
+EXPORT_SYMBOL(of_iommu_get_resv_regions);
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index 16f4b3e87f20..8412437acaac 100644
--- a/include/linux/of_iommu.h
+++ b/include/linux/of_iommu.h
@@ -16,6 +16,9 @@ extern const struct iommu_ops *of_iommu_configure(struct 
device *dev,
struct device_node *master_np,
const u32 *id);
 
+extern void of_iommu_get_resv_regions(struct device *dev,
+ struct list_head *list);
+
 #else
 
 static inline int of_get_dma_window(struct device_node *dn, const char *prefix,
@@ -32,6 +35,11 @@ static inline const struct iommu_ops 
*of_iommu_configure(struct device *dev,
return NULL;
 }
 
+static inline void of_iommu_get_resv_regions(struct device *dev,
+struct list_head *list)
+{
+}
+
 #endif /* CONFIG_OF_IOMMU */
 
 #endif /* __OF_IOMMU_H */
-- 
2.30.2


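The filtering that of_iommu_get_resv_regions() performs can be approximated outside the kernel: walk every "memory-region" reference of a device and keep only those whose first specifier cell has the identity mapping flag set. The following is a sketch, not the kernel implementation. struct example_ref, struct example_resv and example_get_resv_regions() are hypothetical, and a plain array replaces the phandle iterator and the reserved region list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MEMORY_REGION_IDENTITY_MAPPING 0x1 /* value from the binding header */

/* Hypothetical flattened form of one "memory-region" reference. */
struct example_ref {
	uint64_t start;
	uint64_t size;
	unsigned int args_count; /* number of specifier cells */
	uint32_t args[1];
};

/* Hypothetical stand-in for struct iommu_resv_region. */
struct example_resv {
	uint64_t start;
	uint64_t length;
};

/* Copies the references that need a direct mapping; returns their count. */
size_t example_get_resv_regions(const struct example_ref *refs, size_t nrefs,
				struct example_resv *out, size_t max)
{
	size_t i, n = 0;

	for (i = 0; i < nrefs && n < max; i++) {
		/* No specifier: nothing besides the phandle is conveyed. */
		if (refs[i].args_count == 0)
			continue;

		if (!(refs[i].args[0] & MEMORY_REGION_IDENTITY_MAPPING))
			continue;

		out[n].start = refs[i].start;
		out[n].length = refs[i].size;
		n++;
	}

	return n;
}
```

Because the flag lives in the reference rather than in the region node, two devices can point at the same region and only one of them ends up with a direct mapping.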

[PATCH v2 1/5] dt-bindings: reserved-memory: Document memory region specifier

2021-04-23 Thread Thierry Reding
From: Thierry Reding 

Reserved memory region phandle references can be accompanied by a
specifier that provides additional information about how that specific
reference should be treated.

One use-case is to mark a memory region as needing an identity mapping
in the system's IOMMU for the device that references the region. This is
needed for example when the bootloader has set up hardware (such as a
display controller) to actively access a memory region (e.g. a boot
splash screen framebuffer) during boot. The operating system can use the
identity mapping flag from the specifier to make sure an IOMMU identity
mapping is set up for the framebuffer before IOMMU translations are
enabled for the display controller.

Signed-off-by: Thierry Reding 
---
 .../reserved-memory/reserved-memory.txt   | 21 +++
 include/dt-bindings/reserved-memory.h |  8 +++
 2 files changed, 29 insertions(+)
 create mode 100644 include/dt-bindings/reserved-memory.h

diff --git 
a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt 
b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
index e8d3096d922c..e9c2f80b441f 100644
--- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
@@ -52,6 +52,11 @@ compatible (optional) - standard definition
   be used by an operating system to instantiate the necessary pool
   management subsystem if necessary.
     - vendor specific string in the form <vendor>,[<device>-]<usage>
+#memory-region-cells (optional) -
+- Defines how many cells are used to form the memory region specifier.
+  The memory region specifier contains additional information on how a
+  reserved memory region referenced by the corresponding phandle will
+  be used in a specific context.
 no-map (optional) - empty property
 - Indicates the operating system must not create a virtual mapping
   of the region as part of its standard mapping of system memory,
@@ -83,6 +88,22 @@ memory-region (optional) - phandle, specifier pairs to 
children of /reserved-mem
 memory-region-names (optional) - a list of names, one for each corresponding
   entry in the memory-region property
 
+Reserved memory region references can be accompanied by a memory region
+specifier, which provides additional information about how the memory region
+will be used in that specific context. If a reserved memory region does not
+have the #memory-region-cells property, 0 is implied and no information
+besides the phandle is conveyed. For reserved memory regions that contain
+#memory-region-cells = <1>, the following encoding applies if not otherwise
+overridden by the bindings selected by the region's compatible string:
+
+  - bit 0: If set, requests that the region be identity mapped if the system
+uses an IOMMU for I/O virtual address translations. This is used, for
+example, when a bootloader has configured a display controller to display
+a boot splash. Once the OS takes over and enables the IOMMU for the given
+display controller, the IOMMU may fault if the framebuffer hasn't been
+mapped to the IOMMU at the address that the display controller tries to
+access.
+
 Example
 ---
 This example defines 3 contiguous regions are defined for Linux kernel:
diff --git a/include/dt-bindings/reserved-memory.h 
b/include/dt-bindings/reserved-memory.h
new file mode 100644
index 000000000000..174ca3448342
--- /dev/null
+++ b/include/dt-bindings/reserved-memory.h
@@ -0,0 +1,8 @@
+/* SPDX-License-Identifier: (GPL-2.0+ or MIT) */
+
+#ifndef _DT_BINDINGS_RESERVED_MEMORY_H
+#define _DT_BINDINGS_RESERVED_MEMORY_H
+
+#define MEMORY_REGION_IDENTITY_MAPPING 0x1
+
+#endif
-- 
2.30.2



[PATCH v2 0/5] iommu: Support identity mappings of reserved-memory regions

2021-04-23 Thread Thierry Reding
From: Thierry Reding 

Hi,

this is an updated proposal to solve the problem of passing memory
regions that are actively being accessed during boot. The particular
use-case that I need this for is when the bootloader has set up the
display controller to scan out a boot splash screen. During boot the
DMA/IOMMU glue code will attach devices to an IOMMU domain and by
doing so enable IOMMU translations. Typically this will be before a
device driver has had a chance to either disable the display
controller or set up a new framebuffer and map it to the IOMMU.

In that case, the IOMMU will start to fault because the accesses of
the display controller will be for memory addresses that are not mapped
in the IOMMU. The solution is obviously to create identity mappings for
such memory regions. From a device tree point of view, these memory
regions can be described using the reserved-memory device tree bindings
and hooked up to the consumer devices using the "memory-region"
property. On the kernel side, the IOMMU framework already supports the
concept of reserved regions, as well as a way of marking these regions
as requiring identity (a.k.a. direct) mappings.

Unfortunately, the current reserved-memory region bindings only allow
properties of the regions themselves to be described (such as whether a
kernel virtual mapping of the region is needed or not), but it doesn't
provide a way of associating extra information with any particular
reference to these regions. However, that's exactly what's needed for
this case because a given region may need to be identity mapped for a
specific device (such as the display controller scanning out from the
region) but referenced by multiple devices (e.g. if the memory is some
special carveout memory reserved for display purposes).

This series of patches proposes a simple solution: extend memory-region
properties to use an optional specifier, such as the ones already
commonly used for things like GPIOs or interrupts. The specifier needs
to be provided if the reserved-memory region has a non-zero
#memory-region-cells property (if the property is not present, zero is
the assumed default value). The specifier contains flags that specify
how the reference is to be treated. This series of patches introduces
the MEMORY_REGION_IDENTITY_MAPPING flag (value: 0x1) that marks the
specific reference to the memory region to require an identity mapping.

In practice, a device tree would look like this:

reserved-memory {
#address-cells = <2>;
#size-cells = <2>;

fb: framebuffer@92cb2000 {
reg = <0 0x92cb2000 0 0x00800000>;
#memory-region-cells = <1>;
};
};

...

display@5240 {
...
memory-region = <&fb MEMORY_REGION_IDENTITY_MAPPING>;
...
};

Note: While the above would be valid DTS content, it's more likely that
in practice this content would be dynamically generated by the
bootloader using runtime information (such as the framebuffer memory
location).

An operating system can derive from that <phandle, specifier> pair that
the 8 MiB of memory at physical address 0x92cb2000 need to be identity
mapped to the same IO virtual address if the device is attached to an
IOMMU. If no IOMMU is enabled in the system, obviously no identity
mapping needs to be created, but the operating system may still use the
reference to transition to its own framebuffer using the existing memory
region.
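
As a rough illustration of that derivation, the decision can be written down in a few lines of C. This is a sketch under assumptions: example_identity_mapping() and struct example_mapping are hypothetical names, the flag value is the 0x1 given above, and the region size is taken to be the 8 MiB stated in the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MEMORY_REGION_IDENTITY_MAPPING 0x1

/* Hypothetical description of the mapping the OS would create. */
struct example_mapping {
	uint64_t iova;
	uint64_t phys;
	uint64_t size;
};

/*
 * Fills *map and returns true if the reference asks for an identity
 * mapping and the device sits behind an IOMMU; returns false otherwise.
 */
bool example_identity_mapping(uint64_t phys, uint64_t size,
			      uint32_t specifier, bool has_iommu,
			      struct example_mapping *map)
{
	if (!has_iommu || !(specifier & MEMORY_REGION_IDENTITY_MAPPING))
		return false;

	/* Identity mapping: the IO virtual address equals the physical one. */
	map->iova = phys;
	map->phys = phys;
	map->size = size;

	return true;
}
```

For the framebuffer in the example, this yields an IOVA range that starts at 0x92cb2000 and spans 8 MiB. Without an IOMMU the function reports that no mapping is needed, and the caller can still use the region to take over the framebuffer.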

Note that an earlier proposal was to use the existing simple-framebuffer
device tree bindings to transport this information. Unfortunately there
are cases where this is not enough. On Tegra SoCs, for example, the
bootloader will also set up a color space correction lookup table in the
system memory that the display controller will access during boot,
alongside the framebuffer. The simple-framebuffer DT bindings have no
way of describing this (and I guess one could argue that this particular
setup no longer is a "simple" framebuffer), so the above, more flexible
proposal was implemented.

I've made corresponding changes in the proprietary bootloader, added a
compatibility shim in U-Boot (which forwards information created by the
proprietary bootloader to the kernel) and the attached patches to test
this on Jetson TX1, Jetson TX2 and Jetson AGX Xavier.

Note that there will be no new releases of the bootloader for earlier
devices, so adding support for these new DT bindings will not be
practical. The bootloaders on those devices do pass information about
the active framebuffer via the kernel command-line, so we may want to
add code to create reserved regions in the IOMMU based on that.

Thierry

Navneet Kumar (1):
  iommu/tegra-smmu: Support managed domains

Thierry Reding (4):
  dt-bindings: reserved-memory: Document memory region specifier
  iommu: Implement of_iommu_get_resv_regions()
  i

[PATCH v2 09/10] arm64: tegra: Enable SMMU support on Tegra194

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

Add the device tree node for the dual-SMMU found on Tegra194 and hook up
peripherals such as host1x, BPMP, HDA, SDMMC, EQOS and VIC.

Signed-off-by: Thierry Reding 
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 86 
 1 file changed, 86 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index 4d37ee0ea4d1..6ed296e27158 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -62,6 +62,7 @@ ethernet@249 {
interconnects = < TEGRA194_MEMORY_CLIENT_EQOSR >,
< TEGRA194_MEMORY_CLIENT_EQOSW >;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_EQOS>;
status = "disabled";
 
snps,write-requests = <1>;
@@ -733,6 +734,7 @@ sdmmc1: mmc@340 {
interconnects = < TEGRA194_MEMORY_CLIENT_SDMMCRA 
>,
< TEGRA194_MEMORY_CLIENT_SDMMCWA 
>;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_SDMMC1>;
nvidia,pad-autocal-pull-up-offset-3v3-timeout =
<0x07>;
nvidia,pad-autocal-pull-down-offset-3v3-timeout =
@@ -759,6 +761,7 @@ sdmmc3: mmc@344 {
interconnects = < TEGRA194_MEMORY_CLIENT_SDMMCR 
>,
< TEGRA194_MEMORY_CLIENT_SDMMCW 
>;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_SDMMC3>;
nvidia,pad-autocal-pull-up-offset-1v8 = <0x00>;
nvidia,pad-autocal-pull-down-offset-1v8 = <0x7a>;
nvidia,pad-autocal-pull-up-offset-3v3-timeout = <0x07>;
@@ -790,6 +793,7 @@ sdmmc4: mmc@346 {
interconnects = < TEGRA194_MEMORY_CLIENT_SDMMCRAB 
>,
< TEGRA194_MEMORY_CLIENT_SDMMCWAB 
>;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_SDMMC4>;
nvidia,pad-autocal-pull-up-offset-hs400 = <0x00>;
nvidia,pad-autocal-pull-down-offset-hs400 = <0x00>;
nvidia,pad-autocal-pull-up-offset-1v8-timeout = <0x0a>;
@@ -821,6 +825,7 @@ hda@351 {
interconnects = < TEGRA194_MEMORY_CLIENT_HDAR >,
< TEGRA194_MEMORY_CLIENT_HDAW >;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_HDA>;
status = "disabled";
};
 
@@ -1300,6 +1305,84 @@ pmc: pmc@c36 {
interrupt-controller;
};
 
+

[PATCH v2 10/10] arm64: tegra: Enable SMMU support for display on Tegra194

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

The display controllers are attached to a separate ARM SMMU instance
that is dedicated to servicing isochronous memory clients. Add this ISO
instance of the ARM SMMU to device tree and attach all four display
controllers to it.

Signed-off-by: Thierry Reding 
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 80 
 1 file changed, 80 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index 6ed296e27158..00f8248f216e 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -1305,6 +1305,82 @@ pmc: pmc@c36 {
interrupt-controller;
};
 
+   smmu_iso: iommu@1000 {
+   compatible = "nvidia,tegra194-smmu", "nvidia,smmu-500";
+   reg = <0x1000 0x80>;
+   interrupts = ,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+,
+;
+   stream-match-mask = <0x7f80>;
+   #global-interrupts = <1>;
+   #iommu-cells = <1>;
+
+   nvidia,memory-controller = <>;
+   status = "okay";
+   };
+
smmu: iommu@1200 {
compatible = "nvidia,tegra194-smmu", "nvidia,smmu-500";
reg = <0x1200 0x80>,
@@ -1441,6 +1517,7 @@ display@1520 {
interconnects = < 
TEGRA194_MEMORY_CLIENT_NVDISPLAYR >,
< 
TEGRA194_MEMORY_CLIENT_NVDISPLAYR1 >;
interconnect-names = "dma-mem", 
"read-1";
+   iommus = <&smmu_iso TEGRA194_SID_NVDISPLAY>;
 
nvidia,outputs = <   
>;
nvidia,head = <0>;
@@ -1459,6 +1536,7 @@ display@1521 {
interconnects = < 
TEGRA194_MEMORY_CLIENT_NVDISPLAYR >,
< 
TEGRA194_MEMORY_CLIENT_NVDISPLAYR1 >;
interconnect-names = "dma-mem", 
"read-1";
+   iommus = <&smmu_iso 
TEGRA194_SID_NVDIS

[PATCH v2 08/10] arm64: tegra: Hook up memory controller to SMMU on Tegra186

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

On Tegra186 and later, the memory controller needs to be programmed in
coordination with any of the ARM SMMU instances to configure the stream
ID used for each memory client.

To support this, add a phandle reference to the memory controller to the
SMMU device tree node.

Signed-off-by: Thierry Reding 
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index a173f40256ae..d02f6bf3e2ca 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1152,6 +1152,8 @@ smmu: iommu@1200 {
stream-match-mask = <0x7f80>;
#global-interrupts = <1>;
#iommu-cells = <1>;
+
+   nvidia,memory-controller = <>;
};
 
host1x@13e0 {
-- 
2.30.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 07/10] arm64: tegra: Use correct compatible string for Tegra186 SMMU

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

The SMMU found on Tegra186 requires interoperation with the memory
controller in order to program stream ID overrides. The generic ARM SMMU
500 compatible is therefore inaccurate. Replace it with a more correct,
SoC-specific compatible string.

Signed-off-by: Thierry Reding 
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi 
b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index 9f75bbf00cf7..a173f40256ae 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1082,7 +1082,7 @@ pci@3,0 {
};
 
smmu: iommu@1200 {
-   compatible = "arm,mmu-500";
+   compatible = "nvidia,tegra186-smmu", "nvidia,smmu-500";
reg = <0 0x1200 0 0x80>;
interrupts = ,
 ,
-- 
2.30.2



[PATCH v2 06/10] iommu/arm-smmu: Use Tegra implementation on Tegra186

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

Tegra186 requires the same SID override programming as Tegra194 in order
to seamlessly transition from the firmware framebuffer to the Linux
framebuffer, so the Tegra implementation needs to be used on Tegra186
devices as well.
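
As a rough user-space illustration of the selection described above,
the sketch below matches a node's compatible strings against the two
Tegra strings named in this patch. The helper name uses_tegra_impl()
and the table are hypothetical, and a plain strcmp() stands in for the
kernel's own compatible matching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Compatible strings that select the Tegra implementation in this patch. */
static const char *const tegra_compatibles[] = {
	"nvidia,tegra194-smmu",
	"nvidia,tegra186-smmu",
};

/*
 * Hypothetical helper: returns true if any of the node's compatible
 * strings matches one of the Tegra entries above.
 */
static bool uses_tegra_impl(const char *const *compat, size_t num)
{
	const size_t num_tegra = sizeof(tegra_compatibles) / sizeof(tegra_compatibles[0]);
	size_t i, j;

	assert(compat != NULL || num == 0);

	for (i = 0; i < num; i++) {
		for (j = 0; j < num_tegra; j++) {
			if (strcmp(compat[i], tegra_compatibles[j]) == 0)
				return true;
		}
	}

	return false;
}
```

With this model, a node listing "nvidia,tegra186-smmu" selects the
Tegra implementation, while a node that only lists "arm,mmu-500" does
not.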

Signed-off-by: Thierry Reding 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index 136872e77195..9f465e146799 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -211,7 +211,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
if (of_property_read_bool(np, "calxeda,smmu-secure-config-access"))
smmu->impl = &calxeda_impl;
 
-   if (of_device_is_compatible(np, "nvidia,tegra194-smmu"))
+   if (of_device_is_compatible(np, "nvidia,tegra194-smmu") ||
+   of_device_is_compatible(np, "nvidia,tegra186-smmu"))
return nvidia_smmu_impl_init(smmu);
 
smmu = qcom_smmu_impl_init(smmu);
-- 
2.30.2



[PATCH v2 05/10] iommu/arm-smmu: tegra: Implement SID override programming

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

The secure firmware keeps some SID override registers set as passthrough
in order to allow devices such as the display controller to operate with
no knowledge of SMMU translations until an operating system driver takes
over. This is needed in order to seamlessly transition from the firmware
framebuffer to the OS framebuffer.

Upon successfully attaching a device to the SMMU and in the process
creating identity mappings for memory regions that are being accessed,
the Tegra implementation will call into the memory controller driver to
program the override SIDs appropriately.
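
The hand-off described above can be modelled in user space roughly as
follows. This is only a sketch: struct mc_model, struct
nvidia_smmu_model and the soc_probe_*() helpers are made-up stand-ins,
not the kernel structures. The point it tries to show is that a
failure in the memory controller step is reported but does not undo
the attachment, since the finalize hook returns void.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for the memory controller with an optional per-SoC hook. */
struct mc_model {
	int (*probe_device)(struct mc_model *mc, const char *dev_name);
	unsigned int programmed; /* number of devices programmed so far */
};

/* Dispatch to the per-SoC hook if there is one; otherwise succeed. */
static int mc_probe_device(struct mc_model *mc, const char *dev_name)
{
	assert(mc != NULL);

	if (mc->probe_device)
		return mc->probe_device(mc, dev_name);

	return 0;
}

/* Stand-in for the SMMU implementation holding a memory controller reference. */
struct nvidia_smmu_model {
	struct mc_model *mc;
	unsigned int failures; /* counts reported errors, for illustration */
};

/* Called once a device has been attached: ask the MC to program its SIDs. */
static void nvidia_finalize(struct nvidia_smmu_model *nvidia, const char *dev_name)
{
	int err = mc_probe_device(nvidia->mc, dev_name);

	if (err < 0) {
		fprintf(stderr, "memory controller probe failed for %s: %d\n",
			dev_name, err);
		nvidia->failures++;
	}
}

/* Example per-SoC hooks: one that succeeds and one that fails. */
static int soc_probe_ok(struct mc_model *mc, const char *dev_name)
{
	(void)dev_name;
	mc->programmed++;
	return 0;
}

static int soc_probe_fail(struct mc_model *mc, const char *dev_name)
{
	(void)mc;
	(void)dev_name;
	return -22; /* arbitrary negative error code for the example */
}
```

A memory controller without a hook behaves like a successful no-op in
this model, which matches the "return 0" fallback in
tegra_mc_probe_device().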

Signed-off-by: Thierry Reding 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 32 ++--
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
index 0e547b29143d..01e9b50b10a1 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
@@ -7,6 +7,8 @@
 #include 
 #include 
 
+#include 
+
 #include "arm-smmu.h"
 
 /*
@@ -15,10 +17,17 @@
  * interleaved IOVA accesses across them and translates accesses from
  * non-isochronous HW devices.
  * Third one is used for translating accesses from isochronous HW devices.
+ *
+ * In addition, the SMMU driver needs to coordinate with the memory controller
+ * driver to ensure that the right SID override is programmed for any given
+ * memory client. This is necessary to allow for use-case such as seamlessly
+ * handing over the display controller configuration from the firmware to the
+ * kernel.
+ *
  * This implementation supports programming of the two instances that must
- * be programmed identically.
- * The third instance usage is through standard arm-smmu driver itself and
- * is out of scope of this implementation.
+ * be programmed identically and takes care of invoking the memory controller
+ * driver for SID override programming after devices have been attached to an
+ * SMMU instance.
  */
 #define MAX_SMMU_INSTANCES 2
 
@@ -26,6 +35,7 @@ struct nvidia_smmu {
struct arm_smmu_device smmu;
void __iomem *bases[MAX_SMMU_INSTANCES];
unsigned int num_instances;
+   struct tegra_mc *mc;
 };
 
 static inline struct nvidia_smmu *to_nvidia_smmu(struct arm_smmu_device *smmu)
@@ -237,6 +247,17 @@ static irqreturn_t nvidia_smmu_context_fault(int irq, void 
*dev)
return ret;
 }
 
+static void nvidia_smmu_probe_finalize(struct arm_smmu_device *smmu, struct 
device *dev)
+{
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
+   int err;
+
+   err = tegra_mc_probe_device(nvidia->mc, dev);
+   if (err < 0)
+   dev_err(smmu->dev, "memory controller probe failed for %s: 
%d\n",
+   dev_name(dev), err);
+}
+
 static const struct arm_smmu_impl nvidia_smmu_impl = {
.read_reg = nvidia_smmu_read_reg,
.write_reg = nvidia_smmu_write_reg,
@@ -246,6 +267,7 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
.tlb_sync = nvidia_smmu_tlb_sync,
.global_fault = nvidia_smmu_global_fault,
.context_fault = nvidia_smmu_context_fault,
+   .probe_finalize = nvidia_smmu_probe_finalize,
 };
 
 static const struct arm_smmu_impl nvidia_smmu_single_impl = {
@@ -264,6 +286,10 @@ struct arm_smmu_device *nvidia_smmu_impl_init(struct 
arm_smmu_device *smmu)
if (!nvidia_smmu)
return ERR_PTR(-ENOMEM);
 
+   nvidia_smmu->mc = devm_tegra_memory_controller_get(dev);
+   if (IS_ERR(nvidia_smmu->mc))
+   return ERR_CAST(nvidia_smmu->mc);
+
/* Instance 0 is ioremapped by arm-smmu.c. */
nvidia_smmu->bases[0] = smmu->base;
nvidia_smmu->num_instances++;
-- 
2.30.2



[PATCH v2 04/10] iommu/arm-smmu: tegra: Detect number of instances at runtime

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

Parse the reg property in device tree and detect the number of instances
represented by a device tree node. This is subsequently needed in order
to support single-instance SMMUs with the Tegra implementation because
additional programming is needed to properly configure the SID override
registers in the memory controller.
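
A simplified user-space model of the idea: count one instance per
"reg" entry, capped at the maximum, and then mirror register writes to
however many instances were found. The exact detection logic of the
patch is not reproduced here; struct reg_entry, count_instances() and
write_reg_all() are hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTANCES 2 /* mirrors MAX_SMMU_INSTANCES from the patch */

/* One "reg" entry; a stand-in for a parsed device tree resource. */
struct reg_entry {
	uint64_t base;
	uint64_t size;
};

struct smmu_instances {
	uint32_t *bases[MAX_INSTANCES]; /* stand-ins for mapped register windows */
	unsigned int num_instances;
};

/* Count usable instances: one per non-empty reg entry, capped at the maximum. */
static unsigned int count_instances(const struct reg_entry *regs, size_t num_regs)
{
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < num_regs && count < MAX_INSTANCES; i++) {
		if (regs[i].size != 0)
			count++;
	}

	return count;
}

/* Mirror a register write to every detected instance (index is in words). */
static void write_reg_all(struct smmu_instances *s, size_t index, uint32_t val)
{
	unsigned int i;

	assert(s->num_instances <= MAX_INSTANCES);

	for (i = 0; i < s->num_instances; i++)
		s->bases[i][index] = val;
}
```

With a single reg entry only the first window is written, which is the
single-instance case this patch prepares for; with two entries both
windows receive identical values.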

Signed-off-by: Thierry Reding 
---
Changes in v2:
- provide a separate implementation to simplify single instances
---
 drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 58 ++--
 1 file changed, 42 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
index 29117444e5a0..0e547b29143d 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
@@ -20,13 +20,19 @@
  * The third instance usage is through standard arm-smmu driver itself and
  * is out of scope of this implementation.
  */
-#define NUM_SMMU_INSTANCES 2
+#define MAX_SMMU_INSTANCES 2
 
 struct nvidia_smmu {
-   struct arm_smmu_device  smmu;
-   void __iomem*bases[NUM_SMMU_INSTANCES];
+   struct arm_smmu_device smmu;
+   void __iomem *bases[MAX_SMMU_INSTANCES];
+   unsigned int num_instances;
 };
 
+static inline struct nvidia_smmu *to_nvidia_smmu(struct arm_smmu_device *smmu)
+{
+   return container_of(smmu, struct nvidia_smmu, smmu);
+}
+
 static inline void __iomem *nvidia_smmu_page(struct arm_smmu_device *smmu,
 unsigned int inst, int page)
 {
@@ -47,9 +53,10 @@ static u32 nvidia_smmu_read_reg(struct arm_smmu_device *smmu,
 static void nvidia_smmu_write_reg(struct arm_smmu_device *smmu,
  int page, int offset, u32 val)
 {
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
unsigned int i;
 
-   for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+   for (i = 0; i < nvidia->num_instances; i++) {
void __iomem *reg = nvidia_smmu_page(smmu, i, page) + offset;
 
writel_relaxed(val, reg);
@@ -67,9 +74,10 @@ static u64 nvidia_smmu_read_reg64(struct arm_smmu_device 
*smmu,
 static void nvidia_smmu_write_reg64(struct arm_smmu_device *smmu,
int page, int offset, u64 val)
 {
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
unsigned int i;
 
-   for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+   for (i = 0; i < nvidia->num_instances; i++) {
void __iomem *reg = nvidia_smmu_page(smmu, i, page) + offset;
 
writeq_relaxed(val, reg);
@@ -79,6 +87,7 @@ static void nvidia_smmu_write_reg64(struct arm_smmu_device 
*smmu,
 static void nvidia_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
 int sync, int status)
 {
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
unsigned int delay;
 
arm_smmu_writel(smmu, page, sync, 0);
@@ -90,7 +99,7 @@ static void nvidia_smmu_tlb_sync(struct arm_smmu_device 
*smmu, int page,
u32 val = 0;
unsigned int i;
 
-   for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+   for (i = 0; i < nvidia->num_instances; i++) {
void __iomem *reg;
 
reg = nvidia_smmu_page(smmu, i, page) + status;
@@ -112,9 +121,10 @@ static void nvidia_smmu_tlb_sync(struct arm_smmu_device 
*smmu, int page,
 
 static int nvidia_smmu_reset(struct arm_smmu_device *smmu)
 {
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
unsigned int i;
 
-   for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+   for (i = 0; i < nvidia->num_instances; i++) {
u32 val;
void __iomem *reg = nvidia_smmu_page(smmu, i, ARM_SMMU_GR0) +
ARM_SMMU_GR0_sGFSR;
@@ -157,8 +167,9 @@ static irqreturn_t nvidia_smmu_global_fault(int irq, void 
*dev)
unsigned int inst;
irqreturn_t ret = IRQ_NONE;
struct arm_smmu_device *smmu = dev;
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
 
-   for (inst = 0; inst < NUM_SMMU_INSTANCES; inst++) {
+   for (inst = 0; inst < nvidia->num_instances; inst++) {
irqreturn_t irq_ret;
 
irq_ret = nvidia_smmu_global_fault_inst(irq, smmu, inst);
@@ -202,11 +213,13 @@ static irqreturn_t nvidia_smmu_context_fault(int irq, 
void *dev)
struct arm_smmu_device *smmu;
struct iommu_domain *domain = dev;
struct arm_smmu_domain *smmu_domain;
+   struct nvidia_smmu *nvidia;
 
smmu_domain = container_of(domain, struct arm_smmu_domain, domain);
smmu = smmu_domain->smmu;
+   nvidia = to_nvidia_smmu(smmu);
 
-   for (inst = 0; inst < NUM_SMMU_INSTANCES; inst++) {
+ 

[PATCH v2 03/10] iommu/arm-smmu: Implement ->probe_finalize()

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

Implement a ->probe_finalize() callback that can be used by vendor
implementations to perform extra programming necessary after devices
have been attached to the SMMU.
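
The optional-callback pattern described above can be sketched in plain
user-space C as shown below. The structures are minimal stand-ins
rather than the kernel definitions, vendor_probe_finalize() is a
made-up example hook, and this model also tolerates a missing impl
pointer, which the patch itself may not do.

```c
#include <assert.h>
#include <stddef.h>

/* User-space stand-ins for the kernel structures; not the real definitions. */
struct device {
	const char *name;
};

struct smmu_model;

struct smmu_impl_model {
	/* Optional hook; implementations that do not need it leave it NULL. */
	void (*probe_finalize)(struct smmu_model *smmu, struct device *dev);
};

struct smmu_model {
	const struct smmu_impl_model *impl;
	unsigned int finalized; /* counts hook invocations, for illustration */
};

/* Example vendor hook: extra post-attach programming would happen here. */
static void vendor_probe_finalize(struct smmu_model *smmu, struct device *dev)
{
	(void)dev;
	smmu->finalized++;
}

/* Core-side dispatch: call the vendor hook only if one was provided. */
static void smmu_probe_finalize(struct smmu_model *smmu, struct device *dev)
{
	assert(smmu != NULL);

	if (smmu->impl && smmu->impl->probe_finalize)
		smmu->impl->probe_finalize(smmu, dev);
}
```

An SMMU without a vendor implementation, or with one that leaves the
hook unset, simply skips the extra step in this model.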

Signed-off-by: Thierry Reding 
---
Changes in v2:
- remove unnecessarily paranoid check
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 13 +
 drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
 2 files changed, 14 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c 
b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index 6f72c4d208ca..d20ce4d57df2 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1450,6 +1450,18 @@ static void arm_smmu_release_device(struct device *dev)
iommu_fwspec_free(dev);
 }
 
+static void arm_smmu_probe_finalize(struct device *dev)
+{
+   struct arm_smmu_master_cfg *cfg;
+   struct arm_smmu_device *smmu;
+
+   cfg = dev_iommu_priv_get(dev);
+   smmu = cfg->smmu;
+
+   if (smmu->impl->probe_finalize)
+   smmu->impl->probe_finalize(smmu, dev);
+}
+
 static struct iommu_group *arm_smmu_device_group(struct device *dev)
 {
struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
@@ -1569,6 +1581,7 @@ static struct iommu_ops arm_smmu_ops = {
.iova_to_phys   = arm_smmu_iova_to_phys,
.probe_device   = arm_smmu_probe_device,
.release_device = arm_smmu_release_device,
+   .probe_finalize = arm_smmu_probe_finalize,
.device_group   = arm_smmu_device_group,
.enable_nesting = arm_smmu_enable_nesting,
.set_pgtable_quirks = arm_smmu_set_pgtable_quirks,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h 
b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index c31a59d35c64..147c95e7c59c 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -439,6 +439,7 @@ struct arm_smmu_impl {
  struct device *dev, int start);
void (*write_s2cr)(struct arm_smmu_device *smmu, int idx);
void (*write_sctlr)(struct arm_smmu_device *smmu, int idx, u32 reg);
+   void (*probe_finalize)(struct arm_smmu_device *smmu, struct device 
*dev);
 };
 
 #define INVALID_SMENDX -1
-- 
2.30.2



[PATCH v2 02/10] dt-bindings: arm-smmu: Add Tegra186 compatible string

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

The ARM SMMU instantiations found on Tegra186 and later need inter-
operation with the memory controller in order to correctly program
stream ID overrides.

Furthermore, on Tegra194 multiple instances of the SMMU can gang up
to achieve higher throughput. In order to do this, they have to be
programmed identically so that the memory controller can interleave
memory accesses between them.

Add the Tegra186 compatible string to make sure the interoperation
with the memory controller can be enabled on that SoC generation.

Signed-off-by: Thierry Reding 
---
 Documentation/devicetree/bindings/iommu/arm,smmu.yaml | 11 +--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml 
b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
index 9d27aa5111d4..1181b590db71 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.yaml
@@ -54,8 +54,14 @@ properties:
   - const: arm,mmu-500
   - description: NVIDIA SoCs that program two ARM MMU-500s identically
 items:
+  - description: NVIDIA SoCs that require memory controller interaction
+  and may program multiple ARM MMU-500s identically with the memory
+  controller interleaving translations between multiple instances
+  for improved performance.
+items:
   - enum:
-  - nvidia,tegra194-smmu
+  - const: nvidia,tegra194-smmu
+  - const: nvidia,tegra186-smmu
   - const: nvidia,smmu-500
   - items:
   - const: arm,mmu-500
@@ -165,10 +171,11 @@ allOf:
   contains:
 enum:
   - nvidia,tegra194-smmu
+  - nvidia,tegra186-smmu
 then:
   properties:
 reg:
-  minItems: 2
+  minItems: 1
   maxItems: 2
 else:
   properties:
-- 
2.30.2



[PATCH v2 01/10] memory: tegra: Implement SID override programming

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

Instead of programming all SID overrides during early boot, perform the
operation on-demand after the SMMU translations have been set up for a
device. This reuses data from device tree to match memory clients for a
device and programs the SID specified in device tree, which corresponds
to the SID used for the SMMU context banks for the device.
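
To make the on-demand flow above a little more concrete, here is a
user-space model of it. The register layout is reduced to two plain
32-bit fields, the bit positions follow the defines added by this
patch, and the names client_sid_override() and program_device() are
hypothetical. Unlike the patch, the helper returns a bool so the
outcome can be checked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SID_OVERRIDE_MASK           0xffu      /* GENMASK(7, 0) in the patch */
#define SID_SECURITY_OVERRIDE       (1u << 8)  /* BIT(8) in the patch */
#define SID_SECURITY_WRITE_DISABLED (1u << 16) /* BIT(16) in the patch */

/* Stand-in for the two per-client SID registers. */
struct client_regs {
	uint32_t security;
	uint32_t override;
};

/* Stand-in for a memory client known to the memory controller. */
struct mc_client {
	unsigned int id;
	struct client_regs regs;
};

/*
 * Program the SID override for one client. Returns false if firmware has
 * locked the configuration, true if the override now holds the SID.
 */
static bool client_sid_override(struct client_regs *regs, uint32_t sid)
{
	sid &= SID_OVERRIDE_MASK;

	if ((regs->security & SID_SECURITY_OVERRIDE) == 0) {
		/* Firmware disabled write access: nothing can be changed. */
		if (regs->security & SID_SECURITY_WRITE_DISABLED)
			return false;

		regs->security |= SID_SECURITY_OVERRIDE;
	}

	/* Only touch the register if the SID actually changes. */
	if ((regs->override & SID_OVERRIDE_MASK) != sid)
		regs->override = sid;

	return true;
}

/*
 * For every memory client ID referenced by a device (its "interconnects"
 * entries in device tree), program the device's SID. Returns the number of
 * clients that were programmed.
 */
static unsigned int program_device(struct mc_client *clients, size_t num_clients,
				   const unsigned int *ids, size_t num_ids,
				   uint32_t sid)
{
	unsigned int programmed = 0;
	size_t i, j;

	for (i = 0; i < num_ids; i++) {
		for (j = 0; j < num_clients; j++) {
			if (clients[j].id == ids[i] &&
			    client_sid_override(&clients[j].regs, sid))
				programmed++;
		}
	}

	return programmed;
}
```

Clients that the device does not reference are left untouched, and a
client locked down by firmware is skipped without error.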

Signed-off-by: Thierry Reding 
---
 drivers/memory/tegra/mc.c   |  9 +
 drivers/memory/tegra/tegra186.c | 72 +
 include/soc/tegra/mc.h  |  3 ++
 3 files changed, 84 insertions(+)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index c854639cf30c..bace5ecfe770 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -97,6 +97,15 @@ struct tegra_mc *devm_tegra_memory_controller_get(struct 
device *dev)
 }
 EXPORT_SYMBOL_GPL(devm_tegra_memory_controller_get);
 
+int tegra_mc_probe_device(struct tegra_mc *mc, struct device *dev)
+{
+   if (mc->soc->ops && mc->soc->ops->probe_device)
+   return mc->soc->ops->probe_device(mc, dev);
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(tegra_mc_probe_device);
+
 static int tegra_mc_block_dma_common(struct tegra_mc *mc,
 const struct tegra_mc_reset *rst)
 {
diff --git a/drivers/memory/tegra/tegra186.c b/drivers/memory/tegra/tegra186.c
index 1f87915ccd62..e65eac5764d4 100644
--- a/drivers/memory/tegra/tegra186.c
+++ b/drivers/memory/tegra/tegra186.c
@@ -4,6 +4,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -15,6 +16,10 @@
 #include 
 #endif
 
+#define MC_SID_STREAMID_OVERRIDE_MASK GENMASK(7, 0)
+#define MC_SID_STREAMID_SECURITY_WRITE_ACCESS_DISABLED BIT(16)
+#define MC_SID_STREAMID_SECURITY_OVERRIDE BIT(8)
+
 static void tegra186_mc_program_sid(struct tegra_mc *mc)
 {
unsigned int i;
@@ -66,10 +71,77 @@ static int tegra186_mc_resume(struct tegra_mc *mc)
return 0;
 }
 
+static void tegra186_mc_client_sid_override(struct tegra_mc *mc,
+   const struct tegra_mc_client 
*client,
+   unsigned int sid)
+{
+   u32 value, old;
+
+   value = readl(mc->regs + client->regs.sid.security);
+   if ((value & MC_SID_STREAMID_SECURITY_OVERRIDE) == 0) {
+   /*
+* If the secure firmware has locked this down the override
+* for this memory client, there's nothing we can do here.
+*/
+   if (value & MC_SID_STREAMID_SECURITY_WRITE_ACCESS_DISABLED)
+   return;
+
+   /*
+* Otherwise, try to set the override itself. Typically the
+* secure firmware will never have set this configuration.
+* Instead, it will either have disabled write access to
+* this field, or it will already have set an explicit
+* override itself.
+*/
+   WARN_ON((value & MC_SID_STREAMID_SECURITY_OVERRIDE) == 0);
+
+   value |= MC_SID_STREAMID_SECURITY_OVERRIDE;
+   writel(value, mc->regs + client->regs.sid.security);
+   }
+
+   value = readl(mc->regs + client->regs.sid.override);
+   old = value & MC_SID_STREAMID_OVERRIDE_MASK;
+
+   if (old != sid) {
+   dev_dbg(mc->dev, "overriding SID %x for %s with %x\n", old,
+   client->name, sid);
+   writel(sid, mc->regs + client->regs.sid.override);
+   }
+}
+
+static int tegra186_mc_probe_device(struct tegra_mc *mc, struct device *dev)
+{
+#if IS_ENABLED(CONFIG_IOMMU_API)
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct of_phandle_args args;
+   unsigned int i, index = 0;
+
+   while (!of_parse_phandle_with_args(dev->of_node, "interconnects", "#interconnect-cells",
+  index, &args)) {
+   if (args.np == mc->dev->of_node && args.args_count != 0) {
+   for (i = 0; i < mc->soc->num_clients; i++) {
+   const struct tegra_mc_client *client = &mc->soc->clients[i];
+
+   if (client->id == args.args[0]) {
+   u32 sid = fwspec->ids[0] & MC_SID_STREAMID_OVERRIDE_MASK;
+
+   tegra186_mc_client_sid_override(mc, client, sid);
+   }
+   }
+   }
+
+   index++;
+   }
+#endif
+
+   return 0;
+}
+
 const struct tegra_mc_ops tegra186_mc_ops = {
.probe = tegra186_mc_probe,
.remove = tegra186_mc_remove,
.resume = tegra186_mc_resume,
+   .probe_device = tegra186_mc_probe_device,
 };
 
 #if defined(CONFIG

[PATCH v2 00/10] arm64: tegra: Prevent early SMMU faults

2021-04-20 Thread Thierry Reding
From: Thierry Reding 

Hi,

this is a set of patches that is the result of earlier discussions
regarding early identity mappings that are needed to avoid SMMU faults
during early boot.

The goal here is to avoid early identity mappings altogether and instead
postpone the need for the identity mappings to when devices are attached
to the SMMU. This works by making the SMMU driver coordinate with the
memory controller driver on when to start enforcing SMMU translations.
This makes Tegra behave in a more standard way and pushes the code to
deal with the Tegra-specific programming into the NVIDIA SMMU
implementation.

Compared to the original version of these patches, I've split the
preparatory work into a separate patch series because it became very
large and will be mostly uninteresting for this audience.

Patch 1 provides a mechanism to program SID overrides at runtime. Patch
2 updates the ARM SMMU device tree bindings to include the Tegra186
compatible string as suggested by Robin during review.

Patches 3 and 4 create the fundamentals in the SMMU driver to support
this and also make this functionality available on Tegra186. Patch 5
hooks the ARM SMMU up to the memory controller so that the memory client
stream ID overrides can be programmed at the right time.

Patch 6 extends this mechanism to Tegra186 and patches 7-9 enable all of
this through device tree updates. Patch 10 is included here to show how
SMMU will be enabled for display controllers. However, it cannot be
applied yet because the code to create identity mappings for potentially
live framebuffers hasn't been merged yet.

The end result is that various peripherals will have SMMU enabled, while
the display controllers will keep using passthrough, as initially set up
by firmware. Once the device tree bindings have been accepted and the
SMMU driver has been updated to create identity mappings for the display
controllers, they can be hooked up to the SMMU and the code in this
series will automatically program the SID overrides to enable SMMU
translations at the right time.

Note that the series creates a compile time dependency between the
memory controller and IOMMU trees. If it helps I can provide a branch
for each tree, modelling the dependency, once the series has been
reviewed.

Changes in v2:
- split off the preparatory work into a separate series (that needs to
  be applied first)
- address review comments by Robin

Thierry

Thierry Reding (10):
  memory: tegra: Implement SID override programming
  dt-bindings: arm-smmu: Add Tegra186 compatible string
  iommu/arm-smmu: Implement ->probe_finalize()
  iommu/arm-smmu: tegra: Detect number of instances at runtime
  iommu/arm-smmu: tegra: Implement SID override programming
  iommu/arm-smmu: Use Tegra implementation on Tegra186
  arm64: tegra: Use correct compatible string for Tegra186 SMMU
  arm64: tegra: Hook up memory controller to SMMU on Tegra186
  arm64: tegra: Enable SMMU support on Tegra194
  arm64: tegra: Enable SMMU support for display on Tegra194

 .../devicetree/bindings/iommu/arm,smmu.yaml   |  11 +-
 arch/arm64/boot/dts/nvidia/tegra186.dtsi  |   4 +-
 arch/arm64/boot/dts/nvidia/tegra194.dtsi  | 166 ++
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c|   3 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c  |  90 --
 drivers/iommu/arm/arm-smmu/arm-smmu.c |  13 ++
 drivers/iommu/arm/arm-smmu/arm-smmu.h |   1 +
 drivers/memory/tegra/mc.c |   9 +
 drivers/memory/tegra/tegra186.c   |  72 
 include/soc/tegra/mc.h|   3 +
 10 files changed, 349 insertions(+), 23 deletions(-)

-- 
2.30.2



Re: [PATCH v1 1/2] iommu/tegra-smmu: Defer attachment of display clients

2021-04-08 Thread Thierry Reding
On Thu, Apr 08, 2021 at 02:42:42AM -0700, Nicolin Chen wrote:
> On Mon, Mar 29, 2021 at 02:32:55AM +0300, Dmitry Osipenko wrote:
> > All consumer-grade Android and Chromebook devices show a splash screen
> > on boot and then display is left enabled when kernel is booted. This
> > behaviour is unacceptable in a case of implicit IOMMU domains to which
> > devices are attached during kernel boot since devices, like display
> > controller, may perform DMA at that time. We can work around this problem
> > by deferring the enable of SMMU translation for a specific devices,
> > like a display controller, until the first IOMMU mapping is created,
> > which works good enough in practice because by that time h/w is already
> > stopped.
> > 
> > Signed-off-by: Dmitry Osipenko 
> 
> For both patches:
> Acked-by: Nicolin Chen 
> Tested-by: Nicolin Chen 
> 
> The WAR looks good to me. Perhaps Thierry would give some input.
> 
> Another topic:
> I think this may help work around the mc-errors, which we have
> been facing on Tegra210 also when we enable IOMMU_DOMAIN_DMA.
> (attached a test patch rebasing on these two)

Ugh... that's exactly what I was afraid of. Now everybody is going to
think that we can just work around this issue with driver-specific SMMU
hacks...

> However, GPU would also report errors using DMA domain:
> 
>  nouveau 5700.gpu: acr: firmware unavailable
>  nouveau 5700.gpu: pmu: firmware unavailable
>  nouveau 5700.gpu: gr: firmware unavailable
>  tegra-mc 70019000.memory-controller: gpusrd: read @0xfffbe200: 
> Security violation (TrustZone violation)
>  nouveau 5700.gpu: DRM: failed to create kernel channel, -22
>  tegra-mc 70019000.memory-controller: gpusrd: read @0xfffad000: 
> Security violation (TrustZone violation)
>  nouveau 5700.gpu: fifo: SCHED_ERROR 20 []
>  nouveau 5700.gpu: fifo: SCHED_ERROR 20 []
> 
> Looking at the address, seems that GPU allocated memory in 32-bit
> physical address space behind SMMU, so a violation happened after
> turning on DMA domain I guess... 

The problem with GPU is... extra complicated. You're getting these
faults because you're enabling the IOMMU-backed DMA API, which then
causes the Nouveau driver to allocate buffers using the DMA API instead of
explicitly allocating pages and then mapping them using the IOMMU API.
However, there are additional patches needed to teach Nouveau about how
to deal with SMMU and those haven't been merged yet. I've got prototypes
of this, but before the whole framebuffer carveout passing work makes
progress there's little sense in moving individual pieces forward.

One more reason not to try and cut corners. We know what the right solution is,
even if it takes a lot of work. I'm willing to ack this patch, or some
version of it, but only as a way of working around things we have no
realistic chance of fixing properly anymore. I still think it would be
best if we could derive identity mappings from command-line arguments on
these platforms because I think most of them will actually set that, and
then the solution becomes at least uniform at the SMMU level.

For Tegra210 I've already laid out a path to a solution that's going to
be generic and extend to Tegra186 and later as well.

Thierry



Re: [PATCH v1 1/2] iommu/tegra-smmu: Defer attachment of display clients

2021-04-08 Thread Thierry Reding
On Mon, Mar 29, 2021 at 02:32:55AM +0300, Dmitry Osipenko wrote:
> All consumer-grade Android and Chromebook devices show a splash screen
> on boot and then display is left enabled when kernel is booted. This
> behaviour is unacceptable in a case of implicit IOMMU domains to which
> devices are attached during kernel boot since devices, like display
> controller, may perform DMA at that time. We can work around this problem
> by deferring the enable of SMMU translation for a specific devices,
> like a display controller, until the first IOMMU mapping is created,
> which works good enough in practice because by that time h/w is already
> stopped.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/iommu/tegra-smmu.c | 71 ++
>  1 file changed, 71 insertions(+)

In general I do see why we would want to enable this. However, I think
this is a bad idea because it's going to proliferate the bad practice of
not describing things properly in device tree.

Whatever happened to the idea of creating identity mappings based on the
obscure tegra_fb_mem (or whatever it was called) command-line option? Is
that command-line not universally passed to the kernel from bootloaders
that initialize display?

That idealistic objection aside, this seems a bit over-engineered for
the hack that it is. See below.

> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 602aab98c079..af1e4b5adb27 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -60,6 +60,8 @@ struct tegra_smmu_as {
>   dma_addr_t pd_dma;
>   unsigned id;
>   u32 attr;
> + bool display_attached[2];
> + bool attached_devices_need_sync;
>  };
>  
>  static struct tegra_smmu_as *to_smmu_as(struct iommu_domain *dom)
> @@ -78,6 +80,10 @@ static inline u32 smmu_readl(struct tegra_smmu *smmu, 
> unsigned long offset)
>   return readl(smmu->regs + offset);
>  }
>  
> +/* all Tegra SoCs use the same group IDs for displays */
> +#define SMMU_SWGROUP_DC  1
> +#define SMMU_SWGROUP_DCB 2
> +
>  #define SMMU_CONFIG 0x010
>  #define  SMMU_CONFIG_ENABLE (1 << 0)
>  
> @@ -253,6 +259,20 @@ static inline void smmu_flush(struct tegra_smmu *smmu)
>   smmu_readl(smmu, SMMU_PTB_ASID);
>  }
>  
> +static int smmu_swgroup_to_display_id(unsigned int swgroup)
> +{
> + switch (swgroup) {
> + case SMMU_SWGROUP_DC:
> + return 0;
> +
> + case SMMU_SWGROUP_DCB:
> + return 1;
> +
> + default:
> + return -1;
> + }
> +}
> +

Why do we need to have this two-level mapping? Do we even need to care
about the specific swgroups IDs? Can we not just simply check at attach
time if the client that's being attached is a display client and then
set attached_devices_need_sync = true?
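
One possible reading of that simplification, sketched as a user-space
model: a single predicate decides whether the client is a display
client and the attach path just sets the flag. The swgroup values are
the ones from the quoted patch; struct as_model, swgroup_is_display()
and as_attach() are hypothetical names, and how a display client is
recognized is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SWGROUP_DC  1 /* values taken from the quoted patch */
#define SWGROUP_DCB 2

/* Stand-in for the address space; only the flag under discussion. */
struct as_model {
	bool attached_devices_need_sync;
};

/* Single predicate instead of a swgroup -> display index mapping. */
static bool swgroup_is_display(unsigned int swgroup)
{
	return swgroup == SWGROUP_DC || swgroup == SWGROUP_DCB;
}

/* Attach path: remember that a display client needs a deferred sync. */
static void as_attach(struct as_model *as, unsigned int swgroup)
{
	assert(as != NULL);

	if (swgroup_is_display(swgroup))
		as->attached_devices_need_sync = true;
}
```

This drops the per-display bookkeeping array and keeps only the one
flag.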

Thierry



Re: [PATCH 0/9] arm64: tegra: Prevent early SMMU faults

2021-03-26 Thread Thierry Reding
On Fri, Mar 26, 2021 at 06:29:28PM +0300, Dmitry Osipenko wrote:
> 25.03.2021 16:03, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > Hi,
> > 
> > this is a set of patches that is the result of earlier discussions
> > regarding early identity mappings that are needed to avoid SMMU faults
> > during early boot.
> > 
> > The goal here is to avoid early identity mappings altogether and instead
> > postpone the need for the identity mappings to when devices are attached
> > to the SMMU. This works by making the SMMU driver coordinate with the
> > memory controller driver on when to start enforcing SMMU translations.
> > This makes Tegra behave in a more standard way and pushes the code to
> > deal with the Tegra-specific programming into the NVIDIA SMMU
> > implementation.
> 
> It is an interesting idea which inspired me to try to apply a somewhat 
> similar thing to Tegra SMMU driver by holding the SMMU ASID enable-bit until 
> display driver allows to toggle it. This means that we will need an extra 
> small tegra-specific SMMU API function, but it should be okay.
> 
> I typed a patch and seems it's working good, I'll prepare a proper patch if 
> you like it.

That would actually be working around the problem that this patch was
supposed to prepare for. The reason for this current patch series is to
make sure SMMU translation isn't enabled until a device has actually
been attached to the SMMU. Once it has been attached, the assumption is
that any identity mappings will have been created.

On Tegra SMMU that shouldn't be a problem because translations aren't
enabled until device attach time. So in other words this patch set is to
get Tegra186 and later to parity with earlier chips from this point of
view.

I think the problem that you're trying to work around is better solved
by establishing these identity mappings. I do have patches to implement
this for Tegra210 and earlier, though they may require additional work
if you have bootloaders that don't use standard DT bindings for passing
information about the framebuffer to the kernel.
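
To illustrate what "establishing identity mappings" means here, the following
is a minimal userspace sketch, not kernel code. It assumes a toy one-level
page table; every name in it (toy_domain, toy_identity_map, toy_translate) is
hypothetical and none of it comes from the Tegra or ARM SMMU drivers. An
identity mapping simply makes each I/O virtual address translate to the same
physical address, so a display engine that keeps scanning out the firmware
framebuffer sees no change once translation is switched on:

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  (UINT64_C(1) << TOY_PAGE_SHIFT)
#define TOY_NUM_PAGES  64
#define TOY_UNMAPPED   UINT64_MAX

/* Hypothetical one-level page table: index is the IOVA page number. */
struct toy_domain {
	uint64_t pa[TOY_NUM_PAGES];
};

static inline void toy_domain_init(struct toy_domain *d)
{
	unsigned int i;

	for (i = 0; i < TOY_NUM_PAGES; i++)
		d->pa[i] = TOY_UNMAPPED;
}

/* Map [start, start + size) so that every IOVA equals its physical address. */
static inline int toy_identity_map(struct toy_domain *d, uint64_t start,
				   uint64_t size)
{
	uint64_t first = start >> TOY_PAGE_SHIFT;
	/* One past the last page touched by the region. */
	uint64_t end = (start + size + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT;
	uint64_t page;

	assert(d != 0);

	if (size == 0 || end > TOY_NUM_PAGES)
		return -1;

	for (page = first; page < end; page++)
		d->pa[page] = page << TOY_PAGE_SHIFT;

	return 0;
}

/* Translate an IOVA; unmapped addresses yield TOY_UNMAPPED (a "fault"). */
static inline uint64_t toy_translate(const struct toy_domain *d, uint64_t iova)
{
	uint64_t page = iova >> TOY_PAGE_SHIFT;

	if (page >= TOY_NUM_PAGES || d->pa[page] == TOY_UNMAPPED)
		return TOY_UNMAPPED;

	return d->pa[page] | (iova & (TOY_PAGE_SIZE - 1));
}
```

In this model a framebuffer region that has been identity-mapped translates to
itself, while any address outside it faults, which is the situation the early
mappings are meant to avoid for an active display.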

Thierry



Re: [PATCH 1/9] memory: tegra: Move internal data structures into separate header

2021-03-25 Thread Thierry Reding
On Thu, Mar 25, 2021 at 06:12:51PM +0300, Dmitry Osipenko wrote:
> 25.03.2021 16:03, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > From Tegra20 through Tegra210, either the GART or SMMU drivers need
> > access to the internals of the memory controller driver because they are
> > tightly coupled (in fact, the GART and SMMU are part of the memory
> > controller). On later chips, a separate hardware block implements the
> > SMMU functionality, so this is no longer needed. However, we still want
> > to reuse some of the existing infrastructure on later chips, so split
> > the memory controller internals into a separate header file to avoid
> > conflicts with the implementation on newer chips.
> > 
> > Signed-off-by: Thierry Reding 
> > ---
> >  drivers/iommu/tegra-gart.c  |  2 +-
> >  drivers/iommu/tegra-smmu.c  |  2 +-
> >  drivers/memory/tegra/mc.h   |  2 +-
> >  drivers/memory/tegra/tegra186.c | 12 ---
> >  include/soc/tegra/mc-internal.h | 62 +
> >  include/soc/tegra/mc.h  | 50 --
> >  6 files changed, 72 insertions(+), 58 deletions(-)
> >  create mode 100644 include/soc/tegra/mc-internal.h
> 
> What about making T186 reuse the existing tegra_mc struct? It seems
> there is nothing special in that struct that doesn't fit the newer
> SoCs. Please note that both SMMU and GART are already optional and all
> the SoC differences are specified within tegra_mc_soc. It looks to
> me like this could be a much nicer and cleaner variant.

The problem is that many of the interesting bits in tegra_mc_soc are
basically incompatible between the two. For instance the tegra_mc_client
and tegra186_mc_client structures, while they have the same purpose,
have completely different content. I didn't see a way to unify that
without overly complicating things by making half of the fields
basically optional on one or the other SoC generation.

Maybe one option would be to split tegra_mc into a tegra_mc_common and
then derive tegra_mc and tegra186_mc from that. That way we could share
the common bits while still letting the chip-specific differences be
handled separately.
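
A minimal sketch of what such a split could look like follows. It is only an
illustration under assumptions: the field names and the to_tegra186_mc()
helper are hypothetical, the macro is a userspace stand-in for the kernel's
container_of(), and none of these structures exist in the series as posted.
The common part is embedded in each chip-specific structure, shared code only
touches the common part, and chip-specific code recovers its own structure
from it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical part shared by all SoC generations. */
struct tegra_mc_common {
	const char *name;
	unsigned int num_clients;
};

/* Hypothetical Tegra20 through Tegra210 flavour. */
struct tegra_mc {
	struct tegra_mc_common common;
	unsigned int num_emem_regs;
};

/* Hypothetical Tegra186 and later flavour. */
struct tegra186_mc {
	struct tegra_mc_common common;
	unsigned int num_sid_overrides;
};

/* Userspace stand-in for the kernel's container_of(). */
#define mc_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Chip-specific code gets from the common part back to its own struct. */
static inline struct tegra186_mc *to_tegra186_mc(struct tegra_mc_common *common)
{
	assert(common != NULL);
	return mc_container_of(common, struct tegra186_mc, common);
}

/* Shared code only ever needs the common part. */
static inline unsigned int
tegra_mc_common_num_clients(const struct tegra_mc_common *common)
{
	return common->num_clients;
}
```

The trade-off is one extra level of indirection in the chip-specific paths in
exchange for not carrying fields that are meaningless on one generation or
the other.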

Thierry



Re: [PATCH 3/9] memory: tegra: Implement SID override programming

2021-03-25 Thread Thierry Reding
On Thu, Mar 25, 2021 at 02:27:10PM +, Robin Murphy wrote:
> On 2021-03-25 13:03, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > Instead of programming all SID overrides during early boot, perform the
> > operation on-demand after the SMMU translations have been set up for a
> > device. This reuses data from device tree to match memory clients for a
> > device and programs the SID specified in device tree, which corresponds
> > to the SID used for the SMMU context banks for the device.
> 
> Can you clarify what exactly the SID override does? I'm guessing it's more
> than just changing the ID presented to the SMMU from one value to another,
> since that alone wouldn't help under disable_bypass.

My understanding is that this override is basically one level higher
than the SMMU. There's a special override SID (0x7f) that can be used to
keep memory accesses from going through the SMMU at all. That is, as long as
that passthrough SID is configured for a memory client, accesses by that
client will be routed around the SMMU. Only if a valid SID is programmed
in this override will accesses for a memory client be routed to the
SMMU.
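
The routing rule described above can be modelled in a few lines of userspace
C. This is a sketch under assumptions, not the hardware behaviour in full:
the struct and function names are hypothetical, and the model treats every
SID other than 0x7f as "valid", which is a simplification. The 0x7f value
comes from the description above and the mask from the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SID_OVERRIDE_MASK 0xffu	/* GENMASK(7, 0), as in the patch */
#define SID_PASSTHROUGH   0x7fu	/* special SID that bypasses the SMMU */

/* Toy model of one memory client: just its current override value. */
struct mc_client_model {
	uint32_t override;
};

/* True if accesses by this client would be routed to the SMMU. */
static inline bool mc_client_uses_smmu(const struct mc_client_model *client)
{
	return (client->override & SID_OVERRIDE_MASK) != SID_PASSTHROUGH;
}

/* Model of what happens at attach time: program a real SID. */
static inline void mc_client_attach(struct mc_client_model *client,
				    uint32_t sid)
{
	assert((sid & SID_OVERRIDE_MASK) != SID_PASSTHROUGH);
	client->override = sid & SID_OVERRIDE_MASK;
}
```

A client that starts out with the passthrough SID bypasses the SMMU; once a
different SID has been programmed, its accesses are subject to translation.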

> > Signed-off-by: Thierry Reding 
> > ---
> >   drivers/memory/tegra/tegra186.c | 70 +
> >   include/soc/tegra/mc.h  | 10 +
> >   2 files changed, 80 insertions(+)
> > 
> > diff --git a/drivers/memory/tegra/tegra186.c b/drivers/memory/tegra/tegra186.c
> > index efa922d51d83..a89e8e40d875 100644
> > --- a/drivers/memory/tegra/tegra186.c
> > +++ b/drivers/memory/tegra/tegra186.c
> > @@ -4,6 +4,7 @@
> >*/
> >   #include 
> > +#include 
> >   #include 
> >   #include 
> >   #include 
> > @@ -19,6 +20,10 @@
> >   #include 
> >   #endif
> > +#define MC_SID_STREAMID_OVERRIDE_MASK GENMASK(7, 0)
> > +#define MC_SID_STREAMID_SECURITY_WRITE_ACCESS_DISABLED BIT(16)
> > +#define MC_SID_STREAMID_SECURITY_OVERRIDE BIT(8)
> > +
> >   struct tegra186_mc_client {
> > const char *name;
> > unsigned int id;
> > @@ -1808,6 +1813,71 @@ static struct platform_driver tegra186_mc_driver = {
> >   };
> >   module_platform_driver(tegra186_mc_driver);
> > +static void tegra186_mc_client_sid_override(struct tegra_mc *mc,
> > +   const struct tegra186_mc_client *client,
> > +   unsigned int sid)
> > +{
> > +   u32 value, old;
> > +
> > +   value = readl(mc->regs + client->regs.security);
> > +   if ((value & MC_SID_STREAMID_SECURITY_OVERRIDE) == 0) {
> > +   /*
> > +* If the secure firmware has locked this down the override
> > +* for this memory client, there's nothing we can do here.
> > +*/
> > +   if (value & MC_SID_STREAMID_SECURITY_WRITE_ACCESS_DISABLED)
> > +   return;
> 
> How likely is that in practice? If it's anything more than vanishingly rare
> then that would seem to be a strong pointer back towards persevering with
> the common solution that will work for everyone.

The idea behind this patch series is basically to use this mechanism in
order to avoid the murky waters between ARM SMMU driver probe and SMMU
device probe, so that we can avoid the early identity mappings that make
things so complicated.

So in other words until the device has been attached to the SMMU (at
which point it's expected that any identity mappings will have been
created), the device will remain in passthrough mode through the SID
override mechanism. After the device has been attached, we'd lock the
SID to the proper value and hence enable SMMU translation.

In a typical setup it would actually be fairly common to encounter the
above. The firmware will pre-program the SID overrides and lock down the
configuration for most devices. The only one that will stay unconfigured
at the moment is display, specifically because it is the only device
that may not be in a quiescent state during boot. For all other devices
write access to the SID override register is disabled and the above just
abandons early because the subsequent operations would just be
discarded.
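
To make the control flow concrete, here is a userspace model of the check in
the quoted hunk. It is a sketch under assumptions: plain memory stands in for
the MMIO registers, the struct and function names are hypothetical, and the
boolean return value is added only so the outcome can be observed. The bit
positions are the ones defined in the patch:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SECURITY_OVERRIDE              (UINT32_C(1) << 8)	/* BIT(8) */
#define SECURITY_WRITE_ACCESS_DISABLED (UINT32_C(1) << 16)	/* BIT(16) */
#define OVERRIDE_MASK                  UINT32_C(0xff)	/* GENMASK(7, 0) */

/* Plain-memory stand-in for one client's pair of registers. */
struct mc_client_regs {
	uint32_t security;
	uint32_t override;
};

/*
 * Returns false if firmware has locked the client down (nothing is
 * written), true if the override now holds the requested SID.
 */
static inline bool client_sid_override(struct mc_client_regs *regs,
				       uint32_t sid)
{
	assert(regs != 0);

	if ((regs->security & SECURITY_OVERRIDE) == 0) {
		/* Locked by firmware: writes would be discarded anyway. */
		if (regs->security & SECURITY_WRITE_ACCESS_DISABLED)
			return false;

		regs->security |= SECURITY_OVERRIDE;
	}

	if ((regs->override & OVERRIDE_MASK) != (sid & OVERRIDE_MASK))
		regs->override = sid & OVERRIDE_MASK;

	return true;
}
```

With a locked client the function returns early and leaves the
firmware-programmed value untouched; with an unlocked client such as display,
it sets the override bit and programs the SID.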

> > +   /*
> > +* Otherwise, try to set the override itself. Typically the
> > +* secure firmware will never have set this configuration.
> > +* Instead, it will either have disabled write access to
> > +* this field, or it will already have set an explicit
> > +* override itself.
> > +*/
> > +   WARN_ON((value & MC_SID_STREAMID_SECURITY_OVERRIDE) == 0);
> 
> Giv

[PATCH 6/9] iommu/arm-smmu: tegra: Implement SID override programming

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

The secure firmware keeps some SID override registers set as passthrough
in order to allow devices such as the display controller to operate with
no knowledge of SMMU translations until an operating system driver takes
over. This is needed in order to seamlessly transition from the firmware
framebuffer to the OS framebuffer.

Upon successfully attaching a device to the SMMU and in the process
creating identity mappings for memory regions that are being accessed,
the Tegra implementation will call into the memory controller driver to
program the override SIDs appropriately.

Signed-off-by: Thierry Reding 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 32 ++--
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
index 5b1170b028f0..127b51e6445f 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
@@ -7,6 +7,8 @@
 #include 
 #include 
 
+#include 
+
 #include "arm-smmu.h"
 
 /*
@@ -15,10 +17,17 @@
  * interleaved IOVA accesses across them and translates accesses from
  * non-isochronous HW devices.
  * Third one is used for translating accesses from isochronous HW devices.
+ *
+ * In addition, the SMMU driver needs to coordinate with the memory controller
+ * driver to ensure that the right SID override is programmed for any given
+ * memory client. This is necessary to allow for use-case such as seamlessly
+ * handing over the display controller configuration from the firmware to the
+ * kernel.
+ *
  * This implementation supports programming of the two instances that must
- * be programmed identically.
- * The third instance usage is through standard arm-smmu driver itself and
- * is out of scope of this implementation.
+ * be programmed identically and takes care of invoking the memory controller
+ * driver for SID override programming after devices have been attached to an
+ * SMMU instance.
  */
 #define MAX_SMMU_INSTANCES 2
 
@@ -26,6 +35,7 @@ struct nvidia_smmu {
struct arm_smmu_device smmu;
void __iomem *bases[MAX_SMMU_INSTANCES];
unsigned int num_instances;
+   struct tegra_mc *mc;
 };
 
 static inline struct nvidia_smmu *to_nvidia_smmu(struct arm_smmu_device *smmu)
@@ -237,6 +247,17 @@ static irqreturn_t nvidia_smmu_context_fault(int irq, void *dev)
return ret;
 }
 
+static void nvidia_smmu_probe_finalize(struct arm_smmu_device *smmu, struct device *dev)
+{
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
+   int err;
+
+   err = tegra186_mc_probe_device(nvidia->mc, dev);
+   if (err < 0)
+   dev_err(smmu->dev, "memory controller probe failed for %s: %d\n",
+   dev_name(dev), err);
+}
+
 static const struct arm_smmu_impl nvidia_smmu_impl = {
.read_reg = nvidia_smmu_read_reg,
.write_reg = nvidia_smmu_write_reg,
@@ -246,6 +267,7 @@ static const struct arm_smmu_impl nvidia_smmu_impl = {
.tlb_sync = nvidia_smmu_tlb_sync,
.global_fault = nvidia_smmu_global_fault,
.context_fault = nvidia_smmu_context_fault,
+   .probe_finalize = nvidia_smmu_probe_finalize,
 };
 
 struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
@@ -260,6 +282,10 @@ struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
if (!nvidia_smmu)
return ERR_PTR(-ENOMEM);
 
+   nvidia_smmu->mc = devm_tegra_memory_controller_get(dev);
+   if (IS_ERR(nvidia_smmu->mc))
+   return ERR_CAST(nvidia_smmu->mc);
+
/* Instance 0 is ioremapped by arm-smmu.c. */
nvidia_smmu->bases[0] = smmu->base;
nvidia_smmu->num_instances++;
-- 
2.30.2



[PATCH 5/9] iommu/arm-smmu: tegra: Detect number of instances at runtime

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

Parse the reg property in device tree and detect the number of instances
represented by a device tree node. This is subsequently needed in order
to support single-instance SMMUs with the Tegra implementation because
additional programming is needed to properly configure the SID override
registers in the memory controller.

Signed-off-by: Thierry Reding 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c | 49 ++--
 1 file changed, 34 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
index 29117444e5a0..5b1170b028f0 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c
@@ -20,13 +20,19 @@
  * The third instance usage is through standard arm-smmu driver itself and
  * is out of scope of this implementation.
  */
-#define NUM_SMMU_INSTANCES 2
+#define MAX_SMMU_INSTANCES 2
 
 struct nvidia_smmu {
-   struct arm_smmu_device  smmu;
-   void __iomem*bases[NUM_SMMU_INSTANCES];
+   struct arm_smmu_device smmu;
+   void __iomem *bases[MAX_SMMU_INSTANCES];
+   unsigned int num_instances;
 };
 
+static inline struct nvidia_smmu *to_nvidia_smmu(struct arm_smmu_device *smmu)
+{
+   return container_of(smmu, struct nvidia_smmu, smmu);
+}
+
 static inline void __iomem *nvidia_smmu_page(struct arm_smmu_device *smmu,
 unsigned int inst, int page)
 {
@@ -47,9 +53,10 @@ static u32 nvidia_smmu_read_reg(struct arm_smmu_device *smmu,
 static void nvidia_smmu_write_reg(struct arm_smmu_device *smmu,
  int page, int offset, u32 val)
 {
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
unsigned int i;
 
-   for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+   for (i = 0; i < nvidia->num_instances; i++) {
void __iomem *reg = nvidia_smmu_page(smmu, i, page) + offset;
 
writel_relaxed(val, reg);
@@ -67,9 +74,10 @@ static u64 nvidia_smmu_read_reg64(struct arm_smmu_device *smmu,
 static void nvidia_smmu_write_reg64(struct arm_smmu_device *smmu,
int page, int offset, u64 val)
 {
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
unsigned int i;
 
-   for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+   for (i = 0; i < nvidia->num_instances; i++) {
void __iomem *reg = nvidia_smmu_page(smmu, i, page) + offset;
 
writeq_relaxed(val, reg);
@@ -79,6 +87,7 @@ static void nvidia_smmu_write_reg64(struct arm_smmu_device *smmu,
 static void nvidia_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
 int sync, int status)
 {
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
unsigned int delay;
 
arm_smmu_writel(smmu, page, sync, 0);
@@ -90,7 +99,7 @@ static void nvidia_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
u32 val = 0;
unsigned int i;
 
-   for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+   for (i = 0; i < nvidia->num_instances; i++) {
void __iomem *reg;
 
reg = nvidia_smmu_page(smmu, i, page) + status;
@@ -112,9 +121,10 @@ static void nvidia_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
 
 static int nvidia_smmu_reset(struct arm_smmu_device *smmu)
 {
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
unsigned int i;
 
-   for (i = 0; i < NUM_SMMU_INSTANCES; i++) {
+   for (i = 0; i < nvidia->num_instances; i++) {
u32 val;
void __iomem *reg = nvidia_smmu_page(smmu, i, ARM_SMMU_GR0) +
ARM_SMMU_GR0_sGFSR;
@@ -157,8 +167,9 @@ static irqreturn_t nvidia_smmu_global_fault(int irq, void *dev)
unsigned int inst;
irqreturn_t ret = IRQ_NONE;
struct arm_smmu_device *smmu = dev;
+   struct nvidia_smmu *nvidia = to_nvidia_smmu(smmu);
 
-   for (inst = 0; inst < NUM_SMMU_INSTANCES; inst++) {
+   for (inst = 0; inst < nvidia->num_instances; inst++) {
irqreturn_t irq_ret;
 
irq_ret = nvidia_smmu_global_fault_inst(irq, smmu, inst);
@@ -202,11 +213,13 @@ static irqreturn_t nvidia_smmu_context_fault(int irq, void *dev)
struct arm_smmu_device *smmu;
struct iommu_domain *domain = dev;
struct arm_smmu_domain *smmu_domain;
+   struct nvidia_smmu *nvidia;
 
smmu_domain = container_of(domain, struct arm_smmu_domain, domain);
smmu = smmu_domain->smmu;
+   nvidia = to_nvidia_smmu(smmu);
 
-   for (inst = 0; inst < NUM_SMMU_INSTANCES; inst++) {
+   for (inst = 0; inst < nvidia->num_instances; inst++) {
irqreturn_t irq_ret;
 

[PATCH 4/9] iommu/arm-smmu: Implement ->probe_finalize()

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

Implement a ->probe_finalize() callback that can be used by vendor
implementations to perform extra programming necessary after devices
have been attached to the SMMU.

Signed-off-by: Thierry Reding 
---
 drivers/iommu/arm/arm-smmu/arm-smmu.c | 17 +
 drivers/iommu/arm/arm-smmu/arm-smmu.h |  1 +
 2 files changed, 18 insertions(+)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c
index d8c6bfde6a61..4589e76543a8 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c
@@ -1447,6 +1447,22 @@ static void arm_smmu_release_device(struct device *dev)
iommu_fwspec_free(dev);
 }
 
+static void arm_smmu_probe_finalize(struct device *dev)
+{
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct arm_smmu_master_cfg *cfg;
+   struct arm_smmu_device *smmu;
+
+   if (!fwspec || fwspec->ops != &arm_smmu_ops)
+   return;
+
+   cfg = dev_iommu_priv_get(dev);
+   smmu = cfg->smmu;
+
+   if (smmu->impl->probe_finalize)
+   smmu->impl->probe_finalize(smmu, dev);
+}
+
 static struct iommu_group *arm_smmu_device_group(struct device *dev)
 {
struct arm_smmu_master_cfg *cfg = dev_iommu_priv_get(dev);
@@ -1630,6 +1646,7 @@ static struct iommu_ops arm_smmu_ops = {
.iova_to_phys   = arm_smmu_iova_to_phys,
.probe_device   = arm_smmu_probe_device,
.release_device = arm_smmu_release_device,
+   .probe_finalize = arm_smmu_probe_finalize,
.device_group   = arm_smmu_device_group,
.domain_get_attr= arm_smmu_domain_get_attr,
.domain_set_attr= arm_smmu_domain_set_attr,
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.h b/drivers/iommu/arm/arm-smmu/arm-smmu.h
index d2a2d1bc58ba..6779db30cebb 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu.h
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu.h
@@ -439,6 +439,7 @@ struct arm_smmu_impl {
  struct device *dev, int start);
void (*write_s2cr)(struct arm_smmu_device *smmu, int idx);
void (*write_sctlr)(struct arm_smmu_device *smmu, int idx, u32 reg);
+   void (*probe_finalize)(struct arm_smmu_device *smmu, struct device *dev);
 };
 
 #define INVALID_SMENDX -1
-- 
2.30.2



[PATCH 2/9] memory: tegra: Add memory client IDs to tables

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

The memory client IDs will subsequently be used to program override SIDs
for the given clients depending on the device tree configuration.

Signed-off-by: Thierry Reding 
---
 drivers/memory/tegra/tegra186.c | 206 
 1 file changed, 206 insertions(+)

diff --git a/drivers/memory/tegra/tegra186.c b/drivers/memory/tegra/tegra186.c
index aa676c45650b..efa922d51d83 100644
--- a/drivers/memory/tegra/tegra186.c
+++ b/drivers/memory/tegra/tegra186.c
@@ -21,6 +21,7 @@
 
 struct tegra186_mc_client {
const char *name;
+   unsigned int id;
unsigned int sid;
struct {
unsigned int override;
@@ -70,6 +71,7 @@ static void tegra186_mc_program_sid(struct tegra_mc *mc)
 static const struct tegra186_mc_client tegra186_mc_clients[] = {
{
.name = "ptcr",
+   .id = TEGRA186_MEMORY_CLIENT_PTCR,
.sid = TEGRA186_SID_PASSTHROUGH,
.regs = {
.override = 0x000,
@@ -77,6 +79,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "afir",
+   .id = TEGRA186_MEMORY_CLIENT_AFIR,
.sid = TEGRA186_SID_AFI,
.regs = {
.override = 0x070,
@@ -84,6 +87,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "hdar",
+   .id = TEGRA186_MEMORY_CLIENT_HDAR,
.sid = TEGRA186_SID_HDA,
.regs = {
.override = 0x0a8,
@@ -91,6 +95,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "host1xdmar",
+   .id = TEGRA186_MEMORY_CLIENT_HOST1XDMAR,
.sid = TEGRA186_SID_HOST1X,
.regs = {
.override = 0x0b0,
@@ -99,12 +104,14 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
}, {
.name = "nvencsrd",
.sid = TEGRA186_SID_NVENC,
+   .id = TEGRA186_MEMORY_CLIENT_NVENCSRD,
.regs = {
.override = 0x0e0,
.security = 0x0e4,
},
}, {
.name = "satar",
+   .id = TEGRA186_MEMORY_CLIENT_SATAR,
.sid = TEGRA186_SID_SATA,
.regs = {
.override = 0x0f8,
@@ -112,6 +119,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "mpcorer",
+   .id = TEGRA186_MEMORY_CLIENT_MPCORER,
.sid = TEGRA186_SID_PASSTHROUGH,
.regs = {
.override = 0x138,
@@ -119,6 +127,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "nvencswr",
+   .id = TEGRA186_MEMORY_CLIENT_NVENCSWR,
.sid = TEGRA186_SID_NVENC,
.regs = {
.override = 0x158,
@@ -126,6 +135,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "afiw",
+   .id = TEGRA186_MEMORY_CLIENT_AFIW,
.sid = TEGRA186_SID_AFI,
.regs = {
.override = 0x188,
@@ -133,6 +143,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "hdaw",
+   .id = TEGRA186_MEMORY_CLIENT_HDAW,
.sid = TEGRA186_SID_HDA,
.regs = {
.override = 0x1a8,
@@ -140,6 +151,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "mpcorew",
+   .id = TEGRA186_MEMORY_CLIENT_MPCOREW,
.sid = TEGRA186_SID_PASSTHROUGH,
.regs = {
.override = 0x1c8,
@@ -147,6 +159,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "sataw",
+   .id = TEGRA186_MEMORY_CLIENT_SATAW,
.sid = TEGRA186_SID_SATA,
.regs = {
.override = 0x1e8,
@@ -154,6 +167,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
},
}, {
.name = "ispra",
+   .id = TEGRA186_MEMORY_CLIENT_ISPRA,
.sid = TEGRA186_SID_ISP,
.regs = {
.override = 0x220,
@@ -161,6 +175,7 @@ static const struct tegra186_mc_client tegra186_mc_clients[] = {
   

[PATCH 3/9] memory: tegra: Implement SID override programming

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

Instead of programming all SID overrides during early boot, perform the
operation on-demand after the SMMU translations have been set up for a
device. This reuses data from device tree to match memory clients for a
device and programs the SID specified in device tree, which corresponds
to the SID used for the SMMU context banks for the device.

Signed-off-by: Thierry Reding 
---
 drivers/memory/tegra/tegra186.c | 70 +
 include/soc/tegra/mc.h  | 10 +
 2 files changed, 80 insertions(+)

diff --git a/drivers/memory/tegra/tegra186.c b/drivers/memory/tegra/tegra186.c
index efa922d51d83..a89e8e40d875 100644
--- a/drivers/memory/tegra/tegra186.c
+++ b/drivers/memory/tegra/tegra186.c
@@ -4,6 +4,7 @@
  */
 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -19,6 +20,10 @@
 #include 
 #endif
 
+#define MC_SID_STREAMID_OVERRIDE_MASK GENMASK(7, 0)
+#define MC_SID_STREAMID_SECURITY_WRITE_ACCESS_DISABLED BIT(16)
+#define MC_SID_STREAMID_SECURITY_OVERRIDE BIT(8)
+
 struct tegra186_mc_client {
const char *name;
unsigned int id;
@@ -1808,6 +1813,71 @@ static struct platform_driver tegra186_mc_driver = {
 };
 module_platform_driver(tegra186_mc_driver);
 
+static void tegra186_mc_client_sid_override(struct tegra_mc *mc,
+   const struct tegra186_mc_client *client,
+   unsigned int sid)
+{
+   u32 value, old;
+
+   value = readl(mc->regs + client->regs.security);
+   if ((value & MC_SID_STREAMID_SECURITY_OVERRIDE) == 0) {
+   /*
+* If the secure firmware has locked this down the override
+* for this memory client, there's nothing we can do here.
+*/
+   if (value & MC_SID_STREAMID_SECURITY_WRITE_ACCESS_DISABLED)
+   return;
+
+   /*
+* Otherwise, try to set the override itself. Typically the
+* secure firmware will never have set this configuration.
+* Instead, it will either have disabled write access to
+* this field, or it will already have set an explicit
+* override itself.
+*/
+   WARN_ON((value & MC_SID_STREAMID_SECURITY_OVERRIDE) == 0);
+
+   value |= MC_SID_STREAMID_SECURITY_OVERRIDE;
+   writel(value, mc->regs + client->regs.security);
+   }
+
+   value = readl(mc->regs + client->regs.override);
+   old = value & MC_SID_STREAMID_OVERRIDE_MASK;
+
+   if (old != sid) {
+   dev_dbg(mc->dev, "overriding SID %x for %s with %x\n", old,
+   client->name, sid);
+   writel(sid, mc->regs + client->regs.override);
+   }
+}
+
+int tegra186_mc_probe_device(struct tegra_mc *mc, struct device *dev)
+{
+   struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct of_phandle_args args;
+   unsigned int i, index = 0;
+
+   while (!of_parse_phandle_with_args(dev->of_node, "interconnects", "#interconnect-cells",
+  index, &args)) {
+   if (args.np == mc->dev->of_node && args.args_count != 0) {
+   for (i = 0; i < mc->soc->num_clients; i++) {
+   const struct tegra186_mc_client *client = &mc->soc->clients[i];
+
+   if (client->id == args.args[0]) {
+   u32 sid = fwspec->ids[0] & MC_SID_STREAMID_OVERRIDE_MASK;
+
+   tegra186_mc_client_sid_override(mc, client, sid);
+   }
+   }
+   }
+
+       index++;
+   }
+
+   return 0;
+}
+EXPORT_SYMBOL_GPL(tegra186_mc_probe_device);
+
 MODULE_AUTHOR("Thierry Reding ");
 MODULE_DESCRIPTION("NVIDIA Tegra186 Memory Controller driver");
 MODULE_LICENSE("GPL v2");
diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
index 7be8441c6e9e..73d5ecf0e76a 100644
--- a/include/soc/tegra/mc.h
+++ b/include/soc/tegra/mc.h
@@ -168,4 +168,14 @@ devm_tegra_memory_controller_get(struct device *dev)
 }
 #endif
 
+#if IS_ENABLED(CONFIG_ARCH_TEGRA_186_SOC) || \
+IS_ENABLED(CONFIG_ARCH_TEGRA_194_SOC)
+int tegra186_mc_probe_device(struct tegra_mc *mc, struct device *dev);
+#else
+static inline int tegra186_mc_probe_device(struct tegra_mc *mc, struct device *dev)
+{
+   return 0;
+}
+#endif
+
 #endif /* __SOC_TEGRA_MC_H__ */
-- 
2.30.2



[PATCH 9/9] arm64: tegra: Enable SMMU support on Tegra194

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

Add the device tree node for the dual-SMMU found on Tegra194 and hook up
peripherals such as host1x, BPMP, HDA, SDMMC, EQOS and VIC.

Signed-off-by: Thierry Reding 
---
 arch/arm64/boot/dts/nvidia/tegra194.dtsi | 86 
 1 file changed, 86 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra194.dtsi b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
index 9449156fae39..3c1231a9ff62 100644
--- a/arch/arm64/boot/dts/nvidia/tegra194.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra194.dtsi
@@ -62,6 +62,7 @@ ethernet@249 {
interconnects = < TEGRA194_MEMORY_CLIENT_EQOSR >,
< TEGRA194_MEMORY_CLIENT_EQOSW >;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_EQOS>;
status = "disabled";
 
snps,write-requests = <1>;
@@ -733,6 +734,7 @@ sdmmc1: mmc@340 {
interconnects = < TEGRA194_MEMORY_CLIENT_SDMMCRA >,
< TEGRA194_MEMORY_CLIENT_SDMMCWA >;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_SDMMC1>;
nvidia,pad-autocal-pull-up-offset-3v3-timeout =
<0x07>;
nvidia,pad-autocal-pull-down-offset-3v3-timeout =
@@ -759,6 +761,7 @@ sdmmc3: mmc@344 {
interconnects = < TEGRA194_MEMORY_CLIENT_SDMMCR >,
< TEGRA194_MEMORY_CLIENT_SDMMCW >;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_SDMMC3>;
nvidia,pad-autocal-pull-up-offset-1v8 = <0x00>;
nvidia,pad-autocal-pull-down-offset-1v8 = <0x7a>;
nvidia,pad-autocal-pull-up-offset-3v3-timeout = <0x07>;
@@ -790,6 +793,7 @@ sdmmc4: mmc@346 {
interconnects = < TEGRA194_MEMORY_CLIENT_SDMMCRAB >,
< TEGRA194_MEMORY_CLIENT_SDMMCWAB >;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_SDMMC4>;
nvidia,pad-autocal-pull-up-offset-hs400 = <0x00>;
nvidia,pad-autocal-pull-down-offset-hs400 = <0x00>;
nvidia,pad-autocal-pull-up-offset-1v8-timeout = <0x0a>;
@@ -821,6 +825,7 @@ hda@351 {
interconnects = < TEGRA194_MEMORY_CLIENT_HDAR >,
< TEGRA194_MEMORY_CLIENT_HDAW >;
interconnect-names = "dma-mem", "write";
+   iommus = < TEGRA194_SID_HDA>;
status = "disabled";
};
 
@@ -1300,6 +1305,84 @@ pmc: pmc@c36 {
interrupt-controller;
};

[PATCH 8/9] arm64: tegra: Hook up memory controller to SMMU on Tegra186

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

On Tegra186 and later, the memory controller needs to be programmed in
coordination with any of the ARM SMMU instances to configure the stream
ID used for each memory client.

To support this, add a phandle reference to the memory controller to the
SMMU device tree node.

Signed-off-by: Thierry Reding 
---
 arch/arm64/boot/dts/nvidia/tegra186.dtsi | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra186.dtsi b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
index 9f75bbf00cf7..e9fdf9e18d37 100644
--- a/arch/arm64/boot/dts/nvidia/tegra186.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra186.dtsi
@@ -1152,6 +1152,8 @@ smmu: iommu@1200 {
stream-match-mask = <0x7f80>;
#global-interrupts = <1>;
#iommu-cells = <1>;
+
+   nvidia,memory-controller = <>;
};
 
host1x@13e0 {
-- 
2.30.2



[PATCH 7/9] iommu/arm-smmu: Use Tegra implementation on Tegra186

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

Tegra186 requires the same SID override programming as Tegra194 in order
to seamlessly transition from the firmware framebuffer to the Linux
framebuffer, so the Tegra implementation needs to be used on Tegra186
devices as well.

Signed-off-by: Thierry Reding 
---
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
index 136872e77195..9f465e146799 100644
--- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
+++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
@@ -211,7 +211,8 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
if (of_property_read_bool(np, "calxeda,smmu-secure-config-access"))
smmu->impl = &calxeda_impl;
 
-   if (of_device_is_compatible(np, "nvidia,tegra194-smmu"))
+   if (of_device_is_compatible(np, "nvidia,tegra194-smmu") ||
+   of_device_is_compatible(np, "nvidia,tegra186-smmu"))
return nvidia_smmu_impl_init(smmu);
 
smmu = qcom_smmu_impl_init(smmu);
-- 
2.30.2



[PATCH 1/9] memory: tegra: Move internal data structures into separate header

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

From Tegra20 through Tegra210, either the GART or SMMU drivers need
access to the internals of the memory controller driver because they are
tightly coupled (in fact, the GART and SMMU are part of the memory
controller). On later chips, a separate hardware block implements the
SMMU functionality, so this is no longer needed. However, we still want
to reuse some of the existing infrastructure on later chips, so split
the memory controller internals into a separate header file to avoid
conflicts with the implementation on newer chips.

Signed-off-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c  |  2 +-
 drivers/iommu/tegra-smmu.c  |  2 +-
 drivers/memory/tegra/mc.h   |  2 +-
 drivers/memory/tegra/tegra186.c | 12 ---
 include/soc/tegra/mc-internal.h | 62 +
 include/soc/tegra/mc.h  | 50 --
 6 files changed, 72 insertions(+), 58 deletions(-)
 create mode 100644 include/soc/tegra/mc-internal.h

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 6f130e51f072..716185234b2a 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -17,7 +17,7 @@
 #include 
 #include 
 
-#include 
+#include 
 
 #define GART_REG_BASE  0x24
 #define GART_CONFIG(0x24 - GART_REG_BASE)
diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 602aab98c079..fdb798c62596 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -17,7 +17,7 @@
 #include 
 
 #include 
-#include 
+#include 
 
 struct tegra_smmu_group {
struct list_head list;
diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
index 1ee34f0da4f7..116bf68325b7 100644
--- a/drivers/memory/tegra/mc.h
+++ b/drivers/memory/tegra/mc.h
@@ -10,7 +10,7 @@
 #include 
 #include 
 
-#include 
+#include 
 
 #define MC_INTSTATUS   0x00
 #define MC_INTMASK 0x04
diff --git a/drivers/memory/tegra/tegra186.c b/drivers/memory/tegra/tegra186.c
index e25c954dde2e..aa676c45650b 100644
--- a/drivers/memory/tegra/tegra186.c
+++ b/drivers/memory/tegra/tegra186.c
@@ -9,6 +9,8 @@
 #include 
 #include 
 
+#include 
+
 #if defined(CONFIG_ARCH_TEGRA_186_SOC)
 #include 
 #endif
@@ -31,14 +33,14 @@ struct tegra186_mc_soc {
unsigned int num_clients;
 };
 
-struct tegra186_mc {
+struct tegra_mc {
struct device *dev;
void __iomem *regs;
 
const struct tegra186_mc_soc *soc;
 };
 
-static void tegra186_mc_program_sid(struct tegra186_mc *mc)
+static void tegra186_mc_program_sid(struct tegra_mc *mc)
 {
unsigned int i;
 
@@ -1523,8 +1525,8 @@ static const struct tegra186_mc_soc tegra194_mc_soc = {
 
 static int tegra186_mc_probe(struct platform_device *pdev)
 {
-   struct tegra186_mc *mc;
struct resource *res;
+   struct tegra_mc *mc;
int err;
 
mc = devm_kzalloc(&pdev->dev, sizeof(*mc), GFP_KERNEL);
@@ -1552,7 +1554,7 @@ static int tegra186_mc_probe(struct platform_device *pdev)
 
 static int tegra186_mc_remove(struct platform_device *pdev)
 {
-   struct tegra186_mc *mc = platform_get_drvdata(pdev);
+   struct tegra_mc *mc = platform_get_drvdata(pdev);
 
of_platform_depopulate(mc->dev);
 
@@ -1577,7 +1579,7 @@ static int __maybe_unused tegra186_mc_suspend(struct device *dev)
 
 static int __maybe_unused tegra186_mc_resume(struct device *dev)
 {
-   struct tegra186_mc *mc = dev_get_drvdata(dev);
+   struct tegra_mc *mc = dev_get_drvdata(dev);
 
tegra186_mc_program_sid(mc);
 
diff --git a/include/soc/tegra/mc-internal.h b/include/soc/tegra/mc-internal.h
new file mode 100644
index 000000000000..4f327695d58c
--- /dev/null
+++ b/include/soc/tegra/mc-internal.h
@@ -0,0 +1,62 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2014 NVIDIA Corporation
+ * Copyright (C) 2021 NVIDIA Corporation
+ */
+
+#ifndef __SOC_TEGRA_MC_INTERNAL_H__
+#define __SOC_TEGRA_MC_INTERNAL_H__
+
+#include 
+
+struct tegra_mc_soc {
+   const struct tegra_mc_client *clients;
+   unsigned int num_clients;
+
+   const unsigned long *emem_regs;
+   unsigned int num_emem_regs;
+
+   unsigned int num_address_bits;
+   unsigned int atom_size;
+
+   u8 client_id_mask;
+
+   const struct tegra_smmu_soc *smmu;
+
+   u32 intmask;
+
+   const struct tegra_mc_reset_ops *reset_ops;
+   const struct tegra_mc_reset *resets;
+   unsigned int num_resets;
+
+   const struct tegra_mc_icc_ops *icc_ops;
+
+   int (*init)(struct tegra_mc *mc);
+};
+
+struct tegra_mc {
+   struct device *dev;
+   struct tegra_smmu *smmu;
+   struct gart_device *gart;
+   void __iomem *regs;
+   struct clk *clk;
+   int irq;
+
+   const struct tegra_mc_soc *soc;
+   unsigned long tick;
+
+   struct tegra_mc_timing *timings;
+   unsigned int num_timings;
+
+

[PATCH 0/9] arm64: tegra: Prevent early SMMU faults

2021-03-25 Thread Thierry Reding
From: Thierry Reding 

Hi,

this is a set of patches that is the result of earlier discussions
regarding early identity mappings that are needed to avoid SMMU faults
during early boot.

The goal here is to avoid early identity mappings altogether and instead
postpone the need for the identity mappings to when devices are attached
to the SMMU. This works by making the SMMU driver coordinate with the
memory controller driver on when to start enforcing SMMU translations.
This makes Tegra behave in a more standard way and pushes the code to
deal with the Tegra-specific programming into the NVIDIA SMMU
implementation.

Patches 1 and 2 are preparatory work that is used in patch 3 to provide
a mechanism to program SID overrides at runtime. Patches 4 and 5 create
the fundamentals in the SMMU driver to support this and also make this
functionality available on Tegra186. Patch 6 hooks the ARM SMMU up to
the memory controller so that the memory overrides can be programmed at
the right time.

Patch 7 extends this mechanism to Tegra186 and patches 8-9 enable all of
this through device tree updates.

The end result is that various peripherals will have SMMU enabled, while
the display controllers will keep using passthrough, as initially set up
by firmware. Once the device tree bindings have been accepted and the
SMMU driver has been updated to create identity mappings for the display
controllers, they can be hooked up to the SMMU and the code in this
series will automatically program the SID overrides to enable SMMU
translations at the right time.
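
To illustrate the coordination described above, here is a minimal user-space sketch of an optional per-SoC hook that the SMMU side could call once translations for a device are in place. It is modeled on the tegra_mc_probe_device() helper from the memory controller patch; the struct names (mc, mc_soc, mc_ops), the "programmed" counter and demo_probe_device() are stand-ins invented for this example, not the driver's real definitions.

```c
#include <assert.h>
#include <stddef.h>

struct device { const char *name; };	/* stand-in for the kernel's struct device */
struct mc;

/* Hypothetical per-SoC operations; only probe_device matters here. */
struct mc_ops {
	int (*probe_device)(struct mc *mc, struct device *dev);
};

struct mc_soc {
	const struct mc_ops *ops;
};

struct mc {
	const struct mc_soc *soc;
	unsigned int programmed;	/* how many devices had overrides programmed */
};

/* SoC-specific hook: this is where the SID overrides would be written. */
int demo_probe_device(struct mc *mc, struct device *dev)
{
	assert(dev != NULL);
	mc->programmed++;
	return 0;
}

/* Generic entry point, called after SMMU translations are set up. */
int mc_probe_device(struct mc *mc, struct device *dev)
{
	if (mc->soc->ops && mc->soc->ops->probe_device)
		return mc->soc->ops->probe_device(mc, dev);

	return 0;	/* SoCs without the hook: nothing to do */
}

const struct mc_ops demo_ops = { .probe_device = demo_probe_device };
const struct mc_soc soc_with_hook = { .ops = &demo_ops };
const struct mc_soc soc_without_hook = { .ops = NULL };
```

With this shape, SoCs that do not need runtime SID programming simply leave the hook unset and the call is a no-op.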

Thierry

Thierry Reding (9):
  memory: tegra: Move internal data structures into separate header
  memory: tegra: Add memory client IDs to tables
  memory: tegra: Implement SID override programming
  iommu/arm-smmu: Implement ->probe_finalize()
  iommu/arm-smmu: tegra: Detect number of instances at runtime
  iommu/arm-smmu: tegra: Implement SID override programming
  iommu/arm-smmu: Use Tegra implementation on Tegra186
  arm64: tegra: Hook up memory controller to SMMU on Tegra186
  arm64: tegra: Enable SMMU support on Tegra194

 arch/arm64/boot/dts/nvidia/tegra186.dtsi |   2 +
 arch/arm64/boot/dts/nvidia/tegra194.dtsi |  86 ++
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c   |   3 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-nvidia.c |  81 --
 drivers/iommu/arm/arm-smmu/arm-smmu.c|  17 ++
 drivers/iommu/arm/arm-smmu/arm-smmu.h|   1 +
 drivers/iommu/tegra-gart.c   |   2 +-
 drivers/iommu/tegra-smmu.c   |   2 +-
 drivers/memory/tegra/mc.h|   2 +-
 drivers/memory/tegra/tegra186.c  | 288 ++-
 include/soc/tegra/mc-internal.h  |  62 
 include/soc/tegra/mc.h   |  60 +---
 12 files changed, 529 insertions(+), 77 deletions(-)
 create mode 100644 include/soc/tegra/mc-internal.h

-- 
2.30.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5] iommu/tegra-smmu: Add pagetable mappings to debugfs

2021-03-16 Thread Thierry Reding
On Mon, Mar 15, 2021 at 01:36:31PM -0700, Nicolin Chen wrote:
> This patch dumps all active mapping entries from pagetable
> to a debugfs directory named "mappings".
> 
> Attaching an example:
> 
> SWGROUP: hc
> ASID: 0
> reg: 0x250
> PTB_ASID: 0xe0080004
> as->pd_dma: 0x80004000
> {
> [1023] 0xf008000b (1)
> {
> PTE RANGE      | ATTR | PHYS           | IOVA   | SIZE
> [#1023, #1023] | 0x5  | 0x000111a8d000 | 0xf000 | 0x1000
> }
> Total PDE count: 1
> Total PTE count: 1
> 
> Tested-by: Dmitry Osipenko 
> Reviewed-by: Dmitry Osipenko 
> Signed-off-by: Nicolin Chen 
> ---
> 
> Changelog
> v5:
>  * Fixed a typo in commit message
>  * Splitted a long line into two lines
>  * Rearranged variable defines by length
>  * Added Tested-by and Reviewed-by from Dmitry
> v4: https://lkml.org/lkml/2021/3/14/429
>  * Changed %d to %u for unsigned variables
>  * Fixed print format mismatch warnings on ARM32
> v3: https://lkml.org/lkml/2021/3/14/30
>  * Fixed PHYS and IOVA print formats
>  * Changed variables to unsigned int type
>  * Changed the table outputs to be compact
> v2: https://lkml.org/lkml/2021/3/9/1382
>  * Expanded mutex range to the entire function
>  * Added as->lock to protect pagetable walkthrough
>  * Replaced devm_kzalloc with devm_kcalloc for group_debug
>  * Added "PTE RANGE" and "SIZE" columns to group contiguous mappings
>  * Dropped as->count check; added WARN_ON when as->count mismatches pd[pd_index]
> v1: https://lkml.org/lkml/2020/9/26/70
> 
>  drivers/iommu/tegra-smmu.c | 181 -
>  1 file changed, 176 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 97eb62f667d2..b728cae63314 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -19,6 +19,11 @@
>  #include 
>  #include 
>  
> +struct tegra_smmu_group_debug {
> + const struct tegra_smmu_swgroup *group;
> + void *priv;

This always stores the address space, so why not make this:

struct tegra_smmu_as *as;

? While at it, perhaps throw in a const to make sure we don't modify
this structure in the debugfs code.
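
Roughly what I have in mind, as a stand-alone sketch. The two stub structs and the group_debug_asid() accessor are made up for illustration; only the member types matter.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the driver's types; the fields are illustrative only. */
struct tegra_smmu_swgroup { const char *name; };
struct tegra_smmu_as { unsigned int id; };

struct tegra_smmu_group_debug {
	const struct tegra_smmu_swgroup *group;
	const struct tegra_smmu_as *as;		/* was: void *priv */
};

/* Hypothetical read-only accessor; returns -1 if no address space is attached. */
int group_debug_asid(const struct tegra_smmu_group_debug *debug)
{
	assert(debug != NULL);

	if (!debug->as)
		return -1;

	return (int)debug->as->id;
}
```

With the typed const pointer, the compiler rejects accidental writes through debug->as and no cast is needed at the use sites.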

> +};
> +
>  struct tegra_smmu_group {
>   struct list_head list;
>   struct tegra_smmu *smmu;
> @@ -47,6 +52,8 @@ struct tegra_smmu {
>   struct dentry *debugfs;
>  
>   struct iommu_device iommu;  /* IOMMU Core code handle */
> +
> + struct tegra_smmu_group_debug *group_debug;
>  };
>  
>  struct tegra_smmu_as {
> @@ -152,6 +159,9 @@ static inline u32 smmu_readl(struct tegra_smmu *smmu, unsigned long offset)
>  
>  #define SMMU_PDE_ATTR (SMMU_PDE_READABLE | SMMU_PDE_WRITABLE | \
>                         SMMU_PDE_NONSECURE)
> +#define SMMU_PTE_ATTR (SMMU_PTE_READABLE | SMMU_PTE_WRITABLE | \
> +                       SMMU_PTE_NONSECURE)
> +#define SMMU_PTE_ATTR_SHIFT  (29)

No need for the parentheses here.

>  
>  static unsigned int iova_pd_index(unsigned long iova)
>  {
> @@ -163,6 +173,12 @@ static unsigned int iova_pt_index(unsigned long iova)
>   return (iova >> SMMU_PTE_SHIFT) & (SMMU_NUM_PTE - 1);
>  }
>  
> +static unsigned long pd_pt_index_iova(unsigned int pd_index, unsigned int pt_index)
> +{
> + return ((dma_addr_t)pd_index & (SMMU_NUM_PDE - 1)) << SMMU_PDE_SHIFT |
> +((dma_addr_t)pt_index & (SMMU_NUM_PTE - 1)) << SMMU_PTE_SHIFT;
> +}
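
For reference, a small user-space version of this helper and its two inverses. The constants are assumptions that I believe match the existing definitions in tegra-smmu.c (1024 page directory entries, 1024 page table entries, 4 KiB pages), and uint32_t stands in for the IOVA type.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values: 1024 PDEs x 1024 PTEs x 4 KiB pages covers a 32-bit IOVA space. */
#define SMMU_NUM_PDE	1024u
#define SMMU_NUM_PTE	1024u
#define SMMU_PDE_SHIFT	22
#define SMMU_PTE_SHIFT	12

static_assert((1u << (32 - SMMU_PDE_SHIFT)) == SMMU_NUM_PDE,
	      "PDE index must fill the top bits of a 32-bit IOVA");

/* Compose an IOVA from a page directory index and a page table index. */
uint32_t pd_pt_index_iova(unsigned int pd_index, unsigned int pt_index)
{
	return ((uint32_t)pd_index & (SMMU_NUM_PDE - 1)) << SMMU_PDE_SHIFT |
	       ((uint32_t)pt_index & (SMMU_NUM_PTE - 1)) << SMMU_PTE_SHIFT;
}

/* Inverse operations, mirroring iova_pd_index() and iova_pt_index(). */
unsigned int iova_pd_index(uint32_t iova)
{
	return iova >> SMMU_PDE_SHIFT;
}

unsigned int iova_pt_index(uint32_t iova)
{
	return (iova >> SMMU_PTE_SHIFT) & (SMMU_NUM_PTE - 1);
}
```

Under these assumptions, indices 1023/1023 compose to 0xfffff000, the last 4 KiB page of the 32-bit space.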
> +
>  static bool smmu_dma_addr_valid(struct tegra_smmu *smmu, dma_addr_t addr)
>  {
>   addr >>= 12;
> @@ -334,7 +350,7 @@ static void tegra_smmu_domain_free(struct iommu_domain *domain)
>  }
>  
>  static const struct tegra_smmu_swgroup *
> -tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
> +tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup, int *index)
>  {
>   const struct tegra_smmu_swgroup *group = NULL;
>   unsigned int i;
> @@ -342,6 +358,8 @@ tegra_smmu_find_swgroup(struct tegra_smmu *smmu, unsigned int swgroup)
>   for (i = 0; i < smmu->soc->num_swgroups; i++) {
>   if (smmu->soc->swgroups[i].swgroup == swgroup) {
>   group = &smmu->soc->swgroups[i];
> + if (index)
> + *index = i;

This doesn't look like the right place for this. And this also makes
things hard to follow because it passes out-of-band data in the index
parameter.

I'm thinking that this could benefit from a bit of refactoring where
we could for example embed struct tegra_smmu_group_debug into struct
tegra_smmu_group and then reference that when necessary instead of
carrying all that data in an orthogonal array. That should also make
it easier to track this.

Come to think of it, everything that's currently in your new struct
tegra_smmu_group_debug would be useful in struct tegra_smmu_group,
irrespective of debugfs support.
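
A rough stand-alone sketch of the direction I mean. The struct layout, find_swgroup() and group_attach() are hypothetical and simplified from what the driver would actually need.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the driver's types; the fields are illustrative only. */
struct tegra_smmu_swgroup { unsigned int swgroup; };
struct tegra_smmu_as { unsigned int id; };

/* Hypothetical layout: the group itself carries what debugfs needs. */
struct tegra_smmu_group {
	const struct tegra_smmu_swgroup *swgroup;
	const struct tegra_smmu_as *as;		/* NULL until attached */
};

/* The lookup returns only the match; no index out-parameter is required. */
const struct tegra_smmu_swgroup *
find_swgroup(const struct tegra_smmu_swgroup *table, size_t count,
	     unsigned int id)
{
	size_t i;

	assert(table != NULL || count == 0);

	for (i = 0; i < count; i++)
		if (table[i].swgroup == id)
			return &table[i];

	return NULL;
}

/* Record the swgroup and address space on the group at attach time. */
void group_attach(struct tegra_smmu_group *group,
		  const struct tegra_smmu_swgroup *swgroup,
		  const struct tegra_smmu_as *as)
{
	group->swgroup = swgroup;
	group->as = as;
}
```

The debugfs code could then walk the groups directly instead of indexing a parallel array.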

>   

Re: [PATCH] iommu/tegra-smmu: Fix mc errors on tegra124-nyan

2021-03-03 Thread Thierry Reding
On Thu, Feb 18, 2021 at 02:07:02PM -0800, Nicolin Chen wrote:
> Commit 25938c73cd79 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()")
> removed certain hack in the tegra_smmu_probe() by relying on IOMMU core to
> of_xlate SMMU's SID per device, so as to get rid of tegra_smmu_find() and
> tegra_smmu_configure() that are typically done in the IOMMU core also.
> 
> This approach works for both existing devices that have DT nodes and other
> devices (like PCI device) that don't exist in DT, on Tegra210 and Tegra3
> upon testing. However, Page Fault errors are reported on tegra124-Nyan:
> 
>   tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
>EMEM address decode error (SMMU translation error [--S])
>   tegra-mc 70019000.memory-controller: display0a: read @0xfe056b40:
>Page fault (SMMU translation error [--S])
> 
> After debugging, I found that the mentioned commit changed some function
> callback sequence of tegra-smmu's, resulting in enabling SMMU for display
> client before display driver gets initialized. I couldn't reproduce exact
> same issue on Tegra210 as Tegra124 (arm-32) differs at arch-level code.
> 
> Actually this Page Fault is a known issue, as on most of Tegra platforms,
> display gets enabled by the bootloader for the splash screen feature, so
> it keeps filling the framebuffer memory. A proper fix to this issue is to
> 1:1 linear map the framebuffer memory to IOVA space so the SMMU will have
> the same address as the physical address in its page table. Yet, Thierry
> has been working on the solution above for a year, and it hasn't merged.
> 
> Therefore, let's partially revert the mentioned commit to fix the errors.
> 
> The reason why we do a partial revert here is that we can still set priv
> in ->of_xlate() callback for PCI devices. Meanwhile, devices existing in
> DT, like display, will go through tegra_smmu_configure() at the stage of
> bus_set_iommu() when SMMU gets probed(), as what it did before we merged
> the mentioned commit.
> 
> Once we have the linear map solution for framebuffer memory, this change
> can be cleaned away.
> 
> [Big thank to Guillaume who reported and helped debugging/verification]
> 
> Fixes: 25938c73cd79 ("iommu/tegra-smmu: Rework tegra_smmu_probe_device()")
> Reported-by: Guillaume Tucker 
> Signed-off-by: Nicolin Chen 
> ---
> 
> Guillaume, would you please give a "Tested-by" to this change? Thanks!
> 
>  drivers/iommu/tegra-smmu.c | 72 +-
>  1 file changed, 71 insertions(+), 1 deletion(-)

Acked-by: Thierry Reding 



Re: next/master bisection: baseline.login on r8a77960-ulcb

2021-02-25 Thread Thierry Reding
On Thu, Feb 25, 2021 at 11:14:57AM +, Robin Murphy wrote:
> On 2021-02-25 11:09, Thierry Reding wrote:
> > On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote:
> > > Hi Christoph and all,
> > > 
> > > On 23.02.21 10:56, Guillaume Tucker wrote:
> > > > Hi Christoph,
> > > > 
> > > > Please see the bisection report below about a boot failure on
> > > > r8a77960-ulcb on next-20210222.
> > > > 
> > > > Reports aren't automatically sent to the public while we're
> > > > trialing new bisection features on kernelci.org but this one
> > > > looks valid.
> > > > 
> > > > The log shows a kernel panic, more details can be found here:
> > > > 
> > > > https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
> > > > 
> > > > Please let us know if you need any help to debug the issue or try
> > > > a fix on this platform.
> > > 
> > > I am also seeing this problem on an iMX8MQ board and can help test if you
> > > have a fix.
> > 
> > This is also causing boot failures on Jetson AGX Xavier. The origin is
> > slightly different from the above kernelci.org report, but the BUG_ON is
> > the same:
> > 
> >  [2.650447] [ cut here ]
> >  [2.650588] kernel BUG at include/linux/iommu-helper.h:23!
> >  [2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
> >  [2.654330] Modules linked in:
> >  [2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120
> >  [2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
> >  [2.674096] Workqueue: events deferred_probe_work_func
> >  [2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--)
> >  [2.684949] pc : find_slots.isra.0+0x118/0x2f0
> >  [2.689494] lr : find_slots.isra.0+0x88/0x2f0
> >  [2.693696] sp : 800011faf950
> >  [2.697281] x29: 800011faf950 x28: 0001
> >  [2.702537] x27: 0001 x26: 
> >  [2.708131] x25: 0001 x24: 000105f03148
> >  [2.713556] x23: 0001 x22: 800011559000
> >  [2.718835] x21: 800011559a80 x20: edc0
> >  [2.724493] x19:  x18: 0020
> >  [2.729770] x17: 0003ffd7d160 x16: 0068
> >  [2.735173] x15: 80b43150 x14: 
> >  [2.740944] x13: 82b5d791 x12: 0040
> >  [2.746113] x11: a248 x10: 
> >  [2.751882] x9 :  x8 : 0003fed3
> >  [2.757139] x7 :  x6 : 
> >  [2.762818] x5 :  x4 : 
> >  [2.767984] x3 : 0001e8303148 x2 : 8000
> >  [2.773580] x1 :  x0 : 001db800
> >  [2.778662] Call trace:
> >  [2.781136]  find_slots.isra.0+0x118/0x2f0
> >  [2.785137]  swiotlb_tbl_map_single+0x80/0x1b4
> >  [2.789858]  swiotlb_map+0x58/0x200
> >  [2.793355]  dma_direct_map_page+0x148/0x1c0
> >  [2.797386]  dma_map_page_attrs+0x2c/0x54
> >  [2.801411]  dw_pcie_host_init+0x40c/0x4c0
> >  [2.805633]  tegra_pcie_config_rp+0x7c/0x1f4
> >  [2.810155]  tegra_pcie_dw_probe+0x3d0/0x60c
> >  [2.814185]  platform_probe+0x68/0xe0
> >  [2.817688]  really_probe+0xe4/0x4c0
> >  [2.821362]  driver_probe_device+0x58/0xc0
> >  [2.825386]  __device_attach_driver+0xa8/0x104
> >  [2.829953]  bus_for_each_drv+0x78/0xd0
> >  [2.833434]  __device_attach+0xdc/0x17c
> >  [2.837631]  device_initial_probe+0x14/0x20
> >  [2.841680]  bus_probe_device+0x9c/0xa4
> >  [2.845160]  deferred_probe_work_func+0x74/0xb0
> >  [2.849734]  process_one_work+0x1cc/0x350
> >  [2.853822]  worker_thread+0x20c/0x3ac
> >  [2.858018]  kthread+0x128/0x134
> >  [2.860997]  ret_from_fork+0x10/0x34
> >  [2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d421)
> >  [2.870547] ---[ end trace e5c50bdcf12b316e ]---
> >  [2.875087] note: kworker/2:1[67] exited with preempt_count 2
> >  [2.880836] [ cut here ]
> > 
> > I've confirmed that reverting the 

Re: next/master bisection: baseline.login on r8a77960-ulcb

2021-02-25 Thread Thierry Reding
On Wed, Feb 24, 2021 at 10:39:42PM +0100, Heiko Thiery wrote:
> Hi Christoph and all,
> 
> On 23.02.21 10:56, Guillaume Tucker wrote:
> > Hi Christoph,
> > 
> > Please see the bisection report below about a boot failure on
> > r8a77960-ulcb on next-20210222.
> > 
> > Reports aren't automatically sent to the public while we're
> > trialing new bisection features on kernelci.org but this one
> > looks valid.
> > 
> > The log shows a kernel panic, more details can be found here:
> > 
> >https://kernelci.org/test/case/id/6034bde034504edc9faddd2c/
> > 
> > Please let us know if you need any help to debug the issue or try
> > a fix on this platform.
> 
> I am also seeing this problem on an iMX8MQ board and can help test if you
> have a fix.

This is also causing boot failures on Jetson AGX Xavier. The origin is
slightly different from the above kernelci.org report, but the BUG_ON is
the same:

[2.650447] [ cut here ]
[2.650588] kernel BUG at include/linux/iommu-helper.h:23!
[2.650729] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP
[2.654330] Modules linked in:
[2.657474] CPU: 2 PID: 67 Comm: kworker/2:1 Not tainted 5.11.0-next-20210225-00025-gfd15609b3a81-dirty #120
[2.667367] Hardware name: NVIDIA Jetson AGX Xavier Developer Kit (DT)
[2.674096] Workqueue: events deferred_probe_work_func
[2.679169] pstate: 40400089 (nZcv daIf +PAN -UAO -TCO BTYPE=--)
[2.684949] pc : find_slots.isra.0+0x118/0x2f0
[2.689494] lr : find_slots.isra.0+0x88/0x2f0
[2.693696] sp : 800011faf950
[2.697281] x29: 800011faf950 x28: 0001
[2.702537] x27: 0001 x26: 
[2.708131] x25: 0001 x24: 000105f03148
[2.713556] x23: 0001 x22: 800011559000
[2.718835] x21: 800011559a80 x20: edc0
[2.724493] x19:  x18: 0020
[2.729770] x17: 0003ffd7d160 x16: 0068
[2.735173] x15: 80b43150 x14: 
[2.740944] x13: 82b5d791 x12: 0040
[2.746113] x11: a248 x10: 
[2.751882] x9 :  x8 : 0003fed3
[2.757139] x7 :  x6 : 
[2.762818] x5 :  x4 : 
[2.767984] x3 : 0001e8303148 x2 : 8000
[2.773580] x1 :  x0 : 001db800
[2.778662] Call trace:
[2.781136]  find_slots.isra.0+0x118/0x2f0
[2.785137]  swiotlb_tbl_map_single+0x80/0x1b4
[2.789858]  swiotlb_map+0x58/0x200
[2.793355]  dma_direct_map_page+0x148/0x1c0
[2.797386]  dma_map_page_attrs+0x2c/0x54
[2.801411]  dw_pcie_host_init+0x40c/0x4c0
[2.805633]  tegra_pcie_config_rp+0x7c/0x1f4
[2.810155]  tegra_pcie_dw_probe+0x3d0/0x60c
[2.814185]  platform_probe+0x68/0xe0
[2.817688]  really_probe+0xe4/0x4c0
[2.821362]  driver_probe_device+0x58/0xc0
[2.825386]  __device_attach_driver+0xa8/0x104
[2.829953]  bus_for_each_drv+0x78/0xd0
[2.833434]  __device_attach+0xdc/0x17c
[2.837631]  device_initial_probe+0x14/0x20
[2.841680]  bus_probe_device+0x9c/0xa4
[2.845160]  deferred_probe_work_func+0x74/0xb0
[2.849734]  process_one_work+0x1cc/0x350
[2.853822]  worker_thread+0x20c/0x3ac
[2.858018]  kthread+0x128/0x134
[2.860997]  ret_from_fork+0x10/0x34
[2.864508] Code: ca180063 ea06007f 54fffee1 b50001e7 (d421)
[2.870547] ---[ end trace e5c50bdcf12b316e ]---
[2.875087] note: kworker/2:1[67] exited with preempt_count 2
[2.880836] [ cut here ]

I've confirmed that reverting the following commits makes the system
boot again:

47cfc5be1934 ("swiotlb: Validate bounce size in the sync/unmap path")
c6f50c7719e7 ("swiotlb: respect min_align_mask")
e952d9a1bc20 ("swiotlb: don't modify orig_addr in swiotlb_tbl_sync_single")
567d877f9a7d ("swiotlb: refactor swiotlb_tbl_map_single")

Let me know if I can help test any fixes for this.

Thierry



Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-12-18 Thread Thierry Reding
On Fri, Dec 18, 2020 at 04:15:45PM -0600, Rob Herring wrote:
> On Thu, Dec 17, 2020 at 9:00 AM Thierry Reding wrote:
> >
> > On Tue, Nov 10, 2020 at 08:33:09PM +0100, Thierry Reding wrote:
> > > On Fri, Nov 06, 2020 at 04:25:48PM +0100, Thierry Reding wrote:
> > > > On Thu, Nov 05, 2020 at 05:47:21PM +, Robin Murphy wrote:
> > > > > On 2020-11-05 16:43, Thierry Reding wrote:
> > > > > > On Thu, Sep 24, 2020 at 01:27:25PM +0200, Thierry Reding wrote:
> > > > > > > On Tue, Sep 15, 2020 at 02:36:48PM +0200, Thierry Reding wrote:
> > > > > > > > On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> > > > > > > > > On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > > > > > > > > > From: Thierry Reding 
> > > > > > > > > >
> > > > > > > > > > Reserved memory regions can be marked as "active" if hardware is
> > > > > > > > > > expected to access the regions during boot and before the operating
> > > > > > > > > > system can take control. One example where this is useful is for the
> > > > > > > > > > operating system to infer whether the region needs to be identity-
> > > > > > > > > > mapped through an IOMMU.
> > > > > > > > >
> > > > > > > > > I like simple solutions, but this hardly seems adequate to solve the
> > > > > > > > > problem of passing IOMMU setup from bootloader/firmware to the OS. Like
> > > > > > > > > what is the IOVA that's supposed to be used if identity mapping is not
> > > > > > > > > used?
> > > > > > > >
> > > > > > > > The assumption here is that if the region is not active there is no need
> > > > > > > > for the IOVA to be specified because the kernel will allocate memory and
> > > > > > > > assign any IOVA of its choosing.
> > > > > > > >
> > > > > > > > Also, note that this is not meant as a way of passing IOMMU setup from
> > > > > > > > the bootloader or firmware to the OS. The purpose of this is to specify
> > > > > > > > that some region of memory is actively being accessed during boot. The
> > > > > > > > particular case that I'm looking at is where the bootloader set up a
> > > > > > > > splash screen and keeps it on during boot. The bootloader has not set up
> > > > > > > > an IOMMU mapping and the identity mapping serves as a way of keeping the
> > > > > > > > accesses by the display hardware working during the transitional period
> > > > > > > > after the IOMMU translations have been enabled by the kernel but before
> > > > > > > > the kernel display driver has had a chance to set up its own IOMMU
> > > > > > > > mappings.
> > > > > > > >
> > > > > > > > > If you know enough about the regions to assume identity mapping, then
> > > > > > > > > can't you know if active or not?
> > > > > > > >
> > > > > > > > We could alternatively add some property that describes the region as
> > > > > > > > requiring an identity mapping. But note that we can't make any
> > > > > > > > assumptions here about the usage of these regions because the IOMMU
> > > > > > > > d

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-12-17 Thread Thierry Reding
On Tue, Nov 10, 2020 at 08:33:09PM +0100, Thierry Reding wrote:
> On Fri, Nov 06, 2020 at 04:25:48PM +0100, Thierry Reding wrote:
> > On Thu, Nov 05, 2020 at 05:47:21PM +, Robin Murphy wrote:
> > > On 2020-11-05 16:43, Thierry Reding wrote:
> > > > On Thu, Sep 24, 2020 at 01:27:25PM +0200, Thierry Reding wrote:
> > > > > On Tue, Sep 15, 2020 at 02:36:48PM +0200, Thierry Reding wrote:
> > > > > > On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> > > > > > > On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > > > > > > > From: Thierry Reding 
> > > > > > > > 
> > > > > > > > Reserved memory regions can be marked as "active" if hardware is
> > > > > > > > expected to access the regions during boot and before the operating
> > > > > > > > system can take control. One example where this is useful is for the
> > > > > > > > operating system to infer whether the region needs to be identity-
> > > > > > > > mapped through an IOMMU.
> > > > > > >
> > > > > > > I like simple solutions, but this hardly seems adequate to solve the
> > > > > > > problem of passing IOMMU setup from bootloader/firmware to the OS. Like
> > > > > > > what is the IOVA that's supposed to be used if identity mapping is not
> > > > > > > used?
> > > > > >
> > > > > > The assumption here is that if the region is not active there is no need
> > > > > > for the IOVA to be specified because the kernel will allocate memory and
> > > > > > assign any IOVA of its choosing.
> > > > > >
> > > > > > Also, note that this is not meant as a way of passing IOMMU setup from
> > > > > > the bootloader or firmware to the OS. The purpose of this is to specify
> > > > > > that some region of memory is actively being accessed during boot. The
> > > > > > particular case that I'm looking at is where the bootloader set up a
> > > > > > splash screen and keeps it on during boot. The bootloader has not set up
> > > > > > an IOMMU mapping and the identity mapping serves as a way of keeping the
> > > > > > accesses by the display hardware working during the transitional period
> > > > > > after the IOMMU translations have been enabled by the kernel but before
> > > > > > the kernel display driver has had a chance to set up its own IOMMU
> > > > > > mappings.
> > > > > >
> > > > > > > If you know enough about the regions to assume identity mapping, then
> > > > > > > can't you know if active or not?
> > > > > >
> > > > > > We could alternatively add some property that describes the region as
> > > > > > requiring an identity mapping. But note that we can't make any
> > > > > > assumptions here about the usage of these regions because the IOMMU
> > > > > > driver simply has no way of knowing what they are being used for.
> > > > > > 
> > > > > > Some additional information is required in device tree for the IOMMU
> > > > > > driver to be able to make that decision.
> > > > > 
> > > > > Rob, can you provide any hints on exactly how you want to move this
> > > > > forward? I don't know in what direction you'd like to proceed.
> > > > 
> > > > Hi Rob,
> > > > 
> > > > do you have any suggestions on how to proceed with this? I'd like to get
> > > > this moving again because it's something that's been nagging me for some
> > > > months now. It also requires changes across two levels in the bootloader
> > > > stack as well as Linux and it takes quite a bit of work to make all the
> > > > changes, so before 

Re: [PATCH RESEND 5/5] iommu/tegra-smmu: Add PCI support

2020-11-20 Thread Thierry Reding
On Wed, Nov 11, 2020 at 02:21:29PM -0800, Nicolin Chen wrote:
> This patch simply adds support for PCI devices.
> 
> Reviewed-by: Dmitry Osipenko 
> Tested-by: Dmitry Osipenko 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 35 +--
>  1 file changed, 25 insertions(+), 10 deletions(-)

Acked-by: Thierry Reding 



Re: [PATCH RESEND 4/5] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-11-20 Thread Thierry Reding
On Wed, Nov 11, 2020 at 02:21:28PM -0800, Nicolin Chen wrote:
> The bus_set_iommu() in tegra_smmu_probe() enumerates all clients
> to call in tegra_smmu_probe_device() where each client searches
> its DT node for smmu pointer and swgroup ID, so as to configure
> an fwspec. But this requires a valid smmu pointer even before mc
> and smmu drivers are probed. So in tegra_smmu_probe() we added a
> line of code to fill mc->smmu, marking "a bit of a hack".
> 
> This works for most of clients in the DTB, however, doesn't work
> for a client that doesn't exist in DTB, a PCI device for example.
> 
> Actually, if we return ERR_PTR(-ENODEV) in ->probe_device() when
> it's called from bus_set_iommu(), iommu core will let everything
> carry on. Then when a client gets probed, of_iommu_configure() in
> iommu core will search DTB for swgroup ID and call ->of_xlate()
> to prepare an fwspec, similar to tegra_smmu_probe_device() and
> tegra_smmu_configure(). Then it'll call tegra_smmu_probe_device()
> again, and this time we shall return smmu->iommu pointer properly.
> 
> So we can get rid of tegra_smmu_find() and tegra_smmu_configure()
> along with DT polling code by letting the iommu core handle every
> thing, except a problem that we search iommus property in DTB not
> only for swgroup ID but also for mc node to get mc->smmu pointer
> to call dev_iommu_priv_set() and return the smmu->iommu pointer.
> So we'll need to find another way to get smmu pointer.
> 
> Referencing the implementation of sun50i-iommu driver, of_xlate()
> has client's dev pointer, mc node and swgroup ID. This means that
> we can call dev_iommu_priv_set() in of_xlate() instead, so we can
> simply get smmu pointer in ->probe_device().
> 
> This patch reworks tegra_smmu_probe_device() by:
> 1) Removing mc->smmu hack in tegra_smmu_probe() so as to return
>ERR_PTR(-ENODEV) in tegra_smmu_probe_device() during stage of
>tegra_smmu_probe/tegra_mc_probe().
> 2) Moving dev_iommu_priv_set() to of_xlate() so we can get smmu
>pointer in tegra_smmu_probe_device() to replace DTB polling.
> 3) Removing tegra_smmu_configure() accordingly since iommu core
>takes care of it.
> 
> This also fixes a problem that previously we could add clients to
> iommu groups before iommu core initializes its default domain:
> ubuntu@jetson:~$ dmesg | grep iommu
> platform 5000.host1x: Adding to iommu group 1
> platform 5700.gpu: Adding to iommu group 2
> iommu: Default domain type: Translated
> platform 5420.dc: Adding to iommu group 3
> platform 5424.dc: Adding to iommu group 3
> platform 5434.vic: Adding to iommu group 4
> 
> Though it works fine with IOMMU_DOMAIN_UNMANAGED, but will have
> warnings if switching to IOMMU_DOMAIN_DMA:
> iommu: Failed to allocate default IOMMU domain of type 0 for
>group (null) - Falling back to IOMMU_DOMAIN_DMA
> iommu: Failed to allocate default IOMMU domain of type 0 for
>group (null) - Falling back to IOMMU_DOMAIN_DMA
> 
> Now, bypassing the first probe_device() call from bus_set_iommu()
> fixes the sequence:
> ubuntu@jetson:~$ dmesg | grep iommu
> iommu: Default domain type: Translated
> tegra-host1x 5000.host1x: Adding to iommu group 0
> tegra-dc 5420.dc: Adding to iommu group 1
> tegra-dc 5424.dc: Adding to iommu group 1
> tegra-vic 5434.vic: Adding to iommu group 2
> nouveau 5700.gpu: Adding to iommu group 3
> 
> Note that dmesg log above is testing with IOMMU_DOMAIN_UNMANAGED.
> 
> Reviewed-by: Dmitry Osipenko 
> Tested-by: Dmitry Osipenko 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 96 ++
>  1 file changed, 15 insertions(+), 81 deletions(-)

Acked-by: Thierry Reding 
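
For anyone following along, the two-stage sequence described in the commit message can be sketched in user space as below. The stub structs, the iommu_priv field (standing in for dev_iommu_priv_get()/dev_iommu_priv_set()) and the demo_* function names are assumptions for illustration; the real callbacks return ERR_PTR() values rather than plain error codes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct smmu { int dummy; };		/* stand-in for struct tegra_smmu */
struct device { void *iommu_priv; };	/* stand-in for per-device IOMMU private data */

/* ->of_xlate(): runs while the client's "iommus" property is being parsed. */
int demo_of_xlate(struct device *dev, struct smmu *smmu)
{
	assert(smmu != NULL);
	dev->iommu_priv = smmu;
	return 0;
}

/*
 * ->probe_device(): fails with -ENODEV until ->of_xlate() has run for this
 * device, so the early pass from bus_set_iommu() is skipped harmlessly.
 */
int demo_probe_device(struct device *dev, struct smmu **out)
{
	if (!dev->iommu_priv)
		return -ENODEV;

	*out = dev->iommu_priv;
	return 0;
}
```

The first call is rejected, the core then translates the device tree specifier, and the second call succeeds with the pointer that ->of_xlate() stored.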



Re: [PATCH RESEND 3/5] iommu/tegra-smmu: Use fwspec in tegra_smmu_(de)attach_dev

2020-11-20 Thread Thierry Reding
On Wed, Nov 11, 2020 at 02:21:27PM -0800, Nicolin Chen wrote:
> In tegra_smmu_(de)attach_dev() functions, we poll DTB for each
> client's iommus property to get swgroup ID in order to prepare
> "as" and enable smmu. Actually tegra_smmu_configure() prepared
> an fwspec for each client, and added to the fwspec all swgroup
> IDs of client DT node in DTB.
> 
> So this patch uses fwspec in tegra_smmu_(de)attach_dev() so as
> to replace the redundant DT polling code.
> 
> Reviewed-by: Dmitry Osipenko 
> Tested-by: Dmitry Osipenko 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 56 --
>  1 file changed, 23 insertions(+), 33 deletions(-)

Acked-by: Thierry Reding 



[PATCH] Revert "firmware: QCOM_SCM: Allow qcom_scm driver to be loadable as a permenent module"

2020-11-19 Thread Thierry Reding
From: Thierry Reding 

Commit d0511b5496c0 ("firmware: QCOM_SCM: Allow qcom_scm driver to be
loadable as a permenent module") causes the ARM SMMU driver to be built
as a loadable module when using the Aarch64 default configuration. This
in turn causes problems because if the loadable module is not shipped
in an initial ramdisk, then the deferred probe timeout mechanism will
cause all SMMU masters to probe without SMMU support and fall back to
just plain DMA ops (not IOMMU-backed).

Once the system has mounted the rootfs, the ARM SMMU driver will then
be loaded, but since the ARM SMMU driver faults by default, this causes
a slew of SMMU faults for the SMMU masters that have already been set
up with plain DMA ops and causes these devices to malfunction.

Revert that commit to unbreak things while we look for an alternative
solution.

Reported-by: Jon Hunter 
Signed-off-by: Thierry Reding 
---
 drivers/firmware/Kconfig| 4 ++--
 drivers/firmware/Makefile   | 3 +--
 drivers/firmware/qcom_scm.c | 4 
 drivers/iommu/Kconfig   | 2 --
 drivers/net/wireless/ath/ath10k/Kconfig | 1 -
 5 files changed, 3 insertions(+), 11 deletions(-)

diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index 5e369928bc56..3315e3c21586 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -235,8 +235,8 @@ config INTEL_STRATIX10_RSU
  Say Y here if you want Intel RSU support.
 
 config QCOM_SCM
-   tristate "Qcom SCM driver"
-   depends on (ARM && HAVE_ARM_SMCCC) || ARM64
+   bool
+   depends on ARM || ARM64
select RESET_CONTROLLER
 
 config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
index 523173cbff33..5e013b6a3692 100644
--- a/drivers/firmware/Makefile
+++ b/drivers/firmware/Makefile
@@ -17,8 +17,7 @@ obj-$(CONFIG_ISCSI_IBFT)  += iscsi_ibft.o
 obj-$(CONFIG_FIRMWARE_MEMMAP)  += memmap.o
 obj-$(CONFIG_RASPBERRYPI_FIRMWARE) += raspberrypi.o
 obj-$(CONFIG_FW_CFG_SYSFS) += qemu_fw_cfg.o
-obj-$(CONFIG_QCOM_SCM) += qcom-scm.o
-qcom-scm-objs += qcom_scm.o qcom_scm-smc.o qcom_scm-legacy.o
+obj-$(CONFIG_QCOM_SCM) += qcom_scm.o qcom_scm-smc.o qcom_scm-legacy.o
 obj-$(CONFIG_TI_SCI_PROTOCOL)  += ti_sci.o
 obj-$(CONFIG_TRUSTED_FOUNDATIONS) += trusted_foundations.o
 obj-$(CONFIG_TURRIS_MOX_RWTM)  += turris-mox-rwtm.o
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index 6f431b73e617..7be48c1bec96 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -1280,7 +1280,6 @@ static const struct of_device_id qcom_scm_dt_match[] = {
{ .compatible = "qcom,scm" },
{}
 };
-MODULE_DEVICE_TABLE(of, qcom_scm_dt_match);
 
 static struct platform_driver qcom_scm_driver = {
.driver = {
@@ -1296,6 +1295,3 @@ static int __init qcom_scm_init(void)
return platform_driver_register(&qcom_scm_driver);
 }
 subsys_initcall(qcom_scm_init);
-
-MODULE_DESCRIPTION("Qualcomm Technologies, Inc. SCM driver");
-MODULE_LICENSE("GPL v2");
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index c64d7a2b6513..04878caf6da4 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -248,7 +248,6 @@ config SPAPR_TCE_IOMMU
 config ARM_SMMU
tristate "ARM Ltd. System MMU (SMMU) Support"
depends on ARM64 || ARM || (COMPILE_TEST && !GENERIC_ATOMIC64)
-   depends on QCOM_SCM || !QCOM_SCM #if QCOM_SCM=m this can't be =y
select IOMMU_API
select IOMMU_IO_PGTABLE_LPAE
select ARM_DMA_USE_IOMMU if ARM
@@ -376,7 +375,6 @@ config QCOM_IOMMU
# Note: iommu drivers cannot (yet?) be built as modules
bool "Qualcomm IOMMU Support"
depends on ARCH_QCOM || (COMPILE_TEST && !GENERIC_ATOMIC64)
-   depends on QCOM_SCM=y
select IOMMU_API
select IOMMU_IO_PGTABLE_LPAE
select ARM_DMA_USE_IOMMU
diff --git a/drivers/net/wireless/ath/ath10k/Kconfig b/drivers/net/wireless/ath/ath10k/Kconfig
index 741289e385d5..40f91bc8514d 100644
--- a/drivers/net/wireless/ath/ath10k/Kconfig
+++ b/drivers/net/wireless/ath/ath10k/Kconfig
@@ -44,7 +44,6 @@ config ATH10K_SNOC
tristate "Qualcomm ath10k SNOC support"
depends on ATH10K
depends on ARCH_QCOM || COMPILE_TEST
-   depends on QCOM_SCM || !QCOM_SCM #if QCOM_SCM=m this can't be =y
select QCOM_QMI_HELPERS
help
  This module adds support for integrated WCN3990 chip connected
-- 
2.29.2



Re: [PATCH v6 3/3] firmware: QCOM_SCM: Allow qcom_scm driver to be loadable as a permenent module

2020-11-17 Thread Thierry Reding
On Mon, Nov 16, 2020 at 11:48:39AM -0800, John Stultz wrote:
> On Mon, Nov 16, 2020 at 8:36 AM Will Deacon  wrote:
> > On Mon, Nov 16, 2020 at 04:59:36PM +0100, Thierry Reding wrote:
> > > On Fri, Nov 06, 2020 at 04:27:10AM +, John Stultz wrote:
> > > Unfortunately, the ARM SMMU module will eventually end up being loaded
> > > once the root filesystem has been mounted (for example via SDHCI or
> > > > Ethernet, both using just plain, non-IOMMU-backed DMA API) and then
> > > > initialize, configuring as "fault by default", which then results in a
> > > slew of SMMU faults from all the devices that have previously configured
> > > themselves without IOMMU support.
> >
> > I wonder if fw_devlink=on would help here?
> >
> > But either way, I'd be more inclined to revert this change if it's causing
> > problems for !QCOM devices.
> >
> > Linus -- please can you drop this one (patch 3/3) for now, given that it's
> > causing problems?
> 
> Agreed. Apologies again for the trouble.
> 
> I do feel like the probe timeout to handle optional links is causing a
> lot of the trouble here. I expect fw_devlink would solve this, but it
> may be a while before it can be always enabled.  I may see about
> pushing the default probe timeout value to be a little further out
> than init (I backed away from my last attempt as I didn't want to
> cause long (30 second) delays for cases like NFS root, but maybe 2-5
> seconds would be enough to make things work better for everyone).

I think there are two problems here: 1) the deferred probe timeout can
cause a mismatch between what SMMU masters and the SMMU think is going
on and 2) a logistical problem of dealing with the SMMU driver being a
loadable module.

The second problem can be dealt with by shipping the module in the
initial ramdisk. That's a bit annoying, but perhaps the right thing to
do. At least on Tegra we need this because all the devices that carry
the root filesystem (Ethernet for NFS and SDHCI/USB/SATA/PCI for disk
boot) are SMMU masters and will start to fault once the SMMU driver is
loaded.

The first problem is trickier, but if the ARM SMMU driver is built as a
module and shipped in the initial ramdisk it should work. Like I said,
this is annoying because it makes the development a bit more complicated
than just rebuilding a kernel image and flashing it (or booting it straight
from TFTP) because now every time the ARM SMMU module is built the
initial ramdisk needs to be updated (and potentially flashed) as well.

Thierry

P.S.: Interestingly this is very similar to the problem that I've been
trying to address for display hardware that's left on by the bootloader.
Given that, one potential solution would be to somehow retrieve memory
allocations done by these devices and create identity mappings in the
ARM SMMU address spaces for such devices, much like we plan to do for
devices left on by the bootloader (like the display controller for
showing a boot splash). I suspect that it's not really worth doing this
for devices that are only initialized by the kernel because we have a
bit of control over when we enable them, so I'd prefer if we just kept
things reasonably simple and made sure the SMMU was either always used
by a device from the start or not at all. Dynamically switching between
SMMU and no-SMMU seems a bit eccentric.



Re: [PATCH v6 3/3] firmware: QCOM_SCM: Allow qcom_scm driver to be loadable as a permenent module

2020-11-16 Thread Thierry Reding
On Fri, Nov 06, 2020 at 04:27:10AM +, John Stultz wrote:
> Allow the qcom_scm driver to be loadable as a permenent module.
> 
> This still uses the "depends on QCOM_SCM || !QCOM_SCM" bit to
> ensure that drivers that call into the qcom_scm driver are
> also built as modules. While not ideal in some cases it's the
> only safe way I can find to avoid build errors without having
> those drivers select QCOM_SCM and have to force it on (as
> QCOM_SCM=n can be valid for those drivers).
> 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Andy Gross 
> Cc: Bjorn Andersson 
> Cc: Joerg Roedel 
> Cc: Thomas Gleixner 
> Cc: Jason Cooper 
> Cc: Marc Zyngier 
> Cc: Linus Walleij 
> Cc: Vinod Koul 
> Cc: Kalle Valo 
> Cc: Maulik Shah 
> Cc: Lina Iyer 
> Cc: Saravana Kannan 
> Cc: Todd Kjos 
> Cc: Greg Kroah-Hartman 
> Cc: linux-arm-...@vger.kernel.org
> Cc: iommu@lists.linux-foundation.org
> Cc: linux-g...@vger.kernel.org
> Acked-by: Kalle Valo 
> Acked-by: Greg Kroah-Hartman 
> Reviewed-by: Bjorn Andersson 
> Signed-off-by: John Stultz 
> ---
> v3:
> * Fix __arm_smccc_smc build issue reported by
>   kernel test robot 
> v4:
> * Add "depends on QCOM_SCM || !QCOM_SCM" bit to ath10k
>   config that requires it.
> v5:
> * Fix QCOM_QCM typo in Kconfig, it should be QCOM_SCM
> ---
>  drivers/firmware/Kconfig| 4 ++--
>  drivers/firmware/Makefile   | 3 ++-
>  drivers/firmware/qcom_scm.c | 4 
>  drivers/iommu/Kconfig   | 2 ++
>  drivers/net/wireless/ath/ath10k/Kconfig | 1 +
>  5 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
> index 3315e3c215864..5e369928bc567 100644
> --- a/drivers/firmware/Kconfig
> +++ b/drivers/firmware/Kconfig
> @@ -235,8 +235,8 @@ config INTEL_STRATIX10_RSU
> Say Y here if you want Intel RSU support.
>  
>  config QCOM_SCM
> - bool
> - depends on ARM || ARM64
> + tristate "Qcom SCM driver"
> + depends on (ARM && HAVE_ARM_SMCCC) || ARM64
>   select RESET_CONTROLLER
>  
>  config QCOM_SCM_DOWNLOAD_MODE_DEFAULT
> diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
> index 5e013b6a3692e..523173cbff335 100644
> --- a/drivers/firmware/Makefile
> +++ b/drivers/firmware/Makefile
> @@ -17,7 +17,8 @@ obj-$(CONFIG_ISCSI_IBFT)+= iscsi_ibft.o
>  obj-$(CONFIG_FIRMWARE_MEMMAP)+= memmap.o
>  obj-$(CONFIG_RASPBERRYPI_FIRMWARE) += raspberrypi.o
>  obj-$(CONFIG_FW_CFG_SYSFS)   += qemu_fw_cfg.o
> -obj-$(CONFIG_QCOM_SCM)   += qcom_scm.o qcom_scm-smc.o qcom_scm-legacy.o
> +obj-$(CONFIG_QCOM_SCM)   += qcom-scm.o
> +qcom-scm-objs += qcom_scm.o qcom_scm-smc.o qcom_scm-legacy.o
>  obj-$(CONFIG_TI_SCI_PROTOCOL)+= ti_sci.o
>  obj-$(CONFIG_TRUSTED_FOUNDATIONS) += trusted_foundations.o
>  obj-$(CONFIG_TURRIS_MOX_RWTM)+= turris-mox-rwtm.o
> diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
> index 7be48c1bec96d..6f431b73e617d 100644
> --- a/drivers/firmware/qcom_scm.c
> +++ b/drivers/firmware/qcom_scm.c
> @@ -1280,6 +1280,7 @@ static const struct of_device_id qcom_scm_dt_match[] = {
>   { .compatible = "qcom,scm" },
>   {}
>  };
> +MODULE_DEVICE_TABLE(of, qcom_scm_dt_match);
>  
>  static struct platform_driver qcom_scm_driver = {
>   .driver = {
> @@ -1295,3 +1296,6 @@ static int __init qcom_scm_init(void)
>   return platform_driver_register(&qcom_scm_driver);
>  }
>  subsys_initcall(qcom_scm_init);
> +
> +MODULE_DESCRIPTION("Qualcomm Technologies, Inc. SCM driver");
> +MODULE_LICENSE("GPL v2");
> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
> index 04878caf6da49..c64d7a2b65134 100644
> --- a/drivers/iommu/Kconfig
> +++ b/drivers/iommu/Kconfig
> @@ -248,6 +248,7 @@ config SPAPR_TCE_IOMMU
>  config ARM_SMMU
>   tristate "ARM Ltd. System MMU (SMMU) Support"
>   depends on ARM64 || ARM || (COMPILE_TEST && !GENERIC_ATOMIC64)
> + depends on QCOM_SCM || !QCOM_SCM #if QCOM_SCM=m this can't be =y
>   select IOMMU_API
>   select IOMMU_IO_PGTABLE_LPAE
>   select ARM_DMA_USE_IOMMU if ARM

This, in conjunction with deferred probe timeout, causes mayhem on
Tegra186. The problem, as far as I can tell, is that there are various
devices that are hooked up to the ARM SMMU, but if ARM SMMU ends up
being built as a loadable module, then those devices will initialize
without IOMMU support (because deferred probe will time out before the
ARM SMMU module can be loaded from the root filesystem).

Unfortunately, the ARM SMMU module will eventually end up being loaded
once the root filesystem has been mounted (for example via SDHCI or
Ethernet, both using just plain, non-IOMMU-backed DMA API) and then
initialize, configuring as "fault by default", which then results in a
slew of SMMU faults from all the devices that have previously configured
themselves without IOMMU support.

One way to work around this is to just disable all QCOM-related drivers
for the 

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-11-10 Thread Thierry Reding
On Fri, Nov 06, 2020 at 04:25:48PM +0100, Thierry Reding wrote:
> On Thu, Nov 05, 2020 at 05:47:21PM +, Robin Murphy wrote:
> > On 2020-11-05 16:43, Thierry Reding wrote:
> > > On Thu, Sep 24, 2020 at 01:27:25PM +0200, Thierry Reding wrote:
> > > > On Tue, Sep 15, 2020 at 02:36:48PM +0200, Thierry Reding wrote:
> > > > > On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> > > > > > On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > > > > > > From: Thierry Reding 
> > > > > > > 
> > > > > > > Reserved memory regions can be marked as "active" if hardware is
> > > > > > > expected to access the regions during boot and before the operating
> > > > > > > system can take control. One example where this is useful is for the
> > > > > > > operating system to infer whether the region needs to be
> > > > > > > identity-mapped through an IOMMU.
> > > > > > 
> > > > > > I like simple solutions, but this hardly seems adequate to solve the
> > > > > > problem of passing IOMMU setup from bootloader/firmware to the OS. Like
> > > > > > what is the IOVA that's supposed to be used if identity mapping is not
> > > > > > used?
> > > > > 
> > > > > The assumption here is that if the region is not active there is no need
> > > > > for the IOVA to be specified because the kernel will allocate memory and
> > > > > assign any IOVA of its choosing.
> > > > > 
> > > > > Also, note that this is not meant as a way of passing IOMMU setup from
> > > > > the bootloader or firmware to the OS. The purpose of this is to specify
> > > > > that some region of memory is actively being accessed during boot. The
> > > > > particular case that I'm looking at is where the bootloader set up a
> > > > > splash screen and keeps it on during boot. The bootloader has not set up
> > > > > an IOMMU mapping and the identity mapping serves as a way of keeping the
> > > > > accesses by the display hardware working during the transitional period
> > > > > after the IOMMU translations have been enabled by the kernel but before
> > > > > the kernel display driver has had a chance to set up its own IOMMU
> > > > > mappings.
> > > > > 
> > > > > > If you know enough about the regions to assume identity mapping, then
> > > > > > can't you know if active or not?
> > > > > 
> > > > > We could alternatively add some property that describes the region as
> > > > > requiring an identity mapping. But note that we can't make any
> > > > > assumptions here about the usage of these regions because the IOMMU
> > > > > driver simply has no way of knowing what they are being used for.
> > > > > 
> > > > > Some additional information is required in device tree for the IOMMU
> > > > > driver to be able to make that decision.
> > > > 
> > > > Rob, can you provide any hints on exactly how you want to move this
> > > > forward? I don't know in what direction you'd like to proceed.
> > > 
> > > Hi Rob,
> > > 
> > > do you have any suggestions on how to proceed with this? I'd like to get
> > > this moving again because it's something that's been nagging me for some
> > > months now. It also requires changes across two levels in the bootloader
> > > stack as well as Linux and it takes quite a bit of work to make all the
> > > changes, so before I go and rewrite everything I'd like to get the DT
> > > bindings sorted out first.
> > > 
> > > So just to summarize why I think this simple solution is good enough: it
> > > tries to solve a very narrow and simple problem. This is not an attempt
> > > at describing the firmware's full IOMMU setup to the kernel. In fact, it
> > > > is primarily targeted at cases where the firmware hasn't set up an IOMMU
> > > at all, and we just want to make sure that when the kernel takes over
> > > and does want to

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-11-06 Thread Thierry Reding
On Thu, Nov 05, 2020 at 05:47:21PM +, Robin Murphy wrote:
> On 2020-11-05 16:43, Thierry Reding wrote:
> > On Thu, Sep 24, 2020 at 01:27:25PM +0200, Thierry Reding wrote:
> > > On Tue, Sep 15, 2020 at 02:36:48PM +0200, Thierry Reding wrote:
> > > > On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> > > > > On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > > > > > From: Thierry Reding 
> > > > > > 
> > > > > > Reserved memory regions can be marked as "active" if hardware is
> > > > > > expected to access the regions during boot and before the operating
> > > > > > system can take control. One example where this is useful is for the
> > > > > > operating system to infer whether the region needs to be identity-
> > > > > > mapped through an IOMMU.
> > > > > 
> > > > > I like simple solutions, but this hardly seems adequate to solve the
> > > > > > problem of passing IOMMU setup from bootloader/firmware to the OS. Like
> > > > > > what is the IOVA that's supposed to be used if identity mapping is not
> > > > > > used?
> > > > 
> > > > The assumption here is that if the region is not active there is no need
> > > > for the IOVA to be specified because the kernel will allocate memory and
> > > > assign any IOVA of its choosing.
> > > > 
> > > > Also, note that this is not meant as a way of passing IOMMU setup from
> > > > the bootloader or firmware to the OS. The purpose of this is to specify
> > > > that some region of memory is actively being accessed during boot. The
> > > > particular case that I'm looking at is where the bootloader set up a
> > > > splash screen and keeps it on during boot. The bootloader has not set up
> > > > an IOMMU mapping and the identity mapping serves as a way of keeping the
> > > > accesses by the display hardware working during the transitional period
> > > > after the IOMMU translations have been enabled by the kernel but before
> > > > the kernel display driver has had a chance to set up its own IOMMU
> > > > mappings.
> > > > 
> > > > > If you know enough about the regions to assume identity mapping, then
> > > > > can't you know if active or not?
> > > > 
> > > > We could alternatively add some property that describes the region as
> > > > requiring an identity mapping. But note that we can't make any
> > > > assumptions here about the usage of these regions because the IOMMU
> > > > driver simply has no way of knowing what they are being used for.
> > > > 
> > > > Some additional information is required in device tree for the IOMMU
> > > > driver to be able to make that decision.
> > > 
> > > Rob, can you provide any hints on exactly how you want to move this
> > > forward? I don't know in what direction you'd like to proceed.
> > 
> > Hi Rob,
> > 
> > do you have any suggestions on how to proceed with this? I'd like to get
> > this moving again because it's something that's been nagging me for some
> > months now. It also requires changes across two levels in the bootloader
> > stack as well as Linux and it takes quite a bit of work to make all the
> > changes, so before I go and rewrite everything I'd like to get the DT
> > bindings sorted out first.
> > 
> > So just to summarize why I think this simple solution is good enough: it
> > tries to solve a very narrow and simple problem. This is not an attempt
> > at describing the firmware's full IOMMU setup to the kernel. In fact, it
> > is primarily targeted at cases where the firmware hasn't set up an IOMMU
> > at all, and we just want to make sure that when the kernel takes over
> > and does want to enable the IOMMU, that all the regions that are
> > actively being accessed by non-quiesced hardware (the most typical
> > example would be a framebuffer scanning out a splash screen or animation,
> > but it could equally well be some sort of welcoming tone or music being
> > played back) are described in device tree.
> > 
> > In other words, and this is perhaps better answering your second
> > question: in addition to describing reserved memory regions, we want to
> > add a bit of information here about the usage of these memory regions.
> > Some memory regions may contain information that the kernel may want to
> > use (such a

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-11-05 Thread Thierry Reding
On Thu, Sep 24, 2020 at 01:27:25PM +0200, Thierry Reding wrote:
> On Tue, Sep 15, 2020 at 02:36:48PM +0200, Thierry Reding wrote:
> > On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> > > On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > > > From: Thierry Reding 
> > > > 
> > > > Reserved memory regions can be marked as "active" if hardware is
> > > > expected to access the regions during boot and before the operating
> > > > system can take control. One example where this is useful is for the
> > > > operating system to infer whether the region needs to be identity-
> > > > mapped through an IOMMU.
> > > 
> > > I like simple solutions, but this hardly seems adequate to solve the 
> > > problem of passing IOMMU setup from bootloader/firmware to the OS. Like 
> > > what is the IOVA that's supposed to be used if identity mapping is not 
> > > used?
> > 
> > The assumption here is that if the region is not active there is no need
> > for the IOVA to be specified because the kernel will allocate memory and
> > assign any IOVA of its choosing.
> > 
> > Also, note that this is not meant as a way of passing IOMMU setup from
> > the bootloader or firmware to the OS. The purpose of this is to specify
> > that some region of memory is actively being accessed during boot. The
> > particular case that I'm looking at is where the bootloader set up a
> > splash screen and keeps it on during boot. The bootloader has not set up
> > an IOMMU mapping and the identity mapping serves as a way of keeping the
> > accesses by the display hardware working during the transitional period
> > after the IOMMU translations have been enabled by the kernel but before
> > the kernel display driver has had a chance to set up its own IOMMU
> > mappings.
> > 
> > > If you know enough about the regions to assume identity mapping, then 
> > > can't you know if active or not?
> > 
> > We could alternatively add some property that describes the region as
> > requiring an identity mapping. But note that we can't make any
> > assumptions here about the usage of these regions because the IOMMU
> > driver simply has no way of knowing what they are being used for.
> > 
> > Some additional information is required in device tree for the IOMMU
> > driver to be able to make that decision.
> 
> Rob, can you provide any hints on exactly how you want to move this
> forward? I don't know in what direction you'd like to proceed.

Hi Rob,

do you have any suggestions on how to proceed with this? I'd like to get
this moving again because it's something that's been nagging me for some
months now. It also requires changes across two levels in the bootloader
stack as well as Linux and it takes quite a bit of work to make all the
changes, so before I go and rewrite everything I'd like to get the DT
bindings sorted out first.

So just to summarize why I think this simple solution is good enough: it
tries to solve a very narrow and simple problem. This is not an attempt
at describing the firmware's full IOMMU setup to the kernel. In fact, it
is primarily targeted at cases where the firmware hasn't set up an IOMMU
at all, and we just want to make sure that when the kernel takes over
and does want to enable the IOMMU, that all the regions that are
actively being accessed by non-quiesced hardware (the most typical
example would be a framebuffer scanning out a splash screen or animation,
but it could equally well be some sort of welcoming tone or music being
played back) are described in device tree.
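
For illustration only, such a region might be described roughly as in the
sketch below. It assumes the "active" property proposed in this series,
which is not an established binding; the label, node name, address and
size are made up for the example.

```dts
/ {
	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/* hypothetical: framebuffer the bootloader left scanning out */
		fb: framebuffer@92000000 {
			/* hypothetical base address and size (8 MiB) */
			reg = <0x0 0x92000000 0x0 0x00800000>;
			/* proposed by this series, not an established property */
			active;
		};
	};
};
```

An IOMMU driver could then treat the presence of "active" as the hint to
identity-map this range before enabling translations for the device that
references it.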

In other words, and this is perhaps better answering your second
question: in addition to describing reserved memory regions, we want to
add a bit of information here about the usage of these memory regions.
Some memory regions may contain information that the kernel may want to
use (such as external memory frequency scaling tables) and those I would
describe as "inactive" memory because it isn't being accessed by
hardware. The framebuffer in this case is the opposite and it is being
actively accessed (hence it is marked "active") by hardware while the
kernel is busy setting everything up so that it can reconfigure that
hardware and take over with its own framebuffer (for the console, for
example). It's also not so much that we know enough about the region to
assume it needs identity mapping. We don't really care about that from
the DT point of view. In fact, depending on the rest of the system
configuration, we may not need identity mapping (i.e. if none of the
users of the reserved memory region are behind an IOMMU). But the point
here is that the IOMMU drivers can use 

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-11-05 Thread Thierry Reding
On Fri, Sep 25, 2020 at 04:21:17PM +0300, Dmitry Osipenko wrote:
> 25.09.2020 15:39, Robin Murphy wrote:
> ...
> >> IIRC, in the past Robin Murphy was suggesting to read out hardware state
> >> early during kernel boot in order to find what regions are in use by
> >> hardware.
> > 
> > I doubt I suggested that in general, because I've always firmly believed
> > it to be a terrible idea. I've debugged too many cases where firmware or
> > kexec has inadvertently left DMA running and corrupted kernel memory, so
> > in general we definitely *don't* want to blindly trust random hardware
> > state. Anything I may have said in relation to Qualcomm's fundamentally
> > broken hypervisor/bootloader setup should not be considered outside that
> > specific context ;)
> > 
> > Robin.
> > 
> >> I think it should be easy to do for the display controller since we
> >> could check clock and PD states in order to decide whether DC's IO could
> >> be accessed and then read out the FB pointer and size. I guess it should
> >> take about hundred lines of code.
> 
> The active DMA is indeed very dangerous, but it's a bit less dangerous
> in a case of read-only DMA.
> 
> I got another idea of how we could benefit from the active display
> hardware. Maybe we could do the following:
> 
> 1. Check whether display is active
> 
> 2. Allocate CMA that matches the FB size
> 
> 3. Create identity mapping for the CMA
> 
> 4. Switch display framebuffer to our CMA
> 
> 5. Create very early simple-framebuffer out of the CMA
> 
> 6. Once the Tegra DRM driver is loaded, it will kick out the simple-fb, and
> thus, release the temporary CMA and identity mapping.
> 
> This will provide us with a very early framebuffer output and it will
> work on all devices out-of-the-box!

Well that's already kind of what this is trying to achieve, only
skipping the CMA step because the memory is already there and actively
being scanned out from. The problem with your sequence above is first
that you have to allocate from CMA, which means that this has to wait
until CMA becomes available. That's fairly early, but it's not
immediately there. Until you get to that point, there's always the
potential for the display controller to read out from memory that may
now be used for something else. As you said, read-only active DMA isn't
as dangerous as write DMA, but it's not very nice either.

Furthermore, your point 5. above requires device-specific knowledge and
as I mentioned earlier that requires a small, but not necessarily
trivial, device-specific driver to work, which is very impractical for
multi-platform kernels.

There's nothing preventing these reserved-memory regions from being
reused to implement simple-framebuffer. I could in fact imagine a fairly
simple extension to the existing simple-framebuffer binding that could
look like this for Tegra:

	dc@5200 {
		compatible = "nvidia,tegra210-display", "simple-framebuffer";
		...
		memory-region = <>;
		width = <1920>;
		height = <1080>;
		stride = <7680>;
		format = "r8g8b8";
		...
	};

That's not dissimilar to what you're proposing above, except that it
moves everything before step 5. into the bootloader's responsibility and
therefore avoids the need for hardware-specific early display code in
the kernel.
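
As a rough sketch of how the two sides could pair up, assuming the proposed
"active" property and using a made-up label, address and size for the
region (the &dc label for the display controller node is hypothetical as
well, and its other properties are left out):

```dts
/ {
	reserved-memory {
		#address-cells = <2>;
		#size-cells = <2>;
		ranges;

		/* hypothetical label, address and size */
		splash_fb: framebuffer@92000000 {
			reg = <0x0 0x92000000 0x0 0x00800000>;
			/* proposed property, not an established binding */
			active;
		};
	};
};

/* hypothetical label for the display controller node */
&dc {
	memory-region = <&splash_fb>;
};
```

The 8 MiB size is picked only so that it covers 1080 lines of 7680 bytes
each (8294400 bytes); a real bootloader would fill in whatever it actually
allocated.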

Thierry



Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-11-05 Thread Thierry Reding
On Fri, Sep 25, 2020 at 01:39:07PM +0100, Robin Murphy wrote:
> On 2020-09-24 17:23, Dmitry Osipenko wrote:
> > 24.09.2020 17:01, Thierry Reding wrote:
> > > On Thu, Sep 24, 2020 at 04:23:59PM +0300, Dmitry Osipenko wrote:
> > > > 04.09.2020 15:59, Thierry Reding wrote:
> > > > > From: Thierry Reding 
> > > > > 
> > > > > Reserved memory regions can be marked as "active" if hardware is
> > > > > expected to access the regions during boot and before the operating
> > > > > system can take control. One example where this is useful is for the
> > > > > operating system to infer whether the region needs to be identity-
> > > > > mapped through an IOMMU.
> > > > > 
> > > > > Signed-off-by: Thierry Reding 
> > > > > ---
> > > > >   .../bindings/reserved-memory/reserved-memory.txt   | 7 
> > > > > +++
> > > > >   1 file changed, 7 insertions(+)
> > > > > 
> > > > > diff --git 
> > > > > a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > >  
> > > > > b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > > index 4dd20de6977f..163d2927e4fc 100644
> > > > > --- 
> > > > > a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > > +++ 
> > > > > b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > > > > @@ -63,6 +63,13 @@ reusable (optional) - empty property
> > > > > able to reclaim it back. Typically that means that the 
> > > > > operating
> > > > > system can use that region to store volatile or cached data 
> > > > > that
> > > > > can be otherwise regenerated or migrated elsewhere.
> > > > > +active (optional) - empty property
> > > > > +- If this property is set for a reserved memory region, it 
> > > > > indicates
> > > > > +  that some piece of hardware may be actively accessing this 
> > > > > region.
> > > > > +  Should the operating system want to enable IOMMU protection 
> > > > > for a
> > > > > +  device, all active memory regions must have been 
> > > > > identity-mapped
> > > > > +  in order to ensure that non-quiescent hardware during boot can
> > > > > +  continue to access the memory.
> > > > >   Linux implementation note:
> > > > >   - If a "linux,cma-default" property is present, then Linux will use 
> > > > > the
> > > > > 
> > > > 
> > > > Hi,
> > > > 
> > > > Could you please explain what devices need this quirk? I see that you're
> > > > targeting Tegra SMMU driver, which means that it should be some pre-T186
> > > > device.
> > > 
> > > Primarily I'm looking at Tegra210 and later, because on earlier devices
> > > the bootloader doesn't consistently initialize display. I know that it
> > > does on some devices, but not all of them.
> > 
> > AFAIK, all tablet devices starting with Tegra20 that have display panel
> > are initializing display at a boot time for showing splash screen. This
> > includes all T20/T30/T114 tablets that are already supported by upstream
> > kernel.
> > 
> > > This same code should also
> > > work on Tegra186 and later (with an ARM SMMU) although the situation is
> > > slightly more complicated there because IOMMU translations will fault by
> > > default long before these identity mappings can be established.
> > > 
> > > > Is this reservation needed for some device that has display
> > > > hardwired to a very specific IOMMU domain at the boot time?
> > > 
> > > No, this is only used to convey information about the active framebuffer
> > > to the kernel. In practice the DMA/IOMMU code will use this information
> > > to establish a 1:1 mapping on whatever IOMMU domain that was picked for
> > > display.
> > > 
> > > > If you're targeting devices that don't have IOMMU enabled by default at
> > > > the boot time, then this approach won't work for the existing devices
> > > > which won't ever get an updated bootloader.
> > > 
> > > If the devices don't use an IOMMU, then there should be no p

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-11-05 Thread Thierry Reding
On Thu, Sep 24, 2020 at 07:23:34PM +0300, Dmitry Osipenko wrote:
> 24.09.2020 17:01, Thierry Reding wrote:
> > On Thu, Sep 24, 2020 at 04:23:59PM +0300, Dmitry Osipenko wrote:
> >> 04.09.2020 15:59, Thierry Reding wrote:
> >>> From: Thierry Reding 
> >>>
> >>> Reserved memory regions can be marked as "active" if hardware is
> >>> expected to access the regions during boot and before the operating
> >>> system can take control. One example where this is useful is for the
> >>> operating system to infer whether the region needs to be identity-
> >>> mapped through an IOMMU.
> >>>
> >>> Signed-off-by: Thierry Reding 
> >>> ---
> >>>  .../bindings/reserved-memory/reserved-memory.txt   | 7 +++
> >>>  1 file changed, 7 insertions(+)
> >>>
> >>> diff --git 
> >>> a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt 
> >>> b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> >>> index 4dd20de6977f..163d2927e4fc 100644
> >>> --- 
> >>> a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> >>> +++ 
> >>> b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> >>> @@ -63,6 +63,13 @@ reusable (optional) - empty property
> >>>able to reclaim it back. Typically that means that the operating
> >>>system can use that region to store volatile or cached data that
> >>>can be otherwise regenerated or migrated elsewhere.
> >>> +active (optional) - empty property
> >>> +- If this property is set for a reserved memory region, it indicates
> >>> +  that some piece of hardware may be actively accessing this region.
> >>> +  Should the operating system want to enable IOMMU protection for a
> >>> +  device, all active memory regions must have been identity-mapped
> >>> +  in order to ensure that non-quiescent hardware during boot can
> >>> +  continue to access the memory.
> >>>  
> >>>  Linux implementation note:
> >>>  - If a "linux,cma-default" property is present, then Linux will use the
> >>>
> >>
> >> Hi,
> >>
> >> Could you please explain what devices need this quirk? I see that you're
> >> targeting Tegra SMMU driver, which means that it should be some pre-T186
> >> device.
> > 
> > Primarily I'm looking at Tegra210 and later, because on earlier devices
> > the bootloader doesn't consistently initialize display. I know that it
> > does on some devices, but not all of them.
> 
> AFAIK, all tablet devices starting with Tegra20 that have display panel
> are initializing display at a boot time for showing splash screen. This
> includes all T20/T30/T114 tablets that are already supported by upstream
> kernel.
> 
> > This same code should also
> > work on Tegra186 and later (with an ARM SMMU) although the situation is
> > slightly more complicated there because IOMMU translations will fault by
> > default long before these identity mappings can be established.
> > 
> >> Is this reservation needed for some device that has display
> >> hardwired to a very specific IOMMU domain at the boot time?
> > 
> > No, this is only used to convey information about the active framebuffer
> > to the kernel. In practice the DMA/IOMMU code will use this information
> > to establish a 1:1 mapping on whatever IOMMU domain that was picked for
> > display.
> > 
> >> If you're targeting devices that don't have IOMMU enabled by default at
> >> the boot time, then this approach won't work for the existing devices
> >> which won't ever get an updated bootloader.
> > 
> > If the devices don't use an IOMMU, then there should be no problem. The
> > extra reserved-memory nodes would still be necessary to ensure that the
> > kernel doesn't reuse the framebuffer memory for the slab allocator, but
> > if no IOMMU is used, then the display controller accessing the memory
> > isn't going to cause problems other than perhaps scanning out data that
> > is no longer a framebuffer.
> > 
> > There should also be no problem for devices with an old bootloader
> > because this code is triggered by the presence of a reserved-memory node
> > referenced via the memory-region property. Devices with an old
> > bootloader should continue to work as they did before. Although I
> > suppose they woul

Re: [PATCH v4 2/3] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-10-09 Thread Thierry Reding
On Thu, Oct 08, 2020 at 02:12:10PM -0700, Nicolin Chen wrote:
> On Thu, Oct 08, 2020 at 11:53:43AM +0200, Thierry Reding wrote:
> > On Mon, Oct 05, 2020 at 06:05:46PM -0700, Nicolin Chen wrote:
> > > On Mon, Oct 05, 2020 at 11:57:54AM +0200, Thierry Reding wrote:
> > > > On Fri, Oct 02, 2020 at 11:58:29AM -0700, Nicolin Chen wrote:
> > > > > On Fri, Oct 02, 2020 at 06:02:18PM +0300, Dmitry Osipenko wrote:
> > > > > > 02.10.2020 09:08, Nicolin Chen wrote:
> > > > > > >  static int tegra_smmu_of_xlate(struct device *dev,
> > > > > > >  struct of_phandle_args *args)
> > > > > > >  {
> > > > > > > + struct platform_device *iommu_pdev = 
> > > > > > > of_find_device_by_node(args->np);
> > > > > > > + struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);
> > > > > > >   u32 id = args->args[0];
> > > > > > >  
> > > > > > > + of_node_put(args->np);
> > > > > > 
> > > > > > of_find_device_by_node() takes device reference and not the np
> > > > > > reference. This is a bug, please remove of_node_put().
> > > > > 
> > > > > Looks like so. Replacing it with put_device(&iommu_pdev->dev);
> > > > 
> > > > Putting the put_device() here is wrong, though. You need to make sure
> > > > you keep a reference to it as long as you keep accessing the data that
> > > > is owned by it.
> > > 
> > > I am confused. You said in the other reply (to Dmitry) that we do
> > > need to put_device(mc->dev), where mc->dev should be the same as
> > > iommu_pdev->dev. But here your comments sounds that we should not
> > > put_device at all since ->probe_device/group_device/attach_dev()
> > > will use it later.
> > 
> > You need to call put_device() at some point to release the reference
> > that you acquired by calling of_find_device_by_node(). If you don't
> > release it, you're leaking the reference and the kernel isn't going to
> > know when it's safe to delete the device.
> > 
> > So what I'm saying is that we either release it here, which isn't quite
> > right because we do reference data relating to the device later on. And
> 
> I see. A small question here by the way: By looking at other IOMMU
> drivers that are calling driver_find_device_by_fwnode() function,
> I found that most of them put_device right after the function call,
> and dev_get_drvdata() after putting the device..
> 
> Feels like they are doing it wrongly?

Well, like I said this is somewhat academic because these are all
referencing the IOMMU that by definition still needs to be around
when this code is called, and there's locks in place to ensure
these don't go away. So it's not like these drivers are doing it
wrong, they're just not doing it pedantically right.

> 
> > because it isn't quite right there should be a reason to justify it,
> > which is that the SMMU parent device is the same as the MC, so the
> > reference count isn't strictly necessary. But that's not quite obvious,
> > so highlighting it in a comment makes sense.
> > 
> > The other alternative is to not call put_device() here and hold on to
> > the reference as long as you keep using "mc". This might be difficult to
> > implement because it may not be obvious where to release it. I think
> > this is the better alternative, but if it's too complicated to implement
> > it might not be worth it.
> 
> I feel so too. The dev is got at of_xlate() that does not have an
> obvious counterpart function. So I'll just remove put_device() and
> put a line of comments, as you suggested.

I think you misunderstood. Not calling put_device() would be wrong
because that leaks a reference to the SMMU that you can't get back. My
suggestion was rather to keep put_device() here, but add a comment as to
why it's okay to call the put_device() here, even though you keep using
its private data later beyond this point, which typically would be wrong
to do.
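
To make the suggestion concrete, here is a rough user-space model of the
pattern (this is not the driver code; every model_* name is made up for
the sketch, and the refcount is a plain integer rather than a kref). The
lookup hands back a new reference, the reference is dropped immediately,
and the comment records why the private data may still be used:

```c
#include <stddef.h>

/* Hypothetical stand-in for struct device and its reference count. */
struct model_device {
	int refcount;
	void *drvdata;
};

static struct model_device *model_get_device(struct model_device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

static void model_put_device(struct model_device *dev)
{
	if (dev)
		dev->refcount--;
}

/* Like of_find_device_by_node(): a successful lookup returns a new reference. */
static struct model_device *model_find_device(struct model_device *registered)
{
	return model_get_device(registered);
}

static void *model_of_xlate(struct model_device *registered)
{
	struct model_device *pdev = model_find_device(registered);
	void *mc;

	if (!pdev)
		return NULL;

	mc = pdev->drvdata;

	/*
	 * Drop the lookup reference right away. Using "mc" after this point
	 * is only safe because the looked-up device is the one that
	 * registered this callback, so it cannot go away while the callback
	 * can still run.
	 */
	model_put_device(pdev);

	return mc;
}
```

The point is only that the reference count is balanced on every path and
that the justification lives next to the put.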

Thierry



Re: [PATCH v4 2/2] iommu/tegra-smmu: Expand mutex protection range

2020-10-08 Thread Thierry Reding
On Mon, Sep 28, 2020 at 11:13:25PM -0700, Nicolin Chen wrote:
> This is used to protect potential race condition at use_count.
> since probes of client drivers, calling attach_dev(), may run
> concurrently.
> 
> Signed-off-by: Nicolin Chen 
> ---
> 
> Changelog
> v3->v4:
>  * Fixed typo "Expend" => "Expand"
> v2->v3:
>  * Renamed label "err_unlock" to "unlock"
> v1->v2:
>  * N/A
> 
>  drivers/iommu/tegra-smmu.c | 34 +++++-
>  1 file changed, 21 insertions(+), 13 deletions(-)

Acked-by: Thierry Reding 



Re: [PATCH v4 1/2] iommu/tegra-smmu: Unwrap tegra_smmu_group_get

2020-10-08 Thread Thierry Reding
On Mon, Sep 28, 2020 at 11:13:24PM -0700, Nicolin Chen wrote:
> The tegra_smmu_group_get was added to group devices in different
> SWGROUPs and it'd return a NULL group pointer upon a mismatch at
> tegra_smmu_find_group(), so for most of clients/devices, it very
> likely would mismatch and need a fallback generic_device_group().
> 
> But now tegra_smmu_group_get handles devices in same SWGROUP too,
> which means that it would allocate a group for every new SWGROUP
> or would directly return an existing one upon matching a SWGROUP,
> i.e. any device will go through this function.
> 
> So possibility of having a NULL group pointer in device_group()
> is upon failure of either devm_kzalloc() or iommu_group_alloc().
> In either case, calling generic_device_group() no longer makes a
> sense. Especially for devm_kzalloc() failing case, it'd cause a
> problem if it fails at devm_kzalloc() yet succeeds at a fallback
> generic_device_group(), because it does not create a group->list
> for other devices to match.
> 
> This patch simply unwraps the function to clean it up.
> 
> Signed-off-by: Nicolin Chen 
> ---
> 
> Changelog
> v2->v4:
>  * N/A
> v1->v2:
>  * Changed type of swgroup to "unsigned int", following Thierry's
>    comments.
> 
>  drivers/iommu/tegra-smmu.c | 19 ---
>  1 file changed, 4 insertions(+), 15 deletions(-)

Acked-by: Thierry Reding 



Re: [PATCH v4 2/3] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-10-08 Thread Thierry Reding
On Mon, Oct 05, 2020 at 06:05:46PM -0700, Nicolin Chen wrote:
> On Mon, Oct 05, 2020 at 11:57:54AM +0200, Thierry Reding wrote:
> > On Fri, Oct 02, 2020 at 11:58:29AM -0700, Nicolin Chen wrote:
> > > On Fri, Oct 02, 2020 at 06:02:18PM +0300, Dmitry Osipenko wrote:
> > > > 02.10.2020 09:08, Nicolin Chen wrote:
> > > > >  static int tegra_smmu_of_xlate(struct device *dev,
> > > > >  struct of_phandle_args *args)
> > > > >  {
> > > > > + struct platform_device *iommu_pdev = 
> > > > > of_find_device_by_node(args->np);
> > > > > + struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);
> > > > >   u32 id = args->args[0];
> > > > >  
> > > > > + of_node_put(args->np);
> > > > 
> > > > of_find_device_by_node() takes device reference and not the np
> > > > reference. This is a bug, please remove of_node_put().
> > > 
> > > Looks like so. Replacing it with put_device(&iommu_pdev->dev);
> > 
> > Putting the put_device() here is wrong, though. You need to make sure
> > you keep a reference to it as long as you keep accessing the data that
> > is owned by it.
> 
> I am confused. You said in the other reply (to Dmitry) that we do
> need to put_device(mc->dev), where mc->dev should be the same as
> iommu_pdev->dev. But here your comments sounds that we should not
> put_device at all since ->probe_device/group_device/attach_dev()
> will use it later.

You need to call put_device() at some point to release the reference
that you acquired by calling of_find_device_by_node(). If you don't
release it, you're leaking the reference and the kernel isn't going to
know when it's safe to delete the device.

So what I'm saying is that we either release it here, which isn't quite
right because we do reference data relating to the device later on. And
because it isn't quite right there should be a reason to justify it,
which is that the SMMU parent device is the same as the MC, so the
reference count isn't strictly necessary. But that's not quite obvious,
so highlighting it in a comment makes sense.

The other alternative is to not call put_device() here and hold on to
the reference as long as you keep using "mc". This might be difficult to
implement because it may not be obvious where to release it. I think
this is the better alternative, but if it's too complicated to implement
it might not be worth it.

> > Like I said earlier, this is a bit weird in this case because we're
> > self-referencing, so iommu_pdev->dev is going to stay around as long as
> > the SMMU is. However, it might be worth to properly track the lifetime
> > anyway just so that the code can serve as a good example of how to do
> > things.
> 
> What's this "track-the-lifetime"?

This basically just means that SMMU needs to ensure that MC stays alive
(by holding a reference to it) as long as SMMU uses it. If the last
reference to MC is dropped, then the mc pointer and potentially anything
that it points to will become dangling. If you were to drop the last
reference at this point, then on the next line the mc pointer could
already be invalid.

That's how it generally works, anyway. What's special about this use-
case is that the SMMU and MC are the same device, so it should be safe
to omit this additional tracking because the IOMMU tracking should take
care of that already.
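
For comparison, the general rule can be modelled in a few lines of
user-space C (a sketch only; the tracked_* names are invented here and
the "release" is just a flag instead of a real free). The user takes its
own reference when it starts using the provider and drops it when it is
done, so the provider outlives every user regardless of what the
original owner does:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical provider object with a plain integer reference count. */
struct tracked_dev {
	int refcount;
	bool released;	/* set once the last reference is gone */
};

static struct tracked_dev *tracked_get(struct tracked_dev *dev)
{
	dev->refcount++;
	return dev;
}

static void tracked_put(struct tracked_dev *dev)
{
	if (--dev->refcount == 0)
		dev->released = true;	/* a real release() would free here */
}

/* Hypothetical user that depends on the provider's data. */
struct tracked_user {
	struct tracked_dev *provider;
};

static void tracked_user_bind(struct tracked_user *user,
			      struct tracked_dev *dev)
{
	/* Hold a reference for as long as the pointer is stored. */
	user->provider = tracked_get(dev);
}

static void tracked_user_unbind(struct tracked_user *user)
{
	tracked_put(user->provider);
	user->provider = NULL;
}
```

With this in place the original owner can drop its reference at any time
without leaving the user with a dangling pointer.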

> > If you decide to go for the shortcut and not track this reference
> > properly, then at least you need to add a comment as to why it is safe
> > to do in this case. This ensures that readers are aware of the
> > circumstances and don't copy this bad code into a context where the
> > circumstances are different.
> 
> I don't quite get this "shortcut" here either...mind elaborating?

The shortcut is taking advantage of the knowledge that the SMMU and the
MC are the same device and therefore not properly track the MC object.
Given that their code is located in different locations, this isn't
obvious to the casual reader of the code, so they may assume that this
is the normal way to do things. To avoid that, the code should have a
comment explaining why that is.

Thierry



Re: [PATCH v4 2/3] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-10-05 Thread Thierry Reding
On Mon, Oct 05, 2020 at 04:28:53PM +0300, Dmitry Osipenko wrote:
> 05.10.2020 14:15, Thierry Reding wrote:
> > On Mon, Oct 05, 2020 at 01:36:55PM +0300, Dmitry Osipenko wrote:
> >> 05.10.2020 12:53, Thierry Reding wrote:
> >>> On Fri, Oct 02, 2020 at 05:50:08PM +0300, Dmitry Osipenko wrote:
> >>>> 02.10.2020 17:22, Dmitry Osipenko wrote:
> >>>>>>  static int tegra_smmu_of_xlate(struct device *dev,
> >>>>>>   struct of_phandle_args *args)
> >>>>>>  {
> >>>>>> +  struct platform_device *iommu_pdev = 
> >>>>>> of_find_device_by_node(args->np);
> >>>>>> +  struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);
> >>>>>>u32 id = args->args[0];
> >>>>>>  
> >>>>>> +  of_node_put(args->np);
> >>>>>> +
> >>>>>> +  if (!mc || !mc->smmu)
> >>>>>> +  return -EPROBE_DEFER;
> >>>>> platform_get_drvdata(NULL) will crash.
> >>>>>
> >>>>
> >>>> Actually, platform_get_drvdata(NULL) can't happen. I overlooked this.
> >>>
> >>> How so? It's technically possible for the iommus property to reference a
> >>> device tree node for which no platform device will ever be created, in
> >>> which case of_find_device_by_node() will return NULL. That's very
> >>> unlikely and perhaps worth just crashing on to make sure it gets fixed
> >>> immediately.
> >>
> >> The tegra_smmu_ops are registered from the SMMU driver itself and MC
> >> driver sets platform data before SMMU is initialized, hence device is
> >> guaranteed to exist and mc can't be NULL.
> > 
> > Yes, but that assumes that args->np points to the memory controller's
> > device tree node. It's obviously a mistake to do this, but I don't think
> > anyone will prevent you from doing this:
> > 
> > iommus = <&{/chosen} 0>;
> > 
> > In that case, since no platform device is created for the /chosen node,
> > iommu_pdev will end up being NULL and platform_get_drvdata() will crash.
> 
> But then Tegra SMMU isn't associated with the device's IOMMU path, and
> thus, tegra_smmu_of_xlate() won't be invoked for this device.

Indeed, you're right! It used to be that ops were assigned to the bus
without any knowledge about the specific instances that might exist, but
nowadays there's struct iommu_device which properly encapsulates all of
that, so yeah, I don't think this can ever be NULL.

Although that makes me wonder why we aren't going one step further and
pass struct iommu_device * into ->of_xlate(), which would avoid the need
for us to do the look up once more.
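
Purely as an illustration of what that could look like (no such callback
signature exists today; the model_* names and the macro are made up for
this user-space sketch), a callback that is handed the IOMMU instance
can get back to the driver's own structure with the usual container_of
pattern, without any device tree lookup:

```c
#include <stddef.h>

/* User-space equivalent of the kernel's container_of() helper. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct iommu_device. */
struct model_iommu_device {
	int id;
};

/* Hypothetical driver structure embedding the IOMMU instance. */
struct model_tegra_smmu {
	int num_swgroups;
	struct model_iommu_device iommu;
};

/* Hypothetical of_xlate variant that receives the IOMMU instance directly. */
static struct model_tegra_smmu *
model_xlate_smmu(struct model_iommu_device *iommu)
{
	return model_container_of(iommu, struct model_tegra_smmu, iommu);
}
```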

Thierry



Re: [PATCH v4 2/3] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-10-05 Thread Thierry Reding
On Mon, Oct 05, 2020 at 01:36:55PM +0300, Dmitry Osipenko wrote:
> 05.10.2020 12:53, Thierry Reding wrote:
> > On Fri, Oct 02, 2020 at 05:50:08PM +0300, Dmitry Osipenko wrote:
> >> 02.10.2020 17:22, Dmitry Osipenko wrote:
> >>>>  static int tegra_smmu_of_xlate(struct device *dev,
> >>>> struct of_phandle_args *args)
> >>>>  {
> >>>> +struct platform_device *iommu_pdev = 
> >>>> of_find_device_by_node(args->np);
> >>>> +struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);
> >>>>  u32 id = args->args[0];
> >>>>  
> >>>> +of_node_put(args->np);
> >>>> +
> >>>> +if (!mc || !mc->smmu)
> >>>> +return -EPROBE_DEFER;
> >>> platform_get_drvdata(NULL) will crash.
> >>>
> >>
> >> Actually, platform_get_drvdata(NULL) can't happen. I overlooked this.
> > 
> > How so? It's technically possible for the iommus property to reference a
> > device tree node for which no platform device will ever be created, in
> > which case of_find_device_by_node() will return NULL. That's very
> > unlikely and perhaps worth just crashing on to make sure it gets fixed
> > immediately.
> 
> The tegra_smmu_ops are registered from the SMMU driver itself and MC
> driver sets platform data before SMMU is initialized, hence device is
> guaranteed to exist and mc can't be NULL.

Yes, but that assumes that args->np points to the memory controller's
device tree node. It's obviously a mistake to do this, but I don't think
anyone will prevent you from doing this:

iommus = <&{/chosen} 0>;

In that case, since no platform device is created for the /chosen node,
iommu_pdev will end up being NULL and platform_get_drvdata() will crash.

That said, I'm fine with not adding a check for that. If anyone really
does end up messing this up they deserve the crash.

I'm still a bit undecided about the mc->smmu check because I haven't
convinced myself yet that it can't happen.

Thierry



Re: [PATCH v5 2/3] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-10-05 Thread Thierry Reding
On Mon, Oct 05, 2020 at 11:41:08AM +0300, Dmitry Osipenko wrote:
> 05.10.2020 00:57, Nicolin Chen wrote:
> > On Sat, Oct 03, 2020 at 05:06:42PM +0300, Dmitry Osipenko wrote:
> >> 03.10.2020 09:59, Nicolin Chen wrote:
> >>>  static int tegra_smmu_of_xlate(struct device *dev,
> >>>  struct of_phandle_args *args)
> >>>  {
> >>> + struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
> >>> + struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);
> >>>   u32 id = args->args[0];
> >>>  
> >>> + put_device(&iommu_pdev->dev);
> >>> +
> >>> + if (!mc || !mc->smmu)
> >>> + return -EPROBE_DEFER;
> >>
> >> I'm not very excited by seeing code in the patches that can't be
> >> explained by the patch authors and will appreciate if you could provide
> >> a detailed explanation about why this NULL checking is needed because I
> >> think it is unneeded, especially given that other IOMMU drivers don't
> >> have such check.
> > 
> > This function could be called from of_iommu_configure(), which is
> > a part of other driver's probe(). So I think it's safer to have a
> > check. Yet, given mc driver is added to the "arch_initcall" stage,
> > you are probably right that there's no really need at this moment
> > because all clients should be called after mc/smmu are inited. So
> > I'll resend a v6, if that makes you happy.
> 
> I wanted to get the explanation :) I'm very curious why it's actually
> needed because I'm not 100% sure whether it's not needed at all.
> 
> I'd assume that the only possible problem could be if some device is
> created in parallel with the MC probing and there is no locking that
> could prevent this in the drivers core. It's not apparent to me whether
> this situation could happen at all in practice.
> 
> The MC is created early and at that time everything is sequential, so
> it's indeed should be safe to remove the check.

I think I now remember exactly why the "hack" in tegra_smmu_probe()
exists. The reason is that the MC driver does this:

mc->smmu = tegra_smmu_probe(...);

That means that mc->smmu is going to be NULL until tegra_smmu_probe()
has finished. But tegra_smmu_probe() calls bus_set_iommu() and that in
turn calls ->probe_device(). So the purpose of the "hack" in the
tegra_smmu_probe() function was to make sure mc->smmu was available at
that point, because, well, it is already known, but we haven't gotten
around to storing it yet.

->of_xlate() can theoretically be called as early as right after
bus_set_iommu() via of_iommu_configure() if that is called in parallel
with tegra_smmu_probe(). I think that's very unlikely, but I'm not 100%
sure that it can't happen.

In any case, I do agree with Dmitry that we should have a comment here
explaining why this is necessary. Even if we're completely certain that
this is necessary, it's not obvious and therefore should get that
comment. And if we're not certain that it's necessary, it's probably
also good to mention that in the comment so that eventually it can be
determined or the check removed if it proves to be unnecessary.
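
Something along these lines is what I have in mind for the comment,
shown here as a small user-space model rather than the real function
(the model_* types are placeholders, and MODEL_EPROBE_DEFER just mirrors
the kernel's EPROBE_DEFER value of 517):

```c
#include <stddef.h>

#define MODEL_EPROBE_DEFER 517	/* same value as the kernel's EPROBE_DEFER */

/* Hypothetical stand-ins for struct tegra_smmu and struct tegra_mc. */
struct model_smmu {
	int unused;
};

struct model_mc {
	struct model_smmu *smmu;
};

static int model_xlate(const struct model_mc *mc)
{
	/*
	 * mc->smmu is only assigned once the SMMU probe function has
	 * returned, so a callback that runs before that point sees NULL
	 * here. It is not certain that this can happen in practice; the
	 * check can be dropped if it proves to be unnecessary.
	 */
	if (!mc || !mc->smmu)
		return -MODEL_EPROBE_DEFER;

	return 0;
}
```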

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-05 Thread Thierry Reding
On Mon, Oct 05, 2020 at 12:38:20PM +0300, Dmitry Osipenko wrote:
> 05.10.2020 12:16, Thierry Reding wrote:
> ...
> >> I think you meant regmap in regards to protecting IO accesses, but this
> >> is not what regmap is about if IO accesses are atomic by nature.
> > 
> > Then why is there regmap-mmio?
> 
> To protect programming sequences for example, actually that's the only
> real use-case I saw in kernel drivers once. In our case there are no
> sequences that require protection, at least I'm not aware about that.

True. But I'd still prefer to have a more formal mechanism of handing
out access to registers.

Either way, this isn't very relevant in the case of tegra20-devfreq
because there's really no reason for it to be a separate driver with
device tree lookup since it's all internal to the MC driver. The only
reason (unless I'm missing something) for that is to have the code
located in drivers/devfreq. We can do that without requiring DT lookup
either like we did for tegra-smmu/tegra-mc, or by directly copying the
code into drivers/memory.

It's become fairly common in recent years to group code by IP rather
than functionality. We nowadays have good tools to help with subsystem
wide refactoring that make it much less necessary for subsystem code to
be concentrated in a single directory.

> >> Secondly, you're missing that tegra30-devfreq driver will also need to
> >> perform the MC lookup [3] and then driver will no longer work for the
> >> older DTs if phandle becomes mandatory because these DTs do not have the
> >> MC phandle in the ACTMON node.
> >>
> >> [3]
> >> https://github.com/grate-driver/linux/commit/441d19281f9b3428a4db1ecb3a02e1ec02a8ad7f
> >>
> >> So we will need the fall back for T30/124 as well.
> > 
> > No, we don't need the fallback because this is new functionality which
> > can and should be gated on the existence of the new phandle. If there's
> > no phandle then we have no choice but to use the old code to ensure old
> > behaviour.
> 
> You just repeated what I was trying to say:)
> 
> Perhaps I spent a bit too much time touching that code to the point that
> lost feeling that there is a need to clarify everything in details.

I assumed that by "fall back" you meant the lookup-by-compatible. But
what I was talking about is the fall back to the current code which does
not use the MC device tree node at all. The latter is going to be
required to ensure that the code continues to work as-is. But the former
is not required because we have fall back code that already works with
old device trees. New code that is using the memory controller's timings
nodes can be gated on the existence of the phandle in DT and doing so is
not going to break backwards-compatibility.
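
Reduced to its essence, the gating looks like the following user-space
sketch (the node type, the enum and the function name are all invented
for illustration; in the driver the pointer would come from a phandle
lookup in the device tree):

```c
#include <stddef.h>

/* Hypothetical stand-in for a device tree node. */
struct model_node {
	const char *name;
};

enum timing_source {
	TIMINGS_LEGACY,		/* current code path, no MC node involved */
	TIMINGS_FROM_MC,	/* new code path using the MC's timings nodes */
};

/* mc_phandle is NULL when an older device tree has no reference to the MC. */
static enum timing_source
pick_timing_source(const struct model_node *mc_phandle)
{
	if (!mc_phandle)
		return TIMINGS_LEGACY;

	return TIMINGS_FROM_MC;
}
```

Old device trees never reach the new path, so no lookup-by-compatible is
needed to keep them working.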

But perhaps I was misunderstanding what you were trying to say.

Thierry



Re: [PATCH v4 3/3] iommu/tegra-smmu: Add PCI support

2020-10-05 Thread Thierry Reding
On Thu, Oct 01, 2020 at 11:08:07PM -0700, Nicolin Chen wrote:
> This patch simply adds support for PCI devices.
> 
> Signed-off-by: Nicolin Chen 
> ---
> 
> Changelog
> v3->v4
>  * Dropped !iommu_present() check
>  * Added CONFIG_PCI check in the exit path
> v2->v3
>  * Replaced ternary conditional operator with if-else in .device_group()
>  * Dropped change in tegra_smmu_remove()
> v1->v2
>  * Added error-out labels in tegra_smmu_probe()
>  * Dropped pci_request_acs() since IOMMU core would call it.
> 
>  drivers/iommu/tegra-smmu.c | 37 +++--
>  1 file changed, 27 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 02d02b0c55c4..b701a7b55e84 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -10,6 +10,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -865,7 +866,11 @@ static struct iommu_group 
> *tegra_smmu_device_group(struct device *dev)
>   group->smmu = smmu;
>   group->soc = soc;
>  
> - group->group = iommu_group_alloc();
> + if (dev_is_pci(dev))
> + group->group = pci_device_group(dev);
> + else
> + group->group = generic_device_group(dev);
> +
>   if (IS_ERR(group->group)) {
>   devm_kfree(smmu->dev, group);
>   mutex_unlock(&smmu->lock);
> @@ -1069,22 +1074,32 @@ struct tegra_smmu *tegra_smmu_probe(struct device 
> *dev,
>   iommu_device_set_fwnode(&smmu->iommu, dev->fwnode);
>  
>   err = iommu_device_register(&smmu->iommu);
> - if (err) {
> - iommu_device_sysfs_remove(&smmu->iommu);
> - return ERR_PTR(err);
> - }
> + if (err)
> + goto err_sysfs;
>  
>   err = bus_set_iommu(&platform_bus_type, &tegra_smmu_ops);
> - if (err < 0) {
> - iommu_device_unregister(&smmu->iommu);
> - iommu_device_sysfs_remove(&smmu->iommu);
> - return ERR_PTR(err);
> - }
> + if (err < 0)
> + goto err_unregister;
> +
> +#ifdef CONFIG_PCI
> + err = bus_set_iommu(&pci_bus_type, &tegra_smmu_ops);
> + if (err < 0)
> + goto err_bus_set;
> +#endif
>  
>   if (IS_ENABLED(CONFIG_DEBUG_FS))
>   tegra_smmu_debugfs_init(smmu);
>  
>   return smmu;
> +
> +err_bus_set: __maybe_unused;
> + bus_set_iommu(&platform_bus_type, NULL);
> +err_unregister:
> + iommu_device_unregister(&smmu->iommu);
> +err_sysfs:
> + iommu_device_sysfs_remove(&smmu->iommu);

Can you please switch to label names that are more consistent with the
others in this driver? Notably the ones in tegra_smmu_domain_alloc().
The idea is to describe in the name of the label what's happening at the
label. Something like this, for example:

unset_platform_bus:
	bus_set_iommu(&platform_bus_type, NULL);
unregister:
	iommu_device_unregister(&smmu->iommu);
remove_sysfs:
	iommu_device_sysfs_remove(&smmu->iommu);
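
To show how that reads in a complete function, here is a self-contained
user-space sketch (the state flags and the fail_at parameter are made up
so the unwind order can be exercised; this is not the driver code). Each
label names the cleanup performed at that label, and a failure jumps to
the label that undoes the most recent successful step:

```c
/* Hypothetical setup state; each flag models one registration step. */
struct setup_state {
	int sysfs_added;
	int registered;
	int platform_bus_set;
	int pci_bus_set;
};

/* fail_at selects which step (1..4) reports an error; 0 means none. */
static int setup_all(struct setup_state *s, int fail_at)
{
	int err = -1;

	if (fail_at == 1)
		return err;
	s->sysfs_added = 1;

	if (fail_at == 2)
		goto remove_sysfs;
	s->registered = 1;

	if (fail_at == 3)
		goto unregister;
	s->platform_bus_set = 1;

	if (fail_at == 4)
		goto unset_platform_bus;
	s->pci_bus_set = 1;

	return 0;

unset_platform_bus:
	s->platform_bus_set = 0;
unregister:
	s->registered = 0;
remove_sysfs:
	s->sysfs_added = 0;
	return err;
}
```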

Thierry



Re: [PATCH v4 2/3] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-10-05 Thread Thierry Reding
On Fri, Oct 02, 2020 at 11:58:29AM -0700, Nicolin Chen wrote:
> On Fri, Oct 02, 2020 at 06:02:18PM +0300, Dmitry Osipenko wrote:
> > 02.10.2020 09:08, Nicolin Chen wrote:
> > >  static int tegra_smmu_of_xlate(struct device *dev,
> > >  struct of_phandle_args *args)
> > >  {
> > > + struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
> > > + struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);
> > >   u32 id = args->args[0];
> > >  
> > > + of_node_put(args->np);
> > 
> > of_find_device_by_node() takes device reference and not the np
> > reference. This is a bug, please remove of_node_put().
> 
> Looks like so. Replacing it with put_device(&iommu_pdev->dev);

Putting the put_device() here is wrong, though. You need to make sure
you keep a reference to it as long as you keep accessing the data that
is owned by it.

Like I said earlier, this is a bit weird in this case because we're
self-referencing, so iommu_pdev->dev is going to stay around as long as
the SMMU is. However, it might be worth properly tracking the lifetime
anyway just so that the code can serve as a good example of how to do
things.

If you decide to go for the shortcut and not track this reference
properly, then at least you need to add a comment as to why it is safe
to do in this case. This ensures that readers are aware of the
circumstances and don't copy this bad code into a context where the
circumstances are different.
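
For illustration, tracking the lifetime properly could look roughly like the
user-space sketch below. All names (fake_device, fake_client,
fake_find_device() and so on) are hypothetical stand-ins for struct device and
of_find_device_by_node(); this is not the tegra-smmu code. The reference is
taken when the borrowed data is first stored and is only dropped once that
data is no longer used.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a reference-counted struct device. */
struct fake_device {
	int refcount;
	void *drvdata;
};

/* Hypothetical per-client state caching data owned by the provider. */
struct fake_client {
	struct fake_device *provider;	/* reference held while non-NULL */
	void *priv;			/* borrowed from provider->drvdata */
};

/* Like a find-by-node helper: returns the device with a reference taken. */
static struct fake_device *fake_find_device(struct fake_device *dev)
{
	if (dev)
		dev->refcount++;
	return dev;
}

static void fake_put_device(struct fake_device *dev)
{
	if (dev)
		dev->refcount--;
}

/* Take the reference and keep it for as long as priv is in use. */
static int fake_client_attach(struct fake_client *client,
			      struct fake_device *node)
{
	struct fake_device *provider = fake_find_device(node);

	if (!provider)
		return -1;

	client->provider = provider;
	client->priv = provider->drvdata;
	return 0;
}

/* Drop the reference only after the borrowed data is no longer used. */
static void fake_client_release(struct fake_client *client)
{
	client->priv = NULL;
	fake_put_device(client->provider);
	client->provider = NULL;
}
```

The shortcut variant would drop the reference right after reading drvdata,
which is only safe when the provider is known to outlive the client.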

Thierry



Re: [PATCH v4 2/3] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-10-05 Thread Thierry Reding
On Fri, Oct 02, 2020 at 05:50:08PM +0300, Dmitry Osipenko wrote:
> 02.10.2020 17:22, Dmitry Osipenko wrote:
> >>  static int tegra_smmu_of_xlate(struct device *dev,
> >>   struct of_phandle_args *args)
> >>  {
> >> +  struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
> >> +  struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);
> >>u32 id = args->args[0];
> >>  
> >> +  of_node_put(args->np);
> >> +
> >> +  if (!mc || !mc->smmu)
> >> +  return -EPROBE_DEFER;
> > platform_get_drvdata(NULL) will crash.
> > 
> 
> Actually, platform_get_drvdata(NULL) can't happen. I overlooked this.

How so? It's technically possible for the iommus property to reference a
device tree node for which no platform device will ever be created, in
which case of_find_device_by_node() will return NULL. That's very
unlikely and perhaps worth just crashing on to make sure it gets fixed
immediately.

Thierry



Re: [PATCH v4 2/3] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-10-05 Thread Thierry Reding
On Fri, Oct 02, 2020 at 05:22:41PM +0300, Dmitry Osipenko wrote:
> 02.10.2020 09:08, Nicolin Chen wrote:
> >  static int tegra_smmu_of_xlate(struct device *dev,
> >struct of_phandle_args *args)
> >  {
> > +   struct platform_device *iommu_pdev = of_find_device_by_node(args->np);
> > +   struct tegra_mc *mc = platform_get_drvdata(iommu_pdev);
> > u32 id = args->args[0];
> >  
> > +   of_node_put(args->np);
> > +
> > +   if (!mc || !mc->smmu)
> > +   return -EPROBE_DEFER;
> 
> platform_get_drvdata(NULL) will crash.
> 
> > +   dev_iommu_priv_set(dev, mc->smmu);
> 
> I think put_device(mc->dev) is missed here, doesn't it?

Yeah, I think we'd need that here, otherwise we'd be leaking a
reference. Worse, even, mc->dev is the same device that owns the SMMU,
so we're basically incrementing our own reference here and never
releasing it. We also need that put_device(mc->dev) in the error case
above because we already hold the reference there.
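
A rough user-space sketch of what balanced reference handling on both paths
might look like follows. The types and helpers (fake_pdev, fake_mc,
demo_of_xlate() and so on) are hypothetical, the explicit NULL check and its
error code are assumptions rather than something the patch does, and dropping
the reference right after storing the pointer is only safe if the provider is
known to outlive the client.

```c
#include <assert.h>
#include <stddef.h>

#define FAKE_ENODEV		19	/* illustrative error codes for this sketch */
#define FAKE_EPROBE_DEFER	517

struct fake_mc {
	void *smmu;			/* NULL until the SMMU is ready */
};

struct fake_pdev {
	int refcount;
	struct fake_mc *drvdata;
};

struct fake_client {
	void *iommu_priv;
};

/* Like a find-by-node helper: returns the device with a reference taken. */
static struct fake_pdev *fake_find_pdev(struct fake_pdev *pdev)
{
	if (pdev)
		pdev->refcount++;
	return pdev;
}

static void fake_put_pdev(struct fake_pdev *pdev)
{
	pdev->refcount--;
}

static int demo_of_xlate(struct fake_client *client, struct fake_pdev *node)
{
	struct fake_pdev *pdev = fake_find_pdev(node);
	struct fake_mc *mc;
	int err = 0;

	if (!pdev)	/* no device was ever created for this node */
		return -FAKE_ENODEV;

	mc = pdev->drvdata;
	if (!mc || !mc->smmu) {
		err = -FAKE_EPROBE_DEFER;
		goto put;	/* the reference is already held here */
	}

	client->iommu_priv = mc->smmu;
put:
	fake_put_pdev(pdev);
	return err;
}
```

Both the deferral path and the success path leave the reference count where
it started.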

Thierry



Re: [PATCH v4 2/3] iommu/tegra-smmu: Rework tegra_smmu_probe_device()

2020-10-05 Thread Thierry Reding
On Fri, Oct 02, 2020 at 12:53:28PM -0700, Nicolin Chen wrote:
> On Fri, Oct 02, 2020 at 05:58:29PM +0300, Dmitry Osipenko wrote:
> > > 02.10.2020 17:22, Dmitry Osipenko wrote:
> > > > 02.10.2020 09:08, Nicolin Chen wrote:
> > >> -static void tegra_smmu_release_device(struct device *dev)
> > >> -{
> > >> -dev_iommu_priv_set(dev, NULL);
> > >> -}
> > >> +static void tegra_smmu_release_device(struct device *dev) {}
> > > 
> > > Please keep the braces as-is.
> > > 
> > 
> > I noticed that you borrowed this style from the sun50i-iommu driver, but
> > this is a bit unusual coding style for the c files. At least to me it's
> > unusual to see header-style function stub in a middle of c file. But
> > maybe it's just me.
> 
> I don't see a rule in ./Documentation/process/coding-style.rst
> against this, and there're plenty of drivers doing so. If you
> feel uncomfortable with this style, you may add a rule to that
> doc so everyone will follow :)

I also prefer braces on separate lines. Even better would be to just
drop this entirely and make ->release_device() optional. At least the
following drivers could be cleaned up that way:

* fsl-pamu
* msm-iommu
* sun50i-iommu
* tegra-gart
* tegra-smmu

And it looks like mtk-iommu and mtk-iommu-v1 do only iommu_fwspec_free()
in their ->release_device() implementations, but that's already done via
iommu_release_device().
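
As a sketch of what "optional" would mean for the caller, assuming a
hypothetical ops structure and core function (fake_iommu_ops and
fake_core_release() are illustrative names, not the IOMMU core API): the core
checks the callback for NULL before calling it, so drivers with nothing to do
can simply leave it unset.

```c
#include <assert.h>
#include <stddef.h>

struct fake_dev {
	int released;
};

struct fake_iommu_ops {
	void (*release_device)(struct fake_dev *dev);	/* may be NULL */
};

static int core_cleanup_count;

/* Hypothetical core-side release path. */
static void fake_core_release(const struct fake_iommu_ops *ops,
			      struct fake_dev *dev)
{
	if (ops->release_device)	/* only call the hook if the driver set one */
		ops->release_device(dev);

	core_cleanup_count++;		/* stands in for the common core cleanup */
}

/* A driver that does have per-device state to tear down. */
static void custom_release(struct fake_dev *dev)
{
	dev->released = 1;
}
```

With this shape, an empty stub in each driver becomes unnecessary because a
missing callback is treated the same as one that does nothing.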

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-05 Thread Thierry Reding
On Mon, Oct 05, 2020 at 11:14:27AM +0300, Dmitry Osipenko wrote:
> 05.10.2020 10:13, Thierry Reding wrote:
> ...
> > Have you also seen that sun50i-iommu does look up the SMMU from a
> > phandle using of_find_device_by_node()? So I think you've shown yourself
> > that even "modern" drivers avoid global pointers and look up via
> > phandle.
> 
> I have no problem with the lookup by phandle and I'm all for it. It's
> now apparent to me that you completely missed my point, but that should
> be my fault that I haven't conveyed it properly from the start. I just
> wanted to avoid the incompatible DT changes which could break older DTs
> + I simply wanted to improve the older code without introducing new
> features, that's it.
> 
> Anyways, after your comments I started to look at how the interconnect
> patches could be improved and found new things, like that OPPs now
> support ICC and that EMC has a working EMC_STAT, I also discovered
> syscon and simple-mfd. This means that we won't need the global pointers
> at all neither for SMMU, nor for interconnect, nor for EMC drivers :)

Well, evidently discussion on mailing lists actually works. =)

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-05 Thread Thierry Reding
On Thu, Oct 01, 2020 at 10:04:30PM +0300, Dmitry Osipenko wrote:
> ...
> >> There are dozens variants of the panels and system could easily have
> >> more than one panel, hence a direct lookup by phandle is a natural
> >> choice for the panels.
> > 
> > Not really, there's typically only just one panel. But that's just one
> > example. EMC would be another. There's only a single EMC on Tegra and
> > yet for something like interconnects we still reference it by phandle.
> 
> Interconnect is a generic API.
> 
> > PMC is another case and so is CAR, GPIO (at least on early Tegra SoCs)
> > and pinmux, etc.
> > 
> > The example of GPIO shows very well how this is important. If we had
> > made the assumption from the beginning that there was only ever going to
> > be a single GPIO controller, then we would've had a big problem when the
> > first SoC shipped that had multiple GPIO controllers.
> 
> This is true, but only if all these words are applied to the generic APIs.

The reason why this is true for generic APIs is because people actually
think about generic APIs a bit more than about custom APIs because they
need to be more future-proof.

That doesn't make it wrong to think hard about using custom APIs because
we also want those to be somewhat future-proof.

> >> While all Tegra SoCs have a single fixed MC in the system, and thus,
> >> there is no real need to use phandle because we can't mix up MC with
> >> anything else.
> > 
> > The same is true for the SMMU, and yet the iommus property references
> > the SMMU by phandle. There are a *lot* of cases where you could imply
> > dependencies because you have intimate knowledge about the hardware
> > within drivers. But the point is to avoid this wherever possible so
> > that the DTB is as self-describing as possible.
> > 
>  older DTs if DT change will be needed. Please give a detailed 
>  explanation.
> >>>
> >>> New functionality doesn't have to work with older DTs.
> >>
> >> This is fine in general, but I'm afraid that in this particular case we
> >> will need to have a fall back anyways because otherwise it should break
> >> the old functionality.
> > 
> > It looks like tegra20-devfreq is the only one that currently does this
> > lookup via compatible string. And looking at the driver, what it does is
> > pretty horrible, to be honest. It gets a reference to the memory
> > controller and then simply accesses registers within the memory
> > controller without any type of protection against concurrent accesses or
> > reference counting to make sure the registers it accesses are still
> > valid. At the very least this should've been a regmap.
> 
> Regmap is about abstracting accesses to devices that may sit on
> different types of buses, like I2C or SPI for example. Or devices that
> have a non-trivial registers mapping, or have slow IO and need caching.

Those are common uses, yes.

> I think you meant regmap in regards to protecting IO accesses, but this
> is not what regmap is about if IO accesses are atomic by nature.

Then why is there regmap-mmio?

> The tegra20-devfreq functionality is very separated from the rest of the
> memory controller, hence there are no conflicts in regards to hardware
> accesses, so there is nothing to protect.
> 
> Also, Regmap API itself doesn't manage refcounting of the mappings.

That may be true now, but at least it is something formal rather than
just dereferencing some pointer and accessing registers through it. If
this ever becomes a problem it's something that we can more easily
address.

> > And not
> > coincidentally, regmaps are usually passed around by referencing their
> > provider via phandle.
> 
> Any real-world examples? I think you're mixing up regmap with something
> else.

syscon is the most obvious that comes to mind. It is meant to address
the kind of use-case that tegra20-devfreq apparently needs here, where
you have registers for certain functionality that are located in a
completely different IP block from the rest of that functionality. Often
there are better alternatives to solve this, by reusing existing
infrastructure, such as pinmux.

In cases where no subsystem exists we typically use syscon, which are
implemented via regmaps, to gain access to shared registers.

> The devfreq driver works just like the SMMU and GART. The devfreq device
> is supposed to be created only once both MC and EMC drivers are loaded
> and we know that they can't go away [1].
> 
> [1]
> https://patchwork.ozlabs.org/project/linux-tegra/patch/20200814000621.8415-32-dig...@gmail.com/

Huh... why is the tegra20-devfreq device instantiated from the EMC
driver? That doesn't make any sense to me. If there aren't any registers
that the driver accesses, then it would make more sense to subsume that
functionality under some different driver (tegra20-mc most likely by
the looks of things).

On a side-note: once we move tegra20-devfreq into tegra20-mc, there's no
need for this look up at all anymore.

> Hence the 

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-05 Thread Thierry Reding
On Fri, Oct 02, 2020 at 04:55:34AM +0300, Dmitry Osipenko wrote:
> 02.10.2020 04:07, Nicolin Chen wrote:
> > On Thu, Oct 01, 2020 at 11:33:38PM +0300, Dmitry Osipenko wrote:
> > If we can't come to an agreement on globalizing mc pointer, would
> > it be possible to pass tegra_mc_driver through tegra_smmu_probe()
> > so we can continue to use driver_find_device_by_fwnode() as v1?
> >
> > v1: https://lkml.org/lkml/2020/9/26/68
> 
>  tegra_smmu_probe() already takes a struct tegra_mc *. Did you mean
>  tegra_smmu_probe_device()? I don't think we can do that because it isn't
> >>>
> >>> I was saying to have a global parent_driver pointer: similar to
> >>> my v1, yet rather than "extern" the tegra_mc_driver, we pass it
> >>> through tegra_smmu_probe() and store it in a static global value
> >>> so as to call tegra_smmu_get_by_fwnode() in ->probe_device().
> >>>
> >>> Though I agree that creating a global device pointer (mc) might
> >>> be controversial, yet having a global parent_driver pointer may
> >>> not be against the rule, considering that it is common in iommu
> >>> drivers to call driver_find_device_by_fwnode in probe_device().
> >>
> >> You don't need the global pointer if you have SMMU OF node.
> >>
> >> You could also get driver pointer from mc->dev->driver.
> >>
> >> But I don't think you need to do this at all. The probe_device() could
> >> be invoked only for the tegra_smmu_ops and then seems you could use
> >> dev_iommu_priv_set() in tegra_smmu_of_xlate(), like sun50i-iommu driver
> >> does.
> > 
> > Getting iommu device pointer using driver_find_device_by_fwnode()
> > is a common practice in ->probe_device() of other iommu drivers.
> 
> Please give me a full list of the IOMMU drivers which use this method.

ARM SMMU and ARM SMMU v3 do this and so does virtio-iommu. Pretty much
all the other drivers for ARM platforms have their own variations of
tegra_smmu_find() using of_find_device_by_node() at some point.

What others do differently is that they call of_find_device_by_node()
from ->of_xlate(), which is notably different from what we do in
tegra-smmu (where we call it from ->probe_device()). It's entirely
possible that we can do that as well, which is what we've been
discussing in a different sub-thread, but like I mentioned there I do
recall that being problematic, otherwise I wouldn't have left all the
comments in the code.

If we can determine that moving this to ->of_xlate() works fine in all
cases, then I think that's something that we should do for tegra-smmu to
become more consistent with other drivers.
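
To make the difference to a global pointer concrete, here is a small
user-space sketch of a lookup done at translate time. Everything in it is
hypothetical (fake_node, fake_provider, demo_xlate() and the table-based
lookup are stand-ins, not kernel interfaces): the provider is found through
the node that the client references, and the result is stored in per-client
private data.

```c
#include <assert.h>
#include <stddef.h>

struct fake_node {
	const char *name;
};

struct fake_smmu {
	int id;
};

/* A registered provider, keyed by the node it was created from. */
struct fake_provider {
	const struct fake_node *node;
	struct fake_smmu *smmu;
};

/* A client that references its IOMMU by node, as a phandle would. */
struct fake_client {
	const struct fake_node *iommus;
	struct fake_smmu *iommu_priv;
};

static struct fake_provider *fake_find_by_node(struct fake_provider *table,
					       size_t count,
					       const struct fake_node *node)
{
	size_t i;

	for (i = 0; i < count; i++)
		if (table[i].node == node)
			return &table[i];

	return NULL;
}

/* Resolve the referenced provider and remember it per client. */
static int demo_xlate(struct fake_client *client,
		      struct fake_provider *table, size_t count)
{
	struct fake_provider *p = fake_find_by_node(table, count,
						    client->iommus);

	if (!p || !p->smmu)
		return -1;	/* provider missing or not ready yet */

	client->iommu_priv = p->smmu;
	return 0;
}
```

The lookup keeps working unchanged if a second provider is ever added, which
a single global pointer would not.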

> > But this requires a device_driver pointer that tegra-smmu doesn't
> > have. So passing tegra_mc_driver through tegra_smmu_probe() will
> > address it.
> > 
> 
> If you're borrowing code and ideas from other drivers, then at least
> please borrow them from a modern good-looking drivers. And I already
> pointed out that following cargo cult is not always a good idea.
> 
> ARM-SMMU isn't a modern driver and it has legacy code. You shouldn't
> copy it blindly. The sun50i-iommu driver was added half year ago, you
> may use it as a reference.

That's nonsense. There's no such thing as "modern" drivers in Linux
because they are constantly improved. Yes, ARM SMMU may have legacy code
paths, but that's because it has been around for much longer than others
and therefore is much more mature.

I can't say much about sun50i-iommu because I'm not familiar with it,
but I have seen plenty of "modern" drivers that turn out to be much
worse than "old" drivers. New doesn't always mean better.

> Always consult the IOMMU core code. If you're too unsure about
> something, then maybe better to start a new thread and ask Joerg about
> the best modern practices that IOMMU drivers should use.

This doesn't really have anything to do with the IOMMU core code. This
has to do with platform and firmware code, so the IOMMU core is only
marginally involved.

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-05 Thread Thierry Reding
On Thu, Oct 01, 2020 at 11:33:38PM +0300, Dmitry Osipenko wrote:
> 01.10.2020 14:04, Nicolin Chen wrote:
> > On Thu, Oct 01, 2020 at 12:23:16PM +0200, Thierry Reding wrote:
> >>>>>>>>>> It looks to me like the only reason why you need this new global API is
> >>>>>>>>>> because PCI devices may not have a device tree node with a phandle to
> >>>>>>>>>> the IOMMU. However, SMMU support for PCI will only be enabled if the
> >>>>>>>>>> root complex has an iommus property, right? In that case, can't we
> >>>>>>>>>> simply do something like this:
> >>>>>>>>>>
> >>>>>>>>>>    if (dev_is_pci(dev))
> >>>>>>>>>>        np = find_host_bridge(dev)->of_node;
> >>>>>>>>>>    else
> >>>>>>>>>>        np = dev->of_node;
> > 
> >>> I personally am not a fan of adding a path for PCI device either,
> >>> since PCI/IOMMU cores could have taken care of it while the same
> >>> path can't be used for other buses.
> >>
> >> There's already plenty of other drivers that do something similar to
> >> this. Take a look at the arm-smmu driver, for example, which seems to be
> >> doing exactly the same thing to finding the right device tree node to
> >> look at (see dev_get_dev_node() in drivers/iommu/arm-smmu/arm-smmu.c).
> > 
> > Hmm..okay..that is quite convincing then...
> 
> Not very convincing to me. I don't see a "plenty of other drivers",
> there is only one arm-smmu driver.

There's ARM SMMU, ARM SMMU v3 and at least FSL PAMU. Even some of the
x86 platforms use dev_is_pci() to special-case PCI devices. That's just
because PCI is fundamentally different from fixed devices such as those
on a platform bus.

> The dev_get_dev_node() is under CONFIG_ARM_SMMU_LEGACY_DT_BINDINGS (!).
> Guys, doesn't it look strange to you? :)

See above, there are other cases where PCI devices are treated specially.
For example, pretty much every driver that supports PCI differentiates
between PCI and other devices in their ->device_group() callback.

> The arm-smmu driver does a similar thing for the modern bindings to what
> Nicolin's v3 is doing.

I don't really have any objections to doing something similar to what
ARM SMMU does. My main objection is to the use of a global pointer that
is used to look up the SMMU. As you see, the ARM SMMU driver also does
this lookup via driver_find_device_by_fwnode() rather than storing a
global pointer.

Also you can't quite equate ARM SMMU with Tegra SMMU. ARM SMMU can
properly deal with devices behind a PCI host bridge, whereas on Tegra
all those devices are thrown in the same bucket with the same IOMMU
domain. So it's to be expected that some things will have to be
different between the two drivers.

> >>> If we can't come to an agreement on globalizing mc pointer, would
> >>> it be possible to pass tegra_mc_driver through tegra_smmu_probe()
> >>> so we can continue to use driver_find_device_by_fwnode() as v1?
> >>>
> >>> v1: https://lkml.org/lkml/2020/9/26/68
> >>
> >> tegra_smmu_probe() already takes a struct tegra_mc *. Did you mean
> >> tegra_smmu_probe_device()? I don't think we can do that because it isn't
> > 
> > I was saying to have a global parent_driver pointer: similar to
> > my v1, yet rather than "extern" the tegra_mc_driver, we pass it
> > through tegra_smmu_probe() and store it in a static global value
> > so as to call tegra_smmu_get_by_fwnode() in ->probe_device().
> > 
> > Though I agree that creating a global device pointer (mc) might
> > be controversial, yet having a global parent_driver pointer may
> > not be against the rule, considering that it is common in iommu
> > drivers to call driver_find_device_by_fwnode in probe_device().
> 
> You don't need the global pointer if you have SMMU OF node.
> 
> You could also get driver pointer from mc->dev->driver.
> 
> But I don't think you need to do this at all. The probe_device() could
> be invoked only for the tegra_smmu_ops and then seems you could use
> dev_iommu_priv_set() in tegra_smmu_of_xlate(), like sun50i-iommu driver
> does.

Have you also seen that sun50i-iommu does look up the SMMU from a
phandle using of_find_device_by_node()? So I think you've shown yourself
that even "modern" drivers avoid global pointers and look up via
phandle.

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-01 Thread Thierry Reding
On Wed, Sep 30, 2020 at 01:36:18PM -0700, Nicolin Chen wrote:
> On Wed, Sep 30, 2020 at 05:31:31PM +0200, Thierry Reding wrote:
> > On Wed, Sep 30, 2020 at 01:42:57AM -0700, Nicolin Chen wrote:
> > > Previously the driver relies on bus_set_iommu() in .probe() to call
> > > in .probe_device() function so each client can poll iommus property
> > > in DTB to configure fwspec via tegra_smmu_configure(). According to
> > > the comments in .probe(), this is a bit of a hack. And this doesn't
> > > work for a client that doesn't exist in DTB, PCI device for example.
> > > 
> > > Actually when a device/client gets probed, the of_iommu_configure()
> > > will call in .probe_device() function again, with a prepared fwspec
> > > from of_iommu_configure() that reads the SWGROUP id in DTB as we do
> > > in tegra-smmu driver.
> > > 
> > > Additionally, as a new helper devm_tegra_get_memory_controller() is
> > > introduced, there's no need to poll the iommus property in order to
> > > get mc->smmu pointers or SWGROUP id.
> > > 
> > > This patch reworks .probe_device() and .attach_dev() by doing:
> > > 1) Using fwspec to get swgroup id in .attach_dev/.detach_dev()
> > > 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> > > 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> > > 4) Also dropping the hack in .probe() that's no longer needed.
> > > 
> > > Signed-off-by: Nicolin Chen 
> [...]
> > >  static struct iommu_device *tegra_smmu_probe_device(struct device *dev)
> > >  {
> > > - struct device_node *np = dev->of_node;
> > > - struct tegra_smmu *smmu = NULL;
> > > - struct of_phandle_args args;
> > > - unsigned int index = 0;
> > > - int err;
> > > -
> > > - while (of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> > > -   &args) == 0) {
> > > - smmu = tegra_smmu_find(args.np);
> > > - if (smmu) {
> > > - err = tegra_smmu_configure(smmu, dev, &args);
> > > - of_node_put(args.np);
> > > -
> > > - if (err < 0)
> > > - return ERR_PTR(err);
> > > -
> > > - /*
> > > -  * Only a single IOMMU master interface is currently
> > > -  * supported by the Linux kernel, so abort after the
> > > -  * first match.
> > > -  */
> > > - dev_iommu_priv_set(dev, smmu);
> > > -
> > > - break;
> > > - }
> > > + struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> > > + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > 
> > It looks to me like the only reason why you need this new global API is
> > because PCI devices may not have a device tree node with a phandle to
> > the IOMMU. However, SMMU support for PCI will only be enabled if the
> > root complex has an iommus property, right? In that case, can't we
> > simply do something like this:
> > 
> > if (dev_is_pci(dev))
> > np = find_host_bridge(dev)->of_node;
> > else
> > np = dev->of_node;
> > 
> > ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > sure that exists.
> > 
> > Once we have that we can still iterate over the iommus property and do
> > not need to rely on this global variable.
> 
> I agree that it'd work. But I was hoping to simplify the code
> here if it's possible. Looks like we have an argument on this
> so I will choose to go with your suggestion above for now.
> 
> > > - of_node_put(args.np);
> > > - index++;
> > > - }
> > > + /* An invalid mc pointer means mc and smmu drivers are not ready */
> > > + if (IS_ERR(mc))
> > > + return ERR_PTR(-EPROBE_DEFER);
> > >  
> > > - if (!smmu)
> > > + /*
> > > +  * IOMMU core allows -ENODEV return to carry on. So bypass any call
> > > +  * from bus_set_iommu() during tegra_smmu_probe(), as a device will
> > > +  * call in again via of_iommu_configure when fwspec is prepared.
> > > +  */
> > > + if (!mc->smmu || !fwspec || fwspec->ops != &tegra_smmu_ops)
> > >   return ERR_PTR(-ENODEV);
> > >  
> > > - return &smmu->iommu;
> > > + dev_iommu_priv_set(dev, mc->smmu);
> > >

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-01 Thread Thierry Reding
On Thu, Oct 01, 2020 at 03:33:19AM -0700, Nicolin Chen wrote:
> On Thu, Oct 01, 2020 at 11:51:52AM +0200, Thierry Reding wrote:
> > > > >> ...
> > > > > >>>> It looks to me like the only reason why you need this new global API is
> > > > > >>>> because PCI devices may not have a device tree node with a phandle to
> > > > > >>>> the IOMMU. However, SMMU support for PCI will only be enabled if the
> > > > > >>>> root complex has an iommus property, right? In that case, can't we
> > > > > >>>> simply do something like this:
> > > > > >>>>
> > > > > >>>>    if (dev_is_pci(dev))
> > > > > >>>>        np = find_host_bridge(dev)->of_node;
> > > > > >>>>    else
> > > > > >>>>        np = dev->of_node;
> > > > > >>>>
> > > > > >>>> ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > > > > >>>> sure that exists.
> 
> > > @@ -814,12 +815,15 @@ static struct tegra_smmu *tegra_smmu_find(struct 
> > > device_node *np)
> > >  }
> > >  
> > >  static int tegra_smmu_configure(struct tegra_smmu *smmu, struct device 
> > > *dev,
> > > - struct of_phandle_args *args)
> > > + struct of_phandle_args *args, struct 
> > > fwnode_handle *fwnode)
> > >  {
> > >   const struct iommu_ops *ops = smmu->iommu.ops;
> > >   int err;
> > >  
> > > - err = iommu_fwspec_init(dev, >of_node->fwnode, ops);
> > > + if (!fwnode)
> > > + return -ENOENT;
> > > +
> > > + err = iommu_fwspec_init(dev, fwnode, ops);
> > >   if (err < 0) {
> > >   dev_err(dev, "failed to initialize fwspec: %d\n", err);
> > >   return err;
> > > @@ -835,6 +839,19 @@ static int tegra_smmu_configure(struct tegra_smmu 
> > > *smmu, struct device *dev,
> > >   return 0;
> > >  }
> > >  
> > > +static struct device_node *tegra_smmu_find_pci_np(struct pci_dev 
> > > *pci_dev)
> > > +{
> > > + struct pci_bus *bus = pci_dev->bus;
> > > + struct device *dev = >dev;
> > > +
> > > + while (!of_property_read_bool(dev->of_node, "iommus") && bus->parent) {
> > > + dev = >parent->dev;
> > > + bus = bus->parent;
> > > + }
> > > +
> > > + return dev->of_node;
> > > +}
> > 
> > This seems like it's the equivalent of pci_get_host_bridge_device(). Can
> > you use that instead? I think you might use the parent of the host
> > bridge that's returned from that function, though.
> 
> I noticed that one when looking up one of the of_ functions, yet
> also found that this pci_get_host_bridge_device() is privated by
> https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/pci/pci.h?id=975e1ac173058b8710e5979e97fc1397233301f3
> 
> Would PCI folks be that willing to (allow to) revert it?

Yeah, sounds like that would be useful. If you do, perhaps also take the
opportunity to replace open-coded variants, such as the one in arm-smmu.

Either that, or open-code this in tegra-smmu, like arm-smmu does.
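
An open-coded walk could look roughly like the following user-space sketch.
The structures (fake_bus, fake_of_node) and the function name are
hypothetical stand-ins for the PCI bus hierarchy and device tree nodes. Note
one deliberate difference from the quoted helper: this version returns NULL
when no ancestor carries an iommus property, instead of returning the last
node it visited.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device tree node; has_iommus models an "iommus" property. */
struct fake_of_node {
	const char *name;
	bool has_iommus;
};

/* Hypothetical bus; the root bus has parent == NULL. */
struct fake_bus {
	struct fake_bus *parent;
	struct fake_of_node *of_node;	/* may be NULL for buses without a node */
};

/* Walk towards the root until a node with an iommus property is found. */
static struct fake_of_node *fake_find_iommu_node(struct fake_bus *bus)
{
	while (bus) {
		if (bus->of_node && bus->of_node->has_iommus)
			return bus->of_node;

		bus = bus->parent;
	}

	return NULL;
}
```

A device on a nested bus thus resolves to the node of the nearest ancestor
that references the IOMMU, which would typically be the host bridge.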

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-01 Thread Thierry Reding
On Wed, Sep 30, 2020 at 07:48:50PM -0700, Nicolin Chen wrote:
> On Thu, Oct 01, 2020 at 05:06:19AM +0300, Dmitry Osipenko wrote:
> > 01.10.2020 04:26, Nicolin Chen wrote:
> > > On Thu, Oct 01, 2020 at 12:56:46AM +0300, Dmitry Osipenko wrote:
> > >> 01.10.2020 00:32, Nicolin Chen wrote:
> > >>> On Thu, Oct 01, 2020 at 12:24:25AM +0300, Dmitry Osipenko wrote:
> >  ...
> > > >> It looks to me like the only reason why you need this new global API is
> > > >> because PCI devices may not have a device tree node with a phandle to
> > > >> the IOMMU. However, SMMU support for PCI will only be enabled if the
> > > >> root complex has an iommus property, right? In that case, can't we
> > > >> simply do something like this:
> > > >>
> > > >>    if (dev_is_pci(dev))
> > > >>        np = find_host_bridge(dev)->of_node;
> > > >>    else
> > > >>        np = dev->of_node;
> > > >>
> > > >> ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > > >> sure that exists.
> > > >>
> > > >> Once we have that we can still iterate over the iommus property and do
> > > >> not need to rely on this global variable.
> > >
> > > I agree that it'd work. But I was hoping to simplify the code
> > > here if it's possible. Looks like we have an argument on this
> > > so I will choose to go with your suggestion above for now.
> > 
> >  This patch removed more lines than were added. If this will be opposite
> >  for the Thierry's suggestion, then it's probably not a great 
> >  suggestion.
> > >>>
> > >>> Sorry, I don't quite understand this comments. Would you please
> > >>> elaborate what's this "it" being "not a great suggestion"?
> > >>>
> > >>
> > >> I meant that you should try to implement Thierry's solution, but if the
> > >> end result will be worse than the current patch, then you shouldn't make
> > >> a v4, but get back to this discussion in order to choose the best option
> > >> and make everyone agree on it.
> > > 
> > > I see. Thanks for the reply. And here is a sample implementation:
> > 
> > That's what I supposed to happen :) The new variant adds code and
> > complexity, while old did the opposite. Hence the old variant is clearly
> > more attractive, IMO.
> 
> I personally am not a fan of adding a path for PCI device either,
> since PCI/IOMMU cores could have taken care of it while the same
> path can't be used for other buses.

There are already plenty of other drivers that do something similar to
this. Take a look at the arm-smmu driver, for example, which seems to be
doing exactly the same thing to find the right device tree node to
look at (see dev_get_dev_node() in drivers/iommu/arm-smmu/arm-smmu.c).

> If we can't come to an agreement on globalizing mc pointer, would
> it be possible to pass tegra_mc_driver through tegra_smmu_probe()
> so we can continue to use driver_find_device_by_fwnode() as v1?
> 
> v1: https://lkml.org/lkml/2020/9/26/68

tegra_smmu_probe() already takes a struct tegra_mc *. Did you mean
tegra_smmu_probe_device()? I don't think we can do that because it isn't
known at that point whether MC really is the SMMU. That's in fact the
whole reason why we have to go through this whole dance of iterating
over the iommus entries to find the SMMU.

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-01 Thread Thierry Reding
On Thu, Oct 01, 2020 at 05:06:19AM +0300, Dmitry Osipenko wrote:
> 01.10.2020 04:26, Nicolin Chen wrote:
> > On Thu, Oct 01, 2020 at 12:56:46AM +0300, Dmitry Osipenko wrote:
> >> 01.10.2020 00:32, Nicolin Chen wrote:
> >>> On Thu, Oct 01, 2020 at 12:24:25AM +0300, Dmitry Osipenko wrote:
> >>>> ...
> >>>>>> It looks to me like the only reason why you need this new global API is
> >>>>>> because PCI devices may not have a device tree node with a phandle to
> >>>>>> the IOMMU. However, SMMU support for PCI will only be enabled if the
> >>>>>> root complex has an iommus property, right? In that case, can't we
> >>>>>> simply do something like this:
> >>>>>>
> >>>>>>	if (dev_is_pci(dev))
> >>>>>>		np = find_host_bridge(dev)->of_node;
> >>>>>>	else
> >>>>>>		np = dev->of_node;
> >>>>>>
> >>>>>> ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> >>>>>> sure that exists.
> >>>>>>
> >>>>>> Once we have that we can still iterate over the iommus property and do
> >>>>>> not need to rely on this global variable.
> >>>>>
> >>>>> I agree that it'd work. But I was hoping to simplify the code
> >>>>> here if it's possible. Looks like we have an argument on this
> >>>>> so I will choose to go with your suggestion above for now.
> >>>>
> >>>> This patch removed more lines than were added. If this will be opposite
> >>>> for the Thierry's suggestion, then it's probably not a great suggestion.
> >>>
> >>> Sorry, I don't quite understand this comments. Would you please
> >>> elaborate what's this "it" being "not a great suggestion"?
> >>>
> >>
> >> I meant that you should try to implement Thierry's solution, but if the
> >> end result will be worse than the current patch, then you shouldn't make
> >> a v4, but get back to this discussion in order to choose the best option
> >> and make everyone agree on it.
> > 
> > I see. Thanks for the reply. And here is a sample implementation:
> 
> That's what I supposed to happen :) The new variant adds code and
> complexity, while old did the opposite. Hence the old variant is clearly
> more attractive, IMO.

Surely code size can't be the only measure of good code. You can fit the
above on even fewer lines if you sacrifice readability. In this case you
can strip away those lines because you're effectively using a global
variable.

So there's always a compromise and I think in this case it's not a good
one because we sacrifice explicit code that clearly documents what's
going on for less code that's a bit handwavy about what's happening.

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-01 Thread Thierry Reding
On Wed, Sep 30, 2020 at 06:26:30PM -0700, Nicolin Chen wrote:
> On Thu, Oct 01, 2020 at 12:56:46AM +0300, Dmitry Osipenko wrote:
> > 01.10.2020 00:32, Nicolin Chen пишет:
> > > On Thu, Oct 01, 2020 at 12:24:25AM +0300, Dmitry Osipenko wrote:
> > >> ...
> >  It looks to me like the only reason why you need this new global API is
> >  because PCI devices may not have a device tree node with a phandle to
> >  the IOMMU. However, SMMU support for PCI will only be enabled if the
> >  root complex has an iommus property, right? In that case, can't we
> >  simply do something like this:
> > 
> > if (dev_is_pci(dev))
> > np = find_host_bridge(dev)->of_node;
> > else
> > np = dev->of_node;
> > 
> >  ? I'm not sure exactly what find_host_bridge() is called, but I'm 
> >  pretty
> >  sure that exists.
> > 
> >  Once we have that we can still iterate over the iommus property and do
> >  not need to rely on this global variable.
> > >>>
> > >>> I agree that it'd work. But I was hoping to simplify the code
> > >>> here if it's possible. Looks like we have an argument on this
> > >>> so I will choose to go with your suggestion above for now.
> > >>
> > >> This patch removed more lines than were added. If this will be opposite
> > >> for the Thierry's suggestion, then it's probably not a great suggestion.
> > > 
> > > Sorry, I don't quite understand this comments. Would you please
> > > elaborate what's this "it" being "not a great suggestion"?
> > > 
> > 
> > I meant that you should try to implement Thierry's solution, but if the
> > end result will be worse than the current patch, then you shouldn't make
> > a v4, but get back to this discussion in order to choose the best option
> > and make everyone agree on it.
> 
> I see. Thanks for the reply. And here is a sample implementation:
> 
> @@ -814,12 +815,15 @@ static struct tegra_smmu *tegra_smmu_find(struct device_node *np)
>  }
>  
>  static int tegra_smmu_configure(struct tegra_smmu *smmu, struct device *dev,
> - struct of_phandle_args *args)
> + struct of_phandle_args *args, struct fwnode_handle *fwnode)
>  {
>   const struct iommu_ops *ops = smmu->iommu.ops;
>   int err;
>  
> - err = iommu_fwspec_init(dev, &dev->of_node->fwnode, ops);
> + if (!fwnode)
> + return -ENOENT;
> +
> + err = iommu_fwspec_init(dev, fwnode, ops);
>   if (err < 0) {
>   dev_err(dev, "failed to initialize fwspec: %d\n", err);
>   return err;
> @@ -835,6 +839,19 @@ static int tegra_smmu_configure(struct tegra_smmu *smmu, struct device *dev,
>   return 0;
>  }
>  
> +static struct device_node *tegra_smmu_find_pci_np(struct pci_dev *pci_dev)
> +{
> + struct pci_bus *bus = pci_dev->bus;
> + struct device *dev = &pci_dev->dev;
> +
> + while (!of_property_read_bool(dev->of_node, "iommus") && bus->parent) {
> + dev = &bus->parent->dev;
> + bus = bus->parent;
> + }
> +
> + return dev->of_node;
> +}

This seems like it's the equivalent of pci_get_host_bridge_device(). Can
you use that instead? I think you might use the parent of the host
bridge that's returned from that function, though.

Thierry
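
A rough userspace model of the walk being discussed (climb to the root bus, take the device that owns it, then use that device's parent as the node to read the iommus property from) might look like the following. It is a sketch under stated assumptions, not the kernel's implementation of pci_get_host_bridge_device(): struct mock_device, struct mock_bus, mock_host_bridge() and mock_iommu_node() are hypothetical names, and device tree nodes are reduced to path strings.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct device and struct pci_bus. */
struct mock_device {
	struct mock_device *parent;   /* NULL at the top of the hierarchy */
	const char *of_node;          /* device tree path, NULL if there is no node */
};

struct mock_bus {
	struct mock_bus *parent;      /* NULL for the root bus */
	struct mock_device *bridge;   /* device that owns this bus */
};

/* Climb to the root bus and return the bridge device that owns it. */
static struct mock_device *mock_host_bridge(struct mock_bus *bus)
{
	assert(bus != NULL);

	while (bus->parent)
		bus = bus->parent;

	return bus->bridge;
}

/*
 * Pick the node whose iommus property should be parsed: the parent of the
 * host bridge for a device on a PCI bus, the device's own node otherwise.
 */
static const char *mock_iommu_node(struct mock_device *dev,
				   struct mock_bus *pci_bus)
{
	struct mock_device *bridge;

	assert(dev != NULL);

	if (!pci_bus)
		return dev->of_node;

	bridge = mock_host_bridge(pci_bus);
	if (!bridge || !bridge->parent)
		return NULL;

	return bridge->parent->of_node;
}
```

In this model an endpoint several buses below the root resolves to the same node as the controller itself, which is the behaviour the open-coded loop in the sample patch is trying to reach.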



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-01 Thread Thierry Reding
On Wed, Sep 30, 2020 at 01:36:18PM -0700, Nicolin Chen wrote:
> On Wed, Sep 30, 2020 at 05:31:31PM +0200, Thierry Reding wrote:
> > On Wed, Sep 30, 2020 at 01:42:57AM -0700, Nicolin Chen wrote:
> > > Previously the driver relies on bus_set_iommu() in .probe() to call
> > > in .probe_device() function so each client can poll iommus property
> > > in DTB to configure fwspec via tegra_smmu_configure(). According to
> > > the comments in .probe(), this is a bit of a hack. And this doesn't
> > > work for a client that doesn't exist in DTB, PCI device for example.
> > > 
> > > Actually when a device/client gets probed, the of_iommu_configure()
> > > will call in .probe_device() function again, with a prepared fwspec
> > > from of_iommu_configure() that reads the SWGROUP id in DTB as we do
> > > in tegra-smmu driver.
> > > 
> > > Additionally, as a new helper devm_tegra_get_memory_controller() is
> > > introduced, there's no need to poll the iommus property in order to
> > > get mc->smmu pointers or SWGROUP id.
> > > 
> > > This patch reworks .probe_device() and .attach_dev() by doing:
> > > > 1) Using fwspec to get swgroup id in .attach_dev/.detach_dev()
> > > 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> > > 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> > > 4) Also dropping the hack in .probe() that's no longer needed.
> > > 
> > > Signed-off-by: Nicolin Chen 
> [...]
> > >  static struct iommu_device *tegra_smmu_probe_device(struct device *dev)
> > >  {
> > > - struct device_node *np = dev->of_node;
> > > - struct tegra_smmu *smmu = NULL;
> > > - struct of_phandle_args args;
> > > - unsigned int index = 0;
> > > - int err;
> > > -
> > > > - while (of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> > > > -   &args) == 0) {
> > > > - smmu = tegra_smmu_find(args.np);
> > > > - if (smmu) {
> > > > - err = tegra_smmu_configure(smmu, dev, &args);
> > > - of_node_put(args.np);
> > > -
> > > - if (err < 0)
> > > - return ERR_PTR(err);
> > > -
> > > - /*
> > > -  * Only a single IOMMU master interface is currently
> > > -  * supported by the Linux kernel, so abort after the
> > > -  * first match.
> > > -  */
> > > - dev_iommu_priv_set(dev, smmu);
> > > -
> > > - break;
> > > - }
> > > + struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> > > + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > 
> > It looks to me like the only reason why you need this new global API is
> > because PCI devices may not have a device tree node with a phandle to
> > the IOMMU. However, SMMU support for PCI will only be enabled if the
> > root complex has an iommus property, right? In that case, can't we
> > simply do something like this:
> > 
> > if (dev_is_pci(dev))
> > np = find_host_bridge(dev)->of_node;
> > else
> > np = dev->of_node;
> > 
> > ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > sure that exists.
> > 
> > Once we have that we can still iterate over the iommus property and do
> > not need to rely on this global variable.
> 
> I agree that it'd work. But I was hoping to simplify the code
> here if it's possible. Looks like we have an argument on this
> so I will choose to go with your suggestion above for now.
> 
> > > - of_node_put(args.np);
> > > - index++;
> > > - }
> > > + /* An invalid mc pointer means mc and smmu drivers are not ready */
> > > + if (IS_ERR(mc))
> > > + return ERR_PTR(-EPROBE_DEFER);
> > >  
> > > - if (!smmu)
> > > + /*
> > > +  * IOMMU core allows -ENODEV return to carry on. So bypass any call
> > > +  * from bus_set_iommu() during tegra_smmu_probe(), as a device will
> > > +  * call in again via of_iommu_configure when fwspec is prepared.
> > > +  */
> > > > + if (!mc->smmu || !fwspec || fwspec->ops != &tegra_smmu_ops)
> > > >   return ERR_PTR(-ENODEV);
> > > >  
> > > > - return &smmu->iommu;
> > > + dev_iommu_priv_set(dev, mc->smmu);
> > >

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-01 Thread Thierry Reding
On Wed, Sep 30, 2020 at 07:29:12PM +0300, Dmitry Osipenko wrote:
> ...
> >> Secondly, I'm already about to use the new tegra_get_memory_controller()
> >> API for all the T20/30/124/210 EMC and devfreq drivers.
> > 
> > Also, this really proves the point I was trying to make about how this
> > is going to proliferate...
> 
> Sorry, I'm probably totally missing your point. "What" exactly will
> proliferate?

Making use of this lookup-by-compatible mechanism. If you provide a
function to make that easy, then people are going to use it, without
even thinking about whether or not it is a good idea.

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-10-01 Thread Thierry Reding
On Thu, Oct 01, 2020 at 05:11:30AM +0300, Dmitry Osipenko wrote:
> 30.09.2020 19:47, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 07:25:41PM +0300, Dmitry Osipenko wrote:
> >> 30.09.2020 19:06, Thierry Reding пишет:
> >>> On Wed, Sep 30, 2020 at 06:36:52PM +0300, Dmitry Osipenko wrote:
> >>>>  I'...
> >>>>>> +  struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> >>>>>> +  struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> >>>>>
> >>>>> It looks to me like the only reason why you need this new global API is
> >>>>> because PCI devices may not have a device tree node with a phandle to
> >>>>> the IOMMU. However, SMMU support for PCI will only be enabled if the
> >>>>> root complex has an iommus property, right? In that case, can't we
> >>>>> simply do something like this:
> >>>>>
> >>>>> if (dev_is_pci(dev))
> >>>>> np = find_host_bridge(dev)->of_node;
> >>>>> else
> >>>>> np = dev->of_node;
> >>>>>
> >>>>> ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> >>>>> sure that exists.
> >>>>>
> >>>>> Once we have that we can still iterate over the iommus property and do
> >>>>> not need to rely on this global variable.
> >>>>
> >>>> This sounds more complicated than the current variant.
> >>>>
> >>>> Secondly, I'm already about to use the new tegra_get_memory_controller()
> >>>> API for all the T20/30/124/210 EMC and devfreq drivers.
> >>>
> >>> Why do we need it there? They seem to work fine without it right now.
> >>
> >> All the Tegra30/124/210 EMC drivers are already duplicating that MC
> >> lookup code and only the recent T210 driver does it properly.
> >>
> >>> If it is required for new functionality, we can always make them dependent
> >>> on a DT reference via phandle without breaking any existing code.
> >>
> >> That's correct, it will be also needed for the new functionality as
> >> well, hence even more drivers will need to perform the MC lookup.
> > 
> > I don't have any issues with adding a helper if we need it from several
> > different locations. But the helper should be working off of a given
> > device and look up the device via the device tree node referenced by
> > phandle. We already have those phandles in place for the EMC devices,
> > and any other device that needs to interoperate with the MC should also
> > get such a reference.
> > 
> >> I don't quite understand why you're asking for the phandle reference,
> >> it's absolutely not needed for the MC lookup and won't work for the
> > 
> > We need that phandle in order to establish a link between the devices.
> > Yes, you can probably do it without the phandle and just match by
> > compatible string. But we don't do that for other types of devices
> > either, right? For a display driver we reference the attached panel via
> > phandle, but we could also just look it up via name or absolute path or
> > some other heuristic. But a phandle is just a much more explicit way of
> > linking the devices, so why not use it?
> 
> There are dozens variants of the panels and system could easily have
> more than one panel, hence a direct lookup by phandle is a natural
> choice for the panels.

Not really, there's typically only just one panel. But that's just one
example. EMC would be another. There's only a single EMC on Tegra and
yet for something like interconnects we still reference it by phandle.
PMC is another case and so is CAR, GPIO (at least on early Tegra SoCs)
and pinmux, etc.

The example of GPIO shows very well how this is important. If we had
made the assumption from the beginning that there was only ever going to
be a single GPIO controller, then we would've had a big problem when the
first SoC shipped that had multiple GPIO controllers.

> While all Tegra SoCs have a single fixed MC in the system, and thus,
> there is no real need to use phandle because we can't mix up MC with
> anything else.

The same is true for the SMMU, and yet the iommus property references
the SMMU by phandle. There are a *lot* of cases where you could imply
dependencies because you have intimate knowledge about the hardware
within drivers. But the point is to avoid this wherever possible so
that the DTB is as self-describing as possible.

> >> 

Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-09-30 Thread Thierry Reding
On Wed, Sep 30, 2020 at 07:25:41PM +0300, Dmitry Osipenko wrote:
> 30.09.2020 19:06, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 06:36:52PM +0300, Dmitry Osipenko wrote:
> >>  I'...
> >>>> +struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> >>>> +struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> >>>
> >>> It looks to me like the only reason why you need this new global API is
> >>> because PCI devices may not have a device tree node with a phandle to
> >>> the IOMMU. However, SMMU support for PCI will only be enabled if the
> >>> root complex has an iommus property, right? In that case, can't we
> >>> simply do something like this:
> >>>
> >>>   if (dev_is_pci(dev))
> >>>   np = find_host_bridge(dev)->of_node;
> >>>   else
> >>>   np = dev->of_node;
> >>>
> >>> ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> >>> sure that exists.
> >>>
> >>> Once we have that we can still iterate over the iommus property and do
> >>> not need to rely on this global variable.
> >>
> >> This sounds more complicated than the current variant.
> >>
> >> Secondly, I'm already about to use the new tegra_get_memory_controller()
> >> API for all the T20/30/124/210 EMC and devfreq drivers.
> > 
> > Why do we need it there? They seem to work fine without it right now.
> 
> All the Tegra30/124/210 EMC drivers are already duplicating that MC
> lookup code and only the recent T210 driver does it properly.
> 
> > If it is required for new functionality, we can always make them dependent
> > on a DT reference via phandle without breaking any existing code.
> 
> That's correct, it will be also needed for the new functionality as
> well, hence even more drivers will need to perform the MC lookup.

I don't have any issues with adding a helper if we need it from several
different locations. But the helper should be working off of a given
device and look up the device via the device tree node referenced by
phandle. We already have those phandles in place for the EMC devices,
and any other device that needs to interoperate with the MC should also
get such a reference.

> I don't quite understand why you're asking for the phandle reference,
> it's absolutely not needed for the MC lookup and won't work for the

We need that phandle in order to establish a link between the devices.
Yes, you can probably do it without the phandle and just match by
compatible string. But we don't do that for other types of devices
either, right? For a display driver we reference the attached panel via
phandle, but we could also just look it up via name or absolute path or
some other heuristic. But a phandle is just a much more explicit way of
linking the devices, so why not use it?

> older DTs if DT change will be needed. Please give a detailed explanation.

New functionality doesn't have to work with older DTs.

Thierry



Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

2020-09-30 Thread Thierry Reding
On Wed, Sep 30, 2020 at 07:26:00PM +0300, Dmitry Osipenko wrote:
> 30.09.2020 19:15, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 07:06:27PM +0300, Dmitry Osipenko wrote:
> >> 30.09.2020 19:03, Thierry Reding пишет:
> >>> On Wed, Sep 30, 2020 at 06:53:06PM +0300, Dmitry Osipenko wrote:
> >>>> 30.09.2020 18:23, Thierry Reding пишет:
> >>>>> On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
> >>>>>> From: Dmitry Osipenko 
> >>>>>>
> >>>>>> Multiple Tegra drivers need to retrieve Memory Controller and hence there
> >>>>>> is quite some duplication of the retrieval code among the drivers. Let's
> >>>>>> add a new common helper for the retrieval of the MC.
> >>>>>>
> >>>>>> Signed-off-by: Dmitry Osipenko 
> >>>>>> Signed-off-by: Nicolin Chen 
> >>>>>> ---
> >>>>>>
> >>>>>> Changelog
> >>>>>> v2->v3:
> >>>>>>  * Replaced with Dmitry's devm_tegra_get_memory_controller()
> >>>>>> v1->v2:
> >>>>>>  * N/A
> >>>>>>
> >>>>>>  drivers/memory/tegra/mc.c | 39 +++
> >>>>>>  include/soc/tegra/mc.h| 17 +
> >>>>>>  2 files changed, 56 insertions(+)
> >>>>>
> >>>>> Let's not add this helper, please. If a device needs a reference to the
> >>>>> memory controller, it should have a phandle to the memory controller in
> >>>>> device tree so that it can be looked up explicitly.
> >>>>>
> >>>>> Adding this helper is officially sanctioning that it's okay not to have
> >>>>> that reference and that's a bad idea.
> >>>>
> >>>> And please explain why it's a bad idea, I don't see anything bad here at
> >>>> all.
> >>>
> >>> Well, you said yourself in a recent comment that we should avoid global
> >>> variables. devm_tegra_get_memory_controller() is nothing but a glorified
> >>> global variable.
> >>
> >> This is not a variable, but a common helper function which will remove
> >> the duplicated code and will help to avoid common mistakes like a missed
> >> put_device().
> > 
> > Yeah, you're right: this is actually much worse than a global variable.
> > It's a helper function that needs 50+ lines in order to effectively
> > access a global variable.
> > 
> > You could write this much simpler by doing something like this:
> > 
> > static struct tegra_mc *global_mc;
> > 
> > int tegra_mc_probe(...)
> > {
> > ...
> > 
> > global_mc = mc;
> > 
> > ...
> > }
> > 
> > struct tegra_mc *tegra_get_memory_controller(void)
> > {
> > return global_mc;
> > }
> > 
> > The result is *exactly* the same, except that this is actually more
> > honest. Nicolin's patch *pretends* that it isn't using a global variable
> > by wrapping a lot of complicated code around it.
> > 
> > But that doesn't change the fact that this accesses a singleton object
> > without actually being able to tie it to the device in the first place.
> 
> I don't think that the MC driver will stay built-in forever, although
> its modularization is complicated right now. Hence something shall keep
> the reference to the MC device resources while they are in use and this
> patch takes care of doing that.

It looks to me like all the other places where we get a reference to the
MC also keep a reference to the device. That's obviously not going to be
enough once the code is turned into a module. At that point we need to
make sure to also grab a reference to the module. But that's orthogonal
to this discussion.

> Secondly, the Nicolin's patch doesn't pretend on anything, but rather

Yes, the patch does pretend to "look up" the memory controller device,
but in reality it will always return a singleton object, which can just
as easily be achieved by using a global variable.

> brings the already existing duplicated code to a single common place.

Where exactly is that duplicated code? The only places I see where we
get a reference to the memory controller are from the EMC drivers and
they properly look up the MC via the nvidia,memory-controller device
tree property.

But that's not what this new helper does. This code will use the OF
lookup table to find any match and then returns that, completely
ignoring any links established by the device tree.

Thierry
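
The difference between the two lookup styles contrasted above can be made concrete with a small userspace model. Everything here is hypothetical and only illustrates the argument: struct mock_node, mock_find_by_compatible() and mock_find_by_reference() are invented names, the compatible strings are placeholders, and mc_phandle merely models a consumer-side reference such as the nvidia,memory-controller property.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened model of a device tree; not a kernel structure. */
struct mock_node {
	const char *compatible;
	unsigned int phandle;        /* unique per node, 0 means "none" */
	unsigned int mc_phandle;     /* consumer's reference to its provider */
};

/* Match-table style: return the first node with the given compatible. */
static const struct mock_node *
mock_find_by_compatible(const struct mock_node *nodes, size_t count,
			const char *compatible)
{
	size_t i;

	assert(nodes != NULL && compatible != NULL);

	for (i = 0; i < count; i++)
		if (strcmp(nodes[i].compatible, compatible) == 0)
			return &nodes[i];

	return NULL;
}

/* Link style: follow the consumer's own reference to its provider. */
static const struct mock_node *
mock_find_by_reference(const struct mock_node *nodes, size_t count,
		       const struct mock_node *consumer)
{
	size_t i;

	assert(nodes != NULL && consumer != NULL);

	if (consumer->mc_phandle == 0)
		return NULL;

	for (i = 0; i < count; i++)
		if (nodes[i].phandle == consumer->mc_phandle)
			return &nodes[i];

	return NULL;
}
```

With a single provider both functions return the same node. They only diverge once a second provider with the same compatible string exists, which is the situation the GPIO example elsewhere in the thread warns about.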



Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

2020-09-30 Thread Thierry Reding
On Wed, Sep 30, 2020 at 07:06:27PM +0300, Dmitry Osipenko wrote:
> 30.09.2020 19:03, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 06:53:06PM +0300, Dmitry Osipenko wrote:
> >> 30.09.2020 18:23, Thierry Reding пишет:
> >>> On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
> >>>> From: Dmitry Osipenko 
> >>>>
> >>>> Multiple Tegra drivers need to retrieve Memory Controller and hence there
> >>>> is quite some duplication of the retrieval code among the drivers. Let's
> >>>> add a new common helper for the retrieval of the MC.
> >>>>
> >>>> Signed-off-by: Dmitry Osipenko 
> >>>> Signed-off-by: Nicolin Chen 
> >>>> ---
> >>>>
> >>>> Changelog
> >>>> v2->v3:
> >>>>  * Replaced with Dmitry's devm_tegra_get_memory_controller()
> >>>> v1->v2:
> >>>>  * N/A
> >>>>
> >>>>  drivers/memory/tegra/mc.c | 39 +++
> >>>>  include/soc/tegra/mc.h| 17 +
> >>>>  2 files changed, 56 insertions(+)
> >>>
> >>> Let's not add this helper, please. If a device needs a reference to the
> >>> memory controller, it should have a phandle to the memory controller in
> >>> device tree so that it can be looked up explicitly.
> >>>
> >>> Adding this helper is officially sanctioning that it's okay not to have
> >>> that reference and that's a bad idea.
> >>
> >> And please explain why it's a bad idea, I don't see anything bad here at
> >> all.
> > 
> > Well, you said yourself in a recent comment that we should avoid global
> > variables. devm_tegra_get_memory_controller() is nothing but a glorified
> > global variable.
> 
> This is not a variable, but a common helper function which will remove
> the duplicated code and will help to avoid common mistakes like a missed
> put_device().

Yeah, you're right: this is actually much worse than a global variable.
It's a helper function that needs 50+ lines in order to effectively
access a global variable.

You could write this much simpler by doing something like this:

static struct tegra_mc *global_mc;

int tegra_mc_probe(...)
{
...

global_mc = mc;

...
}

struct tegra_mc *tegra_get_memory_controller(void)
{
return global_mc;
}

The result is *exactly* the same, except that this is actually more
honest. Nicolin's patch *pretends* that it isn't using a global variable
by wrapping a lot of complicated code around it.

But that doesn't change the fact that this accesses a singleton object
without actually being able to tie it to the device in the first place.

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-09-30 Thread Thierry Reding
On Wed, Sep 30, 2020 at 06:36:52PM +0300, Dmitry Osipenko wrote:
>  I'...
> >> +  struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> >> +  struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > 
> > It looks to me like the only reason why you need this new global API is
> > because PCI devices may not have a device tree node with a phandle to
> > the IOMMU. However, SMMU support for PCI will only be enabled if the
> > root complex has an iommus property, right? In that case, can't we
> > simply do something like this:
> > 
> > if (dev_is_pci(dev))
> > np = find_host_bridge(dev)->of_node;
> > else
> > np = dev->of_node;
> > 
> > ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > sure that exists.
> > 
> > Once we have that we can still iterate over the iommus property and do
> > not need to rely on this global variable.
> 
> This sounds more complicated than the current variant.

I don't think so. It's actually very clear and explicit. And yes, this
might be a little more work (and honestly, this is what? a handful of
lines?) than accessing a global variable, but that's a fair price to pay
for proper encapsulation.

> Secondly, I'm already about to use the new tegra_get_memory_controller()
> API for all the T20/30/124/210 EMC and devfreq drivers.

Also, this really proves the point I was trying to make about how this
is going to proliferate...

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-09-30 Thread Thierry Reding
On Wed, Sep 30, 2020 at 06:36:52PM +0300, Dmitry Osipenko wrote:
>  I'...
> >> +  struct tegra_mc *mc = devm_tegra_get_memory_controller(dev);
> >> +  struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> > 
> > It looks to me like the only reason why you need this new global API is
> > because PCI devices may not have a device tree node with a phandle to
> > the IOMMU. However, SMMU support for PCI will only be enabled if the
> > root complex has an iommus property, right? In that case, can't we
> > simply do something like this:
> > 
> > if (dev_is_pci(dev))
> > np = find_host_bridge(dev)->of_node;
> > else
> > np = dev->of_node;
> > 
> > ? I'm not sure exactly what find_host_bridge() is called, but I'm pretty
> > sure that exists.
> > 
> > Once we have that we can still iterate over the iommus property and do
> > not need to rely on this global variable.
> 
> This sounds more complicated than the current variant.
> 
> Secondly, I'm already about to use the new tegra_get_memory_controller()
> API for all the T20/30/124/210 EMC and devfreq drivers.

Why do we need it there? They seem to work fine without it right now. If
it is required for new functionality, we can always make them dependent
on a DT reference via phandle without breaking any existing code.

Thierry



Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

2020-09-30 Thread Thierry Reding
On Wed, Sep 30, 2020 at 06:53:06PM +0300, Dmitry Osipenko wrote:
> 30.09.2020 18:23, Thierry Reding пишет:
> > On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
> >> From: Dmitry Osipenko 
> >>
> >> Multiple Tegra drivers need to retrieve Memory Controller and hence there
> >> is quite some duplication of the retrieval code among the drivers. Let's
> >> add a new common helper for the retrieval of the MC.
> >>
> >> Signed-off-by: Dmitry Osipenko 
> >> Signed-off-by: Nicolin Chen 
> >> ---
> >>
> >> Changelog
> >> v2->v3:
> >>  * Replaced with Dmitry's devm_tegra_get_memory_controller()
> >> v1->v2:
> >>  * N/A
> >>
> >>  drivers/memory/tegra/mc.c | 39 +++
> >>  include/soc/tegra/mc.h| 17 +
> >>  2 files changed, 56 insertions(+)
> > 
> > Let's not add this helper, please. If a device needs a reference to the
> > memory controller, it should have a phandle to the memory controller in
> > device tree so that it can be looked up explicitly.
> > 
> > Adding this helper is officially sanctioning that it's okay not to have
> > that reference and that's a bad idea.
> 
> And please explain why it's a bad idea, I don't see anything bad here at
> all.

Well, you said yourself in a recent comment that we should avoid global
variables. devm_tegra_get_memory_controller() is nothing but a glorified
global variable.

Thierry



Re: [PATCH v3 2/3] iommu/tegra-smmu: Rework .probe_device and .attach_dev

2020-09-30 Thread Thierry Reding
On Wed, Sep 30, 2020 at 01:42:57AM -0700, Nicolin Chen wrote:
> Previously the driver relies on bus_set_iommu() in .probe() to call
> in .probe_device() function so each client can poll iommus property
> in DTB to configure fwspec via tegra_smmu_configure(). According to
> the comments in .probe(), this is a bit of a hack. And this doesn't
> work for a client that doesn't exist in DTB, PCI device for example.
> 
> Actually when a device/client gets probed, the of_iommu_configure()
> will call in .probe_device() function again, with a prepared fwspec
> from of_iommu_configure() that reads the SWGROUP id in DTB as we do
> in tegra-smmu driver.
> 
> Additionally, as a new helper devm_tegra_get_memory_controller() is
> introduced, there's no need to poll the iommus property in order to
> get mc->smmu pointers or SWGROUP id.
> 
> This patch reworks .probe_device() and .attach_dev() by doing:
> 1) Using fwspec to get swgroup id in .attach_dev/.detach_dev()
> 2) Removing DT polling code, tegra_smmu_find/tegra_smmu_configure()
> 3) Calling devm_tegra_get_memory_controller() in .probe_device()
> 4) Also dropping the hack in .probe() that's no longer needed.
> 
> Signed-off-by: Nicolin Chen 
> ---
> 
> Changelog
> v2->v3
>  * Used devm_tegra_get_memory_controller() to get mc pointer
>  * Replaced IS_ERR_OR_NULL with IS_ERR in .probe_device()
> v1->v2
>  * Replaced in .probe_device() tegra_smmu_find/tegra_smmu_configure()
>with tegra_get_memory_controller call.
>  * Dropped the hack in tegra_smmu_probe().
> 
>  drivers/iommu/tegra-smmu.c | 144 ++---
>  1 file changed, 36 insertions(+), 108 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 6a3ecc334481..636dc3b89545 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -61,6 +61,8 @@ struct tegra_smmu_as {
>   u32 attr;
>  };
>  
> +static const struct iommu_ops tegra_smmu_ops;
> +
>  static struct tegra_smmu_as *to_smmu_as(struct iommu_domain *dom)
>  {
>   return container_of(dom, struct tegra_smmu_as, domain);
> @@ -484,60 +486,50 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu *smmu,
>  static int tegra_smmu_attach_dev(struct iommu_domain *domain,
>struct device *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
>   struct tegra_smmu_as *as = to_smmu_as(domain);
> - struct device_node *np = dev->of_node;
> - struct of_phandle_args args;
>   unsigned int index = 0;
>   int err = 0;
>  
> - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> - &args)) {
> - unsigned int swgroup = args.args[0];
> -
> - if (args.np != smmu->dev->of_node) {
> - of_node_put(args.np);
> - continue;
> - }
> -
> - of_node_put(args.np);
> + if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> + return -ENOENT;
>  
> + for (index = 0; index < fwspec->num_ids; index++) {
>   err = tegra_smmu_as_prepare(smmu, as);
> - if (err < 0)
> - return err;
> + if (err)
> + goto disable;
>  
> - tegra_smmu_enable(smmu, swgroup, as->id);
> - index++;
> + tegra_smmu_enable(smmu, fwspec->ids[index], as->id);
>   }
>  
>   if (index == 0)
>   return -ENODEV;
>  
>   return 0;
> +
> +disable:
> + while (index--) {
> + tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
> + tegra_smmu_as_unprepare(smmu, as);
> + }
> +
> + return err;
>  }
>  
>  static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device 
> *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct tegra_smmu_as *as = to_smmu_as(domain);
> - struct device_node *np = dev->of_node;
>   struct tegra_smmu *smmu = as->smmu;
> - struct of_phandle_args args;
>   unsigned int index = 0;
>  
> - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> - &args)) {
> - unsigned int swgroup = args.args[0];
> -
> - if (args.np != smmu->dev->of_node) {
> - of_node_put(args.np);
> - continue;
> - }
> -
> - of_node_put(args.np);
> + if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> + return;
>  
> - tegra_smmu_disable(smmu, swgroup, as->id);
> + for (index = 0; index < fwspec->num_ids; index++) {
> + tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
>   tegra_smmu_as_unprepare(smmu, as);
> - index++;
>   }
>  }
>  
> @@ -807,80 +799,26 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct 
> 

Re: [PATCH v3 1/3] memory: tegra: Add devm_tegra_get_memory_controller()

2020-09-30 Thread Thierry Reding
On Wed, Sep 30, 2020 at 01:42:56AM -0700, Nicolin Chen wrote:
> From: Dmitry Osipenko 
> 
> Multiple Tegra drivers need to retrieve Memory Controller and hence there
> is quite some duplication of the retrieval code among the drivers. Let's
> add a new common helper for the retrieval of the MC.
> 
> Signed-off-by: Dmitry Osipenko 
> Signed-off-by: Nicolin Chen 
> ---
> 
> Changelog
> v2->v3:
>  * Replaced with Dmitry's devm_tegra_get_memory_controller()
> v1->v2:
>  * N/A
> 
>  drivers/memory/tegra/mc.c | 39 +++
>  include/soc/tegra/mc.h| 17 +
>  2 files changed, 56 insertions(+)

Let's not add this helper, please. If a device needs a reference to the
memory controller, it should have a phandle to the memory controller in
device tree so that it can be looked up explicitly.

Adding this helper is officially sanctioning that it's okay not to have
that reference and that's a bad idea.

Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 4/5] iommu/tegra-smmu: Add PCI support

2020-09-28 Thread Thierry Reding
On Sat, Sep 26, 2020 at 01:07:18AM -0700, Nicolin Chen wrote:
> This patch simply adds support for PCI devices.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 17 -
>  1 file changed, 16 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 97a7185b4578..9dbc5d7183cc 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -15,6 +15,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #include 
>  #include 
> @@ -935,6 +936,7 @@ static struct iommu_group *tegra_smmu_device_group(struct 
> device *dev)
>   const struct tegra_smmu_group_soc *soc;
>   struct tegra_smmu_group *group;
>   int swgroup = fwspec->ids[0];
> + bool pci = dev_is_pci(dev);
>   struct iommu_group *grp;
>  
>   /* Find group_soc associating with swgroup */
> @@ -961,7 +963,7 @@ static struct iommu_group *tegra_smmu_device_group(struct 
> device *dev)
>   group->smmu = smmu;
>   group->soc = soc;
>  
> - group->group = iommu_group_alloc();
> + group->group = pci ? pci_device_group(dev) : iommu_group_alloc();
>   if (IS_ERR(group->group)) {
>   devm_kfree(smmu->dev, group);
>   mutex_unlock(&smmu->lock);
> @@ -1180,6 +1182,19 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>   return ERR_PTR(err);
>   }
>  
> +#ifdef CONFIG_PCI
> + if (!iommu_present(&pci_bus_type)) {
> + pci_request_acs();
> + err = bus_set_iommu(&pci_bus_type, &tegra_smmu_ops);
> + if (err < 0) {
> + bus_set_iommu(_bus_type, NULL);
> + iommu_device_unregister(&smmu->iommu);
> + iommu_device_sysfs_remove(&smmu->iommu);
> + return ERR_PTR(err);

It might be worth factoring out the cleanup code now that there are
multiple failures from which we may need to clean up.

Also, it'd be great if somehow we could do this without the #ifdef,
but I guess since we're using the pci_bus_type global variable directly,
there isn't much we can do here?

Thierry
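
As an illustration of the "factor out the cleanup code" suggestion above, here is a
minimal user-space sketch, not a patch against tegra_smmu_probe(). The struct, the
do_step() helper and the step numbering are all hypothetical stand-ins for the sysfs,
registration and bus_set_iommu() calls. The idea is that each failure site becomes a
single goto and the labels unwind in reverse order of setup.

```c
#include <stdbool.h>

/* Hypothetical flags standing in for sysfs/registration/bus state. */
struct probe_state {
	bool sysfs_added;
	bool registered;
	bool bus_set;
	bool pci_bus_set;
};

/* Hypothetical setup step: fails if its number equals fail_step (0 = never). */
static int do_step(int step, int fail_step, bool *flag)
{
	if (step == fail_step)
		return -1;
	*flag = true;
	return 0;
}

static int probe_sketch(struct probe_state *st, int fail_step)
{
	int err;

	err = do_step(1, fail_step, &st->sysfs_added);
	if (err)
		return err;

	err = do_step(2, fail_step, &st->registered);
	if (err)
		goto remove_sysfs;

	err = do_step(3, fail_step, &st->bus_set);
	if (err)
		goto unregister;

	err = do_step(4, fail_step, &st->pci_bus_set);
	if (err)
		goto unset_bus;

	return 0;

	/* Labels run in reverse setup order; later failures fall through. */
unset_bus:
	st->bus_set = false;
unregister:
	st->registered = false;
remove_sysfs:
	st->sysfs_added = false;
	return err;
}
```

With this shape, adding another setup step only needs one more label rather than a
copy of every earlier teardown call.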



Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()

2020-09-28 Thread Thierry Reding
On Sat, Sep 26, 2020 at 01:07:17AM -0700, Nicolin Chen wrote:
> The tegra_smmu_probe_device() function searches in DT for the iommu
> phandle to get "smmu" pointer. This works for most of SMMU clients
> that exist in the DTB. But a PCI device will not be added to iommu,
> since it doesn't have a DT node.
> 
> Fortunately, for a client with a DT node, tegra_smmu_probe_device()
> calls tegra_smmu_of_xlate() via tegra_smmu_configure(), while for a
> PCI device, of_pci_iommu_init() in the IOMMU core calls .of_xlate()
> as well, even before running tegra_smmu_probe_device(). And in both
> cases, tegra_smmu_of_xlate() prepares a valid iommu_fwspec pointer
> that allows us to get the mc->smmu pointer via dev_get_drvdata() by
> calling driver_find_device_by_fwnode().
> 
> So this patch uses iommu_fwspec in .probe_device() and related code
> for a client that does not exist in the DTB, especially a PCI one.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 89 +++---
>  drivers/memory/tegra/mc.c  |  2 +-
>  include/soc/tegra/mc.h |  2 +
>  3 files changed, 56 insertions(+), 37 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index b10e02073610..97a7185b4578 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -13,6 +13,7 @@
>  #include 
>  #include 
>  #include 
> +#include 

Why is this needed? I don't see any of the symbols declared in that file
used here.

>  #include 
>  
>  #include 
> @@ -61,6 +62,8 @@ struct tegra_smmu_as {
>   u32 attr;
>  };
>  
> +static const struct iommu_ops tegra_smmu_ops;
> +
>  static struct tegra_smmu_as *to_smmu_as(struct iommu_domain *dom)
>  {
>   return container_of(dom, struct tegra_smmu_as, domain);
> @@ -484,60 +487,49 @@ static void tegra_smmu_as_unprepare(struct tegra_smmu 
> *smmu,
>  static int tegra_smmu_attach_dev(struct iommu_domain *domain,
>struct device *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
>   struct tegra_smmu_as *as = to_smmu_as(domain);
> - struct device_node *np = dev->of_node;
> - struct of_phandle_args args;
> - unsigned int index = 0;
> - int err = 0;
> -
> - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> - &args)) {
> - unsigned int swgroup = args.args[0];
> -
> - if (args.np != smmu->dev->of_node) {
> - of_node_put(args.np);
> - continue;
> - }
> + int index, err = 0;
>  
> - of_node_put(args.np);
> + if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> + return -ENOENT;
>  
> + for (index = 0; index < fwspec->num_ids; index++) {
>   err = tegra_smmu_as_prepare(smmu, as);
> - if (err < 0)
> - return err;
> + if (err)
> + goto err_disable;

I'd personally drop the err_ prefix here because it's pretty obvious
that we're going to do this as a result of an error happening.

>  
> - tegra_smmu_enable(smmu, swgroup, as->id);
> - index++;
> + tegra_smmu_enable(smmu, fwspec->ids[index], as->id);
>   }
>  
>   if (index == 0)
>   return -ENODEV;
>  
>   return 0;
> +
> +err_disable:
> + for (index--; index >= 0; index--) {
> + tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
> + tegra_smmu_as_unprepare(smmu, as);
> + }

I think a more idiomatic version of doing this would be:

while (index--) {
...
}
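
To make the difference concrete, a small user-space sketch of that unwind pattern
follows. It is not the driver code: prepare_one(), undo_one() and the fail_at
parameter are hypothetical stand-ins for the prepare/enable and disable/unprepare
calls. One practical benefit is that "while (index--)" also works when index is
unsigned, whereas "for (index--; index >= 0; index--)" needs a signed index.

```c
#include <stdbool.h>

#define MAX_IDS 8

/* Hypothetical bookkeeping standing in for per-ID hardware state. */
static bool enabled[MAX_IDS];

/* Stand-in for the prepare+enable step; fails when index == fail_at. */
static int prepare_one(unsigned int index, unsigned int fail_at)
{
	if (index == fail_at)
		return -1;
	enabled[index] = true;
	return 0;
}

/* Stand-in for the disable+unprepare step. */
static void undo_one(unsigned int index)
{
	enabled[index] = false;
}

/*
 * Set up num_ids entries (caller passes num_ids <= MAX_IDS). On failure,
 * undo only the entries that were already set up. "while (index--)" tests
 * the old value and then decrements, so the body sees index-1 ... 0 and
 * the loop stops cleanly even though index is unsigned.
 */
static int attach_all(unsigned int num_ids, unsigned int fail_at)
{
	unsigned int index;
	int err = 0;

	for (index = 0; index < num_ids; index++) {
		err = prepare_one(index, fail_at);
		if (err)
			goto disable;
	}

	return 0;

disable:
	while (index--)
		undo_one(index);

	return err;
}

/* Count entries still enabled, to check that the unwind was complete. */
static unsigned int count_enabled(void)
{
	unsigned int i, n = 0;

	for (i = 0; i < MAX_IDS; i++)
		n += enabled[i] ? 1 : 0;
	return n;
}
```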

> +
> + return err;
>  }
>  
>  static void tegra_smmu_detach_dev(struct iommu_domain *domain, struct device 
> *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
>   struct tegra_smmu_as *as = to_smmu_as(domain);
> - struct device_node *np = dev->of_node;
>   struct tegra_smmu *smmu = as->smmu;
> - struct of_phandle_args args;
>   unsigned int index = 0;
>  
> - while (!of_parse_phandle_with_args(np, "iommus", "#iommu-cells", index,
> - &args)) {
> - unsigned int swgroup = args.args[0];
> -
> - if (args.np != smmu->dev->of_node) {
> - of_node_put(args.np);
> - continue;
> - }
> -
> - of_node_put(args.np);
> + if (!fwspec || fwspec->ops != &tegra_smmu_ops)
> + return;
>  
> - tegra_smmu_disable(smmu, swgroup, as->id);
> + for (index = 0; index < fwspec->num_ids; index++) {
> + tegra_smmu_disable(smmu, fwspec->ids[index], as->id);
>   tegra_smmu_as_unprepare(smmu, as);
> - index++;
>   }
>  }
>  
> @@ -845,10 +837,25 @@ static int tegra_smmu_configure(struct tegra_smmu 
> *smmu, struct 

Re: [PATCH 3/5] iommu/tegra-smmu: Use iommu_fwspec in .probe_/.attach_device()

2020-09-28 Thread Thierry Reding
On Sat, Sep 26, 2020 at 05:48:17PM +0300, Dmitry Osipenko wrote:
> 26.09.2020 11:07, Nicolin Chen wrote:
> ...
> > +   /* NULL smmu pointer means that SMMU driver is not probed yet */
> > +   if (unlikely(!smmu))
> > +   return ERR_PTR(-EPROBE_DEFER);
> 
> Hello, Nicolin!
> 
> Please don't pollute code with likely/unlikely. This is not a
> performance-critical code.
> 
> ...
> > -static struct platform_driver tegra_mc_driver = {
> > +struct platform_driver tegra_mc_driver = {
> > .driver = {
> > .name = "tegra-mc",
> > .of_match_table = tegra_mc_of_match,
> > diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
> > index 1238e35653d1..49a4cf64c4b9 100644
> > --- a/include/soc/tegra/mc.h
> > +++ b/include/soc/tegra/mc.h
> > @@ -184,4 +184,6 @@ struct tegra_mc {
> >  int tegra_mc_write_emem_configuration(struct tegra_mc *mc, unsigned long 
> > rate);
> >  unsigned int tegra_mc_get_emem_device_count(struct tegra_mc *mc);
> >  
> > +extern struct platform_driver tegra_mc_driver;
> 
> No global variables, please. See for the example:
> 
> https://elixir.bootlin.com/linux/v5.9-rc6/source/drivers/devfreq/tegra20-devfreq.c#L100
> 
> The tegra_get_memory_controller() is now needed by multiple Tegra
> drivers, I think it should be good to have it added into the MC driver
> and then make it globally available for all drivers by making use of
> of_find_matching_node_and_match().
> 
> diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
> index e1db209fd2ea..ed1bd6d00aaf 100644
> --- a/drivers/memory/tegra/mc.c
> +++ b/drivers/memory/tegra/mc.c
> @@ -43,6 +43,29 @@ static const struct of_device_id tegra_mc_of_match[] = {
>  };
>  MODULE_DEVICE_TABLE(of, tegra_mc_of_match);
> 
> +struct tegra_mc *tegra_get_memory_controller(void)
> +{
> + struct platform_device *pdev;
> + struct device_node *np;
> + struct tegra_mc *mc;
> +
> + np = of_find_matching_node_and_match(NULL, tegra_mc_of_match, NULL);
> + if (!np)
> + return ERR_PTR(-ENOENT);
> +
> + pdev = of_find_device_by_node(np);
> + of_node_put(np);
> + if (!pdev)
> + return ERR_PTR(-ENODEV);
> +
> + mc = platform_get_drvdata(pdev);
> + if (!mc)
> + return ERR_PTR(-EPROBE_DEFER);
> +
> + return mc;
> +}
> +EXPORT_SYMBOL_GPL(tegra_get_memory_controller);

We already have tegra_smmu_find(), which should be enough for this
particular use-case.

Thierry



Re: [PATCH 1/5] iommu/tegra-smmu: Unwrap tegra_smmu_group_get

2020-09-28 Thread Thierry Reding
On Sat, Sep 26, 2020 at 01:07:15AM -0700, Nicolin Chen wrote:
> The tegra_smmu_group_get was added to group devices in different
> SWGROUPs and it'd return a NULL group pointer upon a mismatch at
> tegra_smmu_find_group(), so for most of clients/devices, it very
> likely would mismatch and need a fallback generic_device_group().
> 
> But now tegra_smmu_group_get handles devices in same SWGROUP too,
> which means that it would allocate a group for every new SWGROUP
> or would directly return an existing one upon matching a SWGROUP,
> i.e. any device will go through this function.
> 
> So possibility of having a NULL group pointer in device_group()
> is upon failure of either devm_kzalloc() or iommu_group_alloc().
> In either case, calling generic_device_group() no longer makes
> sense. Especially in the devm_kzalloc() failure case, it'd cause a
> problem if it fails at devm_kzalloc() yet succeeds at a fallback
> generic_device_group(), because it does not create a group->list
> for other devices to match.
> 
> This patch simply unwraps the function to clean it up.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 19 ---
>  1 file changed, 4 insertions(+), 15 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 0becdbfea306..6335285dc373 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -903,11 +903,13 @@ static void tegra_smmu_group_release(void *iommu_data)
>   mutex_unlock(&smmu->lock);
>  }
>  
> -static struct iommu_group *tegra_smmu_group_get(struct tegra_smmu *smmu,
> - unsigned int swgroup)
> +static struct iommu_group *tegra_smmu_device_group(struct device *dev)
>  {
> + struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
> + struct tegra_smmu *smmu = dev_iommu_priv_get(dev);
>   const struct tegra_smmu_group_soc *soc;
>   struct tegra_smmu_group *group;
> + int swgroup = fwspec->ids[0];

This should be unsigned int to match the type of swgroup elsewhere.
Also, it might not be worth having an extra local variable for this
since it's only used once.

Thierry



Re: IOVA allocation dependency between firmware buffer and remaining buffers

2020-09-24 Thread Thierry Reding
On Thu, Sep 24, 2020 at 12:41:29PM +0200, Marek Szyprowski wrote:
> Hi Thierry,
> 
> On 24.09.2020 12:16, Thierry Reding wrote:
> > On Thu, Sep 24, 2020 at 10:46:46AM +0200, Marek Szyprowski wrote:
> >> On 24.09.2020 10:28, Joerg Roedel wrote:
> >>> On Wed, Sep 23, 2020 at 08:48:26AM +0200, Marek Szyprowski wrote:
> >>>> It allows to remap given buffer at the specific IOVA address, although
> >>>> it doesn't guarantee that those specific addresses won't be later used
> >>>> by the IOVA allocator. Probably it would make sense to add an API for
> >>>> generic IOMMU-DMA framework to mark the given IOVA range as
> >>>> reserved/unused to protect them.
> >>> There is an API for that, the IOMMU driver can return IOVA reserved
> >>> regions per device and the IOMMU core code will take care of mapping
> >>> these regions and reserving them in the IOVA allocator, so that
> >>> DMA-IOMMU code will not use it for allocations.
> >>>
> >>> Have a look at the iommu_ops->get_resv_regions() and
> >>> iommu_ops->put_resv_regions().
> >> I know about the reserved regions IOMMU API, but the main problem here,
> >> in case of Exynos, is that those reserved regions won't be created by
> >> the IOMMU driver but by the IOMMU client device. It is just a result how
> >> the media drivers manages their IOVA space. They simply have to load
> >> firmware at the IOVA address lower than the any address of the used
> >> buffers.
> > I've been working on adding a way to automatically add direct mappings
> > using reserved-memory regions parsed from device tree, see:
> >
> >  
> > https://lore.kernel.org/lkml/2020090413.691933-1-thierry.red...@gmail.com/
> >
> > Perhaps this can be of use? With that you should be able to add a
> > reserved-memory region somewhere in the lower range that you need for
> > firmware images and have that automatically added as a direct mapping
> > so that it won't be reused later on for dynamic allocations.
> 
> Frankly, using that would be even bigger hack than what I've proposed in 
> my workaround. I see no point polluting DT with such artificial regions 
> just to ensure specific IOVA space layout.

I think I misunderstood the requirements that you have. Sounds like
there are no actual restrictions for where exactly the memory resides
for the firmware, it just has to be lower than any of the buffer
allocations. I agree, in that case using reserved memory regions does
not make sense at all.

Thierry



Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-09-24 Thread Thierry Reding
On Thu, Sep 24, 2020 at 04:23:59PM +0300, Dmitry Osipenko wrote:
> 04.09.2020 15:59, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > Reserved memory regions can be marked as "active" if hardware is
> > expected to access the regions during boot and before the operating
> > system can take control. One example where this is useful is for the
> > operating system to infer whether the region needs to be identity-
> > mapped through an IOMMU.
> > 
> > Signed-off-by: Thierry Reding 
> > ---
> >  .../bindings/reserved-memory/reserved-memory.txt   | 7 +++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git 
> > a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt 
> > b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > index 4dd20de6977f..163d2927e4fc 100644
> > --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > @@ -63,6 +63,13 @@ reusable (optional) - empty property
> >able to reclaim it back. Typically that means that the operating
> >system can use that region to store volatile or cached data that
> >can be otherwise regenerated or migrated elsewhere.
> > +active (optional) - empty property
> > +- If this property is set for a reserved memory region, it indicates
> > +  that some piece of hardware may be actively accessing this region.
> > +  Should the operating system want to enable IOMMU protection for a
> > +  device, all active memory regions must have been identity-mapped
> > +  in order to ensure that non-quiescent hardware during boot can
> > +  continue to access the memory.
> >  
> >  Linux implementation note:
> >  - If a "linux,cma-default" property is present, then Linux will use the
> > 
> 
> Hi,
> 
> Could you please explain what devices need this quirk? I see that you're
> targeting Tegra SMMU driver, which means that it should be some pre-T186
> device.

Primarily I'm looking at Tegra210 and later, because on earlier devices
the bootloader doesn't consistently initialize display. I know that it
does on some devices, but not all of them. This same code should also
work on Tegra186 and later (with an ARM SMMU) although the situation is
slightly more complicated there because IOMMU translations will fault by
default long before these identity mappings can be established.

> Is this reservation needed for some device that has display
> hardwired to a very specific IOMMU domain at the boot time?

No, this is only used to convey information about the active framebuffer
to the kernel. In practice the DMA/IOMMU code will use this information
to establish a 1:1 mapping on whatever IOMMU domain that was picked for
display.

> If you're targeting devices that don't have IOMMU enabled by default at
> the boot time, then this approach won't work for the existing devices
> which won't ever get an updated bootloader.

If the devices don't use an IOMMU, then there should be no problem. The
extra reserved-memory nodes would still be necessary to ensure that the
kernel doesn't reuse the framebuffer memory for the slab allocator, but
if no IOMMU is used, then the display controller accessing the memory
isn't going to cause problems other than perhaps scanning out data that
is no longer a framebuffer.

There should also be no problem for devices with an old bootloader
because this code is triggered by the presence of a reserved-memory node
referenced via the memory-region property. Devices with an old
bootloader should continue to work as they did before. Although I
suppose they would start faulting once we enable DMA/IOMMU integration
for Tegra SMMU if they have a bootloader that does initialize display to
actively scan out during boot.

> I think Robin Murphy already suggested that we should simply create
> a dummy "identity" IOMMU domain by default for the DRM/VDE devices and
> then replace it with an explicitly created domain within the drivers.

I don't recall reading about that suggestion. So does this mean that for
certain devices we'd want to basically passthrough by default and then
at some point during boot take over with a properly managed IOMMU
domain?

The primary goal here is to move towards using the DMA API rather than
the IOMMU API directly, so we don't really have the option of replacing
with an explicitly created domain. Unless we have code in the DMA/IOMMU
code that does this somehow.

But I'm not sure what would be a good way to mark certain devices as
needing an identity domain by default. Do we still use the reserved-
memory node for that? That would still require some sort of flag to
speci

Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-09-24 Thread Thierry Reding
On Tue, Sep 15, 2020 at 02:36:48PM +0200, Thierry Reding wrote:
> On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> > On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > > From: Thierry Reding 
> > > 
> > > Reserved memory regions can be marked as "active" if hardware is
> > > expected to access the regions during boot and before the operating
> > > system can take control. One example where this is useful is for the
> > > operating system to infer whether the region needs to be identity-
> > > mapped through an IOMMU.
> > 
> > I like simple solutions, but this hardly seems adequate to solve the 
> > problem of passing IOMMU setup from bootloader/firmware to the OS. Like 
> > what is the IOVA that's supposed to be used if identity mapping is not 
> > used?
> 
> The assumption here is that if the region is not active there is no need
> for the IOVA to be specified because the kernel will allocate memory and
> assign any IOVA of its choosing.
> 
> Also, note that this is not meant as a way of passing IOMMU setup from
> the bootloader or firmware to the OS. The purpose of this is to specify
> that some region of memory is actively being accessed during boot. The
> particular case that I'm looking at is where the bootloader set up a
> splash screen and keeps it on during boot. The bootloader has not set up
> an IOMMU mapping and the identity mapping serves as a way of keeping the
> accesses by the display hardware working during the transitional period
> after the IOMMU translations have been enabled by the kernel but before
> the kernel display driver has had a chance to set up its own IOMMU
> mappings.
> 
> > If you know enough about the regions to assume identity mapping, then 
> > can't you know if active or not?
> 
> We could alternatively add some property that describes the region as
> requiring an identity mapping. But note that we can't make any
> assumptions here about the usage of these regions because the IOMMU
> driver simply has no way of knowing what they are being used for.
> 
> Some additional information is required in device tree for the IOMMU
> driver to be able to make that decision.

Rob, can you provide any hints on exactly how you want to move this
forward? I don't know in what direction you'd like to proceed.

Thierry



Re: [PATCH 3/3] iommu/tegra-smmu: Allow to group clients in same swgroup

2020-09-24 Thread Thierry Reding
On Fri, Sep 11, 2020 at 12:16:43AM -0700, Nicolin Chen wrote:
> There can be clients using the same swgroup in DT, for example i2c0
> and i2c1. The current driver will add them to separate IOMMU groups,
> though it has implemented device_group() callback which is to group
> devices using different swgroups like DC and DCB.
> 
> All clients having the same swgroup should be also added to the same
> IOMMU group so as to share an asid. Otherwise, the asid register may
> get overwritten every time a new device is attached.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)

Makes sense:

Acked-by: Thierry Reding 



Re: [PATCH 2/3] iommu/tegra-smmu: Fix iova->phys translation

2020-09-24 Thread Thierry Reding
On Fri, Sep 11, 2020 at 12:16:42AM -0700, Nicolin Chen wrote:
> IOVA might not be always 4KB aligned. So tegra_smmu_iova_to_phys
> function needs to add on the lower 12-bit offset from input iova.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 789d21c01b77..50b962b0647e 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -795,7 +795,7 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct 
> iommu_domain *domain,
>  
>   pfn = *pte & as->smmu->pfn_mask;
>  
> - return SMMU_PFN_PHYS(pfn);
> + return SMMU_PFN_PHYS(pfn) + SMMU_OFFSET_IN_PAGE(iova);
>  }
>  
>  static struct tegra_smmu *tegra_smmu_find(struct device_node *np)

Acked-by: Thierry Reding 
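
For reference, a worked example of what the one-line fix changes, as a user-space
sketch. The macro names follow the quoted series; SMMU_SIZE_PT = 4096 and the sample
PFN/IOVA values are assumptions chosen for illustration. With pfn = 0x80000 and
iova = 0x1234, the old code returns 0x80000000 while the fixed code returns
0x80000234, because the low 12 bits of the IOVA are added back.

```c
#include <stdint.h>

/* Names mirror the quoted patch; SMMU_SIZE_PT = 4096 is an assumption. */
#define SMMU_PTE_SHIFT 12
#define SMMU_SIZE_PT   4096UL

#define SMMU_PAGE_MASK         (~(SMMU_SIZE_PT - 1))
#define SMMU_OFFSET_IN_PAGE(x) ((unsigned long)(x) & ~SMMU_PAGE_MASK)
#define SMMU_PFN_PHYS(x)       ((uint64_t)(x) << SMMU_PTE_SHIFT)

/* Before the fix: the in-page offset of the IOVA is silently dropped. */
static uint64_t iova_to_phys_old(uint32_t pfn, unsigned long iova)
{
	(void)iova;
	return SMMU_PFN_PHYS(pfn);
}

/* After the fix: the low 12 bits of the IOVA are added to the page base. */
static uint64_t iova_to_phys_new(uint32_t pfn, unsigned long iova)
{
	return SMMU_PFN_PHYS(pfn) + SMMU_OFFSET_IN_PAGE(iova);
}
```

For a 4KB-aligned IOVA both versions agree, which is presumably why the problem only
shows up with unaligned addresses.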



Re: [PATCH 1/3] iommu/tegra-smmu: Do not use PAGE_SHIFT and PAGE_MASK

2020-09-24 Thread Thierry Reding
On Fri, Sep 11, 2020 at 12:16:41AM -0700, Nicolin Chen wrote:
> PAGE_SHIFT and PAGE_MASK are defined corresponding to the page size
> for CPU virtual addresses, which means PAGE_SHIFT could be a number
> other than 12, but tegra-smmu maintains fixed 4KB IOVA pages and has
> fixed [21:12] bit range for PTE entries.
> 
> So this patch replaces all PAGE_SHIFT/PAGE_MASK references with the
> macros defined with SMMU_PTE_SHIFT.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 14 ++
>  1 file changed, 10 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 046add7acb61..789d21c01b77 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -130,6 +130,11 @@ static inline u32 smmu_readl(struct tegra_smmu *smmu, 
> unsigned long offset)
>  #define SMMU_PDE_SHIFT 22
>  #define SMMU_PTE_SHIFT 12
>  
> +#define SMMU_PAGE_MASK   (~(SMMU_SIZE_PT-1))
> +#define SMMU_OFFSET_IN_PAGE(x)   ((unsigned long)(x) & ~SMMU_PAGE_MASK)
> +#define SMMU_PFN_PHYS(x) ((phys_addr_t)(x) << SMMU_PTE_SHIFT)
> +#define SMMU_PHYS_PFN(x) ((unsigned long)((x) >> SMMU_PTE_SHIFT))
> +
>  #define SMMU_PD_READABLE (1 << 31)
>  #define SMMU_PD_WRITABLE (1 << 30)
>  #define SMMU_PD_NONSECURE(1 << 29)
> @@ -644,7 +649,7 @@ static void tegra_smmu_set_pte(struct tegra_smmu_as *as, 
> unsigned long iova,
>  u32 *pte, dma_addr_t pte_dma, u32 val)
>  {
>   struct tegra_smmu *smmu = as->smmu;
> - unsigned long offset = offset_in_page(pte);
> + unsigned long offset = SMMU_OFFSET_IN_PAGE(pte);
>  
>   *pte = val;
>  
> @@ -726,7 +731,7 @@ __tegra_smmu_map(struct iommu_domain *domain, unsigned 
> long iova,
>   pte_attrs |= SMMU_PTE_WRITABLE;
>  
>   tegra_smmu_set_pte(as, iova, pte, pte_dma,
> -__phys_to_pfn(paddr) | pte_attrs);
> +SMMU_PHYS_PFN(paddr) | pte_attrs);
>  
>   return 0;
>  }
> @@ -790,7 +795,7 @@ static phys_addr_t tegra_smmu_iova_to_phys(struct 
> iommu_domain *domain,
>  
>   pfn = *pte & as->smmu->pfn_mask;
>  
> - return PFN_PHYS(pfn);
> + return SMMU_PFN_PHYS(pfn);
>  }
>  
>  static struct tegra_smmu *tegra_smmu_find(struct device_node *np)
> @@ -1108,7 +1113,8 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>   smmu->dev = dev;
>   smmu->mc = mc;
>  
> - smmu->pfn_mask = BIT_MASK(mc->soc->num_address_bits - PAGE_SHIFT) - 1;
> + smmu->pfn_mask =
> + BIT_MASK(mc->soc->num_address_bits - SMMU_PTE_SHIFT) - 1;

checkpatch no longer warns about lines longer than 80 characters. The
new limit is 100, so you can fit this all on one line.

But either way:

Acked-by: Thierry Reding 
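
To illustrate why the CPU's PAGE_SHIFT is the wrong constant here, a short user-space
sketch follows. The helper names are hypothetical, and the 64KB CPU page size
(PAGE_SHIFT = 16) is an assumed kernel configuration used only as an example. With
pfn = 0x80000, shifting by the SMMU's fixed 12 gives 0x80000000, while shifting by 16
would give 0x800000000; likewise the PFN mask for 32 address bits would shrink from
0xfffff to 0xffff.

```c
#include <stdint.h>

#define SMMU_PTE_SHIFT 12 /* fixed by the SMMU page-table format */

/* Convert a page frame number to a physical address for a given shift. */
static uint64_t pfn_to_phys(uint64_t pfn, unsigned int shift)
{
	return pfn << shift;
}

/* PFN mask for a given address width, same arithmetic as the probe code. */
static uint64_t pfn_mask(unsigned int num_address_bits, unsigned int shift)
{
	return (UINT64_C(1) << (num_address_bits - shift)) - 1;
}
```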



Re: IOVA allocation dependency between firmware buffer and remaining buffers

2020-09-24 Thread Thierry Reding
On Thu, Sep 24, 2020 at 10:46:46AM +0200, Marek Szyprowski wrote:
> Hi Joerg,
> 
> On 24.09.2020 10:28, Joerg Roedel wrote:
> > On Wed, Sep 23, 2020 at 08:48:26AM +0200, Marek Szyprowski wrote:
> >> It allows to remap given buffer at the specific IOVA address, although
> >> it doesn't guarantee that those specific addresses won't be later used
> >> by the IOVA allocator. Probably it would make sense to add an API for
> >> generic IOMMU-DMA framework to mark the given IOVA range as
> >> reserved/unused to protect them.
> > There is an API for that, the IOMMU driver can return IOVA reserved
> > regions per device and the IOMMU core code will take care of mapping
> > these regions and reserving them in the IOVA allocator, so that
> > DMA-IOMMU code will not use it for allocations.
> >
> > Have a look at the iommu_ops->get_resv_regions() and
> > iommu_ops->put_resv_regions().
> 
> I know about the reserved regions IOMMU API, but the main problem here, 
> in case of Exynos, is that those reserved regions won't be created by 
> the IOMMU driver but by the IOMMU client device. It is just a result how 
> the media drivers manages their IOVA space. They simply have to load 
> firmware at the IOVA address lower than the any address of the used 
> buffers.

I've been working on adding a way to automatically add direct mappings
using reserved-memory regions parsed from device tree, see:


https://lore.kernel.org/lkml/2020090413.691933-1-thierry.red...@gmail.com/

Perhaps this can be of use? With that you should be able to add a
reserved-memory region somewhere in the lower range that you need for
firmware images and have that automatically added as a direct mapping
so that it won't be reused later on for dynamic allocations.

Thierry



Re: [PATCH] iommu/tegra-smmu: Fix tlb_mask

2020-09-17 Thread Thierry Reding
On Tue, Sep 15, 2020 at 05:23:59PM -0700, Nicolin Chen wrote:
> The "num_tlb_lines" might not be a power-of-2 value, being 48 on
> Tegra210 for example. So the current way of calculating tlb_mask
> using the num_tlb_lines is not correct: tlb_mask=0x5f in case of
> num_tlb_lines=48, which will trim a setting of 0x30 (48) to 0x10.
> 
> Signed-off-by: Nicolin Chen 
> ---
>  drivers/iommu/tegra-smmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

This is technically a prerequisite for this patch you sent out earlier:


https://patchwork.ozlabs.org/project/linux-tegra/patch/20200915232803.26163-1-nicoleots...@gmail.com/

You should send both of those out as one series and add maintainers for
both subsystems to both patches so that they can work out who will be
applying them.

For this pair it's probably best for Joerg to pick up both patches
because this primarily concerns the Tegra SMMU, whereas the above patch
only provides the per-SoC data update for the SMMU. Obviously if Joerg
prefers for Krzysztof to pick up both patches that's fine with me too.

In either case, please send this out as a series so that both Joerg and
Krzysztof (Cc'ed for visibility) are aware of both patches. From the
Tegra side:

Acked-by: Thierry Reding 

> diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
> index 84fdee473873..0becdbfea306 100644
> --- a/drivers/iommu/tegra-smmu.c
> +++ b/drivers/iommu/tegra-smmu.c
> @@ -1120,7 +1120,7 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
>   BIT_MASK(mc->soc->num_address_bits - SMMU_PTE_SHIFT) - 1;
>   dev_dbg(dev, "address bits: %u, PFN mask: %#lx\n",
>   mc->soc->num_address_bits, smmu->pfn_mask);
> - smmu->tlb_mask = (smmu->soc->num_tlb_lines << 1) - 1;
> + smmu->tlb_mask = (1 << fls(smmu->soc->num_tlb_lines)) - 1;
>   dev_dbg(dev, "TLB lines: %u, mask: %#lx\n", smmu->soc->num_tlb_lines,
>   smmu->tlb_mask);
>  
> -- 
> 2.17.1
> 
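
The arithmetic from the commit message can be checked in user space. The
sketch below is only an illustration: toy_fls() is a hypothetical portable
stand-in for the kernel's fls(), and tlb_mask_old()/tlb_mask_new() are names
made up here to mirror the removed and added lines of the diff.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Stand-in for the kernel's fls(): 1-based index of the most significant
 * set bit, or 0 if no bit is set.
 */
static unsigned int toy_fls(uint32_t x)
{
	unsigned int bit = 0;

	while (x) {
		bit++;
		x >>= 1;
	}

	return bit;
}

/* Old computation: only yields a contiguous mask for powers of two. */
static uint32_t tlb_mask_old(uint32_t num_tlb_lines)
{
	return (num_tlb_lines << 1) - 1;
}

/* New computation: all bits up to and including the highest set bit. */
static uint32_t tlb_mask_new(uint32_t num_tlb_lines)
{
	/* Keep the shift below within the width of a 32-bit value. */
	assert(num_tlb_lines > 0 && num_tlb_lines < (1u << 31));

	return (1u << toy_fls(num_tlb_lines)) - 1;
}
```

For num_tlb_lines = 48 the old formula gives 0x5f, which has a hole at bit 5
and therefore turns 0x30 into 0x10. The new formula gives 0x3f, which keeps
0x30 intact. For a power of two such as 32 both formulas agree (0x3f).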



Re: [PATCH v2 1/4] dt-bindings: reserved-memory: Document "active" property

2020-09-15 Thread Thierry Reding
On Mon, Sep 14, 2020 at 04:08:29PM -0600, Rob Herring wrote:
> On Fri, Sep 04, 2020 at 02:59:57PM +0200, Thierry Reding wrote:
> > From: Thierry Reding 
> > 
> > Reserved memory regions can be marked as "active" if hardware is
> > expected to access the regions during boot and before the operating
> > system can take control. One example where this is useful is for the
> > operating system to infer whether the region needs to be identity-
> > mapped through an IOMMU.
> 
> I like simple solutions, but this hardly seems adequate to solve the 
> problem of passing IOMMU setup from bootloader/firmware to the OS. Like 
> what is the IOVA that's supposed to be used if identity mapping is not 
> used?

The assumption here is that if the region is not active there is no need
for the IOVA to be specified because the kernel will allocate memory and
assign any IOVA of its choosing.

Also, note that this is not meant as a way of passing IOMMU setup from
the bootloader or firmware to the OS. The purpose of this is to specify
that some region of memory is actively being accessed during boot. The
particular case that I'm looking at is where the bootloader set up a
splash screen and keeps it on during boot. The bootloader has not set up
an IOMMU mapping and the identity mapping serves as a way of keeping the
accesses by the display hardware working during the transitional period
after the IOMMU translations have been enabled by the kernel but before
the kernel display driver has had a chance to set up its own IOMMU
mappings.

> If you know enough about the regions to assume identity mapping, then 
> can't you know if active or not?

We could alternatively add some property that describes the region as
requiring an identity mapping. But note that we can't make any
assumptions here about the usage of these regions because the IOMMU
driver simply has no way of knowing what they are being used for.

Some additional information is required in device tree for the IOMMU
driver to be able to make that decision.

Thierry

> 
> > Signed-off-by: Thierry Reding 
> > ---
> >  .../bindings/reserved-memory/reserved-memory.txt   | 7 +++
> >  1 file changed, 7 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > index 4dd20de6977f..163d2927e4fc 100644
> > --- a/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > +++ b/Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
> > @@ -63,6 +63,13 @@ reusable (optional) - empty property
> >able to reclaim it back. Typically that means that the operating
> >system can use that region to store volatile or cached data that
> >can be otherwise regenerated or migrated elsewhere.
> > +active (optional) - empty property
> > +- If this property is set for a reserved memory region, it indicates
> > +  that some piece of hardware may be actively accessing this region.
> > +  Should the operating system want to enable IOMMU protection for a
> > +  device, all active memory regions must have been identity-mapped
> > +  in order to ensure that non-quiescent hardware during boot can
> > +  continue to access the memory.
> >  
> >  Linux implementation note:
> >  - If a "linux,cma-default" property is present, then Linux will use the
> > -- 
> > 2.28.0
> > 



[PATCH v2 2/4] iommu: Implement of_iommu_get_resv_regions()

2020-09-04 Thread Thierry Reding
From: Thierry Reding 

This is an implementation that IOMMU drivers can use to obtain reserved
memory regions from a device tree node. It uses the reserved-memory DT
bindings to find the regions associated with a given device. These
regions will be used to create 1:1 mappings in the IOMMU domain that
the devices will be attached to.

Cc: Frank Rowand 
Cc: devicet...@vger.kernel.org
Signed-off-by: Thierry Reding 
---
Hi Rob,

you had previously reviewed this patch, but I haven't included that here
because there's a new property now that you might not be okay with.

Thierry

Changes in v2:
- use "active" property to determine whether direct mappings are needed

 drivers/iommu/of_iommu.c | 49 
 include/linux/of_iommu.h |  8 +++
 2 files changed, 57 insertions(+)

diff --git a/drivers/iommu/of_iommu.c b/drivers/iommu/of_iommu.c
index e505b9130a1c..3341d27fbbba 100644
--- a/drivers/iommu/of_iommu.c
+++ b/drivers/iommu/of_iommu.c
@@ -11,6 +11,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -245,3 +246,51 @@ const struct iommu_ops *of_iommu_configure(struct device *dev,
 
return ops;
 }
+
+/**
+ * of_iommu_get_resv_regions - reserved region driver helper for device tree
+ * @dev: device for which to get reserved regions
+ * @list: reserved region list
+ *
+ * IOMMU drivers can use this to implement their .get_resv_regions() callback
+ * for memory regions attached to a device tree node. See the reserved-memory
+ * device tree bindings on how to use these:
+ *
+ *   Documentation/devicetree/bindings/reserved-memory/reserved-memory.txt
+ */
+void of_iommu_get_resv_regions(struct device *dev, struct list_head *list)
+{
+   struct of_phandle_iterator it;
+   int err;
+
+   of_for_each_phandle(&it, err, dev->of_node, "memory-region", NULL, 0) {
+   struct iommu_resv_region *region;
+   struct resource res;
+
+   /*
+* Active memory regions are expected to be accessed by
+* hardware during boot and must therefore have an identity
+* mapping created prior to the driver taking control of the
+* hardware. This ensures that non-quiescent hardware doesn't
+* cause IOMMU faults during boot.
+*/
+   if (!of_property_read_bool(it.node, "active"))
+   continue;
+
+   err = of_address_to_resource(it.node, 0, &res);
+   if (err < 0) {
+   dev_err(dev, "failed to parse memory region %pOF: %d\n",
+   it.node, err);
+   continue;
+   }
+
+   region = iommu_alloc_resv_region(res.start, resource_size(&res),
+IOMMU_READ | IOMMU_WRITE,
+IOMMU_RESV_DIRECT_RELAXABLE);
+   if (!region)
+   continue;
+
+   list_add_tail(&region->list, list);
+   }
+}
+EXPORT_SYMBOL(of_iommu_get_resv_regions);
diff --git a/include/linux/of_iommu.h b/include/linux/of_iommu.h
index 16f4b3e87f20..8412437acaac 100644
--- a/include/linux/of_iommu.h
+++ b/include/linux/of_iommu.h
@@ -16,6 +16,9 @@ extern const struct iommu_ops *of_iommu_configure(struct device *dev,
struct device_node *master_np,
const u32 *id);
 
+extern void of_iommu_get_resv_regions(struct device *dev,
+ struct list_head *list);
+
 #else
 
 static inline int of_get_dma_window(struct device_node *dn, const char *prefix,
@@ -32,6 +35,11 @@ static inline const struct iommu_ops *of_iommu_configure(struct device *dev,
return NULL;
 }
 
+static inline void of_iommu_get_resv_regions(struct device *dev,
+struct list_head *list)
+{
+}
+
 #endif /* CONFIG_OF_IOMMU */
 
 #endif /* __OF_IOMMU_H */
-- 
2.28.0
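
For readers who want to follow the control flow of the helper without a
kernel tree, here is a small user-space model of the loop. It is a sketch
under assumptions, not the kernel code: struct toy_resv_node,
struct toy_direct_region and toy_get_resv_regions() are hypothetical, the
"active" flag stands in for of_property_read_bool(), and the "valid" flag
stands in for a successful of_address_to_resource().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a node referenced via "memory-region". */
struct toy_resv_node {
	uint64_t start;
	uint64_t size;
	bool active;	/* models presence of the "active" property */
	bool valid;	/* models whether the address could be parsed */
};

/* Hypothetical stand-in for a direct-mapped reserved region. */
struct toy_direct_region {
	uint64_t start;
	uint64_t length;
};

/*
 * Collect one direct region per active, parseable node. Inactive nodes
 * and nodes that fail to parse are skipped, as in the patch. Returns the
 * number of regions written to `out`.
 */
static size_t toy_get_resv_regions(const struct toy_resv_node *nodes,
				   size_t num_nodes,
				   struct toy_direct_region *out,
				   size_t max_out)
{
	size_t count = 0;
	size_t i;

	assert(nodes != NULL && out != NULL);

	for (i = 0; i < num_nodes && count < max_out; i++) {
		/* Regions that are not active need no identity mapping. */
		if (!nodes[i].active)
			continue;

		/* The patch logs an error here and moves on. */
		if (!nodes[i].valid)
			continue;

		out[count].start = nodes[i].start;
		out[count].length = nodes[i].size;
		count++;
	}

	return count;
}
```

Given three nodes where only the first is both active and parseable, the
model returns exactly one region with the first node's start and size.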



[RFC 4/4] iommu/tegra-smmu: Add support for reserved regions

2020-09-04 Thread Thierry Reding
From: Thierry Reding 

The Tegra DRM driver currently uses the IOMMU API explicitly. This means
that it has fine-grained control over when exactly the translation
through the IOMMU is enabled. This currently happens after the driver
probes, so the driver is in a DMA quiesced state when the IOMMU
translation is enabled.

During the transition of the Tegra DRM driver to use the DMA API instead
of the IOMMU API explicitly, it was observed that on certain platforms
the display controllers were still actively fetching from memory. When a
DMA IOMMU domain is created as part of the DMA/IOMMU API setup during
boot, the IOMMU translation for the display controllers can be enabled a
significant amount of time before the driver has had a chance to reset
the hardware into a sane state. This causes the SMMU to detect faults on
the addresses that the display controller is trying to fetch.

To avoid this, and as a byproduct paving the way for seamless transition
of display from the bootloader to the kernel, add support for reserved
regions in the Tegra SMMU driver. This is implemented using the standard
reserved memory device tree bindings, which let us describe regions of
memory which the kernel is forbidden from using for regular allocations.
The Tegra SMMU driver will parse the nodes associated with each device
via the "memory-region" property and return reserved regions that the
IOMMU core will then create direct mappings for prior to attaching the
IOMMU domains to the devices. This ensures that a 1:1 mapping is in
place when IOMMU translation starts and prevents the SMMU from detecting
any faults.

Signed-off-by: Thierry Reding 
---
I'm sending this out as RFC because there are a few hacks in here to make
this work properly and I'm not fully happy with this yet (see sections
marked with XXX).

Thierry
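
As a rough illustration of why a 1:1 mapping avoids the faults described
above, here is a toy single-level page table in user space. Everything in it
(struct toy_iommu, toy_map_direct(), toy_translate(), the TOY_* constants)
is hypothetical and has nothing to do with the real Tegra SMMU page table
format. It only shows that an address inside a direct-mapped region
translates to itself, while an unmapped address faults.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT	12
#define TOY_PAGE_SIZE	((uint64_t)1 << TOY_PAGE_SHIFT)
#define TOY_PAGE_MASK	(TOY_PAGE_SIZE - 1)
#define TOY_NUM_PAGES	1024		/* 4 MiB of IOVA space */
#define TOY_PTE_VALID	((uint64_t)1)
#define TOY_FAULT	UINT64_MAX

/* One entry per IOVA page: physical page address plus a valid bit. */
struct toy_iommu {
	uint64_t pte[TOY_NUM_PAGES];
};

/* Create an identity (IOVA == physical) mapping for a page-aligned range. */
static void toy_map_direct(struct toy_iommu *mmu, uint64_t start,
			   uint64_t size)
{
	uint64_t end = start + size;
	uint64_t addr;

	assert((start & TOY_PAGE_MASK) == 0 && (size & TOY_PAGE_MASK) == 0);
	assert((end >> TOY_PAGE_SHIFT) <= TOY_NUM_PAGES);

	for (addr = start; addr < end; addr += TOY_PAGE_SIZE)
		mmu->pte[addr >> TOY_PAGE_SHIFT] = addr | TOY_PTE_VALID;
}

/* Translate an IOVA, or return TOY_FAULT if no valid entry covers it. */
static uint64_t toy_translate(const struct toy_iommu *mmu, uint64_t iova)
{
	uint64_t index = iova >> TOY_PAGE_SHIFT;
	uint64_t pte;

	if (index >= TOY_NUM_PAGES)
		return TOY_FAULT;

	pte = mmu->pte[index];
	if (!(pte & TOY_PTE_VALID))
		return TOY_FAULT;

	return (pte & ~TOY_PAGE_MASK) | (iova & TOY_PAGE_MASK);
}
```

If the framebuffer set up by the bootloader lives in the mapped range, the
display controller's fetches keep resolving to the same physical addresses
after translation is enabled. Anything outside the range would fault, which
is the situation the direct mappings are meant to prevent.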

 drivers/iommu/tegra-smmu.c | 115 +
 1 file changed, 115 insertions(+)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 2574e716086b..33abc1527ac4 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -9,6 +9,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -530,6 +531,38 @@ static void tegra_smmu_set_pde(struct tegra_smmu_as *as, unsigned long iova,
struct tegra_smmu *smmu = as->smmu;
u32 *pd = page_address(as->pd);
unsigned long offset = pd_index * sizeof(*pd);
+   bool unmap = false;
+
+   /*
+* XXX Move this outside of this function. Perhaps add a struct
+* iommu_domain parameter to ->{get,put}_resv_regions() so that
+* the mapping can be done there.
+*
+* The problem here is that as->smmu is only known once we attach
+* the domain to a device (because then we look up the right SMMU
+* instance via the dev->archdata.iommu pointer). When the direct
+* mappings are created for reserved regions, the domain has not
+* been attached to a device yet, so we don't know. We currently
+* fix that up in ->apply_resv_regions() because that is the first
+* time where we have access to a struct device that will be used
+* with the IOMMU domain. However, that's asymmetric and doesn't
+* take care of the page directory mapping either, so we need to
+* come up with something better.
+*/
+   if (as->pd_dma == 0) {
+   as->pd_dma = dma_map_page(smmu->dev, as->pd, 0, SMMU_SIZE_PD,
+ DMA_TO_DEVICE);
+   if (dma_mapping_error(smmu->dev, as->pd_dma))
+   return;
+
+   if (!smmu_dma_addr_valid(smmu, as->pd_dma)) {
+   dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD,
+  DMA_TO_DEVICE);
+   return;
+   }
+
+   unmap = true;
+   }
 
/* Set the page directory entry first */
pd[pd_index] = value;
@@ -542,6 +575,12 @@ static void tegra_smmu_set_pde(struct tegra_smmu_as *as, unsigned long iova,
smmu_flush_ptc(smmu, as->pd_dma, offset);
smmu_flush_tlb_section(smmu, as->id, iova);
smmu_flush(smmu);
+
+   if (unmap) {
+   dma_unmap_page(smmu->dev, as->pd_dma, SMMU_SIZE_PD,
+  DMA_TO_DEVICE);
+   as->pd_dma = 0;
+   }
 }
 
 static u32 *tegra_smmu_pte_offset(struct page *pt_page, unsigned long iova)
@@ -882,6 +921,79 @@ static struct iommu_group *tegra_smmu_device_group(struct device *dev)
return group;
 }
 
+static void tegra_smmu_get_resv_regions(struct device *dev, struct list_head *list)
+{
+   struct of_phandle_iterator it;
+   int err;
+
+   if (!dev->of_node)
+   return;
+
+   of_for_each_phandle(&it, err, dev->of_node, "memory-region", NULL, 0) {
