Re: [RFC PATCH 3/3] PCI: iproc: Add dma reserve resources to host

2018-12-12 Thread poza

On 2018-12-12 11:16, Srinath Mannam wrote:

IPROC host has the limitation that it can use
only those address ranges given by dma-ranges
property as inbound address.
So that the memory address holes in dma-ranges
should be reserved to allocate as DMA address.

All such reserved addresses are created as resource
entries and add to dma_resv list of pci host bridge.

These dma reserve resources created by parsing
dma-ranges parameter.

Ex:
dma-ranges = < \
  0x4300 0x00 0x8000 0x00 0x8000 0x00 0x8000 \
  0x4300 0x08 0x 0x08 0x 0x08 0x \
  0x4300 0x80 0x 0x80 0x 0x40 0x>

In the above example of dma-ranges, memory address from
0x0 - 0x8000,
0x1 - 0x8,
0x10 - 0x80 and
0x100 - 0x.
are not allowed to use as inbound addresses.
So that we need to add these address range to dma_resv
list to reserve their IOVA address ranges.

Signed-off-by: Srinath Mannam 
---
 drivers/pci/controller/pcie-iproc.c | 49 
+

 1 file changed, 49 insertions(+)

diff --git a/drivers/pci/controller/pcie-iproc.c
b/drivers/pci/controller/pcie-iproc.c
index 3160e93..43e465a 100644
--- a/drivers/pci/controller/pcie-iproc.c
+++ b/drivers/pci/controller/pcie-iproc.c
@@ -1154,25 +1154,74 @@ static int iproc_pcie_setup_ib(struct 
iproc_pcie *pcie,

return ret;
 }

+static int
+iproc_pcie_add_dma_resv_range(struct device *dev, struct list_head 
*resources,

+ uint64_t start, uint64_t end)
+{
+   struct resource *res;
+
+   res = devm_kzalloc(dev, sizeof(struct resource), GFP_KERNEL);
+   if (!res)
+   return -ENOMEM;
+
+   res->start = (resource_size_t)start;
+   res->end = (resource_size_t)end;
+   pci_add_resource_offset(resources, res, 0);
+
+   return 0;
+}
+
 static int iproc_pcie_map_dma_ranges(struct iproc_pcie *pcie)
 {
+   struct pci_host_bridge *host = pci_host_bridge_from_priv(pcie);
struct of_pci_range range;
struct of_pci_range_parser parser;
int ret;
+   uint64_t start, end;
+   LIST_HEAD(resources);

/* Get the dma-ranges from DT */
ret = of_pci_dma_range_parser_init(, pcie->dev->of_node);
if (ret)
return ret;

+   start = 0;
for_each_of_pci_range(, ) {
+   end = range.pci_addr;
+   /* dma-ranges list expected in sorted order */
+   if (end < start) {
+   ret = -EINVAL;
+   goto out;
+   }
/* Each range entry corresponds to an inbound mapping region */
ret = iproc_pcie_setup_ib(pcie, , IPROC_PCIE_IB_MAP_MEM);
if (ret)
return ret;
+
+   if (end - start) {
+   ret = iproc_pcie_add_dma_resv_range(pcie->dev,
+   ,
+   start, end);
+   if (ret)
+   goto out;
+   }
+   start = range.pci_addr + range.size;
}

+   end = ~0;

Hi Srinath,

this series is based on following patch sets.

https://lkml.org/lkml/2017/5/16/19
https://lkml.org/lkml/2017/5/16/23
https://lkml.org/lkml/2017/5/16/21,

some comments to be adapted from the patch-set I did.

end = ~0;
you should consider DMA_MASK, to see iproc controller is in 32 bit or 64 
bit system.

please check following code snippet.

if (tmp_dma_addr < DMA_BIT_MASK(sizeof(dma_addr_t) * 8)) {
+   lo = iova_pfn(iovad, tmp_dma_addr);
+   hi = iova_pfn(iovad,
+ DMA_BIT_MASK(sizeof(dma_addr_t) * 8) - 1);
+   reserve_iova(iovad, lo, hi);
+   }

Also if this controller is integrated to 64bit platform, but decide to 
restrict DMA to 32 bit for some reason, the code should address such 
scenarios.

so it is always safe to do

#define BITS_PER_BYTE 8
DMA_BIT_MASK(sizeof(dma_addr_t) * BITS_PER_BYTE)
so please use kernle macro to find the end of DMA region.


Also ideally according to SBSA v5

8.3 PCI Express device view of memory

Transactions from a PCI express device will either directly address the 
memory system of the base server system
or be presented to a SMMU for optional address translation and 
permission policing.
In systems that are compatible with level 3 or above of the SBSA, the 
addresses sent by PCI express devices
must be presented to the memory system or SMMU unmodified. In a system 
where the PCI express does not use
an SMMU, the PCI express devices have the same view of physical memory 
as the PEs. In a system with a
SMMU for PCI express there are no transformations to addresses being 
sent by PCI express devices before they

are presented as an input address to the SMMU.


Re: [PATCH 1/1] iommu/arm-smmu: Add support to use Last level cache

2018-12-12 Thread Tomasz Figa
On Fri, Dec 7, 2018 at 6:25 PM Vivek Gautam  wrote:
>
> Hi Robin,
>
> On Tue, Dec 4, 2018 at 8:51 PM Robin Murphy  wrote:
> >
> > On 04/12/2018 11:01, Vivek Gautam wrote:
> > > Qualcomm SoCs have an additional level of cache called as
> > > System cache, aka. Last level cache (LLC). This cache sits right
> > > before the DDR, and is tightly coupled with the memory controller.
> > > The cache is available to all the clients present in the SoC system.
> > > The clients request their slices from this system cache, make it
> > > active, and can then start using it.
> > > For these clients with smmu, to start using the system cache for
> > > buffers and, related page tables [1], memory attributes need to be
> > > set accordingly.
> > > This change updates the MAIR and TCR configurations with correct
> > > attributes to use this system cache.
> > >
> > > To explain a little about memory attribute requirements here:
> > >
> > > Non-coherent I/O devices can't look-up into inner caches. However,
> > > coherent I/O devices can. But both can allocate in the system cache
> > > based on system policy and configured memory attributes in page
> > > tables.
> > > CPUs can access both inner and outer caches (including system cache,
> > > aka. Last level cache), and can allocate into system cache too
> > > based on memory attributes, and system policy.
> > >
> > > Further looking at memory types, we have following -
> > > a) Normal uncached :- MAIR 0x44, inner non-cacheable,
> > >outer non-cacheable;
> > > b) Normal cached :-   MAIR 0xff, inner read write-back non-transient,
> > >outer read write-back non-transient;
> > >attribute setting for coherenet I/O devices.
> > >
> > > and, for non-coherent i/o devices that can allocate in system cache
> > > another type gets added -
> > > c) Normal sys-cached/non-inner-cached :-
> > >MAIR 0xf4, inner non-cacheable,
> > >outer read write-back non-transient
> > >
> > > So, CPU will automatically use the system cache for memory marked as
> > > normal cached. The normal sys-cached is downgraded to normal non-cached
> > > memory for CPUs.
> > > Coherent I/O devices can use system cache by marking the memory as
> > > normal cached.
> > > Non-coherent I/O devices, to use system cache, should mark the memory as
> > > normal sys-cached in page tables.
> > >
> > > This change is a realisation of following changes
> > > from downstream msm-4.9:
> > > iommu: io-pgtable-arm: Support DOMAIN_ATTRIBUTE_USE_UPSTREAM_HINT[2]
> > > iommu: io-pgtable-arm: Implement IOMMU_USE_UPSTREAM_HINT[3]
> > >
> > > [1] https://patchwork.kernel.org/patch/10302791/
> > > [2] 
> > > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9=bf762276796e79ca90014992f4d9da5593fa7d51
> > > [3] 
> > > https://source.codeaurora.org/quic/la/kernel/msm-4.9/commit/?h=msm-4.9=d4c72c413ea27c43f60825193d4de9cb8ffd9602
> > >
> > > Signed-off-by: Vivek Gautam 
> > > ---
> > >
> > > Changes since v1:
> > >   - Addressed Tomasz's comments for basing the change on
> > > "NO_INNER_CACHE" concept for non-coherent I/O devices
> > > rather than capturing "SYS_CACHE". This is to indicate
> > > clearly the intent of non-coherent I/O devices that
> > > can't access inner caches.
> >
> > That seems backwards to me - there is already a fundamental assumption
> > that non-coherent devices can't access caches. What we're adding here is
> > a weird exception where they *can* use some level of cache despite still
> > being non-coherent overall.
> >
> > In other words, it's not a case of downgrading coherent devices'
> > accesses to bypass inner caches, it's upgrading non-coherent devices'
> > accesses to hit the outer cache. That's certainly the understanding I
> > got from talking with Pratik at Plumbers, and it does appear to fit with
> > your explanation above despite the final conclusion you draw being
> > different.
>
> Thanks for the thorough review of the change.
> Right, I guess it's rather an upgrade for non-coherent devices to use
> an outer cache than a downgrade for coherent devices.
>

Note that it was not my suggestion to use "NO_INNER_CACHE" for
enabling the system cache, sorry for not being clear. What I was
asking for in my comment was regarding the previous patch disabling
inner cache if system cache is requested, which may not make for
coherent devices, which could benefit from using both inner and system
cache.

So note that there are several cases here:
 - coherent, IC, system cache alloc,
 - coherent. non-IC, system cache alloc,
 - coherent, IC, system cache look-up,
 - noncoherent device, non-IC, system cache alloc,
 - noncoherent device, non-IC, system cache look-up.

Given the presence or lack of coherency for the device, which of the
2/3 options is the best depends on the use case, e.g. DMA/CPU access
pattern, sharing memory between multiple devices, etc.

Best regards,

Re: [PATCH] iommu/io-pgtable-arm-v7s: Don't check PHYS_OFFSET if RAMDOMIZE_BASE is enabled

2018-12-12 Thread Yong Wu
On Wed, 2018-12-12 at 13:39 +, Robin Murphy wrote:
> On 12/12/2018 13:02, Yong Wu wrote:
> > If CONFIG_RANDOMIZE_BASE is enabled, the "memstart_addr" will be updated
> > randomly, then the PHYS_OFFSET may be random.
> 
> Oh, I hadn't ever realised that, good catch. However, since 29859aeb8a6e 
> I think we should probably just remove this check altogether.

Thanks the hint. It looks that I only need revert 82db33dc5e49.

> 
> > Fixes: 82db33dc5e49 ("iommu/io-pgtable-arm: Check for v7s-incapable
> > systems")
> 
> Note that this alone wouldn't be sufficient for stable prior to 4.18, 
> since CONFIG_RANDOMIZE_BASE would then allow the original crash to 
> happen again.

Yes. the problem always exist. Nicolas is fixing this. I will delete the
tag here.

> 
> Robin.
> 
> > Reported-by: CK Hu 
> > Signed-off-by: Yong Wu 
> > ---
> >   drivers/iommu/io-pgtable-arm-v7s.c | 2 +-
> >   1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
> > b/drivers/iommu/io-pgtable-arm-v7s.c
> > index 445c3bd..70941e6 100644
> > --- a/drivers/iommu/io-pgtable-arm-v7s.c
> > +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> > @@ -709,7 +709,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
> > io_pgtable_cfg *cfg,
> >   {
> > struct arm_v7s_io_pgtable *data;
> >   
> > -#ifdef PHYS_OFFSET
> > +#if defined(PHYS_OFFSET) && !defined(CONFIG_RANDOMIZE_BASE)
> > if (upper_32_bits(PHYS_OFFSET))
> > return NULL;
> >   #endif
> > 
> 
> ___
> Linux-mediatek mailing list
> linux-media...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-mediatek


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 22/24] iommu/tegra: gart: Don't detach devices from inactive domains

2018-12-12 Thread Dmitry Osipenko
There could be unlimited number of allocated domains, but only one domain
can be active at a time. Hence devices must be detached only from the
active domain.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index a3ce6918577d..74c9be13f043 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -171,7 +171,7 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
 struct device *dev)
 {
struct gart_domain *gart_domain = to_gart_domain(domain);
-   struct gart_device *gart = gart_domain->gart;
+   struct gart_device *gart = gart_handle;
struct gart_client *client, *c;
int err = 0;
 
@@ -195,6 +195,7 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
goto fail;
}
gart->active_domain = domain;
+   gart_domain->gart = gart;
list_add(>list, >client);
spin_unlock(>client_lock);
dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
@@ -217,8 +218,10 @@ static void __gart_iommu_detach_dev(struct iommu_domain 
*domain,
if (c->dev == dev) {
list_del(>list);
kfree(c);
-   if (list_empty(>client))
+   if (list_empty(>client)) {
gart->active_domain = NULL;
+   gart_domain->gart = NULL;
+   }
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
return;
}
@@ -254,7 +257,6 @@ static struct iommu_domain 
*gart_iommu_domain_alloc(unsigned type)
if (!gart_domain)
return NULL;
 
-   gart_domain->gart = gart;
gart_domain->domain.geometry.aperture_start = gart->iovmm_base;
gart_domain->domain.geometry.aperture_end = gart->iovmm_base +
gart->page_count * GART_PAGE_SIZE - 1;
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 23/24] iommu/tegra: gart: Simplify clients-tracking code

2018-12-12 Thread Dmitry Osipenko
GART is a simple IOMMU provider that has single address space. There is
no need to setup global clients list and manage it for tracking of the
active domain, hence lot's of code could be safely removed and replaced
with a simpler alternative.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 155 ++---
 1 file changed, 40 insertions(+), 115 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 74c9be13f043..ad348c61d5e7 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -23,7 +23,6 @@
 
 #include 
 #include 
-#include 
 #include 
 #include 
 #include 
@@ -46,30 +45,20 @@
 #define GART_PAGE_MASK \
(~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
 
-struct gart_client {
-   struct device   *dev;
-   struct list_headlist;
-};
-
 struct gart_device {
void __iomem*regs;
u32 *savedata;
u32 page_count; /* total remappable size */
dma_addr_t  iovmm_base; /* offset to vmm_area */
spinlock_t  pte_lock;   /* for pagetable */
-   struct list_headclient;
-   spinlock_t  client_lock;/* for client list */
+   spinlock_t  dom_lock;   /* for active domain */
+   unsigned intactive_devices; /* number of active devices */
struct iommu_domain *active_domain; /* current active domain */
struct device   *dev;
 
struct iommu_device iommu;  /* IOMMU Core handle */
 };
 
-struct gart_domain {
-   struct iommu_domain domain; /* generic domain handle */
-   struct gart_device *gart;   /* link to gart device   */
-};
-
 static struct gart_device *gart_handle; /* unique for a system */
 
 static bool gart_debug;
@@ -77,11 +66,6 @@ static bool gart_debug;
 #define GART_PTE(_pfn) \
(GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
 
-static struct gart_domain *to_gart_domain(struct iommu_domain *dom)
-{
-   return container_of(dom, struct gart_domain, domain);
-}
-
 /*
  * Any interaction between any block on PPSB and a block on APB or AHB
  * must have these read-back to ensure the APB/AHB bus transaction is
@@ -170,125 +154,70 @@ static inline bool gart_iova_range_valid(struct 
gart_device *gart,
 static int gart_iommu_attach_dev(struct iommu_domain *domain,
 struct device *dev)
 {
-   struct gart_domain *gart_domain = to_gart_domain(domain);
struct gart_device *gart = gart_handle;
-   struct gart_client *client, *c;
-   int err = 0;
-
-   client = kzalloc(sizeof(*c), GFP_KERNEL);
-   if (!client)
-   return -ENOMEM;
-   client->dev = dev;
-
-   spin_lock(>client_lock);
-   list_for_each_entry(c, >client, list) {
-   if (c->dev == dev) {
-   dev_err(gart->dev,
-   "%s is already attached\n", dev_name(dev));
-   err = -EINVAL;
-   goto fail;
-   }
-   }
-   if (gart->active_domain && gart->active_domain != domain) {
-   dev_err(gart->dev, "Only one domain can be active at a time\n");
-   err = -EINVAL;
-   goto fail;
-   }
-   gart->active_domain = domain;
-   gart_domain->gart = gart;
-   list_add(>list, >client);
-   spin_unlock(>client_lock);
-   dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
-   return 0;
+   int ret = 0;
 
-fail:
-   kfree(client);
-   spin_unlock(>client_lock);
-   return err;
-}
+   spin_lock(>dom_lock);
 
-static void __gart_iommu_detach_dev(struct iommu_domain *domain,
-   struct device *dev)
-{
-   struct gart_domain *gart_domain = to_gart_domain(domain);
-   struct gart_device *gart = gart_domain->gart;
-   struct gart_client *c;
-
-   list_for_each_entry(c, >client, list) {
-   if (c->dev == dev) {
-   list_del(>list);
-   kfree(c);
-   if (list_empty(>client)) {
-   gart->active_domain = NULL;
-   gart_domain->gart = NULL;
-   }
-   dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
-   return;
-   }
+   if (gart->active_domain && gart->active_domain != domain) {
+   ret = -EBUSY;
+   } else if (dev->archdata.iommu != domain) {
+   dev->archdata.iommu = domain;
+   gart->active_domain = domain;
+   gart->active_devices++;
}
 
-   dev_err(gart->dev, "Couldn't find %s to 

[PATCH v7 24/24] iommu/tegra: gart: Perform code refactoring

2018-12-12 Thread Dmitry Osipenko
Removed redundant safety-checks in the code and some debug code that
isn't actually very useful for debugging, like enormous pagetable dump
on each fault. The majority of the changes are code reshuffling,
variables/whitespaces clean up and removal of debug messages that
duplicate messages of the IOMMU-core. Now the GART translation is kept
disabled while GART is suspended.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 253 +++--
 1 file changed, 105 insertions(+), 148 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index ad348c61d5e7..4d8057916552 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -1,5 +1,5 @@
 /*
- * IOMMU API for GART in Tegra20
+ * IOMMU API for Graphics Address Relocation Table on Tegra20
  *
  * Copyright (c) 2010-2012, NVIDIA CORPORATION.  All rights reserved.
  *
@@ -31,70 +31,63 @@
 
 #include 
 
-/* bitmap of the page sizes currently supported */
-#define GART_IOMMU_PGSIZES (SZ_4K)
-
 #define GART_REG_BASE  0x24
 #define GART_CONFIG(0x24 - GART_REG_BASE)
 #define GART_ENTRY_ADDR(0x28 - GART_REG_BASE)
 #define GART_ENTRY_DATA(0x2c - GART_REG_BASE)
-#define GART_ENTRY_PHYS_ADDR_VALID (1 << 31)
+
+#define GART_ENTRY_PHYS_ADDR_VALID BIT(31)
 
 #define GART_PAGE_SHIFT12
 #define GART_PAGE_SIZE (1 << GART_PAGE_SHIFT)
-#define GART_PAGE_MASK \
-   (~(GART_PAGE_SIZE - 1) & ~GART_ENTRY_PHYS_ADDR_VALID)
+#define GART_PAGE_MASK GENMASK(30, GART_PAGE_SHIFT)
+
+/* bitmap of the page sizes currently supported */
+#define GART_IOMMU_PGSIZES (GART_PAGE_SIZE)
 
 struct gart_device {
void __iomem*regs;
u32 *savedata;
-   u32 page_count; /* total remappable size */
-   dma_addr_t  iovmm_base; /* offset to vmm_area */
+   unsigned long   iovmm_base; /* offset to vmm_area start */
+   unsigned long   iovmm_end;  /* offset to vmm_area end */
spinlock_t  pte_lock;   /* for pagetable */
spinlock_t  dom_lock;   /* for active domain */
unsigned intactive_devices; /* number of active devices */
struct iommu_domain *active_domain; /* current active domain */
-   struct device   *dev;
-
struct iommu_device iommu;  /* IOMMU Core handle */
+   struct device   *dev;
 };
 
 static struct gart_device *gart_handle; /* unique for a system */
 
 static bool gart_debug;
 
-#define GART_PTE(_pfn) \
-   (GART_ENTRY_PHYS_ADDR_VALID | ((_pfn) << PAGE_SHIFT))
-
 /*
  * Any interaction between any block on PPSB and a block on APB or AHB
  * must have these read-back to ensure the APB/AHB bus transaction is
  * complete before initiating activity on the PPSB block.
  */
-#define FLUSH_GART_REGS(gart)  ((void)readl((gart)->regs + GART_CONFIG))
+#define FLUSH_GART_REGS(gart)  readl_relaxed((gart)->regs + GART_CONFIG)
 
 #define for_each_gart_pte(gart, iova)  \
for (iova = gart->iovmm_base;   \
-iova < gart->iovmm_base + GART_PAGE_SIZE * gart->page_count; \
+iova < gart->iovmm_end;\
 iova += GART_PAGE_SIZE)
 
 static inline void gart_set_pte(struct gart_device *gart,
-   unsigned long offs, u32 pte)
+   unsigned long iova, unsigned long pte)
 {
-   writel(offs, gart->regs + GART_ENTRY_ADDR);
-   writel(pte, gart->regs + GART_ENTRY_DATA);
-
-   dev_dbg(gart->dev, "%s %08lx:%08x\n",
-pte ? "map" : "unmap", offs, pte & GART_PAGE_MASK);
+   writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
+   writel_relaxed(pte, gart->regs + GART_ENTRY_DATA);
 }
 
 static inline unsigned long gart_read_pte(struct gart_device *gart,
- unsigned long offs)
+ unsigned long iova)
 {
unsigned long pte;
 
-   writel(offs, gart->regs + GART_ENTRY_ADDR);
-   pte = readl(gart->regs + GART_ENTRY_DATA);
+   writel_relaxed(iova, gart->regs + GART_ENTRY_ADDR);
+   pte = readl_relaxed(gart->regs + GART_ENTRY_DATA);
 
return pte;
 }
@@ -106,49 +99,20 @@ static void do_gart_setup(struct gart_device *gart, const 
u32 *data)
for_each_gart_pte(gart, iova)
gart_set_pte(gart, iova, data ? *(data++) : 0);
 
-   writel(1, gart->regs + GART_CONFIG);
+   writel_relaxed(1, gart->regs + GART_CONFIG);
FLUSH_GART_REGS(gart);
 }
 
-#ifdef DEBUG
-static void gart_dump_table(struct gart_device *gart)
-{
-   unsigned long 

[PATCH v7 21/24] iommu/tegra: gart: Prepend error/debug messages with "gart:"

2018-12-12 Thread Dmitry Osipenko
GART became a part of Memory Controller, hence now the drivers device
is Memory Controller and not GART. As a result all printed messages are
prepended with the "tegra-mc 7000f000.memory-controller:", so let's
prepend GART's messages with "gart:" in order to differentiate them
from the MC.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 2e0e6aff8f70..a3ce6918577d 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -19,6 +19,8 @@
  * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
+#define dev_fmt(fmt)   "gart: " fmt
+
 #include 
 #include 
 #include 
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 19/24] iommu/tegra: gart: Allow only one active domain at a time

2018-12-12 Thread Dmitry Osipenko
GART has a single address space that is shared by all devices, hence only
one domain could be active at a time.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 9 +
 1 file changed, 9 insertions(+)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 7fdd8b12efd5..71ff22be9560 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -57,6 +57,7 @@ struct gart_device {
spinlock_t  pte_lock;   /* for pagetable */
struct list_headclient;
spinlock_t  client_lock;/* for client list */
+   struct iommu_domain *active_domain; /* current active domain */
struct device   *dev;
 
struct iommu_device iommu;  /* IOMMU Core handle */
@@ -186,6 +187,12 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
goto fail;
}
}
+   if (gart->active_domain && gart->active_domain != domain) {
+   dev_err(gart->dev, "Only one domain can be active at a time\n");
+   err = -EINVAL;
+   goto fail;
+   }
+   gart->active_domain = domain;
list_add(>list, >client);
spin_unlock(>client_lock);
dev_dbg(gart->dev, "Attached %s\n", dev_name(dev));
@@ -208,6 +215,8 @@ static void __gart_iommu_detach_dev(struct iommu_domain 
*domain,
if (c->dev == dev) {
list_del(>list);
devm_kfree(gart->dev, c);
+   if (list_empty(>client))
+   gart->active_domain = NULL;
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
return;
}
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 18/24] iommu/tegra: gart: Fix NULL pointer dereference

2018-12-12 Thread Dmitry Osipenko
Fix NULL pointer dereference on IOMMU domain destruction that happens
because clients list is being iterated unsafely and its elements are
getting deleted during the iteration.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index a7a9400e0cd8..7fdd8b12efd5 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -260,9 +260,9 @@ static void gart_iommu_domain_free(struct iommu_domain 
*domain)
if (gart) {
spin_lock(>client_lock);
if (!list_empty(>client)) {
-   struct gart_client *c;
+   struct gart_client *c, *tmp;
 
-   list_for_each_entry(c, >client, list)
+   list_for_each_entry_safe(c, tmp, >client, list)
__gart_iommu_detach_dev(domain, c->dev);
}
spin_unlock(>client_lock);
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 17/24] iommu/tegra: gart: Fix spinlock recursion

2018-12-12 Thread Dmitry Osipenko
Fix spinlock recursion bug that happens on IOMMU domain destruction if
any of the allocated domains have devices attached to them.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 24 
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index b35ffa312a83..a7a9400e0cd8 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -197,25 +197,33 @@ static int gart_iommu_attach_dev(struct iommu_domain 
*domain,
return err;
 }
 
-static void gart_iommu_detach_dev(struct iommu_domain *domain,
- struct device *dev)
+static void __gart_iommu_detach_dev(struct iommu_domain *domain,
+   struct device *dev)
 {
struct gart_domain *gart_domain = to_gart_domain(domain);
struct gart_device *gart = gart_domain->gart;
struct gart_client *c;
 
-   spin_lock(>client_lock);
-
list_for_each_entry(c, >client, list) {
if (c->dev == dev) {
list_del(>list);
devm_kfree(gart->dev, c);
dev_dbg(gart->dev, "Detached %s\n", dev_name(dev));
-   goto out;
+   return;
}
}
-   dev_err(gart->dev, "Couldn't find\n");
-out:
+
+   dev_err(gart->dev, "Couldn't find %s to detach\n", dev_name(dev));
+}
+
+static void gart_iommu_detach_dev(struct iommu_domain *domain,
+ struct device *dev)
+{
+   struct gart_domain *gart_domain = to_gart_domain(domain);
+   struct gart_device *gart = gart_domain->gart;
+
+   spin_lock(>client_lock);
+   __gart_iommu_detach_dev(domain, dev);
spin_unlock(>client_lock);
 }
 
@@ -255,7 +263,7 @@ static void gart_iommu_domain_free(struct iommu_domain 
*domain)
struct gart_client *c;
 
list_for_each_entry(c, >client, list)
-   gart_iommu_detach_dev(domain, c->dev);
+   __gart_iommu_detach_dev(domain, c->dev);
}
spin_unlock(>client_lock);
}
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 15/24] memory: tegra: Do not ask for IRQ sharing

2018-12-12 Thread Dmitry Osipenko
Memory Controller driver never shared IRQ with any other driver and very
unlikely that it will. Hence there is no need to request IRQ sharing and
the corresponding flag can be dropped safely.

Signed-off-by: Dmitry Osipenko 
---
 drivers/memory/tegra/mc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index 3545868c51c0..570da2129fa6 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -682,7 +682,7 @@ static int tegra_mc_probe(struct platform_device *pdev)
 
mc_writel(mc, mc->soc->intmask, MC_INTMASK);
 
-   err = devm_request_irq(>dev, mc->irq, isr, IRQF_SHARED,
+   err = devm_request_irq(>dev, mc->irq, isr, 0,
   dev_name(>dev), mc);
if (err < 0) {
dev_err(>dev, "failed to request IRQ#%u: %d\n", mc->irq,
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 16/24] memory: tegra: Clean up error messages

2018-12-12 Thread Dmitry Osipenko
Make all messages to start with a lower case and don't unnecessarily go
over 80 chars in the code.

Signed-off-by: Dmitry Osipenko 
---
 drivers/memory/tegra/mc.c | 15 ---
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index 570da2129fa6..0a53598d982f 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -163,7 +163,7 @@ static int tegra_mc_hotreset_assert(struct 
reset_controller_dev *rcdev,
/* block clients DMA requests */
err = rst_ops->block_dma(mc, rst);
if (err) {
-   dev_err(mc->dev, "Failed to block %s DMA: %d\n",
+   dev_err(mc->dev, "failed to block %s DMA: %d\n",
rst->name, err);
return err;
}
@@ -173,7 +173,7 @@ static int tegra_mc_hotreset_assert(struct 
reset_controller_dev *rcdev,
/* wait for completion of the outstanding DMA requests */
while (!rst_ops->dma_idling(mc, rst)) {
if (!retries--) {
-   dev_err(mc->dev, "Failed to flush %s DMA\n",
+   dev_err(mc->dev, "failed to flush %s DMA\n",
rst->name);
return -EBUSY;
}
@@ -186,7 +186,7 @@ static int tegra_mc_hotreset_assert(struct 
reset_controller_dev *rcdev,
/* clear clients DMA requests sitting before arbitration */
err = rst_ops->hotreset_assert(mc, rst);
if (err) {
-   dev_err(mc->dev, "Failed to hot reset %s: %d\n",
+   dev_err(mc->dev, "failed to hot reset %s: %d\n",
rst->name, err);
return err;
}
@@ -215,7 +215,7 @@ static int tegra_mc_hotreset_deassert(struct 
reset_controller_dev *rcdev,
/* take out client from hot reset */
err = rst_ops->hotreset_deassert(mc, rst);
if (err) {
-   dev_err(mc->dev, "Failed to deassert hot reset %s: 
%d\n",
+   dev_err(mc->dev, "failed to deassert hot reset %s: 
%d\n",
rst->name, err);
return err;
}
@@ -225,7 +225,7 @@ static int tegra_mc_hotreset_deassert(struct 
reset_controller_dev *rcdev,
/* allow new DMA requests to proceed to arbitration */
err = rst_ops->unblock_dma(mc, rst);
if (err) {
-   dev_err(mc->dev, "Failed to unblock %s DMA : %d\n",
+   dev_err(mc->dev, "failed to unblock %s DMA : %d\n",
rst->name, err);
return err;
}
@@ -657,7 +657,8 @@ static int tegra_mc_probe(struct platform_device *pdev)
{
err = tegra_mc_setup_latency_allowance(mc);
if (err < 0) {
-   dev_err(>dev, "failed to setup latency allowance: 
%d\n",
+   dev_err(>dev,
+   "failed to setup latency allowance: %d\n",
err);
return err;
}
@@ -678,7 +679,7 @@ static int tegra_mc_probe(struct platform_device *pdev)
return mc->irq;
}
 
-   WARN(!mc->soc->client_id_mask, "Missing client ID mask for this SoC\n");
+   WARN(!mc->soc->client_id_mask, "missing client ID mask for this SoC\n");
 
mc_writel(mc, mc->soc->intmask, MC_INTMASK);
 
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 14/24] memory: tegra: Do not try to probe SMMU on Tegra20

2018-12-12 Thread Dmitry Osipenko
Tegra20 doesn't have SMMU. Move out checking of the SMMU presence from
the SMMU driver into the Memory Controller driver. This change makes code
consistent in regards to how GART/SMMU presence checking is performed.

Signed-off-by: Dmitry Osipenko 
---
 drivers/iommu/tegra-smmu.c | 4 
 drivers/memory/tegra/mc.c  | 6 --
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/tegra-smmu.c b/drivers/iommu/tegra-smmu.c
index 97b3c0461831..32ea81e9f012 100644
--- a/drivers/iommu/tegra-smmu.c
+++ b/drivers/iommu/tegra-smmu.c
@@ -982,10 +982,6 @@ struct tegra_smmu *tegra_smmu_probe(struct device *dev,
u32 value;
int err;
 
-   /* This can happen on Tegra20 which doesn't have an SMMU */
-   if (!soc)
-   return NULL;
-
smmu = devm_kzalloc(dev, sizeof(*smmu), GFP_KERNEL);
if (!smmu)
return ERR_PTR(-ENOMEM);
diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index e684e234361a..3545868c51c0 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -695,11 +695,13 @@ static int tegra_mc_probe(struct platform_device *pdev)
dev_err(>dev, "failed to register reset controller: %d\n",
err);
 
-   if (IS_ENABLED(CONFIG_TEGRA_IOMMU_SMMU)) {
+   if (IS_ENABLED(CONFIG_TEGRA_IOMMU_SMMU) && mc->soc->smmu) {
mc->smmu = tegra_smmu_probe(>dev, mc->soc->smmu, mc);
-   if (IS_ERR(mc->smmu))
+   if (IS_ERR(mc->smmu)) {
dev_err(>dev, "failed to probe SMMU: %ld\n",
PTR_ERR(mc->smmu));
+   mc->smmu = NULL;
+   }
}
 
if (IS_ENABLED(CONFIG_TEGRA_IOMMU_GART) && !mc->soc->smmu) {
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 13/24] iommu/tegra: gart: Integrate with Memory Controller driver

2018-12-12 Thread Dmitry Osipenko
The device-tree binding has been changed. There is no separate GART device
anymore, it is squashed into the Memory Controller. Integrate GART module
with the MC in a way it is done for the SMMU on Tegra30+.

Signed-off-by: Dmitry Osipenko 
---
 drivers/iommu/Kconfig  |  1 +
 drivers/iommu/tegra-gart.c | 71 ++
 drivers/memory/tegra/mc.c  | 43 +++
 include/soc/tegra/mc.h | 25 ++
 4 files changed, 87 insertions(+), 53 deletions(-)

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index d9a25715650e..83c099bb7288 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -282,6 +282,7 @@ config ROCKCHIP_IOMMU
 config TEGRA_IOMMU_GART
bool "Tegra GART IOMMU Support"
depends on ARCH_TEGRA_2x_SOC
+   depends on TEGRA_MC
select IOMMU_API
help
  Enables support for remapping discontiguous physical memory
diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 835fea461c59..b35ffa312a83 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -19,16 +19,17 @@
  * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
-#include 
 #include 
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 #include 
 
+#include 
+
 /* bitmap of the page sizes currently supported */
 #define GART_IOMMU_PGSIZES (SZ_4K)
 
@@ -397,9 +398,8 @@ static const struct iommu_ops gart_iommu_ops = {
.iotlb_sync = gart_iommu_sync,
 };
 
-static int tegra_gart_suspend(struct device *dev)
+int tegra_gart_suspend(struct gart_device *gart)
 {
-   struct gart_device *gart = dev_get_drvdata(dev);
unsigned long iova;
u32 *data = gart->savedata;
unsigned long flags;
@@ -411,9 +411,8 @@ static int tegra_gart_suspend(struct device *dev)
return 0;
 }
 
-static int tegra_gart_resume(struct device *dev)
+int tegra_gart_resume(struct gart_device *gart)
 {
-   struct gart_device *gart = dev_get_drvdata(dev);
unsigned long flags;
 
spin_lock_irqsave(>pte_lock, flags);
@@ -422,41 +421,33 @@ static int tegra_gart_resume(struct device *dev)
return 0;
 }
 
-static int tegra_gart_probe(struct platform_device *pdev)
+struct gart_device *tegra_gart_probe(struct device *dev, struct tegra_mc *mc)
 {
struct gart_device *gart;
-   struct resource *res, *res_remap;
+   struct resource *res_remap;
void __iomem *gart_regs;
-   struct device *dev = >dev;
int ret;
 
BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
 
/* the GART memory aperture is required */
-   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-   res_remap = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-   if (!res || !res_remap) {
+   res_remap = platform_get_resource(to_platform_device(dev),
+ IORESOURCE_MEM, 1);
+   if (!res_remap) {
dev_err(dev, "GART memory aperture expected\n");
-   return -ENXIO;
+   return ERR_PTR(-ENXIO);
}
 
gart = devm_kzalloc(dev, sizeof(*gart), GFP_KERNEL);
if (!gart) {
dev_err(dev, "failed to allocate gart_device\n");
-   return -ENOMEM;
-   }
-
-   gart_regs = devm_ioremap(dev, res->start, resource_size(res));
-   if (!gart_regs) {
-   dev_err(dev, "failed to remap GART registers\n");
-   return -ENXIO;
+   return ERR_PTR(-ENOMEM);
}
 
-   ret = iommu_device_sysfs_add(>iommu, >dev, NULL,
-dev_name(>dev));
+   ret = iommu_device_sysfs_add(>iommu, dev, NULL, "gart");
if (ret) {
dev_err(dev, "Failed to register IOMMU in sysfs\n");
-   return ret;
+   return ERR_PTR(ret);
}
 
iommu_device_set_ops(>iommu, _iommu_ops);
@@ -468,7 +459,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
goto remove_sysfs;
}
 
-   gart->dev = >dev;
+   gart->dev = dev;
+   gart_regs = mc->regs + GART_REG_BASE;
spin_lock_init(>pte_lock);
spin_lock_init(>client_lock);
INIT_LIST_HEAD(>client);
@@ -483,46 +475,19 @@ static int tegra_gart_probe(struct platform_device *pdev)
goto unregister_iommu;
}
 
-   platform_set_drvdata(pdev, gart);
do_gart_setup(gart, NULL);
 
gart_handle = gart;
 
-   return 0;
+   return gart;
 
 unregister_iommu:
iommu_device_unregister(>iommu);
 remove_sysfs:
iommu_device_sysfs_remove(>iommu);
 
-   return ret;
-}
-
-static const struct dev_pm_ops tegra_gart_pm_ops = {
-   .suspend= tegra_gart_suspend,
-   .resume = tegra_gart_resume,
-};
-
-static const struct of_device_id tegra_gart_of_match[] = {
-   { .compatible = "nvidia,tegra20-gart", },
-   { },
-};
-

[PATCH v7 12/24] memory: tegra: Use relaxed versions of readl/writel

2018-12-12 Thread Dmitry Osipenko
There is no need for inserting of memory barriers to access registers of
Memory Controller. Hence use the relaxed versions of the accessors.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/memory/tegra/mc.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
index 9856f085e487..887a3b07334f 100644
--- a/drivers/memory/tegra/mc.h
+++ b/drivers/memory/tegra/mc.h
@@ -26,13 +26,13 @@
 
 static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
 {
-   return readl(mc->regs + offset);
+   return readl_relaxed(mc->regs + offset);
 }
 
 static inline void mc_writel(struct tegra_mc *mc, u32 value,
 unsigned long offset)
 {
-   writel(value, mc->regs + offset);
+   writel_relaxed(value, mc->regs + offset);
 }
 
 extern const struct tegra_mc_reset_ops terga_mc_reset_ops_common;
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 10/24] memory: tegra: Read client ID on GART page fault

2018-12-12 Thread Dmitry Osipenko
With the device tree binding changes, now Memory Controller has access to
GART registers. Hence it is now possible to read client ID on GART page
fault to get information about what memory client causes the fault.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/memory/tegra/mc.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index 59db13287b47..ce8cf81b55d7 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -38,6 +38,7 @@
 
 #define MC_ERR_ADR 0x0c
 
+#define MC_GART_ERROR_REQ  0x30
 #define MC_DECERR_EMEM_OTHERS_STATUS   0x58
 #define MC_SECURITY_VIOLATION_STATUS   0x74
 
@@ -575,8 +576,15 @@ static __maybe_unused irqreturn_t tegra20_mc_irq(int irq, 
void *data)
break;
 
case MC_INT_INVALID_GART_PAGE:
-   dev_err_ratelimited(mc->dev, "%s\n", error);
-   continue;
+   reg = MC_GART_ERROR_REQ;
+   value = mc_readl(mc, reg);
+
+   id = (value >> 1) & mc->soc->client_id_mask;
+   desc = error_names[2];
+
+   if (value & BIT(0))
+   direction = "write";
+   break;
 
case MC_INT_SECURITY_VIOLATION:
reg = MC_SECURITY_VIOLATION_STATUS;
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 11/24] memory: tegra: Use of_device_get_match_data()

2018-12-12 Thread Dmitry Osipenko
There is no need to match device with the DT node since it was already
matched, use of_device_get_match_data() helper to get the match-data.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/memory/tegra/mc.c | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index ce8cf81b55d7..55ecfb2d8cfd 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -619,23 +620,18 @@ static __maybe_unused irqreturn_t tegra20_mc_irq(int irq, 
void *data)
 
 static int tegra_mc_probe(struct platform_device *pdev)
 {
-   const struct of_device_id *match;
struct resource *res;
struct tegra_mc *mc;
void *isr;
int err;
 
-   match = of_match_node(tegra_mc_of_match, pdev->dev.of_node);
-   if (!match)
-   return -ENODEV;
-
mc = devm_kzalloc(>dev, sizeof(*mc), GFP_KERNEL);
if (!mc)
return -ENOMEM;
 
platform_set_drvdata(pdev, mc);
spin_lock_init(>lock);
-   mc->soc = match->data;
+   mc->soc = of_device_get_match_data(>dev);
mc->dev = >dev;
 
/* length of MC tick in nanoseconds */
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 09/24] memory: tegra: Adapt to Tegra20 device-tree binding changes

2018-12-12 Thread Dmitry Osipenko
The tegra20-mc device-tree binding has been changed, GART has been
squashed into Memory Controller and now the clock property is mandatory
for Tegra20, the DT compatible has been changed as well. Adapt driver to
the DT changes.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/memory/tegra/mc.c | 21 -
 drivers/memory/tegra/mc.h |  6 --
 include/soc/tegra/mc.h|  2 +-
 3 files changed, 9 insertions(+), 20 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index b99f3c620f6c..59db13287b47 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -51,7 +51,7 @@
 
 static const struct of_device_id tegra_mc_of_match[] = {
 #ifdef CONFIG_ARCH_TEGRA_2x_SOC
-   { .compatible = "nvidia,tegra20-mc", .data = _mc_soc },
+   { .compatible = "nvidia,tegra20-mc-gart", .data = _mc_soc },
 #endif
 #ifdef CONFIG_ARCH_TEGRA_3x_SOC
{ .compatible = "nvidia,tegra30-mc", .data = _mc_soc },
@@ -638,24 +638,19 @@ static int tegra_mc_probe(struct platform_device *pdev)
if (IS_ERR(mc->regs))
return PTR_ERR(mc->regs);
 
+   mc->clk = devm_clk_get(>dev, "mc");
+   if (IS_ERR(mc->clk)) {
+   dev_err(>dev, "failed to get MC clock: %ld\n",
+   PTR_ERR(mc->clk));
+   return PTR_ERR(mc->clk);
+   }
+
 #ifdef CONFIG_ARCH_TEGRA_2x_SOC
if (mc->soc == _mc_soc) {
-   res = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-   mc->regs2 = devm_ioremap_resource(>dev, res);
-   if (IS_ERR(mc->regs2))
-   return PTR_ERR(mc->regs2);
-
isr = tegra20_mc_irq;
} else
 #endif
{
-   mc->clk = devm_clk_get(>dev, "mc");
-   if (IS_ERR(mc->clk)) {
-   dev_err(>dev, "failed to get MC clock: %ld\n",
-   PTR_ERR(mc->clk));
-   return PTR_ERR(mc->clk);
-   }
-
err = tegra_mc_setup_latency_allowance(mc);
if (err < 0) {
dev_err(>dev, "failed to setup latency allowance: 
%d\n",
diff --git a/drivers/memory/tegra/mc.h b/drivers/memory/tegra/mc.h
index 01065f12ebeb..9856f085e487 100644
--- a/drivers/memory/tegra/mc.h
+++ b/drivers/memory/tegra/mc.h
@@ -26,18 +26,12 @@
 
 static inline u32 mc_readl(struct tegra_mc *mc, unsigned long offset)
 {
-   if (mc->regs2 && offset >= 0x24)
-   return readl(mc->regs2 + offset - 0x3c);
-
return readl(mc->regs + offset);
 }
 
 static inline void mc_writel(struct tegra_mc *mc, u32 value,
 unsigned long offset)
 {
-   if (mc->regs2 && offset >= 0x24)
-   return writel(value, mc->regs2 + offset - 0x3c);
-
writel(value, mc->regs + offset);
 }
 
diff --git a/include/soc/tegra/mc.h b/include/soc/tegra/mc.h
index b43f37fea096..db5bfdf589b4 100644
--- a/include/soc/tegra/mc.h
+++ b/include/soc/tegra/mc.h
@@ -144,7 +144,7 @@ struct tegra_mc_soc {
 struct tegra_mc {
struct device *dev;
struct tegra_smmu *smmu;
-   void __iomem *regs, *regs2;
+   void __iomem *regs;
struct clk *clk;
int irq;
 
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 08/24] memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20

2018-12-12 Thread Dmitry Osipenko
This fixes irrelevant "tegra-mc 7000f000.memory-controller: no memory
timings for RAM code 0 registered" warning message during of kernels
boot-up on Tegra20.

Fixes: a8d502fd3348 ("memory: tegra: Squash tegra20-mc into common tegra-mc 
driver")
Signed-off-by: Dmitry Osipenko 
Acked-by: Jon Hunter 
Acked-by: Thierry Reding 
---
 drivers/memory/tegra/mc.c | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/memory/tegra/mc.c b/drivers/memory/tegra/mc.c
index 24afc36833bf..b99f3c620f6c 100644
--- a/drivers/memory/tegra/mc.c
+++ b/drivers/memory/tegra/mc.c
@@ -664,12 +664,13 @@ static int tegra_mc_probe(struct platform_device *pdev)
}
 
isr = tegra_mc_irq;
-   }
 
-   err = tegra_mc_setup_timings(mc);
-   if (err < 0) {
-   dev_err(>dev, "failed to setup timings: %d\n", err);
-   return err;
+   err = tegra_mc_setup_timings(mc);
+   if (err < 0) {
+   dev_err(>dev, "failed to setup timings: %d\n",
+   err);
+   return err;
+   }
}
 
mc->irq = platform_get_irq(pdev, 0);
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 07/24] ARM: dts: tegra20: Update Memory Controller node to the new binding

2018-12-12 Thread Dmitry Osipenko
Device tree binding of Memory Controller has been changed: GART has been
squashed into the MC, there are a new mandatory clock and #iommu-cells
properties, the compatible has been changed to 'tegra20-mc-gart'.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 arch/arm/boot/dts/tegra20.dtsi | 15 ++-
 1 file changed, 6 insertions(+), 9 deletions(-)

diff --git a/arch/arm/boot/dts/tegra20.dtsi b/arch/arm/boot/dts/tegra20.dtsi
index dcad6d6128cf..8c942e60703e 100644
--- a/arch/arm/boot/dts/tegra20.dtsi
+++ b/arch/arm/boot/dts/tegra20.dtsi
@@ -616,17 +616,14 @@
};
 
mc: memory-controller@7000f000 {
-   compatible = "nvidia,tegra20-mc";
-   reg = <0x7000f000 0x024
-  0x7000f03c 0x3c4>;
+   compatible = "nvidia,tegra20-mc-gart";
+   reg = <0x7000f000 0x400 /* controller registers */
+  0x5800 0x0200>;  /* GART aperture */
+   clocks = <_car TEGRA20_CLK_MC>;
+   clock-names = "mc";
interrupts = ;
#reset-cells = <1>;
-   };
-
-   iommu@7000f024 {
-   compatible = "nvidia,tegra20-gart";
-   reg = <0x7000f024 0x0018/* controller registers */
-  0x5800 0x0200>;  /* GART aperture */
+   #iommu-cells = <0>;
};
 
memory-controller@7000f400 {
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 06/24] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc

2018-12-12 Thread Dmitry Osipenko
Splitting GART and Memory Controller wasn't a good decision that was made
back in the day. Given that the GART driver wasn't ever been used by
anything in the kernel, we decided that it will be better to correct the
mistakes of the past and merge two bindings into a single one. As a result
there is a DT ABI change for the Memory Controller that allows not to
break newer kernels using older DT and not to break older kernels using
newer DT, that is done by changing the 'compatible' of the node to
'tegra20-mc-gart' and adding a new-required clock property. The new clock
property also puts the tegra20-mc binding in line with the bindings of the
later Tegra generations.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Rob Herring 
Acked-by: Thierry Reding 
---
 .../bindings/iommu/nvidia,tegra20-gart.txt| 14 --
 .../memory-controllers/nvidia,tegra20-mc.txt  | 27 +--
 2 files changed, 19 insertions(+), 22 deletions(-)
 delete mode 100644 
Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

diff --git a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt 
b/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
deleted file mode 100644
index 099d9362ebc1..
--- a/Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt
+++ /dev/null
@@ -1,14 +0,0 @@
-NVIDIA Tegra 20 GART
-
-Required properties:
-- compatible: "nvidia,tegra20-gart"
-- reg: Two pairs of cells specifying the physical address and size of
-  the memory controller registers and the GART aperture respectively.
-
-Example:
-
-   gart {
-   compatible = "nvidia,tegra20-gart";
-   reg = <0x7000f024 0x0018/* controller registers */
-  0x5800 0x0200>;  /* GART aperture */
-   };
diff --git 
a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt 
b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
index 7d60a50a4fa1..e55328237df4 100644
--- a/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
+++ b/Documentation/devicetree/bindings/memory-controllers/nvidia,tegra20-mc.txt
@@ -1,26 +1,37 @@
 NVIDIA Tegra20 MC(Memory Controller)
 
 Required properties:
-- compatible : "nvidia,tegra20-mc"
-- reg : Should contain 2 register ranges(address and length); see the
-  example below. Note that the MC registers are interleaved with the
-  GART registers, and hence must be represented as multiple ranges.
+- compatible : "nvidia,tegra20-mc-gart"
+- reg : Should contain 2 register ranges: physical base address and length of
+  the controller's registers and the GART aperture respectively.
+- clocks: Must contain an entry for each entry in clock-names.
+  See ../clocks/clock-bindings.txt for details.
+- clock-names: Must include the following entries:
+  - mc: the module's clock input
 - interrupts : Should contain MC General interrupt.
 - #reset-cells : Should be 1. This cell represents memory client module ID.
   The assignments may be found in header file 
   or in the TRM documentation.
+- #iommu-cells: Should be 0. This cell represents the number of cells in an
+  IOMMU specifier needed to encode an address. GART supports only a single
+  address space that is shared by all devices, therefore no additional
+  information needed for the address encoding.
 
 Example:
mc: memory-controller@7000f000 {
-   compatible = "nvidia,tegra20-mc";
-   reg = <0x7000f000 0x024
-  0x7000f03c 0x3c4>;
-   interrupts = <0 77 0x04>;
+   compatible = "nvidia,tegra20-mc-gart";
+   reg = <0x7000f000 0x400 /* controller registers */
+  0x5800 0x0200>;  /* GART aperture */
+   clocks = <_car TEGRA20_CLK_MC>;
+   clock-names = "mc";
+   interrupts = ;
#reset-cells = <1>;
+   #iommu-cells = <0>;
};
 
video-codec@6001a000 {
compatible = "nvidia,tegra20-vde";
...
resets = < TEGRA20_MC_RESET_VDE>;
+   iommus = <>;
};
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 04/24] iommu: Introduce iotlb_sync_map callback

2018-12-12 Thread Dmitry Osipenko
Introduce iotlb_sync_map() callback that is invoked in the end of
iommu_map(). This new callback allows IOMMU drivers to avoid syncing
after mapping of each contiguous chunk and sync only when the whole
mapping is completed, optimizing performance of the mapping operation.

Signed-off-by: Dmitry Osipenko 
Reviewed-by: Robin Murphy 
Reviewed-by: Thierry Reding 
---
 drivers/iommu/iommu.c | 8 ++--
 include/linux/iommu.h | 1 +
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cc25ec6d4c06..79e7c49ed16a 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1582,13 +1582,14 @@ static size_t iommu_pgsize(struct iommu_domain *domain,
 int iommu_map(struct iommu_domain *domain, unsigned long iova,
  phys_addr_t paddr, size_t size, int prot)
 {
+   const struct iommu_ops *ops = domain->ops;
unsigned long orig_iova = iova;
unsigned int min_pagesz;
size_t orig_size = size;
phys_addr_t orig_paddr = paddr;
int ret = 0;
 
-   if (unlikely(domain->ops->map == NULL ||
+   if (unlikely(ops->map == NULL ||
 domain->pgsize_bitmap == 0UL))
return -ENODEV;
 
@@ -1617,7 +1618,7 @@ int iommu_map(struct iommu_domain *domain, unsigned long 
iova,
pr_debug("mapping: iova 0x%lx pa %pa pgsize 0x%zx\n",
 iova, , pgsize);
 
-   ret = domain->ops->map(domain, iova, paddr, pgsize, prot);
+   ret = ops->map(domain, iova, paddr, pgsize, prot);
if (ret)
break;
 
@@ -1626,6 +1627,9 @@ int iommu_map(struct iommu_domain *domain, unsigned long 
iova,
size -= pgsize;
}
 
+   if (ops->iotlb_sync_map)
+   ops->iotlb_sync_map(domain);
+
/* unroll mapping in case something went wrong */
if (ret)
iommu_unmap(domain, orig_iova, orig_size - size);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 11db18b9ffe8..0e90c5cc72db 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -201,6 +201,7 @@ struct iommu_ops {
void (*flush_iotlb_all)(struct iommu_domain *domain);
void (*iotlb_range_add)(struct iommu_domain *domain,
unsigned long iova, size_t size);
+   void (*iotlb_sync_map)(struct iommu_domain *domain);
void (*iotlb_sync)(struct iommu_domain *domain);
phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t 
iova);
int (*add_device)(struct device *dev);
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 03/24] iommu/tegra: gart: Ignore devices without IOMMU phandle in DT

2018-12-12 Thread Dmitry Osipenko
GART can't handle all devices, hence ignore devices that aren't related
to GART. IOMMU phandle must be explicitly assign to devices in the device
tree.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 1cd470b2beea..37a76388ff7e 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -345,8 +345,12 @@ static bool gart_iommu_capable(enum iommu_cap cap)
 
 static int gart_iommu_add_device(struct device *dev)
 {
-   struct iommu_group *group = iommu_group_get_for_dev(dev);
+   struct iommu_group *group;
 
+   if (!dev->iommu_fwspec)
+   return -ENODEV;
+
+   group = iommu_group_get_for_dev(dev);
if (IS_ERR(group))
return PTR_ERR(group);
 
@@ -363,6 +367,12 @@ static void gart_iommu_remove_device(struct device *dev)
iommu_device_unlink(_handle->iommu, dev);
 }
 
+static int gart_iommu_of_xlate(struct device *dev,
+  struct of_phandle_args *args)
+{
+   return 0;
+}
+
 static const struct iommu_ops gart_iommu_ops = {
.capable= gart_iommu_capable,
.domain_alloc   = gart_iommu_domain_alloc,
@@ -376,6 +386,7 @@ static const struct iommu_ops gart_iommu_ops = {
.unmap  = gart_iommu_unmap,
.iova_to_phys   = gart_iommu_iova_to_phys,
.pgsize_bitmap  = GART_IOMMU_PGSIZES,
+   .of_xlate   = gart_iommu_of_xlate,
 };
 
 static int tegra_gart_suspend(struct device *dev)
@@ -441,6 +452,7 @@ static int tegra_gart_probe(struct platform_device *pdev)
}
 
iommu_device_set_ops(>iommu, _iommu_ops);
+   iommu_device_set_fwnode(>iommu, dev->fwnode);
 
ret = iommu_device_register(>iommu);
if (ret) {
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 02/24] iommu/tegra: gart: Clean up driver probe errors handling

2018-12-12 Thread Dmitry Osipenko
Properly clean up allocated resources on the drivers probe failure and
remove unneeded checks.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 16 ++--
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index ff75cf60117b..1cd470b2beea 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -411,9 +411,6 @@ static int tegra_gart_probe(struct platform_device *pdev)
struct device *dev = >dev;
int ret;
 
-   if (gart_handle)
-   return -EIO;
-
BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
 
/* the GART memory aperture is required */
@@ -448,8 +445,7 @@ static int tegra_gart_probe(struct platform_device *pdev)
ret = iommu_device_register(>iommu);
if (ret) {
dev_err(dev, "Failed to register IOMMU\n");
-   iommu_device_sysfs_remove(>iommu);
-   return ret;
+   goto remove_sysfs;
}
 
gart->dev = >dev;
@@ -463,7 +459,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
gart->savedata = vmalloc(array_size(sizeof(u32), gart->page_count));
if (!gart->savedata) {
dev_err(dev, "failed to allocate context save area\n");
-   return -ENOMEM;
+   ret = -ENOMEM;
+   goto unregister_iommu;
}
 
platform_set_drvdata(pdev, gart);
@@ -472,6 +469,13 @@ static int tegra_gart_probe(struct platform_device *pdev)
gart_handle = gart;
 
return 0;
+
+unregister_iommu:
+   iommu_device_unregister(>iommu);
+remove_sysfs:
+   iommu_device_sysfs_remove(>iommu);
+
+   return ret;
 }
 
 static const struct dev_pm_ops tegra_gart_pm_ops = {
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 05/24] iommu/tegra: gart: Optimize mapping / unmapping performance

2018-12-12 Thread Dmitry Osipenko
Currently GART writes one page entry at a time. More optimal would be to
aggregate the writes and flush BUS buffer in the end, this gives map/unmap
10-40% performance boost (depending on size of mapping) in comparison to
flushing after each page entry update.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index 37a76388ff7e..835fea461c59 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -290,7 +290,6 @@ static int gart_iommu_map(struct iommu_domain *domain, 
unsigned long iova,
}
}
gart_set_pte(gart, iova, GART_PTE(pfn));
-   FLUSH_GART_REGS(gart);
spin_unlock_irqrestore(>pte_lock, flags);
return 0;
 }
@@ -307,7 +306,6 @@ static size_t gart_iommu_unmap(struct iommu_domain *domain, 
unsigned long iova,
 
spin_lock_irqsave(>pte_lock, flags);
gart_set_pte(gart, iova, 0);
-   FLUSH_GART_REGS(gart);
spin_unlock_irqrestore(>pte_lock, flags);
return bytes;
 }
@@ -373,6 +371,14 @@ static int gart_iommu_of_xlate(struct device *dev,
return 0;
 }
 
+static void gart_iommu_sync(struct iommu_domain *domain)
+{
+   struct gart_domain *gart_domain = to_gart_domain(domain);
+   struct gart_device *gart = gart_domain->gart;
+
+   FLUSH_GART_REGS(gart);
+}
+
 static const struct iommu_ops gart_iommu_ops = {
.capable= gart_iommu_capable,
.domain_alloc   = gart_iommu_domain_alloc,
@@ -387,6 +393,8 @@ static const struct iommu_ops gart_iommu_ops = {
.iova_to_phys   = gart_iommu_iova_to_phys,
.pgsize_bitmap  = GART_IOMMU_PGSIZES,
.of_xlate   = gart_iommu_of_xlate,
+   .iotlb_sync_map = gart_iommu_sync,
+   .iotlb_sync = gart_iommu_sync,
 };
 
 static int tegra_gart_suspend(struct device *dev)
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 00/24] Tegra GART driver clean up and optimization

2018-12-12 Thread Dmitry Osipenko
Hello,

This patch-series integrates the GART (IOMMU) driver with the Memory
Controller driver, that allows to report the name of a faulty memory
client on GART page fault. A major code clean up and performance
optimization is performed in this series as well.

Changelog:

v7: Addressed review comments to v6 from Thierry Reding.

Added three new (minor) patches that further prettifying the drivers
code.

New patches in v7:
1) "memory: tegra: Do not try to probe SMMU on Tegra20"
2) "memory: tegra: Do not ask for IRQ sharing"
3) "memory: tegra: Clean up error messages"

Incorporated another "one-line" change into the "perform code refactoring"
patch: now GART translation is kept disabled in HW while GART is suspended.

v6: v5 that is re-based on the recent linux-next.

v5: Addressed review comments from Thierry Reding to v4. Added WARN_ON() to
make sure that active domain isn't getting released, kept include headers
where necessary, etc.. All changes are quite minor.

Added new patch "memory: tegra: Use relaxed versions of readl/writel".

v4: In the v3 Rob Herring requested to make device-tree binding changes
backwards-compatible with the older kernels, that is achieved by
changing the 'compatible' value of the DT node.

The code-refactoring patches got some more (minor) polish.

Added new patch "memory: tegra: Use of_device_get_match_data()".

v3: Memory Controller integration part has been reworked and now GART's
device-tree binding is changed. Adding Rob Herring for the device-tree
changes reviewing.

GART now disallows more than one active domain at a time.

Fixed "spinlock recursion", "NULL pointer dereference" and "detaching
of all devices from inactive domains".

New code-refactoring patches.

The previously standalone patch "memory: tegra: Don't invoke Tegra30+
specific memory timing setup on Tegra20" is now included into this
series because there is a dependency on that patch and it wasn't applied
yet.

v2: Addressed review comments from Robin Murphy to v1 by moving devices
iommu_fwspec check to gart_iommu_add_device().

Dropped the "Provide single domain and group for all devices" patch from
the series for now because after some more considering it became not
exactly apparent whether that is what we need, that was also suggested
by Robin Murphy in the review comment. Maybe something like a runtime
IOMMU usage for devices would be a better solution, allowing to implement
transparent context switching of virtual IOMMU domains.

Some very minor code cleanups, reworded commit messages.

Dmitry Osipenko (24):
  iommu/tegra: gart: Remove pr_fmt and clean up includes
  iommu/tegra: gart: Clean up driver probe errors handling
  iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
  iommu: Introduce iotlb_sync_map callback
  iommu/tegra: gart: Optimize mapping / unmapping performance
  dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc
  ARM: dts: tegra20: Update Memory Controller node to the new binding
  memory: tegra: Don't invoke Tegra30+ specific memory timing setup on
Tegra20
  memory: tegra: Adapt to Tegra20 device-tree binding changes
  memory: tegra: Read client ID on GART page fault
  memory: tegra: Use of_device_get_match_data()
  memory: tegra: Use relaxed versions of readl/writel
  iommu/tegra: gart: Integrate with Memory Controller driver
  memory: tegra: Do not try to probe SMMU on Tegra20
  memory: tegra: Do not ask for IRQ sharing
  memory: tegra: Clean up error messages
  iommu/tegra: gart: Fix spinlock recursion
  iommu/tegra: gart: Fix NULL pointer dereference
  iommu/tegra: gart: Allow only one active domain at a time
  iommu/tegra: gart: Don't use managed resources
  iommu/tegra: gart: Prepend error/debug messages with "gart:"
  iommu/tegra: gart: Don't detach devices from inactive domains
  iommu/tegra: gart: Simplify clients-tracking code
  iommu/tegra: gart: Perform code refactoring

 .../bindings/iommu/nvidia,tegra20-gart.txt|  14 -
 .../memory-controllers/nvidia,tegra20-mc.txt  |  27 +-
 arch/arm/boot/dts/tegra20.dtsi|  15 +-
 drivers/iommu/Kconfig |   1 +
 drivers/iommu/iommu.c |   8 +-
 drivers/iommu/tegra-gart.c| 473 +++---
 drivers/iommu/tegra-smmu.c|   4 -
 drivers/memory/tegra/mc.c | 118 +++--
 drivers/memory/tegra/mc.h |  10 +-
 include/linux/iommu.h |   1 +
 include/soc/tegra/mc.h|  27 +-
 11 files changed, 323 insertions(+), 375 deletions(-)
 delete mode 100644 
Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v7 01/24] iommu/tegra: gart: Remove pr_fmt and clean up includes

2018-12-12 Thread Dmitry Osipenko
Remove unneeded headers inclusion and sort the headers in alphabet order.
Remove pr_fmt macro since there is no pr_*() in the code and it doesn't
affect dev_*() functions.

Signed-off-by: Dmitry Osipenko 
Acked-by: Thierry Reding 
---
 drivers/iommu/tegra-gart.c | 17 +
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
index da6a4e357b2b..ff75cf60117b 100644
--- a/drivers/iommu/tegra-gart.c
+++ b/drivers/iommu/tegra-gart.c
@@ -19,22 +19,15 @@
  * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
  */
 
-#define pr_fmt(fmt)"%s(): " fmt, __func__
-
 #include 
+#include 
+#include 
+#include 
 #include 
-#include 
-#include 
+#include 
 #include 
+#include 
 #include 
-#include 
-#include 
-#include 
-#include 
-#include 
-#include 
-
-#include 
 
 /* bitmap of the page sizes currently supported */
 #define GART_IOMMU_PGSIZES (SZ_4K)
-- 
2.20.0

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [virtio-dev] Re: [PATCH v5 5/7] iommu: Add virtio-iommu driver

2018-12-12 Thread Auger Eric
Hi,

On 12/12/18 3:56 PM, Michael S. Tsirkin wrote:
> On Fri, Dec 07, 2018 at 06:52:31PM +, Jean-Philippe Brucker wrote:
>> Sorry for the delay, I wanted to do a little more performance analysis
>> before continuing.
>>
>> On 27/11/2018 18:10, Michael S. Tsirkin wrote:
>>> On Tue, Nov 27, 2018 at 05:55:20PM +, Jean-Philippe Brucker wrote:
>> +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) ||
>> +!virtio_has_feature(vdev, VIRTIO_IOMMU_F_MAP_UNMAP))
>
> Why bother with a feature bit for this then btw?

 We'll need a new feature bit for sharing page tables with the hardware,
 because they require different requests (attach_table/invalidate instead
 of map/unmap.) A future device supporting page table sharing won't
 necessarily need to support map/unmap.

>>> I don't see virtio iommu being extended to support ARM specific
>>> requests. This just won't scale, too many different
>>> descriptor formats out there.
>>
>> They aren't really ARM specific requests. The two new requests are
>> ATTACH_TABLE and INVALIDATE, which would be used by x86 IOMMUs as well.
>>
>> Sharing CPU address space with the HW IOMMU (SVM) has been in the scope
>> of virtio-iommu since the first RFC, and I've been working with that
>> extension in mind since the beginning. As an example you can have a look
>> at my current draft for this [1], which is inspired from the VFIO work
>> we've been doing with Intel.
>>
>> The negotiation phase inevitably requires vendor-specific fields in the
>> descriptors - host tells which formats are supported, guest chooses a
>> format and attaches page tables. But invalidation and fault reporting
>> descriptors are fairly generic.
> 
> We need to tread carefully here.  People expect it that if user does
> lspci and sees a virtio device then it's reasonably portable.
> 
>>> If you want to go that way down the road, you should avoid
>>> virtio iommu, instead emulate and share code with the ARM SMMU (probably
>>> with a different vendor id so you can implement the
>>> report on map for devices without PRI).
>>
>> vSMMU has to stay in userspace though. The main reason we're proposing
>> virtio-iommu is that emulating every possible vIOMMU model in the kernel
>> would be unmaintainable. With virtio-iommu we can process the fast path
>> in the host kernel, through vhost-iommu, and do the heavy lifting in
>> userspace.
> 
> Interesting.
> 
>> As said above, I'm trying to keep the fast path for
>> virtio-iommu generic.
>>
>> More notes on what I consider to be the fast path, and comparison with
>> vSMMU:
>>
>> (1) The primary use-case we have in mind for vIOMMU is something like
>> DPDK in the guest, assigning a hardware device to guest userspace. DPDK
>> maps a large amount of memory statically, to be used by a pass-through
>> device. For this case I don't think we care about vIOMMU performance.
>> Setup and teardown need to be reasonably fast, sure, but the MAP/UNMAP
>> requests don't have to be optimal.
>>
>>
>> (2) If the assigned device is owned by the guest kernel, then mappings
>> are dynamic and require dma_map/unmap() to be fast, but there generally
>> is no need for a vIOMMU, since device and drivers are trusted by the
>> guest kernel. Even when the user does enable a vIOMMU for this case
>> (allowing to over-commit guest memory, which needs to be pinned
>> otherwise),
> 
> BTW that's in theory in practice it doesn't really work.
> 
>> we generally play tricks like lazy TLBI (non-strict mode) to
>> make it faster.
> 
> Simple lazy TLB for guest/userspace drivers would be a big no no.
> You need something smarter.
> 
>> Here device and drivers are trusted, therefore the
>> vulnerability window of lazy mode isn't a concern.
>>
>> If the reason to enable the vIOMMU is over-comitting guest memory
>> however, you can't use nested translation because it requires pinning
>> the second-level tables. For this case performance matters a bit,
>> because your invalidate-on-map needs to be fast, even if you enable lazy
>> mode and only receive inval-on-unmap every 10ms. It won't ever be as
>> fast as nested translation, though. For this case I think vSMMU+Caching
>> Mode and userspace virtio-iommu with MAP/UNMAP would perform similarly
>> (given page-sized payloads), because the pagetable walk doesn't add a
>> lot of overhead compared to the context switch. But given the results
>> below, vhost-iommu would be faster than vSMMU+CM.
>>
>>
>> (3) Then there is SVM. For SVM, any destructive change to the process
>> address space requires a synchronous invalidation command to the
>> hardware (at least when using PCI ATS). Given that SVM is based on page
>> faults, fault reporting from host to guest also needs to be fast, as
>> well as fault response from guest to host.
>>
>> I think this is where performance matters the most. To get a feel of the
>> advantage we get with virtio-iommu, I compared the vSMMU page-table
>> sharing implementation [2] and 

Re: [virtio-dev] Re: [PATCH v5 5/7] iommu: Add virtio-iommu driver

2018-12-12 Thread Michael S. Tsirkin
On Fri, Dec 07, 2018 at 06:52:31PM +, Jean-Philippe Brucker wrote:
> Sorry for the delay, I wanted to do a little more performance analysis
> before continuing.
> 
> On 27/11/2018 18:10, Michael S. Tsirkin wrote:
> > On Tue, Nov 27, 2018 at 05:55:20PM +, Jean-Philippe Brucker wrote:
>  +if (!virtio_has_feature(vdev, VIRTIO_F_VERSION_1) ||
>  +!virtio_has_feature(vdev, VIRTIO_IOMMU_F_MAP_UNMAP))
> >>>
> >>> Why bother with a feature bit for this then btw?
> >>
> >> We'll need a new feature bit for sharing page tables with the hardware,
> >> because they require different requests (attach_table/invalidate instead
> >> of map/unmap.) A future device supporting page table sharing won't
> >> necessarily need to support map/unmap.
> >>
> > I don't see virtio iommu being extended to support ARM specific
> > requests. This just won't scale, too many different
> > descriptor formats out there.
> 
> They aren't really ARM specific requests. The two new requests are
> ATTACH_TABLE and INVALIDATE, which would be used by x86 IOMMUs as well.
> 
> Sharing CPU address space with the HW IOMMU (SVM) has been in the scope
> of virtio-iommu since the first RFC, and I've been working with that
> extension in mind since the beginning. As an example you can have a look
> at my current draft for this [1], which is inspired from the VFIO work
> we've been doing with Intel.
> 
> The negotiation phase inevitably requires vendor-specific fields in the
> descriptors - host tells which formats are supported, guest chooses a
> format and attaches page tables. But invalidation and fault reporting
> descriptors are fairly generic.

We need to tread carefully here.  People expect it that if user does
lspci and sees a virtio device then it's reasonably portable.

> > If you want to go that way down the road, you should avoid
> > virtio iommu, instead emulate and share code with the ARM SMMU (probably
> > with a different vendor id so you can implement the
> > report on map for devices without PRI).
> 
> vSMMU has to stay in userspace though. The main reason we're proposing
> virtio-iommu is that emulating every possible vIOMMU model in the kernel
> would be unmaintainable. With virtio-iommu we can process the fast path
> in the host kernel, through vhost-iommu, and do the heavy lifting in
> userspace.

Interesting.

> As said above, I'm trying to keep the fast path for
> virtio-iommu generic.
> 
> More notes on what I consider to be the fast path, and comparison with
> vSMMU:
> 
> (1) The primary use-case we have in mind for vIOMMU is something like
> DPDK in the guest, assigning a hardware device to guest userspace. DPDK
> maps a large amount of memory statically, to be used by a pass-through
> device. For this case I don't think we care about vIOMMU performance.
> Setup and teardown need to be reasonably fast, sure, but the MAP/UNMAP
> requests don't have to be optimal.
> 
> 
> (2) If the assigned device is owned by the guest kernel, then mappings
> are dynamic and require dma_map/unmap() to be fast, but there generally
> is no need for a vIOMMU, since device and drivers are trusted by the
> guest kernel. Even when the user does enable a vIOMMU for this case
> (allowing to over-commit guest memory, which needs to be pinned
> otherwise),

BTW that's in theory in practice it doesn't really work.

> we generally play tricks like lazy TLBI (non-strict mode) to
> make it faster.

Simple lazy TLB for guest/userspace drivers would be a big no no.
You need something smarter.

> Here device and drivers are trusted, therefore the
> vulnerability window of lazy mode isn't a concern.
> 
> If the reason to enable the vIOMMU is over-comitting guest memory
> however, you can't use nested translation because it requires pinning
> the second-level tables. For this case performance matters a bit,
> because your invalidate-on-map needs to be fast, even if you enable lazy
> mode and only receive inval-on-unmap every 10ms. It won't ever be as
> fast as nested translation, though. For this case I think vSMMU+Caching
> Mode and userspace virtio-iommu with MAP/UNMAP would perform similarly
> (given page-sized payloads), because the pagetable walk doesn't add a
> lot of overhead compared to the context switch. But given the results
> below, vhost-iommu would be faster than vSMMU+CM.
> 
> 
> (3) Then there is SVM. For SVM, any destructive change to the process
> address space requires a synchronous invalidation command to the
> hardware (at least when using PCI ATS). Given that SVM is based on page
> faults, fault reporting from host to guest also needs to be fast, as
> well as fault response from guest to host.
> 
> I think this is where performance matters the most. To get a feel of the
> advantage we get with virtio-iommu, I compared the vSMMU page-table
> sharing implementation [2] and vhost-iommu + VFIO with page table
> sharing (based on Tomasz Nowicki's vhost-iommu prototype). That's on a
> ThunderX2 with a 10Gb NIC 

Re: use generic DMA mapping code in powerpc V4

2018-12-12 Thread Christian Zigotzky
Hi Christoph,

Thanks a lot for your reply. I will test your patches tomorrow.

Cheers,
Christian

Sent from my iPhone

> On 12. Dec 2018, at 15:15, Christoph Hellwig  wrote:
> 
> Thanks for bisecting.  I've spent some time going over the conversion
> but can't really pinpoint it.  I have three little patches that switch
> parts of the code to the generic version.  This is on top of the
> last good commmit (977706f9755d2d697aa6f45b4f9f0e07516efeda).
> 
> Can you check with whіch one things stop working?
> 
> 
> <0001-get_required_mask.patch>
> <0002-swiotlb-dma_supported.patch>
> <0003-nommu-dma_supported.patch>
> <0004-alloc-free.patch>
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 12/34] powerpc/cell: move dma direct window setup out of dma_configure

2018-12-12 Thread Christoph Hellwig
On Sun, Dec 09, 2018 at 09:23:39PM +1100, Michael Ellerman wrote:
> Christoph Hellwig  writes:
> 
> > Configure the dma settings at device setup time, and stop playing games
> > with get_pci_dma_ops.  This prepares for using the common dma_configure
> > code later on.
> >
> > Signed-off-by: Christoph Hellwig 
> > ---
> >  arch/powerpc/platforms/cell/iommu.c | 20 +++-
> >  1 file changed, 11 insertions(+), 9 deletions(-)
> 
> This one's crashing, haven't dug into why yet:

Can you provide a gdb assembly of the exact crash site?  This looks
like for some odd reason the DT structures aren't fully setup by the
time we are probing the device, which seems odd.

Either way, something like the patch below would ensure we call
cell_iommu_get_fixed_address from a similar context as before, can you
check if that fixes the issue?

diff --git a/arch/powerpc/platforms/cell/iommu.c 
b/arch/powerpc/platforms/cell/iommu.c
index 93c7e4aef571..4891b338bf9f 100644
--- a/arch/powerpc/platforms/cell/iommu.c
+++ b/arch/powerpc/platforms/cell/iommu.c
@@ -569,19 +569,12 @@ static struct iommu_table *cell_get_iommu_table(struct 
device *dev)
return >table;
 }
 
-static u64 cell_iommu_get_fixed_address(struct device *dev);
-
 static void cell_dma_dev_setup(struct device *dev)
 {
-   if (cell_iommu_enabled) {
-   u64 addr = cell_iommu_get_fixed_address(dev);
-
-   if (addr != OF_BAD_ADDR)
-   set_dma_offset(dev, addr + dma_iommu_fixed_base);
+   if (cell_iommu_enabled)
set_iommu_table_base(dev, cell_get_iommu_table(dev));
-   } else {
+   else
set_dma_offset(dev, cell_dma_nommu_offset);
-   }
 }
 
 static void cell_pci_dma_dev_setup(struct pci_dev *dev)
@@ -865,8 +858,16 @@ static u64 cell_iommu_get_fixed_address(struct device *dev)
 
 static bool cell_pci_iommu_bypass_supported(struct pci_dev *pdev, u64 mask)
 {
-   return mask == DMA_BIT_MASK(64) &&
-   cell_iommu_get_fixed_address(>dev) != OF_BAD_ADDR;
+   if (mask == DMA_BIT_MASK(64)) {
+   u64 addr = cell_iommu_get_fixed_address(>dev);
+
+   if (addr != OF_BAD_ADDR) {
+   set_dma_offset(>dev, dma_iommu_fixed_base + addr);
+   return true;
+   }
+   }
+
+   return true;
 }
 
 static void insert_16M_pte(unsigned long addr, unsigned long *ptab,
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 0/7] Add virtio-iommu driver

2018-12-12 Thread Michael S. Tsirkin
On Wed, Dec 12, 2018 at 11:35:45AM +0100, Joerg Roedel wrote:
> Hi,
> 
> to make progress on this, we should first agree on the protocol used
> between guest and host. I have a few points to discuss on the protocol
> first.
> 
> On Tue, Dec 11, 2018 at 06:20:57PM +, Jean-Philippe Brucker wrote:
> > [1] Virtio-iommu specification v0.9, sources and pdf
> > git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
> > http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf
> 
> Looking at this I wonder why it doesn't make the IOTLB visible to the
> guest. the UNMAP requests seem to require that the TLB is already
> flushed to make the unmap visible.
> 
> I think that will cost significant performance for both, vfio and
> dma-iommu use-cases which both do (vfio at least to some degree),
> deferred flushing.
> 
> I also wonder whether the protocol should implement a
> protocol version handshake

virtio has a builtin version handshake so devices don't need to.

> and iommu-feature set queries.
> 
> > [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.1
> > git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9
> 
> Unfortunatly gitweb seems to be broken on linux-arm.org. What is missing
> in this patch-set to make this work on x86?

And I wonder about pcc too.

> Regards,
> 
>   Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: use generic DMA mapping code in powerpc V4

2018-12-12 Thread Christoph Hellwig
Thanks for bisecting.  I've spent some time going over the conversion
but can't really pinpoint it.  I have three little patches that switch
parts of the code to the generic version.  This is on top of the
last good commmit (977706f9755d2d697aa6f45b4f9f0e07516efeda).

Can you check with whіch one things stop working?


>From 83a4b87de6bc6a75b500c9959de88e2157fbcd7c Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Wed, 12 Dec 2018 15:07:49 +0100
Subject: get_required_mask

---
 arch/powerpc/kernel/dma-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/dma-iommu.c b/arch/powerpc/kernel/dma-iommu.c
index 5b15e53ee43d..2e682004959f 100644
--- a/arch/powerpc/kernel/dma-iommu.c
+++ b/arch/powerpc/kernel/dma-iommu.c
@@ -152,7 +152,7 @@ u64 dma_iommu_get_required_mask(struct device *dev)
 		return 0;
 
 	if (dev_is_pci(dev)) {
-		u64 bypass_mask = dma_nommu_get_required_mask(dev);
+		u64 bypass_mask = dma_direct_get_required_mask(dev);
 
 		if (dma_iommu_bypass_supported(dev, bypass_mask))
 			return bypass_mask;
-- 
2.19.2

>From c2579a3619575397929781a14895966cbc1d217b Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Wed, 12 Dec 2018 15:08:52 +0100
Subject: swiotlb dma_supported

---
 arch/powerpc/kernel/dma-swiotlb.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/dma-swiotlb.c b/arch/powerpc/kernel/dma-swiotlb.c
index aa11625c6691..52ee531c1a0d 100644
--- a/arch/powerpc/kernel/dma-swiotlb.c
+++ b/arch/powerpc/kernel/dma-swiotlb.c
@@ -36,7 +36,7 @@ const struct dma_map_ops powerpc_swiotlb_dma_ops = {
 	.free = __dma_nommu_free_coherent,
 	.map_sg = swiotlb_map_sg_attrs,
 	.unmap_sg = swiotlb_unmap_sg_attrs,
-	.dma_supported = swiotlb_dma_supported,
+	.dma_supported = dma_direct_supported,
 	.map_page = swiotlb_map_page,
 	.unmap_page = swiotlb_unmap_page,
 	.sync_single_for_cpu = swiotlb_sync_single_for_cpu,
-- 
2.19.2

>From 0105db9e6d8d031b4295116630fd0318fd146737 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Wed, 12 Dec 2018 15:10:36 +0100
Subject: nommu dma_supported

---
 arch/powerpc/kernel/dma.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index a6590aa77181..f53d11d35230 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -179,7 +179,7 @@ const struct dma_map_ops dma_nommu_ops = {
 	.alloc= __dma_nommu_alloc_coherent,
 	.free= __dma_nommu_free_coherent,
 	.map_sg= dma_nommu_map_sg,
-	.dma_supported			= dma_nommu_dma_supported,
+	.dma_supported			= dma_direct_supported,
 	.map_page			= dma_nommu_map_page,
 #ifdef CONFIG_NOT_COHERENT_CACHE
 	.sync_single_for_cpu 		= dma_nommu_sync_single,
-- 
2.19.2

>From 4c5dd4d4a4b4e63be722fd29ada896c5962072b8 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig 
Date: Wed, 12 Dec 2018 15:11:38 +0100
Subject: alloc/free

---
 arch/powerpc/kernel/dma.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/kernel/dma.c b/arch/powerpc/kernel/dma.c
index f53d11d35230..d3db6d879559 100644
--- a/arch/powerpc/kernel/dma.c
+++ b/arch/powerpc/kernel/dma.c
@@ -176,8 +176,13 @@ static inline void dma_nommu_sync_single(struct device *dev,
 #endif
 
 const struct dma_map_ops dma_nommu_ops = {
+#ifdef CONFIG_NOT_COHERENT_CACHE
 	.alloc= __dma_nommu_alloc_coherent,
 	.free= __dma_nommu_free_coherent,
+#else
+	.alloc= dma_direct_alloc,
+	.free= dma_direct_free,
+#endif
 	.map_sg= dma_nommu_map_sg,
 	.dma_supported			= dma_direct_supported,
 	.map_page			= dma_nommu_map_page,
-- 
2.19.2

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 13/21] iommu/tegra: gart: Integrate with Memory Controller driver

2018-12-12 Thread Dmitry Osipenko
On 12.12.2018 13:14, Thierry Reding wrote:
> On Sun, Dec 09, 2018 at 11:29:42PM +0300, Dmitry Osipenko wrote:
>> The device-tree binding has been changed. There is no separate GART device
>> anymore, it is squashed into the Memory Controller. Integrate GART module
>> with the MC in a way it is done for the SMMU of Tegra30+.
>>
>> Signed-off-by: Dmitry Osipenko 
>> ---
>>  drivers/iommu/Kconfig  |  1 +
>>  drivers/iommu/tegra-gart.c | 77 --
>>  drivers/memory/tegra/mc.c  | 41 
>>  include/soc/tegra/mc.h | 27 +
>>  4 files changed, 93 insertions(+), 53 deletions(-)
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index d9a25715650e..83c099bb7288 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -282,6 +282,7 @@ config ROCKCHIP_IOMMU
>>  config TEGRA_IOMMU_GART
>>  bool "Tegra GART IOMMU Support"
>>  depends on ARCH_TEGRA_2x_SOC
>> +depends on TEGRA_MC
>>  select IOMMU_API
>>  help
>>Enables support for remapping discontiguous physical memory
>> diff --git a/drivers/iommu/tegra-gart.c b/drivers/iommu/tegra-gart.c
>> index 835fea461c59..0a72b6afa842 100644
>> --- a/drivers/iommu/tegra-gart.c
>> +++ b/drivers/iommu/tegra-gart.c
>> @@ -19,16 +19,17 @@
>>   * 51 Franklin St - Fifth Floor, Boston, MA 02110-1301 USA.
>>   */
>>  
>> -#include 
>>  #include 
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>> +#include 
>>  #include 
>>  #include 
>>  #include 
>>  
>> +#include 
>> +
>>  /* bitmap of the page sizes currently supported */
>>  #define GART_IOMMU_PGSIZES  (SZ_4K)
>>  
>> @@ -397,9 +398,8 @@ static const struct iommu_ops gart_iommu_ops = {
>>  .iotlb_sync = gart_iommu_sync,
>>  };
>>  
>> -static int tegra_gart_suspend(struct device *dev)
>> +int tegra_gart_suspend(struct gart_device *gart)
>>  {
>> -struct gart_device *gart = dev_get_drvdata(dev);
>>  unsigned long iova;
>>  u32 *data = gart->savedata;
>>  unsigned long flags;
>> @@ -411,9 +411,8 @@ static int tegra_gart_suspend(struct device *dev)
>>  return 0;
>>  }
>>  
>> -static int tegra_gart_resume(struct device *dev)
>> +int tegra_gart_resume(struct gart_device *gart)
>>  {
>> -struct gart_device *gart = dev_get_drvdata(dev);
>>  unsigned long flags;
>>  
>>  spin_lock_irqsave(>pte_lock, flags);
>> @@ -422,41 +421,39 @@ static int tegra_gart_resume(struct device *dev)
>>  return 0;
>>  }
>>  
>> -static int tegra_gart_probe(struct platform_device *pdev)
>> +struct gart_device *tegra_gart_probe(struct device *dev,
>> + const struct tegra_smmu_soc *soc,
>> + struct tegra_mc *mc)
>>  {
>>  struct gart_device *gart;
>> -struct resource *res, *res_remap;
>> +struct resource *res_remap;
>>  void __iomem *gart_regs;
>> -struct device *dev = >dev;
>>  int ret;
>>  
>>  BUILD_BUG_ON(PAGE_SHIFT != GART_PAGE_SHIFT);
>>  
>> +/* Tegra30+ has an SMMU and no GART */
>> +if (soc)
>> +return NULL;
> 
> This looks weird. Why do we even call tegra_gart_probe() on anything but
> Tegra20?

Probably because I just wanted to replicate the weirdness of calling 
tegra_smmu_probe() on Tegra20.. for consistency.

>> +
>>  /* the GART memory aperture is required */
>> -res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
>> -res_remap = platform_get_resource(pdev, IORESOURCE_MEM, 1);
>> -if (!res || !res_remap) {
>> +res_remap = platform_get_resource(to_platform_device(dev),
>> +  IORESOURCE_MEM, 1);
>> +if (!res_remap) {
>>  dev_err(dev, "GART memory aperture expected\n");
>> -return -ENXIO;
>> +return ERR_PTR(-ENXIO);
>>  }
>>  
>>  gart = devm_kzalloc(dev, sizeof(*gart), GFP_KERNEL);
>>  if (!gart) {
>>  dev_err(dev, "failed to allocate gart_device\n");
>> -return -ENOMEM;
>> +return ERR_PTR(-ENOMEM);
>>  }
>>  
>> -gart_regs = devm_ioremap(dev, res->start, resource_size(res));
>> -if (!gart_regs) {
>> -dev_err(dev, "failed to remap GART registers\n");
>> -return -ENXIO;
>> -}
>> -
>> -ret = iommu_device_sysfs_add(>iommu, >dev, NULL,
>> - dev_name(>dev));
>> +ret = iommu_device_sysfs_add(>iommu, dev, NULL, "gart");
>>  if (ret) {
>>  dev_err(dev, "Failed to register IOMMU in sysfs\n");
>> -return ret;
>> +return ERR_PTR(ret);
>>  }
>>  
>>  iommu_device_set_ops(>iommu, _iommu_ops);
>> @@ -468,7 +465,8 @@ static int tegra_gart_probe(struct platform_device *pdev)
>>  goto remove_sysfs;
>>  }
>>  
>> -gart->dev = >dev;
>> +gart->dev = dev;
>> +gart_regs = mc->regs + GART_REG_BASE;
>>  spin_lock_init(>pte_lock);
>>  spin_lock_init(>client_lock);
>>  

Re: [PATCH 6/9] iommu/mediatek: Use helper functions to access dev->iommu_fwspec

2018-12-12 Thread Yong Wu
On Tue, 2018-12-11 at 13:19 +0100, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Use the new helpers dev_iommu_fwspec_get()/set() to access
> the dev->iommu_fwspec pointer. This makes it easier to move
> that pointer later into another struct.
> 
> Cc: Matthias Brugger 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/mtk_iommu.c| 21 -
>  drivers/iommu/mtk_iommu_v1.c | 28 
>  2 files changed, 28 insertions(+), 21 deletions(-)

Tested-by: Yong Wu 


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] iommu/io-pgtable-arm-v7s: Don't check PHYS_OFFSET if RAMDOMIZE_BASE is enabled

2018-12-12 Thread Robin Murphy

On 12/12/2018 13:02, Yong Wu wrote:

If CONFIG_RANDOMIZE_BASE is enabled, the "memstart_addr" will be updated
randomly, then the PHYS_OFFSET may be random.


Oh, I hadn't ever realised that, good catch. However, since 29859aeb8a6e 
I think we should probably just remove this check altogether.



Fixes: 82db33dc5e49 ("iommu/io-pgtable-arm: Check for v7s-incapable
systems")


Note that this alone wouldn't be sufficient for stable prior to 4.18, 
since CONFIG_RANDOMIZE_BASE would then allow the original crash to 
happen again.


Robin.


Reported-by: CK Hu 
Signed-off-by: Yong Wu 
---
  drivers/iommu/io-pgtable-arm-v7s.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bd..70941e6 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -709,7 +709,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
  {
struct arm_v7s_io_pgtable *data;
  
-#ifdef PHYS_OFFSET

+#if defined(PHYS_OFFSET) && !defined(CONFIG_RANDOMIZE_BASE)
if (upper_32_bits(PHYS_OFFSET))
return NULL;
  #endif


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu/io-pgtable-arm-v7s: Don't check PHYS_OFFSET if RAMDOMIZE_BASE is enabled

2018-12-12 Thread Yong Wu
If CONFIG_RANDOMIZE_BASE is enabled, the "memstart_addr" will be updated
randomly, then the PHYS_OFFSET may be random.

Fixes: 82db33dc5e49 ("iommu/io-pgtable-arm: Check for v7s-incapable
systems")
Reported-by: CK Hu 
Signed-off-by: Yong Wu 
---
 drivers/iommu/io-pgtable-arm-v7s.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c 
b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bd..70941e6 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -709,7 +709,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct 
io_pgtable_cfg *cfg,
 {
struct arm_v7s_io_pgtable *data;
 
-#ifdef PHYS_OFFSET
+#if defined(PHYS_OFFSET) && !defined(CONFIG_RANDOMIZE_BASE)
if (upper_32_bits(PHYS_OFFSET))
return NULL;
 #endif
-- 
1.9.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 00/21] IOMMU: Tegra GART driver clean up and optimization

2018-12-12 Thread Dmitry Osipenko
On 12.12.2018 13:43, Joerg Roedel wrote:
> Hi Thierry, Hi Dmitry,
> 
> On Wed, Dec 12, 2018 at 11:24:15AM +0100, Thierry Reding wrote:
>> So appart from the one issue in the "memory controller integration"
>> patch this looks good and I've acked the remaining patches. Once the one
>> remaining issue is fixed I think this is ready to be merged.
>>
>> Joerg, given the dependencies between the various parts of the series, I
>> think it may be better for you to merge everything through the IOMMU
>> tree if you don't mind.
> 
> Sounds good. Dmitry, can you please address Thierry's comments on patch
> 13, add his Acks on the other patches and re-send? I'll queue it in the
> iommu-tree then.

Sure, thanks to you both!
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/6] driver core: Introduce device_iommu_mapped() function

2018-12-12 Thread Joerg Roedel
On Wed, Dec 12, 2018 at 12:04:35PM +0100, Greg Kroah-Hartman wrote:
> On Tue, Dec 11, 2018 at 02:43:38PM +0100, Joerg Roedel wrote:
> > Cc: Greg Kroah-Hartman 
> > Acked-by: Greg Kroah-Hartman 
> 
> No need to have a cc: line if I have already acked it :)

Right, I'll remove it, sorry for the noise.

Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/6] driver core: Introduce device_iommu_mapped() function

2018-12-12 Thread Greg Kroah-Hartman
On Tue, Dec 11, 2018 at 02:43:38PM +0100, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Some places in the kernel check the iommu_group pointer in
> 'struct device' in order to find ot whether a device is
> mapped by an IOMMU.
> 
> This is not good way to make this check, as the pointer will
> be moved to 'struct dev_iommu_data'. This way to make the
> check is also not very readable.
> 
> Introduce an explicit function to perform this check.
> 
> Cc: Greg Kroah-Hartman 
> Acked-by: Greg Kroah-Hartman 

No need to have a cc: line if I have already acked it :)

thanks,

greg k-h
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 00/21] IOMMU: Tegra GART driver clean up and optimization

2018-12-12 Thread Joerg Roedel
Hi Thierry, Hi Dmitry,

On Wed, Dec 12, 2018 at 11:24:15AM +0100, Thierry Reding wrote:
> So appart from the one issue in the "memory controller integration"
> patch this looks good and I've acked the remaining patches. Once the one
> remaining issue is fixed I think this is ready to be merged.
> 
> Joerg, given the dependencies between the various parts of the series, I
> think it may be better for you to merge everything through the IOMMU
> tree if you don't mind.

Sounds good. Dmitry, can you please address Thierry's comments on patch
13, add his Acks on the other patches and re-send? I'll queue it in the
iommu-tree then.

Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 0/7] Add virtio-iommu driver

2018-12-12 Thread Joerg Roedel
Hi,

to make progress on this, we should first agree on the protocol used
between guest and host. I have a few points to discuss on the protocol
first.

On Tue, Dec 11, 2018 at 06:20:57PM +, Jean-Philippe Brucker wrote:
> [1] Virtio-iommu specification v0.9, sources and pdf
> git://linux-arm.org/virtio-iommu.git virtio-iommu/v0.9
> http://jpbrucker.net/virtio-iommu/spec/v0.9/virtio-iommu-v0.9.pdf

Looking at this I wonder why it doesn't make the IOTLB visible to the
guest. the UNMAP requests seem to require that the TLB is already
flushed to make the unmap visible.

I think that will cost significant performance for both, vfio and
dma-iommu use-cases which both do (vfio at least to some degree),
deferred flushing.

I also wonder whether the protocol should implement a
protocol version handshake and iommu-feature set queries.

> [3] git://linux-arm.org/linux-jpb.git virtio-iommu/v0.9.1
> git://linux-arm.org/kvmtool-jpb.git virtio-iommu/v0.9

Unfortunatly gitweb seems to be broken on linux-arm.org. What is missing
in this patch-set to make this work on x86?

Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 00/21] IOMMU: Tegra GART driver clean up and optimization

2018-12-12 Thread Thierry Reding
On Tue, Dec 11, 2018 at 10:53:17AM +0100, Joerg Roedel wrote:
> On Sun, Dec 09, 2018 at 11:29:29PM +0300, Dmitry Osipenko wrote:
> > Dmitry Osipenko (21):
> >   iommu/tegra: gart: Remove pr_fmt and clean up includes
> >   iommu/tegra: gart: Clean up driver probe errors handling
> >   iommu/tegra: gart: Ignore devices without IOMMU phandle in DT
> >   iommu: Introduce iotlb_sync_map callback
> >   iommu/tegra: gart: Optimize mapping / unmapping performance
> >   dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc
> >   ARM: dts: tegra20: Update Memory Controller node to the new binding
> >   memory: tegra: Don't invoke Tegra30+ specific memory timing setup on
> > Tegra20
> >   memory: tegra: Adapt to Tegra20 device-tree binding changes
> >   memory: tegra: Read client ID on GART page fault
> >   memory: tegra: Use of_device_get_match_data()
> >   memory: tegra: Use relaxed versions of readl/writel
> >   iommu/tegra: gart: Integrate with Memory Controller driver
> >   iommu/tegra: gart: Fix spinlock recursion
> >   iommu/tegra: gart: Fix NULL pointer dereference
> >   iommu/tegra: gart: Allow only one active domain at a time
> >   iommu/tegra: gart: Don't use managed resources
> >   iommu/tegra: gart: Prepend error/debug messages with "gart:"
> >   iommu/tegra: gart: Don't detach devices from inactive domains
> >   iommu/tegra: gart: Simplify clients-tracking code
> >   iommu/tegra: gart: Perform code refactoring
> 
> This is making progress, but some parts have no Ack or Review yet, I'll
> wait for Thierry before applying these patches into the iommu tree.

So appart from the one issue in the "memory controller integration"
patch this looks good and I've acked the remaining patches. Once the one
remaining issue is fixed I think this is ready to be merged.

Joerg, given the dependencies between the various parts of the series, I
think it may be better for you to merge everything through the IOMMU
tree if you don't mind.

Thierry


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 18/21] iommu/tegra: gart: Prepend error/debug messages with "gart:"

2018-12-12 Thread Thierry Reding
On Sun, Dec 09, 2018 at 11:29:47PM +0300, Dmitry Osipenko wrote:
> GART became a part of Memory Controller, hence now the drivers device
> is Memory Controller and not GART. As a result all printed messages are
> prepended with the "tegra-mc 7000f000.memory-controller:", so let's
> prepend GART's messages with "gart:" in order to differentiate them
> from the MC.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/iommu/tegra-gart.c | 2 ++
>  1 file changed, 2 insertions(+)

Acked-by: Thierry Reding 


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 21/21] iommu/tegra: gart: Perform code refactoring

2018-12-12 Thread Thierry Reding
On Sun, Dec 09, 2018 at 11:29:50PM +0300, Dmitry Osipenko wrote:
> Removed redundant safety-checks in the code and some debug code that
> isn't actually very useful for debugging, like enormous pagetable dump
> on each fault. The majority of the changes are code reshuffling,
> variables/whitespaces clean up and removal of debug messages that
> duplicate messages of the IOMMU-core.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/iommu/tegra-gart.c | 244 +++--
>  1 file changed, 96 insertions(+), 148 deletions(-)

This is a little over the top in some places, but there are enough good
changes to gloss that over:

Acked-by: Thierry Reding 


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 10/21] memory: tegra: Read client ID on GART page fault

2018-12-12 Thread Thierry Reding
On Sun, Dec 09, 2018 at 11:29:39PM +0300, Dmitry Osipenko wrote:
> With the device tree binding changes, now Memory Controller has access to
> GART registers. Hence it is now possible to read client ID on GART page
> fault to get information about what memory client causes the fault.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/memory/tegra/mc.c | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)

Acked-by: Thierry Reding 


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 11/21] memory: tegra: Use of_device_get_match_data()

2018-12-12 Thread Thierry Reding
On Sun, Dec 09, 2018 at 11:29:40PM +0300, Dmitry Osipenko wrote:
> There is no need to match device with the DT node since it was already
> matched, use of_device_get_match_data() helper to get the match-data.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/memory/tegra/mc.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)

Acked-by: Thierry Reding 


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-12 Thread Tian, Kevin
> From: 'j...@8bytes.org' [mailto:j...@8bytes.org]
> Sent: Wednesday, December 12, 2018 5:54 PM
> 
> Hi Kevin,
> 
> On Wed, Dec 12, 2018 at 09:31:27AM +, Tian, Kevin wrote:
> > > From: 'j...@8bytes.org'
> > > Sent: Monday, December 10, 2018 4:58 PM
> > > These represent whether the device together with the IOMMU support
> > > them,
> > > basically whether these features are usable via the IOMMU-API.
> >
> > "device together with IOMMU" or just "IOMMU itself"?
> 
> No, it should mean device together with IOMMU support it. It is a way
> for users of the IOMMU-API to check whether they can successfully use
> the aux-specific functions.
> 
> > however there is a problem with aux. A device may implement both
> > SR-IOV and Scalable IOV capabilities, but at any time only one of them
> > can be enabled. Driver will provide interfaces for end user to choose.
> > In such case we cannot assume that device-side Scalable-IOV can be
> > always enabled while IOMMU is in scalable mode.
> >
> > It works better if we position those features just representing IOMMU
> > support only. In that case, aux is related only to scalable mode of IOMMU
> > itself, which is orthogonal to whether device side supports SIOV.
> 
> Yeah, but we don't make that decision in the IOMMU code. Whether the
> device exposes SR-IOV or PASID based isolation is decided in PCI code
> based on user input (SR-IOV is also enabled in PCI code and IOMMU just
> uses the new devices that appear).
> 
> Only if the user enabled scalable mode on the device and the IOMMU
> supports it too, the feature-check function returns true for the aux
> feature.
> 

Then this is another proof for an enable/disable part in API. :-)

Thanks
kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 09/21] memory: tegra: Adapt to Tegra20 device-tree binding changes

2018-12-12 Thread Thierry Reding
On Sun, Dec 09, 2018 at 11:29:38PM +0300, Dmitry Osipenko wrote:
> The tegra20-mc device-tree binding has been changed, GART has been
> squashed into Memory Controller and now the clock property is mandatory
> for Tegra20, the DT compatible has been changed as well. Adapt driver to
> the DT changes.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  drivers/memory/tegra/mc.c | 21 -
>  drivers/memory/tegra/mc.h |  6 --
>  include/soc/tegra/mc.h|  2 +-
>  3 files changed, 9 insertions(+), 20 deletions(-)

Acked-by: Thierry Reding 


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 08/21] memory: tegra: Don't invoke Tegra30+ specific memory timing setup on Tegra20

2018-12-12 Thread Thierry Reding
On Sun, Dec 09, 2018 at 11:29:37PM +0300, Dmitry Osipenko wrote:
> This fixes irrelevant "tegra-mc 7000f000.memory-controller: no memory
> timings for RAM code 0 registered" warning message during of kernels
> boot-up on Tegra20.
> 
> Fixes: a8d502fd3348 ("memory: tegra: Squash tegra20-mc into common tegra-mc 
> driver")
> Signed-off-by: Dmitry Osipenko 
> Acked-by: Jon Hunter 
> ---
>  drivers/memory/tegra/mc.c | 11 ++-
>  1 file changed, 6 insertions(+), 5 deletions(-)

Acked-by: Thierry Reding 


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 07/21] ARM: dts: tegra20: Update Memory Controller node to the new binding

2018-12-12 Thread Thierry Reding
On Sun, Dec 09, 2018 at 11:29:36PM +0300, Dmitry Osipenko wrote:
> Device tree binding of Memory Controller has been changed: GART has been
> squashed into the MC, there are a new mandatory clock and #iommu-cells
> properties, the compatible has been changed to 'tegra20-mc-gart'.
> 
> Signed-off-by: Dmitry Osipenko 
> ---
>  arch/arm/boot/dts/tegra20.dtsi | 15 ++-
>  1 file changed, 6 insertions(+), 9 deletions(-)

Acked-by: Thierry Reding 


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v6 06/21] dt-bindings: memory: tegra: Squash tegra20-gart into tegra20-mc

2018-12-12 Thread Thierry Reding
On Sun, Dec 09, 2018 at 11:29:35PM +0300, Dmitry Osipenko wrote:
> Splitting GART and Memory Controller wasn't a good decision that was made
> back in the day. Given that the GART driver wasn't ever been used by
> anything in the kernel, we decided that it will be better to correct the
> mistakes of the past and merge two bindings into a single one. As a result
> there is a DT ABI change for the Memory Controller that allows not to
> break newer kernels using older DT and not to break older kernels using
> newer DT, that is done by changing the 'compatible' of the node to
> 'tegra20-mc-gart' and adding a new-required clock property. The new clock
> property also puts the tegra20-mc binding in line with the bindings of the
> later Tegra generations.
> 
> Signed-off-by: Dmitry Osipenko 
> Reviewed-by: Rob Herring 
> ---
>  .../bindings/iommu/nvidia,tegra20-gart.txt| 14 --
>  .../memory-controllers/nvidia,tegra20-mc.txt  | 27 +--
>  2 files changed, 19 insertions(+), 22 deletions(-)
>  delete mode 100644 
> Documentation/devicetree/bindings/iommu/nvidia,tegra20-gart.txt

Acked-by: Thierry Reding 


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 9/9] iommu/tegra: Use helper functions to access dev->iommu_fwspec

2018-12-12 Thread Thierry Reding
On Tue, Dec 11, 2018 at 01:19:10PM +0100, Joerg Roedel wrote:
> From: Joerg Roedel 
> 
> Use the new helpers dev_iommu_fwspec_get()/set() to access
> the dev->iommu_fwspec pointer. This makes it easier to move
> that pointer later into another struct.
> 
> Cc: Thierry Reding 
> Signed-off-by: Joerg Roedel 
> ---
>  drivers/iommu/tegra-smmu.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Acked-by: Thierry Reding 


signature.asc
Description: PGP signature
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v4 1/2] dt-bindings: arm-smmu: Add binding doc for Qcom smmu-500

2018-12-12 Thread Vivek Gautam
Hi Will,

On Fri, Oct 12, 2018 at 11:37 AM Vivek Gautam
 wrote:
>
>
>
> On 10/12/2018 3:46 AM, Rob Herring wrote:
> > On Thu, 11 Oct 2018 15:19:29 +0530, Vivek Gautam wrote:
> >> Qcom's implementation of arm,mmu-500 works well with current
> >> arm-smmu driver implementation. Adding a soc specific compatible
> >> along with arm,mmu-500 makes the bindings future safe.
> >>
> >> Signed-off-by: Vivek Gautam 
> >> ---
> >>
> >> Changes since v3:
> >>   - Refined language more to state things directly for the bindings
> >> description.
> >>
> >>   Documentation/devicetree/bindings/iommu/arm,smmu.txt | 4 
> >>   1 file changed, 4 insertions(+)
> >>
> > Reviewed-by: Rob Herring 
>
> Thank you Rob.
>

Can you please pick this one as well to your tree? This goes on top of the
bindings patch for "qcom,smmu-v2". So, it can't go through Andy's tree.
Will ask Andy to pick the second patch of the series, that adds the dt node.

I guess as I sent this one along with the dt patch, I would have
mistakenly added
you to 'cc' list rather than 'to' list.
Let me know if you would like me to resend it.

Thank
Vivek

-- 
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


RE: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-12 Thread Tian, Kevin
> From: 'j...@8bytes.org'
> Sent: Monday, December 10, 2018 4:58 PM
> 
> Hi Kevin,
> 
> On Mon, Dec 10, 2018 at 02:06:44AM +, Tian, Kevin wrote:
> > Can I interpret above as that you agree with the aux domain concept (i.e.
> one
> > device can be linked to multiple domains) in general, and now we're just
> trying
> > to address the remaining open in API level?
> 
> Yes, I thouht about alternatives, but in the end they are all harder to
> use than the aux-domain concept. We just need to make sure that we have
> a clear definition of what the API extension do and how they impact the
> behavior of existing functions.

sounds great!

> 
> > >   enum iommu_dev_features {
> > >   /* ... */
> > >   IOMMU_DEV_FEAT_AUX,
> > >   IOMMU_DEV_FEAT_SVA,
> > >   /* ... */
> > >   };
> > >
> >
> > Does above represent whether a device implements aux/sva features,
> > or whether a device has been enabled by driver to support aux/sva
> > features?
> 
> These represent whether the device together with the IOMMU support
> them,
> basically whether these features are usable via the IOMMU-API.

"device together with IOMMU" or just "IOMMU itself"? 

the former might be OK for sva. As Jean replied in another mail, enabling
it in both IOMMU and device side just consumes some resources, while 
not impacting other usages on that device.

however there is a problem with aux. A device may implement both
SR-IOV and Scalable IOV capabilities, but at any time only one of them
can be enabled. Driver will provide interfaces for end user to choose.
In such case we cannot assume that device-side Scalable-IOV can be
always enabled while IOMMU is in scalable mode.

It works better if we position those features just representing IOMMU 
support only. In that case, aux is related only to scalable mode of IOMMU
itself, which is orthogonal to whether device side supports SIOV.

> 
> 
> >
> > >   /* Check if a device supports a given feature of the IOMMU-API */
> > >   bool iommu_dev_has_feature(struct device *dev, enum
> > > iommu_dev_features *feat);
> >
> > If the latter we also need iommu_dev_set_feature so driver can poke
> > it based on its own configuration.
> 
> Do you mean we need an explict enable-API for the features? I thought
> about that too, but some features clearly don't need it and I didn't
> want to complicate the usage. My assumption was that when the feature is
> available, it is also enabled.
> 
> > >   /* So we need a iommu_aux_detach_all()? */
> >
> > for what scenario?
> 
> Maybe for detaching any PASID users from a device so that it can be
> assigned to a VM as a whole. But I am not sure it is needed.
> 

yes, possibly useful in reset path as Jean pointed out.

Thanks
Kevin
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-12 Thread 'j...@8bytes.org'
On Tue, Dec 11, 2018 at 01:35:23PM +, Jean-Philippe Brucker wrote:
> > /* So we need a iommu_aux_detach_all()? */
> 
> This could be useful for device drivers that want to do bulk cleanup on
> device removal. If they rely on Function Level Reset to disable PASID
> states for example, they could also cleanup with a detach_all(). But
> most will probably need to clean up individual PASID states (for example
> free all mdevs) and therefore don't need detach_all()

Yeah, so the more I think about it, the more dangerous it seems to have
this function. It creates subtle error cases for the users of SVA and
aux-domains because their mapping goes away without notice. It is
certainly better to force an orderly shutdown of all users before the
device can be re-assigned.

> > This concept can also be easily extended for supporting SVA in parallel
> > on the same device, with the same constraints regarding the behavior of
> > iommu_attach_device()/iommu_detach_device().
> 
> So this would be without the initial step of attaching an "ext" domain
> to the device in order to enable SVA/PASID?

No, I don't think anymore we should introduce a special domain type to
enable these features. Separate enable/disable functions per feature
seem to be a better choice.

> If I understood it correctly, I agree with the
> iommu_attach/detach_device() behavior for SVA as well. If a device is
> bound to an mm, then the device cannot be attached to another domain
> with iommu_attach_device(). This prevents leaks and forces the device
> driver to clean up PASID states when switching to a different driver
> (e.g. vfio-pci)

Right, the API should ensure we are safe on that side.

Thanks,

Joerg

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/6] Auxiliary IOMMU domains and Arm SMMUv3

2018-12-12 Thread 'j...@8bytes.org'
On Tue, Dec 11, 2018 at 06:34:23PM +, Jean-Philippe Brucker wrote:
> The cost of enabling those features for a device does seem negligible.
> For the SMMU we need to allocate about 70k of additional memory for the
> initial PASID table, but enabling the PASID cap shouldn't add any
> overhead otherwise. Same for PRI, shouldn't add any overhead.

Okay, the memory requirements are a pretty good case for an
enable/disable part in the API.

> As for ATS, it supposedly makes things faster, although it does mean
> more invalidation requests. There currently is a global pci=noats
> parameter, but maybe we need something with finer granularity to
> enable/disable it per device?

Yeah, I don't object against this, but I think this is a matter of PCI
code. The IOMMUs should just enable ATS when it is available. PCI core
can fail to enable it if requested by the user.

Regards,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [GIT PULL] iommu/arm-smmu: Updates for 4.21

2018-12-12 Thread Joerg Roedel
On Tue, Dec 11, 2018 at 08:08:48PM +, Will Deacon wrote:
> The following changes since commit 9ff01193a20d391e8dbce4403dd5ef87c7eaaca6:
> 
>   Linux 4.20-rc3 (2018-11-18 13:33:44 -0800)
> 
> are available in the git repository at:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git 
> for-joerg/arm-smmu/updates

Pulled, thanks Will.

Btw, there was a merge conflict with the patches to unmodularize the
IOMMU drivers in drivers/iommu/arm-smmu.c. I think I fixed it up, but it
would be good if you can check it later when I pushed it out.

Thanks,

Joerg
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC v2 14/20] iommu: introduce device fault data

2018-12-12 Thread Auger Eric
Hi Jacob,

On 9/21/18 12:06 AM, Jacob Pan wrote:
> On Tue, 18 Sep 2018 16:24:51 +0200
> Eric Auger  wrote:
> 
>> From: Jacob Pan 
>>
>> Device faults detected by IOMMU can be reported outside IOMMU
>> subsystem for further processing. This patch intends to provide
>> a generic device fault data such that device drivers can be
>> communicated with IOMMU faults without model specific knowledge.
>>
>> The proposed format is the result of discussion at:
>> https://lkml.org/lkml/2017/11/10/291
>> Part of the code is based on Jean-Philippe Brucker's patchset
>> (https://patchwork.kernel.org/patch/9989315/).
>>
>> The assumption is that model specific IOMMU driver can filter and
>> handle most of the internal faults if the cause is within IOMMU driver
>> control. Therefore, the fault reasons can be reported are grouped
>> and generalized based common specifications such as PCI ATS.
>>
>> Signed-off-by: Jacob Pan 
>> Signed-off-by: Jean-Philippe Brucker 
>> Signed-off-by: Liu, Yi L 
>> Signed-off-by: Ashok Raj 
>> Signed-off-by: Eric Auger 
>> [moved part of the iommu_fault_event struct in the uapi, enriched
>>  the fault reasons to be able to map unrecoverable SMMUv3 errors]
> Sounds good to me.
> There are also other "enrichment" we need to do to support mdev or
> finer granularity fault reporting below physical device. e.g. PASID
> level.
> 
> The current scheme works for PCIe physical device level, where each
> device registers a single handler only once. When device fault is
> detected by the IOMMU, it will find the matching handler and private
> data to report back. However, for devices partitioned by PASID and
> represented by mdev this may not work. Since IOMMU is not mdev aware
> and only works at physical device level.
> So I am thinking we should allow multiple registration of fault handler
> with different data and ID. i.e.
> 
> int iommu_register_device_fault_handler(struct device *dev,
>   iommu_dev_fault_handler_t handler,
>   int id,
>   void *data)
> 
> where the new "id field" is
>  * @id: Identification of the handler private data, will be used by fault
>  *  reporting code to match the handler data to be returned. For page
>  *  request, this can be the PASID. ID must be unique per device, i.e.
>  *  each ID can only be registered once per device.
>  *  - IOMMU_DEV_FAULT_ID_UNRECOVERY (~0U) is reserved for fault reporting
>  *  w/o ID. e.g. unrecoverable faults.
> 
> I am still testing, but just wanted to have feedback on this idea.

I am currently respinning this series. Do you have a respin for this
patch including iommu_register_device_fault_handler with the @id param
as you suggested above? Otherwise 2 solutions: I keep the code as is or
I do the modification myself implementing a list of fault_params?

Besides do you plans for "[PATCH v5 00/23] IOMMU and VT-d driver support
for Shared Virtual Address (SVA)" respin - hope I didn't miss anything? - ?

Thanks

Eric
> 
> Thanks,
> 
> Jacob
> 
> 
>> ---
>>  include/linux/iommu.h  | 55 -
>>  include/uapi/linux/iommu.h | 83
>> ++ 2 files changed, 136
>> insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/iommu.h b/include/linux/iommu.h
>> index 9bd3e63d562b..7529c14ff506 100644
>> --- a/include/linux/iommu.h
>> +++ b/include/linux/iommu.h
>> @@ -49,13 +49,17 @@ struct bus_type;
>>  struct device;
>>  struct iommu_domain;
>>  struct notifier_block;
>> +struct iommu_fault_event;
>>  
>>  /* iommu fault flags */
>> -#define IOMMU_FAULT_READ0x0
>> -#define IOMMU_FAULT_WRITE   0x1
>> +#define IOMMU_FAULT_READ(1 << 0)
>> +#define IOMMU_FAULT_WRITE   (1 << 1)
>> +#define IOMMU_FAULT_EXEC(1 << 2)
>> +#define IOMMU_FAULT_PRIV(1 << 3)
>>  
>>  typedef int (*iommu_fault_handler_t)(struct iommu_domain *,
>>  struct device *, unsigned long, int, void *);
>> +typedef int (*iommu_dev_fault_handler_t)(struct iommu_fault_event *,
>> void *); 
>>  struct iommu_domain_geometry {
>>  dma_addr_t aperture_start; /* First address that can be
>> mapped*/ @@ -262,6 +266,52 @@ struct iommu_device {
>>  struct device *dev;
>>  };
>>  
>> +/**
>> + * struct iommu_fault_event - Generic per device fault data
>> + *
>> + * - PCI and non-PCI devices
>> + * - Recoverable faults (e.g. page request), information based on
>> PCI ATS
>> + * and PASID spec.
>> + * - Un-recoverable faults of device interest
>> + * - DMA remapping and IRQ remapping faults
>> + *
>> + * @fault: fault descriptor
>> + * @device_private: if present, uniquely identify device-specific
>> + *  private data for an individual page request.
>> + * @iommu_private: used by the IOMMU driver for storing
>> fault-specific
>> + * data. Users should not modify this field before
>> + * sending