Re: [PATCH 3/3] Fix compile error in drivers/gpu/drm/msm/msm_drv.c with IOMMU disabled

2013-09-27 Thread Rob Clark
On Wed, Sep 25, 2013 at 10:49 AM, Joerg Roedel j...@8bytes.org wrote:
 The function msm_iommu_get_ctx() is needed by the MSM-GPU
 driver with and without IOMMU compiled in. Make the
 function available when no IOMMU driver is there.


For this one,

Reviewed-by: Rob Clark robdcl...@gmail.com

But I am not the right one to merge this one.  And, well, if there is
a way to make this work without msm_iommu_get_ctx(), I am interested
in some hints ;-)

Of the other two, 1/3 looks fine and I'll pull that in.  And I'll see
if I can come up with a better way for 2/3

BR,
-R

 Signed-off-by: Joerg Roedel j...@8bytes.org
 ---
  drivers/iommu/msm_iommu.h |7 +++
  1 file changed, 7 insertions(+)

 diff --git a/drivers/iommu/msm_iommu.h b/drivers/iommu/msm_iommu.h
 index 5c7c955..da53558 100644
 --- a/drivers/iommu/msm_iommu.h
 +++ b/drivers/iommu/msm_iommu.h
 @@ -108,7 +108,14 @@ struct msm_iommu_ctx_drvdata {
   * Useful for testing and drivers that do not yet fully have IOMMU stuff in
   * their platform devices.
   */
 +#ifdef CONFIG_MSM_IOMMU
  struct device *msm_iommu_get_ctx(const char *ctx_name);
 +#else
 +static inline struct device *msm_iommu_get_ctx(const char *ctx_name)
 +{
 +   return NULL;
 +}
 +#endif

  /*
   * Interrupt handler for the IOMMU context fault interrupt. Hooking the
 --
 1.7.9.5
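
For readers unfamiliar with the pattern in the patch above, here is a
minimal userspace sketch of a "static inline stub when the option is off"
header. It is only an illustration: DEMO_HAVE_IOMMU, demo_iommu_get_ctx(),
demo_attach() and the local struct device are hypothetical stand-ins, not
the kernel's definitions.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct device {
	const char *name;
};

/* DEMO_HAVE_IOMMU plays the role of CONFIG_MSM_IOMMU in this sketch. */
#ifdef DEMO_HAVE_IOMMU
/* Real lookup, provided by the IOMMU driver when it is built. */
struct device *demo_iommu_get_ctx(const char *ctx_name);
#else
/* No IOMMU driver: callers still compile and link, and always see NULL. */
static inline struct device *demo_iommu_get_ctx(const char *ctx_name)
{
	(void)ctx_name;
	return NULL;
}
#endif

/* A caller has to treat NULL as "no context available". */
static int demo_attach(const char *ctx_name)
{
	struct device *ctx = demo_iommu_get_ctx(ctx_name);

	return ctx ? 0 : -ENODEV;
}
```

The stub only moves the problem from link time to run time, so every
caller needs a sensible NULL path.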


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] drm/msm: Fix link error with !MSM_IOMMU

2014-01-08 Thread Rob Clark
On Wed, Jan 8, 2014 at 8:35 AM, Joerg Roedel j...@8bytes.org wrote:
 On Wed, Jan 08, 2014 at 08:23:49AM -0500, Rob Clark wrote:
 On Tue, Jan 7, 2014 at 5:53 PM, Joerg Roedel j...@8bytes.org wrote:
  On Tue, Jan 07, 2014 at 11:47:26PM +0100, Joerg Roedel wrote:
  The DRM driver for MSM depends on symbols from the MSM
  IOMMU driver. Add this dependency to the Kconfig file.
 
  Fixes this compile error:
 
Kernel: arch/arm/boot/zImage is ready
ERROR: "msm_iommu_get_ctx" [drivers/gpu/drm/msm/msm.ko] undefined!
make[2]: *** [__modpost] Error 1

 Thanks, I'll pull this into my pull req for 3.14 drm-next.. although,
 after chatting with Stephen Boyd, I think we should eventually be able
 to get rid of this (seems msm_iommu_get_ctx() was mainly for dealing
 with secure playback (?))

 It is already in my next-branch. So if you are ok with the approach for
 now there is no need for action. If you want it to go through your tree
 anyway please let me know and I will drop it from mine.

probably will be a slight merge conflict if it goes through your
branch.. although should be easy enough to resolve, basically the
result should look something like:

  depends on (ARCH_MSM && ARCH_MSM8960 && MSM_IOMMU) || (ARM && COMPILE_TEST)

BR,
-R


 Joerg




Re: [RFC/PATCH 2/7] iommu-api: Add map_range/unmap_range functions

2014-07-08 Thread Rob Clark
On Tue, Jul 8, 2014 at 5:53 PM, Olav Haugan ohau...@codeaurora.org wrote:
 Hi Hiroshi,

 On 7/3/2014 9:29 PM, Hiroshi Doyu wrote:
 Hi Olav,

 Olav Haugan ohau...@codeaurora.org writes:

 Mapping and unmapping are more often than not in the critical path.
 map_range and unmap_range allows SMMU driver implementations to optimize
 the process of mapping and unmapping buffers into the SMMU page tables.
 Instead of mapping one physical address, do TLB operation (expensive),
 mapping, do TLB operation, mapping, do TLB operation the driver can map
 a scatter-gatherlist of physically contiguous pages into one virtual
 address space and then at the end do one TLB operation.

 Additionally, the mapping operation would be faster in general since
 clients does not have to keep calling map API over and over again for
 each physically contiguous chunk of memory that needs to be mapped to a
 virtually contiguous region.

 Signed-off-by: Olav Haugan ohau...@codeaurora.org
 ---
  drivers/iommu/iommu.c | 24 
  include/linux/iommu.h | 24 
  2 files changed, 48 insertions(+)

 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
 index efc..f2a6b80 100644
 --- a/drivers/iommu/iommu.c
 +++ b/drivers/iommu/iommu.c
 @@ -898,6 +898,30 @@ size_t iommu_unmap(struct iommu_domain *domain, 
 unsigned long iova, size_t size)
  EXPORT_SYMBOL_GPL(iommu_unmap);


 +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
 +struct scatterlist *sg, unsigned int len, int prot)
 +{
 +if (unlikely(domain->ops->map_range == NULL))
 +return -ENODEV;
 +
 +BUG_ON(iova & (~PAGE_MASK));
 +
 +return domain->ops->map_range(domain, iova, sg, len, prot);
 +}
 +EXPORT_SYMBOL_GPL(iommu_map_range);

 We have the similar one internally, which is named, iommu_map_sg(),
 called from DMA API.

 Great, so this new API will be useful to more people!

 +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
 +  unsigned int len)
 +{
 +if (unlikely(domain->ops->unmap_range == NULL))
 +return -ENODEV;
 +
 +BUG_ON(iova & (~PAGE_MASK));
 +
 +return domain->ops->unmap_range(domain, iova, len);
 +}
 +EXPORT_SYMBOL_GPL(iommu_unmap_range);

 Can the existing iommu_unmap() do the same?

 I believe iommu_unmap() behaves a bit differently because it will keep
 on calling domain->ops->unmap() until everything is unmapped instead of
 letting the iommu implementation take care of unmapping everything in
 one call.

 I am abandoning the patch series since our driver was not accepted.
 However, if there are no objections I will resubmit this patch (PATCH
 2/7) as an independent patch to add this new map_range API.

+1 for map_range().. I've seen that, for gpu workloads at least, the
downstream map_range() API is quite beneficial.  It was worth at
least a few fps in xonotic.
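
To make the saving concrete, here is a small userspace model that counts
TLB operations for per-chunk mapping versus one range call. It is a
sketch under stated assumptions: struct demo_chunk, struct demo_iommu and
the demo_* helpers are invented for illustration and do not model any
real page-table format.

```c
#include <stddef.h>

/* Hypothetical stand-in for one physically contiguous scatterlist entry. */
struct demo_chunk {
	unsigned long phys;
	size_t len;
};

struct demo_iommu {
	unsigned int pte_writes;   /* page-table updates performed */
	unsigned int tlb_flushes;  /* TLB operations issued (the expensive part) */
};

static void demo_write_ptes(struct demo_iommu *mmu, unsigned long iova,
			    const struct demo_chunk *c)
{
	(void)iova;
	(void)c;
	mmu->pte_writes++;
}

static void demo_tlb_flush(struct demo_iommu *mmu)
{
	mmu->tlb_flushes++;
}

/* Per-chunk mapping: every call ends with its own TLB operation. */
static void demo_map_one(struct demo_iommu *mmu, unsigned long iova,
			 const struct demo_chunk *c)
{
	demo_write_ptes(mmu, iova, c);
	demo_tlb_flush(mmu);
}

/* Range mapping: write all page-table entries first, then flush once. */
static void demo_map_range(struct demo_iommu *mmu, unsigned long iova,
			   const struct demo_chunk *chunks, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		demo_write_ptes(mmu, iova, &chunks[i]);
		iova += chunks[i].len;
	}
	demo_tlb_flush(mmu);
}
```

With N chunks the per-chunk path issues N TLB operations and the range
path issues one, while the number of page-table writes is the same.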

And, possibly getting off the subject a bit, but I was wondering about
the possibility of going one step further and batching up mapping
and/or unmapping multiple buffers (ranges) at once.  I have a pretty
convenient sync point in drm/msm to flush out multiple mappings before
kicking gpu.

BR,
-R

 Thanks,

 Olav Haugan

 --
 The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation
 --
 To unsubscribe from this list: send the line unsubscribe linux-arm-msm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 2/7] iommu-api: Add map_range/unmap_range functions

2014-07-09 Thread Rob Clark
On Wed, Jul 9, 2014 at 8:03 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/8/2014 4:49 PM, Rob Clark wrote:
 On Tue, Jul 8, 2014 at 5:53 PM, Olav Haugan ohau...@codeaurora.org wrote:
 Hi Hiroshi,

 On 7/3/2014 9:29 PM, Hiroshi Doyu wrote:
 Hi Olav,

 Olav Haugan ohau...@codeaurora.org writes:

 Mapping and unmapping are more often than not in the critical path.
 map_range and unmap_range allows SMMU driver implementations to optimize
 the process of mapping and unmapping buffers into the SMMU page tables.
 Instead of mapping one physical address, do TLB operation (expensive),
 mapping, do TLB operation, mapping, do TLB operation the driver can map
 a scatter-gatherlist of physically contiguous pages into one virtual
 address space and then at the end do one TLB operation.

 Additionally, the mapping operation would be faster in general since
 clients does not have to keep calling map API over and over again for
 each physically contiguous chunk of memory that needs to be mapped to a
 virtually contiguous region.

 Signed-off-by: Olav Haugan ohau...@codeaurora.org
 ---
  drivers/iommu/iommu.c | 24 
  include/linux/iommu.h | 24 
  2 files changed, 48 insertions(+)

 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
 index efc..f2a6b80 100644
 --- a/drivers/iommu/iommu.c
 +++ b/drivers/iommu/iommu.c
 @@ -898,6 +898,30 @@ size_t iommu_unmap(struct iommu_domain *domain, 
 unsigned long iova, size_t size)
  EXPORT_SYMBOL_GPL(iommu_unmap);


 +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
 +struct scatterlist *sg, unsigned int len, int prot)
 +{
 +if (unlikely(domain->ops->map_range == NULL))
 +return -ENODEV;
 +
 +BUG_ON(iova & (~PAGE_MASK));
 +
 +return domain->ops->map_range(domain, iova, sg, len, prot);
 +}
 +EXPORT_SYMBOL_GPL(iommu_map_range);

 We have the similar one internally, which is named, iommu_map_sg(),
 called from DMA API.

 Great, so this new API will be useful to more people!

 +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
 +  unsigned int len)
 +{
 +if (unlikely(domain->ops->unmap_range == NULL))
 +return -ENODEV;
 +
 +BUG_ON(iova & (~PAGE_MASK));
 +
 +return domain->ops->unmap_range(domain, iova, len);
 +}
 +EXPORT_SYMBOL_GPL(iommu_unmap_range);

 Can the existing iommu_unmap() do the same?

 I believe iommu_unmap() behaves a bit differently because it will keep
 on calling domain->ops->unmap() until everything is unmapped instead of
 letting the iommu implementation take care of unmapping everything in
 one call.

 I am abandoning the patch series since our driver was not accepted.
 However, if there are no objections I will resubmit this patch (PATCH
 2/7) as an independent patch to add this new map_range API.

 +1 for map_range().. I've seen that, for gpu workloads at least, the
 downstream map_range() API is quite beneficial.  It was worth at
 least a few fps in xonotic.

 And, possibly getting off the subject a bit, but I was wondering about
 the possibility of going one step further and batching up mapping
 and/or unmapping multiple buffers (ranges) at once.  I have a pretty
 convenient sync point in drm/msm to flush out multiple mappings before
 kicking gpu.

 I think you should be able to do that with this API already - at least
 the mapping part since we are passing in a sg list (this could be a
 chained sglist).

What I mean by batching up is mapping and unmapping multiple sglists
each at different iova's with minimal cpu cache and iommu tlb flushes..

Ideally we'd let the IOMMU driver be clever and build out all 2nd
level tables before inserting into first level tables (to minimize cpu
cache flushing).. also, there is probably a reasonable chance that
we'd be mapping a new buffer into existing location, so there might be
some potential to reuse existing 2nd level tables (and save a tiny bit
of free/alloc).  I've not thought too much about how that would look
in code.. might be kinda, umm, fun..

But at an API level, we should be able to do a bunch of
map/unmap_range's with one flush.

Maybe it could look like a sequence of iommu_{map,unmap}_range()
followed by iommu_flush()?
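
A rough userspace sketch of what such a sequence could look like at the
API level. None of this exists in the IOMMU core as written here:
struct demo_domain, demo_map_range(), demo_unmap_range() and demo_flush()
are hypothetical names, and the page-table work is reduced to setting a
dirty flag.

```c
#include <stdbool.h>
#include <stddef.h>

struct demo_domain {
	bool tlb_dirty;            /* page tables changed since the last flush */
	unsigned int tlb_flushes;  /* TLB operations actually issued */
};

/* Update page tables only; leave the TLB alone and remember that it is stale. */
static int demo_map_range(struct demo_domain *d, unsigned long iova, size_t len)
{
	(void)iova;
	(void)len;
	d->tlb_dirty = true;
	return 0;
}

static int demo_unmap_range(struct demo_domain *d, unsigned long iova, size_t len)
{
	(void)iova;
	(void)len;
	d->tlb_dirty = true;
	return 0;
}

/* One TLB operation covers every change queued since the previous flush. */
static void demo_flush(struct demo_domain *d)
{
	if (!d->tlb_dirty)
		return;
	d->tlb_flushes++;
	d->tlb_dirty = false;
}
```

One caveat with any deferred scheme like this: until the flush has
happened, stale TLB entries may still translate an unmapped iova, so the
backing pages should not be freed or the iova reused before then.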

BR,
-R

 Thanks,

 Olav

 --
 The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation


Re: [RFC/PATCH 2/7] iommu-api: Add map_range/unmap_range functions

2014-07-10 Thread Rob Clark
On Thu, Jul 10, 2014 at 3:10 AM, Thierry Reding
thierry.red...@gmail.com wrote:
 On Wed, Jul 09, 2014 at 08:40:21PM -0400, Rob Clark wrote:
 On Wed, Jul 9, 2014 at 8:03 PM, Olav Haugan ohau...@codeaurora.org wrote:
  On 7/8/2014 4:49 PM, Rob Clark wrote:
  On Tue, Jul 8, 2014 at 5:53 PM, Olav Haugan ohau...@codeaurora.org wrote:
  Hi Hiroshi,
 
  On 7/3/2014 9:29 PM, Hiroshi Doyu wrote:
  Hi Olav,
 
  Olav Haugan ohau...@codeaurora.org writes:
 
  Mapping and unmapping are more often than not in the critical path.
  map_range and unmap_range allows SMMU driver implementations to optimize
  the process of mapping and unmapping buffers into the SMMU page tables.
  Instead of mapping one physical address, do TLB operation (expensive),
  mapping, do TLB operation, mapping, do TLB operation the driver can map
  a scatter-gatherlist of physically contiguous pages into one virtual
  address space and then at the end do one TLB operation.
 
  Additionally, the mapping operation would be faster in general since
  clients does not have to keep calling map API over and over again for
  each physically contiguous chunk of memory that needs to be mapped to a
  virtually contiguous region.
 
  Signed-off-by: Olav Haugan ohau...@codeaurora.org
  ---
   drivers/iommu/iommu.c | 24 
   include/linux/iommu.h | 24 
   2 files changed, 48 insertions(+)
 
  diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
  index efc..f2a6b80 100644
  --- a/drivers/iommu/iommu.c
  +++ b/drivers/iommu/iommu.c
  @@ -898,6 +898,30 @@ size_t iommu_unmap(struct iommu_domain *domain, 
  unsigned long iova, size_t size)
   EXPORT_SYMBOL_GPL(iommu_unmap);
 
 
  +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
  +struct scatterlist *sg, unsigned int len, int prot)
  +{
  +if (unlikely(domain->ops->map_range == NULL))
  +return -ENODEV;
  +
  +BUG_ON(iova & (~PAGE_MASK));
  +
  +return domain->ops->map_range(domain, iova, sg, len, prot);
  +}
  +EXPORT_SYMBOL_GPL(iommu_map_range);
 
  We have the similar one internally, which is named, iommu_map_sg(),
  called from DMA API.
 
  Great, so this new API will be useful to more people!
 
  +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
  +  unsigned int len)
  +{
  +if (unlikely(domain->ops->unmap_range == NULL))
  +return -ENODEV;
  +
  +BUG_ON(iova & (~PAGE_MASK));
  +
  +return domain->ops->unmap_range(domain, iova, len);
  +}
  +EXPORT_SYMBOL_GPL(iommu_unmap_range);
 
  Can the existing iommu_unmap() do the same?
 
  I believe iommu_unmap() behaves a bit differently because it will keep
  on calling domain->ops->unmap() until everything is unmapped instead of
  letting the iommu implementation take care of unmapping everything in
  one call.
 
  I am abandoning the patch series since our driver was not accepted.
  However, if there are no objections I will resubmit this patch (PATCH
  2/7) as an independent patch to add this new map_range API.
 
  +1 for map_range().. I've seen that, for gpu workloads at least, the
  downstream map_range() API is quite beneficial.  It was worth at
  least a few fps in xonotic.
 
  And, possibly getting off the subject a bit, but I was wondering about
  the possibility of going one step further and batching up mapping
  and/or unmapping multiple buffers (ranges) at once.  I have a pretty
  convenient sync point in drm/msm to flush out multiple mappings before
  kicking gpu.
 
  I think you should be able to do that with this API already - at least
  the mapping part since we are passing in a sg list (this could be a
  chained sglist).

 What I mean by batching up is mapping and unmapping multiple sglists
 each at different iova's with minimal cpu cache and iommu tlb flushes..

 Ideally we'd let the IOMMU driver be clever and build out all 2nd
 level tables before inserting into first level tables (to minimize cpu
 cache flushing).. also, there is probably a reasonable chance that
 we'd be mapping a new buffer into existing location, so there might be
 some potential to reuse existing 2nd level tables (and save a tiny bit
 of free/alloc).  I've not thought too much about how that would look
 in code.. might be kinda, umm, fun..

 But at an API level, we should be able to do a bunch of
 map/unmap_range's with one flush.

 Maybe it could look like a sequence of iommu_{map,unmap}_range()
 followed by iommu_flush()?

 Doesn't that mean that the IOMMU driver would have to keep track of all
 mappings until it sees an iommu_flush()? That sounds like it could be a
 lot of work and complicated code.

Well, depends on how elaborate you want to get.  If you don't want to
be too fancy, it may just be a matter of not doing TLB flush until
iommu_flush().  If you want to get fancy and minimize cpu flushes too,
then iommu driver would have to do some more tracking to build up a
transaction internally.  I'm

[RFC] iommu: qcom-iommu-v0 IOMMU

2014-07-10 Thread Rob Clark
 platform_driver qcom_iommu_driver = {
+   .driver = {
+   .name   = "qcom-iommu-v0",
+   .of_match_table = qcom_iommu_dt_match,
+   },
+   .probe  = qcom_iommu_probe,
+   .remove = qcom_iommu_remove,
+};
+
+static int __init get_tex_class(int icp, int ocp, int mt, int nos)
+{
+   int i = 0;
+   unsigned int prrr = 0;
+   unsigned int nmrr = 0;
+   int c_icp, c_ocp, c_mt, c_nos;
+
+   RCP15_PRRR(prrr);
+   RCP15_NMRR(nmrr);
+
+   for (i = 0; i < NUM_TEX_CLASS; i++) {
+   c_nos = PRRR_NOS(prrr, i);
+   c_mt = PRRR_MT(prrr, i);
+   c_icp = NMRR_ICP(nmrr, i);
+   c_ocp = NMRR_OCP(nmrr, i);
+
+   if (icp == c_icp && ocp == c_ocp && c_mt == mt && c_nos == nos)
+   return i;
+   }
+
+   return -ENODEV;
+}
+
+static int __init qcom_iommu_init(void)
+{
+   int ret;
+
+   ret = platform_driver_register(&qcom_iommu_driver);
+   if (ret) {
+   pr_err("Failed to register IOMMU driver\n");
+   goto error;
+   }
+
+   qcom_iommu_tex_class[QCOM_IOMMU_ATTR_NONCACHED] =
+   get_tex_class(CP_NONCACHED, CP_NONCACHED, MT_NORMAL, 1);
+
+   qcom_iommu_tex_class[QCOM_IOMMU_ATTR_CACHED_WB_WA] =
+   get_tex_class(CP_WB_WA, CP_WB_WA, MT_NORMAL, 1);
+
+   qcom_iommu_tex_class[QCOM_IOMMU_ATTR_CACHED_WB_NWA] =
+   get_tex_class(CP_WB_NWA, CP_WB_NWA, MT_NORMAL, 1);
+
+   qcom_iommu_tex_class[QCOM_IOMMU_ATTR_CACHED_WT] =
+   get_tex_class(CP_WT, CP_WT, MT_NORMAL, 1);
+
+   bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
+
+   return 0;
+
+error:
+   return ret;
+}
+
+static void __exit qcom_iommu_driver_exit(void)
+{
+   platform_driver_unregister(&qcom_iommu_driver);
+}
+
+subsys_initcall(qcom_iommu_init);
+module_exit(qcom_iommu_driver_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Stepan Moskovchenko <step...@codeaurora.org>");
+MODULE_AUTHOR("Rob Clark <robdcl...@gmail.com>");
diff --git a/drivers/iommu/qcom_iommu_v0.h b/drivers/iommu/qcom_iommu_v0.h
new file mode 100644
index 000..efe8535
--- /dev/null
+++ b/drivers/iommu/qcom_iommu_v0.h
@@ -0,0 +1,95 @@
+/* Copyright (c) 2010-2011, Code Aurora Forum. All rights reserved.
+ * Copyright (C) 2014 Red Hat
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301, USA.
+ */
+
+/* NOTE: originally based on msm_iommu non-DT driver for same hw
+ * but as the structure of the driver changes considerably for DT
+ * it seemed easier to not try to support old platforms with the
+ * same driver.
+ */
+
+#ifndef QCOM_IOMMU_V0_H
+#define QCOM_IOMMU_V0_H
+
+#include <linux/interrupt.h>
+#include <linux/clk.h>
+
+/* Sharability attributes of QCOM IOMMU mappings */
+#define QCOM_IOMMU_ATTR_NON_SH 0x0
+#define QCOM_IOMMU_ATTR_SH 0x4
+
+/* Cacheability attributes of QCOM IOMMU mappings */
+#define QCOM_IOMMU_ATTR_NONCACHED  0x0
+#define QCOM_IOMMU_ATTR_CACHED_WB_WA   0x1
+#define QCOM_IOMMU_ATTR_CACHED_WB_NWA  0x2
+#define QCOM_IOMMU_ATTR_CACHED_WT  0x3
+
+/* Mask for the cache policy attribute */
+#define QCOM_IOMMU_CP_MASK 0x03
+
+/* Maximum number of Machine IDs that we are allowing to be mapped to the same
+ * context bank. The number of MIDs mapped to the same CB does not affect
+ * performance, but there is a practical limit on how many distinct MIDs may
+ * be present. These mappings are typically determined at design time and are
+ * not expected to change at run time.
+ */
+#define MAX_NUM_MIDS   32
+
+/**
+ * struct qcom_iommu - a single IOMMU hardware instance
+ * @dev: IOMMU device
+ * @base: IOMMU config port base address (VA)
+ * @irq: Interrupt number
+ * @ncb: Number of context banks present on this IOMMU HW instance
+ * @ttbr_split: ttbr split
+ * @clk: The bus clock for this IOMMU hardware instance
+ * @pclk: The clock for the IOMMU bus interconnect
+ * @ctx_list: list of 'struct qcom_iommu_ctx'
+ * @dev_node: list head in qcom_iommu_devices list
+ * @dom_node: list head in domain
+ */
+struct qcom_iommu {
+   struct device *dev;
+   void __iomem *base;
+   int irq;
+   int ncb;
+   int ttbr_split;
+   struct clk *clk;
+   struct clk *pclk;
+   struct list_head ctx_list;
+   struct list_head dev_node

Re: [RFC/PATCH 2/7] iommu-api: Add map_range/unmap_range functions

2014-07-10 Thread Rob Clark
On Thu, Jul 10, 2014 at 6:43 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/9/2014 5:40 PM, Rob Clark wrote:
 On Wed, Jul 9, 2014 at 8:03 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/8/2014 4:49 PM, Rob Clark wrote:
 On Tue, Jul 8, 2014 at 5:53 PM, Olav Haugan ohau...@codeaurora.org wrote:
 Hi Hiroshi,

 On 7/3/2014 9:29 PM, Hiroshi Doyu wrote:
 Hi Olav,

 Olav Haugan ohau...@codeaurora.org writes:

 Mapping and unmapping are more often than not in the critical path.
 map_range and unmap_range allows SMMU driver implementations to optimize
 the process of mapping and unmapping buffers into the SMMU page tables.
 Instead of mapping one physical address, do TLB operation (expensive),
 mapping, do TLB operation, mapping, do TLB operation the driver can map
 a scatter-gatherlist of physically contiguous pages into one virtual
 address space and then at the end do one TLB operation.

 Additionally, the mapping operation would be faster in general since
 clients does not have to keep calling map API over and over again for
 each physically contiguous chunk of memory that needs to be mapped to a
 virtually contiguous region.

 Signed-off-by: Olav Haugan ohau...@codeaurora.org
 ---
  drivers/iommu/iommu.c | 24 
  include/linux/iommu.h | 24 
  2 files changed, 48 insertions(+)

 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
 index efc..f2a6b80 100644
 --- a/drivers/iommu/iommu.c
 +++ b/drivers/iommu/iommu.c
 @@ -898,6 +898,30 @@ size_t iommu_unmap(struct iommu_domain *domain, 
 unsigned long iova, size_t size)
  EXPORT_SYMBOL_GPL(iommu_unmap);


 +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
 +struct scatterlist *sg, unsigned int len, int prot)
 +{
 +if (unlikely(domain->ops->map_range == NULL))
 +return -ENODEV;
 +
 +BUG_ON(iova & (~PAGE_MASK));
 +
 +return domain->ops->map_range(domain, iova, sg, len, prot);
 +}
 +EXPORT_SYMBOL_GPL(iommu_map_range);

 We have the similar one internally, which is named, iommu_map_sg(),
 called from DMA API.

 Great, so this new API will be useful to more people!

 +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
 +  unsigned int len)
 +{
 +if (unlikely(domain->ops->unmap_range == NULL))
 +return -ENODEV;
 +
 +BUG_ON(iova & (~PAGE_MASK));
 +
 +return domain->ops->unmap_range(domain, iova, len);
 +}
 +EXPORT_SYMBOL_GPL(iommu_unmap_range);

 Can the existing iommu_unmap() do the same?

 I believe iommu_unmap() behaves a bit differently because it will keep
 on calling domain->ops->unmap() until everything is unmapped instead of
 letting the iommu implementation take care of unmapping everything in
 one call.

 I am abandoning the patch series since our driver was not accepted.
 However, if there are no objections I will resubmit this patch (PATCH
 2/7) as an independent patch to add this new map_range API.

 +1 for map_range().. I've seen that, for gpu workloads at least, the
 downstream map_range() API is quite beneficial.  It was worth at
 least a few fps in xonotic.

 And, possibly getting off the subject a bit, but I was wondering about
 the possibility of going one step further and batching up mapping
 and/or unmapping multiple buffers (ranges) at once.  I have a pretty
 convenient sync point in drm/msm to flush out multiple mappings before
 kicking gpu.

 I think you should be able to do that with this API already - at least
 the mapping part since we are passing in a sg list (this could be a
 chained sglist).

 What I mean by batching up is mapping and unmapping multiple sglists
 each at different iova's with minimal cpu cache and iommu tlb flushes..

 Ideally we'd let the IOMMU driver be clever and build out all 2nd
 level tables before inserting into first level tables (to minimize cpu
 cache flushing).. also, there is probably a reasonable chance that
 we'd be mapping a new buffer into existing location, so there might be
 some potential to reuse existing 2nd level tables (and save a tiny bit
 of free/alloc).  I've not thought too much about how that would look
 in code.. might be kinda, umm, fun..

 But at an API level, we should be able to do a bunch of
 map/unmap_range's with one flush.

 Maybe it could look like a sequence of iommu_{map,unmap}_range()
 followed by iommu_flush()?


 So we could add another argument (options) in the range api that
 allows you to indicate whether you want to invalidate TLB or not.

sounds reasonable.. I'm pretty sure we want explicit-flush to be an
opt-in behaviour.
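
As a sketch of the opt-in idea: the default call flushes immediately, and
only callers that pass a flag take on the duty of flushing later. The
flag name OPT_DEFER_FLUSH, struct opt_domain and both functions are
hypothetical and only model the control flow in userspace.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical option bit: caller promises to call opt_flush() itself. */
#define OPT_DEFER_FLUSH 0x1u

struct opt_domain {
	bool tlb_dirty;            /* page tables changed since the last flush */
	unsigned int tlb_flushes;  /* TLB operations actually issued */
};

static void opt_flush(struct opt_domain *d)
{
	if (!d->tlb_dirty)
		return;
	d->tlb_flushes++;
	d->tlb_dirty = false;
}

/* Default (flags == 0) keeps today's semantics: the mapping is live on return. */
static int opt_map_range(struct opt_domain *d, unsigned long iova, size_t len,
			 unsigned int flags)
{
	(void)iova;
	(void)len;
	d->tlb_dirty = true;
	if (!(flags & OPT_DEFER_FLUSH))
		opt_flush(d);
	return 0;
}
```

Making deferral opt-in means existing callers, which never pass the flag,
see no change in behaviour.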

BR,
-R

 Thanks,

 Olav

 --
 The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation


Re: [RFC] iommu: qcom-iommu-v0 IOMMU

2014-07-10 Thread Rob Clark
On Thu, Jul 10, 2014 at 5:53 PM, Stephen Boyd sb...@codeaurora.org wrote:
 On 07/10, Rob Clark wrote:
 So, in its current form, this is superficially a copy of msm_iommu
 plus DT conversion.  But the pre-DT IOMMU driver had fairly different
 structure.. ie. pseudo root device, with IOMMU devices hanging off
 that, and context devices hanging off that.  The context devices were
 what the client device would attach (which was also somewhat non-
 standard.. see msm_iommu_get_ctx()).

 I couldn't really think of some sane way to refactor this and add DT
 while at the same time keeping compatibility with the old pre-DT msm
 stuff.  So I copied to a new driver.

 It was pointed out that nothing upstream actually *used* the msm_iommu
 driver.  So if no one objects to dropping pre-DT support, then I could
 do some patch rejuggling + sed to make this replace the old driver
 instead.

 +1

 +
 +// TODO any good reason for global lock vs per-iommu lock?
 +DEFINE_SPINLOCK(qcom_iommu_lock);

 static?

oh, yes.. ofc

I do wonder if we might want to make locking a bit more fine grained
to reduce contention (but, otoh, the gpu driver isn't going to contend
with itself, and other drivers probably aren't taxing the iommu quite
so hard).  But I guess it would be ok to leave that as a future
optimization.


 +static LIST_HEAD(qcom_iommu_devices);
 +
 +/* Note that a single iommu_domain, for devices sitting behind
 + * more than one IOMMU (ie. one per AXI interface), will have more
 + * than one iommu in the iommu_list.  But all are programmed to
 + * point at the same pagetables so from client device perspective
 + * they act as a single IOMMU.
 + */
 +struct qcom_domain_priv {
 + unsigned long *pgtable;
 + struct list_head iommu_list;  /* list of attached 'struct qcom_iommu' */
 +};
 +
 +static int __enable_clocks(struct qcom_iommu *iommu)
 +{
 + int ret;
 +
 + ret = clk_prepare_enable(iommu->pclk);

 Looks like __enable_clocks() is called from within a spinlock
 sometimes. Please move the prepare call outside of any atomic
 sections and also enable CONFIG_PROVE_LOCKING and/or
 CONFIG_DEBUG_ATOMIC_SLEEP to find such problems.

good point.  I haven't gotten around to running this against a debug
kernel.  It is only at the 'just barely works now' stage, I just
wanted to send for early feedback on the approach.  (Mainly to make
sure no one screams bloody murder about dropping pre-DT support from
old driver.)
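
For reference, a userspace mock of the rule Stephen is pointing at: in the
clk API, clk_prepare() may sleep while clk_enable() may be called from
atomic context, so the prepare step belongs outside the spinlock. All
mock_* names and the counters below are made up for this sketch; it only
models the call ordering, not real clocks or locks.

```c
#include <stdbool.h>

static bool in_atomic;        /* true while the mock spinlock is held */
static int sleeps_in_atomic;  /* "might sleep" calls made under the lock */

static void mock_might_sleep(void)
{
	if (in_atomic)
		sleeps_in_atomic++;
}

/* Mock clock calls: prepare/unprepare may sleep, enable/disable may not. */
static int mock_clk_prepare(void) { mock_might_sleep(); return 0; }
static void mock_clk_unprepare(void) { mock_might_sleep(); }
static int mock_clk_enable(void) { return 0; }
static void mock_clk_disable(void) { }

static void mock_lock(void) { in_atomic = true; }
static void mock_unlock(void) { in_atomic = false; }

/* Shape being criticised: prepare and unprepare both run under the lock. */
static int run_prepare_under_lock(void)
{
	sleeps_in_atomic = 0;
	mock_lock();
	(void)mock_clk_prepare();
	(void)mock_clk_enable();
	mock_clk_disable();
	mock_clk_unprepare();
	mock_unlock();
	return sleeps_in_atomic;
}

/* Suggested shape: prepare up front, only enable/disable inside the lock. */
static int run_prepare_outside_lock(void)
{
	sleeps_in_atomic = 0;
	(void)mock_clk_prepare();
	mock_lock();
	(void)mock_clk_enable();
	mock_clk_disable();
	mock_unlock();
	mock_clk_unprepare();
	return sleeps_in_atomic;
}
```

In a real driver this usually means calling clk_prepare() once at probe
or attach time and using plain clk_enable()/clk_disable() in the locked
paths.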

BR,
-R

 + if (ret)
 + goto fail;
 +
 + if (iommu->clk) {
 + ret = clk_prepare_enable(iommu->clk);
 + if (ret)
 + clk_disable_unprepare(iommu->pclk);
 + }
 +fail:
 + return ret;
 +}
 +

 --
 Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-11 Thread Rob Clark
On Fri, Jul 4, 2014 at 11:29 AM, Thierry Reding
thierry.red...@gmail.com wrote:
 From: Thierry Reding tred...@nvidia.com

 This commit introduces a generic device tree binding for IOMMU devices.
 Only a very minimal subset is described here, but it is enough to cover
 the requirements of both the Exynos System MMU and Tegra SMMU as
 discussed here:

 https://lkml.org/lkml/2014/4/27/346

 Signed-off-by: Thierry Reding tred...@nvidia.com
 ---
 Changes in v4:
 - clarify that disabling an IOMMU DT node may not disable translation
 - be more explicit that examples are only examples
 - add multi-ID master example

 Changes in v3:
 - use #iommu-cells instead of #address-cells/#size-cells
 - drop optional iommu-names property

 Changes in v2:
 - add notes about dma-ranges property (drop note from commit message)
 - document priorities of iommus property vs. dma-ranges property
 - drop #iommu-cells in favour of #address-cells and #size-cells
 - remove multiple-master device example

  Documentation/devicetree/bindings/iommu/iommu.txt | 172 ++
  1 file changed, 172 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt

 diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
 new file mode 100644
 index ..464a81eaaf61
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/iommu/iommu.txt
 @@ -0,0 +1,172 @@
 +This document describes the generic device tree binding for IOMMUs and their
 +master(s).
 +
 +
 +IOMMU device node:
 +==================
 +
 +An IOMMU can provide the following services:
 +
 +* Remap address space to allow devices to access physical memory ranges that
 +  they otherwise wouldn't be capable of accessing.
 +
 +  Example: 32-bit DMA to 64-bit physical addresses
 +
 +* Implement scatter-gather at page level granularity so that the device does
 +  not have to.
 +
 +* Provide system protection against rogue DMA by forcing all accesses to go
 +  through the IOMMU and faulting when encountering accesses to unmapped
 +  address regions.
 +
 +* Provide address space isolation between multiple contexts.
 +
 +  Example: Virtualization
 +
 +Device nodes compatible with this binding represent hardware with some of the
 +above capabilities.
 +
 +IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
 +typically have a fixed association to the master device, whereas multiple-
 +master IOMMU devices can translate accesses from more than one master.
 +
 +The device tree node of the IOMMU device's parent bus must contain a valid
 +dma-ranges property that describes how the physical address space of the
 +IOMMU maps to memory. An empty dma-ranges property means that there is a
 +1:1 mapping from IOMMU to memory.
 +
 +Required properties:
 +
 +- #iommu-cells: The number of cells in an IOMMU specifier needed to encode an
 +  address.
 +
 +The meaning of the IOMMU specifier is defined by the device tree binding of
 +the specific IOMMU. Below are a few examples of typical use-cases:
 +
 +- #iommu-cells = 0: Single master IOMMU devices are not configurable and
 +  therefore no additional information needs to be encoded in the specifier.
 +  This may also apply to multiple master IOMMU devices that do not allow the
 +  association of masters to be configured. Note that an IOMMU can by design
 +  be multi-master yet only expose a single master in a given configuration.
 +  In such cases the number of cells will usually be 1 as in the next case.
 +- #iommu-cells = 1: Multiple master IOMMU devices may need to be configured
 +  in order to enable translation for a given master. In such cases the single
 +  address cell corresponds to the master device's ID. In some cases more than
 +  one cell can be required to represent a single master ID.
 +- #iommu-cells = 4: Some IOMMU devices allow the DMA window for masters to
 +  be configured. The first cell of the address in this may contain the master
 +  device's ID for example, while the second cell could contain the start of
 +  the DMA window for the given device. The length of the DMA window is given
 +  by the third and fourth cells.
 +
 +Note that these are merely examples and real-world use-cases may use different
 +definitions to represent their individual needs. Always refer to the specific
 +IOMMU binding for the exact meaning of the cells that make up the specifier.
 +
 +
 +IOMMU master node:
 +==================
 +
 +Devices that access memory through an IOMMU are called masters. A device can
 +have multiple master interfaces (to one or more IOMMU devices).
 +
 +Required properties:
 +
 +- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
 +  master interfaces of the device. One entry in the list describes one master
 +  interface of the device.
 +
 +When an iommus property is specified in a device tree node, the IOMMU will
 +be used for address 

Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-12 Thread Rob Clark
On Sat, Jul 12, 2014 at 5:39 AM, Will Deacon will.dea...@arm.com wrote:
 Hi Rob,

 On Fri, Jul 11, 2014 at 09:55:14PM +0100, Rob Clark wrote:
 On Fri, Jul 4, 2014 at 11:29 AM, Thierry Reding
 thierry.red...@gmail.com wrote:
  From: Thierry Reding tred...@nvidia.com
 ok, so I was working through this to try to convert my
 {qcom,msm}-iommu-v0 RFC over to using these bindings.  For background,
 I was initially using something that looked a bit more like the
 current arm-smmu bindings:

 gpu {
     #stream-id-cells = <16>;
     ...
 };

 gfx3d: qcom,iommu@7c0 {
     compatible = "qcom,iommu-v0";
     ...
     mmu-masters =
         /* gfx3d_user: */
         <&gpu 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15>,
         /* gfx3d_priv: */
         <&gpu 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31>;
 };

 gfx3d1: qcom,iommu@7d0 {
     compatible = "qcom,iommu-v0";
     ...
     mmu-masters =
         /* gfx3d_user: */
         <&gpu 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15>,
         /* gfx3d_priv: */
         <&gpu 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31>;
 };

 With my current arrangement, I have everything I need when the iommu
 device is probed to set up each of the context banks

 This proposal inverts that relationship.  Which forces me to do a lot
 more (including DT parsing) on device attach.  Which I'm not a huge
 fan of.  Ie. if I even wanted to try to implement per-process
 pagetables for gpu without completely going behind the IOMMU API's
 back, I would want attach/detach to be as lightweight as possible.

 I think we'd have to walk the entire device-tree at IOMMU probe time in
 order to enumerate the masters, which sounds horrible... If we want to do
 clever SMR allocation on the ARM SMMU (as I've been discussing with Olav),
 we're going to need knowledge about *all* the valid Stream IDs in the system
 before we can program *any* translations.

I guess walking the whole tree would work.. but it doesn't really
sound like a nice solution ;-)

also, I'm not really sure to what extent it is bad form to assume the
tree is static.

 Was there actually a good reason for having the device link to the
 iommu rather than the other way around?  How much would people hate it
 if I just ignore the generic bindings and use something that works for
 me instead.  I mean, it isn't exactly like there is going to be .dts
 re-use across different SoC's..  and at least with current IOMMU API
 some sort of of_get_named_iommu() API doesn't really make sense.

 The thing is, if you end up ignoring the generic binding then we have two
 IOMMUs using the same (ARM SMMU) binding and it begs the question as to
 which is the more generic! I know we're keen to get this merged, but merging
 something that people won't use and calling it generic doesn't seem ideal
 either. We do, however, desperately need a generic binding.

yeah, ignoring the generic binding is not my first choice.  I'd rather
have something that works well for everyone.  But I wasn't really sure
if the current proposal was arbitrary, or if there are some
conflicting requirements between different platforms.

 Turning the question around; Thierry -- what are the issues in using
 something like the ARM SMMU binding (master device IDs held in the IOMMU
 node) for the nvidia IOMMU?

+1 for doing it more like how arm-smmu is currently.. that works much
better for me :-)

BR,
-R

 Will


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-12 Thread Rob Clark
On Sat, Jul 12, 2014 at 8:22 AM, Arnd Bergmann a...@arndb.de wrote:
 On Saturday 12 July 2014, Rob Clark wrote:
  Was there actually a good reason for having the device link to the
  iommu rather than the other way around?  How much would people hate it
  if I just ignore the generic bindings and use something that works for
  me instead.  I mean, it isn't exactly like there is going to be .dts
  re-use across different SoC's..  and at least with current IOMMU API
  some sort of of_get_named_iommu() API doesn't really make sense.
 
  The thing is, if you end up ignoring the generic binding then we have two
  IOMMUs using the same (ARM SMMU) binding and it begs the question as to
  which is the more generic! I know we're keen to get this merged, but merging
  something that people won't use and calling it generic doesn't seem ideal
  either. We do, however, desperately need a generic binding.

 yeah, ignoring the generic binding is not my first choice.  I'd rather
 have something that works well for everyone.  But I wasn't really sure
 if the current proposal was arbitrary, or if there are some
 conflicting requirements between different platforms.

 The common case that needs to be simple is attaching one (master) device
 to an IOMMU using the shared global context for the purposes of implementing
 the dma-mapping API.

well, I don't disagree that IOMMU API has some problems.  It is too
tied to the bus type, which doesn't really seem to make sense for
platform devices.  (Unless we start having multiple platform busses?)

But at least given the current IOMMU API I'm not really sure how it
makes a difference which way the link goes.  But if there has already
been some discussion about how you want to handle the tie in with
dma-mapping, if you could point me at that then maybe your point will
make more sense to me.

 The way that Thierry's binding does that is the obvious solution to this,
 and it mirrors what we do in practically every other subsystem. I definitely
 want the SMMU to change before anybody starts using it in a real system,
 which we fortunately do not have yet.

hmm, well if some of the things I need (like this or batching
mappings) are too weird and gpu specific, I'm willing to duplicate the
IOMMU driver in drm/msm.  It really isn't so much code, and that gives
me a lot more flexibility to do crazy things... at some point I'm
probably going to want to do context switches by banging the IOMMU
registers directly from the gpu.

But given what Will said, I don't think what I need here is too far
out of line.  But if it is really a problem for dma-mapping, I suppose
we could have links in both directions?  However, I think the link
which contains the stream-id's really needs to be in the IOMMU, not
the device using the IOMMU.

BR,
-R


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-13 Thread Rob Clark
On Sun, Jul 13, 2014 at 5:43 AM, Will Deacon will.dea...@arm.com wrote:
 On Sat, Jul 12, 2014 at 01:57:31PM +0100, Rob Clark wrote:
 On Sat, Jul 12, 2014 at 8:22 AM, Arnd Bergmann a...@arndb.de wrote:
  On Saturday 12 July 2014, Rob Clark wrote:
   Was there actually a good reason for having the device link to the
   iommu rather than the other way around?  How much would people hate it
   if I just ignore the generic bindings and use something that works for
   me instead.  I mean, it isn't exactly like there is going to be .dts
   re-use across different SoC's..  and at least with current IOMMU API
   some sort of of_get_named_iommu() API doesn't really make sense.
  
   The thing is, if you end up ignoring the generic binding then we have two
   IOMMUs using the same (ARM SMMU) binding and it begs the question as to
   which is the more generic! I know we're keen to get this merged, but merging
   something that people won't use and calling it generic doesn't seem ideal
   either. We do, however, desperately need a generic binding.
 
  yeah, ignoring the generic binding is not my first choice.  I'd rather
  have something that works well for everyone.  But I wasn't really sure
  if the current proposal was arbitrary, or if there are some
  conflicting requirements between different platforms.
 
  The common case that needs to be simple is attaching one (master) device
  to an IOMMU using the shared global context for the purposes of implementing
  the dma-mapping API.

 well, I don't disagree that IOMMU API has some problems.  It is too
 tied to the bus type, which doesn't really seem to make sense for
 platform devices.  (Unless we start having multiple platform busses?)

 But at least given the current IOMMU API I'm not really sure how it
 makes a difference which way the link goes.  But if there has already
 been some discussion about how you want to handle the tie in with
 dma-mapping, if you could point me at that then maybe your point will
 make more sense to me.

 If you look at the proposed binding in isolation, I think it *is* cleaner
 than the ARM SMMU binding (I've acked it...) and I believe it's more
 consistent with the way we describe linkages elsewhere.

 However, the issue you're raising is that it's more difficult to make use of
 in a Linux IOMMU driver. The reward you'll get for using it will come
 eventually when the DMA ops are automatically swizzled for devices using the
 generic binding.

 My plan for the ARM SMMU driver is:

   (1) Change ->probe() to walk the device-tree looking for all masters with
   phandles back to the SMMU instance being probed

   (2) For each master, extract the Stream IDs and add them to the internal
   SMMU driver data structures (an rbtree per SMMU instance). For
   hotpluggable buses, we'll need a way for the bus controller to
   reserve a range of IDs -- this will likely be a later extension to
   the binding.

   (3) When we get an ->add() call, warn if it's a device we haven't seen
   and reject the addition.

 That way, ->attach() should be the same as it is now, I think.

 Have you tried implementing something like that? We agreed that (1) isn't
 pretty, but I don't have a good alternative and it's only done at
 probe-time.

I haven't tried implementing that yet, but I'm sure it would work.  I
was just hoping to avoid having to do that ;-)

I suppose perhaps there is room for a shared helper here, to at least
avoid duplicating that in each IOMMU driver which needs the
stream-id's up front.
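
To make the shared-helper idea a bit more concrete, here is a rough
userspace sketch of the walk-once-at-probe approach.  Everything in it
(struct master_node, collect_stream_ids(), the flat node array) is
made up for illustration; a real helper would iterate device tree nodes
and parse phandles rather than an array:

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for a master node: which IOMMU it points back
 * at and which stream IDs it uses.  Not a real kernel structure.
 */
struct master_node {
	int iommu_phandle;
	const unsigned int *sids;
	size_t nsids;
};

/*
 * Walk every master once (at IOMMU probe time) and copy out the stream
 * IDs of those that point back at 'phandle'.  Returns the number of IDs
 * found, or -1 if 'out' is too small to hold them all.
 */
int collect_stream_ids(const struct master_node *nodes, size_t nnodes,
		       int phandle, unsigned int *out, size_t max)
{
	size_t n = 0;

	for (size_t i = 0; i < nnodes; i++) {
		if (nodes[i].iommu_phandle != phandle)
			continue;	/* master belongs to another IOMMU */
		for (size_t j = 0; j < nodes[i].nsids; j++) {
			if (n == max)
				return -1;
			out[n++] = nodes[i].sids[j];
		}
	}
	return (int)n;
}
```

With something like that in common code, each driver would only have to
feed the resulting ID list into its own data structures at probe time.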

 Will

 BTW: Is the msm-v0 IOMMU compatible with the ARM SMMU driver, or is it a
 completely different design requiring a different driver?

My understanding is that it is different from msm v1 IOMMU, although I
think it shares the same pagetable format with v1.  Not sure if that
is the same as arm-smmu?   If so it might be nice to try to extract
out some shared helper fxns for map/unmap as well.

I expect Olav knows better the similarities/differences.

BR,
-R


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-14 Thread Rob Clark
On Mon, Jul 14, 2014 at 2:24 AM, Thierry Reding
thierry.red...@gmail.com wrote:
 On Sat, Jul 12, 2014 at 08:57:31AM -0400, Rob Clark wrote:
 On Sat, Jul 12, 2014 at 8:22 AM, Arnd Bergmann a...@arndb.de wrote:
 [...]
  The way that Thierry's binding does that is the obvious solution to this,
  and it mirrors what we do in practically every other subsystem. I definitely
  want the SMMU to change before anybody starts using it in a real system,
  which we fortunately do not have yet.

 hmm, well if some of the things I need (like this or batching
 mappings) are too weird and gpu specific, I'm willing to duplicate the
 IOMMU driver in drm/msm.  It really isn't so much code, and that gives
 me a lot more flexibility to do crazy things... at some point I'm
 probably going to want to do context switches by banging the IOMMU
 registers directly from the gpu.

 If the IOMMU API doesn't provide for what you need, then perhaps it's
 time to enhance it? We do that all the time in other parts of the
 kernel, why should IOMMU be special?

sure.. and my comment was also about the map/unmap batching.

Bypassing IOMMU wouldn't be my first choice.  (Especially because I'd
then get to implement it twice.)  But if some of the things I need are
too specific to one driver (or worse, problematic for other IOMMU
use-cases which I don't know about), then it is an option I'd be
willing to consider.  If nothing else, it would get me out of
allocating sglists for every buffer..  I wonder how much memory
scatterlists take up for 500M of gfx buffers?

 It seems to me like context switching for per-process address space
 isolation is one of the important features of an IOMMU. If the current
 API doesn't let you do that then we should think of ways how it can be
 improved. And if it doesn't do it fast enough, then we should equally
 find ways to speed it up.

 This is part of why I think it would be good to have explicit objects
 associated with IOMMU contexts. That would give us a good place to add
 caching for this kind of situation. Currently we're required to handle
 most of this in drivers (map from struct device to context, ...).

well, it is at least awkward that the current api conflates attaching
device and attaching context.  I think we could get some use out of an
iommu_swap() API which conceptually acts as:

  iommu_swap(olddomain, newdomain, dev)
  {
 iommu_detach_device(olddomain, dev);
 iommu_attach_device(newdomain, dev);
  }

BR,
-R

 Thierry


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-16 Thread Rob Clark
On Tue, Jul 15, 2014 at 9:25 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/13/2014 4:43 AM, Rob Clark wrote:
 On Sun, Jul 13, 2014 at 5:43 AM, Will Deacon will.dea...@arm.com wrote:
 On Sat, Jul 12, 2014 at 01:57:31PM +0100, Rob Clark wrote:
 On Sat, Jul 12, 2014 at 8:22 AM, Arnd Bergmann a...@arndb.de wrote:
 On Saturday 12 July 2014, Rob Clark wrote:
 Was there actually a good reason for having the device link to the
 iommu rather than the other way around?  How much would people hate it
 if I just ignore the generic bindings and use something that works for
 me instead.  I mean, it isn't exactly like there is going to be .dts
 re-use across different SoC's..  and at least with current IOMMU API
 some sort of of_get_named_iommu() API doesn't really make sense.

 The thing is, if you end up ignoring the generic binding then we have two
 IOMMUs using the same (ARM SMMU) binding and it begs the question as to
 which is the more generic! I know we're keen to get this merged, but merging
 something that people won't use and calling it generic doesn't seem ideal
 either. We do, however, desperately need a generic binding.

 yeah, ignoring the generic binding is not my first choice.  I'd rather
 have something that works well for everyone.  But I wasn't really sure
 if the current proposal was arbitrary, or if there are some
 conflicting requirements between different platforms.

 The common case that needs to be simple is attaching one (master) device
 to an IOMMU using the shared global context for the purposes of implementing
 the dma-mapping API.

 well, I don't disagree that IOMMU API has some problems.  It is too
 tied to the bus type, which doesn't really seem to make sense for
 platform devices.  (Unless we start having multiple platform busses?)

 But at least given the current IOMMU API I'm not really sure how it
 makes a difference which way the link goes.  But if there has already
 been some discussion about how you want to handle the tie in with
 dma-mapping, if you could point me at that then maybe your point will
 make more sense to me.

 If you look at the proposed binding in isolation, I think it *is* cleaner
 than the ARM SMMU binding (I've acked it...) and I believe it's more
 consistent with the way we describe linkages elsewhere.

 However, the issue you're raising is that it's more difficult to make use of
 in a Linux IOMMU driver. The reward you'll get for using it will come
 eventually when the DMA ops are automatically swizzled for devices using the
 generic binding.

 My plan for the ARM SMMU driver is:

   (1) Change ->probe() to walk the device-tree looking for all masters with
   phandles back to the SMMU instance being probed

   (2) For each master, extract the Stream IDs and add them to the internal
   SMMU driver data structures (an rbtree per SMMU instance). For
   hotpluggable buses, we'll need a way for the bus controller to
   reserve a range of IDs -- this will likely be a later extension to
   the binding.

   (3) When we get an ->add() call, warn if it's a device we haven't seen
   and reject the addition.

 That way, ->attach() should be the same as it is now, I think.

 Have you tried implementing something like that? We agreed that (1) isn't
 pretty, but I don't have a good alternative and it's only done at
 probe-time.

 I haven't tried implementing that yet, but I'm sure it would work.  I
 was just hoping to avoid having to do that ;-)

 Is the reason you want to do it this way because you want to guarantee
 that all masters (and stream IDs) have been identified before the first
 attach call? I am just wondering why you cannot continue doing the
 master/streamID discovery during add_device() callback?

it was mostly because I couldn't think of a sane way to differentiate
between first and second time a device attaches (without keeping a
reference to the device).  But I guess it is ok to assume no hotplug
(since walking the device tree also seems acceptable)

BR,
-R


 BTW: Is the msm-v0 IOMMU compatible with the ARM SMMU driver, or is it a
 completely different design requiring a different driver?

 My understanding is that it is different from msm v1 IOMMU, although I
 think it shares the same pagetable format with v1.  Not sure if that
 is the same as arm-smmu?   If so it might be nice to try to extract
 out some shared helper fxns for map/unmap as well.

 I expect Olav knows better the similarities/differences.


 The msm-v0 IOMMU is not compatible with ARM SMMUv1 specification.
 However, it is a close cousin. The hardware was designed before the ARM
 SMMUv1 specification was available I believe. But it shares many of the
 same concepts as the ARM SMMUv1.

 msm-v0 IOMMU supports V7S page table format only. The ARM SMMU driver
 does not support V7S at this time. However, I believe we need to support
 this.

 Will, this reminds me. We definitely have a need to use different page
 tables in the ARM SMMU driver vs. the ARM CPU. We

Re: [PATCH v2 1/1] iommu-api: Add map_range/unmap_range functions

2014-07-22 Thread Rob Clark
On Mon, Jul 21, 2014 at 8:59 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/17/2014 1:21 AM, Thierry Reding wrote:
 On Wed, Jul 16, 2014 at 06:01:57PM -0700, Olav Haugan wrote:
 Mapping and unmapping are more often than not in the critical path.
 map_range and unmap_range allows SMMU driver implementations to optimize

 s/SMMU/IOMMU/

 the process of mapping and unmapping buffers into the SMMU page tables.

 s/SMMU/IOMMU/

 Instead of mapping one physical address, do TLB operation (expensive),
 mapping, do TLB operation, mapping, do TLB operation the driver can map
 a scatter-gatherlist of physically contiguous pages into one virtual
 address space and then at the end do one TLB operation.

 I find the above hard to read. Maybe:

 Instead of mapping a buffer one page at a time and requiring potentially
 expensive TLB operations for each page, this function allows the driver
 to map all pages in one go and defer TLB maintenance until after all
 pages have been mapped.

 Yeah, all above is OK with me.


 Additionally, the mapping operation would be faster in general since
 clients do not have to keep calling map API over and over again for
 each physically contiguous chunk of memory that needs to be mapped to a
 virtually contiguous region.

 Signed-off-by: Olav Haugan ohau...@codeaurora.org
 ---
  drivers/iommu/iommu.c | 48 
  include/linux/iommu.h | 25 +
  2 files changed, 73 insertions(+)

 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
 index 1698360..a0eebb7 100644
 --- a/drivers/iommu/iommu.c
 +++ b/drivers/iommu/iommu.c
 @@ -1089,6 +1089,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
  EXPORT_SYMBOL_GPL(iommu_unmap);


 +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,

 Maybe iova should be dma_addr_t? Or at least unsigned long? And perhaps
 iommu_map_sg() would be more consistent with the equivalent function in
 struct dma_ops?

 +struct scatterlist *sg, unsigned int len, int opt)

 The length argument seems to be the size of the mapping. Again, the
 struct dma_ops function uses this argument to denote the number of
 entries in the scatterlist.

 opt is somewhat opaque. Perhaps this should be turned into unsigned long
 flags? Although given that there aren't any users yet it's difficult to
 say what's best here. Perhaps the addition of this argument should be
 postponed until there are actual users?

 I am thinking something like this:

 int iommu_map_sg(struct iommu_domain *domain, struct scatterlist *sg,
 unsigned int nents, int prot, unsigned long flags);
 int iommu_unmap_sg(struct iommu_domain *domain, struct scatterlist *sg,
 unsigned int nents, unsigned long flags);

 The iova is contained within sg so we don't need that argument really
 and I would like to keep the flags argument. I would prefer not to
 change the API after it has been published which could potentially
 affect a lot of call sites.

ugg.. that at least forces me to construct a separate sg for mapping
the same buffer in multiple process's gpu addr space.  Not really a
fan of that.

BR,
-R

 +{
 +s32 ret = 0;

 Should be int to match the function's return type.

 +u32 offset = 0;
 +u32 start_iova = iova;

 These should match the type of iova. Also, what's the point of
 start_iova if we can simply keep iova constant and use offset where
 necessary?

 +BUG_ON(iova & (~PAGE_MASK));
 +
 +if (unlikely(domain->ops->map_range == NULL)) {
 +while (offset < len) {

 Maybe this should use for_each_sg()?

 +phys_addr_t phys = page_to_phys(sg_page(sg));
 +u32 page_len = PAGE_ALIGN(sg->offset + sg->length);

 Shouldn't this alignment be left to iommu_map() to handle? It has code
 to deal with that already.

 I don't see page alignment in the iommu_map function. I only see a check
 whether the (iova | paddr | size) is aligned to the minimum page size
 and then it errors out if it isn't


 +ret = iommu_map(domain, iova, phys, page_len, opt);

 This conflates the new opt argument with iommu_map()'s prot argument.
 Maybe those two should rather be split?

 +if (ret)
 +goto fail;
 +
 +iova += page_len;
 +offset += page_len;
 +if (offset < len)
 +sg = sg_next(sg);
 +}
 +} else {
 +ret = domain->ops->map_range(domain, iova, sg, len, opt);
 +}

 Perhaps rather than check for a ->map_range implementation every time a
 better option may be to export this generic implementation so that
 drivers can set it in their iommu_ops if they don't implement it? So the
 contents of the if () block could become a new function:

   int iommu_map_range_generic(...)
   {
   ...
   }
   EXPORT_SYMBOL(iommu_map_range_generic);


Re: [PATCH v5 1/1] iommu-api: Add map_sg/unmap_sg functions

2014-08-12 Thread Rob Clark
On Mon, Aug 11, 2014 at 9:51 PM, Hiroshi Doyu hd...@nvidia.com wrote:
 Hi Olav,

 Olav Haugan ohau...@codeaurora.org writes:

 @@ -93,6 +94,10 @@ enum iommu_attr {
   * @detach_dev: detach device from an iommu domain
   * @map: map a physically contiguous memory region to an iommu domain
   * @unmap: unmap a physically contiguous memory region from an iommu domain
 + * @map_sg: map a scatter-gather list of physically contiguous memory chunks
 + * to an iommu domain
 + * @unmap_sg: unmap a scatter-gather list of physically contiguous memory
 + * chunks from an iommu domain
   * @iova_to_phys: translate iova to physical address
   * @domain_has_cap: domain capabilities query
   * @add_device: add device to iommu grouping
 @@ -110,6 +115,11 @@ struct iommu_ops {
phys_addr_t paddr, size_t size, int prot);
 size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
  size_t size);
 +   int (*map_sg)(struct iommu_domain *domain, unsigned long iova,
 +   struct scatterlist *sg, unsigned int nents, int prot,
 +   unsigned long flags);
 +   int (*unmap_sg)(struct iommu_domain *domain, unsigned long iova,
 +   size_t size, unsigned long flags);

 Do you have any example/explanation for the above flags?

 Is this going to be used for iommu global/standard attribute or SoC
 specific one?

iirc, one plan for 'flags' was some sort of DONT_FLUSH_TLB flag for
drivers which wanted to map/unmap N buffers with a single flush at the
end.  There might have been some other usages envisioned.

BR,
-R


Re: [PATCH v5 1/1] iommu-api: Add map_sg/unmap_sg functions

2014-08-18 Thread Rob Clark
On Mon, Aug 18, 2014 at 10:07 AM, j...@8bytes.org j...@8bytes.org wrote:
 On Tue, Aug 12, 2014 at 09:56:11AM -0700, Olav Haugan wrote:
 On 8/12/2014 3:48 AM, Rob Clark wrote:
  iirc, one plan for 'flags' was some sort of DONT_FLUSH_TLB flag for
  drivers which wanted to map/unmap N buffers with a single flush at the
  end.  There might have been some other usages envisioned.

 Yes, that was the original intent of the flags for now. I am sure we can
 find other uses for this in the future.

 Do you have anything else in mind already besides the DONT_FLUSH_TLB
 flag?

 How is the IOTLB supposed to be flushed when this flag is used?


well, I was thinking one of two ways:

1) add new flush() vfunc.. this, I think, would be most convenient for
drivers using this feature
2) or driver simply doesn't set DONT_FLUSH_TLB flag on the last
{map,unmap}..  that would be slightly more awkward to use, but would
avoid adding a new vfunc
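
Roughly, the two options would behave like the toy model below.  This
is only a userspace sketch under made-up names: toy_domain, toy_map()
and toy_flush() stand in for the real domain and ops, and
DONT_FLUSH_TLB is just the flag name floated in this thread, not an
existing define:

```c
#include <stddef.h>

#define DONT_FLUSH_TLB (1UL << 0)	/* hypothetical flag from this thread */

/* Toy domain that only counts operations; not struct iommu_domain. */
struct toy_domain {
	unsigned int mapped;	/* buffers mapped so far */
	unsigned int flushes;	/* TLB flushes issued */
};

/* Stand-in for the proposed flush() vfunc. */
void toy_flush(struct toy_domain *d)
{
	d->flushes++;
}

/* Stand-in for map/map_sg: flushes unless the caller asked to defer. */
void toy_map(struct toy_domain *d, unsigned long flags)
{
	d->mapped++;
	if (!(flags & DONT_FLUSH_TLB))
		toy_flush(d);
}

/* Option 1: defer on every map, then one explicit flush at the end. */
void map_all_explicit_flush(struct toy_domain *d, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_map(d, DONT_FLUSH_TLB);
	if (n)
		toy_flush(d);
}

/* Option 2: no flush op; the caller drops the flag on the last map. */
void map_all_last_flushes(struct toy_domain *d, size_t n)
{
	for (size_t i = 0; i < n; i++)
		toy_map(d, i + 1 < n ? DONT_FLUSH_TLB : 0);
}
```

Either way N buffers cost one flush instead of N; the difference is
only whether the caller has to know which call is the last one.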

BR,
-R

 Joerg



Re: [PATCH v6 6/8] dma-mapping: detect and configure IOMMU in of_dma_configure

2014-12-10 Thread Rob Clark
On Mon, Dec 1, 2014 at 11:57 AM, Will Deacon will.dea...@arm.com wrote:
 This patch extends of_dma_configure so that it sets up the IOMMU for a
 device, as well as the coherent/non-coherent DMA mapping ops.

 Acked-by: Arnd Bergmann a...@arndb.de
 Acked-by: Marek Szyprowski m.szyprow...@samsung.com
 Tested-by: Robin Murphy robin.mur...@arm.com
 Signed-off-by: Will Deacon will.dea...@arm.com
 ---
  arch/arm/include/asm/dma-mapping.h |  4 +++-
  drivers/of/platform.c  | 21 ++---
  include/linux/dma-mapping.h|  8 +++-
  3 files changed, 24 insertions(+), 9 deletions(-)

 diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
 index dc3420e77758..f3c0d953f6a2 100644
 --- a/arch/arm/include/asm/dma-mapping.h
 +++ b/arch/arm/include/asm/dma-mapping.h
 @@ -121,7 +121,9 @@ static inline unsigned long dma_max_pfn(struct device *dev)
  }
  #define dma_max_pfn(dev) dma_max_pfn(dev)

 -static inline void arch_setup_dma_ops(struct device *dev, bool coherent)
 +static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base,
 + u64 size, struct iommu_ops *iommu,
 + bool coherent)
  {
 if (coherent)
 set_dma_ops(dev, &arm_coherent_dma_ops);
 diff --git a/drivers/of/platform.c b/drivers/of/platform.c
 index ff1f4e9afccb..b89caf8c7586 100644
 --- a/drivers/of/platform.c
 +++ b/drivers/of/platform.c
 @@ -19,6 +19,7 @@
  #include <linux/slab.h>
  #include <linux/of_address.h>
  #include <linux/of_device.h>
 +#include <linux/of_iommu.h>
  #include <linux/of_irq.h>
  #include <linux/of_platform.h>
  #include <linux/platform_device.h>
 @@ -166,6 +167,7 @@ static void of_dma_configure(struct device *dev)
 int ret;
 bool coherent;
 unsigned long offset;
 +   struct iommu_ops *iommu;

 /*
  * Set default dma-mask to 32 bit. Drivers are expected to setup
 @@ -194,7 +196,16 @@ static void of_dma_configure(struct device *dev)
 dev_dbg(dev, "device is%sdma coherent\n",
 coherent ? " " : " not ");

 -   arch_setup_dma_ops(dev, coherent);
 +   iommu = of_iommu_configure(dev);
 +   dev_dbg(dev, "device is%sbehind an iommu\n",
 +   iommu ? " " : " not ");
 +
 +   arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);


so, what is the way for a driver that explicitly wants to manage its
own device virtual address space to opt out of this?  I suspect that
won't be the common case, but for a gpu, if dma layer all of a sudden
thinks it is in control of the gpu's virtual address space, things are
going to end in tears..

BR,
-R


 +}
 +
 +static void of_dma_deconfigure(struct device *dev)
 +{
 +   arch_teardown_dma_ops(dev);
  }

  /**
 @@ -223,16 +234,12 @@ static struct platform_device *of_platform_device_create_pdata(
 if (!dev)
 goto err_clear_flag;

 -   of_dma_configure(&dev->dev);
 dev->dev.bus = &platform_bus_type;
 dev->dev.platform_data = platform_data;
 -
 -   /* We do not fill the DMA ops for platform devices by default.
 -* This is currently the responsibility of the platform code
 -* to do such, possibly using a device notifier
 -*/
 +   of_dma_configure(&dev->dev);

 if (of_device_add(dev) != 0) {
 +   of_dma_deconfigure(&dev->dev);
 platform_device_put(dev);
 goto err_clear_flag;
 }
 diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
 index 8a1560f95d4a..c3007cb4bfa6 100644
 --- a/include/linux/dma-mapping.h
 +++ b/include/linux/dma-mapping.h
 @@ -130,7 +130,13 @@ static inline int dma_coerce_mask_and_coherent(struct device *dev, u64 mask)
  extern u64 dma_get_required_mask(struct device *dev);

  #ifndef arch_setup_dma_ops
 -static inline void arch_setup_dma_ops(struct device *dev, bool coherent) { }
 +static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base,
 + u64 size, struct iommu_ops *iommu,
 + bool coherent) { }
 +#endif
 +
 +#ifndef arch_teardown_dma_ops
 +static inline void arch_teardown_dma_ops(struct device *dev) { }
  #endif

  static inline unsigned int dma_get_max_seg_size(struct device *dev)
 --
 2.1.1



Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-11 Thread Rob Clark
On Wed, Jan 11, 2017 at 4:36 AM, Will Deacon <will.dea...@arm.com> wrote:
> On Tue, Jan 10, 2017 at 02:20:13PM -0500, Rob Clark wrote:
>> On Tue, Jan 10, 2017 at 12:52 PM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Fri, Jan 06, 2017 at 11:26:49AM -0500, Rob Clark wrote:
>> >> Hmm, well we install the fault handler on the iommu_domain..  perhaps
>> >> maybe a combo of dts property (or deciding based on more specific
>> >> compat string), plus extra param passed in to
>> >> iommu_set_fault_handler().  The dts property or compat string to
>> >> indicate whether the iommu (and how it is wired up) can handle stalls,
>> >> and enable_stall param when fault handler is registered to indicate
>> >> whether the device itself can cope.. if either can't do stalling, then
>> >> don't set CFCFG.
>> >
>> > I thought about this some more, and I think you're right. Having
>> > iommu_set_fault_handler take a flags parameter indicating that, for example,
>> > the fault handler can deal with paging, is all we need to implement the
>> > per-master opt-in functionality for stalling faults. There's no real
>> > requirement to standardise a generic firmware property for that (but
>> > we still need *something* that says stalling is usable on the SMMU --
>> > perhaps just the compatible string is ok).
>>
>> btw, it occurred to me that maybe it should be flags param to
>> iommu_attach_device() (just in case fault handler not installed?)
>> otoh stalling without a fault handler is silly, but I guess we need it
>> to infer whether stalling can be supported by other devices on same
>> iommu.. tbh I'm on a bit shaky ground when it comes to multiple
>> devices per iommu since the SoC's I'm familiar with do it the other
>> way around.  But I guess you have thought more about the multi-device
>> case, so figured I should suggest it..
>
> I don't think it works at attach time, because the stalling property belongs
> to the domain, rather than the individual devices within it. Similarly, I
> don't think we should allow this property to be toggled once devices have
> been attached.
>

hmm, I was more thinking of cases where drivers for particular devices
need some work (ie. like potentially disabling hw hang detect during
faults).. I guess we could have three levels, that all have to be true
in order to enable stall: smmu, domain (pass flags in to
iommu_domain_alloc()??), and device (iommu_attach_device())?

BR,
-R
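
The three-level idea in the message above can be sketched as a plain
conjunction. This is only an illustration: struct stall_caps and
stall_allowed() are hypothetical names, not an existing kernel interface,
and where each flag would come from (DT, iommu_domain_alloc(),
iommu_attach_device()) is exactly what the thread is still debating.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flags; none of these names exist in the kernel. */
struct stall_caps {
	bool smmu_ok;    /* SMMU and its integration tolerate stalls */
	bool domain_ok;  /* the domain was created with a stall request */
	bool device_ok;  /* the attached device's driver can cope with stalls */
};

/* Stalling is enabled only when every level has opted in. */
bool stall_allowed(const struct stall_caps *c)
{
	assert(c != NULL);
	return c->smmu_ok && c->domain_ok && c->device_ok;
}
```

With this shape, a single level that cannot cope (for example a device
whose driver has not been taught to disable hw hang detect) is enough to
keep CFCFG clear.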


Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-10 Thread Rob Clark
On Tue, Jan 10, 2017 at 12:52 PM, Will Deacon <will.dea...@arm.com> wrote:
> Hi Rob,
>
> On Fri, Jan 06, 2017 at 11:26:49AM -0500, Rob Clark wrote:
>> On Thu, Jan 5, 2017 at 10:49 AM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Thu, Jan 05, 2017 at 10:27:27AM -0500, Rob Clark wrote:
>> >> I'm not sure if the better solution then would be to have two fault
>> >> callbacks, one immediately from the IRQ and a later one from wq.  Or
>> >> let the driver handle the wq business and give it a way to tell the
>> >> IOMMU when to resume.
>> >>
>> >> I kinda think we should punt on the worker thread for now until we are
>> >> ready to resume faulting transactions, because I guess a strong chance
>> >> that whatever way we do it now will be wrong ;-)
>> >
>> > I guess what I'm after is for you to change the interrupt handlers to be
>> > threaded, like they are for SMMUv3. I *think* you can do that with a NULL
>> > thread_fn for now, and just call report_iommu_fault from the handler.
>> > The return value of that could, in theory, be used to queue the paging
>> > request and wake the paging thread in future.
>>
>> If we only pass in the non-threaded irq fxn, I'm not really sure how
>> that changes anything.. or maybe I'm not understanding what you mean.
>>
>> But yeah, I guess we could use request_threaded_irq() to get both IRQ
>> context notification and a later thread context notification rather
>> than doing the wq thing.  Either way the iommu API has to change
>> slightly.
>>
>> >> > I wonder if this should also be predicated on the compatible string, so
>> >> > that the "arm,smmu-enable-stall" property is ignored (with a warning) if
>> >> > the compatible string isn't specific enough to identify an implementation
>> >> > with the required SS behaviour? On the other hand, it feels pretty
>> >> > redundant and a single "stalling works" property is all we need.
>> >>
>> >> We could also drop the property and key the behavior on specific
>> >> compat strings I guess.  Having both seems a bit odd.  Anyways, I'll
>> >> defer to DT folks about what the cleaner approach is.
>> >
>> > As Robin pointed out, we do need to be able to distinguish the integration
>> > of the device from the device itself. For example, MMU-9000 might be capable
>> > of stalling, but if it's bolted to a PCI RC, it's not safe to do so.
>>
>> Hmm, well we install the fault handler on the iommu_domain..  perhaps
>> maybe a combo of dts property (or deciding based on more specific
>> compat string), plus extra param passed in to
>> iommu_set_fault_handler().  The dts property or compat string to
>> indicate whether the iommu (and how it is wired up) can handle stalls,
>> and enable_stall param when fault handler is registered to indicate
>> whether the device itself can cope.. if either can't do stalling, then
>> don't set CFCFG.
>
> I thought about this some more, and I think you're right. Having
> iommu_set_fault_handler take a flags parameter indicating that, for example,
> the fault handler can deal with paging, is all we need to implement the
> per-master opt-in functionality for stalling faults. There's no real
> requirement to standardise a generic firmware property for that (but
> we still need *something* that says stalling is usable on the SMMU --
> perhaps just the compatible string is ok).

btw, it occurred to me that maybe it should be flags param to
iommu_attach_device() (just in case fault handler not installed?)
otoh stalling without a fault handler is silly, but I guess we need it
to infer whether stalling can be supported by other devices on same
iommu.. tbh I'm on a bit shaky ground when it comes to multiple
devices per iommu since the SoC's I'm familiar with do it the other
way around.  But I guess you have thought more about the multi-device
case, so figured I should suggest it..

> Taking this further, there's then no need for the threaded IRQ function
> in the SMMUv2 driver after all. Instead, we pass a continuation function
> pointer and opaque token from the SMMU driver to the fault handler in
> IRQ context (this will be in thread context for SMMUv3, but that should
> be fine). The fault handler can then stash these someplace, and signal
> a wakeup for its own threaded handler, which ultimately calls the SMMU
> continuation function with the opaque token as a parameter when it's done
> with the fault. I think that's enough to get things rolling without adding
> lots of infrastructure to the SMMU driver initially. If a pattern emerges
> amongst users of the interface, then we could consolidate some of the work
> handling back into IOMMU core.
>
> What do you think? It should all be pretty straightforward for what you
> want to do.

yeah, that makes sense to me..  I can give it a try.

BR,
-R

> Will
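
A rough userspace model of the continuation scheme Will describes above,
purely as a sketch. Every name here (resume_fn, struct pending_fault,
fault_handler_irq(), fault_handler_thread(), fake_smmu_resume()) is made
up for illustration; the thread proposes the idea, not this interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical continuation type supplied by the SMMU driver. */
typedef void (*resume_fn)(void *token, int terminate);

struct pending_fault {
	resume_fn resume;    /* continuation to call when the fault is done */
	void *token;         /* opaque cookie, meaningful only to the SMMU side */
	unsigned long iova;
	int valid;
};

/* Called in IRQ context: only stash the continuation and defer the work. */
int fault_handler_irq(struct pending_fault *slot, unsigned long iova,
		      resume_fn resume, void *token)
{
	assert(slot != NULL && resume != NULL);
	if (slot->valid)
		return -1;   /* this sketch tracks one outstanding fault */
	slot->resume = resume;
	slot->token = token;
	slot->iova = iova;
	slot->valid = 1;
	return 0;            /* a real handler would wake its thread here */
}

/* Called later from thread context once the fault has been dealt with. */
void fault_handler_thread(struct pending_fault *slot, int fixed)
{
	assert(slot != NULL);
	if (!slot->valid)
		return;
	slot->valid = 0;
	slot->resume(slot->token, !fixed);  /* retry if fixed, else terminate */
}

/* Stand-in for the SMMU side: records what it was asked to do. */
struct fake_cb {
	int resumed;
	int terminated;
};

void fake_smmu_resume(void *token, int terminate)
{
	struct fake_cb *cb = token;

	cb->resumed++;
	cb->terminated = terminate;
}
```

The point of the split is that nothing SMMU-specific leaks into the
fault handler: it only holds the function pointer and the token until
its own thread is finished.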


Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-05 Thread Rob Clark
On Thu, Jan 5, 2017 at 6:55 AM, Will Deacon <will.dea...@arm.com> wrote:
> On Tue, Jan 03, 2017 at 04:30:54PM -0500, Rob Clark wrote:
>> TODO maybe we want two options, one to enable stalling, and 2nd to punt
>> handling to wq?  I haven't needed to use mm APIs from fault handler yet
>> (although it is something that I think we'll want some day).  Perhaps
>> stalling support is limited to just letting driver dump some extra
>> debugging information otherwise.  Threaded handling probably only useful
>> with stalling, but inverse may not always be true.
>
> I'd actually like to see this stuck on a worker thread, because I think
> that's more generally useful and I don't want to have a situation where
> sometimes the IOMMU fault notifier is run in IRQ context and sometimes it's
> not.

So I was talking a bit w/ Jordan on IRC yesterday..  and we also have
the GPU's hw hang-detect to contend with.  So I *suspect* that when we
get to the point of using this to do things like page in things from
swap and resume the faulting transaction, we probably want to get
called immediately from the IRQ handler so we can disable the hw
hang-detect.

I'm not sure if the better solution then would be to have two fault
callbacks, one immediately from the IRQ and a later one from wq.  Or
let the driver handle the wq business and give it a way to tell the
IOMMU when to resume.

I kinda think we should punt on the worker thread for now until we are
ready to resume faulting transactions, because I guess a strong chance
that whatever way we do it now will be wrong ;-)

>>
>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>> ---
>>  .../devicetree/bindings/iommu/arm,smmu.txt |  3 ++
>>  drivers/iommu/arm-smmu.c   | 42 ++
>>  2 files changed, 39 insertions(+), 6 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> index ef465b0..5f405a6 100644
>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> @@ -68,6 +68,9 @@ conditions.
>>aliases of secure registers have to be used during
>>SMMU configuration.
>>
>> +- arm,smmu-enable-stall : Enable stall mode to stall memory transactions
>> +  and resume after fault is handled
>> +
>>  ** Deprecated properties:
>>
>>  - mmu-masters (deprecated in favour of the generic "iommus" binding) :
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index d505432..a71cb8f 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -350,6 +350,7 @@ struct arm_smmu_device {
>>   u32 features;
>>
>>  #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
>> +#define ARM_SMMU_OPT_ENABLE_STALL  (1 << 1)
>>   u32 options;
>>   enum arm_smmu_arch_version  version;
>>   enum arm_smmu_implementation    model;
>> @@ -425,6 +426,7 @@ static bool using_legacy_binding, using_generic_binding;
>>
>>  static struct arm_smmu_option_prop arm_smmu_options[] = {
>>   { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
>> + { ARM_SMMU_OPT_ENABLE_STALL,  "arm,smmu-enable-stall" },
>>   { 0, NULL},
>>  };
>>
>> @@ -676,7 +678,8 @@ static struct iommu_gather_ops arm_smmu_gather_ops = {
>>
>>  static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>>  {
>> - u32 fsr, fsynr;
>> + int flags, ret;
>> + u32 fsr, fsynr, resume;
>>   unsigned long iova;
>>   struct iommu_domain *domain = dev;
>>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> @@ -690,15 +693,40 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>>   if (!(fsr & FSR_FAULT))
>>   return IRQ_NONE;
>>
>> + if (fsr & FSR_IGN)
>> + dev_err_ratelimited(smmu->dev,
>> + "Unexpected context fault (fsr 0x%x)\n",
>> + fsr);
>> +
>>   fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
>> - iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
>> + flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
>>
>> - dev_err_ratelimited(smmu->dev,
>> - "Unhandled context fault: fsr=0x%x, iova=0x%08lx

Re: [RFC 2/3] iommu/arm-smmu: Add qcom implementation

2017-01-04 Thread Rob Clark
On Wed, Jan 4, 2017 at 8:33 AM, Sricharan <sricha...@codeaurora.org> wrote:
> Hi,
>
>>-----Original Message-----
>>From: linux-arm-msm-ow...@vger.kernel.org [mailto:linux-arm-msm-ow...@vger.kernel.org] On Behalf Of Jordan Crouse
>>Sent: Wednesday, January 04, 2017 3:59 AM
>>To: Rob Clark <robdcl...@gmail.com>
>>Cc: Will Deacon <will.dea...@arm.com>; iommu@lists.linux-foundation.org; linux-arm-...@vger.kernel.org; Sricharan R <sricha...@codeaurora.org>
>>Subject: Re: [RFC 2/3] iommu/arm-smmu: Add qcom implementation
>>
>>On Tue, Jan 03, 2017 at 04:30:55PM -0500, Rob Clark wrote:
>>> At least on the db820c I have, with the firmware I have, I'm not seeing
>>> the SS bit set, even though the iommu is in a stalled state.  So for
>>> this implementation ignore not having SS bit set.
>>
>>The SS bit gets set if SCTLR.CFCFG is set to 1. It works in the downstream
>>kernel because the GPU driver writes directly to SCTLR in the IOMMU hardware
>>(which of course is a crime against humanity but that is one of the many reasons
>>why it is a *downstream* driver).
>>
>>My understanding is that SCTLR.CFCFG == 0 should automatically terminate the
>>transaction so I don't understand why we need to write to RESUME. I'm not
>>doubting Rob's patch, I'm doubting why we need it in the first place. It seems
>>that if we have to write it regardless of the value of CFCFG then we should
>>probably just do that instead of relying on the SS bit.
>>
>
> The patch is setting CFCFG to 1, hence we require clearing the fault with a
> write to the RESUME register. I tested these patches on arm-smmu with
> the DB820c and saw that the 'FSR_SS' bit is getting set properly after a
> fault on the adreno smmu.

I'll drop this patch and re-test.. hopefully later today.  It's
possible that I was having the problem w/ SS not set due to some other
issue.  (This was what I was seeing initially after just reverting the
patch that removed the stall/resume stuff.)  I probably need to
double-check that the CFCFG bit isn't getting cleared somewhere.

BR,
-R

>>The public spec doesn't give any indication to me that any of this behavior is
>>implementation specific but I only have one implementation to base that
>>assumption on. Perhaps the default value of SCTLR is implementation specific?
>>
>>If other implementations do expect SS (and CFCFG) to be set by default then we
>>would indeed need to set up a quirk. The other possibility would be to force
>>set CFCFG for all targets, but I would be hesitant to do that on the GPU iommu
>>because if we stall the GPU for too long then hang detect will fire.
>>
>
> As i understood from the previous discussions on this [1],  the
> behaviour of the stall model (whether enabling the stall would impact other
> contexts as well) and how the stalled context bank is going to assert the
> interrupts were implementation defined. I thought that the setting of the
> 'SS' bit should happen if stall model is supported.
>
> [1] https://www.spinics.net/lists/linux-arm-msm/msg25203.html
>
> Regards,
>  Sricharan
>
>
>>Jordan
>>
>>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>>> ---
>>>  drivers/iommu/arm-smmu.c | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>> index a71cb8f..a8d9901 100644
>>> --- a/drivers/iommu/arm-smmu.c
>>> +++ b/drivers/iommu/arm-smmu.c
>>> @@ -298,6 +298,7 @@ enum arm_smmu_implementation {
>>>  GENERIC_SMMU,
>>>  ARM_MMU500,
>>>  CAVIUM_SMMUV2,
>>> +QCOM_SMMUV2,
>>>  };
>>>
>>>  struct arm_smmu_s2cr {
>>> @@ -716,6 +717,9 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>>>  /* Clear the faulting FSR */
>>>  writel(fsr, cb_base + ARM_SMMU_CB_FSR);
>>>
>>> +if (smmu->model == QCOM_SMMUV2)
>>> +fsr |= FSR_SS;
>>> +
>>>  /* Retry or terminate any stalled transactions */
>>>  if (fsr & FSR_SS) {
>>>  /* Should we care about ending up w/ a stalled transaction
>>> @@ -1991,6 +1995,7 @@ ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
>>>  ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
>>>  ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
>>>  ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
>>> +ARM_SMMU_MATCH_DATA(qcom_smmuv2, ARM_SMMU_V2, QCOM_SMMUV2);

[RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-03 Thread Rob Clark
TODO maybe we want two options, one to enable stalling, and 2nd to punt
handling to wq?  I haven't needed to use mm APIs from fault handler yet
(although it is something that I think we'll want some day).  Perhaps
stalling support is limited to just letting driver dump some extra
debugging information otherwise.  Threaded handling probably only useful
with stalling, but inverse may not always be true.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 .../devicetree/bindings/iommu/arm,smmu.txt |  3 ++
 drivers/iommu/arm-smmu.c   | 42 ++
 2 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index ef465b0..5f405a6 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -68,6 +68,9 @@ conditions.
   aliases of secure registers have to be used during
   SMMU configuration.
 
+- arm,smmu-enable-stall : Enable stall mode to stall memory transactions
+  and resume after fault is handled
+
 ** Deprecated properties:
 
 - mmu-masters (deprecated in favour of the generic "iommus" binding) :
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index d505432..a71cb8f 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -350,6 +350,7 @@ struct arm_smmu_device {
u32 features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_ENABLE_STALL  (1 << 1)
u32 options;
enum arm_smmu_arch_version  version;
enum arm_smmu_implementation    model;
@@ -425,6 +426,7 @@ static bool using_legacy_binding, using_generic_binding;
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+   { ARM_SMMU_OPT_ENABLE_STALL,  "arm,smmu-enable-stall" },
{ 0, NULL},
 };
 
@@ -676,7 +678,8 @@ static struct iommu_gather_ops arm_smmu_gather_ops = {
 
 static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 {
-   u32 fsr, fsynr;
+   int flags, ret;
+   u32 fsr, fsynr, resume;
unsigned long iova;
struct iommu_domain *domain = dev;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -690,15 +693,40 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
if (!(fsr & FSR_FAULT))
return IRQ_NONE;
 
+   if (fsr & FSR_IGN)
+   dev_err_ratelimited(smmu->dev,
+   "Unexpected context fault (fsr 0x%x)\n",
+   fsr);
+
fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
-   iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
+   flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
 
-   dev_err_ratelimited(smmu->dev,
-   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
-   fsr, iova, fsynr, cfg->cbndx);
+   iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
+   if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
+   ret = IRQ_HANDLED;
+   resume = RESUME_RETRY;
+   } else {
+   dev_err_ratelimited(smmu->dev,
+   "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+   iova, fsynr, cfg->cbndx);
+   ret = IRQ_NONE;
+   resume = RESUME_TERMINATE;
+   }
 
+   /* Clear the faulting FSR */
writel(fsr, cb_base + ARM_SMMU_CB_FSR);
-   return IRQ_HANDLED;
+
+   /* Retry or terminate any stalled transactions */
+   if (fsr & FSR_SS) {
+   /* Should we care about ending up w/ a stalled transaction
+* when we didn't ask for it?  I guess for now best to call
+* attention to it and resume anyways.
+*/
+   WARN_ON(!(smmu->options & ARM_SMMU_OPT_ENABLE_STALL));
+   writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME);
+   }
+
+   return ret;
 }
 
 static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
@@ -824,6 +852,8 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 
/* SCTLR */
reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE | SCTLR_M;
+   if (smmu->options & ARM_SMMU_OPT_ENABLE_STALL)
+   reg |= SCTLR_CFCFG;
if (stage1)
reg |= SCTLR_S1_ASIDPNE;
 #ifdef __BIG_ENDIAN
-- 
2.7.4



[RFC 0/3] iommu/arm-smmu: patches for adreno

2017-01-03 Thread Rob Clark
Will,

I meant to scrape something together a bit sooner.  I wanted to check if
this was in line with what you were thinking for upstream alternative to
reverting "iommu/arm-smmu: Disable stalling faults for all endpoints".

(Third patch is semi-unrelated, but I'd prefer to only have my rate-
limited prints from drm/msm, since they contain additional information
about gpu state for debugging the fault.)

Rob Clark (3):
  iommu/arm-smmu: Add support to opt-in to stalling
  iommu/arm-smmu: Add qcom implementation
  iommu/arm-smmu: Let fault handler return -EFAULT

 .../devicetree/bindings/iommu/arm,smmu.txt |  3 ++
 drivers/iommu/arm-smmu.c   | 54 --
 2 files changed, 52 insertions(+), 5 deletions(-)

-- 
2.7.4



[RFC 3/3] iommu/arm-smmu: Let fault handler return -EFAULT

2017-01-03 Thread Rob Clark
Let the iommu user ask the iommu to terminate the transaction without
printing any error msg via -EFAULT return.

(Alternatively, look for -ENOSYS return instead to trigger the msg?)

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index a8d9901..dc26c98 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -703,15 +703,23 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
 
iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
-   if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
+
+   switch (report_iommu_fault(domain, smmu->dev, iova, flags)) {
+   case 0:
ret = IRQ_HANDLED;
resume = RESUME_RETRY;
-   } else {
+   break;
+   case -EFAULT:
+   ret = IRQ_HANDLED;
+   resume = RESUME_TERMINATE;
+   break;
+   default:
dev_err_ratelimited(smmu->dev,
"Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
iova, fsynr, cfg->cbndx);
ret = IRQ_NONE;
resume = RESUME_TERMINATE;
+   break;
}
 
/* Clear the faulting FSR */
-- 
2.7.4
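
The three-way mapping this patch introduces can be summarised by the
sketch below. It is a userspace model under assumptions:
struct fault_outcome and classify_fault() are illustrative names only,
and irq_handled stands in for IRQ_HANDLED versus IRQ_NONE.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum resume_action { RESUME_RETRY, RESUME_TERMINATE };

struct fault_outcome {
	bool irq_handled;            /* IRQ_HANDLED vs IRQ_NONE in the patch */
	enum resume_action resume;   /* value destined for the RESUME register */
	bool log_unhandled;          /* whether the SMMU driver prints its message */
};

/* Maps the fault handler's return value to what the IRQ handler does next. */
struct fault_outcome classify_fault(int handler_ret)
{
	struct fault_outcome out;

	switch (handler_ret) {
	case 0:                      /* fault fixed up: retry the access */
		out.irq_handled = true;
		out.resume = RESUME_RETRY;
		out.log_unhandled = false;
		break;
	case -EFAULT:                /* user asked for a quiet terminate */
		out.irq_handled = true;
		out.resume = RESUME_TERMINATE;
		out.log_unhandled = false;
		break;
	default:                     /* e.g. -ENOSYS when no handler is set */
		out.irq_handled = false;
		out.resume = RESUME_TERMINATE;
		out.log_unhandled = true;
		break;
	}
	return out;
}
```

Only the default case keeps the rate-limited "Unhandled context fault"
print, which is what lets drm/msm own the message for its own faults.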



[RFC 2/3] iommu/arm-smmu: Add qcom implementation

2017-01-03 Thread Rob Clark
At least on the db820c I have, with the firmware I have, I'm not seeing
the SS bit set, even though the iommu is in a stalled state.  So for
this implementation ignore not having SS bit set.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index a71cb8f..a8d9901 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -298,6 +298,7 @@ enum arm_smmu_implementation {
GENERIC_SMMU,
ARM_MMU500,
CAVIUM_SMMUV2,
+   QCOM_SMMUV2,
 };
 
 struct arm_smmu_s2cr {
@@ -716,6 +717,9 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
/* Clear the faulting FSR */
writel(fsr, cb_base + ARM_SMMU_CB_FSR);
 
+   if (smmu->model == QCOM_SMMUV2)
+   fsr |= FSR_SS;
+
/* Retry or terminate any stalled transactions */
if (fsr & FSR_SS) {
/* Should we care about ending up w/ a stalled transaction
@@ -1991,6 +1995,7 @@ ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
 ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
+ARM_SMMU_MATCH_DATA(qcom_smmuv2, ARM_SMMU_V2, QCOM_SMMUV2);
 
 static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 },
@@ -1999,6 +2004,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
+   { .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
-- 
2.7.4



Re: [RFC 2/3] iommu/arm-smmu: Add qcom implementation

2017-01-04 Thread Rob Clark
On Wed, Jan 4, 2017 at 9:31 AM, Rob Clark <robdcl...@gmail.com> wrote:
> On Wed, Jan 4, 2017 at 8:33 AM, Sricharan <sricha...@codeaurora.org> wrote:
>> Hi,
>>
>>>-----Original Message-----
>>>From: linux-arm-msm-ow...@vger.kernel.org [mailto:linux-arm-msm-ow...@vger.kernel.org] On Behalf Of Jordan Crouse
>>>Sent: Wednesday, January 04, 2017 3:59 AM
>>>To: Rob Clark <robdcl...@gmail.com>
>>>Cc: Will Deacon <will.dea...@arm.com>; iommu@lists.linux-foundation.org; linux-arm-...@vger.kernel.org; Sricharan R <sricha...@codeaurora.org>
>>>Subject: Re: [RFC 2/3] iommu/arm-smmu: Add qcom implementation
>>>
>>>On Tue, Jan 03, 2017 at 04:30:55PM -0500, Rob Clark wrote:
>>>> At least on the db820c I have, with the firmware I have, I'm not seeing
>>>> the SS bit set, even though the iommu is in a stalled state.  So for
>>>> this implementation ignore not having SS bit set.
>>>
>>>The SS bit gets set if SCTLR.CFCFG is set to 1. It works in the downstream
>>>kernel because the GPU driver writes directly to SCTLR in the IOMMU hardware
>>>(which of course is a crime against humanity but that is one of the many reasons
>>>why it is a *downstream* driver).
>>>
>>>My understanding is that SCTLR.CFCFG == 0 should automatically terminate the
>>>transaction so I don't understand why we need to write to RESUME. I'm not
>>>doubting Rob's patch, I'm doubting why we need it in the first place. It seems
>>>that if we have to write it regardless of the value of CFCFG then we should
>>>probably just do that instead of relying on the SS bit.
>>>
>>
>> The patch is setting CFCFG to 1, hence we require clearing the fault with a
>> write to the RESUME register. I tested these patches on arm-smmu with
>> the DB820c and saw that the 'FSR_SS' bit is getting set properly after a
>> fault on the adreno smmu.
>
> I'll drop this patch and re-test.. hopefully later today.  It's
> possible that I was having the problem w/ SS not set due to some other
> issue.  (This was what I was seeing initially after just reverting the
> patch that removed the stall/resume stuff.)  I probably need to
> double-check that the CFCFG bit isn't getting cleared somewhere.
>

Ok, we can drop this patch, I've confirmed the SS bit is getting set
properly so we don't need a hack.  Not really sure what was going on
earlier when I had this problem before, maybe CFCFG wasn't getting set
properly..

BR,
-R


Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-06 Thread Rob Clark
On Thu, Jan 5, 2017 at 10:49 AM, Will Deacon <will.dea...@arm.com> wrote:
> On Thu, Jan 05, 2017 at 10:27:27AM -0500, Rob Clark wrote:
>> On Thu, Jan 5, 2017 at 6:55 AM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Tue, Jan 03, 2017 at 04:30:54PM -0500, Rob Clark wrote:
>> >> TODO maybe we want two options, one to enable stalling, and 2nd to punt
>> >> handling to wq?  I haven't needed to use mm APIs from fault handler yet
>> >> (although it is something that I think we'll want some day).  Perhaps
>> >> stalling support is limited to just letting driver dump some extra
>> >> debugging information otherwise.  Threaded handling probably only useful
>> >> with stalling, but inverse may not always be true.
>> >
>> > I'd actually like to see this stuck on a worker thread, because I think
>> > that's more generally useful and I don't want to have a situation where
>> > sometimes the IOMMU fault notifier is run in IRQ context and sometimes it's
>> > not.
>>
>> So I was talking a bit w/ Jordan on IRC yesterday..  and we also have
>> the GPU's hw hang-detect to contend with.  So I *suspect* that when we
>> get to the point of using this to do things like page in things from
>> swap and resume the faulting transaction, we probably want to get
>> called immediately from the IRQ handler so we can disable the hw
>> hang-detect.
>
> Well, if you want to use an SMMU for paging, then the GPU driver would
> need to request that explicitly when allocating its DMA buffers, so that
> would be the time to either delay or disable the hang detection.

If userspace is using SVM, for example, it is pretty impossible to
know when to expect a fault.  The best you could do is keep track that
*some* process which has active work queued up for gpu is using SVM
and disable hang detect for *everyone*.. which is kind of sad.

>> I'm not sure if the better solution then would be to have two fault
>> callbacks, one immediately from the IRQ and a later one from wq.  Or
>> let the driver handle the wq business and give it a way to tell the
>> IOMMU when to resume.
>>
>> I kinda think we should punt on the worker thread for now until we are
>> ready to resume faulting transactions, because I guess a strong chance
>> that whatever way we do it now will be wrong ;-)
>
> I guess what I'm after is for you to change the interrupt handlers to be
> threaded, like they are for SMMUv3. I *think* you can do that with a NULL
> thread_fn for now, and just call report_iommu_fault from the handler.
> The return value of that could, in theory, be used to queue the paging
> request and wake the paging thread in future.

If we only pass in the non-threaded irq fxn, I'm not really sure how
that changes anything.. or maybe I'm not understanding what you mean.

But yeah, I guess we could use request_threaded_irq() to get both IRQ
context notification and a later thread context notification rather
than doing the wq thing.  Either way the iommu API has to change
slightly.

>> > I wonder if this should also be predicated on the compatible string, so
>> > that the "arm,smmu-enable-stall" property is ignored (with a warning) if
>> > the compatible string isn't specific enough to identify an implementation
>> > with the required SS behaviour? On the other hand, it feels pretty
>> > redundant and a single "stalling works" property is all we need.
>>
>> We could also drop the property and key the behavior on specific
>> compat strings I guess.  Having both seems a bit odd.  Anyways, I'll
>> defer to DT folks about what the cleaner approach is.
>
> As Robin pointed out, we do need to be able to distinguish the integration
> of the device from the device itself. For example, MMU-9000 might be capable
> of stalling, but if it's bolted to a PCI RC, it's not safe to do so.

Hmm, well we install the fault handler on the iommu_domain..  perhaps
maybe a combo of dts property (or deciding based on more specific
compat string), plus extra param passed in to
iommu_set_fault_handler().  The dts property or compat string to
indicate whether the iommu (and how it is wired up) can handle stalls,
and enable_stall param when fault handler is registered to indicate
whether the device itself can cope.. if either can't do stalling, then
don't set CFCFG.

BR,
-R


Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-06 Thread Rob Clark
On Thu, Jan 5, 2017 at 12:25 PM, Will Deacon  wrote:
>> That's still got to be a per-master property, not a SMMU property, I
>> think. To illustrate:
>>
>>    [A]        [B]   [C]
>>     |          |_____|
>>   __|_____________|____
>>  | TBU |        | TBU |
>>  |_____|  SMMU  |_____|
>>  |__|_____________|___|
>>     |             |
>>
>> Say A and B are instances of some device happy to be stalled, and C is a
>> PCIe RC, and each is attached to their own context bank - enabling
>> stalls for A is definitely fine. However even though B and C are using
>> different context banks, enabling stalls for B might deadlock C if it
>> results in more total outstanding transactions than the TBU's slave port
>> supports. Therefore A can happily claim to be stall-safe, but B cannot
>> due to its integration with respect to C.
>
> So in this case, don't say that B and C can DMA to unpinned memory. You
> need the third property. This property (property 2) is concerned with the
> SMMU itself because, e.g. the way the walker has been integrated can
> cause a deadlock.


fwiw, I guess I'm mostly thinking about case (A)..  but I guess in the
(B) case amend my suggestion about adding param to
iommu_set_fault_handler() slightly to consider the enable_stall param
passed in when both (B) and (C) register their fault handlers?

Or I guess the idea about adding an extra cell (which IIUC would let
us add an extra param in dt in the device's iommus property) could also
work.  Unless maybe there could be some cases where whether a device
can do stalling is also a function of the driver as well (ie. some
feature needs to be implemented type thing)..


BR,
-R


Re: [PATCH 3/4] iommu/arm-smmu: Disable stalling faults for all endpoints

2016-12-06 Thread Rob Clark
On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon  wrote:
> Enabling stalling faults can result in hardware deadlock on poorly
> designed systems, particularly those with a PCI root complex upstream of
> the SMMU.
>
> Although it's not really Linux's job to save hardware integrators from
> their own misfortune, it *is* our job to stop userspace (e.g. VFIO
> clients) from hosing the system for everybody else, even if they might
> already be required to have elevated privileges.
>
> Given that the fault handling code currently executes entirely in IRQ
> context, there is nothing that can sensibly be done to recover from
> things like page faults anyway, so let's rip this code out for now and
> avoid the potential for deadlock.

Hi Will,

so, I'd like to re-introduce this feature, I *guess* as some sort of
opt-in quirk (ie. disabled by default unless something in DT tells you
otherwise)?  But I'm open to suggestions.  (I'm not entirely sure what
hw was having problems due to this feature.)

On newer snapdragon devices we are using arm-smmu for the GPU, and
halting the GPU so the driver's fault handler can dump some GPU state
on faults is enormously helpful for debugging and tracking down where
in the gpu cmdstream the fault was triggered.  In addition, we will
eventually want the ability to update pagetables from the fault handler
and resume the faulting transaction.

Some additional comments below..

> Cc: 
> Reported-by: Matt Evans 
> Signed-off-by: Will Deacon 
> ---
>  drivers/iommu/arm-smmu.c | 34 +++---
>  1 file changed, 7 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 4f49fe29f202..2db74ebc3240 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -686,8 +686,7 @@ static struct iommu_gather_ops arm_smmu_gather_ops = {
>
>  static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>  {
> -   int flags, ret;
> -   u32 fsr, fsynr, resume;
> +   u32 fsr, fsynr;
> unsigned long iova;
> struct iommu_domain *domain = dev;
> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> @@ -701,34 +700,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
> if (!(fsr & FSR_FAULT))
> return IRQ_NONE;
>
> -   if (fsr & FSR_IGN)
> -   dev_err_ratelimited(smmu->dev,
> -   "Unexpected context fault (fsr 0x%x)\n",
> -   fsr);
> -
> fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
> -   flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
> -
> iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
> -   if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
> -   ret = IRQ_HANDLED;
> -   resume = RESUME_RETRY;
> -   } else {
> -   dev_err_ratelimited(smmu->dev,
> -   "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
> -   iova, fsynr, cfg->cbndx);

I would like to decouple this dev_err_ratelimited() print from the
RESUME_RETRY vs RESUME_TERMINATE behaviour.  I need the ability to
indicate by return from my fault handler whether to resume or
terminate.  But I already have my own ratelimited prints and would
prefer not to spam dmesg twice.

I'm thinking about report_iommu_fault() returning:

  0 => RESUME_RETRY
  -EFAULT => RESUME_TERMINATE but don't print
  anything else (or specifically -ENOSYS?) => RESUME_TERMINATE and print

thoughts?
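
The return-code mapping proposed above, written out as a small
standalone C sketch.  The struct and function names are hypothetical and
nothing here is existing arm-smmu code; only the three-way mapping is
taken from the list above.  A successful handler (return 0) is assumed
not to print, as in the code being removed.

```c
#include <errno.h>
#include <stdbool.h>

/* Values mirror the RESUME_RETRY / RESUME_TERMINATE encoding in arm-smmu.c. */
enum resume_action { RESUME_RETRY = 0, RESUME_TERMINATE = 1 };

/* Hypothetical result type: what to write to RESUME, and whether to log. */
struct fault_disposition {
	enum resume_action resume;
	bool print;
};

/* Map the value returned by report_iommu_fault() to an action. */
struct fault_disposition classify_fault_ret(int ret)
{
	/* Default: anything unrecognised (e.g. -ENOSYS) terminates and prints. */
	struct fault_disposition d = { RESUME_TERMINATE, true };

	if (ret == 0) {
		/* Handler dealt with the fault: retry quietly. */
		d.resume = RESUME_RETRY;
		d.print = false;
	} else if (ret == -EFAULT) {
		/* Handler already logged it: terminate without printing. */
		d.print = false;
	}
	return d;
}
```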

> -   ret = IRQ_NONE;
> -   resume = RESUME_TERMINATE;
> -   }
> -
> -   /* Clear the faulting FSR */
> -   writel(fsr, cb_base + ARM_SMMU_CB_FSR);
>
> -   /* Retry or terminate any stalled transactions */
> -   if (fsr & FSR_SS)
> -   writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME);

This might be a bug in qcom's implementation of the smmu spec, but
seems like we don't have SS bit set, yet we still require RESUME reg
to be written, otherwise gpu is perma-wedged.  Maybe topic for a
separate quirk?  I'm not sure if writing RESUME reg on other hw when
SS bit is not set is likely to cause problems?  If not I suppose we
could just unconditionally write it.
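
As a sketch of the "separate quirk" option, under the assumption that
writing RESUME without SS set is not safe everywhere: the quirk struct
and helper below are hypothetical, and the FSR_SS bit position is
assumed to match arm-smmu.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stalled-status bit in the context-bank FSR; position assumed from arm-smmu.c. */
#define FSR_SS (1u << 30)

/* Hypothetical per-implementation quirk flag, not an existing arm-smmu option. */
struct smmu_quirks {
	bool resume_without_ss;	/* hw needs a RESUME write even when SS is clear */
};

/* Decide whether the fault handler should write the RESUME register. */
bool needs_resume_write(uint32_t fsr, const struct smmu_quirks *q)
{
	return (fsr & FSR_SS) || q->resume_without_ss;
}
```

If the unconditional write turns out to be harmless on other hw, the
quirk flag goes away and the helper collapses to "always true".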

Anyways, I'm not super-familiar w/ arm-smmu so suggestions welcome..
in between debugging freedreno I'll try to put together some patches.

BR,
-R

> +   dev_err_ratelimited(smmu->dev,
> +   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
> +   fsr, iova, fsynr, cfg->cbndx);
>
> -   return ret;
> +   writel(fsr, cb_base + ARM_SMMU_CB_FSR);
> +   return IRQ_HANDLED;
>  }
>
>  static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
> @@ -837,7 +817,7 @@ static void arm_smmu_init_context_bank(struct 
> arm_smmu_domain *smmu_domain,
> 

[PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-08-03 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that arm-smmu can support.

(I initially added support to arm-smmu, but it was decided that approach
was too intrusive and it would be cleaner to have a separate driver.)

I should note that all the dependencies for this driver have been merged
since 4.12, and it is the last thing needed for having another fully-
enabled (gpu/display/video codec/etc) ARM device that is fully upstream.

One minor change to move a couple #defines and MMU500 bits back to
arm-smmu.c as suggested by Will.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 220 +
 drivers/iommu/arm-smmu.c   | 211 +
 drivers/iommu/qcom_iommu.c | 932 +
 6 files changed, 1293 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.13.0



[PATCH 3/4] iommu: add qcom_iommu

2017-08-03 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1 so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Riku Voipio <riku.voi...@linaro.org>
Tested-by: Archit Taneja <arch...@codeaurora.org>
---
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 868 +
 3 files changed, 879 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index f73ff28f77e2..92f5fd2e0e4b 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b997d8e..b910aea813a1 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index ..860cad1cb167
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,868 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   u8   num_ctxs;
+   struct qcom_iommu_ctx   *ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   bool secure_init;
+   u8   asid;  /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+   if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+   return NULL;
+   return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+   struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+   if (!qcom_iommu)
+   return NULL;
+   return qcom_iommu->ctxs[asid - 1];
+}
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+   writel_relaxed(val, ctx->ba

[PATCH 4/4] iommu: qcom: initialize secure page table

2017-08-03 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Archit Taneja <arch...@codeaurora.org>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 860cad1cb167..48b62aa52787 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -604,6 +604,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+   return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+   if (ret) {
+   dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+   ret);
+   return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+   dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+   psize);
+   return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+   dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+   goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int get_asid(const struct device_node *np)
 {
u32 reg;
@@ -700,6 +745,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
.remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+   if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+   return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
struct device_node *child;
@@ -744,6 +800,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+   ret = qcom_iommu_sec_ptbl_init(dev);
+   if (ret) {
+   dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+   return ret;
+   }
+   }
+
platform_set_drvdata(pdev, qcom_iommu);
 
pm_runtime_enable(dev);
-- 
2.13.0



[PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-08-03 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Reviewed-by: Rob Herring <r...@kernel.org>
Tested-by: Archit Taneja <arch...@codeaurora.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index ..b2641ceb2b40
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.  Index identifies the context-bank #.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e2 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1e2 0x4>;
+   reg = <0x1ef 0x3000>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_APSS_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   iommu-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   iommu-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x1>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_GFX_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   iommu-ctx@1000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   iommu-ctx@2000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+   ...
+   iommus = <&apps_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+   ...
+   iommus = <&apps_iommu 4>;
+   };
+
+   gpu@01c0 {
+   ...
+   iommus = <&gpu_iommu 1>, <&gpu_iommu 2>;
+   };
-- 
2.13.0



[PATCH 2/4] iommu: arm-smmu: split out register defines

2017-08-03 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Archit Taneja <arch...@codeaurora.org>
---
 drivers/iommu/arm-smmu-regs.h | 220 ++
 drivers/iommu/arm-smmu.c  | 211 ++--
 2 files changed, 229 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index ..a1226e4ab5f8
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,220 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE(1 << 3)
+#define sCR0_GCFGFRE   (1 << 4)
+#define sCR0_GCFGFIE   (1 << 5)
+#define sCR0_USFCFG(1 << 10)
+#define sCR0_VMIDPNE   (1 << 11)
+#define sCR0_PTM   (1 << 12)
+#define sCR0_FB(1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS   (1 << 30)
+#define ID0_S2TS   (1 << 29)
+#define ID0_NTS(1 << 28)
+#define ID0_SMS(1 << 27)
+#define ID0_ATOSNS (1 << 26)
+#define ID0_PTFS_NO_AARCH32(1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW   (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT0
+#define ID1_NUMCB_MASK 0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K(1 << 12)
+#define ID2_PTFS_16K   (1 << 13)
+#define ID2_PTFS_64K   (1 << 14)
+#define ID2_VMID16 (1 << 15)
+
+#define ID7_MAJOR_SHIFT4
+#define ID7_MAJOR_MASK 0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE 

Re: [pull] iommu: qcom_iommu for v4.14

2017-08-15 Thread Rob Clark
On Tue, Aug 15, 2017 at 11:42 AM, Joerg Roedel <j...@8bytes.org> wrote:
> On Sun, Aug 13, 2017 at 10:27:36AM -0400, Rob Clark wrote:
>> Hi Joerg,
>>
>> These patches have been on list for quite some time now.  The bindings
>> have r-b from Rob Herring, and the patch touching arm-smmu has Will's
>> a-b.  And all the review comments on the driver patches have been
>> addressed (despite no one stepping up w/ a r-b tag).  Please consider
>> merging this pull request for v4.14
>
> Okay, I applied it, not from this pull-request, but from your last
> post. Some comments for the future below.

Thanks.  Patches from list are the same.

For future reference, do you prefer to pick up patches from list?
(Dave prefers pull req's for drm, but I guess the volume of patches
for an iommu driver should be lower.)

>> Rob Clark (3):
>>   Docs: dt: document qcom iommu bindings
>
> This patch has no commit-message. Usually that is a reason for me to
> reject the patch. I applied it anyway this time because it has an
> Reviewed-By from a device-tree maintainer, and if its fine for him, I
> can live with it.
>
> For the other three patches I edited the Subject line to match the
> convention of the iommu-tree, which is "iommu/: _Capital Letter_...
>
> The patches below are now named:
>
>>   iommu: arm-smmu: split out register defines
> iommu/arm-smmu: Split out register defines
>
>>   iommu: add qcom_iommu
> iommu/qcom: Add qcom_iommu
>>
>> Stanimir Varbanov (1):
>>   iommu: qcom: initialize secure page table
> iommu/qcom: Initialize secure page table
>
> Please follow that convention for future patches.
>

will do, thanks

BR,
-R

>
> Thanks,
>
> Joerg
>


[PATCH] MAINTAINERS: Add entry for qcom_iommu

2017-08-16 Thread Rob Clark
Add maintainer entry for qcom_iommu.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 44cb004c765d..8e6ab3d2e01f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10941,6 +10941,13 @@ T: git 
git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel.g
 S: Supported
 F: arch/hexagon/
 
+QUALCOMM IOMMU
+M:     Rob Clark <robdcl...@gmail.com>
+L: iommu@lists.linux-foundation.org
+L: linux-arm-...@vger.kernel.org
+S: Maintained
+F: drivers/iommu/qcom_iommu.c
+
 QUALCOMM VENUS VIDEO ACCELERATOR DRIVER
 M: Stanimir Varbanov <stanimir.varba...@linaro.org>
 L: linux-me...@vger.kernel.org
-- 
2.13.0



[pull] iommu: qcom_iommu for v4.14

2017-08-13 Thread Rob Clark
Hi Joerg,

These patches have been on list for quite some time now.  The bindings
have r-b from Rob Herring, and the patch touching arm-smmu has Will's
a-b.  And all the review comments on the driver patches have been
addressed (despite no one stepping up w/ a r-b tag).  Please consider
merging this pull request for v4.14

BR,
-R

The following changes since commit 60110f262d408c3155e578359812c80fd1f786f9:

  Merge branches 'iommu/fixes', 'arm/exynos', 'arm/renesas',
'arm/rockchip', 'arm/omap', 'arm/mediatek', 'x86/vt-d' and 'core' into
next (2017-08-11 17:09:23 +0200)

are available in the git repository at:

  git://people.freedesktop.org/~robclark/linux

for you to fetch changes up to 8a4342ede167b8c095961846e908b9ac1c884650:

  iommu: qcom: initialize secure page table (2017-08-12 11:50:35 -0400)


Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 220 +
 drivers/iommu/arm-smmu.c   | 211 +
 drivers/iommu/qcom_iommu.c | 932 +
 6 files changed, 1293 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c


Re: [pull] iommu: qcom_iommu for v4.14

2017-08-13 Thread Rob Clark
On Sun, Aug 13, 2017 at 10:27 AM, Rob Clark <robdcl...@gmail.com> wrote:
> Hi Joerg,
>
> These patches have been on list for quite some time now.  The bindings
> have r-b from Rob Herring, and the patch touching arm-smmu has Will's
> a-b.  And all the review comments on the driver patches have been
> addressed (despite no one stepping up w/ a r-b tag).  Please consider
> merging this pull request for v4.14
>
> BR,
> -R
>
> The following changes since commit 60110f262d408c3155e578359812c80fd1f786f9:
>
>   Merge branches 'iommu/fixes', 'arm/exynos', 'arm/renesas',
> 'arm/rockchip', 'arm/omap', 'arm/mediatek', 'x86/vt-d' and 'core' into
> next (2017-08-11 17:09:23 +0200)
>
> are available in the git repository at:
>
>   git://people.freedesktop.org/~robclark/linux

ugg, something went wrong with my script.. that should be:

  git://people.freedesktop.org/~robclark/linux qcom_iommu-next

BR,
-R

>
> for you to fetch changes up to 8a4342ede167b8c095961846e908b9ac1c884650:
>
>   iommu: qcom: initialize secure page table (2017-08-12 11:50:35 -0400)
>
> 
> Rob Clark (3):
>   Docs: dt: document qcom iommu bindings
>   iommu: arm-smmu: split out register defines
>   iommu: add qcom_iommu
>
> Stanimir Varbanov (1):
>   iommu: qcom: initialize secure page table
>
>  .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
>  drivers/iommu/Kconfig  |  10 +
>  drivers/iommu/Makefile |   1 +
>  drivers/iommu/arm-smmu-regs.h  | 220 +
>  drivers/iommu/arm-smmu.c   | 211 +
>  drivers/iommu/qcom_iommu.c | 932 +
>  6 files changed, 1293 insertions(+), 202 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>  create mode 100644 drivers/iommu/arm-smmu-regs.h
>  create mode 100644 drivers/iommu/qcom_iommu.c


Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-13 Thread Rob Clark
On Thu, Jul 13, 2017 at 9:53 AM, Sricharan R <sricha...@codeaurora.org> wrote:
> Hi,
>
> On 7/13/2017 5:20 PM, Rob Clark wrote:
>> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R <sricha...@codeaurora.org> 
>> wrote:
>>> Hi Vivek,
>>>
>>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>>>> Hi Stephen,
>>>>
>>>>
>>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>>>> On 07/06, Vivek Gautam wrote:
>>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain 
>>>>>> *domain, unsigned long iova,
>>>>>>   static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned 
>>>>>> long iova,
>>>>>>size_t size)
>>>>>>   {
>>>>>> -struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>>>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>>>> +struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>>>> +size_t ret;
>>>>>> if (!ops)
>>>>>>   return 0;
>>>>>>   -return ops->unmap(ops, iova, size);
>>>>>> +pm_runtime_get_sync(smmu_domain->smmu->dev);
>>>>> Can these map/unmap ops be called from an atomic context? I seem
>>>>> to recall that being a problem before.
>>>>
>>>> That's something which was dropped in the following patch merged in master:
>>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>>>
>>>> Looks like we don't  need locks here anymore?
>>>
>>>  Apart from the locking, wonder why a explicit pm_runtime is needed
>>>  from unmap. Somehow looks like some path in the master using that
>>>  should have enabled the pm ?
>>>
>>
>> Yes, there are a bunch of scenarios where unmap can happen with
>> disabled master (but not in atomic context).  On the gpu side we
>> opportunistically keep a buffer mapping until the buffer is freed
>> (which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
>> an exported dmabuf while some other driver holds a reference to it
>> (which can be dropped when the v4l2 device is suspended).
>>
>> Since unmap triggers tlb flush which touches iommu regs, the iommu
>> driver *definitely* needs a pm_runtime_get_sync().
>
>  Ok, with that being the case, there are two things here,
>
>  1) If the device links are still intact at these places where unmap is 
> called,
> then pm_runtime from the master would setup the all the clocks. That would
> avoid reintroducing the locking indirectly here.
>
>  2) If not, then doing it here is the only way. But for both cases, since
> the unmap can be called from atomic context, resume handler here should
> avoid doing clk_prepare_enable , instead move the clk_prepare to the init.
>

I do kinda like the approach Marek suggested.. of deferring the tlb
flush until resume.  I'm wondering if we could combine that with
putting the mmu in a stalled state when we suspend (and not resume the
mmu until after the pending tlb flush)?
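
A toy model of the deferred-flush idea, to show the ordering only.
Everything here (struct mock_iommu and the mock_* helpers) is
hypothetical userspace code, not the arm-smmu implementation: unmaps
that arrive while the iommu is powered down just record that a flush is
owed, and resume pays it off before translation would be un-stalled.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the iommu's power and TLB state. */
struct mock_iommu {
	bool powered;            /* true between resume and suspend */
	bool flush_pending;      /* an unmap happened while powered down */
	unsigned int hw_flushes; /* flushes that actually reached "hardware" */
};

/* Called from the unmap path: never touches registers while suspended. */
void mock_tlb_flush(struct mock_iommu *mmu)
{
	if (!mmu->powered) {
		mmu->flush_pending = true;
		return;
	}
	mmu->hw_flushes++;
}

/* Suspend: in the real idea the mmu would also be left in a stalled state. */
void mock_suspend(struct mock_iommu *mmu)
{
	mmu->powered = false;
}

/* Resume: issue the owed flush first, only then let translation continue. */
void mock_resume(struct mock_iommu *mmu)
{
	mmu->powered = true;
	if (mmu->flush_pending) {
		mmu->hw_flushes++;
		mmu->flush_pending = false;
	}
}
```

Any number of unmaps during suspend collapse into a single flush at
resume, which is what keeps the unmap path free of pm_runtime calls in
this model.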

BR,
-R


Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-13 Thread Rob Clark
On Thu, Jul 13, 2017 at 5:50 AM, Robin Murphy  wrote:
> On 13/07/17 07:48, Stephen Boyd wrote:
>> On 07/13, Vivek Gautam wrote:
>>> Hi Stephen,
>>>
>>>
>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>>> On 07/06, Vivek Gautam wrote:
>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>>>>>  static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>>>>>  size_t size)
>>>>>  {
>>>>> -  struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>>> +  struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>>> +  struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>>> +  size_t ret;
>>>>>    if (!ops)
>>>>>        return 0;
>>>>> -  return ops->unmap(ops, iova, size);
>>>>> +  pm_runtime_get_sync(smmu_domain->smmu->dev);
>>>> Can these map/unmap ops be called from an atomic context? I seem
>>>> to recall that being a problem before.
>>>
>>> That's something which was dropped in the following patch merged in master:
>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>>
>>> Looks like we don't  need locks here anymore?
>>>
>>
>> While removing the spinlock around the map/unmap path may be one
>> thing, I'm not sure that's all of them. Is there a path from an
>> atomic DMA allocation (GFP_ATOMIC sort of thing) mapped into an
>> IOMMU for a device that can eventually get down to here and
>> attempt to turn a clk on?
>
> Yes, in the DMA path map/unmap will frequently be called from IRQ
> handlers (think e.g. network packets). The whole point of removing the
> lock was to allow multiple maps/unmaps to execute in parallel (since we
> know they will be safely operating on different areas of the pagetable).
> AFAICS this change is going to largely reintroduce that bottleneck via
> dev->power_lock, which is anything but what we want :(
>

Maybe __pm_runtime_resume() needs some sort of fast-path if already
enabled?  Or otherwise we need some sort of flag to tell the iommu
that it cannot rely on the unmapping device to be resumed?
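
To illustrate the fast-path idea, here is a small userspace sketch using
C11 atomics.  The struct and function names are hypothetical and this is
not how the runtime-PM core is written; it also glosses over the case
where the count is non-zero but the resume has not actually finished yet:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a device's runtime-PM usage counter. */
struct model_pm {
	atomic_int usage;  /* > 0 means someone already holds it active */
	int slow_resumes;  /* times we fell back to the locked path */
};

/* Take a reference only if the count is already non-zero, without a lock. */
static bool model_get_if_active(struct model_pm *pm)
{
	int old = atomic_load(&pm->usage);

	while (old > 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&pm->usage, &old, old + 1))
			return true;
	}
	return false;
}

static void model_get(struct model_pm *pm)
{
	if (model_get_if_active(pm))
		return; /* fast path, no lock taken */

	/* Slow path: a real implementation would take the power lock here. */
	pm->slow_resumes++;
	atomic_fetch_add(&pm->usage, 1);
}

static void model_put(struct model_pm *pm)
{
	atomic_fetch_sub(&pm->usage, 1);
}
```

With something like this, only the first map/unmap after the device went
idle pays for the lock, and parallel callers on an already-active iommu
just bump the counter.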

BR,
-R


[RESEND PATCH 3/4] iommu: add qcom_iommu

2017-07-13 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1 so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Riku Voipio <riku.voi...@linaro.org>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
be built as a module yet (at minimum a bunch of other iommu syms
that are needed are not exported, but there may be more to it
than that), but at least qcom_iommu is ready should it become
possible to build iommu drivers as modules.
v6: Add additional pm-runtime get/puts around paths that can hit
TLB inv, to avoid unclocked register access if device using the
iommu is not powered on.  And pre-emptively clear interrupts
before registering IRQ handler just in case the bootloader has
left us a surprise.
v7: Address review comments from Robin (don't associate iommu_group
with context bank, table lookup instead of list to find context
bank, etc)
v8: Fix silly bug on detach.  Actually Robin already pointed it out
but I somehow overlooked that comment when preparing v7.

 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 868 +
 3 files changed, 879 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25ae731..aa4b62893fe1 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b997d8e..b910aea813a1 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index ..860cad1cb167
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,868 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   u8   num_ctxs;
+   struct qcom_iommu_ctx   *ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   bool secure_init;
+   u8   asid;  /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom

[RESEND PATCH 2/4] iommu: arm-smmu: split out register defines

2017-07-13 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 227 ++
 drivers/iommu/arm-smmu.c  | 203 +
 2 files changed, 228 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index ..87589c863068
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,227 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE(1 << 3)
+#define sCR0_GCFGFRE   (1 << 4)
+#define sCR0_GCFGFIE   (1 << 5)
+#define sCR0_USFCFG(1 << 10)
+#define sCR0_VMIDPNE   (1 << 11)
+#define sCR0_PTM   (1 << 12)
+#define sCR0_FB(1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS   (1 << 30)
+#define ID0_S2TS   (1 << 29)
+#define ID0_NTS(1 << 28)
+#define ID0_SMS(1 << 27)
+#define ID0_ATOSNS (1 << 26)
+#define ID0_PTFS_NO_AARCH32(1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW   (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT0
+#define ID1_NUMCB_MASK 0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K(1 << 12)
+#define ID2_PTFS_16K   (1 << 13)
+#define ID2_PTFS_64K   (1 << 14)
+#define ID2_VMID16 (1 << 15)
+
+#define ID7_MAJOR_SHIFT4
+#define ID7_MAJOR_MASK 0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE   (1 << 0)
+#define TLB_LOOP_TIMEOUT   10
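
For readers not familiar with the _SHIFT/_MASK convention used by the
defines above, the following is a minimal sketch of how such a pair is
meant to be applied to a register value.  The define values are copied
from the patch; the helper model_field() is hypothetical and only exists
for this example:

```c
#include <stdint.h>

/* Values copied from the defines in the patch above. */
#define ID0_NUMSMRG_SHIFT  0
#define ID0_NUMSMRG_MASK   0xff
#define ID0_NUMSIDB_SHIFT  9
#define ID0_NUMSIDB_MASK   0xf
#define ID0_S1TS           (1 << 30)

/* Extract a multi-bit field: shift down first, then mask. */
static uint32_t model_field(uint32_t reg, unsigned int shift, uint32_t mask)
{
	return (reg >> shift) & mask;
}
```

Single-bit flags such as ID0_S1TS are tested directly with a bitwise AND
against the register value, while multi-bit fields go through the
shift-then-mask step.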

[RESEND PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-07-13 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Reviewed-by: Rob Herring <r...@kernel.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index ..b2641ceb2b40
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.  Index identifies the context-bank #.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e2 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1e2 0x4>;
+   reg = <0x1ef 0x3000>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_APSS_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   iommu-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   iommu-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x1>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_GFX_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   iommu-ctx@1000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   iommu-ctx@2000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+   ...
+   iommus = <&apps_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+   ...
+   iommus = <&apps_iommu 4>;
+   };
+
+   gpu@01c0 {
+   ...
+   iommus = <&gpu_iommu 1>, <&gpu_iommu 2>;
+   };
-- 
2.13.0
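
As a rough illustration of how the single #iommu-cells index relates to
the context-bank sub-nodes: in the examples the bank at reg offset 0x4000
is referenced with index 4 and the one at 0x5000 with index 5.  The sketch
below assumes banks are always 0x1000 apart, which is an inference from
the examples and not something the binding text states.  All names
(model_ctx, model_lookup, ...) are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define MODEL_CTX_STRIDE 0x1000u /* assumption: banks are 0x1000 apart */
#define MODEL_MAX_CTXS   8       /* arbitrary size for the sketch */

struct model_ctx {
	uint32_t reg_base; /* "reg" offset of the iommu-ctx sub-node */
	uint8_t  asid;     /* context bank number */
};

struct model_iommu {
	uint8_t num_ctxs;
	struct model_ctx *ctxs[MODEL_MAX_CTXS]; /* indexed by asid-1 */
};

/* Derive the bank number from the sub-node's reg offset. */
static uint8_t model_asid_from_reg(uint32_t reg_base)
{
	return (uint8_t)(reg_base / MODEL_CTX_STRIDE);
}

static int model_add_ctx(struct model_iommu *mmu, struct model_ctx *ctx)
{
	ctx->asid = model_asid_from_reg(ctx->reg_base);
	if (ctx->asid == 0 || ctx->asid > MODEL_MAX_CTXS)
		return -1;
	mmu->ctxs[ctx->asid - 1] = ctx;
	if (ctx->asid > mmu->num_ctxs)
		mmu->num_ctxs = ctx->asid;
	return 0;
}

/* Resolve the single cell of an "iommus" specifier to a context bank. */
static struct model_ctx *model_lookup(struct model_iommu *mmu, uint32_t cell)
{
	if (cell == 0 || cell > mmu->num_ctxs)
		return NULL;
	return mmu->ctxs[cell - 1];
}
```

A plain table indexed by bank number keeps the lookup constant-time, and
an index with no matching sub-node simply resolves to NULL.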



[RESEND PATCH 4/4] iommu: qcom: initialize secure page table

2017-07-13 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 860cad1cb167..48b62aa52787 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -604,6 +604,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+   return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+   if (ret) {
+   dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+   ret);
+   return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+   dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+   psize);
+   return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+   dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+   goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int get_asid(const struct device_node *np)
 {
u32 reg;
@@ -700,6 +745,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
.remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+   if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+   return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
struct device_node *child;
@@ -744,6 +800,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+   ret = qcom_iommu_sec_ptbl_init(dev);
+   if (ret) {
+   dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+   return ret;
+   }
+   }
+
platform_set_drvdata(pdev, qcom_iommu);
 
pm_runtime_enable(dev);
-- 
2.13.0



[RESEND PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-07-13 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that arm-smmu can support.

(I initially added support to arm-smmu, but it was decided that approach
was too intrusive and it would be cleaner to have a separate driver.)

I should note that all the dependencies for this driver have been merged
since 4.12, and it is the last thing needed for having another fully-
enabled (gpu/display/video codec/etc) ARM device that is fully upstream.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 227 +
 drivers/iommu/arm-smmu.c   | 203 +
 drivers/iommu/qcom_iommu.c | 932 +
 6 files changed, 1292 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.13.0



Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-13 Thread Rob Clark
On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R wrote:
> Hi Vivek,
>
> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>> Hi Stephen,
>>
>>
>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>> On 07/06, Vivek Gautam wrote:
>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>>>>  static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>>>>                               size_t size)
>>>>  {
>>>> -  struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>> +  struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>> +  struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>> +  size_t ret;
>>>>    if (!ops)
>>>>            return 0;
>>>> -  return ops->unmap(ops, iova, size);
>>>> +  pm_runtime_get_sync(smmu_domain->smmu->dev);
>>> Can these map/unmap ops be called from an atomic context? I seem
>>> to recall that being a problem before.
>>
>> That's something which was dropped in the following patch merged in master:
>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>
>> Looks like we don't  need locks here anymore?
>
>  Apart from the locking, wonder why a explicit pm_runtime is needed
>  from unmap. Somehow looks like some path in the master using that
>  should have enabled the pm ?
>

Yes, there are a bunch of scenarios where unmap can happen with
disabled master (but not in atomic context).  On the gpu side we
opportunistically keep a buffer mapping until the buffer is freed
(which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
an exported dmabuf while some other driver holds a reference to it
(which can be dropped when the v4l2 device is suspended).

Since unmap triggers tlb flush which touches iommu regs, the iommu
driver *definitely* needs a pm_runtime_get_sync().

BR,
-R


Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-13 Thread Rob Clark
On Thu, Jul 13, 2017 at 8:02 AM, Marek Szyprowski
<m.szyprow...@samsung.com> wrote:
> Hi All,
>
> On 2017-07-13 13:50, Rob Clark wrote:
>>
>> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R <sricha...@codeaurora.org>
>> wrote:
>>>
>>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>>>>
>>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>>>>
>>>>> On 07/06, Vivek Gautam wrote:
>>>>>>
>>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain
>>>>>> *domain, unsigned long iova,
>>>>>>static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned
>>>>>> long iova,
>>>>>> size_t size)
>>>>>>{
>>>>>> -struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>>>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>>>> +struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>>>> +size_t ret;
>>>>>>  if (!ops)
>>>>>>return 0;
>>>>>>-return ops->unmap(ops, iova, size);
>>>>>> +pm_runtime_get_sync(smmu_domain->smmu->dev);
>>>>>
>>>>> Can these map/unmap ops be called from an atomic context? I seem
>>>>> to recall that being a problem before.
>>>>
>>>> That's something which was dropped in the following patch merged in
>>>> master:
>>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>>>
>>>> Looks like we don't  need locks here anymore?
>>>
>>>   Apart from the locking, wonder why a explicit pm_runtime is needed
>>>   from unmap. Somehow looks like some path in the master using that
>>>   should have enabled the pm ?
>>>
>> Yes, there are a bunch of scenarios where unmap can happen with
>> disabled master (but not in atomic context).  On the gpu side we
>> opportunistically keep a buffer mapping until the buffer is freed
>> (which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
>> an exported dmabuf while some other driver holds a reference to it
>> (which can be dropped when the v4l2 device is suspended).
>>
>> Since unmap triggers tlb flush which touches iommu regs, the iommu
>> driver *definitely* needs a pm_runtime_get_sync().
>
>
> Afair unmap might be called from atomic context as well, for example as
> a result of dma_unmap_page(). In exynos IOMMU I simply check the runtime
> PM state of IOMMU device. TLB flush is performed only when IOMMU is in
> active state. If it is suspended, I assume that the IOMMU controller's
> context is already lost and its respective power domain might be already
> turned off, so there is no point in touching IOMMU registers.
>

that seems like an interesting approach.. although I wonder if there
can be some race w/ new device memory access once clks are enabled
before tlb flush completes?  That would be rather bad, since this
approach is letting the backing pages of memory be freed before tlb
flush.
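
For reference, my reading of the approach Marek describes, written as a
small userspace model.  None of these names exist in exynos-iommu, and
the "TLB contents are lost on power-down" behaviour is exactly the
assumption Marek states, not something this sketch proves:

```c
#include <stdbool.h>

/* Hypothetical model of an IOMMU; names are made up for illustration. */
struct model_iommu {
	bool active;              /* runtime-PM state: true while powered */
	unsigned int tlb_entries; /* translations currently cached */
	unsigned int reg_writes;  /* register accesses performed */
};

static void model_suspend(struct model_iommu *mmu)
{
	mmu->active = false;
	mmu->tlb_entries = 0; /* assumption: TLB state is lost with power */
}

static void model_resume(struct model_iommu *mmu)
{
	mmu->active = true;
}

/* A device access that walks the tables and caches the result. */
static void model_translate(struct model_iommu *mmu)
{
	if (mmu->active)
		mmu->tlb_entries++;
}

/* Returns true if a TLB flush was actually issued. */
static bool model_unmap(struct model_iommu *mmu)
{
	if (!mmu->active)
		return false;  /* suspended: leave the registers alone */

	mmu->reg_writes++;     /* stands in for the invalidate write */
	mmu->tlb_entries = 0;
	return true;
}
```

The model is only safe because model_suspend() empties the TLB.  If the
hardware can retain entries across a clock-gated (rather than powered
off) suspend, the skipped flush is where the race I am worried about
would come from.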

BR,
-R


Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-14 Thread Rob Clark
On Fri, Jul 14, 2017 at 2:06 PM, Will Deacon <will.dea...@arm.com> wrote:
> On Fri, Jul 14, 2017 at 01:42:13PM -0400, Rob Clark wrote:
>> On Fri, Jul 14, 2017 at 1:07 PM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Thu, Jul 13, 2017 at 10:55:10AM -0400, Rob Clark wrote:
>> >> On Thu, Jul 13, 2017 at 9:53 AM, Sricharan R <sricha...@codeaurora.org> 
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > On 7/13/2017 5:20 PM, Rob Clark wrote:
>> >> >> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R 
>> >> >> <sricha...@codeaurora.org> wrote:
>> >> >>> Hi Vivek,
>> >> >>>
>> >> >>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>> >> >>>> Hi Stephen,
>> >> >>>>
>> >> >>>>
>> >> >>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>> >> >>>>> On 07/06, Vivek Gautam wrote:
>> >> >>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct 
>> >> >>>>>> iommu_domain *domain, unsigned long iova,
>> >> >>>>>>   static size_t arm_smmu_unmap(struct iommu_domain *domain, 
>> >> >>>>>> unsigned long iova,
>> >> >>>>>>size_t size)
>> >> >>>>>>   {
>> >> >>>>>> -struct io_pgtable_ops *ops = 
>> >> >>>>>> to_smmu_domain(domain)->pgtbl_ops;
>> >> >>>>>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> >> >>>>>> +struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>> >> >>>>>> +size_t ret;
>> >> >>>>>> if (!ops)
>> >> >>>>>>   return 0;
>> >> >>>>>>   -return ops->unmap(ops, iova, size);
>> >> >>>>>> +pm_runtime_get_sync(smmu_domain->smmu->dev);
>> >> >>>>> Can these map/unmap ops be called from an atomic context? I seem
>> >> >>>>> to recall that being a problem before.
>> >> >>>>
>> >> >>>> That's something which was dropped in the following patch merged in 
>> >> >>>> master:
>> >> >>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>> >> >>>>
>> >> >>>> Looks like we don't  need locks here anymore?
>> >> >>>
>> >> >>>  Apart from the locking, wonder why a explicit pm_runtime is needed
>> >> >>>  from unmap. Somehow looks like some path in the master using that
>> >> >>>  should have enabled the pm ?
>> >> >>>
>> >> >>
>> >> >> Yes, there are a bunch of scenarios where unmap can happen with
>> >> >> disabled master (but not in atomic context).  On the gpu side we
>> >> >> opportunistically keep a buffer mapping until the buffer is freed
>> >> >> (which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
>> >> >> an exported dmabuf while some other driver holds a reference to it
>> >> >> (which can be dropped when the v4l2 device is suspended).
>> >> >>
>> >> >> Since unmap triggers tlb flush which touches iommu regs, the iommu
>> >> >> driver *definitely* needs a pm_runtime_get_sync().
>> >> >
>> >> >  Ok, with that being the case, there are two things here,
>> >> >
>> >> >  1) If the device links are still intact at these places where unmap is 
>> >> > called,
>> >> > then pm_runtime from the master would setup the all the clocks. 
>> >> > That would
>> >> > avoid reintroducing the locking indirectly here.
>> >> >
>> >> >  2) If not, then doing it here is the only way. But for both cases, 
>> >> > since
>> >> > the unmap can be called from atomic context, resume handler here 
>> >> > should
>> >> > avoid doing clk_prepare_enable , instead move the clk_prepare to 
>> >> > the init.
>> >> >
>> >>
>> >> I do kinda like the approach Marek suggested.. of deferring the tlb
>> >> flush until resume.  I'm 

Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-14 Thread Rob Clark
On Fri, Jul 14, 2017 at 3:01 PM, Will Deacon <will.dea...@arm.com> wrote:
> On Fri, Jul 14, 2017 at 02:25:45PM -0400, Rob Clark wrote:
>> On Fri, Jul 14, 2017 at 2:06 PM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Fri, Jul 14, 2017 at 01:42:13PM -0400, Rob Clark wrote:
>> >> On Fri, Jul 14, 2017 at 1:07 PM, Will Deacon <will.dea...@arm.com> wrote:
>> >> > On Thu, Jul 13, 2017 at 10:55:10AM -0400, Rob Clark wrote:
>> >> >> On Thu, Jul 13, 2017 at 9:53 AM, Sricharan R 
>> >> >> <sricha...@codeaurora.org> wrote:
>> >> >> > Hi,
>> >> >> >
>> >> >> > On 7/13/2017 5:20 PM, Rob Clark wrote:
>> >> >> >> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R 
>> >> >> >> <sricha...@codeaurora.org> wrote:
>> >> >> >>> Hi Vivek,
>> >> >> >>>
>> >> >> >>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>> >> >> >>>> Hi Stephen,
>> >> >> >>>>
>> >> >> >>>>
>> >> >> >>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>> >> >> >>>>> On 07/06, Vivek Gautam wrote:
>> >> >> >>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct 
>> >> >> >>>>>> iommu_domain *domain, unsigned long iova,
>> >> >> >>>>>>   static size_t arm_smmu_unmap(struct iommu_domain *domain, 
>> >> >> >>>>>> unsigned long iova,
>> >> >> >>>>>>size_t size)
>> >> >> >>>>>>   {
>> >> >> >>>>>> -struct io_pgtable_ops *ops = 
>> >> >> >>>>>> to_smmu_domain(domain)->pgtbl_ops;
>> >> >> >>>>>> +struct arm_smmu_domain *smmu_domain = 
>> >> >> >>>>>> to_smmu_domain(domain);
>> >> >> >>>>>> +struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>> >> >> >>>>>> +size_t ret;
>> >> >> >>>>>> if (!ops)
>> >> >> >>>>>>   return 0;
>> >> >> >>>>>>   -return ops->unmap(ops, iova, size);
>> >> >> >>>>>> +pm_runtime_get_sync(smmu_domain->smmu->dev);
>> >> >> >>>>> Can these map/unmap ops be called from an atomic context? I seem
>> >> >> >>>>> to recall that being a problem before.
>> >> >> >>>>
>> >> >> >>>> That's something which was dropped in the following patch merged 
>> >> >> >>>> in master:
>> >> >> >>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>> >> >> >>>>
>> >> >> >>>> Looks like we don't  need locks here anymore?
>> >> >> >>>
>> >> >> >>>  Apart from the locking, wonder why a explicit pm_runtime is needed
>> >> >> >>>  from unmap. Somehow looks like some path in the master using that
>> >> >> >>>  should have enabled the pm ?
>> >> >> >>>
>> >> >> >>
>> >> >> >> Yes, there are a bunch of scenarios where unmap can happen with
>> >> >> >> disabled master (but not in atomic context).  On the gpu side we
>> >> >> >> opportunistically keep a buffer mapping until the buffer is freed
>> >> >> >> (which can happen after gpu is disabled).  Likewise, v4l2 won't 
>> >> >> >> unmap
>> >> >> >> an exported dmabuf while some other driver holds a reference to it
>> >> >> >> (which can be dropped when the v4l2 device is suspended).
>> >> >> >>
>> >> >> >> Since unmap triggers tlb flush which touches iommu regs, the iommu
>> >> >> >> driver *definitely* needs a pm_runtime_get_sync().
>> >> >> >
>> >> >> >  Ok, with that being the case, there are two things here,
>> >> >> >
>> >> >> >  1) If the device links are still intact at these places where unmap 
>> >

Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-14 Thread Rob Clark
On Fri, Jul 14, 2017 at 1:07 PM, Will Deacon <will.dea...@arm.com> wrote:
> On Thu, Jul 13, 2017 at 10:55:10AM -0400, Rob Clark wrote:
>> On Thu, Jul 13, 2017 at 9:53 AM, Sricharan R <sricha...@codeaurora.org> 
>> wrote:
>> > Hi,
>> >
>> > On 7/13/2017 5:20 PM, Rob Clark wrote:
>> >> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R <sricha...@codeaurora.org> 
>> >> wrote:
>> >>> Hi Vivek,
>> >>>
>> >>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>> >>>> Hi Stephen,
>> >>>>
>> >>>>
>> >>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>> >>>>> On 07/06, Vivek Gautam wrote:
>> >>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain 
>> >>>>>> *domain, unsigned long iova,
>> >>>>>>   static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned 
>> >>>>>> long iova,
>> >>>>>>size_t size)
>> >>>>>>   {
>> >>>>>> -struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>> >>>>>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> >>>>>> +struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>> >>>>>> +size_t ret;
>> >>>>>> if (!ops)
>> >>>>>>   return 0;
>> >>>>>>   -return ops->unmap(ops, iova, size);
>> >>>>>> +pm_runtime_get_sync(smmu_domain->smmu->dev);
>> >>>>> Can these map/unmap ops be called from an atomic context? I seem
>> >>>>> to recall that being a problem before.
>> >>>>
>> >>>> That's something which was dropped in the following patch merged in 
>> >>>> master:
>> >>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>> >>>>
>> >>>> Looks like we don't  need locks here anymore?
>> >>>
>> >>>  Apart from the locking, wonder why a explicit pm_runtime is needed
>> >>>  from unmap. Somehow looks like some path in the master using that
>> >>>  should have enabled the pm ?
>> >>>
>> >>
>> >> Yes, there are a bunch of scenarios where unmap can happen with
>> >> disabled master (but not in atomic context).  On the gpu side we
>> >> opportunistically keep a buffer mapping until the buffer is freed
>> >> (which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
>> >> an exported dmabuf while some other driver holds a reference to it
>> >> (which can be dropped when the v4l2 device is suspended).
>> >>
>> >> Since unmap triggers tbl flush which touches iommu regs, the iommu
>> >> driver *definitely* needs a pm_runtime_get_sync().
>> >
>> >  Ok, with that being the case, there are two things here,
>> >
>> >  1) If the device links are still intact at these places where unmap is
>> > called, then pm_runtime from the master would set up all the clocks. That
>> > would avoid reintroducing the locking indirectly here.
>> >
>> >  2) If not, then doing it here is the only way. But for both cases, since
>> > the unmap can be called from atomic context, the resume handler here
>> > should avoid doing clk_prepare_enable, and instead move the clk_prepare
>> > to the init.
>> >
>>
>> I do kinda like the approach Marek suggested.. of deferring the tlb
>> flush until resume.  I'm wondering if we could combine that with
>> putting the mmu in a stalled state when we suspend (and not resume the
>> mmu until after the pending tlb flush)?
>
> I'm not sure that a stalled state is what we're after here, because we need
> to take care to prevent any table walks if we've freed the underlying pages.
> What we could try to do is disable the SMMU (put into global bypass) and
> invalidate the TLB when performing a suspend operation, then we just ignore
> invalidation whilst the clocks are stopped and, on resume, enable the SMMU
> again.

Wouldn't stalled just block any memory transactions by device(s) using
the context bank?  Putting it in bypass isn't really a good thing if
there is any chance the device can sneak in a memory access before
we've taken it back out of bypass (i.e. it makes the gpu a giant
userspace-controlled root hole).
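
For reference, the deferral idea discussed above (remember an invalidation requested while the clocks are off and replay it on resume) could be modelled in user space roughly as below.  This is only a sketch: every name in it is hypothetical, it is not code from any posted patch, and it ignores the suspend/resume vs. map/unmap race that Will raises.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an IOMMU; none of these names are kernel API. */
struct model_iommu {
	bool clocked;            /* registers are only accessible when true */
	bool flush_pending;      /* an invalidation was requested while unclocked */
	unsigned int hw_flushes; /* counts simulated register writes */
};

/* Simulated TLB invalidate: touches "registers", so it must be clocked. */
void model_hw_tlb_inv(struct model_iommu *mmu)
{
	assert(mmu->clocked);
	mmu->hw_flushes++;
}

/* Called from unmap: flush now if possible, otherwise remember to do it. */
void model_tlb_inv(struct model_iommu *mmu)
{
	if (mmu->clocked)
		model_hw_tlb_inv(mmu);
	else
		mmu->flush_pending = true;
}

void model_suspend(struct model_iommu *mmu)
{
	mmu->clocked = false;
}

/* Resume: replay the deferred invalidation before traffic is let through. */
void model_resume(struct model_iommu *mmu)
{
	mmu->clocked = true;
	if (mmu->flush_pending) {
		model_hw_tlb_inv(mmu);
		mmu->flush_pending = false;
	}
}
```

Any number of unmaps while suspended collapse into a single flush at resume, which is the attraction; what the model does not show is how the hardware is kept from walking freed tables in the meantime.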

BR,
-R

> That said, I don't think we can tolerate suspend/resume racing with
> map/unmap, and it's not clear to me how we avoid that without penalising
> the fastpath.
>
> Will


[RESEND PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-08-09 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Reviewed-by: Rob Herring <r...@kernel.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt 
b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index ..b2641ceb2b40
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.  Index identifies the context-bank #.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e2 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1e2 0x4>;
+   reg = <0x1ef 0x3000>;
+   clocks = <&gcc GCC_SMMU_CFG_CLK>,
+            <&gcc GCC_APSS_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   iommu-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   iommu-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x1>;
+   clocks = <&gcc GCC_SMMU_CFG_CLK>,
+            <&gcc GCC_GFX_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   iommu-ctx@1000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   iommu-ctx@2000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+   ...
+   iommus = <&apps_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+   ...
+   iommus = <&apps_iommu 4>;
+   };
+
+   gpu@01c0 {
+   ...
+   iommus = <&gpu_iommu 1>, <&gpu_iommu 2>;
+   };
-- 
2.13.0
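
In the examples above, the single iommu specifier cell matches the context bank's offset within the iommu: the mdp node uses index 4 and its bank sits at 0x4000, venus uses 5 for the bank at 0x5000.  A small sketch of that relationship follows; the 0x1000 stride is read off the example nodes and the function name is hypothetical, so treat it as an illustration rather than as the driver's code.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: context banks are spaced 0x1000 apart, as in the example nodes. */
#define CTX_BANK_STRIDE 0x1000u

/*
 * Returns the context-bank index for a sub-node's reg offset, or -1 if
 * the offset is not a multiple of the bank stride.
 */
int ctx_bank_from_reg(uint32_t reg_offset)
{
	if (reg_offset % CTX_BANK_STRIDE != 0)
		return -1;
	return (int)(reg_offset / CTX_BANK_STRIDE);
}
```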



[RESEND PATCH 4/4] iommu: qcom: initialize secure page table

2017-08-09 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 860cad1cb167..48b62aa52787 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -604,6 +604,51 @@ static void qcom_iommu_disable_clocks(struct 
qcom_iommu_dev *qcom_iommu)
clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+   return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+   if (ret) {
+   dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+   ret);
+   return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+   dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+   psize);
+   return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+   dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+   goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int get_asid(const struct device_node *np)
 {
u32 reg;
@@ -700,6 +745,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
.remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+   if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+   return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
struct device_node *child;
@@ -744,6 +800,14 @@ static int qcom_iommu_device_probe(struct platform_device 
*pdev)
return -ENODEV;
}
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+   ret = qcom_iommu_sec_ptbl_init(dev);
+   if (ret) {
+   dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+   return ret;
+   }
+   }
+
platform_set_drvdata(pdev, qcom_iommu);
 
pm_runtime_enable(dev);
-- 
2.13.0
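
The control flow of qcom_iommu_sec_ptbl_init() above (query the size, allocate, hand the buffer over, free it only on failure, and do all of this at most once) can be tried out in user space with a sketch like the following.  The fake_fw_* functions are hypothetical stand-ins for the secure-world calls, malloc() stands in for the DMA allocation, and none of this is the qcom_scm API.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the secure-world calls; not the qcom_scm API. */
static bool fake_fw_fail;          /* test knob: make the init call fail */
static unsigned int fake_fw_inits; /* how often the init call was made */
static void *model_ptbl_buf;       /* buffer handed over on success */

int fake_fw_ptbl_size(size_t *size)
{
	*size = 4096;
	return 0;
}

int fake_fw_ptbl_init(void *buf, size_t size)
{
	(void)buf;
	(void)size;
	fake_fw_inits++;
	return fake_fw_fail ? -EINVAL : 0;
}

int model_sec_ptbl_init(void)
{
	static bool allocated;
	size_t size = 0;
	void *buf;
	int ret;

	/* Only the first successful call does any work. */
	if (allocated)
		return 0;

	ret = fake_fw_ptbl_size(&size);
	if (ret)
		return ret;

	buf = malloc(size);
	if (!buf)
		return -ENOMEM;

	ret = fake_fw_ptbl_init(buf, size);
	if (ret)
		goto free_mem;

	/* On success the buffer now belongs to the firmware and is kept. */
	model_ptbl_buf = buf;
	allocated = true;
	return 0;

free_mem:
	free(buf);
	return ret;
}
```

A failed attempt leaves the flag clear, so a later probe can retry; a successful one is never repeated.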



[RESEND PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-08-09 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that arm-smmu can support.

(I initially added support to arm-smmu, but it was decided that approach
was too intrusive and it would be cleaner to have a separate driver.)

I should note that all the dependencies for this driver have been merged
since 4.12, and it is the last thing needed for having another fully-
enabled (gpu/display/video codec/etc) ARM device that is fully upstream.

These patches can also be found at:

  git://people.freedesktop.org/~robclark/linux qcom_iommu-next

  https://cgit.freedesktop.org/~robclark/linux/log/?h=qcom_iommu-next

No change since last time.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 220 +
 drivers/iommu/arm-smmu.c   | 211 +
 drivers/iommu/qcom_iommu.c | 932 +
 6 files changed, 1293 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.13.0



Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-08-07 Thread Rob Clark
On Mon, Aug 7, 2017 at 4:27 AM, Vivek Gautam
<vivek.gau...@codeaurora.org> wrote:
> On Thu, Jul 13, 2017 at 5:20 PM, Rob Clark <robdcl...@gmail.com> wrote:
>> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R <sricha...@codeaurora.org> 
>> wrote:
>>> Hi Vivek,
>>>
>>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>>>> Hi Stephen,
>>>>
>>>>
>>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>>>> On 07/06, Vivek Gautam wrote:
>>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain 
>>>>>> *domain, unsigned long iova,
>>>>>>   static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned 
>>>>>> long iova,
>>>>>>size_t size)
>>>>>>   {
>>>>>> -struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>>>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>>>> +struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>>>> +size_t ret;
>>>>>> if (!ops)
>>>>>>   return 0;
>>>>>>   -return ops->unmap(ops, iova, size);
>>>>>> +pm_runtime_get_sync(smmu_domain->smmu->dev);
>>>>> Can these map/unmap ops be called from an atomic context? I seem
>>>>> to recall that being a problem before.
>>>>
>>>> That's something which was dropped in the following patch merged in master:
>>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>>>
>>>> Looks like we don't  need locks here anymore?
>>>
>>>  Apart from the locking, wonder why a explicit pm_runtime is needed
>>>  from unmap. Somehow looks like some path in the master using that
>>>  should have enabled the pm ?
>>>
>>
>> Yes, there are a bunch of scenarios where unmap can happen with
>> disabled master (but not in atomic context).
>
> I would like to understand whether there is a situation where an unmap is
> called in atomic context without an enabled master?
>
> Let's say we have the case where all the unmap calls in atomic context happen
> only from the master's context (in which case the device link should take
> care of the pm state of smmu), and the only unmap that happens in non-atomic
> context is the one with master disabled. In such a case doesn't it make sense
> to distinguish the atomic/non-atomic context and add
> pm_runtime_get_sync()/put_sync() only for the non-atomic context, since that
> would be the one with master disabled?
>

At least drm/msm needs to hold obj->lock (a mutex) in unmap, so it
won't unmap anything in atomic ctx (but it can unmap w/ master
disabled).  I can't really comment about other non-gpu drivers.  It
seems like a reasonable constraint that either master is enabled or
not in atomic ctx.

Currently we actually wrap unmap w/ pm_runtime_get/put_sync(), but I'd
like to drop that to avoid powering up the gpu.

BR,
-R


Re: [PATCH 07/13] drm/msm: Use sychronized interface of the IOMMU-API

2017-08-19 Thread Rob Clark
On Thu, Aug 17, 2017 at 8:56 AM, Joerg Roedel <j...@8bytes.org> wrote:
> From: Joerg Roedel <jroe...@suse.de>
>
> The map and unmap functions of the IOMMU-API changed their
> semantics: They do no longer guarantee that the hardware
> TLBs are synchronized with the page-table updates they made.
>
> To make conversion easier, new synchronized functions have
> been introduced which give these guarantees again until the
> code is converted to use the new TLB-flush interface of the
> IOMMU-API, which allows certain optimizations.
>
> But for now, just convert this code to use the synchronized
> functions so that it will behave as before.


I like the idea of decoupling tlb inv from map (well, that doesn't
really affect any recent hw) and unmap (which does need tlb flush)..
and for a gpu driver it should be pretty easy to back up unmaps.
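
To make the difference concrete, here is a rough user-space model of the two behaviours: an unmap that only updates the page tables and leaves the TLB stale until an explicit sync, and a _sync wrapper that keeps the old "flush every time" guarantee.  All names are hypothetical and this is not the IOMMU-API implementation, just a sketch of why the split helps a driver that unmaps many buffers in a row.

```c
#include <assert.h>

/* Hypothetical model of a domain; none of these names are kernel API. */
struct model_domain {
	unsigned int mapped_pages;
	unsigned int stale_tlb; /* unmapped pages not yet invalidated */
	unsigned int flushes;   /* number of (expensive) TLB flushes */
};

/* New semantics: page-table update only, TLB entries are left stale. */
void model_unmap(struct model_domain *d, unsigned int pages)
{
	assert(pages <= d->mapped_pages);
	d->mapped_pages -= pages;
	d->stale_tlb += pages;
}

/* One flush covers everything unmapped since the previous sync. */
void model_tlb_sync(struct model_domain *d)
{
	if (d->stale_tlb) {
		d->flushes++;
		d->stale_tlb = 0;
	}
}

/* Old semantics: every unmap is immediately followed by a flush. */
void model_unmap_sync(struct model_domain *d, unsigned int pages)
{
	model_unmap(d, pages);
	model_tlb_sync(d);
}
```

Four single-page model_unmap_sync() calls cost four flushes, whereas four model_unmap() calls followed by one model_tlb_sync() cost one.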

I did slightly prefer adding a new iommu_{map,unmap}_async() rather
than changing semantics of existing API, but either way:

Acked-by: Rob Clark <robdcl...@gmail.com>

> Cc: Rob Clark <robdcl...@gmail.com>
> Cc: David Airlie <airl...@linux.ie>
> Cc: linux-arm-...@vger.kernel.org
> Cc: dri-de...@lists.freedesktop.org
> Cc: freedr...@lists.freedesktop.org
> Signed-off-by: Joerg Roedel <jroe...@suse.de>
> ---
>  drivers/gpu/drm/msm/msm_iommu.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
> index b23d336..b3525b7 100644
> --- a/drivers/gpu/drm/msm/msm_iommu.c
> +++ b/drivers/gpu/drm/msm/msm_iommu.c
> @@ -64,7 +64,8 @@ static int msm_iommu_map(struct msm_mmu *mmu, uint64_t iova,
> size_t ret;
>
>  // pm_runtime_get_sync(mmu->dev);
> -   ret = iommu_map_sg(iommu->domain, iova, sgt->sgl, sgt->nents, prot);
> +   ret = iommu_map_sg_sync(iommu->domain, iova, sgt->sgl,
> +   sgt->nents, prot);
>  // pm_runtime_put_sync(mmu->dev);
> WARN_ON(ret < 0);
>
> @@ -77,7 +78,7 @@ static int msm_iommu_unmap(struct msm_mmu *mmu, uint64_t 
> iova,
> struct msm_iommu *iommu = to_msm_iommu(mmu);
>
> pm_runtime_get_sync(mmu->dev);
> -   iommu_unmap(iommu->domain, iova, len);
> +   iommu_unmap_sync(iommu->domain, iova, len);
> pm_runtime_put_sync(mmu->dev);
>
> return 0;
> --
> 2.7.4
>


Re: [PATCH] iommu: qcom: annotate PM functions as __maybe_unused

2017-08-23 Thread Rob Clark
On Wed, Aug 23, 2017 at 9:42 AM, Arnd Bergmann <a...@arndb.de> wrote:
> The qcom_iommu_disable_clocks() function is only called from PM
> code that is hidden in an #ifdef, causing a harmless warning without
> CONFIG_PM:
>
> drivers/iommu/qcom_iommu.c:601:13: error: 'qcom_iommu_disable_clocks' defined 
> but not used [-Werror=unused-function]
>  static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
> drivers/iommu/qcom_iommu.c:581:12: error: 'qcom_iommu_enable_clocks' defined 
> but not used [-Werror=unused-function]
>  static int qcom_iommu_enable_clocks(struct qcom_iommu_dev *qcom_iommu)
>
> Replacing that #ifdef with __maybe_unused annotations lets the compiler
> drop the functions silently instead.
>
> Fixes: 0ae349a0f33f ("iommu/qcom: Add qcom_iommu")
> Signed-off-by: Arnd Bergmann <a...@arndb.de>

thanks

Acked-by: Rob Clark <robdcl...@gmail.com>
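
For anyone unfamiliar with the annotation, the effect can be reproduced outside the kernel with a sketch like the one below.  The local __maybe_unused definition is an assumption for GCC/Clang in user space (the kernel provides its own), and MODEL_PM is a hypothetical stand-in for CONFIG_PM.  Built with -Wall -Werror=unused-function and without -DMODEL_PM, the two static helpers have no callers, yet no warning is raised.

```c
#include <assert.h>

/* Assumption: GCC/Clang attribute syntax; the kernel has its own definition. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

static int clocks_enabled;

/* Only referenced when MODEL_PM is defined. */
static int __maybe_unused model_enable_clocks(void)
{
	clocks_enabled = 1;
	return 0;
}

static void __maybe_unused model_disable_clocks(void)
{
	clocks_enabled = 0;
}

/* Stand-ins for the runtime PM callbacks. */
int model_runtime_resume(void)
{
#ifdef MODEL_PM
	return model_enable_clocks();
#else
	return 0;
#endif
}

int model_runtime_suspend(void)
{
#ifdef MODEL_PM
	model_disable_clocks();
#endif
	return 0;
}
```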

> ---
>  drivers/iommu/qcom_iommu.c | 6 ++
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
> index 48b62aa52787..c8a587d034b0 100644
> --- a/drivers/iommu/qcom_iommu.c
> +++ b/drivers/iommu/qcom_iommu.c
> @@ -860,8 +860,7 @@ static int qcom_iommu_device_remove(struct 
> platform_device *pdev)
> return 0;
>  }
>
> -#ifdef CONFIG_PM
> -static int qcom_iommu_resume(struct device *dev)
> +static int __maybe_unused qcom_iommu_resume(struct device *dev)
>  {
> struct platform_device *pdev = to_platform_device(dev);
> struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> @@ -869,7 +868,7 @@ static int qcom_iommu_resume(struct device *dev)
> return qcom_iommu_enable_clocks(qcom_iommu);
>  }
>
> -static int qcom_iommu_suspend(struct device *dev)
> +static int __maybe_unused qcom_iommu_suspend(struct device *dev)
>  {
> struct platform_device *pdev = to_platform_device(dev);
> struct qcom_iommu_dev *qcom_iommu = platform_get_drvdata(pdev);
> @@ -878,7 +877,6 @@ static int qcom_iommu_suspend(struct device *dev)
>
> return 0;
>  }
> -#endif
>
>  static const struct dev_pm_ops qcom_iommu_pm_ops = {
> SET_RUNTIME_PM_OPS(qcom_iommu_suspend, qcom_iommu_resume, NULL)
> --
> 2.9.0
>


Re: [PATCH 3/4] iommu: add qcom_iommu

2017-05-11 Thread Rob Clark
On Thu, May 11, 2017 at 11:08 AM, Sricharan R  wrote:
> Hi Rob,
>
> 
>
>> +static irqreturn_t qcom_iommu_fault(int irq, void *dev)
>> +{
>> + struct qcom_iommu_ctx *ctx = dev;
>> + u32 fsr, fsynr;
>> + unsigned long iova;
>> +
>> + fsr = iommu_readl(ctx, ARM_SMMU_CB_FSR);
>> +
>> + if (!(fsr & FSR_FAULT))
>> + return IRQ_NONE;
>> +
>> + fsynr = iommu_readl(ctx, ARM_SMMU_CB_FSYNR0);
>> + iova = iommu_readq(ctx, ARM_SMMU_CB_FAR);
>> +
>> + dev_err_ratelimited(ctx->dev,
>> + "Unhandled context fault: fsr=0x%x, "
>> + "iova=0x%08lx, fsynr=0x%x, cb=%d\n",
>> + fsr, iova, fsynr, ctx->asid);
>> +
>> + iommu_writel(ctx, ARM_SMMU_CB_FSR, fsr);
>
> Just thinking if the clocks should be enabled in the fault handler
> for handling cases that would happen out of the master context.
> While global faults are one case, that is anyway handled in the
> secure world for this case. Something like the bootloader used the iommu
> and did not handle the fault, and we get the fault in the kernel the
> moment we enable the ctx. At least downstream seems to enable the
> clocks in the fault handler explicitly.

hmm, I wonder if we should instead do something to clear interrupts
when we initialize the context?

I guess we probably don't want to get fault irq's from the bootloader..
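
Roughly what I have in mind, as a user-space sketch: read the fault status once when the context is set up and write the same bits back to clear them, the way the fault handler above already does.  The register here is simulated with write-one-to-clear behaviour (an assumption of the model) and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated fault status register; write-one-to-clear is assumed here. */
struct model_ctx {
	uint32_t fsr;
};

uint32_t model_read_fsr(const struct model_ctx *ctx)
{
	return ctx->fsr;
}

/* Writing a 1 to a bit clears it; writing 0 leaves it alone. */
void model_write_fsr(struct model_ctx *ctx, uint32_t val)
{
	ctx->fsr &= ~val;
}

/*
 * Clear anything left pending (for example by a bootloader) before the
 * IRQ handler is registered.  Returns the bits that were cleared.
 */
uint32_t model_clear_stale_faults(struct model_ctx *ctx)
{
	uint32_t fsr = model_read_fsr(ctx);

	if (fsr)
		model_write_fsr(ctx, fsr);
	return fsr;
}
```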

BR,
-R

> Regards,
>  Sricharan
>
>
>> +
>> + return IRQ_HANDLED;
>> +}
>> +
>> +static int qcom_iommu_init_domain(struct iommu_domain *domain,
>> +   struct qcom_iommu_dev *qcom_iommu,
>> +   struct iommu_fwspec *fwspec)
>> +{
>> + struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> + struct io_pgtable_ops *pgtbl_ops;
>> + struct io_pgtable_cfg pgtbl_cfg;
>> + int i, ret = 0;
>> + u32 reg;
>> +
>> + mutex_lock(&qcom_domain->init_mutex);
>> + if (qcom_domain->iommu)
>> + goto out_unlock;
>> +
>> + pgtbl_cfg = (struct io_pgtable_cfg) {
>> + .pgsize_bitmap  = qcom_iommu_ops.pgsize_bitmap,
>> + .ias= 32,
>> + .oas= 40,
>> + .tlb= &qcom_gather_ops,
>> + .iommu_dev  = qcom_iommu->dev,
>> + };
>> +
>> + qcom_domain->iommu = qcom_iommu;
>> + pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
>> + if (!pgtbl_ops) {
>> + dev_err(qcom_iommu->dev, "failed to allocate pagetable ops\n");
>> + ret = -ENOMEM;
>> + goto out_clear_iommu;
>> + }
>> +
>> + /* Update the domain's page sizes to reflect the page table format */
>> + domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
>> + domain->geometry.aperture_end = (1ULL << pgtbl_cfg.ias) - 1;
>> + domain->geometry.force_aperture = true;
>> +
>> + for (i = 0; i < fwspec->num_ids; i++) {
>> + struct qcom_iommu_ctx *ctx = to_ctx(fwspec, fwspec->ids[i]);
>> +
>> + if (!ctx->secure_init) {
>> + ret = qcom_scm_restore_sec_cfg(qcom_iommu->sec_id, 
>> ctx->asid);
>> + if (ret) {
>> + dev_err(qcom_iommu->dev, "secure init failed: 
>> %d\n", ret);
>> + goto out_clear_iommu;
>> + }
>> + ctx->secure_init = true;
>> + }
>> +
>> + /* TTBRs */
>> + iommu_writeq(ctx, ARM_SMMU_CB_TTBR0,
>> + pgtbl_cfg.arm_lpae_s1_cfg.ttbr[0] |
>> + ((u64)ctx->asid << TTBRn_ASID_SHIFT));
>> + iommu_writeq(ctx, ARM_SMMU_CB_TTBR1,
>> + pgtbl_cfg.arm_lpae_s1_cfg.ttbr[1] |
>> + ((u64)ctx->asid << TTBRn_ASID_SHIFT));
>> +
>> + /* TTBCR */
>> + iommu_writel(ctx, ARM_SMMU_CB_TTBCR2,
>> + (pgtbl_cfg.arm_lpae_s1_cfg.tcr >> 32) |
>> + TTBCR2_SEP_UPSTREAM);
>> + iommu_writel(ctx, ARM_SMMU_CB_TTBCR,
>> + pgtbl_cfg.arm_lpae_s1_cfg.tcr);
>> +
>> + /* MAIRs (stage-1 only) */
>> + iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR0,
>> + pgtbl_cfg.arm_lpae_s1_cfg.mair[0]);
>> + iommu_writel(ctx, ARM_SMMU_CB_S1_MAIR1,
>> + pgtbl_cfg.arm_lpae_s1_cfg.mair[1]);
>> +
>> + /* SCTLR */
>> + reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE |
>> + SCTLR_M | SCTLR_S1_ASIDPNE;
>> +
>> + if (IS_ENABLED(CONFIG_BIG_ENDIAN))
>> + reg |= SCTLR_E;
>> +
>> + iommu_writel(ctx, ARM_SMMU_CB_SCTLR, reg);
>> + }
>> +
>> + mutex_unlock(&qcom_domain->init_mutex);
>> +
>> + /* Publish page table ops for map/unmap */
>> + qcom_domain->pgtbl_ops = pgtbl_ops;
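
Two of the values computed in the quoted code can be checked by hand: the aperture end for a 32-bit input address size, and the TTBR value with the ASID folded into its upper bits.  The sketch below does that arithmetic in user space; the shift of 48 is an assumption about what TTBRn_ASID_SHIFT expands to, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the ASID field starts at bit 48 of the 64-bit TTBR. */
#define MODEL_TTBR_ASID_SHIFT 48

/* Highest IOVA that fits in an input address size of ias bits. */
uint64_t model_aperture_end(unsigned int ias)
{
	assert(ias > 0 && ias < 64);
	return (1ULL << ias) - 1;
}

/* Combine the page-table base address with the context's ASID. */
uint64_t model_ttbr(uint64_t table_addr, uint8_t asid)
{
	return table_addr | ((uint64_t)asid << MODEL_TTBR_ASID_SHIFT);
}
```

With ias = 32 the aperture ends at 0xffffffff, which matches the .ias = 32 setting in the quoted pgtbl_cfg.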

[PATCH 3/4] iommu: add qcom_iommu

2017-05-09 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
be built as a module yet (at minimum a bunch of other iommu syms
that are needed are not exported, but there may be more to it
than that), but at least qcom_iommu is ready should it become
possible to build iommu drivers as modules.

 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 855 +
 3 files changed, 866 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 37e204f..55d68c9 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,4 +359,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..85fe364
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,855 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   struct list_head context_list;   /* list of qcom_iommu_context 
*/
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   unsigned int irq;
+   bool secure_init;
+   u32  asid;  /* asid and ctx bank # are 1:1 */
+   struct iommu_group  *group;
+   struct list_head node;  /* head in 
qcom_iommu_device::context_list */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
+{
+   if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+   return NULL;
+   return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *f

Re: [PATCH 3/4] iommu: add qcom_iommu

2017-06-12 Thread Rob Clark
On Fri, May 26, 2017 at 8:56 AM, Robin Murphy  wrote:
>> + struct iommu_group  *group;
>
> This feels weird, since a device can be associated with multiple
> contexts, but only one group, so group-per-context is somewhat redundant
> and smacks of being in the wrong place. Does the firmware ever map
> multiple devices to the same context?


so, actually it seems like I can dump all of this, and just plug
generic_device_group directly in to iommu ops without needing to care
about tracking the iommu_group myself.  At least this appears to work.

BR,
-R


[RESEND PATCH 3/4] iommu: add qcom_iommu

2017-06-21 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1 so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Riku Voipio <riku.voi...@linaro.org>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
be built as a module yet (at minimum a bunch of other iommu syms
that are needed are not exported, but there may be more to it
than that), but at least qcom_iommu is ready should it become
possible to build iommu drivers as modules.
v6: Add additional pm-runtime get/puts around paths that can hit
TLB inv, to avoid unclocked register access if device using the
iommu is not powered on.  And pre-emptively clear interrupts
before registering IRQ handler just in case the bootloader has
left us a surprise.
v7: Address review comments from Robin (don't associate iommu_group
with context bank, table lookup instead of list to find context
bank, etc)
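
The table lookup mentioned for v7 amounts to a flexible array of context pointers indexed by asid-1.  A user-space sketch of that layout is below; the struct and function names are hypothetical simplifications and the bounds checks are my own addition for the model, not a copy of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified versions of the driver's structures. */
struct model_ctx {
	uint8_t asid; /* asid and ctx bank # are 1:1 */
};

struct model_iommu {
	uint8_t num_ctxs;
	struct model_ctx *ctxs[]; /* indexed by asid - 1 */
};

/* Allocate the device together with its (initially empty) table. */
struct model_iommu *model_iommu_alloc(uint8_t num_ctxs)
{
	struct model_iommu *mmu;

	mmu = calloc(1, sizeof(*mmu) + num_ctxs * sizeof(mmu->ctxs[0]));
	if (mmu)
		mmu->num_ctxs = num_ctxs;
	return mmu;
}

/* Record a context bank in its slot; rejects out-of-range asids. */
int model_register_ctx(struct model_iommu *mmu, struct model_ctx *ctx)
{
	if (ctx->asid < 1 || ctx->asid > mmu->num_ctxs)
		return -1;
	mmu->ctxs[ctx->asid - 1] = ctx;
	return 0;
}

/* Constant-time lookup, replacing a walk over a list of contexts. */
struct model_ctx *model_to_ctx(struct model_iommu *mmu, unsigned int asid)
{
	if (asid < 1 || asid > mmu->num_ctxs)
		return NULL;
	return mmu->ctxs[asid - 1];
}
```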

 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 868 +
 3 files changed, 879 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25..aa4b628 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..860cad1
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,868 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   u8   num_ctxs;
+   struct qcom_iommu_ctx   *ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   bool secure_init;
+   u8   asid;  /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer 

[RESEND PATCH 4/4] iommu: qcom: initialize secure page table

2017-06-21 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.
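
For readers following along, here is a minimal user-space sketch of that
flow. It is not the driver code: fake_scm_ptbl_size(), fake_scm_ptbl_init()
and the 4096-byte size are stand-ins invented for illustration, and malloc()
takes the place of the DMA allocation. Only the ordering (query the size,
allocate, hand the buffer over, free on failure, run once) follows the
description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the secure-world (SCM) calls. */
static int fake_scm_fail_init;   /* set non-zero to simulate a failing init */
static int fake_scm_init_calls;  /* counts successful hand-overs */

static int fake_scm_ptbl_size(size_t *size)
{
	assert(size != NULL);
	*size = 4096;            /* assumed size, for illustration only */
	return 0;
}

static int fake_scm_ptbl_init(void *buf, size_t size)
{
	(void)buf;
	(void)size;
	if (fake_scm_fail_init)
		return -1;
	fake_scm_init_calls++;
	return 0;
}

/*
 * Query the size, allocate, then hand the buffer to the "secure world".
 * On success the buffer is deliberately kept: it now belongs to the
 * secure side.  A static flag turns repeated calls into a no-op.
 */
static int sec_ptbl_init_once(void)
{
	static bool allocated;
	size_t psize = 0;
	void *buf;
	int ret;

	if (allocated)
		return 0;

	ret = fake_scm_ptbl_size(&psize);
	if (ret)
		return ret;

	buf = malloc(psize);
	if (!buf)
		return -1;

	ret = fake_scm_ptbl_init(buf, psize);
	if (ret) {
		free(buf);       /* roll back the allocation on failure */
		return ret;
	}

	allocated = true;
	return 0;
}
```

The "allocated" flag is only set after the hand-over succeeds, so a failed
attempt can be retried later without leaking the buffer.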

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 860cad1..48b62aa 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -604,6 +604,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
    clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+       return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+   if (ret) {
+       dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+               ret);
+       return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+       dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+               psize);
+       return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+       dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+       goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int get_asid(const struct device_node *np)
 {
    u32 reg;
@@ -700,6 +745,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
    .remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+       if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+           return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
    struct device_node *child;
@@ -744,6 +800,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
        return -ENODEV;
    }
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+       ret = qcom_iommu_sec_ptbl_init(dev);
+       if (ret) {
+           dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+           return ret;
+       }
+   }
+
    platform_set_drvdata(pdev, qcom_iommu);
 
    pm_runtime_enable(dev);
-- 
2.9.4



[RESEND PATCH 2/4] iommu: arm-smmu: split out register defines

2017-06-21 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.
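
As a rough illustration of how the SHIFT/MASK pairs and single-bit flags in
this header are meant to be consumed, here is a small stand-alone sketch.
The define values are copied from the patch below; reg_field() and
reg_flag() are hypothetical helper names, not functions from arm-smmu.c.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the register defines in this patch. */
#define ID1_NUMS2CB_SHIFT	16
#define ID1_NUMS2CB_MASK	0xff
#define ID1_NUMCB_SHIFT		0
#define ID1_NUMCB_MASK		0xff
#define ID2_PTFS_4K		(1 << 12)

/* Extract a multi-bit field: shift it down, then mask off the rest. */
static uint32_t reg_field(uint32_t reg, unsigned int shift, uint32_t mask)
{
	assert(shift < 32);
	return (reg >> shift) & mask;
}

/* Single-bit flags are tested with a plain AND. */
static int reg_flag(uint32_t reg, uint32_t bit)
{
	return (reg & bit) != 0;
}
```

Keeping the shift and the mask as separate defines means the mask is
expressed relative to bit 0, so it is applied after shifting, not before.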

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 227 ++
 drivers/iommu/arm-smmu.c  | 203 +
 2 files changed, 228 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index 000..87589c8
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,227 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0           0x0
+#define sCR0_CLIENTPD               (1 << 0)
+#define sCR0_GFRE                   (1 << 1)
+#define sCR0_GFIE                   (1 << 2)
+#define sCR0_EXIDENABLE             (1 << 3)
+#define sCR0_GCFGFRE                (1 << 4)
+#define sCR0_GCFGFIE                (1 << 5)
+#define sCR0_USFCFG                 (1 << 10)
+#define sCR0_VMIDPNE                (1 << 11)
+#define sCR0_PTM                    (1 << 12)
+#define sCR0_FB                     (1 << 13)
+#define sCR0_VMID16EN               (1 << 31)
+#define sCR0_BSU_SHIFT              14
+#define sCR0_BSU_MASK               0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR           0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0            0x20
+#define ARM_SMMU_GR0_ID1            0x24
+#define ARM_SMMU_GR0_ID2            0x28
+#define ARM_SMMU_GR0_ID3            0x2c
+#define ARM_SMMU_GR0_ID4            0x30
+#define ARM_SMMU_GR0_ID5            0x34
+#define ARM_SMMU_GR0_ID6            0x38
+#define ARM_SMMU_GR0_ID7            0x3c
+#define ARM_SMMU_GR0_sGFSR          0x48
+#define ARM_SMMU_GR0_sGFSYNR0       0x50
+#define ARM_SMMU_GR0_sGFSYNR1       0x54
+#define ARM_SMMU_GR0_sGFSYNR2       0x58
+
+#define ID0_S1TS                    (1 << 30)
+#define ID0_S2TS                    (1 << 29)
+#define ID0_NTS                     (1 << 28)
+#define ID0_SMS                     (1 << 27)
+#define ID0_ATOSNS                  (1 << 26)
+#define ID0_PTFS_NO_AARCH32         (1 << 25)
+#define ID0_PTFS_NO_AARCH32S        (1 << 24)
+#define ID0_CTTW                    (1 << 14)
+#define ID0_NUMIRPT_SHIFT           16
+#define ID0_NUMIRPT_MASK            0xff
+#define ID0_NUMSIDB_SHIFT           9
+#define ID0_NUMSIDB_MASK            0xf
+#define ID0_EXIDS                   (1 << 8)
+#define ID0_NUMSMRG_SHIFT           0
+#define ID0_NUMSMRG_MASK            0xff
+
+#define ID1_PAGESIZE                (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT       28
+#define ID1_NUMPAGENDXB_MASK        7
+#define ID1_NUMS2CB_SHIFT           16
+#define ID1_NUMS2CB_MASK            0xff
+#define ID1_NUMCB_SHIFT             0
+#define ID1_NUMCB_MASK              0xff
+
+#define ID2_OAS_SHIFT               4
+#define ID2_OAS_MASK                0xf
+#define ID2_IAS_SHIFT               0
+#define ID2_IAS_MASK                0xf
+#define ID2_UBS_SHIFT               8
+#define ID2_UBS_MASK                0xf
+#define ID2_PTFS_4K                 (1 << 12)
+#define ID2_PTFS_16K                (1 << 13)
+#define ID2_PTFS_64K                (1 << 14)
+#define ID2_VMID16                  (1 << 15)
+
+#define ID7_MAJOR_SHIFT             4
+#define ID7_MAJOR_MASK              0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID       0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH    0x68
+#define ARM_SMMU_GR0_TLBIALLH       0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC      0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS    0x74
+#define sTLBGSTATUS_GSACTIVE        (1 << 0)
+#define TLB_LOOP_TIMEOUT   10

[RESEND PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-06-21 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Reviewed-by: Rob Herring <r...@kernel.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index 000..b2641ce
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.  Index identifies the context-bank #.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e2 {
+       #address-cells = <1>;
+       #size-cells = <1>;
+       #iommu-cells = <1>;
+       compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+       ranges = <0 0x1e2 0x4>;
+       reg = <0x1ef 0x3000>;
+       clocks = < GCC_SMMU_CFG_CLK>,
+                < GCC_APSS_TCU_CLK>;
+       clock-names = "iface", "bus";
+       qcom,iommu-secure-id = <17>;
+
+       // mdp_0:
+       iommu-ctx@4000 {
+           compatible = "qcom,msm-iommu-v1-ns";
+           reg = <0x4000 0x1000>;
+           interrupts = ;
+       };
+
+       // venus_ns:
+       iommu-ctx@5000 {
+           compatible = "qcom,msm-iommu-v1-sec";
+           reg = <0x5000 0x1000>;
+           interrupts = ;
+       };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+       #address-cells = <1>;
+       #size-cells = <1>;
+       #iommu-cells = <1>;
+       compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+       ranges = <0 0x1f08000 0x1>;
+       clocks = < GCC_SMMU_CFG_CLK>,
+                < GCC_GFX_TCU_CLK>;
+       clock-names = "iface", "bus";
+       qcom,iommu-secure-id = <18>;
+
+       // gfx3d_user:
+       iommu-ctx@1000 {
+           compatible = "qcom,msm-iommu-v1-ns";
+           reg = <0x1000 0x1000>;
+           interrupts = ;
+       };
+
+       // gfx3d_priv:
+       iommu-ctx@2000 {
+           compatible = "qcom,msm-iommu-v1-ns";
+           reg = <0x2000 0x1000>;
+           interrupts = ;
+       };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+       ...
+       iommus = <_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+       ...
+       iommus = <_iommu 4>;
+   };
+
+   gpu@01c0 {
+       ...
+       iommus = <_iommu 1>, <_iommu 2>;
+   };
-- 
2.9.4



[RESEND PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-06-21 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not implement
the ARM SMMU spec in a way that is compatible with the arm-smmu driver.

No change since last time, other than Riku's tested-by.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 227 +
 drivers/iommu/arm-smmu.c   | 203 +
 drivers/iommu/qcom_iommu.c | 932 +
 6 files changed, 1292 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.9.4



[PATCH 3/4] iommu: add qcom_iommu

2017-06-26 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1 so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Riku Voipio <riku.voi...@linaro.org>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
be built as a module yet (at minimum a bunch of other iommu syms
that are needed are not exported, but there may be more to it
than that), but at least qcom_iommu is ready should it become
possible to build iommu drivers as modules.
v6: Add additional pm-runtime get/puts around paths that can hit
TLB inv, to avoid unclocked register access if device using the
iommu is not powered on.  And pre-emptively clear interrupts
before registering IRQ handler just in case the bootloader has
left us a surprise.
v7: Address review comments from Robin (don't associate iommu_group
with context bank, table lookup instead of list to find context
bank, etc)
v8: Fix silly bug on detach.  Actually Robin already pointed it out
but I somehow overlooked that comment when preparing v7.
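
To make the v7 "table lookup instead of list" point concrete, here is a
small stand-alone sketch of a context table indexed by asid-1, following
the comment on the ctxs[] member in the driver. The struct and function
names (struct ctx, iommu_dev_alloc(), iommu_dev_add_ctx(),
iommu_dev_find_ctx()) are simplified stand-ins made up for this example,
not the driver's own definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ctx {
	unsigned char asid;      /* 1-based context bank number */
};

struct iommu_dev {
	unsigned char num_ctxs;
	struct ctx *ctxs[];      /* indexed by asid - 1 */
};

/* Allocate the device together with a zeroed table of num_ctxs slots. */
static struct iommu_dev *iommu_dev_alloc(unsigned char num_ctxs)
{
	struct iommu_dev *d;

	d = calloc(1, sizeof(*d) + num_ctxs * sizeof(d->ctxs[0]));
	if (d)
		d->num_ctxs = num_ctxs;
	return d;
}

/* Register a context bank in the slot that matches its asid. */
static int iommu_dev_add_ctx(struct iommu_dev *d, struct ctx *c)
{
	assert(d && c);
	if (c->asid < 1 || c->asid > d->num_ctxs)
		return -1;
	d->ctxs[c->asid - 1] = c;
	return 0;
}

/* Constant-time lookup; NULL for out-of-range or unpopulated slots. */
static struct ctx *iommu_dev_find_ctx(const struct iommu_dev *d,
				      unsigned int asid)
{
	assert(d);
	if (asid < 1 || asid > d->num_ctxs)
		return NULL;
	return d->ctxs[asid - 1];
}
```

Compared with walking a list, the lookup is a bounds check plus one array
access, at the cost of sizing the table up front.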

 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 857 +
 3 files changed, 868 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25..aa4b628 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..33e984e
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,857 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device      iommu;
+   struct device           *dev;
+   struct clk              *iface_clk;
+   struct clk              *bus_clk;
+   void __iomem            *local_base;
+   u32                      sec_id;
+   u8                       num_ctxs;
+   struct qcom_iommu_ctx   *ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+   struct device           *dev;
+   void __iomem            *base;
+   bool                     secure_init;
+   u8                       asid;      /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops

[PATCH 4/4] iommu: qcom: initialize secure page table

2017-06-26 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 33e984e..438f38a 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -593,6 +593,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
    clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+       return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+   if (ret) {
+       dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+               ret);
+       return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+       dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+               psize);
+       return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+       dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+       goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int get_asid(const struct device_node *np)
 {
    u32 reg;
@@ -689,6 +734,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
    .remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+       if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+           return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
    struct device_node *child;
@@ -733,6 +789,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
        return -ENODEV;
    }
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+       ret = qcom_iommu_sec_ptbl_init(dev);
+       if (ret) {
+           dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+           return ret;
+       }
+   }
+
    platform_set_drvdata(pdev, qcom_iommu);
 
    pm_runtime_enable(dev);
-- 
2.9.4



[PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-06-26 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not implement
the ARM SMMU spec in a way that is compatible with the arm-smmu driver.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 227 +
 drivers/iommu/arm-smmu.c   | 203 +
 drivers/iommu/qcom_iommu.c | 921 +
 6 files changed, 1281 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.9.4



[PATCH 2/4] iommu: arm-smmu: split out register defines

2017-06-26 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 227 ++
 drivers/iommu/arm-smmu.c  | 203 +
 2 files changed, 228 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index 000..87589c8
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,227 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0           0x0
+#define sCR0_CLIENTPD               (1 << 0)
+#define sCR0_GFRE                   (1 << 1)
+#define sCR0_GFIE                   (1 << 2)
+#define sCR0_EXIDENABLE             (1 << 3)
+#define sCR0_GCFGFRE                (1 << 4)
+#define sCR0_GCFGFIE                (1 << 5)
+#define sCR0_USFCFG                 (1 << 10)
+#define sCR0_VMIDPNE                (1 << 11)
+#define sCR0_PTM                    (1 << 12)
+#define sCR0_FB                     (1 << 13)
+#define sCR0_VMID16EN               (1 << 31)
+#define sCR0_BSU_SHIFT              14
+#define sCR0_BSU_MASK               0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR           0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0            0x20
+#define ARM_SMMU_GR0_ID1            0x24
+#define ARM_SMMU_GR0_ID2            0x28
+#define ARM_SMMU_GR0_ID3            0x2c
+#define ARM_SMMU_GR0_ID4            0x30
+#define ARM_SMMU_GR0_ID5            0x34
+#define ARM_SMMU_GR0_ID6            0x38
+#define ARM_SMMU_GR0_ID7            0x3c
+#define ARM_SMMU_GR0_sGFSR          0x48
+#define ARM_SMMU_GR0_sGFSYNR0       0x50
+#define ARM_SMMU_GR0_sGFSYNR1       0x54
+#define ARM_SMMU_GR0_sGFSYNR2       0x58
+
+#define ID0_S1TS                    (1 << 30)
+#define ID0_S2TS                    (1 << 29)
+#define ID0_NTS                     (1 << 28)
+#define ID0_SMS                     (1 << 27)
+#define ID0_ATOSNS                  (1 << 26)
+#define ID0_PTFS_NO_AARCH32         (1 << 25)
+#define ID0_PTFS_NO_AARCH32S        (1 << 24)
+#define ID0_CTTW                    (1 << 14)
+#define ID0_NUMIRPT_SHIFT           16
+#define ID0_NUMIRPT_MASK            0xff
+#define ID0_NUMSIDB_SHIFT           9
+#define ID0_NUMSIDB_MASK            0xf
+#define ID0_EXIDS                   (1 << 8)
+#define ID0_NUMSMRG_SHIFT           0
+#define ID0_NUMSMRG_MASK            0xff
+
+#define ID1_PAGESIZE                (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT       28
+#define ID1_NUMPAGENDXB_MASK        7
+#define ID1_NUMS2CB_SHIFT           16
+#define ID1_NUMS2CB_MASK            0xff
+#define ID1_NUMCB_SHIFT             0
+#define ID1_NUMCB_MASK              0xff
+
+#define ID2_OAS_SHIFT               4
+#define ID2_OAS_MASK                0xf
+#define ID2_IAS_SHIFT               0
+#define ID2_IAS_MASK                0xf
+#define ID2_UBS_SHIFT               8
+#define ID2_UBS_MASK                0xf
+#define ID2_PTFS_4K                 (1 << 12)
+#define ID2_PTFS_16K                (1 << 13)
+#define ID2_PTFS_64K                (1 << 14)
+#define ID2_VMID16                  (1 << 15)
+
+#define ID7_MAJOR_SHIFT             4
+#define ID7_MAJOR_MASK              0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID       0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH    0x68
+#define ARM_SMMU_GR0_TLBIALLH       0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC      0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS    0x74
+#define sTLBGSTATUS_GSACTIVE        (1 << 0)
+#define TLB_LOOP_TIMEOUT   10

[PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-06-26 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Reviewed-by: Rob Herring <r...@kernel.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index 000..b2641ce
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.  Index identifies the context-bank #.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e2 {
+       #address-cells = <1>;
+       #size-cells = <1>;
+       #iommu-cells = <1>;
+       compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+       ranges = <0 0x1e2 0x4>;
+       reg = <0x1ef 0x3000>;
+       clocks = < GCC_SMMU_CFG_CLK>,
+                < GCC_APSS_TCU_CLK>;
+       clock-names = "iface", "bus";
+       qcom,iommu-secure-id = <17>;
+
+       // mdp_0:
+       iommu-ctx@4000 {
+           compatible = "qcom,msm-iommu-v1-ns";
+           reg = <0x4000 0x1000>;
+           interrupts = ;
+       };
+
+       // venus_ns:
+       iommu-ctx@5000 {
+           compatible = "qcom,msm-iommu-v1-sec";
+           reg = <0x5000 0x1000>;
+           interrupts = ;
+       };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+       #address-cells = <1>;
+       #size-cells = <1>;
+       #iommu-cells = <1>;
+       compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+       ranges = <0 0x1f08000 0x1>;
+       clocks = < GCC_SMMU_CFG_CLK>,
+                < GCC_GFX_TCU_CLK>;
+       clock-names = "iface", "bus";
+       qcom,iommu-secure-id = <18>;
+
+       // gfx3d_user:
+       iommu-ctx@1000 {
+           compatible = "qcom,msm-iommu-v1-ns";
+           reg = <0x1000 0x1000>;
+           interrupts = ;
+       };
+
+       // gfx3d_priv:
+       iommu-ctx@2000 {
+           compatible = "qcom,msm-iommu-v1-ns";
+           reg = <0x2000 0x1000>;
+           interrupts = ;
+       };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+       ...
+       iommus = <_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+       ...
+       iommus = <_iommu 4>;
+   };
+
+   gpu@01c0 {
+       ...
+       iommus = <_iommu 1>, <_iommu 2>;
+   };
-- 
2.9.4



Re: [PATCH 3/4] iommu: add qcom_iommu

2017-05-26 Thread Rob Clark
On Fri, May 26, 2017 at 8:56 AM, Robin Murphy <robin.mur...@arm.com> wrote:
> On 25/05/17 18:33, Rob Clark wrote:
>> An iommu driver for Qualcomm "B" family devices which do not completely
>> implement the ARM SMMU spec.  These devices have context-bank register
>> layout that is similar to ARM SMMU, but no global register space (or at
>> least not one that is accessible).
>
> I still object to this description, because the SMMU_SCR1.GASRAE = 1
> usage model is explicitly *specified* by the ARM SMMU spec! It's merely
> that the arm-smmu driver is designed for the case where we do have
> control of the global space and stage 2 contexts.

hmm, ok.. well, I've no idea what secure world is doing, but it sounds
plausible that GASRAE is set to 1.. at least that would match how
things behave.

In that case, I wonder if the driver should have a more generic name
than "qcom_iommu" (and likewise for compat strings, etc)?  I've really
no idea if qcom is the only one doing this.  In either case,
suggestions welcome.  (I had assumed someone would have bikeshedded
the name/compat-strings by now ;-))

>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>> ---
>> v1: original
>> v2: bindings cleanups and kconfig issues that kbuild robot pointed out
>> v3: fix issues pointed out by Rob H. and actually make device removal
>> work
>> v4: fix WARN_ON() splats reported by Archit
>> v5: some fixes to build as a module.. note that it cannot actually
>> be built as a module yet (at minimum a bunch of other iommu syms
>> that are needed are not exported, but there may be more to it
>> than that), but at least qcom_iommu is ready should it become
>> possible to build iommu drivers as modules.
>
> Note that with the 4.12 probe-deferral changes, modules totally work!
> For any master which probed before the IOMMU driver was loaded, you can
> then hook them up after the fact by just unbinding and rebinding their
> drivers - it's really cool.

hmm, ok, last time I tried this was 4.11 + iommu-next for 4.12 (plus a
couple other -next trees), since 4.12-rc1 wasn't out yet.. but at that
time, we needed at least a few EXPORT_SYMBOL()s, plus probably some
sort of fix for iommu bug I was trying to fix/paper-over in
<20170505180837.11326-1-robdcl...@gmail.com> (at least if you wanted
module unload to work).  For the former issue, I can send patches to
add EXPORT_SYMBOL()s (or is EXPORT_SYMBOL_GPL() preferred?).. for
latter, well I spend 80% of my time working on userspace level part of
gpu driver stack, and 80% of my kernel time working in drm, so I'll
leave this to someone who spends more than 4% of their time working on
the iommu subsystem ;-)

>> v6: Add additional pm-runtime get/puts around paths that can hit
>> TLB inv, to avoid unclocked register access if device using the
>> iommu is not powered on.  And pre-emptively clear interrupts
>> before registering IRQ handler just in case the bootloader has
>> left us a surprise.
>>
>>  drivers/iommu/Kconfig  |  10 +
>>  drivers/iommu/Makefile |   1 +
>>  drivers/iommu/qcom_iommu.c | 878 +
>>  3 files changed, 889 insertions(+)
>>  create mode 100644 drivers/iommu/qcom_iommu.c
>>
>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>> index 6ee3a25..aa4b628 100644
>> --- a/drivers/iommu/Kconfig
>> +++ b/drivers/iommu/Kconfig
>> @@ -367,4 +367,14 @@ config MTK_IOMMU_V1
>>
>> if unsure, say N here.
>>
>> +config QCOM_IOMMU
>> + # Note: iommu drivers cannot (yet?) be built as modules
>> + bool "Qualcomm IOMMU Support"
>> + depends on ARCH_QCOM || COMPILE_TEST
>> + select IOMMU_API
>> + select IOMMU_IO_PGTABLE_LPAE
>> + select ARM_DMA_USE_IOMMU
>> + help
>> +   Support for IOMMU on certain Qualcomm SoCs.
>> +
>>  endif # IOMMU_SUPPORT
>> diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
>> index 195f7b9..b910aea 100644
>> --- a/drivers/iommu/Makefile
>> +++ b/drivers/iommu/Makefile
>> @@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
>>  obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
>>  obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
>>  obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
>> +obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
>> diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
>> new file mode 100644
>> index 000..bfaf97c
>> --- /dev/null
>> +++ b/drivers/iommu/qcom_iommu.c
>> @@ -0,0 +1,878 @@
>> +/*
>> + * IOMMU API for QCOM secure IOMMUs.  Somewh

Re: [PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-05-26 Thread Rob Clark
On Fri, May 26, 2017 at 7:33 AM, Robin Murphy <robin.mur...@arm.com> wrote:
> On 25/05/17 18:33, Rob Clark wrote:
>> Cc: devicet...@vger.kernel.org
>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>> Reviewed-by: Rob Herring <r...@kernel.org>
>> ---
>>  .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
>>  1 file changed, 121 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>> new file mode 100644
>> index 000..0d50f84
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>> @@ -0,0 +1,121 @@
>> +* QCOM IOMMU v1 Implementation
>> +
>> +Qualcomm "B" family devices which are not compatible with arm-smmu have
>> +a similar looking IOMMU but without access to the global register space,
>> +and optionally requiring additional configuration to route context irqs
>> +to non-secure vs secure interrupt line.
>> +
>> +** Required properties:
>> +
>> +- compatible   : Should be one of:
>> +
>> +"qcom,msm8916-iommu"
>> +
>> + Followed by "qcom,msm-iommu-v1".
>> +
>> +- clock-names  : Should be a pair of "iface" (required for IOMMUs
>> + register group access) and "bus" (required for
>> + the IOMMUs underlying bus access).
>> +
>> +- clocks   : Phandles for respective clocks described by
>> + clock-names.
>> +
>> +- #address-cells   : must be 1.
>> +
>> +- #size-cells  : must be 1.
>> +
>> +- #iommu-cells : Must be 1.
>
> You need to document what the value in the cell means for this binding.
> AFAICS it looks to be the hardware context bank index, but I wonder if
> it might be simpler to use the child node index instead.

yeah, it is the ctx bank idx.. I suppose it could work either way.  I
guess the advantage of ctx bank index is that it wouldn't shift if
adding child nodes incrementally (which at least might be more likely
for someone wiring up things incrementally when bringing up a new SoC
without docs.. and we've had a few cases of that).

(Ie. I think there were some cases where there are gaps in the context
banks.. and not really sure if those were just context banks that
the downstream kernel didn't bother using or what.)
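
To make the trade-off concrete, here is a minimal userspace sketch (not
the driver code) of resolving the #iommu-cells value as a hardware
context bank index when the populated banks have gaps.  The names
ctx_bank and find_ctx and the bank numbers are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one context bank child node. */
struct ctx_bank {
	unsigned int hw_index;	/* hardware context bank index, as in the iommus cell */
	const char *name;
};

/*
 * Find a bank by hardware index.  The match is on hw_index rather than on
 * the position in the array, so adding or removing other child nodes does
 * not change which bank a given cell value refers to.
 */
static const struct ctx_bank *find_ctx(const struct ctx_bank *banks,
				       size_t n, unsigned int hw_index)
{
	size_t i;

	assert(banks != NULL || n == 0);

	for (i = 0; i < n; i++)
		if (banks[i].hw_index == hw_index)
			return &banks[i];

	return NULL;	/* gap: no child node describes this bank */
}
```

With a child-node index instead, every client's cell value after an
inserted node would have to be renumbered.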

>> +- ranges   : Base address and size of the iommu context banks.
>> +
>> +- qcom,iommu-secure-id  : secure-id.
>> +
>> +- List of sub-nodes, one per translation context bank.  Each sub-node
>> +  has the following required properties:
>> +
>> +  - compatible : Should be one of:
>> +- "qcom,msm-iommu-v1-ns"  : non-secure context bank
>> +- "qcom,msm-iommu-v1-sec" : secure context bank
>> +  - reg: Base address and size of context bank within the iommu
>> +  - interrupts : The context fault irq.
>> +
>> +** Optional properties:
>> +
>> +- reg  : Base address and size of the SMMU local base, should
>> + be only specified if the iommu requires configuration
>> + for routing of context bank irq's to secure vs non-
>> + secure lines.  (Ie. if the iommu contains secure
>> + context banks)
>> +
>> +
>> +** Examples:
>> +
>> + apps_iommu: iommu@1e2 {
>> + #address-cells = <1>;
>> + #size-cells = <1>;
>> + #iommu-cells = <1>;
>> + compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
>> + ranges = <0 0x1e2 0x4>;
>> + reg = <0x1ef 0x3000>;
>> + clocks = < GCC_SMMU_CFG_CLK>,
>> +  < GCC_APSS_TCU_CLK>;
>> + clock-names = "iface", "bus";
>> + qcom,iommu-secure-id = <17>;
>> +
>> + // mdp_0:
>> + iommu-ctx@4000 {
>> + compatible = "qcom,msm-iommu-v1-ns";
>> + reg = <0x4000 0x1000>;
>> + interrupts = ;
>> + };
>> +
>> + // venus_ns:
>> + iommu-ctx@5000 {
>> + compatible = "qcom,

[PATCH 3/4] iommu: add qcom_iommu

2017-06-13 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1 so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
be built as a module yet (at minimum a bunch of other iommu syms
that are needed are not exported, but there may be more to it
than that), but at least qcom_iommu is ready should it become
possible to build iommu drivers as modules.
v6: Add additional pm-runtime get/puts around paths that can hit
TLB inv, to avoid unclocked register access if device using the
iommu is not powered on.  And pre-emptively clear interrupts
before registering IRQ handler just in case the bootloader has
left us a surprise.
v7: Address review comments from Robin (don't associate iommu_group
with context bank, table lookup instead of list to find context
bank, etc)
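
For reviewers, a rough userspace sketch of the v7 "table lookup instead
of list" idea, assuming a 1-based asid that maps 1:1 to a context bank.
This is not the driver code: the struct names, the fixed MAX_CTXS bound
(the driver uses a trailing array sized at probe time) and the to_ctx()
signature are simplified for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CTXS 8	/* arbitrary bound for this sketch only */

struct ctx {
	unsigned char asid;	/* 1-based; asid and ctx bank # are 1:1 */
};

struct iommu_dev {
	unsigned char num_ctxs;
	struct ctx *ctxs[MAX_CTXS];	/* indexed by asid-1, NULL if unused */
};

/* O(1) lookup; NULL for asid 0, out-of-range asids and unpopulated slots. */
static struct ctx *to_ctx(const struct iommu_dev *iommu, unsigned int asid)
{
	assert(iommu != NULL);
	assert(iommu->num_ctxs <= MAX_CTXS);

	if (asid == 0 || asid > iommu->num_ctxs)
		return NULL;

	return iommu->ctxs[asid - 1];
}
```

Compared with walking a list, the table costs a few unused pointers when
there are gaps but needs no per-context list node.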

 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 868 +
 3 files changed, 879 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25..aa4b628 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..860cad1
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,868 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   u8   num_ctxs;
+   struct qcom_iommu_ctx   *ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   bool secure_init;
+   u8   asid;  /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   

[PATCH 3/4] iommu: add qcom_iommu

2017-05-04 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
---
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 825 +
 3 files changed, 836 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 37e204f..400a404 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,4 +359,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   bool "Qualcomm IOMMU Support"
+   depends on ARM || ARM64
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..1cf7c8e
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,825 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#define pr_fmt(fmt) "qcom-iommu: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   struct list_head context_list;   /* list of qcom_iommu_context */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   unsigned int irq;
+   bool secure_init;
+   u32  asid;  /* asid and ctx bank # are 1:1 */
+   struct iommu_group  *group;
+   struct list_head node;  /* head in qcom_iommu_device::context_list */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
+{
+   if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+   return NULL;
+   return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+   struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
+   WARN_ON(!qcom_iommu);
+   return qcom_iommu;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+   struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+   struct qcom_iommu_ctx *ctx;
+
+   if (!qcom_iommu)
+   return NULL;
+
+   list_for_each_entry(ctx, &qcom_iommu->context_list, node)
+   if (

[PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-05-04 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index 000..0d50f84
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e2 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1e2 0x4>;
+   reg = <0x1ef 0x3000>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_APSS_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   iommu-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   iommu-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x1>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_GFX_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   iommu-ctx@1f09000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   iommu-ctx@1f0a000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+   ...
+   iommus = <&apps_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+   ...
+   iommus = <&apps_iommu 4>;
+   };
+
+   gpu@01c0 {
+   ...
+   iommus = <&gpu_iommu 1>, <&gpu_iommu 2>;
+   };
-- 
2.9.3



[PATCH 2/4] iommu: arm-smmu: split out register defines

2017-05-04 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 227 ++
 drivers/iommu/arm-smmu.c  | 203 +
 2 files changed, 228 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index 000..87589c8
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,227 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE(1 << 3)
+#define sCR0_GCFGFRE   (1 << 4)
+#define sCR0_GCFGFIE   (1 << 5)
+#define sCR0_USFCFG(1 << 10)
+#define sCR0_VMIDPNE   (1 << 11)
+#define sCR0_PTM   (1 << 12)
+#define sCR0_FB(1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS   (1 << 30)
+#define ID0_S2TS   (1 << 29)
+#define ID0_NTS(1 << 28)
+#define ID0_SMS(1 << 27)
+#define ID0_ATOSNS (1 << 26)
+#define ID0_PTFS_NO_AARCH32(1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW   (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT0
+#define ID1_NUMCB_MASK 0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K(1 << 12)
+#define ID2_PTFS_16K   (1 << 13)
+#define ID2_PTFS_64K   (1 << 14)
+#define ID2_VMID16 (1 << 15)
+
+#define ID7_MAJOR_SHIFT4
+#define ID7_MAJOR_MASK 0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE   (1 << 0)
+#define TLB_LOOP_TIMEOUT   10

[PATCH 0/4] iommu: add qcom_iommu for early "B" family devices (v3)

2017-05-04 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

At this point, all the dependencies have landed for v4.12, so please
review.  An iommu driver is the last remaining piece of the puzzle to
have the gpu working out of the box on 8x16 (dragonboard 410c, for
example).  There are only minimal changes since v2:

  * small dt doc change suggested by Rob H.
  * fix for WARN_ON() splats reported by Archit

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 227 ++
 drivers/iommu/arm-smmu.c   | 203 +
 drivers/iommu/qcom_iommu.c | 889 +
 6 files changed, 1249 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.9.3



[PATCH 4/4] iommu: qcom: initialize secure page table

2017-05-04 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
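
The three steps in the description (query size, allocate, hand over) can
be sketched in plain userspace C as below.  This is only an illustration
of the control flow and error unwinding: fake_tz_ptbl_size() and
fake_tz_ptbl_init() are hypothetical stand-ins for the secure-world
calls, malloc() stands in for the DMA allocation, and the 4096-byte size
is invented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the secure-world calls; not the real SCM API. */
static int fake_tz_ptbl_size(size_t *size)
{
	*size = 4096;
	return 0;
}

static int fake_tz_ptbl_init(void *buf, size_t size, bool fail)
{
	(void)buf;
	(void)size;
	return fail ? -1 : 0;
}

static bool ptbl_allocated;	/* the table is set up at most once */
static void *ptbl_buf;		/* buffer handed to the "secure side" */
static int live_buffers;	/* buffers currently allocated, for the sketch */

static int sec_ptbl_init(bool make_init_fail)
{
	size_t size = 0;
	void *buf;
	int ret;

	if (ptbl_allocated)
		return 0;

	/* 1. ask how much memory the secure side wants */
	ret = fake_tz_ptbl_size(&size);
	if (ret)
		return ret;
	assert(size > 0);

	/* 2. allocate it */
	buf = malloc(size);
	if (!buf)
		return -1;
	live_buffers++;

	/* 3. hand it over; on failure give the memory back */
	ret = fake_tz_ptbl_init(buf, size, make_init_fail);
	if (ret) {
		free(buf);
		live_buffers--;
		return ret;
	}

	/* on success the buffer stays allocated for the lifetime of the system */
	ptbl_buf = buf;
	ptbl_allocated = true;
	return 0;
}
```

The static flag mirrors the patch: only the first successful call does
any work, later calls return 0 immediately.
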
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 1cf7c8e..a35d4cc 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -615,6 +615,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+   return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+   if (ret) {
+   dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+   ret);
+   return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+   dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+   psize);
+   return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+   dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+   goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int qcom_iommu_ctx_probe(struct platform_device *pdev)
 {
struct qcom_iommu_ctx *ctx;
@@ -695,6 +740,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
 };
 module_platform_driver(qcom_iommu_ctx_driver);
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+   if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+   return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
struct qcom_iommu_dev *qcom_iommu;
@@ -731,6 +787,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+   ret = qcom_iommu_sec_ptbl_init(dev);
+   if (ret) {
+   dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+   return ret;
+   }
+   }
+
platform_set_drvdata(pdev, qcom_iommu);
 
/* register context bank devices, which are child nodes: */
-- 
2.9.3



Re: [PATCH] iommu: fix device remove

2017-05-05 Thread Rob Clark
On Fri, May 5, 2017 at 2:24 PM, Greg KH <gre...@linuxfoundation.org> wrote:
> On Fri, May 05, 2017 at 02:08:37PM -0400, Rob Clark wrote:
>> It looks like it *used* to make sense to free the device.  But now it is
>> embedded in 'struct iommu' (which is allocated or embedded in something
>> that the device allocated).
>>
>> Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
>>
>> Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>> ---
>>  drivers/iommu/iommu-sysfs.c | 1 -
>>  1 file changed, 1 deletion(-)
>>
>> diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
>> index c58351e..ad19cbb 100644
>> --- a/drivers/iommu/iommu-sysfs.c
>> +++ b/drivers/iommu/iommu-sysfs.c
>> @@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] = {
>>
>>  static void iommu_release_device(struct device *dev)
>>  {
>> - kfree(dev);
>>  }
>
> As per the documentation in the kernel tree, I now get to make fun of
> you for doing such a crazy and foolish thing!
>
> Come on, don't do that, a release function _HAS_ to free the memory
> involved.  If not, then it is really broken...

There are shenanigans going on.. so release isn't the counterpoint to a
_probe() which allocates some memory.  See iommu_device_sysfs_add().
So I'm not the one you get to make fun of ;-)

This is the correct thing to do.  Whether the way the extra fake device
for the iommu sysfs stuff is embedded in something allocated in the
iommu driver's probe (and free'd in *its* _release()) is bonkers or
not, I'll let someone else decide..  it was like that when I got here.
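
To illustrate the ownership problem in isolation, a small userspace
sketch, assuming the device is a member of a larger allocation.  All
names here (iommu_like, embedded_release, the local container_of) are
made up for the example and are not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct device {
	void (*release)(struct device *dev);
};

/* Hypothetical container: the device is a member, not its own allocation. */
struct iommu_like {
	int id;
	struct device dev;	/* embedded, not at offset 0 */
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

static int released;

/* Nothing to free here: the memory belongs to whoever allocated the container. */
static void embedded_release(struct device *dev)
{
	(void)dev;
	released++;
}

static struct iommu_like *iommu_like_alloc(int id)
{
	struct iommu_like *p = calloc(1, sizeof(*p));

	if (!p)
		return NULL;
	p->id = id;
	p->dev.release = embedded_release;
	return p;
}

static void iommu_like_free(struct iommu_like *p)
{
	assert(p != NULL);
	p->dev.release(&p->dev);
	free(p);	/* the container is the only pointer valid to free */
}
```

Passing &p->dev to free() in the release callback would hand the
allocator a pointer into the middle of the allocation, which is the
situation the patch is avoiding.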

BR,
-R

> greg k-h


[PATCH] iommu: fix device remove

2017-05-05 Thread Rob Clark
It looks like it *used* to make sense to free the device.  But now it is
embedded in 'struct iommu' (which is allocated or embedded in something
that the device allocated).

Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.

Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/iommu-sysfs.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
index c58351e..ad19cbb 100644
--- a/drivers/iommu/iommu-sysfs.c
+++ b/drivers/iommu/iommu-sysfs.c
@@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] = {
 
 static void iommu_release_device(struct device *dev)
 {
-   kfree(dev);
 }
 
 static struct class iommu_class = {
-- 
2.9.3



[PATCH] iommu: add qcom_iommu

2017-05-05 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix WARN_ON() splats reported by Archit
v4: fix issues pointed out by Rob H. and actually make device removal
work

 drivers/iommu/Kconfig  |   9 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 833 +
 3 files changed, 843 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 37e204f..f8f79a4 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,4 +359,13 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..025780a
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,833 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   struct list_head context_list;   /* list of qcom_iommu_context */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   unsigned int irq;
+   bool secure_init;
+   u32  asid;  /* asid and ctx bank # are 1:1 */
+   struct iommu_group  *group;
+   struct list_head node;  /* head in qcom_iommu_device::context_list */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
+{
+   if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+   return NULL;
+   return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+   struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
+   WARN_ON(!qcom_iommu);
+   return qcom_iommu;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+   struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+   struct qcom_iommu_ctx *ctx;
+
+   if (!qcom_iommu)
+   return NULL;
+
+   list_for_each_entry(ctx, &qcom_iommu->context_list, node)
+   

Re: [PATCH] iommu: add qcom_iommu

2017-05-05 Thread Rob Clark
On Fri, May 5, 2017 at 3:50 PM, Rob Herring <r...@kernel.org> wrote:
> On Fri, May 5, 2017 at 2:37 PM, Rob Clark <robdcl...@gmail.com> wrote:
>> On Fri, May 5, 2017 at 3:04 PM, Rob Herring <r...@kernel.org> wrote:
>>> On Fri, May 5, 2017 at 1:21 PM, Rob Clark <robdcl...@gmail.com> wrote:
>>>> An iommu driver for Qualcomm "B" family devices which do not completely
>>>> implement the ARM SMMU spec.  These devices have context-bank register
>>>> layout that is similar to ARM SMMU, but no global register space (or at
>>>> least not one that is accessible).
>>>>
>>>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>>>> ---
>>>> v1: original
>>>> v2: bindings cleanups and kconfig issues that kbuild robot pointed out
>>>> v3: fix issues pointed out by Rob H. and actually make device removal
>>>> work
>>>> v4: fix WARN_ON() splats reported by Archit
>>>>
>>>>  drivers/iommu/Kconfig  |   9 +
>>>>  drivers/iommu/Makefile |   1 +
>>>>  drivers/iommu/qcom_iommu.c | 833 
>>>> +
>>>>  3 files changed, 843 insertions(+)
>>>>  create mode 100644 drivers/iommu/qcom_iommu.c
>>>>
>>>> diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
>>>> index 37e204f..f8f79a4 100644
>>>> --- a/drivers/iommu/Kconfig
>>>> +++ b/drivers/iommu/Kconfig
>>>> @@ -359,4 +359,13 @@ config MTK_IOMMU_V1
>>>>
>>>>   if unsure, say N here.
>>>>
>>>> +config QCOM_IOMMU
>>>> +   bool "Qualcomm IOMMU Support"
>>>
>>> Either this needs to be tristate or...
>>>
>>> [...]
>>>> +#include 
>>>
>>> this include and the things that need it should go. Or some
>>> explanation like "once X happens, then we can enable as module" and
>>> leave it all for now.
>>
>> tbh, I'm not sure what the issue is for modules (other than
>> potentially that you'd want the iommu driver fairly early in boot if
>> you didn't have an initrd).  I just saw that the other iommu drivers
>> are all bool.  (Sorry, I don't really follow iommu-devel so not
>> familiar with the history.)  With my distro hat on, I would prefer
>> them to be modules eventually.
>
> For starters, does it even build as a module if you allow that? It
> might not work because of some run-time ordering, but that's good
> enough for this discussion.

It does in fact build as a module..  I suppose I need to figure out a
more convenient way to re-pack modules in an initrd to actually test
it and see what does or does not explode..

BR,
-R

>>
>>> See this[1] for some background.
>>
>> it mentions there are some downsides, but I can't see where those
>> downsides are listed ;-)
>>
>> I would kinda prefer to leave the MODULE_*() stuff in place unless
>> modular iommu drivers are never going to happen.
>
> Maybe a note, so the module police don't fix it.
>
> Rob


Re: [PATCH] iommu: fix device remove

2017-05-05 Thread Rob Clark
On Fri, May 5, 2017 at 3:58 PM, Greg KH <gre...@linuxfoundation.org> wrote:
> On Fri, May 05, 2017 at 02:56:00PM -0400, Rob Clark wrote:
>> On Fri, May 5, 2017 at 2:24 PM, Greg KH <gre...@linuxfoundation.org> wrote:
>> > On Fri, May 05, 2017 at 02:08:37PM -0400, Rob Clark wrote:
>> >> It looks like it *used* to make sense to free the device.  But now it is
>> >> embedded in 'struct iommu' (which is allocated or embedded in something
>> >> that the device allocated).
>> >>
>> >> Spotted when testing qcom_iommu with CONFIG_DEBUG_TEST_DRIVER_REMOVE.
>> >>
>> >> Fixes: 39ab955 ("iommu: Add sysfs bindings for struct iommu_device")
>> >> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>> >> ---
>> >>  drivers/iommu/iommu-sysfs.c | 1 -
>> >>  1 file changed, 1 deletion(-)
>> >>
>> >> diff --git a/drivers/iommu/iommu-sysfs.c b/drivers/iommu/iommu-sysfs.c
>> >> index c58351e..ad19cbb 100644
>> >> --- a/drivers/iommu/iommu-sysfs.c
>> >> +++ b/drivers/iommu/iommu-sysfs.c
>> >> @@ -34,7 +34,6 @@ static const struct attribute_group *iommu_dev_groups[] = {
>> >>
>> >>  static void iommu_release_device(struct device *dev)
>> >>  {
>> >> - kfree(dev);
>> >>  }
>> >
>> > As per the documentation in the kernel tree, I now get to make fun of
>> > you for doing such a crazy and foolish thing!
>> >
>> > Come on, don't do that, a release function _HAS_ to free the memory
>> > involved.  If not, then it is really broken...
>>
>> There are shenanigans going on.. so release isn't counterpoint to a
>> _probe() which allocates some memory.  See iommu_device_sysfs_add().
>> So I'm not the one you get to make fun of ;-)
>>
>> This is the correct thing to do.  Whether the way the iommu sysfs code
>> works (an extra fake device embedded in something allocated in the
>> iommu driver's probe, and free'd in *its* _release()) is bonkers or
>> not, I'll let someone else decide..  it was like that when I got here.
>
> If you have multiple reference counts in the same structure, your code
> is wrong.  That is the root issue here that needs to be resolved.  Yes,
> your patch papers over that, but again, it isn't right either.
>

fair enough, I should have been more precise and said that this patch
is "the correct thing to do for how the code works now".. as far as
bigger refactoring, I'll leave that to someone who understands why the
code works the way it currently does.  My patch at least makes things
less wrong.  (But removing an iommu is kind of a crazy thing to do so
it's perhaps a rather theoretical problem.)

BR,
-R
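
For readers following the release() argument above, here is a minimal userspace sketch of the pattern under discussion. It is not the iommu core's code: every name in it (mini_device, mini_iommu, released_count) is hypothetical and exists only for illustration. It assumes a single refcount that lives in the embedded object and a release callback that frees the containing allocation via container_of(), rather than freeing the embedded member itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical miniature of a refcounted "struct device". */
struct mini_device {
	int refcount;
	void (*release)(struct mini_device *dev);
};

/* Hypothetical container that embeds the device instead of pointing at it. */
struct mini_iommu {
	int id;
	struct mini_device dev;	/* embedded, so never free()d on its own */
};

int released_count;	/* lets a caller observe that release ran */

/* Frees the containing allocation, not the embedded member. */
void mini_iommu_release(struct mini_device *dev)
{
	struct mini_iommu *iommu = container_of(dev, struct mini_iommu, dev);

	released_count++;
	free(iommu);
}

struct mini_iommu *mini_iommu_create(int id)
{
	struct mini_iommu *iommu = calloc(1, sizeof(*iommu));

	if (!iommu)
		return NULL;
	iommu->id = id;
	iommu->dev.refcount = 1;
	iommu->dev.release = mini_iommu_release;
	return iommu;
}

void mini_device_get(struct mini_device *dev)
{
	dev->refcount++;
}

/* Drops one reference; the final put invokes release(). */
void mini_device_put(struct mini_device *dev)
{
	if (--dev->refcount == 0)
		dev->release(dev);
}
```

Because dev sits at a non-zero offset inside mini_iommu, calling free() on the embedded pointer would hand the allocator an address it never returned, which is roughly the situation the kfree(dev) removal addresses. Whether the real structure should carry only one refcount, as Greg suggests, is a separate design question this sketch does not settle.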


[PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-05-25 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

A couple more minor changes in 3/4, and dt bindings now have Rob H's r-b.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 227 +
 drivers/iommu/arm-smmu.c   | 203 +
 drivers/iommu/qcom_iommu.c | 942 +
 6 files changed, 1302 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.9.4



[PATCH 2/4] iommu: arm-smmu: split out register defines

2017-05-25 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 227 ++
 drivers/iommu/arm-smmu.c  | 203 +
 2 files changed, 228 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index 000..87589c8
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,227 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE        (1 << 3)
+#define sCR0_GCFGFRE           (1 << 4)
+#define sCR0_GCFGFIE           (1 << 5)
+#define sCR0_USFCFG            (1 << 10)
+#define sCR0_VMIDPNE           (1 << 11)
+#define sCR0_PTM               (1 << 12)
+#define sCR0_FB                (1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS               (1 << 30)
+#define ID0_S2TS               (1 << 29)
+#define ID0_NTS                (1 << 28)
+#define ID0_SMS                (1 << 27)
+#define ID0_ATOSNS             (1 << 26)
+#define ID0_PTFS_NO_AARCH32    (1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW               (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT        0
+#define ID1_NUMCB_MASK         0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K            (1 << 12)
+#define ID2_PTFS_16K           (1 << 13)
+#define ID2_PTFS_64K           (1 << 14)
+#define ID2_VMID16             (1 << 15)
+
+#define ID7_MAJOR_SHIFT        4
+#define ID7_MAJOR_MASK         0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE   (1 << 0)
+#define TLB_LOOP_TIMEOUT   10
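
The *_SHIFT/*_MASK pairs in the header above follow the usual shift-then-mask convention. The following is a small userspace sketch of how such a pair decodes a field from a 32-bit register value. The helper names (smmu_field, decode_id1, struct id1_info) are hypothetical and not part of the patch, and the bit constant is written with 1u rather than the header's plain 1 so that the shift by 31 stays well defined outside the kernel.

```c
#include <stdint.h>

/* Values copied from the ID1 definitions in the hunk above. */
#define ID1_PAGESIZE		(1u << 31)	/* header uses (1 << 31) */
#define ID1_NUMS2CB_SHIFT	16
#define ID1_NUMS2CB_MASK	0xff
#define ID1_NUMCB_SHIFT		0
#define ID1_NUMCB_MASK		0xff

/* Hypothetical helper: shift the field down first, then mask it. */
uint32_t smmu_field(uint32_t reg, unsigned int shift, uint32_t mask)
{
	return (reg >> shift) & mask;
}

/* Hypothetical container for the decoded ID1 fields. */
struct id1_info {
	int pagesize_bit;	/* 1 if ID1_PAGESIZE is set */
	uint32_t num_cb;	/* NUMCB field */
	uint32_t num_s2_cb;	/* NUMS2CB field */
};

struct id1_info decode_id1(uint32_t id1)
{
	struct id1_info info;

	info.pagesize_bit = (id1 & ID1_PAGESIZE) != 0;
	info.num_cb = smmu_field(id1, ID1_NUMCB_SHIFT, ID1_NUMCB_MASK);
	info.num_s2_cb = smmu_field(id1, ID1_NUMS2CB_SHIFT, ID1_NUMS2CB_MASK);
	return info;
}
```

Note that the masks are defined relative to bit 0, so they must be applied after the shift, not before. The ID registers belong to the global register space, so this decoding is relevant to arm-smmu; qcom_iommu is expected to reuse the context-bank definitions instead.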

[PATCH 3/4] iommu: add qcom_iommu

2017-05-25 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
be built as a module yet (at minimum a bunch of other iommu syms
that are needed are not exported, but there may be more to it
than that), but at least qcom_iommu is ready should it become
possible to build iommu drivers as modules.
v6: Add additional pm-runtime get/puts around paths that can hit
TLB inv, to avoid unclocked register access if device using the
iommu is not powered on.  And pre-emptively clear interrupts
before registering IRQ handler just in case the bootloader has
left us a surprise.

 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 878 +
 3 files changed, 889 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25..aa4b628 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..bfaf97c
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,878 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem *local_base;
+   u32  sec_id;
+   struct list_head context_list;   /* list of qcom_iommu_ctx */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem *base;
+   unsigned int irq;
+   bool secure_init;
+   u32  asid;  /* asid and ctx bank # are 1:1 */
+   struct iommu_group  *group;
+   struct list_head node;  /* entry in qcom_iommu_dev::context_list */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * __to_iommu

[PATCH 4/4] iommu: qcom: initialize secure page table

2017-05-25 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.
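
The flow described above (query the size, allocate, hand the buffer over, undo the allocation on failure, and do all of this only once) can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct fw_ops, fake_ptbl_size(), fake_ptbl_init() and sec_ptbl_init_once() are hypothetical stand-ins, not the qcom_scm API, and malloc() takes the place of the DMA allocation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two secure-world calls. */
struct fw_ops {
	int (*ptbl_size)(size_t *size);
	int (*ptbl_init)(void *buf, size_t size);
};

int fake_init_calls;	/* counts hand-offs, for illustration only */

int fake_ptbl_size(size_t *size)
{
	*size = 4096;	/* made-up size */
	return 0;
}

int fake_ptbl_init(void *buf, size_t size)
{
	(void)buf;
	(void)size;
	fake_init_calls++;
	return 0;
}

bool ptbl_ready;	/* plays the role of "static bool allocated" */

int sec_ptbl_init_once(const struct fw_ops *fw)
{
	size_t size = 0;
	void *buf;
	int ret;

	if (ptbl_ready)
		return 0;	/* already handed over, nothing to do */

	ret = fw->ptbl_size(&size);
	if (ret)
		return ret;

	buf = malloc(size);
	if (!buf)
		return -ENOMEM;

	ret = fw->ptbl_init(buf, size);
	if (ret) {
		free(buf);	/* hand-off failed, so we still own the buffer */
		return ret;
	}

	/* On success the buffer belongs to the other side; it is never freed. */
	ptbl_ready = true;
	return 0;
}
```

The one-shot flag matters because the hand-off is global rather than per-IOMMU: a second probe must not allocate and register another buffer.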

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index bfaf97c..3033862 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -632,6 +632,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
 	clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+	size_t psize = 0;
+	unsigned int spare = 0;
+	void *cpu_addr;
+	dma_addr_t paddr;
+	unsigned long attrs;
+	static bool allocated = false;
+	int ret;
+
+	if (allocated)
+		return 0;
+
+	ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+	if (ret) {
+		dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+			ret);
+		return ret;
+	}
+
+	dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+	attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+	cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+	if (!cpu_addr) {
+		dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+			psize);
+		return -ENOMEM;
+	}
+
+	ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+	if (ret) {
+		dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+		goto free_mem;
+	}
+
+	allocated = true;
+	return 0;
+
+free_mem:
+	dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+	return ret;
+}
+
 static int qcom_iommu_ctx_probe(struct platform_device *pdev)
 {
 	struct qcom_iommu_ctx *ctx;
@@ -718,6 +763,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
 	.remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+	struct device_node *child;
+
+	for_each_child_of_node(qcom_iommu->dev->of_node, child)
+		if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+			return true;
+
+	return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
 	struct qcom_iommu_dev *qcom_iommu;
@@ -754,6 +810,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
 		return -ENODEV;
 	}
 
+	if (qcom_iommu_has_secure_context(qcom_iommu)) {
+		ret = qcom_iommu_sec_ptbl_init(dev);
+		if (ret) {
+			dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+			return ret;
+		}
+	}
+
 	platform_set_drvdata(pdev, qcom_iommu);
 
 	pm_runtime_enable(dev);
-- 
2.9.4



[PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-05-25 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Reviewed-by: Rob Herring <r...@kernel.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index 000..0d50f84
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e2 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1e2 0x4>;
+   reg = <0x1ef 0x3000>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_APSS_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   iommu-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   iommu-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x1>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_GFX_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   iommu-ctx@1f09000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   iommu-ctx@1f0a000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+   ...
+   iommus = <&apps_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+   ...
+   iommus = <&apps_iommu 4>;
+   };
+
+   gpu@01c0 {
+   ...
+   iommus = <&gpu_iommu 1>, <&gpu_iommu 2>;
+   };
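
In the examples above, the cell that follows the iommu phandle in each iommus entry lines up with the context bank's reg offset divided by 0x1000: iommu-ctx@4000 pairs with specifier 4, iommu-ctx@5000 with 5, and reg offsets 0x1000 and 0x2000 with 1 and 2. A minimal C sketch of that mapping follows, assuming a fixed 0x1000 bank size as read off the examples. CTX_BANK_SIZE and asid_from_reg() are hypothetical names, not taken from the binding or the driver.

```c
#include <stdint.h>

/* Assumption: every context bank is 0x1000 bytes, as in the examples. */
#define CTX_BANK_SIZE 0x1000u

/*
 * Map a context bank's reg offset (relative to the iommu's ranges base)
 * to its 1-based bank number. Returns -1 for offset 0 or for an offset
 * that is not bank-aligned.
 */
int asid_from_reg(uint32_t reg_offset)
{
	if (reg_offset == 0 || reg_offset % CTX_BANK_SIZE != 0)
		return -1;
	return (int)(reg_offset / CTX_BANK_SIZE);
}
```

This keeps the specifier in client nodes derivable from the sub-node's reg property, so the binding does not need a separate property to number the banks.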
-- 
2.9.4



[PATCH 3/4] iommu: add qcom_iommu

2017-06-01 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1 so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 901 +
 3 files changed, 912 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25..aa4b628 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..3b578e6
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,901 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem *local_base;
+   u32  sec_id;
+   u8   num_ctxs;
+   struct qcom_iommu_ctx   *ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem *base;
+   bool secure_init;
+   u8   asid;  /* asid and ctx bank # are 1:1 */
+   struct iommu_group  *group;
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev *to_iommu(struct iommu_fwspec *fwspec)
+{
+	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
+		return NULL;
+	return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_ctx *to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+	struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+	if (!qcom_iommu)
+		return NULL;
+	return qcom_iommu->ctxs[asid - 1];
+}
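
As a standalone illustration of the "indexed by asid-1" table used by to_ctx() above, here is a userspace sketch with a flexible array member. The range checks on asid are an assumption added for the sketch and are not present in the patch, and all names (struct ctx, struct iommu_dev, iommu_dev_alloc, iommu_dev_add_ctx, iommu_dev_lookup) are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical miniature of a context bank. */
struct ctx {
	uint8_t asid;	/* 1-based, matches the bank number */
};

/* Hypothetical miniature of the device with its trailing table. */
struct iommu_dev {
	uint8_t num_ctxs;
	struct ctx *ctxs[];	/* flexible array, indexed by asid - 1 */
};

/* Allocate the device and its table in one zeroed allocation. */
struct iommu_dev *iommu_dev_alloc(uint8_t num_ctxs)
{
	struct iommu_dev *d;

	d = calloc(1, sizeof(*d) + num_ctxs * sizeof(d->ctxs[0]));
	if (d)
		d->num_ctxs = num_ctxs;
	return d;
}

/* Register a context in its slot; rejects an out-of-range asid. */
int iommu_dev_add_ctx(struct iommu_dev *d, struct ctx *c)
{
	if (c->asid == 0 || c->asid > d->num_ctxs)
		return -1;
	d->ctxs[c->asid - 1] = c;
	return 0;
}

/* Constant-time lookup; NULL for an unknown or out-of-range asid. */
struct ctx *iommu_dev_lookup(const struct iommu_dev *d, unsigned int asid)
{
	if (!d || asid == 0 || asid > d->num_ctxs)
		return NULL;
	return d->ctxs[asid - 1];
}
```

Compared with walking a list, the table gives a constant-time lookup, at the cost of needing the number of banks before the device structure is allocated.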
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+   writel_relaxed(val, ctx->base + reg);
+}
+
+static inline void
+iommu_writeq(struct qcom_iommu_ctx *ctx, unsigned reg, u64 val)
+{
+

[PATCH 2/4] iommu: arm-smmu: split out register defines

2017-06-01 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 227 ++
 drivers/iommu/arm-smmu.c  | 203 +
 2 files changed, 228 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index 000..87589c8
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,227 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE        (1 << 3)
+#define sCR0_GCFGFRE           (1 << 4)
+#define sCR0_GCFGFIE           (1 << 5)
+#define sCR0_USFCFG            (1 << 10)
+#define sCR0_VMIDPNE           (1 << 11)
+#define sCR0_PTM               (1 << 12)
+#define sCR0_FB                (1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS               (1 << 30)
+#define ID0_S2TS               (1 << 29)
+#define ID0_NTS                (1 << 28)
+#define ID0_SMS                (1 << 27)
+#define ID0_ATOSNS             (1 << 26)
+#define ID0_PTFS_NO_AARCH32    (1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW               (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT        0
+#define ID1_NUMCB_MASK         0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K            (1 << 12)
+#define ID2_PTFS_16K           (1 << 13)
+#define ID2_PTFS_64K           (1 << 14)
+#define ID2_VMID16             (1 << 15)
+
+#define ID7_MAJOR_SHIFT        4
+#define ID7_MAJOR_MASK         0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE   (1 << 0)
+#define TLB_LOOP_TIMEOUT   10

[PATCH 4/4] iommu: qcom: initialize secure page table

2017-06-01 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 3b578e6..753ee0e 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -636,6 +636,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
 	clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+	size_t psize = 0;
+	unsigned int spare = 0;
+	void *cpu_addr;
+	dma_addr_t paddr;
+	unsigned long attrs;
+	static bool allocated = false;
+	int ret;
+
+	if (allocated)
+		return 0;
+
+	ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+	if (ret) {
+		dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+			ret);
+		return ret;
+	}
+
+	dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+	attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+	cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+	if (!cpu_addr) {
+		dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+			psize);
+		return -ENOMEM;
+	}
+
+	ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+	if (ret) {
+		dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+		goto free_mem;
+	}
+
+	allocated = true;
+	return 0;
+
+free_mem:
+	dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+	return ret;
+}
+
 static int get_asid(const struct device_node *np)
 {
 	u32 reg;
@@ -733,6 +778,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
 	.remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+   if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+   return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
struct device_node *child;
@@ -777,6 +833,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+   ret = qcom_iommu_sec_ptbl_init(dev);
+   if (ret) {
+   dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+   return ret;
+   }
+   }
+
platform_set_drvdata(pdev, qcom_iommu);
 
pm_runtime_enable(dev);
-- 
2.9.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-06-01 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Reviewed-by: Rob Herring <r...@kernel.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index 0000000..b2641ce
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.  Index identifies the context-bank #.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e20000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1e20000 0x40000>;
+   reg = <0x1ef0000 0x3000>;
+   clocks = <&gcc GCC_SMMU_CFG_CLK>,
+<&gcc GCC_APSS_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   iommu-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   iommu-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x10000>;
+   clocks = <&gcc GCC_SMMU_CFG_CLK>,
+<&gcc GCC_GFX_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   iommu-ctx@1000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   iommu-ctx@2000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   ...
+
+   venus: video-codec@1d00000 {
+   ...
+   iommus = <&apps_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+   ...
+   iommus = <&apps_iommu 4>;
+   };
+
+   gpu@01c00000 {
+   ...
+   iommus = <&gpu_iommu 1>, <&gpu_iommu 2>;
+   };
-- 
2.9.4



[PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-06-01 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not implement
the ARM SMMU spec in a way that is compatible with the arm-smmu driver.

Since last version, a few updates to 3/4 based on Robin's suggestions.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 227 +
 drivers/iommu/arm-smmu.c   | 203 +
 drivers/iommu/qcom_iommu.c | 965 +
 6 files changed, 1325 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.9.4



Re: [PATCH 3/4] iommu/arm-smmu: Disable stalling faults for all endpoints

2017-09-13 Thread Rob Clark
On Fri, Dec 16, 2016 at 6:54 AM, Will Deacon <will.dea...@arm.com> wrote:
> Hi Rob,
>
> On Tue, Dec 06, 2016 at 06:30:21PM -0500, Rob Clark wrote:
>> On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.dea...@arm.com> wrote:
>> > Enabling stalling faults can result in hardware deadlock on poorly
>> > designed systems, particularly those with a PCI root complex upstream of
>> > the SMMU.
>> >
>> > Although it's not really Linux's job to save hardware integrators from
>> > their own misfortune, it *is* our job to stop userspace (e.g. VFIO
>> > clients) from hosing the system for everybody else, even if they might
>> > already be required to have elevated privileges.
>> >
>> > Given that the fault handling code currently executes entirely in IRQ
>> > context, there is nothing that can sensibly be done to recover from
>> > things like page faults anyway, so let's rip this code out for now and
>> > avoid the potential for deadlock.
>>
>> so, I'd like to re-introduce this feature, I *guess* as some sort of
>> opt-in quirk (ie. disabled by default unless something in DT tells you
>> otherwise??  But I'm open to suggestions.  I'm not entirely sure what
>> hw was having problems due to this feature.)
>>
>> On newer snapdragon devices we are using arm-smmu for the GPU, and
>> halting the GPU so the driver's fault handler can dump some GPU state
>> on faults is enormously helpful for debugging and tracking down where
>> in the gpu cmdstream the fault was triggered.  In addition, we will
>> eventually want the ability to update pagetables from fault handler
>> and resume the faulting transaction.
>
> I'm not against reintroducing this, but it would certainly need to be
> opt-in, as you suggest. If we want to try to use stall faults to enable
> demand paging for DMA, then that means running core mm code to resolve
> the fault and we can't do that in irq context. Consequently, we have to
> hand this off to a thread, which means the hardware must allow the SS
> bit to remain set without immediately reasserting the interrupt line.
> Furthermore, we can't handle multiple faults on a context-bank, so we'd
> need to restrict ourselves to one device (i.e. faulting stream) per
> domain (CB).
>
> I think that means we want both specific compatible strings to identify
> the SS bit behaviour, but also a way to opt-in for the stall model as a
> separate property to indicate that the SoC integration can support this
> without e.g. deadlocking.
>

How do you feel about a short-term step to keep things in irq context,
but enable stall mode?  I'm debugging an issue, where it appears that
the gpu cannot handle a non-stalled fault (triggers some fairly
bizarre follow-on problems).  So I think even if we don't add a fault
handler callback, we still want to set the CFCFG bit and
RESUME_TERMINATE in the fault handler on this hardware.

BR,
-R
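
For illustration only, the short-term step described above (stall on fault via CFCFG, then terminate from the irq handler) could be modelled in plain C roughly as follows. This is a sketch, not the arm-smmu code: `struct fake_cb`, `cb_enable_stall()` and `cb_fault_irq()` are hypothetical names, the context bank is a plain struct rather than MMIO, and the bit positions are written from memory of arm-smmu-regs.h and should be treated as assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions written from memory of arm-smmu-regs.h: assumptions. */
#define SCTLR_CFCFG       (1u << 7)   /* stall (rather than terminate) on fault */
#define FSR_SS            (1u << 30)  /* a transaction is currently stalled */
#define RESUME_RETRY      0u
#define RESUME_TERMINATE  1u

/* Hypothetical stand-in for one context bank's registers. */
struct fake_cb {
	uint32_t sctlr;
	uint32_t fsr;
	uint32_t resume;
	unsigned int resume_writes;	/* counts writes to the RESUME register */
};

/* Opt the context bank in to stall-on-fault. */
static void cb_enable_stall(struct fake_cb *cb)
{
	assert(cb);
	cb->sctlr |= SCTLR_CFCFG;
}

/*
 * Model of the irq-context handler: clear the fault status and, if a
 * transaction is stalled, terminate it immediately (no workqueue, no
 * client callback). Returns 1 if a fault was pending, 0 otherwise.
 */
static int cb_fault_irq(struct fake_cb *cb)
{
	uint32_t fsr;

	assert(cb);
	fsr = cb->fsr;
	if (!fsr)
		return 0;

	cb->fsr = 0;	/* models writing fsr back to clear it */

	if (fsr & FSR_SS) {
		cb->resume = RESUME_TERMINATE;
		cb->resume_writes++;
	}
	return 1;
}
```

A caller would invoke `cb_enable_stall()` once when setting up the context bank and `cb_fault_irq()` from the interrupt path; the stalled transaction is always terminated, never retried, since nothing in irq context can fix up the page tables.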


Re: [RFC] iommu: arm-smmu: stall support

2017-09-18 Thread Rob Clark
On Mon, Sep 18, 2017 at 7:13 AM, Jean-Philippe Brucker
<jean-philippe.bruc...@arm.com> wrote:
> Hi Rob,
>
> On 14/09/17 20:44, Rob Clark wrote:
>> Adds a new domain property for iommu clients to opt-in to stalling
>> with asynchronous resume, and for the client to determine if the
>> iommu supports this.
>>
>> Current motivation is that:
>>
>> a) On 8x96/a530, if we don't enable CFCFG (or HUPCF) then non-
>>faulting translations which are happening concurrently with
>>one that faults, fail (or return garbage), which triggers all
>>sorts of fun GPU crashes, which generally have no relation
>>to the root fault.  (The CP can be far ahead in the cmdstream
>>from the other parts of the GPU...)
>
> Would the GPU driver always enable stall for this implementation? Or only
> enable it for specific domains?

I expect for all domains.  (Currently that is just a single domain,
but I expect that to change)

> Instead of enabling it at domain level, I wonder if it couldn't be left
> entirely to the SMMU driver. I have a proposal (that I'll publish shortly)
> for adding a "can-stall" attribute to device-tree nodes, telling the SMMU
> driver that the device can withstand stalled transactions without locking
> up the system.
>
> The SMMU would then enable stall for this particular device without
> needing approval from the device driver. I'm doing this for v3, which has
> a more mature stall model, but I suppose we can do the same for v2 as well.

The GPU driver does need to know if stalling is supported/enabled by
the iommu driver (since depending on SoC, drm/msm works with one of
three different iommu drivers currently), and to be in control of
resume.. I'm a bit sceptical about trying to abstract too much at the
iommu level.

For example when the gpu gets a fault, it tends to get 1000s of
faults.  On the first fault, I want to kick things off to a wq where I
can snapshot the cmdstream and gpu state.  But subsequent faults on
the same submit I ignore.
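
A rough sketch of that "first fault per submit" decision, in userspace C11 for illustration. The names `struct submit_state`, `submit_begin()` and `submit_on_fault()` are hypothetical and not taken from drm/msm; in a real driver the "queue" step would be a `queue_work()` call and the counters exist here only to make the behaviour observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-submit bookkeeping. */
struct submit_state {
	atomic_bool fault_seen;	/* set by the first fault on this submit */
	unsigned int ignored;	/* follow-on faults that were skipped */
};

/* Reset the state when a new submit starts executing. */
static void submit_begin(struct submit_state *s)
{
	assert(s);
	atomic_init(&s->fault_seen, false);
	s->ignored = 0;
}

/*
 * Called from the fault handler. Returns true only for the first fault
 * of the current submit, meaning "queue the snapshot work now".
 * Every later fault on the same submit returns false.
 */
static bool submit_on_fault(struct submit_state *s)
{
	assert(s);
	/* atomic_exchange returns the previous value of the flag */
	if (atomic_exchange(&s->fault_seen, true)) {
		s->ignored++;
		return false;
	}
	return true;
}
```

The point of making the decision in the handler itself is that the thousands of follow-on faults never reach a workqueue at all.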

Btw, apologies that I haven't sent the corresponding drm/msm patches
yet.  I haven't had a chance to clean up yet, but you can find
something rough here:

  
https://github.com/freedreno/kernel-msm/commits/integration-linux-qcomlt-v4.13-rc3

> In any case, the firmware has to tell the OS that a device is capable of
> stalling, because it is unlikely that many platform devices will
> gracefully handle this mode.
>
>> b) I am working on a debugfs feature to dump submits/batches
>>that cause GPU hangs, and I would like to also use this for
>>faults.  But it needs to run in non-atomic context, so I
>>need to toss things off to a workqueue, and then resume
>>the iommu after it finishes.
>
> Are you relying on stalled transaction to freeze the GPU state and
> allow for introspection? I suppose the debug code would always terminate
> after recording the fault? I'm just trying to get a picture of all
> possible users of a common fault API.

yes, this is what I'm doing now.  For SVM, however, we'd retry the
transaction instead of terminating.

>> c) (and ofc at some point in the future for SVM we'd like to
>>be able to pin unpinned pages and things like that, in
>>response to faults.)
>
> For SVM there will be generic code calling into the mm code to pin pages
> and resume the SMMU. We are working on consolidating this with other
> IOMMUs at the moment and use generic code where possible. Ideally the GPU
> driver shouldn't need to get involved.
>
> That new API will be based on PASIDs/SSIDs, which doesn't exist in SMMUv2.
> I do believe that we also need to consolidate the API for devices and
> IOMMUs that support page faults but not PASIDs. We could use a common
> fault workqueue in the IOMMU core.

I've no idea qcom's plans for future hw, but pretty sure we are going
to want to implement SVM on v2 iommu, without PASIDs/SSIDs.  However
on current hw, there is really only one userspace process active on
the gpu at a time, so we don't really need PASIDs/SSIDs.

> It seems like your use-case (b) could fit in there. If the device driver
> didn't bind to a process but instead registered a fault handler, then we
> could ask it to do something with the fault. And since it's in a wq, the
> call to device driver would be synchronous and we'd pass the return status
> (retry/terminate) to the SMMU.
>
> This is probably easier to handle than a separate "resume" callback,
> especially with SMMUv3 stall and PRI, where faults are out of order and
> contain a token identifying a fault.

IIRC Will or Robin mentioned wanting a token in earlier stall
discussion.. although not being familiar with v3 I wasn't quite sure
what the use was.

At any rate, adding a token to fault handler cal

Re: [PATCH 3/4] iommu/arm-smmu: Disable stalling faults for all endpoints

2017-09-18 Thread Rob Clark
On Mon, Sep 18, 2017 at 1:33 PM, Will Deacon <will.dea...@arm.com> wrote:
> On Wed, Sep 13, 2017 at 03:31:20PM -0400, Rob Clark wrote:
>> On Fri, Dec 16, 2016 at 6:54 AM, Will Deacon <will.dea...@arm.com> wrote:
>> > Hi Rob,
>> >
>> > On Tue, Dec 06, 2016 at 06:30:21PM -0500, Rob Clark wrote:
>> >> On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.dea...@arm.com> wrote:
>> >> > Enabling stalling faults can result in hardware deadlock on poorly
>> >> > designed systems, particularly those with a PCI root complex upstream of
>> >> > the SMMU.
>> >> >
>> >> > Although it's not really Linux's job to save hardware integrators from
>> >> > their own misfortune, it *is* our job to stop userspace (e.g. VFIO
>> >> > clients) from hosing the system for everybody else, even if they might
>> >> > already be required to have elevated privileges.
>> >> >
>> >> > Given that the fault handling code currently executes entirely in IRQ
>> >> > context, there is nothing that can sensibly be done to recover from
>> >> > things like page faults anyway, so let's rip this code out for now and
>> >> > avoid the potential for deadlock.
>> >>
>> >> so, I'd like to re-introduce this feature, I *guess* as some sort of
>> >> opt-in quirk (ie. disabled by default unless something in DT tells you
>> >> otherwise??  But I'm open to suggestions.  I'm not entirely sure what
>> >> hw was having problems due to this feature.)
>> >>
>> >> On newer snapdragon devices we are using arm-smmu for the GPU, and
>> >> halting the GPU so the driver's fault handler can dump some GPU state
>> >> on faults is enormously helpful for debugging and tracking down where
>> >> in the gpu cmdstream the fault was triggered.  In addition, we will
>> >> eventually want the ability to update pagetables from fault handler
>> >> and resume the faulting transaction.
>> >
>> > I'm not against reintroducing this, but it would certainly need to be
>> > opt-in, as you suggest. If we want to try to use stall faults to enable
>> > demand paging for DMA, then that means running core mm code to resolve
>> > the fault and we can't do that in irq context. Consequently, we have to
>> > hand this off to a thread, which means the hardware must allow the SS
>> > bit to remain set without immediately reasserting the interrupt line.
>> > Furthermore, we can't handle multiple faults on a context-bank, so we'd
>> > need to restrict ourselves to one device (i.e. faulting stream) per
>> > domain (CB).
>> >
>> > I think that means we want both specific compatible strings to identify
>> > the SS bit behaviour, but also a way to opt-in for the stall model as a
>> > separate property to indicate that the SoC integration can support this
>> > without e.g. deadlocking.
>> >
>>
>> How do you feel about a short-term step to keep things in irq context,
>> but enable stall mode?  I'm debugging an issue, where it appears that
>> the gpu cannot handle a non-stalled fault (triggers some fairly
>> bizarre follow-on problems).  So I think even if we don't add a fault
>> handler callback, we still want to set the CFCFG bit and
>> RESUME_TERMINATE in the fault handler on this hardware.
>
> Hmm, colour me unconvinced. Why does enabling stalls fix this problem?
> I'd rather we bite the bullet and implement things properly on top of
> a workqueue so that you can build on the same basic infrastructure as
> the SVM work that Jean-Philippe is looking at, particularly as you also
> have a use-case for running fault code in non-atomic context.
>

So, it seems like setting either CFCFG or HUPCF (which is what
downstream android kernel apparently does) avoids this issue.

It seems like some sort of hw bug, but not sure if in the iommu or the
gpu.. you'd probably have to ask someone @qcom about the details, but
without setting either of these bits, it seems that other memory
transactions concurrent with the one triggering the fault end up returning bogus
values to the GPU.  So the CP (which is reading cmdstream somewhat far
ahead of the DRAW or COMPUTE shader that triggers a fault) ends up
reading bogus cmdstream values and triggers all sorts of spectacular
fireworks.

BR,
-R


Re: [RFC] iommu: arm-smmu: stall support

2017-09-19 Thread Rob Clark
On Tue, Sep 19, 2017 at 8:30 AM, Joerg Roedel <j...@8bytes.org> wrote:
> Hi Rob,
>
> thanks for the RFC patch. I have some comments about the interface to
> the IOMMU-API below.
>
> On Thu, Sep 14, 2017 at 03:44:33PM -0400, Rob Clark wrote:
>> +/**
>> + * iommu_domain_resume - Resume translations for a domain after a fault.
>> + *
>> + * This can be called at some point after the fault handler is called,
>> + * allowing the user of the IOMMU to (for example) handle the fault
>> + * from a task context.  It is illegal to call this if
>> + * iommu_domain_set_attr(STALL) failed.
>> + *
>> + * @domain:the domain to resume
>> + * @terminate: if true, the translation that triggered the fault should
>> + *be terminated, else it should be retried.
>> + */
>> +void iommu_domain_resume(struct iommu_domain *domain, bool terminate)
>> +{
>> + /* invalid to call if iommu_domain_set_attr(STALL) failed: */
>> + if (WARN_ON(!domain->ops->domain_resume))
>> + return;
>> + domain->ops->domain_resume(domain, terminate);
>> +}
>> +EXPORT_SYMBOL_GPL(iommu_domain_resume);
>
> So this function is being called by the device driver owning the domain,
> right?

yes, this was my plan

> I don't think that the resume call-back you added needs to be exposed
> like this. It is better to do the page-fault handling completely in the
> iommu-code, including calling the resume call-back and just let the
> device-driver provide a per-domain call-back to let it handle the fault
> and map in the required pages.

I would like to decide in the IRQ whether or not to queue work,
because when we get a gpu fault, we tend to get 1000's of gpu faults
all at once (and I really only need to handle the first one).  I
suppose that could also be achieved by having a special return value
from the fault handler to say "call me again from a wq"..

Note that in the drm driver I already have a suitable wq to queue the
work, so it really doesn't buy me anything to have the iommu driver
toss things off to a wq for me.  Might be a different situation for
other drivers (but I guess mostly other drivers are using iommu API
indirectly via dma-mapping?)

> The interface could look like this:
>
> * New function iommu_domain_enable_stalls(domain) - When
>   this function returns the domain is in stall-handling mode. A
>   iommu_domain_disable_stalls() might make sense too, not sure
>   about that.

I don't particularly see a use-case for disabling stalls, fwiw

BR,
-R

> * When stalls are enabled for a domain, report_iommu_fault()
>   queues the fault to a workqueue (so that its handler can
>   block) and in the workqueue you call ->resume() based on the
>   return value of the handler.
>
> As a side-note, as there has been discussion on this: For now it doesn't
> make sense to merge this with the SVM page-fault handling efforts, as
> this path is different enough (SVM will call handle_mm_fault() as the
> handler, for example).
>
>
> Regards,
>
> Joerg
>


[RFC] iommu: arm-smmu: stall support

2017-09-14 Thread Rob Clark
Adds a new domain property for iommu clients to opt-in to stalling
with asynchronous resume, and for the client to determine if the
iommu supports this.

Current motivation is that:

a) On 8x96/a530, if we don't enable CFCFG (or HUPCF) then non-
   faulting translations which are happening concurrently with
   one that faults, fail (or return garbage), which triggers all
   sorts of fun GPU crashes, which generally have no relation
   to the root fault.  (The CP can be far ahead in the cmdstream
   from the other parts of the GPU...)

b) I am working on a debugfs feature to dump submits/batches
   that cause GPU hangs, and I would like to also use this for
   faults.  But it needs to run in non-atomic context, so I
   need to toss things off to a workqueue, and then resume
   the iommu after it finishes.

c) (and ofc at some point in the future for SVM we'd like to
   be able to pin unpinned pages and things like that, in
   response to faults.)

TODO
 - For RFC I thought it would be easier to review the idea
   as a single patch, but it should be split into separate
   core and arm-smmu parts

 - I vaguely remember someone (Will?) mentioning that there
   could be cases with multiple masters sharing a single
   context bank, and somehow stalling might not work in that
   case?  (How does that even happen, arm-smmu assigns the
   context banks?  Maybe I'm mis-remembering the details.)
   I think that this probably shouldn't affect the API parts
   of this RFC, the iommu driver should already know about
   all the devices that might attach because of ->attach_dev()
   so it could fail in _set_attr()?

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu.c | 36 
 drivers/iommu/iommu.c| 21 +
 include/linux/iommu.h| 14 ++
 3 files changed, 67 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index fe8e7fd61282..50131985a1e7 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -239,6 +239,7 @@ struct arm_smmu_domain {
struct io_pgtable_ops   *pgtbl_ops;
struct arm_smmu_cfg cfg;
enum arm_smmu_domain_stage  stage;
+   bool stall;
struct mutex init_mutex; /* Protects smmu pointer */
spinlock_t  cb_lock; /* Serialises ATS1* ops */
struct iommu_domain domain;
@@ -544,6 +545,24 @@ static const struct iommu_gather_ops arm_smmu_s2_tlb_ops_v1 = {
.tlb_sync   = arm_smmu_tlb_sync_vmid,
 };
 
+static void arm_smmu_domain_resume(struct iommu_domain *domain, bool terminate)
+{
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
+   struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   void __iomem *cb_base;
+   unsigned val;
+
+   cb_base = ARM_SMMU_CB(smmu, cfg->cbndx);
+
+   if (terminate)
+   val = RESUME_TERMINATE;
+   else
+   val = RESUME_RETRY;
+
+   writel_relaxed(val, cb_base + ARM_SMMU_CB_RESUME);
+}
+
 static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 {
u32 fsr, fsynr;
@@ -563,11 +582,14 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
 
-   dev_err_ratelimited(smmu->dev,
-   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
-   fsr, iova, fsynr, cfg->cbndx);
-
writel(fsr, cb_base + ARM_SMMU_CB_FSR);
+
+   if (!report_iommu_fault(domain, smmu->dev, iova, 0)) {
+   dev_err_ratelimited(smmu->dev,
+   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+   fsr, iova, fsynr, cfg->cbndx);
+   }
+
return IRQ_HANDLED;
 }
 
@@ -698,6 +720,8 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 
/* SCTLR */
reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE | SCTLR_M;
+   if (smmu_domain->stall)
+   reg |= SCTLR_CFCFG;/* stall on fault */
if (stage1)
reg |= SCTLR_S1_ASIDPNE;
 #ifdef __BIG_ENDIAN
@@ -1524,6 +1548,9 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain,
smmu_domain->stage = ARM_SMMU_DOMAIN_S1;
 
break;
+   case DOMAIN_ATTR_STALL:
+   smmu_domain->stall = *(bool *)data;
+   break;
default:
ret = -ENODEV;
}
@@ -1587,6 +1614,7 @@ static struct iommu_ops arm_smmu_ops = {
.device_group   = arm_smmu_device_group,
.domain_get_attr= arm_smmu_domain_get_at

Re: [RFC] iommu: arm-smmu: stall support

2017-09-22 Thread Rob Clark
On Fri, Sep 22, 2017 at 6:02 AM, Jean-Philippe Brucker
<jean-philippe.bruc...@arm.com> wrote:
> On 22/09/17 10:02, Joerg Roedel wrote:
>> On Tue, Sep 19, 2017 at 10:23:43AM -0400, Rob Clark wrote:
>>> I would like to decide in the IRQ whether or not to queue work,
>>> because when we get a gpu fault, we tend to get 1000's of gpu faults
>>> all at once (and I really only need to handle the first one).  I
>>> suppose that could also be achieved by having a special return value
>>> from the fault handler to say "call me again from a wq"..
>>>
>>> Note that in the drm driver I already have a suitable wq to queue the
>>> work, so it really doesn't buy me anything to have the iommu driver
>>> toss things off to a wq for me.  Might be a different situation for
>>> other drivers (but I guess mostly other drivers are using iommu API
>>> indirectly via dma-mapping?)
>>
>> Okay, so since you are the only user for now, we don't need a
>> work-queue. But I still want the ->resume call-back to be hidden in the
>> iommu code and not be exposed to users.
>>
>> We already have per-domain fault-handlers, so the best solution for now
>> is to call ->resume from report_iommu_fault() when the fault-handler
>> returns a special value.
>
> The problem is that report_iommu_fault is called from IRQ context by the
> SMMU driver, so the device driver callback cannot sleep.
>
> So if the device driver needs to be able to sleep between fault report and
> resume, as I understand Rob needs for writing debugfs, we can either:
>
> * call report_iommu_fault from higher up, in a thread or workqueue.
> * split the fault reporting as this patch proposes. The exact same
>   mechanism is needed for the vSVM work by Intel: in order to inject fault
>   into the guest, they would like to have an atomic notifier registered by
>   VFIO for passing down the Page Request, and a new function in the IOMMU
>   API to resume/complete the fault.
>

I'm in favour of splitting the reporting *somehow*.. the two
approaches that seemed sane are:

1) call fault handler from irq and having separate domain->resume()
called by the driver, potentially from a wq
2) or having two fault callbacks, first called before wq and then
based on returned value, optionally 2nd callback called from wq

The first seemed less intrusive to me, but I'm flexible.
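
As a toy model of option 1 (fault handler called from irq, resume called separately by the driver later), something along these lines. Everything here is hypothetical and single-threaded: `struct toy_domain`, `toy_report_fault()` and `toy_domain_resume()` only mimic the shape of `report_iommu_fault()` and the proposed `iommu_domain_resume()`, and `toy_driver_worker()` stands in for work run from the driver's own wq.

```c
#include <assert.h>
#include <stdbool.h>

enum toy_action { TOY_NONE, TOY_RETRY, TOY_TERMINATE };

struct toy_domain;
typedef int (*toy_fault_fn)(struct toy_domain *d, unsigned long iova,
			    void *token);

/* Hypothetical domain: one handler, at most one stalled transaction. */
struct toy_domain {
	toy_fault_fn handler;
	void *token;
	bool stalled;
	enum toy_action last_action;
};

/* Resume a stalled domain, either retrying or terminating the access. */
static void toy_domain_resume(struct toy_domain *d, bool terminate)
{
	assert(d && d->stalled);
	d->last_action = terminate ? TOY_TERMINATE : TOY_RETRY;
	d->stalled = false;
}

/* "irq" path: the transaction stays stalled until somebody resumes it. */
static int toy_report_fault(struct toy_domain *d, unsigned long iova)
{
	assert(d);
	d->stalled = true;
	if (!d->handler) {
		/* nobody will ever resume, so terminate right away */
		toy_domain_resume(d, true);
		return -1;
	}
	return d->handler(d, iova, d->token);
}

/* Hypothetical client driver state. */
struct toy_driver {
	unsigned long pending_iova;
	bool work_pending;
};

/* Runs in "irq" context: only record the fault and schedule work. */
static int toy_driver_fault(struct toy_domain *d, unsigned long iova,
			    void *token)
{
	struct toy_driver *drv = token;

	(void)d;
	drv->pending_iova = iova;
	drv->work_pending = true;	/* stands in for queue_work() */
	return 0;
}

/* Runs later, where sleeping would be allowed: snapshot, then resume. */
static void toy_driver_worker(struct toy_domain *d, struct toy_driver *drv)
{
	if (!drv->work_pending)
		return;
	/* ... dump cmdstream / gpu state here ... */
	drv->work_pending = false;
	toy_domain_resume(d, true);
}
```

With this split the iommu side never needs to know whether the driver deferred anything; it only sees a fault report followed, at some later point, by a resume.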

BR,
-R


Re: [RFC] iommu: arm-smmu: stall support

2017-09-27 Thread Rob Clark
On Wed, Sep 27, 2017 at 9:49 AM, Jean-Philippe Brucker
<jean-philippe.bruc...@arm.com> wrote:
> Hi Joerg,
>
> On 27/09/17 13:15, Joerg Roedel wrote:
>> Hi Rob, Jean,
>>
>> On Fri, Sep 22, 2017 at 02:42:44PM -0400, Rob Clark wrote:
>>> I'm in favour of splitting the reporting *somehow*.. the two
>>> approaches that seemed sane are:
>>>
>>> 1) call fault handler from irq and having separate domain->resume()
>>> called by the driver, potentially from a wq
>>> 2) or having two fault callbacks, first called before wq and then
>>> based on returned value, optionally 2nd callback called from wq
>>>
>>> The first seemed less intrusive to me, but I'm flexible.
>>
>> How about adding a flag to the fault-handler call-back that tells us
>> whether it wants to sleep or not. If it wants, we call it from a wq, if
>> not we call it directly like we do today in the
>> report_iommu_fault() function.
>>
>> In any case we call iommu_ops->resume() when set on completion of the
>> fault-handler either from the workqueue or report_iommu_fault itself.
>
> I like this approach. When the device driver registers a fault handler,
> it also tells when it would like to be called (either in atomic context,
> blocking context, or both).

What I have in mind is still a case-by-case decision.  Ie. I'd decide
if it is the first fault from a particular submit (job), in which case
I'd want to schedule the wq, or if it is one of the 999 following
faults from the same submit (in which case, skip the wq).

So a static decision when registering the fault handler doesn't work.

BR,
-R


> Then the handler itself receives a flag that says which context it's
> being called from. It returns a value telling the IOMMU how to proceed.
> Depending on this value we either resume/abort immediately, or add the
> fault to the workqueue if necessary.
>
> How about using the following return values:
>
> /**
>  * enum iommu_fault_status - Return status of fault handlers, telling the IOMMU
>  *  driver how to proceed with the fault.
>  *
>  * @IOMMU_FAULT_STATUS_NONE: Fault was not handled. Call the next handler, or
>  *  terminate.
>  * @IOMMU_FAULT_STATUS_FAILURE: General error. Drop all subsequent faults from
>  *  this device if possible. This is "Response Failure" in PCI PRI.
>  * @IOMMU_FAULT_STATUS_INVALID: Could not handle this fault, don't retry the
>  *  access. This is "Invalid Request" in PCI PRI.
>  * @IOMMU_FAULT_STATUS_HANDLED: Fault has been handled and the page tables
>  *  populated, retry the access.
>  * @IOMMU_FAULT_STATUS_IGNORE: Stop processing the fault, and do not send a
>  *  reply to the device.
>  *
>  * For unrecoverable faults, the only valid status is IOMMU_FAULT_STATUS_NONE
>  * For a recoverable fault, if no one handled the fault, treat as
>  * IOMMU_FAULT_STATUS_INVALID.
>  */
> enum iommu_fault_status {
> IOMMU_FAULT_STATUS_NONE = 0,
> IOMMU_FAULT_STATUS_FAILURE,
> IOMMU_FAULT_STATUS_INVALID,
> IOMMU_FAULT_STATUS_HANDLED,
> IOMMU_FAULT_STATUS_IGNORE,
> };
>
> This would probably cover the two use-cases of reporting faults to
> device drivers, and injecting them into the guest with VFIO, as well as
> handling PPRs internally. I'm also working on providing more details
> (pasid for instance) in the fault callback.
>
> We could also use the fault handler for invalid PRI Page Requests
> (currently specialized by amd_iommu_set_invalid_ppr_cb). It's just a
> matter of adding a registration flag to iommu_set_fault_handler.
>
> Thanks,
> Jean
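
For a stalled (recoverable) fault, the return values proposed in the quoted text could be turned into a resume decision roughly as below. The enum is copied from the quoted proposal; `enum toy_reply` and `toy_reply_for()` are hypothetical, and the mapping is one reading of the kernel-doc comment rather than anything that was posted as a patch.

```c
#include <assert.h>

/* Copied from the proposal quoted above. */
enum iommu_fault_status {
	IOMMU_FAULT_STATUS_NONE = 0,
	IOMMU_FAULT_STATUS_FAILURE,
	IOMMU_FAULT_STATUS_INVALID,
	IOMMU_FAULT_STATUS_HANDLED,
	IOMMU_FAULT_STATUS_IGNORE,
};

/* Hypothetical: what the IOMMU driver does with the stalled access. */
enum toy_reply {
	TOY_REPLY_NONE,		/* send nothing back to the device */
	TOY_REPLY_RETRY,	/* page tables were fixed up, retry */
	TOY_REPLY_TERMINATE,	/* abort the faulting access */
};

/*
 * Map a handler's status to a reply for a recoverable fault.
 * Assumption: "drop all subsequent faults" for FAILURE is not modelled
 * here, only the reply to the current fault.
 */
static enum toy_reply toy_reply_for(enum iommu_fault_status st)
{
	switch (st) {
	case IOMMU_FAULT_STATUS_HANDLED:
		return TOY_REPLY_RETRY;
	case IOMMU_FAULT_STATUS_IGNORE:
		return TOY_REPLY_NONE;
	case IOMMU_FAULT_STATUS_NONE:	/* unhandled: treated as INVALID */
	case IOMMU_FAULT_STATUS_INVALID:
	case IOMMU_FAULT_STATUS_FAILURE:
	default:
		return TOY_REPLY_TERMINATE;
	}
}
```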

