Re: [PATCH 3/3] Fix compile error in drivers/gpu/drm/msm/msm_drv.c with IOMMU disabled

2013-09-27 Thread Rob Clark
On Wed, Sep 25, 2013 at 10:49 AM, Joerg Roedel j...@8bytes.org wrote:
 The function msm_iommu_get_ctx() is needed by the MSM-GPU
 driver with and without IOMMU compiled in. Make the
 function available when no IOMMU driver is there.


For this one,

Reviewed-by: Rob Clark robdcl...@gmail.com

But I am not the right one to merge this one.  And, well, if there is
a way to make this work without msm_iommu_get_ctx(), I am interested
in some hints ;-)

Of the other two, 1/3 looks fine and I'll pull that in.  And I'll see
if I can come up with a better way for 2/3

BR,
-R

 Signed-off-by: Joerg Roedel j...@8bytes.org
 ---
  drivers/iommu/msm_iommu.h |7 +++
  1 file changed, 7 insertions(+)

 diff --git a/drivers/iommu/msm_iommu.h b/drivers/iommu/msm_iommu.h
 index 5c7c955..da53558 100644
 --- a/drivers/iommu/msm_iommu.h
 +++ b/drivers/iommu/msm_iommu.h
 @@ -108,7 +108,14 @@ struct msm_iommu_ctx_drvdata {
   * Useful for testing and drivers that do not yet fully have IOMMU stuff in
   * their platform devices.
   */
 +#ifdef CONFIG_MSM_IOMMU
  struct device *msm_iommu_get_ctx(const char *ctx_name);
 +#else
 +static inline struct device *msm_iommu_get_ctx(const char *ctx_name)
 +{
 +   return NULL;
 +}
 +#endif

  /*
   * Interrupt handler for the IOMMU context fault interrupt. Hooking the
 --
 1.7.9.5
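
For readers following along, the stub pattern used by the patch above can be sketched in plain userspace C. This is only an illustration under stated assumptions: DEMO_HAVE_IOMMU stands in for CONFIG_MSM_IOMMU, and struct device, demo_iommu_get_ctx() and demo_attach() are hypothetical names, not the kernel's.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct device. */
struct device {
	const char *name;
};

/*
 * DEMO_HAVE_IOMMU is a made-up macro playing the role of CONFIG_MSM_IOMMU.
 * When it is defined, only the prototype is visible here and the real
 * implementation must be linked in from elsewhere.
 */
#ifdef DEMO_HAVE_IOMMU
struct device *demo_iommu_get_ctx(const char *ctx_name);
#else
/* No IOMMU driver built: callers still compile and simply get NULL. */
static inline struct device *demo_iommu_get_ctx(const char *ctx_name)
{
	(void)ctx_name;
	return NULL;
}
#endif

/* A caller that tolerates a missing IOMMU context. */
static int demo_attach(const char *ctx_name)
{
	struct device *ctx = demo_iommu_get_ctx(ctx_name);

	if (!ctx)
		return -1;	/* caller falls back to running without an IOMMU */
	return 0;
}
```

The point of the inline stub is that the caller needs no #ifdef of its own; it only has to handle a NULL return.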


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH] drm/msm: Fix link error with !MSM_IOMMU

2014-01-08 Thread Rob Clark
On Wed, Jan 8, 2014 at 8:35 AM, Joerg Roedel j...@8bytes.org wrote:
 On Wed, Jan 08, 2014 at 08:23:49AM -0500, Rob Clark wrote:
 On Tue, Jan 7, 2014 at 5:53 PM, Joerg Roedel j...@8bytes.org wrote:
  On Tue, Jan 07, 2014 at 11:47:26PM +0100, Joerg Roedel wrote:
  The DRM driver for MSM depends on symbols from the MSM
  IOMMU driver. Add this dependency to the Kconfig file.
 
  Fixes this compile error:
 
Kernel: arch/arm/boot/zImage is ready
ERROR: "msm_iommu_get_ctx" [drivers/gpu/drm/msm/msm.ko] undefined!
make[2]: *** [__modpost] Error 1

 Thanks, I'll pull this into my pull req for 3.14 drm-next.. although,
 after chatting with Stephen Boyd, I think we should eventually be able
 to get rid of this (seems msm_iommu_get_ctx() was mainly for dealing
 with secure playback (?))

 It is already in my next-branch. So if you are ok with the approach for
 now there is no need for action. If you want it to go through your tree
 anyway please let me know and I will drop it from mine.

probably will be a slight merge conflict if it goes through your
branch.. although should be easy enough to resolve, basically the
result should look something like:

  depends on (ARCH_MSM && ARCH_MSM8960 && MSM_IOMMU) || (ARM && COMPILE_TEST)

BR,
-R


 Joerg




Re: [RFC/PATCH 2/7] iommu-api: Add map_range/unmap_range functions

2014-07-08 Thread Rob Clark
On Tue, Jul 8, 2014 at 5:53 PM, Olav Haugan ohau...@codeaurora.org wrote:
 Hi Hiroshi,

 On 7/3/2014 9:29 PM, Hiroshi Doyu wrote:
 Hi Olav,

 Olav Haugan ohau...@codeaurora.org writes:

 Mapping and unmapping are more often than not in the critical path.
 map_range and unmap_range allow SMMU driver implementations to optimize
 the process of mapping and unmapping buffers into the SMMU page tables.
 Instead of mapping one physical address, doing a TLB operation (expensive),
 mapping, doing a TLB operation, and so on, the driver can map
 a scatter-gather list of physically contiguous pages into one virtual
 address space and then do one TLB operation at the end.

 Additionally, the mapping operation would be faster in general since
 clients do not have to keep calling the map API over and over again for
 each physically contiguous chunk of memory that needs to be mapped to a
 virtually contiguous region.

 Signed-off-by: Olav Haugan ohau...@codeaurora.org
 ---
  drivers/iommu/iommu.c | 24 
  include/linux/iommu.h | 24 
  2 files changed, 48 insertions(+)

 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
 index efc..f2a6b80 100644
 --- a/drivers/iommu/iommu.c
 +++ b/drivers/iommu/iommu.c
 @@ -898,6 +898,30 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
  EXPORT_SYMBOL_GPL(iommu_unmap);


 +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,
 +		    struct scatterlist *sg, unsigned int len, int prot)
 +{
 +	if (unlikely(domain->ops->map_range == NULL))
 +		return -ENODEV;
 +
 +	BUG_ON(iova & (~PAGE_MASK));
 +
 +	return domain->ops->map_range(domain, iova, sg, len, prot);
 +}
 +EXPORT_SYMBOL_GPL(iommu_map_range);

 We have a similar one internally, named iommu_map_sg(), which is
 called from the DMA API.

 Great, so this new API will be useful to more people!

 +int iommu_unmap_range(struct iommu_domain *domain, unsigned int iova,
 +		      unsigned int len)
 +{
 +	if (unlikely(domain->ops->unmap_range == NULL))
 +		return -ENODEV;
 +
 +	BUG_ON(iova & (~PAGE_MASK));
 +
 +	return domain->ops->unmap_range(domain, iova, len);
 +}
 +EXPORT_SYMBOL_GPL(iommu_unmap_range);

 Can the existing iommu_unmap() do the same?

 I believe iommu_unmap() behaves a bit differently because it will keep
 on calling domain->ops->unmap() until everything is unmapped instead of
 letting the iommu implementation take care of unmapping everything in
 one call.

 I am abandoning the patch series since our driver was not accepted.
 However, if there are no objections I will resubmit this patch (PATCH
 2/7) as an independent patch to add this new map_range API.

+1 for map_range().. I've seen that, for gpu workloads at least, the
downstream map_range() API is quite beneficial.  It was worth at
least a few fps in xonotic.
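
To make the benefit concrete, here is a toy userspace model that only counts TLB invalidations. It is a sketch under stated assumptions, not the proposed kernel code: struct toy_domain, struct toy_chunk, toy_map() and toy_map_range() are hypothetical names, and no real page tables are touched.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model: counts TLB invalidations only; there are no real page tables. */
struct toy_domain {
	unsigned long tlb_flushes;
	unsigned long pages_mapped;
};

/* One physically contiguous chunk, as a scatterlist entry would describe. */
struct toy_chunk {
	unsigned long paddr;
	size_t npages;
};

/* Per-chunk mapping: every call ends with its own TLB invalidation. */
static void toy_map(struct toy_domain *d, unsigned long iova,
		    unsigned long paddr, size_t npages)
{
	(void)iova;
	(void)paddr;
	d->pages_mapped += npages;
	d->tlb_flushes++;
}

/* Range mapping: walk the whole list, then invalidate once at the end. */
static void toy_map_range(struct toy_domain *d, unsigned long iova,
			  const struct toy_chunk *chunks, size_t nchunks)
{
	size_t i;

	(void)iova;
	for (i = 0; i < nchunks; i++)
		d->pages_mapped += chunks[i].npages;
	d->tlb_flushes++;
}
```

With N chunks, the per-chunk path costs N invalidations while the range path costs one, which is the saving the commit message describes.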

And, possibly getting off the subject a bit, but I was wondering about
the possibility of going one step further and batching up mapping
and/or unmapping multiple buffers (ranges) at once.  I have a pretty
convenient sync point in drm/msm to flush out multiple mappings before
kicking gpu.

BR,
-R

 Thanks,

 Olav Haugan

 --
 The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation
 --
 To unsubscribe from this list: send the line unsubscribe linux-arm-msm in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 2/7] iommu-api: Add map_range/unmap_range functions

2014-07-09 Thread Rob Clark
On Wed, Jul 9, 2014 at 8:03 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/8/2014 4:49 PM, Rob Clark wrote:
 And, possibly getting off the subject a bit, but I was wondering about
 the possibility of going one step further and batching up mapping
 and/or unmapping multiple buffers (ranges) at once.  I have a pretty
 convenient sync point in drm/msm to flush out multiple mappings before
 kicking gpu.

 I think you should be able to do that with this API already - at least
 the mapping part since we are passing in a sg list (this could be a
 chained sglist).

What I mean by batching up is mapping and unmapping multiple sglists,
each at a different iova, with minimal cpu cache and iommu tlb flushes..

Ideally we'd let the IOMMU driver be clever and build out all 2nd
level tables before inserting into first level tables (to minimize cpu
cache flushing).. also, there is probably a reasonable chance that
we'd be mapping a new buffer into existing location, so there might be
some potential to reuse existing 2nd level tables (and save a tiny bit
of free/alloc).  I've not thought too much about how that would look
in code.. might be kinda, umm, fun..

But at an API level, we should be able to do a bunch of
map/unmap_range's with one flush.

Maybe it could look like a sequence of iommu_{map,unmap}_range()
followed by iommu_flush()?
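
A rough userspace sketch of that sequence, assuming the simplest possible implementation where map/unmap only mark the TLB dirty and the flush call does the single invalidation. All names here (struct batch_iommu, batch_map_range(), batch_unmap_range(), batch_flush()) are hypothetical; iommu_flush() itself is only a suggestion in this thread, not an existing API.

```c
#include <assert.h>
#include <stddef.h>

/* Toy IOMMU state: page count, invalidation count, pending-flush flag. */
struct batch_iommu {
	size_t mapped_pages;
	unsigned long tlb_flushes;
	int tlb_dirty;
};

/* Map without invalidating; just remember that a flush is owed. */
static int batch_map_range(struct batch_iommu *m, size_t npages)
{
	m->mapped_pages += npages;
	m->tlb_dirty = 1;
	return 0;
}

/* Unmap without invalidating; reject requests larger than what is mapped. */
static int batch_unmap_range(struct batch_iommu *m, size_t npages)
{
	if (npages > m->mapped_pages)
		return -1;
	m->mapped_pages -= npages;
	m->tlb_dirty = 1;
	return 0;
}

/* One invalidation covers every map/unmap since the previous flush. */
static void batch_flush(struct batch_iommu *m)
{
	if (!m->tlb_dirty)
		return;
	m->tlb_flushes++;
	m->tlb_dirty = 0;
}
```

A client such as a gpu driver would queue its maps and unmaps and call the flush once at its sync point, before kicking the hardware.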

BR,
-R

 Thanks,

 Olav

 --
 The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation


Re: [RFC/PATCH 2/7] iommu-api: Add map_range/unmap_range functions

2014-07-10 Thread Rob Clark
On Thu, Jul 10, 2014 at 3:10 AM, Thierry Reding
thierry.red...@gmail.com wrote:
 On Wed, Jul 09, 2014 at 08:40:21PM -0400, Rob Clark wrote:
 On Wed, Jul 9, 2014 at 8:03 PM, Olav Haugan ohau...@codeaurora.org wrote:
  On 7/8/2014 4:49 PM, Rob Clark wrote:
  And, possibly getting off the subject a bit, but I was wondering about
  the possibility of going one step further and batching up mapping
  and/or unmapping multiple buffers (ranges) at once.  I have a pretty
  convenient sync point in drm/msm to flush out multiple mappings before
  kicking gpu.
 
  I think you should be able to do that with this API already - at least
  the mapping part since we are passing in a sg list (this could be a
  chained sglist).

 What I mean by batching up is mapping and unmapping multiple sglists,
 each at a different iova, with minimal cpu cache and iommu tlb flushes..

 Ideally we'd let the IOMMU driver be clever and build out all 2nd
 level tables before inserting into first level tables (to minimize cpu
 cache flushing).. also, there is probably a reasonable chance that
 we'd be mapping a new buffer into existing location, so there might be
 some potential to reuse existing 2nd level tables (and save a tiny bit
 of free/alloc).  I've not thought too much about how that would look
 in code.. might be kinda, umm, fun..

 But at an API level, we should be able to do a bunch of
 map/unmap_range's with one flush.

 Maybe it could look like a sequence of iommu_{map,unmap}_range()
 followed by iommu_flush()?

 Doesn't that mean that the IOMMU driver would have to keep track of all
 mappings until it sees an iommu_flush()? That sounds like it could be a
 lot of work and complicated code.

Well, depends on how elaborate you want to get.  If you don't want to
be too fancy, it may just be a matter of not doing TLB flush until
iommu_flush().  If you want to get fancy and minimize cpu flushes too,
then iommu driver would have to do some more tracking to build up a
transaction internally.  I'm

[RFC] iommu: qcom-iommu-v0 IOMMU

2014-07-10 Thread Rob Clark
 platform_driver qcom_iommu_driver = {
+   .driver = {
+   .name   = "qcom-iommu-v0",
+   .of_match_table = qcom_iommu_dt_match,
+   },
+   .probe  = qcom_iommu_probe,
+   .remove = qcom_iommu_remove,
+};
+
+static int __init get_tex_class(int icp, int ocp, int mt, int nos)
+{
+   int i = 0;
+   unsigned int prrr = 0;
+   unsigned int nmrr = 0;
+   int c_icp, c_ocp, c_mt, c_nos;
+
+   RCP15_PRRR(prrr);
+   RCP15_NMRR(nmrr);
+
+   for (i = 0; i < NUM_TEX_CLASS; i++) {
+       c_nos = PRRR_NOS(prrr, i);
+       c_mt = PRRR_MT(prrr, i);
+       c_icp = NMRR_ICP(nmrr, i);
+       c_ocp = NMRR_OCP(nmrr, i);
+
+       if (icp == c_icp && ocp == c_ocp && c_mt == mt && c_nos == nos)
+           return i;
+   }
+
+   return -ENODEV;
+}
+
+static int __init qcom_iommu_init(void)
+{
+   int ret;
+
+   ret = platform_driver_register(&qcom_iommu_driver);
+   if (ret) {
+       pr_err("Failed to register IOMMU driver\n");
+       goto error;
+   }
+
+   qcom_iommu_tex_class[QCOM_IOMMU_ATTR_NONCACHED] =
+   get_tex_class(CP_NONCACHED, CP_NONCACHED, MT_NORMAL, 1);
+
+   qcom_iommu_tex_class[QCOM_IOMMU_ATTR_CACHED_WB_WA] =
+   get_tex_class(CP_WB_WA, CP_WB_WA, MT_NORMAL, 1);
+
+   qcom_iommu_tex_class[QCOM_IOMMU_ATTR_CACHED_WB_NWA] =
+   get_tex_class(CP_WB_NWA, CP_WB_NWA, MT_NORMAL, 1);
+
+   qcom_iommu_tex_class[QCOM_IOMMU_ATTR_CACHED_WT] =
+   get_tex_class(CP_WT, CP_WT, MT_NORMAL, 1);
+
+   bus_set_iommu(&platform_bus_type, &qcom_iommu_ops);
+
+   return 0;
+
+error:
+   return ret;
+}
+
+static void __exit qcom_iommu_driver_exit(void)
+{
+   platform_driver_unregister(&qcom_iommu_driver);
+}
+
+subsys_initcall(qcom_iommu_init);
+module_exit(qcom_iommu_driver_exit);
+
+MODULE_LICENSE("GPL v2");
+MODULE_AUTHOR("Stepan Moskovchenko <step...@codeaurora.org>");
+MODULE_AUTHOR("Rob Clark <robdcl...@gmail.com>");
diff --git a/drivers/iommu/qcom_iommu_v0.h b/drivers/iommu/qcom_iommu_v0.h
new file mode 100644
index 0000000..efe8535
--- /dev/null
+++ b/drivers/iommu/qcom_iommu_v0.h
@@ -0,0 +1,95 @@
+/* Copyright (c) 2010-2011, Code Aurora Forum. All rights reserved.
+ * Copyright (C) 2014 Red Hat
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+ * 02110-1301, USA.
+ */
+
+/* NOTE: originally based on msm_iommu non-DT driver for same hw
+ * but as the structure of the driver changes considerably for DT
+ * it seemed easier to not try to support old platforms with the
+ * same driver.
+ */
+
+#ifndef QCOM_IOMMU_V0_H
+#define QCOM_IOMMU_V0_H
+
+#include <linux/interrupt.h>
+#include <linux/clk.h>
+
+/* Sharability attributes of QCOM IOMMU mappings */
+#define QCOM_IOMMU_ATTR_NON_SH 0x0
+#define QCOM_IOMMU_ATTR_SH 0x4
+
+/* Cacheability attributes of QCOM IOMMU mappings */
+#define QCOM_IOMMU_ATTR_NONCACHED  0x0
+#define QCOM_IOMMU_ATTR_CACHED_WB_WA   0x1
+#define QCOM_IOMMU_ATTR_CACHED_WB_NWA  0x2
+#define QCOM_IOMMU_ATTR_CACHED_WT  0x3
+
+/* Mask for the cache policy attribute */
+#define QCOM_IOMMU_CP_MASK 0x03
+
+/* Maximum number of Machine IDs that we are allowing to be mapped to the same
+ * context bank. The number of MIDs mapped to the same CB does not affect
+ * performance, but there is a practical limit on how many distinct MIDs may
+ * be present. These mappings are typically determined at design time and are
+ * not expected to change at run time.
+ */
+#define MAX_NUM_MIDS   32
+
+/**
+ * struct qcom_iommu - a single IOMMU hardware instance
+ * @dev: IOMMU device
+ * @base: IOMMU config port base address (VA)
+ * @irq: Interrupt number
+ * @ncb: Number of context banks present on this IOMMU HW instance
+ * @ttbr_split: ttbr split
+ * @clk: The bus clock for this IOMMU hardware instance
+ * @pclk: The clock for the IOMMU bus interconnect
+ * @ctx_list: list of 'struct qcom_iommu_ctx'
+ * @dev_node: list head in qcom_iommu_devices list
+ * @dom_node: list head in domain
+ */
+struct qcom_iommu {
+   struct device *dev;
+   void __iomem *base;
+   int irq;
+   int ncb;
+   int ttbr_split;
+   struct clk *clk;
+   struct clk *pclk;
+   struct list_head ctx_list;
+   struct list_head dev_node

Re: [RFC/PATCH 2/7] iommu-api: Add map_range/unmap_range functions

2014-07-10 Thread Rob Clark
On Thu, Jul 10, 2014 at 6:43 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/9/2014 5:40 PM, Rob Clark wrote:
 On Wed, Jul 9, 2014 at 8:03 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/8/2014 4:49 PM, Rob Clark wrote:
 And, possibly getting off the subject a bit, but I was wondering about
 the possibility of going one step further and batching up mapping
 and/or unmapping multiple buffers (ranges) at once.  I have a pretty
 convenient sync point in drm/msm to flush out multiple mappings before
 kicking gpu.

 I think you should be able to do that with this API already - at least
 the mapping part since we are passing in a sg list (this could be a
 chained sglist).

 What I mean by batching up is mapping and unmapping multiple sglists,
 each at a different iova, with minimal cpu cache and iommu tlb flushes..

 Ideally we'd let the IOMMU driver be clever and build out all 2nd
 level tables before inserting into first level tables (to minimize cpu
 cache flushing).. also, there is probably a reasonable chance that
 we'd be mapping a new buffer into existing location, so there might be
 some potential to reuse existing 2nd level tables (and save a tiny bit
 of free/alloc).  I've not thought too much about how that would look
 in code.. might be kinda, umm, fun..

 But at an API level, we should be able to do a bunch of
 map/unmap_range's with one flush.

 Maybe it could look like a sequence of iommu_{map,unmap}_range()
 followed by iommu_flush()?


 So we could add another argument (options) in the range api that
 allows you to indicate whether you want to invalidate TLB or not.

sounds reasonable.. I'm pretty sure we want explicit-flush to be an
opt-in behaviour.
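
Something like the following toy sketch is what an opt-in flag could look like from the caller's side. The flag name OPT_MAP_NO_TLB_FLUSH, struct opt_mmu and both functions are made up for illustration; nothing here is an existing kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical option bit: the caller promises to flush explicitly later. */
#define OPT_MAP_NO_TLB_FLUSH (1u << 0)

struct opt_mmu {
	size_t mapped_pages;
	unsigned long tlb_flushes;
	int tlb_dirty;
};

/* Default behaviour (options == 0) still invalidates on every call. */
static void opt_map_range(struct opt_mmu *m, size_t npages,
			  unsigned int options)
{
	m->mapped_pages += npages;
	if (options & OPT_MAP_NO_TLB_FLUSH) {
		m->tlb_dirty = 1;	/* deferred: the caller owes a flush */
	} else {
		m->tlb_flushes++;	/* also covers anything still pending */
		m->tlb_dirty = 0;
	}
}

/* Explicit flush for callers that opted in to deferral. */
static void opt_flush(struct opt_mmu *m)
{
	if (m->tlb_dirty) {
		m->tlb_flushes++;
		m->tlb_dirty = 0;
	}
}
```

Existing callers that pass no options keep today's semantics, so only clients that know about the flag take on the responsibility of flushing.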

BR,
-R

 Thanks,

 Olav

 --
 The Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation


Re: [RFC] iommu: qcom-iommu-v0 IOMMU

2014-07-10 Thread Rob Clark
On Thu, Jul 10, 2014 at 5:53 PM, Stephen Boyd sb...@codeaurora.org wrote:
 On 07/10, Rob Clark wrote:
 So, in its current form, this is superficially a copy of msm_iommu
 plus DT conversion.  But the pre-DT IOMMU driver had a fairly different
 structure.. ie. pseudo root device, with IOMMU devices hanging off
 that, and context devices hanging off that.  The context devices were
 what the client device would attach (which was also somewhat
 non-standard.. see msm_iommu_get_ctx()).

 I couldn't really think of some sane way to refactor this and add DT
 while at the same time keeping compatibility with the old pre-DT msm
 stuff.  So I copied to a new driver.

 It was pointed out that nothing upstream actually *used* the msm_iommu
 driver.  So if no one objects to dropping pre-DT support, then I could
 do some patch rejuggling + sed to make this replace the old driver
 instead.

 +1

 +
 +// TODO any good reason for global lock vs per-iommu lock?
 +DEFINE_SPINLOCK(qcom_iommu_lock);

 static?

oh, yes.. ofc

I do wonder if we might want to make locking a bit more fine grained
to reduce contention (but, otoh, the gpu driver isn't going to contend
with itself, and other drivers probably aren't taxing the iommu quite
so hard).  But I guess it would be ok to leave that as a future
optimization.


 +static LIST_HEAD(qcom_iommu_devices);
 +
 +/* Note that a single iommu_domain, for devices sitting behind
 + * more than one IOMMU (ie. one per AXI interface), will have more
 + * than one iommu in the iommu_list.  But all are programmed to
 + * point at the same pagetables so from client device perspective
 + * they act as a single IOMMU.
 + */
 +struct qcom_domain_priv {
 + unsigned long *pgtable;
 + struct list_head iommu_list;  /* list of attached 'struct qcom_iommu' */
 +};
 +
 +static int __enable_clocks(struct qcom_iommu *iommu)
 +{
 + int ret;
 +
 + ret = clk_prepare_enable(iommu->pclk);

 Looks like __enable_clocks() is called from within a spinlock
 sometimes. Please move the prepare call outside of any atomic
 sections and also enable CONFIG_PROVE_LOCKING and/or
 CONFIG_DEBUG_ATOMIC_SLEEP to find such problems.

good point.  I haven't gotten around to running this against a debug
kernel.  It is only at the 'just barely works now' stage, I just
wanted to send for early feedback on the approach.  (Mainly to make
sure no one screams bloody murder about dropping pre-DT support from
old driver.)
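
For reference, the usual fix is to split the two halves: in the kernel clk API, clk_prepare() may sleep while clk_enable() can be called from atomic context. The userspace model below only illustrates that ordering; the sim_* names and the sim_in_atomic flag are invented for this sketch and are not the driver's code.

```c
#include <assert.h>

/* Toy clock: tracks the two stages separately, as the clk API does. */
struct sim_clk {
	int prepared;
	int enabled;
};

/* Nonzero while the pretend spinlock is held. */
static int sim_in_atomic;

static void sim_lock(void)   { sim_in_atomic = 1; }
static void sim_unlock(void) { sim_in_atomic = 0; }

/* Models a call that may sleep: refuses to run in atomic context. */
static int sim_clk_prepare(struct sim_clk *c)
{
	if (sim_in_atomic)
		return -1;
	c->prepared = 1;
	return 0;
}

/* Models an atomic-safe call: only requires a prior prepare. */
static int sim_clk_enable(struct sim_clk *c)
{
	if (!c->prepared)
		return -1;
	c->enabled = 1;
	return 0;
}

/* Probe-time path: sleeping is fine here, so prepare once up front. */
static int sim_probe(struct sim_clk *c)
{
	return sim_clk_prepare(c);
}

/* Hot path under the lock: only the atomic-safe enable is used. */
static int sim_hw_access(struct sim_clk *c)
{
	int ret;

	sim_lock();
	ret = sim_clk_enable(c);
	sim_unlock();
	return ret;
}
```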

BR,
-R

 + if (ret)
 + goto fail;
 +
 + if (iommu->clk) {
 + ret = clk_prepare_enable(iommu->clk);
 + if (ret)
 + clk_disable_unprepare(iommu->pclk);
 + }
 +fail:
 + return ret;
 +}
 +

 --
 Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
 hosted by The Linux Foundation


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-11 Thread Rob Clark
On Fri, Jul 4, 2014 at 11:29 AM, Thierry Reding
thierry.red...@gmail.com wrote:
 From: Thierry Reding tred...@nvidia.com

 This commit introduces a generic device tree binding for IOMMU devices.
 Only a very minimal subset is described here, but it is enough to cover
 the requirements of both the Exynos System MMU and Tegra SMMU as
 discussed here:

 https://lkml.org/lkml/2014/4/27/346

 Signed-off-by: Thierry Reding tred...@nvidia.com
 ---
 Changes in v4:
 - clarify that disabling an IOMMU DT node may not disable translation
 - be more explicit that examples are only examples
 - add multi-ID master example

 Changes in v3:
 - use #iommu-cells instead of #address-cells/#size-cells
 - drop optional iommu-names property

 Changes in v2:
 - add notes about dma-ranges property (drop note from commit message)
 - document priorities of iommus property vs. dma-ranges property
 - drop #iommu-cells in favour of #address-cells and #size-cells
 - remove multiple-master device example

  Documentation/devicetree/bindings/iommu/iommu.txt | 172 ++
  1 file changed, 172 insertions(+)
  create mode 100644 Documentation/devicetree/bindings/iommu/iommu.txt

 diff --git a/Documentation/devicetree/bindings/iommu/iommu.txt b/Documentation/devicetree/bindings/iommu/iommu.txt
 new file mode 100644
 index 000000000000..464a81eaaf61
 --- /dev/null
 +++ b/Documentation/devicetree/bindings/iommu/iommu.txt
 @@ -0,0 +1,172 @@
 +This document describes the generic device tree binding for IOMMUs and their
 +master(s).
 +
 +
 +IOMMU device node:
 +==
 +
 +An IOMMU can provide the following services:
 +
 +* Remap address space to allow devices to access physical memory ranges that
 +  they otherwise wouldn't be capable of accessing.
 +
 +  Example: 32-bit DMA to 64-bit physical addresses
 +
 +* Implement scatter-gather at page level granularity so that the device does
 +  not have to.
 +
 +* Provide system protection against rogue DMA by forcing all accesses to go
 +  through the IOMMU and faulting when encountering accesses to unmapped
 +  address regions.
 +
 +* Provide address space isolation between multiple contexts.
 +
 +  Example: Virtualization
 +
 +Device nodes compatible with this binding represent hardware with some of the
 +above capabilities.
 +
 +IOMMUs can be single-master or multiple-master. Single-master IOMMU devices
 +typically have a fixed association to the master device, whereas multiple-
 +master IOMMU devices can translate accesses from more than one master.
 +
 +The device tree node of the IOMMU device's parent bus must contain a valid
 +dma-ranges property that describes how the physical address space of the
 +IOMMU maps to memory. An empty dma-ranges property means that there is a
 +1:1 mapping from IOMMU to memory.
 +
 +Required properties:
 +
 +- #iommu-cells: The number of cells in an IOMMU specifier needed to encode an
 +  address.
 +
 +The meaning of the IOMMU specifier is defined by the device tree binding of
 +the specific IOMMU. Below are a few examples of typical use-cases:
 +
 +- #iommu-cells = 0: Single master IOMMU devices are not configurable and
 +  therefore no additional information needs to be encoded in the specifier.
 +  This may also apply to multiple master IOMMU devices that do not allow the
 +  association of masters to be configured. Note that an IOMMU can by design
 +  be multi-master yet only expose a single master in a given configuration.
 +  In such cases the number of cells will usually be 1 as in the next case.
 +- #iommu-cells = 1: Multiple master IOMMU devices may need to be configured
 +  in order to enable translation for a given master. In such cases the single
 +  address cell corresponds to the master device's ID. In some cases more than
 +  one cell can be required to represent a single master ID.
 +- #iommu-cells = 4: Some IOMMU devices allow the DMA window for masters to
 +  be configured. The first cell of the address in this may contain the master
 +  device's ID for example, while the second cell could contain the start of
 +  the DMA window for the given device. The length of the DMA window is given
 +  by the third and fourth cells.
 +
 +Note that these are merely examples and real-world use-cases may use different
 +definitions to represent their individual needs. Always refer to the specific
 +IOMMU binding for the exact meaning of the cells that make up the specifier.
 +
 +
 +IOMMU master node:
 +==
 +
 +Devices that access memory through an IOMMU are called masters. A device can
 +have multiple master interfaces (to one or more IOMMU devices).
 +
 +Required properties:
 +
 +- iommus: A list of phandle and IOMMU specifier pairs that describe the IOMMU
 +  master interfaces of the device. One entry in the list describes one master
 +  interface of the device.
 +
 +When an iommus property is specified in a device tree node, the IOMMU will
 +be used for address 

Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-12 Thread Rob Clark
On Sat, Jul 12, 2014 at 5:39 AM, Will Deacon will.dea...@arm.com wrote:
 Hi Rob,

 On Fri, Jul 11, 2014 at 09:55:14PM +0100, Rob Clark wrote:
 On Fri, Jul 4, 2014 at 11:29 AM, Thierry Reding
 thierry.red...@gmail.com wrote:
  From: Thierry Reding tred...@nvidia.com
 ok, so I was working through this to try to convert my
 {qcom,msm}-iommu-v0 RFC over to using these bindings.  For background,
 I was initially using something that looked a bit more like the
 current arm-smmu bindings:

 gpu {
     #stream-id-cells = <16>;
     ...
 };

 gfx3d: qcom,iommu@7c0 {
     compatible = "qcom,iommu-v0";
     ...
     mmu-masters =
         /* gfx3d_user: */
         <&gpu 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15>,
         /* gfx3d_priv: */
         <&gpu 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31>;
 };

 gfx3d1: qcom,iommu@7d0 {
     compatible = "qcom,iommu-v0";
     ...
     mmu-masters =
         /* gfx3d_user: */
         <&gpu 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15>,
         /* gfx3d_priv: */
         <&gpu 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31>;
 };

 With my current arrangement, I have everything I need when the iommu
 device is probed to set up each of the context banks

 This proposal inverts that relationship.  Which forces me to do a lot
 more (including DT parsing) on device attach.  Which I'm not a huge
 fan of.  Ie. if I even wanted to try to implement per-process
 pagetables for gpu without completely going behind the IOMMU API's
 back, I would want attach/detach to be as lightweight as possible.

 I think we'd have to walk the entire device-tree at IOMMU probe time in
 order to enumerate the masters, which sounds horrible... If we want to do
 clever SMR allocation on the ARM SMMU (as I've been discussing with Olav),
 we're going to need knowledge about *all* the valid Stream IDs in the system
 before we can program *any* translations.

I guess walking the whole tree would work.. but it doesn't really
sound like a nice solution ;-)

also, I'm not really sure to what extent it is bad form to assume the
tree is static.

 Was there actually a good reason for having the device link to the
 iommu rather than the other way around?  How much would people hate it
 if I just ignore the generic bindings and use something that works for
 me instead.  I mean, it isn't exactly like there is going to be .dts
 re-use across different SoC's..  and at least with current IOMMU API
 some sort of of_get_named_iommu() API doesn't really make sense.

 The thing is, if you end up ignoring the generic binding then we have two
 IOMMUs using the same (ARM SMMU) binding and it begs the question as to
 which is the more generic! I know we're keen to get this merged, but merging
 something that people won't use and calling it generic doesn't seem ideal
 either. We do, however, desperately need a generic binding.

yeah, ignoring the generic binding is not my first choice.  I'd rather
have something that works well for everyone.  But I wasn't really sure
if the current proposal was arbitrary, or if there are some
conflicting requirements between different platforms.

 Turning the question around; Thierry -- what are the issues in using
 something like the ARM SMMU binding (master device IDs held in the IOMMU
 node) for the nvidia IOMMU?

+1 for doing it more like how arm-smmu is currently.. that works much
better for me :-)

BR,
-R

 Will


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-12 Thread Rob Clark
On Sat, Jul 12, 2014 at 8:22 AM, Arnd Bergmann a...@arndb.de wrote:
 On Saturday 12 July 2014, Rob Clark wrote:
  Was there actually a good reason for having the device link to the
  iommu rather than the other way around?  How much would people hate it
  if I just ignore the generic bindings and use something that works for
  me instead.  I mean, it isn't exactly like there is going to be .dts
  re-use across different SoC's..  and at least with current IOMMU API
  some sort of of_get_named_iommu() API doesn't really make sense.
 
  The thing is, if you end up ignoring the generic binding then we have two
  IOMMUs using the same (ARM SMMU) binding and it begs the question as to
  which is the more generic! I know we're keen to get this merged, but
  merging something that people won't use and calling it generic doesn't
  seem ideal either. We do, however, desperately need a generic binding.

 yeah, ignoring the generic binding is not my first choice.  I'd rather
 have something that works well for everyone.  But I wasn't really sure
 if the current proposal was arbitrary, or if there are some
 conflicting requirements between different platforms.

 The common case that needs to be simple is attaching one (master) device
 to an IOMMU using the shared global context for the purposes of implementing
 the dma-mapping API.

well, I don't disagree that IOMMU API has some problems.  It is too
tied to the bus type, which doesn't really seem to make sense for
platform devices.  (Unless we start having multiple platform busses?)

But at least given the current IOMMU API I'm not really sure how it
makes a difference which way the link goes.  But if there has already
been some discussion about how you want to handle the tie in with
dma-mapping, if you could point me at that then maybe your point will
make more sense to me.

 The way that Thierry's binding does that is the obvious solution to this,
 and it mirrors what we do in practically every other subsystem. I definitely
 want the SMMU to change before anybody starts using it in a real system,
 which we fortunately do not have yet.

hmm, well if some of the things I need (like this or batching
mappings) are too weird and gpu specific, I'm willing to duplicate the
IOMMU driver in drm/msm.  It really isn't so much code, and that gives
me a lot more flexibility to do crazy things... at some point I'm
probably going to want to do context switches by banging the IOMMU
registers directly from the gpu.

But given what Will said, I don't think what I need here is too far
out of line.  But if it is really a problem for dma-mapping, I suppose
we could have links in both directions?  However, I think the link
which contains the stream-id's really needs to be in the IOMMU, not
the device using the IOMMU.

BR,
-R


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-13 Thread Rob Clark
On Sun, Jul 13, 2014 at 5:43 AM, Will Deacon will.dea...@arm.com wrote:
 On Sat, Jul 12, 2014 at 01:57:31PM +0100, Rob Clark wrote:
 On Sat, Jul 12, 2014 at 8:22 AM, Arnd Bergmann a...@arndb.de wrote:
  On Saturday 12 July 2014, Rob Clark wrote:
   Was there actually a good reason for having the device link to the
   iommu rather than the other way around?  How much would people hate it
   if I just ignore the generic bindings and use something that works for
   me instead.  I mean, it isn't exactly like there is going to be .dts
   re-use across different SoC's..  and at least with current IOMMU API
   some sort of of_get_named_iommu() API doesn't really make sense.
  
   The thing is, if you end up ignoring the generic binding then we have
   two IOMMUs using the same (ARM SMMU) binding and it begs the question
   as to which is the more generic! I know we're keen to get this merged,
   but merging something that people won't use and calling it generic
   doesn't seem ideal either. We do, however, desperately need a generic
   binding.
 
  yeah, ignoring the generic binding is not my first choice.  I'd rather
  have something that works well for everyone.  But I wasn't really sure
  if the current proposal was arbitrary, or if there are some
  conflicting requirements between different platforms.
 
  The common case that needs to be simple is attaching one (master) device
  to an IOMMU using the shared global context for the purposes of
  implementing the dma-mapping API.

 well, I don't disagree that IOMMU API has some problems.  It is too
 tied to the bus type, which doesn't really seem to make sense for
 platform devices.  (Unless we start having multiple platform busses?)

 But at least given the current IOMMU API I'm not really sure how it
 makes a difference which way the link goes.  But if there has already
 been some discussion about how you want to handle the tie in with
 dma-mapping, if you could point me at that then maybe your point will
 make more sense to me.

 If you look at the proposed binding in isolation, I think it *is* cleaner
 than the ARM SMMU binding (I've acked it...) and I believe it's more
 consistent with the way we describe linkages elsewhere.

 However, the issue you're raising is that it's more difficult to make use of
 in a Linux IOMMU driver. The reward you'll get for using it will come
 eventually when the DMA ops are automatically swizzled for devices using the
 generic binding.

 My plan for the ARM SMMU driver is:

   (1) Change ->probe() to walk the device-tree looking for all masters with
   phandles back to the SMMU instance being probed

   (2) For each master, extract the Stream IDs and add them to the internal
   SMMU driver data structures (an rbtree per SMMU instance). For
   hotpluggable buses, we'll need a way for the bus controller to
   reserve a range of IDs -- this will likely be a later extension to
   the binding.

   (3) When we get an ->add() call, warn if it's a device we haven't seen
   and reject the addition.

 That way, ->attach() should be the same as it is now, I think.

 Have you tried implementing something like that? We agreed that (1) isn't
 pretty, but I don't have a good alternative and it's only done at
 probe-time.

I haven't tried implementing that yet, but I'm sure it would work.  I
was just hoping to avoid having to do that ;-)

I suppose perhaps there is room for a shared helper here, to at least
avoid duplicating that in each IOMMU driver which needs the
stream-id's up front.
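
As a rough illustration of what such a shared helper might look like, here is a userspace C sketch of the three steps in Will's plan. Everything in it is hypothetical: `struct master_node` stands in for a device tree node carrying a phandle back to its IOMMU, `collect_masters()` plays the role of the probe-time walk, and a flat array replaces the per-instance rbtree mentioned above purely to keep the example short. None of these names are kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_IDS 64

/* Hypothetical stand-in for a DT node that links to one IOMMU. */
struct master_node {
    const char *name;
    uint32_t iommu_phandle;       /* which IOMMU this master points at */
    const uint32_t *stream_ids;   /* IDs taken from the IOMMU specifier */
    size_t num_ids;
};

/* Per-IOMMU table, filled once at probe time. */
struct iommu_instance {
    uint32_t phandle;
    uint32_t ids[MAX_IDS];
    const struct master_node *owner[MAX_IDS];
    size_t num_ids;
};

/* Steps (1) and (2): walk every node and record the stream IDs of the
 * masters that point back at this IOMMU. Returns 0, or -1 if the table
 * is full. */
static int collect_masters(struct iommu_instance *iommu,
                           const struct master_node *nodes, size_t num_nodes)
{
    assert(iommu != NULL);
    iommu->num_ids = 0;
    for (size_t i = 0; i < num_nodes; i++) {
        if (nodes[i].iommu_phandle != iommu->phandle)
            continue;
        for (size_t j = 0; j < nodes[i].num_ids; j++) {
            if (iommu->num_ids == MAX_IDS)
                return -1;
            iommu->ids[iommu->num_ids] = nodes[i].stream_ids[j];
            iommu->owner[iommu->num_ids] = &nodes[i];
            iommu->num_ids++;
        }
    }
    return 0;
}

/* Step (3): accept only masters that were seen during the probe walk. */
static int add_device(const struct iommu_instance *iommu,
                      const struct master_node *dev)
{
    for (size_t i = 0; i < iommu->num_ids; i++)
        if (iommu->owner[i] == dev)
            return 0;
    return -1;
}

/* Usage: the gpu points at IOMMU 1, the camera at IOMMU 2. */
static int example_probe(void)
{
    static const uint32_t gpu_ids[] = { 0, 1, 2 };
    static const uint32_t cam_ids[] = { 7 };
    const struct master_node nodes[] = {
        { "gpu", 1, gpu_ids, 3 },
        { "camera", 2, cam_ids, 1 },
    };
    struct iommu_instance gfx = { .phandle = 1 };

    if (collect_masters(&gfx, nodes, 2) != 0)
        return -1;
    /* The gpu is known to this IOMMU, the camera is not. */
    if (add_device(&gfx, &nodes[0]) != 0 || add_device(&gfx, &nodes[1]) == 0)
        return -1;
    return (int)gfx.num_ids;
}
```

With the table built up front, the attach path only has to look things up, which is the property the plan above is after.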

 Will

 BTW: Is the msm-v0 IOMMU compatible with the ARM SMMU driver, or is it a
 completely different design requiring a different driver?

My understanding is that it is different from msm v1 IOMMU, although I
think it shares the same pagetable format with v1.  Not sure if that
is the same as arm-smmu?   If so it might be nice to try to extract
out some shared helper fxns for map/unmap as well.

I expect Olav knows better the similarities/differences.

BR,
-R


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-14 Thread Rob Clark
On Mon, Jul 14, 2014 at 2:24 AM, Thierry Reding
thierry.red...@gmail.com wrote:
 On Sat, Jul 12, 2014 at 08:57:31AM -0400, Rob Clark wrote:
 On Sat, Jul 12, 2014 at 8:22 AM, Arnd Bergmann a...@arndb.de wrote:
 [...]
  The way that Thierry's binding does that is the obvious solution to this,
  and it mirrors what we do in practically every other subsystem. I
  definitely want the SMMU to change before anybody starts using it in a
  real system, which we fortunately do not have yet.

 hmm, well if some of the things I need (like this or batching
 mappings) are too weird and gpu specific, I'm willing to duplicate the
 IOMMU driver in drm/msm.  It really isn't so much code, and that gives
 me a lot more flexibility to do crazy things... at some point I'm
 probably going to want to do context switches by banging the IOMMU
 registers directly from the gpu.

 If the IOMMU API doesn't provide for what you need, then perhaps it's
 time to enhance it? We do that all the time in other parts of the
 kernel, why should IOMMU be special?

sure.. and my comment was also about the map/unmap batching.

Bypassing IOMMU wouldn't be my first choice.  (Especially because I'd
then get to implement it twice.)  But if some of the things I need are
too specific to one driver (or worse, problematic for other IOMMU
use-cases which I don't know about), then it is an option I'd be
willing to consider.  If nothing else, it would get me out of
allocating sglists for every buffer..  I wonder how much memory
scatterlists take up for 500M of gfx buffers?

 It seems to me like context switching for per-process address space
 isolation is one of the important features of an IOMMU. If the current
 API doesn't let you do that then we should think of ways how it can be
 improved. And if it doesn't do it fast enough, then we should equally
 find ways to speed it up.

 This is part of why I think it would be good to have explicit objects
 associated with IOMMU contexts. That would give us a good place to add
 caching for this kind of situation. Currently we're required to handle
 most of this in drivers (map from struct device to context, ...).

well, it is at least awkward that the current api conflates attaching
device and attaching context.  I think we could get some use out of an
iommu_swap() API which conceptually acts as:

  iommu_swap(olddomain, newdomain, dev)
  {
 iommu_detach_device(olddomain, dev);
 iommu_attach_device(newdomain, dev);
  }
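
To make the intended semantics a bit more concrete, the following is a self-contained userspace model of that swap. The `mock_*` names are invented for the sketch and only mimic the shape of attach/detach; a real implementation would presumably avoid the intermediate detached state, which this model does not attempt to show.

```c
#include <assert.h>
#include <stddef.h>

struct mock_domain { int attached; };                 /* devices attached */
struct mock_device { struct mock_domain *domain; };   /* current domain */

static void mock_detach(struct mock_domain *d, struct mock_device *dev)
{
    if (dev->domain == d) {
        dev->domain = NULL;
        d->attached--;
    }
}

static int mock_attach(struct mock_domain *d, struct mock_device *dev)
{
    if (dev->domain != NULL)
        return -1;              /* already attached elsewhere */
    dev->domain = d;
    d->attached++;
    return 0;
}

/* Swap is only valid if the device currently sits in 'olddomain'. */
static int mock_swap(struct mock_domain *olddomain,
                     struct mock_domain *newdomain, struct mock_device *dev)
{
    assert(olddomain != NULL && newdomain != NULL && dev != NULL);
    if (dev->domain != olddomain)
        return -1;
    mock_detach(olddomain, dev);
    return mock_attach(newdomain, dev);
}

/* Usage: move the gpu from process A's address space to process B's. */
static int example_swap(void)
{
    struct mock_domain proc_a = { 0 }, proc_b = { 0 };
    struct mock_device gpu = { NULL };

    if (mock_attach(&proc_a, &gpu) != 0)
        return -1;
    if (mock_swap(&proc_a, &proc_b, &gpu) != 0)
        return -1;
    return gpu.domain == &proc_b && proc_a.attached == 0 &&
           proc_b.attached == 1;
}
```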

BR,
-R

 Thierry


Re: [PATCH v4] devicetree: Add generic IOMMU device tree bindings

2014-07-16 Thread Rob Clark
On Tue, Jul 15, 2014 at 9:25 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/13/2014 4:43 AM, Rob Clark wrote:
 On Sun, Jul 13, 2014 at 5:43 AM, Will Deacon will.dea...@arm.com wrote:
 On Sat, Jul 12, 2014 at 01:57:31PM +0100, Rob Clark wrote:
 On Sat, Jul 12, 2014 at 8:22 AM, Arnd Bergmann a...@arndb.de wrote:
 On Saturday 12 July 2014, Rob Clark wrote:
 Was there actually a good reason for having the device link to the
 iommu rather than the other way around?  How much would people hate it
 if I just ignore the generic bindings and use something that works for
 me instead.  I mean, it isn't exactly like there is going to be .dts
 re-use across different SoC's..  and at least with current IOMMU API
 some sort of of_get_named_iommu() API doesn't really make sense.

 The thing is, if you end up ignoring the generic binding then we have two
 IOMMUs using the same (ARM SMMU) binding and it begs the question as to
 which is the more generic! I know we're keen to get this merged, but
 merging something that people won't use and calling it generic doesn't
 seem ideal either. We do, however, desperately need a generic binding.

 yeah, ignoring the generic binding is not my first choice.  I'd rather
 have something that works well for everyone.  But I wasn't really sure
 if the current proposal was arbitrary, or if there are some
 conflicting requirements between different platforms.

 The common case that needs to be simple is attaching one (master) device
 to an IOMMU using the shared global context for the purposes of
 implementing the dma-mapping API.

 well, I don't disagree that IOMMU API has some problems.  It is too
 tied to the bus type, which doesn't really seem to make sense for
 platform devices.  (Unless we start having multiple platform busses?)

 But at least given the current IOMMU API I'm not really sure how it
 makes a difference which way the link goes.  But if there has already
 been some discussion about how you want to handle the tie in with
 dma-mapping, if you could point me at that then maybe your point will
 make more sense to me.

 If you look at the proposed binding in isolation, I think it *is* cleaner
 than the ARM SMMU binding (I've acked it...) and I believe it's more
 consistent with the way we describe linkages elsewhere.

 However, the issue you're raising is that it's more difficult to make use of
 in a Linux IOMMU driver. The reward you'll get for using it will come
 eventually when the DMA ops are automatically swizzled for devices using the
 generic binding.

 My plan for the ARM SMMU driver is:

   (1) Change ->probe() to walk the device-tree looking for all masters with
   phandles back to the SMMU instance being probed

   (2) For each master, extract the Stream IDs and add them to the internal
   SMMU driver data structures (an rbtree per SMMU instance). For
   hotpluggable buses, we'll need a way for the bus controller to
   reserve a range of IDs -- this will likely be a later extension to
   the binding.

   (3) When we get an ->add() call, warn if it's a device we haven't seen
   and reject the addition.

 That way, ->attach() should be the same as it is now, I think.

 Have you tried implementing something like that? We agreed that (1) isn't
 pretty, but I don't have a good alternative and it's only done at
 probe-time.

 I haven't tried implementing that yet, but I'm sure it would work.  I
 was just hoping to avoid having to do that ;-)

 Is the reason you want to do it this way because you want to guarantee
 that all masters (and stream IDs) have been identified before the first
 attach call? I am just wondering why you cannot continue doing the
 master/streamID discovery during add_device() callback?

it was mostly because I couldn't think of a sane way to differentiate
between first and second time a device attaches (without keeping a
reference to the device).  But I guess it is ok to assume no hotplug
(since walking the device tree also seems acceptable)

BR,
-R


 BTW: Is the msm-v0 IOMMU compatible with the ARM SMMU driver, or is it a
 completely different design requiring a different driver?

 My understanding is that it is different from msm v1 IOMMU, although I
 think it shares the same pagetable format with v1.  Not sure if that
 is the same as arm-smmu?   If so it might be nice to try to extract
 out some shared helper fxns for map/unmap as well.

 I expect Olav knows better the similarities/differences.


 The msm-v0 IOMMU is not compatible with ARM SMMUv1 specification.
 However, it is a close cousin. The hardware was designed before the ARM
 SMMUv1 specification was available I believe. But it shares many of the
 same concepts as the ARM SMMUv1.

 msm-v0 IOMMU supports V7S page table format only. The ARM SMMU driver
 does not support V7S at this time. However, I believe we need to support
 this.

 Will, this reminds me. We definitely have a need to use different page
 tables in the ARM SMMU driver vs. the ARM CPU. We

Re: [PATCH v2 1/1] iommu-api: Add map_range/unmap_range functions

2014-07-22 Thread Rob Clark
On Mon, Jul 21, 2014 at 8:59 PM, Olav Haugan ohau...@codeaurora.org wrote:
 On 7/17/2014 1:21 AM, Thierry Reding wrote:
 On Wed, Jul 16, 2014 at 06:01:57PM -0700, Olav Haugan wrote:
 Mapping and unmapping are more often than not in the critical path.
 map_range and unmap_range allows SMMU driver implementations to optimize

 s/SMMU/IOMMU/

 the process of mapping and unmapping buffers into the SMMU page tables.

 s/SMMU/IOMMU/

 Instead of mapping one physical address, do TLB operation (expensive),
 mapping, do TLB operation, mapping, do TLB operation the driver can map
 a scatter-gatherlist of physically contiguous pages into one virtual
 address space and then at the end do one TLB operation.

 I find the above hard to read. Maybe:

 Instead of mapping a buffer one page at a time and requiring potentially
 expensive TLB operations for each page, this function allows the driver
 to map all pages in one go and defer TLB maintenance until after all
 pages have been mapped.

 Yeah, all above is OK with me.


 Additionally, the mapping operation would be faster in general since
 clients do not have to keep calling map API over and over again for
 each physically contiguous chunk of memory that needs to be mapped to a
 virtually contiguous region.

 Signed-off-by: Olav Haugan ohau...@codeaurora.org
 ---
  drivers/iommu/iommu.c | 48 
  include/linux/iommu.h | 25 +
  2 files changed, 73 insertions(+)

 diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
 index 1698360..a0eebb7 100644
 --- a/drivers/iommu/iommu.c
 +++ b/drivers/iommu/iommu.c
 @@ -1089,6 +1089,54 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
  EXPORT_SYMBOL_GPL(iommu_unmap);


 +int iommu_map_range(struct iommu_domain *domain, unsigned int iova,

 Maybe iova should be dma_addr_t? Or at least unsigned long? And perhaps
 iommu_map_sg() would be more consistent with the equivalent function in
 struct dma_ops?

 +struct scatterlist *sg, unsigned int len, int opt)

 The length argument seems to be the size of the mapping. Again, the
 struct dma_ops function uses this argument to denote the number of
 entries in the scatterlist.

 opt is somewhat opaque. Perhaps this should be turned into unsigned long
 flags? Although given that there aren't any users yet it's difficult to
 say what's best here. Perhaps the addition of this argument should be
 postponed until there are actual users?

 I am thinking something like this:

 int iommu_map_sg(struct iommu_domain *domain, struct scatterlist *sg,
 unsigned int nents, int prot, unsigned long flags);
 int iommu_unmap_sg(struct iommu_domain *domain, struct scatterlist *sg,
 unsigned int nents, unsigned long flags);

 The iova is contained within sg so we don't need that argument really
 and I would like to keep the flags argument. I would prefer not to
 change the API after it has been published which could potentially
 affect a lot of call sites.

ugg.. that at least forces me to construct a separate sg for mapping
the same buffer in multiple process's gpu addr space.  Not really a
fan of that.

BR,
-R

 +{
 +s32 ret = 0;

 Should be int to match the function's return type.

 +u32 offset = 0;
 +u32 start_iova = iova;

 These should match the type of iova. Also, what's the point of
 start_iova if we can simply keep iova constant and use offset where
 necessary?

 +BUG_ON(iova & (~PAGE_MASK));
 +
 +if (unlikely(domain->ops->map_range == NULL)) {
 +while (offset < len) {

 Maybe this should use for_each_sg()?

 +phys_addr_t phys = page_to_phys(sg_page(sg));
 +u32 page_len = PAGE_ALIGN(sg->offset + sg->length);

 Shouldn't this alignment be left to iommu_map() to handle? It has code
 to deal with that already.

 I don't see page alignment in the iommu_map function. I only see a check
 whether the (iova | paddr | size) is aligned to the minimum page size
 and then it errors out if it isn't


 +ret = iommu_map(domain, iova, phys, page_len, opt);

 This conflates the new opt argument with iommu_map()'s prot argument.
 Maybe those two should rather be split?

 +if (ret)
 +goto fail;
 +
 +iova += page_len;
 +offset += page_len;
 +if (offset < len)
 +sg = sg_next(sg);
 +}
 +} else {
 +ret = domain->ops->map_range(domain, iova, sg, len, opt);
 +}

 Perhaps rather than check for a ->map_range implementation every time a
 better option may be to export this generic implementation so that
 drivers can set it in their iommu_ops if they don't implement it? So the
 contents of the if () block could become a new function:

   int iommu_map_range_generic(...)
   {
   ...
   }
   EXPORT_SYMBOL(iommu_map_range_generic);


Re: [PATCH v5 1/1] iommu-api: Add map_sg/unmap_sg functions

2014-08-12 Thread Rob Clark
On Mon, Aug 11, 2014 at 9:51 PM, Hiroshi Doyu hd...@nvidia.com wrote:
 Hi Olav,

 Olav Haugan ohau...@codeaurora.org writes:

 @@ -93,6 +94,10 @@ enum iommu_attr {
   * @detach_dev: detach device from an iommu domain
   * @map: map a physically contiguous memory region to an iommu domain
   * @unmap: unmap a physically contiguous memory region from an iommu domain
 + * @map_sg: map a scatter-gather list of physically contiguous memory chunks
 + * to an iommu domain
 + * @unmap_sg: unmap a scatter-gather list of physically contiguous memory
 + * chunks from an iommu domain
   * @iova_to_phys: translate iova to physical address
   * @domain_has_cap: domain capabilities query
   * @add_device: add device to iommu grouping
 @@ -110,6 +115,11 @@ struct iommu_ops {
phys_addr_t paddr, size_t size, int prot);
 size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
  size_t size);
 +   int (*map_sg)(struct iommu_domain *domain, unsigned long iova,
 +   struct scatterlist *sg, unsigned int nents, int prot,
 +   unsigned long flags);
 +   int (*unmap_sg)(struct iommu_domain *domain, unsigned long iova,
 +   size_t size, unsigned long flags);

 Do you have any example/explanation for the above flags?

 Is this going to be used for iommu global/standard attribute or SoC
 specific one?

iirc, one plan for 'flags' was some sort of DONT_FLUSH_TLB flag for
drivers which wanted to map/unmap N buffers with a single flush at the
end.  There might have been some other usages envisioned.

BR,
-R


Re: [PATCH v5 1/1] iommu-api: Add map_sg/unmap_sg functions

2014-08-18 Thread Rob Clark
On Mon, Aug 18, 2014 at 10:07 AM, j...@8bytes.org j...@8bytes.org wrote:
 On Tue, Aug 12, 2014 at 09:56:11AM -0700, Olav Haugan wrote:
 On 8/12/2014 3:48 AM, Rob Clark wrote:
  iirc, one plan for 'flags' was some sort of DONT_FLUSH_TLB flag for
  drivers which wanted to map/unmap N buffers with a single flush at the
  end.  There might have been some other usages envisioned.

 Yes, that was the original intent of the flags for now. I am sure we can
 find other uses for this in the future.

 Do you have anything else in mind already besides the DONT_FLUSH_TLB
 flag?

 How is the IOTLB supposed to be flushed when this flag is used?


well, I was thinking one of two ways:

1) add new flush() vfunc.. this, I think, would be most convenient for
drivers using this feature
2) or driver simply doesn't set DONT_FLUSH_TLB flag on the last
{map,unmap}..  that would be slightly more awkward to use, but would
avoid adding a new vfunc
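
For comparison, both options can be modelled in a few lines of userspace C. This is only a sketch: `MOCK_DONT_FLUSH_TLB`, `mock_map()` and `mock_flush()` are invented names, and the "TLB flush" is just a counter so the two call patterns can be compared.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_DONT_FLUSH_TLB (1ul << 0)   /* hypothetical flag value */

struct mock_domain {
    unsigned int mapped;        /* buffers mapped so far */
    unsigned int tlb_flushes;   /* how often the (expensive) flush ran */
};

static void mock_flush(struct mock_domain *d)
{
    d->tlb_flushes++;
}

/* Map one buffer; flush immediately unless the caller asked us not to. */
static void mock_map(struct mock_domain *d, unsigned long flags)
{
    assert(d != NULL);
    d->mapped++;
    if (!(flags & MOCK_DONT_FLUSH_TLB))
        mock_flush(d);
}

/* Option 1: suppress every flush, then call a dedicated flush hook. */
static unsigned int map_all_explicit_flush(struct mock_domain *d,
                                           unsigned int n)
{
    for (unsigned int i = 0; i < n; i++)
        mock_map(d, MOCK_DONT_FLUSH_TLB);
    mock_flush(d);
    return d->tlb_flushes;
}

/* Option 2: no extra hook; leave the flag off for the last call only. */
static unsigned int map_all_flag_on_last(struct mock_domain *d,
                                         unsigned int n)
{
    for (unsigned int i = 0; i < n; i++)
        mock_map(d, i + 1 < n ? MOCK_DONT_FLUSH_TLB : 0ul);
    return d->tlb_flushes;
}
```

Both patterns end up with a single flush for N buffers. The second one makes the caller track which call is the last, which is the awkwardness mentioned above.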

BR,
-R

 Joerg



Re: [PATCH v6 6/8] dma-mapping: detect and configure IOMMU in of_dma_configure

2014-12-10 Thread Rob Clark
On Mon, Dec 1, 2014 at 11:57 AM, Will Deacon will.dea...@arm.com wrote:
 This patch extends of_dma_configure so that it sets up the IOMMU for a
 device, as well as the coherent/non-coherent DMA mapping ops.

 Acked-by: Arnd Bergmann a...@arndb.de
 Acked-by: Marek Szyprowski m.szyprow...@samsung.com
 Tested-by: Robin Murphy robin.mur...@arm.com
 Signed-off-by: Will Deacon will.dea...@arm.com
 ---
  arch/arm/include/asm/dma-mapping.h |  4 +++-
  drivers/of/platform.c  | 21 ++---
  include/linux/dma-mapping.h|  8 +++-
  3 files changed, 24 insertions(+), 9 deletions(-)

 diff --git a/arch/arm/include/asm/dma-mapping.h b/arch/arm/include/asm/dma-mapping.h
 index dc3420e77758..f3c0d953f6a2 100644
 --- a/arch/arm/include/asm/dma-mapping.h
 +++ b/arch/arm/include/asm/dma-mapping.h
 @@ -121,7 +121,9 @@ static inline unsigned long dma_max_pfn(struct device *dev)
  }
  #define dma_max_pfn(dev) dma_max_pfn(dev)

 -static inline void arch_setup_dma_ops(struct device *dev, bool coherent)
 +static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base,
 + u64 size, struct iommu_ops *iommu,
 + bool coherent)
  {
 if (coherent)
 set_dma_ops(dev, &arm_coherent_dma_ops);
 diff --git a/drivers/of/platform.c b/drivers/of/platform.c
 index ff1f4e9afccb..b89caf8c7586 100644
 --- a/drivers/of/platform.c
 +++ b/drivers/of/platform.c
 @@ -19,6 +19,7 @@
  #include <linux/slab.h>
  #include <linux/of_address.h>
  #include <linux/of_device.h>
 +#include <linux/of_iommu.h>
  #include <linux/of_irq.h>
  #include <linux/of_platform.h>
  #include <linux/platform_device.h>
 @@ -166,6 +167,7 @@ static void of_dma_configure(struct device *dev)
 int ret;
 bool coherent;
 unsigned long offset;
 +   struct iommu_ops *iommu;

 /*
  * Set default dma-mask to 32 bit. Drivers are expected to setup
 @@ -194,7 +196,16 @@ static void of_dma_configure(struct device *dev)
 dev_dbg(dev, "device is%sdma coherent\n",
 coherent ? " " : " not ");

 -   arch_setup_dma_ops(dev, coherent);
 +   iommu = of_iommu_configure(dev);
 +   dev_dbg(dev, "device is%sbehind an iommu\n",
 +   iommu ? " " : " not ");
 +
 +   arch_setup_dma_ops(dev, dma_addr, size, iommu, coherent);


so, what is the way for a driver that explicitly wants to manage its
own device virtual address space to opt out of this?  I suspect that
won't be the common case, but for a gpu, if dma layer all of a sudden
thinks it is in control of the gpu's virtual address space, things are
going to end in tears..

BR,
-R


 +}
 +
 +static void of_dma_deconfigure(struct device *dev)
 +{
 +   arch_teardown_dma_ops(dev);
  }

  /**
 @@ -223,16 +234,12 @@ static struct platform_device *of_platform_device_create_pdata(
 if (!dev)
 goto err_clear_flag;

 -   of_dma_configure(&dev->dev);
 dev->dev.bus = &platform_bus_type;
 dev->dev.platform_data = platform_data;
 -
 -   /* We do not fill the DMA ops for platform devices by default.
 -* This is currently the responsibility of the platform code
 -* to do such, possibly using a device notifier
 -*/
 +   of_dma_configure(&dev->dev);

 if (of_device_add(dev) != 0) {
 +   of_dma_deconfigure(&dev->dev);
 platform_device_put(dev);
 goto err_clear_flag;
 }
 diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
 index 8a1560f95d4a..c3007cb4bfa6 100644
 --- a/include/linux/dma-mapping.h
 +++ b/include/linux/dma-mapping.h
 @@ -130,7 +130,13 @@ static inline int dma_coerce_mask_and_coherent(struct device *dev, u64 mask)
  extern u64 dma_get_required_mask(struct device *dev);

  #ifndef arch_setup_dma_ops
 -static inline void arch_setup_dma_ops(struct device *dev, bool coherent) { }
 +static inline void arch_setup_dma_ops(struct device *dev, u64 dma_base,
 + u64 size, struct iommu_ops *iommu,
 + bool coherent) { }
 +#endif
 +
 +#ifndef arch_teardown_dma_ops
 +static inline void arch_teardown_dma_ops(struct device *dev) { }
  #endif

  static inline unsigned int dma_get_max_seg_size(struct device *dev)
 --
 2.1.1



Re: [PATCH 4/4] iommu/msm: Mark driver BROKEN

2015-02-04 Thread Rob Clark
On Wed, Feb 4, 2015 at 1:54 PM, Olav Haugan ohau...@codeaurora.org wrote:
 Adding Mitch H. and Rob Clark.


The current upstream msm-iommu isn't actually used by anything,
afaict.. it lacks support for DT based platforms.  So this is fine by
me.

Acked-by: Rob Clark robdcl...@gmail.com



 On 2/3/2015 11:58 PM, Thierry Reding wrote:

 From: Thierry Reding tred...@nvidia.com

 The MSM IOMMU driver unconditionally calls bus_set_iommu(), which is a
 very stupid thing to do on multi-platform kernels. While marking the
 driver BROKEN may seem a little extreme, there is no other way to make
 the driver skip initialization. One of the problems is that it doesn't
 have devicetree binding documentation and the driver doesn't contain a
 struct of_device_id table either, so no way to check that it is indeed
 valid to set up the IOMMU operations for this driver.

 This fixes a problem on Tegra20 where the DRM driver will try to use the
 obviously non-existent MSM IOMMU.

 Marking the driver BROKEN shouldn't do any harm, since there aren't any
 users currently. There is no struct of_device_id table, so the device
 can't be instantiated from device tree, and I couldn't find any code
 that would instantiate a matching platform_device either, so the driver
 is effectively unused.

 Reported-by: Nicolas Chauvet kwiz...@gmail.com
 Cc: David Brown dav...@codeaurora.org
 Cc: Daniel Walker dwal...@fifo99.com
 Cc: Bryan Huntsman bry...@codeaurora.org
 Cc: Olav Haugan ohau...@codeaurora.org
 Signed-off-by: Thierry Reding tred...@nvidia.com
 ---
   drivers/iommu/Kconfig | 1 +
   1 file changed, 1 insertion(+)

 diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
 index 325188eef1c1..0f70bc1fce65 100644
 --- a/drivers/iommu/Kconfig
 +++ b/drivers/iommu/Kconfig
 @@ -31,6 +31,7 @@ config FSL_PAMU
   config MSM_IOMMU
 bool MSM IOMMU Support
 depends on ARCH_MSM8X60 || ARCH_MSM8960
 +   depends on BROKEN
 select IOMMU_API
 help
   Support for the IOMMUs found on certain Qualcomm SOCs.



 .Olav

 --
 Qualcomm Innovation Center, Inc.
 The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
 a Linux Foundation Collaborative Project


Re: [PATCH V6 0/6] iommu/msm: Add DT adaptation and generic bindings support

2016-08-11 Thread Rob Clark
On Mon, Jun 13, 2016 at 7:36 AM, Sricharan R <sricha...@codeaurora.org> wrote:
> The msm_iommu.c driver currently works based on platform data.
> A single master device can be connected to more than one iommu and multiple
> contexts in each of the iommu. This association between master and iommus was
> represented from platform data using parent/child devices. The master drivers
> were responsible for attaching all of the iommus/context to a domain. Now the
> platform data support is removed and DT support is added. The master/iommus
> are added through generic iommu bindings.
>
> This is essentially rework of the patch posted earlier by
> Rob Clark <robdcl...@gmail.com>. This series folds the changes in to the
> existing driver with the addition of generic bindings.
>
> http://www.spinics.net/lists/linux-arm-msm/msg10077.html
>
> Tested this series on ifc6410 board.

btw, the current state, at least on linaro integration branch, fault
handling doesn't work so well (ie. device never gets resumed).. which
is a bit unfortunate for a gpu (and results in a *lot* of rebooting on
my part when debugging userspace).  I haven't had time yet to compare
to the ancient downstream driver, but not sure if you have any ideas?

I guess probably disabling stall on fault would help.  But I'm not
even getting the "Fault occurred in context.." prints.  Seeing the
fault iova is pretty useful since that plus gpu cmdstream trace helps
me figure out which texture/etc is being accessed out of bounds.

BR,
-R

> [V6] After some discussions on patch 6 [1] from previous post,
>  it was concluded that the changes for using relaxed writes
>  in all places should not be a part of this series, so should
>  be moved it. So removed that patch and added Acked/Tested tags.
>  [1] https://patchwork.kernel.org/patch/9129231/
>
> [V5] Changed the compatible binding name as per comments, added comments
>  for usage of barriers in patch 6.
>
> [V4] Addressed comments for making the iommu compatible binding more soc
>  specific and updated the documentation for the iommu clocks.
>
> [V3] Addressed comments to correct the usage
>  of the #iommu-cells binding, improve the flush_iotlb_range function,
>  added a new patch to use writel_relaxed for register access and split
>  up the documentation patch.
>
> [V2] Adapted the driver to use generic ARMV7S short descriptor pagetable ops
>  and addressed comments.
>
> [V1]
>https://lists.linuxfoundation.org/pipermail/iommu/2015-August/014074.html
>
> Sricharan R (6):
>   iommu/msm: Add DT adaptation
>   documentation: iommu: Add bindings for msm,iommu-v0 ip
>   iommu/msm: Move the contents from msm_iommu_dev.c to msm_iommu.c
>   iommu/msm: Add support for generic master bindings
>   iommu/msm: use generic ARMV7S short descriptor pagetable ops
>   iommu/msm: Remove driver BROKEN
>
>  .../devicetree/bindings/iommu/msm,iommu-v0.txt |  64 ++
>  drivers/iommu/Kconfig  |   2 +-
>  drivers/iommu/Makefile |   2 +-
>  drivers/iommu/msm_iommu.c  | 870 +++--
>  drivers/iommu/msm_iommu.h  |  73 +-
>  drivers/iommu/msm_iommu_dev.c  | 381 -
>  6 files changed, 564 insertions(+), 828 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/iommu/msm,iommu-v0.txt
>  delete mode 100644 drivers/iommu/msm_iommu_dev.c
>
> --
> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
> Code Aurora Forum, hosted by The Linux Foundation
>


Re: [PATCH V6 0/6] iommu/msm: Add DT adaptation and generic bindings support

2016-08-11 Thread Rob Clark
On Thu, Aug 11, 2016 at 4:11 PM, Rob Clark <robdcl...@gmail.com> wrote:
> On Mon, Jun 13, 2016 at 7:36 AM, Sricharan R <sricha...@codeaurora.org> wrote:
>> The msm_iommu.c driver currently works based on platform data.
>> A single master device can be connected to more than one iommu and multiple
>> contexts in each of the iommu. This association between master and iommus was
>> represented from platform data using parent/child devices. The master drivers
>> were responsible for attaching all of the iommus/context to a domain. Now the
>> platform data support is removed and DT support is added. The master/iommus
>> are added through generic iommu bindings.
>>
>> This is essentially rework of the patch posted earlier by
>> Rob Clark <robdcl...@gmail.com>. This series folds the changes in to the
>> existing driver with the addition of generic bindings.
>>
>> http://www.spinics.net/lists/linux-arm-msm/msg10077.html
>>
>> Tested this series on ifc6410 board.
>
> btw, the current state, at least on linaro integration branch, fault
> handling doesn't work so well (ie. device never gets resumed).. which
> is a bit unfortunate for a gpu (and results in a *lot* of rebooting on
> my part when debugging userspace).  I haven't had time yet to compare
> to the ancient downstream driver, but not sure if you have any ideas?
>
> I guess probably disabling stall on fault would help.  But I'm not
> even getting the "Fault occurred in context.." prints.  Seeing the
> fault iova is pretty useful since that plus gpu cmdstream trace helps
> me figure out which texture/etc is being accessed out of bounds.

fyi, it looks like it is not getting any fault irq..  it's *possible*
that I screwed up the irq #'s when translating from downstream, so you
might want to double check that.  I thought I had it right, I assume I
would have noticed during piglit runs if fault recovery wasn't working
(since the result is that *everything* after the faulting test would
have failed since gpu is wedged with no access to memory), but it was
long enough ago that I can't claim that definitively.

If you need an easy way to trigger a gpu fault, msmtest is a good way,
change this line:

  https://github.com/freedreno/msmtest/blob/master/msmtest.c#L247

from OUT_RELOC() to OUT_RING(ring, 0x) will trigger a fault.

BR,
-R



Re: [PATCH V6 0/6] iommu/msm: Add DT adaptation and generic bindings support

2016-08-12 Thread Rob Clark
On Fri, Aug 12, 2016 at 3:00 AM, Sricharan  wrote:
> Hi Rob,
>
>>> btw, the current state, at least on linaro integration branch, fault
>>> handling doesn't work so well (ie. device never gets resumed).. which
>>> is a bit unfortunate for a gpu (and results in a *lot* of rebooting on
>>> my part when debugging userspace).  I haven't had time yet to compare
>>> to the ancient downstream driver, but not sure if you have any ideas?
>>>
>>> I guess probably disabling stall on fault would help.  But I'm not
>>> even getting the "Fault occurred in context.." prints.  Seeing the
>>> fault iova is pretty useful since that plus gpu cmdstream trace helps
>>> me figure out which texture/etc is being accessed out of bounds.
>>
>>fyi, it looks like it is not getting any fault irq..  it's *possible*
>>that I screwed up the irq #'s when translating from downstream, so you
>>might want to double check that.  I thought I had it right, I assume I
>>would have noticed during piglit runs if fault recovery wasn't working
>>(since the result is that *everything* after the faulting test would
>>have failed since gpu is wedged with no access to memory), but it was
>>long enough ago that I can't claim that definitively.
>>
>>If you need an easy way to trigger a gpu fault, msmtest is a good way,
>>change this line:
>>
>>  https://github.com/freedreno/msmtest/blob/master/msmtest.c#L247
>>
>>from OUT_RELOC() to OUT_RING(ring, 0x) will trigger a fault.
>>
> So for the irq to be triggered, the 'non-secure' irq line has to be
> populated in DT. There are 'secure' and 'non-secure' irq lines for these
> iommus, and the non-secure irq number is secure + 1. I tested this by
> returning 0 from msm_iommu_map (no mapping), and the faults were getting
> triggered.
>
> Can you share your dts data with me?
>

I think this is what you want:

https://github.com/freedreno/kernel-msm/blob/integration-linux-qcomlt/arch/arm/boot/dts/qcom-apq8064.dtsi#L2008

I haven't tested a display fault, so I suppose it is possible that
irq's are working for some iommu instances but not others?

BR,
-R

>
> Regards,
>  Sricharan
>


Re: [PATCH V6 0/6] iommu/msm: Add DT adaptation and generic bindings support

2016-08-12 Thread Rob Clark
On Fri, Aug 12, 2016 at 9:03 AM, Sricharan  wrote:
> Hi Rob,
>
>>>>> btw, the current state, at least on linaro integration branch, fault
>>>>> handling doesn't work so well (ie. device never gets resumed).. which
>>>>> is a bit unfortunate for a gpu (and results in a *lot* of rebooting on
>>>>> my part when debugging userspace).  I haven't had time yet to compare
>>>>> to the ancient downstream driver, but not sure if you have any ideas?
>>>>>
>>>>> I guess probably disabling stall on fault would help.  But I'm not
>>>>> even getting the "Fault occurred in context.." prints.  Seeing the
>>>>> fault iova is pretty useful since that plus gpu cmdstream trace helps
>>>>> me figure out which texture/etc is being accessed out of bounds.
>>>>
>>>>fyi, it looks like it is not getting any fault irq..  it's *possible*
>>>>that I screwed up the irq #'s when translating from downstream, so you
>>>>might want to double check that.  I thought I had it right, I assume I
>>>>would have noticed during piglit runs if fault recovery wasn't working
>>>>(since the result is that *everything* after the faulting test would
>>>>have failed since gpu is wedged with no access to memory), but it was
>>>>long enough ago that I can't claim that definitively.
>>>>
>>>>If you need an easy way to trigger a gpu fault, msmtest is a good way,
>>>>change this line:
>>>>
>>>>  https://github.com/freedreno/msmtest/blob/master/msmtest.c#L247
>>>>
>>>>from OUT_RELOC() to OUT_RING(ring, 0x) will trigger a fault.
>>>>
>>>So for the irq to be triggered, 'non-secure' irq line has to be
>>>   populated in DT. There is a 'secure'and 'non-secure' irq lines for these 
>>> iommus
>>>   and  non-secure irq number is secure + 1. I tested this by having a 
>>> 'return 0'
>>>  from the msm_iommu_map (no mapping), and the faults were getting triggered.
>>>
>>>   Can you share me your dts data ?
>>>
>>
>>I think this is what you want:
>>
>>https://github.com/freedreno/kernel-msm/blob/integration-linux-qcomlt/arch/arm/boot/dts/qcom-apq8064.dtsi#L2008
>>
>>I haven't tested a display fault, so I suppose it is possible that
>>irq's are working for some iommu instances but not others?
>
> So in your DT, for gfx3d, the non-secure line is '70' and not '69' (that one
> is the secure line). In fact only '70' should be populated. The driver sets
> the irq line based on resource 0. This applies to all iommu nodes in your DT
> (only the second irq line is needed).

ahh, that would explain.

Is it better to remove the extra entry, or should I just swap them
all?  Ie. might there be some point in the future where the driver
would want both?
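
To make the resource-0 point concrete, here is a minimal userspace sketch (plain C, not driver code). The struct and helper names are made up for illustration; only the 69/70 values and the "non-secure = secure + 1" rule come from this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the interrupt entries of one iommu DT node. */
struct iommu_node_irqs {
	const int *irqs;
	size_t count;
};

/* Mimics a driver that only ever looks at resource 0; -1 if none listed. */
static int pick_fault_irq(const struct iommu_node_irqs *node)
{
	assert(node != NULL);
	if (node->count == 0)
		return -1;
	return node->irqs[0];
}

/* Per the thread: the non-secure irq number is secure + 1. */
static int nonsecure_from_secure(int secure_irq)
{
	return secure_irq + 1;
}
```

With both lines listed as {69, 70}, index 0 yields the secure line 69, which would match the "no fault irq" symptom. Listing only 70 makes index 0 the non-secure line.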

BR,
-R


Re: [PATCH V6 0/6] iommu/msm: Add DT adaptation and generic bindings support

2016-08-12 Thread Rob Clark
On Fri, Aug 12, 2016 at 9:48 AM, Sricharan  wrote:
> Hi,
>
>>>>>>> btw, the current state, at least on linaro integration branch, fault
>>>>>>> handling doesn't work so well (ie. device never gets resumed).. which
>>>>>>> is a bit unfortunate for a gpu (and results in a *lot* of rebooting on
>>>>>>> my part when debugging userspace).  I haven't had time yet to compare
>>>>>>> to the ancient downstream driver, but not sure if you have any ideas?
>>>>>>>
>>>>>>> I guess probably disabling stall on fault would help.  But I'm not
>>>>>>> even getting the "Fault occurred in context.." prints.  Seeing the
>>>>>>> fault iova is pretty useful since that plus gpu cmdstream trace helps
>>>>>>> me figure out which texture/etc is being accessed out of bounds.
>>>>>>
>>>>>>fyi, it looks like it is not getting any fault irq..  it's *possible*
>>>>>>that I screwed up the irq #'s when translating from downstream, so you
>>>>>>might want to double check that.  I thought I had it right, I assume I
>>>>>>would have noticed during piglit runs if fault recovery wasn't working
>>>>>>(since the result is that *everything* after the faulting test would
>>>>>>have failed since gpu is wedged with no access to memory), but it was
>>>>>>long enough ago that I can't claim that definitively.
>>>>>>
>>>>>>If you need an easy way to trigger a gpu fault, msmtest is a good way,
>>>>>>change this line:
>>>>>>
>>>>>>  https://github.com/freedreno/msmtest/blob/master/msmtest.c#L247
>>>>>>
>>>>>>from OUT_RELOC() to OUT_RING(ring, 0x) will trigger a fault.
>>>>>>
>>>>>So for the irq to be triggered, 'non-secure' irq line has to be
>>>>>   populated in DT. There is a 'secure'and 'non-secure' irq lines for these iommus
>>>>>   and  non-secure irq number is secure + 1. I tested this by having a 'return 0'
>>>>>  from the msm_iommu_map (no mapping), and the faults were getting triggered.
>>>>>
>>>>>   Can you share me your dts data ?
>>>>>
>>>>
>>>>I think this is what you want:
>>>>
>>>>https://github.com/freedreno/kernel-msm/blob/integration-linux-qcomlt/arch/arm/boot/dts/qcom-apq8064.dtsi#L2008
>>>>
>>>>I haven't tested a display fault, so I suppose it is possible that
>>>>irq's are working for some iommu instances but not others?
>>>
>>> So in your DT, for gfx3d, the non-secure line is '70' and not '69' (This is 
>>> secure) .
>>>  Infact only '70' should be populated. The driver sets the irq line based 
>>> on resource 0.
>>>  This applies for all iommu nodes in your DT. (only the second irq line is 
>>> needed).
>>
>>ahh, that would explain.
>>
>>Is it better to remove the extra entry, or should I just swap them
>>all?  Ie. might there be some point in the future where the driver
>>would want both?
> I feel it is better to have one. Not sure why the secure irq was added in the
> first place in the downstream data; it would be set up/handled by the TZ

Ok, getting further.. still seems like the gpu is not getting resumed,
but at least we are getting a fault interrupt..

BR,
-R


[PATCH 2/2] iommu/msm: wire up fault handling

2016-08-12 Thread Rob Clark
When things go wrong on the gpu, we can get *thousands* of faults.  With
so many pr_err() prints, which were slowing down resuming the iommu,
drm/msm would think the GPU had actually hung and reset it.

Wire up the fault reporting, so that we instead get a small ratelimited
print of the fault address from drm/msm's fault handler.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/msm_iommu.c | 16 +++-
 drivers/iommu/msm_iommu.h |  3 +++
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
index f6f596f..1110b72 100644
--- a/drivers/iommu/msm_iommu.c
+++ b/drivers/iommu/msm_iommu.c
@@ -411,6 +411,7 @@ static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
}
__disable_clocks(iommu);
list_add(&iommu->dom_node, &priv->list_attached);
+   iommu->domain = domain;
}
}
 
@@ -614,8 +615,8 @@ irqreturn_t msm_iommu_fault_handler(int irq, void *dev_id)
goto fail;
}
 
-   pr_err("Unexpected IOMMU page fault!\n");
-   pr_err("base = %08x\n", (unsigned int)iommu->base);
+   pr_debug("Unexpected IOMMU page fault!\n");
+   pr_debug("base = %08x\n", (unsigned int)iommu->base);
 
ret = __enable_clocks(iommu);
if (ret)
@@ -624,9 +625,14 @@ irqreturn_t msm_iommu_fault_handler(int irq, void *dev_id)
for (i = 0; i < iommu->ncb; i++) {
fsr = GET_FSR(iommu->base, i);
if (fsr) {
-   pr_err("Fault occurred in context %d.\n", i);
-   pr_err("Interesting registers:\n");
-   print_ctx_regs(iommu->base, i);
+   int ret = report_iommu_fault(iommu->domain,
+   to_msm_priv(iommu->domain)->dev,
+   GET_FAR(iommu->base, i), 0);
+   if (ret == -ENOSYS) {
+   pr_err("Fault occurred in context %d.\n", i);
+   pr_err("Interesting registers:\n");
+   print_ctx_regs(iommu->base, i);
+   }
SET_FSR(iommu->base, i, 0x400F);
SET_RESUME(iommu->base, i, 1);
}
diff --git a/drivers/iommu/msm_iommu.h b/drivers/iommu/msm_iommu.h
index 4ca25d5..c53016c 100644
--- a/drivers/iommu/msm_iommu.h
+++ b/drivers/iommu/msm_iommu.h
@@ -56,6 +56,8 @@
  * dom_node:   list head for domain
  * ctx_list:   list of 'struct msm_iommu_ctx_dev'
  * context_map: Bitmap to track allocated context banks
+ * domain: iommu domain that this iommu dev is a member of,
+ * ie. whose msm_priv::list_attached are we on?
  */
 struct msm_iommu_dev {
void __iomem *base;
@@ -68,6 +70,7 @@ struct msm_iommu_dev {
struct list_head dom_node;
struct list_head ctx_list;
DECLARE_BITMAP(context_map, IOMMU_MAX_CBS);
+   struct iommu_domain *domain;
 };
 
 /**
-- 
2.7.4



Re: [PATCH 2/2] iommu/msm: wire up fault handling

2016-08-12 Thread Rob Clark
On Fri, Aug 12, 2016 at 12:17 PM, Sricharan <sricha...@codeaurora.org> wrote:
> Hi,
>
>>When things go wrong on the gpu, we can get *thousands* of faults.  With
>>so many pr_err() prints, which were slowing down resuming the iommu,
>>drm/msm would think the GPU had actually hung and reset it.
>>
>>Wire up the fault reporting, so instead we get a small ratelimited print
>>of the fault address from drm/msm's fault handler instead.
>>
>>Signed-off-by: Rob Clark <robdcl...@gmail.com>
>>---
>> drivers/iommu/msm_iommu.c | 16 +++-
>> drivers/iommu/msm_iommu.h |  3 +++
>> 2 files changed, 14 insertions(+), 5 deletions(-)
>>
>>diff --git a/drivers/iommu/msm_iommu.c b/drivers/iommu/msm_iommu.c
>>index f6f596f..1110b72 100644
>>--- a/drivers/iommu/msm_iommu.c
>>+++ b/drivers/iommu/msm_iommu.c
>>@@ -411,6 +411,7 @@ static int msm_iommu_attach_dev(struct iommu_domain *domain, struct device *dev)
>>   }
>>   __disable_clocks(iommu);
>>   list_add(&iommu->dom_node, &priv->list_attached);
>>+  iommu->domain = domain;
>>   }
>>   }
>>
>>@@ -614,8 +615,8 @@ irqreturn_t msm_iommu_fault_handler(int irq, void *dev_id)
>>   goto fail;
>>   }
>>
>>-  pr_err("Unexpected IOMMU page fault!\n");
>>-  pr_err("base = %08x\n", (unsigned int)iommu->base);
>>+  pr_debug("Unexpected IOMMU page fault!\n");
>
>   So was just thinking if it's better to have only this as a ratelimited
>   print, for global faults? Otherwise,

it is possibly a good idea to ratelimit the pr_err prints that get
printed when there is not a fault handler installed..  although in the
case there is a handler, I don't think we should print anything.  (At
least not unless DEBUG is defined.)
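
The split described here (print only when nobody handles the fault) can be modeled in plain userspace C. This is a hedged sketch, not the driver: every `sim_*` name is hypothetical, and the only behavior borrowed from the kernel is that report_iommu_fault() returns -ENOSYS when no handler is installed.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
typedef int (*sim_fault_handler_t)(unsigned long iova, int flags, void *token);

struct sim_domain {
	sim_fault_handler_t handler;	/* NULL when no handler is installed */
	void *token;
};

struct sim_stats {
	unsigned int handled;	/* faults consumed by an installed handler */
	unsigned int dumped;	/* faults that fell back to the register dump */
};

/* Modeled on report_iommu_fault(): -ENOSYS means nobody is listening. */
static int sim_report_fault(struct sim_domain *d, unsigned long iova, int flags)
{
	if (d == NULL || d->handler == NULL)
		return -ENOSYS;
	return d->handler(iova, flags, d->token);
}

/* Example handler: remembers the last faulting address in *token. */
static int sim_record_iova(unsigned long iova, int flags, void *token)
{
	(void)flags;
	*(unsigned long *)token = iova;
	return 0;
}

/* One context-bank fault: only dump registers when no handler took it. */
static void sim_context_fault(struct sim_domain *d, unsigned long far,
			      struct sim_stats *st)
{
	assert(st != NULL);
	if (sim_report_fault(d, far, 0) == -ENOSYS)
		st->dumped++;	/* driver would pr_err() + print_ctx_regs() here */
	else
		st->handled++;
	/* the real handler then clears FSR and resumes in either case */
}
```

A ratelimit on the fallback path would go where `dumped` is incremented; the handled path stays silent.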

If we can actually resume the faulting memory transaction, then we
could use this to implement virtual memory for the GPU, like the HMM
stuff.. in order to use malloc'd memory with the gpu without having to
pin..

(I know we can resume future memory transactions, but not sure if we
can update iommu page tables and resume the transaction that triggered
the fault..)

BR,
-R

> Reviewed-by: sricha...@codeaurora.org
>
> Regards,
>  Sricharan
>
>
>


[RFC 1/3] iommu: introduce stall/resume support

2017-02-01 Thread Rob Clark
A new flag when registering the fault handler indicates that the user
supports stalling, and will call iommu_domain_resume() at some point
later, potentially from a workqueue.  (This would allow the user to do
mm related operations that could not be done from irq context.)
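
As a rough illustration of the intended flow, here is a userspace model of "stall in irq context, resume later from a worker". It is only a sketch: the `stall_*` names are hypothetical, the single pending slot stands in for a real workqueue, and terminating immediately when the user cannot stall is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum stall_txn_state { STALL_RUNNING, STALL_STALLED, STALL_TERMINATED };

/* Hypothetical model of one domain with at most one outstanding fault. */
struct stall_domain {
	bool can_stall;			/* user passed can_stall=true */
	enum stall_txn_state state;
	bool work_pending;		/* stands in for a queued work item */
	unsigned long pending_iova;
};

/* Stand-in for the proposed iommu_domain_resume(domain, resume). */
static void stall_domain_resume(struct stall_domain *d, bool resume)
{
	assert(d != NULL && d->state == STALL_STALLED);
	d->state = resume ? STALL_RUNNING : STALL_TERMINATED;
}

/* "irq context": record the fault and defer, or give up right away. */
static void stall_fault_irq(struct stall_domain *d, unsigned long iova)
{
	d->state = STALL_STALLED;
	if (d->can_stall) {
		d->pending_iova = iova;
		d->work_pending = true;
		return;
	}
	stall_domain_resume(d, false);
}

/*
 * "thread context": free to sleep or touch mm, then resume or abort.
 * Returns true if a deferred fault was processed.
 */
static bool stall_fault_worker(struct stall_domain *d, bool fixed_up)
{
	if (!d->work_pending)
		return false;
	d->work_pending = false;
	stall_domain_resume(d, fixed_up);
	return true;
}
```

The point of the split is that `stall_fault_worker()` runs where sleeping is allowed, while the transaction stays stalled until it decides.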

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/gpu/drm/etnaviv/etnaviv_mmu.c|  2 +-
 drivers/gpu/drm/msm/msm_iommu.c  | 12 +---
 drivers/infiniband/hw/usnic/usnic_uiom.c |  2 +-
 drivers/iommu/iommu.c| 24 +++-
 drivers/remoteproc/remoteproc_core.c |  2 +-
 include/linux/iommu.h|  5 -
 6 files changed, 39 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
index 169ac96..a8819bc 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_mmu.c
@@ -303,7 +303,7 @@ struct etnaviv_iommu *etnaviv_iommu_new(struct etnaviv_gpu *gpu)
mmu->domain->geometry.aperture_end -
mmu->domain->geometry.aperture_start + 1);
 
-   iommu_set_fault_handler(mmu->domain, etnaviv_fault_handler, gpu->dev);
+   iommu_set_fault_handler(mmu->domain, etnaviv_fault_handler, gpu->dev, false);
 
return mmu;
 }
diff --git a/drivers/gpu/drm/msm/msm_iommu.c b/drivers/gpu/drm/msm/msm_iommu.c
index 48e79d0..7521582 100644
--- a/drivers/gpu/drm/msm/msm_iommu.c
+++ b/drivers/gpu/drm/msm/msm_iommu.c
@@ -29,9 +29,15 @@ static int msm_fault_handler(struct iommu_domain *domain, struct device *dev,
unsigned long iova, int flags, void *arg)
 {
struct msm_iommu *iommu = arg;
+   int ret = 0;
+
if (iommu->base.handler)
-   return iommu->base.handler(iommu->base.arg, iova, flags);
-   pr_warn_ratelimited("*** fault: iova=%08lx, flags=%d\n", iova, flags);
+   ret = iommu->base.handler(iommu->base.arg, iova, flags);
+   else
+   pr_warn_ratelimited("*** fault: iova=%08lx, flags=%d\n", iova, flags);
+
+   iommu_domain_resume(domain, false);
+
return 0;
 }
 
@@ -172,7 +178,7 @@ struct msm_mmu *msm_iommu_new(struct device *dev, struct iommu_domain *domain)
 
iommu->domain = domain;
msm_mmu_init(&iommu->base, dev, &funcs);
-   iommu_set_fault_handler(domain, msm_fault_handler, iommu);
+   iommu_set_fault_handler(domain, msm_fault_handler, iommu, true);
 
if (of_find_compatible_node(NULL, NULL, "qcom,msm-smmu-v2") ||
of_find_compatible_node(NULL, NULL, "qcom,msm-mmu-500"))
diff --git a/drivers/infiniband/hw/usnic/usnic_uiom.c b/drivers/infiniband/hw/usnic/usnic_uiom.c
index 1ccee6e..0613701 100644
--- a/drivers/infiniband/hw/usnic/usnic_uiom.c
+++ b/drivers/infiniband/hw/usnic/usnic_uiom.c
@@ -476,7 +476,7 @@ struct usnic_uiom_pd *usnic_uiom_alloc_pd(void)
return ERR_PTR(-ENOMEM);
}
 
-   iommu_set_fault_handler(pd->domain, usnic_uiom_dma_fault, NULL);
+   iommu_set_fault_handler(pd->domain, usnic_uiom_dma_fault, NULL, false);
 
spin_lock_init(&pd->lock);
INIT_LIST_HEAD(&pd->devs);
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index 9a2f196..65257cc 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1029,6 +1029,7 @@ EXPORT_SYMBOL_GPL(iommu_capable);
  * @domain: iommu domain
  * @handler: fault handler
  * @token: user data, will be passed back to the fault handler
+ * @can_stall: the user can support stalling on iommu fault
  *
  * This function should be used by IOMMU users which want to be notified
  * whenever an IOMMU fault happens.
@@ -1038,12 +1039,14 @@ EXPORT_SYMBOL_GPL(iommu_capable);
  */
 void iommu_set_fault_handler(struct iommu_domain *domain,
iommu_fault_handler_t handler,
-   void *token)
+   void *token,
+   bool can_stall)
 {
BUG_ON(!domain);
 
domain->handler = handler;
domain->handler_token = token;
+   domain->can_stall = can_stall;
 }
 EXPORT_SYMBOL_GPL(iommu_set_fault_handler);
 
@@ -1546,6 +1549,25 @@ int iommu_domain_set_attr(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_domain_set_attr);
 
+/**
+ * iommu_domain_resume() - resume a stalled transaction after fault
+ * @domain: iommu domain
+ * @resume: if true, resume the transaction, else abort it
+ *
+ * Users that pass can_stall=true to iommu_set_fault_handler() must
+ * call this function to resume (or terminate) the stalled iommu
+ * transaction.  It may either be called directly from the fault
+ * handler, or at some point later from a thread context (ie. if the
+ * fault handler needs to do anything that cannot be done from atomic
+ * context, ie. use any mm 

[RFC 2/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-02-01 Thread Rob Clark
TODO maybe some dev_dbg() or some other way to tell if stalling is
actually enabled?

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 .../devicetree/bindings/iommu/arm,smmu.txt |  3 +
 drivers/iommu/arm-smmu.c   | 85 --
 2 files changed, 82 insertions(+), 6 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index ef465b0..5f405a6 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -68,6 +68,9 @@ conditions.
   aliases of secure registers have to be used during
   SMMU configuration.
 
+- arm,smmu-enable-stall : Enable stall mode to stall memory transactions
+  and resume after fault is handled
+
 ** Deprecated properties:
 
 - mmu-masters (deprecated in favour of the generic "iommus" binding) :
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index d505432..96a1be6 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -350,6 +350,7 @@ struct arm_smmu_device {
u32 features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_ENABLE_STALL  (1 << 1)
u32 options;
enum arm_smmu_arch_version  version;
enum arm_smmu_implementationmodel;
@@ -377,6 +378,8 @@ struct arm_smmu_device {
int num_clocks;
struct clk  **clocks;
 
+   struct list_headdomain_list;
+
u32 cavium_id_base; /* Specific to Cavium */
 };
 
@@ -412,6 +415,7 @@ struct arm_smmu_domain {
enum arm_smmu_domain_stage  stage;
struct mutexinit_mutex; /* Protects smmu pointer */
struct iommu_domain domain;
+   struct list_headdomain_node;
 };
 
 struct arm_smmu_option_prop {
@@ -425,6 +429,7 @@ static bool using_legacy_binding, using_generic_binding;
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+   { ARM_SMMU_OPT_ENABLE_STALL,  "arm,smmu-enable-stall" },
{ 0, NULL},
 };
 
@@ -676,7 +681,8 @@ static struct iommu_gather_ops arm_smmu_gather_ops = {
 
 static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 {
-   u32 fsr, fsynr;
+   int flags, ret;
+   u32 fsr, fsynr, resume;
unsigned long iova;
struct iommu_domain *domain = dev;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -690,15 +696,48 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
if (!(fsr & FSR_FAULT))
return IRQ_NONE;
 
+   if (fsr & FSR_IGN)
+   dev_err_ratelimited(smmu->dev,
+   "Unexpected context fault (fsr 0x%x)\n",
+   fsr);
+
fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
-   iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
+   flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
 
-   dev_err_ratelimited(smmu->dev,
-   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
-   fsr, iova, fsynr, cfg->cbndx);
+   iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
+   if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
+   ret = IRQ_HANDLED;
+   resume = RESUME_RETRY;
+   } else {
+   dev_err_ratelimited(smmu->dev,
+   "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+   iova, fsynr, cfg->cbndx);
+   ret = IRQ_NONE;
+   resume = RESUME_TERMINATE;
+   }
 
+   /* Clear the faulting FSR */
writel(fsr, cb_base + ARM_SMMU_CB_FSR);
-   return IRQ_HANDLED;
+
+   if ((fsr & FSR_SS) && !domain->can_stall) {
+   /* Retry or terminate any stalled transactions */
+   writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME);
+   }
+
+   return ret;
+}
+
+static void arm_smmu_domain_resume(struct iommu_domain *domain, bool resume)
+{
+   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
struct arm_smmu_cfg *cfg = &smmu_domain->cfg;
+   struct arm_smmu_device *smmu = smmu_domain->smmu;
+   void __iomem *cb_base;
+
+   cb_base = ARM_SMMU_CB_BASE(smmu) + ARM_SMMU_CB(smmu, cfg->cbndx);
+
+   writel_relaxed(resume ? RESUME_RETRY : RESUME_TERMINATE,
+   cb_base + ARM_SMMU_CB_RESUME);
 }
 
 static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
@@ -725,6 +764,20 @@ s

[RFC 0/3] iommu/arm-smmu: stalling support (v2)

2017-02-01 Thread Rob Clark
As was previously discussed[1], we now have a dt property on the
smmu, to indicate that the smmu supports stalling, as well as a flag
on the domain (set when fault-handler is registered) to indicate that
the domain supports stalling.

I ran into one snag, thanks to the auto-attached DMA domain, since
that doesn't support stalling.  I think when the driver is explicitly
managing the iommu, we should just get rid of the DMA domain.  (But
maybe there is a cleaner way to do this than what I came up with, so
suggestions welcome.)

[1] http://www.spinics.net/lists/linux-arm-msm/msg25357.html
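
For illustration only (not part of the series): a minimal userspace C
sketch of the opt-in rule described above and of the decision the
context fault handler in 2/3 makes about the RESUME register.  The
struct and function names (model_smmu, model_domain, stall_enabled,
irq_resume_action) and the supports_stall field are hypothetical
stand-ins; only can_stall and the RETRY/TERMINATE idea come from the
patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for what the irq handler writes to RESUME. */
enum resume_action { RESUME_NONE, RESUME_RETRY, RESUME_TERMINATE };

struct model_smmu {
	bool supports_stall;	/* assumed: mirrors the dt property on the smmu */
};

struct model_domain {
	bool can_stall;		/* set when the fault handler is registered */
};

/* Stalling is only usable when both the smmu and the domain opt in. */
bool stall_enabled(const struct model_smmu *smmu,
		   const struct model_domain *domain)
{
	assert(smmu != NULL && domain != NULL);
	return smmu->supports_stall && domain->can_stall;
}

/*
 * What the irq handler itself does with a fault:
 *  - nothing if no transaction is stalled, or if the domain can stall
 *    (the domain user is then expected to resume later);
 *  - otherwise retry if the fault was handled, terminate if not.
 */
enum resume_action irq_resume_action(const struct model_domain *domain,
				     bool stalled, bool handled)
{
	assert(domain != NULL);
	if (!stalled || domain->can_stall)
		return RESUME_NONE;
	return handled ? RESUME_RETRY : RESUME_TERMINATE;
}
```

In this model a domain that can stall leaves the transaction stalled
until its user resumes or terminates it explicitly.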

Rob Clark (3):
  iommu: introduce stall/resume support
  iommu/arm-smmu: Add support to opt-in to stalling
  iommu/arm-smmu: detach DMA domain if driver is managing iommu

 .../devicetree/bindings/iommu/arm,smmu.txt |   3 +
 drivers/gpu/drm/etnaviv/etnaviv_mmu.c  |   2 +-
 drivers/gpu/drm/msm/msm_iommu.c|  12 ++-
 drivers/infiniband/hw/usnic/usnic_uiom.c   |   2 +-
 drivers/iommu/arm-smmu.c   | 100 +++--
 drivers/iommu/iommu.c  |  24 -
 drivers/remoteproc/remoteproc_core.c   |   2 +-
 include/linux/iommu.h  |   5 +-
 8 files changed, 136 insertions(+), 14 deletions(-)

-- 
2.9.3

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC 3/3] iommu/arm-smmu: detach DMA domain if driver is managing iommu

2017-02-01 Thread Rob Clark
Before the driver is probed, arm_smmu_add_device() helpfully attaches
an IOMMU_DOMAIN_DMA domain.  Which ofc does not support stalling, and
when the driver later attaches a domain that can_stall to an smmu that
can stall, the default _DMA domain prevents stalling from being enabled.
(And will cause further problems later)

One simple way to deal with this is to simply toss the default _DMA
domain if the driver attaches its own domain.

TODO maybe the tracking of list of attached domains should be done in
iommu core, so the detach can happen outside of group->mutex.
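
For illustration only: a minimal userspace C sketch of the "toss the
default _DMA domain on attach" idea, using a plain singly linked list
in place of the kernel list helpers.  All names here (model_domain,
detach_dma_domains, attach_domain) are hypothetical and this is not
the driver code.

```c
#include <assert.h>
#include <stddef.h>

enum domain_type { DOMAIN_DMA, DOMAIN_UNMANAGED };

struct model_domain {
	enum domain_type type;
	struct model_domain *next;	/* link in the list of attached domains */
};

/*
 * Unlink every DOMAIN_DMA entry from the list at *head and return how
 * many were removed.  The pointer-to-pointer walk reads the next link
 * before an entry is unlinked, so removing while iterating is safe
 * (the same reason the patch uses list_for_each_entry_safe()).
 */
size_t detach_dma_domains(struct model_domain **head)
{
	size_t removed = 0;

	assert(head != NULL);
	while (*head) {
		struct model_domain *cur = *head;

		if (cur->type == DOMAIN_DMA) {
			*head = cur->next;	/* unlink the DMA domain */
			cur->next = NULL;
			removed++;
		} else {
			head = &cur->next;	/* keep it, move on */
		}
	}
	return removed;
}

/* Attach new_domain; a non-DMA domain first evicts any DMA domain. */
void attach_domain(struct model_domain **head,
		   struct model_domain *new_domain)
{
	assert(head != NULL && new_domain != NULL);
	if (new_domain->type != DOMAIN_DMA)
		detach_dma_domains(head);
	new_domain->next = *head;
	*head = new_domain;
}
```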

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu.c | 15 +++
 1 file changed, 15 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 96a1be6..50bf135 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -1323,6 +1323,21 @@ static int arm_smmu_attach_dev(struct iommu_domain *domain, struct device *dev)
 
 	smmu = fwspec_smmu(fwspec);
 
+	/*
+	 * If driver is explicitly managing the iommu, detach any previously
+	 * attached _DMA domains.
+	 *
+	 * TODO maybe this logic should be in iommu_attach_device() so it can
+	 * happen outside of holding group->mutex??
+	 */
+	if (domain->type != IOMMU_DOMAIN_DMA) {
+		struct arm_smmu_domain *other_domain, *n;
+
+		list_for_each_entry_safe(other_domain, n, &smmu->domain_list, domain_node)
+			if (other_domain->domain.type == IOMMU_DOMAIN_DMA)
+				arm_smmu_detach_dev(&other_domain->domain, dev);
+	}
+
 	if (WARN_ON(!list_empty(&smmu_domain->domain_node)))
 		return -EINVAL;
 
-- 
2.9.3



Re: [RFC 3/3] iommu/arm-smmu: detach DMA domain if driver is managing iommu

2017-02-01 Thread Rob Clark
On Wed, Feb 1, 2017 at 10:23 AM, Rob Clark <robdcl...@gmail.com> wrote:
> Before the driver is probed, arm_smmu_add_device() helpfully attaches
> an IOMMU_DOMAIN_DMA domain.  Which ofc does not support stalling, and
> when the driver later attaches a domain that can_stall to an smmu that
> can stall, the default _DMA domain prevents stalling from being enabled.
> (And will cause further problems later)
>
> One simple way to deal with this is simply toss the default _DMA domain
> if the driver attaches it's own domain.
>
> TODO maybe the tracking of list of attached domains should be done in
> iommu core, so the detach can happen outside of group->mutex.
>
> Signed-off-by: Rob Clark <robdcl...@gmail.com>
> ---
>  drivers/iommu/arm-smmu.c | 15 +++
>  1 file changed, 15 insertions(+)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 96a1be6..50bf135 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -1323,6 +1323,21 @@ static int arm_smmu_attach_dev(struct iommu_domain 
> *domain, struct device *dev)
>
> smmu = fwspec_smmu(fwspec);
>
> +   /*
> +* If driver is explicitly managing the iommu, detach any previously
> +* attached _DMA domains.
> +*
> +* TODO maybe this logic should be in iommu_attach_device() so it can
> +* happen outside of holding group->mutex??
> +*/
> +   if (domain->type != IOMMU_DOMAIN_DMA) {
> +   struct arm_smmu_domain *other_domain, *n;
> +
> +   list_for_each_entry_safe(other_domain, n, &smmu->domain_list, domain_node)
> +   if (other_domain->domain.type == IOMMU_DOMAIN_DMA)
> +   arm_smmu_detach_dev(&other_domain->domain, dev);

hmm, we might want to unhook dev->archdata.dma_ops here too..

I'm thinking maybe on arm64 __generic_dma_ops() should fall back to
swiotlb ops instead of dummy_ops if archdata.dma_ops is NULL, so that
we could just set it to NULL here?

(Is there really any purpose for having the dummy-ops??)

BR,
-R

> +   }
> +
> if (WARN_ON(!list_empty(&smmu_domain->domain_node)))
> return -EINVAL;
>
> --
> 2.9.3
>


Re: [RFC 3/3] iommu/arm-smmu: detach DMA domain if driver is managing iommu

2017-02-02 Thread Rob Clark
On Wed, Feb 1, 2017 at 11:10 PM, Sricharan <sricha...@codeaurora.org> wrote:
> Hi Rob,
>
>>On Wed, Feb 1, 2017 at 10:23 AM, Rob Clark <robdcl...@gmail.com> wrote:
>>> Before the driver is probed, arm_smmu_add_device() helpfully attaches
>>> an IOMMU_DOMAIN_DMA domain.  Which ofc does not support stalling, and
>>> when the driver later attaches a domain that can_stall to an smmu that
>>> can stall, the default _DMA domain prevents stalling from being enabled.
>>> (And will cause further problems later)
>>>
>>> One simple way to deal with this is simply toss the default _DMA domain
>>> if the driver attaches it's own domain.
>>>
>>> TODO maybe the tracking of list of attached domains should be done in
>>> iommu core, so the detach can happen outside of group->mutex.
>>>
>>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>>> ---
>>>  drivers/iommu/arm-smmu.c | 15 +++
>>>  1 file changed, 15 insertions(+)
>>>
>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>> index 96a1be6..50bf135 100644
>>> --- a/drivers/iommu/arm-smmu.c
>>> +++ b/drivers/iommu/arm-smmu.c
>>> @@ -1323,6 +1323,21 @@ static int arm_smmu_attach_dev(struct iommu_domain 
>>> *domain, struct device *dev)
>>>
>>> smmu = fwspec_smmu(fwspec);
>>>
>>> +   /*
>>> +* If driver is explicitly managing the iommu, detach any previously
>>> +* attached _DMA domains.
>>> +*
>>> +* TODO maybe this logic should be in iommu_attach_device() so it can
>>> +* happen outside of holding group->mutex??
>>> +*/
>>> +   if (domain->type != IOMMU_DOMAIN_DMA) {
>>> +   struct arm_smmu_domain *other_domain, *n;
>>> +
>>> +   list_for_each_entry_safe(other_domain, n, &smmu->domain_list, domain_node)
>>> +   if (other_domain->domain.type == IOMMU_DOMAIN_DMA)
>>> +   arm_smmu_detach_dev(&other_domain->domain, dev);
>>
>
> So the arm_smmu_detach_dev api is no more there and is removed now.
> Also this will be a problem when multiple devices share the iommu, we end up
> removing domains used by other devices. Should this not be done
> per-device which does not want to use the DMA domain ?

I actually added the _detach fxn in the previous patch.

But I think managing the iommu explicitly is not going to work if
multiple devices share the same pagetable.  I think that is a scenario
that should never happen with the gpu.

>>hmm, we might want to unhook dev->archdata.dma_ops here too..
>>
>>I'm thinking maybe on arm64 __generic_dma_ops() should fall back to
>>swiotlb ops instead of dummy_ops if archdata.dma_ops is NULL, so that
>>we could just set it to NULL here?
>>
>
> hmm, both not attaching the default dma domain and not setting the dma_ops
> is tried in this series as well [1]
>
> [1] https://www.spinics.net/lists/arm-kernel/msg556081.html

I'll have a look at that series later this morning.

>>(Is there really any purpose for having the dummy-ops??)
>>
>
> To enforce setting arch_setup_dma_ops for device so that the
> devices can do cache coherent transactions, otherwise disable the dma
> capability of the device. I see that this was introduced as a part of
> making ACPI_CCA_REQUIRED to be set in arm64 and later
> generalized.

hmm, looks like all the dummy ops just fail.  I'm not entirely sure
how that can be useful.  But I'll poke around the git history.  Seems
like just always falling back to swiotlb ops would be the equivalent
to armv7 falling back to arm_dma_ops..
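
For illustration only: a minimal userspace C sketch of the fallback
being discussed, where a device whose dma_ops pointer has been set to
NULL gets the swiotlb ops rather than the dummy ops.  The model_*
names and pick_dma_ops() are hypothetical; this is not the arm64
__generic_dma_ops() implementation.

```c
#include <assert.h>
#include <stddef.h>

struct model_dma_ops {
	const char *name;
};

/* Hypothetical stand-ins for the real ops tables. */
const struct model_dma_ops model_swiotlb_ops = { "swiotlb" };
const struct model_dma_ops model_iommu_ops = { "iommu" };

struct model_device {
	const struct model_dma_ops *dma_ops;	/* NULL once unhooked */
};

/*
 * Return the ops a device should use: its own if set, otherwise fall
 * back to swiotlb instead of ops that fail every call.
 */
const struct model_dma_ops *pick_dma_ops(const struct model_device *dev)
{
	assert(dev != NULL);
	if (dev->dma_ops)
		return dev->dma_ops;
	return &model_swiotlb_ops;
}
```

With this shape, unhooking the iommu ops on attach would amount to
setting dev->dma_ops to NULL.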

BR,
-R

>
> Regards,
>  Sricharan
>
>>BR,
>>-R
>>
>>> +   }
>>> +
>>> if (WARN_ON(!list_empty(&smmu_domain->domain_node)))
>>> return -EINVAL;
>>>
>>> --
>>> 2.9.3
>>>
>


Re: [PATCH 0/5] Implement SMMU passthrough using the default domain

2017-02-02 Thread Rob Clark
On Thu, Jan 26, 2017 at 12:18 PM, Joerg Roedel  wrote:
> On Tue, Jan 24, 2017 at 08:42:23PM +0530, Sricharan wrote:
>> Thanks for this series. We had a case with the GPU.
>> The GPU's iommu was setup by kernel and the GPU
>> also does dynamic updates for on-the-fly switching between
>> process pagetables.  GPU driver was not using DMA domain and
>> the GPU's firmware was always expecting to run out  of contextbank
>>  '0' (although not correct) , which was not the case after the DMA domain
>> was made default  as '0' was getting allocated for DMA domain and
>> there were concerns about reusing the DMA domain as well.
>> Now with this series, looks there is an way out of that that can be tried.
>>
>> So should the default domain not be per device specific selectable ?
>
> Note that iommu-drivers can request direct-mapping for any given device
> on its initializtion. This is used on x86 for devices that need a 1-1
> mapping for some reason.
>
> Also device drivers can use the iommu-api and assign their own domain to
> a device, which allows them to manage the dma address space on their
> own.

Part of the problem is that dev->archdata.dma_ops gets wired up to
iommu_dma_ops.  Which isn't so bad on its own, except that cache ops
are not exposed to drivers, forcing us to use the dma-mapping API
(dma_map_sg, etc) for cache operations.

Possibly we should just expose cache ops to drivers, to bypass this
abuse of dma-mapping.

btw, Will, we definitely want this to *not* rely on kcmdline for the
case of a gpu with its own private iommu..

BR,
-R


Re: [RFC 3/3] iommu/arm-smmu: detach DMA domain if driver is managing iommu

2017-02-02 Thread Rob Clark
On Wed, Feb 1, 2017 at 11:10 PM, Sricharan <sricha...@codeaurora.org> wrote:
> Hi Rob,
>
>>On Wed, Feb 1, 2017 at 10:23 AM, Rob Clark <robdcl...@gmail.com> wrote:
>>> Before the driver is probed, arm_smmu_add_device() helpfully attaches
>>> an IOMMU_DOMAIN_DMA domain.  Which ofc does not support stalling, and
>>> when the driver later attaches a domain that can_stall to an smmu that
>>> can stall, the default _DMA domain prevents stalling from being enabled.
>>> (And will cause further problems later)
>>>
>>> One simple way to deal with this is simply toss the default _DMA domain
>>> if the driver attaches it's own domain.
>>>
>>> TODO maybe the tracking of list of attached domains should be done in
>>> iommu core, so the detach can happen outside of group->mutex.
>>>
>>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>>> ---
>>>  drivers/iommu/arm-smmu.c | 15 +++
>>>  1 file changed, 15 insertions(+)
>>>
>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>> index 96a1be6..50bf135 100644
>>> --- a/drivers/iommu/arm-smmu.c
>>> +++ b/drivers/iommu/arm-smmu.c
>>> @@ -1323,6 +1323,21 @@ static int arm_smmu_attach_dev(struct iommu_domain 
>>> *domain, struct device *dev)
>>>
>>> smmu = fwspec_smmu(fwspec);
>>>
>>> +   /*
>>> +* If driver is explicitly managing the iommu, detach any previously
>>> +* attached _DMA domains.
>>> +*
>>> +* TODO maybe this logic should be in iommu_attach_device() so it can
>>> +* happen outside of holding group->mutex??
>>> +*/
>>> +   if (domain->type != IOMMU_DOMAIN_DMA) {
>>> +   struct arm_smmu_domain *other_domain, *n;
>>> +
>>> +   list_for_each_entry_safe(other_domain, n, &smmu->domain_list, domain_node)
>>> +   if (other_domain->domain.type == IOMMU_DOMAIN_DMA)
>>> +   arm_smmu_detach_dev(&other_domain->domain, dev);
>>
>
> So the arm_smmu_detach_dev api is no more there and is removed now.
> Also this will be a problem when multiple devices share the iommu, we end up
> removing domains used by other devices. Should this not be done
> per-device which does not want to use the DMA domain ?
>
>>hmm, we might want to unhook dev->archdata.dma_ops here too..
>>
>>I'm thinking maybe on arm64 __generic_dma_ops() should fall back to
>>swiotlb ops instead of dummy_ops if archdata.dma_ops is NULL, so that
>>we could just set it to NULL here?
>>
>
> hmm, both not attaching the default dma domain and not setting the dma_ops
> is tried in this series as well [1]
>
> [1] https://www.spinics.net/lists/arm-kernel/msg556081.html
>
>>(Is there really any purpose for having the dummy-ops??)
>>
>
> To enforce setting arch_setup_dma_ops for device so that the
> devices can do cache coherent transactions, otherwise disable the dma
> capability of the device. I see that this was introduced as a part of
> making ACPI_CCA_REQUIRED to be set in arm64 and later
> generalized.

hmm, maybe fallback to swiotlb ops vs dummy ops should depend on
whether it is an ACPI system or not..

BR,
-R


Re: [PATCH 0/5] Implement SMMU passthrough using the default domain

2017-02-03 Thread Rob Clark
On Thu, Feb 2, 2017 at 10:12 AM, Will Deacon <will.dea...@arm.com> wrote:
> On Thu, Feb 02, 2017 at 10:02:50AM -0500, Rob Clark wrote:
>> On Thu, Jan 26, 2017 at 12:18 PM, Joerg Roedel <j...@8bytes.org> wrote:
>> > On Tue, Jan 24, 2017 at 08:42:23PM +0530, Sricharan wrote:
>> >> Thanks for this series. We had a case with the GPU.
>> >> The GPU's iommu was setup by kernel and the GPU
>> >> also does dynamic updates for on-the-fly switching between
>> >> process pagetables.  GPU driver was not using DMA domain and
>> >> the GPU's firmware was always expecting to run out  of contextbank
>> >>  '0' (although not correct) , which was not the case after the DMA domain
>> >> was made default  as '0' was getting allocated for DMA domain and
>> >> there were concerns about reusing the DMA domain as well.
>> >> Now with this series, looks there is an way out of that that can be tried.
>> >>
>> >> So should the default domain not be per device specific selectable ?
>> >
>> > Note that iommu-drivers can request direct-mapping for any given device
>> > on its initializtion. This is used on x86 for devices that need a 1-1
>> > mapping for some reason.
>> >
>> > Also device drivers can use the iommu-api and assign their own domain to
>> > a device, which allows them to manage the dma address space on their
>> > own.
>>
>> Part of the problem is that dev->archdata.dma_ops gets wired up to
>> iommu_dma_ops.  Which isn't so bad on its own, except that cache ops
>> are not exposed to drivers, forcing us to use the dma-mapping API
>> (dma_map_sg, etc) for cache operations.
>>
>> Possibly we should just expose cache ops to drivers, to bypass this
>> abuse of dma-mapping.
>>
>> btw, Will, we definitely want this to *not* rely on kcmdline for the
>> case of a gpu with its own private iommu..
>
> I still need to understand the unmanaged domain case, but I don't really
> see why that's related to this series to be honest.
>

Only relation is if we were trying to solve the case of a driver that
needs to manage its own domain with this patchset..  but that seems a
bit like "when all you have is a hammer, everything looks like a
nail"..

BR,
-R


Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-30 Thread Rob Clark
On Thu, Jan 12, 2017 at 10:17 AM, Will Deacon <will.dea...@arm.com> wrote:
> On Wed, Jan 11, 2017 at 03:59:30PM -0500, Rob Clark wrote:
>> On Wed, Jan 11, 2017 at 4:36 AM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Tue, Jan 10, 2017 at 02:20:13PM -0500, Rob Clark wrote:
>> >> On Tue, Jan 10, 2017 at 12:52 PM, Will Deacon <will.dea...@arm.com> wrote:
>> >> > On Fri, Jan 06, 2017 at 11:26:49AM -0500, Rob Clark wrote:
>> >> >> Hmm, well we install the fault handler on the iommu_domain..  perhaps
>> >> >> maybe a combo of dts property (or deciding based on more specific
>> >> >> compat string), plus extra param passed in to
>> >> >> iommu_set_fault_hander().  The dts property or compat string to
>> >> >> indicate whether the iommu (and how it is wired up) can handle stalls,
>> >> >> and enable_stall param when fault handler is registered to indicate
>> >> >> whether the device itself can cope.. if either can't do stalling, then
>> >> >> don't set CFCFG.
>> >> >
>> >> > I thought about this some more, and I think you're right. Having
>> >> > iommu_set_fault_handler take a flags parameter indicating that, for 
>> >> > example,
>> >> > the fault handler can deal with paging, is all we need to implement the
>> >> > per-master opt-in functionality for stalling faults. There's no real
>> >> > requirement to standardise a generic firmware property for that (but
>> >> > we still need *something* that says stalling is usable on the SMMU --
>> >> > perhaps just the compatible string is ok).
>> >>
>> >> btw, it occurred to me that maybe it should be flags param to
>> >> iommu_attach_device() (just in case fault handler not installed?)
>> >> otoh stalling without a fault handler is silly, but I guess we need it
>> >> to infer whether stalling can be supported by other devices on same
>> >> iommu.. tbh I'm on a bit shaky ground when it comes to multiple
>> >> devices per iommu since the SoC's I'm familiar with do it the other
>> >> way around.  But I guess you have thought more about the multi-device
>> >> case, so figured I should suggest it..
>> >
>> > I don't think it works at attach time, because the stalling property 
>> > belongs
>> > to the domain, rather than the individual devices within it. Similarly, I
>> > don't think we should allow this property to be toggled once devices have
>> > been attached.
>> >
>>
>> hmm, I was more thinking of cases where drivers for particular devices
>> need some work (ie. like potentially disabling hw hang detect during
>> faults).. I guess we could have three levels, that all have to be true
>> in order to enable stall: smmu, domain (pass flags in to
>> iommu_domain_alloc()??), and device (iommu_attach_device())?
>
> Hooking iommu_set_fault_handler, as you originally suggested, is the best
> way to set the flag on the domain. I think we just need to enforce that
> iommu_set_fault_handler is called prior to attaching devices to a domain,
> so that the IOMMU driver can configure the domain appropriately on the
> first attach.

Hi Will, just (finally) revisiting this..

So I started working on a patch to add can_stall to
iommu_set_fault_handler() (fortunately not many callers).  And then
adding an iommu_domain_resume(domain, resume/terminate).  (Ie.
iommu_domain_resume() would be called by the IOMMU user either
directly from fault handler callback, or indirectly from a thread
context..  that seemed a bit less clunky than passing a callback to
the callback..)

But is there any good way to iterate all the domains associated w/ the
arm_smmu_device?  Unfortunately we don't pass in the device ptr to
iommu_domain_alloc(), so I'm not entirely sure at what point we know
whether all the associated domains can stall..
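
For illustration only: a minimal userspace C sketch of the check being
asked about, assuming the set of domains behind one smmu were somehow
available as an array.  How to obtain that set is exactly the open
question above; model_domain and smmu_may_stall() are hypothetical
names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_domain {
	bool can_stall;	/* set when the fault handler is registered */
};

/*
 * Stalling may only be enabled if the smmu supports it and every
 * domain associated with that smmu has opted in.
 */
bool smmu_may_stall(bool smmu_supports_stall,
		    const struct model_domain *domains, size_t count)
{
	assert(domains != NULL || count == 0);
	if (!smmu_supports_stall)
		return false;
	for (size_t i = 0; i < count; i++)
		if (!domains[i].can_stall)
			return false;
	return true;
}
```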

BR,
-R


Re: [PATCH 2/2] iommu: add qcom_iommu

2017-02-22 Thread Rob Clark
On Wed, Feb 22, 2017 at 4:31 AM, Sricharan  wrote:
> Hi Rob,
>
>>diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>new file mode 100644
>>index 000..78a8d65
>>--- /dev/null
>>+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>@@ -0,0 +1,45 @@
>>+* QCOM IOMMU Implementation
>>+
>>+Qualcomm "B" family devices which are not compatible with arm-smmu have
>>+a similar looking IOMMU but without access to the global register space.
>>+This is modelled as separate IOMMU devices which have just a single
>>+master.
>>+
>>+** Required properties:
>>+
>>+- compatible: Should be one of:
>>+
>>+"qcom,msm8916-iommu-context-bank"
>>+
>>+  depending on the particular implementation and/or the
>>+  version of the architecture implemented.
>>+
>>+- reg   : Base address and size of the SMMU.  And optionally,
>>+  if present, the "smmu_local_base"
>>+
>>+- interrupts: The context fault irq.
>>+
>>+- #iommu-cells  : Must be 0
>>+
>>+- qcom,iommu-ctx-asid   : context ASID
>>+
>>+- qcom,iommu-secure-id  : secure-id
>>+
>>+- clocks: The interface clock (iface_clk) and bus clock (bus_clk)
>>+
>>+** Examples:
>>+
>>+  mdp_iommu: iommu-context-bank@1e24000 {
>>+  compatible = "qcom,msm8916-iommu-context-bank";
>>+  reg = <0x1e24000 0x1000
>>+  0x1ef 0x3000>;
>>+  reg-names = "iommu_base", "smmu_local_base";
>>+  interrupts = ;
>>+  qcom,iommu-ctx-asid = <4>;
>>+  qcom,iommu-secure-id = <17>;
>
> This is not an per context bank property and can be programmed for an
> given iommu only once. So we call qcom_iommu_sec_init for
> each context bank once, which does not look correct. Similarly for
> smmu_local_base as well. So should this be handled using an global
> once for all contexts ?

yeah, smmu_local_base and secure-id would be duplicate for all context
banks that are part of the same actual iommu.  (But it was Robin's
suggestion to just model this as separate context-bank devices, since
we cannot touch the global space).

Did I misunderstand the downstream driver code?  It looked like
qcom_scm_restore_sec_cfg() was called once on first attach per
context-bank, not globally for the entire iommu, which is what I'm
doing with this driver.  But I haven't yet tried to enable other
context-banks in the apps iommu.

>>+  #iommu-cells = <0>;
>>+  clocks = < GCC_SMMU_CFG_CLK>,
>>+   < GCC_APSS_TCU_CLK>;
>>+  clock-names = "iface_clk", "bus_clk";
>
> I am trying to generalize the clock bindings for MMU-500 and one more
> qcom specific. Anyways this can follow that.

no problem to adapt to what you come up with for arm-smmu, it is
basically the same requirements.

>>+  status = "okay";
>
> <..>
>
>>+#define pr_fmt(fmt) "qcom-iommu: " fmt
>>+
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+#include 
>>+
>>+#include "io-pgtable.h"
>>+#include "arm-smmu-regs.h"
>>+
>>+// TODO are these qcom specific, or just something no one bothered to add to 
>>arm-smmu
>>+#define SMMU_CB_TLBSYNC  0x7f0
>>+#define SMMU_CB_TLBSTATUS0x7f4
>
> I think the reason was in arm-smmu, we are using the global TLBSYNC/STATUS 
> bits, as its
> used in both global device reset and flush path. Otherwise here, its correct 
> to add this.

ok, that is what I suspected.. in next version I'll add these two to
the shared header instead

>>+#define SMMU_INTR_SEL_NS 0x2000
>>+
>>+
>>+struct qcom_iommu_device {
>>+  struct device   *dev;
>>+
>>+  void __iomem*base;
>>+  void __iomem*local_base;
>>+  unsigned int irq;
>>+  struct clk  *iface_clk;
>>+  struct clk  *bus_clk;
>>+
>>+  bool secure_init;
>>+  u32  asid;  /* asid and ctx bank # are 1:1 */
>>+  u32  sec_id;
>>+
>>+  /* single group per device: */
>>+  struct iommu_group  *group;
>>+};
>>+
>>+struct qcom_iommu_domain {
>>+  struct qcom_iommu_device*iommu;
>>+  struct io_pgtable_ops   *pgtbl_ops;
>>+  spinlock_t   pgtbl_lock;
>>+  struct mutex init_mutex; /* Protects iommu pointer 
>>*/
>>+  struct iommu_domain  domain;
>>+};
>>+
>>+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain 
>>*dom)
>>+{
>>+  return container_of(dom, struct qcom_iommu_domain, domain);
>>+}
>>+
>>+static const struct iommu_ops qcom_iommu_ops;
>>+static 

Re: [RFC 2/2] iommu/arm-smmu: support qcom implementation

2017-02-14 Thread Rob Clark
On Tue, Feb 14, 2017 at 1:46 PM, Robin Murphy <robin.mur...@arm.com> wrote:
> Hi Rob,
>
> On 10/02/17 18:41, Rob Clark wrote:
>> For devices with iommu(s) in secure mode, we cannot touch global
>> registers, and we have to live with the context -> sid mapping that
>> the secure world has set up for us.
>>
>> This enables, for example db410c (apq8016) devices to use the up-
>> stream arm-smmu driver.  This is the last major hurdle for having
>> the gpu working out of the box on an upstream kernel.
>>
>> NOTE: at this point, it works but I haven't spent any time thinking
>> much about the bindings.  Since we can't read the global regs, we
>> need to get all the device config info from DT (or at least anything
>> that can't be hard-coded).
>
> This approach seems, I have to say, unworkably horrible. I'm absolutely
> against the idea of pretending we have access to global state which we
> don't, then piling a facsimile of that state into the DT for no reason
> other than to keep the overcomplicated pretence up. This configuration
> comes out looking and behaving like a discrete IOMMU (see e.g.
> rockchip-iommu for inspiration) - if we have to support it, it would
> make far more sense to simply describe what we have, i.e. a
> "qcom,apq8016-smmu-context-bank" with a single interrupt and
> #iommu-cells = <0>. They'd want their own probe routine and private data
> (AFAICS more or less just a base address, an IRQ, ID features and an
> iommu_group), and probably a separate iommu_ops because you'd want to
> handle {add,remove}_device() and device_group() significantly
> differently. By the time we get up to {attach,remove}_dev() it might be
> clean enough to dynamically handle both cases in the same code,
> especially with a little general refactor of the ARM_SMMU_CB*
> arithmetic, and once we get to operating on arm_smmu_domains there
> should be no real difference.

fair enough.. I'll give this approach a try.

I guess at the end of the day, the code sharing might just amount to
arm_smmu_init_context_bank_ctx() (ie. half of what was
arm_smmu_init_context_bank()).  Possibly just splitting out a header
w/ the register offset #defines is the extent of the code sharing that
would make sense.

>> Also, this only works for non-secure contexts.  For secure contexts,
>> even things like map/unmap need to go via scm, so that bypasses
>> even more of arm-smmu.  I'm not sure if it is better to have all of
>> those special paths added to arm-smmu.  Or have a 2nd iommu driver
>> for the secure contexts.  (I think the second path only works since
>> I don't think the CPU is really touching the hw at all for secure
>> contexts.)
>>
>> Not in lieu of bindings docs (which will come before this is more
>> than just an RFC), but here is an example of what the DT bindings
>> look like:
>>
>>   gpu_iommu: qcom,iommu@1f0 {
>>   compatible = "qcom,smmu-v2";
>>   reg = <0x1f0 0x1>;
>>
>>   #global-interrupts = <1>;
>>   interrupts =
>>   ,// global
>>   ,   // unk0
>>   ,   // gfx3d_user
>>   ;   // gfx3d_priv
>>
>>   qcom,stream-to-cb = <
>>   0x0002   // unk0
>>   0x   // gfx3d_user
>>   0x0001   // gfx3d_priv
>>   >;
>>
>>   #iommu-cells = <1>;
>>
>>   clocks = < GCC_SMMU_CFG_CLK>,
>>< GCC_GFX_TCU_CLK>;
>>   clock-names = "smmu_iface_clk", "smmu_bus_clk";
>>
>>   qcom,cb-count = <3>;
>>   qcom,iommu-secure-id = <18>;
>>   qcom,mapping-groups-count = <3>;
>>
>>   status = "okay";
>>   };
>>
>> Since we must live with the assignment of stream-id's to context bank
>> mapping that the secure world has set up for us, the "qcom,stream-to-cb"
>> binding gives a mapping table of sid to cb.  (Indexed by cb, value is
>> the sid.)  I'm not 100% sure what to do about devices with multiple
>> sid's..  if I understand how things work properly, we could probably
>> make the values in this table the result of OR'ing all the sids
>> together.
>>
>> The "qcom,cb-count" and "qcom,mapping-groups-count" can go away and
>> be computed from "qcom,stream-to-cb".  I just haven't done that yet

[RFC 2/2] iommu/arm-smmu: support qcom implementation

2017-02-10 Thread Rob Clark
For devices with iommu(s) in secure mode, we cannot touch global
registers, and we have to live with the context -> sid mapping that
the secure world has set up for us.

This enables, for example db410c (apq8016) devices to use the up-
stream arm-smmu driver.  This is the last major hurdle for having
the gpu working out of the box on an upstream kernel.

NOTE: at this point, it works but I haven't spent any time thinking
much about the bindings.  Since we can't read the global regs, we
need to get all the device config info from DT (or at least anything
that can't be hard-coded).

Also, this only works for non-secure contexts.  For secure contexts,
even things like map/unmap need to go via scm, so that bypasses
even more of arm-smmu.  I'm not sure if it is better to have all of
those special paths added to arm-smmu.  Or have a 2nd iommu driver
for the secure contexts.  (I think the second path only works since
I don't think the CPU is really touching the hw at all for secure
contexts.)

Not in lieu of bindings docs (which will come before this is more
than just an RFC), but here is an example of what the DT bindings
look like:

gpu_iommu: qcom,iommu@1f0 {
compatible = "qcom,smmu-v2";
reg = <0x1f0 0x1>;

#global-interrupts = <1>;
interrupts =
,// global
,   // unk0
,   // gfx3d_user
;   // gfx3d_priv

qcom,stream-to-cb = <
0x0002   // unk0
0x   // gfx3d_user
0x0001   // gfx3d_priv
>;

#iommu-cells = <1>;

clocks = < GCC_SMMU_CFG_CLK>,
 < GCC_GFX_TCU_CLK>;
clock-names = "smmu_iface_clk", "smmu_bus_clk";

qcom,cb-count = <3>;
qcom,iommu-secure-id = <18>;
qcom,mapping-groups-count = <3>;

status = "okay";
};

Since we must live with the assignment of stream-id's to context bank
mapping that the secure world has set up for us, the "qcom,stream-to-cb"
binding gives a mapping table of sid to cb.  (Indexed by cb, value is
the sid.)  I'm not 100% sure what to do about devices with multiple
sid's..  if I understand how things work properly, we could probably
make the values in this table the result of OR'ing all the sids
together.

The "qcom,cb-count" and "qcom,mapping-groups-count" can go away and
be computed from "qcom,stream-to-cb".  I just haven't done that yet.
The "qcom,iommu-secure-id" field is needed for the one scm call needed
for non-secure contexts for initial configuration.
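
For illustration only: a minimal userspace C sketch of how a table
shaped like "qcom,stream-to-cb" (indexed by cb, value is the sid)
could be used, and how the cb count could be derived from the property
length instead of a separate "qcom,cb-count".  The function names and
any table values used with them are hypothetical, not the values from
the binding above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Look up the context bank for a stream id.  The table is indexed by
 * cb and each entry holds the sid the secure world assigned to it.
 * Returns the cb index, or -1 if no cb is mapped to that sid.
 */
int sid_to_cb(const uint32_t *stream_to_cb, size_t cb_count, uint32_t sid)
{
	assert(stream_to_cb != NULL || cb_count == 0);
	for (size_t cb = 0; cb < cb_count; cb++)
		if (stream_to_cb[cb] == sid)
			return (int)cb;
	return -1;
}

/* One u32 cell per context bank, so the count follows from the length. */
size_t cb_count_from_prop_len(size_t prop_len_bytes)
{
	return prop_len_bytes / sizeof(uint32_t);
}
```

This sketch assumes exactly one sid per cb; devices with multiple
sids would need a different matching rule.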

Anyways, at this point, I'm mostly just looking for feedback about
whether this is the best way forward, vs introducing a separate iommu
driver, and any suggestions anyone might have.  And any ideas about how
to best handle the secure context banks, since I think we have no
choice but to use them for venus (the video enc/dec block).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu.c | 233 +++
 1 file changed, 217 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index 938417e..7b7d05f 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -52,6 +52,9 @@
 #include 
 #include 
 
+#include 
+
 #include 
 
 #include "io-pgtable.h"
@@ -309,6 +312,7 @@ enum arm_smmu_implementation {
GENERIC_SMMU,
ARM_MMU500,
CAVIUM_SMMUV2,
+   QCOM_SMMUV2,
 };
 
 struct arm_smmu_s2cr {
@@ -380,6 +384,11 @@ struct arm_smmu_device {
struct arm_smmu_s2cr*s2crs;
struct mutexstream_map_mutex;
 
+   void __iomem*local_base; /* specific to qcom */
+   u32 *stream_map; /* specific to qcom */
+   u32 sec_id;  /* specific to qcom */
+   DECLARE_BITMAP(context_init, ARM_SMMU_MAX_CBS); /* specific to qcom */
+
unsigned long   va_size;
unsigned long   ipa_size;
unsigned long   pa_size;
@@ -599,6 +608,9 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu)
int count = 0;
void __iomem *gr0_base = ARM_SMMU_GR0(smmu);
 
+   if (smmu->model == QCOM_SMMUV2)
+   return;
+
writel_relaxed(0, gr0_base + ARM_SMMU_GR0_sTLBGSYNC);
while (readl_relaxed(gr0_base + ARM_SMMU_GR0_sTLBGSTATUS)
   & sTLBGSTATUS_GSACTIVE) {
@@ -739,19 +751,17 @@ static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
ret

[PATCH 1/2] firmware/qcom: add qcom_scm_restore_sec_cfg()

2017-02-10 Thread Rob Clark
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/firmware/qcom_scm-32.c |  6 ++
 drivers/firmware/qcom_scm-64.c | 16 
 drivers/firmware/qcom_scm.c|  6 ++
 drivers/firmware/qcom_scm.h|  5 +
 include/linux/qcom_scm.h   |  2 ++
 5 files changed, 35 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 8ad226c..722e65a 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -578,3 +578,9 @@ int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
 
return ret ? : le32_to_cpu(scm_ret);
 }
+
+int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
+  u32 spare)
+{
+   return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index c933259..550e3a3 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -381,3 +381,19 @@ int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
 
return ret ? : res.a1;
 }
+
+int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id, u32 spare)
+{
+   struct qcom_scm_desc desc = {0};
+   struct arm_smccc_res res;
+   int ret;
+
+   desc.args[0] = device_id;
+   desc.args[1] = spare;
+   desc.arginfo = QCOM_SCM_ARGS(2);
+
+   ret = qcom_scm_call(dev, QCOM_SCM_SVC_MP, QCOM_SCM_RESTORE_SEC_CFG,
+   &desc, &res);
+
+   return ret ? : res.a1;
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index d987bcc..ae1f473 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -315,6 +315,12 @@ static const struct reset_control_ops qcom_scm_pas_reset_ops = {
.deassert = qcom_scm_pas_reset_deassert,
 };
 
+int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare)
+{
+   return __qcom_scm_restore_sec_cfg(__scm->dev, device_id, spare);
+}
+EXPORT_SYMBOL(qcom_scm_restore_sec_cfg);
+
 /**
  * qcom_scm_is_available() - Checks if SCM is available
  */
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index 6a0f154..31fc732 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -85,4 +85,9 @@ static inline int qcom_scm_remap_error(int err)
return -EINVAL;
 }
 
+#define QCOM_SCM_SVC_MP0xc
+#define QCOM_SCM_RESTORE_SEC_CFG   2
+extern int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
+ u32 spare);
+
 #endif
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index d32f6f1..22017f5d 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -40,6 +40,7 @@ extern int qcom_scm_pas_shutdown(u32 peripheral);
 extern void qcom_scm_cpu_power_down(u32 flags);
 extern u32 qcom_scm_get_version(void);
 extern int qcom_scm_set_remote_state(u32 state, u32 id);
+extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
 #else
 static inline
 int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
@@ -67,5 +68,6 @@ static inline void qcom_scm_cpu_power_down(u32 flags) {}
 static inline u32 qcom_scm_get_version(void) { return 0; }
 static inline u32
 qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; }
+static inline int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare) { return -ENODEV; }
 #endif
 #endif
-- 
2.9.3



[PATCH 2/2] iommu: add qcom_iommu

2017-02-15 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
Feel free to bikeshed the name, I just had to pick something.  Maybe
we should go back to what downstream calls it (ie. msm-iommu-v1)

Also unsure about the compatible string.  Possibly it should be more
generic, since 8x74 and 8x84 and probably a bunch of others want to
use this same driver.  Although I'm not entirely sure whether they
use the same pagetable format configuration, so we might just want
to keep the SoC name in the compat string for making those sorts of
decisions.

 .../devicetree/bindings/iommu/qcom,iommu.txt   |  45 ++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 699 +
 4 files changed, 755 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index 000..78a8d65
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,45 @@
+* QCOM IOMMU Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space.
+This is modelled as separate IOMMU devices which have just a single
+master.
+
+** Required properties:
+
+- compatible: Should be one of:
+
+"qcom,msm8916-iommu-context-bank"
+
+  depending on the particular implementation and/or the
+  version of the architecture implemented.
+
+- reg   : Base address and size of the SMMU.  And optionally,
+  if present, the "smmu_local_base"
+
+- interrupts: The context fault irq.
+
+- #iommu-cells  : Must be 0
+
+- qcom,iommu-ctx-asid   : context ASID
+
+- qcom,iommu-secure-id  : secure-id
+
+- clocks: The interface clock (iface_clk) and bus clock (bus_clk)
+
+** Examples:
+
+   mdp_iommu: iommu-context-bank@1e24000 {
+   compatible = "qcom,msm8916-iommu-context-bank";
+   reg = <0x1e24000 0x1000
+   0x1ef 0x3000>;
+   reg-names = "iommu_base", "smmu_local_base";
+   interrupts = ;
+   qcom,iommu-ctx-asid = <4>;
+   qcom,iommu-secure-id = <17>;
+   #iommu-cells = <0>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_APSS_TCU_CLK>;
+   clock-names = "iface_clk", "bus_clk";
+   status = "okay";
+   };
diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 8ee54d7..631e1cd 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -362,4 +362,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   bool "Qualcomm IOMMU Support"
+   depends on ARM || ARM64
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..eb92d60
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,699 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C

[PATCH 1/2] iommu/arm-smmu: split out register defines

2017-02-15 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 225 ++
 drivers/iommu/arm-smmu.c  | 200 +
 2 files changed, 226 insertions(+), 199 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index 000..632240f
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,225 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE(1 << 3)
+#define sCR0_GCFGFRE   (1 << 4)
+#define sCR0_GCFGFIE   (1 << 5)
+#define sCR0_USFCFG(1 << 10)
+#define sCR0_VMIDPNE   (1 << 11)
+#define sCR0_PTM   (1 << 12)
+#define sCR0_FB(1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS   (1 << 30)
+#define ID0_S2TS   (1 << 29)
+#define ID0_NTS(1 << 28)
+#define ID0_SMS(1 << 27)
+#define ID0_ATOSNS (1 << 26)
+#define ID0_PTFS_NO_AARCH32(1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW   (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT0
+#define ID1_NUMCB_MASK 0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K(1 << 12)
+#define ID2_PTFS_16K   (1 << 13)
+#define ID2_PTFS_64K   (1 << 14)
+#define ID2_VMID16 (1 << 15)
+
+#define ID7_MAJOR_SHIFT4
+#define ID7_MAJOR_MASK 0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE   (1 << 0)
+#define TLB_LOOP_TIMEOUT   100 /* 1s! */

[PATCH 8/9] ARM64: DT: add video codec devicetree node

2017-03-01 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 arch/arm64/boot/dts/qcom/msm8916.dtsi | 28 
 1 file changed, 28 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index b0daf39..7bcf4cd 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -88,6 +88,13 @@
no-map;
};
 
+   venus_mem: venus@8990 {
+   compatible = "shared-dma-pool";
+   reg = <0x0 0x8990 0x0 0x80>;
+   alignment = <0x1000>;
+   no-map;
+   };
+
mba_mem: mba@8ea0 {
no-map;
reg = <0 0x8ea0 0 0x10>;
@@ -1190,6 +1197,27 @@
};
};
};
+
+   venus: video-codec@1d0 {
+   compatible = "qcom,msm8916-venus";
+   reg = <0x01d0 0xff000>;
+   interrupts = ;
+   power-domains = < VENUS_GDSC>;
+   clocks = < GCC_VENUS0_VCODEC0_CLK>,
+< GCC_VENUS0_AHB_CLK>,
+< GCC_VENUS0_AXI_CLK>;
+   clock-names = "core", "iface", "bus";
+   memory-region = <&venus_mem>;
+   status = "okay";
+
+   video-decoder {
+   compatible = "venus-decoder";
+   };
+
+   video-encoder {
+   compatible = "venus-encoder";
+   };
+   };
};
 
smd {
-- 
2.9.3



[PATCH 7/9] ARM64: DT: add gpu for msm8916

2017-03-01 Thread Rob Clark
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 arch/arm64/boot/dts/qcom/msm8916.dtsi | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index 68a8e67..b0daf39 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -698,6 +698,29 @@
#thermal-sensor-cells = <1>;
};
 
+   gpu@01c0 {
+   compatible = "qcom,adreno-306.0", "qcom,adreno";
+   reg = <0x01c0 0x2>;
+   reg-names = "kgsl_3d0_reg_memory";
+   interrupts = <0 33 0>;
+   interrupt-names = "kgsl_3d0_irq";
+   clock-names =
+   "core",
+   "iface",
+   "mem",
+   "mem_iface",
+   "alt_mem_iface",
+   "gfx3d_clk";
+   clocks =
+   < GCC_OXILI_GFX3D_CLK>,
+   < GCC_OXILI_AHB_CLK>,
+   < GCC_OXILI_GMEM_CLK>,
+   < GCC_BIMC_GFX_CLK>,
+   < GCC_BIMC_GPU_CLK>,
+   < GFX3D_CLK_SRC>;
+   power-domains = < OXILI_GDSC>;
+   };
+
mdss: mdss@1a0 {
compatible = "qcom,mdss";
reg = <0x1a0 0x1000>,
-- 
2.9.3



[PATCH 5/9] iommu: add qcom_iommu

2017-03-01 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/Kconfig |  10 +
 drivers/iommu/Makefile|   1 +
 drivers/iommu/arm-smmu-regs.h |   2 +
 drivers/iommu/qcom_iommu.c| 825 ++
 4 files changed, 838 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 37e204f..400a404 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,4 +359,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   bool "Qualcomm IOMMU Support"
+   depends on ARM || ARM64
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index 632240f..e643164 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -174,6 +174,8 @@ enum arm_smmu_s2cr_privcfg {
 #define ARM_SMMU_CB_S1_TLBIVAL 0x620
 #define ARM_SMMU_CB_S2_TLBIIPAS2   0x630
 #define ARM_SMMU_CB_S2_TLBIIPAS2L  0x638
+#define ARM_SMMU_CB_TLBSYNC0x7f0
+#define ARM_SMMU_CB_TLBSTATUS  0x7f4
 #define ARM_SMMU_CB_ATS1PR 0x800
 #define ARM_SMMU_CB_ATSR   0x8f0
 
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..5d3bb63
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,825 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#define pr_fmt(fmt) "qcom-iommu: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   struct list_head context_list;   /* list of qcom_iommu_context */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   unsigned int irq;
+   bool secure_init;
+   u32  asid;  /* asid and ctx bank # are 1:1 */
+   struct iommu_group  *group;
+   struct list_head node;  /* head in qcom_iommu_device::context_list */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+static struct platform_driver qcom_iommu_driver;
+
+static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
+{
+   if (WARN_ON(!fwspec || fwspec->ops != &qcom_iommu_ops))
+   return N

[PATCH 2/9] firmware: qcom_scm: add two scm calls for iommu secure page table

2017-03-01 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

These two new SCM calls are needed by the qcom-iommu driver in order
to initialize the secure iommu page table.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/firmware/qcom_scm-64.c | 42 ++
 drivers/firmware/qcom_scm.c| 12 
 drivers/firmware/qcom_scm.h|  6 ++
 include/linux/qcom_scm.h   |  2 ++
 4 files changed, 62 insertions(+)

diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 550e3a3..6e6d561 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -397,3 +397,45 @@ int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id, u32 spare)
 
return ret ? : res.a1;
 }
+
+int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
+ size_t *size)
+{
+   struct qcom_scm_desc desc = {0};
+   struct arm_smccc_res res;
+   int ret;
+
+   desc.args[0] = spare;
+   desc.arginfo = QCOM_SCM_ARGS(1);
+
+   ret = qcom_scm_call(dev, QCOM_SCM_SVC_MP,
+   QCOM_SCM_IOMMU_SECURE_PTBL_SIZE, &desc, &res);
+
+   if (size)
+   *size = res.a1;
+
+   return ret ? : res.a2;
+}
+
+int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr, u32 size,
+ u32 spare)
+{
+   struct qcom_scm_desc desc = {0};
+   struct arm_smccc_res res;
+   int ret;
+
+   desc.args[0] = addr;
+   desc.args[1] = size;
+   desc.args[2] = spare;
+   desc.arginfo = QCOM_SCM_ARGS(3, QCOM_SCM_RW, QCOM_SCM_VAL,
+QCOM_SCM_VAL);
+
+   ret = qcom_scm_call(dev, QCOM_SCM_SVC_MP,
+   QCOM_SCM_IOMMU_SECURE_PTBL_INIT, &desc, &res);
+
+   /* the pg table has been initialized already, ignore the error */
+   if (ret == -EPERM)
+   ret = 0;
+
+   return ret;
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index ae1f473..bb16510 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -321,6 +321,18 @@ int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare)
 }
 EXPORT_SYMBOL(qcom_scm_restore_sec_cfg);
 
+int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size)
+{
+   return __qcom_scm_iommu_secure_ptbl_size(__scm->dev, spare, size);
+}
+EXPORT_SYMBOL(qcom_scm_iommu_secure_ptbl_size);
+
+int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare)
+{
+   return __qcom_scm_iommu_secure_ptbl_init(__scm->dev, addr, size, spare);
+}
+EXPORT_SYMBOL(qcom_scm_iommu_secure_ptbl_init);
+
 /**
  * qcom_scm_is_available() - Checks if SCM is available
  */
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index 31fc732..9bea691 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -89,5 +89,11 @@ static inline int qcom_scm_remap_error(int err)
 #define QCOM_SCM_RESTORE_SEC_CFG   2
 extern int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
  u32 spare);
+#define QCOM_SCM_IOMMU_SECURE_PTBL_SIZE3
+#define QCOM_SCM_IOMMU_SECURE_PTBL_INIT4
+extern int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
+size_t *size);
+extern int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr,
+u32 size, u32 spare);
 
 #endif
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index 22017f5d..c46c3c2 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -41,6 +41,8 @@ extern void qcom_scm_cpu_power_down(u32 flags);
 extern u32 qcom_scm_get_version(void);
 extern int qcom_scm_set_remote_state(u32 state, u32 id);
 extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
+extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
+extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
 #else
 static inline
 int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
-- 
2.9.3



[PATCH 0/9] iommu: add qcom_iommu for early "B" family devices

2017-03-01 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Compared to the first version of the patchset, the bindings have changed
somewhat, which necessitated some changes in the structure of the driver.
It turns out that even though the global register space is not accessible,
we do have (for IOMMUs that contain secure contexts) a separate
non-standard global register space where we need to configure the routing
of irqs.  And we could not just assign this register range to each
context-bank node.  So now we have a single iommu device which contains
all of its context banks:

apps_iommu: msm-iommu-v1@1e2 {
#address-cells = <1>;
#size-cells = <1>;
#iommu-cells = <1>;
compatible = "qcom,msm-iommu-v1";
ranges = <0 0x1e2 0x4>;
reg = <0x1ef 0x3000>;
clocks = < GCC_SMMU_CFG_CLK>,
 < GCC_APSS_TCU_CLK>;
clock-names = "iface_clk", "bus_clk";
qcom,iommu-secure-id = <17>;

// mdp_0:
msm-iommu-v1-ctx@4000 {
compatible = "qcom,msm-iommu-v1-ns";
reg = <0x4000 0x1000>;
interrupts = ;
};

// venus_ns:
msm-iommu-v1-ctx@5000 {
compatible = "qcom,msm-iommu-v1-sec";
reg = <0x5000 0x1000>;
interrupts = ;
};
};

gpu_iommu: msm-iommu-v1@1f08000 {
...
};

There are a couple vaguely unrelated patches to add venus and gpu dt nodes,
so that we have something to wire up the iommu to.

These patches apply on top of some in-flight patches to support IOMMU
probe deferral.  You can find full branch on top of linux-next here:

   git://people.freedesktop.org/~robclark/linux next-20170228-db410c-qcom-smmu-3-venus

or github if you prefer:

   
https://github.com/freedreno/kernel-msm/commits/next-20170228-db410c-qcom-smmu-3-venus


Rob Clark (6):
  firmware/qcom: add qcom_scm_restore_sec_cfg()
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu
  ARM64: DT: add gpu for msm8916
  ARM64: DT: add iommu for msm8916

Stanimir Varbanov (3):
  firmware: qcom_scm: add two scm calls for iommu secure page table
  iommu: qcom: initialize secure page table
  ARM64: DT: add video codec devicetree node

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 106 +++
 arch/arm64/boot/dts/qcom/msm8916.dtsi  | 108 +++
 drivers/firmware/qcom_scm-32.c |   6 +
 drivers/firmware/qcom_scm-64.c |  58 ++
 drivers/firmware/qcom_scm.c|  18 +
 drivers/firmware/qcom_scm.h|  11 +
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 227 ++
 drivers/iommu/arm-smmu.c   | 200 +
 drivers/iommu/qcom_iommu.c | 889 +
 include/linux/qcom_scm.h   |   4 +
 12 files changed, 1439 insertions(+), 199 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.9.3



[PATCH 1/9] firmware/qcom: add qcom_scm_restore_sec_cfg()

2017-03-01 Thread Rob Clark
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/firmware/qcom_scm-32.c |  6 ++
 drivers/firmware/qcom_scm-64.c | 16 
 drivers/firmware/qcom_scm.c|  6 ++
 drivers/firmware/qcom_scm.h|  5 +
 include/linux/qcom_scm.h   |  2 ++
 5 files changed, 35 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 8ad226c..722e65a 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -578,3 +578,9 @@ int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
 
return ret ? : le32_to_cpu(scm_ret);
 }
+
+int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
+  u32 spare)
+{
+   return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index c933259..550e3a3 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -381,3 +381,19 @@ int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
 
return ret ? : res.a1;
 }
+
+int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id, u32 spare)
+{
+   struct qcom_scm_desc desc = {0};
+   struct arm_smccc_res res;
+   int ret;
+
+   desc.args[0] = device_id;
+   desc.args[1] = spare;
+   desc.arginfo = QCOM_SCM_ARGS(2);
+
+   ret = qcom_scm_call(dev, QCOM_SCM_SVC_MP, QCOM_SCM_RESTORE_SEC_CFG,
+   &desc, &res);
+
+   return ret ? : res.a1;
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index d987bcc..ae1f473 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -315,6 +315,12 @@ static const struct reset_control_ops qcom_scm_pas_reset_ops = {
.deassert = qcom_scm_pas_reset_deassert,
 };
 
+int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare)
+{
+   return __qcom_scm_restore_sec_cfg(__scm->dev, device_id, spare);
+}
+EXPORT_SYMBOL(qcom_scm_restore_sec_cfg);
+
 /**
  * qcom_scm_is_available() - Checks if SCM is available
  */
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index 6a0f154..31fc732 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -85,4 +85,9 @@ static inline int qcom_scm_remap_error(int err)
return -EINVAL;
 }
 
+#define QCOM_SCM_SVC_MP0xc
+#define QCOM_SCM_RESTORE_SEC_CFG   2
+extern int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
+ u32 spare);
+
 #endif
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index d32f6f1..22017f5d 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -40,6 +40,7 @@ extern int qcom_scm_pas_shutdown(u32 peripheral);
 extern void qcom_scm_cpu_power_down(u32 flags);
 extern u32 qcom_scm_get_version(void);
 extern int qcom_scm_set_remote_state(u32 state, u32 id);
+extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
 #else
 static inline
 int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
@@ -67,5 +68,6 @@ static inline void qcom_scm_cpu_power_down(u32 flags) {}
 static inline u32 qcom_scm_get_version(void) { return 0; }
 static inline u32
 qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; }
+static inline int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare) { return -ENODEV; }
 #endif
 #endif
-- 
2.9.3



[PATCH 6/9] iommu: qcom: initialize secure page table

2017-03-01 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory,
and passes the physical address back to the trusted zone.
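
The flow described above (query the size, allocate, hand the buffer to
the secure side, and do so only once) can be sketched in plain userspace
C as below.  This is only a sketch under stated assumptions: the
fake_scm_*() functions are hypothetical stand-ins for the two SCM calls,
malloc() stands in for the DMA allocation, and the 4096-byte size is
made up.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two SCM calls; no firmware involved. */
static bool fw_initialized;

static int fake_scm_ptbl_size(size_t *size)
{
	*size = 4096;	/* made-up size, for illustration only */
	return 0;
}

static int fake_scm_ptbl_init(uintptr_t addr, size_t size)
{
	if (!addr || !size)
		return -EINVAL;
	fw_initialized = true;
	return 0;
}

/* Query the size, allocate a buffer, and hand it over exactly once. */
static int sec_ptbl_init_once(void)
{
	static bool allocated;
	size_t psize = 0;
	void *buf;
	int ret;

	if (allocated)
		return 0;

	ret = fake_scm_ptbl_size(&psize);
	if (ret)
		return ret;

	buf = malloc(psize);
	if (!buf)
		return -ENOMEM;

	ret = fake_scm_ptbl_init((uintptr_t)buf, psize);
	if (ret) {
		free(buf);
		return ret;
	}

	/* The buffer now belongs to the "firmware" and is never freed. */
	allocated = true;
	return 0;
}
```

Calling sec_ptbl_init_once() a second time returns 0 without allocating
again, which mirrors the one-shot guard used in the patch.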

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 5d3bb63..082ece7 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -608,6 +608,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+   return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+   if (ret) {
+   dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+   ret);
+   return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+   dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+   psize);
+   return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+   dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+   goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int qcom_iommu_ctx_probe(struct platform_device *pdev)
 {
struct qcom_iommu_ctx *ctx;
@@ -693,6 +738,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
 };
 module_platform_driver(qcom_iommu_ctx_driver);
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+   if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+   return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
struct qcom_iommu_dev *qcom_iommu;
@@ -731,6 +787,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+   ret = qcom_iommu_sec_ptbl_init(dev);
+   if (ret) {
+   dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+   return ret;
+   }
+   }
+
platform_set_drvdata(pdev, qcom_iommu);
 
/* register context bank devices, which are child nodes: */
-- 
2.9.3
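
As a rough illustration of the flow in qcom_iommu_sec_ptbl_init() above
(query the size, allocate, hand the buffer to the secure world, free only
if the hand-over fails), here is a user-space sketch.  The fake_scm_*
helpers, the malloc() backing and the counters are assumptions made up for
the demo; they are not the qcom_scm API and not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the "how big is the table?" secure call. */
static int fake_scm_ptbl_size(size_t *size)
{
	*size = 4096;
	return 0;
}

/* Hypothetical stand-in for handing the buffer to the secure world. */
static int fake_scm_ptbl_init(void *buf, size_t size, bool fail)
{
	assert(buf != NULL && size > 0);
	return fail ? -EINVAL : 0;
}

static bool ptbl_allocated;	/* mirrors the static "allocated" flag */
static void *ptbl_buf;		/* kept once the secure world owns it */
static int ptbl_live_allocs;	/* demo-only bookkeeping */

static int sec_ptbl_init(bool fail_handover)
{
	size_t size = 0;
	void *buf;
	int ret;

	if (ptbl_allocated)
		return 0;	/* already set up, nothing to do */

	ret = fake_scm_ptbl_size(&size);
	if (ret)
		return ret;

	buf = malloc(size);
	if (!buf)
		return -ENOMEM;
	ptbl_live_allocs++;

	ret = fake_scm_ptbl_init(buf, size, fail_handover);
	if (ret)
		goto free_mem;

	ptbl_buf = buf;		/* intentionally never freed after success */
	ptbl_allocated = true;
	return 0;

free_mem:
	free(buf);
	ptbl_live_allocs--;
	return ret;
}
```

Calling sec_ptbl_init(true) returns the hand-over error and leaves nothing
allocated; a later sec_ptbl_init(false) succeeds, and any call after that
returns 0 without allocating again.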



[PATCH 4/9] iommu: arm-smmu: split out register defines

2017-03-01 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 225 ++
 drivers/iommu/arm-smmu.c  | 200 +
 2 files changed, 226 insertions(+), 199 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index 000..632240f
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,225 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE(1 << 3)
+#define sCR0_GCFGFRE   (1 << 4)
+#define sCR0_GCFGFIE   (1 << 5)
+#define sCR0_USFCFG(1 << 10)
+#define sCR0_VMIDPNE   (1 << 11)
+#define sCR0_PTM   (1 << 12)
+#define sCR0_FB(1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS   (1 << 30)
+#define ID0_S2TS   (1 << 29)
+#define ID0_NTS(1 << 28)
+#define ID0_SMS(1 << 27)
+#define ID0_ATOSNS (1 << 26)
+#define ID0_PTFS_NO_AARCH32(1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW   (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT0
+#define ID1_NUMCB_MASK 0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K(1 << 12)
+#define ID2_PTFS_16K   (1 << 13)
+#define ID2_PTFS_64K   (1 << 14)
+#define ID2_VMID16 (1 << 15)
+
+#define ID7_MAJOR_SHIFT4
+#define ID7_MAJOR_MASK 0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE   (1 << 0)
+#define TLB_LOOP_TIMEOUT   100 /* 1s! */
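
For readers unfamiliar with the SHIFT/MASK convention used by these
defines, a small sketch of how a field is pulled out of a register value.
The ID2_* values are copied from the header above; reg_field() and
reg_has() are hypothetical helpers for illustration and do not appear in
arm-smmu.c.

```c
#include <assert.h>
#include <stdint.h>

/* Copied from the ID2 block of arm-smmu-regs.h in this patch. */
#define ID2_OAS_SHIFT	4
#define ID2_OAS_MASK	0xf
#define ID2_IAS_SHIFT	0
#define ID2_IAS_MASK	0xf
#define ID2_PTFS_4K	(1 << 12)

/* Extract a multi-bit field: shift down first, then apply the unshifted mask. */
static inline uint32_t reg_field(uint32_t reg, unsigned int shift, uint32_t mask)
{
	assert(shift < 32);
	return (reg >> shift) & mask;
}

/* Test a single-bit flag such as ID2_PTFS_4K. */
static inline int reg_has(uint32_t reg, uint32_t bit)
{
	return (reg & bit) != 0;
}
```

With a made-up ID2 value of 0x1025, reg_field(..., ID2_OAS_SHIFT,
ID2_OAS_MASK) gives 2, the IAS field gives 5, and reg_has(...,
ID2_PTFS_4K) is true.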

[PATCH 3/9] Docs: dt: document qcom iommu bindings

2017-03-01 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 106 +
 1 file changed, 106 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index 000..2e69b78
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,106 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be "qcom,msm-iommu-v1".
+- clocks   : The interface clock (iface_clk) and bus clock (bus_clk).
+- #address-cells   : must be 1.
+- #size-cells  : must be 1.
+- #iommu-cells : Must be 1.
+- ranges   : Base address and size of the iommu context banks.
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: msm-iommu-v1@1e2 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm-iommu-v1";
+   ranges = <0 0x1e2 0x4>;
+   reg = <0x1ef 0x3000>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_APSS_TCU_CLK>;
+   clock-names = "iface_clk", "bus_clk";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   msm-iommu-v1-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   msm-iommu-v1-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: msm-iommu-v1@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x1>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_GFX_TCU_CLK>;
+   clock-names = "iface_clk", "bus_clk";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   msm-iommu-v1-ctx@1f09000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   msm-iommu-v1-ctx@1f0a000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+   ...
+   iommus = <&apps_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+   ...
+   iommus = <&apps_iommu 4>;
+   };
+
+   gpu@01c0 {
+   ...
+   iommus = <&gpu_iommu 1>, <&gpu_iommu 2>;
+   };
-- 
2.9.3



Re: [PATCH 5/9] iommu: add qcom_iommu

2017-03-01 Thread Rob Clark
On Wed, Mar 1, 2017 at 6:54 PM, Stephen Boyd <sb...@codeaurora.org> wrote:
> On 03/01/2017 09:42 AM, Rob Clark wrote:
>> diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
>> new file mode 100644
>> index 000..5d3bb63
>> --- /dev/null
>> +++ b/drivers/iommu/qcom_iommu.c
>> @@ -0,0 +1,825 @@
>> +/*
>> + * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write to the Free Software
>> + * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
>> + *
>> + * Copyright (C) 2013 ARM Limited
>> + * Copyright (C) 2017 Red Hat
>> + */
>> +
>> +#define pr_fmt(fmt) "qcom-iommu: " fmt
>> +
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>> +#include 
>
> mutex.h?

added

>> +
>> +#include "io-pgtable.h"
>> +#include "arm-smmu-regs.h"
>> +
>> +#define SMMU_INTR_SEL_NS 0x2000
>> +
>> +struct qcom_iommu_dev {
>> + /* IOMMU core code handle */
>> + struct iommu_device  iommu;
>> + struct device   *dev;
>> + struct clk  *iface_clk;
>> + struct clk  *bus_clk;
>> + void __iomem*local_base;
>> + u32  sec_id;
>> + struct list_head context_list;   /* list of qcom_iommu_context */
>> +};
>> +
>> +struct qcom_iommu_ctx {
>> + struct device   *dev;
>> + void __iomem*base;
>> + unsigned int irq;
>> + bool secure_init;
>> + u32  asid;  /* asid and ctx bank # are 1:1 */
>> + struct iommu_group  *group;
>> + struct list_head node;  /* head in qcom_iommu_device::context_list */
>> +};
>> +
>> +struct qcom_iommu_domain {
>> + struct io_pgtable_ops   *pgtbl_ops;
>> + spinlock_t   pgtbl_lock;
>> + struct mutex init_mutex; /* Protects iommu pointer */
>> + struct iommu_domain  domain;
>> + struct qcom_iommu_dev   *iommu;
>> +};
>> +
>> +static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
>> +{
>> + return container_of(dom, struct qcom_iommu_domain, domain);
>> +}
>> +
>> +static const struct iommu_ops qcom_iommu_ops;
>> +static struct platform_driver qcom_iommu_driver;
>
> Why forward declared?

qcom_iommu_driver fwd declaration can be dropped.. qcom_iommu_ops is
still needed since it is passed through (indirectly) to
alloc_io_pgtable_ops() from qcom_iommu_attach_dev() (one of the ops
fxns)
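
To make the circular reference concrete, a stripped-down sketch of the
same pattern: an ops table has to be declared before a function that takes
its address, while that function is itself one of the table's members.
All demo_* names are made up for illustration; they only mimic the shape
of qcom_iommu_ops and qcom_iommu_attach_dev().

```c
#include <assert.h>
#include <stddef.h>

struct demo_ops {
	int (*attach)(int id);
};

/* Tentative definition: lets code below take &demo_ops before the initializer. */
static const struct demo_ops demo_ops;

/* Stand-in for a helper that is handed the ops pointer (indirectly in the driver). */
static const struct demo_ops *demo_alloc(const struct demo_ops *ops)
{
	assert(ops != NULL);
	return ops;
}

/* One of the ops functions; it needs &demo_ops, hence the forward declaration. */
static int demo_attach(int id)
{
	const struct demo_ops *ops = demo_alloc(&demo_ops);

	return ops->attach == demo_attach ? id : -1;
}

/* The real definition, which in turn refers back to demo_attach(). */
static const struct demo_ops demo_ops = {
	.attach = demo_attach,
};
```

Nothing references the platform driver struct before its definition in
the same way, which is why that second forward declaration can go.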

>> +
>> +static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
>> +{
>> + if (WARN_ON(!fwspec || fwspec->ops != &qcom_iommu_ops))
>> + return NULL;
>> + return fwspec->iommu_priv;
>> +}
>> +
>> +static struct qcom_iommu_ctx * __to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
>> +{
>> + struct qcom_iommu_dev *qcom_iommu = __to_iommu(fwspec);
>> + struct qcom_iommu_ctx *ctx;
>> +
>> + if (!qcom_iommu)
>> + return NULL;
>> +
>> + list_for_each_entry(ctx, &qcom_iommu->context_list, node)
>> + if (ctx->asid == asid)
>> + return ctx;
>> +
>> + WARN(1, "no ctx for asid %u\n", asid);
>> + return NULL;
>> +}
>> +
>> +static inline void
>> +iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
>> +{
>> + writel_relaxed(val, ctx->base + reg);
>> +}

Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-11 Thread Rob Clark
On Wed, Jan 11, 2017 at 4:36 AM, Will Deacon <will.dea...@arm.com> wrote:
> On Tue, Jan 10, 2017 at 02:20:13PM -0500, Rob Clark wrote:
>> On Tue, Jan 10, 2017 at 12:52 PM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Fri, Jan 06, 2017 at 11:26:49AM -0500, Rob Clark wrote:
>> >> Hmm, well we install the fault handler on the iommu_domain..  perhaps
>> >> maybe a combo of dts property (or deciding based on more specific
>> >> compat string), plus extra param passed in to
>> >> iommu_set_fault_handler().  The dts property or compat string to
>> >> indicate whether the iommu (and how it is wired up) can handle stalls,
>> >> and enable_stall param when fault handler is registered to indicate
>> >> whether the device itself can cope.. if either can't do stalling, then
>> >> don't set CFCFG.
>> >
>> > I thought about this some more, and I think you're right. Having
>> > iommu_set_fault_handler take a flags parameter indicating that, for example,
>> > the fault handler can deal with paging, is all we need to implement the
>> > per-master opt-in functionality for stalling faults. There's no real
>> > requirement to standardise a generic firmware property for that (but
>> > we still need *something* that says stalling is usable on the SMMU --
>> > perhaps just the compatible string is ok).
>>
>> btw, it occurred to me that maybe it should be flags param to
>> iommu_attach_device() (just in case fault handler not installed?)
>> otoh stalling without a fault handler is silly, but I guess we need it
>> to infer whether stalling can be supported by other devices on same
>> iommu.. tbh I'm on a bit shaky ground when it comes to multiple
>> devices per iommu since the SoC's I'm familiar with do it the other
>> way around.  But I guess you have thought more about the multi-device
>> case, so figured I should suggest it..
>
> I don't think it works at attach time, because the stalling property belongs
> to the domain, rather than the individual devices within it. Similarly, I
> don't think we should allow this property to be toggled once devices have
> been attached.
>

hmm, I was more thinking of cases where drivers for particular devices
need some work (ie. like potentially disabling hw hang detect during
faults).. I guess we could have three levels, that all have to be true
in order to enable stall: smmu, domain (pass flags in to
iommu_domain_alloc()??), and device (iommu_attach_device())?

BR,
-R
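
A minimal sketch of the three-level idea above (smmu, domain and device
all have to agree before stalling is enabled).  The function and its
parameters are hypothetical; no such helper or flags exist in the IOMMU
API today.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical policy check: stalling (CFCFG) is enabled only if the SMMU
 * integration can stall, the domain asked for it, and every device
 * attached to the domain declared that it can cope.
 */
static bool demo_stall_enabled(bool smmu_can_stall, bool domain_wants_stall,
			       const bool *dev_can_stall, size_t ndevs)
{
	assert(dev_can_stall != NULL || ndevs == 0);

	if (!smmu_can_stall || !domain_wants_stall)
		return false;

	for (size_t i = 0; i < ndevs; i++)
		if (!dev_can_stall[i])
			return false;

	return true;
}
```

A single device that cannot cope (for example one whose hw hang detect
would fire) is enough to keep stalling off for the whole domain.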


Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-10 Thread Rob Clark
On Tue, Jan 10, 2017 at 12:52 PM, Will Deacon <will.dea...@arm.com> wrote:
> Hi Rob,
>
> On Fri, Jan 06, 2017 at 11:26:49AM -0500, Rob Clark wrote:
>> On Thu, Jan 5, 2017 at 10:49 AM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Thu, Jan 05, 2017 at 10:27:27AM -0500, Rob Clark wrote:
>> >> I'm not sure if the better solution then would be to have two fault
>> >> callbacks, one immediately from the IRQ and a later one from wq.  Or
>> >> let the driver handle the wq business and give it a way to tell the
>> >> IOMMU when to resume.
>> >>
>> >> I kinda think we should punt on the worker thread for now until we are
>> >> ready to resume faulting transactions, because I guess there's a strong chance
>> >> that whatever way we do it now will be wrong ;-)
>> >
>> > I guess what I'm after is for you to change the interrupt handlers to be
>> > threaded, like they are for SMMUv3. I *think* you can do that with a NULL
>> > thread_fn for now, and just call report_iommu_fault from the handler.
>> > The return value of that could, in theory, be used to queue the paging
>> > request and wake the paging thread in future.
>>
>> If we only pass in the non-threaded irq fxn, I'm not really sure how
>> that changes anything.. or maybe I'm not understanding what you mean.
>>
>> But yeah, I guess we could use request_threaded_irq() to get both IRQ
>> context notification and a later thread context notification rather
>> than doing the wq thing.  Either way the iommu API has to change
>> slightly.
>>
>> >> > I wonder if this should also be predicated on the compatible string, so
>> >> > that the "arm,smmu-enable-stall" property is ignored (with a warning) if
>> >> > the compatible string isn't specific enough to identify an implementation
>> >> > with the required SS behaviour? On the other hand, it feels pretty
>> >> > redundant and a single "stalling works" property is all we need.
>> >>
>> >> We could also drop the property and key the behavior on specific
>> >> compat strings I guess.  Having both seems a bit odd.  Anyways, I'll
>> >> defer to DT folks about what the cleaner approach is.
>> >
>> > As Robin pointed out, we do need to be able to distinguish the integration
>> > of the device from the device itself. For example, MMU-9000 might be capable
>> > of stalling, but if it's bolted to a PCI RC, it's not safe to do so.
>>
>> Hmm, well we install the fault handler on the iommu_domain..  perhaps
>> maybe a combo of dts property (or deciding based on more specific
>> compat string), plus extra param passed in to
>> iommu_set_fault_handler().  The dts property or compat string to
>> indicate whether the iommu (and how it is wired up) can handle stalls,
>> and enable_stall param when fault handler is registered to indicate
>> whether the device itself can cope.. if either can't do stalling, then
>> don't set CFCFG.
>
> I thought about this some more, and I think you're right. Having
> iommu_set_fault_handler take a flags parameter indicating that, for example,
> the fault handler can deal with paging, is all we need to implement the
> per-master opt-in functionality for stalling faults. There's no real
> requirement to standardise a generic firmware property for that (but
> we still need *something* that says stalling is usable on the SMMU --
> perhaps just the compatible string is ok).

btw, it occurred to me that maybe it should be flags param to
iommu_attach_device() (just in case fault handler not installed?)
otoh stalling without a fault handler is silly, but I guess we need it
to infer whether stalling can be supported by other devices on same
iommu.. tbh I'm on a bit shaky ground when it comes to multiple
devices per iommu since the SoC's I'm familiar with do it the other
way around.  But I guess you have thought more about the multi-device
case, so figured I should suggest it..

> Taking this further, there's then no need for the threaded IRQ function
> in the SMMUv2 driver after all. Instead, we pass a continuation function
> pointer and opaque token from the SMMU driver to the fault handler in
> IRQ context (this will be in thread context for SMMUv3, but that should
> be fine). The fault handler can then stash these someplace, and signal
> a wakeup for its own threaded handler, which ultimately calls the SMMU
> continuation function with the opaque token as a parameter when it's done
> with the fault. I think that's enough to get things rolling without adding
> lots of infrastructure to the SMMU driver initially. If a pattern emerges
> amongst users of the interface, then we could consolidate some of the work
> handling back into IOMMU core.
>
> What do you think? It should all be pretty straightforward for what you
> want to do.

yeah, that makes sense to me..  I can give it a try.

BR,
-R
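
For reference, a user-space sketch of the continuation scheme discussed
above: the fault handler only stashes the resume callback and opaque
token, and a later "thread" step calls the continuation once the fault is
dealt with.  Every name here (demo_resume_fn, demo_pending, demo_cb) is
an assumption for illustration; none of this is an existing kernel
interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical continuation type: called when the fault has been dealt with. */
typedef void (*demo_resume_fn)(void *token, bool retry);

struct demo_pending {
	demo_resume_fn resume;
	void *token;
	bool valid;
};

/* "IRQ context" half: stash the continuation and return immediately. */
static void demo_fault_handler(struct demo_pending *p, demo_resume_fn resume,
			       void *token)
{
	assert(resume != NULL);
	p->resume = resume;
	p->token = token;
	p->valid = true;
	/* a real driver would wake its own thread or work item here */
}

/* "Thread context" half: after fixing things up, call back into the SMMU. */
static bool demo_fault_worker(struct demo_pending *p, bool fixed)
{
	if (!p->valid)
		return false;
	p->valid = false;
	p->resume(p->token, fixed);
	return true;
}

/* Stand-in for the SMMU side: the token is a fake context bank. */
struct demo_cb {
	int resumed;
	bool retried;
};

static void demo_cb_resume(void *token, bool retry)
{
	struct demo_cb *cb = token;

	cb->resumed++;
	cb->retried = retry;
}

/* Drive one fault through both halves: 1 = retried, 0 = terminated. */
static int demo_roundtrip(bool fixed)
{
	struct demo_cb cb = { 0 };
	struct demo_pending pending = { 0 };

	demo_fault_handler(&pending, demo_cb_resume, &cb);
	assert(cb.resumed == 0);	/* still stalled until the worker runs */
	demo_fault_worker(&pending, fixed);

	return cb.resumed == 1 ? (cb.retried ? 1 : 0) : -1;
}
```

The point of the split is that the SMMU driver needs no thread of its
own; whoever installed the fault handler decides when, and in which
context, the transaction is resumed.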

> Will


Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-05 Thread Rob Clark
On Thu, Jan 5, 2017 at 6:55 AM, Will Deacon <will.dea...@arm.com> wrote:
> On Tue, Jan 03, 2017 at 04:30:54PM -0500, Rob Clark wrote:
>> TODO maybe we want two options, one to enable stalling, and 2nd to punt
>> handling to wq?  I haven't needed to use mm APIs from fault handler yet
>> (although it is something that I think we'll want some day).  Perhaps
>> stalling support is limited to just letting driver dump some extra
>> debugging information otherwise.  Threaded handling probably only useful
>> with stalling, but inverse may not always be true.
>
> I'd actually like to see this stuck on a worker thread, because I think
> that's more generally useful and I don't want to have a situation where
> sometimes the IOMMU fault notifier is run in IRQ context and sometimes it's
> not.

So I was talking a bit w/ Jordan on IRC yesterday..  and we also have
the GPU's hw hang-detect to contend with.  So I *suspect* that when we
get to the point of using this to do things like page in things from
swap and resume the faulting transaction, we probably want to get
called immediately from the IRQ handler so we can disable the hw
hang-detect.

I'm not sure if the better solution then would be to have two fault
callbacks, one immediately from the IRQ and a later one from wq.  Or
let the driver handle the wq business and give it a way to tell the
IOMMU when to resume.

I kinda think we should punt on the worker thread for now until we are
ready to resume faulting transactions, because I guess there's a strong chance
that whatever way we do it now will be wrong ;-)

>>
>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>> ---
>>  .../devicetree/bindings/iommu/arm,smmu.txt |  3 ++
>>  drivers/iommu/arm-smmu.c   | 42 ++
>>  2 files changed, 39 insertions(+), 6 deletions(-)
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> index ef465b0..5f405a6 100644
>> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
>> @@ -68,6 +68,9 @@ conditions.
>>aliases of secure registers have to be used during
>>SMMU configuration.
>>
>> +- arm,smmu-enable-stall : Enable stall mode to stall memory transactions
>> +  and resume after fault is handled
>> +
>>  ** Deprecated properties:
>>
>>  - mmu-masters (deprecated in favour of the generic "iommus" binding) :
>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>> index d505432..a71cb8f 100644
>> --- a/drivers/iommu/arm-smmu.c
>> +++ b/drivers/iommu/arm-smmu.c
>> @@ -350,6 +350,7 @@ struct arm_smmu_device {
>>   u32 features;
>>
>>  #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
>> +#define ARM_SMMU_OPT_ENABLE_STALL  (1 << 1)
>>   u32 options;
>>   enum arm_smmu_arch_version  version;
>>   enum arm_smmu_implementationmodel;
>> @@ -425,6 +426,7 @@ static bool using_legacy_binding, using_generic_binding;
>>
>>  static struct arm_smmu_option_prop arm_smmu_options[] = {
>>   { ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
>> + { ARM_SMMU_OPT_ENABLE_STALL,  "arm,smmu-enable-stall" },
>>   { 0, NULL},
>>  };
>>
>> @@ -676,7 +678,8 @@ static struct iommu_gather_ops arm_smmu_gather_ops = {
>>
>>  static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>>  {
>> - u32 fsr, fsynr;
>> + int flags, ret;
>> + u32 fsr, fsynr, resume;
>>   unsigned long iova;
>>   struct iommu_domain *domain = dev;
>>   struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> @@ -690,15 +693,40 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>>   if (!(fsr & FSR_FAULT))
>>   return IRQ_NONE;
>>
>> + if (fsr & FSR_IGN)
>> + dev_err_ratelimited(smmu->dev,
>> + "Unexpected context fault (fsr 0x%x)\n",
>> + fsr);
>> +
>>   fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
>> - iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
>> + flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
>>
>> - dev_err_ratelimited(smmu->dev,
>> - "Unhandled context fault: fsr=0x%x, iova=0x%08lx

Re: [RFC 2/3] iommu/arm-smmu: Add qcom implementation

2017-01-04 Thread Rob Clark
On Wed, Jan 4, 2017 at 8:33 AM, Sricharan <sricha...@codeaurora.org> wrote:
> Hi,
>
>>-Original Message-
>>From: linux-arm-msm-ow...@vger.kernel.org 
>>[mailto:linux-arm-msm-ow...@vger.kernel.org] On Behalf Of Jordan Crouse
>>Sent: Wednesday, January 04, 2017 3:59 AM
>>To: Rob Clark <robdcl...@gmail.com>
>>Cc: Will Deacon <will.dea...@arm.com>; iommu@lists.linux-foundation.org; 
>>linux-arm-...@vger.kernel.org; Sricharan R
>><sricha...@codeaurora.org>
>>Subject: Re: [RFC 2/3] iommu/arm-smmu: Add qcom implementation
>>
>>On Tue, Jan 03, 2017 at 04:30:55PM -0500, Rob Clark wrote:
>>> At least on the db820c I have, with the firmware I have, I'm not seeing
>>> the SS bit set, even though the iommu is in a stalled state.  So for
>>> this implementation ignore not having SS bit set.
>>
>>The SS bit gets set if SCTLR.CFCFG is set to 1. It works in the downstream
>>kernel because the GPU driver writes directly to SCTLR in the IOMMU hardware
>>(which of course is a crime against humanity but that is one of the many reasons
>>why it is a *downstream* driver).
>>
>>My understanding is that SCTLR.CFCFG == 0 should automatically terminate the
>>transaction so I don't understand why we need to write to RESUME. I'm not
>>doubting Rob's patch, I'm doubting why we need it in the first place. It seems
>>that if we have to write it regardless of the value of CFCFG then we should
>>probably just do that instead of relying on the SS bit.
>>
>
> The patch is setting CFCFG to 1, hence we require clearing the fault with a
> write to the RESUME register. I tested these patches on arm-smmu with
> the DB820c and saw that the 'FSR_SS' bit is getting set properly after a
> fault on the adreno smmu.

I'll drop this patch and re-test.. hopefully later today.  It's
possible that I was having the problem w/ SS not set due to some other
issue.  (This was what I was seeing initially after just reverting the
patch that removed the stall/resume stuff.)  I probably need to double
check that the CFCFG bit isn't getting cleared somewhere.

BR,
-R
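
To spell out the behaviour being debated here: the handler in RFC 1/3
clears FSR and writes RESUME only when it sees SS set, so if SS never
shows up the stalled transaction is never released.  A small model of
that tail end of the handler follows.  The DEMO_* bit values and the
register struct are illustrative assumptions; check arm-smmu.c and the
SMMU spec for the real definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only, not taken from this thread. */
#define DEMO_FSR_SS		(1u << 30)
#define DEMO_FSR_TF		(1u << 1)
#define DEMO_RESUME_RETRY	0u
#define DEMO_RESUME_TERMINATE	1u

struct demo_cb_regs {
	uint32_t fsr;
	uint32_t resume;
	bool resume_written;
};

/* Clear the fault status, then resume only if the bank reports a stall. */
static void demo_finish_fault(struct demo_cb_regs *cb, bool handled)
{
	uint32_t fsr = cb->fsr;

	cb->fsr = 0;	/* write-one-to-clear, modelled as a plain clear */

	if (fsr & DEMO_FSR_SS) {
		cb->resume = handled ? DEMO_RESUME_RETRY : DEMO_RESUME_TERMINATE;
		cb->resume_written = true;
	}
}

/* Returns the value written to RESUME, or -1 if RESUME was never written. */
static int demo_run(uint32_t fsr, bool handled)
{
	struct demo_cb_regs cb = { .fsr = fsr };

	demo_finish_fault(&cb, handled);
	assert(cb.fsr == 0);

	return cb.resume_written ? (int)cb.resume : -1;
}
```

In this model a fault with SS clear leaves RESUME untouched, which is the
case the QCOM_SMMUV2 quirk papers over by forcing FSR_SS.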

>>The public spec doesn't give any indication to me that any of this behavior is
>>implementation specific but I only have one implementation to base that
>>assumption on. Perhaps the default value of SCTLR is implementation specific?
>>
>>If other implementations do expect SS (and CFCFG) to be set by default then we
>>would indeed need to set up a quirk. The other possibility would be to force
>>set CFCFG for all targets, but I would be hesitant to do that on the GPU iommu
>>because if we stall the GPU for too long then hang detect will fire.
>>
>
> As I understood from the previous discussions on this [1],  the
> behaviour of the stall model (whether enabling the stall would impact other
> contexts as well) and how the stalled context bank is going to assert the
> interrupts were implementation defined. I thought that the setting of the
> 'SS' bit should happen if stall model is supported.
>
> [1] https://www.spinics.net/lists/linux-arm-msm/msg25203.html
>
> Regards,
>  Sricharan
>
>
>>Jordan
>>
>>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>>> ---
>>>  drivers/iommu/arm-smmu.c | 6 ++
>>>  1 file changed, 6 insertions(+)
>>>
>>> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
>>> index a71cb8f..a8d9901 100644
>>> --- a/drivers/iommu/arm-smmu.c
>>> +++ b/drivers/iommu/arm-smmu.c
>>> @@ -298,6 +298,7 @@ enum arm_smmu_implementation {
>>>  GENERIC_SMMU,
>>>  ARM_MMU500,
>>>  CAVIUM_SMMUV2,
>>> +QCOM_SMMUV2,
>>>  };
>>>
>>>  struct arm_smmu_s2cr {
>>> @@ -716,6 +717,9 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>>>  /* Clear the faulting FSR */
>>>  writel(fsr, cb_base + ARM_SMMU_CB_FSR);
>>>
>>> +if (smmu->model == QCOM_SMMUV2)
>>> +fsr |= FSR_SS;
>>> +
>>>  /* Retry or terminate any stalled transactions */
>>>  if (fsr & FSR_SS) {
>>>  /* Should we care about ending up w/ a stalled transaction
>>> @@ -1991,6 +1995,7 @@ ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
>>>  ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
>>>  ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
>>>  ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
>>> +ARM_SMMU_MATCH_DATA(qcom_smmuv2, ARM_SMMU_V2, QCOM_SMMUV2);

[RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-03 Thread Rob Clark
TODO maybe we want two options, one to enable stalling, and 2nd to punt
handling to wq?  I haven't needed to use mm APIs from fault handler yet
(although it is something that I think we'll want some day).  Perhaps
stalling support is limited to just letting driver dump some extra
debugging information otherwise.  Threaded handling probably only useful
with stalling, but inverse may not always be true.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 .../devicetree/bindings/iommu/arm,smmu.txt |  3 ++
 drivers/iommu/arm-smmu.c   | 42 ++
 2 files changed, 39 insertions(+), 6 deletions(-)

diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
index ef465b0..5f405a6 100644
--- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
+++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
@@ -68,6 +68,9 @@ conditions.
   aliases of secure registers have to be used during
   SMMU configuration.
 
+- arm,smmu-enable-stall : Enable stall mode to stall memory transactions
+  and resume after fault is handled
+
 ** Deprecated properties:
 
 - mmu-masters (deprecated in favour of the generic "iommus" binding) :
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index d505432..a71cb8f 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -350,6 +350,7 @@ struct arm_smmu_device {
u32 features;
 
 #define ARM_SMMU_OPT_SECURE_CFG_ACCESS (1 << 0)
+#define ARM_SMMU_OPT_ENABLE_STALL  (1 << 1)
u32 options;
enum arm_smmu_arch_version  version;
enum arm_smmu_implementationmodel;
@@ -425,6 +426,7 @@ static bool using_legacy_binding, using_generic_binding;
 
 static struct arm_smmu_option_prop arm_smmu_options[] = {
{ ARM_SMMU_OPT_SECURE_CFG_ACCESS, "calxeda,smmu-secure-config-access" },
+   { ARM_SMMU_OPT_ENABLE_STALL,  "arm,smmu-enable-stall" },
{ 0, NULL},
 };
 
@@ -676,7 +678,8 @@ static struct iommu_gather_ops arm_smmu_gather_ops = {
 
 static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
 {
-   u32 fsr, fsynr;
+   int flags, ret;
+   u32 fsr, fsynr, resume;
unsigned long iova;
struct iommu_domain *domain = dev;
struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
@@ -690,15 +693,40 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
if (!(fsr & FSR_FAULT))
return IRQ_NONE;
 
+   if (fsr & FSR_IGN)
+   dev_err_ratelimited(smmu->dev,
+   "Unexpected context fault (fsr 0x%x)\n",
+   fsr);
+
fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
-   iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
+   flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
 
-   dev_err_ratelimited(smmu->dev,
-   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
-   fsr, iova, fsynr, cfg->cbndx);
+   iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
+   if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
+   ret = IRQ_HANDLED;
+   resume = RESUME_RETRY;
+   } else {
+   dev_err_ratelimited(smmu->dev,
+   "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
+   iova, fsynr, cfg->cbndx);
+   ret = IRQ_NONE;
+   resume = RESUME_TERMINATE;
+   }
 
+   /* Clear the faulting FSR */
writel(fsr, cb_base + ARM_SMMU_CB_FSR);
-   return IRQ_HANDLED;
+
+   /* Retry or terminate any stalled transactions */
+   if (fsr & FSR_SS) {
+   /* Should we care about ending up w/ a stalled transaction
+* when we didn't ask for it?  I guess for now best to call
+* attention to it and resume anyways.
+*/
+   WARN_ON(!(smmu->options & ARM_SMMU_OPT_ENABLE_STALL));
+   writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME);
+   }
+
+   return ret;
 }
 
 static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
@@ -824,6 +852,8 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
 
/* SCTLR */
reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_AFE | SCTLR_TRE | SCTLR_M;
+   if (smmu->options & ARM_SMMU_OPT_ENABLE_STALL)
+   reg |= SCTLR_CFCFG;
if (stage1)
reg |= SCTLR_S1_ASIDPNE;
 #ifdef __BIG_ENDIAN
-- 
2.7.4

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[RFC 0/3] iommu/arm-smmu: patches for adreno

2017-01-03 Thread Rob Clark
Will,

I meant to scrape something together a bit sooner.  I wanted to check if
this was in line with what you were thinking for upstream alternative to
reverting "iommu/arm-smmu: Disable stalling faults for all endpoints".

(Third patch is semi-unrelated, but I'd prefer to only have my rate-
limited prints from drm/msm, since they contain additional information
about gpu state for debugging the fault.)

Rob Clark (3):
  iommu/arm-smmu: Add support to opt-in to stalling
  iommu/arm-smmu: Add qcom implementation
  iommu/arm-smmu: Let fault handler return -EFAULT

 .../devicetree/bindings/iommu/arm,smmu.txt |  3 ++
 drivers/iommu/arm-smmu.c   | 54 --
 2 files changed, 52 insertions(+), 5 deletions(-)

-- 
2.7.4



[RFC 3/3] iommu/arm-smmu: Let fault handler return -EFAULT

2017-01-03 Thread Rob Clark
Let the iommu user ask the iommu to terminate the transaction without
printing any error msg via -EFAULT return.

(Alternatively, look for -ENOSYS return instead to trigger the msg?)

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu.c | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index a8d9901..dc26c98 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -703,15 +703,23 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
 
iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
-   if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
+
+   switch (report_iommu_fault(domain, smmu->dev, iova, flags)) {
+   case 0:
ret = IRQ_HANDLED;
resume = RESUME_RETRY;
-   } else {
+   break;
+   case -EFAULT:
+   ret = IRQ_HANDLED;
+   resume = RESUME_TERMINATE;
+   break;
+   default:
dev_err_ratelimited(smmu->dev,
"Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
iova, fsynr, cfg->cbndx);
ret = IRQ_NONE;
resume = RESUME_TERMINATE;
+   break;
}
 
/* Clear the faulting FSR */
-- 
2.7.4



[RFC 2/3] iommu/arm-smmu: Add qcom implementation

2017-01-03 Thread Rob Clark
At least on the db820c I have, with the firmware I have, I'm not seeing
the SS bit set, even though the iommu is in a stalled state.  So for
this implementation ignore not having SS bit set.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
index a71cb8f..a8d9901 100644
--- a/drivers/iommu/arm-smmu.c
+++ b/drivers/iommu/arm-smmu.c
@@ -298,6 +298,7 @@ enum arm_smmu_implementation {
GENERIC_SMMU,
ARM_MMU500,
CAVIUM_SMMUV2,
+   QCOM_SMMUV2,
 };
 
 struct arm_smmu_s2cr {
@@ -716,6 +717,9 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
/* Clear the faulting FSR */
writel(fsr, cb_base + ARM_SMMU_CB_FSR);
 
+   if (smmu->model == QCOM_SMMUV2)
+   fsr |= FSR_SS;
+
/* Retry or terminate any stalled transactions */
if (fsr & FSR_SS) {
/* Should we care about ending up w/ a stalled transaction
@@ -1991,6 +1995,7 @@ ARM_SMMU_MATCH_DATA(smmu_generic_v2, ARM_SMMU_V2, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu401, ARM_SMMU_V1_64K, GENERIC_SMMU);
 ARM_SMMU_MATCH_DATA(arm_mmu500, ARM_SMMU_V2, ARM_MMU500);
 ARM_SMMU_MATCH_DATA(cavium_smmuv2, ARM_SMMU_V2, CAVIUM_SMMUV2);
+ARM_SMMU_MATCH_DATA(qcom_smmuv2, ARM_SMMU_V2, QCOM_SMMUV2);
 
 static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,smmu-v1", .data = &smmu_generic_v1 },
@@ -1999,6 +2004,7 @@ static const struct of_device_id arm_smmu_of_match[] = {
{ .compatible = "arm,mmu-401", .data = &arm_mmu401 },
{ .compatible = "arm,mmu-500", .data = &arm_mmu500 },
{ .compatible = "cavium,smmu-v2", .data = &cavium_smmuv2 },
+   { .compatible = "qcom,smmu-v2", .data = &qcom_smmuv2 },
{ },
 };
 MODULE_DEVICE_TABLE(of, arm_smmu_of_match);
-- 
2.7.4



Re: [RFC 2/3] iommu/arm-smmu: Add qcom implementation

2017-01-04 Thread Rob Clark
On Wed, Jan 4, 2017 at 9:31 AM, Rob Clark <robdcl...@gmail.com> wrote:
> On Wed, Jan 4, 2017 at 8:33 AM, Sricharan <sricha...@codeaurora.org> wrote:
>> Hi,
>>
>>>-Original Message-
>>>From: linux-arm-msm-ow...@vger.kernel.org 
>>>[mailto:linux-arm-msm-ow...@vger.kernel.org] On Behalf Of Jordan Crouse
>>>Sent: Wednesday, January 04, 2017 3:59 AM
>>>To: Rob Clark <robdcl...@gmail.com>
>>>Cc: Will Deacon <will.dea...@arm.com>; iommu@lists.linux-foundation.org; 
>>>linux-arm-...@vger.kernel.org; Sricharan R
>>><sricha...@codeaurora.org>
>>>Subject: Re: [RFC 2/3] iommu/arm-smmu: Add qcom implementation
>>>
>>>On Tue, Jan 03, 2017 at 04:30:55PM -0500, Rob Clark wrote:
>>>> At least on the db820c I have, with the firmware I have, I'm not seeing
>>>> the SS bit set, even though the iommu is in a stalled state.  So for
>>>> this implementation ignore not having SS bit set.
>>>
>>>The SS bit gets set if SCTLR.CFCFG is set to 1. It works in the downstream
>>>kernel because the GPU driver writes directly to SCTLR in the IOMMU hardware
>>>(which of course is a crime against humanity but that is one of the many 
>>>reasons
>>>why it is a *downstream* driver).
>>>
>>>My understanding is that SCTLR.CFCFG == 0 should automatically terminate the
>>>transaction so I don't understand why we need to write to RESUME. I'm not
>>>doubting Rob's patch, I'm doubting why we need it in the first place. It 
>>>seems
>>>that if we have to write it regardless of the value of CFCFG then we should
>>>probably just do that instead of relying on the SS bit.
>>>
>>
>> The patch is setting CFCFG to 1, hence we require clearing the fault with a
>> write to the RESUME register. I tested these patches on arm-smmu with
>> the DB820c and saw that the 'FSR_SS' bit is getting set properly after a
>> fault on the adreno smmu.
>
> I'll drop this patch and re-test.. hopefully later today.  It's
> possible that I was having the problem w/ SS not set due to some other
> issue.  (This was what I was seeing initially after just reverting the
> patch that removed the stall/resume stuff.)  I probably need to double
> check that CFCFG bit isn't getting cleared somewhere.
>

Ok, we can drop this patch, I've confirmed the SS bit is getting set
properly so we don't need a hack.  Not really sure what was going on
earlier when I had this problem before, maybe CFCFG wasn't getting set
properly..

BR,
-R


Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-06 Thread Rob Clark
On Thu, Jan 5, 2017 at 10:49 AM, Will Deacon <will.dea...@arm.com> wrote:
> On Thu, Jan 05, 2017 at 10:27:27AM -0500, Rob Clark wrote:
>> On Thu, Jan 5, 2017 at 6:55 AM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Tue, Jan 03, 2017 at 04:30:54PM -0500, Rob Clark wrote:
>> >> TODO maybe we want two options, one to enable stalling, and 2nd to punt
>> >> handling to wq?  I haven't needed to use mm APIs from fault handler yet
>> >> (although it is something that I think we'll want some day).  Perhaps
>> >> stalling support is limited to just letting driver dump some extra
>> >> debugging information otherwise.  Threaded handling probably only useful
>> >> with stalling, but inverse may not always be true.
>> >
>> > I'd actually like to see this stuck on a worker thread, because I think
>> > that's more generally useful and I don't want to have a situation where
>> > sometimes the IOMMU fault notifier is run in IRQ context and sometimes it's
>> > not.
>>
>> So I was talking a bit w/ Jordan on IRC yesterday..  and we also have
>> the GPU's hw hang-detect to contend with.  So I *suspect* that when we
>> get to the point of using this to do things like page in things from
>> swap and resume the faulting transaction, we probably want to get
>> called immediately from the IRQ handler so we can disable the hw
>> hang-detect.
>
> Well, if you want to use an SMMU for paging, then the GPU driver would
> need to request that explicitly when allocating its DMA buffers, so that
> would be the time to either delay or disable the hang detection.

If userspace is using SVM, for example, it is pretty impossible to
know when to expect a fault.  The best you could do is keep track that
*some* process which has active work queued up for gpu is using SVM
and disable hang detect for *everyone*.. which is kind of sad.

>> I'm not sure if the better solution then would be to have two fault
>> callbacks, one immediately from the IRQ and a later one from wq.  Or
>> let the driver handle the wq business and give it a way to tell the
>> IOMMU when to resume.
>>
>> I kinda think we should punt on the worker thread for now until we are
>> ready to resume faulting transactions, because I guess there is a strong
>> chance that whatever way we do it now will be wrong ;-)
>
> I guess what I'm after is for you to change the interrupt handlers to be
> threaded, like they are for SMMUv3. I *think* you can do that with a NULL
> thread_fn for now, and just call report_iommu_fault from the handler.
> The return value of that could, in theory, be used to queue the paging
> request and wake the paging thread in future.

If we only pass in the non-threaded irq fxn, I'm not really sure how
that changes anything.. or maybe I'm not understanding what you mean.

But yeah, I guess we could use request_threaded_irq() to get both IRQ
context notification and a later thread context notification rather
than doing the wq thing.  Either way the iommu API has to change
slightly.

>> > I wonder if this should also be predicated on the compatible string, so
>> > that the "arm,smmu-enable-stall" property is ignored (with a warning) if
>> > the compatible string isn't specific enough to identify an implementation
>> > with the required SS behaviour? On the other hand, it feels pretty
>> > redundant and a single "stalling works" property is all we need.
>>
>> We could also drop the property and key the behavior on specific
>> compat strings I guess.  Having both seems a bit odd.  Anyways, I'll
>> defer to DT folks about what the cleaner approach is.
>
> As Robin pointed out, we do need to be able to distinguish the integration
> of the device from the device itself. For example, MMU-9000 might be capable
> of stalling, but if it's bolted to a PCI RC, it's not safe to do so.

Hmm, well we install the fault handler on the iommu_domain..  perhaps
maybe a combo of dts property (or deciding based on more specific
compat string), plus extra param passed in to
iommu_set_fault_handler().  The dts property or compat string to
indicate whether the iommu (and how it is wired up) can handle stalls,
and enable_stall param when fault handler is registered to indicate
whether the device itself can cope.. if either can't do stalling, then
don't set CFCFG.
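To make that "either side can veto" rule concrete, here is a minimal standalone sketch.  The helper name sctlr_apply_stall() and both bool parameters are made up for illustration, and the SCTLR_CFCFG bit position is only assumed here; none of this is the driver's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit position assumed for illustration; check the driver for the real define. */
#define SCTLR_CFCFG	(1u << 7)

/*
 * smmu_can_stall:   hypothetical flag derived from the dts property or a
 *                   specific compat string (the SMMU and its integration).
 * master_can_stall: hypothetical enable_stall param passed in when the
 *                   fault handler is registered (the device/driver side).
 *
 * CFCFG is only set when both sides say stalling is safe.
 */
static uint32_t sctlr_apply_stall(uint32_t reg, bool smmu_can_stall,
				  bool master_can_stall)
{
	if (smmu_can_stall && master_can_stall)
		reg |= SCTLR_CFCFG;

	return reg;
}
```

With this shape the existing default (no stalling) is what you get unless both the platform description and the driver opt in.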

BR,
-R


Re: [RFC 1/3] iommu/arm-smmu: Add support to opt-in to stalling

2017-01-06 Thread Rob Clark
On Thu, Jan 5, 2017 at 12:25 PM, Will Deacon  wrote:
>> That's still got to be a per-master property, not a SMMU property, I
>> think. To illustrate:
>>
>>    [A]       [B]   [C]
>>     |         |_____|
>>  ___|____________|___
>> | TBU |        | TBU |
>> |_____|  SMMU  |_____|
>> |____________________|
>>     |            |
>>
>> Say A and B are instances of some device happy to be stalled, and C is a
>> PCIe RC, and each is attached to their own context bank - enabling
>> stalls for A is definitely fine. However even though B and C are using
>> different context banks, enabling stalls for B might deadlock C if it
>> results in more total outstanding transactions than the TBU's slave port
>> supports. Therefore A can happily claim to be stall-safe, but B cannot
>> due to its integration with respect to C.
>
> So in this case, don't say that B and C can DMA to unpinned memory. You
> need the third property. This property (property 2) is concerned with the
> SMMU itself because, e.g. the way the walker has been integrated can
> cause a deadlock.


fwiw, I guess I'm mostly thinking about case (A)..  but I guess in the
(B) case amend my suggestion about adding param to
iommu_set_fault_handler() slightly to consider the enable_stall param
passed in when both (B) and (C) register their fault handlers?

Or I guess the idea about adding an extra cell (which IIUC would let
us add an extra param in dt in the device's iommus property) could also
work.  Unless maybe there could be some cases where whether a device
can do stalling is also a function of the driver as well (ie. some
feature needs to be implemented type thing)..


BR,
-R


Re: [PATCH 3/4] iommu/arm-smmu: Disable stalling faults for all endpoints

2016-12-06 Thread Rob Clark
On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon  wrote:
> Enabling stalling faults can result in hardware deadlock on poorly
> designed systems, particularly those with a PCI root complex upstream of
> the SMMU.
>
> Although it's not really Linux's job to save hardware integrators from
> their own misfortune, it *is* our job to stop userspace (e.g. VFIO
> clients) from hosing the system for everybody else, even if they might
> already be required to have elevated privileges.
>
> Given that the fault handling code currently executes entirely in IRQ
> context, there is nothing that can sensibly be done to recover from
> things like page faults anyway, so let's rip this code out for now and
> avoid the potential for deadlock.

Hi Will,

so, I'd like to re-introduce this feature, I *guess* as some sort of
opt-in quirk (ie. disabled by default unless something in DT tells you
otherwise??  But I'm open to suggestions.  I'm not entirely sure what
hw was having problems due to this feature.)

On newer snapdragon devices we are using arm-smmu for the GPU, and
halting the GPU so the driver's fault handler can dump some GPU state
on faults is enormously helpful for debugging and tracking down where
in the gpu cmdstream the fault was triggered.  In addition, we will
eventually want the ability to update pagetables from the fault handler
and resume the faulting transaction.

Some additional comments below..

> Cc: 
> Reported-by: Matt Evans 
> Signed-off-by: Will Deacon 
> ---
>  drivers/iommu/arm-smmu.c | 34 +++---
>  1 file changed, 7 insertions(+), 27 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 4f49fe29f202..2db74ebc3240 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -686,8 +686,7 @@ static struct iommu_gather_ops arm_smmu_gather_ops = {
>
>  static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
>  {
> -   int flags, ret;
> -   u32 fsr, fsynr, resume;
> +   u32 fsr, fsynr;
> unsigned long iova;
> struct iommu_domain *domain = dev;
> struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
> @@ -701,34 +700,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev)
> if (!(fsr & FSR_FAULT))
> return IRQ_NONE;
>
> -   if (fsr & FSR_IGN)
> -   dev_err_ratelimited(smmu->dev,
> -   "Unexpected context fault (fsr 0x%x)\n",
> -   fsr);
> -
> fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0);
> -   flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ;
> -
> iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR);
> -   if (!report_iommu_fault(domain, smmu->dev, iova, flags)) {
> -   ret = IRQ_HANDLED;
> -   resume = RESUME_RETRY;
> -   } else {
> -   dev_err_ratelimited(smmu->dev,
> -   "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n",
> -   iova, fsynr, cfg->cbndx);

I would like to decouple this dev_err_ratelimited() print from the
RESUME_RETRY vs RESUME_TERMINATE behaviour.  I need the ability to
indicate by return from my fault handler whether to resume or
terminate.  But I already have my own ratelimited prints and would
prefer not to spam dmesg twice.

I'm thinking about report_iommu_fault() returning:

  0 => RESUME_RETRY
  -EFAULT => RESUME_TERMINATE but don't print
  anything else (or specifically -ENOSYS?) => RESUME_TERMINATE and print

thoughts?
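
As a standalone sketch of that mapping (decide_fault() and struct fault_decision are made-up names for illustration, and the enum values are stand-ins rather than the real RESUME register encodings):

```c
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins for the two actions written to the RESUME register. */
enum resume_action {
	RESUME_RETRY,
	RESUME_TERMINATE,
};

struct fault_decision {
	enum resume_action resume;	/* what to do with the stalled transaction */
	bool print;			/* whether arm-smmu logs the fault itself */
};

/* Map the value returned by report_iommu_fault() onto the proposed behaviour. */
static struct fault_decision decide_fault(int handler_ret)
{
	struct fault_decision d;

	switch (handler_ret) {
	case 0:
		/* handler dealt with it: retry the stalled transaction */
		d.resume = RESUME_RETRY;
		d.print = false;
		break;
	case -EFAULT:
		/* handler already reported the fault: terminate quietly */
		d.resume = RESUME_TERMINATE;
		d.print = false;
		break;
	default:
		/* no handler installed (-ENOSYS) or any other error: terminate and log */
		d.resume = RESUME_TERMINATE;
		d.print = true;
		break;
	}

	return d;
}
```

The point of splitting "resume" from "print" is that the driver's own fault handler stays in charge of how noisy dmesg gets.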

> -   ret = IRQ_NONE;
> -   resume = RESUME_TERMINATE;
> -   }
> -
> -   /* Clear the faulting FSR */
> -   writel(fsr, cb_base + ARM_SMMU_CB_FSR);
>
> -   /* Retry or terminate any stalled transactions */
> -   if (fsr & FSR_SS)
> -   writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME);

This might be a bug in qcom's implementation of the smmu spec, but
seems like we don't have SS bit set, yet we still require RESUME reg
to be written, otherwise gpu is perma-wedged.  Maybe topic for a
separate quirk?  I'm not sure if writing RESUME reg on other hw when
SS bit is not set is likely to cause problems?  If not I suppose we
could just unconditionally write it.

Anyways, I'm not super-familiar w/ arm-smmu so suggestions welcome..
in between debugging freedreno I'll try to put together some patches.

BR,
-R

> +   dev_err_ratelimited(smmu->dev,
> +   "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n",
> +   fsr, iova, fsynr, cfg->cbndx);
>
> -   return ret;
> +   writel(fsr, cb_base + ARM_SMMU_CB_FSR);
> +   return IRQ_HANDLED;
>  }
>
>  static irqreturn_t arm_smmu_global_fault(int irq, void *dev)
> @@ -837,7 +817,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain,
> 

Re: [PATCH 3/9] Docs: dt: document qcom iommu bindings

2017-03-23 Thread Rob Clark
On Thu, Mar 23, 2017 at 6:21 PM, Rob Herring <r...@kernel.org> wrote:
> On Tue, Mar 14, 2017 at 11:18:05AM -0400, Rob Clark wrote:
>> Cc: devicet...@vger.kernel.org
>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>> ---
>>  .../devicetree/bindings/iommu/qcom,iommu.txt   | 113 
>> +
>>  1 file changed, 113 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt 
>> b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>> new file mode 100644
>> index 000..fd5b7fa
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>> @@ -0,0 +1,113 @@
>> +* QCOM IOMMU v1 Implementation
>> +
>> +Qualcomm "B" family devices which are not compatible with arm-smmu have
>> +a similar looking IOMMU but without access to the global register space,
>> +and optionally requiring additional configuration to route context irqs
>> +to non-secure vs secure interrupt line.
>> +
>> +** Required properties:
>> +
>> +- compatible   : Should be one of:
>> +
>> +"qcom,msm8916-iommu"
>> +
>> +- clock-names  : Should be a pair of "iface" (required for IOMMUs
>> + register group access) and "bus" (required for
>> + the IOMMUs underlying bus access).
>> +- clocks   : Phandles for respective clocks described by
>> + clock-names.
>> +- #address-cells   : must be 1.
>> +- #size-cells  : must be 1.
>> +- #iommu-cells : Must be 1.
>> +- ranges   : Base address and size of the iommu context banks.
>> +- qcom,iommu-secure-id  : secure-id.
>> +
>> +- List of sub-nodes, one per translation context bank.  Each sub-node
>> +  has the following required properties:
>> +
>> +  - compatible : Should be one of:
>> +- "qcom,msm-iommu-v1-ns"  : non-secure context bank
>> +- "qcom,msm-iommu-v1-sec" : secure context bank
>> +  - reg: Base address and size of context bank within the iommu
>> +  - interrupts : The context fault irq.
>> +
>> +** Optional properties:
>> +
>> +- reg  : Base address and size of the SMMU local base, should
>> + be only specified if the iommu requires configuration
>> + for routing of context bank irq's to secure vs non-
>> + secure lines.  (Ie. if the iommu contains secure
>> + context banks)
>> +
>> +
>> +** Examples:
>> +
>> + apps_iommu: iommu@1e2 {
>> + #address-cells = <1>;
>> + #size-cells = <1>;
>> + #iommu-cells = <1>;
>> + compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
>
> You didn't document the fallback above. Maybe just drop it if only a few
> chips have this iommu.

not completely sure I understand what you want..

I think more than a few chips.. I suspect it is more like everything
after the last "a" family devices (snapdragon 600?) and before 820..
(well, more or less at least a few years worth of devices, stuff that
seems likely to be able to run an upstream kernel would be 800, 805,
808, 810.. and I guess there are some cut down 6xx and 4xx variants of
those)

I guess qcom_iommu wouldn't care about all the various 32b devices
(since they aren't going to use 64b page tables).. 808/810, I'm not
100% sure about..

>> + ranges = <0 0x1e2 0x4>;
>> + reg = <0x1ef 0x3000>;
>
> When you have both reg and ranges, use reg value for the unit-address.

whoops, I thought I fixed that

>> + clocks = <&gcc GCC_SMMU_CFG_CLK>,
>> +  <&gcc GCC_APSS_TCU_CLK>;
>> + clock-names = "iface", "bus";
>> + qcom,iommu-secure-id = <17>;
>> +
>> + // mdp_0:
>> + iommu-ctx@4000 {
>> + compatible = "qcom,msm-iommu-v1-ns";
>> + reg = <0x4000 0x1000>;
>> + interrupts = ;
>> + };
>> +
>> + // venus_ns:
>> + iommu-ctx@5000 {
>> + compatible = "qcom,msm-iommu-v1-sec";
>> + reg = <0x5000 0x1000>;
>> + interrupts = ;

Re: [PATCH 5/9] iommu: add qcom_iommu

2017-03-30 Thread Rob Clark
On Thu, Mar 30, 2017 at 2:19 AM, Archit Taneja <arch...@codeaurora.org> wrote:
> Hi,
>
> On 03/14/2017 08:48 PM, Rob Clark wrote:
>>
>> An iommu driver for Qualcomm "B" family devices which do not completely
>> implement the ARM SMMU spec.  These devices have context-bank register
>> layout that is similar to ARM SMMU, but no global register space (or at
>> least not one that is accessible).
>>
>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>> Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
>> ---
>>  drivers/iommu/Kconfig |  10 +
>>  drivers/iommu/Makefile|   1 +
>>  drivers/iommu/arm-smmu-regs.h |   2 +
>>  drivers/iommu/qcom_iommu.c| 818
>> ++
>>  4 files changed, 831 insertions(+)
>>  create mode 100644 drivers/iommu/qcom_iommu.c
>
>
> 
>
>> +
>> +static int qcom_iommu_add_device(struct device *dev)
>> +{
>> +   struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
>
>
> __to_iommu() has a WARN_ON() that gets triggered here for all devices on
> the platform bus that aren't backed by our iommu. We should return -ENODEV
> for all of them without throwing a warning.
>
>> +   struct iommu_group *group;
>> +   struct device_link *link;
>> +
>
>
> We could do something like:
>
> if (fwspec && fwspec->ops == &qcom_iommu_ops)
> qcom_iommu = __to_iommu(fwspec);
> else
> qcom_iommu = NULL;

thanks.. I wonder how I wasn't hitting that?

I'll incorporate this (plus small dt bindings doc update) into next
version.. probably won't have time to send until the weekend or next
week
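
For reference, a standalone sketch of the check Archit suggests.  The struct layouts here are cut-down stand-ins, and lookup_qcom_iommu() and other_iommu_ops are hypothetical names; the real fwspec and ops structures carry far more than this.

```c
#include <stddef.h>

/* Cut-down stand-ins for the kernel structures, for illustration only. */
struct iommu_ops { int unused; };

struct iommu_fwspec {
	const struct iommu_ops *ops;	/* which iommu driver owns this device */
	void *iommu_priv;		/* that driver's private data */
};

struct qcom_iommu_dev { int id; };

static const struct iommu_ops qcom_iommu_ops = { 0 };
static const struct iommu_ops other_iommu_ops = { 0 };	/* some other iommu driver */

/*
 * Return our private data only if the device is actually behind this
 * iommu; otherwise return NULL (without warning) so the caller can
 * bail out with -ENODEV.
 */
static struct qcom_iommu_dev *lookup_qcom_iommu(struct iommu_fwspec *fwspec)
{
	if (!fwspec || fwspec->ops != &qcom_iommu_ops)
		return NULL;

	return fwspec->iommu_priv;
}
```

Devices with no fwspec at all, or one owned by a different iommu driver, fall through to the NULL path quietly.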

BR,
-R


> Thanks,
> Archit
>
>
>> +   if (!qcom_iommu)
>> +   return -ENODEV;
>> +
>> +   /*
>> +* Establish the link between iommu and master, so that the
>> +* iommu gets runtime enabled/disabled as per the master's
>> +* needs.
>> +*/
>> +   link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
>> +   if (!link) {
>> +   dev_err(qcom_iommu->dev, "Unable to create device link
>> between %s and %s\n",
>> +   dev_name(qcom_iommu->dev), dev_name(dev));
>> +   return -ENODEV;
>> +   }
>> +
>> +   group = iommu_group_get_for_dev(dev);
>> +   if (IS_ERR_OR_NULL(group))
>> +   return PTR_ERR_OR_ZERO(group);
>> +
>> +   iommu_group_put(group);
>> +   iommu_device_link(&qcom_iommu->iommu, dev);
>> +
>> +   return 0;
>> +}
>> +
>> +static void qcom_iommu_remove_device(struct device *dev)
>> +{
>> +   struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
>> +
>> +   if (!qcom_iommu)
>> +   return;
>> +
>> +   iommu_group_remove_device(dev);
>> +   iommu_device_unlink(&qcom_iommu->iommu, dev);
>> +   iommu_fwspec_free(dev);
>> +}
>> +
>> +static struct iommu_group *qcom_iommu_device_group(struct device *dev)
>> +{
>> +   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
>> +   struct iommu_group *group = NULL;
>> +   unsigned i;
>> +
>> +   for (i = 0; i < fwspec->num_ids; i++) {
>> +   struct qcom_iommu_ctx *ctx = __to_ctx(fwspec,
>> fwspec->ids[i]);
>> +
>> +   if (group && ctx->group && group != ctx->group)
>> +   return ERR_PTR(-EINVAL);
>> +
>> +   group = ctx->group;
>> +   }
>> +
>> +   if (group)
>> +   return iommu_group_ref_get(group);
>> +
>> +   group = generic_device_group(dev);
>> +
>> +   for (i = 0; i < fwspec->num_ids; i++) {
>> +   struct qcom_iommu_ctx *ctx = __to_ctx(fwspec,
>> fwspec->ids[i]);
>> +   ctx->group = iommu_group_ref_get(group);
>> +   }
>> +
>> +   return group;
>> +}
>> +
>> +static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args
>> *args)
>> +{
>> +   struct platform_device *iommu_pdev;
>> +
>> +   if (args->args_count != 1) {
>> +   dev_err(dev, "incorrect number of iommu params found for
>> %s "
>> +   "(found %d, expected 1)\n",
>> +   args->np->full_name, args->args_count);
>>

Re: [PATCH 5/9] iommu: add qcom_iommu

2017-03-20 Thread Rob Clark
On Mon, Mar 20, 2017 at 10:21 AM, Sricharan R <sricha...@codeaurora.org> wrote:
> Hi Rob,
>
> sorry for the delayed response. Was not there mostly last week.
>
>
> On 3/13/2017 11:49 PM, Rob Clark wrote:
>>
>> On Mon, Mar 13, 2017 at 9:38 AM,  <sricha...@codeaurora.org> wrote:
>>>
>>> Hi Rob,
>>>
>>> [..]
>>>
>>>
>>>> +static int qcom_iommu_init_domain(struct iommu_domain *domain,
>>>> + struct qcom_iommu_dev *qcom_iommu,
>>>> + struct iommu_fwspec *fwspec)
>>>> +{
>>>> +   struct qcom_iommu_domain *qcom_domain =
>>>> to_qcom_iommu_domain(domain);
>>>> +   struct io_pgtable_ops *pgtbl_ops;
>>>> +   struct io_pgtable_cfg pgtbl_cfg;
>>>> +   int i, ret = 0;
>>>> +   u32 reg;
>>>> +
>>>> +   mutex_lock(&qcom_domain->init_mutex);
>>>> +   if (qcom_domain->iommu)
>>>> +   goto out_unlock;
>>>> +
>>>> +   pgtbl_cfg = (struct io_pgtable_cfg) {
>>>> +   .pgsize_bitmap  = qcom_iommu_ops.pgsize_bitmap,
>>>> +   .ias= 32,
>>>> +   .oas= 40,
>>>> +   .tlb= &qcom_gather_ops,
>>>> +   .iommu_dev  = qcom_iommu->dev,
>>>> +   };
>>>> +
>>>> +   qcom_domain->iommu = qcom_iommu;
>>>> +   pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg,
>>>> fwspec);
>>>
>>>
>>>
>>> So why not pass in the ctx pointer itself
>>> that we get below as a cookie ? That would basically
>>> avoid iterating through the list in the tlb_ops ?
>>
>>
>> The issue is that one domain might be attached to a device with multiple
>> ctx's.
>>
>
> ok, interesting, what's the usecase for this ? Dynamic domain
> has 1-ctx shared with multiple domains, but this seems to be
> the inverse.

For example, the gpu has the "user" and "priv" contexts.  And at least
for now we attach a single domain.  (Actually we might not be
triggering the gpu yet to use the "user" context, so it hasn't really
been an issue yet.. that may possibly anger arm-smmu in the future..)
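
To illustrate why the cookie is the fwspec rather than a single ctx, here is a rough standalone model of a TLB op walking every context bank behind one domain.  All names (ctx_bank, fwspec_sketch, tlb_inv_all_contexts) and the 1-based id indexing are assumptions for the sketch, not the driver's code.

```c
#include <stddef.h>

#define MAX_CTX	8

/* Stand-in for a context bank; just counts invalidations it has seen. */
struct ctx_bank {
	unsigned int tlb_invalidations;
};

/* Stand-in for the fwspec: the list of context ids the device uses. */
struct fwspec_sketch {
	unsigned int num_ids;
	unsigned int ids[MAX_CTX];	/* assumed 1-based context bank ids */
};

/*
 * One domain, possibly several context banks (e.g. gpu "user" and
 * "priv"): an invalidate has to be issued on each of them.
 */
static void tlb_inv_all_contexts(struct ctx_bank *banks,
				 const struct fwspec_sketch *fwspec)
{
	unsigned int i;

	for (i = 0; i < fwspec->num_ids; i++)
		banks[fwspec->ids[i] - 1].tlb_invalidations++;
}
```

Passing a single ctx as the cookie would only cover one of those banks, which is the case the fwspec cookie avoids.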

>> Although perhaps __to_ctx() could be made a bit more clever.  I was
>> mostly in "make it work, optimize later" mode ;-)
>>
>> Note also, I'm thinking (both for qcom_iommu and arm-smmu) that we
>> want to move pgtbl alloc into _domain_alloc().. or at least that would
>> allow the driver to iommu_map/unmap() before attaching the domain.
>> (Partly this depends on how the iommu task and/or dynamic domain stuff
>> works out.. but one way or another we want to be able to map things to
>> pagetables that aren't the currently attached pagetables)
>
>
> ok, the dynamic domain patches i remember was tweaking the attach path
> to make this work. So are you trying to avoid that and simply do only
> a dynamic_domain_alloc instead ?
>

I probably need to look more closely at the dynamic domain stuff.
Although I'm not sure that we'll enable per-process pagetables on
anything earlier than a5xx so possibly we only care about arm-smmu for
that..

BR,
-R


Re: [PATCH V3 2/5] iommu/arm-smmu: Add support for MMU40x/500 clocks

2017-03-16 Thread Rob Clark
On Thu, Mar 9, 2017 at 10:35 AM, Sricharan R  wrote:
> The MMU400x/500 is the implementation of the SMMUv2
> arch specification. It is split in to two blocks
> TBU, TCU. TBU caches the page table, instantiated
> for each master locally, clocked by the TBUn_clk.
> TCU manages the address translation with PTW and has
> the programming interface as well, clocked using the
> TCU_CLK. The TBU can also be sharing the same clock
> domain as TCU, in which case both are clocked using
> the TCU_CLK.
>
> This defines the clock bindings for the same and adds the
> init, enable and disable functions for handling the
> clocks.
>
> Signed-off-by: Sricharan R 
> ---
>  .../devicetree/bindings/iommu/arm,smmu.txt | 27 ++
>  drivers/iommu/arm-smmu.c   | 95 
> +-
>  2 files changed, 121 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/devicetree/bindings/iommu/arm,smmu.txt 
> b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> index 6cdf32d..b369c13 100644
> --- a/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> +++ b/Documentation/devicetree/bindings/iommu/arm,smmu.txt
> @@ -60,6 +60,28 @@ conditions.
>aliases of secure registers have to be used during
>SMMU configuration.
>
> +- clock-names:Should be "tbu_clk" and "tcu_clk" and "cfg_clk" for
> +  "arm,mmu-400", "arm,mmu-401" and "arm,mmu-500"

I guess that should be: "Should be "tbu_clk" *or* "tcu_clk" and
"cfg_clk" for..."

Also, possibly we should define our own compat strings for various
SoC's that require these clks so we can properly describe when they
are required?  I guess that would address Rob H's comment.

BR,
-R

> +  "tcu_clk" is required for smmu's register access using the
> +  programming interface and ptw for downstream bus access.
> +
> +  "tbu_clk" is required for access to the TBU connected to 
> the
> +  master locally. This clock is optional and not required 
> when
> +  TBU is in the same clock domain as the TCU or when the TBU 
> is
> +  clocked along with the master.
> +
> +  "cfg_clk" is optional if required to access the TCU's 
> programming
> +  interface, apart from the "tcu_clk".
> +
> +- clocks: Phandles for respective clocks described by clock-names.
> +
> +- power-domains:  Phandles to SMMU's power domain specifier. This is
> +  required even if SMMU belongs to the master's power
> +  domain, as the SMMU will have to be enabled and
> +  accessed before master gets enabled and linked to its
> +  SMMU.
> +
>  ** Deprecated properties:
>
>  - mmu-masters (deprecated in favour of the generic "iommus" binding) :
> @@ -84,6 +106,11 @@ conditions.
>   <0 36 4>,
>   <0 37 4>;
>  #iommu-cells = <1>;
> +clocks = < GCC_SMMU_CFG_CLK>,
> + < GCC_APSS_TCU_CLK>,
> +< GCC_MDP_TBU_CLK>;
> +
> +   clock-names = "cfg_clk", "tcu_clk", "tbu_clk";
>  };
>
>  /* device with two stream IDs, 0 and 7 */
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index f7e11d3..720a1ef 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -341,6 +341,12 @@ struct arm_smmu_master_cfg {
>  #define for_each_cfg_sme(fw, i, idx) \
> for (i = 0; idx = fwspec_smendx(fw, i), i < fw->num_ids; ++i)
>
> +struct mmu500_clk {
> +   struct clk *cfg_clk;
> +   struct clk *tcu_clk;
> +   struct clk *tbu_clk;
> +};
> +
>  struct arm_smmu_clks {
> void *clks;
> int (*init_clocks)(struct arm_smmu_device *smmu);
> @@ -455,6 +461,92 @@ static struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom)
> return container_of(dom, struct arm_smmu_domain, domain);
>  }
>
> +static int mmu500_enable_clocks(struct arm_smmu_device *smmu)
> +{
> +   int ret = 0;
> +   struct mmu500_clk *sclks = smmu->smmu_clks.clks;
> +
> +   if (!sclks)
> +   return 0;
> +
> +   ret = clk_prepare_enable(sclks->cfg_clk);
> +   if (ret) {
> +   dev_err(smmu->dev, "Couldn't enable cfg_clk");
> +   return ret;
> +   }
> +
> +   ret = clk_prepare_enable(sclks->tcu_clk);
> +   if (ret) {
> +   dev_err(smmu->dev, "Couldn't enable tcu_clk");
> +   clk_disable_unprepare(sclks->cfg_clk);
> +   return ret;
> +   }
> +
> +   ret = clk_prepare_enable(sclks->tbu_clk);
> +   if (ret) {
> +   dev_err(smmu->dev, "Couldn't enable tbu_clk");
> +   clk_disable_unprepare(sclks->tcu_clk);
> +   clk_disable_unprepare(sclks->cfg_clk);
> +   return 
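The enable path quoted above turns the clocks on in a fixed order and, when a later one fails, disables the ones that were already running. A minimal userspace sketch of that pattern follows. The `fake_clk` type and the `fake_clk_enable()`/`fake_clk_disable()` helpers are hypothetical stand-ins for the kernel's `clk_prepare_enable()`/`clk_disable_unprepare()`, not real kernel API.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct clk; not the kernel type. */
struct fake_clk {
	int enabled;	/* 1 while the clock is running */
	int fail;	/* set to make the next enable fail */
};

static int fake_clk_enable(struct fake_clk *c)
{
	if (c->fail)
		return -EIO;
	c->enabled = 1;
	return 0;
}

static void fake_clk_disable(struct fake_clk *c)
{
	c->enabled = 0;
}

/*
 * Enable clocks in array order. On failure, disable the ones that
 * were already enabled, in reverse order, and return the error.
 */
static int enable_clocks(struct fake_clk **clks, size_t n)
{
	size_t i;
	int ret;

	for (i = 0; i < n; i++) {
		ret = fake_clk_enable(clks[i]);
		if (ret) {
			while (i-- > 0)
				fake_clk_disable(clks[i]);
			return ret;
		}
	}
	return 0;
}

/* Disable in the reverse of the enable order. */
static void disable_clocks(struct fake_clk **clks, size_t n)
{
	while (n-- > 0)
		fake_clk_disable(clks[n]);
}
```

Keeping the clocks in an array makes the unwind a loop rather than one hand-written cleanup branch per clock, which is a design choice of this sketch and not something the quoted patch does.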

Re: [PATCH V3 0/5] iommu/arm-smmu: Add runtime pm/sleep support

2017-03-31 Thread Rob Clark
On Fri, Mar 31, 2017 at 1:54 PM, Will Deacon  wrote:
> On Thu, Mar 09, 2017 at 09:05:43PM +0530, Sricharan R wrote:
>> This series provides the support for turning on the arm-smmu's
>> clocks/power domains using runtime pm. This is done using the
>> recently introduced device links patches, which lets the smmu's
>> runtime to follow the master's runtime pm, so the smmu remains
>> powered only when the masters use it.
>
> Do you have any numbers for the power savings you achieve with this?
> How often do we actually manage to stop the SMMU clocks on an SoC with
> a handful of masters?
>
> In other words, is this too coarse-grained to be useful, or is it common
> that all the devices upstream of the SMMU are suspended?

well, if you think about a phone/tablet with a command mode panel,
pretty much all devices will be suspended most of the time ;-)

maybe it's a different case with servers.. unfortunately we have to
share the same driver across both..

BR,
-R

> Thanks,
>
> Will
>
>>
>> Took some reference from the exynos runtime patches [2].
>> Tested this with MDP, GPU, VENUS devices on apq8096-db820c board.
>>
>> Previous version of the patchset [1].
>>
>> [V3]
>>* Reworked the patches to keep the clocks init/enabling function
>>  separately for each compatible.
>>
>>* Added clocks bindings for MMU40x/500.
>>
>>* Added a new compatible for qcom,smmu-v2 implementation and
>>  the clock bindings for the same.
>>
>>* Rebased on top of 4.11-rc1
>>
>> [V2]
>>* Split the patches little differently.
>>
>>* Addressed comments.
>>
>>* Removed the patch #4 [3] from previous post
>>  for arm-smmu context save restore. Planning to
>>  post this separately after reworking/addressing Robin's
>>  feedback.
>>
>>* Reversed the sequence to disable clocks than enabling.
>>  This was required for those cases where the
>>  clocks are populated in a dependent order from DT.
>>
>> [1] https://www.spinics.net/lists/linux-arm-msm/msg23870.html
>> [2] https://lkml.org/lkml/2016/10/20/70
>> [3] https://patchwork.kernel.org/patch/9389717/
>>
>> Sricharan R (5):
>>   iommu/arm-smmu: Add pm_runtime/sleep ops
>>   iommu/arm-smmu: Add support for MMU40x/500 clocks
>>   drivers: arm-smmu: Add clock support for QCOM_SMMUV2
>>   iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device
>>   iommu/arm-smmu: Add the device_link between masters and smmu
>>
>>  .../devicetree/bindings/iommu/arm,smmu.txt |  35 +++
>>  drivers/iommu/arm-smmu.c   | 349 
>> -
>>  2 files changed, 373 insertions(+), 11 deletions(-)
>>
>> --
>> QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of 
>> Code Aurora Forum, hosted by The Linux Foundation
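A rough model of the behaviour described in the cover letter: with a runtime-PM device link, the supplier (the SMMU) is powered when the first consumer (a master) resumes and can power down when the last one suspends. Everything below (`struct supplier`, `consumer_resume()`, `consumer_suspend()`) is a hypothetical userspace model for illustration, not the kernel's device-link or runtime-PM API.

```c
#include <stdbool.h>

/* Hypothetical model of an SMMU acting as a PM supplier. */
struct supplier {
	unsigned int active_consumers;	/* masters currently resumed */
	bool powered;			/* clocks/power domain on? */
	unsigned int power_cycles;	/* number of power-up events */
};

/* A master resumes: power the supplier first if it was off. */
static void consumer_resume(struct supplier *s)
{
	if (s->active_consumers++ == 0) {
		s->powered = true;
		s->power_cycles++;
	}
}

/* A master suspends: power the supplier down after the last one. */
static void consumer_suspend(struct supplier *s)
{
	if (s->active_consumers == 0)
		return;	/* unbalanced call; ignored in this sketch */
	if (--s->active_consumers == 0)
		s->powered = false;
}
```

In this model the supplier is off whenever every master is suspended, which is the situation the reply describes for a phone or tablet with a command-mode panel.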
>>


Re: [PATCH 3/9] Docs: dt: document qcom iommu bindings

2017-03-03 Thread Rob Clark
On Fri, Mar 3, 2017 at 1:21 AM, Rob Herring <r...@kernel.org> wrote:
> On Wed, Mar 01, 2017 at 12:42:52PM -0500, Rob Clark wrote:
>
> Nit: use "dt-bindings: iommu: ..." for subject. And a commit message
> would be nice.
>
>> Cc: devicet...@vger.kernel.org
>> Signed-off-by: Rob Clark <robdcl...@gmail.com>
>> ---
>>  .../devicetree/bindings/iommu/qcom,iommu.txt   | 106 
>> +
>>  1 file changed, 106 insertions(+)
>>  create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>>
>> diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>> new file mode 100644
>> index 000..2e69b78
>> --- /dev/null
>> +++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>> @@ -0,0 +1,106 @@
>> +* QCOM IOMMU v1 Implementation
>> +
>> +Qualcomm "B" family devices which are not compatible with arm-smmu have
>> +a similar looking IOMMU but without access to the global register space,
>> +and optionally requiring additional configuration to route context irqs
>> +to non-secure vs secure interrupt line.
>> +
>> +** Required properties:
>> +
>> +- compatible   : Should be "qcom,msm-iommu-v1".
>
> Fine as a fallback, but this needs chip specific compatibles.

ok, so maybe:

   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
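As a sketch of why the chip-specific string plus generic fallback works: a driver that only knows the generic string still matches, because matching asks whether any entry in the node's list equals the string it is looking for. The helper below is hypothetical and only loosely modelled on the question `of_device_is_compatible()` answers.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Return true if any entry in the node's compatible list equals
 * the string the driver matches on. Hypothetical helper.
 */
static bool node_is_compatible(const char *const *compat, size_t n,
			       const char *wanted)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (strcmp(compat[i], wanted) == 0)
			return true;
	return false;
}
```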

>> +- clocks   : The interface clock (iface_clk) and bus clock (bus_clk).
>
> The names need to be documented under clock-names prop.
>
> '_clk' is redundant.

ok

>> +- #address-cells   : must be 1.
>> +- #size-cells  : must be 1.
>> +- #iommu-cells : Must be 1.
>> +- ranges   : Base address and size of the iommu context banks.
>> +- qcom,iommu-secure-id  : secure-id.
>> +
>> +- List of sub-nodes, one per translation context bank.  Each sub-node
>> +  has the following required properties:
>> +
>> +  - compatible : Should be one of:
>> +- "qcom,msm-iommu-v1-ns"  : non-secure context bank
>> +- "qcom,msm-iommu-v1-sec" : secure context bank
>
> These are okay without chip specific strings.
>
>> +  - reg: Base address and size of context bank within the iommu
>> +  - interrupts : The context fault irq.
>> +
>> +** Optional properties:
>> +
>> +- reg  : Base address and size of the SMMU local base, should
>> + be only specified if the iommu requires configuration
>> + for routing of context bank irq's to secure vs non-
>> + secure lines.  (Ie. if the iommu contains secure
>> + context banks)
>> +
>> +
>> +** Examples:
>> +
>> + apps_iommu: msm-iommu-v1@1e2 {
>
> iommu@...
>
> And this should be the reg address, not the ranges address.

ok.. but I'm not entirely sure what to do w/ gpu_iommu, which doesn't
have a reg address.

I guess I could have a required reg address (which is the inaccessible
global register space), and make the "SMMU local base" thing a 2nd
optional address.  Not sure if that is weird, since we can't actually
do anything with the global register space.

>> + #address-cells = <1>;
>> + #size-cells = <1>;
>> + #iommu-cells = <1>;
>> + compatible = "qcom,msm-iommu-v1";
>> + ranges = <0 0x1e2 0x4>;
>> + reg = <0x1ef 0x3000>;
>> + clocks = < GCC_SMMU_CFG_CLK>,
>> +  < GCC_APSS_TCU_CLK>;
>> + clock-names = "iface_clk", "bus_clk";
>> + qcom,iommu-secure-id = <17>;
>> +
>> + // mdp_0:
>> + msm-iommu-v1-ctx@4000 {
>
> iommu@...

it's not weird to have:

  iommu@1e2 {
...
iommu@4000 {
   ...
};
  };

??

BR,
-R

>> + compatible = "qcom,msm-iommu-v1-ns";
>> + reg = <0x4000 0x1000>;
>> + interrupts = ;
>> + };
>> +
>> + // venus_ns:
>> + msm-iommu-v1-ctx@5000 {
>> + compatible = "qcom,msm-iommu-v1-sec";
>> + reg = <0x5000 0x1000>;
>> + interrupts = ;
>> + };
>> + };
>> +
>> + gpu_iommu: msm-iommu-v1@1f08000 {

[PATCH 2/9] firmware: qcom_scm: add two scm calls for iommu secure page table

2017-03-14 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

Those two new SCM calls are needed from qcom-iommu driver in order
to initialize secure iommu page table.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/firmware/qcom_scm-32.c | 12 
 drivers/firmware/qcom_scm-64.c | 42 ++
 drivers/firmware/qcom_scm.c| 12 
 drivers/firmware/qcom_scm.h|  6 ++
 include/linux/qcom_scm.h   |  4 
 5 files changed, 76 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 722e65a..93e3b96 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -584,3 +584,15 @@ int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
 {
return -ENODEV;
 }
+
+int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
+ size_t *size)
+{
+   return -ENODEV;
+}
+
+int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr, u32 size,
+ u32 spare)
+{
+   return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index 550e3a3..6e6d561 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -397,3 +397,45 @@ int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id, u32 spare)
 
return ret ? : res.a1;
 }
+
+int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
+ size_t *size)
+{
+   struct qcom_scm_desc desc = {0};
+   struct arm_smccc_res res;
+   int ret;
+
+   desc.args[0] = spare;
+   desc.arginfo = QCOM_SCM_ARGS(1);
+
+   ret = qcom_scm_call(dev, QCOM_SCM_SVC_MP,
+   QCOM_SCM_IOMMU_SECURE_PTBL_SIZE, &desc, &res);
+
+   if (size)
+   *size = res.a1;
+
+   return ret ? : res.a2;
+}
+
+int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr, u32 size,
+ u32 spare)
+{
+   struct qcom_scm_desc desc = {0};
+   struct arm_smccc_res res;
+   int ret;
+
+   desc.args[0] = addr;
+   desc.args[1] = size;
+   desc.args[2] = spare;
+   desc.arginfo = QCOM_SCM_ARGS(3, QCOM_SCM_RW, QCOM_SCM_VAL,
+QCOM_SCM_VAL);
+
+   ret = qcom_scm_call(dev, QCOM_SCM_SVC_MP,
+   QCOM_SCM_IOMMU_SECURE_PTBL_INIT, &desc, &res);
+
+   /* the pg table has been initialized already, ignore the error */
+   if (ret == -EPERM)
+   ret = 0;
+
+   return ret;
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index ae1f473..bb16510 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -321,6 +321,18 @@ int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare)
 }
 EXPORT_SYMBOL(qcom_scm_restore_sec_cfg);
 
+int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size)
+{
+   return __qcom_scm_iommu_secure_ptbl_size(__scm->dev, spare, size);
+}
+EXPORT_SYMBOL(qcom_scm_iommu_secure_ptbl_size);
+
+int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare)
+{
+   return __qcom_scm_iommu_secure_ptbl_init(__scm->dev, addr, size, spare);
+}
+EXPORT_SYMBOL(qcom_scm_iommu_secure_ptbl_init);
+
 /**
  * qcom_scm_is_available() - Checks if SCM is available
  */
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index 31fc732..9bea691 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -89,5 +89,11 @@ static inline int qcom_scm_remap_error(int err)
 #define QCOM_SCM_RESTORE_SEC_CFG   2
 extern int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
  u32 spare);
+#define QCOM_SCM_IOMMU_SECURE_PTBL_SIZE3
+#define QCOM_SCM_IOMMU_SECURE_PTBL_INIT4
+extern int __qcom_scm_iommu_secure_ptbl_size(struct device *dev, u32 spare,
+size_t *size);
+extern int __qcom_scm_iommu_secure_ptbl_init(struct device *dev, u64 addr,
+u32 size, u32 spare);
 
 #endif
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index 22017f5d..e538047 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -41,6 +41,8 @@ extern void qcom_scm_cpu_power_down(u32 flags);
 extern u32 qcom_scm_get_version(void);
 extern int qcom_scm_set_remote_state(u32 state, u32 id);
 extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
+extern int qcom_scm_iommu_secure_ptbl_size(u32 spare, size_t *size);
+extern int qcom_scm_iommu_secure_ptbl_init(u64 addr, u32 size, u32 spare);
 #else
 static inline
 int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
@@ -69,5 +71,7 @@ static inline u32 qcom_scm_get_version(void) { return
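The 64-bit helpers in this patch return the transport error if the call itself failed and otherwise the status word reported by firmware (`ret ? : res.a1` is the GNU shorthand for `ret ? ret : res.a1`). The page-table init call additionally treats -EPERM as "already initialized". A portable sketch of both rules follows; `struct fake_res`, `scm_status()` and `ptbl_init_status()` are hypothetical names and not part of the SCM driver.

```c
#include <errno.h>

/* Hypothetical stand-in for the result registers of an SMC call. */
struct fake_res {
	long a1;	/* firmware status (or payload) */
	long a2;
};

/*
 * Transport errors take precedence; otherwise report the firmware
 * status word. Equivalent to the GNU form "ret ? : res->a1".
 */
static int scm_status(int ret, const struct fake_res *res)
{
	return ret ? ret : (int)res->a1;
}

/* -EPERM from the init call means the table was already set up. */
static int ptbl_init_status(int ret)
{
	if (ret == -EPERM)
		return 0;
	return ret;
}
```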

[PATCH 0/9] iommu: add qcom_iommu for early "B" family devices (v2)

2017-03-14 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

There are a couple vaguely unrelated patches to add venus and gpu dt nodes,
so that we have something to wire up the iommu to.

These patches apply on top of some in-flight patches to support IOMMU
probe deferral.  You can find full branch on top of linux-next here:

   git://people.freedesktop.org/~robclark/linux next-20170307-db410c-qcom-smmu-3-venus

or github if you prefer:

   
https://github.com/freedreno/kernel-msm/commits/next-20170307-db410c-qcom-smmu-3-venus

Compared to previous patchset, there have been some (mostly binding related)
cleanups.  Also fixed some other-config related build issues that kbuild
robot spotted.

Rob Clark (6):
  firmware/qcom: add qcom_scm_restore_sec_cfg()
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu
  ARM64: DT: add gpu for msm8916
  ARM64: DT: add iommu for msm8916

Stanimir Varbanov (3):
  firmware: qcom_scm: add two scm calls for iommu secure page table
  iommu: qcom: initialize secure page table
  ARM64: DT: add video codec devicetree node

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 113 +++
 arch/arm64/boot/dts/qcom/msm8916.dtsi  | 108 +++
 drivers/firmware/qcom_scm-32.c |  18 +
 drivers/firmware/qcom_scm-64.c |  58 ++
 drivers/firmware/qcom_scm.c|  18 +
 drivers/firmware/qcom_scm.h|  11 +
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 227 ++
 drivers/iommu/arm-smmu.c   | 200 +
 drivers/iommu/qcom_iommu.c | 882 +
 include/linux/qcom_scm.h   |   6 +
 12 files changed, 1453 insertions(+), 199 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.9.3



[PATCH 1/9] firmware/qcom: add qcom_scm_restore_sec_cfg()

2017-03-14 Thread Rob Clark
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/firmware/qcom_scm-32.c |  6 ++
 drivers/firmware/qcom_scm-64.c | 16 
 drivers/firmware/qcom_scm.c|  6 ++
 drivers/firmware/qcom_scm.h|  5 +
 include/linux/qcom_scm.h   |  2 ++
 5 files changed, 35 insertions(+)

diff --git a/drivers/firmware/qcom_scm-32.c b/drivers/firmware/qcom_scm-32.c
index 8ad226c..722e65a 100644
--- a/drivers/firmware/qcom_scm-32.c
+++ b/drivers/firmware/qcom_scm-32.c
@@ -578,3 +578,9 @@ int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
 
return ret ? : le32_to_cpu(scm_ret);
 }
+
+int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
+  u32 spare)
+{
+   return -ENODEV;
+}
diff --git a/drivers/firmware/qcom_scm-64.c b/drivers/firmware/qcom_scm-64.c
index c933259..550e3a3 100644
--- a/drivers/firmware/qcom_scm-64.c
+++ b/drivers/firmware/qcom_scm-64.c
@@ -381,3 +381,19 @@ int __qcom_scm_set_remote_state(struct device *dev, u32 state, u32 id)
 
return ret ? : res.a1;
 }
+
+int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id, u32 spare)
+{
+   struct qcom_scm_desc desc = {0};
+   struct arm_smccc_res res;
+   int ret;
+
+   desc.args[0] = device_id;
+   desc.args[1] = spare;
+   desc.arginfo = QCOM_SCM_ARGS(2);
+
+   ret = qcom_scm_call(dev, QCOM_SCM_SVC_MP, QCOM_SCM_RESTORE_SEC_CFG,
+   &desc, &res);
+
+   return ret ? : res.a1;
+}
diff --git a/drivers/firmware/qcom_scm.c b/drivers/firmware/qcom_scm.c
index d987bcc..ae1f473 100644
--- a/drivers/firmware/qcom_scm.c
+++ b/drivers/firmware/qcom_scm.c
@@ -315,6 +315,12 @@ static const struct reset_control_ops qcom_scm_pas_reset_ops = {
.deassert = qcom_scm_pas_reset_deassert,
 };
 
+int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare)
+{
+   return __qcom_scm_restore_sec_cfg(__scm->dev, device_id, spare);
+}
+EXPORT_SYMBOL(qcom_scm_restore_sec_cfg);
+
 /**
  * qcom_scm_is_available() - Checks if SCM is available
  */
diff --git a/drivers/firmware/qcom_scm.h b/drivers/firmware/qcom_scm.h
index 6a0f154..31fc732 100644
--- a/drivers/firmware/qcom_scm.h
+++ b/drivers/firmware/qcom_scm.h
@@ -85,4 +85,9 @@ static inline int qcom_scm_remap_error(int err)
return -EINVAL;
 }
 
+#define QCOM_SCM_SVC_MP0xc
+#define QCOM_SCM_RESTORE_SEC_CFG   2
+extern int __qcom_scm_restore_sec_cfg(struct device *dev, u32 device_id,
+ u32 spare);
+
 #endif
diff --git a/include/linux/qcom_scm.h b/include/linux/qcom_scm.h
index d32f6f1..22017f5d 100644
--- a/include/linux/qcom_scm.h
+++ b/include/linux/qcom_scm.h
@@ -40,6 +40,7 @@ extern int qcom_scm_pas_shutdown(u32 peripheral);
 extern void qcom_scm_cpu_power_down(u32 flags);
 extern u32 qcom_scm_get_version(void);
 extern int qcom_scm_set_remote_state(u32 state, u32 id);
+extern int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare);
 #else
 static inline
 int qcom_scm_set_cold_boot_addr(void *entry, const cpumask_t *cpus)
@@ -67,5 +68,6 @@ static inline void qcom_scm_cpu_power_down(u32 flags) {}
 static inline u32 qcom_scm_get_version(void) { return 0; }
 static inline u32
 qcom_scm_set_remote_state(u32 state,u32 id) { return -ENODEV; }
+static inline int qcom_scm_restore_sec_cfg(u32 device_id, u32 spare) { return -ENODEV; }
 #endif
 #endif
-- 
2.9.3
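The header change in this patch follows the usual pattern of pairing a real declaration with a `static inline` stub that returns -ENODEV when the feature is compiled out, so callers build either way. A self-contained sketch of the pattern, where `HAVE_FAKE_SCM`, `fake_restore_sec_cfg()` and `restore_on_resume()` are made-up names for illustration:

```c
#include <errno.h>
#include <stdint.h>

#ifdef HAVE_FAKE_SCM
/* The real implementation would live in the driver. */
int fake_restore_sec_cfg(uint32_t device_id, uint32_t spare);
#else
/* Feature compiled out: callers still build and see -ENODEV. */
static inline int fake_restore_sec_cfg(uint32_t device_id, uint32_t spare)
{
	(void)device_id;
	(void)spare;
	return -ENODEV;
}
#endif

/* Caller code is identical in both configurations. */
static int restore_on_resume(uint32_t sec_id)
{
	return fake_restore_sec_cfg(sec_id, 0);
}
```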



[PATCH 6/9] iommu: qcom: initialize secure page table

2017-03-14 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 6b7adbf..743a628 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -608,6 +608,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+   return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+   if (ret) {
+   dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+   ret);
+   return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+   dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+   psize);
+   return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+   dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+   goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int qcom_iommu_ctx_probe(struct platform_device *pdev)
 {
struct qcom_iommu_ctx *ctx;
@@ -688,6 +733,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
 };
 module_platform_driver(qcom_iommu_ctx_driver);
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+   if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+   return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
struct qcom_iommu_dev *qcom_iommu;
@@ -724,6 +780,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+   ret = qcom_iommu_sec_ptbl_init(dev);
+   if (ret) {
+   dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+   return ret;
+   }
+   }
+
platform_set_drvdata(pdev, qcom_iommu);
 
/* register context bank devices, which are child nodes: */
-- 
2.9.3
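The init routine in this patch follows a query-size, allocate, hand-to-firmware sequence. It is guarded so that it runs only once, and it frees the buffer if the firmware call fails. A userspace sketch of that control flow, assuming hypothetical callbacks `query_size` and `fw_init` in place of the two SCM calls and `malloc()` in place of `dma_alloc_attrs()`:

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hooks standing in for the two SCM calls. */
struct sec_ptbl_ops {
	int (*query_size)(size_t *size);
	int (*fw_init)(void *buf, size_t size);
};

static bool sec_ptbl_allocated;	/* mirrors the static flag in the patch */

static int sec_ptbl_init(const struct sec_ptbl_ops *ops)
{
	size_t size = 0;
	void *buf;
	int ret;

	if (sec_ptbl_allocated)
		return 0;	/* already handed to firmware once */

	ret = ops->query_size(&size);
	if (ret)
		return ret;

	buf = malloc(size);
	if (!buf)
		return -ENOMEM;

	ret = ops->fw_init(buf, size);
	if (ret) {
		free(buf);	/* firmware never took ownership */
		return ret;
	}

	/* On success the buffer is deliberately never freed. */
	sec_ptbl_allocated = true;
	return 0;
}

/* Demo hooks: report a 4 KiB table and count init calls. */
static int demo_init_calls;

static int demo_query_size(size_t *size)
{
	*size = 4096;
	return 0;
}

static int demo_fw_init(void *buf, size_t size)
{
	(void)buf;
	(void)size;
	demo_init_calls++;
	return 0;
}
```

Calling `sec_ptbl_init()` twice with the demo hooks reaches the firmware hook only once, which is the purpose of the static flag.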



[PATCH 5/9] iommu: add qcom_iommu

2017-03-14 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do not completely
implement the ARM SMMU spec.  These devices have context-bank register
layout that is similar to ARM SMMU, but no global register space (or at
least not one that is accessible).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
---
 drivers/iommu/Kconfig |  10 +
 drivers/iommu/Makefile|   1 +
 drivers/iommu/arm-smmu-regs.h |   2 +
 drivers/iommu/qcom_iommu.c| 818 ++
 4 files changed, 831 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 37e204f..400a404 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -359,4 +359,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   bool "Qualcomm IOMMU Support"
+   depends on ARM || ARM64
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b9..b910aea 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
index 632240f..e643164 100644
--- a/drivers/iommu/arm-smmu-regs.h
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -174,6 +174,8 @@ enum arm_smmu_s2cr_privcfg {
 #define ARM_SMMU_CB_S1_TLBIVAL 0x620
 #define ARM_SMMU_CB_S2_TLBIIPAS2   0x630
 #define ARM_SMMU_CB_S2_TLBIIPAS2L  0x638
+#define ARM_SMMU_CB_TLBSYNC0x7f0
+#define ARM_SMMU_CB_TLBSTATUS  0x7f4
 #define ARM_SMMU_CB_ATS1PR 0x800
 #define ARM_SMMU_CB_ATSR   0x8f0
 
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index 000..6b7adbf
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,818 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#define pr_fmt(fmt) "qcom-iommu: " fmt
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   struct list_head context_list;   /* list of qcom_iommu_context */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   unsigned int irq;
+   bool secure_init;
+   u32  asid;  /* asid and ctx bank # are 1:1 */
+   struct iommu_group  *group;
+   struct list_head node;  /* head in qcom_iommu_device::context_list */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * __to_iommu(struct iommu_fwspec *fwspec)
+{
+   if (WARN_ON(!fwspec || fwspec->ops != &qcom_iommu_ops
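`to_qcom_iommu_domain()` in this patch relies on `container_of()` to get from the embedded `struct iommu_domain` back to the driver's wrapper structure. A standalone sketch of the idiom, using a simplified `container_of` built on `offsetof` and made-up struct names (`outer_domain`, `inner_domain`):

```c
#include <stddef.h>

/* Simplified container_of: no type checking, unlike the kernel macro. */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

struct inner_domain {
	int type;
};

struct outer_domain {
	int private_state;
	struct inner_domain domain;	/* embedded, not a pointer */
};

/* Recover the wrapper from a pointer to its embedded member. */
static struct outer_domain *to_outer_domain(struct inner_domain *dom)
{
	return container_of(dom, struct outer_domain, domain);
}
```

This only works because the inner structure is embedded by value, so its address is a fixed offset from the start of the wrapper.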

Re: [PATCH 5/9] iommu: add qcom_iommu

2017-03-13 Thread Rob Clark
On Mon, Mar 13, 2017 at 9:38 AM,   wrote:
> Hi Rob,
>
> [..]
>
>
>> +static int qcom_iommu_init_domain(struct iommu_domain *domain,
>> + struct qcom_iommu_dev *qcom_iommu,
>> + struct iommu_fwspec *fwspec)
>> +{
>> +   struct qcom_iommu_domain *qcom_domain = to_qcom_iommu_domain(domain);
>> +   struct io_pgtable_ops *pgtbl_ops;
>> +   struct io_pgtable_cfg pgtbl_cfg;
>> +   int i, ret = 0;
>> +   u32 reg;
>> +
>> +   mutex_lock(&qcom_domain->init_mutex);
>> +   if (qcom_domain->iommu)
>> +   goto out_unlock;
>> +
>> +   pgtbl_cfg = (struct io_pgtable_cfg) {
>> +   .pgsize_bitmap  = qcom_iommu_ops.pgsize_bitmap,
>> +   .ias= 32,
>> +   .oas= 40,
>> +   .tlb= _gather_ops,
>> +   .iommu_dev  = qcom_iommu->dev,
>> +   };
>> +
>> +   qcom_domain->iommu = qcom_iommu;
>> +   pgtbl_ops = alloc_io_pgtable_ops(ARM_32_LPAE_S1, &pgtbl_cfg, fwspec);
>
>
> So why not pass in the ctx pointer itself
> that we get below as a cookie ? That would basically
> avoid iterating through the list in the tlb_ops ?

The issue is that one domain might be attached to a device with multiple ctx's.

Although perhaps __to_ctx() could be made a bit more clever.  I was
mostly in "make it work, optimize later" mode ;-)

Note also, I'm thinking (both for qcom_iommu and arm-smmu) that we
want to move pgtbl alloc into _domain_alloc().. or at least that would
allow the driver to iommu_map/unmap() before attaching the domain.
(Partly this depends on how the iommu task and/or dynamic domain stuff
works out.. but one way or another we want to be able to map things to
pagetables that aren't the currently attached pagetables)

> [..]
>
>
>> +static int qcom_iommu_add_device(struct device *dev)
>> +{
>> +   struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
>> +   struct iommu_group *group;
>> +   struct device_link *link;
>> +
>> +   if (!qcom_iommu)
>> +   return -ENODEV;
>> +
>> +   group = iommu_group_get_for_dev(dev);
>> +   if (IS_ERR_OR_NULL(group))
>> +   return PTR_ERR_OR_ZERO(group);
>> +
>> +   iommu_group_put(group);
>> +   iommu_device_link(&qcom_iommu->iommu, dev);
>> +
>> +   /*
>> +* Establish the link between iommu and master, so that the
>> +* iommu gets runtime enabled/disabled as per the master's
>> +* needs.
>> +*/
>> +   link = device_link_add(dev, qcom_iommu->dev, DL_FLAG_PM_RUNTIME);
>> +   if (!link) {
>> +   dev_warn(qcom_iommu->dev, "Unable to create device link between %s and %s\n",
>> +dev_name(qcom_iommu->dev), dev_name(dev));
>> +   /* TODO fatal or ignore? */
>> +   }
>
>
> Yes, it should be fatal when we depend on the master's pm_runtime to
> call the iommu's runtime. The iommu may remain unclocked if the link
> is not there. This might have to be fixed in my patch as well.

ok, I've made it -ENODEV

>
>> +
>> +   return 0;
>> +}
>> +
>> +static void qcom_iommu_remove_device(struct device *dev)
>> +{
>> +   struct qcom_iommu_dev *qcom_iommu = __to_iommu(dev->iommu_fwspec);
>> +
>> +   if (!qcom_iommu)
>> +   return;
>> +
>> +   iommu_group_remove_device(dev);
>> +   iommu_device_unlink(&qcom_iommu->iommu, dev);
>> +   iommu_fwspec_free(dev);
>> +}
>> +
>> +static struct iommu_group *qcom_iommu_device_group(struct device *dev)
>> +{
>> +   struct iommu_fwspec *fwspec = dev->iommu_fwspec;
>> +   struct iommu_group *group = NULL;
>> +   unsigned i;
>> +
>> +   for (i = 0; i < fwspec->num_ids; i++) {
>> +   struct qcom_iommu_ctx *ctx = __to_ctx(fwspec, fwspec->ids[i]);
>> +
>> +   if (group && ctx->group && group != ctx->group)
>> +   return ERR_PTR(-EINVAL);
>> +
>> +   group = ctx->group;
>> +   }
>
>
> I think in this case the master devices would not intentionally
> populate the same asid/ctx bank more than once, or is this to
> catch an accidental wrong DT entry? Just wondering whether we ever
> need this logic to get an already existing group in our case, or
> could we simply always create a new group?

mostly just to catch wrong DT entry.. I don't think we'd need it
anyways.  Perhaps it justifies a WARN_ON()?

>> +
>> +   if (group)
>> +   return iommu_group_ref_get(group);
>> +
>> +   group = generic_device_group(dev);
>> +
>> +   for (i = 0; i < fwspec->num_ids; i++) {
>> +   struct qcom_iommu_ctx *ctx = __to_ctx(fwspec, fwspec->ids[i]);
>> +   ctx->group = iommu_group_ref_get(group);
>> +   }
>> +
>> +   return group;
>> +}
>> +
>> +static int qcom_iommu_of_xlate(struct device *dev, struct of_phandle_args *args)
>> +{
>> +   struct platform_device *iommu_pdev;
>> +
>> +  
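The `device_group` callback discussed in this thread checks that the context banks referenced by one master do not already belong to different groups, and rejects the configuration if they do. A compact model of that check follows, with hypothetical `struct ctx` and `struct group` types in place of the kernel's. Note one deliberate difference: the quoted loop overwrites `group` on every iteration, while this sketch only remembers non-NULL groups.

```c
#include <stddef.h>

struct group { int id; };		/* hypothetical iommu group */
struct ctx { struct group *group; };	/* hypothetical context bank */

/*
 * Walk the context banks used by one master.
 * Returns 0 and stores the shared group (or NULL if none is set yet)
 * in *out; returns -1 if two banks already belong to different groups.
 */
static int find_common_group(struct ctx *const *ctxs, size_t n,
			     struct group **out)
{
	struct group *found = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		struct group *g = ctxs[i]->group;

		if (found && g && found != g)
			return -1;	/* conflicting DT description */
		if (g)
			found = g;
	}
	*out = found;
	return 0;
}
```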

[PATCH 7/9] ARM64: DT: add gpu for msm8916

2017-03-14 Thread Rob Clark
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 arch/arm64/boot/dts/qcom/msm8916.dtsi | 23 +++
 1 file changed, 23 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index 68a8e67..b0daf39 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -698,6 +698,29 @@
#thermal-sensor-cells = <1>;
};
 
+   gpu@01c0 {
+   compatible = "qcom,adreno-306.0", "qcom,adreno";
+   reg = <0x01c0 0x2>;
+   reg-names = "kgsl_3d0_reg_memory";
+   interrupts = <0 33 0>;
+   interrupt-names = "kgsl_3d0_irq";
+   clock-names =
+   "core",
+   "iface",
+   "mem",
+   "mem_iface",
+   "alt_mem_iface",
+   "gfx3d_clk";
+   clocks =
+   < GCC_OXILI_GFX3D_CLK>,
+   < GCC_OXILI_AHB_CLK>,
+   < GCC_OXILI_GMEM_CLK>,
+   < GCC_BIMC_GFX_CLK>,
+   < GCC_BIMC_GPU_CLK>,
+   < GFX3D_CLK_SRC>;
+   power-domains = < OXILI_GDSC>;
+   };
+
mdss: mdss@1a0 {
compatible = "qcom,mdss";
reg = <0x1a0 0x1000>,
-- 
2.9.3



[PATCH 9/9] ARM64: DT: add iommu for msm8916

2017-03-14 Thread Rob Clark
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 arch/arm64/boot/dts/qcom/msm8916.dtsi | 57 +++
 1 file changed, 57 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index 7bcf4cd..8aeec6f 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -705,6 +705,59 @@
#thermal-sensor-cells = <1>;
};
 
+   apps_iommu: iommu@1ef {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1e2 0x4>;
+   reg = <0x1ef 0x3000>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_APSS_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   iommu-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   iommu-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x1>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_GFX_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   iommu-ctx@1000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   iommu-ctx@2000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
gpu@01c0 {
compatible = "qcom,adreno-306.0", "qcom,adreno";
reg = <0x01c0 0x2>;
@@ -726,6 +779,7 @@
< GCC_BIMC_GPU_CLK>,
< GFX3D_CLK_SRC>;
power-domains = < OXILI_GDSC>;
+   iommus = <_iommu 1>, <_iommu 2>;
};
 
mdss: mdss@1a0 {
@@ -769,6 +823,8 @@
  "core_clk",
  "vsync_clk";
 
+   iommus = <_iommu 4>;
+
ports {
#address-cells = <1>;
#size-cells = <0>;
@@ -1207,6 +1263,7 @@
 < GCC_VENUS0_AHB_CLK>,
 < GCC_VENUS0_AXI_CLK>;
clock-names = "core", "iface", "bus";
+   iommus = <_iommu 5>;
memory-region = <_mem>;
status = "okay";
 
-- 
2.9.3



[PATCH 8/9] ARM64: DT: add video codec devicetree node

2017-03-14 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 arch/arm64/boot/dts/qcom/msm8916.dtsi | 28 
 1 file changed, 28 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/msm8916.dtsi b/arch/arm64/boot/dts/qcom/msm8916.dtsi
index b0daf39..7bcf4cd 100644
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi
+++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi
@@ -88,6 +88,13 @@
no-map;
};
 
+   venus_mem: venus@8990 {
+   compatible = "shared-dma-pool";
+   reg = <0x0 0x8990 0x0 0x80>;
+   alignment = <0x1000>;
+   no-map;
+   };
+
mba_mem: mba@8ea0 {
no-map;
reg = <0 0x8ea0 0 0x10>;
@@ -1190,6 +1197,27 @@
};
};
};
+
+   venus: video-codec@1d0 {
+   compatible = "qcom,msm8916-venus";
+   reg = <0x01d0 0xff000>;
+   interrupts = ;
+   power-domains = < VENUS_GDSC>;
+   clocks = < GCC_VENUS0_VCODEC0_CLK>,
+< GCC_VENUS0_AHB_CLK>,
+< GCC_VENUS0_AXI_CLK>;
+   clock-names = "core", "iface", "bus";
+   memory-region = <_mem>;
+   status = "okay";
+
+   video-decoder {
+   compatible = "venus-decoder";
+   };
+
+   video-encoder {
+   compatible = "venus-encoder";
+   };
+   };
};
 
smd {
-- 
2.9.3



[PATCH 4/9] iommu: arm-smmu: split out register defines

2017-03-14 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 225 ++
 drivers/iommu/arm-smmu.c  | 200 +
 2 files changed, 226 insertions(+), 199 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index 000..632240f
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,225 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE(1 << 3)
+#define sCR0_GCFGFRE   (1 << 4)
+#define sCR0_GCFGFIE   (1 << 5)
+#define sCR0_USFCFG(1 << 10)
+#define sCR0_VMIDPNE   (1 << 11)
+#define sCR0_PTM   (1 << 12)
+#define sCR0_FB(1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS   (1 << 30)
+#define ID0_S2TS   (1 << 29)
+#define ID0_NTS(1 << 28)
+#define ID0_SMS(1 << 27)
+#define ID0_ATOSNS (1 << 26)
+#define ID0_PTFS_NO_AARCH32(1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW   (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT0
+#define ID1_NUMCB_MASK 0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K(1 << 12)
+#define ID2_PTFS_16K   (1 << 13)
+#define ID2_PTFS_64K   (1 << 14)
+#define ID2_VMID16 (1 << 15)
+
+#define ID7_MAJOR_SHIFT4
+#define ID7_MAJOR_MASK 0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE   (1 << 0)
+#define TLB_LOOP_TIMEOUT   100 /* 1s! */
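
For readers following along: the *_SHIFT / *_MASK pairs in this header are
meant to be used together, shifting the field down to bit 0 and then masking
it. Below is a minimal user-space sketch of that pattern. The define values
are copied from the patch; smmu_field() is a hypothetical helper made up for
illustration and is not part of the patch or of arm-smmu.c.

```c
#include <stdint.h>

/* Values copied from the register defines in the patch. */
#define ID0_NUMSIDB_SHIFT	9
#define ID0_NUMSIDB_MASK	0xf
#define ID0_NUMSMRG_SHIFT	0
#define ID0_NUMSMRG_MASK	0xff
#define ID1_NUMS2CB_SHIFT	16
#define ID1_NUMS2CB_MASK	0xff
#define ID1_NUMCB_SHIFT		0
#define ID1_NUMCB_MASK		0xff

/* Hypothetical helper: shift the field down to bit 0, then mask it. */
static inline uint32_t smmu_field(uint32_t reg, unsigned int shift,
				  uint32_t mask)
{
	return (reg >> shift) & mask;
}
```

Something like smmu_field(id1, ID1_NUMCB_SHIFT, ID1_NUMCB_MASK) would then
give the context-bank count from a raw ID1 value, assuming the register
could be read at all (which is exactly what qcom_iommu cannot do for the
global register space).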

Re: [PATCH 5/9] iommu: add qcom_iommu

2017-03-07 Thread Rob Clark
On Tue, Mar 7, 2017 at 12:48 PM, Robin Murphy <robin.mur...@arm.com> wrote:
> On 01/03/17 17:42, Rob Clark wrote:
>> An iommu driver for Qualcomm "B" family devices which do not completely
>> implement the ARM SMMU spec.
>
> Is that actually true, or is it just that it's a compliant SMMU on which
> firmware has set SCR1.GASRAE? (which makes the global address space
> secure-access-only). I don't know which Qualcomm SoCs are the ones
> apparently using a plain ARM MMU-500 IP, but if any of those are also
> running this particular firmware configuration that puts us in a
> somewhat weird situation with respect to drivers :/
>

I can't say for sure, I don't really know exactly what tz is doing.
Although the net effect from linux kernel perspective is that it isn't
really "compliant".  And I think the SMMU_INTR_SEL_NS part (for
controlling routing of cb irqs) is non-standard.

As far as I can tell, if there was firmware that allowed access to the
global address space, I don't think it ever escaped outside of qcom's
labs (ie. might have existed on early versions of chips for new SoC
bring-up.. but I think from upstream perspective we can ignore that).

BR,
-R


[PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-08-03 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that arm-smmu can support.

(I initially added support to arm-smmu, but it was decided that approach
was too intrusive and it would be cleaner to have a separate driver.)

I should note that all the dependencies for this driver have been merged
since 4.12, and it is the last thing needed for having another fully-
enabled (gpu/display/video codec/etc) ARM device that is fully upstream.

One minor change to move a couple #defines and MMU500 bits back to
arm-smmu.c as suggested by Will.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 220 +
 drivers/iommu/arm-smmu.c   | 211 +
 drivers/iommu/qcom_iommu.c | 932 +
 6 files changed, 1293 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.13.0



[PATCH 3/4] iommu: add qcom_iommu

2017-08-03 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1 so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Riku Voipio <riku.voi...@linaro.org>
Tested-by: Archit Taneja <arch...@codeaurora.org>
---
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 868 +
 3 files changed, 879 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index f73ff28f77e2..92f5fd2e0e4b 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b997d8e..b910aea813a1 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index ..860cad1cb167
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,868 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   u8   num_ctxs;
+   struct qcom_iommu_ctx   *ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   bool secure_init;
+   u8   asid;  /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom_iommu_domain {
+   struct io_pgtable_ops   *pgtbl_ops;
+   spinlock_t   pgtbl_lock;
+   struct mutex init_mutex; /* Protects iommu pointer */
+   struct iommu_domain  domain;
+   struct qcom_iommu_dev   *iommu;
+};
+
+static struct qcom_iommu_domain *to_qcom_iommu_domain(struct iommu_domain *dom)
+{
+   return container_of(dom, struct qcom_iommu_domain, domain);
+}
+
+static const struct iommu_ops qcom_iommu_ops;
+
+static struct qcom_iommu_dev * to_iommu(struct iommu_fwspec *fwspec)
+{
+   if (!fwspec || fwspec->ops != _iommu_ops)
+   return NULL;
+   return fwspec->iommu_priv;
+}
+
+static struct qcom_iommu_ctx * to_ctx(struct iommu_fwspec *fwspec, unsigned asid)
+{
+   struct qcom_iommu_dev *qcom_iommu = to_iommu(fwspec);
+   if (!qcom_iommu)
+   return NULL;
+   return qcom_iommu->ctxs[asid - 1];
+}
+
+static inline void
+iommu_writel(struct qcom_iommu_ctx *ctx, unsigned reg, u32 val)
+{
+   writel_relaxed(val, ctx->ba

[PATCH 4/4] iommu: qcom: initialize secure page table

2017-08-03 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Archit Taneja <arch...@codeaurora.org>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 860cad1cb167..48b62aa52787 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -604,6 +604,51 @@ static void qcom_iommu_disable_clocks(struct qcom_iommu_dev *qcom_iommu)
clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+   return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, );
+   if (ret) {
+   dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+   ret);
+   return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, , GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+   dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+   psize);
+   return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+   dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+   goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int get_asid(const struct device_node *np)
 {
u32 reg;
@@ -700,6 +745,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
.remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+   if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+   return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
struct device_node *child;
@@ -744,6 +800,14 @@ static int qcom_iommu_device_probe(struct platform_device *pdev)
return -ENODEV;
}
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+   ret = qcom_iommu_sec_ptbl_init(dev);
+   if (ret) {
+   dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+   return ret;
+   }
+   }
+
platform_set_drvdata(pdev, qcom_iommu);
 
pm_runtime_enable(dev);
-- 
2.13.0
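
The flow in qcom_iommu_sec_ptbl_init() (query the size, allocate, hand the
buffer to the secure world, free only if the hand-over fails, and do all of
this once) can be sketched in plain user-space C as below. This is only a
model: struct fake_tz and the fake_tz_*() functions are made-up stand-ins for
the SCM calls, malloc() stands in for dma_alloc_attrs(), and the "allocated"
flag is passed in rather than being a function-local static so the sketch can
be exercised more than once.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the secure world; not the real SCM API. */
struct fake_tz {
	size_t ptbl_size;	/* size the secure world asks for */
	int init_ret;		/* result the secure world returns from init */
	uintptr_t ptbl_addr;	/* address handed over on success */
	int live_allocs;	/* outstanding allocations made by the caller */
};

static int fake_tz_ptbl_size(const struct fake_tz *tz, size_t *size)
{
	*size = tz->ptbl_size;
	return 0;
}

static int fake_tz_ptbl_init(struct fake_tz *tz, uintptr_t addr, size_t size)
{
	(void)size;
	if (tz->init_ret == 0)
		tz->ptbl_addr = addr;
	return tz->init_ret;
}

/* One-time init: query size, allocate, hand over, free only on failure. */
static int sec_ptbl_init(struct fake_tz *tz, bool *allocated)
{
	size_t psize = 0;
	void *buf;
	int ret;

	if (*allocated)
		return 0;	/* already done by an earlier probe */

	ret = fake_tz_ptbl_size(tz, &psize);
	if (ret)
		return ret;

	buf = malloc(psize);
	if (!buf)
		return -1;
	tz->live_allocs++;

	ret = fake_tz_ptbl_init(tz, (uintptr_t)buf, psize);
	if (ret) {
		free(buf);	/* secure world did not take ownership */
		tz->live_allocs--;
		return ret;
	}

	*allocated = true;	/* buffer now belongs to the secure world */
	return 0;
}
```

Note that on success the buffer is deliberately never freed in this model,
mirroring the patch, where the memory stays with the secure world once the
init call has succeeded.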



[PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-08-03 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Reviewed-by: Rob Herring <r...@kernel.org>
Tested-by: Archit Taneja <arch...@codeaurora.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index ..b2641ceb2b40
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.  Index identifies the context-bank #.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e2 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1e2 0x4>;
+   reg = <0x1ef 0x3000>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_APSS_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   iommu-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   iommu-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x1>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_GFX_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   iommu-ctx@1000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   iommu-ctx@2000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+   ...
+   iommus = <_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+   ...
+   iommus = <_iommu 4>;
+   };
+
+   gpu@01c0 {
+   ...
+   iommus = <_iommu 1>, <_iommu 2>;
+   };
-- 
2.13.0
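
To illustrate how the #iommu-cells index relates to the context-bank
sub-nodes in the examples above: the bank at offset 0x4000 is referenced with
index 4, the one at 0x5000 with index 5, and the GPU banks at 0x1000/0x2000
with 1 and 2. A minimal user-space sketch of that mapping follows. The
0x1000 stride is an assumption read off the examples rather than something
the binding text states, and ctx_model, ctx_index_from_offset() and
ctx_lookup() are hypothetical names used only here. The "index - 1" table
lookup mirrors the "indexed by asid-1" comment in the driver patch.

```c
#include <stddef.h>
#include <stdint.h>

#define CTX_BANK_STRIDE	0x1000u	/* assumed spacing, per the examples */

struct ctx_model {
	uint32_t reg_offset;	/* "reg" base of the iommu-ctx sub-node */
	const char *name;
};

/* Assumed mapping: the bank at offset 0x1000 is index 1, 0x2000 is 2, ... */
static int ctx_index_from_offset(uint32_t reg_offset)
{
	if (reg_offset == 0 || reg_offset % CTX_BANK_STRIDE)
		return -1;
	return (int)(reg_offset / CTX_BANK_STRIDE);
}

/* Look up the bank named by an iommus index; the table is indexed by N - 1. */
static const struct ctx_model *
ctx_lookup(const struct ctx_model *const *table, size_t num,
	   unsigned int index)
{
	if (index == 0 || index > num)
		return NULL;
	return table[index - 1];
}
```

Index 0 is rejected on purpose in this sketch, since with a 1-based index
there is no slot for it in the table.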



[PATCH 2/4] iommu: arm-smmu: split out register defines

2017-08-03 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Archit Taneja <arch...@codeaurora.org>
---
 drivers/iommu/arm-smmu-regs.h | 220 ++
 drivers/iommu/arm-smmu.c  | 211 ++--
 2 files changed, 229 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index ..a1226e4ab5f8
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,220 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE(1 << 3)
+#define sCR0_GCFGFRE   (1 << 4)
+#define sCR0_GCFGFIE   (1 << 5)
+#define sCR0_USFCFG(1 << 10)
+#define sCR0_VMIDPNE   (1 << 11)
+#define sCR0_PTM   (1 << 12)
+#define sCR0_FB(1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS   (1 << 30)
+#define ID0_S2TS   (1 << 29)
+#define ID0_NTS(1 << 28)
+#define ID0_SMS(1 << 27)
+#define ID0_ATOSNS (1 << 26)
+#define ID0_PTFS_NO_AARCH32(1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW   (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT0
+#define ID1_NUMCB_MASK 0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K(1 << 12)
+#define ID2_PTFS_16K   (1 << 13)
+#define ID2_PTFS_64K   (1 << 14)
+#define ID2_VMID16 (1 << 15)
+
+#define ID7_MAJOR_SHIFT4
+#define ID7_MAJOR_MASK 0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE 

Re: [pull] iommu: qcom_iommu for v4.14

2017-08-15 Thread Rob Clark
On Tue, Aug 15, 2017 at 11:42 AM, Joerg Roedel <j...@8bytes.org> wrote:
> On Sun, Aug 13, 2017 at 10:27:36AM -0400, Rob Clark wrote:
>> Hi Joerg,
>>
>> These patches have been on list for quite some time now.  The bindings
>> have r-b from Rob Herring, and the patch touching arm-smmu has Will's
>> a-b.  And all the review comments on the driver patches have been
>> addressed (despite no one stepping up w/ a r-b tag).  Please consider
>> merging this pull request for v4.14
>
> Okay, I applied it, not from this pull-request, but from your last
> post. Some comments for the future below.

Thanks.  Patches from list are the same.

For future reference, do you prefer to pick up patches from the list?
(Dave prefers pull req's for drm, but I guess the volume of patches
for an iommu driver should be lower.)

>> Rob Clark (3):
>>   Docs: dt: document qcom iommu bindings
>
> This patch has no commit-message. Usually that is a reason for me to
> reject the patch. I applied it anyway this time because it has a
> Reviewed-by from a device-tree maintainer, and if it's fine for him, I
> can live with it.
>
> For the other three patches I edited the Subject line to match the
> convention of the iommu-tree, which is "iommu/: _Capital Letter_...
>
> The patches below are now named:
>
>>   iommu: arm-smmu: split out register defines
> iommu/arm-smmu: Split out register defines
>
>>   iommu: add qcom_iommu
> iommu/qcom: Add qcom_iommu
>>
>> Stanimir Varbanov (1):
>>   iommu: qcom: initialize secure page table
> iommu/qcom: Initialize secure page table
>
> Please follow that convention for future patches.
>

will do, thanks

BR,
-R

>
> Thanks,
>
> Joerg
>
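
Judging by the renamed subjects above, the convention appears to be "iommu/"
followed by a driver name, a colon and a space, and then a summary starting
with a capital letter. A rough checker for that shape could look like the
sketch below. subject_follows_convention() is a hypothetical helper written
for this note, not an existing tool, and it only checks the shape, not
whether the driver name is a real one.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* True for subjects shaped like "iommu/<name>: Capitalised summary". */
static bool subject_follows_convention(const char *subject)
{
	static const char prefix[] = "iommu/";
	const char *name, *colon;

	if (strncmp(subject, prefix, sizeof(prefix) - 1) != 0)
		return false;

	name = subject + sizeof(prefix) - 1;
	colon = strchr(name, ':');
	if (!colon || colon == name)
		return false;	/* missing or empty driver name */

	if (colon[1] != ' ')
		return false;	/* expect a single space after the colon */

	return isupper((unsigned char)colon[2]) != 0;
}
```

With this, "iommu/arm-smmu: Split out register defines" passes, while the
original "iommu: arm-smmu: split out register defines" does not.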


[PATCH] MAINTAINERS: Add entry for qcom_iommu

2017-08-16 Thread Rob Clark
Add maintainer entry for qcom_iommu.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 MAINTAINERS | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 44cb004c765d..8e6ab3d2e01f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10941,6 +10941,13 @@ T: git 
git://git.kernel.org/pub/scm/linux/kernel/git/rkuo/linux-hexagon-kernel.g
 S: Supported
 F: arch/hexagon/
 
+QUALCOMM IOMMU
+M:     Rob Clark <robdcl...@gmail.com>
+L: iommu@lists.linux-foundation.org
+L: linux-arm-...@vger.kernel.org
+S: Maintained
+F: drivers/iommu/qcom_iommu.c
+
 QUALCOMM VENUS VIDEO ACCELERATOR DRIVER
 M: Stanimir Varbanov <stanimir.varba...@linaro.org>
 L: linux-me...@vger.kernel.org
-- 
2.13.0



[pull] iommu: qcom_iommu for v4.14

2017-08-13 Thread Rob Clark
Hi Joerg,

These patches have been on list for quite some time now.  The bindings
have r-b from Rob Herring, and the patch touching arm-smmu has Will's
a-b.  And all the review comments on the driver patches have been
addressed (despite no one stepping up w/ a r-b tag).  Please consider
merging this pull request for v4.14

BR,
-R

The following changes since commit 60110f262d408c3155e578359812c80fd1f786f9:

  Merge branches 'iommu/fixes', 'arm/exynos', 'arm/renesas',
'arm/rockchip', 'arm/omap', 'arm/mediatek', 'x86/vt-d' and 'core' into
next (2017-08-11 17:09:23 +0200)

are available in the git repository at:

  git://people.freedesktop.org/~robclark/linux

for you to fetch changes up to 8a4342ede167b8c095961846e908b9ac1c884650:

  iommu: qcom: initialize secure page table (2017-08-12 11:50:35 -0400)


Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 220 +
 drivers/iommu/arm-smmu.c   | 211 +
 drivers/iommu/qcom_iommu.c | 932 +
 6 files changed, 1293 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c


Re: [pull] iommu: qcom_iommu for v4.14

2017-08-13 Thread Rob Clark
On Sun, Aug 13, 2017 at 10:27 AM, Rob Clark <robdcl...@gmail.com> wrote:
> Hi Joerg,
>
> These patches have been on list for quite some time now.  The bindings
> have r-b from Rob Herring, and the patch touching arm-smmu has Will's
> a-b.  And all the review comments on the driver patches have been
> addressed (despite no one stepping up w/ a r-b tag).  Please consider
> merging this pull request for v4.14
>
> BR,
> -R
>
> The following changes since commit 60110f262d408c3155e578359812c80fd1f786f9:
>
>   Merge branches 'iommu/fixes', 'arm/exynos', 'arm/renesas',
> 'arm/rockchip', 'arm/omap', 'arm/mediatek', 'x86/vt-d' and 'core' into
> next (2017-08-11 17:09:23 +0200)
>
> are available in the git repository at:
>
>   git://people.freedesktop.org/~robclark/linux

ugg, something went wrong with my script.. that should be:

  git://people.freedesktop.org/~robclark/linux qcom_iommu-next

BR,
-R

>
> for you to fetch changes up to 8a4342ede167b8c095961846e908b9ac1c884650:
>
>   iommu: qcom: initialize secure page table (2017-08-12 11:50:35 -0400)
>
> 
> Rob Clark (3):
>   Docs: dt: document qcom iommu bindings
>   iommu: arm-smmu: split out register defines
>   iommu: add qcom_iommu
>
> Stanimir Varbanov (1):
>   iommu: qcom: initialize secure page table
>
>  .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
>  drivers/iommu/Kconfig  |  10 +
>  drivers/iommu/Makefile |   1 +
>  drivers/iommu/arm-smmu-regs.h  | 220 +
>  drivers/iommu/arm-smmu.c   | 211 +
>  drivers/iommu/qcom_iommu.c | 932 +
>  6 files changed, 1293 insertions(+), 202 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
>  create mode 100644 drivers/iommu/arm-smmu-regs.h
>  create mode 100644 drivers/iommu/qcom_iommu.c


Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-13 Thread Rob Clark
On Thu, Jul 13, 2017 at 9:53 AM, Sricharan R <sricha...@codeaurora.org> wrote:
> Hi,
>
> On 7/13/2017 5:20 PM, Rob Clark wrote:
>> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R <sricha...@codeaurora.org> wrote:
>>> Hi Vivek,
>>>
>>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>>>> Hi Stephen,
>>>>
>>>>
>>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>>>> On 07/06, Vivek Gautam wrote:
>>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>>>>>>   static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>>>>>>size_t size)
>>>>>>   {
>>>>>> -struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>>>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>>>> +struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>>>> +size_t ret;
>>>>>> if (!ops)
>>>>>>   return 0;
>>>>>>   -return ops->unmap(ops, iova, size);
>>>>>> +pm_runtime_get_sync(smmu_domain->smmu->dev);
>>>>> Can these map/unmap ops be called from an atomic context? I seem
>>>>> to recall that being a problem before.
>>>>
>>>> That's something which was dropped in the following patch merged in master:
>>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>>>
>>>> Looks like we don't  need locks here anymore?
>>>
>>>  Apart from the locking, wonder why an explicit pm_runtime is needed
>>>  from unmap. Somehow looks like some path in the master using that
>>>  should have enabled the pm ?
>>>
>>
>> Yes, there are a bunch of scenarios where unmap can happen with
>> disabled master (but not in atomic context).  On the gpu side we
>> opportunistically keep a buffer mapping until the buffer is freed
>> (which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
>> an exported dmabuf while some other driver holds a reference to it
>> (which can be dropped when the v4l2 device is suspended).
>>
>> Since unmap triggers tlb flush which touches iommu regs, the iommu
>> driver *definitely* needs a pm_runtime_get_sync().
>
>  Ok, with that being the case, there are two things here,
>
>  1) If the device links are still intact at these places where unmap is 
> called,
> then pm_runtime from the master would set up all the clocks. That would
> avoid reintroducing the locking indirectly here.
>
>  2) If not, then doing it here is the only way. But for both cases, since
> the unmap can be called from atomic context, resume handler here should
> avoid doing clk_prepare_enable, and instead move the clk_prepare to the init.
>

I do kinda like the approach Marek suggested.. of deferring the tlb
flush until resume.  I'm wondering if we could combine that with
putting the mmu in a stalled state when we suspend (and not resume the
mmu until after the pending tlb flush)?
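
To make the idea a bit more concrete, here is a rough userspace model of "stall on suspend, replay the pending flush on resume".  It is only a sketch: it is not kernel code, every name in it (model_iommu, model_unmap, etc.) is made up for illustration, and it ignores the locking a real driver would need.

```c
#include <stdbool.h>

/* Hypothetical userspace model of an IOMMU; none of these names are kernel APIs. */
struct model_iommu {
    bool powered;        /* clocks on, registers accessible */
    bool stalled;        /* translation stalled, masters cannot get through */
    bool flush_pending;  /* an unmap happened while powered off */
    unsigned flushes;    /* number of TLB invalidations actually issued */
};

/* Stands in for the register writes that need the clocks enabled. */
static void model_tlb_flush(struct model_iommu *mmu)
{
    mmu->flushes++;
    mmu->flush_pending = false;
}

/* Unmap flushes immediately when powered, otherwise only records the debt. */
static void model_unmap(struct model_iommu *mmu)
{
    if (mmu->powered)
        model_tlb_flush(mmu);
    else
        mmu->flush_pending = true;
}

/* Stall first, then drop power, so nothing translates while we are off. */
static void model_suspend(struct model_iommu *mmu)
{
    mmu->stalled = true;
    mmu->powered = false;
}

/* Pay the flush debt before un-stalling, so no stale entry is ever used. */
static void model_resume(struct model_iommu *mmu)
{
    mmu->powered = true;
    if (mmu->flush_pending)
        model_tlb_flush(mmu);
    mmu->stalled = false;
}
```

The ordering in model_resume() is the point: the stall is only lifted after the deferred flush has been issued, so several unmaps while suspended collapse into a single flush on resume.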

BR,
-R


Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-13 Thread Rob Clark
On Thu, Jul 13, 2017 at 5:50 AM, Robin Murphy  wrote:
> On 13/07/17 07:48, Stephen Boyd wrote:
>> On 07/13, Vivek Gautam wrote:
>>> Hi Stephen,
>>>
>>>
>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>>> On 07/06, Vivek Gautam wrote:
>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>>>>>  static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>>>>>                               size_t size)
>>>>>  {
>>>>> -  struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>>> +  struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>>> +  struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>>> +  size_t ret;
>>>>>    if (!ops)
>>>>>        return 0;
>>>>> -  return ops->unmap(ops, iova, size);
>>>>> +  pm_runtime_get_sync(smmu_domain->smmu->dev);
>>>> Can these map/unmap ops be called from an atomic context? I seem
>>>> to recall that being a problem before.
>>>
>>> That's something which was dropped in the following patch merged in master:
>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>>
>>> Looks like we don't  need locks here anymore?
>>>
>>
>> While removing the spinlock around the map/unmap path may be one
>> thing, I'm not sure that's all of them. Is there a path from an
>> atomic DMA allocation (GFP_ATOMIC sort of thing) mapped into an
>> IOMMU for a device that can eventually get down to here and
>> attempt to turn a clk on?
>
> Yes, in the DMA path map/unmap will frequently be called from IRQ
> handlers (think e.g. network packets). The whole point of removing the
> lock was to allow multiple maps/unmaps to execute in parallel (since we
> know they will be safely operating on different areas of the pagetable).
> AFAICS this change is going to largely reintroduce that bottleneck via
> dev->power_lock, which is anything but what we want :(
>

Maybe __pm_runtime_resume() needs some sort of fast-path if already
enabled?  Or otherwise we need some sort of flag to tell the iommu
that it cannot rely on the unmapping device to be resumed?
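
Roughly what I mean by a fast-path, as a userspace sketch using C11 atomics.  This is an illustration under assumptions, not how the runtime-PM core is implemented: model_pm and both helpers are hypothetical names, and the slow (locked) resume path is left out entirely.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a runtime-PM usage count; not a kernel structure. */
struct model_pm {
    atomic_int usage;   /* > 0 means the device is already held active */
};

/*
 * Lock-free fast path: take a reference only if the device is already
 * active.  Returns false when the caller has to fall back to the slow,
 * locked resume path.
 */
static bool model_pm_get_if_active(struct model_pm *pm)
{
    int cur = atomic_load(&pm->usage);

    while (cur > 0) {
        /* On failure cur is reloaded with the current value and we retry. */
        if (atomic_compare_exchange_weak(&pm->usage, &cur, cur + 1))
            return true;
    }
    return false;
}

/* Drop a reference taken by model_pm_get_if_active(). */
static void model_pm_put(struct model_pm *pm)
{
    atomic_fetch_sub(&pm->usage, 1);
}
```

With something along these lines, parallel map/unmap callers on an already-active iommu would only contend on one atomic counter rather than serialize on a lock; whether that is cheap enough for the networking-style hot path is a separate question.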

BR,
-R


[RESEND PATCH 3/4] iommu: add qcom_iommu

2017-07-13 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that is compatible with how the arm-smmu
driver is designed.  It seems SMMU_SCR1.GASRAE=1 so the global register
space is not accessible.  This means it needs to get configuration from
devicetree instead of setting it up dynamically.

In the end, other than register definitions, there is not much code to
share with arm-smmu (other than what has already been refactored out
into the pgtable helpers).

Signed-off-by: Rob Clark <robdcl...@gmail.com>
Tested-by: Riku Voipio <riku.voi...@linaro.org>
---
v1: original
v2: bindings cleanups and kconfig issues that kbuild robot pointed out
v3: fix issues pointed out by Rob H. and actually make device removal
work
v4: fix WARN_ON() splats reported by Archit
v5: some fixes to build as a module.. note that it cannot actually
be built as a module yet (at minimum a bunch of other iommu syms
that are needed are not exported, but there may be more to it
than that), but at least qcom_iommu is ready should it become
possible to build iommu drivers as modules.
v6: Add additional pm-runtime get/puts around paths that can hit
TLB inv, to avoid unclocked register access if device using the
iommu is not powered on.  And pre-emptively clear interrupts
before registering IRQ handler just in case the bootloader has
left us a surprise.
v7: Address review comments from Robin (don't associate iommu_group
with context bank, table lookup instead of list to find context
bank, etc)
v8: Fix silly bug on detach.  Actually Robin already pointed it out
but I somehow overlooked that comment when preparing v7.

 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/qcom_iommu.c | 868 +
 3 files changed, 879 insertions(+)
 create mode 100644 drivers/iommu/qcom_iommu.c

diff --git a/drivers/iommu/Kconfig b/drivers/iommu/Kconfig
index 6ee3a25ae731..aa4b62893fe1 100644
--- a/drivers/iommu/Kconfig
+++ b/drivers/iommu/Kconfig
@@ -367,4 +367,14 @@ config MTK_IOMMU_V1
 
  if unsure, say N here.
 
+config QCOM_IOMMU
+   # Note: iommu drivers cannot (yet?) be built as modules
+   bool "Qualcomm IOMMU Support"
+   depends on ARCH_QCOM || COMPILE_TEST
+   select IOMMU_API
+   select IOMMU_IO_PGTABLE_LPAE
+   select ARM_DMA_USE_IOMMU
+   help
+ Support for IOMMU on certain Qualcomm SoCs.
+
 endif # IOMMU_SUPPORT
diff --git a/drivers/iommu/Makefile b/drivers/iommu/Makefile
index 195f7b997d8e..b910aea813a1 100644
--- a/drivers/iommu/Makefile
+++ b/drivers/iommu/Makefile
@@ -27,3 +27,4 @@ obj-$(CONFIG_TEGRA_IOMMU_SMMU) += tegra-smmu.o
 obj-$(CONFIG_EXYNOS_IOMMU) += exynos-iommu.o
 obj-$(CONFIG_FSL_PAMU) += fsl_pamu.o fsl_pamu_domain.o
 obj-$(CONFIG_S390_IOMMU) += s390-iommu.o
+obj-$(CONFIG_QCOM_IOMMU) += qcom_iommu.o
diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
new file mode 100644
index ..860cad1cb167
--- /dev/null
+++ b/drivers/iommu/qcom_iommu.c
@@ -0,0 +1,868 @@
+/*
+ * IOMMU API for QCOM secure IOMMUs.  Somewhat based on arm-smmu.c
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Copyright (C) 2013 ARM Limited
+ * Copyright (C) 2017 Red Hat
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "io-pgtable.h"
+#include "arm-smmu-regs.h"
+
+#define SMMU_INTR_SEL_NS 0x2000
+
+struct qcom_iommu_ctx;
+
+struct qcom_iommu_dev {
+   /* IOMMU core code handle */
+   struct iommu_device  iommu;
+   struct device   *dev;
+   struct clk  *iface_clk;
+   struct clk  *bus_clk;
+   void __iomem*local_base;
+   u32  sec_id;
+   u8   num_ctxs;
+   struct qcom_iommu_ctx   *ctxs[0];   /* indexed by asid-1 */
+};
+
+struct qcom_iommu_ctx {
+   struct device   *dev;
+   void __iomem*base;
+   bool secure_init;
+   u8   asid;  /* asid and ctx bank # are 1:1 */
+};
+
+struct qcom

[RESEND PATCH 2/4] iommu: arm-smmu: split out register defines

2017-07-13 Thread Rob Clark
I want to re-use some of these for qcom_iommu, which has (roughly) the
same context-bank registers.

Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/arm-smmu-regs.h | 227 ++
 drivers/iommu/arm-smmu.c  | 203 +
 2 files changed, 228 insertions(+), 202 deletions(-)
 create mode 100644 drivers/iommu/arm-smmu-regs.h

diff --git a/drivers/iommu/arm-smmu-regs.h b/drivers/iommu/arm-smmu-regs.h
new file mode 100644
index ..87589c863068
--- /dev/null
+++ b/drivers/iommu/arm-smmu-regs.h
@@ -0,0 +1,227 @@
+/*
+ * IOMMU API for ARM architected SMMU implementations.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Copyright (C) 2013 ARM Limited
+ *
+ * Author: Will Deacon <will.dea...@arm.com>
+ */
+
+#ifndef _ARM_SMMU_REGS_H
+#define _ARM_SMMU_REGS_H
+
+/* Configuration registers */
+#define ARM_SMMU_GR0_sCR0  0x0
+#define sCR0_CLIENTPD  (1 << 0)
+#define sCR0_GFRE  (1 << 1)
+#define sCR0_GFIE  (1 << 2)
+#define sCR0_EXIDENABLE(1 << 3)
+#define sCR0_GCFGFRE   (1 << 4)
+#define sCR0_GCFGFIE   (1 << 5)
+#define sCR0_USFCFG(1 << 10)
+#define sCR0_VMIDPNE   (1 << 11)
+#define sCR0_PTM   (1 << 12)
+#define sCR0_FB(1 << 13)
+#define sCR0_VMID16EN  (1 << 31)
+#define sCR0_BSU_SHIFT 14
+#define sCR0_BSU_MASK  0x3
+
+/* Auxiliary Configuration register */
+#define ARM_SMMU_GR0_sACR  0x10
+
+/* Identification registers */
+#define ARM_SMMU_GR0_ID0   0x20
+#define ARM_SMMU_GR0_ID1   0x24
+#define ARM_SMMU_GR0_ID2   0x28
+#define ARM_SMMU_GR0_ID3   0x2c
+#define ARM_SMMU_GR0_ID4   0x30
+#define ARM_SMMU_GR0_ID5   0x34
+#define ARM_SMMU_GR0_ID6   0x38
+#define ARM_SMMU_GR0_ID7   0x3c
+#define ARM_SMMU_GR0_sGFSR 0x48
+#define ARM_SMMU_GR0_sGFSYNR0  0x50
+#define ARM_SMMU_GR0_sGFSYNR1  0x54
+#define ARM_SMMU_GR0_sGFSYNR2  0x58
+
+#define ID0_S1TS   (1 << 30)
+#define ID0_S2TS   (1 << 29)
+#define ID0_NTS(1 << 28)
+#define ID0_SMS(1 << 27)
+#define ID0_ATOSNS (1 << 26)
+#define ID0_PTFS_NO_AARCH32(1 << 25)
+#define ID0_PTFS_NO_AARCH32S   (1 << 24)
+#define ID0_CTTW   (1 << 14)
+#define ID0_NUMIRPT_SHIFT  16
+#define ID0_NUMIRPT_MASK   0xff
+#define ID0_NUMSIDB_SHIFT  9
+#define ID0_NUMSIDB_MASK   0xf
+#define ID0_EXIDS  (1 << 8)
+#define ID0_NUMSMRG_SHIFT  0
+#define ID0_NUMSMRG_MASK   0xff
+
+#define ID1_PAGESIZE   (1 << 31)
+#define ID1_NUMPAGENDXB_SHIFT  28
+#define ID1_NUMPAGENDXB_MASK   7
+#define ID1_NUMS2CB_SHIFT  16
+#define ID1_NUMS2CB_MASK   0xff
+#define ID1_NUMCB_SHIFT0
+#define ID1_NUMCB_MASK 0xff
+
+#define ID2_OAS_SHIFT  4
+#define ID2_OAS_MASK   0xf
+#define ID2_IAS_SHIFT  0
+#define ID2_IAS_MASK   0xf
+#define ID2_UBS_SHIFT  8
+#define ID2_UBS_MASK   0xf
+#define ID2_PTFS_4K(1 << 12)
+#define ID2_PTFS_16K   (1 << 13)
+#define ID2_PTFS_64K   (1 << 14)
+#define ID2_VMID16 (1 << 15)
+
+#define ID7_MAJOR_SHIFT4
+#define ID7_MAJOR_MASK 0xf
+
+/* Global TLB invalidation */
+#define ARM_SMMU_GR0_TLBIVMID  0x64
+#define ARM_SMMU_GR0_TLBIALLNSNH   0x68
+#define ARM_SMMU_GR0_TLBIALLH  0x6c
+#define ARM_SMMU_GR0_sTLBGSYNC 0x70
+#define ARM_SMMU_GR0_sTLBGSTATUS   0x74
+#define sTLBGSTATUS_GSACTIVE   (1 << 0)
+#define TLB_LOOP_TIMEOUT   10

[RESEND PATCH 1/4] Docs: dt: document qcom iommu bindings

2017-07-13 Thread Rob Clark
Cc: devicet...@vger.kernel.org
Signed-off-by: Rob Clark <robdcl...@gmail.com>
Reviewed-by: Rob Herring <r...@kernel.org>
---
 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +
 1 file changed, 121 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt

diff --git a/Documentation/devicetree/bindings/iommu/qcom,iommu.txt 
b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
new file mode 100644
index ..b2641ceb2b40
--- /dev/null
+++ b/Documentation/devicetree/bindings/iommu/qcom,iommu.txt
@@ -0,0 +1,121 @@
+* QCOM IOMMU v1 Implementation
+
+Qualcomm "B" family devices which are not compatible with arm-smmu have
+a similar looking IOMMU but without access to the global register space,
+and optionally requiring additional configuration to route context irqs
+to non-secure vs secure interrupt line.
+
+** Required properties:
+
+- compatible   : Should be one of:
+
+"qcom,msm8916-iommu"
+
+ Followed by "qcom,msm-iommu-v1".
+
+- clock-names  : Should be a pair of "iface" (required for IOMMUs
+ register group access) and "bus" (required for
+ the IOMMUs underlying bus access).
+
+- clocks   : Phandles for respective clocks described by
+ clock-names.
+
+- #address-cells   : must be 1.
+
+- #size-cells  : must be 1.
+
+- #iommu-cells : Must be 1.  Index identifies the context-bank #.
+
+- ranges   : Base address and size of the iommu context banks.
+
+- qcom,iommu-secure-id  : secure-id.
+
+- List of sub-nodes, one per translation context bank.  Each sub-node
+  has the following required properties:
+
+  - compatible : Should be one of:
+- "qcom,msm-iommu-v1-ns"  : non-secure context bank
+- "qcom,msm-iommu-v1-sec" : secure context bank
+  - reg: Base address and size of context bank within the iommu
+  - interrupts : The context fault irq.
+
+** Optional properties:
+
+- reg  : Base address and size of the SMMU local base, should
+ be only specified if the iommu requires configuration
+ for routing of context bank irq's to secure vs non-
+ secure lines.  (Ie. if the iommu contains secure
+ context banks)
+
+
+** Examples:
+
+   apps_iommu: iommu@1e2 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1e2 0x4>;
+   reg = <0x1ef 0x3000>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_APSS_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <17>;
+
+   // mdp_0:
+   iommu-ctx@4000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x4000 0x1000>;
+   interrupts = ;
+   };
+
+   // venus_ns:
+   iommu-ctx@5000 {
+   compatible = "qcom,msm-iommu-v1-sec";
+   reg = <0x5000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   gpu_iommu: iommu@1f08000 {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   #iommu-cells = <1>;
+   compatible = "qcom,msm8916-iommu", "qcom,msm-iommu-v1";
+   ranges = <0 0x1f08000 0x1>;
+   clocks = < GCC_SMMU_CFG_CLK>,
+< GCC_GFX_TCU_CLK>;
+   clock-names = "iface", "bus";
+   qcom,iommu-secure-id = <18>;
+
+   // gfx3d_user:
+   iommu-ctx@1000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x1000 0x1000>;
+   interrupts = ;
+   };
+
+   // gfx3d_priv:
+   iommu-ctx@2000 {
+   compatible = "qcom,msm-iommu-v1-ns";
+   reg = <0x2000 0x1000>;
+   interrupts = ;
+   };
+   };
+
+   ...
+
+   venus: video-codec@1d0 {
+   ...
+   iommus = <&apps_iommu 5>;
+   };
+
+   mdp: mdp@1a01000 {
+   ...
+   iommus = <&apps_iommu 4>;
+   };
+
+   gpu@01c0 {
+   ...
+   iommus = <&gpu_iommu 1>, <&gpu_iommu 2>;
+   };
-- 
2.13.0
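
For reference, the relationship between a sub-node's reg offset and the index used in "iommus" can be sketched in plain C.  This is an assumption read off the example above (the bank at offset 0x4000 is referenced as index 4, and each bank is 0x1000 in size), not a statement of what the driver does; all names are hypothetical, and treating index 0 as invalid is likewise an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption taken from the reg sizes in the binding example. */
#define MODEL_CTX_BANK_SIZE 0x1000u

/* Hypothetical stand-in for a per-context-bank structure. */
struct model_ctx {
    uint32_t reg_offset;
};

/* Context-bank index for a sub-node, derived from its reg offset. */
static unsigned model_ctx_index(uint32_t reg_offset)
{
    return reg_offset / MODEL_CTX_BANK_SIZE;
}

/*
 * Table lookup instead of a list walk: slot (index - 1) holds bank
 * `index`.  Index 0 and out-of-range indices yield NULL.
 */
static struct model_ctx *model_find_ctx(struct model_ctx **table, size_t n,
                                        unsigned index)
{
    if (index == 0 || index > n)
        return NULL;
    return table[index - 1];
}
```

So a client with "iommus = <... 4>" would resolve to the sub-node whose reg starts at 0x4000 in this model.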



[RESEND PATCH 4/4] iommu: qcom: initialize secure page table

2017-07-13 Thread Rob Clark
From: Stanimir Varbanov <stanimir.varba...@linaro.org>

This basically gets the secure page table size, allocates memory for
secure pagetables and passes the physical address to the trusted zone.

Signed-off-by: Stanimir Varbanov <stanimir.varba...@linaro.org>
Signed-off-by: Rob Clark <robdcl...@gmail.com>
---
 drivers/iommu/qcom_iommu.c | 64 ++
 1 file changed, 64 insertions(+)

diff --git a/drivers/iommu/qcom_iommu.c b/drivers/iommu/qcom_iommu.c
index 860cad1cb167..48b62aa52787 100644
--- a/drivers/iommu/qcom_iommu.c
+++ b/drivers/iommu/qcom_iommu.c
@@ -604,6 +604,51 @@ static void qcom_iommu_disable_clocks(struct 
qcom_iommu_dev *qcom_iommu)
clk_disable_unprepare(qcom_iommu->iface_clk);
 }
 
+static int qcom_iommu_sec_ptbl_init(struct device *dev)
+{
+   size_t psize = 0;
+   unsigned int spare = 0;
+   void *cpu_addr;
+   dma_addr_t paddr;
+   unsigned long attrs;
+   static bool allocated = false;
+   int ret;
+
+   if (allocated)
+   return 0;
+
+   ret = qcom_scm_iommu_secure_ptbl_size(spare, &psize);
+   if (ret) {
+   dev_err(dev, "failed to get iommu secure pgtable size (%d)\n",
+   ret);
+   return ret;
+   }
+
+   dev_info(dev, "iommu sec: pgtable size: %zu\n", psize);
+
+   attrs = DMA_ATTR_NO_KERNEL_MAPPING;
+
+   cpu_addr = dma_alloc_attrs(dev, psize, &paddr, GFP_KERNEL, attrs);
+   if (!cpu_addr) {
+   dev_err(dev, "failed to allocate %zu bytes for pgtable\n",
+   psize);
+   return -ENOMEM;
+   }
+
+   ret = qcom_scm_iommu_secure_ptbl_init(paddr, psize, spare);
+   if (ret) {
+   dev_err(dev, "failed to init iommu pgtable (%d)\n", ret);
+   goto free_mem;
+   }
+
+   allocated = true;
+   return 0;
+
+free_mem:
+   dma_free_attrs(dev, psize, cpu_addr, paddr, attrs);
+   return ret;
+}
+
 static int get_asid(const struct device_node *np)
 {
u32 reg;
@@ -700,6 +745,17 @@ static struct platform_driver qcom_iommu_ctx_driver = {
.remove = qcom_iommu_ctx_remove,
 };
 
+static bool qcom_iommu_has_secure_context(struct qcom_iommu_dev *qcom_iommu)
+{
+   struct device_node *child;
+
+   for_each_child_of_node(qcom_iommu->dev->of_node, child)
+   if (of_device_is_compatible(child, "qcom,msm-iommu-v1-sec"))
+   return true;
+
+   return false;
+}
+
 static int qcom_iommu_device_probe(struct platform_device *pdev)
 {
struct device_node *child;
@@ -744,6 +800,14 @@ static int qcom_iommu_device_probe(struct platform_device 
*pdev)
return -ENODEV;
}
 
+   if (qcom_iommu_has_secure_context(qcom_iommu)) {
+   ret = qcom_iommu_sec_ptbl_init(dev);
+   if (ret) {
+   dev_err(dev, "cannot init secure pg table(%d)\n", ret);
+   return ret;
+   }
+   }
+
platform_set_drvdata(pdev, qcom_iommu);
 
pm_runtime_enable(dev);
-- 
2.13.0



[RESEND PATCH 0/4] iommu: add qcom_iommu for early "B" family devices

2017-07-13 Thread Rob Clark
An iommu driver for Qualcomm "B" family devices which do implement the
ARM SMMU spec, but not in a way that arm-smmu can support.

(I initially added support to arm-smmu, but it was decided that approach
was too intrusive and it would be cleaner to have a separate driver.)

I should note that all the dependencies for this driver have been merged
since 4.12, and it is the last thing needed for having another fully-
enabled (gpu/display/video codec/etc) ARM device that is fully upstream.

Rob Clark (3):
  Docs: dt: document qcom iommu bindings
  iommu: arm-smmu: split out register defines
  iommu: add qcom_iommu

Stanimir Varbanov (1):
  iommu: qcom: initialize secure page table

 .../devicetree/bindings/iommu/qcom,iommu.txt   | 121 +++
 drivers/iommu/Kconfig  |  10 +
 drivers/iommu/Makefile |   1 +
 drivers/iommu/arm-smmu-regs.h  | 227 +
 drivers/iommu/arm-smmu.c   | 203 +
 drivers/iommu/qcom_iommu.c | 932 +
 6 files changed, 1292 insertions(+), 202 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/iommu/qcom,iommu.txt
 create mode 100644 drivers/iommu/arm-smmu-regs.h
 create mode 100644 drivers/iommu/qcom_iommu.c

-- 
2.13.0



Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-13 Thread Rob Clark
On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R  wrote:
> Hi Vivek,
>
> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>> Hi Stephen,
>>
>>
>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>> On 07/06, Vivek Gautam wrote:
>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain *domain, unsigned long iova,
>>>>   static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned long iova,
>>>>                                size_t size)
>>>>   {
>>>> -    struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>> +    struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>> +    struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>> +    size_t ret;
>>>>       if (!ops)
>>>>           return 0;
>>>> -    return ops->unmap(ops, iova, size);
>>>> +    pm_runtime_get_sync(smmu_domain->smmu->dev);
>>> Can these map/unmap ops be called from an atomic context? I seem
>>> to recall that being a problem before.
>>
>> That's something which was dropped in the following patch merged in master:
>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>
>> Looks like we don't  need locks here anymore?
>
>  Apart from the locking, wonder why an explicit pm_runtime is needed
>  from unmap. Somehow looks like some path in the master using that
>  should have enabled the pm ?
>

Yes, there are a bunch of scenarios where unmap can happen with
disabled master (but not in atomic context).  On the gpu side we
opportunistically keep a buffer mapping until the buffer is freed
(which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
an exported dmabuf while some other driver holds a reference to it
(which can be dropped when the v4l2 device is suspended).

Since unmap triggers tlb flush which touches iommu regs, the iommu
driver *definitely* needs a pm_runtime_get_sync().
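
As a toy illustration of why the get/put has to bracket the flush, here is a userspace model.  It is not the arm-smmu code and none of these names are kernel APIs; the "usage" counter only stands in for a runtime-PM reference, and the model is single-threaded.

```c
#include <stdbool.h>

/* Hypothetical userspace model; these are not the kernel's pm_runtime calls. */
struct model_smmu {
    int usage;                  /* runtime-PM style reference count */
    bool clocked;               /* true while register access is safe */
    unsigned reg_writes;        /* all modelled register writes */
    unsigned unclocked_writes;  /* writes issued while clocks were off */
};

static void model_get_sync(struct model_smmu *smmu)
{
    if (smmu->usage++ == 0)
        smmu->clocked = true;   /* first user powers the block up */
}

static void model_put(struct model_smmu *smmu)
{
    if (--smmu->usage == 0)
        smmu->clocked = false;  /* last user lets it power down */
}

/* Stands in for the TLB invalidate register write. */
static void model_tlb_inv(struct model_smmu *smmu)
{
    if (!smmu->clocked)
        smmu->unclocked_writes++;
    smmu->reg_writes++;
}

/* Unmap with the master disabled: the iommu takes its own reference. */
static void model_unmap(struct model_smmu *smmu)
{
    model_get_sync(smmu);
    model_tlb_inv(smmu);
    model_put(smmu);
}
```

In this model an unmap with usage == 0 (gpu already off, buffer freed late) still counts zero unclocked writes, whereas calling model_tlb_inv() directly in that state would count one.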

BR,
-R


Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-13 Thread Rob Clark
On Thu, Jul 13, 2017 at 8:02 AM, Marek Szyprowski
<m.szyprow...@samsung.com> wrote:
> Hi All,
>
> On 2017-07-13 13:50, Rob Clark wrote:
>>
>> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R <sricha...@codeaurora.org>
>> wrote:
>>>
>>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>>>>
>>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>>>>>
>>>>> On 07/06, Vivek Gautam wrote:
>>>>>>
>>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct iommu_domain
>>>>>> *domain, unsigned long iova,
>>>>>>static size_t arm_smmu_unmap(struct iommu_domain *domain, unsigned
>>>>>> long iova,
>>>>>> size_t size)
>>>>>>{
>>>>>> -struct io_pgtable_ops *ops = to_smmu_domain(domain)->pgtbl_ops;
>>>>>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>>>>>> +struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>>>>>> +size_t ret;
>>>>>>  if (!ops)
>>>>>>return 0;
>>>>>>-return ops->unmap(ops, iova, size);
>>>>>> +pm_runtime_get_sync(smmu_domain->smmu->dev);
>>>>>
>>>>> Can these map/unmap ops be called from an atomic context? I seem
>>>>> to recall that being a problem before.
>>>>
>>>> That's something which was dropped in the following patch merged in
>>>> master:
>>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>>>>
>>>> Looks like we don't  need locks here anymore?
>>>
>>>   Apart from the locking, wonder why an explicit pm_runtime is needed
>>>   from unmap. Somehow looks like some path in the master using that
>>>   should have enabled the pm ?
>>>
>> Yes, there are a bunch of scenarios where unmap can happen with
>> disabled master (but not in atomic context).  On the gpu side we
>> opportunistically keep a buffer mapping until the buffer is freed
>> (which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
>> an exported dmabuf while some other driver holds a reference to it
>> (which can be dropped when the v4l2 device is suspended).
>>
>> Since unmap triggers tlb flush which touches iommu regs, the iommu
>> driver *definitely* needs a pm_runtime_get_sync().
>
>
> Afair unmap might be called from atomic context as well, for example as
> a result of dma_unmap_page(). In exynos IOMMU I simply check the runtime
> PM state of IOMMU device. TLB flush is performed only when IOMMU is in
> active
> state. If it is suspended, I assume that the IOMMU controller's context
> is already lost and its respective power domain might be already turned off,
> so there is no point in touching IOMMU registers.
>

that seems like an interesting approach.. although I wonder if there
can be some race w/ new device memory access once clks are enabled
before tlb flush completes?  That would be rather bad, since this
approach is letting the backing pages of memory be freed before tlb
flush.

BR,
-R


Re: [PATCH V4 3/6] iommu/arm-smmu: Invoke pm_runtime during probe, add/remove device

2017-07-14 Thread Rob Clark
On Fri, Jul 14, 2017 at 2:06 PM, Will Deacon <will.dea...@arm.com> wrote:
> On Fri, Jul 14, 2017 at 01:42:13PM -0400, Rob Clark wrote:
>> On Fri, Jul 14, 2017 at 1:07 PM, Will Deacon <will.dea...@arm.com> wrote:
>> > On Thu, Jul 13, 2017 at 10:55:10AM -0400, Rob Clark wrote:
>> >> On Thu, Jul 13, 2017 at 9:53 AM, Sricharan R <sricha...@codeaurora.org> 
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > On 7/13/2017 5:20 PM, Rob Clark wrote:
>> >> >> On Thu, Jul 13, 2017 at 1:35 AM, Sricharan R 
>> >> >> <sricha...@codeaurora.org> wrote:
>> >> >>> Hi Vivek,
>> >> >>>
>> >> >>> On 7/13/2017 10:43 AM, Vivek Gautam wrote:
>> >> >>>> Hi Stephen,
>> >> >>>>
>> >> >>>>
>> >> >>>> On 07/13/2017 04:24 AM, Stephen Boyd wrote:
>> >> >>>>> On 07/06, Vivek Gautam wrote:
>> >> >>>>>> @@ -1231,12 +1237,18 @@ static int arm_smmu_map(struct 
>> >> >>>>>> iommu_domain *domain, unsigned long iova,
>> >> >>>>>>   static size_t arm_smmu_unmap(struct iommu_domain *domain, 
>> >> >>>>>> unsigned long iova,
>> >> >>>>>>size_t size)
>> >> >>>>>>   {
>> >> >>>>>> -struct io_pgtable_ops *ops = 
>> >> >>>>>> to_smmu_domain(domain)->pgtbl_ops;
>> >> >>>>>> +struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain);
>> >> >>>>>> +struct io_pgtable_ops *ops = smmu_domain->pgtbl_ops;
>> >> >>>>>> +size_t ret;
>> >> >>>>>> if (!ops)
>> >> >>>>>>   return 0;
>> >> >>>>>>   -return ops->unmap(ops, iova, size);
>> >> >>>>>> +pm_runtime_get_sync(smmu_domain->smmu->dev);
>> >> >>>>> Can these map/unmap ops be called from an atomic context? I seem
>> >> >>>>> to recall that being a problem before.
>> >> >>>>
>> >> >>>> That's something which was dropped in the following patch merged in 
>> >> >>>> master:
>> >> >>>> 523d7423e21b iommu/arm-smmu: Remove io-pgtable spinlock
>> >> >>>>
>> >> >>>> Looks like we don't  need locks here anymore?
>> >> >>>
>> >> >>>  Apart from the locking, wonder why an explicit pm_runtime is needed
>> >> >>>  from unmap. Somehow looks like some path in the master using that
>> >> >>>  should have enabled the pm ?
>> >> >>>
>> >> >>
>> >> >> Yes, there are a bunch of scenarios where unmap can happen with
>> >> >> disabled master (but not in atomic context).  On the gpu side we
>> >> >> opportunistically keep a buffer mapping until the buffer is freed
>> >> >> (which can happen after gpu is disabled).  Likewise, v4l2 won't unmap
>> >> >> an exported dmabuf while some other driver holds a reference to it
>> >> >> (which can be dropped when the v4l2 device is suspended).
>> >> >>
>> >> >> Since unmap triggers tlb flush which touches iommu regs, the iommu
>> >> >> driver *definitely* needs a pm_runtime_get_sync().
>> >> >
>> >> >  Ok, with that being the case, there are two things here,
>> >> >
>> >> >  1) If the device links are still intact at these places where unmap is 
>> >> > called,
>> >> > then pm_runtime from the master would set up all the clocks. 
>> >> > That would
>> >> > avoid reintroducing the locking indirectly here.
>> >> >
>> >> >  2) If not, then doing it here is the only way. But for both cases, 
>> >> > since
>> >> > the unmap can be called from atomic context, resume handler here 
>> >> > should
>> >> > avoid doing clk_prepare_enable , instead move the clk_prepare to 
>> >> > the init.
>> >> >
>> >>
>> >> I do kinda like the approach Marek suggested.. of deferring the tlb
>> >> flush until resume.  I'm 
