Re: [PATCH v6 03/10] iommu/mediatek: Use a u32 flags to describe different HW features

2020-07-03 Thread Yingjoe Chen
On Fri, 2020-07-03 at 12:41 +0800, Chao Hao wrote:
> Given the fact that we are adding more and more plat_data bool values,
> it would make sense to use a u32 flags register and add the appropriate
> macro definitions to set and check for a flag present.
> No functional change.
> 
> Cc: Yong Wu 
> Suggested-by: Matthias Brugger 
> Signed-off-by: Chao Hao 
> Reviewed-by: Matthias Brugger 
> ---
>  drivers/iommu/mtk_iommu.c | 28 +---
>  drivers/iommu/mtk_iommu.h |  7 +--
>  2 files changed, 18 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 88d3df5b91c2..40ca564d97af 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -100,6 +100,15 @@
>  #define MTK_M4U_TO_LARB(id)  (((id) >> 5) & 0xf)
>  #define MTK_M4U_TO_PORT(id)  ((id) & 0x1f)
>  
> +#define HAS_4GB_MODE BIT(0)
> +/* HW will use the EMI clock if there isn't the "bclk". */
> +#define HAS_BCLK BIT(1)
> +#define HAS_VLD_PA_RNG   BIT(2)
> +#define RESET_AXIBIT(3)
> +
> +#define MTK_IOMMU_HAS_FLAG(pdata, _x) \
> + pdata)->flags) & (_x)) == (_x))
> +
>  struct mtk_iommu_domain {
>   struct io_pgtable_cfg   cfg;
>   struct io_pgtable_ops   *iop;
> @@ -563,7 +572,8 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
> *data)
>upper_32_bits(data->protect_base);
>   writel_relaxed(regval, data->base + REG_MMU_IVRP_PADDR);
>  
> - if (data->enable_4GB && data->plat_data->has_vld_pa_rng) {
> + if (data->enable_4GB &&
> + MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_VLD_PA_RNG)) {
>   /*
>* If 4GB mode is enabled, the validate PA range is from
>* 0x1__ to 0x1__. here record bit[32:30].
> @@ -573,7 +583,7 @@ static int mtk_iommu_hw_init(const struct mtk_iommu_data 
> *data)
>   }
>   writel_relaxed(0, data->base + REG_MMU_DCM_DIS);
>  
> - if (data->plat_data->reset_axi) {
> + if (MTK_IOMMU_HAS_FLAG(data->plat_data, RESET_AXI)) {
>   /* The register is called STANDARD_AXI_MODE in this case */
>   writel_relaxed(0, data->base + REG_MMU_MISC_CTRL);
>   }
> @@ -618,7 +628,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>  
>   /* Whether the current dram is over 4GB */
>   data->enable_4GB = !!(max_pfn > (BIT_ULL(32) >> PAGE_SHIFT));
> - if (!data->plat_data->has_4gb_mode)
> + if (!MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_4GB_MODE))
>   data->enable_4GB = false;
>  
>   res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
> @@ -631,7 +641,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
>   if (data->irq < 0)
>   return data->irq;
>  
> - if (data->plat_data->has_bclk) {
> + if (MTK_IOMMU_HAS_FLAG(data->plat_data, HAS_BCLK)) {
>   data->bclk = devm_clk_get(dev, "bclk");
>   if (IS_ERR(data->bclk))
>   return PTR_ERR(data->bclk);
> @@ -763,23 +773,19 @@ static const struct dev_pm_ops mtk_iommu_pm_ops = {
>  
>  static const struct mtk_iommu_plat_data mt2712_data = {
>   .m4u_plat = M4U_MT2712,
> - .has_4gb_mode = true,
> - .has_bclk = true,
> - .has_vld_pa_rng   = true,
> + .flags= HAS_4GB_MODE | HAS_BCLK | HAS_VLD_PA_RNG,
>   .larbid_remap = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9},
>  };
>  
>  static const struct mtk_iommu_plat_data mt8173_data = {
>   .m4u_plat = M4U_MT8173,
> - .has_4gb_mode = true,
> - .has_bclk = true,
> - .reset_axi= true,
> + .flags= HAS_4GB_MODE | HAS_BCLK | RESET_AXI,
>   .larbid_remap = {0, 1, 2, 3, 4, 5}, /* Linear mapping. */
>  };
>  
>  static const struct mtk_iommu_plat_data mt8183_data = {
>   .m4u_plat = M4U_MT8183,
> - .reset_axi= true,
> + .flags= RESET_AXI,
>   .larbid_remap = {0, 4, 5, 6, 7, 2, 3, 1},
>  };
>  
> diff --git a/drivers/iommu/mtk_iommu.h b/drivers/iommu/mtk_iommu.h
> index 7212e6fcf982..5225a9170aaa 100644
> --- a/drivers/iommu/mtk_iommu.h
> +++ b/drivers/iommu/mtk_iommu.h
> @@ -39,12 +39,7 @@ enum mtk_iommu_plat {
>  
>  struct mtk_iommu_plat_data {
>   enum mtk_iommu_plat m4u_plat;
> - boolhas_4gb_mode;
> -
> - /* HW will use the EMI clock if there isn't the "bclk". */
> - boolhas_bclk;
> - boolhas_vld_pa_rng;
> - boolreset_axi;
> + u32 flags;


How about using bit field instead? eg

  u32 has_bclk:1;

In this way, we don't need to change code.

Joe.C


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/io-pgtable: Add MTK 4GB mode in Short-descriptor

2016-03-10 Thread Yingjoe Chen
On Wed, 2016-03-02 at 12:31 +, Robin Murphy wrote:
> Hi Yong,
> 
> On 23/02/16 23:02, Yong Wu wrote:
> > Mediatek extend bit9 in the lvl1 and lvl2 pgtable descriptor of the
> > Short-descriptor as the 4GB mode in which the dram size will be
> > over 4GB.
> >
> > We add a special quirk for this MTK-4GB mode, And in the standard
> > spec, Bit9 in the lvl1 is "IMPLEMENTATION DEFINED", while it's AP[2]
> > in the lvl2, therefore if this quirk is enabled, NO_PERMS is also
> > expected.
> 
> Would you be able to explain exactly what this "4GB mode" actually is? 
> I've been trying to make sense of it from the original M4U patches and 
> the patch for the I2C driver, but it has me completely baffled.
> 
> Is it simply used as an extra physical address bit (although surely that 
> would make it "8GB mode"?), or does it do something crazier like 
> toggling an interconnect remap that shifts the output addresses up by 
> 1GB to be relative to the base of DRAM, like a dma_pfn_offset?

Unfortunately, it is somewhat more crazier than that. Let me have a
short brief on what we got.

Normally, mt8173 memory map looks like this:

Physical addr
-
| 1st GB |->  HW SRAM and Regs
|
| 2nd GB |->  DRAM 1st GB
|
| 3rd GB |->  DRAM 2nd GB
|
| 4th GB |->  DRAM 3rd GB
|

On mt8173, we have a "DRAM 4GB mode" toggle bit. If it is enabled, from
CPU's point of view, the dram will be shifted to start from PA
0x1_. The PA 0x4000~0x will become an alias to
0x1_4000~0x1_.

Many HW only support 32bits physical address, the 33bit will be added to
PA when 4GB mode is enabled. This looks like dma_pfn_offset you
mentioned.

Some others, like i2c or audio, support 33bits physical address, mostly
because they support DMA to/from SRAM with the same SW interface. When
4GB mode is enabled, these HW can still access DRAM with aliased dram
address (0x4000 ~ 0x)

The MTK M4U(iommu) is another case. It did add 33 bits to page table
entries and registers. Unfortunately, when 4GB mode is enabled, 33bit
must be 1. It treats all PA < 0x1__ as invalid address. That's
why we always set the 33bit when 4GB mode is enabled in this patch.

And there are other HWs with even crazier remap rule, but that's another
story...

Joe.C


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/4] iommu: Implement common IOMMU ops for DMA mapping

2015-05-29 Thread Yingjoe Chen

Hi Robin,

More info, hope this make it clearer. We are calling dma_map_sg_attrs
with the following 2 sg. With IOMMU, we are expecting it got merged into
1 contiguous va range, but instead we get 2 va range.

sg0 dma_address 0xfeeddc00 size 0x400, offset 0xc00
sg1 dma_address 0xfeede000 size 0x1000, offset 0x0

Joe.C

On Fri, 2015-05-29 at 13:26 +0800, Yong Wu wrote:
 Hi Robin,
 Thanks.
 
 While we test venc in v4l2, we get a problem:
 When we enter the funtion[0], it will be break unexpectedly in the
 funcion[1] while the offset of sg table is not zero. It is ok if the
 offset is zero. Then I add more log in dma-iommu.c, please help check
 below.
 All we tested it based on dma v2. and have not tested it on v3 yet.
 The code of iommu-map-sg seems the same. if it's fixed in v3, I'm very
 sorry. The map_sg in mtk-iommu use default_iommu_map_sg.
 Any question please tell me, Thanks very much. 
 
 [0]http://lxr.free-electrons.com/source/drivers/media/v4l2-core/videobuf2-dma-contig.c#L564

 [1]http://lxr.free-electrons.com/source/drivers/media/v4l2-core/videobuf2-dma-contig.c#L70
 
 
 On Wed, 2015-05-27 at 15:09 +0100, Robin Murphy wrote:
  Taking inspiration from the existing arch/arm code, break out some
  generic functions to interface the DMA-API to the IOMMU-API. This will
  do the bulk of the heavy lifting for IOMMU-backed dma-mapping.
  
  Signed-off-by: Robin Murphy robin.mur...@arm.com
  ---
   drivers/iommu/Kconfig |   7 +
   drivers/iommu/Makefile|   1 +
   drivers/iommu/dma-iommu.c | 560 
  ++
   include/linux/dma-iommu.h |  94 
   4 files changed, 662 insertions(+)
   create mode 100644 drivers/iommu/dma-iommu.c
   create mode 100644 include/linux/dma-iommu.h
  
 [snip]
  +static int __finalise_sg(struct device *dev, struct scatterlist *sg, int 
  nents,
  +   dma_addr_t dma_addr)
  +{
  +   struct scatterlist *s, *seg = sg;
  +   unsigned long seg_mask = dma_get_seg_boundary(dev);
  +   unsigned int max_len = dma_get_max_seg_size(dev);
  +   unsigned int seg_len = 0, seg_dma = 0;
  +   int i, count = 1;
  +
  +   for_each_sg(sg, s, nents, i) {
  +   /* Un-swizzling the fields here, hence the naming mismatch */
  +   unsigned int s_offset = sg_dma_address(s);
  +   unsigned int s_length = sg_dma_len(s);
  +   unsigned int s_dma_len = s-length;
  +
  +   s-offset = s_offset;
  +   s-length = s_length;
  +   sg_dma_address(s) = DMA_ERROR_CODE;
  +   sg_dma_len(s) = 0;
  +
  +   if (seg_len  (seg_dma + seg_len == dma_addr + s_offset) 
  +   (seg_len + s_dma_len = max_len) 
  +   ((seg_dma  seg_mask) = seg_mask - (seg_len + s_length))
  +  ) {
  +   sg_dma_len(seg) += s_dma_len;
  +   } else {
  +   if (seg_len) {
  +   seg = sg_next(seg);
  +   count++;
  +   }
  +   sg_dma_len(seg) = s_dma_len;
  +   sg_dma_address(seg) = dma_addr + s_offset;
Here the value of sg_dma_address have added s_offset, but
 sg_dma_len(seg) still is s_dma_len.
In the first loop, s_dma_len is from s-length which is alignd by
 s_length = iova_align(iovad, s_length + s_offset); in
 the interface iommu_dma_map_sg.
  +
  +   seg_len = s_offset;
  +   seg_dma = dma_addr + s_offset;
  +   }
  +   seg_len += s_length;
  +   dma_addr += s_dma_len;
  +   }
  +   return count;
  +}
  +
  +static void __invalidate_sg(struct scatterlist *sg, int nents)
  +{
  +   struct scatterlist *s;
  +   int i;
  +
  +   for_each_sg(sg, s, nents, i) {
  +   if (sg_dma_address(s) != DMA_ERROR_CODE)
  +   s-offset = sg_dma_address(s);
  +   if (sg_dma_len(s))
  +   s-length = sg_dma_len(s);
  +   sg_dma_address(s) = DMA_ERROR_CODE;
  +   sg_dma_len(s) = 0;
  +   }
  +}
  +
  +int iommu_dma_map_sg(struct device *dev, struct scatterlist *sg,
  +   int nents, int prot, bool coherent)
  +{
  +   struct iommu_dma_domain *dom = arch_get_dma_domain(dev);
  +   struct iova_domain *iovad = dom-iovad;
  +   struct iova *iova;
  +   struct scatterlist *s;
  +   dma_addr_t dma_addr;
  +   size_t iova_len = 0;
  +   int i;
  +
  +   /*
  +* Work out how much IOVA space we need, and align the segments to
  +* IOVA granules for the IOMMU driver to handle. With some clever
  +* trickery we can modify the list in a reversible manner.
  +*/
  +   for_each_sg(sg, s, nents, i) {
  +   size_t s_offset = iova_offset(iovad, s-offset);
  +   size_t s_length = s-length;
  +
  +   sg_dma_address(s) = s-offset;
  +   sg_dma_len(s) = s_length;
  +   s-offset -= s_offset;
  +   s_length = iova_align(iovad, s_length + s_offset);
  +   s-length = 

Re: [PATCH 00/15] iommu: Move domain allocation into drivers

2015-03-23 Thread Yingjoe Chen
On Mon, 2015-03-23 at 12:49 +0100, Joerg Roedel wrote:
 Hi Yingjoe,
 
 On Fri, Mar 20, 2015 at 05:24:18PM +0800, Yingjoe Chen wrote:
  What's the status of this patchset?
  While porting MTK IOMMU driver[1], we need to use a global variable
  because we need to do dma_alloc_coherent in our domain_init. I think we
  can remove that global variable if we base on this series.
 
 I plan to send out a new version this week.

Hi Joerg,

That's great.
Thanks for the update :)

Joe.C


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 00/15] iommu: Move domain allocation into drivers

2015-03-20 Thread Yingjoe Chen
On Tue, 2015-01-27 at 00:51 +0100, Joerg Roedel wrote:
 From: Joerg Roedel jroe...@suse.de
 
 Hi,
 
 here is patch-set to replace the existing domain_init and
 domain_destroy iommu-ops with the new domain_alloc and
 domain_free callbacks
 
 The new callbacks move the allocation of iommu domains into
 the iommu driver, allowing them to put a generic
 iommu_domain struct into their own domain struct. This makes
 domain handling in the drivers more cache efficient and
 prepares the introduction of default domains in another
 patch-set.

Hi Joerg,

What's the status of this patchset?
While porting MTK IOMMU driver[1], we need to use a global variable
because we need to do dma_alloc_coherent in our domain_init. I think we
can remove that global variable if we base on this series.

Joe.C

[1]
http://lists.infradead.org/pipermail/linux-arm-kernel/2015-March/328461.html


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 2/5] iommu/mediatek: Add mt8173 IOMMU driver

2015-03-09 Thread Yingjoe Chen
On Tue, 2015-03-10 at 02:00 +0900, Tomasz Figa wrote:
 On Mon, Mar 9, 2015 at 11:46 PM, Yingjoe Chen yingjoe.c...@mediatek.com 
 wrote:
  On Mon, 2015-03-09 at 20:11 +0900, Tomasz Figa wrote:
  ...
   +/*
   + * pimudev is a global var for dma_alloc_coherent.
   + * It is not accepatable, we will delete it if domain_alloc is enabled
   + */
   +static struct device *pimudev;
 
  This is indeed not acceptable. Could you replace dma_alloc_coherent()
  with something that doesn't require device pointer, e.g.
  alloc_pages()? (Although that would require you to handle cache
  maintenance in the driver, due to cached memory allocated.) I need to
  think about a better solution for this.
 
  Hi,
 
  For 2nd level page table, we use cached memory now. Currently we are
  using __dma_flush_range to flush the cache, which is also unacceptable.
 
  For proper cache management, we'll need to use dma_map_single or
  dma_sync_*, which still need a deivce*.
 
 Looking at how already mainlined drivers do this, they either use
 dmac_flush_range()
 (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/iommu/msm_iommu.c?id=refs/tags/v4.0-rc3#n80)
 or directly __cpuc_flush_dcache_area() and outer_flush_range()
 (https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/iommu/rockchip-iommu.c?id=refs/tags/v4.0-rc3#n93).

Hi,

These only exist in arch/arm, not arm64. I think we should avoid using
API start with __ in drivers. This driver might be used in both
arm/arm64, I think the only option for us is DMA APIs.

Actually, I'm thinking that we should change to use coherent memory for
2nd level page table as well and totally skip the cache flush. It seems
dma_pool_create is suitable to replace kmem_cache we are using right
now. However it still need a device*, which we have to fix anyway.

Joe.C


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [RFC PATCH 0/5] arm64: IOMMU-backed DMA mapping

2015-01-13 Thread Yingjoe Chen
On Mon, 2015-01-12 at 20:48 +, Robin Murphy wrote:
 Hi all,
 
 Whilst it's a long way off perfect, this has reached the point of being
 functional and stable enough to be useful, so here it is. The core
 consists of the meat of the arch/arm implementation modified to remove
 the assumption of PAGE_SIZE pages and ported over to the Intel IOVA
 allocator instead of the bitmap-based one. For that, this series depends
 on my Genericise the IOVA allocator series posted earlier[1].

Hi Robin,

We'd to give it a try. Do you have a public git tree contains both
series?

Joe.C



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu