Re: [PATCH v7 14/21] iommu/mediatek: Add mmu1 support

2019-06-18 Thread Tomasz Figa via iommu
On Mon, Jun 10, 2019 at 9:21 PM Yong Wu  wrote:
>
> Normally the M4U HW connect EMI with smi. the diagram is like below:
>   EMI
>|
>   M4U
>|
> smi-common
>|
>-
>||| |...
> larb0 larb1  larb2 larb3
>
> Actually there are 2 mmu cells in the M4U HW, like this diagram:
>
>   EMI
>-
> | |
>mmu0  mmu1 <- M4U
> | |
>-
>|
> smi-common
>|
>-
>||| |...
> larb0 larb1  larb2 larb3
>
> This patch add support for mmu1. In order to get better performance,
> we could adjust some larbs go to mmu1 while the others still go to
> mmu0. This is controlled by a SMI COMMON register SMI_BUS_SEL(0x220).
>
> mt2712, mt8173 and mt8183 M4U HW all have 2 mmu cells. the default
> value of that register is 0 which means all the larbs go to mmu0
> defaultly.
>
> This is a preparing patch for adjusting SMI_BUS_SEL for mt8183.
>
> Signed-off-by: Yong Wu 
> Reviewed-by: Evan Green 
> ---
>  drivers/iommu/mtk_iommu.c | 46 +-
>  1 file changed, 29 insertions(+), 17 deletions(-)
>
> diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
> index 3a14301..ec4ce74 100644
> --- a/drivers/iommu/mtk_iommu.c
> +++ b/drivers/iommu/mtk_iommu.c
> @@ -72,26 +72,32 @@
>  #define F_INT_CLR_BIT  BIT(12)
>
>  #define REG_MMU_INT_MAIN_CONTROL   0x124
> -#define F_INT_TRANSLATION_FAULTBIT(0)
> -#define F_INT_MAIN_MULTI_HIT_FAULT BIT(1)
> -#define F_INT_INVALID_PA_FAULT BIT(2)
> -#define F_INT_ENTRY_REPLACEMENT_FAULT  BIT(3)
> -#define F_INT_TLB_MISS_FAULT   BIT(4)
> -#define F_INT_MISS_TRANSACTION_FIFO_FAULT  BIT(5)
> -#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   BIT(6)
> +   /* mmu0 | mmu1 */
> +#define F_INT_TRANSLATION_FAULT(BIT(0) | BIT(7))
> +#define F_INT_MAIN_MULTI_HIT_FAULT (BIT(1) | BIT(8))
> +#define F_INT_INVALID_PA_FAULT (BIT(2) | BIT(9))
> +#define F_INT_ENTRY_REPLACEMENT_FAULT  (BIT(3) | BIT(10))
> +#define F_INT_TLB_MISS_FAULT   (BIT(4) | BIT(11))
> +#define F_INT_MISS_TRANSACTION_FIFO_FAULT  (BIT(5) | BIT(12))
> +#define F_INT_PRETETCH_TRANSATION_FIFO_FAULT   (BIT(6) | BIT(13))

If there are two IOMMUs, shouldn't we have two driver instances handle
them, instead of making the driver combine them two internally?

And, what is even more important from security point of view actually,
have two separate page tables (aka IOMMU groups) for them?

Best regards,
Tomasz
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v5 08/13] iommu/rockchip: Control clocks needed to access the IOMMU

2018-02-14 Thread Tomasz Figa via iommu
On Wed, Feb 14, 2018 at 7:03 PM, Vivek Gautam
 wrote:
>
>
> On 1/24/2018 7:19 PM, Robin Murphy wrote:
>>
>> On 24/01/18 10:35, Jeffy Chen wrote:
>>>
>>> From: Tomasz Figa 
>>>
>>> Current code relies on master driver enabling necessary clocks before
>>> IOMMU is accessed, however there are cases when the IOMMU should be
>>> accessed while the master is not running yet, for example allocating
>>> V4L2 videobuf2 buffers, which is done by the VB2 framework using DMA
>>> mapping API and doesn't engage the master driver at all.
>>>
>>> This patch fixes the problem by letting clocks needed for IOMMU
>>> operation to be listed in Device Tree and making the driver enable them
>>> for the time of accessing the hardware.
>>>
>>> Signed-off-by: Jeffy Chen 
>>> Signed-off-by: Tomasz Figa 
>>> ---
>>>
>>> Changes in v5:
>>> Use clk_bulk APIs.
>>>
>>> Changes in v4: None
>>> Changes in v3: None
>>> Changes in v2: None
>
>
> [snip]
>
>
>>>   +static int rk_iommu_of_get_clocks(struct rk_iommu *iommu)
>>> +{
>>> +struct device_node *np = iommu->dev->of_node;
>>> +int ret;
>>> +int i;
>>> +
>>> +ret = of_count_phandle_with_args(np, "clocks", "#clock-cells");
>>> +if (ret == -ENOENT)
>>> +return 0;
>>> +else if (ret < 0)
>>> +return ret;
>>> +
>>> +iommu->num_clocks = ret;
>>> +iommu->clocks = devm_kcalloc(iommu->dev, iommu->num_clocks,
>>> + sizeof(*iommu->clocks), GFP_KERNEL);
>>> +if (!iommu->clocks)
>>> +return -ENOMEM;
>>> +
>>> +for (i = 0; i < iommu->num_clocks; ++i) {
>>> +iommu->clocks[i].clk = of_clk_get(np, i);
>>> +if (IS_ERR(iommu->clocks[i].clk)) {
>>> +ret = PTR_ERR(iommu->clocks[i].clk);
>>> +goto err_clk_put;
>>> +}
>>> +}
>>
>>
>> Just to confirm my understanding from a quick scan through the code, the
>> reason we can't use clk_bulk_get() here is that currently, clocks[i].id
>> being NULL means we'd end up just getting the first clock multiple times,
>> right?
>>
>> I guess there could be other users who also want "just get whatever clocks
>> I have" functionality, so it might be worth proposing that for the core API
>> as a separate/follow-up patch, but it definitely doesn't need to be part of
>> this series.
>
>
> Just to understand. Is it okay to make the driver "just get whatever clocks
> device node gives"?
> Doesn't the driver need to be aware of which all clocks are supposed to be
> obtained and enabled
>  It's should good for debug to let the world know which clock we failed to
> get.

Yeah, in general that's desired. However, it is at least impractical
to specify all the clocks in Rockchip case, because it's different for
each block and depends on the master next to which it is located.

Best regards,
Tomasz
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v4 00/12] Intel IPU3 ImgU patchset

2017-12-04 Thread Tomasz Figa via iommu
Hi Raj,

On Tue, Dec 5, 2017 at 9:13 AM, Mani, Rajmohan  wrote:
> Hi Hans,
>
> Thanks for your patience and sharing your thoughts on this.
>
>> Subject: Re: [PATCH v4 00/12] Intel IPU3 ImgU patchset
>>
>> Hi Rajmohan,
>>
>> On 11/17/2017 03:58 AM, Mani, Rajmohan wrote:
>> > Hi Sakari and all,
>> >
>> >> -Original Message-
>> >> From: Zhi, Yong
>> >> Sent: Tuesday, October 17, 2017 8:47 PM
>> >> To: linux-me...@vger.kernel.org; sakari.ai...@linux.intel.com
>> >> Cc: Zheng, Jian Xu ; Mani, Rajmohan
>> >> ; Toivonen, Tuukka
>> >> ; Hu, Jerry W ;
>> >> a...@arndb.de; h...@lst.de; robin.mur...@arm.com; iommu@lists.linux-
>> >> foundation.org; Zhi, Yong 
>> >> Subject: [PATCH v4 00/12] Intel IPU3 ImgU patchset
>> >>
>> >> This patchset adds support for the Intel IPU3 (Image Processing Unit)
>> >> ImgU which is essentially a modern memory-to-memory ISP. It
>> >> implements raw Bayer to YUV image format conversion as well as a
>> >> large number of other pixel processing algorithms for improving the image
>> quality.
>> >>
>> >> Meta data formats are defined for image statistics (3A, i.e.
>> >> automatic white balance, exposure and focus, histogram and local area
>> >> contrast
>> >> enhancement) as well as for the pixel processing algorithm parameters.
>> >> The documentation for these formats is currently not included in the
>> >> patchset but will be added in a future version of this set.
>> >>
>> >
>> > Here is an update on the IPU3 documentation that we are currently working
>> on.
>> >
>> > Image processing in IPU3 relies on the following.
>> >
>> > 1) HW configuration to enable ISP and
>> > 2) setting customer specific 3A Tuning / Algorithm Parameters to achieve
>> desired image quality.
>> >
>> > We intend to provide documentation on ImgU driver programming interface
>> to help users of this driver to configure and enable ISP HW to meet their
>> needs.  This documentation will include details on complete V4L2 Kernel 
>> driver
>> interface and IO-Control parameters, except for the ISP internal algorithm 
>> and
>> its parameters (which is Intel proprietary IP).
>> >
>> > We will also provide an user space library in binary form to help users of 
>> > this
>> driver, to convert the public 3A tuning parameters to IPU3 algorithm
>> parameters. This tool will be released under NDA to the users of this driver.
>>
>> So I discussed this situation with Sakari in Prague during the ELCE. I am not
>> happy with adding a driver to the kernel that needs an NDA to be usable. I
>> can't agree to that. It's effectively the same as firmware that's only 
>> available
>> under an NDA and we wouldn't accept such drivers either.
>>
>
> Ack
>
>> There are a few options:
>>
>> 1) Document the ISP parameters and that format they are stored in allowing
>> for
>>open source implementations. While this is the ideal, I suspect that this 
>> is
>>a no-go for Intel.
>>
>
> Ack
>
>> 2) The driver can be used without using these ISP parameters and still 
>> achieve
>>'OK' quality. I.e., it's usable for basic webcam usage under normal 
>> lighting
>>conditions. I'm not sure if this is possible at all, though.
>>
>
> This is something that we have tried and are able to do image capture with
> the default ISP parameters using ov5670 sensor.
> Additionally we had to set optimal values for the ov5670 sensor's exposure and
> gain settings.

Does it mean hardcoding some ov5670-specific settings in the ISP
driver? If not, I guess it might be good enough?

>
> Please see if the following image looks to meet the 'OK' quality.
>
> git clone https://github.com/RajmohanMani/ipu3-misc.git
> ipu3-misc/ov5670.jpg is the image file.
>
>> 3) Make the library available without requiring an NDA.
>>
>
> We are also actively exploring this option to see if this can be done.
>
>> 4) Provide a non-NDA library (ideally open source) that achieves at minimum
>>the quality as described in 2: i.e. usable for basic webcam.
>>
>
> I see this is the same as option 3) + open sourcing the library.
> Open sourcing the library does not look to be an option.
> I will reconfirm this.

In my understanding, that could be quite different from option 3). The
open source library would not have to implement all of the
capabilities, just enough to get the "OK" quality and the implemented
part could use some simpler algorithms not covered by IP.

Best regards,
Tomasz
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 5/6] iommu/rockchip: use DMA API to map, to flush cache

2016-06-15 Thread Tomasz Figa via iommu
Hi Shunqian,

On Wed, Jun 15, 2016 at 9:04 PM, Shunqian Zheng  wrote:
> Use DMA API instead of architecture internal functions like
> __cpuc_flush_dcache_area() etc.
>
> To support the virtual device like DRM the virtual slave iommu
> added in the previous patch, attaching to which the DRM can use
> it own domain->dev for dma_map_*(), dma_sync_*() even VOP is disabled.
>
> With this patch, this driver is available for ARM64 like RK3399.
>
> Signed-off-by: Shunqian Zheng 
> ---
>  drivers/iommu/rockchip-iommu.c | 113 
> +++--
>  1 file changed, 76 insertions(+), 37 deletions(-)

In general looks good to me, but still have some concern about
attaching and detaching. Please see inline.

> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index 82ecc99..b60b29e 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
[snip]
> @@ -886,6 +901,22 @@ static int rk_iommu_attach_device(struct iommu_domain 
> *domain,
> return -ENOMEM;
> }
>
> +   /* Set the domain dev to virtual one if exist */
> +   if (rk_iommu_is_virtual(iommu) || !rk_domain->dev)

Does it matter if the iommu is virtual or not? This condition will
always evaluate to true on first attach anyway, so there might be
times when rk_domain->dev points to a real device.

> +   rk_domain->dev = iommu->dev;
> +
[snip]
>  static void rk_iommu_detach_device(struct iommu_domain *domain,
> @@ -987,8 +1023,6 @@ static struct iommu_domain 
> *rk_iommu_domain_alloc(unsigned type)
> if (!rk_domain->dt)
> goto err_dt;
>
> -   rk_table_flush(rk_domain->dt, NUM_DT_ENTRIES);
> -
> spin_lock_init(_domain->iommus_lock);
> spin_lock_init(_domain->dt_lock);
> INIT_LIST_HEAD(_domain->iommus);
> @@ -1012,10 +1046,15 @@ static void rk_iommu_domain_free(struct iommu_domain 
> *domain)
> if (rk_dte_is_pt_valid(dte)) {
> phys_addr_t pt_phys = rk_dte_pt_address(dte);
> u32 *page_table = phys_to_virt(pt_phys);
> +   dma_unmap_single(rk_domain->dev, pt_phys,
> +SPAGE_SIZE, DMA_TO_DEVICE);
> free_page((unsigned long)page_table);
> }
> }
>
> +   if (!rk_domain->dt_dma)
> +   dma_unmap_single(rk_domain->dev, rk_domain->dt_dma,
> +SPAGE_SIZE, DMA_TO_DEVICE);
> free_page((unsigned long)rk_domain->dt);
>
> iommu_put_dma_cookie(_domain->domain);

If we detach here a device whose IOMMU is currently pointed by
rk_domain->dev, then the pointer will not point to anything valid
anymore. To be honest, I don't see any good solution for this. Maybe
we should keep all the IOMMUs in a list and set the ->dev pointer to
any from the list here?

Best regards,
Tomasz
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 4/6] drm: rockchip: use common iommu api to attach iommu

2016-06-15 Thread Tomasz Figa via iommu
Hi Shunqian,

On Wed, Jun 15, 2016 at 9:04 PM, Shunqian Zheng  wrote:
> Rockchip DRM used the arm special API, arm_iommu_*(), to attach
> iommu for ARM32 SoCs. This patch convert to common iommu API
> so it would support ARM64 like RK3399.
>
> The general idea is domain_alloc(), attach_device() and
> arch_setup_dma_ops() to set dma_ops manually for DRM at the last.
>
> Signed-off-by: Shunqian Zheng 
> ---
>  drivers/gpu/drm/rockchip/rockchip_drm_drv.c | 112 
> +---
>  drivers/gpu/drm/rockchip/rockchip_drm_drv.h |   1 +
>  2 files changed, 71 insertions(+), 42 deletions(-)

Please see my comment inline.

> diff --git a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c 
> b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
> index f5a68fc..b52c38d 100644
> --- a/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
> +++ b/drivers/gpu/drm/rockchip/rockchip_drm_drv.c
> @@ -14,16 +14,16 @@
>   * GNU General Public License for more details.
>   */
>
> -#include 
> -
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #include "rockchip_drm_drv.h"
>  #include "rockchip_drm_fb.h"
> @@ -46,7 +46,8 @@ static bool is_support_iommu = true;
>  int rockchip_drm_dma_attach_device(struct drm_device *drm_dev,
>struct device *dev)
>  {
> -   struct dma_iommu_mapping *mapping = drm_dev->dev->archdata.mapping;
> +   struct rockchip_drm_private *private = drm_dev->dev_private;
> +   struct iommu_domain *domain = private->domain;
> int ret;
>
> if (!is_support_iommu)
> @@ -58,16 +59,25 @@ int rockchip_drm_dma_attach_device(struct drm_device 
> *drm_dev,
>
> dma_set_max_seg_size(dev, DMA_BIT_MASK(32));
>
> -   return arm_iommu_attach_device(dev, mapping);
> +   ret = iommu_attach_device(domain, dev);
> +   if (ret) {
> +   dev_err(dev, "Failed to attach iommu device\n");
> +   return ret;
> +   }
> +
> +   /* TODO(djkurtz): fetch the mapping start/size from somewhere */
> +   arch_setup_dma_ops(dev, 0x, SZ_2G, dev->bus->iommu_ops, 
> false);
> +   return 0;
>  }
>
>  void rockchip_drm_dma_detach_device(struct drm_device *drm_dev,
> struct device *dev)
>  {
> -   if (!is_support_iommu)
> -   return;
> +   struct rockchip_drm_private *private = drm_dev->dev_private;
> +   struct iommu_domain *domain = private->domain;
>
> -   arm_iommu_detach_device(dev);
> +   if (is_support_iommu)
> +   iommu_detach_device(domain, dev);
>  }
>
>  int rockchip_register_crtc_funcs(struct drm_crtc *crtc,
> @@ -132,10 +142,52 @@ static void rockchip_drm_crtc_disable_vblank(struct 
> drm_device *dev,
> priv->crtc_funcs[pipe]->disable_vblank(crtc);
>  }
>
> +static int rockchip_drm_init_iommu(struct drm_device *drm_dev)
> +{
> +   struct rockchip_drm_private *private = drm_dev->dev_private;
> +   struct device *dev = drm_dev->dev;
> +   int ret;
> +
> +   dev->dma_parms = devm_kzalloc(dev, sizeof(*dev->dma_parms),
> + GFP_KERNEL);
> +   if (!dev->dma_parms)
> +   return -ENOMEM;
> +
> +   private->domain = iommu_domain_alloc(_bus_type);
> +   if (!private->domain)
> +   return -ENOMEM;
> +
> +   /* TODO(djkurtz): fetch the mapping start/size from somewhere */
> +   ret = iommu_dma_init_domain(private->domain, 0x, SZ_2G);
> +   if (ret) {
> +   dev_err(dev, "Failed to init domain\n");
> +   goto err_free_domain;
> +   }
> +
> +   ret = rockchip_drm_dma_attach_device(drm_dev, dev);
> +   if (ret) {
> +   dev_err(dev, "Failed to attach device\n");
> +   goto err_free_domain;
> +   }
> +
> +   return 0;
> +
> +err_free_domain:
> +   iommu_domain_free(private->domain);
> +
> +   return ret;
> +}
> +
> +static void rockchip_iommu_cleanup(struct drm_device *drm_dev)
> +{
> +   struct rockchip_drm_private *private = drm_dev->dev_private;
> +

No need to call rockchip_drm_dma_detach_device() here?

> +   iommu_domain_free(private->domain);
> +}

Otherwise looks good, so after addressing the above, feel free to add
my Reviewed-by.

Best regards,
Tomasz
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v2 6/7] iommu/rockchip: use DMA API to map, to flush cache

2016-06-10 Thread Tomasz Figa via iommu
Hi,

On Wed, Jun 8, 2016 at 10:26 PM, Shunqian Zheng  wrote:
> Use DMA API instead of architecture internal functions like
> __cpuc_flush_dcache_area() etc.
>
> To support the virtual device like DRM the virtual slave iommu
> added in the previous patch, attaching to which the DRM can use
> it own domain->dev for dma_map_*(), dma_sync_*() even VOP is disabled.
>
> With this patch, this driver is available for ARM64 like RK3399.
>

Could we instead simply allocate coherent memory for page tables using
dma_alloc_coherent() and skip any flushing on CPU side completely? If
I'm looking correctly, the driver only reads back the page directory
when checking if there is a need to allocate new page table, so there
shouldn't be any significant penalty for disabling the cache.

Other than that, please see some comments inline.

> Signed-off-by: Shunqian Zheng 
> ---
>  drivers/iommu/rockchip-iommu.c | 113 
> ++---
>  1 file changed, 71 insertions(+), 42 deletions(-)
>
> diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
> index d6c3051..aafea6e 100644
> --- a/drivers/iommu/rockchip-iommu.c
> +++ b/drivers/iommu/rockchip-iommu.c
> @@ -4,8 +4,6 @@
>   * published by the Free Software Foundation.
>   */
>
> -#include 
> -#include 
>  #include 
>  #include 
>  #include 
> @@ -61,8 +59,7 @@
>  #define RK_MMU_IRQ_BUS_ERROR 0x02  /* bus read error */
>  #define RK_MMU_IRQ_MASK  (RK_MMU_IRQ_PAGE_FAULT | 
> RK_MMU_IRQ_BUS_ERROR)
>
> -#define NUM_DT_ENTRIES 1024
> -#define NUM_PT_ENTRIES 1024
> +#define NUM_TLB_ENTRIES 1024 /* for both DT and PT */

Is it necessary to change this in this patch? In general, it's not a
good idea to mix multiple logical changes together.

>
>  #define SPAGE_ORDER 12
>  #define SPAGE_SIZE (1 << SPAGE_ORDER)
> @@ -82,7 +79,9 @@
>
>  struct rk_iommu_domain {
> struct list_head iommus;
> +   struct device *dev;
> u32 *dt; /* page directory table */
> +   dma_addr_t dt_dma;
> spinlock_t iommus_lock; /* lock for iommus list */
> spinlock_t dt_lock; /* lock for modifying page directory table */
>
> @@ -98,14 +97,12 @@ struct rk_iommu {
> struct iommu_domain *domain; /* domain to which iommu is attached */
>  };
>
> -static inline void rk_table_flush(u32 *va, unsigned int count)
> +static inline void rk_table_flush(struct device *dev, dma_addr_t dma,
> + unsigned int count)
>  {
> -   phys_addr_t pa_start = virt_to_phys(va);
> -   phys_addr_t pa_end = virt_to_phys(va + count);
> -   size_t size = pa_end - pa_start;
> +   size_t size = count * 4;

It would be a good idea to specify what "count" is. I'm a bit confused
that before it meant bytes and now some multiple of 4?

Best regards,
Tomasz
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu