Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent

2021-12-09 Thread Dafna Hirschfeld




On 23.03.15 10:38, Tomasz Figa wrote:

Sorry, I had to dig my way out through my backlog.

On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel  wrote:

On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote:

Even though the code uses the dt_lock spin lock to serialize mapping
operation from different threads, it does not protect from IOMMU
accesses that might be already taking place and thus altering state
of the IOTLB. This means that current mapping code which first zaps
the page table and only then updates it with new mapping which is
prone to mentioned race.


Could you elabortate a bit on the race and why it is sufficient to zap
only the first and the last iova? From the description and the comments
in the patch this is not clear to me.


Let's start with why it's sufficient to zap only first and last iova.

While unmapping, the driver zaps all iovas belonging to the mapping,
so the page tables not used by any mapping won't be cached. Now when
the driver creates a mapping it might end up occupying several page
tables. However, since the mapping area is virtually contiguous, only
the first and last page table can be shared with different mappings.
This means that only first and last iovas can be already cached. In
fact, we could detect if first and last page tables are shared and do
not zap at all, but this wouldn't really optimize too much. Why
invalidating one iova is enough to invalidate the whole page table is
unclear to me as well, but it seems to be the correct way on this
hardware.


Hi,
It seems to me that actually each mapping needs exactly one page.
Since (as the inline doc in rk_iommu_map states) the pgsize_bitmap
makes sure that iova mappings fits exactly into one page table
since the mapping size is maximum 4M.

This actually means that if rk_dte_get_page_table does not allocate a
new page table but returns one that is already partially used from previous
mappings then two page tables might be required, but I think the iova
allocation somehow make sure that this will not be the case.

If it was the case then the code would be buggy because it means
that the loop in rk_iommu_map_iova will write behind the page table
given in rk_dte_get_page_table (which we didn't allocate)

So I it seems to me that calling 'rk_iommu_zap_iova(rk_domain, iova, 
SPAGE_SIZE);'
as done before this patch should be used, but be moved from
rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now

Thanks,
Dafna



As for the race, it's also kind of explained by the above. The already
running hardware can trigger page table look-ups in the IOMMU and so
caching of the page table between our zapping and updating its
contents. With this patch zapping is performed after updating the page
table so the race is gone.

Best regards,
Tomasz

 From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: 
Received: (majord...@vger.kernel.org) by vger.kernel.org via listexpand
id S1753210AbbCWM3R (ORCPT );
Mon, 23 Mar 2015 08:29:17 -0400
Received: from 8bytes.org ([81.169.241.247]:33957 "EHLO theia.8bytes.org"
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
id S1752552AbbCWM3M (ORCPT );
Mon, 23 Mar 2015 08:29:12 -0400
Date: Mon, 23 Mar 2015 13:29:10 +0100
From: Joerg Roedel 
To: Tomasz Figa 
Cc: iommu@lists.linux-foundation.org,
 "linux-arm-ker...@lists.infradead.org"
,
 "linux-ker...@vger.kernel.org" ,
 "open list:ARM/Rockchip SoC..." ,
 Heiko Stuebner , Daniel Kurtz 
Subject: Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table
  state is coherent
Message-ID: <20150323122910.go4...@8bytes.org>
References: <1423480761-33453-1-git-send-email-tf...@chromium.org>
  <20150303133659.gd10...@8bytes.org>
  
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: 

User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-ow...@vger.kernel.org
List-ID: 
X-Mailing-List: linux-ker...@vger.kernel.org

Hi Tomasz,

On Mon, Mar 23, 2015 at 05:38:45PM +0900, Tomasz Figa wrote:

While unmapping, the driver zaps all iovas belonging to the mapping,
so the page tables not used by any mapping won't be cached. Now when
the driver creates a mapping it might end up occupying several page
tables. However, since the mapping area is virtually contiguous, only
the first and last page table can be shared with different mappings.
This means that only first and last iovas can be already cached. In
fact, we could detect if first and last page tables are shared and do
not zap at all, but this wouldn't really optimize too much. Why
invalidating one iova is enough to invalidate the whole page table is
unclear to me as well, but it seems to be the correct way on this
hardware.

As for the race, it's also kind of explained by the above. The already
running hardware can trigger page table look-ups in the IOMMU and so
caching of the page table between our zapping and updating its
contents. With this patch zapping is 

[PATCH v2 4/5] iommu/mediatek: Add tlb_lock in tlb_flush_all

2021-12-08 Thread Dafna Hirschfeld
From: Yong Wu 

The tlb_flush_all touches the registers controlling tlb operations.
Protect it with the tlb_lock spinlock.
This also require the range_sync func to release that spinlock before
calling tlb_flush_all.

Signed-off-by: Yong Wu 
[refactor commit log]
Signed-off-by: Dafna Hirschfeld 
---
 drivers/iommu/mtk_iommu.c | 12 +---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index e30ac68fab48..195a411e3087 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -210,10 +210,14 @@ static struct mtk_iommu_domain *to_mtk_domain(struct 
iommu_domain *dom)
 
 static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)
 {
+   unsigned long flags;
+
+   spin_lock_irqsave(>tlb_lock, flags);
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
   data->base + data->plat_data->inv_sel_reg);
writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
wmb(); /* Make sure the tlb flush all done */
+   spin_unlock_irqrestore(>tlb_lock, flags);
 }
 
 static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
@@ -242,14 +246,16 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
iova, size_t size,
/* tlb sync */
ret = readl_poll_timeout_atomic(data->base + REG_MMU_CPE_DONE,
tmp, tmp != 0, 10, 1000);
+
+   /* Clear the CPE status */
+   writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
+   spin_unlock_irqrestore(>tlb_lock, flags);
+
if (ret) {
dev_warn(data->dev,
 "Partial TLB flush timed out, falling back to 
full flush\n");
mtk_iommu_tlb_flush_all(data);
}
-   /* Clear the CPE status */
-   writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
-   spin_unlock_irqrestore(>tlb_lock, flags);
 
pm_runtime_put(data->dev);
}
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 0/5] iommu/mediatek: Fix tlb flush logic

2021-12-08 Thread Dafna Hirschfeld
Often devices allocate DMA buffers before they do
runtime pm resume. This is the case for example with v4l2
devices where buffers are allocated during 'VIDIOC_REQBUFS`
and runtime resume happens later usually during 'VIDIOC_STREAMON'.
In such cases the partial tlb flush when allocating will fail
since the iommu is runtime suspended. This will print a warning
and try to do full flush. But there is actually no need to flush
the tlb before the consumer device is turned on.

Fix the warning by skipping partial flush when allocating and instead
do full flash in runtime resume.

In order to do full flush from the resume cb, the test:

if (pm_runtime_get_if_in_use(data->dev) <= 0)
continue;

needs to be removed from the flush all func since pm_runtime_get_if_in_use
returns 0 while resuming and will skip the flush


This patchset is a combination of 4 patches already sent in a different
patchset: [1] and a warning fix from Sebastian Reichel

[1] 
https://lore.kernel.org/linux-devicetree/20210923115840.17813-1-yong...@mediatek.com/

changes since v1:
-

* Added preparation patches to remove the unneeded 'for_each_m4u' usage
and add a spinlock to protect access to tlb control registers.
* remove the pm runtime status check as explained above.
* refactor commit logs and inline doc
* move the call to full flush to the bottom of the resume cb after all 
registers are updated.


Sebastian Reichel (1):
  iommu/mediatek: Always check runtime PM status in tlb flush range
callback

Yong Wu (4):
  iommu/mediatek: Remove for_each_m4u in tlb_sync_all
  iommu/mediatek: Remove the power status checking in tlb flush all
  iommu/mediatek: Add tlb_lock in tlb_flush_all
  iommu/mediatek: Always tlb_flush_all when each PM resume

 drivers/iommu/mtk_iommu.c | 42 ---
 1 file changed, 22 insertions(+), 20 deletions(-)

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 5/5] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-12-08 Thread Dafna Hirschfeld
From: Yong Wu 

Prepare for 2 HWs that sharing pgtable in different power-domains.

When there are 2 M4U HWs, it may has problem in the flush_range in which
we get the pm_status via the m4u dev, BUT that function don't reflect the
real power-domain status of the HW since there may be other HW also use
that power-domain.

DAM allocation is often done while the allocating device is runtime
suspended. In such a case the iommu will also be suspended and partial
flushing of the tlb will not be executed.
Therefore, we add a tlb_flush_all in the pm_runtime_resume to make
sure the tlb is always clean.

In other case, the iommu's power should be active via device
link with smi.

Signed-off-by: Yong Wu 
[move the call to mtk_iommu_tlb_flush_all to the bottom of resume cb, improve 
doc/log]
Signed-off-by: Dafna Hirschfeld 
---
 drivers/iommu/mtk_iommu.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 195a411e3087..4799cd06511b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -997,6 +997,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct 
device *dev)
writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR);
writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG);
writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, base + 
REG_MMU_PT_BASE_ADDR);
+
+   /*
+* Users may allocate dma buffer before they call pm_runtime_get,
+* in which case it will lack the necessary tlb flush.
+* Thus, make sure to update the tlb after each PM resume.
+*/
+   mtk_iommu_tlb_flush_all(data);
return 0;
 }
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 1/5] iommu/mediatek: Remove for_each_m4u in tlb_sync_all

2021-12-08 Thread Dafna Hirschfeld
From: Yong Wu 

The tlb_sync_all is called from these three functions:
a) flush_iotlb_all: it will be called for each a iommu HW.
b) tlb_flush_range_sync: it already has for_each_m4u.
c) in irq: When IOMMU HW translation fault, Only need flush itself.

Thus, No need for_each_m4u in this tlb_sync_all. Remove it.

Signed-off-by: Yong Wu 
Reviewed-by: Dafna Hirschfeld 
---
 drivers/iommu/mtk_iommu.c | 16 +++-
 1 file changed, 7 insertions(+), 9 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 507123ae7485..342aa562ab6a 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -210,17 +210,15 @@ static struct mtk_iommu_domain *to_mtk_domain(struct 
iommu_domain *dom)
 
 static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)
 {
-   for_each_m4u(data) {
-   if (pm_runtime_get_if_in_use(data->dev) <= 0)
-   continue;
+   if (pm_runtime_get_if_in_use(data->dev) <= 0)
+   return;
 
-   writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
-  data->base + data->plat_data->inv_sel_reg);
-   writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
-   wmb(); /* Make sure the tlb flush all done */
+   writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
+  data->base + data->plat_data->inv_sel_reg);
+   writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
+   wmb(); /* Make sure the tlb flush all done */
 
-   pm_runtime_put(data->dev);
-   }
+   pm_runtime_put(data->dev);
 }
 
 static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 3/5] iommu/mediatek: Remove the power status checking in tlb flush all

2021-12-08 Thread Dafna Hirschfeld
From: Yong Wu 

To simplify the code, Remove the power status checking in the
tlb_flush_all, remove this:
   if (pm_runtime_get_if_in_use(data->dev) <= 0)
continue;

The mtk_iommu_tlb_flush_all is called from
a) isr
b) tlb flush range fail case
c) iommu_create_device_direct_mappings

In first two cases, the power and clock are always enabled.
In the third case tlb flush is unnecessary because in a later patch
in the series a full flush from the pm_runtime_resume callback is added.

In addition, writing the tlb control register when the iommu is not resumed
is ok and the write is ignored.

Signed-off-by: Yong Wu 
[refactor commit log]
Signed-off-by: Dafna Hirschfeld 
---
 drivers/iommu/mtk_iommu.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index dd2c08c54df4..e30ac68fab48 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -210,15 +210,10 @@ static struct mtk_iommu_domain *to_mtk_domain(struct 
iommu_domain *dom)
 
 static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)
 {
-   if (pm_runtime_get_if_in_use(data->dev) <= 0)
-   return;
-
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
   data->base + data->plat_data->inv_sel_reg);
writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE);
wmb(); /* Make sure the tlb flush all done */
-
-   pm_runtime_put(data->dev);
 }
 
 static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH v2 2/5] iommu/mediatek: Always check runtime PM status in tlb flush range callback

2021-12-08 Thread Dafna Hirschfeld
From: Sebastian Reichel 

In case of v4l2_reqbufs() it is possible, that a TLB flush is done
without runtime PM being enabled. In that case the "Partial TLB flush
timed out, falling back to full flush" warning is printed.

Commit c0b57581b73b ("iommu/mediatek: Add power-domain operation")
introduced has_pm as optimization to avoid checking runtime PM
when there is no power domain attached. But without the PM domain
there is still the device driver's runtime PM suspend handler, which
disables the clock. Thus flushing should also be avoided when there
is no PM domain involved.

Signed-off-by: Sebastian Reichel 
Reviewed-by: Dafna Hirschfeld 
---
 drivers/iommu/mtk_iommu.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 342aa562ab6a..dd2c08c54df4 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -225,16 +225,13 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
iova, size_t size,
   size_t granule,
   struct mtk_iommu_data *data)
 {
-   bool has_pm = !!data->dev->pm_domain;
unsigned long flags;
int ret;
u32 tmp;
 
for_each_m4u(data) {
-   if (has_pm) {
-   if (pm_runtime_get_if_in_use(data->dev) <= 0)
-   continue;
-   }
+   if (pm_runtime_get_if_in_use(data->dev) <= 0)
+   continue;
 
spin_lock_irqsave(>tlb_lock, flags);
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
@@ -259,8 +256,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
iova, size_t size,
writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
spin_unlock_irqrestore(>tlb_lock, flags);
 
-   if (has_pm)
-   pm_runtime_put(data->dev);
+   pm_runtime_put(data->dev);
}
 }
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH 1/2] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-12-08 Thread Dafna Hirschfeld



On 08.12.21 11:50, Dafna Hirschfeld wrote:



On 07.12.21 10:31, Dafna Hirschfeld wrote:



On 27.11.21 04:46, Yong Wu wrote:

Hi Dafna,

Sorry for reply late.

On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:

From: Yong Wu 

Prepare for 2 HWs that sharing pgtable in different power-domains.

When there are 2 M4U HWs, it may has problem in the flush_range in
which
we get the pm_status via the m4u dev, BUT that function don't reflect
the
real power-domain status of the HW since there may be other HW also
use
that power-domain.

The function dma_alloc_attrs help allocate the iommu buffer which
need the corresponding power domain since tlb flush is needed when
preparing iova. BUT this function only is for allocating buffer,
we have no good reason to request the user always call pm_runtime_get
before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
in the pm_runtime_resume to make sure the tlb always is clean.

Another solution is always call pm_runtime_get in the
tlb_flush_range.
This will trigger pm runtime resume/backup so often when the iommu
power is not active at some time(means user don't call pm_runtime_get
before calling dma_alloc_xxx), This may cause the performance drop.
thus we don't use this.

In other case, the iommu's power should always be active via device
link with smi.

The previous SoC don't have PM except mt8192. the mt8192 IOMMU is
display's
power-domain which nearly always is enabled. thus no need fix tags
here.
Prepare for mt8195.


In this patchset, this message should be not proper. I think you could
add the comment why this patch is needed in mt8173.



Signed-off-by: Yong Wu 
[imporvie inline doc]
Signed-off-by: Dafna Hirschfeld 
---
  drivers/iommu/mtk_iommu.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 25b834104790..28dc4b95b6d9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -964,6 +964,13 @@ static int __maybe_unused
mtk_iommu_runtime_resume(struct device *dev)
  return ret;
  }
+    /*
+ * Users may allocate dma buffer before they call
pm_runtime_get,
+ * in which case it will lack the necessary tlb flush.
+ * Thus, make sure to update the tlb after each PM resume.
+ */
+    mtk_iommu_tlb_flush_all(data);


This should not work. since current the *_tlb_flush_all call
pm_runtime_get_if_in_use which will always return 0 when it called from
this runtime_cb in my test. thus, It won't do the tlb_flush_all
actually.


He, indeed, my mistake, although the encoder works more or less fine even
without the full flush so I didn't catch that.



I guess this also depend on these two patches of mt8195 v3.
[PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
[PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all


I'll add those two



like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
the pm operation.


yes, I need to remove the pm_runtime_get_if_in_use call in the 'flush_all' func
I see there is also a patch for that in the mt8195 v3 series "[PATCH v3 13/33] 
iommu/mediatek: Remove the power status checking in tlb flush all"

So I'll send v2, adding all those 3 patches, but I think adding 
mtk_iommu_tlb_do_flush_all
on patch 9 and removing it again on patch 13 is confusing so I'll avoid that.



In addition, the call to mtk_iommu_tlb_flush_all from mtk_iommu_runtime_resume 
should move to the bottom of the function
after all values are updated


Thanks,
Dafna





This looks has a dependence. Let me know if I can help this.


It did work for me, testing on elm device. I'll check that again.





+
  /*
   * Uppon first resume, only enable the clk and return, since
the values of the
   * registers are not yet set.



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/2] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-12-08 Thread Dafna Hirschfeld



On 07.12.21 10:31, Dafna Hirschfeld wrote:



On 27.11.21 04:46, Yong Wu wrote:

Hi Dafna,

Sorry for reply late.

On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:

From: Yong Wu 

Prepare for 2 HWs that sharing pgtable in different power-domains.

When there are 2 M4U HWs, it may has problem in the flush_range in
which
we get the pm_status via the m4u dev, BUT that function don't reflect
the
real power-domain status of the HW since there may be other HW also
use
that power-domain.

The function dma_alloc_attrs help allocate the iommu buffer which
need the corresponding power domain since tlb flush is needed when
preparing iova. BUT this function only is for allocating buffer,
we have no good reason to request the user always call pm_runtime_get
before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
in the pm_runtime_resume to make sure the tlb always is clean.

Another solution is always call pm_runtime_get in the
tlb_flush_range.
This will trigger pm runtime resume/backup so often when the iommu
power is not active at some time(means user don't call pm_runtime_get
before calling dma_alloc_xxx), This may cause the performance drop.
thus we don't use this.

In other case, the iommu's power should always be active via device
link with smi.

The previous SoC don't have PM except mt8192. the mt8192 IOMMU is
display's
power-domain which nearly always is enabled. thus no need fix tags
here.
Prepare for mt8195.


In this patchset, this message should be not proper. I think you could
add the comment why this patch is needed in mt8173.



Signed-off-by: Yong Wu 
[imporvie inline doc]
Signed-off-by: Dafna Hirschfeld 
---
  drivers/iommu/mtk_iommu.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 25b834104790..28dc4b95b6d9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -964,6 +964,13 @@ static int __maybe_unused
mtk_iommu_runtime_resume(struct device *dev)
  return ret;
  }
+    /*
+ * Users may allocate dma buffer before they call
pm_runtime_get,
+ * in which case it will lack the necessary tlb flush.
+ * Thus, make sure to update the tlb after each PM resume.
+ */
+    mtk_iommu_tlb_flush_all(data);


This should not work. since current the *_tlb_flush_all call
pm_runtime_get_if_in_use which will always return 0 when it called from
this runtime_cb in my test. thus, It won't do the tlb_flush_all
actually.


He, indeed, my mistake, although the encoder works more or less fine even
without the full flush so I didn't catch that.



I guess this also depend on these two patches of mt8195 v3.
[PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
[PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all


I'll add those two



like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
the pm operation.


yes, I need to remove the pm_runtime_get_if_in_use call in the 'flush_all' func
I see there is also a patch for that in the mt8195 v3 series "[PATCH v3 13/33] 
iommu/mediatek: Remove the power status checking in tlb flush all"

So I'll send v2, adding all those 3 patches, but I think adding 
mtk_iommu_tlb_do_flush_all
on patch 9 and removing it again on patch 13 is confusing so I'll avoid that.

Thanks,
Dafna





This looks has a dependence. Let me know if I can help this.


It did work for me, testing on elm device. I'll check that again.





+
  /*
   * Uppon first resume, only enable the clk and return, since
the values of the
   * registers are not yet set.



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH 1/2] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-12-07 Thread Dafna Hirschfeld




On 27.11.21 04:46, Yong Wu wrote:

Hi Dafna,

Sorry for reply late.

On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote:

From: Yong Wu 

Prepare for 2 HWs that sharing pgtable in different power-domains.

When there are 2 M4U HWs, it may has problem in the flush_range in
which
we get the pm_status via the m4u dev, BUT that function don't reflect
the
real power-domain status of the HW since there may be other HW also
use
that power-domain.

The function dma_alloc_attrs help allocate the iommu buffer which
need the corresponding power domain since tlb flush is needed when
preparing iova. BUT this function only is for allocating buffer,
we have no good reason to request the user always call pm_runtime_get
before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
in the pm_runtime_resume to make sure the tlb always is clean.

Another solution is always call pm_runtime_get in the
tlb_flush_range.
This will trigger pm runtime resume/backup so often when the iommu
power is not active at some time(means user don't call pm_runtime_get
before calling dma_alloc_xxx), This may cause the performance drop.
thus we don't use this.

In other case, the iommu's power should always be active via device
link with smi.

The previous SoC don't have PM except mt8192. the mt8192 IOMMU is
display's
power-domain which nearly always is enabled. thus no need fix tags
here.
Prepare for mt8195.


In this patchset, this message should be not proper. I think you could
add the comment why this patch is needed in mt8173.



Signed-off-by: Yong Wu 
[imporvie inline doc]
Signed-off-by: Dafna Hirschfeld 
---
  drivers/iommu/mtk_iommu.c | 7 +++
  1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 25b834104790..28dc4b95b6d9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -964,6 +964,13 @@ static int __maybe_unused
mtk_iommu_runtime_resume(struct device *dev)
return ret;
}
  
+	/*

+* Users may allocate dma buffer before they call
pm_runtime_get,
+* in which case it will lack the necessary tlb flush.
+* Thus, make sure to update the tlb after each PM resume.
+*/
+   mtk_iommu_tlb_flush_all(data);


This should not work. since current the *_tlb_flush_all call
pm_runtime_get_if_in_use which will always return 0 when it called from
this runtime_cb in my test. thus, It won't do the tlb_flush_all
actually.

I guess this also depend on these two patches of mt8195 v3.
[PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
[PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all

like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have
the pm operation.

This looks has a dependence. Let me know if I can help this.


It did work for me, testing on elm device. I'll check that again.





+
/*
 * Uppon first resume, only enable the clk and return, since
the values of the
 * registers are not yet set.

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 3/4] iommu/rockchip: rename dte_index to dte

2021-12-04 Thread Dafna Hirschfeld
In rk_iommu_map, the var dte_index is actually
set to the dte and not to the dte index. Rename it.

Signed-off-by: Dafna Hirschfeld 
---
 drivers/iommu/rockchip-iommu.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index bd73cf9c54f5..ba60f0faafcc 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -852,7 +852,7 @@ static int rk_iommu_map(struct iommu_domain *domain, 
unsigned long _iova,
unsigned long flags;
dma_addr_t pte_dma, iova = (dma_addr_t)_iova;
u32 *page_table, *pte_addr;
-   u32 dte_index, pte_index;
+   u32 dte, pte_index;
int ret;
 
spin_lock_irqsave(_domain->dt_lock, flags);
@@ -870,11 +870,11 @@ static int rk_iommu_map(struct iommu_domain *domain, 
unsigned long _iova,
return PTR_ERR(page_table);
}
 
-   dte_index = rk_domain->dt[rk_iova_dte_index(iova)];
+   dte = rk_domain->dt[rk_iova_dte_index(iova)];
pte_index = rk_iova_pte_index(iova);
pte_addr = _table[pte_index];
 
-   pte_dma = rk_ops->pt_address(dte_index) + pte_index * sizeof(u32);
+   pte_dma = rk_ops->pt_address(dte) + pte_index * sizeof(u32);
ret = rk_iommu_map_iova(rk_domain, pte_addr, pte_dma, iova,
paddr, size, prot);
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 4/4] iommu/rockchip: replace pt_address cb with dma_addr_dte when setting dt addr

2021-12-04 Thread Dafna Hirschfeld
The dt address is calculated using the dma_addr_dte cb.
So when setting the dt address to the DTE_ADDR_DUMMY
that cb should be used instead of pt_address.

Signed-off-by: Dafna Hirschfeld 
---
 drivers/iommu/rockchip-iommu.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index ba60f0faafcc..013f7608a2e6 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -510,7 +510,7 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu)
 * and verifying that upper 5 nybbles are read back.
 */
for (i = 0; i < iommu->num_mmu; i++) {
-   dte_addr = rk_ops->pt_address(DTE_ADDR_DUMMY);
+   dte_addr = rk_ops->dma_addr_dte(DTE_ADDR_DUMMY);
rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, dte_addr);
 
if (dte_addr != rk_iommu_read(iommu->bases[i], 
RK_MMU_DTE_ADDR)) {
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/4] iommu/rockchip: replace 4 with sizeof(u32)

2021-12-04 Thread Dafna Hirschfeld
In log_iova, multiply by 4 is used to calculate the
addresses. In other places in this driver, sizeof(u3)
is used. So replace 4 with sizeof(u32) for consistency

Signed-off-by: Dafna Hirschfeld 
---
 drivers/iommu/rockchip-iommu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 5cb260820eda..bd22d0a6eaf0 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -580,14 +580,14 @@ static void log_iova(struct rk_iommu *iommu, int index, 
dma_addr_t iova)
mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR);
mmu_dte_addr_phys = rk_ops->dte_addr_phys(mmu_dte_addr);
 
-   dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);
+   dte_addr_phys = mmu_dte_addr_phys + sizeof(u32) * dte_index;
dte_addr = phys_to_virt(dte_addr_phys);
dte = *dte_addr;
 
if (!rk_dte_is_pt_valid(dte))
goto print_it;
 
-   pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4);
+   pte_addr_phys = rk_ops->pt_address(dte) + pte_index * sizeof(u32);
pte_addr = phys_to_virt(pte_addr_phys);
pte = *pte_addr;
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/4] iommu/rockchip: remove redundant var dte_addr

2021-12-04 Thread Dafna Hirschfeld
Using dte_addr as local var is redundant.
Instead acces rk_domain->dt[dte_index] directly.

Signed-off-by: Dafna Hirschfeld 
---
 drivers/iommu/rockchip-iommu.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index bd22d0a6eaf0..bd73cf9c54f5 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -744,7 +744,7 @@ static void rk_iommu_zap_iova_first_last(struct 
rk_iommu_domain *rk_domain,
 static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain,
  dma_addr_t iova)
 {
-   u32 *page_table, *dte_addr;
+   u32 *page_table;
u32 dte_index, dte;
phys_addr_t pt_phys;
dma_addr_t pt_dma;
@@ -752,8 +752,8 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain 
*rk_domain,
assert_spin_locked(_domain->dt_lock);
 
dte_index = rk_iova_dte_index(iova);
-   dte_addr = _domain->dt[dte_index];
-   dte = *dte_addr;
+   dte = rk_domain->dt[dte_index];
+
if (rk_dte_is_pt_valid(dte))
goto done;
 
@@ -769,7 +769,7 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain 
*rk_domain,
}
 
dte = rk_ops->mk_dtentries(pt_dma);
-   *dte_addr = dte;
+   rk_domain->dt[dte_index] = dte;
 
rk_table_flush(rk_domain,
   rk_domain->dt_dma + dte_index * sizeof(u32), 1);
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-11-30 Thread Dafna Hirschfeld




On 30.11.21 09:39, Yong Wu wrote:

On Sat, 2021-11-27 at 12:11 +0200, Dafna Hirschfeld wrote:


On 10.11.21 09:50, Yong Wu wrote:

On Wed, 2021-11-10 at 07:29 +0200, Dafna Hirschfeld wrote:


On 10.11.21 04:20, Yong Wu wrote:

On Tue, 2021-11-09 at 14:21 +0200, Dafna Hirschfeld wrote:

Hi
This patch is needed in order to update the tlb when a device
is
powered on.
Could you send this patch alone without the whole series so
it
get
accepted easier?


Which SoC are you testing on? In previous SoC, the IOMMU HW
don't
have
power-domain, and we have a "has_pm"[1] in the tlb function for
that
case. The "has_pm" should be always 0 for the previous SoC like
mt8173,
it should always tlb synchronize.

thus, Could you help share more about your issue? In which case
it
lack
the necessary tlb operation. At least, We need confirm if it
needs
a
"Fixes" tags if sending this patch alone.


Hi,
I work with the mtk-vcodec driver on mt8173. As you wrote, the
iommu
doesn't
have a power-domain and so when allocating buffers before the
device
is powered
on, there is the warning
"Partial TLB flush timed out, falling back to full flush"
flooding the log buf.


oh. Thanks very much for your information. Get it now.

This issue should be introduced by the:

b34ea31fe013 ("iommu/mediatek: Always enable the clk on resume")


Hi, reverting this commit didn't solve those warnings,
I think this is because in the function mtk_iommu_attach_device
the first call to pm_runtime_resume_and_get does not turn the clks on
since m4u_dom is not yet initialize. And then mtk_iommu_attach_device
calls pm_runtime_put right after mtk_iommu_hw_init is called
(where the clks are turned on)


oh. Right. this is also related with the patch of "Add power-domain
operation".

The current problem is that there is a redundant log of "Partial TLB
flush timed out" in mt8173. We need fix this issue firstly. Are you
going to prepare the patches again? If not, I could help this. You
could help confirm them if you are free.


Hi,
I already sent a patch last week: 
https://lore.kernel.org/linux-iommu/afb46ad6ca9477a2bf71233858406caa6ccb1588.ca...@mediatek.com/T/
could you please review it?

Thanks,
Dafna





Thanks.



thanks,
Dafna




tlb failed due to the bclk is not enabled. Could you help try that
after reverting this?



Sebastian Reichel suggested to remove the 'if(has_pm)' check to
avoid
this warning,
and avoid flushing the tlb if the device is off:

[1] http://ix.io/3Eyr

This fixes the warning, but then the tlb is not flushed in sync,
Therefore the tlb should be flushed when the device is resumed.

So the two patches (the one suggested in the link [1] and this
patch)
should be sent together as a 2-patch series.


then this is reasonable. You could help this into a new patchset if
you
are free(add Fixes tag).

Thanks.



Thanks,
Dafna




___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-11-27 Thread Dafna Hirschfeld




On 10.11.21 09:50, Yong Wu wrote:

On Wed, 2021-11-10 at 07:29 +0200, Dafna Hirschfeld wrote:


On 10.11.21 04:20, Yong Wu wrote:

On Tue, 2021-11-09 at 14:21 +0200, Dafna Hirschfeld wrote:

Hi
This patch is needed in order to update the tlb when a device is
powered on.
Could you send this patch alone without the whole series so it
get
accepted easier?


Which SoC are you testing on? In previous SoC, the IOMMU HW don't
have
power-domain, and we have a "has_pm"[1] in the tlb function for
that
case. The "has_pm" should be always 0 for the previous SoC like
mt8173,
it should always tlb synchronize.

thus, Could you help share more about your issue? In which case it
lack
the necessary tlb operation. At least, We need confirm if it needs
a
"Fixes" tags if sending this patch alone.


Hi,
I work with the mtk-vcodec driver on mt8173. As you wrote, the iommu
doesn't
have a power-domain and so when allocating buffers before the device
is powered
on, there is the warning
"Partial TLB flush timed out, falling back to full flush"
flooding the log buf.


oh. Thanks very much for your information. Get it now.

This issue should be introduced by the:

b34ea31fe013 ("iommu/mediatek: Always enable the clk on resume")


Hi, reverting this commit didn't solve those warnings,
I think this is because in the function mtk_iommu_attach_device
the first call to pm_runtime_resume_and_get does not turn the clks on
since m4u_dom is not yet initialize. And then mtk_iommu_attach_device
calls pm_runtime_put right after mtk_iommu_hw_init is called
(where the clks are turned on)

thanks,
Dafna




tlb failed due to the bclk is not enabled. Could you help try that
after reverting this?



Sebastian Reichel suggested to remove the 'if(has_pm)' check to avoid
this warning,
and avoid flushing the tlb if the device is off:

[1] http://ix.io/3Eyr

This fixes the warning, but then the tlb is not flushed in sync,
Therefore the tlb should be flushed when the device is resumed.

So the two patches (the one suggested in the link [1] and this patch)
should be sent together as a 2-patch series.


then this is reasonable. You could help this into a new patchset if you
are free(add Fixes tag).

Thanks.



Thanks,
Dafna



Thanks.

[1]


https://elixir.bootlin.com/linux/v5.15/source/drivers/iommu/mtk_iommu.c#L236



I can resend the patch on your behalf if you want.

Thanks,
Dafna

On 23.09.21 14:58, Yong Wu wrote:

Prepare for 2 HWs that sharing pgtable in different power-
domains.

When there are 2 M4U HWs, it may has problem in the flush_range
in
which
we get the pm_status via the m4u dev, BUT that function don't
reflect the
real power-domain status of the HW since there may be other HW
also
use
that power-domain.

The function dma_alloc_attrs help allocate the iommu buffer
which
need the corresponding power domain since tlb flush is needed
when
preparing iova. BUT this function only is for allocating
buffer,
we have no good reason to request the user always call
pm_runtime_get
before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
in the pm_runtime_resume to make sure the tlb always is clean.

Another solution is always call pm_runtime_get in the
tlb_flush_range.
This will trigger pm runtime resume/backup so often when the
iommu
power is not active at some time(means user don't call
pm_runtime_get
before calling dma_alloc_xxx), This may cause the performance
drop.
thus we don't use this.

In other case, the iommu's power should always be active via
device
link with smi.

The previous SoC don't have PM except mt8192. the mt8192 IOMMU
is
display's
power-domain which nearly always is enabled. thus no need fix
tags
here.
Prepare for mt8195.

Signed-off-by: Yong Wu 
---
drivers/iommu/mtk_iommu.c | 11 +++
1 file changed, 11 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c
b/drivers/iommu/mtk_iommu.c
index 44cf5547d084..e9e94944ed91 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -984,6 +984,17 @@ static int __maybe_unused
mtk_iommu_runtime_resume(struct device *dev)
return ret;
}

+	/*

+* Users may allocate dma buffer before they call
pm_runtime_get, then
+* it will lack the necessary tlb flush.
+*
+* We have no good reason to request the users always
call
dma_alloc_xx
+* after pm_runtime_get_sync.
+*
+* Thus, Make sure the tlb always is clean after each
PM
resume.
+*/
+   mtk_iommu_tlb_do_flush_all(data);
+
/*
 * Uppon first resume, only enable the clk and return,
since
the values of the
 * registers are not yet set.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-11-22 Thread Dafna Hirschfeld




On 22.11.21 09:05, Yong Wu wrote:

Hi Dafna,

On Wed, 2021-11-10 at 15:50 +0800, Yong Wu wrote:

On Wed, 2021-11-10 at 07:29 +0200, Dafna Hirschfeld wrote:


On 10.11.21 04:20, Yong Wu wrote:

On Tue, 2021-11-09 at 14:21 +0200, Dafna Hirschfeld wrote:

Hi
This patch is needed in order to update the tlb when a device
is
powered on.
Could you send this patch alone without the whole series so it
get
accepted easier?


Which SoC are you testing on? In previous SoC, the IOMMU HW don't
have
power-domain, and we have a "has_pm"[1] in the tlb function for
that
case. The "has_pm" should be always 0 for the previous SoC like
mt8173,
it should always tlb synchronize.

thus, Could you help share more about your issue? In which case
it
lack
the necessary tlb operation. At least, We need confirm if it
needs
a
"Fixes" tags if sending this patch alone.


Hi,
I work with the mtk-vcodec driver on mt8173. As you wrote, the
iommu
doesn't
have a power-domain and so when allocating buffers before the
device
is powered
on, there is the warning
"Partial TLB flush timed out, falling back to full flush"
flooding the log buf.


oh. Thanks very much for your information. Get it now.

This issue should be introduced by the:

b34ea31fe013 ("iommu/mediatek: Always enable the clk on resume")

tlb failed due to the bclk is not enabled. Could you help try that
after reverting this?



Sebastian Reichel suggested to remove the 'if(has_pm)' check to
avoid
this warning,
and avoid flushing the tlb if the device is off:

[1] http://ix.io/3Eyr

This fixes the warning, but then the tlb is not flushed in sync,
Therefore the tlb should be flushed when the device is resumed.

So the two patches (the one suggested in the link [1] and this
patch)
should be sent together as a 2-patch series.


then this is reasonable. You could help this into a new patchset if
you
are free(add Fixes tag).

Thanks.



Thanks,
Dafna



Thanks.

[1]





https://elixir.bootlin.com/linux/v5.15/source/drivers/iommu/mtk_iommu.c#L236



I can resend the patch on your behalf if you want.


What's your plan here?

If you send the two as a independent patchset on v5.16, I will rebase
yours.
If you have no time for this, I could help this.


Hi, just sent it in the patchset ` iommu/mediatek: fix tlb flush logic`

Thanks,
Dafna



Thanks.



Thanks,
Dafna

On 23.09.21 14:58, Yong Wu wrote:

Prepare for 2 HWs that sharing pgtable in different power-
domains.

When there are 2 M4U HWs, it may has problem in the
flush_range
in
which
we get the pm_status via the m4u dev, BUT that function don't
reflect the
real power-domain status of the HW since there may be other
HW
also
use
that power-domain.

The function dma_alloc_attrs help allocate the iommu buffer
which
need the corresponding power domain since tlb flush is needed
when
preparing iova. BUT this function only is for allocating
buffer,
we have no good reason to request the user always call
pm_runtime_get
before calling dma_alloc_xxx. Therefore, we add a
tlb_flush_all
in the pm_runtime_resume to make sure the tlb always is
clean.

Another solution is always call pm_runtime_get in the
tlb_flush_range.
This will trigger pm runtime resume/backup so often when the
iommu
power is not active at some time(means user don't call
pm_runtime_get
before calling dma_alloc_xxx), This may cause the performance
drop.
thus we don't use this.

In other case, the iommu's power should always be active via
device
link with smi.

The previous SoC don't have PM except mt8192. the mt8192
IOMMU
is
display's
power-domain which nearly always is enabled. thus no need fix
tags
here.
Prepare for mt8195.

Signed-off-by: Yong Wu 
---
drivers/iommu/mtk_iommu.c | 11 +++
1 file changed, 11 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c
b/drivers/iommu/mtk_iommu.c
index 44cf5547d084..e9e94944ed91 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -984,6 +984,17 @@ static int __maybe_unused
mtk_iommu_runtime_resume(struct device *dev)
return ret;
}

+	/*

+* Users may allocate dma buffer before they call
pm_runtime_get, then
+* it will lack the necessary tlb flush.
+*
+* We have no good reason to request the users always
call
dma_alloc_xx
+* after pm_runtime_get_sync.
+*
+* Thus, Make sure the tlb always is clean after each
PM
resume.
+*/
+   mtk_iommu_tlb_do_flush_all(data);
+
/*
 * Uppon first resume, only enable the clk and return,
since
the values of the
 * registers are not yet set.



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 0/2] iommu/mediatek: fix tlb flush logic

2021-11-22 Thread Dafna Hirschfeld
Often devices allocate dma buffers before they do
runtime pm resume. This is the case for example with v4l2
devices where buffers are allocated during 'VIDIOC_REQBUFS`
and runtime resume happens later usually during 'VIDIOC_STREAMON'.
In such cases the partial tlb flush when allocating will fail
since the the iommu is runtime suspended. This will print a warning
and try to do full flush. But there is actually no need to flush
the tlb before the consumer device is turned on.

Fix the warning by skipping parital flush when allocating and instead
do full flash in runtime resume

This patchset is a combination of a patch already sent in a different
patchset: [1] and a warning fix from Sebastian Reichel

[1] 
https://lore.kernel.org/linux-devicetree/20210923115840.17813-13-yong...@mediatek.com/

Sebastian Reichel (1):
  iommu/mediatek: always check runtime PM status in tlb flush range
callback

Yong Wu (1):
  iommu/mediatek: Always tlb_flush_all when each PM resume

 drivers/iommu/mtk_iommu.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 1/2] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-11-22 Thread Dafna Hirschfeld
From: Yong Wu 

Prepare for 2 HWs that sharing pgtable in different power-domains.

When there are 2 M4U HWs, it may has problem in the flush_range in which
we get the pm_status via the m4u dev, BUT that function don't reflect the
real power-domain status of the HW since there may be other HW also use
that power-domain.

The function dma_alloc_attrs help allocate the iommu buffer which
need the corresponding power domain since tlb flush is needed when
preparing iova. BUT this function only is for allocating buffer,
we have no good reason to request the user always call pm_runtime_get
before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
in the pm_runtime_resume to make sure the tlb always is clean.

Another solution is always call pm_runtime_get in the tlb_flush_range.
This will trigger pm runtime resume/backup so often when the iommu
power is not active at some time(means user don't call pm_runtime_get
before calling dma_alloc_xxx), This may cause the performance drop.
thus we don't use this.

In other case, the iommu's power should always be active via device
link with smi.

The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's
power-domain which nearly always is enabled. thus no need fix tags here.
Prepare for mt8195.

Signed-off-by: Yong Wu 
[imporvie inline doc]
Signed-off-by: Dafna Hirschfeld 
---
 drivers/iommu/mtk_iommu.c | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 25b834104790..28dc4b95b6d9 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct 
device *dev)
return ret;
}
 
+   /*
+* Users may allocate dma buffer before they call pm_runtime_get,
+* in which case it will lack the necessary tlb flush.
+* Thus, make sure to update the tlb after each PM resume.
+*/
+   mtk_iommu_tlb_flush_all(data);
+
/*
 * Uppon first resume, only enable the clk and return, since the values 
of the
 * registers are not yet set.
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH 2/2] iommu/mediatek: always check runtime PM status in tlb flush range callback

2021-11-22 Thread Dafna Hirschfeld
From: Sebastian Reichel 

In case of v4l2_reqbufs() it is possible, that a TLB flush is done
without runtime PM being enabled. In that case the "Partial TLB flush
timed out, falling back to full flush" warning is printed.

Commit c0b57581b73b ("iommu/mediatek: Add power-domain operation")
introduced has_pm as optimization to avoid checking runtime PM
when there is no power domain attached. But without the PM domain
there is still the device driver's runtime PM suspend handler, which
disables the clock. Thus flushing should also be avoided when there
is no PM domain involved.

Signed-off-by: Sebastian Reichel 
Reviewed-by: Dafna Hirschfeld 
---
 drivers/iommu/mtk_iommu.c | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 28dc4b95b6d9..b0535fcfd1d7 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -227,16 +227,13 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
iova, size_t size,
   size_t granule,
   struct mtk_iommu_data *data)
 {
-   bool has_pm = !!data->dev->pm_domain;
unsigned long flags;
int ret;
u32 tmp;
 
for_each_m4u(data) {
-   if (has_pm) {
-   if (pm_runtime_get_if_in_use(data->dev) <= 0)
-   continue;
-   }
+   if (pm_runtime_get_if_in_use(data->dev) <= 0)
+   continue;
 
spin_lock_irqsave(>tlb_lock, flags);
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
@@ -261,8 +258,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
iova, size_t size,
writel_relaxed(0, data->base + REG_MMU_CPE_DONE);
spin_unlock_irqrestore(>tlb_lock, flags);
 
-   if (has_pm)
-   pm_runtime_put(data->dev);
+   pm_runtime_put(data->dev);
}
 }
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-11-09 Thread Dafna Hirschfeld




On 10.11.21 04:20, Yong Wu wrote:

On Tue, 2021-11-09 at 14:21 +0200, Dafna Hirschfeld wrote:

Hi
This patch is needed in order to update the tlb when a device is
powered on.
Could you send this patch alone without the whole series so it get
accepted easier?


Which SoC are you testing on? In previous SoC, the IOMMU HW don't have
power-domain, and we have a "has_pm"[1] in the tlb function for that
case. The "has_pm" should be always 0 for the previous SoC like mt8173,
it should always tlb synchronize.

thus, Could you help share more about your issue? In which case it lack
the necessary tlb operation. At least, We need confirm if it needs a
"Fixes" tags if sending this patch alone.


Hi,
I work with the mtk-vcodec driver on mt8173. As you wrote, the iommu doesn't
have a power-domain and so when allocating buffers before the device is powered
on, there is the warning
"Partial TLB flush timed out, falling back to full flush"
flooding the log buf.

Sebastian Reichel suggested to remove the 'if(has_pm)' check to avoid this 
warning,
and avoid flushing the tlb if the device is off:

[1] http://ix.io/3Eyr

This fixes the warning, but then the tlb is not flushed in sync,
Therefore the tlb should be flushed when the device is resumed.

So the two patches (the one suggested in the link [1] and this patch)
should be sent together as a 2-patch series.

Thanks,
Dafna



Thanks.

[1]
https://elixir.bootlin.com/linux/v5.15/source/drivers/iommu/mtk_iommu.c#L236


I can resend the patch on your behalf if you want.

Thanks,
Dafna

On 23.09.21 14:58, Yong Wu wrote:

Prepare for 2 HWs that sharing pgtable in different power-domains.

When there are 2 M4U HWs, it may has problem in the flush_range in
which
we get the pm_status via the m4u dev, BUT that function don't
reflect the
real power-domain status of the HW since there may be other HW also
use
that power-domain.

The function dma_alloc_attrs help allocate the iommu buffer which
need the corresponding power domain since tlb flush is needed when
preparing iova. BUT this function only is for allocating buffer,
we have no good reason to request the user always call
pm_runtime_get
before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
in the pm_runtime_resume to make sure the tlb always is clean.

Another solution is always call pm_runtime_get in the
tlb_flush_range.
This will trigger pm runtime resume/backup so often when the iommu
power is not active at some time(means user don't call
pm_runtime_get
before calling dma_alloc_xxx), This may cause the performance drop.
thus we don't use this.

In other case, the iommu's power should always be active via device
link with smi.

The previous SoC don't have PM except mt8192. the mt8192 IOMMU is
display's
power-domain which nearly always is enabled. thus no need fix tags
here.
Prepare for mt8195.

Signed-off-by: Yong Wu 
---
   drivers/iommu/mtk_iommu.c | 11 +++
   1 file changed, 11 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 44cf5547d084..e9e94944ed91 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -984,6 +984,17 @@ static int __maybe_unused
mtk_iommu_runtime_resume(struct device *dev)
return ret;
}
   
+	/*

+* Users may allocate dma buffer before they call
pm_runtime_get, then
+* it will lack the necessary tlb flush.
+*
+* We have no good reason to request the users always call
dma_alloc_xx
+* after pm_runtime_get_sync.
+*
+* Thus, Make sure the tlb always is clean after each PM
resume.
+*/
+   mtk_iommu_tlb_do_flush_all(data);
+
/*
 * Uppon first resume, only enable the clk and return, since
the values of the
 * registers are not yet set.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume

2021-11-09 Thread Dafna Hirschfeld

Hi
This patch is needed in order to update the tlb when a device is powered on.
Could you send this patch alone without the whole series so it get
accepted easier?
I can resend the patch on your behalf if you want.

Thanks,
Dafna

On 23.09.21 14:58, Yong Wu wrote:

Prepare for 2 HWs that sharing pgtable in different power-domains.

When there are 2 M4U HWs, it may has problem in the flush_range in which
we get the pm_status via the m4u dev, BUT that function don't reflect the
real power-domain status of the HW since there may be other HW also use
that power-domain.

The function dma_alloc_attrs help allocate the iommu buffer which
need the corresponding power domain since tlb flush is needed when
preparing iova. BUT this function only is for allocating buffer,
we have no good reason to request the user always call pm_runtime_get
before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all
in the pm_runtime_resume to make sure the tlb always is clean.

Another solution is always call pm_runtime_get in the tlb_flush_range.
This will trigger pm runtime resume/backup so often when the iommu
power is not active at some time(means user don't call pm_runtime_get
before calling dma_alloc_xxx), This may cause the performance drop.
thus we don't use this.

In other case, the iommu's power should always be active via device
link with smi.

The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's
power-domain which nearly always is enabled. thus no need fix tags here.
Prepare for mt8195.

Signed-off-by: Yong Wu 
---
  drivers/iommu/mtk_iommu.c | 11 +++
  1 file changed, 11 insertions(+)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index 44cf5547d084..e9e94944ed91 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -984,6 +984,17 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct 
device *dev)
return ret;
}
  
+	/*

+* Users may allocate dma buffer before they call pm_runtime_get, then
+* it will lack the necessary tlb flush.
+*
+* We have no good reason to request the users always call dma_alloc_xx
+* after pm_runtime_get_sync.
+*
+* Thus, Make sure the tlb always is clean after each PM resume.
+*/
+   mtk_iommu_tlb_do_flush_all(data);
+
/*
 * Uppon first resume, only enable the clk and return, since the values 
of the
 * registers are not yet set.


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


[PATCH] iommu: log iova range in map/unmap trace events

2021-11-04 Thread Dafna Hirschfeld
In case of an iommu page fault, the faulting iova is logged in
trace_io_page_fault. It is therefore convenient to log the
iova range in mapping/unmapping trace events so that it is
easier to see if the faulting iova was recently in any of
those ranges.

Signed-off-by: Dafna Hirschfeld 
---
 include/trace/events/iommu.h | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/iommu.h b/include/trace/events/iommu.h
index 72b4582322ff..29096fe12623 100644
--- a/include/trace/events/iommu.h
+++ b/include/trace/events/iommu.h
@@ -101,8 +101,9 @@ TRACE_EVENT(map,
__entry->size = size;
),
 
-   TP_printk("IOMMU: iova=0x%016llx paddr=0x%016llx size=%zu",
-   __entry->iova, __entry->paddr, __entry->size
+   TP_printk("IOMMU: iova=0x%016llx - 0x%016llx paddr=0x%016llx size=%zu",
+ __entry->iova, __entry->iova + __entry->size, __entry->paddr,
+ __entry->size
)
 );
 
@@ -124,8 +125,9 @@ TRACE_EVENT(unmap,
__entry->unmapped_size = unmapped_size;
),
 
-   TP_printk("IOMMU: iova=0x%016llx size=%zu unmapped_size=%zu",
-   __entry->iova, __entry->size, __entry->unmapped_size
+   TP_printk("IOMMU: iova=0x%016llx - 0x%016llx size=%zu 
unmapped_size=%zu",
+ __entry->iova, __entry->iova + __entry->size,
+ __entry->size, __entry->unmapped_size
)
 );
 
-- 
2.17.1

___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v3 13/33] iommu/mediatek: Remove the power status checking in tlb flush all

2021-10-22 Thread Dafna Hirschfeld

Hi


On 23.09.21 13:58, Yong Wu wrote:

To simplify the code, Remove the power status checking in the
tlb_flush_all, remove this:
if (pm_runtime_get_if_in_use(data->dev) <= 0)
continue;

After this patch, the mtk_iommu_tlb_flush_all will be called from
a) isr
b) pm runtime resume callback
c) tlb flush range fail case
d) iommu_create_device_direct_mappings
-> iommu_flush_iotlb_all
In first three cases, the power and clock always are enabled; d) is direct


Regarding case "c) tlb flush range fail case", I found out that this often 
happens
when the iommu is used while it is runtime suspended. For example the mtk-vcodec
encoder driver calls "pm_runtime_resume_and_get" only when it starts streaming 
but
buffers allocation is done in 'v4l2_reqbufs' before "pm_runtime_resume_and_get" 
is called
and then I see the warning "Partial TLB flush timed out, falling back to full 
flush"
I am not sure how to fix that issue, but it seems that case 'c)' might indicate 
that
power and clock are actually not enabled.


mapping, the tlb flush is unnecessay since we already have tlb_flush_all
in the pm_runtime_resume callback. When the iommu's power status is
changed to active, the tlb always is clean.

In addition, there still are 2 reasons that don't add PM status checking
in the tlb flush all:
a) Write tlb flush all register also is ok even though the HW has no
power and clocks. Write ignore.
b) pm_runtime_get_if_in_use(m4udev) is 0 when the tlb_flush_all
is called frm pm_runtime_resume cb. From this point, we can not add
this code above in this tlb_flush_all.

Signed-off-by: Yong Wu 
---
  drivers/iommu/mtk_iommu.c | 20 +++-
  1 file changed, 7 insertions(+), 13 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index e9e94944ed91..4a33b6c6b1db 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -204,10 +204,14 @@ static struct mtk_iommu_domain *to_mtk_domain(struct 
iommu_domain *dom)
return container_of(dom, struct mtk_iommu_domain, domain);
  }
  
-static void mtk_iommu_tlb_do_flush_all(struct mtk_iommu_data *data)

+static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)
  {
unsigned long flags;
  
+	/*

+* No need get power status since the HW PM status nearly is active
+* when entering here.
+*/
spin_lock_irqsave(>tlb_lock, flags);
writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0,
   data->base + data->plat_data->inv_sel_reg);
@@ -216,16 +220,6 @@ static void mtk_iommu_tlb_do_flush_all(struct 
mtk_iommu_data *data)
spin_unlock_irqrestore(>tlb_lock, flags);
  }
  
-static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data)

-{
-   if (pm_runtime_get_if_in_use(data->dev) <= 0)
-   return;
-
-   mtk_iommu_tlb_do_flush_all(data);
-
-   pm_runtime_put(data->dev);
-}
-
  static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size,
   struct mtk_iommu_data *data)
  {
@@ -263,7 +257,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
iova, size_t size,
if (ret) {
dev_warn(data->dev,
 "Partial TLB flush timed out, falling back to full 
flush\n");
-   mtk_iommu_tlb_do_flush_all(data);
+   mtk_iommu_tlb_flush_all(data);
}
  
  		if (has_pm)

@@ -993,7 +987,7 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct 
device *dev)
 *
 * Thus, Make sure the tlb always is clean after each PM resume.
 */
-   mtk_iommu_tlb_do_flush_all(data);
+   mtk_iommu_tlb_flush_all(data);
  
  	/*

 * Uppon first resume, only enable the clk and return, since the values 
of the


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 04/12] iommu/mediatek: Add device_link between the consumer and the larb devices

2021-10-18 Thread Dafna Hirschfeld




On 16.10.21 04:23, Yong Wu wrote:

On Mon, 2021-10-11 at 14:36 +0200, Dafna Hirschfeld wrote:


On 29.09.21 03:37, Yong Wu wrote:

MediaTek IOMMU-SMI diagram is like below. all the consumer connect
with
smi-larb, then connect with smi-common.

  M4U
   |
  smi-common
   |
-
| |...
| |
larb1 larb2
| |
vdec   venc

When the consumer works, it should enable the smi-larb's power
which
also need enable the smi-common's power firstly.

Thus, First of all, use the device link connect the consumer and
the
smi-larbs. then add device link between the smi-larb and smi-
common.

This patch adds device_link between the consumer and the larbs.

When device_link_add, I add the flag DL_FLAG_STATELESS to avoid
calling
pm_runtime_xx to keep the original status of clocks. It can avoid
two
issues:
1) Display HW show fastlogo abnormally reported in [1]. At the
beggining,
all the clocks are enabled before entering kernel, but the clocks
for
display HW(always in larb0) will be gated after clk_enable and
clk_disable
called from device_link_add(->pm_runtime_resume) and rpm_idle. The
clock
operation happened before display driver probe. At that time, the
display
HW will be abnormal.

2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
pm_runtime_xx to avoid the deadlock.

Corresponding, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, then
device_link_removed should be added explicitly.

[1]
https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
[2] https://lore.kernel.org/patchwork/patch/1086569/

Suggested-by: Tomasz Figa 
Signed-off-by: Yong Wu 
Tested-by: Frank Wunderlich  # BPI-
R2/MT7623
---
   drivers/iommu/mtk_iommu.c| 22 ++
   drivers/iommu/mtk_iommu_v1.c | 20 +++-
   2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index d5848f78a677..a2fa55899434 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -560,22 +560,44 @@ static struct iommu_device
*mtk_iommu_probe_device(struct device *dev)
   {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
+   struct device_link *link;
+   struct device *larbdev;
+   unsigned int larbid;
   
   	if (!fwspec || fwspec->ops != _iommu_ops)

return ERR_PTR(-ENODEV); /* Not a iommu client device
*/
   
   	data = dev_iommu_priv_get(dev);
   
+	/*

+* Link the consumer device with the smi-larb device(supplier)
+* The device in each a larb is a independent HW. thus only
link
+* one larb here.
+*/
+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);


so larbid is always the same for all the ids of a device?


Yes. For me, each a dtsi node should represent a HW unit which can only
connect one larb.


If so maybe it worth testing it and return error if this is not the
case.


Thanks for the suggestion. This is very helpful. I did see someone put
the different larbs in one node. I will check this, and add return


I am working on bugs found on media drivers, could you please point me to
that wrong node?
Will you send a fix to that node in the dtsi?


Thanks,
Dafna


EINVAL for this case.








Thanks,
Dafna
  



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 04/12] iommu/mediatek: Add device_link between the consumer and the larb devices

2021-10-11 Thread Dafna Hirschfeld




On 29.09.21 03:37, Yong Wu wrote:

MediaTek IOMMU-SMI diagram is like below. all the consumer connect with
smi-larb, then connect with smi-common.

 M4U
  |
 smi-common
  |
   -
   | |...
   | |
larb1 larb2
   | |
vdec   venc

When the consumer works, it should enable the smi-larb's power which
also need enable the smi-common's power firstly.

Thus, First of all, use the device link connect the consumer and the
smi-larbs. then add device link between the smi-larb and smi-common.

This patch adds device_link between the consumer and the larbs.

When device_link_add, I add the flag DL_FLAG_STATELESS to avoid calling
pm_runtime_xx to keep the original status of clocks. It can avoid two
issues:
1) Display HW show fastlogo abnormally reported in [1]. At the beggining,
all the clocks are enabled before entering kernel, but the clocks for
display HW(always in larb0) will be gated after clk_enable and clk_disable
called from device_link_add(->pm_runtime_resume) and rpm_idle. The clock
operation happened before display driver probe. At that time, the display
HW will be abnormal.

2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
pm_runtime_xx to avoid the deadlock.

Corresponding, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, then
device_link_removed should be added explicitly.

[1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
[2] https://lore.kernel.org/patchwork/patch/1086569/

Suggested-by: Tomasz Figa 
Signed-off-by: Yong Wu 
Tested-by: Frank Wunderlich  # BPI-R2/MT7623
---
  drivers/iommu/mtk_iommu.c| 22 ++
  drivers/iommu/mtk_iommu_v1.c | 20 +++-
  2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index d5848f78a677..a2fa55899434 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -560,22 +560,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
+   struct device_link *link;
+   struct device *larbdev;
+   unsigned int larbid;
  
  	if (!fwspec || fwspec->ops != _iommu_ops)

return ERR_PTR(-ENODEV); /* Not a iommu client device */
  
  	data = dev_iommu_priv_get(dev);
  
+	/*

+* Link the consumer device with the smi-larb device(supplier)
+* The device in each a larb is a independent HW. thus only link
+* one larb here.
+*/
+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);


so larbid is always the same for all the ids of a device? If so
maybe it worth testing it and return error if this is not the case.

Thanks,
Dafna


+   larbdev = data->larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
return >iommu;
  }
  
  static void mtk_iommu_release_device(struct device *dev)

  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct mtk_iommu_data *data;
+   struct device *larbdev;
+   unsigned int larbid;
  
  	if (!fwspec || fwspec->ops != _iommu_ops)

return;
  
+	data = dev_iommu_priv_get(dev);

+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   device_link_remove(dev, larbdev);
+
iommu_fwspec_free(dev);
  }
  
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c

index 4d7809432239..e6f13459470e 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -423,7 +423,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct of_phandle_args iommu_spec;
struct mtk_iommu_data *data;
-   int err, idx = 0;
+   int err, idx = 0, larbid;
+   struct device_link *link;
+   struct device *larbdev;
  
  	/*

 * In the deferred case, free the existed fwspec.
@@ -453,6 +455,14 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
  
  	data = dev_iommu_priv_get(dev);
  
+	/* Link the consumer device with the smi-larb device(supplier) */

+   larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
+
return >iommu;
  }
  
@@ -473,10 +483,18 @@ static void mtk_iommu_probe_finalize(struct device *dev)

  static void mtk_iommu_release_device(struct device *dev)
  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct 

Re: [PATCH v2 11/29] iommu/mediatek: Always pm_runtime_get while tlb flush

2021-09-30 Thread Dafna Hirschfeld




On 13.08.21 08:53, Yong Wu wrote:

Prepare for 2 HWs that sharing pgtable in different power-domains.

The previous SoC don't have PM. Only mt8192 has power-domain,
and it is display's power-domain which nearly always is enabled.


hi, I see that in mt1873.dtsi, many devices that uses the iommu have the
'power-domains' property.



When there are 2 M4U HWs, it may has problem.
In this function, we get the pm_status via the m4u dev, but it don't
reflect the real power-domain status of the HW since there may be other
HW also use that power-domain.

Currently we could not get the real power-domain status, thus always
pm_runtime_get here.

Prepare for mt8195, thus, no need fix tags here.

This patch may drop the performance, we expect the user could
pm_runtime_get_sync before dma_alloc_attrs which need tlb ops.



Could you explain this sentence a bit? should the user call pm_runtime_get_sync
before calling dma_alloc_attrs?

Thanks,
Dafna


Signed-off-by: Yong Wu 
---
  drivers/iommu/mtk_iommu.c | 5 -
  1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index add23a36a5e2..abc721a1da21 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -238,8 +238,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long 
iova, size_t size,
  
  	for_each_m4u(data, head) {

if (has_pm) {
-   if (pm_runtime_get_if_in_use(data->dev) <= 0)
+   ret = pm_runtime_resume_and_get(data->dev);
+   if (ret < 0) {
+   dev_err(data->dev, "tlb flush: pm get fail 
%d.\n", ret);
continue;
+   }
}
  
  		spin_lock_irqsave(>tlb_lock, flags);



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 09/12] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put

2021-09-30 Thread Dafna Hirschfeld




On 30.09.21 05:28, Yong Wu wrote:

Hi Dafna,

Thanks very much for the review.

On Wed, 2021-09-29 at 14:13 +0200, Dafna Hirschfeld wrote:


On 29.09.21 03:37, Yong Wu wrote:

MediaTek IOMMU has already added the device_link between the
consumer
and smi-larb device. If the vcodec device call the
pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync also be called automatically.

CC: Tiffany Lin 
CC: Irui Wang 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Tiffany Lin 
Reviewed-by: Dafna Hirschfeld 
---
   .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   | 37 +++---
--
   .../platform/mtk-vcodec/mtk_vcodec_drv.h  |  3 --
   .../platform/mtk-vcodec/mtk_vcodec_enc.c  |  1 -
   .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c   | 44 +++---
-
   4 files changed, 10 insertions(+), 75 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
index 6038db96f71c..d0bf9aa3b29d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
@@ -8,14 +8,12 @@
   #include 
   #include 
   #include 
-#include 
   
   #include "mtk_vcodec_dec_pm.h"

   #include "mtk_vcodec_util.h"
   
   int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)

   {
-   struct device_node *node;
struct platform_device *pdev;
struct mtk_vcodec_pm *pm;
struct mtk_vcodec_clk *dec_clk;
@@ -26,18 +24,7 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev
*mtkdev)
pm = >pm;
pm->mtkdev = mtkdev;
dec_clk = >vdec_clk;
-   node = of_parse_phandle(pdev->dev.of_node, "mediatek,larb", 0);
-   if (!node) {
-   mtk_v4l2_err("of_parse_phandle mediatek,larb fail!");
-   return -1;
-   }
   
-	pdev = of_find_device_by_node(node);

-   of_node_put(node);
-   if (WARN_ON(!pdev)) {
-   return -1;
-   }
-   pm->larbvdec = >dev;
pdev = mtkdev->plat_dev;
pm->dev = >dev;
   
@@ -47,14 +34,11 @@ int mtk_vcodec_init_dec_pm(struct

mtk_vcodec_dev *mtkdev)
dec_clk->clk_info = devm_kcalloc(>dev,
dec_clk->clk_num, sizeof(*clk_info),
GFP_KERNEL);
-   if (!dec_clk->clk_info) {
-   ret = -ENOMEM;
-   goto put_device;
-   }
+   if (!dec_clk->clk_info)
+   return -ENOMEM;
} else {
mtk_v4l2_err("Failed to get vdec clock count");
-   ret = -EINVAL;
-   goto put_device;
+   return -EINVAL;
}
   
   	for (i = 0; i < dec_clk->clk_num; i++) {

@@ -63,29 +47,24 @@ int mtk_vcodec_init_dec_pm(struct
mtk_vcodec_dev *mtkdev)
"clock-names", i, _info->clk_name);
if (ret) {
mtk_v4l2_err("Failed to get clock name id =
%d", i);
-   goto put_device;
+   return ret;
}
clk_info->vcodec_clk = devm_clk_get(>dev,
clk_info->clk_name);
if (IS_ERR(clk_info->vcodec_clk)) {
mtk_v4l2_err("devm_clk_get (%d)%s fail", i,
clk_info->clk_name);
-   ret = PTR_ERR(clk_info->vcodec_clk);
-   goto put_device;
+   return PTR_ERR(clk_info->vcodec_clk);
}
}
   
   	pm_runtime_enable(>dev);

return 0;
-put_device:
-   put_device(pm->larbvdec);
-   return ret;
   }
   
   void mtk_vcodec_release_dec_pm(struct mtk_vcodec_dev *dev)

   {
pm_runtime_disable(dev->pm.dev);
-   put_device(dev->pm.larbvdec);
   }


Now that functions only do  'pm_runtime_disable(dev->pm.dev);' so it
will be more
readable to remove the function mtk_vcodec_release_dec_pm
and replace with pm_runtime_disable(dev->pm.dev);
Same for the 'enc' equivalent.


Make sense. But It may be not proper if using pm_runtime_disable
as the symmetry with mtk_vcodec_init_dec_pm in the mtk_vcodec_probe.

Maybe we should move pm_runtime_enable out from mtk_vcodec_init_dec_pm
into mtk_vcodec_probe. I could do a new patch for this. Is this ok for
you?


yes, there is also asymettry when calling pm_runtime* in general,
I see in the decoder it is called from mtk_vcodec_dec_pm.c
but in the encoder it is called from mtk_vcodec_enc.c,

I think all calls to pm_runtime* should be out of the *_pm.c files
since for example 'mtk_vcodec_dec_pw_on' also do just one call to
pm_runtime_resume_and_get so this function can also be removed.

thanks,
Dafna





Thanks,
Dafna


[snip]
___
Linux-m

Re: [PATCH v8 03/12] iommu/mediatek: Add probe_defer for smi-larb

2021-09-29 Thread Dafna Hirschfeld




On 29.09.21 03:37, Yong Wu wrote:

Prepare for adding device_link.

The iommu consumer should use device_link to connect with the
smi-larb(supplier). then the smi-larb should run before the iommu
consumer. Here we delay the iommu driver until the smi driver is ready,
then all the iommu consumers always are after the smi driver.

When there is no this patch, if some consumer drivers run before
smi-larb, the supplier link_status is DL_DEV_NO_DRIVER(0) in the
device_link_add, then device_links_driver_bound will use WARN_ON
to complain that the link_status of supplier is not right.

device_is_bound may be more elegant here. but it is not allowed to
EXPORT from https://lore.kernel.org/patchwork/patch/1334670/.

Signed-off-by: Yong Wu 
Tested-by: Frank Wunderlich  # BPI-R2/MT7623
---
  drivers/iommu/mtk_iommu.c| 2 +-
  drivers/iommu/mtk_iommu_v1.c | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index d837adfd1da5..d5848f78a677 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -844,7 +844,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
id = i;
  
  		plarbdev = of_find_device_by_node(larbnode);

-   if (!plarbdev) {
+   if (!plarbdev || !plarbdev->dev.driver) {
of_node_put(larbnode);
return -EPROBE_DEFER;


if plarbdev is null doesn't that mean that the device does not exist?
so we should return -ENODEV in that case?

thanks,
Dafna


}
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c
index 1467ba1e4417..4d7809432239 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -602,7 +602,7 @@ static int mtk_iommu_probe(struct platform_device *pdev)
}
  
  		plarbdev = of_find_device_by_node(larbnode);

-   if (!plarbdev) {
+   if (!plarbdev || !plarbdev->dev.driver) {
of_node_put(larbnode);
return -EPROBE_DEFER;
}


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v8 09/12] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put

2021-09-29 Thread Dafna Hirschfeld




On 29.09.21 03:37, Yong Wu wrote:

MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the vcodec device call the pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync also be called automatically.

CC: Tiffany Lin 
CC: Irui Wang 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Tiffany Lin 
Reviewed-by: Dafna Hirschfeld 
---
  .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   | 37 +++-
  .../platform/mtk-vcodec/mtk_vcodec_drv.h  |  3 --
  .../platform/mtk-vcodec/mtk_vcodec_enc.c  |  1 -
  .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c   | 44 +++
  4 files changed, 10 insertions(+), 75 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
index 6038db96f71c..d0bf9aa3b29d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
@@ -8,14 +8,12 @@
  #include 
  #include 
  #include 
-#include 
  
  #include "mtk_vcodec_dec_pm.h"

  #include "mtk_vcodec_util.h"
  
  int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)

  {
-   struct device_node *node;
struct platform_device *pdev;
struct mtk_vcodec_pm *pm;
struct mtk_vcodec_clk *dec_clk;
@@ -26,18 +24,7 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
pm = >pm;
pm->mtkdev = mtkdev;
dec_clk = >vdec_clk;
-   node = of_parse_phandle(pdev->dev.of_node, "mediatek,larb", 0);
-   if (!node) {
-   mtk_v4l2_err("of_parse_phandle mediatek,larb fail!");
-   return -1;
-   }
  
-	pdev = of_find_device_by_node(node);

-   of_node_put(node);
-   if (WARN_ON(!pdev)) {
-   return -1;
-   }
-   pm->larbvdec = >dev;
pdev = mtkdev->plat_dev;
pm->dev = >dev;
  
@@ -47,14 +34,11 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)

dec_clk->clk_info = devm_kcalloc(>dev,
dec_clk->clk_num, sizeof(*clk_info),
GFP_KERNEL);
-   if (!dec_clk->clk_info) {
-   ret = -ENOMEM;
-   goto put_device;
-   }
+   if (!dec_clk->clk_info)
+   return -ENOMEM;
} else {
mtk_v4l2_err("Failed to get vdec clock count");
-   ret = -EINVAL;
-   goto put_device;
+   return -EINVAL;
}
  
  	for (i = 0; i < dec_clk->clk_num; i++) {

@@ -63,29 +47,24 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
"clock-names", i, _info->clk_name);
if (ret) {
mtk_v4l2_err("Failed to get clock name id = %d", i);
-   goto put_device;
+   return ret;
}
clk_info->vcodec_clk = devm_clk_get(>dev,
clk_info->clk_name);
if (IS_ERR(clk_info->vcodec_clk)) {
mtk_v4l2_err("devm_clk_get (%d)%s fail", i,
clk_info->clk_name);
-   ret = PTR_ERR(clk_info->vcodec_clk);
-   goto put_device;
+   return PTR_ERR(clk_info->vcodec_clk);
}
}
  
  	pm_runtime_enable(>dev);

return 0;
-put_device:
-   put_device(pm->larbvdec);
-   return ret;
  }
  
  void mtk_vcodec_release_dec_pm(struct mtk_vcodec_dev *dev)

  {
pm_runtime_disable(dev->pm.dev);
-   put_device(dev->pm.larbvdec);
  }


Now that functions only do  'pm_runtime_disable(dev->pm.dev);' so it will be 
more
readable to remove the function mtk_vcodec_release_dec_pm
and replace with pm_runtime_disable(dev->pm.dev);
Same for the 'enc' equivalent.

Thanks,
Dafna

  
  int mtk_vcodec_dec_pw_on(struct mtk_vcodec_pm *pm)

@@ -122,11 +101,6 @@ void mtk_vcodec_dec_clock_on(struct mtk_vcodec_pm *pm)
}
}
  
-	ret = mtk_smi_larb_get(pm->larbvdec);

-   if (ret) {
-   mtk_v4l2_err("mtk_smi_larb_get larbvdec fail %d", ret);
-   goto error;
-   }>   return;
  
  error:

@@ -139,7 +113,6 @@ void mtk_vcodec_dec_clock_off(struct mtk_vcodec_pm *pm)
struct mtk_vcodec_clk *dec_clk = >vdec_clk;
int i = 0;
  
-	mtk_smi_larb_put(pm->larbvdec);

for (i = dec_clk->clk_num - 1; i >= 0; i--)
clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk);
  }
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index c6c7672fecfb..64b73dd880ce 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_v

Re: [PATCH v7 04/12] iommu/mediatek: Add device_link between the consumer and the larb devices

2021-08-05 Thread Dafna Hirschfeld




On 30.07.21 04:52, Yong Wu wrote:

MediaTek IOMMU-SMI diagram is like below. all the consumer connect with
smi-larb, then connect with smi-common.

 M4U
  |
 smi-common
  |
   -
   | |...
   | |
larb1 larb2
   | |
vdec   venc

When the consumer works, it should enable the smi-larb's power which
also need enable the smi-common's power firstly.

Thus, First of all, use the device link connect the consumer and the
smi-larbs. then add device link between the smi-larb and smi-common.

This patch adds device_link between the consumer and the larbs.

When device_link_add, I add the flag DL_FLAG_STATELESS to avoid calling
pm_runtime_xx to keep the original status of clocks. It can avoid two
issues:
1) Display HW show fastlogo abnormally reported in [1]. At the beggining,
all the clocks are enabled before entering kernel, but the clocks for
display HW(always in larb0) will be gated after clk_enable and clk_disable
called from device_link_add(->pm_runtime_resume) and rpm_idle. The clock
operation happened before display driver probe. At that time, the display
HW will be abnormal.

2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
pm_runtime_xx to avoid the deadlock.

Corresponding, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, then
device_link_removed should be added explicitly.

[1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
[2] https://lore.kernel.org/patchwork/patch/1086569/

Suggested-by: Tomasz Figa 
Signed-off-by: Yong Wu 
Tested-by: Dafna Hirschfeld  # on mt8173


Hi, unfortunately, I have to take back the Tested-by tag.
I am now testing the mtk-vcodec with latest kernel + patches sent from the 
mailing list:
https://gitlab.collabora.com/eballetbo/linux/-/commits/topic/chromeos/chromeos-5.14
which includes this patchset.

On chromeos I open a video conference with googl-meet which cause the 
mtk-vcodec vp8 encoder to run.
If I kill it with `killall -9 chrome` I get some page fault messages from the 
iommu:

[  837.255952] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.265696] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.282367] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.299028] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.315683] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.332345] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.349004] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.365665] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.382329] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read
[  837.42] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 
larb=0 port=0 layer=1 read

In addition, running the encoder tests from the shell:

sudo --user=#1000 
/usr/local/libexec/chrome-binary-tests/video_encode_accelerator_tests 
--gtest_filter=VideoEncoderTest.FlushAtEndOfStream_Multiple*  --codec=vp8 
/usr/local/share/tast/data/chromiumos/tast/local/bundles/cros/video/data/tulip2-320x180.yuv
 --disable_validator

At some point it fails with the error

[ 5472.161821] [MTK_V4L2][ERROR] mtk_vcodec_wait_for_done_ctx:32: [290] 
ctx->type=1, cmd=1, wait_event_interruptible_timeout time=1000ms out 0 0!
[ 5472.174678] [MTK_VCODEC][ERROR][290]: vp8_enc_encode_frame() irq_status=0 
failed
[ 5472.182687] [MTK_V4L2][ERROR] mtk_venc_worker:1239: venc_if_encode failed=-5


If you have any idea of what might be the problem or how to debug?

Thanks,
Dafna


---
  drivers/iommu/mtk_iommu.c| 22 ++
  drivers/iommu/mtk_iommu_v1.c | 20 +++-
  2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a02dde094788..ee742900cf4b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -571,22 +571,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
+   struct device_link *link;
+   struct device *larbdev;
+   unsigned int larbid;
  
  	if (!fwspec || fwspec->ops != _iommu_ops)

return ERR_PTR(-ENODEV); /* Not a iommu client device */
  
  	data = dev_iommu_priv_get(dev);
  
+	/*

+* Link the consumer device with the smi-larb device(supplier)
+* The device in each a larb is a independent HW. thus only link
+* one larb here.
+*/
+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->larb_

Re: [PATCH v7 3/4] iommu: rockchip: Add internal ops to handle variants

2021-07-30 Thread Dafna Hirschfeld



On 29.07.21 18:58, Robin Murphy wrote:

On 2021-07-29 17:08, Heiko Stübner wrote:

Hi Dafna,

Am Donnerstag, 29. Juli 2021, 17:59:26 CEST schrieb Dafna Hirschfeld:

On 25.05.21 14:15, Benjamin Gaignard wrote:

@@ -879,7 +895,7 @@ static int rk_iommu_enable(struct rk_iommu *iommu)
   for (i = 0; i < iommu->num_mmu; i++) {
   rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR,
-   rk_domain->dt_dma);
+   rk_ops->dma_addr_dte(rk_domain->dt_dma));


Hi,
This is not related to that patch, I was wondring why are all mmu devices 
initialized
with the same dt_dma?
I see for example that the isp0_mmu in rk3399.dtsi has two resources. Can't 
each resource
be initialized with different dt_dma and this way there are two dt tables 
instead of the two mmus pointing
to the same dt table.


maybe
git log -1 cd6438c5f8446691afa4829fe1a9d7b656204f11

"iommu/rockchip: Reconstruct to support multi slaves
There are some IPs, such as video encoder/decoder, contains 2 slave iommus,
one for reading and the other for writing. They share the same irq and
clock with master.
This patch reconstructs to support this case by making them share the same
Page Directory, Page Tables and even the register operations.
That means every instruction to the reading MMU registers would be
duplicated to the writing MMU and vice versa."


Right. In theory we *could* maintain a separate pagetable for each IOMMU 
instance, but it would just lead to a load of complexity and overhead. For a 
map request, we'd have to do extra work to decide which table(s) need 
modifying, and duplicate all the work of the actual mapping if it's more than 
one. For an unmap request, we'd have no choice but to walk *all* the tables 
backing that domain to figure out which (if any) actually had it mapped in the 
first place.

Given that we already have distinct read and write permissions for mappings 
within a single table, there's not even any functional benefit that could be 
gained in this case (and in the more general case where the device might emit 
all kinds of transactions from all its interfaces you'd have to maintain 
identical mappings for all its IOMMUs anyway). Saving memory and code 
complexity by physically sharing one pagetable and not worrying about trying to 
do selective TLB maintenance is a bigger win than anything else could be.

Robin.


Hi, I just try to understand how this iommu hardware/software works. I have two 
questions,

1. So we currently miss a potential mapping of the hardware right? I mean , 
each mmu can map 1024*1024*4K = 4G addresses, so two mmus can potentially map 
8G. But since
we set them to identical values, we can map only up to 4G.
2. What is the benefit of setting all mmus if they are all set to the same 
values? Can't we just work with the first mmu like it was done before that patch
cd6438c5f8446691afa4829fe1a9d7b656204f11

Thanks,
Dafna



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v7 3/4] iommu: rockchip: Add internal ops to handle variants

2021-07-29 Thread Dafna Hirschfeld




On 25.05.21 14:15, Benjamin Gaignard wrote:

Add internal ops to be able to handle incoming variant v2.
The goal is to keep the overall structure of the framework but
to allow to add the evolution of this hardware block.

The ops are global for a SoC because iommu domains are not
attached to a specific devices if they are for a virtuel device like
drm. Use a global variable shouldn't be since SoC usually doesn't
embedded different versions of the iommu hardware block.
If that happen one day a WARN_ON will be displayed at probe time.

Signed-off-by: Benjamin Gaignard 
Reviewed-by: Heiko Stuebner 
---
version 7:
  - Set DMA mask
  - Add function to convert dma address to dte

version 6:
  - Remove #include 
  - Remove pt_address_mask field
  - Only use once of_device_get_match_data
  - Return an error if ops don't match

version 5:
  - Use of_device_get_match_data()
  - Add internal ops inside the driver

  drivers/iommu/rockchip-iommu.c | 86 +-
  1 file changed, 64 insertions(+), 22 deletions(-)

diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c
index 7a2932772fdf..bd2cf7f08c71 100644
--- a/drivers/iommu/rockchip-iommu.c
+++ b/drivers/iommu/rockchip-iommu.c
@@ -96,6 +96,15 @@ static const char * const rk_iommu_clocks[] = {
"aclk", "iface",
  };
  
+struct rk_iommu_ops {

+   phys_addr_t (*pt_address)(u32 dte);
+   u32 (*mk_dtentries)(dma_addr_t pt_dma);
+   u32 (*mk_ptentries)(phys_addr_t page, int prot);
+   phys_addr_t (*dte_addr_phys)(u32 addr);
+   u32 (*dma_addr_dte)(dma_addr_t dt_dma);
+   u64 dma_bit_mask;
+};
+
  struct rk_iommu {
struct device *dev;
void __iomem **bases;
@@ -116,6 +125,7 @@ struct rk_iommudata {
  };
  
  static struct device *dma_dev;

+static const struct rk_iommu_ops *rk_ops;
  
  static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma,

  unsigned int count)
@@ -215,11 +225,6 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma)
  #define RK_PTE_PAGE_READABLE  BIT(1)
  #define RK_PTE_PAGE_VALID BIT(0)
  
-static inline phys_addr_t rk_pte_page_address(u32 pte)

-{
-   return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK;
-}
-
  static inline bool rk_pte_is_page_valid(u32 pte)
  {
return pte & RK_PTE_PAGE_VALID;
@@ -448,10 +453,10 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu)
 * and verifying that upper 5 nybbles are read back.
 */
for (i = 0; i < iommu->num_mmu; i++) {
-   rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, 
DTE_ADDR_DUMMY);
+   dte_addr = rk_ops->pt_address(DTE_ADDR_DUMMY);
+   rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, dte_addr);
  
-		dte_addr = rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR);

-   if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) {
+   if (dte_addr != rk_iommu_read(iommu->bases[i], 
RK_MMU_DTE_ADDR)) {
dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is 
not functioning\n");
return -EFAULT;
}
@@ -470,6 +475,16 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu)
return 0;
  }
  
+static inline phys_addr_t rk_dte_addr_phys(u32 addr)

+{
+   return (phys_addr_t)addr;
+}
+
+static inline u32 rk_dma_addr_dte(dma_addr_t dt_dma)
+{
+   return dt_dma;
+}
+
  static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova)
  {
void __iomem *base = iommu->bases[index];
@@ -489,7 +504,7 @@ static void log_iova(struct rk_iommu *iommu, int index, 
dma_addr_t iova)
page_offset = rk_iova_page_offset(iova);
  
  	mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR);

-   mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr;
+   mmu_dte_addr_phys = rk_ops->dte_addr_phys(mmu_dte_addr);
  
  	dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index);

dte_addr = phys_to_virt(dte_addr_phys);
@@ -498,14 +513,14 @@ static void log_iova(struct rk_iommu *iommu, int index, 
dma_addr_t iova)
if (!rk_dte_is_pt_valid(dte))
goto print_it;
  
-	pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4);

+   pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4);
pte_addr = phys_to_virt(pte_addr_phys);
pte = *pte_addr;
  
  	if (!rk_pte_is_page_valid(pte))

goto print_it;
  
-	page_addr_phys = rk_pte_page_address(pte) + page_offset;

+   page_addr_phys = rk_ops->pt_address(pte) + page_offset;
page_flags = pte & RK_PTE_PAGE_FLAGS_MASK;
  
  print_it:

@@ -601,13 +616,13 @@ static phys_addr_t rk_iommu_iova_to_phys(struct 
iommu_domain *domain,
if (!rk_dte_is_pt_valid(dte))
goto out;
  
-	pt_phys = rk_dte_pt_address(dte);

+   pt_phys = rk_ops->pt_address(dte);
page_table = (u32 *)phys_to_virt(pt_phys);
pte = page_table[rk_iova_pte_index(iova)];
   

Re: [PATCH v6 00/11] Clean up "mediatek,larb"

2021-07-14 Thread Dafna Hirschfeld

Hi

Thanks for the patchset.

I tested it on mt8173 (elm) with chromeos userspace.
Before that patchset, the test:

tast -verbose run -build=false 10.42.0.175 video.DecodeAccel.h264

sometimes passed and sometimes failed with 'context deadline exceeded'.
With this patchset it seems that the test always passes so I added tested-by:

Tested-by: Dafna Hirschfeld 

Thanks,
Dafna




On 14.07.21 04:56, Yong Wu wrote:

MediaTek IOMMU block diagram always like below:

 M4U
  |
 smi-common
  |
   -
   | |  ...
   | |
larb1 larb2
   | |
vdec   venc

All the consumer connect with smi-larb, then connect with smi-common.

When the consumer works, it should enable the smi-larb's power which also
need enable the smi-common's power firstly.

Thus, Firstly, use the device link connect the consumer and the
smi-larbs. then add device link between the smi-larb and smi-common.

After adding the device_link, then "mediatek,larb" property can be removed.
the iommu consumer don't need call the mtk_smi_larb_get/put to enable
the power and clock of smi-larb and smi-common.

About the MM dt-binding/dtsi patches, I guess they should go together, thus
I don't split them for each a MM module and each a SoC.

Base on v5.14-rc1, and a jpeg[1] and mdp[2] patchset.

[1] 
https://lore.kernel.org/linux-mediatek/20210702102304.3346429-1-hsi...@chromium.org/
[2] 
https://lore.kernel.org/linux-mediatek/20210709022324.1607884-1-ei...@chromium.org/

Change notes:
v6: 1) rebase on v5.14-rc1.
 2) Fix the issue commented in v5 from Dafna and Hsin-Yi.
 3) Remove the patches about using pm_runtime_resume_and_get since they have
already been merged by other patches.

v5: 
https://lore.kernel.org/linux-mediatek/20210410091128.31823-1-yong...@mediatek.com/
 1) Base v5.12-rc2.
 2) Remove changing the mtk-iommu to module_platform_driver patch, It have 
already been a
 independent patch.

v4: 
https://lore.kernel.org/linux-mediatek/1590826218-23653-1-git-send-email-yong...@mediatek.com/
 base on v5.7-rc1.
   1) Move drm PM patch before smi patchs.
   2) Change builtin_platform_driver to module_platform_driver since we may need
  build as module.
   3) Rebase many patchset as above.

v3: 
https://lore.kernel.org/linux-iommu/1567503456-24725-1-git-send-email-yong...@mediatek.com/
 1) rebase on v5.3-rc1 and the latest mt8183 patchset.
 2) Use device_is_bound to check whether the driver is ready from Matthias.
 3) Add DL_FLAG_STATELESS flag when calling device_link_add and explain the
reason in the commit message[3/14].
 4) Add a display patch[12/14] into this series. otherwise it may affect
display HW fastlogo even though it don't happen in mt8183.

v2: https://lore.kernel.org/linux-iommu/1560171313-28299-1-git-send-email-yong...@mediatek.com/

1) rebase on v5.2-rc1.
2) Move adding device_link between the consumer and smi-larb into
iommu_add_device from Robin.
3) add DL_FLAG_AUTOREMOVE_CONSUMER even though the smi is built-in from 
Evan.
4) Remove the shutdown callback in iommu.

v1: 
https://lore.kernel.org/linux-iommu/1546318276-18993-1-git-send-email-yong...@mediatek.com/

Yong Wu (10):
   dt-binding: mediatek: Get rid of mediatek,larb for multimedia HW
   iommu/mediatek: Add probe_defer for smi-larb
   iommu/mediatek: Add device_link between the consumer and the larb
 devices
   media: mtk-jpeg: Get rid of mtk_smi_larb_get/put
   media: mtk-mdp: Get rid of mtk_smi_larb_get/put
   drm/mediatek: Get rid of mtk_smi_larb_get/put
   media: mtk-vcodec: Get rid of mtk_smi_larb_get/put
   memory: mtk-smi: Get rid of mtk_smi_larb_get/put
   arm: dts: mediatek: Get rid of mediatek,larb for MM nodes
   arm64: dts: mediatek: Get rid of mediatek,larb for MM nodes

Yongqiang Niu (1):
   drm/mediatek: Add pm runtime support for ovl and rdma

  .../display/mediatek/mediatek,disp.txt|  9 
  .../bindings/media/mediatek-jpeg-decoder.yaml |  9 
  .../bindings/media/mediatek-jpeg-encoder.yaml |  9 
  .../bindings/media/mediatek-mdp.txt   |  8 
  .../bindings/media/mediatek-vcodec.txt|  4 --
  arch/arm/boot/dts/mt2701.dtsi |  2 -
  arch/arm/boot/dts/mt7623n.dtsi|  5 --
  arch/arm64/boot/dts/mediatek/mt8173.dtsi  | 16 ---
  arch/arm64/boot/dts/mediatek/mt8183.dtsi  |  6 ---
  drivers/gpu/drm/mediatek/mtk_disp_ovl.c   |  9 +++-
  drivers/gpu/drm/mediatek/mtk_disp_rdma.c  |  9 +++-
  drivers/gpu/drm/mediatek/mtk_drm_crtc.c   | 19 
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c   | 36 +--
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h   |  1 -
  drivers/gpu/drm/mediatek/mtk_drm_drv.c|  5 +-
  drivers/iommu/mtk_iommu.c | 24 +-
  drivers/iommu/mtk_iommu_v1.c  | 22 -
  .../media/platform/mtk-jpeg/mtk_jpeg_core.c   | 45 +-
  .../medi

Re: [PATCH v6 01/11] dt-binding: mediatek: Get rid of mediatek, larb for multimedia HW

2021-07-14 Thread Dafna Hirschfeld



On 14.07.21 10:13, Dafna Hirschfeld wrote:

Hi,
thanks for the patch

On 14.07.21 04:56, Yong Wu wrote:

After adding device_link between the consumer with the smi-larbs,
if the consumer call its owner pm_runtime_get(_sync), the
pm_runtime_get(_sync) of smi-larb and smi-common will be called
automatically. Thus, the consumer don't need the property.

And IOMMU also know which larb this consumer connects with from
iommu id in the "iommus=" property.

Signed-off-by: Yong Wu 
Reviewed-by: Rob Herring 
Reviewed-by: Evan Green 
---
  .../bindings/display/mediatek/mediatek,disp.txt  | 9 -
  .../devicetree/bindings/media/mediatek-jpeg-decoder.yaml | 9 -
  .../devicetree/bindings/media/mediatek-jpeg-encoder.yaml | 9 -


On which repo are these patches based on ?
In linux-next the file mediatek-jpeg-encoder.yaml don't exist

Thanks,
Dafna


sorry, I see you reference the patch that convert to yaml in the cover letter.

Thanks,
Dafna




  Documentation/devicetree/bindings/media/mediatek-mdp.txt | 8 
  .../devicetree/bindings/media/mediatek-vcodec.txt    | 4 
  5 files changed, 39 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt 
b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
index fbb59c9ddda6..867bd82e2f03 100644
--- a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
@@ -61,8 +61,6 @@ Required properties (DMA function blocks):
  "mediatek,-disp-rdma"
  "mediatek,-disp-wdma"
    the supported chips are mt2701, mt8167 and mt8173.
-- larb: Should contain a phandle pointing to the local arbiter device as 
defined
-  in 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
  - iommus: Should point to the respective IOMMU block with master port as
    argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
    for details.
@@ -91,7 +89,6 @@ ovl0: ovl@1400c000 {
  power-domains = < MT8173_POWER_DOMAIN_MM>;
  clocks = < CLK_MM_DISP_OVL0>;
  iommus = < M4U_PORT_DISP_OVL0>;
-    mediatek,larb = <>;
  };
  ovl1: ovl@1400d000 {
@@ -101,7 +98,6 @@ ovl1: ovl@1400d000 {
  power-domains = < MT8173_POWER_DOMAIN_MM>;
  clocks = < CLK_MM_DISP_OVL1>;
  iommus = < M4U_PORT_DISP_OVL1>;
-    mediatek,larb = <>;
  };
  rdma0: rdma@1400e000 {
@@ -111,7 +107,6 @@ rdma0: rdma@1400e000 {
  power-domains = < MT8173_POWER_DOMAIN_MM>;
  clocks = < CLK_MM_DISP_RDMA0>;
  iommus = < M4U_PORT_DISP_RDMA0>;
-    mediatek,larb = <>;
  mediatek,rdma-fifosize = <8192>;
  };
@@ -122,7 +117,6 @@ rdma1: rdma@1400f000 {
  power-domains = < MT8173_POWER_DOMAIN_MM>;
  clocks = < CLK_MM_DISP_RDMA1>;
  iommus = < M4U_PORT_DISP_RDMA1>;
-    mediatek,larb = <>;
  };
  rdma2: rdma@1401 {
@@ -132,7 +126,6 @@ rdma2: rdma@1401 {
  power-domains = < MT8173_POWER_DOMAIN_MM>;
  clocks = < CLK_MM_DISP_RDMA2>;
  iommus = < M4U_PORT_DISP_RDMA2>;
-    mediatek,larb = <>;
  };
  wdma0: wdma@14011000 {
@@ -142,7 +135,6 @@ wdma0: wdma@14011000 {
  power-domains = < MT8173_POWER_DOMAIN_MM>;
  clocks = < CLK_MM_DISP_WDMA0>;
  iommus = < M4U_PORT_DISP_WDMA0>;
-    mediatek,larb = <>;
  };
  wdma1: wdma@14012000 {
@@ -152,7 +144,6 @@ wdma1: wdma@14012000 {
  power-domains = < MT8173_POWER_DOMAIN_MM>;
  clocks = < CLK_MM_DISP_WDMA1>;
  iommus = < M4U_PORT_DISP_WDMA1>;
-    mediatek,larb = <>;
  };
  color0: color@14013000 {
diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml 
b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
index 9b87f036f178..052e752157b4 100644
--- a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
+++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
@@ -42,13 +42,6 @@ properties:
    power-domains:
  maxItems: 1
-  mediatek,larb:
-    $ref: '/schemas/types.yaml#/definitions/phandle'
-    description: |
-  Must contain the local arbiters in the current Socs, see
-  
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
-  for details.
-
    iommus:
  maxItems: 2
  description: |
@@ -63,7 +56,6 @@ required:
    - clocks
    - clock-names
    - power-domains
-  - mediatek,larb
    - iommus
  additionalProperties: false
@@ -83,7 +75,6 @@ examples:
    clock-names = "jpgdec-smi",
  "jpgdec";
    power-domains = < MT2701_POWER_DOMAIN_ISP>;
-  mediatek,larb = <>;
    iommus = < MT2701_M4U_PORT_JPGDEC_WDMA>,
 < MT2701_M4U_PORT_JPGDEC_BSDMA>;
  };
diff --git a/Documentation/devicet

Re: [PATCH v6 06/11] drm/mediatek: Add pm runtime support for ovl and rdma

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

From: Yongqiang Niu 

Prepare for smi cleaning up "mediatek,larb".

Display use the dispsys device to call pm_rumtime_get_sync before.
This patch add pm_runtime_xx with ovl and rdma device whose nodes has
"iommus" property, then display could help pm_runtime_get for smi via
ovl or rdma device.

CC: CK Hu 
Signed-off-by: Yongqiang Niu 
Signed-off-by: Yong Wu 
(Yong: Use pm_runtime_resume_and_get instead of pm_runtime_get_sync)
Acked-by: Chun-Kuang Hu 
---
  drivers/gpu/drm/mediatek/mtk_disp_ovl.c  |  9 -
  drivers/gpu/drm/mediatek/mtk_disp_rdma.c |  9 -
  drivers/gpu/drm/mediatek/mtk_drm_crtc.c  | 12 +++-
  3 files changed, 27 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c 
b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
index fa9d79963cd3..ea5760f856ec 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c
@@ -11,6 +11,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  
  #include "mtk_disp_drv.h"

@@ -414,15 +415,21 @@ static int mtk_disp_ovl_probe(struct platform_device 
*pdev)
return ret;
}
  
+	pm_runtime_enable(dev);

+
ret = component_add(dev, _disp_ovl_component_ops);
-   if (ret)
+   if (ret) {
+   pm_runtime_disable(dev);
dev_err(dev, "Failed to add component: %d\n", ret);
+   }
  
  	return ret;

  }
  
  static int mtk_disp_ovl_remove(struct platform_device *pdev)

  {
+   pm_runtime_disable(>dev);
+
return 0;
  }
  
diff --git a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c

index 705f28ceb4dd..0f31d1c8e37c 100644
--- a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
+++ b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c
@@ -9,6 +9,7 @@
  #include 
  #include 
  #include 
+#include 
  #include 
  
  #include "mtk_disp_drv.h"

@@ -327,9 +328,13 @@ static int mtk_disp_rdma_probe(struct platform_device 
*pdev)
  
  	platform_set_drvdata(pdev, priv);
  
+	pm_runtime_enable(dev);

+
ret = component_add(dev, _disp_rdma_component_ops);
-   if (ret)
+   if (ret) {
+   pm_runtime_disable(dev);
dev_err(dev, "Failed to add component: %d\n", ret);
+   }
  
  	return ret;

  }
@@ -338,6 +343,8 @@ static int mtk_disp_rdma_remove(struct platform_device 
*pdev)
  {
component_del(>dev, _disp_rdma_component_ops);
  
+	pm_runtime_disable(>dev);

+
return 0;
  }
  
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c

index 474efb844249..08e3f352377d 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -557,9 +557,15 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc 
*crtc,
return;
}
  
+	ret = pm_runtime_resume_and_get(comp->dev);

+   if (ret < 0)
+   DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n",
+ ret);


shouldn't the code return in case of failure here?

Thanks,
Dafna


+
ret = mtk_crtc_ddp_hw_init(mtk_crtc);
if (ret) {
mtk_smi_larb_put(comp->larb_dev);
+   pm_runtime_put(comp->dev);
return;
}
  
@@ -572,7 +578,7 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc,

  {
struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc);
struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0];
-   int i;
+   int i, ret;
  
  	DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);

if (!mtk_crtc->enabled)
@@ -596,6 +602,10 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc 
*crtc,
drm_crtc_vblank_off(crtc);
mtk_crtc_ddp_hw_fini(mtk_crtc);
mtk_smi_larb_put(comp->larb_dev);
+   ret = pm_runtime_put(comp->dev);
+   if (ret < 0)
+   DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n",
+ ret);
  
  	mtk_crtc->enabled = false;

  }


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 09/11] memory: mtk-smi: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

After adding device_link between the iommu consumer and smi-larb,
the pm_runtime_get(_sync) of smi-larb and smi-common will be called
automatically. we can get rid of mtk_smi_larb_get/put.

CC: Matthias Brugger 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Krzysztof Kozlowski 
Acked-by: Matthias Brugger 


Reviewed-by: Dafna Hirschfeld 


---
  drivers/memory/mtk-smi.c   | 14 --
  include/soc/mediatek/smi.h | 20 
  2 files changed, 34 deletions(-)

diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c
index c5fb51f73b34..7c61c924e220 100644
--- a/drivers/memory/mtk-smi.c
+++ b/drivers/memory/mtk-smi.c
@@ -134,20 +134,6 @@ static void mtk_smi_clk_disable(const struct mtk_smi *smi)
clk_disable_unprepare(smi->clk_apb);
  }
  
-int mtk_smi_larb_get(struct device *larbdev)

-{
-   int ret = pm_runtime_resume_and_get(larbdev);
-
-   return (ret < 0) ? ret : 0;
-}
-EXPORT_SYMBOL_GPL(mtk_smi_larb_get);
-
-void mtk_smi_larb_put(struct device *larbdev)
-{
-   pm_runtime_put_sync(larbdev);
-}
-EXPORT_SYMBOL_GPL(mtk_smi_larb_put);
-
  static int
  mtk_smi_larb_bind(struct device *dev, struct device *master, void *data)
  {
diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h
index 15e3397cec58..11f7d6b59642 100644
--- a/include/soc/mediatek/smi.h
+++ b/include/soc/mediatek/smi.h
@@ -19,26 +19,6 @@ struct mtk_smi_larb_iommu {
unsigned char  bank[32];
  };
  
-/*

- * mtk_smi_larb_get: Enable the power domain and clocks for this local arbiter.
- *   It also initialize some basic setting(like iommu).
- * mtk_smi_larb_put: Disable the power domain and clocks for this local 
arbiter.
- * Both should be called in non-atomic context.
- *
- * Returns 0 if successful, negative on failure.
- */
-int mtk_smi_larb_get(struct device *larbdev);
-void mtk_smi_larb_put(struct device *larbdev);
-
-#else
-
-static inline int mtk_smi_larb_get(struct device *larbdev)
-{
-   return 0;
-}
-
-static inline void mtk_smi_larb_put(struct device *larbdev) { }
-
  #endif
  
  #endif



___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 08/11] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the vcodec device call the pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync also be called automatically.

CC: Tiffany Lin 
CC: Irui Wang 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Tiffany Lin 


Reviewed-by: Dafna Hirschfeld 


---
  .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c   | 37 +++-
  .../platform/mtk-vcodec/mtk_vcodec_drv.h  |  3 --
  .../platform/mtk-vcodec/mtk_vcodec_enc.c  |  1 -
  .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c   | 44 +++
  4 files changed, 10 insertions(+), 75 deletions(-)

diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
index 6038db96f71c..d0bf9aa3b29d 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c
@@ -8,14 +8,12 @@
  #include 
  #include 
  #include 
-#include 
  
  #include "mtk_vcodec_dec_pm.h"

  #include "mtk_vcodec_util.h"
  
  int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)

  {
-   struct device_node *node;
struct platform_device *pdev;
struct mtk_vcodec_pm *pm;
struct mtk_vcodec_clk *dec_clk;
@@ -26,18 +24,7 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
pm = >pm;
pm->mtkdev = mtkdev;
dec_clk = >vdec_clk;
-   node = of_parse_phandle(pdev->dev.of_node, "mediatek,larb", 0);
-   if (!node) {
-   mtk_v4l2_err("of_parse_phandle mediatek,larb fail!");
-   return -1;
-   }
  
-	pdev = of_find_device_by_node(node);

-   of_node_put(node);
-   if (WARN_ON(!pdev)) {
-   return -1;
-   }
-   pm->larbvdec = >dev;
pdev = mtkdev->plat_dev;
pm->dev = >dev;
  
@@ -47,14 +34,11 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)

dec_clk->clk_info = devm_kcalloc(>dev,
dec_clk->clk_num, sizeof(*clk_info),
GFP_KERNEL);
-   if (!dec_clk->clk_info) {
-   ret = -ENOMEM;
-   goto put_device;
-   }
+   if (!dec_clk->clk_info)
+   return -ENOMEM;
} else {
mtk_v4l2_err("Failed to get vdec clock count");
-   ret = -EINVAL;
-   goto put_device;
+   return -EINVAL;
}
  
  	for (i = 0; i < dec_clk->clk_num; i++) {

@@ -63,29 +47,24 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev)
"clock-names", i, _info->clk_name);
if (ret) {
mtk_v4l2_err("Failed to get clock name id = %d", i);
-   goto put_device;
+   return ret;
}
clk_info->vcodec_clk = devm_clk_get(>dev,
clk_info->clk_name);
if (IS_ERR(clk_info->vcodec_clk)) {
mtk_v4l2_err("devm_clk_get (%d)%s fail", i,
clk_info->clk_name);
-   ret = PTR_ERR(clk_info->vcodec_clk);
-   goto put_device;
+   return PTR_ERR(clk_info->vcodec_clk);
}
}
  
  	pm_runtime_enable(>dev);

return 0;
-put_device:
-   put_device(pm->larbvdec);
-   return ret;
  }
  
  void mtk_vcodec_release_dec_pm(struct mtk_vcodec_dev *dev)

  {
pm_runtime_disable(dev->pm.dev);
-   put_device(dev->pm.larbvdec);
  }
  
  int mtk_vcodec_dec_pw_on(struct mtk_vcodec_pm *pm)

@@ -122,11 +101,6 @@ void mtk_vcodec_dec_clock_on(struct mtk_vcodec_pm *pm)
}
}
  
-	ret = mtk_smi_larb_get(pm->larbvdec);

-   if (ret) {
-   mtk_v4l2_err("mtk_smi_larb_get larbvdec fail %d", ret);
-   goto error;
-   }
return;
  
  error:

@@ -139,7 +113,6 @@ void mtk_vcodec_dec_clock_off(struct mtk_vcodec_pm *pm)
struct mtk_vcodec_clk *dec_clk = >vdec_clk;
int i = 0;
  
-	mtk_smi_larb_put(pm->larbvdec);

for (i = dec_clk->clk_num - 1; i >= 0; i--)
clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk);
  }
diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h 
b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
index c6c7672fecfb..64b73dd880ce 100644
--- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
+++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h
@@ -189,10 +189,7 @@ struct mtk_vcodec_clk {
   */
  struct mtk_vcodec_pm {
struct mtk_vcodec_clk   vdec_clk;
-   struct device   *larbvdec;
-
struct mtk_vcodec_c

Re: [PATCH v6 07/11] drm/mediatek: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the drm device call the pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync also be called automatically.

CC: CK Hu 
CC: Philipp Zabel 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Chun-Kuang Hu 


Reviewed-by: Dafna Hirschfeld 


---
  drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  9 --
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 36 ++---
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h |  1 -
  drivers/gpu/drm/mediatek/mtk_drm_drv.c  |  5 +--
  4 files changed, 3 insertions(+), 48 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c 
b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index 08e3f352377d..d046abcf66ce 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -10,7 +10,6 @@
  #include 
  
  #include 

-#include 
  
  #include 

  #include 
@@ -551,12 +550,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc 
*crtc,
  
  	DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
  
-	ret = mtk_smi_larb_get(comp->larb_dev);

-   if (ret) {
-   DRM_ERROR("Failed to get larb: %d\n", ret);
-   return;
-   }
-
ret = pm_runtime_resume_and_get(comp->dev);
if (ret < 0)
DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n",
@@ -564,7 +557,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc 
*crtc,
  
  	ret = mtk_crtc_ddp_hw_init(mtk_crtc);

if (ret) {
-   mtk_smi_larb_put(comp->larb_dev);
pm_runtime_put(comp->dev);
return;
}
@@ -601,7 +593,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc 
*crtc,
  
  	drm_crtc_vblank_off(crtc);

mtk_crtc_ddp_hw_fini(mtk_crtc);
-   mtk_smi_larb_put(comp->larb_dev);
ret = pm_runtime_put(comp->dev);
if (ret < 0)
DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n",
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
index 75bc00e17fc4..7d240218d4c7 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
@@ -449,37 +449,15 @@ unsigned int mtk_drm_find_possible_crtc_by_comp(struct 
drm_device *drm,
return ret;
  }
  
-static int mtk_ddp_get_larb_dev(struct device_node *node, struct mtk_ddp_comp *comp,

-   struct device *dev)
-{
-   struct device_node *larb_node;
-   struct platform_device *larb_pdev;
-
-   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
-   if (!larb_node) {
-   dev_err(dev, "Missing mediadek,larb phandle in %pOF node\n", 
node);
-   return -EINVAL;
-   }
-
-   larb_pdev = of_find_device_by_node(larb_node);
-   if (!larb_pdev) {
-   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
-   of_node_put(larb_node);
-   return -EPROBE_DEFER;
-   }
-   of_node_put(larb_node);
-   comp->larb_dev = _pdev->dev;
-
-   return 0;
-}
-
  int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp,
  enum mtk_ddp_comp_id comp_id)
  {
struct platform_device *comp_pdev;
enum mtk_ddp_comp_type type;
struct mtk_ddp_comp_dev *priv;
+#if IS_REACHABLE(CONFIG_MTK_CMDQ)
int ret;
+#endif
  
  	if (comp_id < 0 || comp_id >= DDP_COMPONENT_ID_MAX)

return -EINVAL;
@@ -495,16 +473,6 @@ int mtk_ddp_comp_init(struct device_node *node, struct 
mtk_ddp_comp *comp,
}
comp->dev = _pdev->dev;
  
-	/* Only DMA capable components need the LARB property */

-   if (type == MTK_DISP_OVL ||
-   type == MTK_DISP_OVL_2L ||
-   type == MTK_DISP_RDMA ||
-   type == MTK_DISP_WDMA) {
-   ret = mtk_ddp_get_larb_dev(node, comp, comp->dev);
-   if (ret)
-   return ret;
-   }
-
if (type == MTK_DISP_BLS ||
type == MTK_DISP_CCORR ||
type == MTK_DISP_COLOR ||
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h 
b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
index bb914d976cf5..1b582262b682 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
@@ -70,7 +70,6 @@ struct mtk_ddp_comp_funcs {
  struct mtk_ddp_comp {
struct device *dev;
int irq;
-   struct device *larb_dev;
enum mtk_ddp_comp_id id;
const struct mtk_ddp_comp_funcs *funcs;
  };
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index b46bdb8985da..0d5ef3d8d081 100644
--- a/drivers/gpu/drm/mediatek/mtk

Re: [PATCH v6 05/11] media: mtk-mdp: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the mdp device call the pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync also be called automatically.

CC: Minghsiu Tsai 
CC: Houlong Wei 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Reviewed-by: Houlong Wei 


Reviewed-by: Dafna Hirschfeld 


---
  drivers/media/platform/mtk-mdp/mtk_mdp_comp.c | 46 +--
  drivers/media/platform/mtk-mdp/mtk_mdp_comp.h |  2 -
  drivers/media/platform/mtk-mdp/mtk_mdp_core.c |  1 -
  3 files changed, 1 insertion(+), 48 deletions(-)

diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c 
b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
index de2d425efdd1..5e0ea83a9f7f 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c
@@ -13,7 +13,6 @@
  #include 
  #include 
  #include 
-#include 
  #include 
  
  #include "mtk_mdp_comp.h"

@@ -57,13 +56,6 @@ int mtk_mdp_comp_power_on(struct mtk_mdp_comp *comp)
  {
int status, err;
  
-	if (comp->larb_dev) {

-   err = mtk_smi_larb_get(comp->larb_dev);
-   if (err)
-   dev_err(comp->dev,
-   "failed to get larb, err %d.\n", err);
-   }
-
err = pm_runtime_get_sync(comp->dev);
if (err < 0) {
dev_err(comp->dev, "failed to runtime get, err %d.\n", err);
@@ -146,9 +138,6 @@ void mtk_mdp_comp_clock_off(struct mtk_mdp_comp *comp)
continue;
clk_disable_unprepare(comp->clk[i]);
}
-
-   if (comp->larb_dev)
-   mtk_smi_larb_put(comp->larb_dev);
  }
  
  /*

@@ -236,9 +225,6 @@ static const struct component_ops mtk_mdp_component_ops = {
  
  int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct device *dev)

  {
-   struct device_node *larb_node;
-   struct platform_device *larb_pdev;
-   int ret;
int i;
struct device_node *node = dev->of_node;
enum mtk_mdp_comp_type comp_type =
@@ -252,8 +238,7 @@ int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct 
device *dev)
if (IS_ERR(comp->clk[i])) {
if (PTR_ERR(comp->clk[i]) != -EPROBE_DEFER)
dev_err(dev, "Failed to get clock\n");
-   ret = PTR_ERR(comp->clk[i]);
-   goto err;
+   return PTR_ERR(comp->clk[i]);
}
  
  		/* Only RDMA needs two clocks */

@@ -261,36 +246,7 @@ int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct 
device *dev)
break;
}
  
-	/* Only DMA capable components need the LARB property */

-   comp->larb_dev = NULL;
-   if (comp_type != MTK_MDP_RDMA &&
-   comp_type != MTK_MDP_WDMA &&
-   comp_type != MTK_MDP_WROT)
-   return 0;
-
-   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
-   if (!larb_node) {
-   dev_err(dev,
-   "Missing mediadek,larb phandle in %pOF node\n", node);
-   ret = -EINVAL;
-   goto err;
-   }
-
-   larb_pdev = of_find_device_by_node(larb_node);
-   if (!larb_pdev) {
-   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
-   of_node_put(larb_node);
-   ret = -EPROBE_DEFER;
-   goto err;
-   }
-   of_node_put(larb_node);
-
-   comp->larb_dev = _pdev->dev;
-
return 0;
-
-err:
-   return ret;
  }
  
  static int mtk_mdp_comp_probe(struct platform_device *pdev)

diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h 
b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
index 5201c47f7baa..2bd229cc7eae 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h
@@ -11,13 +11,11 @@
   * struct mtk_mdp_comp - the MDP's function component data
   * @node: list node to track sibing MDP components
   * @clk:  clocks required for component
- * @larb_dev:  SMI device required for component
   * @dev:  component's device
   */
  struct mtk_mdp_comp {
struct list_headnode;
struct clk  *clk[2];
-   struct device   *larb_dev;
struct device   *dev;
  };
  
diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c

index e1fb39231248..be7d35b3e3ff 100644
--- a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
+++ b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c
@@ -18,7 +18,6 @@
  #include 
  #include 
  #include 
-#include 
  
  #include "mtk_mdp_comp.h"

  #include "mtk_mdp_core.h"


___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu


Re: [PATCH v6 04/11] media: mtk-jpeg: Get rid of mtk_smi_larb_get/put

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

MediaTek IOMMU has already added device_link between the consumer
and smi-larb device. If the jpg device call the pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync also be called automatically.

After removing the larb_get operations, then mtk_jpeg_clk_init is
also unnecessary. Remove it too.

CC: Rick Chang 
CC: Xia Jiang 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Rick Chang 


Reviewed-by: Dafna Hirschfeld 


---
  .../media/platform/mtk-jpeg/mtk_jpeg_core.c   | 45 +--
  .../media/platform/mtk-jpeg/mtk_jpeg_core.h   |  2 -
  2 files changed, 2 insertions(+), 45 deletions(-)

diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c 
b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
index a89c7b206eef..4fea2c512434 100644
--- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
+++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c
@@ -22,7 +22,6 @@
  #include 
  #include 
  #include 
-#include 
  
  #include "mtk_jpeg_enc_hw.h"

  #include "mtk_jpeg_dec_hw.h"
@@ -1055,10 +1054,6 @@ static void mtk_jpeg_clk_on(struct mtk_jpeg_dev *jpeg)
  {
int ret;
  
-	ret = mtk_smi_larb_get(jpeg->larb);

-   if (ret)
-   dev_err(jpeg->dev, "mtk_smi_larb_get larbvdec fail %d\n", ret);
-
ret = clk_bulk_prepare_enable(jpeg->variant->num_clks,
  jpeg->variant->clks);
if (ret)
@@ -1069,7 +1064,6 @@ static void mtk_jpeg_clk_off(struct mtk_jpeg_dev *jpeg)
  {
clk_bulk_disable_unprepare(jpeg->variant->num_clks,
   jpeg->variant->clks);
-   mtk_smi_larb_put(jpeg->larb);
  }
  
  static irqreturn_t mtk_jpeg_enc_done(struct mtk_jpeg_dev *jpeg)

@@ -1284,35 +1278,6 @@ static struct clk_bulk_data mtk_jpeg_clocks[] = {
{ .id = "jpgenc" },
  };
  
-static int mtk_jpeg_clk_init(struct mtk_jpeg_dev *jpeg)

-{
-   struct device_node *node;
-   struct platform_device *pdev;
-   int ret;
-
-   node = of_parse_phandle(jpeg->dev->of_node, "mediatek,larb", 0);
-   if (!node)
-   return -EINVAL;
-   pdev = of_find_device_by_node(node);
-   if (WARN_ON(!pdev)) {
-   of_node_put(node);
-   return -EINVAL;
-   }
-   of_node_put(node);
-
-   jpeg->larb = >dev;
-
-   ret = devm_clk_bulk_get(jpeg->dev, jpeg->variant->num_clks,
-   jpeg->variant->clks);
-   if (ret) {
-   dev_err(>dev, "failed to get jpeg clock:%d\n", ret);
-   put_device(>dev);
-   return ret;
-   }
-
-   return 0;
-}
-
  static void mtk_jpeg_job_timeout_work(struct work_struct *work)
  {
struct mtk_jpeg_dev *jpeg = container_of(work, struct mtk_jpeg_dev,
@@ -1333,11 +1298,6 @@ static void mtk_jpeg_job_timeout_work(struct work_struct 
*work)
v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx);
  }
  
-static inline void mtk_jpeg_clk_release(struct mtk_jpeg_dev *jpeg)

-{
-   put_device(jpeg->larb);
-}
-
  static int mtk_jpeg_probe(struct platform_device *pdev)
  {
struct mtk_jpeg_dev *jpeg;
@@ -1376,7 +1336,8 @@ static int mtk_jpeg_probe(struct platform_device *pdev)
goto err_req_irq;
}
  
-	ret = mtk_jpeg_clk_init(jpeg);

+   ret = devm_clk_bulk_get(jpeg->dev, jpeg->variant->num_clks,
+   jpeg->variant->clks);
if (ret) {
dev_err(>dev, "Failed to init clk, err %d\n", ret);
goto err_clk_init;
@@ -1442,7 +1403,6 @@ static int mtk_jpeg_probe(struct platform_device *pdev)
v4l2_device_unregister(>v4l2_dev);
  
  err_dev_register:

-   mtk_jpeg_clk_release(jpeg);
  
  err_clk_init:
  
@@ -1460,7 +1420,6 @@ static int mtk_jpeg_remove(struct platform_device *pdev)

video_device_release(jpeg->vdev);
v4l2_m2m_release(jpeg->m2m_dev);
v4l2_device_unregister(>v4l2_dev);
-   mtk_jpeg_clk_release(jpeg);
  
  	return 0;

  }
diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h 
b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
index 595f7f10c9fd..3e4811a41ba2 100644
--- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
+++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h
@@ -85,7 +85,6 @@ struct mtk_jpeg_variant {
   * @alloc_ctx:videobuf2 memory allocator's context
   * @vdev: video device node for jpeg mem2mem mode
   * @reg_base: JPEG registers mapping
- * @larb:  SMI device
   * @job_timeout_work: IRQ timeout structure
   * @variant:  driver variant to be used
   */
@@ -99,7 +98,6 @@ struct mtk_jpeg_dev {
void*alloc_ctx;
struct video_device *vdev;

Re: [PATCH v6 03/11] iommu/mediatek: Add device_link between the consumer and the larb devices

2021-07-14 Thread Dafna Hirschfeld




On 14.07.21 04:56, Yong Wu wrote:

MediaTek IOMMU-SMI diagram is like below. all the consumer connect with
smi-larb, then connect with smi-common.

 M4U
  |
 smi-common
  |
   -
   | |...
   | |
larb1 larb2
   | |
vdec   venc

When the consumer works, it should enable the smi-larb's power which
also need enable the smi-common's power firstly.

Thus, First of all, use the device link connect the consumer and the
smi-larbs. then add device link between the smi-larb and smi-common.

This patch adds device_link between the consumer and the larbs.

When device_link_add, I add the flag DL_FLAG_STATELESS to avoid calling
pm_runtime_xx to keep the original status of clocks. It can avoid two
issues:
1) Display HW show fastlogo abnormally reported in [1]. At the beggining,
all the clocks are enabled before entering kernel, but the clocks for
display HW(always in larb0) will be gated after clk_enable and clk_disable
called from device_link_add(->pm_runtime_resume) and rpm_idle. The clock
operation happened before display driver probe. At that time, the display
HW will be abnormal.

2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip
pm_runtime_xx to avoid the deadlock.

Corresponding, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, then
device_link_removed should be added explicitly.

[1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/
[2] https://lore.kernel.org/patchwork/patch/1086569/

Suggested-by: Tomasz Figa 
Signed-off-by: Yong Wu 
---
  drivers/iommu/mtk_iommu.c| 22 ++
  drivers/iommu/mtk_iommu_v1.c | 20 +++-
  2 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c
index a02dde094788..ee742900cf4b 100644
--- a/drivers/iommu/mtk_iommu.c
+++ b/drivers/iommu/mtk_iommu.c
@@ -571,22 +571,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct mtk_iommu_data *data;
+   struct device_link *link;
+   struct device *larbdev;
+   unsigned int larbid;
  
  	if (!fwspec || fwspec->ops != _iommu_ops)

return ERR_PTR(-ENODEV); /* Not a iommu client device */
  
  	data = dev_iommu_priv_get(dev);
  
+	/*

+* Link the consumer device with the smi-larb device(supplier)
+* The device in each a larb is a independent HW. thus only link
+* one larb here.
+*/
+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));

shoudn't ERR_PTR be returned in case of failure?

Thanks,
Dafna


return >iommu;
  }
  
  static void mtk_iommu_release_device(struct device *dev)

  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct mtk_iommu_data *data;
+   struct device *larbdev;
+   unsigned int larbid;
  
  	if (!fwspec || fwspec->ops != _iommu_ops)

return;
  
+	data = dev_iommu_priv_get(dev);

+   larbid = MTK_M4U_TO_LARB(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   device_link_remove(dev, larbdev);
+
iommu_fwspec_free(dev);
  }
  
diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c

index d9365a3d8dc9..d2a7c66b8239 100644
--- a/drivers/iommu/mtk_iommu_v1.c
+++ b/drivers/iommu/mtk_iommu_v1.c
@@ -424,7 +424,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
struct of_phandle_args iommu_spec;
struct mtk_iommu_data *data;
-   int err, idx = 0;
+   int err, idx = 0, larbid;
+   struct device_link *link;
+   struct device *larbdev;
  
  	while (!of_parse_phandle_with_args(dev->of_node, "iommus",

   "#iommu-cells",
@@ -445,6 +447,14 @@ static struct iommu_device *mtk_iommu_probe_device(struct 
device *dev)
  
  	data = dev_iommu_priv_get(dev);
  
+	/* Link the consumer device with the smi-larb device(supplier) */

+   larbid = mt2701_m4u_to_larb(fwspec->ids[0]);
+   larbdev = data->larb_imu[larbid].dev;
+   link = device_link_add(dev, larbdev,
+  DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS);
+   if (!link)
+   dev_err(dev, "Unable to link %s\n", dev_name(larbdev));
+
return >iommu;
  }
  
@@ -465,10 +475,18 @@ static void mtk_iommu_probe_finalize(struct device *dev)

  static void mtk_iommu_release_device(struct device *dev)
  {
struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev);
+   struct mtk_iommu_data *data;
+   struct device *larbdev;
+   unsigned int larbid;
 

Re: [PATCH v6 01/11] dt-binding: mediatek: Get rid of mediatek, larb for multimedia HW

2021-07-14 Thread Dafna Hirschfeld

Hi,
thanks for the patch

On 14.07.21 04:56, Yong Wu wrote:

After adding device_link between the consumer with the smi-larbs,
if the consumer call its owner pm_runtime_get(_sync), the
pm_runtime_get(_sync) of smi-larb and smi-common will be called
automatically. Thus, the consumer don't need the property.

And IOMMU also know which larb this consumer connects with from
iommu id in the "iommus=" property.

Signed-off-by: Yong Wu 
Reviewed-by: Rob Herring 
Reviewed-by: Evan Green 
---
  .../bindings/display/mediatek/mediatek,disp.txt  | 9 -
  .../devicetree/bindings/media/mediatek-jpeg-decoder.yaml | 9 -
  .../devicetree/bindings/media/mediatek-jpeg-encoder.yaml | 9 -


On which repo are these patches based on ?
In linux-next the file mediatek-jpeg-encoder.yaml don't exist

Thanks,
Dafna


  Documentation/devicetree/bindings/media/mediatek-mdp.txt | 8 
  .../devicetree/bindings/media/mediatek-vcodec.txt| 4 
  5 files changed, 39 deletions(-)

diff --git 
a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt 
b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
index fbb59c9ddda6..867bd82e2f03 100644
--- a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
+++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt
@@ -61,8 +61,6 @@ Required properties (DMA function blocks):
"mediatek,-disp-rdma"
"mediatek,-disp-wdma"
the supported chips are mt2701, mt8167 and mt8173.
-- larb: Should contain a phandle pointing to the local arbiter device as 
defined
-  in 
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
  - iommus: Should point to the respective IOMMU block with master port as
argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml
for details.
@@ -91,7 +89,6 @@ ovl0: ovl@1400c000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_OVL0>;
iommus = < M4U_PORT_DISP_OVL0>;
-   mediatek,larb = <>;
  };
  
  ovl1: ovl@1400d000 {

@@ -101,7 +98,6 @@ ovl1: ovl@1400d000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_OVL1>;
iommus = < M4U_PORT_DISP_OVL1>;
-   mediatek,larb = <>;
  };
  
  rdma0: rdma@1400e000 {

@@ -111,7 +107,6 @@ rdma0: rdma@1400e000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_RDMA0>;
iommus = < M4U_PORT_DISP_RDMA0>;
-   mediatek,larb = <>;
mediatek,rdma-fifosize = <8192>;
  };
  
@@ -122,7 +117,6 @@ rdma1: rdma@1400f000 {

power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_RDMA1>;
iommus = < M4U_PORT_DISP_RDMA1>;
-   mediatek,larb = <>;
  };
  
  rdma2: rdma@1401 {

@@ -132,7 +126,6 @@ rdma2: rdma@1401 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_RDMA2>;
iommus = < M4U_PORT_DISP_RDMA2>;
-   mediatek,larb = <>;
  };
  
  wdma0: wdma@14011000 {

@@ -142,7 +135,6 @@ wdma0: wdma@14011000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_WDMA0>;
iommus = < M4U_PORT_DISP_WDMA0>;
-   mediatek,larb = <>;
  };
  
  wdma1: wdma@14012000 {

@@ -152,7 +144,6 @@ wdma1: wdma@14012000 {
power-domains = < MT8173_POWER_DOMAIN_MM>;
clocks = < CLK_MM_DISP_WDMA1>;
iommus = < M4U_PORT_DISP_WDMA1>;
-   mediatek,larb = <>;
  };
  
  color0: color@14013000 {

diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml 
b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
index 9b87f036f178..052e752157b4 100644
--- a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
+++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml
@@ -42,13 +42,6 @@ properties:
power-domains:
  maxItems: 1
  
-  mediatek,larb:

-$ref: '/schemas/types.yaml#/definitions/phandle'
-description: |
-  Must contain the local arbiters in the current Socs, see
-  
Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml
-  for details.
-
iommus:
  maxItems: 2
  description: |
@@ -63,7 +56,6 @@ required:
- clocks
- clock-names
- power-domains
-  - mediatek,larb
- iommus
  
  additionalProperties: false

@@ -83,7 +75,6 @@ examples:
clock-names = "jpgdec-smi",
  "jpgdec";
power-domains = < MT2701_POWER_DOMAIN_ISP>;
-  mediatek,larb = <>;
iommus = < MT2701_M4U_PORT_JPGDEC_WDMA>,
 < MT2701_M4U_PORT_JPGDEC_BSDMA>;
  };
diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml 
b/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml
index fcd9b829e036..8bfdfdfaba59 100644
--- a/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml
+++ 

Re: [PATCH v5 11/16] drm/mediatek: Get rid of mtk_smi_larb_get/put

2021-05-25 Thread Dafna Hirschfeld

Hi

On 10.04.21 12:11, Yong Wu wrote:

MediaTek IOMMU has already added the device_link between the consumer
and smi-larb device. If the drm device call the pm_runtime_get_sync,
the smi-larb's pm_runtime_get_sync also be called automatically.

CC: CK Hu 
CC: Philipp Zabel 
Signed-off-by: Yong Wu 
Reviewed-by: Evan Green 
Acked-by: Chun-Kuang Hu 
---
  drivers/gpu/drm/mediatek/mtk_drm_crtc.c |  9 --
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 35 -
  drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h |  1 -
  drivers/gpu/drm/mediatek/mtk_drm_drv.c  |  5 +--
  4 files changed, 1 insertion(+), 49 deletions(-)

diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c 
b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
index 971ef58ac1dc..d59353af4019 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c
@@ -10,7 +10,6 @@
  #include 
  
  #include 

-#include 
  
  #include 

  #include 
@@ -544,12 +543,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc 
*crtc,
  
  	DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id);
  
-	ret = mtk_smi_larb_get(comp->larb_dev);

-   if (ret) {
-   DRM_ERROR("Failed to get larb: %d\n", ret);
-   return;
-   }
-
ret = pm_runtime_resume_and_get(comp->dev);
if (ret < 0)
DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n",
@@ -557,7 +550,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc 
*crtc,
  
  	ret = mtk_crtc_ddp_hw_init(mtk_crtc);

if (ret) {
-   mtk_smi_larb_put(comp->larb_dev);
pm_runtime_put(comp->dev);
return;
}
@@ -594,7 +586,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc 
*crtc,
  
  	drm_crtc_vblank_off(crtc);

mtk_crtc_ddp_hw_fini(mtk_crtc);
-   mtk_smi_larb_put(comp->larb_dev);
ret = pm_runtime_put(comp->dev);
if (ret < 0)
DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n",
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c 
b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
index 75bc00e17fc4..6c01492ba4df 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c
@@ -449,37 +449,12 @@ unsigned int mtk_drm_find_possible_crtc_by_comp(struct 
drm_device *drm,
return ret;
  }
  
-static int mtk_ddp_get_larb_dev(struct device_node *node, struct mtk_ddp_comp *comp,

-   struct device *dev)
-{
-   struct device_node *larb_node;
-   struct platform_device *larb_pdev;
-
-   larb_node = of_parse_phandle(node, "mediatek,larb", 0);
-   if (!larb_node) {
-   dev_err(dev, "Missing mediadek,larb phandle in %pOF node\n", 
node);
-   return -EINVAL;
-   }
-
-   larb_pdev = of_find_device_by_node(larb_node);
-   if (!larb_pdev) {
-   dev_warn(dev, "Waiting for larb device %pOF\n", larb_node);
-   of_node_put(larb_node);
-   return -EPROBE_DEFER;
-   }
-   of_node_put(larb_node);
-   comp->larb_dev = _pdev->dev;
-
-   return 0;
-}
-
  int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp,
  enum mtk_ddp_comp_id comp_id)
  {
struct platform_device *comp_pdev;
enum mtk_ddp_comp_type type;
struct mtk_ddp_comp_dev *priv;
-   int ret;

Hi,

This 'ret' is also used inside `if IS_REACHABLE(CONFIG_MTK_CMDQ)`
so it  should not be removed.

Thanks,
Dafna

  
  	if (comp_id < 0 || comp_id >= DDP_COMPONENT_ID_MAX)

return -EINVAL;
@@ -495,16 +470,6 @@ int mtk_ddp_comp_init(struct device_node *node, struct 
mtk_ddp_comp *comp,
}
comp->dev = _pdev->dev;
  
-	/* Only DMA capable components need the LARB property */

-   if (type == MTK_DISP_OVL ||
-   type == MTK_DISP_OVL_2L ||
-   type == MTK_DISP_RDMA ||
-   type == MTK_DISP_WDMA) {
-   ret = mtk_ddp_get_larb_dev(node, comp, comp->dev);
-   if (ret)
-   return ret;
-   }
-
if (type == MTK_DISP_BLS ||
type == MTK_DISP_CCORR ||
type == MTK_DISP_COLOR ||
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h 
b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
index bb914d976cf5..1b582262b682 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
+++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h
@@ -70,7 +70,6 @@ struct mtk_ddp_comp_funcs {
  struct mtk_ddp_comp {
struct device *dev;
int irq;
-   struct device *larb_dev;
enum mtk_ddp_comp_id id;
const struct mtk_ddp_comp_funcs *funcs;
  };
diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c 
b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
index b013d56d2777..622de47239eb 100644
--- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c
+++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c
@@ -576,11 +576,8 @@ static int 

Re: [PATCH v8 05/14] media: rkisp1: add Rockchip ISP1 subdev driver

2020-08-13 Thread Dafna Hirschfeld



Am 07.08.20 um 18:08 schrieb Dafna Hirschfeld:

Hi

Am 06.08.20 um 14:22 schrieb Tomasz Figa:

On Thu, Aug 6, 2020 at 11:21 AM Dafna Hirschfeld
 wrote:




Am 05.08.20 um 23:10 schrieb Dafna Hirschfeld:

Hi

On 22.07.20 17:24, Tomasz Figa wrote:

Hi Dafna,

On Sat, Jul 11, 2020 at 01:04:31PM +0200, Dafna Hirschfeld wrote:

Hi Laurent,

On 16.08.19 02:13, Laurent Pinchart wrote:

Hello Helen,

Thank you for the patch.

On Tue, Jul 30, 2019 at 03:42:47PM -0300, Helen Koike wrote:

[snip]

+static void rkisp1_isp_queue_event_sof(struct rkisp1_isp_subdev *isp)
+{
+    struct v4l2_event event = {
+    .type = V4L2_EVENT_FRAME_SYNC,
+    .u.frame_sync.frame_sequence =
+    atomic_inc_return(>frm_sync_seq) - 1,


I would move the increment to the caller, hiding it in this function is
error-prone (and if you look at the caller I'm pointing out one possible
error :-)).

In general usage of frm_sync_seq through the driver seems to be very
race-prone. It's read in various IRQ handling functions, all coming from
the same IRQ, so that part is fine (and wouldn't require an atomic
variable), but when read from the buffer queue handlers I really get a
red light flashing in my head. I'll try to investigate more when
reviewing the next patches.


I see that the only place were 'frame_sequence' is read outside of the irq
handlers is in the capture in 'rkisp1_vb2_buf_queue':

 /*
   * If there's no next buffer assigned, queue this buffer directly
   * as the next buffer, and update the memory interface.
   */
  if (cap->is_streaming && !cap->buf.next &&
  atomic_read(>rkisp1->isp.frame_sequence) == -1) {
  cap->buf.next = ispbuf;
  rkisp1_set_next_buf(cap);
  } else {
  list_add_tail(>queue, >buf.queue);
  }
This "if" condition seems very specific, a case where we already stream but 
v-start was not yet received.
I think it is possible to remove the test 
'atomic_read(>rkisp1->isp.frame_sequence) == -1'
from the above condition so that the next buffer is updated in case it is null 
not just before the first
v-start signal.



We don't have this special case in the Chrome OS code.

I suppose it would make it possible to resume the capture 1 frame
earlier after a queue underrun, as otherwise the new buffer would be
only programmed after the next frame start interrupt and used for the
next-next frame.  However, it's racy, because programming of the buffer
addresses is not atomic and could end up with the hardware using few
plane addresses from the new buffer and few from the dummy buffer.

Given that and also the fact that a queue underrun is a very special
case, where the system was already having problems catching up, I'd just
remove this special case.

[snip]

+void rkisp1_isp_isr(unsigned int isp_mis, struct rkisp1_device *dev)
+{
+    void __iomem *base = dev->base_addr;
+    unsigned int isp_mis_tmp = 0;


_tmp are never good names :-S


+    unsigned int isp_err = 0;


Neither of these variable need to be initialised to 0.


+
+    /* start edge of v_sync */
+    if (isp_mis & CIF_ISP_V_START) {
+    rkisp1_isp_queue_event_sof(>isp_sdev);


This will increment the frame sequence number. What if the interrupt is
slightly delayed and the next frame starts before we get a change to
copy the sequence number to the buffers (before they will complete
below) ?


Do you mean that we get two sequental v-start signals and then the next
frame-end signal in MI_MIS belongs to the first v-start signal of the two?
How can this be solved? I wonder if any v-start signal has a later signal
that correspond to the same frame so that we can follow it?

Maybe we should have one counter that is incremented on v-start signal,
and another counter that is incremented uppon some other signal?



We're talking about a hard IRQ. I can't imagine the interrupt handler
being delayed for a time close to a full frame interval (~16ms for 60
fps) to trigger such scenario.




+
+    writel(CIF_ISP_V_START, base + CIF_ISP_ICR);


Do you need to clear all interrupt bits individually, can't you write
isp_mis to CIF_ISP_ICR at the beginning of the function to clear them
all in one go ?


+    isp_mis_tmp = readl(base + CIF_ISP_MIS);
+    if (isp_mis_tmp & CIF_ISP_V_START)
+    v4l2_err(>v4l2_dev, "isp icr v_statr err: 0x%x\n",
+ isp_mis_tmp);


This require some explanation. It looks like a naive way to protect
against something, but I think it could trigger under normal
circumstances if IRQ handling is delayed, and wouldn't do much anyway.
Same for the similar constructs below.


+    }
+
+    if ((isp_mis & CIF_ISP_PIC_SIZE_ERROR)) {
+    /* Clear pic_size_error */
+    writel(CIF_ISP_PIC_SIZE_ERROR, base + CIF_ISP_ICR);
+    isp_err = readl(base + CIF_ISP_ERR);
+    v4l2_err(>

Re: [PATCH v8 05/14] media: rkisp1: add Rockchip ISP1 subdev driver

2020-08-07 Thread Dafna Hirschfeld

Hi

Am 06.08.20 um 14:22 schrieb Tomasz Figa:

On Thu, Aug 6, 2020 at 11:21 AM Dafna Hirschfeld
 wrote:




Am 05.08.20 um 23:10 schrieb Dafna Hirschfeld:

Hi

On 22.07.20 17:24, Tomasz Figa wrote:

Hi Dafna,

On Sat, Jul 11, 2020 at 01:04:31PM +0200, Dafna Hirschfeld wrote:

Hi Laurent,

On 16.08.19 02:13, Laurent Pinchart wrote:

Hello Helen,

Thank you for the patch.

On Tue, Jul 30, 2019 at 03:42:47PM -0300, Helen Koike wrote:

[snip]

+static void rkisp1_isp_queue_event_sof(struct rkisp1_isp_subdev *isp)
+{
+struct v4l2_event event = {
+.type = V4L2_EVENT_FRAME_SYNC,
+.u.frame_sync.frame_sequence =
+atomic_inc_return(>frm_sync_seq) - 1,


I would move the increment to the caller, hiding it in this function is
error-prone (and if you look at the caller I'm pointing out one possible
error :-)).

In general usage of frm_sync_seq through the driver seems to be very
race-prone. It's read in various IRQ handling functions, all coming from
the same IRQ, so that part is fine (and wouldn't require an atomic
variable), but when read from the buffer queue handlers I really get a
red light flashing in my head. I'll try to investigate more when
reviewing the next patches.


I see that the only place were 'frame_sequence' is read outside of the irq
handlers is in the capture in 'rkisp1_vb2_buf_queue':

 /*
   * If there's no next buffer assigned, queue this buffer directly
   * as the next buffer, and update the memory interface.
   */
  if (cap->is_streaming && !cap->buf.next &&
  atomic_read(>rkisp1->isp.frame_sequence) == -1) {
  cap->buf.next = ispbuf;
  rkisp1_set_next_buf(cap);
  } else {
  list_add_tail(>queue, >buf.queue);
  }
This "if" condition seems very specific, a case where we already stream but 
v-start was not yet received.
I think it is possible to remove the test 
'atomic_read(>rkisp1->isp.frame_sequence) == -1'
from the above condition so that the next buffer is updated in case it is null 
not just before the first
v-start signal.



We don't have this special case in the Chrome OS code.

I suppose it would make it possible to resume the capture 1 frame
earlier after a queue underrun, as otherwise the new buffer would be
only programmed after the next frame start interrupt and used for the
next-next frame.  However, it's racy, because programming of the buffer
addresses is not atomic and could end up with the hardware using few
plane addresses from the new buffer and few from the dummy buffer.

Given that and also the fact that a queue underrun is a very special
case, where the system was already having problems catching up, I'd just
remove this special case.

[snip]

+void rkisp1_isp_isr(unsigned int isp_mis, struct rkisp1_device *dev)
+{
+void __iomem *base = dev->base_addr;
+unsigned int isp_mis_tmp = 0;


_tmp are never good names :-S


+unsigned int isp_err = 0;


Neither of these variable need to be initialised to 0.


+
+/* start edge of v_sync */
+if (isp_mis & CIF_ISP_V_START) {
+rkisp1_isp_queue_event_sof(>isp_sdev);


This will increment the frame sequence number. What if the interrupt is
slightly delayed and the next frame starts before we get a change to
copy the sequence number to the buffers (before they will complete
below) ?


Do you mean that we get two sequental v-start signals and then the next
frame-end signal in MI_MIS belongs to the first v-start signal of the two?
How can this be solved? I wonder if any v-start signal has a later signal
that correspond to the same frame so that we can follow it?

Maybe we should have one counter that is incremented on v-start signal,
and another counter that is incremented uppon some other signal?



We're talking about a hard IRQ. I can't imagine the interrupt handler
being delayed for a time close to a full frame interval (~16ms for 60
fps) to trigger such scenario.




+
+writel(CIF_ISP_V_START, base + CIF_ISP_ICR);


Do you need to clear all interrupt bits individually, can't you write
isp_mis to CIF_ISP_ICR at the beginning of the function to clear them
all in one go ?


+isp_mis_tmp = readl(base + CIF_ISP_MIS);
+if (isp_mis_tmp & CIF_ISP_V_START)
+v4l2_err(>v4l2_dev, "isp icr v_statr err: 0x%x\n",
+ isp_mis_tmp);


This require some explanation. It looks like a naive way to protect
against something, but I think it could trigger under normal
circumstances if IRQ handling is delayed, and wouldn't do much anyway.
Same for the similar constructs below.


+}
+
+if ((isp_mis & CIF_ISP_PIC_SIZE_ERROR)) {
+/* Clear pic_size_error */
+writel(CIF_ISP_PIC_SIZE_ERROR, base + CIF_ISP_ICR);
+isp_err = readl(base + CIF_ISP_ERR);
+v4l2_err(>v4l2_dev,
+ "CIF_ISP_PIC_SIZE_ERROR

Re: [PATCH v8 05/14] media: rkisp1: add Rockchip ISP1 subdev driver

2020-08-07 Thread Dafna Hirschfeld

Hi,


Am 06.08.20 um 14:08 schrieb Tomasz Figa:

On Wed, Aug 5, 2020 at 11:10 PM Dafna Hirschfeld
 wrote:


Hi

On 22.07.20 17:24, Tomasz Figa wrote:

Hi Dafna,

On Sat, Jul 11, 2020 at 01:04:31PM +0200, Dafna Hirschfeld wrote:

Hi Laurent,

On 16.08.19 02:13, Laurent Pinchart wrote:

Hello Helen,

Thank you for the patch.

On Tue, Jul 30, 2019 at 03:42:47PM -0300, Helen Koike wrote:

[snip]

+static void rkisp1_isp_queue_event_sof(struct rkisp1_isp_subdev *isp)
+{
+  struct v4l2_event event = {
+  .type = V4L2_EVENT_FRAME_SYNC,
+  .u.frame_sync.frame_sequence =
+  atomic_inc_return(>frm_sync_seq) - 1,


I would move the increment to the caller, hiding it in this function is
error-prone (and if you look at the caller I'm pointing out one possible
error :-)).

In general usage of frm_sync_seq through the driver seems to be very
race-prone. It's read in various IRQ handling functions, all coming from
the same IRQ, so that part is fine (and wouldn't require an atomic
variable), but when read from the buffer queue handlers I really get a
red light flashing in my head. I'll try to investigate more when
reviewing the next patches.


I see that the only place were 'frame_sequence' is read outside of the irq
handlers is in the capture in 'rkisp1_vb2_buf_queue':

  /*
   * If there's no next buffer assigned, queue this buffer directly
   * as the next buffer, and update the memory interface.
   */
  if (cap->is_streaming && !cap->buf.next &&
  atomic_read(>rkisp1->isp.frame_sequence) == -1) {
  cap->buf.next = ispbuf;
  rkisp1_set_next_buf(cap);
  } else {
  list_add_tail(>queue, >buf.queue);
  }
This "if" condition seems very specific, a case where we already stream but 
v-start was not yet received.
I think it is possible to remove the test 
'atomic_read(>rkisp1->isp.frame_sequence) == -1'
from the above condition so that the next buffer is updated in case it is null 
not just before the first
v-start signal.



We don't have this special case in the Chrome OS code.

I suppose it would make it possible to resume the capture 1 frame
earlier after a queue underrun, as otherwise the new buffer would be
only programmed after the next frame start interrupt and used for the
next-next frame.  However, it's racy, because programming of the buffer
addresses is not atomic and could end up with the hardware using few
plane addresses from the new buffer and few from the dummy buffer.

Given that and also the fact that a queue underrun is a very special
case, where the system was already having problems catching up, I'd just
remove this special case.

[snip]

+void rkisp1_isp_isr(unsigned int isp_mis, struct rkisp1_device *dev)
+{
+  void __iomem *base = dev->base_addr;
+  unsigned int isp_mis_tmp = 0;


_tmp are never good names :-S


+  unsigned int isp_err = 0;


Neither of these variable need to be initialised to 0.


+
+  /* start edge of v_sync */
+  if (isp_mis & CIF_ISP_V_START) {
+  rkisp1_isp_queue_event_sof(>isp_sdev);


This will increment the frame sequence number. What if the interrupt is
slightly delayed and the next frame starts before we get a change to
copy the sequence number to the buffers (before they will complete
below) ?


Do you mean that we get two sequental v-start signals and then the next
frame-end signal in MI_MIS belongs to the first v-start signal of the two?
How can this be solved? I wonder if any v-start signal has a later signal
that correspond to the same frame so that we can follow it?

Maybe we should have one counter that is incremented on v-start signal,
and another counter that is incremented uppon some other signal?



We're talking about a hard IRQ. I can't imagine the interrupt handler
being delayed for a time close to a full frame interval (~16ms for 60
fps) to trigger such scenario.




+
+  writel(CIF_ISP_V_START, base + CIF_ISP_ICR);


Do you need to clear all interrupt bits individually, can't you write
isp_mis to CIF_ISP_ICR at the beginning of the function to clear them
all in one go ?


+  isp_mis_tmp = readl(base + CIF_ISP_MIS);
+  if (isp_mis_tmp & CIF_ISP_V_START)
+  v4l2_err(>v4l2_dev, "isp icr v_statr err: 0x%x\n",
+   isp_mis_tmp);


This require some explanation. It looks like a naive way to protect
against something, but I think it could trigger under normal
circumstances if IRQ handling is delayed, and wouldn't do much anyway.
Same for the similar constructs below.


+  }
+
+  if ((isp_mis & CIF_ISP_PIC_SIZE_ERROR)) {
+  /* Clear pic_size_error */
+  writel(CIF_ISP_PIC_SIZE_ERROR, base + CIF_ISP_ICR);
+  isp_err = readl(base + CIF_ISP_ERR);
+  v4l2_err(>v4l2_dev,
+   "CIF_ISP_PIC_SIZE_ERROR (0x%08x)"

Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()

2019-07-25 Thread Dafna Hirschfeld
On Thu, 2019-07-25 at 09:50 -0700, Nicolin Chen wrote:
> On Thu, Jul 25, 2019 at 01:06:42PM -0300, Ezequiel Garcia wrote:
> > I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna
> > who found a regression caused by this commit. Dafna, can you give all
> > the details, including the log and how you are reproducing it?
> 
> I saw the conversation there. Sorry for not replying yet.
> May we discuss there since there are full logs available?
> 
> Thanks
> Nicolin
> 

Hi,
I compiled vivid as built-in into the kernel (not as a separate module) for 
nitrogen8m device (imx8) and
set it to use contig dma for mem_ops by adding the kernel param
vivid.allocators=1,1,...

I use this devicetree patch for the dtb file: 
https://lkml.org/lkml/2019/7/24/789. Although it should
be the same on any Aarch64 platform.

Then, on the board I run the command:

v4l2-ctl -d3 -v width=2592,height=1944,pixelformat=UYVY,bytesperline=5184 
--stream-mmap --stream-to video.UYVY

In every run there is a different crash. Here is one of them: 
https://pastebin.com/xXgbXMAN

Dafna
> > 
> > 
> > > > On Fri, 24 May 2019 at 01:08, Nicolin Chen  
> > > > wrote:
> > > 
> > > Both dma_alloc_from_contiguous() and dma_release_from_contiguous()
> > > are very simply implemented, but requiring callers to pass certain
> > > parameters like count and align, and taking a boolean parameter to
> > > check __GFP_NOWARN in the allocation flags. So every function call
> > > duplicates similar work:
> > >   /* A piece of example */
> > >   unsigned long order = get_order(size);
> > >   size_t count = size >> PAGE_SHIFT;
> > >   page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN);
> > >   [...]
> > >   dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT);
> > > 
> > > Additionally, as CMA can be used only in the context which permits
> > > sleeping, most of callers do a gfpflags_allow_blocking() check and
> > > a corresponding fallback allocation of normal pages upon any false
> > > result:
> > >   /* A piece of example */
> > >   if (gfpflags_allow_blocking(flag))
> > >   page = dma_alloc_from_contiguous();
> > >   if (!page)
> > >   page = alloc_pages();
> > >   [...]
> > >   if (!dma_release_from_contiguous(dev, page, count))
> > >   __free_pages(page, get_order(size));
> > > 
> > > So this patch simplifies those function calls by abstracting these
> > > operations into the two new functions: dma_{alloc,free}_contiguous.
> > > 
> > > As some callers of dma_{alloc,release}_from_contiguous() might be
> > > complicated, this patch just implements these two new functions to
> > > kernel/dma/direct.c only as an initial step.
> > > 
> > > > > > Suggested-by: Christoph Hellwig 
> > > > > > Signed-off-by: Nicolin Chen 
> > > ---
> > > Changelog
> > > v2->v3:
> > >  * Added missing "static inline" in header file to fix build error.
> > > v1->v2:
> > >  * Added new functions beside the old ones so we can replace callers
> > >    one by one later.
> > >  * Applied new functions to dma/direct.c only, because it's the best
> > >    example caller to apply and should be safe with the new functions.
> > > 
> > >  include/linux/dma-contiguous.h | 11 
> > >  kernel/dma/contiguous.c| 48 ++
> > >  kernel/dma/direct.c| 24 +++--
> > >  3 files changed, 63 insertions(+), 20 deletions(-)
> > > 
> > > diff --git a/include/linux/dma-contiguous.h 
> > > b/include/linux/dma-contiguous.h
> > > index f247e8aa5e3d..00a370c1c140 100644
> > > --- a/include/linux/dma-contiguous.h
> > > +++ b/include/linux/dma-contiguous.h
> > > @@ -115,6 +115,8 @@ struct page *dma_alloc_from_contiguous(struct device 
> > > *dev, size_t count,
> > >    unsigned int order, bool no_warn);
> > >  bool dma_release_from_contiguous(struct device *dev, struct page *pages,
> > >  int count);
> > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t 
> > > gfp);
> > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t 
> > > size);
> > > 
> > >  #else
> > > 
> > > @@ -157,6 +159,15 @@ bool dma_release_from_contiguous(struct device *dev, 
> > > struct page *pages,
> > > return false;
> > >  }
> > > 
> > > +static inline
> > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t 
> > > gfp)
> > > +{
> > > +   return NULL;
> > > +}
> > > +
> > > +static inline
> > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t 
> > > size) { }
> > > +
> > >  #endif
> > > 
> > >  #endif
> > > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
> > > index b2a87905846d..21f39a6cb04f 100644
> > > --- a/kernel/dma/contiguous.c
> > > +++ b/kernel/dma/contiguous.c
> > > @@ -214,6 +214,54 @@ bool dma_release_from_contiguous(struct device *dev, 
> > > struct page *pages,
> > > return cma_release(dev_get_cma_area(dev), pages, count);