Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent
On 23.03.15 10:38, Tomasz Figa wrote: Sorry, I had to dig my way out through my backlog. On Tue, Mar 3, 2015 at 10:36 PM, Joerg Roedel wrote: On Mon, Feb 09, 2015 at 08:19:21PM +0900, Tomasz Figa wrote: Even though the code uses the dt_lock spin lock to serialize mapping operation from different threads, it does not protect from IOMMU accesses that might be already taking place and thus altering state of the IOTLB. This means that current mapping code which first zaps the page table and only then updates it with new mapping which is prone to mentioned race. Could you elabortate a bit on the race and why it is sufficient to zap only the first and the last iova? From the description and the comments in the patch this is not clear to me. Let's start with why it's sufficient to zap only first and last iova. While unmapping, the driver zaps all iovas belonging to the mapping, so the page tables not used by any mapping won't be cached. Now when the driver creates a mapping it might end up occupying several page tables. However, since the mapping area is virtually contiguous, only the first and last page table can be shared with different mappings. This means that only first and last iovas can be already cached. In fact, we could detect if first and last page tables are shared and do not zap at all, but this wouldn't really optimize too much. Why invalidating one iova is enough to invalidate the whole page table is unclear to me as well, but it seems to be the correct way on this hardware. Hi, It seems to me that actually each mapping needs exactly one page. Since (as the inline doc in rk_iommu_map states) the pgsize_bitmap makes sure that iova mappings fits exactly into one page table since the mapping size is maximum 4M. This actually means that if rk_dte_get_page_table does not allocate a new page table but returns one that is already partially used from previous mappings then two page tables might be required, but I think the iova allocation somehow make sure that this will not be the case. If it was the case then the code would be buggy because it means that the loop in rk_iommu_map_iova will write behind the page table given in rk_dte_get_page_table (which we didn't allocate) So I it seems to me that calling 'rk_iommu_zap_iova(rk_domain, iova, SPAGE_SIZE);' as done before this patch should be used, but be moved from rk_dte_get_page_table to where rk_iommu_zap_iova_first_last is now Thanks, Dafna As for the race, it's also kind of explained by the above. The already running hardware can trigger page table look-ups in the IOMMU and so caching of the page table between our zapping and updating its contents. With this patch zapping is performed after updating the page table so the race is gone. Best regards, Tomasz From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majord...@vger.kernel.org) by vger.kernel.org via listexpand id S1753210AbbCWM3R (ORCPT ); Mon, 23 Mar 2015 08:29:17 -0400 Received: from 8bytes.org ([81.169.241.247]:33957 "EHLO theia.8bytes.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752552AbbCWM3M (ORCPT ); Mon, 23 Mar 2015 08:29:12 -0400 Date: Mon, 23 Mar 2015 13:29:10 +0100 From: Joerg Roedel To: Tomasz Figa Cc: iommu@lists.linux-foundation.org, "linux-arm-ker...@lists.infradead.org" , "linux-ker...@vger.kernel.org" , "open list:ARM/Rockchip SoC..." , Heiko Stuebner , Daniel Kurtz Subject: Re: [PATCH] CHROMIUM: iommu: rockchip: Make sure that page table state is coherent Message-ID: <20150323122910.go4...@8bytes.org> References: <1423480761-33453-1-git-send-email-tf...@chromium.org> <20150303133659.gd10...@8bytes.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-ow...@vger.kernel.org List-ID: X-Mailing-List: linux-ker...@vger.kernel.org Hi Tomasz, On Mon, Mar 23, 2015 at 05:38:45PM +0900, Tomasz Figa wrote: While unmapping, the driver zaps all iovas belonging to the mapping, so the page tables not used by any mapping won't be cached. Now when the driver creates a mapping it might end up occupying several page tables. However, since the mapping area is virtually contiguous, only the first and last page table can be shared with different mappings. This means that only first and last iovas can be already cached. In fact, we could detect if first and last page tables are shared and do not zap at all, but this wouldn't really optimize too much. Why invalidating one iova is enough to invalidate the whole page table is unclear to me as well, but it seems to be the correct way on this hardware. As for the race, it's also kind of explained by the above. The already running hardware can trigger page table look-ups in the IOMMU and so caching of the page table between our zapping and updating its contents. With this patch zapping is
[PATCH v2 4/5] iommu/mediatek: Add tlb_lock in tlb_flush_all
From: Yong Wu The tlb_flush_all touches the registers controlling tlb operations. Protect it with the tlb_lock spinlock. This also require the range_sync func to release that spinlock before calling tlb_flush_all. Signed-off-by: Yong Wu [refactor commit log] Signed-off-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index e30ac68fab48..195a411e3087 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -210,10 +210,14 @@ static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom) static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data) { + unsigned long flags; + + spin_lock_irqsave(>tlb_lock, flags); writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, data->base + data->plat_data->inv_sel_reg); writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE); wmb(); /* Make sure the tlb flush all done */ + spin_unlock_irqrestore(>tlb_lock, flags); } static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, @@ -242,14 +246,16 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, /* tlb sync */ ret = readl_poll_timeout_atomic(data->base + REG_MMU_CPE_DONE, tmp, tmp != 0, 10, 1000); + + /* Clear the CPE status */ + writel_relaxed(0, data->base + REG_MMU_CPE_DONE); + spin_unlock_irqrestore(>tlb_lock, flags); + if (ret) { dev_warn(data->dev, "Partial TLB flush timed out, falling back to full flush\n"); mtk_iommu_tlb_flush_all(data); } - /* Clear the CPE status */ - writel_relaxed(0, data->base + REG_MMU_CPE_DONE); - spin_unlock_irqrestore(>tlb_lock, flags); pm_runtime_put(data->dev); } -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 0/5] iommu/mediatek: Fix tlb flush logic
Often devices allocate DMA buffers before they do runtime pm resume. This is the case for example with v4l2 devices where buffers are allocated during 'VIDIOC_REQBUFS` and runtime resume happens later usually during 'VIDIOC_STREAMON'. In such cases the partial tlb flush when allocating will fail since the iommu is runtime suspended. This will print a warning and try to do full flush. But there is actually no need to flush the tlb before the consumer device is turned on. Fix the warning by skipping partial flush when allocating and instead do full flash in runtime resume. In order to do full flush from the resume cb, the test: if (pm_runtime_get_if_in_use(data->dev) <= 0) continue; needs to be removed from the flush all func since pm_runtime_get_if_in_use returns 0 while resuming and will skip the flush This patchset is a combination of 4 patches already sent in a different patchset: [1] and a warning fix from Sebastian Reichel [1] https://lore.kernel.org/linux-devicetree/20210923115840.17813-1-yong...@mediatek.com/ changes since v1: - * Added preparation patches to remove the unneeded 'for_each_m4u' usage and add a spinlock to protect access to tlb control registers. * remove the pm runtime status check as explained above. * refactor commit logs and inline doc * move the call to full flush to the bottom of the resume cb after all registers are updated. Sebastian Reichel (1): iommu/mediatek: Always check runtime PM status in tlb flush range callback Yong Wu (4): iommu/mediatek: Remove for_each_m4u in tlb_sync_all iommu/mediatek: Remove the power status checking in tlb flush all iommu/mediatek: Add tlb_lock in tlb_flush_all iommu/mediatek: Always tlb_flush_all when each PM resume drivers/iommu/mtk_iommu.c | 42 --- 1 file changed, 22 insertions(+), 20 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 5/5] iommu/mediatek: Always tlb_flush_all when each PM resume
From: Yong Wu Prepare for 2 HWs that sharing pgtable in different power-domains. When there are 2 M4U HWs, it may has problem in the flush_range in which we get the pm_status via the m4u dev, BUT that function don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. DAM allocation is often done while the allocating device is runtime suspended. In such a case the iommu will also be suspended and partial flushing of the tlb will not be executed. Therefore, we add a tlb_flush_all in the pm_runtime_resume to make sure the tlb is always clean. In other case, the iommu's power should be active via device link with smi. Signed-off-by: Yong Wu [move the call to mtk_iommu_tlb_flush_all to the bottom of resume cb, improve doc/log] Signed-off-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 195a411e3087..4799cd06511b 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -997,6 +997,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) writel_relaxed(reg->ivrp_paddr, base + REG_MMU_IVRP_PADDR); writel_relaxed(reg->vld_pa_rng, base + REG_MMU_VLD_PA_RNG); writel(m4u_dom->cfg.arm_v7s_cfg.ttbr & MMU_PT_ADDR_MASK, base + REG_MMU_PT_BASE_ADDR); + + /* +* Users may allocate dma buffer before they call pm_runtime_get, +* in which case it will lack the necessary tlb flush. +* Thus, make sure to update the tlb after each PM resume. +*/ + mtk_iommu_tlb_flush_all(data); return 0; } -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 1/5] iommu/mediatek: Remove for_each_m4u in tlb_sync_all
From: Yong Wu The tlb_sync_all is called from these three functions: a) flush_iotlb_all: it will be called for each a iommu HW. b) tlb_flush_range_sync: it already has for_each_m4u. c) in irq: When IOMMU HW translation fault, Only need flush itself. Thus, No need for_each_m4u in this tlb_sync_all. Remove it. Signed-off-by: Yong Wu Reviewed-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 16 +++- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 507123ae7485..342aa562ab6a 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -210,17 +210,15 @@ static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom) static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data) { - for_each_m4u(data) { - if (pm_runtime_get_if_in_use(data->dev) <= 0) - continue; + if (pm_runtime_get_if_in_use(data->dev) <= 0) + return; - writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, - data->base + data->plat_data->inv_sel_reg); - writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE); - wmb(); /* Make sure the tlb flush all done */ + writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, + data->base + data->plat_data->inv_sel_reg); + writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE); + wmb(); /* Make sure the tlb flush all done */ - pm_runtime_put(data->dev); - } + pm_runtime_put(data->dev); } static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 3/5] iommu/mediatek: Remove the power status checking in tlb flush all
From: Yong Wu To simplify the code, Remove the power status checking in the tlb_flush_all, remove this: if (pm_runtime_get_if_in_use(data->dev) <= 0) continue; The mtk_iommu_tlb_flush_all is called from a) isr b) tlb flush range fail case c) iommu_create_device_direct_mappings In first two cases, the power and clock are always enabled. In the third case tlb flush is unnecessary because in a later patch in the series a full flush from the pm_runtime_resume callback is added. In addition, writing the tlb control register when the iommu is not resumed is ok and the write is ignored. Signed-off-by: Yong Wu [refactor commit log] Signed-off-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 5 - 1 file changed, 5 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index dd2c08c54df4..e30ac68fab48 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -210,15 +210,10 @@ static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom) static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data) { - if (pm_runtime_get_if_in_use(data->dev) <= 0) - return; - writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, data->base + data->plat_data->inv_sel_reg); writel_relaxed(F_ALL_INVLD, data->base + REG_MMU_INVALIDATE); wmb(); /* Make sure the tlb flush all done */ - - pm_runtime_put(data->dev); } static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH v2 2/5] iommu/mediatek: Always check runtime PM status in tlb flush range callback
From: Sebastian Reichel In case of v4l2_reqbufs() it is possible, that a TLB flush is done without runtime PM being enabled. In that case the "Partial TLB flush timed out, falling back to full flush" warning is printed. Commit c0b57581b73b ("iommu/mediatek: Add power-domain operation") introduced has_pm as optimization to avoid checking runtime PM when there is no power domain attached. But without the PM domain there is still the device driver's runtime PM suspend handler, which disables the clock. Thus flushing should also be avoided when there is no PM domain involved. Signed-off-by: Sebastian Reichel Reviewed-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 342aa562ab6a..dd2c08c54df4 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -225,16 +225,13 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, size_t granule, struct mtk_iommu_data *data) { - bool has_pm = !!data->dev->pm_domain; unsigned long flags; int ret; u32 tmp; for_each_m4u(data) { - if (has_pm) { - if (pm_runtime_get_if_in_use(data->dev) <= 0) - continue; - } + if (pm_runtime_get_if_in_use(data->dev) <= 0) + continue; spin_lock_irqsave(>tlb_lock, flags); writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, @@ -259,8 +256,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, writel_relaxed(0, data->base + REG_MMU_CPE_DONE); spin_unlock_irqrestore(>tlb_lock, flags); - if (has_pm) - pm_runtime_put(data->dev); + pm_runtime_put(data->dev); } } -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu/mediatek: Always tlb_flush_all when each PM resume
On 08.12.21 11:50, Dafna Hirschfeld wrote: On 07.12.21 10:31, Dafna Hirschfeld wrote: On 27.11.21 04:46, Yong Wu wrote: Hi Dafna, Sorry for reply late. On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote: From: Yong Wu Prepare for 2 HWs that sharing pgtable in different power-domains. When there are 2 M4U HWs, it may has problem in the flush_range in which we get the pm_status via the m4u dev, BUT that function don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. The function dma_alloc_attrs help allocate the iommu buffer which need the corresponding power domain since tlb flush is needed when preparing iova. BUT this function only is for allocating buffer, we have no good reason to request the user always call pm_runtime_get before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all in the pm_runtime_resume to make sure the tlb always is clean. Another solution is always call pm_runtime_get in the tlb_flush_range. This will trigger pm runtime resume/backup so often when the iommu power is not active at some time(means user don't call pm_runtime_get before calling dma_alloc_xxx), This may cause the performance drop. thus we don't use this. In other case, the iommu's power should always be active via device link with smi. The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's power-domain which nearly always is enabled. thus no need fix tags here. Prepare for mt8195. In this patchset, this message should be not proper. I think you could add the comment why this patch is needed in mt8173. Signed-off-by: Yong Wu [imporvie inline doc] Signed-off-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 25b834104790..28dc4b95b6d9 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) return ret; } + /* + * Users may allocate dma buffer before they call pm_runtime_get, + * in which case it will lack the necessary tlb flush. + * Thus, make sure to update the tlb after each PM resume. + */ + mtk_iommu_tlb_flush_all(data); This should not work. since current the *_tlb_flush_all call pm_runtime_get_if_in_use which will always return 0 when it called from this runtime_cb in my test. thus, It won't do the tlb_flush_all actually. He, indeed, my mistake, although the encoder works more or less fine even without the full flush so I didn't catch that. I guess this also depend on these two patches of mt8195 v3. [PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all [PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all I'll add those two like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have the pm operation. yes, I need to remove the pm_runtime_get_if_in_use call in the 'flush_all' func I see there is also a patch for that in the mt8195 v3 series "[PATCH v3 13/33] iommu/mediatek: Remove the power status checking in tlb flush all" So I'll send v2, adding all those 3 patches, but I think adding mtk_iommu_tlb_do_flush_all on patch 9 and removing it again on patch 13 is confusing so I'll avoid that. In addition, the call to mtk_iommu_tlb_flush_all from mtk_iommu_runtime_resume should move to the bottom of the function after all values are updated Thanks, Dafna This looks has a dependence. Let me know if I can help this. It did work for me, testing on elm device. I'll check that again. + /* * Uppon first resume, only enable the clk and return, since the values of the * registers are not yet set. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu/mediatek: Always tlb_flush_all when each PM resume
On 07.12.21 10:31, Dafna Hirschfeld wrote: On 27.11.21 04:46, Yong Wu wrote: Hi Dafna, Sorry for reply late. On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote: From: Yong Wu Prepare for 2 HWs that sharing pgtable in different power-domains. When there are 2 M4U HWs, it may has problem in the flush_range in which we get the pm_status via the m4u dev, BUT that function don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. The function dma_alloc_attrs help allocate the iommu buffer which need the corresponding power domain since tlb flush is needed when preparing iova. BUT this function only is for allocating buffer, we have no good reason to request the user always call pm_runtime_get before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all in the pm_runtime_resume to make sure the tlb always is clean. Another solution is always call pm_runtime_get in the tlb_flush_range. This will trigger pm runtime resume/backup so often when the iommu power is not active at some time(means user don't call pm_runtime_get before calling dma_alloc_xxx), This may cause the performance drop. thus we don't use this. In other case, the iommu's power should always be active via device link with smi. The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's power-domain which nearly always is enabled. thus no need fix tags here. Prepare for mt8195. In this patchset, this message should be not proper. I think you could add the comment why this patch is needed in mt8173. Signed-off-by: Yong Wu [imporvie inline doc] Signed-off-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 25b834104790..28dc4b95b6d9 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) return ret; } + /* + * Users may allocate dma buffer before they call pm_runtime_get, + * in which case it will lack the necessary tlb flush. + * Thus, make sure to update the tlb after each PM resume. + */ + mtk_iommu_tlb_flush_all(data); This should not work. since current the *_tlb_flush_all call pm_runtime_get_if_in_use which will always return 0 when it called from this runtime_cb in my test. thus, It won't do the tlb_flush_all actually. He, indeed, my mistake, although the encoder works more or less fine even without the full flush so I didn't catch that. I guess this also depend on these two patches of mt8195 v3. [PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all [PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all I'll add those two like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have the pm operation. yes, I need to remove the pm_runtime_get_if_in_use call in the 'flush_all' func I see there is also a patch for that in the mt8195 v3 series "[PATCH v3 13/33] iommu/mediatek: Remove the power status checking in tlb flush all" So I'll send v2, adding all those 3 patches, but I think adding mtk_iommu_tlb_do_flush_all on patch 9 and removing it again on patch 13 is confusing so I'll avoid that. Thanks, Dafna This looks has a dependence. Let me know if I can help this. It did work for me, testing on elm device. I'll check that again. + /* * Uppon first resume, only enable the clk and return, since the values of the * registers are not yet set. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH 1/2] iommu/mediatek: Always tlb_flush_all when each PM resume
On 27.11.21 04:46, Yong Wu wrote: Hi Dafna, Sorry for reply late. On Mon, 2021-11-22 at 12:43 +0200, Dafna Hirschfeld wrote: From: Yong Wu Prepare for 2 HWs that sharing pgtable in different power-domains. When there are 2 M4U HWs, it may has problem in the flush_range in which we get the pm_status via the m4u dev, BUT that function don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. The function dma_alloc_attrs help allocate the iommu buffer which need the corresponding power domain since tlb flush is needed when preparing iova. BUT this function only is for allocating buffer, we have no good reason to request the user always call pm_runtime_get before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all in the pm_runtime_resume to make sure the tlb always is clean. Another solution is always call pm_runtime_get in the tlb_flush_range. This will trigger pm runtime resume/backup so often when the iommu power is not active at some time(means user don't call pm_runtime_get before calling dma_alloc_xxx), This may cause the performance drop. thus we don't use this. In other case, the iommu's power should always be active via device link with smi. The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's power-domain which nearly always is enabled. thus no need fix tags here. Prepare for mt8195. In this patchset, this message should be not proper. I think you could add the comment why this patch is needed in mt8173. Signed-off-by: Yong Wu [imporvie inline doc] Signed-off-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 25b834104790..28dc4b95b6d9 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) return ret; } + /* +* Users may allocate dma buffer before they call pm_runtime_get, +* in which case it will lack the necessary tlb flush. +* Thus, make sure to update the tlb after each PM resume. +*/ + mtk_iommu_tlb_flush_all(data); This should not work. since current the *_tlb_flush_all call pm_runtime_get_if_in_use which will always return 0 when it called from this runtime_cb in my test. thus, It won't do the tlb_flush_all actually. I guess this also depend on these two patches of mt8195 v3. [PATCH v3 09/33] iommu/mediatek: Remove for_each_m4u in tlb_sync_all [PATCH v3 10/33] iommu/mediatek: Add tlb_lock in tlb_flush_all like in [10/33], I added a mtk_iommu_tlb_do_flush_all which don't have the pm operation. This looks has a dependence. Let me know if I can help this. It did work for me, testing on elm device. I'll check that again. + /* * Uppon first resume, only enable the clk and return, since the values of the * registers are not yet set. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 3/4] iommu/rockchip: rename dte_index to dte
In rk_iommu_map, the var dte_index is actually set to the dte and not to the dte index. Rename it. Signed-off-by: Dafna Hirschfeld --- drivers/iommu/rockchip-iommu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index bd73cf9c54f5..ba60f0faafcc 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -852,7 +852,7 @@ static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova, unsigned long flags; dma_addr_t pte_dma, iova = (dma_addr_t)_iova; u32 *page_table, *pte_addr; - u32 dte_index, pte_index; + u32 dte, pte_index; int ret; spin_lock_irqsave(_domain->dt_lock, flags); @@ -870,11 +870,11 @@ static int rk_iommu_map(struct iommu_domain *domain, unsigned long _iova, return PTR_ERR(page_table); } - dte_index = rk_domain->dt[rk_iova_dte_index(iova)]; + dte = rk_domain->dt[rk_iova_dte_index(iova)]; pte_index = rk_iova_pte_index(iova); pte_addr = _table[pte_index]; - pte_dma = rk_ops->pt_address(dte_index) + pte_index * sizeof(u32); + pte_dma = rk_ops->pt_address(dte) + pte_index * sizeof(u32); ret = rk_iommu_map_iova(rk_domain, pte_addr, pte_dma, iova, paddr, size, prot); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 4/4] iommu/rockchip: replace pt_address cb with dma_addr_dte when setting dt addr
The dt address is calculated using the dma_addr_dte cb. So when setting the dt address to the DTE_ADDR_DUMMY that cb should be used instead of pt_address. Signed-off-by: Dafna Hirschfeld --- drivers/iommu/rockchip-iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index ba60f0faafcc..013f7608a2e6 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -510,7 +510,7 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) * and verifying that upper 5 nybbles are read back. */ for (i = 0; i < iommu->num_mmu; i++) { - dte_addr = rk_ops->pt_address(DTE_ADDR_DUMMY); + dte_addr = rk_ops->dma_addr_dte(DTE_ADDR_DUMMY); rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, dte_addr); if (dte_addr != rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR)) { -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/4] iommu/rockchip: replace 4 with sizeof(u32)
In log_iova, multiply by 4 is used to calculate the addresses. In other places in this driver, sizeof(u3) is used. So replace 4 with sizeof(u32) for consistency Signed-off-by: Dafna Hirschfeld --- drivers/iommu/rockchip-iommu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 5cb260820eda..bd22d0a6eaf0 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -580,14 +580,14 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR); mmu_dte_addr_phys = rk_ops->dte_addr_phys(mmu_dte_addr); - dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index); + dte_addr_phys = mmu_dte_addr_phys + sizeof(u32) * dte_index; dte_addr = phys_to_virt(dte_addr_phys); dte = *dte_addr; if (!rk_dte_is_pt_valid(dte)) goto print_it; - pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4); + pte_addr_phys = rk_ops->pt_address(dte) + pte_index * sizeof(u32); pte_addr = phys_to_virt(pte_addr_phys); pte = *pte_addr; -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/4] iommu/rockchip: remove redundant var dte_addr
Using dte_addr as local var is redundant. Instead acces rk_domain->dt[dte_index] directly. Signed-off-by: Dafna Hirschfeld --- drivers/iommu/rockchip-iommu.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index bd22d0a6eaf0..bd73cf9c54f5 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -744,7 +744,7 @@ static void rk_iommu_zap_iova_first_last(struct rk_iommu_domain *rk_domain, static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain, dma_addr_t iova) { - u32 *page_table, *dte_addr; + u32 *page_table; u32 dte_index, dte; phys_addr_t pt_phys; dma_addr_t pt_dma; @@ -752,8 +752,8 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain, assert_spin_locked(_domain->dt_lock); dte_index = rk_iova_dte_index(iova); - dte_addr = _domain->dt[dte_index]; - dte = *dte_addr; + dte = rk_domain->dt[dte_index]; + if (rk_dte_is_pt_valid(dte)) goto done; @@ -769,7 +769,7 @@ static u32 *rk_dte_get_page_table(struct rk_iommu_domain *rk_domain, } dte = rk_ops->mk_dtentries(pt_dma); - *dte_addr = dte; + rk_domain->dt[dte_index] = dte; rk_table_flush(rk_domain, rk_domain->dt_dma + dte_index * sizeof(u32), 1); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume
On 30.11.21 09:39, Yong Wu wrote: On Sat, 2021-11-27 at 12:11 +0200, Dafna Hirschfeld wrote: On 10.11.21 09:50, Yong Wu wrote: On Wed, 2021-11-10 at 07:29 +0200, Dafna Hirschfeld wrote: On 10.11.21 04:20, Yong Wu wrote: On Tue, 2021-11-09 at 14:21 +0200, Dafna Hirschfeld wrote: Hi This patch is needed in order to update the tlb when a device is powered on. Could you send this patch alone without the whole series so it get accepted easier? Which SoC are you testing on? In previous SoC, the IOMMU HW don't have power-domain, and we have a "has_pm"[1] in the tlb function for that case. The "has_pm" should be always 0 for the previous SoC like mt8173, it should always tlb synchronize. thus, Could you help share more about your issue? In which case it lack the necessary tlb operation. At least, We need confirm if it needs a "Fixes" tags if sending this patch alone. Hi, I work with the mtk-vcodec driver on mt8173. As you wrote, the iommu doesn't have a power-domain and so when allocating buffers before the device is powered on, there is the warning "Partial TLB flush timed out, falling back to full flush" flooding the log buf. oh. Thanks very much for your information. Get it now. This issue should be introduced by the: b34ea31fe013 ("iommu/mediatek: Always enable the clk on resume") Hi, reverting this commit didn't solve those warnings, I think this is because in the function mtk_iommu_attach_device the first call to pm_runtime_resume_and_get does not turn the clks on since m4u_dom is not yet initialize. And then mtk_iommu_attach_device calls pm_runtime_put right after mtk_iommu_hw_init is called (where the clks are turned on) oh. Right. this is also related with the patch of "Add power-domain operation". The current problem is that there is a redundant log of "Partial TLB flush timed out" in mt8173. We need fix this issue firstly. Are you going to prepare the patches again? If not, I could help this. You could help confirm them if you are free. Hi, I already sent a patch last week: https://lore.kernel.org/linux-iommu/afb46ad6ca9477a2bf71233858406caa6ccb1588.ca...@mediatek.com/T/ could you please review it? Thanks, Dafna Thanks. thanks, Dafna tlb failed due to the bclk is not enabled. Could you help try that after reverting this? Sebastian Reichel suggested to remove the 'if(has_pm)' check to avoid this warning, and avoid flushing the tlb if the device is off: [1] http://ix.io/3Eyr This fixes the warning, but then the tlb is not flushed in sync, Therefore the tlb should be flushed when the device is resumed. So the two patches (the one suggested in the link [1] and this patch) should be sent together as a 2-patch series. then this is reasonable. You could help this into a new patchset if you are free(add Fixes tag). Thanks. Thanks, Dafna ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume
On 10.11.21 09:50, Yong Wu wrote: On Wed, 2021-11-10 at 07:29 +0200, Dafna Hirschfeld wrote: On 10.11.21 04:20, Yong Wu wrote: On Tue, 2021-11-09 at 14:21 +0200, Dafna Hirschfeld wrote: Hi This patch is needed in order to update the tlb when a device is powered on. Could you send this patch alone without the whole series so it get accepted easier? Which SoC are you testing on? In previous SoC, the IOMMU HW don't have power-domain, and we have a "has_pm"[1] in the tlb function for that case. The "has_pm" should be always 0 for the previous SoC like mt8173, it should always tlb synchronize. thus, Could you help share more about your issue? In which case it lack the necessary tlb operation. At least, We need confirm if it needs a "Fixes" tags if sending this patch alone. Hi, I work with the mtk-vcodec driver on mt8173. As you wrote, the iommu doesn't have a power-domain and so when allocating buffers before the device is powered on, there is the warning "Partial TLB flush timed out, falling back to full flush" flooding the log buf. oh. Thanks very much for your information. Get it now. This issue should be introduced by the: b34ea31fe013 ("iommu/mediatek: Always enable the clk on resume") Hi, reverting this commit didn't solve those warnings, I think this is because in the function mtk_iommu_attach_device the first call to pm_runtime_resume_and_get does not turn the clks on since m4u_dom is not yet initialize. And then mtk_iommu_attach_device calls pm_runtime_put right after mtk_iommu_hw_init is called (where the clks are turned on) thanks, Dafna tlb failed due to the bclk is not enabled. Could you help try that after reverting this? Sebastian Reichel suggested to remove the 'if(has_pm)' check to avoid this warning, and avoid flushing the tlb if the device is off: [1] http://ix.io/3Eyr This fixes the warning, but then the tlb is not flushed in sync, Therefore the tlb should be flushed when the device is resumed. So the two patches (the one suggested in the link [1] and this patch) should be sent together as a 2-patch series. then this is reasonable. You could help this into a new patchset if you are free(add Fixes tag). Thanks. Thanks, Dafna Thanks. [1] https://elixir.bootlin.com/linux/v5.15/source/drivers/iommu/mtk_iommu.c#L236 I can resend the patch on your behalf if you want. Thanks, Dafna On 23.09.21 14:58, Yong Wu wrote: Prepare for 2 HWs that sharing pgtable in different power- domains. When there are 2 M4U HWs, it may has problem in the flush_range in which we get the pm_status via the m4u dev, BUT that function don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. The function dma_alloc_attrs help allocate the iommu buffer which need the corresponding power domain since tlb flush is needed when preparing iova. BUT this function only is for allocating buffer, we have no good reason to request the user always call pm_runtime_get before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all in the pm_runtime_resume to make sure the tlb always is clean. Another solution is always call pm_runtime_get in the tlb_flush_range. This will trigger pm runtime resume/backup so often when the iommu power is not active at some time(means user don't call pm_runtime_get before calling dma_alloc_xxx), This may cause the performance drop. thus we don't use this. In other case, the iommu's power should always be active via device link with smi. The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's power-domain which nearly always is enabled. thus no need fix tags here. Prepare for mt8195. Signed-off-by: Yong Wu --- drivers/iommu/mtk_iommu.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 44cf5547d084..e9e94944ed91 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -984,6 +984,17 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) return ret; } + /* +* Users may allocate dma buffer before they call pm_runtime_get, then +* it will lack the necessary tlb flush. +* +* We have no good reason to request the users always call dma_alloc_xx +* after pm_runtime_get_sync. +* +* Thus, Make sure the tlb always is clean after each PM resume. +*/ + mtk_iommu_tlb_do_flush_all(data); + /* * Uppon first resume, only enable the clk and return, since the values of the * registers are not yet set. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume
On 22.11.21 09:05, Yong Wu wrote: Hi Dafna, On Wed, 2021-11-10 at 15:50 +0800, Yong Wu wrote: On Wed, 2021-11-10 at 07:29 +0200, Dafna Hirschfeld wrote: On 10.11.21 04:20, Yong Wu wrote: On Tue, 2021-11-09 at 14:21 +0200, Dafna Hirschfeld wrote: Hi This patch is needed in order to update the tlb when a device is powered on. Could you send this patch alone without the whole series so it get accepted easier? Which SoC are you testing on? In previous SoC, the IOMMU HW don't have power-domain, and we have a "has_pm"[1] in the tlb function for that case. The "has_pm" should be always 0 for the previous SoC like mt8173, it should always tlb synchronize. thus, Could you help share more about your issue? In which case it lack the necessary tlb operation. At least, We need confirm if it needs a "Fixes" tags if sending this patch alone. Hi, I work with the mtk-vcodec driver on mt8173. As you wrote, the iommu doesn't have a power-domain and so when allocating buffers before the device is powered on, there is the warning "Partial TLB flush timed out, falling back to full flush" flooding the log buf. oh. Thanks very much for your information. Get it now. This issue should be introduced by the: b34ea31fe013 ("iommu/mediatek: Always enable the clk on resume") tlb failed due to the bclk is not enabled. Could you help try that after reverting this? Sebastian Reichel suggested to remove the 'if(has_pm)' check to avoid this warning, and avoid flushing the tlb if the device is off: [1] http://ix.io/3Eyr This fixes the warning, but then the tlb is not flushed in sync, Therefore the tlb should be flushed when the device is resumed. So the two patches (the one suggested in the link [1] and this patch) should be sent together as a 2-patch series. then this is reasonable. You could help this into a new patchset if you are free(add Fixes tag). Thanks. Thanks, Dafna Thanks. [1] https://elixir.bootlin.com/linux/v5.15/source/drivers/iommu/mtk_iommu.c#L236 I can resend the patch on your behalf if you want. What's your plan here? If you send the two as a independent patchset on v5.16, I will rebase yours. If you have no time for this, I could help this. Hi, just sent it in the patchset ` iommu/mediatek: fix tlb flush logic` Thanks, Dafna Thanks. Thanks, Dafna On 23.09.21 14:58, Yong Wu wrote: Prepare for 2 HWs that sharing pgtable in different power- domains. When there are 2 M4U HWs, it may has problem in the flush_range in which we get the pm_status via the m4u dev, BUT that function don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. The function dma_alloc_attrs help allocate the iommu buffer which need the corresponding power domain since tlb flush is needed when preparing iova. BUT this function only is for allocating buffer, we have no good reason to request the user always call pm_runtime_get before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all in the pm_runtime_resume to make sure the tlb always is clean. Another solution is always call pm_runtime_get in the tlb_flush_range. This will trigger pm runtime resume/backup so often when the iommu power is not active at some time(means user don't call pm_runtime_get before calling dma_alloc_xxx), This may cause the performance drop. thus we don't use this. In other case, the iommu's power should always be active via device link with smi. The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's power-domain which nearly always is enabled. thus no need fix tags here. Prepare for mt8195. Signed-off-by: Yong Wu --- drivers/iommu/mtk_iommu.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 44cf5547d084..e9e94944ed91 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -984,6 +984,17 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) return ret; } + /* +* Users may allocate dma buffer before they call pm_runtime_get, then +* it will lack the necessary tlb flush. +* +* We have no good reason to request the users always call dma_alloc_xx +* after pm_runtime_get_sync. +* +* Thus, Make sure the tlb always is clean after each PM resume. +*/ + mtk_iommu_tlb_do_flush_all(data); + /* * Uppon first resume, only enable the clk and return, since the values of the * registers are not yet set. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 0/2] iommu/mediatek: fix tlb flush logic
Often devices allocate dma buffers before they do runtime pm resume. This is the case for example with v4l2 devices where buffers are allocated during 'VIDIOC_REQBUFS` and runtime resume happens later usually during 'VIDIOC_STREAMON'. In such cases the partial tlb flush when allocating will fail since the the iommu is runtime suspended. This will print a warning and try to do full flush. But there is actually no need to flush the tlb before the consumer device is turned on. Fix the warning by skipping parital flush when allocating and instead do full flash in runtime resume This patchset is a combination of a patch already sent in a different patchset: [1] and a warning fix from Sebastian Reichel [1] https://lore.kernel.org/linux-devicetree/20210923115840.17813-13-yong...@mediatek.com/ Sebastian Reichel (1): iommu/mediatek: always check runtime PM status in tlb flush range callback Yong Wu (1): iommu/mediatek: Always tlb_flush_all when each PM resume drivers/iommu/mtk_iommu.c | 17 ++--- 1 file changed, 10 insertions(+), 7 deletions(-) -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 1/2] iommu/mediatek: Always tlb_flush_all when each PM resume
From: Yong Wu Prepare for 2 HWs that sharing pgtable in different power-domains. When there are 2 M4U HWs, it may has problem in the flush_range in which we get the pm_status via the m4u dev, BUT that function don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. The function dma_alloc_attrs help allocate the iommu buffer which need the corresponding power domain since tlb flush is needed when preparing iova. BUT this function only is for allocating buffer, we have no good reason to request the user always call pm_runtime_get before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all in the pm_runtime_resume to make sure the tlb always is clean. Another solution is always call pm_runtime_get in the tlb_flush_range. This will trigger pm runtime resume/backup so often when the iommu power is not active at some time(means user don't call pm_runtime_get before calling dma_alloc_xxx), This may cause the performance drop. thus we don't use this. In other case, the iommu's power should always be active via device link with smi. The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's power-domain which nearly always is enabled. thus no need fix tags here. Prepare for mt8195. Signed-off-by: Yong Wu [imporvie inline doc] Signed-off-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 7 +++ 1 file changed, 7 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 25b834104790..28dc4b95b6d9 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -964,6 +964,13 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) return ret; } + /* +* Users may allocate dma buffer before they call pm_runtime_get, +* in which case it will lack the necessary tlb flush. +* Thus, make sure to update the tlb after each PM resume. +*/ + mtk_iommu_tlb_flush_all(data); + /* * Uppon first resume, only enable the clk and return, since the values of the * registers are not yet set. -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH 2/2] iommu/mediatek: always check runtime PM status in tlb flush range callback
From: Sebastian Reichel In case of v4l2_reqbufs() it is possible, that a TLB flush is done without runtime PM being enabled. In that case the "Partial TLB flush timed out, falling back to full flush" warning is printed. Commit c0b57581b73b ("iommu/mediatek: Add power-domain operation") introduced has_pm as optimization to avoid checking runtime PM when there is no power domain attached. But without the PM domain there is still the device driver's runtime PM suspend handler, which disables the clock. Thus flushing should also be avoided when there is no PM domain involved. Signed-off-by: Sebastian Reichel Reviewed-by: Dafna Hirschfeld --- drivers/iommu/mtk_iommu.c | 10 +++--- 1 file changed, 3 insertions(+), 7 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 28dc4b95b6d9..b0535fcfd1d7 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -227,16 +227,13 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, size_t granule, struct mtk_iommu_data *data) { - bool has_pm = !!data->dev->pm_domain; unsigned long flags; int ret; u32 tmp; for_each_m4u(data) { - if (has_pm) { - if (pm_runtime_get_if_in_use(data->dev) <= 0) - continue; - } + if (pm_runtime_get_if_in_use(data->dev) <= 0) + continue; spin_lock_irqsave(>tlb_lock, flags); writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, @@ -261,8 +258,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, writel_relaxed(0, data->base + REG_MMU_CPE_DONE); spin_unlock_irqrestore(>tlb_lock, flags); - if (has_pm) - pm_runtime_put(data->dev); + pm_runtime_put(data->dev); } } -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume
On 10.11.21 04:20, Yong Wu wrote: On Tue, 2021-11-09 at 14:21 +0200, Dafna Hirschfeld wrote: Hi This patch is needed in order to update the tlb when a device is powered on. Could you send this patch alone without the whole series so it get accepted easier? Which SoC are you testing on? In previous SoC, the IOMMU HW don't have power-domain, and we have a "has_pm"[1] in the tlb function for that case. The "has_pm" should be always 0 for the previous SoC like mt8173, it should always tlb synchronize. thus, Could you help share more about your issue? In which case it lack the necessary tlb operation. At least, We need confirm if it needs a "Fixes" tags if sending this patch alone. Hi, I work with the mtk-vcodec driver on mt8173. As you wrote, the iommu doesn't have a power-domain and so when allocating buffers before the device is powered on, there is the warning "Partial TLB flush timed out, falling back to full flush" flooding the log buf. Sebastian Reichel suggested to remove the 'if(has_pm)' check to avoid this warning, and avoid flushing the tlb if the device is off: [1] http://ix.io/3Eyr This fixes the warning, but then the tlb is not flushed in sync, Therefore the tlb should be flushed when the device is resumed. So the two patches (the one suggested in the link [1] and this patch) should be sent together as a 2-patch series. Thanks, Dafna Thanks. [1] https://elixir.bootlin.com/linux/v5.15/source/drivers/iommu/mtk_iommu.c#L236 I can resend the patch on your behalf if you want. Thanks, Dafna On 23.09.21 14:58, Yong Wu wrote: Prepare for 2 HWs that sharing pgtable in different power-domains. When there are 2 M4U HWs, it may has problem in the flush_range in which we get the pm_status via the m4u dev, BUT that function don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. The function dma_alloc_attrs help allocate the iommu buffer which need the corresponding power domain since tlb flush is needed when preparing iova. BUT this function only is for allocating buffer, we have no good reason to request the user always call pm_runtime_get before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all in the pm_runtime_resume to make sure the tlb always is clean. Another solution is always call pm_runtime_get in the tlb_flush_range. This will trigger pm runtime resume/backup so often when the iommu power is not active at some time(means user don't call pm_runtime_get before calling dma_alloc_xxx), This may cause the performance drop. thus we don't use this. In other case, the iommu's power should always be active via device link with smi. The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's power-domain which nearly always is enabled. thus no need fix tags here. Prepare for mt8195. Signed-off-by: Yong Wu --- drivers/iommu/mtk_iommu.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 44cf5547d084..e9e94944ed91 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -984,6 +984,17 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) return ret; } + /* +* Users may allocate dma buffer before they call pm_runtime_get, then +* it will lack the necessary tlb flush. +* +* We have no good reason to request the users always call dma_alloc_xx +* after pm_runtime_get_sync. +* +* Thus, Make sure the tlb always is clean after each PM resume. +*/ + mtk_iommu_tlb_do_flush_all(data); + /* * Uppon first resume, only enable the clk and return, since the values of the * registers are not yet set. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 12/33] iommu/mediatek: Always tlb_flush_all when each PM resume
Hi This patch is needed in order to update the tlb when a device is powered on. Could you send this patch alone without the whole series so it get accepted easier? I can resend the patch on your behalf if you want. Thanks, Dafna On 23.09.21 14:58, Yong Wu wrote: Prepare for 2 HWs that sharing pgtable in different power-domains. When there are 2 M4U HWs, it may has problem in the flush_range in which we get the pm_status via the m4u dev, BUT that function don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. The function dma_alloc_attrs help allocate the iommu buffer which need the corresponding power domain since tlb flush is needed when preparing iova. BUT this function only is for allocating buffer, we have no good reason to request the user always call pm_runtime_get before calling dma_alloc_xxx. Therefore, we add a tlb_flush_all in the pm_runtime_resume to make sure the tlb always is clean. Another solution is always call pm_runtime_get in the tlb_flush_range. This will trigger pm runtime resume/backup so often when the iommu power is not active at some time(means user don't call pm_runtime_get before calling dma_alloc_xxx), This may cause the performance drop. thus we don't use this. In other case, the iommu's power should always be active via device link with smi. The previous SoC don't have PM except mt8192. the mt8192 IOMMU is display's power-domain which nearly always is enabled. thus no need fix tags here. Prepare for mt8195. Signed-off-by: Yong Wu --- drivers/iommu/mtk_iommu.c | 11 +++ 1 file changed, 11 insertions(+) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index 44cf5547d084..e9e94944ed91 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -984,6 +984,17 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) return ret; } + /* +* Users may allocate dma buffer before they call pm_runtime_get, then +* it will lack the necessary tlb flush. +* +* We have no good reason to request the users always call dma_alloc_xx +* after pm_runtime_get_sync. +* +* Thus, Make sure the tlb always is clean after each PM resume. +*/ + mtk_iommu_tlb_do_flush_all(data); + /* * Uppon first resume, only enable the clk and return, since the values of the * registers are not yet set. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu: log iova range in map/unmap trace events
In case of an iommu page fault, the faulting iova is logged in trace_io_page_fault. It is therefore convenient to log the iova range in mapping/unmapping trace events so that it is easier to see if the faulting iova was recently in any of those ranges. Signed-off-by: Dafna Hirschfeld --- include/trace/events/iommu.h | 10 ++ 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/include/trace/events/iommu.h b/include/trace/events/iommu.h index 72b4582322ff..29096fe12623 100644 --- a/include/trace/events/iommu.h +++ b/include/trace/events/iommu.h @@ -101,8 +101,9 @@ TRACE_EVENT(map, __entry->size = size; ), - TP_printk("IOMMU: iova=0x%016llx paddr=0x%016llx size=%zu", - __entry->iova, __entry->paddr, __entry->size + TP_printk("IOMMU: iova=0x%016llx - 0x%016llx paddr=0x%016llx size=%zu", + __entry->iova, __entry->iova + __entry->size, __entry->paddr, + __entry->size ) ); @@ -124,8 +125,9 @@ TRACE_EVENT(unmap, __entry->unmapped_size = unmapped_size; ), - TP_printk("IOMMU: iova=0x%016llx size=%zu unmapped_size=%zu", - __entry->iova, __entry->size, __entry->unmapped_size + TP_printk("IOMMU: iova=0x%016llx - 0x%016llx size=%zu unmapped_size=%zu", + __entry->iova, __entry->iova + __entry->size, + __entry->size, __entry->unmapped_size ) ); -- 2.17.1 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 13/33] iommu/mediatek: Remove the power status checking in tlb flush all
Hi On 23.09.21 13:58, Yong Wu wrote: To simplify the code, Remove the power status checking in the tlb_flush_all, remove this: if (pm_runtime_get_if_in_use(data->dev) <= 0) continue; After this patch, the mtk_iommu_tlb_flush_all will be called from a) isr b) pm runtime resume callback c) tlb flush range fail case d) iommu_create_device_direct_mappings -> iommu_flush_iotlb_all In first three cases, the power and clock always are enabled; d) is direct Regarding case "c) tlb flush range fail case", I found out that this often happens when the iommu is used while it is runtime suspended. For example the mtk-vcodec encoder driver calls "pm_runtime_resume_and_get" only when it starts streaming but buffers allocation is done in 'v4l2_reqbufs' before "pm_runtime_resume_and_get" is called and then I see the warning "Partial TLB flush timed out, falling back to full flush" I am not sure how to fix that issue, but it seems that case 'c)' might indicate that power and clock are actually not enabled. mapping, the tlb flush is unnecessay since we already have tlb_flush_all in the pm_runtime_resume callback. When the iommu's power status is changed to active, the tlb always is clean. In addition, there still are 2 reasons that don't add PM status checking in the tlb flush all: a) Write tlb flush all register also is ok even though the HW has no power and clocks. Write ignore. b) pm_runtime_get_if_in_use(m4udev) is 0 when the tlb_flush_all is called frm pm_runtime_resume cb. From this point, we can not add this code above in this tlb_flush_all. Signed-off-by: Yong Wu --- drivers/iommu/mtk_iommu.c | 20 +++- 1 file changed, 7 insertions(+), 13 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index e9e94944ed91..4a33b6c6b1db 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -204,10 +204,14 @@ static struct mtk_iommu_domain *to_mtk_domain(struct iommu_domain *dom) return container_of(dom, struct mtk_iommu_domain, domain); } -static void mtk_iommu_tlb_do_flush_all(struct mtk_iommu_data *data) +static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data) { unsigned long flags; + /* +* No need get power status since the HW PM status nearly is active +* when entering here. +*/ spin_lock_irqsave(>tlb_lock, flags); writel_relaxed(F_INVLD_EN1 | F_INVLD_EN0, data->base + data->plat_data->inv_sel_reg); @@ -216,16 +220,6 @@ static void mtk_iommu_tlb_do_flush_all(struct mtk_iommu_data *data) spin_unlock_irqrestore(>tlb_lock, flags); } -static void mtk_iommu_tlb_flush_all(struct mtk_iommu_data *data) -{ - if (pm_runtime_get_if_in_use(data->dev) <= 0) - return; - - mtk_iommu_tlb_do_flush_all(data); - - pm_runtime_put(data->dev); -} - static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, struct mtk_iommu_data *data) { @@ -263,7 +257,7 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, if (ret) { dev_warn(data->dev, "Partial TLB flush timed out, falling back to full flush\n"); - mtk_iommu_tlb_do_flush_all(data); + mtk_iommu_tlb_flush_all(data); } if (has_pm) @@ -993,7 +987,7 @@ static int __maybe_unused mtk_iommu_runtime_resume(struct device *dev) * * Thus, Make sure the tlb always is clean after each PM resume. */ - mtk_iommu_tlb_do_flush_all(data); + mtk_iommu_tlb_flush_all(data); /* * Uppon first resume, only enable the clk and return, since the values of the ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 04/12] iommu/mediatek: Add device_link between the consumer and the larb devices
On 16.10.21 04:23, Yong Wu wrote: On Mon, 2021-10-11 at 14:36 +0200, Dafna Hirschfeld wrote: On 29.09.21 03:37, Yong Wu wrote: MediaTek IOMMU-SMI diagram is like below. all the consumer connect with smi-larb, then connect with smi-common. M4U | smi-common | - | |... | | larb1 larb2 | | vdec venc When the consumer works, it should enable the smi-larb's power which also need enable the smi-common's power firstly. Thus, First of all, use the device link connect the consumer and the smi-larbs. then add device link between the smi-larb and smi- common. This patch adds device_link between the consumer and the larbs. When device_link_add, I add the flag DL_FLAG_STATELESS to avoid calling pm_runtime_xx to keep the original status of clocks. It can avoid two issues: 1) Display HW show fastlogo abnormally reported in [1]. At the beggining, all the clocks are enabled before entering kernel, but the clocks for display HW(always in larb0) will be gated after clk_enable and clk_disable called from device_link_add(->pm_runtime_resume) and rpm_idle. The clock operation happened before display driver probe. At that time, the display HW will be abnormal. 2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip pm_runtime_xx to avoid the deadlock. Corresponding, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, then device_link_removed should be added explicitly. [1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/ [2] https://lore.kernel.org/patchwork/patch/1086569/ Suggested-by: Tomasz Figa Signed-off-by: Yong Wu Tested-by: Frank Wunderlich # BPI- R2/MT7623 --- drivers/iommu/mtk_iommu.c| 22 ++ drivers/iommu/mtk_iommu_v1.c | 20 +++- 2 files changed, 41 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index d5848f78a677..a2fa55899434 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -560,22 +560,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct mtk_iommu_data *data; + struct device_link *link; + struct device *larbdev; + unsigned int larbid; if (!fwspec || fwspec->ops != _iommu_ops) return ERR_PTR(-ENODEV); /* Not a iommu client device */ data = dev_iommu_priv_get(dev); + /* +* Link the consumer device with the smi-larb device(supplier) +* The device in each a larb is a independent HW. thus only link +* one larb here. +*/ + larbid = MTK_M4U_TO_LARB(fwspec->ids[0]); so larbid is always the same for all the ids of a device? Yes. For me, each a dtsi node should represent a HW unit which can only connect one larb. If so maybe it worth testing it and return error if this is not the case. Thanks for the suggestion. This is very helpful. I did see someone put the different larbs in one node. I will check this, and add return I am working on bugs found on media drivers, could you please point me to that wrong node? Will you send a fix to that node in the dtsi? Thanks, Dafna EINVAL for this case. Thanks, Dafna ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 04/12] iommu/mediatek: Add device_link between the consumer and the larb devices
On 29.09.21 03:37, Yong Wu wrote: MediaTek IOMMU-SMI diagram is like below. all the consumer connect with smi-larb, then connect with smi-common. M4U | smi-common | - | |... | | larb1 larb2 | | vdec venc When the consumer works, it should enable the smi-larb's power which also need enable the smi-common's power firstly. Thus, First of all, use the device link connect the consumer and the smi-larbs. then add device link between the smi-larb and smi-common. This patch adds device_link between the consumer and the larbs. When device_link_add, I add the flag DL_FLAG_STATELESS to avoid calling pm_runtime_xx to keep the original status of clocks. It can avoid two issues: 1) Display HW show fastlogo abnormally reported in [1]. At the beggining, all the clocks are enabled before entering kernel, but the clocks for display HW(always in larb0) will be gated after clk_enable and clk_disable called from device_link_add(->pm_runtime_resume) and rpm_idle. The clock operation happened before display driver probe. At that time, the display HW will be abnormal. 2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip pm_runtime_xx to avoid the deadlock. Corresponding, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, then device_link_removed should be added explicitly. [1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/ [2] https://lore.kernel.org/patchwork/patch/1086569/ Suggested-by: Tomasz Figa Signed-off-by: Yong Wu Tested-by: Frank Wunderlich # BPI-R2/MT7623 --- drivers/iommu/mtk_iommu.c| 22 ++ drivers/iommu/mtk_iommu_v1.c | 20 +++- 2 files changed, 41 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index d5848f78a677..a2fa55899434 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -560,22 +560,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct mtk_iommu_data *data; + struct device_link *link; + struct device *larbdev; + unsigned int larbid; if (!fwspec || fwspec->ops != _iommu_ops) return ERR_PTR(-ENODEV); /* Not a iommu client device */ data = dev_iommu_priv_get(dev); + /* +* Link the consumer device with the smi-larb device(supplier) +* The device in each a larb is a independent HW. thus only link +* one larb here. +*/ + larbid = MTK_M4U_TO_LARB(fwspec->ids[0]); so larbid is always the same for all the ids of a device? If so maybe it worth testing it and return error if this is not the case. Thanks, Dafna + larbdev = data->larb_imu[larbid].dev; + link = device_link_add(dev, larbdev, + DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS); + if (!link) + dev_err(dev, "Unable to link %s\n", dev_name(larbdev)); return >iommu; } static void mtk_iommu_release_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); + struct mtk_iommu_data *data; + struct device *larbdev; + unsigned int larbid; if (!fwspec || fwspec->ops != _iommu_ops) return; + data = dev_iommu_priv_get(dev); + larbid = MTK_M4U_TO_LARB(fwspec->ids[0]); + larbdev = data->larb_imu[larbid].dev; + device_link_remove(dev, larbdev); + iommu_fwspec_free(dev); } diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c index 4d7809432239..e6f13459470e 100644 --- a/drivers/iommu/mtk_iommu_v1.c +++ b/drivers/iommu/mtk_iommu_v1.c @@ -423,7 +423,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev) struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct of_phandle_args iommu_spec; struct mtk_iommu_data *data; - int err, idx = 0; + int err, idx = 0, larbid; + struct device_link *link; + struct device *larbdev; /* * In the deferred case, free the existed fwspec. @@ -453,6 +455,14 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev) data = dev_iommu_priv_get(dev); + /* Link the consumer device with the smi-larb device(supplier) */ + larbid = mt2701_m4u_to_larb(fwspec->ids[0]); + larbdev = data->larb_imu[larbid].dev; + link = device_link_add(dev, larbdev, + DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS); + if (!link) + dev_err(dev, "Unable to link %s\n", dev_name(larbdev)); + return >iommu; } @@ -473,10 +483,18 @@ static void mtk_iommu_probe_finalize(struct device *dev) static void mtk_iommu_release_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); + struct
Re: [PATCH v2 11/29] iommu/mediatek: Always pm_runtime_get while tlb flush
On 13.08.21 08:53, Yong Wu wrote: Prepare for 2 HWs that sharing pgtable in different power-domains. The previous SoC don't have PM. Only mt8192 has power-domain, and it is display's power-domain which nearly always is enabled. hi, I see that in mt1873.dtsi, many devices that uses the iommu have the 'power-domains' property. When there are 2 M4U HWs, it may has problem. In this function, we get the pm_status via the m4u dev, but it don't reflect the real power-domain status of the HW since there may be other HW also use that power-domain. Currently we could not get the real power-domain status, thus always pm_runtime_get here. Prepare for mt8195, thus, no need fix tags here. This patch may drop the performance, we expect the user could pm_runtime_get_sync before dma_alloc_attrs which need tlb ops. Could you explain this sentence a bit? should the user call pm_runtime_get_sync before calling dma_alloc_attrs? Thanks, Dafna Signed-off-by: Yong Wu --- drivers/iommu/mtk_iommu.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index add23a36a5e2..abc721a1da21 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -238,8 +238,11 @@ static void mtk_iommu_tlb_flush_range_sync(unsigned long iova, size_t size, for_each_m4u(data, head) { if (has_pm) { - if (pm_runtime_get_if_in_use(data->dev) <= 0) + ret = pm_runtime_resume_and_get(data->dev); + if (ret < 0) { + dev_err(data->dev, "tlb flush: pm get fail %d.\n", ret); continue; + } } spin_lock_irqsave(>tlb_lock, flags); ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/12] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put
On 30.09.21 05:28, Yong Wu wrote: Hi Dafna, Thanks very much for the review. On Wed, 2021-09-29 at 14:13 +0200, Dafna Hirschfeld wrote: On 29.09.21 03:37, Yong Wu wrote: MediaTek IOMMU has already added the device_link between the consumer and smi-larb device. If the vcodec device call the pm_runtime_get_sync, the smi-larb's pm_runtime_get_sync also be called automatically. CC: Tiffany Lin CC: Irui Wang Signed-off-by: Yong Wu Reviewed-by: Evan Green Acked-by: Tiffany Lin Reviewed-by: Dafna Hirschfeld --- .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c | 37 +++--- -- .../platform/mtk-vcodec/mtk_vcodec_drv.h | 3 -- .../platform/mtk-vcodec/mtk_vcodec_enc.c | 1 - .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c | 44 +++--- - 4 files changed, 10 insertions(+), 75 deletions(-) diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c index 6038db96f71c..d0bf9aa3b29d 100644 --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c @@ -8,14 +8,12 @@ #include #include #include -#include #include "mtk_vcodec_dec_pm.h" #include "mtk_vcodec_util.h" int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) { - struct device_node *node; struct platform_device *pdev; struct mtk_vcodec_pm *pm; struct mtk_vcodec_clk *dec_clk; @@ -26,18 +24,7 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) pm = >pm; pm->mtkdev = mtkdev; dec_clk = >vdec_clk; - node = of_parse_phandle(pdev->dev.of_node, "mediatek,larb", 0); - if (!node) { - mtk_v4l2_err("of_parse_phandle mediatek,larb fail!"); - return -1; - } - pdev = of_find_device_by_node(node); - of_node_put(node); - if (WARN_ON(!pdev)) { - return -1; - } - pm->larbvdec = >dev; pdev = mtkdev->plat_dev; pm->dev = >dev; @@ -47,14 +34,11 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) dec_clk->clk_info = devm_kcalloc(>dev, dec_clk->clk_num, sizeof(*clk_info), GFP_KERNEL); - if (!dec_clk->clk_info) { - ret = -ENOMEM; - goto put_device; - } + if (!dec_clk->clk_info) + return -ENOMEM; } else { mtk_v4l2_err("Failed to get vdec clock count"); - ret = -EINVAL; - goto put_device; + return -EINVAL; } for (i = 0; i < dec_clk->clk_num; i++) { @@ -63,29 +47,24 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) "clock-names", i, _info->clk_name); if (ret) { mtk_v4l2_err("Failed to get clock name id = %d", i); - goto put_device; + return ret; } clk_info->vcodec_clk = devm_clk_get(>dev, clk_info->clk_name); if (IS_ERR(clk_info->vcodec_clk)) { mtk_v4l2_err("devm_clk_get (%d)%s fail", i, clk_info->clk_name); - ret = PTR_ERR(clk_info->vcodec_clk); - goto put_device; + return PTR_ERR(clk_info->vcodec_clk); } } pm_runtime_enable(>dev); return 0; -put_device: - put_device(pm->larbvdec); - return ret; } void mtk_vcodec_release_dec_pm(struct mtk_vcodec_dev *dev) { pm_runtime_disable(dev->pm.dev); - put_device(dev->pm.larbvdec); } Now that functions only do 'pm_runtime_disable(dev->pm.dev);' so it will be more readable to remove the function mtk_vcodec_release_dec_pm and replace with pm_runtime_disable(dev->pm.dev); Same for the 'enc' equivalent. Make sense. But It may be not proper if using pm_runtime_disable as the symmetry with mtk_vcodec_init_dec_pm in the mtk_vcodec_probe. Maybe we should move pm_runtime_enable out from mtk_vcodec_init_dec_pm into mtk_vcodec_probe. I could do a new patch for this. Is this ok for you? yes, there is also asymettry when calling pm_runtime* in general, I see in the decoder it is called from mtk_vcodec_dec_pm.c but in the encoder it is called from mtk_vcodec_enc.c, I think all calls to pm_runtime* should be out of the *_pm.c files since for example 'mtk_vcodec_dec_pw_on' also do just one call to pm_runtime_resume_and_get so this function can also be removed. thanks, Dafna Thanks, Dafna [snip] ___ Linux-m
Re: [PATCH v8 03/12] iommu/mediatek: Add probe_defer for smi-larb
On 29.09.21 03:37, Yong Wu wrote: Prepare for adding device_link. The iommu consumer should use device_link to connect with the smi-larb(supplier). then the smi-larb should run before the iommu consumer. Here we delay the iommu driver until the smi driver is ready, then all the iommu consumers always are after the smi driver. When there is no this patch, if some consumer drivers run before smi-larb, the supplier link_status is DL_DEV_NO_DRIVER(0) in the device_link_add, then device_links_driver_bound will use WARN_ON to complain that the link_status of supplier is not right. device_is_bound may be more elegant here. but it is not allowed to EXPORT from https://lore.kernel.org/patchwork/patch/1334670/. Signed-off-by: Yong Wu Tested-by: Frank Wunderlich # BPI-R2/MT7623 --- drivers/iommu/mtk_iommu.c| 2 +- drivers/iommu/mtk_iommu_v1.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index d837adfd1da5..d5848f78a677 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -844,7 +844,7 @@ static int mtk_iommu_probe(struct platform_device *pdev) id = i; plarbdev = of_find_device_by_node(larbnode); - if (!plarbdev) { + if (!plarbdev || !plarbdev->dev.driver) { of_node_put(larbnode); return -EPROBE_DEFER; if plarbdev is null doesn't that mean that the device does not exist? so we should return -ENODEV in that case? thanks, Dafna } diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c index 1467ba1e4417..4d7809432239 100644 --- a/drivers/iommu/mtk_iommu_v1.c +++ b/drivers/iommu/mtk_iommu_v1.c @@ -602,7 +602,7 @@ static int mtk_iommu_probe(struct platform_device *pdev) } plarbdev = of_find_device_by_node(larbnode); - if (!plarbdev) { + if (!plarbdev || !plarbdev->dev.driver) { of_node_put(larbnode); return -EPROBE_DEFER; } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v8 09/12] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put
On 29.09.21 03:37, Yong Wu wrote: MediaTek IOMMU has already added the device_link between the consumer and smi-larb device. If the vcodec device call the pm_runtime_get_sync, the smi-larb's pm_runtime_get_sync also be called automatically. CC: Tiffany Lin CC: Irui Wang Signed-off-by: Yong Wu Reviewed-by: Evan Green Acked-by: Tiffany Lin Reviewed-by: Dafna Hirschfeld --- .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c | 37 +++- .../platform/mtk-vcodec/mtk_vcodec_drv.h | 3 -- .../platform/mtk-vcodec/mtk_vcodec_enc.c | 1 - .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c | 44 +++ 4 files changed, 10 insertions(+), 75 deletions(-) diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c index 6038db96f71c..d0bf9aa3b29d 100644 --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c @@ -8,14 +8,12 @@ #include #include #include -#include #include "mtk_vcodec_dec_pm.h" #include "mtk_vcodec_util.h" int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) { - struct device_node *node; struct platform_device *pdev; struct mtk_vcodec_pm *pm; struct mtk_vcodec_clk *dec_clk; @@ -26,18 +24,7 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) pm = >pm; pm->mtkdev = mtkdev; dec_clk = >vdec_clk; - node = of_parse_phandle(pdev->dev.of_node, "mediatek,larb", 0); - if (!node) { - mtk_v4l2_err("of_parse_phandle mediatek,larb fail!"); - return -1; - } - pdev = of_find_device_by_node(node); - of_node_put(node); - if (WARN_ON(!pdev)) { - return -1; - } - pm->larbvdec = >dev; pdev = mtkdev->plat_dev; pm->dev = >dev; @@ -47,14 +34,11 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) dec_clk->clk_info = devm_kcalloc(>dev, dec_clk->clk_num, sizeof(*clk_info), GFP_KERNEL); - if (!dec_clk->clk_info) { - ret = -ENOMEM; - goto put_device; - } + if (!dec_clk->clk_info) + return -ENOMEM; } else { mtk_v4l2_err("Failed to get vdec clock count"); - ret = -EINVAL; - goto put_device; + return -EINVAL; } for (i = 0; i < dec_clk->clk_num; i++) { @@ -63,29 +47,24 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) "clock-names", i, _info->clk_name); if (ret) { mtk_v4l2_err("Failed to get clock name id = %d", i); - goto put_device; + return ret; } clk_info->vcodec_clk = devm_clk_get(>dev, clk_info->clk_name); if (IS_ERR(clk_info->vcodec_clk)) { mtk_v4l2_err("devm_clk_get (%d)%s fail", i, clk_info->clk_name); - ret = PTR_ERR(clk_info->vcodec_clk); - goto put_device; + return PTR_ERR(clk_info->vcodec_clk); } } pm_runtime_enable(>dev); return 0; -put_device: - put_device(pm->larbvdec); - return ret; } void mtk_vcodec_release_dec_pm(struct mtk_vcodec_dev *dev) { pm_runtime_disable(dev->pm.dev); - put_device(dev->pm.larbvdec); } Now that functions only do 'pm_runtime_disable(dev->pm.dev);' so it will be more readable to remove the function mtk_vcodec_release_dec_pm and replace with pm_runtime_disable(dev->pm.dev); Same for the 'enc' equivalent. Thanks, Dafna int mtk_vcodec_dec_pw_on(struct mtk_vcodec_pm *pm) @@ -122,11 +101,6 @@ void mtk_vcodec_dec_clock_on(struct mtk_vcodec_pm *pm) } } - ret = mtk_smi_larb_get(pm->larbvdec); - if (ret) { - mtk_v4l2_err("mtk_smi_larb_get larbvdec fail %d", ret); - goto error; - }> return; error: @@ -139,7 +113,6 @@ void mtk_vcodec_dec_clock_off(struct mtk_vcodec_pm *pm) struct mtk_vcodec_clk *dec_clk = >vdec_clk; int i = 0; - mtk_smi_larb_put(pm->larbvdec); for (i = dec_clk->clk_num - 1; i >= 0; i--) clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk); } diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h index c6c7672fecfb..64b73dd880ce 100644 --- a/drivers/media/platform/mtk-vcodec/mtk_v
Re: [PATCH v7 04/12] iommu/mediatek: Add device_link between the consumer and the larb devices
On 30.07.21 04:52, Yong Wu wrote: MediaTek IOMMU-SMI diagram is like below. all the consumer connect with smi-larb, then connect with smi-common. M4U | smi-common | - | |... | | larb1 larb2 | | vdec venc When the consumer works, it should enable the smi-larb's power which also need enable the smi-common's power firstly. Thus, First of all, use the device link connect the consumer and the smi-larbs. then add device link between the smi-larb and smi-common. This patch adds device_link between the consumer and the larbs. When device_link_add, I add the flag DL_FLAG_STATELESS to avoid calling pm_runtime_xx to keep the original status of clocks. It can avoid two issues: 1) Display HW show fastlogo abnormally reported in [1]. At the beggining, all the clocks are enabled before entering kernel, but the clocks for display HW(always in larb0) will be gated after clk_enable and clk_disable called from device_link_add(->pm_runtime_resume) and rpm_idle. The clock operation happened before display driver probe. At that time, the display HW will be abnormal. 2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip pm_runtime_xx to avoid the deadlock. Corresponding, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, then device_link_removed should be added explicitly. [1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/ [2] https://lore.kernel.org/patchwork/patch/1086569/ Suggested-by: Tomasz Figa Signed-off-by: Yong Wu Tested-by: Dafna Hirschfeld # on mt8173 Hi, unfortunately, I have to take back the Tested-by tag. I am now testing the mtk-vcodec with latest kernel + patches sent from the mailing list: https://gitlab.collabora.com/eballetbo/linux/-/commits/topic/chromeos/chromeos-5.14 which includes this patchset. On chromeos I open a video conference with googl-meet which cause the mtk-vcodec vp8 encoder to run. If I kill it with `killall -9 chrome` I get some page fault messages from the iommu: [ 837.255952] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.265696] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.282367] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.299028] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.315683] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.332345] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.349004] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.365665] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.382329] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read [ 837.42] mtk-iommu 10205000.iommu: fault type=0x5 iova=0xfcff0001 pa=0x0 larb=0 port=0 layer=1 read In addition, running the encoder tests from the shell: sudo --user=#1000 /usr/local/libexec/chrome-binary-tests/video_encode_accelerator_tests --gtest_filter=VideoEncoderTest.FlushAtEndOfStream_Multiple* --codec=vp8 /usr/local/share/tast/data/chromiumos/tast/local/bundles/cros/video/data/tulip2-320x180.yuv --disable_validator At some point it fails with the error [ 5472.161821] [MTK_V4L2][ERROR] mtk_vcodec_wait_for_done_ctx:32: [290] ctx->type=1, cmd=1, wait_event_interruptible_timeout time=1000ms out 0 0! [ 5472.174678] [MTK_VCODEC][ERROR][290]: vp8_enc_encode_frame() irq_status=0 failed [ 5472.182687] [MTK_V4L2][ERROR] mtk_venc_worker:1239: venc_if_encode failed=-5 If you have any idea of what might be the problem or how to debug? Thanks, Dafna --- drivers/iommu/mtk_iommu.c| 22 ++ drivers/iommu/mtk_iommu_v1.c | 20 +++- 2 files changed, 41 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index a02dde094788..ee742900cf4b 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -571,22 +571,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct mtk_iommu_data *data; + struct device_link *link; + struct device *larbdev; + unsigned int larbid; if (!fwspec || fwspec->ops != _iommu_ops) return ERR_PTR(-ENODEV); /* Not a iommu client device */ data = dev_iommu_priv_get(dev); + /* +* Link the consumer device with the smi-larb device(supplier) +* The device in each a larb is a independent HW. thus only link +* one larb here. +*/ + larbid = MTK_M4U_TO_LARB(fwspec->ids[0]); + larbdev = data->larb_
Re: [PATCH v7 3/4] iommu: rockchip: Add internal ops to handle variants
On 29.07.21 18:58, Robin Murphy wrote: On 2021-07-29 17:08, Heiko Stübner wrote: Hi Dafna, Am Donnerstag, 29. Juli 2021, 17:59:26 CEST schrieb Dafna Hirschfeld: On 25.05.21 14:15, Benjamin Gaignard wrote: @@ -879,7 +895,7 @@ static int rk_iommu_enable(struct rk_iommu *iommu) for (i = 0; i < iommu->num_mmu; i++) { rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, - rk_domain->dt_dma); + rk_ops->dma_addr_dte(rk_domain->dt_dma)); Hi, This is not related to that patch, I was wondring why are all mmu devices initialized with the same dt_dma? I see for example that the isp0_mmu in rk3399.dtsi has two resources. Can't each resource be initialized with different dt_dma and this way there are two dt tables instead of the two mmus pointing to the same dt table. maybe git log -1 cd6438c5f8446691afa4829fe1a9d7b656204f11 "iommu/rockchip: Reconstruct to support multi slaves There are some IPs, such as video encoder/decoder, contains 2 slave iommus, one for reading and the other for writing. They share the same irq and clock with master. This patch reconstructs to support this case by making them share the same Page Directory, Page Tables and even the register operations. That means every instruction to the reading MMU registers would be duplicated to the writing MMU and vice versa." Right. In theory we *could* maintain a separate pagetable for each IOMMU instance, but it would just lead to a load of complexity and overhead. For a map request, we'd have to do extra work to decide which table(s) need modifying, and duplicate all the work of the actual mapping if it's more than one. For an unmap request, we'd have no choice but to walk *all* the tables backing that domain to figure out which (if any) actually had it mapped in the first place. Given that we already have distinct read and write permissions for mappings within a single table, there's not even any functional benefit that could be gained in this case (and in the more general case where the device might emit all kinds of transactions from all its interfaces you'd have to maintain identical mappings for all its IOMMUs anyway). Saving memory and code complexity by physically sharing one pagetable and not worrying about trying to do selective TLB maintenance is a bigger win than anything else could be. Robin. Hi, I just try to understand how this iommu hardware/software works. I have two questions, 1. So we currently miss a potential mapping of the hardware right? I mean , each mmu can map 1024*1024*4K = 4G addresses, so two mmus can potentially map 8G. But since we set them to identical values, we can map only up to 4G. 2. What is the benefit of setting all mmus if they are all set to the same values? Can't we just work with the first mmu like it was done before that patch cd6438c5f8446691afa4829fe1a9d7b656204f11 Thanks, Dafna ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v7 3/4] iommu: rockchip: Add internal ops to handle variants
On 25.05.21 14:15, Benjamin Gaignard wrote: Add internal ops to be able to handle incoming variant v2. The goal is to keep the overall structure of the framework but to allow to add the evolution of this hardware block. The ops are global for a SoC because iommu domains are not attached to a specific devices if they are for a virtuel device like drm. Use a global variable shouldn't be since SoC usually doesn't embedded different versions of the iommu hardware block. If that happen one day a WARN_ON will be displayed at probe time. Signed-off-by: Benjamin Gaignard Reviewed-by: Heiko Stuebner --- version 7: - Set DMA mask - Add function to convert dma address to dte version 6: - Remove #include - Remove pt_address_mask field - Only use once of_device_get_match_data - Return an error if ops don't match version 5: - Use of_device_get_match_data() - Add internal ops inside the driver drivers/iommu/rockchip-iommu.c | 86 +- 1 file changed, 64 insertions(+), 22 deletions(-) diff --git a/drivers/iommu/rockchip-iommu.c b/drivers/iommu/rockchip-iommu.c index 7a2932772fdf..bd2cf7f08c71 100644 --- a/drivers/iommu/rockchip-iommu.c +++ b/drivers/iommu/rockchip-iommu.c @@ -96,6 +96,15 @@ static const char * const rk_iommu_clocks[] = { "aclk", "iface", }; +struct rk_iommu_ops { + phys_addr_t (*pt_address)(u32 dte); + u32 (*mk_dtentries)(dma_addr_t pt_dma); + u32 (*mk_ptentries)(phys_addr_t page, int prot); + phys_addr_t (*dte_addr_phys)(u32 addr); + u32 (*dma_addr_dte)(dma_addr_t dt_dma); + u64 dma_bit_mask; +}; + struct rk_iommu { struct device *dev; void __iomem **bases; @@ -116,6 +125,7 @@ struct rk_iommudata { }; static struct device *dma_dev; +static const struct rk_iommu_ops *rk_ops; static inline void rk_table_flush(struct rk_iommu_domain *dom, dma_addr_t dma, unsigned int count) @@ -215,11 +225,6 @@ static inline u32 rk_mk_dte(dma_addr_t pt_dma) #define RK_PTE_PAGE_READABLE BIT(1) #define RK_PTE_PAGE_VALID BIT(0) -static inline phys_addr_t rk_pte_page_address(u32 pte) -{ - return (phys_addr_t)pte & RK_PTE_PAGE_ADDRESS_MASK; -} - static inline bool rk_pte_is_page_valid(u32 pte) { return pte & RK_PTE_PAGE_VALID; @@ -448,10 +453,10 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) * and verifying that upper 5 nybbles are read back. */ for (i = 0; i < iommu->num_mmu; i++) { - rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, DTE_ADDR_DUMMY); + dte_addr = rk_ops->pt_address(DTE_ADDR_DUMMY); + rk_iommu_write(iommu->bases[i], RK_MMU_DTE_ADDR, dte_addr); - dte_addr = rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR); - if (dte_addr != (DTE_ADDR_DUMMY & RK_DTE_PT_ADDRESS_MASK)) { + if (dte_addr != rk_iommu_read(iommu->bases[i], RK_MMU_DTE_ADDR)) { dev_err(iommu->dev, "Error during raw reset. MMU_DTE_ADDR is not functioning\n"); return -EFAULT; } @@ -470,6 +475,16 @@ static int rk_iommu_force_reset(struct rk_iommu *iommu) return 0; } +static inline phys_addr_t rk_dte_addr_phys(u32 addr) +{ + return (phys_addr_t)addr; +} + +static inline u32 rk_dma_addr_dte(dma_addr_t dt_dma) +{ + return dt_dma; +} + static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) { void __iomem *base = iommu->bases[index]; @@ -489,7 +504,7 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) page_offset = rk_iova_page_offset(iova); mmu_dte_addr = rk_iommu_read(base, RK_MMU_DTE_ADDR); - mmu_dte_addr_phys = (phys_addr_t)mmu_dte_addr; + mmu_dte_addr_phys = rk_ops->dte_addr_phys(mmu_dte_addr); dte_addr_phys = mmu_dte_addr_phys + (4 * dte_index); dte_addr = phys_to_virt(dte_addr_phys); @@ -498,14 +513,14 @@ static void log_iova(struct rk_iommu *iommu, int index, dma_addr_t iova) if (!rk_dte_is_pt_valid(dte)) goto print_it; - pte_addr_phys = rk_dte_pt_address(dte) + (pte_index * 4); + pte_addr_phys = rk_ops->pt_address(dte) + (pte_index * 4); pte_addr = phys_to_virt(pte_addr_phys); pte = *pte_addr; if (!rk_pte_is_page_valid(pte)) goto print_it; - page_addr_phys = rk_pte_page_address(pte) + page_offset; + page_addr_phys = rk_ops->pt_address(pte) + page_offset; page_flags = pte & RK_PTE_PAGE_FLAGS_MASK; print_it: @@ -601,13 +616,13 @@ static phys_addr_t rk_iommu_iova_to_phys(struct iommu_domain *domain, if (!rk_dte_is_pt_valid(dte)) goto out; - pt_phys = rk_dte_pt_address(dte); + pt_phys = rk_ops->pt_address(dte); page_table = (u32 *)phys_to_virt(pt_phys); pte = page_table[rk_iova_pte_index(iova)];
Re: [PATCH v6 00/11] Clean up "mediatek,larb"
Hi Thanks for the patchset. I tested it on mt8173 (elm) with chromeos userspace. Before that patchset, the test: tast -verbose run -build=false 10.42.0.175 video.DecodeAccel.h264 sometimes passed and sometimes failed with 'context deadline exceeded'. With this patchset it seems that the test always passes so I added tested-by: Tested-by: Dafna Hirschfeld Thanks, Dafna On 14.07.21 04:56, Yong Wu wrote: MediaTek IOMMU block diagram always like below: M4U | smi-common | - | | ... | | larb1 larb2 | | vdec venc All the consumer connect with smi-larb, then connect with smi-common. When the consumer works, it should enable the smi-larb's power which also need enable the smi-common's power firstly. Thus, Firstly, use the device link connect the consumer and the smi-larbs. then add device link between the smi-larb and smi-common. After adding the device_link, then "mediatek,larb" property can be removed. the iommu consumer don't need call the mtk_smi_larb_get/put to enable the power and clock of smi-larb and smi-common. About the MM dt-binding/dtsi patches, I guess they should go together, thus I don't split them for each a MM module and each a SoC. Base on v5.14-rc1, and a jpeg[1] and mdp[2] patchset. [1] https://lore.kernel.org/linux-mediatek/20210702102304.3346429-1-hsi...@chromium.org/ [2] https://lore.kernel.org/linux-mediatek/20210709022324.1607884-1-ei...@chromium.org/ Change notes: v6: 1) rebase on v5.14-rc1. 2) Fix the issue commented in v5 from Dafna and Hsin-Yi. 3) Remove the patches about using pm_runtime_resume_and_get since they have already been merged by other patches. v5: https://lore.kernel.org/linux-mediatek/20210410091128.31823-1-yong...@mediatek.com/ 1) Base v5.12-rc2. 2) Remove changing the mtk-iommu to module_platform_driver patch, It have already been a independent patch. v4: https://lore.kernel.org/linux-mediatek/1590826218-23653-1-git-send-email-yong...@mediatek.com/ base on v5.7-rc1. 1) Move drm PM patch before smi patchs. 2) Change builtin_platform_driver to module_platform_driver since we may need build as module. 3) Rebase many patchset as above. v3: https://lore.kernel.org/linux-iommu/1567503456-24725-1-git-send-email-yong...@mediatek.com/ 1) rebase on v5.3-rc1 and the latest mt8183 patchset. 2) Use device_is_bound to check whether the driver is ready from Matthias. 3) Add DL_FLAG_STATELESS flag when calling device_link_add and explain the reason in the commit message[3/14]. 4) Add a display patch[12/14] into this series. otherwise it may affect display HW fastlogo even though it don't happen in mt8183. v2: https://lore.kernel.org/linux-iommu/1560171313-28299-1-git-send-email-yong...@mediatek.com/ 1) rebase on v5.2-rc1. 2) Move adding device_link between the consumer and smi-larb into iommu_add_device from Robin. 3) add DL_FLAG_AUTOREMOVE_CONSUMER even though the smi is built-in from Evan. 4) Remove the shutdown callback in iommu. v1: https://lore.kernel.org/linux-iommu/1546318276-18993-1-git-send-email-yong...@mediatek.com/ Yong Wu (10): dt-binding: mediatek: Get rid of mediatek,larb for multimedia HW iommu/mediatek: Add probe_defer for smi-larb iommu/mediatek: Add device_link between the consumer and the larb devices media: mtk-jpeg: Get rid of mtk_smi_larb_get/put media: mtk-mdp: Get rid of mtk_smi_larb_get/put drm/mediatek: Get rid of mtk_smi_larb_get/put media: mtk-vcodec: Get rid of mtk_smi_larb_get/put memory: mtk-smi: Get rid of mtk_smi_larb_get/put arm: dts: mediatek: Get rid of mediatek,larb for MM nodes arm64: dts: mediatek: Get rid of mediatek,larb for MM nodes Yongqiang Niu (1): drm/mediatek: Add pm runtime support for ovl and rdma .../display/mediatek/mediatek,disp.txt| 9 .../bindings/media/mediatek-jpeg-decoder.yaml | 9 .../bindings/media/mediatek-jpeg-encoder.yaml | 9 .../bindings/media/mediatek-mdp.txt | 8 .../bindings/media/mediatek-vcodec.txt| 4 -- arch/arm/boot/dts/mt2701.dtsi | 2 - arch/arm/boot/dts/mt7623n.dtsi| 5 -- arch/arm64/boot/dts/mediatek/mt8173.dtsi | 16 --- arch/arm64/boot/dts/mediatek/mt8183.dtsi | 6 --- drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 9 +++- drivers/gpu/drm/mediatek/mtk_disp_rdma.c | 9 +++- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 19 drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 36 +-- drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 1 - drivers/gpu/drm/mediatek/mtk_drm_drv.c| 5 +- drivers/iommu/mtk_iommu.c | 24 +- drivers/iommu/mtk_iommu_v1.c | 22 - .../media/platform/mtk-jpeg/mtk_jpeg_core.c | 45 +- .../medi
Re: [PATCH v6 01/11] dt-binding: mediatek: Get rid of mediatek, larb for multimedia HW
On 14.07.21 10:13, Dafna Hirschfeld wrote: Hi, thanks for the patch On 14.07.21 04:56, Yong Wu wrote: After adding device_link between the consumer with the smi-larbs, if the consumer call its owner pm_runtime_get(_sync), the pm_runtime_get(_sync) of smi-larb and smi-common will be called automatically. Thus, the consumer don't need the property. And IOMMU also know which larb this consumer connects with from iommu id in the "iommus=" property. Signed-off-by: Yong Wu Reviewed-by: Rob Herring Reviewed-by: Evan Green --- .../bindings/display/mediatek/mediatek,disp.txt | 9 - .../devicetree/bindings/media/mediatek-jpeg-decoder.yaml | 9 - .../devicetree/bindings/media/mediatek-jpeg-encoder.yaml | 9 - On which repo are these patches based on ? In linux-next the file mediatek-jpeg-encoder.yaml don't exist Thanks, Dafna sorry, I see you reference the patch that convert to yaml in the cover letter. Thanks, Dafna Documentation/devicetree/bindings/media/mediatek-mdp.txt | 8 .../devicetree/bindings/media/mediatek-vcodec.txt | 4 5 files changed, 39 deletions(-) diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt index fbb59c9ddda6..867bd82e2f03 100644 --- a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt +++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt @@ -61,8 +61,6 @@ Required properties (DMA function blocks): "mediatek,-disp-rdma" "mediatek,-disp-wdma" the supported chips are mt2701, mt8167 and mt8173. -- larb: Should contain a phandle pointing to the local arbiter device as defined - in Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml - iommus: Should point to the respective IOMMU block with master port as argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml for details. @@ -91,7 +89,6 @@ ovl0: ovl@1400c000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_OVL0>; iommus = < M4U_PORT_DISP_OVL0>; - mediatek,larb = <>; }; ovl1: ovl@1400d000 { @@ -101,7 +98,6 @@ ovl1: ovl@1400d000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_OVL1>; iommus = < M4U_PORT_DISP_OVL1>; - mediatek,larb = <>; }; rdma0: rdma@1400e000 { @@ -111,7 +107,6 @@ rdma0: rdma@1400e000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_RDMA0>; iommus = < M4U_PORT_DISP_RDMA0>; - mediatek,larb = <>; mediatek,rdma-fifosize = <8192>; }; @@ -122,7 +117,6 @@ rdma1: rdma@1400f000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_RDMA1>; iommus = < M4U_PORT_DISP_RDMA1>; - mediatek,larb = <>; }; rdma2: rdma@1401 { @@ -132,7 +126,6 @@ rdma2: rdma@1401 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_RDMA2>; iommus = < M4U_PORT_DISP_RDMA2>; - mediatek,larb = <>; }; wdma0: wdma@14011000 { @@ -142,7 +135,6 @@ wdma0: wdma@14011000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_WDMA0>; iommus = < M4U_PORT_DISP_WDMA0>; - mediatek,larb = <>; }; wdma1: wdma@14012000 { @@ -152,7 +144,6 @@ wdma1: wdma@14012000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_WDMA1>; iommus = < M4U_PORT_DISP_WDMA1>; - mediatek,larb = <>; }; color0: color@14013000 { diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml index 9b87f036f178..052e752157b4 100644 --- a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml +++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml @@ -42,13 +42,6 @@ properties: power-domains: maxItems: 1 - mediatek,larb: - $ref: '/schemas/types.yaml#/definitions/phandle' - description: | - Must contain the local arbiters in the current Socs, see - Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml - for details. - iommus: maxItems: 2 description: | @@ -63,7 +56,6 @@ required: - clocks - clock-names - power-domains - - mediatek,larb - iommus additionalProperties: false @@ -83,7 +75,6 @@ examples: clock-names = "jpgdec-smi", "jpgdec"; power-domains = < MT2701_POWER_DOMAIN_ISP>; - mediatek,larb = <>; iommus = < MT2701_M4U_PORT_JPGDEC_WDMA>, < MT2701_M4U_PORT_JPGDEC_BSDMA>; }; diff --git a/Documentation/devicet
Re: [PATCH v6 06/11] drm/mediatek: Add pm runtime support for ovl and rdma
On 14.07.21 04:56, Yong Wu wrote: From: Yongqiang Niu Prepare for smi cleaning up "mediatek,larb". Display use the dispsys device to call pm_rumtime_get_sync before. This patch add pm_runtime_xx with ovl and rdma device whose nodes has "iommus" property, then display could help pm_runtime_get for smi via ovl or rdma device. CC: CK Hu Signed-off-by: Yongqiang Niu Signed-off-by: Yong Wu (Yong: Use pm_runtime_resume_and_get instead of pm_runtime_get_sync) Acked-by: Chun-Kuang Hu --- drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 9 - drivers/gpu/drm/mediatek/mtk_disp_rdma.c | 9 - drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 12 +++- 3 files changed, 27 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c index fa9d79963cd3..ea5760f856ec 100644 --- a/drivers/gpu/drm/mediatek/mtk_disp_ovl.c +++ b/drivers/gpu/drm/mediatek/mtk_disp_ovl.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include "mtk_disp_drv.h" @@ -414,15 +415,21 @@ static int mtk_disp_ovl_probe(struct platform_device *pdev) return ret; } + pm_runtime_enable(dev); + ret = component_add(dev, _disp_ovl_component_ops); - if (ret) + if (ret) { + pm_runtime_disable(dev); dev_err(dev, "Failed to add component: %d\n", ret); + } return ret; } static int mtk_disp_ovl_remove(struct platform_device *pdev) { + pm_runtime_disable(>dev); + return 0; } diff --git a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c index 705f28ceb4dd..0f31d1c8e37c 100644 --- a/drivers/gpu/drm/mediatek/mtk_disp_rdma.c +++ b/drivers/gpu/drm/mediatek/mtk_disp_rdma.c @@ -9,6 +9,7 @@ #include #include #include +#include #include #include "mtk_disp_drv.h" @@ -327,9 +328,13 @@ static int mtk_disp_rdma_probe(struct platform_device *pdev) platform_set_drvdata(pdev, priv); + pm_runtime_enable(dev); + ret = component_add(dev, _disp_rdma_component_ops); - if (ret) + if (ret) { + pm_runtime_disable(dev); dev_err(dev, "Failed to add component: %d\n", ret); + } return ret; } @@ -338,6 +343,8 @@ static int mtk_disp_rdma_remove(struct platform_device *pdev) { component_del(>dev, _disp_rdma_component_ops); + pm_runtime_disable(>dev); + return 0; } diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c index 474efb844249..08e3f352377d 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c @@ -557,9 +557,15 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc, return; } + ret = pm_runtime_resume_and_get(comp->dev); + if (ret < 0) + DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n", + ret); shouldn't the code return in case of failure here? Thanks, Dafna + ret = mtk_crtc_ddp_hw_init(mtk_crtc); if (ret) { mtk_smi_larb_put(comp->larb_dev); + pm_runtime_put(comp->dev); return; } @@ -572,7 +578,7 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc, { struct mtk_drm_crtc *mtk_crtc = to_mtk_crtc(crtc); struct mtk_ddp_comp *comp = mtk_crtc->ddp_comp[0]; - int i; + int i, ret; DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id); if (!mtk_crtc->enabled) @@ -596,6 +602,10 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc, drm_crtc_vblank_off(crtc); mtk_crtc_ddp_hw_fini(mtk_crtc); mtk_smi_larb_put(comp->larb_dev); + ret = pm_runtime_put(comp->dev); + if (ret < 0) + DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n", + ret); mtk_crtc->enabled = false; } ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 09/11] memory: mtk-smi: Get rid of mtk_smi_larb_get/put
On 14.07.21 04:56, Yong Wu wrote: After adding device_link between the iommu consumer and smi-larb, the pm_runtime_get(_sync) of smi-larb and smi-common will be called automatically. we can get rid of mtk_smi_larb_get/put. CC: Matthias Brugger Signed-off-by: Yong Wu Reviewed-by: Evan Green Acked-by: Krzysztof Kozlowski Acked-by: Matthias Brugger Reviewed-by: Dafna Hirschfeld --- drivers/memory/mtk-smi.c | 14 -- include/soc/mediatek/smi.h | 20 2 files changed, 34 deletions(-) diff --git a/drivers/memory/mtk-smi.c b/drivers/memory/mtk-smi.c index c5fb51f73b34..7c61c924e220 100644 --- a/drivers/memory/mtk-smi.c +++ b/drivers/memory/mtk-smi.c @@ -134,20 +134,6 @@ static void mtk_smi_clk_disable(const struct mtk_smi *smi) clk_disable_unprepare(smi->clk_apb); } -int mtk_smi_larb_get(struct device *larbdev) -{ - int ret = pm_runtime_resume_and_get(larbdev); - - return (ret < 0) ? ret : 0; -} -EXPORT_SYMBOL_GPL(mtk_smi_larb_get); - -void mtk_smi_larb_put(struct device *larbdev) -{ - pm_runtime_put_sync(larbdev); -} -EXPORT_SYMBOL_GPL(mtk_smi_larb_put); - static int mtk_smi_larb_bind(struct device *dev, struct device *master, void *data) { diff --git a/include/soc/mediatek/smi.h b/include/soc/mediatek/smi.h index 15e3397cec58..11f7d6b59642 100644 --- a/include/soc/mediatek/smi.h +++ b/include/soc/mediatek/smi.h @@ -19,26 +19,6 @@ struct mtk_smi_larb_iommu { unsigned char bank[32]; }; -/* - * mtk_smi_larb_get: Enable the power domain and clocks for this local arbiter. - * It also initialize some basic setting(like iommu). - * mtk_smi_larb_put: Disable the power domain and clocks for this local arbiter. - * Both should be called in non-atomic context. - * - * Returns 0 if successful, negative on failure. - */ -int mtk_smi_larb_get(struct device *larbdev); -void mtk_smi_larb_put(struct device *larbdev); - -#else - -static inline int mtk_smi_larb_get(struct device *larbdev) -{ - return 0; -} - -static inline void mtk_smi_larb_put(struct device *larbdev) { } - #endif #endif ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 08/11] media: mtk-vcodec: Get rid of mtk_smi_larb_get/put
On 14.07.21 04:56, Yong Wu wrote: MediaTek IOMMU has already added the device_link between the consumer and smi-larb device. If the vcodec device call the pm_runtime_get_sync, the smi-larb's pm_runtime_get_sync also be called automatically. CC: Tiffany Lin CC: Irui Wang Signed-off-by: Yong Wu Reviewed-by: Evan Green Acked-by: Tiffany Lin Reviewed-by: Dafna Hirschfeld --- .../platform/mtk-vcodec/mtk_vcodec_dec_pm.c | 37 +++- .../platform/mtk-vcodec/mtk_vcodec_drv.h | 3 -- .../platform/mtk-vcodec/mtk_vcodec_enc.c | 1 - .../platform/mtk-vcodec/mtk_vcodec_enc_pm.c | 44 +++ 4 files changed, 10 insertions(+), 75 deletions(-) diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c index 6038db96f71c..d0bf9aa3b29d 100644 --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_dec_pm.c @@ -8,14 +8,12 @@ #include #include #include -#include #include "mtk_vcodec_dec_pm.h" #include "mtk_vcodec_util.h" int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) { - struct device_node *node; struct platform_device *pdev; struct mtk_vcodec_pm *pm; struct mtk_vcodec_clk *dec_clk; @@ -26,18 +24,7 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) pm = >pm; pm->mtkdev = mtkdev; dec_clk = >vdec_clk; - node = of_parse_phandle(pdev->dev.of_node, "mediatek,larb", 0); - if (!node) { - mtk_v4l2_err("of_parse_phandle mediatek,larb fail!"); - return -1; - } - pdev = of_find_device_by_node(node); - of_node_put(node); - if (WARN_ON(!pdev)) { - return -1; - } - pm->larbvdec = >dev; pdev = mtkdev->plat_dev; pm->dev = >dev; @@ -47,14 +34,11 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) dec_clk->clk_info = devm_kcalloc(>dev, dec_clk->clk_num, sizeof(*clk_info), GFP_KERNEL); - if (!dec_clk->clk_info) { - ret = -ENOMEM; - goto put_device; - } + if (!dec_clk->clk_info) + return -ENOMEM; } else { mtk_v4l2_err("Failed to get vdec clock count"); - ret = -EINVAL; - goto put_device; + return -EINVAL; } for (i = 0; i < dec_clk->clk_num; i++) { @@ -63,29 +47,24 @@ int mtk_vcodec_init_dec_pm(struct mtk_vcodec_dev *mtkdev) "clock-names", i, _info->clk_name); if (ret) { mtk_v4l2_err("Failed to get clock name id = %d", i); - goto put_device; + return ret; } clk_info->vcodec_clk = devm_clk_get(>dev, clk_info->clk_name); if (IS_ERR(clk_info->vcodec_clk)) { mtk_v4l2_err("devm_clk_get (%d)%s fail", i, clk_info->clk_name); - ret = PTR_ERR(clk_info->vcodec_clk); - goto put_device; + return PTR_ERR(clk_info->vcodec_clk); } } pm_runtime_enable(>dev); return 0; -put_device: - put_device(pm->larbvdec); - return ret; } void mtk_vcodec_release_dec_pm(struct mtk_vcodec_dev *dev) { pm_runtime_disable(dev->pm.dev); - put_device(dev->pm.larbvdec); } int mtk_vcodec_dec_pw_on(struct mtk_vcodec_pm *pm) @@ -122,11 +101,6 @@ void mtk_vcodec_dec_clock_on(struct mtk_vcodec_pm *pm) } } - ret = mtk_smi_larb_get(pm->larbvdec); - if (ret) { - mtk_v4l2_err("mtk_smi_larb_get larbvdec fail %d", ret); - goto error; - } return; error: @@ -139,7 +113,6 @@ void mtk_vcodec_dec_clock_off(struct mtk_vcodec_pm *pm) struct mtk_vcodec_clk *dec_clk = >vdec_clk; int i = 0; - mtk_smi_larb_put(pm->larbvdec); for (i = dec_clk->clk_num - 1; i >= 0; i--) clk_disable_unprepare(dec_clk->clk_info[i].vcodec_clk); } diff --git a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h index c6c7672fecfb..64b73dd880ce 100644 --- a/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h +++ b/drivers/media/platform/mtk-vcodec/mtk_vcodec_drv.h @@ -189,10 +189,7 @@ struct mtk_vcodec_clk { */ struct mtk_vcodec_pm { struct mtk_vcodec_clk vdec_clk; - struct device *larbvdec; - struct mtk_vcodec_c
Re: [PATCH v6 07/11] drm/mediatek: Get rid of mtk_smi_larb_get/put
On 14.07.21 04:56, Yong Wu wrote: MediaTek IOMMU has already added the device_link between the consumer and smi-larb device. If the drm device call the pm_runtime_get_sync, the smi-larb's pm_runtime_get_sync also be called automatically. CC: CK Hu CC: Philipp Zabel Signed-off-by: Yong Wu Reviewed-by: Evan Green Acked-by: Chun-Kuang Hu Reviewed-by: Dafna Hirschfeld --- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 9 -- drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 36 ++--- drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 1 - drivers/gpu/drm/mediatek/mtk_drm_drv.c | 5 +-- 4 files changed, 3 insertions(+), 48 deletions(-) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c index 08e3f352377d..d046abcf66ce 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c @@ -10,7 +10,6 @@ #include #include -#include #include #include @@ -551,12 +550,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc, DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id); - ret = mtk_smi_larb_get(comp->larb_dev); - if (ret) { - DRM_ERROR("Failed to get larb: %d\n", ret); - return; - } - ret = pm_runtime_resume_and_get(comp->dev); if (ret < 0) DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n", @@ -564,7 +557,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc, ret = mtk_crtc_ddp_hw_init(mtk_crtc); if (ret) { - mtk_smi_larb_put(comp->larb_dev); pm_runtime_put(comp->dev); return; } @@ -601,7 +593,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc, drm_crtc_vblank_off(crtc); mtk_crtc_ddp_hw_fini(mtk_crtc); - mtk_smi_larb_put(comp->larb_dev); ret = pm_runtime_put(comp->dev); if (ret < 0) DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n", diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c index 75bc00e17fc4..7d240218d4c7 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c @@ -449,37 +449,15 @@ unsigned int mtk_drm_find_possible_crtc_by_comp(struct drm_device *drm, return ret; } -static int mtk_ddp_get_larb_dev(struct device_node *node, struct mtk_ddp_comp *comp, - struct device *dev) -{ - struct device_node *larb_node; - struct platform_device *larb_pdev; - - larb_node = of_parse_phandle(node, "mediatek,larb", 0); - if (!larb_node) { - dev_err(dev, "Missing mediadek,larb phandle in %pOF node\n", node); - return -EINVAL; - } - - larb_pdev = of_find_device_by_node(larb_node); - if (!larb_pdev) { - dev_warn(dev, "Waiting for larb device %pOF\n", larb_node); - of_node_put(larb_node); - return -EPROBE_DEFER; - } - of_node_put(larb_node); - comp->larb_dev = _pdev->dev; - - return 0; -} - int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp, enum mtk_ddp_comp_id comp_id) { struct platform_device *comp_pdev; enum mtk_ddp_comp_type type; struct mtk_ddp_comp_dev *priv; +#if IS_REACHABLE(CONFIG_MTK_CMDQ) int ret; +#endif if (comp_id < 0 || comp_id >= DDP_COMPONENT_ID_MAX) return -EINVAL; @@ -495,16 +473,6 @@ int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp, } comp->dev = _pdev->dev; - /* Only DMA capable components need the LARB property */ - if (type == MTK_DISP_OVL || - type == MTK_DISP_OVL_2L || - type == MTK_DISP_RDMA || - type == MTK_DISP_WDMA) { - ret = mtk_ddp_get_larb_dev(node, comp, comp->dev); - if (ret) - return ret; - } - if (type == MTK_DISP_BLS || type == MTK_DISP_CCORR || type == MTK_DISP_COLOR || diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h index bb914d976cf5..1b582262b682 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h @@ -70,7 +70,6 @@ struct mtk_ddp_comp_funcs { struct mtk_ddp_comp { struct device *dev; int irq; - struct device *larb_dev; enum mtk_ddp_comp_id id; const struct mtk_ddp_comp_funcs *funcs; }; diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c index b46bdb8985da..0d5ef3d8d081 100644 --- a/drivers/gpu/drm/mediatek/mtk
Re: [PATCH v6 05/11] media: mtk-mdp: Get rid of mtk_smi_larb_get/put
On 14.07.21 04:56, Yong Wu wrote: MediaTek IOMMU has already added the device_link between the consumer and smi-larb device. If the mdp device call the pm_runtime_get_sync, the smi-larb's pm_runtime_get_sync also be called automatically. CC: Minghsiu Tsai CC: Houlong Wei Signed-off-by: Yong Wu Reviewed-by: Evan Green Reviewed-by: Houlong Wei Reviewed-by: Dafna Hirschfeld --- drivers/media/platform/mtk-mdp/mtk_mdp_comp.c | 46 +-- drivers/media/platform/mtk-mdp/mtk_mdp_comp.h | 2 - drivers/media/platform/mtk-mdp/mtk_mdp_core.c | 1 - 3 files changed, 1 insertion(+), 48 deletions(-) diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c index de2d425efdd1..5e0ea83a9f7f 100644 --- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c +++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.c @@ -13,7 +13,6 @@ #include #include #include -#include #include #include "mtk_mdp_comp.h" @@ -57,13 +56,6 @@ int mtk_mdp_comp_power_on(struct mtk_mdp_comp *comp) { int status, err; - if (comp->larb_dev) { - err = mtk_smi_larb_get(comp->larb_dev); - if (err) - dev_err(comp->dev, - "failed to get larb, err %d.\n", err); - } - err = pm_runtime_get_sync(comp->dev); if (err < 0) { dev_err(comp->dev, "failed to runtime get, err %d.\n", err); @@ -146,9 +138,6 @@ void mtk_mdp_comp_clock_off(struct mtk_mdp_comp *comp) continue; clk_disable_unprepare(comp->clk[i]); } - - if (comp->larb_dev) - mtk_smi_larb_put(comp->larb_dev); } /* @@ -236,9 +225,6 @@ static const struct component_ops mtk_mdp_component_ops = { int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct device *dev) { - struct device_node *larb_node; - struct platform_device *larb_pdev; - int ret; int i; struct device_node *node = dev->of_node; enum mtk_mdp_comp_type comp_type = @@ -252,8 +238,7 @@ int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct device *dev) if (IS_ERR(comp->clk[i])) { if (PTR_ERR(comp->clk[i]) != -EPROBE_DEFER) dev_err(dev, "Failed to get clock\n"); - ret = PTR_ERR(comp->clk[i]); - goto err; + return PTR_ERR(comp->clk[i]); } /* Only RDMA needs two clocks */ @@ -261,36 +246,7 @@ int mtk_mdp_comp_init(struct mtk_mdp_comp *comp, struct device *dev) break; } - /* Only DMA capable components need the LARB property */ - comp->larb_dev = NULL; - if (comp_type != MTK_MDP_RDMA && - comp_type != MTK_MDP_WDMA && - comp_type != MTK_MDP_WROT) - return 0; - - larb_node = of_parse_phandle(node, "mediatek,larb", 0); - if (!larb_node) { - dev_err(dev, - "Missing mediadek,larb phandle in %pOF node\n", node); - ret = -EINVAL; - goto err; - } - - larb_pdev = of_find_device_by_node(larb_node); - if (!larb_pdev) { - dev_warn(dev, "Waiting for larb device %pOF\n", larb_node); - of_node_put(larb_node); - ret = -EPROBE_DEFER; - goto err; - } - of_node_put(larb_node); - - comp->larb_dev = _pdev->dev; - return 0; - -err: - return ret; } static int mtk_mdp_comp_probe(struct platform_device *pdev) diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h index 5201c47f7baa..2bd229cc7eae 100644 --- a/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h +++ b/drivers/media/platform/mtk-mdp/mtk_mdp_comp.h @@ -11,13 +11,11 @@ * struct mtk_mdp_comp - the MDP's function component data * @node: list node to track sibing MDP components * @clk: clocks required for component - * @larb_dev: SMI device required for component * @dev: component's device */ struct mtk_mdp_comp { struct list_headnode; struct clk *clk[2]; - struct device *larb_dev; struct device *dev; }; diff --git a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c index e1fb39231248..be7d35b3e3ff 100644 --- a/drivers/media/platform/mtk-mdp/mtk_mdp_core.c +++ b/drivers/media/platform/mtk-mdp/mtk_mdp_core.c @@ -18,7 +18,6 @@ #include #include #include -#include #include "mtk_mdp_comp.h" #include "mtk_mdp_core.h" ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v6 04/11] media: mtk-jpeg: Get rid of mtk_smi_larb_get/put
On 14.07.21 04:56, Yong Wu wrote: MediaTek IOMMU has already added device_link between the consumer and smi-larb device. If the jpg device call the pm_runtime_get_sync, the smi-larb's pm_runtime_get_sync also be called automatically. After removing the larb_get operations, then mtk_jpeg_clk_init is also unnecessary. Remove it too. CC: Rick Chang CC: Xia Jiang Signed-off-by: Yong Wu Reviewed-by: Evan Green Acked-by: Rick Chang Reviewed-by: Dafna Hirschfeld --- .../media/platform/mtk-jpeg/mtk_jpeg_core.c | 45 +-- .../media/platform/mtk-jpeg/mtk_jpeg_core.h | 2 - 2 files changed, 2 insertions(+), 45 deletions(-) diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c index a89c7b206eef..4fea2c512434 100644 --- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c +++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.c @@ -22,7 +22,6 @@ #include #include #include -#include #include "mtk_jpeg_enc_hw.h" #include "mtk_jpeg_dec_hw.h" @@ -1055,10 +1054,6 @@ static void mtk_jpeg_clk_on(struct mtk_jpeg_dev *jpeg) { int ret; - ret = mtk_smi_larb_get(jpeg->larb); - if (ret) - dev_err(jpeg->dev, "mtk_smi_larb_get larbvdec fail %d\n", ret); - ret = clk_bulk_prepare_enable(jpeg->variant->num_clks, jpeg->variant->clks); if (ret) @@ -1069,7 +1064,6 @@ static void mtk_jpeg_clk_off(struct mtk_jpeg_dev *jpeg) { clk_bulk_disable_unprepare(jpeg->variant->num_clks, jpeg->variant->clks); - mtk_smi_larb_put(jpeg->larb); } static irqreturn_t mtk_jpeg_enc_done(struct mtk_jpeg_dev *jpeg) @@ -1284,35 +1278,6 @@ static struct clk_bulk_data mtk_jpeg_clocks[] = { { .id = "jpgenc" }, }; -static int mtk_jpeg_clk_init(struct mtk_jpeg_dev *jpeg) -{ - struct device_node *node; - struct platform_device *pdev; - int ret; - - node = of_parse_phandle(jpeg->dev->of_node, "mediatek,larb", 0); - if (!node) - return -EINVAL; - pdev = of_find_device_by_node(node); - if (WARN_ON(!pdev)) { - of_node_put(node); - return -EINVAL; - } - of_node_put(node); - - jpeg->larb = >dev; - - ret = devm_clk_bulk_get(jpeg->dev, jpeg->variant->num_clks, - jpeg->variant->clks); - if (ret) { - dev_err(>dev, "failed to get jpeg clock:%d\n", ret); - put_device(>dev); - return ret; - } - - return 0; -} - static void mtk_jpeg_job_timeout_work(struct work_struct *work) { struct mtk_jpeg_dev *jpeg = container_of(work, struct mtk_jpeg_dev, @@ -1333,11 +1298,6 @@ static void mtk_jpeg_job_timeout_work(struct work_struct *work) v4l2_m2m_job_finish(jpeg->m2m_dev, ctx->fh.m2m_ctx); } -static inline void mtk_jpeg_clk_release(struct mtk_jpeg_dev *jpeg) -{ - put_device(jpeg->larb); -} - static int mtk_jpeg_probe(struct platform_device *pdev) { struct mtk_jpeg_dev *jpeg; @@ -1376,7 +1336,8 @@ static int mtk_jpeg_probe(struct platform_device *pdev) goto err_req_irq; } - ret = mtk_jpeg_clk_init(jpeg); + ret = devm_clk_bulk_get(jpeg->dev, jpeg->variant->num_clks, + jpeg->variant->clks); if (ret) { dev_err(>dev, "Failed to init clk, err %d\n", ret); goto err_clk_init; @@ -1442,7 +1403,6 @@ static int mtk_jpeg_probe(struct platform_device *pdev) v4l2_device_unregister(>v4l2_dev); err_dev_register: - mtk_jpeg_clk_release(jpeg); err_clk_init: @@ -1460,7 +1420,6 @@ static int mtk_jpeg_remove(struct platform_device *pdev) video_device_release(jpeg->vdev); v4l2_m2m_release(jpeg->m2m_dev); v4l2_device_unregister(>v4l2_dev); - mtk_jpeg_clk_release(jpeg); return 0; } diff --git a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h index 595f7f10c9fd..3e4811a41ba2 100644 --- a/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h +++ b/drivers/media/platform/mtk-jpeg/mtk_jpeg_core.h @@ -85,7 +85,6 @@ struct mtk_jpeg_variant { * @alloc_ctx:videobuf2 memory allocator's context * @vdev: video device node for jpeg mem2mem mode * @reg_base: JPEG registers mapping - * @larb: SMI device * @job_timeout_work: IRQ timeout structure * @variant: driver variant to be used */ @@ -99,7 +98,6 @@ struct mtk_jpeg_dev { void*alloc_ctx; struct video_device *vdev;
Re: [PATCH v6 03/11] iommu/mediatek: Add device_link between the consumer and the larb devices
On 14.07.21 04:56, Yong Wu wrote: MediaTek IOMMU-SMI diagram is like below. all the consumer connect with smi-larb, then connect with smi-common. M4U | smi-common | - | |... | | larb1 larb2 | | vdec venc When the consumer works, it should enable the smi-larb's power which also need enable the smi-common's power firstly. Thus, First of all, use the device link connect the consumer and the smi-larbs. then add device link between the smi-larb and smi-common. This patch adds device_link between the consumer and the larbs. When device_link_add, I add the flag DL_FLAG_STATELESS to avoid calling pm_runtime_xx to keep the original status of clocks. It can avoid two issues: 1) Display HW show fastlogo abnormally reported in [1]. At the beggining, all the clocks are enabled before entering kernel, but the clocks for display HW(always in larb0) will be gated after clk_enable and clk_disable called from device_link_add(->pm_runtime_resume) and rpm_idle. The clock operation happened before display driver probe. At that time, the display HW will be abnormal. 2) A deadlock issue reported in [2]. Use DL_FLAG_STATELESS to skip pm_runtime_xx to avoid the deadlock. Corresponding, DL_FLAG_AUTOREMOVE_CONSUMER can't be added, then device_link_removed should be added explicitly. [1] https://lore.kernel.org/linux-mediatek/1564213888.22908.4.camel@mhfsdcap03/ [2] https://lore.kernel.org/patchwork/patch/1086569/ Suggested-by: Tomasz Figa Signed-off-by: Yong Wu --- drivers/iommu/mtk_iommu.c| 22 ++ drivers/iommu/mtk_iommu_v1.c | 20 +++- 2 files changed, 41 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/mtk_iommu.c b/drivers/iommu/mtk_iommu.c index a02dde094788..ee742900cf4b 100644 --- a/drivers/iommu/mtk_iommu.c +++ b/drivers/iommu/mtk_iommu.c @@ -571,22 +571,44 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct mtk_iommu_data *data; + struct device_link *link; + struct device *larbdev; + unsigned int larbid; if (!fwspec || fwspec->ops != _iommu_ops) return ERR_PTR(-ENODEV); /* Not a iommu client device */ data = dev_iommu_priv_get(dev); + /* +* Link the consumer device with the smi-larb device(supplier) +* The device in each a larb is a independent HW. thus only link +* one larb here. +*/ + larbid = MTK_M4U_TO_LARB(fwspec->ids[0]); + larbdev = data->larb_imu[larbid].dev; + link = device_link_add(dev, larbdev, + DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS); + if (!link) + dev_err(dev, "Unable to link %s\n", dev_name(larbdev)); shoudn't ERR_PTR be returned in case of failure? Thanks, Dafna return >iommu; } static void mtk_iommu_release_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); + struct mtk_iommu_data *data; + struct device *larbdev; + unsigned int larbid; if (!fwspec || fwspec->ops != _iommu_ops) return; + data = dev_iommu_priv_get(dev); + larbid = MTK_M4U_TO_LARB(fwspec->ids[0]); + larbdev = data->larb_imu[larbid].dev; + device_link_remove(dev, larbdev); + iommu_fwspec_free(dev); } diff --git a/drivers/iommu/mtk_iommu_v1.c b/drivers/iommu/mtk_iommu_v1.c index d9365a3d8dc9..d2a7c66b8239 100644 --- a/drivers/iommu/mtk_iommu_v1.c +++ b/drivers/iommu/mtk_iommu_v1.c @@ -424,7 +424,9 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev) struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); struct of_phandle_args iommu_spec; struct mtk_iommu_data *data; - int err, idx = 0; + int err, idx = 0, larbid; + struct device_link *link; + struct device *larbdev; while (!of_parse_phandle_with_args(dev->of_node, "iommus", "#iommu-cells", @@ -445,6 +447,14 @@ static struct iommu_device *mtk_iommu_probe_device(struct device *dev) data = dev_iommu_priv_get(dev); + /* Link the consumer device with the smi-larb device(supplier) */ + larbid = mt2701_m4u_to_larb(fwspec->ids[0]); + larbdev = data->larb_imu[larbid].dev; + link = device_link_add(dev, larbdev, + DL_FLAG_PM_RUNTIME | DL_FLAG_STATELESS); + if (!link) + dev_err(dev, "Unable to link %s\n", dev_name(larbdev)); + return >iommu; } @@ -465,10 +475,18 @@ static void mtk_iommu_probe_finalize(struct device *dev) static void mtk_iommu_release_device(struct device *dev) { struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(dev); + struct mtk_iommu_data *data; + struct device *larbdev; + unsigned int larbid;
Re: [PATCH v6 01/11] dt-binding: mediatek: Get rid of mediatek, larb for multimedia HW
Hi, thanks for the patch On 14.07.21 04:56, Yong Wu wrote: After adding device_link between the consumer with the smi-larbs, if the consumer call its owner pm_runtime_get(_sync), the pm_runtime_get(_sync) of smi-larb and smi-common will be called automatically. Thus, the consumer don't need the property. And IOMMU also know which larb this consumer connects with from iommu id in the "iommus=" property. Signed-off-by: Yong Wu Reviewed-by: Rob Herring Reviewed-by: Evan Green --- .../bindings/display/mediatek/mediatek,disp.txt | 9 - .../devicetree/bindings/media/mediatek-jpeg-decoder.yaml | 9 - .../devicetree/bindings/media/mediatek-jpeg-encoder.yaml | 9 - On which repo are these patches based on ? In linux-next the file mediatek-jpeg-encoder.yaml don't exist Thanks, Dafna Documentation/devicetree/bindings/media/mediatek-mdp.txt | 8 .../devicetree/bindings/media/mediatek-vcodec.txt| 4 5 files changed, 39 deletions(-) diff --git a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt index fbb59c9ddda6..867bd82e2f03 100644 --- a/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt +++ b/Documentation/devicetree/bindings/display/mediatek/mediatek,disp.txt @@ -61,8 +61,6 @@ Required properties (DMA function blocks): "mediatek,-disp-rdma" "mediatek,-disp-wdma" the supported chips are mt2701, mt8167 and mt8173. -- larb: Should contain a phandle pointing to the local arbiter device as defined - in Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml - iommus: Should point to the respective IOMMU block with master port as argument, see Documentation/devicetree/bindings/iommu/mediatek,iommu.yaml for details. @@ -91,7 +89,6 @@ ovl0: ovl@1400c000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_OVL0>; iommus = < M4U_PORT_DISP_OVL0>; - mediatek,larb = <>; }; ovl1: ovl@1400d000 { @@ -101,7 +98,6 @@ ovl1: ovl@1400d000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_OVL1>; iommus = < M4U_PORT_DISP_OVL1>; - mediatek,larb = <>; }; rdma0: rdma@1400e000 { @@ -111,7 +107,6 @@ rdma0: rdma@1400e000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_RDMA0>; iommus = < M4U_PORT_DISP_RDMA0>; - mediatek,larb = <>; mediatek,rdma-fifosize = <8192>; }; @@ -122,7 +117,6 @@ rdma1: rdma@1400f000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_RDMA1>; iommus = < M4U_PORT_DISP_RDMA1>; - mediatek,larb = <>; }; rdma2: rdma@1401 { @@ -132,7 +126,6 @@ rdma2: rdma@1401 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_RDMA2>; iommus = < M4U_PORT_DISP_RDMA2>; - mediatek,larb = <>; }; wdma0: wdma@14011000 { @@ -142,7 +135,6 @@ wdma0: wdma@14011000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_WDMA0>; iommus = < M4U_PORT_DISP_WDMA0>; - mediatek,larb = <>; }; wdma1: wdma@14012000 { @@ -152,7 +144,6 @@ wdma1: wdma@14012000 { power-domains = < MT8173_POWER_DOMAIN_MM>; clocks = < CLK_MM_DISP_WDMA1>; iommus = < M4U_PORT_DISP_WDMA1>; - mediatek,larb = <>; }; color0: color@14013000 { diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml index 9b87f036f178..052e752157b4 100644 --- a/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml +++ b/Documentation/devicetree/bindings/media/mediatek-jpeg-decoder.yaml @@ -42,13 +42,6 @@ properties: power-domains: maxItems: 1 - mediatek,larb: -$ref: '/schemas/types.yaml#/definitions/phandle' -description: | - Must contain the local arbiters in the current Socs, see - Documentation/devicetree/bindings/memory-controllers/mediatek,smi-larb.yaml - for details. - iommus: maxItems: 2 description: | @@ -63,7 +56,6 @@ required: - clocks - clock-names - power-domains - - mediatek,larb - iommus additionalProperties: false @@ -83,7 +75,6 @@ examples: clock-names = "jpgdec-smi", "jpgdec"; power-domains = < MT2701_POWER_DOMAIN_ISP>; - mediatek,larb = <>; iommus = < MT2701_M4U_PORT_JPGDEC_WDMA>, < MT2701_M4U_PORT_JPGDEC_BSDMA>; }; diff --git a/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml b/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml index fcd9b829e036..8bfdfdfaba59 100644 --- a/Documentation/devicetree/bindings/media/mediatek-jpeg-encoder.yaml +++
Re: [PATCH v5 11/16] drm/mediatek: Get rid of mtk_smi_larb_get/put
Hi On 10.04.21 12:11, Yong Wu wrote: MediaTek IOMMU has already added the device_link between the consumer and smi-larb device. If the drm device call the pm_runtime_get_sync, the smi-larb's pm_runtime_get_sync also be called automatically. CC: CK Hu CC: Philipp Zabel Signed-off-by: Yong Wu Reviewed-by: Evan Green Acked-by: Chun-Kuang Hu --- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 9 -- drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 35 - drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 1 - drivers/gpu/drm/mediatek/mtk_drm_drv.c | 5 +-- 4 files changed, 1 insertion(+), 49 deletions(-) diff --git a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c index 971ef58ac1dc..d59353af4019 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_crtc.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_crtc.c @@ -10,7 +10,6 @@ #include #include -#include #include #include @@ -544,12 +543,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc, DRM_DEBUG_DRIVER("%s %d\n", __func__, crtc->base.id); - ret = mtk_smi_larb_get(comp->larb_dev); - if (ret) { - DRM_ERROR("Failed to get larb: %d\n", ret); - return; - } - ret = pm_runtime_resume_and_get(comp->dev); if (ret < 0) DRM_DEV_ERROR(comp->dev, "Failed to enable power domain: %d\n", @@ -557,7 +550,6 @@ static void mtk_drm_crtc_atomic_enable(struct drm_crtc *crtc, ret = mtk_crtc_ddp_hw_init(mtk_crtc); if (ret) { - mtk_smi_larb_put(comp->larb_dev); pm_runtime_put(comp->dev); return; } @@ -594,7 +586,6 @@ static void mtk_drm_crtc_atomic_disable(struct drm_crtc *crtc, drm_crtc_vblank_off(crtc); mtk_crtc_ddp_hw_fini(mtk_crtc); - mtk_smi_larb_put(comp->larb_dev); ret = pm_runtime_put(comp->dev); if (ret < 0) DRM_DEV_ERROR(comp->dev, "Failed to disable power domain: %d\n", diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c index 75bc00e17fc4..6c01492ba4df 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c @@ -449,37 +449,12 @@ unsigned int mtk_drm_find_possible_crtc_by_comp(struct drm_device *drm, return ret; } -static int mtk_ddp_get_larb_dev(struct device_node *node, struct mtk_ddp_comp *comp, - struct device *dev) -{ - struct device_node *larb_node; - struct platform_device *larb_pdev; - - larb_node = of_parse_phandle(node, "mediatek,larb", 0); - if (!larb_node) { - dev_err(dev, "Missing mediadek,larb phandle in %pOF node\n", node); - return -EINVAL; - } - - larb_pdev = of_find_device_by_node(larb_node); - if (!larb_pdev) { - dev_warn(dev, "Waiting for larb device %pOF\n", larb_node); - of_node_put(larb_node); - return -EPROBE_DEFER; - } - of_node_put(larb_node); - comp->larb_dev = _pdev->dev; - - return 0; -} - int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp, enum mtk_ddp_comp_id comp_id) { struct platform_device *comp_pdev; enum mtk_ddp_comp_type type; struct mtk_ddp_comp_dev *priv; - int ret; Hi, This 'ret' is also used inside `if IS_REACHABLE(CONFIG_MTK_CMDQ)` so it should not be removed. Thanks, Dafna if (comp_id < 0 || comp_id >= DDP_COMPONENT_ID_MAX) return -EINVAL; @@ -495,16 +470,6 @@ int mtk_ddp_comp_init(struct device_node *node, struct mtk_ddp_comp *comp, } comp->dev = _pdev->dev; - /* Only DMA capable components need the LARB property */ - if (type == MTK_DISP_OVL || - type == MTK_DISP_OVL_2L || - type == MTK_DISP_RDMA || - type == MTK_DISP_WDMA) { - ret = mtk_ddp_get_larb_dev(node, comp, comp->dev); - if (ret) - return ret; - } - if (type == MTK_DISP_BLS || type == MTK_DISP_CCORR || type == MTK_DISP_COLOR || diff --git a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h index bb914d976cf5..1b582262b682 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h +++ b/drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h @@ -70,7 +70,6 @@ struct mtk_ddp_comp_funcs { struct mtk_ddp_comp { struct device *dev; int irq; - struct device *larb_dev; enum mtk_ddp_comp_id id; const struct mtk_ddp_comp_funcs *funcs; }; diff --git a/drivers/gpu/drm/mediatek/mtk_drm_drv.c b/drivers/gpu/drm/mediatek/mtk_drm_drv.c index b013d56d2777..622de47239eb 100644 --- a/drivers/gpu/drm/mediatek/mtk_drm_drv.c +++ b/drivers/gpu/drm/mediatek/mtk_drm_drv.c @@ -576,11 +576,8 @@ static int
Re: [PATCH v8 05/14] media: rkisp1: add Rockchip ISP1 subdev driver
Am 07.08.20 um 18:08 schrieb Dafna Hirschfeld: Hi Am 06.08.20 um 14:22 schrieb Tomasz Figa: On Thu, Aug 6, 2020 at 11:21 AM Dafna Hirschfeld wrote: Am 05.08.20 um 23:10 schrieb Dafna Hirschfeld: Hi On 22.07.20 17:24, Tomasz Figa wrote: Hi Dafna, On Sat, Jul 11, 2020 at 01:04:31PM +0200, Dafna Hirschfeld wrote: Hi Laurent, On 16.08.19 02:13, Laurent Pinchart wrote: Hello Helen, Thank you for the patch. On Tue, Jul 30, 2019 at 03:42:47PM -0300, Helen Koike wrote: [snip] +static void rkisp1_isp_queue_event_sof(struct rkisp1_isp_subdev *isp) +{ + struct v4l2_event event = { + .type = V4L2_EVENT_FRAME_SYNC, + .u.frame_sync.frame_sequence = + atomic_inc_return(>frm_sync_seq) - 1, I would move the increment to the caller, hiding it in this function is error-prone (and if you look at the caller I'm pointing out one possible error :-)). In general usage of frm_sync_seq through the driver seems to be very race-prone. It's read in various IRQ handling functions, all coming from the same IRQ, so that part is fine (and wouldn't require an atomic variable), but when read from the buffer queue handlers I really get a red light flashing in my head. I'll try to investigate more when reviewing the next patches. I see that the only place were 'frame_sequence' is read outside of the irq handlers is in the capture in 'rkisp1_vb2_buf_queue': /* * If there's no next buffer assigned, queue this buffer directly * as the next buffer, and update the memory interface. */ if (cap->is_streaming && !cap->buf.next && atomic_read(>rkisp1->isp.frame_sequence) == -1) { cap->buf.next = ispbuf; rkisp1_set_next_buf(cap); } else { list_add_tail(>queue, >buf.queue); } This "if" condition seems very specific, a case where we already stream but v-start was not yet received. I think it is possible to remove the test 'atomic_read(>rkisp1->isp.frame_sequence) == -1' from the above condition so that the next buffer is updated in case it is null not just before the first v-start signal. We don't have this special case in the Chrome OS code. I suppose it would make it possible to resume the capture 1 frame earlier after a queue underrun, as otherwise the new buffer would be only programmed after the next frame start interrupt and used for the next-next frame. However, it's racy, because programming of the buffer addresses is not atomic and could end up with the hardware using few plane addresses from the new buffer and few from the dummy buffer. Given that and also the fact that a queue underrun is a very special case, where the system was already having problems catching up, I'd just remove this special case. [snip] +void rkisp1_isp_isr(unsigned int isp_mis, struct rkisp1_device *dev) +{ + void __iomem *base = dev->base_addr; + unsigned int isp_mis_tmp = 0; _tmp are never good names :-S + unsigned int isp_err = 0; Neither of these variable need to be initialised to 0. + + /* start edge of v_sync */ + if (isp_mis & CIF_ISP_V_START) { + rkisp1_isp_queue_event_sof(>isp_sdev); This will increment the frame sequence number. What if the interrupt is slightly delayed and the next frame starts before we get a change to copy the sequence number to the buffers (before they will complete below) ? Do you mean that we get two sequental v-start signals and then the next frame-end signal in MI_MIS belongs to the first v-start signal of the two? How can this be solved? I wonder if any v-start signal has a later signal that correspond to the same frame so that we can follow it? Maybe we should have one counter that is incremented on v-start signal, and another counter that is incremented uppon some other signal? We're talking about a hard IRQ. I can't imagine the interrupt handler being delayed for a time close to a full frame interval (~16ms for 60 fps) to trigger such scenario. + + writel(CIF_ISP_V_START, base + CIF_ISP_ICR); Do you need to clear all interrupt bits individually, can't you write isp_mis to CIF_ISP_ICR at the beginning of the function to clear them all in one go ? + isp_mis_tmp = readl(base + CIF_ISP_MIS); + if (isp_mis_tmp & CIF_ISP_V_START) + v4l2_err(>v4l2_dev, "isp icr v_statr err: 0x%x\n", + isp_mis_tmp); This require some explanation. It looks like a naive way to protect against something, but I think it could trigger under normal circumstances if IRQ handling is delayed, and wouldn't do much anyway. Same for the similar constructs below. + } + + if ((isp_mis & CIF_ISP_PIC_SIZE_ERROR)) { + /* Clear pic_size_error */ + writel(CIF_ISP_PIC_SIZE_ERROR, base + CIF_ISP_ICR); + isp_err = readl(base + CIF_ISP_ERR); + v4l2_err(>
Re: [PATCH v8 05/14] media: rkisp1: add Rockchip ISP1 subdev driver
Hi Am 06.08.20 um 14:22 schrieb Tomasz Figa: On Thu, Aug 6, 2020 at 11:21 AM Dafna Hirschfeld wrote: Am 05.08.20 um 23:10 schrieb Dafna Hirschfeld: Hi On 22.07.20 17:24, Tomasz Figa wrote: Hi Dafna, On Sat, Jul 11, 2020 at 01:04:31PM +0200, Dafna Hirschfeld wrote: Hi Laurent, On 16.08.19 02:13, Laurent Pinchart wrote: Hello Helen, Thank you for the patch. On Tue, Jul 30, 2019 at 03:42:47PM -0300, Helen Koike wrote: [snip] +static void rkisp1_isp_queue_event_sof(struct rkisp1_isp_subdev *isp) +{ +struct v4l2_event event = { +.type = V4L2_EVENT_FRAME_SYNC, +.u.frame_sync.frame_sequence = +atomic_inc_return(>frm_sync_seq) - 1, I would move the increment to the caller, hiding it in this function is error-prone (and if you look at the caller I'm pointing out one possible error :-)). In general usage of frm_sync_seq through the driver seems to be very race-prone. It's read in various IRQ handling functions, all coming from the same IRQ, so that part is fine (and wouldn't require an atomic variable), but when read from the buffer queue handlers I really get a red light flashing in my head. I'll try to investigate more when reviewing the next patches. I see that the only place were 'frame_sequence' is read outside of the irq handlers is in the capture in 'rkisp1_vb2_buf_queue': /* * If there's no next buffer assigned, queue this buffer directly * as the next buffer, and update the memory interface. */ if (cap->is_streaming && !cap->buf.next && atomic_read(>rkisp1->isp.frame_sequence) == -1) { cap->buf.next = ispbuf; rkisp1_set_next_buf(cap); } else { list_add_tail(>queue, >buf.queue); } This "if" condition seems very specific, a case where we already stream but v-start was not yet received. I think it is possible to remove the test 'atomic_read(>rkisp1->isp.frame_sequence) == -1' from the above condition so that the next buffer is updated in case it is null not just before the first v-start signal. We don't have this special case in the Chrome OS code. I suppose it would make it possible to resume the capture 1 frame earlier after a queue underrun, as otherwise the new buffer would be only programmed after the next frame start interrupt and used for the next-next frame. However, it's racy, because programming of the buffer addresses is not atomic and could end up with the hardware using few plane addresses from the new buffer and few from the dummy buffer. Given that and also the fact that a queue underrun is a very special case, where the system was already having problems catching up, I'd just remove this special case. [snip] +void rkisp1_isp_isr(unsigned int isp_mis, struct rkisp1_device *dev) +{ +void __iomem *base = dev->base_addr; +unsigned int isp_mis_tmp = 0; _tmp are never good names :-S +unsigned int isp_err = 0; Neither of these variable need to be initialised to 0. + +/* start edge of v_sync */ +if (isp_mis & CIF_ISP_V_START) { +rkisp1_isp_queue_event_sof(>isp_sdev); This will increment the frame sequence number. What if the interrupt is slightly delayed and the next frame starts before we get a change to copy the sequence number to the buffers (before they will complete below) ? Do you mean that we get two sequental v-start signals and then the next frame-end signal in MI_MIS belongs to the first v-start signal of the two? How can this be solved? I wonder if any v-start signal has a later signal that correspond to the same frame so that we can follow it? Maybe we should have one counter that is incremented on v-start signal, and another counter that is incremented uppon some other signal? We're talking about a hard IRQ. I can't imagine the interrupt handler being delayed for a time close to a full frame interval (~16ms for 60 fps) to trigger such scenario. + +writel(CIF_ISP_V_START, base + CIF_ISP_ICR); Do you need to clear all interrupt bits individually, can't you write isp_mis to CIF_ISP_ICR at the beginning of the function to clear them all in one go ? +isp_mis_tmp = readl(base + CIF_ISP_MIS); +if (isp_mis_tmp & CIF_ISP_V_START) +v4l2_err(>v4l2_dev, "isp icr v_statr err: 0x%x\n", + isp_mis_tmp); This require some explanation. It looks like a naive way to protect against something, but I think it could trigger under normal circumstances if IRQ handling is delayed, and wouldn't do much anyway. Same for the similar constructs below. +} + +if ((isp_mis & CIF_ISP_PIC_SIZE_ERROR)) { +/* Clear pic_size_error */ +writel(CIF_ISP_PIC_SIZE_ERROR, base + CIF_ISP_ICR); +isp_err = readl(base + CIF_ISP_ERR); +v4l2_err(>v4l2_dev, + "CIF_ISP_PIC_SIZE_ERROR
Re: [PATCH v8 05/14] media: rkisp1: add Rockchip ISP1 subdev driver
Hi, Am 06.08.20 um 14:08 schrieb Tomasz Figa: On Wed, Aug 5, 2020 at 11:10 PM Dafna Hirschfeld wrote: Hi On 22.07.20 17:24, Tomasz Figa wrote: Hi Dafna, On Sat, Jul 11, 2020 at 01:04:31PM +0200, Dafna Hirschfeld wrote: Hi Laurent, On 16.08.19 02:13, Laurent Pinchart wrote: Hello Helen, Thank you for the patch. On Tue, Jul 30, 2019 at 03:42:47PM -0300, Helen Koike wrote: [snip] +static void rkisp1_isp_queue_event_sof(struct rkisp1_isp_subdev *isp) +{ + struct v4l2_event event = { + .type = V4L2_EVENT_FRAME_SYNC, + .u.frame_sync.frame_sequence = + atomic_inc_return(>frm_sync_seq) - 1, I would move the increment to the caller, hiding it in this function is error-prone (and if you look at the caller I'm pointing out one possible error :-)). In general usage of frm_sync_seq through the driver seems to be very race-prone. It's read in various IRQ handling functions, all coming from the same IRQ, so that part is fine (and wouldn't require an atomic variable), but when read from the buffer queue handlers I really get a red light flashing in my head. I'll try to investigate more when reviewing the next patches. I see that the only place were 'frame_sequence' is read outside of the irq handlers is in the capture in 'rkisp1_vb2_buf_queue': /* * If there's no next buffer assigned, queue this buffer directly * as the next buffer, and update the memory interface. */ if (cap->is_streaming && !cap->buf.next && atomic_read(>rkisp1->isp.frame_sequence) == -1) { cap->buf.next = ispbuf; rkisp1_set_next_buf(cap); } else { list_add_tail(>queue, >buf.queue); } This "if" condition seems very specific, a case where we already stream but v-start was not yet received. I think it is possible to remove the test 'atomic_read(>rkisp1->isp.frame_sequence) == -1' from the above condition so that the next buffer is updated in case it is null not just before the first v-start signal. We don't have this special case in the Chrome OS code. I suppose it would make it possible to resume the capture 1 frame earlier after a queue underrun, as otherwise the new buffer would be only programmed after the next frame start interrupt and used for the next-next frame. However, it's racy, because programming of the buffer addresses is not atomic and could end up with the hardware using few plane addresses from the new buffer and few from the dummy buffer. Given that and also the fact that a queue underrun is a very special case, where the system was already having problems catching up, I'd just remove this special case. [snip] +void rkisp1_isp_isr(unsigned int isp_mis, struct rkisp1_device *dev) +{ + void __iomem *base = dev->base_addr; + unsigned int isp_mis_tmp = 0; _tmp are never good names :-S + unsigned int isp_err = 0; Neither of these variable need to be initialised to 0. + + /* start edge of v_sync */ + if (isp_mis & CIF_ISP_V_START) { + rkisp1_isp_queue_event_sof(>isp_sdev); This will increment the frame sequence number. What if the interrupt is slightly delayed and the next frame starts before we get a change to copy the sequence number to the buffers (before they will complete below) ? Do you mean that we get two sequental v-start signals and then the next frame-end signal in MI_MIS belongs to the first v-start signal of the two? How can this be solved? I wonder if any v-start signal has a later signal that correspond to the same frame so that we can follow it? Maybe we should have one counter that is incremented on v-start signal, and another counter that is incremented uppon some other signal? We're talking about a hard IRQ. I can't imagine the interrupt handler being delayed for a time close to a full frame interval (~16ms for 60 fps) to trigger such scenario. + + writel(CIF_ISP_V_START, base + CIF_ISP_ICR); Do you need to clear all interrupt bits individually, can't you write isp_mis to CIF_ISP_ICR at the beginning of the function to clear them all in one go ? + isp_mis_tmp = readl(base + CIF_ISP_MIS); + if (isp_mis_tmp & CIF_ISP_V_START) + v4l2_err(>v4l2_dev, "isp icr v_statr err: 0x%x\n", + isp_mis_tmp); This require some explanation. It looks like a naive way to protect against something, but I think it could trigger under normal circumstances if IRQ handling is delayed, and wouldn't do much anyway. Same for the similar constructs below. + } + + if ((isp_mis & CIF_ISP_PIC_SIZE_ERROR)) { + /* Clear pic_size_error */ + writel(CIF_ISP_PIC_SIZE_ERROR, base + CIF_ISP_ICR); + isp_err = readl(base + CIF_ISP_ERR); + v4l2_err(>v4l2_dev, + "CIF_ISP_PIC_SIZE_ERROR (0x%08x)"
Re: [PATCH v3 1/2] dma-contiguous: Abstract dma_{alloc,free}_contiguous()
On Thu, 2019-07-25 at 09:50 -0700, Nicolin Chen wrote: > On Thu, Jul 25, 2019 at 01:06:42PM -0300, Ezequiel Garcia wrote: > > I can't find a way to forward-redirect from Gmail, so I'm Ccing Dafna > > who found a regression caused by this commit. Dafna, can you give all > > the details, including the log and how you are reproducing it? > > I saw the conversation there. Sorry for not replying yet. > May we discuss there since there are full logs available? > > Thanks > Nicolin > Hi, I compiled vivid as built-in into the kernel (not as a separate module) for nitrogen8m device (imx8) and set it to use contig dma for mem_ops by adding the kernel param vivid.allocators=1,1,... I use this devicetree patch for the dtb file: https://lkml.org/lkml/2019/7/24/789. Although it should be the same on any Aarch64 platform. Then, on the board I run the command: v4l2-ctl -d3 -v width=2592,height=1944,pixelformat=UYVY,bytesperline=5184 --stream-mmap --stream-to video.UYVY In every run there is a different crash. Here is one of them: https://pastebin.com/xXgbXMAN Dafna > > > > > > > > On Fri, 24 May 2019 at 01:08, Nicolin Chen > > > > wrote: > > > > > > Both dma_alloc_from_contiguous() and dma_release_from_contiguous() > > > are very simply implemented, but requiring callers to pass certain > > > parameters like count and align, and taking a boolean parameter to > > > check __GFP_NOWARN in the allocation flags. So every function call > > > duplicates similar work: > > > /* A piece of example */ > > > unsigned long order = get_order(size); > > > size_t count = size >> PAGE_SHIFT; > > > page = dma_alloc_from_contiguous(dev, count, order, gfp & __GFP_NOWARN); > > > [...] > > > dma_release_from_contiguous(dev, page, size >> PAGE_SHIFT); > > > > > > Additionally, as CMA can be used only in the context which permits > > > sleeping, most of callers do a gfpflags_allow_blocking() check and > > > a corresponding fallback allocation of normal pages upon any false > > > result: > > > /* A piece of example */ > > > if (gfpflags_allow_blocking(flag)) > > > page = dma_alloc_from_contiguous(); > > > if (!page) > > > page = alloc_pages(); > > > [...] > > > if (!dma_release_from_contiguous(dev, page, count)) > > > __free_pages(page, get_order(size)); > > > > > > So this patch simplifies those function calls by abstracting these > > > operations into the two new functions: dma_{alloc,free}_contiguous. > > > > > > As some callers of dma_{alloc,release}_from_contiguous() might be > > > complicated, this patch just implements these two new functions to > > > kernel/dma/direct.c only as an initial step. > > > > > > > > > Suggested-by: Christoph Hellwig > > > > > > Signed-off-by: Nicolin Chen > > > --- > > > Changelog > > > v2->v3: > > > * Added missing "static inline" in header file to fix build error. > > > v1->v2: > > > * Added new functions beside the old ones so we can replace callers > > > one by one later. > > > * Applied new functions to dma/direct.c only, because it's the best > > > example caller to apply and should be safe with the new functions. > > > > > > include/linux/dma-contiguous.h | 11 > > > kernel/dma/contiguous.c| 48 ++ > > > kernel/dma/direct.c| 24 +++-- > > > 3 files changed, 63 insertions(+), 20 deletions(-) > > > > > > diff --git a/include/linux/dma-contiguous.h > > > b/include/linux/dma-contiguous.h > > > index f247e8aa5e3d..00a370c1c140 100644 > > > --- a/include/linux/dma-contiguous.h > > > +++ b/include/linux/dma-contiguous.h > > > @@ -115,6 +115,8 @@ struct page *dma_alloc_from_contiguous(struct device > > > *dev, size_t count, > > > unsigned int order, bool no_warn); > > > bool dma_release_from_contiguous(struct device *dev, struct page *pages, > > > int count); > > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t > > > gfp); > > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t > > > size); > > > > > > #else > > > > > > @@ -157,6 +159,15 @@ bool dma_release_from_contiguous(struct device *dev, > > > struct page *pages, > > > return false; > > > } > > > > > > +static inline > > > +struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t > > > gfp) > > > +{ > > > + return NULL; > > > +} > > > + > > > +static inline > > > +void dma_free_contiguous(struct device *dev, struct page *page, size_t > > > size) { } > > > + > > > #endif > > > > > > #endif > > > diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c > > > index b2a87905846d..21f39a6cb04f 100644 > > > --- a/kernel/dma/contiguous.c > > > +++ b/kernel/dma/contiguous.c > > > @@ -214,6 +214,54 @@ bool dma_release_from_contiguous(struct device *dev, > > > struct page *pages, > > > return cma_release(dev_get_cma_area(dev), pages, count);