Re: [PATCH] iommu: io-pgtable: Support non-coherent page tables
On 25/06/2019 12:56, Will Deacon wrote: On Mon, Jun 24, 2019 at 12:53:49PM +0100, Will Deacon wrote: On Tue, Jun 18, 2019 at 06:39:33PM +0100, Will Deacon wrote: On Wed, May 15, 2019 at 04:32:34PM -0700, Bjorn Andersson wrote: Describe the memory related to page table walks as non-cachable for iommu instances that are not DMA coherent. Signed-off-by: Bjorn Andersson --- drivers/iommu/io-pgtable-arm.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 4e21efbc4459..68ff22ffd2cb 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -803,9 +803,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) return NULL; /* TCR */ - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); + if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) { + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); + } else { + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | Nit: this should be outer-shareable (ARM_LPAE_TCR_SH_OS). + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) | + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT); + }>> Should we also be doing something similar for the short-descriptor code in io-pgtable-arm-v7s.c? Looks like you just need to use ARM_V7S_RGN_NC instead of ARM_V7S_RGN_WBWA when initialising ttbr0 for non-coherent SMMUs. Do you plan to respin this? I'll need it this week if you're shooting for 5.3. I started having a crack at this myself, but in doing so I realised that using IO_PGTABLE_QUIRK_NO_DMA for this isn't quite right based on its current description. When that flag is set, we can rely on the page-table walker being coherent, but I don't think it works the other way around: you can't rely on the flag being clear meaning that the page-table walker is not coherent. Ideally, we'd use something like is_device_dma_coherent, but that's a Xen thing and it doesn't look reliable for the IOMMU. Looking at the users of io-pgtable, we have: panfrost_mmu.c - I can't see where the page-table even gets installed... arm-smmu.c - IO_PGTABLE_QUIRK_NO_DMA is reliable arm-smmu-v3.c - IO_PGTABLE_QUIRK_NO_DMA is reliable ipmmu-vmsa.c- Always sets TTBCR as cacheable (ignores tcr) msm_iommu.c - Always non-coherent mtk_iommu.c - Ignores tcr qcom_iommu.c- Always non-coherent so we could go ahead and change IO_PGTABLE_QUIRK_NO_DMA to do what you want without breaking any drivers. In any case, the driver is free to override the control register if it knows better, as seems to be the case for some of these already. See patch below. I'll rework your patch on top. Will --->8 From 4f41845b340783eaec9cc2840fe3cb9a00574054 Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Tue, 25 Jun 2019 12:51:25 +0100 Subject: [PATCH] iommu/io-pgtable: Replace IO_PGTABLE_QUIRK_NO_DMA with specific flag IO_PGTABLE_QUIRK_NO_DMA is a bit of a misnomer, since it's really just an indication of whether or not the page-table walker for the IOMMU is coherent with the CPU caches. Since cache coherency is more than just a quirk, replace the flag with its own field in the io_pgtable_cfg structure. The comment may have ended up being a bit misleading, but the name itself wasn't a misnomer - it specifically meant "skip DMA API calls", which served two purposes: - Firstly, for the selftests where we didn't have a valid DMA API device available. - Secondly, as a performance hack to short-circuit the DMA API call overhead (and fiddly intermediate-table-flush-flag logic) when, due to other assumptions elsewhere, we could be sure that they would be no-ops. I don't massively object to this change having been merged (after all, the original reasoning was mine[1]), but I don't believe it really makes anything better either - it's mostly just moving the goalposts. As it stands, now it looks like you can make a coherent SMMU do non-cacheable walks on a per-context basis by choosing not to set this flag, but anyone trying that will quickly find it subtly and fundamentally broken. Robin. [1] https://lore.kernel.org/linux-arm-kernel/a508a6a5-8e1f-3660-51ef-e65bc3829...@arm.com/ Cc: Bjorn Andersson Signed-off-by: Will Deacon --- drivers/iommu/arm-smmu-v3.c| 4 +--- drivers/iommu/arm-smmu.c | 4 +--- drivers/iommu/io-pgtable-arm-v7s.c | 10 +- drivers/iommu/io-pgtable-arm.c | 19 --- drivers/iommu/ipmmu-vmsa.c | 1 + include/linux/io-pgtable.h | 11 --- 6 files
Re: [PATCH] iommu: io-pgtable: Support non-coherent page tables
On Mon, Jun 24, 2019 at 12:53:49PM +0100, Will Deacon wrote: > On Tue, Jun 18, 2019 at 06:39:33PM +0100, Will Deacon wrote: > > On Wed, May 15, 2019 at 04:32:34PM -0700, Bjorn Andersson wrote: > > > Describe the memory related to page table walks as non-cachable for iommu > > > instances that are not DMA coherent. > > > > > > Signed-off-by: Bjorn Andersson > > > --- > > > drivers/iommu/io-pgtable-arm.c | 12 +--- > > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/iommu/io-pgtable-arm.c > > > b/drivers/iommu/io-pgtable-arm.c > > > index 4e21efbc4459..68ff22ffd2cb 100644 > > > --- a/drivers/iommu/io-pgtable-arm.c > > > +++ b/drivers/iommu/io-pgtable-arm.c > > > @@ -803,9 +803,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg > > > *cfg, void *cookie) > > > return NULL; > > > > > > /* TCR */ > > > - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > > > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > > > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > > > + if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) { > > > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > > > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > > > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > > > + } else { > > > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > > > > Nit: this should be outer-shareable (ARM_LPAE_TCR_SH_OS). > > > > > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) | > > > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT); > > > + } > > > > Should we also be doing something similar for the short-descriptor code > > in io-pgtable-arm-v7s.c? Looks like you just need to use ARM_V7S_RGN_NC > > instead of ARM_V7S_RGN_WBWA when initialising ttbr0 for non-coherent > > SMMUs. > > Do you plan to respin this? I'll need it this week if you're shooting for > 5.3. I started having a crack at this myself, but in doing so I realised that using IO_PGTABLE_QUIRK_NO_DMA for this isn't quite right based on its current description. When that flag is set, we can rely on the page-table walker being coherent, but I don't think it works the other way around: you can't rely on the flag being clear meaning that the page-table walker is not coherent. Ideally, we'd use something like is_device_dma_coherent, but that's a Xen thing and it doesn't look reliable for the IOMMU. Looking at the users of io-pgtable, we have: panfrost_mmu.c - I can't see where the page-table even gets installed... arm-smmu.c - IO_PGTABLE_QUIRK_NO_DMA is reliable arm-smmu-v3.c - IO_PGTABLE_QUIRK_NO_DMA is reliable ipmmu-vmsa.c- Always sets TTBCR as cacheable (ignores tcr) msm_iommu.c - Always non-coherent mtk_iommu.c - Ignores tcr qcom_iommu.c- Always non-coherent so we could go ahead and change IO_PGTABLE_QUIRK_NO_DMA to do what you want without breaking any drivers. In any case, the driver is free to override the control register if it knows better, as seems to be the case for some of these already. See patch below. I'll rework your patch on top. Will --->8 >From 4f41845b340783eaec9cc2840fe3cb9a00574054 Mon Sep 17 00:00:00 2001 From: Will Deacon Date: Tue, 25 Jun 2019 12:51:25 +0100 Subject: [PATCH] iommu/io-pgtable: Replace IO_PGTABLE_QUIRK_NO_DMA with specific flag IO_PGTABLE_QUIRK_NO_DMA is a bit of a misnomer, since it's really just an indication of whether or not the page-table walker for the IOMMU is coherent with the CPU caches. Since cache coherency is more than just a quirk, replace the flag with its own field in the io_pgtable_cfg structure. Cc: Bjorn Andersson Signed-off-by: Will Deacon --- drivers/iommu/arm-smmu-v3.c| 4 +--- drivers/iommu/arm-smmu.c | 4 +--- drivers/iommu/io-pgtable-arm-v7s.c | 10 +- drivers/iommu/io-pgtable-arm.c | 19 --- drivers/iommu/ipmmu-vmsa.c | 1 + include/linux/io-pgtable.h | 11 --- 6 files changed, 20 insertions(+), 29 deletions(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 65de2458999f..8ff8f61d9e1c 100644 --- a/drivers/iommu/arm-smmu-v3.c +++ b/drivers/iommu/arm-smmu-v3.c @@ -1789,13 +1789,11 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain) .pgsize_bitmap = smmu->pgsize_bitmap, .ias= ias, .oas= oas, + .coherent_walk = smmu->features & ARM_SMMU_FEAT_COHERENCY, .tlb= _smmu_gather_ops, .iommu_dev = smmu->dev, }; - if (smmu->features & ARM_SMMU_FEAT_COHERENCY) - pgtbl_cfg.quirks = IO_PGTABLE_QUIRK_NO_DMA; - if (smmu_domain->non_strict) pgtbl_cfg.quirks |= IO_PGTABLE_QUIRK_NON_STRICT; diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index
Re: [PATCH] iommu: io-pgtable: Support non-coherent page tables
Hi again, Bjorn, On Tue, Jun 18, 2019 at 06:39:33PM +0100, Will Deacon wrote: > On Wed, May 15, 2019 at 04:32:34PM -0700, Bjorn Andersson wrote: > > Describe the memory related to page table walks as non-cachable for iommu > > instances that are not DMA coherent. > > > > Signed-off-by: Bjorn Andersson > > --- > > drivers/iommu/io-pgtable-arm.c | 12 +--- > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c > > index 4e21efbc4459..68ff22ffd2cb 100644 > > --- a/drivers/iommu/io-pgtable-arm.c > > +++ b/drivers/iommu/io-pgtable-arm.c > > @@ -803,9 +803,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg > > *cfg, void *cookie) > > return NULL; > > > > /* TCR */ > > - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > > + if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) { > > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > > + } else { > > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > > Nit: this should be outer-shareable (ARM_LPAE_TCR_SH_OS). > > > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) | > > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT); > > + } > > Should we also be doing something similar for the short-descriptor code > in io-pgtable-arm-v7s.c? Looks like you just need to use ARM_V7S_RGN_NC > instead of ARM_V7S_RGN_WBWA when initialising ttbr0 for non-coherent > SMMUs. Do you plan to respin this? I'll need it this week if you're shooting for 5.3. Thanks, Will
Re: [PATCH] iommu: io-pgtable: Support non-coherent page tables
Hi Bjorn, On Wed, May 15, 2019 at 04:32:34PM -0700, Bjorn Andersson wrote: > Describe the memory related to page table walks as non-cachable for iommu > instances that are not DMA coherent. > > Signed-off-by: Bjorn Andersson > --- > drivers/iommu/io-pgtable-arm.c | 12 +--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c > index 4e21efbc4459..68ff22ffd2cb 100644 > --- a/drivers/iommu/io-pgtable-arm.c > +++ b/drivers/iommu/io-pgtable-arm.c > @@ -803,9 +803,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, > void *cookie) > return NULL; > > /* TCR */ > - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > + if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) { > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > + } else { > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | Nit: this should be outer-shareable (ARM_LPAE_TCR_SH_OS). > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) | > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT); > + } Should we also be doing something similar for the short-descriptor code in io-pgtable-arm-v7s.c? Looks like you just need to use ARM_V7S_RGN_NC instead of ARM_V7S_RGN_WBWA when initialising ttbr0 for non-coherent SMMUs. Will ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH] iommu: io-pgtable: Support non-coherent page tables
On Wed 15 May 23:47 PDT 2019, Vivek Gautam wrote: > On Thu, May 16, 2019 at 5:03 AM Bjorn Andersson > wrote: > > > > Describe the memory related to page table walks as non-cachable for iommu > > instances that are not DMA coherent. > > > > Signed-off-by: Bjorn Andersson > > --- > > drivers/iommu/io-pgtable-arm.c | 12 +--- > > 1 file changed, 9 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c > > index 4e21efbc4459..68ff22ffd2cb 100644 > > --- a/drivers/iommu/io-pgtable-arm.c > > +++ b/drivers/iommu/io-pgtable-arm.c > > @@ -803,9 +803,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg > > *cfg, void *cookie) > > return NULL; > > > > /* TCR */ > > - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > > + if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) { > > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > > + } else { > > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) | > > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT); > > + } > > This looks okay to me based on the discussion that we had on a similar > patch that I > posted. So, > Reviewed-by: Vivek Gautam > > [1] https://lore.kernel.org/patchwork/patch/1032939/ > Will, Robin, any input on this patch? Regards, Bjorn
Re: [PATCH] iommu: io-pgtable: Support non-coherent page tables
On Thu, May 16, 2019 at 5:03 AM Bjorn Andersson wrote: > > Describe the memory related to page table walks as non-cachable for iommu > instances that are not DMA coherent. > > Signed-off-by: Bjorn Andersson > --- > drivers/iommu/io-pgtable-arm.c | 12 +--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c > index 4e21efbc4459..68ff22ffd2cb 100644 > --- a/drivers/iommu/io-pgtable-arm.c > +++ b/drivers/iommu/io-pgtable-arm.c > @@ -803,9 +803,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, > void *cookie) > return NULL; > > /* TCR */ > - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > + if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) { > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > + } else { > + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) | > + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT); > + } This looks okay to me based on the discussion that we had on a similar patch that I posted. So, Reviewed-by: Vivek Gautam [1] https://lore.kernel.org/patchwork/patch/1032939/ Thanks & regards Vivek > > switch (ARM_LPAE_GRANULE(data)) { > case SZ_4K: > -- > 2.18.0 > > ___ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu -- QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu
[PATCH] iommu: io-pgtable: Support non-coherent page tables
Describe the memory related to page table walks as non-cachable for iommu instances that are not DMA coherent. Signed-off-by: Bjorn Andersson --- drivers/iommu/io-pgtable-arm.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c index 4e21efbc4459..68ff22ffd2cb 100644 --- a/drivers/iommu/io-pgtable-arm.c +++ b/drivers/iommu/io-pgtable-arm.c @@ -803,9 +803,15 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) return NULL; /* TCR */ - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); + if (cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA) { + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | + (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); + } else { + reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT) | + (ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT); + } switch (ARM_LPAE_GRANULE(data)) { case SZ_4K: -- 2.18.0 ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu