Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On 2021-03-15 08:33, Christoph Hellwig wrote: On Fri, Mar 12, 2021 at 04:18:24PM +, Robin Murphy wrote: Let me know what you think of the version here: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/iommu-cleanup I'll happily switch the patch to you as the author if you're fine with that as well. I still have reservations about removing the attribute API entirely and pretending that io_pgtable_cfg is anything other than a SoC-specific private interface, I think a private inteface would make more sense. For now I've just condensed it down to a generic set of quirk bits and dropped the attrs structure, which seems like an ok middle ground for now. That being said I wonder why that quirk isn't simply set in the device tree? Because it's a software policy decision rather than any inherent property of the platform, and the DT certainly doesn't know *when* any particular device might prefer its IOMMU to use cacheable pagetables to minimise TLB miss latency vs. saving the cache capacity for larger data buffers. It really is most logical to decide this at the driver level. In truth the overall concept *is* relatively generic (a trend towards larger system caches and cleverer usage is about both raw performance and saving power on off-SoC DRAM traffic), it's just the particular implementation of using io-pgtable to set an outer-cacheable walk attribute in an SMMU TCR that's pretty much specific to Qualcomm SoCs. Hence why having a common abstraction at the iommu_domain level, but where the exact details are free to vary across different IOMMUs and their respective client drivers, is in many ways an ideal fit. but the reworked patch on its own looks reasonable to me, thanks! (I wasn't too convinced about the iommu_cmd_line wrappers either...) Just iommu_get_dma_strict() needs an export since the SMMU drivers can be modular - I consciously didn't add that myself since I was mistakenly thinking only iommu-dma would call it. Fixed. Can I get your signoff for the patch? Then I'll switch it to over to being attributed to you. Sure - I would have thought that the one I originally posted still stands, but for the avoidance of doubt, for the parts of commit 8b6d45c495bd in your tree that remain from what I wrote: Signed-off-by: Robin Murphy Cheers, Robin. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On Fri, Mar 12, 2021 at 04:18:24PM +, Robin Murphy wrote: >> Let me know what you think of the version here: >> >> http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/iommu-cleanup >> >> I'll happily switch the patch to you as the author if you're fine with >> that as well. > > I still have reservations about removing the attribute API entirely and > pretending that io_pgtable_cfg is anything other than a SoC-specific > private interface, I think a private inteface would make more sense. For now I've just condensed it down to a generic set of quirk bits and dropped the attrs structure, which seems like an ok middle ground for now. That being said I wonder why that quirk isn't simply set in the device tree? > but the reworked patch on its own looks reasonable to > me, thanks! (I wasn't too convinced about the iommu_cmd_line wrappers > either...) Just iommu_get_dma_strict() needs an export since the SMMU > drivers can be modular - I consciously didn't add that myself since I was > mistakenly thinking only iommu-dma would call it. Fixed. Can I get your signoff for the patch? Then I'll switch it to over to being attributed to you. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On 2021-03-11 08:26, Christoph Hellwig wrote: On Wed, Mar 10, 2021 at 06:39:57PM +, Robin Murphy wrote: Actually... Just mirroring the iommu_dma_strict value into struct iommu_domain should solve all of that with very little boilerplate code. Yes, my initial thought was to directly replace the attribute with a common flag at iommu_domain level, but since in all cases the behaviour is effectively global rather than actually per-domain, it seemed reasonable to take it a step further. This passes compile-testing for arm64 and x86, what do you think? It seems to miss a few bits, and also generally seems to be not actually apply to recent mainline or something like it due to different empty lines in a few places. Yeah, that was sketched out on top of some other development patches, and in being so focused on not breaking any of the x86 behaviours I did indeed overlook fully converting the SMMU drivers... oops! (my thought was to do the conversion for its own sake, then clean up the redundant attribute separately, but I guess it's fine either way) Let me know what you think of the version here: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/iommu-cleanup I'll happily switch the patch to you as the author if you're fine with that as well. I still have reservations about removing the attribute API entirely and pretending that io_pgtable_cfg is anything other than a SoC-specific private interface, but the reworked patch on its own looks reasonable to me, thanks! (I wasn't too convinced about the iommu_cmd_line wrappers either...) Just iommu_get_dma_strict() needs an export since the SMMU drivers can be modular - I consciously didn't add that myself since I was mistakenly thinking only iommu-dma would call it. Robin. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On Wed, Mar 10, 2021 at 06:39:57PM +, Robin Murphy wrote: >> Actually... Just mirroring the iommu_dma_strict value into >> struct iommu_domain should solve all of that with very little >> boilerplate code. > > Yes, my initial thought was to directly replace the attribute with a > common flag at iommu_domain level, but since in all cases the behaviour > is effectively global rather than actually per-domain, it seemed > reasonable to take it a step further. This passes compile-testing for > arm64 and x86, what do you think? It seems to miss a few bits, and also generally seems to be not actually apply to recent mainline or something like it due to different empty lines in a few places. Let me know what you think of the version here: http://git.infradead.org/users/hch/misc.git/shortlog/refs/heads/iommu-cleanup I'll happily switch the patch to you as the author if you're fine with that as well. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On 2021-03-10 09:25, Christoph Hellwig wrote: On Wed, Mar 10, 2021 at 10:15:01AM +0100, Christoph Hellwig wrote: On Thu, Mar 04, 2021 at 03:25:27PM +, Robin Murphy wrote: On 2021-03-01 08:42, Christoph Hellwig wrote: Use explicit methods for setting and querying the information instead. Now that everyone's using iommu-dma, is there any point in bouncing this through the drivers at all? Seems like it would make more sense for the x86 drivers to reflect their private options back to iommu_dma_strict (and allow Intel's caching mode to override it as well), then have iommu_dma_init_domain just test !iommu_dma_strict && domain->ops->flush_iotlb_all. Hmm. I looked at this, and kill off ->dma_enable_flush_queue for the ARM drivers and just looking at iommu_dma_strict seems like a very clear win. OTOH x86 is a little more complicated. AMD and intel defaul to lazy mode, so we'd have to change the global iommu_dma_strict if they are initialized. Also Intel has not only a "static" option to disable lazy mode, but also a "dynamic" one where it iterates structure. So I think on the get side we're stuck with the method, but it still simplifies the whole thing. Actually... Just mirroring the iommu_dma_strict value into struct iommu_domain should solve all of that with very little boilerplate code. Yes, my initial thought was to directly replace the attribute with a common flag at iommu_domain level, but since in all cases the behaviour is effectively global rather than actually per-domain, it seemed reasonable to take it a step further. This passes compile-testing for arm64 and x86, what do you think? Robin. ->8- Subject: [PATCH] iommu: Consolidate strict invalidation handling Now that everyone is using iommu-dma, the global invalidation policy really doesn't need to be woven through several parts of the core API and individual drivers, we can just look it up directly at the one point that we now make the flush queue decision. If the x86 drivers reflect their internal options and overrides back to iommu_dma_strict, that can become the canonical source. Signed-off-by: Robin Murphy --- drivers/iommu/amd/iommu.c | 2 ++ drivers/iommu/dma-iommu.c | 8 +--- drivers/iommu/intel/iommu.c | 12 drivers/iommu/iommu.c | 35 +++ include/linux/iommu.h | 2 ++ 5 files changed, 44 insertions(+), 15 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index a69a8b573e40..1db29e59d468 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1856,6 +1856,8 @@ int __init amd_iommu_init_dma_ops(void) else pr_info("Lazy IO/TLB flushing enabled\n"); + iommu_set_dma_strict(amd_iommu_unmap_flush); + return 0; } diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index af765c813cc8..789a950cc125 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -304,10 +304,6 @@ static void iommu_dma_flush_iotlb_all(struct iova_domain *iovad) cookie = container_of(iovad, struct iommu_dma_cookie, iovad); domain = cookie->fq_domain; - /* -* The IOMMU driver supporting DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE -* implies that ops->flush_iotlb_all must be non-NULL. -*/ domain->ops->flush_iotlb_all(domain); } @@ -334,7 +330,6 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, struct iommu_dma_cookie *cookie = domain->iova_cookie; unsigned long order, base_pfn; struct iova_domain *iovad; - int attr; if (!cookie || cookie->type != IOMMU_DMA_IOVA_COOKIE) return -EINVAL; @@ -371,8 +366,7 @@ static int iommu_dma_init_domain(struct iommu_domain *domain, dma_addr_t base, init_iova_domain(iovad, 1UL << order, base_pfn); if (!cookie->fq_domain && (!dev || !dev_is_untrusted(dev)) && - !iommu_domain_get_attr(domain, DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE, ) && - attr) { + domain->ops->flush_iotlb_all && !iommu_get_dma_strict()) { if (init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all, iommu_dma_entry_dtor)) pr_warn("iova flush queue initialization failed\n"); diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index b5c746f0f63b..f5b452cd1266 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4377,6 +4377,17 @@ int __init intel_iommu_init(void) down_read(_global_lock); for_each_active_iommu(iommu, drhd) { + if (!intel_iommu_strict && cap_caching_mode(iommu->cap)) { + /* +* The flush queue implementation does not perform page-selective +* invalidations that are required for efficient TLB flushes in +* virtual environments. The benefit of
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On Wed, Mar 10, 2021 at 10:15:01AM +0100, Christoph Hellwig wrote: > On Thu, Mar 04, 2021 at 03:25:27PM +, Robin Murphy wrote: > > On 2021-03-01 08:42, Christoph Hellwig wrote: > >> Use explicit methods for setting and querying the information instead. > > > > Now that everyone's using iommu-dma, is there any point in bouncing this > > through the drivers at all? Seems like it would make more sense for the x86 > > drivers to reflect their private options back to iommu_dma_strict (and > > allow Intel's caching mode to override it as well), then have > > iommu_dma_init_domain just test !iommu_dma_strict && > > domain->ops->flush_iotlb_all. > > Hmm. I looked at this, and kill off ->dma_enable_flush_queue for > the ARM drivers and just looking at iommu_dma_strict seems like a > very clear win. > > OTOH x86 is a little more complicated. AMD and intel defaul to lazy > mode, so we'd have to change the global iommu_dma_strict if they are > initialized. Also Intel has not only a "static" option to disable > lazy mode, but also a "dynamic" one where it iterates structure. So > I think on the get side we're stuck with the method, but it still > simplifies the whole thing. Actually... Just mirroring the iommu_dma_strict value into struct iommu_domain should solve all of that with very little boilerplate code. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On Thu, Mar 04, 2021 at 03:25:27PM +, Robin Murphy wrote: > On 2021-03-01 08:42, Christoph Hellwig wrote: >> Use explicit methods for setting and querying the information instead. > > Now that everyone's using iommu-dma, is there any point in bouncing this > through the drivers at all? Seems like it would make more sense for the x86 > drivers to reflect their private options back to iommu_dma_strict (and > allow Intel's caching mode to override it as well), then have > iommu_dma_init_domain just test !iommu_dma_strict && > domain->ops->flush_iotlb_all. Hmm. I looked at this, and kill off ->dma_enable_flush_queue for the ARM drivers and just looking at iommu_dma_strict seems like a very clear win. OTOH x86 is a little more complicated. AMD and intel defaul to lazy mode, so we'd have to change the global iommu_dma_strict if they are initialized. Also Intel has not only a "static" option to disable lazy mode, but also a "dynamic" one where it iterates structure. So I think on the get side we're stuck with the method, but it still simplifies the whole thing. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On Thu, Mar 04, 2021 at 03:25:27PM +, Robin Murphy wrote: > On 2021-03-01 08:42, Christoph Hellwig wrote: >> Use explicit methods for setting and querying the information instead. > > Now that everyone's using iommu-dma, is there any point in bouncing this > through the drivers at all? Seems like it would make more sense for the x86 > drivers to reflect their private options back to iommu_dma_strict (and > allow Intel's caching mode to override it as well), then have > iommu_dma_init_domain just test !iommu_dma_strict && > domain->ops->flush_iotlb_all. Indeed. I switch to that. ___ Virtualization mailing list Virtualization@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/virtualization
Re: [PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
On 2021-03-01 08:42, Christoph Hellwig wrote: Use explicit methods for setting and querying the information instead. Now that everyone's using iommu-dma, is there any point in bouncing this through the drivers at all? Seems like it would make more sense for the x86 drivers to reflect their private options back to iommu_dma_strict (and allow Intel's caching mode to override it as well), then have iommu_dma_init_domain just test !iommu_dma_strict && domain->ops->flush_iotlb_all. Robin. Also remove the now unused iommu_domain_get_attr functionality. Signed-off-by: Christoph Hellwig --- drivers/iommu/amd/iommu.c | 23 ++--- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 47 ++--- drivers/iommu/arm/arm-smmu/arm-smmu.c | 56 + drivers/iommu/dma-iommu.c | 8 ++- drivers/iommu/intel/iommu.c | 27 ++ drivers/iommu/iommu.c | 19 +++ include/linux/iommu.h | 17 ++- 7 files changed, 51 insertions(+), 146 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index a69a8b573e40d0..37a8e51db17656 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1771,24 +1771,11 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev) return acpihid_device_group(dev); } -static int amd_iommu_domain_get_attr(struct iommu_domain *domain, - enum iommu_attr attr, void *data) +static bool amd_iommu_dma_use_flush_queue(struct iommu_domain *domain) { - switch (domain->type) { - case IOMMU_DOMAIN_UNMANAGED: - return -ENODEV; - case IOMMU_DOMAIN_DMA: - switch (attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - *(int *)data = !amd_iommu_unmap_flush; - return 0; - default: - return -ENODEV; - } - break; - default: - return -EINVAL; - } + if (domain->type != IOMMU_DOMAIN_DMA) + return false; + return !amd_iommu_unmap_flush; } /* @@ -2257,7 +2244,7 @@ const struct iommu_ops amd_iommu_ops = { .release_device = amd_iommu_release_device, .probe_finalize = amd_iommu_probe_finalize, .device_group = amd_iommu_device_group, - .domain_get_attr = amd_iommu_domain_get_attr, + .dma_use_flush_queue = amd_iommu_dma_use_flush_queue, .get_resv_regions = amd_iommu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, .is_attach_deferred = amd_iommu_is_attach_deferred, diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 8594b4a8304375..bf96172e8c1f71 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2449,33 +2449,21 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev) return group; } -static int arm_smmu_domain_get_attr(struct iommu_domain *domain, - enum iommu_attr attr, void *data) +static bool arm_smmu_dma_use_flush_queue(struct iommu_domain *domain) { struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); - switch (domain->type) { - case IOMMU_DOMAIN_UNMANAGED: - switch (attr) { - case DOMAIN_ATTR_NESTING: - *(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED); - return 0; - default: - return -ENODEV; - } - break; - case IOMMU_DOMAIN_DMA: - switch (attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - *(int *)data = smmu_domain->non_strict; - return 0; - default: - return -ENODEV; - } - break; - default: - return -EINVAL; - } + if (domain->type != IOMMU_DOMAIN_DMA) + return false; + return smmu_domain->non_strict; +} + + +static void arm_smmu_dma_enable_flush_queue(struct iommu_domain *domain) +{ + if (domain->type != IOMMU_DOMAIN_DMA) + return; + to_smmu_domain(domain)->non_strict = true; } static int arm_smmu_domain_set_attr(struct iommu_domain *domain, @@ -2505,13 +2493,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain, } break; case IOMMU_DOMAIN_DMA: - switch(attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - smmu_domain->non_strict = *(int *)data; - break; - default: - ret =
[PATCH 14/17] iommu: remove DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE
Use explicit methods for setting and querying the information instead. Also remove the now unused iommu_domain_get_attr functionality. Signed-off-by: Christoph Hellwig --- drivers/iommu/amd/iommu.c | 23 ++--- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 47 ++--- drivers/iommu/arm/arm-smmu/arm-smmu.c | 56 + drivers/iommu/dma-iommu.c | 8 ++- drivers/iommu/intel/iommu.c | 27 ++ drivers/iommu/iommu.c | 19 +++ include/linux/iommu.h | 17 ++- 7 files changed, 51 insertions(+), 146 deletions(-) diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c index a69a8b573e40d0..37a8e51db17656 100644 --- a/drivers/iommu/amd/iommu.c +++ b/drivers/iommu/amd/iommu.c @@ -1771,24 +1771,11 @@ static struct iommu_group *amd_iommu_device_group(struct device *dev) return acpihid_device_group(dev); } -static int amd_iommu_domain_get_attr(struct iommu_domain *domain, - enum iommu_attr attr, void *data) +static bool amd_iommu_dma_use_flush_queue(struct iommu_domain *domain) { - switch (domain->type) { - case IOMMU_DOMAIN_UNMANAGED: - return -ENODEV; - case IOMMU_DOMAIN_DMA: - switch (attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - *(int *)data = !amd_iommu_unmap_flush; - return 0; - default: - return -ENODEV; - } - break; - default: - return -EINVAL; - } + if (domain->type != IOMMU_DOMAIN_DMA) + return false; + return !amd_iommu_unmap_flush; } /* @@ -2257,7 +2244,7 @@ const struct iommu_ops amd_iommu_ops = { .release_device = amd_iommu_release_device, .probe_finalize = amd_iommu_probe_finalize, .device_group = amd_iommu_device_group, - .domain_get_attr = amd_iommu_domain_get_attr, + .dma_use_flush_queue = amd_iommu_dma_use_flush_queue, .get_resv_regions = amd_iommu_get_resv_regions, .put_resv_regions = generic_iommu_put_resv_regions, .is_attach_deferred = amd_iommu_is_attach_deferred, diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 8594b4a8304375..bf96172e8c1f71 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2449,33 +2449,21 @@ static struct iommu_group *arm_smmu_device_group(struct device *dev) return group; } -static int arm_smmu_domain_get_attr(struct iommu_domain *domain, - enum iommu_attr attr, void *data) +static bool arm_smmu_dma_use_flush_queue(struct iommu_domain *domain) { struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); - switch (domain->type) { - case IOMMU_DOMAIN_UNMANAGED: - switch (attr) { - case DOMAIN_ATTR_NESTING: - *(int *)data = (smmu_domain->stage == ARM_SMMU_DOMAIN_NESTED); - return 0; - default: - return -ENODEV; - } - break; - case IOMMU_DOMAIN_DMA: - switch (attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - *(int *)data = smmu_domain->non_strict; - return 0; - default: - return -ENODEV; - } - break; - default: - return -EINVAL; - } + if (domain->type != IOMMU_DOMAIN_DMA) + return false; + return smmu_domain->non_strict; +} + + +static void arm_smmu_dma_enable_flush_queue(struct iommu_domain *domain) +{ + if (domain->type != IOMMU_DOMAIN_DMA) + return; + to_smmu_domain(domain)->non_strict = true; } static int arm_smmu_domain_set_attr(struct iommu_domain *domain, @@ -2505,13 +2493,7 @@ static int arm_smmu_domain_set_attr(struct iommu_domain *domain, } break; case IOMMU_DOMAIN_DMA: - switch(attr) { - case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE: - smmu_domain->non_strict = *(int *)data; - break; - default: - ret = -ENODEV; - } + ret = -ENODEV; break; default: ret = -EINVAL; @@ -2619,7 +2601,8 @@ static struct iommu_ops arm_smmu_ops = { .probe_device = arm_smmu_probe_device, .release_device = arm_smmu_release_device, .device_group = arm_smmu_device_group, - .domain_get_attr= arm_smmu_domain_get_attr, +