From: David Stevens
Make __iommu_dma_map_swiotlb and __iommu_dma_unmap_swiotlb responsible
only for mapping and unmapping, and consistently use the
iommu_dma_sync_single_for_* and iommu_dma_sync_sg_for_* functions for
both the swiotlb sync and the arch sync. This ensures that the same code
path is responsible for syncing regardless of whether or not
DMA_ATTR_SKIP_CPU_SYNC is set.
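A minimal sketch of the intended shape (names taken from this series;
the real signatures and error handling in drivers/iommu/dma-iommu.c are
simplified here): the map path does no syncing of its own and defers to
the shared sync helper.

/* Sketch only: assumes __iommu_dma_map_swiotlb() and
 * iommu_dma_sync_single_for_device() as used elsewhere in this series. */
static dma_addr_t iommu_dma_map_page(struct device *dev, struct page *page,
		unsigned long offset, size_t size, enum dma_data_direction dir,
		unsigned long attrs)
{
	phys_addr_t phys = page_to_phys(page) + offset;
	dma_addr_t iova;

	/* Mapping only: no arch or swiotlb sync in here. */
	iova = __iommu_dma_map_swiotlb(dev, phys, size, dma_get_mask(dev),
				       dir, attrs);

	/* The same helper syncs whether or not a bounce buffer was used. */
	if (iova != DMA_MAPPING_ERROR && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		iommu_dma_sync_single_for_device(dev, iova, size, dir);

	return iova;
}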
From: David Stevens
Add check for CONFIG_SWIOTLB to dev_is_untrusted, so that swiotlb
related code can be removed more aggressively.
Signed-off-by: David Stevens
---
drivers/iommu/dma-iommu.c | 26 +-
1 file changed, 13 insertions(+), 13 deletions(-)
diff --git
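Presumably something like the following; the pre-existing PCI check is
the dev_is_untrusted() body from dma-iommu.c, and the IS_ENABLED() test
is the addition this patch describes:

/* drivers/iommu/dma-iommu.c */
static bool dev_is_untrusted(struct device *dev)
{
	/* Short-circuits to false at compile time when SWIOTLB is off,
	 * letting the compiler drop the swiotlb-only code paths. */
	return IS_ENABLED(CONFIG_SWIOTLB) &&
	       dev_is_pci(dev) && to_pci_dev(dev)->untrusted;
}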
From: David Stevens
When calling arch_sync_dma, we need to pass it the memory that's
actually being used for dma. When using swiotlb bounce buffers, this is
the bounce buffer. Move arch_sync_dma into the __iommu_dma_map_swiotlb
helper, so it can use the bounce buffer address if necessary.
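A sketch of this message's approach, against the v5.13-era APIs the
series targets (signatures assumed, mapping details elided):

/* Sketch, drivers/iommu/dma-iommu.c. */
static dma_addr_t __iommu_dma_map_swiotlb(struct device *dev, phys_addr_t phys,
		size_t size, u64 dma_mask, enum dma_data_direction dir,
		unsigned long attrs)
{
	bool coherent = dev_is_dma_coherent(dev);
	int prot = dma_info_to_prot(dir, coherent, attrs);

	if (dev_is_untrusted(dev))
		phys = swiotlb_tbl_map_single(dev, phys, size, size, dir, attrs);
	if (phys == (phys_addr_t)DMA_MAPPING_ERROR)
		return DMA_MAPPING_ERROR;

	/* Sync the memory the device will actually touch: 'phys' is the
	 * bounce buffer here whenever swiotlb was used above. */
	if (!coherent && !(attrs & DMA_ATTR_SKIP_CPU_SYNC))
		arch_sync_dma_for_device(phys, size, dir);

	return __iommu_dma_map(dev, phys, size, prot, dma_mask);
}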
From: David Stevens
If SKIP_CPU_SYNC isn't already set, then iommu_dma_unmap_(page|sg) has
already called iommu_dma_sync_(single|sg)_for_cpu, so there is no need
to copy from the bounce buffer again.
Signed-off-by: David Stevens
---
drivers/iommu/dma-iommu.c | 3 ++-
1 file changed, 2
From: David Stevens
Introduce a new dev_use_swiotlb function to guard swiotlb code, instead
of overloading dev_is_untrusted. This allows CONFIG_SWIOTLB to be
checked more broadly, so the swiotlb related code can be removed more
aggressively.
Signed-off-by: David Stevens
---
drivers/iommu/dma
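The likely shape, given the description (the earlier dev_is_untrusted()
patch minus its CONFIG_SWIOTLB check, which moves here):

static bool dev_use_swiotlb(struct device *dev)
{
	return IS_ENABLED(CONFIG_SWIOTLB) && dev_is_untrusted(dev);
}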
On Thu, Jul 8, 2021 at 10:38 PM Lu Baolu wrote:
>
> Hi David,
>
> I like this idea. Thanks for proposing this.
>
> On 2021/7/7 15:55, David Stevens wrote:
> > Add support for per-domain dynamic pools of iommu bounce buffers to the
> > dma-iommu API. This allows iommu mappings to be reused while still
> > maintaining strict iommu protection.
From: David Stevens
Expose a few helper functions from dma-iommu to the rest of the module.
Signed-off-by: David Stevens
---
drivers/iommu/dma-iommu.c | 27 ++-
include/linux/dma-iommu.h | 12
2 files changed, 26 insertions(+), 13 deletions(-)
diff --git
From: David Stevens
Add support for per-domain dynamic pools of iommu bounce buffers to the
dma-iommu api. When enabled, all non-direct streaming mappings below a
configurable size will go through bounce buffers.
Each domain has its own buffer pool. Each buffer pool is split into
multiple power-of-2 size classes. [...] performance
comparison because virtio-iommu and vfio_iommu_type1 both have big
locks that significantly limit multithreaded DMA performance.
This patch set is based on v5.13-rc7 plus the patches at [1].
David Stevens (4):
dma-iommu: add kalloc gfp flag to alloc helper
dma-iommu: replace device arguments
From: David Stevens
Add gfp flag for kalloc calls within __iommu_dma_alloc_pages, so the
function can be called from atomic contexts.
Signed-off-by: David Stevens
---
drivers/iommu/dma-iommu.c | 13 +++--
1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/iommu/dma
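A simplified sketch of the change: the order_mask handling of the real
helper is elided; the point is that the page-array allocation now uses
the caller's gfp instead of an assumed hardcoded GFP_KERNEL, so atomic
callers work too.

static struct page **__iommu_dma_alloc_pages(struct device *dev,
		unsigned int count, gfp_t gfp)
{
	struct page **pages;
	unsigned int i;

	/* The array allocation itself now honours the caller's gfp. */
	pages = kvzalloc(count * sizeof(*pages), gfp);
	if (!pages)
		return NULL;

	for (i = 0; i < count; i++) {
		pages[i] = alloc_pages_node(dev_to_node(dev), gfp, 0);
		if (!pages[i])
			goto out_free;
	}
	return pages;

out_free:
	while (i--)
		__free_page(pages[i]);
	kvfree(pages);
	return NULL;
}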
From: David Stevens
Replace the struct device argument with the device's nid in
__iommu_dma_alloc_pages, since it doesn't need the whole struct. This
allows it to be called from places which don't have access to the
device.
Signed-off-by: David Stevens
---
drivers/iommu/dma-iommu.c | 10
From: David Stevens
This patch set includes two fixes for bugs caused by mixing up the
original buffer's physical address and bounce buffer's physical address.
It also includes a performance fix that avoids an extra copy, as well as
a general cleanup fix.
The issues were found via code inspection.
From: David Stevens
The is_swiotlb_buffer function takes the physical address of the swiotlb
buffer, not the physical address of the original buffer. The sglist
contains the physical addresses of the original buffer, so for the
sync_sg functions to work properly when a bounce buffer might have been
used, we need to use iommu_iova_to_phys to look up the physical address.
This is what sync_single does, so call that function on each sglist
segment.
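A sketch of that shape (names as in dma-iommu.c; the trusted-device
path is shown for contrast):

static void iommu_dma_sync_sg_for_cpu(struct device *dev,
		struct scatterlist *sgl, int nelems,
		enum dma_data_direction dir)
{
	struct scatterlist *sg;
	int i;

	if (dev_is_untrusted(dev)) {
		/* sg_phys(sg) is the *original* buffer; don't hand it to
		 * is_swiotlb_buffer()/swiotlb_sync_*() directly. The
		 * single-entry sync finds the bounce slot behind the iova. */
		for_each_sg(sgl, sg, nelems, i)
			iommu_dma_sync_single_for_cpu(dev, sg_dma_address(sg),
						      sg->length, dir);
		return;
	}

	if (dev_is_dma_coherent(dev))
		return;

	for_each_sg(sgl, sg, nelems, i)
		arch_sync_dma_for_cpu(sg_phys(sg), sg->length, dir);
}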
On Fri, Jul 9, 2021 at 2:14 AM Robin Murphy wrote:
>
> On 2021-07-08 10:29, Joerg Roedel wrote:
> > Adding Robin too.
> >
> > On Wed, Jul 07, 2021 at 04:55:01PM +0900, David Stevens wrote:
> >> Add support for per-domain dynamic pools of iommu bounce buffers to the
> >> dma-iommu API.
From: David Stevens
This patch set includes various fixes for dma-iommu's swiotlb bounce
buffers for untrusted devices. There are four fixes for correctness
issues, one for a performance issue, and one for general cleanup.
The min_align_mask issue was found when running fio on an untrusted nvme
device with bs=512. The other issues were found via code inspection.
From: David Stevens
When calling arch_sync_dma, we need to pass it the memory that's
actually being used for dma. When using swiotlb bounce buffers, this is
the bounce buffer. Fold __iommu_dma_map_swiotlb into iommu_dma_map_page
so it can sync the right phys_addr_t.
Now that iommu_dma_map_sg
From: David Stevens
Calling the iommu_dma_sync_*_for_cpu functions during unmap can cause
two copies out of the swiotlb buffer. Fold __iommu_dma_unmap_swiotlb
into iommu_dma_unmap_page, and directly call arch_sync_dma_for_cpu
instead of iommu_dma_sync_single_for_cpu to avoid this double sync.
From: David Stevens
For devices which set min_align_mask, swiotlb preserves the offset of
the original physical address within that mask. Since __iommu_dma_map
accounts for non-aligned addresses, passing a non-aligned swiotlb
address with the swiotlb aligned size results in the offset being
accounted for twice.
From: David Stevens
Add an argument to swiotlb_tbl_map_single that specifies the desired
alignment of the allocated buffer. This is used by dma-iommu to ensure
the buffer is aligned to the iova granule size when using swiotlb with
untrusted sub-granule mappings. This addresses an issue where adjacent
slots could be exposed to the untrusted device if IO_TLB_SIZE < iova
granule < PAGE_SIZE.
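An assumed call site on the dma-iommu side; the fifth argument
(alloc_align_mask) is what this patch adds, and iova_mask() yields
granule - 1:

static phys_addr_t bounce_map_aligned(struct device *dev,
		struct iova_domain *iovad, phys_addr_t phys, size_t size,
		enum dma_data_direction dir, unsigned long attrs)
{
	size_t aligned_size = iova_align(iovad, size);

	/* The new alignment mask keeps a sub-granule mapping from sharing
	 * an iova granule with a neighbouring swiotlb allocation. */
	return swiotlb_tbl_map_single(dev, phys, size, aligned_size,
				      iova_mask(iovad), dir, attrs);
}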
From: David Stevens
Fold the _swiotlb helper functions into the respective _page functions,
since recent fixes have moved all logic from the _page functions to the
_swiotlb functions.
Signed-off-by: David Stevens
---
drivers/iommu/dma-iommu.c | 136 +-
1
From: David Stevens
Calling the iommu_dma_sync_*_for_cpu functions during unmap can cause
two copies out of the swiotlb buffer. Do the arch sync directly in
__iommu_dma_unmap_swiotlb instead to avoid this. This makes the call to
iommu_dma_sync_sg_for_cpu for untrusted devices in iommu_dma_unmap_sg no
longer necessary, so move that invocation later in the function.
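Sketched against the v5.13-era swiotlb API this series targets (where
is_swiotlb_buffer() takes only a phys_addr_t); helper names are from
this series:

static void __iommu_dma_unmap_swiotlb(struct device *dev, dma_addr_t dma_addr,
		size_t size, enum dma_data_direction dir, unsigned long attrs)
{
	phys_addr_t phys;

	phys = iommu_iova_to_phys(iommu_get_dma_domain(dev), dma_addr);
	if (WARN_ON(!phys))
		return;

	/* Arch sync only. Going through iommu_dma_sync_*_for_cpu here
	 * would also copy out of the bounce buffer, and
	 * swiotlb_tbl_unmap_single() below copies out as well - that is
	 * the double copy being avoided. */
	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC) && !dev_is_dma_coherent(dev))
		arch_sync_dma_for_cpu(phys, size, dir);

	__iommu_dma_unmap(dev, dma_addr, size);

	if (unlikely(is_swiotlb_buffer(phys)))
		swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs);
}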
- Split fixes into dedicated patches
- Less invasive changes to fix arch_sync when mapping
- Leave dev_is_untrusted check for strict iommu
David Stevens (7):
dma-iommu: fix sync_sg with swiotlb
dma-iommu: fix arch_sync_dma for map
dma-iommu: skip extra sync during unmap w/swiotlb
dma-iommu: fold _swiotlb helpers into _page functions
From: David Stevens
Pass the non-aligned size to __iommu_dma_map when using swiotlb bounce
buffers in iommu_dma_map_page, to account for min_align_mask.
To deal with granule alignment, __iommu_dma_map maps iova_align(size +
iova_off) bytes starting at phys - iova_off. If iommu_dma_map_page
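To make the granule arithmetic concrete, a small worked sketch
(hypothetical helper, values chosen for illustration):

static void show_map_arithmetic(struct iova_domain *iovad, phys_addr_t phys,
				size_t size)
{
	unsigned long iova_off = iova_offset(iovad, phys);

	/* e.g. with a 4 KiB granule, phys = 0x12200, size = 0x600:
	 * iova_off = 0x200, mapped = iova_align(0x800) = 0x1000 bytes,
	 * base = 0x12000. Passing a swiotlb-aligned 'size' here instead
	 * would account for the offset twice. */
	size_t mapped = iova_align(iovad, size + iova_off);
	phys_addr_t base = phys - iova_off;

	pr_info("map %pap +%zx -> %pap +%zx\n", &phys, size, &base, mapped);
}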
On Thu, Aug 19, 2021 at 6:03 PM Robin Murphy wrote:
>
> On 2021-08-17 02:38, David Stevens wrote:
> > From: David Stevens
> >
> > For devices which set min_align_mask, swiotlb preserves the offset of
> > the original physical address within that mask. Since __iommu_dma_map
Is there further feedback on these patches? Only patch 7 is still
pending review.
-David
On Mon, Aug 30, 2021 at 2:00 PM David Stevens wrote:
>
> This patch set includes various fixes for dma-iommu's swiotlb bounce
> buffers for untrusted devices.
>
> The min_align_mask issue
On Tue, Aug 10, 2021 at 10:19 AM Mi, Dapeng1 wrote:
>
> Hi David,
>
> I like this patch set; it is crucial for reducing the significant vIOMMU
> performance overhead. It looks like you totally rewrote the IOMMU
> mapping/unmapping part and use the dynamically allocated memory from
> buddy system as bounce
From: David Stevens
This patch set includes various fixes for dma-iommu's swiotlb bounce
buffers for untrusted devices. There are three fixes for correctness
issues, one for a performance issue, and one for general cleanup.
The min_align_mask issue was found when running fio on an untrusted nvme
device
From: David Stevens
After syncing in map/unmap, add the DMA_ATTR_SKIP_CPU_SYNC flag so
anything that uses attrs later on will skip any sync work that has
already been completed. In particular, this skips copying from the
swiotlb twice during unmap.
Signed-off-by: David Stevens
---
drivers
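Roughly the following pattern in the unmap path (a sketch; signatures
simplified):

static void iommu_dma_unmap_page(struct device *dev, dma_addr_t dma_handle,
		size_t size, enum dma_data_direction dir, unsigned long attrs)
{
	if (!(attrs & DMA_ATTR_SKIP_CPU_SYNC)) {
		iommu_dma_sync_single_for_cpu(dev, dma_handle, size, dir);
		/* Mark the sync as done so the swiotlb copy-out below
		 * (which also honours this flag) is not repeated. */
		attrs |= DMA_ATTR_SKIP_CPU_SYNC;
	}

	__iommu_dma_unmap_swiotlb(dev, dma_handle, size, dir, attrs);
}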
On Thu, Aug 12, 2021 at 3:47 AM Robin Murphy wrote:
>
> On 2021-08-11 03:42, David Stevens wrote:
> > From: David Stevens
> >
> > When calling arch_sync_dma, we need to pass it the memory that's
> > actually being used for dma. When using swiotlb bounce buffers,
On Mon, Aug 2, 2021 at 10:30 PM Will Deacon wrote:
>
> On Fri, Jul 09, 2021 at 12:34:59PM +0900, David Stevens wrote:
> > From: David Stevens
> >
> > The is_swiotlb_buffer function takes the physical address of the swiotlb
> > buffer, not the physical address of the original buffer.
On Mon, Aug 2, 2021 at 10:54 PM Will Deacon wrote:
>
> On Fri, Jul 09, 2021 at 12:35:01PM +0900, David Stevens wrote:
> > From: David Stevens
> >
> > If SKIP_CPU_SYNC isn't already set, then iommu_dma_unmap_(page|sg) has
> > already called iommu_dma_sync_(single|sg)_for_cpu, so there is no need
From: David Stevens
Add per-domain pools for IOMMU mapped bounce buffers. Each domain has 8
buffer pools, which hold buffers of size 2^n pages. Buffers are
allocated on demand, and unused buffers are periodically released from
the cache. Single use buffers are still used for mappings
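A hypothetical pool-index helper consistent with that description
(names invented for illustration):

#define IO_BUFFER_NUM_POOLS	8

static int io_buffer_pool_idx(size_t size)
{
	unsigned int npages = PAGE_ALIGN(size) >> PAGE_SHIFT;
	unsigned int order = order_base_2(npages);	/* 2^order >= npages */

	/* Pool n holds buffers of 2^n pages; anything larger falls back
	 * to a single-use buffer. */
	return order < IO_BUFFER_NUM_POOLS ? order : -1;
}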
From: David Stevens
Add config that uses IOMMU bounce buffer pools to avoid IOMMU
interactions as much as possible for relatively small streaming DMA
operations. This can lead to significant performance improvements on
systems where IOMMU map/unmap operations are very slow, such as when
running
From: David Stevens
Add a DMA_ATTR_PERSISTENT_STREAMING flag which indicates that the
streaming mapping is long lived and that the caller will manage
coherency either through the dma_sync_* functions or via some other
use-case specific mechanism. This flag indicates to the platform
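Hypothetical usage; both the flag's bit position and its availability
are proposals of this series, not mainline:

#define DMA_ATTR_PERSISTENT_STREAMING	(1UL << 10)	/* proposed */

static dma_addr_t map_persistent(struct device *dev, struct page *page)
{
	/* The caller keeps this mapping long-term and brackets each use
	 * with dma_sync_single_for_{cpu,device}() itself. */
	return dma_map_page_attrs(dev, page, 0, PAGE_SIZE, DMA_BIDIRECTIONAL,
				  DMA_ATTR_PERSISTENT_STREAMING);
}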
From: David Stevens
Add support for dynamic bounce buffers to the dma-api for use with
subgranule IOMMU mappings with untrusted devices. Bounce buffer
management is split into two parts. First, there is a buffer manager
that is responsible for allocating and tracking buffers. Second
From: David Stevens
Add a callback to the buffer manager's removal function so that the
buffer can be synced during unmap without an extra find operation.
Signed-off-by: David Stevens
---
drivers/iommu/io-bounce-buffers.c | 87 +--
drivers/iommu/io-buffer-manager.c | 6
From: David Stevens
A new pooled bounce buffer implementation will be added to reduce IOMMU
interactions on platforms with slow IOMMUs. The new implementation can
also support using bounce buffers with untrusted devices, so the current
basic bounce buffer support can be reverted.
This reverts
From: David Stevens
This patch series adds support for per-domain dynamic pools of iommu
bounce buffers to the dma-iommu API. This allows iommu mappings to be
reused while still maintaining strict iommu protection.
This bounce buffer support is used to add a new config option that, when
enabled
From: David Stevens
Only clear the padding bytes in bounce buffers, since syncing from the
original buffer already overwrites the non-padding bytes.
Signed-off-by: David Stevens
---
drivers/iommu/io-bounce-buffers.c | 64 +--
drivers/iommu/io-buffer-manager.c | 7
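The described copy-in, as a sketch (function and parameter names are
hypothetical):

static void io_bounce_copy_in(void *bounce, const void *orig, size_t size,
			      size_t pad_head, size_t pad_tail)
{
	/* Only the padding needs clearing: the middle bytes are about to
	 * be overwritten by the sync from the original buffer anyway. */
	memset(bounce, 0, pad_head);
	memcpy(bounce + pad_head, orig, size);
	memset(bounce + pad_head + size, 0, pad_tail);
}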
From: David Stevens
Use the new DMA_ATTR_PERSISTENT_STREAMING for long lived dma mappings
which directly handle CPU cache coherency instead of using dma_sync_*.
Signed-off-by: David Stevens
---
drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 4 +++-
drivers/gpu/drm/i915/i915_gem_gtt.c | 3
On Thu, Aug 12, 2021 at 4:12 AM Robin Murphy wrote:
>
> On 2021-08-11 03:42, David Stevens wrote:
> > From: David Stevens
> >
> > For devices which set min_align_mask, swiotlb preserves the offset of
> > the original physical address within that mask. Since __iommu_dma_map
From: David Stevens
Fix RW protection check when making a pte, so that it properly checks
that both R and W flags are set, instead of either R or W.
Signed-off-by: David Stevens
---
drivers/iommu/sun50i-iommu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu
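The fix is plausibly the following one-line change in the pte-creation
helper (the ACI names are from sun50i-iommu.c; the exact context lines
are assumptions):

-	if (prot & (IOMMU_READ | IOMMU_WRITE))
+	if ((prot & (IOMMU_READ | IOMMU_WRITE)) == (IOMMU_READ | IOMMU_WRITE))
 		aci = SUN50I_IOMMU_ACI_RD_WR;
 	else if (prot & IOMMU_READ)
 		aci = SUN50I_IOMMU_ACI_RD;
 	else if (prot & IOMMU_WRITE)
 		aci = SUN50I_IOMMU_ACI_WR;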
From: David Stevens
Fall back to domain selective flush if the target address is not aligned
to the mask being used for invalidation. This is necessary because page
selective invalidation masks out the lower order bits of the target
address based on the mask value, so if a non-aligned address
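A sketch of the fallback logic (intel-iommu internals; the exact call
site and wrapper name are assumed):

static void flush_iotlb_psi_safe(struct intel_iommu *iommu, u16 did,
				 u64 addr, unsigned int mask, u64 ih)
{
	/* PSI masks off the low bits of 'addr'; if the target is not
	 * aligned to the block being invalidated, stale entries in the
	 * unaligned tail would survive. Fall back to a domain flush. */
	if (!IS_ALIGNED(addr, VTD_PAGE_SIZE << mask))
		iommu->flush.flush_iotlb(iommu, did, 0, 0, DMA_TLB_DSI_FLUSH);
	else
		iommu->flush.flush_iotlb(iommu, did, addr | ih, mask,
					 DMA_TLB_PSI_FLUSH);
}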
From: David Stevens
Calculate the appropriate mask for non-size-aligned page selective
invalidation. Since psi uses the mask value to mask out the lower order
bits of the target address, properly flushing the iotlb requires using a
mask value such that [pfn, pfn+pages) all lie within the flushed
size-aligned region.
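A sketch of the mask computation this implies (helper name
hypothetical): grow the mask until one aligned 2^mask-page block covers
the whole range.

static unsigned int psi_mask(unsigned long pfn, unsigned long pages)
{
	unsigned long end_pfn = pfn + pages - 1;
	unsigned int mask = order_base_2(pages);	/* size-based start */

	/* PSI clears the low 'mask' bits of the address, so pfn and
	 * end_pfn must agree on every bit above them. */
	while ((pfn ^ end_pfn) >> mask)
		mask++;

	return mask;
}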
On Fri, Mar 25, 2022 at 4:15 PM Zhang, Tina wrote:
>
> > From: iommu On Behalf Of Tian, Kevin
> > Sent: Friday, March 25, 2022 2:14 PM
> > To: David Stevens ; Lu Baolu
> > Cc: iommu@lists.linux-found
On Tue, May 24, 2022 at 9:27 PM Niklas Schnelle wrote:
>
> On Fri, 2021-08-06 at 19:34 +0900, David Stevens wrote:
> > From: David Stevens
> >
> > This patch series adds support for per-domain dynamic pools of iommu
> > bounce buffers to the dma-iommu API.
On Fri, Jun 3, 2022 at 11:53 PM Niklas Schnelle wrote:
>
> On Fri, 2022-05-27 at 10:25 +0900, David Stevens wrote:
> > On Tue, May 24, 2022 at 9:27 PM Niklas Schnelle
> > wrote:
> > > On Fri, 2021-08-06 at 19:34 +0900, David Stevens wrote:
> > > > From: David Stevens