Switch from coherent DMA pools to those backed by dma_alloc_pages. This
helps devices with non-coherent DMA avoid host accesses to uncached
memory for every submission of an I/O larger than a single entry.
Signed-off-by: Christoph Hellwig
---
drivers/nvme/host/pci.c | 80
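For illustration, a minimal sketch of the resulting pattern (variable
names are placeholders, not the actual nvme-pci code): the descriptor
page comes from dma_alloc_pages(), so the CPU builds the PRP list
through its cache and pays one cache flush per submission instead of
uncached accesses for every entry.

struct page *prp_page;
dma_addr_t prp_dma;
u64 *prps;

/* allocate a device-addressable page that behaves like alloc_pages() */
prp_page = dma_alloc_pages(dev, PAGE_SIZE, &prp_dma,
                           DMA_TO_DEVICE, GFP_KERNEL);
if (!prp_page)
        return -ENOMEM;
prps = page_address(prp_page);

/* ... fill in the PRP entries through the cache ... */
/* a single flush hands the whole list to the device */
dma_sync_single_for_device(dev, prp_dma, PAGE_SIZE, DMA_TO_DEVICE);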
There is no harm in just always clearing the SME encryption bit, while
significantly simplifying the interface.
Signed-off-by: Christoph Hellwig
---
arch/arm/include/asm/dma-direct.h | 2 +-
arch/mips/bmips/dma.c | 2 +-
arch/mips/cavium-octeon/dma-octeon.c | 2 +-
arc
All users are gone now, remove the API.
Signed-off-by: Christoph Hellwig
---
arch/mips/Kconfig | 1 -
arch/mips/jazz/jazzdma.c | 1 -
arch/mips/mm/dma-noncoherent.c | 6 --
arch/parisc/Kconfig | 1 -
arch/parisc/kernel/pci-dma.c | 6 --
include/l
All operations are based on the controller page size, not the host page size.
Switch the dma pool to use the controller page size as well to avoid
massive overallocations on large page size systems.
Signed-off-by: Christoph Hellwig
---
drivers/nvme/host/pci.c | 3 ++-
1 file changed, 2 insertions(+), 1 de
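The core of the change is a one-argument swap in the pool setup; a
sketch, assuming the NVME_CTRL_PAGE_SIZE constant from the nvme core
(which can be much smaller than a 64k host PAGE_SIZE):

/* before: entries sized by the host page size */
pool = dma_pool_create("prp list page", dev, PAGE_SIZE, PAGE_SIZE, 0);

/* after: entries sized by what the controller actually uses */
pool = dma_pool_create("prp list page", dev, NVME_CTRL_PAGE_SIZE,
                       NVME_CTRL_PAGE_SIZE, 0);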
Use the proper modern API to transfer cache ownership for incoherent DMA.
Note that this moves the DMA helpers to the main lib82596.c file, so
that they can use virt_to_dma.
Signed-off-by: Christoph Hellwig
---
drivers/net/ethernet/i825xx/lasi_82596.c | 11 +--
drivers/net/ethernet/i825xx/lib82
Replace the currently open coded copy.
Signed-off-by: Christoph Hellwig
---
kernel/dma/direct.c | 5 +
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 01120510968fa1..2e280b9c063449 100644
--- a/kernel/dma/direct.c
+++ b/kernel/dma/
Add a new variant of a dmapool that uses non-coherent memory from
dma_alloc_pages. Unlike the existing mempool_create, this one
initializes a pool allocated by the caller to avoid a pointless extra
allocation. At some point it might be worth also switching the coherent
allocation over to a simila
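The final signature is cut off above, so purely as an illustration of
the caller-initialized idea, a hypothetical shape (the function name,
parameters, and embedded pool are invented here):

/* hypothetical API sketch -- not the actual interface */
struct dma_pool pool;   /* embedded by the caller, no extra allocation */

dma_pool_init_noncoherent(&pool, "descriptors", dev,
                          sizeof(struct my_desc), /* entry size */
                          64,                     /* alignment */
                          DMA_BIDIRECTIONAL);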
dma_declare_coherent_memory should not be in a DMA API guide aimed
at driver writers (that is, consumers of the API). Move it to a comment
near the function instead.
Signed-off-by: Christoph Hellwig
---
Documentation/core-api/dma-api.rst | 24
kernel/dma/coherent.c
Add a new file that contains helpers for misc DMA ops, which is only
built when CONFIG_DMA_OPS is set.
Signed-off-by: Christoph Hellwig
---
kernel/dma/Makefile | 1 +
kernel/dma/mapping.c | 47 +---
kernel/dma/ops_helpers.c | 51 +
Add a new API to allocate and free pages that are guaranteed to be
addressable by a device, but otherwise behave like pages allocated by
alloc_pages. The intended APIs to sync them for use with the device
and cpu are dma_sync_single_for_{device,cpu} that are also used for
streaming mappings.
Swit
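A generic sketch of the intended lifecycle, assuming a device "dev" and
a buffer used in both directions:

struct page *page;
dma_addr_t dma;
void *buf;

/* pages are guaranteed device-addressable but otherwise ordinary */
page = dma_alloc_pages(dev, size, &dma, DMA_BIDIRECTIONAL, GFP_KERNEL);
if (!page)
        return -ENOMEM;
buf = page_address(page);

memset(buf, 0, size);                   /* CPU owns the buffer */
dma_sync_single_for_device(dev, dma, size, DMA_TO_DEVICE);
/* ... device reads and writes the buffer ... */
dma_sync_single_for_cpu(dev, dma, size, DMA_FROM_DEVICE);
/* ... CPU reads the results ... */

dma_free_pages(dev, size, page, dma, DMA_BIDIRECTIONAL);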
Use the proper modern API to transfer cache ownership for incoherent DMA.
Signed-off-by: Christoph Hellwig
---
drivers/scsi/53c700.c | 11 +--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c
index 521950d0731e4a..57a08c42d00325
Use the proper modern API to transfer cache ownership for incoherent DMA.
Signed-off-by: Christoph Hellwig
---
drivers/net/ethernet/seeq/sgiseeq.c | 12
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/seeq/sgiseeq.c b/drivers/net/ethernet/seeq/sgi
When transferring DMA ownership back to the CPU there should never
be any writeback from the cache, as the buffer was owned by the
device until now. Instead it should just be invalidated for the
mapping directions where the device could have written data.
Note that the changes rely on the fact tha
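A sketch of what that means for an arch_sync_dma_for_cpu()
implementation (the cache primitive is a hypothetical stand-in for an
architecture's own helper):

void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
                           enum dma_data_direction dir)
{
        /*
         * The device owned the buffer until now, so the CPU cache
         * holds no dirty lines worth writing back.  For DMA_TO_DEVICE
         * the device never wrote to it, so nothing to do at all.
         */
        if (dir == DMA_TO_DEVICE)
                return;
        cache_invalidate_range(paddr, size);    /* hypothetical helper */
}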
Move the detailed gfp_t setup from __dma_direct_alloc_pages into the
caller to clean things up a little.
Signed-off-by: Christoph Hellwig
---
kernel/dma/direct.c | 12 +---
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c
index 8da9a
Signed-off-by: Christoph Hellwig
---
arch/mips/include/asm/jazzdma.h | 2 -
arch/mips/jazz/jazzdma.c | 70 -
2 files changed, 72 deletions(-)
diff --git a/arch/mips/include/asm/jazzdma.h b/arch/mips/include/asm/jazzdma.h
index d13f940022d5f9..c831da7fa8980
Use the proper modern API to transfer cache ownership for incoherent DMA.
This also means we can allocate the buffer memory with the proper
direction instead of bidirectional.
Signed-off-by: Christoph Hellwig
---
sound/mips/hal2.c | 44
1 file changed
Switch the 53c700 driver to only use non-coherent descriptor memory if it
really has to because dma_alloc_coherent fails. This doesn't matter for
any of the platforms it runs on currently, but that will change soon.
To help with this, two new helpers to transfer ownership to and from the
device ar
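A sketch of the fallback pattern being described (variable names are
illustrative, not the actual 53c700 code; dma_alloc_noncoherent() is
the new API this series introduces):

/* prefer coherent memory; fall back to non-coherent if it fails */
memory = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL);
if (!memory) {
        memory = dma_alloc_noncoherent(dev, size, &dma_handle,
                                       DMA_BIDIRECTIONAL, GFP_KERNEL);
        noncoherent = true;     /* ownership transfers needed later */
}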
Use the proper modern API to transfer cache ownership for incoherent DMA.
This also means we can allocate the memory as DMA_TO_DEVICE instead
of bidirectional.
Signed-off-by: Christoph Hellwig
---
drivers/scsi/sgiwd93.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/
The __phys_to_dma vs phys_to_dma distinction isn't exactly obvious. Try
to improve the situation by renaming __phys_to_dma to
phys_to_dma_unencrypted, and not forcing architectures that want to
override phys_to_dma to actually provide __phys_to_dma.
Signed-off-by: Christoph Hellwig
---
arch/arm/
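After the rename the relationship between the two helpers is easy to
state; a sketch matching the description (the exact upstream code may
differ):

/* phys_to_dma_unencrypted() returns the bus address ignoring SME */
static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
{
        /* layer the encryption bit on top of the unencrypted address */
        return __sme_set(phys_to_dma_unencrypted(dev, paddr));
}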
DMA_ATTR_NON_CONSISTENT is a no-op except on PARISC and some mips
configs, so don't set it in this ARM specific driver part.
Signed-off-by: Christoph Hellwig
---
drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/
Add back a hook to optimize dcache flushing after reading executable
code using DMA. This gets ia64 out of the business of pretending to
be dma incoherent just for this optimization.
Signed-off-by: Christoph Hellwig
---
arch/ia64/Kconfig | 3 +--
arch/ia64/kernel/dma-mapping.c |
This allows us to get rid of the LIB82596_DMA_ATTR define and prepare
for untangling the coherent vs non-coherent DMA allocation API.
Signed-off-by: Christoph Hellwig
---
drivers/net/ethernet/i825xx/lasi_82596.c | 24 ++--
drivers/net/ethernet/i825xx/lib82596.c | 36 --
The au1000-eth driver contains none of the manual cache synchronization
required for using DMA_ATTR_NON_CONSISTENT. From what I can tell it
can be used on both dma coherent and non-coherent DMA platforms, but
I suspect it has been buggy on the non-coherent platforms all along.
Signed-off-by: Chri
The jazzdma ops implement support for a very basic IOMMU. Thus we really
should not use the dma-direct code that takes physical address limits
into account. This survived through the great MIPS DMA ops cleanup mostly
because I was lazy, but now it is time to fully split the implementations.
Sign
To prevent a compiler error when a method called alloc_pages is
added (which I plan to do for the dma_map_ops).
Signed-off-by: Christoph Hellwig
---
include/linux/gfp.h | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 67a0774e08
DMA_ATTR_NON_CONSISTENT is a no-op except on PARISC and some mips
configs, so don't set it in this ARM specific driver.
Signed-off-by: Christoph Hellwig
---
drivers/gpu/drm/exynos/exynos_drm_gem.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/dri
The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused, and causes
weird gymnastics with the DMA_ATTR_NON_CONSISTENT flag, which is
unimplemented except on PARISC and some MIPS configs, and about to be
removed.
Signed-off-by: Christoph Hellwig
---
.../userspace-api/media/v4l/buffer.rst
Just merge these helpers into the main dma_direct_{alloc,free} routines,
as the additional checks are always false for the two callers.
Signed-off-by: Christoph Hellwig
---
arch/x86/kernel/amd_gart_64.c | 6 +++---
include/linux/dma-direct.h | 4
kernel/dma/direct.c | 39
Hi all,
this series replaces the DMA_ATTR_NON_CONSISTENT flag to dma_alloc_attrs
with a separate new dma_alloc_pages API, which is available on all
platforms. In addition to cleaning up the convoluted code path, this
ensures that other drivers that have asked for better support for
non-coherent D
Applied to the dma-mapping tree. This should give us the whole merge
window to root out any obvious problems with drivers.
On Tue, Aug 18, 2020 at 07:11:26PM -0300, Thiago Jung Bauermann wrote:
> POWER secure guests (i.e., guests which use the Protection Execution
> Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but
> they don't need the SWIOTLB memory to be in low addresses since the
> hypervi
On Tue, Aug 18, 2020 at 05:10:06PM +0100, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 11:07:57AM +0100, Will Deacon wrote:
> > > > so I'm not sure
> > > > that we should be complicating the implementation like this to try to
> > > > make it "fast".
> > > >
> > > I agree that this patch make
On Tue, Aug 18, 2020 at 11:07:57AM +0100, Will Deacon wrote:
> On Tue, Aug 18, 2020 at 06:37:39PM +0900, Cho KyongHo wrote:
> > On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote:
> > > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote:
> > > > Cache maintenance operations in the
> -Original Message-
> From: Robin Murphy [mailto:robin.mur...@arm.com]
> Sent: Wednesday, August 19, 2020 2:31 AM
> To: Song Bao Hua (Barry Song) ; w...@kernel.org;
> j...@8bytes.org
> Cc: Zengtao (B) ;
> iommu@lists.linux-foundation.org; linux-arm-ker...@lists.infradead.org;
> Linuxarm
This fixes the below checkpatch issue:
WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using
octal permissions '0444'.
417: FILE: drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:417:
module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
-v4:
* cleanup the existing m
Polling by MSI isn't necessarily faster than polling by SEV. Tests on
hi1620 show hns3 100G NIC network throughput can improve from 25G to
27G if we disable MSI polling while running 16 netperf threads sending
UDP packets in size 32KB. TX throughput can improve from 7G to 7.7G for
single thread.
Th
patch 1/3 and patch 2/3 are the preparation for patch 3/3, which permits
users to disable MSI-based polling via the command line.
-v4:
with respect to Robin's comments
* cleanup the code of the existing module parameter disable_bypass
* add ARM_SMMU_OPT_MSIPOLL flag. on the other hand, we only need to
Just use module_param() - going out of the way to specify a "different"
name that's identical to the variable name is silly.
Signed-off-by: Barry Song
---
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3
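The change boils down to the following (sketch; the diff context above
is truncated):

/* before: a "different" name identical to the variable name */
module_param_named(disable_bypass, disable_bypass, bool, 0444);

/* after: module_param() declares the parameter under its own name */
module_param(disable_bypass, bool, 0444);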
On Tue, Aug 18, 2020 at 5:19 PM Mike Rapoport wrote:
>
> .clang-format | 2 ++
For the .clang-format bit:
Acked-by: Miguel Ojeda
Cheers,
Miguel
POWER secure guests (i.e., guests which use the Protected Execution
Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but
they don't need the SWIOTLB memory to be in low addresses since the
hypervisor doesn't have any addressing limitation.
This solves a SWIOTLB initializati
On Tue, Aug 18, 2020 at 9:26 AM Robin Murphy wrote:
> On 2020-08-18 16:29, Mauro Carvalho Chehab wrote:
> > Em Tue, 18 Aug 2020 15:47:55 +0100
> > Basically, the DT binding has this, for IOMMU:
> >
> >
> > smmu_lpae {
> > compatible = "hisilicon,smmu-lpae";
> > };
> >
> >
Christoph Hellwig writes:
> On Mon, Aug 17, 2020 at 06:46:58PM -0300, Thiago Jung Bauermann wrote:
>> POWER secure guests (i.e., guests which use the Protection Execution
>> Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but
>> they don't need the SWIOTLB memory to be i
On 2020-08-18 16:29, Mauro Carvalho Chehab wrote:
Hi Robin,
Em Tue, 18 Aug 2020 15:47:55 +0100
Robin Murphy escreveu:
On 2020-08-17 08:49, Mauro Carvalho Chehab wrote:
Add a driver for the Kirin 960/970 iommu.
As with the past series, this starts from the original 4.9 driver from
the 96boards
On Tue, 18 Aug 2020 at 16:17, Robin Murphy wrote:
>
> On 2020-08-18 07:04, Tom Murphy wrote:
> > Add a flush_iotlb_range to allow flushing of an iova range instead of a
> > full flush in the dma-iommu path.
> >
> > Allow the iommu_unmap_fast to return newly freed page table pages and
> > pass the
On Tue, Aug 18, 2020 at 11:07:57AM +0100, Will Deacon wrote:
> > > so I'm not sure
> > > that we should be complicating the implementation like this to try to
> > > make it "fast".
> > >
> > I agree that this patch makes the implementation of dma API a bit more
> > but I don't think this does not
Hi Robin,
Em Tue, 18 Aug 2020 15:47:55 +0100
Robin Murphy escreveu:
> On 2020-08-17 08:49, Mauro Carvalho Chehab wrote:
> > Add a driver for the Kirin 960/970 iommu.
> >
> > As on the past series, this starts from the original 4.9 driver from
> > the 96boards tree:
> >
> > https://github.c
From: Mike Rapoport
for_each_memblock() is used to iterate over memblock.memory in
a few places that use data from memblock_region rather than the memory
ranges.
Introduce separate for_each_mem_region() and for_each_reserved_mem_region()
to improve encapsulation of memblock internals from its us
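A sketch of the new region iterator in use, assuming the
for_each_mem_region() form described above:

struct memblock_region *r;

for_each_mem_region(r) {
        /* the region's own data is available directly */
        pr_info("memory: %pa + %pa\n", &r->base, &r->size);
}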
From: Mike Rapoport
Iteration over memblock.reserved with for_each_reserved_mem_region() used
__next_reserved_mem_region() that implemented a subset of
__next_mem_region().
Use __for_each_mem_range() and, essentially, __next_mem_region() with
appropriate parameters to reduce code duplication.
W
From: Mike Rapoport
The only user of memblock_mem_size() was the x86 setup code; it is gone
now and the memblock_mem_size() function can be removed.
Signed-off-by: Mike Rapoport
Reviewed-by: Baoquan He
---
include/linux/memblock.h | 1 -
mm/memblock.c | 15 ---
2 files changed
From: Mike Rapoport
* Replace magic numbers with defines
* Replace memblock_find_in_range() + memblock_reserve() with
memblock_phys_alloc_range()
* Stop checking for low memory size in reserve_crashkernel_low(). The
allocation from a limited range will fail anyway if there is not enough
memory
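A sketch of the resulting allocation (CRASH_ALIGN and
CRASH_ADDR_LOW_MAX stand in for the defines the patch introduces):

crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
                                       CRASH_ALIGN, CRASH_ADDR_LOW_MAX);
if (!crash_base)
        return;         /* the limited range had no room anyway */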
From: Mike Rapoport
Currently, the initrd image is reserved very early during setup and then it
might be relocated and re-reserved after the initial physical memory
mapping is created. The "late" reservation of memblock verifies that mapped
memory size exceeds the size of initrd, then checks whether
From: Mike Rapoport
There are several occurrences of the following pattern:
for_each_memblock(memory, reg) {
        start = __pfn_to_phys(memblock_region_memory_base_pfn(reg));
        end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));
/* do someth
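With the simplified iterator from this series the same loop becomes
(sketch):

phys_addr_t start, end;
u64 i;

for_each_mem_range(i, &start, &end) {
        /* do something with start and end */
}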
From: Mike Rapoport
There are several occurrences of the following pattern:
for_each_memblock(memory, reg) {
        start_pfn = memblock_region_memory_base_pfn(reg);
        end_pfn = memblock_region_memory_end_pfn(reg);
/* do something with start_pfn an
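The pfn-based iterator covers this case directly; a sketch:

unsigned long start_pfn, end_pfn;
int i, nid;

for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, &nid) {
        /* do something with start_pfn and end_pfn */
}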
From: Mike Rapoport
Currently for_each_mem_range() and for_each_mem_range_rev() iterators are
the most generic way to traverse memblock regions. As such, they have 8
parameters and they are hardly convenient to users. Most users choose to
utilize one of their wrappers and the only user that actua
From: Mike Rapoport
The only user of memblock_dbg() outside memblock was s390 setup code and it
is converted to use pr_debug() instead.
This allows us to stop exposing memblock_debug and memblock_dbg() to the rest
of the kernel.
Signed-off-by: Mike Rapoport
Reviewed-by: Baoquan He
---
arch/s390/
From: Mike Rapoport
microblaze supports neither NUMA nor SPARSEMEM, so there is no point in
calling memblock_set_node() and sparse_memory_present_with_active_regions()
during microblaze memory initialization.
Remove these calls and the surrounding code.
Signed-off-by: Mike Rapopor
From: Mike Rapoport
for_each_memblock_type() is not used outside mm/memblock.c, move it there
from include/linux/memblock.h
Signed-off-by: Mike Rapoport
Reviewed-by: Baoquan He
---
include/linux/memblock.h | 5 -
mm/memblock.c | 5 +
2 files changed, 5 insertions(+), 5 dele
From: Mike Rapoport
RISC-V does not (yet) support NUMA and for UMA architectures node 0 is
used implicitly during early memory initialization.
There is no need to call memblock_set_node(), remove this call and the
surrounding code.
Signed-off-by: Mike Rapoport
---
arch/riscv/mm/init.c | 9 --
From: Mike Rapoport
dummy_numa_init() loops over memblock.memory and passes nid=0 to
numa_add_memblk() which essentially wraps memblock_set_node(). However,
memblock_set_node() can cope with the entire memory span itself, so the loop
over memblock.memory regions is redundant.
Using a single call to
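The single call presumably looks like this (sketch; nid 0 matches what
the loop passed to numa_add_memblk()):

/* cover the whole span at once instead of looping over regions */
memblock_set_node(0, PHYS_ADDR_MAX, &memblock.memory, 0);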
From: Mike Rapoport
Instead of traversing memblock.memory regions to find memory_start and
memory_end, simply query memblock_{start,end}_of_DRAM().
Signed-off-by: Mike Rapoport
Acked-by: Stafford Horne
---
arch/h8300/kernel/setup.c | 8 +++-
arch/nds32/kernel/setup.c | 8 ++--
a
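A sketch of the query being described:

/* the span of all memory known to memblock, no iteration needed */
phys_addr_t memory_start = memblock_start_of_DRAM();
phys_addr_t memory_end = memblock_end_of_DRAM();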
From: Mike Rapoport
The function free_highpages() in both arm and xtensa essentially open-codes
a for_each_free_mem_range() loop to detect high memory pages that were not
reserved and that should be initialized and passed to the buddy allocator.
Replace open-coded implementation of for_each_free_me
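A sketch of a free_highpages() built on the common iterator
(max_low_pfn marking the start of highmem and free_highmem_page() as
the hand-off to the buddy allocator are assumptions from context):

phys_addr_t range_start, range_end;
u64 i;

/* walk free (i.e. not reserved) memory and release the high pages */
for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE,
                        &range_start, &range_end, NULL) {
        unsigned long pfn = max_t(unsigned long,
                                  PFN_UP(range_start), max_low_pfn);
        unsigned long end_pfn = min_t(unsigned long,
                                      PFN_DOWN(range_end), max_pfn);

        for (; pfn < end_pfn; pfn++)
                free_highmem_page(pfn_to_page(pfn));
}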
On 2020-08-18 07:04, Tom Murphy wrote:
Add a flush_iotlb_range to allow flushing of an iova range instead of a
full flush in the dma-iommu path.
Allow the iommu_unmap_fast to return newly freed page table pages and
pass the freelist to queue_iova in the dma-iommu ops path.
This patch is useful
From: Mike Rapoport
The memory size calculation in cma_early_percent_memory() traverses
memblock.memory rather than simply calling memblock_phys_mem_size(). The
comment in that function suggests that at some point there should have been
a call to memblock_analyze() before memblock_phys_mem_size() coul
From: Mike Rapoport
The memory size calculation in kvm_cma_reserve() traverses memblock.memory
rather than simply calling memblock_phys_mem_size(). The comment in that
function suggests that at some point there should have been a call to
memblock_analyze() before memblock_phys_mem_size() could be used
From: Mike Rapoport
Hi,
These patches simplify several uses of memblock iterators and hide some of
the memblock implementation details from the rest of the system.
The patches are on top of v5.9-rc1
v3 changes:
* rebase on v5.9-rc1; as a result this required some non-trivial changes
in pat
On 2020-08-17 08:49, Mauro Carvalho Chehab wrote:
Add a driver for the Kirin 960/970 iommu.
As with the past series, this starts from the original 4.9 driver from
the 96boards tree:
https://github.com/96boards-hikey/linux/tree/hikey970-v4.9
The remaining patches add SPDX headers and make
On 2020-08-18 12:17, Barry Song wrote:
Polling by MSI isn't necessarily faster than polling by SEV. Tests on
hi1620 show hns3 100G NIC network throughput can improve from 25G to
27G if we disable MSI polling while running 16 netperf threads sending
UDP packets in size 32KB. TX throughput can impr
Polling by MSI isn't necessarily faster than polling by SEV. Tests on
hi1620 show hns3 100G NIC network throughput can improve from 25G to
27G if we disable MSI polling while running 16 netperf threads sending
UDP packets in size 32KB. TX throughput can improve from 7G to 7.7G for
single thread.
Th
Reviewed-by: Akhil P Oommen
On 8/18/2020 3:31 AM, Rob Clark wrote:
From: Jordan Crouse
Add support for using per-instance pagetables if all the dependencies are
available.
Signed-off-by: Jordan Crouse
Signed-off-by: Rob Clark
---
drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 63 +++
On Tue, Aug 18, 2020 at 06:37:39PM +0900, Cho KyongHo wrote:
> On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote:
> > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote:
> > > Cache maintenance operations in the most of CPU architectures needs
> > > memory barrier after the cache
On Tue, Aug 18, 2020 at 09:37:20AM +0100, Christoph Hellwig wrote:
> On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote:
> > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote:
> > > Cache maintenance operations in the most of CPU architectures needs
> > > memory barrier after the
On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote:
> On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote:
> > Cache maintenance operations in the most of CPU architectures needs
> > memory barrier after the cache maintenance for the DMAs to view the
> > region of the memory correc
On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote:
> On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote:
> > Cache maintenance operations in the most of CPU architectures needs
> > memory barrier after the cache maintenance for the DMAs to view the
> > region of the memory correc
On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote:
> Cache maintenance operations in the most of CPU architectures needs
> memory barrier after the cache maintenance for the DMAs to view the
> region of the memory correctly. The problem is that memory barrier is
> very expensive and dma_[
On Mon, Aug 17, 2020 at 05:53:09PM -0400, Jim Quinlan wrote:
> The new field 'dma_range_map' in struct device is used to facilitate the
> use of single or multiple offsets between mapping regions of cpu addrs and
> dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only
> capable o
Cache maintenance operations in most CPU architectures need a
memory barrier after the cache maintenance for the DMAs to view the
region of memory correctly. The problem is that the memory barrier is
very expensive, and dma_[un]map_sg() and dma_sync_sg_for_{device|cpu}()
involve the memory ba
__dma_[un]map_area() is the implementation of cache maintenance
operations for DMA in arm64. 'dsb sy' in the subroutine guarantees that
the view of the given memory area is consistent to all memory observers,
so it is required.
However, dma_sync_sg_for_{device|cpu}() and dma_[un]map_sg() call
__dma_[un]ma
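For reference, the cost under discussion comes from loops of this shape
in the scatterlist sync paths, where every element's maintenance op
ends in a barrier (sketch):

struct scatterlist *sg;
int i;

for_each_sg(sgl, sg, nents, i)
        /* each per-entry arch sync ends with a barrier like 'dsb sy' */
        arch_sync_dma_for_device(sg_phys(sg), sg->length, dir);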
On Mon, Aug 17, 2020 at 06:46:58PM -0300, Thiago Jung Bauermann wrote:
> POWER secure guests (i.e., guests which use the Protection Execution
> Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but
> they don't need the SWIOTLB memory to be in low addresses since the
> hypervi
Hi Jacob,
On 8/18/20 6:21 AM, Jacob Pan wrote:
> On Sun, 16 Aug 2020 14:40:57 +0200
> Auger Eric wrote:
>
>> Hi Yi,
>>
>> On 8/14/20 9:15 AM, Liu, Yi L wrote:
>>> Hi Eric,
>>>
From: Auger Eric
Sent: Thursday, August 13, 2020 8:53 PM
Yi,
On 7/28/20 8:27 AM, Liu Yi L w