[PATCH 28/28] nvme-pci: use dma_alloc_pages backed dmapools

2020-08-18 Thread Christoph Hellwig
Switch from coherent DMA pools to those backed by dma_alloc_pages. This helps devices with non-coherent DMA avoid host accesses to uncached memory for every submission of an I/O larger than a single entry. Signed-off-by: Christoph Hellwig --- drivers/nvme/host/pci.c | 80

[PATCH 15/28] dma-direct: remove __dma_to_phys

2020-08-18 Thread Christoph Hellwig
There is no harm in just always clearing the SME encryption bit, while significantly simplifying the interface. Signed-off-by: Christoph Hellwig --- arch/arm/include/asm/dma-direct.h | 2 +- arch/mips/bmips/dma.c | 2 +- arch/mips/cavium-octeon/dma-octeon.c | 2 +- arc
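
The "no harm" argument above rests on bit clearing being idempotent: masking out a bit that is not set leaves the address unchanged. A minimal user-space sketch, assuming a hypothetical mask position (EX_ENC_MASK and ex_clear_enc are illustrative names, not kernel symbols; the real mask is determined at runtime):

```c
#include <stdint.h>

/* Hypothetical encryption-bit position, chosen only for illustration. */
#define EX_ENC_MASK (1ULL << 47)

/* Clear the (assumed) encryption bit; a no-op when the bit is not set. */
static uint64_t ex_clear_enc(uint64_t addr)
{
	return addr & ~EX_ENC_MASK;
}
```

Because the result is the same whether or not the bit was set, a single helper that always clears it can serve both cases.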

[PATCH 25/28] dma-mapping: remove dma_cache_sync

2020-08-18 Thread Christoph Hellwig
All users are gone now, remove the API. Signed-off-by: Christoph Hellwig --- arch/mips/Kconfig | 1 - arch/mips/jazz/jazzdma.c| 1 - arch/mips/mm/dma-noncoherent.c | 6 -- arch/parisc/Kconfig | 1 - arch/parisc/kernel/pci-dma.c| 6 -- include/l

[PATCH 27/28] nvme-pci: fix PRP pool size

2020-08-18 Thread Christoph Hellwig
All operations are based on the controller, not the host page size. Switch the dma pool to use the controller page size as well to avoid massive overallocations on large page size systems. Signed-off-by: Christoph Hellwig --- drivers/nvme/host/pci.c | 3 ++- 1 file changed, 2 insertions(+), 1 de
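
The overallocation described above can be estimated with simple arithmetic. The sketch below is only an illustration: the 4 KiB controller page size and the helper names are assumptions, not values quoted from the patch.

```c
#include <stddef.h>

/* Assumed controller page size (illustrative, not from the patch text). */
#define EX_CTRL_PAGE_SIZE 4096u

/* Bytes wasted per pool element when elements are sized by the host page. */
static size_t ex_prp_pool_waste(size_t host_page_size)
{
	return host_page_size > EX_CTRL_PAGE_SIZE ?
	       host_page_size - EX_CTRL_PAGE_SIZE : 0;
}

/* How many times larger a host-page-sized element is than needed. */
static size_t ex_prp_overalloc_factor(size_t host_page_size)
{
	return host_page_size / EX_CTRL_PAGE_SIZE;
}
```

Under these assumptions, a host with 64 KiB pages would allocate 16 times more memory per element than the controller can use, while a 4 KiB host wastes nothing.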

[PATCH 23/28] lib82596: convert from dma_cache_sync to dma_sync_single_for_device

2020-08-18 Thread Christoph Hellwig
Use the proper modern API to transfer cache ownership for incoherent DMA. Note that this moves the DMA helpers to the main lib82596.c file, so that they can use virt_to_dma. Signed-off-by: Christoph Hellwig --- drivers/net/ethernet/i825xx/lasi_82596.c | 11 +-- drivers/net/ethernet/i825xx/lib82

[PATCH 14/28] dma-direct: use phys_to_dma_direct in dma_direct_alloc

2020-08-18 Thread Christoph Hellwig
Replace the currently open code copy. Signed-off-by: Christoph Hellwig --- kernel/dma/direct.c | 5 + 1 file changed, 1 insertion(+), 4 deletions(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 01120510968fa1..2e280b9c063449 100644 --- a/kernel/dma/direct.c +++ b/kernel/dma/

[PATCH 26/28] dmapool: add dma_alloc_pages support

2020-08-18 Thread Christoph Hellwig
Add a new variant of a dmapool that uses non-coherent memory from dma_alloc_pages. Unlike the existing mempool_create this one initializes a pool allocated by the caller to avoid a pointless extra allocation. At some point it might be worth to also switch the coherent allocation over to a simila

[PATCH 18/28] dma-mapping: move the dma_declare_coherent_memory documentation

2020-08-18 Thread Christoph Hellwig
dma_declare_coherent_memory should not be in a DMA API guide aimed at driver writers (that is consumers of the API). Move it to a comment near the function instead. Signed-off-by: Christoph Hellwig --- Documentation/core-api/dma-api.rst | 24 kernel/dma/coherent.c

[PATCH 17/28] dma-mapping: move dma_common_{mmap, get_sgtable} out of mapping.c

2020-08-18 Thread Christoph Hellwig
Add a new file that contains helpers for misc DMA ops, which is only built when CONFIG_DMA_OPS is set. Signed-off-by: Christoph Hellwig --- kernel/dma/Makefile | 1 + kernel/dma/mapping.c | 47 +--- kernel/dma/ops_helpers.c | 51 +

[PATCH 19/28] dma-mapping: replace DMA_ATTR_NON_CONSISTENT with dma_{alloc, free}_pages

2020-08-18 Thread Christoph Hellwig
Add a new API to allocate and free pages that are guaranteed to be addressable by a device, but otherwise behave like pages allocated by alloc_pages. The intended APIs to sync them for use with the device and cpu are dma_sync_single_for_{device,cpu} that are also used for streaming mappings. Swit

[PATCH 24/28] 53c700: convert from dma_cache_sync to dma_sync_single_for_device

2020-08-18 Thread Christoph Hellwig
Use the proper modern API to transfer cache ownership for incoherent DMA. Signed-off-by: Christoph Hellwig --- drivers/scsi/53c700.c | 11 +-- 1 file changed, 9 insertions(+), 2 deletions(-) diff --git a/drivers/scsi/53c700.c b/drivers/scsi/53c700.c index 521950d0731e4a..57a08c42d00325

[PATCH 22/28] sgiseeq: convert from dma_cache_sync to dma_sync_single_for_device

2020-08-18 Thread Christoph Hellwig
Use the proper modern API to transfer cache ownership for incoherent DMA. Signed-off-by: Christoph Hellwig --- drivers/net/ethernet/seeq/sgiseeq.c | 12 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/drivers/net/ethernet/seeq/sgiseeq.c b/drivers/net/ethernet/seeq/sgi

[PATCH 08/28] MIPS: make dma_sync_*_for_cpu a little less overzealous

2020-08-18 Thread Christoph Hellwig
When transferring DMA ownership back to the CPU there should never be any writeback from the cache, as the buffer was owned by the device until now. Instead it should just be invalidated for the mapping directions where the device could have written data. Note that the changes rely on the fact tha
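
The rule stated above (no writeback when ownership returns to the CPU, invalidate only where the device could have written) can be sketched as a small decision function. The enum mirrors the usual DMA direction names, but the enum itself and the helper are illustrative stand-ins, not the MIPS implementation.

```c
#include <stdbool.h>

/* Illustrative stand-in for the DMA mapping directions. */
enum ex_dma_dir {
	EX_DMA_BIDIRECTIONAL,
	EX_DMA_TO_DEVICE,
	EX_DMA_FROM_DEVICE,
};

/*
 * On sync-for-CPU, invalidate the cache only if the device may have
 * written to the buffer; a TO_DEVICE buffer was only read by the device.
 */
static bool ex_invalidate_for_cpu(enum ex_dma_dir dir)
{
	return dir == EX_DMA_FROM_DEVICE || dir == EX_DMA_BIDIRECTIONAL;
}
```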

[PATCH 13/28] dma-direct: lift gfp_t manipulation out of __dma_direct_alloc_pages

2020-08-18 Thread Christoph Hellwig
Move the detailed gfp_t setup from __dma_direct_alloc_pages into the caller to clean things up a little. Signed-off-by: Christoph Hellwig --- kernel/dma/direct.c | 12 +--- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/kernel/dma/direct.c b/kernel/dma/direct.c index 8da9a

[PATCH 09/28] MIPS/jazzdma: remove the unused vdma_remap function

2020-08-18 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig --- arch/mips/include/asm/jazzdma.h | 2 - arch/mips/jazz/jazzdma.c| 70 - 2 files changed, 72 deletions(-) diff --git a/arch/mips/include/asm/jazzdma.h b/arch/mips/include/asm/jazzdma.h index d13f940022d5f9..c831da7fa8980

[PATCH 21/28] hal2: convert from dma_cache_sync to dma_sync_single_for_device

2020-08-18 Thread Christoph Hellwig
Use the proper modern API to transfer cache ownership for incoherent DMA. This also means we can allocate the buffer memory with the proper direction instead of bidirectional. Signed-off-by: Christoph Hellwig --- sound/mips/hal2.c | 44 1 file changed

[PATCH 07/28] 53c700: improve non-coherent DMA handling

2020-08-18 Thread Christoph Hellwig
Switch the 53c700 driver to only use non-coherent descriptor memory if it really has to because dma_alloc_coherent fails. This doesn't matter for any of the platforms it runs on currently, but that will change soon. To help with this two new helpers to transfer ownership to and from the device ar

[PATCH 20/28] sgiwd93: convert from dma_cache_sync to dma_sync_single_for_device

2020-08-18 Thread Christoph Hellwig
Use the proper modern API to transfer cache ownership for incoherent DMA. This also means we can allocate the memory as DMA_TO_DEVICE instead of bidirectional. Signed-off-by: Christoph Hellwig --- drivers/scsi/sgiwd93.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/

[PATCH 16/28] dma-direct: rename and cleanup __phys_to_dma

2020-08-18 Thread Christoph Hellwig
The __phys_to_dma vs phys_to_dma distinction isn't exactly obvious. Try to improve the situation by renaming __phys_to_dma to phys_to_dma_unencrypted, and not forcing architectures that want to override phys_to_dma to actually provide __phys_to_dma. Signed-off-by: Christoph Hellwig --- arch/arm/

[PATCH 03/28] drm/nouveau/gk20a: stop setting DMA_ATTR_NON_CONSISTENT

2020-08-18 Thread Christoph Hellwig
DMA_ATTR_NON_CONSISTENT is a no-op except on PARISC and some mips configs, so don't set it in this ARM specific driver part. Signed-off-by: Christoph Hellwig --- drivers/gpu/drm/nouveau/nvkm/subdev/instmem/gk20a.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/

[PATCH 11/28] dma-mapping: add (back) arch_dma_mark_clean for ia64

2020-08-18 Thread Christoph Hellwig
Add back a hook to optimize dcache flushing after reading executable code using DMA. This gets ia64 out of the business of pretending to be dma incoherent just for this optimization. Signed-off-by: Christoph Hellwig --- arch/ia64/Kconfig | 3 +-- arch/ia64/kernel/dma-mapping.c |

[PATCH 06/28] lib82596: move DMA allocation into the callers of i82596_probe

2020-08-18 Thread Christoph Hellwig
This allows us to get rid of the LIB82596_DMA_ATTR define and prepare for untangling the coherent vs non-coherent DMA allocation API. Signed-off-by: Christoph Hellwig --- drivers/net/ethernet/i825xx/lasi_82596.c | 24 ++-- drivers/net/ethernet/i825xx/lib82596.c | 36 --

[PATCH 04/28] net/au1000-eth: stop using DMA_ATTR_NON_CONSISTENT

2020-08-18 Thread Christoph Hellwig
The au1000-eth driver contains none of the manual cache synchronization required for using DMA_ATTR_NON_CONSISTENT. From what I can tell it can be used on both dma coherent and non-coherent DMA platforms, but I suspect it has been buggy on the non-coherent platforms all along. Signed-off-by: Chri

[PATCH 10/28] MIPS/jazzdma: decouple from dma-direct

2020-08-18 Thread Christoph Hellwig
The jazzdma ops implement support for a very basic IOMMU. Thus we really should not use the dma-direct code that takes physical address limits into account. This survived through the great MIPS DMA ops cleanup mostly because I was lazy, but now it is time to fully split the implementations. Sign

[PATCH 01/28] mm: turn alloc_pages into an inline function

2020-08-18 Thread Christoph Hellwig
To prevent a compiler error when a method called alloc_pages is added (which I plan to do for the dma_map_ops). Signed-off-by: Christoph Hellwig --- include/linux/gfp.h | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/include/linux/gfp.h b/include/linux/gfp.h index 67a0774e08

[PATCH 02/28] drm/exynos: stop setting DMA_ATTR_NON_CONSISTENT

2020-08-18 Thread Christoph Hellwig
DMA_ATTR_NON_CONSISTENT is a no-op except on PARISC and some mips configs, so don't set it in this ARM specific driver. Signed-off-by: Christoph Hellwig --- drivers/gpu/drm/exynos/exynos_drm_gem.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/drivers/gpu/drm/exynos/exynos_drm_gem.c b/dri

[PATCH 05/28] media/v4l2: remove V4L2-FLAG-MEMORY-NON-CONSISTENT

2020-08-18 Thread Christoph Hellwig
The V4L2-FLAG-MEMORY-NON-CONSISTENT flag is entirely unused, and causes weird gymnastics with the DMA_ATTR_NON_CONSISTENT flag, which is unimplemented except on PARISC and some MIPS configs, and about to be removed. Signed-off-by: Christoph Hellwig --- .../userspace-api/media/v4l/buffer.rst

[PATCH 12/28] dma-direct: remove dma_direct_{alloc,free}_pages

2020-08-18 Thread Christoph Hellwig
Just merge these helpers into the main dma_direct_{alloc,free} routines, as the additional checks are always false for the two callers. Signed-off-by: Christoph Hellwig --- arch/x86/kernel/amd_gart_64.c | 6 +++--- include/linux/dma-direct.h| 4 kernel/dma/direct.c | 39

a saner API for allocating DMA addressable pages

2020-08-18 Thread Christoph Hellwig
Hi all, this series replaces the DMA_ATTR_NON_CONSISTENT flag to dma_alloc_attrs with a separate new dma_alloc_pages API, which is available on all platforms. In addition to cleaning up the convoluted code path, this ensures that other drivers that have asked for better support for non-coherent D

Re: [PATCH v3] dma-mapping: set default segment_boundary_mask to ULONG_MAX

2020-08-18 Thread Christoph Hellwig
Applied to the dma-mapping tree. This should give us the whole merge window to root out any obvious problems with drivers. ___ iommu mailing list iommu@lists.linux-foundation.org https://lists.linuxfoundation.org/mailman/listinfo/iommu

Re: [PATCH v3] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory

2020-08-18 Thread Christoph Hellwig
On Tue, Aug 18, 2020 at 07:11:26PM -0300, Thiago Jung Bauermann wrote: > POWER secure guests (i.e., guests which use the Protection Execution > Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but > they don't need the SWIOTLB memory to be in low addresses since the > hypervi

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
On Tue, Aug 18, 2020 at 05:10:06PM +0100, Christoph Hellwig wrote: > On Tue, Aug 18, 2020 at 11:07:57AM +0100, Will Deacon wrote: > > > > so I'm not sure > > > > that we should be complicating the implementation like this to try to > > > > make it "fast". > > > > > > > I agree that this patch make

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
On Tue, Aug 18, 2020 at 11:07:57AM +0100, Will Deacon wrote: > On Tue, Aug 18, 2020 at 06:37:39PM +0900, Cho KyongHo wrote: > > On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > > > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > > > Cache maintenance operations in the

RE: [PATCH v4] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Robin Murphy [mailto:robin.mur...@arm.com] > Sent: Wednesday, August 19, 2020 2:31 AM > To: Song Bao Hua (Barry Song) ; w...@kernel.org; > j...@8bytes.org > Cc: Zengtao (B) ; > iommu@lists.linux-foundation.org; linux-arm-ker...@lists.infradead.org; > Linuxarm

[PATCH v4 1/3] iommu/arm-smmu-v3: replace symbolic permissions by octal permissions for module parameter

2020-08-18 Thread Barry Song
This fixed the below checkpatch issue: WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. 417: FILE: drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:417: module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); -v4: * cleanup the existing m
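
The equivalence behind the checkpatch suggestion can be checked in user space. The sketch below builds a stand-in for the kernel's S_IRUGO from the POSIX permission bits in <sys/stat.h>; EX_S_IRUGO and ex_s_irugo are illustrative names, not kernel symbols.

```c
#include <sys/stat.h>

/* User-space stand-in: read permission for user, group and other. */
#define EX_S_IRUGO (S_IRUSR | S_IRGRP | S_IROTH)

/* Return the numeric value so it can be compared with the octal form. */
static unsigned int ex_s_irugo(void)
{
	return EX_S_IRUGO;
}
```

The octal literal 0444 states the same permissions directly, which is why checkpatch prefers it over the symbolic name.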

[PATCH v4 3/3] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Barry Song
Polling by MSI isn't necessarily faster than polling by SEV. Tests on hi1620 show hns3 100G NIC network throughput can improve from 25G to 27G if we disable MSI polling while running 16 netperf threads sending UDP packets in size 32KB. TX throughput can improve from 7G to 7.7G for single thread. Th

[PATCH v4 0/3] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Barry Song
patch 1/3 and patch 2/3 are the preparation of patch 3/3 which permits users to disable MSI-based polling by cmd line. -v4: with respect to Robin's comments * cleanup the code of the existing module parameter disable_bypass * add ARM_SMMU_OPT_MSIPOLL flag. on the other hand, we only need to

[PATCH v4 2/3] iommu/arm-smmu-v3: replace module_param_named by module_param for disable_bypass

2020-08-18 Thread Barry Song
Just use module_param() - going out of the way to specify a "different" name that's identical to the variable name is silly. Signed-off-by: Barry Song --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/iommu/arm/arm-smmu-v3

Re: [PATCH v3 10/17] memblock: reduce number of parameters in for_each_mem_range()

2020-08-18 Thread Miguel Ojeda
On Tue, Aug 18, 2020 at 5:19 PM Mike Rapoport wrote: > > .clang-format | 2 ++ For the .clang-format bit: Acked-by: Miguel Ojeda Cheers, Miguel

[PATCH v3] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory

2020-08-18 Thread Thiago Jung Bauermann
POWER secure guests (i.e., guests which use the Protection Execution Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but they don't need the SWIOTLB memory to be in low addresses since the hypervisor doesn't have any addressing limitation. This solves a SWIOTLB initializati

Re: [PATCH 00/16] IOMMU driver for Kirin 960/970

2020-08-18 Thread John Stultz
On Tue, Aug 18, 2020 at 9:26 AM Robin Murphy wrote: > On 2020-08-18 16:29, Mauro Carvalho Chehab wrote: > > Em Tue, 18 Aug 2020 15:47:55 +0100 > > Basically, the DT binding has this, for IOMMU: > > > > > > smmu_lpae { > > compatible = "hisilicon,smmu-lpae"; > > }; > > > >

Re: [PATCH v2] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory

2020-08-18 Thread Thiago Jung Bauermann
Christoph Hellwig writes: > On Mon, Aug 17, 2020 at 06:46:58PM -0300, Thiago Jung Bauermann wrote: >> POWER secure guests (i.e., guests which use the Protection Execution >> Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but >> they don't need the SWIOTLB memory to be i

Re: [PATCH 00/16] IOMMU driver for Kirin 960/970

2020-08-18 Thread Robin Murphy
On 2020-08-18 16:29, Mauro Carvalho Chehab wrote: Hi Robin, Em Tue, 18 Aug 2020 15:47:55 +0100 Robin Murphy escreveu: On 2020-08-17 08:49, Mauro Carvalho Chehab wrote: Add a driver for the Kirin 960/970 iommu. As on the past series, this starts from the original 4.9 driver from the 96boards

Re: [PATCH V2 1/2] Add new flush_iotlb_range and handle freelists when using iommu_unmap_fast

2020-08-18 Thread Tom Murphy
On Tue, 18 Aug 2020 at 16:17, Robin Murphy wrote: > > On 2020-08-18 07:04, Tom Murphy wrote: > > Add a flush_iotlb_range to allow flushing of an iova range instead of a > > full flush in the dma-iommu path. > > > > Allow the iommu_unmap_fast to return newly freed page table pages and > > pass the

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Christoph Hellwig
On Tue, Aug 18, 2020 at 11:07:57AM +0100, Will Deacon wrote: > > > so I'm not sure > > > that we should be complicating the implementation like this to try to > > > make it "fast". > > > > > I agree that this patch makes the implementation of dma API a bit more > > but I don't think this does not

Re: [PATCH 00/16] IOMMU driver for Kirin 960/970

2020-08-18 Thread Mauro Carvalho Chehab
Hi Robin, Em Tue, 18 Aug 2020 15:47:55 +0100 Robin Murphy escreveu: > On 2020-08-17 08:49, Mauro Carvalho Chehab wrote: > > Add a driver for the Kirin 960/970 iommu. > > > > As on the past series, this starts from the original 4.9 driver from > > the 96boards tree: > > > > https://github.c

[PATCH v3 17/17] memblock: use separate iterators for memory and reserved regions

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport for_each_memblock() is used to iterate over memblock.memory in a few places that use data from memblock_region rather than the memory ranges. Introduce separate for_each_mem_region() and for_each_reserved_mem_region() to improve encapsulation of memblock internals from its us

[PATCH v3 16/17] memblock: implement for_each_reserved_mem_region() using __next_mem_region()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Iteration over memblock.reserved with for_each_reserved_mem_region() used __next_reserved_mem_region() that implemented a subset of __next_mem_region(). Use __for_each_mem_range() and, essentially, __next_mem_region() with appropriate parameters to reduce code duplication. W

[PATCH v3 15/17] memblock: remove unused memblock_mem_size()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The only user of memblock_mem_size() was x86 setup code; it is gone now, so the memblock_mem_size() function can be removed. Signed-off-by: Mike Rapoport Reviewed-by: Baoquan He --- include/linux/memblock.h | 1 - mm/memblock.c| 15 --- 2 files changed

[PATCH v3 14/17] x86/setup: simplify reserve_crashkernel()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport * Replace magic numbers with defines * Replace memblock_find_in_range() + memblock_reserve() with memblock_phys_alloc_range() * Stop checking for low memory size in reserve_crashkernel_low(). The allocation from limited range will anyway fail if there is not enough memory

[PATCH v3 13/17] x86/setup: simplify initrd relocation and reservation

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Currently, initrd image is reserved very early during setup and then it might be relocated and re-reserved after the initial physical memory mapping is created. The "late" reservation of memblock verifies that mapped memory size exceeds the size of initrd, then checks whether

[PATCH v3 12/17] arch, drivers: replace for_each_memblock() with for_each_mem_range()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start = __pfn_to_phys(memblock_region_memory_base_pfn(reg)); end = __pfn_to_phys(memblock_region_memory_end_pfn(reg)); /* do someth

[PATCH v3 11/17] arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start_pfn = memblock_region_memory_base_pfn(reg); end_pfn = memblock_region_memory_end_pfn(reg); /* do something with start_pfn an
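
The idea of replacing a per-region loop with an iterator that hands out PFN ranges directly can be modelled in user space. Everything below is a simplified stand-in: struct ex_region, the ex_ helpers, the ex_for_each_pfn_range() macro and the 4 KiB page size are assumptions made for illustration and do not reproduce the memblock implementation.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_PAGE_SHIFT 12 /* assumes 4 KiB pages */

/* Hypothetical stand-in for a memory region: physical base and size. */
struct ex_region {
	uint64_t base;
	uint64_t size;
};

/* First whole page frame inside the region (base rounded up). */
static uint64_t ex_region_base_pfn(const struct ex_region *r)
{
	return (r->base + (1ULL << EX_PAGE_SHIFT) - 1) >> EX_PAGE_SHIFT;
}

/* One past the last whole page frame inside the region (end rounded down). */
static uint64_t ex_region_end_pfn(const struct ex_region *r)
{
	return (r->base + r->size) >> EX_PAGE_SHIFT;
}

/* Range-style iterator: yields [*p_start, *p_end) for each region. */
#define ex_for_each_pfn_range(i, regs, n, p_start, p_end)		\
	for ((i) = 0;							\
	     (i) < (n) &&						\
	     (*(p_start) = ex_region_base_pfn(&(regs)[i]),		\
	      *(p_end) = ex_region_end_pfn(&(regs)[i]), 1);		\
	     (i)++)

/* Count whole pages across all regions using the range iterator. */
static uint64_t ex_count_pages(const struct ex_region *regs, size_t n)
{
	uint64_t start_pfn, end_pfn, total = 0;
	size_t i;

	ex_for_each_pfn_range(i, regs, n, &start_pfn, &end_pfn)
		if (end_pfn > start_pfn)
			total += end_pfn - start_pfn;
	return total;
}
```

With the iterator doing the PFN conversion, callers no longer touch the region structure, which is the encapsulation benefit the series describes.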

[PATCH v3 10/17] memblock: reduce number of parameters in for_each_mem_range()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Currently for_each_mem_range() and for_each_mem_range_rev() iterators are the most generic way to traverse memblock regions. As such, they have 8 parameters and they are hardly convenient to users. Most users choose to utilize one of their wrappers and the only user that actua

[PATCH v3 09/17] memblock: make memblock_debug and related functionality private

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The only user of memblock_dbg() outside memblock was s390 setup code and it is converted to use pr_debug() instead. This allows to stop exposing memblock_debug and memblock_dbg() to the rest of the kernel. Signed-off-by: Mike Rapoport Reviewed-by: Baoquan He --- arch/s390/

[PATCH v3 07/17] microblaze: drop unneeded NUMA and sparsemem initializations

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport microblaze supports neither NUMA nor SPARSEMEM, so there is no point in calling the memblock_set_node() and sparse_memory_present_with_active_regions() functions during microblaze memory initialization. Remove these calls and the surrounding code. Signed-off-by: Mike Rapopor

[PATCH v3 08/17] memblock: make for_each_memblock_type() iterator private

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport for_each_memblock_type() is not used outside mm/memblock.c, move it there from include/linux/memblock.h Signed-off-by: Mike Rapoport Reviewed-by: Baoquan He --- include/linux/memblock.h | 5 - mm/memblock.c| 5 + 2 files changed, 5 insertions(+), 5 dele

[PATCH v3 06/17] riscv: drop unneeded node initialization

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport RISC-V does not (yet) support NUMA and for UMA architectures node 0 is used implicitly during early memory initialization. There is no need to call memblock_set_node(), remove this call and the surrounding code. Signed-off-by: Mike Rapoport --- arch/riscv/mm/init.c | 9 --

[PATCH v3 04/17] arm64: numa: simplify dummy_numa_init()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport dummy_numa_init() loops over memblock.memory and passes nid=0 to numa_add_memblk() which essentially wraps memblock_set_node(). However, memblock_set_node() can cope with entire memory span itself, so the loop over memblock.memory regions is redundant. Using a single call to

[PATCH v3 05/17] h8300, nds32, openrisc: simplify detection of memory extents

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Instead of traversing memblock.memory regions to find memory_start and memory_end, simply query memblock_{start,end}_of_DRAM(). Signed-off-by: Mike Rapoport Acked-by: Stafford Horne --- arch/h8300/kernel/setup.c| 8 +++- arch/nds32/kernel/setup.c| 8 ++-- a

[PATCH v3 03/17] arm, xtensa: simplify initialization of high memory pages

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The function free_highpages() in both arm and xtensa essentially open-code for_each_free_mem_range() loop to detect high memory pages that were not reserved and that should be initialized and passed to the buddy allocator. Replace open-coded implementation of for_each_free_me

Re: [PATCH V2 1/2] Add new flush_iotlb_range and handle freelists when using iommu_unmap_fast

2020-08-18 Thread Robin Murphy
On 2020-08-18 07:04, Tom Murphy wrote: Add a flush_iotlb_range to allow flushing of an iova range instead of a full flush in the dma-iommu path. Allow the iommu_unmap_fast to return newly freed page table pages and pass the freelist to queue_iova in the dma-iommu ops path. This patch is useful

[PATCH v3 02/17] dma-contiguous: simplify cma_early_percent_memory()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The memory size calculation in cma_early_percent_memory() traverses memblock.memory rather than simply call memblock_phys_mem_size(). The comment in that function suggests that at some point there should have been call to memblock_analyze() before memblock_phys_mem_size() coul

[PATCH v3 01/17] KVM: PPC: Book3S HV: simplify kvm_cma_reserve()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The memory size calculation in kvm_cma_reserve() traverses memblock.memory rather than simply call memblock_phys_mem_size(). The comment in that function suggests that at some point there should have been call to memblock_analyze() before memblock_phys_mem_size() could be used

[PATCH v3 00/17] memblock: seasonal cleaning^w cleanup

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Hi, These patches simplify several uses of memblock iterators and hide some of the memblock implementation details from the rest of the system. The patches are on top of v5.9-rc1 v3 changes: * rebase on v5.9-rc1, as the result this required some non-trivial changes in pat

Re: [PATCH 00/16] IOMMU driver for Kirin 960/970

2020-08-18 Thread Robin Murphy
On 2020-08-17 08:49, Mauro Carvalho Chehab wrote: Add a driver for the Kirin 960/970 iommu. As on the past series, this starts from the original 4.9 driver from the 96boards tree: https://github.com/96boards-hikey/linux/tree/hikey970-v4.9 The remaining patches add SPDX headers and make

Re: [PATCH v4] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Robin Murphy
On 2020-08-18 12:17, Barry Song wrote: Polling by MSI isn't necessarily faster than polling by SEV. Tests on hi1620 show hns3 100G NIC network throughput can improve from 25G to 27G if we disable MSI polling while running 16 netperf threads sending UDP packets in size 32KB. TX throughput can impr

[PATCH v4] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Barry Song
Polling by MSI isn't necessarily faster than polling by SEV. Tests on hi1620 show hns3 100G NIC network throughput can improve from 25G to 27G if we disable MSI polling while running 16 netperf threads sending UDP packets in size 32KB. TX throughput can improve from 7G to 7.7G for single thread. Th

Re: [PATCH 16/20] drm/msm/a6xx: Add support for per-instance pagetables

2020-08-18 Thread Akhil P Oommen
Reviewed-by: Akhil P Oommen On 8/18/2020 3:31 AM, Rob Clark wrote: From: Jordan Crouse Add support for using per-instance pagetables if all the dependencies are available. Signed-off-by: Jordan Crouse Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 63 +++

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Will Deacon
On Tue, Aug 18, 2020 at 06:37:39PM +0900, Cho KyongHo wrote: > On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > > Cache maintenance operations in the most of CPU architectures needs > > > memory barrier after the cache

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
On Tue, Aug 18, 2020 at 09:37:20AM +0100, Christoph Hellwig wrote: > On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > > Cache maintenance operations in the most of CPU architectures needs > > > memory barrier after the

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > Cache maintenance operations in the most of CPU architectures needs > > memory barrier after the cache maintenance for the DMAs to view the > > region of the memory correc

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Christoph Hellwig
On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > Cache maintenance operations in the most of CPU architectures needs > > memory barrier after the cache maintenance for the DMAs to view the > > region of the memory correc

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Will Deacon
On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > Cache maintenance operations in the most of CPU architectures needs > memory barrier after the cache maintenance for the DMAs to view the > region of the memory correctly. The problem is that memory barrier is > very expensive and dma_[

Re: [PATCH RESEND v10 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-08-18 Thread Andy Shevchenko
On Mon, Aug 17, 2020 at 05:53:09PM -0400, Jim Quinlan wrote: > The new field 'dma_range_map' in struct device is used to facilitate the > use of single or multiple offsets between mapping regions of cpu addrs and > dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only > capable o

[PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
Cache maintenance operations on most CPU architectures need a memory barrier after the cache maintenance for DMA to view the memory region correctly. The problem is that memory barriers are very expensive and dma_[un]map_sg() and dma_sync_sg_for_{device|cpu}() involve the memory ba

[PATCH 2/2] arm64: dma-mapping: add relaxed DMA sync

2020-08-18 Thread Cho KyongHo
__dma_[un]map_area() is the implementation of cache maintenance operations for DMA in arm64. 'dsb sy' in the subroutine guarantees the view of given memory area is consistent to all memory observers. So, it is required. However, dma_sync_sg_for_{device|cpu}() and dma_[un]map_sg() calls __dma_[un]ma

Re: [PATCH v2] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory

2020-08-18 Thread Christoph Hellwig
On Mon, Aug 17, 2020 at 06:46:58PM -0300, Thiago Jung Bauermann wrote: > POWER secure guests (i.e., guests which use the Protection Execution > Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but > they don't need the SWIOTLB memory to be in low addresses since the > hypervi

Re: [PATCH v6 02/15] iommu: Report domain nesting info

2020-08-18 Thread Auger Eric
Hi Jacob, On 8/18/20 6:21 AM, Jacob Pan wrote: > On Sun, 16 Aug 2020 14:40:57 +0200 > Auger Eric wrote: > >> Hi Yi, >> >> On 8/14/20 9:15 AM, Liu, Yi L wrote: >>> Hi Eric, >>> From: Auger Eric Sent: Thursday, August 13, 2020 8:53 PM Yi, On 7/28/20 8:27 AM, Liu Yi L w