Re: [PATCH v6 02/15] iommu: Report domain nesting info

2020-08-18 Thread Auger Eric
Hi Jacob, On 8/18/20 6:21 AM, Jacob Pan wrote: > On Sun, 16 Aug 2020 14:40:57 +0200 > Auger Eric wrote: > >> Hi Yi, >> >> On 8/14/20 9:15 AM, Liu, Yi L wrote: >>> Hi Eric, >>> From: Auger Eric Sent: Thursday, August 13, 2020 8:53 PM Yi, On 7/28/20 8:27 AM, Liu Yi L

Re: [PATCH v2] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory

2020-08-18 Thread Christoph Hellwig
On Mon, Aug 17, 2020 at 06:46:58PM -0300, Thiago Jung Bauermann wrote: > POWER secure guests (i.e., guests which use the Protection Execution > Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but > they don't need the SWIOTLB memory to be in low addresses since the >

[PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
Cache maintenance operations on most CPU architectures need a memory barrier after the cache maintenance for the DMA master to view the memory region correctly. The problem is that a memory barrier is very expensive, and dma_[un]map_sg() and dma_sync_sg_for_{device|cpu}() involve the memory
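
A minimal sketch of the batching the patch argues for, with placeholder names (the _relaxed variant and the explicit barrier helper are assumed here, not confirmed signatures from the series):

    /* maintenance per buffer without a barrier, then one barrier for all */
    static void sync_batch_for_device(struct device *dev, dma_addr_t *addr,
                                      size_t *len, int nents)
    {
            int i;

            for (i = 0; i < nents; i++)     /* no barrier per call */
                    dma_sync_single_for_device_relaxed(dev, addr[i], len[i],
                                                       DMA_TO_DEVICE);
            dma_sync_barrier_for_device(dev);       /* single barrier */
    }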

[PATCH 2/2] arm64: dma-mapping: add relaxed DMA sync

2020-08-18 Thread Cho KyongHo
__dma_[un]map_area() is the implementation of cache maintenance operations for DMA in arm64. The 'dsb sy' in the subroutine guarantees that the view of the given memory area is consistent for all memory observers, so it is required. However, dma_sync_sg_for_{device|cpu}() and dma_[un]map_sg() calls
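
For illustration, the shape of the arm64 maintenance being discussed (a sketch of the mechanism, assuming a 64-byte cache line; not the patch itself):

    /* clean each line to the point of coherency, then publish with dsb */
    for (; addr < end; addr += 64)
            asm volatile("dc cvac, %0" :: "r" (addr) : "memory");
    asm volatile("dsb sy" ::: "memory");    /* the expensive barrier */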

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Christoph Hellwig
On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > Cache maintenance operations in the most of CPU architectures needs > > memory barrier after the cache maintenance for the DMAs to view the > > region of the memory

[PATCH V2 1/2] Add new flush_iotlb_range and handle freelists when using iommu_unmap_fast

2020-08-18 Thread Tom Murphy
Add a flush_iotlb_range callback to allow flushing an iova range instead of performing a full flush in the dma-iommu path. Allow iommu_unmap_fast() to return newly freed page table pages and pass the freelist to queue_iova() in the dma-iommu ops path. This patch is useful for iommu drivers (in this case the
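
A sketch of the new callback's shape, inferred from the description above (the exact prototype is an assumption, not copied from the posted diff):

    /* flush only [iova, iova + size) and free the page-table pages
     * gathered by iommu_unmap_fast() */
    void (*flush_iotlb_range)(struct iommu_domain *domain,
                              unsigned long iova, size_t size,
                              struct page *freelist);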

[PATCH V2 2/2] Handle init_iova_flush_queue failure in dma-iommu path

2020-08-18 Thread Tom Murphy
init_iova_flush_queue can fail if we run out of memory. Fall back to no flush queue if it fails. Signed-off-by: Tom Murphy --- drivers/iommu/dma-iommu.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index
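
A minimal sketch of the described fallback, assuming the dma-iommu cookie layout of that era (the real diff may differ in detail):

    if (init_iova_flush_queue(iovad, iommu_dma_flush_iotlb_all, NULL)) {
            pr_warn("iova flush queue initialization failed\n");
            cookie->fq_domain = NULL;   /* fall back to strict invalidation */
    } else {
            cookie->fq_domain = domain;
    }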

Re: [PATCH RESEND v10 07/11] device-mapping: Introduce DMA range map, supplanting dma_pfn_offset

2020-08-18 Thread Andy Shevchenko
On Mon, Aug 17, 2020 at 05:53:09PM -0400, Jim Quinlan wrote: > The new field 'dma_range_map' in struct device is used to facilitate the > use of single or multiple offsets between mapping regions of cpu addrs and > dma addrs. It subsumes the role of "dev->dma_pfn_offset" which was only > capable
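
For reference, the shape of the new map as described in the quoted text (field names are assumptions based on the series description):

    struct bus_dma_region {
            phys_addr_t     cpu_start;      /* CPU physical start */
            dma_addr_t      dma_start;      /* matching DMA address */
            u64             size;           /* length of the region */
            u64             offset;         /* dma_start - cpu_start */
    };
    /* dev->dma_range_map points to a terminated array of these,
     * one entry per mapping region */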

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Will Deacon
On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > Cache maintenance operations in the most of CPU architectures needs > memory barrier after the cache maintenance for the DMAs to view the > region of the memory correctly. The problem is that memory barrier is > very expensive and

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > Cache maintenance operations in the most of CPU architectures needs > > memory barrier after the cache maintenance for the DMAs to view the > > region of the memory

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
On Tue, Aug 18, 2020 at 09:37:20AM +0100, Christoph Hellwig wrote: > On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > > Cache maintenance operations in the most of CPU architectures needs > > > memory barrier after

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Will Deacon
On Tue, Aug 18, 2020 at 06:37:39PM +0900, Cho KyongHo wrote: > On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > > Cache maintenance operations in the most of CPU architectures needs > > > memory barrier after the

Re: [PATCH 16/20] drm/msm/a6xx: Add support for per-instance pagetables

2020-08-18 Thread Akhil P Oommen
Reviewed-by: Akhil P Oommen On 8/18/2020 3:31 AM, Rob Clark wrote: From: Jordan Crouse Add support for using per-instance pagetables if all the dependencies are available. Signed-off-by: Jordan Crouse Signed-off-by: Rob Clark --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 63

[PATCH v4] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Barry Song
Polling by MSI isn't necessarily faster than polling by SEV. Tests on hi1620 show that hns3 100G NIC network throughput can improve from 25G to 27G if we disable MSI polling while running 16 netperf threads sending 32KB UDP packets. TX throughput can improve from 7G to 7.7G for a single thread.

[PATCH v3 17/17] memblock: use separate iterators for memory and reserved regions

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport for_each_memblock() is used to iterate over memblock.memory in a few places that use data from memblock_region rather than the memory ranges. Introduce separate for_each_mem_region() and for_each_reserved_mem_region() to improve encapsulation of memblock internals from its
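
A sketch of the new iterator, modeled on for_each_memblock() (definition assumed; the reserved variant is analogous over memblock.reserved):

    #define for_each_mem_region(region)                                 \
            for (region = memblock.memory.regions;                      \
                 region < (memblock.memory.regions +                    \
                           memblock.memory.cnt);                        \
                 region++)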

Re: [PATCH V2 1/2] Add new flush_iotlb_range and handle freelists when using iommu_unmap_fast

2020-08-18 Thread Tom Murphy
On Tue, 18 Aug 2020 at 16:17, Robin Murphy wrote: > > On 2020-08-18 07:04, Tom Murphy wrote: > > Add a flush_iotlb_range to allow flushing of an iova range instead of a > > full flush in the dma-iommu path. > > > > Allow the iommu_unmap_fast to return newly freed page table pages and > > pass the

[PATCH v3 12/17] arch, drivers: replace for_each_membock() with for_each_mem_range()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start = __pfn_to_phys(memblock_region_memory_base_pfn(reg)); end = __pfn_to_phys(memblock_region_memory_end_pfn(reg)); /* do
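
The pattern and its replacement, reconstructed from the description above (illustrative, not the posted diff):

    /* before */
    for_each_memblock(memory, reg) {
            start = __pfn_to_phys(memblock_region_memory_base_pfn(reg));
            end = __pfn_to_phys(memblock_region_memory_end_pfn(reg));
            /* do something with start and end */
    }

    /* after */
    for_each_mem_range(i, &start, &end) {
            /* do something with start and end */
    }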

[PATCH v3 15/17] memblock: remove unused memblock_mem_size()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The only user of memblock_mem_size() was the x86 setup code; it is gone now and the memblock_mem_size() function can be removed. Signed-off-by: Mike Rapoport Reviewed-by: Baoquan He --- include/linux/memblock.h | 1 - mm/memblock.c| 15 --- 2 files

[PATCH v3 16/17] memblock: implement for_each_reserved_mem_region() using __next_mem_region()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Iteration over memblock.reserved with for_each_reserved_mem_region() used __next_reserved_mem_region(), which implemented a subset of __next_mem_region(). Use __for_each_mem_range() and, essentially, __next_mem_region() with appropriate parameters to reduce code duplication.

[PATCH v3 13/17] x86/setup: simplify initrd relocation and reservation

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Currently, the initrd image is reserved very early during setup and then it might be relocated and re-reserved after the initial physical memory mapping is created. The "late" reservation of memblock verifies that the mapped memory size exceeds the size of the initrd, then checks whether

[PATCH v3 14/17] x86/setup: simplify reserve_crashkernel()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport * Replace magic numbers with defines * Replace memblock_find_in_range() + memblock_reserve() with memblock_phys_alloc_range() * Stop checking for low memory size in reserve_crashkernel_low(). The allocation from a limited range will fail anyway if there is not enough
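
An illustration of the second substitution, reconstructed from the bullet above (defines and exact arguments assumed):

    /* before: find, then reserve */
    crash_base = memblock_find_in_range(CRASH_ALIGN, CRASH_ADDR_LOW_MAX,
                                        crash_size, CRASH_ALIGN);
    if (crash_base)
            memblock_reserve(crash_base, crash_size);

    /* after: one call that both finds and reserves */
    crash_base = memblock_phys_alloc_range(crash_size, CRASH_ALIGN,
                                           CRASH_ALIGN, CRASH_ADDR_LOW_MAX);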

[PATCH v3 11/17] arch, mm: replace for_each_memblock() with for_each_mem_pfn_range()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport There are several occurrences of the following pattern: for_each_memblock(memory, reg) { start_pfn = memblock_region_memory_base_pfn(reg); end_pfn = memblock_region_memory_end_pfn(reg); /* do something with start_pfn
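
The pattern and its replacement, reconstructed from the description above (illustrative):

    /* before */
    for_each_memblock(memory, reg) {
            start_pfn = memblock_region_memory_base_pfn(reg);
            end_pfn = memblock_region_memory_end_pfn(reg);
            /* do something with start_pfn and end_pfn */
    }

    /* after */
    for_each_mem_pfn_range(i, MAX_NUMNODES, &start_pfn, &end_pfn, NULL) {
            /* do something with start_pfn and end_pfn */
    }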

Re: [PATCH 00/16] IOMMU driver for Kirin 960/970

2020-08-18 Thread Mauro Carvalho Chehab
Hi Robin, On Tue, 18 Aug 2020 15:47:55 +0100, Robin Murphy wrote: > On 2020-08-17 08:49, Mauro Carvalho Chehab wrote: > > Add a driver for the Kirin 960/970 iommu. > > > > As on the past series, this starts from the original 4.9 driver from > > the 96boards tree: > > > >

Re: [PATCH 00/16] IOMMU driver for Kirin 960/970

2020-08-18 Thread Robin Murphy
On 2020-08-18 16:29, Mauro Carvalho Chehab wrote: Hi Robin, On Tue, 18 Aug 2020 15:47:55 +0100, Robin Murphy wrote: On 2020-08-17 08:49, Mauro Carvalho Chehab wrote: Add a driver for the Kirin 960/970 iommu. As on the past series, this starts from the original 4.9 driver from the

Re: [PATCH 00/16] IOMMU driver for Kirin 960/970

2020-08-18 Thread Robin Murphy
On 2020-08-17 08:49, Mauro Carvalho Chehab wrote: Add a driver for the Kirin 960/970 iommu. As on the past series, this starts from the original 4.9 driver from the 96boards tree: https://github.com/96boards-hikey/linux/tree/hikey970-v4.9 The remaining patches add SPDX headers and

Re: [PATCH V2 1/2] Add new flush_iotlb_range and handle freelists when using iommu_unmap_fast

2020-08-18 Thread Robin Murphy
On 2020-08-18 07:04, Tom Murphy wrote: Add a flush_iotlb_range to allow flushing of an iova range instead of a full flush in the dma-iommu path. Allow the iommu_unmap_fast to return newly freed page table pages and pass the freelist to queue_iova in the dma-iommu ops path. This patch is useful

[PATCH v3 03/17] arm, xtensa: simplify initialization of high memory pages

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The function free_highpages() in both arm and xtensa essentially open-codes the for_each_free_mem_range() loop to detect high memory pages that were not reserved and that should be initialized and passed to the buddy allocator. Replace the open-coded implementation of
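
A sketch of what the replacement loop looks like, assuming the arm-style free_area_high() helper (details assumed, not copied from the diff):

    for_each_free_mem_range(i, NUMA_NO_NODE, MEMBLOCK_NONE,
                            &range_start, &range_end, NULL)
            free_area_high(PFN_UP(range_start), PFN_DOWN(range_end));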

[PATCH v3 01/17] KVM: PPC: Book3S HV: simplify kvm_cma_reserve()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The memory size calculation in kvm_cma_reserve() traverses memblock.memory rather than simply calling memblock_phys_mem_size(). The comment in that function suggests that at some point there should have been a call to memblock_analyze() before memblock_phys_mem_size() could be

[PATCH v3 02/17] dma-contiguous: simplify cma_early_percent_memory()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The memory size calculation in cma_early_percent_memory() traverses memblock.memory rather than simply calling memblock_phys_mem_size(). The comment in that function suggests that at some point there should have been a call to memblock_analyze() before memblock_phys_mem_size()
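
A sketch of the simplification, reconstructed from the description (illustrative):

    /* before: open-coded traversal of memblock.memory */
    for_each_memblock(memory, reg)
            total_pages += memblock_region_memory_end_pfn(reg) -
                           memblock_region_memory_base_pfn(reg);

    /* after: one helper call */
    return memblock_phys_mem_size() * CONFIG_CMA_SIZE_PERCENTAGE / 100;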

[PATCH v3 00/17] memblock: seasonal cleaning^w cleanup

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Hi, These patches simplify several uses of memblock iterators and hide some of the memblock implementation details from the rest of the system. The patches are on top of v5.9-rc1. v3 changes: * rebase on v5.9-rc1; as a result, this required some non-trivial changes in

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Christoph Hellwig
On Tue, Aug 18, 2020 at 11:07:57AM +0100, Will Deacon wrote: > > > so I'm not sure > > > that we should be complicating the implementation like this to try to > > > make it "fast". > > > > > I agree that this patch makes the implementation of dma API a bit more > > but I don't think this does not

Re: [PATCH v4] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Robin Murphy
On 2020-08-18 12:17, Barry Song wrote: Polling by MSI isn't necessarily faster than polling by SEV. Tests on hi1620 show hns3 100G NIC network throughput can improve from 25G to 27G if we disable MSI polling while running 16 netperf threads sending UDP packets in size 32KB. TX throughput can

[PATCH v3 10/17] memblock: reduce number of parameters in for_each_mem_range()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Currently the for_each_mem_range() and for_each_mem_range_rev() iterators are the most generic way to traverse memblock regions. As such, they have 8 parameters and are hardly convenient to use. Most users choose to utilize one of their wrappers, and the only user that
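
An illustration of the reduction (the "before" form shows the fully general 8-parameter iterator; exact parameters assumed):

    /* before */
    for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE,
                       MEMBLOCK_NONE, &start, &end, NULL)
            /* use start, end */

    /* after: the common case */
    for_each_mem_range(i, &start, &end)
            /* use start, end */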

[PATCH v3 04/17] arm64: numa: simplify dummy_numa_init()

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport dummy_numa_init() loops over memblock.memory and passes nid=0 to numa_add_memblk(), which essentially wraps memblock_set_node(). However, memblock_set_node() can cope with the entire memory span itself, so the loop over memblock.memory regions is redundant. Using a single call to
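
A sketch of the single-call form described (details assumed):

    static int __init dummy_numa_init(void)
    {
            /* one span covering all of memory, all of it on node 0 */
            return numa_add_memblk(0, memblock_start_of_DRAM(),
                                   memblock_end_of_DRAM());
    }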

[PATCH v3 05/17] h8300, nds32, openrisc: simplify detection of memory extents

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport Instead of traversing memblock.memory regions to find memory_start and memory_end, simply query memblock_{start,end}_of_DRAM(). Signed-off-by: Mike Rapoport Acked-by: Stafford Horne --- arch/h8300/kernel/setup.c| 8 +++- arch/nds32/kernel/setup.c| 8 ++--
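
A reconstruction of the change described (illustrative):

    /* before: scan every region for the extremes */
    for_each_memblock(memory, region) {
            memory_start = min(memory_start, region->base);
            memory_end = max(memory_end, region->base + region->size);
    }

    /* after: query memblock directly */
    memory_start = memblock_start_of_DRAM();
    memory_end = memblock_end_of_DRAM();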

[PATCH v3 07/17] microblaze: drop unneeded NUMA and sparsemem initializations

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport microblaze supports neither NUMA nor SPARSEMEM, so there is no point in calling the memblock_set_node() and sparse_memory_present_with_active_regions() functions during microblaze memory initialization. Remove these calls and the surrounding code. Signed-off-by: Mike

[PATCH v3 09/17] memblock: make memblock_debug and related functionality private

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport The only user of memblock_dbg() outside memblock was the s390 setup code, and it is converted to use pr_debug() instead. This allows us to stop exposing memblock_debug and memblock_dbg() to the rest of the kernel. Signed-off-by: Mike Rapoport Reviewed-by: Baoquan He ---

[PATCH v3 08/17] memblock: make for_each_memblock_type() iterator private

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport for_each_memblock_type() is not used outside mm/memblock.c, move it there from include/linux/memblock.h Signed-off-by: Mike Rapoport Reviewed-by: Baoquan He --- include/linux/memblock.h | 5 - mm/memblock.c| 5 + 2 files changed, 5 insertions(+), 5

[PATCH v3 06/17] riscv: drop unneeded node initialization

2020-08-18 Thread Mike Rapoport
From: Mike Rapoport RISC-V does not (yet) support NUMA, and for UMA architectures node 0 is used implicitly during early memory initialization. There is no need to call memblock_set_node(), so remove this call and the surrounding code. Signed-off-by: Mike Rapoport --- arch/riscv/mm/init.c | 9

[PATCH v3] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory

2020-08-18 Thread Thiago Jung Bauermann
POWER secure guests (i.e., guests which use the Protection Execution Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but they don't need the SWIOTLB memory to be in low addresses since the hypervisor doesn't have any addressing limitation. This solves a SWIOTLB

Re: [PATCH 00/16] IOMMU driver for Kirin 960/970

2020-08-18 Thread John Stultz
On Tue, Aug 18, 2020 at 9:26 AM Robin Murphy wrote: > On 2020-08-18 16:29, Mauro Carvalho Chehab wrote: > > On Tue, 18 Aug 2020 15:47:55 +0100 > > Basically, the DT binding has this, for IOMMU: > > > > > > smmu_lpae { > > compatible = "hisilicon,smmu-lpae"; > > }; > >

Re: [PATCH v3 10/17] memblock: reduce number of parameters in for_each_mem_range()

2020-08-18 Thread Miguel Ojeda
On Tue, Aug 18, 2020 at 5:19 PM Mike Rapoport wrote: > > .clang-format | 2 ++ For the .clang-format bit: Acked-by: Miguel Ojeda Cheers, Miguel

Re: [PATCH v2] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory

2020-08-18 Thread Thiago Jung Bauermann
Christoph Hellwig writes: > On Mon, Aug 17, 2020 at 06:46:58PM -0300, Thiago Jung Bauermann wrote: >> POWER secure guests (i.e., guests which use the Protection Execution >> Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but >> they don't need the SWIOTLB memory to be

RE: [PATCH v4] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Song Bao Hua (Barry Song)

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
On Tue, Aug 18, 2020 at 05:10:06PM +0100, Christoph Hellwig wrote: > On Tue, Aug 18, 2020 at 11:07:57AM +0100, Will Deacon wrote: > > > > so I'm not sure > > > > that we should be complicating the implementation like this to try to > > > > make it "fast". > > > > > > > I agree that this patch

[PATCH v4 1/3] iommu/arm-smmu-v3: replace symbolic permissions by octal permissions for module parameter

2020-08-18 Thread Barry Song
This fixes the below checkpatch warning: WARNING: Symbolic permissions 'S_IRUGO' are not preferred. Consider using octal permissions '0444'. 417: FILE: drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c:417: module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO); -v4: * cleanup the existing
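
The one-line change checkpatch is asking for, shown in place:

    /* before */
    module_param_named(disable_bypass, disable_bypass, bool, S_IRUGO);
    /* after */
    module_param_named(disable_bypass, disable_bypass, bool, 0444);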

[PATCH v4 3/3] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Barry Song
Polling by MSI isn't necessarily faster than polling by SEV. Tests on hi1620 show that hns3 100G NIC network throughput can improve from 25G to 27G if we disable MSI polling while running 16 netperf threads sending 32KB UDP packets. TX throughput can improve from 7G to 7.7G for a single thread.

[PATCH v4 0/3] iommu/arm-smmu-v3: permit users to disable msi polling

2020-08-18 Thread Barry Song
Patches 1/3 and 2/3 are preparation for patch 3/3, which permits users to disable MSI-based polling via the cmd line. -v4: with respect to Robin's comments * clean up the code of the existing module parameter disable_bypass * add the ARM_SMMU_OPT_MSIPOLL flag. On the other hand, we only need to

[PATCH v4 2/3] iommu/arm-smmu-v3: replace module_param_named by module_param for disable_bypass

2020-08-18 Thread Barry Song
Just use module_param() - going out of the way to specify a "different" name that's identical to the variable name is silly. Signed-off-by: Barry Song --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git
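
The one-line change described (the "before" line is the octal form left by patch 1/3):

    /* before */
    module_param_named(disable_bypass, disable_bypass, bool, 0444);
    /* after */
    module_param(disable_bypass, bool, 0444);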

Re: [PATCH 1/2] dma-mapping: introduce relaxed version of dma sync

2020-08-18 Thread Cho KyongHo
On Tue, Aug 18, 2020 at 11:07:57AM +0100, Will Deacon wrote: > On Tue, Aug 18, 2020 at 06:37:39PM +0900, Cho KyongHo wrote: > > On Tue, Aug 18, 2020 at 09:28:53AM +0100, Will Deacon wrote: > > > On Tue, Aug 18, 2020 at 04:43:10PM +0900, Cho KyongHo wrote: > > > > Cache maintenance operations in

Re: [PATCH v3] powerpc/pseries/svm: Allocate SWIOTLB buffer anywhere in memory

2020-08-18 Thread Christoph Hellwig
On Tue, Aug 18, 2020 at 07:11:26PM -0300, Thiago Jung Bauermann wrote: > POWER secure guests (i.e., guests which use the Protection Execution > Facility) need to use SWIOTLB to be able to do I/O with the hypervisor, but > they don't need the SWIOTLB memory to be in low addresses since the >

Re: [PATCH v3] dma-mapping: set default segment_boundary_mask to ULONG_MAX

2020-08-18 Thread Christoph Hellwig
Applied to the dma-mapping tree. This should give us the whole merge window to root out any obvious problems with drivers.