[PATCH v3] iommu/arm-smmu-v3: permit users to disable MSI polling

2020-08-01 Thread Barry Song
s can decide to use MSI polling or not based on their tests. Signed-off-by: Barry Song --- -v3: * rebase on top of linux-next as arm-smmu-v3.c has moved; * provide a command line option drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 18 ++ 1 file changed, 14 insertions(

[PATCH v5 2/2] arm64: mm: reserve per-numa CMA to localize coherent dma buffers

2020-07-31 Thread Barry Song
: Ganapatrao Kulkarni Cc: Catalin Marinas Cc: Nicolas Saenz Julienne Cc: Steve Capper Cc: Andrew Morton Cc: Mike Rapoport Signed-off-by: Barry Song --- arch/arm64/mm/init.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index b6881d61b818

[PATCH v5 1/2] dma-contiguous: provide the ability to reserve per-numa CMA

2020-07-31 Thread Barry Song
: Christoph Hellwig Cc: Marek Szyprowski Cc: Will Deacon Cc: Robin Murphy Cc: Ganapatrao Kulkarni Cc: Catalin Marinas Cc: Nicolas Saenz Julienne Cc: Steve Capper Cc: Andrew Morton Cc: Mike Rapoport Signed-off-by: Barry Song --- -v5: refine code according to Christoph Hellwig's com

[PATCH v5 0/2] make dma_alloc_coherent NUMA-aware by per-NUMA CMA

2020-07-31 Thread Barry Song
arameter for per-numa could make life easier. * move dma_pernuma_cma_reserve() after hugetlb_cma_reserve() to reuse the comment before hugetlb_cma_reserve() with respect to Robin's comment -v2: * fix some issues reported by kernel test robot * fallback to default cma while alloca

RE: [PATCH v2] iommu/arm-smmu-v3: disable MSI polling if SEV polling is faster

2020-07-31 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Will Deacon [mailto:w...@kernel.org] > Sent: Saturday, August 1, 2020 12:22 AM > To: Song Bao Hua (Barry Song) > Cc: John Garry ; robin.mur...@arm.com; > j...@8bytes.org; iommu@lists.linux-foundation.org; Zengtao (B) > ; Linux

RE: [PATCH v2] iommu/arm-smmu-v3: disable MSI polling if SEV polling is faster

2020-07-31 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: John Garry > Sent: Friday, July 31, 2020 10:21 PM > To: Song Bao Hua (Barry Song) ; w...@kernel.org; > robin.mur...@arm.com; j...@8bytes.org; iommu@lists.linux-foundation.org > Cc: Zengtao (B) ; Linuxarm > ; linux-arm-ker...@lists.infrad

[PATCH v2] iommu/arm-smmu-v3: disable MSI polling if SEV polling is faster

2020-07-31 Thread Barry Song
strict, TX throughput can improve from 25Gbps to 27Gbps by this patch. This patch adds a generic function to support implementation options based on IIDR and disables MSI polling if IIDR matches the specific implementation tested. Cc: Prime Zeng Signed-off-by: Barry Song --- -v2: rather than

RE: [PATCH v4 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-29 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Wednesday, July 29, 2020 12:23 AM > To: Song Bao Hua (Barry Song) > Cc: Christoph Hellwig ; m.szyprow...@samsung.com; > robin.mur...@arm.com; w...@kernel.org; ganapatrao.kulka...@cavium.c

RE: [PATCH v4 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-28 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Tuesday, July 28, 2020 11:53 PM > To: Song Bao Hua (Barry Song) > Cc: h...@lst.de; m.szyprow...@samsung.com; robin.mur...@arm.com; > w...@kernel.org; ganapatrao.kulka...@cavium.com; > ca

RE: [PATCH v2] dma-contiguous: cleanup dma_alloc_contiguous

2020-07-26 Thread Song Bao Hua (Barry Song)
ts have been gone from the DMA API for a while. > > Signed-off-by: Christoph Hellwig Reviewed-by: Barry Song And I have rebased per-numa CMA patchset on top of this one. https://lore.kernel.org/linux-arm-kernel/20200723131344.41472-1-song.bao@hisilicon.com/ > --- > > Changes sin

[PATCH v4 0/2] make dma_alloc_coherent NUMA-aware by per-NUMA CMA

2020-07-23 Thread Barry Song
ier. * move dma_pernuma_cma_reserve() after hugetlb_cma_reserve() to reuse the comment before hugetlb_cma_reserve() with respect to Robin's comment -v2: * fix some issues reported by kernel test robot * fallback to default cma while allocation fails in per-numa cma free memor

[PATCH v4 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-23 Thread Barry Song
: Christoph Hellwig Cc: Marek Szyprowski Cc: Will Deacon Cc: Robin Murphy Cc: Ganapatrao Kulkarni Cc: Catalin Marinas Cc: Nicolas Saenz Julienne Cc: Steve Capper Cc: Andrew Morton Cc: Mike Rapoport Signed-off-by: Barry Song --- -v4: * rebase on top of Christoph Hellwig's patch: [PATCH v2

[PATCH v4 2/2] arm64: mm: reserve per-numa CMA to localize coherent dma buffers

2020-07-23 Thread Barry Song
: Ganapatrao Kulkarni Cc: Catalin Marinas Cc: Nicolas Saenz Julienne Cc: Steve Capper Cc: Andrew Morton Cc: Mike Rapoport Signed-off-by: Barry Song --- -v4: * rebase on top of linux-next to avoid arch/arm64 conflicts arch/arm64/mm/init.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a

RE: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-23 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Friday, July 24, 2020 12:01 AM > To: Song Bao Hua (Barry Song) > Cc: Christoph Hellwig ; m.szyprow...@samsung.com; > robin.mur...@arm.com; w...@kernel.org; ganapatrao.kulka...@cavium.c

RE: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-22 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Thursday, July 23, 2020 2:30 AM > To: Song Bao Hua (Barry Song) > Cc: h...@lst.de; m.szyprow...@samsung.com; robin.mur...@arm.com; > w...@kernel.org; ganapatrao.kulka...@cavium.com; > ca

RE: [PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-07-22 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Christoph Hellwig [mailto:h...@lst.de] > Sent: Thursday, July 23, 2020 2:17 AM > To: Song Bao Hua (Barry Song) > Cc: h...@lst.de; m.szyprow...@samsung.com; robin.mur...@arm.com; > w...@kernel.org; ganapatrao.kulka...@cavium.com; > ca

RE: [PATCH] iommu/arm-smmu-v3: remove the approach of MSI polling for CMD SYNC

2020-07-20 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Song Bao Hua (Barry Song) > Sent: Friday, July 17, 2020 9:06 PM > To: 'Robin Murphy' ; w...@kernel.org; > j...@8bytes.org > Cc: linux-ker...@vger.kernel.org; Linuxarm ; > linux-arm-ker...@lists.infradead.org; iommu@lists.lin

RE: [PATCH] iommu/arm-smmu-v3: remove the approach of MSI polling for CMD SYNC

2020-07-17 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Robin Murphy [mailto:robin.mur...@arm.com] > Sent: Friday, July 17, 2020 8:55 PM > To: Song Bao Hua (Barry Song) ; w...@kernel.org; > j...@8bytes.org > Cc: linux-ker...@vger.kernel.org; Linuxarm ; > linux-arm-ker...@lists.infradead.o

[PATCH] iommu/arm-smmu-v3: remove the approach of MSI polling for CMD SYNC

2020-07-16 Thread Barry Song
rf on hns3 100G NIC with UDP packet size in 32768bytes and set iommu to strict, TX throughput can improve from 25227.74Mbps to 27145.59Mbps by this patch. In this case, SMMU is super busy as hns3 sends map/unmap requests extremely frequently. Cc: Prime Zeng Signed-off-by: Barry Song --- drivers

RE: [PATCH v3 0/2] make dma_alloc_coherent NUMA-aware by per-NUMA CMA

2020-07-12 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Song Bao Hua (Barry Song) > Sent: Sunday, June 28, 2020 11:13 PM > To: h...@lst.de; m.szyprow...@samsung.com; robin.mur...@arm.com; > w...@kernel.org; ganapatrao.kulka...@cavium.com; > catalin.mari...@arm.com > Cc: iommu@lists.linux-fou

RE: [PATCH net] xsk: remove cheap_dma optimization

2020-07-08 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: netdev-ow...@vger.kernel.org [mailto:netdev-ow...@vger.kernel.org] > On Behalf Of Christoph Hellwig > Sent: Wednesday, July 8, 2020 6:50 PM > To: Robin Murphy > Cc: Björn Töpel ; Christoph Hellwig ; > Daniel Borkmann ; maxi...@mellanox.com; > konrad.w...@ora

RE: [PATCH] iommu/arm-smmu-v3: allocate the memory of queues in local numa node

2020-07-05 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Will Deacon [mailto:w...@kernel.org] > Sent: Saturday, July 4, 2020 4:22 AM > To: Song Bao Hua (Barry Song) > Cc: h...@lst.de; m.szyprow...@samsung.com; robin.mur...@arm.com; > linux-arm-ker...@lists.infradead.org; iommu@lists.li

RE: [PATCH] iommu/arm-smmu-v3: expose numa_node attribute to users in sysfs

2020-07-05 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Will Deacon [mailto:w...@kernel.org] > Sent: Saturday, July 4, 2020 4:22 AM > To: Song Bao Hua (Barry Song) > Cc: robin.mur...@arm.com; h...@lst.de; m.szyprow...@samsung.com; > iommu@lists.linux-foundation.org; linux-arm-ker...@l

[PATCH v3 0/2] make dma_alloc_coherent NUMA-aware by per-NUMA CMA

2020-06-28 Thread Barry Song
ve() to reuse the comment before hugetlb_cma_reserve() with respect to Robin's comment -v2: * fix some issues reported by kernel test robot * fallback to default cma while allocation fails in per-numa cma free memory properly Barry Song (2): dma-direct: provide the ability to

[PATCH v3 2/2] arm64: mm: reserve per-numa CMA to localize coherent dma buffers

2020-06-28 Thread Barry Song
: Ganapatrao Kulkarni Cc: Catalin Marinas Cc: Nicolas Saenz Julienne Cc: Steve Capper Cc: Andrew Morton Cc: Mike Rapoport Signed-off-by: Barry Song --- -v3: * move dma_pernuma_cma_reserve() after hugetlb_cma_reserve() to reuse the comment before hugetlb_cma_reserve() with respect to Robin

[PATCH v3 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-06-28 Thread Barry Song
Hellwig Cc: Marek Szyprowski Cc: Will Deacon Cc: Robin Murphy Cc: Ganapatrao Kulkarni Cc: Catalin Marinas Cc: Nicolas Saenz Julienne Cc: Steve Capper Cc: Andrew Morton Cc: Mike Rapoport Signed-off-by: Barry Song --- -v3: * move to use page_to_nid() while freeing cma with respect to

RE: [PATCH v2 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-06-26 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: linux-kernel-ow...@vger.kernel.org > [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Robin Murphy > Sent: Thursday, June 25, 2020 11:11 PM > To: Song Bao Hua (Barry Song) ; h...@lst.de; > m.szyprow...@samsung.com; w...@kernel.org;

RE: [PATCH v2 2/2] arm64: mm: reserve per-numa CMA after numa_init

2020-06-25 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Robin Murphy [mailto:robin.mur...@arm.com] > Sent: Thursday, June 25, 2020 11:16 PM > To: Song Bao Hua (Barry Song) ; h...@lst.de; > m.szyprow...@samsung.com; w...@kernel.org; > ganapatrao.kulka...@cavium.com; catalin.mari...@arm.com >

[PATCH v2 0/2] make dma_alloc_coherent NUMA-aware by per-NUMA CMA

2020-06-25 Thread Barry Song
allocation fails in per-numa cma according to Jonathan Cameron's suggestion; free memory properly Barry Song (2): dma-direct: provide the ability to reserve per-numa CMA arm64: mm: reserve per-numa CMA after numa_init arch/arm64/mm/init.c | 2 + include/linux/dma-co

[PATCH v2 2/2] arm64: mm: reserve per-numa CMA after numa_init

2020-06-25 Thread Barry Song
: Ganapatrao Kulkarni Cc: Catalin Marinas Cc: Nicolas Saenz Julienne Cc: Steve Capper Cc: Andrew Morton Cc: Mike Rapoport Signed-off-by: Barry Song --- arch/arm64/mm/init.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 1e93cfc7c47a

[PATCH v2 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-06-25 Thread Barry Song
Hellwig Cc: Marek Szyprowski Cc: Will Deacon Cc: Robin Murphy Cc: Ganapatrao Kulkarni Cc: Catalin Marinas Cc: Nicolas Saenz Julienne Cc: Steve Capper Cc: Andrew Morton Cc: Mike Rapoport Signed-off-by: Barry Song --- include/linux/dma-contiguous.h | 4 ++ kernel/dma/Kconfig

RE: [PATCH 2/3] arm64: mm: reserve hugetlb CMA after numa_init

2020-06-07 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Matthias Brugger [mailto:matthias@gmail.com] > Sent: Monday, June 8, 2020 8:15 AM > To: Roman Gushchin ; Song Bao Hua (Barry Song) > > Cc: catalin.mari...@arm.com; John Garry ; > linux-ker...@vger.kernel.org; Linuxarm

RE: [kbuild-all] Re: [PATCH 1/3] dma-direct: provide the ability to reserve per-numa CMA

2020-06-06 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Philip Li [mailto:philip...@intel.com] > Sent: Saturday, June 6, 2020 3:47 PM > To: Dan Carpenter > Cc: Song Bao Hua (Barry Song) ; > kbu...@lists.01.org; h...@lst.de; m.szyprow...@samsung.com; > robin.mur...@arm.com; catalin.mari...@ar

RE: [PATCH 1/3] dma-direct: provide the ability to reserve per-numa CMA

2020-06-04 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Dan Carpenter [mailto:dan.carpen...@oracle.com] > Sent: Thursday, June 4, 2020 11:37 PM > To: kbu...@lists.01.org; Song Bao Hua (Barry Song) > ; h...@lst.de; m.szyprow...@samsung.com; > robin.mur...@arm.com; catalin.mari...@arm.com > Cc

[PATCH 3/3] arm64: mm: reserve per-numa CMA after numa_init

2020-06-02 Thread Barry Song
tables. that means dma_unmap latency will be shrunk much. Meanwhile, when iommu.passthrough is on, device drivers which call dma_alloc_coherent() will also get local memory and avoid the travel between numa nodes. Cc: Will Deacon Cc: Robin Murphy Signed-off-by: Barry Song --- arch/arm64/mm/init.c

[PATCH 1/3] dma-direct: provide the ability to reserve per-numa CMA

2020-06-02 Thread Barry Song
-by: Barry Song --- include/linux/dma-contiguous.h | 4 kernel/dma/Kconfig | 10 + kernel/dma/contiguous.c| 41 +- 3 files changed, 54 insertions(+), 1 deletion(-) diff --git a/include/linux/dma-contiguous.h b/include/linux/dma

[PATCH 2/3] arm64: mm: reserve hugetlb CMA after numa_init

2020-06-02 Thread Barry Song
hugetlb_cma_reserve() is called at the wrong place. numa_init has not been done yet. so all reserved memory will be located at node0. Cc: Roman Gushchin Signed-off-by: Barry Song --- arch/arm64/mm/init.c | 10 +- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/arch/arm64

[PATCH 0/3] support per-numa CMA for ARM server

2020-06-02 Thread Barry Song
memory from local numa node to save command queues and page tables. that means dma_unmap latency will be shrunk much. Meanwhile, when iommu.passthrough is on, device drivers which call dma_alloc_coherent() will also get local memory and avoid the travel between numa nodes. Barry Song (3): dma-direct

RE: [PATCH] driver core: platform: expose numa_node to users in sysfs

2020-06-02 Thread Song Bao Hua (Barry Song)
> > > > On Tue, Jun 02, 2020 at 05:09:57AM +, Song Bao Hua (Barry Song) > wrote: > > > > > > > > > > Platform devices are NUMA? That's crazy, and feels like a total > > > > > abuse of platform devices and drivers

RE: [PATCH] driver core: platform: expose numa_node to users in sysfs

2020-06-01 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Greg KH [mailto:gre...@linuxfoundation.org] > Sent: Tuesday, June 2, 2020 6:11 PM > To: Song Bao Hua (Barry Song) > Cc: raf...@kernel.org; iommu@lists.linux-foundation.org; > linux-arm-ker...@lists.infradead.org; linux-ker...@vger.ke

[PATCH v2] driver core: platform: expose numa_node to users in sysfs

2020-06-01 Thread Barry Song
Zeng Cc: Robin Murphy Signed-off-by: Barry Song --- -v2: add the numa_node entry in Documentation/ABI/ Documentation/ABI/testing/sysfs-bus-platform | 10 drivers/base/platform.c | 26 +++- 2 files changed, 35 insertions(+), 1 deletion(-) diff --git

RE: [PATCH] driver core: platform: expose numa_node to users in sysfs

2020-06-01 Thread Song Bao Hua (Barry Song)
> > > > Platform devices are NUMA? That's crazy, and feels like a total abuse > > of platform devices and drivers that really should belong on a "real" > > bus. > > I am not sure if it is an abuse of platform device. But smmu is a platform > device, > drivers/iommu/arm-smmu-v3.c is a platform dri

RE: [PATCH] driver core: platform: expose numa_node to users in sysfs

2020-06-01 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Greg KH [mailto:gre...@linuxfoundation.org] > Sent: Tuesday, June 2, 2020 4:24 PM > To: Song Bao Hua (Barry Song) > Cc: raf...@kernel.org; iommu@lists.linux-foundation.org; > linux-arm-ker...@lists.infradead.org; linux-ker...@vger.ke

[PATCH] driver core: platform: expose numa_node to users in sysfs

2020-06-01 Thread Barry Song
Zeng Cc: Robin Murphy Signed-off-by: Barry Song --- drivers/base/platform.c | 26 +- 1 file changed, 25 insertions(+), 1 deletion(-) diff --git a/drivers/base/platform.c b/drivers/base/platform.c index b27d0f6c18c9..7794b9a38d82 100644 --- a/drivers/base/platform.c +++ b

RE: [PATCH] iommu/arm-smmu-v3: expose numa_node attribute to users in sysfs

2020-06-01 Thread Song Bao Hua (Barry Song)
> -Original Message- > From: Robin Murphy [mailto:robin.mur...@arm.com] > Sent: Tuesday, June 2, 2020 1:14 AM > To: Song Bao Hua (Barry Song) ; w...@kernel.org; > h...@lst.de; m.szyprow...@samsung.com; iommu@lists.linux-foundation.org > Cc: Linuxa

[PATCH] iommu/arm-smmu-v3: allocate the memory of queues in local numa node

2020-06-01 Thread Song Bao Hua (Barry Song)

[PATCH] iommu/arm-smmu-v3: allocate the memory of queues in local numa node

2020-06-01 Thread Barry Song
de2, without this patch, it takes 550ns to wait for the completion of CMD_SYNC; with this patch, it takes 250ns to wait for the completion of CMD_SYNC. Signed-off-by: Barry Song --- drivers/iommu/arm-smmu-v3.c | 63 - 1 file changed, 48 insertions(+), 15 del

[PATCH] iommu/arm-smmu-v3: expose numa_node attribute to users in sysfs

2020-05-30 Thread Barry Song
without checking hardware spec at all. Signed-off-by: Barry Song --- drivers/iommu/arm-smmu-v3.c | 40 - 1 file changed, 39 insertions(+), 1 deletion(-) diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c index 82508730feb7..754c4d59498b

RE: arm-smmu-v3 high cpu usage for NVMe

2020-05-24 Thread Song Bao Hua (Barry Song)
> Subject: Re: arm-smmu-v3 high cpu usage for NVMe > > On 20/03/2020 10:41, John Garry wrote: > > + Barry, Alexandru > > >     PerfTop:   85864 irqs/sec  kernel:89.6%  exact:  0.0% lost: > > 0/34434 drop: > > 0/40116 [4000Hz cycles],  (all, 96 CPUs) > > >
