Re: [PATCH v2 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-06-28 Thread kernel test robot
Hi Barry,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v5.8-rc2 next-20200625]
[cannot apply to arm64/for-next/core hch-configfs/for-next]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use  as documented in
https://git-scm.com/docs/git-format-patch]

url:
https://github.com/0day-ci/linux/commits/Barry-Song/make-dma_alloc_coherent-NUMA-aware-by-per-NUMA-CMA/20200625-154656
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 8be3a53e18e0e1a98f288f6c7f5e9da3adbe9c49
config: x86_64-randconfig-s022-20200624 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-13) 9.3.0
reproduce:
# apt-get install sparse
# sparse version: v0.6.2-dirty
# save the attached .config to linux build tree
make W=1 C=1 ARCH=x86_64 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 


sparse warnings: (new ones prefixed by >>)

>> kernel/dma/contiguous.c:283:50: sparse: sparse: invalid access below 'dma_contiguous_pernuma_area' (-8 8)

# 
https://github.com/0day-ci/linux/commit/d6930169a3364418b985c2d19c31ecf1c4c3d4a9
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout d6930169a3364418b985c2d19c31ecf1c4c3d4a9
vim +/dma_contiguous_pernuma_area +283 kernel/dma/contiguous.c

de9e14eebf33a6 drivers/base/dma-contiguous.c Marek Szyprowski  2014-10-13  253  
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  254  /**
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  255   * dma_alloc_contiguous() - allocate contiguous pages
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  256   * @dev:   Pointer to device for which the allocation is performed.
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  257   * @size:  Requested allocation size.
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  258   * @gfp:   Allocation flags.
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  259   *
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  260   * This function allocates contiguous memory buffer for specified device. It
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  261   * tries to use device specific contiguous memory area if available, or it
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  262   * tries to use per-numa cma, if the allocation fails, it will fallback to
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  263   * try default global one.
bd2e75633c8012 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  264   *
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  265   * Note that it bypass one-page size of allocations from the per-numa and
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  266   * global area as the addresses within one page are always contiguous, so
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  267   * there is no need to waste CMA pages for that kind; it also helps reduce
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  268   * fragmentations.
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  269   */
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  270  struct page *dma_alloc_contiguous(struct device *dev, size_t size, gfp_t gfp)
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  271  {
90ae409f9eb3bc kernel/dma/contiguous.c       Christoph Hellwig 2019-08-20  272  	size_t count = size >> PAGE_SHIFT;
b1d2dc009dece4 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  273  	struct page *page = NULL;
bd2e75633c8012 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  274  	struct cma *cma = NULL;
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  275  	int nid = dev ? dev_to_node(dev) : NUMA_NO_NODE;
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  276  	bool alloc_from_pernuma = false;
bd2e75633c8012 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  277  
bd2e75633c8012 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  278  	if (dev && dev->cma_area)
bd2e75633c8012 kernel/dma/contiguous.c       Nicolin Chen      2019-05-23  279  		cma = dev->cma_area;
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  280  	else if ((nid != NUMA_NO_NODE) && dma_contiguous_pernuma_area[nid]
d6930169a33644 kernel/dma/contiguous.c       Barry Song        2020-06-25  281  

RE: [PATCH v2 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-06-26 Thread Song Bao Hua (Barry Song)



> -----Original Message-----
> From: linux-kernel-ow...@vger.kernel.org
> [mailto:linux-kernel-ow...@vger.kernel.org] On Behalf Of Robin Murphy
> Sent: Thursday, June 25, 2020 11:11 PM
> To: Song Bao Hua (Barry Song) ; h...@lst.de;
> m.szyprow...@samsung.com; w...@kernel.org;
> ganapatrao.kulka...@cavium.com; catalin.mari...@arm.com
> Cc: iommu@lists.linux-foundation.org; Linuxarm ;
> linux-arm-ker...@lists.infradead.org; linux-ker...@vger.kernel.org; Jonathan
> Cameron ; Nicolas Saenz Julienne
> ; Steve Capper ; Andrew
> Morton ; Mike Rapoport 
> Subject: Re: [PATCH v2 1/2] dma-direct: provide the ability to reserve
> per-numa CMA
> 
> On 2020-06-25 08:43, Barry Song wrote:
> > This is useful for at least two scenarios:
> > 1. ARM64 smmu will get memory from local numa node, it can save its
> > command queues and page tables locally. Tests show it can decrease
> > dma_unmap latency a lot. For example, without this patch, smmu on
> > node2 will get memory from node0 by calling dma_alloc_coherent(),
> > typically, it has to wait for more than 560ns for the completion of
> > CMD_SYNC in an empty command queue; with this patch, it needs 240ns
> > only.
> > 2. when we set iommu passthrough, drivers will get memory from CMA,
> > local memory means much less latency.
> >
> > Cc: Jonathan Cameron 
> > Cc: Christoph Hellwig 
> > Cc: Marek Szyprowski 
> > Cc: Will Deacon 
> > Cc: Robin Murphy 
> > Cc: Ganapatrao Kulkarni 
> > Cc: Catalin Marinas 
> > Cc: Nicolas Saenz Julienne 
> > Cc: Steve Capper 
> > Cc: Andrew Morton 
> > Cc: Mike Rapoport 
> > Signed-off-by: Barry Song 
> > ---
> >   include/linux/dma-contiguous.h |  4 ++
> >   kernel/dma/Kconfig | 10 
> >   kernel/dma/contiguous.c| 99 ++
> >   3 files changed, 104 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/linux/dma-contiguous.h
> b/include/linux/dma-contiguous.h
> > index 03f8e98e3bcc..278a80a40456 100644
> > --- a/include/linux/dma-contiguous.h
> > +++ b/include/linux/dma-contiguous.h
> > @@ -79,6 +79,8 @@ static inline void dma_contiguous_set_default(struct
> cma *cma)
> >
> >   void dma_contiguous_reserve(phys_addr_t addr_limit);
> >
> > +void dma_pernuma_cma_reserve(void);
> > +
> >   int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t
> base,
> >phys_addr_t limit, struct cma **res_cma,
> >bool fixed);
> > @@ -128,6 +130,8 @@ static inline void dma_contiguous_set_default(struct
> cma *cma) { }
> >
> >   static inline void dma_contiguous_reserve(phys_addr_t limit) { }
> >
> > +static inline void dma_pernuma_cma_reserve(void) { }
> > +
> >   static inline int dma_contiguous_reserve_area(phys_addr_t size,
> phys_addr_t base,
> >phys_addr_t limit, struct cma **res_cma,
> >bool fixed)
> > diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
> > index d006668c0027..aeb976b1d21c 100644
> > --- a/kernel/dma/Kconfig
> > +++ b/kernel/dma/Kconfig
> > @@ -104,6 +104,16 @@ config DMA_CMA
> >   if  DMA_CMA
> >   comment "Default contiguous memory area size:"
> >
> > +config CMA_PERNUMA_SIZE_MBYTES
> > +   int "Size in Mega Bytes for per-numa CMA areas"
> > +   depends on NUMA
> > +   default 16 if ARM64
> > +   default 0
> > +   help
> > + Defines the size (in MiB) of the per-numa memory area for Contiguous
> > + Memory Allocator. Every numa node will get a separate CMA with this
> > + size. If the size of 0 is selected, per-numa CMA is disabled.
> > +
> 
> I think this needs to be cleverer than just a static config option.
> Pretty much everything else CMA-related is runtime-configurable to some
> degree, and doing any per-node setup when booting on a single-node
> system would be wasted effort.

I agree some dynamic configuration should be supported to set the size of the CMA.
It could be a new kernel boot parameter, or it could leverage an existing one.

For a system with NUMA enabled but only one online node, or an actually non-NUMA
system, the current dma_pernuma_cma_reserve() won't do anything:

void __init dma_pernuma_cma_reserve(void)
{
	int nid;

	if (!pernuma_size_bytes || nr_online_nodes <= 1)
		return;
}

> 
> Since this is conceptually very similar to the existing hugetlb_cma
> implementation I'm also wondering about inconsistency with respect to
> specifying per-node vs. total sizes.

Re: [PATCH v2 1/2] dma-direct: provide the ability to reserve per-numa CMA

2020-06-25 Thread Robin Murphy

On 2020-06-25 08:43, Barry Song wrote:

This is useful for at least two scenarios:
1. ARM64 smmu will get memory from local numa node, it can save its
command queues and page tables locally. Tests show it can decrease
dma_unmap latency a lot. For example, without this patch, smmu on
node2 will get memory from node0 by calling dma_alloc_coherent(),
typically, it has to wait for more than 560ns for the completion of
CMD_SYNC in an empty command queue; with this patch, it needs 240ns
only.
2. when we set iommu passthrough, drivers will get memory from CMA,
local memory means much less latency.

Cc: Jonathan Cameron 
Cc: Christoph Hellwig 
Cc: Marek Szyprowski 
Cc: Will Deacon 
Cc: Robin Murphy 
Cc: Ganapatrao Kulkarni 
Cc: Catalin Marinas 
Cc: Nicolas Saenz Julienne 
Cc: Steve Capper 
Cc: Andrew Morton 
Cc: Mike Rapoport 
Signed-off-by: Barry Song 
---
  include/linux/dma-contiguous.h |  4 ++
  kernel/dma/Kconfig | 10 
  kernel/dma/contiguous.c| 99 ++
  3 files changed, 104 insertions(+), 9 deletions(-)

diff --git a/include/linux/dma-contiguous.h b/include/linux/dma-contiguous.h
index 03f8e98e3bcc..278a80a40456 100644
--- a/include/linux/dma-contiguous.h
+++ b/include/linux/dma-contiguous.h
@@ -79,6 +79,8 @@ static inline void dma_contiguous_set_default(struct cma *cma)
  
  void dma_contiguous_reserve(phys_addr_t addr_limit);
  
+void dma_pernuma_cma_reserve(void);

+
  int __init dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
   phys_addr_t limit, struct cma **res_cma,
   bool fixed);
@@ -128,6 +130,8 @@ static inline void dma_contiguous_set_default(struct cma *cma) { }
  
  static inline void dma_contiguous_reserve(phys_addr_t limit) { }
  
+static inline void dma_pernuma_cma_reserve(void) { }

+
  static inline int dma_contiguous_reserve_area(phys_addr_t size, phys_addr_t base,
   phys_addr_t limit, struct cma **res_cma,
   bool fixed)
diff --git a/kernel/dma/Kconfig b/kernel/dma/Kconfig
index d006668c0027..aeb976b1d21c 100644
--- a/kernel/dma/Kconfig
+++ b/kernel/dma/Kconfig
@@ -104,6 +104,16 @@ config DMA_CMA
  if  DMA_CMA
  comment "Default contiguous memory area size:"
  
+config CMA_PERNUMA_SIZE_MBYTES

+   int "Size in Mega Bytes for per-numa CMA areas"
+   depends on NUMA
+   default 16 if ARM64
+   default 0
+   help
+ Defines the size (in MiB) of the per-numa memory area for Contiguous
+ Memory Allocator. Every numa node will get a separate CMA with this
+ size. If the size of 0 is selected, per-numa CMA is disabled.
+


I think this needs to be cleverer than just a static config option. 
Pretty much everything else CMA-related is runtime-configurable to some 
degree, and doing any per-node setup when booting on a single-node 
system would be wasted effort.


Since this is conceptually very similar to the existing hugetlb_cma 
implementation I'm also wondering about inconsistency with respect to 
specifying per-node vs. total sizes.


Another thought, though, is that systems large enough to have multiple 
NUMA nodes tend not to be short on memory, so it might not be 
unreasonable to base this all on whatever size the default area is 
given, and simply have a binary on/off switch to control the per-node 
aspect.



  config CMA_SIZE_MBYTES
int "Size in Mega Bytes"
depends on !CMA_SIZE_SEL_PERCENTAGE
diff --git a/kernel/dma/contiguous.c b/kernel/dma/contiguous.c
index 15bc5026c485..bcbd53aead93 100644
--- a/kernel/dma/contiguous.c
+++ b/kernel/dma/contiguous.c
@@ -30,7 +30,14 @@
  #define CMA_SIZE_MBYTES 0
  #endif
  
+#ifdef CONFIG_CMA_PERNUMA_SIZE_MBYTES

+#define CMA_SIZE_PERNUMA_MBYTES CONFIG_CMA_PERNUMA_SIZE_MBYTES
+#else
+#define CMA_SIZE_PERNUMA_MBYTES 0
+#endif
+
  struct cma *dma_contiguous_default_area;
+static struct cma *dma_contiguous_pernuma_area[MAX_NUMNODES];
  
  /*

   * Default global CMA area size can be defined in kernel's .config.
@@ -44,6 +51,8 @@ struct cma *dma_contiguous_default_area;
   */
  static const phys_addr_t size_bytes __initconst =
(phys_addr_t)CMA_SIZE_MBYTES * SZ_1M;
+static const phys_addr_t pernuma_size_bytes __initconst =
+   (phys_addr_t)CMA_SIZE_PERNUMA_MBYTES * SZ_1M;
  static phys_addr_t  size_cmdline __initdata = -1;
  static phys_addr_t base_cmdline __initdata;
  static phys_addr_t limit_cmdline __initdata;
@@ -96,6 +105,33 @@ static inline __maybe_unused phys_addr_t cma_early_percent_memory(void)
  
  #endif
  
+void __init dma_pernuma_cma_reserve(void)

+{
+   int nid;
+
+   if (!pernuma_size_bytes || nr_online_nodes <= 1)
+   return;
+
+   for_each_node_state(nid, N_ONLINE) {


Do we need/want notifiers to handle currently-offline nodes coming 
online later (I'm not sure off-hand how NUMA interacts with stuff like 
"maxcpus=n")?



+