On 9/3/20 6:58 PM, Song Bao Hua (Barry Song) wrote:
> 
>> -----Original Message-----
>> From: Mike Kravetz [mailto:[email protected]]
>> Sent: Thursday, September 3, 2020 3:02 PM
>> To: [email protected]; [email protected];
>> [email protected]; [email protected]
>> Cc: Roman Gushchin <[email protected]>; Song Bao Hua (Barry Song)
>> <[email protected]>; Joonsoo Kim <[email protected]>; Rik van
>> Riel <[email protected]>; Aslan Bakirov <[email protected]>; Michal Hocko
>> <[email protected]>; Andrew Morton <[email protected]>; Mike
>> Kravetz <[email protected]>
>> Subject: [RFC PATCH] cma: make number of CMA areas dynamic, remove
>> CONFIG_CMA_AREAS
>>
>> The number of distinct CMA areas is limited by the constant
>> CONFIG_CMA_AREAS.  In most environments, this was set to a default
>> value of 7.  Not too long ago, support was added to allocate hugetlb
>> gigantic pages from CMA.  More recent changes to make dma_alloc_coherent
>> NUMA-aware on arm64 added more potential users of CMA areas.  Along
>> with the dma_alloc_coherent changes, the default value of CMA_AREAS
>> was bumped up to 19 if NUMA is enabled.
>>
>> It seems that the number of CMA users is likely to grow.  Instead of
>> using a static array for cma areas, use a simple linked list.  These
>> areas are used before normal memory allocators, so use the memblock
>> allocator.
> 
> Hello Mike, It seems it is a good idea. Thanks for addressing this.
> 
> I was focusing on per-numa cma feature in my patchset and I didn't take care 
> of this
> while I thought we should do something for the number of cma areas.
> 

Thanks for taking a look.

One area where I could use some help is testing/verifying on arm.  See the
changes to arch/arm/mm/dma-mapping.c.  I have tested the generic changes on
my x86 platform, but do not have an arm platform for easy testing.

>>  void __init dma_contiguous_early_fixup(phys_addr_t base, unsigned long
>> size)
>>  {
>> -    dma_mmu_remap[dma_mmu_remap_num].base = base;
>> -    dma_mmu_remap[dma_mmu_remap_num].size = size;
>> -    dma_mmu_remap_num++;
>> +    struct dma_contig_early_reserve *d;
>> +
>> +    d = memblock_alloc(sizeof(struct dma_contig_early_reserve),
> 
> sizeof(*d)?

Yes.  thanks.

>> @@ -172,15 +173,14 @@ int __init cma_init_reserved_mem(phys_addr_t
>> base, phys_addr_t size,
>>      struct cma *cma;
>>      phys_addr_t alignment;
>>
>> -    /* Sanity checks */
>> -    if (cma_area_count == ARRAY_SIZE(cma_areas)) {
>> -            pr_err("Not enough slots for CMA reserved regions!\n");
>> -            return -ENOSPC;
>> -    }
>> +    /* Do not attempt allocations after memblock allocator is torn down */
>> +    if (slab_is_available())
>> +            return -EINVAL;
>>
>>      if (!size || !memblock_is_region_reserved(base, size))
>>              return -EINVAL;
>>
>> +
> 
> Is this empty line relevant?

No, added by mistake.

>> @@ -192,12 +192,17 @@ int __init cma_init_reserved_mem(phys_addr_t
>> base, phys_addr_t size,
>>      if (ALIGN(base, alignment) != base || ALIGN(size, alignment) != size)
>>              return -EINVAL;
>>
>> +    cma = memblock_alloc(sizeof(struct cma), sizeof(long));
> 
> sizeof(*cma)?

Yes, thanks.

> It seems we are going to write cma-> count, order_per_bit, debugfs fields.
> To avoid false sharing of the cacheline of struct cma, it is better to align 
> with
> SMP_CACHE_BYTES.
> 
> On the other hand, it seems we are unlikely to write the cma 

I thought about using SMP_CACHE_BYTES, but the structures are simply defined
as an array today.  This should not be any worse.  I do not believe access
to the structures is performance sensitive.

Thanks,
-- 
Mike Kravetz

Reply via email to