Re: [PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address
On Wed, Dec 05, 2018 at 11:37:44PM +1100, Michael Ellerman wrote: > Mike Rapoport writes: > > On Tue, Dec 04, 2018 at 08:59:41PM +1100, Michael Ellerman wrote: > >> Hi Mike, > >> > >> Thanks for trying to clean these up. > >> > >> I think a few could be improved though ... > >> > >> Mike Rapoport writes: > >> > diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c > >> > index 913bfca..fa884ad 100644 > >> > --- a/arch/powerpc/kernel/paca.c > >> > +++ b/arch/powerpc/kernel/paca.c > >> > @@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long > >> > size, unsigned long align, > >> > nid = early_cpu_to_node(cpu); > >> > } > >> > > >> > -pa = memblock_alloc_base_nid(size, align, limit, nid, > >> > MEMBLOCK_NONE); > >> > -if (!pa) { > >> > -pa = memblock_alloc_base(size, align, limit); > >> > -if (!pa) > >> > -panic("cannot allocate paca data"); > >> > -} > >> > +ptr = memblock_alloc_try_nid_raw(size, align, > >> > MEMBLOCK_LOW_LIMIT, > >> > + limit, nid); > >> > +if (!ptr) > >> > +panic("cannot allocate paca data"); > >> > >> The old code doesn't zero, but two of the three callers of > >> alloc_paca_data() *do* zero the whole allocation, so I'd be happy if we > >> did it in here instead. > > > > I looked at the callers and couldn't tell if zeroing memory in > > init_lppaca() would be ok. > > I'll remove the _raw here. > > Thanks. > > >> That would mean we could use memblock_alloc_try_nid() avoiding the need > >> to panic() manually. > > > > Actual, my plan was to remove panic() from all memblock_alloc* and make all > > callers to check the returned value. > > I believe it's cleaner and also allows more meaningful panic messages. Not > > mentioning the reduction of memblock code. > > Hmm, not sure. > > I see ~200 calls to the panicking functions, that seems like a lot of > work to change all those. Yeah, I know :) > And I think I disagree on the "more meaningful panic message". This is a > perfect example, compare: > > panic("cannot allocate paca data"); > to: > panic("%s: Failed to allocate %llu bytes align=0x%llx nid=%d from=%pa > max_addr=%pa\n", > __func__, (u64)size, (u64)align, nid, &min_addr, &max_addr); > > The former is basically useless, whereas the second might at least give > you a hint as to *why* the allocation failed. We can easily keep the memblock message, just make it pr_err instead of panic. The message at the call site can show where the problem was without the need to dive into the stack dump. > I know it's kind of odd for a function to panic() rather than return an > error, but memblock is kind of special because it's so early in boot. > Most of these allocations have to succeed to get the system up and > running. The downside of having panic() inside some memblock functions is that it makes the API way too bloated. And, at least currently, it's inconsistent. For instance memblock_alloc_try_nid_raw() does not panic, but memblock_alloc_try_nid() does. When it was about 2 functions and a wrapper, it was perfectly fine, but since than memblock has three sets of partially overlapping APIs with endless convenience wrappers. I believe that patching up ~200 calls is worth the reduction of memblock API to saner size. Another thing, the absence of check for return value for memory allocation is not only odd, but it also makes the code obfuscated. > cheers > -- Sincerely yours, Mike.
Re: [PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address
Mike Rapoport writes: > On Tue, Dec 04, 2018 at 08:59:41PM +1100, Michael Ellerman wrote: >> Hi Mike, >> >> Thanks for trying to clean these up. >> >> I think a few could be improved though ... >> >> Mike Rapoport writes: >> > diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c >> > index 913bfca..fa884ad 100644 >> > --- a/arch/powerpc/kernel/paca.c >> > +++ b/arch/powerpc/kernel/paca.c >> > @@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long >> > size, unsigned long align, >> >nid = early_cpu_to_node(cpu); >> >} >> > >> > - pa = memblock_alloc_base_nid(size, align, limit, nid, MEMBLOCK_NONE); >> > - if (!pa) { >> > - pa = memblock_alloc_base(size, align, limit); >> > - if (!pa) >> > - panic("cannot allocate paca data"); >> > - } >> > + ptr = memblock_alloc_try_nid_raw(size, align, MEMBLOCK_LOW_LIMIT, >> > + limit, nid); >> > + if (!ptr) >> > + panic("cannot allocate paca data"); >> >> The old code doesn't zero, but two of the three callers of >> alloc_paca_data() *do* zero the whole allocation, so I'd be happy if we >> did it in here instead. > > I looked at the callers and couldn't tell if zeroing memory in > init_lppaca() would be ok. > I'll remove the _raw here. Thanks. >> That would mean we could use memblock_alloc_try_nid() avoiding the need >> to panic() manually. > > Actual, my plan was to remove panic() from all memblock_alloc* and make all > callers to check the returned value. > I believe it's cleaner and also allows more meaningful panic messages. Not > mentioning the reduction of memblock code. Hmm, not sure. I see ~200 calls to the panicking functions, that seems like a lot of work to change all those. And I think I disagree on the "more meaningful panic message". This is a perfect example, compare: panic("cannot allocate paca data"); to: panic("%s: Failed to allocate %llu bytes align=0x%llx nid=%d from=%pa max_addr=%pa\n", __func__, (u64)size, (u64)align, nid, &min_addr, &max_addr); The former is basically useless, whereas the second might at least give you a hint as to *why* the allocation failed. I know it's kind of odd for a function to panic() rather than return an error, but memblock is kind of special because it's so early in boot. Most of these allocations have to succeed to get the system up and running. cheers
Re: [PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address
On Tue, Dec 04, 2018 at 08:59:41PM +1100, Michael Ellerman wrote: > Hi Mike, > > Thanks for trying to clean these up. > > I think a few could be improved though ... > > Mike Rapoport writes: > > diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c > > index 913bfca..fa884ad 100644 > > --- a/arch/powerpc/kernel/paca.c > > +++ b/arch/powerpc/kernel/paca.c > > @@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long size, > > unsigned long align, > > nid = early_cpu_to_node(cpu); > > } > > > > - pa = memblock_alloc_base_nid(size, align, limit, nid, MEMBLOCK_NONE); > > - if (!pa) { > > - pa = memblock_alloc_base(size, align, limit); > > - if (!pa) > > - panic("cannot allocate paca data"); > > - } > > + ptr = memblock_alloc_try_nid_raw(size, align, MEMBLOCK_LOW_LIMIT, > > +limit, nid); > > + if (!ptr) > > + panic("cannot allocate paca data"); > > The old code doesn't zero, but two of the three callers of > alloc_paca_data() *do* zero the whole allocation, so I'd be happy if we > did it in here instead. I looked at the callers and couldn't tell if zeroing memory in init_lppaca() would be ok. I'll remove the _raw here. > That would mean we could use memblock_alloc_try_nid() avoiding the need > to panic() manually. Actual, my plan was to remove panic() from all memblock_alloc* and make all callers to check the returned value. I believe it's cleaner and also allows more meaningful panic messages. Not mentioning the reduction of memblock code. > > diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c > > index 236c115..d11ee7f 100644 > > --- a/arch/powerpc/kernel/setup_64.c > > +++ b/arch/powerpc/kernel/setup_64.c > > @@ -634,19 +634,17 @@ __init u64 ppc64_bolted_size(void) > > > > static void *__init alloc_stack(unsigned long limit, int cpu) > > { > > - unsigned long pa; > > + void *ptr; > > > > BUILD_BUG_ON(STACK_INT_FRAME_SIZE % 16); > > > > - pa = memblock_alloc_base_nid(THREAD_SIZE, THREAD_SIZE, limit, > > - early_cpu_to_node(cpu), MEMBLOCK_NONE); > > - if (!pa) { > > - pa = memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit); > > - if (!pa) > > - panic("cannot allocate stacks"); > > - } > > + ptr = memblock_alloc_try_nid_raw(THREAD_SIZE, THREAD_SIZE, > > +MEMBLOCK_LOW_LIMIT, limit, > > +early_cpu_to_node(cpu)); > > + if (!ptr) > > + panic("cannot allocate stacks"); > > Similarly here, several of the callers zero the stack, and I'd rather > all of them did. > > So again we could use memblock_alloc_try_nid() here and remove the > memset()s from emergency_stack_init(). Ok > > diff --git a/arch/powerpc/mm/pgtable-radix.c > > b/arch/powerpc/mm/pgtable-radix.c > > index 9311560..415a1eb0 100644 > > --- a/arch/powerpc/mm/pgtable-radix.c > > +++ b/arch/powerpc/mm/pgtable-radix.c > > @@ -51,24 +51,18 @@ static int native_register_process_table(unsigned long > > base, unsigned long pg_sz > > static __ref void *early_alloc_pgtable(unsigned long size, int nid, > > unsigned long region_start, unsigned long region_end) > > { > > - unsigned long pa = 0; > > + phys_addr_t min_addr = MEMBLOCK_LOW_LIMIT; > > + phys_addr_t max_addr = MEMBLOCK_ALLOC_ANYWHERE; > > void *pt; > > > > - if (region_start || region_end) /* has region hint */ > > - pa = memblock_alloc_range(size, size, region_start, region_end, > > - MEMBLOCK_NONE); > > - else if (nid != -1) /* has node hint */ > > - pa = memblock_alloc_base_nid(size, size, > > - MEMBLOCK_ALLOC_ANYWHERE, > > - nid, MEMBLOCK_NONE); > > + if (region_start) > > + min_addr = region_start; > > + if (region_end) > > + max_addr = region_end; > > > > - if (!pa) > > - pa = memblock_alloc_base(size, size, MEMBLOCK_ALLOC_ANYWHERE); > > - > > - BUG_ON(!pa); > > - > > - pt = __va(pa); > > - memset(pt, 0, size); > > + pt = memblock_alloc_try_nid_nopanic(size, size, min_addr, max_addr, > > + nid); > > + BUG_ON(!pt); > > I don't think there's any reason to BUG_ON() here rather than letting > memblock() call panic() for us. So this could also be > memblock_alloc_try_nid(). I'd prefer to panic here. > > diff --git a/arch/powerpc/platforms/pasemi/iommu.c > > b/arch/powerpc/platforms/pasemi/iommu.c > > index f297152..f62930f 100644 > > --- a/arch/powerpc/platforms/pasemi/iommu.c > > +++ b/arch/powerpc/platforms/pasemi/iommu.c > > @@ -208,7 +208,9 @@ static int __init iob_init(struct device_node *dn) > > pr_debug(" -> %s\n", __func__); > > > > /* For 2G space, 8x64 p
Re: [PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address
Hi Mike, Thanks for trying to clean these up. I think a few could be improved though ... Mike Rapoport writes: > diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c > index 913bfca..fa884ad 100644 > --- a/arch/powerpc/kernel/paca.c > +++ b/arch/powerpc/kernel/paca.c > @@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long size, > unsigned long align, > nid = early_cpu_to_node(cpu); > } > > - pa = memblock_alloc_base_nid(size, align, limit, nid, MEMBLOCK_NONE); > - if (!pa) { > - pa = memblock_alloc_base(size, align, limit); > - if (!pa) > - panic("cannot allocate paca data"); > - } > + ptr = memblock_alloc_try_nid_raw(size, align, MEMBLOCK_LOW_LIMIT, > + limit, nid); > + if (!ptr) > + panic("cannot allocate paca data"); The old code doesn't zero, but two of the three callers of alloc_paca_data() *do* zero the whole allocation, so I'd be happy if we did it in here instead. That would mean we could use memblock_alloc_try_nid() avoiding the need to panic() manually. > diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c > index 236c115..d11ee7f 100644 > --- a/arch/powerpc/kernel/setup_64.c > +++ b/arch/powerpc/kernel/setup_64.c > @@ -634,19 +634,17 @@ __init u64 ppc64_bolted_size(void) > > static void *__init alloc_stack(unsigned long limit, int cpu) > { > - unsigned long pa; > + void *ptr; > > BUILD_BUG_ON(STACK_INT_FRAME_SIZE % 16); > > - pa = memblock_alloc_base_nid(THREAD_SIZE, THREAD_SIZE, limit, > - early_cpu_to_node(cpu), MEMBLOCK_NONE); > - if (!pa) { > - pa = memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit); > - if (!pa) > - panic("cannot allocate stacks"); > - } > + ptr = memblock_alloc_try_nid_raw(THREAD_SIZE, THREAD_SIZE, > + MEMBLOCK_LOW_LIMIT, limit, > + early_cpu_to_node(cpu)); > + if (!ptr) > + panic("cannot allocate stacks"); Similarly here, several of the callers zero the stack, and I'd rather all of them did. So again we could use memblock_alloc_try_nid() here and remove the memset()s from emergency_stack_init(). > diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c > index 9311560..415a1eb0 100644 > --- a/arch/powerpc/mm/pgtable-radix.c > +++ b/arch/powerpc/mm/pgtable-radix.c > @@ -51,24 +51,18 @@ static int native_register_process_table(unsigned long > base, unsigned long pg_sz > static __ref void *early_alloc_pgtable(unsigned long size, int nid, > unsigned long region_start, unsigned long region_end) > { > - unsigned long pa = 0; > + phys_addr_t min_addr = MEMBLOCK_LOW_LIMIT; > + phys_addr_t max_addr = MEMBLOCK_ALLOC_ANYWHERE; > void *pt; > > - if (region_start || region_end) /* has region hint */ > - pa = memblock_alloc_range(size, size, region_start, region_end, > - MEMBLOCK_NONE); > - else if (nid != -1) /* has node hint */ > - pa = memblock_alloc_base_nid(size, size, > - MEMBLOCK_ALLOC_ANYWHERE, > - nid, MEMBLOCK_NONE); > + if (region_start) > + min_addr = region_start; > + if (region_end) > + max_addr = region_end; > > - if (!pa) > - pa = memblock_alloc_base(size, size, MEMBLOCK_ALLOC_ANYWHERE); > - > - BUG_ON(!pa); > - > - pt = __va(pa); > - memset(pt, 0, size); > + pt = memblock_alloc_try_nid_nopanic(size, size, min_addr, max_addr, > + nid); > + BUG_ON(!pt); I don't think there's any reason to BUG_ON() here rather than letting memblock() call panic() for us. So this could also be memblock_alloc_try_nid(). > diff --git a/arch/powerpc/platforms/pasemi/iommu.c > b/arch/powerpc/platforms/pasemi/iommu.c > index f297152..f62930f 100644 > --- a/arch/powerpc/platforms/pasemi/iommu.c > +++ b/arch/powerpc/platforms/pasemi/iommu.c > @@ -208,7 +208,9 @@ static int __init iob_init(struct device_node *dn) > pr_debug(" -> %s\n", __func__); > > /* For 2G space, 8x64 pages (2^21 bytes) is max total l2 size */ > - iob_l2_base = (u32 *)__va(memblock_alloc_base(1UL<<21, 1UL<<21, > 0x8000)); > + iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21, > + MEMBLOCK_LOW_LIMIT, 0x8000, > + NUMA_NO_NODE); This isn't equivalent is it? memblock_alloc_base() panics on failure but memblock_alloc_try_nid_raw() doesn't? Same comment for the other locations that do that conversion. cheers
[PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address
There are a several places that allocate memory using memblock APIs that return a physical address, convert the returned address to the virtual address and frequently also memset(0) the allocated range. Update these places to use memblock allocators already returning a virtual address; use memblock functions that clear the allocated memory instead of calling memset(0). Signed-off-by: Mike Rapoport --- arch/powerpc/kernel/paca.c | 14 ++ arch/powerpc/kernel/setup_64.c | 21 ++--- arch/powerpc/mm/hash_utils_64.c| 6 +++--- arch/powerpc/mm/pgtable-book3e.c | 8 ++-- arch/powerpc/mm/pgtable-book3s64.c | 5 + arch/powerpc/mm/pgtable-radix.c| 24 +--- arch/powerpc/platforms/pasemi/iommu.c | 5 +++-- arch/powerpc/platforms/pseries/setup.c | 11 +++ arch/powerpc/sysdev/dart_iommu.c | 5 +++-- 9 files changed, 44 insertions(+), 55 deletions(-) diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c index 913bfca..fa884ad 100644 --- a/arch/powerpc/kernel/paca.c +++ b/arch/powerpc/kernel/paca.c @@ -27,7 +27,7 @@ static void *__init alloc_paca_data(unsigned long size, unsigned long align, unsigned long limit, int cpu) { - unsigned long pa; + void *ptr; int nid; /* @@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long size, unsigned long align, nid = early_cpu_to_node(cpu); } - pa = memblock_alloc_base_nid(size, align, limit, nid, MEMBLOCK_NONE); - if (!pa) { - pa = memblock_alloc_base(size, align, limit); - if (!pa) - panic("cannot allocate paca data"); - } + ptr = memblock_alloc_try_nid_raw(size, align, MEMBLOCK_LOW_LIMIT, +limit, nid); + if (!ptr) + panic("cannot allocate paca data"); if (cpu == boot_cpuid) memblock_set_bottom_up(false); - return __va(pa); + return ptr; } #ifdef CONFIG_PPC_PSERIES diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index 236c115..d11ee7f 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -634,19 +634,17 @@ __init u64 ppc64_bolted_size(void) static void *__init alloc_stack(unsigned long limit, int cpu) { - unsigned long pa; + void *ptr; BUILD_BUG_ON(STACK_INT_FRAME_SIZE % 16); - pa = memblock_alloc_base_nid(THREAD_SIZE, THREAD_SIZE, limit, - early_cpu_to_node(cpu), MEMBLOCK_NONE); - if (!pa) { - pa = memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit); - if (!pa) - panic("cannot allocate stacks"); - } + ptr = memblock_alloc_try_nid_raw(THREAD_SIZE, THREAD_SIZE, +MEMBLOCK_LOW_LIMIT, limit, +early_cpu_to_node(cpu)); + if (!ptr) + panic("cannot allocate stacks"); - return __va(pa); + return ptr; } void __init irqstack_early_init(void) @@ -933,8 +931,9 @@ static void __ref init_fallback_flush(void) * hardware prefetch runoff. We don't have a recipe for load patterns to * reliably avoid the prefetcher. */ - l1d_flush_fallback_area = __va(memblock_alloc_base(l1d_size * 2, l1d_size, limit)); - memset(l1d_flush_fallback_area, 0, l1d_size * 2); + l1d_flush_fallback_area = memblock_alloc_try_nid(l1d_size * 2, + l1d_size, MEMBLOCK_LOW_LIMIT, + limit, NUMA_NO_NODE); for_each_possible_cpu(cpu) { struct paca_struct *paca = paca_ptrs[cpu]; diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c index 0cc7fbc..bc6be44 100644 --- a/arch/powerpc/mm/hash_utils_64.c +++ b/arch/powerpc/mm/hash_utils_64.c @@ -908,9 +908,9 @@ static void __init htab_initialize(void) #ifdef CONFIG_DEBUG_PAGEALLOC if (debug_pagealloc_enabled()) { linear_map_hash_count = memblock_end_of_DRAM() >> PAGE_SHIFT; - linear_map_hash_slots = __va(memblock_alloc_base( - linear_map_hash_count, 1, ppc64_rma_size)); - memset(linear_map_hash_slots, 0, linear_map_hash_count); + linear_map_hash_slots = memblock_alloc_try_nid( + linear_map_hash_count, 1, MEMBLOCK_LOW_LIMIT, + ppc64_rma_size, NUMA_NO_NODE); } #endif /* CONFIG_DEBUG_PAGEALLOC */ diff --git a/arch/powerpc/mm/pgtable-book3e.c b/arch/powerpc/mm/pgtable-book3e.c index e0ccf36..53cbc7d 100644 --- a/arch/powerpc/mm/pgtable-book3e.c +++ b/arch/powerpc/mm/pgtable-book3e.c @@ -57,12 +57,8 @@ void vmemmap_remov