Re: [PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address

2018-12-05 Thread Mike Rapoport
On Wed, Dec 05, 2018 at 11:37:44PM +1100, Michael Ellerman wrote:
> Mike Rapoport  writes:
> > On Tue, Dec 04, 2018 at 08:59:41PM +1100, Michael Ellerman wrote:
> >> Hi Mike,
> >> 
> >> Thanks for trying to clean these up.
> >> 
> >> I think a few could be improved though ...
> >> 
> >> Mike Rapoport  writes:
> >> > diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
> >> > index 913bfca..fa884ad 100644
> >> > --- a/arch/powerpc/kernel/paca.c
> >> > +++ b/arch/powerpc/kernel/paca.c
> >> > @@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long 
> >> > size, unsigned long align,
> >> >  nid = early_cpu_to_node(cpu);
> >> >  }
> >> >  
> >> > -pa = memblock_alloc_base_nid(size, align, limit, nid, 
> >> > MEMBLOCK_NONE);
> >> > -if (!pa) {
> >> > -pa = memblock_alloc_base(size, align, limit);
> >> > -if (!pa)
> >> > -panic("cannot allocate paca data");
> >> > -}
> >> > +ptr = memblock_alloc_try_nid_raw(size, align, 
> >> > MEMBLOCK_LOW_LIMIT,
> >> > + limit, nid);
> >> > +if (!ptr)
> >> > +panic("cannot allocate paca data");
> >>   
> >> The old code doesn't zero, but two of the three callers of
> >> alloc_paca_data() *do* zero the whole allocation, so I'd be happy if we
> >> did it in here instead.
> >
> > I looked at the callers and couldn't tell if zeroing memory in
> > init_lppaca() would be ok.
> > I'll remove the _raw here.
>   
> Thanks.
> 
> >> That would mean we could use memblock_alloc_try_nid() avoiding the need
> >> to panic() manually.
> >
> > Actual, my plan was to remove panic() from all memblock_alloc* and make all
> > callers to check the returned value.
> > I believe it's cleaner and also allows more meaningful panic messages. Not
> > mentioning the reduction of memblock code.
> 
> Hmm, not sure.
> 
> I see ~200 calls to the panicking functions, that seems like a lot of
> work to change all those.

Yeah, I know :)

> And I think I disagree on the "more meaningful panic message". This is a
> perfect example, compare:
> 
>   panic("cannot allocate paca data");
> to:
>   panic("%s: Failed to allocate %llu bytes align=0x%llx nid=%d from=%pa 
> max_addr=%pa\n",
> __func__, (u64)size, (u64)align, nid, _addr, _addr);
> 
> The former is basically useless, whereas the second might at least give
> you a hint as to *why* the allocation failed.

We can easily keep the memblock message, just make it pr_err instead of
panic.
The message at the call site can show where the problem was without the
need to dive into the stack dump.

> I know it's kind of odd for a function to panic() rather than return an
> error, but memblock is kind of special because it's so early in boot.
> Most of these allocations have to succeed to get the system up and
> running.

The downside of having panic() inside some memblock functions is that it
makes the API way too bloated. And, at least currently, it's inconsistent.
For instance memblock_alloc_try_nid_raw() does not panic, but
memblock_alloc_try_nid() does.

When it was about 2 functions and a wrapper, it was perfectly fine, but
since than memblock has three sets of partially overlapping APIs with
endless convenience wrappers.

I believe that patching up ~200 calls is worth the reduction of memblock
API to saner size.

Another thing, the absence of check for return value for memory allocation
is not only odd, but it also makes the code obfuscated.
 
> cheers
> 

-- 
Sincerely yours,
Mike.



Re: [PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address

2018-12-05 Thread Michael Ellerman
Mike Rapoport  writes:
> On Tue, Dec 04, 2018 at 08:59:41PM +1100, Michael Ellerman wrote:
>> Hi Mike,
>> 
>> Thanks for trying to clean these up.
>> 
>> I think a few could be improved though ...
>> 
>> Mike Rapoport  writes:
>> > diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
>> > index 913bfca..fa884ad 100644
>> > --- a/arch/powerpc/kernel/paca.c
>> > +++ b/arch/powerpc/kernel/paca.c
>> > @@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long 
>> > size, unsigned long align,
>> >nid = early_cpu_to_node(cpu);
>> >}
>> >  
>> > -  pa = memblock_alloc_base_nid(size, align, limit, nid, MEMBLOCK_NONE);
>> > -  if (!pa) {
>> > -  pa = memblock_alloc_base(size, align, limit);
>> > -  if (!pa)
>> > -  panic("cannot allocate paca data");
>> > -  }
>> > +  ptr = memblock_alloc_try_nid_raw(size, align, MEMBLOCK_LOW_LIMIT,
>> > +   limit, nid);
>> > +  if (!ptr)
>> > +  panic("cannot allocate paca data");
>>   
>> The old code doesn't zero, but two of the three callers of
>> alloc_paca_data() *do* zero the whole allocation, so I'd be happy if we
>> did it in here instead.
>
> I looked at the callers and couldn't tell if zeroing memory in
> init_lppaca() would be ok.
> I'll remove the _raw here.
  
Thanks.

>> That would mean we could use memblock_alloc_try_nid() avoiding the need
>> to panic() manually.
>
> Actual, my plan was to remove panic() from all memblock_alloc* and make all
> callers to check the returned value.
> I believe it's cleaner and also allows more meaningful panic messages. Not
> mentioning the reduction of memblock code.

Hmm, not sure.

I see ~200 calls to the panicking functions, that seems like a lot of
work to change all those.

And I think I disagree on the "more meaningful panic message". This is a
perfect example, compare:

panic("cannot allocate paca data");
to:
panic("%s: Failed to allocate %llu bytes align=0x%llx nid=%d from=%pa 
max_addr=%pa\n",
  __func__, (u64)size, (u64)align, nid, _addr, _addr);

The former is basically useless, whereas the second might at least give
you a hint as to *why* the allocation failed.

I know it's kind of odd for a function to panic() rather than return an
error, but memblock is kind of special because it's so early in boot.
Most of these allocations have to succeed to get the system up and
running.

cheers


Re: [PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address

2018-12-04 Thread Mike Rapoport
On Tue, Dec 04, 2018 at 08:59:41PM +1100, Michael Ellerman wrote:
> Hi Mike,
> 
> Thanks for trying to clean these up.
> 
> I think a few could be improved though ...
> 
> Mike Rapoport  writes:
> > diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
> > index 913bfca..fa884ad 100644
> > --- a/arch/powerpc/kernel/paca.c
> > +++ b/arch/powerpc/kernel/paca.c
> > @@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long size, 
> > unsigned long align,
> > nid = early_cpu_to_node(cpu);
> > }
> >  
> > -   pa = memblock_alloc_base_nid(size, align, limit, nid, MEMBLOCK_NONE);
> > -   if (!pa) {
> > -   pa = memblock_alloc_base(size, align, limit);
> > -   if (!pa)
> > -   panic("cannot allocate paca data");
> > -   }
> > +   ptr = memblock_alloc_try_nid_raw(size, align, MEMBLOCK_LOW_LIMIT,
> > +limit, nid);
> > +   if (!ptr)
> > +   panic("cannot allocate paca data");
>   
> The old code doesn't zero, but two of the three callers of
> alloc_paca_data() *do* zero the whole allocation, so I'd be happy if we
> did it in here instead.

I looked at the callers and couldn't tell if zeroing memory in
init_lppaca() would be ok.
I'll remove the _raw here.
 
> That would mean we could use memblock_alloc_try_nid() avoiding the need
> to panic() manually.

Actual, my plan was to remove panic() from all memblock_alloc* and make all
callers to check the returned value.
I believe it's cleaner and also allows more meaningful panic messages. Not
mentioning the reduction of memblock code.
 
> > diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> > index 236c115..d11ee7f 100644
> > --- a/arch/powerpc/kernel/setup_64.c
> > +++ b/arch/powerpc/kernel/setup_64.c
> > @@ -634,19 +634,17 @@ __init u64 ppc64_bolted_size(void)
> >  
> >  static void *__init alloc_stack(unsigned long limit, int cpu)
> >  {
> > -   unsigned long pa;
> > +   void *ptr;
> >  
> > BUILD_BUG_ON(STACK_INT_FRAME_SIZE % 16);
> >  
> > -   pa = memblock_alloc_base_nid(THREAD_SIZE, THREAD_SIZE, limit,
> > -   early_cpu_to_node(cpu), MEMBLOCK_NONE);
> > -   if (!pa) {
> > -   pa = memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit);
> > -   if (!pa)
> > -   panic("cannot allocate stacks");
> > -   }
> > +   ptr = memblock_alloc_try_nid_raw(THREAD_SIZE, THREAD_SIZE,
> > +MEMBLOCK_LOW_LIMIT, limit,
> > +early_cpu_to_node(cpu));
> > +   if (!ptr)
> > +   panic("cannot allocate stacks");
>  
> Similarly here, several of the callers zero the stack, and I'd rather
> all of them did.
> 
> So again we could use memblock_alloc_try_nid() here and remove the
> memset()s from emergency_stack_init().

Ok
 
> > diff --git a/arch/powerpc/mm/pgtable-radix.c 
> > b/arch/powerpc/mm/pgtable-radix.c
> > index 9311560..415a1eb0 100644
> > --- a/arch/powerpc/mm/pgtable-radix.c
> > +++ b/arch/powerpc/mm/pgtable-radix.c
> > @@ -51,24 +51,18 @@ static int native_register_process_table(unsigned long 
> > base, unsigned long pg_sz
> >  static __ref void *early_alloc_pgtable(unsigned long size, int nid,
> > unsigned long region_start, unsigned long region_end)
> >  {
> > -   unsigned long pa = 0;
> > +   phys_addr_t min_addr = MEMBLOCK_LOW_LIMIT;
> > +   phys_addr_t max_addr = MEMBLOCK_ALLOC_ANYWHERE;
> > void *pt;
> >  
> > -   if (region_start || region_end) /* has region hint */
> > -   pa = memblock_alloc_range(size, size, region_start, region_end,
> > -   MEMBLOCK_NONE);
> > -   else if (nid != -1) /* has node hint */
> > -   pa = memblock_alloc_base_nid(size, size,
> > -   MEMBLOCK_ALLOC_ANYWHERE,
> > -   nid, MEMBLOCK_NONE);
> > +   if (region_start)
> > +   min_addr = region_start;
> > +   if (region_end)
> > +   max_addr = region_end;
> >  
> > -   if (!pa)
> > -   pa = memblock_alloc_base(size, size, MEMBLOCK_ALLOC_ANYWHERE);
> > -
> > -   BUG_ON(!pa);
> > -
> > -   pt = __va(pa);
> > -   memset(pt, 0, size);
> > +   pt = memblock_alloc_try_nid_nopanic(size, size, min_addr, max_addr,
> > +   nid);
> > +   BUG_ON(!pt);
> 
> I don't think there's any reason to BUG_ON() here rather than letting
> memblock() call panic() for us. So this could also be 
> memblock_alloc_try_nid().

I'd prefer to panic here.
 
> > diff --git a/arch/powerpc/platforms/pasemi/iommu.c 
> > b/arch/powerpc/platforms/pasemi/iommu.c
> > index f297152..f62930f 100644
> > --- a/arch/powerpc/platforms/pasemi/iommu.c
> > +++ b/arch/powerpc/platforms/pasemi/iommu.c
> > @@ -208,7 +208,9 @@ static int __init iob_init(struct device_node *dn)
> > pr_debug(" -> %s\n", __func__);
> >  
> > /* For 2G space, 8x64 

Re: [PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address

2018-12-04 Thread Michael Ellerman
Hi Mike,

Thanks for trying to clean these up.

I think a few could be improved though ...

Mike Rapoport  writes:
> diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
> index 913bfca..fa884ad 100644
> --- a/arch/powerpc/kernel/paca.c
> +++ b/arch/powerpc/kernel/paca.c
> @@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long size, 
> unsigned long align,
>   nid = early_cpu_to_node(cpu);
>   }
>  
> - pa = memblock_alloc_base_nid(size, align, limit, nid, MEMBLOCK_NONE);
> - if (!pa) {
> - pa = memblock_alloc_base(size, align, limit);
> - if (!pa)
> - panic("cannot allocate paca data");
> - }
> + ptr = memblock_alloc_try_nid_raw(size, align, MEMBLOCK_LOW_LIMIT,
> +  limit, nid);
> + if (!ptr)
> + panic("cannot allocate paca data");
  
The old code doesn't zero, but two of the three callers of
alloc_paca_data() *do* zero the whole allocation, so I'd be happy if we
did it in here instead.

That would mean we could use memblock_alloc_try_nid() avoiding the need
to panic() manually.

> diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
> index 236c115..d11ee7f 100644
> --- a/arch/powerpc/kernel/setup_64.c
> +++ b/arch/powerpc/kernel/setup_64.c
> @@ -634,19 +634,17 @@ __init u64 ppc64_bolted_size(void)
>  
>  static void *__init alloc_stack(unsigned long limit, int cpu)
>  {
> - unsigned long pa;
> + void *ptr;
>  
>   BUILD_BUG_ON(STACK_INT_FRAME_SIZE % 16);
>  
> - pa = memblock_alloc_base_nid(THREAD_SIZE, THREAD_SIZE, limit,
> - early_cpu_to_node(cpu), MEMBLOCK_NONE);
> - if (!pa) {
> - pa = memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit);
> - if (!pa)
> - panic("cannot allocate stacks");
> - }
> + ptr = memblock_alloc_try_nid_raw(THREAD_SIZE, THREAD_SIZE,
> +  MEMBLOCK_LOW_LIMIT, limit,
> +  early_cpu_to_node(cpu));
> + if (!ptr)
> + panic("cannot allocate stacks");
 
Similarly here, several of the callers zero the stack, and I'd rather
all of them did.

So again we could use memblock_alloc_try_nid() here and remove the
memset()s from emergency_stack_init().

> diff --git a/arch/powerpc/mm/pgtable-radix.c b/arch/powerpc/mm/pgtable-radix.c
> index 9311560..415a1eb0 100644
> --- a/arch/powerpc/mm/pgtable-radix.c
> +++ b/arch/powerpc/mm/pgtable-radix.c
> @@ -51,24 +51,18 @@ static int native_register_process_table(unsigned long 
> base, unsigned long pg_sz
>  static __ref void *early_alloc_pgtable(unsigned long size, int nid,
>   unsigned long region_start, unsigned long region_end)
>  {
> - unsigned long pa = 0;
> + phys_addr_t min_addr = MEMBLOCK_LOW_LIMIT;
> + phys_addr_t max_addr = MEMBLOCK_ALLOC_ANYWHERE;
>   void *pt;
>  
> - if (region_start || region_end) /* has region hint */
> - pa = memblock_alloc_range(size, size, region_start, region_end,
> - MEMBLOCK_NONE);
> - else if (nid != -1) /* has node hint */
> - pa = memblock_alloc_base_nid(size, size,
> - MEMBLOCK_ALLOC_ANYWHERE,
> - nid, MEMBLOCK_NONE);
> + if (region_start)
> + min_addr = region_start;
> + if (region_end)
> + max_addr = region_end;
>  
> - if (!pa)
> - pa = memblock_alloc_base(size, size, MEMBLOCK_ALLOC_ANYWHERE);
> -
> - BUG_ON(!pa);
> -
> - pt = __va(pa);
> - memset(pt, 0, size);
> + pt = memblock_alloc_try_nid_nopanic(size, size, min_addr, max_addr,
> + nid);
> + BUG_ON(!pt);

I don't think there's any reason to BUG_ON() here rather than letting
memblock() call panic() for us. So this could also be memblock_alloc_try_nid().

> diff --git a/arch/powerpc/platforms/pasemi/iommu.c 
> b/arch/powerpc/platforms/pasemi/iommu.c
> index f297152..f62930f 100644
> --- a/arch/powerpc/platforms/pasemi/iommu.c
> +++ b/arch/powerpc/platforms/pasemi/iommu.c
> @@ -208,7 +208,9 @@ static int __init iob_init(struct device_node *dn)
>   pr_debug(" -> %s\n", __func__);
>  
>   /* For 2G space, 8x64 pages (2^21 bytes) is max total l2 size */
> - iob_l2_base = (u32 *)__va(memblock_alloc_base(1UL<<21, 1UL<<21, 
> 0x8000));
> + iob_l2_base = memblock_alloc_try_nid_raw(1UL << 21, 1UL << 21,
> + MEMBLOCK_LOW_LIMIT, 0x8000,
> + NUMA_NO_NODE);

This isn't equivalent is it?

memblock_alloc_base() panics on failure but memblock_alloc_try_nid_raw()
doesn't?

Same comment for the other locations that do that conversion.

cheers


[PATCH v2 1/6] powerpc: prefer memblock APIs returning virtual address

2018-12-03 Thread Mike Rapoport
There are a several places that allocate memory using memblock APIs that
return a physical address, convert the returned address to the virtual
address and frequently also memset(0) the allocated range.

Update these places to use memblock allocators already returning a virtual
address; use memblock functions that clear the allocated memory instead of
calling memset(0).

Signed-off-by: Mike Rapoport 
---
 arch/powerpc/kernel/paca.c | 14 ++
 arch/powerpc/kernel/setup_64.c | 21 ++---
 arch/powerpc/mm/hash_utils_64.c|  6 +++---
 arch/powerpc/mm/pgtable-book3e.c   |  8 ++--
 arch/powerpc/mm/pgtable-book3s64.c |  5 +
 arch/powerpc/mm/pgtable-radix.c| 24 +---
 arch/powerpc/platforms/pasemi/iommu.c  |  5 +++--
 arch/powerpc/platforms/pseries/setup.c | 11 +++
 arch/powerpc/sysdev/dart_iommu.c   |  5 +++--
 9 files changed, 44 insertions(+), 55 deletions(-)

diff --git a/arch/powerpc/kernel/paca.c b/arch/powerpc/kernel/paca.c
index 913bfca..fa884ad 100644
--- a/arch/powerpc/kernel/paca.c
+++ b/arch/powerpc/kernel/paca.c
@@ -27,7 +27,7 @@
 static void *__init alloc_paca_data(unsigned long size, unsigned long align,
unsigned long limit, int cpu)
 {
-   unsigned long pa;
+   void *ptr;
int nid;
 
/*
@@ -42,17 +42,15 @@ static void *__init alloc_paca_data(unsigned long size, 
unsigned long align,
nid = early_cpu_to_node(cpu);
}
 
-   pa = memblock_alloc_base_nid(size, align, limit, nid, MEMBLOCK_NONE);
-   if (!pa) {
-   pa = memblock_alloc_base(size, align, limit);
-   if (!pa)
-   panic("cannot allocate paca data");
-   }
+   ptr = memblock_alloc_try_nid_raw(size, align, MEMBLOCK_LOW_LIMIT,
+limit, nid);
+   if (!ptr)
+   panic("cannot allocate paca data");
 
if (cpu == boot_cpuid)
memblock_set_bottom_up(false);
 
-   return __va(pa);
+   return ptr;
 }
 
 #ifdef CONFIG_PPC_PSERIES
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 236c115..d11ee7f 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -634,19 +634,17 @@ __init u64 ppc64_bolted_size(void)
 
 static void *__init alloc_stack(unsigned long limit, int cpu)
 {
-   unsigned long pa;
+   void *ptr;
 
BUILD_BUG_ON(STACK_INT_FRAME_SIZE % 16);
 
-   pa = memblock_alloc_base_nid(THREAD_SIZE, THREAD_SIZE, limit,
-   early_cpu_to_node(cpu), MEMBLOCK_NONE);
-   if (!pa) {
-   pa = memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit);
-   if (!pa)
-   panic("cannot allocate stacks");
-   }
+   ptr = memblock_alloc_try_nid_raw(THREAD_SIZE, THREAD_SIZE,
+MEMBLOCK_LOW_LIMIT, limit,
+early_cpu_to_node(cpu));
+   if (!ptr)
+   panic("cannot allocate stacks");
 
-   return __va(pa);
+   return ptr;
 }
 
 void __init irqstack_early_init(void)
@@ -933,8 +931,9 @@ static void __ref init_fallback_flush(void)
 * hardware prefetch runoff. We don't have a recipe for load patterns to
 * reliably avoid the prefetcher.
 */
-   l1d_flush_fallback_area = __va(memblock_alloc_base(l1d_size * 2, 
l1d_size, limit));
-   memset(l1d_flush_fallback_area, 0, l1d_size * 2);
+   l1d_flush_fallback_area = memblock_alloc_try_nid(l1d_size * 2,
+   l1d_size, MEMBLOCK_LOW_LIMIT,
+   limit, NUMA_NO_NODE);
 
for_each_possible_cpu(cpu) {
struct paca_struct *paca = paca_ptrs[cpu];
diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 0cc7fbc..bc6be44 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -908,9 +908,9 @@ static void __init htab_initialize(void)
 #ifdef CONFIG_DEBUG_PAGEALLOC
if (debug_pagealloc_enabled()) {
linear_map_hash_count = memblock_end_of_DRAM() >> PAGE_SHIFT;
-   linear_map_hash_slots = __va(memblock_alloc_base(
-   linear_map_hash_count, 1, ppc64_rma_size));
-   memset(linear_map_hash_slots, 0, linear_map_hash_count);
+   linear_map_hash_slots = memblock_alloc_try_nid(
+   linear_map_hash_count, 1, MEMBLOCK_LOW_LIMIT,
+   ppc64_rma_size, NUMA_NO_NODE);
}
 #endif /* CONFIG_DEBUG_PAGEALLOC */
 
diff --git a/arch/powerpc/mm/pgtable-book3e.c b/arch/powerpc/mm/pgtable-book3e.c
index e0ccf36..53cbc7d 100644
--- a/arch/powerpc/mm/pgtable-book3e.c
+++ b/arch/powerpc/mm/pgtable-book3e.c
@@ -57,12 +57,8 @@ void