Re: [PATCH v4 11/25] powernv/fadump: register kernel metadata address with opal

2019-08-19 Thread Hari Bathini



On 14/08/19 3:51 PM, Mahesh Jagannath Salgaonkar wrote:
> On 8/14/19 12:36 PM, Hari Bathini wrote:
>>
>>
>> On 13/08/19 4:11 PM, Mahesh J Salgaonkar wrote:
>>> On 2019-07-16 17:03:15 Tue, Hari Bathini wrote:
 OPAL allows registering address with it in the first kernel and
 retrieving it after MPIPL. Setup kernel metadata and register its
 address with OPAL to use it for processing the crash dump.

 Signed-off-by: Hari Bathini 
 ---
[...]
>>
>>> What if kernel crashes before metadata area is initialized ?
>>
>> registered_regions would be '0'. So, it is treated as fadump is not 
>> registered case.
>> Let me
>> initialize metadata explicitly before registering the address with f/w to 
>> avoid any assumption...
> 
> Do you want to do that before memblock reservation ? Should we move this
> to setup_fadump() ?

Better here as failing early would mean we could fall back to KDump..



Re: [PATCH v4 11/25] powernv/fadump: register kernel metadata address with opal

2019-08-14 Thread Mahesh Jagannath Salgaonkar
On 8/14/19 12:36 PM, Hari Bathini wrote:
> 
> 
> On 13/08/19 4:11 PM, Mahesh J Salgaonkar wrote:
>> On 2019-07-16 17:03:15 Tue, Hari Bathini wrote:
>>> OPAL allows registering address with it in the first kernel and
>>> retrieving it after MPIPL. Setup kernel metadata and register its
>>> address with OPAL to use it for processing the crash dump.
>>>
>>> Signed-off-by: Hari Bathini 
>>> ---
>>>  arch/powerpc/kernel/fadump-common.h  |4 +
>>>  arch/powerpc/kernel/fadump.c |   65 ++-
>>>  arch/powerpc/platforms/powernv/opal-fadump.c |   73 
>>> ++
>>>  arch/powerpc/platforms/powernv/opal-fadump.h |   37 +
>>>  arch/powerpc/platforms/pseries/rtas-fadump.c |   32 +--
>>>  5 files changed, 177 insertions(+), 34 deletions(-)
>>>  create mode 100644 arch/powerpc/platforms/powernv/opal-fadump.h
>>>
>> [...]
>>> @@ -346,30 +349,42 @@ int __init fadump_reserve_mem(void)
>>>  * use memblock_find_in_range() here since it doesn't allocate
>>>  * from bottom to top.
>>>  */
>>> -   for (base = fw_dump.boot_memory_size;
>>> -base <= (memory_boundary - size);
>>> -base += size) {
>>> +   while (base <= (memory_boundary - size)) {
>>> if (memblock_is_region_memory(base, size) &&
>>> !memblock_is_region_reserved(base, size))
>>> break;
>>> +
>>> +   base += size;
>>> }
>>> -   if ((base > (memory_boundary - size)) ||
>>> -   memblock_reserve(base, size)) {
>>> +
>>> +   if (base > (memory_boundary - size)) {
>>> +   pr_err("Failed to find memory chunk for reservation\n");
>>> +   goto error_out;
>>> +   }
>>> +   fw_dump.reserve_dump_area_start = base;
>>> +
>>> +   /*
>>> +* Calculate the kernel metadata address and register it with
>>> +* f/w if the platform supports.
>>> +*/
>>> +   if (fw_dump.ops->setup_kernel_metadata(_dump) < 0)
>>> +   goto error_out;
>>
>> I see setup_kernel_metadata() registers the metadata address with opal 
>> without
>> having any minimum data initialized in it. Secondaly, why can't this wait 
>> until> registration ? I think we should defer this until fadump registration.
> 
> If setting up metadata address fails (it should ideally not fail, but..), 
> everything else
> is useless. 

That's less likely.. so is true with opal_mpipl_update() as well.

> So, we might as well try that early and fall back to KDump in case of an 
> error..

ok. Yeah but not uninitialized metadata.

> 
>> What if kernel crashes before metadata area is initialized ?
> 
> registered_regions would be '0'. So, it is treated as fadump is not 
> registered case.
> Let me
> initialize metadata explicitly before registering the address with f/w to 
> avoid any assumption...

Do you want to do that before memblock reservation ? Should we move this
to setup_fadump() ?

Thanks,
-Mahesh.

> 
>>
>>> +
>>> +   if (memblock_reserve(base, size)) {
>>> pr_err("Failed to reserve memory\n");
>>> -   return 0;
>>> +   goto error_out;
>>> }
>> [...]
>>> -
>>>  static struct fadump_ops rtas_fadump_ops = {
>>> -   .init_fadump_mem_struct = rtas_fadump_init_mem_struct,
>>> -   .register_fadump= rtas_fadump_register_fadump,
>>> -   .unregister_fadump  = rtas_fadump_unregister_fadump,
>>> -   .invalidate_fadump  = rtas_fadump_invalidate_fadump,
>>> -   .process_fadump = rtas_fadump_process_fadump,
>>> -   .fadump_region_show = rtas_fadump_region_show,
>>> -   .fadump_trigger = rtas_fadump_trigger,
>>> +   .init_fadump_mem_struct = rtas_fadump_init_mem_struct,
>>> +   .get_kernel_metadata_size   = rtas_fadump_get_kernel_metadata_size,
>>> +   .setup_kernel_metadata  = rtas_fadump_setup_kernel_metadata,
>>> +   .register_fadump= rtas_fadump_register_fadump,
>>> +   .unregister_fadump  = rtas_fadump_unregister_fadump,
>>> +   .invalidate_fadump  = rtas_fadump_invalidate_fadump,
>>> +   .process_fadump = rtas_fadump_process_fadump,
>>> +   .fadump_region_show = rtas_fadump_region_show,
>>> +   .fadump_trigger = rtas_fadump_trigger,
>>
>> Can you make the tab space changes in your previous patch where these
>> were initially introduced ? So that this patch can only show new members
>> that are added.
> 
> done.
> 
> Thanks
> Hari
> 



Re: [PATCH v4 11/25] powernv/fadump: register kernel metadata address with opal

2019-08-14 Thread Hari Bathini



On 13/08/19 4:11 PM, Mahesh J Salgaonkar wrote:
> On 2019-07-16 17:03:15 Tue, Hari Bathini wrote:
>> OPAL allows registering address with it in the first kernel and
>> retrieving it after MPIPL. Setup kernel metadata and register its
>> address with OPAL to use it for processing the crash dump.
>>
>> Signed-off-by: Hari Bathini 
>> ---
>>  arch/powerpc/kernel/fadump-common.h  |4 +
>>  arch/powerpc/kernel/fadump.c |   65 ++-
>>  arch/powerpc/platforms/powernv/opal-fadump.c |   73 
>> ++
>>  arch/powerpc/platforms/powernv/opal-fadump.h |   37 +
>>  arch/powerpc/platforms/pseries/rtas-fadump.c |   32 +--
>>  5 files changed, 177 insertions(+), 34 deletions(-)
>>  create mode 100644 arch/powerpc/platforms/powernv/opal-fadump.h
>>
> [...]
>> @@ -346,30 +349,42 @@ int __init fadump_reserve_mem(void)
>>   * use memblock_find_in_range() here since it doesn't allocate
>>   * from bottom to top.
>>   */
>> -for (base = fw_dump.boot_memory_size;
>> - base <= (memory_boundary - size);
>> - base += size) {
>> +while (base <= (memory_boundary - size)) {
>>  if (memblock_is_region_memory(base, size) &&
>>  !memblock_is_region_reserved(base, size))
>>  break;
>> +
>> +base += size;
>>  }
>> -if ((base > (memory_boundary - size)) ||
>> -memblock_reserve(base, size)) {
>> +
>> +if (base > (memory_boundary - size)) {
>> +pr_err("Failed to find memory chunk for reservation\n");
>> +goto error_out;
>> +}
>> +fw_dump.reserve_dump_area_start = base;
>> +
>> +/*
>> + * Calculate the kernel metadata address and register it with
>> + * f/w if the platform supports.
>> + */
>> +if (fw_dump.ops->setup_kernel_metadata(_dump) < 0)
>> +goto error_out;
> 
> I see setup_kernel_metadata() registers the metadata address with opal without
> having any minimum data initialized in it. Secondaly, why can't this wait 
> until> registration ? I think we should defer this until fadump registration.

If setting up metadata address fails (it should ideally not fail, but..), 
everything else
is useless. So, we might as well try that early and fall back to KDump in case 
of an error..

> What if kernel crashes before metadata area is initialized ?

registered_regions would be '0'. So, it is treated as fadump is not registered 
case. Let me
initialize metadata explicitly before registering the address with f/w to avoid 
any assumption...

> 
>> +
>> +if (memblock_reserve(base, size)) {
>>  pr_err("Failed to reserve memory\n");
>> -return 0;
>> +goto error_out;
>>  }
> [...]
>> -
>>  static struct fadump_ops rtas_fadump_ops = {
>> -.init_fadump_mem_struct = rtas_fadump_init_mem_struct,
>> -.register_fadump= rtas_fadump_register_fadump,
>> -.unregister_fadump  = rtas_fadump_unregister_fadump,
>> -.invalidate_fadump  = rtas_fadump_invalidate_fadump,
>> -.process_fadump = rtas_fadump_process_fadump,
>> -.fadump_region_show = rtas_fadump_region_show,
>> -.fadump_trigger = rtas_fadump_trigger,
>> +.init_fadump_mem_struct = rtas_fadump_init_mem_struct,
>> +.get_kernel_metadata_size   = rtas_fadump_get_kernel_metadata_size,
>> +.setup_kernel_metadata  = rtas_fadump_setup_kernel_metadata,
>> +.register_fadump= rtas_fadump_register_fadump,
>> +.unregister_fadump  = rtas_fadump_unregister_fadump,
>> +.invalidate_fadump  = rtas_fadump_invalidate_fadump,
>> +.process_fadump = rtas_fadump_process_fadump,
>> +.fadump_region_show = rtas_fadump_region_show,
>> +.fadump_trigger = rtas_fadump_trigger,
> 
> Can you make the tab space changes in your previous patch where these
> were initially introduced ? So that this patch can only show new members
> that are added.

done.

Thanks
Hari



Re: [PATCH v4 11/25] powernv/fadump: register kernel metadata address with opal

2019-08-13 Thread Mahesh J Salgaonkar
On 2019-07-16 17:03:15 Tue, Hari Bathini wrote:
> OPAL allows registering address with it in the first kernel and
> retrieving it after MPIPL. Setup kernel metadata and register its
> address with OPAL to use it for processing the crash dump.
> 
> Signed-off-by: Hari Bathini 
> ---
>  arch/powerpc/kernel/fadump-common.h  |4 +
>  arch/powerpc/kernel/fadump.c |   65 ++-
>  arch/powerpc/platforms/powernv/opal-fadump.c |   73 
> ++
>  arch/powerpc/platforms/powernv/opal-fadump.h |   37 +
>  arch/powerpc/platforms/pseries/rtas-fadump.c |   32 +--
>  5 files changed, 177 insertions(+), 34 deletions(-)
>  create mode 100644 arch/powerpc/platforms/powernv/opal-fadump.h
> 
[...]
> @@ -346,30 +349,42 @@ int __init fadump_reserve_mem(void)
>* use memblock_find_in_range() here since it doesn't allocate
>* from bottom to top.
>*/
> - for (base = fw_dump.boot_memory_size;
> -  base <= (memory_boundary - size);
> -  base += size) {
> + while (base <= (memory_boundary - size)) {
>   if (memblock_is_region_memory(base, size) &&
>   !memblock_is_region_reserved(base, size))
>   break;
> +
> + base += size;
>   }
> - if ((base > (memory_boundary - size)) ||
> - memblock_reserve(base, size)) {
> +
> + if (base > (memory_boundary - size)) {
> + pr_err("Failed to find memory chunk for reservation\n");
> + goto error_out;
> + }
> + fw_dump.reserve_dump_area_start = base;
> +
> + /*
> +  * Calculate the kernel metadata address and register it with
> +  * f/w if the platform supports.
> +  */
> + if (fw_dump.ops->setup_kernel_metadata(_dump) < 0)
> + goto error_out;

I see setup_kernel_metadata() registers the metadata address with opal without
having any minimum data initialized in it. Secondaly, why can't this wait until
registration ? I think we should defer this until fadump registration.
What if kernel crashes before metadata area is initialized ?

> +
> + if (memblock_reserve(base, size)) {
>   pr_err("Failed to reserve memory\n");
> - return 0;
> + goto error_out;
>   }
[...]
> -
>  static struct fadump_ops rtas_fadump_ops = {
> - .init_fadump_mem_struct = rtas_fadump_init_mem_struct,
> - .register_fadump= rtas_fadump_register_fadump,
> - .unregister_fadump  = rtas_fadump_unregister_fadump,
> - .invalidate_fadump  = rtas_fadump_invalidate_fadump,
> - .process_fadump = rtas_fadump_process_fadump,
> - .fadump_region_show = rtas_fadump_region_show,
> - .fadump_trigger = rtas_fadump_trigger,
> + .init_fadump_mem_struct = rtas_fadump_init_mem_struct,
> + .get_kernel_metadata_size   = rtas_fadump_get_kernel_metadata_size,
> + .setup_kernel_metadata  = rtas_fadump_setup_kernel_metadata,
> + .register_fadump= rtas_fadump_register_fadump,
> + .unregister_fadump  = rtas_fadump_unregister_fadump,
> + .invalidate_fadump  = rtas_fadump_invalidate_fadump,
> + .process_fadump = rtas_fadump_process_fadump,
> + .fadump_region_show = rtas_fadump_region_show,
> + .fadump_trigger = rtas_fadump_trigger,

Can you make the tab space changes in your previous patch where these
were initially introduced ? So that this patch can only show new members
that are added.

Thanks,
-Mahesh.



[PATCH v4 11/25] powernv/fadump: register kernel metadata address with opal

2019-07-16 Thread Hari Bathini
OPAL allows registering address with it in the first kernel and
retrieving it after MPIPL. Setup kernel metadata and register its
address with OPAL to use it for processing the crash dump.

Signed-off-by: Hari Bathini 
---
 arch/powerpc/kernel/fadump-common.h  |4 +
 arch/powerpc/kernel/fadump.c |   65 ++-
 arch/powerpc/platforms/powernv/opal-fadump.c |   73 ++
 arch/powerpc/platforms/powernv/opal-fadump.h |   37 +
 arch/powerpc/platforms/pseries/rtas-fadump.c |   32 +--
 5 files changed, 177 insertions(+), 34 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/opal-fadump.h

diff --git a/arch/powerpc/kernel/fadump-common.h 
b/arch/powerpc/kernel/fadump-common.h
index 928d364..89b8916 100644
--- a/arch/powerpc/kernel/fadump-common.h
+++ b/arch/powerpc/kernel/fadump-common.h
@@ -117,6 +117,8 @@ struct fw_dump {
 
unsigned long   boot_mem_dest_addr;
 
+   u64 kernel_metadata;
+
int ibm_configure_kernel_dump;
 
unsigned long   fadump_enabled:1;
@@ -131,6 +133,8 @@ struct fw_dump {
 
 struct fadump_ops {
ulong   (*init_fadump_mem_struct)(struct fw_dump *fadump_config);
+   ulong   (*get_kernel_metadata_size)(void);
+   int (*setup_kernel_metadata)(struct fw_dump *fadump_config);
int (*register_fadump)(struct fw_dump *fadump_config);
int (*unregister_fadump)(struct fw_dump *fadump_config);
int (*invalidate_fadump)(struct fw_dump *fadump_config);
diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
index 517a40b..4dd8037 100644
--- a/arch/powerpc/kernel/fadump.c
+++ b/arch/powerpc/kernel/fadump.c
@@ -258,6 +258,9 @@ static unsigned long get_fadump_area_size(void)
size += sizeof(struct elf_phdr) * (memblock_num_regions(memory) + 2);
 
size = PAGE_ALIGN(size);
+
+   /* This is to hold kernel metadata on platforms that support it */
+   size += fw_dump.ops->get_kernel_metadata_size();
return size;
 }
 
@@ -283,17 +286,17 @@ static void __init fadump_reserve_crash_area(unsigned 
long base,
 
 int __init fadump_reserve_mem(void)
 {
+   int ret = 1;
unsigned long base, size, memory_boundary;
 
if (!fw_dump.fadump_enabled)
return 0;
 
if (!fw_dump.fadump_supported) {
-   printk(KERN_INFO "Firmware-assisted dump is not supported on"
-   " this hardware\n");
-   fw_dump.fadump_enabled = 0;
-   return 0;
+   pr_info("Firmware-Assisted Dump is not supported on this 
hardware\n");
+   goto error_out;
}
+
/*
 * Initialize boot memory size
 * If dump is active then we have already calculated the size during
@@ -310,11 +313,13 @@ int __init fadump_reserve_mem(void)
}
 
size = get_fadump_area_size();
+   fw_dump.reserve_dump_area_size = size;
if (memory_limit)
memory_boundary = memory_limit;
else
memory_boundary = memblock_end_of_DRAM();
 
+   base = fw_dump.boot_memory_size;
if (fw_dump.dump_active) {
pr_info("Firmware-assisted dump is active.\n");
 
@@ -332,13 +337,11 @@ int __init fadump_reserve_mem(void)
 * dump is written to disk by userspace tool. This memory
 * will be released for general use once the dump is saved.
 */
-   base = fw_dump.boot_memory_size;
size = memory_boundary - base;
fadump_reserve_crash_area(base, size);
 
pr_debug("fadumphdr_addr = %#016lx\n", fw_dump.fadumphdr_addr);
fw_dump.reserve_dump_area_start = base;
-   fw_dump.reserve_dump_area_size = size;
} else {
/*
 * Reserve memory at an offset closer to bottom of the RAM to
@@ -346,30 +349,42 @@ int __init fadump_reserve_mem(void)
 * use memblock_find_in_range() here since it doesn't allocate
 * from bottom to top.
 */
-   for (base = fw_dump.boot_memory_size;
-base <= (memory_boundary - size);
-base += size) {
+   while (base <= (memory_boundary - size)) {
if (memblock_is_region_memory(base, size) &&
!memblock_is_region_reserved(base, size))
break;
+
+   base += size;
}
-   if ((base > (memory_boundary - size)) ||
-   memblock_reserve(base, size)) {
+
+   if (base > (memory_boundary - size)) {
+   pr_err("Failed to find memory chunk for reservation\n");
+   goto error_out;
+   }
+   fw_dump.reserve_dump_area_start = base;
+
+