Re: [PATCH v3 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel

2020-07-16 Thread Hari Bathini



On 17/07/20 3:33 am, Thiago Jung Bauermann wrote:
> 
> Hari Bathini  writes:
> 
>> On 16/07/20 4:22 am, Thiago Jung Bauermann wrote:
>>>
>>> Hari Bathini  writes:
>>>


 
 +   * each representing a memory range.
 +   */
 +  ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells);
 +
 +  for (i = 0; i < ranges; i++) {
 +  base = of_read_number(prop, n_mem_addr_cells);
 +  prop += n_mem_addr_cells;
 +  end = base + of_read_number(prop, n_mem_size_cells) - 1;
>>
>> prop is not used after the above.
>>
>>> You need to `prop += n_mem_size_cells` here.
>>
>> But yeah, adding it would make it look complete in some sense..
> 
> Isn't it used in the next iteration of the loop?

Memory@XXX/reg typically has only one range. I was looking at it
from that perspective which is not right. Will update.

Thanks
Hari


Re: [PATCH v3 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel

2020-07-16 Thread Thiago Jung Bauermann


Hari Bathini  writes:

> On 16/07/20 4:22 am, Thiago Jung Bauermann wrote:
>> 
>> Hari Bathini  writes:
>> 
>
> 
>
>>> +/**
>>> + * get_node_path - Get the full path of the given node.
>>> + * @dn:Node.
>>> + * @path:  Updated with the full path of the node.
>>> + *
>>> + * Returns nothing.
>>> + */
>>> +static void get_node_path(struct device_node *dn, char *path)
>>> +{
>>> +   if (!dn)
>>> +   return;
>>> +
>>> +   get_node_path(dn->parent, path);
>> 
>> Is it ok to do recursion in the kernel? In this case I believe it's not
>> problematic since the maximum call depth will be the maximum depth of a
>> device tree node which shouldn't be too much. Also, there are no local
>> variables in this function. But I thought it was worth mentioning.
>
> You are right. We are better off avoiding the recursion here. Will
> change it to an iterative version instead.

Ok.

>>> +* each representing a memory range.
>>> +*/
>>> +   ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells);
>>> +
>>> +   for (i = 0; i < ranges; i++) {
>>> +   base = of_read_number(prop, n_mem_addr_cells);
>>> +   prop += n_mem_addr_cells;
>>> +   end = base + of_read_number(prop, n_mem_size_cells) - 1;
>
> prop is not used after the above.
>
>> You need to `prop += n_mem_size_cells` here.
>
> But yeah, adding it would make it look complete in some sense..

Isn't it used in the next iteration of the loop?

-- 
Thiago Jung Bauermann
IBM Linux Technology Center


Re: [PATCH v3 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel

2020-07-16 Thread Hari Bathini



On 16/07/20 4:22 am, Thiago Jung Bauermann wrote:
> 
> Hari Bathini  writes:
> 



>> +/**
>> + * get_node_path - Get the full path of the given node.
>> + * @dn:Node.
>> + * @path:  Updated with the full path of the node.
>> + *
>> + * Returns nothing.
>> + */
>> +static void get_node_path(struct device_node *dn, char *path)
>> +{
>> +if (!dn)
>> +return;
>> +
>> +get_node_path(dn->parent, path);
> 
> Is it ok to do recursion in the kernel? In this case I believe it's not
> problematic since the maximum call depth will be the maximum depth of a
> device tree node which shouldn't be too much. Also, there are no local
> variables in this function. But I thought it was worth mentioning.

You are right. We are better off avoiding the recursion here. Will
change it to an iterative version instead.
 
>> + * each representing a memory range.
>> + */
>> +ranges = (len >> 2) / (n_mem_addr_cells + n_mem_size_cells);
>> +
>> +for (i = 0; i < ranges; i++) {
>> +base = of_read_number(prop, n_mem_addr_cells);
>> +prop += n_mem_addr_cells;
>> +end = base + of_read_number(prop, n_mem_size_cells) - 1;

prop is not used after the above.

> You need to `prop += n_mem_size_cells` here.

But yeah, adding it would make it look complete in some sense..

Thanks
Hari


Re: [PATCH v3 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel

2020-07-15 Thread Thiago Jung Bauermann


Hari Bathini  writes:

>  /**
> + * get_usable_memory_ranges - Get usable memory ranges. This list includes
> + *regions like crashkernel, opal/rtas & 
> tce-table,
> + *that kdump kernel could use.
> + * @mem_ranges:   Range list to add the memory ranges to.
> + *
> + * Returns 0 on success, negative errno on error.
> + */
> +static int get_usable_memory_ranges(struct crash_mem **mem_ranges)
> +{
> + int ret;
> +
> + /* First memory block & crashkernel region */
> + ret = add_mem_range(mem_ranges, 0, crashk_res.end + 1);

This is a bit surprising. I guess I don't have a complete big picture of
the patch series yet. What prevents the crashkernel from using memory at
the [0, _end] range and overwriting the crashed kernel's memory?

Shouldn't the above range start at crashk_res.start?

> + if (ret)
> + goto out;
> +
> + ret = add_rtas_mem_range(mem_ranges);
> + if (ret)
> + goto out;
> +
> + ret = add_opal_mem_range(mem_ranges);
> + if (ret)
> + goto out;
> +
> + ret = add_tce_mem_ranges(mem_ranges);
> +out:
> + if (ret)
> + pr_err("Failed to setup usable memory ranges\n");
> + return ret;
> +}
> +
> +/**
>   * __locate_mem_hole_top_down - Looks top down for a large enough memory hole
>   *  in the memory regions between buf_min & 
> buf_max
>   *  for the buffer. If found, sets kbuf->mem.
> @@ -261,6 +305,322 @@ static int locate_mem_hole_bottom_up_ppc64(struct 
> kexec_buf *kbuf,
>  }
>
>  /**
> + * check_realloc_usable_mem - Reallocate buffer if it can't accommodate 
> entries
> + * @um_info:  Usable memory buffer and ranges info.
> + * @cnt:  No. of entries to accommodate.
> + *
> + * Returns 0 on success, negative errno on error.

It actually returns the buffer on success, and NULL on error.

> + */
> +static uint64_t *check_realloc_usable_mem(struct umem_info *um_info, int cnt)
> +{
> + void *tbuf;
> +
> + if (um_info->size >=
> + ((um_info->idx + cnt) * sizeof(*(um_info->buf
> + return um_info->buf;
> +
> + um_info->size += MEM_RANGE_CHUNK_SZ;
> + tbuf = krealloc(um_info->buf, um_info->size, GFP_KERNEL);
> + if (!tbuf) {
> + um_info->size -= MEM_RANGE_CHUNK_SZ;
> + return NULL;
> + }
> +
> + memset(tbuf + um_info->idx, 0, MEM_RANGE_CHUNK_SZ);
> + return tbuf;
> +}



> +/**
> + * get_node_path - Get the full path of the given node.
> + * @dn:Node.
> + * @path:  Updated with the full path of the node.
> + *
> + * Returns nothing.
> + */
> +static void get_node_path(struct device_node *dn, char *path)
> +{
> + if (!dn)
> + return;
> +
> + get_node_path(dn->parent, path);

Is it ok to do recursion in the kernel? In this case I believe it's not
problematic since the maximum call depth will be the maximum depth of a
device tree node which shouldn't be too much. Also, there are no local
variables in this function. But I thought it was worth mentioning.

> + sprintf(path, "/%s", dn->full_name);
> +}
> +
> +/**
> + * get_node_pathlen - Get the full path length of the given node.
> + * @dn:   Node.
> + *
> + * Returns the length of the full path of the node.
> + */
> +static int get_node_pathlen(struct device_node *dn)
> +{
> + int len = 0;
> +
> + while (dn) {
> + len += strlen(dn->full_name) + 1;
> + dn = dn->parent;
> + }
> + len++;
> +
> + return len;
> +}
> +
> +/**
> + * add_usable_mem_property - Add usable memory property for the given
> + *   memory node.
> + * @fdt: Flattened device tree for the kdump kernel.
> + * @dn:  Memory node.
> + * @um_info: Usable memory buffer and ranges info.
> + *
> + * Returns 0 on success, negative errno on error.
> + */
> +static int add_usable_mem_property(void *fdt, struct device_node *dn,
> +struct umem_info *um_info)
> +{
> + int n_mem_addr_cells, n_mem_size_cells, node;
> + int i, len, ranges, cnt, ret;
> + uint64_t base, end, *buf;
> + const __be32 *prop;
> + char *pathname;
> +
> + /* Allocate memory for node path */
> + pathname = kzalloc(ALIGN(get_node_pathlen(dn), 8), GFP_KERNEL);
> + if (!pathname)
> + return -ENOMEM;
> +
> + /* Get the full path of the memory node */
> + get_node_path(dn, pathname);
> + pr_debug("Memory node path: %s\n", pathname);
> +
> + /* Now that we know the path, find its offset in kdump kernel's fdt */
> + node = fdt_path_offset(fdt, pathname);
> + if (node < 0) {
> + pr_err("Malformed device tree: error reading %s\n",
> +pathname);
> + ret = -EINVAL;
> + goto out;
> + }
> 

[PATCH v3 06/12] ppc64/kexec_file: restrict memory usage of kdump kernel

2020-07-13 Thread Hari Bathini
Kdump kernel, used for capturing the kernel core image, is supposed
to use only specific memory regions to avoid corrupting the image to
be captured. The regions are crashkernel range - the memory reserved
explicitly for kdump kernel, memory used for the tce-table, the OPAL
region and RTAS region as applicable. Restrict kdump kernel memory
to use only these regions by setting up usable-memory DT property.
Also, tell the kdump kernel to run at the loaded address by setting
the magic word at 0x5c.

Signed-off-by: Hari Bathini 
Tested-by: Pingfan Liu 
---

v2 -> v3:
* Unchanged. Added Tested-by tag from Pingfan.

v1 -> v2:
* Fixed off-by-one error while setting up usable-memory properties.
* Updated add_rtas_mem_range() & add_opal_mem_range() callsites based on
  the new prototype for these functions.


 arch/powerpc/kexec/file_load_64.c |  401 +
 1 file changed, 399 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kexec/file_load_64.c 
b/arch/powerpc/kexec/file_load_64.c
index 7673481..1c4e3eb 100644
--- a/arch/powerpc/kexec/file_load_64.c
+++ b/arch/powerpc/kexec/file_load_64.c
@@ -17,10 +17,22 @@
 #include 
 #include 
 #include 
+#include 
 #include 
+#include 
+#include 
 #include 
 #include 
 
+struct umem_info {
+   uint64_t *buf; /* data buffer for usable-memory property */
+   uint32_t idx;  /* current index */
+   uint32_t size; /* size allocated for the data buffer */
+
+   /* usable memory ranges to look up */
+   const struct crash_mem *umrngs;
+};
+
 const struct kexec_file_ops * const kexec_file_loaders[] = {
_elf64_ops,
NULL
@@ -76,6 +88,38 @@ static int get_exclude_memory_ranges(struct crash_mem 
**mem_ranges)
 }
 
 /**
+ * get_usable_memory_ranges - Get usable memory ranges. This list includes
+ *regions like crashkernel, opal/rtas & tce-table,
+ *that kdump kernel could use.
+ * @mem_ranges:   Range list to add the memory ranges to.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int get_usable_memory_ranges(struct crash_mem **mem_ranges)
+{
+   int ret;
+
+   /* First memory block & crashkernel region */
+   ret = add_mem_range(mem_ranges, 0, crashk_res.end + 1);
+   if (ret)
+   goto out;
+
+   ret = add_rtas_mem_range(mem_ranges);
+   if (ret)
+   goto out;
+
+   ret = add_opal_mem_range(mem_ranges);
+   if (ret)
+   goto out;
+
+   ret = add_tce_mem_ranges(mem_ranges);
+out:
+   if (ret)
+   pr_err("Failed to setup usable memory ranges\n");
+   return ret;
+}
+
+/**
  * __locate_mem_hole_top_down - Looks top down for a large enough memory hole
  *  in the memory regions between buf_min & buf_max
  *  for the buffer. If found, sets kbuf->mem.
@@ -261,6 +305,322 @@ static int locate_mem_hole_bottom_up_ppc64(struct 
kexec_buf *kbuf,
 }
 
 /**
+ * check_realloc_usable_mem - Reallocate buffer if it can't accommodate entries
+ * @um_info:  Usable memory buffer and ranges info.
+ * @cnt:  No. of entries to accommodate.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static uint64_t *check_realloc_usable_mem(struct umem_info *um_info, int cnt)
+{
+   void *tbuf;
+
+   if (um_info->size >=
+   ((um_info->idx + cnt) * sizeof(*(um_info->buf
+   return um_info->buf;
+
+   um_info->size += MEM_RANGE_CHUNK_SZ;
+   tbuf = krealloc(um_info->buf, um_info->size, GFP_KERNEL);
+   if (!tbuf) {
+   um_info->size -= MEM_RANGE_CHUNK_SZ;
+   return NULL;
+   }
+
+   memset(tbuf + um_info->idx, 0, MEM_RANGE_CHUNK_SZ);
+   return tbuf;
+}
+
+/**
+ * add_usable_mem - Add the usable memory ranges within the given memory range
+ *  to the buffer
+ * @um_info:Usable memory buffer and ranges info.
+ * @base:   Base address of memory range to look for.
+ * @end:End address of memory range to look for.
+ * @cnt:No. of usable memory ranges added to buffer.
+ *
+ * Returns 0 on success, negative errno on error.
+ */
+static int add_usable_mem(struct umem_info *um_info, uint64_t base,
+ uint64_t end, int *cnt)
+{
+   uint64_t loc_base, loc_end, *buf;
+   const struct crash_mem *umrngs;
+   int i, add;
+
+   *cnt = 0;
+   umrngs = um_info->umrngs;
+   for (i = 0; i < umrngs->nr_ranges; i++) {
+   add = 0;
+   loc_base = umrngs->ranges[i].start;
+   loc_end = umrngs->ranges[i].end;
+   if (loc_base >= base && loc_end <= end)
+   add = 1;
+   else if (base < loc_end && end > loc_base) {
+   if (loc_base < base)
+   loc_base = base;
+   if