Re: [PATCH 0/5] kexec: minor fixups and enhancements

2014-08-26 Thread Dave Young
On 08/25/14 at 12:59pm, Vivek Goyal wrote:
> On Fri, Aug 22, 2014 at 06:39:47PM +, Geoff Levand wrote:
> > Hi,
> > 
> > Here are a few minor fixups and enhancements for kexec support. 
> > 
> > Patch 3 and 4 that add preprocessor macros for the kimage list flags are
> > ones that I use in the arm64 kexec support I am working on, so it would
> > be nice for those to go in.
> > 
> > Please consider.
> 
> Hi Geoff,
> 
> Does arm64 has secureboot? If yes, then it might make sense to
> enable the new syscall kexec_file_load() on arm64 instead of trying
> to make old syscall work first.

It will save efforts for efi support as well, for the in-kernel loader we do not
necessary to save the efi physical addresses or runtime ranges to sysfs and
passing them to 2nd kernel, we can just copy them in kernel.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2 0/2] Append new variables to vmcoreinfo (PTRS_PER_PGD for arm64 and MAX_PHYSMEM_BITS for all archs)

2019-03-11 Thread Dave Young
Hi Bhupesh,
On 03/10/19 at 03:34pm, Bhupesh Sharma wrote:
> Changes since v1:
> 
> - v1 was sent out as a single patch which can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
> 
> - v2 breaks the single patch into two independent patches:
>   [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas
>   [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code 
> (all archs)
> 
> This patchset primarily fixes the regression reported in user-space
> utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture
> with the availability of 52-bit address space feature in underlying
> kernel. These regressions have been reported both on CPUs which don't
> support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels
> and also on prototype platforms (like ARMv8 FVP simulator model) which
> support ARMv8.2 extensions and are running newer kernels.
> 
> The reason for these regressions is that right now user-space tools
> have no direct access to these values (since these are not exported
> from the kernel) and hence need to rely on a best-guess method of
> determining value of 'PTRS_PER_PGD' and 'MAX_PHYSMEM_BITS' supported
> by underlying kernel.
> 
> Exporting these values via vmcoreinfo will help user-land in such cases.
> In addition, as per suggestion from makedumpfile maintainer (Kazu)
> during v1 review, it makes more sense to append 'MAX_PHYSMEM_BITS' to
> vmcoreinfo in the core code itself rather than in arm64 arch-specific
> code, so that the user-space code for other archs can also benefit from
> this addition to the vmcoreinfo and use it as a standard way of
> determining 'SECTIONS_SHIFT' value in user-land.
> 
> Cc: Mark Rutland 
> Cc: James Morse 
> Cc: Will Deacon 
> Cc: Boris Petkov 
> Cc: Ingo Molnar 
> Cc: Thomas Gleixner 
> Cc: Michael Ellerman 
> Cc: Paul Mackerras 
> Cc: Benjamin Herrenschmidt 
> Cc: Dave Anderson 
> Cc: Kazuhito Hagio 
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> Cc: ke...@lists.infradead.org
> 
> Bhupesh Sharma (2):
>   arm64, vmcoreinfo : Append 'PTRS_PER_PGD' to vmcoreinfo
>   crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
> 
>  arch/arm64/kernel/crash_core.c | 1 +
>  kernel/crash_core.c| 1 +
>  2 files changed, 2 insertions(+)
> 

Lianbo's document patch has been merged, would you mind to add vmcoreinfo doc
patch as well in your series?

Thanks
Dave


Re: [PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)

2019-11-21 Thread Dave Young
On 11/11/19 at 01:31pm, Bhupesh Sharma wrote:
> Changes since v3:
> 
> - v3 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
> - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo
>   instead of PTRS_PER_PGD.
> - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
>   'Documentation/arm64/memory.rst'
> 
> Changes since v2:
> 
> - v2 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
> - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM
>   ifdef sections, as suggested by Kazu.
> - Updated vmcoreinfo documentation to add description about
>   'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).
> 
> Changes since v1:
> 
> - v1 was sent out as a single patch which can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
> 
> - v2 breaks the single patch into two independent patches:
>   [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas
>   [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code 
> (all archs)
> 
> This patchset primarily fixes the regression reported in user-space
> utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture
> with the availability of 52-bit address space feature in underlying
> kernel. These regressions have been reported both on CPUs which don't
> support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels
> and also on prototype platforms (like ARMv8 FVP simulator model) which
> support ARMv8.2 extensions and are running newer kernels.
> 
> The reason for these regressions is that right now user-space tools
> have no direct access to these values (since these are not exported
> from the kernel) and hence need to rely on a best-guess method of
> determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported
> by underlying kernel.
> 
> Exporting these values via vmcoreinfo will help user-land in such cases.
> In addition, as per suggestion from makedumpfile maintainer (Kazu),
> it makes more sense to append 'MAX_PHYSMEM_BITS' to
> vmcoreinfo in the core code itself rather than in arm64 arch-specific
> code, so that the user-space code for other archs can also benefit from
> this addition to the vmcoreinfo and use it as a standard way of
> determining 'SECTIONS_SHIFT' value in user-land.
> 
> Cc: Boris Petkov 
> Cc: Ingo Molnar 
> Cc: Thomas Gleixner 
> Cc: Jonathan Corbet 
> Cc: James Morse 
> Cc: Mark Rutland 
> Cc: Will Deacon 
> Cc: Steve Capper 
> Cc: Catalin Marinas 
> Cc: Ard Biesheuvel 
> Cc: Michael Ellerman 
> Cc: Paul Mackerras 
> Cc: Benjamin Herrenschmidt 
> Cc: Dave Anderson 
> Cc: Kazuhito Hagio 
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> Cc: linux-...@vger.kernel.org
> Cc: ke...@lists.infradead.org
> 
> Bhupesh Sharma (3):
>   crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
>   arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo

Soft reminder:  the new introduced vmcoreinfo needs documentation

Please check Documentation/admin-guide/kdump/vmcoreinfo.rst

Thanks
Dave



Re: [PATCH 18/39] docs: admin-guide: add kdump documentation into it

2019-07-04 Thread Dave Young
 y
> diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
> index 31a7d12db705..c2858ac6a46a 100644
> --- a/arch/sh/Kconfig
> +++ b/arch/sh/Kconfig
> @@ -626,7 +626,7 @@ config CRASH_DUMP
> to a memory address not used by the main kernel using
> PHYSICAL_START.
>  
> -   For more details see Documentation/kdump/kdump.rst
> +   For more details see Documentation/admin-guide/kdump/kdump.rst
>  
>  config KEXEC_JUMP
>   bool "kexec jump (EXPERIMENTAL)"
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index c9d59ca5e3ac..489fd833b980 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -2057,7 +2057,7 @@ config CRASH_DUMP
> to a memory address not used by the main kernel or BIOS using
> PHYSICAL_START, or it must be built as a relocatable image
> (CONFIG_RELOCATABLE=y).
> -   For more details see Documentation/kdump/kdump.rst
> +   For more details see Documentation/admin-guide/kdump/kdump.rst
>  
>  config KEXEC_JUMP
>   bool "kexec jump"
> @@ -2094,7 +2094,7 @@ config PHYSICAL_START
> the reserved region.  In other words, it can be set based on
> the "X" value as specified in the "crashkernel=YM@XM"
> command line boot parameter passed to the panic-ed
> -   kernel. Please take a look at Documentation/kdump/kdump.rst
> +   kernel. Please take a look at 
> Documentation/admin-guide/kdump/kdump.rst
> for more details about crash dumps.
>  
> Usage of bzImage for capturing the crash dump is recommended as
> -- 
> 2.21.0
> 

Acked-by: Dave Young 

Thanks
Dave


Re: [PATCH 18/39] docs: admin-guide: add kdump documentation into it

2019-07-04 Thread Dave Young
On 07/05/19 at 11:43am, Alex Shi wrote:
> 
> 
> 在 2019/6/28 下午8:30, Mauro Carvalho Chehab 写道:
> > The Kdump documentation describes procedures with admins use
> > in order to solve issues on their systems.
> > 
> > Signed-off-by: Mauro Carvalho Chehab 
> > ---
> >  Documentation/admin-guide/bug-hunting.rst| 4 ++--
> >  Documentation/admin-guide/index.rst  | 1 +
> >  Documentation/{ => admin-guide}/kdump/gdbmacros.txt  | 0
> >  Documentation/{ => admin-guide}/kdump/index.rst  | 1 -
> >  Documentation/{ => admin-guide}/kdump/kdump.rst  | 0
> >  Documentation/{ => admin-guide}/kdump/vmcoreinfo.rst | 0
> 
> I am not sure if it's convenience for people to have more levels in docs.
> 
> But I guess, move archs into a Documentation/arch/ dir should be fine. like 
> Documentation/arch/{x86,arm,arm64,ia64,m68k,s390,powerpc,...}

Alex, moving kdump to admin-guide sounds reasonable to me.  I also agree
with you for those arch dependent files can be moved to
Documentation/arch/, maybe you are talking about some other patches in
the series for the arch/? 

Thanks
dave


Re: [PATCH] powerpc/kdump: handle crashkernel memory reservation failure

2018-06-27 Thread Dave Young
On 06/28/18 at 10:49am, Hari Bathini wrote:
> Memory reservation for crashkernel could fail if there are holes around
> kdump kernel offset (128M). Fail gracefully in such cases and print an
> error message.
> 
> Signed-off-by: Hari Bathini 
> ---
>  arch/powerpc/kernel/machine_kexec.c |7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/machine_kexec.c 
> b/arch/powerpc/kernel/machine_kexec.c
> index 936c7e2..6181442 100644
> --- a/arch/powerpc/kernel/machine_kexec.c
> +++ b/arch/powerpc/kernel/machine_kexec.c
> @@ -188,7 +188,12 @@ void __init reserve_crashkernel(void)
>   (unsigned long)(crashk_res.start >> 20),
>   (unsigned long)(memblock_phys_mem_size() >> 20));
>  
> - memblock_reserve(crashk_res.start, crash_size);
> + if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
> + memblock_reserve(crashk_res.start, crash_size)) {
> + printk(KERN_ERR "Failed to reserve memory for crashkernel!\n");
> + crashk_res.start = crashk_res.end = 0;
> + return;
> + }
>  }
>  
>  int overlaps_crashkernel(unsigned long start, unsigned long size)
> 

It would be better to print a separate error message for 
!memblock_is_region_memory
But I think memblock_reserve is unlikly to fail so this patch is also
good.

Reviewed-by: Dave Young 

Thanks
Dave


Re: macmini g4 can't boot with new kernel

2008-06-15 Thread Dave Young
On Sun, Jun 15, 2008 at 3:04 PM, Benjamin Herrenschmidt
<[EMAIL PROTECTED]> wrote:
> On Sun, 2008-06-15 at 12:12 +0800, Dave Young wrote:
>> Hi,
>>
>> I recently built kernel 2.6.26-rc5 for my macmini g4, but it can't
>> boot. I'm not sure if it's related to kernel or yaboot, or the open
>> firmware.
>>
>> My distribution is yellowdog 6.1, config file attached
>>
>> After load the kernel , it reports:
>>
>> zImage starting : loaded at 0x004001f8 (SP:0x0023eea4)
>> Allocating 0xc04b26c1 bytes
>> ...
>> Invalid memory access at %SRR0 : 004020e4 %SRR1 : 3030
>
> What are you trying to boot precisely ?

It's arch/powerpc/boot/zImage

>
> >From yaboot, I would suggest you try to boot a straight vmlinux rather
> than a zImage. The later's never been really supported.

Thanks a lot, I didn't know this before.
vmlinux works indeed.

-- 
Regards
dave
___
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev


Re: [PATCH v2 01/12] kexec_file: allow archs to handle special regions while locating memory hole

2020-07-06 Thread Dave Young
On 07/03/20 at 01:24am, Hari Bathini wrote:
> Some architectures may have special memory regions, within the given
> memory range, which can't be used for the buffer in a kexec segment.
> Implement weak arch_kexec_locate_mem_hole() definition which arch code
> may override, to take care of special regions, while trying to locate
> a memory hole.
> 
> Also, add the missing declarations for arch overridable functions and
> and drop the __weak descriptors in the declarations to avoid non-weak
> definitions from becoming weak.
> 
> Reported-by: kernel test robot 
> [lkp: In v1, arch_kimage_file_post_load_cleanup() declaration was missing]
> Signed-off-by: Hari Bathini 
> ---
> 
> Changes in v2:
> * Introduced arch_kexec_locate_mem_hole() for override and dropped
>   weak arch_kexec_add_buffer().
> * Dropped __weak identifier for arch overridable functions.
> * Fixed the missing declaration for arch_kimage_file_post_load_cleanup()
>   reported by lkp. lkp report for reference:
> - https://lore.kernel.org/patchwork/patch/1264418/
> 
> 
>  include/linux/kexec.h |   29 ++---
>  kernel/kexec_file.c   |   16 ++--
>  2 files changed, 32 insertions(+), 13 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index ea67910..9e93bef 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -183,17 +183,24 @@ int kexec_purgatory_get_set_symbol(struct kimage 
> *image, const char *name,
>  bool get_value);
>  void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char 
> *name);
>  
> -int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> -  unsigned long buf_len);
> -void * __weak arch_kexec_kernel_image_load(struct kimage *image);
> -int __weak arch_kexec_apply_relocations_add(struct purgatory_info *pi,
> - Elf_Shdr *section,
> - const Elf_Shdr *relsec,
> - const Elf_Shdr *symtab);
> -int __weak arch_kexec_apply_relocations(struct purgatory_info *pi,
> - Elf_Shdr *section,
> - const Elf_Shdr *relsec,
> - const Elf_Shdr *symtab);
> +/* Architectures may override the below functions */
> +int arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
> +   unsigned long buf_len);
> +void *arch_kexec_kernel_image_load(struct kimage *image);
> +int arch_kexec_apply_relocations_add(struct purgatory_info *pi,
> +  Elf_Shdr *section,
> +  const Elf_Shdr *relsec,
> +  const Elf_Shdr *symtab);
> +int arch_kexec_apply_relocations(struct purgatory_info *pi,
> +  Elf_Shdr *section,
> +  const Elf_Shdr *relsec,
> +  const Elf_Shdr *symtab);
> +int arch_kimage_file_post_load_cleanup(struct kimage *image);
> +#ifdef CONFIG_KEXEC_SIG
> +int arch_kexec_kernel_verify_sig(struct kimage *image, void *buf,
> +  unsigned long buf_len);
> +#endif
> +int arch_kexec_locate_mem_hole(struct kexec_buf *kbuf);
>  
>  extern int kexec_add_buffer(struct kexec_buf *kbuf);
>  int kexec_locate_mem_hole(struct kexec_buf *kbuf);
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 09cc78d..e89912d 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -636,6 +636,19 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
>  }
>  
>  /**
> + * arch_kexec_locate_mem_hole - Find free memory to place the segments.
> + * @kbuf:   Parameters for the memory search.
> + *
> + * On success, kbuf->mem will have the start address of the memory region 
> found.
> + *
> + * Return: 0 on success, negative errno on error.
> + */
> +int __weak arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
> +{
> + return kexec_locate_mem_hole(kbuf);
> +}
> +
> +/**
>   * kexec_add_buffer - place a buffer in a kexec segment
>   * @kbuf:Buffer contents and memory parameters.
>   *
> @@ -647,7 +660,6 @@ int kexec_locate_mem_hole(struct kexec_buf *kbuf)
>   */
>  int kexec_add_buffer(struct kexec_buf *kbuf)
>  {
> -
>   struct kexec_segment *ksegment;
>   int ret;
>  
> @@ -675,7 +687,7 @@ int kexec_add_buffer(struct kexec_buf *kbuf)
>   kbuf->buf_align = max(kbuf->buf_align, PAGE_SIZE);
>  
>   /* Walk the RAM ranges and allocate a suitable range for the buffer */
> - ret = kexec_locate_mem_hole(kbuf);
> + ret = arch_kexec_locate_mem_hole(kbuf);
>   if (ret)
>   return ret;
>  
> 

Acked-by: Dave Young 

Thanks
Dave



Re: [PATCH 04/11] ppc64/kexec_file: avoid stomping memory used by special regions

2020-07-01 Thread Dave Young
Hi Hari,
On 06/27/20 at 12:35am, Hari Bathini wrote:
> crashkernel region could have an overlap with special memory regions
> like  opal, rtas, tce-table & such. These regions are referred to as
> exclude memory ranges. Setup this ranges during image probe in order
> to avoid them while finding the buffer for different kdump segments.
> Implement kexec_locate_mem_hole_ppc64() that locates a memory hole
> accounting for these ranges. Also, override arch_kexec_add_buffer()
> to locate a memory hole & later call __kexec_add_buffer() function
> with kbuf->mem set to skip the generic locate memory hole lookup.
> 
> Signed-off-by: Hari Bathini 
> ---
>  arch/powerpc/include/asm/crashdump-ppc64.h |   10 +
>  arch/powerpc/include/asm/kexec.h   |7 -
>  arch/powerpc/kexec/elf_64.c|7 +
>  arch/powerpc/kexec/file_load_64.c  |  292 
> 
>  4 files changed, 312 insertions(+), 4 deletions(-)
>  create mode 100644 arch/powerpc/include/asm/crashdump-ppc64.h
> 
[snip]
>  /**
> + * get_exclude_memory_ranges - Get exclude memory ranges. This list includes
> + * regions like opal/rtas, tce-table, initrd,
> + * kernel, htab which should be avoided while
> + * setting up kexec load segments.
> + * @mem_ranges:Range list to add the memory ranges to.
> + *
> + * Returns 0 on success, negative errno on error.
> + */
> +static int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
> +{
> + int ret;
> +
> + ret = add_tce_mem_ranges(mem_ranges);
> + if (ret)
> + goto out;
> +
> + ret = add_initrd_mem_range(mem_ranges);
> + if (ret)
> + goto out;
> +
> + ret = add_htab_mem_range(mem_ranges);
> + if (ret)
> + goto out;
> +
> + ret = add_kernel_mem_range(mem_ranges);
> + if (ret)
> + goto out;
> +
> + ret = add_rtas_mem_range(mem_ranges, false);
> + if (ret)
> + goto out;
> +
> + ret = add_opal_mem_range(mem_ranges, false);
> + if (ret)
> + goto out;
> +
> + ret = add_reserved_ranges(mem_ranges);
> + if (ret)
> + goto out;
> +
> + /* exclude memory ranges should be sorted for easy lookup */
> + sort_memory_ranges(*mem_ranges);
> +out:
> + if (ret)
> + pr_err("Failed to setup exclude memory ranges\n");
> + return ret;
> +}

I'm confused about the "overlap with crashkernel memory", does that mean
those normal kernel used memory could be put in crashkernel reserved
memory range?  If so why can't just skip those areas while crashkernel
doing the reservation?

Thanks
Dave



Re: [PATCH 01/11] kexec_file: allow archs to handle special regions while locating memory hole

2020-07-01 Thread Dave Young
On 06/29/20 at 05:26pm, Hari Bathini wrote:
> Hi Petr,
> 
> On 29/06/20 5:09 pm, Petr Tesarik wrote:
> > Hi Hari,
> > 
> > is there any good reason to add two more functions with a very similar
> > name to an existing function? AFAICS all you need is a way to call a
> > PPC64-specific function from within kexec_add_buffer (PATCH 4/11), so
> > you could add something like this:
> > 
> > int __weak arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
> > {
> > return 0;
> > }
> > 
> > Call this function from kexec_add_buffer where appropriate and then
> > override it for PPC64 (it roughly corresponds to your
> > kexec_locate_mem_hole_ppc64() from PATCH 4/11).
> > 
> > FWIW it would make it easier for me to follow the resulting code.
> 
> Right, Petr.
> 
> I was trying out a few things before I ended up with what I sent here.
> Bu yeah.. I did realize arch_kexec_locate_mem_hole() would have been better
> after sending out v1. Will take care of that in v2.

Another way is use arch private function to locate mem hole, then set
kbuf->mem, and then call kexec_add_buf, it will skip the common locate
hole function.

But other than that I have some confusion about those excluded ranges.
Replied a question to patch 4.

Thanks
Dave



Re: [PATCH 01/11] kexec_file: allow archs to handle special regions while locating memory hole

2020-07-02 Thread Dave Young
On 07/02/20 at 12:01am, Hari Bathini wrote:
> 
> 
> On 01/07/20 1:16 pm, Dave Young wrote:
> > On 06/29/20 at 05:26pm, Hari Bathini wrote:
> >> Hi Petr,
> >>
> >> On 29/06/20 5:09 pm, Petr Tesarik wrote:
> >>> Hi Hari,
> >>>
> >>> is there any good reason to add two more functions with a very similar
> >>> name to an existing function? AFAICS all you need is a way to call a
> >>> PPC64-specific function from within kexec_add_buffer (PATCH 4/11), so
> >>> you could add something like this:
> >>>
> >>> int __weak arch_kexec_locate_mem_hole(struct kexec_buf *kbuf)
> >>> {
> >>>   return 0;
> >>> }
> >>>
> >>> Call this function from kexec_add_buffer where appropriate and then
> >>> override it for PPC64 (it roughly corresponds to your
> >>> kexec_locate_mem_hole_ppc64() from PATCH 4/11).
> >>>
> >>> FWIW it would make it easier for me to follow the resulting code.
> >>
> >> Right, Petr.
> >>
> >> I was trying out a few things before I ended up with what I sent here.
> >> Bu yeah.. I did realize arch_kexec_locate_mem_hole() would have been better
> >> after sending out v1. Will take care of that in v2.
> > 
> > Another way is use arch private function to locate mem hole, then set
> > kbuf->mem, and then call kexec_add_buf, it will skip the common locate
> > hole function.
> 
> Dave, I did think about it. But there are a couple of places this can get
> tricky. One is ima_add_kexec_buffer() and the other is kexec_elf_load().
> These call sites could be updated to set kbuf->mem before kexec_add_buffer().
> But the current approach seemed like the better option for it creates a
> single point of control in setting up segment buffers and also, makes adding
> any new segments simpler, arch-specific segments or otherwise.
> 

Ok, thanks for the explanation.



Re: [PATCH 04/11] ppc64/kexec_file: avoid stomping memory used by special regions

2020-07-02 Thread Dave Young
On 07/01/20 at 11:48pm, Hari Bathini wrote:
> 
> 
> On 01/07/20 1:10 pm, Dave Young wrote:
> > Hi Hari,
> > On 06/27/20 at 12:35am, Hari Bathini wrote:
> >> crashkernel region could have an overlap with special memory regions
> >> like  opal, rtas, tce-table & such. These regions are referred to as
> >> exclude memory ranges. Setup this ranges during image probe in order
> >> to avoid them while finding the buffer for different kdump segments.
> >> Implement kexec_locate_mem_hole_ppc64() that locates a memory hole
> >> accounting for these ranges. Also, override arch_kexec_add_buffer()
> >> to locate a memory hole & later call __kexec_add_buffer() function
> >> with kbuf->mem set to skip the generic locate memory hole lookup.
> >>
> >> Signed-off-by: Hari Bathini 
> >> ---
> >>  arch/powerpc/include/asm/crashdump-ppc64.h |   10 +
> >>  arch/powerpc/include/asm/kexec.h   |7 -
> >>  arch/powerpc/kexec/elf_64.c|7 +
> >>  arch/powerpc/kexec/file_load_64.c  |  292 
> >> 
> >>  4 files changed, 312 insertions(+), 4 deletions(-)
> >>  create mode 100644 arch/powerpc/include/asm/crashdump-ppc64.h
> >>
> > [snip]
> >>  /**
> >> + * get_exclude_memory_ranges - Get exclude memory ranges. This list 
> >> includes
> >> + * regions like opal/rtas, tce-table, initrd,
> >> + * kernel, htab which should be avoided while
> >> + * setting up kexec load segments.
> >> + * @mem_ranges:Range list to add the memory ranges to.
> >> + *
> >> + * Returns 0 on success, negative errno on error.
> >> + */
> >> +static int get_exclude_memory_ranges(struct crash_mem **mem_ranges)
> >> +{
> >> +  int ret;
> >> +
> >> +  ret = add_tce_mem_ranges(mem_ranges);
> >> +  if (ret)
> >> +  goto out;
> >> +
> >> +  ret = add_initrd_mem_range(mem_ranges);
> >> +  if (ret)
> >> +  goto out;
> >> +
> >> +  ret = add_htab_mem_range(mem_ranges);
> >> +  if (ret)
> >> +  goto out;
> >> +
> >> +  ret = add_kernel_mem_range(mem_ranges);
> >> +  if (ret)
> >> +  goto out;
> >> +
> >> +  ret = add_rtas_mem_range(mem_ranges, false);
> >> +  if (ret)
> >> +  goto out;
> >> +
> >> +  ret = add_opal_mem_range(mem_ranges, false);
> >> +  if (ret)
> >> +  goto out;
> >> +
> >> +  ret = add_reserved_ranges(mem_ranges);
> >> +  if (ret)
> >> +  goto out;
> >> +
> >> +  /* exclude memory ranges should be sorted for easy lookup */
> >> +  sort_memory_ranges(*mem_ranges);
> >> +out:
> >> +  if (ret)
> >> +  pr_err("Failed to setup exclude memory ranges\n");
> >> +  return ret;
> >> +}
> > 
> > I'm confused about the "overlap with crashkernel memory", does that mean
> > those normal kernel used memory could be put in crashkernel reserved
> 
> There are regions that could overlap with crashkernel region but they are
> not normal kernel used memory though. These are regions that kernel and/or
> f/w chose to place at a particular address for real mode accessibility
> and/or memory layout between kernel & f/w kind of thing.
> 
> > memory range?  If so why can't just skip those areas while crashkernel
> > doing the reservation?
> 
> crashkernel region has a dependency to be in the first memory block for it
> to be accessible in real mode. Accommodating this requirement while addressing
> other requirements would mean something like what we have now. A list of
> possible special memory regions in crashkernel region to take care of.
> 
> I have plans to split crashkernel region into low & high to have exclusive
> regions for crashkernel, even if that means to have two of them. But that
> is for another day with its own set of complexities to deal with...

Ok, I was not aware the powerpc crashkernel reservation is not
dynamically reserved.  But seems powerpc need those tricks at least for
the time being like you said.

Thanks
Dave



Re: [PATCH 04/11] ppc64/kexec_file: avoid stomping memory used by special regions

2020-07-02 Thread Dave Young
> > I'm confused about the "overlap with crashkernel memory", does that mean
> > those normal kernel used memory could be put in crashkernel reserved
> > memory range?  If so why can't just skip those areas while crashkernel
> > doing the reservation?
> I raised the same question in another mail. As Hari's answer, "kexec -p"
> skips these ranges in user space. And the same logic should be done in
> "kexec -s -p"

See it, thanks!  The confusion also applied to the userspace
implementation though.  Seems they have to be special cases because of 
the powerpc crashkernel reservation implemtation in kernel limitation

Thanks
Dave



Re: [PATCH v6 1/2] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo

2020-07-02 Thread Dave Young
Hi Catalin,
On 07/02/20 at 12:00pm, Catalin Marinas wrote:
> On Thu, May 14, 2020 at 12:22:36AM +0530, Bhupesh Sharma wrote:
> > diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> > index 9f1557b98468..18175687133a 100644
> > --- a/kernel/crash_core.c
> > +++ b/kernel/crash_core.c
> > @@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void)
> > VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
> > VMCOREINFO_STRUCT_SIZE(mem_section);
> > VMCOREINFO_OFFSET(mem_section, section_mem_map);
> > +   VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
> >  #endif
> > VMCOREINFO_STRUCT_SIZE(page);
> > VMCOREINFO_STRUCT_SIZE(pglist_data);
> 
> I can queue this patch via the arm64 tree (together with the second one)
> but I'd like an ack from the kernel/crash_core.c maintainers. They don't
> seem to have been cc'ed either (only the kexec list).

For the VMCOREINFO part, I'm fine with the changes, but since I do not
understand the arm64 pieces so I would like to leave to arm64 people to
review.  If arm64 bits are good enough, feel free to add:

Acked-by: Dave Young 

Thanks
Dave



Re: [PATCHv2] kernel/crash: make parse_crashkernel()'s return value more indicant

2019-04-28 Thread Dave Young
On 04/25/19 at 04:20pm, Pingfan Liu wrote:
> On Wed, Apr 24, 2019 at 4:31 PM Matthias Brugger  wrote:
> >
> >
> [...]
> > > @@ -139,6 +141,8 @@ static int __init parse_crashkernel_simple(char 
> > > *cmdline,
> > >   pr_warn("crashkernel: unrecognized char: %c\n", *cur);
> > >   return -EINVAL;
> > >   }
> > > + if (*crash_size == 0)
> > > + return -EINVAL;
> >
> > This covers the case where I pass an argument like "crashkernel=0M" ?
> > Can't we fix that by using kstrtoull() in memparse and check if the return 
> > value
> > is < 0? In that case we could return without updating the retptr and we 
> > will be
> > fine.
> >
> It seems that kstrtoull() treats 0M as invalid parameter, while
> simple_strtoull() does not.
> 
> If changed like your suggestion, then all the callers of memparse()
> will treats 0M as invalid parameter. This affects many components
> besides kexec.  Not sure this can be done or not.

simple_strtoull is obsolete, move to kstrtoull is the right way.

$ git grep memparse|wc
158 950   10479

Except some documentation/tools etc there are still a log of callers
which directly use the return value as the ull number without error
checking.

So it would be good to mark memparse as obsolete as well in
lib/cmdline.c, and introduce a new function eg. kmemparse() to use
kstrtoull,  and return a real error code, and save the size in an
argument like &size.  Then update X86 crashkernel code to use it.

Thanks
Dave


Re: [PATCHv2] kernel/crash: make parse_crashkernel()'s return value more indicant

2019-04-28 Thread Dave Young
On 04/29/19 at 12:48pm, Pingfan Liu wrote:
> On Mon, Apr 29, 2019 at 11:04 AM Pingfan Liu  wrote:
> >
> > On Sun, Apr 28, 2019 at 4:37 PM Dave Young  wrote:
> > >
> > > On 04/25/19 at 04:20pm, Pingfan Liu wrote:
> > > > On Wed, Apr 24, 2019 at 4:31 PM Matthias Brugger  
> > > > wrote:
> > > > >
> > > > >
> > > > [...]
> > > > > > @@ -139,6 +141,8 @@ static int __init parse_crashkernel_simple(char 
> > > > > > *cmdline,
> > > > > >   pr_warn("crashkernel: unrecognized char: %c\n", *cur);
> > > > > >   return -EINVAL;
> > > > > >   }
> > > > > > + if (*crash_size == 0)
> > > > > > + return -EINVAL;
> > > > >
> > > > > This covers the case where I pass an argument like "crashkernel=0M" ?
> > > > > Can't we fix that by using kstrtoull() in memparse and check if the 
> > > > > return value
> > > > > is < 0? In that case we could return without updating the retptr and 
> > > > > we will be
> > > > > fine.
> > > > >
> > > > It seems that kstrtoull() treats 0M as invalid parameter, while
> > > > simple_strtoull() does not.
> > > >
> > > > If changed like your suggestion, then all the callers of memparse()
> > > > will treats 0M as invalid parameter. This affects many components
> > > > besides kexec.  Not sure this can be done or not.
> > >
> > > simple_strtoull is obsolete, move to kstrtoull is the right way.
> > >
> > > $ git grep memparse|wc
> > > 158 950   10479
> > >
> > > Except some documentation/tools etc there are still a log of callers
> > > which directly use the return value as the ull number without error
> > > checking.
> > >
> > > So it would be good to mark memparse as obsolete as well in
> > > lib/cmdline.c, and introduce a new function eg. kmemparse() to use
> > > kstrtoull,  and return a real error code, and save the size in an
> > > argument like &size.  Then update X86 crashkernel code to use it.
> > >
> > Thank for your good suggestion.
> >
> Go through the v5.0 kernel code, I think it will be a huge job.
> 
> The difference between unsigned long long simple_strtoull(const char
> *cp, char **endp, unsigned int base) and int _kstrtoull(const char *s,
> unsigned int base, unsigned long long *res) is bigger than expected,
> especially the output parameter @res. Many references to
> memparse(const char *ptr, char **retptr) rely on @retptr to work. A
> typical example from arch/x86/kernel/e820.c
> mem_size = memparse(p, &p);
> if (p == oldp)
> return -EINVAL;
> 
> userdef = 1;
> if (*p == '@') {  <--- here
> start_at = memparse(p+1, &p);
> e820__range_add(start_at, mem_size, E820_TYPE_RAM);
> } else if (*p == '#') {
> start_at = memparse(p+1, &p);
> e820__range_add(start_at, mem_size, E820_TYPE_ACPI);
> } else if (*p == '$') {
> start_at = memparse(p+1, &p);
> e820__range_add(start_at, mem_size, E820_TYPE_RESERVED);
> }
> 
> So we need to resolve the prototype of kstrtoull() firstly, and maybe
> kstrtouint() etc too. All of them have lots of references in kernel.
> 
> Any idea about this?


Not only this place, a lot of other places, I think no hurry to fix them
all at one time.

As we talked just do it according to previous reply,  mark memparse as
obsolete, and create a new function to use kstrtoull, and make it used
in crashkernel code first.

Thanks
Dave


Re: [PATCH] powerpc: Fix loading of kernel + initramfs with kexec_file_load()

2019-05-22 Thread Dave Young
On 05/22/19 at 07:01pm, Thiago Jung Bauermann wrote:
> Commit b6664ba42f14 ("s390, kexec_file: drop arch_kexec_mem_walk()")
> changed kexec_add_buffer() to skip searching for a memory location if
> kexec_buf.mem is already set, and use the address that is there.
> 
> In powerpc code we reuse a kexec_buf variable for loading both the kernel
> and the initramfs by resetting some of the fields between those uses, but
> not mem. This causes kexec_add_buffer() to try to load the kernel at the
> same address where initramfs will be loaded, which is naturally rejected:
> 
>   # kexec -s -l --initrd initramfs vmlinuz
>   kexec_file_load failed: Invalid argument
> 
> Setting the mem field before every call to kexec_add_buffer() fixes this
> regression.
> 
> Fixes: b6664ba42f14 ("s390, kexec_file: drop arch_kexec_mem_walk()")
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  arch/powerpc/kernel/kexec_elf_64.c | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/kernel/kexec_elf_64.c 
> b/arch/powerpc/kernel/kexec_elf_64.c
> index ba4f18a43ee8..52a29fc73730 100644
> --- a/arch/powerpc/kernel/kexec_elf_64.c
> +++ b/arch/powerpc/kernel/kexec_elf_64.c
> @@ -547,6 +547,7 @@ static int elf_exec_load(struct kimage *image, struct 
> elfhdr *ehdr,
>   kbuf.memsz = phdr->p_memsz;
>   kbuf.buf_align = phdr->p_align;
>   kbuf.buf_min = phdr->p_paddr + base;
> + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>   ret = kexec_add_buffer(&kbuf);
>   if (ret)
>   goto out;
> @@ -581,7 +582,8 @@ static void *elf64_load(struct kimage *image, char 
> *kernel_buf,
>   struct kexec_buf kbuf = { .image = image, .buf_min = 0,
> .buf_max = ppc64_rma_size };
>   struct kexec_buf pbuf = { .image = image, .buf_min = 0,
> -   .buf_max = ppc64_rma_size, .top_down = true };
> +   .buf_max = ppc64_rma_size, .top_down = true,
> +   .mem = KEXEC_BUF_MEM_UNKNOWN };
>  
>   ret = build_elf_exec_info(kernel_buf, kernel_len, &ehdr, &elf_info);
>   if (ret)
> @@ -606,6 +608,7 @@ static void *elf64_load(struct kimage *image, char 
> *kernel_buf,
>   kbuf.bufsz = kbuf.memsz = initrd_len;
>   kbuf.buf_align = PAGE_SIZE;
>   kbuf.top_down = false;
> + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>   ret = kexec_add_buffer(&kbuf);
>   if (ret)
>   goto out;
> @@ -638,6 +641,7 @@ static void *elf64_load(struct kimage *image, char 
> *kernel_buf,
>   kbuf.bufsz = kbuf.memsz = fdt_size;
>   kbuf.buf_align = PAGE_SIZE;
>   kbuf.top_down = true;
> + kbuf.mem = KEXEC_BUF_MEM_UNKNOWN;
>   ret = kexec_add_buffer(&kbuf);
>   if (ret)
>   goto out;
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Reviewed-by: Dave Young 

Thanks
Dave


Re: [PATCH v4 2/3] powerpc/fadump: Use the correct VMCOREINFO_NOTE_SIZE for phdr

2017-04-26 Thread Dave Young
Ccing ppc list
On 04/20/17 at 07:39pm, Xunlei Pang wrote:
> vmcoreinfo_max_size stands for the vmcoreinfo_data, the
> correct one we should use is vmcoreinfo_note whose total
> size is VMCOREINFO_NOTE_SIZE.
> 
> Like explained in commit 77019967f06b ("kdump: fix exported
> size of vmcoreinfo note"), it should not affect the actual
> function, but we better fix it, also this change should be
> safe and backward compatible.
> 
> After this, we can get rid of variable vmcoreinfo_max_size,
> let's use the corresponding macros directly, fewer variables
> means more safety for vmcoreinfo operation.
> 
> Cc: Mahesh Salgaonkar 
> Cc: Hari Bathini 
> Signed-off-by: Xunlei Pang 
> ---
> v3->v4:
> -Rebased on the latest linux-next
> 
>  arch/powerpc/kernel/fadump.c | 3 +--
>  include/linux/crash_core.h   | 1 -
>  kernel/crash_core.c  | 3 +--
>  3 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> index 466569e..7bd6cd0 100644
> --- a/arch/powerpc/kernel/fadump.c
> +++ b/arch/powerpc/kernel/fadump.c
> @@ -893,8 +893,7 @@ static int fadump_create_elfcore_headers(char *bufp)
>  
>   phdr->p_paddr   = fadump_relocate(paddr_vmcoreinfo_note());
>   phdr->p_offset  = phdr->p_paddr;
> - phdr->p_memsz   = vmcoreinfo_max_size;
> - phdr->p_filesz  = vmcoreinfo_max_size;
> + phdr->p_memsz   = phdr->p_filesz = VMCOREINFO_NOTE_SIZE;
>  
>   /* Increment number of program headers. */
>   (elf->e_phnum)++;
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index ba283a2..7d6bc7b 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -55,7 +55,6 @@
>  
>  extern u32 *vmcoreinfo_note;
>  extern size_t vmcoreinfo_size;
> -extern size_t vmcoreinfo_max_size;
>  
>  Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
> void *data, size_t data_len);
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 0321f04..43cdb00 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -16,7 +16,6 @@
>  /* vmcoreinfo stuff */
>  static unsigned char *vmcoreinfo_data;
>  size_t vmcoreinfo_size;
> -size_t vmcoreinfo_max_size = VMCOREINFO_BYTES;
>  u32 *vmcoreinfo_note;
>  
>  /*
> @@ -343,7 +342,7 @@ void vmcoreinfo_append_str(const char *fmt, ...)
>   r = vscnprintf(buf, sizeof(buf), fmt, args);
>   va_end(args);
>  
> - r = min(r, vmcoreinfo_max_size - vmcoreinfo_size);
> + r = min(r, VMCOREINFO_BYTES - vmcoreinfo_size);
>  
>   memcpy(&vmcoreinfo_data[vmcoreinfo_size], buf, r);
>  
> -- 
> 1.8.3.1
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Reviewed-by: Dave Young 

Thanks
Dave


Re: [PATCH v3 1/5] crash: move crashkernel parsing and vmcore related code under CONFIG_CRASH_CORE

2017-01-04 Thread Dave Young
Hi, Hari

On 01/02/17 at 07:43pm, Hari Bathini wrote:
> Traditionally, kdump is used to save vmcore in case of a crash. Some
> architectures like powerpc can save vmcore using architecture specific
> support instead of kexec/kdump mechanism. Such architecture specific
> support also needs to reserve memory, to be used by dump capture kernel.
> crashkernel parameter can be a reused, for memory reservation, by such
> architecture specific infrastructure.
> 
> But currently, code related to vmcoreinfo and parsing of crashkernel
> parameter is built under CONFIG_KEXEC_CORE. This patch introduces
> CONFIG_CRASH_CORE and moves the above mentioned code under this config,
> allowing code reuse without dependency on CONFIG_KEXEC. There is no
> functional change with this patch.
> 
> Signed-off-by: Hari Bathini 
> ---
> 
> Changes from v2:
> * Used CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE at
>   appropriate places in printk and ksysfs.
> 
> 
>  arch/Kconfig   |4 
>  include/linux/crash_core.h |   65 ++
>  include/linux/kexec.h  |   57 --
>  include/linux/printk.h |4 
>  kernel/Makefile|1 
>  kernel/crash_core.c|  445 
> 
>  kernel/kexec_core.c|  404 
>  kernel/ksysfs.c|8 +
>  kernel/printk/printk.c |6 -
>  9 files changed, 531 insertions(+), 463 deletions(-)
>  create mode 100644 include/linux/crash_core.h
>  create mode 100644 kernel/crash_core.c
> 

[snip]

>  #ifndef CONFIG_TINY_RCU
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 8b26964..d0dfebd 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -32,7 +32,7 @@
>  #include 
>  #include 
>  #include 
> -#include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -951,7 +951,7 @@ const struct file_operations kmsg_fops = {
>   .release = devkmsg_release,
>  };
>  
> -#ifdef CONFIG_KEXEC_CORE
> +#ifdef CONFIG_CRASH_CORE
>  /*
>   * This appends the listed symbols to /proc/vmcore
>   *
> @@ -960,7 +960,7 @@ const struct file_operations kmsg_fops = {
>   * symbols are specifically used so that utilities can access and extract the
>   * dmesg log from a vmcore file after a crash.
>   */
> -void log_buf_kexec_setup(void)
> +void log_buf_crash_setup(void)

I can not think of any better name about the CONFIG_CRASH_CORE though I
feel it is not excellent so personally I can live with it.

But for this function name log_buf_crash_setup is too general, I can not get
what it is doing from the name, how about change it to log_buf_vmcoreinfo_setup

>  {
>   VMCOREINFO_SYMBOL(log_buf);
>   VMCOREINFO_SYMBOL(log_buf_len);
> 

Thanks
Dave


Re: [PATCH v3 2/5] ia64: reuse append_elf_note() and final_note() functions

2017-01-04 Thread Dave Young
On 01/02/17 at 07:44pm, Hari Bathini wrote:
> Get rid of multiple definitions of append_elf_note() & final_note()
> functions. Reuse these functions compiled under CONFIG_CRASH_CORE
> Also, define Elf_Word and use it instead of generic u32 or the more
> specific Elf64_Word.
> 
> Signed-off-by: Hari Bathini 
> ---
> 
> Changes from v2:
> * Added a definition for Elf_Word.
> * Used IA64 version of append_elf_note() and final_note() functions.
> 
> 
>  arch/ia64/kernel/crash.c   |   22 --
>  include/linux/crash_core.h |4 
>  include/linux/elf.h|2 ++
>  kernel/crash_core.c|   34 ++
>  kernel/kexec_core.c|   28 
>  5 files changed, 20 insertions(+), 70 deletions(-)
> 
> diff --git a/arch/ia64/kernel/crash.c b/arch/ia64/kernel/crash.c
> index 2955f35..75859a0 100644
> --- a/arch/ia64/kernel/crash.c
> +++ b/arch/ia64/kernel/crash.c
> @@ -27,28 +27,6 @@ static int kdump_freeze_monarch;
>  static int kdump_on_init = 1;
>  static int kdump_on_fatal_mca = 1;
>  
> -static inline Elf64_Word
> -*append_elf_note(Elf64_Word *buf, char *name, unsigned type, void *data,
> - size_t data_len)
> -{
> - struct elf_note *note = (struct elf_note *)buf;
> - note->n_namesz = strlen(name) + 1;
> - note->n_descsz = data_len;
> - note->n_type   = type;
> - buf += (sizeof(*note) + 3)/4;
> - memcpy(buf, name, note->n_namesz);
> - buf += (note->n_namesz + 3)/4;
> - memcpy(buf, data, data_len);
> - buf += (data_len + 3)/4;
> - return buf;
> -}
> -
> -static void
> -final_note(void *buf)
> -{
> - memset(buf, 0, sizeof(struct elf_note));
> -}
> -
>  extern void ia64_dump_cpu_regs(void *);
>  
>  static DEFINE_PER_CPU(struct elf_prstatus, elf_prstatus);
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 18d0f94..541a197 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -55,6 +55,10 @@ extern u32 vmcoreinfo_note[VMCOREINFO_NOTE_SIZE/4];
>  extern size_t vmcoreinfo_size;
>  extern size_t vmcoreinfo_max_size;
>  
> +Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
> +   void *data, size_t data_len);
> +void final_note(Elf_Word *buf);
> +
>  int __init parse_crashkernel(char *cmdline, unsigned long long system_ram,
>   unsigned long long *crash_size, unsigned long long *crash_base);
>  int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
> diff --git a/include/linux/elf.h b/include/linux/elf.h
> index 20fa8d8..ba069e8 100644
> --- a/include/linux/elf.h
> +++ b/include/linux/elf.h
> @@ -29,6 +29,7 @@ extern Elf32_Dyn _DYNAMIC [];
>  #define elf_note elf32_note
>  #define elf_addr_t   Elf32_Off
>  #define Elf_Half Elf32_Half
> +#define Elf_Word Elf32_Word
>  
>  #else
>  
> @@ -39,6 +40,7 @@ extern Elf64_Dyn _DYNAMIC [];
>  #define elf_note elf64_note
>  #define elf_addr_t   Elf64_Off
>  #define Elf_Half Elf64_Half
> +#define Elf_Word Elf64_Word
>  
>  #endif
>  
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 212c705..3ac172c 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -291,32 +291,26 @@ int __init parse_crashkernel_low(char *cmdline,
>   "crashkernel=", suffix_tbl[SUFFIX_LOW]);
>  }
>  
> -static u32 *append_elf_note(u32 *buf, char *name, unsigned int type,
> - void *data, size_t data_len)
> +Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
> +   void *data, size_t data_len)
>  {
> - struct elf_note note;
> -
> - note.n_namesz = strlen(name) + 1;
> - note.n_descsz = data_len;
> - note.n_type   = type;
> - memcpy(buf, ¬e, sizeof(note));
> - buf += (sizeof(note) + 3)/4;
> - memcpy(buf, name, note.n_namesz);
> - buf += (note.n_namesz + 3)/4;
> - memcpy(buf, data, note.n_descsz);
> - buf += (note.n_descsz + 3)/4;
> + struct elf_note *note = (struct elf_note *)buf;
> +
> + note->n_namesz = strlen(name) + 1;
> + note->n_descsz = data_len;
> + note->n_type   = type;
> + buf += (sizeof(*note) + 3)/4;
> + memcpy(buf, name, note->n_namesz);
> + buf += (note->n_namesz + 3)/4;
> + memcpy(buf, data, data_len);
> + buf += (data_len + 3)/4;
>  
>   return buf;
>  }
>  
> -static void final_note(u32 *buf)
> +void final_note(Elf_Word *buf)
>  {
> - struct elf_note note;
> -
> - note.n_namesz = 0;
> - note.n_descsz = 0;
> - note.n_type   = 0;
> - memcpy(buf, ¬e, sizeof(note));
> + memset(buf, 0, sizeof(struct elf_note));
>  }
>  
>  static void update_vmcoreinfo_note(void)
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 2179a16..263d764 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -990,34 +990,6 @@ int crash_shrink_memory(unsigned long new_size)
>   return re

Re: [PATCH v4 1/5] crash: move crashkernel parsing and vmcore related code under CONFIG_CRASH_CORE

2017-01-05 Thread Dave Young
Hi Hari

Thanks for the update.

On 01/05/17 at 10:59pm, Hari Bathini wrote:
> Traditionally, kdump is used to save vmcore in case of a crash. Some
> architectures like powerpc can save vmcore using architecture specific
> support instead of kexec/kdump mechanism. Such architecture specific
> support also needs to reserve memory, to be used by dump capture kernel.
> crashkernel parameter can be a reused, for memory reservation, by such
> architecture specific infrastructure.
> 
> But currently, code related to vmcoreinfo and parsing of crashkernel
> parameter is built under CONFIG_KEXEC_CORE. This patch introduces
> CONFIG_CRASH_CORE and moves the above mentioned code under this config,
> allowing code reuse without dependency on CONFIG_KEXEC. There is no
> functional change with this patch.
> 
> Signed-off-by: Hari Bathini 
> ---
> 
> Changes from v3:
> * Renamed log_buf_kexec_setup()to log_buf_vmcoreinfo_setup() instead of
>   log_buf_crash_setup().
> 
> Changes from v2:
> * Used CONFIG_CRASH_CORE instead of CONFIG_KEXEC_CORE at
>   appropriate places in printk and ksysfs.
> 
> 
>  arch/Kconfig   |4 
>  include/linux/crash_core.h |   65 ++
>  include/linux/kexec.h  |   57 --
>  include/linux/printk.h |4 
>  kernel/Makefile|1 
>  kernel/crash_core.c|  445 
> 
>  kernel/kexec_core.c|  404 
>  kernel/ksysfs.c|8 +
>  kernel/printk/printk.c |6 -
>  9 files changed, 531 insertions(+), 463 deletions(-)
>  create mode 100644 include/linux/crash_core.h
>  create mode 100644 kernel/crash_core.c
> 

[snip]

Acked-by: Dave Young 

Thanks
Dave


Re: [PATCH v4 2/5] ia64: reuse append_elf_note() and final_note() functions

2017-01-05 Thread Dave Young
uf;
>  }
>  
> -static void final_note(u32 *buf)
> +void final_note(Elf_Word *buf)
>  {
> - struct elf_note note;
> -
> - note.n_namesz = 0;
> - note.n_descsz = 0;
> - note.n_type   = 0;
> - memcpy(buf, ¬e, sizeof(note));
> + memset(buf, 0, sizeof(struct elf_note));
>  }
>  
>  static void update_vmcoreinfo_note(void)
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 2179a16..263d764 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -990,34 +990,6 @@ int crash_shrink_memory(unsigned long new_size)
>   return ret;
>  }
>  
> -static u32 *append_elf_note(u32 *buf, char *name, unsigned type, void *data,
> - size_t data_len)
> -{
> - struct elf_note note;
> -
> - note.n_namesz = strlen(name) + 1;
> - note.n_descsz = data_len;
> - note.n_type   = type;
> - memcpy(buf, ¬e, sizeof(note));
> - buf += (sizeof(note) + 3)/4;
> - memcpy(buf, name, note.n_namesz);
> - buf += (note.n_namesz + 3)/4;
> - memcpy(buf, data, note.n_descsz);
> - buf += (note.n_descsz + 3)/4;
> -
> - return buf;
> -}
> -
> -static void final_note(u32 *buf)
> -{
> - struct elf_note note;
> -
> - note.n_namesz = 0;
> - note.n_descsz = 0;
> - note.n_type   = 0;
> - memcpy(buf, ¬e, sizeof(note));
> -}
> -
>  void crash_save_cpu(struct pt_regs *regs, int cpu)
>  {
>   struct elf_prstatus prstatus;
> 

It looks good to me. But better to have words from IA64 people as well.

Acked-by: Dave Young 

Thanks
Dave


Re: [PATCH] powerpc: kexec_file: Fix error code when trying to load kdump kernel

2018-03-30 Thread Dave Young
On 03/29/18 at 04:05pm, Thiago Jung Bauermann wrote:
> kexec_file_load() on powerpc doesn't support kdump kernels yet, so it
> returns -ENOTSUPP in that case.
> 
> I've recently learned that this errno is internal to the kernel and isn't
> supposed to be exposed to userspace. Therefore, change to -EOPNOTSUPP which
> is defined in an uapi header.
> 
> This does indeed make kexec-tools happier. Before the patch, on ppc64le:
> 
>   # ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz
>   kexec_file_load failed: Unknown error 524
> 
> After the patch:
> 
>   # ~bauermann/src/kexec-tools/build/sbin/kexec -s -p /boot/vmlinuz
>   kexec_file_load failed: Operation not supported
> 
> Fixes: a0458284f062 ("powerpc: Add support code for kexec_file_load()")
> Reported-by: Dave Young 
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  arch/powerpc/kernel/machine_kexec_file_64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> This is a minor issue, but since it's a simple patch it might be worth
> applying it to stable branches.
> 
> This is the kexec-tools thread where this problem was brought up:
> 
> https://lists.infradead.org/pipermail/kexec/2018-March/020346.html
> 
> And this is an instance of a similar fix being applied elsewhere in the
> kernel, for the same reasons:
> 
> https://patchwork.kernel.org/patch/8490791/
> 
> The test shown in the commit log was made using Hari Bathini's patch
> adding kexec_file_load() support to kexec-tools in ppc64.
> 
> diff --git a/arch/powerpc/kernel/machine_kexec_file_64.c 
> b/arch/powerpc/kernel/machine_kexec_file_64.c
> index e4395f937d63..45e0b7d5f200 100644
> --- a/arch/powerpc/kernel/machine_kexec_file_64.c
> +++ b/arch/powerpc/kernel/machine_kexec_file_64.c
> @@ -43,7 +43,7 @@ int arch_kexec_kernel_image_probe(struct kimage *image, 
> void *buf,
>  
>   /* We don't support crash kernels yet. */
>   if (image->type == KEXEC_TYPE_CRASH)
> - return -ENOTSUPP;
> + return -EOPNOTSUPP;
>  
>   for (i = 0; i < ARRAY_SIZE(kexec_file_loaders); i++) {
>   fops = kexec_file_loaders[i];
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Reviewed-by: Dave Young 

Thanks
Dave



Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.

2016-11-19 Thread Dave Young
On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> powerpc's purgatory.ro has 12 relocation types when built as
> a relocatable object. To implement support for them requires
> arch_kexec_apply_relocations_add to duplicate a lot of code with
> module_64.c:apply_relocate_add.
> 
> When built as a Position Independent Executable there are only 4
> relocation types in purgatory.ro, so it becomes practical for the powerpc
> implementation of kexec_file to have its own relocation implementation.
> 
> Also, the purgatory is an executable and not an intermediary output from
> the compiler so it makes sense conceptually that it is easier to build
> it as a PIE than as a partially linked object.
> 
> Apart from the greatly reduced number of relocations, there are two
> differences between a relocatable object and a PIE:
> 
> 1. __kexec_load_purgatory needs to use the program headers rather than the
>section headers to figure out how to load the binary.
> 2. Symbol values are absolute addresses instead of relative to the
>start of the section.
> 
> This patch adds the support needed in generic code for the differences
> above and allows powerpc to load and relocate a position independent
> purgatory.
> 

[snip]

The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
not that complex. So could you look into simplify your kexec_file
implementation?

kernel/kexec_file.c kexec_apply_relocations only do limited things
and some of the logic is in arch/x86, so move general code out of arch
code, then I guess the arch code will be simpler and then we probably
do not need this PIE stuff anymore.

BTW, __kexec_really_load_purgatory looks worse than
___kexec_load_purgatory ;)

Thanks
Dave


Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.

2016-11-21 Thread Dave Young
On 11/21/16 at 09:49pm, Thiago Jung Bauermann wrote:
> Hello Dave,
> 
> Thanks for your review.
> 
> Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> > On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> > > powerpc's purgatory.ro has 12 relocation types when built as
> > > a relocatable object. To implement support for them requires
> > > arch_kexec_apply_relocations_add to duplicate a lot of code with
> > > module_64.c:apply_relocate_add.
> > > 
> > > When built as a Position Independent Executable there are only 4
> > > relocation types in purgatory.ro, so it becomes practical for the powerpc
> > > implementation of kexec_file to have its own relocation implementation.
> > > 
> > > Also, the purgatory is an executable and not an intermediary output from
> > > the compiler so it makes sense conceptually that it is easier to build
> > > it as a PIE than as a partially linked object.
> > > 
> > > Apart from the greatly reduced number of relocations, there are two
> > > differences between a relocatable object and a PIE:
> > > 
> > > 1. __kexec_load_purgatory needs to use the program headers rather than the
> > > 
> > >section headers to figure out how to load the binary.
> > > 
> > > 2. Symbol values are absolute addresses instead of relative to the
> > > 
> > >start of the section.
> > > 
> > > This patch adds the support needed in generic code for the differences
> > > above and allows powerpc to load and relocate a position independent
> > > purgatory.
> > 
> > [snip]
> > 
> > The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> > not that complex. So could you look into simplify your kexec_file
> > implementation?
> 
> I can try, but there is one fundamental issue here: powerpc 
> position-dependent 
> code relies more on relocations than x86 position-dependent code does, so 
> there's a limit to how simple it can be made without switching to position-
> independent code. And it will always be more involved than it is on x86.
> 
> BTW, building x86's purgatory as PIE results in it not having any relocation 
> at all, so it's an advantage even in that architecture. Unfortunately, the 
> machine locks up during reboot and I didn't have time to try to figure out 
> what's going on.
> 
> > kernel/kexec_file.c kexec_apply_relocations only do limited things
> > and some of the logic is in arch/x86, so move general code out of arch
> > code, then I guess the arch code will be simpler
> 
> I agree that is a good idea. Is the patch below what you had in mind?
> 
> > and then we probably do not need this PIE stuff anymore.
> 
> If you are ok with the patch below I can post a new version of the series 
> based on it and we can see if Michael Ellerman thinks it is enough.
> 

Will review it and do a test. Thanks. I believe this will benefit for
other arches if they want a kexec_file in the future.

> > BTW, __kexec_really_load_purgatory looks worse than
> > ___kexec_load_purgatory ;)
> 
> Really? I find the special handling of bss makes the section-based loader a 
> bit more confusing.

I'm not sure I understand above about "*bss*", personally I like
___kexec_load_purgatory more. But anyway if we move arch code as general
code then it will be not necessary anymore..

> 
> -- 
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH] kexec_file: Move generic relocation code from arch/x86 to
>  kernel/kexec_file.c
> 
> The check for undefined symbols stays in arch-specific code because
> powerpc needs to allow TOC symbols to be processed even though they're
> undefined.
> 
> There is no functional change.
> 
> Suggested-by: Dave Young 
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  arch/x86/kernel/machine_kexec_64.c | 160 
> +++--
>  include/linux/kexec.h  |   9 ++-
>  kernel/kexec_file.c| 120 +++-
>  3 files changed, 154 insertions(+), 135 deletions(-)
> 
> diff --git a/arch/x86/kernel/machine_kexec_64.c 
> b/arch/x86/kernel/machine_kexec_64.c
> index 8c1f218926d7..f4860c408ece 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -401,143 +401,45 @@ int arch_kexec_kernel_verify_sig(struct kimage *image, 
> void *kernel,
>  }
>  #endif
>  
> -/*
> - * Apply purgatory relocations.
> - *
> - * ehdr: Pointer to elf headers
> - * sechdrs: Pointer to section h

Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.

2016-11-21 Thread Dave Young
Hi Michael
On 11/22/16 at 05:01pm, Michael Ellerman wrote:
> Thiago Jung Bauermann  writes:
> > Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> >> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> >> > powerpc's purgatory.ro has 12 relocation types when built as
> >> > a relocatable object. To implement support for them requires
> >> > arch_kexec_apply_relocations_add to duplicate a lot of code with
> >> > module_64.c:apply_relocate_add.
> >> > 
> >> > When built as a Position Independent Executable there are only 4
> >> > relocation types in purgatory.ro, so it becomes practical for the powerpc
> >> > implementation of kexec_file to have its own relocation implementation.
> >> > 
> >> > Also, the purgatory is an executable and not an intermediary output from
> >> > the compiler so it makes sense conceptually that it is easier to build
> >> > it as a PIE than as a partially linked object.
> >> > 
> >> > Apart from the greatly reduced number of relocations, there are two
> >> > differences between a relocatable object and a PIE:
> >> > 
> >> > 1. __kexec_load_purgatory needs to use the program headers rather than 
> >> > the
> >> > 
> >> >section headers to figure out how to load the binary.
> >> > 
> >> > 2. Symbol values are absolute addresses instead of relative to the
> >> > 
> >> >start of the section.
> >> > 
> >> > This patch adds the support needed in generic code for the differences
> >> > above and allows powerpc to load and relocate a position independent
> >> > purgatory.
> >> 
> >> [snip]
> >> 
> >> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> >> not that complex. So could you look into simplify your kexec_file
> >> implementation?
> >
> > I can try, but there is one fundamental issue here: powerpc 
> > position-dependent 
> > code relies more on relocations than x86 position-dependent code does, so 
> > there's a limit to how simple it can be made without switching to position-
> > independent code. And it will always be more involved than it is on x86.
> 
> I think we need to go back to the drawing board on this one.
> 
> My hope was that building purgatory as PIE would reduce the amount of
> complexity, but instead it's just added more. Sorry for sending you in
> that direction.
> 
> 
> In general I dislike the level of complexity of the kexec-tools
> purgatory, and in particular I'm not comfortable with things like:
> 
> diff --git a/arch/powerpc/purgatory/sha256.c b/arch/powerpc/purgatory/sha256.c
> new file mode 100644
> index ..6abee1877d56
> --- /dev/null
> +++ b/arch/powerpc/purgatory/sha256.c
> @@ -0,0 +1,6 @@
> +#include "../boot/string.h"
> +
> +/* Avoid including x86's boot/string.h in sha256.c. */
> +#define BOOT_STRING_H
> +
> +#include "../../x86/purgatory/sha256.c"
> 

Agreed, include x86 code in powerpc looks bad

> 
> I think the best way to get this over the line would be to take the
> kexec-lite purgatory implementation and use that to begin with. I know
> it doesn't have all the features of the kexec-tools version, but it
> should work, and we can look at adding the extra features later.

Instead of adding other implementation, moving the purgatory sha256 code
out of x86 sounds better so that we can reuse them cleanly..

> 
> I'll try and get that working tonight.
> 
> cheers

Thanks
Dave


Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.

2016-11-22 Thread Dave Young
On 11/22/16 at 11:44am, Thiago Jung Bauermann wrote:
> Am Dienstag, 22. November 2016, 17:01:10 BRST schrieb Michael Ellerman:
> > Thiago Jung Bauermann  writes:
> > > Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> > >> On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> > >> > powerpc's purgatory.ro has 12 relocation types when built as
> > >> > a relocatable object. To implement support for them requires
> > >> > arch_kexec_apply_relocations_add to duplicate a lot of code with
> > >> > module_64.c:apply_relocate_add.
> > >> > 
> > >> > When built as a Position Independent Executable there are only 4
> > >> > relocation types in purgatory.ro, so it becomes practical for the
> > >> > powerpc
> > >> > implementation of kexec_file to have its own relocation implementation.
> > >> > 
> > >> > Also, the purgatory is an executable and not an intermediary output
> > >> > from
> > >> > the compiler so it makes sense conceptually that it is easier to build
> > >> > it as a PIE than as a partially linked object.
> > >> > 
> > >> > Apart from the greatly reduced number of relocations, there are two
> > >> > differences between a relocatable object and a PIE:
> > >> > 
> > >> > 1. __kexec_load_purgatory needs to use the program headers rather than
> > >> > the
> > >> > 
> > >> >section headers to figure out how to load the binary.
> > >> > 
> > >> > 2. Symbol values are absolute addresses instead of relative to the
> > >> > 
> > >> >start of the section.
> > >> > 
> > >> > This patch adds the support needed in generic code for the differences
> > >> > above and allows powerpc to load and relocate a position independent
> > >> > purgatory.
> > >> 
> > >> [snip]
> > >> 
> > >> The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> > >> not that complex. So could you look into simplify your kexec_file
> > >> implementation?
> > > 
> > > I can try, but there is one fundamental issue here: powerpc
> > > position-dependent code relies more on relocations than x86
> > > position-dependent code does, so there's a limit to how simple it can be
> > > made without switching to position- independent code. And it will always
> > > be more involved than it is on x86.
> > I think we need to go back to the drawing board on this one.
> > 
> > My hope was that building purgatory as PIE would reduce the amount of
> > complexity, but instead it's just added more. Sorry for sending you in
> > that direction.
> 
> It added complexity because in my series powerpc was using a PIE purgatory 
> but 
> x86 kept using a partially-linked object (because of the problem I mentioned 
> I 
> had when trying out a PIE x86 purgatory), so generic code needed two 
> purgatory 
> loaders.
> 
> I'll see if I can make the PIE x86 purgatory to work so that generic code can 
> have only one loader implementation. Then it will indeed be simpler.

Do we really need the PIE purgatory, after moving generic code out of
x86, there will be no much benefit, no? Anyway, the first step should be
making the purgatory code more generic so that it can be easier for
other arches to support kexec_file in the future. 

> 
> 
> Am Dienstag, 22. November 2016, 14:16:22 BRST schrieb Dave Young:
> > Hi Michael
> > 
> > On 11/22/16 at 05:01pm, Michael Ellerman wrote:
> > > In general I dislike the level of complexity of the kexec-tools
> > > purgatory, and in particular I'm not comfortable with things like:
> > > 
> > > diff --git a/arch/powerpc/purgatory/sha256.c
> > > b/arch/powerpc/purgatory/sha256.c new file mode 100644
> > > index ..6abee1877d56
> > > --- /dev/null
> > > +++ b/arch/powerpc/purgatory/sha256.c
> > > @@ -0,0 +1,6 @@
> > > +#include "../boot/string.h"
> > > +
> > > +/* Avoid including x86's boot/string.h in sha256.c. */
> > > +#define BOOT_STRING_H
> > > +
> > > +#include "../../x86/purgatory/sha256.c"
> > 
> > Agreed, include x86 code in powerpc looks bad
> > 
> > > I think the best way to get this over the line would be to take the
> > > kexec-lite purgatory implementation and use that to begin with. I know
> > > it doesn't have all the features of the kexec-tools version, but it
> > > should work, and we can look at adding the extra features later.
> > 
> > Instead of adding other implementation, moving the purgatory sha256 code
> > out of x86 sounds better so that we can reuse them cleanly..
> 
> Do you have a suggestion of where that code can live so that it can be shared 
> between purgatories for different arches?

Maybe it is better to stay in lib/purgatory/

> 
> Do we need a purgatory with generic and arch-specific code like in kexec-
> tools?

Yes, if we have more arches to add kexec_file, this should be
necessary..

> 
> -- 
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 

Thanks
Dave


Re: [PATCH v10 04/10] kexec_file: Add support for purgatory built as PIE.

2016-11-23 Thread Dave Young
On 11/21/16 at 09:49pm, Thiago Jung Bauermann wrote:
> Hello Dave,
> 
> Thanks for your review.
> 
> Am Sonntag, 20. November 2016, 10:45:46 BRST schrieb Dave Young:
> > On 11/10/16 at 01:27am, Thiago Jung Bauermann wrote:
> > > powerpc's purgatory.ro has 12 relocation types when built as
> > > a relocatable object. To implement support for them requires
> > > arch_kexec_apply_relocations_add to duplicate a lot of code with
> > > module_64.c:apply_relocate_add.
> > > 
> > > When built as a Position Independent Executable there are only 4
> > > relocation types in purgatory.ro, so it becomes practical for the powerpc
> > > implementation of kexec_file to have its own relocation implementation.
> > > 
> > > Also, the purgatory is an executable and not an intermediary output from
> > > the compiler so it makes sense conceptually that it is easier to build
> > > it as a PIE than as a partially linked object.
> > > 
> > > Apart from the greatly reduced number of relocations, there are two
> > > differences between a relocatable object and a PIE:
> > > 
> > > 1. __kexec_load_purgatory needs to use the program headers rather than the
> > > 
> > >section headers to figure out how to load the binary.
> > > 
> > > 2. Symbol values are absolute addresses instead of relative to the
> > > 
> > >start of the section.
> > > 
> > > This patch adds the support needed in generic code for the differences
> > > above and allows powerpc to load and relocate a position independent
> > > purgatory.
> > 
> > [snip]
> > 
> > The kexec-tools machine_apply_elf_rel is pretty simple for ppc64, it is
> > not that complex. So could you look into simplify your kexec_file
> > implementation?
> 
> I can try, but there is one fundamental issue here: powerpc 
> position-dependent 
> code relies more on relocations than x86 position-dependent code does, so 
> there's a limit to how simple it can be made without switching to position-
> independent code. And it will always be more involved than it is on x86.
> 
> BTW, building x86's purgatory as PIE results in it not having any relocation 
> at all, so it's an advantage even in that architecture. Unfortunately, the 
> machine locks up during reboot and I didn't have time to try to figure out 
> what's going on.
> 
> > kernel/kexec_file.c kexec_apply_relocations only do limited things
> > and some of the logic is in arch/x86, so move general code out of arch
> > code, then I guess the arch code will be simpler
> 
> I agree that is a good idea. Is the patch below what you had in mind?
> 
> > and then we probably do not need this PIE stuff anymore.
> 
> If you are ok with the patch below I can post a new version of the series 
> based on it and we can see if Michael Ellerman thinks it is enough.
> 
> > BTW, __kexec_really_load_purgatory looks worse than
> > ___kexec_load_purgatory ;)
> 
> Really? I find the special handling of bss makes the section-based loader a 
> bit more confusing.
> 
> -- 
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH] kexec_file: Move generic relocation code from arch/x86 to
>  kernel/kexec_file.c
> 
> The check for undefined symbols stays in arch-specific code because
> powerpc needs to allow TOC symbols to be processed even though they're
> undefined.
> 
> There is no functional change.
> 
> Suggested-by: Dave Young 
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  arch/x86/kernel/machine_kexec_64.c | 160 
> +++--
>  include/linux/kexec.h  |   9 ++-
>  kernel/kexec_file.c| 120 +++-
>  3 files changed, 154 insertions(+), 135 deletions(-)
> 
> diff --git a/arch/x86/kernel/machine_kexec_64.c 
> b/arch/x86/kernel/machine_kexec_64.c
> index 8c1f218926d7..f4860c408ece 100644
> --- a/arch/x86/kernel/machine_kexec_64.c
> +++ b/arch/x86/kernel/machine_kexec_64.c
> @@ -401,143 +401,45 @@ int arch_kexec_kernel_verify_sig(struct kimage *image, 
> void *kernel,
>  }
>  #endif
>  
> -/*
> - * Apply purgatory relocations.
> - *
> - * ehdr: Pointer to elf headers
> - * sechdrs: Pointer to section headers.
> - * relsec: section index of SHT_RELA section.
> - *
> - * TODO: Some of the code belongs to generic code. Move that in kexec.c.
> - */
> -int arch_kexec_apply_relocations_add(const Elf64_Ehdr *ehdr,
> -  Elf64_Shdr *sechdrs, unsigned int relsec)
> +int arch_kexec_a

Re: [PATCH v2 2/5] ia64: reuse append_elf_note() and final_note() functions

2016-11-30 Thread Dave Young
Hi Hari

Personally I like V1 more, but split the patch 2 is easier for ia64
people to reivew.  I did basic x86 testing, it runs ok.

On 11/25/16 at 05:24pm, Hari Bathini wrote:
> Get rid of multiple definitions of append_elf_note() & final_note()
> functions. Reuse these functions compiled under CONFIG_CRASH_CORE.
> 
> Signed-off-by: Hari Bathini 
> ---
>  arch/ia64/kernel/crash.c   |   22 --
>  include/linux/crash_core.h |4 
>  kernel/crash_core.c|6 +++---
>  kernel/kexec_core.c|   28 
>  4 files changed, 7 insertions(+), 53 deletions(-)
> 
> diff --git a/arch/ia64/kernel/crash.c b/arch/ia64/kernel/crash.c
> index 2955f35..75859a0 100644
> --- a/arch/ia64/kernel/crash.c
> +++ b/arch/ia64/kernel/crash.c
> @@ -27,28 +27,6 @@ static int kdump_freeze_monarch;
>  static int kdump_on_init = 1;
>  static int kdump_on_fatal_mca = 1;
>  
> -static inline Elf64_Word
> -*append_elf_note(Elf64_Word *buf, char *name, unsigned type, void *data,
> - size_t data_len)
> -{
> - struct elf_note *note = (struct elf_note *)buf;
> - note->n_namesz = strlen(name) + 1;
> - note->n_descsz = data_len;
> - note->n_type   = type;
> - buf += (sizeof(*note) + 3)/4;
> - memcpy(buf, name, note->n_namesz);
> - buf += (note->n_namesz + 3)/4;
> - memcpy(buf, data, data_len);
> - buf += (data_len + 3)/4;
> - return buf;
> -}
> -
> -static void
> -final_note(void *buf)
> -{
> - memset(buf, 0, sizeof(struct elf_note));
> -}
> -

The above IA64 version looks better than the functions in kexec_core.c
about the Elf64_Word type usage and the simpler final_note function.

Care to update crash_core.c to use this instead?

Otherwise I'm fine with the changes.

>  extern void ia64_dump_cpu_regs(void *);
>  
>  static DEFINE_PER_CPU(struct elf_prstatus, elf_prstatus);
> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
> index 9a4f4b0..2ae20b1 100644
> --- a/include/linux/crash_core.h
> +++ b/include/linux/crash_core.h
> @@ -61,6 +61,10 @@ extern u32 vmcoreinfo_note[VMCOREINFO_NOTE_SIZE/4];
>  extern size_t vmcoreinfo_size;
>  extern size_t vmcoreinfo_max_size;
>  
> +u32 *append_elf_note(u32 *buf, char *name, unsigned int type,
> +  void *data, size_t data_len);
> +void final_note(u32 *buf);
> +
>  int __init parse_crashkernel(char *cmdline, unsigned long long system_ram,
>   unsigned long long *crash_size, unsigned long long *crash_base);
>  int parse_crashkernel_high(char *cmdline, unsigned long long system_ram,
> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
> index 60a98fc..9223976 100644
> --- a/kernel/crash_core.c
> +++ b/kernel/crash_core.c
> @@ -291,8 +291,8 @@ int __init parse_crashkernel_low(char *cmdline,
>   "crashkernel=", suffix_tbl[SUFFIX_LOW]);
>  }
>  
> -static u32 *append_elf_note(u32 *buf, char *name, unsigned int type,
> - void *data, size_t data_len)
> +u32 *append_elf_note(u32 *buf, char *name, unsigned int type,
> +  void *data, size_t data_len)
>  {
>   struct elf_note note;
>  
> @@ -309,7 +309,7 @@ static u32 *append_elf_note(u32 *buf, char *name, 
> unsigned int type,
>   return buf;
>  }
>  
> -static void final_note(u32 *buf)
> +void final_note(u32 *buf)
>  {
>   struct elf_note note;
>  
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 3aa21f3..596cb32 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -988,34 +988,6 @@ int crash_shrink_memory(unsigned long new_size)
>   return ret;
>  }
>  
> -static u32 *append_elf_note(u32 *buf, char *name, unsigned type, void *data,
> - size_t data_len)
> -{
> - struct elf_note note;
> -
> - note.n_namesz = strlen(name) + 1;
> - note.n_descsz = data_len;
> - note.n_type   = type;
> - memcpy(buf, ¬e, sizeof(note));
> - buf += (sizeof(note) + 3)/4;
> - memcpy(buf, name, note.n_namesz);
> - buf += (note.n_namesz + 3)/4;
> - memcpy(buf, data, note.n_descsz);
> - buf += (note.n_descsz + 3)/4;
> -
> - return buf;
> -}
> -
> -static void final_note(u32 *buf)
> -{
> - struct elf_note note;
> -
> - note.n_namesz = 0;
> - note.n_descsz = 0;
> - note.n_type   = 0;
> - memcpy(buf, ¬e, sizeof(note));
> -}
> -
>  void crash_save_cpu(struct pt_regs *regs, int cpu)
>  {
>   struct elf_prstatus prstatus;
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Thanks
Dave


Re: [PATCH 0/8] kexec_file_load implementation for PowerPC

2016-06-13 Thread Dave Young
On 06/12/16 at 12:10am, Thiago Jung Bauermann wrote:
> Hello,
> 
> This patch series implements the kexec_file_load system call on PowerPC.
> 
> It starts by removing an x86 assumption from kexec_file: kexec_add_buffer uses
> iomem to find reserved memory ranges, but PowerPC uses the memblock subsystem.
> Hooks are added so that each arch can specify how memory ranges can be found.
> 
> Also, the memory-walking logic in kexec_add_buffer is useful in this
> implementation to find a free area for the purgatory's stack, so that same
> patch moves that logic to kexec_locate_mem_hole.
> 
> The kexec_file_load system call needs to apply relocations to the purgatory
> but adding code for that would duplicate functionality with the module loading
> mechanism, which also needs to apply relocations to the kernel modules.
> Therefore, this patch series factors out the module relocation code so that it
> can be shared.
> 
> One thing that is still missing is crashkernel support, which I intend to
> submit shortly.

But seems there's no error handling in patches about KEXEC_ON_CRASH?
It would be good to either sending out the whole set or handle the error
cases correctly.

[snip]

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 1/8] kexec_file: Remove unused members from struct kexec_buf.

2016-06-13 Thread Dave Young
On 06/12/16 at 12:10am, Thiago Jung Bauermann wrote:
> kexec_add_buffer uses kexec_buf.buffer and kexec_buf.bufsz to pass along
> its own arguments buffer and bufsz, but since they aren't used anywhere
> else, it's pointless.
> 
> Cc: Eric Biederman 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  kernel/kexec_file.c | 6 ++
>  kernel/kexec_internal.h | 2 --
>  2 files changed, 2 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 01ab82a40d22..b6eec7527e9f 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -464,8 +464,6 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>   memset(&buf, 0, sizeof(struct kexec_buf));
>   kbuf = &buf;
>   kbuf->image = image;
> - kbuf->buffer = buffer;
> - kbuf->bufsz = bufsz;
>  
>   kbuf->memsz = ALIGN(memsz, PAGE_SIZE);
>   kbuf->buf_align = max(buf_align, PAGE_SIZE);
> @@ -489,8 +487,8 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>  
>   /* Found a suitable memory range */
>   ksegment = &image->segment[image->nr_segments];
> - ksegment->kbuf = kbuf->buffer;
> - ksegment->bufsz = kbuf->bufsz;
> + ksegment->kbuf = buffer;
> + ksegment->bufsz = bufsz;
>   ksegment->mem = kbuf->mem;
>   ksegment->memsz = kbuf->memsz;
>   image->nr_segments++;
> diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
> index 0a52315d9c62..eefd5bf960c2 100644
> --- a/kernel/kexec_internal.h
> +++ b/kernel/kexec_internal.h
> @@ -26,8 +26,6 @@ struct kexec_sha_region {
>   */
>  struct kexec_buf {
>   struct kimage *image;
> - char *buffer;
> - unsigned long bufsz;
>   unsigned long mem;
>   unsigned long memsz;
>   unsigned long buf_align;
> -- 
> 1.9.1
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Acked-by: Dave Young 

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 2/8] kexec_file: Generalize kexec_add_buffer.

2016-06-13 Thread Dave Young
On 06/12/16 at 12:10am, Thiago Jung Bauermann wrote:
> Allow architectures to specify different memory walking functions for
> kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
> but PowerPC uses the memblock subsystem.
> 
> Also, factor kexec_locate_mem_hole out of kexec_add_buffer. It will be
> used by the PowerPC kexec_file_load implementation to find free memory
> for the purgatory stack.

Split factoring locate hole function to another patch will be clearer.

> 
> Cc: Eric Biederman 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h | 10 ++
>  kernel/kexec_file.c   | 96 
> +--
>  2 files changed, 81 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..920e2cbe5bdd 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -210,6 +210,10 @@ extern asmlinkage long sys_kexec_load(unsigned long 
> entry,
>   struct kexec_segment __user *segments,
>   unsigned long flags);
>  extern int kernel_kexec(void);
> +int kexec_locate_mem_hole(struct kimage *image, unsigned long size,
> +   unsigned long align, unsigned long start,
> +   unsigned long end, bool top_down,
> +   unsigned long *addr);
>  extern int kexec_add_buffer(struct kimage *image, char *buffer,
>   unsigned long bufsz, unsigned long memsz,
>   unsigned long buf_align, unsigned long buf_min,
> @@ -315,6 +319,12 @@ int __weak arch_kexec_apply_relocations_add(const 
> Elf_Ehdr *ehdr,
>   Elf_Shdr *sechdrs, unsigned int relsec);
>  int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr 
> *sechdrs,
>   unsigned int relsec);
> +int __weak arch_walk_iomem(unsigned long desc, unsigned long start,
> +unsigned long end, bool top_down, void *data,
> +int (*func)(u64, u64, void *));
> +int __weak arch_walk_system_ram(unsigned long start, unsigned long end,
> + bool top_down, void *data,
> + int (*func)(u64, u64, void *));
>  void arch_kexec_protect_crashkres(void);
>  void arch_kexec_unprotect_crashkres(void);
>  
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b6eec7527e9f..300f71cb4f72 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -428,6 +428,68 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +int __weak arch_walk_iomem(unsigned long desc, unsigned long start,
> +unsigned long end, bool top_down, void *data,
> +int (*func)(u64, u64, void *))
> +{
> + return walk_iomem_res_desc(desc,
> +IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> +start, end, data, func);
> +}
> +
> +int __weak arch_walk_system_ram(unsigned long start, unsigned long end,
> + bool top_down, void *data,
> + int (*func)(u64, u64, void *))
> +{
> + return walk_system_ram_res(start, end, data, func);
> +}
> +
> +/**
> + * kexec_locate_mem_hole - find free memory to load segment or use in 
> purgatory
> + * @image:   kexec image being updated.
> + * @size:Memory size.
> + * @align:   Minimum alignment needed.
> + * @start:   Minimum starting address.
> + * @end: Maximum end address.
> + * @top_down Find the highest free memory region?
> + * @addr On success, will have start address of the memory region found.
> + *
> + * Return: 0 on success, negative erro on failure.
> + */
> +int kexec_locate_mem_hole(struct kimage *image, unsigned long size,
> +   unsigned long align, unsigned long start,
> +   unsigned long end, bool top_down, unsigned long *addr)
> +{
> + int ret;
> + struct kexec_buf buf;
> +
> + memset(&buf, 0, sizeof(struct kexec_buf));
> + buf.image = image;
> +
> + buf.memsz = size;
> + buf.buf_align = align;
> + buf.buf_min = start;
> + buf.buf_max = end;
> + buf.top_down = top_down;
> +
> + /* Walk the RAM ranges and allocate a suitable range for the buffer */
> + if (image->type == KEXEC_TYPE_CRASH)
> + ret = arch_walk_iomem(crashk_res.desc, crashk_res.start,
> +   crashk_res.end, top_down, &buf,
> +   locate_mem_hole_callback);
> + else
> + ret = arch_walk_system_ram(0, -1, top_down, &buf,
> +locate_mem_hole_callback);
> + if (ret != 1) {
> + /* A suitable memory range could not be

Re: [PATCH 2/8] kexec_file: Generalize kexec_add_buffer.

2016-06-13 Thread Dave Young
On 06/12/16 at 12:10am, Thiago Jung Bauermann wrote:
> Allow architectures to specify different memory walking functions for
> kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
> but PowerPC uses the memblock subsystem.

Can the crashk_res be inserted to iomem_resource so that only one
weak function for system ram is needed?

> 
> Also, factor kexec_locate_mem_hole out of kexec_add_buffer. It will be
> used by the PowerPC kexec_file_load implementation to find free memory
> for the purgatory stack.
> 
> Cc: Eric Biederman 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h | 10 ++
>  kernel/kexec_file.c   | 96 
> +--
>  2 files changed, 81 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..920e2cbe5bdd 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -210,6 +210,10 @@ extern asmlinkage long sys_kexec_load(unsigned long 
> entry,
>   struct kexec_segment __user *segments,
>   unsigned long flags);
>  extern int kernel_kexec(void);
> +int kexec_locate_mem_hole(struct kimage *image, unsigned long size,
> +   unsigned long align, unsigned long start,
> +   unsigned long end, bool top_down,
> +   unsigned long *addr);
>  extern int kexec_add_buffer(struct kimage *image, char *buffer,
>   unsigned long bufsz, unsigned long memsz,
>   unsigned long buf_align, unsigned long buf_min,
> @@ -315,6 +319,12 @@ int __weak arch_kexec_apply_relocations_add(const 
> Elf_Ehdr *ehdr,
>   Elf_Shdr *sechdrs, unsigned int relsec);
>  int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr 
> *sechdrs,
>   unsigned int relsec);
> +int __weak arch_walk_iomem(unsigned long desc, unsigned long start,
> +unsigned long end, bool top_down, void *data,
> +int (*func)(u64, u64, void *));
> +int __weak arch_walk_system_ram(unsigned long start, unsigned long end,
> + bool top_down, void *data,
> + int (*func)(u64, u64, void *));
>  void arch_kexec_protect_crashkres(void);
>  void arch_kexec_unprotect_crashkres(void);
>  
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b6eec7527e9f..300f71cb4f72 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -428,6 +428,68 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +int __weak arch_walk_iomem(unsigned long desc, unsigned long start,
> +unsigned long end, bool top_down, void *data,
> +int (*func)(u64, u64, void *))
> +{
> + return walk_iomem_res_desc(desc,
> +IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> +start, end, data, func);
> +}
> +
> +int __weak arch_walk_system_ram(unsigned long start, unsigned long end,
> + bool top_down, void *data,
> + int (*func)(u64, u64, void *))
> +{
> + return walk_system_ram_res(start, end, data, func);
> +}
> +
> +/**
> + * kexec_locate_mem_hole - find free memory to load segment or use in 
> purgatory
> + * @image:   kexec image being updated.
> + * @size:Memory size.
> + * @align:   Minimum alignment needed.
> + * @start:   Minimum starting address.
> + * @end: Maximum end address.
> + * @top_down Find the highest free memory region?
> + * @addr On success, will have start address of the memory region found.
> + *
> + * Return: 0 on success, negative erro on failure.
> + */
> +int kexec_locate_mem_hole(struct kimage *image, unsigned long size,
> +   unsigned long align, unsigned long start,
> +   unsigned long end, bool top_down, unsigned long *addr)
> +{
> + int ret;
> + struct kexec_buf buf;
> +
> + memset(&buf, 0, sizeof(struct kexec_buf));
> + buf.image = image;
> +
> + buf.memsz = size;
> + buf.buf_align = align;
> + buf.buf_min = start;
> + buf.buf_max = end;
> + buf.top_down = top_down;
> +
> + /* Walk the RAM ranges and allocate a suitable range for the buffer */
> + if (image->type == KEXEC_TYPE_CRASH)
> + ret = arch_walk_iomem(crashk_res.desc, crashk_res.start,
> +   crashk_res.end, top_down, &buf,
> +   locate_mem_hole_callback);
> + else
> + ret = arch_walk_system_ram(0, -1, top_down, &buf,
> +locate_mem_hole_callback);
> + if (ret != 1) {
> + /* A

Re: [PATCH 2/8] kexec_file: Generalize kexec_add_buffer.

2016-06-13 Thread Dave Young
On 06/13/16 at 04:08pm, Thiago Jung Bauermann wrote:
> Hello Dave,
> 
> Thanks for the quick review and for your comments.
> 
> I'll separate the change to add arch_walk_system_ram and the change to add 
> kexec_locate_mem_hole into different patches, and add error handling for 
> KEXEC_ON_CRASH.
> 
> Am Montag, 13 Juni 2016, 15:29:39 schrieb Dave Young:
> > On 06/12/16 at 12:10am, Thiago Jung Bauermann wrote:
> > > Allow architectures to specify different memory walking functions for
> > > kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
> > > but PowerPC uses the memblock subsystem.
> > 
> > Can the crashk_res be inserted to iomem_resource so that only one
> > weak function for system ram is needed?
> 
> Sorry, it's not clear to me what you mean by inserting crashk_res into 

Hmm, I means exporting crashkernel mem to /proc/iomem like other arches
It is just oneline:
insert_resource(&iomem_resource, &crashk_res)

But your proposal below is also fine.

> iomem_resource, but I can add a bool for_crashkernel to arch_walk_system_ram 
> so that it can decide which kind of memory to traverse, so the default 
> implementation of kexec_file.c would be:
> 
> int __weak arch_walk_system_ram(bool for_crashkernel, unsigned long start,
>   unsigned long end, bool top_down,
>   void *data,
>   int (*func)(u64, u64, void *))

arch_walk_mem sounds better?

> {
>   int ret;
> 
>   if (for_crashkernel)
>   ret = walk_iomem_res_desc(crashk_res.desc,
> IORESOURCE_SYSTEM_RAM |
> IORESOURCE_BUSY,
> start, end, data, func);
>   else
>   ret = walk_system_ram_res(start, end, data, func);
> 
>   if (ret != 1) {
>   /* A suitable memory range could not be found for buffer */
>   return -EADDRNOTAVAIL;
>   }
> }
> 
> and kexec_add_buffer / kexec_locate_mem_hole would call it with:
> 
>   if (image->type == KEXEC_TYPE_CRASH)
>   ret = arch_walk_system_ram(true, crashk_res.start,
>  crashk_res.end, top_down, &buf,
>  locate_mem_hole_callback);
>   else
>   ret = arch_walk_system_ram(false, 0, -1, top_down, &buf,
>  locate_mem_hole_callback);
> 
> What do you think?

Sounds good, but for_crashkernel can be image_type instead. and
image->type can be passed to the arch_walk_mem function directly.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v2 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-15 Thread Dave Young
Hi, Thiago

On 06/14/16 at 11:59am, Thiago Jung Bauermann wrote:
> Allow architectures to specify different memory walking functions for
> kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
> but PowerPC uses the memblock subsystem.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h |  3 +++
>  kernel/kexec_file.c   | 44 
>  2 files changed, 39 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..7321043aa65a 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -315,6 +315,9 @@ int __weak arch_kexec_apply_relocations_add(const 
> Elf_Ehdr *ehdr,
>   Elf_Shdr *sechdrs, unsigned int relsec);
>  int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr 
> *sechdrs,
>   unsigned int relsec);
> +int __weak arch_kexec_walk_mem(unsigned int image_type, unsigned long start,
> +unsigned long end, bool top_down, void *data,
> +int (*func)(u64, u64, void *));
>  void arch_kexec_protect_crashkres(void);
>  void arch_kexec_unprotect_crashkres(void);
>  
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b6eec7527e9f..a3ca7cfe070e 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -428,6 +428,31 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +/**
> + * arch_kexec_walk_mem - call func(data) on free memory regions
> + * @image_type:  kimage.type
> + * @start:   Don't visit memory regions below this address.
> + * @end: Don't visit memory regions above this address.
> + * @top_down:Start from the highest address?
> + * @data:Argument to pass to @func.
> + * @func:Function to call for each memory region.
> + *
> + * Return: The memory walk will stop when func returns a non-zero value
> + * and that value will be returned. If all free regions are visited without
> + * func returning non-zero, then zero will be returned.
> + */
> +int __weak arch_kexec_walk_mem(unsigned int image_type, unsigned long start,
> +unsigned long end, bool top_down, void *data,
> +int (*func)(u64, u64, void *))
> +{
> + if (image_type == KEXEC_TYPE_CRASH)
> + return walk_iomem_res_desc(crashk_res.desc,
> +IORESOURCE_SYSTEM_RAM | 
> IORESOURCE_BUSY,
> +start, end, data, func);
> + else
> + return walk_system_ram_res(start, end, data, func);
> +}
> +
>  /*
>   * Helper function for placing a buffer in a kexec segment. This assumes
>   * that kexec_mutex is held.
> @@ -441,6 +466,7 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>   struct kexec_segment *ksegment;
>   struct kexec_buf buf, *kbuf;
>   int ret;
> + unsigned long start, end;
>  
>   /* Currently adding segment this way is allowed only in file mode */
>   if (!image->file_mode)
> @@ -472,14 +498,16 @@ int kexec_add_buffer(struct kimage *image, char 
> *buffer, unsigned long bufsz,
>   kbuf->top_down = top_down;
>  
>   /* Walk the RAM ranges and allocate a suitable range for the buffer */
> - if (image->type == KEXEC_TYPE_CRASH)
> - ret = walk_iomem_res_desc(crashk_res.desc,
> - IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> - crashk_res.start, crashk_res.end, kbuf,
> - locate_mem_hole_callback);
> - else
> - ret = walk_system_ram_res(0, -1, kbuf,
> -   locate_mem_hole_callback);
> + if (image->type == KEXEC_TYPE_CRASH) {
> + start = crashk_res.start;
> + end = crashk_res.end;
> + } else {
> + start = 0;
> + end = ULONG_MAX;
> + }

For crashk_res, start and end is global, the non-crash values are
hard coded, thus it is not necessary to pass these two in arguments.

Moving above to arch_kexec_walk_mem will make it cleaner.

> +
> + ret = arch_kexec_walk_mem(image->type, start, end, top_down, kbuf,
> +locate_mem_hole_callback);
>   if (ret != 1) {
>   /* A suitable memory range could not be found for buffer */
&

Re: [PATCH v2 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-15 Thread Dave Young
On 06/15/16 at 01:21pm, Thiago Jung Bauermann wrote:
> Hello Dave,
> 
> Am Mittwoch, 15 Juni 2016, 15:33:02 schrieb Dave Young:
> > > @@ -472,14 +498,16 @@ int kexec_add_buffer(struct kimage *image, char
> > > *buffer, unsigned long bufsz,> 
> > >   kbuf->top_down = top_down;
> > >   
> > >   /* Walk the RAM ranges and allocate a suitable range for the buffer */
> > > 
> > > - if (image->type == KEXEC_TYPE_CRASH)
> > > - ret = walk_iomem_res_desc(crashk_res.desc,
> > > - IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> > > - crashk_res.start, crashk_res.end, kbuf,
> > > - locate_mem_hole_callback);
> > > - else
> > > - ret = walk_system_ram_res(0, -1, kbuf,
> > > -   locate_mem_hole_callback);
> > > + if (image->type == KEXEC_TYPE_CRASH) {
> > > + start = crashk_res.start;
> > > + end = crashk_res.end;
> > > + } else {
> > > + start = 0;
> > > + end = ULONG_MAX;
> > > + }
> > 
> > For crashk_res, start and end is global, the non-crash values are
> > hard coded, thus it is not necessary to pass these two in arguments.
> > 
> > Moving above to arch_kexec_walk_mem will make it cleaner.
> 
> That's true. What about this version?
> 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> kexec_file: Generalize kexec_add_buffer.
> 
> Allow architectures to specify different memory walking functions for
> kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
> but PowerPC uses the memblock subsystem.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..9b02f907b40a 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -315,6 +315,8 @@ int __weak arch_kexec_apply_relocations_add(const 
> Elf_Ehdr *ehdr,
>   Elf_Shdr *sechdrs, unsigned int relsec);
>  int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr 
> *sechdrs,
>   unsigned int relsec);
> +int __weak arch_kexec_walk_mem(unsigned int image_type, bool top_down,
> +void *data, int (*func)(u64, u64, void *));
>  void arch_kexec_protect_crashkres(void);
>  void arch_kexec_unprotect_crashkres(void);
>  
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b6eec7527e9f..2d066c458222 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -428,6 +428,29 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +/**
> + * arch_kexec_walk_mem - call func(data) on free memory regions
> + * @image_type:  kimage.type
> + * @top_down:Start from the highest address?
> + * @data:Argument to pass to @func.
> + * @func:Function to call for each memory region.
> + *
> + * Return: The memory walk will stop when func returns a non-zero value
> + * and that value will be returned. If all free regions are visited without
> + * func returning non-zero, then zero will be returned.
> + */
> +int __weak arch_kexec_walk_mem(unsigned int image_type, bool top_down,
> +void *data, int (*func)(u64, u64, void *))
> +{

top_down is also not used?

Rethink about the argument name type should be better than image_type,
it is obvious in the comments and the code and simpler.

> + if (image_type == KEXEC_TYPE_CRASH)
> + return walk_iomem_res_desc(crashk_res.desc,
> +IORESOURCE_SYSTEM_RAM | 
> IORESOURCE_BUSY,
> +crashk_res.start, crashk_res.end,
> +data, func);
> + else
> + return walk_system_ram_res(0, ULONG_MAX, data, func);
> +}
> +
>  /*
>   * Helper function for placing a buffer in a kexec segment. This assumes
>   * that kexec_mutex is held.
> @@ -472,14 +495,8 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>   kbuf->top_down = top_down;
>  
>   /* Walk the RAM ranges and allocate a suitable range for the buffer */
> - if (image->type == KEXEC_TYPE_CRASH)
> - ret = walk_iomem_res_desc(crashk_res.de

Re: [PATCH v2 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-17 Thread Dave Young
On 06/16/16 at 05:39pm, Thiago Jung Bauermann wrote:
> Am Donnerstag, 16 Juni 2016, 09:58:53 schrieb Dave Young:
> > On 06/15/16 at 01:21pm, Thiago Jung Bauermann wrote:
> > > +/**
> > > + * arch_kexec_walk_mem - call func(data) on free memory regions
> > > + * @image_type:  kimage.type
> > > + * @top_down:Start from the highest address?
> > > + * @data:Argument to pass to @func.
> > > + * @func:Function to call for each memory region.
> > > + *
> > > + * Return: The memory walk will stop when func returns a non-zero value
> > > + * and that value will be returned. If all free regions are visited
> > > without + * func returning non-zero, then zero will be returned.
> > > + */
> > > +int __weak arch_kexec_walk_mem(unsigned int image_type, bool top_down,
> > > +void *data, int (*func)(u64, u64, void *))
> > > +{
> > 
> > top_down is also not used?
> 
> It's unused in the default implementation, but the powerpc implementation in 
> patch 8 uses it:

Well, arch_kexec_walk_mem use kbuf as "data", you can even drop
"image_type" since kbuf has all you want kbuf->image->type, and
kbuf->top_down 

int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
   int (*func)(u64, u64, void *))
> 
> int arch_kexec_walk_mem(unsigned int type, bool top_down, void *data,
>   int (*func)(u64, u64, void *))
> {
>   int ret = 0;
>   u64 i;
>   phys_addr_t mstart, mend;
> 
>   if (top_down) {
>   for_each_free_mem_range_reverse(i, NUMA_NO_NODE, 0,
>   &mstart, &mend, NULL) {
>   ret = func(mstart, mend, data);
>   if (ret)
>   break;
>   }
>   } else {
>   for_each_free_mem_range(i, NUMA_NO_NODE, 0, &mstart, &mend,
>   NULL) {
>   ret = func(mstart, mend, data);
>   if (ret)
>   break;
>   }
>   }
> 
>   return ret;
> }
> 
> > Rethink about the argument name type should be better than image_type,
> > it is obvious in the comments and the code and simpler.
> 
> Ok, renamed in the patch below.
> 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> kexec_file: Generalize kexec_add_buffer.
> 
> Allow architectures to specify different memory walking functions for
> kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
> but PowerPC uses the memblock subsystem.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..42b31f2e1d88 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -315,6 +315,8 @@ int __weak arch_kexec_apply_relocations_add(const 
> Elf_Ehdr *ehdr,
>   Elf_Shdr *sechdrs, unsigned int relsec);
>  int __weak arch_kexec_apply_relocations(const Elf_Ehdr *ehdr, Elf_Shdr 
> *sechdrs,
>   unsigned int relsec);
> +int __weak arch_kexec_walk_mem(unsigned int type, bool top_down, void *data,
> +int (*func)(u64, u64, void *));
>  void arch_kexec_protect_crashkres(void);
>  void arch_kexec_unprotect_crashkres(void);
>  
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b6eec7527e9f..989647a324f2 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -428,6 +428,29 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +/**
> + * arch_kexec_walk_mem - call func(data) on free memory regions
> + * @type:kimage.type
> + * @top_down:Start from the highest address?
> + * @data:Argument to pass to @func.
> + * @func:Function to call for each memory region.
> + *
> + * Return: The memory walk will stop when func returns a non-zero value
> + * and that value will be returned. If all free regions are visited without
> + * func returning non-zero, then zero will be returned.
> + */
> +int __weak arch_kexec_walk_mem(unsigned int type, bool top_down, void *data,
> +int (*func)(u64, u64, void *))
> +{
> + if (type == KEXEC_TYPE_CRASH)
> + retur

Re: [PATCH v2 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-19 Thread Dave Young
On 06/17/16 at 05:51pm, Thiago Jung Bauermann wrote:
> Am Freitag, 17 Juni 2016, 15:35:23 schrieb Dave Young:
> > On 06/16/16 at 05:39pm, Thiago Jung Bauermann wrote:
> > > Am Donnerstag, 16 Juni 2016, 09:58:53 schrieb Dave Young:
> > > > On 06/15/16 at 01:21pm, Thiago Jung Bauermann wrote:
> > > > > +int __weak arch_kexec_walk_mem(unsigned int image_type, bool
> > > > > top_down,
> > > > > +void *data, int (*func)(u64, u64, void 
> > > > > *))
> > > > > +{
> > > > 
> > > > top_down is also not used?
> > > 
> > > It's unused in the default implementation, but the powerpc implementation
> > > in patch 8 uses it:
> > Well, arch_kexec_walk_mem use kbuf as "data", you can even drop
> > "image_type" since kbuf has all you want kbuf->image->type, and
> > kbuf->top_down
> > 
> > int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> >int (*func)(u64, u64, void *))
> 
> Sounds good to me, but I had to move struct kexec_buf from
> kernel/kexec_internal.h to include/linux/kexec.h.
> 
> Here's the updated patch. What do you think?
> 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> kexec_file: Generalize kexec_add_buffer.
> 
> Allow architectures to specify different memory walking functions for
> kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
> but PowerPC uses the memblock subsystem.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..d8df01107ae2 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -201,6 +201,20 @@ struct kimage {
>  #endif
>  };
>  
> +/*
> + * Keeps track of buffer parameters as provided by caller for requesting
> + * memory placement of buffer.
> + */
> +struct kexec_buf {
> + struct kimage *image;
> + unsigned long mem;
> + unsigned long memsz;
> + unsigned long buf_align;
> + unsigned long buf_min;
> + unsigned long buf_max;
> + bool top_down;  /* allocate from top of memory hole */
> +};
> +

kexec_buf should go within #ifdef for kexec file like struct purgatory_info

Other than that it looks good.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH 0/6] kexec_file: Add buffer hand-over for the next kernel

2016-06-21 Thread Dave Young
On 06/20/16 at 10:44pm, Thiago Jung Bauermann wrote:
> Hello,
> 
> This patch series implements a mechanism which allows the kernel to pass on
> a buffer to the kernel that will be kexec'd. This buffer is passed as a
> segment which is added to the kimage when it is being prepared by
> kexec_file_load.
> 
> How the second kernel is informed of this buffer is architecture-specific.
> On PowerPC, this is done via the device tree, by checking the properties
> /chosen/linux,kexec-handover-buffer-start and
> /chosen/linux,kexec-handover-buffer-end, which is analogous to how the
> kernel finds the initrd.
> 
> This feature was implemented because the Integrity Measurement Architecture
> subsystem needs to preserve its measurement list accross the kexec reboot.
> This is so that IMA can implement trusted boot support on the OpenPower
> platform, because on such systems an intermediary Linux instance running as
> part of the firmware is used to boot the target operating system via kexec.
> Using this mechanism, IMA on this intermediary instance can hand over to the
> target OS the measurements of the components that were used to boot it.

We have CONFIG_KEXEC_VERIFY_SIG, why not verifying the kernel to be
loaded instead?  I feel IMA should rebuild its measurement instead of
passing it to another kernel. Kexec reboot is also a reboot. If we have
to preserve something get from firmware we can do it, but other than
that I think it sounds not a good idea.

> 
> Because there could be additional measurement events between the
> kexec_file_load call and the actual reboot, IMA needs a way to update the
> buffer with those additional events before rebooting. One can minimize
> the interval between the kexec_file_load and the reboot syscalls, but as
> small as it can be, there is always the possibility that the measurement
> list will be out of date at the time of reboot.
> 
> To address this issue, this patch series also introduces kexec_update_segment,
> which allows a reboot notifier to change the contents of the image segment
> during the reboot process.
> 
> There's one patch which makes kimage_load_normal_segment and
> kexec_update_segment share code. It's not much code that they can share
> though, so I'm not sure if it's worth including this patch.
> 
> The last patch is not intended to be merged, it just demonstrates how this
> feature can be used.
> 
> This series applies on top of v2 of the "kexec_file_load implementation
> for PowerPC" patch series at:

The kexec_file_load patches should be addressed first, no?

> 
> http://lists.infradead.org/pipermail/kexec/2016-June/016078.html
> 
> Thiago Jung Bauermann (6):
>   kexec_file: Add buffer hand-over support for the next kernel
>   powerpc: kexec_file: Add buffer hand-over support for the next kernel
>   kexec_file: Allow skipping checksum calculation for some segments.
>   kexec_file: Add mechanism to update kexec segments.
>   kexec: Share logic to copy segment page contents.
>   IMA: Demonstration code for kexec buffer passing.
> 
>  arch/powerpc/include/asm/kexec.h   |   9 ++
>  arch/powerpc/kernel/kexec_elf_64.c |  50 +++-
>  arch/powerpc/kernel/machine_kexec_64.c |  64 ++
>  arch/x86/kernel/crash.c|   4 +-
>  arch/x86/kernel/kexec-bzimage64.c  |   6 +-
>  include/linux/ima.h|  11 ++
>  include/linux/kexec.h  |  47 +++-
>  kernel/kexec_core.c| 205 
> ++---
>  kernel/kexec_file.c| 102 ++--
>  security/integrity/ima/ima.h   |   5 +
>  security/integrity/ima/ima_init.c  |  26 +
>  security/integrity/ima/ima_template.c  |  79 +
>  12 files changed, 547 insertions(+), 61 deletions(-)
> 
> -- 
> 1.9.1
> 

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-06-22 Thread Dave Young
On 06/21/16 at 04:48pm, Thiago Jung Bauermann wrote:
> kexec_locate_mem_hole will be used by the PowerPC kexec_file_load
> implementation to find free memory for the purgatory stack.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h |  4 
>  kernel/kexec_file.c   | 66 
> ++-
>  2 files changed, 53 insertions(+), 17 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 3d91bcfc180d..4ca6f5f95d66 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -227,6 +227,10 @@ extern asmlinkage long sys_kexec_load(unsigned long 
> entry,
>   struct kexec_segment __user *segments,
>   unsigned long flags);
>  extern int kernel_kexec(void);
> +int kexec_locate_mem_hole(struct kimage *image, unsigned long size,
> +   unsigned long align, unsigned long min_addr,
> +   unsigned long max_addr, bool top_down,
> +   unsigned long *addr);
>  extern int kexec_add_buffer(struct kimage *image, char *buffer,
>   unsigned long bufsz, unsigned long memsz,
>   unsigned long buf_align, unsigned long buf_min,
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b1f1f6402518..85a515511925 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -449,6 +449,46 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
>   return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
>  }
>  
> +/**
> + * kexec_locate_mem_hole - find free memory to load segment or use in 
> purgatory
> + * @image:   kexec image being updated.
> + * @size:Memory size.
> + * @align:   Minimum alignment needed.
> + * @min_addr:Minimum starting address.
> + * @max_addr:Maximum end address.
> + * @top_down Find the highest free memory region?
> + * @addr On success, will have start address of the memory region found.
> + *
> + * Return: 0 on success, negative errno on error.
> + */
> +int kexec_locate_mem_hole(struct kimage *image, unsigned long size,
> +   unsigned long align, unsigned long min_addr,
> +   unsigned long max_addr, bool top_down,
> +   unsigned long *addr)
> +{
> + int ret;
> + struct kexec_buf buf;
> +
> + memset(&buf, 0, sizeof(struct kexec_buf));
> + buf.image = image;
> +
> + buf.memsz = size;
> + buf.buf_align = align;
> + buf.buf_min = min_addr;
> + buf.buf_max = max_addr;
> + buf.top_down = top_down;

Since patch 2/9 moved kexec_buf from internal header file to kexec.h it
will be natural to passing a kexec_buf pointer intead of passing all
these arguments in kexec_locate_mem_hole.

kbuf.mem can be used for addr.

> +
> + ret = arch_kexec_walk_mem(&buf, locate_mem_hole_callback);
> + if (ret != 1) {
> + /* A suitable memory range could not be found for buffer */
> + return -EADDRNOTAVAIL;
> + }
> +
> + *addr = buf.mem;
> +
> + return 0;
> +}
> +
>  /*
>   * Helper function for placing a buffer in a kexec segment. This assumes
>   * that kexec_mutex is held.
> @@ -460,8 +500,8 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>  {
>  
>   struct kexec_segment *ksegment;
> - struct kexec_buf buf, *kbuf;
>   int ret;
> + unsigned long addr, align, size;
>  
>   /* Currently adding segment this way is allowed only in file mode */
>   if (!image->file_mode)
> @@ -482,29 +522,21 @@ int kexec_add_buffer(struct kimage *image, char 
> *buffer, unsigned long bufsz,
>   return -EINVAL;
>   }
>  
> - memset(&buf, 0, sizeof(struct kexec_buf));
> - kbuf = &buf;
> - kbuf->image = image;
> -
> - kbuf->memsz = ALIGN(memsz, PAGE_SIZE);
> - kbuf->buf_align = max(buf_align, PAGE_SIZE);
> - kbuf->buf_min = buf_min;
> - kbuf->buf_max = buf_max;
> - kbuf->top_down = top_down;
> + size = ALIGN(memsz, PAGE_SIZE);
> + align = max(buf_align, PAGE_SIZE);
>  
>   /* Walk the RAM ranges and allocate a suitable range for the buffer */
> - ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> - if (ret != 1) {
> - /* A suitable memory range could not be found for buffer */
> - return -EADDRNOTAVAIL;
> - }
> + ret = kexec_locate_

Re: [PATCH v3 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-22 Thread Dave Young

The patch looks good, but could the subject be more specific?

For example just like the first sentence of the patch descriotion:
Allow architectures to specify their own memory walking function

On 06/21/16 at 04:48pm, Thiago Jung Bauermann wrote:
> Allow architectures to specify different memory walking functions for
> kexec_add_buffer. Intel uses iomem to track reserved memory ranges,
> but PowerPC uses the memblock subsystem.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h   | 19 ++-
>  kernel/kexec_file.c | 30 ++
>  kernel/kexec_internal.h | 14 --
>  3 files changed, 40 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..3d91bcfc180d 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -146,7 +146,24 @@ struct kexec_file_ops {
>   kexec_verify_sig_t *verify_sig;
>  #endif
>  };
> -#endif
> +
> +/*
> + * Keeps track of buffer parameters as provided by caller for requesting
> + * memory placement of buffer.
> + */
> +struct kexec_buf {
> + struct kimage *image;
> + unsigned long mem;
> + unsigned long memsz;
> + unsigned long buf_align;
> + unsigned long buf_min;
> + unsigned long buf_max;
> + bool top_down;  /* allocate from top of memory hole */
> +};
> +
> +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> +int (*func)(u64, u64, void *));
> +#endif /* CONFIG_KEXEC_FILE */
>  
>  struct kimage {
>   kimage_entry_t head;
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b6eec7527e9f..b1f1f6402518 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -428,6 +428,27 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +/**
> + * arch_kexec_walk_mem - call func(data) on free memory regions
> + * @kbuf:Context info for the search. Also passed to @func.
> + * @func:Function to call for each memory region.
> + *
> + * Return: The memory walk will stop when func returns a non-zero value
> + * and that value will be returned. If all free regions are visited without
> + * func returning non-zero, then zero will be returned.
> + */
> +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> +int (*func)(u64, u64, void *))
> +{
> + if (kbuf->image->type == KEXEC_TYPE_CRASH)
> + return walk_iomem_res_desc(crashk_res.desc,
> +IORESOURCE_SYSTEM_RAM | 
> IORESOURCE_BUSY,
> +crashk_res.start, crashk_res.end,
> +kbuf, func);
> + else
> + return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
> +}
> +
>  /*
>   * Helper function for placing a buffer in a kexec segment. This assumes
>   * that kexec_mutex is held.
> @@ -472,14 +493,7 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>   kbuf->top_down = top_down;
>  
>   /* Walk the RAM ranges and allocate a suitable range for the buffer */
> - if (image->type == KEXEC_TYPE_CRASH)
> - ret = walk_iomem_res_desc(crashk_res.desc,
> - IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> - crashk_res.start, crashk_res.end, kbuf,
> - locate_mem_hole_callback);
> - else
> - ret = walk_system_ram_res(0, -1, kbuf,
> -   locate_mem_hole_callback);
> + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
>   if (ret != 1) {
>   /* A suitable memory range could not be found for buffer */
>   return -EADDRNOTAVAIL;
> diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
> index eefd5bf960c2..4cef7e4706b0 100644
> --- a/kernel/kexec_internal.h
> +++ b/kernel/kexec_internal.h
> @@ -20,20 +20,6 @@ struct kexec_sha_region {
>   unsigned long len;
>  };
>  
> -/*
> - * Keeps track of buffer parameters as provided by caller for requesting
> - * memory placement of buffer.
> - */
> -struct kexec_buf {
> - struct kimage *image;
> - unsigned long mem;
> - unsigned long memsz;
> - unsigned long buf_align;
> - unsigned long buf_min;
> - unsigned long buf_max;
> - bool top_down;  /* allocate from top of memory hole */
> -};
> -
>  void kimage_file_post_load_cleanup(struct kimage *image);
>  #else /* CONFIG_KEXEC_FILE */
>  static inline void kimage_file_post_load_cleanup(struct kimage *image) { }

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-22 Thread Dave Young
On 06/22/16 at 08:30pm, Thiago Jung Bauermann wrote:
> Am Mittwoch, 22 Juni 2016, 18:20:47 schrieb Dave Young:
> > The patch looks good, but could the subject be more specific?
> > 
> > For example just like the first sentence of the patch descriotion:
> > Allow architectures to specify their own memory walking function
> 
> Ok, What about this? I also changed the description to refer to x86 arch
> instead of Intel arch.

It looks good to me.

Thanks
Dave

> 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH 2/9] kexec_file: Allow arch-specific memory walking for
>  kexec_add_buffer
> 
> Allow architectures to specify a different memory walking function for
> kexec_add_buffer. x86 uses iomem to track reserved memory ranges, but
> PowerPC uses the memblock subsystem.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h   | 19 ++-
>  kernel/kexec_file.c | 30 ++
>  kernel/kexec_internal.h | 14 --
>  3 files changed, 40 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..3d91bcfc180d 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -146,7 +146,24 @@ struct kexec_file_ops {
>   kexec_verify_sig_t *verify_sig;
>  #endif
>  };
> -#endif
> +
> +/*
> + * Keeps track of buffer parameters as provided by caller for requesting
> + * memory placement of buffer.
> + */
> +struct kexec_buf {
> + struct kimage *image;
> + unsigned long mem;
> + unsigned long memsz;
> + unsigned long buf_align;
> + unsigned long buf_min;
> + unsigned long buf_max;
> + bool top_down;  /* allocate from top of memory hole */
> +};
> +
> +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> +int (*func)(u64, u64, void *));
> +#endif /* CONFIG_KEXEC_FILE */
>  
>  struct kimage {
>   kimage_entry_t head;
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b6eec7527e9f..b1f1f6402518 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -428,6 +428,27 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +/**
> + * arch_kexec_walk_mem - call func(data) on free memory regions
> + * @kbuf:Context info for the search. Also passed to @func.
> + * @func:Function to call for each memory region.
> + *
> + * Return: The memory walk will stop when func returns a non-zero value
> + * and that value will be returned. If all free regions are visited without
> + * func returning non-zero, then zero will be returned.
> + */
> +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> +int (*func)(u64, u64, void *))
> +{
> + if (kbuf->image->type == KEXEC_TYPE_CRASH)
> + return walk_iomem_res_desc(crashk_res.desc,
> +IORESOURCE_SYSTEM_RAM | 
> IORESOURCE_BUSY,
> +crashk_res.start, crashk_res.end,
> +kbuf, func);
> + else
> + return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
> +}
> +
>  /*
>   * Helper function for placing a buffer in a kexec segment. This assumes
>   * that kexec_mutex is held.
> @@ -472,14 +493,7 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>   kbuf->top_down = top_down;
>  
>   /* Walk the RAM ranges and allocate a suitable range for the buffer */
> - if (image->type == KEXEC_TYPE_CRASH)
> - ret = walk_iomem_res_desc(crashk_res.desc,
> - IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> - crashk_res.start, crashk_res.end, kbuf,
> - locate_mem_hole_callback);
> - else
> - ret = walk_system_ram_res(0, -1, kbuf,
> -   locate_mem_hole_callback);
> + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
>   if (ret != 1) {
>   /* A suitable memory range could not be found for buffer */
>   return -EADDRNOTAVAIL;
> diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
> index eefd5bf960c2..4cef7e4706b0 100644
> --- a/kernel/kexec_internal.h
> +++ b/kernel/kexec_internal.h
> @@ -20,20 +20,6 @@ struct kexec_sha_region {
>   unsigned long len;
> 

Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-06-22 Thread Dave Young
On 06/22/16 at 08:34pm, Thiago Jung Bauermann wrote:
> Am Mittwoch, 22 Juni 2016, 18:18:01 schrieb Dave Young:
> > On 06/21/16 at 04:48pm, Thiago Jung Bauermann wrote:
> > > +/**
> > > + * kexec_locate_mem_hole - find free memory to load segment or use in
> > > purgatory + * @image: kexec image being updated.
> > > + * @size:Memory size.
> > > + * @align:   Minimum alignment needed.
> > > + * @min_addr:Minimum starting address.
> > > + * @max_addr:Maximum end address.
> > > + * @top_down Find the highest free memory region?
> > > + * @addr On success, will have start address of the memory region
> > > found.
> > > + *
> > > + * Return: 0 on success, negative errno on error.
> > > + */
> > > +int kexec_locate_mem_hole(struct kimage *image, unsigned long size,
> > > +   unsigned long align, unsigned long min_addr,
> > > +   unsigned long max_addr, bool top_down,
> > > +   unsigned long *addr)
> > > +{
> > > + int ret;
> > > + struct kexec_buf buf;
> > > +
> > > + memset(&buf, 0, sizeof(struct kexec_buf));
> > > + buf.image = image;
> > > +
> > > + buf.memsz = size;
> > > + buf.buf_align = align;
> > > + buf.buf_min = min_addr;
> > > + buf.buf_max = max_addr;
> > > + buf.top_down = top_down;
> > 
> > Since patch 2/9 moved kexec_buf from internal header file to kexec.h it
> > will be natural to passing a kexec_buf pointer intead of passing all
> > these arguments in kexec_locate_mem_hole.
> > 
> > kbuf.mem can be used for addr.
> 
> Ok. What about this version?
> -- 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH 3/9] kexec_file: Factor out kexec_locate_mem_hole from
>  kexec_add_buffer.
> 
> kexec_locate_mem_hole will be used by the PowerPC kexec_file_load
> implementation to find free memory for the purgatory stack.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h | 12 +---
>  kernel/kexec_file.c   | 25 -
>  2 files changed, 29 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 3d91bcfc180d..e8b099da47f5 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -147,9 +147,14 @@ struct kexec_file_ops {
>  #endif
>  };
>  
> -/*
> - * Keeps track of buffer parameters as provided by caller for requesting
> - * memory placement of buffer.
> +/**
> + * struct kexec_buf - parameters for finding a place for a buffer in memory
> + * @image:   kexec image in which memory to search.
> + * @size:Memory size for the buffer.
> + * @align:   Minimum alignment needed.
> + * @min_addr:Minimum starting address.
> + * @max_addr:Maximum end address.
> + * @top_down:Find the highest free memory region?

Above parameter comments should go to previous patch.
Other than that it looks good.

Thanks
Dave
>   */
>  struct kexec_buf {
>   struct kimage *image;
> @@ -163,6 +168,7 @@ struct kexec_buf {
>  
>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
>  int (*func)(u64, u64, void *));
> +int kexec_locate_mem_hole(struct kexec_buf *kbuf);
>  #endif /* CONFIG_KEXEC_FILE */
>  
>  struct kimage {
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b1f1f6402518..445d66add8ca 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -449,6 +449,23 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
>   return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
>  }
>  
> +/**
> + * kexec_locate_mem_hole - find free memory to load segment or use in 
> purgatory
> + * @kbuf:Parameters for the memory search.
> + *
> + * On success, kbuf->mem will have the start address of the memory region 
> found.
> + *
> + * Return: 0 on success, negative errno on error.
> + */
> +int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> +{
> + int ret;
> +
> + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> +
> + return ret == 1 ? 0 : -EADDRNOTAVAIL;
> +}
> +
>  /*
>   * Helper function for placing a buffer in a kexec segment. This assumes
>   * that kexec_mutex is held.
> @@ -493,11 +510,9 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>   kbuf->top_down = top_down;
>  

Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-06-22 Thread Dave Young


- Original Message -
From: "Dave Young" 
To: "Thiago Jung Bauermann" 
Cc: linuxppc-dev@lists.ozlabs.org, ke...@lists.infradead.org, 
linux-ker...@vger.kernel.org, "Eric Biederman" 
Sent: Thursday, June 23, 2016 10:30:52 AM
Subject: Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from 
kexec_add_buffer.

On 06/22/16 at 08:34pm, Thiago Jung Bauermann wrote:
> Am Mittwoch, 22 Juni 2016, 18:18:01 schrieb Dave Young:
> > On 06/21/16 at 04:48pm, Thiago Jung Bauermann wrote:
> > > +/**
> > > + * kexec_locate_mem_hole - find free memory to load segment or use in
> > > purgatory + * @image: kexec image being updated.
> > > + * @size:Memory size.
> > > + * @align:   Minimum alignment needed.
> > > + * @min_addr:Minimum starting address.
> > > + * @max_addr:Maximum end address.
> > > + * @top_down Find the highest free memory region?
> > > + * @addr On success, will have start address of the memory region
> > > found.
> > > + *
> > > + * Return: 0 on success, negative errno on error.
> > > + */
> > > +int kexec_locate_mem_hole(struct kimage *image, unsigned long size,
> > > +   unsigned long align, unsigned long min_addr,
> > > +   unsigned long max_addr, bool top_down,
> > > +   unsigned long *addr)
> > > +{
> > > + int ret;
> > > + struct kexec_buf buf;
> > > +
> > > + memset(&buf, 0, sizeof(struct kexec_buf));
> > > + buf.image = image;
> > > +
> > > + buf.memsz = size;
> > > + buf.buf_align = align;
> > > + buf.buf_min = min_addr;
> > > + buf.buf_max = max_addr;
> > > + buf.top_down = top_down;
> > 
> > Since patch 2/9 moved kexec_buf from internal header file to kexec.h it
> > will be natural to passing a kexec_buf pointer intead of passing all
> > these arguments in kexec_locate_mem_hole.
> > 
> > kbuf.mem can be used for addr.
> 
> Ok. What about this version?
> -- 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH 3/9] kexec_file: Factor out kexec_locate_mem_hole from
>  kexec_add_buffer.
> 
> kexec_locate_mem_hole will be used by the PowerPC kexec_file_load
> implementation to find free memory for the purgatory stack.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h | 12 +---
>  kernel/kexec_file.c   | 25 -
>  2 files changed, 29 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 3d91bcfc180d..e8b099da47f5 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -147,9 +147,14 @@ struct kexec_file_ops {
>  #endif
>  };
>  
> -/*
> - * Keeps track of buffer parameters as provided by caller for requesting
> - * memory placement of buffer.
> +/**
> + * struct kexec_buf - parameters for finding a place for a buffer in memory
> + * @image:   kexec image in which memory to search.
> + * @size:Memory size for the buffer.
> + * @align:   Minimum alignment needed.
> + * @min_addr:Minimum starting address.
> + * @max_addr:Maximum end address.
> + * @top_down:Find the highest free memory region?

Hmm, hold on. For declaring a struct in a header file, comment should be
just after each fields, like below, your format is for a function instead:
struct pci_slot {
struct pci_bus *bus;/* The bus this slot is on */
struct list_head list;  /* node in list of slots on this bus */
struct hotplug_slot *hotplug;   /* Hotplug info (migrate over time) */
unsigned char number;   /* PCI_SLOT(pci_dev->devfn) */
struct kobject kobj;
};

BTW, what is @size? there's no size field in kexec_buf. I think it is not
necessary to add these comment, they are easy to understand. If you really
want, please rewrite them correctly, for example "image" description is wrong.
It is not only for searching memory only, top_down description is also bad.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-06-27 Thread Dave Young
On 06/23/16 at 12:37pm, Thiago Jung Bauermann wrote:
> Am Donnerstag, 23 Juni 2016, 01:44:07 schrieb Dave Young:
> > Hmm, hold on. For declaring a struct in a header file, comment should be
> > just after each fields, like below, your format is for a function instead:
> > struct pci_slot {
> > struct pci_bus *bus;/* The bus this slot is on */
> > struct list_head list;  /* node in list of slots on this
> > bus */ struct hotplug_slot *hotplug;   /* Hotplug info (migrate over
> > time) */ unsigned char number;   /* PCI_SLOT(pci_dev->devfn) */
> > struct kobject kobj;
> > };
> 
> The comment style you mention above is not extractable documentation. The 
> style I used is what is described in section "kernel-doc for structs, 
> unions, enums, and typedefs" in Documentation/kernel-doc-nano-HOWTO.txt.

You are right and I was wrong!

>  
> > BTW, what is @size? there's no size field in kexec_buf. I think it is not
> > necessary to add these comment, they are easy to understand. If you really
> > want, please rewrite them correctly, for example "image" description is
> > wrong. It is not only for searching memory only, top_down description is
> > also bad.
> 
> Sorry, I moved these comments from kexec_locate_mem_hole but forgot to 
> rename the parameters to what they are called in struct kexec_buf. @size 
> should have been @memsz (other fields also have wrong names, I'll fix them 
> as well). The image description is correct in the context of where struct 
> kexec_buf is used and explains what it will be used for in the function 
> taking kexec_buf as an argument. It is not meant as a general description of 
> the purpose of struct kimage. What is bad about the description of top_down?

It is not clear enough to me, I personally think the original one in
source code is better:
/* allocate from top of memory hole */

> 
> I decided to add these comments because struct kexec_buf is now part of the 
> kernel API for kexec. kernel-doc-nano-HOWTO.txt says:
> 
> > We definitely need kernel-doc formatted documentation for functions
> > that are exported to loadable modules using EXPORT_SYMBOL.
> > 
> > We also look to provide kernel-doc formatted documentation for
> > functions externally visible to other kernel files (not marked
> > "static").
> > 
> > We also recommend providing kernel-doc formatted documentation
> > for private (file "static") routines, for consistency of kernel
> > source code layout.  But this is lower priority and at the
> > discretion of the MAINTAINER of that kernel source file.
> 
> If you think they are not necessary or just add clutter I can leave them 
> out.

I'm fine with either way.

THanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

[no subject]

2016-06-27 Thread Dave Young
vgo...@redhat.com
linux-ker...@vger.kernel.org,
Eric Biederman 
Bcc: ruy...@redhat.com
Subject: Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from
 kexec_add_buffer.
Reply-To: 
In-Reply-To: <5428891.iJNV8CI1We@hactar>

On 06/27/16 at 01:37pm, Thiago Jung Bauermann wrote:
> Am Dienstag, 28 Juni 2016, 00:19:48 schrieb Dave Young:
> > On 06/23/16 at 12:37pm, Thiago Jung Bauermann wrote:
> > > Am Donnerstag, 23 Juni 2016, 01:44:07 schrieb Dave Young:
> > > What is bad about the description of top_down?
> > It is not clear enough to me, I personally think the original one in
> > source code is better:
> > /* allocate from top of memory hole */
> 
> Actually I realized there's some discrepancy in how the x86 code uses 
> top_down and how I need it to work in powerpc. This may be what is confusing 
> about my comment and the existing comment.
> 
> x86 always walks memory from bottom to top but if top_down is true, in each 
> memory region it will allocate the memory hole in the highest address within 
> that region. I don't know why it is done that way, though.

I think we did not meaning to do this, considering kdump we have only
one crashkernel region for searching (crashk_res) so it is fine.
For kexec maybe changing the walking function to accept top_down is
reasonable.

Ccing Vivek see if he can remember something..

> 
> On powerpc, the memory walk itself should be from top to bottom, as well as 
> the memory hole allocation within each memory region.
> 
> Should I add a separate top_down argument to kexec_locate_mem_hole to 
> control if the memory walk should be from top to bottom, and then the 
> bottom_up member of struct kexec_buf controls where inside each memory 
> region the memory hole will be allocated?
> 
> -- 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-06-27 Thread Dave Young
Please ignore previous reply, I mistakenly send a broken mail without
subject, sorry about it. Resend the reply here.

On 06/27/16 at 01:37pm, Thiago Jung Bauermann wrote:
> Am Dienstag, 28 Juni 2016, 00:19:48 schrieb Dave Young:
> > On 06/23/16 at 12:37pm, Thiago Jung Bauermann wrote:
> > > Am Donnerstag, 23 Juni 2016, 01:44:07 schrieb Dave Young:
> > > What is bad about the description of top_down?
> > It is not clear enough to me, I personally think the original one in
> > source code is better:
> > /* allocate from top of memory hole */
> 
> Actually I realized there's some discrepancy in how the x86 code uses 
> top_down and how I need it to work in powerpc. This may be what is confusing 
> about my comment and the existing comment.
> 
> x86 always walks memory from bottom to top but if top_down is true, in each 
> memory region it will allocate the memory hole in the highest address within 
> that region. I don't know why it is done that way, though.

I think we did not meaning to do this, considering kdump we have only
one crashkernel region for searching (crashk_res) so it is fine.
For kexec maybe changing the walking function to accept top_down is
reasonable.
 
Ccing Vivek see if he can remember something..

> 
> On powerpc, the memory walk itself should be from top to bottom, as well as 
> the memory hole allocation within each memory region.
> 
> Should I add a separate top_down argument to kexec_locate_mem_hole to 
> control if the memory walk should be from top to bottom, and then the 
> bottom_up member of struct kexec_buf controls where inside each memory 
> region the memory hole will be allocated?
> 
> -- 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-06-28 Thread Dave Young
On 06/27/16 at 04:21pm, Dave Young wrote:
> Please ignore previous reply, I mistakenly send a broken mail without
> subject, sorry about it. Resend the reply here.
> 
> On 06/27/16 at 01:37pm, Thiago Jung Bauermann wrote:
> > Am Dienstag, 28 Juni 2016, 00:19:48 schrieb Dave Young:
> > > On 06/23/16 at 12:37pm, Thiago Jung Bauermann wrote:
> > > > Am Donnerstag, 23 Juni 2016, 01:44:07 schrieb Dave Young:
> > > > What is bad about the description of top_down?
> > > It is not clear enough to me, I personally think the original one in
> > > source code is better:
> > > /* allocate from top of memory hole */
> > 
> > Actually I realized there's some discrepancy in how the x86 code uses 
> > top_down and how I need it to work in powerpc. This may be what is 
> > confusing 
> > about my comment and the existing comment.
> > 
> > x86 always walks memory from bottom to top but if top_down is true, in each 
> > memory region it will allocate the memory hole in the highest address 
> > within 
> > that region. I don't know why it is done that way, though.
> 
> I think we did not meaning to do this, considering kdump we have only
> one crashkernel region for searching (crashk_res) so it is fine.
> For kexec maybe changing the walking function to accept top_down is
> reasonable.
>  
> Ccing Vivek see if he can remember something..
> 
> > 
> > On powerpc, the memory walk itself should be from top to bottom, as well as 
> > the memory hole allocation within each memory region.

What is the particular reason in powerpc for a mandatory top to bottom
walking?

> > 
> > Should I add a separate top_down argument to kexec_locate_mem_hole to 
> > control if the memory walk should be from top to bottom, and then the 
> > bottom_up member of struct kexec_buf controls where inside each memory 
> > region the memory hole will be allocated?

Using one argument for both sounds more reasonable than using a separate
argument for memory walk..

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-06-29 Thread Dave Young
On 06/28/16 at 07:18pm, Thiago Jung Bauermann wrote:
> Am Dienstag, 28 Juni 2016, 15:20:55 schrieb Dave Young:
> > On 06/27/16 at 04:21pm, Dave Young wrote:
> > > Please ignore previous reply, I mistakenly send a broken mail without
> > > subject, sorry about it. Resend the reply here.
> > > 
> > > On 06/27/16 at 01:37pm, Thiago Jung Bauermann wrote:
> > > > Am Dienstag, 28 Juni 2016, 00:19:48 schrieb Dave Young:
> > > > > On 06/23/16 at 12:37pm, Thiago Jung Bauermann wrote:
> > > > > > Am Donnerstag, 23 Juni 2016, 01:44:07 schrieb Dave Young:
> > > > > > What is bad about the description of top_down?
> > > > > 
> > > > > It is not clear enough to me, I personally think the original one in
> > > > > source code is better:
> > > > > /* allocate from top of memory hole */
> > > > 
> > > > Actually I realized there's some discrepancy in how the x86 code uses
> > > > top_down and how I need it to work in powerpc. This may be what is
> > > > confusing about my comment and the existing comment.
> > > > 
> > > > x86 always walks memory from bottom to top but if top_down is true, in
> > > > each memory region it will allocate the memory hole in the highest
> > > > address within that region. I don't know why it is done that way,
> > > > though.
> > > 
> > > I think we did not meaning to do this, considering kdump we have only
> > > one crashkernel region for searching (crashk_res) so it is fine.
> > > For kexec maybe changing the walking function to accept top_down is
> > > reasonable.
> > > 
> > > Ccing Vivek see if he can remember something..
> > > 
> > > > On powerpc, the memory walk itself should be from top to bottom, as
> > > > well as the memory hole allocation within each memory region.
> > 
> > What is the particular reason in powerpc for a mandatory top to bottom
> > walking?
> 
> I'm walking unreserved memory ranges, so reservations made low in memory 
> (such as the reservation for the initrd) may create a memory hole that is a 
> lot lower than the true memory limit where I want to allocate from (768 MB). 
> In this situation, allocating at the highest address in the lowest free 
> memory range will allocate the buffer very low in memory, and in that case 
> top_down doesn't mean much.
> 
> Walking memory from lowest to highest address but then allocating memory at 
> the highest address inside the memory range is peculiar and surprising. Is 
> there a particular reason for it?

I do not know if there's some historic reason, personally I think it
should be an accident.

> 
> If it's an accident and doesn't affect x86, I'd suggest that top_down should
> have its expected behavior, which (at least for me) is: allocate from the
> highest available memory address within the desired range.

I tend to agree, but we need test it first to see if it breaks something.

> 
> In any case, my patch series allows each architecture to define what
> top_down should mean. It doesn't change the behavior in x86, since
> the default implementation of arch_kexec_walk_mem ignores
> kexec_buf.top_down, and allows powerpc to take top_down into account
> when walking memory.
> 
> > > > Should I add a separate top_down argument to kexec_locate_mem_hole to
> > > > control if the memory walk should be from top to bottom, and then the
> > > > bottom_up member of struct kexec_buf controls where inside each memory
> > > > region the memory hole will be allocated?
> > 
> > Using one argument for both sounds more reasonable than using a separate
> > argument for memory walk..
> 
> I agree. This patch doesn't use a separate top_down argument, it's the same
> patch I sent earlier except that the comments to struct kexec_buf are in
> patch 2/9. What do you think?

It looks good except one nitpick inline..

> 
> -- 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH 3/9] kexec_file: Factor out kexec_locate_mem_hole from
>  kexec_add_buffer.
> 
> kexec_locate_mem_hole will be used by the PowerPC kexec_file_load
> implementation to find free memory for the purgatory stack.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h |  1 +
>  kernel/kexec_file.c   | 25 -
>  2 files changed, 21 insertions(+), 5 dele

Re: [PATCH v3 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-29 Thread Dave Young
On 06/28/16 at 07:18pm, Thiago Jung Bauermann wrote:
> Am Donnerstag, 23 Juni 2016, 10:25:06 schrieb Dave Young:
> > On 06/22/16 at 08:30pm, Thiago Jung Bauermann wrote:
> > > Am Mittwoch, 22 Juni 2016, 18:20:47 schrieb Dave Young:
> > > > The patch looks good, but could the subject be more specific?
> > > > 
> > > > For example just like the first sentence of the patch descriotion:
> > > > Allow architectures to specify their own memory walking function
> > > 
> > > Ok, What about this? I also changed the description to refer to x86 arch
> > > instead of Intel arch.
> > 
> > It looks good to me.
> 
> This version has the struct kexec_buf documentation comments that were
> in patch 3/9. I fixed the names of the struct members, and changed their
> descriptions to try to be clearer.
> -- 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH 2/9] kexec_file: Allow arch-specific memory walking for
>  kexec_add_buffer
> 
> Allow architectures to specify a different memory walking function for
> kexec_add_buffer. x86 uses iomem to track reserved memory ranges, but
> PowerPC uses the memblock subsystem.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: ke...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> ---
>  include/linux/kexec.h   | 25 -
>  kernel/kexec_file.c | 30 ++
>  kernel/kexec_internal.h | 14 --
>  3 files changed, 46 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..e16d845d587f 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -146,7 +146,30 @@ struct kexec_file_ops {
>   kexec_verify_sig_t *verify_sig;
>  #endif
>  };
> -#endif
> +
> +/**
> + * struct kexec_buf - parameters for finding a place for a buffer in memory
> + * @image:   kexec image in which memory to search.
> + * @mem: On return will have address of the buffer in memory.
> + * @memsz:   Size for the buffer in memory.
> + * @buf_align:   Minimum alignment needed.
> + * @buf_min: The buffer can't be placed below this address.
> + * @buf_max: The buffer can't be placed above this address.
> + * @top_down:Allocate from top of memory.
> + */
> +struct kexec_buf {
> + struct kimage *image;
> + unsigned long mem;
> + unsigned long memsz;
> + unsigned long buf_align;
> + unsigned long buf_min;
> + unsigned long buf_max;
> + bool top_down;
> +};

Rethink about the first patch, you dropped the user buffer in kexec_buf
But later your passing IMA digests buffer patchset may need use it.

So keep it in kexec_buf should be better.

For the IMA buffer patchset I'm still reading and learning the
background, will reply them later.

> +
> +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> +int (*func)(u64, u64, void *));
> +#endif /* CONFIG_KEXEC_FILE */
>  
>  struct kimage {
>   kimage_entry_t head;
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b6eec7527e9f..b1f1f6402518 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -428,6 +428,27 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +/**
> + * arch_kexec_walk_mem - call func(data) on free memory regions
> + * @kbuf:Context info for the search. Also passed to @func.
> + * @func:Function to call for each memory region.
> + *
> + * Return: The memory walk will stop when func returns a non-zero value
> + * and that value will be returned. If all free regions are visited without
> + * func returning non-zero, then zero will be returned.
> + */
> +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> +int (*func)(u64, u64, void *))
> +{
> + if (kbuf->image->type == KEXEC_TYPE_CRASH)
> + return walk_iomem_res_desc(crashk_res.desc,
> +IORESOURCE_SYSTEM_RAM | 
> IORESOURCE_BUSY,
> +crashk_res.start, crashk_res.end,
> +kbuf, func);
> + else
> + return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
> +}
> +
>  /*
>   * Helper function for placing a buffer in a kexec segment. This assumes
>   * that kexec_mutex is held.
> @@ -472,14 +493,7 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>   kbuf->top_dow

Re: [PATCH v3 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-30 Thread Dave Young
On 06/29/16 at 06:18pm, Thiago Jung Bauermann wrote:
> Am Mittwoch, 29 Juni 2016, 15:47:51 schrieb Dave Young:
> > On 06/28/16 at 07:18pm, Thiago Jung Bauermann wrote:
> > > diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> > > index e8acb2b43dd9..e16d845d587f 100644
> > > --- a/include/linux/kexec.h
> > > +++ b/include/linux/kexec.h
> > > @@ -146,7 +146,30 @@ struct kexec_file_ops {
> > > 
> > >   kexec_verify_sig_t *verify_sig;
> > >  
> > >  #endif
> > >  };
> > > 
> > > -#endif
> > > +
> > > +/**
> > > + * struct kexec_buf - parameters for finding a place for a buffer in
> > > memory + * @image:   kexec image in which memory to search.
> > > + * @mem: On return will have address of the buffer in memory.
> > > + * @memsz:   Size for the buffer in memory.
> > > + * @buf_align:   Minimum alignment needed.
> > > + * @buf_min: The buffer can't be placed below this address.
> > > + * @buf_max: The buffer can't be placed above this address.
> > > + * @top_down:Allocate from top of memory.
> > > + */
> > > +struct kexec_buf {
> > > + struct kimage *image;
> > > + unsigned long mem;
> > > + unsigned long memsz;
> > > + unsigned long buf_align;
> > > + unsigned long buf_min;
> > > + unsigned long buf_max;
> > > + bool top_down;
> > > +};
> > 
> > Rethink about the first patch, you dropped the user buffer in kexec_buf
> > But later your passing IMA digests buffer patchset may need use it.
> > 
> > So keep it in kexec_buf should be better.
> 
> I'm not following. The IMA buffer patchset doesn't use kexec_locate_mem_hole 
> nor struct kexec_buf.

It does not use kexec_locate_mem_hole, but the buffer being passed is
very similar to a kexec_buf struct, no? 

So you may refactor kexec_add_buffer and your new function to pass only kimage
and a kbuf, it will be better than passing all those arguments separately.

> 
> > For the IMA buffer patchset I'm still reading and learning the
> > background, will reply them later.
> 
> Thank you!
> 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-06-30 Thread Dave Young
On 06/29/16 at 06:09pm, Thiago Jung Bauermann wrote:
> Am Mittwoch, 29 Juni 2016, 15:45:18 schrieb Dave Young:
> > On 06/28/16 at 07:18pm, Thiago Jung Bauermann wrote:
> > > Am Dienstag, 28 Juni 2016, 15:20:55 schrieb Dave Young:
> > > > On 06/27/16 at 04:21pm, Dave Young wrote:
> > > > Using one argument for both sounds more reasonable than using a
> > > > separate
> > > > argument for memory walk..
> > > 
> > > I agree. This patch doesn't use a separate top_down argument, it's the
> > > same patch I sent earlier except that the comments to struct kexec_buf
> > > are in patch 2/9. What do you think?
> > 
> > It looks good except one nitpick inline..
> > 
> >
> > > +/**
> > > + * kexec_locate_mem_hole - find free memory to load segment or use in
> > > purgatory
>  
> > It is not necessary to use only for purgatory load..
> 
> Ok, what about this?
> 
> /**
>  * kexec_locate_mem_hole - find free memory in a given kimage.

Hmm, a given kimage sounds not correct, I can not get a better way to
describe it. How about below with a little change to your previous one:

kexec_locate_mem_hole - find a free chunk of memory to load kexec segment.
In powerpc the memory chunk can also be used for the purgatory stack.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-06-30 Thread Dave Young
On 06/30/16 at 01:08pm, Thiago Jung Bauermann wrote:
> Am Donnerstag, 30 Juni 2016, 11:41:19 schrieb Dave Young:
> > On 06/29/16 at 06:09pm, Thiago Jung Bauermann wrote:
> > > Am Mittwoch, 29 Juni 2016, 15:45:18 schrieb Dave Young:
> > > > On 06/28/16 at 07:18pm, Thiago Jung Bauermann wrote:
> > > > > Am Dienstag, 28 Juni 2016, 15:20:55 schrieb Dave Young:
> > > > > > On 06/27/16 at 04:21pm, Dave Young wrote:
> > > > It looks good except one nitpick inline..
> > > > 
> > > > > +/**
> > > > > + * kexec_locate_mem_hole - find free memory to load segment or use
> > > > > in
> > > > > purgatory
> > > > 
> > > > It is not necessary to use only for purgatory load..
> > > 
> > > Ok, what about this?
> > > 
> > > /**
> > > 
> > >  * kexec_locate_mem_hole - find free memory in a given kimage.
> > 
> > Hmm, a given kimage sounds not correct, I can not get a better way to
> > describe it. How about below with a little change to your previous one:
> > 
> > kexec_locate_mem_hole - find a free chunk of memory to load kexec segment.
> > In powerpc the memory chunk can also be used for the purgatory stack.
> 
> That describes what the memory currently is used for. If powerpc or any 
> other architecture starts to use the memory for something else, this comment 
> would need to be updated. :-)
> 
> What the function really does is find free memory in the physical address 
> space after the currently running kernel hands over control to whatever runs 
> next. What that memory is used for is decided by the caller of the function.
> 
> Since (at least for now), the only things that run next are the purgatory 
> and the next kernel, what about this?
> 
> kexec_locate_mem_hole - find free memory for the purgatory or the next 
> kernel

Ok, I'm fine with above version.

Thanks
Dave 
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 2/9] kexec_file: Generalize kexec_add_buffer.

2016-06-30 Thread Dave Young
On 06/30/16 at 01:42pm, Thiago Jung Bauermann wrote:
> Am Donnerstag, 30 Juni 2016, 12:49:44 schrieb Thiago Jung Bauermann:
> > Am Donnerstag, 30 Juni 2016, 11:07:00 schrieb Dave Young:
> > > On 06/29/16 at 06:18pm, Thiago Jung Bauermann wrote:
> > > > I'm not following. The IMA buffer patchset doesn't use
> > > > kexec_locate_mem_hole nor struct kexec_buf.
> > > 
> > > It does not use kexec_locate_mem_hole, but the buffer being passed is
> > > very similar to a kexec_buf struct, no?
> > 
> > If what you're saying is that the arguments passed to
> > kexec_add_handover_buffer in the IMA buffer patchset are very similar to
> > the arguments passed to kexec_add_buffer then yes, it's true.
> > 
> > > So you may refactor kexec_add_buffer and your new function to pass only
> > > kimage and a kbuf, it will be better than passing all those arguments
> > > separately.
> > 
> > To be honest I think struct kexec_buf is an implementation detail inside
> > kexec_locate_mem_hole, made necessary because the callback functions it
> > uses need to access its arguments. Callers of kexec_locate_mem_hole,
> > kexec_add_handover_buffer and kexec_add_buffer shouldn't need to know it
> > exists.
> 
> Elaborating a bit more: the argument list for these three functions are 
> equal or similar because kexec_add_handover_buffer uses kexec_add_buffer, 
> which uses kexec_locate_mem_hole.
> 
> It could be beneficial to have a struct to collect the arguments to these 
> functions if someone using one of them would be likely to use another one 
> with the same arguments. In that case, you set up kexec_buf once and then 
> just pass it whenever you need to call one of those functions.
> 
> But that is unlikely to happen. A user of the kexec API will need to use 
> only one of these functions with a given set of arguments, so they don't 
> gain anything by setting up a struct.
> 
> Syntactically, I also don't think it's clearer to set struct members instead 
> of simply passing arguments to a function, even if the argument list is 
> long.

Sorry, I'm not sure I get your points but the long argument list really looks 
ugly,
since you are introducing more callbacks I still think a cleanup is necessary.

kexec_buffer struct is pretty fine to be a abstract of all these buffers.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 2/9] kexec_file: Generalize kexec_add_buffer.

2016-07-01 Thread Dave Young
On 07/01/16 at 02:51pm, Thiago Jung Bauermann wrote:
> Am Donnerstag, 30 Juni 2016, 17:43:57 schrieb Dave Young:
> > On 06/30/16 at 01:42pm, Thiago Jung Bauermann wrote:
> > > Am Donnerstag, 30 Juni 2016, 12:49:44 schrieb Thiago Jung Bauermann:
> > > > To be honest I think struct kexec_buf is an implementation detail
> > > > inside
> > > > kexec_locate_mem_hole, made necessary because the callback functions
> > > > it
> > > > uses need to access its arguments. Callers of kexec_locate_mem_hole,
> > > > kexec_add_handover_buffer and kexec_add_buffer shouldn't need to know
> > > > it
> > > > exists.
> > > 
> > > Elaborating a bit more: the argument list for these three functions are
> > > equal or similar because kexec_add_handover_buffer uses
> > > kexec_add_buffer,
> > > which uses kexec_locate_mem_hole.
> > > 
> > > It could be beneficial to have a struct to collect the arguments to
> > > these
> > > functions if someone using one of them would be likely to use another
> > > one
> > > with the same arguments. In that case, you set up kexec_buf once and
> > > then
> > > just pass it whenever you need to call one of those functions.
> > > 
> > > But that is unlikely to happen. A user of the kexec API will need to use
> > > only one of these functions with a given set of arguments, so they don't
> > > gain anything by setting up a struct.
> > > 
> > > Syntactically, I also don't think it's clearer to set struct members
> > > instead of simply passing arguments to a function, even if the argument
> > > list is long.
> > 
> > Sorry, I'm not sure I get your points but the long argument list really
> > looks ugly, since you are introducing more callbacks I still think a
> > cleanup is necessary.
> > 
> > kexec_buffer struct is pretty fine to be a abstract of all these buffers.
> 
> What I understood from what you said is that making the following change
> results in code that is easier to understand:
> 
> @@ -650,6 +639,7 @@ static int __kexec_load_purgatory(struct kimage *image, 
> unsigned long min,
>   const Elf_Shdr *sechdrs_c;
>   Elf_Shdr *sechdrs = NULL;
>   void *purgatory_buf = NULL;
> + struct kexec_buf buf;
>  
>   /*
>* sechdrs_c points to section headers in purgatory and are read
> @@ -757,11 +747,19 @@ static int __kexec_load_purgatory(struct kimage *image, 
> unsigned long min,
>   buf_align = bss_align;
>  
>   /* Add buffer to segment list */
> - ret = kexec_add_buffer(image, purgatory_buf, buf_sz, memsz,
> - buf_align, min, max, top_down,
> - &pi->purgatory_load_addr);
> + memset(&buf, 0, sizeof(struct kexec_buf));
> + buf.image = image;
> + buf.buffer = purgatory_buf;
> + buf.bufsz = buf_sz,
> + buf.memsz = memsz;
> + buf.buf_align = buf_align;
> + buf.buf_min = min;
> + buf.buf_max = max;
> + buf.top_down = top_down;
> + ret = kexec_add_buffer(&buf);
>   if (ret)
>   goto out;
> + pi->purgatory_load_addr = buf.mem;
>  
>   /* Load SHF_ALLOC sections */
>   buf_addr = purgatory_buf;
> 
> There are 9 calls to kexec_add_buffer in the kernel (including arch/x86,
> arch/powerpc/ and kernel/), plus 1 to kexec_locate_mem_hole
> and 1 to kexec_add_handover_buffer, so there would be 11 places in
> the code settings up kexec_buf. My opinion is that this change doesn't
> improve code readability.

But the assignment can be moved to the beginning of the function
__kexec_load_purgatory, and avoid the local variables from the very
beginning. Just use kbuf.member instead.

> 
> Also, I think that kexec_buf abstracts something that, from the
> perspective of the user of the kexec API, lives only for the duration
> of a single call to either of kexec_add_buffer, kexec_locate_mem_hole,
> or kexec_add_handover_buffer. Because of this, there's no need from the
> perspective of the API user to initialize this "object", so this just
> adds to their cognitive load without any benefit to them.
> 
> I understand that this is all somewhat subjective, so if you still disagree
> with my points I can provide a patch set implementing the change above.

I still feel it should be changed if more callbacks being introduced,
though you can regard it is internal api, like above comment we do not
need to assign them seperately, the member values can be assigned
from the beginning.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v3 2/9] kexec_file: Generalize kexec_add_buffer.

2016-07-04 Thread Dave Young
On 07/01/16 at 05:31pm, Thiago Jung Bauermann wrote:
> Am Freitag, 01 Juli 2016, 17:02:23 schrieb Thiago Jung Bauermann:
> > Am Freitag, 01 Juli 2016, 14:36:02 schrieb Dave Young:
> > > On 07/01/16 at 02:51pm, Thiago Jung Bauermann wrote:
> > > > Am Donnerstag, 30 Juni 2016, 17:43:57 schrieb Dave Young:
> > > > > On 06/30/16 at 01:42pm, Thiago Jung Bauermann wrote:
> > > > > > Am Donnerstag, 30 Juni 2016, 12:49:44 schrieb Thiago Jung Bauermann:
> > > > I understand that this is all somewhat subjective, so if you still
> > > > disagree with my points I can provide a patch set implementing the
> > > > change above.
> > > 
> > > I still feel it should be changed if more callbacks being introduced,
> > > though you can regard it is internal api, like above comment we do not
> > > need to assign them seperately, the member values can be assigned
> > > from the beginning.
> > 
> > Ok, I'll implement the changes and submit a v4. Thanks for your review.
> 
> Sorry for creating more email traffic, but it'll be better if I ask this 
> before I change all other places in the code. Is the code below what you
> have in mind?

Thanks for the update, almost except a nitpick :)

> 
> In particular, this version doesn't do the memset(&buf, 0, sizeof(buf))
> that the previous code I sent earlier did. Is that ok?
> 
> @@ -643,13 +632,14 @@ static int __kexec_load_purgatory(struct kimage *image, 
> unsigned long min,
> unsigned long max, int top_down)
>  {
>   struct purgatory_info *pi = &image->purgatory_info;
> - unsigned long align, buf_align, bss_align, buf_sz, bss_sz, bss_pad;
> - unsigned long memsz, entry, load_addr, curr_load_addr, bss_addr, offset;
> + unsigned long align, bss_align, bss_sz, bss_pad;
> + unsigned long entry, load_addr, curr_load_addr, bss_addr, offset;
>   unsigned char *buf_addr, *src;
>   int i, ret = 0, entry_sidx = -1;
>   const Elf_Shdr *sechdrs_c;
>   Elf_Shdr *sechdrs = NULL;
> - void *purgatory_buf = NULL;
> + struct kexec_buf buf = { .image = image, .buf_min = min,
> + .buf_max = max, .top_down = top_down };
>  
>   /*
>* sechdrs_c points to section headers in purgatory and are read
> @@ -715,9 +705,9 @@ static int __kexec_load_purgatory(struct kimage *image, 
> unsigned long min,
>   }
>  
>   /* Determine how much memory is needed to load relocatable object. */
> - buf_align = 1;
> + buf.buf_align = 1;
>   bss_align = 1;
> - buf_sz = 0;
> + buf.bufsz = 0;
>   bss_sz = 0;

Above chunk can go to the initializatioin of struct kexec buf ealier.

>  
>   for (i = 0; i < pi->ehdr->e_shnum; i++) {
> @@ -726,10 +716,10 @@ static int __kexec_load_purgatory(struct kimage *image, 
> unsigned long min,
>  
>   align = sechdrs[i].sh_addralign;
>   if (sechdrs[i].sh_type != SHT_NOBITS) {
> - if (buf_align < align)
> - buf_align = align;
> - buf_sz = ALIGN(buf_sz, align);
> - buf_sz += sechdrs[i].sh_size;
> + if (buf.buf_align < align)
> + buf.buf_align = align;
> + buf.bufsz = ALIGN(buf.bufsz, align);
> + buf.bufsz += sechdrs[i].sh_size;
>   } else {
>   /* bss section */
>   if (bss_align < align)
> @@ -741,32 +731,31 @@ static int __kexec_load_purgatory(struct kimage *image, 
> unsigned long min,
>  
>   /* Determine the bss padding required to align bss properly */
>   bss_pad = 0;
> - if (buf_sz & (bss_align - 1))
> - bss_pad = bss_align - (buf_sz & (bss_align - 1));
> + if (buf.bufsz & (bss_align - 1))
> + bss_pad = bss_align - (buf.bufsz & (bss_align - 1));
>  
> - memsz = buf_sz + bss_pad + bss_sz;
> + buf.memsz = buf.bufsz + bss_pad + bss_sz;
>  
>   /* Allocate buffer for purgatory */
> - purgatory_buf = vzalloc(buf_sz);
> - if (!purgatory_buf) {
> + buf.buffer = vzalloc(buf.bufsz);
> + if (!buf.buffer) {
>   ret = -ENOMEM;
>   goto out;
>   }
>  
> - if (buf_align < bss_align)
> - buf_align = bss_align;
> + if (buf.buf_align < bss_align)
> + buf.buf_align = bss_align;
>  
>   /* Add buffer to segment list */
> - ret = kexec_add_buffer(image, purgatory_buf, buf_sz, memsz,
> - 

Re: [RFC] arm64: kexec_file_load support

2016-07-04 Thread Dave Young
On 07/04/16 at 03:58pm, AKASHI Takahiro wrote:
> Hi,
> 
> On Fri, Jul 01, 2016 at 12:46:31PM -0300, Thiago Jung Bauermann wrote:
> > Am Freitag, 01 Juli 2016, 14:11:12 schrieb AKASHI Takahiro:
> > > I'm not sure whether there is any demand for kexec_file_load
> > > support on arm64, but anyhow I'm working on this and now
> > > my early prototype code does work fine.
> > 
> > It is necessary if you want to support loading only signed kernels, and 
> > also 
> > if you want IMA to measure the kernel in its event log.
> > 
> > > There is, however, one essential issue:
> > > While arm64 kernel requires a device tree blob to be set up
> > > correctly at boot time, the current system call API doesn't
> > > have this parameter.
> > > int kexec_file_load(int kernel_fd, int initrd_fd,
> > > unsigned long cmdline_len, const char
> > > *cmdline_ptr, unsigned long flags);
> > > 
> > > Should we invent a new system call, like kexec_file_load2,
> > > and, if so, what kind of interface would be desired?
> > 
> > I'm facing the same issue on powerpc. What I'm doing is taking the device 
> > tree that was used to boot the current kernel and modifying it as necessary 
> > to pass it to the next kernel.
> 
> That is exactly what I do.
> 
> > I agree that it would be better if we could have a system call where a 
> > custom device tree could be passed. One suggestion is:
> 
> For powerpc, you might be able to use dtbImage instead of Image
> without changing the kernel interfaces.
> > 
> > kexec_file_load2(int fds[], int fd_types[], int nr_fds,
> >  unsigned long cmdline_len, const char *cmdline_ptr,
> > unsigned long flags);
> 
> You don't want to simply add one more argument, i.e. dtb_fd, don't you.
> 
> I prefer a slightly-simpler interface:
> struct kexec_file_fd {
> enum kexec_file_type;
> int fd;
> }
> 
> int kexec_file_load2(struct kexec_file_fd[], int nr_fds, int flags);
> 
> Or if you want to keep the compatibility with the existing system call,
> 
> int kexec_file_load(int kernel_fd, int initrd_fd,
> unsigned long cmdline_len, const char *cmdline_ptr,
> unsigned long flags,
> int struct kexec_file_fd[], int nr_fds);
> 
> Here SYSCALL_DEFINE7() have to be defined, and I'm not sure that we will not
> have a problem in adding a system call with more than 6 arguments.
> 
> > Where fds is an array with nr_fds file descriptors and fd_types is an array 
> > specifying what each fd in fds is. So for example, if fds[i] is the kernel, 
> > then fd_types[i] would have the value KEXEC_FILE_KERNEL_FD. If fds[i] is 
> > the 
> > device tree blob, fd_types[i], would have the value KEXEC_FILE_DTB and so 
> > on. That way, the syscall can be extended for an arbitrary number and types 
> > of segments that have to be loaded, just like kexec_load.
> > 
> > Another option is to have a struct:
> > 
> > kexec_file_load2(struct kexec_file_params *params, unsigned long params_sz);
> 
> Wow, we can add any number of new parameters with this interface.
> 
> Thanks,
> -Takahiro AKASHI
> 
> > Where:
> > 
> > struct kexec_file_params {
> > int version;/* allows struct to be extended in the future */
> > int fds[];
> > int fd_types[];
> > int nr_fds;
> > unsigned long cmdline_len;
> > const char *cmdline_ptr;
> > unsigned long flags;
> > };
> > 
> > This is even more flexible.

I would like to vote for this one, and use kexec_file_fd fds[] in the struct 
 
Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC] arm64: kexec_file_load support

2016-07-06 Thread Dave Young
On 07/05/16 at 05:03pm, AKASHI Takahiro wrote:
> Hi Dave,
> 
> On Tue, Jul 05, 2016 at 09:25:56AM +0800, Dave Young wrote:
> > On 07/04/16 at 03:58pm, AKASHI Takahiro wrote:
> > > Hi,
> > > 
> > > On Fri, Jul 01, 2016 at 12:46:31PM -0300, Thiago Jung Bauermann wrote:
> > > > Am Freitag, 01 Juli 2016, 14:11:12 schrieb AKASHI Takahiro:
> > > > > I'm not sure whether there is any demand for kexec_file_load
> > > > > support on arm64, but anyhow I'm working on this and now
> > > > > my early prototype code does work fine.
> > > > 
> > > > It is necessary if you want to support loading only signed kernels, and 
> > > > also 
> > > > if you want IMA to measure the kernel in its event log.
> > > > 
> > > > > There is, however, one essential issue:
> > > > > While arm64 kernel requires a device tree blob to be set up
> > > > > correctly at boot time, the current system call API doesn't
> > > > > have this parameter.
> > > > > int kexec_file_load(int kernel_fd, int initrd_fd,
> > > > > unsigned long cmdline_len, const char
> > > > > *cmdline_ptr, unsigned long flags);
> > > > > 
> > > > > Should we invent a new system call, like kexec_file_load2,
> > > > > and, if so, what kind of interface would be desired?
> > > > 
> > > > I'm facing the same issue on powerpc. What I'm doing is taking the 
> > > > device 
> > > > tree that was used to boot the current kernel and modifying it as 
> > > > necessary 
> > > > to pass it to the next kernel.
> > > 
> > > That is exactly what I do.
> > > 
> > > > I agree that it would be better if we could have a system call where a 
> > > > custom device tree could be passed. One suggestion is:
> > > 
> > > For powerpc, you might be able to use dtbImage instead of Image
> > > without changing the kernel interfaces.
> > > > 
> > > > kexec_file_load2(int fds[], int fd_types[], int nr_fds,
> > > >  unsigned long cmdline_len, const char *cmdline_ptr,
> > > > unsigned long flags);
> > > 
> > > You don't want to simply add one more argument, i.e. dtb_fd, don't you.
> > > 
> > > I prefer a slightly-simpler interface:
> > > struct kexec_file_fd {
> > > enum kexec_file_type;
> > > int fd;
> > > }
> > > 
> > > int kexec_file_load2(struct kexec_file_fd[], int nr_fds, int 
> > > flags);
> > > 
> > > Or if you want to keep the compatibility with the existing system call,
> > > 
> > > int kexec_file_load(int kernel_fd, int initrd_fd,
> > > unsigned long cmdline_len, const char 
> > > *cmdline_ptr,
> > > unsigned long flags,
> > > int struct kexec_file_fd[], int nr_fds);
> > > 
> > > Here SYSCALL_DEFINE7() have to be defined, and I'm not sure that we will 
> > > not
> > > have a problem in adding a system call with more than 6 arguments.
> > > 
> > > > Where fds is an array with nr_fds file descriptors and fd_types is an 
> > > > array 
> > > > specifying what each fd in fds is. So for example, if fds[i] is the 
> > > > kernel, 
> > > > then fd_types[i] would have the value KEXEC_FILE_KERNEL_FD. If fds[i] 
> > > > is the 
> > > > device tree blob, fd_types[i], would have the value KEXEC_FILE_DTB and 
> > > > so 
> > > > on. That way, the syscall can be extended for an arbitrary number and 
> > > > types 
> > > > of segments that have to be loaded, just like kexec_load.
> > > > 
> > > > Another option is to have a struct:
> > > > 
> > > > kexec_file_load2(struct kexec_file_params *params, unsigned long 
> > > > params_sz);
> > > 
> > > Wow, we can add any number of new parameters with this interface.
> > > 
> > > Thanks,
> > > -Takahiro AKASHI
> > > 
> > > > Where:
> > > > 
> > > > struct kexec_file_params {
> > > > int version;/* allows struct to be extended in the future */
> > > > int fds[];
> > > > int fd_types[];
&g

Re: [PATCH v4 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-07-10 Thread Dave Young
On 07/10/16 at 04:11pm, Michael Ellerman wrote:
> Thiago Jung Bauermann  writes:
> 
> > kexec_locate_mem_hole will be used by the PowerPC kexec_file_load
> > implementation to find free memory for the purgatory stack.
> >
> > Signed-off-by: Thiago Jung Bauermann 
> > Cc: Eric Biederman 
> > Cc: Dave Young 
> 
> Dave are you happy with the first three patches? If so do you mind
> sending an ack?

I reviewed the 3 patches, they look good to me. Will ack after a small test 
today.

> 
> Given the series touches generic code, x86 and powerpc this might be one
> for Andrew to take. Or is there a kexec tree it should go through?

They should go Andrew's tree, but I'm not sure about the powerpc part.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC] arm64: kexec_file_load support

2016-07-10 Thread Dave Young
On 07/08/16 at 11:48am, Thiago Jung Bauermann wrote:
> Am Donnerstag, 07 Juli 2016, 14:12:45 schrieb Dave Young:
> > If so maybe change a bit from your precious mentioned 7 args proposal like
> > below?
> > 
> > struct kexec_file_fd {
> > enum kexec_file_type;
> > int fd;
> > }
> > 
> > struct kexec_fdset {
> > int nr_fd;
> > struct kexec_file_fd fd[0];
> > }
> > 
> > int kexec_file_load(int kernel_fd, int initrd_fd,
> > unsigned long cmdline_len, const char *cmdline_ptr,
> > unsigned long flags, struct kexec_fdset *extra_fds);
> 
> 
> Is there a way for the kernel to distinguish whether the process passed 5 or 
> 6 arguments? How can it know whether extra_fds is a valid argument or just 
> garbage? I think we have to define a new flag KEXEC_FILE_EXTRA_FDS so that 
> the process can signal that it is using the new interface.

Agreed, a new flag is needed.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH v4 1/9] kexec_file: Allow arch-specific memory walking for kexec_add_buffer

2016-07-10 Thread Dave Young
On 07/07/16 at 01:23pm, Thiago Jung Bauermann wrote:
> Allow architectures to specify a different memory walking function for
> kexec_add_buffer. x86 uses iomem to track reserved memory ranges, but
> PowerPC uses the memblock subsystem.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> ---
>  include/linux/kexec.h   | 29 -
>  kernel/kexec_file.c | 30 ++
>  kernel/kexec_internal.h | 16 
>  3 files changed, 50 insertions(+), 25 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index e8acb2b43dd9..2549a99a748c 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -146,7 +146,34 @@ struct kexec_file_ops {
>   kexec_verify_sig_t *verify_sig;
>  #endif
>  };
> -#endif
> +
> +/**
> + * struct kexec_buf - parameters for finding a place for a buffer in memory
> + * @image:   kexec image in which memory to search.
> + * @buffer:  Contents which will be copied to the allocated memory.
> + * @bufsz:   Size of @buffer.
> + * @mem: On return will have address of the buffer in memory.
> + * @memsz:   Size for the buffer in memory.
> + * @buf_align:   Minimum alignment needed.
> + * @buf_min: The buffer can't be placed below this address.
> + * @buf_max: The buffer can't be placed above this address.
> + * @top_down:Allocate from top of memory.
> + */
> +struct kexec_buf {
> + struct kimage *image;
> + char *buffer;
> + unsigned long bufsz;
> + unsigned long mem;
> + unsigned long memsz;
> + unsigned long buf_align;
> + unsigned long buf_min;
> + unsigned long buf_max;
> + bool top_down;
> +};
> +
> +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> +int (*func)(u64, u64, void *));
> +#endif /* CONFIG_KEXEC_FILE */
>  
>  struct kimage {
>   kimage_entry_t head;
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 01ab82a40d22..9c0c565a08db 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -428,6 +428,27 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   return locate_mem_hole_bottom_up(start, end, kbuf);
>  }
>  
> +/**
> + * arch_kexec_walk_mem - call func(data) on free memory regions
> + * @kbuf:Context info for the search. Also passed to @func.
> + * @func:Function to call for each memory region.
> + *
> + * Return: The memory walk will stop when func returns a non-zero value
> + * and that value will be returned. If all free regions are visited without
> + * func returning non-zero, then zero will be returned.
> + */
> +int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
> +int (*func)(u64, u64, void *))
> +{
> + if (kbuf->image->type == KEXEC_TYPE_CRASH)
> + return walk_iomem_res_desc(crashk_res.desc,
> +IORESOURCE_SYSTEM_RAM | 
> IORESOURCE_BUSY,
> +crashk_res.start, crashk_res.end,
> +kbuf, func);
> + else
> + return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
> +}
> +
>  /*
>   * Helper function for placing a buffer in a kexec segment. This assumes
>   * that kexec_mutex is held.
> @@ -474,14 +495,7 @@ int kexec_add_buffer(struct kimage *image, char *buffer, 
> unsigned long bufsz,
>   kbuf->top_down = top_down;
>  
>   /* Walk the RAM ranges and allocate a suitable range for the buffer */
> - if (image->type == KEXEC_TYPE_CRASH)
> - ret = walk_iomem_res_desc(crashk_res.desc,
> - IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY,
> - crashk_res.start, crashk_res.end, kbuf,
> - locate_mem_hole_callback);
> - else
> - ret = walk_system_ram_res(0, -1, kbuf,
> -   locate_mem_hole_callback);
> + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
>   if (ret != 1) {
>   /* A suitable memory range could not be found for buffer */
>   return -EADDRNOTAVAIL;
> diff --git a/kernel/kexec_internal.h b/kernel/kexec_internal.h
> index 0a52315d9c62..4cef7e4706b0 100644
> --- a/kernel/kexec_internal.h
> +++ b/kernel/kexec_internal.h
> @@ -20,22 +20,6 @@ struct kexec_sha_region {
>   unsigned long len;
>  };
>  
> -/*
> - * Keeps track of buffer parameters as provided by caller for requesting
> - * memory placement of buffer.
> - */
> -struct kexec_buf {
> -   

Re: [PATCH v4 2/9] kexec_file: Change kexec_add_buffer to take kexec_buf as argument.

2016-07-10 Thread Dave Young
On 07/07/16 at 01:23pm, Thiago Jung Bauermann wrote:
> Adapt all callers to the new function prototype.
> 
> In addition, change the type of kexec_buf.buffer from char * to void *.
> There is no particular reason for it to be a char *, and the change
> allows us to get rid of 3 existing casts to char * in the code.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: "H. Peter Anvin" 
> Cc: x...@kernel.org
> ---
>  arch/x86/kernel/crash.c   | 37 
>  arch/x86/kernel/kexec-bzimage64.c | 48 +++--
>  include/linux/kexec.h |  8 +---
>  kernel/kexec_file.c   | 88 
> ++-
>  4 files changed, 87 insertions(+), 94 deletions(-)
> 
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 9ef978d69c22..dc026ea361dc 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -615,9 +615,9 @@ static int determine_backup_region(u64 start, u64 end, 
> void *arg)
>  
>  int crash_load_segments(struct kimage *image)
>  {
> - unsigned long src_start, src_sz, elf_sz;
> - void *elf_addr;
>   int ret;
> + struct kexec_buf kbuf = { .image = image, .buf_min = 0,
> +   .buf_max = ULONG_MAX, .top_down = false };
>  
>   /*
>* Determine and load a segment for backup area. First 640K RAM
> @@ -631,43 +631,44 @@ int crash_load_segments(struct kimage *image)
>   if (ret < 0)
>   return ret;
>  
> - src_start = image->arch.backup_src_start;
> - src_sz = image->arch.backup_src_sz;
> -
>   /* Add backup segment. */
> - if (src_sz) {
> + if (image->arch.backup_src_sz) {
> + kbuf.buffer = &crash_zero_bytes;
> + kbuf.bufsz = sizeof(crash_zero_bytes);
> + kbuf.memsz = image->arch.backup_src_sz;
> + kbuf.buf_align = PAGE_SIZE;
>   /*
>* Ideally there is no source for backup segment. This is
>* copied in purgatory after crash. Just add a zero filled
>* segment for now to make sure checksum logic works fine.
>*/
> - ret = kexec_add_buffer(image, (char *)&crash_zero_bytes,
> -sizeof(crash_zero_bytes), src_sz,
> -PAGE_SIZE, 0, -1, 0,
> -&image->arch.backup_load_addr);
> + ret = kexec_add_buffer(&kbuf);
>   if (ret)
>   return ret;
> + image->arch.backup_load_addr = kbuf.mem;
>   pr_debug("Loaded backup region at 0x%lx backup_start=0x%lx 
> memsz=0x%lx\n",
> -  image->arch.backup_load_addr, src_start, src_sz);
> +  image->arch.backup_load_addr,
> +  image->arch.backup_src_start, kbuf.memsz);
>   }
>  
>   /* Prepare elf headers and add a segment */
> - ret = prepare_elf_headers(image, &elf_addr, &elf_sz);
> + ret = prepare_elf_headers(image, &kbuf.buffer, &kbuf.bufsz);
>   if (ret)
>   return ret;
>  
> - image->arch.elf_headers = elf_addr;
> - image->arch.elf_headers_sz = elf_sz;
> + image->arch.elf_headers = kbuf.buffer;
> + image->arch.elf_headers_sz = kbuf.bufsz;
>  
> - ret = kexec_add_buffer(image, (char *)elf_addr, elf_sz, elf_sz,
> - ELF_CORE_HEADER_ALIGN, 0, -1, 0,
> - &image->arch.elf_load_addr);
> + kbuf.memsz = kbuf.bufsz;
> + kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
> + ret = kexec_add_buffer(&kbuf);
>   if (ret) {
>   vfree((void *)image->arch.elf_headers);
>   return ret;
>   }
> + image->arch.elf_load_addr = kbuf.mem;
>   pr_debug("Loaded ELF headers at 0x%lx bufsz=0x%lx memsz=0x%lx\n",
> -  image->arch.elf_load_addr, elf_sz, elf_sz);
> +  image->arch.elf_load_addr, kbuf.bufsz, kbuf.bufsz);
>  
>   return ret;
>  }
> diff --git a/arch/x86/kernel/kexec-bzimage64.c 
> b/arch/x86/kernel/kexec-bzimage64.c
> index f2356bda2b05..4b3a75329fb6 100644
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -331,17 +331,17 @@ static void *bzImage64_load(struct kimage *image, char 
> *kernel,
>  
>   struct setup_header *header;
>   int setup_sects, kern16_size, ret = 0;
> - unsigned long setup_header_size, para

Re: [PATCH v4 3/9] kexec_file: Factor out kexec_locate_mem_hole from kexec_add_buffer.

2016-07-10 Thread Dave Young
On 07/07/16 at 01:23pm, Thiago Jung Bauermann wrote:
> kexec_locate_mem_hole will be used by the PowerPC kexec_file_load
> implementation to find free memory for the purgatory stack.
> 
> Signed-off-by: Thiago Jung Bauermann 
> Cc: Eric Biederman 
> Cc: Dave Young 
> ---
>  include/linux/kexec.h |  1 +
>  kernel/kexec_file.c   | 25 -
>  2 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index ff5fa7707bd7..803f563df81d 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -174,6 +174,7 @@ struct kexec_buf {
>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
>  int (*func)(u64, u64, void *));
>  extern int kexec_add_buffer(struct kexec_buf *kbuf);
> +int kexec_locate_mem_hole(struct kexec_buf *kbuf);
>  #endif /* CONFIG_KEXEC_FILE */
>  
>  struct kimage {
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 82ccfc4ee97e..3125d1689712 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -450,6 +450,23 @@ int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
>  }
>  
>  /**
> + * kexec_locate_mem_hole - find free memory for the purgatory or the next 
> kernel
> + * @kbuf:Parameters for the memory search.
> + *
> + * On success, kbuf->mem will have the start address of the memory region 
> found.
> + *
> + * Return: 0 on success, negative errno on error.
> + */
> +int kexec_locate_mem_hole(struct kexec_buf *kbuf)
> +{
> + int ret;
> +
> + ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> +
> + return ret == 1 ? 0 : -EADDRNOTAVAIL;
> +}
> +
> +/**
>   * kexec_add_buffer - place a buffer in a kexec segment
>   * @kbuf:Buffer contents and memory parameters.
>   *
> @@ -489,11 +506,9 @@ int kexec_add_buffer(struct kexec_buf *kbuf)
>   kbuf->buf_align = max(kbuf->buf_align, PAGE_SIZE);
>  
>   /* Walk the RAM ranges and allocate a suitable range for the buffer */
> - ret = arch_kexec_walk_mem(kbuf, locate_mem_hole_callback);
> - if (ret != 1) {
> - /* A suitable memory range could not be found for buffer */
> - return -EADDRNOTAVAIL;
> - }
> + ret = kexec_locate_mem_hole(kbuf);
> + if (ret)
> + return ret;
>  
>   /* Found a suitable memory range */
>   ksegment = &kbuf->image->segment[kbuf->image->nr_segments];
> -- 
> 1.9.1
> 

Acked-by: Dave Young 

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC] arm64: kexec_file_load support

2016-07-11 Thread Dave Young
On 07/11/16 at 04:19pm, AKASHI Takahiro wrote:
> On Fri, Jul 08, 2016 at 11:48:44AM -0300, Thiago Jung Bauermann wrote:
> > Am Donnerstag, 07 Juli 2016, 14:12:45 schrieb Dave Young:
> > > If so maybe change a bit from your precious mentioned 7 args proposal like
> > > below?
> > > 
> > > struct kexec_file_fd {
> > >   enum kexec_file_type;
> > >   int fd;
> > > }
> > > 
> > > struct kexec_fdset {
> > >   int nr_fd;
> > >   struct kexec_file_fd fd[0];
> > > }
> > > 
> > > int kexec_file_load(int kernel_fd, int initrd_fd,
> > >   unsigned long cmdline_len, const char *cmdline_ptr,
> > >   unsigned long flags, struct kexec_fdset *extra_fds);
> > 
> > 
> > Is there a way for the kernel to distinguish whether the process passed 5 
> > or 
> > 6 arguments? How can it know whether extra_fds is a valid argument or just 
> > garbage? I think we have to define a new flag KEXEC_FILE_EXTRA_FDS so that 
> > the process can signal that it is using the new interface.
> 
> Good point. What I had in my mind is "open" system call.
> Glibc's implementation of open() checks for the second argument, flags,
> to determine whether or not the third argument, mode, is required.
> 
> So our current consensus is that, for the time being, we will extend
> the existing system call interface to allow extra file descriptors,
> and *if* we really need a different type of parameters in the future,
> we will add another system call.

Sounds good to me.

> 
> If we all agree, I can post a kernel patch for this change as a RFC.
> (I've already verified it with my prototype of kexec_file_load support
> on arm64.)
> 
> Thanks,
> -Takahiro AKASHI
> 
> > -- 
> > []'s
> > Thiago Jung Bauermann
> > IBM Linux Technology Center
> > 

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-12 Thread Dave Young
On 07/12/16 at 03:50pm, Mark Rutland wrote:
> On Tue, Jul 12, 2016 at 04:24:10PM +0200, Arnd Bergmann wrote:
> > On Tuesday, July 12, 2016 10:18:11 AM CEST Vivek Goyal wrote:
> > > > 
> > > > On Open Firmware, the DT is extracted from running firmware and copied
> > > > into dynamically allocated data structures. After a kexec, the runtime
> > > > interface to the firmware is not available, so the flattened DT format
> > > > was created as a way to pass the same data in a binary blob to the new
> > > > kernel in a format that can be read from the kernel by walking the
> > > > directories in /proc/device-tree/*.
> > > 
> > > So this DT is available inside kernel and running kernel can still
> > > retrieve it and pass it to second kernel?
> > 
> > The kernel only uses the flattened DT blob at boot time and converts
> > it into the runtime data structures (struct device_node). The original
> > dtb is typically overwritten later.
> 
> On arm64 we deliberately preserved the DTB, so we can take that and
> build a new DTB from that kernel-side.
> 
> > > > - we typically ship devicetree sources for embedded machines with the
> > > >   kernel sources. As more hardware of the system gets enabled, the
> > > >   devicetree gains extra nodes and properties that describe the hardware
> > > >   more completely, so we need to use the latest DT blob to use all
> > > >   the drivers
> > > > 
> > > > - in some cases, kernels will fail to boot at all with an older version
> > > >   of the DT, or fail to use the devices that were working on the
> > > >   earlier kernel. This is usually considered a bug, but it's not rare
> > > > 
> > > > - In some cases, the kernel can update its DT at runtime, and the new
> > > >   settings are expected to be available in the new kernel too, though
> > > >   there are cases where you actually don't want the modified contents.
> > > 
> > > I am assuming that modified DT and unmodifed one both are accessible to
> > > kernel. And if user space can make decisions which modfied fields to use
> > > for new kernels and which ones not, then same can be done in kernel too?
> > 
> > The unmodified DT can typically be found on disk next to the kernel binary.
> > The option you have is to either read it from /proc/devicetree or to
> > read it from from /boot/*.dtb.
> 
> /proc/devicetree (aka /sys/firmware/devicetree) is a filesystem derived
> from the raw DTB (which is exposed at /sys/firmware/fdt).
> 
> The blob that was handed to the kernel at boot time is exposed at
> /sys/firmware/fdt.

I believe the blob can be read and passed to kexec kernel in kernel code without
the extra fd.

But consider we can kexec to a different kernel and a different initrd so there
will be use cases to pass a total different dtb as well. From my understanding
it is reasonable but yes I think we should think carefully about the design.

Thanks
Dave

> Thanks,
> Mark.
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Dave Young
[snip]
> Now, going back to the more fundamental issue raised in my first reply,
> about the kernel command line.
> 
> On x86, I can see that it _is_ possible for userspace to specify a
> command line, and the kernel loading the image provides the command
> line to the to-be-kexeced kernel with very little checking.  So, if
> your kernel is signed, what stops the "insecure userspace" loading
> a signed kernel but giving it an insecure rootfs and/or console?

The kexec_file_load syscall was introduced for secure boot in the first
place. In case UEFI secure boot the signature verification chain only
covers kernel mode binaries. I think there is such problem in both normal
boot and kexec boot.

Thanks
Dave

___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Dave Young
On 07/13/16 at 10:34am, Mark Rutland wrote:
> On Wed, Jul 13, 2016 at 10:36:14AM +0800, Dave Young wrote:
> > But consider we can kexec to a different kernel and a different initrd so 
> > there
> > will be use cases to pass a total different dtb as well.
> 
> It depends on what you mean by "a different kernel", and what this
> implies for the DTB.
> 

I thought about kexec as a boot loader just like other bootloaders.
So just like a normal boot kexec should also accept external dtb.
But acutally kexec is different because it can get original dtb and
use it. So I agreed if we can not find a real use case that we have
to extend it we should keep current interface.

> I expect future arm64 Linux kernels to function with today's DTBs, and
> the existing boot protocol. The kexec_file_load syscall already has
> enough information for the kernel to inject the initrd and bootargs
> properties into a DTB.
> 
> In practice on x86 today, kexec_file_load only supports booting to a
> Linux kernel, because the in-kernel purgatory only implements the x86
> Linux boot protocol. Analagously, for arm64 I think that the first
> kernel should use its internal copy of the boot DTB, with /chosen fixed
> up appropriately, assuming the next kernel is an arm64 Linux image.
> 
> If booting another OS, the only parts of the DTB I would expect to
> change are the properties under chosen, as everything else *should* be
> OS-independent. However the other OS may have a completely different
> boot protocol, might not even take a DTB, and will likely need a
> compeltely different purgatory implementation. So just allowing the DTB
> to be altered isn't sufficient for that case.
> 
> There might be cases where we want a different DTB, but as far as I can
> tell we have nothing analagous on x86 today. If we do need this, we
> should have an idea of what real case(s) were trying to solve.

Agreed.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 0/3] extend kexec_file_load system call

2016-07-13 Thread Dave Young
On 07/14/16 at 02:38am, AKASHI Takahiro wrote:
> Apologies for the slow response. I'm attending LinuxCon this week.
> 
> On Wed, Jul 13, 2016 at 10:34:47AM +0100, Mark Rutland wrote:
> > On Wed, Jul 13, 2016 at 10:36:14AM +0800, Dave Young wrote:
> > > But consider we can kexec to a different kernel and a different initrd so 
> > > there
> > > will be use cases to pass a total different dtb as well.
> > 
> > It depends on what you mean by "a different kernel", and what this
> > implies for the DTB.
> > 
> > I expect future arm64 Linux kernels to function with today's DTBs, and
> > the existing boot protocol. The kexec_file_load syscall already has
> > enough information for the kernel to inject the initrd and bootargs
> > properties into a DTB.
> > 
> > In practice on x86 today, kexec_file_load only supports booting to a
> > Linux kernel, because the in-kernel purgatory only implements the x86
> > Linux boot protocol. Analagously, for arm64 I think that the first
> > kernel should use its internal copy of the boot DTB, with /chosen fixed
> > up appropriately, assuming the next kernel is an arm64 Linux image.
> > 
> > If booting another OS, the only parts of the DTB I would expect to
> > change are the properties under chosen, as everything else *should* be
> > OS-independent. However the other OS may have a completely different
> > boot protocol, might not even take a DTB, and will likely need a
> > compeltely different purgatory implementation. So just allowing the DTB
> > to be altered isn't sufficient for that case.
> > 
> > There might be cases where we want a different DTB, but as far as I can
> > tell we have nothing analagous on x86 today. If we do need this, we
> > should have an idea of what real case(s) were trying to solve.
> 
> What I had in my mind was:
> 
> - Kdump
>   As Russel said, we definitely need to modify dtb.
>   In addition to bootargs and initrd proerties (FYI, in my arm64
>   implementation for arm64, eflcorehdr info is also passed as DT
>   property), we may want to remove unnecessary devices and
>   even add a dedicated storage device for storing a core dump image.
> - Say, booting BE kernel on ACPI LE kernel
>   In this case, there is no useful dtb in the kernel.
> 
> Have said that, as Mark said, we may be able to use normal kexec_load
> system call if we don't need a "secure" kexec.
> 
> BTW, why doesn't the current kexec_load have ability of verifying
> a signature of initramfs image? Is IMA/EVM expected to be used
> at runtime?

I believe there are some limitations for verify signatures in kexec_load.
First kexec-tools need to be trusted, but there's no way to sign and
verify signature of shared libraries. There maybe other limitations I
can not remember which are also reasons why Vivek moved to current
file based syscall.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 3/3] kexec: extend kexec_file_load system call

2016-07-17 Thread Dave Young
On 07/15/16 at 02:19pm, Mark Rutland wrote:
> On Fri, Jul 15, 2016 at 09:09:55AM -0400, Vivek Goyal wrote:
> > On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:
> > 
> > [..]
> > > -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > > +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > >   unsigned long, cmdline_len, const char __user *, cmdline_ptr,
> > > - unsigned long, flags)
> > > + unsigned long, flags, const struct kexec_fdset __user *, ufdset)
> > 
> > Can one add more parameters to existing syscall. Can it break existing
> > programs with new kernel? I was of the impression that one can't do that.
> > But may be I am missing something.
> 
> I think the idea was that we would only look at the new params if a new
> flags was set, and otherwise it would behave as the old syscall.
> 
> Regardless, I think it makes far more sense to add a kexec_file_load2
> syscall if we're going to modify the prototype at all. It's a rather
> different proposition to the existing syscall, and needs to be treated
> as such.

I do not think it is worth to add another syscall for extra fds.
We have open(2) as an example for different numbers of arguments
already.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 3/3] kexec: extend kexec_file_load system call

2016-07-17 Thread Dave Young
On 07/15/16 at 09:09am, Vivek Goyal wrote:
> On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:
> 
> [..]
> > -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > unsigned long, cmdline_len, const char __user *, cmdline_ptr,
> > -   unsigned long, flags)
> > +   unsigned long, flags, const struct kexec_fdset __user *, ufdset)
> 
> Can one add more parameters to existing syscall. Can it break existing
> programs with new kernel? I was of the impression that one can't do that.
> But may be I am missing something.

It will not break existing programs because we can use the new param only
when the new flag is set.

But we have a case below, but I think it is fine?
Originally kexec_file_load with the new flags will fail, but now it will
succeed and will access the new argument.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RFC 3/3] kexec: extend kexec_file_load system call

2016-07-18 Thread Dave Young
On 07/18/16 at 11:07am, Mark Rutland wrote:
> On Mon, Jul 18, 2016 at 10:30:24AM +0800, Dave Young wrote:
> > On 07/15/16 at 02:19pm, Mark Rutland wrote:
> > > On Fri, Jul 15, 2016 at 09:09:55AM -0400, Vivek Goyal wrote:
> > > > On Tue, Jul 12, 2016 at 10:42:01AM +0900, AKASHI Takahiro wrote:
> > > > 
> > > > [..]
> > > > > -SYSCALL_DEFINE5(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > > > > +SYSCALL_DEFINE6(kexec_file_load, int, kernel_fd, int, initrd_fd,
> > > > >   unsigned long, cmdline_len, const char __user *, 
> > > > > cmdline_ptr,
> > > > > - unsigned long, flags)
> > > > > + unsigned long, flags, const struct kexec_fdset __user 
> > > > > *, ufdset)
> > > > 
> > > > Can one add more parameters to existing syscall. Can it break existing
> > > > programs with new kernel? I was of the impression that one can't do 
> > > > that.
> > > > But may be I am missing something.
> > > 
> > > I think the idea was that we would only look at the new params if a new
> > > flags was set, and otherwise it would behave as the old syscall.
> > > 
> > > Regardless, I think it makes far more sense to add a kexec_file_load2
> > > syscall if we're going to modify the prototype at all. It's a rather
> > > different proposition to the existing syscall, and needs to be treated
> > > as such.
> > 
> > I do not think it is worth to add another syscall for extra fds.
> > We have open(2) as an example for different numbers of arguments
> > already.
> 
> Did we change the syscall interface for that?
> 
> I was under the impression that there was always one underlying syscall,
> and the C library did the right thing to pass the expected information
> to the underlying syscall.

I'm not sure kexec_load and kexec_file_load were included in glibc, we use
syscall directly in kexec-tools.

kexec_load man pages says there are no wrappers for both kexec_load and
kexec_file_load in glibc.

> 
> That's rather different to changing the underlying syscall.
> 
> Regardless of how this is wrapped in userspace, I do not think modifying
> the existing prototype is a good idea, and I think this kind of
> extension needs to be a new syscall.

Hmm, as I replied to Vivek, there is one case about the flags, previously
the new flag will be regarded as invalid, but not we extend it it will be
valid, this maybe the only potential bad case.

Thanks
Dave
___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [RESEND][PATCH v2 1/2] kexec: refactor code parsing size based on memory range

2016-08-04 Thread Dave Young
Hi Hari,

On 08/04/16 at 01:03am, Hari Bathini wrote:
> crashkernel parameter supports different syntaxes to specify the amount
> of memory to be reserved for kdump kernel. Below is one of the supported
> syntaxes that needs parsing to find the memory size to reserve, based on
> memory range:
> 
> crashkernel=:[,:,...]
> 
> While such parsing is implemented for crashkernel parameter, it applies
> to other parameters, like fadump_reserve_mem=, which could use similar
> syntax. This patch moves crashkernel's parsing code for above syntax to
> to kernel/params.c file for reuse. Two functions is_param_range_based()
> and parse_mem_range_size() are added to kernel/params.c file for this
> purpose.
> 
> Any parameter that uses the above syntax can use is_param_range_based()
> function to validate the syntax and parse_mem_range_size() function to
> get the parsed memory size. While some code is moved to kernel/params.c
> file, there is no change functionality wise in parsing the crashkernel
> parameter.
> 
> Signed-off-by: Hari Bathini 
> ---
> 
> Changes from v1:
> 1. Updated changelog
> 
>  include/linux/kernel.h |5 +++
>  kernel/kexec_core.c|   63 +++-
>  kernel/params.c|   96 
> 
>  3 files changed, 106 insertions(+), 58 deletions(-)
> 
> diff --git a/include/linux/kernel.h b/include/linux/kernel.h
> index d96a611..2df7ba2 100644
> --- a/include/linux/kernel.h
> +++ b/include/linux/kernel.h
> @@ -435,6 +435,11 @@ extern char *get_options(const char *str, int nints, int 
> *ints);
>  extern unsigned long long memparse(const char *ptr, char **retptr);
>  extern bool parse_option_str(const char *str, const char *option);
>  
> +extern bool __init is_param_range_based(const char *cmdline);
> +extern unsigned long long __init parse_mem_range_size(const char *param,
> +   char **str,
> +   unsigned long long 
> system_ram);
> +
>  extern int core_kernel_text(unsigned long addr);
>  extern int core_kernel_data(unsigned long addr);
>  extern int __kernel_text_address(unsigned long addr);
> diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
> index 5616755..3a74024 100644
> --- a/kernel/kexec_core.c
> +++ b/kernel/kexec_core.c
> @@ -1104,59 +1104,9 @@ static int __init parse_crashkernel_mem(char *cmdline,
>   char *cur = cmdline, *tmp;
>  
>   /* for each entry of the comma-separated list */
> - do {
> - unsigned long long start, end = ULLONG_MAX, size;
> -
> - /* get the start of the range */
> - start = memparse(cur, &tmp);
> - if (cur == tmp) {
> - pr_warn("crashkernel: Memory value expected\n");
> - return -EINVAL;
> - }
> - cur = tmp;
> - if (*cur != '-') {
> - pr_warn("crashkernel: '-' expected\n");
> - return -EINVAL;
> - }
> - cur++;
> -
> - /* if no ':' is here, than we read the end */
> - if (*cur != ':') {
> - end = memparse(cur, &tmp);
> - if (cur == tmp) {
> - pr_warn("crashkernel: Memory value expected\n");
> - return -EINVAL;
> - }
> - cur = tmp;
> - if (end <= start) {
> - pr_warn("crashkernel: end <= start\n");
> - return -EINVAL;
> - }
> - }
> -
> - if (*cur != ':') {
> - pr_warn("crashkernel: ':' expected\n");
> - return -EINVAL;
> - }
> - cur++;
> -
> - size = memparse(cur, &tmp);
> - if (cur == tmp) {
> - pr_warn("Memory value expected\n");
> - return -EINVAL;
> - }
> - cur = tmp;
> - if (size >= system_ram) {
> - pr_warn("crashkernel: invalid size\n");
> - return -EINVAL;
> - }
> -
> - /* match ? */
> - if (system_ram >= start && system_ram < end) {
> - *crash_size = size;
> - break;
> - }
> - } while (*cur++ == ',');
> + *crash_size = parse_mem_range_size("crashkernel", &cur, system_ram);
> + if (cur == cmdline)
> + return -EINVAL;
>  
>   if (*crash_size > 0) {
>   while (*cur && *cur != ' ' && *cur != '@')
> @@ -1293,7 +1243,6 @@ static int __init __parse_crashkernel(char *cmdline,
>const char *name,
>const char *suffix)
>  {
> - char*first_colon, *first_space;
>   char*ck_cmdline;
>  
>   BUG_ON(!crash_size || !crash_base);
>

Re: [PATCH v2 0/6] kexec_file: Add buffer hand-over for the next kernel

2016-08-16 Thread Dave Young
On 08/13/16 at 12:18am, Thiago Jung Bauermann wrote:
> Hello,
> 
> This patch series implements a mechanism which allows the kernel to pass
> on a buffer to the kernel that will be kexec'd. This buffer is passed
> as a segment which is added to the kimage when it is being prepared
> by kexec_file_load.
> 
> How the second kernel is informed of this buffer is architecture-specific.
> On powerpc, this is done via the device tree, by checking
> the properties /chosen/linux,kexec-handover-buffer-start and
> /chosen/linux,kexec-handover-buffer-end, which is analogous to how the
> kernel finds the initrd.
> 
> This is needed because the Integrity Measurement Architecture subsystem
> needs to preserve its measurement list accross the kexec reboot. The
> following patch series for the IMA subsystem uses this feature for that
> purpose:
> 
> https://lists.infradead.org/pipermail/kexec/2016-August/016745.html
> 
> This is so that IMA can implement trusted boot support on the OpenPower
> platform, because on such systems an intermediary Linux instance running
> as part of the firmware is used to boot the target operating system via
> kexec. Using this mechanism, IMA on this intermediary instance can
> hand over to the target OS the measurements of the components that were
> used to boot it.
> 
> Because there could be additional measurement events between the
> kexec_file_load call and the actual reboot, IMA needs a way to update the
> buffer with those additional events before rebooting. One can minimize
> the interval between the kexec_file_load and the reboot syscalls, but as
> small as it can be, there is always the possibility that the measurement
> list will be out of date at the time of reboot.
> 
> To address this issue, this patch series also introduces
> kexec_update_segment, which allows a reboot notifier to change the
> contents of the image segment during the reboot process.
> 
> Patch 5 makes kimage_load_normal_segment and kexec_update_segment share
> code. It's not much code that they can share though, so I'm not sure if
> the result is actually better.
> 
> The last patch is not intended to be merged, it just demonstrates how
> this feature can be used.
> 
> This series applies on top of v5 of the "kexec_file_load implementation
> for PowerPC" patch series (which applies on top of v4.8-rc1):
> 
> https://lists.infradead.org/pipermail/kexec/2016-August/016843.html

I'm trying to review your patches, but seems I can not apply them
cleanly to mainline kernel or v4.8-rc1

Apply the kexec_file_load series failed as below on v4.8-rc1:

Applying: kexec_file: Allow arch-specific memory walking for
kexec_add_buffer
error: patch failed: include/linux/kexec.h:149
error: include/linux/kexec.h: patch does not apply
Patch failed at 0001 kexec_file: Allow arch-specific memory walking for
kexec_add_buffer
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

What is the order of your patch series of the three patchset?

[PATCH v2 0/2] extend kexec_file_load system call
[PATCH v5 00/13] kexec_file_load implementation for PowerPC
[PATCH v2 0/6] kexec_file: Add buffer hand-over for the next kernel

Do they depend on other patches?

Thanks
Dave


Re: [PATCH v2 2/2] kexec: extend kexec_file_load system call

2016-08-18 Thread Dave Young
Since Eric was objecting the extension, I think you should convince him,
but I will review from code point of view.

On 08/11/16 at 08:03pm, Thiago Jung Bauermann wrote:
> From: AKASHI Takahiro 
> 
> Device tree blob must be passed to a second kernel on DTB-capable
> archs, like powerpc and arm64, but the current kernel interface
> lacks this support.
> 
> This patch extends kexec_file_load system call by adding an extra
> argument to this syscall so that an arbitrary number of file descriptors
> can be handed out from user space to the kernel.
> 
>   long sys_kexec_file_load(int kernel_fd, int initrd_fd,
>unsigned long cmdline_len,
>const char __user *cmdline_ptr,
>unsigned long flags,
>const struct kexec_fdset __user *ufdset);
> 
> If KEXEC_FILE_EXTRA_FDS is set to the "flags" argument, the "ufdset"
> argument points to the following struct buffer:
> 
>   struct kexec_fdset {
>   int nr_fds;
>   struct kexec_file_fd fds[0];
>   }
> 
> Signed-off-by: AKASHI Takahiro 
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  include/linux/fs.h |  1 +
>  include/linux/kexec.h  |  7 ++--
>  include/linux/syscalls.h   |  4 ++-
>  include/uapi/linux/kexec.h | 22 
>  kernel/kexec_file.c| 83 
> ++
>  5 files changed, 108 insertions(+), 9 deletions(-)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 3523bf62f328..847d9c31f428 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2656,6 +2656,7 @@ extern int do_pipe_flags(int *, int);
>   id(MODULE, kernel-module)   \
>   id(KEXEC_IMAGE, kexec-image)\
>   id(KEXEC_INITRAMFS, kexec-initramfs)\
> + id(KEXEC_PARTIAL_DTB, kexec-partial-dtb)\
>   id(POLICY, security-policy) \
>   id(MAX_ID, )
>  
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index 4f85d284ed0b..29202935055d 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -148,7 +148,10 @@ struct kexec_file_ops {
>   kexec_verify_sig_t *verify_sig;
>  #endif
>  };
> -#endif
> +
> +int __weak arch_kexec_verify_buffer(enum kexec_file_type type, const void 
> *buf,
> + unsigned long size);
> +#endif /* CONFIG_KEXEC_FILE */
>  
>  struct kimage {
>   kimage_entry_t head;
> @@ -280,7 +283,7 @@ extern int kexec_load_disabled;
>  
>  /* List of defined/legal kexec file flags */
>  #define KEXEC_FILE_FLAGS (KEXEC_FILE_UNLOAD | KEXEC_FILE_ON_CRASH | \
> -  KEXEC_FILE_NO_INITRAMFS)
> +  KEXEC_FILE_NO_INITRAMFS | KEXEC_FILE_EXTRA_FDS)
>  
>  #define VMCOREINFO_BYTES   (4096)
>  #define VMCOREINFO_NOTE_NAME   "VMCOREINFO"
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index d02239022bd0..fc072bdb74e3 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -66,6 +66,7 @@ struct perf_event_attr;
>  struct file_handle;
>  struct sigaltstack;
>  union bpf_attr;
> +struct kexec_fdset;
>  
>  #include 
>  #include 
> @@ -321,7 +322,8 @@ asmlinkage long sys_kexec_load(unsigned long entry, 
> unsigned long nr_segments,
>  asmlinkage long sys_kexec_file_load(int kernel_fd, int initrd_fd,
>   unsigned long cmdline_len,
>   const char __user *cmdline_ptr,
> - unsigned long flags);
> + unsigned long flags,
> + const struct kexec_fdset __user *ufdset);
>  
>  asmlinkage long sys_exit(int error_code);
>  asmlinkage long sys_exit_group(int error_code);
> diff --git a/include/uapi/linux/kexec.h b/include/uapi/linux/kexec.h
> index aae5ebf2022b..6279be79efba 100644
> --- a/include/uapi/linux/kexec.h
> +++ b/include/uapi/linux/kexec.h
> @@ -23,6 +23,28 @@
>  #define KEXEC_FILE_UNLOAD0x0001
>  #define KEXEC_FILE_ON_CRASH  0x0002
>  #define KEXEC_FILE_NO_INITRAMFS  0x0004
> +#define KEXEC_FILE_EXTRA_FDS 0x0008
> +
> +enum kexec_file_type {
> + KEXEC_FILE_TYPE_KERNEL,
> + KEXEC_FILE_TYPE_INITRAMFS,
> +
> + /*
> +  * Device Tree Blob containing just the nodes and properties that
> +  * the kexec_file_load caller wants to add or modify.
> +  */
> + KEXEC_FILE_TYPE_PARTIAL_DTB,
> +};
> +
> +struct kexec_file_fd {
> + enum kexec_file_type type;
> + int fd;
> +};
> +
> +struct kexec_fdset {
> + int nr_fds;
> + struct kexec_file_fd fds[0];
> +};
>  
>  /* These values match the ELF architecture values.
>   * Unless there is a good reason that should continue to be the case.
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 113af2f219b9..d6803dd884e2 100644
> --- a/kernel/kexec_file.c
> +++ b/

Re: [PATCH v2 1/2] kexec: add dtb info to struct kimage

2016-08-18 Thread Dave Young
On 08/11/16 at 08:03pm, Thiago Jung Bauermann wrote:
> From: AKASHI Takahiro 
> 
> Device tree blob must be passed to a second kernel on DTB-capable
> archs, like powerpc and arm64, but the current kernel interface
> lacks this support.
> 
> This patch adds dtb buffer information to struct kimage.
> When users don't specify dtb explicitly and the one used for the current
> kernel can be re-used, this change will be good enough for implementing
> kexec_file_load feature.
> 
> Signed-off-by: AKASHI Takahiro 
> ---
>  include/linux/kexec.h | 3 +++
>  kernel/kexec_file.c   | 3 +++
>  2 files changed, 6 insertions(+)
> 
> diff --git a/include/linux/kexec.h b/include/linux/kexec.h
> index d743baaa..4f85d284ed0b 100644
> --- a/include/linux/kexec.h
> +++ b/include/linux/kexec.h
> @@ -192,6 +192,9 @@ struct kimage {
>   char *cmdline_buf;
>   unsigned long cmdline_buf_len;
>  
> + void *dtb_buf;
> + unsigned long dtb_buf_len;
> +
>   /* File operations provided by image loader */
>   struct kexec_file_ops *fops;
>  
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 503bc2d348e5..113af2f219b9 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -92,6 +92,9 @@ void kimage_file_post_load_cleanup(struct kimage *image)
>   vfree(image->initrd_buf);
>   image->initrd_buf = NULL;
>  
> + vfree(image->dtb_buf);
> + image->dtb_buf = NULL;
> +
>   kfree(image->cmdline_buf);
>   image->cmdline_buf = NULL;
>  
> -- 
> 1.9.1
> 
> 
> ___
> kexec mailing list
> ke...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Acked-by: Dave Young 

Thanks
Dave


Re: [PATCH v2 3/6] kexec_file: Allow skipping checksum calculation for some segments.

2016-08-18 Thread Dave Young
On 08/13/16 at 12:18am, Thiago Jung Bauermann wrote:
> Adds checksum argument to kexec_add_buffer specifying whether the given
> segment should be part of the checksum calculation.
> 

Since it is used with add buffer, could it be added to kbuf as a new
field?

Like kbuf.no_checksum, default value is 0 that means checksum is needed
if it is 1 then no need a checksum.

> The next patch will add a way to update segments after a kimage is loaded.
> Segments that will be updated in this way should not be checksummed,
> otherwise they will cause the purgatory checksum verification to fail
> when the machine is rebooted.
> 
> As a bonus, we don't need to special-case the purgatory segment anymore
> to avoid checksumming it.
> 
> Adjust call sites for the new argument.
> 
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  arch/powerpc/kernel/kexec_elf_64.c |  6 +++---
>  arch/x86/kernel/crash.c|  4 ++--
>  arch/x86/kernel/kexec-bzimage64.c  |  6 +++---
>  include/linux/kexec.h  | 10 +++---
>  kernel/kexec_file.c| 23 ---
>  5 files changed, 27 insertions(+), 22 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/kexec_elf_64.c 
> b/arch/powerpc/kernel/kexec_elf_64.c
> index 22afc7b5ee73..4c528c81b076 100644
> --- a/arch/powerpc/kernel/kexec_elf_64.c
> +++ b/arch/powerpc/kernel/kexec_elf_64.c
> @@ -128,7 +128,7 @@ static int elf_exec_load(struct kimage *image, struct 
> elfhdr *ehdr,
>   kbuf.memsz = phdr->p_memsz;
>   kbuf.buf_align = phdr->p_align;
>   kbuf.buf_min = phdr->p_paddr + base;
> - ret = kexec_add_buffer(&kbuf);
> + ret = kexec_add_buffer(&kbuf, true);
>   if (ret)
>   goto out;
>   load_addr = kbuf.mem;
> @@ -188,7 +188,7 @@ void *elf64_load(struct kimage *image, char *kernel_buf,
>   kbuf.bufsz = kbuf.memsz = initrd_len;
>   kbuf.buf_align = PAGE_SIZE;
>   kbuf.top_down = false;
> - ret = kexec_add_buffer(&kbuf);
> + ret = kexec_add_buffer(&kbuf, true);
>   if (ret)
>   goto out;
>   initrd_load_addr = kbuf.mem;
> @@ -245,7 +245,7 @@ void *elf64_load(struct kimage *image, char *kernel_buf,
>   kbuf.bufsz = kbuf.memsz = fdt_size;
>   kbuf.buf_align = PAGE_SIZE;
>   kbuf.top_down = true;
> - ret = kexec_add_buffer(&kbuf);
> + ret = kexec_add_buffer(&kbuf, true);
>   if (ret)
>   goto out;
>   fdt_load_addr = kbuf.mem;
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 38a1cdf6aa05..634ab16377b1 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -642,7 +642,7 @@ int crash_load_segments(struct kimage *image)
>* copied in purgatory after crash. Just add a zero filled
>* segment for now to make sure checksum logic works fine.
>*/
> - ret = kexec_add_buffer(&kbuf);
> + ret = kexec_add_buffer(&kbuf, true);
>   if (ret)
>   return ret;
>   image->arch.backup_load_addr = kbuf.mem;
> @@ -661,7 +661,7 @@ int crash_load_segments(struct kimage *image)
>  
>   kbuf.memsz = kbuf.bufsz;
>   kbuf.buf_align = ELF_CORE_HEADER_ALIGN;
> - ret = kexec_add_buffer(&kbuf);
> + ret = kexec_add_buffer(&kbuf, true);
>   if (ret) {
>   vfree((void *)image->arch.elf_headers);
>   return ret;
> diff --git a/arch/x86/kernel/kexec-bzimage64.c 
> b/arch/x86/kernel/kexec-bzimage64.c
> index 4b3a75329fb6..a46e3fbb0639 100644
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -422,7 +422,7 @@ static void *bzImage64_load(struct kimage *image, char 
> *kernel,
>   kbuf.memsz = kbuf.bufsz;
>   kbuf.buf_align = 16;
>   kbuf.buf_min = MIN_BOOTPARAM_ADDR;
> - ret = kexec_add_buffer(&kbuf);
> + ret = kexec_add_buffer(&kbuf, true);
>   if (ret)
>   goto out_free_params;
>   bootparam_load_addr = kbuf.mem;
> @@ -435,7 +435,7 @@ static void *bzImage64_load(struct kimage *image, char 
> *kernel,
>   kbuf.memsz = PAGE_ALIGN(header->init_size);
>   kbuf.buf_align = header->kernel_alignment;
>   kbuf.buf_min = MIN_KERNEL_LOAD_ADDR;
> - ret = kexec_add_buffer(&kbuf);
> + ret = kexec_add_buffer(&kbuf, true);
>   if (ret)
>   goto out_free_params;
>   kernel_load_addr = kbuf.mem;
> @@ -449,7 +449,7 @@ static void *bzImage64_load(struct kimage *image, char 
> *kernel,
>   kbuf.bufsz = kbuf.memsz = initrd_len;
>   kbuf.buf_align = PAGE_SIZE;
>   kbuf.buf_min = MIN_INITRD_LOAD_ADDR;
> - ret = kexec_add_buffer(&kbuf);
> + ret = kexec_add_buffer(&kbuf, true);
>   if (ret)
>   goto out_free_params;
>   initrd_load_addr = kbuf.mem;
> dif

Re: [PATCH v2 3/6] kexec_file: Allow skipping checksum calculation for some segments.

2016-08-21 Thread Dave Young
On 08/18/16 at 06:09pm, Thiago Jung Bauermann wrote:
> Hello Dave,
> 
> Thanks for your review!
> 
> [ Trimming down Cc: list a little to try to clear the "too many recipients"   
>   mailing list restriction. ]

I also got "too many recipients".. Thanks for the trimming.

> 
> Am Donnerstag, 18 August 2016, 17:03:30 schrieb Dave Young:
> > On 08/13/16 at 12:18am, Thiago Jung Bauermann wrote:
> > > Adds checksum argument to kexec_add_buffer specifying whether the given
> > > segment should be part of the checksum calculation.
> > 
> > Since it is used with add buffer, could it be added to kbuf as a new
> > field?
> 
> I was on the fence about adding it as a new argument to kexec_add_buffer or 
> as a new field to struct kexec_buf. Both alternatives make sense to me. I 
> implemented your suggestion in the patch below, what do you think?
> 
> > Like kbuf.no_checksum, default value is 0 that means checksum is needed
> > if it is 1 then no need a checksum.
> 
> It's an interesting idea and I implemented it that way, though in practice 
> all current users of struct kexec_buf put it on the stack so the field needs 
> to be initialized explicitly.
> 
> -- 
> []'s
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 
> 
> Subject: [PATCH v2 3/6] kexec_file: Allow skipping checksum calculation for
>  some segments.
> 
> Add skip_checksum member to struct kexec_buf to specify whether the
> corresponding segment should be part of the checksum calculation.
> 
> The next patch will add a way to update segments after a kimage is loaded.
> Segments that will be updated in this way should not be checksummed,
> otherwise they will cause the purgatory checksum verification to fail
> when the machine is rebooted.
> 
> As a bonus, we don't need to special-case the purgatory segment anymore
> to avoid checksumming it.
> 
> Adjust places using struct kexec_buf to set skip_checksum.
> 
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  arch/powerpc/kernel/kexec_elf_64.c |  5 +++--
>  arch/x86/kernel/crash.c|  3 ++-
>  arch/x86/kernel/kexec-bzimage64.c  |  2 +-
>  include/linux/kexec.h  | 23 ++-
>  kernel/kexec_file.c| 15 +++
>  5 files changed, 27 insertions(+), 21 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/kexec_elf_64.c 
> b/arch/powerpc/kernel/kexec_elf_64.c
> index 22afc7b5ee73..d009f5363968 100644
> --- a/arch/powerpc/kernel/kexec_elf_64.c
> +++ b/arch/powerpc/kernel/kexec_elf_64.c
> @@ -107,7 +107,7 @@ static int elf_exec_load(struct kimage *image, struct 
> elfhdr *ehdr,
>   int ret;
>   size_t i;
>   struct kexec_buf kbuf = { .image = image, .buf_max = ppc64_rma_size,
> -   .top_down = false };
> +   .top_down = false, .skip_checksum = false };

No need to set it as false because it will be initialized to 0 by
default?

>  
>   /* Read in the PT_LOAD segments. */
>   for (i = 0; i < ehdr->e_phnum; i++) {
> @@ -162,7 +162,8 @@ void *elf64_load(struct kimage *image, char *kernel_buf,
>   struct elf_info elf_info;
>   struct fdt_reserve_entry *rsvmap;
>   struct kexec_buf kbuf = { .image = image, .buf_min = 0,
> -   .buf_max = ppc64_rma_size };
> +   .buf_max = ppc64_rma_size,
> +   .skip_checksum = false };
>  
>   ret = build_elf_exec_info(kernel_buf, kernel_len, &ehdr, &elf_info);
>   if (ret)
> diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
> index 38a1cdf6aa05..7b8f62c86651 100644
> --- a/arch/x86/kernel/crash.c
> +++ b/arch/x86/kernel/crash.c
> @@ -617,7 +617,8 @@ int crash_load_segments(struct kimage *image)
>  {
>   int ret;
>   struct kexec_buf kbuf = { .image = image, .buf_min = 0,
> -   .buf_max = ULONG_MAX, .top_down = false };
> +   .buf_max = ULONG_MAX, .top_down = false,
> +   .skip_checksum = false };
>  
>   /*
>* Determine and load a segment for backup area. First 640K RAM
> diff --git a/arch/x86/kernel/kexec-bzimage64.c 
> b/arch/x86/kernel/kexec-bzimage64.c
> index 4b3a75329fb6..449f433cd225 100644
> --- a/arch/x86/kernel/kexec-bzimage64.c
> +++ b/arch/x86/kernel/kexec-bzimage64.c
> @@ -341,7 +341,7 @@ static void *bzImage64_load(struct kimage *image, char 
> *kernel,
>   unsigned int setup_hdr_offset = offsetof(struct boot_params, hdr);
>   unsigned int efi_map_offset, efi_map_sz, efi_setup_data_offset;
>   struct kexec_buf 

Re: [PATCH v2 2/6] powerpc: kexec_file: Add buffer hand-over support for the next kernel

2016-08-21 Thread Dave Young
On 08/13/16 at 12:18am, Thiago Jung Bauermann wrote:
> The buffer hand-over mechanism allows the currently running kernel to pass
> data to kernel that will be kexec'd via a kexec segment. The second kernel
> can check whether the previous kernel sent data and retrieve it.
> 
> This is the architecture-specific part.
> 
> Signed-off-by: Thiago Jung Bauermann 
> ---
>  arch/powerpc/include/asm/kexec.h   |  12 +++-
>  arch/powerpc/kernel/kexec_elf_64.c |   2 +-
>  arch/powerpc/kernel/machine_kexec_64.c | 114 
> +++--
>  3 files changed, 120 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/powerpc/include/asm/kexec.h 
> b/arch/powerpc/include/asm/kexec.h
> index 31bc64e07c8f..b20738df26f8 100644
> --- a/arch/powerpc/include/asm/kexec.h
> +++ b/arch/powerpc/include/asm/kexec.h
> @@ -92,12 +92,20 @@ static inline bool kdump_in_progress(void)
>  }
>  
>  #ifdef CONFIG_KEXEC_FILE
> +#define ARCH_HAS_KIMAGE_ARCH
> +
> +struct kimage_arch {
> + phys_addr_t handover_buffer_addr;
> + unsigned long handover_buffer_size;
> +};
> +
>  int setup_purgatory(struct kimage *image, const void *slave_code,
>   const void *fdt, unsigned long kernel_load_addr,
>   unsigned long fdt_load_addr, unsigned long stack_top,
>   int debug);
> -int setup_new_fdt(void *fdt, unsigned long initrd_load_addr,
> -   unsigned long initrd_len, const char *cmdline);
> +int setup_new_fdt(const struct kimage *image, void *fdt,
> +   unsigned long initrd_load_addr, unsigned long initrd_len,
> +   const char *cmdline);
>  bool find_debug_console(const void *fdt, int chosen_node);
>  int merge_partial_dtb(void *to, const void *from);
>  #endif /* CONFIG_KEXEC_FILE */
> diff --git a/arch/powerpc/kernel/kexec_elf_64.c 
> b/arch/powerpc/kernel/kexec_elf_64.c
> index 1b902ad66e2a..22afc7b5ee73 100644
> --- a/arch/powerpc/kernel/kexec_elf_64.c
> +++ b/arch/powerpc/kernel/kexec_elf_64.c
> @@ -219,7 +219,7 @@ void *elf64_load(struct kimage *image, char *kernel_buf,
>   }
>   }
>  
> - ret = setup_new_fdt(fdt, initrd_load_addr, initrd_len, cmdline);
> + ret = setup_new_fdt(image, fdt, initrd_load_addr, initrd_len, cmdline);
>   if (ret)
>   goto out;
>  
> diff --git a/arch/powerpc/kernel/machine_kexec_64.c 
> b/arch/powerpc/kernel/machine_kexec_64.c
> index a484a6346146..190c652e49b7 100644
> --- a/arch/powerpc/kernel/machine_kexec_64.c
> +++ b/arch/powerpc/kernel/machine_kexec_64.c
> @@ -490,6 +490,60 @@ int arch_kimage_file_post_load_cleanup(struct kimage 
> *image)
>   return image->fops->cleanup(image->image_loader_data);
>  }
>  
> +bool kexec_can_hand_over_buffer(void)
> +{
> + return true;
> +}
> +
> +int arch_kexec_add_handover_buffer(struct kimage *image,
> +unsigned long load_addr, unsigned long size)
> +{
> + image->arch.handover_buffer_addr = load_addr;
> + image->arch.handover_buffer_size = size;
> +
> + return 0;
> +}
> +
> +int kexec_get_handover_buffer(void **addr, unsigned long *size)
> +{
> + int ret;
> + u64 start_addr, end_addr;
> +
> + ret = of_property_read_u64(of_chosen,
> +"linux,kexec-handover-buffer-start",
> +&start_addr);
> + if (ret == -EINVAL)
> + return -ENOENT;
> + else if (ret)
> + return -EINVAL;
> +
> + ret = of_property_read_u64(of_chosen, "linux,kexec-handover-buffer-end",
> +&end_addr);
> + if (ret == -EINVAL)
> + return -ENOENT;
> + else if (ret)
> + return -EINVAL;
> +
> + *addr =  __va(start_addr);
> + /* -end is the first address after the buffer. */
> + *size = end_addr - start_addr;
> +
> + return 0;
> +}

This depends on dtb, so if IMA want to extend it to arches like x86 in
the future you will have to think about other way to pass it.

How about think about a general way now?

> +
> +int kexec_free_handover_buffer(void)
> +{
> + int ret;
> + void *addr;
> + unsigned long size;
> +
> + ret = kexec_get_handover_buffer(&addr, &size);
> + if (ret)
> + return ret;
> +
> + return memblock_free((phys_addr_t) addr, size);
> +}
> +
>  /**
>   * arch_kexec_walk_mem() - call func(data) for each unreserved memory block
>   * @kbuf:Context info for the search. Also passed to @func.
> @@ -687,9 +741,52 @@ int setup_purgatory(struct kimage *image, const void 
> *slave_code,
>   return 0;
>  }
>  
> -/*
> - * setup_new_fdt() - modify /chosen and memory reservation for the next 
> kernel
> - * @fdt:
> +/**
> + * setup_handover_buffer() - add properties and reservation for the handover 
> buffer
> + * @image:   kexec image being loaded.
> + * @fdt: Flattened device tree for the next kernel.
> + * @chosen_node: Offset to the chosen node.
> + *
> + * Return:

Re: [PATCH v2 3/6] kexec_file: Allow skipping checksum calculation for some segments.

2016-08-21 Thread Dave Young
On 08/22/16 at 12:25am, Thiago Jung Bauermann wrote:
> Am Montag, 22 August 2016, 11:17:45 schrieb Dave Young:
> > On 08/18/16 at 06:09pm, Thiago Jung Bauermann wrote:
> > > Hello Dave,
> > > 
> > > Thanks for your review!
> > > 
> > > [ Trimming down Cc: list a little to try to clear the "too many
> > > recipients"> 
> > >   mailing list restriction. ]
> > 
> > I also got "too many recipients".. Thanks for the trimming.
> 
> Didn't work though. What is the maximum number of recipients?

I have no idea as well..

> 
> > > Am Donnerstag, 18 August 2016, 17:03:30 schrieb Dave Young:
> > > > On 08/13/16 at 12:18am, Thiago Jung Bauermann wrote:
> > > > > Adds checksum argument to kexec_add_buffer specifying whether the
> > > > > given
> > > > > segment should be part of the checksum calculation.
> > > > 
> > > > Since it is used with add buffer, could it be added to kbuf as a new
> > > > field?
> > > 
> > > I was on the fence about adding it as a new argument to kexec_add_buffer
> > > or as a new field to struct kexec_buf. Both alternatives make sense to
> > > me. I implemented your suggestion in the patch below, what do you
> > > think?> 
> > > > Like kbuf.no_checksum, default value is 0 that means checksum is
> > > > needed
> > > > if it is 1 then no need a checksum.
> > > 
> > > It's an interesting idea and I implemented it that way, though in
> > > practice all current users of struct kexec_buf put it on the stack so
> > > the field needs to be initialized explicitly.
> > 
> > No need to set it as false because it will be initialized to 0 by
> > default?
> 
> As far as I know, variables on the stack are not initialized. Only global 
> and static variables are.

But designated initializers will do it.

Thanks
Dave


Re: [PATCH v2 2/6] powerpc: kexec_file: Add buffer hand-over support for the next kernel

2016-08-22 Thread Dave Young
On 08/22/16 at 12:38am, Thiago Jung Bauermann wrote:
> Am Montag, 22 August 2016, 11:21:35 schrieb Dave Young:
> > On 08/13/16 at 12:18am, Thiago Jung Bauermann wrote:
> > > diff --git a/arch/powerpc/kernel/machine_kexec_64.c
> > > b/arch/powerpc/kernel/machine_kexec_64.c index
> > > a484a6346146..190c652e49b7 100644
> > > --- a/arch/powerpc/kernel/machine_kexec_64.c
> > > +++ b/arch/powerpc/kernel/machine_kexec_64.c
> > > @@ -490,6 +490,60 @@ int arch_kimage_file_post_load_cleanup(struct
> > > kimage *image)> 
> > >   return image->fops->cleanup(image->image_loader_data);
> > >  
> > >  }
> > > 
> > > +bool kexec_can_hand_over_buffer(void)
> > > +{
> > > + return true;
> > > +}
> > > +
> > > +int arch_kexec_add_handover_buffer(struct kimage *image,
> > > +unsigned long load_addr, unsigned long 
> size)
> > > +{
> > > + image->arch.handover_buffer_addr = load_addr;
> > > + image->arch.handover_buffer_size = size;
> > > +
> > > + return 0;
> > > +}
> > > +
> > > +int kexec_get_handover_buffer(void **addr, unsigned long *size)
> > > +{
> > > + int ret;
> > > + u64 start_addr, end_addr;
> > > +
> > > + ret = of_property_read_u64(of_chosen,
> > > +"linux,kexec-handover-buffer-start",
> > > +&start_addr);
> > > + if (ret == -EINVAL)
> > > + return -ENOENT;
> > > + else if (ret)
> > > + return -EINVAL;
> > > +
> > > + ret = of_property_read_u64(of_chosen,
> > > "linux,kexec-handover-buffer-end", + 
> &end_addr);
> > > + if (ret == -EINVAL)
> > > + return -ENOENT;
> > > + else if (ret)
> > > + return -EINVAL;
> > > +
> > > + *addr =  __va(start_addr);
> > > + /* -end is the first address after the buffer. */
> > > + *size = end_addr - start_addr;
> > > +
> > > + return 0;
> > > +}
> > 
> > This depends on dtb, so if IMA want to extend it to arches like x86 in
> > the future you will have to think about other way to pass it.
> > 
> > How about think about a general way now?
> 
> The only general way I can think of is by adding a kernel command line 
> parameter which the first kernel would pass to the second kernel, but IMHO 
> that is ugly, because such parameter wouldn't be useful to a user, and it 
> would also be something that, from the perspective of the user, would 
> magically appear in the kernel command line of the second kernel...

Sorry I just brought up the question, actually I have no idea either.
Maybe we have to do this with arch specific ways..

Thanks
Dave


Re: [PATCH v3 2/2] powerpc/fadump: parse fadump reserve memory size based on memory range

2016-08-25 Thread Dave Young
On 08/10/16 at 03:35pm, Hari Bathini wrote:
> When fadump is enabled, by default 5% of system RAM is reserved for
> fadump kernel. While that works for most cases, it is not good enough
> for every case.
> 
> Currently, to override the default value, fadump supports specifying
> memory to reserve with fadump_reserve_mem=size, where only a fixed size
> can be specified. This patch adds support to specify memory size to
> reserve for different memory ranges as below:
> 
>   fadump_reserve_mem=:[,:,...]

Hi, Hari

I do not understand why you need introduce the new cmdline param, what's
the difference between the "fadump reserved" memory and the memory
reserved by "crashkernel="? Can fadump just use crashkernel= to reserve
memory?

Thanks
Dave

> 
> Supporting range based input for "fadump_reserve_mem" parameter helps
> using the same commandline parameter for different system memory sizes.
> 
> Signed-off-by: Hari Bathini 
> Reviewed-by: Mahesh J Salgaonkar 
> ---
> 
> Changes from v2:
> 1. Updated changelog
> 
> 
>  arch/powerpc/kernel/fadump.c |   63 
> --
>  1 file changed, 54 insertions(+), 9 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> index b3a6633..7c01b5b 100644
> --- a/arch/powerpc/kernel/fadump.c
> +++ b/arch/powerpc/kernel/fadump.c
> @@ -193,6 +193,55 @@ static unsigned long init_fadump_mem_struct(struct 
> fadump_mem_struct *fdm,
>   return addr;
>  }
>  
> +/*
> + * This function parses command line for fadump_reserve_mem=
> + *
> + * Supports the below two syntaxes:
> + *1. fadump_reserve_mem=size
> + *2. fadump_reserve_mem=ramsize-range:size[,...]
> + *
> + * Sets fw_dump.reserve_bootvar with the memory size
> + * provided, 0 otherwise
> + *
> + * The function returns -EINVAL on failure, 0 otherwise.
> + */
> +static int __init parse_fadump_reserve_mem(void)
> +{
> + char *name = "fadump_reserve_mem=";
> + char *fadump_cmdline = NULL, *cur;
> +
> + fw_dump.reserve_bootvar = 0;
> +
> + /* find fadump_reserve_mem and use the last one if there are many */
> + cur = strstr(boot_command_line, name);
> + while (cur) {
> + fadump_cmdline = cur;
> + cur = strstr(cur+1, name);
> + }
> +
> + /* when no fadump_reserve_mem= cmdline option is provided */
> + if (!fadump_cmdline)
> + return 0;
> +
> + fadump_cmdline += strlen(name);
> +
> + /* for fadump_reserve_mem=size cmdline syntax */
> + if (!is_colon_in_param(fadump_cmdline)) {
> + fw_dump.reserve_bootvar = memparse(fadump_cmdline, NULL);
> + return 0;
> + }
> +
> + /* for fadump_reserve_mem=ramsize-range:size[,...] cmdline syntax */
> + cur = fadump_cmdline;
> + fw_dump.reserve_bootvar = parse_mem_range_size("fadump_reserve_mem",
> + &cur, memblock_phys_mem_size());
> + if (cur == fadump_cmdline) {
> + return -EINVAL;
> + }
> +
> + return 0;
> +}
> +
>  /**
>   * fadump_calculate_reserve_size(): reserve variable boot area 5% of System 
> RAM
>   *
> @@ -212,12 +261,17 @@ static inline unsigned long 
> fadump_calculate_reserve_size(void)
>  {
>   unsigned long size;
>  
> + /* sets fw_dump.reserve_bootvar */
> + parse_fadump_reserve_mem();
> +
>   /*
>* Check if the size is specified through fadump_reserve_mem= cmdline
>* option. If yes, then use that.
>*/
>   if (fw_dump.reserve_bootvar)
>   return fw_dump.reserve_bootvar;
> + else
> + printk(KERN_INFO "fadump: calculating default boot size\n");
>  
>   /* divide by 20 to get 5% of value */
>   size = memblock_end_of_DRAM() / 20;
> @@ -348,15 +402,6 @@ static int __init early_fadump_param(char *p)
>  }
>  early_param("fadump", early_fadump_param);
>  
> -/* Look for fadump_reserve_mem= cmdline option */
> -static int __init early_fadump_reserve_mem(char *p)
> -{
> - if (p)
> - fw_dump.reserve_bootvar = memparse(p, &p);
> - return 0;
> -}
> -early_param("fadump_reserve_mem", early_fadump_reserve_mem);
> -
>  static void register_fw_dump(struct fadump_mem_struct *fdm)
>  {
>   int rc;
> 


Re: [PATCH v3 2/2] powerpc/fadump: parse fadump reserve memory size based on memory range

2016-08-25 Thread Dave Young
On 08/25/16 at 11:00pm, Hari Bathini wrote:
> 
> 
> On Thursday 25 August 2016 12:31 PM, Dave Young wrote:
> > On 08/10/16 at 03:35pm, Hari Bathini wrote:
> > > When fadump is enabled, by default 5% of system RAM is reserved for
> > > fadump kernel. While that works for most cases, it is not good enough
> > > for every case.
> > > 
> > > Currently, to override the default value, fadump supports specifying
> > > memory to reserve with fadump_reserve_mem=size, where only a fixed size
> > > can be specified. This patch adds support to specify memory size to
> > > reserve for different memory ranges as below:
> > > 
> > >   fadump_reserve_mem=:[,:,...]
> > Hi, Hari
> 
> Hi Dave,
> 
> > I do not understand why you need introduce the new cmdline param, what's
> > the difference between the "fadump reserved" memory and the memory
> 
> I am not introducing a new parameter but adding a new syntax for
> an existing parameter.

Apologize for that, I was not aware it because it is not documented in
kernel-parameters.txt

> 
> > reserved by "crashkernel="? Can fadump just use crashkernel= to reserve
> > memory?
> 
> Not all syntaxes supported by crashkernel apply for fadump_reserve_mem.
> Nonetheless, it is worth considering reuse of crashkernel parameter instead
> of fadump_reserve_mem. Let me see what I can do about this..

Thanks! I originally thought fadump will reserve memory in firmware
code, if it is in kernel then it will be better to just extend and reuse
crashkernel=.

Dave
> 
> Thanks
> Hari
> 
> > Thanks
> > Dave
> > 
> > > Supporting range based input for "fadump_reserve_mem" parameter helps
> > > using the same commandline parameter for different system memory sizes.
> > > 
> > > Signed-off-by: Hari Bathini 
> > > Reviewed-by: Mahesh J Salgaonkar 
> > > ---
> > > 
> > > Changes from v2:
> > > 1. Updated changelog
> > > 
> > > 
> > >   arch/powerpc/kernel/fadump.c |   63 
> > > --
> > >   1 file changed, 54 insertions(+), 9 deletions(-)
> > > 
> > > diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
> > > index b3a6633..7c01b5b 100644
> > > --- a/arch/powerpc/kernel/fadump.c
> > > +++ b/arch/powerpc/kernel/fadump.c
> > > @@ -193,6 +193,55 @@ static unsigned long init_fadump_mem_struct(struct 
> > > fadump_mem_struct *fdm,
> > >   return addr;
> > >   }
> > > +/*
> > > + * This function parses command line for fadump_reserve_mem=
> > > + *
> > > + * Supports the below two syntaxes:
> > > + *1. fadump_reserve_mem=size
> > > + *2. fadump_reserve_mem=ramsize-range:size[,...]
> > > + *
> > > + * Sets fw_dump.reserve_bootvar with the memory size
> > > + * provided, 0 otherwise
> > > + *
> > > + * The function returns -EINVAL on failure, 0 otherwise.
> > > + */
> > > +static int __init parse_fadump_reserve_mem(void)
> > > +{
> > > + char *name = "fadump_reserve_mem=";
> > > + char *fadump_cmdline = NULL, *cur;
> > > +
> > > + fw_dump.reserve_bootvar = 0;
> > > +
> > > + /* find fadump_reserve_mem and use the last one if there are many */
> > > + cur = strstr(boot_command_line, name);
> > > + while (cur) {
> > > + fadump_cmdline = cur;
> > > + cur = strstr(cur+1, name);
> > > + }
> > > +
> > > + /* when no fadump_reserve_mem= cmdline option is provided */
> > > + if (!fadump_cmdline)
> > > + return 0;
> > > +
> > > + fadump_cmdline += strlen(name);
> > > +
> > > + /* for fadump_reserve_mem=size cmdline syntax */
> > > + if (!is_colon_in_param(fadump_cmdline)) {
> > > + fw_dump.reserve_bootvar = memparse(fadump_cmdline, NULL);
> > > + return 0;
> > > + }
> > > +
> > > + /* for fadump_reserve_mem=ramsize-range:size[,...] cmdline syntax */
> > > + cur = fadump_cmdline;
> > > + fw_dump.reserve_bootvar = parse_mem_range_size("fadump_reserve_mem",
> > > + &cur, memblock_phys_mem_size());
> > > + if (cur == fadump_cmdline) {
> > > + return -EINVAL;
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > >   /**
> > >* fadump_calculate_reserve_size(): re

Re: [PATHC v2 5/9] ima: on soft reboot, save the measurement list

2016-08-31 Thread Dave Young
Hi, Mimi

On 08/30/16 at 06:40pm, Mimi Zohar wrote:
> From: Thiago Jung Bauermann 
> 
> This patch uses the kexec buffer passing mechanism to pass the
> serialized IMA binary_runtime_measurements to the next kernel.
> 
> Changelog v2:
> - Fix build issue by defining a stub ima_add_kexec_buffer and stub
>   struct kimage when CONFIG_IMA=n and CONFIG_IMA_KEXEC=n. (Fenguang Wu)
> - removed kexec_add_handover_buffer() checksum argument.
> - added skip_checksum member to kexec_buf
> - only register reboot notifier once
> 
> Changelog v1:
> - updated to call IMA functions  (Mimi)
> - move code from ima_template.c to ima_kexec.c (Mimi)
> 
> Signed-off-by: Thiago Jung Bauermann 
> Signed-off-by: Mimi Zohar 
> ---
>  include/linux/ima.h| 12 ++
>  kernel/kexec_file.c|  4 ++
>  security/integrity/ima/ima_kexec.c | 88 
> ++
>  3 files changed, 104 insertions(+)
> 
> diff --git a/include/linux/ima.h b/include/linux/ima.h
> index 0eb7c2e..7f6952f 100644
> --- a/include/linux/ima.h
> +++ b/include/linux/ima.h
> @@ -11,6 +11,7 @@
>  #define _LINUX_IMA_H
>  
>  #include 
> +#include 
>  struct linux_binprm;
>  
>  #ifdef CONFIG_IMA
> @@ -23,6 +24,10 @@ extern int ima_post_read_file(struct file *file, void 
> *buf, loff_t size,
> enum kernel_read_file_id id);
>  extern void ima_post_path_mknod(struct dentry *dentry);
>  
> +#ifdef CONFIG_IMA_KEXEC
> +extern void ima_add_kexec_buffer(struct kimage *image);
> +#endif
> +
>  #else
>  static inline int ima_bprm_check(struct linux_binprm *bprm)
>  {
> @@ -62,6 +67,13 @@ static inline void ima_post_path_mknod(struct dentry 
> *dentry)
>  
>  #endif /* CONFIG_IMA */
>  
> +#ifndef CONFIG_IMA_KEXEC
> +struct kimage;
> +
> +static inline void ima_add_kexec_buffer(struct kimage *image)
> +{}
> +#endif
> +
>  #ifdef CONFIG_IMA_APPRAISE
>  extern void ima_inode_post_setattr(struct dentry *dentry);
>  extern int ima_inode_setxattr(struct dentry *dentry, const char *xattr_name,
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index 0e90d14..9585861 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -19,6 +19,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -200,6 +201,9 @@ kimage_file_prepare_segments(struct kimage *image, int 
> kernel_fd, int initrd_fd,
>   return ret;
>   image->kernel_buf_len = size;
>  
> + /* IMA needs to pass the measurement list to the next kernel. */
> + ima_add_kexec_buffer(image);
> +
>   /* Call arch image probe handlers */
>   ret = arch_kexec_kernel_image_probe(image, image->kernel_buf,
>   image->kernel_buf_len);
> diff --git a/security/integrity/ima/ima_kexec.c 
> b/security/integrity/ima/ima_kexec.c
> index e77ca9d..0e4d0db 100644
> --- a/security/integrity/ima/ima_kexec.c
> +++ b/security/integrity/ima/ima_kexec.c
> @@ -23,6 +23,11 @@
>  
>  #include "ima.h"
>  
> +#ifdef CONFIG_IMA_KEXEC
> +/* Physical address of the measurement buffer in the next kernel. */
> +static unsigned long kexec_buffer_load_addr;
> +static size_t kexec_segment_size;
> +
>  static int ima_dump_measurement_list(unsigned long *buffer_size, void 
> **buffer,
>unsigned long segment_size)
>  {
> @@ -75,6 +80,89 @@ out:
>  }
>  
>  /*
> + * Called during kexec execute so that IMA can save the measurement list.
> + */
> +static int ima_update_kexec_buffer(struct notifier_block *self,
> +unsigned long action, void *data)
> +{
> + void *kexec_buffer = NULL;
> + size_t kexec_buffer_size;
> + int ret;
> +
> + if (!kexec_in_progress)
> + return NOTIFY_OK;
> +
> + kexec_buffer_size = ima_get_binary_runtime_size();
> + if (kexec_buffer_size >
> + (kexec_segment_size - sizeof(struct ima_kexec_hdr))) {
> + pr_err("Binary measurement list grew too large.\n");
> + goto out;
> + }
> +
> + ima_dump_measurement_list(&kexec_buffer_size, &kexec_buffer,
> +   kexec_segment_size);
> + if (!kexec_buffer) {
> + pr_err("Not enough memory for the kexec measurement buffer.\n");
> + goto out;
> + }
> + ret = kexec_update_segment(kexec_buffer, kexec_buffer_size,
> +kexec_buffer_load_addr, kexec_segment_size);
> + if (ret)
> + pr_err("Error updating kexec buffer: %d\n", ret);
> +out:
> + return NOTIFY_OK;
> +}
> +
> +struct notifier_block update_buffer_nb = {
> + .notifier_call = ima_update_kexec_buffer,
> +};
> +
> +/*
> + * Called during kexec_file_load so that IMA can add a segment to the kexec
> + * image for the measurement list for the next kernel.
> + */
> +void ima_add_kexec_buffer(struct kimage *image)
> +{
> + static int registered = 0;
> + struct kexec_buf kbuf = { .image = image, .buf_