Re: [Makedumpfile PATCH 1/2] makedumpfile: add runtime kaslr offset if it exists

2017-04-27 Thread Xunlei Pang
On 04/27/2017 at 02:15 PM, Pratyush Anand wrote:
> If we have to erase a symbol from vmcore whose address is not present in
> vmcoreinfo, then we need to pass vmlinux as well to get the symbol
> address.
> When kaslr is enabled, the virtual addresses of all kernel symbols are
> randomized with an offset. vmlinux always has static addresses, but all
> the arch-specific calculations are based on the run-time kernel address. So
> we need to find a way to translate a symbol address from vmlinux to the
> kernel's run-time address.
>
> without this patch:
> # makedumpfile --split -d 5 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3}
>
> readpage_kdump_compressed: pfn(f97ea) is excluded from vmcore.
> readmem: type_addr: 1, addr:f97eaff8, size:8
> vtop4_x86_64: Can't get pml4 (page_dir:f97eaff8).
> readmem: Can't convert a virtual address(819f1284) to physical 
> address.
> readmem: type_addr: 0, addr:819f1284, size:390
> check_release: Can't get the address of system_utsname.
>
> After this patch check_release() is ok, and also we are able to erase
> symbol from vmcore.
>
> Signed-off-by: Pratyush Anand 
> ---
>  arch/x86_64.c  | 23 +++
>  erase_info.c   |  1 +
>  makedumpfile.c | 44 
>  makedumpfile.h | 15 +++
>  4 files changed, 83 insertions(+)
>
> diff --git a/arch/x86_64.c b/arch/x86_64.c
> index e978a36f8878..ab5aae8f1b26 100644
> --- a/arch/x86_64.c
> +++ b/arch/x86_64.c
> @@ -33,6 +33,29 @@ get_xen_p2m_mfn(void)
>   return NOT_FOUND_LONG_VALUE;
>  }
>  
> +unsigned long
> +get_kaslr_offset_x86_64(unsigned long vaddr)
> +{
> + unsigned long sym_vmcoreinfo, sym_vmlinux;
> +
> + if (!info->kaslr_offset) {
> + sym_vmlinux = get_symbol_addr("_stext");
> + if (sym_vmlinux == NOT_FOUND_SYMBOL)
> + return 0;
> + sym_vmcoreinfo = read_vmcoreinfo_symbol(STR_SYMBOL("_stext"));
> + info->kaslr_offset = sym_vmcoreinfo - sym_vmlinux;
> + }
> + if (vaddr >= __START_KERNEL_map &&
> + vaddr < __START_KERNEL_map + info->kaslr_offset)
> + return info->kaslr_offset;
> + else
> + /*
> +  * TODO: we need to check if it is vmalloc/vmmemmap/module
> +  * address, we will have different offset
> +  */
> + return 0;
> +}
> +
>  static int
>  get_page_offset_x86_64(void)
>  {
> diff --git a/erase_info.c b/erase_info.c
> index f2ba9149e93e..60abfa1a1adf 100644
> --- a/erase_info.c
> +++ b/erase_info.c
> @@ -1088,6 +1088,7 @@ resolve_config_entry(struct config_entry *ce, unsigned 
> long long base_vaddr,
>   ce->line, ce->name);
>   return FALSE;
>   }
> + ce->sym_addr += get_kaslr_offset(ce->sym_addr);
>   ce->type_name = get_symbol_type_name(ce->name,
>   DWARF_INFO_GET_SYMBOL_TYPE,
> &ce->size, &ce->type_flag);
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 301772a8820c..7e78641917d7 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -3782,6 +3782,46 @@ free_for_parallel()
>  }
>  
>  int
> +find_kaslr_offsets()
> +{
> + off_t offset;
> + unsigned long size;
> + int ret = FALSE;
> +
> + get_vmcoreinfo(&offset, &size);
> +
> + if (!(info->name_vmcoreinfo = strdup(FILENAME_VMCOREINFO))) {
> + MSG("Can't duplicate strings(%s).\n", FILENAME_VMCOREINFO);
> + return FALSE;
> + }
> + if (!copy_vmcoreinfo(offset, size))
> + goto out;
> +
> + if (!open_vmcoreinfo("r"))
> + goto out;
> +
> + unlink(info->name_vmcoreinfo);
> +
> + /*
> +  * This arch specific function should update info->kaslr_offset. If
> +  * kaslr is not enabled then offset will be set to 0. arch specific
> +  * function might need to read from vmcoreinfo, therefore we have
> +  * called this function between open_vmcoreinfo() and
> +  * close_vmcoreinfo()
> +  */
> + get_kaslr_offset(SYMBOL(_stext));

Looks like acquiring "KERNELOFFSET" in read_vmcoreinfo() should be enough here.

We can get the kaslr offset directly from vmcoreinfo, because in the
flag_refiltering case the compressed dumpfile contains vmcoreinfo as well,
and the x86_64 kernel already exports it via
"vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset());"

Regards,
Xunlei

> +
> + close_vmcoreinfo();
> +
> + ret = TRUE;
> +out:
> + free(info->name_vmcoreinfo);
> + info->name_vmcoreinfo = NULL;
> +
> + return ret;
> +}
> +
> +int
>  initial(void)
>  {
>   off_t offset;
> @@ -3833,6 +3873,9 @@ initial(void)
>   set_dwarf_debuginfo("vmlinux", NULL,
>   info->name_vmlinux, info->fd_vmlinux);
>  
> + if (has_vmcoreinfo() && !find_kaslr_offsets())
> + return FALSE;

Re: [PATCH v5 31/32] x86: Add sysfs support for Secure Memory Encryption

2017-04-27 Thread Dave Young
On 04/27/17 at 08:52am, Dave Hansen wrote:
> On 04/27/2017 12:25 AM, Dave Young wrote:
> > On 04/21/17 at 02:55pm, Dave Hansen wrote:
> >> On 04/18/2017 02:22 PM, Tom Lendacky wrote:
> >>> Add sysfs support for SME so that user-space utilities (kdump, etc.) can
> >>> determine if SME is active.
> >>>
> >>> A new directory will be created:
> >>>   /sys/kernel/mm/sme/
> >>>
> >>> And two entries within the new directory:
> >>>   /sys/kernel/mm/sme/active
> >>>   /sys/kernel/mm/sme/encryption_mask
> >>
> >> Why do they care, and what will they be doing with this information?
> > 
> > kdump will copy the old memory, but it needs this to know whether the old
> > memory was encrypted or not. With this sysfs file we can know the previous SME
> > status and pass it to the kdump kernel, e.g. as a kernel parameter.
> > 
> > Tom, have you had a chance to try whether it works or not?
> 
> What will the kdump kernel do with it though?  We kexec() into that
> kernel so the SME keys will all be the same, right?  So, will the kdump
> kernel be just setting the encryption bit in the PTE so it can copy the
> old plaintext out?

I assume it is for the active -> non-active case, where the new boot needs to
know that the old memory is encrypted. But I have not read all the patches,
so I may be missing things.

> 
> Why do we need both 'active' and 'encryption_mask'?  How could it be
> that the hardware-enumerated 'encryption_mask' changes across a kexec()?

Leave this question to Tom..

Thanks
Dave

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] kexec: allocate buffer in top-down, if specified, correctly

2017-04-27 Thread Dave Young
Correct Vivek's email address
On 04/28/17 at 01:19pm, Dave Young wrote:
> Vivek, can you help give some comments about the locate hole issue
> in kexec_file?
> 
> On 04/28/17 at 09:51am, AKASHI Takahiro wrote:
> > Thiago,
> > 
> > Thank you for the comment.
> > 
> > On Thu, Apr 27, 2017 at 07:00:04PM -0300, Thiago Jung Bauermann wrote:
> > > Hello,
> > > 
> > > Am Mittwoch, 26. April 2017, 17:22:09 BRT schrieb AKASHI Takahiro:
> > > > The current kexec_locate_mem_hole(kbuf.top_down == 1) stops searching at
> > > > the first memory region that has enough space for the requested size, even
> > > > if some higher regions may also have enough.
> > > 
> > > kexec_locate_mem_hole expects arch_kexec_walk_mem to walk memory from top
> > > to bottom if top_down is true. That is what powerpc's version does.
> > 
> > Ah, I hadn't noticed that; but x86 doesn't have arch_kexec_walk_mem, so
> > how can it work for x86?
> > 
> > > Isn't it possible to walk resources from top to bottom?
> > 
> > Yes, it would be, but it seems to me that such behavior is not intuitive
> > and even confusing if it doesn't come with an explicit explanation.
> 
> The thing we need to make clear is why we need this change. It might be a
> problem for crashkernel=xM,low, since that is for the swiotlb when
> crashkernel=xM,high is used; otherwise the current code seems fine.
> 
> We need Vivek's recollection of the history to confirm.
> > 
> > > > This behavior is not consistent with the locate_hole(hole_end == -1)
> > > > function of kexec-tools.
> > > > 
> > > > This patch fixes the bug, going through all the memory regions anyway.
> > > 
> > > This patch would break powerpc, because at the end of the memory walk
> > > kbuf would have the lowest memory hole.
> > > 
> > > If it's not possible to walk resources in reverse order, then this patch
> > > needs to change powerpc to always walk memory from bottom to top.
> > 
> > So I would like to hear from x86 guys.
> > 
> > Thanks
> > -Takahiro AKASHI
> > 
> > > -- 
> > > Thiago Jung Bauermann
> > > IBM Linux Technology Center
> > > 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] kexec: allocate buffer in top-down, if specified, correctly

2017-04-27 Thread Dave Young
Vivek, can you help give some comments about the locate hole issue
in kexec_file?

On 04/28/17 at 09:51am, AKASHI Takahiro wrote:
> Thiago,
> 
> Thank you for the comment.
> 
> On Thu, Apr 27, 2017 at 07:00:04PM -0300, Thiago Jung Bauermann wrote:
> > Hello,
> > 
> > Am Mittwoch, 26. April 2017, 17:22:09 BRT schrieb AKASHI Takahiro:
> > > The current kexec_locate_mem_hole(kbuf.top_down == 1) stops searching at
> > > the first memory region that has enough space for the requested size, even
> > > if some higher regions may also have enough.
> > 
> > kexec_locate_mem_hole expects arch_kexec_walk_mem to walk memory from top
> > to bottom if top_down is true. That is what powerpc's version does.
> 
> Ah, I hadn't noticed that; but x86 doesn't have arch_kexec_walk_mem, so
> how can it work for x86?
> 
> > Isn't it possible to walk resources from top to bottom?
> 
> Yes, it would be, but it seems to me that such behavior is not intuitive
> and even confusing if it doesn't come with an explicit explanation.

The thing we need to make clear is why we need this change. It might be a
problem for crashkernel=xM,low, since that is for the swiotlb when
crashkernel=xM,high is used; otherwise the current code seems fine.
 
We need Vivek's recollection of the history to confirm.
> 
> > > This behavior is not consistent with locate_hole(hole_end == -1) function
> > > of kexec-tools.
> > > 
> > > This patch fixes the bug, going through all the memory regions anyway.
> > 
> > This patch would break powerpc, because at the end of the memory walk kbuf 
> > would have the lowest memory hole.
> > 
> > If it's not possible to walk resources in reverse order, then this patch
> > needs to change powerpc to always walk memory from bottom to top.
> 
> So I would like to hear from x86 guys.
> 
> Thanks
> -Takahiro AKASHI
> 
> > -- 
> > Thiago Jung Bauermann
> > IBM Linux Technology Center
> > 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] kexec: allocate buffer in top-down, if specified, correctly

2017-04-27 Thread Dave Young
Hi AKASHI
On 04/26/17 at 05:22pm, AKASHI Takahiro wrote:
> The current kexec_locate_mem_hole(kbuf.top_down == 1) stops searching at
> the first memory region that has enough space for the requested size, even
> if some higher regions may also have enough.
> This behavior is not consistent with locate_hole(hole_end == -1) function
> of kexec-tools.

Have you seen an actual bug happen, or did you just observe this during code
review?

So far we have not seen any reports about this.

> 
> This patch fixes the bug, going through all the memory regions anyway.
> 
> Signed-off-by: AKASHI Takahiro 
> ---
>  kernel/kexec_file.c | 19 ++-
>  1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
> index b118735fea9d..2f131c0d9017 100644
> --- a/kernel/kexec_file.c
> +++ b/kernel/kexec_file.c
> @@ -373,8 +373,8 @@ static int locate_mem_hole_top_down(unsigned long start, 
> unsigned long end,
>   /* If we are here, we found a suitable memory range */
>   kbuf->mem = temp_start;
>  
> - /* Success, stop navigating through remaining System RAM ranges */
> - return 1;
> + /* always return zero, going through all the System RAM ranges */
> + return 0;
>  }
>  
>  static int locate_mem_hole_bottom_up(unsigned long start, unsigned long end,
> @@ -439,18 +439,27 @@ static int locate_mem_hole_callback(u64 start, u64 end, 
> void *arg)
>   *
>   * Return: The memory walk will stop when func returns a non-zero value
>   * and that value will be returned. If all free regions are visited without
> - * func returning non-zero, then zero will be returned.
> + * func returning non-zero, then kbuf->mem will be additionally checked
> + * for top-down search.
> + * After all, zero will be returned if none of regions fits.
>   */
>  int __weak arch_kexec_walk_mem(struct kexec_buf *kbuf,
>  int (*func)(u64, u64, void *))
>  {
> + int ret;
> +
> + kbuf->mem = 0;
>   if (kbuf->image->type == KEXEC_TYPE_CRASH)
> - return walk_iomem_res_desc(crashk_res.desc,
> + ret = walk_iomem_res_desc(crashk_res.desc,
>  IORESOURCE_SYSTEM_RAM | 
> IORESOURCE_BUSY,
>  crashk_res.start, crashk_res.end,
>  kbuf, func);
>   else
> - return walk_system_ram_res(0, ULONG_MAX, kbuf, func);
> + ret = walk_system_ram_res(0, ULONG_MAX, kbuf, func);
> +
> + if (!ret && kbuf->mem)
> + ret = 1; /* found for top-down search */
> + return ret;
>  }
>  
>  /**
> -- 
> 2.11.1
> 
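
To illustrate the behaviour the quoted patch is after, here is a minimal,
kernel-agnostic C sketch (names and the alignment assumption are purely
illustrative, not kernel code): every free region is visited, the highest
aligned base that fits is remembered, and success is only decided after the
whole walk has finished.

struct region { unsigned long start, end; };	/* end is inclusive */

static unsigned long
find_hole_top_down(const struct region *regions, int nr,
		   unsigned long size, unsigned long align)
{
	unsigned long best = 0;			/* align assumed power of two */
	int i;

	for (i = 0; i < nr; i++) {
		unsigned long base;

		if (regions[i].end - regions[i].start + 1 < size)
			continue;		/* region too small */

		/* highest aligned base that still fits in this region */
		base = (regions[i].end - size + 1) & ~(align - 1);
		if (base < regions[i].start)
			continue;
		if (base > best)
			best = base;		/* keep the highest candidate */
	}
	return best;				/* 0 means nothing found */
}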

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


RE: ARM lockup during Kexec operation

2017-04-27 Thread Rudici Cazeao
I am running kexec in consecutive loops on 4 different dual-core ARM
processors. On 3 of these processors, usually somewhere between the 500th and
the 5000th loop, the kernel gets stuck at "Starting new kernel". On the
remaining processor, kexec runs over and over without running into this
issue. And this appears to be pretty consistent. Any ideas?

Also, on rare occasions, the kernel gets stuck at "Uncompressing Linux...
done, booting the kernel".

The SoC is Cortex-A9 (ARMv7-A)
The kernel version is 3.0.51-rt75



___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Running kexec, kernel gets stuck at "Starting new kernel"

2017-04-27 Thread Rudici Cazeao
I am running kexec in consecutive loops on 4 different dual-core ARM
processors. On 3 of these processors, usually somewhere between the 500th and
the 5000th loop, the kernel gets stuck at "Starting new kernel". On the
remaining processor, kexec runs over and over without running into this
issue. And this appears to be pretty consistent. Any ideas?

Also, on rare occasions, the kernel gets stuck at "Uncompressing Linux...
done, booting the kernel".

The SoC is Cortex-A9 (ARMv7-A)
The kernel version is 3.0.51-rt75

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] kexec: allocate buffer in top-down, if specified, correctly

2017-04-27 Thread AKASHI Takahiro
Thiago,

Thank you for the comment.

On Thu, Apr 27, 2017 at 07:00:04PM -0300, Thiago Jung Bauermann wrote:
> Hello,
> 
> Am Mittwoch, 26. April 2017, 17:22:09 BRT schrieb AKASHI Takahiro:
> > The current kexec_locate_mem_hole(kbuf.top_down == 1) stops searching at
> > the first memory region that has enough space for the requested size, even
> > if some higher regions may also have enough.
> 
> kexec_locate_mem_hole expects arch_kexec_walk_mem to walk memory from top to 
> bottom if top_down is true. That is what powerpc's version does.

Ah, I hadn't noticed that; but x86 doesn't have arch_kexec_walk_mem, so
how can it work for x86?

> Isn't it possible to walk resources from top to bottom?

Yes, it would be, but it seems to me that such behavior is not intuitive
and even confusing if it doesn't come with an explicit explanation.
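
For reference, a rough kernel-side sketch of what such a reverse walk could
look like (no helper like this exists in the kernel at this point; the fixed
64-entry bound and the function names are purely illustrative): record the
System RAM ranges with the existing bottom-up iterator, then replay them from
the highest range down.

struct ram_range { u64 start, end; };

struct rev_walk_ctx {
	struct ram_range r[64];		/* illustrative fixed bound */
	int n;
};

static int collect_range(u64 start, u64 end, void *arg)
{
	struct rev_walk_ctx *ctx = arg;

	if (ctx->n < 64) {
		ctx->r[ctx->n].start = start;
		ctx->r[ctx->n].end = end;
		ctx->n++;
	}
	return 0;			/* keep walking */
}

static int walk_system_ram_res_rev(u64 start, u64 end, void *arg,
				   int (*func)(u64, u64, void *))
{
	struct rev_walk_ctx ctx = { .n = 0 };
	int i, ret = 0;

	walk_system_ram_res(start, end, &ctx, collect_range);

	/* replay the collected ranges top-down */
	for (i = ctx.n - 1; i >= 0 && !ret; i--)
		ret = func(ctx.r[i].start, ctx.r[i].end, arg);

	return ret;
}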

> > This behavior is not consistent with locate_hole(hole_end == -1) function
> > of kexec-tools.
> > 
> > This patch fixes the bug, going through all the memory regions anyway.
> 
> This patch would break powerpc, because at the end of the memory walk kbuf 
> would have the lowest memory hole.
> 
> If it's not possible to walk resources in reverse order, then this patch
> needs to change powerpc to always walk memory from bottom to top.

So I would like to hear from x86 guys.

Thanks
-Takahiro AKASHI

> -- 
> Thiago Jung Bauermann
> IBM Linux Technology Center
> 

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v5 09/32] x86/mm: Provide general kernel support for memory encryption

2017-04-27 Thread Borislav Petkov
On Tue, Apr 18, 2017 at 04:17:54PM -0500, Tom Lendacky wrote:
> Changes to the existing page table macros will allow the SME support to
> be enabled in a simple fashion with minimal changes to files that use these
> macros.  Since the memory encryption mask will now be part of the regular
> pagetable macros, we introduce two new macros (_PAGE_TABLE_NOENC and
> _KERNPG_TABLE_NOENC) to allow for early pagetable creation/initialization
> without the encryption mask before SME becomes active.  Two new pgprot()
> macros are defined to allow setting or clearing the page encryption mask.

...

> @@ -55,7 +57,7 @@ static inline void copy_user_page(void *to, void *from, 
> unsigned long vaddr,
>   __phys_addr_symbol(__phys_reloc_hide((unsigned long)(x)))
>  
>  #ifndef __va
> -#define __va(x)  ((void *)((unsigned 
> long)(x)+PAGE_OFFSET))
> +#define __va(x)  ((void *)(__sme_clr(x) + PAGE_OFFSET))
>  #endif
>  
>  #define __boot_va(x) __va(x)
> diff --git a/arch/x86/include/asm/page_types.h 
> b/arch/x86/include/asm/page_types.h
> index 7bd0099..fead0a5 100644
> --- a/arch/x86/include/asm/page_types.h
> +++ b/arch/x86/include/asm/page_types.h
> @@ -15,7 +15,7 @@
>  #define PUD_PAGE_SIZE(_AC(1, UL) << PUD_SHIFT)
>  #define PUD_PAGE_MASK(~(PUD_PAGE_SIZE-1))
>  
> -#define __PHYSICAL_MASK  ((phys_addr_t)((1ULL << 
> __PHYSICAL_MASK_SHIFT) - 1))
> +#define __PHYSICAL_MASK  ((phys_addr_t)(__sme_clr((1ULL << 
> __PHYSICAL_MASK_SHIFT) - 1)))

That looks strange: poking an SME-mask hole into a mask...?
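
For readers following along, a tiny self-contained illustration of the effect
in question (the bit position and mask width here are made up purely for
illustration; the real values come from sme_me_mask and __PHYSICAL_MASK_SHIFT):

#define SME_MASK	(1UL << 47)		/* hypothetical C-bit position  */
#define __sme_clr(x)	((x) & ~SME_MASK)	/* drop the encryption bit      */
#define PHYS_MASK	((1UL << 52) - 1)	/* plain physical address mask  */

/* the "hole": the encryption bit is never interpreted as address bits */
#define PHYS_MASK_NOENC	__sme_clr(PHYS_MASK)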

>  #define __VIRTUAL_MASK   ((1UL << __VIRTUAL_MASK_SHIFT) - 1)
>  
>  /* Cast *PAGE_MASK to a signed type so that it is sign-extended if

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v5 31/32] x86: Add sysfs support for Secure Memory Encryption

2017-04-27 Thread Dave Hansen
On 04/27/2017 12:25 AM, Dave Young wrote:
> On 04/21/17 at 02:55pm, Dave Hansen wrote:
>> On 04/18/2017 02:22 PM, Tom Lendacky wrote:
>>> Add sysfs support for SME so that user-space utilities (kdump, etc.) can
>>> determine if SME is active.
>>>
>>> A new directory will be created:
>>>   /sys/kernel/mm/sme/
>>>
>>> And two entries within the new directory:
>>>   /sys/kernel/mm/sme/active
>>>   /sys/kernel/mm/sme/encryption_mask
>>
>> Why do they care, and what will they be doing with this information?
> 
> kdump will copy the old memory, but it needs this to know whether the old
> memory was encrypted or not. With this sysfs file we can know the previous SME
> status and pass it to the kdump kernel, e.g. as a kernel parameter.
> 
> Tom, have you had a chance to try whether it works or not?

What will the kdump kernel do with it though?  We kexec() into that
kernel so the SME keys will all be the same, right?  So, will the kdump
kernel be just setting the encryption bit in the PTE so it can copy the
old plaintext out?

Why do we need both 'active' and 'encryption_mask'?  How could it be
that the hardware-enumerated 'encryption_mask' changes across a kexec()?

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v5 06/32] x86/mm: Add Secure Memory Encryption (SME) support

2017-04-27 Thread Borislav Petkov
On Tue, Apr 18, 2017 at 04:17:27PM -0500, Tom Lendacky wrote:
> Add support for Secure Memory Encryption (SME). This initial support
> provides a Kconfig entry to build the SME support into the kernel and
> defines the memory encryption mask that will be used in subsequent
> patches to mark pages as encrypted.

...

> diff --git a/arch/x86/include/asm/mem_encrypt.h 
> b/arch/x86/include/asm/mem_encrypt.h
> new file mode 100644
> index 000..d5c4a2b
> --- /dev/null
> +++ b/arch/x86/include/asm/mem_encrypt.h
> @@ -0,0 +1,42 @@
> +/*
> + * AMD Memory Encryption Support
> + *
> + * Copyright (C) 2016 Advanced Micro Devices, Inc.
> + *
> + * Author: Tom Lendacky 
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + */
> +

These ifdeffery closing #endif markers look strange:

> +#ifndef __X86_MEM_ENCRYPT_H__
> +#define __X86_MEM_ENCRYPT_H__
> +
> +#ifndef __ASSEMBLY__
> +
> +#ifdef CONFIG_AMD_MEM_ENCRYPT
> +
> +extern unsigned long sme_me_mask;
> +
> +static inline bool sme_active(void)
> +{
> + return !!sme_me_mask;
> +}
> +
> +#else	/* !CONFIG_AMD_MEM_ENCRYPT */
> +
> +#ifndef sme_me_mask
> +#define sme_me_mask  0UL
> +
> +static inline bool sme_active(void)
> +{
> + return false;
> +}
> +#endif

This #endif is the one closing the sme_me_mask guard, and it has sme_active()
inside it. Shouldn't it be:

#ifndef sme_me_mask
#define sme_me_mask  0UL
#endif

and have sme_active() below it, in the !CONFIG_AMD_MEM_ENCRYPT branch?

The same thing is in include/linux/mem_encrypt.h
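
Spelled out, the layout being suggested for the !CONFIG_AMD_MEM_ENCRYPT branch
would look roughly like this (a sketch of the review suggestion, not the final
patch):

#else	/* !CONFIG_AMD_MEM_ENCRYPT */

#ifndef sme_me_mask
#define sme_me_mask	0UL
#endif

static inline bool sme_active(void)
{
	return false;
}

#endif	/* CONFIG_AMD_MEM_ENCRYPT */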

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v4 2/3] powerpc/fadump: Use the correct VMCOREINFO_NOTE_SIZE for phdr

2017-04-27 Thread Mahesh Jagannath Salgaonkar
On 04/26/2017 12:41 PM, Dave Young wrote:
> Ccing ppc list
> On 04/20/17 at 07:39pm, Xunlei Pang wrote:
>> vmcoreinfo_max_size stands for the size of vmcoreinfo_data; the
>> correct one we should use is vmcoreinfo_note, whose total
>> size is VMCOREINFO_NOTE_SIZE.
>>
>> As explained in commit 77019967f06b ("kdump: fix exported
>> size of vmcoreinfo note"), it should not affect the actual
>> functionality, but we had better fix it; this change should also be
>> safe and backward compatible.
>>
>> After this we can get rid of the variable vmcoreinfo_max_size and
>> use the corresponding macros directly; fewer variables
>> means more safety for vmcoreinfo operations.
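
As an aside for readers, a rough sketch of why the two sizes differ (layout
only; these are not the exact kernel macro definitions):

/*
 *   vmcoreinfo_data : up to VMCOREINFO_BYTES of "KEY=value\n" text
 *   vmcoreinfo_note : ELF note header + "VMCOREINFO" name + vmcoreinfo_data,
 *                     i.e. VMCOREINFO_NOTE_SIZE >= VMCOREINFO_BYTES
 *
 * so the program header describing the exported note should be sized with
 * VMCOREINFO_NOTE_SIZE, not with the size of the raw data buffer.
 */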
>>
>> Cc: Mahesh Salgaonkar 
>> Cc: Hari Bathini 
>> Signed-off-by: Xunlei Pang 

Reviewed-by: Mahesh Salgaonkar 

Thanks,
-Mahesh.

>> ---
>> v3->v4:
>> -Rebased on the latest linux-next
>>
>>  arch/powerpc/kernel/fadump.c | 3 +--
>>  include/linux/crash_core.h   | 1 -
>>  kernel/crash_core.c  | 3 +--
>>  3 files changed, 2 insertions(+), 5 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/fadump.c b/arch/powerpc/kernel/fadump.c
>> index 466569e..7bd6cd0 100644
>> --- a/arch/powerpc/kernel/fadump.c
>> +++ b/arch/powerpc/kernel/fadump.c
>> @@ -893,8 +893,7 @@ static int fadump_create_elfcore_headers(char *bufp)
>>  
>>  phdr->p_paddr   = fadump_relocate(paddr_vmcoreinfo_note());
>>  phdr->p_offset  = phdr->p_paddr;
>> -phdr->p_memsz   = vmcoreinfo_max_size;
>> -phdr->p_filesz  = vmcoreinfo_max_size;
>> +phdr->p_memsz   = phdr->p_filesz = VMCOREINFO_NOTE_SIZE;
>>  
>>  /* Increment number of program headers. */
>>  (elf->e_phnum)++;
>> diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
>> index ba283a2..7d6bc7b 100644
>> --- a/include/linux/crash_core.h
>> +++ b/include/linux/crash_core.h
>> @@ -55,7 +55,6 @@
>>  
>>  extern u32 *vmcoreinfo_note;
>>  extern size_t vmcoreinfo_size;
>> -extern size_t vmcoreinfo_max_size;
>>  
>>  Elf_Word *append_elf_note(Elf_Word *buf, char *name, unsigned int type,
>>void *data, size_t data_len);
>> diff --git a/kernel/crash_core.c b/kernel/crash_core.c
>> index 0321f04..43cdb00 100644
>> --- a/kernel/crash_core.c
>> +++ b/kernel/crash_core.c
>> @@ -16,7 +16,6 @@
>>  /* vmcoreinfo stuff */
>>  static unsigned char *vmcoreinfo_data;
>>  size_t vmcoreinfo_size;
>> -size_t vmcoreinfo_max_size = VMCOREINFO_BYTES;
>>  u32 *vmcoreinfo_note;
>>  
>>  /*
>> @@ -343,7 +342,7 @@ void vmcoreinfo_append_str(const char *fmt, ...)
>>  r = vscnprintf(buf, sizeof(buf), fmt, args);
>>  va_end(args);
>>  
>> -r = min(r, vmcoreinfo_max_size - vmcoreinfo_size);
>> +r = min(r, VMCOREINFO_BYTES - vmcoreinfo_size);
>>  
>>  memcpy(&vmcoreinfo_data[vmcoreinfo_size], buf, r);
>>  
>> -- 
>> 1.8.3.1
>>
>>
>> ___
>> kexec mailing list
>> kexec@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/kexec
> 
> Reviewed-by: Dave Young 
> 
> Thanks
> Dave
> 


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v5 31/32] x86: Add sysfs support for Secure Memory Encryption

2017-04-27 Thread Dave Young
On 04/21/17 at 02:55pm, Dave Hansen wrote:
> On 04/18/2017 02:22 PM, Tom Lendacky wrote:
> > Add sysfs support for SME so that user-space utilities (kdump, etc.) can
> > determine if SME is active.
> > 
> > A new directory will be created:
> >   /sys/kernel/mm/sme/
> > 
> > And two entries within the new directory:
> >   /sys/kernel/mm/sme/active
> >   /sys/kernel/mm/sme/encryption_mask
> 
> Why do they care, and what will they be doing with this information?

kdump will copy the old memory, but it needs this to know whether the old
memory was encrypted or not. With this sysfs file we can know the previous SME
status and pass it to the kdump kernel, e.g. as a kernel parameter.

Tom, have you had a chance to try whether it works or not?
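
As a concrete (and purely hypothetical, since this sysfs interface is only
being proposed here) illustration of how a kdump helper could consume it: the
path, file format, and the "sme_active=1" parameter below are assumptions,
not a merged ABI.

#include <stdio.h>

/* read the proposed /sys/kernel/mm/sme/active entry; a kdump tool could then
 * append something like "sme_active=1" to the capture kernel's command line */
static int sme_was_active(void)
{
	FILE *f = fopen("/sys/kernel/mm/sme/active", "r");
	int active = 0;

	if (!f)
		return 0;		/* entry absent: assume SME was off */
	if (fscanf(f, "%d", &active) != 1)
		active = 0;
	fclose(f);
	return active;
}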

Thanks
Dave

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[Makedumpfile PATCH 0/2] Fix refiltering when kaslr enabled

2017-04-27 Thread Pratyush Anand
Hi All,

We came across another failure in makedumpfile when kaslr is enabled. This
failure occurs when we try re-filtering, i.e. when we try to erase some symbol
from a dumpfile which was copied/compressed from /proc/vmcore using makedumpfile.

We have very limited symbol information in vmcoreinfo, so a symbol to be
erased may not be available there and we have to look it up in vmlinux.
However, a symbol address from vmlinux is a static address, which differs
from the run-time address by the KASLR offset. Therefore, reading any
"virtual address of vmlinux" from the vmcore is not possible.

These patches find the runtime KASLR offset and then calculate the run-time
addresses of symbols read from vmlinux.
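
The translation itself is just an offset addition; a small sketch with
made-up addresses (the values below are illustrative only):

/* illustrative values: the two _stext addresses, one static from vmlinux
 * and one randomized from the crashed kernel's vmcoreinfo */
#define STEXT_VMLINUX	0xffffffff81000000UL
#define STEXT_RUNTIME	0xffffffff9d000000UL
#define KASLR_OFFSET	(STEXT_RUNTIME - STEXT_VMLINUX)	/* 0x1c000000 */

/* any other text symbol read from vmlinux can then be relocated: */
static unsigned long runtime_addr(unsigned long vmlinux_addr)
{
	return vmlinux_addr + KASLR_OFFSET;
}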

Since I am not an x86 expert and these patches touch the x86 part of
makedumpfile, I have CCed the x86 experts. Please provide your review
comments and let me know if you think there is a better way to resolve
this issue.

thanks

~Pratyush

Pratyush Anand (2):
  makedumpfile: add runtime kaslr offset if it exists
  x86_64: calculate page_offset in case of re-filtering

 arch/x86_64.c  | 45 +++--
 erase_info.c   |  1 +
 makedumpfile.c | 44 
 makedumpfile.h | 15 +++
 4 files changed, 103 insertions(+), 2 deletions(-)

-- 
2.9.3


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


[Makedumpfile PATCH 1/2] makedumpfile: add runtime kaslr offset if it exists

2017-04-27 Thread Pratyush Anand
If we have to erase a symbol from vmcore whose address is not present in
vmcoreinfo, then we need to pass vmlinux as well to get the symbol
address.
When kaslr is enabled, the virtual addresses of all kernel symbols are
randomized with an offset. vmlinux always has static addresses, but all
the arch-specific calculations are based on the run-time kernel address. So
we need to find a way to translate a symbol address from vmlinux to the
kernel's run-time address.

without this patch:
# makedumpfile --split -d 5 -x vmlinux --config scrub.conf vmcore dumpfile_{1,2,3}

readpage_kdump_compressed: pfn(f97ea) is excluded from vmcore.
readmem: type_addr: 1, addr:f97eaff8, size:8
vtop4_x86_64: Can't get pml4 (page_dir:f97eaff8).
readmem: Can't convert a virtual address(819f1284) to physical 
address.
readmem: type_addr: 0, addr:819f1284, size:390
check_release: Can't get the address of system_utsname.

After this patch check_release() is ok, and also we are able to erase
symbol from vmcore.

Signed-off-by: Pratyush Anand 
---
 arch/x86_64.c  | 23 +++
 erase_info.c   |  1 +
 makedumpfile.c | 44 
 makedumpfile.h | 15 +++
 4 files changed, 83 insertions(+)

diff --git a/arch/x86_64.c b/arch/x86_64.c
index e978a36f8878..ab5aae8f1b26 100644
--- a/arch/x86_64.c
+++ b/arch/x86_64.c
@@ -33,6 +33,29 @@ get_xen_p2m_mfn(void)
return NOT_FOUND_LONG_VALUE;
 }
 
+unsigned long
+get_kaslr_offset_x86_64(unsigned long vaddr)
+{
+   unsigned long sym_vmcoreinfo, sym_vmlinux;
+
+   if (!info->kaslr_offset) {
+   sym_vmlinux = get_symbol_addr("_stext");
+   if (sym_vmlinux == NOT_FOUND_SYMBOL)
+   return 0;
+   sym_vmcoreinfo = read_vmcoreinfo_symbol(STR_SYMBOL("_stext"));
+   info->kaslr_offset = sym_vmcoreinfo - sym_vmlinux;
+   }
+   if (vaddr >= __START_KERNEL_map &&
+   vaddr < __START_KERNEL_map + info->kaslr_offset)
+   return info->kaslr_offset;
+   else
+   /*
+* TODO: we need to check if it is vmalloc/vmmemmap/module
+* address, we will have different offset
+*/
+   return 0;
+}
+
 static int
 get_page_offset_x86_64(void)
 {
diff --git a/erase_info.c b/erase_info.c
index f2ba9149e93e..60abfa1a1adf 100644
--- a/erase_info.c
+++ b/erase_info.c
@@ -1088,6 +1088,7 @@ resolve_config_entry(struct config_entry *ce, unsigned 
long long base_vaddr,
ce->line, ce->name);
return FALSE;
}
+   ce->sym_addr += get_kaslr_offset(ce->sym_addr);
ce->type_name = get_symbol_type_name(ce->name,
DWARF_INFO_GET_SYMBOL_TYPE,
&ce->size, &ce->type_flag);
diff --git a/makedumpfile.c b/makedumpfile.c
index 301772a8820c..7e78641917d7 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -3782,6 +3782,46 @@ free_for_parallel()
 }
 
 int
+find_kaslr_offsets()
+{
+   off_t offset;
+   unsigned long size;
+   int ret = FALSE;
+
+   get_vmcoreinfo(&offset, &size);
+
+   if (!(info->name_vmcoreinfo = strdup(FILENAME_VMCOREINFO))) {
+   MSG("Can't duplicate strings(%s).\n", FILENAME_VMCOREINFO);
+   return FALSE;
+   }
+   if (!copy_vmcoreinfo(offset, size))
+   goto out;
+
+   if (!open_vmcoreinfo("r"))
+   goto out;
+
+   unlink(info->name_vmcoreinfo);
+
+   /*
+* This arch specific function should update info->kaslr_offset. If
+* kaslr is not enabled then offset will be set to 0. arch specific
+* function might need to read from vmcoreinfo, therefore we have
+* called this function between open_vmcoreinfo() and
+* close_vmcoreinfo()
+*/
+   get_kaslr_offset(SYMBOL(_stext));
+
+   close_vmcoreinfo();
+
+   ret = TRUE;
+out:
+   free(info->name_vmcoreinfo);
+   info->name_vmcoreinfo = NULL;
+
+   return ret;
+}
+
+int
 initial(void)
 {
off_t offset;
@@ -3833,6 +3873,9 @@ initial(void)
set_dwarf_debuginfo("vmlinux", NULL,
info->name_vmlinux, info->fd_vmlinux);
 
+   if (has_vmcoreinfo() && !find_kaslr_offsets())
+   return FALSE;
+
if (!get_symbol_info())
return FALSE;
 
@@ -11031,6 +11074,7 @@ main(int argc, char *argv[])
info->fd_memory = -1;
info->fd_dumpfile = -1;
info->fd_bitmap = -1;
+   info->kaslr_offset = 0;
initialize_tables();
 
/*
diff --git a/makedumpfile.h b/makedumpfile.h
index e32e567018f6..0d358be8caac 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -253,10 +253,14 @@ static inline int string_exists(char 

[Makedumpfile PATCH 2/2] x86_64: calculate page_offset in case of re-filtering

2017-04-27 Thread Pratyush Anand
We do not call get_elf_info() in the re-filtering case. Therefore we will
not have any pt_load entries, and so we get:

get_page_offset_x86_64: Can't get any pt_load to calculate page offset.

However, we do have vmcoreinfo and vmlinux information when re-filtering,
so we are able to find the kaslr offset and read the page_offset_base
address; thus we can obtain the page offset as well. If kaslr is not
enabled, fall back to the old method of using a fixed page offset.

Signed-off-by: Pratyush Anand 
---
 arch/x86_64.c | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/x86_64.c b/arch/x86_64.c
index ab5aae8f1b26..b38ba2479d3d 100644
--- a/arch/x86_64.c
+++ b/arch/x86_64.c
@@ -62,6 +62,7 @@ get_page_offset_x86_64(void)
int i;
unsigned long long phys_start;
unsigned long long virt_start;
+   unsigned long page_offset_base;
 
for (i = 0; get_pt_load(i, _start, NULL, _start, NULL); i++) {
if (virt_start < __START_KERNEL_map
@@ -71,8 +72,25 @@ get_page_offset_x86_64(void)
}
}
 
-   ERRMSG("Can't get any pt_load to calculate page offset.\n");
-   return FALSE;
+   if (!info->flag_refiltering) {
+   ERRMSG("Can't get any pt_load to calculate page offset.\n");
+   return FALSE;
+   }
+   if (info->kaslr_offset) {
+   page_offset_base = get_symbol_addr("page_offset_base");
+   page_offset_base += info->kaslr_offset;
+   if (!readmem(VADDR, page_offset_base, &info->page_offset,
+   sizeof(info->page_offset))) {
+ERRMSG("Can't read page_offset_base.\n");
+return FALSE;
+   }
+   } else if (info->kernel_version < KERNEL_VERSION(2, 6, 27)) {
+   info->page_offset = __PAGE_OFFSET_ORIG;
+   } else {
+   info->page_offset = __PAGE_OFFSET_2_6_27;
+   }
+
+   return TRUE;
 }
 
 int
-- 
2.9.3


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v4 1/3] kexec: Move vmcoreinfo out of the kernel's .bss section

2017-04-27 Thread Xunlei Pang
On 04/27/2017 at 01:44 PM, Dave Young wrote:
> Hi Xunlei,
>
> On 04/27/17 at 01:25pm, Xunlei Pang wrote:
>> On 04/27/2017 at 11:06 AM, Dave Young wrote:
>>> [snip]
>>>  
>>>  static int __init crash_save_vmcoreinfo_init(void)
>>>  {
>>> +   /* One page should be enough for VMCOREINFO_BYTES under all archs */
>> Can we add a comment in the VMCOREINFO_BYTES header file about the one
>> page assumption?
>>
>> Or just define the VMCOREINFO_BYTES as PAGE_SIZE instead of 4096
> Yes, I considered this before, but VMCOREINFO_BYTES is also used by the
> VMCOREINFO_NOTE_SIZE definition which is exported to sysfs, and some
> platforms have a larger page size (64KB), so I didn't touch the 4096 value.
>
> I think I should use kmalloc() to allocate both of them, then move this
> comment to Patch 3's kimage_crash_copy_vmcoreinfo().
 But on the other hand, using a separate page for them seems safer compared
 with using the frequently-used slab; what's your opinion?
>>> I feel current page based way is better.
>>>
>>> For a 64k page the vmcore note size will increase, but that seems fine. Do you
>>> have a concern in mind?
>> Since tools are supposed to acquire the vmcoreinfo note size from sysfs, it
>> should be safe to do so, except that some memory is wasted for a larger PAGE_SIZE.
> Either way is fine with me; I think it is up to your implementation. If you
> choose page allocation, then defining the macro as PAGE_SIZE looks better.

OK, I will use PAGE_SIZE then, thanks for your comments.
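
A minimal sketch of the page-based variant being agreed on here (illustrative
only, not the final patch):

#define VMCOREINFO_BYTES	PAGE_SIZE	/* was a hard-coded 4096 */

static int __init crash_save_vmcoreinfo_init(void)
{
	/* back the vmcoreinfo text with a whole zeroed page instead of slab */
	vmcoreinfo_data = (unsigned char *)get_zeroed_page(GFP_KERNEL);
	if (!vmcoreinfo_data) {
		pr_warn("Memory allocation for vmcoreinfo_data failed\n");
		return -ENOMEM;
	}

	/* ... fill the note exactly as before ... */
	return 0;
}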

>
> Thanks
> Dave
>
> ___
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec


___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec