Re: [PATCH] fixup! arm64: kdump: add kdump support

2016-08-19 Thread Pratyush Anand
On 19/08/2016:02:34:48 PM, James Morse wrote:
> On 19/08/16 09:00, Pratyush Anand wrote:
> > On 17/08/2016:04:33:31 PM, James Morse wrote:
> >> copy_oldmem_page() and mmap_vmcore() provide two ways for userspace to read
> >> from /proc/vmcore. Neither of these check with memblock to see if the page
> >> they are accessing is nomap. On Seattle this causes:
> > 
> > Thanks for the patch. It did resolve the kernel crash issue with
> > makedumpfile; however, there was no data in vmcore-dmesg, and the crash
> > utility was not able to work with the saved vmcore.
> 
> vmcore-dmesg doesn't work for me, but crash did once I'd rebuilt it from the
> most recent source.

Yes, the saved vmcore worked with the latest crash. However, we will need to
correct phys_offset and page_offset in kexec-tools to get meaningful output
from vmcore-dmesg.

~Pratyush
> 
> The most recent commit I have is:
> b349598bb755 ("Fix for the ARM64 "bt -R " option if the only reference")
> 
> Are you using an older version?
> 
> 
> Thanks,
> 
> James
> 
> 
> [0] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/447597.html

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH] fixup! arm64: kdump: add kdump support

2016-08-19 Thread James Morse
On 19/08/16 09:00, Pratyush Anand wrote:
> On 17/08/2016:04:33:31 PM, James Morse wrote:
>> copy_oldmem_page() and mmap_vmcore() provide two ways for userspace to read
>> from /proc/vmcore. Neither of these check with memblock to see if the page
>> they are accessing is nomap. On Seattle this causes:
> 
> Thanks for the patch. It did resolve the kernel crash issue with makedumpfile;
> however, there was no data in vmcore-dmesg, and the crash utility was not
> able to work with the saved vmcore.

vmcore-dmesg doesn't work for me, but crash did once I'd rebuilt it from the
most recent source.

The most recent commit I have is:
b349598bb755 ("Fix for the ARM64 "bt -R " option if the only reference")

Are you using an older version?


Thanks,

James


[0] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-August/447597.html



Re: [PATCH] fixup! arm64: kdump: add kdump support

2016-08-19 Thread Pratyush Anand
Hi James,

On 17/08/2016:04:33:31 PM, James Morse wrote:
> copy_oldmem_page() and mmap_vmcore() provide two ways for userspace to read
> from /proc/vmcore. Neither of these check with memblock to see if the page
> they are accessing is nomap. On Seattle this causes:

Thanks for the patch. It did resolve the kernel crash issue with makedumpfile;
however, there was no data in vmcore-dmesg, and the crash utility was not able
to work with the saved vmcore.

This happened because kexec-tools still does not calculate page_offset (or
vp_offset, as per the new patches) correctly. I still need the following fixup
in kexec-tools:

https://github.com/pratyushanand/kexec-tools/commit/2358de3ec614d8282a565b8d031a1a91ebc55475
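
To illustrate why a wrong page_offset leaves vmcore-dmesg with nothing useful
to print, here is a minimal sketch of a linear-map address translation. It is
not kexec-tools code: the struct, the function names and the offset values in
the usage note are all hypothetical, and it assumes the simple relation
paddr = vaddr - page_offset + phys_offset for linear-map addresses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the two offsets discussed above. */
struct linear_map {
	uint64_t page_offset;	/* virtual address where the linear map starts */
	uint64_t phys_offset;	/* physical address of the start of RAM */
};

/*
 * Translate a linear-map virtual address to a physical address.
 * Returns false if vaddr lies below the start of the linear map.
 */
static bool linear_virt_to_phys(const struct linear_map *map,
				uint64_t vaddr, uint64_t *paddr)
{
	if (vaddr < map->page_offset)
		return false;	/* not a linear-map address */

	*paddr = vaddr - map->page_offset + map->phys_offset;
	return true;
}

/* Inverse translation, under the same assumption. */
static uint64_t linear_phys_to_virt(const struct linear_map *map,
				    uint64_t paddr)
{
	return paddr - map->phys_offset + map->page_offset;
}
```

With made-up values page_offset = 0xffff800000000000 and phys_offset =
0x8000000000, the virtual address 0xffff800000001000 maps to physical
0x8000001000. If page_offset is off by any amount, every translated address is
off by the same amount, so a tool looking up the log buffer reads the wrong
part of the dump.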

~Pratyush
> 
> [  174.393875] Unhandled fault: synchronous external abort (0x96000210) at 0xff80096b6000
> [  174.402158] Internal error: : 96000210 [#1] PREEMPT SMP
> [  174.407370] Modules linked in:
> [  174.410417] CPU: 6 PID: 2059 Comm: cp Tainted: G S  W I 4.8.0-rc1+ #4708
> [  174.417799] Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1002C 04/08/2016
> [  174.426396] task: ffc0fdec5780 task.stack: ffc0f34bc000
> [  174.432313] PC is at __arch_copy_to_user+0x180/0x280
> [  174.437274] LR is at copy_oldmem_page+0xac/0xf0
> [  174.441791] pc : [] lr : [] pstate: 2145
> [  174.449173] sp : ffc0f34bfc90
> [  174.452474] x29: ffc0f34bfc90 x28: 
> [  174.457776] x27: 0800 x26: d000
> [  174.463077] x25: 0001 x24: ff8008eb5000
> [  174.468378] x23:  x22: ff80096b6000
> [  174.473679] x21: 0001 x20: 30127000
> [  174.478979] x19: 1000 x18: 007ff7085d60
> [  174.484279] x17: 00429358 x16: ff80081d9e88
> [  174.489579] x15: 007fae377590 x14: 
> 
> [  174.494880] x13:  x12: ff8008dd1000
> [  174.500180] x11: ff80096b6fff x10: ff80096b6fff
> [  174.505480] x9 : 4000 x8 : ff8008db6000
> [  174.510781] x7 : ff80096b7000 x6 : 30127000
> [  174.516082] x5 : 30128000 x4 : 
> [  174.521382] x3 : 00e80713 x2 : 0f80
> [  174.526682] x1 : ff80096b6000 x0 : 30127000
> [  174.531982]
> [  174.533461] Process cp (pid: 2059, stack limit = 0xffc0f34bc020)
> 
> [  174.848448] [] __arch_copy_to_user+0x180/0x280
> [  174.854448] [] read_from_oldmem.part.4+0xb4/0xf4
> [  174.860615] [] read_vmcore+0x100/0x22c
> [  174.865919] [] proc_reg_read+0x64/0x90
> [  174.871223] [] __vfs_read+0x28/0x108
> [  174.876348] [] vfs_read+0x84/0x144
> [  174.881301] [] SyS_read+0x44/0xa0
> [  174.886167] [] el0_svc_naked+0x24/0x28
> [  174.891466] Code:     (a8c12027)
> [  174.897562] ---[ end trace 00801b2e35b0cd1f ]---
> 
> When reading /proc/vmcore with cat/cp or mmap()ing it with makedumpfile.
> 
> The fs/proc/vmcore.c code provides a hook to indicate whether oldmem pages
> are ram or not. Use this to look for our earlier handiwork in memblock.
> 
> Signed-off-by: James Morse 
> ---
> 
> Hi Pratyush,
> 
> I couldn't get makedumpfile to build, or rather it depends on elfutils which
> wouldn't build for autotools reasons. Does implementing this hook solve your
> makedumpfile issue?
> 
> With this patch I can extract a usable vmcore file using read or mmap,
> avoiding the earlier splat.
> 
> Akashi, if you agree this is the right thing to do, please consider folding
> this into patch 5. (no need to keep the commit message or anything).
> 
> Thanks,
> 
> James
> 
>  arch/arm64/kernel/crash_dump.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
> index 2dc54d129be1..76c71ab42994 100644
> --- a/arch/arm64/kernel/crash_dump.c
> +++ b/arch/arm64/kernel/crash_dump.c
> @@ -12,10 +12,38 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  
> +#ifdef CONFIG_PROC_VMCORE
> +static int oldmem_pfn_is_ram(unsigned long pfn)
> +{
> + unsigned long addr = pfn << PAGE_SHIFT;
> +
> + /*
> +  * We removed oldmem from memblock.memory, then re-added some regions
> +  * which are reserved by/for firmware as memory and nomap.
> +  * If an address exists as memory, but is marked nomap, return false.
> +  */
> + if (memblock_is_memory(addr) && !memblock_is_map_memory(addr))
> + return 0;
> +
> + return 1;
> +}
> +
> +static int __init do_register_oldmem_pfn_is_ram(void)
> +{
> + return register_oldmem_pfn_is_ram(&oldmem_pfn_is_ram);
> +}
> +/*
> + * vmcore_init() is called via fs_initcall, ensure we register
> + * oldmem_pfn_is_ram() before then.
> + */
> +arch_initcall(do_register_oldmem_pfn_is_ram);
> +#endif
> +
>  /**
>   * copy_oldmem_page() - copy one page from old kernel memory
>   * @pfn: page frame 

Re: [PATCH] fixup! arm64: kdump: add kdump support

2016-08-18 Thread AKASHI Takahiro
On Wed, Aug 17, 2016 at 04:33:31PM +0100, James Morse wrote:
> copy_oldmem_page() and mmap_vmcore() provide two ways for userspace to read
> from /proc/vmcore. Neither of these check with memblock to see if the page
> they are accessing is nomap. On Seattle this causes:
> 
> [  174.393875] Unhandled fault: synchronous external abort (0x96000210) at 0xff80096b6000
> [  174.402158] Internal error: : 96000210 [#1] PREEMPT SMP
> [  174.407370] Modules linked in:
> [  174.410417] CPU: 6 PID: 2059 Comm: cp Tainted: G S  W I 4.8.0-rc1+ #4708
> [  174.417799] Hardware name: AMD Overdrive/Supercharger/Default string, BIOS ROD1002C 04/08/2016
> [  174.426396] task: ffc0fdec5780 task.stack: ffc0f34bc000
> [  174.432313] PC is at __arch_copy_to_user+0x180/0x280
> [  174.437274] LR is at copy_oldmem_page+0xac/0xf0
> [  174.441791] pc : [] lr : [] pstate: 2145
> [  174.449173] sp : ffc0f34bfc90
> [  174.452474] x29: ffc0f34bfc90 x28: 
> [  174.457776] x27: 0800 x26: d000
> [  174.463077] x25: 0001 x24: ff8008eb5000
> [  174.468378] x23:  x22: ff80096b6000
> [  174.473679] x21: 0001 x20: 30127000
> [  174.478979] x19: 1000 x18: 007ff7085d60
> [  174.484279] x17: 00429358 x16: ff80081d9e88
> [  174.489579] x15: 007fae377590 x14: 
> 
> [  174.494880] x13:  x12: ff8008dd1000
> [  174.500180] x11: ff80096b6fff x10: ff80096b6fff
> [  174.505480] x9 : 4000 x8 : ff8008db6000
> [  174.510781] x7 : ff80096b7000 x6 : 30127000
> [  174.516082] x5 : 30128000 x4 : 
> [  174.521382] x3 : 00e80713 x2 : 0f80
> [  174.526682] x1 : ff80096b6000 x0 : 30127000
> [  174.531982]
> [  174.533461] Process cp (pid: 2059, stack limit = 0xffc0f34bc020)
> 
> [  174.848448] [] __arch_copy_to_user+0x180/0x280
> [  174.854448] [] read_from_oldmem.part.4+0xb4/0xf4
> [  174.860615] [] read_vmcore+0x100/0x22c
> [  174.865919] [] proc_reg_read+0x64/0x90
> [  174.871223] [] __vfs_read+0x28/0x108
> [  174.876348] [] vfs_read+0x84/0x144
> [  174.881301] [] SyS_read+0x44/0xa0
> [  174.886167] [] el0_svc_naked+0x24/0x28
> [  174.891466] Code:     (a8c12027)
> [  174.897562] ---[ end trace 00801b2e35b0cd1f ]---
> 
> When reading /proc/vmcore with cat/cp or mmap()ing it with makedumpfile.
> 
> The fs/proc/vmcore.c code provides a hook to indicate whether oldmem pages
> are ram or not. Use this to look for our earlier handiwork in memblock.

I'm not quite sure of the background to why oldmem_pfn_is_ram() was
originally introduced on x86, but I think this feature was intended
to fix an issue on Xen.
See:
commit 997c136
Author: Olaf Hering 
Date:   Thu May 26 16:25:54 2011 -0700

fs/proc/vmcore.c: add hook to read_from_oldmem() to check for non-ram pages
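
For readers unfamiliar with that hook, the following is a rough userspace
model of how a read path can consult it: when the registered callback reports
that a page is not RAM, the caller's buffer is zero-filled instead of being
copied from old memory. All names here (model_*, example_*) are hypothetical
stand-ins, not the fs/proc/vmcore.c code, and the "skip this pfn" value in the
example hook is made up.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the hook pointer and the page copier. */
typedef int (*pfn_is_ram_fn)(unsigned long pfn);
typedef size_t (*copy_page_fn)(unsigned long pfn, char *buf, size_t len);

static pfn_is_ram_fn model_pfn_is_ram_hook;

/* Register the hook once; a second registration is refused. */
static int model_register_pfn_is_ram(pfn_is_ram_fn fn)
{
	if (model_pfn_is_ram_hook)
		return -1;	/* a hook is already registered */
	model_pfn_is_ram_hook = fn;
	return 0;
}

/* Read one page: zero-fill if the hook says the page is not RAM. */
static size_t model_read_page(unsigned long pfn, char *buf, size_t len,
			      copy_page_fn copy)
{
	/* With no hook registered, every page is treated as RAM. */
	int is_ram = model_pfn_is_ram_hook ? model_pfn_is_ram_hook(pfn) : 1;

	if (!is_ram) {
		memset(buf, 0, len);
		return len;
	}
	return copy(pfn, buf, len);
}

/* Example hook: pretend exactly one (made-up) pfn must not be touched. */
static int example_hook(unsigned long pfn)
{
	return pfn != 0x30127UL;
}

/* Example copier: fills the buffer with a marker byte instead of real data. */
static size_t example_copy(unsigned long pfn, char *buf, size_t len)
{
	(void)pfn;
	memset(buf, 0xAA, len);
	return len;
}
```

The point of the design is that the potentially faulting access never happens
for pages the hook rejects; the reader simply sees zeroes at that offset.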

Thanks,
-Takahiro AKASHI

> 
> Signed-off-by: James Morse 
> ---
> 
> Hi Pratyush,
> 
> I couldn't get makedumpfile to build, or rather it depends on elfutils which
> wouldn't build for autotools reasons. Does implementing this hook solve your
> makedumpfile issue?
> 
> With this patch I can extract a usable vmcore file using read or mmap,
> avoiding the earlier splat.
> 
> Akashi, if you agree this is the right thing to do, please consider folding
> this into patch 5. (no need to keep the commit message or anything).
> 
> Thanks,
> 
> James
> 
>  arch/arm64/kernel/crash_dump.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/arch/arm64/kernel/crash_dump.c b/arch/arm64/kernel/crash_dump.c
> index 2dc54d129be1..76c71ab42994 100644
> --- a/arch/arm64/kernel/crash_dump.c
> +++ b/arch/arm64/kernel/crash_dump.c
> @@ -12,10 +12,38 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  
> +#ifdef CONFIG_PROC_VMCORE
> +static int oldmem_pfn_is_ram(unsigned long pfn)
> +{
> + unsigned long addr = pfn << PAGE_SHIFT;
> +
> + /*
> +  * We removed oldmem from memblock.memory, then re-added some regions
> +  * which are reserved by/for firmware as memory and nomap.
> +  * If an address exists as memory, but is marked nomap, return false.
> +  */
> + if (memblock_is_memory(addr) && !memblock_is_map_memory(addr))
> + return 0;
> +
> + return 1;
> +}
> +
> +static int __init do_register_oldmem_pfn_is_ram(void)
> +{
> + return register_oldmem_pfn_is_ram(&oldmem_pfn_is_ram);
> +}
> +/*
> + * vmcore_init() is called via fs_initcall, ensure we register
> + * oldmem_pfn_is_ram() before then.
> + */
> +arch_initcall(do_register_oldmem_pfn_is_ram);
> +#endif
> +
>  /**
>   * copy_oldmem_page() - copy one page from old kernel memory
>   * @pfn: page frame number to be copied
> -- 
> 2.8.0.rc3
>
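
The decision made by oldmem_pfn_is_ram() in the quoted patch can be modelled
in userspace roughly as follows. This is only a sketch: the region table, the
MODEL_NOMAP flag and the model_* helpers are hypothetical stand-ins for
memblock, and the addresses are made up. It assumes, as the patch comment
says, that ordinary oldmem is absent from the table and only the
firmware-reserved regions were re-added with a nomap flag.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12
#define MODEL_NOMAP 0x1u	/* hypothetical "do not map" flag */

struct model_region {
	uint64_t base;
	uint64_t size;
	unsigned int flags;
};

/* Made-up layout: one mapped region and one firmware-reserved nomap region. */
static const struct model_region model_memory[] = {
	{ 0x80000000ULL, 0x10000000ULL, 0 },
	{ 0x90000000ULL, 0x00100000ULL, MODEL_NOMAP },
};

/* Return the region containing addr, or NULL if none does. */
static const struct model_region *model_find(uint64_t addr)
{
	size_t n = sizeof(model_memory) / sizeof(model_memory[0]);

	for (size_t i = 0; i < n; i++) {
		const struct model_region *r = &model_memory[i];

		if (addr >= r->base && addr - r->base < r->size)
			return r;
	}
	return NULL;
}

static bool model_is_memory(uint64_t addr)
{
	return model_find(addr) != NULL;
}

static bool model_is_map_memory(uint64_t addr)
{
	const struct model_region *r = model_find(addr);

	return r && !(r->flags & MODEL_NOMAP);
}

/* Same shape as the check in the patch: only "memory but nomap" is rejected. */
static int model_oldmem_pfn_is_ram(uint64_t pfn)
{
	uint64_t addr = pfn << MODEL_PAGE_SHIFT;

	if (model_is_memory(addr) && !model_is_map_memory(addr))
		return 0;

	return 1;
}
```

Note that an address found in no region at all still returns 1. That matches
the patch, where oldmem that was removed from memblock.memory must remain
readable through /proc/vmcore.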